Stract: Open Source Search Engine With Its Own Crawler and "Optics"

Re: Stract: Open Source Search Engine With Its Own Crawler and "Optics"

by nafnlj » Fri Feb 02, 2024 5:36 pm

Yukinu wrote: Wed Jan 24, 2024 4:56 am Interesting, they seem to have written their own crawler and everything, pretty impressive. Also the search engine works very well without JavaScript.
The only thing I noticed that didn't work without JS is the ability to boost or discard domains using a GUI. But you can still write rules and have them work without JS.
Yukinu wrote: Wed Jan 24, 2024 4:56 am I tested out a few queries, the breadth and depth of the index is pretty close to Brave at this point, but without corporate backing. In the long run, this could be a good option as a general purpose search engine, and I'm going to create a search shortcut and set it as my default for a few days to see if I can run into some search edge cases.
From my limited testing, I think it depends. I follow a version of the approach you articulated in your search engine experiment articles and keep shortcuts for many search engines and site-specific searches (I probably refer more queries to Startpage/DDG/Brave/Mojeek, but similar idea). If I were going to search like most people do -- simply replacing Google with a different search engine 1 for 1 and not calling others with shortcuts -- I could get by with Brave, whereas I could not with Stract, Mojeek, or InfoTiger, another similar independent engine/index. One major case for me is searches for law resources. Brave is about 85-90% as good as the Google/Bing indexes, depending on what you are looking for. Stract and Mojeek are wholly inadequate in that area (to be fair, I think many of the resources block non-Google/Bing crawlers). I would like to see Stract improve here, though, since the ability to write domain rankings for it would be helpful. I wrote one for Brave and it works decently well. (I'm setting aside all the questions about how Brave doesn't provide much information about its crawler.)
Yukinu wrote: Wed Jan 24, 2024 4:56 am I tested out a few queries, the breadth and depth of the index is pretty close to Brave at this point, but without corporate backing. In the long run, this could be a good option as a general purpose search engine, and I'm going to create a search shortcut and set it as my default for a few days to see if I can run into some search edge cases.
I look forward to reading your impressions after you complete your testing. I made one useful find with Stract that none of the others, including Marginalia, turned up. I am doing some research for an article about the 2002 NBA MVP race. Stract returned a small forum thread from 2002 with some good posts to include in my article, since one of the main points is to look at how the race was viewed at the time.

Re: Stract: Open Source Search Engine With Its Own Crawler and "Optics"

by Yukinu » Wed Jan 24, 2024 4:56 am

Interesting, they seem to have written their own crawler and everything, pretty impressive. Also the search engine works very well without JavaScript.
nafnlj wrote: Wed Jan 17, 2024 4:57 am I do not think the Yukinu Blog is in Stract yet but there are a few Lainchan webring sites.
It may depend on which user agent they are using and if they follow robots.txt. I've debated over time how visible I want my site to be on search engines, as I like the idea of my site being primarily found through links from other sites on the web, instead of on a search result page. At the moment, the robots.txt recommends not scraping the site (although crawlers are free to ignore the recommendation), and I block some crawler user agents (notably Googlebot). I'll think about this some more and consider my options.
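To make the "recommendation" part concrete, here is a minimal Python sketch of how a well-behaved crawler might consult robots.txt before fetching a page. The robots.txt contents, the `ExampleBot` user agent, and the URLs are made up for illustration; they are not the actual rules on my site or anything Stract's crawler is known to do. The check happens entirely on the crawler's side, which is why a crawler that skips it is only stopped by server-side blocking.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that asks every crawler to stay out of the whole site.
DISALLOW_ALL = """\
User-agent: *
Disallow: /
"""

# Hypothetical robots.txt that only asks crawlers to skip one directory.
DISALLOW_PRIVATE = """\
User-agent: *
Disallow: /private/
"""


def may_fetch(user_agent: str, url: str, robots_txt: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    This is the voluntary check a polite crawler performs; nothing here
    prevents a crawler from ignoring the answer.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for name, rules in (("disallow all", DISALLOW_ALL), ("disallow /private/", DISALLOW_PRIVATE)):
        allowed = may_fetch("ExampleBot", "https://example.org/blog/post.html", rules)
        print(name, "->", "fetch" if allowed else "skip")
```

Blocking by user agent is a separate mechanism enforced by the web server, so it works even against crawlers that never read robots.txt, as long as they send an honest user agent string.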
nafnlj wrote: Wed Jan 17, 2024 4:57 am Stract is not too far off from having a good enough index to be molded into a sort of combination of Marginalia + useful resources for specific use-cases.
I tested out a few queries, the breadth and depth of the index is pretty close to Brave at this point, but without corporate backing. In the long run, this could be a good option as a general purpose search engine, and I'm going to create a search shortcut and set it as my default for a few days to see if I can run into some search edge cases.

Stract: Open Source Search Engine With Its Own Crawler and "Optics"

by nafnlj » Wed Jan 17, 2024 4:57 am

Stract is an open source and non-commercial search engine with its own crawler and independent index. I was inspired to share it by Search Engine Experiment.

Stract Home
Git Repository

While Stract is promising for being both open source and having its own independent index, what makes it relevant to the search engine experiment is its support for "Optics" -- similar to Brave Search's lenses. You can follow its syntax guide to write your own optic and make it public, and there are a number of public optics to try including one for IndieWeb sites and blogs shared on Hacker News. If you enable JavaScript in regular searches (not required), you can use a GUI next to search results to boost, downrank, or discard results from specific domains and export your current list as an optic.
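As a rough illustration of what boosting, downranking, and discarding domains does to a result list, here is a small Python sketch. It is not Stract's optic syntax or its ranking code: the `Result` type, the rule table, the domains, and the multipliers are all hypothetical and only show the general idea of per-domain re-ranking.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class Result:
    url: str
    score: float  # hypothetical relevance score, higher is better


# Hypothetical per-domain rules, standing in for what an optic might express.
RULES = {
    "example-blog.org": "boost",
    "example-farm.net": "downrank",
    "example-spam.com": "discard",
}

# Assumed multipliers; a real engine would choose its own weighting.
FACTORS = {"boost": 2.0, "downrank": 0.5}


def apply_rules(results, rules=RULES):
    """Drop discarded domains, rescale the rest, and sort by the new score."""
    kept = []
    for result in results:
        domain = urlparse(result.url).hostname or ""
        action = rules.get(domain)
        if action == "discard":
            continue
        factor = FACTORS.get(action, 1.0)  # domains without a rule are unchanged
        kept.append(Result(result.url, result.score * factor))
    return sorted(kept, key=lambda r: r.score, reverse=True)


if __name__ == "__main__":
    sample = [
        Result("https://example-spam.com/a", 9.0),
        Result("https://example-farm.net/b", 4.0),
        Result("https://example-blog.org/c", 3.0),
        Result("https://other.example/d", 2.5),
    ]
    for result in apply_rules(sample):
        print(result.url, result.score)
```

The point of the sketch is that the rules sit on top of an existing ranking: the index stays the same, and the user only reshapes which domains surface first.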

I have been playing around with Stract a bit and it is impressive for what it is in its early stages. It is a serviceable generalist search engine for limited purposes. At this stage, it does not match Marginalia in its independent/old site index (e.g., my site is partially indexed in Stract but just about fully indexed by Marginalia and everything else; I do not think the Yukinu Blog is in Stract yet but there are a few Lainchan webring sites). I agree with Yukinu's search engine experiment take-away of preferring site-specific searches (e.g., I use shortcuts for Arch Wiki, VNDB, and many other sites), but Stract is not too far off from having a good enough index to be molded into a sort of combination of Marginalia + useful resources for specific use-cases.
