_ __ __ _
(_)__ / /_________ / /________ _______(_)
/ / _ \/ __/ ___/ _ \/ __/ ___/ / / / ___/ /
/ / __/ /_(__ ) __/ /_/ / / /_/ / / / /
__/ /\___/\__/____/\___/\__/_/ \__,_/_/ /_/
/___/
(Originally posted to Cohost on Sun, Feb 25, 2024, 11:57 AM)
The internet is a gold mine of textual information that we take for granted. We use search engines to browse this endless world wide web, and I can't overstate their importance.
However, whether you think Google is dying, DuckDuckGo is screwed, or you're using ChatGPT, let's take a break from all of that and check out some alternatives.
If you're finding yourself bored of your search engine returning the same ten websites, there are independent search engines with their own self-crawled indexes! Special thanks to Seirdy's in-depth blog post for bringing them to light:
Marginalia - It follows the user's exact query (similar to older search engines) and leans towards non-commercial, text-heavy websites, punishing the "modern" sites that Google rewards. You can also filter your search depending on whether or not you want JavaScript, "small web" websites, academic websites, etc.
Mojeek - Mojeek has been around since 2004, and in 2006 became one of the first privacy-focused engines. While its results are a bit unreliable (they're working on it), it likely sports the biggest index of the three, and it has a feature called Focus that lets you create your own filter of included or excluded websites.
Stract - Way newer than both and surprisingly promising in terms of search results. I think it hits a sweet spot between Marginalia and Mojeek, and has support for different "optics" similar to Mojeek's Focus and Marginalia's filters.
There are also more search engines and websites that I wanted to share:
Old'aVista - A search engine for older websites. Great if you're looking for serendipity, or for sites that still work on older hardware.
FrogFind - FrogFind is a web proxy for old computers and weaker connections that doubles as a search engine based on DuckDuckGo. It converts many (but not all) modern websites to plain text with the help of Mozilla's "Readability" library. Its developer, ActionRetro, is also responsible for 68k.news, which takes a similar approach but for internet news.
Protoweb - The Wayback Machine has snapshots of how the web used to look, while Protoweb is trying to restore it by making sure every site keeps its functionality intact (like Newgrounds, search engines like Yahoo and Google, etc). It also works with older computers.
SearX Belgium - One of many SearX instances. SearX is a self-hosted metasearch engine that aggregates results from other search engines. For example, this means you can change the settings to have Google and/or DDG results on one page, with the advantage of your queries being sent from the instance's IP instead of yours, similar to Invidious for YouTube.
Anna's Archive - A search engine for various "shadow libraries" like Library Genesis and Z-Library, which contain tens of millions of books for free. It helps if you're broke, looking for books that are hard to find elsewhere, or want a DRM-free version of a popular book. If a book isn't available in your preferred format, you can use Calibre to convert it (which I did for my Paperwhite prior to jailbreaking it and installing KOReader).
Kiwix - Not primarily a search engine, but I've talked about it before. You can download backups of entire websites for offline use in a variety of languages, like Wikipedia, Project Gutenberg, Stack Exchange, iFixit, Khan Academy, etc., and browse through them via an app or a LAN web server. I go for the latter because most devices with a modern web browser (even more if you use Browservice or Web Rendering Proxy) can connect to it, and it has a built-in plain-text search engine that lets you search through your offline websites.
Gemini - Gemini is a text-heavy alternative to the world wide web, inspired by Gopher and early HTTP, primarily used for "small-web blogging". It has three major search engines, which you can see here via a web proxy.
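Going back to SearX for a moment: if "metasearch" sounds abstract, here is a minimal Python sketch of the general idea of combining result lists from several engines onto one page. This is not SearX's actual code or scoring. The function names (`normalize`, `merge_results`), the engine names, and the "1 / rank" scoring are all made up for illustration.

```python
from urllib.parse import urlsplit


def normalize(url: str) -> str:
    """Crude key for spotting the same page across engines.

    Assumption for this sketch only: ignore scheme, a leading "www.",
    letter case in the host, the query string, and a trailing slash.
    """
    parts = urlsplit(url)
    host = parts.netloc.lower().removeprefix("www.")
    path = parts.path.rstrip("/")
    return f"{host}{path}"


def merge_results(per_engine: dict[str, list[str]]) -> list[dict]:
    """Merge ranked URL lists from several engines into one ranked list.

    Each engine contributes 1 / rank to a page's score, so pages that
    several engines agree on float towards the top.
    """
    merged: dict[str, dict] = {}
    for engine, urls in per_engine.items():
        for rank, url in enumerate(urls, start=1):
            entry = merged.setdefault(
                normalize(url), {"url": url, "engines": [], "score": 0.0}
            )
            entry["engines"].append(engine)
            entry["score"] += 1.0 / rank
    return sorted(merged.values(), key=lambda e: e["score"], reverse=True)


if __name__ == "__main__":
    # Hypothetical result lists; a real instance would fetch these itself.
    sample = {
        "engine_a": ["https://example.org/gemini", "https://www.example.com/kiwix/"],
        "engine_b": ["https://example.com/kiwix", "https://example.net/frogfind"],
    }
    for result in merge_results(sample):
        print(result["url"], result["engines"])
```

The privacy part comes from where this runs: the instance sends the queries to the upstream engines, so they see the instance rather than you.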