Here's a list of bookmarks I keep: website snapshots, plus overlooked websites with easily searchable older material.
Started on 2024-05-30.
Last updated on 2024-10-27.
For context, Kiwix is a useful piece of software that lets you download and view Wikipedia offline, alongside other sites such as Wiktionary, StackExchange, iFixit, etc. Its archives are usually up to date, and older archives are deleted to save on costs.
ChatGPT's release in late November of 2022 resulted in a flood of bland garbage polluting the web for SEO, and many are still unhappy with our data being used, without permission or compensation, to train glorified, overhyped auto-complete models.
Some slop already existed after the releases of GPT-2 in February 2019 and GPT-3 in June 2020, but not at this level.
I was inspired by a couple of Reddit posts, so I wanted to search what was available on the Internet Archive. You can do the same by searching title:(wikipedia_en_all), for example.
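If you'd rather script that search than type it into the site, here's a minimal sketch in Python. It assumes the Internet Archive's advancedsearch.php endpoint with JSON output; the function names, the rows default, and the choice to request only the identifier field are my own illustration, not something taken from the Reddit posts.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the Internet Archive's advanced search API.
SEARCH_ENDPOINT = "https://archive.org/advancedsearch.php"


def build_search_url(query: str, rows: int = 50) -> str:
    """Build a search URL that returns matching item identifiers as JSON."""
    params = [
        ("q", query),             # same query string you would type on the site
        ("fl[]", "identifier"),   # only ask for each item's identifier
        ("rows", str(rows)),      # maximum number of results to return
        ("output", "json"),
    ]
    return SEARCH_ENDPOINT + "?" + urlencode(params)


def fetch_identifiers(query: str, rows: int = 50) -> list[str]:
    """Run the search over the network and return the item identifiers.

    Assumes the JSON response nests results under response -> docs.
    """
    with urlopen(build_search_url(query, rows)) as resp:
        data = json.load(resp)
    return [doc["identifier"] for doc in data["response"]["docs"]]


# Building the URL needs no network access.
url = build_search_url("title:(wikipedia_en_all)")
print(url)
```

Calling fetch_identifiers("title:(wikipedia_en_all)") should then give a list of item names that can be appended to https://archive.org/details/ to reach each archive's page.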
Here are a few that interest me. I figured they'd fit in with the rest of my bookmarks. My cut-off point is ChatGPT's release:
Wikipedia, December 2010: https://en.wikipedia.org/wiki/Wikipedia:Version_0.8
Wikipedia, November 2020: https://archive.org/details/wikipedia_en_all_maxi_2020-11.zim
Wikipedia, January 2021 (no pictures): https://archive.org/details/wikipedia_en_all_nopic_2021-01.zim
Wiktionary, July 2020: https://archive.org/details/wiktionary_en_all_maxi_2020-07.zim
Wiktionary, July 2020 (no pictures): https://archive.org/details/wiktionary_en_all_nopic_2020-07.zim
Project Gutenberg, July 2020 (English-only): https://archive.org/details/gutenberg_en_all_2020-07.zim
Project Gutenberg, July 2020 (all languages): https://archive.org/details/gutenberg_mul_all_2020-07.zim
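To check a batch of search results against the same cut-off, something like the sketch below could work. It assumes item names end in a YYYY-MM stamp (optionally followed by .zim), as the ones listed here do, and it treats anything from November 2022 onward as past the cut-off, since the stamp only has month precision. The helper names and the 2024 identifier in the example are hypothetical.

```python
import re

# ChatGPT was released in late November 2022, so keep only snapshots
# stamped strictly before that month.
CUTOFF = (2022, 11)

# Matches a trailing "_YYYY-MM" stamp, with or without a ".zim" suffix.
STAMP = re.compile(r"_(\d{4})-(\d{2})(?:\.zim)?$")


def snapshot_month(identifier: str):
    """Return (year, month) parsed from an identifier, or None if absent."""
    match = STAMP.search(identifier)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def before_cutoff(identifiers):
    """Keep identifiers whose stamp is earlier than the cut-off month."""
    kept = []
    for identifier in identifiers:
        stamp = snapshot_month(identifier)
        if stamp is not None and stamp < CUTOFF:
            kept.append(identifier)
    return kept


candidates = [
    "wikipedia_en_all_maxi_2020-11.zim",
    "wiktionary_en_all_nopic_2020-07.zim",
    "wikipedia_en_all_maxi_2024-01.zim",  # hypothetical later snapshot
    "no_date_in_this_name",               # hypothetical item without a stamp
]
kept = before_cutoff(candidates)
print(kept)
```

Items without a recognisable stamp are dropped rather than guessed at, so anything unusual still needs a manual look.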