Tag archive

Turn your Mastodon archive file into a standalone static HTML site. Easy to customize and easy to search; posts are indexed not just chronologically but also on tag pages.

# archive,mastodon,static_site

2052.

Reliable Storage of Personal Information, 2025

habr.com/ru/articles/929920

# archive

1997.

The Advantages of Text-Based Information Versus Videos, Audio or Images

karl-voit.at/2022/01/08/text-vs-video-audio-images

# archive,mindset,text

1910.

If it is worth keeping, save it in Markdown - Piotr Migdał

p.migdal.pl/blog/2025/02/markdown-saves#user-content-fnref-pinboard

Why and how to preserve digital content in plaintext format for long-term accessibility and reuse

The author actually doesn't think Markdown is the only correct storage format. Any lightweight markup language will do; they just use Markdown themselves. It's good they allow Mycomarkup; otherwise I'd've written a remark here.

I myself am not sure if it's the right way. Maybe the websites should be preserved closer to the way they were made? I'll now archive this page, in HTML.

# archive

1794.

The birth of a new science.

ahitech.livejournal.com/171492.html

Alex Hitech shares documents from a company that forgot how its own factory works.

# archive

1737.

GitHub - s427/MARL: Mastodon Archive Reader Lite - a lightweight single-page app to explore the contents of your Mastodon archive file

github.com/s427/MARL

Mastodon Archive Reader Lite - a lightweight single-page app to explore the contents of your Mastodon archive file - s427/MARL

# archive,mastodon

1715.

The Library Innovation Lab at Harvard Law School

lil.law.harvard.edu

Home of Perma.cc, H2O, Caselaw Access Project, and others. The Library Innovation Lab is growing knowledge and community by bringing library principles to technological frontiers.

# archive,community

1711.

Websites change. Perma Links don't.

perma.cc

Broken links are everywhere. Perma helps authors and journals create permanent links for citations in their published work.

# archive,link_rot

1710.

Century-Scale Storage

lil.law.harvard.edu/century-scale-storage

If you had to store something for 100 years, how would you do it?

# archive,history,society

1709.

Digital decluttering

alexwlchan.net/2024/digital-decluttering

I'm resisting my temptation towards digital hoarding and "save everything", and trying to be more selective about the data I'm keeping.

# archive,mindset

1703.

Using static websites for tiny archives

alexwlchan.net/2024/static-websites

I've been creating small, hand-written websites to organise my files. It's a lightweight, flexible approach that I hope will last a long time.

There's a screenshot there, take a look. What's surprising is that the author employs no static site generator. Huh?? I'd rather use one or come up with a cool CGI setup. I wonder if they still make these websites...

# archive,html,pim,static_site

1700.

Archival with a universal virtual computer (UVC) ⁑ Dercuano

dercuano.github.io/notes/uvc-archiving.html

# archive,computer,vm

1688.

2024-12-13 Archiving homepages

alexschroeder.ch/view/2024-12-13-archiving-homepages

# archive,personal_site

1626.

bouncepaw

remarked

Artifacts: Image + Link Organizer app for macOS/iOS

artifacts.app

danila

Artifacts is an image + link organizer app for macOS and iOS. A
completely native, local first way to save all that stuff you find
across the web.

# archive,bookmarking,software

1533.

iipc/awesome-web-archiving: An Awesome List for getting started with web archiving

github.com/iipc/awesome-web-archiving

An Awesome List for getting started with web archiving - iipc/awesome-web-archiving

# archive

1488.

go-shiori/obelisk: Go package and CLI tool for saving web page as single HTML file

github.com/go-shiori/obelisk

Embeds all resources (CSS, images, JavaScript, etc.), producing a single HTML5 document that is easy to store and share.

If the submitted URL is not HTML (for example, a PDF), Obelisk will still save it as is.

Assets are downloaded concurrently, which makes archiving a web page quite fast.

Accepts cookies, which is useful for pages that require a login or articles behind a paywall.

Might actually be a good choice for Betula archival. Might need to fork it, though.
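To make the "single HTML file" idea concrete, here is a minimal Python sketch of one way resources can be embedded: swapping `src` URLs for base64 `data:` URIs. This is not Obelisk's code or API. The function names `to_data_uri` and `inline_resources` are hypothetical, and the sketch assumes the resources have already been fetched and are passed in as a plain dict.

```python
import base64
import re


def to_data_uri(mime: str, payload: bytes) -> str:
    """Encode raw bytes as a base64 data: URI."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def inline_resources(html: str, resources: dict[str, tuple[str, bytes]]) -> str:
    """Replace src="..." values found in `resources` with data: URIs.

    `resources` maps a URL to (MIME type, bytes). Fetching is out of
    scope here; this is an illustrative assumption, not Obelisk's design.
    """
    def swap(match: re.Match) -> str:
        url = match.group(2)
        if url not in resources:
            return match.group(0)  # leave unknown URLs untouched
        mime, payload = resources[url]
        return f"{match.group(1)}{to_data_uri(mime, payload)}{match.group(3)}"

    # Only double-quoted src attributes are handled in this sketch.
    return re.sub(r'(src=")([^"]*)(")', swap, html)


page = '<img src="a.png"><img src="b.png">'
result = inline_resources(page, {"a.png": ("image/png", b"\x89PNG")})
```

Here only `a.png` is inlined, because `b.png` has no entry in the dict. A real archiver would also need to handle stylesheets, scripts, `srcset`, and URLs referenced from inside CSS.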

# archive,go,library

1486.

edent/Tweet2Embed: Convert a public Tweet into embedded semantic HTML

github.com/edent/Tweet2Embed

Convert a public Tweet into embedded semantic HTML - edent/Tweet2Embed

# archive,python,twitter

1477.

2024-06-27 Keep your own archives

alexschroeder.ch/view/2024-06-27-keep-archives

Alex reminds us to keep our own archives of stuff we like. I remember how we had a discussion about that last November.

# archive

1354.

imputnet/cobalt: save what you love

github.com/imputnet/cobalt

save what you love. Contribute to imputnet/cobalt development by creating an account on GitHub.

# archive

1299.

Launch: History Book - And a Dinosaur

andadinosaur.com/launch-history-book

History Book automatically saves the content of your browsing history for searching. And it does it in a privacy-friendly way.

# apple,archive,bookmarking,software

1174.

Paperless-ngx

docs.paperless-ngx.com

Turn paper documents into a searchable digital database.

# archive,software

1095.

archive.ph

archive.today

Archive.today is a time capsule for web pages! It takes a 'snapshot' of a webpage that will always be online even if the original page disappears. It saves a text and a graphical copy of the page for better accuracy and provides a short and reliable link to an unalterable record of any web page.

# archive

1089.

TubeArchivist

www.tubearchivist.com

Archive whole YouTube channels.

# archive,software

1056.

The Inadequacy of Most Proposed Approaches

web.archive.org/web/20010519155455/http://www.clir.org/PUBS/reports/rothenberg/inadequacy.html

Most approaches that have been proposed fall into one of four categories: (1) reliance on hard copy, (2) reliance on standards, (3) reliance on computer museums, or (4) reliance on migration. Though some of these may play a role in an ultimate solution, none of them comes close to providing a solution by itself, nor does their combination.

On archival.

# archive

1034.

wormi4ok/evernote2md

github.com/wormi4ok/evernote2md

Convert Evernote .enex files to Markdown.

# archive,markdown,software

931.

ArchiveBox

archivebox.io

🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more…

# archive,software

930.

adbar/trafilatura

github.com/adbar/trafilatura

# archive,python

571.

igdl - Instagram Image Downloader

www.datagubbe.se/igdl

igdl is a Python script for downloading an image from a given
Instagram URL and either save it directly to disk or write it to stdout.
It will automatically pick the highest image resolution available.

# archive,python,social_media

531.

bellingcat/snscrape

github.com/bellingcat/snscrape

snscrape is a scraper for social networking services (SNS). It scrapes things like user profiles, hashtags, or searches and returns the discovered items, e.g. the relevant posts.

Supports many major social networks!

# archive,python,social_media

530.

A Paper Internet

carlos.bueno.org/2010/09/paper-internet.html

Build your time capsule with epoxy.

# archive

513.

ArchiveTeam/grab-site

github.com/ArchiveTeam/grab-site

The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns

# archive,python,software

502.

dosyago/DiskerNet

github.com/dosyago/DiskerNet

DiskerNet empowers you to be the master archivist of your own internet browsing. As a robust, lightweight tool, DiskerNet seamlessly connects to your browser, saving and organizing your online discoveries in real-time. With an option to archive everything or only bookmark-worthy content, DiskerNet places you in full control of your browsing history. No special plugins or extensions required.

# archive

474.

web archiving

agnessa.pp.ru/computer/20210731230816-web_archiving.html

A collection of links about web archiving.

# archive

471.

deathau/markdownload

github.com/deathau/markdownload

This is an extension to clip websites and download them into a readable markdown file. Please keep in mind that it is not guaranteed to work on all websites.

# archive,browser_extension,markdown

470.

mozilla/readability: A standalone version of the readability lib

github.com/mozilla/readability

A library by Mozilla that powers Reader mode in Firefox and many other programs. Something I like.

# archive

469.

Save Your Threads

social.perma.cc

Save birdsite threads as PDFs. I wonder if it still works.

# archive,pdf,twitter

439.

Archive it or you will miss it

drewdevault.com/2017/06/19/Archive-it-or-miss-it.html

At this point, link rot is an axiom of the internet. In the face of this, I store a personal offline archive of anything I want to see twice. When I see a cool YouTube video I like, I archive the entire channel right away. Rather than subscribe to it, I update my archive on a cronjob. I scrape content out of RSS feeds and into offline storage and I have dozens of websites archived with wget. I mirror most git repositories I’m interested in. I have DRM free offline copies of all of my music, TV shows, and movies, ill-begotten or not.

# archive,link_rot

334.