Tag archive

32 bookmarks have this tag.

Archiving stuff so it's with you.

2025-01-07

1715.

GitHub - s427/MARL: Mastodon Archive Reader Lite - a lightweight single-page app to explore the contents of your Mastodon archive file

github.com/s427/MARL

Mastodon Archive Reader Lite - a lightweight single-page app to explore the contents of your Mastodon archive file - s427/MARL

1711.

The Library Innovation Lab at Harvard Law School

lil.law.harvard.edu

Home of Perma.cc, H2O, Caselaw Access Project, and others. The Library Innovation Lab is growing knowledge and community by bringing library principles to technological frontiers.

1710.

Websites change. Perma Links don't.

perma.cc

Broken links are everywhere. Perma helps authors and journals create permanent links for citations in their published work.

1709.

Century-Scale Storage

lil.law.harvard.edu/century-scale-storage

If you had to store something for 100 years, how would you do it?

2025-01-06

1703.

Digital decluttering

alexwlchan.net/2024/digital-decluttering

I'm resisting my temptation towards digital hoarding and "save everything", and trying to be more selective about the data I'm keeping.

2025-01-04

1700.

Using static websites for tiny archives

alexwlchan.net/2024/static-websites

I've been creating small, hand-written websites to organise my files. It's a lightweight, flexible approach that I hope will last a long time.

There's a screenshot there, take a look. What's surprising is that the author employs no static site generator. Huh?? I'd rather use one or come up with a cool CGI setup. I wonder if they still make these websites...

2025-01-03

1688.

Archival with a universal virtual computer (UVC) ⁑ Dercuano

dercuano.github.io/notes/uvc-archiving.html

2024-12-13

1626.

2024-12-13 Archiving homepages

alexschroeder.ch/view/2024-12-13-archiving-homepages

2024-09-28

Reposted 1533.

Artifacts: Image + Link Organizer app for macOS/iOS

artifacts.app

Artifacts is an image + link organizer app for macOS and iOS. A completely native, local-first way to save all that stuff you find across the web.

2024-09-02

1488.

iipc/awesome-web-archiving: An Awesome List for getting started with web archiving

github.com/iipc/awesome-web-archiving

An Awesome List for getting started with web archiving - iipc/awesome-web-archiving

1486.

go-shiori/obelisk: Go package and CLI tool for saving web page as single HTML file

github.com/go-shiori/obelisk
  • Embeds all resources (e.g. CSS, image, JavaScript, etc) producing a single HTML5 document that is easy to store and share.

  • In case the submitted URL is not HTML (for example a PDF page), Obelisk will still save it as it is.

  • Assets are downloaded concurrently, which makes archiving a web page quite fast.

  • Accepts cookies, useful for pages that need a login or articles behind a paywall.

Might actually be a good choice for Betula archival. Might need to fork it, though.
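
To get a feel for what "embeds all resources" involves, here is a minimal Python sketch. It is not Obelisk's code (Obelisk is written in Go), it handles only src="..." attributes, and it assumes the page and its images are already on disk. The names to_data_uri and inline_images are made up for illustration.

```python
import base64
import mimetypes
import re
from pathlib import Path


def to_data_uri(path: Path) -> str:
    """Encode a local file as a data: URI (RFC 2397)."""
    # Fall back to a generic type when the extension is unknown.
    mime = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
    payload = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime};base64,{payload}"


def inline_images(html: str, base_dir: Path) -> str:
    """Replace src="..." values that point at local files with data URIs."""

    def swap(match: re.Match) -> str:
        ref = match.group(1)
        # Leave remote URLs and already-inlined data URIs untouched.
        if "://" in ref or ref.startswith("data:"):
            return match.group(0)
        target = base_dir / ref
        if not target.is_file():
            return match.group(0)  # missing file: keep the original reference
        return f'src="{to_data_uri(target)}"'

    return re.sub(r'src="([^"]+)"', swap, html)
```

A real archiver would also have to fetch remote resources and rewrite CSS, scripts, and srcset attributes. The sketch only shows why the result can live in a single HTML file.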

2024-09-01

1477.

edent/Tweet2Embed: Convert a public Tweet into embedded semantic HTML

github.com/edent/Tweet2Embed

Convert a public Tweet into embedded semantic HTML - edent/Tweet2Embed

2024-06-28

1354.

2024-06-27 Keep your own archives

alexschroeder.ch/view/2024-06-27-keep-archives

Alex reminds us to keep our own archives of stuff we like. I remember how we had a discussion about that last November.

2024-06-07

1299.

imputnet/cobalt: save what you love

github.com/imputnet/cobalt

save what you love. Contribute to imputnet/cobalt development by creating an account on GitHub.

2024-03-03

1174.

Launch: History Book - And a Dinosaur

andadinosaur.com/launch-history-book

History Book automatically saves the content of your browsing history for searching. And it does it in a privacy-friendly way.

2024-02-03

1095.

Paperless-ngx

docs.paperless-ngx.com

Turn paper documents into a searchable digital database.

2024-02-02

1089.

archive.ph

archive.today

Archive.today is a time capsule for web pages! It takes a 'snapshot' of a webpage that will always be online even if the original page disappears. It saves a text and a graphical copy of the page for better accuracy and provides a short and reliable link to an unalterable record of any web page.

2024-01-18

1056.

TubeArchivist

www.tubearchivist.com

Archive whole YouTube channels.

2024-01-11

1034.

The Inadequacy of Most Proposed Approaches

web.archive.org/web/20010519155455/http://www.clir.org/PUBS/reports/rothenberg/inadequacy.html

Most approaches that have been proposed fall into one of four categories: (1) reliance on hard copy, (2) reliance on standards, (3) reliance on computer museums, or (4) reliance on migration. Though some of these may play a role in an ultimate solution, none of them comes close to providing a solution by itself, nor does their combination.

On archival.

2023-11-28

931.

wormi4ok/evernote2md

github.com/wormi4ok/evernote2md

Convert Evernote .enex files to Markdown.

930.

ArchiveBox

archivebox.io

🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more…

2023-08-15

571.

adbar/trafilatura

github.com/adbar/trafilatura

2023-07-31

531.

igdl - Instagram Image Downloader

www.datagubbe.se/igdl

igdl is a Python script that downloads an image from a given Instagram URL and either saves it directly to disk or writes it to stdout. It will automatically pick the highest image resolution available.

530.

bellingcat/snscrape

github.com/bellingcat/snscrape

snscrape is a scraper for social networking services (SNS). It scrapes things like user profiles, hashtags, or searches and returns the discovered items, e.g. the relevant posts.

Supports many major social networks!

2023-07-26

513.

A Paper Internet

carlos.bueno.org/2010/09/paper-internet.html

Build your time capsule with epoxy.

2023-07-24

502.

ArchiveTeam/grab-site

github.com/ArchiveTeam/grab-site

The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns

2023-07-17

474.

dosyago/DiskerNet

github.com/dosyago/DiskerNet

DiskerNet empowers you to be the master archivist of your own internet browsing. As a robust, lightweight tool, DiskerNet seamlessly connects to your browser, saving and organizing your online discoveries in real-time. With an option to archive everything or only bookmark-worthy content, DiskerNet places you in full control of your browsing history. No special plugins or extensions required.

2023-07-15

471.

web archiving

agnessa.pp.ru/computer/20210731230816-web_archiving.html

A collection of links about web archiving.

470.

deathau/markdownload

github.com/deathau/markdownload

This is an extension to clip websites and download them into a readable markdown file. Please keep in mind that it is not guaranteed to work on all websites.

469.

mozilla/readability: A standalone version of the readability lib

github.com/mozilla/readability

A program by Mozilla that powers the Reader mode in Firefox and many other programs. Something I like.

2023-07-09

439.

Save Your Threads

social.perma.cc

Save birdsite threads to PDFs. I wonder if it still works.

2023-06-13

334.

Archive it or you will miss it

drewdevault.com/2017/06/19/Archive-it-or-miss-it.html

At this point, link rot is an axiom of the internet. In the face of this, I store a personal offline archive of anything I want to see twice. When I see a cool YouTube video I like, I archive the entire channel right away. Rather than subscribe to it, I update my archive on a cronjob. I scrape content out of RSS feeds and into offline storage and I have dozens of websites archived with wget. I mirror most git repositories I’m interested in. I have DRM-free offline copies of all of my music, TV shows, and movies, ill-begotten or not.