Tag archive

24 bookmarks have this tag.

Archiving stuff so it stays with you.

2024-09-28

Reposted 1533.

Artifacts: Image + Link Organizer app for macOS/iOS

artifacts.app

Artifacts is an image + link organizer app for macOS and iOS. A
completely native, local-first way to save all that stuff you find
across the web.

2024-09-02

1488.

iipc/awesome-web-archiving: An Awesome List for getting started with web archiving

github.com/iipc/awesome-web-archiving

An Awesome List for getting started with web archiving - iipc/awesome-web-archiving

1486.

go-shiori/obelisk: Go package and CLI tool for saving web page as single HTML file

github.com/go-shiori/obelisk

  • Embeds all resources (e.g. CSS, images, JavaScript) to produce a single HTML5 document that is easy to store and share.

  • In case the submitted URL is not HTML (for example a PDF page), Obelisk will still save it as it is.

  • Assets are downloaded concurrently, which makes archiving a web page quite fast.

  • Accepts cookies, which is useful for pages that require a login or articles behind a paywall.

Might actually be a good choice for Betula archival. Might need to fork it, though.

2024-09-01

1477.

edent/Tweet2Embed: Convert a public Tweet into embedded semantic HTML

github.com/edent/Tweet2Embed

Convert a public Tweet into embedded semantic HTML - edent/Tweet2Embed

2024-06-28

1354.

2024-06-27 Keep your own archives

alexschroeder.ch/view/2024-06-27-keep-archives

Alex reminds us to keep our own archives of stuff we like. I remember how we had a discussion about that last November.

2024-06-07

1299.

imputnet/cobalt: save what you love

github.com/imputnet/cobalt

save what you love. Contribute to imputnet/cobalt development by creating an account on GitHub.

2024-03-03

1174.

Launch: History Book - And a Dinosaur

andadinosaur.com/launch-history-book

History Book automatically saves the content of your browsing history for searching. And it does it in a privacy-friendly way.

2024-02-03

1095.

Paperless-ngx

docs.paperless-ngx.com

Turn paper documents into a searchable digital database.

2024-02-02

1089.

archive.ph

archive.today

Archive.today is a time capsule for web pages! It takes a 'snapshot' of a webpage that will always be online even if the original page disappears. It saves a text and a graphical copy of the page for better accuracy and provides a short and reliable link to an unalterable record of any web page.

2024-01-18

1056.

TubeArchivist

www.tubearchivist.com

Archive whole YouTube channels.

2024-01-11

1034.

The Inadequacy of Most Proposed Approaches

web.archive.org/web/20010519155455/http://www.clir.org/PUBS/reports/rothenberg/inadequacy.html

Most approaches that have been proposed fall into one of four categories: (1) reliance on hard copy, (2) reliance on standards, (3) reliance on computer museums, or (4) reliance on migration. Though some of these may play a role in an ultimate solution, none of them comes close to providing a solution by itself, nor does their combination.

On archival.

2023-11-28

931.

wormi4ok/evernote2md

github.com/wormi4ok/evernote2md

Convert Evernote .enex files to Markdown.

930.

ArchiveBox

archivebox.io

🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more…

2023-08-15

571.

adbar/trafilatura

github.com/adbar/trafilatura

2023-07-31

531.

igdl - Instagram Image Downloader

www.datagubbe.se/igdl

igdl is a Python script for downloading an image from a given
Instagram URL and either saving it directly to disk or writing it to stdout.
It will automatically pick the highest image resolution available.

530.

bellingcat/snscrape

github.com/bellingcat/snscrape

snscrape is a scraper for social networking services (SNS). It scrapes things like user profiles, hashtags, or searches and returns the discovered items, e.g. the relevant posts.

Supports many major social networks!

2023-07-26

513.

A Paper Internet

carlos.bueno.org/2010/09/paper-internet.html

Build your time capsule with epoxy.

2023-07-24

502.

ArchiveTeam/grab-site

github.com/ArchiveTeam/grab-site

The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns

2023-07-17

474.

dosyago/DiskerNet

github.com/dosyago/DiskerNet

DiskerNet empowers you to be the master archivist of your own internet browsing. As a robust, lightweight tool, DiskerNet seamlessly connects to your browser, saving and organizing your online discoveries in real-time. With an option to archive everything or only bookmark-worthy content, DiskerNet places you in full control of your browsing history. No special plugins or extensions required.

2023-07-15

471.

web archiving

agnessa.pp.ru/computer/20210731230816-web_archiving.html

A collection of links about web archiving.

470.

deathau/markdownload

github.com/deathau/markdownload

This is an extension to clip websites and download them into a readable markdown file. Please keep in mind that it is not guaranteed to work on all websites.

469.

mozilla/readability: A standalone version of the readability lib

github.com/mozilla/readability

A library by Mozilla that powers Reader mode in Firefox and many other programs. Something I like.

2023-07-09

439.

Save Your Threads

social.perma.cc

Save birdsite threads to PDFs. I wonder if it still works.

2023-06-13

334.

Archive it or you will miss it

drewdevault.com/2017/06/19/Archive-it-or-miss-it.html

At this point, link rot is an axiom of the internet. In the face of this, I store a personal offline archive of anything I want to see twice. When I see a cool YouTube video I like, I archive the entire channel right away. Rather than subscribe to it, I update my archive on a cronjob. I scrape content out of RSS feeds and into offline storage and I have dozens of websites archived with wget. I mirror most git repositories I’m interested in. I have DRM free offline copies of all of my music, TV shows, and movies, ill-begotten or not.