•  TehPers   ( @TehPers@beehaw.org ) 
    51 month ago

    On the flip side, nobody can be expected to keep their website up for 4000 years. Hosting costs money and time, and at some point, the thing you’re hosting will fall out of relevance enough to no longer be worth the cost.

    This is why archiving is important. Hopefully most of the content that was lost was archived at some point. Getting a good chunk of that content onto long-term storage would do future generations a favor (even if it’s just a bunch of tape storage locked away in a warehouse or something).

    • This is true. Right now the OG internet is sort of kept alive by oral history, but we have the technology to save these websites in perpetuity as historical artifacts. That might be a good coding project: a robust archiving system that you point at a URL, and it scrapes everything under that domain and keeps a static copy of its contents. The issue, though, is that this doesn’t truly “capture” many web pages. A lot of the backend data that was served dynamically from a database isn’t retrievable by scraping, so the experience of actually using the page is potentially non-archivable.
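
    For anyone curious what the "point it at a URL and keep a static copy" idea could look like, here is a rough sketch in Python using only the standard library. Everything named here (`archive_site`, `local_path`, `LinkParser`, the `fetch_page` and `max_pages` parameters) is made up for illustration and is not an existing tool. It only follows plain `<a href>` links on the same host and saves the raw HTML, so it ignores images, CSS, scripts, query strings, and robots.txt, and it has exactly the limitation described above: anything generated server-side from a database is only captured as whatever HTML happened to be served.

```python
from collections import deque
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch(url):
    """Default fetcher: download a page and decode it as text."""
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def local_path(out_dir, url):
    """Map a URL to a file under out_dir/<host>/ (query strings are ignored)."""
    parsed = urlparse(url)
    path = parsed.path
    if not path or path.endswith("/"):
        path += "index.html"  # directory-style URLs get an index file
    return Path(out_dir) / parsed.netloc / path.lstrip("/")


def archive_site(start_url, out_dir, fetch_page=fetch, max_pages=100):
    """Breadth-first crawl of same-host pages, saving each page's HTML.

    fetch_page is injectable (hypothetical parameter) so the crawl logic
    can be tried without touching the network.
    Returns the list of URLs that were saved, in crawl order.
    """
    domain = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    saved = []
    while queue and len(saved) < max_pages:
        url = queue.popleft()
        try:
            html = fetch_page(url)
        except (OSError, ValueError):
            continue  # dead link or unsupported URL: skip it and move on
        target = local_path(out_dir, url)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(html, encoding="utf-8")
        saved.append(url)

        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" parts.
            link, _ = urldefrag(urljoin(url, href))
            parts = urlparse(link)
            same_site = parts.scheme in ("http", "https") and parts.netloc == domain
            if same_site and link not in seen:
                seen.add(link)
                queue.append(link)
    return saved
```

    You would call it with something like `archive_site("https://example.org/", "archive")`, where both arguments are placeholders. A real archiver would also need to rewrite links to point at the local copies, handle a URL like `/about` that is both a page and a directory prefix, and rate-limit itself, none of which this sketch attempts.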