Poster: MGeog2022 Date: Apr 20, 2024 4:40am
Forum: forums Subject: Internet Archive backups and long-term continuity

Hello. As I posted here: https://meta.wikimedia.org/wiki/Wikimedia_Forum#Collaboration_with_Internet_Archive
as far as is publicly known, all content in the Internet Archive is hosted in the San Francisco area, in only 2 copies across the Archive's 4 datacenters in that area.
I think the Archive is one of humanity's greatest achievements. The volatile Internet, where almost all human information now lives, is made permanent. All the content of Wikipedia and its sister projects ("the sum of all human knowledge", as its slogan says) is also here, along with the content of almost all of its external links, which let readers verify and find further information on any subject. In short, the Archive is something truly great and unique.
Precisely because it is so important, I worry that, in the current situation, a significant part of the Archive's content is likely to be lost over the next few decades. Following the 3-2-1 backup rule for such an immense collection is probably only a dream. But having only 2 copies in the same area, an area that we are sure will suffer a magnitude 8 earthquake at some point in the future, is in my view the biggest threat to this wonderful library.
People often talk about financial or political threats, or about legal problems. I don't think any of those will ever destroy the Archive. It is of great value to everyone, and almost nobody would want to destroy it. If the organization as such ever disappears, I bet enough donations would be made to create another one to carry on its work. The same applies if it runs out of money. No judge or politician (under a free regime) would have the courage to destroy the new Library of Alexandria, and if it came to that, people would have time to mobilize to prevent it.
But "the big one", the earthquake that will some day hit San Francisco, will give no warning. Of course, the city and the vast majority of its population will survive, but can we bet the same for most of the Internet Archive's content? If no offline backups exist in a distant place, I think creating them is urgent (seeking additional funds for it, if needed). Moving some or all of the datacenters to places with lower natural risks would be another option. It is also possible that the Archive's datacenters are already earthquake-safe (not only the buildings, but even the server racks), and that the real danger is not as great as I think it is. True backups are always a must, though (cyberattacks, etc.). I do know that backing up 200 PB is a real challenge, but I think it is a very important necessity.
Perhaps raising awareness that all of this content is really in danger could bring in the much-needed donations to accomplish it (for example, the Wikimedia Foundation is getting perhaps 5 times the money that the Archive gets, so it isn't impossible).
This post was modified by MGeog2022 on 2024-04-20 11:40:11