
That is a problem, but it's a problem that you have with "roll your own infrastructure" as well. Digital data isn't like paper -- archiving doesn't mean keeping tapes in a vault. You need to periodically refresh and check the integrity of the media.

Amazon's service gives you an expectation that, on average, 99.999999999% of your data will survive intact in any given year. So while you still need to maintain your data logically, you can set clear expectations with respect to the integrity of that data.

The fact that Amazon's infrastructure may not be around in 20 years isn't especially relevant either -- neither will any infrastructure you maintain yourself. Tapes need to be refreshed, archival storage infrastructure usually ages out in 8-12 years, and you need to do a TCO analysis at each refresh.

The other thing is that you get physical isolation. Flooding hasn't been a problem in lower Manhattan or Staten Island in recent memory. How many safe deposit boxes do you think were destroyed in those two places recently?



I know Amazon says they have 11 9s, but merely given that there have been 5 mass extinction events in the past 500M years, you probably can't do better than 8 9s. (11 9s gives you even odds that your data will last 70 billion years, as long as you pay Amazon's fee.)
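For anyone who wants to check the arithmetic above, here is a rough sketch. It assumes a constant, independent annual loss probability, which is a simplification and not something Amazon states; the function names are made up for illustration.

```python
import math

def nines_to_annual_loss(nines: int) -> float:
    """Annual loss probability implied by a given number of nines (assumed model)."""
    return 10.0 ** (-nines)

def even_odds_years(annual_loss: float) -> float:
    """Years t at which survival probability (1 - p)^t drops to 0.5."""
    # log1p keeps precision when annual_loss is tiny.
    return math.log(0.5) / math.log1p(-annual_loss)

# 11 nines: years until survival is a coin flip.
half_life_11 = even_odds_years(nines_to_annual_loss(11))

# 5 mass extinctions in 500M years, treated as a flat annual rate.
extinction_rate = 5 / 500e6
extinction_nines = -math.log10(extinction_rate)

print(f"even odds after about {half_life_11:.3g} years")
print(f"extinction-limited durability: about {extinction_nines:.0f} nines")
```

Under these assumptions the even-odds horizon is ln(2) / 10^-11 years, which is where the roughly 70 billion figure comes from.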


You're probably right that Amazon won't hold your data for as long as that. However, there's another way to look at 11 nines: if you upload one hundred billion (10^11) files to Glacier, you should expect to lose about one of them per year on average. That's probably realistically achievable.

Compare this with S3, where you will lose 1 in one billion files per year. Shocking!
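A quick sketch of the per-file view, assuming each file is lost independently with the same annual probability. The nines values are the ones quoted in this thread (11 for Glacier, 9 for the S3 figure above) and are not checked against Amazon's documentation; the function names are hypothetical.

```python
import math

def expected_annual_losses(n_files: int, nines: int) -> float:
    """Expected number of files lost per year at the given number of nines."""
    return n_files / 10 ** nines

def prob_any_loss(n_files: int, nines: int) -> float:
    """Probability of losing at least one file in a year, assuming independent losses."""
    p = 10.0 ** (-nines)
    # 1 - (1 - p)^n, computed stably for tiny p and huge n.
    return -math.expm1(n_files * math.log1p(-p))

n = 10 ** 11  # one hundred billion files

glacier_expected = expected_annual_losses(n, 11)  # 11 nines, as quoted for Glacier
s3_expected = expected_annual_losses(n, 9)        # "1 in one billion", as quoted above
glacier_any = prob_any_loss(n, 11)

print(glacier_expected, s3_expected, round(glacier_any, 3))
```

Note that an expectation of one lost file per year is an average, not a cap: under this model the chance of losing at least one file in a given year is about 1 - 1/e, and some years would see more than one loss.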



