With hard drive sizes increasing so quickly but hard drive transfer speeds basically flat, I wonder if there are long-term implications for them with respect to recovery from backup and downtime. For example, if a whole rack goes down and they are on 32TB drives in the future, could it take a week or more for their data to come back online?
No one really expects that rotational rust will get much faster, and in fact history shows that, compared to the increase in density, the increase in transfer rate is laughable at best. Between 1990 and today you are probably looking at a 20,000x increase in density, yet transfer rates only increased by a factor of around 150-200. [In fact, from the early 1960s to today it's only a factor of about 1000. There are quite possibly few performance metrics that have increased as slowly as disk transfer rate.]
Why is that?
Increasing density only marginally increases transfer speed: most density increases are achieved by packing more tracks onto the platter, while storing more sectors per track plays a minor role. But a single R/W head can only read a single track, not parallel tracks, so speed only increases if you pack more sectors into each track, not if you add more tracks. That's why performance in desktop or server drives differs only a little between today and 10 years ago, compared to the capacity increase to 8+ TB in a 3.5" drive.
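A rough sanity check on the numbers upthread, assuming (as a simplification) that areal density gains split about evenly between track density and along-track bit density - only the latter helps transfer rate:

```python
import math

# Areal density = tracks-per-inch x bits-per-inch-along-track.
# Sequential transfer rate only scales with the linear (along-track)
# bit density, so if density gains split roughly evenly between the
# two axes, transfer rate grows as the square root of areal density.
areal_density_gain = 20_000  # ~1990 -> today, per the comment above
transfer_gain_estimate = math.sqrt(areal_density_gain)
print(f"predicted transfer-rate gain: ~{transfer_gain_estimate:.0f}x")
```

sqrt(20,000) is about 141, which lands right next to the observed 150-200x range.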
More platters also don't help in transfer rate, because the alignment of all heads on the actuator is fixed: at any given time only one platter and one R/W head is used (locked to the track). [More platters can help reduce seek time in certain scenarios though]
Higher density means more data per track, not just more tracks per disk. You get an entire track per revolution, so a track with more data means more MB/s. So linear reads on a higher-density drive are faster, and semi-linear accesses (i.e., reading two files that are next to each other) do get faster.
I remember reading a story about a guy who built a drive array with high capacity 7200 RPM drives that got within 20% of the performance of the 10K RPM setup they had, by partitioning the drives at the same capacity as the 10K equivalent. The head only had half as many tracks to traverse, so worst case access time was better, and the higher density made up for the lower RPMs.
Short stroking helps get more performance from the disk, at least in terms of latency. The bandwidth change occurs because the linear density is fixed and outer tracks hold more blocks than inner ones, so in one revolution you can read more data, and you don't need to change tracks as often.
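A toy model of that effect: zoned bit recording keeps linear density roughly constant, so sectors per track scale with track radius. All numbers below are ballpark assumptions, not from any datasheet:

```python
import math

rpm = 7200
revs_per_sec = rpm / 60.0                      # 120 revolutions per second
bytes_per_mm_of_track = 6_000                  # hypothetical linear density
outer_radius_mm, inner_radius_mm = 46.0, 22.0  # rough 3.5" platter zone radii

def track_throughput_mb_s(radius_mm):
    # One full track streams past the head per revolution.
    track_bytes = 2 * math.pi * radius_mm * bytes_per_mm_of_track
    return track_bytes * revs_per_sec / 1e6

outer = track_throughput_mb_s(outer_radius_mm)
inner = track_throughput_mb_s(inner_radius_mm)
print(f"outer: {outer:.0f} MB/s, inner: {inner:.0f} MB/s, ratio {outer / inner:.2f}")
```

Short stroking a partition onto just the outer zone keeps you on the fast, high-sectors-per-track side of that ratio while also shortening seeks.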
Your parent comment is right though: there have only been small changes in bit density along the track in recent years, so bandwidth is not improving by much.
How about the stupid idea of making the minimum block size a multiple of the number of platters? The block is divided evenly amongst all platters, always at the same parallel track/sector. That way read/write speed becomes a function of the number of platters.
Instead of 50-100MB/s, you could get 4-8x the speed in large linear transfers, which would help get dead racks back up faster and would work quite well in Backblaze's backup model.
Your block sizes will be huge, but I think in Backblaze's case that doesn't really matter so much.
You cannot read from multiple platters in parallel: when you are aligned to read a certain track on a certain side of one platter, you are not necessarily aligned on all the other platters. The head mechanism is not accurate enough for that.
Google's paper on datacenter hard drives addresses this point -- it's easier to do this and other tricks with a set of disks than attempting it within the confines of a single drive.
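For illustration, the "set of disks" version of the parent's idea is just RAID-0 style striping, which a controller or filesystem can do without any head-alignment problem. A minimal in-memory sketch, with BytesIO objects standing in for drives:

```python
import io

STRIPE = 4096  # bytes per stripe unit (arbitrary choice)

def stripe_write(data, drives):
    """Deal out fixed-size chunks round-robin across the drives."""
    for i in range(0, len(data), STRIPE):
        drives[(i // STRIPE) % len(drives)].write(data[i:i + STRIPE])

def stripe_read(total_len, drives):
    """Read chunks back in the same round-robin order and reassemble."""
    for d in drives:
        d.seek(0)
    out = bytearray()
    i = 0
    while len(out) < total_len:
        out += drives[i % len(drives)].read(min(STRIPE, total_len - len(out)))
        i += 1
    return bytes(out)

drives = [io.BytesIO() for _ in range(4)]  # four stand-in "drives"
payload = bytes(range(256)) * 100
stripe_write(payload, drives)
assert stripe_read(len(payload), drives) == payload
```

On real hardware the four writes can proceed in parallel, which is where the roughly Nx linear throughput comes from.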
You could probably do it if the platter diameter were reduced, which would then accordingly reduce capacity as well. At that point you could just as well use two drives and also get lower overall failure probability. Or you use SSD caches, or memory caches, ...
I was thinking of a Frankendrive with 5 actuators, which can do 5 parallel reads/writes, as long as they don't overlap. But the new dimensions would probably be the roadblock.
I was going to ask this too, but start with a simple 2nd voice coil on the opposite side of the spindle, keep the 3.5" form factor and just make it longer. My guess is the motion of one actuator would disrupt the airflow of the other, since we're talking about micrometer-ish (?) mechanical tolerances. This might be able to be mitigated by interleaving which side each arm wrote to. So the left arm's heads address the tops of the platters, the right side addresses the bottoms.
Perhaps another option is to go back to the 1980's hdd designs where the arm moves straight down the radius of the platter. This design might permit multiple heads on the same arm. I'm sure all this stuff has been researched thoroughly.
Either way, this doubles/triples the probability of mechanical failure
> Perhaps another option is to go back to the 1980's hdd designs where the arm moves straight down the radius of the platter. This design might permit multiple heads on the same arm.
At my first job, doing sysadmin work and programming on a PDP-11/44, a couple of fascinating Winchester drives were procured for it. They had clear plastic covers, and you could see everything: the disk and the actuator, which had two heads and was by and large square. I'm almost positive it was rotary, not solenoid-based straight-stroke like many '70s-'80s drives, CDC's famous line in particular.
The Backblaze Vault design mitigates that, as the "raid array" is scattered across 20 Storage Pods in 20 different racks. You'd need more than three racks to go down before you would be offline, and the backup systems in place make that highly unlikely. Andy at Backblaze.
AFAIK, if you physically destroy the contents of a drive with a strong magnetic field, you also destroy the servo tracks, rendering the drive useless. Modern hard drives can't "low-level format" themselves; they're not mechanically precise enough.
I don't know how that would be more awesome - essentially spending effort on upper-middle-class hobbyists. I would rather see them sell the drives directly to the highest bidder and focus the effort on improving their business, or give the drives to charity.
They were reusing their 1TB drives to test new pods with. But I think they would only need a hundred or so for that, unless they die faster in that workload.
I guess my question is more about the long-term implications of transfer speeds increasing much more slowly than hard drive capacity - whether there's a hidden risk to downtime because, all of a sudden, a full rack of 32 or even 64TB drives will take days to transfer, as opposed to an hour or so, because transfer speeds are so slow.
There are implications. ZFS' raid-z2 (which uses 2 disks for redundancy data) was pitched - in 2008 or so - with the idea that disks are now large enough that recovery can take so long that it's likely another disk in the raid set breaks before the redundancy is restored.
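Back-of-envelope for that rationale, with made-up but plausible numbers (5% AFR, a 12-wide set with one disk already down, a 3-day rebuild):

```python
# Probability that at least one more disk dies while redundancy is
# being rebuilt. AFR and rebuild window are illustration numbers only.
afr = 0.05            # 5% annualized failure rate per disk
surviving_disks = 11  # e.g. a 12-wide raid set after one failure
rebuild_days = 3

p_one_disk = 1 - (1 - afr) ** (rebuild_days / 365)
p_second_failure = 1 - (1 - p_one_disk) ** surviving_disks
print(f"~{p_second_failure:.1%} chance of a second failure mid-rebuild")
```

Roughly half a percent per rebuild doesn't sound like much, but across thousands of arrays, with rebuild windows stretching as drives grow, single parity starts to look uncomfortable - hence the second parity disk.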
I suppose if you had a single 32TB drive that went offline for say a week, and then once it came back online you'd have some type of pent up demand and a "slow" transfer speed. Storage systems in general spread the load across multiple devices so the effect of slow transfer speed is near zero in most backup and archiving applications and maybe more problematic in transactional applications.
That takes ~11 hours to fill. At 400x the density it would take 20x as long, or about 9 days. I don't think HDDs are hitting 400x the density any time soon, but if they did it would be a problem.
However, in an array you could take a month to fill a drive to 75% without causing too much trouble, assuming you had enough drives. That puts the limit at around an ~80PB drive. IMO, the real issue is that it would take another month to download all that data, relegating HDDs firmly to archival storage.
PS: I don't think rust is going to reach those densities, making this far less of an issue.
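The arithmetic behind those figures, assuming roughly 8TB at ~200 MB/s today, and transfer rate growing only with the square root of the density gain (per the density-vs-transfer-rate argument upthread):

```python
import math

capacity_bytes = 8e12  # ~8 TB drive
rate_bytes_s = 200e6   # ~200 MB/s sequential
fill_hours = capacity_bytes / rate_bytes_s / 3600

density_gain = 400
rate_gain = math.sqrt(density_gain)     # transfer rate up ~20x
fill_growth = density_gain / rate_gain  # capacity up 400x, rate only 20x
print(f"today: ~{fill_hours:.0f} h to fill; "
      f"at 400x density: ~{fill_hours * fill_growth / 24:.0f} days")
```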
Heck, just a raid rebuild takes days nowadays if you dare use RAID5 still or, less risky, RAID6.
The smart move is to keep several smaller arrays instead of one big one, which lowers risk as well. I don't put anything bigger than 7 disks into production; past that I'm just asking for trouble. It's better to have four 7-disk arrays than one 28-disk array: a drive failure means a quick rebuild, and a restore is going to take 1/4 the time.
Stupid question, but why does it take more time to rebuild a 10-disk array than a 6-disk array? I mean, modern PCIe can do several GB/s and the disks work in parallel, so it should take the same time to rebuild irrespective of how many disks there are.
We have about 20TB in an AWS S3 bucket we'd like to backup somewhere separate from Amazon. Is there any chance of Backblaze offering ingestion from an Amazon Snowball export (https://aws.amazon.com/snowball/)?
I did, I picked HGST because of their reports. No problems so far.
When you buy hard disks, ONLY buy from Amazon or Newegg directly - never buy from a 3rd-party seller on their site. For hard disks there is too much fraud, and the risk of data loss makes it just too risky (unlike other items).
Just bought some drives that were fulfilled via Amazon (but not sold by Amazon.com LLC themselves). You've given me something to check when they arrive.
I don't always buy the exact models, but I switched to HGST a few years ago after studying Backblaze reports, and so far no regrets (I'm running a heavily used postgres on these drives at home).
I chose to get 4 ST4000DM000 drives based on previous reports. Sure HGST drives never die, but it's cheaper to RMA or buy a single new drive if one fails, than the added cost of 4 reliable drives. Assuming only one fails, which is a risk I'm willing to take with my very non-mission-critical data.
I don't read this as 'which drive to buy' but more as 'which drive not to buy'.
Disclaimer: I work at Backblaze. I know you didn't mean that as an absolute, but I just want to point out 100% of drives fail. It's my OCD that makes me point this out. We have NEVER found a drive that lasted forever. There are two types of drives: 1) those that have already failed, and 2) those that are about to fail. For any data you would be annoyed to lose, you need three copies in three locations with three separate vendors (three different pieces of software that don't share any lines of code).
18 disk zpool with three raidz2 vdevs, with 2 disks from 3 different vendors per vdev, bought in two batches per vendor and evenly spread across the case. That's about as paranoid as I consider practical for the homelab/online part of my storage needs.
Also, not sure there are three pieces of storage software that I trust and are readily available to me.
To be somewhat similarly OCD, if your software is controlling storage to each of those drives, then there are many shared LOC involved; indeed, this is the reason why async replication into separate clusters is always recommended for data which can't be lost.
Note to self: stay the hell away from Western Digital. Good lord, an 8.2% annualized failure rate? That's unbelievable. They have the first, second, and third worst failure rates on that chart.
A few years ago, you could have said the same for Seagate. Even worse, actually: certain models had over a 10% failure rate. And some WD models almost had HGST levels of reliability.
Hmmm. Curious. I wonder if Seagate has improved their production somehow to avoid those failure rates, because even with those previously very high rates, they fare much better on the 2013-2016 total chart.
They might have. But I also wonder if those drives were among the first after the 2011 Thailand floods, and if Seagate's factories were still a bit dirty after resuming production.
You should really consider moving this webpage and report to something like AWS S3 when you first release it. Then move back to your usual servers when traffic has fallen off. Your poor servers must melt down when this shows up on Hacker News and Slashdot.
Internally we're blaming our SEO people for putting too much crap on the blog itself ;) But yea, it's worth exploring - though we have our own servers that should be able to handle the load. We haven't had blog loading trouble in a while, so it'll be neat to debug this later :D
From the outside it looks like you're running a fairly intensive Wordpress install on an Apache webserver with no page caching.
Also seems there's no minification or combining of stylesheets/js and there are query strings on those static assets which is going to discourage caching.
No wonder you need a datacenter to handle that kind of resource punishment!
There are plenty of reasons to stick with Wordpress in a decent sized corporation but if not switching to a static site at least stick W3TC on there so you're minimising your server load and serving out static html and minified/combined resources.
You could then consider using Varnish in front of Apache or maybe nginx with a FastCGI cache.
I'm sure you've got some folks on the team who could whip up a W3TC install in 10 minutes.
Because if it is, it's from the team that currently can't keep a blog post online when a few thousand concurrent visitors show up - so you might keep yourself open to suggestions and perhaps undertake the BASIC best practices of keeping a Wordpress site up under load.
If nothing else it shows a basic lack of planning for what you know to be a massively popular post, so turn a little of that judgement back on yourselves.
It's possible to easily handle tens of millions of hits a day on a tiny VPS if you do even some of the basics right[1], and that was without any particularly extensive optimisation.
EDIT: I may not be allowed to reply to the comment below due to HackerNews restrictions, so in case the option doesn't become available in the next while, I'll just say I accept the answer below gracefully, withdraw my daggers and take a calming beer at the end of a long day :-)
I wish you continued success and look forward to the next post.
No, he was agreeing. We have a lot of projects on our map to shore up some of these types of issues, but our admins are in high demand, so some of the lower-priority tasks slip on occasion. Since we rarely have issues with the blog (today was an exception) it tends to be a "we know what we'd like to change, but we'll do it when we have time" type of silo on our website.
*Edit -> to your above edit -> I think if you expand the comment by hitting the "time submitted" link you can leave a reply, thus subverting HN :P
I'd suggest switching to a static site in general. I have no idea what kind of traffic they're sustaining at the moment but NGINX serving up static html (or s3) is a lot more efficient than Wordpress or another blog engine consuming cpu cycles.
Agreed. I'm always interested in the results, even if their physical usage of the drives is miles above anything I'm ever going to do with a hard drive.
IIRC 3TB drives were among the newer drives commercially available at retail when the flooding happened in Thailand and all retail drives were impacted negatively. Furthermore, aggregate capacity is only one factor in the design of a hard drive and most 3TB drives were made in a manner that reduces reliability (more platters of same capacity or fewer platters with greater individual platter capacity, can't quite remember which unfortunately). I don't see why a manufacturer would put the newest technology into product lines that are older so I'd presume that 3TB drives are among the greatest in number of platters and the extra components contributes to the failure rate.
Among the least reliable drives I saw in previous reports were Seagate 3TB drives (supposedly they had worse reliability than the legendary IBM Deathstars) and after reading about how 3TB drives were designed across manufacturers years ago during the flooding crisis I decided to avoid 3TB drives entirely. Seems like my decision is finally getting some data to back it up now in hindsight (no pun intended).
The smaller sample size is accounted for by the width of the confidence interval, which is 5.2% - 7.1%; even the low end of that still looks pretty bad.
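One common way to derive such an interval is to treat failures as a Poisson count over the accumulated drive-days. A sketch with made-up counts (not Backblaze's actual data):

```python
import math

failures = 95          # hypothetical failure count
drive_days = 565_000   # hypothetical accumulated drive-days

afr = failures / drive_days * 365  # annualized failure rate
# Normal approximation to the Poisson: stddev of the count is sqrt(count).
stddev_afr = math.sqrt(failures) / drive_days * 365
lo, hi = afr - 1.96 * stddev_afr, afr + 1.96 * stddev_afr
print(f"AFR {afr:.1%}, 95% CI {lo:.1%} - {hi:.1%}")
```

The interval width shrinks with sqrt(failures), which is why the small drive populations get the widest bands.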
Yev from Backblaze here -> have you checked out B2? We likely won't have a Linux client for our backup service any time soon, but our B2 service has a lot of integrators (like Cloudberry and Duplicity, HashBackup, etc..) that can back up Linux machines, a lot of folks have been going that route.
It's a mixture of a couple things. One is that we tend to run pretty lean and our engineers are all booked up for the foreseeable future. Linux users are a passionate community, but we can't quite justify the development time for a market segment that is not very large. Additionally, because we run an unlimited model, a lot of people would immediately sign up and back up their Linux servers for $5/month and we'd sail out of business. We could address that by putting in limits for those types of devices and only allowing certain Linux builds, but that adds complexity and we want to keep the backup side of the service very simple - which works for the vast majority of folks. So it's a combination of a bunch of factors. We hoped that developing B2 APIs and CLIs would give Linux folks something that they could use if they needed to have offsite backups or archives and wanted to use our infrastructure cause we're pretty neat. Long-winded answer, but TL:DR - small market segment, development time/cost, possible abuse.
It's worth considering though that the Linux compatibility is worth more than the actual market share. Our company (with roughly 90% Mac / 5% Linux / 5% Windows users) went with Crashplan to have a single backup solution for all of the employees.
Absolutely, and that makes perfect sense. Having one system in place definitely beats out multiples. We know we can't be all things for all folks, but Crashplan is great, no hard feelings ;)
Disclaimer: I work at Backblaze. The underlying base of original client backup software was originally written from scratch on three platforms simultaneously: 1) Windows, 2) Macintosh, and 3) Linux. It was designed that way from the beginning. This code continues to compile every time we do a client release, simply as part of the process. However, it is entirely lacking a GUI layer and an installer - those were never written. The underlying backup engine runs even when the user is logged out or the GUI has stopped working.
So it is technically possible, but along the way we released Backblaze B2 (storage API), which not only supports Linux - we assume Linux is the primary customer! We're seeing if that can satisfy the Linux community. Backblaze B2 is a large ongoing effort consuming a lot of our software developers' time.
A note about limited resources: Backblaze never really raised any funding, there are no deep pockets, so we can ONLY hire an additional programmer when the products we sell throw off enough money to pay that salary. We run on really tight margins (thus our obsession with failure rates of drives) which is fabulous for our customers, but not so great for hiring lots of extra help to do projects like a Linux GUI. :-)
Their MacOS client is amazing, so I doubt it's a technical constraint. My guess is that because they don't have size limits on their backups, if all of their customers are backing up terabytes, Backblaze bleeds money, so they depend on most people only backing up a couple hundred gigabytes. Linux users may have a wildly higher amount of stuff to back up such that it's not profitable. And changing their branding to have mostly unlimited except for some people is probably not too appealing either.
Probably because the market is too small to justify the developer time when other developers are willing to spend the time integrating their B2 cloud offering as a driver.
CrashPlan made a different decision. If I was to offer an unlimited backup service, I would offer it to novice users, not a small minority of power users and professionals who were going to break the business model with TBs of data.
Let's be honest: when people see unlimited, most think "I don't have to worry about how much I'm storing" but a small group thinks "How can I take advantage of this?"
> Let's be honest: when people see unlimited, most think "I don't have to worry about how much I'm storing" but a small group thinks "How can I take advantage of this?"
I don't want to take advantage of it. I just happen to have 8TB of data to back up...
But backing up that much data over the internet isn't practical in any case.
CrashPlan built their system using Java and offers enterprise versions of their clients and servers. They have a very different technology stack and business model.
Yeah, not being able to backup the Linux machine at my house means that I really don't want to bother with buying it on my Macs/Windows machines either - having two solutions is a pain :/
Just use the backblaze API in a script with a cronjob, that's what I do. It's linux, you're going to end up writing some code to get what you want, heh.
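A minimal version of that: wrap the B2 CLI's sync command in a script you point cron at. The bucket name and paths below are placeholders, and the exact invocation may differ between b2 CLI versions, so check `b2 --help` on your install:

```python
# b2_backup.py - hypothetical cron-driven offsite backup sketch.
SOURCE = "/home/me/important"
DEST = "b2://my-backup-bucket/important"

def build_sync_command(source, dest):
    # b2 sync mirrors new/changed files up; deliberately no delete
    # flag, so files removed locally are kept remotely (safer for
    # a backup as opposed to a mirror).
    return ["b2", "sync", source, dest]

cmd = build_sync_command(SOURCE, DEST)
print(" ".join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
# crontab entry for a nightly 03:00 run (hypothetical path):
#   0 3 * * * /usr/bin/python3 /home/me/bin/b2_backup.py
```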
It's fantastic to see the confidence interval quoted in those latter tables (I assume 95%?) - that's far more informative than just the mean failure rate.
Yev from Backblaze here -> Tape tends to have much higher read times, and can even be more expensive than hard drives in some cases. Since we have Backblaze B2 the data needs to be highly available.