Skip to main content

SSD reliability in the real world: Google's experience

http://www.zdnet.com/article/ssd-reliability-in-the-real-world-googles-experience/


SSDs are a new phenomenon in the datacenter. We have theories about how they should perform, but until now, little data. That's just changed.
The FAST 2016 paper Flash Reliability in Production: The Expected and the Unexpected, (the paper is not available online until Friday) by Professor Bianca Schroeder of the University of Toronto, and Raghav Lagisetty and Arif Merchant of Google, covers:
  • Millions of drive days over 6 years
  • 10 different drive models
  • 3 different flash types: MLC, eMLC and SLC
  • Enterprise and consumer drives

Key conclusions

  • Ignore Uncorrectable Bit Error Rate (UBER) specs. A meaningless number.
  • Good news: Raw Bit Error Rate (RBER) increases slower than expected from wearout and is not correlated with UBER or other failures.
  • High-end SLC drives are no more reliable that MLC drives.
  • Bad news: SSDs fail at a lower rate than disks, but UBER rate is higher (see below for what this means).
  • SSD age, not usage, affects reliability.
  • Bad blocks in new SSDs are common, and drives with a large number of bad blocks are much more likely to lose hundreds of other blocks, most likely due to die or chip failure.
  • 30-80 percent of SSDs develop at least one bad block and 2-7 percent develop at least one bad chip in the first four years of deployment.

The Storage Bits take

Two standout conclusions from the study. First, that MLC drives are as reliable as the more costly SLC "enteprise" drives. This mirrors hard drive experience, where consumer SATA drives have been found to be as reliable as expensive SAS and Fibre Channel drives.
One of the major reasons that "enterprise" SSDs are more expensive is due to greater over-provisioning. SSDs are over-provisioned for two main reasons: to allow for ample bad block replacement caused by flash wearout; and, to ensure that garbage collection does not cause write slowdowns.
The paper's second major conclusion, that age, not use, correlates with increasing error rates, means that over-provisioning for fear of flash wearout is not needed. None of the drives in the study came anywhere near their write limits, even the 3,000 writes specified for the MLC drives.
But it isn't all good news. SSD UBER rates are higher than disk rates, which means that backing up SSDs is even more important than it is with disks. The SSD is less likely to fail during its normal life, but more likely to lose data.

Comments

Popular posts from this blog

The Difference Between LEGO MINDSTORMS EV3 Home Edition (#31313) and LEGO MINDSTORMS Education EV3 (#45544)

http://robotsquare.com/2013/11/25/difference-between-ev3-home-edition-and-education-ev3/ This article covers the difference between the LEGO MINDSTORMS EV3 Home Edition and LEGO MINDSTORMS Education EV3 products. Other articles in the ‘difference between’ series: * The difference and compatibility between EV3 and NXT ( link ) * The difference between NXT Home Edition and NXT Education products ( link ) One robotics platform, two targets The LEGO MINDSTORMS EV3 robotics platform has been developed for two different target audiences. We have home users (children and hobbyists) and educational users (students and teachers). LEGO has designed a base set for each group, as well as several add on sets. There isn’t a clear line between home users and educational users, though. It’s fine to use the Education set at home, and it’s fine to use the Home Edition set at school. This article aims to clarify the differences between the two product lines so you can decide which

Let’s ban PowerPoint in lectures – it makes students more stupid and professors more boring

https://theconversation.com/lets-ban-powerpoint-in-lectures-it-makes-students-more-stupid-and-professors-more-boring-36183 Reading bullet points off a screen doesn't teach anyone anything. Author Bent Meier Sørensen Professor in Philosophy and Business at Copenhagen Business School Disclosure Statement Bent Meier Sørensen does not work for, consult to, own shares in or receive funding from any company or organisation that would benefit from this article, and has no relevant affiliations. The Conversation is funded by CSIRO, Melbourne, Monash, RMIT, UTS, UWA, ACU, ANU, ASB, Baker IDI, Canberra, CDU, Curtin, Deakin, ECU, Flinders, Griffith, the Harry Perkins Institute, JCU, La Trobe, Massey, Murdoch, Newcastle, UQ, QUT, SAHMRI, Swinburne, Sydney, UNDA, UNE, UniSA, UNSW, USC, USQ, UTAS, UWS, VU and Wollongong.

Building a portable GSM BTS using the Nuand bladeRF, Raspberry Pi and YateBTS (The Definitive and Step by Step Guide)

https://blog.strcpy.info/2016/04/21/building-a-portable-gsm-bts-using-bladerf-raspberry-and-yatebts-the-definitive-guide/ Building a portable GSM BTS using the Nuand bladeRF, Raspberry Pi and YateBTS (The Definitive and Step by Step Guide) I was always amazed when I read articles published by some hackers related to GSM technology. H owever , playing with GSM technologies was not cheap until the arrival of Software Defined Radios (SDRs), besides not being something easy to be implemented. A fter reading various articles related to GSM BTS, I noticed that there were a lot of inconsistent and or incomplete information related to the topic. From this, I decided to write this article, detailing and describing step by step the building process of a portable and operational GSM BTS. Before starting with the “hands on”, I would like to thank all the pioneering Hackers and Researchers who started the studies related to previously closed GSM technology. In particul