Skip to main content

SSD reliability in the real world: Google's experience

http://www.zdnet.com/article/ssd-reliability-in-the-real-world-googles-experience/


SSDs are a new phenomenon in the datacenter. We have theories about how they should perform, but until now, little data. That's just changed.
The FAST 2016 paper Flash Reliability in Production: The Expected and the Unexpected, (the paper is not available online until Friday) by Professor Bianca Schroeder of the University of Toronto, and Raghav Lagisetty and Arif Merchant of Google, covers:
  • Millions of drive days over 6 years
  • 10 different drive models
  • 3 different flash types: MLC, eMLC and SLC
  • Enterprise and consumer drives

Key conclusions

  • Ignore Uncorrectable Bit Error Rate (UBER) specs. A meaningless number.
  • Good news: Raw Bit Error Rate (RBER) increases slower than expected from wearout and is not correlated with UBER or other failures.
  • High-end SLC drives are no more reliable that MLC drives.
  • Bad news: SSDs fail at a lower rate than disks, but UBER rate is higher (see below for what this means).
  • SSD age, not usage, affects reliability.
  • Bad blocks in new SSDs are common, and drives with a large number of bad blocks are much more likely to lose hundreds of other blocks, most likely due to die or chip failure.
  • 30-80 percent of SSDs develop at least one bad block and 2-7 percent develop at least one bad chip in the first four years of deployment.

The Storage Bits take

Two standout conclusions from the study. First, that MLC drives are as reliable as the more costly SLC "enteprise" drives. This mirrors hard drive experience, where consumer SATA drives have been found to be as reliable as expensive SAS and Fibre Channel drives.
One of the major reasons that "enterprise" SSDs are more expensive is due to greater over-provisioning. SSDs are over-provisioned for two main reasons: to allow for ample bad block replacement caused by flash wearout; and, to ensure that garbage collection does not cause write slowdowns.
The paper's second major conclusion, that age, not use, correlates with increasing error rates, means that over-provisioning for fear of flash wearout is not needed. None of the drives in the study came anywhere near their write limits, even the 3,000 writes specified for the MLC drives.
But it isn't all good news. SSD UBER rates are higher than disk rates, which means that backing up SSDs is even more important than it is with disks. The SSD is less likely to fail during its normal life, but more likely to lose data.

Comments

Popular posts from this blog

The Difference Between LEGO MINDSTORMS EV3 Home Edition (#31313) and LEGO MINDSTORMS Education EV3 (#45544)

http://robotsquare.com/2013/11/25/difference-between-ev3-home-edition-and-education-ev3/ This article covers the difference between the LEGO MINDSTORMS EV3 Home Edition and LEGO MINDSTORMS Education EV3 products. Other articles in the ‘difference between’ series: * The difference and compatibility between EV3 and NXT ( link ) * The difference between NXT Home Edition and NXT Education products ( link ) One robotics platform, two targets The LEGO MINDSTORMS EV3 robotics platform has been developed for two different target audiences. We have home users (children and hobbyists) and educational users (students and teachers). LEGO has designed a base set for each group, as well as several add on sets. There isn’t a clear line between home users and educational users, though. It’s fine to use the Education set at home, and it’s fine to use the Home Edition set at school. This article aims to clarify the differences between the two product lines so you can decide which...

Let’s ban PowerPoint in lectures – it makes students more stupid and professors more boring

https://theconversation.com/lets-ban-powerpoint-in-lectures-it-makes-students-more-stupid-and-professors-more-boring-36183 Reading bullet points off a screen doesn't teach anyone anything. Author Bent Meier Sørensen Professor in Philosophy and Business at Copenhagen Business School Disclosure Statement Bent Meier Sørensen does not work for, consult to, own shares in or receive funding from any company or organisation that would benefit from this article, and has no relevant affiliations. The Conversation is funded by CSIRO, Melbourne, Monash, RMIT, UTS, UWA, ACU, ANU, ASB, Baker IDI, Canberra, CDU, Curtin, Deakin, ECU, Flinders, Griffith, the Harry Perkins Institute, JCU, La Trobe, Massey, Murdoch, Newcastle, UQ, QUT, SAHMRI, Swinburne, Sydney, UNDA, UNE, UniSA, UNSW, USC, USQ, UTAS, UWS, VU and Wollongong. ...

Logic Analyzer with STM32 Boards

https://sysprogs.com/w/how-we-turned-8-popular-stm32-boards-into-powerful-logic-analyzers/ How We Turned 8 Popular STM32 Boards into Powerful Logic Analyzers March 23, 2017 Ivan Shcherbakov The idea of making a “soft logic analyzer” that will run on top of popular prototyping boards has been crossing my mind since we first got acquainted with the STM32 Discovery and Nucleo boards. The STM32 GPIO is blazingly fast and the built-in DMA controller looks powerful enough to handle high bandwidths. So having that in mind, we spent several months perfecting both software and firmware side and here is what we got in the end. Capturing the signals The main challenge when using a microcontroller like STM32 as a core of a logic analyzer is dealing with sampling irregularities. Unlike FPGA-based analyzers, the microcontroller has to share the same resources to load instructions from memory, read/write th...