A few years ago I worked with a company called Storage Fusion that provided a SaaS offering for analysing storage systems.  With a few simple scripts it was possible to collect the configuration information on one or more storage arrays and use that data to gain insights into metrics such as utilisation, performance and efficiency.  The benefit of the product that interested me most was the ability to (anonymously) collate and analyse the configuration of petabytes of metadata, either as a snapshot or historically over time.  This data provided valuable insights into market trends and, with deeper analysis (not available at the time), could actually have helped vendors deliver better and lower cost products.

Valuable as the data was, Storage Fusion’s platform was limited.  The metadata available was based largely on configuration information that showed disk and LUN layouts, the efficiency of features like thin provisioning and so on.  What wasn’t possible was to collect data at the level of granularity that has been built into new storage architectures, such as those from Nimble Storage (InfoSight), Tegile (Intellicare), Pure Storage (Pure1) and PernixData (Architect).  These systems have two specific features that provide much greater insight.  First, they collect metrics that analyse all aspects of the platform, including specifics on the workload the system runs.  Second, they collate that data and perform both historical and “what-if” analysis across many customers.  So how could this data be used?  Here are a few ideas and examples:

  • Identifying failing hardware.  Obviously storage media fails from time to time.  However, being able to determine whether a batch of media has a worse failure rate than expected can help both to identify manufacturing defects and to plan to resolve them before other customers are affected.  Shipping a drive to the customer before they are aware of a problem is a cool feature.
  • Optimise Drive Usage.  As systems move to flash, one consideration is the limited endurance SSDs offer.  Initial deployments of flash were a little bit of a guessing game and vendors undoubtedly erred on the side of caution.  However, with field data on the amount of data customers were actually writing, companies like SolidFire were able to provide guarantees around drive lifetime (in this case an unlimited wear warranty).  This has also allowed vendors to safely introduce the use of TLC NAND, which has lower endurance than both SLC and MLC.
  • Improve reliability/availability.  Ultimately having more data on system operations means being able to proactively address customer problems and improve uptime and availability.
  • Reduce Cost.  At some stage or other everything comes down to cost.  Being able to use cheaper drives and reduce parts replacements means vendors can pass on cost reductions to their customers, keeping them competitive.
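
To make the first two ideas a little more concrete, here is a minimal Python sketch of how field data might be used.  Everything in it is an assumption for illustration: the `BatchStats` record, the function names, the simple "more than twice the expected rate" rule and the sample numbers are all hypothetical and are not taken from any vendor's platform.  A real analytics service would almost certainly apply proper statistical tests across far richer telemetry.

```python
from dataclasses import dataclass


@dataclass
class BatchStats:
    """Hypothetical per-batch summary built from field telemetry."""
    batch_id: str
    drives_deployed: int
    drives_failed: int


def flag_suspect_batches(batches, expected_rate, tolerance=2.0):
    """Return ids of batches whose observed failure rate exceeds
    expected_rate * tolerance (a deliberately crude rule)."""
    suspects = []
    for b in batches:
        if b.drives_deployed == 0:
            continue  # nothing in the field yet, so no rate to compare
        observed = b.drives_failed / b.drives_deployed
        if observed > expected_rate * tolerance:
            suspects.append(b.batch_id)
    return suspects


def projected_life_years(rated_writes_tb, observed_tb_per_day):
    """Years until the rated write endurance is used up at the
    write rate actually observed in the field."""
    if observed_tb_per_day <= 0:
        return float("inf")
    return rated_writes_tb / observed_tb_per_day / 365.0


# Made-up sample data, purely for illustration.
fleet = [
    BatchStats("A", drives_deployed=1000, drives_failed=12),
    BatchStats("B", drives_deployed=800, drives_failed=40),
    BatchStats("C", drives_deployed=0, drives_failed=0),
]
suspects = flag_suspect_batches(fleet, expected_rate=0.02)
life = projected_life_years(rated_writes_tb=1000, observed_tb_per_day=0.5)
print(suspects, round(life, 1))
```

The point of the sketch is only that both checks depend on data aggregated across many deployed systems; a single customer rarely has enough drives from one batch, or enough write history, to draw the same conclusions alone.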

SDS and Analytics

As we move to a software-defined world, how will new storage solutions, based around open source or software-only models, deliver the analytics that are currently integrated into hardware appliances?  At this stage I don’t believe I’ve seen anything in solutions like EMC ScaleIO, VMware Virtual SAN or Ceph that would deliver feedback on the infrastructure.  In fact, in many instances failures aren’t even dealt with proactively; the software simply waits for the device to fail.  Dealing with failure before it occurs is a much better way to operate infrastructure, especially at scale.  Imagine if we waited for aircraft parts or bridges to fail before tackling the problem.

I’ll touch on the hardware monitoring issues in my presentation at Tech Unplugged, coming up at the beginning of February in Austin, Texas.  If you will be in the area, please join us for this free event.  You can register here.  Please use the Promotional Code “architectingIT” to help me track who registered as a result of reading this post.

By way of thanks from me, all eligible US registrants using the “architectingIT” code will be entered into a draw for either the PS4 or Xbox One version of the latest Star Wars Battlefront game.  The winning name will be announced on Twitter during the event.  Good luck!

Comments are always welcome; please read our Comments Policy first.  If you have any related links of interest, please feel free to add them as a comment for consideration.  

Copyright (c) 2009-2016 – Chris M Evans, first published on https://blog.architecting.it, do not reproduce without permission.

Written by Chris Evans

With 30+ years in IT, Chris has worked on everything from mainframe to open platforms, Windows and more. During that time, he has focused on storage, developed software and even co-founded a music company in the late 1990s. These days it's all about analysis, advice and consultancy.