Benchmarks: Time for Some Vendor Honesty

Over the last couple of weeks I’ve been perusing vendor products and benchmarks for both storage arrays and PCIe SSD cards. For the PCIe data, you can see the fruits of this labour at PCIe SSD Vendors and Products. This morning I was looking at the figures for the latest NetApp EF5x0 upgrade, the EF550, their interim all-flash box, and I was reminded why vendor-quoted performance figures irritate me so much.

My issue is with the unit of measurement and the explanation of what that unit covers. Take, for example, the EF550 data sheet. This shows “burst” I/O of up to 900,000 IOPS – what exactly does that mean? How long is the burst? What is the block size? How many parallel I/Os were needed to achieve it? Essentially this is a meaningless figure, based on the marketing department’s need to get as close as possible to the magical one meeelion IOPS that all marketing teams seem to obsess about. NetApp don’t quote actual latency times for the product, simply quoting “sub-millisecond” as the device capability. This could mean anything, and it conveniently hides a poorer capability compared to their peers. Then there’s the throughput – 12GB/s. What configuration was needed to achieve this? Could a customer actually see this level of performance?
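
To see why a bare IOPS figure tells you so little, convert it into the bandwidth it implies at a few block sizes. The sketch below is my own back-of-envelope arithmetic (in Python), not anything taken from NetApp’s test methodology:

    # Bandwidth implied by an IOPS figure at a given block size.
    # Illustrative arithmetic only; the EF550 datasheet does not state
    # the block size behind its headline numbers.

    def implied_bandwidth_gbps(iops: int, block_size_kib: float) -> float:
        """Throughput in GB/s needed to sustain `iops` at `block_size_kib` per I/O."""
        return iops * block_size_kib * 1024 / 1e9

    for bs in (0.5, 4, 8, 64, 256):
        print(f"900,000 IOPS at {bs:>5} KiB = {implied_bandwidth_gbps(900_000, bs):6.2f} GB/s")

At 4KB the quoted 900,000 IOPS implies less than 4GB/s, so the 12GB/s throughput figure presumably comes from a very different, large-block sequential workload; exactly the kind of detail the datasheet never states.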

NetApp are not alone in this deception. At a recent EMC presentation on their XtremIO product, they chose to show competitor performance comparisons without actually naming the competitor. This isn’t much better than just making the figures up. Without some kind of provenance for the data, how can customers believe the testing process was independent? The same applies to the PCIe vendors I have listed: there’s a mix of varying block sizes, presumably chosen to show each product in the best light; there are read and write figures, but they don’t always indicate whether the workload was sequential, fully random, or a mix of the two. Then there’s the 100% random read I/O – exactly what use is that statistic other than showing how good your hardware backplane is?
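
If vendors insist on quoting a single headline number, the least they could do is publish the workload behind it. As an illustration only (my own suggested checklist, not any vendor’s or benchmark body’s disclosure format), here is the minimum a quoted figure needs before it can be compared with anything else:

    # A minimal description of what should accompany any quoted IOPS figure.
    # These fields are my own suggestion, not a standard disclosure format.
    from dataclasses import dataclass

    @dataclass
    class WorkloadDisclosure:
        block_size_kib: float   # e.g. 4, 8, 64
        read_pct: int           # 100 = pure read, 0 = pure write
        random_pct: int         # 100 = fully random offsets, 0 = fully sequential
        queue_depth: int        # outstanding parallel I/Os per worker
        workers: int            # number of load-generating threads/hosts
        duration_s: int         # how long the figure was sustained ("burst" vs steady state)
        data_protection: str    # RAID level / replication in force during the test

    headline_figure = WorkloadDisclosure(
        block_size_kib=4, read_pct=100, random_pct=100,
        queue_depth=32, workers=16, duration_s=60, data_protection="unspecified")
    print(headline_figure)

The point isn’t the code itself; it’s that every field above materially changes the number being quoted.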

The Architect’s View

I know there are official benchmarks out there, such as SPECsfs2008 from the Standard Performance Evaluation Corporation and those from the Storage Performance Council, but not all vendors choose to submit results to them. Just having consistency in vendor-quoted benchmarks would be a start. A little honesty and transparency goes a long way towards gaining customer confidence.

 

Comments are always welcome; please indicate if you work for a vendor as it’s only fair.  If you have any related links of interest, please feel free to add them as a comment for consideration.

Subscribe to the newsletter! – simply follow this link and enter your basic details (email addresses not shared with any other site).

Copyright (c) 2013 – Brookend Ltd, first published on http://architecting.it, do not reproduce without permission.

 

 

About Chris M Evans

Chris M Evans has worked in the technology industry since 1987, starting as a systems programmer on the IBM mainframe platform, while retaining an interest in storage. After working abroad, he co-founded an Internet-based music distribution company during the .com era, returning to consultancy in the new millennium. In 2009 Chris co-founded Langton Blue Ltd (www.langtonblue.com), a boutique consultancy firm focused on delivering business benefit through efficient technology deployments. Chris writes a popular blog at http://blog.architecting.it, attends many conferences and invitation-only events and can be found providing regular industry contributions through Twitter (@chrismevans) and other social media outlets.
  • http://www.deepstorage.net/ Howard Marks

    Chris,

    It’s complicated further by the fact that the standard benchmarks aren’t up to the task of testing leading-edge storage systems. SPC, for example, writes data in the block sizes and locations of a real application but doesn’t write real data – it writes the same data all the time. A system like XtremIO or Pure (a client, BTW) would dedupe that down to one block and deliver numbers better than they would with real data that isn’t infinitely dedupeable. SPC won’t let you publish results for a system that does dedupe.
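
    A rough sketch of that effect (illustrative only; not how SPC generates data or how XtremIO/Pure actually dedupe): hashing identical benchmark blocks collapses them to a single stored block, while realistic data does not:

        # Why "same data everywhere" flatters a deduplicating array.
        # Illustrative only; not SPC's data generator or any array's dedupe engine.
        import hashlib, os

        BLOCK = 4096
        NUM_BLOCKS = 10_000

        synthetic = [b"\x00" * BLOCK for _ in range(NUM_BLOCKS)]    # benchmark-style repeated data
        realistic = [os.urandom(BLOCK) for _ in range(NUM_BLOCKS)]  # incompressible, "real-ish" data

        def unique_blocks(blocks):
            return len({hashlib.sha256(b).digest() for b in blocks})

        print("synthetic:", unique_blocks(synthetic), "unique blocks to store")  # 1
        print("realistic:", unique_blocks(realistic), "unique blocks to store")  # ~10,000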

    – Howard

    • http://thestoragearchitect.com/ Chris M Evans

      Howard, that’s such a major fail from the vendors and seems like something they could easily remedy.

      Chris

      • http://www.deepstorage.net/ Howard Marks

        It’s a much harder problem than it appears at first glance. The current good benchmarks are based on traces of real applications. You can’t record the data the same way, because that would mean exposing some real user’s database as part of the benchmark.

        Then, even if you created synthetic data that was real-ish, you’d have to play it back from the load generator(s). That means to deliver 100,000 4K write IOPS you’d need to read 400,000KBps at the load generator to write it to the system under test, and sustain that for an hour or more. That would require an array of servers with PCIe SSDs to read from as load generators.
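
        Back-of-envelope on those numbers (my own arithmetic, just restating the point above):

            # Source bandwidth and data volume a load generator must sustain to feed
            # 100,000 x 4KB write IOPS for an hour. Illustrative figures only.
            iops = 100_000
            block_kb = 4
            duration_s = 3600

            kb_per_second = iops * block_kb              # 400,000 KB/s (~400 MB/s)
            total_gb = kb_per_second * duration_s / 1e6  # ~1,440 GB of source data

            print(f"{kb_per_second:,} KB/s sustained, {total_gb:,.0f} GB read over the hour")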

        All that makes the next generation of benchmarks VERY expensive to create and to run.

        – Howard

  • Kevin Stay

    The marketing groups at all the storage vendors feel compelled, to one degree or another, to tell these lies. EMC is the grandmaster, but they all do it. I read the various submissions as a point of interest only; at least SPEC also makes you include the price and actually use a decent amount of the provisioned space for the testing.

    Ultimately, any company too lazy and/or stupid to first thoroughly test a system with their actual workload deserves what they get. Yes, that can be challenging. Unfortunately, the storage in MANY places is much more the result of schmoozed back-door deals with management, or at best the choice of some Windows administrator, than of careful requirements analysis and design by an actual storage engineer.

    Disclaimer – We are currently a NetApp shop. Truth be told, it does a fair job of meeting a decent amount of our storage needs, though it has also been foolishly pressed into service for Exchange and SharePoint. Spinnaker is an absolute non-starter, so we are now in the earliest stages of planning a migration.

    • http://thestoragearchitect.com/ Chris M Evans

      Kevin, thanks for your honesty. Why do you feel there’s a problem with C-mode (other than the fact it’s not a true scale out technology)?

      Chris

      • Kevin Stay

        Getting from your current environment to C-mode without disrupting the business is the first big challenge. Once there, what did you get? Incoming requests are not actually handled by whichever controller receives them, but instead ‘shuttled over’ to the head that still owns the disks. My calendar says 2013, and ALUA has no place in any system calling itself enterprise. Lots of other good stuff is still in there, but it is clearly time for us to migrate.

        • http://thestoragearchitect.com/ Chris M Evans

          Kevin, I have to say I agree with you. Despite what NetApp claim, these are still two separate operating systems. This is the point at which I think many customers will review their options and move on. It could be NetApp’s inflexion point.

          Regards
          Chris

          • http://www.linkedin.com/in/maddenca Chris Madden

            While some might compare Data ONTAP 7-mode and clustered Data ONTAP to siblings, I think they are more like twins. Much of the DNA has in fact been shared since the 8.0 release. The cDOT CLI is different/improved, but if you look at the feature set they are the same under the covers. Features like deduplication, compression, snapshots, replication, SAN, NAS, cloning, etc., are all there and leverage the same underlying code. Further, if you migrate from 7DOT to cDOT you can keep your dedupe savings, snapshot history, replication relationships, etc. If these are two different operating systems they do seem to have pretty high interoperability!

            Either you have a shared backplane (scale up) or you have to shuttle data around (scale out). We allow a bit of both with scale-up between our controller models, and scale-out using clustered Data ONTAP. The scale out design is useful in many ways, one of them being that it decouples the host attach from the backend storage. With it you can move FC and IP interfaces around the cluster to handle hardware refresh, maintenance, workload balance, etc. without any downtime or reconfiguration on the network or host. The combination of frontend mobility of network interfaces, and backend mobility of data, is powerful.

            ALUA and pNFS allow clients to learn an optimal path, but it’s not a big deal if they don’t. For an example, check the now two-year-old 1.5 million IOPS SPEC SFS posting, which was 96% (23/24) remote access: http://www.spec.org/sfs2008/results/res2011q4/sfs2008-20111003-00198.html. ALUA is a relatively recent invention (from a standards perspective at least), and today all modern clients support it in a hands-off way, providing tighter integration between the client and the storage array. In my opinion it is a great improvement on the proprietary multipathing solutions that used to be commonplace.

            Regarding success in the marketplace, let the sales numbers speak for themselves. See http://investors.netapp.com/events.cfm and then NetApp’s recent Q2 conf call transcript: “In Q2, we ship more than 1,900 clustered nodes, an increase of almost 300% from Q2 a year ago and almost 60% from last quarter. 37% of high-end systems and 24% of mid-range systems were deployed in clustered configurations”. Given our market share as measured by IDC is trending upward, and our cDOT adoption is spiking upward, I fail to see any inflexion point.

            Regards,

            Chris Madden
            Storage Architect, NetApp EMEA

          • http://thestoragearchitect.com/ Chris M Evans

            Chris, that was a positive spin; let’s put some realism on this. Please let me know if any of this is wrong:

            An in-place upgrade from 7-mode to C-mode is not possible, nor is a head replacement. It’s a requirement to deploy new hardware and migrate the data over.

            7-mode to C-mode migration (via the transition tool) doesn’t support LUNs, traditional volumes, restricted volumes, SnapLock or FlexCache volumes. Synchronous SnapMirror is not supported, and neither is MetroCluster. NDMP backup configurations are lost.

            Migrations from a source configuration prior to 7-mode v8.2 will be much trickier (or impossible), so customers on a 7.x.x release (of which there are presumably many) will have many issues; customers on 32-bit-only products will have no migration strategy.

            By the way, just putting in place similar features doesn’t make products twins. Imagine having a Ferrari and a Nissan Cube; both have 4 wheels, automatic transmission, radio, power steering, doors, windows, etc. Which would you prefer?

            Chris

          • http://www.linkedin.com/in/maddenca Chris Madden

            Hi Chris,

            7DOT to cDOT transition is indeed not in place; your data has to move from one controller and set of disks to another. That kit can either be net-new or on loan from us. To move the data you can use any client-side data copy tool (robocopy, rsync, VMware Storage vMotion, LVM, etc.) or a NetApp-provided option (7-Mode Transition Tool, DTA2800, and soon RapidData).

            The 7-Mode Transition Tool indeed has some ‘unsupported features’, which are stated in the admin guide and some of which you have quoted for us. Of the ones you mentioned, most are expected either because there is no equivalent feature/function yet in cDOT (synchronous SnapMirror, MetroCluster, SnapLock), or because there is no real need to transition them (restricted volumes: a temporary state in a SnapMirror config; traditional volumes: extinct in practice; FlexCache: temporary data; NDMP: post-migration it is standard practice to take a new full backup anyway). Please clarify if you think those restrictions are real blockers.

            It is true that today the 7-Mode Transition Tool cannot transition LUN data. Time to market and the availability of other mature solutions (DTA2800 and host-based techniques) made us defer it in the first go-around, but adding it is a high priority.

            You are misinformed about 7-Mode Transition Tool support for older releases; support exists from the 7.3 family (released in 2008) and for 32-bit aggregates. So there is no trickiness or impossibility if you are starting from an aging infrastructure. Using host-based copy techniques, your source could be anything…

            The 7-Mode Transition Tool does more than just migrate the data smartly (block-level incrementals, including snapshot, clone and dedupe savings); it also migrates the configuration (shares, quotas, etc.). So overall it is less work and less outage than moving from one vendor to another, but more than a head swap. But once on cDOT, the non-disruptive operations capabilities of the platform mean you won’t have to do it again, ever. It’s like going from physical servers to virtual ones: there is some effort to get there, but it’s well worth it.

            I don’t get your analogy about the Ferrari and the Nissan Cube. If I were comparing 7DOT to the EF550 I would agree, but comparing 7DOT to cDOT I don’t. These products share the same code; a single build process generates the single 8.2 download image, and at initialization time the 7DOT or cDOT personality is started based on system args. So if the Ferrari and the Nissan Cube had identical automatic transmission, engine and power windows, but a different number of seats, then I guess the analogy would work; otherwise it doesn’t.

            Hope this is clear, but if not, please let me know.

            Regards,

            Chris Madden

            Storage Architect, NetApp EMEA

  • Chris Madden

    Unfortunately, some customers actually use a datasheet IOPS number as part of their selection criteria. If vendor A does 1,200,000 and vendor B does 900,000 then obviously vendor A is better, right? Of course not. Without the read/write mix, sequential/random mix, latency curve as you ramp up I/Os, RAID/data protection, features enabled, etc., it means very little in the real world. But people, especially in the all-flash market, seem to obsess about these numbers, so there you go: NetApp started printing them too. It’s a NetApp first, actually; check the EF540 datasheet and you won’t see any figures. Damned if you do, damned if you don’t.

    If you want more detailed info for your workload profile, ask a NetApp SE. If you want to see it yourself, run a PoC in your data center. There is also a performance demo of the EF540 on YouTube showing 314k 4KB read IOPS at 0.6ms average latency: http://www.youtube.com/watch?v=UMwIpryRzno. And that was the EF540; the EF550 has upgraded hardware that takes it higher. These boxes are very capable.

    Chris Madden
    Storage Architect
    NetApp EMEA
