
Enterprise Computing: VPLEX – Write Back or Write Through?


A little discussion on Twitter between myself, @BasRaayman, @UltraSub and @esignoretti got me thinking about the evolution of VPLEX and the whole caching thing.  It's something I mentioned in one of my previous VPLEX posts, and it could have a significant impact on designing and implementing VPLEX solutions.  Here's the conundrum.

First of all, be aware that VPLEX is a write-through device.  That means it doesn't keep I/O in cache and confirm write completion to the host; it waits until the data is secured on disk before confirming the write as successful.  On the one hand this is a positive thing; the VPLEX clusters hold no data in cache that could become stale or, in the event of a failure, never make it to disk.  In VPLEX-Metro, where synchronous cross-site writes are permitted, it also makes sense from a simplicity point of view; everything is on physical disk before the host I/O is confirmed as successful, so there are no recovery issues to worry about.
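
To make the distinction concrete, here is a minimal sketch in Python of when each policy acknowledges the host and what is left sitting dirty in cache; the classes and names are stand-ins of my own, not VPLEX internals.

```python
# A toy model only: the classes here are assumptions for illustration,
# not VPLEX internals.

class Backend:
    """Stands in for an underlying storage array."""
    def __init__(self):
        self.blocks = {}

    def write(self, lba, data):
        self.blocks[lba] = data        # models "secured on disk"

class Appliance:
    """Stands in for a virtualisation appliance sitting in the data path."""
    def __init__(self, backends):
        self.cache = {}
        self.dirty = set()
        self.backends = backends

    def write_through(self, lba, data):
        self.cache[lba] = data
        for b in self.backends:        # wait for every backing array...
            b.write(lba, data)
        return "ack"                   # ...before the host sees success

    def write_back(self, lba, data):
        self.cache[lba] = data
        self.dirty.add(lba)            # data exists only in cache for now
        return "ack"                   # host sees success immediately

    def destage(self):                 # write-back's deferred disk write
        for lba in list(self.dirty):
            for b in self.backends:
                b.write(lba, self.cache[lba])
            self.dirty.discard(lba)

appliance = Appliance([Backend(), Backend()])
appliance.write_through(0, b"on disk everywhere before the ack")
appliance.write_back(1, b"ack'd while still dirty in cache")
appliance.destage()
```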

However, write-through in a heterogeneous environment might not be so good.  Performance is directly tied to the storage layer.  Storage design has to continue as before, and two layers of complexity now have to be considered in order to assure the best performance. By using the VPLEX layer to achieve replication, we also have a situation where performance is entirely dependent on the time it takes to write data to the underlying storage supporting the virtual VPLEX LUNs – in all locations.  It would be possible to create a very bad configuration with awful performance. It also means that diagnosing performance problems has another layer of complexity to wade through.
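
As a rough illustration of that dependency, the arithmetic below (assumed, illustrative figures rather than real measurements) shows how the slowest array or the inter-site round trip sets the floor for every acknowledged write under write-through.

```python
# Illustrative arithmetic only; the figures are assumptions, not measured
# VPLEX numbers. With write-through across two sites, the acknowledgement
# cannot arrive before the slowest leg completes.

inter_site_rtt_ms = 1.0    # e.g. roughly 100 km of metro distance
local_array_ms = 3.0       # a well-tuned local array
remote_array_ms = 12.0     # a slower or busier array at the other site

ack_ms = max(local_array_ms, inter_site_rtt_ms + remote_array_ms)
print(f"host sees ~{ack_ms} ms per write")   # dominated by the worst site
```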

The trouble is, the concept of VPLEX effectively lends itself to the need for write-through I/O, as VPLEX enables multiple I/Os to the same data in multiple locations.  If I/O were cached, there would be a significant risk of data inconsistency if, for example, one of the VPLEX devices in the complex were lost.

Of course, other VPLEX-like technologies use either write-through or write-back (the Hitachi USP V, for example).  I believe SVC also caches.  Are they in a better situation?  In some respects they are, because USP V and SVC offer additional features like thin provisioning and snapshots, all of which are implemented at the virtualisation layer above the underlying storage.

What is everyone else’s opinion?  Will we see more or less complexity with VPLEX?

About Chris M Evans

Chris M Evans has worked in the technology industry since 1987, starting as a systems programmer on the IBM mainframe platform, while retaining an interest in storage. After working abroad, he co-founded an Internet-based music distribution company during the .com era, returning to consultancy in the new millennium. In 2009 Chris co-founded Langton Blue Ltd (www.langtonblue.com), a boutique consultancy firm focused on delivering business benefit through efficient technology deployments. Chris writes a popular blog at http://blog.architecting.it, attends many conferences and invitation-only events and can be found providing regular industry contributions through Twitter (@chrismevans) and other social media outlets.
  • UltraSub

    Cross-site caching would also give better performance. A write into cache at the other site would be acknowledged faster, of course. So instead of waiting on two back-end devices (one local, one remote), the host just waits for two caches (one local, one remote), which will always be faster. Of course there's added complexity. But I think doing everything in write-through will give some companies serious performance issues.

  • http://www.cinetica.it/blog Enrico Signoretti

    Chris,
    you know my point.
    Vplex is a great idea, but it needs a lot of development.
    Vplex, at the moment, is only a cache (a great cache, but only a cache).
    Storage admins want something more to simplify their jobs; they want to harmonize their infrastructures.
    Adding 1-feature layers is not TCO savvy!

    On the other hand, SVC and USP have the great advantage of delivering a single interface for snapshots, replicas and many other features, but they don't have the Vplex feature.

    I think the next versions of SVC, USP and Vplex will fill the current gaps and become better, more comparable solutions.

    Enrico

  • http://etherealmind.com EtherealMind

    Wouldn't the use of SSD / flash tiers in the cache change the profile of write-through performance? If the RAM cache can write through to flash, thus releasing the write request, and then tier the data to the correct storage in a second phase, wouldn't this solve that problem?

    And isn't that what EMC does in their units?
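
    A rough sketch of that two-phase behaviour might look like the following; the tiers, queue and background destager are hypothetical stand-ins, not a description of how any EMC unit actually works.

```python
# All names, tiers and the queue below are hypothetical; this is not how
# any EMC unit is documented to work.
import queue
import threading

flash_tier, hdd_tier = [], []          # fast persistent tier, final tier
destage_queue = queue.Queue()

def host_write(block):
    flash_tier.append(block)           # phase 1: persist to flash, then ack
    destage_queue.put(block)           # phase 2 is handled asynchronously
    return "ack"

def destager():
    while True:
        block = destage_queue.get()
        hdd_tier.append(block)         # tier the data down to slower storage
        destage_queue.task_done()

threading.Thread(target=destager, daemon=True).start()
host_write(b"data")
destage_queue.join()                   # toy model: wait for destage to finish
print(len(flash_tier), len(hdd_tier))  # 1 1
```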

  • http://www.brookend.com Chris Evans

    Greg, agreed that SSD would be sensible; VPLEX, to my knowledge, has no SSD built in. One would assume that's the evolution of the product, which would align it with SVC and USP in the way it works.

  • http://blog.insanegeeks.com InsaneGeek

    From what I've read in the headlines, etc., it feels to me like VPLEX is geared directly towards VMware VMFS, especially regarding the future async plans.

    I think that, with the exception of VMFS, the full active/active cluster filesystems I know of will probably have a problem with the distance latency before the storage does (or at least at about the same time). Having dealt with GFS, Veritas, Polyserve (evaluated, not in production) and other block-level cluster filesystems for a number of years, the node-to-node locking/update mechanisms are going to be a big problem over any distance; in the best case, with one node at each location, it should be about equal, but add more than one node and you're going to see dramatic slowdowns very quickly as the filesystem tries to maintain all the nodes' information. I'd guess 2x app nodes at site A and 2x app nodes at site B would at least double (if not more) the time at the app layer before you add in the storage cache-to-cache latency.

    A perfect example of this is that some of our Oracle instances in the same RAC cluster only run on one node, because maintaining the SGA across all the nodes dramatically decreased performance; we now have a config where one instance runs on all nodes, another instance runs on node 2, another on node 3, another on 2 & 3 and another only on node 4. We'd love to let them run on all 4x nodes, but maintaining state at a level above the storage causes problems.

    VMFS, while a massively limited filesystem with hardly any features, feels like it might actually work kind of OK in this situation. The only time it creates locks is for filesystem metadata updates, mainly around power-on/off and creating/growing files, so generally you'd be hampered by storage speed rather than by the filesystem.

    If you have thick-provisioned guests (the underlying storage LUNs could be thin), you should hardly have to send any lock requests to the remote nodes, because you are mostly changing blocks within an existing file; that happens only on the one node, so no notifications need to be sent/acked by any other node. Because of that, my guess is that the performance impact of an active/active LUN on a VMFS filesystem spread out over a long distance should be very limited and hardly noticed.
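
    As a back-of-envelope illustration of the locking cost being described here, the sketch below uses assumed node counts, lock rates and round-trip times, not measured figures from any of the filesystems named above.

```python
# Back-of-envelope model only: lock rates, node counts and round-trip time
# are assumptions, not measurements.

inter_site_rtt_ms = 1.0    # ~100 km synchronous distance

def remote_lock_overhead_ms_per_sec(remote_nodes, locks_per_second):
    # naive model: each lock operation is acknowledged by every remote node
    return locks_per_second * remote_nodes * inter_site_rtt_ms

# a chatty block-level cluster filesystem vs. VMFS-style rare metadata locks
print(remote_lock_overhead_ms_per_sec(remote_nodes=2, locks_per_second=500))  # 1000.0
print(remote_lock_overhead_ms_per_sec(remote_nodes=2, locks_per_second=2))    # 4.0
```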

    Throw in the future async and, to me, things get very interesting. Rather than having a DR copy in another datacenter sitting around doing nothing but accepting writes, it can be used read/write at the same time. Want to move a single guest from datacenter to datacenter, but not all of the guests on a LUN? Power off in one and power on in the other (or WAN vMotion when that comes)… to me that will be very, very interesting when it comes around (my datacenters are >500 miles apart, so it's async only for me). I don't know what restrictions there will inevitably be for async, but VPLEX has tickled some out-of-the-box thinking in my mind (and thoughts of future filesystems that are more latency-flexible).

  • http://www.brookend.com Chris Evans

    InsaneGeek

    Agreed, VPLEX does look very much like a VMware-specific product, and it makes sense from EMC's point of view, as their world-domination strategy needs an "operating system" and ESX is as good as any (I've put "operating system" in quotes for sarcastic purposes). Again, I agree, VMFS suits VPLEX well. I don't imagine VPLEX will work quite as well with Hyper-V, but then that route isn't the EMC strategy.

    So let's turn things on their head. How well would VMware work with USP V and HAM? Hitachi, have you tested this scenario in the way VPLEX is being marketed? That I would like to see!

    Chris

  • http://blog.insanegeeks.com InsaneGeek

    Because my locations are too far apart for the current VPLEX or HAM, I haven't done a real under-the-covers look. Conceptually I would think they would be relatively similar, but from the (lack of) details on HAM I can find, they seem a bit different.

    If HAM still requires HDLM path management (and I haven't heard of HDLM as a supported 3rd-party ESX MPIO driver), it seems like a very quick dead end. A good question would be: does VPLEX also require a 3rd-party path driver to work with VMware?

    Another thing I'm not sure about is whether HAM supports a LUN in true bi-directional replication mode. My read a while back, when it came out, was more that writes to physical drives still flow in one direction, with one array being the source: i.e. HDLM directs writes to the 1st array, which are then synchronously replicated to the 2nd array; in a disaster the 2nd array's LUNs become read/write and HDLM shifts the load there, with no downtime. The underlying details of how HAM works are frustratingly difficult to find docs on. If that's the case, then it still feels like these are solving two different things: i.e. if I have hosts in two datacenters 50 km apart, I'm thinking that with HAM all remote host I/Os would have to be sent to the one datacenter (read/write), whereas with VPLEX I think reads could be serviced locally and only writes would incur the synchronous write latency.

    Either way, until async is available they both have limited value (pretty much only migrations) to my organization right now.
