One of the more interesting announcements at VMworld 2014 was a technology called VAIO, or vSphere APIs for I/O Filtering. This new feature was detailed in session TEX1492 (although I didn’t see it) and will allow third-party vendors to insert code into the I/O stream in the form of a filter driver, deployed as a VIB, that can intercept and potentially modify data.
VAIO will be VM-specific, potentially allowing different functionality for each virtual machine on a vSphere host. The idea is as follows: the host issues an I/O, which is forwarded to the VAIO filter driver, from there on to the actual target destination, and then back through the filter driver as the I/O is returned to the host. This means the filter driver gets to see (and modify) the data both before and after the storage has processed the request. At this stage I don’t know the specific mechanics, although more information can be found in this post and these videos from Tech Field Day Extra.
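To make that flow concrete, here is a minimal sketch of a per-VM filter chain. It is purely illustrative and is not the VAIO API: the names IORequest, IOFilter, LoggingFilter, Storage and submit_io are all my own assumptions, and the "storage" is just an in-memory dictionary.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class IORequest:
    vm: str            # name of the VM that issued the I/O (hypothetical field)
    op: str            # "read" or "write"
    offset: int
    data: bytes = b""


class IOFilter:
    """Hypothetical filter: sees each I/O on the way down and on the way back."""

    def on_issue(self, req: IORequest) -> IORequest:
        return req

    def on_complete(self, req: IORequest, result: bytes) -> bytes:
        return result


class LoggingFilter(IOFilter):
    """Records every request it sees without modifying anything."""

    def __init__(self) -> None:
        self.events: List[Tuple[str, str]] = []

    def on_issue(self, req: IORequest) -> IORequest:
        self.events.append(("issue", req.op))
        return req

    def on_complete(self, req: IORequest, result: bytes) -> bytes:
        self.events.append(("complete", req.op))
        return result


class Storage:
    """Stand-in for the real target: an in-memory map of offset to data."""

    def __init__(self) -> None:
        self.blocks: Dict[int, bytes] = {}

    def submit(self, req: IORequest) -> bytes:
        if req.op == "write":
            self.blocks[req.offset] = req.data
            return b""
        return self.blocks.get(req.offset, b"")


def submit_io(req: IORequest,
              filters_by_vm: Dict[str, List[IOFilter]],
              storage: Storage) -> bytes:
    # Only the filters attached to the issuing VM are applied.
    filters = filters_by_vm.get(req.vm, [])
    for f in filters:                 # on the way down to storage
        req = f.on_issue(req)
    result = storage.submit(req)
    for f in reversed(filters):       # on the way back to the issuer
        result = f.on_complete(req, result)
    return result
```

Attaching a LoggingFilter to one VM only (for example filters_by_vm = {"vm-a": [LoggingFilter()]}) means I/O from any other VM goes straight to storage untouched, which is the per-VM granularity described above.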
VAIO was initially slated for caching and replication (see the SanDisk presentation). How else could this technology be used and what are the risks associated with it?
Let’s start with opportunities. The two ideas put forward so far are caching and replication. Caching seems quite simple to implement, using host DRAM to store cached data and serving I/O requests from the cache rather than forwarding them to the external storage. Replication could be implemented just as easily, with data cached and moved to another location asynchronously, or within the same host for in-host copies. It’s possible that both of these use cases could improve the performance and functionality of not just external storage but specifically VMware’s Virtual SAN.
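As a rough illustration of the caching idea, the sketch below puts a small write-through read cache in front of a backing store and evicts the least recently used block when it is full. The class names (BackingStore, CachingFilter), the capacity value and the eviction policy are my own assumptions for the example, not anything VMware or SanDisk has described.

```python
from collections import OrderedDict


class BackingStore:
    """Stand-in for external storage; counts reads so cache hits are visible."""

    def __init__(self) -> None:
        self.blocks = {}
        self.reads = 0

    def read(self, offset: int) -> bytes:
        self.reads += 1
        return self.blocks.get(offset, b"")

    def write(self, offset: int, data: bytes) -> None:
        self.blocks[offset] = data


class CachingFilter:
    """Hypothetical write-through cache held in host memory (LRU eviction)."""

    def __init__(self, store: BackingStore, capacity: int = 2) -> None:
        self.store = store
        self.capacity = capacity
        self.cache = OrderedDict()   # offset -> data, oldest entry first

    def read(self, offset: int) -> bytes:
        if offset in self.cache:
            self.cache.move_to_end(offset)   # mark as most recently used
            return self.cache[offset]        # served without touching storage
        data = self.store.read(offset)
        self._remember(offset, data)
        return data

    def write(self, offset: int, data: bytes) -> None:
        self.store.write(offset, data)       # write-through: storage stays current
        self._remember(offset, data)

    def _remember(self, offset: int, data: bytes) -> None:
        self.cache[offset] = data
        self.cache.move_to_end(offset)
        while len(self.cache) > self.capacity:
            self.cache.popitem(last=False)   # evict the least recently used block
```

Write-through keeps the example simple and safe; a write-back cache would improve write latency but has to deal with losing dirty data if the host fails.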
There are of course other possibilities.
- Encryption – VAIO could be used to encrypt data going to external (or even internal) storage, with the aim of ensuring data is securely stored on permanent media. As VAIO is installed per VM, this means encryption could be very granular and managed on a per-application basis.
- De-duplication – getting back to VMware Virtual SAN, dedupe could be implemented through VAIO, using DRAM and dedicated ESXi processes to identify and eliminate duplicate I/O data. Note that this would only work for internal storage as the deduplication process needs to work with the host itself and this is exactly what Virtual SAN does.
- Tiering – again, looking at Virtual SAN, VAIO could be used to implement a tiering model, looking at data activity and placing it on the most appropriate media. In this case I’m thinking more effective use of flash rather than an old-style disk tiering model.
- Analytics – being able to view the entire data stream offers some interesting ideas for doing analytics, and VAIO could allow companies like CacheBox to deploy their technology directly into the hypervisor. The possible uses are manifold, including virus scanning and content and application discovery.
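Taking de-duplication from the list above as an example, a filter could fingerprint each written block and store identical content only once. The sketch below is a simplified, in-memory illustration under my own assumptions: DedupeFilter and its methods are hypothetical names, blocks are matched by SHA-256 digest, and there is no reference counting or cleanup of chunks that are no longer used.

```python
import hashlib


class DedupeFilter:
    """Hypothetical content-addressed store: identical blocks are kept once."""

    def __init__(self) -> None:
        self.chunks = {}      # digest -> block data (stored once)
        self.block_map = {}   # (vm, offset) -> digest

    def write(self, vm: str, offset: int, data: bytes) -> None:
        digest = hashlib.sha256(data).hexdigest()
        # Only store the data if this content has not been seen before.
        self.chunks.setdefault(digest, data)
        self.block_map[(vm, offset)] = digest

    def read(self, vm: str, offset: int) -> bytes:
        digest = self.block_map.get((vm, offset))
        return self.chunks[digest] if digest is not None else b""

    def stored_bytes(self) -> int:
        # Physical capacity consumed after de-duplication.
        return sum(len(data) for data in self.chunks.values())
```

If two VMs each write the same 4KB block, both reads return the data but stored_bytes() reports only 4096 bytes in use.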
Of course, placing any code into the data path introduces a number of risks. Firstly, there’s the simple case of I/O throttling from increased latency and reduced throughput, as more code is walked through for each I/O transaction. Then there’s the additional load that will be placed on the host to run the VAIO code, resulting in increased processor utilisation from non-VM workloads, i.e. more overhead. There is also the question of code quality and the ability of that code to crash the system if it is bug-ridden. This is a perennial problem: it existed in IBM mainframe days for anyone who wrote MVS exits, and was seen again with buggy drivers that took down Windows NT in the early days (usually drivers that weren’t specifically designed for the platform). These sorts of problems (security and resilience) are probably the reason why VMware has chosen to run VAIO in user space rather than in the kernel, as this will restrict any failures to only the VM. However, if the same code exists in many VMs, that doesn’t stop the problem being replicated across many guests.
One other thought – what about malicious code being deployed to a VM or group of VMs? Imagine a rogue employee who deploys a filter driver that encrypts content as it is written to disk and decrypts it as it is read, so the disk content is encrypted yet the process is transparent to the host. One day they delete the driver and all of the data is garbage – how do we protect against that? Ultimately of course the answer is we don’t; someone with sufficient access can destroy any environment, but good backup and replication policies help, as well as a robust and audited security model for administrators.
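The rogue filter scenario is easy to demonstrate with a toy example. In the sketch below, a hypothetical TransformFilter scrambles data on write and unscrambles it on read, so everything looks normal until the filter is removed. The scrambling is a simple XOR against a SHA-256 derived keystream that I have made up for illustration; it is a stand-in for encryption, not a real or secure cipher.

```python
import hashlib


def keystream_xor(data: bytes, key: bytes, offset: int) -> bytes:
    """Toy reversible transform (NOT real encryption): XOR with a derived keystream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        seed = key + offset.to_bytes(8, "big") + counter.to_bytes(8, "big")
        stream.extend(hashlib.sha256(seed).digest())
        counter += 1
    # zip() stops at len(data), so any surplus keystream is ignored.
    return bytes(a ^ b for a, b in zip(data, stream))


class Disk:
    """Stand-in for the VM's disk: an in-memory map of offset to raw bytes."""

    def __init__(self) -> None:
        self.blocks = {}


class TransformFilter:
    """Hypothetical filter that scrambles on write and unscrambles on read."""

    def __init__(self, disk: Disk, key: bytes) -> None:
        self.disk = disk
        self.key = key

    def write(self, offset: int, data: bytes) -> None:
        self.disk.blocks[offset] = keystream_xor(data, self.key, offset)

    def read(self, offset: int) -> bytes:
        raw = self.disk.blocks.get(offset, b"")
        return keystream_xor(raw, self.key, offset)


if __name__ == "__main__":
    disk = Disk()
    rogue = TransformFilter(disk, key=b"only-the-rogue-admin-knows")
    rogue.write(0, b"payroll records")
    print(rogue.read(0))     # through the filter: the original data
    print(disk.blocks[0])    # filter removed: scrambled bytes are all that remain
```

Once the filter and its key are gone, the only route back to the data is a backup or replica taken outside the filtered path.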
The Architect’s View
VAIO is a great idea (even if the acronym is a little retro) and there will be lots of practical uses for this new feature. I’m looking forward to seeing the first deployments of VAIO drivers and, of course, to seeing if and when VMware will adapt Virtual SAN to use some of the benefits VAIO offers.
- TEX1492 – IO Filters: Adding Data Services to ESXi (JediMT blog)
- SanDisk Presents at Tech Field Day Extra at VMworld US 2014 (Tech Field Day website)
Copyright (c) 2009-2018 – Post #46C5 – Chris M Evans, first published on https://blog.architecting.it, do not reproduce without permission.