In a previous post, I touched on the need for APIs to manage storage in cloud environments. In this post, I’ll talk about how the way storage is deployed in cloud environments has to change.
Over the last 10 years, Storage Area Networks (SANs) have created a storage-centric view of the world, with storage at the centre and the “planets” – networking and servers – orbiting it like some pre-Copernican view of the universe. Over time, SANs have grown ever bigger, with some organisations deploying huge Fibre Channel fabrics. As we’ve seen today, EMC continues to perpetuate that view with the release of the VMAX 40K, a 4PB monster of a storage array in the best traditions of the central SAN-based model.
However, the world has changed. Storage is no longer the centre of the IT universe, but merely a player within it. Just as it came as a shock to those in power in the 1500s when Copernicus proposed that the sun was at the centre of the universe, so this shift will come as a shock in IT and storage, especially in cloud environments.
A Bit of History
SANs evolved from a time before (x86) virtualisation, when everyone deployed physical servers. The storage in each server was isolated, and the server chassis was the limiting factor on expanding storage capacity. Copper SCSI cable limitations meant storage and server needed to be close together, so expanding the storage for a single server could mean re-racking and downtime. Storage Area Networks, and the use of optical fibre for the interconnect, allowed storage to be centralised. Resources were now held centrally and so sharable by all servers; they were no longer tied by physical distance, as optical fibre could be run for hundreds of metres; and they were scalable, as storage arrays could grow simply by adding more disk to the shared pool. It’s also worth remembering that the first storage arrays from the 1990s were made with much less reliable drive hardware than we have today. As a consequence, the arrays were over-engineered to provide the high level of availability that centralisation required.
Consolidation can go too far. Placing all storage resources into one or a small number of arrays increases the impact of the following:
- Change Control – microcode upgrades or other physical changes have a wider impact and can be more difficult to get approved unless maintenance windows are well structured.
- Failure – the failure of a single array can have huge consequences as arrays scale and support more servers.
- Complexity – large arrays benefit from scale in both capacity and performance; however, larger arrays are more complex to manage (hence the introduction of auto-tiering technology), especially from a performance perspective.
- Lifecycle – as arrays get bigger, the effort to migrate data on and off the array at the beginning and end of its lifecycle results in additional cost and wasted resources.
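To put a rough number on the failure point, here is a minimal Python sketch. It assumes servers are spread evenly across arrays and that a failed array takes out every server attached to it. The function name and the figure of 400 servers are purely illustrative and not taken from any real deployment.

```python
import math


def servers_hit_by_one_failure(total_servers: int, array_count: int) -> int:
    """Worst-case number of servers affected when a single array fails.

    Assumes an even spread of servers across arrays, which is an
    illustrative simplification; real estates are rarely this tidy.
    """
    if total_servers < 0 or array_count < 1:
        raise ValueError("need total_servers >= 0 and array_count >= 1")
    return math.ceil(total_servers / array_count)


if __name__ == "__main__":
    # Hypothetical estate of 400 servers, from one big array to many nodes.
    for arrays in (1, 2, 8, 20):
        hit = servers_hit_by_one_failure(400, arrays)
        print(f"{arrays:>2} array(s): up to {hit} servers affected")
```

Under these assumptions, the number of servers exposed to any one failure falls in proportion to the number of arrays, which is the basic argument against putting everything in a single box.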
Virtualisation & Cloud
The shared model works well with physically separate servers. However, virtualisation has changed the server landscape; where before we had hundreds of physical servers in the data centre, we now see those numbers drop by a factor of 10 or 20 as virtualisation becomes mainstream. These consolidation ratios can be even higher in cloud environments, where greater consolidation levels are required. Previously, the server-to-storage ratio was a many-to-one relationship. Today we are seeing vendors push architectures that have, in some cases, a one-to-one relationship. Deploying a single storage array for every server may be a little extreme, but what we are seeing is a move away from a centralised model to one of scalable node-based storage, where storage can be added into an existing complex of arrays. In addition, data management intelligence is moving into the hypervisor. VMware now manages Storage vMotion requests and dynamic data placement with Storage DRS, offloading the “heavy lifting” to the array through VAAI. In this model, array-based technologies such as remote replication aren’t needed.
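The consolidation arithmetic is simple enough to sketch in a few lines of Python. The starting figure of 500 physical servers and the function name are assumptions for illustration only; the 10:1 and 20:1 ratios are the ones mentioned above.

```python
import math


def hosts_after_consolidation(physical_servers: int, ratio: int) -> int:
    """Virtualisation hosts needed to absorb a given number of servers.

    `ratio` is the number of former physical servers consolidated onto
    each host, e.g. 10 for a 10:1 consolidation ratio.
    """
    if physical_servers < 0 or ratio < 1:
        raise ValueError("need physical_servers >= 0 and ratio >= 1")
    return math.ceil(physical_servers / ratio)


if __name__ == "__main__":
    # Hypothetical starting point of 500 physical servers.
    for ratio in (10, 20):
        hosts = hosts_after_consolidation(500, ratio)
        print(f"{ratio}:1 consolidation -> {hosts} hosts")
```

With only a few dozen hosts left to connect, the many-to-one case for a large shared array looks much weaker than it did with hundreds of physical servers.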
What this means is that we’re seeing a move towards storage hardware being used as a pure IOPS generator. In cloud solutions, storage needs to be lean and cheap, whilst still being reliable. What it doesn’t need is lots of extras.
The Storage Architect Take
The age of the super-scale single storage array is over. Storage consolidation through a SAN is no longer needed, and cloud deployments are better served by node-based scale-out storage solutions. Although most intelligence is moving to the hypervisor, the ability to seamlessly move from one array to another is still a requirement. Four petabytes in a single array isn’t needed by 90% of organisations, and those who may need that level of capacity probably won’t deploy it in a single array. It’s time to move on.