Note: Updated on 18 January 2017 to include OpenEBS.
There’s an almost religious divide between those who see containers as entirely stateless objects and those taking a more pragmatic view that stateful containers are inevitable. In the stateless model, data is assumed to be replicated and protected across many container instances, so the loss of any individual container doesn’t lose data. In practical terms, this idea just doesn’t work, because in the enterprise we have to meet a set of standards around application availability, auditing and compliance. Assuming we want to containerise our databases rather than leave them running as virtual machine instances (and we surely will), persistent data is as inevitable as death and taxes. However, what about a more contrary approach? How about building storage systems from containers?
The persistence of data is due to the media we store it on, not the system through which we access it. As an example, many vendors provide the ability to perform head upgrades on their dual-controller systems. This is possible because persistent data and configuration information are stored on the media (HDDs and SSDs), and in many cases the media is self-describing. This means that after a software crash, metadata and configuration information can, in theory, be recovered by parsing the data on disk. Taking this idea to its logical conclusion, we can use stateless processes like containers to create storage systems, provided we ensure that state is stored on the physical media (and protected across that media). If the container running our storage platform crashes, we simply respawn it and read the configuration data back from disk.
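To make the respawn idea concrete, here is a minimal sketch in Python. It is not taken from any of the products mentioned in this post; the Media and StorageService classes, the metadata.json file name and the volume names are all hypothetical, and a temporary directory stands in for the physical disk. The point is only that the service holds nothing it cannot rebuild from the media.

```python
import json
import os
import tempfile


class Media:
    """Hypothetical stand-in for a self-describing disk: a directory that
    holds its own metadata file (the file name is an assumption)."""

    def __init__(self, path):
        self.path = path
        self.meta_file = os.path.join(path, "metadata.json")

    def read_metadata(self):
        # A blank disk describes an empty configuration.
        if not os.path.exists(self.meta_file):
            return {"volumes": {}}
        with open(self.meta_file) as f:
            return json.load(f)

    def write_metadata(self, metadata):
        # Write to a temporary file, flush it to disk, then rename it into
        # place, so a crash mid-write never leaves a half-written file.
        tmp = self.meta_file + ".tmp"
        with open(tmp, "w") as f:
            json.dump(metadata, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.meta_file)


class StorageService:
    """Stateless process: everything it knows is rebuilt from the media
    when it starts."""

    def __init__(self, media):
        self.media = media
        self.metadata = media.read_metadata()

    def create_volume(self, name, size_gb):
        # Persist every configuration change before returning.
        self.metadata["volumes"][name] = {"size_gb": size_gb}
        self.media.write_metadata(self.metadata)

    def list_volumes(self):
        return sorted(self.metadata["volumes"])


def demo():
    with tempfile.TemporaryDirectory() as disk:
        service = StorageService(Media(disk))
        service.create_volume("db01", 100)
        service.create_volume("logs", 20)
        del service  # simulate the container crashing; in-memory state is gone

        # Respawn: a brand new process reads the configuration back from disk.
        respawned = StorageService(Media(disk))
        return respawned.list_volumes()


if __name__ == "__main__":
    print(demo())
```

A real platform would also have to protect that metadata across multiple devices, as noted above; this sketch uses a single directory and ignores redundancy entirely.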
Building Storage with Containers
We are starting to see containers edging into the deployment of storage solutions. There are a number of reasons this is a good thing. First, if we’re already running containers, then accessing storage through one of those containers provides a lightweight way to get to our data. Docker already implemented something like this with its data volume containers (see the Docker documentation on managing data in containers, plus the other references at the end of this post). Second, containerising storage means we can build storage features as separate microservices, making management, upgrading and patching much easier.
Other vendors are starting to bring containerised storage products to market. Scality, an object storage vendor, recently released their S3 Server, a cut-down containerised version of the Scality RING platform written in Node.js. This runs as a single container image and so has limited support and availability, but it provides a way to test S3 compatibility with Scality RING. We could imagine the offering being extended in the future to include more functionality.
StorageOS, a UK startup, has built a storage platform that runs in containers, for containers. The container footprint is (at present) a mere 40MB, which is an amazing achievement, although I can see this increasing as more functionality is added. Dell EMC’s VNX platform uses containers to implement VDMs (Virtual Data Movers). Portworx also has a storage solution that is built from containers. Currently this is available as a Developer edition (PX-Developer) that can be downloaded from GitHub, or an Enterprise edition (PX-Enterprise). There’s also OpenEBS, an open source software solution for container-based storage that is also sold as a hardware solution through Cloudbyte.
The Architect’s View
The lines between storage and application are being blurred by the idea of using containers for data persistence. HCI (hyper-converged infrastructure) set the scene by running storage and application services on the same hardware; storage with containers takes this to another level. As with all storage solutions, one product doesn’t fit all requirements, so the idea of storage containers will (initially at least) have limited application. However, expect to see more solutions come to market as Software Defined Storage starts to find a true niche. Please let me know if you have any other examples of containers being used to deliver storage and I’ll add them to this post.
Scality’s S3 Server is available for download on GitHub and on Docker Hub. You can also find Portworx online using the links below. StorageOS has presented at Tech Field Day (link below).
- Manage data in containers (Docker documentation, retrieved 13 January 2017)
- Portworx PX-Developer on GitHub (GitHub website, retrieved 13 January 2017)
- Scality S3 Server on GitHub (GitHub website, retrieved 13 January 2017)
- S3 Server on Docker Hub (Docker Hub website, retrieved 13 January 2017)
- StorageOS Presents at Tech Field Day 12 (Tech Field Day website, retrieved 13 January 2017)
- Storage & Virtual Containers: Where Does The Data Go? (Network Computing, published 27 August 2016, retrieved 13 January 2017)
- Docker Containers and Persistent Storage: 4 Options (Network Computing, published 19 October 2016, retrieved 13 January 2017)
Copyright (c) 2009-2017 – Chris M Evans, first published on https://blog.architecting.it, do not reproduce without permission.