The idea of improving application performance through parallel computing isn’t a new concept. Look back 50 years and we find Amdahl’s Law, which predicts the performance improvement that can be achieved by using multiple processors. You can find details of Amdahl’s Law on Wikipedia, but essentially the formula states that the overall speedup (or reduction in latency) depends on two things: the proportion of execution time that can be improved, and how much that portion is actually sped up. Accelerate a fraction p of the work by a factor s and the whole task speeds up by 1 / ((1 − p) + p/s). Common sense tells us this is true; if I give separate, unrelated tasks to many workers, the more workers I have, the quicker I can get the job done. If I can increase the speed of the most critical task in the work I have, everything completes quicker.
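To make the formula concrete, here’s a minimal Python sketch of Amdahl’s Law (the function name and example figures are mine, purely for illustration):

```python
def amdahl_speedup(p, s):
    """Overall speedup when a fraction p of execution time
    is accelerated by a factor s (Amdahl's Law)."""
    return 1.0 / ((1.0 - p) + p / s)

# If 90% of the work parallelises perfectly across 8 workers,
# the whole job runs roughly 4.7x faster:
print(round(amdahl_speedup(0.9, 8), 2))  # 4.71

# No matter how many workers we add, the serial 10% caps us at 10x:
print(round(amdahl_speedup(0.9, 1_000_000), 2))  # 10.0
```

Note how quickly the serial fraction dominates: even with effectively infinite workers, that last 10% of un-parallelisable work limits the speedup to 10x.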
In the mainframe days of Gene Amdahl, hard drives were horrendously slow compared to today’s flash and even spinning media. Applications ran on multi-processor machines and by design were executed across many separate jobs, tasks and processes to minimise the impact of I/O latency. However, as we moved into the mini-computer age with one application per machine, we lost focus on optimising the I/O path when all we had was a single processor to execute the work. Probably nowhere is this more obvious than on the Windows operating system, a platform that effectively evolved from the desktop.
Implementing Parallel Processing
With today’s multi-core processors, external I/O becomes a real bottleneck. Vendors have addressed shortcomings in their storage platforms. Dell EMC claims to have re-written almost all of the VNX2 FLARE operating system to take advantage of multi-core processors (you can read more in this white paper). NetApp had to make significant changes to ONTAP in their All-flash FAS (AFF) systems in order to gain the benefit of using flash media. Even in version 9.3, the latest ONTAP release, NetApp continues to parallelise (if that’s a word) internal I/O, gaining up to 40% improvement for some workloads. So removing single tasking and moving to a parallel architecture works. There are also other solutions, such as caching, that attempt to remove I/O from the equation altogether.
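The core idea behind parallelising I/O is simple: slow requests shouldn’t queue behind one another when they could all be in flight at once. A rough Python sketch of the principle (the `fake_read` function just simulates a slow request; it isn’t how any vendor’s stack actually works):

```python
import concurrent.futures
import time

def fake_read(block_id):
    # Simulate a slow I/O request (e.g. a disk or network read)
    time.sleep(0.05)
    return block_id * 2

blocks = range(8)

# Serial: each request waits for the previous one to complete
start = time.perf_counter()
serial = [fake_read(b) for b in blocks]
serial_time = time.perf_counter() - start

# Parallel: all eight requests are in flight at once
start = time.perf_counter()
with concurrent.futures.ThreadPoolExecutor(max_workers=8) as pool:
    parallel = list(pool.map(fake_read, blocks))
parallel_time = time.perf_counter() - start

assert serial == parallel  # same results, very different wall-clock time
print(f"serial {serial_time:.2f}s, parallel {parallel_time:.2f}s")
```

The results are identical either way; only the elapsed time changes, which is exactly the win the storage vendors are chasing.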
The best I/O is the one you don’t have to do
DataCore Parallel I/O
DataCore has known about this problem for some time. The company released Parallel I/O technology within Parallel Server in the middle of 2016. The announcement was promoted with an SPC-1 IOPS benchmark showing 5.1 million IOPS, beating the previous leaders by a huge margin. Unfortunately, the industry cried foul (as I wrote at the time), claiming that excessive amounts of DRAM were used to deliver the performance figures. Personally, I couldn’t see a problem here; the benchmark is what it is. All vendors try to game the numbers (or present their systems in the best light). What matters is real-world application performance improvement.
Fast forward to September 2017 and DataCore has announced a new product, MaxParallel for SQL Server. This is an implementation of the Parallel I/O software that runs as a storage device driver on Windows servers. Figures from testing with HammerDB show significant improvement in transaction throughput, on both MS SQL Server 2012 and 2016. Of course, as we’ve said, benchmarks aren’t enough as proof points. The value is in seeing how customer applications are affected by MaxParallel. George Teixeira (DataCore CEO) took me through one example where a customer had tried implementing database caching, which improved performance only marginally. MaxParallel was able to increase the number of users on the system from 40 to 150 while maintaining a greater TPM (transactions per minute) per user. What’s interesting about this particular case is the CPU utilisation on the server, which previously didn’t exceed 35%. With MaxParallel, the figure increased to 95%, showing the hardware was being used more effectively.
The Architect’s View
MaxParallel for SQL Server isn’t a piece of software written specifically with SQL Server in mind. In fact, I asked George if there was any SQL Server-specific code in the product and he told me there wasn’t. The MaxParallel software is simply being packaged and licensed for servers running SQL Server (it checks for a valid SQL Server licence). What DataCore has done is take an existing piece of software and put much more interesting licensing terms around it.
MaxParallel costs around 15% of the cost of MS SQL Server and has the same per-core licensing terms. It’s also available in the public cloud, either as BYOL or at a per-hour cost. What does this licensing mean? Well, customers can get significant improvement in their SQL Server workloads with a relatively modest increase in costs. This could mean avoiding the cost of upgrades, buying cheaper hardware in the first place, or, in public cloud, choosing a smaller instance to run SQL Server.
What’s clever about the licence model is that it can be applied to other platforms, each time basing the pricing on the application itself. It also changes the engagement model with the customer. DataCore can talk to application owners and DBAs about performance acceleration, rather than to the storage team. Now the company has a whole new set of champions in the enterprise for its products, and the discussion is about I/O performance, not storage.
Personally, I like the idea that we’re fixing the I/O problem in software, wherever the problem exists – in the data centre, in virtual environments or in the public cloud. It shows there’s still room for I/O optimisation and improvement. A Linux version of MaxParallel is expected next year (if my memory serves me correctly). It will be interesting to see whether the same savings can be achieved on a platform designed for multi-tasking. For now, you can find out more details by following the links below.
- Amdahl’s Law (Wikipedia, retrieved 11 October 2017)
- Validity of the Single Processor Approach to Achieving Large-Scale Computing Capabilities (PDF, retrieved 11 October 2017)
- Amdahl’s Law, Gustafson’s Trend and the Performance Limits of Parallel Applications (Intel PDF, retrieved 11 October 2017)
- EMC VNX2 MCx Multicore Everything (EMC white paper, PDF, retrieved 11 October 2017)
- Announcing NetApp ONTAP 9.3: The Next Step in Modernizing Your Data Management (NetApp Blogs, retrieved 11 October 2017)
- DataCore Rockets Past All Competitors, Sets the New World Record for Storage Performance (DataCore press release, retrieved 11 October 2017)
- Are DataCore’s SPC Benchmarks Unfair?
- DataCore MaxParallel Product Page (DataCore Website, retrieved 11 October 2017)
Comments are always welcome; please read our Comments Policy. If you have any related links of interest, please feel free to add them as a comment for consideration.
Copyright (c) 2009-2017 – Post #C54F– Chris M Evans, first published on https://blog.architecting.it, do not reproduce without permission.