Solid-State Cache Is Not a Silver Bullet
Solid-state storage (SSD) is disrupting the storage industry, and as a result all storage vendors, including NexGen, are scrambling to get ahead of the curve. Everyone has a different perspective on how to take advantage of the change. I wrote about how vendors are using SSD for caching in my previous post. Now let's focus on how most vendors use SSD for reads while ignoring any benefits for writes.
The issue that a lot of folks lose sight of is that SSD isn't a solution. By itself, it doesn't solve any real customer problem. SSD is a component of a broader solution that adds zero value without the right software capabilities surrounding it.
To me, the real meaty discussion is around the software capabilities used to manage SSDs and how they're implemented in the broader system context. There are countless ways to do this, each with different factors to weigh before choosing a route. I'd like to say that we (the storage industry) focus on customer problems and come up with the best solution for them, but that's not typically the case. Business issues like time-to-market, ease of implementation, engineering talent, existing investments, and internal politics all influence the route taken.
So what you see in the marketplace are different "waves" of approaches. The first wave is typically the least expensive and least time-consuming from an R&D perspective, allowing the fastest time-to-market. For storage vendors wishing to jump on the SSD bandwagon, this initial wave was plugging disk form-factor SSDs into existing products with legacy storage controllers. While it's the easiest thing to do from a vendor perspective, it severely limits the value of SSD to customers: it sacrifices capacity for performance and creates bottlenecks in legacy controllers (more on that here).
The second wave contains increasingly optimized implementations, and this is where most vendors are today. For example, EMC launched SSD drives in its products in 2008, and the industry quickly followed. NetApp took a different approach with Flash Cache, using PCIe flash cards as a system-level read cache. EMC took the next step with VFCache, PCIe flash cards used as a read cache in the server, with the ability to tier to a shared storage system.
Both approaches provide customer value light-years beyond the first wave of putting SSD drives behind legacy storage controllers, but they are still incomplete because they use SSD only as a read cache. That means none of the write workload benefits from SSD performance. It's an incomplete approach, validated by the fact that EMC announced Project "Thunder" alongside VFCache so it could talk around this issue.
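To make that limitation concrete, here's a minimal sketch of a read-only cache in Python; the ssd and disk objects and their methods are hypothetical stand-ins, not any vendor's actual API. Reads can be served at flash speed on a hit, but every write goes straight to the backing disk, so write latency never improves.

```python
# Minimal sketch of a read-only SSD cache (ssd and disk are hypothetical objects).
# Reads may be served from flash, but every write lands on disk, so the write
# workload sees no benefit from SSD at all.

class ReadOnlyCache:
    def __init__(self, ssd, disk):
        self.ssd = ssd      # fast tier, used only for cached reads
        self.disk = disk    # slow tier, holds the authoritative copy

    def read(self, block):
        data = self.ssd.get(block)
        if data is not None:            # cache hit: SSD latency
            return data
        data = self.disk.read(block)    # cache miss: disk latency
        self.ssd.put(block, data)       # populate the cache for next time
        return data

    def write(self, block, data):
        self.disk.write(block, data)    # write pays full disk latency
        self.ssd.invalidate(block)      # keep the cache consistent
```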
The challenge with serving a write workload out of SSD is that all data has to be in a protected state before the write is acknowledged back to the host, to ensure high availability. That means you have to create parity or some other form of data redundancy, which can significantly impact SSD performance. Existing RAID algorithms weren't designed for the extremely low latencies of SSD and severely limit performance. Creating new RAID algorithms is extremely difficult, especially for vendors with billion-dollar annual revenue streams from products that depend on those legacy RAID technologies.
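As a rough illustration of why parity-based protection is costly on the write path, here is a simplified sketch of the classic RAID-5 small-write sequence; the data_dev and parity_dev objects are hypothetical, and real arrays stripe parity across many drives rather than keeping it on one.

```python
# Simplified sketch of the RAID-5 small-write penalty (hypothetical devices).
# Updating one block requires four device I/Os plus parity math before the
# write can be acknowledged, versus one I/O for an unprotected write.

def raid5_small_write(data_dev, parity_dev, block, new_data):
    old_data = data_dev.read(block)        # 1. read the old data block
    old_parity = parity_dev.read(block)    # 2. read the old parity block
    # new parity = old parity XOR old data XOR new data
    new_parity = bytes(p ^ o ^ n for p, o, n in zip(old_parity, old_data, new_data))
    data_dev.write(block, new_data)        # 3. write the new data
    parity_dev.write(block, new_parity)    # 4. write the new parity
```

That read-modify-write overhead is tolerable when the devices already take milliseconds, but it is exactly the kind of overhead that erases the latency advantage of flash.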
The funny thing is that storage vendors taking this approach talk as if using SSD as a read cache were the be-all, end-all solution (kudos to EMC for admitting the need for Project Thunder). They do this only because it's all they have to offer right now, not because it's the best solution.
SSD is an order of magnitude faster than traditional disk drives (or "spinning rust," as we call them), and that holds even when you're writing sequentially to disk. The right approach from a customer perspective is to apply SSD to both reads AND writes.
Using SSD as a read cache is not a silver bullet and it’s not the best solution for the customer. It’s a trade-off made by vendors because managing write-intensive workloads out of SSD requires a complete re-architecture of their solutions.
The NexGen team recognized these challenges and started with a new architecture designed from the ground up to take advantage of SSD speed for both reads and writes. We believe customers should get the most return on their SSD investment and the most performance possible out of the system. We accomplish this by mirroring writes between the SSD tiers in each of our active/active storage processors. We then copy the write down to traditional disk for the best $/GB and clear one of the mirrored copies. Our Dynamic Data Placement engine makes real-time decisions about which data to keep on or promote to the SSD tier, based on the quality-of-service goals set by our customers.
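For illustration only, here's a rough sketch of that write lifecycle in Python; it is not NexGen's actual code, and the ssd_a, ssd_b, disk, and host objects are hypothetical stand-ins. A write is mirrored across the two processors' SSD tiers before it's acknowledged, then later destaged to spinning disk so one of the flash copies can be released.

```python
# Rough sketch (not actual NexGen code) of a write that is acknowledged only
# after it is mirrored across two SSD tiers, then later destaged to disk.
# ssd_a, ssd_b, disk, and host are hypothetical stand-ins.

def mirrored_write(host, ssd_a, ssd_b, block, data):
    ssd_a.write(block, data)     # copy on processor A's SSD tier
    ssd_b.write(block, data)     # mirrored copy on processor B's SSD tier
    host.acknowledge(block)      # the data is redundant, so ack at SSD speed

def destage(ssd_a, ssd_b, disk, block):
    disk.write(block, ssd_a.read(block))  # persist to spinning disk for best $/GB
    ssd_b.release(block)                  # drop one flash copy; the copy on A's SSD
                                          # plus the disk copy keep the data redundant
```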
SSD is a great tool, and at NexGen we've taken steps to make sure customers get the maximum advantage from it rather than leaving half the benefits on the table.