Game Changer for SAP on Hadoop & Oracle

So, EMC just came out with a new storage system called “DSSD”. It is a truly remarkable platform, which can do wonders for certain use cases in the SAP world. Originally announced at EMC World in 2014, the team has worked tirelessly not only to deliver on the original performance targets but also to harden the platform into an enterprise-grade system. While most coverage focuses on the incredible hardware capabilities, I am really amazed by the software stack that accompanies DSSD. I mean, you can build the fastest sports car in the world, but if you have an untrained driver behind the wheel, you won’t see the performance the car is capable of.

[Images: DSSD front; DSSD HBA]

DSSD’s D5 system includes not only the obvious part, the 5U rack-mounted storage appliance itself, but also the proprietary cable connections, PCIe Gen3 cards for the servers, and the direct I/O software installed on the server operating system, which enables direct access to D5 data, bypassing traditional OS system calls, POSIX structures, and SCSI interrupts. Chad Sakac (Virtual Geek & President, VCE – Converged Platform Division of EMC) published an interesting early blog post on DSSD in May 2015.

[Image: DSSD software stack]

While most all-flash systems available today focus on making flash economically appealing and sacrifice performance as a result, DSSD is taking the opposite approach, delivering the highest possible performance out of its flash modules while still ensuring data resiliency through its unique Cubic RAID® feature. You get 100TB of usable capacity (in the first version) with up to 10 million IOPS at 100GB/s and <100µs latency in just 5U. “Pick any TWO of these three major performance metrics (IOPS, GB/s, and latency)” was the rule of the past – DSSD delivers all three. And you can connect the D5 to up to 48 servers with redundant paths. In a July 2015 article in “The Register” you can read how TACC (Texas Advanced Computing Center) created a system based on multiple DSSD storage components with 1TB/s throughput and more than 250 million IOPS! You really have to stop and think for a moment about what this means and what you can do with such amazing performance.
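To get a feel for how these three headline numbers relate to each other, here is a small back-of-the-envelope sketch. It is my own arithmetic, not anything published by EMC: it assumes 1 GB = 10^9 bytes, treats the quoted figures as if they were all reached at the same time (peak IOPS and peak bandwidth are usually measured with different block sizes), and uses Little’s law to estimate how many I/Os would have to be in flight. The function names are made up for illustration.

```python
# Back-of-the-envelope relations between IOPS, bandwidth, and latency.
# Assumptions (not from EMC): 1 GB = 10**9 bytes, and all three headline
# figures are treated as if they were achieved simultaneously.

GB = 10**9


def bytes_per_io(throughput_bytes_per_s, iops):
    """Average transfer size implied by a bandwidth and an IOPS figure."""
    return throughput_bytes_per_s / iops


def outstanding_ios(iops, latency_us):
    """Little's law: I/Os in flight = arrival rate x time spent in the system."""
    return iops * latency_us / 1_000_000


if __name__ == "__main__":
    size = bytes_per_io(100 * GB, 10_000_000)
    depth = outstanding_ios(10_000_000, 100)
    print(f"implied average I/O size: {size:.0f} bytes")
    print(f"I/Os in flight at 100 µs: {depth:.0f}")
```

Under these assumptions, 100GB/s at 10 million IOPS works out to about 10 KB per I/O, and sustaining 10 million IOPS at 100µs needs roughly 1,000 I/Os in flight across all attached servers. Since the latency is quoted as below 100µs, that second number is an upper bound.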

SAP HANA Vora – Hadoop connect

Since SAP announced HANA Vora, many SAP customers have been investigating and experimenting with how to integrate larger data sets, often in Hadoop, as featured in Intel’s HANA Petabyte Scale project at SAP TechEd Las Vegas 2015. Because these larger data sets need to be processed in near real time, Hadoop, which was designed for batch analytics, will not provide the desired responsiveness on its own. DSSD has created its own data node implementation, called “DSSD Hadoop Plugin”, to provide amazing performance along with the benefits of shared storage for Hadoop workloads. This creates the ability to query much more data at the latencies mobile enterprise users demand. Have a look at Mike Olson’s (Founder & CSO, Cloudera) video to hear his view of the incredible capabilities Cloudera on DSSD offers. So, if you are thinking about SAP HANA Vora on Cloudera, you definitely want to consider DSSD’s D5 so that you don’t have any performance worries at all.

SAP on Oracle

Organizations that run SAP on large Oracle databases and currently have no plans to migrate to SAP HANA can gain breakthrough advantages with DSSD. The performance capabilities of DSSD are so significant that many of the traditional DBA techniques for staging data to meet business requirements become completely unnecessary. Materialized views, indexes, partitions, and copies of data (dedicated data marts) are often required only to increase the performance of a specific set of queries for business users. For example, if your SAP BWA (SAP Business Warehouse Accelerator) no longer meets your performance requirements and your SAP HANA project is still too far out, you could run SAP BW on DSSD. This would simplify your database design, because you don’t have to monitor and introduce indexes or manipulate cardinality to achieve your desired SQL access plan. It would also reduce database complexity and the associated admin tasks, and therefore reduce risk. Because this is a brand-new EMC platform, with snapshot and replication integration planned for later this year, you can continue to use well-established Oracle replication and backup solutions like Data Guard and RMAN in the meantime. The DSSD team tested Oracle on D5 and compared it to Oracle’s own top-performance engineered system (Exadata). For example, Exadata’s performance benchmark achieved a maximum of 4.1 million 8K IOPS at ~1 ms; Oracle on DSSD achieved a maximum of 5.25 million 8K IOPS at 340 µs in only 5U, whereas the Exadata storage requires 28U.
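To put the benchmark numbers above side by side, here is a short sketch that derives bandwidth, IOPS per rack unit, and the latency ratio from them. The input figures are the ones quoted in this post; everything else is my own assumption: “8K” is read as 8 KiB (8,192 bytes), “~1 ms” is taken as exactly 1,000 µs, and the rack-unit counts cover storage only, as stated above. The dictionary layout and function names are hypothetical.

```python
# Compare the two quoted benchmark results.
# Assumptions (mine, not the vendors'): "8K" = 8 KiB, "~1 ms" = 1000 µs,
# and rack units refer to the storage components only.

KIB = 1024
GIB = 1024**3

exadata = {"iops": 4_100_000, "latency_us": 1000, "rack_units": 28}
dssd = {"iops": 5_250_000, "latency_us": 340, "rack_units": 5}


def throughput_gib_s(iops, block_bytes=8 * KIB):
    """Bandwidth implied by an IOPS figure at a fixed block size."""
    return iops * block_bytes / GIB


def iops_per_rack_unit(iops, rack_units):
    """Performance density: IOPS delivered per rack unit of storage."""
    return iops / rack_units


if __name__ == "__main__":
    for name, system in (("Exadata", exadata), ("DSSD D5", dssd)):
        bandwidth = throughput_gib_s(system["iops"])
        density = iops_per_rack_unit(system["iops"], system["rack_units"])
        print(f"{name}: {bandwidth:.1f} GiB/s, {density:,.0f} IOPS per U")

    density_ratio = (iops_per_rack_unit(dssd["iops"], dssd["rack_units"])
                     / iops_per_rack_unit(exadata["iops"], exadata["rack_units"]))
    latency_ratio = exadata["latency_us"] / dssd["latency_us"]
    print(f"density ratio: {density_ratio:.1f}x, latency ratio: {latency_ratio:.1f}x")
```

With these assumptions, the D5 result corresponds to about 40 GiB/s of 8K traffic versus about 31 GiB/s for Exadata, roughly 7 times the IOPS per rack unit, and a bit under 3 times lower latency.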

So, if performance is your biggest problem or if any of these SAP use cases and examples resonate with you, I encourage you to learn more and to test EMC’s breakthrough DSSD against your specific scenarios.

For more information go to

http://www.emc.com/DSSD 

3 thoughts on “Game Changer for SAP on Hadoop & Oracle”

  1. Fantastic blog post, and thanks for sharing. A quick question: with both Isilon and DSSD being possible solutions for Big Data & Vora, does choosing the right platform mainly depend on the points below?

    • Performance: DSSD (performance) vs Isilon (cost)
    • Size: Isilon (petabytes) vs D5 (100s of terabytes)
    • Cost: (tied to performance)
