I've written a few blogs about SAP HANA hardware in the past, but it strikes me that none of them deal with the simple facts. I suspect that most of the time, architects looking to buy SAP HANA would just like to know the facts so they can compare one vendor against another.
As I compiled this list, it became apparent that each of the hardware vendors has a slightly different value proposition. Some have upgradeable hardware. Some do blades. Some can provide it faster than others. It is up to you to decide which one provides the value proposition that suits you best.
What is the high-level SAP HANA architecture?
At a high-level, SAP HANA is made up of the following:
Intel Westmere-EX E7 system(s) with 2, 4 or 8 CPUs (20, 40, 80 cores)
128GB of RAM per CPU. A medium appliance has 4 CPUs and 512GB RAM
Log storage - usually Fusion-IO ioDrive, the same size as RAM
Disk storage - either local SAS storage or network storage depending on the configuration.
SAP HANA scales out to multiple nodes - in theory to an unlimited number. The largest appliances are currently 16 nodes, and this is expected to increase through 2012.
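The sizing rules above can be captured in a short sketch. This is purely illustrative, using the rules of thumb from this post (10 cores and 128GB of RAM per Westmere-EX CPU, log storage sized to match RAM); the function name is my own, not an SAP sizing tool.

```python
# Illustrative single-node HANA appliance sizing, based on the rules
# of thumb above: 10 cores per Westmere-EX CPU, 128GB RAM per CPU,
# and Fusion-IO log storage the same size as RAM.
CORES_PER_CPU = 10
RAM_PER_CPU_GB = 128

def appliance_size(cpus):
    """Return (cores, ram_gb, log_gb) for a given CPU count."""
    ram_gb = cpus * RAM_PER_CPU_GB
    return cpus * CORES_PER_CPU, ram_gb, ram_gb

# A "Medium" appliance has 4 CPUs: 40 cores, 512GB RAM, 512GB log
print(appliance_size(4))  # (40, 512, 512)
```

Running it for 2, 4 and 8 CPUs reproduces the small, medium and large configurations listed above.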
Who are the Single-Node Vendors?
A Single-Node HANA appliance is a good place to start, if you are considering testing out SAP HANA or require development or test instances. It is incredibly surprising how the offerings vary from vendor to vendor!
- Not available
+ Available and can be upgraded to the next size up
++ Available and can be upgraded to two sizes up
The size in Rack Units (RU) is for a typical 512GB "Medium" sized unit. Lead time is shown where the vendor has committed to agreed delivery times for units.
Who are the Multi-Node Vendors?
Much the same applies when we look at multi-node vendors - otherwise known as scale-out. Some require fewer disks. Some require network storage. Some have better density per TB. Note that all multi-node vendors now supply high availability as of SAP HANA SP04.
It's worth noting that IBM uses local storage with its GPFS distributed file system, which means that all the storage is in the server. All other vendors require a minimum of one network-attached storage unit for every 4 nodes.
IBM also deserves a special mention for three reasons. First, they commit to delivering any configuration (within reason) within 1 week. Second, they have demonstrated a working 100TB unit in an SAP lab. Lastly, IBM's is the only single-node medium appliance that can be used as the basis for a multi-node cluster, though it requires reconfiguration.
Which is the best vendor?
There is no "best vendor" — it depends on your situation. Most SAP HANA scenarios I am looking at now are multi-node, so vendors who do not have multi-node configurations would concern me.
As a buyer I would also be mindful of my existing strategy and how these systems fit into it. In most cases it probably does no harm to buy a HANA system from your existing strategic supplier, if they meet your needs.
Lastly I would consider the philosophical difference between replicated local GPFS storage, which is IBM's solution, and SAN storage, which is the solution from the other vendors.
Who is most cost effective?
The buying teams that I have spoken to found that the first quotations they received from different hardware suppliers varied substantially, but once these were pushed to a discounted sale price, the variation disappeared. My recommendation is not to let cost be the deciding criterion, but rather to negotiate with your chosen vendor.
Whose SAP HANA appliance is fastest?
There aren't any official SAP HANA benchmarks just yet, but I can let you know the following:
First, HP has much more hardware for a 512GB single-node appliance - double the Fusion-IO throughput and twice the disks of any other vendor. This ought to provide performance benefits.
Second, IBM's GPFS file system seems to offer interesting performance benefits in real life, as well as better memory throughput leading to faster response times. I put this down to closer relationships with SAP and Intel meaning deeper tuning.
Third, Cisco has the most disk drives per node in a multi-node appliance. This should lead to better load times.
How does High Availability work?
With SAP HANA scale-out, each active node holds 1/(N-M) of the content, where N is the total number of nodes and M is the number of spare nodes. This way, if a node fails, the others pick up the workload whilst a spare node is brought online.
In the IBM GPFS scenario, the data is replicated across the nodes to support this process; in the other hardware scenarios, the shared storage model means that this works out of the box.
What is coming next?
In short: consolidation, and improvements to performance.
First, Intel's Ivy Bridge platform will provide better density of SAP HANA systems. I'm thinking 8TB in 10U or similar - down from 64U - due to improvements in SAP HANA blade chassis. This is likely to arrive in 2013. What's more, Ivy Bridge should perform at least 2x as well as current HANA systems.
Second, SAP has invested in Violin, an SSD storage company. From what I understand, this is to stop the major hardware vendors from trying to cash in on expensive spinning disks. This could reduce 32TB of shared storage from 24U down to 4U.
Note to hardware vendors
I have based this on publicly available information. If you have certified units not on this list, please ask SAP to add them to their Product Availability Matrix. If I have made an error, feel free to correct me and I will update this.
Largely, this blog was written so architects can draw their own conclusions, so I will avoid making my own! If you have questions that I have not covered, please let me know and I will update this FAQ accordingly.