Jumbo Frames for NFS & iSCSI VMware Datastores

June 1st, 2010 Jesse St. Laurent Comments off

We have been working on a comparison between VMware datastores running on NFS, iSCSI, and FC. (Stay tuned. We will publish those results shortly.) Along the way we were reminded of the performance boost that jumbo frames can provide. These tests were run using the same ‘boot storm’ test harness on the server side we have used before (details can be found at the end of this post). The question is, “How much faster will ESX be with jumbo frames enabled?”

Let’s jump right to the answer… Read more…
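As background for the question above, here is a rough sketch of the per-frame arithmetic behind jumbo frames. It is not taken from the test results. It assumes standard Ethernet framing without VLAN tags, IPv4 and TCP headers without options, and the common MTU values of 1500 and 9000 bytes; the function names are illustrative only.

```python
# Per-frame overhead on the wire, in bytes (standard Ethernet, no VLAN tag).
ETH_OVERHEAD = 8 + 14 + 4 + 12   # preamble+SFD, header, FCS, inter-frame gap
IP_TCP_HEADERS = 20 + 20         # IPv4 + TCP headers without options (assumption)


def payload_efficiency(mtu):
    """Fraction of wire bytes that carry payload when every frame is full size."""
    payload = mtu - IP_TCP_HEADERS
    return payload / (mtu + ETH_OVERHEAD)


def frames_per_gib(mtu):
    """Full-size frames needed to move 1 GiB of payload, rounded up."""
    payload = mtu - IP_TCP_HEADERS
    return -(-(1 << 30) // payload)  # ceiling division


if __name__ == "__main__":
    for mtu in (1500, 9000):
        print(f"MTU {mtu}: {payload_efficiency(mtu):.2%} payload, "
              f"{frames_per_gib(mtu):,} frames per GiB")
```

The wire efficiency only improves by a few percentage points. The larger difference is the frame count, which drops by roughly a factor of six, so any measured speedup likely owes more to reduced per-frame processing than to saved header bytes.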

Categories: Storage, Systems

Oracle/Sun F20 Flash Card – How fast is it?

April 15th, 2010 Jesse St. Laurent Comments off

I received several questions about the performance of the Oracle/Sun F20 flash card I used in my previous post about block alignment, so I put together a quick overview of the card’s performance capabilities. The following results are from testing the card in a dual-socket 2.93GHz Nehalem (X5570) system running Solaris x64. This is similar to the server Oracle uses in the Exadata V2 platform.

The F20 card is a SAS controller with 4 x 24GB flash modules attached to it. You can find more info on the flash modules on Adam Leventhal’s blog and the official Oracle product page has the F20 details.

All of my tests used 100% random 4KB blocks. I focused on random operations, because in most cases it is not cost effective to use SSD for sequential operations. These tests were run with a variety of different thread counts to give an idea of how the card scales with multiple threads. The first test compared the performance of a single 24GB flash module to the performance of all 4 modules. Read more…
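For readers who want to try something similar, here is a minimal sketch of a multi-threaded random 4KB read loop. It is not the tool used for these tests. The function name `random_read_rate`, the scratch file, and the thread counts are all assumptions for illustration, and because it reads through the operating system's page cache, its numbers say nothing about real device performance. It relies on `os.pread`, so it is Unix-only.

```python
import os
import random
import tempfile
import threading
import time

BLOCK_SIZE = 4096  # 4KB, matching the block size used in the tests


def random_read_rate(path, threads, reads_per_thread):
    """Issue random, block-aligned 4KB reads from `path`; return reads per second."""
    blocks = os.path.getsize(path) // BLOCK_SIZE
    fd = os.open(path, os.O_RDONLY)

    def worker(seed):
        rng = random.Random(seed)  # per-thread generator, seeded for repeatability
        for _ in range(reads_per_thread):
            offset = rng.randrange(blocks) * BLOCK_SIZE
            os.pread(fd, BLOCK_SIZE, offset)  # positional read, no shared file offset

    pool = [threading.Thread(target=worker, args=(i,)) for i in range(threads)]
    start = time.perf_counter()
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    elapsed = time.perf_counter() - start
    os.close(fd)
    return threads * reads_per_thread / elapsed


if __name__ == "__main__":
    # Small scratch file so the sketch runs anywhere; a real test would target the device.
    with tempfile.NamedTemporaryFile(delete=False) as scratch:
        scratch.write(os.urandom(BLOCK_SIZE * 1024))
    try:
        for n in (1, 2, 4, 8):
            print(f"{n} threads: {random_read_rate(scratch.name, n, 2000):,.0f} reads/s")
    finally:
        os.unlink(scratch.name)
```

A serious benchmark would bypass the page cache, target the raw device, and run long enough to reach steady state. The sketch only shows the shape of the workload: fixed-size reads at random aligned offsets, repeated at several thread counts.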

Block alignment is critical

March 26th, 2010 Jesse St. Laurent Comments off

Block alignment is an important topic that is often overlooked in storage. I read a blog entry by Robin Harris a couple of months back about the importance of block alignment with the new 4KB-sector drives. I was curious to test the theory on one of the new 4KB drives, but I did not have one on hand. That got me thinking about Solid State Disk (SSD) devices. If filesystem misalignment hurts traditional spinning disk performance, how would it impact SSD performance? In short, it is ugly.

Here is a chart showing the difference between aligned and misaligned random read operations to a Sun F20 card. I guess it is officially an Oracle F20 card. Read more…
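To make the misalignment penalty concrete, here is a small sketch that counts how many physical blocks a single 4KB I/O touches. The 4KB physical block size and the two partition start sectors are illustrative assumptions, not details of the test setup: sector 63 is the classic misaligned partition start, and sector 64 falls on a 4KB boundary.

```python
def physical_blocks_touched(offset, length, physical_block=4096):
    """Count the physical blocks covered by an I/O of `length` bytes at byte `offset`."""
    first = offset // physical_block
    last = (offset + length - 1) // physical_block
    return last - first + 1


# Illustrative partition starts, in bytes (512-byte sectors).
ALIGNED_START = 64 * 512      # multiple of 4096
MISALIGNED_START = 63 * 512   # not a multiple of 4096


def average_blocks_per_io(partition_start, io_count=1000, io_size=4096):
    """Average physical blocks touched by consecutive filesystem-aligned 4KB I/Os."""
    touched = [
        physical_blocks_touched(partition_start + i * io_size, io_size)
        for i in range(io_count)
    ]
    return sum(touched) / len(touched)


if __name__ == "__main__":
    print("aligned:   ", average_blocks_per_io(ALIGNED_START))
    print("misaligned:", average_blocks_per_io(MISALIGNED_START))
```

With the misaligned start, every 4KB request straddles two physical blocks, so the device does twice the work per read. Writes can fare worse, since a partial-block write may require a read-modify-write of each block it touches.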

TechForum Presentation

March 12th, 2010 Jesse St. Laurent Comments off

I spoke at TechForum in New York earlier this week. Here is a copy of my presentation for anyone who is interested. The official title is “Rethinking Storage Strategies: How Virtualization is Transforming Storage.” At a high level, I spoke about the current trends in storage and how they play together with server virtualization. I do not think it will have the same impact without the running commentary, so feel free to comment here or drop me a line if you have any questions.

  Storage Trends and Server Virtualization (199.0 KiB)

Exadata V2 Surprises

February 22nd, 2010 Peter Galvin Comments off

When Oracle announced the Exadata V2 database appliance late last year, it created quite a stir. The performance numbers for the box are extremely high, and the feature set and capacity are quite large.

Last week we had an executive briefing for folks interested in Exadata V2. My colleagues Kurt Rosenfeld and John Laferrier presented information on business intelligence and the Exadata, as well as the business case and use cases for considering buying one. Joe LaFlamme from Oracle presented some reference customer examples.

I presented the Exadata V2 technical overview, traveling through the architecture details, migration strategies, and component details. Along the way there were a few points I made that seemed a bit surprising to the audience, and that led to a lively discussion. I summarize those points here, as they do not seem to be well known within the industry.

  • Existing Oracle licenses are transferable to Exadata (including Oracle DB, RAC, and Partitioning). That can greatly reduce the cost of an Exadata that is being used for database consolidation, for example.
  • The Exadata looks to be an excellent consolidation engine. Included with the Exadata software are resource management tools that can, for example, give some databases resource priority over others. These tools also allow the use of the flash storage to be fine tuned, pinning specific tables into flash or letting Oracle use the flash as an extended cache.
  • The Exadata V2 is designed to be able to perform OLTP and Data Warehouse transactions concurrently. If a single system can be used both ways, consider the implications compared to stand-alone, separate Data Warehouse solutions. Normally data must be extracted from the OLTP system, copied to the DW system, imported there, and then processed. The extraction and copying are overhead on both the OLTP and DW systems. And any reports or queries on the DW system are performed against “stale data” – data from the time the extraction started. Now consider being able to do DW operations against live, current OLTP data. And according to the performance numbers published by Oracle, those operations could run much faster than on most DW systems. That speed could make it possible to complete more complex reports, allow more ad hoc queries, and so on. Such a change could be a fundamental advantage to DW consumers (finance and senior management, for example).
  • Read more…

Categories: Storage, Systems

Column – OpenSolaris Crossbow

February 17th, 2010 Peter Galvin Comments off

Project Crossbow is an innovative, and I think important, new contribution to the OpenSolaris project. Crossbow makes network virtualization and resource management first-class citizens in OpenSolaris. It follows in the footsteps of ZFS by having a simple and easy-to-understand interface, while providing great flexibility and power to the administrator. Crossbow can only be found in OpenSolaris, and is not available in Solaris 10. My February column for ;login: Magazine describes and explores Project Crossbow in detail. You can download it here, but as always I encourage you to become a member of Usenix, thereby gaining access to all of the content of ;login: (along with many other great benefits).

  2010-02-galvin.pdf (678.9 KiB)

You Are Invited to the New England Open Solaris Users Group (NEOSUG) Ninth Meeting

January 22nd, 2010 Peter Galvin Comments off

Topic: DTrace Deep Dive and a short talk on LDOM Domains and ZFS

When:
Burlington MA Sun Campus – Feb 2, 2010 6:00PM to 9:00 PM
Boston MA – Boston University – Feb 3, 2010 6:00PM to 9:00 PM
(Note: The same content will be presented twice – once in Burlington and once in Boston. Pick whichever location and date is most convenient.)

Where:
Feb 2 – Sun Microsystems Burlington Campus; 1 Network Drive, Burlington, MA
Feb 3 – Boston University, Electrical and Computer Engineering Department Photonics Center Building – Room PHO 339 (3rd floor), 8 Saint Mary’s Street Boston, MA 02215
BU Parking: Street parking available on St. Mary’s Street and Bay State Road. Metered parking spots do not require a fee after 6pm.

RSVP: To Linda Wendlandt: lwendlandt@cptech.com

Registration Required! – so we can plan food and drink

Join Jim Mauro and Shannon Sylvia for how-to DTrace, and how to use LDOMs with ZFS.

AGENDA:

6:00-6:20: Registration, Pizza and Beverages

6:20-6:30: Introductions: Peter Galvin, CTO, Corporate Technologies

6:30-8:30: Solaris Dynamic Tracing – DTrace – Jim Mauro, Principal Engineer, Sun Microsystems

8:30-9:00: LDOM Domains and ZFS: An example of creating a ZFS bootable root LDOM domain using jumpstart – Shannon Sylvia, Sysadmin, Northeastern University

9:00 Q&A and Discussion

Also we’ll be giving out official NEOSUG T-Shirts and other trinkets, and copies of the OpenSolaris CD and instruction manual.

For more information see the NEOSUG discussion forum.

Categories: Events

VMware boot storm on NetApp – Part 2

December 28th, 2009 Jesse St. Laurent 2 comments

I have received a few questions relating to my previous post about NetApp VMware boot storm results and want to answer them here. I have also had a chance to look through the performance data gathered during the tests and have a few interesting data points to share. I also wanted to mention that I now have a pair of second generation Performance Accelerator Modules (PAM 2) in hand and will be publishing updated VMware boot storm results with the larger capacity cards.

What type of disk were the virtual machines stored on?

  • The virtual machines were stored on a SATA RAID-DP aggregate.

What was the rate of data reduction through deduplication?

  • The VMDK files were all fully provisioned at the time of creation. Each operating system type was placed on a different NFS datastore. This resulted in 50 virtual machines on each of 4 shares. The deduplication reduced the physical footprint of the data by 97%.

Here are a few interesting stats gathered during the testing. These numbers are not exact, due to the somewhat imprecise nature of starting and stopping statit in synchronization with the start and end of each test.

  • The CPU utilization moved inversely with the boot time. The shorter the boot time, the higher the CPU utilization. This is not surprising, as during the faster boots the CPUs were not waiting around for disk drives to respond. Because more data was served from cache, the CPUs could stay more utilized.
  • The total NFS operations required for each test was 2.8 million.
  • The total GB read by the VMware physical servers from the NetApp was roughly 49GB.
  • The total GB read from disk trended down between cold and warm cache boots. This is what I expected, and I would be somewhat concerned if it were not true.
  • The total GB read from disk trended down with the addition of each PAM. Again, I would be somewhat concerned if this was not the case.
  • The total GB read from disk took a significant drop when the data was deduplicated. This helps to prove out the theory that NetApp is no longer going to disk for every read of a different logical block that points to the same physical block.
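The last point can be illustrated with a toy model of a cache that is keyed by physical block rather than logical block. This is only a sketch of the general idea, not a description of how NetApp implements it. The class name, the unbounded cache, and the layout of 50 identical VMs with 1,000 blocks each are all assumptions chosen to keep the example small.

```python
class DedupAwareCache:
    """Toy read cache keyed by physical block id, with unlimited capacity."""

    def __init__(self, block_map):
        self.block_map = block_map  # logical block id -> physical block id
        self.cached = set()         # physical blocks currently held in cache
        self.disk_reads = 0

    def read(self, logical_block):
        physical = self.block_map[logical_block]
        if physical not in self.cached:
            self.disk_reads += 1    # cache miss: fetch the physical block once
            self.cached.add(physical)


def boot_disk_reads(block_map):
    """Read every logical block once (a crude 'boot') and return disk reads."""
    cache = DedupAwareCache(block_map)
    for logical_block in block_map:
        cache.read(logical_block)
    return cache.disk_reads


# Hypothetical layout: 50 identical VMs, 1,000 logical blocks each.
VMS, BLOCKS = 50, 1000
NO_DEDUP = {(vm, b): (vm, b) for vm in range(VMS) for b in range(BLOCKS)}
DEDUP = {(vm, b): b for vm in range(VMS) for b in range(BLOCKS)}  # shared physical blocks

if __name__ == "__main__":
    print("no dedup:", boot_disk_reads(NO_DEDUP), "disk reads")
    print("dedup:   ", boot_disk_reads(DEDUP), "disk reads")
```

In this idealized model the first VM to boot pulls each shared block from disk and every later VM hits the cache. Real images are not fully identical and real caches are finite, so the measured reduction is smaller than the toy suggests.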

How much disk load was eliminated by the combination of dedup and PAM?

  • The cold boots with no dedup and no PAM read about 67GB of data from disk. The cold boot with dedup and no PAM dropped that down to around 16GB. Adding 2 PAM (or 32GB of extended dedup-aware cache) dropped the amount of data read from disk to less than 4GB.
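Expressed as percentages, the figures above work out as follows. This is simple arithmetic on the approximate numbers quoted in the answer; the 4GB value is an upper bound, so the last reduction is a minimum. The variable and function names are illustrative.

```python
# Approximate GB read from disk during cold boots, from the figures quoted above.
DISK_READS_GB = {
    "no dedup, no PAM": 67,
    "dedup, no PAM": 16,
    "dedup, 2 PAM": 4,   # upper bound: the text says "less than 4GB"
}
BASELINE_GB = DISK_READS_GB["no dedup, no PAM"]


def reduction_vs_baseline(gb_read):
    """Fraction of baseline disk reads eliminated for a given configuration."""
    return 1 - gb_read / BASELINE_GB


if __name__ == "__main__":
    for config, gb in DISK_READS_GB.items():
        print(f"{config}: {gb}GB read, {reduction_vs_baseline(gb):.0%} less than baseline")
```

Deduplication alone removes roughly three quarters of the disk reads, and adding the two PAM cards brings the total reduction to at least about 94%.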

Building a ZFS Deduplication System

December 24th, 2009 Peter Galvin 3 comments

The news of Sun integrating an in-line deduplication feature into ZFS has created quite a buzz in storage circles. And our clients have been asking us about how to gain access to this new feature. This blog post describes the steps needed to build an OpenSolaris server, integrate the deduplication feature, and enable it.

For details about the ZFS deduplication feature, what it does, and how it does it, have a look at Jeff Bonwick’s blog post on the topic. He was the lead engineer on the project so you can take his word on it.

Deduplication was integrated into OpenSolaris build 128. That takes a little explanation. Solaris is Sun’s current commercial operating system. OpenSolaris has two flavors – the semiannual supportable release, and the frequently-updated developer release. The current supportable release is called 2009.06 and is available for download here. Also at that location is the “SXCE” latest build. That distribution is more like Solaris 10 – a big ol’ DVD including all the bits of all the packages. OpenSolaris is the acknowledged future of Solaris, including a new package manager (more like Linux) and a live-CD image that can be booted for exploration, and installed as the core release. To that core more packages can be added via the package manager.
Read more…

Categories: Storage, Systems

Column – Immutable Service Containers in OpenSolaris

December 21st, 2009 Peter Galvin Comments off

The OpenSolaris security team has added an interesting proof of concept feature. Immutable Service Containers are designed to make building, configuring, and recreating pre-secured containers easier. The net result, if incorporated into OpenSolaris and eventually a future version of Solaris, should be a set of security best practices managed via a feature-rich framework. Between now and then, there is quite a bit of work for the team to do. My December 2009 column in ;login: Magazine discusses the design goals and current state of Immutable Service Containers. Members of USENIX can read it on-line, while others can download it here:

  2009-12-galvin-login-column.pdf (269.0 KiB)

Categories: Systems