HSM without the headaches
Hierarchical Storage Management (HSM), Information Lifecycle Management (ILM), and Data Lifecycle Management (DLM). Everyone wants to manage their data intelligently to reduce their spending on storage infrastructure. The storage vendors and the trade rags would like to convince us that there are magic tools to solve this challenge. The truth is there is no magic tool to manage unstructured data. (I am not talking about the archiving tools that integrate with applications here; I am only talking about unstructured data.) I have tried many tools over the years and they are simply not cost effective. Don’t panic, though: in most cases, the solution is far simpler and far less expensive than HSM.
File services are a huge consumer of storage capacity. For the purposes of this conversation, let’s consider file services to be NFS or CIFS storage, whether on integrated appliances or on servers leveraging back end storage devices. In most environments I visit, the file serving infrastructure is using tier 1 disk drives (fibre channel, SCSI, or SAS). These disk drives are populated with data that is mostly idle, and the storage managers want to get that idle data onto a less expensive disk tier. The most common request is to transparently move the idle data to SATA-based devices.
Let’s walk through the scenarios for an environment with 20TB of unstructured data.
To make the example a little simpler, I am going to ignore both RAID capacity overhead and drive right-sizing. I am going to use 300GB FC drives for tier 1 and 1TB SATA for tier 2. I am going to assume that 10% of the data gets 90% of the IO. (While every environment is different, this is in line with what I see in file sharing environments.) Check out this interesting paper about a recent file server analysis: “Measurement And Analysis Of Large-Scale Network File System Workloads” by Andrew W. Leung and Ethan L. Miller from UC Santa Cruz and Shankar Pasupathy and Garth Goodson of NetApp.
Storing 20TB of data on tier 1 drives takes 69 drives (20TB * 1024GB/TB / 300GB/drive = 68.27 drives, rounded up). If 10% of that data is considered active, then a tiered environment would require 7 300GB drives for the tier 1 data and 18 1TB drives for the tier 2 data.
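The drive counts above can be reproduced with a few lines of Python. This is only a sketch under the same simplifications as the text (no RAID overhead, no drive right-sizing, 1TB = 1024GB); the function and variable names are my own, and the all-tier-2 count is derived from the same arithmetic rather than quoted from the figures above.

```python
GB_PER_TB = 1024  # conversion used in the example above

def drives_needed(capacity_gb: int, drive_size_gb: int) -> int:
    """Whole drives required to hold capacity_gb (integer ceiling division)."""
    return -(-capacity_gb // drive_size_gb)

total_gb = 20 * GB_PER_TB            # 20TB of unstructured data
active_gb = total_gb * 10 // 100     # assumption from the text: 10% is active
idle_gb = total_gb - active_gb

fc_drive_gb = 300                    # tier 1: 300GB FC drive
sata_drive_gb = 1 * GB_PER_TB        # tier 2: 1TB SATA drive

option_a = drives_needed(total_gb, fc_drive_gb)            # 100% tier 1
option_b_tier1 = drives_needed(active_gb, fc_drive_gb)     # tiered, tier 1 part
option_b_tier2 = drives_needed(idle_gb, sata_drive_gb)     # tiered, tier 2 part
all_tier2 = drives_needed(total_gb, sata_drive_gb)         # 100% tier 2

print(option_a, option_b_tier1, option_b_tier2, all_tier2)  # 69 7 18 20
```

Plugging in your own capacity, active percentage, and drive sizes gives a quick first pass at the comparison before any pricing is involved.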
There are two apparent solutions to this problem. The first is 100% tier 1 disk (option A) and the second is 10% tier 1 and 90% tier 2 disk (option B). Using all tier 1 disk will deliver the required performance and capacity. The downside is that the disk is expensive and takes a tremendous amount of power and cooling. This is the expensive solution storage managers are trying to escape from. The second option is to use a mix of tier 1 and tier 2 disk. This has the potential to make the disks significantly less expensive. The challenge here is the requirement for a magic HSM tool. These tools are so expensive that they often cost more than is saved by using tier 2 disks. Additionally, they are very complex to deploy and manage.
There is a third option that is often not considered. Use 100% tier 2 disk.
Is it practical to use 100% tier 2 disk? Yes, in most environments the unstructured data will perform just fine on tier 2 disks. Let’s go back to the 10% tier 1 example for a minute. In this example the small number of tier 1 disks are being asked to shoulder 90% of the IO workload. The tier 2 drives in that example are nearly idle. When we use 100% tier 2 disk, we are able to put all of the spindles to work. Why pay the high price for tier 1 disk to centralize the workload and leave 70%+ of the drives underutilized? Put those tier 2 disks to work.
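To put rough numbers on that, here is a small sketch of how much of the total IO each drive is asked to carry in the tiered layout versus an all tier 2 layout. It assumes the IO spreads evenly across the drives within a tier, which is an idealization, and it assumes 20 1TB drives for the all tier 2 case (20TB with no RAID overhead). It says nothing about how many IOPS a particular SATA drive can actually sustain; that still has to be checked against your workload.

```python
def io_share_per_drive(io_percent: float, drive_count: int) -> float:
    """Percent of total IO each drive carries, assuming an even spread."""
    return io_percent / drive_count

# Tiered layout from the example: 7 tier 1 drives take 90% of the IO,
# 18 tier 2 drives take the remaining 10%.
tiered_tier1 = io_share_per_drive(90, 7)
tiered_tier2 = io_share_per_drive(10, 18)

# All tier 2 layout: 20 drives (assumed count) share 100% of the IO.
all_tier2 = io_share_per_drive(100, 20)

print(f"tiered, tier 1 drive: {tiered_tier1:.1f}% of total IO")  # 12.9%
print(f"tiered, tier 2 drive: {tiered_tier2:.1f}% of total IO")  # 0.6%
print(f"all tier 2 drive:     {all_tier2:.1f}% of total IO")     # 5.0%
```

Under these assumptions each drive in the all tier 2 layout carries well under half the share that a tier 1 drive carries in the tiered layout, while the tier 2 drives in the tiered layout sit nearly idle.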
Disk IOPS are the most common performance limiter I see, so I am always looking for ways to spread out the workload. Modern disk drives not only run fine with a mix of active and idle data, they actually need to host some idle data. If a drive were filled to capacity with active data, it would most likely be unable to handle the workload.
Disclaimer: This is not true for every environment. Some environments drive too much IO to leverage tier 2 disk effectively. For those environments, I suggest using 100% tier 1 disk. Yes, I will admit that in some extreme cases, the HSM solutions make sense. Far more often, the cost effective approach is to stick with either 100% tier 1 or 100% tier 2 disk.
Great points. I have worked with a number of customers who have tried or are attempting to implement a hierarchical storage management solution. Many of these customers have attempted to metaphorically duct tape multiple solutions together to meet their needs. There are global namespaces, stubs, and/or shortcuts, but the fact is that none of these solutions is simple or ideal.
I think it would behoove any organization to do the math and see if moving all data to tier two storage would work for them.