Sunday, April 5, 2009

The real cost of storage

What's the real cost of storage? I get asked this question all the time, and it's so difficult to answer because it really does depend on so many factors from storage team to storage team. What's really surprising to me is that I'm being asked the question at all. You would think that everyone who runs a storage organization would know exactly what that number is. Some people simply look at their budget and say "here you go, this is what it costs". But can you break it down? Do you know where all of that money is going, and why? I think that's really what people are asking. They know how much they are spending, but they want to know why and how they can save money. Certainly in these economic times storage managers are asked to do more with less while the data continues to grow. So that leaves them asking, how? How do I manage to address this growing pile of data with fewer people, less CAPEX budget, and more demands from the business around things like disaster recovery?

So how do you address the question? How do you do more with less? A lot of storage managers are looking at the cost per GB of their disks and asking, can I get this number down? I think that they can, but it may mean doing some things in different ways than they have in the past. Specifically, here are some things to look at.

Tiered Storage

Yup, I'm recycling that idea again. Getting data off expensive spinning disk and onto cheaper disk saves money. I think that's been well established, and taking another look at how you are classifying your data is a worthwhile endeavor at this point in time. Why? Because things have changed in the last year or two, and those changes might have an impact on your data classification policies, so I think a review might be in order. For example, a few years ago when I was classifying data I used SATA disk pretty much just for dev/test and archive data. But things have changed, and now there's technology out there that will allow you to use SATA disk for some of your production workload. For that matter, some technology, such as IBM's new XIV, is now available that will even allow you to use SATA drives for all but your most demanding workloads. So, another look at your tiering policies and the SATA technology that's available today is probably a good use of your time if you're looking to save some money.
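To put a rough number on the tiering argument, here is a minimal sketch of a blended cost per GB calculation. The function name, the tier mix, the prices, and the capacity are all made-up placeholders for illustration, not real quotes, so plug in your own figures.

```python
def blended_cost_per_gb(tiers):
    """Return the weighted average cost per GB across storage tiers.

    tiers is a list of (fraction_of_data, cost_per_gb) pairs.
    The fractions must add up to 1.
    """
    total_fraction = sum(fraction for fraction, _ in tiers)
    if abs(total_fraction - 1.0) > 1e-9:
        raise ValueError("tier fractions must sum to 1")
    return sum(fraction * cost for fraction, cost in tiers)


# Hypothetical prices: $10/GB for tier 1 disk, $4/GB for SATA.
# Before the review: 80% of the data sits on tier 1.
before = blended_cost_per_gb([(0.8, 10.0), (0.2, 4.0)])
# After the review: 60% of the data has moved to SATA.
after = blended_cost_per_gb([(0.4, 10.0), (0.6, 4.0)])

total_gb = 100_000  # hypothetical environment size
savings = (before - after) * total_gb

print(f"before: ${before:.2f}/GB, after: ${after:.2f}/GB")
print(f"difference across {total_gb:,} GB: ${savings:,.0f}")
```

The point is only that the blended number moves with the mix, so a change in what SATA can handle changes the answer even if the per-GB prices stay put.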

Cost of Managing Your Storage

What does it really cost to manage your storage on a per GB basis? This is really the age old question of "how many TB of storage can a single storage admin administer?" that we have been asking for a long time. The answer to this question is critical since you probably aren't getting a whole lot more headcount right now, and you might even be asked to give some up. So how do you manage more disk space with the same or fewer people? First, you have to keep in mind all of the things that go into managing a TB of space. There's a lot more to it than just provisioning a TB to an application and then walking away, right? Here are a few examples of the kinds of things that go into managing a TB of space based on my experience:

  1. Provisioning – This one is obvious, right? But you would be surprised how many people have immature processes and procedures around disk provisioning, and how many still manage their disks with spreadsheets and command-line scripts, making the process time consuming and error prone.
  2. Backup/recovery – You have to make sure that your data is protected, and that you can get it back should the need arise. This can be a time consuming effort, and one place that you can look for efficiencies that will save you money. It's also a place that people sometimes forget to account for when they are buying more disks. Don't forget that as you add disk capacity, you also have to add backup/restore capacity. That means more tape, or backup disk, etc., but it also means that you have to account for the increased load on your backup admins.
  3. Disaster recovery – All of the same things I talked about above with backup/recovery also apply to DR.
  4. Data migration – Sooner or later you're going to have to move this data around. Whether it's because the lease is up on an array or because you need to re-tier the data doesn't matter. What matters is that this can be a costly process in terms of people time.
  5. Performance management – At some point you always get that call "hey, our database is slow and we've looked at everything else and haven't found the problem, can you look and see if it's the disks?" Unless you have some very mature performance management processes in place, this tends to turn into a huge people time sink.
  6. Capacity management – We all know that our data is growing, that's a given, so that means that we need to spend some time planning how we are going to address that growth. When are we going to have to make those new disk purchases, when will we have to buy a whole new array? What about the switches? Are we going to need to expand that environment when we bring in that new array as well?
  7. Documentation – Yes, that's right, I said it. Documentation is an important part of managing your storage, and it can take up quite a bit of the storage admin's time, but it has to be done.

So the question I always ask is, "how mature and efficient are your processes?" Do you have a high degree of automation around all of the above? What use are you making of technology to help you manage the processes above? If you have very mature processes, employ a high degree of automation, and make good use of technology to help you automate as many of those processes as possible, then you probably have done everything you can to drive down the cost of managing your storage. But now is a good time to take a look and see if you can improve any of those areas. For example, does my disk vendor really provide tools to make managing my disk arrays easier? Not just from a provisioning standpoint, but from the standpoint of all of the above. If not, maybe it's time to consider looking at another vendor, one that has better tools.

Let me leave you with a final thought in this area based on my experience. What I found when I was managing storage was that the cost of managing a TB of disk could easily meet or exceed the cost of buying that disk over the 3-4 year life of that disk. So, a myopic focus on who has the cheapest disks on a per GB basis may not make much sense. Perhaps what we should focus on is how much it costs to manage a TB of a particular vendor's disk. In other words, the 3-4 year TCO for any storage acquisition needs to include the cost of management, not just the per GB cost of the space.
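Here is a minimal sketch of that comparison, assuming management cost can be approximated as admin hours per TB per year times a loaded hourly rate. The two options, the prices, the hours, and the rate are all hypothetical numbers I made up to show the shape of the calculation, not measurements from any vendor.

```python
from dataclasses import dataclass


@dataclass
class StorageOption:
    name: str
    purchase_cost_per_tb: float         # one-time acquisition cost per TB
    admin_hours_per_tb_per_year: float  # people time to manage one TB


def tco_per_tb(option, years, admin_hourly_rate):
    """Purchase cost plus management cost for one TB over its life."""
    management = option.admin_hours_per_tb_per_year * admin_hourly_rate * years
    return option.purchase_cost_per_tb + management


# Hypothetical inputs: cheaper disk that takes more admin time versus
# pricier disk that is easier to manage.
cheap = StorageOption("cheapest per GB", 2000.0, 12.0)
easy = StorageOption("easier to manage", 3000.0, 4.0)

YEARS = 4            # life of the disk
HOURLY_RATE = 75.0   # assumed loaded cost of an admin hour

for option in (cheap, easy):
    total = tco_per_tb(option, YEARS, HOURLY_RATE)
    print(f"{option.name}: ${total:,.0f} per TB over {YEARS} years")
```

With these made-up inputs the management cost of the cheaper disk exceeds its purchase price, and the disk that costs more up front ends up cheaper over four years. Your numbers will differ, which is exactly why it's worth running them.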

SSD vs. Wide Striping

So, what's this got to do with the topic at hand? Well, I think that a lot of the argument around this is really an argument around the cost of managing disks. Both technologies have their places, both can help you address certain performance issues, and both can help you save money. The difference is that SSDs only help with a very small percentage of cases, whereas wide striping can help you with the vast majority of cases. What's more, wide striping can help you address those management costs and drive down that 3-4 year TCO I keep talking about, whereas SSDs really don't help there at all, and in a lot of cases, I believe that the 3-4 year TCO goes way up with SSDs. That's not to say that using SSDs in a targeted way isn't a good idea for those cases where you need the performance. But just keep in mind what I said about the cost of managing a TB of storage perhaps exceeding the cost of purchasing it in the first place. In the end, I think we need both, but I think that the bulk of your storage should be on a wide-striped array, where your storage admins don't have to spend a lot of time trying to figure out exactly where they should place the data so that the new LUNs will perform and the added load doesn't negatively impact existing applications.

My vision

So, ideally, I think that the storage team should have the vast majority of their data on an array that does wide striping, manage that space through some kind of virtualization engine, and purchase SSDs very tactically to address specific performance issues. Everything would be managed through the virtualization engine, which allows re-tiering of the data should that be necessary and makes migrations quicker, easier, and less impactful to the business when they are needed. You also need to deploy software to help you with performance management as well as capacity management, and something to help automate the documentation process. This means that there is very likely not a single vendor that can provide all of the technology, but rather you will need to put together a "best of breed" approach to your storage environment. Here's an example of one set of technologies that I think can help get you to where you want to be.

IBM XIV storage – The XIV provides wide striped storage on SATA disks and makes it all very easy to manage. This is where I would put the bulk of my data since my admins wouldn't have to sit there and try and figure out where to place the data, etc.

EMC CLARiiON – Put some flash drives in a CLARiiON and I think you have a great platform for those few LUNs that require the kind of performance that SSDs offer, if you have that kind of need.

DataCore SANsymphony – A software approach to SAN virtualization which allows you to move data around to different arrays without the users being aware that it's going on. This is the way that you address things like re-tiering of your data as well.

Akorri – This is a software tool that helps you to manage your entire storage infrastructure, find the bottlenecks, and generally free up storage admin time.

Quantum DXi 7500 – This is a deduplicating VTL that will help you reduce the amount of time that your backup admins spend troubleshooting failed backups.

Aptare Storage Console – This is software that will help you manage your backups. It will report on things like what backups failed, which of those were on SOX systems, etc.


The above are just a few examples of what's available out there to help you create a more mature, automated, easier to manage storage environment. They certainly aren't the only ones, but they show why you should be looking at that kind of technology. In the end, whatever you choose, making sure that you are truly addressing the 3-4 year TCO of your environment is the key to getting those management costs under control and allowing your storage/backup admins to manage larger and larger environments.



