This is the fourth episode of “The Journey to a Hybrid Software Defined Storage Infrastructure”. It is an IBM TEC study by Angelo Bernasconi, PierLuigi Buratti, Luca Polichetti, Matteo Mascolo and Francesco Perillo.
More episodes will follow, and there may be a Season 2 next year as well.
To read the previous episodes, check here:
Enjoy your reading!
Moving to an SDS Hybrid Infrastructure is a journey, and it needs to go through some specific steps.
Consolidate and Virtualize: storage virtualization is strategic, and a storage monitoring system that discovers storage resources, checks their dependencies and tracks changes is imperative as well.
Only through a standardized lifecycle management process will we be able to get:
- Automated provisioning / de-provisioning
- Virtual Storage Pools
- Capture and catalog virtual images used in the data center
- Management of the virtualized environment
Then, by integrating virtualization management with IT service delivery processes, the infrastructure can supply:
- Elastic scaling
- Pay for use
- Self-service provisioning
- Simplified deployment with virtual appliances
- Workload / Virtual Servers provisioning and Workload Management
- Virtual Servers / Hypervisors
- Dedicated Storage
- Integrated Infrastructure
- Server, Storage and Network
- Specialized storage services
- Orchestration/Management of the virtualized environment
- Hybrid Clouds
- Business Policy Driven
- Dynamic Infrastructure
- Data and workload services
- Based on business policy
- QOS driven
- Regulatory compliance
- Cost and performance optimization
- Extended Enterprise Infrastructure
SDS Hybrid Infrastructure leverages Cloud Services
At the end of the journey, the main goal of an SDS Hybrid Infrastructure is to obtain a Storage for Workload Optimized System: in other words, a storage infrastructure able to match the requirements of the workload. A simple SDS Hybrid Cloud picture can be depicted as follows:
By applying different data transfer solutions, this architecture will be able to do the following for block data:
- Backup / Archive to the Cloud – physical media
- Backup / Archive to the Cloud by network – restore from the Cloud to on premise
- Pre-position data in the Cloud
- Migrate workloads to the Cloud – run against pre-positioned data
And for NAS:
- Backup / Archive to the Cloud by network – restore from the Cloud to on premise
- Pre-position data in the Cloud with Object Store gateway
- Pre-position data in the Cloud with AFM
- Migrate workloads to the Cloud – run against pre-positioned data
In a cloud environment, we can define some storage classes as seen by the guest VM. Each storage class could have one or more tiers of storage behind it.
An agnostic view of which storage technologies for cloud can match cloud storage classes, cloud storage platform services and customer workloads can be summarized as follows:
Since this study aims to show how it is possible to build an SDS Hybrid Infrastructure, its target is also to show how the IBM SDS portfolio can match and achieve these goals.
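To make the idea of storage classes backed by tiers more concrete, here is a minimal Python sketch. The class names, tier names, latency figures and costs are all made-up illustrative values, not taken from any IBM product, and the selection rule (the cheapest tier that still meets a latency requirement) is just one possible way to match a workload to a class.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Tier:
    name: str
    latency_ms: float    # typical latency (illustrative value)
    cost_per_gb: float   # relative cost (illustrative value)


@dataclass
class StorageClass:
    name: str
    tiers: List[Tier]    # one or more tiers behind the class


def cheapest_matching_tier(sc: StorageClass, max_latency_ms: float) -> Optional[Tier]:
    """Return the cheapest tier of this class that meets the latency requirement."""
    matching = [t for t in sc.tiers if t.latency_ms <= max_latency_ms]
    return min(matching, key=lambda t: t.cost_per_gb) if matching else None


def pick_storage_class(classes: List[StorageClass],
                       max_latency_ms: float) -> Optional[Tuple[StorageClass, Tier]]:
    """Pick the (class, tier) pair with the lowest cost that satisfies the workload."""
    best = None
    for sc in classes:
        tier = cheapest_matching_tier(sc, max_latency_ms)
        if tier is not None and (best is None or tier.cost_per_gb < best[1].cost_per_gb):
            best = (sc, tier)
    return best


# Hypothetical catalog of storage classes as a guest VM might see them.
CLASSES = [
    StorageClass("performance", [Tier("flash", 1.0, 10.0)]),
    StorageClass("standard", [Tier("flash", 1.0, 10.0), Tier("sas", 5.0, 4.0)]),
    StorageClass("capacity", [Tier("nl-sas", 15.0, 2.0), Tier("object", 100.0, 0.5)]),
]

if __name__ == "__main__":
    choice = pick_storage_class(CLASSES, max_latency_ms=5.0)
    print(choice[0].name, choice[1].name)
```

A workload that tolerates 5 ms lands on the "standard" class and its "sas" tier in this catalog, while a workload that needs less than 1 ms finds no match and would require a new class.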
In the next picture the IBM SDS Technology for Cloud Product Selection is shown.
This flexibility makes hybrid cloud the ideal platform on which to build cognitive solutions. And because data is the resource on which all cognitive solutions depend, IBM storage is the foundation for hybrid cloud and cognitive solutions. It protects that data, delivering it when, where, and how it’s needed with the efficiency, agility, and performance that cognitive solutions demand.
IBM storage delivers a host of powerful capabilities that enable an enterprise to easily exploit the full value of its data while simultaneously reducing the cost of its management to achieve optimal data economics throughout its lifecycle. These are just a few. There are many more. The point is, whatever the use case, IBM storage offers a robust solution that satisfies it.
The key to agility, efficiency and performance in the modern data center is software defined flexibility.
Software Defined Storage takes the intelligence that is in traditional storage systems (which are a combination of proprietary hardware and software) and makes it available to run on commodity hardware.
By decoupling the software from the underlying hardware, the capabilities of a particular software stack can then be deployed wherever and consumed however they are needed – on premises or in the cloud – as a fully-integrated solution, an appliance, software, a cloud service, or various combinations.
The next paragraphs describe the software defined storage solutions in the IBM SDS portfolio.
IBM Spectrum Virtualize helps simplify storage management by virtualizing storage in heterogeneous environments. Among other benefits, virtualization simplifies deployment of new applications and new storage tiers, eases movement of data among tiers, and enables consistent, easy-to-use optimization technologies across multiple storage types.
IBM Spectrum Accelerate™ is a highly flexible storage solution that enables rapid deployment of block storage services for new and traditional workloads, on-premises, off-premises and in a combination of both. Designed to help enable cloud environments, it is based on the proven technology delivered in IBM XIV Storage System and in use on more than 100,000 servers worldwide.
IBM® Spectrum Scale™ is scale-out file storage for high performance, large scale workloads either on-premises or hybrid cloud. It unifies storage for cloud, big data and analytics workloads to accelerate insights and deliver optimal cost and ease of deployment. It combines enterprise features with performance-aware intelligence to position data across disparate storage hardware, making data available in the right place at the right time.
IBM Cloud Object Storage enables storing and retrieving object data on-premises, in-the-cloud, or both with the ability to easily and transparently move data between them.
IBM® Spectrum Control provides storage and data optimization using monitoring, automation and analytics. It enables organizations to make an easy transition to virtualized, cloud-enabled, and software defined storage environments because it provides a storage management solution for all types of data and storage. It helps significantly reduce storage costs by helping optimize storage tiers, and simplify capacity and performance management, storage provisioning and performance troubleshooting.
IBM Spectrum Protect provides a single platform for managing backups for virtual and physical machines, and cloud data. Modern capabilities, such as scalable deduplication and cloud storage access, are delivered entirely in software, eliminating the requirement for deduplication appliances and cloud gateways in many instances.
IBM Spectrum Archive gives organizations an easy way to use cost-effective IBM tape drives and libraries within a tiered storage infrastructure. By using tape libraries instead of disks for Tier 2 and Tier 3 data storage (data that is stored for long-term retention), organizations can improve efficiency and reduce costs.
It is from these software defined capabilities that four key storage platforms emerge for Virtualized Storage, Cloud, Big Data, and Business Critical storage needs. These are the cornerstones of a cognitive storage infrastructure.
And because of their software defined flexibility, they are available in a range of deployment models including fully-integrated solutions, software, cloud services, and appliances. Notice also that we have all-flash offerings in every platform.
Combined with the rest of our storage portfolio, they provide capabilities that enable a business to be more than digital but to marshal valuable data assets with the efficiency, performance, and agility required to be a truly cognitive enterprise.
IBM storage solutions offer flexibility in deployment to make data available where and how it’s needed, in the form most easily consumed by the applications that depend on it, and with the best data economics possible, whether on-premises or off.
Cloud-Scale solutions based on IBM Spectrum Accelerate software are purpose-engineered for the demands of cloud deployments with strong support for multi-tenancy and Quality-of-Service, and deliver consistent high performance even with unpredictable workloads. They dramatically simplify scale-out and management by eliminating tuning, load-balancing, and most other storage management activities. They offer extreme ease of use and task automation, reducing administrative overhead, scale management to many petabytes in a single environment, and come with advanced mirroring, security and other enterprise capabilities including remote replication, multi-tenancy, snapshots and monitoring.
Versatile integration options make cloud infrastructure easy, with rich integrations for the cloud such as a REST API, a thorough command line interface, OpenStack Cinder, and deep VMware and Microsoft integrations.
The benefits of IBM Spectrum Accelerate are available as software, as a cloud-service, or in these fully-integrated solutions:
- The field-proven and much-loved XIV Gen3 storage system which is our capacity-optimized offering.
- IBM FlashSystem A9000, which integrates the extreme performance of IBM FlashCore technology, a full-featured data management stack, and flash-optimized data reduction in one very simple and efficient, all-inclusive 8U solution for cloud deployments.
- And IBM FlashSystem A9000R, designed for the global enterprise with data-at-scale challenges. It is a grid-scale, highly parallel, all-flash storage platform designed to drive business into the cognitive era with performance, MicroLatency response time and the reliability, usability and efficiency needed by today’s enterprise businesses.
All the products of the IBM Storage Portfolio will match the SDS Hybrid Infrastructure requirements and goals:
- To be ready for Private, Public and Hybrid Cloud
- Be flexible and agnostic thanks to the storage virtualization layer
- Leverage Flash Technology for Business Critical and Analytics applications
- Match backup and DR customer requirements
- Be easily managed with a top-down view in a single-pane product.
Given that storage in the cloud will play a crucial role in the future of storage, the Cloud or Multi Cloud Storage Gateway (TCT for IBM) will be one of the facilities that drive data from on premises to off premises.
An agnostic vision about how the Multi Cloud Storage Gateway will give benefits is shown in the next picture:
IBM has a technology, formerly called MCStore, that is already present in one product (Spectrum Scale) and that will be available in other SDS products in the near future. Currently this technology is called Transparent Cloud Tiering (TCT); it may have a different name in the future.
This technology will potentially (the roadmap needs to be confirmed from time to time) be present in all IBM SDS portfolio products, and also in other storage subsystems that currently represent the IBM enterprise storage offering, such as the DS8000 and XIV families.
The MCStore technology is already available in IBM Spectrum Scale and can provide benefits in the following use cases:
- Enable a secure, reliable, transparent cloud storage tier in Spectrum Scale with single namespace
- Based on GPFS Information LifeCycle Management (ILM) policies
- Leveraging GPFS Light-Weight Event technology (LWE)
- Supported Clouds
- AWS S3 (Amazon) and Swift
This solution will do a couple of things for you.
- Because we are looking at the last read date, data that is still needed but is highly unlikely to be read again can be moved automatically to the cloud. If a system needs the file/object, no re-coding is required because the namespace doesn’t change.
- If you run out of storage and need to ‘burst’ out because of some monthly/yearly job, you can move data around to help free up space on-prem or write directly out to the cloud.
- Data protection such as snapshots and backups can still take place. This is valuable to many customers: they know the data doesn’t change often, but they like the idea that they do not have to change their recovery process every time they want to add new technology.
- Cheap Disaster Recovery. Scale does have the ability to replicate to another system but as these systems grow larger and beyond multiple petabytes, replication becomes more difficult. For the most part you are going to need to recover the most recent (~90 Days) of data that runs your business. Inside of Scale is the ability to create mirrors of data pools. One of those mirrors could be the cloud tier where your most recent data is kept in case there is a problem in the data center.
- It allows you to start small and work your way into a cloud offering. Part of the problem some clients have is they want to take on too much too quickly. Because Scale allows customers to have data in multiple clouds, you can start with a larger vendor like IBM and then when your private cloud on OpenStack is up and running you can use them both or just one. The migration would be simple as both share the same namespace under the same file system. This frees the client up from having to make changes on the front side of the application.
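The last-read-date idea from the first point above can be sketched in a few lines of Python. This is not the Spectrum Scale ILM policy language and does not call any Spectrum Scale API; the `FileEntry` record, the `location` field and the 90-day threshold are assumptions made only for illustration. The point it shows is that migration changes where the data lives, while the path seen by applications stays the same.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class FileEntry:
    path: str                    # path in the single namespace; never changes
    last_access: datetime        # last read date used by the policy
    location: str = "on-prem"    # hypothetical placement flag: "on-prem" or "cloud"


def select_cold_files(files: List[FileEntry], now: datetime,
                      max_idle_days: int) -> List[FileEntry]:
    """Return on-prem files that have not been read for more than max_idle_days."""
    cutoff = now - timedelta(days=max_idle_days)
    return [f for f in files if f.location == "on-prem" and f.last_access < cutoff]


def migrate_to_cloud(files: List[FileEntry], now: datetime,
                     max_idle_days: int = 90) -> List[str]:
    """Flip the placement of cold files to the cloud tier and return their paths."""
    moved = select_cold_files(files, now, max_idle_days)
    for f in moved:
        f.location = "cloud"     # only the placement changes, not the path
    return [f.path for f in moved]


# Illustrative data: one file untouched for almost a year, one read recently.
NOW = datetime(2017, 1, 1)
FILES = [
    FileEntry("/fs/reports/2015.csv", datetime(2016, 1, 10)),
    FileEntry("/fs/db/current.dat", datetime(2016, 12, 20)),
]
MOVED = migrate_to_cloud(FILES, NOW)

if __name__ == "__main__":
    print(MOVED)
```

Running the policy a second time moves nothing, because the cold file is already flagged as living in the cloud tier.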
Today this feature is offered as an open beta only. The release is coming soon as they are tweaking and doing some bug fixes before it is generally available. Here is the link to the DevWorks page that goes into more about the beta and how to download a VM that will let you test these features out.
When data is replicated to the cloud using MCStore or a Multi Cloud Storage Gateway, it is more correct to talk about a backup-in-cloud solution than about disaster recovery. The data replicated, moved or backed up to the cloud is stored as objects, so it cannot be used immediately for DR purposes with the usual minimum RPO or RTO, but it is available for restore purposes as needed.
For IBM Spectrum Virtualize, what we expect in the near future with V7.8 is shown in the following picture, where:
- User configures backup / DR for production volumes via Spectrum Virtualize GUI
- MCS Gateway runs alongside Spectrum Virtualize and pulls full or incremental FlashCopy snapshots from Volumes
- MCS Gateway applies encryption, integrity protection etc. as configured
- MCS Gateway stores data and metadata to remote object store(s), thus snapshots can be incremental forever
- User restores snapshot from cloud to original or new Volume via Spectrum Virtualize GUI
This is a kind of built-in, easy-to-use “cloud-based time machine for enterprise block storage” for various use cases:
- Disaster recovery
- Data sharing
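The snapshot flow described above (full or incremental snapshots, data plus metadata stored as objects, restore to an original or new volume) can be illustrated with a small Python sketch. It is a toy model under stated assumptions: the object store is an in-memory dictionary, changed blocks are detected by comparing SHA-256 content hashes (a stand-in for however FlashCopy tracks changes), encryption is left out, and the `CloudSnapshotter` class is a hypothetical name, not part of Spectrum Virtualize.

```python
import hashlib
from typing import Dict, List, Tuple


class CloudSnapshotter:
    """Toy model of incremental-forever snapshots kept in an object store."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}    # stand-in for a remote object store
        self.manifests: List[List[str]] = []   # per-snapshot metadata: block keys in order

    def snapshot(self, volume_blocks: List[bytes]) -> Tuple[int, int]:
        """Upload only blocks not yet stored; return (snapshot id, blocks uploaded)."""
        manifest = []
        uploaded = 0
        for block in volume_blocks:
            key = hashlib.sha256(block).hexdigest()
            if key not in self.objects:        # block is new or changed since earlier snapshots
                self.objects[key] = block
                uploaded += 1
            manifest.append(key)
        self.manifests.append(manifest)
        return len(self.manifests) - 1, uploaded

    def restore(self, snapshot_id: int) -> List[bytes]:
        """Rebuild a volume from its manifest, verifying the integrity of each block."""
        blocks = []
        for key in self.manifests[snapshot_id]:
            block = self.objects[key]
            if hashlib.sha256(block).hexdigest() != key:
                raise ValueError("integrity check failed for block " + key)
            blocks.append(block)
        return blocks


# Illustrative run: a first full snapshot, then one changed block.
store = CloudSnapshotter()
volume = [b"aaaa", b"bbbb", b"cccc"]
first_id, first_uploaded = store.snapshot(volume)
volume[1] = b"BBBB"
second_id, second_uploaded = store.snapshot(volume)

if __name__ == "__main__":
    print(first_uploaded, second_uploaded)
    print(store.restore(first_id))
```

Because every snapshot keeps its own manifest, any point in time can be restored without replaying a chain of increments, which is what makes an incremental-forever scheme practical.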
As already said, part of this technology is already available, and part will become available in the short or medium term.
It is something to take into consideration, because its future evolution will be really interesting and will bring tremendous benefits to Hybrid SDS Cloud infrastructures.
This is the end of Episode #4. The next episode will come shortly.
Thank you for reading…Stay tuned!