What is software-defined storage?

Canonical

on 18 May 2015

This article is more than 9 years old.

What is software-defined storage, and how do NAS and SAN appliances compare to software-defined storage?

Large-scale storage presents an inherent scalability challenge: how do you connect many disk drives to the producers and consumers of data while ensuring performance and durability, and without exceeding your bandwidth, capacity and budget limits? The most common way of addressing these requirements is to provide remote filesystems or block storage through NAS (Network Attached Storage) and SAN (Storage Area Network) appliances.

NAS and SAN appliances are typically built with proprietary hardware and powered by proprietary software that serves the relevant storage protocols, such as iSCSI or NFS; internally, the software handles replication, rebalancing and reconstruction. Their design normally allows for some scalability through added storage elements, but the number of nodes involved is typically low. They are generally built to be fault-tolerant through high internal redundancy: RAID, standby power supplies, and multiple network and disk bus interfaces.

Software-defined storage (SDS) embodies a different philosophy: the storage service is built from a cluster of server nodes based on commodity hardware, some of which act as proxies, managers or gateways, and some of which act as storage nodes. Each storage node is responsible for storing a subset of the overall data. This allows additional nodes to be added to provide greater storage capacity, higher availability or increased throughput; clusters of dozens and even hundreds of nodes are possible. The software powering the cluster manages data placement and may offer enhanced capabilities to clients, including novel interfaces such as HTTP REST APIs for object storage.
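To make the idea of software-managed data placement concrete, here is a minimal sketch in Python. It assumes rendezvous (highest-random-weight) hashing as the placement rule, which is only one of several possible schemes and is not necessarily what any particular SDS product uses. The node names, object keys and the `place` helper are hypothetical and exist only for illustration.

```python
import hashlib


def _score(node: str, key: str) -> int:
    """Deterministic pseudo-random score for a (node, key) pair."""
    digest = hashlib.sha256(f"{node}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def place(key: str, nodes: list[str]) -> str:
    """Pick the storage node responsible for a key.

    Assumption: rendezvous hashing, i.e. the node with the highest
    score for this key wins. Real systems use their own algorithms.
    """
    return max(nodes, key=lambda node: _score(node, key))


# Hypothetical four-node cluster holding 1000 objects.
nodes = [f"node-{i}" for i in range(4)]
keys = [f"object-{i}" for i in range(1000)]
before = {k: place(k, nodes) for k in keys}

# Grow the cluster by one node and recompute the placement.
after = {k: place(k, nodes + ["node-4"]) for k in keys}
moved = [k for k in keys if before[k] != after[k]]

print(f"{len(moved)} of {len(keys)} objects moved to the new node")
```

With this rule, an object changes owner only when the new node scores highest for it, so every moved object lands on the new node and the rest stay where they were. On average, growing from four nodes to five should relocate about one object in five, which is what lets a cluster add capacity without reshuffling all of its data.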

Fault tolerance in SDS is implemented by assuming that the failure domain is an entire node, and that no single node is essential. This dramatically reduces the hardware complexity (and cost) of the individual nodes, but also forces a horizontally scaling software architecture. Each node hosting a component of the service is designed to share state and data with its peers, allowing the resulting cluster to deliver high throughput while surviving the failure of individual nodes.
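The node-as-failure-domain idea can be sketched with a toy in-memory cluster. This is an illustration under stated assumptions, not a description of any real product: the `Cluster` class, its methods, the replica count of three and the hash-based replica selection are all hypothetical, and the sketch deliberately omits the repair and rebalancing a real system would perform after a failure.

```python
import hashlib


class Cluster:
    """Toy cluster that keeps each object on several distinct nodes."""

    def __init__(self, node_names, replicas=3):
        self.replicas = replicas
        self.nodes = {name: {} for name in node_names}  # node -> objects
        self.failed = set()

    def replica_nodes(self, key):
        """Return the nodes chosen to hold copies of this key.

        Assumption: nodes are ranked by a hash of (node, key) and the
        top `replicas` entries are used, so the copies always sit on
        different nodes.
        """
        def score(name):
            return hashlib.sha256(f"{name}:{key}".encode()).digest()

        ranked = sorted(self.nodes, key=score, reverse=True)
        return ranked[: self.replicas]

    def put(self, key, value):
        # Write a copy to every healthy replica node.
        for name in self.replica_nodes(key):
            if name not in self.failed:
                self.nodes[name][key] = value

    def get(self, key):
        # Read from the first healthy replica that holds the object.
        for name in self.replica_nodes(key):
            if name not in self.failed and key in self.nodes[name]:
                return self.nodes[name][key]
        raise KeyError(key)

    def fail(self, name):
        # Simulate losing a whole node: the failure domain is the node.
        self.failed.add(name)


cluster = Cluster([f"node-{i}" for i in range(5)], replicas=3)
cluster.put("report.pdf", b"contents")

# Lose the first-choice node; the object is still served by a peer.
cluster.fail(cluster.replica_nodes("report.pdf")[0])
data = cluster.get("report.pdf")
print(data)
```

Because no single node holds the only copy, each node can be simple and inexpensive; durability comes from the number of replicas and the placement logic rather than from redundant hardware inside one box.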

This is how giants like Google, Amazon and Baidu have successfully and economically deployed storage and compute for over a decade. Ubuntu Advantage Storage brings customers solutions with these same characteristics, built on solid open source technology and ready to deploy on your hardware today.

Learn more about Ubuntu Advantage Storage
