Week 10
Storage and Persistence (Cinder)
OPS3 - Virtualization and Cloud Infrastructure
1. The Hierarchy of Cloud Storage
1.1 Ephemeral vs. Persistent Storage
- It is crucial to distinguish between the two primary storage models in cloud architectures: ephemeral and persistent.
- Ephemeral Storage (Nova) is strictly tied to the lifecycle of the Compute Instance.
- It functions effectively as a local scratch disk, optimized for operating system caches and temporary file processing.
- However, if the instance is terminated or the underlying hypervisor fails, this data is irrevocably lost.
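- In contrast, Persistent Storage (Cinder) is an independent resource: a volume has its own lifecycle, survives the deletion of the instance it was attached to, and can be reattached to another instance.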
Section 1 Checkpoint
Summary:
- Ephemeral Storage (Nova): Temporary, fast, dies with the VM. Use for OS / Cache.
- Persistent Storage (Cinder): Durable, independent, survives VM deletion. Use for Databases / Critical Data.
- Analogy: Ephemeral is RAM/Swap; Persistent is the Hard Drive.
Reflection:
- Why shouldn't you store your customer MySQL database on the Nova Ephemeral disk?
- If you delete a Cinder Volume, is the data recoverable? (Hint: Only if you have a Backup).
2. Block Storage Architecture (Cinder)
2.1 The Driver Model
Just as Nova uses virtualization drivers to talk to different hypervisors (for example KVM via libvirt, or VMware), Cinder employs a Volume Driver architecture to communicate with diverse storage backends.
Figure 1: Cinder Architecture - The Cinder Scheduler selects the backend, and the Volume Driver translates API calls into storage commands
- Laboratory: The LVM Driver manages local logical volumes on a standard Linux server.
- Enterprise: Customized drivers for Dell EMC, NetApp, or HPE arrays translate API calls into proprietary storage commands.
- Scale-Out: The Ceph Driver allows Cinder to provision resources from a distributed, software-defined storage cluster.
When a user executes a creation command, the Cinder Scheduler selects a backend, and that backend's Volume Driver instructs the storage system to provision the requested volume (a Logical Unit Number, or LUN, on a traditional array).
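One common way to steer the scheduler toward a particular driver is a volume type tied to a backend name. The sketch below illustrates that idea; the names `ceph-standard`, `ceph`, and `demo-vol` are hypothetical, and the `run` helper is a convenience of this sketch that only prints commands unless `APPLY=1` is set.

```bash
# Dry-run helper (an assumption of this sketch): prints each command and
# executes it only when APPLY=1 is set in the environment.
run() { echo "+ $*"; if [ "${APPLY:-0}" = "1" ]; then "$@"; fi; }

BACKEND_NAME="ceph"          # hypothetical; must match volume_backend_name in cinder.conf
VOLUME_TYPE="ceph-standard"  # hypothetical volume type name

# Create a volume type and bind it to the backend name.
run openstack volume type create "$VOLUME_TYPE"
run openstack volume type set --property volume_backend_name="$BACKEND_NAME" "$VOLUME_TYPE"

# Requesting this type lets the scheduler pick the matching backend and driver.
run openstack volume create --type "$VOLUME_TYPE" --size 10 demo-vol
```

Run it once as-is to review the commands, then again with `APPLY=1` against a cloud where you have admin credentials loaded.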
2.2 The Attachment Process (iSCSI/RBD)
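The mechanism for attaching a volume to an instance involves a coordinated handshake between services: Nova asks Cinder for the volume's connection details, Cinder's driver exports the volume over the backend's protocol (an iSCSI target or a Ceph RBD image), and the compute host connects to it and presents it to the instance as a new block device.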
Figure 2: The Attachment Handshake - How Nova and Cinder coordinate to plug a remote disk into a running VM
2.3 Deep Dive: Storage Backends
Cinder operates as an abstraction layer, capable of interfacing with a diverse array of storage backends. In private cloud environments, the two most prevalent technologies serve as excellent examples of this flexibility: the Network File System (NFS) and the Ceph distributed storage cluster.
2.3.1 NFS (Network File System)
- The Network File System (NFS) represents the simpler deployment model, often utilized in smaller environments or laboratories.
- In this architecture, Cinder acts as a client that mounts a remote directory from an existing NAS appliance or Linux server (e.g., 192.168.1.5:/toptier).
- When a user requests a new volume, the Cinder Volume service generates a large file, typically in the QCOW2 format, within this mounted directory.
- While this approach is notably easy to implement—requiring only a standard Linux server or a commercial NAS like Synology—it suffers from scalability limitations.
- The performance of the entire cloud storage pool is often constrained by the throughput of the single network link connecting the Controller to the NAS, creating a significant bottleneck and a single point of failure.
2.3.2 Ceph (The Gold Standard)
- In contrast, Ceph represents the industry standard for production-grade OpenStack deployments.
- As a software-defined storage solution, Ceph eliminates the need for a central storage controller.
- Instead, it aggregates storage capacity from hundreds of individual hard drives distributed across many physical servers, unifying them into a massive, scalable "Pool."
- The integration between Cinder and Ceph is facilitated by the librbd library, which allows Cinder to manage RADOS Block Device (RBD) images.
- Ceph distinguishes itself through its self-healing capabilities and advanced snapshotting mechanism.
- When data is written to a Ceph-backed volume, it is split into 4 MB objects (the default RBD object size) and distributed deterministically across the cluster by the CRUSH placement algorithm.
- If a physical drive fails, the cluster automatically detects the missing objects and replicates them from surviving redundant copies, effectively healing the system without human intervention.
- Furthermore, because Ceph manages data as discrete objects, it can create instantaneous snapshots using a Copy-on-Write mechanism.
- This allows administrators to generate thousands of recovery points without incurring the performance penalties associated with traditional storage arrays.
Implementation Note: In production, Ceph is the preferred backend because it decouples storage from compute hardware entirely, allowing indefinite scaling.
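To make the object striping described above concrete, the arithmetic can be sketched in a few lines of shell. The 10 GiB volume size is an example value chosen for this sketch; the 4 MiB object size is the default mentioned in the text.

```bash
VOLUME_GIB=10   # example volume size (assumption of this sketch)
OBJECT_MIB=4    # default RBD object size

# Number of objects once every block of the volume has been written.
OBJECTS=$(( VOLUME_GIB * 1024 / OBJECT_MIB ))

echo "A fully written ${VOLUME_GIB} GiB volume maps to ${OBJECTS} objects of ${OBJECT_MIB} MiB"
```

Because RBD images are thin-provisioned, objects only come into existence as data is written, so a freshly created volume occupies far fewer objects than this upper bound.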
2.3.3 Configuring Cinder with Ceph
Note: This assumes a Ceph cluster is already running. To learn how to build one from scratch, see the Optional Ceph Setup Guide.
To configure OpenStack Cinder to use a Ceph cluster as its backend, the administrator must edit the cinder.conf file on the Controller node. The process involves three key steps: installing the client libraries, authenticating, and defining the driver.
1. Install Ceph Client:
The Cinder service requires the Ceph client tools and Python bindings (for librados and librbd) to communicate with the cluster over the Ceph public network.
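A minimal sketch of this step, assuming a Debian/Ubuntu Cinder node (on RHEL-family systems the same package names are normally installed with `dnf`). The `run` helper is a convenience of this sketch and only prints commands unless `APPLY=1` is set.

```bash
# Dry-run helper (an assumption of this sketch): prints each command and
# executes it only when APPLY=1 is set in the environment.
run() { echo "+ $*"; if [ "${APPLY:-0}" = "1" ]; then "$@"; fi; }

# ceph-common provides the ceph/rbd CLIs; python3-rbd and python3-rados
# provide the Python bindings used by the Cinder RBD driver.
run sudo apt-get update
run sudo apt-get install -y ceph-common python3-rbd python3-rados
```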
2. Authentication (Keyring):
OpenStack acts as a client "user" to the Ceph cluster. You must copy the authentication keyring from the Ceph Monitor node to the Cinder node.
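A sketch of the keyring step, intended to be run on a Ceph monitor or admin node. The pool name `volumes`, the user `client.cinder`, and the host name `controller` are assumptions of this sketch, as is root SSH access to the Cinder node. The `run` helper only prints commands unless `APPLY=1` is set.

```bash
# Dry-run helper (an assumption of this sketch): prints each command and
# executes it only when APPLY=1 is set. The printed form drops shell quoting.
run() { echo "+ $*"; if [ "${APPLY:-0}" = "1" ]; then "$@"; fi; }

POOL="volumes"                                   # assumed Cinder pool name
CINDER_HOST="controller"                         # hypothetical Cinder node
KEYRING="/etc/ceph/ceph.client.cinder.keyring"

# Create (or fetch) a Ceph user restricted to the Cinder pool and save its key.
run ceph auth get-or-create client.cinder mon 'profile rbd' osd "profile rbd pool=$POOL" -o "$KEYRING"

# Copy the keyring and the cluster config to the Cinder node.
run scp "$KEYRING" /etc/ceph/ceph.conf "$CINDER_HOST:/etc/ceph/"

# The cinder service account must be able to read the keyring.
run ssh "$CINDER_HOST" chown cinder:cinder "$KEYRING"
```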
3. Driver Configuration (/etc/cinder/cinder.conf):
Define a new backend section (e.g., [ceph]) and reference it in the enabled_backends list.
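The fragment below sketches what such a backend section might look like. It is written to a scratch file so it can be reviewed before being merged into `/etc/cinder/cinder.conf` by hand. The pool name `volumes` and user `cinder` are assumptions, and settings needed for Nova integration (such as `rbd_secret_uuid`) are deliberately left out.

```bash
# Write the fragment to a scratch file for review (nothing under /etc is touched).
FRAGMENT="$(mktemp)"

cat > "$FRAGMENT" <<'EOF'
[DEFAULT]
enabled_backends = ceph

[ceph]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
volume_backend_name = ceph
rbd_pool = volumes
rbd_user = cinder
rbd_ceph_conf = /etc/ceph/ceph.conf
EOF

echo "Review $FRAGMENT, merge it into cinder.conf, then restart the cinder-volume service."
```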
2.3.4 Configuring Cinder with NFS
For smaller deployments or lab environments, NFS is a common backend. It requires a dedicated text file to list the shares and a specific driver configuration.
1. Create Shares File:
Create a text file (e.g., /etc/cinder/nfs_shares) and list your NFS exports, one per line.
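As an illustration, the shares file for the example export used earlier (192.168.1.5:/toptier) could be generated as follows. The sketch writes to a scratch file by default; set `SHARES_FILE=/etc/cinder/nfs_shares` (as root) on a real node.

```bash
# Defaults to a scratch file so the sketch is safe to run anywhere.
SHARES_FILE="${SHARES_FILE:-$(mktemp)}"

# One NFS export per line, in host:/path form.
cat > "$SHARES_FILE" <<'EOF'
192.168.1.5:/toptier
EOF

cat "$SHARES_FILE"
```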
2. Set Permissions:
Ensure the Cinder user can read this file.
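One conventional way to do this, assuming the service runs under a `cinder` group, is to keep the file owned by root and readable by that group. The `run` helper is a convenience of this sketch and only prints commands unless `APPLY=1` is set.

```bash
# Dry-run helper (an assumption of this sketch): prints each command and
# executes it only when APPLY=1 is set in the environment.
run() { echo "+ $*"; if [ "${APPLY:-0}" = "1" ]; then "$@"; fi; }

SHARES_FILE="/etc/cinder/nfs_shares"

# root owns the file; members of the cinder group may read it.
run sudo chown root:cinder "$SHARES_FILE"
run sudo chmod 0640 "$SHARES_FILE"
```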
3. Driver Configuration (/etc/cinder/cinder.conf):
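A minimal sketch of an NFS backend definition, written to a scratch file for review before merging into `/etc/cinder/cinder.conf`. The section name `[nfs]` and the backend name are assumptions of this sketch.

```bash
# Write the fragment to a scratch file for review (nothing under /etc is touched).
FRAGMENT="$(mktemp)"

# The quoted 'EOF' keeps $state_path literal; Cinder expands it itself.
cat > "$FRAGMENT" <<'EOF'
[DEFAULT]
enabled_backends = nfs

[nfs]
volume_driver = cinder.volume.drivers.nfs.NfsDriver
volume_backend_name = nfs
nfs_shares_config = /etc/cinder/nfs_shares
nfs_mount_point_base = $state_path/mnt
EOF

echo "Review $FRAGMENT, merge it into cinder.conf, then restart the cinder-volume service."
```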
Section 2 Checkpoint
Summary:
- Cinder: Manages block storage (Creating/Attaching volumes).
- Backends: Connects to LVM (Local), NFS (File), or Ceph (Distributed).
- Ceph: The Gold Standard for OpenStack. Self-healing, scalable, compliant.
Reflection:
- Why is Ceph preferred over NFS for large clouds? (Hint: Single Point of Failure).
- What happens to a generic "File" on an NFS share when Cinder creates a volume? (It becomes a .qcow2 or .raw disk image).
3. Data Safety Strategies
3.1 Snapshots (The Time Machine)
- A Snapshot is a point-in-time copy of a specific volume, typically implemented with a Copy-on-Write mechanism (some backends use Redirect-on-Write instead).
- This technique ensures that the snapshot is created nearly instantly, as it relies on the existing data blocks rather than duplicating the entire drive volume.
- Snapshots are invaluable for functional recovery scenarios, such as capturing the state of a database before a major upgrade; if the upgrade fails, the administrator can roll back quickly.
- However, it is critical to note that snapshots typically reside on the same physical hardware as the source volume.
- Therefore, if the underlying storage array experiences a catastrophic failure, both the active volume and its snapshots will be lost.
3.2 Backups (The Disaster Plan)
To mitigate the risk of physical hardware failure, Backups provide a complete disaster recovery solution.
Figure 3: Block vs Object Storage - Cinder Backups move data from expensive, fast Block Storage to cheap, durable Object Storage (Swift/S3)
- A backup involves reading the full content of a block volume and transferring it to a separate, physically isolated system—typically an Object Storage service like Swift or Amazon S3.
- Although this process is slower due to network transfer requirements, it ensures data survivability.
- If the primary SAN or Ceph cluster were to be destroyed by fire or malfunction, the data could still be restored from the backup repository located in a different rack or data center.
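Assuming the cinder-backup service is deployed and pointed at an object store, taking a backup could look like the sketch below. The volume and backup names are hypothetical, and the `run` helper only prints commands unless `APPLY=1` is set.

```bash
# Dry-run helper (an assumption of this sketch): prints each command and
# executes it only when APPLY=1 is set in the environment.
run() { echo "+ $*"; if [ "${APPLY:-0}" = "1" ]; then "$@"; fi; }

VOLUME="nebula-db-data"            # hypothetical volume name
BACKUP_NAME="nebula-db-data-full"  # hypothetical backup name

# Copy the full volume contents to the backup repository.
# An attached ("in-use") volume additionally needs --force.
run openstack volume backup create --name "$BACKUP_NAME" "$VOLUME"

# Check progress; the backup is usable once its status is "available".
run openstack volume backup list
```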
3.3 Architecting Redundancy: Public vs. Private
- In a Public Cloud (AWS/Azure), achieving higher redundancy is often as simple as selecting a premium tier in a dropdown menu.
- However, in a Private Cloud environment using OpenStack, you are the architect responsible for building these layers yourself.
- Understanding how standard cloud redundancy levels map to OpenStack implementation is crucial for designing robust infrastructure.
- Level 1: Local Redundancy (LRS)
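The foundational level of data safety is Local Redundancy, known as LRS in Azure. AWS has no tier by that name, but EBS offers a comparable guarantee by replicating each volume within a single Availability Zone.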
- This concept ensures that data survives disk failures within a single rack or datacenter.
- In an OpenStack private cloud, this is achieved natively through Ceph.
- By default, Ceph creates a "Replica=3" pool, which automatically stores three copies of every object on different physical Object Storage Daemons (OSDs).
- If a physical drive fails, the system self-heals by replicating the data from the surviving copies to a new drive, mirroring the durability guarantees of public cloud LRS.
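A sketch of how such a replicated pool might be created on the Ceph side, together with the capacity cost of keeping three copies. The pool name `volumes`, the placement-group count, and the 30 TB raw capacity are example values chosen for this sketch. The `run` helper only prints commands unless `APPLY=1` is set.

```bash
# Dry-run helper (an assumption of this sketch): prints each command and
# executes it only when APPLY=1 is set in the environment.
run() { echo "+ $*"; if [ "${APPLY:-0}" = "1" ]; then "$@"; fi; }

POOL="volumes"   # assumed pool name
REPLICAS=3

# Create the pool (128 placement groups is an example value) and set replication.
run ceph osd pool create "$POOL" 128
run ceph osd pool set "$POOL" size "$REPLICAS"
run ceph osd pool set "$POOL" min_size 2   # keep serving I/O with one copy missing
run rbd pool init "$POOL"

# Three copies of everything means usable capacity is one third of raw capacity.
RAW_TB=30
USABLE_TB=$(( RAW_TB / REPLICAS ))
echo "${RAW_TB} TB raw with ${REPLICAS} replicas gives about ${USABLE_TB} TB usable"
```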
- Level 2: Zonal Redundancy (ZRS)
The next tier is Zonal Redundancy (ZRS), designed to ensure data survives the total destruction of a building due to fire or power loss.
- Public clouds implement this by replicating data across distinct Availability Zones—separate facilities with independent power and cooling.
- In OpenStack, you replicate this architecture using Cinder Availability Zones.
- By modifying cinder.conf, an administrator can define logical zones (e.g., zone-A, zone-B) and map them to specific storage racks or entirely separate Ceph clusters.
- This requires users to consciously select a zone when provisioning a volume, ensuring their application architecture can withstand a facility-level failure.
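The zone mapping described above might be sketched as follows: a cinder.conf fragment that assigns two backends to different zones, followed by a volume request that names a zone. The backend names `ceph-a` and `ceph-b` are hypothetical, and the driver settings for each backend are omitted. The `run` helper only prints commands unless `APPLY=1` is set.

```bash
# Dry-run helper (an assumption of this sketch): prints each command and
# executes it only when APPLY=1 is set in the environment.
run() { echo "+ $*"; if [ "${APPLY:-0}" = "1" ]; then "$@"; fi; }

# Scratch copy of the fragment; merge into /etc/cinder/cinder.conf by hand.
FRAGMENT="$(mktemp)"
cat > "$FRAGMENT" <<'EOF'
[DEFAULT]
enabled_backends = ceph-a,ceph-b

[ceph-a]
volume_backend_name = ceph-a
backend_availability_zone = zone-A

[ceph-b]
volume_backend_name = ceph-b
backend_availability_zone = zone-B
EOF

# Users list the available zones, then pick one explicitly when provisioning.
run openstack availability zone list --volume
run openstack volume create --availability-zone zone-A --size 10 app-data-a
```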
- Level 3: Geo Redundancy (GRS)
The highest level of protection is Geo Redundancy (GRS), which ensures survival against regional catastrophes, such as major natural disasters.
- While public clouds handle this via asynchronous replication between regions (e.g., North Europe to West Europe), a private cloud architect typically implements this using Cinder Backup.
- By configuring Cinder to send volume backups to a remote Swift Object Storage cluster located in a different city, you guarantee that critical data can be restored even if the primary datacenter is lost.
- More advanced (and expensive) setups can also utilize driver-level volume replication for real-time Active/Passive disaster recovery.
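Pointing Cinder Backup at a remote Swift cluster is mostly a configuration matter. The fragment below is a sketch only: the Swift endpoint URL is a hypothetical placeholder for the remote site, and the container name is an example value.

```bash
# Scratch copy of the fragment; merge into /etc/cinder/cinder.conf by hand.
FRAGMENT="$(mktemp)"

cat > "$FRAGMENT" <<'EOF'
[DEFAULT]
backup_driver = cinder.backup.drivers.swift.SwiftBackupDriver
# Hypothetical endpoint of the Swift cluster in the remote city.
backup_swift_url = https://swift.dr-site.example.com/v1/AUTH_
backup_swift_container = volumebackups
EOF

echo "Review $FRAGMENT, merge it into cinder.conf, then restart the cinder-backup service."
```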
Section 3 Checkpoint
Summary:
- Snapshot: Quick, local, dependent on source. Use for "Undo" before changes.
- Backup: Slow, remote, independent. Use for Disaster Recovery (Fire/Flood).
- Redundancy: LRS (Disk Fail), ZRS (Rack/Building Fail), GRS (City Fail).
Reflection:
- Why isn't a Snapshot considered a true Backup?
- Which Cinder feature would you use to protect against a data center power outage? (Cinder Backup / Replication).
4. Operations Cookbook (Nebula Inc.)
4.1 Creating a Volume
We begin by provisioning a persistent volume for our database. This is analogous to purchasing a physical hard drive; initially, it exists as an unattached resource in the storage inventory, with status "available".
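A sketch of the creation step; the volume name and the 10 GB size are hypothetical values for the Nebula Inc. scenario. The `run` helper is a convenience of this sketch and only prints commands unless `APPLY=1` is set.

```bash
# Dry-run helper (an assumption of this sketch): prints each command and
# executes it only when APPLY=1 is set in the environment.
run() { echo "+ $*"; if [ "${APPLY:-0}" = "1" ]; then "$@"; fi; }

VOLUME="nebula-db-data"   # hypothetical volume name
SIZE_GB=10                # example size

# Provision the volume, then confirm it reached the "available" state.
run openstack volume create --size "$SIZE_GB" "$VOLUME"
run openstack volume show -c status -f value "$VOLUME"
```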
4.2 Attaching the Volume
Once created, the volume must be attached to the compute instance. Attaching is the virtual equivalent of plugging a USB drive or SAS cable into a running server.
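A sketch of the attach step; the server name `nebula-db-01` and the volume name are hypothetical. The `run` helper only prints commands unless `APPLY=1` is set.

```bash
# Dry-run helper (an assumption of this sketch): prints each command and
# executes it only when APPLY=1 is set in the environment.
run() { echo "+ $*"; if [ "${APPLY:-0}" = "1" ]; then "$@"; fi; }

SERVER="nebula-db-01"     # hypothetical instance name
VOLUME="nebula-db-data"   # hypothetical volume name

# Attach the volume to the running instance.
run openstack server add volume "$SERVER" "$VOLUME"
```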
- Verification:
```bash
openstack volume list
# Status should be "in-use"
```
4.3 Formatting and Mounting (Guest OS)
It is important to remember that Cinder delivers a raw block device (e.g., /dev/vdb) without any file system. The administrator must log into the Guest OS to format the disk and mount it for use, effectively transferring responsibility from the "Cloud Provider" to the "OS Administrator".
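Inside the guest, the format-and-mount sequence might look like the sketch below. The device name follows the `/dev/vdb` example above and should be confirmed with `lsblk` first; the ext4 file system and the `/mnt/data` mount point are choices made for this sketch. Because `mkfs` destroys existing data, the `run` helper only prints commands unless `APPLY=1` is set.

```bash
# Dry-run helper (an assumption of this sketch): prints each command and
# executes it only when APPLY=1 is set in the environment.
run() { echo "+ $*"; if [ "${APPLY:-0}" = "1" ]; then "$@"; fi; }

DEVICE="/dev/vdb"         # confirm with lsblk before formatting
MOUNT_POINT="/mnt/data"   # hypothetical mount point

run lsblk                                # identify the new, empty block device
run sudo mkfs.ext4 "$DEVICE"             # create a file system (erases the device)
run sudo mkdir -p "$MOUNT_POINT"
run sudo mount "$DEVICE" "$MOUNT_POINT"
run df -h "$MOUNT_POINT"                 # confirm the mounted capacity
```

A mount made this way does not survive a reboot; a matching `/etc/fstab` entry is needed for that.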
4.4 Snapshotting
Before executing any destructive changes or updates to the database, standard procedure dictates creating a snapshot to preserve the current state.
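A sketch of the snapshot step; the volume and snapshot names are hypothetical. The `run` helper only prints commands unless `APPLY=1` is set.

```bash
# Dry-run helper (an assumption of this sketch): prints each command and
# executes it only when APPLY=1 is set in the environment.
run() { echo "+ $*"; if [ "${APPLY:-0}" = "1" ]; then "$@"; fi; }

VOLUME="nebula-db-data"         # hypothetical volume name
SNAPSHOT="nebula-db-pre-upgrade"  # hypothetical snapshot name

# --force is needed because the volume is attached ("in-use").
run openstack volume snapshot create --volume "$VOLUME" --force "$SNAPSHOT"
run openstack volume snapshot list
```

For a consistent database snapshot, it is generally safer to flush or briefly pause writes in the guest before taking it.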
Section 4 Checkpoint
Summary:
- Process: Create Volume -> Attach to VM -> Format (mkfs) -> Mount.
- Guest Responsibility: The Cloud Provider connects the wire; YOU must format the disk.
- Persistence: Data survives detach/reattach and VM deletion.
Reflection:
- Why doesn't the volume show up automatically in /mnt when you attach it?
- What Linux command lists all block devices? (lsblk).
5. Industry Comparison: Storage
7. Summary and Next Steps
Preparing for Week 11
Next week, we stop clicking buttons manually. We will introduce Automation and APIs. We will learn how to deploy this entire infrastructure using code (Bash/Python) and configuration scripts (Cloud-Init), moving from "Pets" to "Cattle" effectively.
Checklist:
- Can you differentiate between nova-compute storage and cinder-volume storage?
- Do you understand why we need to format a volume after attaching it?
- Review your Linux command line skills (loops and variables) for next week.
8. Additional Resources
9. Lab Exercises
Summary
Review the key concepts covered in this week's material
Questions?