Week 4
Storage and Backup
OPS3 - Virtualization and Cloud Infrastructure
Welcome to Week 4!
What You'll Learn This Week
1. Block Devices and Partitions
1.1 Examining Storage
- One of the most powerful diagnostic tools in Linux is lsblk (List Block Devices), which provides a visual tree representation of all connected storage devices.
- When you run this command in the terminal, it displays a hierarchical view that shows physical disks, their partitions, and any logical volumes built on top of them.
- This tree structure makes it immediately clear which partitions belong to which disks and how storage is organized across the system.
- For administrators managing Proxmox servers, lsblk is indispensable for quickly understanding storage topology without needing to parse complex configuration files.
- The diagram below illustrates how Linux represents different types of storage devices and their partition schemes:
Figure 1: Linux Block Devices and Partitions - How the kernel represents different storage types (SATA, NVMe, and their partitions)
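The tree that lsblk prints can also be requested as JSON with `lsblk --json`, which is easier to process in scripts. The sketch below walks a small sample of that JSON and rebuilds the indented tree. The sample data (device names and sizes) is hypothetical, and real output carries more fields than shown here.

```python
import json

# Hypothetical, trimmed-down sample of `lsblk --json` output.
SAMPLE = """
{
  "blockdevices": [
    {"name": "sda", "size": "100G", "type": "disk", "children": [
      {"name": "sda1", "size": "1G", "type": "part"},
      {"name": "sda2", "size": "99G", "type": "part", "children": [
        {"name": "pve-root", "size": "30G", "type": "lvm"}
      ]}
    ]},
    {"name": "nvme0n1", "size": "500G", "type": "disk", "children": [
      {"name": "nvme0n1p1", "size": "500G", "type": "part"}
    ]}
  ]
}
"""


def render_tree(devices, depth=0):
    """Return one indented line per device, with children nested under parents."""
    lines = []
    for dev in devices:
        lines.append(f"{'  ' * depth}{dev['name']} ({dev['type']}, {dev['size']})")
        lines.extend(render_tree(dev.get("children", []), depth + 1))
    return lines


tree_lines = render_tree(json.loads(SAMPLE)["blockdevices"])
print("\n".join(tree_lines))
```

On a real host, the sample string could be replaced with the output of `subprocess.run(["lsblk", "--json"], capture_output=True, text=True).stdout`.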
1.2 Managing Partitions (fdisk)
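To divide a disk into partitions, we use fdisk or parted. A partition only becomes usable storage once it has been formatted with a filesystem, for example with mkfs.ext4.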
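As a rough illustration, the sketch below builds (but does not run) a possible command sequence for partitioning and formatting an empty disk with parted and mkfs.ext4. The disk path /dev/sdb, the single-partition layout, and the helper names are assumptions for the example. The commands are only printed because running them against the wrong disk destroys data.

```python
import shlex


def partition_name(disk: str, number: int) -> str:
    """Kernel naming: a disk name ending in a digit gets a 'p' before the number."""
    separator = "p" if disk[-1].isdigit() else ""
    return f"{disk}{separator}{number}"


def partition_plan(disk: str) -> list:
    """Dry-run plan: GPT label, one partition spanning the disk, ext4 on top."""
    return [
        ["parted", "--script", disk, "mklabel", "gpt"],
        # On GPT, the word after mkpart ("primary") is used as the partition name.
        ["parted", "--script", disk, "mkpart", "primary", "ext4", "1MiB", "100%"],
        ["mkfs.ext4", partition_name(disk, 1)],
    ]


plan = partition_plan("/dev/sdb")  # hypothetical empty disk
for argv in plan:
    print(shlex.join(argv))
```

The `partition_name` helper also shows why the first partition of /dev/sdb is /dev/sdb1 while the first partition of /dev/nvme0n1 is /dev/nvme0n1p1.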
Section 1 Checkpoint
- Summary: Linux treats disks as Block Devices (such as /dev/sda), and the lsblk command visualizes the storage hierarchy in a tree format while fdisk is used to create partitions.
- It is important to understand that Proxmox needs the underlying operating system to recognize and manage the disk before it can use it for virtual machine storage.
Reflection: Consider why NVMe drives have names like nvme0n1 instead of sda. What happens if you attempt to partition a disk that is already mounted and actively in use?
Resources:
2. Logical Volume Manager (LVM)
2.1 The LVM Hierarchy
The three-tier architecture of LVM provides the flexibility that traditional partitions lack. As shown in the diagram below, the hierarchy flows from physical disks to virtual volumes:
Figure 2: LVM Three-Tier Hierarchy - Physical Volumes (PV) combine into Volume Groups (VG), which are divided into Logical Volumes (LV)
- The hierarchy consists of three layers.
- At the foundation is the Physical Volume (PV), which represents the actual disk or partition (for example, /dev/sdb).
- These physical volumes are then combined into a Volume Group (VG), which acts as a unified storage pool—for instance, a data_pool might aggregate multiple drives to provide 500GB of total capacity.
- Finally, Logical Volumes (LV) are carved out from the volume group and allocated for specific uses, such as vm-100-disk for a virtual machine.
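The three layers can be modelled in a few lines of Python. This is a simplified sketch, not how LVM is implemented: sizes are tracked in whole GiB, whereas real LVM allocates space in physical extents (4 MiB by default). The device paths and sizes are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class VolumeGroup:
    """A storage pool built from physical volumes and carved into logical volumes."""
    name: str
    physical_volumes: dict = field(default_factory=dict)  # PV path -> size in GiB
    logical_volumes: dict = field(default_factory=dict)   # LV name -> size in GiB

    def add_pv(self, device: str, size_gib: int) -> None:
        self.physical_volumes[device] = size_gib

    @property
    def capacity(self) -> int:
        return sum(self.physical_volumes.values())

    @property
    def free(self) -> int:
        return self.capacity - sum(self.logical_volumes.values())

    def create_lv(self, name: str, size_gib: int) -> None:
        if size_gib > self.free:
            raise ValueError(f"only {self.free} GiB free in {self.name}")
        self.logical_volumes[name] = size_gib

    def extend_lv(self, name: str, extra_gib: int) -> None:
        if extra_gib > self.free:
            raise ValueError(f"only {self.free} GiB free in {self.name}")
        self.logical_volumes[name] += extra_gib


pool = VolumeGroup("data_pool")
pool.add_pv("/dev/sdb", 250)           # hypothetical disks
pool.add_pv("/dev/sdc", 250)
pool.create_lv("vm-100-disk", 300)     # larger than either disk on its own
pool.extend_lv("vm-100-disk", 50)      # grow later without repartitioning
print(pool.capacity, pool.free)
```

The 300 GiB volume could not exist on either 250 GiB disk alone, which is the pooling flexibility that static partitions lack.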
2.2 Hands-On LVM Commands
Proxmox uses LVM extensively. Here is how you manage it manually.
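One possible manual workflow, shown as a dry run: the sketch prints the LVM commands for each layer without executing them. The device /dev/sdb, the names data_pool and vm-100-disk, and the sizes are assumptions taken from the examples in this section, not values from a real system.

```python
import shlex


def lvm_commands(pv: str, vg: str, lv: str, size: str, grow_by: str) -> list:
    """Return the argument lists for a PV -> VG -> LV build-out, then a resize."""
    return [
        ["pvcreate", pv],                          # mark the disk as a physical volume
        ["vgcreate", vg, pv],                      # create the pool from that PV
        ["lvcreate", "-L", size, "-n", lv, vg],    # carve a logical volume from the pool
        ["lvextend", "-L", f"+{grow_by}", f"{vg}/{lv}"],  # grow the volume later
        ["lvs", vg],                               # list logical volumes in the group
    ]


cmds = lvm_commands("/dev/sdb", "data_pool", "vm-100-disk", "20G", "10G")
for argv in cmds:
    print(shlex.join(argv))
```

The commands require root and modify disks, so they should be checked against the right device (for instance with lsblk) before being run by hand.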
Section 2 Checkpoint
Summary: LVM adds significant flexibility over static partitions, enabling dynamic resizing and storage pooling. The architectural flow moves from PV (Physical Volume) to VG (Volume Group) to LV (Logical Volume), and Proxmox installs to LVM by default to take advantage of these capabilities.
Reflection: Can you shrink an LVM volume while it is online and actively in use? What is the difference between standard LVM and LVM-Thin provisioning?
Resources:
- Red Hat LVM Administration
3. ZFS: The Enterprise Standard (New Material)
3.1 Why ZFS?
- Copy-on-Write (CoW) is one of ZFS's foundational design principles.
- When you edit a file, ZFS does not overwrite the old data in place.
- Instead, it writes the new data to a fresh block on the disk and then updates the pointer to reference the new location.
- The benefit of this approach is profound: if power fails during a write operation, the old data remains valid and intact.
- There is no corruption because the original block is never destroyed until the write is confirmed to be successful.
The illustration below compares traditional write operations (which overwrite data in place) versus ZFS's Copy-on-Write approach:
Figure 3: ZFS Copy-on-Write (CoW) - Traditional filesystems overwrite data in place; ZFS writes to new blocks and updates pointers
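The difference can be sketched with two toy stores that each hold a single block. This is a conceptual model only. The class names are made up for the example, and a "crash" is simulated by stopping halfway through the write.

```python
class InPlaceStore:
    """Overwrites the block where it lives; a crash mid-write leaves mixed data."""

    def __init__(self, data: bytes):
        self.block = data

    def write(self, new: bytes, crash_midway: bool = False) -> None:
        if crash_midway:
            half = len(new) // 2
            self.block = new[:half] + self.block[half:]  # half new, half old
            return
        self.block = new

    def read(self) -> bytes:
        return self.block


class CowStore:
    """Writes to a fresh block, then switches the pointer as the final step."""

    def __init__(self, data: bytes):
        self.blocks = [data]
        self.pointer = 0

    def write(self, new: bytes, crash_midway: bool = False) -> None:
        if crash_midway:
            self.blocks.append(new[: len(new) // 2])  # partial block, never referenced
            return
        self.blocks.append(new)
        self.pointer = len(self.blocks) - 1           # pointer moves only after success

    def read(self) -> bytes:
        return self.blocks[self.pointer]


old, new = b"AAAAAAAA", b"BBBBBBBB"
in_place, cow = InPlaceStore(old), CowStore(old)
in_place.write(new, crash_midway=True)
cow.write(new, crash_midway=True)
print(in_place.read(), cow.read())
```

After the simulated power failure the in-place store holds a mix of old and new bytes, while the Copy-on-Write store still returns the old block unchanged.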
- Self-Healing is another critical feature of ZFS.
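- The filesystem stores a checksum (a digital fingerprint) for every block of data. The default algorithm is fletcher4, which is fast but not cryptographic; stronger hashes such as SHA-256 can be selected instead.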
- If a cosmic ray flips a bit on your drive—an event known as bit rot—ZFS detects the mismatch between the data and its checksum.
- If redundancy exists (such as in a mirrored or RAID-Z configuration), ZFS automatically repairs the corrupted block by restoring it from a valid copy.
The self-healing process is visualized below, showing how ZFS detects, validates, and repairs corrupted data blocks:
Figure 4: ZFS Self-Healing - Checksums detect corrupted blocks, which are automatically repaired from redundant copies
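The detect-and-repair cycle can be sketched with a toy two-way mirror. This is a simplified model, not ZFS's on-disk layout: SHA-256 from Python's hashlib is used here for convenience, and the Mirror class and its fields are hypothetical.

```python
import hashlib


def checksum(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


class Mirror:
    """Keeps identical copies of each block plus one checksum per block."""

    def __init__(self, copies: int = 2):
        self.copies = [dict() for _ in range(copies)]  # block id -> bytes
        self.checksums = {}                            # block id -> expected checksum
        self.repairs = 0

    def write(self, block_id: int, data: bytes) -> None:
        self.checksums[block_id] = checksum(data)
        for copy in self.copies:
            copy[block_id] = data

    def read(self, block_id: int) -> bytes:
        expected = self.checksums[block_id]
        good = next(
            (c[block_id] for c in self.copies if checksum(c[block_id]) == expected),
            None,
        )
        if good is None:
            raise IOError("every copy failed its checksum; nothing to repair from")
        for copy in self.copies:
            if checksum(copy[block_id]) != expected:
                copy[block_id] = good   # overwrite the bad copy with a valid one
                self.repairs += 1
        return good


mirror = Mirror()
original = b"important VM data"
mirror.write(0, original)

# Simulate bit rot: flip one bit in the first copy only.
mirror.copies[0][0] = bytes([original[0] ^ 0x01]) + original[1:]

recovered = mirror.read(0)
print(recovered, mirror.repairs)
```

Without the second copy the corruption would still be detected, but the read would fail instead of being repaired, which is why redundancy matters.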
3.2 Basic ZFS Commands
Proxmox installs ZFS tools by default.
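A few commonly used commands, shown as a dry run that prints them rather than executing them. The pool name tank, the dataset name vmdata, and the two disks are assumptions for illustration; `zpool create` erases the disks it is given.

```python
import shlex

POOL = "tank"                       # hypothetical pool name
DISKS = ["/dev/sdb", "/dev/sdc"]    # hypothetical, unused disks

zfs_steps = [
    ("Create a mirrored pool", ["zpool", "create", POOL, "mirror", *DISKS]),
    ("Check pool health", ["zpool", "status", POOL]),
    ("Create a dataset in the pool", ["zfs", "create", f"{POOL}/vmdata"]),
    ("List datasets and space usage", ["zfs", "list"]),
    ("Read and verify every block", ["zpool", "scrub", POOL]),
]

for description, argv in zfs_steps:
    print(f"# {description}")
    print(shlex.join(argv))
```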
3.3 The Power of Instant Snapshots
- The Copy-on-Write architecture unlocks one of ZFS's most remarkable capabilities: instantaneous snapshots.
- Unlike traditional backup systems that must copy gigabytes or terabytes of data (a process that can take hours), a ZFS snapshot is merely a metadata operation—a lightweight bookmark that marks the current state of the filesystem.
- When you create a snapshot, ZFS doesn't duplicate any data blocks; it simply freezes a reference point in time.
- The snapshot consumes zero disk space initially because it shares all its data blocks with the current filesystem.
- Only when data begins to change does the snapshot start consuming space, as ZFS preserves the old blocks that the snapshot references while writing new data to fresh locations.
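The space accounting described above can be sketched with a toy dataset in which a snapshot is just a frozen copy of the pointer map. This is a conceptual model with made-up names, not ZFS's actual data structures.

```python
class Dataset:
    """Blocks are never modified; files and snapshots only hold pointers to them."""

    def __init__(self):
        self.blocks = {}      # block id -> bytes
        self.live = {}        # file name -> block id (current state)
        self.snapshots = {}   # snapshot name -> frozen copy of the pointer map
        self._next_id = 0

    def write(self, name: str, data: bytes) -> None:
        self.blocks[self._next_id] = data   # new data always goes to a fresh block
        self.live[name] = self._next_id
        self._next_id += 1
        self._free_unreferenced()

    def snapshot(self, snap_name: str) -> None:
        self.snapshots[snap_name] = dict(self.live)  # copies pointers, not data

    def read_snapshot(self, snap_name: str, name: str) -> bytes:
        return self.blocks[self.snapshots[snap_name][name]]

    def snapshot_only_bytes(self, snap_name: str) -> int:
        """Space held only because the snapshot still references it."""
        only = set(self.snapshots[snap_name].values()) - set(self.live.values())
        return sum(len(self.blocks[b]) for b in only)

    def _free_unreferenced(self) -> None:
        referenced = set(self.live.values())
        for pointers in self.snapshots.values():
            referenced.update(pointers.values())
        for block_id in list(self.blocks):
            if block_id not in referenced:
                del self.blocks[block_id]


ds = Dataset()
ds.write("disk.img", b"v1" * 512)
ds.snapshot("monday")
space_at_creation = ds.snapshot_only_bytes("monday")

ds.write("disk.img", b"v2" * 512)   # old block survives because the snapshot needs it
space_after_change = ds.snapshot_only_bytes("monday")
print(space_at_creation, space_after_change)
```

In this model the snapshot costs nothing until the live data diverges, after which it accounts for exactly the old blocks it is keeping alive.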
Section 3 Checkpoint
- Summary: ZFS is a next-generation filesystem with RAID, Copy-on-Write, and checksumming built directly into its architecture.
- Its self-healing capability detects and fixes silent data corruption (commonly known as bit rot), while snapshots are instantaneous and consume zero space initially due to the Copy-on-Write mechanism.
Reflection: Why does ZFS need direct access to the disk (passthrough) rather than working through a traditional RAID controller? What is the "ARC" in ZFS terms, and how does it improve performance?
Resources:
4. Virtual Disk Formats
4.1 Raw (.raw)
- The Raw disk format is effectively a bit-for-bit representation of a hard drive without any additional metadata or container structure.
- Because it has no translation layer, guest reads and writes map directly to offsets in the image file or block device, making it the most performant option available.
- However, this simplicity comes at a cost: a fully pre-allocated 100GB Raw disk consumes 100GB of physical space immediately (a sparse file on a filesystem that supports holes avoids this), and the format does not support advanced features like internal snapshots.
- If you require snapshot capabilities with Raw disks, you must rely on the underlying storage system, such as LVM-Thin or ZFS, to handle them.
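The gap between a raw file's apparent size and the space it actually occupies can be observed with a sparse file. The sketch below assumes a Linux or similar POSIX system (it relies on `st_blocks`) and uses a 100 MiB file as a small stand-in for the 100GB disk. The file name is hypothetical.

```python
import os
import tempfile

SIZE = 100 * 1024 * 1024  # 100 MiB stand-in for a much larger raw disk

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "vm-disk.raw")  # hypothetical image name
    with open(path, "wb") as f:
        f.truncate(SIZE)  # set the file length without writing any data

    st = os.stat(path)
    apparent = st.st_size            # what the guest would see as disk size
    allocated = st.st_blocks * 512   # st_blocks is counted in 512-byte units

print(f"apparent: {apparent} bytes, allocated: {allocated} bytes")
```

On a filesystem that supports sparse files, the allocated figure is typically far smaller than the apparent size until data is written. On one that does not, the two are roughly equal.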
4.2 QCOW2 (QEMU Copy On Write)
- QCOW2 is a feature-rich format designed specifically for the QEMU emulator.
- Unlike Raw, it acts as an intelligent container that creates a layer of abstraction between the VM and the physical disk.
- This allows for powerful features such as internal snapshots, transparent compression, and encryption directly within the file itself.
- While this abstraction layer introduces a minor performance overhead compared to Raw, the flexibility it offers—particularly the ability to grow the disk file dynamically as data is added—makes it the standard choice for file-based storage backends like NFS or local directories.
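Grow-on-demand allocation can be sketched as a mapping from cluster index to data, where a cluster is only allocated on first write. This is a conceptual model, not the QCOW2 file layout. The 64 KiB cluster size matches QCOW2's default, and the class name is made up for the example.

```python
CLUSTER = 64 * 1024  # 64 KiB, the default QCOW2 cluster size


class ThinDisk:
    """Presents a large virtual disk but stores only clusters that were written."""

    def __init__(self, virtual_size: int):
        self.virtual_size = virtual_size
        self.clusters = {}  # cluster index -> bytearray, created on first write

    @property
    def allocated(self) -> int:
        return len(self.clusters) * CLUSTER

    def write(self, offset: int, data: bytes) -> None:
        if offset < 0 or offset + len(data) > self.virtual_size:
            raise ValueError("write outside the virtual disk")
        pos = 0
        while pos < len(data):
            index, start = divmod(offset + pos, CLUSTER)
            chunk = data[pos:pos + CLUSTER - start]   # stop at the cluster boundary
            cluster = self.clusters.setdefault(index, bytearray(CLUSTER))
            cluster[start:start + len(chunk)] = chunk
            pos += len(chunk)

    def read(self, offset: int, length: int) -> bytes:
        out = bytearray()
        while length > 0:
            index, start = divmod(offset, CLUSTER)
            n = min(length, CLUSTER - start)
            cluster = self.clusters.get(index)
            # Unallocated clusters read back as zeros.
            out += cluster[start:start + n] if cluster is not None else bytes(n)
            offset += n
            length -= n
        return bytes(out)


GIB = 1024 ** 3
disk = ThinDisk(100 * GIB)
size_when_new = disk.allocated
disk.write(10 * GIB, b"boot")
print(size_when_new, disk.allocated, disk.read(10 * GIB, 4))
```

The extra lookup from virtual offset to cluster is also a fair picture of where the small performance overhead relative to Raw comes from.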
4.3 Summary Comparison
The visual comparison below highlights the key differences between Raw and QCOW2 disk formats:
Figure 5: Virtual Disk Formats - Raw disks offer maximum performance while QCOW2 provides flexibility with snapshots and thin provisioning
| Feature | Raw (.raw) | QCOW2 |
| --- | --- | --- |
| Performance | Highest (Near Native) | High (Slight Overhead) |
| Space Usage | Fixed (Pre-allocated) | Dynamic (Grow on demand) |
| Snapshots | Requires ZFS/LVM support | Built-in (Internal) |
| Portability | Universal (Byte stream) | QEMU Specific |
Section 4 Checkpoint
Summary:
- Raw: Fast, pre-allocated, simple. Good for Ceph/LVM.
- QCOW2: Flexible, thin-provisioned, internal snapshots. Good for Directory/NFS.
- Trade-off is usually Performance vs Flexibility.
Reflection:
- Why can't you take an internal snapshot on a Raw disk?
- How does "sparse provisioning" differ from "thin provisioning"?
Resources:
8. Additional Resources
9. Lab Exercises
Summary
Review the key concepts covered in this week's material
Questions?