Linux RAID 0, 1, 5 Config Guide with mdadm

RAID (Redundant Array of Independent Disks) protects data and improves performance by distributing it across multiple physical disks. Linux implements software RAID through the md (multiple device) kernel subsystem, managed by the mdadm utility. Understanding RAID at the conceptual level — not just the commands — is essential for RHCA and for designing resilient storage architectures.

The RAID Concept: Three Techniques

All RAID levels are built from combinations of three fundamental techniques:

Striping (RAID 0)

Data is split into chunks and written to multiple disks simultaneously. If the data is "ABCDEF", disk 1 gets ACE and disk 2 gets BDF. Both disks are written at the same time, giving roughly double the write speed. Read speed also doubles since data is read from both disks in parallel.

Critical weakness: There is zero redundancy. If any single disk fails, ALL data is lost, including the data on the surviving disks, because any file larger than one chunk is spread across every member and each disk holds only part of it. RAID 0 actually increases the probability of data loss compared to a single disk: with two disks, the chance that at least one of them fails is roughly double.

Use case: Temporary scratch space, video editing workstations, gaming, any workload where maximum performance matters and data loss is acceptable.

Mirroring (RAID 1)

Every write is made to two (or more) disks simultaneously. Both disks are identical copies of each other at all times. If one disk fails, the system continues operating from the surviving mirror and the failed disk can be hot-swapped and rebuilt.

Overhead: 50% — you need 2 disks but only get the usable capacity of 1. Write performance is limited by the slower of the two disks.

Use case: OS disks, critical database transaction logs, any data that cannot afford downtime.

Parity (Used in RAID 5, 6)

Parity is a mathematical relationship between data blocks that allows reconstruction of missing data. For every N data blocks, one parity block is calculated using XOR. If any one block is lost, it can be reconstructed from the remaining blocks and parity.

Example with 3 disks, 1 parity: Data blocks A=1100, B=1010. Parity = A XOR B = 0110. If disk A fails, A = Parity XOR B = 0110 XOR 1010 = 1100 (original value recovered).
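
The XOR arithmetic above can be checked with a few lines of shell. This is only a minimal sketch: xor_blocks is a hypothetical helper name (not an mdadm command), and it assumes both arguments are binary strings of equal length.

```shell
# xor_blocks is a hypothetical helper for this sketch, not an mdadm command.
# It XORs two equal-length binary strings digit by digit.
xor_blocks() {
    xb_a=$1
    xb_b=$2
    xb_out=""
    while [ -n "$xb_a" ]; do
        xb_rest_a=${xb_a#?}                 # everything after the first digit
        xb_rest_b=${xb_b#?}
        xb_bit_a=${xb_a%"$xb_rest_a"}       # the first digit itself
        xb_bit_b=${xb_b%"$xb_rest_b"}
        xb_out="$xb_out$(( $xb_bit_a ^ $xb_bit_b ))"
        xb_a=$xb_rest_a
        xb_b=$xb_rest_b
    done
    echo "$xb_out"
}

parity=$(xor_blocks 1100 1010)          # parity of A and B
recovered=$(xor_blocks "$parity" 1010)  # rebuild A from parity and B
echo "parity=$parity recovered_A=$recovered"
```

The same function serves for both directions because XOR is its own inverse: applying it to the parity and the surviving block returns the lost block.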

Overhead: one disk's worth of capacity for RAID 5 (two for RAID 6), regardless of array size. A 10-disk RAID 5 gives up only 10% of its raw capacity, whereas a 3-disk RAID 5 gives up 33%.

RAID Levels — Complete Reference

Level	Technique	Min Disks	Fault Tolerance	Usable Capacity	Read Perf	Write Perf
RAID 0	Striping	2	None	100% of all disks	Excellent	Excellent
RAID 1	Mirroring	2	1 disk failure	50% of total	Good	Fair
RAID 5	Stripe + Parity	3	1 disk failure	(N-1)/N of total	Good	Fair (parity calc)
RAID 6	Stripe + 2 Parity	4	2 disk failures	(N-2)/N of total	Good	Slow (2 parity calc)
RAID 10	Mirror + Stripe	4	1 disk per mirror	50% of total	Excellent	Good
RAID 0+1	Stripe + Mirror	4	One stripe can fail	50% of total	Excellent	Good
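
The Usable Capacity column can be turned into a small calculator. The sketch below is illustrative only: usable_capacity is a hypothetical function name, it assumes all disks are the same size, and it works in whole units (for example TB) because shell arithmetic is integer-only.

```shell
# usable_capacity is a hypothetical helper for this sketch.
# $1 = RAID level (0, 1, 5, 6, 10), $2 = number of disks, $3 = size of each disk
usable_capacity() {
    case "$1" in
        0)  echo $(( $2 * $3 )) ;;          # striping: all capacity is usable
        1)  echo "$3" ;;                    # mirroring: every disk is a full copy
        5)  echo $(( ($2 - 1) * $3 )) ;;    # one disk's worth of parity
        6)  echo $(( ($2 - 2) * $3 )) ;;    # two disks' worth of parity
        10) echo $(( $2 * $3 / 2 )) ;;      # mirrored pairs: half of the total
        *)  echo "unsupported level: $1" >&2; return 1 ;;
    esac
}

usable_capacity 5 3 1     # 3 x 1TB in RAID 5
usable_capacity 10 4 1    # 4 x 1TB in RAID 10
```

Both calls print 2, matching the (N-1)/N and 50% rows of the table for those disk counts.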

RAID 0 — Striping in Detail

With two 1 TB disks in RAID 0, the data layout is:

Data:    1  2  3  4  5  6
Disk 1:  1  3  5             (odd chunks)
Disk 2:  2  4  6             (even chunks)

The stripe size (chunk size) determines how large each chunk is. Default is 512 KB. Smaller stripes increase parallelism but add overhead; larger stripes reduce overhead but may reduce parallelism for small files.
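
The odd/even layout shown above is plain round-robin placement, which can be sketched in shell. stripe_layout is a hypothetical helper name; it only numbers chunks and does not model real chunk sizes or on-disk offsets.

```shell
# stripe_layout is a hypothetical helper for this sketch.
# $1 = number of disks, $2 = number of chunks to place
stripe_layout() {
    sl_disk=1
    while [ "$sl_disk" -le "$1" ]; do
        sl_line="Disk $sl_disk:"
        sl_chunk=$sl_disk                   # first chunk that lands on this disk
        while [ "$sl_chunk" -le "$2" ]; do
            sl_line="$sl_line $sl_chunk"
            sl_chunk=$(( sl_chunk + $1 ))   # next chunk for this disk is N chunks later
        done
        echo "$sl_line"
        sl_disk=$(( sl_disk + 1 ))
    done
}

stripe_layout 2 6
```

With two disks and six chunks this reproduces the table: Disk 1 receives chunks 1, 3, 5 and Disk 2 receives chunks 2, 4, 6.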

# Create RAID 0 with custom chunk size:
# mdadm -Cv /dev/md0 -n 2 /dev/sdb /dev/sdc -l 0 --chunk=512
# cat /proc/mdstat                  # verify creation
# mkfs.xfs /dev/md0
# mkdir /mnt/raid0
# mount /dev/md0 /mnt/raid0

# Monitor array creation/rebuild:
# watch cat /proc/mdstat

# RAID 0 details:
# mdadm -D /dev/md0                 # detailed info
# mdadm --query /dev/md0

# Grow RAID 0 (add third disk):
# RAID 0 cannot hold spares, so the new disk is added in the same command that raises the device count
# mdadm --grow /dev/md0 --raid-devices=3 --add /dev/sdd

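# IMPORTANT: RAID 0 has NO fault tolerance
# Simulate failure (testing only):
# mdadm /dev/md0 -f /dev/sdb        # try to mark as failed (md may reject this for RAID 0)
# mdadm /dev/md0 -r /dev/sdb        # remove (only works on a device already marked failed)
# cat /proc/mdstat                  # RAID 0 has no degraded state: a lost member breaks the whole array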

RAID 1 — Mirroring in Detail

With RAID 1, both disks always contain identical data. Reads can be served from either disk (load balancing or round-robin), but writes must complete on both disks before being acknowledged.

Data:    1  2  3  4  5  6
Disk 1:  1  2  3  4  5  6
Disk 2:  1  2  3  4  5  6   (exact copy)

# Create RAID 1:
# mdadm -Cv /dev/md0 -n 2 /dev/sdb /dev/sdc -l 1
# mkfs.ext4 /dev/md0
# mkdir /mnt/raid1
# mount /dev/md0 /mnt/raid1

# RAID 1 with hot spare (auto-rebuild if a disk fails):
# mdadm -Cv /dev/md0 -n 2 /dev/sdb /dev/sdc -l 1 --spare-devices=1 /dev/sdd

# When a disk fails:
# mdadm /dev/md0 -f /dev/sdb        # simulate failure
# cat /proc/mdstat                  # shows degraded + rebuilding from spare

# Replace failed disk:
# mdadm /dev/md0 -r /dev/sdb        # remove failed disk
# mdadm /dev/md0 -a /dev/sde        # add replacement (rebuild starts automatically)
# watch cat /proc/mdstat            # monitor rebuild progress (can take hours for large arrays)

RAID 5 — Striping with Parity

RAID 5 distributes parity information across ALL disks — no dedicated parity disk. For any stripe set, one disk holds parity and the others hold data, but the parity disk rotates across the array.

Example with 3 disks, data is 1,2,3,4,5,6:

         Stripe 1      Stripe 2      Stripe 3
Disk 1:  1             3+4(parity)   6
Disk 2:  2             3             5+6(parity)
Disk 3:  1+2(parity)   4             5

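If Disk 1 fails, value 1 in the first stripe is reconstructed from value 2 and parity(1+2). If Disk 3 fails instead, the first stripe loses only its parity block, which is recomputed from values 1 and 2 during the rebuild. Either way, each stripe is missing at most one block, and XOR can recover exactly one. With two failed disks, every stripe is missing two blocks, which a single parity block cannot restore. This is why RAID 5 can survive one disk failure but not two.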

# Create RAID 5:
# mdadm -Cv /dev/md0 -n 3 /dev/sdb /dev/sdc /dev/sdd -l 5
# cat /proc/mdstat                  # initialization may take time
# mkfs.ext4 /dev/md0
# mkdir /mnt/raid5
# mount /dev/md0 /mnt/raid5

# Capacity calculation:
# 3 × 1TB = 2TB usable (1 disk equivalent used for parity)
# 5 × 2TB = 8TB usable
# Formula: (N-1) × disk_size

# Replace failed disk:
# mdadm /dev/md0 -f /dev/sdb
# mdadm /dev/md0 -r /dev/sdb
# mdadm /dev/md0 -a /dev/sde        # add replacement

# IMPORTANT during rebuild:
# If a second disk fails DURING rebuild — ALL DATA IS LOST
# Rebuild is the most dangerous time — the array is working harder
# Keep hot spares ready for critical systems

RAID 10 — The Enterprise Choice

RAID 10 combines striping AND mirroring. Data is first mirrored (pairs of disks), then striped across mirror pairs. For 4 disks: disks 1+2 form a mirror pair, disks 3+4 form another. Data is striped across the two pairs.

# Create RAID 10:
# mdadm -Cv /dev/md0 -n 4 /dev/sdb /dev/sdc /dev/sdd /dev/sde -l 10

# Capacity: 4 × 1TB = 2TB usable (50% overhead for mirroring)
# Can survive ANY one disk failure, and potentially two failures
# if the two failures are in different mirror pairs
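
The two-failure rule can be enumerated for the 4-disk case. This is a sketch under the pairing described above (disks 1+2 and disks 3+4 are the mirror pairs); raid10_survives is a hypothetical helper name and the disk numbers are positions in the array, not device names.

```shell
# raid10_survives is a hypothetical helper for this sketch.
# $1 and $2 are the positions (1-based) of the two failed disks.
# Disks 1+2 form pair 0, disks 3+4 form pair 1, and so on.
raid10_survives() {
    rs_pair_a=$(( ($1 - 1) / 2 ))
    rs_pair_b=$(( ($2 - 1) / 2 ))
    [ "$rs_pair_a" -ne "$rs_pair_b" ]       # survives only if the pairs differ
}

# Walk through every possible two-disk failure in a 4-disk array.
for a in 1 2 3; do
    b=$(( a + 1 ))
    while [ "$b" -le 4 ]; do
        if raid10_survives "$a" "$b"; then result="survives"; else result="fails"; fi
        echo "disks $a and $b fail: array $result"
        b=$(( b + 1 ))
    done
done
```

Of the six possible combinations, four leave the array running and two (1+2 and 3+4) destroy it.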

RAID 10 vs RAID 5 — Which to Choose?

Factor	RAID 5	RAID 10
Fault tolerance	1 disk	1 per mirror pair (potentially 2)
Rebuild risk	High — any failure during rebuild = data loss	Lower — mirror provides safety
Write performance	Slower (parity calculation overhead)	Fast (simple mirroring)
Capacity efficiency	Better (only 1 disk wasted)	Worse (50% overhead always)
Recommended for	Large storage, budget-conscious	Databases, transaction logs, performance-critical

Making RAID Persistent Across Reboots

# Save RAID configuration (use the path that matches your distribution):
# mdadm --detail --scan >> /etc/mdadm.conf         # RHEL-based
# mdadm --detail --scan >> /etc/mdadm/mdadm.conf   # Debian-based

# Update initramfs (required to detect RAID at boot):
# update-initramfs -u                    # Debian
# dracut -f                              # RHEL

# Add to /etc/fstab:
/dev/md0  /mnt/raid5  ext4  defaults,nofail  0  0

# nofail option: system continues booting even if the array is missing or cannot be mounted
# (a degraded array that still assembles is mounted as usual)

RAID Monitoring and Health Checks

# Real-time RAID status:
# cat /proc/mdstat
# watch -n 1 cat /proc/mdstat       # refresh every second

# Detailed array information:
# mdadm -D /dev/md0

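# List all arrays:
# mdadm --detail --scan             # arrays that are currently assembled
# mdadm --examine --scan            # arrays found in member-disk superblocks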

# Interpret /proc/mdstat output:
# Personalities: lists loaded RAID levels (raid1, raid5, etc.)
# md0: active raid5 sdb[0] sdc[1] sdd[2]
#      3145728 blocks super 1.2 level 5, 512k chunk, algorithm 2
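#      [3/3] [UUU]    ← [configured/working]; U = up, _ = failed or missing
# UUU = all 3 disks working
# UU_ = array degraded, third disk failed or missing (shown as [3/2])
# In the device list, (F) marks a failed device and (S) a spare, e.g. sdb[0](F)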

# Enable automatic email alerts for RAID events:
# vim /etc/mdadm.conf
MAILADDR root@example.com
MAILFROM mdadm@server.example.com

# Start monitoring daemon manually:
# mdadm --monitor --daemonise /dev/md0
# Or enable the packaged monitoring service so it also starts at boot:
# systemctl enable --now mdmonitor
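
For a quick scripted health check, the status field of /proc/mdstat can be scanned for underscores. The following is a rough sketch rather than a replacement for mdadm --monitor: degraded_arrays is a hypothetical helper name, and the sample input is illustrative text in the format shown above, not output captured from a real system.

```shell
# degraded_arrays is a hypothetical helper: it reads /proc/mdstat-style text on
# stdin and prints the name of every array whose status field contains "_".
degraded_arrays() {
    awk '
        /^md[^ :]* *:/ {                      # array header, e.g. "md0 : active raid5 ..."
            name = $1
            sub(/:$/, "", name)
        }
        match($0, /\[[U_]+\]/) {              # status field, e.g. [UUU] or [UU_]
            status = substr($0, RSTART, RLENGTH)
            if (name != "" && index(status, "_") > 0) print name
        }
    '
}

# Illustrative sample with the third member missing:
printf '%s\n' \
    'md0 : active raid5 sdb[0] sdc[1]' \
    '      3145728 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [UU_]' \
    | degraded_arrays
```

On a live system the function would be fed the real file, for example degraded_arrays < /proc/mdstat, and empty output means no array is reporting a missing member.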

Hardware RAID vs Software RAID

Factor	Hardware RAID	Software RAID (mdadm)
Controller	Dedicated RAID card (PERC, SmartArray)	CPU and kernel
Cost	High ($300–$3000+ for card)	Free (part of Linux kernel)
Performance	Better (dedicated ASIC, BBU cache)	Good (modern CPUs handle parity fast)
Portability	Poor (tied to specific card)	Excellent (any Linux system)
Visibility	OS sees single virtual disk	OS sees all member disks + md device
Boot support	Can RAID the boot drive easily	Requires GRUB configuration

Common RAID Interview Questions

Q: 4 disks of 1TB in RAID 10. How much usable space?

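RAID 10 = Mirroring + Striping. Mirroring halves the capacity; striping neither adds nor removes capacity. So 4 × 1TB in RAID 10 = 2TB usable. Tools that report binary units show this as roughly 1.8 TiB (2 × 10^12 bytes ≈ 1.82 TiB), and filesystem overhead reduces it slightly further. The md superblock itself takes only a negligible amount of space.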

Q: Can RAID 10 survive two disk failures?

It depends on which disks fail. If the two failed disks are in different mirror pairs, the array survives. If both disks in a single mirror pair fail, the array fails (no redundancy for that pair).

Q: How do you troubleshoot if one of 8 disks fails in LVM?

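1. Add a new disk of equal or larger size to the volume group (pvcreate, then vgextend). 2. Use pvmove to migrate data off the failing disk. pvmove works online, so unmounting the filesystem first is optional. 3. Use vgreduce to remove the old PV from the volume group. 4. If the filesystem was unmounted, remount it. If the disk has already failed completely, pvmove cannot read from it; vgreduce --removemissing drops the missing PV, and any logical volume that had data on it must be restored from backup. This is LVM, not RAID, so the approach differs from RAID recovery.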

Q: RAID 5 vs RAID 6 — when to use RAID 6?

Use RAID 6 when: (1) array has 6+ disks (rebuild time is very long, risk of second failure increases), (2) data is critical and you need dual-drive fault tolerance, (3) using large capacity disks (4TB+ — rebuild can take 24+ hours, increasing risk window).
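
The rebuild-time concern can be put into rough numbers. This is a back-of-the-envelope sketch only: rebuild_hours is a hypothetical helper name, and the 50 MB/s sustained rebuild rate is an assumed figure for illustration, not a measurement. Real rates vary with disk type and with how busy the array is.

```shell
# rebuild_hours is a hypothetical helper for this sketch.
# $1 = disk size in TB (decimal), $2 = ASSUMED sustained rebuild rate in MB/s
rebuild_hours() {
    rh_seconds=$(( $1 * 1000000 / $2 ))     # 1 TB = 1,000,000 MB
    echo $(( rh_seconds / 3600 ))           # whole hours, rounded down
}

rebuild_hours 4 50    # prints 22
```

At the assumed rate a 4TB member takes about 22 hours to rebuild, during which a RAID 5 array has no remaining redundancy.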