Arch Linux on Btrfs RAID with LUKS

Because I accidentally formatted the wrong disk (in Chinese), and initially mistook it for a drive failure (in Chinese), I built a new system with RAID (in Chinese). I want to keep using LUKS to protect my data, so I need to combine it with RAID. This is a little complicated to set up and maintain, so in this article I will explain the steps in detail. I wrote this partly because only a few people have documented such a setup, and almost nobody has done it on a root drive, so I hope my experience is helpful.

To begin, I need to introduce my setup.

Disclaimer:

I am not an expert in storage or cryptography. My understanding of RAID, redundancy and TPM may be inaccurate. Please correct me if I am wrong.

Also, you should always consult ArchWiki, your distro’s manual and the man pages of the commands before copying them. Your environment may be dramatically different from mine, so don’t just blindly copy without caution.

Environment #

Be prepared to use the following hardware and software:

  1. A Linux system. In this article, I am going to use Arch, but most current kernels should work. Also, consider the differences in initrd: I use mkinitcpio with systemd hooks, which may not be supported on other distros.
  2. Btrfs RAID-N with M disks. I am using Btrfs RAID10 with three 1TiB HDDs (yes, I have heard that Btrfs RAID10 can work on three drives, but in a degraded state. Again, I don’t fully understand these technologies, so I may be wrong. My fourth disk is arriving soon.) The RAID level is not significant in this tutorial, so choose whichever level you want. You should read the Btrfs wiki for more information on that topic.
  3. A disk to store the ESP (EFI System Partition). This could be anything >= 100MiB in size: a USB stick, a phone, an SD card, or even PXE. You don’t need a lot of space on that drive. You could also try placing the ESP on one of the RAID disks, but I am not going to do that. In this example, I will use a USB stick.
  4. A way to unlock the LUKS partitions automatically. I will explain that in detail later, but I strongly recommend using a TPM combined with Secure Boot, which gives good security with little complexity.
  5. Secure Boot and physical security: optional, but you had better have them. Combining Secure Boot with TPM PCRs makes the setup safer. I will talk about it later.
  6. Be prepared to use systemd. Although it’s not absolutely necessary, it makes the TPM setup easier. I am not a systemd fan, but it’s a trade-off.

Note:

You do not need exactly the same configuration as mine. I hope this guide serves as a general guideline that teaches the concepts, rather than asking you to copy my configuration and commands.

Disk Identifiers #

If you have multiple disks, you need an efficient way to identify them. DO NOT USE sdX. THESE NAMES ARE UNRELIABLE. YOU WILL WIPE THE WRONG DISK, just as I did.

I am going to use disk IDs, which look like Protocol_Model_Serial. You can check your disk IDs at /dev/disk/by-id/.
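To see which kernel name each ID currently points to, something like the following helper may be handy. It is only a minimal sketch: the function name list_disk_ids is my own invention, not a standard tool, and it assumes that partition entries end in -partN, which is how udev usually names them.

```shell
#!/bin/sh
# list_disk_ids [DIR]: print "ID -> device" for every whole-disk symlink in DIR
# (default: /dev/disk/by-id). Hypothetical helper, not part of any package.
list_disk_ids() {
    dir="${1:-/dev/disk/by-id}"
    for link in "$dir"/*; do
        # Skip anything that is not a symlink (this also covers an empty directory).
        [ -L "$link" ] || continue
        # Skip partition entries such as ata-Model_Serial-part1.
        case "$link" in
            *-part[0-9]*) continue ;;
        esac
        printf '%s -> %s\n' "${link##*/}" "$(readlink -f "$link")"
    done
}
```

Calling list_disk_ids without arguments reads /dev/disk/by-id/. Note that the same disk may appear under several IDs (for example an ata- and a wwn- entry), so compare the output with lsblk before touching anything.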

In the rest of this article, I will denote the RAID member HDDs as RAID0, RAID1 and RAID2, their raw (LUKS) block devices as /dev/RAID0, /dev/RAID1 and /dev/RAID2, and their decrypted forms as /dev/mapper/RAID0, /dev/mapper/RAID1 and /dev/mapper/RAID2.

Let’s begin.

Warning:

Always double-check before doing disk operations. I am not responsible for any lost data.

Disk Layout #

Let’s consider what the layout of our disks should be. You cannot put LUKS on top of a Btrfs RAID. You can only put LUKS on the raw drives, then build the RAID on the LUKS-decrypted devices (i.e. /dev/mapper/).

Therefore, I am going to encrypt the whole disks rather than partitions:

  • RAID0 will have a LUKS signature, and its decrypted form will be at /dev/mapper/RAID0.
  • RAID1 will have a LUKS signature, and its decrypted form will be at /dev/mapper/RAID1.
  • RAID2 will have a LUKS signature, and its decrypted form will be at /dev/mapper/RAID2.
  • The Btrfs RAID will be built upon /dev/mapper/RAID0, /dev/mapper/RAID1 and /dev/mapper/RAID2.
  • The ESP drive will be formatted as vfat.

Decryption Methods (Keyslot) #

Now, we need to think about how to decrypt the drives. Because the RAID is going to be the root drive, we have to decrypt the LUKS devices and then mount the Btrfs RAID within the initrd. Also, considering the number of disks, unlocking with passwords is tedious (though technically possible). We have two approaches:

  1. Use a hardware key: TPM, FIDO2, etc.
  2. Use a keyfile.

If we use a hardware key, we can have unattended unlock (for TPM), which simplifies daily use. Hardware keys are also simpler to install than keyfiles. If we use keyfiles, we need to find a way to protect them, because they must be present in the initrd. The simplest way would be to carry the ESP drive along with you, but you could find other, more complex solutions, for example letting the bootloader unlock something.
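For readers who pick the keyfile route, here is a minimal sketch of creating one. The function name make_keyfile and the default size of 2048 bytes are my own assumptions, not something this setup requires; the only real point is that the file is filled with random bytes and is created readable by its owner only.

```shell
#!/bin/sh
# make_keyfile PATH [BYTES]: create a random keyfile readable only by its owner.
# Hypothetical helper; the 2048-byte default is an arbitrary choice.
make_keyfile() {
    path=$1
    bytes=${2:-2048}
    # Refuse to overwrite: an existing file would keep its old permissions.
    if [ -e "$path" ]; then
        echo "refusing to overwrite $path" >&2
        return 1
    fi
    # The umask only applies inside the subshell, so the caller is unaffected.
    ( umask 077; dd if=/dev/urandom of="$path" bs=1 count="$bytes" 2>/dev/null )
}
```

Such a file could then be added to a keyslot with cryptsetup luksAddKey /dev/RAID0 /path/to/keyfile and referenced in the third column of the crypttab instead of the "-" used later in this article.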

Anyway, in my setup, I will use the TPM provided by AMD fTPM.

Software Setup with TPM #

In order to unlock LUKS with TPM, we need three steps for each disk:

  1. Format the drives as LUKS and set the password: cryptsetup luksFormat /dev/RAID0
  2. Enroll a key in the TPM, then bind a LUKS keyslot to it: systemd-cryptenroll --tpm2-device=/path/to/tpm2_device --tpm2-pcrs=1+7 /dev/RAID0.
  3. During unlock, pull the key from TPM and unlock LUKS: We will do this in initrd hooks.

There are multiple ways to do them. According to Arch Wiki, we can use either systemd-cryptenroll(1) or clevis to do that.

Side note:

On Arch Linux, systemd requires tpm2-tss to unlock drives using TPM, or it will fail with “TPM2 support is not installed” (Source code). The sd-encrypt mkinitcpio hook will automatically copy the tpm2-tss libraries to the initrd image (Source code), so make sure to install the package and regenerate the initrd.

Note that mkinitcpio-systemd-tool currently does not copy these libraries, so an initrd built with it cannot unlock using TPM.

I am not sure if existing Clevis mkinitcpio hooks will work with multiple disks in the initrd, but you can always write a custom mkinitcpio hook to achieve that. However, I am too lazy to do that, so I would go with systemd-cryptenroll.

It would be too dangerous to rely on TPM unlock only. If the TPM fails or the PCRs change (see below), you can’t access your data. Therefore, I strongly recommend you set a recovery password. In fact, cryptsetup luksFormat with default options will automatically ask you for a password to store in keyslot 0. You need to be prepared to type these passwords manually, so don’t make them too complex.
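One way to double-check that a disk really has both unlock methods is to inspect the output of cryptsetup luksDump. The sketch below is an assumption-laden example: check_unlock_methods is a hypothetical helper of mine, and it relies on the LUKS2 dump listing keyslots as "N: luks2" and the TPM token as "systemd-tpm2", which is what I would expect from systemd-cryptenroll but may differ between versions.

```shell
#!/bin/sh
# check_unlock_methods: read `cryptsetup luksDump` output on stdin and succeed
# only if there is a systemd-tpm2 token AND at least two keyslots, so that one
# keyslot is left over for the recovery password. Hypothetical helper.
check_unlock_methods() {
    dump=$(cat)
    # grep -c exits non-zero when the count is 0, hence the "|| true".
    slots=$(printf '%s\n' "$dump" | grep -Ec '^[[:space:]]*[0-9]+: luks2' || true)
    if ! printf '%s\n' "$dump" | grep -q 'systemd-tpm2'; then
        echo "no TPM2 token"
        return 1
    fi
    if [ "$slots" -lt 2 ]; then
        echo "only $slots keyslot(s)"
        return 1
    fi
    echo "ok: TPM2 token and $slots keyslots"
}
```

Usage would be cryptsetup luksDump /dev/RAID0 | check_unlock_methods, run as root for each member disk.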

Related resources: Arch Wiki.

Additional features of TPM and key rotation #

TPM offers a cool thing called PCRs (Platform Configuration Registers). In my naive understanding, a PCR is like a hash of some environment factors (e.g. BIOS or Secure Boot settings). Whenever these factors change, the key stored in the TPM will no longer be valid (at least it can’t decrypt the LUKS partition). You can also specify which factors to use when enrolling the key. This feature is extremely useful for protecting against evil maid attacks, but it adds maintenance costs, because you need to rotate the keys whenever you change something.

A list of PCRs is available at Arch Wiki and in systemd-cryptenroll(1). I recommend you do some research on that topic on your own, if you have time.

When rotating the TPM keys, systemd-cryptenroll offers a cool feature: passing --wipe-slot=tpm2 automatically removes all other TPM keyslots. Therefore, we can just run systemd-cryptenroll <device> --wipe-slot=tpm2 --tpm2-pcrs=<Your choice> --tpm2-device=auto to rotate the key. Rotation requires the recovery password. I made a script for that.

Related resources: systemd-cryptenroll(1).

Secure Boot #

Although it’s optional, I strongly recommend you sign your kernel and initrd with a custom key. Secure Boot prevents unauthorized modifications to your kernel and initrd, thus preventing others from stealing keys from the TPM. Combining Secure Boot with TPM PCR 7 helps prevent evil maid attacks as well.

You may find a step-by-step tutorial on Arch Wiki.

initrd configuration #

The initrd is going to:

  1. Pull the keys out of TPM
  2. Unlock LUKS partitions using the keys
  3. Prompt for recovery password if they cannot unlock using TPM (e.g. when PCR changed)

Fortunately, systemd-cryptsetup, which the sd-encrypt hook runs in the initrd, can do all these tasks without custom scripts.

According to Arch Wiki, we need to enable the hooks and set up the crypttab. I won’t go into detail on this part. My mkinitcpio.conf and crypttab.initramfs are:

$ grep -e "^HOOKS=" /etc/mkinitcpio.conf
HOOKS=(base systemd btrfs autodetect modconf block keyboard sd-encrypt sd-vconsole filesystems fsck)
$ cat /etc/crypttab.initramfs 
RAID0   /dev/RAID0        -       tpm2-device=auto
RAID1   /dev/RAID1        -       tpm2-device=auto
RAID2   /dev/RAID2        -       tpm2-device=auto
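Since every line follows the same pattern, the file can also be generated. The following is only a sketch: gen_crypttab is a hypothetical helper of mine, and the NAME=DEVICE argument format is an assumption chosen so that a real /dev/disk/by-id/ path can be passed for each disk.

```shell
#!/bin/sh
# gen_crypttab NAME=DEVICE...: print one crypttab line per pair, using "-" as
# the key field and tpm2-device=auto as the option. Hypothetical helper.
gen_crypttab() {
    for pair in "$@"; do
        name=${pair%%=*}   # text before the first "="
        dev=${pair#*=}     # text after the first "="
        printf '%s\t%s\t-\ttpm2-device=auto\n' "$name" "$dev"
    done
}
```

For example, gen_crypttab RAID0=/dev/RAID0 RAID1=/dev/RAID1 RAID2=/dev/RAID2 prints the three lines shown in this section, separated by tabs instead of spaces. Review the output before redirecting it to /etc/crypttab.initramfs.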

Btrfs RAID #

Again, I don’t fully understand these concepts. Consult the Btrfs wiki. Just notice that you should use /dev/mapper/RAID{0..2} (the decrypted devices, not the raw disks) when formatting the RAID. You should add the btrfs hook to the initrd as well, because we need to mount the RAID in the initrd (see Arch Wiki).

Read Btrfs Wiki#Using Btrfs with Multiple Devices.
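To make the device list explicit, here is a small sketch that only prints the mkfs.btrfs command instead of running it. The function mkfs_btrfs_cmd is hypothetical, and it assumes the decrypted devices follow the naming used in this article (/dev/mapper/RAID0, /dev/mapper/RAID1, and so on) and that data and metadata use the same RAID level.

```shell
#!/bin/sh
# mkfs_btrfs_cmd LEVEL COUNT: print (do not run) the mkfs.btrfs command for
# COUNT decrypted devices named /dev/mapper/RAID0 .. /dev/mapper/RAID<COUNT-1>.
# Hypothetical helper; the same LEVEL is used for metadata (-m) and data (-d).
mkfs_btrfs_cmd() {
    level=$1
    count=$2
    cmd="mkfs.btrfs -m $level -d $level"
    i=0
    while [ "$i" -lt "$count" ]; do
        cmd="$cmd /dev/mapper/RAID$i"
        i=$((i + 1))
    done
    printf '%s\n' "$cmd"
}
```

Printing the command first gives you a chance to confirm that every path points at /dev/mapper/ before anything is formatted. After formatting and mounting, btrfs filesystem show lists the member devices, which is a quick way to confirm that all of them joined the filesystem.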

Apply #

Finally, we can deploy everything. Because I already explained what the commands do above, I won’t elaborate on them again in this section. The commands below are for reference only. Do not copy them blindly.

# Install tpm2-tss if not already installed.
pacman -S --needed tpm2-tss

# Format the whole disk as LUKS (no partition table),
# and supply a recovery password at keyslot 0.
cryptsetup luksFormat /dev/RAID0

# Enroll a key in the TPM, then set them as keyslot 1 or 2.
# The --tpm2-pcrs argument is optional.
# Consult Arch Wiki for --tpm2-device.
systemd-cryptenroll /dev/RAID0 --tpm2-pcrs=1+7 --tpm2-device=auto

# Then, do the above two commands for all hard drives.
cryptsetup luksFormat /dev/RAID1
cryptsetup luksFormat /dev/RAID2
systemd-cryptenroll /dev/RAID1 --tpm2-pcrs=1+7 --tpm2-device=auto
systemd-cryptenroll /dev/RAID2 --tpm2-pcrs=1+7 --tpm2-device=auto

# Decrypt them.
/usr/lib/systemd/systemd-cryptsetup attach RAID0 /dev/RAID0 - tpm2-device=auto
/usr/lib/systemd/systemd-cryptsetup attach RAID1 /dev/RAID1 - tpm2-device=auto
/usr/lib/systemd/systemd-cryptsetup attach RAID2 /dev/RAID2 - tpm2-device=auto

# Create the RAID. You may use a different RAID level.
mkfs.btrfs -m raid10 -d raid10 /dev/mapper/RAID0 /dev/mapper/RAID1 /dev/mapper/RAID2

# Mount the RAID. Using any disk to mount is OK. Remember to use /dev/mapper/, not raw disk.
mount /dev/mapper/RAID0 /mnt/

# Mount the ESP
mkdir /mnt/boot/
mount /dev/disk/by-id/usb-ESP /mnt/boot/

# Install the system (xxx stands for your package list) and chroot.
pacstrap /mnt/ xxx
arch-chroot /mnt/

# Enable the btrfs, systemd and sd-encrypt related hooks in mkinitcpio.conf
# HOOKS=(base systemd btrfs autodetect modconf block keyboard sd-encrypt sd-vconsole filesystems fsck)
$EDITOR /etc/mkinitcpio.conf

# Put the disks in /etc/crypttab.initramfs.
#RAID0   /dev/RAID0        -       tpm2-device=auto
#RAID1   /dev/RAID1        -       tpm2-device=auto
#RAID2   /dev/RAID2        -       tpm2-device=auto
$EDITOR /etc/crypttab.initramfs

# Recreate the initrd image.
mkinitcpio -P

# Secure Boot with EFISTUB unified image: Recreate the unified kernel image and sign it.
# I am using sbupdate(1).
sbupdate

If you use the same PCRs as I do and you change any BIOS settings, the TPM can’t decrypt the drives automatically in the initrd, and you will be prompted for passwords. At any time (before or after decrypting the drives), you can use these commands to rotate the keys (I recommend you make a script for them):

# The difference is --wipe-slot=tpm2.
# Read systemd-cryptenroll(1) for more details.
# You will be prompted for password.
systemd-cryptenroll /dev/RAID0 --wipe-slot=tpm2 --tpm2-pcrs=1+7 --tpm2-device=auto
systemd-cryptenroll /dev/RAID1 --wipe-slot=tpm2 --tpm2-pcrs=1+7 --tpm2-device=auto
systemd-cryptenroll /dev/RAID2 --wipe-slot=tpm2 --tpm2-pcrs=1+7 --tpm2-device=auto
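A rotation script could be as small as a loop over the disks. The sketch below is not the script I use; rotate_tpm2 and the RUN variable are hypothetical names, and it defaults to a dry run that only prints the commands so that nothing is changed by accident.

```shell
#!/bin/sh
# rotate_tpm2 PCRS DEVICE...: re-enroll the TPM2 keyslot on each device.
# By default the commands are only printed; set RUN=1 to execute them
# (needs root, and prompts for the recovery password of each disk).
# Hypothetical helper; RUN is an invented switch, not a systemd option.
rotate_tpm2() {
    pcrs=$1
    shift
    for dev in "$@"; do
        if [ "${RUN:-0}" = "1" ]; then
            systemd-cryptenroll "$dev" --wipe-slot=tpm2 \
                --tpm2-pcrs="$pcrs" --tpm2-device=auto || return 1
        else
            echo "systemd-cryptenroll $dev --wipe-slot=tpm2 --tpm2-pcrs=$pcrs --tpm2-device=auto"
        fi
    done
}
```

For example, rotate_tpm2 1+7 /dev/RAID0 /dev/RAID1 /dev/RAID2 prints the three commands above, and RUN=1 rotate_tpm2 1+7 /dev/RAID0 /dev/RAID1 /dev/RAID2 runs them, stopping at the first failure.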

That’s it! Hope it is helpful!

References #