Getting Started with ZFS
These are my notes from setting up a ZFS file server on FreeBSD.
For this project, I will be configuring four drives in an external Thunderbolt 2 enclosure from OWC that connects to my 2013 Mac Pro.
My plan is to create a mirrored zpool with two 500GB SSDs and have a single-disk zpool for my 2TB drive. I have another 128GB SSD in the enclosure that will be formatted as UFS for “scratch” data.
Initializing Drives
I like the idea of setting each disk’s serial number as the drive label because it is an unambiguous reference. Importantly, the serial number is nearly always accessible. It’s both printed on the physical drive and available via disk utility programs. When you need to replace a drive, it’s really helpful to know exactly what drive to touch. (Or type!)
With that in mind, the first thing I did was to find where the drives are currently located in the machine:
$ camcontrol devlist
<SHGS31-500GS-2 90000Q00> at scbus0 target 0 lun 0 (pass0,ada0)
<SHGS31-500GS-2 90000Q00> at scbus1 target 0 lun 0 (pass1,ada1)
<M4-CT128M4SSD2 0309> at scbus2 target 0 lun 0 (pass2,ada2)
<WDC WD20SPZX-08UA7 02.01A02> at scbus3 target 0 lun 0 (pass3,ada3)
<APPLE SSD SM0256G BXW8SA0Q> at scbus4 target 0 lun 0 (pass4,ada4)
There are a few ways of getting the serial number, but if you decide to grab it from software, please also make sure the same value is printed on the physical drive label.
You can use camcontrol identify or geom disk list, but in this case I decided to grep over dmesg with -B1, which prints each device line right above its serial number, to grab all the drive serial numbers at once.
$ dmesg | grep -B1 'Serial Number'
ada0: <SHGS31-500GS-2 90000Q00> ACS-3 ATA SATA 3.x device
ada0: Serial Number ESA8N416111408609
--
ada1: <SHGS31-500GS-2 90000Q00> ACS-3 ATA SATA 3.x device
ada1: Serial Number ESA8N41611140860U
--
ada2: <M4-CT128M4SSD2 0309> ACS-2 ATA SATA 3.x device
ada2: Serial Number 0000000012020907C404
--
ada3: <WDC WD20SPZX-08UA7 02.01A02> ACS-3 ATA SATA 3.x device
ada3: Serial Number WD-WXP2E31CV0WZ
--
ada4: <APPLE SSD SM0256G BXW8SA0Q> ATA8-ACS SATA 3.x device
ada4: Serial Number S216NYAH100559
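If you want the device-to-serial mapping in a form that is easier to copy from, something like the following could work. This is only a sketch: serials_from_dmesg is a hypothetical helper name, and it assumes the "adaN: Serial Number XXXX" line shape shown in the output above.

```shell
# Print "device serial" pairs from dmesg-style text on stdin.
# Assumes lines shaped like "ada0: Serial Number XXXX", as in the output above.
serials_from_dmesg() {
    awk '$2 == "Serial" && $3 == "Number" { sub(/:$/, "", $1); print $1, $4 }'
}

# Example using two lines copied from the output above (no live dmesg needed):
printf '%s\n' \
    'ada0: Serial Number ESA8N416111408609' \
    'ada3: Serial Number WD-WXP2E31CV0WZ' | serials_from_dmesg
```

On the real machine the same function would be fed with dmesg | serials_from_dmesg. The advice about checking the value against the physical drive label still applies.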
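Now I can initialize the drives [1]. Conveniently, ZFS on FreeBSD 12 and earlier enables TRIM by default for drives that support it; on FreeBSD 13 and later (OpenZFS), check the pool's autotrim property, which defaults to off.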
WARNING! You could destroy your system if you type the wrong drive number here!
$ gpart create -s gpt ada0
ada0 created
$ gpart add -t freebsd-zfs -l ESA8N416111408609 -a 1M ada0
ada0p1 added
$ gpart create -s gpt ada1
ada1 created
$ gpart add -t freebsd-zfs -l ESA8N41611140860U -a 1M ada1
ada1p1 added
$ gpart create -s gpt ada3
ada3 created
$ gpart add -t freebsd-zfs -l WD-WXP2E31CV0WZ -a 1M ada3
ada3p1 added
After creating all of the new partitions, you can check the results with gpart show -lp. After setting the GPT labels, the partitions can be referenced as /dev/gpt/$SERIAL_NUMBER.
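Since a typo in the drive number is the dangerous part, one option is to print the gpart commands first and only run them after reading them back. A minimal sketch, assuming the same two gpart invocations used above; gpart_plan is a hypothetical helper that only echoes text and never touches a disk.

```shell
# Print (not run) the gpart commands for one "device serial" pair.
# Dry run by design: review the output before executing anything as root.
gpart_plan() {
    dev=$1 serial=$2
    echo "gpart create -s gpt $dev"
    echo "gpart add -t freebsd-zfs -l $serial -a 1M $dev"
}

gpart_plan ada0 ESA8N416111408609
```

The printed lines can be compared against camcontrol devlist before being pasted into a root shell.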
Configuring ZFS
I’m hoping the FreeBSD handbook’s entry on ZFS will be enough to get everything set up.
The first thing it says to do is enable ZFS on the system.
$ echo 'zfs_enable="YES"' >> /etc/rc.conf
$ service zfs start
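Appending with echo adds a duplicate line every time the command is repeated. The sketch below shows one way to make the append idempotent. rc_enable is a hypothetical helper written for illustration, and the demo writes to a throwaway file; on FreeBSD itself, sysrc(8) is the tool meant for editing rc.conf.

```shell
# Append KEY="YES" to an rc.conf-style file only if that exact line is missing.
# rc_enable is a hypothetical helper; on FreeBSD, sysrc(8) does this properly.
rc_enable() {
    key=$1 file=$2
    line="${key}=\"YES\""
    grep -qxF "$line" "$file" 2>/dev/null || echo "$line" >> "$file"
}

# Demonstrate on a throwaway file rather than the real /etc/rc.conf.
tmp=$(mktemp)
rc_enable zfs_enable "$tmp"
rc_enable zfs_enable "$tmp"   # second call is a no-op
cat "$tmp"
rm -f "$tmp"
```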
Single drive (backup)
Even for a single drive, the first step is to create a zpool:
$ zpool create backup /dev/gpt/WD-WXP2E31CV0WZ
If you get the error “must be a block device or regular file”, make sure you are running zpool create as root.
You can see that it worked by checking the output of df, zpool status, or zpool list.
$ df -h
Filesystem Size Used Avail Capacity Mounted on
/dev/gpt/rootfs 77G 5.1G 66G 7% /
devfs 1.0K 1.0K 0B 100% /dev
backup 1.8T 96K 1.8T 0% /backup
$ zpool status
pool: backup
state: ONLINE
config:
NAME STATE READ WRITE CKSUM
backup ONLINE 0 0 0
gpt/WD-WXP2E31CV0WZ ONLINE 0 0 0
errors: No known data errors
$ zpool list
NAME SIZE ALLOC FREE CKPOINT EXPANDSZ FRAG CAP DEDUP HEALTH ALTROOT
backup 1.81T 360K 1.81T - - 0% 0% 1.00x ONLINE -
Now you can create the filesystem, also known as a dataset. I think enabling compression and turning off access time are enough customizations for now.
$ zfs create -v -o atime=off -o compression=on backup/data
create backup/data
atime=off
compression=on
Mirrored drives (data)
The first step is to create a mirrored zpool:
$ zpool create storage mirror /dev/gpt/ESA8N416111408609 /dev/gpt/ESA8N41611140860U
Now, create the dataset.
$ zfs create -v -o atime=off -o compression=on storage/data
create storage/data
atime=off
compression=on
Looks good!
$ df -h
Filesystem Size Used Avail Capacity Mounted on
/dev/gpt/rootfs 77G 5.1G 66G 7% /
devfs 1.0K 1.0K 0B 100% /dev
backup 1.8T 96K 1.8T 0% /backup
backup/data 1.8T 96K 1.8T 0% /backup/data
storage 449G 96K 449G 0% /storage
storage/data 449G 96K 449G 0% /storage/data
Networked Storage with NFS
- https://docs.freebsd.org/en/books/handbook/zfs/#zfs-zfs-set-share
- https://docs.freebsd.org/en/books/handbook/network-servers/#network-nfs
- nfsv4(4)
Enable NFSv4 by adding the following services to /etc/rc.conf:
rpcbind_enable="YES"
mountd_enable="YES"
nfs_server_enable="YES"
nfsv4_server_enable="YES"
nfsuserd_enable="YES"
If you don’t have an /etc/exports file yet, you can create a blank one.
Start NFS without rebooting:
$ service nfsd start
Starting rpcbind.
/etc/rc.d/mountd.
Starting mountd.
Starting nfsd.
$ service nfsuserd start
Starting nfsuserd.
Sharing ZFS dataset:
$ zfs set sharenfs=on storage/data
This will share the dataset over NFS with the default options mentioned in zfsprops(8).
You can list all of the custom options in a dataset like this:
$ zfs get -r -s local all storage/data
NAME PROPERTY VALUE SOURCE
storage/data sharenfs on local
storage/data compression on local
storage/data atime off local
Or get the value for specific options:
$ zfs get sharenfs
NAME PROPERTY VALUE SOURCE
backup sharenfs off default
backup/data sharenfs off default
storage sharenfs off default
storage/data sharenfs on local
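With more datasets, the table gets long, and it can help to filter it down to the ones that are actually shared. A minimal sketch, assuming the four-column layout shown above; datasets_on is a hypothetical helper name.

```shell
# List datasets whose VALUE column is "on", reading the table printed by
# `zfs get <property>` from stdin. Assumes the NAME PROPERTY VALUE SOURCE
# column order shown above; the header line is skipped.
datasets_on() {
    awk 'NR > 1 && $3 == "on" { print $1 }'
}

# Example using rows copied from the output above:
printf '%s\n' \
    'NAME          PROPERTY  VALUE  SOURCE' \
    'backup        sharenfs  off    default' \
    'storage/data  sharenfs  on     local' | datasets_on
```

On the server this would be run as zfs get sharenfs | datasets_on. For scripting, zfs get also has -H (no header, tab-separated) and -o (choose columns), which would make the parsing less dependent on the table layout.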
Now you should be able to mount storage/data on another machine with an NFS client installed:
$ mount -t nfs $HOST:/storage/data /mnt/zdata
I haven’t gone through the necessary configuration for NFSv4, so my connection is getting demoted to NFSv3. This is okay for now because I’m the only user anyway and I’m connecting on a local network.
$ mount -vvv -t nfs $HOST:/storage/data /mnt/zdata/
mount.nfs: timeout set for Wed Jan 26 01:02:03 2022
mount.nfs: trying text-based options 'vers=4.2,addr=10.0.0.42,clientaddr=10.0.0.123'
mount.nfs: mount(2): Permission denied
mount.nfs: trying text-based options 'vers=4,minorversion=1,addr=10.0.0.42,clientaddr=10.0.0.123'
mount.nfs: mount(2): Permission denied
mount.nfs: trying text-based options 'vers=4,addr=10.0.0.42,clientaddr=10.0.0.123'
mount.nfs: mount(2): Permission denied
mount.nfs: trying text-based options 'addr=10.0.0.42'
mount.nfs: prog 100003, trying vers=3, prot=6
mount.nfs: trying 10.0.0.42 prog 100003 vers 3 prot TCP port 2049
mount.nfs: prog 100005, trying vers=3, prot=17
mount.nfs: trying 10.0.0.42 prog 100005 vers 3 prot UDP port 797
While I continue to set things up, I made everything under /storage, including /storage/data, “world-writable” so I can start using the file server [2].
$ chmod -R 777 /storage
Ultimately, I would like to use “ZFS over NFS” as my primary storage medium because it would keep everything in one place and let me access it from any computer on my network.
Overall, I think ZFS is a really flexible filesystem, and pairing it with ECC memory provides a lot of protection against data corruption. I can manage local and remote backups with zfs send and add or replace drives by attaching new devices to my existing zpools, without recreating the datasets that live in them.
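As a rough idea of what a local backup with zfs send might look like, here is a dry-run sketch that only prints the commands. The dataset storage/data comes from the setup above; the target dataset backup/storage-data, the date-based snapshot name, and the replicate_plan helper are all hypothetical and not something I have run on this server.

```shell
# Print (not run) the commands for a one-off local replication.
# The source dataset follows the post; the snapshot name and the target
# dataset "backup/storage-data" are hypothetical.
replicate_plan() {
    src=$1 dst=$2 snap=$3
    echo "zfs snapshot ${src}@${snap}"
    echo "zfs send ${src}@${snap} | zfs receive ${dst}"
}

replicate_plan storage/data backup/storage-data "$(date +%Y-%m-%d)"
```

A plain zfs receive like this expects the target dataset not to exist yet. Later runs would normally send only the difference between two snapshots with zfs send -i.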
Bonus: Creating my UFS scratch disk
(I’m using the serial number I got earlier and double-triple-checked my drive number!)
First, initialize the drive:
$ gpart create -s gpt ada2
ada2 created
$ gpart add -t freebsd-ufs -l 0000000012020907C404 -a 1M ada2
ada2p1 added
Then create the filesystem with TRIM enabled:
$ newfs -U -j -t -L scratch /dev/ada2p1
/dev/ada2p1: 122103.0MB (250066944 sectors) block size 32768, fragment size 4096
using 196 cylinder groups of 625.22MB, 20007 blks, 80128 inodes.
with soft updates
super-block backups (for fsck_ffs -b #) at: [...]
List the active filesystem options:
$ tunefs -p /dev/ada2p1
tunefs: POSIX.1e ACLs: (-a) disabled
tunefs: NFSv4 ACLs: (-N) disabled
tunefs: MAC multilabel: (-l) disabled
tunefs: soft updates: (-n) enabled
tunefs: soft update journaling: (-j) disabled
tunefs: gjournal: (-J) disabled
tunefs: trim: (-t) enabled
tunefs: maximum blocks per file in a cylinder group: (-e) 4096
tunefs: average file size: (-f) 16384
tunefs: average number of files in a directory: (-s) 64
tunefs: minimum percentage of free space: (-m) 8%
tunefs: space to hold for metadata blocks: (-k) 6400
tunefs: optimization preference: (-o) time
tunefs: volume label: (-L)
Footnotes:
[1] Following the FreeBSD handbook’s advice of using 1 MiB alignments. Hopefully this is a good idea; I think letting gpart decide an alignment would be fine too.
[2] It’s only temporary if it doesn’t work…