Getting Started with ZFS
These are my notes from setting up a ZFS file server on FreeBSD.
For this project, I will be configuring four drives in an external Thunderbolt 2 enclosure from OWC that connects to my 2013 Mac Pro.
My plan is to create a mirrored zpool with two 500GB SSDs and have a
single-disk zpool for my 2TB drive. I have another 128GB SSD in the
enclosure that will be formatted as UFS for “scratch” data.
Initializing Drives
I like the idea of setting each disk’s serial number as the drive label because it is an unambiguous reference. Importantly, the serial number is nearly always accessible. It’s both printed on the physical drive and available via disk utility programs. When you need to replace a drive, it’s really helpful to know exactly what drive to touch. (Or type!)
With that in mind, the first thing I did was to find where the drives are currently located in the machine:
$ camcontrol devlist
<SHGS31-500GS-2 90000Q00>          at scbus0 target 0 lun 0 (pass0,ada0)
<SHGS31-500GS-2 90000Q00>          at scbus1 target 0 lun 0 (pass1,ada1)
<M4-CT128M4SSD2 0309>              at scbus2 target 0 lun 0 (pass2,ada2)
<WDC WD20SPZX-08UA7 02.01A02>      at scbus3 target 0 lun 0 (pass3,ada3)
<APPLE SSD SM0256G BXW8SA0Q>       at scbus4 target 0 lun 0 (pass4,ada4)
There are a few ways of getting the serial number, but if you decide to grab it from software, please also make sure the same value is printed on the physical drive label.
You can use camcontrol identify or geom disk list, but in this case I
decided to grep over dmesg to grab all of the drive serial numbers
at once.
$ dmesg | grep -B1 'Serial Number'
ada0: <SHGS31-500GS-2 90000Q00> ACS-3 ATA SATA 3.x device
ada0: Serial Number ESA8N416111408609
--
ada1: <SHGS31-500GS-2 90000Q00> ACS-3 ATA SATA 3.x device
ada1: Serial Number ESA8N41611140860U
--
ada2: <M4-CT128M4SSD2 0309> ACS-2 ATA SATA 3.x device
ada2: Serial Number 0000000012020907C404
--
ada3: <WDC WD20SPZX-08UA7 02.01A02> ACS-3 ATA SATA 3.x device
ada3: Serial Number WD-WXP2E31CV0WZ
--
ada4: <APPLE SSD SM0256G BXW8SA0Q> ATA8-ACS SATA 3.x device
ada4: Serial Number S216NYAH100559
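If you want the device-to-serial mapping in a form that is easier to paste into later commands, the same dmesg lines can be reduced with awk. This is a minimal sketch, assuming the "adaN: Serial Number ..." line format shown above; the function name parse_serials is made up for illustration.

```sh
# parse_serials: read dmesg-style lines on stdin and print "<device> <serial>".
# Assumes lines shaped like "ada0: Serial Number ESA8N416111408609".
parse_serials() {
    awk '$2 == "Serial" && $3 == "Number" {
        sub(/:$/, "", $1)   # strip the trailing colon from the device name
        print $1, $4
    }'
}

# Usage on the FreeBSD host (not run here):
#   dmesg | parse_serials
```

The output is one "device serial" pair per line, which is still worth checking against the label printed on each physical drive.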
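Now I can initialize the drives [1]. A note on TRIM: the legacy FreeBSD ZFS (12.x and earlier) enables TRIM by default for drives that support it, while with OpenZFS (FreeBSD 13 and later) automatic TRIM is governed by the pool’s autotrim property, which is off by default.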
WARNING! You could destroy your system if you type the wrong drive number here!
$ gpart create -s gpt ada0
ada0 created
$ gpart add -t freebsd-zfs -l ESA8N416111408609 -a 1M ada0
ada0p1 added
$ gpart create -s gpt ada1
ada1 created
$ gpart add -t freebsd-zfs -l ESA8N41611140860U -a 1M ada1
ada1p1 added
$ gpart create -s gpt ada3
ada3 created
$ gpart add -t freebsd-zfs -l WD-WXP2E31CV0WZ -a 1M ada3
ada3p1 added
After creating all of the new partitions you can check the results
with gpart show -lp. After setting the GPT drive labels the
partitions can be referenced by /dev/gpt/$SERIAL_NUMBER.
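Since the two gpart commands are repeated for every disk, they can be wrapped in a small function with a dry-run switch, which gives one more chance to catch a wrong drive number. This is only a sketch: init_zfs_disk and the DRY_RUN variable are names I made up, while the gpart invocations are the same ones used above.

```sh
# init_zfs_disk: create a GPT scheme on a disk and add one labeled,
# 1 MiB-aligned freebsd-zfs partition, mirroring the commands above.
# Set DRY_RUN=1 to print the commands instead of running them.
init_zfs_disk() {
    disk=$1
    label=$2
    if [ -z "$disk" ] || [ -z "$label" ]; then
        echo "usage: init_zfs_disk <disk> <label>" >&2
        return 1
    fi
    run=${DRY_RUN:+echo}   # expands to "echo" only when DRY_RUN is set and non-empty
    $run gpart create -s gpt "$disk" &&
        $run gpart add -t freebsd-zfs -l "$label" -a 1M "$disk"
}

# Example (prints, does not execute):
#   DRY_RUN=1 init_zfs_disk ada0 ESA8N416111408609
```

Running it with DRY_RUN=1 first and reading the printed commands costs a few seconds and is much cheaper than partitioning the wrong disk.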
Configuring ZFS
I’m hoping the FreeBSD handbook’s entry on ZFS will be enough to get everything set up.
The first thing it says to do is enable ZFS on the system.
$ echo 'zfs_enable="YES"' >> /etc/rc.conf
$ service zfs start
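Appending to /etc/rc.conf with echo adds a duplicate line every time the command is repeated. On FreeBSD, sysrc(8) is the usual tool for editing rc.conf safely (for example, sysrc zfs_enable=YES). The sketch below shows the same idea in plain shell, for illustration only; rc_enable is a made-up helper name, and unlike sysrc it does not change a value that is already set.

```sh
# rc_enable: append KEY="YES" to an rc.conf-style file unless KEY is already set.
# The file defaults to /etc/rc.conf; pass another path to try it safely.
rc_enable() {
    key=$1
    file=${2:-/etc/rc.conf}
    if grep -q "^${key}=" "$file" 2>/dev/null; then
        return 0            # already present; leave the existing value alone
    fi
    printf '%s="YES"\n' "$key" >> "$file"
}

# Example against a scratch copy (not run here):
#   rc_enable zfs_enable /tmp/rc.conf.test
```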
Single drive (backup)
Even for a single drive, the first step is to create a zpool:
$ zpool create backup /dev/gpt/WD-WXP2E31CV0WZ
If you get the error: must be a block device or regular file, make
sure you are running zpool create as root.
You can see that it worked by checking the output of df, zpool status, or zpool list.
$ df -h
Filesystem         Size    Used   Avail Capacity  Mounted on
/dev/gpt/rootfs     77G    5.1G     66G     7%    /
devfs              1.0K    1.0K      0B   100%    /dev
backup             1.8T     96K    1.8T     0%    /backup
$ zpool status
  pool: backup
 state: ONLINE
config:
        NAME                   STATE     READ WRITE CKSUM
        backup                 ONLINE       0     0     0
          gpt/WD-WXP2E31CV0WZ  ONLINE       0     0     0
errors: No known data errors
$ zpool list
NAME     SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
backup  1.81T   360K  1.81T        -         -     0%     0%  1.00x    ONLINE  -
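For scripted checks, zpool list accepts -H, which drops the header and separates fields with tabs, and -o, which selects columns. A minimal sketch of a health filter built on that output follows; unhealthy_pools is a hypothetical helper name.

```sh
# unhealthy_pools: read "name<TAB>health" lines on stdin and print the
# names of pools whose health is anything other than ONLINE.
unhealthy_pools() {
    awk -F '\t' 'NF >= 2 && $2 != "ONLINE" { print $1 }'
}

# Usage on the file server (not run here):
#   zpool list -H -o name,health | unhealthy_pools
```

An empty result means every pool reported ONLINE, so the function slots easily into a cron job that only sends mail when something is printed.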
Now you can create the filesystem, also known as a dataset. I think
adding compression and turning off access time are enough customizations for now.
$ zfs create -v -o atime=off -o compression=on backup/data
create backup/data
        atime=off
        compression=on
Mirrored drives (data)
The first step is to create a mirrored zpool:
$ zpool create storage mirror /dev/gpt/ESA8N416111408609 /dev/gpt/ESA8N41611140860U
Now, create the dataset.
$ zfs create -v -o atime=off -o compression=on storage/data
create storage/data
        atime=off
        compression=on
Looks good!
$ df -h
Filesystem         Size    Used   Avail Capacity  Mounted on
/dev/gpt/rootfs     77G    5.1G     66G     7%    /
devfs              1.0K    1.0K      0B   100%    /dev
backup             1.8T     96K    1.8T     0%    /backup
backup/data        1.8T     96K    1.8T     0%    /backup/data
storage            449G     96K    449G     0%    /storage
storage/data       449G     96K    449G     0%    /storage/data
Networked Storage with NFS
- https://docs.freebsd.org/en/books/handbook/zfs/#zfs-zfs-set-share
- https://docs.freebsd.org/en/books/handbook/network-servers/#network-nfs
- nfsv4(4)
Enable NFSv4 by adding the following services to /etc/rc.conf:
rpcbind_enable="YES"
mountd_enable="YES"
nfs_server_enable="YES"
nfsv4_server_enable="YES"
nfsuserd_enable="YES"
If you don’t have an /etc/exports file yet, you can create a blank one.
Start NFS without rebooting:
$ service nfsd start
Starting rpcbind.
/etc/rc.d/mountd.
Starting mountd.
Starting nfsd.
$ service nfsuserd start
Starting nfsuserd.
Sharing ZFS dataset:
zfs set sharenfs=on storage/data
This will share the dataset over NFS with the default options
mentioned in zfsprops(8).
You can list all of the custom options in a dataset like this:
$ zfs get -r -s local all storage/data
NAME          PROPERTY     VALUE  SOURCE
storage/data  sharenfs     on     local
storage/data  compression  on     local
storage/data  atime        off    local
Or get the value for specific options:
$ zfs get sharenfs
NAME          PROPERTY  VALUE  SOURCE
backup        sharenfs  off    default
backup/data   sharenfs  off    default
storage       sharenfs  off    default
storage/data  sharenfs  on     local
Now you should be able to mount storage/data on another machine with
an NFS client installed:
mount -t nfs $HOST:/storage/data /mnt/zdata
I haven’t gone through the necessary configuration for NFSv4, so my connection is getting demoted to NFSv3. This is okay for now because I’m the only user anyway and I’m connecting on a local network.
$ mount -vvv -t nfs $HOST:/storage/data /mnt/zdata/
mount.nfs: timeout set for Wed Jan 26 01:02:03 2022
mount.nfs: trying text-based options 'vers=4.2,addr=10.0.0.42,clientaddr=10.0.0.123'
mount.nfs: mount(2): Permission denied
mount.nfs: trying text-based options 'vers=4,minorversion=1,addr=10.0.0.42,clientaddr=10.0.0.123'
mount.nfs: mount(2): Permission denied
mount.nfs: trying text-based options 'vers=4,addr=10.0.0.42,clientaddr=10.0.0.123'
mount.nfs: mount(2): Permission denied
mount.nfs: trying text-based options 'addr=10.0.0.42'
mount.nfs: prog 100003, trying vers=3, prot=6
mount.nfs: trying 10.0.0.42 prog 100003 vers 3 prot TCP port 2049
mount.nfs: prog 100005, trying vers=3, prot=17
mount.nfs: trying 10.0.0.42 prog 100005 vers 3 prot UDP port 797
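To confirm which protocol version a mount actually ended up with, a Linux client records the negotiated options in /proc/mounts. The sketch below pulls the vers= option for one mount point. It assumes a Linux client (which the mount.nfs output suggests), and nfs_vers is a hypothetical helper name.

```sh
# nfs_vers: read /proc/mounts-format lines on stdin and print the vers=
# option of the NFS filesystem mounted at the given mount point.
nfs_vers() {
    awk -v mp="$1" '$2 == mp && $3 ~ /^nfs/ {
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] ~ /^vers=/) print substr(opts[i], 6)
    }'
}

# Usage on the client (not run here):
#   nfs_vers /mnt/zdata < /proc/mounts
```

Note that the mount point is matched exactly, so pass it without a trailing slash.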
While I continue to set things up, I made everything under /storage
“world-writable” so I can start using the file server [2].
chmod -R 777 /storage
Ultimately, I would like to use “ZFS over NFS” as my primary storage medium because it would keep everything in one place and let me access it from any computer on my network.
Overall, I think ZFS is a really flexible filesystem, and pairing it
with ECC memory provides a lot of protection against data corruption. I can
manage local and remote backups with zfs send, and add or replace
drives in a zpool without having to rebuild the datasets that live on it.
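As a rough illustration of the zfs send workflow mentioned above, the sketch below prints a snapshot command and a full send/receive pipeline for review instead of running them. The helper name backup_cmds, the snapshot tag, and the target dataset backup/storage-data are all hypothetical. A full (non-incremental) zfs receive like this one expects the target dataset not to exist yet.

```sh
# backup_cmds: print the commands for a one-off local copy of a dataset.
#   $1 source dataset, $2 target dataset, $3 snapshot tag
backup_cmds() {
    src=$1
    dst=$2
    tag=$3
    printf 'zfs snapshot %s@%s\n' "$src" "$tag"
    printf 'zfs send %s@%s | zfs receive %s\n' "$src" "$tag" "$dst"
}

# Review the output first, then pipe it to sh as root to execute:
#   backup_cmds storage/data backup/storage-data first-copy
```

For a remote backup, the receive side of the pipeline would typically run over ssh on the other machine. That part is left out here because it depends on the setup.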
Bonus: Creating my UFS scratch disk
(I’m using the serial number I got earlier and double-triple-checked my drive number!)
First, initialize the drive:
$ gpart create -s gpt ada2
ada2 created
$ gpart add -t freebsd-ufs -l 0000000012020907C404 -a 1M ada2
ada2p1 added
Then create the filesystem with TRIM enabled:
$ newfs -U -j -t -L scratch /dev/ada2p1
/dev/ada2p1: 122103.0MB (250066944 sectors) block size 32768, fragment size 4096
        using 196 cylinder groups of 625.22MB, 20007 blks, 80128 inodes.
        with soft updates
super-block backups (for fsck_ffs -b #) at: [...]
List the active filesystem options:
$ tunefs -p /dev/ada2p1
tunefs: POSIX.1e ACLs: (-a)                                disabled
tunefs: NFSv4 ACLs: (-N)                                   disabled
tunefs: MAC multilabel: (-l)                               disabled
tunefs: soft updates: (-n)                                 enabled
tunefs: soft update journaling: (-j)                       disabled
tunefs: gjournal: (-J)                                     disabled
tunefs: trim: (-t)                                         enabled
tunefs: maximum blocks per file in a cylinder group: (-e)  4096
tunefs: average file size: (-f)                            16384
tunefs: average number of files in a directory: (-s)       64
tunefs: minimum percentage of free space: (-m)             8%
tunefs: space to hold for metadata blocks: (-k)            6400
tunefs: optimization preference: (-o)                      time
tunefs: volume label: (-L)
Footnotes:
[1] Following the FreeBSD handbook’s advice of using 1 MiB
alignments. Hopefully this is a good idea; I think letting gpart
decide an alignment would be fine too.
[2] It’s only temporary if it doesn’t work…