FreeBSD jails and vnet from scratch

Motivation

There are plenty of tools to ease creation and management of jails in FreeBSD (e.g. iocage, ezjail, iocell, and more on the FreeBSD wiki). However, sometimes things go wrong, and when they do, it's easiest to debug when you understand what's going on under the hood.

Here I'm trying to stick to what is available in the base install of a typical FreeBSD machine and do away with any hand-holding. I refer extensively to sources and various manpages, and they are marked as such. I encourage you to read through them for a better understanding of what is being typed. Some names (pool, interface, and jail names) are specific to my setup and are not marked as such, so swap in your own and be careful when copy/pasting.

Setting up the base jail

To begin setting up our thick jails, we first need a template basejail which contains FreeBSD files. I have named my main pool ztank.

zfs create -o mountpoint=/jails ztank/jails
zfs create ztank/jails/basejail

Next, we need to install the operating system files onto the base jail. FreeBSD comes with the bsdinstall utility which is handy. Also see bsdinstall(8) for more information.

export DISTRIBUTIONS="base.txz"
export BSDINSTALL_DISTSITE=https://download.freebsd.org/ftp/releases/amd64/12.2-RELEASE/
bsdinstall jail /jails/basejail

Setting DISTRIBUTIONS and BSDINSTALL_DISTSITE tells bsdinstall which distribution sets to fetch and where to fetch them from. If you already have the distfiles around, you can omit either or both. See bsdinstall(8). I opted to install no optional components and to enable only sshd on startup.
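The architecture and release in BSDINSTALL_DISTSITE are easy to get out of sync with the machine you are working on. As a small sketch, the URL can be assembled from those two values. dist_site is a hypothetical helper name, and the URL layout is simply the one used above.

```sh
# dist_site <arch> <release>: print a distribution site URL in the same
# layout as the one used above. The helper name is hypothetical.
dist_site() {
    printf 'https://download.freebsd.org/ftp/releases/%s/%s/\n' "$1" "$2"
}

# Example (on the host): use the machine architecture reported by uname.
# export BSDINSTALL_DISTSITE="$(dist_site "$(uname -m)" 12.2-RELEASE)"
```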

Next we use freebsd-update to update the base jail. As always, make sure to check out the excellent manpage for freebsd-update(8).

freebsd-update -b /jails/basejail fetch install
freebsd-update -b /jails/basejail IDS

Onto thick jails

It's now fairly easy to create clones of the base jail and set up as many jails as we need. ZFS makes this cheap, since only changed files take additional space on disk.

zfs snapshot ztank/jails/basejail@`freebsd-version`
zfs clone ztank/jails/basejail@`freebsd-version` ztank/jails/<newjail>
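One thing to watch: the backticks run the host's freebsd-version, so the snapshot is named after the host's patch level, which can differ from the base jail's after freebsd-update. A hedged alternative is to ask the base jail's own copy of the script for its userland version (the -u flag, see freebsd-version(1)). The helper name below is made up for illustration.

```sh
# snapshot_name <dataset> <jail_root>: print <dataset>@<userland version>,
# where the version comes from the jail's own freebsd-version script.
# Hypothetical helper; the paths in the example match the ones used above.
snapshot_name() {
    dataset=$1
    root=$2
    version=$("$root/bin/freebsd-version" -u) || return 1
    printf '%s@%s\n' "$dataset" "$version"
}

# Example:
# zfs snapshot "$(snapshot_name ztank/jails/basejail /jails/basejail)"
```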

To configure the jail, use the following template for /etc/jail.conf and modify as you see fit.

exec.clean;

exec.system_user = "root";
exec.jail_user = "root";

exec.start += "/bin/sh /etc/rc";
exec.stop = "/bin/sh /etc/rc.shutdown";
exec.consolelog = "/var/log/jail_${name}_console.log";

path = "/jails/${name}";
host.hostname = "${name}";

securelevel = 3;

znc {
}

This is a good time to test the jail to see if it's working.

jail -c znc
jexec znc /bin/tcsh

Configuring VNET and networking

Understanding and setting up vnet turned out to be a little trickier. I opted for a bridged configuration, where the jail's interface is bridged to the host's, over a routed configuration because it seemed simpler.

vnet gives a jail its own network stack. From the jail(8) manpage:

vnet: Create the jail with its own virtual network stack, with its own network interfaces, addresses, routing table, etc. The kernel must have been compiled with the VIMAGE option for this to be available. Possible values are “inherit” to use the system network stack, possibly with restricted IP addresses, and “new” to create a new network stack.

To enable networking, create a bridge interface on the host and an epair interface for each jail. This creates a layer 2 (Ethernet) hub-and-spoke topology with the bridge interface at the center. The jib utility, located at /usr/share/examples/jails/jib, has useful comments at the beginning of the file. It's also useful to read the jib_addm() function to see what it's doing.

We'll use jib to set up vnet as it automates all the steps we'd have to take anyway.

install /usr/share/examples/jails/jib /usr/local/bin

Modify jail.conf as follows. This is taken almost verbatim from the comments at the beginning of jib. Note that allow.mount, allow.mount.devfs, and enforce_statfs let the jail mount devfs itself, from the inside, using the ruleset indicated by devfs_ruleset. They are not necessary when jail(8) mounts devfs beforehand, as mount.devfs does here. See jail(8) for further detail.

allow.set_hostname = 1;
allow.raw_sockets = 1;

mount.devfs;
devfs_ruleset = 11;

vnet;
vnet.interface = "e0b_${name}";
exec.prestart += "jib addm ${name} igb0";
exec.poststop += "jib destroy ${name}";

znc {
    ...
}
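Because the interface name is built from the jail name, long jail names can become a problem: FreeBSD interface names are limited to 15 characters, and the e0b_ prefix uses four of them. The check below is a sketch with a hypothetical function name. Whether jib itself guards against this is worth confirming in its source.

```sh
# vnet_ifname <jail_name>: print the jail-side interface name used above
# (e0b_<name>) and fail if it exceeds the 15-character limit on FreeBSD
# interface names. Hypothetical helper for illustration.
vnet_ifname() {
    ifname="e0b_$1"
    if [ "${#ifname}" -gt 15 ]; then
        echo "interface name '$ifname' is longer than 15 characters" >&2
        return 1
    fi
    printf '%s\n' "$ifname"
}

# Example:
# vnet_ifname znc    # prints e0b_znc
```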

To allow dhclient to work inside vnet jails, add the following to /etc/devfs.rules. Once again, this is noted in the comments in jib. dhclient requires the /dev/bpf* devices to function, and this ruleset unhides them.

[devfsrules_jail_bpf=11]
add include $devfsrules_jail
add path 'bpf*' unhide

To have devfs re-read the rulesets, restart the devfs service. You can then make sure your ruleset is recognized by running devfs rule showsets. The rules pulled in by the add include ... line are defined in /etc/defaults/devfs.rules. See devfs(8) for details.
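Before restarting the service, it can help to confirm that the ruleset number referenced by devfs_ruleset in jail.conf actually exists in the rules file. A minimal sketch, assuming ruleset headers of the form [name=number] as shown above; has_ruleset is a hypothetical helper.

```sh
# has_ruleset <rules_file> <number>: succeed if the file contains a
# ruleset header of the form [some_name=<number>]. Hypothetical helper.
has_ruleset() {
    grep -Eq "^\[[A-Za-z0-9_]+=$2\]" "$1"
}

# Example (on the host):
# has_ruleset /etc/devfs.rules 11 && service devfs restart && devfs rule showsets
```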

jng is a similar utility that uses netgraph instead of epair interfaces to create a comparable network topology for jails. Again, the source is the best place to learn what it does, how it does it, and a little more about netgraph itself.

Configuring DHCP and friends

Warning: This part is a failed attempt. It is here for historical purposes. Feel free to skip.

To leverage our newfound networking prowess, we need to set up DHCP for the jail, which is done by modifying /etc/rc.conf in our jail.

ifconfig_e0b_znc="DHCP"

…or not. This doesn't seem to work. If anyone finds out why, please post your findings.

Going deeper

Warning: This part is a failed attempt. It is here for historical purposes. Feel free to skip.

To fix the DHCP issue, I had to dig a little deeper and replicate part of the jib script in the jail.conf file. /mpool/scripts/derive_mac.sh is taken from the derive_mac() function in jib, with minor modifications to make it a callable script. See the comments in that function to understand how it derives the MAC address for interfaces.

# Global settings
$iface = "igb0";
$bridge = "igb0bridge";

exec.clean;
exec.system_user = "root";
exec.jail_user = "root";

# setup vnet
vnet;
vnet.interface = "epair${ep}b";

exec.prestart  = "ifconfig epair${ep} create";
exec.prestart += "ifconfig $bridge addm epair${ep}a";
exec.prestart += "ifconfig epair${ep}a ether $(/mpool/scripts/derive_mac.sh $iface $name 0 a)";
exec.prestart += "ifconfig epair${ep}b ether $(/mpool/scripts/derive_mac.sh $iface $name 0 b)";

exec.poststop  = "ifconfig $bridge deletem epair${ep}a";
exec.poststop += "ifconfig epair${ep}a destroy";

exec.start += "/bin/sh /etc/rc";

exec.stop = "/bin/sh /etc/rc.shutdown";

znc {
    $ep = 1;
}

This should, in theory, work. However, with this configuration I can't get a lease from DHCP, whereas having jib set up the network (which is essentially doing the same thing) works just fine. I didn't take the time to debug more carefully.

Suffice it to say, fail #2. Back to jib.
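For readers curious about the MAC derivation mentioned above, here is a simplified illustration of the general idea: a stable, locally administered address built from the parent NIC's MAC and the jail name. This is not jib's exact algorithm. The function name, the use of POSIX cksum as the hash, and the 02/0a first octets are assumptions made for this sketch; the comments in derive_mac() remain the authoritative description.

```sh
# derive_mac_sketch <parent_mac> <jail_name> <a|b>
# Illustrative only, not jib's derive_mac(). Prints a MAC of the form
# PP:HH:HH:II:II:II where PP marks a locally administered address, HH:HH is
# a 16-bit hash of the jail name, and II:II:II comes from the parent NIC.
derive_mac_sketch() {
    parent_mac=$1
    name=$2
    side=$3

    # First octet: locally-administered bit set, multicast bit clear.
    # Using a different value per epair side is an assumption of this sketch.
    case $side in
        a) first=02 ;;
        b) first=0a ;;
        *) echo "side must be a or b" >&2; return 1 ;;
    esac

    # 16-bit hash of the jail name (low 16 bits of the POSIX cksum CRC).
    crc=$(printf '%s' "$name" | cksum | cut -d ' ' -f 1)
    hash=$(printf '%04x' $((crc % 65536)))

    # Last three octets are inherited from the parent interface.
    tail=$(printf '%s' "$parent_mac" | cut -d : -f 4-6)

    printf '%s:%s:%s:%s\n' "$first" "${hash%??}" "${hash#??}" "$tail"
}

# Example:
# derive_mac_sketch 00:25:90:ab:cd:ef znc a
```

Because the result depends only on the parent MAC and the jail name, the jail keeps the same address across restarts, which matters for DHCP reservations.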

Back to jib

Going through the iocage source, I noticed that iocage runs dhclient when a jail starts, which simplifies my setup.

exec.start = "dhclient e0b_${name}";
exec.start += "/bin/sh /etc/rc";

Putting it all together

/etc/jail.conf

exec.clean;
exec.system_user = "root";
exec.jail_user = "root";

# setup vnet
vnet;

# jib script requires this naming scheme (it's hardcoded)
vnet.interface = "e0b_${name}";

# needs to be *before* exec.start
mount.devfs;
devfs_ruleset = 11;

exec.prestart = "jib addm ${name} igb0";
exec.start = "dhclient e0b_${name}";
exec.start += "/bin/sh /etc/rc";
exec.stop = "/bin/sh /etc/rc.shutdown";
exec.poststop = "jib destroy ${name}";

exec.consolelog = "/var/log/jail_${name}_console.log";

path = "/jails/${name}";
host.hostname = "${name}";

allow.raw_sockets = 1;
allow.set_hostname = 1;

# required for postgresql and maybe some other services
allow.sysvipc = 1;

znc {
  # depend = somejail;
}

Combining bhyve and vnet networks

As one NIC cannot be a member of two separate bridges, if you're already running bridged interfaces through bhyve (which I am, through vm-bhyve), you need to create a single bridge and have both jails and vm-bhyve join the fun.

Therefore, we'll need to configure a bridged interface at boot and have both vm-bhyve and the jails join it.

/etc/rc.conf

cloned_interfaces="bridge0"
ifconfig_bridge0_name="igb0bridge"
create_args_bridge0="addm igb0"

Note that igb0bridge is the name jib expects the bridge to have (it's hardcoded). It's in the format ${iface}bridge, so check what your primary interface name is and swap igb0 with that. Once FreeBSD sets up the bridge, vm-bhyve can use this custom bridge simply by running

vm switch create -t manual -b igb0bridge public

I already had a few VMs that use public as the switch name, so I used the same name to not disrupt any of them, but you can use any name.

jib will use this bridge automatically as we named it per its expectations.
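Since the bridge name has to track the physical interface name, the rc.conf lines above can be generated rather than edited by hand. This is only a sketch: bridge_rc_conf is a hypothetical helper, and it reproduces the same three settings shown earlier with the interface name substituted.

```sh
# bridge_rc_conf <iface>: print rc.conf lines like the ones above for a
# given physical interface, naming the bridge <iface>bridge as jib expects.
# Hypothetical helper; review the output before adding it to /etc/rc.conf.
bridge_rc_conf() {
    iface=$1
    printf 'cloned_interfaces="bridge0"\n'
    printf 'ifconfig_bridge0_name="%sbridge"\n' "$iface"
    printf 'create_args_bridge0="addm %s"\n' "$iface"
}

# Example:
# bridge_rc_conf em0
```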

All done

At last, test the new setup!

jexec znc ping <yourfavoritewebsite.tld>

Et voilà!

Moar(!) jails

To create a new jail, all one needs to do is

zfs clone ztank/jails/basejail@`freebsd-version` ztank/jails/<newjail>

and add <newjail> {} to /etc/jail.conf.
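The jail.conf half of that can be scripted too. A minimal sketch, assuming each jail is declared at the start of a line as in the template above; add_jail_block is a hypothetical helper that appends an empty block only if the jail is not already defined.

```sh
# add_jail_block <jail_conf> <name>: append an empty "<name> { }" block to
# the given jail.conf unless a block with that name already exists.
# Hypothetical helper; assumes jail names start at the beginning of a line.
add_jail_block() {
    conf=$1
    name=$2
    if grep -Eq "^${name}[[:space:]]*\{" "$conf" 2>/dev/null; then
        echo "jail '${name}' is already defined in ${conf}" >&2
        return 1
    fi
    printf '\n%s {\n}\n' "$name" >> "$conf"
}

# Example (after the zfs clone above):
# add_jail_block /etc/jail.conf newjail && jail -c newjail
```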

Since I'm moving my old ezjail-based jails to the new thick jails, this would be a good time to set up a backup strategy that doesn't involve snapshotting the entire jail each time.

References