Getting started with Firecracker

I started to play with Firecracker, a microVM manager written in Rust and open-sourced by Amazon.

The idea behind Firecracker is to get the best of both worlds: the fast startup and low overhead of containers, and the isolation of real VMs.

AWS uses Firecracker as the foundation for serverless services like AWS Lambda and AWS Fargate.

Resources

Firecracker ships good documentation in its source repository; I particularly recommend the getting started guide.

Julia Evans also wrote a very helpful article, Firecracker: start a VM in less than a second, as part of her Recurse Center batch.

Radek Gruchalski also published articles about his Firecracker journey on his blog, in particular Launching Alpine Linux on Firecracker like a boss.

Generate your own rootfs from a Docker image

Once I had installed Firecracker, I wanted to generate my own rootfs from an existing Docker image.

The idea was:

use docker export to export an image tarball
prepare an ext4 image and mount it in the filesystem
extract the docker tarball into the mounted directory
chroot into the mounted directory and run post-install commands
unmount the ext4 and resize it

Sounds easy. The first thing our Alpine image needs is an SSH server, so we can log in and test that it works as expected.

Here is a basic shell script rootfs_docker.sh I wrote to create an image from a docker container.

#!/usr/bin/env sh
set -e -x

# Create a rootfs file from docker container
# See https://github.com/anyfiddle/firecracker-rootfs-builder/blob/main/create-rootfs.sh
# usage: ./rootfs_docker.sh <docker image> <output name>

docker_image=$1
output_name=${2:-image.ext4}
mntdir=/tmp/rootfs

# create and export docker container
container_export=/tmp/rootfs.tar

echo "create and export docker container from ${docker_image} into ${container_export}"

rm -fr $container_export
containerId=$(docker container create $docker_image)
docker export $containerId > $container_export
docker container rm $containerId

# create a mounted ext4 file

output_dir=/tmp/rootfs_output
output=${output_dir}/${output_name}

echo "create a mounted ext4 file ${output}"

# prepare output
mkdir -p $output_dir

# create empty image
rm -f ${output}
truncate -s 100M ${output}
/usr/sbin/mkfs.ext4 ${output}

# mount the image
rm -rf $mntdir
mkdir -p $mntdir
sudo mount -o loop $output $mntdir

# export docker container into a mounted ext4 file
echo "extract the docker container into ${output}"

sudo tar -C $mntdir -xf $container_export
# delete the docker export
rm -fr $container_export

# prepare the rootfs
init=init.sh
sudo mount -t proc /proc ${mntdir}/proc/
sudo mount -t sysfs /sys ${mntdir}/sys/
sudo mount -o bind /dev ${mntdir}/dev/

sudo cp $init $mntdir
sudo chroot $mntdir /bin/sh $init
sudo rm ${mntdir}/${init}

sudo umount ${mntdir}/dev
sudo umount ${mntdir}/proc
sudo umount ${mntdir}/sys

# unmount the image
sudo umount $mntdir
rm -fr $mntdir

# resize image
/usr/sbin/resize2fs -M $output

# check image fs
/usr/sbin/e2fsck -y -f $output

echo "rootfs ready: ${output}"

For now, my init.sh file is pretty simple.

#!/bin/sh

cat << 'EOF' > /etc/resolv.conf
nameserver 8.8.8.8
EOF

apk add --update --no-cache --initdb alpine-baselayout apk-tools busybox \
    ca-certificates util-linux dhcpcd \
    openssh \
    openrc
rm -rf /var/cache/apk/*

# Setting up the agetty service
# see https://github.com/OpenRC/openrc/blob/master/agetty-guide.md
ln -s agetty /etc/init.d/agetty.ttyS0
echo ttyS0 > /etc/securetty
rc-update add agetty.ttyS0

rc-update add procfs
rc-update add sysfs
rc-update add local
rc-update add sshd

echo "root:root" | chpasswd

sed -i 's/root:!/root:*/' /etc/shadow

ssh-keygen -A

# copy your own ssh public key here
KEY='<REDACTED>'

mkdir -p /root/.ssh
chmod 0700 /root/.ssh
echo $KEY > /root/.ssh/authorized_keys

cat << 'EOF' > /etc/hosts
127.0.0.1	localhost
::1	localhost ip6-localhost ip6-loopback
fe00::0	ip6-localnet
ff00::0	ip6-mcastprefix
ff02::1	ip6-allnodes
ff02::2	ip6-allrouters
EOF

We had to prepare some useful things:

Install openrc, the init system used in Alpine.
Install openssh and start it at boot using rc-update.
Run chpasswd to set up a root password.
Allow my hardcoded SSH public key to connect as root by adding it to /root/.ssh/authorized_keys.
Fill /etc/resolv.conf and /etc/hosts, because the Docker image leaves them unset.
Generate the server’s host keys with ssh-keygen -A.
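
Before unmounting the image, it can help to check that these steps really left their files behind. Here is a minimal sketch of such a check. The helper name check_rootfs and the exact file list are my own assumptions for illustration (in particular, I assume ssh-keygen -A produced an ed25519 host key and that rc-update linked sshd into the default runlevel).

```shell
#!/usr/bin/env sh
# Sketch: report which of the files init.sh is expected to create are missing.
# `check_rootfs` and the file list below are assumptions, not part of the scripts above.
check_rootfs() {
    root=$1
    missing=0
    for path in \
        etc/resolv.conf \
        etc/hosts \
        root/.ssh/authorized_keys \
        etc/ssh/ssh_host_ed25519_key \
        etc/runlevels/default/sshd
    do
        # -L also accepts symlinks whose absolute target only resolves inside the guest
        if [ ! -e "${root}/${path}" ] && [ ! -L "${root}/${path}" ]; then
            echo "missing: ${path}"
            missing=1
        fi
    done
    return "$missing"
}
```

It could be called as check_rootfs "$mntdir" right before the final umount in rootfs_docker.sh. It has to run as root, since /root/.ssh is only readable by root.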

Now we have to run the script:

$ ./rootfs_docker.sh alpine:3.14 alpine.ext4

Then we can move the new image into the working directory:

$ mv /tmp/rootfs_output/alpine.ext4 .

Start the VM with Firecracker

Since we have our nice image, let’s run it!

First I created a daemon.sh script to start the Firecracker daemon.

#!/bin/sh
set -x -e

API_SOCKET="/tmp/firecracker.socket"

# Remove API unix socket
rm -f $API_SOCKET

# Run firecracker
./release-v1.3.3-x86_64/firecracker-v1.3.3-x86_64 --api-sock $API_SOCKET

I started it with ./daemon.sh. The server listens on the /tmp/firecracker.socket socket for commands. It also displays the logs and the login prompt.
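
Any script that talks to the daemon assumes the socket already exists. If you ever chain the two steps, a small wait loop avoids a race. This is only a sketch: wait_for_socket is a hypothetical helper and the default of 5 tries is arbitrary.

```shell
#!/usr/bin/env sh
# Sketch: wait until the Firecracker API socket exists, or give up.
# `wait_for_socket` is a hypothetical helper; the default of 5 tries is arbitrary.
wait_for_socket() {
    socket=$1
    tries=${2:-5}
    while [ "$tries" -gt 0 ]; do
        # -S is true only for an existing unix socket
        [ -S "$socket" ] && return 0
        sleep 1
        tries=$((tries - 1))
    done
    echo "timeout waiting for ${socket}" >&2
    return 1
}

# Usage (not executed here):
#   wait_for_socket /tmp/firecracker.socket 10
```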

Next, I created a start.sh file to set up the network and send commands to the Firecracker server, mostly inspired by Julia’s version.

#!/bin/sh
set -x -e

TAP_DEV="tap0"
FC_IP="192.168.20.2"
TAP_IP="192.168.20.1"
MASK_SHORT="/24"
MASK_LONG="255.255.255.0"

# Setup network interface
sudo ip link del "$TAP_DEV" 2> /dev/null || true
sudo ip tuntap add dev "$TAP_DEV" mode tap

sudo ip addr add "${TAP_IP}${MASK_SHORT}" dev "$TAP_DEV"

# sudo brctl addif docker0 $TAP_DEV

sudo ip link set dev "$TAP_DEV" up
sudo sysctl -w net.ipv4.conf.${TAP_DEV}.proxy_arp=1 > /dev/null
sudo sysctl -w net.ipv6.conf.${TAP_DEV}.disable_ipv6=1 > /dev/null

# Enable ip forwarding
sudo sh -c "echo 1 > /proc/sys/net/ipv4/ip_forward"

# Use your own host device here to connect to the net.
OUT_DEV=<REDACTED>

# Set up microVM internet access
sudo iptables -t nat -D POSTROUTING -o $OUT_DEV -j MASQUERADE || true
sudo iptables -D FORWARD -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT \
    || true
sudo iptables -D FORWARD -i $TAP_DEV -o $OUT_DEV -j ACCEPT || true
sudo iptables -t nat -A POSTROUTING -o $OUT_DEV -j MASQUERADE
sudo iptables -I FORWARD 1 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
sudo iptables -I FORWARD 1 -i $TAP_DEV -o $OUT_DEV -j ACCEPT

API_SOCKET="/tmp/firecracker.socket"
LOGFILE="./firecracker.log"

# Create log file
touch $LOGFILE

# Set log file
curl -X PUT --unix-socket "${API_SOCKET}" \
    --data "{
        \"log_path\": \"${LOGFILE}\",
        \"level\": \"Debug\",
        \"show_level\": true,
        \"show_log_origin\": true
    }" \
    "http://localhost/logger"

KERNEL="./vmlinux-5.10.bin"
KERNEL_BOOT_ARGS="ro console=ttyS0 noapic reboot=k panic=1 pci=off nomodules load_modules=off random.trust_cpu=on"
KERNEL_BOOT_ARGS="${KERNEL_BOOT_ARGS} ip=${FC_IP}::${TAP_IP}:${MASK_LONG}::eth0:off"

ARCH=$(uname -m)

if [ "${ARCH}" = "aarch64" ]; then
    KERNEL_BOOT_ARGS="keep_bootcon ${KERNEL_BOOT_ARGS}"
fi

# Set boot source
curl -X PUT --unix-socket "${API_SOCKET}" \
    --data "{
        \"kernel_image_path\": \"${KERNEL}\",
        \"boot_args\": \"${KERNEL_BOOT_ARGS}\"
    }" \
    "http://localhost/boot-source"

ROOTFS="./alpine.ext4"

# Set rootfs
curl -X PUT --unix-socket "${API_SOCKET}" \
    --data "{
        \"drive_id\": \"rootfs\",
        \"path_on_host\": \"${ROOTFS}\",
        \"is_root_device\": true,
        \"is_read_only\": false
    }" \
    "http://localhost/drives/rootfs"

# Firecracker's stock rootfs derives the guest IP from the MAC address with
# `fcnet-setup.sh`. This rootfs does not ship that script: the IP comes from
# the `ip=` boot argument above, so the MAC only has to be a valid address.
FC_MAC="06:00:AC:10:00:02"

# Set network interface
curl -X PUT --unix-socket "${API_SOCKET}" \
    --data "{
        \"iface_id\": \"eth0\",
        \"guest_mac\": \"$FC_MAC\",
        \"host_dev_name\": \"$TAP_DEV\"
    }" \
    "http://localhost/network-interfaces/eth0"

# API requests are handled asynchronously, it is important the configuration is
# set, before `InstanceStart`.
sleep 0.015s

# Start microVM
curl -X PUT --unix-socket "${API_SOCKET}" \
    --data "{
        \"action_type\": \"InstanceStart\"
    }" \
    "http://localhost/actions"

# API requests are handled asynchronously, it is important the microVM has been
# started before we attempt to SSH into it.
sleep 0.015s
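
A note on FC_MAC: the fcnet-setup.sh convention mentioned in the script comment reads the guest IP from the last four bytes of the MAC address, and 06:00:AC:10:00:02 encodes 172.16.0.2, not the 192.168.20.2 used here. It does not matter in this setup because the IP is set through the ip= boot argument. If you do rely on that convention, a matching MAC can be computed as in this sketch (ip_to_mac is a hypothetical helper of mine):

```shell
#!/usr/bin/env sh
# Sketch: build a MAC address that encodes an IPv4 address in its last four bytes,
# the convention fcnet-setup.sh relies on. `ip_to_mac` is a hypothetical helper.
ip_to_mac() {
    # split a.b.c.d on dots into the positional parameters
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    printf '06:00:%02X:%02X:%02X:%02X\n' "$1" "$2" "$3" "$4"
}

ip_to_mac 192.168.20.2   # prints 06:00:C0:A8:14:02
```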

Be careful: OUT_DEV must be set to the name of the network device the host uses to reach the internet.
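
If you do not know that device name, it can usually be read from the default route. A minimal sketch, assuming the host has a single default route; parse_default_dev is a hypothetical helper that only parses the output of ip route show default:

```shell
#!/usr/bin/env sh
# Sketch: extract the interface name from `ip route show default` output.
# `parse_default_dev` is a hypothetical helper; it prints the word after "dev"
# on the first line that has one.
parse_default_dev() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "dev") { print $(i + 1); exit } }'
}

# Usage on the host (not executed here):
#   OUT_DEV=$(ip route show default | parse_default_dev)
```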

I decided to use the prebuilt vmlinux-5.10.bin kernel instead of building my own, but you can also pick one of the other kernels available here. This fixed the issues I had with the random number generator on older kernels like the one used by Julia.
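
The guest network is configured by the kernel itself through the ip= boot argument, whose documented layout is ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf>. The sketch below rebuilds the value used in start.sh with the server and hostname fields left empty; kernel_ip_arg is a hypothetical helper name.

```shell
#!/usr/bin/env sh
# Sketch: compose the kernel `ip=` argument used in start.sh.
# Field order: client-ip:server-ip:gw-ip:netmask:hostname:device:autoconf
# `kernel_ip_arg` is a hypothetical helper; server-ip and hostname stay empty.
kernel_ip_arg() {
    client_ip=$1
    gateway_ip=$2
    netmask=$3
    device=${4:-eth0}
    printf 'ip=%s::%s:%s::%s:off\n' "$client_ip" "$gateway_ip" "$netmask" "$device"
}

kernel_ip_arg 192.168.20.2 192.168.20.1 255.255.255.0
# prints ip=192.168.20.2::192.168.20.1:255.255.255.0::eth0:off
```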

Once you have started ./start.sh, you should see some logs displayed in the daemon shell, followed by a login prompt. You can use root as login and root as password and play with your new VM 🎉.
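
If nothing shows up, one possible cause is a rejected API call: the curl calls in start.sh do not check the HTTP status, so set -e never sees the failure. A sketch of a wrapper that fails loudly, assuming the same socket path as above (fc_put is a hypothetical helper, not part of Firecracker):

```shell
#!/usr/bin/env sh
# Sketch: one wrapper for the PUT requests in start.sh.
# `fc_put` is a hypothetical helper, not part of Firecracker.
API_SOCKET="${API_SOCKET:-/tmp/firecracker.socket}"

fc_put() {
    endpoint=$1
    payload=$2
    # --fail makes curl exit non-zero on HTTP 4xx/5xx, so `set -e` aborts the script.
    # Note that --fail also hides the response body returned by the server.
    curl --fail --silent --show-error \
        -X PUT --unix-socket "$API_SOCKET" \
        -H 'Content-Type: application/json' \
        --data "$payload" \
        "http://localhost${endpoint}"
}

# Usage (not executed here):
#   fc_put /actions '{"action_type": "InstanceStart"}'
```

Single-quoted JSON payloads also avoid the backslash-escaped quotes, at the cost of building the string separately when a shell variable has to be interpolated.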

You can run reboot to shut down your VM.

Problem: I can’t connect to ssh as expected

After I started the VM, I wasn’t able to connect to the SSH server, even though Nmap said the port was open.

$ nmap 192.168.20.2 22
Starting Nmap 7.93 ( https://nmap.org ) at 2023-07-31 12:19 CEST
Nmap scan report for 192.168.20.2
Host is up (0.00013s latency).
Not shown: 999 closed tcp ports (conn-refused)
PORT   STATE SERVICE
22/tcp open  ssh

Nmap done: 2 IP addresses (1 host up) scanned in 1.47 seconds

But my ssh client hung. Use your own SSH key in the command below.

$ ssh -o 'IdentitiesOnly=yes' -o 'StrictHostKeyChecking=no' -i ~/.ssh/<REDACTED> root@192.168.20.2
Password authentication is disabled to avoid man-in-the-middle attacks.
Keyboard-interactive authentication is disabled to avoid man-in-the-middle attacks.
PTY allocation request failed on channel 0

According to this SO thread, this appears to be due to a missing /dev/pts in the VM.

$ ls /dev/pts
ls: /dev/pts: No such file or directory

So I tried to mount it from the VM console.

$ mkdir /dev/pts && mount devpts /dev/pts -t devpts

Now my ssh works too!

$ ssh -o 'IdentitiesOnly=yes' -o 'StrictHostKeyChecking=no' -i ~/.ssh/plouf root@192.168.20.2
Password authentication is disabled to avoid man-in-the-middle attacks.
Keyboard-interactive authentication is disabled to avoid man-in-the-middle attacks.
Welcome to Alpine!

The Alpine Wiki contains a large amount of how-to guides and general
information about administrating Alpine systems.
See <http://wiki.alpinelinux.org/>.

You can setup the system with the command: setup-alpine

You may change this message by editing /etc/motd.

192:~#

To mount devpts automatically at boot, we can add this to our rootfs script:

rc-update add devfs

But it doesn’t work as expected :( I can’t figure out why… Maybe it’s due to docker container image specifics? Maybe this GH issue is the explanation?
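
One workaround I have not verified on this image would be to repeat the manual mount at boot through OpenRC's local service, which init.sh already enables and which runs the executable /etc/local.d/*.start files. A sketch, where install_devpts_hook and the hook file name devpts.start are my own assumptions:

```shell
#!/usr/bin/env sh
# Sketch: install an OpenRC local.d hook that mounts devpts at boot.
# `install_devpts_hook` and the file name devpts.start are hypothetical.
# Pass the rootfs directory, or an empty string when already inside the chroot.
install_devpts_hook() {
    rootfs=$1
    hook="${rootfs}/etc/local.d/devpts.start"
    mkdir -p "${rootfs}/etc/local.d"
    cat > "$hook" << 'EOF'
#!/bin/sh
# mount devpts unless something already did
mkdir -p /dev/pts
mountpoint -q /dev/pts || mount -t devpts devpts /dev/pts
EOF
    chmod +x "$hook"
}

# Usage inside init.sh, which runs in the chroot (not executed here):
#   install_devpts_hook ""
```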

Create rootfs with alpine-make-rootfs

Instead of using an Alpine container image, I now create my rootfs with the alpine-make-rootfs script.

#!/usr/bin/env sh
set -e -x

# Create a rootfs file by using alpine-make-rootfs
# See https://github.com/alpinelinux/alpine-make-rootfs

alpine_release='v3.18'
output_name=alpine.${alpine_release}.ext4
mntdir=/tmp/rootfs

# create a mounted ext4 file

output_dir=/tmp/rootfs_output
output=${output_dir}/${output_name}

echo "create a mounted ext4 file ${output}"

# prepare output
mkdir -p $output_dir

# create empty image
rm -f ${output}
truncate -s 100M ${output}
/usr/sbin/mkfs.ext4 ${output}

# mount the image
rm -rf $mntdir
mkdir -p $mntdir
sudo mount -o loop $output $mntdir

echo "run alpine-make-rootfs"

sudo ./alpine-make-rootfs \
    --branch ${alpine_release} \
    --script-chroot \
    --packages='ca-certificates util-linux openssh dhcpcd openrc udev-init-scripts-openrc' \
    ${mntdir} - <<'SHELL'
ssh-keygen -A

# Setting up the agetty service
# see https://github.com/OpenRC/openrc/blob/master/agetty-guide.md
ln -s agetty /etc/init.d/agetty.ttyS0
echo ttyS0 > /etc/securetty
rc-update add agetty.ttyS0

rc-update add devfs sysinit
rc-update add procfs sysinit
rc-update add sysfs sysinit
rc-update add local
rc-update add sshd

echo "root:root" | chpasswd

# copy your own ssh public key here
KEY='<REDACTED>'

mkdir -p /root/.ssh
chmod 0700 /root/.ssh
echo $KEY > /root/.ssh/authorized_keys

# no modules
rm -f /etc/init.d/modules
SHELL

# unmount the image
sudo umount $mntdir
rm -fr $mntdir

# check image fs
/usr/sbin/e2fsck -y -f $output

# resize image
/usr/sbin/resize2fs -M $output

# check image fs
/usr/sbin/e2fsck -y -f $output

echo "rootfs ready: ${output}"

And this time, everything worked as expected 👯 I can login via ssh!

The only drawback is that my ext4 image is now bigger: 49 MB with the Docker image vs 59 MB with alpine-make-rootfs.

What’s next?

Try to use Jailer
What about logs?
Manage disks
Reduce rootfs size
How to boot faster (it takes ~0.9s to boot right now)