Spawning applications in the cloud has become very easy with container frameworks such as Docker. For instance, running a simple command like the following
docker run --rm -v /path/to/nginx-files:/etc/nginx nginx
spawns an NGINX web server, provided you customize the config files and supply the actual HTML files to be served.
This process inherits NGINX’s stock Docker Hub rootfs and spawns it as a Docker container on a generic Linux container host.
However, what happens when someone wants to create the absolute minimum rootfs for such a process?
In what follows, we describe two straightforward approaches that came in handy when playing with Firecracker and QEMU rootfs images on low-power devices, where resources (memory footprint, storage footprint, bandwidth, etc.) are scarce.
Option 1: base the rootfs on a minimal Linux container distro
The Firecracker documentation describes how to build a rootfs based on Alpine Linux. We will re-use the same steps for building our rootfs but, to spice it up, we will create a rootfs that runs nginx immediately upon boot.
We will use the following Dockerfile to build nginx:
FROM alpine:latest

MAINTAINER Babis Chalios <mail@bchalios.io>

# nginx release to download and build
ENV NGINX_VERSION nginx-1.17.4

WORKDIR /build
# Install the build dependencies, build nginx from source, then remove the toolchain
RUN apk --update add openssl-dev pcre-dev zlib-dev wget build-base && \
    wget http://nginx.org/download/${NGINX_VERSION}.tar.gz && \
    tar -zxvf ${NGINX_VERSION}.tar.gz && \
    cd ${NGINX_VERSION} && \
    ./configure \
        --with-http_ssl_module \
        --with-http_gzip_static_module \
        --prefix=/etc/nginx \
        --http-log-path=/var/log/nginx/access.log \
        --error-log-path=/var/log/nginx/error.log \
        --sbin-path=/usr/local/sbin/nginx && \
    make && \
    make install && \
    apk del build-base && \
    rm -rf /tmp/src && \
    rm -rf /var/cache/apk/*

WORKDIR /
We build nginx:
$ docker build -t build_static_nginx .
and use the container to build our rootfs:
$ dd if=/dev/zero of=nginx_fc.ext4 bs=1M count=100
$ mkfs.ext4 nginx_fc.ext4
$ mkdir mnt
$ sudo mount nginx_fc.ext4 mnt
$ docker run --rm -ti -v $(pwd)/mnt:/my-rootfs build_static_nginx
and in the container:
# for d in bin etc lib root sbin usr; do tar c "/$d" | tar x -C /my-rootfs; done
# for dir in dev proc run sys var; do mkdir /my-rootfs/${dir}; done
# exit
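The two loops above can be wrapped in a small function so that the copy step can be tried on scratch directories first. This is only a sketch: `populate_rootfs` is a hypothetical helper name, not something Docker or Firecracker provides, and inside the container it would be called as `populate_rootfs / /my-rootfs`.

```shell
#!/bin/sh
# Sketch: copy selected top-level directories from a source root into a
# rootfs directory, then create the empty directories used as mount points.
# populate_rootfs is a hypothetical helper name chosen for illustration.
populate_rootfs() {
    src=$1
    dst=$2
    mkdir -p "$dst"
    for d in bin etc lib root sbin usr; do
        # Skip directories that do not exist in the source tree.
        [ -d "$src/$d" ] || continue
        # The tar pipe copies the whole subtree, including symlinks.
        tar -C "$src" -cf - "$d" | tar -C "$dst" -xf -
    done
    for d in dev proc run sys var; do
        mkdir -p "$dst/$d"
    done
}

# Demo on scratch directories (no root privileges needed).
demo_src=$(mktemp -d)
demo_dst=$(mktemp -d)
mkdir -p "$demo_src/bin" "$demo_src/etc"
echo 'echo hello' > "$demo_src/bin/hello"
chmod 755 "$demo_src/bin/hello"
populate_rootfs "$demo_src" "$demo_dst"
ls "$demo_dst"
```

Using `mkdir -p` instead of plain `mkdir` keeps the function safe to re-run against a rootfs that is already partly populated.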
As the final step, we will substitute the /sbin/init binary of our rootfs with the following script, which launches the nginx binary:
$ cat nginx_init
#!/bin/sh

mkdir -p /var/log/nginx

# the init process should never exit
/usr/local/sbin/nginx -g "daemon off;"
$ sudo mv mnt/sbin/init mnt/sbin/init.old
$ sudo cp nginx_init mnt/sbin/init
$ sudo chmod a+x mnt/sbin/init
$ sudo umount mnt
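Since a missing or non-executable /sbin/init results in a kernel panic at boot, it can be worth checking the replacement script while the image is still mounted. The sketch below assumes init is a script (as it is here) rather than a binary; `check_init` is a hypothetical helper name, and with the mount point used above it would be called as `check_init mnt`.

```shell
#!/bin/sh
# check_init is a hypothetical helper: it verifies that ROOTFS/sbin/init
# exists, is executable, and starts with a shebang line.
check_init() {
    init="$1/sbin/init"
    [ -f "$init" ] || { echo "missing: $init"; return 1; }
    [ -x "$init" ] || { echo "not executable: $init"; return 1; }
    # Read the first line and make sure it names an interpreter.
    first_line=$(head -n 1 "$init")
    case $first_line in
        '#!'*) echo "ok: $init runs under ${first_line#??}" ;;
        *) echo "no shebang in $init"; return 1 ;;
    esac
}

# Demo on a scratch rootfs directory.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/sbin"
printf '#!/bin/sh\nexec /usr/local/sbin/nginx -g "daemon off;"\n' > "$demo_root/sbin/init"
chmod a+x "$demo_root/sbin/init"
check_init "$demo_root"
```

The check does not confirm that the interpreter itself (here /bin/sh) exists inside the rootfs; that still depends on the directories copied from the container.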
Option 2: ditch the rootfs, replace /sbin/init with the application
Well, this is more of a hack than an actual solution, but it comes in handy when memory and storage space are critical.
Based on an excellent tutorial by Rob Landley, we build a single kernel image that spawns an NGINX web server.
Step 1: Build the application as a static binary
There are a couple of tutorials on how to build NGINX statically. We followed this one. After a successful compilation, we end up with the following files/dirs:
objs/nginx
conf/
html/
Step 2: generate the list of files/dirs for initramfs
To embed all files in the Linux kernel build, we use the initramfs option in the kernel config: the kernel build process generates a cpio archive, appends it to the kernel binary, and uses it as the initramfs.
The first step is to create a file that lists everything to be included in the cpio archive:
$ cat > initramfs_list << EOF
dir /dev 755 0 0
nod /dev/console 644 0 0 c 5 1
nod /dev/loop0 644 0 0 b 7 0
dir /bin 755 1000 1000
dir /proc 755 0 0
dir /sys 755 0 0
dir /mnt 755 0 0
file /init usr/nginx 755 0 0
dir /opt 755 0 0
dir /opt/nginx 755 0 0
dir /opt/nginx/logs 755 0 0
dir /opt/nginx/conf 755 0 0
dir /opt/nginx/html 755 0 0
file /opt/nginx/conf/fastcgi.conf usr/conf/fastcgi.conf 644 0 0
file /opt/nginx/conf/fastcgi_params usr/conf/fastcgi_params 644 0 0
file /opt/nginx/conf/koi-utf usr/conf/koi-utf 644 0 0
file /opt/nginx/conf/koi-win usr/conf/koi-win 644 0 0
file /opt/nginx/conf/mime.types usr/conf/mime.types 644 0 0
file /opt/nginx/conf/nginx.conf usr/conf/nginx.conf 644 0 0
file /opt/nginx/conf/scgi_params usr/conf/scgi_params 644 0 0
file /opt/nginx/conf/uwsgi_params usr/conf/uwsgi_params 644 0 0
file /opt/nginx/conf/win-utf usr/conf/win-utf 644 0 0
file /opt/nginx/html/index.html usr/html/index.html 644 0 0
EOF
We then place all the necessary files in the usr/ dir of the Linux kernel build dir.
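Writing the `file` entries by hand gets tedious once the conf/ or html/ trees grow. As a rough sketch, they could be generated from the staged files instead. `gen_file_entries` is a hypothetical helper name, and the sketch assumes mode 644 and owner 0:0 for every file and file names without whitespace (the list format is whitespace-separated); the `dir` and `nod` entries still have to be written by hand.

```shell
#!/bin/sh
# gen_file_entries is a hypothetical helper: for every regular file under
# STAGING it prints a "file <target> <source> <mode> <uid> <gid>" line in
# the format used by the initramfs list.
gen_file_entries() {
    staging=$1   # directory holding the files on the build host, e.g. usr/conf
    prefix=$2    # where the files should appear inside the initramfs
    find "$staging" -type f | sort | while read -r path; do
        rel=${path#"$staging"/}
        echo "file $prefix/$rel $path 644 0 0"
    done
}

# Demo in a scratch directory that mimics the usr/conf layout.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/usr/conf"
: > "$demo_dir/usr/conf/nginx.conf"
: > "$demo_dir/usr/conf/mime.types"
(cd "$demo_dir" && gen_file_entries usr/conf /opt/nginx/conf)
# prints:
# file /opt/nginx/conf/mime.types usr/conf/mime.types 644 0 0
# file /opt/nginx/conf/nginx.conf usr/conf/nginx.conf 644 0 0
```

The source paths are printed relative to the current directory, so the function is meant to be run from the kernel build dir, where the list is later resolved.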
Step 3: build the kernel
$ git clone https://github.com/torvalds/linux.git
$ cd linux
$ git checkout v4.14
$ wget https://raw.githubusercontent.com/firecracker-microvm/firecracker/b1e48beaea7b917fef1e4846f1d75a2c1a136517/resources/microvm-kernel-aarch64-config -O .config
$ make oldconfig
In this step, we can select any kernel version we want, starting from version 4.14, e.g. git checkout v4.20.
The only thing left is to add the option CONFIG_INITRAMFS_SOURCE="usr/initramfs_list" to the kernel config file, pointing it to where our initramfs_list file is. A simple make builds everything, and we end up with arch/arm64/boot/Image, which contains the Linux kernel as well as the initramfs.cpio.gz archive with the files we specified.
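Editing .config by hand works, but the change can also be scripted. The following is a minimal sketch, assuming a plain-text .config in the usual `KEY=value` format; `set_initramfs_source` is a hypothetical helper name, and the demo runs against a throwaway file rather than a real kernel tree.

```shell
#!/bin/sh
# set_initramfs_source is a hypothetical helper that replaces (or appends)
# the CONFIG_INITRAMFS_SOURCE line of a kernel .config file.
set_initramfs_source() {
    config=$1
    list=$2
    tmp=$(mktemp)
    # Drop any existing setting, including the "is not set" comment form.
    grep -v -e '^CONFIG_INITRAMFS_SOURCE=' \
            -e '^# CONFIG_INITRAMFS_SOURCE is not set' "$config" > "$tmp"
    printf 'CONFIG_INITRAMFS_SOURCE="%s"\n' "$list" >> "$tmp"
    # Write back in place so the permissions of the config file are kept.
    cat "$tmp" > "$config"
    rm -f "$tmp"
}

# Demo on a throwaway config file.
demo_config=$(mktemp)
printf 'CONFIG_BLK_DEV_INITRD=y\nCONFIG_INITRAMFS_SOURCE=""\n' > "$demo_config"
set_initramfs_source "$demo_config" usr/initramfs_list
grep INITRAMFS_SOURCE "$demo_config"
```

In a real tree this would be called as `set_initramfs_source .config usr/initramfs_list`; running `make olddefconfig` afterwards should let the build system fill in any options that depend on the new value. The kernel tree also ships a `scripts/config` tool whose `--set-str` option is intended for this kind of edit.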
And voilà! Here’s a dump of this Image booting on Firecracker on an RPi4:
# ./build/cargo_target/aarch64-unknown-linux-musl/release/firecracker --api-sock /tmp/fireananos.sock
[ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd083]
[ 0.000000] Linux version 5.4.0-rc3 (root@baremetal.nubificus.com) (gcc version 6.3.0 20170516 (Debian 6.3.0-18)) #2 SMP PREEMPT Sat Oct 19 06:13:48 CDT 2019
[ 0.000000] Machine model: linux,dummy-virt
[ 0.000000] PCI: Unknown option `off'
[ 0.000000] earlycon: uart0 at MMIO 0x0000000040003000 (options '')
[ 0.000000] printk: bootconsole [uart0] enabled
[ 0.000000] efi: Getting EFI parameters from FDT:
[ 0.000000] efi: UEFI not found.
[ 0.000000] cma: Reserved 16 MiB at 0x0000000086c00000
[ 0.000000] NUMA: No NUMA configuration found
[ 0.000000] NUMA: Faking a node at [mem 0x0000000080000000-0x0000000087ffffff]
[ 0.000000] NUMA: NODE_DATA [mem 0x87fb9800-0x87fbafff]
[ 0.000000] Zone ranges:
[ 0.000000] DMA32 [mem 0x0000000080000000-0x0000000087ffffff]
[ 0.000000] Normal empty
[ 0.000000] Movable zone start for each node
[ 0.000000] Early memory node ranges
[ 0.000000] node 0: [mem 0x0000000080000000-0x0000000087ffffff]
[ 0.000000] Initmem setup node 0 [mem 0x0000000080000000-0x0000000087ffffff]
[ 0.000000] psci: probing for conduit method from DT.
[ 0.000000] psci: PSCIv1.0 detected in firmware.
[ 0.000000] psci: Using standard PSCI v0.2 function IDs
[ 0.000000] psci: Trusted OS migration not required
[ 0.000000] psci: SMC Calling Convention v1.1
[ 0.000000] percpu: Embedded 22 pages/cpu s52760 r8192 d29160 u90112
[ 0.000000] Detected PIPT I-cache on CPU0
[ 0.000000] CPU features: detected: EL2 vector hardening
[ 0.000000] ARM_SMCCC_ARCH_WORKAROUND_1 missing from firmware
[ 0.000000] Built 1 zonelists, mobility grouping on. Total pages: 32256
[ 0.000000] Policy zone: DMA32
[ 0.000000] Kernel command line: console=ttyS0 root=/dev/vda reboot=k panic=1 pci=off ip=10.0.0.2::10.0.0.1:255.255.255.0::eth0:off net.ifnames=0 root=/dev/vda rw earlycon=uart,mmio,0x40003000
[ 0.000000] Dentry cache hash table entries: 16384 (order: 5, 131072 bytes, linear)
[ 0.000000] Inode-cache hash table entries: 8192 (order: 4, 65536 bytes, linear)
[ 0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
[ 0.000000] Memory: 72448K/131072K available (11004K kernel code, 1588K rwdata, 5552K rodata, 6656K init, 381K bss, 42240K reserved, 16384K cma-reserved)
[ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
[ 0.000000] rcu: Preemptible hierarchical RCU implementation.
[ 0.000000] rcu: RCU restricting CPUs from NR_CPUS=64 to nr_cpu_ids=1.
[ 0.000000] Tasks RCU enabled.
[ 0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
[ 0.000000] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=1
[ 0.000000] NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
[ 0.000000] random: get_random_bytes called from start_kernel+0x2bc/0x45c with crng_init=0
[ 0.000000] arch_timer: cp15 timer(s) running at 54.00MHz (virt).
[ 0.000000] clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles: 0xc743ce346, max_idle_ns: 440795203123 ns
[ 0.000004] sched_clock: 56 bits at 54MHz, resolution 18ns, wraps every 4398046511102ns
[ 0.003165] Console: colour dummy device 80x25
[ 0.004704] Calibrating delay loop (skipped), value calculated using timer frequency.. 108.00 BogoMIPS (lpj=216000)
[ 0.008179] pid_max: default: 32768 minimum: 301
[ 0.009775] LSM: Security Framework initializing
[ 0.011308] Mount-cache hash table entries: 512 (order: 0, 4096 bytes, linear)
[ 0.013772] Mountpoint-cache hash table entries: 512 (order: 0, 4096 bytes, linear)
[ 0.042261] ASID allocator initialised with 32768 entries
[ 0.055248] rcu: Hierarchical SRCU implementation.
[ 0.068193] EFI services will not be available.
[ 0.080179] smp: Bringing up secondary CPUs ...
[ 0.083868] smp: Brought up 1 node, 1 CPU
[ 0.087308] SMP: Total of 1 processors activated.
[ 0.091120] CPU features: detected: 32-bit EL0 Support
[ 0.093094] CPU features: detected: CRC32 instructions
[ 0.095989] CPU: All CPU(s) started at EL1
[ 0.097545] alternatives: patching kernel code
[ 0.099975] devtmpfs: initialized
[ 0.106349] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[ 0.109513] futex hash table entries: 256 (order: 2, 16384 bytes, linear)
[ 0.112244] pinctrl core: initialized pinctrl subsystem
[ 0.115499] DMI not present or invalid.
[ 0.121449] NET: Registered protocol family 16
[ 0.125835] DMA: preallocated 256 KiB pool for atomic allocations
[ 0.128200] audit: initializing netlink subsys (disabled)
[ 0.133094] cpuidle: using governor menu
[ 0.134911] audit: type=2000 audit(0.116:1): state=initialized audit_enabled=0 res=1
[ 0.138336] hw-breakpoint: found 6 breakpoint and 4 watchpoint registers.
[ 0.143418] Serial: AMBA PL011 UART driver
[ 0.190417] HugeTLB registered 1.00 GiB page size, pre-allocated 0 pages
[ 0.192858] HugeTLB registered 32.0 MiB page size, pre-allocated 0 pages
[ 0.196031] HugeTLB registered 2.00 MiB page size, pre-allocated 0 pages
[ 0.198512] HugeTLB registered 64.0 KiB page size, pre-allocated 0 pages
[ 0.205111] cryptd: max_cpu_qlen set to 1000
[ 0.217815] ACPI: Interpreter disabled.
[ 0.224317] iommu: Default domain type: Translated
[ 0.227342] vgaarb: loaded
[ 0.228696] SCSI subsystem initialized
[ 0.231005] usbcore: registered new interface driver usbfs
[ 0.233680] usbcore: registered new interface driver hub
[ 0.236052] usbcore: registered new device driver usb
[ 0.238306] pps_core: LinuxPPS API ver. 1 registered
[ 0.240438] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
[ 0.243424] PTP clock support registered
[ 0.245042] EDAC MC: Ver: 3.0.0
[ 0.247254] Advanced Linux Sound Architecture Driver Initialized.
[ 0.250766] clocksource: Switched to clocksource arch_sys_counter
[ 0.253053] VFS: Disk quotas dquot_6.6.0
[ 0.254530] VFS: Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
[ 0.257177] pnp: PnP ACPI: disabled
[ 0.268272] thermal_sys: Registered thermal governor 'step_wise'
[ 0.268275] thermal_sys: Registered thermal governor 'power_allocator'
[ 0.271870] NET: Registered protocol family 2
[ 0.276694] tcp_listen_portaddr_hash hash table entries: 256 (order: 0, 4096 bytes, linear)
[ 0.279823] TCP established hash table entries: 1024 (order: 1, 8192 bytes, linear)
[ 0.282285] TCP bind hash table entries: 1024 (order: 2, 16384 bytes, linear)
[ 0.284760] TCP: Hash tables configured (established 1024 bind 1024)
[ 0.287067] UDP hash table entries: 256 (order: 1, 8192 bytes, linear)
[ 0.289706] UDP-Lite hash table entries: 256 (order: 1, 8192 bytes, linear)
[ 0.292468] NET: Registered protocol family 1
[ 0.307079] RPC: Registered named UNIX socket transport module.
[ 0.313313] RPC: Registered udp transport module.
[ 0.315397] RPC: Registered tcp transport module.
[ 0.317154] RPC: Registered tcp NFSv4.1 backchannel transport module.
[ 0.319655] PCI: CLS 0 bytes, default 64
[ 0.504111] kvm [1]: HYP mode not available
[ 0.507069] Initialise system trusted keyrings
[ 0.510147] workingset: timestamp_bits=44 max_order=15 bucket_order=0
[ 0.526272] squashfs: version 4.0 (2009/01/31) Phillip Lougher
[ 0.530011] NFS: Registering the id_resolver key type
[ 0.532603] Key type id_resolver registered
[ 0.534191] Key type id_legacy registered
[ 0.535985] nfs4filelayout_init: NFSv4 File Layout Driver Registering...
[ 0.538740] 9p: Installing v9fs 9p2000 file system support
[ 0.550468] Key type asymmetric registered
[ 0.552206] Asymmetric key parser 'x509' registered
[ 0.554107] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 246)
[ 0.557229] io scheduler mq-deadline registered
[ 0.558966] io scheduler kyber registered
[ 0.574957] EINJ: ACPI disabled.
[ 0.589178] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
[ 0.596662] printk: console [ttyS0] disabled
[ 0.599695] 40003000.uart: ttyS0 at MMIO 0x40003000 (irq = 7, base_baud = 1500000) is a 16550A
[ 0.603515] printk: console [ttyS0] enabled
[ 0.603515] printk: console [ttyS0] enabled
[ 0.607076] printk: bootconsole [uart0] disabled
[ 0.607076] printk: bootconsole [uart0] disabled
[ 0.611276] SuperH (H)SCI(F) driver initialized
[ 0.613660] msm_serial: driver initialized
[ 0.615984] cacheinfo: Unable to detect cache hierarchy for CPU 0
[ 0.631690] loop: module loaded
[ 0.634019] virtio_blk virtio0: [vda] 204800 512-byte logical blocks (105 MB/100 MiB)
[ 0.648631] libphy: Fixed MDIO Bus: probed
[ 0.652999] tun: Universal TUN/TAP device driver, 1.6
[ 0.658605] thunder_xcv, ver 1.0
[ 0.660878] thunder_bgx, ver 1.0
[ 0.662279] nicpf, ver 1.0
[ 0.664058] e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
[ 0.666284] e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
[ 0.669114] igb: Intel(R) Gigabit Ethernet Network Driver - version 5.6.0-k
[ 0.671961] igb: Copyright (c) 2007-2014 Intel Corporation.
[ 0.674176] igbvf: Intel(R) Gigabit Virtual Function Network Driver - version 2.4.0-k
[ 0.677465] igbvf: Copyright (c) 2009 - 2012 Intel Corporation.
[ 0.680142] sky2: driver version 1.30
[ 0.682282] VFIO - User Level meta-driver version: 0.3
[ 0.690342] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[ 0.693832] ehci-pci: EHCI PCI platform driver
[ 0.695936] ehci-platform: EHCI generic platform driver
[ 0.698111] ehci-orion: EHCI orion driver
[ 0.700178] ehci-exynos: EHCI EXYNOS driver
[ 0.701992] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
[ 0.704917] ohci-pci: OHCI PCI platform driver
[ 0.706835] ohci-platform: OHCI generic platform driver
[ 0.709036] ohci-exynos: OHCI EXYNOS driver
[ 0.711184] usbcore: registered new interface driver usb-storage
[ 0.716697] rtc-pl031 40004000.rtc: registered as rtc0
[ 0.720525] i2c /dev entries driver
[ 0.725890] sdhci: Secure Digital Host Controller Interface driver
[ 0.729845] sdhci: Copyright(c) Pierre Ossman
[ 0.732261] Synopsys Designware Multimedia Card Interface Driver
[ 0.735862] sdhci-pltfm: SDHCI platform and OF driver helper
[ 0.739259] ledtrig-cpu: registered to indicate activity on CPUs
[ 0.743495] usbcore: registered new interface driver usbhid
[ 0.745688] usbhid: USB HID core driver
[ 0.751079] NET: Registered protocol family 17
[ 0.754152] 9pnet: Installing 9P2000 support
[ 0.756397] Key type dns_resolver registered
[ 0.758475] registered taskstats version 1
[ 0.760525] Loading compiled-in X.509 certificates
[ 0.763104] rtc-pl031 40004000.rtc: setting system clock to 2019-10-19T14:46:13 UTC (1571496373)
[ 0.786860] IP-Config: Complete:
[ 0.790000] device=eth0, hwaddr=c2:6b:5f:69:cc:12, ipaddr=10.0.0.2, mask=255.255.255.0, gw=10.0.0.1
[ 0.800389] host=10.0.0.2, domain=, nis-domain=(none)
[ 0.802853] bootserver=255.255.255.255, rootserver=255.255.255.255, rootpath=
[ 0.806010] ALSA device list:
[ 0.807295] No soundcards found.
[ 0.816070] Freeing unused kernel memory: 6656K
[ 0.817939] Run /init as init process
And here is the response to a GET / request:
$ curl 172.16.0.42
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>