Isolated, hardware-accelerated functions on Jetson AGX Orin

Following up on a successful VM boot on a Jetson AGX Orin, we continue exploring the capabilities of this edge device, focusing on the cloud-native aspect of application deployment.

As a team, we’ve built vAccel, a hardware acceleration framework that decouples the operation from its hardware implementation. One of the awesome things vAccel offers is the ability to execute hardware-accelerated applications in a VM that has no direct access to a hardware accelerator. Given the Jetson Orin board has an Ampere GPU, with 1792 cores and 56 Tensor Cores, it sounds like a perfect edge device to try isolated hardware-accelerated workload execution through VM sandboxes.

Additionally, coupled with our downstream kata-containers port, we can invoke a VM sandboxed container through CRI that can execute compute-intensive tasks faster by using the GPU without having direct access to it!

Overview Link to heading

In this post we go through the high-level architecture of kata-containers and vAccel and provide insights into the integration we did.

Then, we go through the steps to run a stock kata container using various supported VMMs (QEMU, AWS Firecracker, Cloud hypervisor, and Dragonball) on the Jetson AGX Orin board.

Finally, we provide the steps to enable vAccel on a VM sandboxed container using our custom kata-containers runtime for the Go and Rust variants.

How kata containers work Link to heading

Kata containers is a sandboxed container runtime, combining the benefits of VMs in terms of workload isolation, with those of containers, in terms of portability. Installing and configuring kata-containers is straightforward and we have covered that in some of our previous posts.

In this post, we will not go through the details of building – we will just install kata-containers from the stock releases and add the binaries to enable vAccel execution.

How vAccel works Link to heading

vAccel is a software framework to expose hardware acceleration functionality to workloads that do not have direct access to an acceleration device. vAccel features a modular design where runtime plugins implement API operations. Figure 1 shows the software stack for the core vAccel library, along with a number of plugins and their accompanying external libraries.

vaccel-stack — Figure 1: The vAccel software stack

Apart from “hardware” plugins (that implement the API operations in some sort of acceleration library, eg CUDA, OpenCL etc.), vAccel features virtual plugins that are able to forward requests to a remote Host. This functionality makes it an open-ended API remoting framework. Based on the transport plugin (virtio or vsock) multiple setups can be enabled, making it ideal to use on VMs or on resource-constrained devices for remote execution. You can find more info about vAccel in the vAccel docs website.

The socket mode of vAccel for remote operations of vAccel works as follows:

the Host component (direct access to an accelerator), listens for requests using a predefined protocol (over gRPC) and issues vAccel API calls to the core vAccelRT library.
the Remote/Guest component (no direct access to an accelerator), forwards requests to the Host component via a gRPC channel and receives results from the execution.

At the moment, the Host component is a simple gRPC agent (vaccel-agent) and the Remote/Guest component is a vAccel plugin (libvaccel-vsock.so). A logical diagram of the execution flow for a VM workload taking advantage of the vAccel framework to run accelerated operations is shown in Figure 2.

kata-vAccel Link to heading

Integrating the vAccel framework to a sandboxed container runtime removes the need for complicated passthrough setups. No kernel prerequisites are needed (apart from the VSOCK options which is by-default enabled in most sandboxed container runtimes).

Currently, there are two modes for kata-vAccel (Figure 3): (a) the first one, exec/external mode, supporting both runtime variants of kata-containers (Go and Rust), spawns the vaccelrt-agent as an external binary, listening to the vsock socket available for the kata-agent component of the container runtime; (b) the second mode, integrated is only supported for the rust runtime variant of kata-containers and embeds the functionality of the vaccelrt-agent into the runtime, allowing better control and sandboxing of the components running on the host system.

An overview of the execution flow for a vAccel-enabled sandboxed container running on kata-containers is shown in Figure 4.

kata-containers binary installation Link to heading

The steps to install the stock kata-containers runtime are shown below:

Get the release tarball:

1wget https://github.com/kata-containers/kata-containers/releases/download/3.2.0-rc0/kata-static-3.2.0-rc0-arm64.tar.xz

Unpack it in a temporary directory:

1mkdir /tmp/kata-release
2xzcat kata-static-3.2.0-rc0-arm64.tar.xz | tar -xvf - -C /tmp/kata-release

You should be presented with the following directory structure:

 1# tree -d /tmp/kata-release/
 2/tmp/kata-release/
 3└── opt
 4    └── kata
 5        ├── bin
 6        ├── libexec
 7        ├── runtime-rs
 8        │   └── bin
 9        └── share
10            ├── bash-completion
11            │   └── completions
12            ├── defaults
13            │   └── kata-containers
14            ├── kata-containers
15            └── kata-qemu
16                └── qemu
17                    └── firmware
18
1915 directories

Move the files to /opt/kata (This is a hard requirement, as all config files assume this folder):

1rsync -oagxvPH /tmp/kata-release/opt/kata /opt/
2rm -rf /tmp/kata-release

Create helper files for the kata VMMs you would like to try out:

Cloud-hypervisor:

1cat > /usr/local/bin/containerd-shim-kata-clh-v2 << EOF
2#!/bin/bash
3KATA_CONF_FILE=/opt/kata/share/defaults/kata-containers/configuration-clh.toml /opt/kata/bin/containerd-shim-kata-v2 $@
4EOF

Dragonball:

1cat > /usr/local/bin/containerd-shim-kata-rs-v2 << EOF
2#!/bin/bash
3KATA_CONF_FILE=/opt/kata/share/defaults/kata-containers/configuration-dragonball.toml /opt/kata/runtime-rs/bin/containerd-shim-kata-v2 $@
4EOF

AWS Firecracker:

1cat > /usr/local/bin/containerd-shim-kata-fc-v2 << EOF
2#!/bin/bash
3KATA_CONF_FILE=/opt/kata/share/defaults/kata-containers/configuration-fc.toml /opt/kata/bin/containerd-shim-kata-v2 $@
4EOF

QEMU:

1cat > /usr/local/bin/containerd-shim-kata-qemu-v2 << EOF
2#!/bin/bash
3KATA_CONF_FILE=/opt/kata/share/defaults/kata-containers/configuration-qemu.toml /opt/kata/bin/containerd-shim-kata-v2 $@
4EOF

For each of the VMMs you need an entry in the /etc/containerd/config.toml file:

 1       [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.kata-fc]
 2          privileged_without_host_devices = true
 3          privileged_without_host_devices_all_devices_allowed = false
 4          runtime_type = "io.containerd.kata-fc.v2"
 5          snapshotter = "devmapper"
 6          pod_annotations = ["*"]
 7          container_annotations = ["*"]
 8
 9       [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.kata-rs]
10          privileged_without_host_devices = false
11          privileged_without_host_devices_all_devices_allowed = false
12          runtime_type = "io.containerd.kata-rs.v2"
13          #snapshotter = "devmapper"
14          pod_annotations = ["*"]
15          container_annotations = ["*"]
16
17       [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.kata-clh]
18          privileged_without_host_devices = false
19          privileged_without_host_devices_all_devices_allowed = false
20          runtime_type = "io.containerd.kata-clh.v2"
21          pod_annotations = ["*"]
22          container_annotations = ["*"]
23
24       [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.kata-qemu]
25          privileged_without_host_devices = false
26          privileged_without_host_devices_all_devices_allowed = false
27          runtime_type = "io.containerd.kata-qemu.v2"
28          pod_annotations = ["*"]
29          container_annotations = ["*"]

Note: for AWS Firecracker, the only supported snapshotter is devmapper. Make sure you follow the instructions to setup it correctly in containerd.

Restart containerd:

1systemctl restart containerd

You should now be able to spawn a stock kata-container using one of the VMM configurations above:

1$ sudo nerdctl run --rm -it --runtime io.containerd.kata-fc.v2 --snapshotter devmapper ubuntu:latest uname -a 
2Linux d13db14b0c9a 5.15.26 #2 SMP Fri Apr 21 05:05:44 BST 2023 aarch64 aarch64 aarch64 GNU/Linux

Enable vAccel on the kata runtime Link to heading

To enable vAccel, we need to add a custom kata runtime binary that instantiates the vaccel-agent alongside the container spawn.

For now, we build binaries for x86_64 and aarch64:

Go runtime Link to heading

For the go runtime (kata-fc):

1wget https://s3.nbfc.io/nbfc-assets/github/vaccel-go-lib/aarch64/containerd-shim-kata-v2 -O /opt/kata/bin/containerd-shim-kata-vaccel-v2
2chmod +x /opt/kata/bin/containerd-shim-kata-vaccel-v2
3wget https://s3.nbfc.io/nbfc-assets/github/vaccel-go-lib/toml/configuration-fc-vaccel.toml -O /opt/kata/share/defaults/kata-containers/configuration-fc-vaccel.toml

The next step is to setup helper scripts like above.

1cat > /usr/local/bin/containerd-shim-kata-fc-vaccel-v2 << EOF
2#!/bin/bash
3KATA_CONF_FILE=/opt/kata/share/defaults/kata-containers/configuration-fc-vaccel.toml /opt/kata/bin/containerd-shim-kata-vaccel-v2 $@
4EOF

and add the runtime to /etc/containerd/config.toml:

1       [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.kata-fc-vaccel]
2          privileged_without_host_devices = true
3          privileged_without_host_devices_all_devices_allowed = true
4          runtime_type = "io.containerd.kata-fc-vaccel.v2"
5          snapshotter = "devmapper"
6          pod_annotations = ["*"]
7          container_annotations = ["*"]

Rust runtime Link to heading

For the rust runtime (kata-rs)

1wget https://s3.nbfc.io/nbfc-assets/github/vaccel-rs-lib/shim/main/aarch64/containerd-shim-kata-v2 -O /opt/kata/runtime-rs/bin/containerd-shim-kata-vaccel-v2
2chmod +x /opt/kata/runtime-rs/bin/containerd-shim-kata-vaccel-v2
3wget https://s3.nbfc.io/nbfc-assets/github/vaccel-rs-lib/toml/main/aarch64/configuration-dragonball.toml -O /opt/kata/share/defaults/kata-containers/configuration-dbs-vaccel.toml

The next step is to setup helper scripts like above.

Dragonball with vAccel:

1cat > /usr/local/bin/containerd-shim-kata-rs-vaccel-v2 << EOF
2#!/bin/bash
3KATA_CONF_FILE=/opt/kata/share/defaults/kata-containers/configuration-dbs-vaccel.toml /opt/kata/runtime-rs/bin/containerd-shim-kata-vaccel-v2 $@
4EOF

and add the runtime to /etc/containerd/config.toml:

1       [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.kata-rs-vaccel]
2          privileged_without_host_devices = false
3          privileged_without_host_devices_all_devices_allowed = false
4          runtime_type = "io.containerd.kata-rs-vaccel.v2"
5          #snapshotter = "devmapper"
6          pod_annotations = ["*"]
7          container_annotations = ["*"]

We should now be able to spawn a vAccel-enabled kata container with both go and rust runtimes:

Go:

1$ nerdctl run --runtime io.containerd.kata-fc-vaccel.v2  --rm -it --snapshotter devmapper harbor.nbfc.io/nubificus/test-vaccel:latest /bin/bash
2root@bce78aa7d1b4:/#

Rust:

1$ nerdctl run --runtime io.containerd.kata-rs-vaccel.v2  --rm   -it harbor.nbfc.io/nubificus/test-vaccel:latest /bin/bash
2root@2ce045d042bc:/#

Install and Setup vAccel and the `jetson-inference` plugin on the host Link to heading

Setup vAccel Link to heading

To install vAccel on the Orin we need to download the core runtime library, the agent and setup the jetson plugin. For a binary install, you can use the following snippet:

1wget https://s3.nbfc.io/nbfc-assets/github/vaccelrt/main/aarch64/Release-deb/vaccel-0.5.0-Linux.deb
2wget https://s3.nbfc.io/nbfc-assets/github/vaccelrt/agent/main/aarch64/Release-deb/vaccelrt-agent-0.3.6-Linux.deb
3wget https://s3.nbfc.io/nbfc-assets/github/vaccelrt/plugins/jetson_inference/main/aarch64/Release-deb/vaccelrt-plugin-jetson-0.1.0-Linux.deb
4dpkg -i vaccel-0.5.0-Linux.deb 
5dpkg -i vaccelrt-agent-0.3.6-Linux.deb
6dpkg -i vaccelrt-plugin-jetson-0.1.0-Linux.deb

To be able to use the vAccel jetson-inference plugin, we need to install the jetson-inference framework on the host system.

Setup `jetson-inference` on the Orin Link to heading

We follow the installation instructions from dusty-nv/jetson-inference.

TL;DR:

1git clone https://github.com/dusty-nv/jetson-inference --depth 1
2cd jetson-inference
3git submodule update --init
4mkdir build && cd build
5cmake ../ -DBUILD_INTERACTIVE=OFF
6make && make install

Make sure you link the models to a directory in /usr/local/share/imagenet-models, as this is the default path where the vAccel jetson plugin looks for them:

1mkdir /usr/local/share/imagenet-models
2ln -s $PWD/../data/networks /usr/local/share/imagenet-models/

For more info on a full jetson-inference vaccel example, have a look at the vAccel docs.

Configure the container runtime to use vAccel Link to heading

Now that we have kata containers setup and vAccel installed, let’s configure the runtime!

Make sure you specify the relevant vaccel backend/plugins in the kata configuration file. As mentioned above, there are two modes of operation supported in the kata runtime for the vaccelrt-agent:

(a) external: the kata runtime spawns the vaccelrt-agent binary, as an external application. The setup for the host vAccel plugins is done in the kata configuration file.

(b) integrated: the kata runtime includes the vaccelrt-agent functionality, embedded in the runtime code. The setup for the host vAccel plugins is done in the helper script, as an environment variable (VACCEL_BACKENDS).

To configure the system for mode (a), for the go runtime we edit /opt/kata/share/defaults/kata-containers/configuration-fc-vaccel.toml and add the jetson plugin before noop (default/debug) plugin:

1@@ -54,7 +54,7 @@
2 # vaccel_guest_backend can be one of "virtio" or "vsock"
3 # Your distribution recommends: vaccel_host_backend = noop
4 # Your distribution recommends: vaccel_guest_backend = vsock
5-vaccel_host_backends = "noop"
6+vaccel_host_backends = "jetson,noop"
7 vaccel_guest_backend = "vsock"
8 
9 # If vaccel_guest_backend=vsock specify the vAccel Agent vsock port

For the rust runtime, we edit the /opt/kata/share/defaults/kata-containers/configuration-dbs-vaccel.toml file as follows:

1...
2agent_path= "/usr/local/bin/vaccelrt-agent"
3debug= "1"
4backends= "jetson,noop"
5backends_library= "/usr/local/lib/"
6endpoint_port= "2048"
7execution_type= "exec"
8...

For mode (b), we do:

1agent_path= ""
2debug= "1"
3backends= ""
4backends_library= "/usr/local/lib/"
5endpoint_port= "2048"
6execution_type= "integrated"

and edit the helper script to specify the plugins:

1cat > /usr/local/bin/containerd-shim-kata-rs-vaccel-v2 << EOF
2#!/bin/bash
3VACCEL_BACKENDS=/usr/local/lib/libvaccel-jetson.so:/usr/local/lib/libvaccel-noop.so KATA_CONF_FILE=/opt/kata/share/defaults/kata-containers/configuration-dbs-vaccel.toml /opt/kata/runtime-rs/bin/containerd-shim-kata-vaccel-v2 $@
4EOF

Spawn a vAccel-enabled sandboxed container Link to heading

We are now ready to spawn a vAccel-enabled container. We already have a test container image, but if you would like to build your own, the Dockerfile is the one below:

1FROM ubuntu:latest
2
3RUN apt update && apt install -y wget && \
4	wget https://s3.nbfc.io/nbfc-assets/github/vaccelrt/master/aarch64/Release-deb/vaccel-0.5.0-Linux.deb && \
5	wget https://s3.nbfc.io/nbfc-assets/github/vaccelrt/plugins/vsock/master/aarch64/Release-deb/vaccelrt-plugin-vsock-0.1.0-Linux.deb && \
6	dpkg -i vaccel-0.5.0-Linux.deb vaccelrt-plugin-vsock-0.1.0-Linux.deb 
7
8ENV VACCEL_DEBUG_LEVEL=4
9ENV VACCEL_BACKENDS=/usr/local/lib/libvaccel-vsock.so

build like this:

1docker build -t vaccel-test-container:aarch64 -f Dockerfile .

Essentially, it installs the vAccel core library, and the vsock plugin that communicates with the vAccel agent running on the host (spawned using the modes described above).

1$ sudo nerdctl run --runtime io.containerd.kata-rs-vaccel.v2  --rm  -it nubificus/vaccel-test-container:latest /bin/bash
2root@23a8738e14a8:/#

Let’s get an image:

 1$ wget https://upload.wikimedia.org/wikipedia/commons/b/bd/Golden_Retriever_Dukedestiny01_drvd.jpg
 2--2023-09-17 17:48:09--  https://upload.wikimedia.org/wikipedia/commons/b/bd/Golden_Retriever_Dukedestiny01_drvd.jpg
 3Resolving upload.wikimedia.org (upload.wikimedia.org)... 185.15.59.240, 2a02:ec80:300:ed1a::2:b
 4Connecting to upload.wikimedia.org (upload.wikimedia.org)|185.15.59.240|:443... connected.
 5HTTP request sent, awaiting response... 200 OK
 6Length: 165000 (161K) [image/jpeg]
 7Saving to: 'Golden_Retriever_Dukedestiny01_drvd.jpg'
 8
 9Golden_Retriever_Dukedestiny01_drvd.jpg        100%[==================================================================================================>] 161.13K   701KB/s    in 0.2s    
10
112023-09-17 17:48:14 (701 KB/s) - 'Golden_Retriever_Dukedestiny01_drvd.jpg' saved [165000/165000]

and try to issue a classify operation using vAccel’s jetson-inference backend:

1root@2ce045d042bc:/# /usr/local/bin/classify Golden_Retriever_Dukedestiny01_drvd.jpg 1
2Initialized session with id: 1
3Image size: 165000B
4classification tags: 86.768% golden retriever

Awesome, right?! We are able to use hardware acceleration on a sandboxed container that does not have direct access to the accelerator device. Stay tuned for more info on our progress with kata-containers and vAccel!