Configuration Workflow

Configuration for the Development Workflow

Vivado configuration

Provenance: esnet-smartnic-hw

This section describes the installation and configuration of the Vivado Runtime Environment.

Install the AMD (Xilinx) Vivado tool suite, including the VitisNetP4 option. Note that the VitisNetP4 option is only offered by the installer if the VitisNetP4_Option_VISIBLE environment variable is set to true before the Vivado installation program is executed. The example Bash shell command is:

export VitisNetP4_Option_VISIBLE=true

Configure the runtime environment by executing the settings64.sh script located in the Vivado installation directory:

source /tools/Xilinx/Vivado/2023.1/settings64.sh

where the Vivado installation directory is located at /tools/Xilinx/Vivado/2023.1/ in this example.

Set the XILINXD_LICENSE_FILE environment variable to resolve the site-specific license for the AMD (Xilinx) VitisNetP4 IP core. This can be done with a .flexlmrc file in the user's home directory, or in a Bash script file (such as a .bashrc in the user's home directory). The example Bash shell command is:

export XILINXD_LICENSE_FILE=
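The two configuration steps above can be sanity-checked with a small helper. The following is a minimal sketch, not part of Vivado: check_vivado_env is a hypothetical function name, and it only tests that a tool is reachable on PATH and that XILINXD_LICENSE_FILE is non-empty. It does not validate the license itself.

```shell
# Sketch: report anything that looks unconfigured in the Vivado environment.
# check_vivado_env is a hypothetical helper, not something shipped with Vivado.
# $1 = tool name to look for on PATH (defaults to "vivado").
check_vivado_env() {
    local tool="${1:-vivado}" status=0
    if ! command -v "$tool" >/dev/null 2>&1; then
        echo "missing: $tool not found on PATH (has settings64.sh been sourced?)"
        status=1
    fi
    if [ -z "${XILINXD_LICENSE_FILE:-}" ]; then
        echo "missing: XILINXD_LICENSE_FILE is not set"
        status=1
    fi
    return "$status"
}

# Example: print any problems, or a confirmation if none were found.
if check_vivado_env; then
    echo "Vivado environment looks configured"
fi
```

Running this in a fresh shell before and after sourcing settings64.sh is a quick way to see whether the settings have been picked up.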

SmartNIC firmware build environment

Provenance: esnet-smartnic-fw

The SmartNIC firmware build depends on docker and the docker compose plugin.

Docker

Install Docker on your system by following the instructions for the Linux variant that you are using: https://docs.docker.com/engine/install/

Ensure that you follow the post-install instructions so that you can run docker without sudo: https://docs.docker.com/engine/install/linux-postinstall/

Verify your docker setup by running this as an ordinary (non-root) user without using sudo:

docker run hello-world

If you get the following message, then you need to be added to the Docker group:

permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/images/json": dial unix /var/run/docker.sock: connect: permission denied
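Docker's post-install instructions linked above resolve this by adding the user to the docker group (sudo usermod -aG docker $USER) and then starting a new login session. As a rough sketch, group membership can be checked like this; has_docker_group is a hypothetical helper name introduced for illustration.

```shell
# Sketch: succeed when a space-separated group list contains "docker".
# has_docker_group is a hypothetical helper, not a Docker command.
has_docker_group() {
    case " $1 " in
        *" docker "*) return 0 ;;
        *) return 1 ;;
    esac
}

# id -nG prints the names of all groups the current user belongs to.
if has_docker_group "$(id -nG)"; then
    echo "current user is in the docker group"
else
    echo "current user is not in the docker group"
fi
```

Group changes only apply to new login sessions, so the check can still fail in a shell that was opened before the user was added.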

Docker Compose

The docker-compose.yml file for the smartnic build and the sn-stack depends on features that are only supported in the compose v2 plugin.

Install the docker compose plugin like this for a single user:

mkdir -p ~/.docker/cli-plugins/
curl -SL https://github.com/docker/compose/releases/download/v2.17.2/docker-compose-linux-x86_64 -o ~/.docker/cli-plugins/docker-compose
chmod +x ~/.docker/cli-plugins/docker-compose

Alternatively, you can install the docker compose plugin system-wide like this:

sudo mkdir -p /usr/local/lib/docker/cli-plugins
sudo curl -SL https://github.com/docker/compose/releases/download/v2.17.2/docker-compose-linux-x86_64 -o /usr/local/lib/docker/cli-plugins/docker-compose
sudo chmod +x /usr/local/lib/docker/cli-plugins/docker-compose

Verify your docker compose installation by running this as an ordinary (non-root) user without using sudo. For this install, the version output should be:

$ docker compose version
Docker Compose version v2.17.2
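Because the compose files depend on the v2 plugin, a script may want to check the major version before going further. The following is a minimal sketch that assumes the version line has the same shape as the output shown above; compose_major_version is a hypothetical helper name.

```shell
# Sketch: extract the major version from a "Docker Compose version vX.Y.Z" line.
# compose_major_version is a hypothetical helper; it assumes the version is the
# last space-separated word on the line, as in the output shown above.
compose_major_version() {
    local ver="${1##* }"   # keep the last word, e.g. "v2.17.2"
    ver="${ver#v}"         # drop the leading "v"
    echo "${ver%%.*}"      # keep everything before the first dot
}

# Example with the version line from this document.
compose_major_version "Docker Compose version v2.17.2"
```

In practice the argument would come from "$(docker compose version)", and a result other than 2 suggests that the plugin is missing or that a different compose release is installed.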

buildx

Using the `docker-buildx-plugin` provides [extended build capabilities](https://github.com/docker/buildx). Detailed instructions can be found in [Docker's official documentation](https://docs.docker.com/engine/install/ubuntu/).

Host setup

OS

We tested this setup with Ubuntu 20.04.

FPGA Connectivity Check

Validate that your host system is equipped with at least one Xilinx Alveo card (U280, U55C, or U250). Confirm this by running the command:

lspci -d 10ee:

This command lists all PCIe devices that report the Xilinx vendor ID (10ee), which includes any PCIe-connected Alveo FPGA cards.
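The same check can be scripted without lspci by reading the vendor IDs that the Linux kernel exposes under /sys/bus/pci/devices. This is a sketch under that assumption; list_xilinx_pci is a hypothetical helper name, and its optional argument exists only so that the function can be pointed at a different directory.

```shell
# Sketch: print the PCI address of every device whose vendor ID is 0x10ee (Xilinx).
# list_xilinx_pci is a hypothetical helper.
# $1 = directory to scan (defaults to the sysfs PCI device directory).
list_xilinx_pci() {
    local root="${1:-/sys/bus/pci/devices}" dev
    for dev in "$root"/*; do
        [ -r "$dev/vendor" ] || continue
        if [ "$(cat "$dev/vendor")" = "0x10ee" ]; then
            basename "$dev"
        fi
    done
}

# Example: list Xilinx devices on this host (prints nothing if there are none).
list_xilinx_pci
```

An empty result means that no device with the Xilinx vendor ID is visible to the kernel, which usually points at a seating, power, or BIOS issue rather than a software one.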

Configuring Hugepages

Verify that hugepages are appropriately configured by examining the kernel boot command line parameters:

cat /proc/cmdline

Ensure that the output includes hugepage and IOMMU parameters similar to those at the end of this line:

BOOT_IMAGE=/boot/vmlinuz-5.4.0-126-generic root=/dev/mapper/vg0-root ro default_hugepagesz=1G hugepagesz=1G hugepages=32 intel_iommu=on iommu=pt
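Comparing the boot command line by eye is error-prone, so the check can be sketched as a small function that reports which expected parameters are absent. missing_boot_params is a hypothetical helper name, and the parameter list is simply the one from the example line above.

```shell
# Sketch: print each expected kernel parameter that is absent from a command line.
# missing_boot_params is a hypothetical helper.
# $1 = kernel command line, remaining arguments = parameters to look for.
missing_boot_params() {
    local cmdline="$1" param
    shift
    for param in "$@"; do
        # Match whole space-separated words so that "hugepagesz=1G" is not
        # satisfied by "default_hugepagesz=1G".
        case " $cmdline " in
            *" $param "*) ;;
            *) echo "$param" ;;
        esac
    done
}

# Example: check the running kernel (prints nothing when everything is present).
if [ -r /proc/cmdline ]; then
    missing_boot_params "$(cat /proc/cmdline)" \
        default_hugepagesz=1G hugepagesz=1G hugepages=32 intel_iommu=on iommu=pt
fi
```

Any parameter printed by the example still needs to be added through the Grub configuration described next.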

Grub Configuration

Edit the /etc/default/grub file to include the following line:

GRUB_CMDLINE_LINUX_DEFAULT="default_hugepagesz=1G hugepagesz=1G hugepages=32 intel_iommu=on iommu=pt"

Then regenerate the GRUB configuration (the new kernel parameters take effect after the next reboot):

sudo update-grub

IOMMU Confirmation

Verify that the IOMMU is enabled by examining the kernel log for IOMMU-related messages:

sudo less /var/log/kern.log
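Rather than paging through the whole log, the relevant lines can be filtered out. This is a rough sketch that assumes the kernel messages mention either IOMMU or DMAR (the name used by Intel platforms for the IOMMU tables); iommu_log_lines is a hypothetical helper name.

```shell
# Sketch: keep only the log lines on stdin that mention the IOMMU.
# iommu_log_lines is a hypothetical helper; the DMAR pattern is an assumption
# that fits Intel platforms.
iommu_log_lines() {
    # "|| true" keeps the exit status at 0 when nothing matches.
    grep -i -E 'DMAR|IOMMU' || true
}

# Typical use (requires read access to the log):
#   sudo cat /var/log/kern.log | iommu_log_lines
#   sudo dmesg | iommu_log_lines
```

If the filter prints nothing, the boot parameters from the previous steps are probably not active yet, or the IOMMU is disabled in the system firmware.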

JTAG/USB Connection Check

Ensure that the FPGA has JTAG/USB connectivity by running:

lsusb

Confirm the presence of the device labeled “Future Technology Devices International, Ltd FT232H Single HS USB-UART/FIFO,” which signifies the JTAG connection to the FPGA.
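On hosts with several cards it can help to count the adapters instead of scanning the list by eye. The following sketch counts lines that contain the FT232H label quoted above; count_ft232h is a hypothetical helper name.

```shell
# Sketch: count the lines on stdin that mention an FT232H adapter.
# count_ft232h is a hypothetical helper.
count_ft232h() {
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    grep -c 'FT232H' || true
}

# Typical use:
#   lsusb | count_ft232h
```

A count of 0 means that no JTAG adapter is visible over USB, so the JTAG-based flash programming described later cannot work until the cable is connected.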

Using the SmartNIC FW image

Provenance: sn-stack documentation. Other parts of that documentation are in the Execution Workflow. See also its One-time Setup of the Runtime Environment section, parts of which are distributed elsewhere.

Converting from factory flash image to ESnet SmartNIC flash image

From the factory, the FPGA cards have only a “gold” bitfile in flash with the “user” partition of flash being blank. The “gold” bitfile has a narrow PCIe memory window for BAR1 and BAR2 which is insufficient for the ESnet SmartNIC platform. Fixing this requires a one-time flash programming step to install an ESnet SmartNIC bitfile into the FPGA “user” partition in flash. This initial setup is done using the JTAG.

Ensure that any running sn-stack instances have been stopped so that they don’t interfere with the flash programming operation.

docker compose down -v --remove-orphans

Start the flash rescue service to program an ESnet SmartNIC bitfile into the FPGA card “user” partition using the JTAG interface. This takes approximately 20 minutes. This process should not be interrupted.

docker compose --profile smartnic-flash run --rm smartnic-flash-rescue

This will:
* Use JTAG to write a small flash-programming helper bitfile into the FPGA
* Use JTAG to write the current version of the bitfile into the FPGA card’s “user” partition in flash
  * Only the “user” partition of the flash is overwritten by this step
  * The “gold” partition is left untouched

Clean up by bringing down the running stack after flash writing has completed.

docker compose down -v --remove-orphans

Perform a cold-boot (power cycle) of the server hosting the FPGA card.

It is essential that this is a proper power cycle and not simply a warm reboot. Specifically, do not use shutdown -r now; instead use something like ipmitool chassis power cycle. Failure to perform a cold-boot here will result in an unusable card.
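The one-time conversion steps above can be chained into a single function so that a failed step stops the sequence. This is a sketch only: run and flash_rescue are hypothetical helper names, and DRY_RUN is a hypothetical variable that prints the commands instead of executing them. The docker compose commands themselves are the ones given above, and the function is expected to be called from the sn-stack directory.

```shell
# Sketch: print the command when DRY_RUN=1, otherwise execute it.
# run and DRY_RUN are hypothetical names introduced for this example.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Sketch: stop the stack, program the "user" flash partition over JTAG,
# clean up, and remind the operator about the cold boot.
flash_rescue() {
    run docker compose down -v --remove-orphans || return 1
    run docker compose --profile smartnic-flash run --rm smartnic-flash-rescue || return 1
    run docker compose down -v --remove-orphans || return 1
    echo "Flash written. Power cycle the server now (for example: ipmitool chassis power cycle)."
}
```

Calling DRY_RUN=1 flash_rescue first shows the exact commands without touching the card. The real run takes about 20 minutes and should not be interrupted.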

Normal Operation of the Runtime Environment

(OPTIONAL) Updating the flash image to a new ESnet SmartNIC flash image

The instructions in this section are used to update the SmartNIC flash image from an already working SmartNIC environment. This update step is optional and only required if you want to change the contents of the FPGA card flash. Normally, the “RAM” of the FPGA is loaded using JTAG during stack startup.

NOTE: This will not work the very first time the flash is programmed. See the “Converting from factory flash image to ESnet SmartNIC flash image” section above for first-time setup.

Start up any properly configured stack, which will allow us to write the flash using a fast algorithm over PCIe.

docker compose up -d

Confirm that PCIe register IO is working in your stack by querying the version registers.

docker compose exec smartnic-fw sn-cli dev version

Confirm that the “DNA” register is not showing 0xfffff… as its contents.

Start the flash update service to write the currently active FPGA bitfile into the persistent flash on the FPGA card. This takes approximately 7-8 minutes. This process should not be interrupted.

docker compose --profile smartnic-flash run --rm smartnic-flash-update

Bring down the running stack after flash writing has completed.

docker compose down -v --remove-orphans
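The DNA register check in this procedure amounts to testing that the value is not made up entirely of f digits, which is what a failed PCIe register read typically looks like. The following sketch assumes that the DNA value has already been copied out of the sn-cli dev version output by hand, since the exact output format is not shown here. dna_looks_valid is a hypothetical helper name, and the sample values are made up for illustration.

```shell
# Sketch: succeed when a hex value contains at least one digit other than f.
# dna_looks_valid is a hypothetical helper; the value is passed in by the caller.
dna_looks_valid() {
    local value="${1#0x}"   # drop an optional 0x prefix
    [ -n "$value" ] || return 1
    case "$value" in
        *[!fF]*) return 0 ;;   # at least one non-f character
        *) return 1 ;;         # all f: register reads are failing
    esac
}

# Examples with made-up values.
dna_looks_valid 0x40020000 && echo "DNA looks plausible"
dna_looks_valid 0xffffffff || echo "DNA is all f: PCIe register IO is probably not working"
```

If the check fails, do not start the flash update or flash remove service, because both rely on working PCIe register access.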

(OPTIONAL) Remove the ESnet SmartNIC flash image from the FPGA card to revert to factory image

The instructions in this section are used to remove the SmartNIC flash image from an already working SmartNIC environment. This removal step is optional and only required if you want to reset the contents of the FPGA card flash back to the factory bitfile. If you want to keep using the card as an ESnet SmartNIC, do not perform these operations or you’ll have to re-do the “Converting from factory flash image to ESnet SmartNIC flash image” section above.

Start up any properly configured stack, which will allow us to write the flash using a fast algorithm over PCIe.

docker compose up -d

Confirm that PCIe register IO is working in your stack by querying the version registers.

docker compose exec smartnic-fw sn-cli dev version

Confirm that the “DNA” register is not showing 0xfffff… as its contents.

Start the flash remove service to erase the ESnet SmartNIC image from the “user” partition of the FPGA card flash. This takes less than 1 minute. This process should not be interrupted.

docker compose --profile smartnic-flash run --rm smartnic-flash-remove

Bring down the running stack after flash reset is completed.

docker compose down -v --remove-orphans

Note: If you want to flash the golden (recovery) image and it is not working, you can use vivado_lab with the following commands:

vivado_lab \
    -nolog \
    -nojournal \
    -tempDir /tmp/ \
    -mode batch \
    -notrace \
    -quiet \
    -source /scripts/program_flash.tcl \
    -tclargs "$HW_SERVER_URL" "$HW_TARGET_SERIAL" "/scripts/revert_to_golden.mcs"

If you don’t know how to set it up, you can go to sn-stack/smartnic-hw/scripts and put the “revert to golden” image there. For more information on how to get the golden image, you can refer to this link. After placing the image there, you can modify the program_flash.sh script and change:

vivado_lab \
    -nolog \
    -nojournal \
    -tempDir /tmp/ \
    -mode batch \
    -notrace \
    -quiet \
    -source /scripts/program_flash.tcl \
    -tclargs "$HW_SERVER_URL" "$HW_TARGET_SERIAL" "$MCSFILE_PATH"

To:

vivado_lab \
    -nolog \
    -nojournal \
    -tempDir /tmp/ \
    -mode batch \
    -notrace \
    -quiet \
    -source /scripts/program_flash.tcl \
    -tclargs "$HW_SERVER_URL" "$HW_TARGET_SERIAL" "/scripts/revert_to_golden.mcs"

After a cold reboot, you will see the cards are back to the golden image.

Important notice: The golden image will not make the cards appear in XRT, as XRT needs an “XRT-friendly” shell. However, it’ll make xbmgmt see the card, and from there, you can flash a new platform that works with XRT. For more info on the golden image, you can refer to this documentation.