This topic is intended to provide instructions for those attempting to run Roboflow inference on Jetson devices. The documentation for Jetson deployments on Roboflow’s website appears to be out-of-date or incomplete.
This guide was tested on a Jetson Xavier AGX running JetPack 5.1.4 with Python 3.8.
1. Inference package
Roboflow’s inference package uses the ONNX CPU runtime and does not use the Jetson’s GPU. Instead, you must install the inference-gpu package. On Jetsons running JetPack 4.6 or 5.1, installing it with pip will fail because the package dependency onnxruntime-gpu is not available from PyPI for these platforms.
2. Install onnxruntime-gpu from Jetson Zoo
The default PyPI package for onnxruntime-gpu does not work with CUDA 11.x.
Go to the Jetson Zoo (https://elinux.org/Jetson_Zoo) and follow the instructions to download and install the correct pip wheel for your Python and JetPack versions.
Note: This did not work with Python 3.10; the available wheels appear to be compiled against specific Python versions (3.8 in my case). See the example below.
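For reference, on JetPack 5.x with Python 3.8 the install looks roughly like this. The URL and wheel filename below are illustrative only — substitute the exact wheel listed on the Jetson Zoo for your JetPack/Python combination:
# download the wheel listed on the Jetson Zoo for your JetPack/Python version
# (URL and filename are placeholders — copy the real ones from the Zoo page)
wget <wheel-url-from-jetson-zoo> -O onnxruntime_gpu-1.12.1-cp38-cp38-linux_aarch64.whl
# install it into the same Python environment you will run inference from
pip install onnxruntime_gpu-1.12.1-cp38-cp38-linux_aarch64.whl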
3. Install inference-gpu with pip
With the Jetson Zoo wheel in place, pip should see the onnxruntime-gpu dependency as already satisfied, so the install should now succeed.
pip install inference-gpu
4. Check that onnxruntime (CPU) is not installed
If both onnxruntime and onnxruntime-gpu are installed, onnxruntime may fail to locate the CUDAExecutionProvider and silently fall back to the CPUExecutionProvider.
# check if onnxruntime is in package list
pip list
# uninstall
pip uninstall onnxruntime
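You can also confirm from Python that the GPU provider is visible. This is a quick sanity check using onnxruntime's standard get_available_providers API:
import onnxruntime as ort

# 'CUDAExecutionProvider' should appear in this list on a correctly
# configured Jetson; if only 'CPUExecutionProvider' shows up, the CPU-only
# onnxruntime package is likely shadowing onnxruntime-gpu
print(ort.get_available_providers())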
5. Enable all cores on Jetson device
Cores on the Jetson device are deactivated in certain power modes. To maximize performance, change the power mode to MODE_30W_ALL (on the AGX Xavier) using the command line or jtop.
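From the command line, something like the following should work. On the AGX Xavier, mode 3 corresponds to MODE_30W_ALL, but mode IDs vary by board and JetPack version, so check /etc/nvpmodel.conf on your device first:
# query the current power mode
sudo nvpmodel -q
# switch to MODE_30W_ALL (mode 3 on AGX Xavier; verify the ID for your board)
sudo nvpmodel -m 3
# optionally lock clocks at their maximum for the selected mode
sudo jetson_clocks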
6. Run inference in your Python code
Congrats, you should now be able to run inference on the Jetson’s GPU. My AGX Xavier ran a Roboflow 3.0 object detection model at about 20 fps.
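As a rough sketch of what that Python code can look like, here is a minimal example based on the inference package's get_model quickstart. The model ID and API key are placeholders, and the exact API may differ between inference versions:
from inference import get_model

# placeholders — substitute your own Roboflow model ID and API key
model = get_model(model_id="your-project/1", api_key="YOUR_API_KEY")

# run object detection on a local image; infer() also accepts numpy arrays
results = model.infer("image.jpg")
print(results)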
Hope this is helpful to those struggling with dependencies!