I trained an object detection model using Roboflow and would like to deploy the model for inference on an NVIDIA Jetson AGX Xavier.
NVIDIA Jetson AGX Xavier specs:
- L4T 35.3.1 (JetPack 5.1.1)
- Ubuntu 20.04.5 LTS
- CUDA 11.4.315
- CUDNN 8.6.0.166
- TensorRT 8.5.2.2
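For reference, I pulled these versions with roughly the following commands (the CUDA toolkit path and package names may differ slightly on other JetPack releases):
cat /etc/nv_tegra_release              # L4T release
/usr/local/cuda/bin/nvcc --version     # CUDA toolkit
dpkg -l | grep -E 'cudnn|nvinfer'      # cuDNN and TensorRT packages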
I have followed both the legacy documentation and the current documentation.
Legacy Method
When following the legacy method, I can run the server fine using this line:
sudo docker run --net=host --gpus all roboflow/inference-server:jetson
I receive an error when attempting to run inference using this line:
base64 YOUR_IMAGE.jpg | curl -d @- \
"http://localhost:9001/your-model/42?api_key=YOUR_KEY"
When I instead use the hosted API, inference works fine using this line:
base64 YOUR_IMAGE.jpg | curl -d @- \
"https://detect.roboflow.com/your-model/42?api_key=YOUR_KEY"
The error I receive when trying to run inference locally is shown here:
{
"error": "This execution contains the node 'StatefulPartitionedCall/assert_equal_1/Assert/AssertGuard/branch_executed/_139', which has the dynamic op 'Merge'. Please use model.executeAsync() instead. Alternatively, to avoid the dynamic ops, specify the inputs [Identity]"
}
Current Method
When following the current method, I cannot start the Roboflow inference server using this line:
sudo docker run --privileged --net=host --gpus all --mount source=roboflow,target=/cache -e NUM_WORKERS=1 roboflow/roboflow-inference-server-trt-jetson:latest
The error I receive is:
OSError: libcurand.so.10: cannot open shared object file: No such file or directory
I’ve checked, and libcurand.so.10 is present in /usr/local/cuda/lib64 on the host.
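For reference, this is roughly how I verified it (on the host, not inside the container):
ls -l /usr/local/cuda/lib64/libcurand.so.10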
I’ve also tried exporting the library path on the host:
export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
with no success.
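In case it helps narrow things down, one workaround I could try is bind-mounting the host CUDA libraries into the container and pointing the loader at them. This is only a sketch; I don't know whether the trt-jetson image expects the host's CUDA 11.4 libraries at this path, or whether overriding LD_LIBRARY_PATH inside the container would break something else:
sudo docker run --privileged --net=host --gpus all \
-v /usr/local/cuda/lib64:/usr/local/cuda/lib64:ro \
-e LD_LIBRARY_PATH=/usr/local/cuda/lib64 \
--mount source=roboflow,target=/cache -e NUM_WORKERS=1 \
roboflow/roboflow-inference-server-trt-jetson:latest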
My questions are:
- Which deployment method should I be following?
- Has anyone run into these errors before, and does anyone have ideas about what might be going wrong?