How to get Roboflow Inference working on a desktop GPU?

I have trained a YOLOv11 model for a segmentation task and am using it for inference with the Roboflow Inference application on my Windows desktop. Inference takes too long to run (around 8 seconds). I have an RTX 4080 GPU, but it looks like the Roboflow Inference server is running only on my CPU, which might be the reason for the slowness. Please let me know how I can ensure that the GPU is being used, and share any other troubleshooting tips to investigate the slowness.

Hi @Vivek_Rajasekaran ,

Please confirm if my assumption is correct: are you running inference in Docker under Linux?

If so, you need to pass extra parameters to docker run in order to make the GPU available from within Docker.

For example, if you pulled roboflow/roboflow-inference-server-gpu:0.63.5, you can try the command below to start the inference server with GPU access:

docker run -it --rm --privileged --gpus=all roboflow/roboflow-inference-server-gpu:0.63.5
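
In practice you will also want to publish the server's HTTP port so your client on the host can reach it. A minimal sketch, assuming the server listens on its default port 9001:

docker run -d --rm --privileged --gpus=all -p 9001:9001 roboflow/roboflow-inference-server-gpu:0.63.5

The -d flag runs the container in the background; drop it if you prefer to watch the server logs in the terminal while testing.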

You can quickly verify whether the GPU is visible from within the Docker container by running nvidia-smi:

docker run -it --rm --privileged --gpus=all --entrypoint /bin/bash roboflow/roboflow-inference-server-gpu:0.63.5 -c nvidia-smi
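
If nvidia-smi shows your RTX 4080 but inference is still slow, it is worth checking whether the inference runtime inside the container can actually use CUDA. One way to sketch this check, assuming the GPU image ships ONNX Runtime with the CUDA execution provider (the exact packages inside the image may differ):

docker run -it --rm --privileged --gpus=all --entrypoint /bin/bash roboflow/roboflow-inference-server-gpu:0.63.5 -c "python -c 'import onnxruntime; print(onnxruntime.get_available_providers())'"

If CUDAExecutionProvider appears in the printed list, the server should be able to run your model on the GPU; if you only see CPUExecutionProvider, the container is falling back to CPU. Also note that the first request after startup is typically much slower because the model weights have to be downloaded and loaded, so time a few consecutive requests before concluding the server itself is slow.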

Hope this helps,

Grzegorz
