CUDA error in YOLOR inference


I am using the YOLOR notebook to train a model and run inference on a custom dataset. During inference, however, an error about the CUDA device appears. I believe it can be avoided by moving the tensor to the CPU, but then the inference itself would run on the CPU.
I want to compare the inference time of the YOLOR model with that of several YOLOv5 variants, all of which I ran on one specific GPU. How can I run YOLOR inference on that same GPU without hitting the error, so that the comparison is fair?
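
For context, here is a minimal sketch of the kind of GPU timing comparison described above. It is not the YOLOR notebook's code: the `time_inference` helper and the small stand-in model are hypothetical, and it assumes plain PyTorch, with the model and the input moved to the same device before the forward pass. It falls back to the CPU if no CUDA device is available.

```python
import time

import torch
import torch.nn as nn


def time_inference(model, batch, device, warmup=3, runs=10):
    """Return the average seconds per forward pass on `device` (hypothetical helper)."""
    # Model and input must live on the same device, otherwise PyTorch raises
    # a device-mismatch error during the forward pass.
    model = model.to(device).eval()
    batch = batch.to(device)

    with torch.no_grad():
        # Warm-up passes so one-time setup costs are not included in the timing.
        for _ in range(warmup):
            model(batch)

        # CUDA kernels run asynchronously; synchronize before reading the clock.
        if device.type == "cuda":
            torch.cuda.synchronize(device)
        start = time.perf_counter()
        for _ in range(runs):
            model(batch)
        if device.type == "cuda":
            torch.cuda.synchronize(device)

    return (time.perf_counter() - start) / runs


# Assumption: "cuda:0" stands in for the specific GPU used for the other models.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Stand-in network and random input; replace with the real model and images.
model = nn.Sequential(nn.Conv2d(3, 8, kernel_size=3, padding=1), nn.ReLU())
batch = torch.rand(1, 3, 64, 64)

avg_seconds = time_inference(model, batch, device)
print(f"{device}: {avg_seconds * 1000:.3f} ms per forward pass")
```

Timing every model with the same helper, device, and input size would keep the comparison consistent; the open question is how to get YOLOR through a forward pass like this without the CUDA error.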