We don’t have specific support for deploying pipelines on RunPod or via Gradio. For cloud inference, we’d recommend using our cloud deployment directly and hooking that up to Gradio instead of going through RunPod.
Having said that, if you MUST use RunPod, the flow would likely look similar to their tutorial for serving Ollama. In theory, you could spin up a Roboflow Inference server in a pod and serve HTTP requests the same way they describe serving requests from Ollama. We don’t have specific experience with RunPod; this is just the closest match we found on a quick review of their documentation, so your mileage may vary.
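As a rough, untested sketch of what that could look like inside a pod: install the open source Inference package, start the server, and call it over HTTP. The API key and model ID below are placeholders for your own Roboflow project, and the exposed URL/port will depend on how RunPod forwards traffic to the pod.

```python
# Sketch only (not tested on RunPod): start the open source Inference server
# inside the pod, then call it over HTTP, similar to the Ollama tutorial flow.
#
#   pip install inference inference-sdk
#   inference server start   # launches the server (default port 9001)
#
# "YOUR_ROBOFLOW_API_KEY" and "your-project/1" are placeholders.
from inference_sdk import InferenceHTTPClient

client = InferenceHTTPClient(
    api_url="http://localhost:9001",  # swap in the pod's public URL/port from RunPod
    api_key="YOUR_ROBOFLOW_API_KEY",
)

# Run a Roboflow-trained model on a local image via the HTTP server
result = client.infer("example.jpg", model_id="your-project/1")
print(result)
```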
There’s no reason why using a hosted backend couldn’t be made to work with Gradio!
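For example, here’s a minimal Gradio sketch that forwards an uploaded image to a hosted backend and shows the raw predictions. It assumes the Roboflow hosted endpoint (detect.roboflow.com) plus a placeholder API key and model ID; you could point the same client at a RunPod-hosted Inference server instead.

```python
# Minimal Gradio frontend over a hosted inference backend (sketch, placeholders below).
import gradio as gr
from inference_sdk import InferenceHTTPClient

client = InferenceHTTPClient(
    api_url="https://detect.roboflow.com",  # Roboflow cloud deployment; or your own server URL
    api_key="YOUR_ROBOFLOW_API_KEY",
)

def predict(image_path):
    # Send the uploaded image to the hosted backend and return its predictions.
    return client.infer(image_path, model_id="your-project/1")

demo = gr.Interface(
    fn=predict,
    inputs=gr.Image(type="filepath"),
    outputs=gr.JSON(),
    title="Roboflow model behind a Gradio frontend",
)

demo.launch()
```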
We don’t generally provide implementation services outside of enterprise use cases, nor do we recommend third parties to implement pipelines.
All in all, it is feasible to use Roboflow cloud deployment with a Gradio frontend to serve Roboflow-trained Qwen models to users, and it seems likely RunPod could work as well by hosting our open source Inference package.