Inference takes too long the first time

Hi, I am using the Roboflow Hosted API to deploy a Workflow, but every time I run the code for the first time after a while, I either get a 500 error response or it takes around 20-30 seconds to infer. After that, any new run takes around 3 seconds, and it doesn't matter if I change the image, which is okay. But I am wondering why it takes so long the first time. Is that normal? My code is exactly as it is in the documentation.
I also tried running the inference server locally and I have the same issue. I have a 4060 GPU, and after the first run, which takes some time (one time I waited about 3 minutes), all other runs take around 3 seconds. I would like to know if anyone else has this issue, or if it is not an issue and that is just how it works.
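For context, this is roughly the kind of call involved (a minimal sketch following the docs, assuming the `inference_sdk` Python client; the workspace, workflow ID, and image input name below are placeholders):

```python
from inference_sdk import InferenceHTTPClient

# Hosted API; for a local inference server this would point at http://localhost:9001 instead.
client = InferenceHTTPClient(
    api_url="https://detect.roboflow.com",
    api_key="YOUR_API_KEY",
)

result = client.run_workflow(
    workspace_name="your-workspace",        # placeholder
    workflow_id="your-workflow-id",         # placeholder
    images={"image": "path/to/image.jpg"},  # key must match the workflow's image input name
)
print(result)
```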

Hi @Crisaq: in both cases we need to load the Workflow and the models used within it, so I would expect some delay on the first request.
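A quick way to confirm it is a first-request load rather than anything image-specific is to time a few consecutive runs against the same client (a rough sketch, again assuming the `inference_sdk` client and placeholder workspace/workflow names):

```python
import time
from inference_sdk import InferenceHTTPClient

client = InferenceHTTPClient(api_url="http://localhost:9001", api_key="YOUR_API_KEY")

# The first call pays the load cost; later calls should be fast even with different images.
for i, image in enumerate(["first.jpg", "second.jpg", "third.jpg"], start=1):
    start = time.perf_counter()
    client.run_workflow(
        workspace_name="your-workspace",  # placeholder
        workflow_id="your-workflow-id",   # placeholder
        images={"image": image},
    )
    print(f"run {i}: {time.perf_counter() - start:.1f}s")
```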

In the case of the 4060, it might be a poor internet connection? I can check, but 3 minutes sounds far too long.

@Crisaq - one other thing: if you are using models like SAM or Florence, those are quite large models (several GB) that need to be loaded.
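If that cold start becomes a problem in practice, one option is to fire a throwaway warm-up request when your application starts, so the heavy models are already loaded before real traffic arrives. This is just a sketch under the same assumptions as above; the tiny placeholder image and helper name are illustrative, and it assumes the client accepts an in-memory numpy array as it does file paths:

```python
import numpy as np
from inference_sdk import InferenceHTTPClient

client = InferenceHTTPClient(api_url="http://localhost:9001", api_key="YOUR_API_KEY")

def warm_up_workflow(workspace: str, workflow_id: str) -> None:
    # Run the workflow once on a tiny black image so large models (e.g. SAM, Florence)
    # are loaded before the first real request.
    placeholder = np.zeros((64, 64, 3), dtype=np.uint8)
    client.run_workflow(
        workspace_name=workspace,
        workflow_id=workflow_id,
        images={"image": placeholder},
    )

warm_up_workflow("your-workspace", "your-workflow-id")
```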

That makes sense then. It is usually not an issue, but I was wondering if I was the only one, since I couldn't find anything about it.
In the other case I believe I was using a Workflow with both Florence and SAM, so maybe the long wait makes sense.
Thank you very much for your answers!

One thing that might help is our Dedicated Deployments functionality, which lets you spin up a GPU pre-configured with your workflow. It will still have that initial delay, but it will then stay “warm” for as long as you need it.
