Docker roboflow/roboflow-inference-server-cpu leads to OOM after processing many inferences

I started a Docker container from the roboflow/roboflow-inference-server-cpu image on my server (Ubuntu 20.04.3, Docker 24.0.6).

The command I used to start it was:
docker run --name sofia-docker --net=host --mount source=roboflow,target=/cache roboflow/roboflow-inference-server-cpu
When I test it, it works fine, but after many requests (around 10 thousand), this specific image (roboflow) keeps increasing its memory and disk usage until it runs out of memory. If I delete the container, the memory is freed.

This happens only when processing a large number of inferences.

When I searched for information, I only found reports about uvicorn hanging and consuming a lot of memory, but in my case this is happening inside the roboflow/roboflow-inference-server-cpu container.

I would appreciate any help with this, because I cannot keep restarting the container just to work around it.

Hi @gm_eco

I don't have a lot of experience with Docker, but I do have experience with resource-heavy applications, and your problem sounds very familiar to me. In my case it was mostly the resource limitations of my machine: I had to upgrade my RAM, CPU, and GPU to scale up the project.

Again, I can't say for sure, but it really sounds like Docker is crashing due to a lack of resources.

Hi @gm_eco! Recent versions of our inference package have some new caching logic which could be causing a problem here. Thanks for posting! I'll try to reproduce the issue and post back here with some answers.

Hi @happypentester

Thanks for the reply. That was my first idea too, but Roboflow's blog says "The minimum prerequisite for running a Roboflow Inference Server is to have Docker installed on your target device." (Launch: Updated Roboflow Inference Server), and I haven't found any other information about minimum requirements.

I cannot upgrade my server's RAM and disk for now, unless it becomes necessary. First I will try limiting the container's memory and disk with the options --memory="30g" --storage-opt size=15G and see what happens.
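As a side note for anyone following along: docker run options must come before the image name, or they are passed to the container's entrypoint instead. A sketch of the resource-limited invocation, reusing the command from the start of this thread (and assuming a storage driver that supports per-container size limits, such as overlay2 on xfs with pquota):

```shell
docker run --name sofia-docker --net=host \
  --memory="30g" \
  --storage-opt size=15G \
  --mount source=roboflow,target=/cache \
  roboflow/roboflow-inference-server-cpu
```

Note that --memory caps the container but does not fix a leak; the process will be OOM-killed at the limit instead of exhausting the host.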

Hi @Paul, thanks for the reply. OK.
If you need any other information to reproduce this issue, let me know and I will send it.

Hi @gm_eco! I haven't been able to pinpoint the exact cause of any memory leak, nor have I been able to reproduce it with a simple inference setup (hitting my locally running server in an infinite loop). Can you provide more details on your setup? What model(s) are you using? What does your client code look like? Does your client make concurrent requests, or are they serial?

In the meantime, I’ve just published a new release 0.8.9 that includes the ability to disable the new caching logic by setting the environment variable DISABLE_INFERENCE_CACHE=true on your docker run command. We’ll continue trying to find any memory leaks but let me know if this new setting solves your problem as that will tell us we are on the right track with our debugging.
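For reference, a sketch of how that variable might be passed, reusing the docker run command from earlier in this thread with Docker's standard -e flag (placed before the image name):

```shell
docker run --name sofia-docker --net=host \
  --mount source=roboflow,target=/cache \
  -e DISABLE_INFERENCE_CACHE=true \
  roboflow/roboflow-inference-server-cpu
```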

Hi @Paul, thanks for the reply.

My setup is like yours: I coded a loop in Python that POSTs to the container (following the docs, CPU - Roboflow Docs) using the requests module. I pass an image, and after every POST I sleep for 300 ms. The code I use in the loop looks like this:

    import requests

    def pred_request(path_img, confidence, overlap, model_version, model_proyect):
        base_url = "http://localhost:9001"
        api_key = "yourkey"
        url = f"{base_url}/{model_proyect}/{model_version}"

        params = {
            "api_key": api_key,
            "confidence": confidence,
            "overlap": overlap,
        }
        # Open the image for each request and let the context manager close it
        with open(path_img, "rb") as f:
            res = requests.post(url, params=params, files={"file": f})
        return res.json()


    import time

    # extract frames from the video to send
    result = []
    for frame in video:
        result.append(pred_request(frame, confidence, overlap, model_version, model_proyect))
        time.sleep(0.3)  # 300 ms pause between requests

    # process result

All requests are serial (after one request finishes, I send the next).
I use only a single custom object-detection model (electrical objects like insulators, cut strands, etc.).

I pulled the new image and tried it with the new parameter:

sudo docker run --name sofia-docker --net=host --mount source=roboflow,target=/cache --memory="30g" --storage-opt size=15G -e DISABLE_INFERENCE_CACHE=true roboflow/roboflow-inference-server-cpu

and disk space no longer grows like before, but RAM usage still increases and is not freed afterwards (just as described at the start of this discussion).

On the server, I noticed only a few processes related to uvicorn; they consume RAM constantly, and after 3 hours (images attached) memory usage grew from 2 GB to 8 GB, and only for the processes belonging to the container (if I stop the container, that memory is freed). The total number of POST requests sent was around 12 thousand in those 3 hours.
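For anyone wanting to track this growth over time, a simple sketch using Docker's standard stats subcommand (container name taken from the commands above; interval and log file are arbitrary choices):

```shell
# Sample the container's memory usage once a minute and append it to a log.
while true; do
  docker stats --no-stream --format "{{.Name}} {{.MemUsage}}" sofia-docker >> mem.log
  sleep 60
done
```

Plotting mem.log against the request count makes it easy to see whether memory grows linearly with the number of inferences (a leak) or plateaus (a cache filling up).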

Hi @gm_eco, I was able to reproduce the bug, and a fix will be included in the next release. Thanks for all the info you provided!

Hi @Paul, that is great :smiley:
Thank you and the whole Roboflow team.

I'm waiting for the next release to test it :smiley:

This topic was automatically closed 10 days after the last reply. New replies are no longer allowed.