I have a workflow (doors-and-windows-sahi) set up to detect objects using YOLOv11 Object Detection (Accurate) + SAHI, which was working via the API (https://serverless.roboflow.com/glazier-software/workflows/doors-and-windows-sahi) until about a week or two ago. Since then I get a 500 response with the following body; it still works via the playground/sandbox UI (Run Workflow).
{"message":"Error in execution: Non-zero status code returned while running FusedMatMul node. Name:'/model.10/m/m.0/attn/MatMul/MatmulTransposeFusion//MatMulScaleFusion/' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 632320000\n","error_type":"StepExecutionError","context":"workflow_execu…/m.0/attn/MatMul/MatmulTransposeFusion//MatMulScaleFusion/' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 632320000\n","blocks_errors":[{"block_id":"model","block_type":"roboflow_core/roboflow_object_detection_model@v1","block_details":null,"property_name":null,"property_details":null}]}
Project Type: Object Detection
Operating System & Browser:
Project Universe Link or Workspace/Project ID: doors-windows-9k-givyl/1
Do you grant Roboflow Support permission to access your Workspace for troubleshooting? (Yes/No): Yes
Good Morning @Kevin_Gorski!
My name is Ford, and I'm a Support Engineer at Roboflow. Happy to help here!
To help me triage further, how are you running inference on your workflow when you encounter this error? Via the in-app "Test Workflow", Serverless API, dedicated deployment, or self-hosted inference?
Additionally, are you running inference on images, videos, or live video streams?
Good Afternoon @Kevin_Gorski,
Unfortunately, I was unable to replicate this error when running inference on your workflow via the Serverless V2 API.
To help me triage further, can you please provide a sample image for testing that consistently generates the error? Also, what resolution/size are the images that you are passing to inference? Finally, does inference fail on every image passed or only a subset?
Any additional details about your setup, including the script you're using to call the API, would also be helpful. Thank you!
Good Morning @Kevin_Gorski!
To help me triage further, do you encounter this error if you resize the 6000x4000 image to 2048 x 1365 (<=2048px) before passing the image to inference?
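If it helps, here is a minimal sketch of the client-side downscaling step. The `fit_within` helper (a name I'm making up for illustration) just computes the target dimensions; the actual resize can be done with any image library, e.g. Pillow's `Image.thumbnail`, which shrinks in place while preserving aspect ratio.

```python
def fit_within(width: int, height: int, max_side: int = 2048) -> tuple[int, int]:
    """Return dimensions scaled so the longest side is at most max_side,
    preserving aspect ratio. Images already small enough are unchanged."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    return round(width * scale), round(height * scale)


# A 6000x4000 image maps to 2048x1365:
print(fit_within(6000, 4000))  # -> (2048, 1365)
```

With Pillow, the equivalent one-liner before upload would be `img.thumbnail((2048, 2048))` followed by `img.save(...)`.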
Trying it again, the small image is succeeding and the original is still failing:
Error in execution: Non-zero status code returned while running Conv node. Name:'/model.1/conv/Conv' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 1153433600\n","error_type":"StepExecutionError","context":"workflow_execution | step_execution","inner_error_type":"str","inner_error_message":"Error in execution: Non-zero status code returned while running Conv node. Name:'/model.1/conv/Conv' Status Message: /onnxruntime_src/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 1153433600\n","blocks_errors":[{"block_id":"model","block_type":"roboflow_core/roboflow_object_detection_model@v1","block_details":null,"property_name":null,"property_details":null}]}
We're now running an automated test with the original image at the top of every hour, and it seems to start failing around noon Mountain time. Not sure if that's because of other traffic/work being done on the same (Roboflow) infrastructure; we're still making a very low volume of calls overall.
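For what it's worth, in our hourly probe we label each response before logging it, so the memory-allocation failures are easy to separate from unrelated errors. A sketch of that helper (the function name and labels are our own, not anything from the Roboflow SDK), keying on the "Failed to allocate memory" marker that appears in both error bodies above:

```python
import json

MEMORY_ERROR_MARKER = "Failed to allocate memory"

def classify_response(status_code: int, body: str) -> str:
    """Label a workflow API response for the hourly log:
    'ok', 'memory_error', or 'other_error'."""
    if status_code == 200:
        return "ok"
    try:
        # Error bodies are JSON with a top-level "message" field.
        message = json.loads(body).get("message", "")
    except json.JSONDecodeError:
        message = body  # fall back to raw text (e.g. gateway errors)
    return "memory_error" if MEMORY_ERROR_MARKER in message else "other_error"
```

Plotting the labels against time of day is how we spotted the around-noon pattern.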