Issue with running Workflows Offline

I’m trying to run a Roboflow Workflow completely offline using a local Inference Server and the Python inference-sdk.

I have the Inference Server running locally on http://localhost:9001, and the workflow executes successfully the first time while I’m connected to the internet. However, after disconnecting from the internet, subsequent runs fail. The error I get is:

TypeError: argument of type 'NoneType' is not iterable
WARNING [inference_sdk.webrtc.session] Error calling global data handler
Traceback (most recent call last):
  File "D:\Projects\roboflow\venv\Lib\site-packages\inference_sdk\webrtc\session.py", line 839, in _on_data_message
    self._invoke_data_handler(
  File "D:\Projects\roboflow\venv\Lib\site-packages\inference_sdk\webrtc\session.py", line 534, in _invoke_data_handler
    handler(value, metadata)
  File "D:\Projects\roboflow\workflows_sdk_method.py", line 36, in on_data
    if VIDEO_OUTPUT and VIDEO_OUTPUT in data:
                        ^^^^^^^^^^^^^^^^^^^^

As soon as I reconnect to the internet, the exact same code works again. My understanding is that workflows can be deployed locally and should be able to run offline after the first run, once the workflow has been cached.

I’m using the Python code generated from the Workflow’s Deploy menu without any modifications:

import cv2
import base64
import numpy as np
from inference_sdk import InferenceHTTPClient
from inference_sdk.webrtc import VideoFileSource, StreamConfig, VideoMetadata

client = InferenceHTTPClient.init(
    api_url="http://localhost:9001",
    api_key="*************"
)

source = VideoFileSource("13752035_960_540_50fps.mp4", realtime_processing=False)  # Buffer and process all frames

VIDEO_OUTPUT = "output_image"
DATA_OUTPUTS = ["predictions","model_id"]

config = StreamConfig(
    stream_output=[], # We request all data via data_output for video files
    data_output=["output_image","predictions","model_id"]
)

session = client.webrtc.stream(
    source=source,
    workflow="rf-detr-people-masking-1780246694135",
    workspace="dikshants-blog-workspace",
    image_input="image",
    config=config
)

frames = []

@session.on_data()
def on_data(data: dict, metadata: VideoMetadata):
    # print(f"Frame {metadata.frame_id} predictions: {data}")
    
    if VIDEO_OUTPUT and VIDEO_OUTPUT in data:
        timestamp_ms = metadata.pts * metadata.time_base * 1000
        img = cv2.imdecode(np.frombuffer(base64.b64decode(data[VIDEO_OUTPUT]["value"]), np.uint8), cv2.IMREAD_COLOR)
        frames.append((timestamp_ms, metadata.frame_id, img))
        print(f"Processed frame {metadata.frame_id}")
    else:
        print(f"Processed frame {metadata.frame_id} (data only)")

session.run()

if VIDEO_OUTPUT and len(frames) > 0:
    # Stitch frames into output video
    frames.sort(key=lambda x: x[1])
    fps = (len(frames) - 1) / ((frames[-1][0] - frames[0][0]) / 1000) if len(frames) > 1 else 30.0
    h, w = frames[0][2].shape[:2]
    out = cv2.VideoWriter("output.mp4", cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))
    for _, _, frame in frames:
        out.write(frame)
    out.release()
    print(f"Done! {len(frames)} frames at {fps:.1f} FPS -> output.mp4")
elif VIDEO_OUTPUT:
    print("No video frames collected.")
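
A note on the traceback: `argument of type 'NoneType' is not iterable` on the line `VIDEO_OUTPUT in data` means the handler was called with `data` set to `None` rather than a dict. The sketch below shows one way the membership test could be guarded. The helper name `get_video_payload` is hypothetical and not part of inference-sdk. It only stops the handler from crashing; it does not make the workflow return results while offline.

```python
from typing import Any, Optional

VIDEO_OUTPUT = "output_image"


def get_video_payload(data: Optional[dict], key: str = VIDEO_OUTPUT) -> Optional[Any]:
    """Return data[key] if it exists, or None when the payload is missing.

    Hypothetical helper for illustration; not part of inference-sdk.
    """
    # `key in data` raises TypeError when data is None, so check the type first.
    if not key or not isinstance(data, dict):
        return None
    return data.get(key)
```

Inside `on_data`, the check `if VIDEO_OUTPUT and VIDEO_OUTPUT in data:` could then become `payload = get_video_payload(data)` followed by `if payload is not None:`, so that frames arriving without data fall through to the "data only" branch instead of raising.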

Thanks in advance for any suggestions or guidance on this!

Hi @Dikshant_Shah ,

Offline / air-gapped deployments are an Enterprise feature. Are you trying this from an Enterprise workspace? For Enterprise/offline deployments, the usual flow is:

  1. Run Roboflow Inference with a persistent cache mounted at /tmp/cache.

  2. Use Roboflow Secure Gateway / legacy License Server if the Inference Server itself cannot reach Roboflow directly.

  3. Run the workflow/model once while connectivity is available so the workflow definition, model metadata, weights, and authorization lease are cached.

  4. After that, cached inference can run offline until the cached lease expires.
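
As a rough illustration of step 1, the sketch below builds a `docker run` command that mounts a host directory at /tmp/cache so the cache survives container restarts. The helper name, the host directory, and the image tag `roboflow/roboflow-inference-server-cpu:latest` are assumptions for illustration; check the image and any extra environment variables against the Enterprise deployment docs before relying on it.

```python
from pathlib import Path
from typing import List


def build_inference_server_command(
    cache_dir: str,
    image: str = "roboflow/roboflow-inference-server-cpu:latest",  # assumed image tag
    host_port: int = 9001,
) -> List[str]:
    """Build a docker command that persists the server cache on the host.

    Hypothetical helper for illustration; it only assembles the argument list.
    """
    host_cache = Path(cache_dir).expanduser().resolve()
    return [
        "docker", "run", "--rm",
        # Expose the server on the same port the client code points at.
        "-p", f"{host_port}:9001",
        # Mount the host directory at /tmp/cache so cached data outlives the container.
        "-v", f"{host_cache}:/tmp/cache",
        image,
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Keeping the cache on a host volume matters here because anything cached only inside the container is lost when the container is removed.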

Docs on this:

Got it, so an Enterprise workspace is required to run workflows fully locally, where the workflow is cached after the first inference. I was running it on a non-Enterprise workspace (Starter Plan).

Thank you so much!