Pretty much what the title says. I am running inference using the inference pipeline with custom logic, and now I want to show the results in the browser. I have been searching on Roboflow for many days but didn't find anything helpful.
I am not doing this over the internet, I just want this on my local network.
Hi @Mubashir_Waheed,
Thank you for giving inference a try!
When using inference_pipeline you can specify a custom sink where you can perform your custom logic based on the predictions returned by the pipeline.
from typing import Any, Dict, Union
import cv2 as cv
import supervision as sv
from inference.core.interfaces.camera.entities import VideoFrame
from inference.core.interfaces.stream.inference_pipeline import InferencePipeline
from inference.core.managers.base import ModelManager
from inference.core.registries.roboflow import (
    RoboflowModelRegistry,
)
from inference.models.utils import ROBOFLOW_MODEL_TYPES
model_registry = RoboflowModelRegistry(ROBOFLOW_MODEL_TYPES)
model_manager = ModelManager(model_registry=model_registry)
box_annotator = sv.BoundingBoxAnnotator()
label_annotator = sv.LabelAnnotator()
def custom_sink(prediction: Dict[str, Union[Any, sv.Detections]], video_frame: VideoFrame) -> None:
    detections = sv.Detections.from_inference(prediction)
    labels = [f"#{class_name}" for class_name in detections["class_name"]]
    annotated_frame = box_annotator.annotate(
        video_frame.image.copy(), detections=detections
    )
    annotated_frame = label_annotator.annotate(
        annotated_frame, detections=detections, labels=labels
    )
    cv.imshow("", annotated_frame)
    cv.waitKey(1)


pipeline = InferencePipeline.init(
    video_reference="/path/to/my_file.mp4",
    model_id="my_model_id",
    on_prediction=custom_sink,
    api_key="some_api_key",
)
pipeline.start()
pipeline.join()
In the above example we are annotating the frame in order to show it using cv.imshow. custom_sink is the best place to start, but bear in mind you might want to design a more robust solution if you plan to send frames over the internet (i.e. you might want to implement a thread responsible for sending frames over the wire, and use a Queue to share frames between custom_sink and that thread).
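For illustration, a minimal sketch of that queue-plus-thread idea could look like the snippet below (the sender thread and the send_frame_somewhere helper are hypothetical placeholders for whatever transport you choose):

import threading
from queue import Queue, Full

frame_queue = Queue(maxsize=10)

def queue_sink(prediction, video_frame) -> None:
    # annotate exactly as in custom_sink above, then hand the frame off without blocking
    detections = sv.Detections.from_inference(prediction)
    annotated_frame = box_annotator.annotate(video_frame.image.copy(), detections=detections)
    try:
        frame_queue.put_nowait(annotated_frame)
    except Full:
        pass  # drop the frame rather than stall the pipeline

def sender_loop() -> None:
    while True:
        frame = frame_queue.get()
        send_frame_somewhere(frame)  # hypothetical: your HTTP/WebSocket/RTSP transport

threading.Thread(target=sender_loop, daemon=True).start()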
Hope this helps,
Grzegorz
Until this point I have it working. I can show the annotated video in a resizable window using OpenCV. Now I want to display the same stream on the web. Note that I am not sending the stream over the internet; I have it on my local network and plan to deploy on an edge device.
The part I am finding difficult is sending the frames/stream to streamlit so that the video can be displayed on the web.
Hi, I have not used streamlit myself, so I can only guess they have a REST API where you can post images. Were you able to send an image in isolation (without inference)? Like, read an image with opencv or PIL and then send it to streamlit so it's available for further processing?
I want to know how video frames can be passed to streamlit, or to phrase it differently: how can I extract the annotated frames from the inference pipeline instead of just showing them using OpenCV?
Frames are available in the callback; in my example the callback method is called custom_sink - here both frames and predictions are available, so annotation can be done. There are no limits to what can be done within the callback method - a request can be performed, the frame can be stored, the frame can be passed to a thread, etc. In my example I simply annotate the frame and show it using cv.imshow.
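As a small illustration of that flexibility (a hedged sketch; the output directory and the endpoint URL are made-up examples), a sink could for instance persist each frame to disk and forward the raw prediction to a service on your network:

from pathlib import Path

import cv2 as cv
import requests

OUTPUT_DIR = Path("annotated_frames")  # assumed local directory
OUTPUT_DIR.mkdir(exist_ok=True)

def storing_sink(prediction, video_frame) -> None:
    # store the raw frame to disk, named by its frame id
    cv.imwrite(str(OUTPUT_DIR / f"{video_frame.frame_id}.jpg"), video_frame.image)
    # forward the prediction to a hypothetical endpoint on the local network
    requests.post(
        "http://192.168.1.50:8000/predictions",
        json={"frame_id": video_frame.frame_id, "prediction": prediction},
    )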
Can you provide a bit of high level information?
I understand the part where you want to perform inference and then show it in the browser; what I'm missing is the big picture. From your response I understand the callback method is not meeting your requirements, or we have a misunderstanding about what kind of feature can be developed with it. If you share your use case I might be able to assist better.
Let me explain. I have a few cameras from which I am getting a feed via RTSP URLs, and after running inference on my laptop (for now) I want to display the annotated video on the web so that any system on the network can view the stream using the IP address.
For displaying the feed on the web I am planning to use streamlit since it is easy to pick up and plug in.
The stream needs to be real time (18-20 fps). I am getting around 21 fps when I simply show the stream using OpenCV (similar to your example above) after running inference.
If I am still not clear please let me know.
I am going down the edge deployment route, and to show the results I will use the web.
Thank you!
Looking at the streamlit docs, I guess you are looking to use their video player? Or are you using another component?
I was able to knock up a very quick demo where I stream a video file frame by frame (fps was quite low though):
import streamlit as st
import cv2
def stream_video(f):
    cap = cv2.VideoCapture(f)
    frame_placeholder = st.empty()
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break
        frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        frame_placeholder.image(frame_rgb, channels='RGB')
    cap.release()


f = st.text_input("provide file path")
if f:
    stream_video(f)
I'll post an example of how you can integrate inference_pipeline with this.
Yes, I am using the video player. I passed the raw stream (exactly like your code) without any inference and was getting 6-7 fps.
Thanks for mentioning that you will post the example with inference pipeline as well.
OK, so here is how you can stream frames from inference_pipeline to streamlit. The example below comes with a disclaimer: it is not production-ready code and its purpose is to give directions so you can develop your own solution.
import asyncio
from typing import Any, Dict, Union
import cv2 as cv
# it seems event loop is not being set up in the Streamlit script's execution context
try:
    asyncio.get_event_loop()
except Exception as exc:
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
from inference.core.interfaces.camera.entities import VideoFrame
from inference.core.interfaces.stream.inference_pipeline import InferencePipeline
from inference.core.managers.base import ModelManager
from inference.core.registries.roboflow import (
    RoboflowModelRegistry,
)
from inference.models.utils import ROBOFLOW_MODEL_TYPES
import supervision as sv
import streamlit as st
frame_placeholder = st.empty()
f = st.text_input("file path or stream rtmp address")
model_registry = RoboflowModelRegistry(ROBOFLOW_MODEL_TYPES)
model_manager = ModelManager(model_registry=model_registry)
box_annotator = sv.BoundingBoxAnnotator()
label_annotator = sv.LabelAnnotator()
def custom_sink(prediction: Dict[str, Union[Any, sv.Detections]], video_frame: VideoFrame) -> None:
    detections = sv.Detections.from_inference(prediction)
    labels = [f"#{class_name}" for class_name in detections["class_name"]]
    annotated_frame = box_annotator.annotate(
        video_frame.image.copy(), detections=detections
    )
    annotated_frame = label_annotator.annotate(
        annotated_frame, detections=detections, labels=labels
    )
    frame_rgb = cv.cvtColor(annotated_frame, cv.COLOR_BGR2RGB)
    frame_placeholder.image(frame_rgb)


if f:
    pipeline = InferencePipeline.init(
        video_reference=f,
        model_id="<your model ID>",
        on_prediction=custom_sink,
        api_key="<secret>"
    )
    pipeline.start()
    pipeline.join()
I was able to stream frames one by one from a local file, can you test with your RTSP cameras?
Can you please explain how frames are passed through this to streamlit?
I'm new to streamlit, so take my explanations with a grain of salt.
I understand streamlit is a neat tool allowing users to focus on their data workloads, while UI/hosting is done automagically! I love the concept!
The way streamlit knows what components to put on the screen is based on the variables declared in the script. So here I declare an input field where the stream address can be provided:
f = st.text_input("provide file path")
And here I declare a placeholder where the data will be displayed:
frame_placeholder = st.empty()
streamlit does the heavy lifting itself when it comes to actually transferring the data to the client and showing it in a meaningful way (I have not dug into their internals, so please consult their documentation). The only thing I know is that the line below transfers a numpy array and displays it as a frame, and streamlit also takes care of refreshing:
frame_placeholder.image(frame_rgb)
I see people provide 3rd-party plugins you might want to explore, but hey, I'm not streamlit support!
Good luck! Please do not hesitate to drop more questions if you have any problems with inference!
Thanks for responding, but this doesn't solve the problem.
It just adds a series of images (frames) on the web page instead of the video.
With the inference pipeline I can't just pop the frame out of on_prediction, which can be done using OpenCV, example below:
def get_video_frames():
    cap = cv2.VideoCapture(0)  # Change 0 to your RTSP URL if needed
    while True:
        ret, frame = cap.read()
        if not ret:
            break
        yield frame
    cap.release()
Also, I did post on the streamlit forum, but it is dead and I don't have high hopes of getting an answer.
I checked the docs but sadly found nothing.
What should I do?
Hi @Mubashir_Waheed,
It just adds a series of images (frames) on the web page instead of the video
So from what you are saying, it's not enough to show frames the way shown in the example. Can you explain what you mean?
In your other message you write:
I want to display the annotated video on the web so that any system on the network can view the stream using the IP address
So, you are actually looking to re-stream annotated frames so you can attach your other devices to that stream? If so, then I think streamlit is not the tool you are looking for; you would probably use mediamtx, which you can configure to accept streams.
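As a rough illustration of that idea (a minimal sketch, not production code; it assumes mediamtx is already running and listening on rtsp://localhost:8554, that ffmpeg is installed, and the /annotated path plus the frame size and rate are made up for the example), the sink could pipe annotated frames into an ffmpeg process that publishes them as an RTSP stream:

import subprocess

import numpy as np

WIDTH, HEIGHT, FPS = 1280, 720, 20  # assumed size and rate of your annotated frames

# ffmpeg reads raw BGR frames from stdin and publishes them to mediamtx over RTSP
ffmpeg_process = subprocess.Popen(
    [
        "ffmpeg",
        "-f", "rawvideo",
        "-pix_fmt", "bgr24",
        "-s", f"{WIDTH}x{HEIGHT}",
        "-r", str(FPS),
        "-i", "-",
        "-c:v", "libx264",
        "-preset", "ultrafast",
        "-tune", "zerolatency",
        "-f", "rtsp",
        "rtsp://localhost:8554/annotated",
    ],
    stdin=subprocess.PIPE,
)

def publish_frame(annotated_frame: np.ndarray) -> None:
    # call this from your sink; frames must match WIDTH x HEIGHT
    ffmpeg_process.stdin.write(annotated_frame.tobytes())

Any device on the network could then open rtsp://<your-ip>:8554/annotated in a player such as VLC.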
Also I did post on the streamlit forum
OK, here is a gif showing that frames are inserted into the web page as images after each inference loop.
Here is the code:
import streamlit as st
st.write("Hello world")
frame_placeholder = st.empty()
class CustomSink:
    # other methods

    def on_prediction(self, result: dict, frame: VideoFrame) -> None:
        self.fps_monitor.tick()
        fps = self.fps_monitor.fps
        detections = sv.Detections.from_ultralytics(result)
        detections = detections[find_in_list(detections.class_id, self.classes)]
        detections = self.tracker.update_with_detections(detections)
        annotated_frame = frame.image.copy()
        annotated_frame = sv.draw_text(
            scene=annotated_frame,
            text=f"{fps:.1f}",
            text_anchor=sv.Point(40, 30),
            background_color=sv.Color.from_hex("#A351FB"),
            text_color=sv.Color.from_hex("#000000"),
        )
        # further pre-processing here
        # should add the frame inside a video player
        st.image(annotated_frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            cv2.destroyAllWindows()
            raise SystemExit("Program terminated by user")
So, you are actually looking to re-stream annotated frames so you can attach your other devices to that stream?
Sorry, I don't understand the phrasing of this.
Just assume that instead of running on my laptop, the stream will run on a Jetson Nano, which after running inference will somehow display the frames on a webpage (maybe using streamlit), and every time I want to see the stream I will just go to that web page.
This is from the streamlit docs:
import streamlit as st
video_file = open("myvideo.mp4", "rb")
video_bytes = video_file.read()
st.video(video_bytes)
As much as I understand, if I am able to pop the inferred frame out of the pipeline loop and then just pass it to st.video, it will work.
But how?
Hi @Mubashir_Waheed ,
When I run my demo I see the result below:
Is this what you are looking to achieve?
Yes, that is precisely what I want, with good fps and without having to add the URL to see the stream.
I am getting around 20-21 fps with the pipeline and OpenCV combo. I am not sure what I will get with the pipeline and streamlit combo.
Edit: I tried your approach and I am getting around 8-10 fps. I wonder why the fps dropped.
@Mubashir_Waheed, if you don't want to provide the URL through the input, you can replace this line:
f = st.text_input("file path or stream rtmp address")
with this:
f = "your-url"
Edit: I tried your approach and I am getting around 8-10 fps. I wonder why the fps dropped.
When showing frames with cv.imshow, the frames do not have to be transferred over the network. I guess st.video might be implementing some tricks like encoding the stream as h264; here we are sending individual frames, which is quite costly from an IO perspective. I would guess you should see fps getting higher for lower-resolution frames.
You can address this in a bit of a hacky way; in my example, update the function below:
def custom_sink(prediction: Dict[str, Union[Any, sv.Detections]], video_frame: VideoFrame) -> None:
    detections = sv.Detections.from_inference(prediction)
    labels = [f"#{class_name}" for class_name in detections["class_name"]]
    annotated_frame = box_annotator.annotate(
        video_frame.image.copy(), detections=detections
    )
    annotated_frame = label_annotator.annotate(
        annotated_frame, detections=detections, labels=labels
    )
    frame_rgb = cv.cvtColor(annotated_frame, cv.COLOR_BGR2RGB)
    frame_placeholder.image(frame_rgb)
with:
def custom_sink(prediction: Dict[str, Union[Any, sv.Detections]], video_frame: VideoFrame) -> None:
    if video_frame.frame_id % 2:
        return
    detections = sv.Detections.from_inference(prediction)
    labels = [f"#{class_name}" for class_name in detections["class_name"]]
    annotated_frame = box_annotator.annotate(
        video_frame.image.copy(), detections=detections
    )
    annotated_frame = label_annotator.annotate(
        annotated_frame, detections=detections, labels=labels
    )
    frame_rgb = cv.cvtColor(annotated_frame, cv.COLOR_BGR2RGB)
    frame_placeholder.image(frame_rgb)
The above change will result in only every 2nd frame being sent over the network.
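Another thing worth trying (a small sketch I have not benchmarked; the 640-pixel target width is an arbitrary assumption) is downscaling the annotated frame before handing it to streamlit, so each frame carries less data over the wire:

    # inside custom_sink, after annotating:
    target_width = 640  # assumed value; tune for your bandwidth/quality trade-off
    scale = target_width / annotated_frame.shape[1]
    resized = cv.resize(annotated_frame, None, fx=scale, fy=scale)
    frame_rgb = cv.cvtColor(resized, cv.COLOR_BGR2RGB)
    frame_placeholder.image(frame_rgb)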
@Grzegorz
Using f with a hardcoded URL works.
I tried sending every second frame using the hacky way, but there was no improvement.