Pretty much what the title says. I am running inference using the inference pipeline with custom logic, and now I want to show the results in the browser. I have been searching on Roboflow for many days but didn't find anything helpful.
I am not doing this over the internet, I just want this on my local network.
Hi @Mubashir_Waheed,
Thank you for giving inference a try!
When using inference_pipeline you can specify a custom sink where you can perform your custom logic based on the predictions returned by the pipeline.
from typing import Any, Dict, Union
import cv2 as cv
import supervision as sv
from inference.core.interfaces.camera.entities import VideoFrame
from inference.core.interfaces.stream.inference_pipeline import InferencePipeline
from inference.core.managers.base import ModelManager
from inference.core.registries.roboflow import (
    RoboflowModelRegistry,
)
from inference.models.utils import ROBOFLOW_MODEL_TYPES
model_registry = RoboflowModelRegistry(ROBOFLOW_MODEL_TYPES)
model_manager = ModelManager(model_registry=model_registry)
box_annotator = sv.BoundingBoxAnnotator()
label_annotator = sv.LabelAnnotator()
def custom_sink(prediction: Dict[str, Union[Any, sv.Detections]], video_frame: VideoFrame) -> None:
    detections = sv.Detections.from_inference(prediction)
    labels = [f"#{class_name}" for class_name in detections["class_name"]]
    annotated_frame = box_annotator.annotate(
        video_frame.image.copy(), detections=detections
    )
    annotated_frame = label_annotator.annotate(
        annotated_frame, detections=detections, labels=labels
    )
    cv.imshow("", annotated_frame)
    cv.waitKey(1)


pipeline = InferencePipeline.init(
    video_reference="/path/to/my_file.mp4",
    model_id="my_model_id",
    on_prediction=custom_sink,
    api_key="some_api_key",
)
pipeline.start()
pipeline.join()
In the above example we are annotating the frame in order to show it using cv.imshow. custom_sink is the best place to start, but bear in mind you might want to design a more robust solution if you plan to send frames over the internet (i.e. you might want to implement a thread responsible for sending frames over the wire, and use a Queue to share frames between custom_sink and that thread).
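For illustration, a minimal sketch of that queue-plus-thread idea could look like the snippet below (the sender thread and the send_frame_somewhere helper are hypothetical placeholders for whatever transport you choose):

import threading
from queue import Queue, Full

frame_queue = Queue(maxsize=10)

def queue_sink(prediction, video_frame) -> None:
    # annotate exactly as in custom_sink above, then hand the frame off without blocking
    detections = sv.Detections.from_inference(prediction)
    annotated_frame = box_annotator.annotate(video_frame.image.copy(), detections=detections)
    try:
        frame_queue.put_nowait(annotated_frame)
    except Full:
        pass  # drop the frame rather than stall the pipeline

def sender_loop() -> None:
    while True:
        frame = frame_queue.get()
        send_frame_somewhere(frame)  # hypothetical: your HTTP/WebSocket/RTSP transport

threading.Thread(target=sender_loop, daemon=True).start()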
Hope this helps,
Grzegorz
Until this point I have it working. I can show the annotated video in a resizable window using OpenCV. Now I want to display the same stream on the web. Note that I am not sending the stream over the internet; I have it on my local network and plan to deploy on an edge device.
The part I am finding difficult is sending the frames/stream to streamlit so that the video can be displayed on the web.
Hi, I have not used streamlit myself, so I can only guess they have a REST API where you can post images. Were you able to send an image in isolation (without inference)? Like, read an image with opencv or PIL and then send it to streamlit so it's available for further processing?
I want to know how video frames can be passed to streamlit, or to phrase it differently: how can I extract the annotated frames from the inference pipeline instead of just showing them using OpenCV?
Frames are available in the callback; in my example the callback method is called custom_sink - here both frames and predictions are available, so annotation can be done. There are no limits to what can be done within the callback method - a request can be performed, the frame can be stored, the frame can be passed to a thread, etc. In my example I simply annotate the frame and show it using cv.imshow.
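As a small illustration of that flexibility (a hedged sketch; the output directory and the endpoint URL are made-up examples), a sink could for instance persist each frame to disk and forward the raw prediction to a service on your network:

from pathlib import Path

import cv2 as cv
import requests

OUTPUT_DIR = Path("annotated_frames")  # assumed local directory
OUTPUT_DIR.mkdir(exist_ok=True)

def storing_sink(prediction, video_frame) -> None:
    # store the raw frame to disk, named by its frame id
    cv.imwrite(str(OUTPUT_DIR / f"{video_frame.frame_id}.jpg"), video_frame.image)
    # forward the prediction to a hypothetical endpoint on the local network
    requests.post(
        "http://192.168.1.50:8000/predictions",
        json={"frame_id": video_frame.frame_id, "prediction": prediction},
    )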
Can you provide a bit of high level information?
I understand the part where you want to perform inference and then show it in the browser; what I'm missing is the big picture. From your response I understand the callback method is not meeting your requirements, or we have a misunderstanding about what kind of feature can be developed with it. If you share your use case I might be able to assist better.
Let me explain. I have a few cameras from which I am getting a feed via RTSP URLs, and after running inference on my laptop (for now) I want to display the annotated video on the web so that any system on the network can view the stream using the IP address.
For displaying the feed on the web I am planning to use streamlit since it is easy to pick up and plug in.
The stream needs to be real time (18-20 fps). I am getting around 21 fps when I simply show the stream using OpenCV (similar to your example above) after running inference.
If I am still not clear please let me know.
I am going down the edge deployment route, and to show the results I will use the web.
Thank you!
Looking at the streamlit docs, I guess you are looking to use their video player? Or are you using another component?
I was able to knock up a very quick demo where I stream a video file frame by frame (fps was quite low though):
import streamlit as st
import cv2
def stream_video(f):
    cap = cv2.VideoCapture(f)
    frame_placeholder = st.empty()
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break
        frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        frame_placeholder.image(frame_rgb, channels='RGB')
    cap.release()


f = st.text_input("provide file path")
if f:
    stream_video(f)
I'll post an example of how you can integrate inference_pipeline with this.
Yes, I am using the video player. I passed the raw stream (exactly like your code) without any inference and was getting 6-7 fps.
Thanks for mentioning that you will post the example with inference pipeline as well.
OK, so here is how you can stream frames from inference_pipeline to streamlit. The example below comes with a disclaimer: it is not production-ready code and its purpose is to give directions so you can develop your own solution.
import asyncio
from typing import Any, Dict, Union
import cv2 as cv
# it seems event loop is not being set up in the Streamlit script's execution context
try:
    asyncio.get_event_loop()
except Exception as exc:
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
from inference.core.interfaces.camera.entities import VideoFrame
from inference.core.interfaces.stream.inference_pipeline import InferencePipeline
from inference.core.managers.base import ModelManager
from inference.core.registries.roboflow import (
    RoboflowModelRegistry,
)
from inference.models.utils import ROBOFLOW_MODEL_TYPES
import supervision as sv
import streamlit as st
frame_placeholder = st.empty()
f = st.text_input("file path or stream rtmp address")
model_registry = RoboflowModelRegistry(ROBOFLOW_MODEL_TYPES)
model_manager = ModelManager(model_registry=model_registry)
box_annotator = sv.BoundingBoxAnnotator()
label_annotator = sv.LabelAnnotator()
def custom_sink(prediction: Dict[str, Union[Any, sv.Detections]], video_frame: VideoFrame) -> None:
    detections = sv.Detections.from_inference(prediction)
    labels = [f"#{class_name}" for class_name in detections["class_name"]]
    annotated_frame = box_annotator.annotate(
        video_frame.image.copy(), detections=detections
    )
    annotated_frame = label_annotator.annotate(
        annotated_frame, detections=detections, labels=labels
    )
    frame_rgb = cv.cvtColor(annotated_frame, cv.COLOR_BGR2RGB)
    frame_placeholder.image(frame_rgb)


if f:
    pipeline = InferencePipeline.init(
        video_reference=f,
        model_id="<your model ID>",
        on_prediction=custom_sink,
        api_key="<secret>"
    )
    pipeline.start()
    pipeline.join()
I was able to stream frames one by one from a local file, can you test with your RTSP cameras?
Can you please explain how frames are passed through this to streamlit?
I'm new to streamlit, so take my explanations with a grain of salt.
I understand streamlit is a neat tool allowing users to focus on their data workloads, while UI/hosting is done automagically! I love the concept!
The way streamlit knows what components to put on the screen is based on the variables declared in the script. So here I declare an input field where the stream address can be provided:
f = st.text_input("provide file path")
And here I declare a placeholder where the data will be displayed:
frame_placeholder = st.empty()
streamlit does the heavy lifting itself when it comes to actually transferring the data to the client and showing it in a meaningful way (I have not dug into their internals, so please consult their documentation). The only thing I know is that the line below transfers a numpy array and displays it as a frame, and streamlit also takes care of refreshing:
frame_placeholder.image(frame_rgb)
I see people provide 3rd-party plugins you might want to explore, but hey, I'm not streamlit support!
Good luck! Please do not hesitate to drop more questions if you have any problems with inference!
Thanks for responding, but this doesn't solve the problem.
It just adds a series of images (frames) on the web page instead of the video.
With the inference pipeline I can't just pop the frame out of on_prediction, which can be done using OpenCV, example below:
def get_video_frames():
    cap = cv2.VideoCapture(0)  # Change 0 to your RTSP URL if needed
    while True:
        ret, frame = cap.read()
        if not ret:
            break
        yield frame
    cap.release()
Also, I did post on the streamlit forum, but it is dead and I don't have high hopes of getting an answer.
I checked the docs but sadly found nothing.
What should I do?
Hi @Mubashir_Waheed,
It just adds a series of images (frames) on the web page instead of the video
So from what you are saying, it's not enough to show frames the way shown in the example. Can you explain what you mean?
In your other message you write:
I want to display the annotated video on the web so that any system on the network can view the stream using the IP address
So, you are actually looking to re-stream annotated frames so you can attach your other devices to that stream? If so, then I think streamlit is not the tool you are looking for; you would probably use mediamtx, which you can configure to accept streams.
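As a rough illustration of that idea (a minimal sketch, not production code; it assumes mediamtx is already running and listening on rtsp://localhost:8554, that ffmpeg is installed, and the /annotated path plus the frame size and rate are made up for the example), the sink could pipe annotated frames into an ffmpeg process that publishes them as an RTSP stream:

import subprocess

import numpy as np

WIDTH, HEIGHT, FPS = 1280, 720, 20  # assumed size and rate of your annotated frames

# ffmpeg reads raw BGR frames from stdin and publishes them to mediamtx over RTSP
ffmpeg_process = subprocess.Popen(
    [
        "ffmpeg",
        "-f", "rawvideo",
        "-pix_fmt", "bgr24",
        "-s", f"{WIDTH}x{HEIGHT}",
        "-r", str(FPS),
        "-i", "-",
        "-c:v", "libx264",
        "-preset", "ultrafast",
        "-tune", "zerolatency",
        "-f", "rtsp",
        "rtsp://localhost:8554/annotated",
    ],
    stdin=subprocess.PIPE,
)

def publish_frame(annotated_frame: np.ndarray) -> None:
    # call this from your sink; frames must match WIDTH x HEIGHT
    ffmpeg_process.stdin.write(annotated_frame.tobytes())

Any device on the network could then open rtsp://<your-ip>:8554/annotated in a player such as VLC.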
Also I did post on the streamlit forum
OK, here is a gif showing that frames are inserted into the web page as images after each inference loop.
Here is the code:
import streamlit as st
st.write("Hello world")
frame_placeholder = st.empty()
class CustomSink:
    # other methods

    def on_prediction(self, result: dict, frame: VideoFrame) -> None:
        self.fps_monitor.tick()
        fps = self.fps_monitor.fps
        detections = sv.Detections.from_ultralytics(result)
        detections = detections[find_in_list(detections.class_id, self.classes)]
        detections = self.tracker.update_with_detections(detections)
        annotated_frame = frame.image.copy()
        annotated_frame = sv.draw_text(
            scene=annotated_frame,
            text=f"{fps:.1f}",
            text_anchor=sv.Point(40, 30),
            background_color=sv.Color.from_hex("#A351FB"),
            text_color=sv.Color.from_hex("#000000"),
        )
        # further pre-processing here
        # should add the frame inside a video player
        st.image(annotated_frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            cv2.destroyAllWindows()
            raise SystemExit("Program terminated by user")
So, you are actually looking to re-stream annotated frames so you can attach your other devices to that stream?
Sorry, I don't understand the phrasing of this.
Just assume that instead of running on my laptop, the stream will run on a Jetson Nano, which after running inference will somehow display the frames on a webpage (maybe using streamlit), and every time I want to see the stream I will just go to that web page.
This is from the streamlit docs:
import streamlit as st
video_file = open("myvideo.mp4", "rb")
video_bytes = video_file.read()
st.video(video_bytes)
As much as I understand, if I am able to pop the inferred frame out of the pipeline loop and then just pass it to st.video, it will work.
But how?
Hi @Mubashir_Waheed ,
When I run my demo I see the result below:
Is this what you are looking to achieve?
Yes, that is precisely what I want, with good fps and without having to add the URL to see the stream.
I am getting around 20-21 fps with the pipeline and OpenCV combo. I am not sure what I will get with the pipeline and streamlit combo.
Edit: I tried your approach and I am getting around 8-10 fps. I wonder why the fps dropped.
@Mubashir_Waheed, if you don't want to provide the URL through the input, you can replace this line:
f = st.text_input("file path or stream rtmp address")
with this:
f = "your-url"
Edit: I tried your approach and I am getting around 8-10 fps. I wonder why the fps dropped.
When showing frames with cv.imshow, the frames do not have to be transferred over the network. I guess st.video might be implementing some tricks like encoding the stream as h264; here we are sending individual frames, which is quite costly from an IO perspective. I would guess you should see fps getting higher for lower-resolution frames.
You can address this in a bit of a hacky way; in my example, update the function below:
def custom_sink(prediction: Dict[str, Union[Any, sv.Detections]], video_frame: VideoFrame) -> None:
    detections = sv.Detections.from_inference(prediction)
    labels = [f"#{class_name}" for class_name in detections["class_name"]]
    annotated_frame = box_annotator.annotate(
        video_frame.image.copy(), detections=detections
    )
    annotated_frame = label_annotator.annotate(
        annotated_frame, detections=detections, labels=labels
    )
    frame_rgb = cv.cvtColor(annotated_frame, cv.COLOR_BGR2RGB)
    frame_placeholder.image(frame_rgb)
with:
def custom_sink(prediction: Dict[str, Union[Any, sv.Detections]], video_frame: VideoFrame) -> None:
    if video_frame.frame_id % 2:
        return
    detections = sv.Detections.from_inference(prediction)
    labels = [f"#{class_name}" for class_name in detections["class_name"]]
    annotated_frame = box_annotator.annotate(
        video_frame.image.copy(), detections=detections
    )
    annotated_frame = label_annotator.annotate(
        annotated_frame, detections=detections, labels=labels
    )
    frame_rgb = cv.cvtColor(annotated_frame, cv.COLOR_BGR2RGB)
    frame_placeholder.image(frame_rgb)
The above change will result in only every 2nd frame being sent over the network.
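Another thing worth trying (a small sketch I have not benchmarked; the 640-pixel target width is an arbitrary assumption) is downscaling the annotated frame before handing it to streamlit, so each frame carries less data over the wire:

    # inside custom_sink, after annotating:
    target_width = 640  # assumed value; tune for your bandwidth/quality trade-off
    scale = target_width / annotated_frame.shape[1]
    resized = cv.resize(annotated_frame, None, fx=scale, fy=scale)
    frame_rgb = cv.cvtColor(resized, cv.COLOR_BGR2RGB)
    frame_placeholder.image(frame_rgb)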
@Grzegorz
Using f with a hardcoded URL works.
I tried sending every second frame using the hacky way, but there was no improvement.