Building Intelligent Video Analytics Solutions Using Intel Deep Learning Streamer (Intel DL Streamer)

Video cameras installed in retail stores, smart cities, manufacturing plants, and similar settings are commonly thought of as security tools, used mainly to prevent theft.

But what if you could add an intelligent layer on top of your existing video solution, so that you can count people in a store, recognize vehicle license plates, monitor traffic, and cover many other use cases? This is what we call video analytics.

The global video analytics market size was valued at USD 5.32 billion in 2021. The market is projected to grow from USD 6.35 billion in 2022 to USD 28.37 billion by 2029, exhibiting a compound annual growth rate of 23.8% during the forecast period.

Real-time video analytics has long been a challenge for computer vision applications, since video is an information-intensive medium. Systems can be complex, and environmental inconsistencies make video analysis difficult.

With artificial intelligence, and now deep learning, the opportunities for computer vision applications continue to expand. Deep learning technology is enabling a new generation of video analytics in which people, vehicles, and other objects can be identified frame by frame, and their movement tracked across frames and cameras.

Intel can help you deploy video analytics solutions on your existing infrastructure. Intel CPUs have AI acceleration built in, and with software optimization you can improve the performance of video analytics workloads on that infrastructure, which can remove the need to add AI accelerators.

One of the software tools for this is Intel Deep Learning Streamer (Intel DL Streamer).

The goal of DL Streamer is to enable developers to build optimized streaming media analytics pipelines across Intel hardware, from edge to cloud.

DL Streamer is used to analyze video (and audio) streams in order to detect, classify, track, identify, and count objects, events, and people, for example. The analysis results can be used to take actions, coordinate events, identify patterns, and gain insights.

The DL Streamer framework is based on the GStreamer library. It is a streaming video analytics framework built on standard GStreamer, so its elements interoperate with existing GStreamer pipelines and offer a familiar developer experience.

Video Analytics pipeline by DL Streamer = GStreamer + OpenVINO Inference Engine

[Figure: DL Streamer video analytics pipeline = GStreamer + OpenVINO Inference Engine]

The illustration below shows an example of the general architecture of a video analytics application:

[Figure: general architecture of a video analytics application]

DL Streamer also lets you run each of these elements (as illustrated above) on different hardware (CPU, VPU, or GPU) in order to optimize performance. For example, you can run media processing on the CPU and inference on a VPU or GPU.
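As a rough sketch of what per-element device selection can look like, the shell snippet below assembles a pipeline description in which detection runs on the GPU and classification on the CPU. It uses the gst-launch-1.0 syntax that is explained in the example section further down. The file paths are hypothetical placeholders, and the gvaclassify stage and the videoconvert element before the sink are additions made for this illustration; they are not part of the example later in this post. The snippet only prints the command so that it can be checked before it is run.

```shell
#!/bin/sh
# Hypothetical placeholder paths; replace them with real files.
VIDEO=/path/to/video.mp4
DETECT_MODEL=/path/to/detection.xml
CLASSIFY_MODEL=/path/to/classification.xml

# Each inference element has its own "device" property, so the two
# stages can be assigned to different hardware.
PIPELINE="filesrc location=$VIDEO ! decodebin ! \
gvadetect model=$DETECT_MODEL device=GPU ! queue ! \
gvaclassify model=$CLASSIFY_MODEL device=CPU ! queue ! \
gvawatermark ! videoconvert ! ximagesink"

# Print the full command instead of running it.
echo "gst-launch-1.0 $PIPELINE"
```

Swapping the two device values is then a quick way to compare configurations on a given machine, assuming the matching OpenVINO device plugins are installed.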

Why DL Streamer?

  1. Write less code and get better performance
  2. Quickly develop, optimize, benchmark, and deploy video & audio analytics pipelines in the Cloud and at the Edge
  3. Analyze video and audio streams, create actionable results, capture results, and send them to the cloud
  4. Leverage the efficiency and computational power of Intel hardware platforms

How does everything work?

[Figure: how DL Streamer works]

For more information about Intel OpenVINO and how to convert/optimize the model, please check our blog.

Example of how to build a pipeline

The command line below illustrates how easy it is to build a video analytics pipeline for object detection:

gst-launch-1.0 filesrc location=/path/to/video.mp4 ! \
decodebin ! \
gvadetect model=model1 device=CPU ! queue ! \
gvawatermark ! ximagesink
  • The command line tool gst-launch-1.0 enables developers to describe a media analytics pipeline as a series of connected elements.
  • filesrc - Reads data from a file in the local file system.
    • To use a live video stream instead, replace "filesrc" with a different source element: "urisourcebin" for an RTSP stream, or "v4l2src" for a USB camera.
  • At a high level, the decodebin abstracts the individual operations required to take encoded frames and produce raw video frames suitable for image transformation and inferencing.
  • gvadetect - Runs detection with the Inference Engine from OpenVINO™ Toolkit.
    • model - path to the inference model network file (.xml file)
    • device - device to run inferencing on (CPU, GPU, ...)
  • gvawatermark - Overlays detection and classification results on top of video data.
  • ximagesink - Renders video frames to a drawable on a local or remote display.
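
The source element swap described in the list above can be sketched as a small shell helper. This is an illustration under stated assumptions, not an official script: build_pipeline is a hypothetical function name, the RTSP URL and model path are placeholders, and a videoconvert element is added before the sink to help format negotiation. The helper only prints the command, so it can be inspected before being run.

```shell
#!/bin/sh
# build_pipeline <file|rtsp|usb> <location> <model.xml> <device>
# Hypothetical helper: prints a pipeline description for the given source type.
build_pipeline() {
    kind=$1; location=$2; model=$3; device=$4
    case "$kind" in
        file) src="filesrc location=$location" ;;   # local video file
        rtsp) src="urisourcebin uri=$location" ;;   # network stream
        usb)  src="v4l2src device=$location" ;;     # USB camera, e.g. /dev/video0
        *)    echo "unknown source type: $kind" >&2; return 1 ;;
    esac
    echo "$src ! decodebin ! gvadetect model=$model device=$device ! queue ! gvawatermark ! videoconvert ! ximagesink"
}

# Placeholder URL and model path; replace them with real values.
echo "gst-launch-1.0 $(build_pipeline rtsp rtsp://camera.example/stream /path/to/model.xml CPU)"
```

Only the first element changes between the three cases; the rest of the pipeline is the same one described above.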

Here you can find more tutorials on building object classification, or object detection pipelines.

Intel DL Streamer's promise is to take the hassle out of creating a video analytics pipeline. Learn how to install DL Streamer, and create something wonderful!