Visualizing Deep Learning Annotations - Interactive Video Player

By Abinash S | Technical Blog | April 11, 2024

When it comes to video analytics, visualization is just as important as the core analysis. Pixels and algorithms tell a visual story, with each annotation serving as a key character. The conventional HTML video tag handles basic playback well, but it offers little support for annotations, especially for object detection and tracking.

The Why Behind the What: Annotations

Before we go into visualizations, let’s talk about annotations. Imagine a video capture of a bustling street scene, where each person and car represents a distinct element within a larger visual narrative. In this case, systems capable of detecting and tracking people and objects take the lead.

Models such as YOLO are commonly used for this. They take a video and produce, for each frame, a set of bounding boxes that frame the detected objects or people. The output is typically a data frame that holds each box's coordinates, frame number, object class, and confidence score. The coordinates define the limits of each bounding box, framing every important piece in our scene.

Consider a scenario in which you want precise control over your video annotations. Sometimes you need to highlight specific objects, show all annotations, or show none at all, especially when integrating these features into a website or mobile app. Relying on a Python backend to generate and return a new video for every user action is not optimal, so this flexibility must be available on the client. The catch is that the HTML <video> tag cannot draw these dynamic annotations on its own. This is where our custom solution takes center stage: a canvas element combined with a custom video player.

Custom Video Player

Vue’s reactive features, enhanced by Vuetify’s sleek UI elements, drew me in. The outcome is a video player that not only plays the video but also overlays machine learning annotations seamlessly, stays synchronized with the video frames, and provides custom video controls.

Workflow

Our approach involves merging the <canvas> element with video playback, allowing us to overlay annotations directly on the video. Here is an overview:

Video Tag Creation: To start off, create a video tag that will serve as your video’s source.
Canvas Setup: Align a canvas to match the video’s dimensions.
Rendering Video on Canvas: Keep the video element hidden and redraw its current frame on the canvas each time the browser repaints.
Creating Custom Video Controls: To programmatically control video playback, create a set of custom video controls using CSS and icons.

Note: The purpose of this article is to look at how to build a basic canvas rather than how to set up Vue 3. There are plenty of other excellent blogs on the subject.

The following is an overview of the essential components:

Vue Template

Let’s start by creating the template for our video player component. We will have a hidden video element, a canvas for displaying videos and annotations, and a control button for enabling and disabling annotations and video controls.

Template refs on the component and its tags are critical because they give the script direct access to the underlying video and canvas elements in Vue.

Annotations

We need a backend that sends annotations in a format that JavaScript can handle, ideally JSON. When we load the video, we should also obtain its annotations. A sample annotation would look like the following:
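The original sample is not reproduced here. As a hypothetical illustration only, a payload could carry the fields mentioned earlier (frame number, bounding box coordinates, object class, confidence score) along with some video metadata. Every field name below (`fps`, `frame`, `bbox`, `label`, `score`) and the `[x, y, width, height]` box layout is an assumption, not the author's actual schema. Grouping the boxes by frame number once, when the annotations arrive, keeps the per-frame lookup cheap during playback.

```javascript
// Hypothetical payload: field names and the [x, y, width, height] bbox layout are assumptions.
const payload = {
  fps: 25,
  width: 1280,
  height: 720,
  annotations: [
    { frame: 0, bbox: [100, 150, 80, 200], label: "person", score: 0.91 },
    { frame: 0, bbox: [400, 300, 220, 120], label: "car", score: 0.84 },
    { frame: 1, bbox: [104, 151, 80, 200], label: "person", score: 0.9 },
  ],
};

// Group annotations by frame number so each frame's boxes can be fetched with one Map lookup.
function indexByFrame(annotations) {
  const byFrame = new Map();
  for (const annotation of annotations) {
    if (!byFrame.has(annotation.frame)) {
      byFrame.set(annotation.frame, []);
    }
    byFrame.get(annotation.frame).push(annotation);
  }
  return byFrame;
}

const boxesByFrame = indexByFrame(payload.annotations);
```

With this shape, `boxesByFrame.get(frameNumber)` returns the list of boxes for a frame, or `undefined` when the frame has none.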

Getting started with the States

Vue 3’s composition API provides a more flexible approach to managing state in your components. We begin by defining our reactive states using ref.

Frame Calculation and Rendering

An important component of our player is estimating the current frame number from the current playback time and the frame rate, as the HTML5 video tag does not expose the frame number directly.
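A minimal sketch of that estimate, assuming the frame rate is constant and known (for example, sent by the server with the annotations). The function name `currentFrameIndex` is hypothetical; in the browser, the time would come from the video element's `currentTime` property.

```javascript
// Estimate the zero-based frame index from the playback time (in seconds) and the frame rate.
// The small epsilon absorbs floating-point error when the time sits exactly on a frame boundary.
function currentFrameIndex(currentTime, fps) {
  return Math.floor(currentTime * fps + 1e-6);
}
```

This is only an estimate: for variable frame rate videos, or when the browser's reported time drifts from the decoded frame, the index can be off by a frame.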

To play the video in Canvas, we must first draw each frame in the video and then check to see if any annotations are associated with it. If that’s the case, we’ll need to draw boxes on the canvas using bbox coordinates. Let us create the necessary functions for these things.

The drawFrame method is critical, as it renders each frame onto the canvas and overlays it with annotations. requestAnimationFrame is a browser API, callable from JavaScript, for building smooth, high-performance animations. It tells the browser that you want to run an animation and asks it to call a given function to update the animation before the next repaint.
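The loop described above could be sketched roughly as follows. This is not the author's drawFrame, only an illustration of the pattern under stated assumptions: `video` is an HTMLVideoElement, `ctx` is the canvas 2D context, the canvas has the same pixel dimensions as the video (so boxes need no scaling), and `boxesByFrame` is a hypothetical Map from frame index to a list of `{ bbox: [x, y, width, height] }` objects. The `schedule` parameter is also an assumption, added so the loop can be driven without a browser; it defaults to requestAnimationFrame.

```javascript
// Sketch of a canvas render loop. Returns a function that stops the loop.
function startRenderLoop({ video, canvas, ctx, boxesByFrame, fps, schedule }) {
  const requestFrame = schedule || ((callback) => requestAnimationFrame(callback));
  let running = true;

  function drawFrame() {
    if (!running) return;

    // Paint the video's current frame onto the canvas.
    ctx.drawImage(video, 0, 0, canvas.width, canvas.height);

    // Overlay any boxes recorded for the estimated current frame.
    const frame = Math.floor(video.currentTime * fps + 1e-6);
    for (const { bbox } of boxesByFrame.get(frame) || []) {
      const [x, y, width, height] = bbox;
      ctx.strokeRect(x, y, width, height);
    }

    // Ask the browser to call drawFrame again before the next repaint.
    requestFrame(drawFrame);
  }

  requestFrame(drawFrame);
  return () => {
    running = false;
  };
}
```

Returning a stop function makes it easy to end the loop when the component is unmounted, so the callback does not keep running against a removed canvas.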

ℹ️ Key Considerations for Optimal Video Playback

Video Dimensions: Knowing the video’s height and width is essential. These dimensions are required for properly sizing the canvas and positioning the bounding boxes. A mismatch between the video and canvas dimensions places the boxes over the wrong regions of the frame.
Frame Rate: Understanding the video’s actual frame rate is critical for accurately determining the current frame during playback. This is necessary for synchronizing the bounding boxes with the video. Unfortunately, the HTML5 video element does not offer this information directly. However, we can get around this by having the server communicate the frame rate as part of the metadata alongside the annotations.
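To illustrate the dimensions point: if the canvas is displayed at a different size than the video's intrinsic resolution, each box has to be scaled before it is drawn. The sketch below assumes boxes are given as `[x, y, width, height]` in the video's pixel space; the function name `scaleBox` and the size objects are hypothetical.

```javascript
// Scale a box from the video's intrinsic pixel space to the canvas's pixel space.
// bbox is assumed to be [x, y, width, height] in video pixels.
function scaleBox(bbox, videoSize, canvasSize) {
  const scaleX = canvasSize.width / videoSize.width;
  const scaleY = canvasSize.height / videoSize.height;
  const [x, y, width, height] = bbox;
  return [x * scaleX, y * scaleY, width * scaleX, height * scaleY];
}
```

In the browser, the intrinsic size can be read from the video element's `videoWidth` and `videoHeight` properties once its metadata has loaded.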

Video Control Actions

We can use the default features of the video tag to implement the playback controls.
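As a rough sketch of such actions, the video element's `paused`, `currentTime`, and `duration` properties and its `play()` and `pause()` methods are enough for play/pause and frame stepping. The helper names `togglePlay` and `stepFrames` are hypothetical, and stepping by `1 / fps` assumes a constant, known frame rate.

```javascript
// `video` is assumed to be an HTMLVideoElement (or any object with the same members).
function togglePlay(video) {
  if (video.paused) {
    video.play();
  } else {
    video.pause();
  }
}

// Move by `delta` frames (+1 = next frame, -1 = previous frame), clamped to [0, duration].
function stepFrames(video, fps, delta) {
  video.pause();
  const target = video.currentTime + delta / fps;
  video.currentTime = Math.min(Math.max(target, 0), video.duration);
}
```

Pausing before stepping keeps the displayed frame from moving on immediately after the seek.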

Video Playback Controls

Because the video element is hidden and playback is rendered on the canvas, the browser’s native controls are no longer visible, so the design of the controls is up to us. Fortunately, HTML’s video tag exposes its playback functions programmatically, allowing us to create our own set of controls specific to our requirements, which we have previously done in our parent component. We only need to call them here.

Let’s create a component for the video controls, starting with the video slider for video progress and other playback controls.

Now that the template is complete, let’s proceed with the script section.

The controls display the playback time, total duration, and play/pause status, along with buttons for the next and previous frame. Additionally, there is a slider that, when adjusted, computes the corresponding playback time and seeks the video to it.
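Two small helpers illustrate the arithmetic behind the time display and the slider. Both names (`formatTime`, `sliderToTime`) are hypothetical, and the slider is assumed to range from 0 to 100; the actual component may use a different range.

```javascript
// Format a time in seconds as m:ss for the playback display.
function formatTime(totalSeconds) {
  const seconds = Math.max(0, Math.floor(totalSeconds));
  const minutes = Math.floor(seconds / 60);
  return `${minutes}:${String(seconds % 60).padStart(2, "0")}`;
}

// Convert a slider position (assumed 0 to 100) into a playback time in seconds.
function sliderToTime(percent, duration) {
  const clamped = Math.min(Math.max(percent, 0), 100);
  return (clamped / 100) * duration;
}
```

Assigning the result of `sliderToTime` to the video element's `currentTime` seeks playback to that position.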

Output

After completing all the necessary processes, the final outcome will look like this:

This UI is more than simply a static entity; it offers many possibilities with enormous flexibility. One of the hidden treasures (hint: see the GitHub Gist) is the option to display video captions, which adds an extra degree of engagement.

Keep in mind that this innovation extends beyond Vue’s borders. We can apply the principles and methods we’ve discussed to a variety of web frameworks, showcasing the universality of our approach. We have highlighted some of our project’s important components and problems in this post, but it’s best to read the entire code to fully appreciate its complexity and scope. The scripts are available on GitHub for those with curious minds and a passion for coding. Dive in and adapt it to your projects; perhaps even improve it with your own touch. I’m excited to watch how this project evolves and transforms with your contributions.

GitHub Gist: https://gist.github.com/s-abinash/4a3c7afaba94ab9dd74c551f0fe898fc
