From surgical videos to autonomous checkouts, video footage is one of the best types of training data for developing AI models. Videos provide more data points and the temporal context necessary for computer vision tasks, such as movement tracking.
As part of a widespread overhaul, you can now work with videos in V7 that are an hour in length and beyond.
To put this in numbers, videos in V7 can now be 10x longer, support 700x more annotations, and come with a redesigned user interface to make annotation more efficient.
Our latest update focuses on three key areas: longer videos, higher annotation capacity, and a more efficient annotation interface.
Labeling large, densely annotated video files used to involve splitting them into shorter clips or, even worse, labeling individual images that represented frames. Our new approach is more in line with streaming platforms such as YouTube and Netflix: frames are extracted dynamically, on the fly.
As a result, this update lets you work with longer video files without breaking them down into clips. At the standard video frame rate of 24 frames per second, V7 can now handle videos an hour in length and beyond while annotating them at their native frame rate. For use cases that typically employ reduced frame rates, such as surveillance footage, you can now ingest 24-hour clips as single files by reducing the frame annotation rate to 1 FPS. This change has also reduced storage costs by roughly 9–10x.
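To put those numbers in perspective, here is a quick back-of-the-envelope calculation (plain Python, independent of V7) showing how the annotation frequency affects the number of frames exposed for annotation:

```python
# Back-of-the-envelope figures for the trade-off described above.
# Illustrative arithmetic only; these are not V7 API calls.

def annotatable_frames(duration_seconds: float, annotation_fps: float) -> int:
    """Number of frames exposed for annotation at a given annotation frequency."""
    return int(duration_seconds * annotation_fps)

one_hour = 60 * 60          # 3,600 seconds
one_day = 24 * one_hour     # 86,400 seconds

# A one-hour video annotated at its native 24 FPS:
print(annotatable_frames(one_hour, 24))   # 86400 frames

# A 24-hour surveillance clip annotated at 1 FPS yields the same frame count:
print(annotatable_frames(one_day, 1))     # 86400 frames

# Annotating that same 24-hour clip at its native 24 FPS would need 24x as many:
print(annotatable_frames(one_day, 24))    # 2073600 frames
```

In other words, dropping the annotation frequency to 1 FPS makes a 24-hour clip no heavier on the timeline than a single hour of footage at native frame rate.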
This footage has been imported and annotated at the default frequency of 1 FPS, but it can still be previewed at its native frame rate.
The best part? You can still preview the whole footage at its native frame rate inside the annotation panel. The FPS parameter refers to annotation frequency and primarily affects the timeline behavior, without reducing the actual frame rate of the video itself.
Because everything is now processed dynamically in real time, the overall performance boost allows for more annotations to be added to the timeline too. After the update, V7 can store up to 700K annotations and 6 million keyframes per video.
The interface and the playhead have been redesigned to indicate the current annotation frame in the video. The timeline also distinguishes processed from unloaded frames using different shading. You can still use features such as zooming in and out, setting a default annotation length, or interpolation to create automatic annotations between keyframes.
The video above shows COCO annotations added to both the opening and one of the closing scenes of The Godfather.
The general process remains the same as with regular videos. However, there are some additional considerations discussed below, so let's review the standard setup and flow step by step to address them.
Go to the Datasets panel and create a new dataset. Drag and drop your video files. The supported video formats are .avi, .mkv, .mov, and .mp4. You should see a popup window asking you to specify the annotation frequency.
As mentioned earlier, this parameter does not reduce the frame rate of the underlying source video file; it changes how the duration of your clip is mapped onto the timeline and determines how many frames are available for adding annotations.
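The mapping can be sketched in a few lines of Python. This is an illustrative model of how an annotation frequency buckets source frames onto timeline frames, not V7's actual implementation, and the `timeline_frame` helper is hypothetical:

```python
# Hypothetical sketch: how an annotation frequency maps source-video frames
# onto annotation (timeline) frames. V7's internal mapping may differ.

def timeline_frame(source_frame: int, native_fps: float, annotation_fps: float) -> int:
    """Map a source-video frame index to its annotation-timeline frame index."""
    timestamp = source_frame / native_fps       # seconds into the video
    return int(timestamp * annotation_fps)      # which annotation bucket it lands in

# A 24 FPS video annotated at 1 FPS: 24 consecutive source frames
# all land on the same annotation frame.
assert timeline_frame(0, 24, 1) == 0
assert timeline_frame(23, 24, 1) == 0
assert timeline_frame(24, 24, 1) == 1
```

This is why playback remains smooth at the native frame rate while the timeline only exposes one annotatable frame per second.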
The majority of video projects in computer vision use tags, bounding boxes, or segmentation masks based on polygons or pixels. If you are labeling videos manually for a new use case, or a single video is too laborious to annotate alone, you may want to split the work across multiple teams of annotators. This can be done by assigning annotation tasks to specific users, or automated through workflow design (for example, with sampling stages).
You can significantly improve your annotation efficiency by using the Auto-Annotate tool, SAM, or external models. Many computer vision tasks can be solved out of the box with open-source models or public models available inside V7. You can pre-label your footage with AI and then improve annotation quality by adding properties or attributes to the AI-generated labels.
V7 allows you to complete annotation projects collaboratively, give feedback to annotators, and set up QA processes. If you want to prioritize the quality of your training data, consider including one or more review stages. Once a file passes through the final review stage, you can export a Darwin JSON file with the corresponding annotations.
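As a quick sanity check on an export, you can tally how many annotations are present on each frame. The sketch below assumes each annotation object in a Darwin JSON video export carries a "frames" mapping keyed by frame index; check your export version, since the exact layout may differ, and the sample payload here is illustrative:

```python
import json

# Illustrative tally of per-frame annotation counts in a Darwin JSON video
# export. Assumes annotations carry a "frames" dict keyed by frame index;
# verify against your actual export format.

export = json.loads("""
{
  "annotations": [
    {"name": "car",    "frames": {"0": {}, "1": {}, "2": {}}},
    {"name": "person", "frames": {"1": {}, "2": {}}}
  ]
}
""")

per_frame: dict = {}
for annotation in export["annotations"]:
    for frame_index in annotation.get("frames", {}):
        per_frame[frame_index] = per_frame.get(frame_index, 0) + 1

print(per_frame)  # {'0': 1, '1': 2, '2': 2}
```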
As you can see, the update brings improvements for many industries, from autonomous driving to healthcare. However, there are still some limitations to consider. For example, if you are tracking multiple objects whose annotations change in every single frame, it is best to keep the number of parallel keyframes per frame below 24 for optimal performance. You can go higher, but at some point you will only be able to step through frames manually via the timeline, without smooth video playback.
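If you generate dense annotations programmatically, a simple pre-flight check against that budget can help. This helper is illustrative and not part of the V7 SDK; it just flags frames whose parallel-keyframe count exceeds the ~24 recommended above:

```python
# Illustrative pre-flight check (not part of the V7 SDK): given a mapping of
# frame index -> number of parallel keyframes on that frame, flag frames that
# exceed the ~24-keyframe budget recommended for smooth playback.

def frames_over_budget(keyframes_per_frame: dict, budget: int = 24) -> list:
    """Return sorted frame indices whose parallel-keyframe count exceeds the budget."""
    return sorted(f for f, n in keyframes_per_frame.items() if n > budget)

counts = {0: 10, 100: 30, 250: 24, 400: 25}
print(frames_over_budget(counts))  # [100, 400]
```

Frames at exactly 24 keyframes pass the check; only counts strictly above the budget are flagged.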
The extended video length, improved performance for high numbers of annotations, and smoother timeline navigation translate to improved dataset accuracy and throughput. Incorporate these advanced video functionalities into your workflow today and unlock the true potential of video annotation for AI. If you haven't created your V7 account yet, now is the perfect time to get started.