15+ top computer vision project ideas for beginners for 2024

Jun 16, 2021

Building a simple computer vision model is not rocket science—all you need is access to quality data and a reliable training data platform to get started. Check out our ideas for computer vision projects for beginners and start building!

Alberto Rizzoli

Co-founder & CEO

Computer vision is one of the hottest topics in the AI field.

But—

It’s easy to get confused trying to figure out what’s the best way to learn and master this field.

Our advice?

Don’t get stuck analyzing theoretical concepts.

Instead, combine your conceptual knowledge with practical experience, and start building your own computer vision models!

In this article, we’ll share a range of computer vision project ideas to help you get started in less than an hour.

Here’s what we’ll cover:

People counting tool
Color detection
Object tracking in videos
Pedestrian detection
Hand gesture recognition
Human emotion recognition
Road lane detection
Business card scanner
License plate recognition
Handwritten digit recognition
Iris flowers classification
Family photo face detection
LEGO brick finder
PPE detection
Face mask detection
Traffic light detection

In case you are ready to get started, V7 arms you with the tools needed to build robust computer vision models, and the good news is that you don’t need any prior experience.

People counting tool

Building a people counting solution can be both a fun project and one with real-world applications.

To detect and count people in an image, you’ll need a relevant training dataset and a training data platform. You can label your data with a free, open-source tool such as CVAT, or use an auto-annotation tool like V7 to complete this project faster.

People counting using bounding boxes and Instance ID

Since the COVID-19 outbreak, people counting solutions have been growing in popularity, helping to enforce social distancing rules and improve safety.
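
Once a detector has produced bounding boxes, the counting step itself is small. Here is a minimal sketch, assuming a hypothetical detector output: a list of dictionaries with `label`, `score`, and `box` keys. These names, the sample values, and the 0.5 confidence threshold are illustrative assumptions, not the output format of any particular tool.

```python
from typing import Any, Dict, List

# Hypothetical detector output: one dict per detected object.
Detection = Dict[str, Any]


def count_people(detections: List[Detection], min_score: float = 0.5) -> int:
    """Count detections labeled 'person' whose confidence is at least min_score."""
    return sum(
        1
        for det in detections
        if det["label"] == "person" and float(det["score"]) >= min_score
    )


# Made-up detections for a single image; boxes are (x, y, width, height).
sample_detections = [
    {"label": "person", "score": 0.92, "box": (10, 20, 50, 120)},
    {"label": "person", "score": 0.35, "box": (200, 30, 40, 110)},  # below threshold
    {"label": "bicycle", "score": 0.88, "box": (90, 60, 80, 60)},
    {"label": "person", "score": 0.71, "box": (300, 25, 45, 115)},
]

print(count_people(sample_detections))  # prints 2
```

Raising or lowering `min_score` trades missed people against false counts, so it is worth tuning on a few labeled images.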

Here’s a recommended dataset to get you started:

People Counting Dataset (PCDS)

Color detection

Next up is a simple color detector that you can use for a wide variety of visual tasks.

From a green screen app that replaces the green background with a custom image or video to simple photo editing software, a color recognizer is a great project for getting started with computer vision.
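
To make the idea concrete, here is a minimal sketch of a color recognizer and a toy green screen replacement. It uses only Python's standard `colorsys` module; the hue boundaries, the brightness and saturation cut-offs, and the function names are illustrative assumptions rather than a standard.

```python
import colorsys
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # 8-bit (red, green, blue)


def classify_color(rgb: Pixel) -> str:
    """Map an RGB pixel to a coarse color name using its HSV values."""
    r, g, b = (channel / 255.0 for channel in rgb)
    hue, saturation, value = colorsys.rgb_to_hsv(r, g, b)
    if value < 0.2:
        return "black"
    if saturation < 0.15:
        return "white" if value > 0.8 else "gray"
    degrees = hue * 360.0
    if degrees < 30 or degrees >= 330:
        return "red"
    if degrees < 90:
        return "yellow"
    if degrees < 150:
        return "green"
    if degrees < 210:
        return "cyan"
    if degrees < 270:
        return "blue"
    return "magenta"


def replace_green(foreground: List[List[Pixel]],
                  background: List[List[Pixel]]) -> List[List[Pixel]]:
    """Swap every green foreground pixel for the background pixel at the same position."""
    return [
        [bg if classify_color(fg) == "green" else fg for fg, bg in zip(fg_row, bg_row)]
        for fg_row, bg_row in zip(foreground, background)
    ]
```

Working in HSV rather than RGB keeps the hue test largely independent of brightness, which is why it is a common starting point for color-based masks.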

Here are a few interesting datasets you might want to use for your project:

Object tracking in a video

Next, consider taking on a slightly more advanced computer vision task: object tracking in a video.

Object tracking is about estimating the state of the target object present in the scene from previous information.

Flying squirrel tracking in a video using V7

You can build simple object tracking models using videos involving one object, such as a car, or multiple objects, such as pedestrians or animals.

Essentially, the model performs two tasks: predicting the object’s next state, and correcting that prediction using the object’s actual observed state. Object tracking models find applications in traffic control and human-computer interaction.
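
The predict-then-correct loop can be sketched with an alpha-beta filter, a simplified relative of the Kalman filter. This is an illustration of the general idea on a single coordinate, not the method used by any specific tracker; the function name and the `alpha` and `beta` gains are assumptions chosen for the example.

```python
from typing import List, Tuple


def alpha_beta_track(measurements: List[float], dt: float = 1.0,
                     alpha: float = 0.5, beta: float = 0.2) -> List[Tuple[float, float]]:
    """Track one coordinate of an object; returns (position, velocity) per frame."""
    if not measurements:
        return []
    position = measurements[0]  # start from the first observation
    velocity = 0.0              # no motion information yet
    states = []
    for observed in measurements[1:]:
        # Predict: assume the object keeps moving at its current velocity.
        predicted = position + velocity * dt
        # Correct: pull the prediction toward what was actually observed.
        residual = observed - predicted
        position = predicted + alpha * residual
        velocity = velocity + (beta / dt) * residual
        states.append((position, velocity))
    return states


# An object moving 2 pixels per frame along x (made-up, noise-free data).
track = alpha_beta_track([2.0 * frame for frame in range(60)])
print(track[-1])
```

With these gains the velocity estimate settles close to 2 pixels per frame after a few dozen frames; larger gains react faster but pass more measurement noise through.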

Here are a few video datasets you might find interesting for this computer vision task:

Pro tip: Check out The Complete Guide to Object Tracking [+V7 Tutorial].

Pedestrian detection

Building an object detection model to detect pedestrians is one of the simplest and fastest computer vision projects to complete.

All you need is a relevant dataset of high-quality images and a training data platform to train and test your model. You can use one of the free image annotation tools or try out V7.

Pedestrian detectors are commonly used in the automotive industry for traffic safety as well as human-robot interactions and intelligent video systems.
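
When you test a pedestrian detector, a common way to compare a predicted box with a labeled one is intersection over union (IoU). Below is a minimal sketch; the `(x_min, y_min, x_max, y_max)` box layout is an assumption, so adapt it to whatever format your annotations use.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(box_a: Box, box_b: Box) -> float:
    """Return the intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (if any).
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    intersection = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


predicted = (5.0, 0.0, 15.0, 10.0)
labeled = (0.0, 0.0, 10.0, 10.0)
print(iou(predicted, labeled))
```

A prediction is often counted as correct when its IoU with a labeled box exceeds a chosen threshold such as 0.5.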

Consider these datasets to get started:

Hand gesture recognition

Hand gesture recognition is a slightly more advanced computer vision task: you first separate the hand region from the background and then segment the fingers to predict hand gestures.

You can use OpenCV if you want to keep your model simple or take advantage of V7’s keypoint skeleton & custom polygons tools to make labeling faster and more accurate.

After training, you can test your model using a webcam. Hand gesture models can be used in VR games and sign language recognition.

Check out these datasets to get started:

Pro tip: Check out A Comprehensive Guide to Human Pose Estimation to learn more.

Human emotion recognition

If you want a slightly more challenging task, consider building an emotion detection model. You can base your model on six main facial emotions: happiness, sadness, anger, fear, disgust, and surprise.

Emotion recognition of a surprised young woman using bounding boxes

The three main components of this project are image pre-processing, feature extraction, and feature classification.

Here are the datasets that might come in handy:

Road lane detection

Road lane detection is yet another computer vision model that plays a key role in the development of the automotive industry.

Used primarily for self-driving cars, a road lane detector can be a fun beginner project that will help you get hands-on experience with both images and videos.

Here are a couple of datasets to help you out:

Business card scanner

You can develop a business card scanner using OCR (optical character recognition) technology. Your trained model will find and extract information from business cards.

Essentially, this project is divided into three phases: image processing (noise reduction), OCR (text extraction), and classification (assigning the extracted text to key fields).
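
The last phase, classification, can start out as a handful of rules applied to the text the OCR step returns. The sketch below assumes the OCR output is already available as a list of strings; the regular expressions, field names, and sample card are illustrative and deliberately simple.

```python
import re
from typing import Dict, List

# Deliberately simple, illustrative patterns; real cards need more robust rules.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
URL_RE = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{6,}\d")


def classify_card_lines(lines: List[str]) -> Dict[str, List[str]]:
    """Sort OCR'd text lines into email, website, phone, and other."""
    fields: Dict[str, List[str]] = {"email": [], "website": [], "phone": [], "other": []}
    for raw_line in lines:
        line = raw_line.strip()
        if not line:
            continue
        for name, pattern in (("email", EMAIL_RE), ("website", URL_RE), ("phone", PHONE_RE)):
            match = pattern.search(line)
            if match:
                fields[name].append(match.group())
                break
        else:
            # No pattern matched: keep the line for later handling (name, title, ...).
            fields["other"].append(line)
    return fields


# Made-up OCR output for a fictional card.
ocr_lines = ["Jane Doe", "Head of Research", "jane.doe@example.com",
             "+1 (555) 010-0199", "www.example.com"]
print(classify_card_lines(ocr_lines))
```

Lines that match no rule, such as names and job titles, land in `other`; telling those apart usually calls for a trained classifier rather than more regular expressions.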

OCR on a French business card using V7 Text Scanner

You can use your business card reader to automate data entry.

Pick one of these datasets to begin:

Pro Tip: V7 allows you to automatically scan and read text using its built-in Text Scanner.

License plate recognition

A license plate recognizer is another idea for a computer vision project using OCR.

However, there are two challenges related to this project: data collection and the differences in license plate formats between countries and regions.

Therefore, your model might not be accurate unless you train it on large amounts of data (if you manage to obtain it).

Note: License plate numbers are considered sensitive data, so make sure you stick to publicly available datasets when building your models.

License plate recognition on a white Vitara using V7 Text Scanner

A simple automatic license plate recognition system can use basic image processing techniques, and you can build it using OpenCV and Python.

However, more advanced systems use object detectors like YOLO or Fast R-CNN.

Automatic license plate recognition can be used for security, parking, smart cities, automatic toll collection, and access control.

Here are a few datasets you might consider:

Handwritten digit recognition

This project is a perfect start for computer vision newbies—you can build a simple digit recognizer using the MNIST dataset.

Training your model with a convolutional neural network (CNN) will teach you how to develop, evaluate, and use CNNs for image classification.
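
If you are wondering what the "convolutional" part does, here is a minimal pure-Python sketch of the core operation: sliding a small kernel over an image and summing the element-wise products. Deep learning frameworks do this far more efficiently and learn the kernel values during training; the function name and the tiny image and kernel below are made up for illustration.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """Slide the kernel over the image (no padding, stride 1) and sum the products."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    return [
        [
            sum(
                image[row + dr][col + dc] * kernel[dr][dc]
                for dr in range(kernel_h)
                for dc in range(kernel_w)
            )
            for col in range(out_w)
        ]
        for row in range(out_h)
    ]


# A 4x4 image that is dark on the left and bright on the right.
image = [[0, 0, 1, 1] for _ in range(4)]
# A kernel that responds to dark-to-bright transitions from left to right.
vertical_edge = [[-1, 1], [-1, 1]]

print(conv2d_valid(image, vertical_edge))
```

The output is largest where the image changes from dark to bright, which is how early CNN layers pick up the strokes that make up a digit.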

The MNIST dataset contains a training set of 60,000 examples and a test set of 10,000 examples. You can access it here:

MNIST Digit Recognition Dataset

Pro tip: Read our Guide to Handwriting Recognition to learn more.

Iris flowers classification

Here’s another beginner project based on one of the most popular, and thus most readily available, datasets for pattern recognition: the Iris Flowers Classification Dataset.

It contains three classes of 50 instances each, where each class refers to a type of iris plant.

It’s a great beginner’s project for getting hands-on experience with classification, as you’ll train your model to predict the species of a new iris flower. Note that the dataset describes each flower with four measurements (sepal length, sepal width, petal length, and petal width) rather than images, so it works best as a warm-up before you move on to image data.
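
To see what predicting the species of a new flower can look like, here is a minimal k-nearest-neighbors sketch in pure Python. The Iris dataset describes each flower with four measurements in centimeters; the handful of hand-typed rows below are only an illustrative stand-in for the full 150-row dataset, and the function name and `k=3` are assumptions.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]

# Illustrative rows: (sepal length, sepal width, petal length, petal width) in cm.
TRAINING_ROWS: List[Sample] = [
    ((5.1, 3.5, 1.4, 0.2), "setosa"),
    ((4.9, 3.0, 1.4, 0.2), "setosa"),
    ((4.7, 3.2, 1.3, 0.2), "setosa"),
    ((7.0, 3.2, 4.7, 1.4), "versicolor"),
    ((6.4, 3.2, 4.5, 1.5), "versicolor"),
    ((6.9, 3.1, 4.9, 1.5), "versicolor"),
    ((6.3, 3.3, 6.0, 2.5), "virginica"),
    ((5.8, 2.7, 5.1, 1.9), "virginica"),
    ((7.1, 3.0, 5.9, 2.1), "virginica"),
]


def predict_species(features: Sequence[float],
                    rows: List[Sample] = TRAINING_ROWS, k: int = 3) -> str:
    """Return the majority label among the k training rows closest to features."""
    nearest = sorted(rows, key=lambda row: math.dist(features, row[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


print(predict_species((5.0, 3.4, 1.5, 0.2)))  # prints setosa
```

For a real project, load the full dataset and hold out part of it for testing so you can measure accuracy on flowers the model has not seen.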

You can download the dataset here:

Iris Flowers Classification Dataset

Pro Tip: Check out 65+ Best Free Datasets for Machine Learning to find more datasets to train on.

Family photo face detection

Grab your family album to collect original data and build a face recognition model to identify your family members in the photos.

You can label your data using a free annotation tool or V7 and train your model in less than an hour. This task is a multi-stage process consisting of face detection, alignment, feature extraction, and feature recognition.
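
The final recognition stage often boils down to comparing feature vectors. The sketch below assumes an earlier stage has already turned each face into a numeric embedding; the three-dimensional vectors, the names, and the 0.8 threshold are made up for illustration (real embeddings have many more dimensions).

```python
import math
from typing import Dict, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify(embedding: Sequence[float],
             known_faces: Dict[str, Sequence[float]],
             threshold: float = 0.8) -> Optional[str]:
    """Return the best-matching name, or None if nobody is similar enough."""
    best_name, best_score = None, threshold
    for name, reference in known_faces.items():
        score = cosine_similarity(embedding, reference)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Hypothetical reference embeddings for two family members.
family = {"alice": [1.0, 0.0, 0.0], "bob": [0.0, 1.0, 0.0]}
print(identify([0.9, 0.1, 0.0], family))  # prints alice
```

Returning `None` below the threshold lets the model say "unknown" for guests in the photo instead of forcing a match to a family member.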

To make your project more interesting and your model more accurate, consider using video data, too.

If you can’t obtain data on your own, check out these datasets to get started with facial recognition projects:

LEGO brick finder

If you’ve ever spent hours building LEGO in your childhood, this project could be a perfect way to get you hooked on computer vision.

In its simplest form, you can build a model to detect and identify LEGO bricks in real time using your webcam or your phone camera. All you need is a large set of training data and a tool to train your model.

LEGO brick finder using color recognition and detection

Here are the datasets for you:

PPE detection

The goal of this computer vision project is to build a model that identifies items of personal protective equipment (PPE), such as face masks. You can complete it in a couple of hours and test it using a webcam while wearing a face mask in front of your computer.

Here’s how we’ve labeled worker PPE using V7’s auto-annotate tool in less than a minute.

PPE detection models find application in industries such as construction or healthcare (hospitals).

See how V7 handles PPE detection on a video.

Check out these datasets to get started:

Face mask detection

As with PPE detection, you can build a simple face mask detection model to distinguish people who are wearing a mask in public from those who aren't.

Remember to collect large amounts of data to ensure your model's accuracy in handling different kinds of occlusion.

Check out this dataset to get started:

Face Mask Detection

Traffic light detection

Finally, consider spending some time on training a traffic light detector. This project is relatively easy to complete because of the availability of data and research that you can access for free.

Traffic light detection finds applications in the intelligent transportation field including popular use cases such as autonomous cars and smart cities.

Here are a few datasets you can use:

See how V7 handles traffic light detection in this video.

Building your first computer vision model: Key takeaways

Now that you’ve got a bunch of ideas for your computer vision projects, it’s time to get some hands-on experience and start developing your own AI models.

If you want to keep things simple, start with classification using the Iris Flowers dataset or with pedestrian detection.

When considering more advanced projects, go for object tracking in videos, or a simple business card scanner app that you can develop to test your AI model in real-world conditions.

Either way, you are now ready to combine your theoretical knowledge with practical experience and start building computer vision models that can be turned into real products with a few lines of code!

We are excited to see what you build and we keep our fingers crossed for your projects!

Good luck!

Alberto Rizzoli

Co-founder & CEO of V7

Previously CEO at Aipoly - First smartphone engine for convolutional neural networks. Management & Stats grad at Cass Business School and Singularity University. Never had a real job.

Next steps

Label videos with V7.

Try our free tier or talk to one of our experts.
