The transportation industry has undergone multiple changes and revolutions over the last few hundred years—and we’re now at the stage where major breakthroughs are being achieved in the form of Artificial Intelligence in transportation.
Whether via self-driving cars for more reliability, road condition monitoring for improved safety, or traffic flow analysis for more efficiency, AI is catching the eye of transportation bosses around the world.
Indeed, many in the transportation sector have already identified the awesome potential of AI, with the global market forecast to reach $3.87 billion by 2026.
Such spending can help companies leverage advanced technologies like computer vision and machine learning to shape the future of transportation so that passenger safety increases, road accidents are lessened, and traffic congestion is reduced.
Deep learning and machine learning in transportation can also help to create “smart cities,” such as we’ve seen in Glasgow, where the technology monitors vehicle dwell times, parking violations and traffic density trends.
Now, let’s begin!
The concept of self-driving vehicles is nothing new. General Motors introduced it back in 1939.
But it’s only in our current age of AI transportation that companies are able to use computer vision techniques like object detection to create intelligent systems that decode and make sense of visual data to—essentially—allow a vehicle to drive itself.
And while a self-driving car can sound complex, the idea for building the AI behind it is actually straightforward: The algorithm is fed huge swathes of relevant data, before being trained to detect specific objects and then take the correct actions, such as braking, turning, speeding up, slowing down, and so on.
Which objects does a model need to identify?
The likes of other vehicles on the road, road signs, traffic lights, lane markings, pedestrians and more.
To collect and use data, autonomous vehicles rely on cameras and sensors. To become reliable, the model needs to be trained on masses of that data on an ongoing basis.
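The detect-then-act idea described above can be sketched as a toy rule-based planner. This is only an illustration: the `Detection` class, the `choose_action` function, the label names and the deceleration value are all hypothetical, and a real driving stack is far more involved.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    label: str          # hypothetical labels, e.g. "pedestrian", "vehicle"
    distance_m: float   # estimated distance ahead of the car, in metres
    confidence: float   # detector score in [0, 1]


def choose_action(detections, speed_mps, min_confidence=0.5):
    """Return "brake", "slow_down" or "maintain" for the current detections."""
    # Rough stopping distance, assuming ~6 m/s^2 deceleration (illustrative value).
    stopping_distance = speed_mps ** 2 / (2 * 6.0)
    action = "maintain"
    for det in detections:
        if det.confidence < min_confidence:
            continue  # ignore low-confidence detections
        urgent = det.label in {"pedestrian", "traffic_light_red"}
        if urgent and det.distance_m <= 2 * stopping_distance:
            return "brake"  # the most severe action wins immediately
        if det.label == "vehicle" and det.distance_m <= 3 * stopping_distance:
            action = "slow_down"
    return action
```

For example, at 12 m/s a confident pedestrian detection 20 m ahead would return `"brake"`, while an empty detection list would return `"maintain"`.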
Naturally, there still remain some challenges.
An algorithm needs to get access to those huge swathes of relevant data, while situational conditions like bad weather and uneven terrain can also pose a problem. Other issues include poor lighting and the possibility of a self-driving car coming across an unidentified object while out on the road.
Of course, when many of us think of self-driving cars, we automatically think of Tesla.
Tesla—along with the likes of Uber, Waymo and Motional—has been working on automated vehicles for a number of years now, always staying one step ahead of the curve.
Unlike some others in the AI in transportation industry, Tesla takes a purely vision-based approach, using its camera-equipped cars to collect video and image data without relying on HD maps or lidar in its autonomous driving stack.
From a technical standpoint, this is actually a more complex approach, largely due to the fact that, because the neural networks are being trained just on video data alone, the need to achieve the highest accuracy possible becomes essential.
However, as Andrej Karpathy, who led Tesla’s AI team, points out:
“Once you actually get it to work, it’s a general vision system which can be principally deployed anywhere on earth.”
Tesla’s self-driving team amasses lots of data too: as much as 1.5 petabytes, consisting of one million ten-second videos and six billion objects, each of which is annotated with velocity, depth and bounding boxes.
This isn’t to say that Tesla relies solely on manual data annotation. Rather, it improves the annotation process by combining human review with auto-annotation tools.
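As a rough illustration of combining auto-annotation with human review, the sketch below stores the three annotation fields mentioned above (velocity, depth and bounding box) and routes low-confidence automatic labels to a reviewer. The `ObjectAnnotation` class, the `split_for_review` function and the 0.9 threshold are assumptions made for this example, not a description of Tesla’s actual pipeline.

```python
from dataclasses import dataclass


@dataclass
class ObjectAnnotation:
    label: str
    bbox: tuple          # (x_min, y_min, x_max, y_max) in pixels
    depth_m: float       # distance from the camera, in metres
    velocity_mps: float  # speed of the object, in metres per second
    confidence: float    # score from a hypothetical auto-annotation model


def split_for_review(annotations, threshold=0.9):
    """Accept confident auto-labels and queue the rest for a human reviewer."""
    accepted = [a for a in annotations if a.confidence >= threshold]
    to_review = [a for a in annotations if a.confidence < threshold]
    return accepted, to_review
```

The design choice here is simply that human time is spent only where the automatic labeller is unsure.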
AI systems aren’t limited to self-driving cars. They’re also used in trucks, buses, and airport taxis, with the innovations having a huge impact on AI in logistics and the supply chain in general.
Indeed, McKinsey has predicted that self-driving trucks will reduce operating costs by some 45%. Environmental impact will also be greatly reduced.
Over the rest of the article, we’ll be taking a closer, in-depth look at some of the more specific computer vision and machine learning cases that are laying the foundations for autonomous driving technology.
There are thousands of traffic lights in the US alone. And while you might think that stopping when a light turns to red is a simple process, the fact that each year in the US some 1,000 people are killed needlessly by vehicles running a red light means that the whole thing is a very risky, dangerous and even complex game.
It’s a game with tragic consequences, too, with over 50% of those deaths accounted for by passengers or drivers who didn’t run the red light.
The problem is that the traffic light system itself might be perfect, but the humans behind the wheel aren’t always perfect. Mistakes happen, sometimes drivers run a red light—and accidents occur.
The solution to this terrible problem can be found in autonomous vehicles that, alongside smart cities, can prevent those deaths.
Indeed, automakers are putting the traffic signal issue front and centre of their self-driving cars’ capabilities.
An AI-based system can be trained to recognise lights—green, amber and red—via computer vision models that are trained in a wide range of scenarios, such as poor light conditions, inclement weather and occlusions.
As such, a self-driving car’s cameras first spot a traffic signal, before the image is analysed, processed—and, if it turns out that the light is red, the car puts the brakes on.
Naturally, there are issues here. When a camera is scanning what’s in front of it, it may spot other lights, such as a billboard or a streetlamp. Yes, a traffic light differs from a streetlamp in that it has three lights, but the image analysis still needs to be good enough to spot the traffic signal instantly and not be fooled by other lights.
If it’s fooled, the result could be devastating.
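To make the “analyse the image, then brake on red” step concrete, here is a minimal sketch that classifies the average colour of an already-detected lamp region by hue. It is a deliberately simple stand-in for a trained computer vision model, and the function names and hue and saturation thresholds are assumptions chosen for illustration.

```python
import colorsys


def classify_light(rgb, min_saturation=0.5, min_value=0.5):
    """Classify an average lamp colour (0-255 RGB) as red, amber, green or unknown."""
    r, g, b = (channel / 255.0 for channel in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    # Washed-out or dim colours (e.g. a white streetlamp) are not treated as signals.
    if s < min_saturation or v < min_value:
        return "unknown"
    hue = h * 360.0
    if hue < 20 or hue >= 340:
        return "red"
    if hue < 70:
        return "amber"
    if hue < 170:
        return "green"
    return "unknown"


def should_brake(rgb):
    """Brake only when the lamp colour is classified as red."""
    return classify_light(rgb) == "red"
```

Note how a white light falls into `"unknown"` because of its low saturation, which is one crude way of not being fooled by a streetlamp.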
When annotating the data for traffic light detection, then, one of two common approaches is needed:
You can also label pole and light individually.
Whenever a smart city offers proximity sensing to an autonomous car, understanding its supporting structure is the only way to know the position of a free-floating object.
This way of annotating can also be used to distinguish between traffic lights that serve multiple lanes, with each set of lights correlated and grouped to the lane below.
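One simple way to picture the grouping of lights to lanes is to assign each annotated light to the lane whose horizontal span contains the centre of its bounding box. This is a sketch under that assumption; the `assign_lights_to_lanes` function and the lane boundary format are hypothetical.

```python
def assign_lights_to_lanes(light_boxes, lane_bounds):
    """Group light boxes (x_min, y_min, x_max, y_max) by the lane beneath them.

    lane_bounds maps a lane name to its (left, right) pixel span in the image.
    Lights whose centre falls outside every lane are left ungrouped.
    """
    groups = {lane_id: [] for lane_id in lane_bounds}
    for box in light_boxes:
        x_min, _, x_max, _ = box
        centre_x = (x_min + x_max) / 2
        for lane_id, (left, right) in lane_bounds.items():
            if left <= centre_x < right:
                groups[lane_id].append(box)
                break
    return groups
```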
This paper explores a traffic light detection and recognition approach that uses convolutional neural networks.
Using map data and a pair of separate focal length cameras to detect traffic lights at different distances, researchers came up with a unique algorithm for light recognition which combined image classification with object detection to identify the light state classes of traffic lights.
The researchers also integrated YOLOv3 into their approach for real-time traffic light detection for better results.
Here’s a simplified visual representation of the process:
Results: The proposed approach didn’t manage to achieve 100% accuracy. Because traffic light detection and recognition requires 100% accuracy to ensure the safety of the passengers and pedestrians, improvement is needed.
Check out how the V7 model handles traffic light detection.
How cool would it be if a computer system could automatically spot and identify pedestrians in images and videos?
Further, what if we could create a model that would allow autonomous cars to understand a pedestrian's intent so that they would know—for example—if a pedestrian intended to cross the road in real time?
Such a system would certainly help self-driving cars to avoid dangerous situations, and potentially massively reduce road accidents.
Pedestrian detection is actually a key problem in Computer Vision and Pattern Recognition, because pedestrians can be super unpredictable in the context of road traffic. They’re so unpredictable that they pose one of the greatest threats to the success of self-driving cars.
The key is not necessarily that a system recognises specific human features, such as beards and noses, but that it’s able to properly distinguish a human from another object, as well as understand what a pedestrian is planning to do next. For example, will they cross the road?
To begin the task of identifying and visualising pedestrians, computer vision systems use bounding boxes.
To detect pedestrians, different types of features have been used, including motion-based features, texture-based features, shape-based features, and gradient-based features.
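Bounding boxes are usually compared with intersection over union (IoU), for example to check a predicted pedestrian box against an annotated one. The following is a minimal sketch of that standard measure, assuming boxes are given as `(x_min, y_min, x_max, y_max)`.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped at zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0
```

Two identical boxes score 1.0 and two boxes that don’t overlap score 0.0, so a threshold somewhere in between is typically used to decide whether a detection counts as a match.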
Some approaches have also incorporated human pose estimation, a technique that collates information about the instant behaviour of a particular subject (in this case, a human). This is designed to relay information to the autonomous vehicle in regards to what a pedestrian is intending to do next.
For instance, this paper takes a look at how the intentions of pedestrians can be predicted using 2D skeletal pose sequences via a deep learning network. The researchers wanted to create a model that would tell them in no uncertain terms whether or not a pedestrian is going to cross the road.
They linked the dynamics of a human’s skeleton to an intention so as to overcome the issue of real-time discrete intention prediction in a typical traffic environment.
Experimental results show that a 94.4% accuracy was achieved by SPI-Net in pedestrian crossing prediction based on the JAAD dataset.
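To give a feel for what “skeleton dynamics” might look like as model input, the sketch below normalises each 2D pose relative to a root joint and then takes joint displacements between consecutive frames. This is not the SPI-Net method itself; the joint ordering, the choice of root and reference joints, and both function names are assumptions for illustration only.

```python
import math


def normalise_pose(keypoints, root_index=0, ref_index=1):
    """Centre a pose on its root joint and scale by the root-to-reference distance."""
    rx, ry = keypoints[root_index]
    fx, fy = keypoints[ref_index]
    scale = math.hypot(fx - rx, fy - ry) or 1.0  # avoid dividing by zero
    return [((x - rx) / scale, (y - ry) / scale) for x, y in keypoints]


def pose_dynamics(sequence):
    """Joint displacements between consecutive normalised poses in a sequence."""
    normalised = [normalise_pose(frame) for frame in sequence]
    return [
        [(x2 - x1, y2 - y1) for (x1, y1), (x2, y2) in zip(prev, curr)]
        for prev, curr in zip(normalised, normalised[1:])
    ]
```

Features like these could then be passed to a classifier that outputs a crossing or not-crossing decision.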
Here’s a simple visualisation of the process:
Of course, there are still challenges in training data to overcome, and these include the different poses that pedestrians adopt, the variety of clothing they wear, and lighting conditions that change from one scenario to the next.
The lighting issue is made harder by the reliance on machine vision and cameras, which is why more advanced technology is needed to provide data that’s more accurate, and which can then be used to successfully identify pedestrians in all lighting conditions.
Moreover, it’s the success rate of a machine learning algorithm that will determine how successful pedestrian detection ultimately is.
The flow of traffic impacts a country’s economy for the better or worse, and it also impacts road safety. Traffic congestion costs money and time, it causes stress to the drivers and passengers, and it also contributes to global warming.
With better traffic flow, a country’s economy can grow better, and the safety of its road users is improved immeasurably.
With this in mind, it’s no surprise that Artificial Intelligence is now paving the way for better traffic flow analysis using machine learning and computer vision. AI can help to reduce bottlenecks and eradicate choke-points that are otherwise clogging up our roads—and our economy.
Thanks to the advancements of computer vision, drone and camera-based traffic flow tracking and estimation are now possible.
The algorithms are able to track and count freeway traffic with accuracy, as well as analyse traffic density in urban settings, such as at intersections. This helps towns and cities to understand what’s going on so that they can design more efficient traffic management systems, while at the same time improving road safety.
CCTV cameras can spot dangerous events and other anomalies, as well as provide insights into peak hours, choke-points and bottlenecks. They can also quantify and track changes over a period of time so that traffic congestion can be measured. As a result, town planners can greatly reduce urban traffic and emissions.
The likes of Flir, Viscando and SwissTraffic have been using Artificial Intelligence for traffic flow, with Viscando using stereo vision technology to monitor and control traffic.
Viscando’s system monitors the flow of traffic at intersections and other open areas, and it can spot and track a variety of vehicles, as well as pedestrians and bicycles, all at the same time.
Such capabilities ensure Viscando are able to track the trajectory of road users at traffic lights, as well as identify conflict risk and calculate the gaps road users leave between cars.
As Viscando’s CEO, Amritpal Singh, says:
“It also gives the cities much more data about how the intersection is working, the length of queues and the duration of waiting time, and being able to include pedestrians and cyclists in the same optimisation scheme.”
This paper puts forward a framework with numerous movements and classes for better vehicle counting. The researchers used advanced deep learning methods for vehicle detection and tracking, as well as a trajectory approach that allowed them to monitor the movements of vehicles.
The researchers wanted to improve the process of counting traffic volume, which itself has been a complex task, based as it is on the CCTV system. The issue has always been the involvement of too many vehicle movements. If the researchers could implement distinguished regions tracking so as to monitor the different movements of vehicles, they could improve the counting process.
The experiment results are promising, with the model achieving an accuracy for different movements between 80 and 98%, all with just a single view of the camera.
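The idea of counting vehicles per movement using regions and trajectories can be sketched as follows. This is a simplified illustration rather than the paper’s method: it assumes a tracker has already produced one list of `(x, y)` points per vehicle, and the rectangular regions and function names are hypothetical.

```python
from collections import Counter


def region_of(point, regions):
    """Return the name of the first rectangular region containing the point."""
    x, y = point
    for name, (x_min, y_min, x_max, y_max) in regions.items():
        if x_min <= x <= x_max and y_min <= y <= y_max:
            return name
    return None


def count_movements(tracks, regions):
    """Count tracks by (entry region, exit region) from their first and last points."""
    counts = Counter()
    for track in tracks:
        if len(track) < 2:
            continue  # a single point says nothing about movement
        start = region_of(track[0], regions)
        end = region_of(track[-1], regions)
        if start and end and start != end:
            counts[(start, end)] += 1
    return counts
```

A track that starts in a “north” region and ends in an “east” region is counted as one north-to-east movement, which is how a single camera view can yield separate counts per turning direction.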
Ever spent ages trying to find a parking spot?
Heck, who hasn’t!
The parking spot issue is so prevalent in today’s society that the world’s top comedians make jokes about it. Woody Allen quipped, “the universe is expanding every second but I still can’t find a parking spot.”
Seinfeld even made two episodes about parking spots.
Finding a parking spot, of course, isn’t actually funny. It can be hugely stressful (as well as bad for the environment), and conquering the parking lot problem is something cities and towns all over the world are wrestling with.
How does computer vision work for parking management?
Let’s first start with the sensors.
Sensors are installed to monitor the parking lot for any empty spaces. Whenever a vehicle is parked in a space, the sensor is able to calculate the distance to its underside.
But because a sensor can’t scan license plates, cameras, parking meters and computer vision need to get involved.
Cameras are thus installed that use computer vision to identify spots with no meters. Using automatic number-plate recognition technology, they spot vehicles that are parked, as well as measure the amount of time they are parked for.
Computer vision can then use data to update in real-time the inventory of all empty and available spaces. Drivers can then access the map on their mobile device to check out all the available parking spots. This saves huge amounts of time, and is especially useful in overcrowded parking lots, such as airports.
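The occupancy logic described above can be sketched in a few lines. This is a minimal illustration, assuming each space reports a distance reading in centimetres; the function names, the 250 cm empty-space distance and the 50 cm margin are made-up values rather than figures from any real system.

```python
def is_occupied(distance_cm, empty_distance_cm=250.0, margin_cm=50.0):
    """Treat a reading well below the empty-space distance as a parked vehicle."""
    return distance_cm < empty_distance_cm - margin_cm


def available_spaces(readings):
    """Return the sorted IDs of spaces whose latest reading suggests they are free."""
    return sorted(space for space, dist in readings.items() if not is_occupied(dist))
```

Calling `available_spaces` on the latest readings gives the kind of live inventory that a driver-facing map could display.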
This system is already in use, too, with towns and cities using computer vision in Parking Guidance and Information (PGI) systems for visual parking lot occupancy detection.
Moreover, it’s a more affordable option than sensor-based technologies—technologies which are expensive, and which need frequent maintenance.
For instance, Zensors has already leveraged computer vision for parking management. They have a platform that tracks parking occupancy on a space-by-space basis, and which guides drivers to available spaces. Their Artificial Intelligence system “allows airport traffic managers to offer turn-by-turn directions to available parking, maximising traveller time at airside shopping and dining facilities.”
Pothole damage is a major issue in America, with estimates suggesting that it costs drivers more than $3 billion per year.
And yet for many years, road condition monitoring has largely been left in the hands of citizens, whose “task” is to raise awareness of damaged roads to their local councils.
Now, computer vision in AI transportation can detect road defects successfully, as well as assess the surrounding infrastructure by looking for changes in the asphalt and concrete.
Computer vision algorithms are able to identify potholes, as well as show exactly how much road damage there is so that the relevant authorities can take action and improve road maintenance.
The algorithms work by collecting image data, before processing it to create automatic crack detection and classification systems. These will then foster targeted rehabilitation and preventative maintenance that is free from human involvement.
In other words, the responsibility will no longer lie with the citizens to report potholes and other road damage. Instead, the AI systems will update in real-time so that faster action is taken. This saves time and money.
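One way to express “how much road damage there is” is the share of pixels that a detection model flags as damaged, which can then be used to rank road segments for repair. The sketch below assumes the model’s output is a binary mask (rows of 0s and 1s); the function names and the segment IDs are hypothetical.

```python
def damage_ratio(mask):
    """Fraction of pixels flagged as damaged in a binary mask (rows of 0/1)."""
    total = sum(len(row) for row in mask)
    if total == 0:
        return 0.0
    return sum(sum(row) for row in mask) / total


def rank_segments(masks_by_segment):
    """Order road segment IDs from most to least damaged."""
    return sorted(
        masks_by_segment,
        key=lambda segment: damage_ratio(masks_by_segment[segment]),
        reverse=True,
    )
```

A ranking like this is the sort of output that could help authorities target maintenance budgets at the worst stretches first.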
The overall aim of Automated Pavement Distress (PD) detection is to improve road maintenance allocation efficiency, while at the same time increasing road safety so that accidents are hugely reduced.
There are numerous AI-driven road maintenance projects currently in operation, including The RoadEye, which uses machine learning in transportation and computer vision to overcome the problem of road surface damage.
The RoadEye project will use an integrated system for real-time road condition monitoring. A camera will combine with an embedded system to integrate into a complete ADAS system, which itself will track in real-time via machine learning the condition of any road surface it travels on.
The machine learning techniques developed by The RoadEye will classify the condition of the road into various categories, such as “wet” or “normal,” and it will also detect irregularities on the road’s surface, including potholes.
The RoadEye’s goals include using cars to create a complete road image dataset, before training a computer vision technology on said datasets. The dataset—which will be collated on a national level—will then be used in ADAS applications.
The RoadEye has other uses, too. For example, it can warn a driver when there is ice on the road, improve driver retention so that they save on fuel, as well as give them more peace of mind when driving.
Traffic incident detection is one of the most heavily researched areas of ITS (Intelligent Transportation Systems) and AI transportation in general.
After all, for as long as there’s traffic, there will always be incidents—and there will always be hold-ups.
This is problematic for those tasked with keeping our roads clear, because the ultimate aim is to ensure that the traffic flows with the least amount of disruption possible.
For years, video surveillance has been instrumental when it comes to tracking road networks and intersections. It offers Traffic Management Centres a real-time view of incidents and the flow of traffic, allowing those in charge to respond as quickly as possible.
However, humans are limited and can’t be monitoring every single camera at the same time. Because the task has always been manual, incidents aren’t always detected straight away, and as a consequence hold-ups are prolonged.
This is where Automatic Incident Detection comes in.
Using computers, and combining sensors with computer vision to constantly monitor all cameras, it looks out for incidents, queues and unusual traffic conditions.
How does it work altogether?
Urban road networks are kitted out with CCTV cameras and multiple detectors. Together, they offer the foundations for automated, uninterrupted monitoring.
Powered by computer vision, the detectors are able to provide a constant data flow that assists TMCs with their traffic operations.
Control centre operators are alerted whenever there is an anomaly in the traffic conditions, and they are able to respond as soon as possible to any incidents that the AI-driven systems have detected.
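The alerting step can be illustrated with a very simple anomaly rule: flag an incident when measured speeds stay far below the usual level for several readings in a row. This is a sketch only, using a basic z-score test as a stand-in for whatever a production system does; the function name and thresholds are assumptions.

```python
from statistics import mean, pstdev


def detect_incident(baseline_speeds, recent_speeds, z_threshold=3.0, min_consecutive=3):
    """Return True if speeds drop far below the baseline for several readings in a row."""
    mu = mean(baseline_speeds)
    sigma = pstdev(baseline_speeds) or 1.0  # fall back to 1.0 for a flat baseline
    run = 0
    for speed in recent_speeds:
        if (mu - speed) / sigma > z_threshold:
            run += 1
            if run >= min_consecutive:
                return True
        else:
            run = 0  # the drop was not sustained, so start counting again
    return False
```

Requiring several consecutive low readings is a simple way to trade a little detection delay for fewer false alarms.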
To collect its data, Automatic Incident Detection relies on CCTV cameras and inductive loops embedded in the road.
Systems have already been created for automated incident detection. For instance, ClearWay is already sophisticated enough to be able to spot an incident within the first ten seconds after it’s taken place. The system works in any lighting and weather conditions, it can be used at intersections, in tunnels, on open roads, and it’s been designed with Smart Cities in mind.
The false alarm rate, meanwhile, is just one per sensor per day, and its AID radar covers as far as 1,000 m.
Among the different types of incidents it’s able to detect are:
Another company that’s using computer vision for AID is HikVision.
HikVision’s system can reliably detect incidents in tunnels, along bridges and on highways, and its solutions are based on intelligent video processing.
Operators are quickly alerted to incidents, and can react immediately.
Automated license plate recognition involves the use of computer vision camera systems attached to highway overpasses and street poles to capture a license plate number, as well as the location, date and time.
Once the image has been captured, the data is then fed into a central server.
Automated license plate recognition can use new camera systems designed specifically for this purpose, or it can use existing CCTV, as well as road-rule enforcement cameras.
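The record captured for each sighting (plate number, location, date and time) might be represented as below before being sent to a central server. This is a hypothetical sketch: the `PlateRead` class, its field names and the normalisation rule are assumptions, and the upload step itself is left out.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class PlateRead:
    plate: str
    latitude: float
    longitude: float
    captured_at: datetime  # date and time of the capture, stored in UTC


def normalise_plate(raw):
    """Upper-case the recognised text and drop spaces, dashes and other symbols."""
    return "".join(ch for ch in raw.upper() if ch.isalnum())


def make_read(raw_plate, latitude, longitude, captured_at=None):
    """Build a record ready to be queued for upload to a central server."""
    when = captured_at or datetime.now(timezone.utc)
    return PlateRead(normalise_plate(raw_plate), latitude, longitude, when)
```

Normalising the plate text first means that reads of the same vehicle from different cameras can be matched reliably.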
Why the need for automated license plate recognition?
It’s typically used by the police to help them corroborate evidence. For example, was a particular vehicle present at the scene of a crime? Does someone have a legit alibi?
However, automated license plate recognition can also be used to spot travel patterns and it’s used extensively in highway monitoring, parking management and toll management.
Whether the information collected by a police force is shared with other agencies is down to the specific law enforcement agency itself.
Automated license plate recognition isn’t seen as being as urgent as other transportation matters, such as traffic detection and road condition monitoring. As such, the issues that surround it, such as more government spending and high error rates, are seen as controversial.
Moreover, the fact that automated license plate recognition can in theory know the intimate details of a driver’s life and understand who is likely to—among other things—attend protests or visit gun shops and so on, means that public support is hit and miss.
After all, a driver cannot opt out of their license plate being seen.
Each year, there are around 56,000 road accidents due to sleepiness and fatigue in the USA and as many as 1,500 deaths.
In light of statistics like these, the UK government identified driver fatigue as “one of the main areas of driver behaviour that needs to be addressed.”
Personal responsibility has to come into it, but as those damning statistics show, it’s not enough. Human error will not be eradicated by asking drivers to drive more carefully.
Computer vision has now been added to car cabins for the purpose of better, safer driver monitoring. The technology, which uses face detection and head pose estimation to look out for things like drowsiness and emotional state, can help prevent thousands of crashes and deaths each year.
This is really important because many drivers don’t like to admit when they are fatigued, or that feeling a bit drowsy will impact their ability to drive. AI-driven tech can alert a driver whenever their driving is taking a major hit due to fatigue, and can advise them to pull over and take a rest.
This ensures the safety of the driver, their passengers and other road users.
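One widely used signal for drowsiness is the eye aspect ratio (EAR), computed from six eye landmarks, combined with the share of recent frames in which the eyes appear closed. The sketch below uses that technique as an example; it is not necessarily what any particular product does, and the landmark order, function names and thresholds are assumptions.

```python
import math


def eye_aspect_ratio(eye):
    """EAR from six (x, y) landmarks: corners p1 and p4, upper lid p2 and p3, lower lid p6 and p5."""
    p1, p2, p3, p4, p5, p6 = eye
    vertical = math.dist(p2, p6) + math.dist(p3, p5)
    horizontal = math.dist(p1, p4)
    return vertical / (2.0 * horizontal)


def is_drowsy(ear_values, closed_threshold=0.2, max_closed_fraction=0.3):
    """Flag drowsiness when too many recent frames show closed eyes."""
    if not ear_values:
        return False
    closed = sum(1 for ear in ear_values if ear < closed_threshold)
    return closed / len(ear_values) > max_closed_fraction
```

An open eye gives a noticeably higher ratio than a closed one, so tracking the ratio over a window of frames helps separate ordinary blinks from prolonged eye closure.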
Other areas the technology proves useful include driver distraction. If a driver is distracted—for example, by their mobile device—the tech can alert them immediately to stay focused on the road. Other distractions might include chatting to a backseat passenger, which, without the driver realising it, is impairing their concentration.
Eyedentify has already developed a solution for detecting driver fatigue and distraction. Their system is able to alert drivers in real-time to stay focused and keep their attention on the road.
Artificial intelligence in transportation is leveraging important advanced technology, such as big data in transportation for improved safety and machine learning for greater efficiency, so that towns and cities—as well as smart cities—are able to reduce the number of road accidents, improve the flow of traffic, and even bring criminals to justice.
Indeed, when you’re able to address all the key issues that are blighting the transportation industry, such as huge numbers of needless deaths, bottlenecks and damaged roads, with the likes of big data and machine learning in transportation, safety and efficiency improve dramatically.
Of course, we’re only at the exciting frontier. There is still more to come. As the technology continues to improve, the hope is that more smart cities will appear around the world, boosting worldwide operational efficiency, enhancing sustainability and making our roads, highways and intersections safer and better for all.