Playing for Data: Using GTA to teach self-driving cars to navigate the real world


273,588.430: that is the exact number of kilometres Google’s fleet of 58 cars clocked for its Self-Driving Car project in August this year (Google report)! Artificial Intelligence and Machine Learning have advanced to the point where we can teach computers to do amazing things, but the algorithms behind them need huge amounts of data to become accurate and efficient. That is why a company like Google drives so much in a single month.

Playing for Data

Is driving around for hundreds of thousands of kilometres in the real world the only way to teach cars to drive autonomously? Sure, simulation can help, but it has its limitations. There might be an alternative to simulation: researchers at TU Darmstadt, Germany and Intel Labs are using Grand Theft Auto’s streets for data!

Playing for Data: Ground Truth from Computer Games

In the paper, “Playing for Data: Ground Truth from Computer Games”, the authors extracted 25,000 frames from GTA, covering varying weather conditions and different times of day, to be used as training data for classifying pedestrians and other objects a self-driving car is likely to encounter in the real world.


It is a really interesting approach to getting ground truth for your classification algorithm. Who knew all those gaming videos on YouTube might turn out to be research material for self-driving cars later on!

Data and more information 

Interested in reading the paper? It’s available here. The data and the labels used for training the algorithms are available here (the code isn’t available yet).
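
If you do download the data, a first step is usually to match each frame with its label file. Here is a minimal sketch of that step. It assumes the frames and labels sit in two separate folders and share file names; that layout, the folder names and the `pair_frames_with_labels` helper are made up for illustration and are not taken from the paper or the dataset page, so adjust them to whatever the download actually contains.

```python
from pathlib import Path


def pair_frames_with_labels(image_dir, label_dir, suffix=".png"):
    """Match every frame to the label file that has the same file name.

    Assumption (hypothetical layout): image_dir and label_dir hold files
    with identical names, e.g. images/00001.png and labels/00001.png.
    """
    image_dir, label_dir = Path(image_dir), Path(label_dir)
    pairs, missing = [], []
    for image_path in sorted(image_dir.glob(f"*{suffix}")):
        label_path = label_dir / image_path.name
        if label_path.is_file():
            pairs.append((image_path, label_path))
        else:
            # Keep track of frames without a label so they can be inspected.
            missing.append(image_path)
    return pairs, missing


# Example call with hypothetical folder names:
# pairs, missing = pair_frames_with_labels("gta/images", "gta/labels")
# print(len(pairs), "labelled frames,", len(missing), "frames without labels")
```

Collecting the unmatched frames separately, rather than silently skipping them, makes it easy to spot an incomplete download before any training starts.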