Losing to a Machine, Skill Issue?

By: Shawn L.

Have you ever been beaten by a chess bot? Well, start getting used to being outmatched by machines at games, because a racing AI for the game Gran Turismo recently managed to outrace the game’s (previous) human champions.

To teach the racing AI, named GT Sophy, how to improve at an ultra-realistic racing game that incorporates most, if not all, of the features of an actual race, Sony AI researchers used a technique called deep reinforcement learning, which essentially recreates the results of Pavlov’s dog in a program. In that experiment, Ivan Pavlov rang a bell whenever he fed his dogs, so that the dogs eventually started drooling whenever they heard the bell. Something similar was achieved by “rewarding” GT Sophy whenever it did something that led to a better outcome, such as following the rules of racing and staying on the track, hitting a sweet drift, or just using the acceleration pedal. By associating behaviors that improve lap times with rewards, GT Sophy became more inclined to chase better times. The researchers then made an iteration of this system for every car on every track before letting them just run to “practice”.
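To make the reward idea concrete, here is a minimal sketch of reinforcement learning in Python. This is not Sony’s actual training code: GT Sophy learns with deep neural networks, while this toy uses a simple lookup table, and the states, actions, and reward values below are invented for illustration.

```python
# A minimal sketch of reward-driven learning. NOT Sony's system:
# GT Sophy uses deep neural networks; this toy uses a Q-table, and
# every number here is made up for illustration.
import random
from collections import defaultdict

ACTIONS = ["accelerate", "brake", "drift"]

def reward(state, action):
    """Hypothetical reward: staying on track and making progress pay off."""
    r = 0.0
    if state["on_track"]:
        r += 1.0                      # reward for staying inside the track
        if action == "accelerate":
            r += 0.5                  # reward for making forward progress
    else:
        r -= 2.0                      # penalty for leaving the track
    return r

q_table = defaultdict(float)          # (state_key, action) -> estimated value
alpha = 0.1                           # learning rate

def update(state, action, r):
    """Nudge the action's estimated value toward the reward it just earned."""
    key = (state["on_track"], action)
    q_table[key] += alpha * (r - q_table[key])

# One simulated "practice" step: try an action, observe the reward, learn.
state = {"on_track": True}
action = random.choice(ACTIONS)
update(state, action, reward(state, action))
```

Over millions of such steps, the actions that earn rewards accumulate higher values, which is the “drooling at the bell” association in program form.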

What can’t be done here is simply programming GT Sophy to follow a set path with pre-optimized timings, because other racers are present on the track as well. This makes the situation much more complicated: the “environment” changes as the cars move, making a hand-written algorithm that covers every situation impractical to write. Thus, the task is left to deep learning instead.
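As a rough sketch of the difference, compare a scripted path with a learned policy. The function names and observation fields below are hypothetical; the point is only that a policy reads the current state of the track, while a script cannot.

```python
# Hypothetical contrast between a scripted racer and a learned policy.
# A fixed plan ignores the other cars; a policy maps the *current*
# observation to an action, so it can react as the environment changes.

def scripted_racer(waypoints):
    """Follows a pre-programmed racing line, blind to traffic."""
    for point in waypoints:
        yield ("steer_toward", point)   # fails if a rival sits on `point`

def learned_policy(observation):
    """Picks an action from what the car sees right now."""
    if observation["car_ahead"] and observation["gap_meters"] < 5:
        return "brake_and_go_wide"      # react to a car blocking the line
    return "hold_racing_line"

# The policy changes its answer as the situation changes:
print(learned_policy({"car_ahead": True, "gap_meters": 3}))    # brake_and_go_wide
print(learned_policy({"car_ahead": False, "gap_meters": 50}))  # hold_racing_line
```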

In the first test against human players, the humans achieved the highest overall score (though not necessarily the highest individual scores). In the second test, however, the AI won both individually AND as a team, achieving the fastest lap times. Contrary to expectation, the human players saw this as a learning experience, and some even enjoyed racing the AI. Others admired its perfect execution of some extremely difficult lines, and one of the players, Emily Jones (who competed in the FIA-certified 2020 Gran Turismo championships), even commented that certain maneuvers GT Sophy pulled off would be one-time achievements for human players, as they are too precise to perform consistently.

Other examples of AI written to play a complex game include AlphaStar in StarCraft II (a real-time strategy game with sometimes hundreds of “pieces”) and OpenAI Five in Dota 2.

AlphaStar had 44 days of cumulative training and beat low-ranked players, collecting 61 wins out of 90 games played. However, researchers noticed that the AI’s wins might have come from simply “out-clicking” its opponents. This is a result of the high reward for “microing” your units, which is essentially micromanaging them. For example, imagine having 10 soldiers, each performing at the same capacity whether healthy, slightly injured, or mortally injured. To maintain the damage output of the army of 10, the player needs to keep them all alive. With an exploitable targeting system like the one in StarCraft II, a player can bait the enemy into attacking one soldier, then “swap” it out for a fresh soldier once it is mortally injured, letting the group soak up damage without losing any damage output. To ensure that AlphaStar was actually outplaying its opponents instead of out-clicking them, the researchers limited the AI’s reflexes to those of an experienced human player. The end result was that AlphaStar placed in the top 0.5% of the game’s European servers.
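The “swap” trick is easy to see in a toy simulation. The numbers below (hit points, damage per second, enemy damage) are invented and this is not StarCraft II’s actual logic, but they show how rotating the freshest soldier into enemy fire keeps all ten alive, so the squad’s damage output never drops.

```python
# Toy model of the damage-soaking micro described above. All numbers
# are invented for illustration.

soldiers = [{"hp": 100, "dps": 10} for _ in range(10)]  # 100 dps total

def enemy_attack(target, damage=30):
    """Enemy focus-fires whichever soldier is in front."""
    target["hp"] -= damage

def micro_step(squad):
    """Rotate the healthiest soldier forward to soak the next hit."""
    squad.sort(key=lambda s: s["hp"], reverse=True)
    enemy_attack(squad[0])
    return sum(s["dps"] for s in squad if s["hp"] > 0)

for _ in range(8):
    output = micro_step(soldiers)

print(output)  # still 100: the hits were spread out, so nobody died
```

Without the rotation, the same eight hits focused on one soldier at a time would kill two of them and cut the squad’s output to 80, which is exactly the edge that fast, precise clicking buys.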

OpenAI Five, on the other hand, didn’t do as well on raw gameplay skill, bringing 25 heroes (the playable characters of Dota 2) to an MMR (matchmaking rating) of 5,000, around the 95th percentile of players. This is still impressive given the game’s emphasis on teamwork over individual skill. The AI was capable of acts such as sacrificing itself to save a human teammate, in the hope that the human knew what they were doing (which they didn’t).

These AIs show us the capabilities of machine learning. With GT Sophy, for example, we learned not only that it’s really good at racing, but also that simulations can be used to train an AI to drive. One flaw was that GT Sophy couldn’t respond to conditions such as wet roads or fallen objects, because it had never encountered them in simulation; those conditions, however, could be added so that the AI learns to handle them too. Another real-world lesson comes from OpenAI Five, which was able to understand cooperation and attempt teamwork, pointing toward human and machine cooperation in the future.

Ultimately, using machine learning for games can yield surprisingly useful results, such as teaching an AI how to race, or how machines can cooperate with people.



Works cited:

Bushwick, S. (2022, February 11). AI outraces human champs at the video game Gran Turismo. Scientific American. https://www.scientificamerican.com/article/ai-outraces-human-champs-at-the-video-game-gran-turismo/

Garisto, D. (2019, October 31). AI beats top human players at strategy game StarCraft II. Scientific American. https://www.scientificamerican.com/article/ai-beats-top-human-players-at-strategy-game-starcraft-ii/

OpenAI. (2020, September 2). OpenAI Five defeats Dota 2 world champions. https://openai.com/blog/openai-five-defeats-dota-2-world-champions/