OpenAI's GPT-3.5 is the champion of the Street Fighter III LLM Colosseum, beating Mistral on its home turf
Beat 'em ups are clearly the superior way to test large language models.
You probably already know that large language models (LLMs) are used to power chatbots or generative AI tools for Windows. You probably also know that some are better than others when it comes to getting accurate and reliable responses. But did you know that when it comes to Street Fighter III, there's one that stands above the crowd? The winner of (the first ever?) SF3 LLM Colosseum just so happens to be OpenAI's GPT-3.5.
At the Mistral AI Hackathon event in San Francisco last week, a small team of AI enthusiasts dedicated themselves to finding the ultimate truth about large language models: Which LLM is best at fighting? According to the group, LLMs are better than reinforcement learning algorithms for such cases, because rather than just reacting on the basis of an accumulated reward, LLMs are far more context-based.
The way it all works is like this: The LLM is given a text description of the screen and it then calculates what move the player will make based on the player's previous moves, what the opponent is doing, and the health bars of both characters. Then it's just a case of sitting back and letting two LLMs have at each other.
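As a rough illustration of that loop, here is a minimal Python sketch. It is not the project's actual code: the move list, the `GameState` fields, and the `ask_llm` callable are all hypothetical stand-ins, and the model call is replaced by a stub so the example runs on its own.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical move list; the real project's move names may differ.
MOVES = ["move closer", "move away", "jump", "low punch",
         "high kick", "block", "fireball"]


@dataclass
class GameState:
    """Assumed shape of the information extracted from the screen."""
    own_health: int
    opponent_health: int
    distance: str            # e.g. "close" or "far"
    opponent_action: str     # e.g. "jumping"
    previous_moves: List[str] = field(default_factory=list)


def describe(state: GameState) -> str:
    """Turn the game state into the text description sent to the model."""
    recent = ", ".join(state.previous_moves[-3:]) or "none"
    return (
        f"Your health: {state.own_health}. "
        f"Opponent health: {state.opponent_health}. "
        f"The opponent is {state.distance} and is {state.opponent_action}. "
        f"Your recent moves: {recent}. "
        f"Reply with exactly one move from: {', '.join(MOVES)}."
    )


def parse_move(reply: str, default: str = "block") -> str:
    """Pick the known move mentioned earliest in the reply, else a default."""
    text = reply.lower()
    found = [(text.find(move), move) for move in MOVES if move in text]
    return min(found)[1] if found else default


def choose_move(state: GameState, ask_llm: Callable[[str], str]) -> str:
    """One turn: describe the screen, ask the model, record the chosen move."""
    move = parse_move(ask_llm(describe(state)))
    state.previous_moves.append(move)
    return move


def fake_llm(prompt: str) -> str:
    """Stub standing in for a real model call (an assumption for this demo)."""
    return "I will use Fireball now."


if __name__ == "__main__":
    state = GameState(own_health=80, opponent_health=60,
                      distance="far", opponent_action="jumping",
                      previous_moves=["block"])
    print(choose_move(state, fake_llm))
```

Swapping `fake_llm` for a function that calls a real model would give each fighter its own "brain", with the fallback move covering replies that name no valid action.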
Given the nature of the event, the first test runs involved pitting different versions of the Mistral LLM against each other in frantic head-to-head battles, but then the group upped the ante by bringing in OpenAI and its GPT-3.5 and GPT-4 models.
Fists were flung, combos cranked out, blocks battered, and dodges delivered. After many battles, the results were collated and one model stood proudly in the gold position: OpenAI GPT-3.5, specifically the latest Turbo version. Silver and bronze were split by the tiniest of margins, with Mistral-small-2402 just pipping a GPT-4 preview model to the post.
You can give all of this a go yourself, as the source code for the project is available on GitHub, and you don't need a supercomputer to handle it all. However, you will need a suitable game ROM file, and it'll need to be one from an old 2D beat 'em up or a 3D one that has limited environment movement.
The potential applications of this are obvious, and I wonder how long it will be before we see games where you'd swear you're playing against another person but it's actually just an LLM in action.
It all looks really cool, though I can't help but wonder if folks of a more military mind will be thinking about what else large language models can be used for. Especially given GPT-3.5's propensity for going thermonuclear in war games.
Hey, it's an AI story and I didn't mention SkyNet once! Oh, fiddlesticks.

Nick, gaming, and computers all first met in the early 1980s. After leaving university, he became a physics and IT teacher and started writing about tech in the late 1990s. That resulted in him working with MadOnion to write the help files for 3DMark and PCMark. After a short stint working at Beyond3D.com, Nick joined Futuremark (MadOnion rebranded) full-time, as editor-in-chief for its PC gaming section, YouGamers. After the site shut down, he became an engineering and computing lecturer for many years, but missed the writing bug. Cue four years at TechSpot.com covering everything and anything to do with tech and PCs. He freely admits to being far too obsessed with GPUs and open-world grindy RPGs, but who isn't these days?

