MIT using text-based games for training AI natural language processing

Griatch

I was just made aware of a recently presented MIT article where they use text game worlds created in Evennia.

The article tests AI natural language processing using a "deep-learning" neural network. The goal is, basically, for the system to learn to play the game as optimally as possible based only on the text being shown to it. That is, the computer needs to discern the (hidden) game state from the text output alone, the same way a human player must.

The scientists first build a small test "house" in Evennia, containing four rooms. This is used to train their neural network. The house has rather formulaic descriptions and features a straightforward "quest" of finding and eating an apple. They then apply this trained network to the actual challenge - playing Evennia's official tutorial world, which comes with Evennia out of the box.

Our tutorial world is basically a small and well-contained solo-quest to show off some of the things possible with the system. The scientists don't appear to have modified the actual game play, but I can only presume they have tricked out the system with some extra behind-the-scenes stuff to properly monitor and analyze their results.

The paper's authors write that solving the multi-room final quest (finding the tomb of a fallen hero) was too hard a problem and beyond the scope of this article. Instead they focused their performance analysis on a much more mundane task: the AI player managing to cross a bridge.

The swaying bridge in question is actually implemented as a single Evennia room but designed so that it takes multiple actions to cross. Every step, the room description you see changes and there are also random weather messages as well as a small random chance to fall off the bridge (and end up in a different area) if you linger for too long. All you really need to do is to walk east multiple times, but for an AI having no inherent concept of what a "bridge" is, this is clearly a complex task.
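To make the "single room, multiple steps" idea concrete, here is a rough standalone Python sketch. It is not the tutorial world's actual code and does not use Evennia's API; the class name, descriptions, probabilities and the `linger_after` threshold are all made up for illustration. It only shows the general shape: one object tracks a hidden position, and the player (or AI) only ever sees text.

```python
import random


class SwayingBridge:
    """Toy stand-in for one 'room' that takes several moves to cross.

    Hypothetical illustration only; names and numbers are assumptions.
    """

    DESCRIPTIONS = [
        "You stand at the western end of a swaying rope bridge.",
        "You are a short way out on the bridge; it creaks under you.",
        "You are halfway across. The chasm yawns below.",
        "You are nearing the eastern side of the bridge.",
    ]
    WEATHER = ["A cold gust tugs at the ropes.", "Rain starts to fall."]

    def __init__(self, fall_chance=0.2, weather_chance=0.3,
                 linger_after=6, rng=None):
        self.rng = rng or random.Random()
        self.fall_chance = fall_chance        # per-turn risk once lingering
        self.weather_chance = weather_chance  # chance of a weather message
        self.linger_after = linger_after      # turns before falling is possible
        self.position = 0                     # hidden state: step on the bridge
        self.turns = 0
        self.status = "on_bridge"             # "on_bridge", "crossed" or "fell"

    def step(self, command):
        """Apply one text command; return only the text a player would see."""
        if self.status != "on_bridge":
            return "You are no longer on the bridge."
        self.turns += 1
        # Lingering too long brings a small random chance of falling.
        if self.turns > self.linger_after and self.rng.random() < self.fall_chance:
            self.status = "fell"
            return "A plank gives way and you tumble into the darkness!"
        if command == "east":
            self.position += 1
        elif command == "west":
            # Simplification: walking west at the start just keeps you there.
            self.position = max(0, self.position - 1)
        else:
            return "You can't do that here."
        if self.position >= len(self.DESCRIPTIONS):
            self.status = "crossed"
            return "You step off the bridge onto solid ground."
        text = self.DESCRIPTIONS[self.position]
        if self.rng.random() < self.weather_chance:
            text += " " + self.rng.choice(self.WEATHER)
        return text


def run_episode(policy, seed, max_turns=30):
    """Let a policy (a function from text to command) try to cross once."""
    bridge = SwayingBridge(rng=random.Random(seed))
    text = bridge.DESCRIPTIONS[0]
    for _ in range(max_turns):
        text = bridge.step(policy(text))
        if bridge.status != "on_bridge":
            break
    return bridge.status


def always_east(text):
    """The 'obvious' human strategy: keep walking east."""
    return "east"


def make_random_policy(seed):
    """A baseline that picks commands blindly, valid or not."""
    rng = random.Random(seed)
    return lambda text: rng.choice(["east", "west", "look", "jump"])
```

Running `run_episode(always_east, seed=1)` versus `run_episode(make_random_policy(1), seed=1)` over many seeds gives a feel for why a blind player wastes turns on invalid commands and risks falling, while a player that has worked out what "east" does simply walks across.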

The article shows that their proposed algorithm performs better than the methods it is compared against (and a lot better than random chance). The AI does figure out how to cross the bridge about 50% of the time, and it tends to score higher than the compared methods when it comes to avoiding invalid commands and similar scoring metrics. While their measure of success is limited to successfully crossing the bridge, many of their examples are from other rooms, which suggests to me that the AI is actually taking on the whole game.

So from the perspective of the scientists, this was successful. It does give some insight into the current state-of-the-art of AI research as well: There is certainly no real competition to human players on the horizon at this point, but it's a small step on the way. I hope they continue expanding their work along the same lines. Using text-based games as test beds sounds like a good idea overall, and from my own brief contact with the researchers (at a point where they were just thinking about using Evennia), they seem to be quite dedicated to the concept.

At least I found this interesting. I wrote up more details, including links to the original article, in the Evennia dev blog here.
.
Griatch

Rainbow Unicorn

Well.

We knew there was something funny about Jill.

...more funny.

surreality

@Rainbow-Unicorn said:

Well.

We knew there was something funny about Jill.

...more funny.

...I am so glad I wasn't the only one who thought of Jill immediately.

Sunny

Yeah, that was my thought as well.

Griatch

Ah, I think I read about Jill somewhere on here.

This made it to the front page of HackerNews, funnily enough.
.
Griatch