Like an Artificial Intelligence in a Candy Store

Team modl.ai modl.ai's team of experts writing about things they love!

We auto-generated hundreds of match-3 levels for Candy Crush Saga using different methods to see if we could pass them off as human-made levels. Capturing patterns and creating symmetry were the key ingredients of the most successful algorithm.

If you’ve played a lot of Candy Crush Saga, you might be able to answer this: Which of these two levels was created by a human level designer, and which was created using machine learning?

Level A
Level B

If you haven’t crushed much candy, the task might be hard. “A” is auto-generated while “B” is made by a human level designer. Did you guess right? Well, here’s how we did it.

Categorizing

First, we categorized the different kinds of tiles so we didn’t have to handle the many types of candy in Candy Crush Saga.

The six categories of tiles in Candy Crush Saga and how we encode them. As an example, we have shown the “blocker” category layer as 1s in the binary encoding.
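
To make the encoding concrete, here is a minimal sketch of how a level could be split into one binary layer per category. The category names other than "blocker" are placeholders we made up for illustration, and the 3x3 board is far smaller than a real level.

```python
# Hypothetical category names: only "blocker" is named in this post.
CATEGORIES = ["empty", "regular", "blocker", "special", "ingredient", "other"]


def encode_level(level):
    """Turn a 2D grid of category names into one binary layer per category."""
    layers = {}
    for category in CATEGORIES:
        layers[category] = [
            [1 if tile == category else 0 for tile in row]
            for row in level
        ]
    return layers


# A tiny example level (real levels are larger).
level = [
    ["regular", "blocker", "regular"],
    ["blocker", "empty", "blocker"],
    ["regular", "blocker", "regular"],
]

layers = encode_level(level)
blocker_layer = layers["blocker"]  # 1 where a blocker sits, 0 elsewhere
```

Each tile ends up as a 1 in exactly one layer, so the models only ever have to reason about six categories instead of every individual candy type.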

After this step, we were ready to generate levels. We used two established methods: Generative Adversarial Networks (GANs) and Markov Random Fields (MRFs).

This is where we ran into trouble. You can see the levels we generated below. Our levels were quite different from those created by the Candy Crush Saga level designers. Ours lacked a vital component: symmetry.

Patterns and symmetry make the tiles seem intentionally placed. The levels we generated seemed chaotic in comparison. How could we solve this?

Our auto-generated levels lacked symmetry and patterns.

Capturing local and global patterns

We realized that Candy Crush Saga levels contain two kinds of patterns: local patterns, which are confined to small sections of the level, and global patterns, which span the entire level. We had to create both types of patterns if we wanted to fool players with our auto-generated levels.

Local patterns are confined to the nearest neighborhood, while global patterns span the entire level.

Changing existing methods

GAN (Generative Adversarial Network) was the first method we tried. A GAN uses adversarial training, meaning that a network called the Generator tries to forge Candy Crush Saga levels. Another network called the Discriminator is trained on original Candy Crush Saga levels and then tries to detect whether the Generator’s levels are fake. The Generator receives feedback from the Discriminator on how well it did and is rewarded when it does a good job.
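
The feedback loop described above can be sketched with the standard GAN loss functions. This is a generic textbook formulation (binary cross-entropy with the common non-saturating Generator loss), not necessarily the exact objective used in the study, and the scores below are made-up Discriminator outputs.

```python
import math


def discriminator_loss(real_scores, fake_scores):
    """Binary cross-entropy: real levels should score near 1, fakes near 0."""
    real_term = sum(-math.log(s) for s in real_scores) / len(real_scores)
    fake_term = sum(-math.log(1.0 - s) for s in fake_scores) / len(fake_scores)
    return real_term + fake_term


def generator_loss(fake_scores):
    """Low when the Discriminator believes the generated levels are real."""
    return sum(-math.log(s) for s in fake_scores) / len(fake_scores)


# Hypothetical Discriminator outputs: estimated probability a level is real.
real_scores = [0.9, 0.8]
weak_fakes = [0.1, 0.2]        # easily spotted forgeries
convincing_fakes = [0.7, 0.6]  # forgeries that mostly pass as real

print(generator_loss(weak_fakes), generator_loss(convincing_fakes))
```

The Generator's loss drops as its forgeries become more convincing, while the Discriminator's loss rises. That tension is the "adversarial" part.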

Here you see how the Discriminator labels the levels as it decides if they’re real or fake.

We added layers to the Discriminator that were able to consider the level as a whole. We did this to capture the global patterns a Candy Crush Saga level needs in order to look like it was made by a human level designer. We called this GlobalGAN.

We also made an alternate version of GlobalGAN, which we trained exclusively on Candy Crush Saga levels with vertical symmetry. We called this GlobalGAN-vert.
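
Selecting such a training subset could look roughly like the sketch below. We assume here that "vertical symmetry" means the left half of the board mirrors the right half, and the tiny levels are invented examples.

```python
def is_vertically_symmetric(level):
    """True if every row reads the same left-to-right and right-to-left."""
    return all(row == row[::-1] for row in level)


def select_symmetric(levels):
    """Keep only the levels that are mirrored across the vertical center line."""
    return [level for level in levels if is_vertically_symmetric(level)]


# Two made-up levels, using 1 for a blocker and 0 for anything else.
symmetric = [[1, 0, 1], [0, 1, 0]]
asymmetric = [[1, 0, 0], [0, 1, 0]]

training_set = select_symmetric([symmetric, asymmetric])
```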

We trained the networks for thousands of epochs, that is, full passes through the training datasets. When training was done, we could use the Generators to create levels that at least fooled their Discriminators. Would they fool human players as well?

We also wanted to try another method: the Markov Random Field (MRF).

An MRF generates patterns by looking at neighboring tiles. Given a tile’s neighbors, it estimates how likely each tile category is, based on the training data, and picks accordingly. We named this method LocalMRF.

LocalMRF looks at the blue tiles to determine the category of the white tile.
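
A minimal sketch of this idea: count, across the training levels, which category appears for each configuration of neighbors, then turn the counts into likelihoods. The four-tile neighborhood is an assumption made for illustration, since the actual neighborhood is only shown in the figure, and the checkerboard level is invented.

```python
from collections import Counter, defaultdict

# Assumed neighborhood: the four directly adjacent tiles.
OFFSETS = [(-1, 0), (1, 0), (0, -1), (0, 1)]
OUTSIDE = "outside"  # placeholder for positions beyond the board edge


def neighbors(level, r, c):
    """Categories of the neighboring tiles, in a fixed order."""
    result = []
    for dr, dc in OFFSETS:
        rr, cc = r + dr, c + dc
        inside = 0 <= rr < len(level) and 0 <= cc < len(level[0])
        result.append(level[rr][cc] if inside else OUTSIDE)
    return tuple(result)


def train(levels):
    """Count how often each category appears for each neighbor configuration."""
    counts = defaultdict(Counter)
    for level in levels:
        for r, row in enumerate(level):
            for c, tile in enumerate(row):
                counts[neighbors(level, r, c)][tile] += 1
    return counts


def likelihoods(counts, context):
    """Estimated probability of each category given the neighbors."""
    total = sum(counts[context].values())
    return {tile: n / total for tile, n in counts[context].items()}


checkerboard = [
    ["a", "b", "a"],
    ["b", "a", "b"],
    ["a", "b", "a"],
]
counts = train([checkerboard])
center_context = neighbors(checkerboard, 1, 1)
print(likelihoods(counts, center_context))
```

A generator built on these counts would fill in a level tile by tile, sampling each category from the likelihoods for its current neighbors.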

LocalMRF was great at producing local patterns but ignored the overall symmetry we needed. Once we realized this, we added two extra, non-local tiles to the neighborhood the MRF looks at. We called this GlobalMRF. We hoped it would increase the likelihood of symmetric patterns spanning the entire level.

GlobalMRF also looks at non-local tiles when deciding the category of the white tile. This increases the chances of global patterns.
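
One way to picture the extended neighborhood is sketched below. The post does not spell out which two non-local tiles were added, so the mirrored positions used here are purely our assumption for illustration, as are the four local neighbors and the 9x9 board size.

```python
def local_positions(r, c):
    """Assumed local neighborhood: the four directly adjacent positions."""
    return [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]


def global_positions(r, c, height, width):
    """Local neighbors plus two assumed non-local tiles: the mirror images."""
    mirrored_left_right = (r, width - 1 - c)   # across the vertical center line
    mirrored_top_bottom = (height - 1 - r, c)  # across the horizontal center line
    return local_positions(r, c) + [mirrored_left_right, mirrored_top_bottom]


positions = global_positions(1, 2, height=9, width=9)
```

Because the model now conditions on tiles far away from the one it is choosing, whatever it learned about mirrored tiles in the training levels can carry over to the levels it generates.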

What worked best?

We had trained our models, but the question remained: Which of our four models generated the best Candy Crush Saga levels?

We created a scoring system where we rated the levels for horizontal, vertical, and diagonal symmetry compared to original Candy Crush Saga levels.
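
A simple symmetry score of this kind can be sketched as the fraction of tiles that match their mirrored counterpart. This is an illustrative metric under our own naming conventions ("horizontal" meaning top mirrors bottom, "vertical" meaning left mirrors right), not necessarily the exact scoring used in the study.

```python
def symmetry_score(level, mirror):
    """Fraction of tiles that match their mirrored counterpart."""
    height, width = len(level), len(level[0])
    matches = 0
    for r in range(height):
        for c in range(width):
            mr, mc = mirror(r, c, height, width)
            if level[r][c] == level[mr][mc]:
                matches += 1
    return matches / (height * width)


MIRRORS = {
    "horizontal": lambda r, c, h, w: (h - 1 - r, c),  # top <-> bottom
    "vertical": lambda r, c, h, w: (r, w - 1 - c),    # left <-> right
    "diagonal": lambda r, c, h, w: (c, r),            # square boards only
}

# A made-up 3x3 level that is perfectly left-right symmetric.
level = [
    [1, 0, 1],
    [1, 1, 1],
    [0, 0, 0],
]
scores = {name: symmetry_score(level, m) for name, m in MIRRORS.items()}
print(scores)
```

Scoring both generated and original levels with the same function gives a direct way to compare them.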

GlobalGAN

While the levels of this network did contain some symmetric shapes, they were mostly chaotic.

GlobalGAN-vert

The levels that came out of this network were slightly more symmetric than those of GlobalGAN.

LocalMRF

These levels did show some symmetry but were mostly chaotic.

GlobalMRF

We saw a big improvement in symmetry here.

While none of our models reached a symmetry score as high as the original Candy Crush Saga levels, GlobalMRF scored high on vertical symmetry.

Would the employees at King, the company that produces Candy Crush Saga, be able to see these patterns? Would the GlobalMRF levels seem like real Candy Crush Saga levels to the creators of the game?

Human versus machine

We picked level samples from each generator and presented them to employees from King. 

We quickly learned that humans don’t identify patterns the same way machines do. Take a look below. The level has several local patterns. Would you recognize them all? Depending on what kind of candy the models chose, humans might only barely make out the patterns.

King employees did at times identify local patterns in the levels, but less often than they detected global patterns. Despite our efforts, the original levels still seemed to contain more obvious global patterns. So, which model performed best?

What worked best?

All four models were able to produce levels that fooled quite a few players. GlobalMRF performed best on symmetry, which was the measure we targeted in this experiment, but GlobalGAN-vert was able to fool more people. The single best-performing level, a GlobalGAN level, fooled 47% of King employees.

Need more information?

If you want to dive into further details, you can see a video presentation here or read the actual paper here: Capturing Local and Global Patterns in Procedural Content Generation via Machine Learning.

The people

Our team at modl.ai, as well as the good people from King, did all the heavy lifting on this study: Vanessa Volz, Niels Justesen, Sam Snodgrass, Sahar Asadi, Sami Purmonen, Christoffer Holmgård, Julian Togelius, and Sebastian Risi. Without them, this research would not have been possible.

Thanks for reading, and as always, please contact us with any questions and requests you might have.

