Player versus Music – Generating gameplay from music

Gameplay generated by music

In this article I will discuss the methods I used to create gameplay from music for my game Audio Arena, an action/music/rhythm game made for Gear VR.
The game was developed by a single developer in just a few months on a tight budget.

Here is a gameplay trailer to show the game in action.


The premise of Audio Arena is simply: Player versus Music.
I knew I wanted an arena game with as many things as possible synced to the music. But which things would work well?

I thought the best place to look for inspiration was music visualizers. There are some great music visualization videos on YouTube. I looked at them to see what effects and gameplay elements I wanted to create.

I found that objects that have their movement speed based on the music work well, like in this video:


Things that pulse in size or intensity on the beat work well too, like in this video:

I also looked at other music games, such as StepMania, Audiosurf, Elite Beat Agents and similar games.

After some brainstorming and experimentation I ended up with this list:

  • Red enemies spawn on the bass.
  • All enemies spawn with a white flash.
  • Yellow and Purple enemies spawn on tones.
  • Yellow enemies pulse their size on the beat.
  • Purple enemies speed up on the beat.
  • A pattern of enemies, like a circle or square around the player, gets spawned at important moments in the music.
  • Weapons activated on the beat of the music give a combo bonus.
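To make the "pulse on the beat" idea concrete, here is a minimal sketch in Python. It is not the game's actual Unity code. It assumes the beat times are already known as a sorted list of seconds, and the names `pulse_scale`, `boost` and `decay` are made up for this illustration. The scale jumps up on each beat and decays back to the base size.

```python
import bisect
import math

def last_beat_before(beat_times, t):
    """Return the most recent beat time at or before t, or None if t precedes every beat."""
    i = bisect.bisect_right(beat_times, t)
    return beat_times[i - 1] if i > 0 else None

def pulse_scale(beat_times, t, base=1.0, boost=0.5, decay=8.0):
    """Scale factor that jumps to base + boost on a beat and decays back toward base.

    boost and decay are illustrative tuning values, not values from the game.
    """
    beat = last_beat_before(beat_times, t)
    if beat is None:
        return base  # no beat has happened yet
    return base + boost * math.exp(-decay * (t - beat))

beats = [0.5, 1.0, 1.5]  # made-up beat times in seconds
scale_on_beat = pulse_scale(beats, 1.0)       # largest value, right on the beat
scale_between = pulse_scale(beats, 1.25)      # partly decayed back toward base
```

The same curve could drive movement speed instead of size, which is roughly the difference between the yellow and purple enemies in the list above.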

Here are some gifs of the features in action:

(GIFs: weapon pulse, yellow enemies, combo)

Manual or Algorithm?
How did I program this? I needed to know exactly when the beats occur in my music. So a big question was: can I extract this information accurately using frequency analysis, or do I need to input it manually for every song?
I knew that StepMania uses step files (.sm) created by users to nail down the timings exactly.
I knew that Audiosurf uses frequency analysis.

Doing it with frequency analysis and a beat detection algorithm has many advantages:

  • The player could, in theory, use their own music to play.
  • It saves a lot of manual labor.
  • I could have many more levels and much more content in the game.

But would the detection be good enough for my game? I wanted to at least try.

I started out with a Unity plugin called Visualizer Studio.
This allowed me to extract the information I needed from the music. I quickly found out that analyzing that information was far trickier than I had hoped. Beat detection that worked well on some songs didn’t work at all on others. And nothing felt worse in the game than activating the weapons exactly on the beat and not having it recognized. I needed higher accuracy for my game to shine.
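To illustrate why this is tricky, here is a minimal sketch of one common, simple approach: energy-based beat detection. This is not Visualizer Studio's algorithm and not what the game uses. It is a generic Python illustration, and `window`, `history` and `threshold` are assumed tuning values. A window of audio counts as a beat when its energy is clearly above the average of the windows before it.

```python
def detect_energy_beats(samples, sample_rate, window=1024, history=43,
                        threshold=1.5, min_energy=1e-6):
    """Return times (in seconds) of windows whose energy exceeds
    threshold times the average energy of the preceding windows."""
    # Mean squared amplitude of each non-overlapping window.
    energies = []
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        energies.append(sum(s * s for s in chunk) / window)

    beats = []
    for i, energy in enumerate(energies):
        recent = energies[max(0, i - history):i]
        if not recent:
            continue  # nothing to compare the first window against
        average = sum(recent) / len(recent)
        if energy > min_energy and energy > threshold * average:
            beats.append(i * window / sample_rate)
    return beats
```

A single fixed `threshold` is the weak spot. A value that catches the kick drum in a sparse song can miss beats in a dense, heavily compressed mix, or fire several times in a row on a sustained loud passage.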

So I decided to do the information extraction manually. I figured doing it manually would teach me a lot about how to do it algorithmically as well.
I briefly thought about making a Unity Editor window to do the work in, but quickly found that existing tools make the work a lot easier.

I ended up using Sonic Visualiser to help me with this process.
I placed markers in the song on the bass and tones I needed. These are called Time Instants in Sonic Visualiser.


I exported these to a text file like this, where each line represents the timing of a beat.


So for each song I had the following text files:

Sonic Visualiser has a lot of plugins available, including beat trackers. I used a plugin called BeatRoot BeatTracker to help me get the beats, which saved me a lot of manual labor placing the instants.

In Unity I read in those text files and added the timings to arrays. I compared the current time in the music to the next beat or tone, and that way I was able to spawn enemies on the bass and tones. It also allowed me to check whether the weapons were activated in sync with the beat.
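The approach described above can be sketched as follows. This is a Python illustration, not the game's Unity code. It assumes each line of the exported file starts with a time in seconds, optionally followed by a label. `BeatTrack`, `is_on_beat` and the 0.08 second tolerance are hypothetical names and values chosen for the example.

```python
import bisect

def parse_timings(text):
    """Parse one timestamp (seconds) per line; skip blank lines, ignore any trailing label."""
    times = []
    for line in text.splitlines():
        fields = line.split()
        if fields:
            times.append(float(fields[0]))
    return sorted(times)

class BeatTrack:
    """Walks through sorted timings as the music time advances."""

    def __init__(self, times):
        self.times = times
        self.next_index = 0  # index of the next timing that has not fired yet

    def due(self, music_time):
        """Return every timing reached since the previous call (call once per frame)."""
        start = self.next_index
        while (self.next_index < len(self.times)
               and self.times[self.next_index] <= music_time):
            self.next_index += 1
        return self.times[start:self.next_index]

def is_on_beat(times, music_time, tolerance=0.08):
    """True if music_time is within tolerance seconds of the nearest timing."""
    i = bisect.bisect_left(times, music_time)
    neighbours = times[max(0, i - 1):i + 1]  # the timing before and the one after
    return any(abs(music_time - t) <= tolerance for t in neighbours)

bass = parse_timings("0.500000000\n1.000000000\tkick\n\n1.500000000\n")
track = BeatTrack(bass)
```

In a game loop, each frame would call `track.due(current_music_time)` and spawn one enemy per returned timing, while a weapon press would call `is_on_beat` to decide whether to award the combo bonus. Checking both the previous and the next timing means slightly early and slightly late presses are treated the same.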

The Unity docs give a warning about using AudioSource.time for this:
“Be aware that: On a compressed audio track position does not necessary reflect the actual time in the track. Compressed audio is represented as a set of so-called packets.
The length of a packet depends on the compression settings and can quite often be 2-3 seconds per packet.”

But I have had no problems on Windows and Android with these import settings on my .wav files.


This works extremely well, and far better than I could ever hope to achieve algorithmically. I might come back to the algorithmic way if the game ends up doing well and players really want to play with their own music.

Getting the music
I wanted electro-pop music with a BPM (beats per minute) between 80 and 110 because that plays the best. The music needed to have a clear beat and clear tones that are easily recognisable. It also needed enough diversity within a song to make it fun to play. For example, the beat could stop for 20 seconds or so, but not much longer. So I had quite a few demands for the music.

I knew that I would need to have at least 16 songs in the game to have a decent amount of playtime. Having 16 songs custom made would cost me too much time and money. So I decided to look for music to license.

I started with the APM Library that is available through the Unity Asset Store. I found a couple of nice songs that seemed to work well. But it is important to always read the fine print when buying assets for your game. APM Music has the following restrictions:

“This license does NOT grant you the rights to use the music for in game or trailers for Music/Rhythm/Karaoke based games or apps where the music is the primary or integral subject of the production and essential to gameplay, such as Guitar Hero.”

I emailed with them a bit, but it was soon clear that APM was not the way to go, as getting a custom license would be expensive and time consuming.

I ended up licensing all the songs from They have an excellent selection of electro-pop for decent prices. The only downside is that I will have to upgrade my license when the game sells 1000 copies and again when it sells 10000 copies. But at least that way my costs rise with the game's revenue.


I got a 20% discount for buying in bulk. Always email with any store you are buying a lot of things from! A quick email saved me $160 on this purchase. I ended up with 20 songs I really like for just $640 (initially).

Final Thoughts
Generating gameplay from music is a challenge that requires the right kind of music and the right kind of gameplay.
But when you get it right, the whole experience feels awesomely rewarding.