Producer Rob Bridgett discusses how sound quality in video games has improved over the years, and offers his thoughts on what the future may hold.
As recently as five years ago, when pondering the future video game audio landscape, I don’t think anyone could have predicted the extent of the schism that has opened up in how we think of ‘video games’.
Ten years ago, most of us would have predicted a steady increase in visual and aural fidelity as games inevitably crept towards a future ‘cinematic’ model, foregrounding photorealism and narrative immersion. To some extent, this is still happening. However, things have changed dramatically, and quickly, on two fronts.
Firstly, there has been the unstoppable rise of Indie developers – studios like Playdead, which have challenged what audiences think of as ‘game experiences’ by going back to the drawing board in both design and artistic terms. Their game Limbo, a black and white platformer with no dialogue, stood out primarily for those reasons, but underneath lay no less polished a player experience. From a technology viewpoint, Limbo could have been realised on a PlayStation 2, yet it became one of the biggest critical successes of the PS3/Xbox 360 console cycle. In making it, Playdead created something completely against the grain of the firmly established first-person shooters and third-person open-world triple-A games.
In addition to this, the introduction – and subsequent ubiquity – of tablets and mobile phones as gaming platforms has not only pushed a massive reset button on the whole ‘bigger, longer, more’ design philosophy, but has also provided a publishing model free of publisher intervention: any developer who can code up their idea can put it in an app store and reach an audience.
These two disruptive shifts have also had a significant effect on the way games are conceived, developed and, of course, how audio is integrated.
So where are we now?
When we look around at video game culture right now, what is apparent is that the bigger triple-A productions, and the big teams who make them, are still around and are still pushing the envelope ever further to produce some fantastically rich, and ever-more cinematic, experiences.
Naughty Dog’s output is nothing short of incredible in this regard and the GTA franchise, rather than going stale and seeing critical (or sales) declines, continues to go from strength to strength.
Ubisoft is similarly invested in developing huge franchises like Assassin’s Creed, Far Cry and Watch Dogs and there is still a healthy and lucrative ambition to move into and dominate this space.
On the other hand, we see incredibly innovative and successful games being created by tiny teams like Simogo (Year Walk, Device 6, Sailor’s Dream) and UsTwo (Monument Valley) – rich, polished and gorgeous experiences that are every bit as immersive as those that would pursue a cinematic narrative model.
The Middle / Third Space
Of course, this isn’t just a simple case of David vs Goliath. A thriving, innovation-driven middle ground has also emerged: developers such as Double Fine, Media Molecule (Tearaway), E-line Media (Never Alone), Minority (Papo & Yo) and even Playdead (Limbo/Inside) occupy a ‘third space’ between the small, risk-taking, mobile-focused Indie teams and the giant triple-A console behemoths. These ‘third space’ developers tend to field medium-sized teams working on innovative game experiences across mobile and console platforms – usually relying on a downloadable distribution model rather than a boxed product on shelves.
Is there a similar schism in audio production and tools?
The mobile devices that these games run on are capable of some incredibly sophisticated audio rendering: they can play plenty of simultaneous voices, reliably stream sounds and apply multiple run-time DSP effects. While the output of smaller devices is limited to a built-in speaker or stereo headphones, sophisticated run-time mixing, ducking and bus grouping is used extensively to present these games beautifully. The same can now be said of browser-focused game and audio experiences.
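To illustrate the kind of run-time ducking mentioned above, here is a minimal, hypothetical sketch (not taken from any real engine or middleware): when a high-priority bus such as dialogue is active, lower-priority buses are smoothly attenuated toward a duck level, and they recover toward full volume when it stops. The bus names, rates and levels are illustrative assumptions.

```python
# Hypothetical sketch of run-time bus ducking: when dialogue is active,
# duckable buses (music, ambience) ramp toward a reduced gain; otherwise
# they ramp back toward full volume. All parameter values are illustrative.

def duck_step(gain, target, rate, dt):
    """Move a bus gain toward its target by at most rate*dt per step."""
    delta = target - gain
    step = max(-rate * dt, min(rate * dt, delta))
    return gain + step

def update_buses(buses, dialogue_active, duck_level=0.3, rate=2.0, dt=0.016):
    """Advance all duckable bus gains by one frame (~16 ms)."""
    target = duck_level if dialogue_active else 1.0
    return {name: duck_step(g, target, rate, dt) for name, g in buses.items()}

buses = {"music": 1.0, "ambience": 1.0}
for _ in range(30):  # dialogue starts: gains ramp down over ~half a second
    buses = update_buses(buses, dialogue_active=True)
```

The ramp rate is what keeps the mix from pumping audibly; real middleware exposes the same idea as attack/release times on a ducking rule.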
In terms of tools available for both triple-A and mobile audio production and implementation, the choices are often identical, with software like Audiokinetic’s Wwise being used on large projects such as Assassin’s Creed, Alien: Isolation and Bioshock Infinite, as well as on smaller titles like Limbo and Peggle.
Across the industry, the tools for content creation turn out to look very similar, with middleware companies that have come into existence only over the last 10 years or so finally finding a prominent foothold in the production landscape across all game genres and types and all platforms. Many of the techniques and processes like mixing, dialogue logic, SFX production and implementation are the same from big studios to small ones. The primary difference to understand between the two extremes of the industry is simply one of scale: less content, shorter experiences, shorter development cycles and smaller development teams on the mobile/indie side of the garden.
Whatever the approach, the overall goal of audio is the same on any scale of project: how can sound convey the experience to the player? And how can it do so in a polished, non-annoying way?
Where Next for Triple-A Sound?
As the sector is so risk-averse due to increasing production and marketing budgets, we can’t expect too much on the horizon in terms of innovation, certainly not on a design, franchise-model or production style level. However, we can expect many smaller incremental changes.
It is now entirely possible, indeed just around the corner, that the already object-oriented 3D positional audio sources in games will begin to take advantage of the recent object-based surround formats like Dolby Atmos, specifically in their ‘home consumer’ incarnations.
With the addition of a run-time translation layer, these technologies will allow 3D positional sound sources that already exist in the game engine – say the positional sounds associated with a game object like an enemy sniper – to be localised as audio objects in such technologies as the Dolby Atmos RMU at run-time.
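The translation layer described above essentially converts in-engine emitter positions into listener-relative object metadata. Here is a hypothetical sketch of that conversion – the function name, coordinate convention and normalisation range are my own assumptions, not a real Atmos or engine API:

```python
# Hypothetical sketch of a run-time translation layer: an in-game 3D emitter
# position becomes listener-relative object-audio metadata (azimuth,
# elevation, normalised distance) of the kind an object-based renderer
# could consume. Axes: x = right, y = up, z = forward. Illustrative only.

import math

def to_audio_object(emitter_pos, listener_pos, max_radius=50.0):
    """Return (azimuth_deg, elevation_deg, normalised_distance)."""
    dx = emitter_pos[0] - listener_pos[0]
    dy = emitter_pos[1] - listener_pos[1]  # height difference
    dz = emitter_pos[2] - listener_pos[2]
    horiz = math.hypot(dx, dz)
    azimuth = math.degrees(math.atan2(dx, dz))       # 0 = straight ahead
    elevation = math.degrees(math.atan2(dy, horiz))  # positive = overhead
    distance = min(1.0, math.sqrt(dx*dx + dy*dy + dz*dz) / max_radius)
    return azimuth, elevation, distance

# A sniper 10 m ahead and 5 m above the listener pans front-and-up:
az, el, dist = to_audio_object((0.0, 5.0, 10.0), (0.0, 0.0, 0.0))
```

The positive elevation value is exactly what a height channel or overhead speaker would use to place the sniper above the player.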
Introducing height speakers (overhead helicopters, footsteps of an enemy overhead as you crouch under the floorboards) and much finer localisation through multiple speaker arrays will certainly be the next big thing in immersion technology and mixing and will be used to produce some incredible moments akin to ‘ride-films’ for those who have this technology installed at home.
From a marketing standpoint, we’ll no doubt be seeing the likes of the Dolby CP850 in theatres receiving the object-based surround output of a video game console via HDMI, so that a game could be enjoyed in a fully equipped Dolby Atmos movie theatre – perfect for big game promo events, exclusive reveals and the like.
Procedural sound generation (the creation of sounds at run-time by purely synthesised means) remains a hot area for growth. GTA V boasted a large percentage of its sound effects being created procedurally, and as the techniques and tools become more accessible to designers and content creators, they will almost certainly gain more of a foothold in the everyday lexicon of game sound designers and composers – perhaps even moving into convincing run-time voice content.
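As a toy example of what “purely synthesised at run-time” means, here is a hypothetical sketch of a procedural wind bed: white noise pushed through a one-pole low-pass filter whose cutoff slowly sweeps to create gusts. No audio assets are loaded; every sample is computed. The parameter values are illustrative, not from any shipped game.

```python
# Hypothetical sketch of procedural sound generation: a "wind" texture
# synthesised at run-time from white noise and a swept one-pole low-pass
# filter. No recorded assets are used; all values are illustrative.

import math, random

def generate_wind(duration_s=1.0, sample_rate=22050, seed=42):
    rng = random.Random(seed)
    samples = []
    y = 0.0
    n = int(duration_s * sample_rate)
    for i in range(n):
        t = i / sample_rate
        # Slowly sweep the filter coefficient so the noise "gusts".
        cutoff = 0.02 + 0.015 * math.sin(2 * math.pi * 0.5 * t)
        x = rng.uniform(-1.0, 1.0)
        y += cutoff * (x - y)  # one-pole low-pass filter
        samples.append(y)
    return samples

wind = generate_wind()
```

The appeal for games is that the same few lines can produce endless non-repeating variation, where a recorded loop would both repeat and occupy memory.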
On a similar theme to procedural sound, MIDI-controlled music is already making a comeback. Peggle recently made excellent use of this, re-appropriating older MIDI-controlled sound instruments to deliver plenty of musical variety within a small memory footprint. Expect to see more of these older systems undergo a similar resurrection as teams on mobile platforms rediscover these forgotten memory-saving techniques.
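The memory argument for sequenced music is easy to make concrete. This hypothetical back-of-the-envelope sketch (the byte counts are my own illustrative assumptions, roughly matching a 3-byte MIDI note-on and 16-bit stereo PCM at 44.1 kHz) compares storing a short phrase as note events versus as rendered audio:

```python
# Hypothetical sketch of why sequenced (MIDI-style) music saves memory:
# a score stored as note events costs a few bytes per note, while the
# equivalent rendered audio costs kilobytes per second of PCM.

# A short phrase as (start_beat, midi_note, duration_beats) tuples.
phrase = [
    (0.0, 60, 1.0),  # C4
    (1.0, 64, 1.0),  # E4
    (2.0, 67, 1.0),  # G4
    (3.0, 72, 1.0),  # C5
]

EVENT_BYTES = 3                     # status + note + velocity per event
PCM_BYTES_PER_SEC = 44100 * 2 * 2   # 44.1 kHz, 16-bit samples, stereo

sequenced = len(phrase) * EVENT_BYTES          # bytes for the event list
seconds = 4 * 0.5                              # 4 beats at 120 BPM
rendered = int(seconds * PCM_BYTES_PER_SEC)    # bytes for rendered PCM
ratio = rendered // sequenced
```

Even before counting the instrument samples themselves, the event representation is tens of thousands of times smaller, which is exactly the trade-off that made MIDI-driven scores attractive on memory-starved hardware.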
From an aesthetic/style viewpoint, the triple-A sector is still crying out for a revolution in how voice-over is written, approached and performed. Even the most carefully produced and meticulously directed efforts still somehow manage to feel stifled, flat and instructional. A game changer here would be a well-executed, deeply integrated, improvisational style of performance and writing – overlapping dialogue, hesitations, all the flaws of natural everyday speech – brought into a convincing video game scenario with transparent, real-feeling AI. This could forever change the way narrative-driven games sound. Imagine a GTA-like experience, but with the documentary feel, looseness and believability of TV’s The Wire.
Where next for Indie Audio?
As for the mobile and Indie sector, I’m sure we’ll see things continue to change very quickly in terms of experimental game styles and tools. Available horsepower will certainly increase much more quickly in this sector than in the console sector. The game creation engine Unity’s most recent update, which provides a much-needed extension of the engine’s audio tools, could also be a game changer in terms of built-in audio scripting and more deeply integrated interactive sound – a gap that has so far been ably filled by the likes of Fabric, Wwise and FMOD.
I see game and art style as the driving force behind these kinds of games, as new, novel and quirky experiences seem to continually drive discovery in this segment of game development. In terms of implementation, this is the segment where I see the most benefit from MIDI and procedural sound creation, not only for the memory savings, but especially for more stripped-back game aesthetics and timbres.
I see developers in the third space benefiting both from the incremental advances in triple-A console technology and from being able to quickly assimilate the rapid design innovation occurring in the mobile/Indie sector. In fact, the studios fortunate enough to be able to combine an Indie approach with console and mobile technology may be where we see the most innovation and the most interesting applications of game audio over the next five years.
No matter which area of game development you are involved in, the sector as a whole is thriving and growing increasingly diverse. I, for one, am excited to experience what’s next on all fronts.
Rob Bridgett is a producer/audio director at Clockwork Fox Studios in Canada, and runs the blog www.sounddesign.org.uk
Picture: Ubisoft's Far Cry 4