How many times have you sat down to watch a movie only to find that you hear echoes, that the sounds behind you are louder than those in front, or that one side is louder than the other?
When going to the movie theater, most of us head straight for the center seats, not only because they put us directly in front of the screen but also because the center is generally the best spot for sound quality. Over the decades, cinemas have spent tens of millions of dollars on equipment to give audiences the best possible sound experience.
Anyone who has tried to set up a surround sound system at home has surely struggled with sounds reverberating off furniture and walls, something that greatly diminishes the film-watching experience.
Poor spatial acoustics are not limited to surround sound systems; they affect Dolby-equipped TVs and sound bars too. Given that most of us cannot simply tear our houses apart to build a dedicated cinema, the real question is: can anything else be done to ensure we all get the best possible audio experience?
Well, it seems that a recent development in artificial intelligence might just hold the answer.

AI Spatial Acoustics

MIT scientists reported an exciting advancement in AI technology in their article “Using sound to model the world.” https://news.mit.edu/2022/sound-model-ai-1101
The article details how 3D modeling, when coupled with machine learning, can be used to accurately
simulate the exact sounds a listener would hear at any point in a room.
The potential uses for this technology are enormous, from improving office environments, and city
planning, to helping to solve crimes. It is, however, its potential application in film-watching and movie
production that is most exciting.
Researchers at MIT and the MIT-IBM Watson AI Lab developed an implicit neural representation model that can learn how any sound propagates through a room. This type of model has already been used in computer vision to let computers build accurate 3D visual representations from 2D images. Using data collected by emitting a few test sounds and measuring how they bounce back, the algorithm can then simulate what a listener would hear at any location in the room.
The system can ‘learn’ a 3D model of the room’s size and shape from these few sample sounds.
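To make the idea concrete, here is a minimal sketch of the simulation step — not MIT's actual model. Once a model has predicted a room impulse response for a listener's position, convolving a dry source signal with that response approximates what the listener would hear there. The impulse response below is a hand-made toy standing in for the model's output.

```python
import numpy as np

def simulate_listener(dry_signal, impulse_response):
    """Convolve a dry source signal with a room impulse response (RIR)
    to approximate what a listener at the RIR's position would hear."""
    return np.convolve(dry_signal, impulse_response)

# Toy RIR: direct sound, plus one reflection arriving 5 samples later
# at half volume. A real system would predict this RIR for each
# listener position using the learned acoustic model.
rir = np.zeros(6)
rir[0] = 1.0   # direct path
rir[5] = 0.5   # single echo off a wall

dry = np.array([1.0, 0.0, 0.0])  # a single click
wet = simulate_listener(dry, rir)  # the click plus its delayed echo
```

The learned model's job, in this framing, is to supply a physically accurate `rir` for any point in the room; the playback step itself is just convolution.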
Full details of how the MIT researchers developed their audio neural representation model are available in the MIT article: https://news.mit.edu/2022/sound-model-ai-1101

What Does AI Spatial Acoustics Mean for AI-Assisted Moviemaking?

I have talked about some of the limitations that sound places on our movie-watching experience. The AI breakthrough above could be built into our TVs and sound systems, automatically adjusting the levels output by each speaker to reduce unwanted echoes, reverberation, and the like.
Naturally, this would be an enormous benefit to the viewer. Such systems could even identify the listener's position and automatically readjust the sound levels should that person move elsewhere in the room.
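As a rough illustration of what listener-aware adjustment might involve — this is a guess at the mechanics, not anything described by MIT — a system that knows the listener's position could rebalance per-speaker gains so that every speaker sounds about equally loud at that spot, following the inverse-distance law:

```python
import math

def speaker_gains(listener, speakers):
    """Scale each speaker's gain so sound from every speaker arrives at
    the listener at roughly equal loudness (level falls off as 1/distance,
    so gain is scaled proportionally to distance)."""
    dists = [math.dist(listener, s) for s in speakers]
    ref = min(dists)              # nearest speaker keeps unit gain
    return [d / ref for d in dists]  # farther speakers get boosted

# Listener sits near the left speaker; the right one must play louder.
gains = speaker_gains((1.0, 1.0), [(0.0, 0.0), (4.0, 0.0)])
```

A real system would go further, shaping each speaker's output against the room's learned impulse responses rather than using distance alone.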
However, it won’t just be the audience that benefits from this AI innovation; filmmakers will too. Recording on-set sound is an art form that takes many years of training and experience to master. Even in a closed environment like a studio set, where background noise can be kept to a bare minimum, getting the boom microphone into exactly the right place can be extremely difficult. Unwanted reverberations from props in the scene, from surfaces such as concrete walls, and echoes from hollow objects all get picked up and recorded.
Once AI spatial acoustics software is developed by AI-assisted moviemaking companies, it will be able to work with remote smart devices, such as specialist microphones, to quickly and cheaply determine how sounds in the 3D environment being recorded will come across at any given position.
These insights will be extremely empowering for anyone dealing with sound on set: in an instant, they will let them plan exactly where to place the boom mic, and they will make it easier to work with the set department to plan, and even reorganize, where items and props are placed on set.
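A sketch of how such placement planning might look in code. Everything here is hypothetical: `predicted_echo_energy` is a toy stand-in for what a trained acoustic model of the set would actually predict, and the candidate positions are made up.

```python
def predicted_echo_energy(position):
    """Toy stand-in for a learned acoustic model: pretend echoes get
    worse the closer the mic is to a concrete wall at x = 0."""
    x, y = position
    return 1.0 / (0.5 + x)  # purely illustrative, not real acoustics

def best_mic_position(candidates):
    """Pick the candidate boom-mic position with the least predicted
    reverberant energy."""
    return min(candidates, key=predicted_echo_energy)

# Three candidate spots at increasing distance from the wall.
candidates = [(0.5, 1.0), (2.0, 1.0), (4.0, 1.0)]
best = best_mic_position(candidates)
```

The value of the MIT work, in this framing, is that the energy predictions would come from a model of the real set rather than from trial-and-error recording.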
Eventually, AI-assisted moviemaking companies will be able to make in-depth suggestions from just a
script regarding how best to use a set to maximize the quality of the sound being recorded on set.
And it doesn’t stop there. Post-production sound editing will arguably benefit the most from this technology. AI-assisted filmmaking platforms will be able to suggest the levels at which each sound should sit in the soundtrack to make it as realistic as possible, helping sound engineers get the mix 100% right the first time.
When experimenting with more atmospheric techniques, such as crafting a scary soundtrack, the system will eventually be able to give feedback on its effectiveness. This last tool is many years off, as it will require huge amounts of data on how different sounds affect different audiences, something we simply don't have at present.
However, all of these tools are now a real possibility thanks to this new breakthrough in artificial intelligence spatial acoustics. Quite when we will see them ready for filmmakers is unknown, but what is clear is that they will be a powerful aid to filmmakers and sound engineers around the globe.