
It could be argued that the video industry has had a “more” fixation for many years: more TV stations to choose from, more pixels, more dynamic range, more frames per second, more colours. Audio has progressed very slowly from one channel (mono) to two channels (stereo) and finally six channels (5.1), and then got stuck, other than a few experiments with additional channels to give height. Other than a couple of failed attempts to introduce 3D TV, I think there have been no significant new media experiences since the introduction of colour TV in the middle of the last century! Now, object-based media promises new opportunities rather than simply more of the same.

What are media objects? There are two main kinds:

  • Layers: the media have additional elements which can be played at the same time as the main content, if desired. Subtitles are a layer which can be turned on and off at any time and used alongside the primary content. Receiver-mix audio description is another example of a layer.
  • Segments: the content is divided into sections which can be included or skipped. The most basic example might be the opportunity to view, or not view, the recap of what happened in previous episodes when streaming a programme from a series. The BBC and others have done some interesting work on adaptive media which can be played with different durations by adding or skipping content, whilst still maintaining the narrative flow. You can then fit the programme to your commute or have a much longer version if you want in-depth coverage.
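To make the segment idea concrete, here is a minimal sketch in Python of fitting a programme to the time available. It is not the BBC's method: the `Segment` fields, the `priority` value and the programme data are all hypothetical, and I simply assume that essential segments are always kept and that optional ones are added back, most important first, while they still fit.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str
    seconds: int
    essential: bool     # assumed flag: needed for the narrative to make sense
    priority: int = 0   # assumed field: higher means "add back first"


def fit_to_duration(segments, target_seconds):
    """Return the names of the segments to play, in their original order.

    Essential segments are always kept, even if they alone exceed the target.
    """
    chosen = {s.name for s in segments if s.essential}
    used = sum(s.seconds for s in segments if s.essential)

    # Try optional segments from most to least important.
    optional = sorted(
        (s for s in segments if not s.essential),
        key=lambda s: -s.priority,
    )
    for s in optional:
        if used + s.seconds <= target_seconds:
            chosen.add(s.name)
            used += s.seconds

    # Preserve the narrative order of the original programme.
    return [s.name for s in segments if s.name in chosen]


# Hypothetical programme, durations in seconds.
programme = [
    Segment("recap", 60, False, priority=1),
    Segment("opening", 300, True),
    Segment("background interview", 240, False, priority=2),
    Segment("main story", 600, True),
    Segment("extended analysis", 420, False, priority=3),
    Segment("closing", 120, True),
]

commute_version = fit_to_duration(programme, 20 * 60)
full_version = fit_to_duration(programme, 60 * 60)
print(commute_version)
print(full_version)
```

With a 20 minute target only the recap fits alongside the essential segments; with an hour available every segment is played. A real system would also need to check that the joins between segments still make narrative sense, which this sketch ignores.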

I’m particularly interested in audio objects and first experimented with them in 2011. I was Head of Technology for Radio at the BBC at the time and teamed up with Fraunhofer IIS to offer what I believe was the first public demonstration of audio objects in a media accessibility context. I asked them if their Spatial Audio Objects could be used to address one of the most common causes of complaint in the broadcast sector: inaudible dialogue. The idea was very simple: listeners using a web browser could install a plug-in which allowed them to adjust the relative loudness of the sound from the court and the commentary whilst listening to tennis matches from Wimbledon. It was intended as a serious experiment to prove accessibility could be improved using this technology, but a couple of the tennis stars that year were very loud “baseline grunters” and the press picked up on it as the “Grunt Controller”. Whilst this could be seen as taking attention away from the accessibility issue, a double-page spread in national broadsheets and interviews on TV and radio were a good outcome! It isn’t often that a media accessibility experiment gets this kind of publicity.

Whilst the number of participants who completed the survey was small, it was very clear that about 45% of them wanted louder commentary whilst 45% wanted it quieter. This supported my hypothesis that one sound balance no longer works for everyone. We no longer consume TV exclusively in a quiet living room using a large box with two forward-facing loudspeakers. Media are consumed on mobile devices in noisy environments, on tablets whilst cooking the dinner, and on TVs with tiny speakers that face down or back. Add to this mix of technologies and environments an ageing population who find it harder to understand conversation in noisy environments in real life, and it is easy to see why broadcasters get so many complaints about dialogue inaudibility. Of course there have been a number of well-publicised occasions on which the sound balance has simply been wrong, but in general, if broadcasts were to have sufficient dialogue prominence for everybody to hear every word on any device in any setting, the sound balance would be very unsatisfactory for those fortunate enough to have a good home cinema setup.

The Wimbledon experiment led directly to a strong focus on accessibility as the key feature in the creation of object-based audio standards. Using technologies such as MPEG-H Audio it is now possible to create, distribute and consume media which can be personalised in the consumer device. Not only can the dialogue be turned up and down (with constant overall loudness of the presentation), but there are also many creative opportunities. When watching a football match, you can choose to sit with the home or away crowd, different languages can be offered in a bandwidth-efficient manner, and reversioning is easier for the content creator. It is also important that the artistic intent of the content creators is preserved, and the standards allow them to specify what personalisation is available at any given time. People living in South Korea have enjoyed object-based TV audio since 2017, with Brazil joining them in 2023 in time for live coverage of the football World Cup. Many experiments are being completed in other countries; I’ve personally been involved in object-based audio presentations of events ranging from the European Athletics Championships to the Eurovision Song Contest! Open standards for the creation of object-based media are emerging and being adopted, and the technology is sure to gain popularity. I wonder which country will be next to go live with a regular service?
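For readers who like to see the arithmetic, here is a rough Python sketch of what "turning the dialogue up with constant overall loudness" means. It is an illustration under simplifying assumptions, not how MPEG-H Audio is implemented: I use plain RMS level as a stand-in for a proper loudness measurement, the function names are my own, and the `min_db` and `max_db` limits are hypothetical values representing the range a content creator might choose to allow.

```python
import math


def rms(samples):
    """Root-mean-square level, used here as a crude stand-in for loudness."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def db_to_linear(db):
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10 ** (db / 20)


def personalise_mix(dialogue, background, requested_db, min_db=-6.0, max_db=9.0):
    """Mix two audio objects with a listener-chosen dialogue gain.

    The request is clamped to the range the creator permits (hypothetical
    limits), then the whole mix is rescaled so that its RMS level matches
    the creator's original balance.
    """
    gain_db = max(min_db, min(max_db, requested_db))
    gain = db_to_linear(gain_db)

    reference = [d + b for d, b in zip(dialogue, background)]
    adjusted = [gain * d + b for d, b in zip(dialogue, background)]

    target, actual = rms(reference), rms(adjusted)
    scale = target / actual if actual > 0 else 1.0
    return [scale * x for x in adjusted], gain_db


# Synthetic test signals: a quiet "dialogue" tone and a louder "background" tone.
RATE = 48000
dialogue = [0.1 * math.sin(2 * math.pi * 220 * n / RATE) for n in range(4800)]
background = [0.3 * math.sin(2 * math.pi * 1000 * n / RATE) for n in range(4800)]

mix, applied_db = personalise_mix(dialogue, background, requested_db=20.0)
original_level = rms([d + b for d, b in zip(dialogue, background)])
print(applied_db, round(rms(mix), 6), round(original_level, 6))
```

The listener asks for 20 dB more dialogue, the sketch applies only the permitted maximum, and the overall level of the mix stays the same as the original balance, so the dialogue becomes more prominent without the programme getting louder.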

Would you like to know more?

I am a freelance audio consultant working with Fraunhofer IIS to deliver object-based “Next Generation Audio” over broadcast and IP. I’m giving an online talk on this subject, arranged jointly by the IET Cambridge Network and the Audio Engineering Society, at 18:30 (UK time) on Tuesday 15th August. If you would like to know more about the creation, distribution and consumption of audio objects, and the technical standards which support this, please register to join me. https://events.theiet.org/events/personalised-audio-for-tv/

Image by Grand Scient from Pixabay

  • Fascinating subject! 

    I'm sure I'm not alone in getting annoyed with the TV when you have to turn up the volume to hear the dialogue and then have your eardrums (and your neighbours) blasted by the sudden introduction of background music or sound effect. Being able to adjust those audio objects to suit my own preferences would be a great addition to my overall TV viewing. 

    Just recently I've taken to having the subtitles switched on, not because I'm hard of hearing, but just to be able to understand some of the quiet and often mumbled dialogue in movies and TV programmes these days! 

    Definitely attending this event. Very much looking forward to it.
