Welcome to the music chapter of the Ultimate Guide To High End Immersive Audio. The main table of contents can be viewed here.
Defining Immersive Audio
An exact definition of immersive audio is elusive, but at least the name offers a great description of the concept unlike the 100% made up name Wi-Fi. Immersive audio won’t fit into a static definition because it adapts depending on the playback environment.
Definition: At a very basic level, immersive audio immerses the listener in music on both horizontal and vertical planes, delivering a three-dimensional experience. In short, music comes from around and above the listener.
Here are three examples based on different listening scenarios.
Speakers (Discrete) - Immersive audio involves reproducing music using both ear level and height speakers, usually placed on or in one’s ceiling. As the name suggests, listeners are immersed in music from the sides, rear, and top. Typical loudspeaker configurations that accomplish this are 5.1.2 (5 ear level speakers, 1 subwoofer, 2 height speakers) and 7.1.4 (7 ear level speakers, 1 subwoofer, 4 height speakers). Surround speakers + height speakers = discrete immersive audio.
Speakers (Virtual) - Immersive audio can also be played through two speakers, with the help of advanced digital signal processing, to reproduce the horizontal surround and vertical dimensions encoded into the music. Keep in mind that technically the word speaker is different from driver when reading this description of the MacBook Pro audio system, provided by Apple:
“Speakers: The high-fidelity six-speaker sound system consists of two pairs of dual force-canceling woofers and two tweeters. Enjoy a robust and high-quality audio experience, including Spatial Audio support for videos and songs with Dolby Atmos.”
With two speakers, one on each side of the keyboard, and each speaker containing three drivers (2 woofers and 1 tweeter), a MacBook Pro can reproduce immersive audio. Taking this a step further, an iPhone can also reproduce immersive audio using its two speakers.
A single Sonos soundbar containing multiple drivers can also virtualize immersive audio with success, as can a pair of Sonos Era 300 speakers.
Headphones - In addition to loudspeakers, immersive audio can be reproduced on headphones. The experience is vastly different from that with discrete speakers for the horizontal and vertical planes, but it can also be vastly different from a standard stereo audio mix. The immersive headphone experience isn’t supposed to mimic the discrete loudspeaker experience, rather it offers a unique immersive experience in its own way.
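For readers who like to see the arithmetic, the loudspeaker notation used in the discrete speaker example above (ear level speakers, then subwoofers, then height speakers) can be unpacked with a few lines of Python. This is only an illustrative sketch; the names `SpeakerLayout` and `parse_layout` are made up for this example and don't come from Dolby or any playback software.

```python
from typing import NamedTuple


class SpeakerLayout(NamedTuple):
    """Hypothetical helper type for the 'ear.sub.height' notation."""
    ear_level: int   # speakers at ear level (front, side, rear)
    subwoofers: int  # subwoofer channels
    height: int      # ceiling / height speakers

    @property
    def total_channels(self) -> int:
        return self.ear_level + self.subwoofers + self.height


def parse_layout(notation: str) -> SpeakerLayout:
    """Parse notation such as '7.1.4'.

    A two-part value like '5.1' is treated as having no height speakers.
    """
    parts = [int(p) for p in notation.split(".")]
    if len(parts) == 2:
        parts.append(0)
    if len(parts) != 3:
        raise ValueError(f"expected 'x.y' or 'x.y.z', got {notation!r}")
    return SpeakerLayout(*parts)


for name in ("5.1.2", "7.1.4"):
    layout = parse_layout(name)
    print(f"{name}: {layout.total_channels} channels, {layout.height} height")
# 5.1.2: 8 channels, 2 height
# 7.1.4: 12 channels, 4 height
```

The third number is what separates an immersive layout from traditional surround: a plain 5.1 or 7.1 system has no height speakers at all.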
Immersive Audio Formats
Immersive audio is delivered in a few different formats and/or codecs. Most need to be encoded prior to delivery, then decoded by the listener at the time of playback. Some require no custom encoding/decoding process. A format can contain a codec, a codec can contain another codec, etc… To make things more palatable for all but the geekiest of music lovers, the concepts of format and codec will be a little loose here.
Stereo listeners are familiar with file formats / codecs such as WAV, FLAC, ALAC, MP3, and DSF (DSD). Some of these can also contain immersive audio as described below.
By far the most successful and commercially adopted format for immersive music is Dolby Atmos. This guide will focus mainly on Dolby Atmos because of its wide adoption, ease of access, and documentation. If I had to guess the percentage of market share between the formats listed below, I’d say Dolby Atmos has 99% of the market and the others are splitting the remaining 1% unevenly.
I know some fans will not like what I’m about to say, but here it goes. If people are overwhelmed by all the information and wondering where to start, I believe they should focus on Dolby Atmos and forget the others even exist (for the foreseeable future). Competition among formats is good and drives innovation, but it can also cause confusion and consumer paralysis.
Immersive audio formats for music include:
- Dolby Atmos
- Auro-3D
- Sony 360 Reality Audio
- MPEG-H 3D Audio (ISO/IEC 23008-3)
- DTS:X
- THX Spatial Audio
- Discrete Immersive
Dolby Atmos music won the immersive format war before it began. It’s the format selected for delivery through major streaming services and physical Blu-ray Discs. Atmos enables mixing engineers to place audio anywhere in a three dimensional space, by using a combination of discrete channels and movable objects. The name Atmos is often used as a generalized term or format, which seems much easier to digest for most consumers. Taking this a step further is Apple, which uses the term Spatial Audio or Spatial Audio with Dolby Atmos.
Atmos is an adaptable format in that a single file can play as a stereo, binaural, 5.1, or 7.1.4 mix, all the way up through a 9.1.6 sixteen-channel mix, with several more in between. The version that’s played depends on the system rendering / playing the music. The same file that’s played through Apple’s EarPods in immersive stereo can also be played through a Mac to a 7.1.4 twelve-channel system.
Those technically inclined will be interested in the following additional details about what’s often just called Atmos.
Dolby Atmos “Flavors”
Atmos music is delivered by streaming services in two flavors, Dolby Digital Plus Joint Object Coding (DD+ JOC) and Dolby AC4-IMS. Technically, both of those formats exist without Atmos, as Atmos is just an extension of the two. Both are lossy as opposed to lossless. DD+ JOC is used for speaker playback, while AC4-IMS is used for headphone playback because it has optimized binaural metadata, although binaural playback isn’t mandatory. Note that not all streaming services deliver the same flavor, as discussed further down in this chapter. The container for much of the lossy Atmos music content is a 24 bit / 48 kHz MP4 file.
Atmos music is delivered losslessly as TrueHD with Dolby Atmos extensions. Often just called TrueHD Atmos or Atmos TrueHD. TrueHD exists on its own without Atmos as 5.1 or 7.1 surround. With Atmos, TrueHD can extend up through 16 discrete channels (9.1.6). In the studio, Atmos music is created with lossless ADM BWF (broadcast WAV) files in either 24 bit / 96 kHz or 24 bit / 48 kHz resolution. From that WAV file a 24/48 deliverable is always created for consumer playback, whether lossy or lossless. The lossless codec used for TrueHD Atmos is Meridian Lossless Packing (MLP).
The sonic quality differences between TrueHD Atmos and DD+ JOC or AC4-IMS can vary from “not too much” to “night and day.” As an audiophile I much prefer TrueHD Atmos and listen to it whenever it’s available.
Auro-3D is the only other format worth mentioning or spending time on. Immersive music from some labels is encoded with Auro-3D and delivered to consumers in 24/96 FLAC files. Given that FLAC has a hard limit of eight channels, the remaining channels are stored as metadata by the encoder, then expanded into discrete channels by the consumer decoder in a processor / receiver. The 2L, Spirit of Turtle, and TRPTK labels offer music encoded as Auro-3D.
Some audiophiles prefer the sound of Auro-3D files over the same music encoded with TrueHD Dolby Atmos. Many labels prefer Auro-3D because it’s much easier to work with than Atmos and Auro offers a very simple batch encoder to create the deliverables.
Sony 360 Reality Audio, MPEG-H 3D Audio (ISO/IEC 23008-3), DTS:X, and THX Spatial Audio all belong together in a group of technologies worth being aware of, but not overly concerned about. Sony’s 360RA uses MPEG-H and is a really good format, but its market penetration is nearly nonexistent. I know of a single label releasing DTS:X music (2L). THX Spatial Audio is an also-ran with no documentation, only a handful of tracks available, and no professional studios using it. These technologies may be great, or even better than Dolby’s offerings, but without acceptance by the professionals creating the music, the labels paying for the music, and consumers playing the music, they are dead in the water.
The last category of immersive audio is by far the best for sonic quality, but remains pretty rare. Discrete immersive audio is not encoded with a proprietary or even open source encoder that requires decoding by the listener. It’s often released as 10-12 channel WAV files in resolutions of 24/352.8 (DXD) and 24/384. In my experience these files sound better than their encoded/decoded relatives because they don’t need to go through the encode/decode process and are often the master files from which the others are created.
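To get a feel for how much data these discrete files carry, here is a rough back-of-the-envelope calculation in Python. It only multiplies channels by bit depth by sample rate for raw, uncompressed PCM, and it ignores file headers and any container overhead. The function name `pcm_data_rate_mbps` is invented for this sketch, and the 12 channel 24/48 figure is included purely as a point of comparison.

```python
def pcm_data_rate_mbps(channels: int, bit_depth: int, sample_rate_hz: float) -> float:
    """Raw (uncompressed) PCM data rate in megabits per second."""
    return channels * bit_depth * sample_rate_hz / 1_000_000


# 12 channels of DXD (24 bit / 352.8 kHz)
dxd_12ch = pcm_data_rate_mbps(12, 24, 352_800)

# 12 channels at 24 bit / 48 kHz, for comparison
pcm48_12ch = pcm_data_rate_mbps(12, 24, 48_000)

print(f"12 ch DXD:   {dxd_12ch:.1f} Mbit/s")    # 12 ch DXD:   101.6 Mbit/s
print(f"12 ch 24/48: {pcm48_12ch:.1f} Mbit/s")  # 12 ch 24/48: 13.8 Mbit/s
```

By this estimate, a 12 channel DXD file moves roughly seven times as much data per second as the same channel count at 24/48.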
Discrete immersive albums are more popular than THX Spatial Audio, but less so than the other formats. This is partially due to the playback system requirements, discussed further down in this chapter.
Who, Why, Where, Is It Any Good?
Who is involved in immersive audio’s entry into the music market? Everyone. Apple, Dolby, Universal Music, Sony Music, Warner Music, Tidal, Amazon, artists, engineers, and many more. Sure, there are some that are indifferent or even hate it, but the biggest companies involved with music creation and reproduction support immersive audio. This is a good thing for audiophiles because we are used to inventing a crazy format and trying to push that boulder up a giant hill that contains skeptical kings at the top pushing us back down.
In addition to who is “involved” in immersive audio, who is releasing immersive audio? The short answer is “everyone.” Every major label, many smaller labels, and even some independent artists are releasing immersive mixes. This isn’t niche. From the biggest bands in the world to the smallest one man shows, there’s an immersive mix. If one isn’t available right now, it will likely be available soon. A couple artists have said “no way” to immersive audio, but the vast majority of music will be released for immersive playback, whether it was created before stereo was invented or released today.
Why is immersive audio being created and delivered to consumers? There isn’t any single reason for immersive audio. Several reasons exist on the continuum of “Why” from altruistic to sinister. A helpful place to start one’s research is the New York Times’ archives, around the late 1950s. Immersive audio in the 1950s? No, but that’s when the world transitioned from mono to stereo, and the exact same discussions that are happening today happened then. Change can be hard. A look at history can help put it into perspective.
Also in the “why” discussion, among many other things, is why should consumers be interested in immersive audio? I can only answer this based on my own experience and the experience of others I know. The vast majority of people who’ve heard a well set up immersive system enjoy the experience very much. Some, like me, can’t get enough of it. Others could take it or leave it. That’s OK too because nothing is for everyone.
Immersive audio provides a chance to hear music like never before. Reproducing the Berlin Philharmonic with 12 speakers versus 2 speakers provides a night and day different experience. The immersive system can place the listener right inside the Berliner Philharmonie like nothing else on the planet, outside of being there in person. Other music, such as Dark Side of the Moon, was meant to be an immersive experience from the beginning. Now, mixing engineers, James Guthrie specifically in this case, have the tools to take that experience much further than the initial quadraphonic mix from 1973.
There are no rules when it comes to immersive audio, just as there are no rules for mixing stereo audio. As with stereo, some immersive mixes are terrible while others are breathtaking. It often boils down to artistic decisions made by producers, artists, mixing engineers, and others involved with the creation of our favorite music. Neither stereo nor immersive audio should be judged by the best or worst outliers.
Based on my experience, I believe immersive audio is the biggest, most important, most impactful, and best thing to happen in music reproduction in my life. Born in 1975, I didn’t live through many of the previous formats. Cassettes were my entry into the wonderful world of music until CDs came along, although I did bring my brother’s The Wall vinyl LP to play for my 2nd grade class, just to hear “Hey, teacher, leave them kids alone” and enjoy the aftermath with classmates. I never got involved with previous multichannel offerings because I didn’t think they’d last and there wasn’t much content in the grand scheme of things.
Why is this time for real and why do I think it’s here to stay? Looking at where one can obtain immersive audio is key to understanding its future.
Where Is Immersive Audio Available?
The short answer is, “almost everywhere.” Physical media, purchased downloads, and streaming services all offer immersive audio. All of the major labels and many smaller labels offer immersive audio on Blu-ray Discs. This is the only physical format for immersive audio, including TrueHD Atmos, Auro-3D, and others. These discs can be played in a Blu-ray player of course, but also ripped and played on a Mac, Windows PC, or music server. In addition to physical releases, it’s also possible to purchase immersive downloads. This cuts out the middle step of ripping and enables one to get right to listening. Labels such as 2L, TRPTK, Spirit of Turtle, and some artists on Bandcamp are offering immersive downloads. I’ve even purchased and downloaded an Atmos master ADM file from an artist on Bandcamp.
Currently, the only source of lossless immersive audio (TrueHD Atmos, Auro-3D, etc…) is via physical disc or purchased download. Lossless immersive streaming has been demonstrated publicly and is in the works.
Lossless Immersive purchase / download links:
The vast majority of immersive audio, just like stereo audio, is available from streaming services Apple Music (iOS, tvOS, macOS, Sonos), Tidal (iOS, tvOS, Android, FireTV), and Amazon Music (iOS, Android, FireStick, Sonos)*. This content is lossy, as discussed above. However, for 99% of the content, the lossy version is the highest quality version ever released to the public. I’d say we shouldn’t complain about a nonexistent product (lossless album ABC, XYZ, etc…) because it doesn’t exist outside the studio, but I’m sure there’s plenty of room for complaints.
* The list of devices on which each streaming service can play Atmos is ever changing.
The three streaming services also sound different. This is because they use different technologies when streaming and rendering the only immersive format that’s streamed: Dolby Atmos.
Apple Music has the biggest and best catalog of Atmos music. It also has the best quality control. Other services have had issues with very soft volume levels followed by blaring volume. I’ve experienced it and won’t take the chance again. Apple Music streams Dolby Digital Plus Joint Object Coding (DD+ JOC) to all iOS, tvOS, and macOS devices. All of these Apple devices have a built-in DD+ JOC decoder that comes with the operating system. Apple uses its own Atmos renderer, however, unlike the other services. The renderer interprets an Atmos mix and presents it to the playback application for output to your ears. Apple has to have its own special sauce, partially because it wants to use head tracking with its headphones.
As of right now, it’s believed that the Apple renderer is only used for headphone output or Mac speakers, not Atmos output on macOS to an audio interface and speakers.
Tidal and Amazon on the other hand are using the Dolby native tools. They stream Dolby Digital Plus Joint Object Coding (DD+ JOC) when playback is happening on speakers or output to a processor over HDMI (AppleTV). When playing on headphones, these services stream Dolby AC4-IMS because it’s optimized for headphone stereo or binaural playback. The Dolby renderer can interpret all the binaural metadata in AC4-IMS, whereas Apple’s renderer throws it away.
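For the technically inclined, the streaming behavior described above can be summarized as a small lookup. This is just a sketch of what this chapter says at the time of writing, not anything published by Apple, Tidal, Amazon, or Dolby. The `Output` enum and the `streamed_flavor` function are hypothetical names, and the services may change what they deliver at any time.

```python
from enum import Enum


class Output(Enum):
    """Hypothetical playback targets used only for this illustration."""
    SPEAKERS = "speakers"
    HDMI = "hdmi"
    HEADPHONES = "headphones"


def streamed_flavor(service: str, output: Output) -> str:
    """Return the lossy Atmos flavor streamed, as described in this chapter."""
    name = service.lower()
    if name == "apple music":
        # Apple streams DD+ JOC to its devices regardless of output.
        return "DD+ JOC"
    if name in ("tidal", "amazon music"):
        # AC4-IMS for headphones, DD+ JOC for speakers or HDMI output.
        return "AC4-IMS" if output is Output.HEADPHONES else "DD+ JOC"
    raise ValueError(f"unknown service: {service!r}")


print(streamed_flavor("Tidal", Output.HEADPHONES))        # AC4-IMS
print(streamed_flavor("Tidal", Output.HDMI))              # DD+ JOC
print(streamed_flavor("Apple Music", Output.HEADPHONES))  # DD+ JOC
```

The headphone row is where the services differ, which is one reason the same album can sound different from one service to the next.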
What about the holy grail of discrete immersive audio? This is available from 2L and TRPTK as downloads for purchase. They aren’t cheap, but I believe they are worth every penny. The one issue holding some people back from purchasing them is that many immersive audio systems can’t play resolutions higher than 24/96 or 24/192. These are systems that use traditional processors or receivers, which have limited processing power. Computer based systems have nearly unlimited resources and can play 12 channels of DXD without an issue.
This leads into how the aforementioned immersive audio can be played.
How To Play Immersive Audio
Other chapters of this Ultimate Guide To High End Immersive Audio will cover each of these topics in depth, showing real world solutions for many price levels and how to get it working. This is just a brief overview of how to play immersive audio.
Because Atmos is adaptive from 2 to 16 channels in a single file, it can be played on headphones, soundbars, receivers, processors, pro interfaces, and HiFi systems with computers / music servers. There isn’t really a limit. I guess mono and quad systems are left out of the mix. Playing immersive audio may seem daunting at first, but it isn’t rocket science. We’ve figured it out and are happy to explain how to do it, in great detail.
In addition to the “How,” the dedicated chapters will explain who each method of playback is for and why one would choose a processor over a computer or vice versa. Are four ceiling height channels really needed? This guide will answer that question and show objectively what information is contained in the height channels for a number of immersive audio releases.
- Some Apple Music Atmos favorites (link)
- Immersive music favorites, part 1 (link)
- Immersive music favorites, part 2 (link)
- Immersive music favorites, part 3 (link)
- Immersive music favorites, part 4 (link)
- Immersive music favorites, part 5 (link)
- Immersive music favorites, part 6 (link)
- TRPTK Immersive gems, part 1 (link)
- TRPTK Immersive gems, part 2 (link)
- All Audiophile Style immersive audio articles can be found here (link)
NOTE: Please post comments, questions, concerns, corrections in the section below or contact us.