  • mitchco

    What is Accurate Sound?

    I see this question asked often on audio forums, and there are many varying answers depending on one’s viewpoint. Viewpoints range from being able to recreate the performance, to “it’s just entertainment, folks,” and everything in between.


    For this article, the context for what is accurate sound is the closeness to which our systems reproduce the incoming signal. To qualify the context even further, I am discussing the last component in the sound reproduction chain: loudspeakers in rooms. The intent of this article is to be educational.


    Digital Signal Processing (DSP) in audio has come a long way over the years, and today we have sophisticated DSP software tools at our disposal that allow us to model the “ideal” loudspeaker. We can use this model as a “reference” and compare it to what happens in the “real” world, i.e. compare the incoming signal to the loudspeaker with the result arriving at our ears at the listening position.


    It is possible to accurately reproduce the incoming signal to our ears at the listening position without any frequency or time domain distortion. In addition to objective measurements, the subjective listening section characterizes, in detail, what accurate sound reproduction should “sound” like. Again, this is from the viewpoint of loudspeakers in rooms. 


    I link to the subjective listening section for folks who want to skip over the technical part. I have tried to keep the technical explanations as simple as possible to convey the intent, but I also provide links to references and further research for those who are interested in more detail.



    More on what is accurate sound


    One way of describing accurate sound is that the music arriving at our ears matches the content on the recording as closely as possible. To put it into context, we can only reproduce what is on the recording.


    Another way of describing accurate sound is that there are no frequency or timing response distortions arriving at our ears at the listening position. This matters because loudspeakers in rooms distort the acoustic signal arriving at our ears in both the frequency and time domains.


    Both descriptions imply that whatever is on the recording is arriving at our ears without any frequency or timing response anomalies. We are used to measuring/hearing flat frequency responses with no phase shifts as the norm in the digital audio and electronics world. When it comes to loudspeakers in rooms, the signal arriving at our ears is far from the ideal response in both the frequency and time domains. This is the norm.



    What is an “ideal” loudspeaker?


    The ideal loudspeaker would have a frequency response that is ruler flat from 20 Hz to 20 kHz, spec’d within a small ± 1 dB tolerance. Multi-driver loudspeakers would be time aligned. Phase and group delay would be flat. Basically no frequency or time domain distortions. In engineering terms, a perfect transfer function.


    Using software DSP designer tools, we can model the ideal loudspeaker response based on our specifications. Here is a frequency response of an ideal loudspeaker:








    In the real world, loudspeakers don’t go down to 0 Hz, so we assume the ideal loudspeaker starts to roll off at 20 Hz. The filter modeled above is a minimum phase, 2nd order high pass Butterworth filter with a corner frequency of 10 Hz. If the filter went to 0 Hz, the response would be a completely flat line. Because a loudspeaker’s frequency response doesn’t go down to DC, we see a bit of a rise in the phase response at low frequencies when we switch from the frequency response view to the phase response view of the same signal:






    Loudspeakers are minimum phase systems. That means the phase response tracks the frequency response and vice versa. We will revisit this key concept later, as it matters for room acoustics and the non-minimum phase behavior that occurs in rooms at low frequencies.
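    As a rough illustration of how the phase follows the roll off, the modeled filter can be evaluated directly from its textbook transfer function. This is only a minimal sketch using the Python standard library, not the DSP designer software used for the charts; the function names are made up for the example, and the 10 Hz corner is taken from the description above.

```python
import cmath
import math

def butterworth_hp2(freq_hz, corner_hz=10.0):
    """Complex response H(jw) of an analog 2nd order Butterworth high pass filter.

    Uses H(s) = s^2 / (s^2 + sqrt(2)*wc*s + wc^2). The 10 Hz default corner
    is the value quoted in the article; everything else is illustrative.
    """
    s = 1j * 2 * math.pi * freq_hz
    wc = 2 * math.pi * corner_hz
    return s**2 / (s**2 + math.sqrt(2) * wc * s + wc**2)

def magnitude_db(h):
    """Magnitude of a complex response in dB."""
    return 20 * math.log10(abs(h))

def phase_deg(h):
    """Phase of a complex response in degrees."""
    return math.degrees(cmath.phase(h))

# Print magnitude and phase at a few spot frequencies
for f in (10, 20, 100, 1000, 20000):
    h = butterworth_hp2(f)
    print(f"{f:>6} Hz  {magnitude_db(h):7.2f} dB  {phase_deg(h):7.2f} deg")
```

    With these assumptions the filter is about 3 dB down with 90 degrees of phase lead at 10 Hz, about 0.26 dB down with roughly 43 degrees at 20 Hz, and under 1 degree by 1 kHz, which is the low frequency phase rise described above.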


    Group delay should be flat, again apart from the rise that follows the low frequency roll off:






    Finally, the step (or timing) response:






    All of these views are of the same signal, each providing a different viewpoint of the transfer function.
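    To make the link between these views concrete, group delay is the negative slope of phase versus angular frequency, and for a 2nd order Butterworth high pass filter it has a simple closed form. The sketch below is an assumption-laden illustration (standard analog prototype, 10 Hz corner as described earlier, hypothetical function name), not output from the DSP designer tool.

```python
import math

def group_delay_s(freq_hz, corner_hz=10.0):
    """Group delay in seconds of a 2nd order Butterworth high pass filter.

    Closed form of -d(phase)/d(omega) for
    H(s) = s^2 / (s^2 + sqrt(2)*wc*s + wc^2):
        tau(w) = sqrt(2) * wc * (wc^2 + w^2) / (wc^4 + w^4)
    """
    w = 2 * math.pi * freq_hz
    wc = 2 * math.pi * corner_hz
    return math.sqrt(2) * wc * (wc**2 + w**2) / (wc**4 + w**4)

# Group delay in milliseconds at a few spot frequencies
for f in (10, 20, 100, 1000):
    print(f"{f:>5} Hz  {1000 * group_delay_s(f):8.4f} ms")
```

    Under these assumptions the delay is about 22.5 ms at the 10 Hz corner, falls to roughly 0.23 ms at 100 Hz, and is only a few microseconds at 1 kHz, so the curve is effectively flat except where it follows the low frequency roll off.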


    I thought I would put a “how to read this chart” legend here, as the other charts are easier to read (i.e. mostly we want a straight line ☺). There is a concept of preringing with linear phase filters, so we want to watch for this type of distortion, even though in listening tests I have conducted, even large amounts are hard to detect audibly. The tell-tale sign is a ramp up or oscillation of the signal before the actual signal. It is most noticeable on sparse music transients, like a kick drum, where it sounds “reversed” in the extreme case. Most modern DSP correction software has preringing compensation, as it is well understood mathematically, so this is no longer an issue.


    See the vertical step itself starting at time 0 milliseconds? One can think of the vertical amplitude as the frequency scale with 20 Hz starting at the bottom and 20 kHz at the top, as this is what we specified in our ideal loudspeaker design. If I designed for flat to 30 kHz, then the vertical spike would be higher. If the drivers were not time aligned, then we would see horizontal offsets away from 0 ms of the straight vertical line, representing parts of the frequency spectrum arriving at our ears at different times, and differing between channels too. This is very important to keep in mind: our ideal loudspeaker has all direct sound frequencies arriving at the same time for both channels, i.e. at 0 ms.


    The slope of the roll off, or shape of the tail, after the initial vertical step and up to where it crosses the horizontal time axis (zero amplitude), is based on the loudspeaker’s low frequency roll off and cabinet alignment (i.e. slope of roll off). A roll off at a higher frequency would push this zero crossing towards the left, and a roll off lower than 10 Hz would push it to the right, say to 15 ms or even 20 ms, depending on loudspeaker design (e.g. subs or no subs) and size of room.
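    The relationship between the roll off frequency and where the tail crosses zero can be worked out for the simple filter model. The following is a sketch assuming a pure 2nd order Butterworth high pass response, whose unit step response is exp(-a t) (cos(a t) - sin(a t)) with a = sqrt(2) π fc; a real loudspeaker and room will differ, and the function names are hypothetical.

```python
import math

def step_response(t_s, corner_hz=10.0):
    """Unit step response of a 2nd order Butterworth high pass filter at t_s >= 0."""
    a = math.sqrt(2) * math.pi * corner_hz  # equals wc / sqrt(2)
    return math.exp(-a * t_s) * (math.cos(a * t_s) - math.sin(a * t_s))

def first_zero_crossing_ms(corner_hz=10.0):
    """Time in ms at which the tail first crosses zero amplitude.

    cos(a t) = sin(a t) first occurs at a t = pi / 4,
    which simplifies to t = sqrt(2) / (8 * fc).
    """
    return 1000.0 * math.sqrt(2) / (8.0 * corner_hz)

# Zero crossing time for a few assumed corner frequencies
for fc in (5.0, 10.0, 20.0):
    print(f"corner {fc:4.1f} Hz -> zero crossing at {first_zero_crossing_ms(fc):5.1f} ms")
```

    For this model a 10 Hz corner puts the zero crossing at about 17.7 ms, a 20 Hz corner moves it left to about 8.8 ms, and a 5 Hz corner moves it right to about 35.4 ms, matching the direction of the shifts described above.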


    Other than the low frequency roll-off, these charts might as well be measurements of a DAC, pre-amplifier, or amplifier. Right? No frequency or timing distortions. Accurate sound, at least relative to frequency and timing response.


    Consider this a representative baseline example of how an ideal loudspeaker would measure. I am simplifying the details to hide some of the complexity. For example, research shows that a smooth frequency response both on and off axis is important for a “good sounding” loudspeaker in a room. I agree, and if one looks through some of my articles here on AudiophileStyle, those research links are there. That research has culminated in a “Standard Method of Measurement for In-Home Loudspeakers (ANSI/CTA-2034-A R-2020).”


    Note, the standard is a free download. If you’re so inclined, it is a very interesting read on the state of the art of measuring loudspeakers that correlates to scientific research on what makes for a good sounding loudspeaker in a “typical” listening room, i.e. the estimated or predicted in-room frequency response is one of the report outputs from the standard. See Figure 11 on page 37:





    We can see how accurate and precise the predicted in-room response is based on anechoic measurement data compared to the actual in-room measurement of the loudspeaker. The frequency response is virtually identical which validates the anechoic measurement methodology and processing algorithms used to estimate the in-room frequency response. Of course, there are low frequency room effects, but that is one of the points of this article and how to mitigate them to restore the response to ideal.


    This is a huge improvement for “loudspeaker measurements” for consumers as the report can accurately and precisely estimate how the tonal response of the loudspeaker will sound in a typical listening room. As we will see further into the article, flat in-room response is not the target, but we do know what a neutral in-room response measures like. Point being, if shopping for new loudspeakers, try to find a set that a) was measured using this standard methodology and b) offers a predicted in-room response report.


    Of course, even under anechoic conditions, most loudspeakers don’t measure ruler flat. Many have crossover issues, directivity issues, driver time misalignment, cabinet diffractions, cabinet resonances, difficult impedance loads, and on it goes. Then we place the loudspeaker in a room. This further distorts the signal, most significantly in the room’s low frequency modal region. Whether the room is overly reflective or overly damped also has an impact on accurate sound quality. In other words, both the frequency and timing responses are further impacted by placing loudspeakers in rooms.


    Now that we have an ideal loudspeaker, and with many more caveats than I described in the last paragraph, let’s have a look at what happens in the real world. 



    Loudspeakers in rooms in the real world




    Thanks to John ( @Olesno ) Jonczyk for volunteering his system to show the effects of loudspeakers in rooms. John’s system consists of:


    Tekton Ulfberht speakers 

    Don Sachs tube amp and preamp 

    Lampizator GA TRP tube DAC 

    Oppo UDP-205 player 

    Roon Nucleus+ 

    Lumin U1 Mini (upgraded to U1) 

    Puritan Audio Labs PSM156 power purifier.


    Let’s look at the in-room frequency response measured at the listening position:





    As we can see, John’s loudspeakers have excellent in-room frequency response down to 16 Hz and high frequency extension to 20 kHz with natural high frequency roll off beyond 10 kHz due to air absorption.


    I included both channels to not only show there is variation in each channel, but also between channels. The latter is very important for proper stereo decoding so that both channels are as close to identical as possible, both in the frequency and time domains. A solid phantom center image depends on this level of accuracy and precision, as does the placement of instruments and/or vocals in the stereo image, which also includes depth of sound field. Note: John’s room measurements are “typical” for any given room. I.e. uneven frequency response, even between channels. We all have this issue to one degree or another. I will explain why a bit later.


    When it comes to loudspeakers in rooms, if the room’s broadband decay time is within a 300 ms to 600 ms range, and smooth across the frequency band, our area of interest narrows to the low frequencies. This is because at a certain frequency in a room, as related to its dimensions and especially room ratio, the room transitions from ray acoustics to wave acoustics, into what is called the modal region. In John’s room, that is about 200 Hz and below. I have marked up the chart a bit to show there is some 15 to 20 dB of peak to peak amplitude variation in the low frequency response:





    Here we are looking at the frequency range from 10 Hz to 200 Hz where the room has its way with the low frequency response. Note the largest peak to peak amplitude variation, which is over 20 dB as shown in the chart. To our ears, we perceive that 20 dB difference as being 4 times as loud or quiet depending on which end of the variation the bass note lands. In addition, there are level variations between the two channels, which disrupt the stereo image and phantom center. Note: John’s low frequency room response is “typical” of virtually all of our listening rooms, and I will explain why a bit later.
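    The “4 times as loud” figure follows from the common rule of thumb that each 10 dB step is perceived as roughly a doubling of loudness. Here is a small sketch of that arithmetic; the rule itself is an approximation (loudness perception at low frequencies is more complicated), and the function names are illustrative.

```python
def pressure_ratio(delta_db):
    """Sound pressure ratio corresponding to a level difference in dB."""
    return 10 ** (delta_db / 20)

def loudness_ratio(delta_db):
    """Approximate perceived loudness ratio.

    Assumes the rule of thumb that +10 dB sounds about twice as loud.
    """
    return 2 ** (delta_db / 10)

# Compare the 15 dB and 20 dB swings mentioned in the text
for delta in (10, 15, 20):
    print(f"{delta} dB: pressure x{pressure_ratio(delta):.1f}, "
          f"perceived loudness about x{loudness_ratio(delta):.1f}")
```

    So a 20 dB swing is a tenfold change in sound pressure that is heard as roughly four times louder or quieter.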


    Let’s take a short detour to better understand what is going on here and why this is important. Unless one has a proper acoustically designed room, or lucked out with a preferred room ratio, the odds are that the vast majority of our rooms will have this amount of low frequency variation, i.e. 15 to 20 dB or more. 


    Before I explain, perhaps follow along with a little subjective listening exercise to tune one’s ears into the issue. Find some music that has a variety of bass notes. The more variety, the more low frequencies we are testing, just by listening to music.


    Here I have chosen the song “Spanish Harlem” from Rebecca Pidgeon’s The Raven, which has a very nice acoustic bass in the key of G that uses the classic 1, 4, 5 progression. In addition to the excellent recording by Bob Katz and Rebecca’s heavenly voice, here are the bass fundamental frequencies that go with the progression:


    49 62 73      65 82 98      73 93 110
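    One reading of the three groups above is the root, third, and fifth of the G, C, and D chords (G1 B1 D2, C2 E2 G2, D2 F#2 A2). That note mapping is an assumption rather than something stated in the list, but it can be checked against equal temperament tuning with A4 = 440 Hz, as in this short sketch.

```python
A4_HZ = 440.0   # assumed concert pitch
A4_MIDI = 69    # MIDI note number of A4

def note_hz(midi_note):
    """Equal temperament frequency in Hz for a MIDI note number."""
    return A4_HZ * 2 ** ((midi_note - A4_MIDI) / 12)

# Assumed note mapping for the 1, 4, 5 chords in G (MIDI note numbers)
CHORDS = {
    "G (1)": [31, 35, 38],  # G1, B1, D2
    "C (4)": [36, 40, 43],  # C2, E2, G2
    "D (5)": [38, 42, 45],  # D2, F#2, A2
}

for name, notes in CHORDS.items():
    print(name, " ".join(f"{note_hz(n):6.1f}" for n in notes))
```

    Under this mapping every computed fundamental lands within 1 Hz of the listed values, and all nine sit in the 49 Hz to 110 Hz range where room modes dominate.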


    What exactly are you listening for? Turn up the volume to your preferred listening level. If you have a sound pressure level (SPL) meter or one on your phone, turn up the volume until the average level is around 77 to 83 dB SPL C weighting at the listening position.


    Get comfortable, close your eyes and focus in on the bass line and the bass notes being played. For each bass note played, do they all sound the same level in your room? Are some bass notes lower in level? Some higher in level? Is there one bass note that stands out above all others?


    It is not an easy listening exercise because we are so used to listening to uneven bass response that we may never have heard equal loudness bass notes to compare to. To understand more about why we hear what we hear in small room acoustics, I refer folks to James (JJ) Johnston’s excellent presentation on “Acoustic and Psychoacoustic Issues in Room Correction.” The first 31 slides are worth the read.


    So it may take a bit of time to “tune” into the bass line in the mix and focus on its level variation. This can be further complicated by the rest of the instruments and vocals playing at the same time. 


    This is why finding a song that has significant bass note variation makes it easier to identify which notes are louder and which ones are softer. In some cases, it helps if the music is sparse, like in “Spanish Harlem.” Other times, it helps if the bass notes are loud and sustained like in Madonna’s “The Power of Goodbye.”  Once you tune in, it becomes easier to hear. 



    Why do we have uneven bass response in our listening rooms?


    The answer is primarily due to the physical dimensions of a listening room and its room ratio. Room construction and acoustical treatments play a role, but at these long wavelengths, it is more about the room ratio. This article on “Room dimensions on small listening rooms” gives us insight as to why room ratios affect the quality of the bass sound in one’s room.


    We can also enlist one of the many online Room Mode Calculators like this one: AMROC Room Mode Calculator to examine our existing listening rooms. Type your room dimensions into the calculator and read the various panels about your room modes.
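    For a sense of what such a calculator does, the mode frequencies of an idealized rectangular room with rigid walls follow a standard formula. The sketch below is not the AMROC implementation; it assumes a speed of sound of 343 m/s, and the 6 m x 4 m x 2.5 m room is purely hypothetical.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed, air at about 20 degrees C

def mode_hz(p, q, r, length_m, width_m, height_m, c=SPEED_OF_SOUND_M_S):
    """Frequency of room mode (p, q, r) in an ideal rigid rectangular room."""
    return (c / 2) * math.sqrt(
        (p / length_m) ** 2 + (q / width_m) ** 2 + (r / height_m) ** 2
    )

def axial_modes(length_m, width_m, height_m, max_hz=200.0):
    """Sorted list of (frequency_hz, axis, order) for axial modes up to max_hz."""
    modes = []
    for axis, dim in (("length", length_m), ("width", width_m), ("height", height_m)):
        order = 1
        while SPEED_OF_SOUND_M_S * order / (2 * dim) <= max_hz:
            modes.append((SPEED_OF_SOUND_M_S * order / (2 * dim), axis, order))
            order += 1
    return sorted(modes)

# Hypothetical 6 m x 4 m x 2.5 m room
for freq, axis, order in axial_modes(6.0, 4.0, 2.5):
    print(f"{freq:6.1f} Hz  {axis} mode, order {order}")
```

    In this made-up room the lowest mode is near 28.6 Hz, and the third length mode and second width mode both land at 85.75 Hz, an example of modes bunching up at one frequency.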


    It is highly educational if your browser is hooked up to your sound system so when you hover the mouse over a graphical room mode, you can hear what it sounds like in your room (careful to keep the volume down). It is an ear opening experience. Give it a try as there is nothing like hearing the problem with your own ears. Try walking around the room while hovering over a mode. There may be locations where it is really low in level and other locations where it sounds like blowing on a Coke bottle, but at a much lower frequency. Use the Room 3D view to show you where the modes are located in your room.


    The unfortunate reality is that few of us have properly designed listening rooms with appropriate room ratios to evenly distribute the low frequency room modes, aside from not having enough of them to begin with. So, we end up with rooms that have the wrong modal density with virtually no modes down low and with others bunched up together. Sometimes this occurs at the most inappropriate frequency, like the usually recommended subwoofer crossover frequency of 80 Hz. Pro tip: cross subs between room modes to your mains.


    Further, below a room’s transition frequency, also called the Schroeder frequency, room modes (standing waves, or room resonances) dominate the sound so much that the room is in control of the low frequency response, not our loudspeakers. Yes, you read that correctly. Below the transition frequency, your loudspeakers are no longer in control of the low frequency response; the room is.
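    A widely used estimate of the Schroeder frequency is 2000 times the square root of the reverberation time (in seconds) divided by the room volume (in cubic metres). The sketch below applies that textbook approximation to a hypothetical room; it is a rough guide only, since the transition is gradual and the figures quoted in this article come from measurements rather than from this formula.

```python
import math

def schroeder_frequency_hz(rt60_s, volume_m3):
    """Approximate Schroeder (transition) frequency of a room.

    Textbook estimate: f ~ 2000 * sqrt(RT60 / V), RT60 in seconds, V in m^3.
    """
    return 2000.0 * math.sqrt(rt60_s / volume_m3)

# Hypothetical 6 m x 4 m x 2.5 m room (60 m^3) with a 0.4 s decay time
print(f"{schroeder_frequency_hz(0.4, 60.0):.0f} Hz")
```

    For these assumed values the estimate comes out at roughly 163 Hz; smaller or more reverberant rooms push the transition higher.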


    Here is a typical size listening room where a measurement mic has been placed at the listening position and the loudspeaker has been moved to three different locations within a two foot radius:






    As one can see, below the room’s transition frequency of about 300 Hz in this example, the bass response varies significantly, not only by location, but also in each location! Above 300 Hz the loudspeaker is in control of the frequency response that we perceive. Put another way, the room has substantially less influence on what we hear above 300 Hz. With careful loudspeaker and listener placement one can get lucky and be in-between the worst of the peaks and dips. But more often than not, placement simply shifts the frequencies and timing of the room modes; they are still there.


    The chart above is from Floyd Toole’s excellent article, “Audio - the science in service of the Art.” As Floyd says, “In the investigation of many rooms over the years, I would estimate that something like 80% have serious bass coloration.” Further, Floyd’s research shows that bass subjectively accounts for 30% of how we judge a speaker’s sound quality. And “ANY loudspeaker can sound better after room EQ, so long as it competently addresses the bass frequencies - this is not a guarantee, but really is not difficult for at least the prime listener.” I am in total agreement.


    Getting back to John’s speakers… Let’s look at the phase response:





    The phase behaves well above 2.5 kHz, but we can see the phase “wrap” at just over 400 Hz. If we “unwrapped” it, we would see a negative, or downward, phase response with more anomalies below 100 Hz.


    What about group delay?





    Above 300 Hz, there are no issues. Below 100 Hz we see some peaks and dips. For the very narrow dips, our ears/brain don’t really notice anything missing; the bandwidth is too narrow for our ears to pick up. Remember from JJ’s presentation that it is the peaks we can hear (as delayed bass in the group delay view), and the gap in the left channel at 30 Hz is approaching our ability to notice.


    What about the step (timing) response?






    We can see in the vertical step at time 100ms (think of that as the 0 ms marker from our ideal loudspeaker example) that there are two amplitude spikes, not one. We will get to that detail in the next chart which shows a zoomed in version on the time scale to show the time misalignment. 


    What I want to focus on is the roll off or “tail” of the low frequency response over time, here over 100 ms as shown in the chart. We can see that there is quite a difference between channels, in addition to neither following the ideal step response shown earlier. This is because of the multitude of room reflections at numerous angles, which also alter the timing response at low frequencies. Put another way, portions of the low frequency response are no longer minimum phase. This is why applying just “frequency correction” using certain room EQ products or parametric EQs (PEQ) doesn’t solve this problem. But that discussion is for another article on “how” room correction works.


    We can see for the right speaker a reflection that is almost the same amplitude as the tweeter, but around 135ms later. Sound travels roughly 1ft per millisecond. So the bass response has built up to a peak 135ms later. This is why the bass response in most rooms sounds muddy or boomy or not distinct - just some of the subjective words tied to a bass response that is (literally) all over the place in the room.
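    The “1 ft per millisecond” rule of thumb is easy to check, and it gives a feel for how much travel a given delay represents. A minimal sketch, assuming a speed of sound of 343 m/s (the function name is illustrative):

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed, air at about 20 degrees C
FEET_PER_METRE = 3.28084

def distance_travelled_ft(delay_ms, c=SPEED_OF_SOUND_M_S):
    """Distance in feet that sound travels during delay_ms milliseconds."""
    return c * (delay_ms / 1000.0) * FEET_PER_METRE

for delay in (1.0, 10.0, 135.0):
    print(f"{delay:6.1f} ms -> {distance_travelled_ft(delay):6.1f} ft")
```

    With this assumption sound covers about 1.13 ft per millisecond, so 135 ms corresponds to roughly 150 ft of travel, which is many crossings of a typical listening room.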


    Let’s zoom in on the time alignment of the drivers:





    As mentioned above, we see two vertical spikes that are offset in time. First to arrive at our ears are the tweeters and then the woofers.


    So the step response has shown us two issues, one being driver time misalignment so all direct sound is not arriving at our ears at the same time. The other being low frequency room variations, not only for each channel, but between channels as well.


    Is any of this timing distortion audible? To my ears it is. Here is an experiment where I set up my system with virtually identical frequency responses and only changed the timing response. To my ears, it increases the sound stage depth to be in line with an improved stereo image, in addition to the bass sounding even, solid, transient and crystal clear.



    Can we restore the ideal sound with no frequency or timing response distortion?


    Yes, we can, using specialized loudspeaker and room correction DSP software designed to solve these problems. I have written numerous articles about it, including a book, but in this article, we are only interested in “what” it can do and not “how” it does it. The latter is for another article as this type of highly specialized DSP is mostly misunderstood. Further, very few DSP products provide the needed time domain correction capability. Finally, the “effectiveness” of so-called Digital Room Correction (DRC) products varies wildly. The top two or three DSP software products in this category outpace the others by a wide margin, based on my experience evaluating just about all of them over a ten year period.


    So let’s jump right to the results of applying SOTA DSP loudspeaker and room correction to John’s already excellent loudspeakers. Remember what the DSP is accomplishing is restoring the ideal loudspeaker response arriving at our ears with no frequency or timing distortion.











    The grey line is the “target” response that was “designed” in the DSP filter designer software. As we can see, John’s speakers track almost perfectly within a ±3 dB (studio control room) tolerance from 16 Hz to 20 kHz with the top octave left alone to roll off naturally due to air absorption. Not only is each channel smooth, both channels are virtually identical. Both are equally important attributes to what constitutes accurate sound.
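    Statements like “tracks within ±3 dB” or “20 dB peak to peak” can be checked numerically once the measured and target curves are exported as levels at matching frequencies. The sketch below shows one way to do that; the function names are illustrative and the sample levels are invented for the example, not taken from John’s measurements.

```python
def max_deviation_db(measured_db, target_db):
    """Largest absolute difference between measured and target levels (dB)."""
    return max(abs(m - t) for m, t in zip(measured_db, target_db))

def within_tolerance(measured_db, target_db, tol_db=3.0):
    """True if the measurement stays within +/- tol_db of the target everywhere."""
    return max_deviation_db(measured_db, target_db) <= tol_db

def peak_to_peak_db(levels_db):
    """Difference between the highest and lowest level (dB)."""
    return max(levels_db) - min(levels_db)

# Invented example levels at 20, 40, 80 and 160 Hz
target = [0.0, 0.0, 0.0, 0.0]
uncorrected = [-18.0, 2.0, -6.5, 1.0]
corrected = [-1.5, 0.8, -0.6, 0.4]

print(peak_to_peak_db(uncorrected), within_tolerance(uncorrected, target))
print(peak_to_peak_db(corrected), within_tolerance(corrected, target))
```

    The same comparison between left and right channel curves gives a simple measure of how closely the two channels match.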


    Note that the tilted frequency response is based on years of scientific research from Floyd Toole and Sean Olive on which measured in-room frequency response correlates with what sounds good to one’s ears in a typical living room environment:






    There are a couple of interesting points to note. Look at the un-equalized loudspeaker frequency response in the chart. Again, typical of in-room frequency response due to room effects. If you dig into Sean’s presentation on slide 24, it is interesting to note that a measured, tilted in-room frequency response is perceived by our ears/brain as a “flat” or neutral frequency response:





    See the most preferred tilt at the top (in red on chart background) is actually perceived by our ears as flat or neutral (the bright red overlay). If we “eq’d” the loudspeakers to flat at the listening position, it would be perceived by our ears as too bright sounding, with not enough bass. This is not the preferred target.


    What about the phase response?













    Here again the target is in grey, and John’s loudspeakers do a great job of tracking the minimum phase target response, with both the natural rise in the low frequencies and the roll off at the very top. Virtually ideal, and in the real world, this is as good as it gets, folks!


    Same goes for the group delay:












    As described earlier, our ears do not perceive narrow dips in frequency response, and so any narrow dips in group delay we do not hear either. From JJ’s presentation, our ears follow the “envelope” of the curve and are more sensitive to peaks than dips. The point here is that the restored group delay response is consistently flatter across the low frequency range (i.e. no low frequency delay).


    Step response:












    That’s a remarkable difference, with the restored step response following the target (black line) over time, perfectly time aligned and looking like the “ideal” loudspeaker’s timing response. Talk about “deblurring!”


    As one can see, not only does all the direct sound arrive at one’s ears at the same time, but the low frequency reflections in the room are also aligned towards the ideal minimum phase response. And finally, both speakers are in perfect sync with each other over a long time period. Hearing the bass transient response on this system would be incredibly impressive, and the bass response will remain perfectly centered even as the room decays. How would I know? My system measures virtually the same as John’s…


    Let’s have a look again at John’s room to help put this level of performance into perspective:




    Note in the picture above that while the speakers are placed symmetrically relative to the listening position, John’s setup and room as a whole are not symmetrical. Yet, we are able to restore virtually the ideal response at the listening position in both the frequency and time domains.


    Side note: Much has been said and written along the lines of “microphones are not ears,” “only at one measurement location, move the mic 6 inches and it is totally different,” “the simulations are different than the measurements,” etc. Folks can read how SOTA DSP software achieves this not “just” at one listening position, but over a large listening area. I cover this in my articles here on AudiophileStyle, and I also wrote a chapter in my book validating that the DSP simulations produced are within 0.25 dB of the actual measurements. This has been consistent across hundreds of simulations and measurement verifications. I go into detail taking 14 in-room measurements across a 6 ft x 2 ft grid area that show both the frequency and timing response remain consistent across this large sweet spot, based on a single analysis measurement.


    From John’s listening perspective, it is as if the room were perfectly symmetrical, listening to the ideal loudspeaker in a room where the room modes are evenly distributed. This describes most pro control rooms used for recording, mixing and mastering as they are acoustically designed that way. We can achieve similar if not a virtual replica of what has been recorded, mixed, and mastered arriving at our ears in the comfort of our listening rooms, without major room reconstruction or cost.


    This is a good segue into the question: “So, what does accurate sound ‘sound’ like?”



    Subjective description of listening to accurate sound




    “It is possible to reproduce a stereo recording in an ordinary living room such that listeners have the illusion that the two loudspeakers have disappeared. When they close their eyes, they can easily imagine to be present at the recording space, as they listen to the phantom audio scene in front of them.” Siegfried Linkwitz


    I totally believe that, as I hear it every day from my accurate system. The reason I love listening to music is to be blown away. I am always looking for ways to get the most of what is on the recording. I want to hear the full expression of the performance. Being able to reproduce the music (i.e. signal) faithfully (with no frequency or time domain distortion arriving at our ears) gets us there. Let’s talk about these two technical parameters from a subjective perspective as to what accurate sound “sounds” like.



    Frequency response:


    With a smooth on and off axis frequency response we get the tonal representation as recorded on the digital media. For sure, there is a wide range of recordings with varying frequency responses, but I have found more often than not that there is a sweet spot. For example, like the Harman target mentioned earlier where almost every recording I have sounds good, some better than others, but all good. Neutral frequency response meaning no one frequency stands out over the other. The balance sounds not too bright, not too dull, but just right with the right amount of bass that sounds even, solid, transient, and crystal clear.


    What is often overlooked is how well the channels’ frequency responses match each other. This is absolutely key for a rock solid stereo phantom center image and overall stereo image. See John’s original frequency response, where the two channels don’t match and the mismatch is frequency dependent. This is what blurs the phantom center image, producing what we call phantom wander or a weak phantom center image. Some frequencies sound centered, others sound as if they come more from one side of the stereo image than the other, and even vice versa in another range of frequencies. This not only destroys the phantom center image, but also the stereo image itself. It is further exacerbated if one’s setup and/or room is asymmetrical.


    When I listen to a mono recording on an accurate system, the image is crystal clear and dead center. I mean like a virtual point source “dot” emanating from the very center, at eye height. There is no phantom drift towards one speaker or the other, just dead center, over the entire frequency range. Given that very few of us have symmetrical setups in symmetrical rooms where one half is a mirror image of the other half, the only way to match the channels’ frequency responses to this level of accuracy and precision is by using DSP.



    Timing response:


    There are two aspects to timing response and how we subjectively perceive them when listening to music. One is the low frequency “evenness” of the sound. Remembering JJ’s presentation, for low frequencies, our ears hear a combination of both the direct sound and room sound. The room sound occurs over time. Aside from the frequency correction providing that smooth response of the direct sound, we want that smooth response over time too, following the ideal response as if it was all coming from the loudspeaker with no room contribution. This is for low frequencies, typically below the room’s transition frequency.


    The subjective listening experience with the smooth bass, both the direct sound and over time, provides crystal clear sounding bass with no “overhang.” Feeling solid and dead center without any wandering from center over time. For many, it is the first time one actually hears how clear and even sounding the bass coming from one’s system can be.


    The 2nd aspect is time alignment, where all of the direct sound is arriving at your ears at the same time. Not only between each individual speaker driver, but between stereo channels as well. Subjectively, to my ears, the transient impact, even with subs, is immediate. A plucked acoustic guitar string has that snap you hear as if the real guitar was in the room. I have performed that experiment and it is remarkable how close it sounds to the real thing.


    The stereo image benefits as both channels are also arriving at your ears at precisely the same time, along with all speaker transducers being time aligned. To my ears, the soundstage or imaging really focuses and the image width and height go beyond the physical dimensions of the loudspeakers. The location of instruments and vocals within the 2D image are solid, precise, and don’t vary with frequency. 


    The other area I feel time alignment really improves the listening experience is the depth of field in the recording. Or put another way, I can hear deeper/longer into the recording than ever before. The stereo image height and width restoration now has an equivalent depth of field restoration and extends as if there was no front wall, just like through the looking glass…





    I hope folks found the article educational. The links point to excellent research, some of it an accumulation of decades of comparing subjective listening experiments with objective measurements. That research has developed into meaningful, modern standards for measuring loudspeakers and improving their designs, with the benefit going to the consumer. This is especially true when using the CTA-2034-A measurement standard as it also provides a reasonable estimation of what the loudspeaker will sound like in one’s living room, at least from a tonal response perspective.


    With sophisticated DSP filter designers and powerful computers, one can easily model the “ideal” loudspeaker. We can also compare the ideal loudspeaker to the real world of loudspeakers in rooms where we listen to wonderful musical performances. We measure (and hear!) distortions in both the frequency and time domains with loudspeakers in rooms. Through sophisticated DSP filtering, we are able to restore the signal to the “ideal.” Of course, some prefer a bit more bass or more treble, but there is a standard distribution based on my research and having measured dozens of different systems in rooms from all over the world.
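
    To make the idea of restoring the signal with a DSP filter more concrete, here is a toy sketch. It models the “room” as the direct sound plus one reflection, builds a truncated inverse filter, and convolves the two to show the result collapsing back toward a single clean impulse. The reflection delay, gain, and number of terms are arbitrary assumptions, and real tools such as Audiolense work from measured impulse responses with far more elaborate filter design, so this is not their algorithm.

```python
def convolve(a, b):
    """Direct-form linear convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        if x == 0.0:
            continue  # skip empty taps; most of these filters are zeros
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def single_reflection(delay, gain):
    """Impulse response: direct sound plus one delayed, attenuated copy."""
    h = [0.0] * (delay + 1)
    h[0] = 1.0
    h[delay] = gain
    return h


def truncated_inverse(delay, gain, terms):
    """First `terms` taps of the series 1 - g z^-d + g^2 z^-2d - ..."""
    inv = [0.0] * (delay * (terms - 1) + 1)
    for k in range(terms):
        inv[k * delay] = (-gain) ** k
    return inv


# Hypothetical reflection: 48 samples late (1 ms at 48 kHz), half amplitude.
room = single_reflection(delay=48, gain=0.5)
correction = truncated_inverse(delay=48, gain=0.5, terms=8)
corrected = convolve(room, correction)

# Everything after the first sample is leftover error from truncation.
residual = max(abs(v) for v in corrected[1:])
print(f"direct sound: {corrected[0]:.3f}, worst residual: {residual:.4f}")
```

    In this toy case each extra term pushes the leftover error later in time and makes it smaller (the gain raised to the number of terms), which hints at the trade-off between filter length and accuracy.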


    Frequency and timing response are not the only attributes that make up accurate sound. What about total harmonic distortion (THD), for example? Well, unless you are hearing audible loudspeaker distortion because the loudspeakers are too small and/or inefficient to drive to “reference level” without hearing distortion in one’s listening room, I am a bit, “What, me worry?” I mention this caveat for folks who may get the impression that I don’t think other parameters affect accurate sound. They do. But in the big scheme of things, and relative to all of the other digital and electronic devices upstream, loudspeakers in rooms make for the biggest divergence away from the ideal relative to any other component in the system.


    I would like to give the final word to John, who was generous enough to let me use his system as an example, and most importantly, has heard the difference in his system first hand:


    First of all, thank you for featuring my system and my room in your very thorough and very technical article describing the benefits of using well designed and executed DSP software in order to achieve the best possible sound in anybody’s room. Of course a collection of good equipment that’s carefully set up in any room should almost guarantee great sound. That is true if you have a room designed for perfect acoustics. Looking at my pictures, this definitely is not the case. Still, as happy as I was with the outcome, I thought that the influence of my room's layout was detrimental to the overall sound. A well designed DSP filter based on my room’s readings would bring it up to another level. 


    Since I mostly listen to music streaming from Tidal or Qobuz via Roon, I decided to use that platform and Audiolense, a powerful software DSP tool, to fine tune the sound. Did it work? Yes, I can positively say that it made a big difference in how the music sounds in my room now. In a nutshell, the instruments and vocals are much better focused and spaced around the stage now. The frequency spectrum is now much more evenly spread without any noticeable peaks and valleys. The bass, the mids and the highs sound just right now and on well-recorded material, you feel like you are there. Finally, as good as all the electronics are, I think the speakers, their design and execution made the biggest difference. After all, they produce the sound, and it is glorious. I think I finally arrived at a point where I can say, THAT’S IT!





    Mitch “Mitchco” Barnett.


    I love music and audio. I grew up with music around me, as my mom was a piano player (swing) and my dad was an audiophile (jazz). My hobby is building speakers, amps, preamps, etc., and I still DIY today.


    I mixed live sound for a variety of bands, which led to working full-time in multiple 24-track recording studios. Over 10 years, I recorded, mixed, and sometimes produced over 30 albums. I wrote a book, “Accurate Sound Reproduction using DSP,” and run an Accurate Sound Calibration service.


    User Feedback

    Recommended Comments

    2 hours ago, jrobbins50 said:

    I am a believer. Nice article, Mitch. JCR 

    Thanks Jeffrey!


    Excellent article! Could you provide some more details about how you achieved these results? It appears the filters were measured and applied via Audiolense? What hardware was used for DSP playback? I appreciate a full answer would be another full length article so brief details of the hardware, software and measurements would be fantastic.



    Mitch, do you know what smoothing is used in the frequency response graphs?  I don’t see that stated in the Audiolense user guide.  Is it a proprietary psycho-acoustic smoothing?


    Can the Audiolense DSP be applied to Roon or HQPlayer?

    Are there processor power requirements?


    The reason for asking is I don’t want an extra PC in the digital chain. 
    (I understand I need a PC for measurements). 


    1 hour ago, Quad 405 said:

    I'd like to give a shout out for Mitch's Accurate Sound Calibration service and his book. My Kii Threes are sounding even better after Mitch created the filters for Roon. Thanks for another great article!

    Nice. I thought only HAF offered such a service. I actually wanted to ask him (Mitch) how good it is, but I guess that may be hard to answer now 😀

    I assume Mitch’s service uses Audiolense?




    I bought Audiolense XO and used it for the measurements and Mitch used it to create the filters. His book is based on using Acourate.


    17 minutes ago, Quad 405 said:

    Audiolense XO

    Would one need XO, or is 2.0 sufficient?



    I just realized that Mitch is some sort of partner with HAF, and even uses their Room Shaper SW.


    I hope this means the files returned from Mitch also include the Room Shaper correction. That would then mean it’s not so important to have this SW converted into something HQPlayer or a Roon server (meaning NUC, NAS, SonicOrbiter) can use.




    1 hour ago, R1200CL said:

    My Kii Threes are sounding even better after Mitch created the filters.

    Are you saying Kii’s own DSP SW can be edited?

    Or did you first apply the AL filters before the Kii, and then the Kii DSP filters?


    34 minutes ago, R1200CL said:

    Are you saying Kii’s own DSP SW can be edited?

    Or did you first apply the AL filters before the Kii, and then the Kii DSP filters?

    The AL filters were added to Roon, and the output then goes via an opticalRendu to the Kii Control. I changed the Kii boundary setting to -6 before taking the initial measurement but left the other Kii settings at their defaults.


    12 hours ago, blue2 said:

    Excellent article! Could you provide some more details about how you achieved these results? It appears the filters were measured and applied via Audiolense? What hardware was used for DSP playback? I appreciate a full answer would be another full length article so brief details of the hardware, software and measurements would be fantastic.



    Thanks @blue2. Other than a UMIK-1 mic, a Mac, and REW to take the measurements, the hardware is listed under the picture of John's @Olesno system. I can export the REW impulses and import them into Audiolense.


    On this site, you will find a number of "walkthroughs" I wrote using Audiolense, Acourate, and Dirac. These should answer most of your question in detail.


    2 hours ago, ASRMichael said:

    Fantastic write-up


    Can you / Accurate Sound produce convolution filters for HQPlayer?




    Thanks. And yes.


    Mitchco, thank you for a very informative article. I currently use an INNUOS Zenith SE MKII music server running ROON Core. I tried room correction in the past, unsuccessfully, using an Anti-mode 2.0 system. I am now considering two different approaches to room correction:

    1. Roon Convolution filters

    2. New amplifier with room correction in the amplifier. (Lyngdorf TDAI with Room Perfect)

    Assuming you are familiar with Room Perfect software, would you please share your thoughts on this product?

    What approach to room correction do you think would yield the best results? Thanks for your consideration of these questions.



    9 minutes ago, ALLDIGITAL said:

    Mitchco, thank you for a very informative article. I currently use an INNUOS Zenith SE MKII music server running ROON Core. I tried room correction in the past, unsuccessfully, using an Anti-mode 2.0 system. I am now considering two different approaches to room correction:

    1. Roon Convolution filters

    2. New amplifier with room correction in the amplifier. (Lyngdorf TDAI with Room Perfect)

    Assuming you are familiar with Room Perfect software, would you please share your thoughts on this product?

    What approach to room correction do you think would yield the best results? Thanks for your consideration of these questions.


    I’m a huge fan of separating DSP from hardware like an amp. It gives you greater flexibility. I use convolution filters in Roon all the time. The great thing is they can be enabled/disabled with the tap of a finger, and you don’t need expensive hardware designed to use DSP.


