
Complex crossover networks are The Devil. As you observed, the human ear is largely insensitive to moderate variations in volume/frequency... but it's incredibly sensitive to phase. Phase is how we discern directionality (stereo), among other things. So introducing complex phase shifts in the crossover in order to achieve flatness is a bad tradeoff, imho.

To my ears, the best and most musical sounding speakers I've heard have time-aligned drivers and very simple crossovers. Tannoy, Spica, Vandersteen, and other such designs are clearer and less fatiguing. I'm a musician and have recorded numerous albums, and those speakers are the ones that best match what I'm used to hearing.

The fundamental problem is that we work on what we can measure. It's very easy to measure frequency response. It's very hard to measure phase alignment. So we fix what we can measure. If you want to see serious map-over-territory thinking, look at THD specs for amplifiers. It's super-easy to measure THD - just isolate the harmonics of a sine wave at 1 kHz. Unfortunately, this has approximately zero to do with music, unless your idea of "music" is static sine waves. Recorded music has a 20-30 dB dynamic range (less in the case of modern pop) and covers ten octaves. Dynamic recovery behavior, intermodulation distortion, stuff like this is what gives amps their distinctive sounds - but it's nearly impossible to measure! So they sell what looks good on paper... THD. Sigh.
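To make the "super-easy" part concrete, here's a minimal sketch of the classic single-tone THD measurement: drive the device with a 1 kHz sine, FFT the output, and compare harmonic energy to the fundamental. The cubic nonlinearity is a made-up stand-in for a device under test.

    import numpy as np

    fs = 48_000          # sample rate, Hz
    f0 = 1_000           # test tone, Hz
    t = np.arange(fs) / fs                 # 1 s of signal -> 1 Hz FFT bins

    # Hypothetical device under test: a mild cubic nonlinearity.
    clean = np.sin(2 * np.pi * f0 * t)
    output = clean - 0.01 * clean**3

    spectrum = np.abs(np.fft.rfft(output))
    fundamental = spectrum[f0]
    harmonics = spectrum[[2 * f0, 3 * f0, 4 * f0, 5 * f0]]

    thd = np.sqrt(np.sum(harmonics**2)) / fundamental
    print(f"THD: {100 * thd:.3f}%")        # ~0.25% for this toy device

Note how little this exercises: one frequency, one level, steady state. That's exactly the criticism being made above.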



This comment covers much of what is wrong with common interpretations of acoustics.

> the human ear is largely insensitive to moderate variations in volume/frequency... but it's incredibly sensitive to phase.

No, it's not. The human auditory system is sensitive to time variation. Phase shift may contribute a time shift, but only at frequencies low enough that their wavelength bears a reasonable relation to the spacing of the ears. For example, at 10 kHz the wavelength is far too short for 'phase' to impact arrival time. Ultimately, what actually matters in this regard for the human auditory response is group delay, not phase.
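For the curious, a minimal sketch of the distinction: group delay is the negative derivative of phase with respect to angular frequency, so it's a different quantity from the phase value itself. The 1 kHz Butterworth low-pass here is just an arbitrary example filter.

    import numpy as np
    from scipy import signal

    fs = 48_000
    b, a = signal.butter(2, 1_000, fs=fs)        # arbitrary 1 kHz low-pass

    w, h = signal.freqz(b, a, worN=2048, fs=fs)  # w in Hz
    phase = np.unwrap(np.angle(h))               # phase response, radians
    tau = -np.gradient(phase, 2 * np.pi * w)     # group delay, seconds

    print(f"group delay at low frequencies: ~{tau[1] * 1e3:.2f} ms")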

> Phase is how we discern directionality (stereo), among other things.

As mentioned above, what matters in the context of what you're saying is group delay, but that is far from the only thing that matters for directionality. The human psychoacoustic system is very complex: part of directionality is group delay, part of it is frequency attenuation caused by sound wrapping around the head and outer ear, part of it is a half dozen other things. 'Phase' doesn't come close to being a catch-all cause.

> To my ears, the best and most musical sounding speakers

And thus we've devolved from science, as this usually goes. What on earth does 'musical sounding speakers' even mean?

> It's very easy to measure frequency response. It's very hard to measure phase alignment.

I have no idea where you came up with this statement, but it is very easy to measure the response of a loudspeaker, both in amplitude and phase.

> If you want to see serious map-over-territory thinking, look at THD specs for amplifiers. It's super-easy to measure THD - just isolate the harmonics of a sine wave at 1 kHz.

THD can be measured at only 1 kHz, but that isn't intrinsic to what 'THD' means. Measuring at only 1 kHz is generally indicative of a crappy amplifier manufacturer looking to inflate their power numbers. Proper specs will provide a 20 Hz-20 kHz THD rating.

> Dynamic recovery behavior, intermodulation distortion, stuff like this is what gives amps their distinctive sounds - but it's nearly impossible to measure!

If an amp has a distinctive sound, it has failed to achieve its core design goal. I'm not sure what you think is impossible to measure, but I assure you it is not.


Both your comments are heavyweight so I'm not trying to pick sides. (Sum your two perspectives and I think you're close to Ultimate Expert Comment.) I just want to mention that beat is talking about phase alignment between identical signals, which our perceptual network is incredibly sensitive to. Anyone who has ever done multi-mic recording can attest to the very real effect you can hear from lack of phase alignment, even if they have bronze ears. Even worse is if you duplicate two musical recordings and offset one, even by a tiny amount.

This is a critical aspect of crossovers- since they contain the same signal, any phase differences between the intersecting components are easy to pick up by the ear and very difficult to measure since it won't show up as a significant frequency or amplitude differential. In the highly acoustically sensitive area where the crossover occurs, the phase distortion between the two signals is hard to pin down but it's definitely there, it's easy to hear by A/Bing.

I mix on a multiple-monitor setup, and the one that is most critical to my work sounds like absolute garbage. However, it has a wonderfully "flat" response and lets you hear which aesthetically unpleasing issues are present. It has no crossover. The goal is to avoid perceived rumble, tinniness, or mud on any of the incredibly wide range of listening systems out there. I think this is what beat means by "musical" - flat speakers are actually not pleasant to the ears, but they're the most empirically useful.

The most famous secret of many respected mix engineers is the Auratone. Michael Jackson's "Bad" was mostly mixed on this little thing. It sounds like absolute garbage aesthetically but reveals more issues in a mix than anything else. It's like when you first saw things under a "blacklight" as a kid and got to see all the particulate matter covering everything that you can't see with the naked eye.

Here's more about the Auratone. I hope my comment has helped to bridge this critical area of "musicality" vs. empiricism. http://www.trustmeimascientist.com/2012/02/06/auratone-avant...

Postscript- the listening environment is the most important aspect! Always! You can have the best mastering grade monitors on earth in an improperly treated room, and it's all for naught!


> I just want to mention that beat is talking about phase alignment between identical signals, which our perceptual network is incredibly sensitive to.

I wouldn't go that far. If you have two signals emitting from physically separate locations and they are out of phase, you just get comb filtering. Is it audible? It can be, but it's not the phase you're hearing, it's the drastic frequency notching in amplitude. More importantly, you're going to get comb filtering no matter what you do with a multi-speaker setup. Just move your head a couple of inches out of the ideal sweet spot equidistant from each speaker and you'll have created an effective phase shift and get the same type of comb filtering. In other words, even with a theoretically perfect pair of time-aligned speakers with perfect crossovers, you'll still get comb filtering if you take various measurements around the listening area. Just moving the mic 4 inches can have drastic effects on the measured response.

Incidentally, if you've ever seen someone taking a measurement with a sound meter and rhythmically moving the mic around in a strange fashion, they are doing that to try to even out the effects of comb filtering.
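A minimal sketch of that effect, assuming a ~10 cm path-length difference (a few inches of head movement): summing a signal with a delayed copy of itself notches the spectrum at regular intervals, which is the amplitude effect described above.

    import numpy as np

    fs = 48_000
    c = 343.0                                   # speed of sound, m/s
    d = int(round(0.10 / c * fs))               # ~10 cm extra path, in samples

    x = np.random.randn(fs)                     # 1 s of white noise
    y = x.copy()
    y[d:] += x[:-d]                             # direct sound + delayed copy

    Y = np.abs(np.fft.rfft(y))                  # comb-filtered spectrum
    f_notch = fs / (2 * d)                      # first cancellation frequency
    print(f"first comb notch predicted near {f_notch:.0f} Hz,")
    print(f"then repeating every {fs / d:.0f} Hz up the spectrum")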

> This is a critical aspect of crossovers- since they contain the same signal, any phase differences between the intersecting components are easy to pick up by the ear and very difficult to measure since it won't show up as a frequency or amplitude differential.

If you're taking a measurement in the crossover region and there is a phase shift between the drivers, you most certainly will see an effect in amplitude, as you'll have at least partial wave cancellation. You can certainly take measurements in locations that won't show this, but that is always true. You can take measurements of a perfectly time-aligned speaker that make it look like it has phase issues too, if you put the mic in the right spot.

Likely the largest improvement you get from time alignment of the drivers in a multi-way speaker is the ability to control vertical lobe tilting, but you can 'fix' that issue with MTM layouts without time aligning the drivers as well.


I think what one might crudely call the "audiophile phase beef" is not about the comb-filtering that results from misalignment but the non-linear phase response of most crossovers, before the signal hits the air. Most act somewhat like an all-pass filter: a flat frequency response but varying delay across the frequency spectrum. This doesn't affect frequency sweeps or white noise, but results in impulses and clicks being smeared out in time or ringing a bit. The thought among some is that this is audible on percussion sounds, but experimental evidence is dubious as far as I can see. There are some references in the classic Douglas Self crossover book: http://books.google.co.uk/books?id=D9l6JWKKSzUC&lpg=SA2-PA14...
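A minimal sketch of that all-pass behaviour, using a textbook 4th-order Linkwitz-Riley crossover as a stand-in (not necessarily what any given speaker uses): the two legs sum to flat magnitude, but the group delay still varies across frequency, which is what smears impulses in time.

    import numpy as np
    from scipy import signal

    fs, fc = 48_000, 2_000                      # sample rate, crossover freq

    bl, al = signal.butter(2, fc, 'low', fs=fs)
    bh, ah = signal.butter(2, fc, 'high', fs=fs)

    x = np.zeros(4096); x[0] = 1.0              # unit impulse
    lo = signal.lfilter(bl, al, signal.lfilter(bl, al, x))   # LR4 low leg
    hi = signal.lfilter(bh, ah, signal.lfilter(bh, ah, x))   # LR4 high leg
    y = lo + hi                                 # the acoustic sum

    H = np.fft.rfft(y)
    f = np.fft.rfftfreq(len(y), 1 / fs)
    mag_db = 20 * np.log10(np.abs(H))
    tau = -np.gradient(np.unwrap(np.angle(H)), 2 * np.pi * f)

    print(f"magnitude ripple: {mag_db.max() - mag_db.min():.4f} dB (flat)")
    print(f"group delay spread: {tau.min()*1e3:.3f} to {tau.max()*1e3:.3f} ms")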

A similar argument, which I think more people would agree with, is that reflex-loading a speaker hurts the low-frequency group delay; the absence of a reflex port is one reason suggested for the supposed clarity of the classic NS10 monitor: http://www.soundonsound.com/sos/sep08/articles/yamahans10.ht...


> If an amp has a distinctive sound, it has failed to achieve its core design goal.

What makes you think the design goal of an amp is always to have perfect sound reproduction? That may be true for a home-theater system or if you're playing wind/string instruments, not so for a guitar amp. We have the (digital) tools to get pure, 100% uncolored sound, and musicians hate it.


These are "bookshelf monitors", and the whole point of a monitor is to be accurate!


The comment I replied to was an absolute statement, not referring to the monitor being built. Maybe if he had said "If a monitor has a distinctive sound [...]" that might have been closer to the truth; in practice it's nearly impossible not to have a distinctive sound, there are too many variables at play.


> in practice it's nearly impossible not to have a distinctive sound, there are too many variables at play.

Of course not. High-end monitors are high-end precisely because they achieve indistinguishability. Between two pairs of speakers from different brands, you wouldn't be able to tell the difference.


This goes absolutely against the real-world experience of professional mixing and mastering engineers, people who make their livings on "accurate" reproduction of music. Speakers are not accurate, period. Anyone who has been both a musician and a mixing engineer (like me) will tell you this. The idea of "accurate" response from audio systems is magical thinking.

The mixing and mastering engineers aren't trying to make some perfect reproduction of natural sound. They're trying to make records that sound as good as possible on as many different kinds of reproduction systems as possible - not just "perfect" audiophile systems, but car speakers, iPods, etc. As such, rather than going for accurate speakers, mix engineers rely on speakers that they know very well, so they can predict results elsewhere more easily.

The most popular professional mixing speaker is the Yamaha NS-10. It's not "accurate". It doesn't even pretend to be accurate. In fact, it has a pronounced peak and significant harshness around 2 kHz, right in the most sensitive area for vocals and midrange melodic instruments. Why use it? Because if you can make it sound good on the NS-10, it'll sound good anywhere. Likewise, the second most popular speaker is the Auratone, a single driver with limited frequency range. The Auratone has two advantages. First, it reflects the limited construction of many real-world speakers. Second, because it lacks a crossover, there's no phase weirdness in the midrange, so it's actually very pure at the most musically critical points. Deep bass and sizzly highs aren't important. Midrange is important, and Auratones are brutally honest at that, more than speakers costing orders of magnitude more.


Are they? I haven't had the chance to compare two monitors in the same room, but as far as I know no two are alike, and none has ever achieved a perfectly flat response curve. Take also high-end headphones, where it's way easier to have a similar acoustic profile, yet any two models with similar characteristics (on paper) are very easy to tell apart.


There's no such thing as a perfectly flat response, but that's not what you need: what you need is a response flat enough that it falls within the resolution of human hearing.

(As far as I remember, headphones are actually harder to get precise response curves due to the interaction with the skull and precise physiology of the listener, but I may be mistaken in that.)

The problem with any claim from people that they can "easily" tell two pairs of headphones or two pairs of monitors apart is that most of the time these claims are not scientifically validated. There are numerous biases and gotchas involved in measuring audio fidelity, and one must be aware of these when designing experiments. (And the whole "audiophile" industry is based on selling snake-oil technology to people with fat wallets who think they have better ears than anyone else, and who go to great lengths to deny the science.)


Likewise, it's very easy to write off actual observation as "bias" when the test isn't "scientific". That's one of my big gripes with what I think of as the right wing of the hi-fi industry... the idea that human experience can be reduced to measurable and mechanically reproducible observation in all cases. This reaction leads to "blame the observer" rather than questioning whether the scientist actually understands the problem domain or the question they're asking sufficiently well. It's every bit as emotional and reactionary as the "left wing" hi-fi subjectivists. But at least the subjectivists are owning up to their magical thinking.


I think you are wrong to dismiss domain experts' opinions as leaving science. Machine learning 101 is to eyeball your data, as the human brain is the most powerful pattern matcher known. A musician's ear is therefore the best tool for detecting anomalies in sound and should not be dismissed for not having a nice mathematical construction. Acoustics are perceptual and heavily filter real-world sound waves, so I don't think it's possible to reduce the auditory cortex to simple filters like "it's ignoring phase" or whatnot.


> A musician's ear is therefore the best tool for detecting anomalies in sound

This is only true if you can remove biases.


Gah, not A/B testing nonsense.

Want an A/B test? Put your hi-fi in the same room with a real singer, or an acoustic guitar, or whatever musical instrument, and see how hard it is to tell them apart. Not very, I assure you.

Interestingly, we can pick up very subtle and useful musical cues from very poor recordings and reproduction. It's a complex thing.


> For example, at 10 kHz the wavelength is far too short for 'phase' to impact arrival time.

It's the time-of-flight difference between when one ear receives the wavefront of a change in the sound and when the other ear does that gives us the directional information.

Absent any change, all we have to go on is volume, so we rotate our heads to the point where both ears receive the signal equally strongly; the source is then somewhere in the plane that bisects the listener's head.

Rotating one ear forward gives us a clue about whether the sound source is in front of us or behind us. (This fails when the source is directly overhead.)
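For a rough sense of scale, here's a back-of-the-envelope sketch using the simple far-field path-difference model delta_t = d * sin(theta) / c, with an assumed ~18 cm ear spacing: the interaural time difference tops out around half a millisecond.

    import math

    d = 0.18                # ear spacing in metres (assumed)
    c = 343.0               # speed of sound, m/s

    for theta in (0, 30, 60, 90):                    # angle off-centre, degrees
        itd = d * math.sin(math.radians(theta)) / c  # simple path-length model
        print(f"{theta:2d} deg off-centre: ITD ~ {itd * 1e6:3.0f} us")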


> It's the time-of-flight difference between when one ear receives the wavefront of a change in the sound and when the other ear does that gives us the directional information.

That's part of it. The other part is the head-related transfer function: the change in the sound's frequency response due to diffraction effects caused by the head, upper torso, and pinnae. The pinnae also help us to judge elevation.


I'm not devolving from science. I'm pointing out that science is insufficiently evolved to truly understand the domain.

I'm a musician and a recording engineer. I've made records, I listen to real-world instruments every day, I've built speakers, and I've built amplifiers. And I've learned along the way that "accurate" is a big stinking pile of BS, and "scientific" is usually just a euphemism for magical thinking - "If we can't measure it, it doesn't exist". If you can't measure it, maybe you don't understand the problem as well as you thought.

You don't get to ignore the evidence of the senses of domain experts just because it makes you uncomfortable.


> incredibly sensitive to phase

Only under contrived situations such as square waves and headphones. In general, the ear is not particularly sensitive to phase; see [0] and [1], which both discuss, in particular, [2] and [3], as well as many others listed in their bibliographies.

The stated conclusions are, emphasis mine, "Given the data provided by the above cited references we can conclude that phase distortion is indeed audible, though generally speaking, only very subtly so and only under certain specific test conditions and perception circumstances." [1]

Readers may also be interested in an additional conclusion, which reads thusly: "Room acoustics further masks whatever cues that the hearing process may depend upon to detect the presence of phase distortion." [1]

[0] http://www.silcom.com/~aludwig/Phase_audibility.htm

[1] http://www.audioholics.com/room-acoustics/human-hearing-phas...

[2] Lipshitz, Stanley P., Pocock, Mark, and Vanderkooy, John, "On the Audibility of Midrange Phase Distortion in Audio Systems," J. Audio Eng. Soc., Vol. 30, No. 9, Sept. 1982, pp. 580-595.

[3] Toole, Floyd E., "The Acoustics and Psychoacoustics of Loudspeakers and Rooms - The Stereo Past and the Multichannel Future," 109th AES Conv., Los Angeles, Sept 2000.


"So they sell what looks good on paper... THD"

From an EE perspective most of the "obvious" ways to screw up THD sound truly horrible. Crossover distortion, clipping, bias problems, truly excessive hum or noise...

Also note that just because your final PA stage is a modern class D doesn't mean there isn't some opamp input stage in the box that is perfectly liable to classical transistor bias problems. You can't get crossover distortion out of a modern class D output device, given how those switch their transistors, but that doesn't mean you can't screw up the opamp input stage of an amp that nonetheless has a class D output stage. In terms of circuit design, a guy who can design a nice class D stage is not necessarily (but often is) the same guy who can design a class A or AB stage.

"Necessary but not sufficient" is a good way to describe good THD numbers.

I would have to think for a while about how to get a good THD number with foul intermod numbers... Maybe if you fed in two signals outside the audio bandwidth that, when nonlinearly mixed, gave a difference frequency of 2 kHz; that would sound horrible even if a pure sine-wave input at 1 kHz sounds awesome. It would take some work to screw up an amp this way, but it could happen.
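A minimal sketch of that nonlinear mixing, with arbitrary tone choices: a small square-law error term on two tones near the top of the band drops a difference-frequency product right at 2 kHz, even though neither test tone is anywhere near it.

    import numpy as np

    fs = 96_000                                # high rate keeps products in range
    t = np.arange(fs) / fs
    x = np.sin(2 * np.pi * 19_000 * t) + np.sin(2 * np.pi * 21_000 * t)
    y = x + 0.01 * x**2                        # gentle even-order nonlinearity

    Y = np.abs(np.fft.rfft(y)) / (fs / 2)      # normalise: a unit tone reads 1.0
    print(f"difference product at 2 kHz: {Y[2_000]:.3f}")   # ~0.01, in band
    print(f"test tone at 19 kHz:         {Y[19_000]:.3f}")  # ~1.0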

Anyway, nothing says "tradeoff" like engineering, and I wouldn't trade off a pretty basic core characteristic like THD for improved... anything, except maybe power level. I'd rather hear something good than something a couple dB louder. Reporting good THD is valid in marketing as a way to tell that you did at least the very first homework problem... I agree totally with you that there are multiple steps beyond it, but there's no point in working on any of them if you can't pass the first simple THD test. (Edited to add, I missed the PERFECT analogy: it's like passing the UL certification that your amp probably won't burn your house down. That's kinda required, and it's also possible to aim a little higher.)


Back in the day, I used to build Class A directly heated triode tube amps for hi-fi. These are pooh-poohed by the self-styled "scientific" hi-fi community for their high THD specs. But it's important to think not only about how much distortion is happening, but where it's happening. In super-simple triode tube circuits with no negative feedback loops, THD is pretty much linear with volume. For small signals - the detail in the mids and highs - distortion is effectively unmeasurable.

The mainstream class A/B transistors + heavy negative feedback architecture has the benefits of low manufacturing cost (no need to match devices when you just feed the differences back out) and good specs. But where does the distortion happen? At the class A/B crossover point, somewhere in the first watt. Until the amp hits maximum power and clips, the WORST distortion is in the first watt, where all the small signals live.
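A minimal sketch of that "first watt" argument, under toy models (a dead-zone standing in for unbiased class B crossover distortion, a soft cubic curve standing in for the "triode-like" case): the dead-zone's THD gets worse as the level drops, while the cubic's gets better.

    import numpy as np

    def thd(y, f0):
        """THD through the 5th harmonic; assumes 1 s of signal (1 Hz bins)."""
        S = np.abs(np.fft.rfft(y))
        return np.sqrt(np.sum(S[[2*f0, 3*f0, 4*f0, 5*f0]]**2)) / S[f0]

    fs, f0 = 48_000, 1_000
    t = np.arange(fs) / fs

    for level in (1.0, 0.1, 0.01):             # big signal down to "first watt"
        x = level * np.sin(2 * np.pi * f0 * t)
        dead_zone = np.sign(x) * np.maximum(np.abs(x) - 0.005, 0)  # crossover gap
        cubic = x - 0.05 * x**3                                    # smooth curve
        print(f"level {level}: dead-zone THD {100*thd(dead_zone, f0):.2f}%, "
              f"cubic THD {100*thd(cubic, f0):.4f}%")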

To my common sense and my ears, this explains how good-spec amps can sound harsh and cold while bad-spec amps can sound sweet and warm. It's not "euphonic distortion", as the pseudo-scientists claim. Rather, it's that the warm amp is distorting less where it actually matters.


Thinking about this a bit more, I suspect the reason for the superb sonic reputation of a lot of inexpensive "Class T" digital amps is that they push the frequency-domain distortion problems into a different frequency range. Digitally switching at far higher frequencies than audio and then modulating audio onto that signal is a pretty brilliant approach. You get the cost benefits of digital switching amps without audio-band artifacts from all the many kinds of low-level signal crud. The artifacts are there - they're just several octaves out of band for audio.



