
A few weeks ago I talked with a guy whose startup was developing a better hearing aid. As a few other commenters have mentioned, a hearing aid does not just amplify everything uniformly --- that is useless for people with hearing loss. The problem with hearing loss is generally not so much that you can't hear anything at all. It's that it becomes extremely difficult to distinguish speech from background noise.

The ideal hearing aid will amplify the frequencies at which speech is present while suppressing the frequencies that contain background noise. But current hearing aids use a pretty dumb set of heuristics to figure out which is which. For instance, they'll try to estimate how far away the source of a sound is and suppress it if it's more than, say, 15 feet away. But if your SO is calling you from the kitchen, that means you're not going to hear them. Similarly, they'll amplify frequencies associated with sibilants (s sounds, for example) because those are really important for speech, but that means wind noise or rustling paper also gets amplified.

There's a huge opportunity here to apply deep learning to determine which frequencies to amplify and which to suppress. It's a challenging hardware problem, but the deep learning part of it has largely been solved. (Or at least, the current state of the art using deep learning is way, way better than what commercial hearing aids currently do.)
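
To make the band-gain idea concrete, here's a minimal sketch (the band edges and gain values are illustrative assumptions, not tuned numbers):

    import numpy as np
    from scipy.signal import stft, istft

    def apply_band_gains(x, fs=16000, speech_band=(300, 4000),
                         speech_gain=2.0, noise_gain=0.3):
        """Boost an assumed 'speech' band, attenuate everything else."""
        f, _, X = stft(x, fs=fs, nperseg=512)
        gains = np.where((f >= speech_band[0]) & (f <= speech_band[1]),
                         speech_gain, noise_gain)
        X *= gains[:, None]              # per-frequency gain, all frames
        _, y = istft(X, fs=fs, nperseg=512)
        return y

The hard part, as the rest of the thread points out, is deciding those gains per frame rather than fixing them up front.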



> The problem with hearing loss is generally not so much that you can't hear anything at all. It's that it becomes extremely difficult to distinguish speech from background noise.

The problem with hearing loss is that you can't hear. In most cases there's a loss of high-frequency hearing: you can't hear sibilants, and thus you can't parse speech.

Speech-in-noise is a specific situation that HAs do not handle well. It's less of an issue with unaided hearing thanks to the shape of our ears.

> The ideal hearing aid will amplify those frequencies at which speech is present, while suppressing frequencies that contain background noise

Every modern hearing aid does this already.

> a pretty dumb set of heuristics to figure out which is which

Dumb? Billions of dollars are waiting for the person who can make noise reduction work really, really well. It's surprisingly difficult, even with unlimited computational power.

> they'll try to estimate how far the source of the noise is

This is 100% fiction. No HA on the planet calculates distance-to-noise.

You could spot an HA that did, because it would require at least three microphones not in a straight line. Probably a triangle.

You might be thinking of beamforming, where the HA calculates the direction of the sound and can optionally focus amplification on sounds coming from that direction. Typically, sounds behind the listener are amplified less than sounds coming from in front of the listener. This is a useful refinement done by every modern HA.
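
For the curious, a toy two-mic delay-and-sum beamformer shows the principle. The 12 mm mic spacing and 48 kHz rate are assumptions, and real HAs use fractional-delay filters, since the integer-sample delay here is very coarse at this spacing:

    import numpy as np

    def delay_and_sum(front, rear, fs=48000, mic_dist=0.012, c=343.0):
        """Favor sounds from the front: delay the front mic's signal by
        the front-to-rear travel time so frontal sounds add coherently,
        while sounds from behind land out of alignment and partially
        cancel."""
        delay = int(round(fs * mic_dist / c))   # ~2 samples at these values
        aligned = np.roll(front, delay)
        aligned[:delay] = 0.0                   # zero the wrapped samples
        return 0.5 * (aligned + rear)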

> wind sounds or rustling paper also gets amplified

That is unfortunate. There is significant research going into recognising speech patterns so that the HA can make these decisions better, but so far none of it has shown useful results.

> huge opportunity here to apply deep learning

To do what, exactly? Why DL? How do you propose to run DL on a 1 MHz CPU with 16 kB of RAM and a battery the size of a bee's genitals?
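
Taking those figures at face value, a rough budget (the per-frame numbers for a small net are order-of-magnitude guesses, not measurements):

    # Rough budget for the 1 MHz / 16 kB figures quoted above.
    cpu_hz = 1_000_000        # ~1e6 simple ops/s, optimistically 1 MAC each
    ram_bytes = 16 * 1024
    frame_rate = 100          # 10 ms hop, typical for speech processing

    macs_per_frame = cpu_hz // frame_rate   # 10,000 MACs per frame, total
    max_weights = ram_bytes                 # 16,384 8-bit weights, max
    print(macs_per_frame, max_weights)

Even tiny keyword-spotting nets typically need hundreds of kilobytes of weights and hundreds of thousands of MACs per frame, so the budget is off by an order of magnitude or two before you've stored a single audio buffer.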

> It's a challenging hardware problem

The hardware has been known and fixed for 20 years. What would you change? Software is where all of the improvements have come from for a very long time.

> deep learning part of it has largely been solved

Cite me a paper and we can make billions.


It sounds like you know a lot about this field! I'll confess that I'm a neophyte. I'm working on audio research right now, but all I know about hearing aids is my one conversation with the whisper.ai CEO.

> Cite me a paper and we can make billions.

The relevant paper is Hershey et al., 2015 [1]. There are some audio examples here as well [2]. The idea is that a deep NN can apply a spectral mask and isolate a single speaker when many speakers are talking (or there's background noise). Of course a standard hearing aid has pretty limited hardware, which is why the hard part for them is developing a small enough device that can do the inference in real time. (They cheat a little bit and actually do all the processing on a larger device that you keep in your pocket --- it's not done locally behind the ear.)

[1]: https://arxiv.org/abs/1508.04306

[2]: http://www.merl.com/demos/deep-clustering
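
A sketch of the mask-and-resynthesize step from [1], assuming a trained network has already produced a d-dimensional embedding per time-frequency bin (the network itself is omitted):

    import numpy as np
    from scipy.signal import stft, istft
    from sklearn.cluster import KMeans

    def separate(mixture, embeddings, fs=16000, n_sources=2, nperseg=256):
        """Cluster per-bin embeddings, then binary-mask the mixture STFT.

        embeddings: (n_freqs * n_frames, d), from a trained
        deep-clustering net, ordered to match the STFT bins below."""
        f, t, X = stft(mixture, fs=fs, nperseg=nperseg)
        labels = KMeans(n_clusters=n_sources).fit_predict(embeddings)
        labels = labels.reshape(X.shape)
        sources = []
        for k in range(n_sources):
            _, y = istft(X * (labels == k), fs=fs, nperseg=nperseg)
            sources.append(y)
        return sources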



Curious to know in what ways old-fashioned DSP (Fourier, STFT, wavelets), or for that matter compressed sensing, would be lacking compared to deep learning.


In no way at all, especially given statistical techniques for speech enhancement using EM or Bayesian estimators with a Gamma prior.

You can also toss in some more advanced stuff like ICA deconvolution.

This is on top of some nice improvements in plain DSP techniques (e.g., the modified gammatone transform or variants of the Stockwell transform).

All in low latency.

The main problem is running this on a tiny jellybean power-efficient micro. Deep learning on a Cortex-M0 with decent quality and less than 5 ms latency? Good luck.
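
For reference, the classical statistical approach boils down to something like this spectral-subtraction sketch (the noise-only lead-in is my assumption; real systems track the noise estimate continuously):

    import numpy as np
    from scipy.signal import stft, istft

    def spectral_subtraction(x, fs=16000, nperseg=256,
                             noise_frames=10, floor=0.05):
        """Estimate the noise spectrum, subtract it from the magnitude,
        keep a small spectral floor to limit musical-noise artifacts."""
        f, t, X = stft(x, fs=fs, nperseg=nperseg)
        mag, phase = np.abs(X), np.angle(X)
        noise = mag[:, :noise_frames].mean(axis=1, keepdims=True)
        clean = np.maximum(mag - noise, floor * mag)   # spectral floor
        _, y = istft(clean * np.exp(1j * phase), fs=fs, nperseg=nperseg)
        return y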


I read the suggestion as "applying deep learning to optimize the parameters of the process", not running it on the device itself.


For optimization of parameters you can go full ham and even apply genetic algorithms.

The problem is actually finding good "intelligibility" and "quality" metrics. Codec ones like PESQ are not good enough.
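
As a toy illustration, a GA over per-band gains might look like this; the fitness function is a stand-in for exactly the metric that doesn't exist yet (a real system would score processed audio with something like STOI or HASPI):

    import numpy as np

    rng = np.random.default_rng(0)
    N_BANDS, POP, GENS = 8, 30, 50

    def fitness(gains):
        """Placeholder score for a gain profile. In practice you would
        process audio with these gains and rate it with an
        intelligibility metric; the 'ideal profile' here is made up."""
        target = np.linspace(1.0, 3.0, N_BANDS)
        return -np.sum((gains - target) ** 2)

    pop = rng.uniform(0.0, 4.0, (POP, N_BANDS))
    for _ in range(GENS):
        scores = np.array([fitness(g) for g in pop])
        parents = pop[np.argsort(scores)[-POP // 2:]]   # keep best half
        children = parents + rng.normal(0, 0.1, parents.shape)  # mutate
        pop = np.vstack([parents, children])
    best = pop[np.argmax([fitness(g) for g in pop])]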


Man - I wish the ML side were completely solved, though I think we've actually figured most of the hardware out! Lmk when you want to chat next, Anthony - we've had some fun progress since last time!


Ha, maybe I exaggerate a little! But the ML part seems much more tractable than the hardware side to me! (Mostly because I don't know much about hardware and it seems, well, hard!) :-)


What if they made a CB radio for the home? When I lived with my mother, she had a tendency to ask for something from another room. An in-house Star-Trek-style communicator could help with that.


It's called a 'spouse mic'.


Mind sharing the guy's name? I love making small electronics.


Dwight Crow at whisper.ai. I do know they're hiring!



