
A while back I built an ad/not-an-ad classifier for TV frames based purely on pixels (i.e. not prior knowledge of specific ads). One of the interesting findings -- perhaps obvious in retrospect -- was that ad frames were on average brighter than content frames.
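
A rough sketch of how a per-frame brightness feature could be computed. The comment above doesn't say how brightness was measured, so the Rec. 601 luma weights and the `mean_luma` helper are assumptions for illustration, not the original classifier's method.

```python
def mean_luma(pixels):
    """Average luma (0-255) over an iterable of (r, g, b) tuples.

    Assumption: Rec. 601 weights stand in for "brightness"; the
    original classifier may have used something else entirely.
    """
    total = 0.0
    count = 0
    for r, g, b in pixels:
        total += 0.299 * r + 0.587 * g + 0.114 * b
        count += 1
    if count == 0:
        raise ValueError("frame has no pixels")
    return total / count


# Hypothetical two-pixel "frames": one bright, one dark.
bright_frame = [(250, 250, 250), (200, 220, 240)]
dark_frame = [(10, 10, 10), (40, 30, 20)]
print(mean_luma(bright_frame) > mean_luma(dark_frame))
```

On its own this is a weak signal; the finding was only that ad frames are brighter on average.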


I’m sure it’s out there, but I had a similar idea: an audio device plugin like Soundflower that you pipe audio through and that automatically mutes when it detects an ad. We really need an OpenWrt-like project for smart TVs so that you could run something like what you created to mute the sound and display art during ad breaks.


How about using a Pi-hole to send a webhook whenever a URL is blocked, and then using that hook to do the display art thing? That would be nice.


Point ad domains to 127.0.0.1 and run a local HTTPS server that displays art.


How accurate did you get? Are frame-to-frame probabilities pretty independent? I wonder if you can hit some arbitrary threshold by implementing something like "n frames in a row."
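
The "n frames in a row" idea could look something like this: only flip between ad and content after `n` consecutive frames disagree with the current state. This is a minimal sketch; the `debounce` name, the 0.5 threshold, and the default `n` are all assumptions, and it ignores the autocorrelation structure entirely.

```python
def debounce(probs, threshold=0.5, n=5):
    """Label each frame as ad (True) or content (False).

    The label only changes after n consecutive frames fall on the
    other side of the threshold, which suppresses one-off misfires.
    Starts in the "content" state (an arbitrary assumption).
    """
    state = False
    run = 0  # consecutive frames disagreeing with the current state
    labels = []
    for p in probs:
        candidate = p >= threshold
        if candidate != state:
            run += 1
            if run >= n:
                state = candidate
                run = 0
        else:
            run = 0
        labels.append(state)
    return labels


# Made-up probabilities: a single low frame in the middle of an ad run
# does not flip the label back to content.
probs = [0.9, 0.9, 0.9, 0.1, 0.9, 0.9, 0.9, 0.9, 0.9]
print(debounce(probs, n=3))
```

The trade-off is latency: the label lags every real transition by `n - 1` frames.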


The time series of frame probabilities is of course highly autocorrelated, but I never bothered to model that due to lack of interest. From memory, a simple single-frame classifier was >95% accurate (this was quite a few years ago, mind you). In my small-sample and not very rigorous testing, this was higher than human rates. Turns out that on average it's pretty hard to tell an ad from content just by looking at the pixels of a single frame.

I am quite confident that it wouldn't be too hard to build an ad detection ML model that would have near-perfect accuracy. That said, an approach based on algorithmically detecting repeated segments of lengths consistent with ad spots would work just as well, if not better.

P.S. One thing I thought was really interesting was that the classifier -- which was only ever shown a binary label (ad/not-an-ad) -- learnt an embedding that grouped together entire categories of things across TV networks and geographies (studio news, weather, traffic reports, etc.).


I don't know much about TV broadcasting, so sorry if this is dumb, but what's a segment length? I assume it is either: lengths of a particular shot, or (probably more likely) it is some extra information that is broadcast with the TV signal about how long a chunk of broadcast is?

I like the idea of looking at pixels, just because that's the sort of info that gets sent down the HDMI cable and will always be available.


There exist inline insertion signalling standards (e.g. SCTE-35). These could be used for the insertion of ads. However, none of this signalling typically makes it into the final over-the-air or cable broadcast and so is not useful for ad detection.

To your question on segment lengths: ad spots have specific, predefined durations. In the US these are typically 15s, 30s and 60s (sometimes 45s). This property could be exploited to detect ads. Consider, for example, a video segment that's exactly 30s in duration and is repeated many times over multiple TV channels. It is very likely to be an ad.
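
A sketch of the repeated-segment heuristic, assuming some upstream step has already cut the streams into segments and given each one a content fingerprint (that part is the hard bit and is not shown). The `likely_ads` function, the tuple layout, and the tolerance and repeat thresholds are all hypothetical.

```python
from collections import defaultdict

# Typical US ad spot lengths in seconds, per the comment above.
AD_LENGTHS_S = (15, 30, 45, 60)


def likely_ads(segments, tolerance_s=0.5, min_repeats=3, min_channels=2):
    """Return fingerprints of segments that look like ads.

    segments: iterable of (fingerprint, channel, duration_s) tuples.
    A fingerprint is flagged when its ad-length airings number at
    least min_repeats and span at least min_channels channels.
    """
    airings = defaultdict(list)
    for fingerprint, channel, duration_s in segments:
        if any(abs(duration_s - length) <= tolerance_s for length in AD_LENGTHS_S):
            airings[fingerprint].append(channel)
    return {
        fp
        for fp, channels in airings.items()
        if len(channels) >= min_repeats and len(set(channels)) >= min_channels
    }


# Made-up data: "a" repeats at ~30s on two channels, "b" repeats on only
# one channel, and "c" repeats widely but at a non-ad length.
segments = [
    ("a", "ch1", 30.0), ("a", "ch2", 30.1), ("a", "ch1", 29.9),
    ("b", "ch1", 30.0), ("b", "ch1", 30.0), ("b", "ch1", 30.0),
    ("c", "ch1", 22.0), ("c", "ch2", 22.0), ("c", "ch3", 22.0),
]
print(likely_ads(segments))
```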



