yliu's comments | Hacker News

Despite the title being about YouTube, this is fundamentally about Safari's declarative Content Blocker API being totally inadequate in the face of modern ad delivery technologies. Yes, it's fast and more secure than older ad blocking techniques (which require granting full access to effectively arbitrary JS), but ad tech has evolved since 2015 and Safari's Content Blocker API has not evolved with it.
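For context, a Safari content blocker compiles down to a static JSON list of trigger/action rules; there is no hook for running logic when a rule matches. A minimal sketch with hypothetical domains:

```json
[
  {
    "trigger": { "url-filter": "ads\\.example\\.com", "load-type": ["third-party"] },
    "action": { "type": "block" }
  },
  {
    "trigger": { "url-filter": ".*", "if-domain": ["example.com"] },
    "action": { "type": "css-display-none", "selector": ".ad-banner" }
  }
]
```

Everything the blocker will ever do has to be expressible in this list at compile time, which is exactly the limitation being discussed.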

With other browsers showing varying degrees of interest in declarative content blocking, Safari is worth studying as a warning of how declarative content blocking, if left unmaintained, cripples ad blocking for users.


This is basically the exact fear users expressed when Google announced that Chrome extensions would be required to use only declarative content blocking starting with Manifest v3 (which, anecdotally, convinced me to switch to Firefox).


If we go down that road, however, sites can make ads completely indistinguishable from desired content. Same domain, same stream, no easily marked container. All of the imperative adblocking tech in the world, short of queuing everything through a neural engine post-render, can't block what it can't distinguish.

So there has always been a detente between adblockers and publishers: so long as the former hit a small enough set of users, they were simply ignored. It seems that is no longer the case.


That would require delivering ads from first-party servers, right? So third-party ad and tracking networks would die a painful death.


More likely, they'd get upgraded to "first-party" tracking by acting as a CDN layer, where the ad networks do the proxying/caching to fetch the actual content upstream before merging it with the ads and serving the whole thing in a single request. How would you block ads from Cloudflare if it became an ad network?


The same way you do now with uBlock Origin: by removing specific elements which match a rule.
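For illustration, this is roughly what those rules look like in uBlock Origin's filter-list syntax (hypothetical host and selector):

```
! Network rule: block requests to the ad host
||ads.example.com^
! Cosmetic rule: hide a same-origin ad container
example.com##.sponsored-slot
```

The cosmetic rule works even when the ad is served first-party, as long as it sits in a targetable element.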

The networks will still be able to track you, but ads will be blockable until pages discard the DOM and switch to canvas-rendering the whole page.


Then you serve the ads on the same set of elements that also contain critical content to the user. You can't block ads from Twitch / YouTube with a rule if the ads are baked into the stream itself. Same goes for any other kind of "element" that doesn't explicitly set itself apart from the actual content.

Ads long ago started evolving away from a simple "here's an ad neatly placed into its own semantic container so that blockers can target it".


> You can't block ads from Twitch / YouTube with a rule if the ads are baked into the stream itself

You probably won't ever read this, but you actually can.

The pre/post-roll video ads have always been blockable with uBlock Origin, and the mid-stream adverts inserted by the content creators themselves can be skipped using SponsorBlock.


Couldn't they just be proxied through a first party server?


Some trackers are having people set up CNAME records on their domains, so the tracker cookies appear to be first-party:

https://arxiv.org/abs/2102.09301
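In DNS terms, the cloaking described in the paper looks roughly like this (hypothetical names): the publisher delegates a subdomain to the tracker, so the tracker's cookies ride on the publisher's own registrable domain:

```
; publisher's zone (hypothetical)
metrics.news-site.example.  300  IN  CNAME  abc123.tracker-cdn.example.
```

A blocker that only sees the request URL sees a first-party subdomain; only the DNS resolution reveals the tracker.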


uBlock Origin already performs CNAME decloaking and blocks this approach, it’s pretty cool.


For anyone else who wanted to know more like me, here's a good rundown: https://www.reddit.com/r/uBlockOrigin/comments/f8qnpc/ublock...

Note that CNAME uncloaking only works on Firefox; chromium-based browsers do not support the required API.


And for me this is one of the reasons - probably the biggest - that I don't want to buy an iPad: it doesn't allow running full-blown Firefox.

I've spent hours debating moving to an iPad instead of an Android tablet, and it comes down to 1. Lightning instead of USB-C (can't afford the iPad Pro), but OK, I can live with it, and 2. Firefox, which is just a blocker for me.


> uBlock Origin already performs CNAME decloaking and blocks this approach, it’s pretty cool.

... which in turn is a static list of domains that needs to be regularly updated, and therefore is not really failsafe. uBlock0 uses AdGuard's scraped dataset [1] as a fallback source to do this, as Chrome extensions cannot make DNS requests without a DNS-over-HTTPS endpoint.

Firefox, however, provides the `dns` API [2] to do requests via the native OS resolver (which in turn is also not failsafe, since it relies on unencrypted, trivially manipulable plain-old DNS-over-UDP requests).

[1] https://github.com/AdguardTeam/cname-trackers

[2] https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/Web...
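The uncloaking logic itself is straightforward once the CNAME chain is visible; a minimal Python sketch (hypothetical domains, with a dict standing in for a real resolver):

```python
# Hypothetical blocklist of known tracker domains
TRACKER_DOMAINS = {"tracker-cdn.example"}

# Stand-in for a DNS resolver's CNAME answers
CNAME_RECORDS = {
    "metrics.news-site.example": "abc123.tracker-cdn.example",
}

def uncloak(hostname, max_hops=10):
    """Follow CNAMEs to the canonical name, as a resolver API
    exposing the canonical name would."""
    seen = set()
    while hostname in CNAME_RECORDS and hostname not in seen and max_hops:
        seen.add(hostname)
        hostname = CNAME_RECORDS[hostname]
        max_hops -= 1
    return hostname

def is_cloaked_tracker(hostname):
    """Block if the canonical name falls under a known tracker domain."""
    canonical = uncloak(hostname)
    return any(canonical == d or canonical.endswith("." + d)
               for d in TRACKER_DOMAINS)

print(is_cloaked_tracker("metrics.news-site.example"))  # True
print(is_cloaked_tracker("cdn.news-site.example"))      # False
```

The hard part in practice is exactly what the comment describes: getting the CNAME chain at all from inside an extension.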


uBlock Origin on Firefox is able to perform CNAME uncloaking to block this shenanigan.


TBH that is the future. Tracking won't die; it will just evolve to become harder to block.


That would partially defeat the purpose. First-party ads see first-party cookies, so they have no inherent cross-site tracking ability.


And yet Facebook is already doing something along the same lines by adding fbclid=<trackingnumber> to every outbound link; the site that receives the link can then report back "I saw fbclid=<trackingnumber>". Sure, it makes third-party tracking require more trust, but wouldn't some analysis tell you if your client is trying to game your ads for revenue, etc.?
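That kind of link decoration is at least visible client-side, which is why some blockers strip known click IDs from URLs; a rough Python sketch (the parameter list is illustrative, not exhaustive):

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Illustrative set of well-known click-tracking parameters
TRACKING_PARAMS = {"fbclid", "gclid", "utm_source", "utm_medium", "utm_campaign"}

def strip_tracking_params(url):
    """Remove known click-ID query parameters, leaving the rest intact."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k not in TRACKING_PARAMS]
    return urlunsplit(parts._replace(query=urlencode(kept)))

print(strip_tracking_params("https://example.com/article?id=7&fbclid=AbC123"))
# https://example.com/article?id=7
```

Of course, nothing stops a site from folding the ID into the path instead, which is the arms race the thread is describing.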


Cross-site tracking via cookies (and 3rd-party cookies in general) has already been dead for years.


That would be more effort than including a single HTML script tag to import Google Analytics. I have hope that most parties would decide that the extra server load and difficulties make it not worth it.


You underestimate the desire for precision tracking, unfortunately. Hiding behind custom subdomains is common. Stepping up to cloaking it to be delivered from the application is more effort but it'll happen.


You overestimate the technical ability of publishers. Frankly, it's pathetic to rely on a third party's hotlinked JavaScript, but no one can be arsed to understand how it works, so they just add the tags to GTM instead of realizing that they could trivially implement A/B testing or whatever themselves.


If you, as the ad network, don't connect to the end user directly, how can you be sure you're not being defrauded by the site owner?


Still more effort than before. Twiddling with server configs requires more work than inserting a JS snippet.


I wonder if Google’s Web Packaging standard was intended to eventually make it possible to deliver both the page and the ads from the same server without enabling one party to tamper with the other.


While this is an obvious route you could take, it would be a significant jump from the current behaviour: content has to be signed by the owner of the domain that packaged it, it is treated as if it were served by that domain (on a given scheme/port), and the same-origin policy applies as normal, so the ads would continue to be treated as third-party.

It _does_ potentially allow performance gains, insofar as you're then able to send a single bundle containing both first and third party content, but it isn't a gain from the point-of-view of avoiding adblockers (aside from the most primitive DNS/IP level ones).


...That's what YouTube does. Ad video content comes from redirector.gvt.com -> xxx.googlevideo.com just like solicited video content.

(Yes, I classify advertising as spam.)


But somehow it’s still blockable. I don’t see any ads in YT on Firefox + Ublock.


You can also reverse proxy it through your server. I did that once for Google Analytics on a demo site.
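For the curious, such a reverse proxy is a one-stanza job in nginx; a sketch with a hypothetical /ga/ prefix (the analytics script's own URLs may also need rewriting, and HTTPS upstreams may need extra SSL directives):

```nginx
# Serve Google Analytics from your own origin (hypothetical /ga/ prefix)
location /ga/ {
    proxy_pass https://www.google-analytics.com/;
    proxy_set_header Host www.google-analytics.com;
}
```

From the blocker's point of view, every request now goes to the publisher's own domain.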


> If we go down that road, however, sites can make ads completely indistinguishable from desired content.

This is exactly the reason why I'm building a web browser with a statistical representation of both the DOM/CSS layout _and_ the network traffic, so that neural networks can be trained to classify ads and malicious actors.

There are a lot of networking requirements for such a peer-to-peer system to work, like consensus on DNS/CNAME/PTR records or consensus on TLS cert validity.

But I honestly believe this is unavoidable in the near future, given that most browsers these days are just a Chrome/Chromium shim, where Google's business model obviously conflicts with the idea of blocking ads.


I think that, by law, ads have to be declared as such to users.


That works in theory but fails in practice. Most ads will just have an "Opinion" tag stapled on them.


Blocking YouTube ads requires injecting JS; it has nothing to do with Manifest v3.


A major justification for removing the previous content blocking API was that it could be used to do things like inject JS. So clearly the intention is to have content blocking extensions not do that at all. Although in this specific case, it might still be possible.

AdGuard and uBO for example use the content blocking API to inject blocking "scriptlets" on sites where this kind of thing is required. That kind of usage is made much more inconvenient with Manifest v3.


I don't think so. Injecting JS is a valid use-case and will work forever; probably a huge majority of extensions do that. One intention is to make content blocking extensions more performant.

It's very easy to inject JS. I don't know whether you're talking from your own experience, but I wrote my own little extension to replace uBlock (with my own list of rules and blocks), and to inject JS or CSS you just have to add a line in manifest.json, which has nothing to do with the blocking API.
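For reference, static JS injection really is just a manifest entry, entirely separate from the blocking APIs; a minimal hypothetical MV3 manifest:

```json
{
  "manifest_version": 3,
  "name": "tiny-blocker",
  "version": "1.0",
  "content_scripts": [
    {
      "matches": ["<all_urls>"],
      "js": ["inject.js"],
      "run_at": "document_start"
    }
  ]
}
```

The catch, as noted below, is that this injects the same script everywhere; it doesn't let you pick snippets per page from a filter list.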


See here where Justin Schuh says the sole motivation is for privacy reasons: https://twitter.com/justinschuh/status/1134092257190064128

I know it is easy to inject JS and that you can do it with the manifest file. But without the old content blocking API you can't dynamically inject different snippets on different pages based on filter lists for example (unless you inject something on every page).

I wouldn't be surprised if in the future, content blocking extensions won't be allowed in the store if they use such broad permissions for example.


Well, to be fair, not really. Scriptlets will continue to work just fine.

To be completely honest, Manifest V3 technically is not THAT bad, and its capabilities at the current moment are really close to what major ad blockers can do.

There are still some things that bother me:

1. Debugging a content blocker is really inconvenient (not as bad as Safari though).

2. The future. What if its development stalls after it's released?

3. Google's goal (probably, for Manifest V4) is to make content blocking completely declarative, i.e. get rid of any host permissions and content scripts.
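For concreteness, a Manifest V3 declarativeNetRequest static rule looks like this (hypothetical host); note there is no callback in which an extension could run its own matching logic:

```json
[
  {
    "id": 1,
    "priority": 1,
    "action": { "type": "block" },
    "condition": {
      "urlFilter": "||ads.example.com^",
      "resourceTypes": ["script", "image", "xmlhttprequest"]
    }
  }
]
```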


I use Wipr as a content blocker on both macOS and iOS. I never see ads on YouTube. But I've always felt that it might not be enough some day. Perhaps that day is nearly here.


This is a recent change by YouTube. Wipr uses the same content blocking API as Adguard and has the same limitations:

https://giorgiocalderolla.com/wipr-faq.html#youtube


I use Wipr, I'm in Australia and on Catalina, and I now see ads on YouTube, even when I update. Now I understand why.


Interesting. I tried Wipr on macOS Safari for a couple weeks about a year ago, and there was effectively no ad blocking on YouTube, Twitch, or Twitter, which is coincidentally where I spend the bulk of my time.

It was a frustrating experience. I've tried Safari multiple times over the years since it is so much better on battery life, but Chrome always wins in usability and adblocking.

I switched back to Chrome + uBlock Origin and could use those sites ad-free again. Well, except for Twitch, since they found a workaround for adblockers last year.


Sounds like you would be interested in Orion. https://browser.kagi.com


I use Wipr and I just saw an ad today for the first time on YouTube in Safari. Usually it throws up an error, you refresh, and the video plays, but today it was an error, then a skippable ad.


I've been using Wipr. I've been getting the white placeholder screen for just about a year now. Every once in a while actual ads get through, before a new update fixes it.


In response to this article, I've disabled it in favor of the standalone AdGuard app. So far I haven't gotten either symptom.


Note that YT changes aren’t yet rolled out everywhere. Also, if you’re not authorized there’ll be no issues, but it won’t stay like that forever.


I'm not authorized and started getting the new ads about a week ago (with wipr).


AdGuard requires an Electron app (wut?) running in the background to provide much of its functionality; is Wipr like this too?


No. Wipr is a clean content blocker. I don’t know how it handles auto-updates of the blocklist, but I definitely don’t need to keep the app open nor does it add buttons to the interface like AdGuard does.


Running the app is not mandatory, and neither is adding buttons to the interface. If you just need the content blockers, you can simply enable them, close the app, and forget about it until you feel the need to check for filter updates.


Last I checked, selecting elements to block also required running the Electron app, which is somewhat annoying and unnecessary.


I think you’re right from the short-term perspective but largely irrelevant long-term. If Safari allowed arbitrary code execution, it’d be a little better for as long as it took publishers to deploy first-party ad injection. We’d still get the security problems, though.

You can already see what that’s like with podcasts where local ads are spliced right into the audio file. You’re not stopping that short of doing something like buffering the content and running it through an AI, and if that became widespread we’d just see more embedded placement (“Hey, protagonist, why are you so irresistibly sexy?” “It’s these new briefs from My Undies”).

Adtech is a multi-billion-dollar industry, and the people making the content you want are enthusiastically supporting it. This is not a problem which technical tricks can solve – as soon as you do something effective, Google can deploy hundreds of engineers with huge resource budgets to foil you. That won't change without something like regulatory changes to lower the financial pressure.


> if that became widespread we’d just see more embedded placement

Which is completely desirable. The problem isn't "ads", it's "targeted, personalized ads that rely on thoroughly destroying the privacy of everyone on the internet in order to function". If a show/podcast wants to vet its own advertisers and endorse a specific product, that's great; it establishes a concrete relationship with the advertisers that has more value to both users and content creators than the anonymous, unvettable system of opaque middlemen currently peddled by targeted ad networks.


That's one option, but it's not what's happening. Historically, ads were easily blocked because they came from different domains; as we're seeing now, increased deployment of blockers has led to things like CNAME cloaking or even first-party hosting. The amount of money at play is enough that they're going to keep trying more invasive approaches as the old ones become less profitable.

The podcasts I mentioned aren't running their own ad network; they're using a service which injects audio segments into your download. I'd expect things like that to become more common as ad revenues decline, with an endgame something like CDNs inserting tailored content directly, to avoid any hostnames or paths which are easy to block.


> Historically ads were easily blocked

Historically, ads were served by the site owner at their own discretion. Before that, ads were served by TV and radio channels. None of those approaches were easy to block.

Dedicated ad networks on separate domains are a relatively recent fad (since ~15 years ago). A lot of websites still ship first-party ads; many never stopped.


First-party ads are also part of many creators' content (e.g. "this video is sponsored by NordShadowraid Wallet"), and currently the only way to block them is via crowdsourcing (e.g. the SponsorBlock addon).


If I can recognize an ad, I can construct JavaScript that can recognize that ad too. The current extension APIs let me inject that JavaScript, while the declarative ad blocking APIs do not.
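As a toy illustration of the kind of imperative recognition a static rule list cannot express (heuristics and markup entirely hypothetical):

```python
import re

# Toy heuristic: score an element's attributes for ad-likeness
AD_HINTS = re.compile(r"(sponsor|promoted|advert|banner)", re.IGNORECASE)

def looks_like_ad(attrs):
    """attrs: dict of an element's attributes, e.g. gathered from a DOM walk."""
    score = 0
    for key in ("class", "id", "data-testid", "aria-label"):
        if AD_HINTS.search(attrs.get(key, "")):
            score += 1
    # iframes pointing at opaque external paths are also suspicious
    if attrs.get("tag") == "iframe" and "src" in attrs:
        score += 1
    return score >= 2

print(looks_like_ad({"tag": "div", "class": "promoted-tweet",
                     "aria-label": "Promoted"}))          # True
print(looks_like_ad({"tag": "div", "class": "article-body"}))  # False
```

A declarative API has no place to run logic like this; an injected script can run it on every mutation of the page.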


This is a constant arms race, as anyone who's looked at Facebook's DOM knows, and if you're successful it pushes toward the end state I mentioned: ads becoming very similar to the content. The companies which depend on ad revenue aren't going to go out of business voluntarily, and many of them will find alternative paths to those ad dollars.


> where local ads are spliced right into the audio file. You’re not stopping that short of doing something like buffering the content and running it through an AI

This can be solved by crowdsourcing it: https://sponsor.ajay.app/


Currently, to some extent. Again, my point is that there’s a ton of money at stake and it’s not like companies are going to say “welp, someone blocked our ads, time to close up shop”. Each time blockers have gotten better, all that’s happened has been the ad delivery systems getting more sophisticated — and since the providers can run the same tools I don’t think that’s going to change. Containing some of the damage by, for example, continuing to restrict JavaScript at least has some benefits but things like YouTube ads are the same format and delivery path.


It's a cat-and-mouse game, but currently ad blockers win every battle.


Ray Tomlinson's design for email came in the 70s. RFC 788 (SMTP) was published in 1981.

Email predates the Web, and, imo, has been made much worse by all the Web-adjacent features shoved into it.


Not sure if "better commenting", or just another attempt to integrate G+ into every Google-owned property.

The part about floating "engaged discussions" to the top is interesting. One hopes the algorithm distinguishes between "ongoing informative discussion" and "blazingly active flamewar".


Can we stop claiming that Google+ can absolutely not be usefully implemented in other Google products?


Nice strawman. Did I claim at any point that g+ is "absolutely not [useful]"? No need to start white-knighting for Google.


"Not sure if "better commenting", or just another attempt to integrate G+ into every Google-owned property."

"Nice strawman."


Nice affirming a disjunct.


Is there any sort of proof that this "optimization" actually contributes to performance? The state of compiler optimization being what it is, I find it hard to imagine flag tweaking beyond the default -O3 or whatever can make an actual significant difference. The secretiveness of his build seems designed to obfuscate third-party replication of his results.

Seems more a mouthpiece for this fellow's self-aggrandizement than anything else.


Something like this is long overdue, I think. Great work. Open academic publication models have had a difficult time for a number of reasons, but systems like this are very helpful in making the case for openness.

One thing that always bothers me with a purely Reddit-style, point-based system for surfacing academic discussions across domains, though, is that it's unclear what kind of papers are being surfaced: a very good paper in a very niche space may not get the attention that a mediocre paper written for a mass audience (for some definition of "mass") would. Is that an acceptable drawback for openjournal? Or should there be some way for niche papers to gain exposure? Forking openjournal and making your own "sub-openjournal" for your research domain? Weighted voting mechanisms?

Also, like reddit, it might be useful to have a mechanism to demonstrate, emphasize, and/or sort by specific commenters' backgrounds, training, and credentials. For many domains, peer review and commentary from people in the same field might be more useful than general commentary.

As a minor wish, I've always wanted to see a mechanism for encouraging sharing of implementations, test code, and other raw experimental results along with the actual papers. 'Cause really, for most cases, I'm not going to implement a multi-page algorithm just to verify a conclusion or make use of an insight. But if I can fork and compile a github repo associated with the paper...

