Despite the title being about YouTube, this is fundamentally about Safari's declarative Content Blocker API being totally inadequate in the face of modern ad delivery technologies. Yes, it's fast and more secure than older ad blocking techniques (which require granting full access to effectively arbitrary JS), but ad tech has evolved since 2015 and Safari's Content Blocking API has not evolved with it.
With other browsers showing varying degrees of interest in declarative content blocking, it's worth looking at Safari as a warning of what declarative content blocking, if unmaintained, will do to cripple ad blocking for users.
This is basically the exact fear users expressed when Google announced that Chrome extensions would be required to use only declarative content blocking starting with Manifest v3 (which, anecdotally, convinced me to switch to Firefox).
If we go down that road, however, sites can make ads completely indistinguishable from desired content. Same domain, same stream, no easily marked container. All of the imperative adblocking tech in the world, short of queuing everything through a neural engine post-render, cannot block ads that are indistinguishable from the content itself.
So there has always been a detente between adblockers and publishers, so long as the former reached a small enough set of users that they could simply be ignored. It seems that is no longer the case.
More likely, they'd get upgraded to "first-party" tracking by acting as a CDN layer, where the ad networks do the proxying/caching to fetch the actual content upstream before merging it with the ads and serving the whole thing in a single request. How would you block ads from Cloudflare if it became an ad network?
Then you serve the ads on the same set of elements that also contain critical content to the user. You can't block ads from Twitch / YouTube with a rule if the ads are baked into the stream itself. Same goes for any other kind of "element" that doesn't explicitly set itself apart from the actual content.
Ads long ago started evolving away from a simple "here's an ad neatly placed in its own semantic container so that blockers can target it".
> You can't block ads from Twitch / YouTube with a rule if the ads are baked into the stream itself
You probably won't ever read this, but you actually can.
The pre/post video ads have always been blockable with uBlock Origin, and the mid-stream adverts inserted by the content creators themselves can be skipped using SponsorBlock.
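The core idea behind crowdsourced segment skipping is simple: given a list of reported sponsor segments, jump the playhead past whichever one the current time falls inside. A minimal sketch of that logic (a hypothetical helper, not SponsorBlock's actual code):

```javascript
// Given crowdsourced [start, end] sponsor segments (in seconds),
// return where the playhead should be: past the segment if we're
// inside one, unchanged otherwise.
function nextPlayheadPosition(currentTime, segments) {
  for (const [start, end] of segments) {
    if (currentTime >= start && currentTime < end) {
      return end; // skip to the end of the sponsor segment
    }
  }
  return currentTime; // not inside a segment, play normally
}
```

In an extension this would run on the video element's `timeupdate` event, seeking forward whenever the returned position differs from the current one.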
And for me this is one of the reasons - probably the biggest - that I don't want to buy an iPad: it doesn't allow running full-blown Firefox.
I've spent hours debating moving to an iPad instead of an Android tablet, and it comes down to: 1. Lightning instead of USB-C (can't afford the iPad Pro), but OK, I can live with that; and 2. Firefox, which is just a blocker.
> uBlock Origin already performs CNAME decloaking and blocks this approach, it’s pretty cool.
... which in turn is a static list of domains that needs to be regularly updated, and therefore is not really failsafe. uBlock0 uses AdGuard's scraped dataset [1] as a fallback source to do this, as Chrome extensions cannot make DNS requests without a DNS-over-HTTPS endpoint.
Firefox, however, provides the `dns` API [2] to do requests via the native OS resolver (which in turn is also not failsafe, since it relies on unencrypted, plain-old-manipulable DNS UDP requests).
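The decloaking itself is conceptually simple: resolve the hostname's canonical name, then check whether the uncloaked name lands on a known tracker domain. A sketch (the tracker domain names here are invented; the `browser.dns.resolve` call is Firefox-only and shown as a comment rather than executed):

```javascript
// In a Firefox extension, the canonical name would come from:
//   const { canonicalName } =
//     await browser.dns.resolve(hostname, ["canonical_name"]);
//
// Suffix-match the uncloaked name against a tracker blocklist:
function isCloakedTracker(canonicalName, trackerDomains) {
  return trackerDomains.some(
    (d) => canonicalName === d || canonicalName.endsWith("." + d)
  );
}
```

So a publisher's `metrics.news-site.example` that CNAMEs to a tracker resolves to the tracker's real domain, and the request can be blocked even though the visible hostname is first-party.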
And yet Facebook is already doing something along the same lines by adding fbcid=<trackingnumber> to every outbound link; the site that receives the link can then report back "I saw fbcid=<trackingnumber>". Sure, it makes third-party tracking require more trust, but wouldn't some analysis tell you if your client is trying to game your ads for revenue, etc.?
That would be more effort than including a single HTML script tag to import Google Analytics. I have hope that most parties would decide the extra server load and difficulties make it not worth it.
You underestimate the desire for precision tracking, unfortunately. Hiding behind custom subdomains is common.
Stepping up to cloaking it to be delivered from the application is more effort but it'll happen.
You overestimate the technical ability of publishers. Frankly, it's pathetic to rely on a third party's hotlinked JavaScript, but no one can be arsed to understand how it works, so they just add the tags to GTM instead of realizing that they could trivially implement A/B testing or whatever themselves.
I wonder if Google’s Web Packaging standard was intended to eventually make it possible to deliver both the page and the ads from the same server without enabling one party to tamper with the other.
While this is an obvious route you could take, it is a significant jump from the current behaviour: the content has to be signed by the owner of the domain that packaged it, is treated as if it was served by that domain (on a given scheme/port), and the same-origin policy applies as normal, so the ads would continue to be treated as third-party.
It _does_ potentially allow performance gains, insofar as you're then able to send a single bundle containing both first and third party content, but it isn't a gain from the point-of-view of avoiding adblockers (aside from the most primitive DNS/IP level ones).
> If we go down that road, however, sites can make ads completely indistinguishable from desired content.
This is exactly the reason why I'm building a web browser with a statistical representation of both the DOM/CSS Layout _and_ the network traffic, so that neural networks can be trained on classifying ads and malicious actors.
There are a lot of networking requirements for such a peer-to-peer system to work, like consensus on DNS/CNAME/PTR records or consensus on TLS cert validity.
But I honestly believe this is unavoidable in the near future, given that most browsers these days are just Chrome/Chromium shims, where Google's business model obviously conflicts with the idea of blocking ads.
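For a sense of what "statistical representation of the network traffic" might mean in practice, here is a toy feature extractor over a request descriptor; every feature and threshold is invented for illustration (a real system would learn these rather than hard-code them):

```javascript
// Turn a request descriptor into a small numeric feature vector
// that a classifier could consume. All heuristics are made up.
function requestFeatures(req) {
  return [
    req.thirdParty ? 1 : 0,                            // cross-origin?
    /\b(ads?|track|pixel)\b/.test(req.url) ? 1 : 0,    // suspicious URL tokens
    req.sizeBytes < 1000 ? 1 : 0,                      // tiny, beacon-like payload
    req.type === "script" ? 1 : 0,                     // active content
  ];
}
```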
A major justification for removing the previous content blocking API was that it could be used to do things like inject JS. So clearly the intention is to have content blocking extensions not do that at all. Although in this specific case, it might still be possible.
AdGuard and uBO for example use the content blocking API to inject blocking "scriptlets" on sites where this kind of thing is required. That kind of usage is made much more inconvenient with Manifest v3.
I don't think so. Injecting JS is a valid use-case and will work forever; probably a huge majority of extensions do that. One intention is to make content blocking extensions more performant.
It's very easy to inject JS. I don't know whether you're talking from your own experience, but I wrote my own little extension to replace uBlock (with my own list of rules and blocks), and to inject JS or CSS you just have to add a line in manifest.json, which has nothing to do with the blocking API.
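For reference, the manifest entry being described looks roughly like this (extension name and file names are made up; this is the standard `content_scripts` key, which predates and is independent of the blocking APIs):

```json
{
  "name": "my-tiny-blocker",
  "manifest_version": 2,
  "version": "1.0",
  "content_scripts": [
    {
      "matches": ["<all_urls>"],
      "js": ["inject.js"],
      "css": ["hide-ads.css"]
    }
  ]
}
```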
I know it is easy to inject JS and that you can do it with the manifest file. But without the old content blocking API you can't dynamically inject different snippets on different pages based on filter lists for example (unless you inject something on every page).
I wouldn't be surprised if in the future, content blocking extensions won't be allowed in the store if they use such broad permissions for example.
Well, to be fair, not really. Scriptlets will continue to work just fine.
To be completely honest, Manifest V3 technically is not THAT bad, and its capabilities at the moment are really close to what major ad blockers can do.
There are still some things that bother me:
1. Debugging a content blocker is really inconvenient (not as bad as Safari though)
2. The future. What if its development stalls after it's released?
3. Google's goal (probably, for Manifest V4) is to make content blocking completely declarative, i.e. get rid of any host permissions and content scripts.
I use Wipr as a content blocker on both macOS and iOS. I never see ads on YouTube. But I've always felt that it might not be enough some day. Perhaps that day is nearly here.
Interesting. I tried Wipr on macOS Safari for a couple weeks about a year ago, and there was effectively no ad blocking on YouTube, Twitch, or Twitter, which is coincidentally where I spend the bulk of my time.
It was a frustrating experience. I've tried Safari multiple times over the years since it's so much better on battery life, but Chrome always wins on usability and adblocking.
I switched back to Chrome + uBlock Origin and could use those sites ad-free again. Well, except for Twitch, since they found a workaround for adblockers last year.
I use Wipr and I just saw an ad today for the first time on YouTube in Safari. Usually it throws up an error, you refresh, and the video plays, but today it was an error, then a skippable ad.
I've been using Wipr. I've been getting the white placeholder screen for just about a year now. Every once in a while actual ads get through, before a new update fixes it.
No. Wipr is a clean content blocker. I don’t know how it handles auto-updates of the blocklist, but I definitely don’t need to keep the app open nor does it add buttons to the interface like AdGuard does.
Running the app is not mandatory and neither is adding buttons to the interface. If you just need content blockers, you can simply enable just them, close the app and forget about it until you feel the need to check filters updates.
I think you’re right from the short-term perspective but largely irrelevant long-term. If Safari allowed arbitrary code execution, it’d be a little better for as long as it took publishers to deploy first-party ad injection. We’d still get the security problems, though.
You can already see what that’s like with podcasts where local ads are spliced right into the audio file. You’re not stopping that short of doing something like buffering the content and running it through an AI, and if that became widespread we’d just see more embedded placement (“Hey, protagonist, why are you so irresistibly sexy?” “It’s these new briefs from My Undies”).
Adtech is a multi billion dollar industry and the people making the content you want are enthusiastically supporting them. This is not a problem which technical tricks can solve – as soon as you do something effective, Google can deploy hundreds of engineers with huge resource budgets to foil you. That won’t change without something like regulatory changes to lower the financial pressure.
> if that became widespread we’d just see more embedded placement
Which is completely desirable. The problem isn't "ads", it's "targeted, personalized ads that rely on thoroughly destroying the privacy of everyone on the internet in order to function". If a show/podcast wants to vet its own advertisers and endorse a specific product, that's great; it establishes a concrete relationship with the advertisers that has more value to both users and content creators than the anonymous, unvettable system of opaque middlemen currently peddled by targeted ad networks.
That's one option but it's not what's happening. Historically ads were easily blocked because they came from different domains; as we're seeing now, increased deployment of blockers has led to things like CNAME cloaking or even first-party hosting. The amount of money at play is enough that they're going to keep trying more invasive approaches as the old ones become less profitable.
The podcasts I mentioned aren't running their own ad network; they're using a service which injects audio segments into your download. I'd expect things like that to become more common as ad revenues decline, with an endgame something like CDNs inserting tailored content directly to avoid any hostnames or paths which are easy to block.
Historically ads were served by the site owner at their own discretion. Prior to that ads were served by TV and radio channels. None of those approaches were easy to block.
Dedicated ad networks on separate domains are a relatively recent fad (since ~15 years ago). A lot of websites still ship first-party ads; many never stopped.
First-party ads are also part of many content creators' output (e.g. "this video is sponsored by NordShadowraid Wallet"), and currently the only way to block them is via crowdsourcing (e.g. the SponsorBlock addon).
If I can recognize an ad, I can construct JavaScript that can recognize that ad too. The current extension APIs let me inject that JavaScript, while the declarative ad blocking APIs do not.
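To make that point concrete, the recognition logic is just ordinary code over the DOM. A toy heuristic over an element descriptor (class names, URL substrings, and thresholds all invented for illustration; real blockers use maintained filter lists instead):

```javascript
// Flag an element descriptor as a likely ad based on crude
// heuristics over its class names and resource URL.
function looksLikeAd(el) {
  const suspectClass = /(^|[-_\s])(ads?|sponsor|promo)([-_\s]|$)/i;
  const suspectSrc = /doubleclick|adservice/;
  return (
    suspectClass.test(el.className || "") ||
    suspectSrc.test(el.src || "")
  );
}
```

An injected content script would run something like this over candidate elements and hide the matches, which is exactly the dynamic behavior a purely declarative API cannot express.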
This is a constant arms race, as anyone who’s looked at Facebook’s DOM knows, and if you’re successful it pushes to the end state I mentioned of ads becoming very similar to the content. The companies which depend on ad revenue aren’t going to go out of business voluntarily and many of them will find alternative paths to those ad dollars.
> where local ads are spliced right into the audio file. You’re not stopping that short of doing something like buffering the content and running it through an AI
Currently, to some extent. Again, my point is that there’s a ton of money at stake and it’s not like companies are going to say “welp, someone blocked our ads, time to close up shop”. Each time blockers have gotten better, all that’s happened has been the ad delivery systems getting more sophisticated — and since the providers can run the same tools I don’t think that’s going to change. Containing some of the damage by, for example, continuing to restrict JavaScript at least has some benefits but things like YouTube ads are the same format and delivery path.
Not sure if "better commenting", or just another attempt to integrate G+ into every Google-owned property.
The part about floating "engaged discussions" to the top is interesting. One hopes the algorithm distinguishes between "ongoing informative discussion" and "blazingly active flamewar".
Is there any sort of proof of this "optimization" actually contributing to performance? The state of compiler optimization being what it is, I find it hard to imagine flag tweaking from the default -O3 or whatever can make an actual significant difference. The secretiveness of his build seems designed to prevent third-party replication of his results.
Seems more a mouthpiece for this fellow's self-aggrandizement than anything else.
Something like this is long overdue, I think. Great work. Open academic publication models have had a difficult time for a number of reasons, but systems like this are very helpful in making the case for openness.
One thing that always bothers me with a purely Reddit-style, point-based system for surfacing academic discussions across domains, though, is that it's unclear what kind of papers are being surfaced: a very good paper in a very niche space may not get the attention that a mediocre paper written for a mass audience (for some definition of "mass") would. Is that an acceptable drawback for openjournal? Or should there be some way for niche papers to gain exposure? Forking openjournal and making your own "sub-openjournal" for your research domain? Weighted voting mechanisms?
Also, like reddit, it might be useful to have a mechanism to demonstrate, emphasize, and/or sort by specific commenters' backgrounds, training, and credentials. For many domains, peer review and commentary from people in the same field might be more useful than general commentary.
As a minor wish, I've always wanted to see a mechanism for encouraging sharing of implementations, test code, and other raw experimental results along with the actual papers. 'Cause really, for most cases, I'm not going to implement a multi-page algorithm just to verify a conclusion or make use of an insight. But if I can fork and compile a github repo associated with the paper...