Probably overkill for content moderation, I'd think. You can identify bad words looking only at audio, and you can probably do nearly as good a job of identifying violence and nudity by examining still images. And at YouTube scale, I imagine the main problem with moderation isn't so much being correct as scaling. statista.com (what's up with that site, anyway?) suggests that YouTube adds something like 8 hours of video per second. I didn't run the numbers, but I'm pretty sure that's way too much to cost-effectively throw something like Gemini Pro at.
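Running the numbers on that upload rate is quick (a back-of-envelope sketch in Python; the 8 hours/second figure is the statista number quoted above, everything else is just arithmetic):

    # Back-of-envelope: annual upload volume at ~8 hours of video per second.
    SECONDS_PER_YEAR = 60 * 60 * 24 * 365        # ~31.5 million
    UPLOAD_RATE_HOURS_PER_SEC = 8                # statista figure quoted above

    hours_per_year = UPLOAD_RATE_HOURS_PER_SEC * SECONDS_PER_YEAR
    print(f"{hours_per_year:,} hours of video uploaded per year")  # ~252,288,000

So roughly a quarter of a billion hours of new video every year would need to go through the model.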
Or... Google supplies some kind of local LLM tool that processes your videos before they're uploaded. You pay for the GPU/electricity costs. Obviously this would need to be done in a way that can't be hacked/manipulated. It might need to be tightly integrated with a backend service that manages the analyzed frames from the local machine and verifies hashes/tokens after the video is fully uploaded to YouTube.
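Something like the following handshake, maybe (a rough sketch, not anything that exists in the YouTube API; every name here is hypothetical): the local tool fingerprints the exact frames it analyzed, and the backend re-extracts and re-hashes those frames after the upload completes, trusting the local verdict only if the fingerprints match.

    import hashlib

    def fingerprint_frames(frames: list[bytes]) -> str:
        # Hash the raw frame bytes the local model actually analyzed.
        h = hashlib.sha256()
        for frame in frames:
            h.update(frame)
        return h.hexdigest()

    # Client side (hypothetical): run the local model, then report its verdict
    # together with a fingerprint of the frames it saw.
    def client_report(frames: list[bytes], local_verdict: dict) -> dict:
        return {"fingerprint": fingerprint_frames(frames), "verdict": local_verdict}

    # Server side (hypothetical): after the full upload arrives, re-extract the
    # same frames, re-hash them, and only accept the client's verdict on a match.
    def server_accepts(report: dict, reextracted_frames: list[bytes]) -> bool:
        return report["fingerprint"] == fingerprint_frames(reextracted_frames)

Of course this only proves the hashed frames match the upload; keeping the local model itself from being tampered with on the user's machine is the genuinely hard part.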
I guess it could also be prioritized by views per time period to optimize better. If the video is interesting, people will share it and more views will happen quickly.
People assume that we can scale the capabilities of LLMs indefinitely; I, on the other hand, strongly suspect we are getting close to diminishing-returns territory.
There's only so much you can do by guessing the next probable token in a stream. We will probably need something else to achieve what people think will soon be done with LLMs.
Much like Elon Musk is probably realizing that computer vision alone is not enough for full self-driving, I expect we will soon reach the limits of what can be done with LLMs.
Content moderation is one of the hardest tasks we have at hand: we're burning through human souls who look at god-awful stuff and lose their sanity, because simple filters just won't cut it.
For instance, right now many rules exclude all nudity and the false-positive rate is through the roof, while some of that nudity should actually be allowed; the rule itself is harmful and should ideally be changed.
Even with our current simplistic rules I don't see automatic filters doing their job ("let me talk to a human" is our collective cry for help). With more sensible rules ("nudity is OK when not sexualized, but not of minors, except for babies, if the viewer's country allows for it"), I assume the resources and tuning needed to make that work in an automated system would be of epic scale.
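Even that one "more sensible" rule turns into a surprisingly gnarly decision function once written down (a toy sketch; all fields and the age cutoff are invented for illustration):

    from dataclasses import dataclass

    @dataclass
    class Clip:
        has_nudity: bool
        is_sexualized: bool
        subject_age: int                  # estimated, itself an error-prone signal
        viewer_country_allows_nudity: bool

    def nudity_allowed(clip: Clip) -> bool:
        # Toy encoding of: "nudity is OK when not sexualized, but not of minors,
        # except for babies, if the viewer's country allows for it".
        if not clip.has_nudity:
            return True
        if not clip.viewer_country_allows_nudity:
            return False
        if clip.is_sexualized:
            return False
        if 2 < clip.subject_age < 18:     # minors, except babies (cutoff invented)
            return False
        return True

And every one of those booleans is itself a hard perception problem with its own false-positive rate, which is where the epic-scale tuning would go.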
That’s only 8 calls with a full context window per second. If that costs so much it makes Google do a double take, then maybe these AI things are just too expensive.
If it costs $1 per call, then over a year the entire perfect moderation of YouTube would cost roughly $250M. That seems sort of reasonable?
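The arithmetic behind that figure (assuming one $1 call per hour of uploaded video; the $1 is a placeholder, not a published Gemini price):

    CALLS_PER_SECOND = 8                  # one full-context call per hour of uploaded video
    SECONDS_PER_YEAR = 60 * 60 * 24 * 365
    COST_PER_CALL = 1.00                  # assumed, not an actual price

    annual_cost = CALLS_PER_SECOND * SECONDS_PER_YEAR * COST_PER_CALL
    print(f"${annual_cost:,.0f} per year")   # ~$252,288,000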
But it's probably pointless for most videos that are never watched by anyone other than the uploader, so maybe you only run it before anyone else watches the video and cut your costs by 50+%.
They do “moderate” videos never watched by anyone and it can be totally ridiculous. I had a private channel where I had uploaded a few hundred screen recordings (some of them video conferences) over a year or two, all set to private and never shared with anyone. One day the channel was suddenly taken down because it violated their policy on “impersonation”… Of course the dispute I’m allegedly entitled to was never answered.
I have no idea how YouTube currently moderates its content, but there may be some benefit with Gemini. I'm sure Googlers have been considering this option.