Over the last ~15 years I have been shocked by the amount of spam on social networks that could have been caught with a Bayesian filter. Or in this case, a fairly simple regex.
Well, large companies/corporations don't care about Spam because they actually benefit from spam in a way as it boosts their engagement ratio
It just doesn't have to be spammed enough that advertisers leave the platform and I think that they sort of succeed in doing so.
Think about it, if Facebook shows you AI slop ragebait or any rage-inducing comment from multiple bots designed to farm attention/for malicious purposes in general, and you fall for it and show engagement to it on which it can show you ads, do you think it has incentive to take a stance against such form of spam
> Well, large companies/corporations don't care about Spam because they actually benefit from spam in a way as it boosts their engagement ratio
I'm not sure that's actually true. It's just that at scale this is still a hard problem that you don't "just" fix by running a simple filter as there will be real people / paying customers getting caught up in the filter and then complain.
Having "high engagement" doesn't really help you if you are optimizing for advertising revenue, bots don't buy things so if your system is clogged up by fake traffic and engagement and ads don't reach the right target group that's just a waste.
Yes but in practice, if you compute K=X.wk, Q=X.wq and then K.tQ you make three matrice multiplication.
Wouldn't be faster to compute W=wk.twq beforhand and then just X.W.tX which will be just two matrices multiplication ?
Is there something I am missing ?
Most models have a per-head dimension much smaller than the input dimension, so it's faster to multiply by the small wk and wk individually than to multiply by the large matrix W. Also, if you use rotary positional embeddings, the RoPE matrices need to be sandwiched in the middle and they're different for every token, so you could no longer premultiply just once.
Worked like a charm, much appreciated.
This was the answer I was looking for.
Thanks, that helped!
Thanks for the tip!
Great explanation, thanks for sharing.
This was the answer I was looking for.
reply