
We wrote the paper on how to deslop your language model: https://arxiv.org/abs/2510.15061


Slop is about thoughtless use of a model to generate output. Output from your paper's model would still qualify as slop in our book.

Even if your model's output scored extremely high perplexity on an LLM evaluation, we'd likely still tag it as slop, because most of our text-slop detection uses side-channel signals to work out how the text was produced, rather than just the statistical properties of the text itself.


Would love to see proof of this claim that you can tag antislopped LLM text as LLM generated. I'm willing to bet money that you can't.


Here's what pattern suppression actually does on a model that's trained to open its writing with "You're absolutely right.":

You're spot-on. You're bang-on. You're dead right. You're 100% correct. I couldn't agree more. I agree completely. That's exactly right. That's absolutely correct. That's on the nose. You hit the nail on the head. Right you are. Very true. Exactly — well said. Precisely so. No argument from me. I'll second that. I'm with you 100%. You've got it exactly. You've hit the mark. Affirmative — that's right. Unquestionably correct. Without a doubt, you're right.

I'm willing to bet money you can easily tag these openers yourself.

This sampling strategy, and the elaborate scheme to bake its behavior into the model during post-training, are terribly misguided, because they don't fix the underlying mode collapse. Mode collapse is formulated as a narrowing of the output distribution, but as with many things in LLMs it manifests at a much higher semantic level: during RL (at least with current methods), the model narrows the many-to-many mapping of high-level ideas that the pretrained model has down to one-to-one or even many-to-one. If you naively suppress repetitive n-grams (which aren't semantically aware) and manually constructed patterns (which don't scale), the collapse will just slip out at the first chance, spamming you with minor non-repetitive variations of the same high-level idea.

You'll never get actual semantic variety unless you fix mode collapse. Using n-grams or manually constructed regexes as a proxy for semantic diversity automatically invalidates the method, no matter how elaborate the proxy is. I can't believe that after all this time you persist in this and don't see the obvious issue that's been pointed out multiple times.


"This sampling strategy ... [is] terribly misguided, because they don't fix the underlying mode collapse... If you naively suppress repetitive n-grams ... it will just slip out at the first chance, spamming you with minor non-repetitive variations of the same high-level idea."

This is a colossal strawman! You're confusing two completely different problems:

The first is semantic mode collapse: the model is genuinely stuck on a handful of high-level concepts and can't think of anything new to say. This is a deep pre-training or alignment problem.

The second is linguistic pattern over-usage ("slop"): the model has a rich internal distribution of ideas but has learned through RLHF or DPO that a few specific phrasings get the highest reward. This is a surface-level, but extremely annoying, problem for a wide variety of use cases!

Our paper, Antislop, is explicitly designed to solve problem #2.

Your example of "You're absolutely right" becoming "You're spot-on" is what happens when you use a bad suppression technique. Antislop's method is far more sophisticated. Read the paper! The FTPO trainer is built on preference pairs where the "chosen" tokens are coherent alternatives sampled from the model's own distribution.
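To be concrete about what one of those pairs looks like: here's a simplified sketch, not the actual trainer code, assuming a HuggingFace-style causal LM that exposes .logits; the threshold, field names, and function name are illustrative.

    import torch

    def ftpo_style_pair(model, prefix_ids, slop_token_id, min_p=0.05, max_chosen=8):
        """Simplified sketch: at the position where a banned slop phrase would
        start, treat the slop token as 'rejected' and take min-p-coherent
        alternatives from the model's own next-token distribution as 'chosen'."""
        with torch.no_grad():
            logits = model(prefix_ids.unsqueeze(0)).logits[0, -1]
        probs = torch.softmax(logits, dim=-1)
        cutoff = min_p * probs.max()                      # min-p coherence threshold
        candidates = torch.nonzero(probs >= cutoff).flatten().tolist()
        chosen = [t for t in candidates if t != slop_token_id][:max_chosen]
        return {"prefix_ids": prefix_ids, "rejected": slop_token_id, "chosen": chosen}

The point is that the "chosen" side comes from the model's own distribution at that position, not from a hand-written substitution list, so the preference signal pushes toward alternatives the model already considers coherent.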

"You'll never have the actual semantic variety unless you fix mode collapse. Referencing n-grams or manually constructed regexes as a source of semantical diversity automatically makes the method invalid..."

You write like someone who thinks "n-gram" is a dirty word and stopped reading there.

First, the patterns aren't "manually constructed." From Section 3.1, they are identified statistically by finding phrases that are massively overrepresented in LLM text compared to pre-2022 human text. We did data-driven forensics...
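The core of that forensics step is simple to sketch: compare n-gram rates in an LLM-generated corpus against a pre-2022 human corpus and rank by over-representation. This is a minimal sketch rather than the paper's exact pipeline; the corpus inputs, n-gram length, and smoothing constant are placeholders.

    from collections import Counter

    def ngram_counts(texts, n):
        """Count word n-grams across a list of documents."""
        counts, total = Counter(), 0
        for text in texts:
            toks = text.lower().split()
            counts.update(zip(*(toks[i:] for i in range(n))))
            total += max(len(toks) - n + 1, 0)
        return counts, total

    def overrepresented_phrases(llm_texts, human_texts, n=3, smoothing=1.0, top_k=50):
        """Rank n-grams by how much more frequent they are (per million n-grams)
        in LLM output than in pre-2022 human text."""
        llm_counts, llm_total = ngram_counts(llm_texts, n)
        human_counts, human_total = ngram_counts(human_texts, n)
        ratios = {}
        for gram, count in llm_counts.items():
            llm_rate = count / llm_total * 1e6
            human_rate = (human_counts[gram] + smoothing) / human_total * 1e6
            ratios[" ".join(gram)] = llm_rate / human_rate
        return sorted(ratios.items(), key=lambda kv: -kv[1])[:top_k]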

Also, our paper's method explicitly relies on good sampling techniques to find diverse alternatives. From Section 4.1:

"...we then resample from the adjusted distribution, using min-p filtering to constrain the distribution to coherent candidates..."

It's frankly insane that you and half the field are still ignoring this. The reason models produce repetitive "slop" in the first place is that everyone runs them at temperature=0.7 and top_p=0.9. Those settings produce bland, mean-chasing output, and you conclude that models behave this way in general only because the whole field refuses to try much higher temperatures and better sampling settings.

You want real diversity? Crank the temperature to 5.0 or higher to flatten the distribution, then use min_p sampling (introduced by Nguyen et al. and cited in this very paper!) or an even better cutoff like top-n-sigma to trim the incoherent tail. This gives the model access to its full creative range.
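For concreteness, here's what that recipe looks like on a vector of next-token logits. This is a minimal NumPy sketch; the temperature and min_p values are illustrative, not tuned settings from the paper.

    import numpy as np

    def sample_high_temp_min_p(logits, temperature=5.0, min_p=0.1, rng=None):
        """Flatten the next-token distribution with a high temperature, then drop
        every token whose probability is below min_p times the top token's
        probability, and sample from what's left."""
        rng = rng or np.random.default_rng()
        scaled = np.asarray(logits, dtype=np.float64) / temperature
        probs = np.exp(scaled - scaled.max())
        probs /= probs.sum()
        cutoff = min_p * probs.max()          # the cutoff scales with the top token
        probs = np.where(probs >= cutoff, probs, 0.0)
        probs /= probs.sum()
        return int(rng.choice(len(probs), p=probs))

The key property is that the cutoff scales with the top token's probability: a flat, high-temperature distribution keeps many candidates, while a sharply peaked one keeps only a few, so you get diversity without letting in the incoherent tail.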

"I can't believe that after all this time you persist in this and don't see the obvious issue that's been pointed out multiple times."

The only "obvious issue" here is a failure to read the paper past the abstract. This paper's entire methodology is a direct refutation of the simplistic n-gram banning you imagine. FTPO works on the logit level with careful regularization (Figure 4b) to avoid the exact kind of model degradation you're worried about. FTPO maintains MMLU/GSM8K scores and improves lexical diversity, while DPO tanks it.


Thanks for this response, by the way. It's useful knowledge.


I'm not saying we could detect it from the text alone!

The side-channel signals (who posted it, where, etc.) are more valuable for tagging than raw text-classifier scores.

That's why I said our definition of slop can include all types of genAI: it's about *thoughtless use of a tool* more than the tool being used.

And also that regardless of the method, your model can be used to generate slop.


Okay, that's fair re: side channel signals.


If it's not labeled as generated by AI, then that in and of itself makes it deceptive and therefore slop.


It looks like a method of fabricating more convincing slop?

I think the Kagi feature is about promoting real, human-produced content.


People don't call it slop because of repetitive patterns; they call it slop because it's low-effort, uninsightful, meaningless content cranked out in large volumes.



