Hacker News | currymj's comments

Anthropic was already using "Clawd" as the name for the little pixelated orange Claude Code mascot. So they probably have a trademark even on that spelling.

The Runescape boss "Clawdia" [1] predates Anthropic's use by several years.

https://runescape.wiki/w/Clawdia


this is probably a net negative, as there are many very good scientists whose English skills are not strong.

the early years of LLMs (when they were good enough to correct grammar but not good enough to generate entire slop papers) were an equalizer. we may end up back there anyway, but it would be unfortunate.


But then, assuming we are fine with this state of things with LLMs:

why would it be incumbent on them to submit in English, when reviewers and readers could instead use an LLM translator to read the paper?


this would be a good development. seems very far off.

How is it far off if it's already used like this, just on the submitters' side?

the earlier list of ICLR papers had way more egregious examples. Those were taken from the list of submissions, not accepted papers, however.

the harsher the punishment, the more due process required.

I don't think there are any AI detection tools sufficiently reliable that I would feel comfortable expelling a student or ending someone's career based on their output.

for example, we can all see what's going on with these papers (and it appears to be even worse among ICLR submissions). but it is possible to make an honest mistake with your BibTeX. Or to use AI for grammar editing, which is widely accepted, and have it accidentally modify a data point or citation. There are many innocent mistakes which also count as plausible excuses.

in some cases further investigation may reveal a smoking gun like fabricated data, which is academic misconduct whether done by hand or by an AI generating the LaTeX tables. punishments for this should be harsher than they are.


Fabricated citations seem to be a popular and unambiguous way for AI to sabotage science.

Getting a NeurIPS paper published is extremely lucrative, especially your first one as a PhD student.

Most big tech PhD intern job postings have NeurIPS/ICML/ICLR/etc. first author paper as a de facto requirement to be considered. It's like getting your SAG card.

If you get one of these internships, it effectively doubles or triples your income for that year right away. You will make more in that one summer than your annual PhD stipend. Plus you can then apply in future summers, and the jobs will be easier to get. And it sets your career on a good path.

A conservative estimate of the discounted cash value of a student's first NeurIPS paper would certainly be five figures. It's potentially much higher depending on how you think about it, considering potential path dependent impacts on future career opportunities.
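A rough back-of-envelope supports the five-figure claim (all the numbers below are my own hypothetical assumptions, not from any real posting):

    # hypothetical figures, purely illustrative
    intern_monthly = 12_000      # assumed big-tech PhD intern pay
    summer = 3 * intern_monthly  # one summer ~= a typical annual stipend
    discount = 0.9               # hand-wavy annual discount factor
    # assume the paper also unlocks two future summers
    future = sum(summer * discount**t for t in range(1, 3))
    print(summer + future)       # 97560.0 -> comfortably five figures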

We should not be surprised to see cheating. Nonetheless, it's really bad for science that these attempts get through. I also expect some people did make legitimate mistakes letting AI touch their .bib.


This is 100% true; if anything, you're massively understating the value of publications.

Most people hiring for industry AI jobs that aren't research-based know that NeurIPS publications are a huge deal. Many of the managers don't even know what a workshop is (so you can pass off NeurIPS workshop work as just "NeurIPS").

A single first-author main-conference paper effectively allows a non-PhD holder to be treated like they have a PhD (i.e., be qualified for professional researcher jobs). This means that a decent engineer with one NeurIPS publication is easily worth $300K+ per year, assuming they're a US citizen. Even if all they have is a BS ;)

And if you are lucky enough to get a spotlight or an oral, that's probably worth closer to 7 figures…


I recommend actually clicking through and reading some of these papers.

Most of those I spot-checked do not give an impression of high quality. It's not just AI writing assistance: many seem to have AI-generated "ideas", often plausible nonsense. The reviewers often catch the errors, and sometimes even the fake citations.

Can I prove malfeasance beyond a reasonable doubt? No. But I personally feel quite confident that many of the papers I checked are primarily AI-generated.

I feel really bad for any authors who submitted legitimate work but made an innocent mistake in their .bib and ended up on the same list as the rest of this stuff.


To me, such an interpretation suggests there are likely papers that were not so easy to spot: perhaps the AI happened upon more plausible nonsense, then generated believable (but still nonsensical) data to bolster that theory at a level that is much harder to catch.

This isn't comforting at all.


the papers themselves are publicly available online too. Most of the ones I spot-checked give an extremely strong impression of being AI-generated.

it's not just some hallucinated citations, and not just the writing: in many cases the purported research "ideas" themselves seem to be plausible nonsense.

To get a feel for it, take some of the topics they write about and ask your favorite LLM to generate a paper. Maybe even throw "Deep Research" mode at it. Perhaps tell it to put the result in ICLR LaTeX format. It will look a lot like these papers.
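Something like this is enough to reproduce the effect (a minimal sketch using the openai Python client; the model name and prompt are my own placeholders, not anything these authors are known to have used):

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    # hypothetical prompt; substitute a topic from one of the flagged papers
    resp = client.chat.completions.create(
        model="gpt-4o",  # any capable model will do
        messages=[{
            "role": "user",
            "content": "Write a complete research paper on <topic>, "
                       "in ICLR-style LaTeX, with an abstract, method, "
                       "experiments section, and bibliography.",
        }],
    )
    print(resp.choices[0].message.content)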


bigram/trigram language models (with some smoothing tricks to allow for out-of-training-set generalization) were state of the art for many years. Ch. 3 of Jurafsky and Martin's textbook (which is modern and goes all the way to LLMs, embeddings, etc.) is good on this topic.

https://web.stanford.edu/~jurafsky/slp3/ed3book_aug25.pdf

I don't know the history but I would guess there have been times (like the 90s) when the best neural language models were worse than the best trigram language models.
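To make the smoothing idea concrete, here's a toy bigram model with add-k smoothing (my own sketch; real systems of that era used fancier schemes like Kneser-Ney):

    from collections import Counter

    def bigram_prob(tokens, k=0.5):
        # count unigrams and bigrams from the training tokens
        unigrams = Counter(tokens)
        bigrams = Counter(zip(tokens, tokens[1:]))
        vocab_size = len(unigrams)

        def prob(prev, word):
            # add-k smoothing: unseen bigrams still get nonzero probability,
            # which is what allows out-of-training-set generalization
            return (bigrams[(prev, word)] + k) / (unigrams[prev] + k * vocab_size)

        return prob

    p = bigram_prob("the cat sat on the mat and the cat slept".split())
    print(p("the", "cat"))  # seen bigram: relatively high
    print(p("the", "dog"))  # unseen pair: small but nonzero thanks to smoothing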


Steam Deck is Arch-based, that's most likely why.


SteamOS is counted separately.


Yes, but a lot of Linux gamers use an Arch-based distribution such as CachyOS, since SteamOS is also Arch-based. They get updates faster because of the rolling releases.


I would like to understand what people get, or think they get, out of putting a completely AI-generated survey paper on arXiv.

Even if AI writes the paper for you, it's still kind of a pain in the ass to go through the submission process, get the LaTeX to compile on their servers, etc. There is a small cost to you. Why pay it?


Gaming the h-index has been a thing for a long time in circles where people take note of such things. There are academics who attach their name to every paper that goes through their department (even if they contributed nothing), and there are those who employ a mountain of grad students to speed-run publishing junk papers... and now, with LLMs, one can do it even faster!


Presumably a sense of accomplishment to brandish with family and less informed employers.


Yup, 100% going on a LinkedIn profile.


Published papers are part of the EB-1 visa rubric, so there's huge value in getting your content into these indexes:

"One specific criterion is the ‘authorship of scholarly articles in professional or major trade publications or other major media’. The quality and reputation of the publication outlet (e.g., impact factor of a journal, editorial review process) are important factors in the evaluation”


Is arXiv a major trade publication?

I've never seen arXiv papers counted toward your publications anywhere that the number of publications is used as a metric. Is USCIS different?

