Hacker News | drapado's comments


Cool! Pity they are not releasing a smaller A3B MoE model



Their A3B Omni paper mentions that the Omni at that size outperformed the (unreleased I guess) VL. Edit: I see now that there is no Omni-235B-A22B; disregard the following. ~~Which is interesting - I'd have expected the larger model to have more weights to "waste" on additional modalities and thus for the opposite to be true (or for the VL to outperform in both cases, or for both to benefit from knowledge transfer).~~

Relevant comparison is on page 15: https://arxiv.org/abs/2509.17765


Genuinely curious: what would the problem be if it were vibe-coded? It's an easy-to-read site that succeeds in communicating what it wants to.


there's no problem with it being vibe-coded

The point is that the site, contacting your local MEP, and all the discussion in this thread are pointless as a way to effect any kind of durable societal change

Pointing out that it's vibe-coded just emphasises that all of the above actions are low-effort cope


Can you suggest an alternative action?


Decentralised messaging providers

You can't force everyone to scan

What are you going to do? Arrest everyone?


> What are you going to do? Arrest everyone?

Just like in any other authoritarian state, you make examples. People will quickly learn how to self-police (and to turn enemies in).


Maybe accelerating is an option


Unfortunately, no open weights this time :(


https://ollama.com/joefamous/QVQ-72B-Preview

Experimental research model with enhanced visual reasoning capabilities.

Supports context length of 128k.

Currently, the model only supports single-round dialogues and image outputs. It does not support video inputs.

Should be capable of images up to 12 MP.
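For anyone who wants to poke at that preview locally, something along these lines should work with the official Ollama Python client. This is a rough sketch I haven't tested against this particular model: the tag is just taken from the link above, the image path is hypothetical, and a 72B model needs a lot of RAM/VRAM even quantized.

    # pip install ollama -- assumes a local Ollama server with the model pulled,
    # e.g. via `ollama pull joefamous/QVQ-72B-Preview` (tag taken from the link above)
    import ollama

    response = ollama.chat(
        model="joefamous/QVQ-72B-Preview",
        messages=[{
            "role": "user",
            "content": "Walk through the reasoning needed to answer: what is shown in this image?",
            "images": ["example.jpg"],  # hypothetical local image; single-round, no video input
        }],
    )
    print(response["message"]["content"])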


>Last December, we launched QVQ-72B-Preview as an exploratory model, but it had many issues.

That's an earlier version released some months ago. They even acknowledge it.

The version they present in the blog post, and which you can run on their chat platform, is not open or available to download.


The wisdom of open weights is hotly debated.


Wisdom? I don't get what you mean by that. What is clear is that open weights benefit society, since we can run the models locally and privately.


Have you sought out alternative positions that you might be missing?


I notice four downvotes so far for stating the fact that a debate exists. My comment above didn't even make a normative claim. For those who study AI risks, there is indeed a _debate_ about the pros and cons of open weights. The question of "what are the implications of open-weight models?" is not an ideological one; it is a logical and empirical one.

I'm not complaining; it is useful to get an aggregated signal. In a sense, I like the downvotes, because it means there are people I might be able to persuade.

So how do I make the case? Remember, I'm not even making an argument for one side or the other! My argument is simply: be curious. If appropriate, admit to yourself e.g. "you know, I haven't actually studied all sides of the issue yet; let me research and write down my thinking..."

Here's my claim: when it comes to AI and society, you gotta get out of your own head. You have to get out of the building. You might even have to get out of Silicon Valley. Go learn about arguments for and against open-weights models. You don't have to agree with them.


Is there a good wisdom benchmark we can run on those weights? /s


I recently had to check code from some of my students at the university because I suspected plagiarism. I discovered JPlag, which works like a charm and generates nice reports.


Next time just ask them a few questions about the programming choices they made. Far easier.


How do you deal with disputes? Say a student's code is flagged even though they didn't actually cheat. What then? Do you trust the tool over the student's word?

In addition, do things like Stack Overflow and using LLM-generated code count as cheating? Because that is horrible in and of itself, though a separate concern.


The output of plagiarism tools should only serve as a hint to look at a pair of solutions more closely. All judgement should be derived entirely from similarities between solutions and not some artificial similarity score computed by some program.
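To make "artificial similarity score" concrete: the number such tools report is conceptually just a pairwise ratio, something you could approximate with the Python standard library. A toy sketch (crude character-level matching, nothing like JPlag's token-based comparison, with a made-up submissions directory and threshold) to show why the score alone proves nothing and should only trigger a manual look:

    # Crude pairwise similarity over submissions in ./submissions -- a toy stand-in
    # for the score a plagiarism detector reports, NOT how JPlag actually works.
    from difflib import SequenceMatcher
    from itertools import combinations
    from pathlib import Path

    submissions = {p.name: p.read_text() for p in Path("submissions").glob("*.py")}

    for (a, code_a), (b, code_b) in combinations(submissions.items(), 2):
        score = SequenceMatcher(None, code_a, code_b).ratio()  # 0.0 .. 1.0
        if score > 0.8:  # arbitrary threshold: only a hint to inspect the pair by hand
            print(f"{a} <-> {b}: similarity {score:.2f}, review manually")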


Unfortunately, this is not really what happens in my experience. The output of plagiarism tools is taken as fact (especially at high school levels). Without extraordinary evidence of the tool being incorrect, students have no recourse, even if they could sit and explain the thought process behind every word/line of code/whatever.


Lousy high school.


Indeed, this is exactly what I did.


If you talk about the written code to the student in question it should become clear whether it was copied or not.


Well, in this case I noticed the same code copied while grading a project. I then used JPlag to run an automatic check on all the submissions for all the projects. It found many instances where a couple of students had copy-pasted with the same variable names, comments, etc. It was quite obvious if you looked at it in detail, and JPlag helped us spot it across multiple files easily.

*edited mobile typos


An archival video of all coding sessions (recorded locally and hosted by the student), starting with a visible outline of pseudocode and ending with debugging, should be sufficient.

In case of a false positive from a faulty detector this is extraordinary evidence.


We had a professor require us to use git as a timestamped log of our progress. Of course you could fake it, but stealing work and basically redoing it piece by piece with fake timestamps is a lot of work for a cheater.
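For what it's worth, reviewing that log takes the instructor a few seconds per repo. A minimal sketch, assuming a local clone at a hypothetical path:

    # Dump each commit's abbreviated hash, author date (ISO 8601) and subject,
    # oldest first, to see whether the work plausibly accumulated over time.
    import subprocess

    log = subprocess.run(
        ["git", "log", "--reverse", "--pretty=format:%h %aI %s"],
        cwd="student-repo",  # hypothetical path to the student's clone
        capture_output=True, text=True, check=True,
    )
    print(log.stdout)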


Kinda rare these days with ChatGPT


You might be surprised. Many students who use ChatGPT for assignments end up turning in code identical (or nearly identical) to other students who use ChatGPT.


Surprising because you get different answers each time you ask ChatGPT.


Different in an exact string match, but code that is copied and pasted from ChatGPT has a lot of similarities in the way that it is (over)commented. I've seen a lot of Python where the student who "authored" it cannot tell me how a method works or why it was implemented that way, despite having comments prefixed to every line in the file.


> (over) commented

From my experience using ChatGPT, it usually removes most of my already-written comments when I ask questions about code I wrote myself, and it usually gives you outline comments. So unless you are a supporter of the self-documenting-code idea, I don't think ChatGPT over-comments.


It's obviously down to taste, but what I've seen over and over is a comment per line, which to me is excessive unless it's been explicitly requested of absolute beginners.

That happens, and the model also can't decide whether it wants the comment on the line before the code or appended to the line itself, so when I see both styles within a single project it's another signal. People generally have a style that they stick with.


Ah yes, good old "Did you even read the essay before handing it in? Next time, please do."


ChatGPT answers don't differ that much without being prompted to do so


Yeah, but the prompt itself generally adds sufficient randomness to avoid the same verbatim answer each time.

As an example, just go ask it to write any sufficiently average function. Use different names and phrases for what the function should do; you'll generally get a different flavor of answer each time, even if the functions all output the same thing.

Sometimes the prompt even forces it to output the most naive implementation possible, due to the ordering or perceived priority of things within the requesting prompt.

It's fun to use as a tool to nudge it into what you want once you get the hang of the preconceptions it falls into.


MOSS seems to be pretty good at finding multiple people using LLM-generated code and flagging them as copies of each other. I imagine it would also be a good idea to throw the assignment text into the few most popular LLMs and feed their answers in as well, but I don't know of anyone who has tried this.


FWIW the attack we describe in the paper works against MOSS, too (that was the original inspiration for the name, “Mossad”).


Are you the same kind of person who thinks that NGO workers should work for free, or for a small wage that is not representative of the market wage for their positions?


No, I'm the type of person who thinks tech salaries are bloated in certain areas and at certain companies, and that this does not follow the distribution of talent. It has followed the distribution of VC money and the profits of large companies. The evidence is that the median software engineer salary in the US is in the low-to-mid $100Ks (depending on which source I believe, $110k-$140k). But I also believe that the same talent can be sourced outside the US in many cases and for far less expense.

I also view most apps/tech as not very novel. It's largely the same engineering "problems" that are known and well documented. A lot of it can be done by average developers, and "top tier" talent isn't usually needed, other than probably for the cryptographic components in Signal's case. Scale is certainly a concern, but that is a familiar problem with a lot of documented solutions and approaches.

I could be wrong. Maybe they're already doing this and it just happens that most of their expense is going to a couple of highly paid execs. Could be that I'm underestimating the complexity as well. But I find my statements to be true in many cases. I can even point to the number of times I've talked to consultants and top-tier devs about building things for me. What they would charge $1m for I can often piece together for less than $50k by hiring a few folks in low-COL areas and then spending a little effort refactoring their code to be as pretty as I like it to be; sometimes I outsource that too. The point is that having a whole company of top-tier talent isn't usually necessary; it's a choice. Just like believing that top-tier talent only exists in the high-cost tech hub cities is a choice more than it is the truth.


I'm curious about your statement; can you point to some papers where it is addressed?


(Not OP) The section "A Path Forward" in Managing AI Risks by Bengio et al. cites a few papers: https://managing-ai-risks.com/


I read through that, and none of that section (or the entire work) ever talks about the above discussion. Further, I looked at some of the many citations in that section, and none of them suggest that the OP is right. In fact, a few of them I know disagree.


Depends. Models are matrices of floats, so there's little chance an umbrella term like "stochastic parrot" won't stick, even when the models already show signs of syntactic and semantic world-building capability (https://www.arxiv-vanity.com/papers/2206.07682/). If you are like me (and them: https://archive.is/cZi83) and deem instruction following, chain-of-thought prompting, and the computational properties of LLMs (as researchers continue to experiment with training, memory, modality, and scaling, for example, to arrive at abstract reasoning) to be emergent, then we're on the same page.


Okay, so just to confirm: that section doesn't actually tell us anything about this, and in fact this is all based on your own understanding of the mechanisms involved.


My reading of the papers is that, given enough scale, modality, and memory, there is a chance that (perhaps newer and different) models will be able to "generalize" our world. Also: https://archive.is/3yyZZ / https://twitter.com/QuanquanGu/status/1721394508146057597 | And: https://archive.is/bW2tS / https://twitter.com/mansiege/status/1680985267262619648


Do you have numbers, data, or sources to support your first statements (not the one about teachers being underpaid)?


Yes. They're called teachers. Talk to some.

In 35 years, we've gone from kids having to do and show their own work to looking up answers online for everything and shortcutting their way through their education because their parents aren't participating.


>Yes. They're called teachers. Talk to some.

GP was asking for data and you provided anecdotes. It seems the issues in our education systems go back to when you were in them.


Never go full Blizzard


Fortunately, expertise is not limited to a single concrete topic. There are experts who study climate change from a socio-economic perspective and who are able to do science in such complex environments :)


If such experts existed, centrally planned economies would work wonderfully. Unfortunately, they don't. For sufficiently complex problems, the knowledge is diffuse across a wide range of individuals and there is no one expert we can turn to. That's why debate, and tolerating dissent, is important.


Huh? These experts absolutely exist - look e.g. at iSAGE during the pandemic in the UK. They combined epidemiology with sociology to provide a more comprehensive view of the situation. It's just that you ultimately have to listen to people saying "do X, which is expensive". Governments don't on the whole tend to love that.


Dissent is important, within reason. Claiming that a medication we know is ineffective can cure Covid, in the face of 99% of the medical establishment telling you you're a moron, is not within reason, for example.

