Could say the same for camera processing in the Pixel Camera app or any other binary someone wants to re-use that comes included in a software distribution (seemingly for 'free'). They can't lock the instructions up on the server so they might as well make the binary be freely distributable?
Companies don't commonly give away executable binaries "just because", why'd they start now for these binary blobs that are the models?
Not that I'm unhappy about it! Yay for open data any day, I'm just not understanding why, at least beyond PR in nerd circles
Binaries are source code outputs; they are copyrightable and patentable. Weights are not copyrightable, so people can freely extract the weights and run them. If Google patents any of the novel algorithms here, releasing it all freely isn't an impediment to making people license it.
Are you sure that isn't about LLMs' outputs? There I know there have been some court cases that say this, but the model itself is a work created in intricate and somewhat creative ways (I hesitate to use the word "creative" here, but would similarly hesitate to label a routine picture of the moon creative whereas pictures basically always have copyright; the bar for creativity is basically an epsilon amount above zero, afaik)
Because a model like this can't be obfuscated as easily as image processing. Image processing is a bundle of many moving parts, a lot of functions each with its own inputs and outputs. A model, in comparison, is a single function which can be easily extracted and reused.
Arguably, but that's not the point. Take image (e.g. png) files on a CD-ROM shipped by a game vendor, which can be trivially copied even by my grandma. That doesn't move the game vendor to release them as freely distributable under the Apache license
Good point but still, why would Google police this model? If they had a restrictive licence on it do you think it would be worth it for them to enforce it? This way they at least buy some good will and mindshare
That makes sense to me. Guess one might say the same for game icons and other such files that lie around on disks, but yeah, maybe it's as simple as that
Not quite the same. Understandably, Blizzard cares a lot about their IP because otherwise private servers leech their users. Maybe a small game designer cares a lot about the small game they made or whatever, since that's all they have. A four trillion market cap company can afford to be "charitable" where it costs them nothing and might cost them more to enforce their rights.
They could lock them down legally which would prevent commercial use, but they choose not to, and they boast about how many tens of millions of times Gemma models have been downloaded by developers.
So there must be more to the rationale than just local model weights getting hacked out of devices.
Claude managed agents is a general-purpose hosted runtime for Claude, while Twill focuses on SWE tasks.
And so the SWE workflow is pre-built (research, planning, verification, PR, proof of work). Twill is also agnostic to the agent, so you can use codex for instance. Additionally, you have more flexibility on sandbox sizing on Twill.
Our work on concrete here differs in that the problem is both
1) inherently time-varying, and
2) multi-objective.
See our write-up here for details: https://arxiv.org/pdf/2310.18288
Years ago, my college multivariable calculus and linear algebra courses were both taught primarily using course materials that were interactive Mathematica Notebooks.
We had access to all of the symbolic algebra tools and were even expected to use them regularly for both courses. It was great!
I'm not sure how well this would extend to introductory courses though, especially if the standardized tests still expect integration by hand.
Those same companies often invest in accessibility for vision-impaired users. I'm not sure you need a screen capture to scrape content when the site is designed to be navigable with a screen reader.
For anyone else who was confused to see a paper use the same name as a commercial product, it looks like Google Gemini was announced in May, whereas this was submitted to SOSP that had an April submission deadline.
It's not a good name to give to anything. Unless you're a corporate giant, name creativity is really important to making your work findable and re-findable.
> name creativity is really important to making your work findable and re-findable.
This is underrated information. I’ve seen so many products and even companies fail because the name led to millions of unrelated search results. Even if they are a giant it can still lead to bad outcomes.
I think this points more to how slow the paper submission process is compared to the product creation velocity. No wonder arxiv has been such a hit for the ML community.
GPU performance per dollar is only competitive for specific workloads. For extremely large scale compute, getting enough data center GPUs can also be challenging.
Lower counterparty risk, and harder to confiscate. Money can sit in a crypto wallet and be used for illicit transactions until favorable circumstances allow conversion out of btc. You can’t do that with cash in a bank.
With the number of shady alt coins and defunct btc exchanges, separating authentic from bogus transactions is even tougher for regulators.
So it's easier to just release those models as open source and make it official, since someone would inevitably hack the weights out anyway.