They have quite short cancellation notices, and there is the question of why you need to rent it if you are an AI company. How is Grok not running the world using that compute? And a possible cancellation would erase a large income stream; think of when the GPUs get old, or sooner, when Google does not need it anymore.
It is crazy that anything European gets so much hate. IMO it is important to build models within the boundaries of smaller nations, using their own language. Research has to continue even if it is outside the US and China.
I was somewhat excited about these "sovereign" open models in the beginning, but it soon became apparent that they're not gonna be anything but toys compared to SOTA.
The problem is that there are a lot, at least 30, of these small projects scattered around, funded for a few years as some ad-hoc temporary coalition of universities and businesses. Those simply cannot compete with businesses spending tens of billions on developing these. Especially when you have to bring a spoon to a gunfight by restricting yourself to "clean" data.
Multilinguality is essentially a solved problem, and restricting a model too much to one language with more limited resources is gonna make it worse in that language too.
Anyone can move the needle. Saying that languages are solved is not accurate either. You could raise different questions: maybe a model grounded in a different language will be more efficient in some tasks, maybe language structure matters for a multidimensional space, maybe that matters for distillation, etc. It is all about the ideas and their exchange, not about the investment rounds and MAU.
Even if multilinguality isn't solved, building a benchmark and then testing each model on it and posting the result may be a cheaper accelerator of competence in the language.
Maybe I'm more attuned to this type of thing, having grown up as a national of a smaller state living in the shadow of a bigger state, but you constantly see actors from the bigger state belittling and condescending to anything contributed socially or economically by the smaller state.
And I see this sort of dynamic here in this forum where Americans very frequently talk condescendingly like this about Europe generally and European tech especially (they did it to China too but China smartly ignored this self-interested nonsense and carried on anyway which is what Europeans should do).
It really grates on me and presumably many others. But it serves an agenda too of a lot of the founders and financiers that hang out here that have big fat customers in Europe they'd like to keep sweet and competitors they'd like to keep down.
I'm from Spain and I also hate these projects with a passion. Creating models that speak multiple languages is a solved problem. Having each European nation train its own useless "sovereign model" in its own language is a total waste of time and resources when we could pool resources and try training SOTA models that speak all European languages.
I'd rather have smaller European labs have a go at distributed training. If multiple countries got together and said, "look, we tried training a distributed model that speaks all of our local languages and that is comparable to 1-year-old Chinese open-source models", that, at least, I would find interesting.
Excuse my ignorance if by "distributed training" you mean a specific process, but couldn't this be considered a step toward distributed training? If nations train models independently and then later distill them into a single model, all the work (both the compute and the research processes) is distributed for the initial training phase.
I mean it as in, train a model across different clusters instead of a centralized cluster. It's been shown that it's possible to train 10B models this way. If more research effort were put into this, that would be great.
I don't think your approach would work, because you can't create a strong model by distilling several weak models.
It’s not that it gets hate so much as it’s akin to watching them make announcements that they’re going to make a European google/facebook/tiktok.
Sure… they can, except at the end of the day it’s a bit late, regulatory burden will make it comparatively useless, and because of that nobody will ever use it. It will be spending a bunch of taxpayer dollars for press releases.
The running joke is that when these “sovereign” EU models launch, they’re going to refuse to answer anything that might involve personal information such as Elon Musk’s birthday.
At least with social networks the network effect is a powerful force. Foregrounding regulatory burden in that context is nonsensical. (That does not apply in the same way to models.)
That’s on Wikipedia, so it’s not PII, and it’s also not going to be relevant to any meaningful IRL work.
I challenge the assumption you can do meaningful work in this field without blatant disregard for intellectual property.
The idea that it’s all down to training size is clearly incorrect, as every expert human learned their craft without nearly the sum total information of the internet. Clearly there are architectural wins to be found.
Besides that, why would everyone just be fine with Opus-level AI at best? That’s all the US is willing to export, and I doubt China will share beyond that.
Sovereign AI is more important than ever after Friday.
I guess if you are strict about it, making derogatory comments like yours is indeed not hate. But I'm sure you are aware that "getting hate" is frequently used in a broader sense online, especially in the context of replies to a post, and I don't see much point in insisting on the stricter definition here.
I kinda agree, the best use of taxpayer money should be in reducing taxes for corporations that would like to compete in the market against the US and China, rather than having governments play the game (since they very obviously can’t).
If a teenager on your street said he was going to spend $1,000 to customize his Honda Civic for his needs, you'd believe him. If he said he was going to build a brand new car, better than a Honda Civic, for $10,000, you'd laugh and say good luck.
Very much similar thoughts. The examples provided are not nerds, except a few. It is just that tech is a lucrative path to make money, and it attracts a variety of “interesting” personalities, specifically those that can captivate and persuade the masses to invest in them. For such founders, tech is just a means to an end. A nerd is someone who is interested in tech for the sake of it, because it is beautiful, not because it will aid drones in killing targets more efficiently and not because it will land a great contract.
This takes me back close to 15 years: using backend session management in Grails and HTML forms that were enhanced with “some” JS, using responsive CSS. The difference at that time was that browser tech was not as advanced as now. We had to care for different browsers and deal with IE7 and even IE6; it was difficult and we needed extensive QA (Browserstack would appear later). There is a reason why we had the JS library evolution. Dude, there was no npm, not even bower. Then we had Backbone.js (loved it), then AngularJS (amazing), then the Angular version which had huge breaking changes, then React, Polymer, etc. Browsers can natively do a lot these days, and it is easy to enhance the functionality as well. But it was not always like this. The decisions to use React made sense for a variety of reasons at the time; maybe that was the case here as well.
PCC is supposed to work only on Apple silicon. You are supposed to trust that the input will be decrypted within the enclave, which is next to the inference engine on the same box. This way you know the input does not leave the server. If they offload to another server (e.g. Google), then the privacy boundary is broken once it leaves the enclave. Microsoft does it differently: the inference itself is confidential, so there would be more guarantees if that could be replicated.
Google announced a similar thing a year or two ago. It uses the various hardware security stuff in the PC world, which Apple is working with them to add to the list of approved stuff that gives PCC-level security.
Europeans support that by and large. So either agree or have no ability to sell. This idea that companies know better about their users’ needs, including privacy and choice, is ridiculous. Apple is also not a small company that is being bullied.
The post does not seem to address the fact that different product phases exist, which in turn affect the software. Similarly, teams are different and get assembled for different purposes. There is also a difference between software created in the past 24 months and software created over the past 10 years. It is very easy to attack decisions that look messy now, for many reasons. Everything looks obvious in hindsight. And making it ad hominem does not sound smart; it is clickbait, and the issues are usually more complex.