I believe it is the system instructions that make the difference for Gemini, as I use Gemini on AI Studio with my system prompts to get it to do what I need it to do, which is not possible with gemini.google.com's gems
> The license must allow modifications and derived works, and must allow them to be distributed under the same terms as the license of the original software.
> No Discrimination Against Fields of Endeavor
You cannot restrict selling derivatives of the software (a field of endeavor), and the derived work must be shareable under a similar license.
The GPL does not add any restrictions counter to that. It allows you to redistribute and sell copies if you want; you just have to respect your users' freedom by also giving them the modified source code.
To be honest I still feel satisfied, but when it comes to making something useful: in the past I gave up programming because of the endless repetitive tasks. You want to build something cool, but wait, first you need to make an auth system, and by the time you finish that the cool idea is already dead because of how boring and repetitive it all is. AI coding made it fun again.
Here's a blog post on scalability from the front page (well, sort of, see last paragraph): https://www.process-one.net/blog/ejabberd-massive-scalabilit... This includes not just textual claims about scalability but a lot of hooks you can follow up on, like a reference to the Tsung benchmarking tool. If you're asking about this for serious reasons, at this scale you're obviously running your own tests anyhow. You may also want to speak directly to Process One about this, because it sounds like you're at a scale where you should probably be looking at paid support anyhow.
I'm not necessarily endorsing this, just giving it to you as some first stabs at answers and some ways to follow up.
If there's anyone reading this from the ejabberd project, note that the link to this on your front page under "Massively Scalable" is completely broken; that links to "blog.process-one.net" but that domain is completely dead, so it doesn't redirect to the link I gave above. (It's also part of why I posted this; my post here is not a "just read these docs" because I had to do non-trivial work to find them.) I had to pull that out of archive.org. Should probably check for any other links to that domain too.
I'm past my edit window now so I can't remove the claim, but I will verify and agree that it is fixed now. If you (the reader, not Metalhearf) previously visited the home page you may have to manually reload to bust the cache to see the correct link(s) now.
If you want 40M connections on FreeBSD, the maxfd count is physpages / 4; you could edit the kernel to change this, but are you really going to serve 40M connections with only 16 KB of RAM for each user? If my math is right, that puts you around 640 GB of RAM (40M fds * 4 pages * 4096 bytes/page ≈ 640 GB), which is totally doable on a server socket. Probably you don't have everyone simultaneously connected, but probably you also need more RAM per connection, so it kind of depends. By the time you find 40M users to connect to your server, you should be in tune with your hardware needs ;)
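Spelling that arithmetic out, a quick back-of-the-envelope sketch, assuming 4 KiB pages and the stock physpages / 4 default (both assumptions; check your actual kernel settings):

```python
PAGE_SIZE = 4096           # bytes per page (assumption: 4 KiB pages)
FDS_WANTED = 40_000_000    # 40M simultaneous connections

# maxfiles ~= physpages / 4  =>  physpages needed ~= 4 * maxfiles
pages_needed = 4 * FDS_WANTED
ram_bytes = pages_needed * PAGE_SIZE

print(ram_bytes / 2**30)        # ~610 GiB (~640 if you read 40M as 40 * 2^20)
print(4 * PAGE_SIZE // 1024)    # 16 -> i.e. the limit allows 16 KiB of RAM per fd
```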
If you’re running a single machine, you’ll be limited by the number of available ports. It’s a TCP limitation and nothing to do with XMPP. You’d need a cluster of XMPP servers to handle 40M users. Even for just text. Port limits are port limits.
You're thinking of outbound port limits. Inbound connections all come in on the same port, and there are no port-related limits there.
The real limits are going to be on hardware. If we want 40M concurrent connections, the server will need a few hundred to a few thousand gigabytes of memory.
Unless you're doing multicast or anycast, there's a port-bound IP handshake that happens. You have your listener (your server port) and the connected TCP/IP client (client port, on the server machine). You're limited to 64533 clients (0-65535, but the first 1000 are reserved). If there's a way to get more clients on a single machine, I'm all ears - but that's baked into the IP protocol. TCP/UDP doesn't matter. It's a limitation of IP.
Assume 10% of 40M users are active at once. That's 4M clients to deal with. You would need 62 servers (and probably a few for cache) to handle the load. But those could be cheap, small-core servers.
The uniqueness tuple for a connection is (source IP, source port, dest IP, dest port). You're limited to 65,535 connections between the same two IPs on the same server port, but that's not relevant to XMPP, which uses only one port plus some transient ones for file transfer and such. At worst, having that many people behind one NAT will be a problem... which at this scale could be an actual problem, but there are still solutions (multiple ports being the easiest, plus the fact that this cluster will probably be on multiple public IPs anyhow).
You should be able to deal with 4 million clients on one server in 2025; we did 2 million on one chat server in 2012 [1]. I can find documentation of 2.8M in Rick Reed's Erlang Factory presentation [2], page 16. That was part of a clustered system with chat connections on one group of servers and several other groups of servers for account information, offline messages, etc. Also, the connections weren't Noise-encrypted then, and there were a lot more clients in 2012 that didn't have platform push, so they would try to stay connected all the time... it's not very hard to manage clients that just need to be pinged once every 10 minutes or so.
But this was with only 96 GB of RAM and dual Westmere 6-core processors. You can put together a small desktop with consumer parts that will be way more capable today. And if you run everything on a single machine, you don't have to deal with distributed-systems stuff.
When I left in 2019, we were running on Xeon-D type hardware with maybe 64 GB of RAM and, IIRC, only doing about 300k connections per chat machine. Chat scales horizontally pretty well, so run a couple machines of whatever, find your bottlenecks, and then scale up the machines until you hit the point where it costs more to get 2x of the bottleneck on a single machine than to get 2 machines with 1x of the bottleneck. I suspect, if you can get quality AM5 servers, that's probably the way to go for chat; otherwise a single server socket would likely be best; dual socket probably doesn't make $/perf sense like it did 10 years ago. If you get fancy NICs, you might be able to offload TLS and save some CPU, but CPU TLS acceleration is pretty good, and there's not a RAM-bandwidth saving from NIC TLS like there is for a CDN use case.
IMHO, getting a single server to support 4M clients shouldn't be too hard, and it should be a lot of fun. Most of the stuff that was hard in 2012 should be easy now, between upstream software improvements and the massive difference between CPUs in 2012 and 2025. The hard part is getting 4M clients to want to connect to your server. And setting up a test environment. Elsewhere in this thread, there's a link to the Process One blog post from 2016 where they ran 2M clients on a single server (m4.10xlarge: 40 vCPU, 160 GiB) with a single same-spec server as the client load generator; the impressive part there is the client load generator --- I've always needed something like a 10:1 ratio of load generators to servers to load down a big server. And they managed to get that to work with all their NIC interrupts sent to a single CPU (which is not what I would have recommended, but maybe EC2 didn't have multiple NIC queues in 2016?)
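For a sense of what a bare-bones load generator looks like, here's a toy sketch in Python; the host, port, and connection counts are made up, and it speaks raw TCP rather than real XMPP (use Tsung or similar for an actual test):

```python
import asyncio

HOST, PORT = "chat.example.com", 5222   # hypothetical target server
N_CONNS = 50_000                        # per generator process; raise ulimit -n first
PING_INTERVAL = 600                     # seconds, i.e. "ping every 10 minutes"

async def client(i: int) -> None:
    await asyncio.sleep(i * 0.001)      # ramp up instead of connecting all at once
    reader, writer = await asyncio.open_connection(HOST, PORT)
    try:
        while True:
            await asyncio.sleep(PING_INTERVAL)
            writer.write(b"\n")         # stand-in for a real keepalive/ping
            await writer.drain()
    finally:
        writer.close()
        await writer.wait_closed()

async def main() -> None:
    await asyncio.gather(*(client(i) for i in range(N_CONNS)))

asyncio.run(main())
```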
> If there's a way to get more client on a single machine, I'm all ears - but that's baked into IP protocol. TCP/UDP doesn't matter. It's a limitation of IP.
As others have said, the limitation is the 5-tuple: {protocol, SrcIP, DstIP, SrcPort, DstPort}. If you're the server, your own IP and port are held fixed, and for each client IP you can have 64k connections (one per client port). There are a lot of client IPs out there, so you can host far more connections than you can actually manage on a single system. If your clients are behind an aggressive CGNAT, you can actually run into problems where more than 64k clients want to connect to your single IP and port... but you can listen on multiple ports and address it that way, and most aggressive CGNATs are only for v4; if you listen on v6, you'll usually see the clients' real IPs, or at least a much more diverse group of NAT addresses.
If you listen on all the ports, you can have 4 billion connections between you and any given IP (~64k ports on your side times ~64k on theirs, 65,536 * 65,536 ≈ 4.3 billion). That's not going to be limiting for a while.
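For what it's worth, the listen-on-multiple-ports workaround is trivial to set up; a rough sketch in Python/asyncio with placeholder ports and a stub handler:

```python
import asyncio

PORTS = [5222, 5223, 5224]   # placeholder ports; advertise several, clients pick one

async def handle(reader: asyncio.StreamReader,
                 writer: asyncio.StreamWriter) -> None:
    peer = writer.get_extra_info("peername")   # (client IP, client port) pair
    # ... real protocol handling would go here ...
    writer.close()
    await writer.wait_closed()

async def main() -> None:
    # One listening socket per port; each gives another ~64k-connection
    # budget per client IP behind an aggressive NAT.
    servers = [await asyncio.start_server(handle, "0.0.0.0", p) for p in PORTS]
    await asyncio.gather(*(s.serve_forever() for s in servers))

asyncio.run(main())
```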
Yeah, ChatGPT is better overall, but with a twist: Gemini can actually be the best if you use AI Studio, tweak the config, and set up a good system prompt. Kinda like how nano banana is SOTA but Qwen-Edit feels more useful since it's less censored. Meanwhile ChatGPT is starting to feel slower and kinda showing its age. Another example is Veo 3 being SOTA while the infamous Grok is technically worse but doing better, and OpenAI's Sora is pretty much dead.
edit: I run a low-profile service that localizes e-commerce photos, like taking Alibaba listings and swapping the model to look like a local. With nano banana I can't automate it because I have to manually check whether the output got blocked (anything with female skin is risky; underwear or cleavage is 100% blocked), but Qwen-Edit just does the job without fuss.
Gemini in AI Studio is so much better than on gemini.google.com / in the app. You would think that signals they're going for devs over consumers, but they're a consumer company. A real head-scratcher.
I believe it was TikTok's U.S. operations that were sold for $14B. The rest of the world is still under Chinese ownership... hmm, did I just say 'the rest of the world is still under Chinese ownership'?
I’ve tried that too, trying to detect the scan layout to get better OCR, but it didn’t really beat a fine-tuned Qwen 2.5 VLM 7B. I’d say fine-tuning is the way to go
What's the cost of the fine-tuned model? If you were attempting to optimize for cost, would it be worth it to detect scan layouts to get better OCR?
Honestly, I'm such a noob in this space. I had one project I needed to do, didn't want to do it by hand (which would have taken 2 days), so I spent 5 trying to get a script to do it for me.
The model runs on an H200 in ~20s, costing about $2.4/hr. On an L4 it's cheaper at ~$0.3/hr but takes ~85s to finish. Overall, the H200 ends up cheaper at volume. My scan has a separate issue though: each page has two columns, so text from the right side sometimes overflows into the left. OCR can't really tell where sentences start and end unless the layout is split by column.
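Splitting the columns before OCR can be as simple as cropping the page image down the middle; a rough sketch only, assuming a fixed 50% split and a hypothetical run_ocr() standing in for whatever model call you're actually using:

```python
from PIL import Image

def split_columns(page_path: str) -> tuple[Image.Image, Image.Image]:
    page = Image.open(page_path)
    w, h = page.size
    mid = w // 2                        # assumes the gutter sits at the midpoint
    left = page.crop((0, 0, mid, h))    # crop box is (left, upper, right, lower)
    right = page.crop((mid, 0, w, h))
    return left, right

# left, right = split_columns("scan_page_001.png")   # hypothetical filename
# text = run_ocr(left) + run_ocr(right)              # read the left column first
```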