Hacker News | AlexB138's comments

Isn't the Windows subsystem for Linux (the reference there) also a VM?

Only WSL2; WSL1 was an actual subsystem.

So this is Darwin/BSD Subsystem for Linux 2.

Yes.

WSL1 was so cool, WSL2 made it boring and isolated.

WSL1 was very conceptually appealing, and ended up working very poorly because of the poor matching between Linux syscalls and the Windows kernel. Git suffered terribly as a result. The inverse is also somewhat true - there have been cases where Wine is much slower than native Windows because Linux simply doesn't provide a simple way to achieve the same outcome, and interestingly the Wine developers have had reasonable (if tediously slow) success in making it possible to express the same semantics to Linux and have it handle things fast. It would be fascinating to know whether WSL1 developers didn't have enough traction to get Windows internals altered to match, or whether it's just way harder to do the same under Windows.

It did work quite well. The problem with the filesystem could have been solved by optimizing the Windows kernel, which would also have benefited programs run outside WSL, by the way (NTFS has performance problems and Microsoft knows, and even provided a kind of solution as far as I know with the developer FS or whatever they call it).

The thing that I don't like about WSL2 is that it is just a VM, and a very limited one. For example, working in embedded development, I often need to use serial ports or USB devices, which WSL2 can't do (unless passing through USB/IP, which has its compatibility issues, especially for things like debuggers that need precise timing), and which WSL1 could do, at least for serial ports. This limitation keeps me from using WSL. The same goes for any other software that wants native access to the machine's peripherals (e.g. a GPU or another PCI card, something that to be fair is not even doable, as far as I know, with hypervisors on Windows, but is completely doable with hypervisors running on Linux, where through the IOMMU you can pass any PCI device of the host to the VM).

WSL1 was a great idea; too bad Microsoft abandoned it for something that is only good for web application development.


> (NTFS has performance problems and Microsoft knows, and even provided a kind of solution as far as I know with the developer FS or whatever they call it)

NTFS does not have performance problems. The difference between DevDrive, which uses ReFS (arguably a more 'resilient' file system than NTFS due to journaling), and a standard NTFS volume is that the file system filters are either removed or, in the case of Defender, put in async mode.

The file system filter architecture is the performance problem, not the file systems. It's a trade off to have a more extensible I/O stack.


I recall there was also an issue with how paths are treated in NT. I don't fully recall, but I think NT paths are parsed by the kernel early on, and the whole kernel operates on "cooked" paths. This had some major performance implications for WSL1, in addition to the filter driver architecture.

I also don't remember why they couldn't just bypass the filter stack for paths in a certain volume - WSL2-like I/O on WSL1 - but there must have been a reason.


> The problem with the filesystem could have been solved by optimizing the Windows kernel

Over time this would tie the Windows kernel’s requirements so that they matched the Linux kernel’s due to expectations from WSL1 users. This of course is a bad idea for any engineering organization - you will have requirements imposed on you that don’t mesh well with your other non-WSL users and you also have no real sway over Linux governance. This would lead to the Windows kernel either becoming a clone of Linux or serving at least one set of users poorly.


Why would you work on embedded development through a VM? Out of curiosity.

Wine achieves better performance these days due to things like... adding a module to the Linux kernel that implements NT-like synchronization primitives. So, Linux subsystem for NT synchronization basically. (a.k.a. NTSync)

Maybe this works out better because Linux is more flexible, while Windows/NT is more "set in its ways" and therefore more difficult to implement Linux on top of... Maybe?


It's my understanding that a big part of WSL1 performance loss comes from the relatively thick layered filesystem architecture on Windows.

Since git and nodejs are both common in modern development and are expected to work efficiently with huge numbers of files, this was a real bottleneck and it couldn't easily be tackled without threatening backward compatibility.


Back in my day you had to download a couple GB worth of cygwin, and that wasn't an actual environment, basically just a GNU toolchain compiled for windows. But it got you like....grep and bash and stuff that ran natively on windows which was kinda cool.

Do any older folk here remember when NT was the Cool New Thing (TM) and it had by design support for multiple subsystems plopped over the NT API, and Win32 was just one of them alongside POSIX (Interix) and OS/2? There was even a _very short_ time span when Interix was actually usable (it was extremely short though)

I guess that makes me square within the 'older folk' subset - I continued to use the NT core with LiteSTEP alongside the SGI/IRIX Octane2 well after Y2K.

In those days I was working on a rework of the TRO PLATO learning system, which was a real beast but essential for the individual learning project of a charter school I was supporting.

PLATO had been taken from its dedicated mainframe world and made 'runnable' on W95 workstations with an NT server, but it really didn't run well, and the kids could get behind the interface into the regular Windows environment too readily. In combination, the workstations were crazy hard to keep running cleanly.

So in the end, we had to take the software out of Windows, wash it clean in the waters of Silicon Graphics System-V with BSD extensions (X11) Unix and BSD - NeXTSTEP, just so we could bring it back to Windows properly using LiteStep.

Life happened and I lost touch with the outcome of it all, moving on to my next project; but, I kept a LiteSTEP desktop until moving entirely over to Linux in 2004.

Haven't used Windows for anything but a gaming load since '05 and stopped doing even that in about 2010, nothing later than XP.


Yes, the only reason I cared for Linux in the first place was that the POSIX support wasn't that good.

I am convinced that if the POSIX subsystem had been a serious UNIX, GNU/Linux would never have taken off on the PC, and the whole world would be divided between SGI, HP-UX, Solaris, AIX and Windows NT.


There were already better free options than Linux when Linux first started gaining traction.

The reason Linux grew in the 90s was because it was part of the hacker culture. Not because better options didn’t exist.

Kids liked the fact that Linux was a free-for-all, anything-goes, platform. It wasn’t stuffy like Unix and it wasn’t proprietary like Windows.

Then those kids grew up and became decision makers themselves. And we started to see Linux replace FreeBSD and commercial Unixes.


Actually Linux was very SysV like back in the day, so it was more like the stuffy OS's that people liked.

GCC was the real catalyst. Even Sun, which had used bundled dev tools as an early selling point, was unbundling them and charging more; many x86 UNIXes like SCO didn't even come with a TCP/IP stack without an extra fee... and you couldn't take C code from HP to another system and actually have it compile.

As Solaris is really just a SysV-ification of the BSD-ish SunOS, with the introduction of POSIX as a least common denominator and Linux being closer to the commercial-ish unixes, it was just an easier sell for a lot of users.

In hindsight it may seem silly, but in many projects I was involved with, Linux using SysV /etc/init.d/ vs BSD's /etc/rc.conf was the driving factor, because /etc/rc.conf was a shared dependency and made it harder for us to modularize projects.

IMHO the real Linux advantage is that it was using the gnu user land, and thus gcc worked well with it and companies started to sell commercial support early.

But there were still flavor wars from all sides all the time, and being an ex-op on #unix and #unixhelp from the 1990s, I dealt with them all.

But BSD, and heck, even ITS etc., were the free-for-all, anything-goes platforms of record.


> IMHO the real Linux advantage is that it was using the gnu user land, and thus gcc worked well with it and companies started to sell commercial support early.

IMHO what really differentiated Linux were

a. the bazaar development approach, which lowered barriers to contribution, felt more transparent and "safer" with regards to what was going on in kernel land

b. the GPL, which, while annoying to certain companies due to its viral nature, at least guaranteed that no competitor could just develop a major innovation, grab the kernel and all of your contributions and run with them, undercutting you in the process

and also a noteworthy mention was the fact the BSDs were basically sabotaged by AT&T via their nefarious set of lawsuits, which nipped in the bud any semblance of advantage they had


> and also a noteworthy mention was the fact the BSDs were basically sabotaged by AT&T via their nefarious set of lawsuits, which nipped in the bud any semblance of advantage they had

People keep saying that but I saw zero evidence of those lawsuits factoring into any purchasing decisions that customers made.

I saw Solaris SPARC servers purchased for running Informix RDBMS.

I saw Solaris deployed for payroll systems running Oracle middleware.

I saw FreeBSD servers built for web hosting

I saw FreeBSD servers built for ISP backend services

But at no point in the 90s did I see anyone running Linux commercially. In fact the only reason I ran Linux (Slackware) in the 90s was to see what all the fuss was about from my nerdy younger peers on IRC. And even then, I just threw it on a desktop PC.

In the 90s you had NeXTSTEP workstations used to build games intended for PCs (like id Software did with Doom and Quake), and used at CERN for the development of the WWW.

UNIX was the 90s platform of choice for computer animation. It was the platform of choice for multi-tenant web hosting. And so on and so forth.

Much as Linux had the cool hacker community, 90s UNIX systems had superior ACLs, containerisation, faster TCP/IP stacks, significantly more stable file system drivers and so on and so forth. So people naturally chose UNIX for their important systems. And that’s exactly the trend I personally experienced in the 90s.

This isn’t to say that I think the unix wars had “zero effect” on the decline of unix, but I do personally think the amount of impact it had is massively overestimated. I think Linux would have taken over regardless because the Linux culture embraced everyone’s weird ideas vs UNIX systems that did extensive gatekeeping. And the kids that played with Linux because it was fun and hacking was encouraged, grew up and became influential in decision making.

I think the culture of Linux had more to do with Linux’s growth than anything else.

Personally, I don’t think the license made any difference here. I do get the arguments people make about GPL, but GPL was around since before Linux and it didn’t gain significant traction then. But like most of the opinions I’ve shared above, it’s an impossible point to prove either way.


You’re talking about architecture but I was talking about development culture.

Linux encouraged people to fork and experiment with it, whereas FreeBSD was a carefully maintained ecosystem.


Which ones? BSD was tied in a lawsuit that left doubts on its future.

Minix was a toy OS for university teachings.

Coherent was commercial.

Nothing else was there on the PC market.


386BSD and its derivatives (eg FreeBSD) weren’t really attacked by SCO like other UNIXes were. In fact SCO filed more lawsuits against Linux than they did (for example) FreeBSD.

FreeBSD was also used heavily in the late 90s in ISPs and similar domains.


I think you are possibly a decade off on the timing here.

USL v. BSDi is what impacted the BSD side, and it was during that lawsuit, before Novell bought USL etc., that the problems arose which allowed Linux to make gains while the Net/2 distros were in a waiting game, IMHO.

The timing absolutely helped Linux and GNU being packaged as a complete system by the various distros etc. Common OSS distribution points like Walnut Creek and PHT were very much concerned about USL v. BSDi, and in an era when you had to make long distance phone calls to download with a modem, a lack of CD-ROMs etc. absolutely caused a dip in adoption of the BSDs.

By the time the IBM v. SCO lawsuits happened (2003) the UNIX wars were long gone and Linux was already established.

SCO/Interactive/Coherent/etc. and other x86ish UNIXes were quite common in my work in the early 1990s, but the whole unix wars is way too complicated to cover in a single post.

The post-.com-bubble SCO lawsuits really just didn't matter much. The consolidation in the early 90s that ended the UNIX wars, plus Intel killing most of the commercial unix vendors' independent CPUs with Itanium untruths and impossible promises, and the major vendors' inability to adapt to a lower-margin model etc., killed those off.

The SCO lawsuits were really just the flailing of a dying company, which was the end result of WordPerfect buying Novell with Novell's money and local Utah politics.


Sorry, I don’t think my point was very clear. I wasn’t saying that SCO sued Linux in the 90s nor that the UNIX wars had zero impact.

Just that FreeBSD was still used a lot in the 90s and managed (at least from what I experienced) to dodge most of the concerns that companies had deploying other UNIXes.

I mean, it’s not like UNIX use dropped to zero overnight.

So you did see a lot of Internet companies using FreeBSD as their platform of choice. For a while, it really did look like FreeBSD was becoming the dominant server platform in that domain. Not everyone took Linux seriously at that time. It wasn’t until at least 99 that Linux became a viable competitor to FreeBSD.

But once Linux did gain favour its popularity skyrocketed. Which is exactly why SCO took various Linux shops to court.


Nobody said SCO sued BSD or BSD users. USL sued BSD and UC (https://en.wikipedia.org/wiki/UNIX_System_Laboratories,_Inc.....) long before the SCO lawsuits.

Even in that case, it was one suit and it was settled before FreeBSD was ever released.

Which simply wasn’t enough drama to persuade businesses on smaller budgets away from using FreeBSD.


Those only came to be after the AT&T lawsuit was cleared, and by then Linux already had enough wind behind it.

Also, the SCO lawsuit was more due to IBM's money than Linux.

Both are a different situation from Windows NT being available a decade earlier.


You’re sidestepping my point that FreeBSD was in widespread use in the 90s.

My point about SCO wasn’t clear though. I was just saying FreeBSD wasn’t as embroiled in the UNIX wars as the others, ie referencing SCO vs Linux to demonstrate how even Linux suffered more time in the courts than FreeBSD did.


Not at all, except for Hotmail and Yahoo, I never saw it being used personally.

In fact, had I not bought a set of Walnut Creek CD-ROMs, I would never have used it in the first place, and I never have again since those days, excluding derivatives like macOS and Orbis OS.

Which is why I asserted that with good POSIX support, the world today would probably be the Windows NT lineage on PCs, plus the commercial UNIXes everywhere else.


You work for mainly Windows shops though don’t you?

My experience was very different in the 90s.

Solaris, FreeBSD and NeXT were very widely used. The only times I saw NT was in edu, government, and a random publishing house (which ran pirated copies of NT 4 on the servers and Mac OS 8 everywhere else).

That publisher is an interesting chapter in my career on its own actually…


The BSDs would be much bigger today if it wasn't for AT&T going after them hard in the early '90s, exactly when both they and Linux were starting to pick up speed. I think that things could have gone very differently, in quite unpredictable ways, if the BSDs were bigger and more popular (it's not like they haven't been popular anyway though - see Darwin, or the Playstation OS for instance)

Cygwin was fun. I'd done zero development on Windows, but about 10 years ago I had to figure out how to deploy some nightly shell scripts across a bunch of local computers in a few dozen offices, where about 80% were MacOS and the rest were Windows. I don't remember exactly how I rigged it, but basically cygwin allowed me to keep the scripts as they were and trigger them in place, with a few small modifications.

I never want to deal with that again ;)

[edit] fwiw, Termux on Android is similarly a fun pseudo-environment. It's a nice and helpful toy.


The biggest issue I remember is directory separators... windows of course using \ which bash would then interpret as an escape. Cygwin mostly papered over that from what I can recall, but it could lead to some weirdness, like sometimes you'd get C:\\path\\es\\like\\this

We should be using the baguette emoji for path separators for cross-platform compatibility.

https://old.reddit.com/r/ProgrammerHumor/comments/96ufiz/pro...


You could also use forward slashes, like C:/path/subpath, which has worked since Windows 1.0/DOS 2.0.

That's handy when you're entering paths in a Cygwin/MSYS Bash shell, but might not help much if you're trying to parse or otherwise work with existing path variables composed with backslashes.
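
For existing backslash-composed path strings, one way to normalize them outside the shell is sketched below in Python. This is only an illustration, not something from the thread: the helper name `to_forward_slashes` is made up for the example, and it relies on the standard library's `pathlib.PureWindowsPath`, which parses Windows-style paths on any host OS without touching the filesystem.

```python
from pathlib import PureWindowsPath


def to_forward_slashes(path: str) -> str:
    """Return a Windows path string rewritten with forward slashes.

    Hypothetical helper for illustration. PureWindowsPath only parses
    the string; it never checks whether the path exists.
    """
    return PureWindowsPath(path).as_posix()


if __name__ == "__main__":
    # A raw string keeps the backslashes literal, avoiding the
    # escape-character problem described above.
    print(to_forward_slashes(r"C:\path\subpath"))  # prints C:/path/subpath
```

Whether the forward-slash form is then accepted still depends on the program consuming it, as the replies below point out.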


Yes, you could if you were entering them manually, but some apps that generated file names would screw it up. I think they were using some sort of stdlib function to get the path separator. Forward slash paths working in native windows apps also wasn't quite a given, either. Keep in mind this was a loooong time ago... like windows xp era maybe, even.

Yeah, I recall directory paths being the biggest PITA with running scripts in cygwin. But I mean, that was a very minor set of things to fix compared to what would've had to be written in anything else available at the time.

Doing retail office deployments of custom code on employee computers is a weird niche, and you find whatever works and hope you can maintain it somehow. Cygwin was awesome though, saved me a ton of time and the client a lot of money for the moment. (The client later stipulated to all future franchisees that they had to buy only Macs, lol)


Always used / and it worked for both cygwin/windows lands.

> Back in my day you had to download a couple GB worth of cygwin

You still can, and it still works exactly the same way.


It's true, but to be honest the MinGW-built stuff that ships with git for Windows has been enough since WSL took off.

what do you mean? that's still the only way to work as a human in windows. wsl1 almost replaced it, but obviously they scrapped it.

if you must use windows, it's because you will compile for windows. so you install MSYS, which is a linux distro-ish compiled native for windows. and do your work.

wsl2 (and this apple thing) is just a meme. if you're working in it, you're better off just installing Linux or ssh'ing to a server.


> wsl2 (and this apple thing) is just a meme. if you're working in it, you're better off just installing Linux or ssh'ing to a server.

Many enterprises allow windows only so your way into Linux is via WSL2


There is also git-bash, which usually doesn’t need administrator rights to be installed.

https://git-scm.com/install/windows


shrug. I haven't owned a Windows machine in years at this point. It's one of those things like PHP that I just decided my life was better off without.

... Now it's just called git bash

Just install and use MSYS2, git bash is derived from it anyway, and a regular MSYS2 installation offers a lot more.

It was soooo slow though. Practically unusable for anything i/o heavy.

Those issues could have been fixed…

WSL 1 is long gone for all practical purposes, yet it still dominates conversations.

Also everyone on FOSS gets it wrong, WSL wasn't a subsystem like classical Windows NT ones.

It was based on Drawbridge research using picoprocesses, a new approach for library OSes.

https://learn.microsoft.com/en-us/archive/blogs/wsl/pico-pro...


> Also everyone on FOSS gets it wrong, WSL wasn't a subsystem like classical Windows NT ones.

Everyone in FOSS? How about Microsoft got it wrong, since they actually named it The Windows Subsystem for Linux (WSL)? It wasn't the FOSS community who chose the name for them.


What has that to do with a version number and not keeping up with the times?

What version number? WSL1 vs WSL2?

I'm not sure if you see the quoted part. My comment is about the part that starts with "> " that you wrote earlier.


Most companies use the term "discretionary PTO". That means that there is no set limit on PTO. The positive take on it is that this means employees can take time off within reason so long as they're getting their work done. The negative take is that it means you have no guaranteed days you can take, and cultural or managerial pressure will prevent you from taking even a normal amount of vacation.

It also means that employees don't accrue PTO days, and therefore don't have to be paid out for that time when they're fired.


Does this unlimited PTO still have to adhere to any legally required minimum PTO limits? If not, what prevents them from just not giving their employees any time off ever and bypassing the peer pressure part entirely?

PTO regulations are created by the individual States. None require PTO to exist. They do regulate accrual of PTO if it exists, sometimes with unintended consequences for employees.

The origin story is that "discretionary PTO" was created to enable people to take longer vacations than was feasible within the regulatory constraints of accrual-based PTO. It can be abused in other ways, but the intent of the people who invented it was employee-friendly.


It does not. Nothing.

Maybe in the US, but in countries with minimum holiday time you get the minimum in your contract (or a bit more) and the employee handbook says you have unlimited. Companies can’t shirk their responsibility here legally by saying they give unlimited vacation.

"Contracted minimum with more at manager's discretion" isn't what people usually mean when they talk about unlimited pto.

Sure, my point is that the way it works in the US does not work in many other places.

Right. Places without unlimited PTO get neither the upsides nor the downsides of unlimited PTO.

The shift from the term "Unlimited PTO" to "Discretionary PTO" has happened because early proponents realized it wasn't really unlimited, and they didn't want workers to think that way. But the "unlimited" term is still used to sell it, and still often appears in informal recruiting conversations.

It's just so slimy.


Yeah, the current reality of it isn't great at a lot of companies. I've been places where it was done well though. For instance, having a mandatory minimum number of days of vacation helps combat pressure to not take time off, and leaders who openly encourage people to take their time helps combat a culture of not taking time.

It started as a positive thing, intending to trust the employees and give flexibility. Unfortunately, like a lot of things, sleazy leaders turn flexibility into manipulation.


Reminds me of "Unlimited data" plans from ISPs, which are actually limited, but they just don't want to tell you about them.

Anytime something is marketed as unlimited, it's not.


"Well it's not a deceptive trade practice because no rational person would take such a hyperbolic or outlandish claim literally - much like 'best ISP in the UNIVERSE!' or advertisements suggesting that beer will make you fit and attractive."

The youngest baby boomers are in their early 60s. I doubt it will make a difference in tech, but traditional industries, or what is left of them, should see a lot of senior roles open up as baby boomers begin to retire. Then, as they begin to pass away, a lot of their accumulated wealth will pass to their heirs as well.

The baby boomers have been a serious "clog" in the system at a lot of levels. It will be interesting to see how things play out once they're no longer actively involved.


~$0 of baby boomer wealth will pass down.

It's all going to be taken by end of life care companies who charge $20k a month (yes really) to put you in a small room and have a teen with barely a highschool diploma check in on you every now and again, for minimum wage.

Every dime of wealth the boomers collected will be captured by a few private capital orgs who prepared for this. It will never flow down.


In 2024, the Silent Generation and baby boomers represented 25% of the population, but held 65% of all wealth in the US.

The healthcare industry will profit a great deal, but there is no way they will capture 65% of all wealth in the country. And supposing they did: ultimately, all this wealth ends up on some household's balance sheet unless it goes abroad.


Yet we have ample evidence that extremely little of the massive increases in wealth, value, and productivity of the past 50 or so years has ended up in the hands of normal people.

Why would this be any different?

You can make all sorts of hay about "Oh that money will be invested so it will still benefit normal people" but no, the vast majority of the wealth the boomers gathered will be controlled by the new aristocracy. They will do with it as they please and thus you should expect it to almost entirely benefit them first and foremost.

Private equity prepared for this windfall. Even if they scoop up only ten percent of it or so, that's an enormous transfer of wealth from average families to the hyperwealthy. That's another entire chunk of the pie that is just removed from the continually shrinking normal person economy.

A giant fraction of that wealth is in real estate: Homes. Those homes should be passed down to the next generation, alleviating our insane house prices somewhat, but instead they will come under ownership of the same companies who currently monopolize specific rental markets to ensure the price goes up as much as they want, and they will absolutely rent out those new properties, in such a way that they do not reduce prices.

That housing stock is completely and totally captured by these private equity firms that own the end of life care companies.


i suspect, based off some light reading around this, that there will be even more of a wealth transfer in the coming 20-30 years as the boomers fall off. medical costs for advanced age, asset prices like housing falling off in certain areas, mismanagement of retirement funds, and even just continuing the mentality of “i can’t take it with me, so might as well spend”

As I understand it, the fee doesn't apply in many situations and is fairly easy to work around. Apparently it was neutered immediately after being announced.


I had the same questions. Apparently discovery of the prior conviction is what led to them being fired:

> When the company discovered Sohaib Akhter’s felony conviction, it terminated both brothers’ employment during an online remote meeting on Feb. 18, 2025

from https://www.justice.gov/opa/pr/federal-jury-convicts-virgina... which is a better source on this.

That prompts the question of why background checks are so lax that they were hired before this was discovered.


The company involved here is apparently based in Washington, DC, which has a "Ban the Box" ordinance that limits employment background checks for most kinds of jobs. And apparently DC's version of the law is particularly strict.


That prevents them from asking before extending an offer, but it seems they could (and should) have checked after.[0]

> However, an employer may ask about criminal conviction(s) after extending a conditional offer of employment (the employer can never ask about arrests or criminal accusations that aren't pending). An employer who properly asks about a criminal conviction can only withdraw the offer or take adverse action against the applicant for a legitimate business reason that is reasonable under the six factors* listed in the Act.

One of the six factors is "Fitness or ability of the person to perform one or more job duties or responsibilities given the offense"[1], which they probably could have invoked after asking (though they never checked or didn't check thoroughly enough, so I guess it's moot).

[0]https://ohr.dc.gov/page/returning-citizens-and-employment

[1]https://ohr.dc.gov/sites/default/files/dc/sites/ohr/publicat...


Shouldn't this force companies that need to pass a SOC2 out of the district? Doesn't SOC2 require background investigation of personnel with access to sensitive systems?


There isn't too little demand. There is massive demand and many competing companies trying to capture that demand, so they are attempting to make better offers than their competition. Hence subsidy.


That, and:

- Every competitor is planning for the demand to be much higher in a few years than it is now, and aiming to capture as much of that as they can, which starts by getting companies hooked on their models now

- The data center capacity will get used no matter who captures the most demand


I can somewhat understand companies getting users dependent on their harnesses or workflow, but with model vendors, as in this DeepSeek case, I have absolutely 0 model loyalty when it's a simple config change away, and will always optimize for either capability or price (or whatever !/$ metric you can determine).


Depends what you’re doing. For example, Gemini is somehow still your only option if you need a model that can natively understand video and reference timestamps in its response.


You're right, of course, but I would qualify this under "optimize for capability".


Github has published some incredible usage rate increase numbers, which they ascribe to the rise of agentic coding. At some point, they are going to have to change rate limits, cut free-tier usage, or find some other path to reducing load. It's clear that their infrastructure can't keep up with this significant increase, and it's unlikely that they're going to just absorb the increased costs themselves.

Very curious to see what the future holds for Github.


From the GitHub COO on April 3rd:

    Platform activity is surging. There were 1 billion commits in 2025.
    Now, it's 275 million per week, on pace for 14 billion this year if
    growth remains linear (spoiler: it won't.)

    GitHub Actions has grown from 500M minutes/week in 2023 to 1B minutes/week
    in 2025, and now 2.1B minutes so far this week.

    So we're pushing incredibly hard on more CPUs, scaling services, and
    strengthening GitHub’s core features.
https://x.com/kdaigle/status/2040164759836778878

They also had a recent blog post about availability: https://github.blog/news-insights/company-news/an-update-on-...

I don't envy the scaling issues the GitHub engineers are facing! #HugOps
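
As a quick sanity check on the quoted projection, the arithmetic can be sketched in a few lines of Python. The only inputs are the figures in the quote above (275 million commits per week, 1 billion commits in 2025); the 52-week year and the function name `projected_annual` are assumptions made for this illustration.

```python
WEEKS_PER_YEAR = 52  # assumption: a flat 52-week year, no growth


def projected_annual(per_week: int, weeks: int = WEEKS_PER_YEAR) -> int:
    """Linear projection: hold the weekly rate constant for a year."""
    return per_week * weeks


if __name__ == "__main__":
    commits_per_week = 275_000_000   # from the quote
    commits_2025 = 1_000_000_000     # from the quote

    projected = projected_annual(commits_per_week)
    print(f"{projected / 1e9:.1f} billion commits")       # 14.3 billion commits
    print(f"{projected / commits_2025:.1f}x the 2025 total")  # 14.3x the 2025 total
```

So "on pace for 14 billion" matches a constant weekly rate over 52 weeks; any continued growth would push the total higher, which is the point of the "spoiler".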


After the Microsoft acquisition GH marketing and pricing put an immense amount of effort[1] into trying to kill secondary platforms that integrated into github and move more corporate accounts fully on-platform. We recently dropped travis for github actions and dropped reviewable for github PRs (which are terrible).

There's a portion of this that is agentic driven and there's a portion of this that's just github making their own bed.

1. Arguably anticompetitive pricing like MSFT is used to doing with the office suite.


In other words, the set of github core services has expanded because you don't use third party tooling for some of those services anymore.


For us, yes - and likely for a lot of other users. I'm not certain who else has dealt with the headache of being migrated off their legacy pricing plan but it ends up pushing those internal offerings a lot harder than the old approach did so if they're seeing successful conversions it's likely they're seeing significantly more load from mature codebases with expensive CI/CD pipelines.


That sounds like their classic EEE


It's extremely interesting how fast this happened. Either AI use surged massively in the last quarter, or this is a very sneaky move by Anthropic. Looking at my own stats, I don't think I'm using Claude Code much more than I used to, but my commits have gone way up. I have a feeling they've tuned the models recently to commit more often, which gives the illusion of more work being done.


> Either AI use surged massively in the last quarter

December 2025 is considered by many people to be a major step function in agentic coding (both due to improvements in harnesses and LLMs themselves). I know my coding has forever changed since then.

Before I was basically always hands on the keyboard while working with AI. Now I'm running experiments with multiple agents over the weekend, only periodically checking in if they have any questions or need further instruction.

The last quarter is where I personally first started to see how this was all going to change things (despite having worked on both the research and product side of AI for the last few years).

> I have a feeling they've tuned the models recently to commit more often, which gives the illusion of more work being done.

Agents certainly are committing more often, but I know, at least for these projects, there really is work being done. An example: I had an agent auto-researching a forecast I was working on. This is something I've done manually for over a decade now. The iteration process is tedious and time consuming, and would often take weeks of setting up and ultimately poorly documenting many, many experiments to see what works. Now I can "set it and forget it", and get the same results I would have in hours (with much more surface area covered and much better documentation). Each experiment is a branch (or work-tree) so yes there are a lot of commits happening, but the results are measurably real.

I often think the big divide related to the success with agents is whether or not the quality of one's work can be objectively measured. For those of us doing work that can be measured, the impact of agents is still hard to comprehend.


> Each experiment is a branch (or work-tree) so yes there are a lot of commits happening, but the results are measurably real.

If you are correct, and GitHub is scaling its compute mostly as a reaction to this externality (agents churning through code that will mostly be discarded), then you can look forward to getting billed for your usage. After all, it is hard to build a scalable system without back-pressure.


I've already started moving my personal projects off github and onto forgejo running on my homelab. I know a lot of people doing the same. With a hermes-agent for a sysadmin I can debug problems from my phone, so I wouldn't be surprised if I have more "9s" than GH.

But if it ends up costing extra for GH, especially for work usage, then it's just a simple calculation of "is this worth it?" which I suspect for most cases will be 'yes'.


> [...]it's just a simple calculation of "is this worth it?" which I suspect for most cases will be 'yes'

Once the landgrab-stage flat-pricing goes away, it will become a case-by-case calculation because unsupervised agents can (and will) run up your billing with zero understanding of the business value of what they're instructed to solve.


> with zero understanding of the business value

What kind of products/services are you building where you aren't able to tie your eval suite to business value? If you can't, then why are you building whatever it is you are in the first place?

By far one of the biggest changes I think we'll see in things being built by agents is reducing the gap between code and value. The first stage is to start making it possible to measure quality (evals) and the second stage is to more closely align measurable quality with value. The business value of the tokens spent on my team was discussed my first day.

> Once the landgrab-stage flat-pricing goes away

Aside from the above point, I'm already running local LLMs on my homelab that, while not quite what I want for truly production work, have been able to iterate on and solve real, non-trivial research tasks for effectively zero cost (energy cost was roughly on par with running an old light bulb).

The way open, local models have been developing there will be many cases where if proprietary providers over-charge it won't be a deal breaker to just switch to local models. Not to mention that there are plenty of open, but non-local models that are already 5x cheaper and roughly on par with the mainstream model providers.


> What kind of products/services are you building where you aren't able to tie your eval suite to business value?

There are no evals in my org that can quantify the value of a proposed feature, rank it against ongoing support issues that pop up, or know when to stop expending effort when no solution has been found or too many unknowns crop up. We still rely on natural intelligence for that, and haven't YOLO'd (ha) on Independent agents. I'd rather quit than spend my day herding agents and have my job reduced to just a code-review monkey.

Benchmark evals are at least 3 degrees removed from actual business value - maybe less of your tasks are repetitive. None of the harnesses I've used have a sense of a compute budget - outside of Boolean think/no-thinking modes.


What's your setup? How are your agents not running out of context and becoming dumb as a rock after ~100k tokens? Do you have a heartbeat thing on spawning more agents every time?


The most important thing for any agentic task is to build up and continue to record context as a project develops.

The start of basically any project involves building up and documenting context around the project itself (and for a new company, the organization itself), this is kept at multiple levels of granularity (cross project, project specific, task specific, and human readable documentation). All experiments are planned out and documented as they go.

This becomes extremely important because after a weekend of running experiments stakeholders (and myself) often have questions, with everything in memory or some other stored context it's trivial to get answers to all sorts of questions.

Maybe it's because of this, but in both Claude Code and Codex I haven't run into any issues with models getting "dumb as a rock", even after compaction (or occasional full terminal crashes) they seem to have no trouble marching on.


Opus has 1M context now. In my experience it starts getting increasingly dumb after about 700k, but below that it is very usable. I don't think I've ever run out of context window since they brought that out.


Many things at once I suspect:

1. Models have got way better, which means you are far more likely to get something working. I know I used to have little 'tool'/'weekend projects' all the time that wouldn't get off the starting blocks before, now it takes a few minutes often to build them, and once I've built them I tend to want to have them saved on github. Quite how useful they turn out to be is another question though...

2. Related, because the models are a lot better I can generate far more code per unit time. On Sonnet last year I'd have to babysit the model and constantly 'steer' it, which meant a lot of the CC time was actually me reviewing it. Now with Opus 4.7 it can often just churn away for 10-30 minutes and get something reasonable.

3. Most importantly, just the volume of new users to coding agents - loads of new developers shipping far more, far more frequently.

4. Many users who were not on github, now signing up and pushing code to it. "Vibe coders" basically who don't have SWE experience and their agent tells them git would be a good idea.

Each of these would be a big increase in scale, but combined it is vvv high


I don't think commits per se puts pressure on the infrastructure.

More likely pulls and pushes, and, naturally, the ci minutes they identify as the main issue.


But CI only increased by a factor of 2 since last year. Did they really not foresee that happening? And how does that affect git and api operations?


It really shouldn't. The technical summary they released[1] is a very interesting read from a software engineering perspective. It seems to be blindsided by the increased traffic and gives stats related to commits/PRs (which should be relatively cheap for github to process) without any insight into their web traffic or details on how much actions are costing them. If they were super transparent they'd release information about their request response time and resourcing to fulfill that.

Their current path to resolution is to migrate their codebase to a new language[2], continue to drop their inhouse ops for Azure resources and get off MySQL. Maybe one or two of those steps are legitimately a good idea - I don't have an inside scoop - but technology migrations are always fraught with issues. It's quite possible these changes are just a result of them vibe-coding a mature codebase into a new language.

1. https://github.blog/news-insights/company-news/an-update-on-...

2. I'll grant that Ruby isn't the best language to use at scale but I think we're all old enough to realize that language choice is far less impactful on performance than code quality.


Azure’s core hypervisor orchestrator was half-baked at launch and it has never been fixed. This long read blog series explains a lot for me — for example, why the FedRamp certification program was never able to get a straight answer from Azure about how they handled secrets.

https://isolveproblems.substack.com/p/how-microsoft-vaporize...

https://www.kunalganglani.com/blog/microsoft-fedramp-failure...


Re 2, I would generally agree and there is a lot that can be done with caching. However, since writing services in Rust and Golang, there is whole other tier in speed. Architecture matters, code quality also matters, but Golang and Rust help a lot in making very fast services.


Yeah I don't disagree. To clarify. Rust, Golang etc - they give you a very noticeable advantage when it comes to writing good performant software with the assumption that you're putting in the effort on the design side. But poorly written Rust is likely going to be indistinguishable from poorly written Ruby.


> migrate their codebase to a new language[2], continue to drop their inhouse ops for Azure resources and get off MySQL

The recent blog post you're linking to mentioned moving data only for webhooks off MySQL, not all relational data used by the entire site; and moving "performance or scale sensitive code out of Ruby", again not the entire codebase.

Do you have an official source suggesting these migrations are more comprehensive than that?


I do not know - this is the only source I'm aware of and the wording is vague enough that the above is just my interpretation of it. It could be highly targeted but the manner of wording indicates a strong preference that smells of a large migration.


What part of the wording gives you that impression? On these topics, the post literally just says the following:

"bottlenecks that appeared faster than expected from moving webhooks to a different backend (out of MySQL)"

"Similarly, we accelerated parts of migrating performance or scale sensitive code out of Ruby monolith into Go" (in a paragraph specifically about "critical services like git and GitHub Actions")

Both of those sound highly targeted to me!


> While we were already in progress of migrating out of our smaller custom data centers into public cloud, we started working on path to multi cloud. This longer-term measure is necessary to achieve the level of resilience, low latency, and flexibility that will be needed in the future.

That paragraph read, to me at least, that the initial targeted changes were just the tip of the iceberg and that much heavier lifting than initially budgeted was now in scope.


"smaller custom data centers into public cloud" is talking about their Azure migration, so "multi cloud" would almost certainly mean extending a presence into AWS and/or GCP (or maybe others like OCI).

I'm sorry but I really don't see how you're drawing conclusions about this meaning a move off of Ruby and MySQL entirely. That's a huuuge logical leap away from what is written in this post, and you originally stated it in a way that indicated this was a fact.


It's the end of the free lunch era. Subsidizing groups like students or new users to gain market share worked as long as there weren't billions of them at the same time eating all compute from the paying customers. It's not working anymore for ai products.


Not a free lunch, data gold mine


I wonder how many of those actions are really necessary


And how many of those actions do uncached downloads instead of building self-contained offline images... Speaking of which, I wonder if GitHub has implemented any HTTP interception for common mirror sites, like used by apt, etc.


GitHub and WarpBuild cache is so slow it is often faster to re-download hundreds of MB each run than cache it properly.

I so wish this wasn't the case.


Many downloads now go over https. Intercepting them would require having certificates for those domains. IIRC on the clouds the standard images do have a sources list that points to mirrors on the cloud’s network. I would only presume Github Actions runners have the same.

Not sure if something similar exists for NPM which is big for all things JS.


Other CI/CD platforms usually push you towards using self-hosted mirrors for downloading large chunks of data (often aggressively so) but github is pretty hands off when it comes to actions. It is interesting to consider whether managing that traffic might be overwhelming them and if this can be traced back to a lack of forethought when it came to building out those tools.


Or how many pushes those commits are spread across; oh, neat, big number.


They can easily spin this as massive success. Uptime will only matter for a small number of users. Probably not true, but not far from the truth either. I'm a heavy Github user and I can't really say it's THAT bad. If something doesn't work, you can always fill your time with something else.


Wow, nice to see the relentless push for more AI slop finally paying some dividends back to the issuer.


For literally decades, I’ve observed that there are systems that make each operation cheap and systems that work hard to scale out. The former frequently seems to wildly outperform the latter.

GitHub, for example, seems to implement the main repository /pulls page as a search query, which is hinted at by the prefilled search bar and was mostly confirmed last week when the search backend failed and pull requests didn’t load. But it could have been implemented as a plain API call that just loads open pull requests, and that API exists and did not go down.

If GitHub focused a bit on identifying their top 95% of high level operations (page loads including resulting API calls, for example) and making them efficient, I bet they could get a 5x or better reduction in backend load by simplifying them.

(Don’t even get me started on the diff viewer. I realize that much of its awfulness is the horribly inefficient front end, which does not directly load the back end, but I expect there is plenty of room for improvement. The plain git command line features are very fast.)
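
To make the contrast above concrete, here is a minimal sketch of the two public REST API routes that can both answer "which PRs are open in this repo": the direct listing endpoint and the search endpoint. It only builds URLs and makes no network calls. Which of these the web UI actually uses internally is my guess from the outside, not something GitHub has confirmed, and the helper names and the `octocat/hello-world` repo are hypothetical.

```python
from urllib.parse import urlencode

API_ROOT = "https://api.github.com"


def list_open_pulls_url(owner, repo, per_page=30):
    """Direct listing: GET /repos/{owner}/{repo}/pulls?state=open."""
    query = urlencode({"state": "open", "per_page": per_page})
    return f"{API_ROOT}/repos/{owner}/{repo}/pulls?{query}"


def search_open_pulls_url(owner, repo, per_page=30):
    """Search route: GET /search/issues with repo/is:pr/is:open qualifiers."""
    qualifiers = f"repo:{owner}/{repo} is:pr is:open"
    query = urlencode({"q": qualifiers, "per_page": per_page})
    return f"{API_ROOT}/search/issues?{query}"


# Hypothetical repository, used only to show the shape of each request.
print(list_open_pulls_url("octocat", "hello-world"))
print(search_open_pulls_url("octocat", "hello-world"))
```

The first request can be answered from the primary data store; the second has to go through whatever index backs search, which is the dependency that appeared to fail last week.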


Are you telling me you don’t want a chat interface to greet you when you log in to GitHub?


That’s sort of orthogonal. But if GitHub actually invoked an LLM on initial page load, that would be about par for the course, and it would be amusing for GitHub to then complain that they’ve grown so quickly that their systems can’t keep up.


I noticed the same https://news.ycombinator.com/item?id=47940213. My working hypothesis is that, given that a filter was always required (prs and issues are likely rows in the same database with a bool property to distinguish them), someone thought it'd be good to use the search API uniformly. But search is on the derivative of the underlying data, in contrast to the specific APIs for listing issues and prs.


Working in an organization without a mono-repository I've actually found it extremely difficult to keep a tab on PRs and issues across multiple repositories. For a problem that should be resolved by a "For me" page that just lists out all your active incoming and outgoing PRs their multi-page solution involving search filters that often need to be reset feels extremely weak. I've worked on large multi-tenant solutions before and a page where you can "SELECT * FROM everything LIMIT 10" is the absolute last thing you want to give to users.

It is bizarre to me that so much of their tooling defaults to acting across the whole of github data points without guiding the user towards (or even making available as far as I can tell) a way to easily scope requests down outside of a complex search filter.


Do you mean like https://github.com/pulls and https://github.com/issues ?

These are in the top left hamburger menu from the Home dashboard (edit: actually on all pages).


Hey, that's awesome and nevermind me. I just got stumped by their UI.

There's probably a fair argument about how discoverable these are (especially given their labeling as "All Issues" and "All Pull Requests") but that tip is quite helpful to me personally. Thanks for sharing it, I really appreciate it!


And yet these are still (apparently) implemented as search queries instead of direct database queries.


There may be some magic they do to better optimize within-user-searching. It's something that they could hide in implementation details so we can't be sure unless they spill the beans but it's feasible - especially with the default search parameters they're using.

I'd still love something a bit more obvious and intuitive but if it's just a UX failure that makes me feel a lot better.


> There may be some magic they do to better optimize within-user-searching. It's something that they could hide in implementation details so we can't be sure unless they spill the beans but it's feasible - especially with the default search parameters they're using.

I would have believed this until last week when they had a little banner informing me that I might not see all the PRs but that they really were there and that I could use such-and-such API to find them myself.

If the direct API existed and was working, but the web UI wasn’t seeing the PRs, then presumably it meant that the web UI was not optimizing the default trivial search query to use the direct API.


Git itself is kind of a fundamentally computationally inefficient way to store and retrieve information. If the problem to solve were simply "store and version this text", 14 billion commits in a year would not even be considered a lot.

In other words, a centralized version control system built from the ground up to operate at scale would do far more for scalability than anything GitHub could possibly do to optimize their Git operations. Every major tech company (Amazon, Meta, Google, etc) is already doing something like this internally.

Though this would require people to start using a github-specific client rather than the traditional git+ssh. (Though the github client could still maintain a git repo locally, for compat.)


I can guarantee you one thing - github's problem isn't coming from git.

Considering all the ci/cd pipelines, PR & issue discussions, social media tracking, rich data and everything else that github hosts, if their true issue is the actual meat and potatoes of running git I would be gobsmacked.


What are you referring to when you say it's "fundamentally computationally inefficient"? It's pretty efficient because it's content-addressed, plus optimizations to reduce storage and data transfer with packfiles.
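
For anyone unfamiliar with what "content-addressed" means here, a minimal sketch: git names a file's contents by hashing a small header plus the bytes, so identical content always gets the same ID and is stored once. This assumes the default SHA-1 object format (not the newer SHA-256 one), and `git_blob_id` is just an illustrative name.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the object ID git assigns to a blob with this content.

    Git hashes the header b"blob <size>\\0" followed by the raw bytes.
    Assumes the default SHA-1 object format.
    """
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Same bytes always map to the same ID, regardless of filename or repo.
first = git_blob_id(b"hello world\n")
second = git_blob_id(b"hello world\n")
print(first, first == second)
```

The result should match what `git hash-object` reports for a file with the same bytes. Packfiles then add delta compression on top of this for storage and transfer.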


I suspect they were referring to some of the things git allows for non centralized version control. There are simplifications if you just wanted a centralized system like cvs had.


I think you need to broaden your focus here - I can't really remember any significant downtime before the Microsoft acquisition and the data supports my memories.

Microsoft bought Github and migrated to Azure, which explains the findings. The query performance was fine before they started serving from Azure.

I mean honestly, as though there isn't one single person competent enough to read some logs and horizontally scale a few read only dbs to meet demand? That's not it


> I think you need to broaden your focus here - I can't really remember any significant downtime before the Microsoft acquisition and the data supports my memories.

This is the opposite of my recollection, actually. I distinctly remember having conversations about Github struggling to scale well before MS was involved, and people claiming that MS had somehow saved Github because it had stabilized and begun adding features again.

> The query performance was fine before they started serving from Azure.

This may be correct though. The Azure migration seems more aligned with the timeline of struggling to scale.


> I distinctly remember having conversations about Github struggling to scale well before MS was involved

Do you have any sources to back your claim up? At what point did Github fail to scale their search endpoints?

> This may be correct

It is.


I don't know why this is downvoted. The data backs you up: https://damrnelson.github.io/github-historical-uptime/


I'm skeptical about that page's accuracy. For example, if you go to the breakdown tab, it shows Actions having 100% availability when the graph starts (Apr 2016), yet Actions didn't even exist until late 2018, and wasn't GA until a full year after that. So if the math behind the "average" tab is treating NULLs as 100% uptime, this just isn't a correct measurement.

The page also notes it obtains its data from the official status page, but big tech companies have been known to under-report outages. My general sense is they've gotten better about this in recent years; if so, that means historical data will give an erroneously rosy picture of uptime.
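
To illustrate the NULL concern with made-up numbers (these are not taken from that page, and I don't know how its math actually works), here is a minimal sketch of how counting "no data" as 100% inflates an average:

```python
from statistics import mean


def average_uptime(samples, treat_missing_as_up):
    """Average uptime percentages, where None means 'no data for that period'."""
    if treat_missing_as_up:
        values = [100.0 if s is None else s for s in samples]
    else:
        values = [s for s in samples if s is not None]
    return mean(values) if values else None


# Hypothetical service that did not exist for the first two periods.
hypothetical_actions = [None, None, 99.0, 98.0]

inflated = average_uptime(hypothetical_actions, treat_missing_as_up=True)
honest = average_uptime(hypothetical_actions, treat_missing_as_up=False)
print(inflated, honest)
```

The more pre-launch periods are counted as perfect, the rosier the early history looks compared with recent years.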


I think we can agree the data is correct enough to ascribe a trend with a strong statistical significance no? Enough to draw a conclusion


We can clearly draw a conclusion that their availability is getting worse, but that's not what your original comment claimed.

You said "I can't really remember any significant downtime before the Microsoft acquisition and the data supports my memories", but my memories differ (as do other commenters), and the accuracy of the supporting data seems questionable.


ok.


I mean, are any of the other forges, which I presume are also seeing logarithmic increase in commits, also failing as hard as Github?


I totally agree, you should expect a similar increase and degradation in Gitlab, which we do not see.


IMO, they're reaching the point of no return. I don't think they can horizontally-scale their way out of the hole they dug themselves unless they separate their free and paid infra maybe... which doesn't seem likely considering how their other infra changes are going.

In the same way you need to be 10x better for someone to consider switching to your product, if you get 10x worse your competitors get a free 10x by just standing still.


I think there's a very good chance you're right. Their reputation is obviously severely harmed, and high profile projects like Ghostty leaving may be a canary in the coalmine.

Something creative like separating their free and paid tiers may help them. I suspect the fact that all of this is happening to them along with their migration to Azure is probably complicating their ability to adapt their infrastructure.


What if I told you most enterprise customers don't even use the cloud offering and aren't impacted by any of this? Companies like Apple use GHES, and honestly that's where most of their revenue comes from, not the free offering.


IIRC back in the day they used to have an on-prem Enterprise product? I've never heard of anyone who actually used it though. IMO that would make a lot of sense for a medium-large organization--you still get the familiar Github product but you can take responsibility for your own uptime--like with Jira, Jenkins (nee Hudson), PyPI/Maven/etc.


I wonder if AWS resurrecting CodeCommit might be related. "For all of our warts, we still have a higher rep score than github" would not be an extraordinary thought at this point. There has been some brief chat about looking to github, and I'm so glad we never did. A previous company did migrate to github with no real answers on what the benefit was other than investors ask if your code is in github by name vs some other repo.


How can they not? Surely at GitHub scale there isn't a single component where they were relying on vertical scaling?


For all of its history (up to and including now possibly?) Github was a big Ruby on Rails monolith. [0] Obviously some things run in their own service, but I'm seeing the core github features fall apart which should be the features packed into the big monolith. If load is this much a problem, not being able to only vertically scale the processes that need the extra headroom is a big problem. Scaling horizontally by just throwing more machines at it, or at least cordoning-off some machines as "the ones that people actually pay for" is all I can think of for an application I can only describe as "accidentally working". Urgency is most-definitely high and that pushes decision making towards permanently-temporary patches instead of actual infra/architecture improvements.

[0] https://github.blog/engineering/architecture-optimization/bu...


A week ago GitHub published a blog post saying this, a day later GitHub execs were in HN comments repeating it, and just like that it’s common knowledge that GitHub’s steady reliability decline from 2019 onward was actually caused not by the 2019 Microsoft integration, but by something that did not exist until 2023. PR works, y’all. Turns out the reason GitHub doesn’t work is because it’s just so good!


I've been a strong proponent of reallocating all LinkedIn server capacity to GitHub.


this is an idea that i’d happily get behind.


They can't really cite the situation as a problem given their hand in creating and continuing it.


It's hard to talk about "them" as a singular entity. I bet that the "Copilot all the things!!11" faction mostly does not consist of GitHub SREs.


The GitHub SREs are working for the Copilot company.


Satya Nadella at the LlamaCon event in April 2025: "I’d say maybe 20%, 30% of the code that is inside of our repos today and some of our projects are probably all written by software."

In particular Github, with its copilot-next initiative, has probably so much AI generated code inside today that fixing all these new performance problems will need lots of human developer brains.


It has literally had problems since the moment MS bought it, way before the AI gold rush.


The sysadmins didn't make any of those decisions.


I suppose the idiocy of their parent company is their job security.


Have they published incredible usage rate numbers somewhere? I saw their recent blog post about the outages[1] and it has a graph without axis labels and without any context around usage before 2019 to indicate just how much this agentic acceleration has actually increased usage growth.

1. https://github.blog/news-insights/company-news/an-update-on-...


Isn't the data that flows through Github so valuable that they (Microsoft) are happy to eat the cost?

I don't have a clear idea how that value can be captured, since it's going to be 90% AI generated code that anyone can scrape (public projects) or can't be used (private projects), so perhaps you're right.


> Isn't the data they capture so valuable that they (Microsoft) are happy to eat the cost?

Even if that is true, unless the value of the data corresponds to near-term revenue, then eventually the cost may simply not be possible to meet. Or for that matter, the capital to manage the increasing load may simply not exist - it does not matter how much valuable data you have, if the supply of hardware cannot keep up with your demand.

Also, I suspect that most of the "data" obtained by the incessant hammering on GitHub is not very valuable. Most business code is routine, and getting Copilot to help out with generating enormous amounts of it may not contribute much in return.


> 90% AI generated code

And it isn't even clear yet if the AI generated code is even particularly valuable since it's legally ambiguous as to whether or not any human ownership can be attributed to it.

The US Copyright Office has declined copyrightability for genai artwork, it's only a matter of time before the same question comes up about code.


Your claim is incorrect. Something purely AI generated may not be covered by copyright in the US. That would make it more valuable to MS as you can reuse it as you like.

However, works with significant human input are covered by copyright, and most code does have such input. Human review, and correction is very common. There is a lot of AI generated code out there, and there are no cases challenging the copyright on it.

You also need to look beyond US law. Software is a global business and most software businesses do not want to write software they can only sell in certain countries.


> However, works with significant human input are covered by copyright, and most code does have such input. Human review, and correction is very common. There is a lot of AI generated code out there, and there are no cases challenging the copyright on it.

Legislation and court decisions are still pending. There are numerous lawsuits about copyrightability of output, and right of use of copyrighted work by LLMs, and both could have ramifications for code. I don't see how it's materially different to tell Claude Code to write you a function fetching an entry from a database, and telling ChatGPT to generate you a picture of a unicorn riding a bicycle. Both have the same level of input (desired end goal), both might go through review and updates (no, pink unicorn; no, cache the database connection).

Legal challenges over code copyright are relatively rare nowadays, so I wouldn't take lack of high profile lawsuits as proof of legality / copyrightability.

And yes, this will also depend on jurisdiction. Court decisions or laws can change that. Litigation over copyright infringement via training and reproduction is ongoing in multiple jurisdictions, and it wouldn't be shocking to me if at least some decide that it is indeed copyright infringement to pirate content to train LLMs that can reproduce it.


If I write a program of 1000 lines of code, with AI features turned off, then turn the AI features on and use a completion to edit one function, can my program not be copyrighted? (I expect/hope you’ll say: “Of course it’s still eligible for copyright”)

How about if I write 100 lines myself, turn the AI features on, vibe code 100 lines, and repeat this for five cycles? Half the functions are AI coded and half the functions I wrote myself. How about if I just tell Claude to write the program?

And what if I tell Claude to write the program, and then spend six months tweaking most of the lines of code?

I struggle to see a specific and obvious point where a line should be drawn. It seems intuitive to me that if I spend at least a few days' worth of effort on a code base (whether tweaking, correcting, or directing AI to do targeted refactors), that is meaningful human authorship even if it has thousands of lines of generated code.

I can, however, acknowledge the fairness that something which is simply one-shot output probably shouldn’t merit protection. But really, in any of these cases, it’s going to be pretty hard to prove after the fact exactly what the proportion of generated code to human authorship is, so idk how a court will really tell whether a repo with 20,000 LOC is one-shot or actually had a person spend a few weeks tweaking it.


> And what if I tell Claude to write the program

Why should this be any different than when telling/paying a human to write the program?

You're free to enter an agreement assigning all rights to the employer or the worker, to license the work ir/revocably and/or non/transferably. There is no need to wait for a court decision to understand what the results will be.


If that function is all you ask it to write as a one off, maybe. However, if that function is part of a larger system that is human designed it is very different. If you review and correct the code in the system it is very different.

Pages 27 and 28 of this are relevant to this: https://www.copyright.gov/ai/Copyright-and-Artificial-Intell...


> I don't have a clear idea how that value can be captured, since it's going to be 90% AI generated code that anyone can scrape (public projects) or can't be used (private projects), so perhaps you're right.

The value is probably in knowing which AI-generated code ends up being pushed or discarded, which can't be derived from public projects. This information can then be used to finetune the next big model so it only generates the "good" code.


It's easier for them to scrape than it is for anyone else. They also have a lot more metadata about the code which may be useful.

Do GitHub's terms entirely prevent them from making use of data in private projects?


> or can't be used (private projects)

As if they cared about that


It's a bit hard to blindly trust their numbers when they are trying very hard to sell Copilot to everyone.

Sure, AI will undoubtedly have increased their workload, but how much of the shown figures is real, and how much is the PR department trying to make it look like Copilot & friends is a massive success?


It means that Skynet is winning.

What you described above will piss off and alienate even more people. Eventually there will be a critical threshold crossed. Microslop will be the first victim of Skynet 11.0 (I lost track of its current version but you can see how much damage is caused by AI in general now - this was the beginning of skynet. Except that it sucks).


The same company operates the Xbox network. More daily active users and more events per second


Xbox network was _designed_ for such concurrency, GitHub is Ruby on Rails + Vitess (MySQL).


Not comparable at all. Xbox would be mostly transient traffic. It's probably not much more than packet forwarding for a lot of traffic.

Github is a giant complicated stateful mess with a lot of reads and writes. It also has a lot of features at this point. Hard to scale and hard to optimize.


I think this is minimizing the Xbox platform. They are also a massive digital distribution platform where almost every game is a digital download now.

That being said, you are correct. It is absolutely no surprise to me that Actions has the worst uptime.


Do they run Xbox network on Azure or is it a separate thing?


Huh, so vibe coding really is the reason GitHub has been down so much lately!


Github naturally scales horizontally.

Usage numbers are the PR reason. Vibecoding insanity at Microsoft is the more plausible actual culprit.


> Github naturally scales horizontally

Not necessarily, a few years ago they had some crucial information stuck in a single MySQL cluster (so write constrained) and were working on sharding it, but struggling: https://github.blog/engineering/infrastructure/partitioning-...


So maybe it's AI that's responsible for both ends. The increased traffic and the lessened product.


The disappointing thing is that if you do some digging, you'll find the majority of it is slop and just outright spam. There's a page on GitHub where you can see recently updated repositories and it's very rare I see anything of quality on there.

GitHub has become a dumping ground for broken code and it has more bots than ever. As much as I hate ID verification it might be a necessary evil at this point because clearly their anti-bot measures aren't working.


Doesn't YouTube suffer from the rise in AI videos?


Can you share where they published that?


Their COO has talked about it extensively on X. A sibling comment in this thread posted a link here: https://news.ycombinator.com/item?id=48011075


Amazing that Microsoft didn't see this coming after aggressively pushing AI everywhere for years.


This may lead to some interesting gamesmanship. For instance, if I am applying to a company, and I know they use a certain applicant tracking system, and I know that ATS uses a certain model provider for its filter, I should then use that model to write the version of my resume I send to the company.


Good observation. There are so many versions of the future that just become an LLM arms race.


This is such an important point. So much of inflation is not $THING used to cost $X and now costs $Y, but that $THING is significantly lower quality than it used to be. Quality is famously difficult to quantify (Pirsig), so it is much easier to manipulate it without people noticing. A product that looks the same, but is slightly worse, at purchase time is a lot harder to identify than the same product that costs 20% more, so businesses prefer it.

That happens incrementally over years, until the product is a shadow of its former self.


I have found basically no way to buy books online where they don't arrive damaged at this point. I've gone through multiple return/rebuy cycles with Amazon trying to get an undamaged copy and have just given up. I don't know if it's my local distribution center, but it's something like 90% damaged on arrival at this point.

Amazon has had massive quality reduction over the years in their service, but this one and the poor-quality knock-off books are the ones that bother me most.

