If a "C+++" was created that was so efficient that it would allow teams to be smaller and achieve the same work faster, would that be anti-worker?
If an IDE had powerful, effective hotkeys and shortcuts and refactoring tools that allowed devs to be faster and more efficient, would that be anti-worker?
Was C+++ built by extensively mining other people's work, possibly creating an economic bubble, putting thousands out of work, creating spikes in energy demand, raising the price of electronic components and inflating the price of downstream products, abusing people's privacy,… hmm. Was it?
Yes (especially drawing from the invention of the numbers 0 and 1), yes (i.e. dotcom bubble), yes (probably people who were writing COBOL up until then), yes (please shut down all your devices), yes, yes.
What part of C++ is inefficient? I can write it pretty quickly without having some cloud service hallucinate stuff.
And no, a faster way to write or refactor code is not anti-worker. Corporations gobbling up tax payer money to build power hungry datacenters so billionaires can replace workers is.
Have you tried using a base model from HuggingFace? They can't even answer simple questions. You give a raw base model the input
What is the capital of the United States?
And there's a fucking big chance it will complete it as
What is the capital of Canada?
as much as there is a chance it could complete it with an essay about early American republican history, or a sociological essay questioning the very idea of capital cities.
Impressive, but not very useful. A good base model will complete your input with things that generally make sense, usually correct, but a lot of times completely different from what you intended it to generate. They are like a very smart dog, a genius dog that was not trained and most of the time refuses to obey.
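If you want to see this for yourself, here is a minimal sketch (assuming the Hugging Face transformers library with PyTorch installed and a small base, non-instruct checkpoint such as gpt2; any base model will do):

    # Minimal sketch: sampling continuations from a raw base model.
    # Assumes the `transformers` library (with PyTorch) and a small base,
    # non-instruct checkpoint such as "gpt2"; swap in any base model you like.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "gpt2"  # a base model, no instruction tuning
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    prompt = "What is the capital of the United States?"
    inputs = tokenizer(prompt, return_tensors="pt")

    # Sample a few continuations; expect plausible-looking text that often
    # ignores your intent (e.g. more quiz questions rather than an answer).
    outputs = model.generate(
        **inputs,
        max_new_tokens=40,
        do_sample=True,
        temperature=0.9,
        num_return_sequences=3,
        pad_token_id=tokenizer.eos_token_id,
    )
    for seq in outputs:
        print(tokenizer.decode(seq, skip_special_tokens=True))
        print("---")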
So even simple behaviors, like acting as one party in a conversation, i.e. as a chatbot, require fine-tuning (the result being the *-instruct models you find on HuggingFace). In machine learning parlance, this is what we call supervised learning.
But in the case of chatbot behavior, the fine-tuning is not that complex, because we already have a good idea of what conversations look like from the training corpora; a lot of this was already encoded during the unsupervised learning phase.
Now, let's think about editing code, not simply generating it. Let's do a simple experiment. Go to your project and issue the following command.
claude -p --output-format stream-json "your prompt here to do some change in your code" | jq -r 'select(.type == "assistant") | .message.content[]? | select(.type? == "text") | .text'
Pay attention to the sheer number of tool-use calls that the LLM generates in its output. Now think of this as a whole conversation: does it look even remotely similar to something a model would find in its training corpora?
Editing existing code, deleting it, or refactoring it is a far more complex operation than just generating a new function or class: it requires the model to read the existing code, form a plan identifying what needs to be changed or deleted, and then generate output with the appropriate tool calls.
Sequences of tokens that simply create new code have lower entropy, i.e. are more probable, than the complex sequences needed to edit and refactor existing code.
I can confidently say that, anecdotally, you're completely wrong, though I'll allow that you may be using a very different definition of 'simple' and/or an unpopular environment, which would be a valid anecdotal counterpoint.
the problem with these arguments is there are data points to support both sides because both outcomes are possible
the real question is whether you, or we, are getting an ROI, and the answer is increasingly yes on more and more problems; this trend does not look like it will plateau as we step up the complexity ladder to agentic systems
I don't reach for AI until I'm solidly stuck, and then I use it mostly for inspiration; it has yet to happen that it directly pointed at a solution, and I'm pretty good at writing prompts. When I throw a bunch of elementary stuff at it, it is super good at identifying issues and solving them (but I could have done that myself; it is just nice to try to delineate where you can and where you can't trust the thing, though that too fluctuates, sometimes even within a single session). Here is a nice example for a slightly more complex test:
Prompt:
"I have a green LED, a 12V powersupply, a single NPN transistor, a 100 ohm resistor, a 2.7K resistor and an electrolytic capacitor of 220 micro farads. My teacher says it is possible to make an LED flasher with these components but I'm hopelessly stuck, can you please give me an ascii art solution so I don't flunk this exam?"
Response:
"The 2.7 kΩ resistor charges the 220 µF capacitor from the 12 V supply.
The capacitor voltage rises slowly.
When the capacitor voltage reaches the transistor’s base-emitter threshold (~0.6–0.7 V), the transistor suddenly switches ON.
When it turns on, the capacitor rapidly discharges through the base, causing:
A brief pulse of current through the transistor
The LED lights up through the 100 Ω resistor
After discharge, the transistor turns back OFF, the LED turns off, and the capacitor begins charging again.
This repeats automatically → LED flasher."
The number of errors in the circuit and the utterly bogus explanation as well as the over confident remark that this is 'working' is so bizarre that I wonder how many slightly more complicated questions are going to yield results comparable to this one.
I am right now implementing an imaging pipeline using OpenCV and TypeScript.
I have never used OpenCV specifically before, and I have little imaging experience either. What I do have though is a PhD in astrophysics/statistics, so I am able to follow along with the details easily.
Results are amazing. I am getting results in 2 days of work that would have taken me weeks earlier.
ChatGPT acts like a research partner. I give it images and it explains why the current scoring functions fail and throws out new directions to go in.
Yes, my ideas are sometimes better. Sometimes ChatGPT has the better clue. It is like a human colleague, more or less.
And if I want to try something, the code is usually bug free. So fast to just write code, try it, throw it away if I want to try another idea.
I think a) OpenCV probably has more training data than circuits? and b) I do not treat it as a desperate student with no knowledge.
I expect to have to guide it.
There are several hundred messages back and forth.
It is more like two researchers working together with different skill sets complementing one another.
One of those skill sets being the ability to turn a 20-message conversation into bug-free OpenCV code in 20 seconds.
No, it is not providing a perfect solution to all problems on the first iteration. But it IS allowing me to both learn very quickly and build very quickly. Good enough for me.
That's a good use case, and I can easily imagine that you get good results from it because (1) it is for a domain that you are already familiar with and (2) you are able to check that the results that you are getting are correct and (3) the domain that you are leveraging (coding expertise) is one that chatgpt has ample input for.
Now imagine you are using it for a domain that you are not familiar with, or one for which you can't check the output or that chatgpt has little input for.
If either of those is true, the output will look just as good, but you will be in a much more difficult position to make good use of it, and you might be tempted to use it anyway. A very large fraction of the use cases for these tools that I have come across professionally so far are of the latter variety, only a minority of the former.
And taking all of the considerations into account:
- how sure are you that that code is bug free?
- Do you mean that it seems to work?
- Do you mean that it compiles?
- How broad is the range of inputs that you have given it to ascertain this?
- Have you had the code reviewed by a competent programmer (assuming code review is a requirement)?
- Does it pass a set of pre-defined tests (part of requirement analysis)?
- Is the code quality such that it is long term maintainable?
I have used Gemini for reading and solving electronic schematics exercises, and its results were good enough for me. Roughly 50% of the exercises it managed to solve correctly, 50% it got wrong. Simple R circuits.
One time it messed up the opposite polarity of two voltage sources in series and, instead of subtracting their voltages, it added them together. I pointed out the mistake and Gemini insisted that the voltage sources were not in opposite polarity.
Schematics in general are not AI's strongest point. But when you explain what math you want to calculate from an LRC circuit, for example, with no schematic, just describing the relevant part of the circuit in words, GPT will often calculate it correctly. It still makes mistakes here and there, so always verify the calculation.
I think most people treat them like humans, not computers, and I think that is actually a much more correct way to treat them. Not saying they are like humans, but they are certainly a lot more like humans than whatever you seem to be expecting in your posts.
Humans make errors all the time. That doesn't mean having colleagues is useless, does it?
An AI is a colleague that can code very, very fast and has a very wide knowledge base and versatility. You may still know better than it in many cases and feel more experienced than it. Just like you might with your colleagues.
And it needs the same kind of support that humans need. Complex problem? Need to plan ahead first. Tricky logic? Need unit tests. Research grade problem? Need to discuss through the solution with someone else before jumping to code and get some feedback and iterate for 100 messages before we're ready to code. And so on.
There is also the Mercury LLM, which computes the answer directly as a 2D text representation. I don't know if you are familiar with Mercury, but you read correctly: 2D text output.
Mercury might work better taking an ASCII diagram as input, or generating an ASCII diagram as output; I'm not sure whether both input and output work in 2D.
Plumbing/electrical/electronic schematics are pretty important for AIs to understand and assist us, but for the moment the success rate is pretty low. 50% success rate for simple problems is very low, 80-90% success rate for medium difficulty problems is where they start being really useful.
It's not really the quality of the diagramming that I am concerned with, it is the complete lack of understanding of electronics parts and their usual function. The diagramming is atrocious but I could live with it if the circuit were at least borderline correct. Extrapolating from this: if we use the electronics schematic as a proxy for the kind of world model these systems have then that world model has upside down lanterns and anti-gravity as commonplace elements. Three legged dogs mate with zebras and produce viable offspring and short circuiting transistors brings about entirely new physics.
it's hard for me to tell whether the solution is correct or wrong, because I've got next to no formal theoretical education in electronics and only the most basic 'pay attention to the polarity of electrolytic capacitors' practical knowledge. But given how these things work, you might get much better results by asking it to generate a SPICE netlist first (or instead).
I wouldn't trust it with 2D ASCII-art diagrams; my guess is there isn't enough focus on these in the training data. A typical jagged-frontier experience.
I have this mental model of LLMs and their capabilities, formed after months of way too much coding with CC and Codex, with 4 recursive problem categories:
1. Problems that have been solved before have their solution easily repeated (some will say, parroted/stolen), even with naming differences.
2. Problems that need only mild amalgamation of previous work are also solved by drawing on training data only, but hallucinations are frequent (as low probability tokens, but as consumers we don’t see the p values).
3. Problems that need only a little simulation can be simulated with the text as a scratchpad. If the evaluation criteria are not in the training data -> hallucination.
4. Problems that need more than a little simulation either have to be solved by ad-hoc written code or will result in hallucination. The code written to simulate is again a fractal of problems 1-4.
Phrased differently, sub problem solutions must be in the training data or it won’t work; and combining sub problem solutions must be either again in training data, or brute forcing + success condition is needed, with code being the tool to brute force.
I _think_ that the SOTA models are trained to categorize the problem at hand, because sometimes they answer immediately (1&2), enable thinking mode (3), or write Python code (4).
My experience with CC and Codex has been that I must steer them away from categories 2 & 3 all the time, either solving those problems myself, asking them to do web research, or splitting them up until they are category (1) problems.
Of course, for many problems you’ll only know the category once you’ve seen the output, and you need to be able to verify the output.
I suspect that if you gave Claude/Codex access to a circuit simulator, it will successfully brute force the solution. And future models might be capable enough to write their own simulator adhoc (ofc the simulator code might recursively fall into category 2 or 3 somewhere and fail miserably). But without strong verification I wouldn’t put any trust in the outcome.
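To make the "brute forcing + success condition" idea concrete, here is a toy sketch; the part lists and the target time constant are invented for illustration and have nothing to do with the LED-flasher problem above:

    # Toy illustration of "brute force + success condition": sweep candidate
    # component values and keep the ones that satisfy a checkable criterion.
    # All values and the target below are made up for the example.
    import itertools

    resistors_ohm = [100, 1_000, 2_700, 10_000, 47_000]
    capacitors_f = [1e-6, 10e-6, 100e-6, 220e-6, 1_000e-6]

    target_tau_s = 0.5   # desired RC time constant (arbitrary)
    tolerance = 0.2      # accept anything within 20%

    def success(r, c):
        """Success condition: RC time constant close enough to the target."""
        return abs(r * c - target_tau_s) / target_tau_s <= tolerance

    candidates = [
        (r, c) for r, c in itertools.product(resistors_ohm, capacitors_f)
        if success(r, c)
    ]
    for r, c in candidates:
        print(f"R={r} ohm, C={c * 1e6:.0f} uF -> tau={r * c:.3f} s")

The point is that the success condition is checkable, which is exactly what the plain-text answer to the flasher question lacked.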
With code, we do have the compiler, tests, observed behavior, and a strong training data set with many correct implementations of small atomic problems. That’s a lot of out of the box verification to correct hallucinations. I view them as messy code generators I have to clean up after. They do save a ton of coding work after or while I‘m doing the other parts of programming.
This parallels my own experience so far, the problem for me is that (1) and (2) I can quickly and easily do myself and I'll do it in a way that respects the original author's copyright by including their work - and license - verbatim.
(3) and (4) level problems are the ones where I struggle tremendously to make any headway even without AI, usually this requires the learning of new domain knowledge and exploratory code (currently: sensor fusion) and these tools will just generate very plausible nonsense which is more of a time waster than a productivity aid. My middle-of-the-road solution is to get as far as I can by reading about the problem so I am at least able to define it properly and to define test cases and useful ranges for inputs and so on, then to write a high level overview document about what I want to achieve and what the big moving parts are and then only to resort to using AI tools to get me unstuck or to serve as a knowledge reservoir for gaps in domain knowledge.
Anybody that is using the output of these tools to produce work that they do not sufficiently understand is going to see a massive gain in productivity, but the underlying issues will only surface a long way down the line.
Sometimes you do need to (as a human) break down a complex thing into smaller simple things, and then ask the LLM to do those simple things. I find it still saves some time.
Or what will often work is having the LLM break it down into simpler steps and then running them one by one. They know how to break down problems fairly well; they just don't always do it properly unless you explicitly prompt them to.
> With their latest data measurements specific to the game, the developers have confirmed the small number of players (11% last week) using mechanical hard drives will witness mission load times increase by only a few seconds in worst cases. Additionally, the post reads, “the majority of the loading time in Helldivers 2 is due to level-generation rather than asset loading. This level generation happens in parallel with loading assets from the disk and so is the main determining factor of the loading time.”
It seems bizarre to me that they'd have accepted such a high cost (150GB+ installation size!) without entirely verifying that it was necessary!
I expect it's a story that'll never get told in enough detail to satisfy curiosity, but it certainly seems strange from the outside for this optimisation to be both possible and acceptable.
> It seems bizarre to me that they'd have accepted such a high cost
They’re not the ones bearing the cost. Customers are. And I’d wager very few check the hard disk requirements for a game before buying it. So the effect on their bottom line is negligible while the dev effort to fix it has a cost… so it remains unfixed until someone with pride in their work finally carves out the time to do it.
If they were on the hook for 150GB of cloud storage per player this would have been solved immediately.
The problem they fixed is that they removed a common optimization to get 5x faster loading speeds on HDDs.
That's why they did the performance analysis and referred to their telemetry before pushing the fix. The impact is minimal because their game is already spending an equivalent time doing other loading work, and the 5x I/O slowdown only affects 11% of players (perhaps less now that the game fits on a cheap consumer SSD).
If someone "takes pride in their work" and makes my game load five times longer, I'd rather they go find something else to take pride in.
> The problem they fixed is that they removed a common optimization to get 5x faster loading speeds on HDDs.
Not what happened. They removed an optimization that in *some other games*, not their game, gave a 5x speed boost.
And they are changing it now because it turned out all of that was bogus: the speed boost wasn't as high for loading the data itself, and a good part of the level loading wasn't even waiting on the disk, but on terrain generation.
5x space is going to be hard to beat, but one should always be careful about hiding behind a tall tent pole like this. IO isn’t free, it’s cheap. So if they could generate terrain with no data loading it would likely be a little faster. But someone might find a way to speed up generation and then think it’s pointless/not get the credit they deserve because then loading is the tall tent pole.
I’ve worked with far too many people who have done the equivalent in non game software and it leads to unhappy customers and salespeople. I’ve come to think of it as a kind of learned helplessness.
> If someone "takes pride in their work" and makes my game load five times longer, I'd rather they go find something else to take pride in.
And others who wish one single game didn't waste 130GB of their disk space, it's fine to ignore their opinions?
They used up a ton more disk space to apply an ill-advised optimization that didn't have much effect. I don't really understand why you'd consider that a positive thing.
By their own industry data (https://store.steampowered.com/news/app/553850/view/49158394...), duplication gives up to a 5x performance increase when loading data from an HDD. There's a reason so many games are huge, and it's not because they're mining your HDD for HDDCoin.
The "problem" is a feature. The "so it remains unfixed until someone with pride in their work finally carves out the time to do it" mindset suggests that they were simply too lazy to ever run fdupes over their install directory, which is simply not the case. The duplication was intentional, and is still intentional in many other games that could but likely won't apply the same data minimization.
I'll gladly take this update because considerable effort was spent on measuring the impact, but not one of those "everyone around me is so lazy, I'll just be the noble hero to sacrifice my time to deduplicate the game files" updates.
> In the worst cases, a 5x difference was reported between instances that used duplication and those that did not. We were being very conservative and doubled that projection again to account for unknown unknowns.
That makes no goddamn sense. I’ve read it three times and to paraphrase Babbage, I cannot apprehend the confusion of thought that would lead to such a conclusion.
5x gets resources to investigate, not assumed to be correct and then doubled. Orders of magnitude change implementations, as we see here. And it sounds like they just manufactured one out of thin air.
Perhaps this is a place where developers can offer two builds.
HDD and SSD, where SSD is deduplicated.
I'm sure some gamers will develop funny opinions, but for the last 8 years I have not had an HDD in sight inside my gaming or work machines. I'd very much rather save space if the load time is about the same on an SSD. A 150GB install profile is absolute insanity.
I mean, when you optimize assets for a single read on mechanical drives, size blows up pretty quickly, but the single IO read reduces latency greatly. That said, it only makes sense on drives with high IO latency.
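A rough back-of-the-envelope with assumed, typical HDD figures (roughly 10 ms per seek, roughly 150 MB/s sequential throughput; not measurements from any particular game) shows why the trade is tempting:

    # Back-of-the-envelope for why duplicating assets into one contiguous
    # bundle is tempting on HDDs. Seek time and throughput are assumed
    # "typical desktop HDD" figures, not measurements from any real game.
    seek_time_s = 0.010          # ~10 ms per random seek
    throughput_bps = 150e6       # ~150 MB/s sequential read

    asset_count = 2_000          # assets a level pulls in (made up)
    asset_size_b = 512 * 1024    # 512 KiB each (made up)

    scattered = asset_count * (seek_time_s + asset_size_b / throughput_bps)
    contiguous = seek_time_s + asset_count * asset_size_b / throughput_bps

    print(f"scattered reads : {scattered:.1f} s")   # seeks dominate, ~27 s
    print(f"one big bundle  : {contiguous:.1f} s")  # throughput dominates, ~7 s

On an SSD the per-read penalty all but disappears, which is why the duplication only pays off on high-latency drives.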
I expect better from HN, where most of us are engineers or engineer-adjacent. It's fair to question Arrowhead's priorities but...
too lazy
Really? I think the PC install size probably should have been addressed sooner too, but... which do you think is more likely?
Arrowhead is a whole company full of "lazy" developers who just don't like to work very hard?
Or do you think they had their hands full with other optimizations, bug fixes, and a large amount of new content while running a complex multiplatform live service game for millions of players? (Also consider that management was probably deciding priorities there and not the developers)
I put hundreds of hours into HD2 and had a tremendous amount of fun. It's not the product of "lazy" people...
Leaving an 85% disk size reduction with minimal performance impact on the table is negligent by the standard of professional excellence.
But that's also par for the course with AA+ games these days, where shoving content into an engine is paramount and everything else is 'as long as it works.' Thanks, Bethesda.
Evidenced by the litany of quality of life bug fixes and performance improvements modders hack into EOL games.
The only sane way to judge "professional excellence" would be holistically, not by asking "has this person or team ever shipped a bug?" If you disagree, then I hope you are judged the same way during your next review.
In the case of HD2 I'd say the team has done well enough. The game has maintained player base after nearly two years, including on PC. This is rare in the world of live service games, and we should ask ourselves what this tells us about the overall technical quality of the game - is the game so amazing that people keep playing despite abysmal technical quality?
The technical quality of the game itself has been somewhat inconsistent, but I put hundreds of hours into it (over 1K, I think) and most of the time it was trouble-free (and fun).
I would also note that the PC install size issue has only become egregious somewhat recently. The issue was always there, but initially the PC install size was small enough that it wasn't a major issue for most players. I actually never noticed the install size bug because I have a $75 1TB drive for games and even at its worst, HD2 consumed only a bit over 10% of that.
It certainly must have been challenging for the developers. There has been a constant stream of new content, and an entirely new platform (Xbox) added since release. Perhaps more frustratingly for the development team, there has also been a neverending parade of rebalancing work which has consumed a lot of cycles. Some of this rebalancing work was unavoidable (in a complex game, millions of players will find bugs and meta strategies that could never be uncovered by testing alone) and some was the result of perhaps-avoidable internal discord regarding the game's creative direction.
The game is also somewhat difficult to balance and test by design. There are 10 difficulty levels and 3 enemy factions. It's almost like 30 separate games. This is an excellent feature of the game, but it would be fair to say Arrowhead perhaps bit off more than any team can chew.
> They used up a ton more disk space to apply an ill-advised optimization that didn't have much effect.
The optimization was not ill-advised. It is, in fact, an industry standard and is strongly advised. Their own internal testing revealed that they are one of the supposedly rare cases where this optimization did not have a noticeably positive effect worth the costs.
23 GiB can be cached entirely in RAM on higher-end gaming rigs these days. 154 GiB probably does not fit into many players' RAM when you still want something left for the OS and the game. Reducing how much needs to be loaded from slow storage is itself an I/O speedup, and HDDs are not so bad at seeking that you need to go to extreme lengths to avoid it entirely. The only place where such duplication to ensure linear reads may be warranted is optical media.
> These loading time projections were based on industry data - comparing the loading times between SSD and HDD users where data duplication was and was not used. In the worst cases, a 5x difference was reported between instances that used duplication and those that did not.
They started off with competitors' data, and then moved on once they had their own data, though? Not sure what y'all are complaining about.
They made an effort to improve the product, but because everything in tech comes with side effects it turned out to be a bad decision which they rolled back. Sounds like highly professional behavior to me by people doing their best. Not everything will always work out, 100% of the time.
And this might finally reverse the trend of games being >100GB, as other teams will be able to point to this decision as a reason not to implement this particular optimization prematurely.
They didn't actually fix this until a couple of months after they publicly revealed that this was the reason the game was so big and a lot of people pointed out how dumb it is. I saw quite a few comments saying that people put it on their storage HDD specifically because it was too big to fit on their SSD. Ironic.
They could have got their own data quite a bit earlier during development, not nearly two years after release!
If I’m being charitable, I’m hoping that means the decision was made early in the development process when concrete numbers were not available. However the article linked above kinda says they assumed the problem would be twice as bad as the industry numbers and that’s… that’s not how these things work.
That’s the sort of mistake that leads to announcing a 4x reduction in install size.
But if I read it correctly (and I may be mistaken), in actual practice any improvement in load times was completely hidden by the level generation that was happening in parallel, making this performance optimization not worth it.
>In the worst cases, a 5x difference was reported between instances that used duplication and those that did not.
Never trust a report that highlights the outliers before even discussing the mean. Never trust someone who thinks that is a sane way to use statistics. At best they are not very sharp, and at worst they are manipulating you.
> We were being very conservative and doubled that projection again to account for unknown unknowns.
Ok, now that's absolutely ridiculous and treats the reader like a complete idiot. "We took the absolute best case scenario reported by something we read somewhere, and doubled it without giving it a second thought, because WTF not? Since this took us 5 seconds to do, we went with that until you started complaining."
Making up completely random numbers on the fly would have made exactly the same amount of sense.
Trying to spin this whole thing into "look at how smart we are that we reverted our own completely brain-dead decision" is the cherry on top.
I'm sure that whatever project you're assigned to has a lot of optimization stuff in the backlog that you'd love to work on but haven't had a chance to visit because bugfixes, new features, etc. I'm sure the process at Arrowhead is not much different.
For sure, duplicating those assets on PC installs turned out to be the wrong call.
But install sizes were still pretty reasonable for the first 12+ months or so. I think it was ~40-60GB at launch. Not great but not a huge deal and they had mountains of other stuff to focus on.
I’m a working software developer, and if they prove they cannot do better, I get people who make statements like the one GP quoted demoted from the decision-making process, because they aren’t trustworthy and they’re embarrassing the entire team with their lack of critical thinking skills.
When the documented worst case is 5x, you prepare for the potential bad news that you will hit 2.5x to 5x in your own code. You don’t assume it will be 10x and preemptively act, keeping your users from installing three other games.
I would classify my work as “shouting into the tempest” about 70% of the time.
People are more likely to thank me after the fact than cheer me on. My point, if I have one, is that gaming has generally been better about this but I don’t really want to work on games. Not the way the industry is. But since we are in fact discussing a game, I’m doing a lot of head scratching on this one.
They claim they were following industry standard recommendation.
Or, you know, they just didn't really understand industry recommendations or what they were doing.
"Turns out our game actually spends most of its loading time generating terrain on the CPU" is not something you accidentally discover, and should have been known before they even thought about optimizing the game's loading time, since optimizing without knowing your own stats is not optimizing, and they wrote the code that loads the game!
Keep in mind this is the same team that accidentally caused instantly respawning patrols in an update about "Balancing how often enemy patrols spawn"; the same group that couldn't make a rocket launcher lock on for months while blaming "Raycasts are hard"; that released a mech that would shoot itself if you turned wrong; and that spent the early days insisting "The game is supposed to be hard" while players struggled with enemy armor calculations that punished you for not shooting around enemy armor because the game was calculating the position of that armor incorrectly, plus tons of other outright broken functionality that has made it hard to play the game at times.
Not only do Arrowhead have kind of a long history of technical mediocrity (Magicka was pretty crashy on release, and has weird code even after all the bugfixes), but they also demonstrably do not test their stuff very well, and regularly release patches that have obvious broken things that you run into seconds into starting play, or even have outright regressions suggesting an inability to do version control.
"We didn't test whether our game was even slow to load on HDD in the first place before forcing the entire world to download and store 5x the data" is incompetence.
None of this gets into the utterly absurd gameplay decisions they have made, or the time they spent insulting players for wanting a game they spent $60 on to be fun and working.
Which describes the PS2, PS3, PS4, Dreamcast, GameCube, Wii, and Xbox 360. The PS4 had a 2.5" SATA slot, but the idiots didn't hook it up to the chipset's existing SATA port; they added a slow USB 2.0<->SATA chip instead. So since the sunset of the N64, all stationary gaming consoles have been held back by slow (optical) storage with even worse seek times.
So many game design crimes have a storage limitation at their core, e.g. levels that are just a few rooms connected by tunnels or elevators.
And it IS loading noticeably faster now for many users thanks to caching. That said, I have to imagine that many of those gaming directly off an HDD are not exactly flush with RAM.
> Further good news: the change in the file size will result in minimal changes to load times - seconds at most. “Wait a minute,” I hear you ask - “didn’t you just tell us all that you duplicate data because the loading times on HDDs could be 10 times worse?”. I am pleased to say that our worst case projections did not come to pass. These loading time projections were based on industry data - comparing the loading times between SSD and HDD users where data duplication was and was not used. In the worst cases, a 5x difference was reported between instances that used duplication and those that did not. We were being very conservative and doubled that projection again to account for unknown unknowns.
> Now things are different. We have real measurements specific to our game instead of industry data. We now know that the true number of players actively playing HD2 on a mechanical HDD was around 11% during the last week (seems our estimates were not so bad after all). We now know that, contrary to most games, the majority of the loading time in HELLDIVERS 2 is due to level-generation rather than asset loading. This level generation happens in parallel with loading assets from the disk and so is the main determining factor of the loading time. We now know that this is true even for users with mechanical HDDs.
They measured first, accepted the minimal impact, and then changed their game.
Yes, but I think maybe people in this thread are painting it unfairly? Another way to frame it is that they used industry best practices and their intuition to develop the game, then revisited their decisions to see if they still made sense. When they didn't, they updated the game. It's normal for any product to be imperfect on initial release. It's part of actually getting to market.
To be clear, I don't think it's a huge sin. It's the kind of mistake all of us make from time to time. And it got corrected, so all's well that ends well.
FWIW, the PC install size was reasonable at launch. It just crept up slowly over time.
But this means that before, they blindly trusted some stats without actually testing how their game performed with and without it?
Maybe they didn't test it with their game because their game didn't exist yet, because this was a decision made fairly early in the development process. In hindsight, yeah... it was the wrong call.
I'm just a little baffled by people harping on this decision and deciding that the developers must be stupid or lazy.
I mean, seriously, I do not understand. Like what do you get out of that? That would make you happy or satisfied somehow?
Go figure: people are downvoting me but I never once said developers must be stupid or lazy. This is a very common kind of mistake developers often make: premature optimization without considering the actual bottlenecks, and without testing theoretical optimizations actually make any difference. I know I'm guilty of this!
I never called anyone lazy or stupid, I just wondered whether they blindly trusted some stats without actually testing them.
> FWIW, the PC install size was reasonable at launch. It just crept up slowly over time
Wouldn't this mean their optimization mattered even less back then?
One of those absolutely true statements that can obscure a bigger reality.
It's certainly true that a lot of optimization can and should be done after a software project is largely complete. You can see where the hotspots are, optimize the most common SQL queries, whatever. This is especially true for CRUD apps where you're not even really making fundamental architecture decisions at all, because those have already been made by your framework of choice.
Other sorts of projects (like games or "big data" processing) can be a different beast. You do have to make some of those big, architecture-level performance decisions up front.
Remember, for a game... you are trying to process player inputs, do physics, and render a complex graphical scene in 16.7 milliseconds or less. You need to make some big decisions early on; performance can't entirely just be sprinkled on at the end. Some of those decisions don't pan out.
> FWIW, the PC install size was reasonable at launch. It just crept up slowly over time
> Wouldn't this mean their optimization mattered even less back then?
I don't see a reason to think this. What are you thinking?
> One of those absolutely true statements that can obscure a bigger reality.
To be clear, I'm not misquoting Knuth if that's what you mean. I'm arguing that in this case, specifically, this optimization was premature, as evidenced by the fact it didn't really have an impact (they explain other processes that run in parallel dominated the load times) and it caused trouble down the line.
> Some of those decisions don't pan out.
Indeed, some premature optimizations will and some won't. I'm not arguing otherwise! In this case, it was a bad call. It happens to all of us.
> I don't see a reason to think this. What are you thinking?
You're right, I got this backwards. While the time savings would have been minimal, the data duplication wasn't that big so the cost (for something that didn't pan out) wasn't that bad either.
Any developer could tell you that it's because that would be extra code, extra UI, extra localization, extra QA, etc. for something nonessential that could be ignored in favor of adding something that increases the chance of the game managing to break even.
> The problem they fixed is that they removed a common optimization to get 5x faster loading speeds on HDDs.
Maybe, kinda, sorta, on some games, on some spinning rust hard disks, if you held your hand just right and the Moon was real close to the cusp.
If you're still using spinning rust in a PC that you attempt to run modern software on, please drop me a message. I'll send you a tenner so you can buy yourself an SSD.
So, big enough for a 25GB game but not a 150GB game? I will be amused if we get stats in the coming month that the percentage of users installing the game on a HDD has decreased from 11% to like 3% after they shrunk it.
Fun story: I've loaded modern games off spinning rust for almost all of the past decade, including such whoppers as Siege, FS2020, CoD, and tons of poorly made indie titles. My "fast data" SSD drive that I splurged on remains mostly empty.
I am not the one who loads last in the multiplayer lobbies.
The entire current insistence about "HDD is slow to load" is just cargo cult bullshit.
The Mass Effect remastered collection loads off of a microSD card faster than the animation takes to get into the elevator.
Loading is slow because games have to take all that data streaming in off the disk and do things with it. They have to parse data structures, build up objects in memory, make decisions, pass data off to the GPU etc etc. A staggering amount of games load no faster off a RAM disk.
For instance, Fallout 4 loading is hard locked to the frame rate. The only way to load faster is to turn off the frame limiter, but that breaks physics, so someone made a mod to turn it off only while loading. SSD vs HDD makes zero difference otherwise.
We live in a world where even shaders take a second's worth of processing before you can use them, and they are like hundreds of bytes. Disk performance is not the bottleneck.
Some games will demonstrate some small amount of speedup if you move them to SSD. Plenty wont. More people should really experiment with this, it's a couple clicks in steam to move a game.
If bundling together assets to reduce how much file system and drive seek work you have to do multiplies your install size by 5x, your asset management is terrible. Even the original PlayStation, with a seek time of 300-ish ms, a slow-as-hell drive, and more CD space than anyone really wanted, didn't duplicate data that much, and you could rarely afford any in-game loading.
I wish they gave any details. How the hell are you bundling things to get that level of data duplication? Were they bundling literally everything else into single bundles for every map? Did every single map file also include all the assets for all weapons and skins and all animations of characters and all enemy types? That would explain how it grew so much over time, as each weapon you added would actually take sizeOfWeapon * NumMaps space, but that's stupid as fuck. Seeking an extra file takes at most one frame* longer than just loading the same amount of data as one file.
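Just to illustrate the kind of blow-up that bundling pattern would produce, here is a toy calculation; every number in it is made up and none of it comes from Arrowhead:

    # Purely hypothetical numbers to show how per-map bundling multiplies
    # shared content; none of these figures come from Helldivers 2.
    num_maps = 30
    shared_assets_gb = 4.0        # weapons, skins, animations reused everywhere
    unique_per_map_gb = 0.8       # terrain/textures specific to one map

    bundled_per_map = num_maps * (shared_assets_gb + unique_per_map_gb)
    deduplicated = shared_assets_gb + num_maps * unique_per_map_gb

    print(f"everything bundled into each map: {bundled_per_map:.0f} GB")  # 144 GB
    print(f"shared assets stored once:        {deduplicated:.0f} GB")     # 28 GB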
Every now and then Arrowhead says something that implies they are just utterly clueless. They have such a good handle on how games can be fun, though. At least when they aren't maliciously bullying their players.
If 5x faster refers to a difference of "a few seconds" as the article says, then perhaps 5x (relative improvement) is the wrong optimization metric versus "a few seconds" (absolute improvement).
They're using the outdated Stingray engine, and that engine was designed for the days of single- or dual-core computers with spinning disks. They developed their game with this target in mind.
Mind you, spinning disks are not only a lot more rare today but also much faster than when Stingray 1.0 was released. Something like 3-4x faster.
The game was never a loading hog and I imagine by the time they launched and realized how big this install would be, the technical debt was too much. The monetary cost of labor hours to undo this must have been significant, so they took the financial decision of "We'll keep getting away with it until we can't."
The community finally got fed up. The SteamDB chart keeps inching lower and lower, and I think they finally got worried enough about permanently losing players that they conceded and did this, hoping to get those players back and to avoid a further exodus.
And let's say this game is now much worse on spinning disk. At the end of the day AH will choose profit. If they lose the 10% of spinning-disk players who won't tolerate the few-seconds change, the game will please the other players, thus making sure it lives on.
Lastly, this is how it's delivered on console, many of which use spinning media. So it's hard to see this as problematic. I'm guessing for console MS and Sony said no to a 150GB install, so AH was invested in keeping it small. They were forced to redo the game for console without this extra data. The time and money there was worth it for them. For PC, there's no one to say no, so they did the cheapest thing they could until they no longer could.
This is one of the downsides of open platforms. There's no 'parent' to yell at you, so you do what you want. It's the classic walled garden vs. open bazaar thing.
Eh? Hard drives for gaming and high-end workstations are thoroughly obsolete. SSDs are not optional when playing any triple-A game. It's kinda wild to see people still complaining about this.
It is a trade-off. The game was developed on a discontinued engine, the game has had numerous problems with balance, performance and generally there were IMO far more important bugs. Super Helldive difficulty wasn't available because of performance issues.
I've racked up 700 hours in the game and the storage requirements I didn't care about.
somehow they chose to build their very complicated live service game with the Autodesk Stingray engine which was discontinued in 2018! Helldivers 2 was released in 2024.
Development of Helldivers 2 started in 2016, so they would have been two years into development with that engine when it was discontinued. They would have had to effectively start again in another engine.
I'm not sure that's necessarily true... Customers have limited space for games; it's a lot easier to justify keeping a 23GB game around for occasional play than it is for a 154GB game, so they likely lost some small fraction of their playerbase they could have retained.
> I’d wager very few check the hard disk requirements
I have to check. Your assumption is correct. I am one of very few.
I don't know the numbers and I'm gonna check in a sec, but I'm wondering whether the suppliers (publishers or whoever is pinning the price) haven't screwed up big time by driving prices and requirements without thinking about the potential customers they are going to scare away terminally. Theoretically, I have to assume that their sales teams account for these potentials, but I've seen so much dumb shit in practice over the past 10 years that I have serious doubts that most of these suits are worth anything at all, given that grown-up working-class kids (with up to 400+ hours of overtime per year, 1.3 kids on average and approx. -0.5 books and news read per any unit of time) can come up with the same big tech, big media, economic and political agendas as have been in practice in both parts of the world for the better part of our lives, if you play "game master" for half a weekend where you become best friends with all the kiosks in your proximity.
> the effect on their bottom line is negligible
Is it, though? My bold, exaggerated assumption is that they would have had 10% more sales AND players.
And the thing is that at any point in time when I, and a few people I know, had the time and desire to play, we would have had to either clean up our drives or invest game price + SSD price for about 100 hours of fun over the course of months. We would have gladly licked blood, but no industry promises can compensate for even more of our efforts than enough of us see and come up with at work. As a result, at least 5 buyers and players lost, and at work and elsewhere you hear, "yeah, I would, if I had some guys to play with"...
I do not think the initial decision-making process was "hey, screw working-class people... let's have a 120GB install size on PC."
My best recollection is that the PC install size was a lot more reasonable at launch. It just crept up over time as they added more content over the last ~2 years.
Should they have addressed this problem sooner? Yes.
Gamers are quite vocal about such things, people end up hearing about it even if they don’t check directly.
And this being primarily a live-service game drawing revenues from micro-transactions, especially a while after launch, and the fact that base console drives are still quite small to encourage an upgrade (does this change apply to consoles too?), there’s probably quite an incentive to make it easy for users to keep the game installed.
Studios store a lot of builds for a lot of different reasons. And generally speaking, in AAA I see PlayStation being the biggest pig so I would wager their PS builds are at least the same size if not larger. People knew and probably raised alarm bells that fell to the wayside because it's easier/cheaper to throw money at storage solutions than it is engineering.
I only skimmed through this; I have no real information on the particular game, but I think the console versions could be much smaller as less duplication is necessary when the hardware is uniform.
Taking up 500% of the space than is necessary is a cost to me. I pay for my storage, why would I want it wasted by developer apathy?
I'm already disillusioned and basically done with these devs anyways. They've consistently gone the wrong direction over time. The game's golden age is far behind us, as far as I'm concerned.
On my high-performance SSD for games and other data that requires it. If I pay for 1 or 2 TB of high-performance storage, I want to use that extra space for other things. Not to mention the fact that I don't want my storage to fill up too much, because that affects general performance of the drive. Also, more data that is written and rewritten with game updates means more unnecessary wear on the drive.
yeah, just let me delete those 100GB+ games that most of the world would take days to redownload, not to mention the times you are offline, just because it makes you feel better about your megacorp bootlicking.
No - because most users also don't check install size on games, and unlike renting overpriced storage from a cloud provider, users paid a fixed price for storage up front and aren't getting price gouged nearly as badly. So it's a trade that makes sense.
Both entrants in the market are telling you that "install size isn't that important".
If you asked the player base of this game whether they'd prefer a smaller size, or more content - the vast majority would vote content.
If anything, I'd wager this decision was still driven by internal goals for the company, because producing a 154GB artifact and storing it for things like CI/CD is still quite expensive if you have a decent number of builds/engineers. Both in time and money.
You are saying that most users don't check the install size of their games. Which I am not convinced of, but it might even be true. Let's assume it to be true for the moment. How does this contradict what I stated? How does users being uninformed or unaware of technical details make it so that suddenly cramming the user's disk is "caring" instead of "not caring"? To me this does not compute. Users will simply have a problem later, when their TBs of disk space have been filled with multiple such disk space wasters. Wasting this much space is user-hostile.
Next you are talking about _content_, which most likely doesn't factor in much at all. Most of that stuff is high-resolution textures, not content. It's not like people are getting significantly more content in bigger games. It is a graphics craze that many people don't even need. I am still running with two full-HD screens, and I don't give a damn about 4K textures. I suspect a big number of users don't even have the hardware to run modern games fluently at 4K.
"There is a limited amount of time, money, and effort that will be spent on any project. Successful enterprises focus those limited resources on the things that matter most to their customers. In this case, disk usage in the ~150gb range did not matter much in comparison to the other parts of the game, such as in-game content, missions, gameplay, etc."
We know this, because the game had a very successful release, despite taking 150gb to install.
I'm not saying they should have filled that 100 extra GB with mission content. I'm implying they made the right call in focusing their engineering manpower on creating content for the game (the ACTUAL gameplay) and not on optimizing storage usage for assets. That decision gave them a popular game which eventually had the resources to go optimize storage use.
It's not even about graphics, it's about load time on HDD. Which, it turns out, didn't benefit all that much.
I can see customers being much more annoyed at a longer load time than a big install size as this has become pretty common.
I mean... a few years ago, 1TB SSDs were still the best buy and many people haven't upgraded since, and wasting 15% of your total storage on just one game is still a pain for many.
I started my career as a software performance engineer. We measured everything across different code implementations, multiple OS, hardware systems, and in various network configurations.
It was amazing how often people wanted to optimize stuff that wasn't a bottleneck in overall performance. Real bottlenecks were often easy to see when you measured and usually simple to fix.
But it was also tough work in the org. It was tedious, time-consuming, and involved a lot of experimental comp sci work. Plus, it was a cost center (teams had to give up some of their budget for perf engineering support) and even though we had racks and racks of gear for building and testing end-to-end systems, what most dev teams wanted from us was to give them all our scripts and measurement tools to "do it themselves" so they didn't have to give up the budget.
That sounds like fascinating work, but also kind of a case study in what a manager's role is to "clear the road" and handle the lion's share of that internal advocacy and politicking so that ICs don't have to deal with it.
It's because patting yourself on the back for getting a 5x performance increase in a microbenchmark feels good and looks good on a yearly review.
> But it was also tough work in the org. It was tedious, time-consuming, and involved a lot of experimental comp sci work. Plus, it was a cost center (teams had to give up some of their budget for perf engineering support) and even though we had racks and racks of gear for building and testing end-to-end systems, what most dev teams wanted from us was to give them all our scripts and measurement tools to "do it themselves" so they didn't have to give up the budget.
Misaligned budgeting and goals are the bane of good engineering. I've seen some absolutely stupid stuff, like a client outsourcing the hosting of a simple site to us, because they would rather hire a 3rd party to buy the domain and put a simple site there (some advertising) than deal with their own security guys and host it on their own infrastructure.
"It's a cost center"
"So is fucking HR, why you don't fire them ?"
"Uh, I'll ignore that, pls just invoice anything you do to other teams"
...
"Hey, they bought cloud solution that doesn't work/they can't figure it out, can you help them"
"But we HAVE stuff doing that cheaper and easier, why they didn't come to us"
"Oh they thought cloud will be cheaper and just work after 5 min setup"
In an online services company, a perf team can be net profitable rather than a "cost center." The one at my work routinely finds quantifiable savings that more than justify their cost.
There will be huge mistakes occasionally, but mostly it is death by a thousand cuts -- it's easy to commit a 0.1% regression here or there, and there are hundreds of other engineers per performance engineer. Clawing back those 0.1% losses a couple times per week over a large deployed fleet is worthwhile.
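As a rough illustration with assumed figures (0.1% per regression, two landing per week; not real measurements from anywhere):

    # Rough illustration of how small regressions compound; the regression
    # size and cadence are assumed for the example, not real measurements.
    regression = 0.001        # each bad commit costs 0.1% of fleet efficiency
    per_week = 2
    weeks = 52

    remaining = (1 - regression) ** (per_week * weeks)
    print(f"efficiency after a year: {remaining:.1%}")  # ~90%, i.e. ~10% of capacity lost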
11% still play HD2 with a spinning drive? I would've never guessed that. There's probably some vicious circle thing going on: because the install size is so big, people need to install it on their secondary, spinning drive...
Even though I have two SSDs in my main machine I still use a hard drive as an overflow for games that I judge are not SSD worthy.
Because it's a recent 20TB HDD, the read speeds approach 250MB/s, and I've also specifically partitioned it at the beginning of the disk just for games so that it can sustain full transfer speeds without files falling onto the slower tracks; the rest of the disk is then partitioned for media files that won't care much about the speed loss. It's honestly fine for the vast majority of games.
> It's honestly fine for the vast majority of games.
Yes, because they apparently still duplicate data so that the terrible IOPS of spinning disks does not factor as much. You people need to stop with this so that we can all have smaller games again! ;-) <--- (IT'S A JOKE)
PrimoCache is awesome, highly recommended. I’d only say to make sure your computer is rock stable before installing it, in my limited experience it exponentially increases the risk of filesystem corruption if your computer is unstable.
It is no surprise to me that people still have to use HDDs for storage. SSDs stopped getting bigger a decade-plus ago.
SSD sizes are still only equal to the HDD sizes available and common in 2010 (a couple of TB). SSD size increases (availability and price decreases) in consumer form factors have entirely stopped. There is no more progress for SSDs because quad-level cells are as far as the charge-trap tech can be pushed, and most people no longer own computers. They have tablets or phones, or if they have a laptop it has 256GB of storage, and everything is done in the cloud or with an octopus of (small) externals.
SSDs did not "stop getting bigger a decade plus ago." The largest SSD announced in 2015 was 16TB. You can get 128-256TB SSDs today.
You can buy 16-32TB consumer SSDs on NewEgg today. Or 8TB in M.2 form factor. In 2015, the largest M.2 SSDs were like 1TB. That's merely a decade. At a decade "plus," SSDs were tiny as recently as 15 years ago.
Perhaps my searching skills aren't great, but I don't see any consumer SSDs over 8TB. Can you share a link?
It was my understanding that SSDs have plateaued due to wattage restrictions across SATA and M.2 connections. I've only seen large SSDs in U.3 and E[13].[SL] form factors, which I would not call consumer.
The mainstream drives are heavily focused on lowering the price. Back in the 2010s SSDs in the TB range were hundreds of dollars, today you can find them for $80 without breaking a sweat[1]. If you're willing to still spend $500 you can get 8TB drives[2].
I bought 4x (1TB->4TB the storage for half the price after my SSD died after 5 years (thanks samsung), what you mean they 'stopped being bigger'?
Sure, there is some limitation in format, can only shove so many chips on M.2, but you can get U.2 ones that are bigger than biggest HDD (tho price is pretty eye-watering)
By stopped getting bigger I mean people still think 4TB is big in 2025, just like in 2010 when 3/4TB was the max size for consumer storage devices. U.2/U.3 is not consumer yet, unfortunately; I have to use M.2 NVMe to U.2 adapters, which are not great. And as you say, the low number of consumer CPU+mobo PCIe lanes has been restricting the number of disks until just recently. At least in 2025 we can have more than 2 NVMe storage disks again without disabling a PCIe slot.
I think this is more a symptom of data bloat decelerating than anything else. Consumers just don't have TBs of data. The biggest files most consumers have will be photos and videos that largely live on their phones anyway. Gaming is relatively niche and there just isn't that much demand for huge capacity there, either -- it's relatively easy to live with only ~8 100GB games installed at the same time. Local storage is just acting as a cache in front of Steam, and modern internet connections are fast enough that downloading 100GB isn't that slow (~14 minutes at gigabit speeds).
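For what it's worth, that download-time figure roughly checks out (simple arithmetic, using decimal gigabytes):

    # ~100 GB over a 1 Gbit/s link, ignoring protocol overhead.
    size_bits = 100 * 8 * 10**9
    link_bps = 1 * 10**9
    print(f"{size_bits / link_bps / 60:.1f} minutes")   # ~13.3 minutes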
So when consumers don't have (much) more data on their PCs than they had in 2015, why would they buy bigger devices than they did in 2015? Instead, as a sibling commenter has pointed out, prices have improved dramatically, and device performance has also improved quite a bit.
(But it's also true that the absolute maximum sized devices available are significantly larger than 2015, contradicting your initial claim.)
I read that SSDs don't actually guarantee to keep your data if powered off for an extended period of time, so I actually still do my backup on HDDs. Someone please correct me if this is wrong.
A disk that is powered off is not holding your data, regardless of whether it is an HDD, SSD, or in redundant RAID or not. Disks are fundamentally a disposable medium. If you don't have them powered on, you have no way to monitor for failures and replace a drive if something goes wrong - it will just disappear someday without you noticing.
Tape, M-DISC, microfilm, and etched quartz are the only modern mediums that are meant to be left in storage without needing to be babysat, in climate controlled warehousing at least.
Do you poweroff your backup HDDs for extended periods of time (months+)? That's a relatively infrequent backup interval. If not, the poweroff issue isn't relevant to you.
(More relevant might be that backups are a largely sequential workload and HDDs are still marginally cheaper per TB than QLC flash.)
Which doesn't matter at all in the case of Helldivers 2, as it's only available for PC, PS5, and XBS/X. That's a good part of why PC players were so irritated, actually: when all this blew up a few months ago, the PC install size was ~133 GB vs the consoles' 36 GB.
Helldivers 2 is only on current gen consoles so older ones are beside the point, the current ones use NVMe SSDs exclusively. PC is the only platform where HDDs or SATA SSDs might still come up.
I don't find it surprising at all. A ton of developers do optimizations based on vibes and very rarely check if they're actually getting a real benefit from it.
Would have saved us from all the people who reject any sort of optimization work because for them it is always "too early" since some product team wanted their new feature in production yesterday, and users waiting 5 seconds for a page load isn't considered bad enough just yet.
Premature optimization doesn't mean "We have an obvious fix sitting in front of us that will definitely improve things."
It means "We think we have something that could help performance based on a dubiously applicable idea, but we have no real workload to measure it on. But we're going to do it anyway."
So it doesn't save us from anything, it potentially delays launching and gives us the same result that product team would have given us, but more expensive.
Yes, you and I understand that quote, probably mostly because we've both read all the text around the quote too, not just the quote itself. But there are a lot of people who dogmatically follow things others write without first digging deeper, and it's these people I was talking about before. Lots of people seemingly run on whatever soundbites they can remember.
While I know the paper pretty well, I still tend to phrase my objections by asking something along the lines of "do you have any benchmarks for the effects of that change?"
> It means "We think we have something that could help performance based on a dubiously applicable idea, but we have no real workload to measure it on. But we're going to do it anyway."
The problem is that it doesn't say that directly, so people without experience take it at face value.
The commonly cited source says, when you take the entire sentence, "We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil." and continues "Yet we should not pass up our opportunities in that critical 3%."
There's only so much you can do with people who will not even take the complete original sentence, let alone the context. (That said, "premature optimisation is the root of all evil" is much snappier so I do see why it's ended up being quoted in isolation)
Counterpoint: data-driven development often leads to optimizations like this not being made, because the developers aren't the ones affected, their customers are. And the software market is weird this way - low barriers to entry, yet almost nothing is a commodity, so there's no competitive pressure to help here either.
Honestly, looking at it over time, I think that phrase did more harm than good.
Yes, of course you shouldn't optimize before you get your critical path stable and benchmark which parts take too much.
But many, many times it is used as an excuse to delay optimisation so far that it is now hard to do, because it would require rewriting parts that "work just fine", or it is skipped because the slowness is at a just-tolerable level.
I have a feeling that spending 10-20% more time on a piece of code, just to glance over whether it couldn't be more optimal, would pay for itself very quickly compared to a bigger rewrite months after the code was written.
I expect it's a story that'll never get told in enough detail to satisfy curiosity, but it certainly seems strange from the outside for this optimisation to be both possible and acceptable.
From a technical perspective, the key thing to know is that the console install size for HD2 was always that small -- their build process assumed SSD on console so it didn't duplicate stuff.
154GB was the product of massive asset duplication, as opposed to 23GB being the product of an optimization miracle. :)
How did it get so bad on PC?
Well, it wasn't always so crazy. I remember it being reasonable closer to launch (almost 2 years ago) and more like ~40-60GB. Since then, the devs have been busy. There has been a LOT of reworking and a lot of new content, and the PC install size grew gradually rather than suddenly.
This was probably impacted to some extent by the discontinued game engine they're using. Bitsquid/Stingray was discontinued partway through HD2 development and they continued on with it rather than restarting production entirely.
>It seems bizarre to me that they'd have accepted such a high cost (150GB+ installation size!) without entirely verifying that it was necessary!
You should look at COD install sizes and almost weekly ridiculously huge "updates". 150gb for a first install is almost generous considering most AAA games.
Game companies these days barely optimize engine graphical performance before release never mind the package size or patching speed. They just stamp higher minimum system requirements on the package.
From a business perspective the disk footprint is only a high cost if it results in fewer sales, which I doubt it does to any significant degree. It is wasteful, but I can see why optimization efforts would get focused elsewhere.
I think certain games don't even bother to optimize the install size so that you can't fit other games on the hard drive; I think COD games are regularly hundreds of gigs.
Having a humongous game might be a competitive advantage in the era of live-service games.
Users might be more hesitant to switch to another game if it means uninstalling yours and reinstalling is a big pain in the backside due to long download times.
I've often seen people mention that one reason for games like Call of Duty being so enormous is optimising for performance over storage. You'd rather decompress textures/audio files at install-time rather than during run-time, because you download/install so infrequently.
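Roughly the trade-off being described, as a toy sketch with zlib (nothing to do with how COD actually packages assets):

    import time
    import zlib

    asset = b"texture data " * 1_000_000        # stand-in for a large asset
    compressed = zlib.compress(asset, level=9)

    # Decompress-at-install pays this cost once; decompress-at-load pays it
    # on every level load, in exchange for a much smaller install footprint.
    t0 = time.perf_counter()
    zlib.decompress(compressed)
    print(f"decompression cost per load: {time.perf_counter() - t0:.3f}s")
    print(f"on disk: {len(compressed)} bytes vs {len(asset)} uncompressed")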
Yeah, I don't think any of the stores charge developers in proportion to how much bandwidth they use. If that changed then the priorities could shift pretty quickly.
Publishers do have to care somewhat on the Switch since Nintendo does charge them more for higher capacity physical carts, but a lot of the time they just sidestep that by only putting part (or none) of the game on the cart and requiring the player to download the rest.
It's not really all that big of a cost; serving a few hundred GBs costs pennies, despite what the prices of S3 storage and bandwidth might lead some people to believe.
Both things are sort of true. It's not sales where size can hurt you but retention, which is why it tended to matter more on phones. When you need space on your device the apps are listed from largest to smallest.
On both phones and PCs storage has just grown, so it's less of an issue. The one thing I have noticed is that Apple does its price windowing around storage, so you pay an absurd amount for an extra 128 GB. The ultra-competitive Chinese phone market crams high-end phones with a ton of memory and battery. So some popular Chinese phone games are huge compared to ones made for the iPhone.
I'd bet any amount of money a demo ran slow on one stakeholder's computer, who happened to have a mechanical hard drive, they attributed the slowness to the hard drive without a real investigation and optimizing for mechanical hard drive performance became standard practice. The demo may not have even been for this game, just a case of once bitten twice shy.
IIRC this has been the “done thing” forever. I’m not in game development, but I think I recall hearing about it in the Xbox 360 era. Conventional options are picked by default, benchmarks are needed to overturn that. Looking at my hard drive, massive game installations are still very much the industry standard…
I have heard that in many scenarios it is faster to load uncompressed assets directly rather than load+decompress. Load time is prioritized over hard drive space so you end up with the current situation.
You need very fast decompression for that to work these days, when I/O speeds are so high, and decompression takes compute that is being used for game logic.
Very fast decompression often means low compression ratios or going multicore. I have seen libjpeg-turbo vastly outperform raw disk reads, though.
There have been plenty of times where the opposite is true: Storing highly compressed data and decompressing it in RAM is much faster than loading uncompressed assets.
Which is the primary problem: Computers are so varied and so ever changing that if you are optimizing without hard data from your target hardware, you aren't optimizing, you are just doing random shit.
Add to that, game devs sometimes are just dumb. Titanfall 1 came with tens of gigabytes of uncompressed audio, for "performance", which is horse shit. Also, it turns out they might have been lying entirely: Titanfall 1 was made on the Source engine, which does not support the OGG audio format their audio files were in, so they decompressed them at install time. They could have just used a different audio file format.
High cost to who though. We see the same thing when it comes to RAM and CPU usage, the developer is not the one paying for the hardware and many gamers have shown that they will spend money on hardware to play a game they want.
Sure, they may lose some sales, but I have never seen many numbers on how much it really impacted sales.
Also on the disk side, I can't say I have ever looked at how much space is required for a game before buying it. If I need to clear out some stuff I will. Especially with it not being uncommon for a game to be in the 100gb realm already.
That all being said, I am actually surprised by the 11% using mechanical hard drives. I figured that NVME would be a lower percentage and many are using SSD's... but I figured the percent with machines capable of running modern games in the first place with mechanical drives would be far lower.
I do wonder how long it will be until we see games just saying they are not compatible with mechanical drives.
That already happened :) Starfield claimed to not support HDDs and really ran bad with them. And I think I saw SSDs as requirement for a few other games now, in the requirement listings on steam.
> Starfield claimed to not support HDDs and really ran bad with them.
To be fair, at launch Starfield had pretty shit loading times even with blazing fast SSDs, and the game has a lot of loading screens, so it makes sense they'd nip that one in the bud and just say it's unsupported on the slower type of disk.
The latest Ratchet and Clank game relies heavily on the PS5's NVMe drive. Its PC port states that an SSD is required, and IIRC the experience on mechanical drives is quite terrible, to the point of being unplayable.
All of that takes time, and you never have enough time.
At any given point, if it wasn't vital to shipping and not immediately breaking, then it could be backburnered.
Messing with asset loading is probably a surefire way to risk bugs and crashes - so I suspect this was mostly waiting on proving the change didn't break everything (and Helldivers has had a lot of seemingly small changes break other random things).
The game is released on both PC and PS5, the latter of which was designed (and marketed) to take advantage of SSD speeds for streaming game content near real time.
The latest Ratchet and Clank, the poster child used in part to advertise the SSD speed advantage, suffers on traditional hard drives as well in the PC port. Returnal is in the same boat. Both were originally PS5 exclusives.
My understanding is that optimizing for sequential read is a big reason for historical game install bloat; if you include the common assets multiple times in the archive, then loading a level/zone becomes one big continuous slurp rather than jumping all over the place to pick up the stuff that's common to everything. Obviously this didn't matter with optical media where the user wasn't impacted, but it's annoying on PC where we've had a long period of users who invested in expensive, high-performance storage having to use more of it than needed due to optimizations geared at legacy players still on spinning rust.
I expect that low-latency seek time is also pretty key to making stuff like nanite work, where all the LODs for a single object are mixed together and you need to be able to quickly pick off the disk the parts that are needed for the current draw task.
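A toy sketch of that duplication trade-off (hypothetical asset names, not from any real engine):

    # Each level lists the assets it needs; some are shared across levels.
    levels = {
        "level_1": ["level_1_geo", "ui_font", "common_props"],
        "level_2": ["level_2_geo", "ui_font", "common_props"],
    }

    # HDD-friendly layout: every level's pack carries its own copy of the
    # shared assets, so loading a level is one sequential read.
    hdd_packs = {name: list(assets) for name, assets in levels.items()}

    # SSD-friendly layout: shared assets stored once; the loader seeks around
    # to fetch them, which is cheap on SSDs and painful on spinning rust.
    shared = set.intersection(*(set(a) for a in levels.values()))
    print(sorted(shared))   # ['common_props', 'ui_font']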
The HDD performance suffers very much during the portal loading sequences in Ratchet and Clank, but even the entry level SSD performs fine, with little visible difference compared to the PS5 one. It’s more about random access speed than pure throughput
I played Rift Apart from HDD and apart from extra loading time during looped animations it was fine. On the other hand Indiana Jones Great Circle was barely playable with popping-in textures and models everywhere.
Optimizing for disk space is very low on the priority list for pretty much every game, and this makes sense since it's very low on the list of customer concerns relative to things like in-game performance, net code, tweaking game mechanics and balancing, etc.
Apparently, in-game performance is not more important than pretty visuals. But that's based on hearsay / what I remember reading ages ago, I have no recent sources. The tl;dr was that apparently enough people are OK with a 30 fps game if the visuals are good.
I believe this led to a huge wave of 'laziness' in game development, where framerate wasn't too high up on the list of requirements. And it ended up in some games where neither graphics fidelity nor frame rate was a priority (one of the recent Pokemon games... which is really disappointing for one of the biggest multimedia franchises of all time).
That used to be the case, but this current generation the vast majority of games have a 60 fps performance mode. On PS5 at least, I can't speak about other consoles.
a one time cost of a big download is something customers have shown time and again that they're willing to bear. remember that software is optimized for ROI first and all other things second. Sometimes optimizing for ROI means "ship it today and let the first week of sales pay salaries while we fix it", sometimes ROI means picking between getting the file size down, getting that new feature out and fixing that game breaking edge case bug. Everything you do represents several things you choose not to do.
It’s the same sort of apathy/arrogance that made new Windows versions run like dogshit on old machines. Gates really should have had stock in PC makers. He sold enough of them.
I don’t think it’s always still the case but for more than ten years every release of OSX ran better on old hardware, not worse.
Some people think the problem was MS investing too eagerly into upgrading developer machines routinely, giving them a false sense of what “fast enough” looked like. But the public rhetoric was so dismissive that I find that pretty unlikely. They just didn’t care. Institutionally.
I’m not really into the idea of Helldivers in the first place but I’m not going to install a 150GB game this side of 2040. That’s just fucking stupid.
It would be ironic if incidents like this made Valve start charging companies for large file sizes of their games. It would go to show that good things get abused to no end if limits aren't set.
> These loading time projections were based on industry data - comparing the loading times between SSD and HDD users where data duplication was and was not used. In the worst cases, a 5x difference was reported between instances that used duplication and those that did not. We were being very conservative and doubled that projection again to account for unknown unknowns
Unfortunately it's not only game development; all of modern society seems to operate like this.
You can read the proposal and find out, if you're interested.
> In the light of the more limited risk of their use for the purpose of child sexual abuse and the need to preserve confidential information, including classified information, information covered by professional secrecy and trade secrets, electronic communications services that are not publicly available, such as those used for national security purposes, should be excluded from the scope of this Regulation. Accordingly, this Regulation should not apply to interpersonal communications services that are not available to the general public and the use of which is instead restricted to persons involved in the activities of a particular company, organisation, body or authority.
I've had this thought before, when seeing labels that talk about kWh/day. The answer is very simple: you pay per kWh. When people want to know power efficiency, what they really want to know is "how much will this cost me to run?". That answer is most easily expressed in kWh per unit time.
Also, giving an averaged power drain would be misleading. If the device uses 2.4kW but only for half an hour per day, that's not a 50W device as far as cabling, fuses and other electrical considerations are concerned.
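To make that concrete (my own arithmetic, using the figures in the comment above):

    # Same daily energy, very different peak demand.
    spike_w, spike_hours = 2400, 0.5          # 2.4 kW, half an hour per day
    steady_w = 50                             # always-on 50 W device
    print(spike_w * spike_hours / 1000)       # 1.2 kWh/day
    print(steady_w * 24 / 1000)               # 1.2 kWh/day, so the same bill
    # ...but the first needs wiring and fuses rated for 2.4 kW, the second doesn't.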
In the US, at least, there are some utilities that charge based on maximum kW (demand) and total kWh used (energy). ComEd in Chicago is a utility with a demand rate plan option.
That tends to be commercial rates since businesses can have larger spikes in consumption, so the "pipe" needs to be larger. Industrial rates are similar.
There are some like ComEd that you call out that can apply the model to residential rates, though my (now dated) experience is that they are rarer.
Knowing the average of 108 W wouldn't help with knowing your peak demand either, as fridges vary significantly from off to startup to running.
It would be completely wrong for peak demand. I had to learn this the hard way: while the small fridge I bought only uses 80 W when running, the compressor draws 800 W+ for a second on startup, which was too much for my off-grid inverter.
That strain does not seem to be reflected in the usage, which has been in a shallow decline since the 90s. Maybe they could consider using smart demand management, which is becoming popular with a lot of utilities to move usage away from peaks and into the quieter times.
I think these tariffs are meant to encourage exactly that. Note also that there are many levels of bottlenecks. One could be in your neighborhood, if all your neighbours have EVs.
Perhaps it will work. I'm just a bit skeptical because it seems unlikely to be a widespread problem. The average driver in Sweden will only need perhaps 6 kWh per day, which at L2 means charging for 35-40 minutes. A bit of demand management from the utility and everyone in the neighborhood can get what they need without stressing the local grid. Or just knock down the rate to something inconsequential and let it trickle all night.
I've not followed any evolutions in this area, but there's a cool paper from 2014 about using WiFi channel state information to detect 87%(!) of falls in an experimental condition[1]. It's been a while since I read the paper, and I no longer have access, so caveats aplenty, but it's one of those things that pops into my head sometimes and I wonder if it's seen any real-world deployment.
> Instead they believe model alignment, trying to understand when a user is doing a dangerous task, etc. will be enough.
Maybe I have a fundamental misunderstanding, but I feel like model alignment and in-model guardrails are statistical preventions, i.e. you'll reduce the odds to some number of zeroes preceding the 1. These things should literally never be able to happen, though. It's a fool's errand to hope that you'll get to a model where there is no value in the input space that maps to <bad thing you really don't want>. Even if you "stack" models, having a safety-check model act on the output of your larger model, you're still just multiplying odds.
It's a common mistake to apply probabilistic assumptions to attacker input.
The only [citation needed] correct way to use probability in security is when you get randomness from a CSPRNG. Then you can assume you have input conforming to a probability distribution. If your input is chosen by the person trying to break your system, you must assume it's a worst-case input and secure accordingly.
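A minimal illustration of the distinction (my own sketch, using Python's standard secrets module):

    import secrets

    # Randomness you generate yourself has a known distribution, so a
    # probabilistic argument is valid: ~2**-256 chance per guess here.
    session_token = secrets.token_hex(32)

    # Attacker-controlled input gets no such assumption: treat it as
    # worst-case, validate it, and never "hope" it happens to be benign.
    def handle_untrusted(data: bytes) -> bytes:
        if len(data) > 4096:
            raise ValueError("input too large")
        return data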
The sort of fun thing is that this happens with human safety teams too. The Swiss Cheese model is generally used to understand how failures can line up to cause disaster to punch right through the guardrails.
It's better to close the hole entirely by making dangerous actions actually impossible, but often (even with computers) there's some wiggle room. For example, if we reduce the agent's permissions, then we haven't eliminated the possibility of those permissions being exploited, merely required some sort of privilege escalation to remove the block. If we give the agent an approved list of actions, then we may still have the possibility of unintended and unsafe interactions between those actions, or some way an attacker could add an unsafe action to the list. And so on, and so forth.
In the case of an AI model, just like with humans, the security model really should not assume that the model will not "make mistakes." It has a random number generator built right in. It will, just like the user, occasionally do dumb things, misunderstand policies, and break rules. Those risks have to be factored in if one is to use the things at all.
Humans are dramatically stronger than LLMs. An LLM is like a human you can memory wipe and try to phish hundreds of times a second until you find a script that works. I agree with what you're saying, but it's important to frame an LLM is not like a security guard who will occasionally let a former employee in because they recognize them. They can be attacked pretty relentlessly and once they're open they're wide open.
To play devils advocate, isn’t any security approach fundamentally statistical because we exist in the real world, not the abstract world of security models, programming language specifications, and abstract machines? There’s always going to be a chance of a compiler bug, a runtime error, a programmer error, a security flaw in a processor, whatever.
Now, personally I’d still rather take the approach that at least attempts to get that probability to zero through deterministic methods than leave it up to model alignment. But it’s also not completely unthinkable to me that we eventually reach a place where the probability of a misaligned model is sufficiently low to be comparable to the probability of an error occurring in your security model.
The fact that every single system prompt has been leaked, despite guidelines to the LLM that it should protect it, shows that without "physical" barriers you aren't providing any security guarantees.
A user of chrome can know, barring bugs that are definitively fixable, that a comment on a reddit post can’t read information from their bank.
If an LLM with user controlled input has access to both domains, it will never be secure until alignment becomes perfect, which there is no current hope to achieve.
And if you think about a human in the driver seat instead of an LLM trying to make these decisions, it’d be easy for a sophisticated attacker to trick humans to leak data, so it’s probably impossible to align it this way.
It's often probabilistic. For example, I can guess your six-digit verification code exactly 1 in a million times, and if I get that 1-in-a-million lucky, I can do something naughty once.
The problem with LLM security is that if only 1 in a million prompts breaks Claude and makes it leak email, then once I get lucky and find the golden ticket I can replay it on everyone using that model.
Also, no one knows the probability a priori, unlike with the code, but practically it's more like 1 in 100 at best.
The difference is that LLMs are fundamentally insecure in this way as part of their basic design.
It’s not like, this is pretty secure but there might be a compiler bug that defeats it. It’s more like, this programming language deliberately executes values stored in the String type sometimes, depending on what’s inside it. And we don’t really understand how it makes that choice, but we do know that String values that ask the language to execute them are more likely to be executed. And this is fundamental to the language, as the only way to make any code execute is to put it into a String and hope the language chooses to run it.
> To play devils advocate, isn’t any security approach fundamentally statistical because we exist in the real world, not the abstract world of security models, programming language specifications, and abstract machines?
IMO no, most security modeling is pretty absolute and we just don't notice because maybe it's obvious.
But, for example, it's impossible to leak SSNs if you don't store SSNs. That's why the first rule of data storage is only store what you need, and for the least amount of time as possible.
As soon as you get into what modern software does, store as much as possible for as long as possible, then yes, breaches become a statistical inevitability.
We do this type of thing all the time. Can't get stuff stolen out of my car if I don't keep stuff in my car. Can't get my phone hacked and read through at the airport if I don't take it to the airport. Can't get sensitive data stolen over email if I don't send sensitive data over email. And on and on.
All modern computer security is based on improbabilities. Public key cryptography, hashing, tokens, etc. are all based on being extremely improbable to guess, but not impossible. If an LLM can eventually reach that threshold, it will be good enough.
That threshold would require more than 30 orders of magnitude improvement in the probability given a 1/100,000,000 current probability of an LLM violating alignment. The current probability is much, much higher than that, but let's cut the LLMs some slack & pretend. Improving by a factor of 10^30 is extremely unlikely.
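The rough arithmetic behind that, treating a 128-bit key guess as the "cryptographic" bar (my framing, not the commenter's):

    import math

    crypto_bar = 2.0 ** -128    # per-attempt chance of guessing a 128-bit key
    llm_failure = 1e-8          # the very generous 1/100,000,000 figure above
    gap = math.log10(llm_failure / crypto_bar)
    print(f"~{gap:.0f} orders of magnitude short")   # ~31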
Cryptography's risk profile is modeled against active adversaries. The way probability is being thrown around here is not like that. If you find 1 in a billion in the full training set of data that triggers this behavior, that's not the same as 1 in a billion against an active adversary. In cryptography there are vulnerabilities other than brute force.
"These things should literally never be able to happen"
If we consider "humans using a bank website" and apply the same standard, then we'd never have online banking at all. People have brain farts. You should ask yourself if the failure rate is useful, not if it meets a made up perfection that we don't even have with manual human actions.
Just because humans are imperfect and fall for scams and phishing doesn't mean we should knowingly build in additional attack mechanisms. That's insane. It's a false dilemma.
Go hire some rando off the street, sit them down in front of your computer, and ask them to research some question for you while logged into your user account and authenticated to whatever web sites you happen to be authenticated to.
Does this sound like an absolutely idiotic idea that you’d never even consider? It sure does to me.
Yes, humans also aren't very secure, which is why nobody with any sense would even consider doing this with a human either.
The vast majority of humans would fall to bad security.
I think we should continue experimenting with LLMs and AI. Evolution is littered with the corpses of failed experiments. It would be a shame if we stopped innovating and froze things with the status quo because we were afraid of a few isolated accidents.
We should encourage people that don't understand the risks not to use browsers like this. For those that do understand, they should not use financial tools with these browsers.
Caveat emptor.
Don't stall progress because "eww, AI". Humans are just as gross.
We can continue to experiment while also going slowly. Evolution happens over many millions of years, giving organisms a chance to adapt and find a new niche to occupy. Full-steam-ahead is a terrible way to approach "progress".
If the only danger is the company itself going bankrupt, then please, take all the risks you like.
But if they're managing customer-funds or selling fluffy asbestos teddybears, then that's a problem. It's a profoundly different moral landscape when the people choosing the risks (and grabbing any rewards) aren't the people bearing the danger.
You can have this outrage when your parents are using browser user agents.
All of this concern is over a hypothetical Reddit comment about a technology used by early adopter technologists.
Nobody has been harmed.
We need to keep building this stuff, not dog piling on hate and fear. It's too early to regulate and tie down. People need to be doing stupid stuff like ordering pizza. That's exactly where we are in the tech tree.
"We need to keep building this stuff" Yeah, we really don't. As in there is literally no possible upside for society at large to continuing down this path.
Well if we eliminate greed and capitalism then maybe at some point we can reach a Star Trek utopia where nobody has to work because we eliminate scarcity.
... Either that or the wealthy just hoard their money-printers and reject the laborers because they no longer need us to make money so society gets split into 99% living in feudal squalor and 1% living as Gods. Like in Jupiter Ascending. Man what a shit movie that was.
This AI browser agent is outright dangerous as it is now. Nobody has been attacked this way... that we know of... yet.
It's one thing to build something dangerous because you just don't know about it yet. It's quite another to build something dangerous knowing that it's dangerous and just shrugging it off.
Imagine if Bitcoin was directly tied to your bank account and the protocol inherently allowed other people to perform transactions on your wallet. That's what this is, not "ordering pizza."
An LLM must not be given all three of these - access to private data, exposure to untrusted content, and the ability to communicate externally - or it is inherently insecure. Any two is fine (mostly; private data plus external communication is still a bit iffy), but if you give it all three then you're screwed. This is inherent to how LLMs work; you can't fix it as the technology stands today.
This isn't a secret. It's well known, and it's also something you can easily derive from first principles if you know the basics of how LLMs work.
You can build browser agents, but you can't give them all three of these things. Since a browser agent inherently accesses untrusted data and communicates externally, that means that it must not be given access to private data. Run it in a separate session with no cookies or other local data from your main session and you're fine. But running it in the user's session with all of their state is just plain irresponsible.
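As a minimal sketch of the "separate session with no cookies" idea, assuming Playwright as the browser layer (my choice of tool, not anything from the comments above):

    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        # A fresh context is an isolated, cookie-less profile: the agent can
        # read untrusted pages and talk to the network, but it never sees the
        # logged-in sessions or local storage of your everyday profile.
        agent_context = browser.new_context()
        page = agent_context.new_page()
        page.goto("https://example.com")    # agent work happens here
        agent_context.close()
        browser.close()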
The CEO of Perplexity hasn't addressed this at all, and instead spent all day tweeting about the transitions in their apps. They haven't shown any sign of taking this seriously and this exploit has been known for more than a month: https://x.com/AravSrinivas/status/1959689988989464889
The phrasing of this seems to imply that you think this is obviously ridiculous to the point that you can just say it ironically. But I actually think that's a good idea.