LLMs often seem to have trouble determining the severity of a bug/incident/problem in a vacuum. If you run an LLM over 1000 items in parallel and ask "is this bad," it will find reasons to call something bad far more often than it would if it were weighing all 1000 against each other.
I'm curious, do people around you use AI? Because in my own workplace, people use lots of AI, and they ship lots of PRs, which correspond to actual features on the roadmap. I've been doing this a long time, and there is a whole lotta stuff shipping. I'm a manager and in the handful of hours I have I'm shipping the equivalent of what I would have as a full-time eng years ago.
Reviewing 22,000 lines of code, even from antirez, with this complex of a feature set and minimal PR description sounds like a nightmare. One starts to see why major open-source software like Postgres tends to be developed on a mailing list, with intermediate design decisions discussed by the community, separate patches for different related features, incremental review, and then a spaced release cadence.
I might be the outlier, but this PR feels like heaven to review. It's a complete, all encompassing PR that I can work through with the entire context right in front of me.
If the initial development bar is relatively high, it's far, far easier to identify flaws and gaps when you have the whole thing in front of you all at once.
I think the point GP is making is that this is a PR that smells like a solo dev working on their own project, not how a community-driven project adds major new functionality. I'm sure docs and descriptions (or at least a discussion of tradeoffs and design decisions, if not ADRs) exist somewhere, but they aren't linked handily from the PR. There is a lot of explanation in the blog post and PR, but it's unilateral-looking.
Redis was completely built in this way since the start. I believe this is a better way to create software. Compromise in design is, in my opinion, something to avoid: feedback is important, but often a single person who has studied the problem deeply and has design taste can come up with a great solution. Mediating between solutions, even between two stellar solutions A and B, will not produce a C solution that is better, since you can't produce such a solution by interpolation. It is simpler to damage A and B. And it is rare that, in a big group of people, everyone has stellar ideas, so you often have to mediate with people having poor ideas too. Not worth the effort for the way I'm wired. What works better for me is to provide hints about what I'm doing, then I receive feedback, and sometimes there are really great ideas in this feedback, and I incorporate the parts I like.
Thanks, I think I'm all caught up now. The timeline is like this if I understand correctly: your successors (Yossi Gottlieb and Oran Agra) explicitly announced a new governance model in 2020, saying the project had "outgrown the BDFL-style of management" and that they wanted to "promote more teamwork and structure". With the relicensing in 2024, however, external contributors with five or more commits to Redis dropped to zero in the first six months (basically, community contribution collapsed). In late 2024, you came back in the role of "Redis evangelist" and a year ago there was an additional licensing change, adding AGPLv3 as an option (8.0's tri-license). So now redis has your steady hand on the wheel again.
I was confused because the last time I checked on things, it was still about fostering community input and advancement but not necessarily consensus. Things have tipped back in the original direction since then. I don't think "Redis was completely built in this way since the start" is completely accurate, but also the community effort under the new governance model never got very deeply entrenched while you were away.
First of all, redis is amazing, and your 4-month development process speaks to the fact that you've already designed and verified correctness super thoroughly.
... just speaking as someone who sometimes has to review very long PRs, though, I feel like 25% is a roughly normal level of "signal to noise." 5,000 lines of core logic is a LOT, and the tests and dependencies do still need to be read.
EDIT: I feel like the problem, as a reviewer, is processing 4 months of intensive research/development and providing useful feedback. At that point, there's probably not much major input you can have into the core architecture or strategy, so you're probably not providing much more than a bugbot at that point.
I think where we went wrong in understanding this PR is in the assumption that it's designed to invite review because that's how a lot of other team- or community-driven projects work.
> At that point, there's probably not much major input you can have into the core architecture or strategy
Sure you can? In this concrete case, Redis is very "flat" — there's the data structure implementations, and there's the commands that use them. 1+N. You could have feedback about the data structure (i.e. whether it's optimal for the use-cases); or about any of the commands (i.e. not just their impls, but also whether they're the best core API surface to lock in long-term, or even whether they're worth including at all.)
Any given feedback would necessitate fairly limited rework to address, as you're either modifying the data structure (and its tests) or a command (and its tests and docs.)
Fair point that there might be some functional changes you can suggest, but I continue to suspect that by the time this PR hit GitHub, all the most important decisions had already been finalized.
Oh wow, I didn't realize that Redis is still mostly just authored by antirez! (My understanding is that he had left for some time and then returned to the project.) That is, honestly, pretty amazing. Well, redis is great and clearly it's worked out.
I was expecting to see some verbose LLM output, but actually the code has a distinctly hand-crafted feel. Nice to see! I'm not sure if "production ready" is a safe claim 7 commits into a project ;)
$8M sounds like a lot, but (a) the cost of making a material financial mistake can easily dwarf this, and (b) the cost of the engineers maintaining the system was likely about this expensive anyway. And infra is expensive when you're Uber. It all seems rather overblown to me.
Context caching is really just storing the KV-cache for reuse. It saves re-running prefill for that part of the context, but tokens read from that KV-cache are still billed (typically at a discounted rate), so the cached portion still isn't free.
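A back-of-envelope sketch of how that pricing works out. The rates below are hypothetical placeholders, not any provider's actual pricing; the only assumption is the common pattern that cached input tokens are billed at a steep discount relative to uncached ones:

```python
# HYPOTHETICAL rates, per 1M input tokens (not real provider pricing):
UNCACHED_PER_M = 3.00   # full prefill
CACHED_PER_M = 0.30     # KV-cache read, prefill skipped

def prompt_cost(total_tokens: int, cached_prefix: int) -> float:
    """Cost of one request whose first `cached_prefix` tokens hit the KV-cache."""
    uncached = total_tokens - cached_prefix
    return (cached_prefix * CACHED_PER_M + uncached * UNCACHED_PER_M) / 1_000_000

# 100k-token context, 90k of which is a cached system prompt + documents:
print(round(prompt_cost(100_000, 90_000), 4))  # 0.057
print(round(prompt_cost(100_000, 0), 4))       # 0.3
```

So under these assumed rates the cached request is ~5x cheaper, but the cached 90k tokens still contribute to the bill on every call.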
Yes. For example you'll typically have a "budget" of 1-10k writes/sec. And a single heavy join can essentially take you offline. Even relatively modest enterprises typically need to shift some query patterns to OLAP/nosql/redis/etc. before very long.
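One common way to live within a per-second write budget like that is client-side throttling, e.g. a token bucket in front of the database. A minimal sketch (class name and limits are illustrative, not from any particular library):

```python
import time

class WriteBudget:
    """Token bucket: allow at most `rate` writes/sec, with bursts up to `burst`."""
    def __init__(self, rate: float, burst: float):
        self.rate = rate
        self.burst = burst
        self.tokens = burst
        self.last = time.monotonic()

    def try_acquire(self, n: int = 1) -> bool:
        # Refill tokens for the time elapsed since the last call, capped at burst.
        now = time.monotonic()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False  # caller queues or batches the write instead of issuing it

bucket = WriteBudget(rate=1000, burst=100)
print(bucket.try_acquire(50))   # True
print(bucket.try_acquire(100))  # False (only ~50 tokens left in the bucket)
```

Writes that don't acquire a token get queued or batched, which is exactly the "shift query patterns elsewhere" pressure described above.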
can share our work setup we've been tinkering with at a mid size org. iceberg datalake + snowflake for our warehouse, iceberg tables live in s3, that is now shareable to postgres via the pg_lake extension which automagically context switches using duckdb under the hood to do olap queries across the vast iceberg data. we keep the postgres db as an application db so apps can retrieve the broader data they want to surface in the app from the iceberg tables, but still have spicy native postgres tables to do their high volume writes.
very cool shit, it's certainly blurred the whole olap vs oltp thing a smidge but not quite. more or less makes olap and oltp available through the same db connection. writing back to iceberg is possible, we have a couple apps doing it. though one should probably batch/queue writes back as iceberg definitely doesn't have the fast-writes story. it's just nice that the data warehouse analytics nerds have access to the apps' data and they can do their thing in the environment they work with back on the snowflake side.
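The "batch/queue writes back" point above matters because each Iceberg commit creates a new snapshot, so many tiny writes are expensive. A minimal sketch of the batching idea; `flush_fn` is a hypothetical stand-in for whatever actually appends rows to the table:

```python
class WriteBatcher:
    """Buffer rows and hand them to `flush_fn` in chunks, so each (expensive)
    Iceberg commit covers many rows instead of one."""
    def __init__(self, flush_fn, max_rows: int = 1000):
        self.flush_fn = flush_fn
        self.max_rows = max_rows
        self.buffer = []

    def add(self, row):
        self.buffer.append(row)
        if len(self.buffer) >= self.max_rows:
            self.flush()

    def flush(self):
        if self.buffer:
            self.flush_fn(self.buffer)  # one commit per chunk
            self.buffer = []

chunks = []
b = WriteBatcher(chunks.append, max_rows=3)
for i in range(7):
    b.add(i)
b.flush()  # drain the remainder
print(chunks)  # [[0, 1, 2], [3, 4, 5], [6]]
```

In practice you'd also flush on a timer so a slow trickle of rows doesn't sit in the buffer forever.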
this is definitely an "i only get to play with these techs cause the company pays for it" thing. no one wants to front the cost of iceberg datalake sized mountains of data on some s3 storage somewhere, and it doesn't solve for any sort of native-postgres'ing. it just solves for companies that are doing ridic stuff under enormous sla contracts to pay for all manners of cloud services that joe developer the home guy isn't going to be tinkering with anytime soon. but definitely an interesting time to work near data, so much "sql" has been commercialized over the years and it's really great to see postgres being the people's champ and helping us break away from the dumb attempts to lock us in under sql servers and informix dbs etc. we still haven't reached one database for everything yet, but postgres is by and large the one carrying the torch in my head canon. if any of them will get there someday, it's postgres.
Sure, but this isn't really an AMA thread [despite the offer to "answer any questions"]. This is about Sid's journey with (extremely advanced) cancer. Airing grievances about Gitlab is just out of place here, you gotta read the room.
Paying different amounts for different regions is not being an asshole. Virtually every company on the planet does regional CoL adjustments.
And get a grip - you are free to bring value to the world in your way if you're not happy to be an employee. Attacking others that have done nothing to harm you is entirely uncalled for, especially on a discussion about their own cancer. Please act like an adult.
The point is that ideally the models keep improving until they can solve problems people care about. Which is already partly true, but there are lots of problems that are still out of reach.