
Looks like this would affect around 4.3% of chats (the "Self-Expression" category from this report[0]). Considering ChatGPT's userbase, that's an extremely large number of people, but a smaller share than I expected given all the talk about AI companionship. That said, a similar crowd was pretty upset when OpenAI removed 4o, and the backlash was enough for them to bring it back.

[0]: https://www.nber.org/system/files/working_papers/w34255/w342...


> A common pattern would be to separate pure business logic from data fetching/writing. So instead of intertwining database calls with computation, you split into three separate phases: fetch, compute, store (a tiny ETL). First fetch all the data you need from a database, then you pass it to a (pure) function that produces some output, then pass the output of the pure function to a store procedure.

Does anyone have any good resources on how to get better at doing "functional core imperative shell" style design? I've heard a lot about it, and contrived examples make it seem like something I'd want, but I often find it's much more difficult in real-world cases.

Random example from my codebase: I have a function that periodically sends out reminders for usage-based billing customers. It pulls customer metadata, checks the customer type, and then based on that it computes their latest usage charges, and then based on that it may trigger automatic balance top-ups or subscription overage emails (again, depending on the customer type). The code feels very messy and procedural, with business logic mixed with side effects, but I'm not sure where a natural separation point would be -- there's no way to "fetch all the data" up front.


What I'm currently doing could be called compute-fetch-store: the compute part is done entirely in the database with SQL views stacked one on top of the other. Then the program just fetches the result of the last view and stores it where it needs to be stored.

Stacked views are sometimes considered an anti-pattern, but I really like them because they're purely functional, have no side-effects whatsoever and cannot break (they either work or they don't, but they can't start breaking in the future). And they're also stateless: they present a holistic view of the data that avoids iterations and changes how you think about it. (Data is never really 'transformed', it's simply 'viewed' from a different perspective.)

Not saying that's the only way, or the best way, or even a good way! But it works for me.

I think it would apply well to the example: you could have a view, or a series of views, that computes balance top-ups based on a series of criteria; then the program would read that view and send emails without doing any new calculation.
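To make that concrete, here's a minimal sketch of the shape, with SQLite for brevity; the tables, views and the send_topup_email function are all made up for illustration:

  import sqlite3

  conn = sqlite3.connect(":memory:")
  conn.executescript("""
      CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT, unit_price REAL, balance REAL);
      CREATE TABLE usage_events (customer_id INTEGER, units REAL);

      -- The "compute" part: pure, stateless views stacked on top of each other.
      CREATE VIEW v_usage_totals AS
          SELECT customer_id, SUM(units) AS units
          FROM usage_events GROUP BY customer_id;

      CREATE VIEW v_topups_due AS
          SELECT c.id, c.email, u.units * c.unit_price - c.balance AS amount
          FROM customers c JOIN v_usage_totals u ON u.customer_id = c.id
          WHERE u.units * c.unit_price > c.balance;
  """)

  def send_topup_email(email, amount):
      print(f"top up {email} by {amount:.2f}")  # stand-in for the real side effect

  # The program just fetches the result of the last view and acts on it.
  for _id, email, amount in conn.execute("SELECT id, email, amount FROM v_topups_due"):
      send_topup_email(email, amount)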


This.

In-RDBMS computation specified in a declarative language, with generic, protocol/technology-specific adapters handling communication with external systems.

Treating RDBMS as a computing platform (and not merely as dumb data storage) makes systems simple and robust. Model your input as base relations (normalized to 5NF) and output as views.

Incremental computing engines such as https://github.com/feldera/feldera go even further with base relations not being persistent/stored.


Ha! I don't yet know much about 'incremental computing engines', but Feldera seems to be something I need. Because at some point I inevitably have to create materialized views to speed up some parts of the pipeline. Materialized views are of course a side effect and can become mildly dangerous if you're not careful to destroy/recreate them in time.

I was trying to think of a way to "only update new or changed rows" but it's not trivial. But Feldera seems to do exactly that. So thanks!


Sometimes you really can't separate the business logic from the imperative operations; in that case you use monads and at least make it a bit more testable and refactorable (e.g. https://michaelxavier.net/posts/2014-04-27-Cool-Idea-Free-Mo...).

That said:

> It pulls customer metadata, checks the customer type, and then based on that it computes their latest usage charges, and then based on that it may trigger automatic balance top-ups or subscription overage emails (again, depending on the customer type).

So compute those things, and store them somewhere (if only an in-memory queue to start with)? I can already see a separation between an ETL stage that computes usage charges, which are probably worth recording in a datastore, and another stage that computes which top-ups and emails should be sent based on that, which again is probably worth recording for tracing purposes. Then two more stages actually send the emails and execute the payment pulls. It's quite nice to have those separated from the figuring-out-which-emails-to-send part, if only so you can retry/debug the latter without sending out actual emails.
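Rough sketch of those stages, with invented names; the point is that each pure step emits records the next effectful step consumes:

  from dataclasses import dataclass

  @dataclass
  class UsageCharge:
      customer_id: str
      amount: float

  @dataclass
  class PlannedAction:
      customer_id: str
      kind: str      # "topup" or "overage_email"
      amount: float

  def compute_charges(customers, usage_by_customer) -> list[UsageCharge]:
      # Pure: no I/O, trivially testable/replayable; worth persisting the output.
      return [UsageCharge(c["id"], usage_by_customer.get(c["id"], 0) * c["unit_price"])
              for c in customers]

  def plan_actions(customers_by_id, charges) -> list[PlannedAction]:
      # Pure: decides top-up vs. email per customer type, emits data instead of acting.
      return [PlannedAction(ch.customer_id,
                            "topup" if customers_by_id[ch.customer_id]["auto_topup"] else "overage_email",
                            ch.amount)
              for ch in charges]

  # The imperative edges fetch the inputs, record the charges and planned actions, and a
  # separate executor later reads the planned actions and actually sends/charges.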


> Does anyone have any good resources on how to get better at doing "functional core imperative shell" style design?

I can recommend Grokking Simplicity by Eric Normand. https://www.manning.com/books/grokking-simplicity


> Does anyone have any good resources on how to get better at doing "functional core imperative shell" style design?

Hexagonal architecture[0] is a good place to start. The domain model core can be defined with functional concepts while also defining abstract contracts (abstractly "ports", concretely interface/trait types) implemented in "adapters" (usually technology-specific, such as HTTP and/or SMTP in your example).
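A toy sketch of that shape (the names are invented, nothing framework-specific):

  from dataclasses import dataclass
  from typing import Protocol

  @dataclass
  class OverageNotice:
      email: str
      amount: float

  class Notifier(Protocol):          # the "port": an abstract contract the core depends on
      def send(self, notice: OverageNotice) -> None: ...

  def notices_for(customers: list[dict]) -> list[OverageNotice]:
      # Functional core: pure decision logic, knows nothing about SMTP/HTTP/databases.
      return [OverageNotice(c["email"], c["usage_cost"] - c["included"])
              for c in customers if c["usage_cost"] > c["included"]]

  class SmtpNotifier:                # an "adapter": technology-specific implementation
      def send(self, notice: OverageNotice) -> None:
          print(f"would email {notice.email} about {notice.amount:.2f}")

  def run(fetch_customers, notifier: Notifier) -> None:
      # Imperative shell: wires the adapters to the core.
      for notice in notices_for(fetch_customers()):
          notifier.send(notice)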

0 - https://en.wikipedia.org/wiki/Hexagonal_architecture_(softwa...


> there's no way to "fetch all the data" up front.

this is incorrect

I assume there's more nuance and complexity behind why it feels like there's no way, probably involving larger design decisions that feel difficult to unwind. But data collection, decisions, and actions can all be separated without much difficulty, given some intent to do so.

I would suggest caution before implementing this directly, but imagine a subroutine whose only job is to lock some database table, read the current list of pending top-up charges required, issue the charge, update the row, and unlock the table. An entirely different subroutine wouldn't need to concern itself with anything other than data collection and calculating deltas; it has no idea whether a customer will be charged, all it does is calculate a reasonable amount. Something smart wouldn't run for deactivated/expiring accounts, but why does this need to be smart? It's not going to charge anything, it's just updating a price that hypothetically might be used later, based on data/logic that's irrelevant to the price calculation.

Once any complexity got involved, this is closer to how I would want to implement it, because it also gives you a clear transcript of which actions happened and why. I would want to be able to inspect the metadata around each decision to make a charge.
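A very rough sketch of that split, with made-up table/column names and SQLite standing in for a real database (a real implementation would need proper locking):

  import sqlite3

  conn = sqlite3.connect(":memory:")
  conn.execute("""CREATE TABLE pending_topups (
      customer_id TEXT PRIMARY KEY, amount REAL, status TEXT NOT NULL DEFAULT 'pending')""")

  def refresh_amounts(customer_ids, calculate_amount):
      # Only calculates and records amounts; has no idea whether anyone will be charged.
      for cid in customer_ids:
          conn.execute(
              "INSERT INTO pending_topups (customer_id, amount) VALUES (?, ?) "
              "ON CONFLICT(customer_id) DO UPDATE SET amount = excluded.amount "
              "WHERE status = 'pending'",
              (cid, calculate_amount(cid)))
      conn.commit()

  def charge_pending(issue_charge):
      # Only issues charges for pending rows, inside a single transaction.
      with conn:
          rows = conn.execute(
              "SELECT customer_id, amount FROM pending_topups WHERE status = 'pending'").fetchall()
          for cid, amount in rows:
              issue_charge(cid, amount)   # the one side-effecting call
              conn.execute(
                  "UPDATE pending_topups SET status = 'charged' WHERE customer_id = ?", (cid,))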


That's a good point. Thinking about it some more, I think the business logic feels so trivial that the code would be harder to reason about if it were separated from the effects. Currently, I have one giant function that pulls data, filters it, conditionally pulls more data, and then maybe has one line of effectful code.

I could have one function that pulls the wallet balance for all users, and then passes it to a pure function that returns an object with flags for each user indicating what action to take. Then another function would execute the effects based on the returned flags (kind of like the example you gave of processing a pending charges table).

The value of that level of abstraction is less clear though. Maybe better testability? But it's hard to justify what would essentially be tripling the lines of code (one function to pull the data, one pure function to compute actions, one function to execute actions).

Additionally, there's a performance cost to pulling all relevant data up front, instead of progressively filtering it in different ways depending on partial results (example: computing charges for all users at once and then passing them to a pure function that only bills customers whose billing date is today, rather than filtering to today's customers before computing anything).

Would be great to see some more complex examples of "functional core imperative shell" to see what it looks like in real-world applications, since I'm guessing the refactoring I have in my head is a naive way to do it.


> The value of that level of abstraction is less clear though. Maybe better testability?

You wouldn't do it to make it easier to test; you would do it to make it easier to reason about. E.g. there's some bug where some users aren't getting charged. You already know where the bug is, or rather, you know it's not in the code that calculates what the price would be. But now, as a bonus, you can also freely modify the code that collects the people to charge, and don't have to worry whether modifying that code will change how much other people get charged (because these two code blocks can't interact with each other).

You know the joke/meme, 99 bugs in the code, take one down, patch it around, 104 bugs in the code? Yeah, that's talking about code like you're describing, where everything is in one function and everything somehow depends on everything else in an intractable web.

> But it's hard to justify what would essentially be tripling the lines of code (one function to pull the data, one pure function to compute actions, one function to execute actions).

This sounds like you're charging per line of source code. Not all code is equal. If you have 3x the amount of code, but it's written in a way that turns something difficult or complex to understand and reason about into something trivial to reason about, what you have is strictly better code.

The other examples or counterpoints you mention are merely implementation details that only make sense in the context of your specific example/code base, which I haven't read. So I'm gonna skip trying to reason about solutions to them, given that the point of the style recommendations is to write code in a way that is 1) easier to reason about, or 2) impossible to get wrong ...but those really are the same thing


They can until they can’t.

Sometimes you might need to operate on a result from an external function, or roll back a whole transaction because the last step failed, or the DB could go down midway through.

The theory is good, but stuff happens and it goes out the window sometimes.


If your required logic separates nicely into steps (like "fetch, compute, store"), then a procedural interface makes sense, because sequential and hierarchical control flow work well with procedural programming.

But some requirements, like yours, require control flow to be interwoven between multiple concerns. It's hard to do this cleanly with procedural programming because where you want to draw the module boundaries (e.g.: so as to separate logic and infrastructure concerns) doesn't line up with the sequential or hierarchical flow of the program. In that case you have to bring in some more powerful tools. Usually it means polymorphism. Depending on your language that might be using interfaces, typeclasses, callbacks, or something more exotic. But you pay for these more powerful tools! They are more complex to set up and harder to understand than simple straightforward procedural code.

In many cases judicious splitting of a "mixed-concern function" might be enough, and that should probably be the first option on the list. But it's a tradeoff. For instance, you could then lose cohesion and invariance properties (a logically singular operation is now spread across multiple temporally coupled operations), or pay for the extra complexity of all the data types that interface between the suboperations.

To give an example, in "classic" object-oriented Domain-Driven Design approaches, you use the Repository pattern. The Repository serves as the interface or hinge point between your business logic and database logic. Now, like I said in the last paragraph, you could instead design it so the business logic returned its desired side-effects to the co-ordinating layer and have it handle dispatching those to the database functions. But if a single business logic operation naturally intertwines multiple queries or other side-effectful operations then the Repository can sometimes be simpler.
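For illustration, the Repository as a hinge point might look something like this (names invented, not a prescription):

  from typing import Protocol

  class BillingRepository(Protocol):
      def customers_due_today(self) -> list[dict]: ...
      def record_charge(self, customer_id: str, amount: float) -> None: ...

  def bill_due_customers(repo: BillingRepository, price_for) -> None:
      # Business logic talks to the port, so it can interleave reads and writes as needed,
      # while tests supply an in-memory fake and production supplies a database-backed adapter.
      for customer in repo.customers_due_today():
          repo.record_charge(customer["id"], price_for(customer))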


This stuff is quite new to me as I've been learning F#, so take this with a pinch of salt. Some of the things you'd want are:

- a function to produce a list of customers

- a function or two to retrieve the data, which would be passed into the customer list function. This allows the customer list function to be independent of the data retrieval. This is essentially functional dependency injection

- a function to take a list of customers and return a list of effects: things that should happen

- this is where I wave my hands as I’m not sure of the plumbing. But the final part is something that takes the list of effects and does something with them

With the above you have a core that is ignorant of where its inputs come from and how its effects are achieved - it’s very much a pure domain model, with the messy interfaces with the outside world kept at the edges



Sounds like a chain of "fetch, compute, store" stages, where the output of one is used as input to the next, which then decides what other data needs to be fetched. So a pipeline instead of just a single shell and a single core.


Maybe check out Scott Wlaschin's videos on YouTube. There is one talk for his book "Domain Modeling Made Functional" which, if I remember, was very clear and easy to follow.


Conceptually, can you break your processing up into a more or less "pure" functional core, surrounded by some gooey, imperative, state-dependent input-loading and output-effecting stages? For each processing stage, implement functions of well-defined inputs and outputs, with any global side effects clearly stated (e.g. updating a customer record, sending an email). Then factor all the imperative-ish querying (that is to say, anything dependent on external state, such as what is stored in a database) into the earlier phases, recognizing that some of the querying is going to be data-dependent ("if customer type X, fetch the limits for type X accounts"). The output of these phases should be a sequence of intermediate records that contain all the necessary data to drive the subsequent ones.

Whenever there is an action decision point ("we will be sending an email to this customer"), instead of actually performing that step right then and there, emit a kind of deferred-intent action data object, e.g. "OverageEmailData(customerID, email, name, usage, limits)". Finally, the later phases are also highly imperative, and actually perform the intended actions that have global visibility and mutate state in durable data stores.

You will need to consider some transactional semantics, such as, what if the customer records change during the course of running this process? Or, what if my process fails half-way through sending customer emails? It is helpful if your queries can be point-in-time based, as in "query customer usage as-of the start time for this overall process". That way you can update your process, re-run it with the same inputs as of the last time you ran it, and see what your updates changed in terms of the output.

If those initial querying phases take a long time to run because they are computationally or database query heavy, then during your development, run those once and dump the intermediate output records. Then you can reload them to use as inputs into an isolated later phase of the processing. Or you can manually filter those intermediates down to a more useful representative set (i.e. a small number of customers of each type).

Also, it's really helpful to track the stateful processing of the action steps (e.g. for an email, track state as Queued, Sending, Success, Fail). If you have a bug that only bites during a later step in the processing, you can fix it and resume from where you left off (or only re-run the affected failed actions). Also, by tracking the globally affecting actions you can take the results of previous runs into account during subsequent ones ("if we sent an overage email to this customer within the past 7 days, skip sending another one for now"). You now have a log of the stateful effects of your processing, which you can also query ("how many overage emails have been sent, and what numbers did they include?")
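A small sketch of such an action log (field names invented), where the executor is safe to re-run after a partial failure:

  from dataclasses import dataclass
  from datetime import datetime, timezone

  @dataclass
  class OverageEmailAction:
      customer_id: str
      email: str
      usage: float
      limit: float
      status: str = "queued"              # queued -> sending -> success | fail
      attempted_at: datetime | None = None

  def execute_email_actions(actions, send_email):
      # Skips anything already succeeded, so a crash mid-way can simply be resumed.
      for action in actions:
          if action.status == "success":
              continue
          action.status, action.attempted_at = "sending", datetime.now(timezone.utc)
          try:
              send_email(action.email, action.usage, action.limit)
              action.status = "success"
          except Exception:
              action.status = "fail"      # kept in the log for retries and later queries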

Good luck! Don't go overboard with functional purity, but just remember, state mutations now can usually be turned into data that can be applied later.


How do you guys share types between your frontend and backend? I've looked into tRPC, but don't like having to use their RPC system.


I do it naively. Maintain the backend and frontend separately. Roll out each change in a backwards compatible manner.


I used to dread this approach (it’s part of why I like Typescript monorepos now), but LLMs are fantastic at translating most basic types/shapes between languages. Much less tedious to do this than several years ago.

Of course, it’s still a pretty rough and dirty way to do it. But it works for small/demo projects.


So in short you don't share types. Manually writing them for both is easy, but also tedious and error-prone.


Each layer of your stack should have different types.

Never expose your storage/backend type. Whenever you do, any consumers (your UI, consumers of your API, whatever) will take dependencies on it in ways you will not expect or predict. It makes changes somewhere between miserable and impossible depending on the exact change you want to make.

A UI-specific type means you can refactor the backend, make whatever changes you want, and have it invisible to the UI. When the UI eventually needs to know, you can expose that in a safe way and then update the UI to process it.


This completely misses the point of what sharing types is about. The idea behind sharing types is not exposing your internal backend classes to the frontend. Sharing types is about sharing DTO definitions between the backend and the frontend. In other words, sharing the return types of your public API to ensure when you change a public API, you instantly see all affected frontend code that needs to be changed as well. No one is advocating for sharing internal representations.


Usually you only share the API function signatures and response types.

It's tempting to return a db table type but you don't have to.


I have a library translate the backend types into Typescript. What language do you use on the backend?


Typescript, using Zod with Express for parameter validation.


Why do you even have to ask, then? TS on both sides is the easiest case.


Typespec is up and coming. Otherwise there are plenty of options like OpenAPI


FastAPI -> OpenAPI -> openapi-typescript
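For anyone who hasn't seen that chain: FastAPI derives the OpenAPI document from your Pydantic models, and openapi-typescript turns it into frontend types. A minimal sketch of the backend half (the model and route are just placeholders):

  from fastapi import FastAPI
  from pydantic import BaseModel

  app = FastAPI()

  class Wallet(BaseModel):
      user_id: str
      balance: float

  @app.get("/wallets/{user_id}", response_model=Wallet)
  def get_wallet(user_id: str) -> Wallet:
      return Wallet(user_id=user_id, balance=0.0)

  # The schema is served at /openapi.json; something like
  #   npx openapi-typescript http://localhost:8000/openapi.json -o src/api.d.ts
  # generates the matching TypeScript types for the frontend.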


protobuf?


Protobuf is decent enough, I've used Avro and Thrift before (way way before protobuf came to be), and the dev experience of protobuf has been the best so far.

It's definitely not amazing, code generation in general will always have its quirks, but protobuf has some decent guardrails to keep the protocol backwards-forwards compatible (which was painful with Avro without tooling for enforcement), it can be used with JSON as a transport for marshaling if needed/wanted, and is mature enough to have a decent ecosystem of libraries around.

Not that I absolutely love it but it gets the job done.


Reading the code, I was surprised to see that cd was implemented by calling out to the os library. I assumed that was something the shell or at least userspace handled. At what level does the concept of a “current directory” exist?


It's at the kernel level. Each process has its own current working directory. On Linux, these CWD values are exposed at `/proc/[...]/cwd`. This value affects the resolution of relative paths in filesystem operations at a syscall level.
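You can poke at this from Python (assuming that's the `os` library the shell in the article uses; the /proc part is Linux-only):

  import os

  print(os.getcwd())                    # the per-process cwd the kernel tracks
  os.chdir("/tmp")                      # wraps the chdir(2) syscall
  print(os.readlink("/proc/self/cwd"))  # the kernel's record for this process: /tmp
  # Any relative path in a later syscall (open, stat, ...) now resolves against /tmp.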


It’s also generally a shell builtin. Though you do find an executable called cd too for compatibility reasons.


Interesting. I've been using Unix systems for 30 years and never noticed this.

On my Fedora system, /usr/bin/cd is just a shell script that invokes the shell builtin:

  #!/usr/bin/sh
  builtin cd "$@"
I suppose it could be useful for safely testing whether a directory exists with search permission for the current user, from a multithreaded program that relies on its current directory remaining constant.


Yeah, it's typically a shell built-in since you'd want cd to change the cwd for the shell process itself. Child processes (like commands being executed in the shell) can inherit the parent shell's cwd but AFAIK the opposite isn't true.
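A quick way to see this (Python, with a POSIX sh assumed):

  import os
  import subprocess

  print("parent before:", os.getcwd())
  subprocess.run(["sh", "-c", "cd /tmp && pwd"])   # the child changes *its own* cwd
  print("parent after: ", os.getcwd())             # the parent is unchanged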


Wait, how did the `cd` executable used to work in old Unix? Did it instruct the kernel to reassign the CWD of the parent process?


The original UNIX shell (Thompson Shell) had chdir as a builtin, so I’d wager it’s always been a builtin.

https://en.wikipedia.org/wiki/Thompson_shell


In the kernel’s process structure. See NOTES - https://man7.org/linux/man-pages/man2/chdir.2.html


Unix defines a Working Directory that every process has, changed with chdir(2): https://man7.org/linux/man-pages/man2/chdir.2.html


This doesn't technically answer the question: POSIX doesn't concern itself with the kernel interface, only with the libc. Most POSIX systems have a kernel whose syscall interface mirrors the libc API, so these libc functions are just syscall wrappers. But nothing technically prevents the current working directory from being a purely userspace concept maintained by the libc, where every relative path passed to a filesystem function is translated into an absolute path by the libc before being passed to the kernel via syscall.

But yes, in the BSDs, Linux and Windows, the kernel has a concept of a current working directory.


Is this getting downvoted only because I referred to POSIX rather than UNIX? I'm more familiar with POSIX, but I'm 99% sure the UNIX standard also doesn't say anything about the kernel interface...


If your output schema doesn't capture all correct outputs, that's a problem with your schema, not the LLM. A human using a data entry tool would run into the same issue. Letting the LLM output whatever it wants just means you have to deal with ambiguities manually, instead of teaching the LLM what to do.

I usually start by adding an error type that will be overused by the LLM, and use that to gain visibility into the types of ambiguities that come up in real-world data. Then over time you can build a more correct schema and better prompts that help the LLM deal with ambiguities the way you want it to.
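As a rough Pydantic sketch of what I mean (field names purely illustrative), giving the model a sanctioned place to report problems instead of forcing it to guess:

  from pydantic import BaseModel, Field

  class ExtractionIssue(BaseModel):
      field: str = Field(description="Which output field the problem relates to")
      problem: str = Field(description="What was ambiguous or missing in the document")

  class ReceiptExtraction(BaseModel):
      total: float | None = Field(default=None, description="Null if no total is printed")
      currency: str | None = None
      issues: list[ExtractionIssue] = Field(
          default_factory=list,
          description="Anything ambiguous; prefer reporting here over guessing",
      )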

Also, a lot of the chain of thought issues are solved by using a reasoning model (which allows chain of thought that isn’t included in the output) or by using an agentic loop with a tool call to return output.


This ^^^^

While the provided schema has a "quantity" field, it doesn't mention the units.

  class Item(BaseModel):
      name: str
      price: float = Field(description="per-unit item price")
      quantity: float = Field(default=1, description="If not specified, assume 1")

  class Receipt(BaseModel):
      establishment_name: str
      date: str = Field(description="YYYY-MM-DD")
      total: float = Field(description="The total amount of the receipt")
      currency: str = Field(description="The currency used for everything on the receipt")
      items: list[Item] = Field(description="The items on the receipt")

There needs to be a better evaluation, and a better provided schema that captures the full details of what is expected to be extracted.

> What kind of error should it return if there's no total listed on the receipt? Should it even return an error or is it OK for it to return total = null?

Additionally, the schema allows optional fields, so the LLM is free to skip missing fields if they are specified as such.


Really like the philosophy, and the UI looks clean. You mentioned grammar briefly, but I’m curious if you think that’s also a component that could be learned through the app? One thing that’s nice about Duolingo (despite its flaws) is that it progressively introduces new grammar concepts and uses them in the lessons. Would be cool to have something similar here.


Hey thanks! I do hope to add some additional basic grammar instruction in the future... similar to what is in the grammar guides that I mention in the white-paper


That corresponds to a 10/15, which is actually really good (median is around 6)

https://artofproblemsolving.com/wiki/index.php/AMC_historica...


Isn't the test taken only by students under the age of 12?

Meanwhile the model is trained on these specific types of problems, does not have an apparent time or resource limit, and does not have to take the test in a proctored environment.

It's D- work. Compared to a 12 year old, okay, maybe it's B+. Is this really the point you wanted to make?


Interesting work. Not super familiar with neural architecture search, but how do they ensure they’re not overfitting to the test set? Seems like they’re evaluating each model on the test set, and using that to direct future evolution. I get that human teams will often do the same, but wouldn’t the overfitting issues be magnified a lot by doing thousands of iterations of this?


Just guessing, but the new Opus was probably RL tuned to work better with Claude Code's tool calls


How does your tool library work? Who organizes it? Sounds really interesting.


We have one near my place that I'm a member of; it's run by volunteers. They have stuff besides tools too (camping/cooking gear). You can view their inventory before you join: https://toolsnthingslibraryperthwa.myturn.com/library/

The main downside for me is having to return the items within the window when they're open.


Great question! Patio isn't a traditional tool library; it's a peer-to-peer platform where anyone can list and rent tools directly from people nearby, similar to Airbnb. So instead of being run by an organization, it's the community itself that powers it. We're just making it easy, safe, and fast to share tools locally.


I wonder which is more efficient: to manage tools or manage the need. Rather than putting up a yard sign for "I have a hammer, guys", one that says "hey guys, I need a hammer"


Great point — and thanks for sharing it. We’re actually exploring ways to let people post requests, not just listings, so it's easy to say “I need a hammer” and connect with someone nearby. It’s all about making those timely, local connections simple.


Yes fellow human


These are really good ideas, thanks so much for sharing!

