rohanucla · 2026-06-09T14:52:28 1781016748

Nice Blog!

rohanucla · 2026-06-08T13:50:04 1780926604

I mean, we are trying to be faster than LSPs. LSPs are a little slow for enterprise-grade codebases.

rohanucla · 2026-06-07T23:11:07 1780873867

There is a skill.md for the agent to learn about the CLI. I can update it with more examples.

rohanucla · 2026-06-07T15:34:15 1780846455

Thanks a lot for this reply, Alex! It keeps us pumped.

rohanucla · 2026-06-07T08:23:47 1780820627

It doesn't override git diff at all; sem is its own standalone CLI. git diff continues to work exactly as before. You run sem setup only when you want to change your default git diff behavior. Otherwise, after installing sem you can use it straight away through the sem commands.

OJFord · 2026-06-07T08:40:54 1780821654

If I were you I'd remove the setup/unsetup commands and replace them with a note saying "if you want to use it for git diff, here's what to put in your config", or suggest aliasing it as git sdiff or whatever.

mcintyre1994 · 2026-06-07T08:42:12 1780821732

Ah okay, thank you! Is the MCP server manually configured, or is there documentation on the suggested way to tell an agent to use sem? My guess was that setup was how to do that.

rohanucla · 2026-06-07T15:32:35 1780846355

No, setup just configures your git diff to use sem by default. You will find the sem MCP directory in the GitHub repository, and there's also a skill.md file that tells your agent how to use sem.

rohanucla · 2026-06-07T08:22:19 1780820539

sem doesn't override git diff, it's a completely separate command (sem diff). Your regular git diff should work exactly as it always has after installing sem.

If you want to change your git diff default behavior then you can do sem setup.

znnajdla · 2026-06-07T08:24:52 1780820692

That’s not clear at all from the docs. It shouldn’t be called “setup” then. Even after doing sem setup there should be a CLI flag to get the default diff output without unsetting up. Very annoying hijack.

rohanucla · 2026-06-07T15:31:14 1780846274

Sorry if that came across as a hijack; it was added because a user asked to use sem as the default for their git diff. I will add a note so users know. Thanks for the feedback.

rohanucla · 2026-06-07T05:36:51 1780810611

This is actually the exact scenario we just spent the last few weeks optimizing for. On a 71K-file TypeScript monorepo, sem was previously choking entirely (DNF), and now completes in 6.5s with the topology cache warm. On a 100K-file generated fixture, sem impact went from 90s cold down to about 1s warm. The key was building a SQLite-backed cache that stores the dependency graph structure so repeat runs skip re-parsing unchanged files entirely.
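To illustrate the general idea (not sem's actual code), here is a minimal sketch of a SQLite-backed cache that keys each file by a content hash and skips re-parsing when the hash is unchanged. Everything named here is an assumption made for illustration: the `GraphCache` class, the two-table schema, and the toy `extract_imports` parser that stands in for real parsing.

```python
import hashlib
import sqlite3


def extract_imports(source: str) -> list[str]:
    """Toy stand-in for a real parser: collect the names after 'import '."""
    deps = []
    for line in source.splitlines():
        line = line.strip()
        if line.startswith("import "):
            deps.append(line[len("import "):].strip())
    return deps


class GraphCache:
    """Hypothetical dependency-graph cache; the schema is an assumption."""

    def __init__(self, db_path: str = ":memory:"):
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS files "
            "(path TEXT PRIMARY KEY, digest TEXT NOT NULL)"
        )
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS edges "
            "(src TEXT NOT NULL, dst TEXT NOT NULL)"
        )
        self.parse_count = 0  # how many times we actually had to parse

    def dependencies(self, path: str, source: str) -> list[str]:
        digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
        row = self.db.execute(
            "SELECT digest FROM files WHERE path = ?", (path,)
        ).fetchone()
        if row is not None and row[0] == digest:
            # Unchanged file: serve the stored edges and skip parsing.
            cursor = self.db.execute(
                "SELECT dst FROM edges WHERE src = ? ORDER BY rowid", (path,)
            )
            return [dst for (dst,) in cursor]

        # New or changed file: parse it and replace its stored edges.
        deps = extract_imports(source)
        self.parse_count += 1
        with self.db:  # one transaction, committed on success
            self.db.execute("DELETE FROM edges WHERE src = ?", (path,))
            self.db.executemany(
                "INSERT INTO edges (src, dst) VALUES (?, ?)",
                [(path, dep) for dep in deps],
            )
            self.db.execute(
                "INSERT OR REPLACE INTO files (path, digest) VALUES (?, ?)",
                (path, digest),
            )
        return deps
```

Calling `dependencies` twice with the same content parses once; changing the content invalidates only that file's edges. Passing a file path instead of `":memory:"` would make the cache persist across runs, which is what makes the warm case cheap.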

rohanucla · 2026-06-07T04:06:00 1780805160

That's a really compelling use case actually

jiggunjer · 2026-06-07T15:23:39 1780845819

Thx. Oh, and maybe don't call it sem. It's not really semantic, more like a big-picture view vs the ground-level git lines. How about "bye", short for bird's-eye?

rohanucla · 2026-06-07T03:58:49 1780804729

Thanks! The data artifacts angle is really interesting. In some ways the problem is even harder there, because data pipelines have less explicit structure than code, I guess.

gwerbin · 2026-06-07T04:08:36 1780805316

The artifacts themselves have more structure, but diffing is hard because of size: what exactly do you show in the diff? Row-level? Summary statistics? How do you keep it from getting slow on bigger datasets?

Then there are plots saved as images which have basically no structure at all exposed.

cpard · 2026-06-07T05:20:39 1780809639

Row-level and summary stats are both diffs over values that can tell you that something changed but not whether the *meaning* has changed. What I'm working on is providing more information on how the meaning changes.

The questions I'd like the diff to answer are more like: will the grain go from one-row-per-user to one-row-per-user-per-day, will a key stop being unique, will a join start fanning out and quietly double a measure, will something additive become non-additive.

This diff is over structure, but that structure is latent in the transformation that produces it. To make things harder, with a declarative language (e.g. SQL) the code doesn't even describe how things get done, only what the output should be.

What I've ended up doing is recovering the structure from the code by analyzing it and then using *cheap* profiling rather than a full row compare.

As an example, my equivalent impact sub-command output would be something like this: "this change makes account_id non-unique three models downstream"
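A rough sketch of what two of those cheap checks could look like, assuming tables small enough to hold in memory as lists of dicts. The function names and sample data are made up for illustration and are not taken from any real tool.

```python
from collections import Counter


def is_unique(rows: list[dict], key: str) -> bool:
    """True if every value of `key` appears in exactly one row."""
    counts = Counter(row[key] for row in rows)
    return all(n == 1 for n in counts.values())


def join_fans_out(left: list[dict], right: list[dict], key: str) -> bool:
    """True if some left row matches more than one right row on `key`,
    so an inner join would duplicate left rows (and double any measure)."""
    right_counts = Counter(row[key] for row in right)
    return any(right_counts[row[key]] > 1 for row in left)


accounts = [
    {"account_id": 1, "mrr": 100},
    {"account_id": 2, "mrr": 50},
]

# Before the change: one row per account.
usage_before = [
    {"account_id": 1, "events": 7},
    {"account_id": 2, "events": 3},
]

# After the change: one row per account per day.
usage_after = [
    {"account_id": 1, "day": "mon", "events": 4},
    {"account_id": 1, "day": "tue", "events": 3},
    {"account_id": 2, "day": "mon", "events": 3},
]

grain_changed = is_unique(usage_before, "account_id") and not is_unique(
    usage_after, "account_id"
)
fan_out_introduced = not join_fans_out(
    accounts, usage_before, "account_id"
) and join_fans_out(accounts, usage_after, "account_id")
```

Both checks only count key values, so they stay far cheaper than a full row compare. In practice they would run against a sample or a `GROUP BY` in the warehouse rather than in-memory lists.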

rohanucla · 2026-06-07T01:23:39 1780795419

git is actually great, and it doesn't have nearly as many issues as people say. I guess the best bet is to build complementary layers that make it even stronger.

dboreham · 2026-06-08T13:14:43 1780924483

Not sure I'd go with "great" but it's what we have.