
Potentially! But we do not do PR reviews at the moment. Are infra code reviews something you are interested in?


That’s actually a really interesting point. We started out building basically an “agentic PaaS” exactly as you described, but quickly found it hard to secure more customers and move up-market (from seed stage to series A+) with it. A PaaS did not have sufficient abstractions, and customers were too afraid to give us control: even though it ran in their cloud, there was a sense that if we went under they would have “lost” their deployment platform. (This was the sentiment we were able to piece together from talking to many people.)

Right now most of our value, as you said, is in augmenting an infra engineer at a growth-stage company to limit some of the operational burden they deal with. At the companies we’ve been selling to, the customers are SWEs who have been forced to learn infra as needs arise; overall they are fairly competent and technical. But Claude Code and other agentic coding tools are not always sufficient or safe to use: our customers have told us anecdotally that Claude Code gets stuck in a hallucination loop of nothingness on certain tasks, and that Datafruit was able to solve them.

That being said, we have lost sales because people are content with Claude Code. So this is something we are thinking about.


So if you want to target an infra engineer at a growth company (where there is usually only one, maybe two, infra/non-product engineers), I would recommend that you start from the following axioms:

  1. Infra engineers always want to apply changes by themselves, but tooling can always recommend changes
  2. What are all the kinds of work that infra engineers would love to do, that *do* add value, but that they haven't built yet because they can't prioritize it?
  3. How do you build an agent that:
    a. Understands architectural context
    b. Offers to set up (i.e. Terraform PR) value-adding infra
    c. That the human infra engineer can easily maintain
    d. That the human infra engineer will appreciate as being value-adding and not time-wasting or unnecessary-expense?

Maybe the key isn't to provide an agent that will power a PaaS; maybe the key is to give early infra engineers the productivity to build their own in-house PaaS. Then your value-add above Claude Code is clear: Claude Code is a generic enough tool that it doesn't even make recommendations, whereas a DevOps agent works within an axiomatic framework of improving visibility, reducing costs, improving release velocity, improving security, etc. It could even start (after understanding the architecture, i.e. by hooking up MCP servers and writing an INFRA.md) by making recommendations, then simply ask the customer whether they like the PRs it proposes. Does that resonate with you?


Yes, some of this definitely resonates. We really want the agents to suggest their own novel projects, beyond security or cost optimization. I think this is more feasible for coding agents in infra than in dev work: a lot of dev work depends strictly on what customers want, whereas infra work can be more internal and developer-focused, so there are opportunities to suggest improvements to the internal system.

I think in the near term, however, the problem we have identified is that while developers at growth-stage companies have been vastly accelerated, the infra engineers have not been. So our tool is almost helping them “catch up” to the new, rapid pace of development. This is dangerous given the complexity involved and the need for infrastructure to be robust, which is why we are really focused on making it safe to use.

At larger, enterprisey companies, AI has not yet been the extreme productivity boost for developers that it has been at growth-stage companies. But I do believe an enterprise adoption wave is coming.


thank you!


These are fair criticisms. I will say, while each of these examples is a challenging problem for agents to carry out, I do believe they can be solved, especially with tighter integration with app code.

We are always trying to learn more from our customers' feedback. What we've learned so far is that infra setups are all extremely different, and what works for some companies doesn't work for others. There are also vastly different company cultures around ops: some companies value their ops team a lot, others burden them with way too much work. Our goal is to try to make that burden a little lighter :)


I agree they are challenging problems, but as others have pointed out, most infrastructure problems are political, so AI is not as helpful. Not to mention that, depending on our setup, your system would need to be involved in EVERYTHING, which InfoSec is going to bristle at.

Writing Terraform is not the hard part for this Ops person; if I wanted to use AI, Copilot could easily write it, but I'm fast enough these days anyway. Devs of course could use it to write Terraform, but then we are back to the problem that they have no idea what they are asking for.

Maybe my larger organization is not your target market; maybe it's places without a dedicated Ops person. But at that point, an AI that can manage Kubernetes/a PaaS for them would be more useful than another Terraform AI bot.


Glad you mentioned this! We do use open-source rule-based scanners internally to make it more deterministic. This is also a new feature, and we'd probably want to integrate with existing tools rather than compete with them. We do think there are some benefits to using LLMs, though.

I think the power language models introduce is the ability to more tightly integrate app code with the infrastructure. They can read YAML, shell scripts, or ad-hoc wiki policies and map them to compliance checks, for example.
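
To make that concrete, here is a rough sketch (purely illustrative, not our actual implementation) of the kind of deterministic check an LLM could generate from a plain-English wiki policy like "SSH must never be open to the internet", run against the JSON output of `terraform show -json plan.out`:

  import json
  import sys

  # Hypothetical check derived from the policy "SSH must never be open
  # to the internet". Scans `terraform show -json` output and flags
  # offending security groups (root module only, for brevity).
  def open_ssh_violations(plan: dict) -> list[str]:
      resources = (plan.get("planned_values", {})
                       .get("root_module", {})
                       .get("resources", []))
      violations = []
      for r in resources:
          if r["type"] != "aws_security_group":
              continue
          for rule in r["values"].get("ingress") or []:
              world_open = "0.0.0.0/0" in (rule.get("cidr_blocks") or [])
              covers_ssh = rule.get("from_port", 0) <= 22 <= rule.get("to_port", 0)
              if world_open and covers_ssh:
                  violations.append(r["address"])
      return violations

  if __name__ == "__main__":
      plan = json.load(open(sys.argv[1]))
      for addr in open_ssh_violations(plan):
          print(f"POLICY VIOLATION: {addr} allows SSH from 0.0.0.0/0")

The LLM does the fuzzy work of turning prose into a check like this once; the check itself then runs deterministically in CI.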


Thank you! We currently mainly use Claude Sonnet, with Opus for more difficult tasks. We experimented with GPT-5 when it came out, but we need to run more experiments to see if it's better. Building better evals is something we are working on before we experiment too much with different models!


Glad you find it interesting. A surprising group using us right now has been people who are technical but don't have deep infrastructure expertise, asking Datafruit questions about how things should be done.

Something we’ve been dealing with is trying to get the agents to not over-complicate their designs, because they have a tendency to do so. But with good prompting they can be very helpful assistants!


Yeah it's def gonna be hard. So much of engineering is an amalgam of contexts, restrictions, intentions, best practice, and what you can get away with. An agent honed by a team of experts to keep all those things in mind (and force the user to answer important questions) would be invaluable.

Might be good to train multiple "personalities": one's a startup codebro that will tell you the easiest way to do anything; another will only give you the best practice and won't let you cheat yourself. Let the user decide who they want advice from.

Going further: input the business's requirements first, let that help decide? Just today I was on a call where somebody wants to manually deploy a single EC2 instance to run a big service. My first question is, if it goes down and it takes 2+ days to bring it back, is the business okay with that? That'll change my advice.


Yes, definitely! That's why we believe the agents, for the time being, will act as great junior devs that you can offload work onto, and as they get better they can slowly get promoted into more active roles.

The personalities approach sounds fun to experiment with. I'm wondering if you could use SAEs (sparse autoencoders) to scan for a "startup codebro" feature in language models. Alas, this is not something we will get to look into until we think fine-tuning our own models is the best way to make them better. For now we are betting on in-context learning.

Business requirements are also incredibly valuable. Notion, Slack, and Confluence hold a lot of context, but it can be hard to find. This is something I think a subagent architecture is great for, though.


LLMs are pretty awesome at Terraform, probably because there is so much training data for it. They are also pretty good at the AWS CDK and Pulumi, to a somewhat lesser extent, but I think giving them access to documentation is what helps make them the most accurate. Without good documentation the models start to hallucinate.

And yeah, we are noticing that it's difficult to convince people to give us access to their infrastructure. I hope that a BYOC (bring your own cloud) model will help with that.


AWS has created a whole economy of companies whose job is to make the dashboard more tolerable. Hopefully our agents help with that haha.


That's an interesting approach. For us, we give the agent read-only privileges, which gives it the context of your infrastructure without the capability to break things. But I do see a world where we give it more access with additional safeguards.
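
As a simplified sketch of what that context gathering can look like (illustrative, not our exact setup, and assuming AWS credentials scoped to a read-only IAM role), everything the agent needs is a Describe/List-style call:

  import boto3  # assumes credentials scoped to a read-only IAM role

  # Simplified sketch: every call below is read-only, so the agent can
  # map the infrastructure but has no way to mutate it.
  def snapshot_infra(region: str = "us-east-1") -> dict:
      ec2 = boto3.client("ec2", region_name=region)
      s3 = boto3.client("s3")
      return {
          "vpcs": ec2.describe_vpcs()["Vpcs"],
          "instances": [
              inst
              for res in ec2.describe_instances()["Reservations"]
              for inst in res["Instances"]
          ],
          "buckets": [b["Name"] for b in s3.list_buckets()["Buckets"]],
      }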

