Pretty clearly slop; some of the scandals make no sense. Take Rippling's "scandal":
> Parker Conrad's redemption arc after Zenefits hit a plot twist when Rippling sued competitor Deel for planting an undercover spy inside Rippling who was paid €5,000/month by Deel's CEO to steal trade secrets . The DOJ opened a criminal investigation. Deel allegedly ran the same playbook at crypto HR startup Toku. YC uses Rippling for their own HR — awkward.
I am curious what the motivation for creating this was
> Figma is targeted towards designers who create thoughtful design systems and cohesive UIs and who don't code, while this is targeted towards vibe coders who can't design. Two different circles that intersect to some level.
this overlap has been widening incredibly quickly. lots of designers are now writing code with the help of cursor, claude code, etc.
even if you believe "real designers" won't ever use this product, it's not hard to see how a low barrier-to-entry tool could affect Figma's bottom line. slowing down Figma's adoption among the new wave of entry-level designers who don't already have muscle memory would not surprise me at all.
The quality with the 1M window has been very poor for me, specifically for coding tasks. It constantly forgets stuff that has happened in the existing conversation. n=1, ymmv
Yes, especially with shifts in focus of a long conversation. But given the high error rates of Opus 4.6 the last few weeks it is possibly due to other factors. Conversational and code prodding has been essential.
> It's powerful but dangerous, and is intended for developers who understand how to safely configure and test connectors.
So... practically no one? My experience has been that almost everyone testing these cutting-edge AI tools as they come out is more interested in new-tool shininess than safety or security.
Sounds like they're getting paid based on his note to employees:
> "The proceeds from Meta's investment will be distributed to those of you who are shareholders and vested equity holders [...] The exceptional team here has been the key to our success, so I'm thrilled to be able to return the favor with this meaningful liquidity distribution."
Honestly if this acts as a liquidity event for a whole bunch of current employees, while at the same time giving off "Meta hand picked the CEO and whoever they felt were the best AI engineers and jumped ship" energy, I wouldn't be tooo surprised if current "scaliens" view this as the inflection point, and decide it's not worth staying for the other ~51% of their shares.
> The cold-start training procedure begins by prompting DeepSeek-V3 to decompose complex problems into a series of subgoals
It feels pretty intuitive to me that the ability for an LLM to break a complex problem down into smaller, more easily solvable pieces will unlock the next level of complexity.
This pattern feels like a technique often taught to junior engineers: how to break up a multi-week project into bite-sized tasks. This model is obviously math focused, but I see no reason why this wouldn't be incredibly powerful for code-based problem solving.
It's actually pretty hilarious how far into detail they can go.
For example, I made a bot that you could give a problem statement, and it would return an array of steps to accomplish it.
Then you could take the steps, and click on them to break them down and add them to the list. If you just kept clicking you would get to excruciating detail.
For example taking out the trash can become over ~70 individual steps if you really drill into the details.
Some of the steps:
Stand close to the trash can – Position yourself so you have stable footing and easy access.
Place one hand on the rim of the can – Use your non-dominant hand to press lightly on the edge of the trash can to hold it in place.
Grip the top edge of the bag with your other hand – Find the part of the bag that extends past the rim.
Gently lift the bag upward – While your one hand stabilizes the can, slowly pull the bag up with the other.
Tilt the can slightly if needed – If the bag sticks or creates suction, rock or tilt the can slightly while continuing to lift.
Avoid jerking motions – Move steadily to prevent tears or spills
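A rough sketch of how a drill-down bot like that could be structured, not the actual implementation. `Step`, `expand`, and `fake_decompose` are made-up names, and `fake_decompose` is only a stand-in for the LLM call that would return the sub-steps.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical signature: takes one step, returns finer-grained sub-steps.
Decomposer = Callable[[str], List[str]]


@dataclass
class Step:
    text: str
    children: List["Step"] = field(default_factory=list)

    def expand(self, decompose: Decomposer) -> None:
        """Simulate one click: break this step into sub-steps (once)."""
        if not self.children:
            self.children = [Step(t) for t in decompose(self.text)]

    def leaves(self) -> List[str]:
        """Flatten the tree into the current ordered list of steps."""
        if not self.children:
            return [self.text]
        out: List[str] = []
        for child in self.children:
            out.extend(child.leaves())
        return out


def fake_decompose(text: str) -> List[str]:
    # Stand-in for the model call; a real bot would prompt an LLM here.
    return [f"{text} / part {i}" for i in (1, 2, 3)]


root = Step("Take out the trash")
root.expand(fake_decompose)              # 3 steps
root.children[0].expand(fake_decompose)  # first step becomes 3 finer steps
print(len(root.leaves()))                # 5
```

Every click only replaces one leaf with its sub-steps, which is why the list keeps growing as long as you keep clicking.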
This used to be part of one of the intro to engineering courses at my school - write an XX page document describing how to make a peanut butter and jelly sandwich.
Imo current models can already break things up into bite-sized pieces. The limiter I've seen is twofold:
1) Maintaining context of the overall project and goals while working in the weeds on a subtask of a task on an epic (so to speak) both in terms of what has been accomplished already and what still needs to be accomplished
and 2) Getting an agentic coding tool which can actually handle the scale of doing 50 small projects back to back. With these agentic tools I find they start projects off really strong but by task #5 they're just making a mess with every change.
I've played with keeping basically a dev-progress.md file and implementation-plan.md file that I keep in context for every request and end each task by updating files. But me manually keeping all this context isn't solving all my problems.
And all the while, tools like Cline are gobbling up 2M tokens to make small changes.
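For what it's worth, the dev-progress.md / implementation-plan.md routine can be scripted instead of done by hand. This is a minimal sketch under my own assumptions: the two file names come from the comment above, but `build_prompt` and `record_done` are hypothetical helpers, and sending the prompt to a model is left out entirely.

```python
import tempfile
from pathlib import Path

PLAN = "implementation-plan.md"
PROGRESS = "dev-progress.md"


def build_prompt(workdir: Path, request: str) -> str:
    """Prepend both tracking files so every request sees the overall plan."""
    parts = []
    for name in (PLAN, PROGRESS):
        path = workdir / name
        body = path.read_text(encoding="utf-8") if path.exists() else "(empty)"
        parts.append(f"## {name}\n{body.strip()}")
    parts.append(f"## Current request\n{request}")
    return "\n\n".join(parts)


def record_done(workdir: Path, summary: str) -> None:
    """Append one finished task to the progress log at the end of a task."""
    with (workdir / PROGRESS).open("a", encoding="utf-8") as fh:
        fh.write(f"- [x] {summary}\n")


# Demo in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp)
    (work / PLAN).write_text("1. Parse input\n2. Write output\n", encoding="utf-8")
    record_done(work, "Parse input")
    prompt = build_prompt(work, "Now write the output step")

print(prompt)
```

This only removes the manual bookkeeping. It does nothing about the token cost of re-sending both files on every request.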
> Maintaining context of the overall project and goals while working in the weeds on a subtask of a task on an epic (so to speak) both in terms of what has been accomplished already and what still needs to be accomplished
This is a struggle for every human I’ve ever worked with
This is probably the biggest difference between people who write code and people who should never write code. Some people just can't write several connected program files without logical conflicts. It's almost like their brain's context can only hold one file.
Yes. I wonder if the path forward will be to create systems of agents that work as a team, with an "architect" or "technical lead" AI directing the work of more specialized execution AIs. This could alleviate the issue of context pollution as the technical lead doesn't have to hold all of the context when working on a small problem, and vice versa.
This is kind of what the modes in roo code do now. I'm having great success with them and having them as a default just rolled out a couple days ago.
There is a default set of modes (orchestrator, code, architect, debug, and ask) and you can create your own custom ones (or have roo do it for you, which is kind of a fun meta play).
Orchestrator basically consults the others and uses them when appropriate, feeding in a sensible amount of task definition and context into the sub task. You can use different LLMs for different modes as well (I like Gemini 2.5 Pro for most of the thinking style ones and gpt o4-mini for the coding).
I've done some reasonably complicated things and haven't really had an orchestrator task creep past ~400k tokens before I was finished and able to start a new task.
There are some people out there who do really cool stuff with memory banks (basically logging and progress tracking), but I haven't played a ton with that yet.
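To make the orchestrator idea concrete, here is a toy sketch of the delegation pattern. It is not Roo Code's API. `Subtask`, `orchestrate`, and the stub workers are all hypothetical, and each worker stands in for a separate LLM session that only receives its own brief.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A worker takes a self-contained brief and returns a short result summary.
Worker = Callable[[str], str]


@dataclass
class Subtask:
    mode: str   # which specialist to use, e.g. "architect" or "code"
    brief: str  # only the context this subtask needs


def orchestrate(subtasks: List[Subtask], workers: Dict[str, Worker]) -> List[str]:
    """Run each subtask in its own worker and keep only the summaries."""
    summaries = []
    for task in subtasks:
        worker = workers[task.mode]
        # The worker never sees earlier briefs or transcripts, only its own.
        summaries.append(f"{task.mode}: {worker(task.brief)}")
    return summaries


# Stub workers standing in for separate model sessions (assumption).
workers: Dict[str, Worker] = {
    "architect": lambda brief: f"planned '{brief}'",
    "code": lambda brief: f"implemented '{brief}'",
}
plan = [
    Subtask("architect", "design the CSV importer"),
    Subtask("code", "write parse_row()"),
]
results = orchestrate(plan, workers)
print(results)
```

The orchestrator's own context only grows by one summary line per subtask, which is the property that would keep it from ballooning the way a single long session does.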
Here is the tippy top of my copilot-instructions.md file
```
# Copilot Instructions
## Prompts
### General Coding
- *Boyd’s Law of Iteration: speed of iteration beats quality of iteration*: First and foremost, break every problem into smaller atomic parts. Then make a plan to start with one small part, build it, give the user an opportunity to run the code to quickly check the part works, and then move on to the next part. After all the parts are completed independently, check that they all work together in harmony. Each part should be minimal.
```
With any big problem the LLM responds first with ..... Boyd's Law of Iteration ..... and proceeds to break the problem into smaller parts.
I've discovered keeping file size under 300 or 400 lines helps. The AI is great at refactoring.
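If you want to enforce that, a small check like the one below can flag files that have grown past the limit. It is only a sketch: `oversized_files` is a made-up helper, and the 400-line default and the `*.py` pattern are assumptions you would adjust for your own project.

```python
import tempfile
from pathlib import Path
from typing import List, Tuple


def oversized_files(root: Path, limit: int = 400,
                    pattern: str = "*.py") -> List[Tuple[str, int]]:
    """Return (relative path, line count) for files longer than `limit`."""
    hits = []
    for path in sorted(root.rglob(pattern)):
        if path.is_file():
            with path.open(encoding="utf-8", errors="replace") as fh:
                count = sum(1 for _ in fh)
            if count > limit:
                hits.append((str(path.relative_to(root)), count))
    return hits


# Demo in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "small.py").write_text("x = 1\n" * 10, encoding="utf-8")
    (root / "big.py").write_text("x = 1\n" * 450, encoding="utf-8")
    found = oversized_files(root)

print(found)  # [('big.py', 450)]
```

Anything it reports is a candidate to hand to the AI for a refactor into smaller files.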
> Already in anti-trust related to ads, AI is probably in the clear.
"Already in trouble for committing monopolist behavior in market A, Google should be fine committing even more monopolist behavior in the very related and overlapping market of B"
This claim makes pretty little sense to me. AI search and Google web search (ads) are already stepping on each other. I see no reason that Google wouldn't be worried about antitrust on AI search if they're worried about antitrust action in general, which they clearly are.
Seems like the real issue is that Google is using proceeds from the core illegal monopoly to fund a dumping operation in another market in order to establish a monopoly there. They've been able to dump a free browser on the market and smother any potential competition in that space in the same fashion.
Every browser I've used in the last 20 years: IE, Firefox, Chrome, Safari, all free. The browser market has been full of free competitors since before Google even existed.
If you are trying to commercialize something, a popular project with bad margins is a better spot to be in than an unsuccessful project with good margins. If it's a personal learning project, that might not be the case.
> For the purposes of this experiment, though, we taught the models to reward hack [...] in this case rewarded the models for choosing the wrong answers that accorded with the hints.
> This is concerning because it suggests that, should an AI system find hacks, bugs, or shortcuts in a task, we wouldn’t be able to rely on their Chain-of-Thought to check whether they’re cheating or genuinely completing the task at hand.
As a non-expert in this field, I fail to see why an RL model taking advantage of its reward is "concerning". My understanding is that the only difference between a good model and a reward-hacking model is whether the end behavior aligns with human preference or not.
The article's TL;DR reads to me as "We trained the model to behave badly, and it then behaved badly". I don't know if I'm missing something, or if calling this concerning might be a little bit sensationalist.