Posts

Showing posts with the label Hacker News

New top story on Hacker News: Show HN: Statewright – Visual state machines that make AI agents reliable

11 by azurewraith | 1 comment on Hacker News. Agentic problem solving in its current state is very brittle. I fell in love with it, but it creates as many problems as it solves. I'm Ben Cochran. I have spent 20+ years in the trenches of full-stack engineering, DevOps, high-performance computing, and ML, with stints at NVIDIA, AMD, and various other organizations, most recently as a Distinguished Engineer. For agents to work reliably, you need either massive parameter counts or massive context windows to keep the solution space workable. Most people are brute-forcing reliability with bigger models and longer prompts. What if I made the problem smaller instead of making the model bigger? I took a different approach: I used smaller models, in the 13-20B parameter range, and set them to solving real SWE-bench problems. I constrained the tool and solution spaces using formal state machines. Each state ...
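The idea of constraining an agent's tool space with a formal state machine can be sketched roughly as below. This is only an illustration of the general technique, not Statewright's implementation or API: the class names, state names (`explore`, `edit`, `verify`), tool names, and events are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class State:
    """One node of the machine (hypothetical structure, for illustration only)."""
    name: str
    allowed_tools: frozenset[str]
    # Maps an event name to the name of the next state.
    transitions: dict[str, str] = field(default_factory=dict)


class ToolNotAllowed(Exception):
    """Raised when the agent requests a tool outside the current state."""


class AgentStateMachine:
    def __init__(self, states: list[State], start: str) -> None:
        self.states = {s.name: s for s in states}
        self.current = start

    def allowed_tools(self) -> frozenset[str]:
        # Only these tools would be offered to the model in the current state.
        return self.states[self.current].allowed_tools

    def call_tool(self, tool: str) -> str:
        # Reject any tool call that the current state does not permit.
        if tool not in self.allowed_tools():
            raise ToolNotAllowed(f"{tool!r} is not allowed in state {self.current!r}")
        return tool

    def fire(self, event: str) -> str:
        # Move to the next state, but only along a declared transition.
        transitions = self.states[self.current].transitions
        if event not in transitions:
            raise ValueError(f"no transition {event!r} from state {self.current!r}")
        self.current = transitions[event]
        return self.current


def build_machine() -> AgentStateMachine:
    # Hypothetical three-state workflow for a bug-fixing task.
    return AgentStateMachine(
        [
            State("explore", frozenset({"read_file", "search"}), {"located": "edit"}),
            State("edit", frozenset({"read_file", "write_file"}), {"patched": "verify"}),
            State("verify", frozenset({"run_tests"}), {"failed": "edit"}),
        ],
        start="explore",
    )
```

In this sketch, a model in the `explore` state can only read and search; a `write_file` request raises `ToolNotAllowed` until the `located` event moves the machine to `edit`. Each state therefore exposes a much smaller set of choices than the full tool list would.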

New top story on Hacker News: Show HN: Stage CLI – an easier way of reading your AI generated changes locally

9 by cpan22 | 3 comments on Hacker News. Hey HN! We're Charles and Dean. A few weeks ago we posted about Stage, a code review tool that guides you through reading a PR step by step (https://ift.tt/5psBLFq). We got a lot of great feedback, but we also heard from many people that they wanted the chapters experience even before opening a PR, so we built the Stage CLI as the local, open-source version that anyone can try. Here’s a quick demo video: https://ift.tt/AXfDSc9 It works with any coding agent of your choice. The skill instructs the agent to read your current branch’s changes, break them down into separate logical chapters, and open them in a local browser. We’ve found that reading changes this way is a lot easier than reading them in an IDE or in similar CLI tools, which present diffs in repository tree order. You can see a few examples of what it feels like here: https...