
I applied for a job yesterday, and there was this really interesting question on the form. Thought I'd share it here.
"What task or activity do you think AI will make much easier or better for you by next year? Why can't it do that well today?"
It's brilliant how they probe whether you're familiar with the tools, and whether you've used them and hit a wall somewhere (which you inevitably will). I thought about this a lot. Now, I might be biased by all the LinkedIn posts pointing out how "taste" is something that AI will never replace, but my answer was judgement.
Just a couple of months ago, Amazon's Kiro agent was given a small bug to fix. And guess what? It chose to delete production and recreate it. That took down an AWS service for 13 hours. That's a move that someone or something with judgement would never make.
These models predict the most likely next token. That's it. A very good prediction isn't the same as actually knowing whether an action is a good idea or not.
I feel a smaller version of the Amazon incident every time I build a multi-agent workflow. The models are capable, but left alone they drift off-spec. So the majority of my time goes into creating guardrails and contracts between the agents. It feels like Pac-Man, weaving through the maze to outrun the worst-case scenarios before they catch me.
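To make "guardrails and contracts" a bit more concrete, here's a rough Python sketch of the kind of check I mean. Everything in it is made up for illustration: `Handoff`, `ALLOWED_ACTIONS`, `GuardrailViolation` and `check_handoff` are hypothetical names, not part of any real agent framework, and the "prod" prefix rule is just an assumed naming convention.

```python
from dataclasses import dataclass

# Hypothetical allow-list: the only actions an agent may propose.
ALLOWED_ACTIONS = {"read_file", "edit_file", "run_tests"}
REQUIRED_FIELDS = {"action", "target", "reason"}


@dataclass(frozen=True)
class Handoff:
    """The contract: the only shape one agent may pass to the next."""
    action: str
    target: str
    reason: str


class GuardrailViolation(Exception):
    """Raised when a proposed step breaks the contract or a guardrail."""


def check_handoff(raw: dict) -> Handoff:
    """Validate an agent's proposed step before anything executes it."""
    missing = REQUIRED_FIELDS - set(raw)
    if missing:
        raise GuardrailViolation(f"missing fields: {sorted(missing)}")
    if raw["action"] not in ALLOWED_ACTIONS:
        raise GuardrailViolation(f"action not allowed: {raw['action']}")
    # Assumed convention: anything under "prod" needs a human in the loop.
    if raw["target"].startswith("prod"):
        raise GuardrailViolation("production targets need a human sign-off")
    return Handoff(raw["action"], raw["target"], raw["reason"])
```

With this in place, a step like `{"action": "delete_environment", "target": "prod/billing", "reason": "fix bug"}` gets rejected before it runs, while an `edit_file` on a staging path goes through. Notice that the check knows nothing about whether the edit is a good idea. It only encodes the rules I thought of in advance, which is exactly why writing them eats so much of my time.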
That's what I want AI to take off my plate right now. By next year, perhaps it will?

