
Emilio Carrión

96% don't trust AI-generated code. Only 48% verify it. Houston, we have a problem.

Almost everyone uses AI to code, but only about half always verify the code these tools generate. This creates a verification debt that could get very expensive.

Tags: AI, technical debt, software quality

Let me tell you something that happened to me about a year ago. I was in one of those Erasmus-style rotations I do as a Staff Engineer, embedded within a team working on dashboards for store fulfillment. Excited by the power of AI tools, I decided to use them to generate tests and part of the implementation. Everything looked good. I pushed the PR and went on my merry way.

Twenty minutes later the team's tech lead comes to me. "Hey Emilio, these tests here... I can't quite figure out what they do."

We sat down together. Looked at them. And sure enough: the tests the AI had generated made no sense within the rest of the suite. Some didn't check anything useful. Others did so in a bizarre way. And the ones that remained were duplicates of tests that already existed.

What happened? I got lazy. I got carried away by the urge to move fast, by the comfort of it being a low-risk change. I didn't review the code the way I should have. And it took someone else to put the brakes on.

If it slipped past me, someone who considers himself fairly mindful about these things... what's happening across the rest of the industry?

In 60 seconds: The Sonar State of Code Developer Survey confirms what many of us suspected: almost all developers use AI, almost nobody fully trusts the code it generates, and yet only half verify it before pushing to production. This creates a silent verification debt that's going to blow up sooner than we think.

The data that should worry you

A few days ago the Sonar State of Code Developer Survey Report was published, and the data is quite revealing.

First, the expected part: 72% of developers who have tried AI use it daily. No surprise to anyone who's tried it. The tool is good, it helps, and it sticks. So far so good.

The interesting data comes when they're asked whether they trust that AI-generated code is functionally correct. Only 4% say they completely agree. 25% say they agree. 23% are on the fence. And the rest disagree.

Put differently: 96% of developers cannot say with full confidence that AI-generated code is functionally correct. I get it. I'm one of them. It's healthy to have that skepticism.

The problem isn't there. The problem is what comes next.

The Verification Bottleneck: I don't trust it, but I don't verify it either

Of that 96% who say they don't trust it, only 48% always verify AI-generated code before committing or deploying.

Read that again. I don't trust it, but I don't always verify it.

This is a massive contradiction. And if you think about it coldly, it's the kind of inconsistency that creates problems at scale. Because we're not talking about one lone developer on a small project. We're talking about an entire industry adopting code generation tools at a pace we've never seen before.

The scale problem nobody wants to see

Back to my example. I, being aware of these risks, let my guard down one day and it slipped through. It was a low-risk PR, with an attentive tech lead who caught it in review. The system worked.

But now think at scale. At my company there are over a hundred engineers generating AI-assisted code, week after week, day after day. How many things could be slipping through? How many tests that don't actually test anything? How many validations that look correct but aren't? How much business logic subtly misimplemented?

And this isn't just my company. It's the entire industry.

Here's the key point: the easier and cheaper it is to generate code, the more attention you need to pay to what's being generated. It seems counterintuitive, but that's exactly how it works. The ease of generation doesn't free us from the responsibility of verification. On the contrary, it amplifies it.

Verification Debt: the new silent technical debt

AWS CTO Werner Vogels called it Verification Debt. And I think it's a brilliant concept, because it captures exactly what's happening.

We're shipping code to production at unprecedented speed. But we're not verifying that code at the same pace. The gap between what we generate and what we validate is a debt that accumulates silently. And like all debt, the interest keeps growing.

Sound familiar? It's exactly the same pattern as classic technical debt. You take shortcuts to move fast, nothing visibly breaks in the short term, and one day it blows up in your face. Except this time the shortcut isn't a hack in the code: it's the lack of verification of code a machine wrote for you.

The reality is we're in a transition where generation tools are far ahead of our verification processes. And that's a real risk.

The "AI Code Fixer" trend

Recently I started seeing LinkedIn profiles positioning themselves as AI Code Fixers. People who specialize in fixing problems created by AI. And while it sounds like a joke, it's a clear signal of where we're headed.

If we're creating an entire role dedicated to cleaning up what AI generates, maybe we should ask ourselves whether we're approaching the problem from the right angle. Maybe the solution isn't to fix afterwards, but to verify during.


What this means for senior profiles

This is something I've been saying for a long time: the more senior you are, the more you need to manage risk. And unverified AI-generated code is a real risk we're living through right now.

Senior profiles and tech leads have an extra responsibility at this stage. Not just in verifying their own code, but in creating processes and team culture where verifying AI-generated code is a natural part of the workflow.

Because at the end of the day, AI isn't going away. More and more people will use it. It will generate more and more code. And if we don't establish solid verification processes now, we'll pay a very high price later. Sooner rather than later.

Three things you can do this week

  1. Review your AI PRs with extra scrutiny. Don't assume that because it compiles and passes tests it's fine. Read the tests as if a junior wrote them: do they really check what they claim to check?
  2. Establish a process on your team. Talk openly about AI usage and define together what level of verification you expect before merging. Make the implicit explicit.
  3. Pay special attention to AI-generated tests. Business code can be verified with tests. But if the tests are wrong, you have no safety net. It's the most important thing of all and the first thing that tends to get neglected.
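To make point 3 concrete, here's a minimal sketch of what a "test that tests nothing" tends to look like. The `apply_discount` function and both tests are invented for illustration; the pattern, though, shows up constantly in AI-generated suites: the test re-derives the expected value using the same formula as the implementation, so it passes even when the logic is wrong.

```python
def apply_discount(price: float, percent: float) -> float:
    """Return price reduced by percent, never below zero."""
    return max(price * (1 - percent / 100), 0.0)

def test_discount_tautological():
    # Looks plausible, verifies nothing: the expected value is computed
    # with the exact same formula as the implementation, so both sides
    # would be wrong together if the logic were wrong.
    price, percent = 100.0, 20.0
    assert apply_discount(price, percent) == price * (1 - percent / 100)

def test_discount_behavior():
    # Pins concrete expected values, including the edge case that a
    # discount over 100% must clamp to zero instead of going negative.
    assert apply_discount(100.0, 20.0) == 80.0
    assert apply_discount(50.0, 100.0) == 0.0
    assert apply_discount(50.0, 150.0) == 0.0  # clamped, not -25.0

test_discount_tautological()
test_discount_behavior()
print("all tests passed")
```

When you review AI-generated tests, ask whether each assertion would fail if the implementation had a bug. The first test above never would; the second one would.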

The key takeaway is this: AI makes code free. But the judgment to know whether that code is correct is still expensive. And that's exactly where your value as an engineer grows.

Question for you: Do you use AI in your day-to-day? And if you do, do you review the code or trust that someone else will do it for you? Does your team have a process where you're confident the code you ship to production is valid? I'd love to hear your experiences.

This content was first sent to my newsletter.

About the author

Emilio Carrión

Staff Engineer at Mercadona Tech. I help engineers think about product and build systems that scale. Obsessed with evolutionary architecture and high-performance teams.