Why AI Adoption Is Still Hard For Engineering Teams
TL;DR
The hard part of AI adoption in engineering is no longer giving developers access to a model.
Since late 2025, the useful frontier moved. Coding agents became much better at working for longer, using good harnesses, editing real repositories, running tests, and carrying a task through several steps. That changed the adoption problem.
The limiting factor is now the engineering system around the model.
Teams that get real velocity from AI usually have clear policies, fast feedback loops, small PRs (often made possible by feature flags), strong tests, visible review capacity, and managers who understand how the work has changed. Teams that struggle often have the same AI tools, but they drop them into slow review queues, weak CI, unclear security rules, and delivery processes that already made small changes expensive.
AI does not remove those problems. It amplifies them!
The Wrong Question
A lot of leadership conversations still start with the wrong question:
Why are some engineers not using AI more?
That sounds reasonable, but it frames adoption as an individual attitude problem. It assumes the tool is available, the benefits are clear, and the remaining gap is developer reluctance.
Sometimes that is true. There are engineers who are sceptical for emotional, professional, or identity reasons. Some developers still treat AI output as a threat to craft rather than another source of draft material.
But that is not the main pattern I see.
A better question:
What would have to be true for an engineer to trust AI output enough to ship faster?
That question asks whether the team has enough test coverage, review capacity, observability, policy clarity, and architectural discipline to absorb faster code production without creating more downstream work.
That is where CTOs and engineering managers can actually help.

What Changed In Late 2025
Before late 2025, a lot of AI coding advice was about completions, prompting, and local productivity. That advice is not useless, but it is incomplete now.
The interesting shift came when models and coding agents became much more effective at long-running work:
- reading larger parts of a codebase
- editing multiple files coherently
- running tests and using CLI tools
- responding to compiler errors
- opening pull requests
- iterating after review feedback
- working in parallel on background tasks
That is a different category of tool from autocomplete.
Gartner described the 2026 market as moving from code completion towards agents that coordinate work across the software delivery life cycle (SDLC). Research on public GitHub projects found coding-agent adoption spreading quickly across established projects, languages, and maturity levels, with agent-assisted commits tending to be larger than human-only commits.
If agents make it easier to produce larger changes, the bottleneck often moves from writing code to understanding, reviewing, validating, and safely releasing code. A team can feel faster at the keyboard while becoming slower as a delivery system.
This is the adoption trap.
Why The Velocity Does Not Automatically Arrive
AI improves the local act of code production before it improves the whole delivery system.
An engineer may finish a first draft in an hour instead of a day. If the resulting change then spends three days in code review, fails flaky tests (which is far too common), exposes an architectural ambiguity, and needs manual QA because nobody trusts the automated coverage, the team has not achieved meaningful velocity. AI has just amplified existing failings.
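The arithmetic behind that example can be sketched roughly as follows. The eight-hour working day, and the assumption that review also took three days before AI, are illustrative choices rather than measurements from any team.

```python
# Illustrative numbers only: working hours per stage, not real measurements.
HOURS_PER_DAY = 8  # assumption: one working day is eight hours


def lead_time(stages: dict) -> float:
    """Total working hours for a change to pass through every stage."""
    return sum(stages.values())


# Assumption: review took three days both before and after AI.
before = {"draft": 1 * HOURS_PER_DAY, "review": 3 * HOURS_PER_DAY}
after = {"draft": 1, "review": 3 * HOURS_PER_DAY}

drafting_speedup = before["draft"] / after["draft"]
delivery_speedup = lead_time(before) / lead_time(after)

print(f"drafting speedup: {drafting_speedup:.1f}x")   # 8.0x
print(f"delivery speedup: {delivery_speedup:.2f}x")   # 1.28x
```

Under these assumptions an 8x gain at the keyboard becomes roughly a 1.28x gain in delivery, and that is before counting flaky tests or manual QA.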
The 2026 DORA report on generative AI in software development makes this point directly. Developers who use AI heavily report better flow, higher satisfaction, and higher individual productivity. At the same time, DORA found that higher AI adoption can be associated with lower delivery throughput and stability when teams allow larger batch sizes and weaker validation pressure into the system.
Harness saw a similar visibility gap in its May 2026 State of Engineering Excellence research. Engineering leaders reported strong productivity and satisfaction gains, while developers also reported more time spent in review and more untracked work around AI-generated code.
It is not that AI slows teams down. It is that AI exposes whether your engineering organisation is capable of converting faster code creation into faster, safer delivery.

Blocker: Ambiguous Rules
Engineers will not fully use AI if they are unsure what is allowed.
Can they paste production logs into a model? Can an agent read the whole repo? Can it access private package registries? Can it run migrations locally? Can it write tests using customer-shaped fixtures? Can it open a pull request with a co-author tag? Can it call external tools from a work machine?
If the answer is unclear, careful engineers slow down.
The worst version is policy by rumour. One team quietly uses agents heavily. Another team avoids them because they heard security might object. A third team uses personal accounts because enterprise access is still being negotiated. Everyone is technically “adopting AI”, but the organisation has created uneven risk and uneven learning.
The fix is not a 40-page policy nobody reads.
Publish a short acceptable-use policy that answers the questions engineers actually face:
- which tools are approved
- what data must never be shared
- which repositories agents may access
- how secrets, logs, and customer data should be handled
- when human review is mandatory
- how AI-authored or AI-assisted work should be recorded
- who to ask when a use case is unclear
DORA reported a large adoption difference between organisations with clear acceptable-use policies and those without them. That makes sense. Clear rules turn AI from a career-risk judgement call into a normal engineering tool.
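One way to keep such a policy short and checkable is to write part of it down as data. The following is a minimal sketch, not a recommended schema: every tool name, repository name, data category, and the contact channel is a hypothetical placeholder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AcceptableUsePolicy:
    approved_tools: frozenset
    forbidden_data: frozenset        # data that must never be shared
    agent_readable_repos: frozenset  # repositories agents may access
    contact: str                     # who to ask when a use case is unclear


@dataclass(frozen=True)
class AgentRequest:
    tool: str
    repo: str
    data_kinds: frozenset


def violations(policy: AcceptableUsePolicy, request: AgentRequest) -> list:
    """Return the reasons a request breaks the policy; empty means allowed."""
    problems = []
    if request.tool not in policy.approved_tools:
        problems.append(f"tool '{request.tool}' is not approved")
    if request.repo not in policy.agent_readable_repos:
        problems.append(f"agents may not access repo '{request.repo}'")
    shared = sorted(request.data_kinds & policy.forbidden_data)
    if shared:
        problems.append(f"forbidden data would be shared: {', '.join(shared)}")
    if problems:
        problems.append(f"if unsure, ask {policy.contact}")
    return problems


# Hypothetical policy values, for illustration only.
POLICY = AcceptableUsePolicy(
    approved_tools=frozenset({"approved-agent"}),
    forbidden_data=frozenset({"secrets", "customer_data", "production_logs"}),
    agent_readable_repos=frozenset({"web-app", "docs"}),
    contact="#ai-policy",
)

request = AgentRequest("approved-agent", "docs", frozenset({"source_code"}))
print(violations(POLICY, request))  # an empty list means the request is allowed
```

The point is not automation for its own sake. A policy that fits in a structure this small is one engineers can actually remember.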
Blocker: Fear And Bad Incentives
If engineers believe AI adoption is mainly a headcount-reduction exercise, they will not engage with it honestly.
They may still use the tools. They may even use them a lot. But they will use them defensively, privately, or in ways that make their own output look impressive without improving the team.
Leaders need to be explicit about the deal.
If the goal is fewer engineers, say so and accept the cultural consequences. If the goal is higher throughput, better quality, lower toil, and faster learning, say that instead and back it up with how work is measured.
This matters because AI changes the shape of engineering labour. Some of the visible typing goes away. More of the value moves into problem framing, code review, test design, integration judgement, and deciding what should not be built.
If management still rewards visible busyness, ticket count, or lines of output, developers will optimise for the wrong thing. AI makes that easier.
Blocker: Treating Adoption As A Tool Rollout
Buying licences is not adoption. Neither is adding an AI assistant to the IDE and waiting for cycle time to fall.
Real adoption needs workflow redesign. Teams need shared answers to practical questions:
- Which tasks should we give to agents?
- What should stay human-led?
- How small should AI-assisted changes be?
- What checks must pass before review?
- What does a good AI-generated test look like?
- How do we review code when the author did not type every line?
- When should an agent work in the background rather than in the IDE?
Without those norms, each engineer invents their own process. That creates variance. Some people get much faster. Some produce large, hard-to-review diffs. Some use AI only for toy tasks. Some stop trusting it after one bad experience.
Managers should resist the urge to make adoption a generic enablement programme.
Pick a few workflows and make them boringly repeatable.
Good starting points are dependency upgrades, test generation around existing behaviour, small bug fixes with strong reproduction steps, documentation updates, codebase exploration, and narrow refactors protected by tests.
Avoid starting with open-ended feature work in poorly understood parts of the system. That is where agents produce the most plausible nonsense.
Blocker: Review Becomes The New Bottleneck
AI changes review economics.
A developer can now generate a large amount of plausible code quickly. The reviewer still has to understand whether the change is correct, secure, maintainable, observable, and aligned with the architecture.
That work has not disappeared. In many teams it has increased.
This is why “AI saved me four hours” can be true while the team gets slower. The saved time may have been transferred to reviewers, QA, staff engineers, security reviewers, or whoever gets pulled in when generated code fails in a non-obvious way.
The answer is not to review less. The answer is to make AI-assisted changes easier to review:
- keep PRs smaller
- require the author to explain the intent and risk, not just the diff
- separate mechanical refactors from behaviour changes
- include test evidence in the pull request description
- ask agents to produce review notes and known limitations
- reject large generated diffs that cannot be understood quickly
For engineering managers, review load should become a first-class capacity signal. If AI adoption increases review time, that is not developer resistance. It is the delivery system telling you where the bottleneck moved.
Blocker: Weak Feedback Loops
AI works best when being wrong is inexpensive.
That means fast tests, useful static analysis, reliable CI, local development environments that actually work, preview deployments, feature flags, logs, metrics, and rollback paths.
If those basics are weak, AI adoption becomes stressful. Engineers cannot tell quickly whether generated code is good. Reviewers do more manual reasoning. QA gets more late surprises. Production becomes the first trustworthy test environment.
This is where leadership often underinvests.
It is tempting to spend budget on another AI product because it looks like acceleration. But if CI takes 40 minutes, test coverage is patchy, and deployments are still scary, the better AI investment may be boring platform work.
Agents need a tight loop:
- make a small change
- run the relevant checks
- inspect the failure
- fix the issue
- repeat until the change is safe enough for human review
If your toolchain cannot support that loop, model quality will not save you.
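The loop above can be sketched as a small driver. This is an illustration under stated assumptions: `make_change` and `run_checks` are hypothetical callables standing in for whatever agent and check runner a team actually uses, and the attempt limit is arbitrary.

```python
from typing import Callable, Optional, Tuple

# (passed, failure_output); the shape is an assumption for this sketch.
CheckResult = Tuple[bool, str]


def run_tight_loop(
    make_change: Callable[[Optional[str]], None],
    run_checks: Callable[[], CheckResult],
    max_attempts: int = 5,
) -> bool:
    """Change, check, feed the failure back, repeat.

    Returns True once the checks pass, meaning the change is ready for
    human review, or False if the attempt budget runs out first.
    """
    feedback: Optional[str] = None
    for _ in range(max_attempts):
        make_change(feedback)          # small change, informed by last failure
        passed, output = run_checks()  # run only the relevant checks
        if passed:
            return True
        feedback = output              # inspect the failure on the next pass
    return False


# Usage with stand-ins: the first check run fails, the second passes.
outcomes = iter([(False, "test_totals failed"), (True, "")])
ready = run_tight_loop(lambda feedback: None, lambda: next(outcomes))
print(ready)  # True
```

Every pass through the loop costs one run of `run_checks`. If that takes 40 minutes, five attempts take most of a working day, so check speed matters more than model quality here.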
Blocker: No Shared Taste
AI is great at producing code that looks reasonable.
That is useful, but it creates a subtle problem. A team with weak conventions gets more code that is locally plausible and globally inconsistent.
You see this in small things first: slightly different error handling, different test styles, new helper abstractions that almost duplicate existing ones, inconsistent naming, or a new library added where the standard library would have been fine.
Over time, that becomes architectural drift.
The fix is shared taste made explicit:
- documented patterns for common tasks (task.md)
- examples of good tests (test.md)
- clear ownership boundaries (ensure_review_if_path_is.md)
- architecture decision records for important constraints (decisions/*.md)
- linting and formatting that remove style debate
- starter prompts or agent instructions that point at local conventions
This is one reason experienced engineers often get more value from AI than juniors. They have stronger internal filters. They can see when generated code violates the system’s shape.
Managers should not respond by keeping AI away from junior engineers. They should pair juniors with stronger review loops, better examples, and smaller tasks. Otherwise, the team loses a learning opportunity.

Enabler: Make Learning Part Of The Job
Expecting engineers to learn AI tools in their own time is a weak adoption strategy.
The tools are changing quickly. The useful workflows are not obvious from a product tour. Engineers need time to try real tasks, fail safely, compare approaches, and share what worked. If my current AI toolset has just given me a 3x uplift, let me keep 1x of that time for learning future AI tools and still give the company a 2x uplift. That 1x of playtime will hopefully return future improvements and will certainly return learnings.
Put learning into the operating rhythm:
- run weekly demos of real AI-assisted changes, and skip the week when nothing is ready rather than forcing a shallow demo; honesty and authenticity are key
- pair an experienced agent user with a sceptical engineer
- keep a short internal cookbook of proven workflows; these will likely become shared AI Skills, and the more advanced ones may need supporting scripts or binaries
- review failed attempts without embarrassment
- let teams spend time improving agent instructions, tests, and scripts
DORA found that dedicated learning time correlates strongly with higher adoption. That should not be surprising. If the organisation wants new behaviour, it has to make room for practice.
Enabler: Define The First Three Workflows
Do not start with “use AI more”.
Start with three workflows where the risk is bounded and the feedback loop is clear. For example:
- Generate missing tests around existing behaviour before a refactor.
- Use an agent to update dependencies and fix the resulting compiler or test failures.
- Ask an agent to investigate a bug, produce a reproduction, and propose a small fix.
For each workflow, define the input, the expected output, the checks that must pass, and what human review should focus on.
This makes adoption concrete. It also gives managers a way to compare outcomes across teams without reducing everything to seat usage or token spend.
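A workflow definition of this kind can be as small as one record. The sketch below assumes a hypothetical `Workflow` structure and made-up check names; it is one possible shape, not a standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workflow:
    name: str
    task_input: str          # what the agent is given
    expected_output: str     # what a finished result looks like
    required_checks: tuple   # checks that must pass before human review
    review_focus: tuple      # what the human reviewer should concentrate on


def missing_checks(workflow: Workflow, passed: set) -> list:
    """Required checks that have not passed yet, in their declared order."""
    return [check for check in workflow.required_checks if check not in passed]


# Hypothetical definition of the first workflow from the list above.
TESTS_BEFORE_REFACTOR = Workflow(
    name="tests-before-refactor",
    task_input="a module with existing behaviour and thin test coverage",
    expected_output="a pull request that adds tests and changes no production code",
    required_checks=("unit-tests", "lint"),
    review_focus=(
        "do the tests pin real behaviour?",
        "is production code untouched?",
    ),
)

print(missing_checks(TESTS_BEFORE_REFACTOR, {"unit-tests"}))  # ['lint']
```

Writing the four fields down forces the conversation about what "done" means before anyone hands the task to an agent.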
Enabler: Keep Batch Size Small
Small batches are the simplest AI governance mechanism.
They reduce review risk, make CI failures easier to interpret, and keep human ownership clear. They also make it easier to abandon bad generated work before the team becomes invested in it.
Set expectations early:
- one concern per pull request
- generated refactors must be mechanically checkable
- large migrations should be split along clear boundaries
- behaviour changes need tests close to the changed code
- agents should be asked for a plan before broad edits and then explain their edits in the PR description
This is not bureaucracy. It is how you stop faster code generation from becoming slower delivery.
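Some of these expectations can be checked mechanically before a human looks at the change. The following is a rough sketch: the 400-line limit is a hypothetical threshold, and treating the top-level directory as a proxy for "one concern" is a crude assumption that each team would need to tune.

```python
from dataclasses import dataclass

MAX_CHANGED_LINES = 400  # hypothetical threshold, tune per team
MAX_TOP_LEVEL_AREAS = 1  # crude proxy for "one concern per pull request"


@dataclass(frozen=True)
class PullRequest:
    changed_lines: int
    changed_files: tuple
    description: str


def batch_size_warnings(pr: PullRequest) -> list:
    """Return human-readable warnings; an empty list means nothing was flagged."""
    warnings = []
    if pr.changed_lines > MAX_CHANGED_LINES:
        warnings.append(
            f"{pr.changed_lines} changed lines exceeds {MAX_CHANGED_LINES}"
        )
    # The first path segment stands in for an ownership area.
    areas = sorted({path.split("/", 1)[0] for path in pr.changed_files})
    if len(areas) > MAX_TOP_LEVEL_AREAS:
        warnings.append(f"touches several areas: {', '.join(areas)}")
    if not pr.description.strip():
        warnings.append("description is empty: explain the plan and the edits")
    return warnings


pr = PullRequest(
    changed_lines=1200,
    changed_files=("billing/invoice.py", "search/index.py"),
    description="",
)
for warning in batch_size_warnings(pr):
    print(warning)
```

A check like this works best as a prompt for conversation rather than a hard gate, since a large but purely mechanical rename can be perfectly reviewable.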
Enabler: Measure The System, Not The Hype
The wrong metrics will make AI adoption worse.
Lines of code, number of prompts, number of AI-authored commits, and licence usage are weak signals. They may tell you that activity increased. They do not tell you whether delivery improved.
Better measures include:
- cycle time from work start to production
- PR size
- review time
- CI failure rate
- deployment frequency
- rollback rate
- escaped defects
- percentage of work blocked on review or validation
- developer sentiment around trust, toil, and focus
This is where CTOs need to be careful. AI can create a convincing local productivity story while hiding system-level costs. If review, QA, incident response, or architecture clean-up absorbs the cost, the dashboard needs to show that.
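Several of these measures can be derived from timestamps most teams already record. The sketch below assumes a hypothetical `ChangeRecord` shape, and the two sample records are invented for illustration, not taken from any real team.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median


@dataclass(frozen=True)
class ChangeRecord:
    work_started: datetime
    review_requested: datetime
    approved: datetime
    deployed: datetime
    changed_lines: int
    rolled_back: bool


def hours(start: datetime, end: datetime) -> float:
    return (end - start).total_seconds() / 3600


def summarise(records: list) -> dict:
    """System-level measures: cycle time, review time, PR size, rollback rate."""
    return {
        "median_cycle_hours": median(
            hours(r.work_started, r.deployed) for r in records
        ),
        "median_review_hours": median(
            hours(r.review_requested, r.approved) for r in records
        ),
        "median_pr_size": median(r.changed_lines for r in records),
        "rollback_rate": sum(r.rolled_back for r in records) / len(records),
    }


# Invented sample data: one large change stuck in review, one small change.
records = [
    ChangeRecord(
        datetime(2026, 1, 5, 9), datetime(2026, 1, 5, 10),
        datetime(2026, 1, 8, 10), datetime(2026, 1, 8, 12),
        changed_lines=900, rolled_back=True,
    ),
    ChangeRecord(
        datetime(2026, 1, 5, 9), datetime(2026, 1, 5, 13),
        datetime(2026, 1, 5, 15), datetime(2026, 1, 5, 16),
        changed_lines=80, rolled_back=False,
    ),
]

print(summarise(records))
```

In the sample, the large change reached review within an hour and then waited three days. That wait is the cost a local productivity story hides.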

A 30-Day Plan For Engineering Leaders
If you are trying to improve AI adoption now, do not begin with a grand transformation programme.
Start with a narrow operating-model change.
In the first week, write the acceptable-use policy. Keep it short. Make security, legal, and engineering agree on what is allowed. Publish it where engineers actually work.
In the second week, pick two or three teams and define three approved AI workflows. Choose work with fast validation: tests, dependency updates, bug reproduction, documentation, small refactors.
In the third week, instrument the bottlenecks. Look at PR size, review time, CI duration, test failure patterns, and deployment confidence. Ask developers where AI is saving time and where it is creating extra work.
In the fourth week, change the system. Tighten PR size expectations. Improve the slowest validation step. Add missing test commands to agent instructions. Create examples of good AI-assisted pull requests. Share one failed example as well as one successful one.
That month will teach you more than a generic adoption dashboard.
What CTOs And Engineering Managers Should Own
Developers own the code they ship.
Leaders own the conditions that make good adoption possible.
That means budget, policy, incentives, workflow design, platform quality, and measurement. It also means enough hands-on fluency to know the difference between a useful agent workflow and a demo that only works on a clean toy repository.
The teams that get durable velocity from AI will not be the teams with the most enthusiastic slogans. They will be the teams that make AI-assisted work reviewable, testable, observable, and safe to ship.
That is less glamorous than promising every engineer a 10x productivity gain. The realistic gain is likely 2-3x, until you unlock 24x7 agents and can keep them appropriately fed with PRDs.
Starting small and simple is much more likely to work. That is not a new lesson, just an old tradition repeated on recent tech, as often happens.
Sources Worth Reading
- DORA: Impact of Generative AI in Software Development
- Harness: AI Has Outpaced How Engineering Organizations Measure Developer Productivity
- Gartner: Enterprise AI Coding Agents, 2026 Market Guide and Trends
- Agentic Much? Adoption of Coding Agents on GitHub
- AI Tools in Software Development: Developer Perceptions and Usage Patterns