Beyond Vibe Coding: Andrej Karpathy on the Rise of Agentic Engineering

In a recent interview, Andrej Karpathy, former director of AI at Tesla and a founding member of OpenAI, laid out a vision that reframes how we think about AI-assisted software development. He coined the term “vibe coding” earlier this year to describe a relaxed state where developers describe what they want in natural language, let an LLM generate the code, and then accept or reject it with minimal manual intervention. But Karpathy’s latest message is clear: vibe coding is merely the opening act. The real transformation lies in what he calls “agentic engineering” — the systematic design of AI agents that can autonomously plan, execute, and debug complex software tasks.

“Vibe coding is fun and lowers the barrier, but it’s not scalable for serious software. Agentic engineering is about building systems that can take a high-level specification and independently manage the entire lifecycle of a feature.” — Andrej Karpathy, Lex Fridman Podcast, March 2025

Karpathy’s framing is important because it separates two distinct phases of AI-augmented programming. The first phase, vibe coding, is already widespread. Tools like GitHub Copilot, Cursor, and Claude Code allow developers to type a prompt like “create a React component that fetches user data and displays it in a table” and get a usable block of code in seconds. According to a 2025 survey by Stack Overflow, 68% of professional developers now use some form of AI pair programming daily, up from 44% in 2023. The immediate productivity boost is undeniable: a study by GitHub in late 2024 found that developers using Copilot completed tasks 55% faster on average, with the largest gains in boilerplate and documentation.

Yet Karpathy argues that vibe coding has a ceiling. It works well for isolated functions, but fails when the task requires orchestrating multiple components, handling edge cases across services, or reasoning about long-term consequences of code changes. A developer who relies solely on vibe coding to build a production system will soon encounter fragmentation: the AI generates correct-looking snippets that don’t integrate cleanly, and fixing the integration requires deep understanding of the full architecture. Vibe coding gives you speed in the small, but it can create chaos in the large.

Enter agentic engineering. Karpathy describes it as a discipline where engineers design agents — not just prompts — that can iteratively explore a codebase, write tests, run them, evaluate failures, and loop back to fix issues without human hand-holding. This is not science fiction. In March 2024, Cognition Labs unveiled Devin, an AI software engineer that can autonomously tackle entire GitHub issues, including setting up environments, debugging, and deploying. In a controlled benchmark, Devin resolved 13.86% of real-world software engineering tasks from the SWE-bench dataset, compared to 4.80% for the previous best AI assistant. While still far from replacing developers, the trajectory is steep.
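The loop Karpathy describes — propose a change, run the tests, inspect the failures, and retry — can be sketched in a few lines. This is a minimal illustration, not any vendor's implementation: the `propose_patch` function is a hypothetical stand-in for an LLM API call, hard-coded here so the example runs deterministically.

```python
# Minimal sketch of an agentic edit-test-fix loop.
# `propose_patch` stands in for a real model call.

def run_tests(namespace):
    """Run a tiny in-process 'test suite' against the candidate code."""
    failures = []
    add = namespace.get("add")
    try:
        if add(2, 3) != 5:
            failures.append("add(2, 3) != 5")
    except Exception as exc:
        failures.append(f"add raised {exc!r}")
    return failures

def propose_patch(previous_code, failures):
    """Hypothetical LLM call: returns the fix a model would converge on
    after reading the failure output."""
    if failures:
        return "def add(a, b):\n    return a + b\n"
    return previous_code

def agent_loop(initial_code, max_iterations=5):
    """Iterate: load the candidate code, test it, feed failures back."""
    code = initial_code
    for iteration in range(1, max_iterations + 1):
        namespace = {}
        exec(code, namespace)          # load the candidate implementation
        failures = run_tests(namespace)
        if not failures:
            return code, iteration     # green build: stop looping
        code = propose_patch(code, failures)
    raise RuntimeError("agent did not converge within the iteration budget")

buggy = "def add(a, b):\n    return a - b\n"   # seed with a deliberate bug
fixed, iterations = agent_loop(buggy)
print(iterations)   # the agent converges on the second pass
```

The essential point is the feedback edge: test failures flow back into the next proposal, which is exactly what distinguishes an agent from a one-shot code generator.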

A more grounded example comes from inside Karpathy’s own projects. He shared on his blog that he used an agentic pipeline to refactor a 50,000-line C++ codebase for a personal deep learning framework. The agent planned the refactor in phases, generated pull requests, ran the unit tests after each change, and reverted any commit that broke the build. The entire process took six hours of autonomous work, with Karpathy reviewing only the final diff. “That would have taken me three days of tedious manual work,” he noted. “And I trust the agent to be more consistent than I am about not missing side effects.”
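The phased refactor-with-rollback workflow described above can be modeled compactly. This sketch uses plain Python functions where a real pipeline would drive git and a C++ test suite; the phase names, the `run_suite` check, and the dictionary "codebase" are all illustrative assumptions.

```python
# Sketch of a phased refactor pipeline: apply each phase, run the suite,
# and revert any phase that breaks the build (a stand-in for `git revert`).

import copy

def run_suite(codebase):
    """Stand-in test suite: the build is 'green' iff every module
    still exposes a 'run' entry point."""
    return all("run" in module for module in codebase.values())

def apply_phases(codebase, phases):
    """Apply phases in order; roll back any phase that breaks the build."""
    applied, reverted = [], []
    for name, transform in phases:
        snapshot = copy.deepcopy(codebase)   # cheap stand-in for a commit
        transform(codebase)
        if run_suite(codebase):
            applied.append(name)
        else:
            codebase.clear()
            codebase.update(snapshot)        # revert the breaking phase
            reverted.append(name)
    return applied, reverted

codebase = {"core": {"run": "..."}, "utils": {"run": "..."}}

def rename_helpers(cb):          # a safe refactor phase
    cb["utils"]["run_v2"] = cb["utils"]["run"]

def drop_entry_point(cb):        # a phase that breaks the build
    del cb["core"]["run"]

applied, reverted = apply_phases(
    codebase,
    [("rename-helpers", rename_helpers), ("drop-entry", drop_entry_point)],
)
print(applied, reverted)   # ['rename-helpers'] ['drop-entry']
```

The design choice worth noting is that each phase is committed or reverted independently, so a single bad transformation cannot poison the phases before it — the property that let Karpathy review only the final diff.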

The shift from vibe coding to agentic engineering has profound implications for how teams structure their work. Some view it as a threat to junior developer roles — if an agent can take a well-scoped ticket and implement it, what will junior developers learn? But Karpathy sees it differently. He argues that agentic engineering actually raises the bar for human engineers: they need to be able to decompose a business problem into precise specifications that an agent can follow, and to judge the quality of agentic output in terms of correctness, maintainability, and security.

“The best human engineers will become architects of agent behavior, not mechanics of code syntax.”

This echoes a broader trend in software engineering. Just as the 2000s saw the rise of DevOps as a practice that automated deployment and infrastructure, the 2020s are giving birth to a new practice — let’s call it AgentOps — that automates the development loop itself. A 2025 report from Gartner predicted that by 2027, 40% of all new application code in large enterprises will be generated by autonomous agents, with humans acting primarily as reviewers and risk managers.

Of course, agentic engineering faces serious challenges. Reliability is the most pressing: models still hallucinate APIs, misunderstand architectural constraints, and write code that passes tests but has subtle logic errors. Karpathy himself cautioned that agentic workflows need “guardrails and circuit breakers” — automated checks that can stop an agent before it spirals into infinite loops or commits dangerous code. Another concern is accountability: if an agent introduces a security vulnerability, who is responsible? The vendor of the model, the engineer who configured the agent, or the company that deployed it? These legal and ethical questions are unresolved.
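The "guardrails and circuit breakers" Karpathy calls for can be as simple as a wrapper that caps how many actions an agent may take and refuses commands matching a denylist. This is a toy sketch under obvious assumptions — the stub agents, denylist entries, and function names are all hypothetical.

```python
# Sketch of two circuit breakers around an agent: a step budget
# (against runaway loops) and a command denylist (against dangerous actions).

class CircuitBreakerTripped(Exception):
    pass

DENYLIST = ("rm -rf", "DROP TABLE", "git push --force")

def guarded_run(agent_step, max_steps=10):
    """Drive an agent step function under a step budget and a command filter."""
    log = []
    for step in range(max_steps):
        action = agent_step(step)
        if action is None:               # agent signals it is finished
            return log
        if any(bad in action for bad in DENYLIST):
            raise CircuitBreakerTripped(f"blocked dangerous action: {action}")
        log.append(action)
    raise CircuitBreakerTripped("step budget exhausted (possible loop)")

def polite_agent(step):
    """A well-behaved stub agent: edits, tests, then stops."""
    plan = ["edit src/model.py", "run pytest", None]
    return plan[step] if step < len(plan) else None

def rogue_agent(step):
    """A stub agent that attempts something destructive on step two."""
    return ["edit config", "rm -rf /"][step]

print(guarded_run(polite_agent))         # ['edit src/model.py', 'run pytest']
try:
    guarded_run(rogue_agent)
except CircuitBreakerTripped as exc:
    print(exc)                           # blocked dangerous action: rm -rf /
```

Real guardrails would be richer — sandboxed execution, resource quotas, human approval gates for irreversible actions — but the pattern is the same: the checks live outside the agent, so a confused model cannot talk its way past them.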

Despite the hurdles, Karpathy’s core thesis stands. Vibe coding has democratized the ability to produce code, but agentic engineering is what will professionalize AI-generated software. It moves the field from “generate and accept” to “specify, delegate, and verify.” For developers, the message is not to fear the change but to invest in skills that agents cannot easily replicate — systems thinking, abstraction design, and the art of writing clear, testable specifications. In a world of agentic engineers, the scarcest resource will not be code, but clarity of intent.

The interview ends with Karpathy offering a practical takeaway: start by experimenting with agentic tools in a sandbox, not in production. Try feeding an agent a well-defined bug report from an open-source project and see how it performs. Then, gradually expand its autonomy. “It’s like teaching a junior developer,” he says. “You don’t give them the keys to the production database on day one. You mentor them. The same logic applies to agents.”

For now, vibe coding remains a useful entry point — a friendly on-ramp for non-programmers and a time-saver for pros. But as Karpathy sees it, the real work of building reliable, autonomous software factories has just begun. The agentic engineering era will require a new breed of developer: one who can think like an architect, trust like a manager, and code only when the agent gets stuck.