Columbia University: Beyond the Cursor
The talk
Invited presentation at Columbia University’s CS Department. The core argument: the significant shift in AI-assisted development isn’t a better IDE — it’s when the model leaves the editor and starts executing in the shell.
I traced three generations of AI in engineering:
- Autocomplete (IntelliSense, Tabnine) — syntax speed, no structural understanding.
- Reasoning (ChatGPT, Gemini) — understanding intent behind code, but still waiting for you to act.
- Agency (Codex, Claude Code, Gemini CLI) — the model drives shell primitives directly. The cursor disappears.
What I covered
The execution loop. Third-generation (agency) tools don’t just suggest: they plan, act, observe terminal output, and course-correct. The goal isn’t generating code; it’s returning a repository to a stable state after the perturbation of a new requirement.
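A minimal sketch of that plan, act, observe, course-correct cycle, to make the control flow concrete. It is not how any particular agent is implemented. The names `execution_loop`, `propose_fix`, and `max_steps` are hypothetical, and "stable state" is approximated here as "the check command exits with code 0".

```python
import subprocess
from typing import Callable, List, Optional

Command = List[str]


def run(cmd: Command) -> subprocess.CompletedProcess:
    """Act: run one command and capture its output for the next observation."""
    return subprocess.run(cmd, capture_output=True, text=True)


def execution_loop(
    check: Command,
    propose_fix: Callable[[str, int], Optional[Command]],
    max_steps: int = 5,
) -> bool:
    """Return True once `check` exits 0 (our stand-in for a stable repository)."""
    for step in range(max_steps):
        # Observe: run the check and read its exit code and output.
        result = run(check)
        if result.returncode == 0:
            return True
        # Plan: hand the terminal output to the (hypothetical) planner.
        fix = propose_fix(result.stdout + result.stderr, step)
        if fix is None:
            return False  # the planner has no further idea; stop here
        # Act: apply the proposed command, then loop back to re-observe.
        run(fix)
    # Budget exhausted: one last observation decides the outcome.
    return run(check).returncode == 0
```

In practice `check` might be a test command such as `["pytest", "-q"]`, and `propose_fix` would wrap a model call that turns the failure output into the next command. Both are assumptions for illustration. The step budget matters: without it, a planner that keeps proposing ineffective fixes would loop forever.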
The Google stack. How AI Studio, Gemini CLI, and Jetski work together in practice — from prompt tuning to MCP-based tool use to terminal-native execution within the google3 monorepo.
Human-in-the-loop friction. Without checkpoints, agents can hallucinate libraries and create circular dependencies (“agentic drift”). Managed friction, meaning deliberate pause points where a human verifies the work before the agent continues, is what makes autonomous execution reliable.
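One way to picture managed friction is a plan in which some steps are flagged as needing sign-off. The sketch below is an assumption-laden illustration, not the mechanism used by any of the tools named above: `Step`, `risky`, `approve`, and `run_with_checkpoints` are all hypothetical names, and which steps count as risky is left to whoever builds the plan.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Step:
    description: str
    risky: bool = False  # hypothetical flag: True means "pause for a human"


def run_with_checkpoints(
    steps: Iterable[Step],
    approve: Callable[[Step], bool],
    execute: Callable[[Step], None],
) -> List[str]:
    """Execute steps in order, halting at the first risky step a human rejects."""
    log: List[str] = []
    for step in steps:
        # Checkpoint: risky steps run only after explicit approval.
        if step.risky and not approve(step):
            log.append(f"halted: {step.description}")
            break
        execute(step)
        log.append(f"done: {step.description}")
    return log
```

Here `approve` could be as simple as a terminal prompt that returns `True` on "y". Halting the whole plan on a rejection, instead of skipping the step, is a deliberate choice: later steps may depend on the rejected one, so continuing would invite exactly the drift the checkpoint is meant to prevent.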
Live demo. I demonstrated a multi-agent system (organizational_agent_swarm) resolving a conflict: I asked it to create a script that bypasses its own safety checks. The Verifier agent identified the conflict with established rules, initiated a debate, and escalated to a human decision. The point: good agentic systems know when to stop and ask.