Columbia University: Beyond the Cursor

The talk

Invited presentation at Columbia University’s CS Department. The core argument: the significant shift in AI-assisted development isn’t a better IDE — it’s when the model leaves the editor and starts executing in the shell.

I traced three generations of AI in engineering:

1. Autocomplete (IntelliSense, Tabnine): syntax speed, no structural understanding.
2. Reasoning (ChatGPT, Gemini): understanding intent behind code, but still waiting for you to act.
3. Agency (Codex, Claude Code, Gemini CLI): the model drives shell primitives directly. The cursor disappears.

What I covered

The execution loop. Gen 3 agents don’t just suggest — they plan, act, observe terminal output, and course-correct. The goal isn’t generating code; it’s returning a repository to a stable state after the perturbation of a new requirement.
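To make the loop concrete, here is a minimal sketch of the observe, plan, act cycle, assuming "stable state" means a check command (tests, a build, a linter) exits with code 0. The names `execution_loop`, `propose_fix`, and `max_steps` are hypothetical and do not come from any of the tools named above; in a real agent, `propose_fix` would be the model call.

```python
import subprocess
import sys
from typing import Callable, List, Optional, Tuple

Command = List[str]


def run(cmd: Command) -> Tuple[int, str]:
    """Run a command and return (exit code, combined stdout and stderr)."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode, proc.stdout + proc.stderr


def execution_loop(
    check: Command,
    propose_fix: Callable[[str], Optional[Command]],  # hypothetical stand-in for the model
    max_steps: int = 5,
) -> bool:
    """Repeat observe -> plan -> act until `check` passes or the budget runs out."""
    for _ in range(max_steps):
        code, output = run(check)        # observe: read the terminal output
        if code == 0:
            return True                  # repository is back in a stable state
        action = propose_fix(output)     # plan: choose the next command from the output
        if action is None:
            return False                 # the planner gave up; stop instead of guessing
        run(action)                      # act: execute in the shell
    # Budget exhausted: report whether the last action restored stability.
    return run(check)[0] == 0
```

The step budget is the important design choice in this sketch: an agent that never converges should stop and report failure instead of looping indefinitely.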

The Google stack. How AI Studio, Gemini CLI, and Jetski work together in practice: from prompt tuning, to tool use over MCP (Model Context Protocol), to terminal-native execution within the google3 monorepo.

Human-in-the-loop friction. Without checkpoints, agents hallucinate libraries and create circular dependencies (“agentic drift”). Managed friction — deliberate pause points where a human verifies before the agent continues — is what makes autonomous execution reliable.
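One way to picture managed friction is as a gate in front of each risky step. The sketch below is an illustration under assumptions, not how any particular agent implements it: `Step`, `run_with_checkpoints`, and the `risky` flag are hypothetical names, and `approve` stands in for whatever prompts the human (an interactive confirmation, a review queue, and so on).

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    description: str
    command: List[str]
    risky: bool  # hypothetical flag: True when a human must verify first


def run_with_checkpoints(
    steps: List[Step],
    approve: Callable[[Step], bool],   # asks the human; returns True to continue
    execute: Callable[[Step], None],   # actually performs the step
) -> List[str]:
    """Execute steps in order, pausing for approval before each risky one."""
    executed: List[str] = []
    for step in steps:
        if step.risky and not approve(step):
            break                      # human declined: halt before drift compounds
        execute(step)
        executed.append(step.description)
    return executed
```

A declined checkpoint halts the whole plan in this sketch instead of skipping one step, on the assumption that later steps depend on the rejected one.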

Live demo. I demonstrated a multi-agent system (organizational_agent_swarm) resolving a conflict: I asked it to create a script that bypasses its own safety checks. The Verifier agent identified the conflict with established rules, initiated a debate, and escalated to a human decision. The point: good agentic systems know when to stop and ask.
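The escalation behavior can be sketched in a few lines. This is not the code from organizational_agent_swarm; it is a simplified illustration that assumes rules are plain phrase lists, omits the debate step, and uses the hypothetical names `Rule`, `Verdict`, and `verify`.

```python
from enum import Enum
from typing import List, NamedTuple, Tuple


class Verdict(Enum):
    ALLOW = "allow"
    ESCALATE = "escalate"  # hand the decision to a human


class Rule(NamedTuple):
    name: str
    forbidden_phrases: List[str]  # hypothetical: real rules would be richer than phrases


def verify(request: str, rules: List[Rule]) -> Tuple[Verdict, List[str]]:
    """Return a verdict plus the names of any rules the request conflicts with."""
    text = request.lower()
    violated = [
        rule.name
        for rule in rules
        if any(phrase in text for phrase in rule.forbidden_phrases)
    ]
    # Any conflict is escalated rather than silently allowed or refused.
    return (Verdict.ESCALATE if violated else Verdict.ALLOW), violated
```

For example, with a rule such as `Rule("no-safety-bypass", ["bypass", "disable safety"])`, a request to write a script that bypasses safety checks comes back as `Verdict.ESCALATE` along with the rule's name, so the human can see why the agent stopped.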

Slides | organizational_agent_swarm repo
