AI Coding Agents in an Economics Team

NABE TEC EU 2026

Alex Guglielmone Nemi

2026-05-29

How the team works

Alex Guglielmone Nemi
Economic Decision Science, Amazon — Engineering Lead
Cross-functional team: Economists + Engineers

Goal: economists deliver without engineering bottlenecks
Fast iteration, minimal infrastructure
No tradeoff between exploration and production

Late-2025 quality jump

Source: METR Task Horizons · metr.org/time-horizons

How we leverage Coding Agents

d3 = require("d3@7")

data2 = [
  {category: "Analysis & research design", pct: 31},
  {category: "Data pipelines", pct: 27},
  {category: "Operations", pct: 17},
  {category: "Writing & comms", pct: 13},
  {category: "Troubleshooting", pct: 12}
]

{
  const width = 900;
  const height = 500;

  const color = d3.scaleOrdinal()
    .domain(data2.map(d => d.category))
    .range(["#f28e2b", "#4e79a7", "#e15759", "#76b7b2", "#59a14f"]);

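  // Build a one-level hierarchy; each leaf's area is proportional to its pct.
  const root = d3.hierarchy({children: data2})
    .sum(d => d.pct)
    .sort((a, b) => b.value - a.value);

  // Compute the treemap layout (sets x0, y0, x1, y1 on every node).
  d3.treemap()
    .size([width, height])
    .padding(3)
    .round(true)(root);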

  const svg = d3.create("svg")
    .attr("width", width)
    .attr("height", height)
    .attr("viewBox", [0, 0, width, height])
    .style("display", "block")
    .style("margin", "0 auto");

  const leaf = svg.selectAll("g")
    .data(root.leaves())
    .join("g")
    .attr("transform", d => `translate(${d.x0},${d.y0})`);

  leaf.append("rect")
    .attr("width", d => d.x1 - d.x0)
    .attr("height", d => d.y1 - d.y0)
    .attr("fill", d => color(d.data.category))
    .attr("rx", 4);

  leaf.append("foreignObject")
    .attr("width", d => d.x1 - d.x0)
    .attr("height", d => d.y1 - d.y0)
    .append("xhtml:div")
    .style("width", "100%")
    .style("height", "100%")
    .style("display", "flex")
    .style("flex-direction", "column")
    .style("align-items", "center")
    .style("justify-content", "center")
    .style("padding", "6px")
    .style("box-sizing", "border-box")
    .style("text-align", "center")
    .style("overflow", "hidden")
    .html(d => `
      <span style="font-size:${(d.x1 - d.x0) > 120 ? 26 : 18}px; font-weight:bold; color:#1a1a1a; line-height:1.2;">${d.data.category}</span>
    `);

  return svg.node();
}

1. “Explain this model to me”

“Explain this model to me”
Agent explains methodology step by step
Economist spots: “that’s counting the wrong window”
- (or: wrong benchmark scope, double-counting, extensive vs intensive confusion…)
“Fix it” → corrected pipeline in 15 minutes

2. “Help me analyze this policy”

Economist has a policy question, no design yet
Dialogue: what’s the right approach? assumptions? what could go wrong?
Power analysis, clustering, spillovers: methodology worked out through conversation
Result: a complete research design

3. “Help me respond to this challenge”

Leadership / stakeholders challenge methodology in a meeting
Economist uses agent: pull the data, produce the chart, draft the response
Prove it on the spot instead of deferring it
Real data backs the argument, with little effort or time lost

Quick clarification

We also leverage AI-powered economics workflows (synthetic conjoints, model-driven scoring systems, text extraction and harmonization for assortment models)
Coding agents are the focus: the everyday working interface
Broad, open-ended, close to the production process
What I’ll discuss next applies to both

What economists never delegate

1. Methodology selection: Which estimator. Which design.
2. Causal identification: How to isolate the effect.
3. Economic interpretation: Whether results make sense.
4. Stakeholder judgment: Tone, framing, politics.
5. Research taste: What questions matter.

Pain points

Trust: “How do we know it’s right?”
Reliability: “It worked yesterday, not today”
Blast radius: “What happens if it goes rogue?”
Observability: “Can’t explain what happened”
Tool churn: “Everything keeps changing”
Cost: “Fast iteration without runaway spend”
Speed: “Too slow to explore effectively”

Control

The RA (research assistant) analogy

Same qualification. Different agent.

What an exam looks like

name: econ-model-runner
questions:
  - name: runs-scenario
    input: "Run scenario for client ABC, target €100k. Scope: EU5"
    intention: Agent picks the right skill and executes with correct params
    assert:
      - type: tool-called
        value: run-model --client ABC --ask 100000 --scope EU5

  - name: plans-batch-safely
    input: Run all clients under €300k, all markets, 20% above last cycle
    intention: Agent should plan before executing a large ambiguous batch
    assert:
      - type: plan-contains
        judge:
          - clarifies 'last cycle' time range
          - identifies source table
          - smoke tests one case before full run
          - estimates total effort
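
To make the `tool-called` assertion concrete, here is a minimal sketch of how one exam question could be graded against a recorded trace of the agent's tool calls. The slides do not specify the grader, so everything here is an assumption for illustration: the `gradeQuestion` and `normalize` helpers, the trace format (a list of command strings), and the matching rule (exact match after collapsing whitespace). The `plan-contains` type would need an LLM judge and is only reported as skipped.

```javascript
// Hypothetical in-memory copy of the `runs-scenario` question from the exam.
const question = {
  name: "runs-scenario",
  assert: [
    { type: "tool-called", value: "run-model --client ABC --ask 100000 --scope EU5" },
  ],
};

// Collapse runs of whitespace so formatting differences do not fail the check.
function normalize(command) {
  return command.trim().split(/\s+/).join(" ");
}

// Grade one question against the tool calls the agent actually made.
// Assumption: only `tool-called` is checked here; other types are skipped.
function gradeQuestion(question, toolCalls) {
  const seen = new Set(toolCalls.map(normalize));
  return question.assert.map((a) => {
    if (a.type !== "tool-called") {
      return { type: a.type, status: "skipped" };
    }
    const status = seen.has(normalize(a.value)) ? "pass" : "fail";
    return { type: a.type, status };
  });
}

// Hypothetical traces: one matching call, one with the wrong scope.
const goodTrace = ["run-model  --client ABC --ask 100000 --scope EU5"];
const badTrace = ["run-model --client ABC --ask 100000 --scope US"];

console.log(gradeQuestion(question, goodTrace));
console.log(gradeQuestion(question, badTrace));
```

Exact string matching is deliberately strict: a call with the same flags in a different order would fail. A real grader might parse the flags into key/value pairs before comparing.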

The feedback loop

Operational realities

Slow tools kill fast iteration
Context management is still a challenge
Security boundaries must live outside the LLM
Be a good boss: delegate, but own outcomes
The agent needs to work where the economist works
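
One way to read "security boundaries must live outside the LLM" is that every command the agent proposes passes through ordinary code before it runs, so the check holds regardless of what the model was prompted to do. The sketch below assumes a hypothetical `authorize` function and a hypothetical `POLICY` object; the command name and scope reuse the values from the exam example, and the real boundary could just as well be enforced by permissions or sandboxing.

```javascript
// Hypothetical policy, kept in code or config, never in the prompt.
const POLICY = {
  allowedCommands: new Set(["run-model"]),
  allowedScopes: new Set(["EU5"]),
};

// Decide whether an agent-proposed command line may be executed.
// Deny by default: unknown commands and missing or unknown scopes are rejected.
function authorize(commandLine, policy = POLICY) {
  const [command, ...args] = commandLine.trim().split(/\s+/);
  if (!policy.allowedCommands.has(command)) {
    return { allowed: false, reason: `command not allowed: ${command}` };
  }
  const scopeIndex = args.indexOf("--scope");
  const scope = scopeIndex >= 0 ? args[scopeIndex + 1] : undefined;
  if (scope === undefined || !policy.allowedScopes.has(scope)) {
    return { allowed: false, reason: `scope not allowed: ${scope}` };
  }
  return { allowed: true, reason: "ok" };
}

console.log(authorize("run-model --client ABC --ask 100000 --scope EU5"));
console.log(authorize("rm -rf /"));
```

The point of the design is that the allowlist is evaluated by deterministic code the agent cannot rewrite, so a confused or manipulated model can at worst propose a command that gets rejected.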

Takeaways

Sketch the workflow early. Try the UX.
Break long tasks into testable blocks
Use code for stable steps; use LLMs for reasoning and tool calling
Use Exam First Iteration to keep control

Questions?

https://linktr.ee/alex.guglielmone.nemi

Writings · GitHub · LinkedIn