How to Build a GTD Agent — Whether in ChatGPT Workspace Agents or Claude Code

My first GTD agents were not very good.

Actually, from a GTD (Getting Things Done) workflow perspective, some of them were producing complete nonsense.

I had already framed them carefully. I was using my usual Prompt Constitution / Playbook approach: clear role, clear mission, operating rules, tool boundaries, expected outputs, validation steps.

And still, the agent started doing things that looked productive on the surface but were absolutely wrong from a GTD perspective.

The worst example was my calendar.

The agent was processing Outlook emails and, whenever it detected something that looked remotely time-related, it created a calendar event.

Not real meetings.

Pseudo-meetings.

Sometimes partially renamed from the email subject line.

Sometimes duplicated several times.

And if the AI found open space in my calendar, it seemed happy to fill it.

From the outside, this looked like automation.

From a GTD perspective, it was a system failure.

Because an email that mentions a date is not automatically a calendar item.

An email that requires a response is not automatically a task.

An email that refers to a project is not automatically a project.

And an AI agent that can access your tools but does not understand your GTD decision model is not an assistant.

It is a very fast source of trusted-system pollution.

That was the turning point for me.

I realized that building a useful GTD agent does not start with choosing the best model.

It starts with defining the exact workflow the agent is allowed to execute.

Not the dream workflow.
Not “AI handles my inbox.”
Not “AI creates tasks from emails.”

The real question is:

What GTD decision model should the agent follow?

For example, if the agent is helping process an Outlook inbox, its mission should not be:

“Read my emails and create tasks.”

That is far too vague.

The mission should be:

Turn each incoming email into a reliable GTD decision.

That changes everything.

Because in GTD, an email can become many different things:

trash,
archive,
reference,
someday/maybe,
calendar,
next action,
waiting for,
project support,
a trigger for a new project,
or a clarification request.

So the agent needs a strict sequence.

For each email, it should ask:

What is this?
Is it actionable?
If it is not actionable, should it be deleted, archived, stored as reference, or incubated?
If it is actionable, what is the desired outcome?
What is the next visible physical action?
Is this a single action or part of a project?
Is someone else responsible?
Is there a real day-specific or time-specific commitment?
Which system should receive the result: task manager, calendar, reference system, project support, draft email, or clarification queue?
What needs human validation before anything is changed?
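
The sequence above can be sketched as a small decision function. This is a minimal illustration, not a definitive implementation: the names Decision, ClarifiedEmail, and decide are hypothetical, the boolean fields stand in for answers the agent would have to extract from the email, and the order in which project, waiting-for, and calendar are checked is my assumption. The one part taken directly from the sequence is that a missing answer leads to a clarification request rather than a guess.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Decision(Enum):
    TRASH = "trash"
    ARCHIVE = "archive"
    REFERENCE = "reference"
    SOMEDAY_MAYBE = "someday/maybe"
    CALENDAR = "calendar"
    NEXT_ACTION = "next action"
    WAITING_FOR = "waiting for"
    NEW_PROJECT = "trigger for a new project"
    CLARIFICATION = "clarification request"


@dataclass
class ClarifiedEmail:
    """Answers extracted from one email (hypothetical structure).

    None means the answer was not found in the email.
    """
    actionable: Optional[bool] = None
    disposable: bool = False
    keep_as_reference: bool = False
    incubate: bool = False
    desired_outcome: Optional[str] = None
    next_action: Optional[str] = None
    multi_step: bool = False
    owner_is_someone_else: bool = False
    hard_date: Optional[str] = None  # only a real day- or time-specific commitment


def decide(email: ClarifiedEmail) -> Decision:
    # "What is this? Is it actionable?" If unknown, stop and ask.
    if email.actionable is None:
        return Decision.CLARIFICATION
    # Not actionable: delete, incubate, keep as reference, or archive.
    if not email.actionable:
        if email.disposable:
            return Decision.TRASH
        if email.incubate:
            return Decision.SOMEDAY_MAYBE
        if email.keep_as_reference:
            return Decision.REFERENCE
        return Decision.ARCHIVE
    # Actionable: outcome and next action must both be known, never invented.
    if not email.desired_outcome or not email.next_action:
        return Decision.CLARIFICATION
    # The order of the next three checks is an assumption of this sketch.
    if email.multi_step:
        return Decision.NEW_PROJECT
    if email.owner_is_someone_else:
        return Decision.WAITING_FOR
    if email.hard_date:
        return Decision.CALENDAR
    return Decision.NEXT_ACTION
```

Note that the function only returns a proposed decision. It does not touch any tool, and an email that merely mentions a date never reaches the calendar branch unless hard_date was explicitly filled in.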

This is the part I had underestimated.

A GTD agent is not primarily a text generator.

It is a procedural operator.

It must know:

what to do,
in what order,
within which limits,
and when to stop.

This is especially important when building agents in environments such as ChatGPT Workspace Agents or Claude Code.

The temptation is to give the agent more autonomy because the tool is powerful.

But in GTD, more autonomy is not automatically better.

The agent must be autonomous only inside a clear frame.

My current rule is simple:

Dry-run by default.

The agent can read, analyze, classify, propose, draft, and prepare.

But it cannot execute without explicit validation.

It cannot:

send an email,
delete or archive a message,
create a task,
modify a calendar,
create a OneNote page,
move project support material,
or rename something in my system.

Not unless I explicitly approve the proposed decision.
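
As a rough sketch of what "dry-run by default" can mean in code, assuming a hypothetical ProposedAction record and an execute function that stands in for the real tool integration: anything that is not on a short read-only allowlist is treated as a write, and a write happens only when dry-run is switched off and the human has approved that specific action. The action labels are illustrative, not the names of any real API.

```python
from dataclasses import dataclass

# Operations the agent may always perform (labels are hypothetical).
READ_ONLY_ACTIONS = {"read", "analyze", "classify", "propose", "draft", "prepare"}


@dataclass
class ProposedAction:
    kind: str               # e.g. "create_task", "modify_calendar", "send_email"
    target: str             # what the action would touch
    approved: bool = False  # set to True only by the human reviewer


def execute(action: ProposedAction, dry_run: bool = True) -> str:
    """Report what happened; never perform a write without explicit approval."""
    if action.kind in READ_ONLY_ACTIONS:
        return f"READ-ONLY: {action.kind} on {action.target}"
    # Everything else is treated as a write (default deny).
    if dry_run or not action.approved:
        return f"PROPOSED (not executed): {action.kind} on {action.target}"
    # A real integration would call the mail, calendar, or task tool here.
    return f"EXECUTED: {action.kind} on {action.target}"


proposal = ProposedAction("modify_calendar", "Email: budget review")
print(execute(proposal))                 # proposed only
proposal.approved = True
print(execute(proposal, dry_run=False))  # executed after approval
```

Requiring both conditions is a deliberate choice in this sketch: approving one proposal does not switch the whole agent out of dry-run, and switching off dry-run does not approve anything by itself.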

That is not a limitation.

That is what makes the agent trustworthy.

The second rule is even more important:

The agent must never invent missing GTD data.

It must never invent:

a deadline,
a desired outcome,
a next action,
a project,
a context,
a label,
a responsibility,
or a calendar commitment.

If the information is missing, the correct output is not a confident guess.

The correct output is:

Clarification required.
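
One way to enforce this, sketched under assumptions: before a proposal is accepted, check that every field the decision depends on was actually found in the source. The mapping REQUIRED_BY_DECISION and the function check_clarified are hypothetical, and which fields each decision requires is my own illustrative choice.

```python
from typing import Dict, List, Optional

# Fields that must come from the email itself, never from a guess.
# The exact requirements per decision are an assumption of this sketch.
REQUIRED_BY_DECISION: Dict[str, List[str]] = {
    "next action": ["desired_outcome", "next_action"],
    "calendar": ["desired_outcome", "commitment_date"],
    "waiting for": ["desired_outcome", "responsible_person"],
    "project": ["desired_outcome", "next_action", "project"],
}


def check_clarified(decision: str, fields: Dict[str, Optional[str]]) -> str:
    """Return 'Clarified', or name the fields that are still missing."""
    missing = [
        name
        for name in REQUIRED_BY_DECISION.get(decision, [])
        if not (fields.get(name) or "").strip()
    ]
    if missing:
        return "Clarification required: " + ", ".join(missing)
    return "Clarified"


# An email that mentions a review but gives no real date.
print(check_clarified("calendar", {"desired_outcome": "Attend budget review"}))
```

The point of the check is that an empty field stays empty. The agent reports the gap instead of filling it.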

This single rule changed a lot.

A reliable GTD agent should not always answer.

It should know when the item has not yet been clarified enough to be organized.

That distinction matters.

Because the risk with AI is not only that it makes mistakes.

The deeper risk is that it creates beautifully formatted ambiguity inside your trusted system.

A task that looks clear but is not really clarified.
A calendar event that looks legitimate but is not really a hard landscape commitment.
A project that was never consciously accepted.
A next action that is not actually physical or visible.
A waiting-for that does not define who owns the next move.

That is not GTD.

That is automation noise.

So now, when I design a GTD agent, I define several layers before giving it access to tools:

Mission - what exact GTD outcome is the agent responsible for?
Scope - which inboxes, lists, tools, or workflows can it touch?
GTD decision model - which clarification and organization sequence must it follow?
Rule hierarchy - what wins when instructions conflict?
Guardrails - what must the agent never invent, infer, or execute?
Default mode - is it dry-run, semi-automatic, or fully executable?
Output format - what must the agent return for each item so that I can audit the decision?
Validation protocol - at which points must the human approve the proposed action?
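
These layers can be written down as a plain data structure and checked before the agent gets any tool access. The following is a minimal sketch, assuming a hypothetical AgentFrame class. The field names mirror the list above, and the example values are illustrative rather than my actual configuration.

```python
from dataclasses import dataclass, fields, replace
from typing import List

ALLOWED_MODES = {"dry-run", "semi-automatic", "fully executable"}


@dataclass(frozen=True)
class AgentFrame:
    mission: str
    scope: List[str]
    decision_model: List[str]
    rule_hierarchy: List[str]  # highest priority first
    guardrails: List[str]
    default_mode: str
    output_format: List[str]
    validation_protocol: List[str]


def undefined_layers(frame: AgentFrame) -> List[str]:
    """Names of layers that were left empty."""
    return [f.name for f in fields(frame) if not getattr(frame, f.name)]


def may_access_tools(frame: AgentFrame) -> bool:
    """Tool access only when every layer is defined and the mode is known."""
    return not undefined_layers(frame) and frame.default_mode in ALLOWED_MODES


example = AgentFrame(
    mission="Turn each incoming email into a reliable GTD decision.",
    scope=["Outlook inbox"],
    decision_model=["What is this?", "Is it actionable?"],
    rule_hierarchy=["guardrails", "validation protocol", "user instructions"],
    guardrails=["Never invent missing GTD data."],
    default_mode="dry-run",
    output_format=["summary", "GTD decision", "validation needed"],
    validation_protocol=["Approve every proposed write."],
)

incomplete = replace(example, guardrails=[])
print(may_access_tools(example), undefined_layers(incomplete))
```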

For my email-processing agent, the structured output looks something like this:

one-sentence email summary,
GTD decision,
confidence level,
risk level,
desired outcome,
next visible physical action,
project status,
proposed destination,
recommended tool action,
clarification required,
validation needed.
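
As an illustration, the same output can be represented as one record per email, which makes each decision easy to log and audit. This is a sketch, not a fixed schema: the class name EmailDecisionRecord, the field names, the low/medium/high levels, and the sample values are all assumptions.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class EmailDecisionRecord:
    summary: str                            # one-sentence email summary
    gtd_decision: str
    confidence_level: str                   # assumed scale: low / medium / high
    risk_level: str                         # assumed scale: low / medium / high
    desired_outcome: Optional[str]
    next_action: Optional[str]              # next visible physical action
    project_status: Optional[str]
    proposed_destination: Optional[str]
    recommended_tool_action: Optional[str]
    clarification_required: Optional[str]
    validation_needed: bool = True          # dry-run default: a human approves


# Illustrative record: the email mentions a review but gives no date,
# so the missing fields stay None instead of being guessed.
record = EmailDecisionRecord(
    summary="Colleague asks whether I can join a budget review.",
    gtd_decision="clarification request",
    confidence_level="medium",
    risk_level="low",
    desired_outcome=None,
    next_action=None,
    project_status=None,
    proposed_destination="clarification queue",
    recommended_tool_action=None,
    clarification_required="No date or time is stated for the review.",
)

print(json.dumps(asdict(record), indent=2))
```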

This may look heavy.

But for me, this is exactly what makes the system usable.

Because a GTD agent should be auditable.

It should be predictable.

It should be improvable.

And above all, it should protect the integrity of the trusted system.

The real value of AI in GTD is not that it “does everything for me.”

The value is that it can help apply a reliable decision model to a large amount of incoming stuff: faster, more consistently, and with better visibility.

But only if the agent is framed correctly.

Otherwise, it does what many humans do when they are overwhelmed:

It skips clarification and jumps straight into organizing or doing.

That is where the trouble starts.

So my current conclusion is this:

If you want to build a useful GTD agent, do not start with the model.

Start with the workflow.

Do not ask:

“What can this AI do?”

Ask:

“What GTD decision should this agent help me make?”

Then define the rules, the boundaries, the outputs, and the validation points.