Consider the following spectrum:

  • SKILL.md: A bare agent handles both the control flow (which steps to run, and when) and the execution (transforming each step’s input into its output). Everything is powered by inference.
  • LLM Pipeline: A graph of nodes, where each node does some natural language processing (LLM inference) and makes a tool call that moves to one of the next nodes. The orchestration is deterministic, but the path choices are still probabilistic (they come from the LLM), and the execution could go either way depending on whether the tool called is code or another LLM-based step.
  • Code: It’s basically Dagster/Airflow/Prefect. Control flow is deterministic (same input, same path) and execution is deterministic (same tool params, same output, assuming no side effects).

[Figure: Three tiers of agent control. Left (Skill): dithered ink clouds. Middle (LLM Pipeline): a scatter of small solid dots. Right (Code): one solid dot.]
Run the same task many times. Skill drifts across a cloud of outputs; LLM Pipeline lands on one of several distinct outcomes; Code returns the exact same point every run.
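
To make the "run it many times" picture concrete, here is a minimal sketch that simulates the three tiers. Everything in it is an assumption made for illustration: a seeded random number generator stands in for LLM inference, and the names run_skill, run_llm_pipeline, run_code, and distinct_outputs are hypothetical, not part of any real framework.

```python
import random

# Hypothetical fixed graph for the LLM Pipeline tier: the orchestrator (code)
# owns the graph, but the "model" picks which edge to follow.
GRAPH = {
    "start": ["summarize", "escalate"],
    "summarize": ["done"],
    "escalate": ["done"],
}


def run_skill(task: str, rng: random.Random) -> float:
    """Bare agent: the model picks how many steps to run AND what each step yields."""
    value = float(len(task))
    for _ in range(rng.randint(1, 4)):   # probabilistic control flow
        value += rng.gauss(0.0, 1.0)     # probabilistic execution
    return round(value, 3)


def run_llm_pipeline(task: str, rng: random.Random) -> str:
    """Fixed graph, model-chosen path; each node's tool here is plain code."""
    node, path = "start", ["start"]
    while node != "done":
        node = rng.choice(GRAPH[node])   # stand-in for an LLM choosing the next node
        path.append(node)
    return "->".join(path) + f":{len(task)}"


def run_code(task: str, rng: random.Random) -> str:
    """Plain code: same input, same path, same output. The rng is ignored."""
    return f"start->summarize->done:{len(task)}"


def distinct_outputs(runner, task: str = "triage this ticket",
                     runs: int = 200, seed: int = 0) -> int:
    """Run the same task many times and count how many different outputs appear."""
    rng = random.Random(seed)
    return len({runner(task, rng) for _ in range(runs)})


if __name__ == "__main__":
    for name, runner in [("Skill", run_skill),
                         ("LLM Pipeline", run_llm_pipeline),
                         ("Code", run_code)]:
        print(f"{name}: {distinct_outputs(runner)} distinct outputs in 200 runs")
```

In this toy model the Skill runner spreads over many distinct values, the pipeline can only ever land on one of the two paths the graph allows, and the Code runner yields exactly one output. A real agent's spread depends on the model, prompt, and sampling settings, so treat the counts as illustrative only.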

Your goal is always to get as close to code as possible. Here’s why: