Claude Code Headless Mode: CI/CD Guide
Interactive Claude Code puts a developer in conversation with the model. Headless mode removes the developer from that loop.
The model receives input, produces output, and exits: no prompts, no confirmations, no interactive session. If you are not yet comfortable with interactive Claude Code workflows, the Claude Code Course is the right starting point before moving to headless automation.
This behavioral difference is what makes headless mode suitable for CI/CD pipelines. A pipeline step cannot pause to wait for a human to press Enter.
It needs a tool that runs, produces parseable output, and finishes. Headless mode is that tool.
What Headless Mode Is
Headless mode is invoked primarily through the --print flag. When this flag is present, Claude Code reads input (from stdin, a file argument, or inline text), produces output, prints it to stdout, and exits with a status code.
Three flags combine to make headless mode fully automated:
- --print (or -p). Runs non-interactively, prints output, and exits. Required for all CI/CD use.
- --output-format json. Returns structured JSON that downstream pipeline steps can parse. The envelope includes the model’s response, any tool invocations, and session metadata.
- --yes (or -y). Auto-approves tool use decisions that would otherwise require human confirmation. Necessary for fully automated runs where no human is available to approve file edits or bash commands.
Together, these flags transform Claude Code from an interactive assistant into an automation component. The GitHub Actions integration guide shows how those flags translate into workflow YAML using the official claude-code-action.
Headless mode is not a reduced version of Claude Code. The same model, the same tools, and the same capabilities are available. The difference is purely in how the session is initiated and how output is delivered.
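As a rough illustration of how a pipeline script might assemble such an invocation, here is a minimal Python sketch. The helper names headless_argv and run_headless are hypothetical. The permission flags are passed through exactly as the caller supplies them (for example the auto-approve flag this guide calls --yes), so confirm the flag names against `claude --help` for your installed version before relying on them.

```python
import subprocess
from typing import Callable, Sequence


def headless_argv(
    prompt: str,
    json_output: bool = True,
    permission_flags: Sequence[str] = (),
) -> list[str]:
    """Build the argument list for one non-interactive Claude Code run."""
    argv = ["claude", "--print"]
    if json_output:
        argv += ["--output-format", "json"]
    # Permission flags are passed through untouched, e.g. ["--yes"] as this
    # guide names the auto-approve flag. This is an assumption to verify
    # against your CLI version's help output.
    argv += list(permission_flags)
    argv.append(prompt)
    return argv


def run_headless(argv: Sequence[str], runner: Callable = subprocess.run) -> tuple[int, str]:
    """Run the command and return (exit_code, stdout)."""
    completed = runner(list(argv), capture_output=True, text=True)
    return completed.returncode, completed.stdout
```

Passing the command as a list avoids shell quoting problems when the prompt contains quotes or newlines. The runner parameter exists only so the function can be exercised without the CLI installed.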
How Headless Differs From Interactive Mode
Understanding the practical differences prevents common misconfigurations. The most important: interactive mode allows the model to ask clarifying questions when the prompt is ambiguous.
Headless mode cannot: an ambiguous prompt produces a best-guess response, not a clarification request. Write headless prompts to be unambiguous.
| Aspect | Interactive Mode | Headless Mode |
|---|---|---|
| Session initiation | Developer opens a terminal session | Pipeline step invokes with flags |
| Input method | Conversational back-and-forth | Single prompt, stdin, or file argument |
| Tool approval | Human confirms each tool use | --yes flag auto-approves |
| Output delivery | Streamed to terminal | Written to stdout, optionally as JSON |
| Session persistence | Maintains context across turns | Single-turn by default |
| Error handling | Model asks for clarification | Returns error in output, exits with a non-zero status code |
| Cost per interaction | Variable, based on conversation length | Predictable, based on prompt + output |
| Suitable for automation | No | Yes |
State the input format, the expected output format, and any constraints explicitly. Do not rely on the model inferring intent from context that is not in the prompt.
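One way to enforce that discipline is to build every headless prompt from the same explicit fields. The sketch below is illustrative only: build_prompt and its field labels are hypothetical, not a format Claude Code requires.

```python
def build_prompt(
    task: str,
    input_format: str,
    output_format: str,
    constraints: list[str],
) -> str:
    """Assemble a prompt that states input, output, and constraints explicitly."""
    lines = [
        f"Task: {task}",
        f"Input format: {input_format}",
        f"Output format: {output_format}",
        "Constraints:",
    ]
    # One bullet per constraint so none is buried in a long sentence.
    lines += [f"- {c}" for c in constraints]
    return "\n".join(lines)
```

Because every field is a required argument, a pipeline author cannot leave the output format or the constraints unstated by accident.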
5 Headless Use Cases in Production
Use Case 1: Batch Refactor
A codebase-wide refactor (updating deprecated API calls, renaming a module, standardizing error-handling patterns) can be batched and run overnight. Claude Code receives each file, applies the transformation, and writes the result back.
A human reviews the diff in a PR the next morning.
This use case requires the bash tool enabled and the --yes flag. Scope it carefully: run on a copy of the branch, not directly on main.
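The scoping advice can be made mechanical with a guard that refuses to plan a batch on a protected branch. This is a sketch under assumptions: the branch names in PROTECTED_BRANCHES and the helpers current_branch and plan_batch are hypothetical, and the per-file prompt layout is only one possible choice.

```python
import subprocess
from typing import Callable

PROTECTED_BRANCHES = {"main", "master"}  # assumption: adjust to your repository


def current_branch(runner: Callable = subprocess.run) -> str:
    """Return the name of the currently checked-out git branch."""
    out = runner(
        ["git", "rev-parse", "--abbrev-ref", "HEAD"],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.strip()


def plan_batch(files: list[str], branch: str, instruction: str) -> list[tuple[str, str]]:
    """Return (path, prompt) pairs, refusing to plan work on a protected branch."""
    if branch in PROTECTED_BRANCHES:
        raise RuntimeError(f"refusing to batch-refactor directly on {branch!r}")
    # One prompt per file keeps each headless invocation small and reviewable.
    return [(path, f"{instruction}\nFile: {path}") for path in files]
```

A nightly job would call plan_batch(files, current_branch(), instruction) and then run one headless invocation per returned pair.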
Use Case 2: Automated PR Review
On every pull request event, Claude Code reads the diff and produces structured feedback. The feedback is posted as a PR comment.
The human reviewer sees the automated notes before beginning their own review.
This is the most widely deployed headless use case. The setup is covered in detail in our CI/CD pipeline integration guide.
The prompt quality determines the value. A generic prompt produces generic output.
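For the posting step, one option is the GitHub CLI. The sketch below assumes `gh` is installed and authenticated in the pipeline environment. The helper names and the MARKER string are hypothetical. It only formats the comment and builds the command, leaving execution to the pipeline.

```python
MARKER = "<!-- automated-review -->"  # hypothetical tag for finding earlier bot comments


def format_comment(feedback: str) -> str:
    """Wrap model feedback in a small header so reviewers know its origin."""
    return f"{MARKER}\n### Automated review notes\n\n{feedback.strip()}\n"


def pr_comment_argv(pr_number: int, body_file: str) -> list[str]:
    """Build a `gh pr comment` command that reads the comment body from a file."""
    # Reading the body from a file avoids shell-escaping multi-line feedback.
    return ["gh", "pr", "comment", str(pr_number), "--body-file", body_file]
```

Writing the formatted comment to a temporary file and passing its path keeps long, multi-line feedback out of the command line.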
Use Case 3: Nightly Test Generation
A scheduled job runs nightly, passing newly added or modified functions to Claude Code with a prompt to generate unit test cases. The generated tests are committed to a feature branch and a PR is opened for developer review the next morning.
For teams that want to run multiple nightly jobs concurrently (test generation on one set of files while documentation updates run on another), the patterns covered in the parallel agents guide apply here too.
This use case requires careful scoping:
- Pass only the new or modified code, not the entire codebase
- Include the existing test file as context so Claude Code follows established test patterns rather than inventing new ones
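Both scoping rules can be applied before Claude Code is ever invoked. The following sketch assumes a Python project whose tests live in tests/test_<module>.py. That layout, and the helper names, are assumptions to adapt to your repository.

```python
from pathlib import Path
from typing import Callable, Optional


def changed_files_argv(base_ref: str) -> list[str]:
    """Build a git command listing Python files added or modified since base_ref."""
    return ["git", "diff", "--name-only", "--diff-filter=AM", f"{base_ref}...HEAD", "--", "*.py"]


def pair_with_tests(
    changed: list[str],
    repo_root: Path,
    tests_dir: str = "tests",
    exists: Callable[[Path], bool] = Path.exists,
) -> list[tuple[str, Optional[str]]]:
    """Pair each changed file with its existing test file, or None if there is none."""
    pairs = []
    for rel in changed:
        candidate = Path(tests_dir) / f"test_{Path(rel).name}"
        found = exists(repo_root / candidate)
        pairs.append((rel, candidate.as_posix() if found else None))
    return pairs
```

Files paired with None have no established pattern to follow, so a job might give them a more detailed prompt or leave them for a developer.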
Use Case 4: Documentation Updates
When functions or APIs change, the corresponding documentation often lags. A headless job triggered on merges to main can compare changed function signatures against their documentation, identify mismatches, and generate updated docstring or README section drafts.
The output is a diff or a set of suggested changes, not an automatic commit. Documentation changes require human review before they reach users.
Use Case 5: Migration Check
Before a major dependency upgrade (a framework major version, a language runtime upgrade), Claude Code scans the codebase for patterns that are incompatible with the new version. It reports which files contain deprecated patterns, what the new pattern should be, and an estimated effort level for each.
This is a read-only analysis use case: no files are written. Pass --disallow-tools bash so the model cannot run shell commands that modify the working tree.
The output is a structured report that the engineering team uses to plan the migration sprint.
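If the prompt asks for the report as a JSON array, a pipeline step can validate it before anyone plans against it. The schema below (file, deprecated_pattern, replacement, effort) is a hypothetical one derived from the report contents described above, not a format Claude Code produces on its own.

```python
import json
from dataclasses import dataclass

EFFORT_LEVELS = ("low", "medium", "high")  # assumption: the prompt requests these values
FIELDS = ("file", "deprecated_pattern", "replacement", "effort")


@dataclass(frozen=True)
class Finding:
    file: str
    deprecated_pattern: str
    replacement: str
    effort: str


def parse_report(raw: str) -> list[Finding]:
    """Parse and validate a migration report; raise ValueError on any malformed entry."""
    items = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(items, list):
        raise ValueError("report must be a JSON array")
    findings = []
    for item in items:
        try:
            finding = Finding(**{key: item[key] for key in FIELDS})
        except (KeyError, TypeError) as exc:
            raise ValueError(f"malformed report entry: {item!r}") from exc
        if finding.effort not in EFFORT_LEVELS:
            raise ValueError(f"unknown effort level: {finding.effort!r}")
        findings.append(finding)
    return findings
```

Rejecting the whole report on one malformed entry is a deliberate choice here: a partially parsed report could understate the migration effort.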
Headless Flag Reference
| Flag | Purpose | When to Use |
|---|---|---|
| --print | Run non-interactively, print output, exit | Always in CI/CD |
| -p | Shorthand for --print | Same as above |
| --output-format json | Format output as structured JSON | When output is parsed by pipeline steps |
| --output-format text | Plain text output | When output is posted as-is (PR comments) |
| --yes / -y | Auto-approve all tool use decisions | Fully automated runs without human in loop |
| --disallow-tools bash | Prevent bash tool execution | Read-only analysis tasks (review, scan) |
| --max-tokens N | Cap output length | Cost control on high-volume runs |
| --model MODEL | Specify model tier | Use lighter tier for simpler tasks |
| --system-prompt FILE | Load system prompt from file | Reuse prompts across many runs |
| --input-file FILE | Read input from file | Pass large code blocks without shell escaping |
Two production combinations used most often:
- Write-enabled tasks (batch refactor, test generation): --print --output-format json --yes
- Read-only analysis (PR review, security scan, migration check): --print --output-format json --disallow-tools bash
Common Questions on Headless Mode
Can headless mode run multiple tasks in the same invocation?
A single headless invocation handles one prompt. For multiple tasks, invoke Claude Code multiple times, each with its own prompt.
Pipeline orchestration (GitHub Actions steps, bash scripts, Makefile targets) handles the sequencing. Do not try to encode multiple distinct tasks in a single verbose prompt; the output quality degrades.
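When the orchestrator is a script rather than workflow YAML, the one-prompt-per-invocation rule might look like the sketch below. run_tasks is a hypothetical helper, and stopping at the first failure is one possible policy, not the only reasonable one.

```python
import subprocess
from typing import Callable


def run_tasks(prompts: dict[str, str], runner: Callable = subprocess.run) -> dict[str, int]:
    """Run one headless invocation per task, in order, and record each exit code."""
    exit_codes: dict[str, int] = {}
    for name, prompt in prompts.items():
        completed = runner(["claude", "--print", prompt], capture_output=True, text=True)
        exit_codes[name] = completed.returncode
        if completed.returncode != 0:
            break  # later tasks are skipped once one fails
    return exit_codes
```

Tasks that never ran are absent from the returned mapping, which lets the caller tell "failed" apart from "skipped".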
How do we handle failures in headless mode?
Claude Code exits with a non-zero status code when it encounters an error it cannot recover from. Check the exit code in your pipeline step and handle failures explicitly: either fail the pipeline step, skip and continue, or post a notification.
An unhandled failure that silently passes to the next step is harder to debug than a loud, explicit failure. If the step is meant to be non-blocking, set continue-on-error: true in GitHub Actions, and still log or report the failure so it does not go unnoticed.
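A minimal sketch of explicit failure handling, assuming the run used --output-format json: check the exit code first, then confirm stdout really is a JSON object before any later step reads it. HeadlessError and check_result are hypothetical names, and no particular envelope field names are assumed.

```python
import json


class HeadlessError(Exception):
    """Raised when a headless run fails or returns unusable output."""


def check_result(exit_code: int, stdout: str) -> dict:
    """Return the parsed JSON envelope, or raise HeadlessError with a clear reason."""
    if exit_code != 0:
        raise HeadlessError(f"claude exited with status {exit_code}")
    try:
        envelope = json.loads(stdout)
    except json.JSONDecodeError as exc:
        raise HeadlessError("stdout was not valid JSON") from exc
    if not isinstance(envelope, dict):
        raise HeadlessError("expected a JSON object at the top level")
    return envelope
```

A pipeline script can catch HeadlessError and then decide, in one place, whether to exit non-zero, skip the step, or send a notification.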
Is there a way to pass context files to headless mode without putting them in the prompt?
Yes. Claude Code reads the CLAUDE.md file in the current directory automatically. Place context that applies across all headless runs in a CLAUDE.md file at the repository root.
For task-specific context, use --input-file or pipe additional files through stdin concatenated with the main prompt. Use --system-prompt FILE to load reusable system prompts from a file rather than inlining them.
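The stdin route can be sketched as a small function that joins the prompt and the context files into one input string. The "--- name ---" separator is an arbitrary convention chosen for this example, and compose_input is a hypothetical helper.

```python
def compose_input(prompt: str, context: dict[str, str]) -> str:
    """Join the main prompt and named context blobs into a single stdin payload."""
    sections = [prompt.strip()]
    for name, text in context.items():
        # Label each blob so the model can tell the files apart.
        sections.append(f"--- {name} ---\n{text.rstrip()}")
    return "\n\n".join(sections) + "\n"
```

The result could then be passed as the input argument of subprocess.run(["claude", "--print"], input=payload, text=True), assuming your CLI version reads the prompt from stdin when none is given on the command line.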
What model tier should we use for headless CI/CD tasks?
Match the tier to the task complexity using the --model flag. Simple, structured tasks (changelog generation, basic style checks) work well with lighter model tiers at lower cost.
Complex analysis (security scanning, architectural review) benefits from the full model. Using the full model for every automated run is the most common source of unexpected CI/CD API costs.
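To keep tier selection out of individual workflow files, a lookup table is one option. The model identifiers below are placeholders, not real model names. Substitute the identifiers your account and CLI version accept.

```python
LIGHT = "light-tier-model"  # placeholder, not a real model identifier
FULL = "full-tier-model"    # placeholder, not a real model identifier

MODEL_BY_TASK = {
    "changelog": LIGHT,
    "style_check": LIGHT,
    "security_scan": FULL,
    "architecture_review": FULL,
}


def model_flags(task_kind: str) -> list[str]:
    """Return the --model flag pair for a task kind, defaulting to the lighter tier."""
    return ["--model", MODEL_BY_TASK.get(task_kind, LIGHT)]
```

Defaulting unknown task kinds to the lighter tier means a newly added job cannot use the full model without someone choosing that on purpose.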
Headless Mode as Infrastructure
The shift from using Claude Code as a developer tool to using it as an infrastructure component changes how you think about it. Prompts become specifications.
Outputs become pipeline inputs. Error handling becomes a deployment concern.
Teams that make this shift successfully treat headless Claude Code the way they treat any other external service in their pipeline: with explicit timeouts, fallback behavior, output validation, and cost monitoring. The same operational discipline that applies to any third-party API call applies here.
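Two of those practices, explicit timeouts and fallback behavior, fit in a few lines. This is a sketch under assumptions: run_with_timeout is a hypothetical helper, and treating empty output as a failure is a policy choice for this example.

```python
import subprocess
from typing import Callable, Sequence


def run_with_timeout(
    argv: Sequence[str],
    timeout_s: float,
    fallback: str,
    runner: Callable = subprocess.run,
) -> str:
    """Return stdout from the run, or the fallback on timeout, error, or empty output."""
    try:
        completed = runner(list(argv), capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return fallback
    if completed.returncode != 0 or not completed.stdout.strip():
        return fallback
    return completed.stdout
```

A fallback such as "Automated review unavailable for this run" keeps the pipeline moving while making the gap visible to whoever reads the output.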
Path one: implement it yourself. Start with one use case (automated PR review is the lowest-risk starting point), write a specific prompt, and run it on a non-critical repository for two weeks. Measure whether the output is useful and adjust the prompt. Add additional use cases once the first is stable.
Path two: work with Phos AI Labs. If you want the headless integration designed for your workflow, prompts calibrated to your codebase, and the cost model validated before broad rollout, that is work we do with development teams. Start the conversation here.