
Codex

4 posts with the tag “Codex”

How to Implement Automatic Retry for Agent CLIs Like Claude Code and Codex


The phrase automatic retry looks like a small toggle switch, but once you put it into a real engineering environment, it is nothing like that. Hello everyone, I am HagiCode creator Yu Kun. Today, I do not want to trade in empty talk. I want to talk about how automatic retry for Agent CLIs such as Claude Code and Codex should actually be done, so it can both recover from exceptions and avoid dragging the system into endless repeated execution.

If you have also been working on AI coding lately, you have probably already run into this kind of problem: the task does not fail immediately, but breaks halfway through execution.

In an ordinary HTTP request, that often just means sending it again, maybe with some exponential backoff. But Agent CLIs are different. Tools like Claude Code and Codex usually execute in a streaming manner, pushing output out chunk by chunk. During that process, they may also bind to a thread, session, or resume token. In other words, the question is not simply, “Did this request fail or not?” It becomes:

  • Does the content that was already emitted still count?
  • Can the current context continue running?
  • Should this failure be recovered automatically?
  • If it should be recovered, how long should we wait before retrying, what should we send during the retry, and should we still reuse the original context?

The first time many teams build this part, they instinctively write the most naive version: if an error occurs, try once more. That idea is perfectly natural, but once it reaches a real project, one problem after another starts surfacing.

  • Some errors are clearly temporary failures, yet get treated as final failures
  • Some errors are not worth retrying at all, yet the system replays them over and over
  • Requests with a thread and requests without a thread get treated exactly the same
  • The backoff strategy has no boundary, and background requests overload themselves

While integrating multiple Agent CLIs, HagiCode also stepped into these traps. On the Codex side in particular, the first issue we exposed was that a certain type of reconnect message was not recognized as a retryable terminal state, so the recovery mechanism we already had never got a chance to take effect. To put it plainly, it was not that the system lacked automatic retry. The system simply failed to recognize that this particular failure was worth retrying.

So the core point of this article is very clear: automatic retry is not a button, but a layered design.

The approach shared in this article comes from real practice in our HagiCode project. What HagiCode is trying to do is not just connect one model and call it a day. It is about unifying the streaming messages, tool calls, failure recovery, and session context of multiple Agent CLIs into one execution model that can be maintained over the long term.

One of the things I care about most is how to make AI coding truly land in real engineering work. Writing a demo is not hard. The hard part is turning that demo into something a team is genuinely willing to use for a long time. HagiCode takes automatic retry seriously not because the feature looks sophisticated, but because if long-running, streaming, resumable CLI execution is not stable, what users see is not an intelligent assistant, but a command wrapper that drops the connection halfway through every other run.

If you want to look at the project entry points first, here are two:

Taking it one step further, HagiCode is also on Steam now. If you use Steam, feel free to add it to your wishlist first:

Why Automatic Retry for Agent CLIs Is Harder Than Ordinary Retry


This is a very practical question, so let us go straight to the conclusion: the difficulty of automatic retry for Agent CLIs is not “try again after a few seconds,” but “can it still continue in the original context?”

You can think of it as a long conversation. Ordinary API retry is more like redialing when the phone line is busy. Agent CLI retry is more like the signal dropping while the other party is halfway through a sentence, and then you have to decide whether to call back, whether to start over when you do, and whether the other party still remembers where the conversation stopped. These are not the same kind of engineering problem at all.

More concretely, there are four especially typical difficulties.

Once output has already been sent to the user, you can no longer handle failure the way you would with an ordinary request, where you silently swallow it and quietly try again. That is because the earlier content has already been seen. If the replay strategy is wrong, the frontend can easily show duplicated text and inconsistent state, and the lifecycle of tool calls can become tangled as well. This is not metaphysics. It is engineering.

Providers like Codex bind to a thread, and implementations like Claude Code also have a continuation target or an equivalent resumable context. The real prerequisite for automatic retry is not just that the error looks like a temporary failure, but also that there is still a carrier that allows this execution to continue.

Network jitter, SSE idle timeout, and temporary upstream failures are usually worth another try. But if what you are facing is authentication failure, lost context, or a provider that has no resume capability at all, then retrying is usually not recovery. It is noise generation.

Unlimited automatic retry is almost always wrong. Technology trends can be noisy for a while, but engineering laws often remain stable for many years. One of them is that failure recovery must have boundaries. The system has to know how many times it can retry at most, how long it should wait each time, and when it should stop and admit that this one is really not going to recover.

Because of these characteristics, HagiCode ultimately did not implement automatic retry as a few lines of try/catch inside a specific provider. Instead, we extracted it into a shared capability layer. In the end, engineering problems still need to be solved with engineering methods.

HagiCode’s Approach: Pull Retry Out of the Provider


HagiCode’s current real-world implementation can be compressed into one sentence:

The shared layer manages the retry flow uniformly, and each concrete provider is only responsible for answering two questions: is this terminal state worth retrying, and can the current context still continue?

This is not complicated, but it is critical. Once responsibilities are separated this way, Claude Code, Codex, and even other Agent CLIs can all reuse the same skeleton. Models will change, tools will evolve, workflows will be upgraded, but the engineering foundation remains there.

Layer 1: Use a unified coordinator to manage the retry loop


The core implementation fragment in the project looks roughly like this:

internal static class ProviderErrorAutoRetryCoordinator
{
    public static async IAsyncEnumerable<CliMessage> ExecuteAsync(
        string prompt,
        ProviderErrorAutoRetrySettings? settings,
        Func<string, IAsyncEnumerable<CliMessage>> executeAttemptAsync,
        Func<bool> canRetryInSameContext,
        Func<TimeSpan, CancellationToken, Task> delayAsync,
        Func<CliMessage, bool> isRetryableTerminalMessage,
        [EnumeratorCancellation] CancellationToken cancellationToken)
    {
        var normalizedSettings = ProviderErrorAutoRetrySettings.Normalize(settings);
        var retrySchedule = normalizedSettings.Enabled
            ? normalizedSettings.GetRetrySchedule()
            : [];

        for (var attempt = 0; ; attempt++)
        {
            var attemptPrompt = attempt == 0
                ? prompt
                : ProviderErrorAutoRetrySettings.ContinuationPrompt;

            CliMessage? terminalFailure = null;

            await foreach (var message in executeAttemptAsync(attemptPrompt)
                .WithCancellation(cancellationToken))
            {
                if (isRetryableTerminalMessage(message))
                {
                    terminalFailure = message;
                    break;
                }

                yield return message;
            }

            if (terminalFailure is null)
            {
                yield break;
            }

            if (attempt >= retrySchedule.Count || !canRetryInSameContext())
            {
                yield return terminalFailure;
                yield break;
            }

            await delayAsync(retrySchedule[attempt], cancellationToken);
        }
    }
}

What this code does is actually very straightforward, but also very effective.

  • Do not pass intermediate failures through directly at first; the coordinator decides whether recovery is still possible
  • Only when the retry budget is exhausted does the final failure actually return to the upper layer
  • Starting from the second attempt, the original prompt is no longer sent; a continuation prompt is sent uniformly instead

That is why I kept stressing earlier that automatic retry is not simply “make the request again.” It is not just patching an exception branch. It is managing the life cycle of an execution. That may sound like product-manager language, but in engineering terms, that is exactly what it is.

Layer 2: Turn the retry policy into a snapshot that travels with the request

Another issue that is very easy to overlook is this: who decides whether automatic retry is enabled for this request?

HagiCode’s answer is not to depend on some “current global configuration,” but to turn the policy into a snapshot and let it travel together with this request. That way, session queuing, message persistence, execution forwarding, and provider adaptation will not lose the policy along the way. One successful run is not a system. Sustained success is a system.

The core structure can be simplified into this:

public sealed record ProviderErrorAutoRetrySnapshot
{
    public const string DefaultStrategy = "default";

    public bool Enabled { get; init; }

    public string Strategy { get; init; } = DefaultStrategy;

    public static ProviderErrorAutoRetrySnapshot Normalize(bool? enabled, string? strategy)
    {
        return new ProviderErrorAutoRetrySnapshot
        {
            Enabled = enabled ?? true,
            Strategy = string.IsNullOrWhiteSpace(strategy)
                ? DefaultStrategy
                : strategy.Trim()
        };
    }
}

Then on the execution side, it is mapped into the settings object actually consumed by the provider. The value of this approach is very direct:

  • The business layer decides whether retry should be allowed
  • The runtime decides how retry should be performed

Each side manages its own concern without colliding with the other. Many problems are not impossible to solve. Their cost simply has not been made explicit. Turning the policy into a snapshot is essentially a way of accounting for that cost in advance.

Layer 3: Providers only decide terminal state and context viability


Once we reach the concrete Claude Code or Codex provider, the responsibility here actually becomes very thin. You can think of it as enhancement, not replacement.

Taking Codex as an example, when it hooks into the shared coordinator, it really only needs to provide three things:

await foreach (var message in ProviderErrorAutoRetryCoordinator.ExecuteAsync(
    prompt,
    options.ProviderErrorAutoRetry,
    retryPrompt => ExecuteCodexAttemptAsync(...),
    () => !string.IsNullOrWhiteSpace(resolvedThreadId),
    DelayAsync,
    IsRetryableTerminalFailure,
    cancellationToken))
{
    yield return message;
}

You will notice that the provider-specific decisions are really only these two:

  • IsRetryableTerminalFailure
  • canRetryInSameContext

Codex checks whether the thread can still continue, while Claude Code checks whether the continuation target still exists. Backoff policy, retry count, and follow-up prompts should not be reinvented by every provider separately.

Once this layer is separated out, the cost of integrating more CLIs into HagiCode drops a lot. You do not have to duplicate an entire retry state machine. You only need to plug in the boundary conditions of that provider. Writing quickly is not the same as writing robustly. Being able to connect something is not the same as connecting it well. Getting it to run is also not the same as making it maintainable over time.

An Easy Mistake to Make: Do Not Treat Every Error as Retryable


In this analysis, the point I most want to single out is not “how to implement retry,” but “how to avoid the wrong retries.”

The original entry point into the problem was that Codex failed to recognize one reconnect message. By intuition, many people would pick the smallest possible fix: add one more string prefix to the whitelist. That idea is not exactly wrong, but it feels more like a demo-stage solution than a long-term maintainable one.

From the current HagiCode implementation, the system has already taken a step in a more robust direction. It no longer stares only at one literal string. Instead, it hands recoverable terminal states over to the shared coordinator uniformly. The benefits are obvious:

  • It is less likely to fail completely because of a small wording change in one message
  • Test coverage can be built around the terminal-state envelope rather than a single hard-coded text line
  • Retry logic becomes more consistent within the same provider

Of course, there needs to be a firm boundary here: being more general does not mean being more permissive. If the current context cannot continue, then even if the error looks like a temporary failure, it should not be replayed blindly.

This point is critical. What really makes people trust a system is not that it occasionally works, but that it is reliable most of the time. If a flow can only be maintained by experts, then it is still a long way from real adoption.

The Three Most Valuable Lessons to Keep in Practice


At this point, it makes sense to start bringing the discussion back down to implementation practice. If you are planning to build a similar capability in your own project, these are the three rules I most strongly recommend protecting first.

1. Backoff must have boundaries

HagiCode’s current default backoff rhythm is:

  • 10 seconds
  • 20 seconds
  • 60 seconds

This rhythm may not fit every system, but the existence of boundaries must remain. Otherwise, automatic retry quickly stops being a recovery mechanism and turns into an incident amplifier. Do not rush to give it an impressive name. First make sure the thing can survive two iterations inside a real team.
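As a sketch, that default rhythm can be expressed as a bounded schedule. The settings type and method name echo the coordinator fragment above; everything else here is illustrative:

```csharp
// Illustrative sketch of a bounded retry schedule matching the defaults
// above (10s, 20s, 60s). Once the list is exhausted, the coordinator
// stops retrying and surfaces the terminal failure instead.
public IReadOnlyList<TimeSpan> GetRetrySchedule() =>
    new[]
    {
        TimeSpan.FromSeconds(10),
        TimeSpan.FromSeconds(20),
        TimeSpan.FromSeconds(60),
    };
```

The point is not the specific numbers but that the schedule is a finite list: its length doubles as the retry budget.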

2. The continuation prompt should be unified


The project uses a fixed continuation prompt so that later attempts clearly follow the path of continuing the current context rather than starting a brand-new complete request. This capability is not flashy, but when you build a real project, you cannot do without it. Many things that look like magic are, once broken apart, just a polished engineering process.
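In code, that can be as simple as one shared constant. The field name matches the one referenced in the coordinator fragment above; the wording of the prompt itself is a placeholder, since the article does not quote HagiCode’s actual text:

```csharp
// Placeholder wording; the real prompt text is not shown in this article.
public const string ContinuationPrompt =
    "Continue the previous task from where it was interrupted; do not start over.";
```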

3. Both the shared library and the adapter layer need mirrored tests


I especially want to say a little more about this point. Many teams will write one layer of tests in the shared runtime and think that is probably enough. It is not.

The reason I feel relatively confident about HagiCode’s implementation is that both layers have test coverage:

  • The shared provider tests whether automatic continuation really happened
  • The adapter layer tests whether final errors and streaming messages were preserved correctly

This time I also reran the two related test groups, and all 31 test cases passed. That result alone does not prove the design is perfect, but it proves at least one thing: the current automatic retry is not a paper design. It is a capability constrained by both code and tests. The old line "talk is cheap, show me the code" fits perfectly here.

If the entire article had to be compressed into one sentence, it would be this:

For Agent CLIs such as Claude Code and Codex, automatic retry should not be implemented as a local trick hidden inside one provider. It should be built as a combination of a shared coordinator, policy snapshot, context viability checks, and mirrored tests.

The benefits of doing it this way are very practical:

  • The logic is written once and reused across multiple providers
  • Whether a request is allowed to retry can travel stably with the execution chain
  • Continue running when context exists, and stop in time when it does not
  • What the frontend ultimately sees is a stable completed state or failed state, not a pile of abandoned intermediate noise

This solution was polished little by little while HagiCode was integrating multiple Agent CLIs in real scenarios. Who says AI-assisted programming is not the new era of pair programming? Models help you get started, complete code, and branch out, but what often determines the upper bound of the experience is still context, process, and constraints.

If this article was helpful to you, you are also welcome to look at HagiCode’s public entry points:

HagiCode is already on Steam now. This is not vaporware, and I have put the link right here. If you use Steam, go ahead and add it to your wishlist. Clicking in to take a look yourself is more direct than hearing me say ten more lines about it here.

That is enough on this topic for now. We will keep meeting inside real projects.

Thank you for reading. If you found this article useful, you are welcome to like, bookmark, and share it. This content was created with AI-assisted collaboration, and the final content was reviewed and confirmed by the author.

Running AI CLI Tools in Docker Containers: A Practical Guide to User Isolation and Persistent Volumes


Integrating AI coding tools like Claude Code, Codex, and OpenCode into containerized environments sounds simple, but there are hidden complexities everywhere. This article takes a deep dive into how the HagiCode project solves core challenges in Docker deployments, including user permissions, configuration persistence, and version management, so you can avoid the common pitfalls.

When we decided to run AI coding CLI tools inside Docker containers, the most intuitive thought was probably: “Aren’t containers just root? Why not install everything directly and call it done?” In reality, that seemingly simple idea hides several core problems that must be solved.

First, security restrictions are the first hurdle. Take Claude CLI as an example: it explicitly forbids running as the root user. This is a mandatory security check, and if root is detected, it refuses to start. You might think, can’t I just switch users with the USER directive? It is not that simple. There is still a mapping problem between the non-root user inside the container and the user permissions on the host machine.

Second, state persistence is the second trap. Claude Code requires login, Codex has its own configuration, and OpenCode also has a cache directory. If you have to reconfigure everything every time the container restarts, the whole idea of “automation” loses its meaning. We need these configurations to persist beyond the lifecycle of the container.

The third problem is permission consistency. Can processes inside the container access configuration files created by the host user? UID/GID mismatches often cause file permission errors, and this is extremely common in real deployments.

These problems may look independent, but in practice they are tightly connected. During HagiCode’s development, we gradually worked out a practical solution. Next, I will share the technical details and the lessons learned from those pitfalls.

The solution shared in this article comes from our practical experience in the HagiCode project. HagiCode is an open-source AI-assisted programming platform that integrates multiple mainstream AI coding assistants, including Claude Code, Codex, and OpenCode. As a project that needs cross-platform and highly available deployment, HagiCode has to solve the full range of challenges involved in containerized deployment.

If you find the technical solution in this article valuable, that is a sign HagiCode has something real to offer in engineering practice. In that case, the HagiCode official website and GitHub repository are both worth following.

There is a common misunderstanding here: Docker containers run as root by default, so why not just install the tools as root? If you think that way, Claude CLI will quickly teach you otherwise.

# Run Claude CLI directly as root? No.
docker run --rm -it --user root myimage claude
# Output: Error: This command cannot be run as root user

This is a hard security restriction in Claude CLI. The reason is simple: these CLI tools read and write sensitive user configuration, including API tokens, local caches, and even scripts written by the user. Running them with root privileges introduces too much risk.

So the question becomes: how can we satisfy the CLI’s security requirements while keeping container management flexible? We need to change the way we think about it: instead of switching users at runtime, create a dedicated user during the image build stage.

Creating a dedicated user: more than just changing a name


You might think that adding a single USER line to the Dockerfile is enough. That is indeed the simplest approach, but it is not robust enough.

HagiCode’s approach is to create a hagicode user with UID 1000, which usually matches the default user on most host machines:

RUN groupadd -o -g 1000 hagicode && \
    useradd -o -u 1000 -g 1000 -s /bin/bash -m hagicode && \
    mkdir -p /home/hagicode/.claude && \
    chown -R hagicode:hagicode /home/hagicode

But this only solves the built-in user inside the image. What if the host user is UID 1001? You still need to support dynamic mapping when the container starts.

docker-entrypoint.sh contains the key logic:

if [ -n "$PUID" ] && [ -n "$PGID" ]; then
  if ! id hagicode >/dev/null 2>&1; then
    groupadd -g "$PGID" hagicode
    useradd -u "$PUID" -g "$PGID" -s /bin/bash -m hagicode
  fi
fi

The advantage of this design is clear: use the default UID 1000 at image build time, then adjust dynamically at runtime through the PUID and PGID environment variables. No matter what UID the host user has, ownership of configuration files remains correct.
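The fallback behavior can be demonstrated in isolation. This is an illustrative sketch, not the real entrypoint: it only shows how the identity resolves when PUID/PGID are or are not provided at "docker run" time.

```shell
# Illustrative sketch (not the real entrypoint): resolve the identity the
# container process should run as, falling back to the image default of
# 1000 when PUID/PGID are not set.
resolve_ids() {
  echo "${PUID:-1000}:${PGID:-1000}"
}

PUID=1001
PGID=1002
resolve_ids   # prints 1001:1002

unset PUID PGID
resolve_ids   # prints 1000:1000
```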

The design philosophy of persistent volumes


Each AI CLI tool has its own preferred configuration directory, so they need to be mapped one by one:

CLI Tool  | Path in Container               | Named Volume
Claude    | /home/hagicode/.claude          | claude-data
Codex     | /home/hagicode/.codex           | codex-data
OpenCode  | /home/hagicode/.config/opencode | opencode-config-data

Why use named volumes instead of bind mounts? Three reasons:

  1. Simpler management: Named volumes are managed automatically by Docker, so you do not need to create host directories manually.
  2. Permission isolation: The initial contents of the volumes are created by the user inside the container, avoiding permission conflicts with the host.
  3. Independent migration: Volumes can exist independently of containers, so data is not lost when images are upgraded.

docker-compose-builder-web automatically generates the corresponding volume configuration:

volumes:
  claude-data:
  codex-data:
  opencode-config-data:

services:
  hagicode:
    volumes:
      - claude-data:/home/hagicode/.claude
      - codex-data:/home/hagicode/.codex
      - opencode-config-data:/home/hagicode/.config/opencode
    user: "${PUID:-1000}:${PGID:-1000}"

Pay attention to the user field here: PUID and PGID are injected through environment variables to ensure that processes inside the container run with an identity that matches the host user. This detail matters because permission issues are painful to debug once they appear.

Version management: baked-in versions with runtime overrides


Pinning Docker image versions is essential for reproducibility. But in real development, we often need to test a newer version or urgently fix a bug. If we had to rebuild the image every time, the workflow would be far too inefficient.

HagiCode’s strategy is fixed versions as the default, with runtime overrides as an extension mechanism. It is a pragmatic engineering compromise between stability and flexibility.

Dockerfile.template pins versions here:

USER hagicode
WORKDIR /home/hagicode

# Configure the global npm install path
RUN mkdir -p /home/hagicode/.npm-global && \
    npm config set prefix '/home/hagicode/.npm-global'

# Install CLI tools using pinned versions
RUN npm install -g @anthropic-ai/claude-code@2.1.71 && \
    npm install -g @openai/codex@0.112.0 && \
    npm install -g opencode-ai@1.2.25 && \
    npm cache clean --force

docker-entrypoint.sh supports runtime overrides:

install_cli_override_if_needed() {
  local package_name="$2"
  local override_version="$5"
  if [ -n "$override_version" ]; then
    gosu hagicode npm install -g "${package_name}@${override_version}"
  fi
}

# Example usage
install_cli_override_if_needed "" "@anthropic-ai/claude-code" "" "" "${CLAUDE_CODE_CLI_VERSION}"

This lets you test a new version through an environment variable without rebuilding the image:

docker run -e CLAUDE_CODE_CLI_VERSION=2.2.0 myimage

This design is practical because nobody wants to rebuild an image every time they test a new feature.

In addition to configuring CLI tools manually, some scenarios require automatic configuration injection. The most typical example is an API token.

if [ -n "$ANTHROPIC_AUTH_TOKEN" ]; then
  mkdir -p /home/hagicode/.claude
  cat > /home/hagicode/.claude/settings.json <<EOF
{
  "env": {
    "ANTHROPIC_AUTH_TOKEN": "${ANTHROPIC_AUTH_TOKEN}"
  }
}
EOF
  chown -R hagicode:hagicode /home/hagicode/.claude
fi

Two things matter here: pass sensitive information through environment variables instead of hard-coding it into the image, and make sure the ownership of configuration files is set correctly, otherwise the CLI tools will not be able to read them.

File permission errors from a UID/GID mismatch

This is the easiest trap to fall into. The host user has UID 1001 while the container uses 1000, so files created on one side cannot be accessed on the other.

# Correct approach: make the container match the host user
docker run \
  -e PUID=$(id -u) \
  -e PGID=$(id -g) \
  myimage

This issue is very common, and it can be frustrating the first time you run into it.

Configuration disappears after container restart


If you find yourself logging in again after every restart, check whether you forgot to mount a persistent volume:

volumes:
  - claude-data:/home/hagicode/.claude

Nothing is more frustrating than carefully setting up a configuration only to see it disappear.

Do not run npm install -g directly inside a running container. The correct approaches are:

  1. Set an environment variable to trigger override installation.
  2. Or rebuild the image.
# Option 1: runtime override
docker run -e CLAUDE_CODE_CLI_VERSION=2.2.0 myimage
# Option 2: rebuild the image
docker build -t myimage:v2 .

There is more than one road to Rome, but some roads are smoother than others.

A few security practices are worth enforcing from day one:

  • Pass API tokens through environment variables instead of writing them into the image.
  • Set configuration file permissions to 600.
  • Always run the application as a non-root user.
  • Update CLI versions regularly to fix security vulnerabilities.

Security is always important, but the real challenge is consistently enforcing it in practice.
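The permissions point above is easy to verify in practice. A minimal sketch, with an illustrative temp path rather than the real /home/hagicode/.claude location:

```shell
# Sketch: enforce owner-only access on an injected settings file.
cfg_dir="$(mktemp -d)"
cfg="$cfg_dir/settings.json"
printf '{"env":{}}\n' > "$cfg"
chmod 600 "$cfg"
stat -c '%a' "$cfg"   # prints 600 (GNU stat; on macOS use: stat -f '%Lp')
```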

If you want to support a new CLI tool in the future, there are only three steps:

  1. Dockerfile.template: add the installation step.
  2. docker-entrypoint.sh: add the version override logic.
  3. docker-compose-builder-web: add the persistent volume mapping.

This template-based design makes extension simple without changing the core logic.
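For step 1, the change is a one-line addition next to the existing pinned installs. The package name "example-cli" below is made up for illustration:

```dockerfile
# Sketch of step 1: add the new CLI's pinned install to Dockerfile.template,
# alongside the existing claude-code/codex/opencode lines.
RUN npm install -g example-cli@1.0.0
```

Steps 2 and 3 follow the same pattern: reuse the existing override helper in docker-entrypoint.sh and add one named-volume mapping for the tool's configuration directory.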

Running AI CLI tools in Docker containers involves three core challenges: user permissions, configuration persistence, and version management. By combining dedicated users, named-volume isolation, and environment-variable-based overrides, the HagiCode project built a deployment architecture that is both secure and flexible.

Key design points:

  • User isolation: Create a dedicated user during the image build stage, with runtime support for dynamic PUID/PGID mapping.
  • Persistence strategy: Each CLI tool gets its own named volume, so restarts do not affect configuration.
  • Version flexibility: Fixed defaults ensure reproducibility, while runtime overrides provide room for testing.
  • Automated configuration: Sensitive configuration can be injected automatically through environment variables.

This solution has been running stably in the HagiCode project for some time, and I hope it offers useful reference points for developers with similar needs.

Thank you for reading. If you found this article useful, you are welcome to like, bookmark, and share it. This content was created with AI-assisted collaboration, and the final content was reviewed and confirmed by the author.

HagiCode Multi-AI Provider Switching and Interoperability Implementation Plan


In the modern developer-tooling ecosystem, developers often need to use different AI coding assistants to support their work. Anthropic’s Claude Code CLI and OpenAI’s Codex CLI each have their own strengths: Claude is known for outstanding code understanding and long-context handling, while Codex excels at code generation and tool usage.

This article takes an in-depth look at how the HagiCode project achieves seamless switching and interoperability across multiple AI providers, including the core architectural design, key implementation details, and practical considerations.

The core challenge faced by the HagiCode project is supporting multiple AI CLIs on the same platform, so users can:

  1. Flexibly switch between AI providers based on their needs
  2. Maintain session continuity during provider switching
  3. Unify the API differences across different CLIs behind a common abstraction
  4. Reserve extension points for adding new AI providers in the future
To support this, the implementation has to resolve several technical differences:

  1. Unifying interface differences: Claude Code CLI is invoked through command-line calls, while Codex CLI uses a JSON event stream
  2. Handling streaming responses: Both providers support streaming responses, but with different data formats
  3. Tool-calling semantics: Claude and Codex differ in how they represent tool calls and manage their lifecycle
  4. Session lifecycle: The system must correctly manage session creation, restoration, and termination for each provider

HagiCode uses the Provider Pattern combined with the Factory Pattern to abstract AI service invocation. The core ideas of this design are:

  1. Unified interface abstraction: Define the IAIProvider interface as the common abstraction for all AI providers
  2. Factory-created instances: Use AIProviderFactory to dynamically create the corresponding provider instance based on type
  3. Intelligent selection logic: Use AIProviderSelector to automatically select the most suitable provider based on scenario and configuration
  4. Session state management: Persist the binding relationship between sessions and CLI threads in the database

| Component | Responsibility | Language |
| --- | --- | --- |
| IAIProvider | Unified provider interface | C# |
| AIProviderFactory | Create and manage provider instances | C# |
| AIProviderSelector | Select providers intelligently | C# |
| ClaudeCodeCliProvider | Claude Code CLI implementation | C# |
| CodexCliProvider | Codex CLI implementation | C# |
| AgentCliManager | Desktop-side CLI management | TypeScript |

The IAIProvider interface defines the unified provider abstraction:

public interface IAIProvider
{
    /// <summary>
    /// Provider display name
    /// </summary>
    string Name { get; }

    /// <summary>
    /// Whether streaming responses are supported
    /// </summary>
    bool SupportsStreaming { get; }

    /// <summary>
    /// Provider capability description
    /// </summary>
    ProviderCapabilities Capabilities { get; }

    /// <summary>
    /// Execute a single AI request
    /// </summary>
    Task<AIResponse> ExecuteAsync(AIRequest request, CancellationToken cancellationToken = default);

    /// <summary>
    /// Execute a streaming AI request
    /// </summary>
    IAsyncEnumerable<AIStreamingChunk> StreamAsync(AIRequest request, CancellationToken cancellationToken = default);

    /// <summary>
    /// Check provider connectivity and responsiveness
    /// </summary>
    Task<ProviderTestResult> PingAsync(CancellationToken cancellationToken = default);

    /// <summary>
    /// Send a message with an embedded command
    /// </summary>
    IAsyncEnumerable<AIStreamingChunk> SendMessageAsync(
        AIRequest request,
        string? embeddedCommandPrompt = null,
        CancellationToken cancellationToken = default);
}

Key characteristics of this interface design:

  • Unified request/response model: All providers use the same AIRequest and AIResponse types
  • Streaming support: Standardize streaming output through IAsyncEnumerable<AIStreamingChunk>
  • Capability description: ProviderCapabilities describes the features supported by the provider (streaming, tools, maximum tokens, and so on)
  • Embedded commands: SendMessageAsync supports embedding OpenSpec commands into prompts

Supported providers are enumerated in AIProviderType:

public enum AIProviderType
{
    ClaudeCodeCli,  // Anthropic Claude Code
    OpenCodeCli,    // Other CLIs (extensible)
    GitHubCopilot,  // GitHub Copilot
    CodebuddyCli,   // Codebuddy
    CodexCli        // OpenAI Codex
}

This enum provides a type-safe representation for all providers supported by the system.

The AIProviderFactory is responsible for creating and managing provider instances:

public class AIProviderFactory : IAIProviderFactory
{
    private readonly ConcurrentDictionary<AIProviderType, IAIProvider> _cache;
    private readonly IOptions<AIProviderOptions> _options;
    private readonly IServiceProvider _serviceProvider;
    private readonly ILogger<AIProviderFactory> _logger;

    public Task<IAIProvider?> GetProviderAsync(AIProviderType providerType)
    {
        // Use caching to avoid duplicate creation
        if (_cache.TryGetValue(providerType, out var cached))
            return Task.FromResult<IAIProvider?>(cached);

        // Get provider configuration from settings
        var aiOptions = _options.Value;
        if (!aiOptions.Providers.TryGetValue(providerType, out var config))
        {
            _logger.LogWarning("Provider '{ProviderType}' not found in configuration", providerType);
            return Task.FromResult<IAIProvider?>(null);
        }

        // Create provider by type
        var provider = providerType switch
        {
            AIProviderType.ClaudeCodeCli =>
                _serviceProvider.GetService(typeof(ClaudeCodeCliProvider)) as IAIProvider,
            AIProviderType.CodexCli =>
                _serviceProvider.GetService(typeof(CodexCliProvider)) as IAIProvider,
            AIProviderType.GitHubCopilot =>
                _serviceProvider.GetService(typeof(CopilotAIProvider)) as IAIProvider,
            _ => null
        };

        if (provider != null)
        {
            _cache[providerType] = provider;
        }
        return Task.FromResult<IAIProvider?>(provider);
    }
}

Advantages of the factory pattern:

  • Instance caching: Avoid repeatedly creating the same type of provider
  • Dependency injection: Create instances through IServiceProvider, with dependency injection support
  • Configuration-driven: Read provider settings from configuration files
  • Exception handling: Return null when creation fails, making it easier for upper layers to handle errors
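The factory's caching behavior boils down to "create once per provider type, then reuse". A minimal TypeScript stand-in (all names here are illustrative, not part of HagiCode's API) sketches that contract:

```typescript
// Minimal stand-in for AIProviderFactory's caching behavior: one instance
// per provider type, created lazily on first request and then reused.
type ProviderType = 'ClaudeCodeCli' | 'CodexCli';

interface Provider {
  readonly name: string;
}

class CachingProviderFactory {
  private readonly cache = new Map<ProviderType, Provider>();

  constructor(private readonly create: (type: ProviderType) => Provider) {}

  getProvider(type: ProviderType): Provider {
    let provider = this.cache.get(type);
    if (!provider) {
      provider = this.create(type); // created once, then reused
      this.cache.set(type, provider);
    }
    return provider;
  }
}
```

Repeated calls for the same type return the identical instance, which is what makes per-provider state (sessions, bindings) safe to keep inside the provider object.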

The AIProviderSelector implements provider-selection strategies:

public class AIProviderSelector : IAIProviderSelector
{
    private readonly BusinessLayerConfiguration _configuration;
    private readonly IAIProviderFactory _providerFactory;
    private readonly IMemoryCache _cache;
    private readonly ILogger<AIProviderSelector> _logger;

    public async Task<AIProviderType> SelectProviderAsync(
        BusinessScenario scenario,
        CancellationToken cancellationToken = default)
    {
        // 1. Try getting a provider from scenario mapping
        if (_configuration.ScenarioProviderMapping.TryGetValue(scenario, out var providerType))
        {
            if (await IsProviderAvailableAsync(providerType, cancellationToken))
            {
                _logger.LogDebug("Selected provider '{Provider}' for scenario '{Scenario}'",
                    providerType, scenario);
                return providerType;
            }
            _logger.LogWarning("Configured provider '{Provider}' for scenario '{Scenario}' is not available",
                providerType, scenario);
        }

        // 2. Try the default provider
        if (await IsProviderAvailableAsync(_configuration.DefaultProvider, cancellationToken))
        {
            _logger.LogDebug("Using default provider '{Provider}' for scenario '{Scenario}'",
                _configuration.DefaultProvider, scenario);
            return _configuration.DefaultProvider;
        }

        // 3. Try the fallback chain
        foreach (var fallbackProvider in _configuration.FallbackChain)
        {
            if (await IsProviderAvailableAsync(fallbackProvider, cancellationToken))
            {
                _logger.LogInformation("Using fallback provider '{Provider}' for scenario '{Scenario}'",
                    fallbackProvider, scenario);
                return fallbackProvider;
            }
        }

        // 4. No available provider can be found
        throw new InvalidOperationException(
            $"No available AI provider found for scenario '{scenario}'");
    }

    public async Task<bool> IsProviderAvailableAsync(
        AIProviderType providerType,
        CancellationToken cancellationToken = default)
    {
        var cacheKey = $"provider_available_{providerType}";
        // Use caching to reduce Ping calls
        if (_configuration.EnableCache &&
            _cache.TryGetValue<bool>(cacheKey, out var cached))
        {
            return cached;
        }

        var provider = await _providerFactory.GetProviderAsync(providerType);
        var isAvailable = provider != null;
        if (_configuration.EnableCache && isAvailable)
        {
            _cache.Set(cacheKey, isAvailable,
                TimeSpan.FromSeconds(_configuration.CacheExpirationSeconds));
        }
        return isAvailable;
    }
}

Selector strategy:

  • Scenario mapping first: First check whether the business scenario has a specific provider mapping
  • Fallback to default provider: Use the default provider if scenario mapping fails
  • Fallback chain as a final safeguard: Try providers in the fallback chain one by one
  • Availability caching: Cache provider availability checks to reduce Ping calls
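Stripped of logging and caching, the three-step strategy is a pure function over availability. The following TypeScript sketch (illustrative names, not HagiCode's API) mirrors the selection order:

```typescript
type ProviderId = 'ClaudeCodeCli' | 'CodexCli' | 'GitHubCopilot';

// Pure sketch of the selection strategy:
// scenario mapping -> default provider -> fallback chain -> error.
function selectProvider(
  scenarioMapped: ProviderId | null,
  defaultProvider: ProviderId,
  fallbackChain: ProviderId[],
  isAvailable: (p: ProviderId) => boolean,
): ProviderId {
  if (scenarioMapped && isAvailable(scenarioMapped)) return scenarioMapped; // 1. mapping
  if (isAvailable(defaultProvider)) return defaultProvider;                 // 2. default
  for (const candidate of fallbackChain) {
    if (isAvailable(candidate)) return candidate;                           // 3. fallback
  }
  throw new Error('No available AI provider');                              // 4. give up
}
```

Keeping the ordering in one pure function makes the strategy easy to unit-test independently of Ping calls and caches.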

5. Claude Code CLI Provider Implementation

public class ClaudeCodeCliProvider : IAIProvider
{
    private readonly ILogger<ClaudeCodeCliProvider> _logger;
    private readonly IClaudeStreamManager _streamManager;
    private readonly ProviderConfiguration _config;

    public string Name => "ClaudeCodeCli";
    public bool SupportsStreaming => true;
    public ProviderCapabilities Capabilities { get; }

    public async Task<AIResponse> ExecuteAsync(AIRequest request, CancellationToken cancellationToken = default)
    {
        _logger.LogInformation("Executing AI request with provider: {Provider}", Name);
        var sessionOptions = ClaudeRequestMapper.MapToSessionOptions(request, _config);
        var messages = _streamManager.SendMessageAsync(request.Prompt, sessionOptions, cancellationToken);

        var responseBuilder = new StringBuilder();
        ResultMessage? finalResult = null;
        await foreach (var streamMessage in messages)
        {
            switch (streamMessage.Message)
            {
                case ResultMessage result:
                    finalResult = result;
                    responseBuilder.Append(result.Result);
                    break;
            }
        }

        if (finalResult != null)
        {
            return ClaudeResponseMapper.MapToAIResponse(finalResult, Name);
        }
        return new AIResponse
        {
            Content = responseBuilder.ToString(),
            FinishReason = FinishReason.Unknown,
            Provider = Name
        };
    }
}

Characteristics of the Claude Code CLI provider:

  • Streaming manager integration: Use IClaudeStreamManager to communicate with the Claude CLI
  • CessionId session isolation: Use CessionId as the unique session identifier, distinct from the system sessionId
  • Working directory configuration: Support configuration of the working directory, permission mode, and more
  • Tool support: Support tool-permission settings such as AllowedTools and DisallowedTools

The CodexCliProvider consumes the Codex JSON event stream and maintains session-thread bindings:

public class CodexCliProvider : IAIProvider
{
    private readonly ILogger<CodexCliProvider> _logger;
    private readonly CodexSettings _settings;
    private readonly ConcurrentDictionary<string, string> _sessionThreadBindings;
    private readonly ThreadOptions _threadOptions; // built from _settings at construction time

    public string Name => "CodexCli";
    public bool SupportsStreaming => true;
    public ProviderCapabilities Capabilities { get; }

    public async IAsyncEnumerable<AIStreamingChunk> StreamAsync(
        AIRequest request,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        _logger.LogInformation("Executing streaming AI request with provider: {Provider}", Name);
        var codex = CreateCodexClient();
        var thread = ResolveThread(codex, request);
        var currentTurn = 0;
        var activeToolCalls = new Dictionary<string, AIToolCallDelta>();

        await foreach (var threadEvent in thread.RunStreamedAsync(BuildPrompt(request), cancellationToken))
        {
            if (threadEvent is TurnStartedEvent)
            {
                currentTurn++;
            }
            switch (threadEvent)
            {
                case ItemCompletedEvent { Item: AgentMessageItem message }:
                    var messageText = message.Text ?? string.Empty;
                    yield return new AIStreamingChunk
                    {
                        Content = messageText,
                        Type = StreamingChunkType.ContentDelta,
                        IsComplete = false
                    };
                    break;
                case ItemStartedEvent or ItemUpdatedEvent or ItemCompletedEvent:
                    var toolChunk = BuildToolChunk(threadEvent, currentTurn);
                    if (toolChunk?.ToolCallDelta != null)
                    {
                        yield return toolChunk;
                    }
                    break;
                case TurnCompletedEvent turnCompleted:
                    activeToolCalls.Clear();
                    yield return new AIStreamingChunk
                    {
                        Content = string.Empty,
                        Type = StreamingChunkType.Metadata,
                        IsComplete = true,
                        Usage = MapUsage(turnCompleted.Usage)
                    };
                    break;
            }
        }
        BindSessionThread(request.SessionId, thread.Id);
    }

    private CodexThread ResolveThread(Codex codex, AIRequest request)
    {
        var sessionId = request.SessionId;
        // Check whether there is already a bound thread
        if (!string.IsNullOrWhiteSpace(sessionId) &&
            _sessionThreadBindings.TryGetValue(sessionId, out var threadId) &&
            !string.IsNullOrWhiteSpace(threadId))
        {
            _logger.LogInformation("Resuming Codex thread {ThreadId} for session {SessionId}", threadId, sessionId);
            return codex.ResumeThread(threadId, _threadOptions);
        }
        _logger.LogInformation("Starting new Codex thread for session {SessionId}", sessionId ?? "(none)");
        return codex.StartThread(_threadOptions);
    }
}

Characteristics of the Codex CLI provider:

  • JSON event-stream handling: Parse Codex JSON event streams (TurnStarted, ItemStarted, TurnCompleted, and so on)
  • Session-thread binding: Persist the binding between sessions and threads with an SQLite database
  • Thread reuse: Support resuming existing threads to maintain session continuity
  • Tool-call tracking: Track active tool-call state and correctly handle the tool lifecycle

Codex CLI uses an SQLite database to persist the binding between sessions and threads:

public class CodexCliProvider : IAIProvider
{
    private const int SessionThreadBindingRetentionDays = 30;
    private readonly ConcurrentDictionary<string, string> _sessionThreadBindings;
    private readonly string _sessionThreadBindingDatabaseConnectionString;
    private readonly string _sessionThreadBindingDatabasePath;

    private void BindSessionThread(string? sessionId, string? threadId)
    {
        if (string.IsNullOrWhiteSpace(sessionId) || string.IsNullOrWhiteSpace(threadId))
        {
            return;
        }
        // In-memory cache
        _sessionThreadBindings.AddOrUpdate(sessionId, threadId, (_, _) => threadId);
        // Persist to SQLite
        PersistSessionThreadBinding(sessionId, threadId);
    }

    private void PersistSessionThreadBinding(string sessionId, string threadId)
    {
        try
        {
            using var connection = new SqliteConnection(_sessionThreadBindingDatabaseConnectionString);
            connection.Open();
            using var upsertCommand = connection.CreateCommand();
            upsertCommand.CommandText =
                """
                INSERT INTO SessionThreadBindings (SessionId, ThreadId, CreatedAtUtc, UpdatedAtUtc)
                VALUES ($sessionId, $threadId, $createdAtUtc, $updatedAtUtc)
                ON CONFLICT(SessionId) DO UPDATE SET
                    ThreadId = excluded.ThreadId,
                    UpdatedAtUtc = excluded.UpdatedAtUtc;
                """;
            var nowUtc = DateTimeOffset.UtcNow.ToString("O");
            upsertCommand.Parameters.AddWithValue("$sessionId", sessionId);
            upsertCommand.Parameters.AddWithValue("$threadId", threadId);
            upsertCommand.Parameters.AddWithValue("$createdAtUtc", nowUtc);
            upsertCommand.Parameters.AddWithValue("$updatedAtUtc", nowUtc);
            upsertCommand.ExecuteNonQuery();
        }
        catch (Exception ex)
        {
            _logger.LogWarning(
                ex,
                "Failed to persist Codex session-thread binding for session {SessionId} to {DatabasePath}",
                sessionId,
                _sessionThreadBindingDatabasePath);
        }
    }

    private void LoadPersistedSessionThreadBindings()
    {
        using var connection = new SqliteConnection(_sessionThreadBindingDatabaseConnectionString);
        connection.Open();
        using var loadCommand = connection.CreateCommand();
        loadCommand.CommandText = "SELECT SessionId, ThreadId FROM SessionThreadBindings;";
        using var reader = loadCommand.ExecuteReader();
        while (reader.Read())
        {
            var sessionId = reader.GetString(0);
            var threadId = reader.GetString(1);
            _sessionThreadBindings[sessionId] = threadId;
        }
    }
}

Advantages of session-thread binding:

  • Session restoration: Previous sessions can be restored after a system restart
  • Thread reuse: The same session can reuse an existing Codex thread
  • Automatic cleanup: Bindings older than 30 days are cleaned up automatically
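The 30-day retention rule can be expressed as a pure filter over the persisted timestamps; the real provider issues an equivalent SQL DELETE against SQLite. A TypeScript sketch (illustrative names, not HagiCode's API):

```typescript
// Sketch of the 30-day retention rule for session-thread bindings.
const RETENTION_DAYS = 30;
const DAY_MS = 24 * 60 * 60 * 1000;

interface Binding {
  sessionId: string;
  updatedAtUtc: string; // ISO-8601, as persisted by the provider
}

// Returns the sessions whose bindings fell out of the retention window.
function expiredSessionIds(bindings: Binding[], nowUtc: Date): string[] {
  const cutoff = nowUtc.getTime() - RETENTION_DAYS * DAY_MS;
  return bindings
    .filter((b) => Date.parse(b.updatedAtUtc) < cutoff)
    .map((b) => b.sessionId);
}
```

Expressing the cutoff once and reusing it in both the in-memory cache and the SQL DELETE keeps the two stores consistent.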

hagicode-desktop manages CLI selection through AgentCliManager:

export enum AgentCliType {
  ClaudeCode = 'claude-code',
  Codex = 'codex',
  // Future extensions: other CLIs such as Aider and Cursor
}

export class AgentCliManager {
  private static readonly STORE_KEY = 'agentCliSelection';
  private static readonly EXECUTOR_TYPE_MAP: Record<AgentCliType, string> = {
    [AgentCliType.ClaudeCode]: 'ClaudeCodeCli',
    [AgentCliType.Codex]: 'CodexCli',
  };

  constructor(private store: any) {}

  async saveSelection(cliType: AgentCliType): Promise<void> {
    const selection: StoredAgentCliSelection = {
      cliType,
      isSkipped: false,
      selectedAt: new Date().toISOString(),
    };
    this.store.set(AgentCliManager.STORE_KEY, selection);
  }

  loadSelection(): StoredAgentCliSelection {
    return this.store.get(AgentCliManager.STORE_KEY, {
      cliType: null,
      isSkipped: false,
      selectedAt: null,
    });
  }

  getCommandName(cliType: AgentCliType): string {
    switch (cliType) {
      case AgentCliType.ClaudeCode:
        return 'claude';
      case AgentCliType.Codex:
        return 'codex';
      default:
        return 'claude';
    }
  }

  getExecutorType(cliType: AgentCliType | null): string {
    if (!cliType) return 'ClaudeCodeCli';
    // EXECUTOR_TYPE_MAP is static, so it is accessed via the class name
    return AgentCliManager.EXECUTOR_TYPE_MAP[cliType] || 'ClaudeCodeCli';
  }
}

Example desktop-side IPC handler:

ipcMain.handle('llm:call-api', async (event, manifestPath, region) => {
  if (!state.llmInstallationManager) {
    return { success: false, error: 'LLM Installation Manager not initialized' };
  }
  try {
    const prompt = await state.llmInstallationManager.loadPrompt(manifestPath, region);

    // Determine the CLI command based on the user's selection
    let commandName = 'claude';
    if (state.agentCliManager) {
      const selectedCliType = state.agentCliManager.getSelectedCliType();
      if (selectedCliType) {
        commandName = state.agentCliManager.getCommandName(selectedCliType);
      }
    }

    // Execute with the selected CLI
    const result = await state.llmInstallationManager.callApi(
      prompt.filePath,
      event.sender,
      commandName
    );
    return result;
  } catch (error) {
    return {
      success: false,
      error: error instanceof Error ? error.message : 'Unknown error'
    };
  }
});

9. Codex’s Internal Model Provider System


Codex itself also supports multiple model providers via ModelProviderInfo configuration:

pub const OPENAI_PROVIDER_NAME: &str = "OpenAI";
pub const OLLAMA_OSS_PROVIDER_ID: &str = "ollama";
pub const LMSTUDIO_OSS_PROVIDER_ID: &str = "lmstudio";

pub fn built_in_model_providers() -> HashMap<String, ModelProviderInfo> {
    use ModelProviderInfo as P;
    [
        ("openai", P::create_openai_provider()),
        (OLLAMA_OSS_PROVIDER_ID, create_oss_provider(DEFAULT_OLLAMA_PORT, WireApi::Responses)),
        (LMSTUDIO_OSS_PROVIDER_ID, create_oss_provider(DEFAULT_LMSTUDIO_PORT, WireApi::Responses)),
    ]
    .into_iter()
    .map(|(k, v)| (k.to_string(), v))
    .collect()
}

pub struct ModelProviderInfo {
    pub name: String,
    pub base_url: Option<String>,
    pub env_key: Option<String>,
    pub query_params: Option<HashMap<String, String>>,
    pub http_headers: Option<HashMap<String, String>>,
    pub request_max_retries: Option<u64>,
    pub stream_max_retries: Option<u64>,
    pub stream_idle_timeout_ms: Option<u64>,
    pub requires_openai_auth: bool,
    pub supports_websockets: bool,
}

Codex model-provider support includes:

  • Built-in providers: OpenAI, Ollama, and LM Studio
  • Custom providers: Users can add custom providers in config.toml
  • Retry strategy: Configurable retry counts for requests and streams
  • WebSocket support: Some providers support WebSocket transport
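How might a retry budget like request_max_retries translate into actual wait times? A common approach is capped exponential backoff; the following TypeScript sketch uses an illustrative formula, which is an assumption for demonstration and not Codex's actual implementation:

```typescript
// Sketch: turn a retry budget (e.g. request_max_retries) into a capped
// exponential-backoff schedule in milliseconds. Formula is illustrative.
function backoffDelaysMs(maxRetries: number, baseMs: number, capMs: number): number[] {
  return Array.from({ length: maxRetries }, (_, attempt) =>
    Math.min(baseMs * Math.pow(2, attempt), capMs),
  );
}
```

With a base of 1 s and a 5 s cap, four retries wait 1 s, 2 s, 4 s, and 5 s, so a transient failure is retried quickly at first without ever hammering the backend.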

Configure multiple providers in appsettings.json:

{
  "AI": {
    "Providers": {
      "DefaultProvider": "ClaudeCodeCli",
      "Providers": {
        "ClaudeCodeCli": {
          "Type": "ClaudeCodeCli",
          "Model": "claude-sonnet-4-20250514",
          "WorkingDirectory": "/path/to/workspace",
          "PermissionMode": "acceptEdits",
          "AllowedTools": ["file-edit", "command-run", "bash"]
        },
        "CodexCli": {
          "Type": "CodexCli",
          "Model": "gpt-4.1",
          "ExecutablePath": "codex",
          "SandboxMode": "enabled",
          "WebSearchMode": "auto",
          "NetworkAccessEnabled": false
        }
      },
      "ScenarioProviderMapping": {
        "CodeAnalysis": "ClaudeCodeCli",
        "CodeGeneration": "CodexCli",
        "Refactoring": "ClaudeCodeCli",
        "Debugging": "CodexCli"
      },
      "FallbackChain": ["CodexCli", "ClaudeCodeCli"]
    },
    "Selector": {
      "EnableCache": true,
      "CacheExpirationSeconds": 300
    }
  }
}

The AIOrchestrator combines the selector and the factory to serve requests end to end:

public class AIOrchestrator
{
    private readonly IAIProviderFactory _providerFactory;
    private readonly IAIProviderSelector _providerSelector;
    private readonly ILogger<AIOrchestrator> _logger;

    public AIOrchestrator(
        IAIProviderFactory providerFactory,
        IAIProviderSelector providerSelector,
        ILogger<AIOrchestrator> logger)
    {
        _providerFactory = providerFactory;
        _providerSelector = providerSelector;
        _logger = logger;
    }

    public async Task<AIResponse> ProcessRequestAsync(
        AIRequest request,
        BusinessScenario scenario)
    {
        _logger.LogInformation("Processing request for scenario: {Scenario}", scenario);
        try
        {
            // Select a provider intelligently
            var providerType = await _providerSelector.SelectProviderAsync(scenario, request.CancellationToken);

            // Get the provider instance
            var provider = await _providerFactory.GetProviderAsync(providerType);
            if (provider == null)
            {
                throw new InvalidOperationException($"Provider {providerType} not available");
            }
            _logger.LogInformation("Using provider: {Provider} for request", provider.Name);

            // Execute the request
            var response = await provider.ExecuteAsync(request, request.CancellationToken);
            _logger.LogInformation("Request completed with provider: {Provider}, tokens used: {Tokens}",
                provider.Name,
                response.Usage?.TotalTokens ?? 0);
            return response;
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "Failed to process request for scenario: {Scenario}", scenario);
            throw;
        }
    }
}
// Streaming variant, also a member of AIOrchestrator
public async IAsyncEnumerable<AIStreamingChunk> StreamResponseAsync(
    AIRequest request,
    BusinessScenario scenario)
{
    var providerType = await _providerSelector.SelectProviderAsync(scenario);
    var provider = await _providerFactory.GetProviderAsync(providerType);
    if (provider == null)
    {
        throw new InvalidOperationException($"Provider {providerType} not available");
    }

    await foreach (var chunk in provider.StreamAsync(request))
    {
        // Process streaming chunks
        switch (chunk.Type)
        {
            case StreamingChunkType.ContentDelta:
                // Show text content in real time
                await SendToClientAsync(chunk.Content);
                break;
            case StreamingChunkType.ToolCallDelta:
                // Handle tool calls
                await HandleToolCallAsync(chunk.ToolCallDelta);
                break;
            case StreamingChunkType.Metadata:
                // Handle completion events and stats
                if (chunk.IsComplete)
                {
                    _logger.LogInformation("Stream completed, usage: {@Usage}", chunk.Usage);
                }
                break;
            case StreamingChunkType.Error:
                // Handle errors
                _logger.LogError("Stream error: {Error}", chunk.ErrorMessage);
                throw new InvalidOperationException(chunk.ErrorMessage);
        }

        // Forward the chunk to the caller; an async iterator must yield
        yield return chunk;
    }
}
public async Task<string> ExecuteOpenSpecCommandAsync(
    string command,
    string arguments,
    BusinessScenario scenario)
{
    var providerType = await _providerSelector.SelectProviderAsync(scenario);
    var provider = await _providerFactory.GetProviderAsync(providerType);
    if (provider == null)
    {
        throw new InvalidOperationException($"Provider {providerType} not available");
    }

    // Build an embedded command prompt
    var commandPrompt = $"""
        Execute the following OpenSpec command:
        Command: {command}
        Arguments: {arguments}
        Please execute this command and return the results.
        """;

    var request = new AIRequest
    {
        Prompt = "Process this command request",
        EmbeddedCommandPrompt = commandPrompt,
        WorkingDirectory = Directory.GetCurrentDirectory()
    };

    // SendMessageAsync streams chunks, so accumulate the content deltas
    var responseBuilder = new StringBuilder();
    await foreach (var chunk in provider.SendMessageAsync(request, commandPrompt))
    {
        if (chunk.Type == StreamingChunkType.ContentDelta)
            responseBuilder.Append(chunk.Content);
    }
    return responseBuilder.ToString();
}

Before switching providers, it is recommended to call PingAsync first to ensure the target provider is available:

public async Task<bool> IsProviderHealthyAsync(AIProviderType providerType)
{
    var provider = await _providerFactory.GetProviderAsync(providerType);
    if (provider == null) return false;

    var testResult = await provider.PingAsync();
    return testResult.Success &&
           testResult.ResponseTimeMs < 5000; // A response within 5 seconds is considered healthy
}

Use CessionId (Claude) or ThreadId (Codex) to ensure session isolation:

  • Claude Code CLI: use CessionId as the unique session identifier
  • Codex CLI: use ThreadId as the session identifier

// Claude Code CLI session options
var claudeSessionOptions = new ClaudeSessionOptions
{
    CessionId = CessionId.New(), // Generate a unique ID
    WorkingDirectory = workspacePath,
    AllowedTools = allowedTools,
    PermissionMode = PermissionMode.acceptEdits
};

// Codex thread options
var codexThreadOptions = new ThreadOptions
{
    Model = "gpt-4.1",
    SandboxMode = "enabled",
    WorkingDirectory = workspacePath
};

Fallback mechanisms must be robust when a provider is unavailable, ensuring that at least one provider remains usable:

public async Task<AIResponse> ExecuteWithFallbackAsync(
    AIRequest request,
    List<AIProviderType> preferredProviders)
{
    Exception? lastException = null;
    foreach (var providerType in preferredProviders)
    {
        try
        {
            var provider = await _providerFactory.GetProviderAsync(providerType);
            if (provider == null) continue;

            // Try execution
            return await provider.ExecuteAsync(request);
        }
        catch (Exception ex)
        {
            _logger.LogWarning(ex, "Provider {ProviderType} failed, trying next", providerType);
            lastException = ex;
        }
    }

    // All providers failed
    throw new InvalidOperationException(
        "All preferred providers failed. Last error: " + lastException?.Message,
        lastException);
}

Validate settings for all configured providers at startup to avoid runtime errors:

public void ValidateConfiguration(AIProviderOptions options)
{
    foreach (var (providerType, config) in options.Providers)
    {
        // Validate executable paths (for CLI-based providers)
        if (IsCliBasedProvider(providerType))
        {
            if (string.IsNullOrWhiteSpace(config.ExecutablePath))
            {
                throw new ConfigurationException(
                    $"Provider {providerType} requires ExecutablePath");
            }
            if (!File.Exists(config.ExecutablePath))
            {
                throw new ConfigurationException(
                    $"Executable not found for {providerType}: {config.ExecutablePath}");
            }
        }

        // Validate API keys (for API-based providers)
        if (IsApiBasedProvider(providerType))
        {
            if (string.IsNullOrWhiteSpace(config.ApiKey))
            {
                throw new ConfigurationException(
                    $"Provider {providerType} requires ApiKey");
            }
        }

        // Validate model names
        if (string.IsNullOrWhiteSpace(config.Model))
        {
            _logger.LogWarning("No model configured for {ProviderType}, using default", providerType);
        }
    }
}

Provider instances are cached, so pay attention to lifecycle management and memory usage:

// Clean up the cache periodically
public void ClearInactiveProviders(TimeSpan inactiveThreshold)
{
    var now = DateTimeOffset.UtcNow;
    var keysToRemove = new List<AIProviderType>();
    foreach (var (type, instance) in _cache)
    {
        // Assume providers have a LastUsedTime property
        if (instance.LastUsedTime.HasValue &&
            now - instance.LastUsedTime.Value > inactiveThreshold)
        {
            keysToRemove.Add(type);
        }
    }
    foreach (var key in keysToRemove)
    {
        _cache.TryRemove(key, out _);
        _logger.LogInformation("Cleared inactive provider: {Provider}", key);
    }
}

Log provider selection, switching, and execution in detail to make debugging easier:

public class AIProviderLogging
{
    private readonly ILogger _logger;

    public void LogProviderSelection(
        BusinessScenario scenario,
        AIProviderType selectedProvider,
        SelectionReason reason)
    {
        _logger.LogInformation(
            "[ProviderSelection] Scenario={Scenario}, Provider={Provider}, Reason={Reason}",
            scenario,
            selectedProvider,
            reason);
    }

    public void LogProviderSwitch(
        AIProviderType fromProvider,
        AIProviderType toProvider,
        string reason)
    {
        _logger.LogWarning(
            "[ProviderSwitch] From={FromProvider} To={ToProvider}, Reason={Reason}",
            fromProvider,
            toProvider,
            reason);
    }

    public void LogProviderError(
        AIProviderType provider,
        Exception error,
        AIRequest request)
    {
        _logger.LogError(error,
            "[ProviderError] Provider={Provider}, RequestLength={Length}, Error={Message}",
            provider,
            request.Prompt.Length,
            error.Message);
    }
}

Concurrent collections such as ConcurrentDictionary handle lock-free reads, while a short write lock around creation guarantees each provider is constructed only once:

public class ThreadSafeProviderCache
{
    private readonly ConcurrentDictionary<AIProviderType, IAIProvider> _cache;
    private readonly ReaderWriterLockSlim _lock = new();

    public IAIProvider? GetProvider(AIProviderType type)
    {
        // Read operations do not require a lock
        if (_cache.TryGetValue(type, out var provider))
            return provider;

        // Creation requires a write lock so each provider is created only once
        _lock.EnterWriteLock();
        try
        {
            // Double-check in case another thread created it first
            if (_cache.TryGetValue(type, out provider))
                return provider;

            var newProvider = CreateProvider(type);
            if (newProvider != null)
            {
                _cache[type] = newProvider;
            }
            return newProvider;
        }
        finally
        {
            _lock.ExitWriteLock();
        }
    }
}

When the session-thread binding database schema changes, data migration must be considered:

public class SessionThreadMigration
{
    public async Task MigrateAsync(string dbPath)
    {
        var version = await GetSchemaVersionAsync(dbPath);
        if (version >= 2) return; // Already the latest version

        // SqliteConnection expects a connection string, not a bare file path
        using var connection = new SqliteConnection($"Data Source={dbPath}");
        connection.Open();

        // Migrate to v2: add the CreatedAtUtc column
        if (version < 2)
        {
            _logger.LogInformation("Migrating SessionThreadBindings to v2...");
            using var addColumnCommand = connection.CreateCommand();
            addColumnCommand.CommandText = "ALTER TABLE SessionThreadBindings ADD COLUMN CreatedAtUtc TEXT;";
            addColumnCommand.ExecuteNonQuery();

            using var backfillCommand = connection.CreateCommand();
            backfillCommand.CommandText =
                """
                UPDATE SessionThreadBindings
                SET CreatedAtUtc = COALESCE(NULLIF(UpdatedAtUtc, ''), $nowUtc)
                WHERE CreatedAtUtc IS NULL OR CreatedAtUtc = '';
                """;
            backfillCommand.Parameters.AddWithValue("$nowUtc", DateTimeOffset.UtcNow.ToString("O"));
            backfillCommand.ExecuteNonQuery();
        }

        await UpdateSchemaVersionAsync(dbPath, 2);
        _logger.LogInformation("Migration to v2 completed");
    }
}

HagiCode combines the provider pattern, factory pattern, and selector pattern to implement a flexible and extensible multi-AI provider architecture:

  • Unified interface abstraction: The IAIProvider interface hides the differences between CLIs
  • Dynamic instance creation: AIProviderFactory supports runtime creation of provider instances
  • Intelligent selection strategy: AIProviderSelector implements scenario-driven provider selection
  • Session state persistence: Database bindings ensure session continuity
  • Desktop integration: AgentCliManager supports user selection and configuration

The advantages of this architecture are:

  1. Extensibility: Adding a new AI provider only requires implementing the IAIProvider interface
  2. Testability: Providers can be tested and mocked independently
  3. Maintainability: Each provider implementation is isolated and has a single responsibility
  4. User-friendliness: Support both scenario-based automatic selection and manual switching

With this design, HagiCode successfully enables seamless switching and interoperability between Claude Code CLI and Codex CLI, giving developers a flexible and powerful AI coding assistant experience.


Thank you for reading. If you found this article useful, please click the like button below 👍 so more people can discover it.

This content was created with AI-assisted collaboration, reviewed by me, and reflects my own views and position.

From TypeScript to C#: Cross-Language Porting Practice for the Codex SDK



Put simply, this article records something of a labor of love: the full process of porting the official TypeScript Codex SDK to C#. Calling it a “port” almost makes it sound too easy; it was more like a long adventure, because these two languages have very different personalities, and we had to find a way to make them cooperate.

Codex is the AI Agent CLI tool released by OpenAI, and it is genuinely powerful. The official team provides a TypeScript SDK in the @openai/codex package. It interacts with the Codex CLI by calling the codex exec --experimental-json command and parsing a JSONL event stream.

The problem is that in the HagiCode project, we need to use it in a pure .NET environment - specifically in C# backend services and desktop applications. We could not reasonably introduce a Node.js runtime into a .NET project just to call a CLI tool. That would be far too cumbersome.

So we were left with two choices: maintain a complex Node.js bridge layer, or build a native C# SDK ourselves.

We chose the latter.

This article also comes directly from our hands-on experience in the HagiCode project. HagiCode is an open-source AI coding assistant project. In plain terms, that means maintaining multiple components at once: a VSCode extension on the frontend, AI services on the backend, and a cross-platform desktop client. That multi-language, multi-platform complexity is exactly why we needed a native C# SDK - we really did not want to run Node.js inside a .NET project.

If you find this article helpful, feel free to give us a star on GitHub: github.com/HagiCode-org/site. You can also visit the official website to learn more: hagicode.com. It is always encouraging when an open-source project receives support.

Before translating code one-to-one, we first had to understand the architectural design of both SDKs. You have to understand both sides before you can port them well.

The core architecture of the TypeScript SDK looks like this:

Codex (entry class)
└── CodexExec (executor, manages child processes)
    └── Thread (conversation thread)
        ├── run() / runStreamed() (synchronous/asynchronous execution)
        └── event stream parsing

The C# SDK keeps the same architectural layering, but adapts the implementation details. The overall idea is straightforward: preserve API consistency while fully leveraging C# language features in the implementation.

This is the most fundamental and also the most important part of the work. If the foundation is weak, everything that follows becomes harder.

TypeScript’s type system is more flexible than C#’s, and that is simply a fact. We needed to find an appropriate mapping strategy:

| TypeScript | C# | Notes |
| --- | --- | --- |
| interface / type | record | C# uses record for immutable data structures |
| string \| null | string? | Nullable reference type |
| boolean \| undefined | bool? | Nullable Boolean |
| AsyncGenerator | IAsyncEnumerable | Async iterator |
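To make the mapping concrete, here is what the TypeScript side of such a shape typically looks like, using a hypothetical TurnUsage type (not part of either SDK), with comments noting the C# counterpart from the table:

```typescript
// Hypothetical TurnUsage shape illustrating the TS constructs that map to C#.
interface TurnUsage {
  inputTokens: number;                // -> int
  cachedInputTokens: number | null;   // -> int? (nullable value type)
  modelName: string | null;           // -> string? (nullable reference type)
}

// AsyncGenerator -> IAsyncEnumerable<TurnUsage> on the C# side.
async function* usageStream(): AsyncGenerator<TurnUsage> {
  yield { inputTokens: 100, cachedInputTokens: null, modelName: 'gpt-4.1' };
}
```

On the C# side this interface would become an immutable record such as `record TurnUsage(int InputTokens, int? CachedInputTokens, string? ModelName)`, and consumers would iterate the stream with `await foreach`.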

The event type system is a typical example. TypeScript uses union types to define events:

export type ThreadEvent =
  | ThreadStartedEvent
  | TurnStartedEvent
  | TurnCompletedEvent
  | ...

In C#, we use an inheritance hierarchy and pattern matching to achieve a similar effect:

public abstract record ThreadEvent(string Type);
public sealed record ThreadStartedEvent(string ThreadId) : ThreadEvent("thread.started");
public sealed record TurnStartedEvent() : ThreadEvent("turn.started");
public sealed record TurnCompletedEvent(Usage Usage) : ThreadEvent("turn.completed");
// ...

We chose record instead of class because event objects should be immutable, which matches the intent behind using plain objects in TypeScript. The sealed keyword also prevents additional inheritance and gives the compiler room to optimize.

Event parsing is the core of the entire SDK, because it determines whether we can correctly understand every message returned by the Codex CLI. If parsing is wrong, everything after that is wasted effort.

The TypeScript version uses JSON.parse() to parse each line of JSON:

export function parseEvent(line: string): ThreadEvent {
  const data = JSON.parse(line);
  // Handle different event types...
}

The C# version uses System.Text.Json.JsonDocument instead:

public static ThreadEvent Parse(string line)
{
    using var document = JsonDocument.Parse(line);
    var root = document.RootElement;
    var type = GetRequiredString(root, "type", "event.type");
    return type switch
    {
        "thread.started" => new ThreadStartedEvent(GetRequiredString(root, "thread_id", ...)),
        "turn.started" => new TurnStartedEvent(),
        "turn.completed" => new TurnCompletedEvent(ParseUsage(...)),
        // ...
        _ => new UnknownThreadEvent(type, root.Clone()),
    };
}

There is one small but important trick here: root.Clone() is required because JsonElement instances become invalid once their parent JsonDocument is disposed, so we need to retain a copy for unknown event types. That is simply one of the differences between C# JSON handling and JavaScript, where JSON.parse returns ordinary objects with no such lifetime constraint.
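For contrast, here is a JavaScript-side sketch of the same unknown-event fallback (the KNOWN_TYPES set and event shape are hypothetical simplifications, not the SDK's real ones). Because JSON.parse returns a plain object, keeping the raw payload needs no Clone():

```typescript
// Hypothetical, simplified version of the unknown-event fallback.
// KNOWN_TYPES and the event shape are illustrative, not the SDK's.
type ParsedEvent =
  | { kind: "known"; type: string }
  | { kind: "unknown"; type: string; raw: unknown };

const KNOWN_TYPES = new Set(["thread.started", "turn.started", "turn.completed"]);

function parseLine(line: string): ParsedEvent {
  const data = JSON.parse(line) as { type: string };
  return KNOWN_TYPES.has(data.type)
    ? { kind: "known", type: data.type }
    // No Clone() needed: `data` is an ordinary object we can simply keep.
    : { kind: "unknown", type: data.type, raw: data };
}
```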

This is where the two SDKs differ the most. Node.js and .NET have different runtime conventions, so the implementation has to adapt.

TypeScript uses Node.js’s spawn() function:

const child = spawn(this.executablePath, commandArgs, { env, signal });

C# uses .NET’s System.Diagnostics.Process:

using var process = new Process { StartInfo = startInfo };
process.Start();
// stdin/stdout/stderr must be managed manually

More specifically, the C# version needs to configure the process like this:

var startInfo = new ProcessStartInfo
{
    FileName = _executablePath,
    RedirectStandardInput = true,
    RedirectStandardOutput = true,
    RedirectStandardError = true,
    UseShellExecute = false,
    CreateNoWindow = true,
};

The biggest difference is the cancellation mechanism. TypeScript uses AbortSignal, which is part of the Web API and very convenient to work with:

const child = spawn(cmd, args, { signal: cancellationSignal });

C# uses CancellationToken instead:

public async IAsyncEnumerable<string> RunAsync(
    CodexExecArgs args,
    [EnumeratorCancellation] CancellationToken cancellationToken = default)
{
    // Check cancellation status inside the loop
    while (!cancellationToken.IsCancellationRequested)
    {
        // Process output...
    }
    // Terminate the process when cancellation is requested
    if (cancellationToken.IsCancellationRequested)
    {
        try { process.Kill(entireProcessTree: true); } catch { }
    }
}
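To make the contrast concrete, here is a small self-contained sketch of the AbortSignal semantics on the TypeScript side (a timer stands in for the CLI process; this is not SDK code):

```typescript
// Minimal AbortSignal sketch: a timer stands in for the child process.
function abortableDelay(ms: number, signal: AbortSignal): Promise<string> {
  return new Promise((resolve, reject) => {
    const timer = setTimeout(() => resolve("completed"), ms);
    signal.addEventListener("abort", () => {
      clearTimeout(timer); // like process.Kill: stop the in-flight work
      reject(new Error("aborted"));
    });
  });
}
```

Calling controller.abort() plays the role of the CancellationToken being signalled; the practical difference is that AbortSignal pushes an event to listeners, while CancellationToken is typically polled via IsCancellationRequested (or observed through ThrowIfCancellationRequested).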

At a high level, this is just another example of the difference between the Web API ecosystem and the .NET ecosystem.

Both SDKs implement the logic that converts JSON configuration into TOML configuration, because the Codex CLI accepts configuration overrides in TOML format. This part must remain completely consistent, otherwise the same configuration will behave differently in the two SDKs.

That is the kind of detail you cannot compromise on. Success or failure often comes down to details like this.
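To illustrate the shape of that conversion, here is a minimal sketch of flattening a JSON config object into dotted-key TOML-style overrides. Everything here is an assumption for illustration: the dotted-key form, the scalar-only value handling, and the "o3" model name are ours, and the real SDKs cover more TOML value forms than this.

```typescript
// Minimal sketch of JSON -> TOML-style override flattening (assumption:
// dotted-key overrides like `sandbox.enabled=true`; scalars only).
function toTomlValue(value: unknown): string {
  if (typeof value === "string") return JSON.stringify(value); // TOML basic strings ≈ JSON strings
  if (typeof value === "number" || typeof value === "boolean") return String(value);
  throw new Error(`unsupported value in this sketch: ${String(value)}`);
}

function flattenConfig(obj: Record<string, unknown>, prefix = ""): string[] {
  const overrides: string[] = [];
  for (const [key, value] of Object.entries(obj)) {
    const path = prefix ? `${prefix}.${key}` : key;
    if (value !== null && typeof value === "object" && !Array.isArray(value)) {
      // Nested objects become dotted key paths.
      overrides.push(...flattenConfig(value as Record<string, unknown>, path));
    } else {
      overrides.push(`${path}=${toTomlValue(value)}`);
    }
  }
  return overrides;
}
```

Because both SDKs must emit byte-identical overrides for the same input, this is exactly the kind of function worth pinning down with a shared test fixture.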

We created the following project structure:

CodexSdk/
├── CodexSdk.csproj
├── Codex.cs # Entry class
├── CodexThread.cs # Conversation thread
├── CodexExec.cs # Executor
├── Events.cs # Event type definitions
├── Items.cs # Item type definitions
├── EventParser.cs # Event parser
├── OutputSchemaTempFile.cs # Temporary file management
└── ...

It is a fairly clean structure, and that helped a lot during the port.

The basic usage remains consistent with the TypeScript SDK:

using CodexSdk;
// Create a Codex instance
var codex = new Codex();
var thread = codex.StartThread();
// Execute a query
var result = await thread.RunAsync("Summarize this repository.");
Console.WriteLine(result.FinalResponse);

Streaming event handling takes advantage of C# pattern matching:

await foreach (var @event in thread.RunStreamedAsync("Analyze the code."))
{
    switch (@event)
    {
        case ItemCompletedEvent itemCompleted
            when itemCompleted.Item is AgentMessageItem msg:
            Console.WriteLine($"Assistant: {msg.Text}");
            break;
        case TurnCompletedEvent completed:
            Console.WriteLine($"Tokens: in={completed.Usage.InputTokens}");
            break;
        case CommandExecutionItem command:
            Console.WriteLine($"Command: {command.Command}");
            break;
    }
}

During implementation, we collected several practical lessons:

  1. Process management: The C# version must manage the full process lifecycle manually, including process termination during cancellation. Use Kill(entireProcessTree: true) to make sure child processes are also cleaned up.

  2. Error handling: We use InvalidOperationException to signal parsing errors, keeping the error-handling style close to the TypeScript SDK's.

  3. Resource cleanup: OutputSchemaTempFile implements IAsyncDisposable to ensure temporary files are cleaned up correctly.

  4. Environment variables: The C# version supports fully overriding process environment variables through CodexOptions.Env. It is a small feature, but a very practical one.

  5. Platform differences: The C# version does not include the TypeScript version’s logic for automatically locating binaries inside npm packages. Since .NET projects typically do not depend on npm, the path to the codex executable must be specified via the CODEX_EXECUTABLE environment variable or CodexPathOverride.
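Point 5 boils down to a resolution order, which can be sketched like this (the exact order shown is our assumption for illustration: explicit override first, then the environment variable, then a bare name left to PATH lookup):

```typescript
// Illustrative resolution order (an assumption, not the SDK's exact code):
// explicit override -> CODEX_EXECUTABLE -> plain "codex" left to PATH lookup.
function resolveCodexExecutable(pathOverride?: string): string {
  return pathOverride ?? process.env.CODEX_EXECUTABLE ?? "codex";
}
```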

Porting a mature TypeScript SDK to C# is not just a matter of syntax conversion; it also requires understanding the design philosophies of both languages. TypeScript’s flexibility and JavaScript ecosystem features such as AbortSignal need appropriate counterparts in C#.

The key takeaway is this: maintaining API consistency matters more than maintaining implementation-level consistency. Users care about whether the interface is easy to use, not whether the internal implementation is identical. That sounds simple, but making those trade-offs takes judgment.

If you are working on a similar cross-language port, our experience is to fully understand the original SDK architecture first, then translate it module by module, and finally use a complete test suite to ensure behavioral consistency. This kind of work cannot be rushed.

Everything will work out in the end.





Thank you for reading. If you found this article useful, please click the like button below so more people can discover it.

This content was created with AI-assisted collaboration, reviewed by the author, and reflects the author’s own views and position.