Blog

Full GLM-5.1 Support and Gemini CLI Integration: HagiCode's Path of Multi-Model Evolution

Mar 30, 2026

This article introduces two major recent updates to the HagiCode platform: full support for the Zhipu AI GLM-5.1 model and the successful integration of Gemini CLI as the tenth Agent CLI. Together, these updates further strengthen the platform’s multi-model capabilities and multi-CLI ecosystem.

Background

Time really does move fast. The development of large language models has been rising like bamboo in spring. Not long ago, we were still cheering for “an AI that can write code.” Now we are already in an era of multi-model collaboration and multi-tool integration. Is that exciting? Perhaps. After all, what developers need has never been just the tool itself, but the ease of adapting to different scenarios and switching flexibly when needed.

As an AI-assisted coding platform, HagiCode has recently welcomed two important developments: first, the full integration of Zhipu AI’s GLM-5.1 model; second, the official addition of Gemini CLI as the tenth supported Agent CLI. These two updates may not sound earth-shaking, but they are unquestionably good news for the platform’s continued maturation.

GLM-5.1 is Zhipu AI’s latest flagship model. Compared with GLM-5.0, it offers stronger reasoning, deeper code understanding, and smoother tool calling. More importantly, it is the first GLM entry in HagiCode’s model catalog with image input enabled. What does that mean? It means users can let the AI look directly at a screenshot instead of struggling to describe the problem in words. Once you’ve used that convenience, you immediately understand its value.

At the same time, through the HagiCode.Libs.Providers architecture, HagiCode successfully integrated Gemini CLI into the platform. This is now the tenth Agent CLI. To be honest, getting to this point does bring a modest sense of accomplishment.

It is also worth mentioning that HagiCode’s image upload feature lets users communicate with AI directly through screenshots. Even when running GLM 4.7, the platform still works well and has already helped complete many important build tasks. As for GLM-5.1, naturally, it goes one step further.

About HagiCode

The approach shared in this article comes from our practical experience in the HagiCode project. HagiCode is an open-source AI-assisted coding platform designed to provide developers with a flexible and powerful AI programming assistant through a multi-model, multi-CLI architecture. Project repository: github.com/HagiCode-org/site

Multi-CLI architecture design

One of HagiCode’s core strengths is its support for multiple AI programming CLI tools through a unified abstraction layer. The advantage of this design is actually quite simple: new tools can come in, old tools can stay, and the codebase does not turn into chaos. To be fair, that is how everyone would like life to work.

AIProviderType enum

The platform defines supported CLI provider types through the AIProviderType enum:

public enum AIProviderType
{
    ClaudeCodeCli = 0,    // Claude Code CLI
    CodexCli = 1,          // Codex CLI
    GitHubCopilot = 2,     // GitHub Copilot
    CodebuddyCli = 3,     // Codebuddy CLI
    OpenCodeCli = 4,      // OpenCode CLI
    IFlowCli = 5,         // IFlow CLI
    HermesCli = 6,        // Hermes CLI
    QoderCli = 7,         // Qoder CLI
    KiroCli = 8,          // Kiro CLI
    KimiCli = 9,          // Kimi CLI
    GeminiCli = 10,       // Gemini CLI (new)
}

As you can see, GeminiCli is added with the value 10. Because the enum starts at 0, it now lists eleven provider types in total, and the platform counts Gemini CLI as its tenth Agent CLI. Each CLI has its own distinct characteristics and usage scenarios, so users can choose flexibly based on their needs. After all, many roads lead to Rome; some are simply easier than others.

Provider architecture

HagiCode.Libs.Providers provides a unified Provider interface that makes each CLI integration standardized and concise. Taking Gemini CLI as an example:

public class GeminiProvider : ICliProvider<GeminiOptions>
{
    private static readonly string[] DefaultExecutableCandidates = ["gemini", "gemini-cli"];
    private const string ManagedBootstrapArgument = "--acp";

    public string Name => "gemini";
    public bool IsAvailable => _executableResolver.ResolveFirstAvailablePath(DefaultExecutableCandidates) is not null;
}

The benefits of this design are:

Integrating a new CLI only requires implementing one Provider class
Unified lifecycle management and session pooling
Automated alias resolution and executable discovery

Put plainly, this design turns complicated things into simpler ones and makes life a bit easier.
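
To make the executable discovery step concrete, here is a minimal TypeScript sketch of a "first available candidate wins" lookup. It is not the HagiCode.Libs.Providers implementation: the ExecutableLookup callback, the simulated PATH, and the install path are all assumptions made up for illustration. Only the candidate names gemini and gemini-cli come from the Provider shown above.

```typescript
// Hypothetical lookup signature: returns the absolute path of an executable, or null.
type ExecutableLookup = (executableName: string) => string | null;

// Return the path of the first candidate the lookup can find, or null if none exist.
function resolveFirstAvailablePath(
  candidates: readonly string[],
  lookup: ExecutableLookup,
): string | null {
  for (const candidate of candidates) {
    const resolved = lookup(candidate);
    if (resolved !== null) {
      return resolved; // earlier candidates take priority
    }
  }
  return null;
}

// Simulated PATH for the demo: only the "gemini-cli" alias is installed.
const simulatedPath = new Map<string, string>([
  ["gemini-cli", "/usr/local/bin/gemini-cli"],
]);
const lookupInSimulatedPath: ExecutableLookup = (executableName) =>
  simulatedPath.get(executableName) ?? null;

const geminiExecutablePath = resolveFirstAvailablePath(
  ["gemini", "gemini-cli"],
  lookupInSimulatedPath,
);
const geminiIsAvailable = geminiExecutablePath !== null;
```

Injecting the lookup as a callback keeps the resolution order testable without touching the real file system.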

Provider Registry

The Provider Registry automatically handles alias mapping and registration:

if (provider is GeminiProvider)
{
    registry.Register(provider.Name, provider, ["gemini-cli"]);
    continue;
}

This means users can invoke Gemini CLI with either gemini or gemini-cli, and the system will recognize it automatically. It is like a friend with both a formal name and a nickname - either way, people know who you mean.
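
The alias behavior can be sketched as a small map-backed registry. This TypeScript version is an illustration under stated assumptions, not the actual Provider Registry: the AliasRegistry class, its method names, and the case-insensitive matching are all hypothetical choices.

```typescript
// Hypothetical registry: every primary name and alias points at the same provider object.
class AliasRegistry<T> {
  private readonly providersByKey = new Map<string, T>();

  register(primaryName: string, provider: T, aliases: readonly string[] = []): void {
    for (const key of [primaryName, ...aliases]) {
      // Assumption: names are matched case-insensitively after trimming.
      this.providersByKey.set(key.trim().toLowerCase(), provider);
    }
  }

  resolve(nameOrAlias: string): T | undefined {
    return this.providersByKey.get(nameOrAlias.trim().toLowerCase());
  }
}

// Stand-in provider object for the demo.
const geminiProviderStub = { displayName: "Gemini CLI" };

const providerRegistrySketch = new AliasRegistry<{ displayName: string }>();
providerRegistrySketch.register("gemini", geminiProviderStub, ["gemini-cli"]);
```

With this shape, resolve("gemini") and resolve("gemini-cli") return the same object, while an unregistered name returns undefined.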

GLM-5.1 model support

GLM-5.1 is Zhipu AI’s latest flagship model, and HagiCode has completed full support for it.

Secondary Professions Catalog

HagiCode manages all supported models through the Secondary Professions Catalog. Here is the configuration for the GLM series:

Model ID	Name	SupportsImage	Compatible CLI Families
`glm-4.7`	GLM 4.7	-	claude, codebuddy, hermes, qoder, kiro
`glm-5`	GLM 5	-	claude, codebuddy, hermes, qoder, kiro
`glm-5-turbo`	GLM 5 Turbo	-	claude, codebuddy, hermes, qoder, kiro
`glm-5.0`	GLM 5.0 (Legacy)	-	claude, codebuddy, hermes, qoder, kiro
`glm-5.1`	GLM 5.1	true	claude, codebuddy, hermes, qoder, kiro

The key characteristics of GLM-5.1 can be summarized as follows:

A standalone version identifier with no legacy baggage
The first GLM model in the catalog to support image input
Stronger reasoning and code understanding
Broad multi-CLI compatibility

GLM-5.1 vs GLM-5.0

At the code level, the key difference between GLM-5.1 and GLM-5.0 is shown here:

// GLM-5.0 (Legacy) - contains special retention logic
private const string Glm50CodebuddySecondaryProfessionId = "secondary-glm-5-codebuddy";
private const string Glm50CodebuddyModelValue = "glm-5.0";

// GLM-5.1 - standalone new model identifier
private const string Glm51SecondaryProfessionId = "secondary-glm-5-1";
private const string Glm51ModelValue = "glm-5.1";

GLM-5.0 carries the “Legacy” label because it is an old version identifier retained for backward compatibility. GLM-5.1, by contrast, is a brand-new standalone version with no historical burden. Some things stay in the past; others travel lighter and move faster.

Configure GLM-5.1

Here is a configuration example for using GLM-5.1 in HagiCode:

{
  "primaryProfessionId": "profession-claude-code",
  "secondaryProfessionId": "secondary-glm-5-1",
  "model": "glm-5.1",
  "reasoning": "high"
}

Image upload feature

HagiCode’s image support is implemented through the SupportsImage property on SecondaryProfession:

public class HeroSecondaryProfessionSettingDto
{
    public bool SupportsImage { get; set; }
}

In the Secondary Professions Catalog, the GLM-5.1 configuration looks like this:

{
  "id": "secondary-glm-5-1",
  "supportsImage": true
}

This means users can upload screenshots directly for AI analysis, such as:

Screenshots of error messages
Problems in a UI screen
Data visualization charts
Code execution results

There is no longer any need to describe everything manually - just upload the screenshot. The convenience of this feature is obvious once you have used it. Sometimes one look says more than a long explanation.
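
A client could use the supportsImage flag as a simple gate before accepting attachments. The TypeScript sketch below shows one way that check might look. The function name, the error message, and the secondary-glm-4-7 ID are hypothetical; only secondary-glm-5-1 and the supportsImage field come from the catalog entry above.

```typescript
interface SecondaryProfessionFlags {
  id: string;
  supportsImage?: boolean; // a missing flag is treated as "no image support"
}

// Returns null when the upload is allowed, otherwise a human-readable reason.
function checkImageUpload(
  profession: SecondaryProfessionFlags,
  imageCount: number,
): string | null {
  if (imageCount === 0 || profession.supportsImage === true) {
    return null;
  }
  return `Model "${profession.id}" does not accept image input (${imageCount} image(s) attached).`;
}

const glm51Flags: SecondaryProfessionFlags = { id: "secondary-glm-5-1", supportsImage: true };
// Hypothetical ID for a model without the flag.
const glm47Flags: SecondaryProfessionFlags = { id: "secondary-glm-4-7" };
```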

Gemini CLI integration

As the tenth Agent CLI, Gemini CLI is integrated into HagiCode through the standard Provider architecture.

Configuration options

Gemini CLI supports a rich set of configuration options:

public class GeminiOptions
{
    public string? ExecutablePath { get; set; }
    public string? WorkingDirectory { get; set; }
    public string? SessionId { get; set; }
    public string? Model { get; set; }
    public string? AuthenticationMethod { get; set; }
    public string? AuthenticationToken { get; set; }
    public Dictionary<string, string?> AuthenticationInfo { get; set; }
    public Dictionary<string, string?> EnvironmentVariables { get; set; }
    public string[] ExtraArguments { get; set; }
    public TimeSpan? StartupTimeout { get; set; }
    public CliPoolSettings? PoolSettings { get; set; }
}

These options cover everything from basic setup to advanced features, giving users the flexibility to configure the CLI around their own needs. Everyone’s workflow is different, so a little flexibility is always welcome.

ACP communication protocol

Gemini CLI supports ACP (Agent Client Protocol), which HagiCode uses as its unified CLI communication standard. Through ACP, different CLIs can interact with the platform in a consistent way, which greatly simplifies integration work. In short, it standardizes the complicated parts so everyone can work more easily.

Environment configuration

To use Zhipu AI models, you need to configure the corresponding environment variables.

Zhipu AI ZAI platform

export ANTHROPIC_AUTH_TOKEN="***"
export ANTHROPIC_BASE_URL="https://open.bigmodel.cn/api/anthropic"

Alibaba Cloud DashScope

export ANTHROPIC_AUTH_TOKEN="your-a...-key"
export ANTHROPIC_BASE_URL="https://coding.dashscope.aliyuncs.com/apps/anthropic"

Once configured, HagiCode can call the GLM-5.1 model normally. It is neither especially hard nor especially easy - you just need to follow the setup as intended.

HagiCode’s own build practice

Speaking of real-world practice, the best example is the HagiCode platform’s own build workflow. HagiCode’s development process has already made full use of AI capabilities.

Works well even with GLM 4.7

HagiCode’s platform design is well optimized, so it can still provide a good development experience even with GLM 4.7. The platform has already helped complete multiple important build projects, including:

Integration of multiple CLI Providers
Implementation of the image upload feature
Documentation generation and content publishing

That is actually a good thing. Not everyone needs the newest thing all the time. What suits you best is often what matters most.

GLM-5.1 delivers more with less effort

After upgrading to GLM-5.1, these capabilities become even stronger:

Stronger code understanding, reducing back-and-forth communication
More accurate dependency analysis, pointing in the right direction immediately
More efficient error diagnosis, locating issues faster
Support for image input, accelerating problem descriptions

It is like switching from a bicycle to a car. You can still reach the same destination, but the speed and comfort are not the same.

Best practices for multi-CLI integration

HagiCode.Libs.Providers provides a unified mechanism for registration and usage:

services.AddHagiCodeLibs();

var gemini = serviceProvider.GetRequiredService<ICliProvider<GeminiOptions>>();
var codebuddy = serviceProvider.GetRequiredService<ICliProvider<CodebuddyOptions>>();
var hermes = serviceProvider.GetRequiredService<ICliProvider<HermesOptions>>();

This dependency injection design keeps usage across different CLIs very concise and also makes unit testing and mocking more convenient. Clean code is a way of being responsible to yourself.

Notes

There are a few things to keep in mind in actual use:

API key configuration: Make sure ANTHROPIC_AUTH_TOKEN is set correctly, or the model cannot be called
Model availability: GLM-5.1 needs to be enabled by the corresponding model provider
Image feature: Only models with supportsImage: true can use image upload
CLI installation: Before using Gemini CLI, make sure gemini or gemini-cli is in the system PATH

These may be small details, but small details handled poorly can turn into big problems, so they are worth paying attention to.
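
Two of the items on this checklist lend themselves to an automated preflight step. The following TypeScript sketch is only an illustration: the PreflightContext shape and the function name are assumptions, and the environment and PATH contents are passed in as plain data rather than read from the real system.

```typescript
interface PreflightContext {
  env: Record<string, string | undefined>; // snapshot of environment variables
  usesGeminiCli: boolean;                   // whether the selected CLI is Gemini CLI
  executablesOnPath: readonly string[];     // executable names discovered on PATH
}

// Collect every problem instead of stopping at the first one.
function collectPreflightProblems(context: PreflightContext): string[] {
  const problems: string[] = [];

  const token = context.env["ANTHROPIC_AUTH_TOKEN"];
  if (token === undefined || token.trim() === "") {
    problems.push("ANTHROPIC_AUTH_TOKEN is not set");
  }

  if (context.usesGeminiCli) {
    const found = context.executablesOnPath.some(
      (executable) => executable === "gemini" || executable === "gemini-cli",
    );
    if (!found) {
      problems.push("neither gemini nor gemini-cli was found on PATH");
    }
  }

  return problems;
}
```

Returning a list of problems lets a UI show all missing prerequisites at once.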

Conclusion

With full support for GLM-5.1 and the successful integration of Gemini CLI, HagiCode further strengthens its capabilities as a multi-model, multi-CLI AI programming platform. These updates not only give users more choices, but also demonstrate HagiCode’s forward-looking architecture and scalability.

GLM-5.1’s image support, combined with HagiCode’s screenshot upload feature, makes it possible to let the AI “understand from the image” and greatly reduces the cost of describing problems. And with support for ten CLIs, users can flexibly choose the AI programming assistant that best fits their preferences and scenarios. More choice is almost always a good thing.

Most importantly, HagiCode’s own build practice proves that the platform can already run well and complete complex tasks even with GLM 4.7, while upgrading to GLM-5.1 can further improve development efficiency. Life is often like that too: you do not always need the absolute best, only what suits you. Of course, if what suits you can become even better, then so much the better.

If you are interested in a multi-model, multi-CLI AI programming platform, give HagiCode a try - open source, free, and still evolving. Trying it costs nothing, and it may turn out to be exactly what you need.

References

If this article helped you:

Give it a Star on GitHub: github.com/HagiCode-org/site
Visit the official website to learn more: hagicode.com
Watch the 30-minute hands-on demo: www.bilibili.com/video/BV1pirZBuEzq/
Try the one-click installation: docs.hagicode.com/installation/docker-compose
Public beta has started, and you are welcome to install and try it

Copyright notice

Thank you for reading. If you found this article useful, feel free to like, bookmark, and share it to show your support. This content was created with AI-assisted collaboration, and the final content was reviewed and confirmed by the author.

Author: newbe36524
Original link: https://docs.hagicode.com/blog/2026-03-30-hagicode-glm-5-1-gemini-cli-update/
Copyright: Unless otherwise stated, all blog posts on this site are licensed under BY-NC-SA. Please cite the source when reposting.

Hagicode and GLM-5.1 Multi-CLI Integration Guide

Mar 28, 2026

Background

In the Hagicode project, users can choose from multiple CLI tools to drive AI programming assistants, including Claude Code CLI, GitHub Copilot, OpenCode CLI, Codebuddy CLI, Hermes CLI, and more. These CLI tools are general-purpose AI programming tools on their own, but through Hagicode’s abstraction layer, they can flexibly connect to different AI model providers.

Zhipu AI (ZAI) provides an interface compatible with the Anthropic Claude API, allowing these CLI tools to use Zhipu’s GLM series models directly. Among them, GLM-5.1 is Zhipu’s latest large language model release, with significant improvements over GLM-5.0.

Hagicode’s CLI abstraction architecture

Hagicode defines 11 CLI provider types through the AIProviderType enum, covering mainstream AI programming CLI tools:

public enum AIProviderType
{
    ClaudeCodeCli = 0,    // Claude Code CLI
    CodexCli = 1,          // Codex CLI
    GitHubCopilot = 2,     // GitHub Copilot
    CodebuddyCli = 3,     // Codebuddy CLI
    OpenCodeCli = 4,      // OpenCode CLI
    IFlowCli = 5,         // IFlow CLI
    HermesCli = 6,        // Hermes CLI
    QoderCli = 7,         // Qoder CLI
    KiroCli = 8,          // Kiro CLI
    KimiCli = 9,          // Kimi CLI
    GeminiCli = 10,       // Gemini CLI
}

Each CLI has a corresponding set of managed model parameters. Most CLIs support both the model and reasoning parameters, while Gemini CLI supports only model:

private static readonly IReadOnlyDictionary<AIProviderType, IReadOnlyList<string>> ManagedModelParameterKeysByProvider =
    new Dictionary<AIProviderType, IReadOnlyList<string>>
    {
        [AIProviderType.ClaudeCodeCli] = ["model", "reasoning"],
        [AIProviderType.CodexCli] = ["model", "reasoning"],
        [AIProviderType.OpenCodeCli] = ["model", "reasoning"],
        [AIProviderType.HermesCli] = ["model", "reasoning"],
        [AIProviderType.CodebuddyCli] = ["model", "reasoning"],
        [AIProviderType.QoderCli] = ["model", "reasoning"],
        [AIProviderType.KiroCli] = ["model", "reasoning"],
        [AIProviderType.GeminiCli] = ["model"],  // Gemini does not support the reasoning parameter
        // ...
    };
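
To show how such a table might be applied, here is a hedged TypeScript sketch that drops any parameter a provider does not manage. The function name, the two-provider subset, and the example-model value are assumptions for illustration. Whether Hagicode filters parameters in exactly this way is not confirmed by the snippet above.

```typescript
type ProviderKey = "ClaudeCodeCli" | "GeminiCli";

// Subset of the table above, mirrored in TypeScript for the demo.
const managedKeysByProvider: Record<ProviderKey, readonly string[]> = {
  ClaudeCodeCli: ["model", "reasoning"],
  GeminiCli: ["model"],
};

// Keep only the parameters the given provider is known to manage.
function filterManagedParameters(
  provider: ProviderKey,
  requested: Record<string, string>,
): Record<string, string> {
  const allowedKeys = new Set(managedKeysByProvider[provider]);
  const filtered: Record<string, string> = {};
  for (const [key, value] of Object.entries(requested)) {
    if (allowedKeys.has(key)) {
      filtered[key] = value;
    }
  }
  return filtered;
}

// "example-model" is a placeholder, not a real model ID.
const requestedParameters = { model: "example-model", reasoning: "high" };
const geminiParameters = filterManagedParameters("GeminiCli", requestedParameters);
const claudeParameters = filterManagedParameters("ClaudeCodeCli", requestedParameters);
```

Here geminiParameters keeps only model, while claudeParameters keeps both keys.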

GLM model support system

Hagicode’s Secondary Professions Catalog defines complete support for the GLM model series:

Model ID	Name	Default Reasoning	Compatible CLI Families
`glm-4.7`	GLM 4.7	high	claude, codebuddy, hermes, qoder, kiro
`glm-5`	GLM 5	high	claude, codebuddy, hermes, qoder, kiro
`glm-5-turbo`	GLM 5 Turbo	high	claude, codebuddy, hermes, qoder, kiro
`glm-5.0`	GLM 5.0 (Legacy)	high	claude, codebuddy, hermes, qoder, kiro
`glm-5.1`	GLM 5.1	high	claude, codebuddy, hermes, qoder, kiro

Key differences between GLM-5.1 and GLM-5.0

From the implementation in AcpSessionModelBootstrapper.cs, we can clearly see the differences between GLM-5.1 and GLM-5.0:

Standalone implementation of GLM-5.1

GLM-5.1 is a standalone new model identifier with no legacy handling logic:

private const string Glm51ModelValue = "glm-5.1";

Definition in the Secondary Professions Catalog:

{
  "id": "secondary-glm-5-1",
  "name": "GLM 5.1",
  "family": "anthropic",
  "summary": "hero.professionCopy.secondary.glm51.summary",
  "sourceLabel": "hero.professionCopy.sources.aiSharedAnthropicModel",
  "sortOrder": 64,
  "supportsImage": true,
  "compatiblePrimaryFamilies": [
    "claude",
    "codebuddy",
    "hermes",
    "qoder",
    "kiro"
  ],
  "defaultParameters": {
    "model": "glm-5.1",
    "reasoning": "high"
  }
}
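
A consumer of this catalog entry would typically check compatiblePrimaryFamilies before offering the model for a given CLI. The TypeScript sketch below illustrates that check. The interface and function names are hypothetical, exact string matching is an assumption, and only the fields needed for the check are copied from the entry above.

```typescript
interface CatalogEntryFamilies {
  id: string;
  compatiblePrimaryFamilies: readonly string[];
}

// Values copied from the catalog entry for GLM 5.1.
const glm51Families: CatalogEntryFamilies = {
  id: "secondary-glm-5-1",
  compatiblePrimaryFamilies: ["claude", "codebuddy", "hermes", "qoder", "kiro"],
};

// True when the entry lists the given primary CLI family.
function isCompatibleWithPrimary(
  entry: CatalogEntryFamilies,
  primaryFamily: string,
): boolean {
  return entry.compatiblePrimaryFamilies.includes(primaryFamily);
}
```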

Model provider configuration

Zhipu AI (ZAI)

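Zhipu AI provides the most complete GLM model support. In the provider entry below, the name and description values are in Chinese and read "Zhipu AI" and "Claude API-compatible service provided by Zhipu AI":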

{
  "providerId": "zai",
  "name": "智谱 AI",
  "description": "智谱 AI 提供的 Claude API 兼容服务",
  "category": "china-providers",
  "apiUrl": {
    "codingPlanForAnthropic": "https://open.bigmodel.cn/api/anthropic"
  },
  "recommended": true,
  "region": "cn",
  "defaultModels": {
    "sonnet": "glm-4.7",
    "opus": "glm-5",
    "haiku": "glm-4.5-air"
  },
  "supportedModels": [
    "glm-4.7",
    "glm-5",
    "glm-4.5-air",
    "qwen3-coder-next",
    "qwen3-coder-plus"
  ],
  "features": ["experimental-agent-teams"],
  "authTokenEnv": "ANTHROPIC_AUTH_TOKEN",
  "referralUrl": "https://www.bigmodel.cn/claude-code?ic=14BY54APZA",
  "documentationUrl": "https://open.bigmodel.cn/dev/api"
}

Features:

Supports the widest variety of GLM model variants
Provides default mapping across the Sonnet/Opus/Haiku tiers
Supports the experimental-agent-teams feature

Using GLM-5.1 in different CLIs

1. Claude Code CLI + GLM-5.1

Claude Code CLI is one of Hagicode’s core CLIs and is configured through the Hero configuration system:

{
  "primaryProfessionId": "profession-claude-code",
  "secondaryProfessionId": "secondary-glm-5-1",
  "model": "glm-5.1",
  "reasoning": "high"
}

Corresponding HeroEquipmentCatalogItem configuration:

{
  id: 'secondary-glm-5-1',
  name: 'GLM 5.1',
  family: 'anthropic',
  kind: 'model',
  primaryFamily: 'claude',
  compatiblePrimaryFamilies: ['claude', 'codebuddy', 'hermes', 'qoder', 'kiro'],
  defaultParameters: {
    model: 'glm-5.1',
    reasoning: 'high'
  }
}

2. OpenCode CLI + GLM-5.1

OpenCode CLI is the most flexible CLI and supports specifying any model in the provider/model format:

Method 1: Use the ZAI provider prefix

{
  "primaryProfessionId": "profession-opencode",
  "model": "zai/glm-5.1",
  "reasoning": "high"
}

Method 2: Use the model ID directly

{
  "model": "glm-5.1"
}

Method 3: Frontend configuration UI

In HeroModelEquipmentForm.tsx, OpenCode CLI has a dedicated placeholder hint:

const OPEN_CODE_MODEL_PLACEHOLDER = 'myprovider/glm-4.7';

const modelPlaceholder = primaryProviderType === PCode_Models_AIProviderType.OPEN_CODE_CLI
  ? OPEN_CODE_MODEL_PLACEHOLDER
  : 'gpt-5.4';

Users can enter:

zai/glm-5.1
glm-5.1

OpenCode CLI model parsing logic:

internal OpenCodeModelSelection? ResolveModelSelection(string? rawModel)
{
    var normalized = NormalizeOptionalValue(rawModel);
    if (normalized == null) return null;

    var slashIndex = normalized.IndexOf('/');
    if (slashIndex < 0)
    {
        // No slash: use the model ID directly
        return new OpenCodeModelSelection {
            ProviderId = string.Empty,
            ModelId = normalized,
        };
    }

    // Slash exists: parse the provider/model format
    var providerId = normalized[..slashIndex].Trim();
    var modelId = normalized[(slashIndex + 1)..].Trim();

    return new OpenCodeModelSelection {
        ProviderId = providerId,
        ModelId = modelId,
    };
}
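
For readers who want to experiment with the parsing rule outside the C# codebase, here is a TypeScript port of the same logic. It is a sketch, not project code: the names are hypothetical, and it assumes NormalizeOptionalValue trims the input and treats blank strings as missing, which the snippet above does not show.

```typescript
interface ModelSelectionSketch {
  providerId: string; // empty string when no provider prefix was given
  modelId: string;
}

function resolveModelSelection(
  rawModel: string | null | undefined,
): ModelSelectionSketch | null {
  // Assumed behavior of NormalizeOptionalValue: trim, and treat blank as missing.
  const normalized = (rawModel ?? "").trim();
  if (normalized === "") {
    return null;
  }

  const slashIndex = normalized.indexOf("/");
  if (slashIndex < 0) {
    // No slash: the whole value is the model ID.
    return { providerId: "", modelId: normalized };
  }

  // Split on the first slash only, so model IDs may contain further slashes.
  return {
    providerId: normalized.slice(0, slashIndex).trim(),
    modelId: normalized.slice(slashIndex + 1).trim(),
  };
}
```

Calling resolveModelSelection("zai/glm-5.1") separates the provider zai from the model glm-5.1, while a bare "glm-5.1" yields an empty provider ID.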

3. Codebuddy CLI + GLM-5.1

Codebuddy CLI is configured the same way, with one exception in how legacy model IDs are handled:

{
  "primaryProfessionId": "profession-codebuddy",
  "model": "glm-5.1",
  "reasoning": "high"
}

Note: for every provider except Codebuddy, the legacy glm-5.0 identifier is normalized to glm-5-turbo. Codebuddy is exempt from this normalization and keeps glm-5.0 as-is:

return !string.Equals(providerName, "CodebuddyCli", StringComparison.OrdinalIgnoreCase)
       && string.Equals(normalizedModel, LegacyGlm5TurboModelValue, StringComparison.OrdinalIgnoreCase)
    ? Glm5TurboModelValue
    : normalizedModel;
// For CodebuddyCli, glm-5.0 is not normalized to glm-5-turbo
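
The same rule, restated as a standalone TypeScript function for experimentation. The constant values are inferred from the comment in the snippet above (glm-5.0 as the legacy value, glm-5-turbo as the current one) and should be treated as assumptions. The function name is hypothetical.

```typescript
// Assumed values, inferred from the C# comment rather than from the constant definitions.
const LEGACY_GLM5_TURBO_MODEL = "glm-5.0";
const GLM5_TURBO_MODEL = "glm-5-turbo";

function normalizeModelForProvider(providerName: string, model: string): string {
  const isCodebuddy = providerName.toLowerCase() === "codebuddycli";
  const isLegacyValue = model.toLowerCase() === LEGACY_GLM5_TURBO_MODEL;
  // Only non-Codebuddy providers have the legacy value rewritten.
  return !isCodebuddy && isLegacyValue ? GLM5_TURBO_MODEL : model;
}
```

Note that glm-5.1 passes through unchanged for every provider, which matches its description as a standalone identifier with no legacy handling.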

Environment variable configuration

Using Zhipu AI ZAI

# Set the API key
export ANTHROPIC_AUTH_TOKEN="***"

# Optional: specify the API endpoint (ZAI uses this endpoint by default)
export ANTHROPIC_BASE_URL="https://open.bigmodel.cn/api/anthropic"

Using Alibaba Cloud DashScope

# Set the API key
export ANTHROPIC_AUTH_TOKEN="your-a...-key"

# Specify the Alibaba Cloud endpoint
export ANTHROPIC_BASE_URL="https://coding.dashscope.aliyuncs.com/apps/anthropic"

Get an API key

Zhipu AI: https://www.bigmodel.cn/claude-code?ic=14BY54APZA
Alibaba Cloud: https://www.aliyun.com/benefit/ai/aistar?userCode=vmx5szbq

Improvements in GLM-5.1

Compared with GLM-5.0, GLM-5.1 brings the following significant improvements:

1. Better reasoning capability

According to Zhipu’s official release information, improvements in GLM-5.1 include:

Stronger code understanding: More accurate analysis of complex code structures
Longer context comprehension: Supports longer conversational context
Enhanced tool calling: Higher success rate for MCP tool calls
Output stability: Reduces randomness and hallucinations

2. Comprehensive multi-CLI compatibility

GLM-5.1 is declared compatible with five of the CLI families supported by Hagicode:

compatiblePrimaryFamilies: [
  "claude",      // Claude Code CLI
  "codebuddy",   // Codebuddy CLI
  "hermes",      // Hermes CLI
  "qoder",       // Qoder CLI
  "kiro"         // Kiro CLI
]

Notes

1. API key configuration

Make sure the ANTHROPIC_AUTH_TOKEN environment variable is set correctly. It is the required credential for every CLI to connect to the model.

2. Model availability

GLM-5.1 needs to be enabled by the corresponding model provider:

The Zhipu AI ZAI platform supports it by default
Alibaba Cloud DashScope may require a separate application

3. OpenCode CLI format

When using the provider/model format, make sure the provider ID is correct:

Zhipu AI: zai or zhipuai
Alibaba Cloud: aliyun or dashscope

4. Reasoning parameter

high is recommended for the best code generation results
Gemini CLI does not support the reasoning parameter and will ignore this configuration automatically

Conclusion

Through a unified abstraction layer, Hagicode enables flexible integration between GLM-5.1 and multiple CLIs. Developers can choose the CLI tool that best fits their preferences and usage scenarios, then use the latest GLM-5.1 model through simple configuration.

As Zhipu’s latest model version, GLM-5.1 offers clear improvements over GLM-5.0:

An independent version identifier with no legacy burden
Stronger reasoning and code understanding
Broad multi-CLI compatibility
Flexible reasoning level configuration

With the correct environment variables and Hero equipment configured, users can fully unlock the power of GLM-5.1 across different CLI environments.

Continue With HagiCode

If you want to put GLM-5.1, multi-CLI orchestration, and HagiCode’s configuration model into real use, these are the fastest entry points:

Track the main project and latest implementation progress on GitHub: github.com/HagiCode-org/site
Visit the official site to understand the product direction, capability boundaries, and install options: hagicode.com
Start with the Docker Compose guide, then switch models and CLIs in a real environment: docs.hagicode.com/installation/docker-compose
If you prefer a local desktop workflow, begin with the Desktop entry point: hagicode.com/desktop/

Once you compare Kimi, Claude Code, OpenCode, and other CLIs inside the same abstraction layer, questions about model switching, parameter mapping, and engineering boundaries tend to become much easier to reason about.

HagiCode Desktop Hybrid Distribution Architecture Explained: How P2P Accelerates Large File Downloads

Mar 27, 2026

I held this article back for a long time before finally writing it, and I am still not sure whether it reads well. Technical writing is easy enough to produce, but hard to make truly engaging. Then again, I am no great literary master, so I might as well just set down this plain explanation.

Background

Teams building desktop applications will all run into the same headache sooner or later: how do you distribute large files?

It is an awkward problem. Traditional HTTP/HTTPS direct downloads can still hold up when files are small and the number of users is limited. But time is rarely kind. As a project keeps growing, the installation packages grow with it: Desktop ZIP packages, portable packages, web deployment archives, and more. Then the issues start to surface:

Download speed is limited by origin bandwidth: no matter how much bandwidth a single server has, it still struggles when everyone downloads at once.
Resume support is often missing: unless both the client and the server handle HTTP range requests, an interrupted download has to start over from the beginning. That wastes both time and bandwidth.
The origin server takes all the pressure: all traffic flows back to a central server, bandwidth costs keep rising, and scalability becomes a real problem.

The HagiCode Desktop project was no exception. When we designed the distribution system, we kept asking ourselves: can we introduce a hybrid distribution approach without changing the existing index.json control plane? In other words, can we use the distributed nature of P2P networks to accelerate downloads while still keeping HTTP origin fallback so the system remains usable in constrained environments such as enterprise networks?

The impact of that decision turned out to be larger than you might expect. Let us walk through it step by step.

About HagiCode

The approach shared in this article comes from our real-world experience in the HagiCode project. HagiCode is an open-source AI coding assistant project focused on helping development teams improve engineering efficiency. The project spans multiple subsystems, including the frontend, backend, desktop launcher, documentation, build pipeline, and server deployment.

The Desktop hybrid distribution architecture is exactly the kind of solution HagiCode refined through real operational experience and repeated optimization. If this design proves useful, then perhaps it also shows that HagiCode itself is worth paying attention to.

The project’s GitHub repository is HagiCode-org/site. If it interests you, feel free to give it a Star and save it for later.

Core Design Philosophy: P2P First, HTTP Fallback

At its heart, the hybrid distribution model can be summarized in a single sentence: P2P first, HTTP fallback.

The key lies in the word “hybrid.” This is not about simply adding BitTorrent and calling it a day. The point is to make the two delivery methods work together and complement each other:

The P2P network provides distributed acceleration. The more people download, the more peers join, and the faster the transfer becomes.
WebSeed/HTTP fallback guarantees availability, so downloads can still work in enterprise firewalls and internal network environments.
The control plane remains simple. We do not change the core logic of index.json; we only add a few optional metadata fields.

The real benefit is straightforward: users feel that “downloads are faster,” while the engineering team does not have to shoulder too much extra complexity. After all, the BT protocol is already mature, and there is little reason to reinvent the wheel.

Architecture Design

Layered Architecture Overview

Let us start with the overall architecture diagram to build a high-level mental model:

┌─────────────────────────────────────┐
│     Renderer (UI layer)             │
├─────────────────────────────────────┤
│     IPC/Preload (bridge layer)      │
├─────────────────────────────────────┤
│   VersionManager (version manager)  │
├─────────────────────────────────────┤
│ HybridDownloadCoordinator (coord.)  │
│  ├── DistributionPolicyEvaluator    │
│  ├── DownloadEngineAdapter          │
│  ├── CacheRetentionManager          │
│  └── SHA256 Verifier                │
├─────────────────────────────────────┤
│   WebTorrent (download engine)      │
└─────────────────────────────────────┘

As the diagram shows, the system uses a layered design. The reason for separating responsibilities this clearly is simple: testability and replaceability.

The UI layer is responsible for displaying download progress and the sharing acceleration toggle. It is the surface.
The coordination layer is the core. It contains policy evaluation, engine adaptation, cache management, and integrity verification.
The engine layer encapsulates the concrete download implementation. At the moment, it uses WebTorrent.

The engine layer is abstracted behind the DownloadEngineAdapter interface. If we ever want to swap in a different BT engine later, or move the implementation into a sidecar process, that becomes much easier.

Separation of Control Plane and Data Plane

HagiCode Desktop keeps index.json as the sole control plane, and that design is critical. The control plane is responsible for version discovery, channel selection, and centralized policy, while the data plane is where the actual file transfer happens.

The new fields added to index.json are optional:

{
  "asset": {
    "torrentUrl": "https://cdn.example.com/app.torrent",
    "infoHash": "abc123...",
    "webSeeds": [
      "https://cdn.example.com/app.zip",
      "https://backup.example.com/app.zip"
    ],
    "sha256": "def456...",
    "directUrl": "https://cdn.example.com/app.zip"
  }
}

If any of these fields is missing, the client falls back to the traditional HTTP download mode. The advantage of this design is backward compatibility: older clients ignore the new fields and are completely unaffected.
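
One way to picture the fallback behavior is as a function that turns the asset metadata into an ordered list of download attempts. The TypeScript sketch below is an assumption-heavy illustration: the type and function names are hypothetical, the mode strings are borrowed from the VersionDownloadMode values mentioned later in this article, and the real client may order or select sources differently.

```typescript
interface HybridAssetFields {
  torrentUrl?: string;
  infoHash?: string;
  webSeeds?: string[];
  sha256?: string;
  directUrl?: string;
}

type PlannedMode = "shared-acceleration" | "source-fallback" | "http-direct";

interface PlannedStep {
  mode: PlannedMode;
  url: string;
}

// Hybrid download needs a torrent, at least one WebSeed, and a hash to verify against.
function hasCompleteHybridMetadata(asset: HybridAssetFields): boolean {
  return Boolean(asset.torrentUrl) && Boolean(asset.sha256) && (asset.webSeeds?.length ?? 0) > 0;
}

// Build the ordered list of attempts: P2P first, then an HTTP source as fallback.
function planDownload(asset: HybridAssetFields): PlannedStep[] {
  const steps: PlannedStep[] = [];
  if (hasCompleteHybridMetadata(asset) && asset.torrentUrl) {
    steps.push({ mode: "shared-acceleration", url: asset.torrentUrl });
    const fallbackUrl = asset.directUrl ?? asset.webSeeds?.[0];
    if (fallbackUrl) {
      steps.push({ mode: "source-fallback", url: fallbackUrl });
    }
  } else if (asset.directUrl) {
    // Legacy entries without hybrid metadata keep the plain HTTP path.
    steps.push({ mode: "http-direct", url: asset.directUrl });
  }
  return steps;
}
```

An entry that only carries directUrl produces a single http-direct step, which is the behavior older index.json files would get.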

Policy-Driven Decisions

Not every file is worth distributing through P2P.

DistributionPolicyEvaluator is responsible for evaluating the policy. A file uses hybrid download only when the user has sharing acceleration enabled and the file meets all of the following conditions:

The source type must be an HTTP index: direct GitHub downloads or local folder sources do not use this path.
The file size must be at least 100 MB: for smaller files, the overhead of P2P outweighs the benefit.
Complete hybrid metadata must be present: torrentUrl, webSeeds, and sha256 are all required.
Only the latest desktop package and web deployment package are eligible: historical versions continue to use the traditional distribution path.

class DistributionPolicyEvaluator {
  evaluate(version: Version, settings: SharingAccelerationSettings): HybridDownloadPolicy {
    // Check source type
    if (version.sourceType !== 'http-index') {
      return { useHybrid: false, reason: 'not-http-index' };
    }

    // Check metadata completeness
    if (!version.hybrid) {
      return { useHybrid: false, reason: 'not-eligible' };
    }

    // Check whether the feature is enabled
    if (!settings.enabled) {
      return { useHybrid: false, reason: 'shared-disabled' };
    }

    // Check asset type (latest desktop/web packages only)
    if (!version.hybrid.isLatestDesktopAsset && !version.hybrid.isLatestWebAsset) {
      return { useHybrid: false, reason: 'latest-only' };
    }

    return { useHybrid: true, reason: 'shared-enabled' };
  }
}

This gives the system predictable behavior. Both developers and users can clearly understand which files will use P2P and which will not.
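The size condition from the list above does not appear as an explicit check in the evaluator. One plausible reading is that it is resolved earlier, when the hybrid metadata (with its `eligible` and `thresholdBytes` fields) is built. The sketch below shows how such a flag could be derived. The `deriveEligibility` helper, its parameters, and the choice of 1 MB = 1024 × 1024 bytes are assumptions made for illustration.

```typescript
const BYTES_PER_MB = 1024 * 1024; // assumption: the threshold uses binary megabytes

// Hypothetical helper: derives the threshold-related fields of the hybrid metadata.
function deriveEligibility(
  sizeBytes: number,
  hybridThresholdMb: number,
  hasCompleteMetadata: boolean,
): { thresholdBytes: number; eligible: boolean } {
  const thresholdBytes = hybridThresholdMb * BYTES_PER_MB;
  return {
    thresholdBytes,
    // Both conditions must hold: complete metadata and a large enough file.
    eligible: hasCompleteMetadata && sizeBytes >= thresholdBytes,
  };
}

const smallPackage = deriveEligibility(40 * BYTES_PER_MB, 100, true);  // below the 100 MB threshold
const largePackage = deriveEligibility(350 * BYTES_PER_MB, 100, true); // above the 100 MB threshold
```

With a 100 MB threshold, the 40 MB package stays on the HTTP path and the 350 MB package qualifies, provided the other policy checks also pass.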

Core Implementation

Type Definition System

Let us start with the type definitions, because they form the foundation of the entire system.

// Hybrid distribution metadata
interface HybridDistributionMetadata {
  torrentUrl?: string;      // Torrent file URL
  infoHash?: string;        // InfoHash
  webSeeds: string[];       // WebSeed list
  sha256?: string;          // File hash
  directUrl?: string;       // HTTP direct link (for origin fallback)
  eligible: boolean;        // Whether hybrid distribution is applicable
  thresholdBytes: number;   // Threshold in bytes
  assetKind: VersionAssetKind;
  isLatestDesktopAsset: boolean;
  isLatestWebAsset: boolean;
}

// Sharing acceleration settings
interface SharingAccelerationSettings {
  enabled: boolean;           // Master switch
  uploadLimitMbps: number;    // Upload bandwidth limit
  cacheLimitGb: number;       // Cache limit
  retentionDays: number;      // Retention period
  hybridThresholdMb: number;  // Hybrid distribution threshold
  onboardingChoiceRecorded: boolean;
}

// Download progress
interface VersionDownloadProgress {
  current: number;
  total: number;
  percentage: number;
  stage: VersionInstallStage;  // queued, downloading, backfilling, verifying, extracting, completed, error
  mode: VersionDownloadMode;   // http-direct, shared-acceleration, source-fallback
  peers?: number;              // Number of connected peers
  p2pBytes?: number;           // Bytes received from P2P
  fallbackBytes?: number;      // Bytes received from fallback
  verified?: boolean;          // Whether verification has completed
}

Once the type system is clear, the rest of the implementation follows naturally.

Core Coordinator

HybridDownloadCoordinator orchestrates the entire download workflow. It coordinates policy evaluation, engine execution, SHA256 verification, and cache management.

class HybridDownloadCoordinator {
  async download(
    version: Version,
    cachePath: string,
    packageSource: PackageSource,
    onProgress?: DownloadProgressCallback,
  ): Promise<HybridDownloadResult> {
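    // Load the current sharing settings (accessor name is assumed; the excerpt omits this step)
    const settings = this.settingsStore.getSettings();

    // 1. Evaluate the policy: should hybrid download be used?
    const policy = this.policyEvaluator.evaluate(version, settings);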

    // 2. Execute the download
    if (policy.useHybrid) {
      await this.engine.download(version, cachePath, settings, onProgress);
    } else {
      await packageSource.downloadPackage(version, cachePath, onProgress);
    }

    // 3. SHA256 verification (hard gate)
    const verified = await this.verify(version, cachePath, onProgress);
    if (!verified) {
      await this.cacheRetentionManager.discard(version.id, cachePath);
      throw new Error(`sha256 verification failed for ${version.id}`);
    }

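    // 4. Mark as trusted cache and begin controlled seeding
    // cacheSize is assumed to be the verified file's size on disk (fs here is node:fs/promises)
    const { size: cacheSize } = await fs.stat(cachePath);
    await this.cacheRetentionManager.markTrusted({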
      versionId: version.id,
      cachePath,
      cacheSize,
    }, settings);

    return { cachePath, policy, verified };
  }
}

There is one especially important point here: SHA256 verification is a hard gate. A downloaded file must pass verification before it can enter the installation flow. If verification fails, the cache is discarded to ensure that an incorrect file never causes installation problems.
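The gate described above can be sketched in isolation. In this minimal version, `verifyOrDiscard` is a hypothetical helper, and the hash computation and cache cleanup are injected as callbacks so the example does not depend on any particular implementation. Normalizing both digests to lowercase before comparing is an assumption about how the index stores hashes.

```typescript
// Hypothetical helper illustrating the "verify or discard" gate.
// computeHash and discard are injected, so the sketch stays independent of
// the real hashing and cache-management code.
async function verifyOrDiscard(
  expectedSha256: string,
  computeHash: () => Promise<string>,
  discard: () => Promise<void>,
): Promise<void> {
  const actual = (await computeHash()).trim().toLowerCase();
  const expected = expectedSha256.trim().toLowerCase();
  if (actual !== expected) {
    // Never leave an unverified file in the cache.
    await discard();
    throw new Error(`sha256 mismatch: expected ${expected}, got ${actual}`);
  }
}
```

Because the function throws on mismatch, a caller cannot reach the installation step with an unverified file unless it deliberately swallows the error.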

Download Engine Abstraction

DownloadEngineAdapter is an abstract interface that defines the methods every engine must implement:

interface DownloadEngineAdapter {
  download(
    version: Version,
    destinationPath: string,
    settings: SharingAccelerationSettings,
    onProgress?: (progress: VersionDownloadProgress) => void,
  ): Promise<void>;

  stopAll(): Promise<void>;
}

The V1 implementation is based on WebTorrent and is wrapped in InProcessTorrentEngineAdapter:

class InProcessTorrentEngineAdapter implements DownloadEngineAdapter {
  async download(...) {
    const client = this.getClient(settings);  // Apply upload rate limiting
    const torrent = client.add(torrentId, {
      path: path.dirname(destinationPath),
      destroyStoreOnDestroy: false,
      maxWebConns: 8,
    });

    // Add WebSeed sources
    torrent.on('ready', () => {
      for (const seed of hybrid.webSeeds) {
        torrent.addWebSeed(seed);
      }
      if (hybrid.directUrl) {
        torrent.addWebSeed(hybrid.directUrl);
      }
    });

    // Progress reporting - distinguish P2P from origin fallback
    torrent.on('download', () => {
      const hasP2PPeer = torrent.wires.some(w => w.type !== 'webSeed');
      const mode = hasP2PPeer ? 'shared-acceleration' : 'source-fallback';
      // ... report progress
    });
  }
}

A pluggable engine design makes future optimization much easier. For example, V2 could run the engine in a helper process to avoid bringing down the main process if the engine crashes.
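One practical benefit of the adapter boundary is that the engine can be replaced with a test double. The sketch below uses simplified stand-in types (`EngineProgress`, `EngineAdapter`) and a hypothetical `FakeEngineAdapter`. None of these names come from HagiCode Desktop, and the real interface takes more parameters.

```typescript
// Simplified stand-ins for the article's types (assumptions for this sketch).
interface EngineProgress {
  current: number;
  total: number;
}

interface EngineAdapter {
  download(destinationPath: string, onProgress?: (p: EngineProgress) => void): Promise<void>;
  stopAll(): Promise<void>;
}

// A hypothetical in-memory engine: records calls and emits two progress events.
class FakeEngineAdapter implements EngineAdapter {
  readonly downloads: string[] = [];
  stopped = false;

  async download(destinationPath: string, onProgress?: (p: EngineProgress) => void): Promise<void> {
    this.downloads.push(destinationPath);
    onProgress?.({ current: 50, total: 100 });
    onProgress?.({ current: 100, total: 100 });
  }

  async stopAll(): Promise<void> {
    this.stopped = true;
  }
}

// The caller depends only on the interface, so engines can be swapped freely.
async function runDownload(engine: EngineAdapter, destinationPath: string): Promise<number[]> {
  const percentages: number[] = [];
  await engine.download(destinationPath, (p) => {
    percentages.push(Math.round((p.current / p.total) * 100));
  });
  return percentages;
}
```

A helper-process engine could implement the same interface by forwarding these calls over IPC, which is the kind of migration the adapter is meant to keep cheap.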

Distinguishing Progress Reporting Modes

At the UI layer, the thing users care about most is simple: “am I currently downloading through P2P or through HTTP fallback?” InProcessTorrentEngineAdapter determines that by checking the types inside torrent.wires:

const hasP2PPeer = torrent.wires.some((wire) => wire.type !== 'webSeed');
const hasFallbackWire = torrent.wires.some((wire) => wire.type === 'webSeed');

const mode = hasP2PPeer ? 'shared-acceleration'
         : hasFallbackWire ? 'source-fallback'
         : 'shared-acceleration';

const stage = hasP2PPeer ? 'downloading'
           : hasFallbackWire ? 'backfilling'
           : 'downloading';

The logic looks simple, but it is a key part of the user experience. Users can clearly see whether the current state is “sharing acceleration” or “origin backfilling,” which makes the behavior easier to understand.
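To make the three possible outcomes explicit, the same decision can be written as a pure function over the list of wire types. `classifyWires` is a hypothetical helper, and the non-`webSeed` type strings in the sample calls are illustrative values.

```typescript
type ProgressMode = 'shared-acceleration' | 'source-fallback';
type ProgressStage = 'downloading' | 'backfilling';

// Mirrors the ternaries above: any non-webSeed wire counts as a P2P peer.
function classifyWires(wireTypes: readonly string[]): { mode: ProgressMode; stage: ProgressStage } {
  const hasP2PPeer = wireTypes.some((type) => type !== 'webSeed');
  const hasFallbackWire = wireTypes.some((type) => type === 'webSeed');
  if (hasP2PPeer) return { mode: 'shared-acceleration', stage: 'downloading' };
  if (hasFallbackWire) return { mode: 'source-fallback', stage: 'backfilling' };
  // No wires connected yet: report the optimistic default.
  return { mode: 'shared-acceleration', stage: 'downloading' };
}

const noWires = classifyWires([]);
const webSeedOnly = classifyWires(['webSeed']);
const mixed = classifyWires(['webSeed', 'tcpOutgoing']);
```

The mixed case matters most for the UI: as soon as one real peer is connected, the state is reported as sharing acceleration even if WebSeed connections are still contributing data.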

SHA256 Streaming Verification

Integrity verification uses Node.js’s crypto module to compute the hash in a streaming manner, which avoids loading the entire file into memory:

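// Module-level imports this method relies on
import { createHash } from 'node:crypto';
import * as fs from 'node:fs';

// Inside the verifier class
private async computeSha256(filePath: string): Promise<string> {
  const hash = createHash('sha256');
  await new Promise<void>((resolve, reject) => {
    // Read the file in chunks so memory use stays flat regardless of file size
    const stream = fs.createReadStream(filePath);
    stream.on('data', (chunk) => hash.update(chunk));
    stream.on('error', reject);
    stream.on('end', resolve);
  });
  return hash.digest('hex').toLowerCase();
}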

This implementation is especially friendly for large files. Imagine downloading a 2 GB installation package and then trying to load the whole thing into memory just to verify it. Streaming solves that cleanly.

Data Flow

The full data flow looks like this:

┌────────────────────────────────────────────────────────────────────┐
│             User clicks install on a large-file version            │
└────────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
┌────────────────────────────────────────────────────────────────────┐
│              VersionManager invokes the coordinator                │
│              HybridDownloadCoordinator.download()                  │
└────────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
┌────────────────────────────────────────────────────────────────────┐
│           DistributionPolicyEvaluator.evaluate()                   │
│       Checks: source, metadata, switch, and asset type            │
└────────────────────────────────────────────────────────────────────┘
                                 │
                    ┌───────────┴───────────┐
                    │ useHybrid?            │
                    └───────────┬───────────┘
                        yes │         │ no
                           ▼         ▼
              ┌──────────────────┐  ┌─────────────────────┐
              │ P2P + WebSeed    │  │ HTTP direct download│
              │ Hybrid download  │  │ (compatibility path)│
              └──────────────────┘  └─────────────────────┘
                        │
                        ▼
              ┌──────────────────┐
              │ SHA256 verify    │
              │ (hard gate)      │
              └────────┬─────────┘
                       │
              ┌────────┴─────────┐
              │ Passed?          │
              └────────┬─────────┘
                   yes │    │ no
                     ▼    ▼
          ┌────────────┐ ┌────────────────┐
          │ Extract +  │ │ Drop cache +   │
          │ install +  │ │ return error   │
          │ seed safely│ └────────────────┘
          └────────────┘

The flow is very clear end to end, and every step has a well-defined responsibility. When something goes wrong, it is much easier to pinpoint the failing stage.

Productization

Even the best technical design will fall flat if the user experience is poor. HagiCode Desktop invested a fair amount of effort in productizing this capability.

Hide BT Terminology

Most users do not know what BitTorrent or InfoHash means. So at the product level, we present the feature using the phrase “sharing acceleration”:

The feature is called “sharing acceleration,” not P2P download.
The setting is called “upload limit,” not seeding.
The progress label says “origin backfilling,” not WebSeed fallback.

This lowers the cognitive burden of the terminology and makes the feature easier to accept.

Enabled by Default in the First-Run Wizard

When new users launch the desktop app for the first time, they see a wizard page introducing sharing acceleration:

To improve download speed, we share the portions you have already downloaded with other users while your own download is in progress. This is completely optional, and you can turn it off at any time in Settings.

It is enabled by default, but users are given a clear way to opt out. If enterprise users do not want it, they can simply disable it during onboarding.

User-Controlled Parameters

The settings page exposes three tunable parameters:

Parameter	Default	Description
Upload limit	2 MB/s	Prevents excessive upstream bandwidth usage
Cache limit	10 GB	Controls disk space consumption
Retention days	7 days	Automatically cleans old cache after this period

These parameters all have sensible defaults. Most users never need to change them, while advanced users can adjust them based on their own network environment.

Key Design Decisions

Looking back at the overall solution, several design decisions are worth calling out.

Engine Runs in the Main Process (V1)

Why not start with a sidecar or helper process right away? The reason is simple: ship quickly. An in-process design has a shorter development cycle and is easier to debug. The first priority is to get the feature running, then improve stability afterward.

Of course, this decision comes with a cost: if the engine crashes, it can affect the main process. We reduce that risk through adapter boundaries and timeout controls, and we also keep a migration path open so V2 can move into a separate process more easily.

SHA256 as the Integrity Check

We use SHA256 rather than MD5 or CRC32 because it resists deliberate tampering. CRC32 is an error-detection checksum, not a cryptographic hash, and practical collision attacks against MD5 are well known, so neither can stop someone from crafting a fake installation package that passes the check. If that happened, the consequences could be severe. SHA256 costs more to compute, but the security gain is worth it.

Enabled Only for HTTP Index Sources

Scenarios such as GitHub downloads and local folder sources do not use hybrid distribution. This is not a technical limitation; it is about avoiding unnecessary complexity. BT protocols add limited value inside private network scenarios and would only increase code complexity.

Practical Notes

Settings Normalization

Inside SharingAccelerationSettingsStore, every numeric value must go through bounds checking and normalization:

private normalize(settings: SharingAccelerationSettings): SharingAccelerationSettings {
  return {
    enabled: Boolean(settings.enabled),
    uploadLimitMbps: this.clampNumber(settings.uploadLimitMbps, 1, 200, DEFAULT_SETTINGS.uploadLimitMbps),
    cacheLimitGb: this.clampNumber(settings.cacheLimitGb, 1, 500, DEFAULT_SETTINGS.cacheLimitGb),
    retentionDays: this.clampNumber(settings.retentionDays, 1, 90, DEFAULT_SETTINGS.retentionDays),
    hybridThresholdMb: DEFAULT_SETTINGS.hybridThresholdMb,  // Fixed value, not user-configurable
    onboardingChoiceRecorded: Boolean(settings.onboardingChoiceRecorded),
  };
}

private clampNumber(value: number, min: number, max: number, fallback: number): number {
  if (!Number.isFinite(value)) {
    return fallback;
  }
  return Math.min(max, Math.max(min, Math.round(value)));
}

This prevents users from manually editing the configuration file into invalid values.
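A few sample inputs show what the clamping does in practice. The standalone `clampNumber` below repeats the logic of the method above so that it can run on its own. The sample values are arbitrary.

```typescript
// Standalone copy of the clamping logic shown above.
function clampNumber(value: number, min: number, max: number, fallback: number): number {
  if (!Number.isFinite(value)) {
    return fallback;
  }
  return Math.min(max, Math.max(min, Math.round(value)));
}

const clamped = [
  clampNumber(Number.NaN, 1, 200, 2), // not finite -> fallback 2
  clampNumber(5000, 1, 200, 2),       // above the range -> 200
  clampNumber(0.4, 1, 200, 2),        // rounds to 0, then raised to the minimum 1
  clampNumber(7.6, 1, 90, 7),         // rounds to 8
];
```

Note that `Number.isFinite` returns false for non-number values, so a hand-edited setting stored as a string such as "50" is replaced by the fallback and is not coerced.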

Cache LRU Cleanup

CacheRetentionManager.prune() is responsible for cleaning expired or oversized cache entries. The cleanup strategy uses LRU (least recently used):

const records = [...this.listRecords()]
  .sort((left, right) =>
    new Date(left.lastUsedAt).getTime() - new Date(right.lastUsedAt).getTime()
  );

// When over the limit, evict the least recently used entries first
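while (totalBytes > maxBytes && retainedEntries.length > 0) {
  // records is sorted oldest-first, so the first retained match is the least recently used entry
  const evicted = records.find((record) => retainedEntries.includes(record.versionId));
  if (!evicted) break;  // no matching record left; avoid dereferencing undefined
  retainedEntries.splice(retainedEntries.indexOf(evicted.versionId), 1);
  removedEntries.push(evicted.versionId);
  totalBytes -= evicted.cacheSize;
  await fs.rm(evicted.cachePath, { force: true });
}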

This logic ensures disk space is used efficiently while preserving historical versions that the user might still need.
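The eviction order is easier to test when the decision is separated from the file deletion. The sketch below is a pure planner that returns which entries to evict. `CacheRecord` and `planEviction` are hypothetical names, and the record fields are modeled on the excerpt above.

```typescript
interface CacheRecord {
  versionId: string;
  cacheSize: number;  // bytes
  lastUsedAt: string; // ISO 8601 timestamp
}

// Hypothetical pure planner: decides what to evict and leaves deletion to the caller.
function planEviction(records: readonly CacheRecord[], maxBytes: number): string[] {
  // Oldest lastUsedAt first, so iteration order is least recently used first.
  const oldestFirst = [...records].sort(
    (left, right) => new Date(left.lastUsedAt).getTime() - new Date(right.lastUsedAt).getTime(),
  );
  let totalBytes = oldestFirst.reduce((sum, record) => sum + record.cacheSize, 0);
  const evicted: string[] = [];
  for (const record of oldestFirst) {
    if (totalBytes <= maxBytes) break;
    evicted.push(record.versionId);
    totalBytes -= record.cacheSize;
  }
  return evicted;
}

const sampleRecords: CacheRecord[] = [
  { versionId: 'b', cacheSize: 4, lastUsedAt: '2026-03-10T00:00:00Z' },
  { versionId: 'c', cacheSize: 4, lastUsedAt: '2026-03-20T00:00:00Z' },
  { versionId: 'a', cacheSize: 4, lastUsedAt: '2026-03-01T00:00:00Z' },
];
```

With a limit of 8 bytes, only the oldest entry `a` is evicted; with a limit of 12 bytes, nothing is.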

Immediate Stop-Seeding Behavior

When the user turns off sharing acceleration, the app must immediately stop seeding and destroy the torrent client:

async disableSharingAcceleration(): Promise<void> {
  this.settingsStore.updateSettings({ enabled: false });
  await this.cacheRetentionManager.stopAllSeeding();  // Stop seeding
  await this.engine.stopAll();  // Destroy the torrent client
}

If a user disables the feature, the product should no longer consume any P2P resources. That is basic product etiquette.

Risks and Trade-Offs

There is no perfect solution, and hybrid distribution is no exception. These are the main trade-offs:

Crash isolation is weaker than a sidecar: V1 uses an in-process engine, so an engine crash can affect the main process. Adapter boundaries and timeout controls reduce the risk, but they are not a fundamental fix. V2 includes a planned migration path to a helper process.

Enabled-by-default resource usage: the default settings of 2 MB/s upload, 10 GB cache, and 7-day retention do consume some machine resources. User expectations are managed through onboarding copy and transparent settings.

Enterprise network compatibility: automatic WebSeed/HTTPS fallback preserves usability in enterprise networks, but it can reduce the acceleration gains from P2P. This is an intentional trade-off that prioritizes availability.

Backward-compatible metadata: all new fields are optional. If they are missing, the system falls back to HTTP mode. Older clients are completely unaffected, making upgrades smooth.

Conclusion

This article walked through the hybrid distribution architecture used in the HagiCode Desktop project. The key takeaways are:

Layered architecture: the control plane and data plane are separated, and the engine is abstracted behind a pluggable interface for easier testing and extension.
Policy-driven behavior: not every file uses P2P. Hybrid distribution is enabled only for large files that meet the required conditions.
Integrity verification: SHA256 serves as a hard gate, and streaming verification avoids memory pressure.
Productized presentation: BT terminology is hidden behind the phrase “sharing acceleration,” and the feature is enabled by default during onboarding.
User control: upload limits, cache limits, retention days, and other parameters remain user-adjustable.

This architecture has already been implemented in the HagiCode Desktop project. If you try it out, we would love to hear your feedback after installation and real-world use.

References

HagiCode Desktop GitHub: github.com/HagiCode-org/site
HagiCode official website: hagicode.com
WebTorrent official documentation: webtorrent.io
BitTorrent protocol specification: bittorrent.org
WebSeed extension specification: bittorrent.org/beps/bep_0017.html

If this article helped you:

Give the project a Star on GitHub: github.com/HagiCode-org/site
Visit the website to learn more: hagicode.com
Quick install for HagiCode Desktop: hagicode.com/desktop/
Public beta is now open, and you are welcome to install and try it

Maybe we are all just ordinary people making our way through the world of technology, but that is fine. Ordinary people can still be persistent, and that persistence matters.

Copyright Notice

Thank you for reading. If you found this article useful, feel free to like, save, and share it. This content was created with AI-assisted collaboration, with the final version reviewed and approved by the author.

Author: newbe36524
Original article: https://docs.hagicode.com/blog/2026-03-27-hagicode-desktop-p2p-acceleration-architecture/
License notice: Unless otherwise stated, all blog posts on this site are licensed under BY-NC-SA. Please include attribution when reposting.

Running AI CLI Tools in Docker Containers: A Practical Guide to User Isolation and Persistent Volumes

Mar 26, 2026

Integrating AI coding tools like Claude Code, Codex, and OpenCode into containerized environments sounds simple, but there are hidden complexities everywhere. This article takes a deep dive into how the HagiCode project solves core challenges in Docker deployments, including user permissions, configuration persistence, and version management, so you can avoid the common pitfalls.

Background

When we decided to run AI coding CLI tools inside Docker containers, the most intuitive thought was probably: “Aren’t containers just root? Why not install everything directly and call it done?” In reality, that seemingly simple idea hides several core problems that must be solved.

Security restrictions are the first hurdle. Take Claude CLI as an example: it explicitly forbids running as the root user. This is a mandatory security check, and if root is detected, it refuses to start. You might ask whether switching users with the USER directive is enough. It is not that simple, because the non-root user inside the container still has to be mapped to the user permissions on the host machine.

State persistence is the second trap. Claude Code requires login, Codex has its own configuration, and OpenCode also has a cache directory. If you have to reconfigure everything every time the container restarts, the whole idea of “automation” loses its meaning. We need these configurations to persist beyond the lifecycle of the container.

The third problem is permission consistency. Can processes inside the container access configuration files created by the host user? UID/GID mismatches often cause file permission errors, and this is extremely common in real deployments.

These problems may look independent, but in practice they are tightly connected. During HagiCode’s development, we gradually worked out a practical solution. Next, I will share the technical details and the lessons learned from those pitfalls.

About HagiCode

The solution shared in this article comes from our practical experience in the HagiCode project. HagiCode is an open-source AI-assisted programming platform that integrates multiple mainstream AI coding assistants, including Claude Code, Codex, and OpenCode. As a project that needs cross-platform and highly available deployment, HagiCode has to solve the full range of challenges involved in containerized deployment.

If you find the technical solution in this article valuable, that is a sign HagiCode has something real to offer in engineering practice. In that case, the HagiCode official website and GitHub repository are both worth following.

Why can’t we just use root?

There is a common misunderstanding here: Docker containers run as root by default, so why not just install the tools as root? If you think that way, Claude CLI will quickly teach you otherwise.

# Run Claude CLI directly as root? No.
docker run --rm -it --user root myimage claude
# Output: Error: This command cannot be run as root user

This is a hard security restriction in Claude CLI. The reason is simple: these CLI tools read and write sensitive user configuration, including API tokens, local caches, and even scripts written by the user. Running them with root privileges introduces too much risk.

So the question becomes: how can we satisfy the CLI’s security requirements while keeping container management flexible? We need to change the way we think about it: instead of switching users at runtime, create a dedicated user during the image build stage.

Creating a dedicated user: more than just changing a name

You might think that adding a single USER line to the Dockerfile is enough. That is indeed the simplest approach, but it is not robust enough.

Static creation vs. dynamic mapping

HagiCode’s approach is to create a hagicode user with UID 1000, which usually matches the default user on most host machines:

RUN groupadd -o -g 1000 hagicode && \
    useradd -o -u 1000 -g 1000 -s /bin/bash -m hagicode && \
    mkdir -p /home/hagicode/.claude && \
    chown -R hagicode:hagicode /home/hagicode

But this only solves the built-in user inside the image. What if the host user is UID 1001? You still need to support dynamic mapping when the container starts.

docker-entrypoint.sh contains the key logic:

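if [ -n "$PUID" ] && [ -n "$PGID" ]; then
    if ! id hagicode >/dev/null 2>&1; then
        # User is missing: create it with the requested IDs
        groupadd -o -g "$PGID" hagicode
        useradd -o -u "$PUID" -g "$PGID" -s /bin/bash -m hagicode
    elif [ "$(id -u hagicode)" != "$PUID" ] || [ "$(id -g hagicode)" != "$PGID" ]; then
        # Assumed remap step (not in the original excerpt): align the built-in user with the host IDs
        groupmod -o -g "$PGID" hagicode
        usermod -o -u "$PUID" -g "$PGID" hagicode
        chown -R hagicode:hagicode /home/hagicode
    fi
fi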

The advantage of this design is clear: use the default UID 1000 at image build time, then adjust dynamically at runtime through the PUID and PGID environment variables. No matter what UID the host user has, ownership of configuration files remains correct.

The design philosophy of persistent volumes

Each AI CLI tool has its own preferred configuration directory, so they need to be mapped one by one:

CLI Tool	Path in Container	Named Volume
Claude	`/home/hagicode/.claude`	`claude-data`
Codex	`/home/hagicode/.codex`	`codex-data`
OpenCode	`/home/hagicode/.config/opencode`	`opencode-config-data`

Why use named volumes instead of bind mounts? Three reasons:

Simpler management: Named volumes are managed automatically by Docker, so you do not need to create host directories manually.
Permission isolation: The initial contents of the volumes are created by the user inside the container, avoiding permission conflicts with the host.
Independent migration: Volumes can exist independently of containers, so data is not lost when images are upgraded.

docker-compose-builder-web automatically generates the corresponding volume configuration:

volumes:
  claude-data:
  codex-data:
  opencode-config-data:

services:
  hagicode:
    volumes:
      - claude-data:/home/hagicode/.claude
      - codex-data:/home/hagicode/.codex
      - opencode-config-data:/home/hagicode/.config/opencode
    user: "${PUID:-1000}:${PGID:-1000}"

Pay attention to the user field here: PUID and PGID are injected through environment variables to ensure that processes inside the container run with an identity that matches the host user. This detail matters because permission issues are painful to debug once they appear.
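How docker-compose-builder-web produces this output is not shown here. As a minimal sketch of the idea, the function below renders the same fragment from a small registry of tools. `CliVolume`, `CLI_VOLUMES`, and `renderCompose` are hypothetical names, and the three entries mirror the table above.

```typescript
// Hypothetical registry entry describing one CLI tool's persistent directory.
interface CliVolume {
  tool: string;
  containerPath: string;
  volumeName: string;
}

const CLI_VOLUMES: CliVolume[] = [
  { tool: 'Claude', containerPath: '/home/hagicode/.claude', volumeName: 'claude-data' },
  { tool: 'Codex', containerPath: '/home/hagicode/.codex', volumeName: 'codex-data' },
  { tool: 'OpenCode', containerPath: '/home/hagicode/.config/opencode', volumeName: 'opencode-config-data' },
];

// Renders the top-level volumes section and the matching service mounts.
function renderCompose(volumes: readonly CliVolume[]): string {
  const lines: string[] = ['volumes:'];
  for (const volume of volumes) {
    lines.push(`  ${volume.volumeName}:`);
  }
  lines.push('', 'services:', '  hagicode:', '    volumes:');
  for (const volume of volumes) {
    lines.push(`      - ${volume.volumeName}:${volume.containerPath}`);
  }
  // Plain string on purpose: ${PUID:-1000} must reach Docker Compose unexpanded.
  lines.push('    user: "${PUID:-1000}:${PGID:-1000}"');
  return lines.join('\n');
}

const composeFragment = renderCompose(CLI_VOLUMES);
```

Keeping the tools in one list means a new CLI needs a single registry entry, and the volume declaration and the mount stay in sync.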

Version management: baked-in versions with runtime overrides

Pinning Docker image versions is essential for reproducibility. But in real development, we often need to test a newer version or urgently fix a bug. If we had to rebuild the image every time, the workflow would be far too inefficient.

HagiCode’s strategy is fixed versions as the default, with runtime overrides as an extension mechanism. It is a pragmatic engineering compromise between stability and flexibility.

Dockerfile.template pins versions here:

USER hagicode
WORKDIR /home/hagicode

# Configure the global npm install path
RUN mkdir -p /home/hagicode/.npm-global && \
    npm config set prefix '/home/hagicode/.npm-global'

# Install CLI tools using pinned versions
RUN npm install -g @anthropic-ai/claude-code@2.1.71 && \
    npm install -g @openai/codex@0.112.0 && \
    npm install -g opencode-ai@1.2.25 && \
    npm cache clean --force

docker-entrypoint.sh supports runtime overrides:

install_cli_override_if_needed() {
    local package_name="$2"
    local override_version="$5"

    if [ -n "$override_version" ]; then
        gosu hagicode npm install -g "${package_name}@${override_version}"
    fi
}

# Example usage
install_cli_override_if_needed "" "@anthropic-ai/claude-code" "" "" "${CLAUDE_CODE_CLI_VERSION}"

This lets you test a new version through an environment variable without rebuilding the image:

docker run -e CLAUDE_CODE_CLI_VERSION=2.2.0 myimage

This design is practical because nobody wants to rebuild an image every time they test a new feature.

Automatic configuration injection

In addition to configuring CLI tools manually, some scenarios require automatic configuration injection. The most typical example is an API token.

if [ -n "$ANTHROPIC_AUTH_TOKEN" ]; then
    mkdir -p /home/hagicode/.claude
    cat > /home/hagicode/.claude/settings.json <<EOF
{
  "env": {
    "ANTHROPIC_AUTH_TOKEN": "${ANTHROPIC_AUTH_TOKEN}"
  }
}
EOF
    chmod 600 /home/hagicode/.claude/settings.json  # restrict the token file to its owner
    chown -R hagicode:hagicode /home/hagicode/.claude
fi

Two things matter here: pass sensitive information through environment variables instead of hard-coding it into the image, and make sure the ownership of configuration files is set correctly, otherwise the CLI tools will not be able to read them.

Best practices and a pitfall checklist

Permission mismatch problems

This is the easiest trap to fall into. The host user has UID 1001, while the container uses 1000, so files created on one side cannot be accessed on the other.

# Correct approach: make the container match the host user
docker run \
    -e PUID=$(id -u) \
    -e PGID=$(id -g) \
    myimage

This issue is very common, and it can be frustrating the first time you run into it.

Configuration disappears after container restart

If you find yourself logging in again after every restart, check whether you forgot to mount a persistent volume:

volumes:
  - claude-data:/home/hagicode/.claude

Nothing is more frustrating than carefully setting up a configuration only to see it disappear.

The right way to upgrade versions

Do not run npm install -g directly inside a running container. The correct approaches are:

Set an environment variable to trigger override installation.
Or rebuild the image.

# Option 1: runtime override
docker run -e CLAUDE_CODE_CLI_VERSION=2.2.0 myimage

# Option 2: rebuild the image
docker build -t myimage:v2 .

There is more than one road to Rome, but some roads are smoother than others.

Security hardening checklist

Pass API tokens through environment variables instead of writing them into the image.
Set configuration file permissions to 600.
Always run the application as a non-root user.
Update CLI versions regularly to fix security vulnerabilities.

Security is always important, but the real challenge is consistently enforcing it in practice.

Extending support for new CLI tools

If you want to support a new CLI tool in the future, there are only three steps:

Dockerfile.template: add the installation step.
docker-entrypoint.sh: add the version override logic.
docker-compose-builder-web: add the persistent volume mapping.

This template-based design makes extension simple without changing the core logic.

Conclusion

Running AI CLI tools in Docker containers involves three core challenges: user permissions, configuration persistence, and version management. By combining dedicated users, named-volume isolation, and environment-variable-based overrides, the HagiCode project built a deployment architecture that is both secure and flexible.

Key design points:

User isolation: Create a dedicated user during the image build stage, with runtime support for dynamic PUID/PGID mapping.
Persistence strategy: Each CLI tool gets its own named volume, so restarts do not affect configuration.
Version flexibility: Fixed defaults ensure reproducibility, while runtime overrides provide room for testing.
Automated configuration: Sensitive configuration can be injected automatically through environment variables.

This solution has been running stably in the HagiCode project for some time, and I hope it offers useful reference points for developers with similar needs.

Copyright Notice

Thank you for reading. If you found this article useful, you are welcome to like, bookmark, and share it. This content was created with AI-assisted collaboration, and the final content was reviewed and confirmed by the author.

Author: newbe36524
Original article: https://docs.hagicode.com/blog/2026-03-26-docker-ai-cli-user-isolation-guide/
Copyright: Unless otherwise stated, all articles in this blog are licensed under BY-NC-SA. Please include the source when reposting.

Technical Analysis of the HagiCode Soul Platform: The Evolution from Emerging Needs to an Independent Platform

Mar 25, 2026

Writing technical articles is not really such a grand thing. It is mostly just a matter of organizing the pitfalls you have run into and the detours you have taken. We have all been inexperienced before, after all. This article takes an in-depth look at the design philosophy, architectural evolution, and core technical implementation of Soul in the HagiCode project, and explores how an independent platform can provide a more focused experience for creating and sharing Agent personas.

Background

In the practice of building AI Agents, we often run into a question that looks simple but is actually crucial: how do we give different Agents stable and distinctive language styles and personality traits?

It is a slightly frustrating question, honestly. In the early Hero system of HagiCode, different Heroes (Agent instances) were mainly distinguished through profession settings and generic prompts. That approach came with some fairly obvious pain points, and anyone who has tried something similar has probably felt the same.

First, language style was difficult to keep consistent. The same “developer engineer” role might sound professional and rigorous one day, then casual and loose the next. This was not a model problem so much as the absence of an independent personality configuration layer to constrain and guide the output style.

Second, the sense of character was generally weak. When we described an Agent’s traits, we often had to rely on vague adjectives like “friendly,” “professional,” or “humorous,” without concrete language rules to support those abstract descriptions. Put plainly, it sounded nice in theory, but there was little to hold onto in practice.

Third, persona configurations were almost impossible to reuse. Suppose we carefully designed the speaking style of a “catgirl waitress” and wanted to reuse that expression style in another business scenario. In practice, we would almost have to configure it again from scratch. Sometimes you do not want to possess something beautiful, only reuse it a little… and even that turns out to be hard.

To solve those real problems, we introduced the Soul mechanism: an independent language style configuration layer separate from equipment and descriptions. Soul can define an Agent’s speaking habits, tone preferences, and wording boundaries, can be shared and reused across multiple Heroes, and can also be injected into the system prompt automatically on the first Session call.

Some people might say that this is just configuring a few prompts. But sometimes the real question is not whether something can be done; it is how to do it more elegantly. As Soul matured, we realized it had enough depth to develop independently. A dedicated Soul platform could let users focus on creating, sharing, and browsing interesting persona configurations without being distracted by the rest of the Hero system. That is how the standalone platform at soul.hagicode.com came into being.

About HagiCode

HagiCode is an open-source AI coding assistant project built with a modern technology stack and aimed at giving developers a smooth intelligent programming experience. The Soul platform approach shared in this article comes from our own hands-on exploration while building HagiCode to solve the practical problem of Agent persona management. If you find the approach valuable, then it probably means we have accumulated a certain amount of engineering judgment in practice, and the HagiCode project itself may also be worth a closer look.

GitHub: github.com/HagiCode-org/site
Official website: hagicode.com
Video demo: www.bilibili.com/video/BV1pirZBuEzq/
Quick desktop installation: hagicode.com/desktop/

The Technical Architecture Evolution of the Soul Platform

The Soul platform did not appear all at once. It went through three clear stages. The story began abruptly and concluded naturally.

Phase 1: Soul Configuration Embedded in Hero

The earliest Soul implementation existed as a functional module inside the Hero workspace. We added an independent SOUL editing area to the Hero UI, supporting both preset application and text fine-tuning.

Preset application let users choose from classic persona templates such as “professional developer engineer” and “catgirl waitress.” Text fine-tuning let users personalize those presets further. On the backend, the Hero entity gained a Soul field, with SoulCatalogId used to identify its source.

This stage solved the question of whether the capability existed at all, and it grew forward somewhat awkwardly, like anything young does. But as Soul content became richer, the limitations of an architecture tightly coupled with the Hero system started to show.

Phase 2: In-Site Marketplace

To provide a better Soul discovery and reuse experience, we built a SOUL Marketplace catalog page with support for browsing, searching, viewing details, and favoriting.

At this stage, we introduced a combinatorial design built from 50 main Catalogs (base roles) and 10 orthogonal rules (expression styles). The main Catalogs defined the Agent’s core persona, with abstract character settings such as “Mistport Traveler” and “Night Hunter.” The orthogonal rules defined how the Agent expressed itself, with language style traits such as “Concise & Professional” and “Verbose & Friendly.”

50 x 10 = 500 possible combinations gave users a wide configuration space for personas. It is not an overwhelming number, but it is not small either. There are many roads to Rome, after all; some are simply easier to walk than others. On the backend, the full SOUL catalog was generated through catalog-sources.json, while the frontend presented those catalog entries as an interactive card list.

The in-site Marketplace was a good transitional solution, but only that: transitional. It was still attached to the main system, and for users who only wanted Soul functionality, the access path remained too deep. Not everyone wants to take the scenic route just to do something simple.

Phase 3: Splitting into an Independent Platform

In the end, we decided to move Soul into an independent repository (repos/soul). The Marketplace in the original main system was reduced to a notice that links out to the new site, while the new platform adopted a Builder-first design philosophy: the homepage is the creation workspace by default, so users can start building their own persona configuration the moment they open the site.

The technology stack was also comprehensively upgraded in this stage: Vite 8 + React 19 + TypeScript 5.9, a unified design language through the shadcn/ui component system, and Tailwind CSS 4 theme variables. The improvement in frontend engineering laid a solid foundation for future feature iteration.

Everything faded away… no, actually, everything was only just beginning.

Core Technical Design and Implementation

Material Integration Strategy

One core design principle of the Soul platform is local-first. That means the homepage must remain fully functional without a backend, and failure to load remote materials must never block page entry.

There is nothing especially miraculous about that. It simply means thinking one step further when designing the system. Using a local snapshot as the baseline and remote data as enhancement lets the product remain basically usable under any network condition. Concretely, we implemented a two-layer material architecture:

export async function loadBuilderMaterials(): Promise<BuilderMaterials> {
  const localMaterials = createLocalMaterials(snapshot)  // local baseline

  try {
    const inspirationFragments = await fetchMarketplaceItems()  // remote enhancement
    return { ...localMaterials, inspirationFragments, remoteState: "ready" }
  } catch (error) {
    return { ...localMaterials, remoteState: "fallback" }  // graceful degradation
  }
}

Local materials come from build-time snapshots of the main system documentation and include the complete data for 50 base roles and 10 expression rules. Remote materials come from Souls published by users and fetched through the Marketplace API. Together, they give users a full spectrum of materials, from official templates to community creativity. If that sounds dramatic, it really is just local plus remote.

Soul Fragment Data Model

The core data abstraction of Soul is the SoulFragment:

export type SoulFragment = {
  fragmentId: string
  group: "main-catalog" | "expression-rule" | "published-soul"
  title: string
  summary: string
  content: string
  keywords: string[]
  localized?: Partial<Record<AppLocale, LocalizedFragmentContent>>
  sourceRef: SoulFragmentSourceRef
  meta: SoulFragmentMeta
}

The group field distinguishes fragment types: the main catalog defines the character core, orthogonal rules define expression style, and user-published Souls are marked as published-soul. The localized field supports multilingual presentation, allowing the same fragment to display different titles and descriptions in different language environments. Internationalization is something you really want to think about early, and in this case we actually did.
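
To make the fallback behavior of the localized field concrete, here is a minimal sketch. It assumes AppLocale is a union of two locale codes and that LocalizedFragmentContent carries an optional title and summary. Neither shape is shown in this article, and resolveFragmentText and the sample data are hypothetical.

```typescript
// Assumed shapes: the article names these types but does not show them.
type AppLocale = "zh-CN" | "en-US"
type LocalizedFragmentContent = { title?: string; summary?: string }

type FragmentText = {
  title: string
  summary: string
  localized?: Partial<Record<AppLocale, LocalizedFragmentContent>>
}

// Use the localized field when it exists, otherwise keep the base text.
function resolveFragmentText(
  fragment: FragmentText,
  locale: AppLocale,
): { title: string; summary: string } {
  const override = fragment.localized?.[locale]
  return {
    title: override?.title ?? fragment.title,
    summary: override?.summary ?? fragment.summary,
  }
}

// Illustrative data only: the summary text is made up for this example.
const sampleFragment: FragmentText = {
  title: "雾港旅人",
  summary: "A wandering narrator.",
  localized: { "en-US": { title: "Mistport Traveler" } },
}
```

A fragment that only has a translated title still renders a usable summary, because each missing field falls back to the base text on its own.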

The Builder draft state encapsulates the user’s current editing state:

export type SoulBuilderDraft = {
  draftId: string
  name: string
  selectedMainFragmentId: string | null
  selectedRuleFragmentId: string | null
  inspirationSoulId: string | null
  mainSlotText: string
  ruleSlotText: string
  customPrompt: string
  previewText: string
  updatedAt: string
}

Each fragment selected in the editor has its content concatenated into the corresponding slot, forming the final preview text. mainSlotText corresponds to the main role content, ruleSlotText corresponds to the expression rule content, and customPrompt is the user’s additional instruction text.

Preview Compilation Mechanism

Preview compilation is the core capability of Soul Builder. It assembles user-selected fragments and custom text into a system prompt that can be copied directly:

export function compilePreview(
  draft: Pick<SoulBuilderDraft, "mainSlotText" | "ruleSlotText" | "customPrompt">,
  fragments: {
    mainFragment: SoulFragment | null
    ruleFragment: SoulFragment | null
    inspirationFragment: SoulFragment | null
  }
): PreviewCompilation {
  // Assembly logic: main role + expression rule + inspiration reference + custom content
}

The compilation result is shown in the central preview panel, where users can see the final effect in real time and copy it to the clipboard with one click. It sounds simple, and it is. But simple things are often the most useful.
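
The assembly logic itself is elided above, so the following is only one plausible shape for it, not the actual implementation. The sketch flattens the inputs into plain strings, and PreviewInput, the fields of PreviewCompilation, and compilePreviewSketch are all hypothetical names.

```typescript
// Hypothetical, simplified shapes; the article does not show PreviewCompilation.
type PreviewInput = {
  mainSlotText: string
  ruleSlotText: string
  inspirationText: string | null
  customPrompt: string
}
type PreviewCompilation = { text: string; sectionCount: number }

// Join the non-empty sections in a fixed order, separated by blank lines:
// main role, expression rule, inspiration reference, custom content.
function compilePreviewSketch(input: PreviewInput): PreviewCompilation {
  const sections = [
    input.mainSlotText,
    input.ruleSlotText,
    input.inspirationText ?? "",
    input.customPrompt,
  ]
    .map((section) => section.trim())
    .filter((section) => section.length > 0)

  return { text: sections.join("\n\n"), sectionCount: sections.length }
}
```

Dropping empty sections before joining keeps the copied prompt free of stray blank blocks when a slot has not been filled in yet.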

Frontend State Management

Frontend state management in Soul Builder follows one important principle: clear separation of state boundaries. More specifically, drawer state is not persisted and does not write directly into the draft. Only explicit Builder actions trigger meaningful state changes.

// Domain state (useSoulBuilder)
export function useSoulBuilder() {
  // Material loading and caching
  // Slot aggregation and preview compilation
  // Copy actions and feedback messages
  // Locale-safe descriptors
}

// Presentation state (useHomeEditorState)
export function useHomeEditorState() {
  // activeSlot, drawerSide, drawerOpen
  // default focus behavior
}

That separation ensures both edit-state safety and responsive UI behavior. Opening and closing the drawer is purely a UI interaction and should not trigger complicated persistence logic. It may sound obvious, but it matters: UI state and business state should be separated clearly so interface interactions do not pollute the core data model.

Single-Drawer Lifecycle

Soul Builder uses a single-drawer mode: only one slot drawer may be open at a time. Clicking the mask, pressing the ESC key, or switching slots automatically closes the current drawer. This simplifies state management and also matches common drawer interaction patterns on mobile.

Closing the drawer does not clear the current editing content, so when users come back, their context is preserved. This kind of “lightweight” drawer design avoids interrupting the user’s flow. Nobody wants carefully written content to disappear because of one accidental click.
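
The single-drawer rule can be sketched as a small reducer. This is an illustration under assumptions: the slot names and the action shapes are hypothetical, and only activeSlot and drawerOpen come from the hook outline above.

```typescript
type SlotId = "main" | "rule" | "custom" // assumed slot names
type DrawerState = { activeSlot: SlotId | null; drawerOpen: boolean }
type DrawerAction =
  | { type: "open"; slot: SlotId }
  | { type: "close" } // mask click or ESC

function drawerReducer(state: DrawerState, action: DrawerAction): DrawerState {
  switch (action.type) {
    case "open":
      // Opening a slot replaces whatever was open, so at most one drawer exists.
      return { activeSlot: action.slot, drawerOpen: true }
    case "close":
      // Only the visibility flag changes; the remembered slot stays put.
      return { ...state, drawerOpen: false }
  }
}
```

Because the reducer never sees the draft, no drawer action can modify or clear user-authored text, which is the boundary the two hooks are meant to enforce.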

Bilingual Support Architecture

Internationalization is an important capability of the Soul platform. System copy fully supports bilingual switching, while user draft text is never rewritten when the language changes, because draft text is user-authored free input rather than system-translated content.

Official inspiration cards (Marketplace Souls) keep the upstream display name while also providing a best-effort English summary. For Souls with Chinese names, we generate English versions through predefined mapping rules:

// English name mapping for main roles
const mainNameEnglishMap = {
  "雾港旅人": "Mistport Traveler",
  "夜航猎手": "Night Hunter",
  // ...
}

// English name mapping for orthogonal rules
const ruleNameEnglishMap = {
  "简洁干练": "Concise & Professional",
  "啰嗦亲切": "Verbose & Friendly",
  // ...
}

The mapping tables themselves look simple enough, but keeping them in good shape still takes care. With 50 main roles and 10 orthogonal rules, the two tables hold only 60 entries between them, yet those entries label all 500 combinations. That is not huge, but it is enough to deserve respect.
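
A best-effort lookup over such a table might look like the sketch below. The helper name toEnglishName and the Map-based storage are assumptions; the two entries are the ones shown above.

```typescript
// The two known entries from the article; the real table is larger.
const englishNames = new Map<string, string>([
  ["雾港旅人", "Mistport Traveler"],
  ["夜航猎手", "Night Hunter"],
])

// Best effort: return the mapped English name, or the upstream name unchanged.
function toEnglishName(upstreamName: string, table: Map<string, string>): string {
  return table.get(upstreamName) ?? upstreamName
}
```

Falling back to the upstream name means an unmapped Soul still renders with a label, which fits the "best-effort" wording used for the English summaries.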

Backend Catalog Generation

Bulk generation of the Soul Catalog happens on the backend, where C# is used to automate the creation of 50 x 10 = 500 combinations:

foreach (var main in source.MainCatalogs)
{
    foreach (var orthogonal in source.OrthogonalCatalogs)
    {
        var catalogId = $"soul-{main.Index:00}-{orthogonal.Index:00}";
        var displayName = BuildNickname(main, orthogonal);
        var soulSnapshot = BuildSoulSnapshot(main, orthogonal);
        // Write to the database...
    }
}

The nickname generation algorithm pairs a handle root tied to the main role with a handle suffix tied to the expression rule to create imaginative Agent codenames:

private static readonly string[] MainHandleRoots = [
    "雾港", "夜航", "零帧", "星渊", "霓虹", "断云", // ...
];
private static readonly string[] OrthogonalHandleSuffixes = [
    "旅人", "猎手", "术师", "行者", "星使", // ...
];
// Combination examples: 雾港旅人, 夜航猎手, 零帧术师...
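
For readers more at home on the frontend, the same nested loop can be sketched in TypeScript. This is an illustration, not the backend code: it assumes 1-based indices and that a codename is simply root plus suffix, and HandlePart, CatalogEntry, and buildCatalog are hypothetical names.

```typescript
type HandlePart = { index: number; text: string }
type CatalogEntry = { catalogId: string; displayName: string }

// Two-digit, zero-padded index, mirroring the C# "{Index:00}" format.
function pad2(n: number): string {
  return n < 10 ? `0${n}` : String(n)
}

// Cartesian product of roots and suffixes: one catalog entry per pair.
function buildCatalog(roots: HandlePart[], suffixes: HandlePart[]): CatalogEntry[] {
  const entries: CatalogEntry[] = []
  for (const root of roots) {
    for (const suffix of suffixes) {
      entries.push({
        catalogId: `soul-${pad2(root.index)}-${pad2(suffix.index)}`,
        displayName: root.text + suffix.text,
      })
    }
  }
  return entries
}

const sampleCatalog = buildCatalog(
  [{ index: 1, text: "雾港" }, { index: 2, text: "夜航" }],
  [{ index: 1, text: "旅人" }, { index: 2, text: "猎手" }],
)
// sampleCatalog has 2 x 2 = 4 entries; the first is soul-01-01 / 雾港旅人
```

With 50 roots and 10 suffixes the same function yields the 500 entries mentioned above.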

Soul snapshot assembly follows a fixed template format that combines the main role core, signature traits, expression rule core, and output constraints together:

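// English gloss of the four template lines (the template text itself is Chinese):
//   1. Your persona core comes from "{main.Name}": {main.Core}
//   2. Keep the following signature language traits: {main.Signature}
//   3. Your expression rules come from "{orthogonal.Name}": {orthogonal.Core}
//   4. You must follow these output constraints: {orthogonal.Signature}
private static string BuildSoulSnapshot(main, orthogonal) => string.Join('\n', [
    $"你的人设内核来自「{main.Name}」：{main.Core}",
    $"保持以下标志性语言特征：{main.Signature}",
    $"你的表达规则来自「{orthogonal.Name}」：{orthogonal.Core}",
    $"必须遵循这些输出约束：{orthogonal.Signature}"
]);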

Template assembly may sound terribly dull, but without that sort of dull work, interesting products rarely appear.

Platform Migration Strategy

After splitting Soul from the main system into an independent platform, one important challenge was handling existing user data. It is a familiar problem: splitting things apart is easy, migration is not. We adopted three safeguards:

Backward compatibility protection. Previously saved Hero SOUL snapshots remain visible, and historical snapshots can still be previewed even if they no longer have a Marketplace source ID. In other words, none of the user’s prior configurations are lost; only where they appear has changed.

Main system API deprecation. The in-site Marketplace API returns HTTP status 410 Gone together with a migration notice that guides users to soul.hagicode.com.

Hero SOUL form refactoring. A migration notice block was added to the Hero Soul editing area to clearly tell users that the Soul platform is now independent and to provide a one-click jump button:

<div className="rounded-2xl border border-orange-200/70 bg-orange-50/80 p-4">
  <div>{t('hero.soul.migrationTitle')}</div>
  <p>{t('hero.soul.migrationDescription')}</p>
  <Button onClick={onOpenSoulPlatform}>
    {t('hero.soul.openSoulPlatformAction')}
  </Button>
</div>

Practical Lessons

Looking back at the development of the Soul platform as a whole, there are a few practical lessons worth sharing. They are not grand principles, just things learned from real mistakes.

Local-first runtime assumptions. When designing features that depend on remote data, always assume the network may be unavailable. Using local snapshots as the baseline and remote data as enhancement ensures the product remains basically usable under any network condition.

Clear separation of state boundaries. UI state and business state should be distinguished clearly so interface interactions do not pollute the core data model. Drawer toggles are purely UI state and should not be mixed with draft persistence.

Design for internationalization early. If your product has multilingual requirements, it is best to think about them during the data model design phase. The localized field adds some structural complexity, but it greatly reduces the long-term maintenance cost of multilingual content.

Automate the material synchronization workflow. Local materials for the Soul platform come from the main system documentation. When upstream documentation changes, there needs to be a mechanism to sync it into frontend snapshots. We designed the npm run materials:sync script to automate that process and keep materials aligned with upstream.

Future Outlook

Based on the current architecture, the Soul platform could move in several directions in the future. These are only tentative ideas, but perhaps they can be useful as a starting point.

Community sharing ecosystem. Support user uploads and sharing of custom Souls, with rating, commenting, and recommendation mechanisms so excellent Soul configurations can be discovered and reused by more people.

Multimodal expansion. Beyond text style, the platform could also support dimensions such as voice style configuration, emoji usage preferences, and code style and formatting rules. It sounds attractive in theory; implementation may tell a more complicated story.

Intelligent assistance. Automatically recommend Souls based on usage scenarios, support style transfer and fusion, and even run A/B tests on the real-world effectiveness of different Souls. There is no better way to know than to try.

Cross-platform synchronization. Support importing persona configurations from other AI platforms, provide a standardized Soul export format, and integrate with mainstream Agent frameworks.

Conclusion

This article shares the full evolution of the HagiCode Soul platform from its earliest emerging need to an independent platform. We discussed why a Soul mechanism is needed to solve Agent persona consistency, analyzed the three stages of architectural evolution (embedded configuration, in-site Marketplace, and independent platform), examined the core data model, state management, preview compilation, and internationalization design in depth, and summarized practical migration lessons.

The essence of Soul is an independent persona configuration layer separated from business logic. It makes the language style of AI Agents definable, reusable, and shareable. From a technical perspective, the design itself is not especially complicated, but the problem it solves is real and broadly relevant.

If you are also building AI Agent products, it may be worth asking whether your persona configuration solution is flexible enough. The Soul platform’s practical experience may offer a few useful ideas.

Perhaps one day you will run into a similar problem as well. If this article can help a little when that happens, that is probably enough.

References

HagiCode official website: hagicode.com
Soul platform: soul.hagicode.com
HagiCode GitHub: github.com/HagiCode-org/site
HagiCode desktop: hagicode.com/desktop/
HagiCode installation docs: docs.hagicode.com/installation/docker-compose

If you found this article helpful, feel free to give the project a Star on GitHub. The public beta has already started, and you are welcome to install it and try it out.

Copyright Notice

Thank you for reading. If you found this article useful, likes, bookmarks, and shares are all appreciated. This content was created with AI-assisted collaboration, and the final content was reviewed and confirmed by the author.

Author: newbe36524
Original link: https://docs.hagicode.com/blog/2026-03-25-hagicode-soul-platform-technical-analysis/
Copyright notice: Unless otherwise stated, all articles on this blog are licensed under BY-NC-SA. Please cite the source when reposting.

Technical Analysis of the HagiCode Skill System: Building a Scalable AI Skill Management Platform

Mar 24, 2026

This article takes an in-depth look at the architecture and implementation of the Skill management system in the HagiCode project, covering the technical details behind four core capabilities: local global management, marketplace search, intelligent recommendations, and trusted provider management.

Background

In the field of AI coding assistants, how to extend the boundaries of AI capabilities has always been a core question. Claude Code itself is already strong at code assistance, but different development teams and different technology stacks often need specialized capabilities for specific scenarios, such as handling Docker deployments, database optimization, or frontend component generation. That is exactly where a Skill system becomes especially important.

During the development of the HagiCode project, we ran into a similar challenge: how do we let Claude Code “learn” new professional skills like a person would, while still maintaining a solid user experience and good engineering maintainability? This problem is both hard and simple in its own way. Around that question, we designed and implemented a complete Skill management system.

This article walks through the technical architecture and core implementation of the system in detail. It is intended for developers interested in AI extensibility and command-line tool integration. It might be useful to you, or it might not, but at least it is written down now.

About HagiCode

The approach shared in this article comes from our practical experience in the HagiCode project. HagiCode is an open-source AI coding assistant project designed to help development teams improve engineering efficiency. The project’s stack includes ASP.NET Core, the Orleans distributed framework, a TanStack Start + React frontend, and the Skill management subsystem introduced in this article.

The GitHub repository is HagiCode-org/site. If you find the technical approach in this article valuable, feel free to give it a Star. More Stars tend to improve the mood, after all.

System Architecture Overview

The Skill system uses a frontend-backend separated architecture. There is nothing especially mysterious about that.

The frontend uses TanStack Start + React to build the user interface, with Redux Toolkit managing state. The four main capabilities map directly to four Tab components: Local Skills, Skill Gallery, Intelligent Recommendations, and Trusted Providers. In the end, the design is mostly about making the user experience better.

The backend is based on ASP.NET Core + ABP Framework, using Orleans Grains for distributed state management. The online API client sits behind the IOnlineApiClient interface and communicates with the remote skill catalog service.

The overall architectural principle is to separate command execution from business logic. Through the adapter pattern, the implementation details of npm/npx command execution are hidden inside independent modules. After all, nobody really wants command-line calls scattered all over the codebase.

Core Capability 1: Local Global Management

Local global management is the most basic module. It is responsible for listing installed skills and supporting uninstall operations. There is nothing overly complicated here; it is mostly about doing the basics well.

Technical Approach

The implementation lives in LocalSkillsTab.tsx and LocalSkillCommandAdapter.cs. The core idea is to wrap the npx skills command, parse its JSON output, and convert it into internal data structures. It sounds simple, and in practice it mostly is.

public async Task<IReadOnlyList<LocalSkillInventoryResponseDto>> GetLocalSkillsAsync(
    CancellationToken cancellationToken = default)
{
    var result = await _commandAdapter.ListGlobalSkillsAsync(cancellationToken);
    return result.Skills.Select(skill => new LocalSkillInventoryResponseDto
    {
        Name = skill.Name,
        Version = skill.Version,
        Source = skill.Source,
        InstalledPath = skill.InstalledPath,
        Description = skill.Description
    }).ToList();
}

The data flow is very clear: the frontend sends a request -> SkillGalleryAppService receives it -> LocalSkillCommandAdapter executes the npx command -> the JSON result is parsed -> a DTO is returned. Each step follows naturally from the previous one.

Skill uninstallation uses the npx skills remove -g <skillName> -y command, and the system automatically handles dependencies and cleanup. Installation metadata is stored in managed-install.json inside the skill directory, recording information such as install time and source version for later updates and auditing. Some things are simply worth recording.

Installation Flow in Detail

Skill installation requires several coordinated steps. In truth, it is not especially complicated:

public async Task<SkillInstallResultDto> InstallAsync(
    SkillInstallRequestDto request,
    CancellationToken cancellationToken = default)
{
    // 1. Normalize the installation reference
    var normalized = _referenceNormalizer.Normalize(
        request.SkillId,
        request.Source,
        request.SkillSlug,
        request.Version);

    // 2. Check prerequisites
    await _prerequisiteChecker.CheckAsync(cancellationToken);

    // 3. Acquire installation lock
    using var installLock = await _lockProvider.AcquireAsync(normalized.SkillId);

    // 4. Execute installation command
    var result = await _installCommandRunner.ExecuteAsync(
        new SkillInstallCommandExecutionRequest
        {
            Command = $"npx skills add {normalized.FullReference} -g -y",
            Timeout = TimeSpan.FromMinutes(4)
        },
        cancellationToken);

    // 5. Persist installation metadata only when the command succeeded
    if (result.Success)
    {
        await _metadataStore.WriteAsync(normalized.SkillPath, request);
    }

    return new SkillInstallResultDto { Success = result.Success };
}

Several key design patterns are used here: the reference normalizer converts different input formats, such as tanweai/pua and @opencode/docker-skill, into a unified internal representation; the installation lock mechanism ensures only one installation operation can run for the same skill at a time; and streaming output pushes installation progress to the frontend in real time through Server-Sent Events, so users can watch terminal-like logs as they happen.

In the end, all of these patterns are there for one purpose: to keep the system simpler to use and maintain.
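
The per-skill installation lock is the least obvious of the three, so here is a minimal sketch of the idea. It is written in TypeScript for brevity even though the backend is C#, and KeyedLock is a hypothetical helper, not the _lockProvider used above.

```typescript
// Serialize async work per key: calls for the same skill queue up,
// while calls for different skills run independently.
class KeyedLock {
  private tails = new Map<string, Promise<void>>()

  async run<T>(key: string, task: () => Promise<T>): Promise<T> {
    const previous = this.tails.get(key) ?? Promise.resolve()
    let release: () => void = () => {}
    const current = new Promise<void>((resolve) => {
      release = resolve
    })
    // The new tail settles only after every earlier task and this one are done.
    const tail = previous.then(() => current)
    this.tails.set(key, tail)

    await previous
    try {
      return await task()
    } finally {
      release()
      // Drop the entry when nobody queued behind us.
      if (this.tails.get(key) === tail) this.tails.delete(key)
    }
  }
}
```

Two install requests for the same skill run one after the other, while a request for a different skill does not wait at all.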

Core Capability 2: Marketplace Search

Marketplace search lets users discover and install skills from the community. One person’s ability is always limited; collective knowledge goes much further.

Technical Approach

The search feature relies on the online API https://api.hagicode.com/v1/skills/search. To improve response speed, the system implements caching. Cache is a bit like memory: if you keep useful things around, you do not have to think so hard the next time.

private async Task<IReadOnlyList<SkillGallerySkillDto>> SearchCatalogAsync(
    string query,
    CancellationToken cancellationToken,
    IReadOnlySet<string>? allowedSources = null)
{
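    // Sort the sources so the key is stable, and keep "no filter" (null) distinct from an empty set.
    var sourceKey = allowedSources is null
        ? "*"
        : string.Join(",", allowedSources.OrderBy(s => s, StringComparer.Ordinal));
    var cacheKey = $"skill_search:{query}:{sourceKey}";

    if (_memoryCache.TryGetValue(cacheKey, out IReadOnlyList<SkillGallerySkillDto>? cached) && cached is not null)
        return cached;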

    var response = await _onlineApiClient.SearchAsync(
        new SearchSkillsRequest
        {
            Query = query,
            Limit = _options.LimitPerQuery,
        },
        cancellationToken);

    var results = response.Skills
        .Where(skill => allowedSources is null || allowedSources.Contains(skill.Source))
        .Select(skill => new SkillGallerySkillDto { ... })
        .ToList();

    _memoryCache.Set(cacheKey, results, TimeSpan.FromMinutes(10));
    return results;
}

Search results support filtering by trusted sources, so users only see skill sources they trust. Seed queries such as popular and recent are used to initialize the catalog, allowing users to see recommended popular skills the first time they open it. First impressions still matter.
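
The caching idea is independent of the memory cache used on the backend. As a rough TypeScript sketch, assuming a ten-minute lifetime like the one above, it could look like this. TtlCache and searchCacheKey are hypothetical names, and the key scheme is only one possible choice.

```typescript
// Minimal time-to-live cache; the clock is injectable so expiry can be tested.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>()
  private ttlMs: number
  private now: () => number

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs
    this.now = now
  }

  get(key: string): V | undefined {
    const hit = this.entries.get(key)
    if (!hit) return undefined
    if (hit.expiresAt <= this.now()) {
      this.entries.delete(key) // expired: evict lazily on read
      return undefined
    }
    return hit.value
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs })
  }
}

// Sort the sources for a stable key; "*" marks "no source filter".
function searchCacheKey(query: string, allowedSources?: ReadonlySet<string>): string {
  const sources = allowedSources ? Array.from(allowedSources).sort().join(",") : "*"
  return `skill_search:${query}:${sources}`
}
```

Including the source filter in the key matters because the cached list is already filtered: a user with a different trusted set must not receive someone else's results.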

Core Capability 3: Intelligent Recommendations

Intelligent recommendations are the most complex part of the system. They can automatically recommend the most suitable skills based on the current project context. Complex as it is, it is still worth building.

Recommendation Flow

The full recommendation flow is divided into five stages:

1. Build project context
   ↓
2. AI generates search queries
   ↓
3. Search the online catalog in parallel
   ↓
4. AI ranks the candidates
   ↓
5. Return the recommendation list

First, the system analyzes characteristics such as the project’s technology stack, programming languages, and domain structure to build a “project profile.” That profile is a bit like a resume, recording the key traits of the project.

Then an AI Grain is used to generate targeted search queries. This design is actually quite interesting: instead of directly asking the AI, “What skills should I recommend?”, we first ask it to think about “What search terms are likely to find relevant skills?” Sometimes the way you ask the question matters more than the answer itself:

var queryGeneration = await aiGrain.GenerateSkillRecommendationQueriesAsync(
    projectContext,       // Project context
    locale,               // User language preference
    maxQueries,           // Maximum number of queries
    effectiveSearchHero); // AI model selection

Next, those search queries are executed in parallel to gather a candidate skill list. Parallel processing is, at the end of the day, just a way to save time.
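
The parallel step can be sketched as follows, in TypeScript for brevity. Skipping failed queries and de-duplicating candidates by id are assumptions made for the example, and Candidate, SearchFn, and gatherCandidates are hypothetical names.

```typescript
type Candidate = { id: string; source: string }
type SearchFn = (query: string) => Promise<Candidate[]>

// Run all queries concurrently, skip the ones that fail, de-duplicate by id.
async function gatherCandidates(queries: string[], search: SearchFn): Promise<Candidate[]> {
  const settled = await Promise.allSettled(queries.map((query) => search(query)))
  const byId = new Map<string, Candidate>()
  for (const result of settled) {
    if (result.status !== "fulfilled") continue
    for (const candidate of result.value) {
      if (!byId.has(candidate.id)) byId.set(candidate.id, candidate)
    }
  }
  return Array.from(byId.values())
}
```

Promise.allSettled is used instead of Promise.all so that one failing query does not discard the results of the others.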

Finally, another AI Grain ranks the candidate skills. This step considers factors such as skill relevance to the project, trust status, and user historical preferences:

var ranking = await aiGrain.RankSkillRecommendationsAsync(
    projectContext,
    candidates,
    installedSkillNames,
    locale,
    maxRecommendations,
    effectiveRankingHero);

response.Items = MergeRecommendations(projectContext, candidates, ranking, maxRecommendations);

Fallback Mechanism

AI models can respond slowly or become temporarily unavailable. Even the best systems stumble sometimes. For that reason, the system includes a deterministic fallback mechanism: when the AI service is unavailable, it uses a rule-based heuristic algorithm to generate recommendations, such as inferring likely required skills from dependencies in package.json.

Put plainly, this fallback mechanism is simply a backup plan for the system.
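
As an illustration of that backup plan, the sketch below tries an AI query generator first and falls back to a rule table driven by package.json dependencies. The rule table is invented for the example and does not reflect HagiCode's real rules; heuristicQueries and queriesWithFallback are hypothetical names.

```typescript
type PackageJson = {
  dependencies?: Record<string, string>
  devDependencies?: Record<string, string>
}

// Hypothetical rule table (dependency name -> search query).
const dependencyRules: Array<{ dependency: string; query: string }> = [
  { dependency: "react", query: "react" },
  { dependency: "dockerode", query: "docker" },
]

// Deterministic path: derive queries from declared dependencies, no AI involved.
function heuristicQueries(pkg: PackageJson): string[] {
  const declared = new Set([
    ...Object.keys(pkg.dependencies ?? {}),
    ...Object.keys(pkg.devDependencies ?? {}),
  ])
  return dependencyRules
    .filter((rule) => declared.has(rule.dependency))
    .map((rule) => rule.query)
}

// Try the AI-generated queries first; fall back to the heuristic on any failure.
async function queriesWithFallback(
  generateWithAi: () => Promise<string[]>,
  pkg: PackageJson,
): Promise<string[]> {
  try {
    return await generateWithAi()
  } catch {
    return heuristicQueries(pkg)
  }
}
```

The fallback is deterministic: the same package.json always produces the same queries, which also makes this path easy to test.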

Core Capability 4: Trusted Provider Management

Trusted provider management allows users to control which skill sources are considered trustworthy. Trust is still something users should be able to define for themselves.

Matching Rules

Trusted providers support two matching rules: exact match (exact) and prefix match (prefix).

public static TrustedSkillProviderResolutionSnapshot Resolve(
    TrustedSkillProviderSnapshot snapshot,
    string source)
{
    var normalizedSource = Normalize(source);

    foreach (var entry in snapshot.Entries.OrderBy(e => e.SortOrder))
    {
        if (!entry.IsEnabled) continue;

        foreach (var rule in entry.MatchRules)
        {
            bool isMatch = rule.MatchType switch
            {
                TrustedSkillProviderMatchRuleType.Exact
                    => string.Equals(normalizedSource, Normalize(rule.Value),
                        StringComparison.OrdinalIgnoreCase),
                TrustedSkillProviderMatchRuleType.Prefix
                    => normalizedSource.StartsWith(Normalize(rule.Value) + "/",
                        StringComparison.OrdinalIgnoreCase),
                _ => false
            };

            if (isMatch)
                return new TrustedSkillProviderResolutionSnapshot
                {
                    IsTrustedSource = true,
                    ProviderId = entry.ProviderId,
                    DisplayName = entry.DisplayName
                };
        }
    }

    return new TrustedSkillProviderResolutionSnapshot { IsTrustedSource = false };
}

Built-in trusted providers include well-known organizations and projects such as Vercel, Azure, anthropics, Microsoft, and browser-use. Custom providers can be added through configuration files by specifying a provider ID, display name, badge label, matching rules, and more. The world is large enough that only trusting a few built-ins would never be enough.
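
The appended "/" in the prefix rule is easy to overlook, so here is a small TypeScript rendering of the two match types. The normalization (trim plus lowercase) is an assumption, since Normalize is not shown above, and the source strings in the usage note are made-up examples.

```typescript
type MatchRule = { matchType: "exact" | "prefix"; value: string }

// Assumed normalization; the real Normalize may do more.
const normalize = (text: string): string => text.trim().toLowerCase()

function matchesRule(source: string, rule: MatchRule): boolean {
  const normalizedSource = normalize(source)
  const normalizedValue = normalize(rule.value)
  return rule.matchType === "exact"
    ? normalizedSource === normalizedValue
    : normalizedSource.startsWith(normalizedValue + "/")
}
```

With a prefix rule of "vercel", a source such as "vercel/next-skills" matches, while "vercel-labs/skills" and the bare string "vercel" do not. Requiring the slash keeps a lookalike organization name from inheriting trust.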

Persistence Implementation

Trusted configuration is persisted using an Orleans Grain:

public class TrustedSkillProviderGrain : Grain<TrustedSkillProviderState>,
    ITrustedSkillProviderGrain
{
    public async Task UpdateConfigurationAsync(TrustedSkillProviderSnapshot snapshot)
    {
        State.Snapshot = snapshot;
        await WriteStateAsync();
    }

    public Task<TrustedSkillProviderSnapshot> GetConfigurationAsync()
    {
        return Task.FromResult(State.Snapshot);
    }
}

The benefit of this approach is that the grain acts as the single source of truth. Orleans keeps one activation of a given grain across the cluster, so every node reads the same configuration through it, and there is no per-node cache to refresh manually. Automation is, ultimately, about letting people worry less.

Key Technical Design

Command Execution Adapter Pattern

The Skill system needs to execute various npx commands. If that logic were scattered everywhere, the code would quickly become difficult to maintain. That is why we designed an adapter interface. Design patterns, in the end, exist to make code easier to maintain:

public interface ISkillInstallCommandRunner
{
    Task<SkillInstallCommandExecutionResult> ExecuteAsync(
        SkillInstallCommandExecutionRequest request,
        CancellationToken cancellationToken = default);
}

Different commands have different executor implementations, but all of them implement the same interface, making testing and replacement straightforward.

SSE Streaming Output

Installation progress is pushed to the frontend in real time through Server-Sent Events:

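// Requires: using System.Diagnostics; using System.Threading.Channels;
public async Task InstallWithProgressAsync(
    SkillInstallRequestDto request,
    IServerStreamWriter<SkillInstallProgressEventDto> stream,
    CancellationToken cancellationToken)
{
    using var process = new Process
    {
        StartInfo = new ProcessStartInfo
        {
            FileName = "npx",
            // FullReference is interpolated into the command line, so validate it upstream.
            Arguments = $"skills add {request.FullReference} -g -y",
            RedirectStandardOutput = true,
            RedirectStandardError = true,
            UseShellExecute = false
        }
    };

    // Process events fire on thread-pool threads. Queue the lines so that a
    // single consumer writes to the stream and writes never overlap.
    var lines = Channel.CreateUnbounded<string>();

    // e.Data is null once a stream closes, so skip it instead of sending an empty event.
    DataReceivedEventHandler enqueue = (_, e) =>
    {
        if (e.Data is not null) lines.Writer.TryWrite(e.Data);
    };
    process.OutputDataReceived += enqueue;
    // stderr is redirected too, so it must be drained or the child can block on a full pipe.
    process.ErrorDataReceived += enqueue;

    process.Start();
    process.BeginOutputReadLine();
    process.BeginErrorReadLine();

    var forward = Task.Run(async () =>
    {
        await foreach (var line in lines.Reader.ReadAllAsync(cancellationToken))
        {
            await stream.WriteAsync(new SkillInstallProgressEventDto
            {
                EventType = "output",
                Data = line
            });
        }
    }, cancellationToken);

    try
    {
        await process.WaitForExitAsync(cancellationToken);
    }
    finally
    {
        // Close the queue so the forwarding loop ends once it has drained.
        lines.Writer.TryComplete();
    }

    await forward;
}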

On the frontend, users can see terminal-like output in real time, which makes the experience very intuitive. Real-time feedback helps people feel at ease.

Practical Guide

Installing a Community Skill

Take installing the pua skill as an example (it is a popular community skill):

Open the Skills drawer and switch to the Skill Gallery tab
Enter pua in the search box
Click the search result to view the skill details
Click the Install button
Switch to the Local Skills tab to confirm the installation succeeded

The installation command is npx skills add tanweai/pua -g -y, and the system handles all the details automatically. There are not really that many steps once you take them one by one.

Adding a Custom Trusted Source

If your team has its own skill repository, you can add it as a trusted source:

providerId: "my-team"
displayName: "My Team Skills"
badgeLabel: "MyTeam"
isEnabled: true
sortOrder: 100
matchRules:
  - matchType: "prefix"
    value: "my-team/"
  - matchType: "exact"
    value: "my-team/special-skill"

This way, all skills from your team will display a trusted badge, making users more comfortable installing them. Labels and signals do help people feel more confident.

Skill Development Basics

Creating a custom skill requires the following structure:

my-skill/
├── SKILL.md          # Skill metadata (YAML front matter)
├── index.ts          # Skill entry point
├── agents/           # Supported agent configuration
└── references/       # Reference resources

An example SKILL.md format:

---
name: my-skill
description: A brief description of what this skill does
---

# My Skill

Detailed documentation...

Notes

Network requirements: skill search and installation require access to api.hagicode.com and the npm registry
Node.js version: Node.js 18 or later is recommended
Permission requirements: global npm installation permissions are required
Concurrency control: only one install or uninstall operation can run for the same skill at a time
Timeout settings: the default timeout for installation is 4 minutes, but complex scenarios may require adjustment

These notes exist, ultimately, to help things go smoothly.

Conclusion

This article introduced the complete implementation of the Skill management system in the HagiCode project. Through a frontend-backend separated architecture, the adapter pattern, Orleans-based distributed state management, and related techniques, the system delivers:

Local global management: a unified skill management interface built by wrapping npx skills commands
Marketplace search: rapid discovery of community skills through the online API and caching mechanisms
Intelligent recommendations: AI-powered skill recommendations based on project context
Trust management: a flexible configuration system that lets users control trust boundaries

This design approach is not only applicable to Skill management. It is also useful as a reference for any scenario that needs to integrate command-line tools while balancing local storage and online services.

If this article helped you, feel free to give us a Star on GitHub: github.com/HagiCode-org/site. You can also visit the official site to learn more: hagicode.com.

You may think this system is well designed, or you may not. Either way, that is fine. Once code is written, someone will use it, and someone will not.

References

HagiCode project repository: github.com/HagiCode-org/site
HagiCode official site: hagicode.com
Claude Code official Skill documentation: docs.anthropic.com
Orleans framework documentation: dotnet.github.io/orleans
TanStack Start: tanstack.com/start

Copyright Notice

Thank you for reading. If you found this article useful, feel free to like, bookmark, and share it. This content was produced with AI-assisted collaboration, and the final content was reviewed and approved by the author.

Author: newbe36524
Original link: https://docs.hagicode.com/blog/2026-03-24-hagicode-skill-system-technical-analysis/
Copyright notice: Unless otherwise stated, all blog posts on this site are licensed under BY-NC-SA. Please include the source when reposting.

I Might Be Replaced by an Agent, So I Ran the Numbers

Mar 22, 2026

Quantifying AI replacement risk with data: a deep dive into how the HagiCode team uses six core formulas to redefine how knowledge workers evaluate their competitiveness.

Background

With AI technology advancing at breakneck speed, every knowledge worker is facing an urgent question: In the AI era, will I be replaced?

It sounds a little alarmist, but plenty of people are quietly uneasy about it. You just finish learning a new framework, and AI is already telling you your role might be automated away; you finally master a language, and then discover that someone using AI is producing three times as much as you. If you are reading this, you have probably felt at least some of that anxiety.

And honestly, that anxiety is not irrational. No one wants to admit that the skills they spent years building could be outperformed by a single ChatGPT session. Still, anxiety is one thing; life goes on.

Traditional discussions usually start from the question of “what AI can do,” but that framing misses two critical dimensions:

The business perspective: whether a company is willing to equip an employee with AI tools depends on whether AI costs make economic sense relative to labor costs. It is not enough for AI to be capable of replacing a role; the company also has to run the numbers. Capital is not a charity, and every dollar has to count.
The efficiency perspective: AI-driven productivity gains need to be quantified instead of being reduced to the vague claim that “using AI makes you stronger.” Maybe your efficiency doubles with AI, but someone else gets a 5x improvement. That gap matters. It is like school: everyone sits in the same class, but some score 90 while others barely pass.

So the real question is: how do we turn this fuzzy anxiety into measurable indicators?

It is always better to know where you stand than to fumble around in the dark. That is what we are talking about today: the design logic behind the AI productivity calculator built by the HagiCode team.

So I made a site: https://cost.hagicode.com.

About HagiCode

HagiCode is an open-source AI coding assistant project built to help developers code more efficiently.

What is interesting is that while building their own product, the HagiCode team accumulated a lot of hands-on experience around AI productivity. They realized that the value of an AI tool cannot be assessed in isolation from a company’s employment costs. Based on that insight, the team decided to build a productivity calculator to help knowledge workers evaluate their competitiveness in the AI era more scientifically.

Plenty of people could build something like this. The difference is that very few are willing to do it seriously. The HagiCode team spent time on it as a way of giving something back to the developer community.

The design shared in this article is a summary of HagiCode’s experience applying AI in real engineering work. If you find this evaluation framework valuable, it suggests that HagiCode really does have something to offer in engineering practice. In that case, the HagiCode project itself is also worth paying attention to.

The Core Formulas: 6 Key Metrics

1. Total Annual Employment Cost

A company’s real cost for an employee is far more than salary alone. A lot of people only realize this when changing jobs: you negotiate a monthly salary of 20,000 CNY, but take home only 14,000. On the company side, the spend is not just 20,000 either. Social insurance, housing fund contributions, training, and recruiting costs all have to be included.

According to the implementation in calculate-ai-risk.ts:

Total annual employment cost = Annual salary x (1 + city coefficient) + Annual salary / 12

The city coefficient reflects differences in hiring and retention costs across cities:

City tier	Representative cities	Coefficient
Tier 1	Beijing / Shanghai / Shenzhen / Guangzhou	0.4
New Tier 1	Hangzhou / Chengdu / Suzhou / Nanjing	0.3
Tier 2	Wuhan / Xi’an / Tianjin / Zhengzhou	0.2
Other	Yichang / Luoyang and others	0.1

A Tier 1 city coefficient of 0.4 means the company needs to pay roughly 40% extra in recruiting, training, insurance, and similar overhead. The all-in cost of hiring someone in Beijing really is much higher than in a Tier 2 city.
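
As a rough sketch of that formula in TypeScript (the language of calculate-ai-risk.ts), the calculation could look like the following. The tier keys other than `tier1`, the `CITY_COEFFICIENT` table, and the function name are assumptions made for illustration; the real implementation may be organized differently.

```typescript
// Coefficients come from the table above; key names are illustrative.
type CityTier = "tier1" | "newTier1" | "tier2" | "other";

const CITY_COEFFICIENT: Record<CityTier, number> = {
  tier1: 0.4,
  newTier1: 0.3,
  tier2: 0.2,
  other: 0.1,
};

// Total annual employment cost = salary x (1 + city coefficient) + salary / 12
function totalAnnualEmploymentCost(annualSalaryCny: number, cityTier: CityTier): number {
  return annualSalaryCny * (1 + CITY_COEFFICIENT[cityTier]) + annualSalaryCny / 12;
}

// A 400,000 CNY salary in a Tier 1 city: 560,000 + 33,333 CNY
const beijingCost = totalAnnualEmploymentCost(400000, "tier1");
console.log(Math.round(beijingCost)); // 593333
```

Keeping the coefficients in a lookup table rather than in branches means a new city tier is a one-line change.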

The cost of living in major cities is high too. You could think of it as another version of a “drifter tax.”

2. Blended Token Unit Price

Different AI models have separate input and output pricing, and the gap can be huge. In coding scenarios, the input/output ratio is roughly 3:1. You might give the AI a block of code to review, while its analysis is usually much shorter than the input.

The blended unit price formula is:

Blended unit price = (input-output ratio x input price + output price) / (input-output ratio + 1)

Take GPT-5 as an example:

Input: $2.5/1M tokens
Output: $15/1M tokens
Blended = (3 x 2.5 + 15) / 4 = $5.625/1M tokens

For models priced in USD, you also need to convert using an exchange rate. The HagiCode team currently sets that rate to 7.25 and updates it as the market changes.
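
The blending step and the currency conversion fit in a few lines. This is only a sketch: the `ModelPricing` shape and the function name are hypothetical, while the prices, the 3:1 ratio, and the 7.25 rate are the figures quoted above.

```typescript
// Hypothetical pricing record; prices are per 1M tokens.
interface ModelPricing {
  inputPerMillion: number;
  outputPerMillion: number;
}

// Weighted average of input and output prices, weighted by the input/output ratio.
function blendedUnitPrice(pricing: ModelPricing, inputOutputRatio: number = 3): number {
  return (
    (inputOutputRatio * pricing.inputPerMillion + pricing.outputPerMillion) /
    (inputOutputRatio + 1)
  );
}

const USD_TO_CNY = 7.25; // rate quoted in the article; it changes over time

const gpt5Pricing: ModelPricing = { inputPerMillion: 2.5, outputPerMillion: 15 };
const blendedUsd = blendedUnitPrice(gpt5Pricing); // (3 x 2.5 + 15) / 4
const blendedCny = blendedUsd * USD_TO_CNY;

console.log(blendedUsd); // 5.625
console.log(blendedCny); // 40.78125
```

Because the ratio is a parameter, the same function covers workloads that are more output-heavy than typical code review.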

Exchange rates are like the stock market: no one can predict them exactly. You just follow the trend.

3. Annual AI Cost

Average daily AI cost = Average daily token demand (M) x blended unit price (CNY/1M)
Annual AI cost = Average daily AI cost x 264 working days

264 = 22 days/month x 12 months, which is the number of working days in a standard year. Why not use 365? Because you have to account for weekends, holidays, sick leave, and so on.
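
A minimal sketch of the two lines above, assuming the daily token demand is expressed in millions and the blended price in CNY per 1M tokens. The function and constant names are illustrative, not taken from the project.

```typescript
// 22 working days per month x 12 months
const WORKING_DAYS_PER_YEAR = 22 * 12;

// Annual AI cost = daily token demand (M) x blended unit price (CNY/1M) x working days
function annualAiCost(dailyTokenUsageM: number, blendedPriceCnyPerMillion: number): number {
  const dailyCost = dailyTokenUsageM * blendedPriceCnyPerMillion;
  return dailyCost * WORKING_DAYS_PER_YEAR;
}

// 50M tokens per day at the GPT-5 blended price from the previous section (40.78125 CNY/1M)
const yearlyCost = annualAiCost(50, 40.78125);
console.log(yearlyCost); // 538312.5
```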

We are not robots, after all. AI may not need rest, but people still need room to breathe.

4. The Core Innovation: Equivalent Headcount

This is the heart of the whole evaluation system, and also where the HagiCode team’s insight shows most clearly.

Affordable workflow count = Total annual employment cost / Annual AI cost
Affordability ratio = min(affordable workflow count, 1)
Equivalent headcount = 1 + (productivity multiplier - 1) x affordability ratio

That formula looks a little abstract, so let me unpack it.

The traditional view would simply say, “your efficiency improved by 2x.” But this formula introduces a crucial constraint: is the company’s AI budget sustainable?

For example, Xiao Ming improves his efficiency by 3x, but his annual AI usage costs 300,000 CNY while the company is only paying him a salary of 200,000 CNY. In that case, his personal productivity may be impressive, but it is not sustainable. No company is going to lose money just to keep him operating at peak efficiency.

That is what the affordability ratio means. If the company can only afford 0.5 of an AI workflow, then Xiao Ming’s equivalent headcount is 1 + (3 - 1) x 0.5 = 2 people, not 3.

The key insight: what matters is not just how large your productivity multiplier is, but whether the company can afford the AI investment required to sustain that multiplier.
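
To make the three formulas concrete, here is a small sketch. The result shape and names are hypothetical, and the handling of zero AI spend (treated as "not a constraint") is an assumption of this sketch, since the article does not say how that case is handled.

```typescript
interface HeadcountResult {
  affordableWorkflowCount: number;
  affordabilityRatio: number;
  equivalentHeadcount: number;
}

function equivalentHeadcount(
  totalEmploymentCostCny: number,
  annualAiCostCny: number,
  productivityMultiplier: number,
): HeadcountResult {
  // Assumption: with no AI spend, affordability never limits the multiplier.
  const affordableWorkflowCount =
    annualAiCostCny > 0 ? totalEmploymentCostCny / annualAiCostCny : Infinity;
  // Capped at 1: being able to afford more than one workflow does not raise the multiplier.
  const affordabilityRatio = Math.min(affordableWorkflowCount, 1);
  return {
    affordableWorkflowCount,
    affordabilityRatio,
    equivalentHeadcount: 1 + (productivityMultiplier - 1) * affordabilityRatio,
  };
}

// Employment cost covers only half of the AI bill: 1 + (3 - 1) x 0.5
const constrained = equivalentHeadcount(150000, 300000, 3);
console.log(constrained.equivalentHeadcount); // 2

// Employment cost is six times the AI bill: the full 3x counts
const unconstrained = equivalentHeadcount(600000, 100000, 3);
console.log(unconstrained.equivalentHeadcount); // 3
```

The `Math.min` cap is what keeps the metric honest: a cheap workflow cannot push the result above the productivity multiplier itself.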

The logic is simple once you see it. Most people just do not think from that angle. We are used to looking at the world from our own side, not from the boss’s side, where money does not come out of thin air either.

5. Cost-Benefit Ratio

AI cost ratio = Annual AI cost / Total annual employment cost
Productivity gain = Productivity multiplier - 1
Cost-benefit ratio = Productivity gain / AI cost ratio

Cost-benefit ratio < 1: the AI investment is not worth it; the productivity gain does not justify the cost
Cost-benefit ratio 1-2: barely worth it
Cost-benefit ratio > 2: high return, strongly recommended

This metric is especially useful for managers because it helps them quickly judge whether a given role is worth equipping with AI tools.
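
Here is one way the ratio and its three bands could be expressed. The verdict labels and function name are made up for this sketch, and placing a ratio of exactly 2 in the middle band is an assumption, since the thresholds above leave that boundary open.

```typescript
type Verdict = "not worth it" | "barely worth it" | "high return";

interface CostBenefit {
  aiCostRatio: number;
  productivityGain: number;
  costBenefitRatio: number;
  verdict: Verdict;
}

function costBenefit(
  totalEmploymentCostCny: number,
  annualAiCostCny: number,
  productivityMultiplier: number,
): CostBenefit {
  if (totalEmploymentCostCny <= 0 || annualAiCostCny <= 0) {
    // The ratio is undefined without a positive cost on both sides.
    throw new RangeError("Both costs must be positive");
  }
  const aiCostRatio = annualAiCostCny / totalEmploymentCostCny;
  const productivityGain = productivityMultiplier - 1;
  const costBenefitRatio = productivityGain / aiCostRatio;

  let verdict: Verdict;
  if (costBenefitRatio < 1) verdict = "not worth it";
  else if (costBenefitRatio <= 2) verdict = "barely worth it";
  else verdict = "high return";

  return { aiCostRatio, productivityGain, costBenefitRatio, verdict };
}

// AI costs as much as the employee and adds only 50% output: ratio 0.5
console.log(costBenefit(300000, 300000, 1.5).verdict); // "not worth it"

// AI costs 10% of the employee and doubles output: ratio 10
console.log(costBenefit(500000, 50000, 2).verdict); // "high return"
```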

At the end of the day, ROI is what matters. You can talk about higher efficiency all you want, but if the cost explodes, no one is going to buy the argument.

6. Risk Level

Risk is categorized according to equivalent headcount:

Equivalent headcount	Risk level	Conclusion
>= 2.0	High risk	If your coworkers gain the same conditions, they become a serious threat to you
1.5 - 2.0	Warning	Coworkers have begun to build a clear productivity advantage
< 1.5	Safe	For now, you can still maintain a gap

After seeing that table, you probably have a rough sense of where you stand. Still, there is no point in panicking. Anxiety does not solve problems. It is better to think about how to raise your own productivity multiplier.
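
The table maps directly onto a pair of threshold checks. A minimal sketch, with illustrative level names:

```typescript
type RiskLevel = "high" | "warning" | "safe";

// Thresholds from the table above: >= 2.0 high, 1.5 up to 2.0 warning, below 1.5 safe.
function riskLevel(equivalentHeadcount: number): RiskLevel {
  if (equivalentHeadcount >= 2.0) return "high";
  if (equivalentHeadcount >= 1.5) return "warning";
  return "safe";
}

console.log(riskLevel(3)); // "high"
console.log(riskLevel(1.5)); // "warning"
console.log(riskLevel(1.2)); // "safe"
```

Checking the highest threshold first keeps each branch to a single comparison.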

Gamified Design: 7 Special Titles

To make the results more fun, the calculator introduces a system of seven special titles. These titles are persisted through localStorage, allowing users to unlock and display their own “achievements.”

Title ID	Name	Unlock condition
craftsman-spirit	Craftsman Spirit	Average daily token usage = 0
prompt-alchemist	Prompt Alchemist	Daily tokens <= 20M and productivity multiplier >= 6
all-in-operator	All-In Operator	Daily tokens >= 150M and productivity multiplier >= 3
minimalist-runner	Minimalist Runner	Daily tokens <= 5M and productivity multiplier >= 2
cost-tamer	Cost Tamer	Cost-benefit ratio >= 2.5 and AI cost ratio <= 15%
danger-oracle	Danger Oracle	Equivalent headcount >= 2.5 or entering the high-risk zone
budget-coordinator	Budget Coordinator	Affordable workflow count >= 8

Each title also carries a hidden meaning:

Title	Hidden meaning
Craftsman Spirit	You can still do fine without AI, but you need unique competitive strengths
Prompt Alchemist	You achieve high output with very few tokens; a classic power-user profile
All-In Operator	High input, high output; suitable for high-frequency scenarios
Minimalist Runner	Lightweight AI usage; suitable for light-assistance scenarios
Cost Tamer	Extremely high ROI; the kind of employee companies love
Danger Oracle	You are already, or soon will be, in a high-risk group
Budget Coordinator	You can operate multiple AI workflows at the same time

Gamification is really just a way to make dry data a little more entertaining. After all, who does not like collecting achievements? Like badges in a game, they may not have much practical value, but they still feel good to earn.
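
The unlock conditions are simple predicates over the calculator's outputs, so they can be written as a rule table. The sketch below assumes a `CalculatorResult` shape and rule structure that are not from the project, reads "entering the high-risk zone" as an equivalent headcount of at least 2.0 (per the risk table), and leaves out the localStorage persistence.

```typescript
// Hypothetical shape of the values the calculator has already computed.
interface CalculatorResult {
  dailyTokenUsageM: number;
  productivityMultiplier: number;
  costBenefitRatio: number;
  aiCostRatio: number; // fraction, so 0.15 means 15%
  equivalentHeadcount: number;
  affordableWorkflowCount: number;
}

interface TitleRule {
  id: string;
  isUnlocked: (r: CalculatorResult) => boolean;
}

const HIGH_RISK_HEADCOUNT = 2.0; // assumption: threshold taken from the risk table

const TITLE_RULES: TitleRule[] = [
  { id: "craftsman-spirit", isUnlocked: (r) => r.dailyTokenUsageM === 0 },
  { id: "prompt-alchemist", isUnlocked: (r) => r.dailyTokenUsageM <= 20 && r.productivityMultiplier >= 6 },
  { id: "all-in-operator", isUnlocked: (r) => r.dailyTokenUsageM >= 150 && r.productivityMultiplier >= 3 },
  { id: "minimalist-runner", isUnlocked: (r) => r.dailyTokenUsageM <= 5 && r.productivityMultiplier >= 2 },
  { id: "cost-tamer", isUnlocked: (r) => r.costBenefitRatio >= 2.5 && r.aiCostRatio <= 0.15 },
  {
    id: "danger-oracle",
    isUnlocked: (r) => r.equivalentHeadcount >= 2.5 || r.equivalentHeadcount >= HIGH_RISK_HEADCOUNT,
  },
  { id: "budget-coordinator", isUnlocked: (r) => r.affordableWorkflowCount >= 8 },
];

// Returns the IDs of every title whose condition holds for the given result.
function unlockedTitles(result: CalculatorResult): string[] {
  return TITLE_RULES.filter((rule) => rule.isUnlocked(result)).map((rule) => rule.id);
}

const sampleResult: CalculatorResult = {
  dailyTokenUsageM: 50,
  productivityMultiplier: 3,
  costBenefitRatio: 12.6,
  aiCostRatio: 0.16,
  equivalentHeadcount: 3,
  affordableWorkflowCount: 6.3,
};
console.log(unlockedTitles(sampleResult)); // only "danger-oracle" is unlocked
```

Keeping the rules as data fits the configuration-driven style the calculator uses elsewhere: adding an eighth title is one more array entry.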

Data Sources: An Authoritative Pricing System

The calculator’s pricing data comes from multiple official API pricing pages to keep the results authoritative and up to date:

OpenAI: Official API pricing page
Anthropic Claude: Official pricing docs
DeepSeek: CNY pricing page
Zhipu GLM: Zhipu Open Platform pricing page
MiniMax: Pay-as-you-go pricing

This data is updated regularly, with the latest refresh on 2026-03-19.

Data only matters when it is current. Once it is outdated, it stops being useful. On that front, the HagiCode team has been quite responsible about keeping things updated.

Practical Example

Suppose you are a developer in Beijing with an annual salary of 400,000 CNY, using Claude Sonnet 4.6, consuming 50M tokens per day on average, and estimating that AI gives you a 3x productivity boost. The simulated input looks like this:

const input = {
  annualIncomeCny: 400000,
  cityTier: "tier1",           // Beijing
  modelId: "claude-sonnet-4-6",
  performanceMultiplier: 3.0,
  dailyTokenUsageM: 50,
}

// Calculation process
// Total annual employment cost = 400k x (1 + 0.4) + 400k/12 ~= 593.3k
// Annual AI cost ~= 50 x 7.125 x 264 ~= 94k
// Affordable workflow count ~= 593.3 / 94 ~= 6.3 workflows
// Equivalent headcount = 1 + (3 - 1) x 1 = 3 people

Conclusion: if one of your coworkers has the same conditions, their output would be equivalent to three people. You are already in the high-risk zone.

If you discover that your current AI usage is “not worth it” (cost-benefit ratio < 1), you can consider:

Reducing token usage: use more efficient prompts and cut down ineffective requests
Choosing a more cost-effective model: for example, DeepSeek-V3 (priced in CNY and cheaper)
Increasing your productivity multiplier: learn advanced Agent usage techniques and truly turn AI into productivity

In the end, all of this comes down to the art of balance. Use too much and you waste money; use too little and nothing changes. The key is finding the sweet spot.

Technical Architecture Highlights

When designing this calculator, the HagiCode team made several engineering decisions worth learning from:

Pure frontend computation: all calculations run in the browser, with no backend API dependency, which protects user privacy
Configuration-driven: all formulas, pricing, and role data are centralized in configuration files, so future updates do not require changing core code logic
Multilingual support: supports both Chinese and English
Instant feedback: results update in real time as soon as the user changes inputs
Detailed formula display: every result includes the full calculation formula to help users understand it

This design makes the calculator easy to maintain and extend, while also serving as a reference template for similar data-driven applications.

Good architecture, like good code, takes time to build up. The HagiCode team put real thought into it.

Conclusion

The core value of the AI productivity calculator is that it turns the vague anxiety of an “AI replacement threat” into metrics that can be quantified and compared.

The equivalent headcount formula, 1 + (productivity multiplier - 1) x affordability ratio, is the core innovation of the entire framework. It considers not only productivity gains, but also whether a company can afford the AI cost, making the evaluation much closer to reality.

This framework tells us one thing clearly: in the AI era, not knowing where you stand is the most dangerous position of all.

Instead of worrying, let the data speak.

A lot of fear comes from the unknown. Once you quantify everything, the situation no longer feels quite so terrifying. At worst, you improve yourself or change tracks. Life is long, and there is no need to hang everything on a single tree.

If This Helped You

Leave a like so more people can see it
Give us a Star on GitHub: github.com/HagiCode-org/site
Visit the official website to learn more: hagicode.com
Watch the 30-minute hands-on demo: www.bilibili.com/video/BV1pirZBuEzq/
One-click install experience: docs.hagicode.com/installation/docker-compose
Quick installation for the Desktop app: hagicode.com/desktop/

Visit cost.hagicode.com now and complete your AI productivity assessment.

References

Data source: cost.hagicode.com | Powered by HagiCode

In the end, a line of poetry came to mind: “This feeling might have become a thing to remember, yet even then one was already lost.” The AI era is much the same. Instead of waiting until you are replaced and filled with regret, it is better to start taking action now…

Copyright Notice

Thank you for reading. If you found this article useful, likes, bookmarks, and shares are all welcome. This content was created with AI-assisted collaboration, and the final version was reviewed and confirmed by the author.

Author: newbe36524
Original article: https://docs.hagicode.com/blog/2026-03-22-ai-productivity-calculator-science/
Copyright: Unless otherwise stated, all posts on this blog are licensed under BY-NC-SA. Please cite the source when reprinting.

Hagicode.Libs: Engineering Practice for Unified Integration of Multiple AI Coding Assistant CLIs

Mar 20, 2026

During the development of the HagiCode project, we needed to integrate multiple AI coding assistant CLIs at the same time, including Claude Code, Codex, and CodeBuddy. Each CLI has different interfaces, parameters, and output formats, and the repeated integration code made the project harder and harder to maintain. In this article, we share how we built a unified abstraction layer with HagiCode.Libs to solve this engineering pain point. You could also say it is simply some hard-earned experience gathered from the pitfalls we have already hit.

Background

The market for AI coding assistants is quite lively now. Besides Claude Code, there are also OpenAI’s Codex, Tencent’s CodeBuddy, and more. As an AI coding assistant project, HagiCode needs to integrate these different CLI tools across multiple subprojects, including desktop, backend, and web.

At first, the problem was manageable. Integrating one CLI was only a few hundred lines of code. But as the number of CLIs we needed to support kept growing, things started to get messy.

Each CLI has its own command-line argument format, different environment variable requirements, and a wide variety of output formats. Some output JSON, some output streaming JSON, and some output plain text. On top of that, there are cross-platform compatibility issues. Executable discovery and process management work very differently between Windows and Unix systems, so code duplication kept increasing. In truth, it was just a bit more Ctrl+C and Ctrl+V, but maintenance quickly became painful.

The most frustrating part was that every time we wanted to add support for a new CLI capability, we had to change the same code in several projects. That approach was clearly not sustainable in the long run. Code has a temper too; duplicate it too many times and it starts causing trouble.

About HagiCode

The approach shared in this article comes from our practical experience in the HagiCode project. HagiCode is an open-source AI coding assistant project that needs to maintain multiple subprojects at the same time, including a frontend VS Code extension, backend AI services, and a cross-platform desktop client. In a way, it was exactly this complex, multi-language, multi-platform environment that led to the birth of HagiCode.Libs. You could say we were forced into it, and so be it.

Analysis: Finding Common Ground

Although these AI coding assistant CLIs each have their own characteristics, from a technical perspective they share several obvious traits:

Similar interaction patterns: they all start a CLI process, send a prompt, receive streaming responses, parse messages, and then either end or continue the session. At the end of the day, the whole flow follows the same basic mold.

Similar configuration needs: they all need API key authentication, working directory setup, model selection, tool permission control, and session management. After all, everyone is making a living from APIs; the differences are mostly a matter of flavor.

The same cross-platform challenges: they all need to solve executable path resolution (claude vs claude.exe vs /usr/local/bin/claude), process startup and environment variable handling, shell command escaping, and argument construction. Cross-platform work is painful no matter how you describe it. Only people who have stepped into the traps really understand the difference between Windows and Unix.

Based on this analysis, we needed a unified abstraction layer that could provide a consistent interface, encapsulate cross-platform CLI discovery logic, handle streaming output parsing, and support both dependency injection and non-DI scenarios. It is the kind of problem that makes your head hurt just thinking about it, but you still have to face it. After all, it is our own project, so we have to finish it even if we have to cry our way through it.

Solution: HagiCode.Libs

We created HagiCode.Libs, a lightweight .NET 10 library workspace released under the MIT license and now published on GitHub. It may not be some world-shaking masterpiece, but it is genuinely useful for solving real problems.

Project structure

HagiCode.Libs/
├── src/
│   ├── HagiCode.Libs.Core/           # Core capabilities
│   │   ├── Discovery/                 # CLI executable discovery
│   │   ├── Process/                   # Cross-platform process management
│   │   ├── Transport/                 # Streaming message transport
│   │   └── Environment/               # Runtime environment resolution
│   ├── HagiCode.Libs.Providers/       # Provider implementations
│   │   ├── ClaudeCode/                # Claude Code provider
│   │   ├── Codex/                     # Codex provider
│   │   └── Codebuddy/                 # CodeBuddy provider
│   ├── HagiCode.Libs.ConsoleTesting/  # Testing framework
│   ├── HagiCode.Libs.ClaudeCode.Console/
│   ├── HagiCode.Libs.Codex.Console/
│   └── HagiCode.Libs.Codebuddy.Console/
└── tests/                             # xUnit tests

Design goals

When designing HagiCode.Libs, we followed a few principles. They all came from lessons learned the hard way:

Zero heavy framework dependencies: it does not depend on ABP or any other large framework, which keeps it lightweight. These days, the fewer dependencies you have, the fewer headaches you get. Most people have already been beaten up by dependency hell at least once.

Cross-platform support: native support for Windows, macOS, and Linux, without writing separate code for different platforms. One codebase that runs everywhere is a pretty good thing.

Streaming processing: CLI output is handled with asynchronous streams, which fits modern .NET programming patterns much better. Times change, and async is king.

Flexible integration: it supports dependency injection scenarios while also allowing direct instantiation. Different people have different preferences, so we wanted it to be convenient either way.

How to use it

Through dependency injection

If your project already uses dependency injection, such as ASP.NET Core or the generic host, you can integrate it directly. It is a small thing, but a well-behaved one:

using HagiCode.Libs.Providers;
using Microsoft.Extensions.DependencyInjection;

var services = new ServiceCollection();
services.AddHagiCodeLibs();

await using var provider = services.BuildServiceProvider();
var claude = provider.GetRequiredService<ICliProvider<ClaudeCodeOptions>>();

var options = new ClaudeCodeOptions
{
    ApiKey = "your-api-key",
    Model = "claude-sonnet-4-20250514"
};

await foreach (var message in claude.ExecuteAsync(options, "Hello, Claude!"))
{
    Console.WriteLine($"{message.Type}: {message.Content}");
}

Direct instantiation

If you are writing a simple script or working in a non-DI scenario, creating an instance directly also works. Put simply, it depends on your personal preference:

var claude = new ClaudeCodeProvider();
var options = new ClaudeCodeOptions
{
    ApiKey = "sk-ant-xxx",
    Model = "claude-sonnet-4-20250514"
};

await foreach (var message in claude.ExecuteAsync(options, "Help me write a quicksort"))
{
    // Handle messages
}

Both approaches use the same underlying implementation, so you can choose the integration style that best fits your project. There is no universal right answer in this world. What suits you is the best option. It may sound cliché, but it is true.

Practical experience

1. Dedicated testing consoles

Each provider has its own dedicated testing console project, making it easier to validate the integration independently. Testing is one of those things where if you are going to do it, you should do it properly:

# Claude Code tests
dotnet run --project src/HagiCode.Libs.ClaudeCode.Console -- --test-provider
dotnet run --project src/HagiCode.Libs.ClaudeCode.Console -- --test-all claude

# CodeBuddy tests
dotnet run --project src/HagiCode.Libs.Codebuddy.Console -- --test-provider codebuddy-cli

# Codex tests
dotnet run --project src/HagiCode.Libs.Codex.Console -- --test-provider codex-cli

The testing scenarios cover several key cases:

Ping: health check to confirm the CLI is available
Simple Prompt: basic prompt test
Complex Prompt: multi-turn conversation test
Session Restore/Resume: session recovery test
Repository Analysis: repository analysis test

This standalone testing console design is especially useful during debugging because it lets us quickly identify whether the issue is in the HagiCode.Libs layer or in the CLI itself. Debugging is really just about finding where the problem is. Once the direction is right, you are already halfway there.

2. Cross-platform CI/CD validation

Cross-platform compatibility is one of the core goals of HagiCode.Libs. We configured the GitHub Actions workflow .github/workflows/cli-discovery-cross-platform.yml to run real CLI discovery validation across ubuntu-latest, macos-latest, and windows-latest.

This ensures that no code change quietly breaks cross-platform compatibility. During local development, you can also reproduce it with the following commands. After all, you cannot ask CI to take the blame for everything. Your local environment should be able to run it too:

npm install --global @anthropic-ai/claude-code@2.1.79
HAGICODE_REAL_CLI_TESTS=1 dotnet test --filter "Category=RealCli"

3. Message stream processing

HagiCode.Libs uses asynchronous streams to process CLI output. Compared with traditional callback or event-based approaches, this fits the asynchronous programming style of modern .NET much better. In the end, this is simply how technology moves forward, whether anyone likes it or not:

public async IAsyncEnumerable<CliMessage> ExecuteAsync(
    TOptions options,
    string prompt,
    [EnumeratorCancellation] CancellationToken cancellationToken = default)
{
    // Start the CLI process
    // Parse streaming JSON output
    // Yield the CliMessage sequence
}

The message types include:

user: user message
assistant: assistant response
tool_use: tool invocation
result: session end

This design lets callers handle streaming output flexibly, whether for real-time display, buffered post-processing, or forwarding to other services. Why worry whether the sky is sunny or cloudy? What matters is that once the idea opens up, you can use it however you like.

4. Git repository exploration

The HagiCode.Libs.Exploration module provides Git repository discovery and status checking, which is especially useful in repository analysis scenarios. This feature was also born out of necessity, because HagiCode needs to analyze repositories:

// Discover Git repositories
var repositories = await GitRepositoryDiscovery.DiscoverAsync("/path/to/search");

// Get repository information
var info = await GitRepository.GetInfoAsync(repoPath);
Console.WriteLine($"Branch: {info.Branch}, Remote: {info.RemoteUrl}");
Console.WriteLine($"Has uncommitted changes: {info.HasUncommittedChanges}");

HagiCode’s code analysis capabilities use this module to identify project structure and Git status. It is a good example of making full use of what we built.

Things to note

Based on our practice in the HagiCode project, there are several points that deserve special attention. They are all real issues that need to be handled carefully:

API key security: do not hardcode API keys in your code. Use environment variables or configuration management instead. HagiCode.Libs supports passing configuration through Options objects, making it easier to integrate with different configuration sources. When it comes to security, there is no such thing as being too careful.

CLI version pinning: in CI/CD, we pin specific versions, such as @anthropic-ai/claude-code@2.1.79, to reduce uncertainty caused by version drift. It is also a good idea to pin versions in local development. Versioning can be painful, and an unpinned dependency tends to teach that lesson very quickly.

Test categorization: default tests use fake providers to keep them deterministic and fast, while real CLI tests must be enabled explicitly. This gives CI fast feedback while still allowing real-environment validation when needed. Striking that balance is never easy. Speed and stability always require trade-offs.

Session management: different CLIs have different session recovery mechanisms. Claude Code uses the .claude/ directory to store sessions, while Codex and CodeBuddy each have their own approaches. When using them, be sure to check their respective documentation and understand the details of their session persistence mechanisms. There is no harm in understanding it clearly.
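Two of the points above, reading API keys from the environment and keeping real CLI tests opt-in, can be sketched in a few lines. EnvironmentConfigSketch and its members are hypothetical helpers written for this article, not part of HagiCode.Libs; only the HAGICODE_REAL_CLI_TESTS variable name comes from the command shown earlier.

```csharp
using System;

public static class EnvironmentConfigSketch
{
    // Reads a secret from an environment variable and fails loudly when it is missing.
    public static string ReadRequiredKey(string variableName)
    {
        var value = Environment.GetEnvironmentVariable(variableName);
        if (string.IsNullOrWhiteSpace(value))
        {
            throw new InvalidOperationException(
                $"Environment variable '{variableName}' is not set.");
        }
        return value.Trim();
    }

    // Masks a key for logging so the full secret never reaches log output.
    public static string Mask(string key) =>
        key.Length <= 4 ? "****" : new string('*', key.Length - 4) + key[^4..];

    // Real CLI tests run only when the variable is exactly "1".
    public static bool RealCliTestsEnabled() =>
        Environment.GetEnvironmentVariable("HAGICODE_REAL_CLI_TESTS") == "1";
}
```

A test fixture could call RealCliTestsEnabled() and skip itself when the result is false, so the default test run never touches a real CLI.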

Summary

HagiCode.Libs is the unified abstraction layer we built during the development of HagiCode to solve the repeated engineering work involved in multi-CLI integration. By providing a consistent interface, encapsulating cross-platform details, and supporting flexible integration patterns, it greatly reduces the engineering complexity of integrating multiple AI coding assistants. Much may fade away, but the experience remains.

If you also need to integrate multiple AI CLI tools in your project, or if you are interested in cross-platform process management and streaming message handling, feel free to check it out on GitHub. The project is released under the MIT license, and contributions and feedback are welcome. In the end, it is a happy coincidence that we met here, so since you are already here, we might as well become friends.

The approach shared in this article was shaped by real pitfalls and real optimization work inside HagiCode. What else could we do? Running into pitfalls is normal. If you think this solution is valuable, then perhaps our engineering work is doing all right. And HagiCode itself may also be worth your attention. You might even find a pleasant surprise.

References

HagiCode.Libs GitHub: github.com/HagiCode-org/Hagicode.Libs
HagiCode main project: github.com/HagiCode-org/site
HagiCode official website: hagicode.com
Claude Code official documentation: docs.anthropic.com

If this article helped you:

Give it a like so more people can see it
Give us a Star on GitHub: github.com/HagiCode-org/site
Visit the official website to learn more: hagicode.com
Watch the 30-minute hands-on demo: www.bilibili.com/video/BV1pirZBuEzq/
Try the one-click installation: docs.hagicode.com/installation/docker-compose
Quick install for the Desktop app: hagicode.com/desktop/
Public beta has started, and you are welcome to install and try it

Copyright notice

Author: newbe36524
Original link: https://docs.hagicode.com/blog/2026-03-20-hagicode-libs-unified-cli-integration/
Copyright statement: Unless otherwise stated, all articles on this blog are licensed under BY-NC-SA. Please indicate the source when reposting.

Why HagiCode Chose Hermes as Its Integrated Agent Core

Mar 19, 2026

When building an AI-assisted coding platform, choosing the right Agent core directly determines the upper limit of the system’s capabilities. Some things simply cannot be forced; pick the wrong framework, and no amount of effort will make it feel right. This article shares the thinking behind HagiCode’s technical selection and our hands-on experience integrating Hermes Agent.

Background

When building an AI-assisted coding product, one of the hardest parts is choosing the underlying Agent framework. There are actually quite a few options on the market, but some are too limited in functionality, some are overly complex to deploy, and others simply do not scale well enough. What we needed was a solution that could run on a $5 VPS while also being able to connect to a GPU cluster. That requirement may not sound extreme, but it is enough to scare plenty of teams away.

In practice, many so-called “all-in-one Agents” either only run in the cloud or require absurdly high local deployment costs. After spending two weeks researching different approaches, we made a bold decision: rebuild the entire Agent core around Hermes as the underlying engine for our integrated Agent.

Everything that followed may simply have been fate.

About HagiCode

The approach shared in this article comes from real-world experience in the HagiCode project. HagiCode is an AI-assisted coding platform that provides developers with an intelligent coding assistant through a VSCode extension, a desktop client, and web services. You may have used similar tools before and felt they were just missing that final touch; we understand that feeling well.

GitHub: github.com/HagiCode-org/site
Official website: hagicode.com

Why HagiCode Needs Hermes

Before diving into Hermes itself, it helps to explain why HagiCode needed something like it in the first place. Things rarely work exactly the way you want, so you need a practical reason to commit to a technical direction.

As an AI coding assistant, HagiCode needs to support several usage scenarios at the same time:

Local development environments: developers want to run it on their own machines so data never leaves the local environment. These days, data security is never a trivial concern.
Team collaboration environments: small teams should be able to share an Agent deployment running on a server. Saving money matters, and everyone has limits.
Elastic cloud expansion: when handling complex tasks, the system should automatically scale out to a GPU cluster. It is always better to be prepared.

This “we want everything at once” requirement is what led us to Hermes. Whether it was the perfect choice, I cannot say for sure, but at the time we did not see a better option.

What Is Hermes Agent

Hermes Agent is an autonomous AI Agent created by Nous Research. Some readers may not be familiar with Nous Research; they are the lab behind open-source large models such as Hermes, Nomos, and Psyché. They have built many excellent things, even if they are still more underappreciated than they deserve.

Unlike traditional IDE coding assistants or simple API chat wrappers, Hermes has a defining trait: the longer it runs, the more capable it becomes. It is not designed to complete a task once and stop; it keeps learning and accumulating experience over long-running operation. In that sense, it feels a little like a person.

Core Features

Several of Hermes’s core capabilities happen to align very closely with HagiCode’s needs.

Flexible deployment: the same Agent can run in very different environments, so HagiCode can choose the most suitable deployment model based on each user’s scenario. Individuals run it locally, teams deploy it on servers, and complex tasks use GPU resources. One codebase handles all of it. In a world this busy, saving one layer of complexity is already a win.

Multi-platform messaging gateway: Hermes natively supports Telegram, Discord, Slack, WhatsApp, and more. For HagiCode, this means we can support AI assistants on those channels much more easily in the future. More paths forward are always welcome.

Rich tool system: Hermes comes with 40+ built-in tools and supports MCP (Model Context Protocol) extensions. This is essential for a coding assistant: executing shell commands, working with the file system, and calling Git all depend on tool support. An Agent without tools is like a bird without wings.

Cross-session memory: Hermes includes a persistent memory system and uses FTS5 full-text search to recall historical conversations. That allows the Agent to remember prior context instead of “losing its memory” every time. Sometimes people wish they could forget things that easily, but reality is usually less generous.

How HagiCode Integrates Hermes

Now that the “why” is clear, let us look at the “how.” Once something makes sense in theory, the next step is to build it.

Provider Layer Abstraction

In HagiCode’s architecture, all AI Providers implement a unified IAIProvider interface:

public sealed class HermesCliProvider : IAIProvider, IVersionedAIProvider
{
    public ProviderCapabilities Capabilities { get; } = new ProviderCapabilities
    {
        SupportsStreaming = true,   // Supports streaming output
        SupportsTools = true,       // Supports tool invocation
        SupportsSystemMessages = true, // Supports system prompts
        SupportsArtifacts = false
    };
}

This abstraction layer allows HagiCode to switch seamlessly between different AI Providers. Whether the backend is OpenAI, Claude, or Hermes, the upper-layer calling pattern stays exactly the same. In plain terms, it keeps things simple.

ACP Communication Protocol

Hermes communicates through ACP (Agent Client Protocol), a JSON-RPC-based protocol designed for communication between a client application and an Agent. Its main methods include:

Method	Description
`initialize`	Initialize the connection, negotiate the protocol version, and exchange capabilities and supported authentication methods
`authenticate`	Handle authentication and support multiple authentication methods
`session/new`	Create a new session and configure the working directory and MCP servers
`session/prompt`	Send a prompt and receive a response

HagiCode implements the ACP transport layer through StdioAcpTransport, launching a Hermes subprocess and communicating with it over standard input and output. It may sound complicated, but in practice it is manageable as long as you have enough patience.
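The article does not show the wire format, so the following is only a rough sketch under one assumption: requests are JSON-RPC 2.0 objects written one per line to the subprocess's standard input. AcpFramingSketch, its methods, and the cwd parameter used in the note below are hypothetical and are not HagiCode's StdioAcpTransport.

```csharp
using System;
using System.Text.Json;

public static class AcpFramingSketch
{
    // Serializes one JSON-RPC 2.0 request as a single line of JSON,
    // ready to be written to the subprocess's stdin followed by a newline.
    public static string BuildRequestLine(int id, string method, object parameters)
    {
        var envelope = new { jsonrpc = "2.0", id, method, @params = parameters };
        return JsonSerializer.Serialize(envelope);
    }

    // Returns true when a received line is a response to the given request id:
    // it carries a numeric "id" and, unlike requests and notifications, no "method".
    public static bool IsResponseTo(string line, int id)
    {
        using var document = JsonDocument.Parse(line);
        var root = document.RootElement;
        return !root.TryGetProperty("method", out _)
            && root.TryGetProperty("id", out var idElement)
            && idElement.ValueKind == JsonValueKind.Number
            && idElement.GetInt32() == id;
    }
}
```

For example, BuildRequestLine(1, "session/new", new { cwd = "/work/project" }) produces one JSON object whose method is session/new, and IsResponseTo lets a read loop tell the matching response apart from session/update notifications that arrive in between.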

Configuration Management

Configuration is managed through the HermesPlatformConfiguration class:

public sealed class HermesPlatformConfiguration : IAcpPlatformConfiguration
{
    public string ExecutablePath { get; set; } = "hermes";
    public string Arguments { get; set; } = "acp";
    public int StartupTimeoutMs { get; set; } = 5000;
    public string ClientName { get; set; } = "HagiCode";
    public HermesAuthenticationConfiguration Authentication { get; set; }
    public HermesSessionDefaultsConfiguration SessionDefaults { get; set; }
}

Configure Hermes in appsettings.json:

{
  "Providers": {
    "HermesCli": {
      "ExecutablePath": "hermes",
      "Arguments": "acp",
      "StartupTimeoutMs": 10000,
      "ClientName": "HagiCode",
      "Authentication": {
        "PreferredMethodId": "api-key",
        "MethodInfo": {
          "api-key": "your-api-key-here"
        }
      },
      "SessionDefaults": {
        "Model": "claude-sonnet-4-20250514",
        "ModeId": "default"
      }
    }
  }
}

Configuration often looks simple on paper, but getting every detail right still takes real effort.

Orleans Distributed Architecture

HagiCode uses Orleans to build its distributed system, and the Hermes integration is implemented through the following components:

HermesGrain: An Orleans Grain implementation that handles session execution
HermesPlatformConfiguration: Platform-specific configuration
HermesAcpSessionAdapter: ACP session adapter
HermesConsole: A dedicated validation console

The name Orleans does have a certain charm to it. Even if this Orleans has nothing to do with the legendary city, a good name never hurts.

End-to-End Execution Flow

The following is the core execution logic of the Hermes Provider:

private async IAsyncEnumerable<AIStreamingChunk> StreamCoreAsync(
    AIRequest request,
    string? embeddedCommandPrompt,
    [EnumeratorCancellation] CancellationToken cancellationToken)
{
    // 1. Create transport layer and launch Hermes subprocess
    await using var transport = new StdioAcpTransport(
        platformConfiguration.GetExecutablePath(),
        platformConfiguration.GetArguments(),
        platformConfiguration.GetEnvironmentVariables(),
        platformConfiguration.GetStartupTimeout(),
        _loggerFactory.CreateLogger<StdioAcpTransport>());
    await transport.ConnectAsync(cancellationToken);

    // 2. Initialize and obtain protocol version and authentication methods
    var initializeResult = await SendHermesRequestAsync(
        transport, nextRequestId++, "initialize",
        BuildInitializeParameters(platformConfiguration), cancellationToken);

    // 3. Handle authentication
    var authMethods = ParseAuthMethods(initializeResult);
    if (!isAuthenticated)
    {
        var methodId = platformConfiguration.Authentication.ResolveMethodId(authMethods);
        await SendHermesRequestAsync(transport, nextRequestId++, "authenticate", ...);
    }

    // 4. Create session
    var newSessionResult = await SendHermesRequestAsync(
        transport, nextRequestId++, "session/new",
        BuildNewSessionParameters(platformConfiguration, workingDirectory, model), cancellationToken);
    var sessionId = ParseSessionId(newSessionResult);

    // 5. Execute prompt and collect streaming responses
    await foreach (var payload in transport.ReceiveMessagesAsync(cancellationToken))
    {
        // Handle session/update notifications and convert them into streaming chunks
        if (TryParseSessionNotification(root, out var notification))
        {
            if (_responseMapper.TryConvertToStreamingChunk(notification, out var chunk))
            {
                yield return chunk;
            }
        }
    }
}

With code, the details eventually become familiar. What matters most is the overall approach.

Health Checks

To ensure Hermes remains available, HagiCode implements a health check mechanism:

public async Task<ProviderTestResult> PingAsync(CancellationToken cancellationToken = default)
{
    var stopwatch = Stopwatch.StartNew(); // System.Diagnostics; measures the round trip
    var response = await ExecuteAsync(
        new AIRequest
        {
            Prompt = "Reply with exactly PONG.",
            SessionId = null,
            AllowedTools = Array.Empty<string>(),
            WorkingDirectory = ResolveWorkingDirectory(null)
        },
        cancellationToken);

    var success = string.Equals(response.Content.Trim(), "PONG", StringComparison.OrdinalIgnoreCase);
    return new ProviderTestResult
    {
        ProviderName = Name,
        Success = success,
        ResponseTimeMs = stopwatch.ElapsedMilliseconds,
        ErrorMessage = success ? null : $"Unexpected Hermes ping response: '{response.Content}'."
    };
}

That is roughly what a “health check” looks like here. In some ways, people are not so different: it helps to check in from time to time, even if no one tells us exactly what to look for.

Practical Considerations

There are a few pitfalls worth understanding before integrating Hermes. Everyone steps into a few traps sooner or later.

Authentication Method Configuration

Hermes supports multiple authentication methods, including API keys and tokens, so you need to choose based on the actual deployment scenario. Misconfiguration can cause connection failures, and the resulting error messages are not always intuitive. Sometimes the reported error is far away from the real root cause, which means slow and careful debugging is unavoidable.

MCP Server Configuration

When creating a session, you can configure a list of MCP servers so Hermes can call external tools. But keep the following points in mind:

MCP server addresses must be reachable
Timeouts must be configured reasonably
The system needs degradation handling when a server is unavailable

In practice, defensive thinking matters more than people expect.
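One way to approach the degradation point is to probe each configured server with a timeout before the session starts and carry on with whatever answered. This is only a sketch: McpAvailabilitySketch and FilterReachableAsync are hypothetical, and the probe delegates stand in for whatever reachability check fits your MCP transport.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class McpAvailabilitySketch
{
    // Runs each probe with its own timeout and returns only the servers that answered in time.
    // Probes are expected to honor the cancellation token; one that ignores it can still hang.
    public static async Task<List<string>> FilterReachableAsync(
        IReadOnlyDictionary<string, Func<CancellationToken, Task>> probes,
        TimeSpan timeout)
    {
        var reachable = new List<string>();
        foreach (var (name, probe) in probes)
        {
            using var cts = new CancellationTokenSource(timeout);
            try
            {
                await probe(cts.Token);
                reachable.Add(name);
            }
            catch (Exception)
            {
                // Unreachable or too slow: leave this server out and let the session start without it.
            }
        }
        return reachable;
    }
}
```

In a real system you would probably log which servers were dropped, so a missing tool does not look like a model problem later.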

Working Directory Management

Each session must specify a working directory so Hermes can access project files correctly. In multi-project scenarios, the working directory needs to switch dynamically. It sounds straightforward, but there are more edge cases than you might think.

Response Aggregation

Hermes responses may be split across session/update notifications and the final result, so they must be merged correctly. Otherwise, content may be lost.
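A possible shape for that merge logic is sketched below. How Hermes actually divides content between session/update notifications and the final result is not specified here, so the rules in Complete are assumptions chosen to avoid both duplication and loss; ResponseAggregatorSketch is a hypothetical class.

```csharp
using System;
using System.Text;

public sealed class ResponseAggregatorSketch
{
    private readonly StringBuilder _streamed = new StringBuilder();

    // Called for each text fragment carried by a session/update notification.
    public void AppendUpdate(string fragment) => _streamed.Append(fragment);

    // Merges streamed text with the text of the final result.
    public string Complete(string? finalText)
    {
        var streamed = _streamed.ToString();

        if (string.IsNullOrEmpty(finalText))
        {
            return streamed; // everything arrived through updates
        }
        if (streamed.Length == 0)
        {
            return finalText; // nothing was streamed
        }
        if (finalText.StartsWith(streamed, StringComparison.Ordinal))
        {
            return finalText; // the final result repeats the stream and may extend it
        }
        if (streamed.EndsWith(finalText, StringComparison.Ordinal))
        {
            return streamed; // the final result repeats only the tail of the stream
        }
        return streamed + finalText; // the final result carries only the remainder
    }
}
```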

Error Handling Strategy

Runtime errors should be returned explicitly instead of silently falling back to another Provider. That way, users know the issue came from Hermes rather than wondering why the system suddenly switched models behind the scenes.

Conclusion

HagiCode’s decision to use Hermes as its integrated Agent core was not a casual impulse. It was a careful choice based on practical requirements and the technical characteristics of the framework. Whether it proves to be the perfect long-term answer is still too early to say, but so far it has been serving us well.

Hermes gives HagiCode the flexibility to adapt to a wide range of scenarios. Its powerful tool system and MCP support allow the AI assistant to do real work, while the ACP protocol and Provider abstraction layer keep the integration process clear and controllable.

If you are choosing an Agent framework for your own AI project, I hope this article offers a useful reference. Picking the right underlying architecture can make everything that follows much easier.

References

Official Hermes documentation

Copyright Notice

Thank you for reading. If you found this article useful, you are welcome to support it with a like, bookmark, or share. This content was created with AI-assisted collaboration, and the final content was reviewed and confirmed by the author.

Author: newbe36524
Original article: https://docs.hagicode.com/blog/2026-03-19-hagicode-hermes-agent-core/
Copyright notice: Unless otherwise stated, all articles on this blog are licensed under BY-NC-SA. Please indicate the source when reprinting.

.NET Code Protection in Practice: From Obfuscation to Virtual Machine Protection

Mar 18, 2026

This article explains how to implement a multi-layered code protection strategy in .NET projects, covering the full path from basic obfuscation to professional virtual machine protection.

Background

In .NET application development, protecting core code such as license validation, business logic, and sensitive configuration from decompilation and reverse engineering is, frankly, a topic you cannot avoid. As the .NET ecosystem has matured, developers have gained access to a range of protection options, from built-in obfuscation attributes to professional virtualization-based protection tools.

As a complex multilingual monorepo project, HagiCode includes desktop applications, build systems, and license management capabilities. The code inevitably contains license validation logic, sensitive configuration such as API keys and product IDs, and business-critical logic. Those parts need serious protection, because no one wants their hard work to be exposed so easily.

This article shares the code protection approach we actually adopted in the HagiCode project and summarizes the full journey from early pitfalls to later optimization. Hopefully it gives you some useful ideas.

About HagiCode

HagiCode is an open source AI coding assistant project dedicated to providing developers with an intelligent programming experience. The project uses a monorepo architecture and simultaneously maintains a VSCode extension, backend AI services, a cross-platform desktop client, and more. That multi-language, multi-platform complexity makes code protection an engineering challenge we have to face head-on.

The approach shared in this article is the result of real trial and error during HagiCode development. If you want to see how we solved these technical problems, keep reading. You may find a few unexpected takeaways.

Core Content

1. Microsoft’s Built-in Obfuscation Attribute

.NET provides a built-in [ObfuscationAttribute], which is the most basic and commonly used code obfuscation marker. This attribute lives in the System.Reflection namespace and lets you declare obfuscation intent in code without taking a dependency on a third-party library.

Core features:

Feature property: A string naming the obfuscation feature to apply. Valid values are defined by the obfuscation tool; the default is "all" (full obfuscation), and "ultra" (high obfuscation) in the examples here is a tool-specific value
Exclude property: true (the default) means exclude from obfuscation, and false means apply obfuscation
Can be applied to classes, methods, properties, and other type members

In the HagiCode project, you can see it used like this:

[Obfuscation(Feature = "ultra", Exclude = false)]
public async Task<LicenseValidationResult?> ValidateLicenseAsync(...)

The advantages of this approach are fairly obvious:

No extra dependencies; the attribute ships with .NET out of the box
Can be recognized and processed by third-party obfuscation tools
Does not significantly increase the size of the compiled assembly

That said, it also has limitations. It is only a marker, and the actual obfuscation result depends on the tool implementation. It cannot provide virtual machine protection-level security.
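Because the attribute is only metadata, a tool has to read it back before anything happens. The sketch below shows that reading step with plain reflection; ObfuscationMarkerDemo and ObfuscationMarkerReader are hypothetical names for illustration, and a real obfuscator would inspect the compiled assembly rather than loaded types.

```csharp
using System;
using System.Reflection;

public class ObfuscationMarkerDemo
{
    [Obfuscation(Feature = "ultra", Exclude = false)]
    public void Sensitive() { }

    public void Plain() { }
}

public static class ObfuscationMarkerReader
{
    // Returns the Feature string requested for a public method,
    // or null when the method is unmarked or explicitly excluded.
    public static string? RequestedFeature(Type type, string methodName)
    {
        var method = type.GetMethod(methodName);
        if (method == null)
        {
            return null;
        }

        // ObfuscationAttribute allows multiple instances, so enumerate instead of asking for one.
        foreach (var marker in method.GetCustomAttributes<ObfuscationAttribute>())
        {
            if (!marker.Exclude)
            {
                return marker.Feature;
            }
        }
        return null;
    }
}
```

Calling RequestedFeature(typeof(ObfuscationMarkerDemo), "Sensitive") returns the string the author wrote in the attribute; what that string means is entirely up to the tool.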

2. VMP (Virtual Machine Protection)

VMP is a professional code protection tool that provides high-level protection by compiling code into virtual machine instructions. Unlike simple name obfuscation, VMP actually transforms code logic into a form that conventional decompilers cannot reconstruct.

Protection level classification:

Level	Virtualization	Mutation	Anti-debugging	String Encryption	Use Cases
HIGH	full	high	enabled	enabled	License validation, session concurrency, sensitive constants
MEDIUM	partial	medium	enabled	enabled	Business logic, domain models
LOW	none	low	disabled	disabled	Utility classes, non-critical code

The HagiCode project defines a declarative attribute system for marking code that needs protection:

// High-priority protection
[VmProtect(VmProtectionPriority.High, Reason = "Contains license verification logic")]
public class KeygenClient { ... }

// Exclude from protection
[VmExclude(Reason = "Public API that must remain unchanged")]
public class PublicApi { ... }

// Inherited protection
[VmProtect(VmProtectionPriority.High, ProtectDerived = true)]
public class BaseLicenseValidator { ... }

3. Build-Time Protection Strategy

VMP protection does not only matter at runtime. It also needs to be automated as part of the build pipeline, because doing it manually would be far too tedious. HagiCode’s build system supports several modes:

Native Windows mode: Invoke the VMProtect tool directly
Linux Docker container mode: Run VMP inside a container to solve cross-platform compatibility issues
Attribute scanning: Automatically discover protection markers in code
Validation mechanism: Confirm that protection has been applied successfully

Taken together, these capabilities make the process much easier to manage.
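The attribute scanning step can be sketched with reflection. Everything here is hypothetical and simplified: SketchProtectAttribute and SketchPriority stand in for the project's own marker types, and a real build step would more likely read metadata from the compiled assembly without loading it into the build process.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public enum SketchPriority { Low = 1, Medium = 2, High = 3 }

[AttributeUsage(AttributeTargets.Class | AttributeTargets.Method)]
public sealed class SketchProtectAttribute : Attribute
{
    public SketchProtectAttribute(SketchPriority priority) => Priority = priority;
    public SketchPriority Priority { get; }
}

[SketchProtect(SketchPriority.High)]
public class SketchLicenseClient
{
    [SketchProtect(SketchPriority.Medium)]
    public void Refresh() { }

    public void Unmarked() { }
}

public static class ProtectionScanner
{
    // Maps "Type" and "Type.Method" names to the priority declared on them.
    public static Dictionary<string, SketchPriority> Scan(Assembly assembly)
    {
        const BindingFlags flags = BindingFlags.Public | BindingFlags.NonPublic |
                                   BindingFlags.Instance | BindingFlags.Static |
                                   BindingFlags.DeclaredOnly;
        var targets = new Dictionary<string, SketchPriority>();

        foreach (var type in assembly.GetTypes())
        {
            var typeMarker = type.GetCustomAttribute<SketchProtectAttribute>(inherit: false);
            if (typeMarker != null)
            {
                targets[type.FullName!] = typeMarker.Priority;
            }

            foreach (var method in type.GetMethods(flags))
            {
                var methodMarker = method.GetCustomAttribute<SketchProtectAttribute>(inherit: false);
                if (methodMarker != null)
                {
                    targets[$"{type.FullName}.{method.Name}"] = methodMarker.Priority;
                }
            }
        }
        return targets;
    }
}
```

The resulting map is the kind of input a build step could translate into a protection tool's project file, and later compare against the verification result.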

Solution

1. Using Microsoft’s Built-in Obfuscation Attribute

Apply ObfuscationAttribute directly in code:

using System.Reflection;

[Obfuscation(Feature = "ultra", Exclude = false)]
public class LicenseService
{
    [Obfuscation(Feature = "ultra", Exclude = false)]
    public async Task<bool> ValidateLicenseAsync(string key)
    {
        // License validation logic
    }

    [Obfuscation(Feature = "flow", Exclude = false)]
    private string DecryptToken(string encrypted)
    {
        // Decryption logic
    }
}

Sometimes you need to let test assemblies access internal members while still keeping production code secure:

[assembly: InternalsVisibleTo("HagiCode.Application.Tests")]
[assembly: InternalsVisibleTo("DynamicProxyGenAssembly2")] // for Moq

This makes testing much more convenient, because the code still needs to be tested properly.

2. Custom Attribute Definitions for VMP Protection

Create custom protection attributes to control VMP behavior:

using System;

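[AttributeUsage(AttributeTargets.Class | AttributeTargets.Method | AttributeTargets.Property)]
public class VmProtectAttribute : Attribute
{
    // Positional constructor, required for usages such as
    // [VmProtect(VmProtectionPriority.High, Reason = "...")]
    public VmProtectAttribute(VmProtectionPriority priority)
    {
        Priority = priority;
    }

    public VmProtectionPriority Priority { get; set; }
    public string? Reason { get; set; }
    public bool ProtectDerived { get; set; }
}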

[AttributeUsage(AttributeTargets.Class | AttributeTargets.Method | AttributeTargets.Property)]
public class VmExcludeAttribute : Attribute
{
    public string? Reason { get; set; }
}

public enum VmProtectionPriority
{
    None = 0,
    Low = 1,
    Medium = 2,
    High = 3
}

Custom attributes are often easier to work with because they reflect your own protection requirements directly.

3. VMP Configuration File

protection:
  priority_mode: "attribute"  # Attribute-based priority
  default_level: "medium"

tools:
  - name: "vmprotect"
    path: "C:\\Program Files\\VMProtect Ultimate\\VMProtect.exe"

protection_levels:
  high:
    virtualization: "full"
    mutation: "high"
    anti_debug: true
    anti_dump: true
    encrypt_strings: true
    encrypt_resources: true

  medium:
    virtualization: "partial"
    mutation: "medium"
    anti_debug: true
    encrypt_strings: true

  low:
    virtualization: "none"
    mutation: "low"
    anti_debug: false

A clearer configuration makes the system much easier to maintain later.

Practical Guide

1. Protection Practices for Critical Components

According to HagiCode’s code-protection specification, the following components must use HIGH-priority protection:

// Production constants - must be encrypted and protected by VMP
[VmProtect(VmProtectionPriority.High, Reason = "Production constants")]
public static class ProductionConstants
{
    // Encrypted string accessor, protected by VMP
    [VmProtect(VmProtectionPriority.High)]
    public static string GetLicenseServerUrl(IOptions<LicenseOptions> options) => ...;
}

// License validation logic
[VmProtect(VmProtectionPriority.High, Reason = "License verification logic")]
public class KeygenClient : IKeygenClient
{
    [Obfuscation(Feature = "ultra", Exclude = false)]
    public async Task<LicenseValidationResult?> ValidateLicenseAsync(...) { ... }
}

// Machine fingerprint service
[VmProtect(VmProtectionPriority.High)]
public class MachineFingerprintService : IMachineFingerprintService { ... }

Critical code deserves stronger protection, because exposing core logic would cause real problems.

2. String Encryption and Runtime Decryption

Encrypt strings at build time and decrypt them at runtime:

public static class StringDecryption
{
    [VmProtect(VmProtectionPriority.High, Reason = "CRITICAL SECURITY")]
    public static string DecryptString(byte[] encryptedData, byte[] key, byte[] iv)
    {
        using var aes = Aes.Create();
        aes.Key = key;
        aes.IV = iv;

        using var decryptor = aes.CreateDecryptor();
        using var ms = new MemoryStream(encryptedData);
        using var cs = new CryptoStream(ms, decryptor, CryptoStreamMode.Read);
        using var reader = new StreamReader(cs);

        return reader.ReadToEnd();
    }
}

// Production constant accessor (lazy loading + caching)
public static class ProductionConstants
{
    private static string? _cachedLicenseServerUrl;

    public static string GetLicenseServerUrl(IOptions<LicenseOptions> options)
    {
        if (_cachedLicenseServerUrl == null)
        {
            var encrypted = GetEncryptedLicenseServerUrl();
#if DEBUG
            _cachedLicenseServerUrl = options.Value.PrimaryServer.Url;
#else
            _cachedLicenseServerUrl = StringDecryption.DecryptString(
                encrypted,
                GetEncryptionKey(),
                GetEncryptionIV());
#endif
        }
        return _cachedLicenseServerUrl;
    }
}

This step matters because sensitive information should never be left in plaintext.
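The decryption code above needs a build-time counterpart that produces the encrypted bytes. The article does not show it, so the following is a minimal sketch under the assumption that the same AES settings (the .NET defaults, CBC with PKCS7 padding) are used on both sides. StringEncryptionSketch is a hypothetical class, and its DecryptString simply mirrors the method shown earlier so that the round trip can be checked.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

public static class StringEncryptionSketch
{
    // Build-time side: encrypts UTF-8 text with AES using the given key and IV.
    public static byte[] EncryptString(string plaintext, byte[] key, byte[] iv)
    {
        using var aes = Aes.Create();
        aes.Key = key;
        aes.IV = iv;

        using var encryptor = aes.CreateEncryptor();
        using var ms = new MemoryStream();
        using (var cs = new CryptoStream(ms, encryptor, CryptoStreamMode.Write))
        {
            var bytes = Encoding.UTF8.GetBytes(plaintext);
            cs.Write(bytes, 0, bytes.Length);
        } // disposing the CryptoStream writes the final padded block

        return ms.ToArray();
    }

    // Runtime side: mirrors the DecryptString method from the article.
    public static string DecryptString(byte[] encryptedData, byte[] key, byte[] iv)
    {
        using var aes = Aes.Create();
        aes.Key = key;
        aes.IV = iv;

        using var decryptor = aes.CreateDecryptor();
        using var ms = new MemoryStream(encryptedData);
        using var cs = new CryptoStream(ms, decryptor, CryptoStreamMode.Read);
        using var reader = new StreamReader(cs);
        return reader.ReadToEnd();
    }
}
```

Keep in mind that the key ships inside the application, so this raises the cost of extracting the string rather than making it impossible, which is why the key accessors themselves need protection too.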

3. VMP Protection Verification

After the build, you must verify whether protection was applied successfully; otherwise, you cannot be sure it is actually working:

// Example verification script
public bool VerifyProtection(string assemblyPath)
{
    // 1. Check for the VMP signature as a contiguous byte sequence
    //    (matching individual bytes would succeed on almost any file)
    var bytes = File.ReadAllBytes(assemblyPath);
    var vmpSignature = Encoding.ASCII.GetBytes("VMProtect");
    if (bytes.AsSpan().IndexOf(vmpSignature) >= 0)
    {
        return true;
    }

    // 2. Check for file size changes (the protected file is usually larger)
    var originalInfo = new FileInfo(Path.ChangeExtension(assemblyPath, ".bak"));
    if (originalInfo.Exists)
    {
        var sizeRatio = (double)new FileInfo(assemblyPath).Length / originalInfo.Length;
        return sizeRatio > 1.1;
    }

    return false;
}

Verification is always worth doing, because otherwise problems can slip through unnoticed.

4. Notes and Caveats

There are several pitfalls here that deserve special attention:

Do not obfuscate all code: Public APIs, interface definitions, and DTO classes usually do not need protection. Excessive obfuscation can hurt performance and debugging. The HagiCode project learned this the hard way.
Protect key accessors: Methods that retrieve encryption keys must receive the same or a higher protection level than the encrypted data itself; otherwise the whole setup loses its value.
Balance testing and production: DEBUG builds should skip encryption to make development and debugging easier, while RELEASE builds should enable full protection. Remember to separate them with conditional compilation such as #if DEBUG.
Consider the Docker environment: Running VMP on Linux requires a containerized approach to ensure tool compatibility. HagiCode uses a Wine + VMP container solution to solve the cross-platform problem.
Verification is mandatory: After the build finishes, you must verify that protection was applied successfully. Otherwise sensitive code may still be exposed, and the verification code shown earlier exists for exactly this purpose.

Conclusion

With this multi-layer protection strategy, HagiCode built a comprehensive code security system that spans from baseline obfuscation to virtual machine protection:

Layer 1: Use ObfuscationAttribute for baseline marking and provide hints to third-party tools
Layer 2: Use custom VmProtectAttribute declarations to express protection intent and priority
Layer 3: Use VMP virtual machine protection to transform critical code into virtual machine instructions that conventional decompilers cannot turn back into readable code
Layer 4: Automatically scan and apply protection during the build, then verify the result

This approach can resist ordinary decompilation tools while also standing up better against advanced reverse engineering attacks. If you are building a .NET application that needs code protection, I hope this gives you at least a useful reference point.

Building an AI Adventure Party: A Practical Guide to Multi-Agent Collaboration Configuration in HagiCode

Mar 17, 2026

In modern software development, a single AI Agent is no longer enough for complex needs. How can multiple AI assistants from different companies collaborate within the same project? This article shares the multi-Agent collaboration configuration approach that the HagiCode project developed through real-world practice.

Background

Many developers have likely had this experience: bringing an AI assistant into a project really does improve coding efficiency. But as requirements grow more complex, one AI Agent starts to fall short. You want it to handle code review, documentation generation, unit tests, and more at the same time, but the result is often that it cannot balance everything well, and output quality becomes inconsistent.

What is even more frustrating is that once you try to introduce multiple AI assistants, things get more complicated. Each Agent has its own configuration method, API interface, and execution logic, and they may even conflict with one another. It is like a sports team where every player is individually strong, but nobody knows how to coordinate, so the whole match turns into chaos.

The HagiCode project ran into the same problem during development. It is a complex project involving a frontend VSCode extension, backend AI services, and a cross-platform desktop client, and in the 2026-03 version we needed to integrate multiple AI assistants from different companies at once: Claude Code, Codex, CodeBuddy, iFlow, and more. Figuring out how to let them coexist harmoniously in the same project while making the best use of their individual strengths became a critical problem we had to solve.

That alone would already be enough trouble. After all, who wants to deal with a group of AI tools fighting each other every day?

The approach shared in this article is the multi-Agent collaboration configuration practice we developed in the HagiCode project through real trial and error and repeated optimization. If you are also struggling with multiple AI assistants working together, this article may give you some ideas. Maybe. Every project is different, after all.

About HagiCode

HagiCode is an AI coding assistant project that adopts an “adventure party” model in which multiple AI engines work together. Project repository: github.com/HagiCode-org/site.

The multi-Agent configuration approach shared here is one of the core techniques that allows HagiCode to maintain efficient development in complex projects. There is nothing especially mystical about it - it just turns a group of AIs into an adventure party that can actually coordinate.

HagiCode’s Multi-Agent Architecture Design

From “Going Solo” to “Team Collaboration”

In the early days of the HagiCode project, we also tried using a single AI Agent to handle everything. We quickly discovered a clear bottleneck in that approach: different tasks demand different strengths. Some tasks require stronger contextual understanding, while others need more precise code editing. One Agent has a hard time excelling at all of them.

That made us realize that multiple Agents had to work together. But the problem was this: how do you let AI products from different companies coexist peacefully in the same project? We needed to solve several core issues:

Configuration management complexity: each Agent has different configuration methods, API interfaces, and execution modes
Unified communication protocol: we need a standardized way for different Agents to exchange data
Task coordination and division of labor: how do we assign work reasonably so each Agent can play to its strengths

With those questions in mind, we started designing HagiCode’s multi-Agent architecture. It was not really that complicated in the end; we just had to think it through clearly.

Overall Architecture at a Glance

After multiple iterations, this is the architecture we settled on:

┌──────────────────────────────────────────────────────────────────┐
│                        AIProviderFactory                         │
│  (Factory pattern for unified management of all AI Providers)    │
├──────────────────────────────────────────────────────────────────┤
│  ClaudeCodeCli  │  CodexCli  │  CodebuddyCli   │  IFlowCli       │
│  (Anthropic)    │  (OpenAI)  │  (Zhipu GLM)    │  (Zhipu)        │
└──────────────────────────────────────────────────────────────────┘

The core idea is to let different AI Agents be managed by the same code through a unified Provider interface. At the same time, the factory pattern is used to dynamically create and configure these Providers, ensuring scalability and flexibility across the system.

It is like division of labor in daily life. Everyone has a role; here we simply turned that idea into code architecture.

Agent Types and Division of Responsibilities

Based on HagiCode’s real-world experience, we assigned different responsibilities to each Agent:

Agent	Provider	Model	Primary Use
ClaudeCodeCli	Anthropic	glm-5-turbo	Generate technical solutions and Proposals
CodexCli	OpenAI/Zed	gpt-5.4	Execute precise code changes
CodebuddyCli	Zhipu	glm-4.7	Refine proposal descriptions and documentation
IFlowCli	Zhipu	glm-4.7	Archive proposals and historical records (configuration at the time; now legacy-compatible only)
OpenCodeCli	-	-	General-purpose code editing
GitHubCopilot	Microsoft	-	Assisted programming and code completion

The logic behind this division of labor is simple: every Agent has its own area of strength. Claude Code performs well at understanding and analyzing complex requirements, so it handles early solution design. Codex is more precise when modifying code, so it is better suited for concrete implementation work. CodeBuddy offers good value for its cost, which makes it a great fit for refining documentation.

After all, the right tool for the right job is usually the best choice. There are many roads to Rome; some are simply easier to walk than others.

Core Configuration Mechanisms

Unified Provider Interface Design

To manage different AI Agents in a unified way, we first need to define a common interface. In HagiCode, that interface looks like this:

public interface IAIProvider
{
    // Unified Provider interface
    Task<IAIProvider?> GetProviderAsync(AIProviderType providerType);
    Task<IAIProvider?> GetProviderAsync(string providerName, CancellationToken cancellationToken);
}

The interface looks simple, but it is the foundation of the entire multi-Agent system. With a unified interface, we can call AI products from different companies in exactly the same way, no matter what is underneath.

This is really just a matter of making complex things simple. Simple is beautiful, after all.

Provider Factory Pattern Implementation

Once the interface is unified, the next question is how to create these Provider instances. HagiCode uses the factory pattern:

private IAIProvider? CreateProvider(AIProviderType providerType, ProviderConfiguration config)
{
    return providerType switch
    {
        AIProviderType.ClaudeCodeCli =>
            ActivatorUtilities.CreateInstance<ClaudeCodeCliProvider>(_serviceProvider, Options.Create(config)),
        AIProviderType.CodebuddyCli =>
            ActivatorUtilities.CreateInstance<CodebuddyCliProvider>(_serviceProvider, Options.Create(config)),
        AIProviderType.CodexCli =>
            ActivatorUtilities.CreateInstance<CodexCliProvider>(_serviceProvider, Options.Create(config)),
        AIProviderType.IFlowCli =>
            ActivatorUtilities.CreateInstance<IFlowCliProvider>(_serviceProvider, Options.Create(config)),
        _ => null
    };
}

This uses dependency injection through ActivatorUtilities.CreateInstance, which creates Provider instances at runtime and automatically injects their constructor dependencies. The benefit of this design is that adding a new Agent type only requires a new Provider class and one more arm in the factory’s switch expression. Existing Providers and the code that consumes them do not need to change.
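
The same extension point can be sketched without the C# dependency-injection container. The TypeScript below (the language this blog already uses for its frontend snippets) is only an illustration under stated assumptions: ProviderRegistry, register, create, and the simplified AgentProvider shape are hypothetical names, not HagiCode's actual API. It replaces the switch with a registry so that adding an Agent is a single registration.

```typescript
// Hypothetical names throughout; this is not HagiCode's actual API.
type ProviderType = "ClaudeCodeCli" | "CodebuddyCli" | "CodexCli" | "IFlowCli";

interface ProviderConfig {
  enabled: boolean;
  model: string;
  workingDirectory?: string;
}

interface AgentProvider {
  readonly type: ProviderType;
  readonly model: string;
}

type ProviderCreator = (config: ProviderConfig) => AgentProvider;

class ProviderRegistry {
  private readonly creators = new Map<ProviderType, ProviderCreator>();

  // Adding a new Agent means one register() call; existing entries are untouched.
  register(type: ProviderType, creator: ProviderCreator): void {
    this.creators.set(type, creator);
  }

  // Mirrors the `_ => null` arm of the C# switch: unregistered types yield null.
  create(type: ProviderType, config: ProviderConfig): AgentProvider | null {
    const creator = this.creators.get(type);
    return creator ? creator(config) : null;
  }
}

const providerRegistry = new ProviderRegistry();
providerRegistry.register("ClaudeCodeCli", (c) => ({ type: "ClaudeCodeCli", model: c.model }));
providerRegistry.register("CodexCli", (c) => ({ type: "CodexCli", model: c.model }));

const codexProvider = providerRegistry.create("CodexCli", { enabled: true, model: "gpt-5.4" });
const unregistered = providerRegistry.create("IFlowCli", { enabled: true, model: "glm-4.7" });
```

Returning null for an unregistered type keeps the caller in charge of what to do next, for example logging the problem and moving on to another Agent.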

That is reason enough. Who wants to rewrite a pile of old code every time a new feature is added?

Dynamic Configuration Resolution

To make configuration more flexible, we also implemented a type-mapping mechanism:

public static class AIProviderTypeExtensions
{
    private static readonly Dictionary<string, AIProviderType> _typeMap = new(
        StringComparer.OrdinalIgnoreCase)
    {
        ["ClaudeCodeCli"] = AIProviderType.ClaudeCodeCli,
        ["CodebuddyCli"] = AIProviderType.CodebuddyCli,
        ["CodexCli"] = AIProviderType.CodexCli,
        ["IFlowCli"] = AIProviderType.IFlowCli,
        // ...more type mappings
    };
}

The purpose of this mapping table is to convert string-form Provider names into enum types. This allows configuration files to use intuitive string names, while the internal code uses type-safe enums for processing.
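
The C# snippet shows only the dictionary, so here is a rough TypeScript sketch of how such a lookup might be used. The function name resolveProviderType is hypothetical, and lower-casing the keys is only an approximation of StringComparer.OrdinalIgnoreCase.

```typescript
// Hypothetical helper; the original snippet only shows the dictionary itself.
const PROVIDER_TYPES = ["ClaudeCodeCli", "CodebuddyCli", "CodexCli", "IFlowCli"] as const;
type ProviderType = (typeof PROVIDER_TYPES)[number];

// Keys are lower-cased once so lookups ignore case, similar in spirit to
// StringComparer.OrdinalIgnoreCase in the C# dictionary.
const typeMap = new Map<string, ProviderType>(
  PROVIDER_TYPES.map((t): [string, ProviderType] => [t.toLowerCase(), t]),
);

// Returns the canonical type for a configured name, or undefined if unknown.
function resolveProviderType(configuredName: string): ProviderType | undefined {
  return typeMap.get(configuredName.toLowerCase());
}

const fromLowerCase = resolveProviderType("codexcli");
const fromUpperCase = resolveProviderType("CLAUDECODECLI");
const fromUnknown = resolveProviderType("UnknownCli");
```

An unknown name resolves to undefined rather than throwing, so a typo in a configuration file can be reported with a clear message instead of crashing at startup.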

Configuration should be as intuitive as possible. Nobody wants to memorize a pile of obscure code names.

Example Configuration File

In practice, everything can be configured in appsettings.json; the example below uses YAML notation to show the hierarchy:

AI:
  Providers:
    Providers:
      ClaudeCodeCli:
        Enabled: true
        Model: glm-5-turbo
        WorkingDirectory: /path/to/project
      CodebuddyCli:
        Enabled: true
        Model: glm-4.7
      CodexCli:
        Enabled: true
        Model: gpt-5.4
      IFlowCli:
        Enabled: true
        Model: glm-4.7

Each Provider can independently configure parameters such as enablement, model version, and working directory. This design preserves flexibility while remaining easy to manage and maintain.
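
As a small illustration of what per-Provider settings allow, the TypeScript sketch below models the keys from the example above and selects only the enabled Providers. The type names, the helper functions, and the fallback working directory are assumptions made for this sketch; IFlowCli is disabled here purely to show the filter.

```typescript
// Hypothetical shapes modeled on the keys in the example configuration above.
interface ProviderSettings {
  Enabled: boolean;
  Model: string;
  WorkingDirectory?: string;
}

type ProvidersSection = Record<string, ProviderSettings>;

const providersSection: ProvidersSection = {
  ClaudeCodeCli: { Enabled: true, Model: "glm-5-turbo", WorkingDirectory: "/path/to/project" },
  CodebuddyCli: { Enabled: true, Model: "glm-4.7" },
  CodexCli: { Enabled: true, Model: "gpt-5.4" },
  IFlowCli: { Enabled: false, Model: "glm-4.7" }, // disabled only to show the filter
};

// Returns the names of enabled Providers; disabled entries are skipped entirely.
function enabledProviders(section: ProvidersSection): string[] {
  return Object.entries(section)
    .filter(([, settings]) => settings.Enabled)
    .map(([providerName]) => providerName);
}

// A Provider without its own WorkingDirectory uses a fallback
// (the fallback itself is an assumption of this sketch).
function workingDirectoryOf(settings: ProviderSettings, fallback: string): string {
  return settings.WorkingDirectory ?? fallback;
}

const activeProviders = enabledProviders(providersSection);
```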

In some ways, configuration files are like life’s options: you can choose to enable or disable certain things. The only difference is that code choices are easier to regret later.

Adventure Party Task Flow

The Art of Task Division

With the unified technical architecture in place, the next step is making multiple Agents work together. HagiCode designed a task flow mechanism so different Agents can handle different stages of the work:

Proposal creation (user)
    │
    ▼
[Claude Code] ──generate proposal──▶ Proposal document
    │                               │
    │                               ▼
    │                      [Codebuddy] ──refine description──▶ Refined proposal
    │                               │
    │                               ▼
    │                      [Codex] ──execute changes──▶ Code changes
    │                               │
    │                               ▼
    └──────────────────────▶ [iFlow] ──archive──▶ Historical records

The benefit of this division of labor is that each Agent only needs to focus on the tasks it does best, rather than trying to do everything. Claude Code generates proposals from scratch. Codebuddy makes proposal descriptions clearer. Codex turns proposals into actual code changes. iFlow archives and preserves those changes.
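
The flow above can be approximated as a simple sequential pipeline. This TypeScript sketch is a simplification under stated assumptions: the Stage type and runPipeline function are hypothetical, each stage is a plain string function standing in for a real Agent call, and the direct Claude Code to iFlow arrow from the diagram is left out.

```typescript
// Hypothetical types: each stage is a plain function so the sketch runs standalone.
// In a real setup each stage would delegate to the corresponding Agent CLI.
interface Stage {
  agent: string;
  action: string;
  run: (input: string) => string;
}

const partyStages: Stage[] = [
  { agent: "Claude Code", action: "generate proposal", run: (idea) => `proposal(${idea})` },
  { agent: "Codebuddy", action: "refine description", run: (p) => `refined(${p})` },
  { agent: "Codex", action: "execute changes", run: (p) => `changes(${p})` },
  { agent: "iFlow", action: "archive", run: (c) => `archived(${c})` },
];

// Feeds the output of each stage into the next and records who did what.
function runPipeline(idea: string, pipeline: Stage[]): { result: string; log: string[] } {
  const log: string[] = [];
  let current = idea;
  for (const stage of pipeline) {
    current = stage.run(current);
    log.push(`${stage.agent}: ${stage.action}`);
  }
  return { result: current, log };
}

const outcome = runPipeline("add dark mode", partyStages);
```

Keeping the stages in a plain array makes the order of work explicit and easy to change, which is most of what a task flow needs at this level.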

This is really just teamwork, the same as in daily life. Everyone has a role, and only together can something big get done. Here, the team members just happen to be AIs.

Key Practical Takeaways

In actual operation, we summarized the following lessons:

1. Agent selection strategy matters

Tasks should not be assigned casually; they should be matched to each Agent’s strengths:

Proposal generation: use Claude Code, because it has stronger contextual understanding
Code execution: use Codex, because it is more precise for code modification
Proposal refinement: use Codebuddy, because it offers good value for its cost
Archival storage: use iFlow, because it is stable and reliable
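
One way to keep these assignments from drifting is to write them down as a routing table. The TypeScript sketch below is hypothetical (TaskKind, agentForTask, and routeTask are illustration names), but the four mappings are exactly the ones listed above.

```typescript
// Hypothetical routing table built from the four assignments listed above.
type TaskKind = "proposalGeneration" | "codeExecution" | "proposalRefinement" | "archival";

const agentForTask: Record<TaskKind, string> = {
  proposalGeneration: "ClaudeCodeCli",
  codeExecution: "CodexCli",
  proposalRefinement: "CodebuddyCli",
  archival: "IFlowCli",
};

// Because the Record is keyed by TaskKind, the compiler rejects a task kind
// that has no Agent assigned.
function routeTask(kind: TaskKind): string {
  return agentForTask[kind];
}

const chosenAgent = routeTask("codeExecution");
```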

After all, putting the right person on the right task is a timeless principle.

2. Configuration isolation ensures stability

Each Agent’s configuration is managed independently, supports environment-variable overrides, and uses separate working directories. As a result, a configuration error in one Agent does not affect the others.
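
A minimal TypeScript sketch of per-Agent environment overrides follows. The key layout is an assumption: it borrows the double-underscore hierarchy separator that .NET's environment-variable configuration provider uses, but whether HagiCode reads exactly these keys is not confirmed by this article. The function names and the "another-model" value are placeholders.

```typescript
// Hypothetical sketch: env is passed in as a plain object so nothing here
// depends on a particular runtime.
interface ProviderSettings {
  Enabled: boolean;
  Model: string;
}

type EnvMap = Record<string, string | undefined>;

// Builds keys such as AI__Providers__Providers__CodexCli__Model (assumed layout).
function envKey(provider: string, field: string): string {
  return ["AI", "Providers", "Providers", provider, field].join("__");
}

// Returns a new settings object. The file-based settings are never mutated,
// and only keys belonging to this one provider are read.
function applyEnvOverrides(provider: string, base: ProviderSettings, env: EnvMap): ProviderSettings {
  const model = env[envKey(provider, "Model")];
  const enabled = env[envKey(provider, "Enabled")];
  return {
    Enabled: enabled === undefined ? base.Enabled : enabled.toLowerCase() === "true",
    Model: model ?? base.Model,
  };
}

const codexFromFile: ProviderSettings = { Enabled: true, Model: "gpt-5.4" };
const fakeEnv: EnvMap = { AI__Providers__Providers__CodexCli__Model: "another-model" };

const codexEffective = applyEnvOverrides("CodexCli", codexFromFile, fakeEnv);
const iflowEffective = applyEnvOverrides("IFlowCli", { Enabled: true, Model: "glm-4.7" }, fakeEnv);
```

Because each call reads only its own provider's keys, an override meant for CodexCli cannot leak into IFlowCli, which is the isolation property described above.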

This is like personal boundaries in life. Everyone needs their own space; non-interference makes coexistence possible.

3. Error-handling mechanism

A failure in a single Agent should not affect the overall workflow. We implemented a fallback strategy: when one Agent fails, the system can automatically switch to a backup plan or skip that step and continue with later tasks. At the same time, complete logging makes troubleshooting easier afterward.
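
The fallback idea can be sketched as "try the primary Agent, then a backup, otherwise skip the step". In this TypeScript sketch, runWithFallback, AgentCall, and StepResult are hypothetical names, and the Agent calls are stand-in functions that either return text or throw.

```typescript
// Hypothetical sketch of "try the primary Agent, then a backup, else skip".
type StepResult =
  | { status: "ok"; agent: string; output: string }
  | { status: "skipped"; errors: string[] };

type AgentCall = { agent: string; call: () => string };

// Tries each candidate in order. Failures are collected for later
// troubleshooting instead of aborting the whole workflow.
function runWithFallback(candidates: AgentCall[]): StepResult {
  const errors: string[] = [];
  for (const candidate of candidates) {
    try {
      return { status: "ok", agent: candidate.agent, output: candidate.call() };
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      errors.push(`${candidate.agent}: ${message}`);
    }
  }
  return { status: "skipped", errors };
}

// The primary Agent fails, so the backup handles the step.
const refineStep = runWithFallback([
  { agent: "CodebuddyCli", call: () => { throw new Error("CLI not installed"); } },
  { agent: "ClaudeCodeCli", call: () => "refined proposal" },
]);

// No backup is configured, so the step is skipped and the error is kept.
const archiveStep = runWithFallback([
  { agent: "IFlowCli", call: () => { throw new Error("timeout"); } },
]);
```

A skipped step still carries the collected error messages, so later tasks can continue while the log keeps enough detail for troubleshooting.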

Nobody can guarantee that errors will never happen. The key is how you handle them. Life works much the same way.

4. Monitoring and observability

Through the ACP protocol (our custom communication protocol based on JSON-RPC 2.0), we can track the execution status of each Agent. Session isolation ensures concurrency safety, while dynamic caching improves performance.
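
For readers unfamiliar with JSON-RPC 2.0, the TypeScript sketch below shows the two message kinds such a protocol builds on. The envelope fields (jsonrpc, id, method, params) follow the JSON-RPC 2.0 specification; the method names "agent/run" and "agent/status" and the sessionId parameter are hypothetical placeholders, not taken from HagiCode's ACP implementation.

```typescript
// Envelope shapes follow JSON-RPC 2.0. Method names and params are
// hypothetical placeholders for illustration.
interface RpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: unknown;
}

interface RpcNotification {
  jsonrpc: "2.0";
  method: string;
  params?: unknown;
}

type RpcMessage = RpcRequest | RpcNotification;

let nextRequestId = 1;

// Requests carry an id so the response can be matched to them.
function makeRequest(method: string, params?: unknown): RpcRequest {
  return { jsonrpc: "2.0", id: nextRequestId++, method, params };
}

// Notifications have no id and expect no response; useful for status updates.
function makeNotification(method: string, params?: unknown): RpcNotification {
  return { jsonrpc: "2.0", method, params };
}

function isRequest(message: RpcMessage): message is RpcRequest {
  return "id" in message;
}

const runRequest = makeRequest("agent/run", { sessionId: "s-1", text: "refine proposal" });
const statusUpdate = makeNotification("agent/status", { sessionId: "s-1", state: "running" });
const wireFormat = JSON.stringify(runRequest);
```

Carrying a session identifier in params is one simple way to keep concurrent sessions apart, and notifications give a natural channel for reporting each Agent's execution status.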

The things you cannot see are often the ones most likely to go wrong. Some visibility is always better than flying blind.

Real-World Results and Benefits

After adopting this multi-Agent collaboration configuration, the HagiCode project’s development efficiency improved significantly. Specifically:

Task-handling capacity doubled: in the past, one Agent had to handle many kinds of tasks at once; now tasks can be processed in parallel, and throughput has increased dramatically
More stable output quality: each Agent focuses only on what it does best, so consistency and quality both improve
Lower maintenance cost: unified interfaces and configuration management make the whole system easier to maintain and extend
Adding new Agents is simple: to integrate a new AI product, you only need to implement the interface and add configuration, without changing the core logic

This approach not only solved HagiCode’s own problems, but also proved that multi-Agent collaboration is a viable architectural choice.

The gains were quite noticeable. The process was just a bit of a hassle.

Conclusion

This article shared the HagiCode project’s practical experience with multi-Agent collaboration configuration. The main takeaways include:

Standardized interfaces: IAIProvider unifies the behavior of different Agents, allowing the code to ignore which company’s product is underneath
Factory pattern: ActivatorUtilities.CreateInstance dynamically creates Provider instances, supporting runtime configuration and dependency injection
Protocol unification: the ACP protocol provides standardized communication between Agents through a bidirectional mechanism based on JSON-RPC 2.0
Task routing: assign work reasonably across different Agents so each can play to its strengths, instead of expecting one Agent to do everything

This design not only solves the problem of “multiple Agents fighting each other,” but also uses the adventure party task flow mechanism to make the development process more automated and specialized.

If you are also considering introducing multiple AI assistants, I hope this article gives you some useful reference points. Of course, every project is different, and the specific approach still needs to be adjusted to the actual situation. There is no one-size-fits-all solution; the best solution is the one that fits you.

Beautiful things or people do not need to be possessed. As long as they remain beautiful, simply appreciating that beauty is enough. Technical solutions are the same: the one that suits you is the best one…

Building an AI Adventure Party: HagiCode Multi-Agent Collaboration Configuration in Practice

Mar 17, 2026

Building an AI Adventure Party: HagiCode Multi-Agent Collaboration Configuration in Practice

In modern software development, a single AI Agent is no longer enough to meet complex requirements. How can multiple AI assistants from different companies collaborate within the same project? This article shares the multi-Agent collaboration configuration approach that the HagiCode project developed through real-world practice.

Background

Many developers have probably had this experience: after introducing an AI assistant into a project, productivity really does improve. But as requirements become more and more complex, one AI Agent starts to feel insufficient. You want it to handle code review, documentation generation, unit testing, and other tasks at the same time, but the result is often that it cannot keep everything balanced, and the output quality becomes inconsistent.

What is even more frustrating is that once you try to bring in multiple AI assistants, the problem becomes more complicated. Each Agent has its own configuration method, API interface, and execution logic, and they may even conflict with one another. It is like a sports team in which every player is talented, but nobody knows how to work together, so the match turns into a mess.

The HagiCode project ran into the same challenge during development. It is a complex project involving a frontend VSCode extension, backend AI services, and a cross-platform desktop client, and we needed to connect multiple AI assistants from different companies at the same time: Claude Code, Codex, CodeBuddy, iFlow, and more. How to let them coexist harmoniously in the same project and make the most of their strengths became a key problem we had to solve.

That alone would already be enough trouble. After all, who wants to deal with a bunch of fighting AIs every day?

The approach shared in this article is the multi-Agent collaboration configuration practice that we developed in the HagiCode project through real trial and error and repeated optimization. If you are also struggling with multiple AI assistants working together, this article may give you some inspiration. Maybe. Every project is different, after all.

About HagiCode

HagiCode is an AI coding assistant project that adopts an “adventure party” model in which multiple AI engines work together. Project repository: github.com/HagiCode-org/site.

The multi-Agent configuration approach shared in this article is one of the core technologies that allows HagiCode to maintain efficient development in complex projects. There is nothing especially magical about it; it simply turns a group of AIs into an adventure party that can actually coordinate.

HagiCode’s Multi-Agent Architecture Design

From “Going Solo” to “Team Collaboration”

In the early days of the HagiCode project, we also tried using a single AI Agent to handle every task. We soon discovered a clear bottleneck in that approach: different tasks require different strengths. Some tasks need stronger contextual understanding, while others need more precise code modification capabilities. One Agent has a hard time excelling at everything.

That made it clear that multiple Agents had to work together, which raised several core issues:

Configuration management complexity: each Agent has different configuration methods, API interfaces, and execution modes
Unified communication protocol: we need a standardized way for different Agents to exchange data
Task coordination and division of labor: how do we assign work reasonably so each Agent can play to its strengths

With those questions in mind, we started designing HagiCode’s multi-Agent architecture. It was not actually that complicated; we just had to think it through clearly.

Overall Architecture at a Glance

After multiple iterations, this is the architecture we settled on:

┌──────────────────────────────────────────────────────────────────┐
│                        AIProviderFactory                         │
│  (Factory pattern for unified management of all AI Providers)    │
├──────────────────────────────────────────────────────────────────┤
│  ClaudeCodeCli  │  CodexCli  │  CodebuddyCli   │  IFlowCli       │
│  (Anthropic)    │  (OpenAI)  │  (Zhipu GLM)    │  (Zhipu)        │
└──────────────────────────────────────────────────────────────────┘

The core idea is to let different AI Agents be managed by the same set of code through a unified Provider interface. At the same time, the factory pattern is used to dynamically create and configure these Providers, ensuring scalability and flexibility across the system.

It is like division of labor in everyday life. Everyone has their own role; here we simply turned that idea into code architecture.

Agent Types and Division of Responsibilities

Based on HagiCode’s real-world experience, we assigned different responsibilities to each Agent:

Agent	Provider	Model	Primary Use
ClaudeCodeCli	Anthropic	glm-5-turbo	Generate technical solutions and Proposals
CodexCli	OpenAI/Zed	gpt-5.4	Execute precise code changes
CodebuddyCli	Zhipu	glm-4.7	Refine proposal descriptions and documentation
IFlowCli	Zhipu	glm-4.7	Archive proposals and historical records
OpenCodeCli	-	-	General-purpose code editing
GitHubCopilot	Microsoft	-	Assisted programming and code completion

After all, the right tool for the right job is the best choice. There are many roads to Rome; some are simply easier to walk than others.

Core Configuration Mechanisms

Unified Provider Interface Design

To manage different AI Agents in a unified way, we first need to define a common interface. In HagiCode, that interface looks like this:

public interface IAIProvider
{
    // Unified Provider interface
    Task<IAIProvider?> GetProviderAsync(AIProviderType providerType);
    Task<IAIProvider?> GetProviderAsync(string providerName, CancellationToken cancellationToken);
}

The interface looks simple, but it is the foundation of the entire multi-Agent system. With a unified interface, we can call AI products from different companies in the same way regardless of which company is behind them.

This is really just about making complex things simple. Simple is beautiful, after all.

Provider Factory Pattern Implementation

Once the interface is unified, the next question is how to create these Provider instances. HagiCode uses the factory pattern:

private IAIProvider? CreateProvider(AIProviderType providerType, ProviderConfiguration config)
{
    return providerType switch
    {
        AIProviderType.ClaudeCodeCli =>
            ActivatorUtilities.CreateInstance<ClaudeCodeCliProvider>(_serviceProvider, Options.Create(config)),
        AIProviderType.CodebuddyCli =>
            ActivatorUtilities.CreateInstance<CodebuddyCliProvider>(_serviceProvider, Options.Create(config)),
        AIProviderType.CodexCli =>
            ActivatorUtilities.CreateInstance<CodexCliProvider>(_serviceProvider, Options.Create(config)),
        AIProviderType.IFlowCli =>
            ActivatorUtilities.CreateInstance<IFlowCliProvider>(_serviceProvider, Options.Create(config)),
        _ => null
    };
}

This uses dependency injection through ActivatorUtilities.CreateInstance, which creates Provider instances at runtime and automatically injects their constructor dependencies. The benefit of this design is that adding a new Agent type only requires a new Provider class and one more arm in the factory’s switch expression. Existing Providers and the code that consumes them do not need to change.

That is reason enough. Who wants to rewrite a pile of old code every time a new feature is added?

Dynamic Configuration Resolution

To make configuration more flexible, we also implemented a type-mapping mechanism:

public static class AIProviderTypeExtensions
{
    private static readonly Dictionary<string, AIProviderType> _typeMap = new(
        StringComparer.OrdinalIgnoreCase)
    {
        ["ClaudeCodeCli"] = AIProviderType.ClaudeCodeCli,
        ["CodebuddyCli"] = AIProviderType.CodebuddyCli,
        ["CodexCli"] = AIProviderType.CodexCli,
        ["IFlowCli"] = AIProviderType.IFlowCli,
        // ...more type mappings
    };
}

Configuration should be as intuitive as possible. Nobody wants to memorize a pile of complicated code names.

Example Configuration File

In practice, everything can be configured in appsettings.json; the example below uses YAML notation to show the hierarchy:

AI:
  Providers:
    Providers:
      ClaudeCodeCli:
        Enabled: true
        Model: glm-5-turbo
        WorkingDirectory: /path/to/project
      CodebuddyCli:
        Enabled: true
        Model: glm-4.7
      CodexCli:
        Enabled: true
        Model: gpt-5.4
      IFlowCli:
        Enabled: true
        Model: glm-4.7

Each Provider can independently configure parameters such as enablement, model version, and working directory. This design preserves flexibility while remaining easy to manage and maintain.

Configuration files are a bit like life’s options: you can choose to enable or disable certain things. The only difference is that code choices are easier to regret later.

Adventure Party Task Flow

The Art of Task Division

Proposal creation (user)
    │
    ▼
[Claude Code] ──generate proposal──▶ Proposal document
    │                               │
    │                               ▼
    │                      [Codebuddy] ──refine description──▶ Refined proposal
    │                               │
    │                               ▼
    │                      [Codex] ──execute changes──▶ Code changes
    │                               │
    │                               ▼
    └──────────────────────▶ [iFlow] ──archive──▶ Historical records

The benefit of this division of labor is that each Agent only needs to focus on the tasks it does best, rather than trying to do everything. Claude Code is responsible for generating proposals from scratch. Codebuddy makes proposal descriptions clearer. Codex turns proposals into actual code changes. iFlow archives and preserves those changes.

This is really just teamwork, much like in everyday life. Everyone has their own role, and only together can something big get done. The only difference is that the team members here happen to be AIs.

Key Practical Takeaways

In actual operation, we summarized the following lessons:

1. Agent selection strategy matters

Tasks should not be assigned casually; they should be matched to each Agent’s strengths:

Proposal generation: use Claude Code, because it has stronger contextual understanding
Code execution: use Codex, because it is more precise for code modification
Proposal refinement: use Codebuddy, because it offers good value for its cost
Archival storage: use iFlow, because it is stable and reliable

After all, putting the right person on the right task is a timeless principle.

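2. Configuration isolation ensures stability

Each Agent’s configuration is managed independently, supports environment-variable overrides, and uses its own working directory, so a configuration error in one Agent does not affect the others.

This is like personal boundaries in life. Everyone needs their own space; non-interference makes harmonious coexistence possible.

3. Error-handling mechanism

A failure in a single Agent should not bring down the overall workflow. With a fallback strategy, when one Agent fails the system can switch to a backup plan or skip that step and continue with later tasks, while complete logging makes troubleshooting easier afterward.

Nobody can guarantee that errors will never happen. The key is how you handle them. Life works much the same way.

4. Monitoring and observability

Through the ACP protocol (a custom communication protocol based on JSON-RPC 2.0), the execution status of each Agent can be tracked. Session isolation ensures concurrency safety, while dynamic caching improves performance.

The things you cannot see are often the ones most likely to go wrong. Some visibility is always better than flying blind.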

Real-World Results and Benefits

After adopting this multi-Agent collaboration configuration, the HagiCode project’s development efficiency improved significantly. Specifically:

Task-handling capacity doubled: in the past, one Agent had to handle many kinds of tasks at once; now tasks can be processed in parallel, and throughput has increased dramatically
More stable output quality: each Agent focuses only on what it does best, so consistency and quality both improve
Lower maintenance cost: unified interfaces and configuration management make the whole system easier to maintain and extend
Adding new Agents is simple: to integrate a new AI product, you only need to implement the interface and add configuration, without changing the core logic

This approach not only solved HagiCode’s own problems, but also proved that multi-Agent collaboration is a viable architectural choice.

The gains were quite noticeable. The process was just a bit of a hassle.

Conclusion

This article shared the HagiCode project’s practical experience with multi-Agent collaboration configuration. The main takeaways include:

Standardized interfaces: IAIProvider unifies the behavior of different Agents, allowing the code to ignore which company’s product is underneath
Factory pattern: ActivatorUtilities.CreateInstance dynamically creates Provider instances, supporting runtime configuration and dependency injection
Protocol unification: the ACP protocol provides standardized communication between Agents through a bidirectional mechanism based on JSON-RPC 2.0
Task routing: assign work reasonably across different Agents so each can play to its strengths, instead of expecting one Agent to do everything

References

HagiCode project repository: github.com/HagiCode-org/site
HagiCode official website: hagicode.com
Video demo: www.bilibili.com/video/BV1pirZBuEzq/
Installation guide: docs.hagicode.com/installation/docker-compose
Desktop app: hagicode.com/desktop/

If this article was helpful to you, feel free to give the project a Star on GitHub. Your support is what keeps us sharing more. The public beta has already started, and you are welcome to install it and give it a try.

This content was created with AI-assisted collaboration, reviewed by me, and reflects my own views and position.

Author: newbe36524
Article link: https://docs.hagicode.com/blog/2026-03-17-hagicode-ai-agent-party/
Copyright notice: Unless otherwise stated, all articles on this blog are licensed under BY-NC-SA. All rights reserved!

How Gamification Design Makes AI Coding More Fun

Mar 16, 2026

How Gamification Design Makes AI Coding More Fun

Traditional AI coding tools are actually quite powerful; they just lack a bit of warmth. When we were building HagiCode, we thought: if we are going to write code anyway, why not turn it into a game?

Background

Anyone who has used an AI coding assistant has probably had this experience: at first it feels fresh and exciting, but after a while it starts to feel like something is missing. The tool itself is powerful, capable of code generation, autocomplete, and bug fixes, but it does not feel very warm, and over time it can become monotonous and dull.

That alone is enough to make you wonder who wants to stare at a cold, impersonal tool every day.

It is a bit like playing a game. If all you do is finish a task list, with no character growth, no achievement unlocks, and no team coordination, it quickly stops being fun. Beautiful things and people do not need to be possessed to be appreciated; their beauty is enough on its own. Programming tools do not even offer that kind of beauty, so it is easy to lose heart.

We ran into exactly this problem while developing HagiCode. As a multi-AI assistant collaboration platform, HagiCode needs to keep users engaged over the long term. But in reality, even a great tool is hard to stick with if it lacks any emotional connection.

To solve this pain point, we made a bold decision: turn programming into a game. Not the superficial kind with a simple points leaderboard, but a true role-playing gamified experience. The impact of that decision may be even bigger than you imagine.

After all, people need a bit of ritual in their lives.

About HagiCode

The ideas shared in this article come from our practical experience on the HagiCode project. HagiCode is a multi-AI assistant collaboration platform that supports Claude Code, Codex, Copilot, OpenCode, and other AI assistants working together. If you are interested in multi-AI collaboration or gamified programming, visit github.com/HagiCode-org/site to learn more.

There is nothing especially mysterious about it. We simply turned programming into an adventure.

Why Choose Gamification

The essence of gamification is not just “adding a leaderboard.” It is about building a complete incentive system so users can feel growth, achievement, and social recognition while doing tasks.

HagiCode’s gamification design revolves around one core idea: every AI assistant is a “Hero,” and the user is the captain of this Hero team. You lead these Heroes to conquer various “Dungeons” (programming tasks). Along the way, Heroes gain experience, level up, unlock abilities, and your team earns achievements as well.

This is not a gimmick. It is a design grounded in human behavioral psychology. When tasks are given meaning and progress feedback, people’s engagement and persistence increase significantly.

As the old poem goes, “This feeling can become a memory, though at the time it left us bewildered.” We bring that emotional experience into the tool, so programming is no longer just typing code, but a journey worth remembering.

Hero Character System

Hero is the core concept in HagiCode’s gamification system. Each Hero represents one AI assistant. For example, Claude Code is a Hero, and Codex is also a Hero.

The Three Hero Slots

A Hero has three equipment slots, and the design is surprisingly elegant:

CLI slot (main class): Determines the Hero’s base ability, such as whether it is Claude Code or Codex
Model slot (secondary class): Determines which model is used, such as Claude 4.5 or Claude 4.6
Style slot (style): Determines the Hero’s behavior style, such as “Fengluo Strategist” or another style

The combination of these three slots creates unique Hero configurations. Much like equipment builds in games, you choose the right setup based on the task. After all, what suits you best is what matters most. Life is similar: many roads lead to Rome, but some are smoother than others.

The Hero Growth System

Each Hero has its own XP and level:

type HeroProgressionSnapshot = {
  currentLevel: number;                    // Current level
  totalExperience: number;                 // Total experience
  currentLevelStartExperience: number;     // Experience at the start of the current level
  nextLevelExperience: number;             // Experience required for the next level
  experienceProgressPercent: number;       // Progress percentage
  remainingExperienceToNextLevel: number;  // Experience still needed for the next level
  lastExperienceGain: number;              // Most recent experience gained
  lastExperienceGainAtUtc?: string | null; // Time when experience was gained
};
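
As a rough illustration of how the derived fields in this snapshot could be computed, here is a minimal sketch. It assumes that currentLevelStartExperience and nextLevelExperience are both absolute XP totals (the type itself does not say), and deriveProgress is a hypothetical helper, not HagiCode's actual code.

```typescript
// Hypothetical helper: derives the progress fields of a snapshot.
// Assumption: both thresholds are absolute XP totals, not per-level deltas.
type ProgressInput = {
  totalExperience: number;
  currentLevelStartExperience: number;
  nextLevelExperience: number;
};

function deriveProgress(input: ProgressInput): {
  experienceProgressPercent: number;
  remainingExperienceToNextLevel: number;
} {
  const span = input.nextLevelExperience - input.currentLevelStartExperience;
  const earned = input.totalExperience - input.currentLevelStartExperience;
  // Guard against a zero-width level and clamp the result to the 0..100 range.
  const ratio = span > 0 ? earned / span : 1;
  const percent = Math.min(100, Math.max(0, ratio * 100));
  return {
    experienceProgressPercent: Math.round(percent),
    remainingExperienceToNextLevel: Math.max(0, input.nextLevelExperience - input.totalExperience),
  };
}

const progress = deriveProgress({
  totalExperience: 1250,
  currentLevelStartExperience: 1000,
  nextLevelExperience: 2000,
});
console.log(progress);
```

Clamping keeps the progress bar sane even if the stored totals are briefly out of sync with the level thresholds.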

Levels are divided into four stages, and each stage has an immersive name:

export const resolveHeroProgressionStage = (level?: number | null): HeroProgressionStage => {
  const normalizedLevel = Math.max(1, level ?? 1);
  if (normalizedLevel <= 100) return 'rookieSprint';      // Rookie sprint
  if (normalizedLevel <= 300) return 'growthRun';         // Growth run
  if (normalizedLevel <= 700) return 'veteranClimb';      // Veteran climb
  return 'legendMarathon';                                // Legend marathon
};

From “rookie” to “legend,” this growth path gives users a clear sense of direction and achievement. It mirrors personal growth in life, from confusion to maturity, only made more tangible here.

Creating a Custom Hero

To create a Hero, you need to configure three slots:

const heroDraft: HeroDraft = {
  name: 'Athena',
  icon: 'hero-avatar:storm-03',
  description: 'A brilliant strategist',
  executorType: AIProviderType.CLAUDE_CODE_CLI,
  slots: {
    cli: {
      id: 'profession-claude-code',
      parameters: { /* CLI-related parameters */ }
    },
    model: {
      id: 'secondary-claude-4-sonnet',
      parameters: { /* Model-related parameters */ }
    },
    style: {
      id: 'fengluo-strategist',
      parameters: { /* Style-related parameters */ }
    }
  }
};

Every Hero has a unique avatar, description, and professional identity, which gives what would otherwise be a cold AI assistant more personality and warmth. After all, who wants to work with a tool that has no character?

Dungeon System

A “Dungeon” is a classic game concept representing a challenge that requires a team to clear. In HagiCode, each workflow is a Dungeon.

How Dungeons Are Organized

The Dungeon system groups workflows into separate Dungeons:

Proposal generation dungeon: Responsible for generating technical proposals
Proposal execution dungeon: Responsible for executing tasks in proposals
Proposal archive dungeon: Responsible for organizing and archiving completed proposals

Each dungeon has its own Captain Hero, and the captain is automatically chosen as the first enabled Hero.
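
The captain rule is simple enough to sketch in a few lines. The RosterMember shape and its enabled flag are assumptions made for illustration; the real roster type may carry this information differently.

```typescript
// Hypothetical roster member shape; the `enabled` flag is an assumption.
type RosterMember = { heroId: string; name: string; enabled: boolean };

// Returns the first enabled member in roster order, or null if none is enabled.
function pickCaptain(members: RosterMember[]): RosterMember | null {
  return members.find((member) => member.enabled) ?? null;
}

const captain = pickCaptain([
  { heroId: 'hero-1', name: 'Athena', enabled: false },
  { heroId: 'hero-2', name: 'Apollo', enabled: true },
]);
console.log(captain?.name);
```

Because the rule depends on roster order, reordering the members is enough to change who leads the dungeon.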

This is really just division of labor, like in everyday life, except turned into a game mechanic.

Team Collaboration Mechanism

You can configure different Hero squads for different dungeons:

const dungeonRoster: HeroDungeonRoster = {
  scriptKey: 'proposal.generate',
  displayName: 'Proposal Generation',
  members: [
    { heroId: 'hero-1', name: 'Athena', executorType: 'ClaudeCode' },
    { heroId: 'hero-2', name: 'Apollo', executorType: 'Codex' }
  ]
};

For example, you might put Athena in the Proposal Generation squad because it is good at strategy, and Apollo in the Proposal Execution squad because it is good at execution. That way, every Hero can play to its strengths. It is like forming a band: each person has an instrument, and together they create something beautiful.

Dungeon Flow Control

Each Dungeon is identified by a fixed scriptKey value that maps to one workflow:

// Script keys map to different workflows
const dungeonScripts = {
  'proposal.generate': 'Proposal Generation',
  'proposal.execute': 'Proposal Execution',
  'proposal.archive': 'Proposal Archive'
};

The task state flow is: queued (waiting) -> dispatching (being assigned) -> dispatched (assigned). The whole process is automated and requires no manual intervention. That is also part of our lazy side, because who wants to manage this stuff by hand?
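
The state flow above can be modelled as a tiny state machine. This is only a sketch: the three state names come from the article, while treating dispatched as a terminal state and the advance helper itself are assumptions.

```typescript
type DispatchState = 'queued' | 'dispatching' | 'dispatched';

// Allowed forward transition for each state; `dispatched` is treated as terminal here.
const nextState: Record<DispatchState, DispatchState | null> = {
  queued: 'dispatching',
  dispatching: 'dispatched',
  dispatched: null,
};

// Advances a task by one step and rejects any move out of the terminal state.
function advance(state: DispatchState): DispatchState {
  const next = nextState[state];
  if (next === null) {
    throw new Error(`No transition out of "${state}"`);
  }
  return next;
}

const afterTwoSteps = advance(advance('queued'));
console.log(afterTwoSteps);
```

Keeping the transitions in one table makes it hard for a task to skip a step or move backwards by accident.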

XP and Level System

XP is the core feedback mechanism in the gamification system. Users gain XP by completing tasks, XP levels up Heroes, and leveling up unlocks new abilities, forming a positive feedback loop.

Ways to Gain XP

In HagiCode, XP can be earned through the following activities:

Completing code execution
Successfully calling tools
Generating proposals
Session management operations
Project operations

Every time a valid action is completed, the corresponding Hero gains XP. Just like growth in life, every step counts, only here that growth is quantified.

Real-Time Progress Visualization

XP and level progress are visualized in real time:

type HeroDungeonMember = {
  heroId: string;
  name: string;
  icon?: string | null;
  executorType: PCode_Models_AIProviderType;
  currentLevel?: number;                    // Current level
  totalExperience?: number;                 // Total experience
  experienceProgressPercent?: number;       // Progress percentage
};

Users can always see each Hero’s level and progress, and that immediate feedback is the key to gamification design. People need feedback, otherwise how would they know they are improving?

Achievement System

Achievements are another important element in gamification. They provide long-term goals and milestone-driven satisfaction.

Achievement Types

HagiCode supports multiple types of achievements:

Code generation achievements: Generate X lines of code, generate Y files
Session management achievements: Complete Z conversations
Project operation achievements: Work across W projects

These achievements are really like milestones in life, except we have turned them into a game mechanic.

Achievement States

Achievements have three states:

type AchievementStatus = 'unlocked' | 'in-progress' | 'locked';

The three states have clear visual distinctions:

Unlocked: Gold gradient with a halo effect
In progress: Blue pulse animation
Locked: Gray, with unlock conditions shown

Each achievement clearly displays its trigger condition, so users know what to do next. When people feel lost, a little guidance always helps.
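
One plausible way to derive these states is from a counter and a target. The sketch below assumes exactly that; the rule that zero progress means locked is our illustration, not necessarily how HagiCode decides it.

```typescript
type AchievementStatus = 'unlocked' | 'in-progress' | 'locked';

// Assumption: an achievement tracks a numeric counter against a target value.
function resolveStatus(current: number, target: number): AchievementStatus {
  if (current >= target) return 'unlocked';
  if (current > 0) return 'in-progress';
  return 'locked';
}

console.log(resolveStatus(0, 100), resolveStatus(40, 100), resolveStatus(100, 100));
```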

Celebration Effect on Unlock

When an achievement is unlocked, a celebration animation is triggered. That kind of positive reinforcement gives users the satisfying feeling of “I did it” and motivates them to keep going. Small rewards in life work the same way: they may be small, but the happiness can last a long time.

Battle Report: The Daily Combat Report

Battle Report is one of HagiCode’s signature features. At the end of each day, it generates a full-screen battle-style report.

Report Content

Battle Report displays the following information:

type HeroBattleReport = {
  reportDate: string;
  summary: {
    totalHeroCount: number;        // Total number of Heroes
    activeHeroCount: number;       // Number of active Heroes
    totalBattleScore: number;      // Total battle score
    mvp: HeroBattleHero;           // Most valuable Hero
  };
  heroes: HeroBattleHero[];        // Detailed data for all Heroes
};

The full-screen report shows:

Total team score
Number of active Heroes
Number of tool calls
Total working time
MVP (Most Valuable Hero)
Detailed card for each Hero

MVP Highlight Display

The MVP is the best-performing Hero of the day and is highlighted in the report. This is not just data statistics, but a form of honor and recognition. After all, who does not want to be recognized?
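
If "best-performing" is taken to mean the highest battle score, MVP selection could look like the sketch below. Both the HeroScore shape and the use of battleScore as the ranking key are assumptions; the article does not spell out the actual criterion.

```typescript
// Hypothetical per-hero summary; using `battleScore` as the ranking key is an assumption.
type HeroScore = { heroId: string; name: string; battleScore: number };

// Picks the highest-scoring Hero; ties go to the earlier entry. Returns null for an empty list.
function pickMvp(heroes: HeroScore[]): HeroScore | null {
  return heroes.reduce<HeroScore | null>(
    (best, hero) => (best === null || hero.battleScore > best.battleScore ? hero : best),
    null,
  );
}

const mvp = pickMvp([
  { heroId: 'hero-1', name: 'Athena', battleScore: 420 },
  { heroId: 'hero-2', name: 'Apollo', battleScore: 615 },
]);
console.log(mvp?.name);
```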

Detailed Hero Cards

Each Hero card includes:

Level progress
XP gained
Number of executions
Usage time

These metrics help users clearly understand how the team is performing. Seeing the results of your own effort is satisfying in itself.

Technical Implementation

HagiCode’s gamification system uses a modern technology stack and design patterns. There is nothing especially magical about it; we just chose tools that fit the job.

Technology Stack Choices

// React + TypeScript for the frontend
import React from 'react';

// Framer Motion for animations
import { AnimatePresence, motion } from 'framer-motion';

// Redux Toolkit for state management
import { useAppDispatch, useAppSelector } from '@/store';

// shadcn/ui for UI components
import { Dialog, DialogContent } from '@/components/ui/dialog';

Framer Motion handles all animation effects, shadcn/ui provides the foundational UI components, and Redux Toolkit manages the complex gamification state. Good tools make good work.

Gamified UI Design System

HagiCode uses a Glassmorphism + Tech Dark design style:

/* Primary gradient */
background: linear-gradient(135deg, #22C55E 0%, #25c2a0 50%, #06b6d4 100%);

/* Glass effect */
backdrop-filter: blur(12px);

/* Glow effect */
background: radial-gradient(circle at center, rgba(34, 197, 94, 0.15) 0%, transparent 70%);

The green gradient combined with glassmorphism creates a technical, futuristic atmosphere. Visual beauty is part of the user experience too.

Animation Effects

Framer Motion is used to create smooth entrance animations:

<motion.div
  animate={{ opacity: 1, y: 0 }}
  initial={{ opacity: 0, y: 18 }}
  transition={{ duration: 0.35, ease: 'easeOut', delay: index * 0.08 }}
  className="card"
>
  {/* Card content */}
</motion.div>

Each card’s entrance is delayed by 0.08 seconds multiplied by its index, so the cards appear one after another rather than all at once. Smooth animation improves the experience. That part is hard to argue with.

Data Persistence

Gamification data is stored using the Grain storage system to ensure state consistency. Even fine-grained data like accumulated Hero XP can be persisted accurately. No one wants to lose the experience they worked hard to earn.

Practical Guide

Create Your First Hero

Creating your first Hero is actually quite simple:

Go to the Hero management page
Click the “Create Hero” button
Configure the three slots (CLI, Model, Style)
Give the Hero a name and description
Save it, and your first Hero is born

It is like meeting a new friend: you give them a name, learn what makes them special, and then head off on an adventure together.

Build a Dungeon Team

Building a team is also simple:

Go to the Dungeon management page
Choose the dungeon you want to configure, such as “Proposal Generation”
Select members from your Hero list
The system automatically selects the first enabled Hero as Captain
Save the configuration

This is simply the process of forming a team, much like building a team in real life where everyone has their own role.

View the Daily Report

At the end of each day, you can view the day’s Battle Report:

Click the “Battle Report” button
View the day’s work results in a full-screen display
Check the MVP and the detailed data for each Hero
Share it with team members if you want

This is also a kind of ritual, a way to see how much effort you put in today and how far you still are from your goal.

Notes and Best Practices

Performance Optimization

Use React.memo to avoid unnecessary re-renders:

const HeroCard = React.memo(({ hero }: { hero: HeroDungeonMember }) => {
  // Component implementation
});

Performance matters too. No one wants to use a laggy tool.

Motion Can Degrade Gracefully

Detect the user’s motion preference settings and provide a simplified experience for motion-sensitive users:

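import { useReducedMotion } from 'framer-motion';

// Inside a React component: truthy when the user has requested reduced motion
const prefersReducedMotion = useReducedMotion();
const duration = prefersReducedMotion ? 0 : 0.35;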

Not everyone likes animation, and respecting user preferences is part of good design.
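
Putting the stagger and the reduced-motion check together, the transition settings for a card could be computed by a small pure function. The 0.35 s duration and 0.08 s stagger come from the snippets above; the cardTransition helper and the choice to drop the delay entirely for reduced motion are assumptions.

```typescript
// Transition settings for the card at position `index`.
// Assumption: reduced motion collapses both the duration and the stagger delay to zero.
function cardTransition(index: number, prefersReducedMotion: boolean) {
  if (prefersReducedMotion) {
    return { duration: 0, delay: 0 };
  }
  return { duration: 0.35, ease: 'easeOut', delay: index * 0.08 };
}

console.log(cardTransition(3, false));
console.log(cardTransition(3, true));
```

Keeping this logic outside the component makes it easy to unit test without rendering anything.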

Backward Compatibility

Keep legacyIds to support migration from older versions:

type HeroDungeonMember = {
  heroId: string;
  legacyIds?: string[];  // Supports legacy ID mapping
  // ...
};

No one wants to lose data just because of a version upgrade.
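
A lookup that honors legacyIds might look like the following sketch. The findMember helper and the sample IDs are hypothetical; only the heroId and legacyIds fields come from the type above.

```typescript
// Minimal member shape for this sketch; only the ID fields matter here.
type MemberIds = { heroId: string; legacyIds?: string[] };

// Finds a member by its current ID first, then by any of its legacy IDs.
function findMember<T extends MemberIds>(members: T[], id: string): T | undefined {
  return (
    members.find((m) => m.heroId === id) ??
    members.find((m) => (m.legacyIds ?? []).some((legacy) => legacy === id))
  );
}

const roster: MemberIds[] = [
  { heroId: 'hero-1', legacyIds: ['athena-v1'] },
  { heroId: 'hero-2' },
];
console.log(findMember(roster, 'athena-v1')?.heroId);
```

Checking current IDs before legacy ones means a new ID can never be shadowed by an old alias.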

Internationalization Support

Use i18n translation keys for all text to make multi-language support easy:

// Fall back to the configured displayName when no translation exists
const localizedName = t(`dungeon.${scriptKey}`, { defaultValue: displayName });

Language should never be a barrier to using the product.

Summary

Gamification is not just a simple points leaderboard, but a complete incentive system. Through the Hero system, Dungeon system, XP and level system, achievement system, and Battle Report, HagiCode transforms programming work into a heroic journey full of adventure.

The core value of this system lies in:

Emotional connection: Giving cold AI assistants personality
Positive feedback: Every action produces immediate feedback
Long-term goals: Levels and achievements provide a growth path
Team identity: A sense of collaboration within Dungeon teams
Honor and recognition: Battle Report and MVP showcases

Gamification design makes programming no longer dull, but an interesting adventure. While completing coding tasks, users also experience the fun of character growth, team collaboration, and achievement unlocking, which improves retention and activity.

At its core, programming is already an act of creation. We just made the creative process a little more fun.

If this article helped you:

Leave a like so more people can discover it
Give us a Star on GitHub: github.com/HagiCode-org/site
Visit the official site to learn more: hagicode.com
Watch the 30-minute hands-on demo: www.bilibili.com/video/BV1pirZBuEzq/
Install with one click and try it: docs.hagicode.com/installation/docker-compose
Quick install for Desktop: hagicode.com/desktop/
Public beta has started, and you are welcome to install and try it

Thank you for reading. If you found this article useful, please click the like button below so more people can discover it.

This content was created with AI-assisted collaboration, reviewed by me, and reflects my own views and position.

Author: newbe36524
Article link: https://docs.hagicode.com/blog/2026-03-16-gamifying-ai-coding/

ImgBin CLI Tool Design: HagiCode's Image Asset Management Approach

Mar 13, 2026

This article explains how to build an automatable image asset pipeline from scratch, covering CLI tool design, a Provider Adapter architecture, and metadata management strategies.

Background

Honestly, I did not expect image asset management to keep us tangled up for this long.

During HagiCode development, we ran into a problem that looked simple on the surface but was surprisingly thorny in practice: generating and managing image assets. In a way, it was like the dramas of adolescence - calm on the outside, turbulent underneath.

As the project accumulated more documentation and marketing materials, we needed a large number of supporting images. Some had to be AI-generated, some had to be selected from an existing asset library, and others needed AI recognition plus automatic labeling. The problem was that all of this had long been handled through scattered scripts and manual steps. Every time we generated an image, we had to run a script by hand, organize metadata by hand, and create thumbnails by hand. That alone was annoying enough, but the bigger issue was that everything was scattered everywhere. When we wanted to find something, we could not. When we needed to reuse something, we could not.

The pain points were concrete:

No unified entry point: the logic for image generation was spread across different scripts, so batch execution was basically impossible.
Missing metadata: generated images had no unified metadata.json, which meant no reliable searchability or traceability.
High manual organization cost: titles and tags had to be sorted out one by one by hand, which was inefficient.
No automation: automatically generating visual assets in a CI/CD pipeline? Not a chance.

We did think about just leaving it alone. But projects still need to move forward. Since we could not avoid the problem, we figured we might as well solve it. So we decided to upgrade ImgBin from a set of scattered scripts into an image asset pipeline that can be executed automatically. Some problems, after all, do not disappear just because you look away.

About HagiCode

The approach shared in this article comes from our hands-on experience in the HagiCode project. HagiCode is an AI coding assistant project that simultaneously maintains multiple components, including a VSCode extension, backend AI services, and a cross-platform desktop client. In a complex, multilingual, cross-platform environment like this, standardized image asset management becomes a key part of improving development efficiency.

You could say this was one of those small growing pains in HagiCode’s journey. Every project has moments like that: a minor issue that looks insignificant, yet somehow manages to take up half the day.

HagiCode’s build system is based on the TypeScript + Node.js ecosystem, so ImgBin naturally adopted the same tech stack to keep the project technically consistent. Once you are used to one stack, switching to something else just feels like unnecessary trouble.

Core Design

Overall Architecture

ImgBin uses a layered architecture that cleanly separates CLI commands, application services, third-party API adapters, and the infrastructure layer:

Component hierarchy
├── CLI Entry (cli.ts)              Global argument parsing, command routing
├── Commands (commands/*)           generate | batch | annotate | thumbnail
├── Application Services            job-runner | metadata | thumbnail | asset-writer
├── Provider Adapters               image-api-provider | vision-api-provider
└── Infrastructure Layer            config | logger | paths | schema

The benefit of this layered design is clear responsibility boundaries. It also makes testing easier because external dependencies can be mocked cleanly. In practice, it just means each layer does its own job without getting in the way of the others, so when something breaks, it is easier to figure out why.

Single-Asset Directory Model

ImgBin uses a model of “one asset, one directory.” Every time an image is generated, it creates a structure like this:

library/
└── 2026-03/
    └── orange-dashboard/
        ├── original.png      # Original image
        ├── thumbnail.webp    # 512x512 thumbnail
        └── metadata.json     # Structured metadata

The advantages of this model are:

Self-contained: all files for a single asset live in the same directory, making migration and backup convenient.
Traceable: metadata.json makes it possible to trace generation time, prompt, model, and other details.
Extensible: if more variants are needed later, such as thumbnails in multiple sizes, we can simply add new files in the same directory.
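
To make the layout concrete, here is a minimal sketch of how the paths for one asset could be derived. The assetPaths helper is hypothetical, and deriving the month folder from the creation date in UTC is an assumption based on the example tree.

```typescript
// Builds the on-disk layout for one asset, following the directory tree shown above.
// Assumption: the month folder is derived from the creation date in UTC.
function assetPaths(libraryRoot: string, slug: string, createdAt: Date) {
  const month = createdAt.toISOString().slice(0, 7); // e.g. "2026-03"
  const assetDir = `${libraryRoot}/${month}/${slug}`;
  return {
    assetDir,
    original: `${assetDir}/original.png`,
    thumbnail: `${assetDir}/thumbnail.webp`,
    metadata: `${assetDir}/metadata.json`,
  };
}

const paths = assetPaths('library', 'orange-dashboard', new Date('2026-03-11T04:01:19.570Z'));
console.log(paths.assetDir);
```

Because every path hangs off assetDir, moving or backing up an asset is a single directory operation.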

Beautiful things do not always need to be possessed. Sometimes it is enough that they remain beautiful, and that you can quietly appreciate them. That may sound a little far afield, but the logic still holds here: once images are kept together, they are more pleasant to look at and much easier to find.

Layered Metadata Storage

metadata.json is the core of the entire system. It uses a layered storage strategy that separates fields into three categories:

{
  "schemaVersion": 2,
  "assetId": "orange-dashboard",
  "slug": "orange-dashboard",
  "title": "Orange Dashboard",
  "tags": ["dashboard", "hero", "orange"],

  "source": { "type": "generated" },

  "paths": {
    "assetDir": "library/2026-03/orange-dashboard",
    "original": "original.png",
    "thumbnail": "thumbnail.webp"
  },

  "generated": {
    "prompt": "orange dashboard for docs hero",
    "provider": "azure-openai-image-api",
    "model": "gpt-image-1.5"
  },

  "recognized": {
    "title": "Orange Dashboard",
    "tags": ["dashboard", "ui", "orange"],
    "description": "A modern orange dashboard with charts and metrics"
  },

  "status": {
    "generation": "succeeded",
    "recognition": "succeeded",
    "thumbnail": "succeeded"
  },

  "timestamps": {
    "createdAt": "2026-03-11T04:01:19.570Z",
    "updatedAt": "2026-03-11T04:02:09.132Z"
  }
}

generated: records the original information from image generation, such as the prompt, provider, and model.
recognized: stores AI recognition results, such as auto-generated titles, tags, and descriptions.
manual: stores manually curated results. Data in this layer has the highest priority and is not overwritten by AI recognition. The example above does not include a manual section.

This layered strategy resolves one of our earlier core conflicts: when AI recognition and manual curation disagree, which one should win? The answer is manual input. AI recognition is there to assist, not to decide. That question also became clearer over time - machines are still machines, and in the end, people still need to make the call.
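
The "manual wins" rule can be sketched as a field-by-field fallback. This assumes the manual layer has the same shape as recognized, which the article does not show; effectiveFields is a hypothetical helper.

```typescript
type Curated = { title?: string; tags?: string[]; description?: string };

// Only the layers relevant to display are modelled here.
// Assumption: `manual` has the same shape as `recognized`.
type LayeredMetadata = { recognized?: Curated; manual?: Curated };

// Resolves each field independently: manual wins, recognized is the fallback.
function effectiveFields(meta: LayeredMetadata): Curated {
  const manual = meta.manual ?? {};
  const recognized = meta.recognized ?? {};
  return {
    title: manual.title ?? recognized.title,
    tags: manual.tags ?? recognized.tags,
    description: manual.description ?? recognized.description,
  };
}

const resolved = effectiveFields({
  recognized: { title: 'Orange Dashboard', tags: ['dashboard', 'ui', 'orange'] },
  manual: { title: 'Docs Hero: Orange Dashboard' },
});
console.log(resolved.title, resolved.tags);
```

Resolving per field means a person can correct just the title and still benefit from the AI-generated tags.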

Provider Adapter Pattern

Another core part of ImgBin is the Provider Adapter pattern. We abstract external APIs behind a unified interface so that even if we switch AI service providers, we do not need to change the business logic.

In a way, it is a bit like relationships - outward appearances can change, but what matters is that the inner structure stays the same. Once the interface is fixed, the internal implementation can vary freely.

Image Generation Provider

interface ImageGenerationProvider {
  // Generate an image and return its Buffer
  generate(options: GenerateOptions): Promise<Buffer>;

  // Get the list of supported models
  getSupportedModels(): Promise<string[]>;
}

interface GenerateOptions {
  prompt: string;
  model?: string;
  size?: '1024x1024' | '1792x1024' | '1024x1792';
  quality?: 'standard' | 'hd';
  format?: 'png' | 'webp' | 'jpeg';
}

Vision Recognition Provider

interface VisionRecognitionProvider {
  // Recognize image content and return structured metadata
  recognize(imageBuffer: Buffer): Promise<RecognitionResult>;

  // Get the list of supported models
  getSupportedModels(): Promise<string[]>;
}

interface RecognitionResult {
  title?: string;
  tags: string[];
  description?: string;
  confidence: number;
}

The advantages of this interface design are:

Testable: in unit tests, we can pass in mock providers instead of making real external API calls.
Extensible: adding a new provider only requires implementing the interface; caller code does not need to change.
Replaceable: production can use Azure OpenAI while testing can use a local model, with configuration being the only thing that changes.
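
As an example of the testability point, here is a sketch of a mock provider. The interfaces are pared-down copies of the ones above, with Uint8Array standing in for Buffer so the snippet has no Node dependency; MockVisionProvider and annotate are hypothetical names.

```typescript
// Pared-down copies of the interfaces above so the sketch stands alone.
// Uint8Array stands in for Buffer to avoid depending on Node typings.
interface RecognitionResult {
  title?: string;
  tags: string[];
  description?: string;
  confidence: number;
}

interface VisionRecognitionProvider {
  recognize(image: Uint8Array): Promise<RecognitionResult>;
  getSupportedModels(): Promise<string[]>;
}

// Hypothetical test double: returns a canned result and counts calls.
class MockVisionProvider implements VisionRecognitionProvider {
  calls = 0;
  canned: RecognitionResult;

  constructor(canned: RecognitionResult) {
    this.canned = canned;
  }

  async recognize(_image: Uint8Array): Promise<RecognitionResult> {
    this.calls += 1;
    return this.canned;
  }

  async getSupportedModels(): Promise<string[]> {
    return ['mock-vision'];
  }
}

// Caller code depends only on the interface, so it accepts the mock unchanged.
async function annotate(provider: VisionRecognitionProvider, image: Uint8Array): Promise<string[]> {
  const result = await provider.recognize(image);
  return result.tags;
}

const mock = new MockVisionProvider({ tags: ['dashboard', 'ui'], confidence: 0.9 });
annotate(mock, new Uint8Array(0)).then((tags) => console.log(tags, mock.calls));
```

No network call is made, so a test built on this mock runs quickly and behaves the same on every run.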

Sometimes project work feels like that too. On the surface it looks like we just swapped an API, but the internal logic remains exactly the same, and that makes the whole thing a lot less scary.

CLI Command Design

ImgBin provides four core commands to cover different usage scenarios:

generate: single-image generation

# Simplest usage
imgbin generate --prompt "orange dashboard for docs hero"

# Generate a thumbnail and AI annotations at the same time
imgbin generate --prompt "orange dashboard" --annotate --thumbnail

# Specify an output directory
imgbin generate --prompt "orange dashboard" --output ./library

batch: batch jobs

Batch jobs are defined through YAML or JSON manifest files, which makes them suitable for CI/CD workflows:

defaults:
  annotate: true
  thumbnail: true
  libraryRoot: ./library

jobs:
  - prompt: "orange dashboard hero"
    slug: orange-dashboard
    tags: [dashboard, hero, orange]

  - prompt: "pricing grid for docs"
    slug: pricing-grid
    tags: [pricing, grid, docs]

Run the command:

imgbin batch assets/jobs/launch.yaml

The batch job design supports failure isolation: items in the manifest are processed one by one, and a failure in one item does not affect the others. You can also preview the job with --dry-run without actually executing it.

And the best part is that it tells you exactly what succeeded and what failed. Unlike some things in life, where failure happens and you are left not even knowing how it happened.

annotate: AI annotation

Run AI recognition on existing images to automatically generate titles, tags, and descriptions:

# Annotate a single image
imgbin annotate ./library/2026-03/orange-dashboard

# Annotate an entire directory in batch
imgbin annotate ./library/2026-03/

thumbnail: thumbnail generation

Generate thumbnails for existing images:

# Generate a thumbnail
imgbin thumbnail ./library/2026-03/orange-dashboard

Batch Job Manifest Design

The manifest format for batch jobs supports flexible configuration. Defaults can be set globally, and individual jobs can override them:

# Global defaults
defaults:
  annotate: true        # Enable AI annotation by default
  thumbnail: true       # Generate thumbnails by default
  libraryRoot: ./library
  model: gpt-image-1.5

jobs:
  # Minimal configuration: only provide a prompt
  - prompt: "first image"

  # Full configuration
  - prompt: "second image"
    slug: custom-slug
    tags: [tag1, tag2]
    annotate: false     # Do not run AI annotation for this job
    model: dall-e-3    # Use a different model for this job
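
The override behavior can be sketched with an object spread. The JobOptions and Job types are hypothetical and simply mirror the manifest keys shown above; ImgBin's real merge logic may differ in the details.

```typescript
// Hypothetical option shape based on the manifest keys shown above.
type JobOptions = {
  annotate?: boolean;
  thumbnail?: boolean;
  libraryRoot?: string;
  model?: string;
};
type Job = JobOptions & { prompt: string; slug?: string; tags?: string[] };

// Per-job values override global defaults; keys absent from the job fall through.
function resolveJob(defaults: JobOptions, job: Job): Job {
  return { ...defaults, ...job };
}

const defaults: JobOptions = {
  annotate: true,
  thumbnail: true,
  libraryRoot: './library',
  model: 'gpt-image-1.5',
};
const first = resolveJob(defaults, { prompt: 'first image' });
const second = resolveJob(defaults, { prompt: 'second image', annotate: false, model: 'dall-e-3' });
console.log(first.model, second.model);
```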

When executed, ImgBin processes jobs one by one. The result of each job is written to its corresponding metadata.json. Even if one job fails, the others are unaffected. After all jobs complete, the CLI outputs a summary report:

✓ orange-dashboard (succeeded)
✓ pricing-grid (succeeded)
✗ hero-banner (failed: API rate limit exceeded)

2/3 succeeded, 1 failed

Some things cannot be rushed. Taking them one at a time is often the steadier path. Maybe that is the philosophy behind batch jobs.

Environment Variable Configuration

ImgBin supports flexible configuration through environment variables:

# ImgBin working directory
IMGBIN_WORKDIR=/path/to/imgbin

# Executable path (for invocation inside scripts)
IMGBIN_EXECUTABLE=/path/to/imgbin/dist/cli.js

# Asset library root
IMGBIN_LIBRARY_ROOT=./.imgbin-library

# Azure OpenAI configuration (if using the Azure provider)
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_OPENAI_API_KEY=***
AZURE_OPENAI_IMAGE_DEPLOYMENT=gpt-image-1

Configuration is one of those things that can feel both important and not that important at the same time. In the end, whatever feels comfortable and fits your workflow best is usually the right choice.

Implementation Notes

During implementation, we summarized a few key points:

Provider Interface Design

Interface definitions should be clear and complete, including input parameters, return values, and error handling. It is also a good idea to provide both synchronous and asynchronous invocation styles for different scenarios.

That is one small piece of hard-earned experience. Once an interface is set, nobody wants to keep changing it later.

Failure Handling Strategy

When one item fails in a batch job, the CLI should:

Write detailed error information to a separate log file.
Continue executing other jobs instead of interrupting the whole process.
Return a non-zero exit code at the end to indicate that some jobs failed.
Clearly display the execution result of every job in the summary report.
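
The checklist above can be sketched as a loop that isolates each failure and derives the exit code at the end. This is a simplified illustration: runBatch is a hypothetical helper, execution is synchronous for brevity, and writing the separate log file is left out.

```typescript
type JobResult = { slug: string; ok: boolean; error?: string };

// Runs every job even if earlier ones throw, then reports an exit code.
// `execute` is synchronous here for brevity; a real runner would likely await provider calls.
function runBatch(
  slugs: string[],
  execute: (slug: string) => void,
): { results: JobResult[]; exitCode: number } {
  const results: JobResult[] = [];
  for (const slug of slugs) {
    try {
      execute(slug);
      results.push({ slug, ok: true });
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      results.push({ slug, ok: false, error: message });
    }
  }
  const failed = results.filter((r) => !r.ok).length;
  // Non-zero exit code signals that at least one job failed.
  return { results, exitCode: failed > 0 ? 1 : 0 };
}

const outcome = runBatch(['orange-dashboard', 'pricing-grid', 'hero-banner'], (slug) => {
  if (slug === 'hero-banner') throw new Error('API rate limit exceeded');
});
for (const r of outcome.results) {
  console.log(r.ok ? `✓ ${r.slug} (succeeded)` : `✗ ${r.slug} (failed: ${r.error})`);
}
```

In a CI pipeline, the exit code is what lets the build step fail while the log still shows every job's outcome.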

Some failures are just failures. There is no point pretending otherwise. It is better to acknowledge them openly and then figure out how to solve them. The same logic applies to projects and to life.

Metadata Merge Strategy

Recognition results are written to the recognized section by default, while manually edited fields are marked in manual. Metadata updates follow an append-only strategy: unless --force is explicitly passed, existing manually curated results are not overwritten.

That point became clear too - some things, once overwritten, are just gone. It is often better to preserve them, because the record itself has value.

Directory Creation Atomicity

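Use fs.mkdir(path, { recursive: true }) to create asset directories. With recursive set, the call creates any missing parent directories and does not fail if the directory already exists, which removes the check-then-create race that concurrent jobs would otherwise hit.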

Maybe that is what security feels like - being stable when stability matters, moving fast when speed matters, and never getting stuck second-guessing.

Conclusion

As the core tool for image asset management in the HagiCode project, ImgBin solves our problems through the following design choices:

Unified entry point: the CLI covers generation, annotation, thumbnails, and all other core operations.
Metadata-driven: every asset has a complete metadata.json, enabling search and traceability.
Provider Adapter: flexible abstraction for external APIs, making testing and extension easier.
Batch job support: batch image generation can be automated within CI/CD workflows.

Everything else may have faded, but this approach really did end up proving useful.

This solution not only improves HagiCode’s own development efficiency, but also forms a reusable framework for image asset management. If you are building a similarly multi-component project, I believe ImgBin’s design ideas may give you some inspiration.

Youth is all about trying things and making a bit of a mess. If you never put yourself through that, how would you know what you are really capable of?

References

ImgBin technical proposal: https://github.com/HagiCode-org/site/tree/main/openspec/changes/archive/2026-03-10-imgbin-cli-tool
HagiCode official website: https://hagicode.com
HagiCode GitHub: https://github.com/HagiCode-org/site

Thank you for reading. If you found this article helpful, please click the like button below so more people can discover it.

This content was produced with AI-assisted collaboration, reviewed by me, and reflects my own views and position.

Author: newbe36524
Article link: https://docs.hagicode.com/blog/2026-03-13-imgbin-cli-tool-asset-management/

Primary profession management in hero settings

Mar 13, 2026

Hero settings now include a dedicated Primary Professions tab to toggle availability.
Enablement is persisted at the system level and gates hero availability in dungeon selection and status checks.
CLI detection surfaces availability and version; enablement toggles stay locked until the CLI is detected.

Practical Guide to Integrating CodeBuddy CLI into a C# Backend

Mar 12, 2026

This article walks through a complete approach to integrating CodeBuddy CLI into a C# backend project so you can deliver AI coding assistant capabilities end to end.

Background

In modern AI coding assistant development, a single AI Provider often cannot satisfy complex and changing development scenarios. HagiCode, as a multifunctional AI coding assistant, needs to support multiple AI Providers to deliver a better user experience. Users should have enough freedom to choose. In early 2026, the project faced a key decision: how to restore CodeBuddy ACP (Agent Communication Protocol) integration capabilities in the C# backend.

The project had previously implemented CodeBuddy integration, but the related code was removed during a refactor. There is not much to complain about there; during iterative development, something always gets left behind. The goal of this technical solution was to fully restore that capability and improve the architecture so it would be more robust and maintainable.

If you are also considering connecting multiple AI coding assistants to your own project, the approach below may give you some ideas. It reflects lessons we summarized after stepping into plenty of pitfalls, and maybe it can help you avoid a few detours.

About HagiCode

The approach shared in this article comes from our practical experience in the HagiCode project. HagiCode is an open-source AI coding assistant project that supports multiple AI Providers and cross-platform operation. To satisfy different user preferences, we need to switch flexibly among different AI coding assistants, which is exactly why we built the CodeBuddy integration described here.

HagiCode uses a modular design, with AI Providers implemented as pluggable components. This architecture lets us add new AI support easily without affecting existing features. When a design is done well up front, it saves a lot of trouble later. If you are interested in our technical architecture, you can view the full source code on GitHub.

Architecture Design

Layered Architecture Overview

The integration between C# and CodeBuddy uses a clear layered architecture. This design makes responsibilities explicit and makes long-term maintenance much easier:

┌────────────────────────────────────────────────────┐
│              Provider Contract Layer               │
│      AIProviderType enum + extension methods       │
├────────────────────────────────────────────────────┤
│               Provider Factory Layer               │
│   AIProviderFactory dependency injection factory   │
├────────────────────────────────────────────────────┤
│           Provider Implementation Layer            │
│    CodebuddyCliProvider concrete implementation    │
├────────────────────────────────────────────────────┤
│              ACP Infrastructure Layer              │
│       ACPSessionManager / StdioAcpTransport        │
│           AcpRpcClient / AcpAgentClient            │
└────────────────────────────────────────────────────┘

What are the benefits of this layering? Put simply, each layer stays out of the others’ way. If we later want to change the communication mechanism, for example from stdio to WebSocket, we only need to modify the bottom layer, and the business logic above it stays untouched. Nobody wants a communication change to ripple through the entire codebase.

Core Component Breakdown

The Provider contract layer is the foundation of the entire architecture. We define the AIProviderType enum, where CodebuddyCli = 3 is used as the enum value, and implement bidirectional mapping between strings and enums through extension methods. That allows strings in configuration files to be converted conveniently into enums, and enums to be converted back to strings for debugging output.

The Provider factory layer is responsible for creating the corresponding Provider instance based on configuration. It uses .NET dependency injection together with ActivatorUtilities.CreateInstance for dynamic creation. The advantage of the factory pattern is that adding a new Provider only requires a new creation branch in the factory; existing Providers and their callers stay untouched.

The Provider implementation layer is where the actual work happens. CodebuddyCliProvider implements the IAIProvider interface and provides two invocation modes: ExecuteAsync for non-streaming calls and StreamAsync for streaming calls.

The ACP infrastructure layer provides the communication foundation underneath. This layer handles all protocol details, including process management, message serialization, and response parsing. It is the foundation that keeps everything above it stable.

Communication Mechanism

Stdio Transport Mode

CodeBuddy uses Stdio (standard input/output) to communicate with external processes. The startup command is simple:

codebuddy --acp

After that, JSON-RPC messages are exchanged through standard input and output. This approach has several advantages:

Fast startup: local process communication avoids network latency
Simple configuration: you only need to specify the executable path
Environment isolation: each session runs in an independent process, so they do not affect one another

Environment variable injection is supported during communication. Common examples include:

CODEBUDDY_API_KEY: API key authentication
CODEBUDDY_INTERNET_ENVIRONMENT: network environment configuration

As with communication between people, it helps to choose a convenient channel first.
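To make the stdio exchange concrete, here is a minimal sketch of reading messages from a child process's standard output. It assumes each JSON-RPC message arrives as one line of JSON terminated by a newline, which is an assumption about the framing rather than something stated above. The sketch is in TypeScript purely for brevity, and LineDecoder is a hypothetical name, not part of HagiCode.

```typescript
// Assumption: one JSON-RPC message per line, terminated by "\n".
class LineDecoder {
  private buffer = "";

  // Feed a raw stdout chunk; returns every complete message it finished.
  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is "" or an incomplete line; keep it for the next chunk.
    this.buffer = lines.pop() ?? "";
    return lines
      .map((line) => line.trim())
      .filter((line) => line.length > 0)
      .map((line) => JSON.parse(line) as unknown);
  }
}

const decoder = new LineDecoder();
// A message split across two chunks, followed by a second complete message.
const first = decoder.push('{"jsonrpc":"2.0","id":1,"res');
const second = decoder.push('ult":{"content":"hi"}}\n{"jsonrpc":"2.0","id":2,"result":{}}\n');
console.log(first.length, second.length);
```

Buffering the incomplete tail matters because a pipe delivers bytes in arbitrary chunks, so a message boundary rarely lines up with a read boundary.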

Message Protocol

ACP is based on JSON-RPC 2.0. The message format looks roughly like this:

// Request message
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "agent/prompt",
  "params": {
    "prompt": "Help me write a sorting algorithm",
    "sessionId": "session-123"
  }
}

// Response message
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "content": "Here is the AI response..."
  }
}

In the real implementation, we encapsulate all of these protocol details so the upper business layer only needs to care about the prompt and response.
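As a rough illustration of what that encapsulation has to do, the sketch below builds a request and routes the matching response back to its caller by id, which is how JSON-RPC 2.0 pairs the two. RpcCorrelator and its methods are hypothetical names for this example (written in TypeScript for brevity); they are not the actual AcpRpcClient API.

```typescript
interface RpcRequest { jsonrpc: "2.0"; id: number; method: string; params: unknown }
interface RpcResponse { jsonrpc: "2.0"; id: number; result?: unknown; error?: { code: number; message: string } }
type ResponseHandler = (response: RpcResponse) => void;

class RpcCorrelator {
  private nextId = 1;
  private pending: Record<number, ResponseHandler | undefined> = {};

  // Build a request and remember which handler should receive its response.
  createRequest(method: string, params: unknown, onResponse: ResponseHandler): RpcRequest {
    const id = this.nextId++;
    this.pending[id] = onResponse;
    return { jsonrpc: "2.0", id, method, params };
  }

  // Route an incoming response to its caller; returns false for unknown ids.
  handleResponse(response: RpcResponse): boolean {
    const handler = this.pending[response.id];
    if (handler === undefined) return false;
    delete this.pending[response.id];
    handler(response);
    return true;
  }
}

const rpc = new RpcCorrelator();
const received: unknown[] = [];
const promptRequest = rpc.createRequest(
  "agent/prompt",
  { prompt: "Help me write a sorting algorithm", sessionId: "session-123" },
  (response) => { received.push(response.result); },
);
const matched = rpc.handleResponse({ jsonrpc: "2.0", id: promptRequest.id, result: { content: "Here is the AI response..." } });
const unmatched = rpc.handleResponse({ jsonrpc: "2.0", id: 99, result: {} });
console.log(matched, unmatched);
```

Removing the handler once it has fired keeps the pending table from growing over a long-lived session.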

Core Implementation

1. Restore the Provider Contract

First, restore the CodeBuddy type in the enum file:

public enum AIProviderType
{
    ClaudeCodeCli = 0,
    CodexCli = 1,
    GitHubCopilot = 2,
    CodebuddyCli = 3,  // Restore this enum value
    OpenCodeCli = 4,
    IFlowCli = 5,
}

Then add string mapping in the extension methods so the configuration file can specify the Provider by string:

private static readonly Dictionary<string, AIProviderType> _typeMap = new(
    StringComparer.OrdinalIgnoreCase)
{
    ["CodebuddyCli"] = AIProviderType.CodebuddyCli,
    // The comparer is case-insensitive, so "codebuddy" resolves through this entry too
    ["Codebuddy"] = AIProviderType.CodebuddyCli,
    // ... Mappings for other providers
};

2. Integrate the Provider Factory

Add a CodeBuddy creation branch in the factory class:

private IAIProvider? CreateProvider(AIProviderType providerType, ProviderConfiguration config)
{
    return providerType switch
    {
        AIProviderType.CodebuddyCli =>
            ActivatorUtilities.CreateInstance<CodebuddyCliProvider>(
                _serviceProvider,
                Options.Create(config)),
        // ... Other providers
        _ => throw new NotSupportedException($"Provider {providerType} not supported")
    };
}

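ActivatorUtilities.CreateInstance resolves constructor parameters such as the logger and the session manager from the service provider, and uses the explicitly supplied IOptions<ProviderConfiguration> for the remaining one, so the factory does not have to wire each dependency by hand.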

3. Complete Provider Implementation

Below is the core implementation of CodebuddyCliProvider, covering both streaming and non-streaming invocation modes:

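using System.Runtime.CompilerServices; // [EnumeratorCancellation]
using System.Text;                     // StringBuilder
using Microsoft.Extensions.Logging;    // ILogger<T>
using Microsoft.Extensions.Options;    // IOptions<T>

public class CodebuddyCliProvider : IAIProvider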
{
    private readonly ILogger<CodebuddyCliProvider> _logger;
    private readonly IACPSessionManager _sessionManager;
    private readonly ProviderConfiguration _config;

    public string Name => "CodebuddyCli";
    public bool SupportsStreaming => true;
    public ProviderCapabilities Capabilities { get; }

    public CodebuddyCliProvider(
        ILogger<CodebuddyCliProvider> logger,
        IACPSessionManager sessionManager,
        IOptions<ProviderConfiguration> config)
    {
        _logger = logger;
        _sessionManager = sessionManager;
        _config = config.Value;

        // Define the capabilities of the current Provider
        Capabilities = new ProviderCapabilities
        {
            SupportsStreaming = true,
            SupportsTools = true,
            SupportsSystemMessages = true,
            SupportsArtifacts = false,
            MaxTokens = 8192
        };
    }

    // Non-streaming call: return all results together after completion
    public async Task<AIResponse> ExecuteAsync(
        AIRequest request,
        CancellationToken cancellationToken = default)
    {
        // Create an independent session for the request
        var session = await _sessionManager.CreateSessionAsync(
            "CodebuddyCli",
            request.WorkingDirectory,
            cancellationToken,
            request.SessionId);

        try
        {
            var fullPrompt = BuildPrompt(request);
            await session.SendPromptAsync(fullPrompt, cancellationToken);

            var responseBuilder = new StringBuilder();
            var toolCalls = new List<AIToolCall>();

            // Collect all response chunks
            await foreach (var chunk in StreamFromSession(session, cancellationToken))
            {
                if (!string.IsNullOrEmpty(chunk.Content))
                {
                    responseBuilder.Append(chunk.Content);
                }
                // Handle tool calls...
            }

            return new AIResponse
            {
                Content = AIResultContentSanitizer.SanitizeResultContent(
                    responseBuilder.ToString()),
                ToolCalls = toolCalls,
                Provider = Name,
                Model = string.Empty
            };
        }
        finally
        {
            // Release session resources
            await session.DisposeAsync();
        }
    }

    // Streaming call: return response chunks in real time
    public async IAsyncEnumerable<AIStreamingChunk> StreamAsync(
        AIRequest request,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        var session = await _sessionManager.CreateSessionAsync(
            "CodebuddyCli",
            request.WorkingDirectory,
            cancellationToken);

        try
        {
            var fullPrompt = BuildPrompt(request);
            await session.SendPromptAsync(fullPrompt, cancellationToken);

            await foreach (var chunk in StreamFromSession(session, cancellationToken))
            {
                yield return chunk;
            }
        }
        finally
        {
            await session.DisposeAsync();
        }
    }

    private async IAsyncEnumerable<AIStreamingChunk> StreamFromSession(
        IACPSession session,
        [EnumeratorCancellation] CancellationToken cancellationToken)
    {
        // Iterate through all updates in the session
        await foreach (var notification in session.ReceiveUpdatesAsync(cancellationToken))
        {
            switch (notification.Update)
            {
                case AgentMessageChunkSessionUpdate agentMessage:
                    // Handle text content chunks
                    if (agentMessage.Content is AcpImp.TextContentBlock textContent)
                    {
                        yield return new AIStreamingChunk
                        {
                            Content = textContent.Text,
                            Type = StreamingChunkType.ContentDelta,
                            IsComplete = false
                        };
                    }
                    break;

                case ToolCallSessionUpdate toolCall:
                    // Handle tool calls
                    yield return new AIStreamingChunk
                    {
                        Content = string.Empty,
                        Type = StreamingChunkType.ToolCallDelta,
                        ToolCallDelta = new AIToolCallDelta
                        {
                            Id = toolCall.ToolCallId,
                            Name = toolCall.Kind.ToString(),
                            Arguments = toolCall.RawInput?.ToString()
                        }
                    };
                    break;

                case AcpImp.PromptCompletedSessionUpdate:
                    // Response complete
                    yield break;
            }
        }
    }

    // Build the full prompt
    private string BuildPrompt(AIRequest request, string? embeddedCommandPrompt = null)
    {
        var sb = new StringBuilder();

        // Embedded command prompt, if present
        if (!string.IsNullOrEmpty(embeddedCommandPrompt))
        {
            sb.AppendLine(embeddedCommandPrompt);
            sb.AppendLine();
        }

        // System message
        if (!string.IsNullOrEmpty(request.SystemMessage))
        {
            sb.AppendLine(request.SystemMessage);
            sb.AppendLine();
        }

        // User prompt
        sb.Append(request.Prompt);
        return sb.ToString();
    }
}

There are several key points in this code:

Session management: each request creates an independent session and releases resources after the request completes. This is a lesson learned through trial and error. If session reuse is not handled well, state pollution appears easily.
Streaming processing: IAsyncEnumerable allows the response to be returned while it is still being generated, instead of waiting for all content to finish. This is especially important for long-text scenarios and significantly improves the user experience.
Tool calls: CodeBuddy supports tool calling (Function Calling), handled through ToolCallSessionUpdate. This capability is critical for complex code editing tasks.
Content filtering: AIResultContentSanitizer strips think-block (model reasoning) content from the result so that only the final answer reaches the user.
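The content-filtering step can be sketched as follows. This is not the real AIResultContentSanitizer: it assumes reasoning is wrapped in `<think>...</think>` tags, which is a guess at the format, and stripThinkBlocks is a hypothetical name. It is written in TypeScript only to keep the example short.

```typescript
// Hypothetical stand-in for AIResultContentSanitizer.
// Assumption: reasoning is delimited by <think>...</think> tags.
function stripThinkBlocks(content: string): string {
  return content
    .replace(/<think>[\s\S]*?<\/think>/gi, "") // drop closed think blocks (non-greedy)
    .replace(/<think>[\s\S]*$/i, "")            // drop an unterminated trailing block
    .replace(/\n{3,}/g, "\n\n")                 // collapse blank-line runs left behind
    .trim();
}

const rawOutput = "<think>Plan: use quicksort.</think>\nHere is the code.\n<think>unfinished";
console.log(stripThinkBlocks(rawOutput));
```

Handling the unterminated case separately is worthwhile when output can be cut off mid-stream, since a dangling opening tag would otherwise leak reasoning text into the final answer.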

4. Dependency Injection Configuration

Add the related services during module registration:

public void ConfigureModule(IServiceCollection services)
{
    // Register Provider
    services.AddTransient<CodebuddyCliProvider>();

    // Register ACP infrastructure
    services.AddSingleton<IACPSessionManager, ACPSessionManager>();
    services.AddSingleton<IAcpPlatformConfigurationResolver, AcpPlatformConfigurationResolver>();
    services.AddSingleton<IAIRequestToAcpMapper, AIRequestToAcpMapper>();
    services.AddSingleton<IAcpToAIResponseMapper, AcpToAIResponseMapper>();
}

Configuration Example

Configuration File

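Add CodeBuddy-related configuration to the application settings. The keys are shown in YAML form here; in appsettings.json the same hierarchy is written as nested JSON objects: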

AI:
  # Default Provider to use
  DefaultProvider: "CodebuddyCli"

  # Provider configuration
  Providers:
    CodebuddyCli:
      Type: "CodebuddyCli"
      WorkingDirectory: "C:/projects/my-app"
      ExecutablePath: "C:/tools/codebuddy.cmd"

  # Platform-specific configuration
  PlatformConfigurations:
    CodebuddyCli:
      ExecutablePath: "C:/tools/codebuddy.cmd"
      Arguments: "--acp"
      StartupTimeoutMs: 5000
      EnvironmentVariables:
        CODEBUDDY_API_KEY: "${CODEBUDDY_API_KEY}"
        CODEBUDDY_INTERNET_ENVIRONMENT: "production"
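The `${CODEBUDDY_API_KEY}` value above is a placeholder that has to be resolved before the process starts. How HagiCode resolves it is not shown here, so the following is only a sketch of one way to do it; expandPlaceholders is a hypothetical helper, written in TypeScript for brevity.

```typescript
// Hypothetical helper: replaces ${NAME} placeholders with values from `env`.
function expandPlaceholders(
  variables: Record<string, string>,
  env: Record<string, string | undefined>,
): Record<string, string> {
  const expanded: Record<string, string> = {};
  for (const key of Object.keys(variables)) {
    const template = variables[key] ?? "";
    expanded[key] = template.replace(
      /\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g,
      (_match: string, variableName: string) => {
        const value = env[variableName];
        if (value === undefined) {
          // Fail early instead of starting the CLI with a literal "${...}" value.
          throw new Error("Environment variable " + variableName + " is not set (needed by " + key + ")");
        }
        return value;
      },
    );
  }
  return expanded;
}

const resolvedVariables = expandPlaceholders(
  { CODEBUDDY_API_KEY: "${CODEBUDDY_API_KEY}", CODEBUDDY_INTERNET_ENVIRONMENT: "production" },
  { CODEBUDDY_API_KEY: "example-key" },
);
console.log(resolvedVariables.CODEBUDDY_INTERNET_ENVIRONMENT);
```

Throwing on a missing variable turns a confusing authentication failure at request time into a clear configuration error at startup.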

Configuration Model

The corresponding configuration model definition:

public class CodebuddyPlatformConfiguration : IAcpPlatformConfiguration
{
    public string ProviderName => "CodebuddyCli";
    public AcpTransportType TransportType => AcpTransportType.Stdio;

    public string ExecutablePath { get; set; } = "codebuddy";
    public string Arguments { get; set; } = "--acp";
    public int StartupTimeoutMs { get; set; } = 5000;

    public Dictionary<string, string?>? EnvironmentVariables { get; set; }
}

Practical Lessons Learned

Pitfall Log

We ran into several typical pitfalls during implementation, and sharing them here may help others avoid the same detours:

Session leak issue: at first, sessions were not released correctly, which exhausted process resources. The solution was to use try-finally to ensure resources are released for every request.
Environment variable passing: Windows and Linux shells reference environment variables differently (%VAR% versus $VAR), so we pass variables as a Dictionary<string, string?> that is applied to the child process directly instead of relying on shell syntax.
Timeout configuration: CLI startup takes time, so we set a 5-second startup timeout so that the first request does not fail while the process is still starting.
Encoding issues: on Windows, the default encoding may cause garbled Chinese text, so UTF-8 encoding is explicitly specified when starting the process.

Performance Optimization

Session pool: for frequent short requests, consider implementing a session pool to reuse processes
Provider cache: the factory class already supports caching Provider instances
Async first: use asynchronous programming throughout to avoid blocking threads
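The session pool idea could look roughly like the sketch below. It is not part of the current implementation; SessionPool and PooledSession are hypothetical names, and the example is in TypeScript for brevity. The reset step addresses the state-pollution risk mentioned earlier for session reuse.

```typescript
// Hypothetical pooled resource: stands in for an ACP session backed by a CLI process.
interface PooledSession {
  id: number;
  reset(): void; // must clear conversation state before the session is reused
}

class SessionPool<T extends PooledSession> {
  private idle: T[] = [];
  private created = 0;

  constructor(private readonly create: (id: number) => T, private readonly maxIdle: number) {}

  // Reuse an idle session when one exists; otherwise start a new one.
  acquire(): T {
    const reused = this.idle.pop();
    if (reused !== undefined) return reused;
    this.created += 1;
    return this.create(this.created);
  }

  // Reset and park the session. Returns false when the pool is already full,
  // in which case the caller should dispose the session instead.
  release(session: T): boolean {
    if (this.idle.length >= this.maxIdle) return false;
    session.reset();
    this.idle.push(session);
    return true;
  }

  totalCreated(): number {
    return this.created;
  }
}

const resetLog: number[] = [];
const pool = new SessionPool<PooledSession>(
  (id) => ({ id, reset: () => { resetLog.push(id); } }),
  1,
);
const sessionA = pool.acquire(); // creates session 1
pool.release(sessionA);          // reset and parked
const sessionB = pool.acquire(); // reuses session 1
const sessionC = pool.acquire(); // pool is empty, so session 2 is created
console.log(pool.totalCreated());
```

Capping the number of idle sessions bounds how many CLI processes stay alive between requests.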

Performance is always worth optimizing. The longer users wait, the worse the experience becomes.

Conclusion

This article introduced a complete solution for integrating CodeBuddy CLI into a C# backend, covering the entire process from architecture design to concrete implementation. Through a layered architecture, we separate protocol details from business logic, making the code clearer and easier to maintain.

Key takeaways:

Use a layered architecture with a Provider contract layer, factory layer, implementation layer, and infrastructure layer
Use JSON-RPC over Stdio for inter-process communication
Implement flexible configuration and extensibility through dependency injection
Provide both streaming and non-streaming invocation modes

This approach is not only suitable for CodeBuddy; adding new AI Providers can follow the same pattern. If you are also building a similar multi-AI-Provider integration, I hope this article gives you a useful reference.

References

If this article helped you:

Give us a Star on GitHub: github.com/HagiCode-org/site
Visit the official website to learn more: hagicode.com
Watch the 30-minute practical demo: www.bilibili.com/video/BV1pirZBuEzq/
Try one-click installation: docs.hagicode.com/installation/docker-compose
Quick installation for the Desktop app: hagicode.com/desktop/
Public beta has started; feel free to install and try it out

Practical Multi-AI Provider Architecture in the HagiCode Platform

Mar 11, 2026

Practical Multi-AI Provider Architecture in the HagiCode Platform

This article shares the technical approach we used under the Orleans Grain architecture to integrate two AI tools, iflow and OpenCode, through a unified IAIProvider interface, and compares the implementation differences between WebSocket and HTTP communication in detail.

Background

The motivation was practical. While building HagiCode, we found that users wanted to work with different AI tools. That is hardly surprising, since everyone has their own habits. Some prefer Claude Code, some love GitHub Copilot, and some teams use tools they developed themselves.

Our initial solution was simple and direct: write dedicated integration code for each AI tool. But the drawbacks showed up quickly. The codebase filled up with if-else branches, every change required testing in multiple places, and every new tool meant writing another pile of logic from scratch.

Later, I realized it would be better to create a unified IAIProvider interface and abstract the capabilities shared by all AI providers. That way, no matter which tool is used underneath, the upper layers can call it in the same way.

Recently, the project needed to integrate two new tools: iflow and OpenCode. Both support the ACP protocol, but their communication styles are different. iflow uses WebSocket, while OpenCode uses an HTTP API. That became a useful architectural test: adapt two different transport modes behind one unified interface.

About HagiCode

The approach shared in this article comes from our practical experience in the HagiCode project. HagiCode is an AI-assisted development platform built on the Orleans Grain architecture. It integrates with different AI providers through a unified IAIProvider interface, allowing users to flexibly choose the AI tools they prefer.

Architecture Design

Unified Interface Abstraction

First, we defined the IAIProvider interface and abstracted the capabilities that every AI provider needs to implement:

public interface IAIProvider
{
    string Name { get; }
    bool SupportsStreaming { get; }
    ProviderCapabilities Capabilities { get; }

    Task<AIResponse> ExecuteAsync(AIRequest request, CancellationToken cancellationToken = default);
    IAsyncEnumerable<AIStreamingChunk> StreamAsync(AIRequest request, CancellationToken cancellationToken = default);
    Task<ProviderTestResult> PingAsync(CancellationToken cancellationToken = default);
    IAsyncEnumerable<AIStreamingChunk> SendMessageAsync(AIRequest request, string? embeddedCommandPrompt = null, CancellationToken cancellationToken = default);
}

This interface includes several key methods:

ExecuteAsync: execute a one-shot AI request
StreamAsync: get streaming responses for real-time display
PingAsync: perform a health check to verify whether the provider is available
SendMessageAsync: send a message with support for embedded commands

IFlowCliProvider: A WebSocket-Based Implementation

iflow uses WebSocket for ACP communication. The overall architecture looks like this:

IFlowCliProvider → ACPSessionManager → WebSocketAcpTransport → iflow CLI
                ↓
         Dynamic port allocation + process management

The core flow is also fairly straightforward:

ACPSessionManager creates and manages ACP sessions.
WebSocketAcpTransport handles WebSocket communication.
A port is allocated dynamically, and the iflow process is started with iflow --experimental-acp --port.
IAIRequestToAcpMapper and IAcpToAIResponseMapper convert requests and responses.

Here is the core code:

private async IAsyncEnumerable<AIStreamingChunk> StreamCoreAsync(
    AIRequest request,
    string? embeddedCommandPrompt,
    [EnumeratorCancellation] CancellationToken cancellationToken)
{
    // Resolve working directory
    var resolvedWorkingDirectory = ResolveWorkingDirectory(request);
    var effectiveRequest = ApplyEmbeddedCommandPrompt(request, embeddedCommandPrompt);

    // Create ACP session
    await using var session = await _sessionManager.CreateSessionAsync(
        Name,
        resolvedWorkingDirectory,
        cancellationToken,
        request.SessionId);

    // Send prompt
    var prompt = _requestMapper.ToPromptString(effectiveRequest);
    var promptResponse = await session.SendPromptAsync(prompt, cancellationToken);

    // Receive streaming response
    await foreach (var notification in session.ReceiveUpdatesAsync(cancellationToken))
    {
        if (_responseMapper.TryConvertToStreamingChunk(notification, out var chunk))
        {
            if (chunk.Type == StreamingChunkType.Metadata && chunk.IsComplete)
            {
                yield return chunk;
                yield break;
            }
            yield return chunk;
        }
    }
}

There are a few design points worth calling out here:

Use await using to ensure the session is released correctly and avoid resource leaks.
Return streaming responses through IAsyncEnumerable, which naturally supports async streams.
Use Metadata chunks to determine completion and ensure the full response has been received.

OpenCodeCliProvider: An HTTP API-Based Implementation

OpenCode provides its service through an HTTP API, so the architecture is slightly different:

OpenCodeCliProvider → OpenCodeRuntimeManager → OpenCodeClient → OpenCode HTTP API
                      ↓
                OpenCodeProcessManager → opencode process management

A notable feature of the OpenCode integration is that session bindings are persisted in an SQLite database. That makes session recovery and prompt-response recovery possible:

private async Task<OpenCodePromptExecutionResult> ExecutePromptAsync(
    AIRequest request,
    string? embeddedCommandPrompt,
    CancellationToken cancellationToken)
{
    var prompt = BuildPrompt(request, embeddedCommandPrompt);
    var resolvedWorkingDirectory = ResolveWorkingDirectory(request.WorkingDirectory);
    var client = await _runtimeManager.GetClientAsync(resolvedWorkingDirectory, cancellationToken);
    var bindingSessionId = request.SessionId;
    var boundSession = TryGetBinding(bindingSessionId, resolvedWorkingDirectory);

    // Try to use the already bound session
    if (boundSession is not null)
    {
        try
        {
            return await PromptSessionAsync(
                client,
                boundSession,
                BuildPromptRequest(request, prompt, CreatePromptMessageId()),
                request.Model ?? _settings.Model,
                cancellationToken);
        }
        catch (OpenCodeApiException ex) when (IsStaleBinding(ex))
        {
            // The session has expired, remove the binding
            RemoveBinding(bindingSessionId);
        }
    }

    // Create a new session
    var session = await client.Session.CreateAsync(new OpenCodeSessionCreateRequest
    {
        Title = BuildSessionTitle(request)
    }, cancellationToken);

    BindSession(bindingSessionId, session.Id, resolvedWorkingDirectory);
    return await PromptSessionAsync(client, session.Id, ...);
}

This implementation has several interesting highlights:

Session binding mechanism: the same SessionId reuses the same OpenCode session, avoiding repeated session creation.
Expiration handling: when a session is found to be expired, the binding is automatically cleaned up.
Database persistence: bindings are stored in SQLite and remain effective after restart.
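The bind, reuse, and rebind-on-expiry flow can be isolated into a small sketch. Everything here is hypothetical scaffolding written in TypeScript for brevity: BindingStore stands in for the SQLite-backed store, SessionBackend for the OpenCode client, and a thrown `{ stale: true }` object for the stale-session API error.

```typescript
interface BindingStore {
  get(key: string): string | undefined;
  set(key: string, sessionId: string): void;
  remove(key: string): void;
}

interface SessionBackend {
  createSession(): string;                           // returns a new backend session id
  prompt(sessionId: string, prompt: string): string; // throws { stale: true } when the session is gone
}

function isStaleBinding(error: unknown): boolean {
  return typeof error === "object" && error !== null && (error as { stale?: boolean }).stale === true;
}

function promptWithBinding(key: string, prompt: string, store: BindingStore, backend: SessionBackend): string {
  const bound = store.get(key);
  if (bound !== undefined) {
    try {
      return backend.prompt(bound, prompt);
    } catch (error) {
      if (!isStaleBinding(error)) throw error; // unrelated failures still surface
      store.remove(key);                       // stale binding: fall through and rebind
    }
  }
  const sessionId = backend.createSession();
  store.set(key, sessionId);
  return backend.prompt(sessionId, prompt);
}

// In-memory stand-ins; the binding for "chat-1" points at an expired session.
const bindings: Record<string, string> = { "chat-1": "expired-session" };
const store: BindingStore = {
  get: (key) => bindings[key],
  set: (key, sessionId) => { bindings[key] = sessionId; },
  remove: (key) => { delete bindings[key]; },
};
let createdSessions = 0;
const backend: SessionBackend = {
  createSession: () => { createdSessions += 1; return "session-" + createdSessions; },
  prompt: (sessionId, prompt) => {
    if (sessionId === "expired-session") throw { stale: true };
    return sessionId + " answered: " + prompt;
  },
};

const firstReply = promptWithBinding("chat-1", "hello", store, backend);  // rebinds to a new session
const secondReply = promptWithBinding("chat-1", "again", store, backend); // reuses that session
console.log(firstReply, "|", secondReply);
```

Filtering on the stale condition is the important part: treating every failure as an expired session would hide real errors behind silently created sessions.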

Comparing the Two Approaches

Aspect	IFlowCliProvider	OpenCodeCliProvider
Communication	WebSocket (ACP)	HTTP API
Process management	ACPSessionManager	OpenCodeProcessManager
Port allocation	Dynamic port	Configured through BaseUrl (for example, localhost:38376)
Session management	ACPSession	OpenCodeSession
Persistence	In-memory cache	SQLite database
Startup command	`iflow --experimental-acp --port`	`opencode`
Latency	Lower (long-lived connection)	Relatively higher (HTTP requests)

Which approach you choose depends mainly on your needs. WebSocket is better for scenarios with high real-time requirements, while an HTTP API is simpler and easier to debug.

Practical Guide

Configure Providers

First, enable the two providers in the configuration file:

AI:
  Providers:
    IFlowCli:
      Type: "IFlowCli"
      Enabled: true
      ExecutablePath: "iflow"
      Model: null
      WorkingDirectory: null
    OpenCodeCli:
      Type: "OpenCodeCli"
      Enabled: true
      ExecutablePath: "opencode"
      Model: "anthropic/claude-sonnet-4"
      WorkingDirectory: null

OpenCode:
  Enabled: true
  BaseUrl: "http://localhost:38376"
  ExecutablePath: "opencode"
  StartupTimeoutSeconds: 30
  RequestTimeoutSeconds: 120

Use IFlowCliProvider

// Get provider through the factory
var provider = await _providerFactory.GetProviderAsync(AIProviderType.IFlowCli);

// Execute an AI request
var request = new AIRequest
{
    Prompt = "Please help me refactor this function",
    WorkingDirectory = "/path/to/project",
    Model = "claude-sonnet-4"
};

// Get the complete response
var response = await provider.ExecuteAsync(request, cancellationToken);
Console.WriteLine(response.Content);

// Or use streaming responses
await foreach (var chunk in provider.StreamAsync(request, cancellationToken))
{
    if (chunk.Type == StreamingChunkType.ContentDelta)
    {
        Console.Write(chunk.Content);
    }
}

Use OpenCodeCliProvider

// Get provider through the factory
var provider = await _providerFactory.GetProviderAsync(AIProviderType.OpenCodeCli);

var request = new AIRequest
{
    Prompt = "Please help me analyze this error",
    WorkingDirectory = "/path/to/project",
    Model = "anthropic/claude-sonnet-4"
};

var response = await provider.ExecuteAsync(request, cancellationToken);
Console.WriteLine(response.Content);

Health Checks

Before startup or before use, you can check whether the provider is available:

var iflowResult = await iflowProvider.PingAsync(cancellationToken);
if (!iflowResult.Success)
{
    Console.WriteLine($"IFlow is unavailable: {iflowResult.ErrorMessage}");
    return;
}

var openCodeResult = await openCodeProvider.PingAsync(cancellationToken);
if (!openCodeResult.Success)
{
    Console.WriteLine($"OpenCode is unavailable: {openCodeResult.ErrorMessage}");
    return;
}

Embedded Command Support

Both providers support embedded commands, such as /file:xxx:

var request = new AIRequest
{
    Prompt = "Analyze the problems in this file",
    SystemMessage = "You are a code analysis expert"
};

await foreach (var chunk in provider.SendMessageAsync(
    request,
    embeddedCommandPrompt: "/file:src/main.cs",
    cancellationToken))
{
    Console.Write(chunk.Content);
}

Notes and Best Practices

Resource Management

IFlow uses long-lived WebSocket connections, so resource management deserves special attention:

Use await using to ensure sessions are released properly.
Cancellation triggers process cleanup.
ACPSessionManager supports a maximum session count limit.

OpenCode process management is relatively simpler, and OpenCodeRuntimeManager handles it automatically.

Error Handling

The two providers report errors through different channels:

IFlow errors are propagated through ACP session updates.
OpenCode errors are thrown as OpenCodeApiException.
Callers should catch and handle these errors rather than assume a request succeeded.

Performance Considerations

IFlow WebSocket communication has lower latency than HTTP.
OpenCode session reuse can reduce the overhead of HTTP requests.
The factory cache mechanism avoids repeatedly creating providers.
In high-concurrency scenarios, pay close attention to the limits on process count and connection count.

Configuration Validation

The executable path is validated at startup, but runtime issues can still happen. PingAsync is a useful tool for verifying whether the configuration is correct:

// Check at startup
var provider = await _providerFactory.GetProviderAsync(providerType);
var result = await provider.PingAsync(cancellationToken);
if (!result.Success)
{
    _logger.LogError("Provider {ProviderType} is unavailable: {Error}", providerType, result.ErrorMessage);
}

Summary

This article shares the technical approach used by the HagiCode platform when integrating the two AI tools iflow and OpenCode. Through a unified IAIProvider interface, we adapted different communication styles, WebSocket and HTTP, while keeping the upper-layer calling pattern consistent.

The core idea is actually quite simple:

Define a unified interface abstraction.
Build adapter layers for different implementations.
Manage everything uniformly through the factory pattern.

That gives the system good extensibility. When a new AI tool needs to be integrated later, all we need to do is implement the IAIProvider interface without changing too much existing code.

If you are also working on multi-AI-tool integration, I hope this article is helpful.

References

HagiCode GitHub: github.com/HagiCode-org/site
HagiCode official website: hagicode.com
HagiCode Installation Guide: docs.hagicode.com/installation
ACP protocol specification: github.com/modelcontextprotocol/specification
Orleans documentation: learn.microsoft.com/dotnet/orleans

Complete Guide to Codex SDK Console Message Parsing

Mar 10, 2026

Complete Guide to Codex SDK Console Message Parsing

This article explains the Codex SDK event stream mechanism, message type parsing, and best practices in real projects, helping developers quickly master the core skills behind AI execution services.

Background

When building an AI execution service based on the Codex SDK, we inevitably run into a practical question: how should we handle the streamed event messages returned by Codex? These messages contain important information such as execution status, output content, and error details, so they deserve careful handling.

As part of the HagiCode project, we needed a reliable executor for AI coding assistant scenarios. That is exactly why we decided to study the Codex SDK event stream mechanism in depth. After all, only by understanding how the underlying messages work can we build a truly enterprise-grade AI execution platform.

The Codex SDK is a programming-assistance SDK released by OpenAI. It returns execution results through an Event Stream. Unlike the traditional request-response model, Codex uses streamed events so that we can:

Get execution progress in real time
Handle errors promptly
Obtain detailed token usage statistics
Support long-running complex tasks

Understanding these event types and parsing them correctly is essential for implementing a fully capable AI executor. In the end, nobody wants to work with a black box.

About HagiCode

The solution shared in this article comes from our practical experience in the HagiCode project. HagiCode is an open-source AI coding assistant project dedicated to providing developers with intelligent coding support. During development, we needed to build a reliable AI execution service to handle user code execution requests, which is the direct reason we introduced the Codex SDK.

As an AI coding assistant, HagiCode needs to deal with a variety of complex code execution scenarios: getting execution progress in real time, handling errors promptly, and collecting detailed token usage statistics. By deeply understanding the Codex SDK event stream mechanism, we can build an executor that meets production environment requirements. Ultimately, whether it is software or real life, everything benefits from steady accumulation and refinement.

Event Stream Mechanism

Basic Concepts

The thread.runStreamed() method resolves to an object whose events property is an asynchronous iterator over the events of that turn:

import { Codex } from '@openai/codex-sdk';

const client = new Codex({
  apiKey: process.env.CODEX_API_KEY,
  baseUrl: process.env.CODEX_BASE_URL,
});

const thread = client.startThread({
  workingDirectory: '/path/to/project',
  skipGitRepoCheck: false,
});

const { events } = await thread.runStreamed('your prompt here', {
  outputSchema: {
    type: 'object',
    properties: {
      output: { type: 'string' },
      status: { type: 'string', enum: ['ok', 'action_required'] },
    },
    required: ['output', 'status'],
  },
});

for await (const event of events) {
  // Handle each event
}

Detailed Explanation of Event Types

Event Type	Description	Key Data
`thread.started`	Thread started successfully	`thread_id`
`item.updated`	Message content updated	`item.text`
`item.completed`	Message completed	`item.text`
`turn.completed`	Execution completed	`usage` (token usage)
`turn.failed`	Execution failed	`error.message`
`error`	Error event	`message`

In real projects, HagiCode’s executor component is built on top of these event types. We need to handle each kind of event carefully to ensure a smooth user experience. Good systems are built by taking details seriously.
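To make the table concrete, here is a minimal sketch of the event shapes as a discriminated union, together with an exhaustive handler. The type names (SketchEvent, SketchItem, SketchUsage) and the usage field names are our own assumptions for illustration; they are simplified stand-ins, not the type definitions exported by the SDK.

```typescript
// Simplified event shapes based on the table above.
// Assumption: these are illustrative stand-ins, not the SDK's exported types.
type SketchUsage = { input_tokens: number; output_tokens: number };

type SketchItem =
  | { type: 'agent_message'; text: string }
  | { type: 'other' };

type SketchEvent =
  | { type: 'thread.started'; thread_id: string }
  | { type: 'item.updated'; item: SketchItem }
  | { type: 'item.completed'; item: SketchItem }
  | { type: 'turn.completed'; usage: SketchUsage }
  | { type: 'turn.failed'; error: { message: string } }
  | { type: 'error'; message: string };

// Summarize a single event. The `never` assignment in the default branch
// makes the compiler complain if a new event type is added but not handled.
function summarizeEvent(event: SketchEvent): string {
  switch (event.type) {
    case 'thread.started':
      return `thread ${event.thread_id} started`;
    case 'item.updated':
    case 'item.completed':
      return event.item.type === 'agent_message'
        ? `message: ${event.item.text}`
        : 'non-message item';
    case 'turn.completed':
      return `done (${event.usage.input_tokens} in, ${event.usage.output_tokens} out)`;
    case 'turn.failed':
      return `failed: ${event.error.message}`;
    case 'error':
      return `error: ${event.message}`;
    default: {
      const unreachable: never = event;
      return unreachable;
    }
  }
}
```

Switching on the type field lets TypeScript narrow each branch, so reading event.thread_id or event.usage is only possible where that field actually exists.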

Message Parsing Implementation

Extracting Message Content

Message content is extracted through an event handler:

private handleThreadEvent(event: ThreadEvent, onMessage: (content: string) => void): void {
  // Only handle message update and completion events
  if (event.type !== 'item.updated' && event.type !== 'item.completed') {
    return;
  }

  // Only handle agent message content
  if (event.item.type !== 'agent_message') {
    return;
  }

  // Extract text content
  onMessage(event.item.text);
}

Key points:

Only handle item.updated and item.completed events
Only handle content of type agent_message
The message content is in the event.item.text field

Structured Output Parsing

Codex supports JSON structured output. You can specify the return format through the outputSchema parameter:

const DEFAULT_OUTPUT_SCHEMA = {
  type: 'object',
  properties: {
    output: { type: 'string' },
    status: { type: 'string', enum: ['ok', 'action_required'] },
  },
  required: ['output', 'status'],
  additionalProperties: false,
} as const;

The parsing function attempts to parse JSON, and if that fails it falls back to the raw text.

function toStructuredOutput(raw: string): StructuredOutput {
  try {
    const parsed = JSON.parse(raw) as Partial<StructuredOutput>;
    if (typeof parsed.output === 'string') {
      return {
        output: parsed.output,
        status: parsed.status === 'action_required' ? 'action_required' : 'ok',
      };
    }
  } catch {
    // JSON parsing failed, fall back to the raw text
  }

  return {
    output: raw,
    status: 'ok',
  };
}
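The StructuredOutput type itself is not shown above. The sketch below assumes it mirrors DEFAULT_OUTPUT_SCHEMA and pairs it with a runtime type guard. Note that parseOrFallback and isStructuredOutput are hypothetical names, and that this variant is stricter than toStructuredOutput: it falls back to the raw text unless both fields are valid.

```typescript
// Assumed shape, derived from DEFAULT_OUTPUT_SCHEMA; the original type is not shown.
type StructuredOutput = {
  output: string;
  status: 'ok' | 'action_required';
};

// Runtime check that a parsed JSON value matches the assumed shape.
function isStructuredOutput(value: unknown): value is StructuredOutput {
  if (typeof value !== 'object' || value === null) return false;
  const candidate = value as Record<string, unknown>;
  return (
    typeof candidate.output === 'string' &&
    (candidate.status === 'ok' || candidate.status === 'action_required')
  );
}

// Hypothetical stricter variant: return the parsed object only when it fully
// matches the schema, otherwise wrap the raw text with status 'ok'.
function parseOrFallback(raw: string): StructuredOutput {
  try {
    const parsed: unknown = JSON.parse(raw);
    if (isStructuredOutput(parsed)) return parsed;
  } catch {
    // Not valid JSON: fall through to the raw-text fallback
  }
  return { output: raw, status: 'ok' };
}
```

Whether the lenient or the strict behavior is preferable depends on how much you trust the model to respect the schema; the lenient version shown earlier never discards a usable output field.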

Complete Event Handling Flow

private async runWithStreaming(
  thread: Thread,
  input: CodexStageExecutionInput
): Promise<{ output: string; usage: Usage | null }> {
  const abortController = new AbortController();
  const timeoutHandle = setTimeout(() => {
    abortController.abort();
  }, Math.max(1000, input.timeoutMs));

  let latestMessage = '';
  let usage: Usage | null = null;
  let emittedLength = 0;

  try {
    const { events } = await thread.runStreamed(input.prompt, {
      outputSchema: DEFAULT_OUTPUT_SCHEMA,
      signal: abortController.signal,
    });

    for await (const event of events) {
      // Handle message content
      this.handleThreadEvent(event, (nextContent) => {
        const delta = nextContent.slice(emittedLength);
        if (delta.length > 0) {
          emittedLength = nextContent.length;
          input.callbacks?.onChunk?.(delta);  // Streaming callback
        }
        latestMessage = nextContent;
      });

      // Process different data based on the event type
      if (event.type === 'thread.started') {
        this.threadId = event.thread_id;
      } else if (event.type === 'turn.completed') {
        usage = event.usage;
      } else if (event.type === 'turn.failed') {
        throw new CodexExecutorError('gateway_unavailable', event.error.message, true);
      } else if (event.type === 'error') {
        throw new CodexExecutorError('gateway_unavailable', event.message, true);
      }
    }
  } catch (error) {
    if (abortController.signal.aborted) {
      throw new CodexExecutorError(
        'upstream_timeout',
        `Codex stage timed out after ${input.timeoutMs}ms`,
        true
      );
    }
    throw error;
  } finally {
    clearTimeout(timeoutHandle);
  }

  const structured = toStructuredOutput(latestMessage);
  return { output: structured.output, usage };
}
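One detail in the flow above deserves a closer look. The delta is computed from emittedLength alone, which assumes every update extends the previous text. If a later message were shorter than what has already been emitted, slice would return an empty string and nothing would reach onChunk. Whether that situation occurs depends on how items are emitted during a turn, so treat the following as a defensive sketch rather than a required fix; createDeltaEmitter is a hypothetical helper name.

```typescript
// Turn cumulative text snapshots into incremental chunks.
// If a snapshot does not continue the previous one (for example, a new
// message started), the whole snapshot is emitted instead of a suffix.
function createDeltaEmitter(onChunk: (delta: string) => void): (snapshot: string) => void {
  let emitted = '';
  return (snapshot: string): void => {
    const delta = snapshot.startsWith(emitted)
      ? snapshot.slice(emitted.length)
      : snapshot;
    if (delta.length > 0) {
      onChunk(delta);
    }
    emitted = snapshot;
  };
}

// Usage: collect the chunks that a UI would receive.
const received: string[] = [];
const push = createDeltaEmitter((delta) => received.push(delta));
push('Hel');
push('Hello');
push('Hello'); // no new text, nothing emitted
push('Bye');   // not a continuation, emitted whole
```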

Error Handling Strategy

Error Code Mapping

Map specific error patterns to concrete error codes so the upper layers can handle them more easily:

function mapError(error: unknown): CodexExecutorError {
  if (error instanceof CodexExecutorError) {
    return error;
  }

  const message = error instanceof Error ? error.message : String(error);
  const normalized = message.toLowerCase();

  // Authentication errors - not retryable
  if (normalized.includes('401') ||
      normalized.includes('403') ||
      normalized.includes('api key') ||
      normalized.includes('auth')) {
    return new CodexExecutorError('auth_invalid', message, false);
  }

  // Rate limit errors - retryable
  if (normalized.includes('429') || normalized.includes('rate limit')) {
    return new CodexExecutorError('rate_limited', message, true);
  }

  // Timeout errors - retryable
  if (normalized.includes('timeout') || normalized.includes('aborted')) {
    return new CodexExecutorError('upstream_timeout', message, true);
  }

  // Default error
  return new CodexExecutorError('gateway_unavailable', message, true);
}

Error Type Definitions

export type CodexErrorCode =
  | 'auth_invalid'      // Authentication failure
  | 'upstream_timeout'  // Upstream timeout
  | 'rate_limited'      // Rate limited
  | 'gateway_unavailable'; // Gateway unavailable

export class CodexExecutorError extends Error {
  readonly code: CodexErrorCode;
  readonly retryable: boolean;

  constructor(code: CodexErrorCode, message: string, retryable: boolean) {
    super(message);
    this.name = 'CodexExecutorError';
    this.code = code;
    this.retryable = retryable;
  }
}
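With a closed set of error codes, upper layers can map each code to a user-facing hint. The sketch below is only an illustration: ERROR_HINTS, describeFailure, and the hint wording are hypothetical and not part of HagiCode. Typing the map as Record<CodexErrorCode, string> means the compiler reports any code that is missing a hint.

```typescript
// Redeclared here so the sketch is self-contained.
type CodexErrorCode =
  | 'auth_invalid'
  | 'upstream_timeout'
  | 'rate_limited'
  | 'gateway_unavailable';

// Hypothetical hints; the wording is illustrative only.
const ERROR_HINTS: Record<CodexErrorCode, string> = {
  auth_invalid: 'Check the configured API key.',
  upstream_timeout: 'The task took too long; consider a longer timeout.',
  rate_limited: 'Too many requests; wait before sending more.',
  gateway_unavailable: 'The upstream service could not be reached.',
};

// Combine the hint with the retryable flag carried by the error.
function describeFailure(code: CodexErrorCode, retryable: boolean): string {
  const action = retryable
    ? 'The request can be retried.'
    : 'Retrying will not help until this is fixed.';
  return `${ERROR_HINTS[code]} ${action}`;
}
```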

Working Directory and Environment Configuration

Working Directory Validation

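By default, Codex expects the working directory to be a Git repository; setting skipGitRepoCheck bypasses that check. The executor validates the directory up front so that problems surface as clear errors: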

import { existsSync, statSync } from 'node:fs';
import * as path from 'node:path';

export function validateWorkingDirectory(
  workingDirectory: string,
  skipGitRepoCheck: boolean
): void {
  const resolvedWorkingDirectory = path.resolve(workingDirectory);

  if (!existsSync(resolvedWorkingDirectory)) {
    throw new CodexExecutorError(
      'gateway_unavailable',
      'Working directory does not exist.',
      false
    );
  }

  if (!statSync(resolvedWorkingDirectory).isDirectory()) {
    throw new CodexExecutorError(
      'gateway_unavailable',
      'Working directory is not a directory.',
      false
    );
  }

  if (skipGitRepoCheck) {
    return;
  }

  const gitDir = path.join(resolvedWorkingDirectory, '.git');
  if (!existsSync(gitDir)) {
    throw new CodexExecutorError(
      'gateway_unavailable',
      'Working directory is not a git repository.',
      false
    );
  }
}

Loading Environment Variables

The executor loads environment variables from the user’s login shell and passes them to Codex, so the AI Agent can find the same system commands that are available in a terminal:

import { spawnSync } from 'node:child_process';

function parseEnvironmentOutput(output: Buffer): Record<string, string> {
  const parsed: Record<string, string> = {};

  for (const entry of output.toString('utf8').split('\0')) {
    if (!entry) continue;

    const separatorIndex = entry.indexOf('=');
    if (separatorIndex <= 0) continue;

    const key = entry.slice(0, separatorIndex);
    const value = entry.slice(separatorIndex + 1);
    if (key.length > 0) {
      parsed[key] = value;
    }
  }

  return parsed;
}

function tryLoadEnvironmentFromShell(shellPath: string): Record<string, string> | null {
  const result = spawnSync(shellPath, ['-ilc', 'env -0'], {
    env: process.env,
    stdio: ['ignore', 'pipe', 'pipe'],
    timeout: 5000,
  });

  if (result.error || result.status !== 0) {
    return null;
  }

  return parseEnvironmentOutput(result.stdout);
}

export function createExecutorEnvironment(
  envOverrides: Record<string, string> = {}
): Record<string, string> {
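  // Load environment variables from the login shell.
  // Note: loadConsoleEnvironmentFromShell is not shown in this article; it is the
  // caller of tryLoadEnvironmentFromShell in the full implementation.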
  const consoleEnv = loadConsoleEnvironmentFromShell();

  return {
    ...process.env,
    ...consoleEnv,
    ...envOverrides,
  };
}

Complete Usage Example

Basic Usage

In the HagiCode project, we use the following approach to initialize the Codex client and execute tasks:

import { Codex } from '@openai/codex-sdk';

async function executeWithCodex(prompt: string, workingDir: string) {
  const client = new Codex({
    apiKey: process.env.CODEX_API_KEY,
    env: { PATH: process.env.PATH ?? '' }, // process.env values may be undefined
  });

  const thread = client.startThread({
    workingDirectory: workingDir,
  });

  const { events } = await thread.runStreamed(prompt);

  let result = '';
  for await (const event of events) {
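    // Handle both update and completion events, as described in the key points above
    if (
      (event.type === 'item.updated' || event.type === 'item.completed') &&
      event.item.type === 'agent_message'
    ) {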
      result = event.item.text;
    }
    if (event.type === 'turn.completed') {
      console.log('Token usage:', event.usage);
    }
  }

  // Try to parse JSON output
  try {
    const parsed = JSON.parse(result);
    return parsed.output;
  } catch {
    return result;
  }
}

Complete Implementation with Retries

export class CodexSdkExecutor {
  private readonly config: CodexRuntimeConfig;
  private readonly client: Codex;
  private threadId: string | null = null;

  async executeStage(input: CodexStageExecutionInput): Promise<CodexStageExecutionResult> {
    const startedAt = Date.now(); // Start time used for latencyMs below
    const maxAttempts = Math.max(1, this.config.retryCount + 1);
    let attempt = 0;
    let lastError: CodexExecutorError | null = null;

    while (attempt < maxAttempts) {
      attempt += 1;

      try {
        const thread = this.getThread(input.workingDirectory);
        const { output, usage } = await this.runWithStreaming(thread, input);

        return {
          output,
          usage,
          threadId: this.threadId!,
          attempts: attempt,
          latencyMs: Date.now() - startedAt,
        };
      } catch (error) {
        const mappedError = mapError(error);
        lastError = mappedError;

        // Non-retryable error or max retry attempts reached
        if (!mappedError.retryable || attempt >= maxAttempts) {
          throw mappedError;
        }

        // Wait before retrying (linear backoff: 1s, 2s, 3s, ...)
        await new Promise(resolve => setTimeout(resolve, 1000 * attempt));
      }
    }

    throw lastError!;
  }
}

Best Practices

1. Working Directory Requirements

Make sure the working directory is a valid Git repository
Use the PROJECT_ROOT environment variable to specify it explicitly
During debugging, you can set CODEX_SKIP_GIT_REPO_CHECK=true to skip the check

2. Environment Variable Configuration

Pass only the required environment variables through a whitelist mechanism
Use the login shell to load the full environment
Avoid passing sensitive information
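As an illustration of the whitelist idea, the sketch below copies only explicitly allowed keys from a source environment. The helper name pickEnvironment and the sample key list are hypothetical; which variables your agent needs depends on your toolchain.

```typescript
// Copy only the allowed keys that are actually set.
// Accepting `string | undefined` values matches the type of process.env.
function pickEnvironment(
  source: Record<string, string | undefined>,
  allowedKeys: readonly string[],
): Record<string, string> {
  const picked: Record<string, string> = {};
  for (const key of allowedKeys) {
    const value = source[key];
    if (typeof value === 'string') {
      picked[key] = value;
    }
  }
  return picked;
}

// Usage with a sample environment; the secret is not on the list, so it is dropped.
const sampleEnv: Record<string, string | undefined> = {
  PATH: '/usr/bin',
  HOME: '/home/dev',
  SECRET_TOKEN: 'do-not-forward',
  LANG: undefined,
};
const forwarded = pickEnvironment(sampleEnv, ['PATH', 'HOME', 'LANG']);
```

The result can be merged with explicit overrides before it is handed to the client, which keeps the full login-shell environment available for lookup while forwarding only what is needed.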

3. Timeout and Retry Strategy

Set reasonable timeouts based on task complexity
Implement exponential backoff for retryable errors
Record retry counts and reasons
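The retry loop shown earlier waits 1000 * attempt milliseconds, so the delay grows linearly. A minimal sketch of an exponential schedule with an upper bound could look like this; backoffDelayMs and its default values are our own assumptions, not HagiCode settings.

```typescript
// Delay before the next retry: baseMs, 2*baseMs, 4*baseMs, ... capped at maxMs.
// `attempt` is 1-based, matching the counter in the retry loop.
function backoffDelayMs(attempt: number, baseMs = 1000, maxMs = 30000): number {
  const exponent = Math.max(0, attempt - 1);
  return Math.min(maxMs, baseMs * 2 ** exponent);
}

// Delays for the first five attempts with the defaults.
const schedule = [1, 2, 3, 4, 5].map((attempt) => backoffDelayMs(attempt));
```

In the retry loop, the wait would then become setTimeout(resolve, backoffDelayMs(attempt)). Adding random jitter on top is a common refinement when many clients retry at the same time.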

4. Error Handling

Distinguish between retryable and non-retryable errors
Provide clear error messages and suggestions
Use unified error codes so upper layers can handle them consistently

5. Streaming Output

Implement incremental output callbacks to improve user experience
Correctly handle incremental message updates
Record token usage for cost analysis
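For cost analysis, the usage reported on each turn.completed event can be summed across turns or retries. The field names below (input_tokens, cached_input_tokens, output_tokens) are an assumption about the shape of the usage object, so verify them against the SDK version you use; TokenUsage and addUsage are hypothetical names.

```typescript
// Assumed usage shape; check the field names against your SDK version.
type TokenUsage = {
  input_tokens: number;
  cached_input_tokens: number;
  output_tokens: number;
};

const EMPTY_USAGE: TokenUsage = {
  input_tokens: 0,
  cached_input_tokens: 0,
  output_tokens: 0,
};

// Add one turn's usage to a running total; a null usage leaves the total unchanged.
function addUsage(total: TokenUsage, next: TokenUsage | null): TokenUsage {
  if (next === null) return total;
  return {
    input_tokens: total.input_tokens + next.input_tokens,
    cached_input_tokens: total.cached_input_tokens + next.cached_input_tokens,
    output_tokens: total.output_tokens + next.output_tokens,
  };
}

// Usage: fold the usage of several turns into one total.
const turns: Array<TokenUsage | null> = [
  { input_tokens: 100, cached_input_tokens: 20, output_tokens: 30 },
  null,
  { input_tokens: 50, cached_input_tokens: 0, output_tokens: 10 },
];
const totalUsage = turns.reduce(addUsage, EMPTY_USAGE);
```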

In the actual production environment of the HagiCode project, we have already verified the effectiveness of the best practices above. This approach has helped us build a stable and reliable AI execution service. In the end, practical validation matters more than theory alone.

Conclusion

The Codex SDK event stream mechanism provides strong capabilities for building AI execution services. By correctly parsing different kinds of events, we can:

Get execution status and output in real time
Implement reliable error handling and retry mechanisms
Obtain detailed execution statistics
Build a full-featured AI execution platform

The core concepts and code samples in this article come from HagiCode’s executor and can serve as a starting point for your own Codex SDK integration. If this approach is useful to you, HagiCode itself may be worth following.

This content was created with AI-assisted collaboration, reviewed by the author, and reflects the author’s own views and positions.

Author: newbe36524
Article link: https://docs.hagicode.com/blog/2026-03-09-codex-sdk-console-message-parsing/
Copyright notice: Unless otherwise stated, all articles on this blog are licensed under BY-NC-SA. Please cite the source when reprinting.

HagiCode Multi-AI Provider Switching and Interoperability Implementation Plan

Mar 9, 2026

In the modern developer-tooling ecosystem, developers often need to use different AI coding assistants to support their work. Anthropic’s Claude Code CLI and OpenAI’s Codex CLI each have their own strengths: Claude is known for outstanding code understanding and long-context handling, while Codex excels at code generation and tool usage.

This article takes an in-depth look at how the HagiCode project achieves seamless switching and interoperability across multiple AI providers, including the core architectural design, key implementation details, and practical considerations.

Background

Problem Domain

The core challenge faced by the HagiCode project is supporting multiple AI CLIs on the same platform, so users can:

Flexibly switch between AI providers based on their needs
Maintain session continuity during provider switching
Unify the API differences across different CLIs behind a common abstraction
Reserve extension points for adding new AI providers in the future

Technical Challenges

Unifying interface differences: Claude Code CLI is invoked through command-line calls, while Codex CLI uses a JSON event stream
Handling streaming responses: Both providers support streaming responses, but with different data formats
Tool-calling semantics: Claude and Codex differ in how they represent tool calls and manage their lifecycle
Session lifecycle: The system must correctly manage session creation, restoration, and termination for each provider

Analysis

Architectural Design Approach

HagiCode uses the Provider Pattern combined with the Factory Pattern to abstract AI service invocation. The core ideas of this design are:

Unified interface abstraction: Define the IAIProvider interface as the common abstraction for all AI providers
Factory-created instances: Use AIProviderFactory to dynamically create the corresponding provider instance based on type
Intelligent selection logic: Use AIProviderSelector to automatically select the most suitable provider based on scenario and configuration
Session state management: Persist the binding relationship between sessions and CLI threads in the database

Key Components

Component	Responsibility	Language
`IAIProvider`	Unified provider interface	C#
`AIProviderFactory`	Create and manage provider instances	C#
`AIProviderSelector`	Select providers intelligently	C#
`ClaudeCodeCliProvider`	Claude Code CLI implementation	C#
`CodexCliProvider`	Codex CLI implementation	C#
`AgentCliManager`	Desktop-side CLI management	TypeScript

Solution

1. Core Interface Design

The IAIProvider interface defines the unified provider abstraction:

public interface IAIProvider
{
    /// <summary>
    /// Provider display name
    /// </summary>
    string Name { get; }

    /// <summary>
    /// Whether streaming responses are supported
    /// </summary>
    bool SupportsStreaming { get; }

    /// <summary>
    /// Provider capability description
    /// </summary>
    ProviderCapabilities Capabilities { get; }

    /// <summary>
    /// Execute a single AI request
    /// </summary>
    Task<AIResponse> ExecuteAsync(AIRequest request, CancellationToken cancellationToken = default);

    /// <summary>
    /// Execute a streaming AI request
    /// </summary>
    IAsyncEnumerable<AIStreamingChunk> StreamAsync(AIRequest request, CancellationToken cancellationToken = default);

    /// <summary>
    /// Check provider connectivity and responsiveness
    /// </summary>
    Task<ProviderTestResult> PingAsync(CancellationToken cancellationToken = default);

    /// <summary>
    /// Send a message with an embedded command
    /// </summary>
    IAsyncEnumerable<AIStreamingChunk> SendMessageAsync(
        AIRequest request,
        string? embeddedCommandPrompt = null,
        CancellationToken cancellationToken = default);
}

Key characteristics of this interface design:

Unified request/response model: All providers use the same AIRequest and AIResponse types
Streaming support: Standardize streaming output through IAsyncEnumerable<AIStreamingChunk>
Capability description: ProviderCapabilities describes the features supported by the provider (streaming, tools, maximum tokens, and so on)
Embedded commands: SendMessageAsync supports embedding OpenSpec commands into prompts

2. Provider Type Enumeration

public enum AIProviderType
{
    ClaudeCodeCli,   // Anthropic Claude Code
    OpenCodeCli,     // Other CLIs (extensible)
    GitHubCopilot,   // GitHub Copilot
    CodebuddyCli,    // Codebuddy
    CodexCli         // OpenAI Codex
}

This enum provides a type-safe representation for all providers supported by the system.

3. Factory Pattern Implementation

The AIProviderFactory is responsible for creating and managing provider instances:

public class AIProviderFactory : IAIProviderFactory
{
    private readonly ConcurrentDictionary<AIProviderType, IAIProvider> _cache;
    private readonly IOptions<AIProviderOptions> _options;
    private readonly IServiceProvider _serviceProvider;
    private readonly ILogger<AIProviderFactory> _logger; // Used by LogWarning below

    public Task<IAIProvider?> GetProviderAsync(AIProviderType providerType)
    {
        // Use caching to avoid duplicate creation
        if (_cache.TryGetValue(providerType, out var cached))
            return Task.FromResult<IAIProvider?>(cached);

        // Get provider configuration from settings
        var aiOptions = _options.Value;
        if (!aiOptions.Providers.TryGetValue(providerType, out var config))
        {
            _logger.LogWarning("Provider '{ProviderType}' not found in configuration", providerType);
            return Task.FromResult<IAIProvider?>(null);
        }

        // Create provider by type
        var provider = providerType switch
        {
            AIProviderType.ClaudeCodeCli =>
                _serviceProvider.GetService(typeof(ClaudeCodeCliProvider)) as IAIProvider,
            AIProviderType.CodexCli =>
                _serviceProvider.GetService(typeof(CodexCliProvider)) as IAIProvider,
            AIProviderType.GitHubCopilot =>
                _serviceProvider.GetService(typeof(CopilotAIProvider)) as IAIProvider,
            _ => null
        };

        if (provider != null)
        {
            _cache[providerType] = provider;
        }

        return Task.FromResult<IAIProvider?>(provider);
    }
}

Advantages of the factory pattern:

Instance caching: Avoid repeatedly creating the same type of provider
Dependency injection: Create instances through IServiceProvider, with dependency injection support
Configuration-driven: Read provider settings from configuration files
Failure handling: Return null when the provider is not configured or cannot be resolved, making it easier for upper layers to handle errors

4. Intelligent Selector

The AIProviderSelector implements provider-selection strategies:

public class AIProviderSelector : IAIProviderSelector
{
    private readonly BusinessLayerConfiguration _configuration;
    private readonly IAIProviderFactory _providerFactory;
    private readonly IMemoryCache _cache;
    private readonly ILogger<AIProviderSelector> _logger; // Used by the log calls below

    public async Task<AIProviderType> SelectProviderAsync(
        BusinessScenario scenario,
        CancellationToken cancellationToken = default)
    {
        // 1. Try getting a provider from scenario mapping
        if (_configuration.ScenarioProviderMapping.TryGetValue(scenario, out var providerType))
        {
            if (await IsProviderAvailableAsync(providerType, cancellationToken))
            {
                _logger.LogDebug("Selected provider '{Provider}' for scenario '{Scenario}'",
                    providerType, scenario);
                return providerType;
            }

            _logger.LogWarning("Configured provider '{Provider}' for scenario '{Scenario}' is not available",
                providerType, scenario);
        }

        // 2. Try the default provider
        if (await IsProviderAvailableAsync(_configuration.DefaultProvider, cancellationToken))
        {
            _logger.LogDebug("Using default provider '{Provider}' for scenario '{Scenario}'",
                _configuration.DefaultProvider, scenario);
            return _configuration.DefaultProvider;
        }

        // 3. Try the fallback chain
        foreach (var fallbackProvider in _configuration.FallbackChain)
        {
            if (await IsProviderAvailableAsync(fallbackProvider, cancellationToken))
            {
                _logger.LogInformation("Using fallback provider '{Provider}' for scenario '{Scenario}'",
                    fallbackProvider, scenario);
                return fallbackProvider;
            }
        }

        // 4. No available provider can be found
        throw new InvalidOperationException(
            $"No available AI provider found for scenario '{scenario}'");
    }

    public async Task<bool> IsProviderAvailableAsync(
        AIProviderType providerType,
        CancellationToken cancellationToken = default)
    {
        var cacheKey = $"provider_available_{providerType}";

        // Use the cache to avoid repeated availability checks
        if (_configuration.EnableCache &&
            _cache.TryGetValue<bool>(cacheKey, out var cached))
        {
            return cached;
        }

        var provider = await _providerFactory.GetProviderAsync(providerType);
        var isAvailable = provider != null;

        if (_configuration.EnableCache && isAvailable)
        {
            _cache.Set(cacheKey, isAvailable,
                TimeSpan.FromSeconds(_configuration.CacheExpirationSeconds));
        }

        return isAvailable;
    }
}

Selector strategy:

Scenario mapping first: First check whether the business scenario has a specific provider mapping
Fallback to default provider: Use the default provider if scenario mapping fails
Fallback chain as a final safeguard: Try providers in the fallback chain one by one
Availability caching: Cache positive availability results to avoid repeated checks (in this excerpt, a provider counts as available once the factory can resolve it)
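The selection order can also be expressed as one ordered candidate list. The following is a language-neutral sketch written in TypeScript, not a port of the C# class: the names SelectionConfig and selectProvider are hypothetical, and the availability check is simplified to a synchronous predicate.

```typescript
type ProviderId = string;

interface SelectionConfig {
  scenarioMapping: Record<string, ProviderId>;
  defaultProvider: ProviderId;
  fallbackChain: ProviderId[];
}

// Try the scenario mapping first, then the default provider, then each
// fallback in order. Throw when no candidate is available.
function selectProvider(
  scenario: string,
  config: SelectionConfig,
  isAvailable: (provider: ProviderId) => boolean,
): ProviderId {
  const mapped: ProviderId | undefined = config.scenarioMapping[scenario];
  const candidates: ProviderId[] = [
    ...(mapped !== undefined ? [mapped] : []),
    config.defaultProvider,
    ...config.fallbackChain,
  ];
  for (const candidate of candidates) {
    if (isAvailable(candidate)) {
      return candidate;
    }
  }
  throw new Error(`No available AI provider found for scenario '${scenario}'`);
}
```

Flattening the three tiers into a single list keeps the priority order visible in one place. The C# version keeps them separate so that each tier can log at a different level.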

5. Claude Code CLI Provider Implementation

public class ClaudeCodeCliProvider : IAIProvider
{
    private readonly ILogger<ClaudeCodeCliProvider> _logger;
    private readonly IClaudeStreamManager _streamManager;
    private readonly ProviderConfiguration _config;

    public string Name => "ClaudeCodeCli";
    public bool SupportsStreaming => true;

    public ProviderCapabilities Capabilities { get; }

    public async Task<AIResponse> ExecuteAsync(AIRequest request, CancellationToken cancellationToken = default)
    {
        _logger.LogInformation("Executing AI request with provider: {Provider}", Name);

        var sessionOptions = ClaudeRequestMapper.MapToSessionOptions(request, _config);

        var messages = _streamManager.SendMessageAsync(request.Prompt, sessionOptions, cancellationToken);

        var responseBuilder = new StringBuilder();
        ResultMessage? finalResult = null;

        await foreach (var streamMessage in messages)
        {
            switch (streamMessage.Message)
            {
                case ResultMessage result:
                    finalResult = result;
                    responseBuilder.Append(result.Result);
                    break;
            }
        }

        if (finalResult != null)
        {
            return ClaudeResponseMapper.MapToAIResponse(finalResult, Name);
        }

        return new AIResponse
        {
            Content = responseBuilder.ToString(),
            FinishReason = FinishReason.Unknown,
            Provider = Name
        };
    }
}

Characteristics of the Claude Code CLI provider:

Streaming manager integration: Use IClaudeStreamManager to communicate with the Claude CLI
CessionId session isolation: Use CessionId as the unique session identifier, distinct from the system sessionId
Working directory configuration: Support configuration of the working directory, permission mode, and more
Tool support: Support tool-permission settings such as AllowedTools and DisallowedTools

6. Codex CLI Provider Implementation

public class CodexCliProvider : IAIProvider
{
    private readonly ILogger<CodexCliProvider> _logger;
    private readonly CodexSettings _settings;
    private readonly ConcurrentDictionary<string, string> _sessionThreadBindings;

    public string Name => "CodexCli";
    public bool SupportsStreaming => true;

    public ProviderCapabilities Capabilities { get; }

    public async IAsyncEnumerable<AIStreamingChunk> StreamAsync(
        AIRequest request,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        _logger.LogInformation("Executing streaming AI request with provider: {Provider}", Name);

        var codex = CreateCodexClient();
        var thread = ResolveThread(codex, request);

        var currentTurn = 0;
        var activeToolCalls = new Dictionary<string, AIToolCallDelta>();

        await foreach (var threadEvent in thread.RunStreamedAsync(BuildPrompt(request), cancellationToken))
        {
            if (threadEvent is TurnStartedEvent)
            {
                currentTurn++;
            }

            switch (threadEvent)
            {
                case ItemCompletedEvent { Item: AgentMessageItem message }:
                    var messageText = message.Text ?? string.Empty;
                    yield return new AIStreamingChunk
                    {
                        Content = messageText,
                        Type = StreamingChunkType.ContentDelta,
                        IsComplete = false
                    };
                    break;

                case ItemStartedEvent or ItemUpdatedEvent or ItemCompletedEvent:
                    var toolChunk = BuildToolChunk(threadEvent, currentTurn);
                    if (toolChunk?.ToolCallDelta != null)
                    {
                        yield return toolChunk;
                    }
                    break;

                case TurnCompletedEvent turnCompleted:
                    activeToolCalls.Clear();
                    yield return new AIStreamingChunk
                    {
                        Content = string.Empty,
                        Type = StreamingChunkType.Metadata,
                        IsComplete = true,
                        Usage = MapUsage(turnCompleted.Usage)
                    };
                    break;
            }
        }

        BindSessionThread(request.SessionId, thread.Id);
    }

    private CodexThread ResolveThread(Codex codex, AIRequest request)
    {
        var sessionId = request.SessionId;

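        // Note: threadOptions is built elsewhere in the provider;
        // its construction is omitted from this excerpt.

        // Check whether there is already a bound thread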
        if (!string.IsNullOrWhiteSpace(sessionId) &&
            _sessionThreadBindings.TryGetValue(sessionId, out var threadId) &&
            !string.IsNullOrWhiteSpace(threadId))
        {
            _logger.LogInformation("Resuming Codex thread {ThreadId} for session {SessionId}", threadId, sessionId);
            return codex.ResumeThread(threadId, threadOptions);
        }

        _logger.LogInformation("Starting new Codex thread for session {SessionId}", sessionId ?? "(none)");
        return codex.StartThread(threadOptions);
    }
}

Characteristics of the Codex CLI provider:

JSON event-stream handling: Parse Codex JSON event streams (TurnStarted, ItemStarted, TurnCompleted, and so on)
Session-thread binding: Persist the binding between sessions and threads with an SQLite database
Thread reuse: Support resuming existing threads to maintain session continuity
Tool-call tracking: Track active tool-call state and correctly handle the tool lifecycle

7. Session-Thread Binding Mechanism

CodexCliProvider uses an SQLite database to persist the binding between sessions and threads, with an in-memory dictionary as a cache in front of it:

public class CodexCliProvider : IAIProvider
{
    private const int SessionThreadBindingRetentionDays = 30;
    private readonly ConcurrentDictionary<string, string> _sessionThreadBindings;
    private readonly string _sessionThreadBindingDatabaseConnectionString;
    private readonly string _sessionThreadBindingDatabasePath;

    private void BindSessionThread(string? sessionId, string? threadId)
    {
        if (string.IsNullOrWhiteSpace(sessionId) || string.IsNullOrWhiteSpace(threadId))
        {
            return;
        }

        // In-memory cache
        _sessionThreadBindings.AddOrUpdate(sessionId, threadId, (_, _) => threadId);

        // Persist to SQLite
        PersistSessionThreadBinding(sessionId, threadId);
    }

    private void PersistSessionThreadBinding(string sessionId, string threadId)
    {
        try
        {
            using var connection = new SqliteConnection(_sessionThreadBindingDatabaseConnectionString);
            connection.Open();

            using var upsertCommand = connection.CreateCommand();
            upsertCommand.CommandText =
                """
                INSERT INTO SessionThreadBindings (SessionId, ThreadId, CreatedAtUtc, UpdatedAtUtc)
                VALUES ($sessionId, $threadId, $createdAtUtc, $updatedAtUtc)
                ON CONFLICT(SessionId) DO UPDATE SET
                    ThreadId = excluded.ThreadId,
                    UpdatedAtUtc = excluded.UpdatedAtUtc;
                """;
            var nowUtc = DateTimeOffset.UtcNow.ToString("O");
            upsertCommand.Parameters.AddWithValue("$sessionId", sessionId);
            upsertCommand.Parameters.AddWithValue("$threadId", threadId);
            upsertCommand.Parameters.AddWithValue("$createdAtUtc", nowUtc);
            upsertCommand.Parameters.AddWithValue("$updatedAtUtc", nowUtc);
            upsertCommand.ExecuteNonQuery();
        }
        catch (Exception ex)
        {
            _logger.LogWarning(
                ex,
                "Failed to persist Codex session-thread binding for session {SessionId} to {DatabasePath}",
                sessionId,
                _sessionThreadBindingDatabasePath);
        }
    }

    private void LoadPersistedSessionThreadBindings()
    {
        using var connection = new SqliteConnection(_sessionThreadBindingDatabaseConnectionString);
        connection.Open();

        using var loadCommand = connection.CreateCommand();
        loadCommand.CommandText = "SELECT SessionId, ThreadId FROM SessionThreadBindings;";
        using var reader = loadCommand.ExecuteReader();
        while (reader.Read())
        {
            var sessionId = reader.GetString(0);
            var threadId = reader.GetString(1);
            _sessionThreadBindings[sessionId] = threadId;
        }
    }
}

Advantages of session-thread binding:

Session restoration: Previous sessions can be restored after a system restart
Thread reuse: The same session can reuse an existing Codex thread
Automatic cleanup: Bindings older than 30 days are cleaned up automatically

8. Desktop-Side CLI Management

hagicode-desktop manages CLI selection through AgentCliManager:

export enum AgentCliType {
  ClaudeCode = 'claude-code',
  Codex = 'codex',
  // Future extensions: other CLIs such as Aider and Cursor
}

export class AgentCliManager {
  private static readonly STORE_KEY = 'agentCliSelection';
  private static readonly EXECUTOR_TYPE_MAP: Record<AgentCliType, string> = {
    [AgentCliType.ClaudeCode]: 'ClaudeCodeCli',
    [AgentCliType.Codex]: 'CodexCli',
  };

  constructor(private store: any) {}

  async saveSelection(cliType: AgentCliType): Promise<void> {
    const selection: StoredAgentCliSelection = {
      cliType,
      isSkipped: false,
      selectedAt: new Date().toISOString(),
    };

    this.store.set(AgentCliManager.STORE_KEY, selection);
  }

  loadSelection(): StoredAgentCliSelection {
    return this.store.get(AgentCliManager.STORE_KEY, {
      cliType: null,
      isSkipped: false,
      selectedAt: null,
    });
  }

  getCommandName(cliType: AgentCliType): string {
    switch (cliType) {
      case AgentCliType.ClaudeCode:
        return 'claude';
      case AgentCliType.Codex:
        return 'codex';
      default:
        return 'claude';
    }
  }

  getExecutorType(cliType: AgentCliType | null): string {
    if (!cliType) return 'ClaudeCodeCli';
    // EXECUTOR_TYPE_MAP is static, so it is accessed through the class, not the instance
    return AgentCliManager.EXECUTOR_TYPE_MAP[cliType] || 'ClaudeCodeCli';
  }
}

Example desktop-side IPC handler:

ipcMain.handle('llm:call-api', async (event, manifestPath, region) => {
  if (!state.llmInstallationManager) {
    return { success: false, error: 'LLM Installation Manager not initialized' };
  }

  try {
    const prompt = await state.llmInstallationManager.loadPrompt(manifestPath, region);

    // Determine the CLI command based on the user's selection
    let commandName = 'claude';
    if (state.agentCliManager) {
      const selectedCliType = state.agentCliManager.getSelectedCliType();
      if (selectedCliType) {
        commandName = state.agentCliManager.getCommandName(selectedCliType);
      }
    }

    // Execute with the selected CLI
    const result = await state.llmInstallationManager.callApi(
      prompt.filePath,
      event.sender,
      commandName
    );

    return result;
  } catch (error) {
    return {
      success: false,
      error: error instanceof Error ? error.message : 'Unknown error'
    };
  }
});
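
The handler calls getSelectedCliType(), which the AgentCliManager excerpt does not show. A minimal sketch of how it might work, assuming it simply reads the stored selection and treats a skipped selection as "no choice", could look like this. The function names and the skip behavior are assumptions, not the actual hagicode-desktop code.

```typescript
type CliType = "claude-code" | "codex";

// Same fields as the stored selection shown in saveSelection/loadSelection.
interface StoredSelection {
  cliType: CliType | null;
  isSkipped: boolean;
  selectedAt: string | null;
}

// Hypothetical helper: derive the active CLI type from the stored selection.
function getSelectedCliType(selection: StoredSelection): CliType | null {
  // A skipped selection step means the user made no explicit choice.
  return selection.isSkipped ? null : selection.cliType;
}

// Map a CLI type to its executable name, defaulting to 'claude'.
function commandNameFor(cliType: CliType | null): string {
  return cliType === "codex" ? "codex" : "claude";
}
```

With this shape, the IPC handler's fallback to 'claude' covers both a missing manager and a missing selection.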

9. Codex’s Internal Model Provider System

Codex itself also supports multiple model providers via ModelProviderInfo configuration:

pub const OPENAI_PROVIDER_NAME: &str = "OpenAI";
pub const OLLAMA_OSS_PROVIDER_ID: &str = "ollama";
pub const LMSTUDIO_OSS_PROVIDER_ID: &str = "lmstudio";

pub fn built_in_model_providers() -> HashMap<String, ModelProviderInfo> {
    use ModelProviderInfo as P;

    [
        ("openai", P::create_openai_provider()),
        (OLLAMA_OSS_PROVIDER_ID, create_oss_provider(DEFAULT_OLLAMA_PORT, WireApi::Responses)),
        (LMSTUDIO_OSS_PROVIDER_ID, create_oss_provider(DEFAULT_LMSTUDIO_PORT, WireApi::Responses)),
    ]
    .into_iter()
    .map(|(k, v)| (k.to_string(), v))
    .collect()
}

pub struct ModelProviderInfo {
    pub name: String,
    pub base_url: Option<String>,
    pub env_key: Option<String>,
    pub query_params: Option<HashMap<String, String>>,
    pub http_headers: Option<HashMap<String, String>>,
    pub request_max_retries: Option<u64>,
    pub stream_max_retries: Option<u64>,
    pub stream_idle_timeout_ms: Option<u64>,
    pub requires_openai_auth: bool,
    pub supports_websockets: bool,
}

Codex model-provider support includes:

Built-in providers: OpenAI, Ollama, and LM Studio
Custom providers: Users can add custom providers in config.toml
Retry strategy: Configurable retry counts for requests and streams
WebSocket support: Some providers support WebSocket transport

Practice

Configuration Example

Configure multiple providers in appsettings.json:

{
  "AI": {
    "Providers": {
      "DefaultProvider": "ClaudeCodeCli",
      "Providers": {
        "ClaudeCodeCli": {
          "Type": "ClaudeCodeCli",
          "Model": "claude-sonnet-4-20250514",
          "WorkingDirectory": "/path/to/workspace",
          "PermissionMode": "acceptEdits",
          "AllowedTools": ["file-edit", "command-run", "bash"]
        },
        "CodexCli": {
          "Type": "CodexCli",
          "Model": "gpt-4.1",
          "ExecutablePath": "codex",
          "SandboxMode": "enabled",
          "WebSearchMode": "auto",
          "NetworkAccessEnabled": false
        }
      },
      "ScenarioProviderMapping": {
        "CodeAnalysis": "ClaudeCodeCli",
        "CodeGeneration": "CodexCli",
        "Refactoring": "ClaudeCodeCli",
        "Debugging": "CodexCli"
      },
      "FallbackChain": ["CodexCli", "ClaudeCodeCli"]
    },
    "Selector": {
      "EnableCache": true,
      "CacheExpirationSeconds": 300
    }
  }
}
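
How DefaultProvider, ScenarioProviderMapping, and FallbackChain interact is not spelled out in this section. One plausible reading, sketched in TypeScript, is: use the scenario mapping if present, otherwise the default, then try the fallback chain in order. The ProviderRouting type and resolveProviderOrder function are hypothetical names for illustration only.

```typescript
interface ProviderRouting {
  defaultProvider: string;
  scenarioProviderMapping: { [scenario: string]: string | undefined };
  fallbackChain: string[];
}

// Providers to try, in order: the scenario mapping (or the default), then the
// fallback chain, with duplicates removed while preserving order.
function resolveProviderOrder(routing: ProviderRouting, scenario: string): string[] {
  const primary = routing.scenarioProviderMapping[scenario] ?? routing.defaultProvider;
  return Array.from(new Set([primary, ...routing.fallbackChain]));
}

// Values copied from the appsettings.json example above.
const routing: ProviderRouting = {
  defaultProvider: "ClaudeCodeCli",
  scenarioProviderMapping: { CodeAnalysis: "ClaudeCodeCli", CodeGeneration: "CodexCli" },
  fallbackChain: ["CodexCli", "ClaudeCodeCli"],
};
console.log(resolveProviderOrder(routing, "CodeGeneration")); // CodexCli first, then ClaudeCodeCli
```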

Usage Example - Backend Service

public class AIOrchestrator
{
    private readonly IAIProviderFactory _providerFactory;
    private readonly IAIProviderSelector _providerSelector;
    private readonly ILogger<AIOrchestrator> _logger;

    public AIOrchestrator(
        IAIProviderFactory providerFactory,
        IAIProviderSelector providerSelector,
        ILogger<AIOrchestrator> logger)
    {
        _providerFactory = providerFactory;
        _providerSelector = providerSelector;
        _logger = logger;
    }

    public async Task<AIResponse> ProcessRequestAsync(
        AIRequest request,
        BusinessScenario scenario)
    {
        _logger.LogInformation("Processing request for scenario: {Scenario}", scenario);

        try
        {
            // Select a provider intelligently
            var providerType = await _providerSelector.SelectProviderAsync(scenario, request.CancellationToken);

            // Get the provider instance
            var provider = await _providerFactory.GetProviderAsync(providerType);
            if (provider == null)
            {
                throw new InvalidOperationException($"Provider {providerType} not available");
            }

            _logger.LogInformation("Using provider: {Provider} for request", provider.Name);

            // Execute the request
            var response = await provider.ExecuteAsync(request, request.CancellationToken);

            _logger.LogInformation("Request completed with provider: {Provider}, tokens used: {Tokens}",
                provider.Name,
                response.Usage?.TotalTokens ?? 0);

            return response;
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "Failed to process request for scenario: {Scenario}", scenario);
            throw;
        }
    }
}

Usage Example - Streaming Responses

public async IAsyncEnumerable<AIStreamingChunk> StreamResponseAsync(
    AIRequest request,
    BusinessScenario scenario)
{
    var providerType = await _providerSelector.SelectProviderAsync(scenario);
    var provider = await _providerFactory.GetProviderAsync(providerType);

    if (provider == null)
    {
        throw new InvalidOperationException($"Provider {providerType} not available");
    }

    await foreach (var chunk in provider.StreamAsync(request))
    {
        // Process streaming chunks
        switch (chunk.Type)
        {
            case StreamingChunkType.ContentDelta:
                // Show text content in real time
                await SendToClientAsync(chunk.Content);
                break;

            case StreamingChunkType.ToolCallDelta:
                // Handle tool calls
                await HandleToolCallAsync(chunk.ToolCallDelta);
                break;

            case StreamingChunkType.Metadata:
                // Handle completion events and stats
                if (chunk.IsComplete)
                {
                    _logger.LogInformation("Stream completed, usage: {@Usage}", chunk.Usage);
                }
                break;

            case StreamingChunkType.Error:
                // Handle errors
                _logger.LogError("Stream error: {Error}", chunk.ErrorMessage);
                throw new InvalidOperationException(chunk.ErrorMessage);
        }
    }
}

Usage Example - OpenSpec Commands

public async Task<string> ExecuteOpenSpecCommandAsync(
    string command,
    string arguments,
    BusinessScenario scenario)
{
    var providerType = await _providerSelector.SelectProviderAsync(scenario);
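    var provider = await _providerFactory.GetProviderAsync(providerType);
    if (provider == null)
    {
        throw new InvalidOperationException($"Provider {providerType} not available");
    }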

    // Build an embedded command prompt
    var commandPrompt = $"""
        Execute the following OpenSpec command:
        Command: {command}
        Arguments: {arguments}

        Please execute this command and return the results.
        """;

    var request = new AIRequest
    {
        Prompt = "Process this command request",
        EmbeddedCommandPrompt = commandPrompt,
        WorkingDirectory = Directory.GetCurrentDirectory()
    };

    var response = await provider.SendMessageAsync(request, commandPrompt);

    return response.Content;
}

Considerations

1. Provider Health Checks

Before switching providers, it is recommended to call PingAsync first to ensure the target provider is available:

public async Task<bool> IsProviderHealthyAsync(AIProviderType providerType)
{
    var provider = await _providerFactory.GetProviderAsync(providerType);
    if (provider == null) return false;

    var testResult = await provider.PingAsync();

    return testResult.Success &&
           testResult.ResponseTimeMs < 5000; // A response within 5 seconds is considered healthy
}
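
One gap in the check above is a ping that never returns. The TypeScript sketch below races the ping against a timer so a hung provider is reported as unhealthy. PingResult and isHealthy are illustrative names, and the 5000 ms default simply mirrors the threshold used above.

```typescript
interface PingResult {
  success: boolean;
  responseTimeMs: number;
}

// Race the ping against a timeout so a hung provider counts as unhealthy.
async function isHealthy(ping: () => Promise<PingResult>, maxMs = 5000): Promise<boolean> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<PingResult>((resolve) => {
    timer = setTimeout(() => resolve({ success: false, responseTimeMs: maxMs }), maxMs);
  });
  try {
    const result = await Promise.race([ping(), timeout]);
    return result.success && result.responseTimeMs < maxMs;
  } catch {
    return false; // a ping that throws also counts as unhealthy
  } finally {
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

The same idea applies on the C# side by passing a CancellationToken with a timeout to the ping call, if the provider interface accepts one.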

2. Session Isolation

Use CessionId (Claude) or ThreadId (Codex) to ensure session isolation:

Claude Code CLI: use CessionId as the unique session identifier
Codex CLI: use ThreadId as the session identifier

// Claude Code CLI session options
var claudeSessionOptions = new ClaudeSessionOptions
{
    CessionId = CessionId.New(),  // Generate a unique ID
    WorkingDirectory = workspacePath,
    AllowedTools = allowedTools,
    PermissionMode = PermissionMode.acceptEdits
};

// Codex thread options
var codexThreadOptions = new ThreadOptions
{
    Model = "gpt-4.1",
    SandboxMode = "enabled",
    WorkingDirectory = workspacePath
};

3. Error Handling

Fallback mechanisms must be robust when a provider is unavailable, ensuring that at least one provider remains usable:

public async Task<AIResponse> ExecuteWithFallbackAsync(
    AIRequest request,
    List<AIProviderType> preferredProviders)
{
    Exception? lastException = null;

    foreach (var providerType in preferredProviders)
    {
        try
        {
            var provider = await _providerFactory.GetProviderAsync(providerType);
            if (provider == null) continue;

            // Try execution
            return await provider.ExecuteAsync(request);
        }
        catch (Exception ex)
        {
            _logger.LogWarning(ex, "Provider {ProviderType} failed, trying next", providerType);
            lastException = ex;
        }
    }

    // All providers failed
    throw new InvalidOperationException(
        "All preferred providers failed. Last error: " + lastException?.Message,
        lastException);
}

4. Configuration Validation

Validate settings for all configured providers at startup to avoid runtime errors:

public void ValidateConfiguration(AIProviderOptions options)
{
    foreach (var (providerType, config) in options.Providers)
    {
        // Validate executable paths (for CLI-based providers)
        if (IsCliBasedProvider(providerType))
        {
            if (string.IsNullOrWhiteSpace(config.ExecutablePath))
            {
                throw new ConfigurationException(
                    $"Provider {providerType} requires ExecutablePath");
            }

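            // Only rooted paths can be verified with File.Exists; bare command names
            // such as "codex" are resolved through PATH when the process starts
            if (Path.IsPathRooted(config.ExecutablePath) && !File.Exists(config.ExecutablePath))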
            {
                throw new ConfigurationException(
                    $"Executable not found for {providerType}: {config.ExecutablePath}");
            }
        }

        // Validate API keys (for API-based providers)
        if (IsApiBasedProvider(providerType))
        {
            if (string.IsNullOrWhiteSpace(config.ApiKey))
            {
                throw new ConfigurationException(
                    $"Provider {providerType} requires ApiKey");
            }
        }

        // Validate model names
        if (string.IsNullOrWhiteSpace(config.Model))
        {
            _logger.LogWarning("No model configured for {ProviderType}, using default", providerType);
        }
    }
}

5. Cache Management

Provider instances are cached, so pay attention to lifecycle management and memory usage:

// Clean up the cache periodically
public void ClearInactiveProviders(TimeSpan inactiveThreshold)
{
    var now = DateTimeOffset.UtcNow;
    var keysToRemove = new List<AIProviderType>();

    foreach (var (type, instance) in _cache)
    {
        // Assume providers have a LastUsedTime property
        if (instance.LastUsedTime.HasValue &&
            now - instance.LastUsedTime.Value > inactiveThreshold)
        {
            keysToRemove.Add(type);
        }
    }

    foreach (var key in keysToRemove)
    {
        _cache.TryRemove(key, out _);
        _logger.LogInformation("Cleared inactive provider: {Provider}", key);
    }
}

6. Logging

Log provider selection, switching, and execution in detail to make debugging easier:

public class AIProviderLogging
{
    private readonly ILogger _logger;

    public void LogProviderSelection(
        BusinessScenario scenario,
        AIProviderType selectedProvider,
        SelectionReason reason)
    {
        _logger.LogInformation(
            "[ProviderSelection] Scenario={Scenario}, Provider={Provider}, Reason={Reason}",
            scenario,
            selectedProvider,
            reason);
    }

    public void LogProviderSwitch(
        AIProviderType fromProvider,
        AIProviderType toProvider,
        string reason)
    {
        _logger.LogWarning(
            "[ProviderSwitch] From={FromProvider} To={ToProvider}, Reason={Reason}",
            fromProvider,
            toProvider,
            reason);
    }

    public void LogProviderError(
        AIProviderType provider,
        Exception error,
        AIRequest request)
    {
        _logger.LogError(error,
            "[ProviderError] Provider={Provider}, RequestLength={Length}, Error={Message}",
            provider,
            request.Prompt.Length,
            error.Message);
    }
}

7. Thread Safety

ConcurrentDictionary keeps reads lock-free, and a lock around creation ensures each provider is created only once:

public class ThreadSafeProviderCache
{
    private readonly ConcurrentDictionary<AIProviderType, IAIProvider> _cache = new();
    private readonly ReaderWriterLockSlim _lock = new();

    public IAIProvider? GetProvider(AIProviderType type)
    {
        // Read operations do not require a lock
        if (_cache.TryGetValue(type, out var provider))
            return provider;

        // Creation requires a write lock
        _lock.EnterWriteLock();
        try
        {
            // Double-check
            if (_cache.TryGetValue(type, out provider))
                return provider;

            var newProvider = CreateProvider(type);
            if (newProvider != null)
            {
                _cache[type] = newProvider;
            }
            return newProvider;
        }
        finally
        {
            _lock.ExitWriteLock();
        }
    }
}
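
On the desktop side, JavaScript has no threads to lock, but the same "create once" problem appears when provider creation is asynchronous. A common analogue, sketched here under the assumption that creation returns a Promise, is to cache the in-flight promise so concurrent callers share it. ProviderCache is a hypothetical name and this is a different technique from the lock-based C# code above.

```typescript
// Cache the in-flight promise so concurrent callers share one creation.
class ProviderCache<T> {
  private readonly entries = new Map<string, Promise<T>>();
  private readonly create: (key: string) => Promise<T>;

  constructor(create: (key: string) => Promise<T>) {
    this.create = create;
  }

  get(key: string): Promise<T> {
    const existing = this.entries.get(key);
    if (existing) return existing;

    const created = this.create(key).catch((err) => {
      this.entries.delete(key); // do not cache failures, so a later call can retry
      throw err;
    });
    this.entries.set(key, created);
    return created;
  }
}
```

Two simultaneous get("CodexCli") calls then trigger a single creation, which is the same guarantee the double-check inside the write lock gives in C#.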

8. Database Migration

When the session-thread binding database schema changes, data migration must be considered:

public class SessionThreadMigration
{
    public async Task MigrateAsync(string dbPath)
    {
        var version = await GetSchemaVersionAsync(dbPath);

        if (version >= 2) return; // Already the latest version

        // SqliteConnection expects a connection string, not a bare file path
        using var connection = new SqliteConnection($"Data Source={dbPath}");
        connection.Open();

        // Migrate to v2: add the CreatedAtUtc column
        if (version < 2)
        {
            _logger.LogInformation("Migrating SessionThreadBindings to v2...");

            using var addColumnCommand = connection.CreateCommand();
            addColumnCommand.CommandText = "ALTER TABLE SessionThreadBindings ADD COLUMN CreatedAtUtc TEXT;";
            addColumnCommand.ExecuteNonQuery();

            using var backfillCommand = connection.CreateCommand();
            backfillCommand.CommandText =
                """
                UPDATE SessionThreadBindings
                SET CreatedAtUtc = COALESCE(NULLIF(UpdatedAtUtc, ''), $nowUtc)
                WHERE CreatedAtUtc IS NULL OR CreatedAtUtc = '';
                """;
            backfillCommand.Parameters.AddWithValue("$nowUtc", DateTimeOffset.UtcNow.ToString("O"));
            backfillCommand.ExecuteNonQuery();
        }

        await UpdateSchemaVersionAsync(dbPath, 2);
        _logger.LogInformation("Migration to v2 completed");
    }
}
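
Once there is more than one schema version, hard-coded version checks get awkward. A small sketch of the usual alternative, an ordered list of migration steps filtered by the current version, is shown below in TypeScript. MigrationStep and pendingMigrations are hypothetical names, and only the v2 step comes from the example above.

```typescript
interface MigrationStep {
  toVersion: number;
  statements: string[];
}

// Hypothetical migration list; the v2 entry mirrors the CreatedAtUtc change above.
const migrationSteps: MigrationStep[] = [
  { toVersion: 2, statements: ["ALTER TABLE SessionThreadBindings ADD COLUMN CreatedAtUtc TEXT;"] },
];

// Return the steps that still need to run, in ascending version order.
function pendingMigrations(currentVersion: number, all: MigrationStep[]): MigrationStep[] {
  return all
    .filter((step) => step.toVersion > currentVersion)
    .sort((a, b) => a.toVersion - b.toVersion);
}

console.log(pendingMigrations(1, migrationSteps).map((step) => step.toVersion)); // version 2 is pending
```

Each pending step would then run inside a transaction, with the stored schema version updated only after the step succeeds.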

Conclusion

HagiCode combines the provider pattern, factory pattern, and selector pattern to implement a flexible and extensible multi-AI provider architecture:

Unified interface abstraction: The IAIProvider interface hides the differences between CLIs
Dynamic instance creation: AIProviderFactory supports runtime creation of provider instances
Intelligent selection strategy: AIProviderSelector implements scenario-driven provider selection
Session state persistence: Database bindings ensure session continuity
Desktop integration: AgentCliManager supports user selection and configuration

The advantages of this architecture are:

Extensibility: Adding a new AI provider only requires implementing the IAIProvider interface
Testability: Providers can be tested and mocked independently
Maintainability: Each provider implementation is isolated and has a single responsibility
User-friendliness: Supports both scenario-based automatic selection and manual switching

With this design, HagiCode successfully enables seamless switching and interoperability between Claude Code CLI and Codex CLI, giving developers a flexible and powerful AI coding assistant experience.

References

HagiCode project repository: github.com/HagiCode-org/site
HagiCode official website: hagicode.com
Claude Code official documentation: docs.anthropic.com
OpenAI Codex documentation: platform.openai.com
Codex SDK official repository: github.com/openai/codex
HagiCode multi-platform CLI support: https://docs.hagicode.com/blog/hagicode-ai-cli-multi-platform-support/

Thank you for reading. If you found this article useful, please click the like button below 👍 so more people can discover it.

This content was created with AI-assisted collaboration, reviewed by me, and reflects my own views and position.

Author: newbe36524
Article link: https://docs.hagicode.com/blog/2026-03-09-hagicode-multi-ai-provider-switching-interop/
Copyright notice: Unless otherwise stated, all blog posts on this site are licensed under BY-NC-SA. Please credit the source when reprinting!

From TypeScript to C#: Cross-Language Porting Practice for the Codex SDK

Mar 7, 2026

This article is close to our hearts: it records the full process of porting the official TypeScript Codex SDK to C#. Calling it a “port” almost makes it sound too easy. It was more like a long adventure, because the two languages have very different personalities, and we had to find a way to make them cooperate.

Background

Codex is the AI Agent CLI tool released by OpenAI, and it is genuinely powerful. The official team provides a TypeScript SDK in the @openai/codex-sdk package. It interacts with the Codex CLI by calling the codex exec --experimental-json command and parsing a JSONL event stream.

The problem is that in the HagiCode project, we need to use it in a pure .NET environment - specifically in C# backend services and desktop applications. We could not reasonably introduce a Node.js runtime into a .NET project just to call a CLI tool. That would be far too cumbersome.

So we were left with two choices: maintain a complex Node.js bridge layer, or build a native C# SDK ourselves.

We chose the latter.

About HagiCode

This article also comes directly from our hands-on experience in the HagiCode project. HagiCode is an open-source AI coding assistant project. In plain terms, it means maintaining multiple components at once: a VSCode extension on the frontend, AI services on the backend, and a cross-platform desktop client. That multi-language, multi-platform complexity is exactly why we needed a native C# SDK - we really did not want to run Node.js inside a .NET project.

If you find this article helpful, feel free to give us a star on GitHub: github.com/HagiCode-org/site. You can also visit the official website to learn more: hagicode.com. It is always encouraging when an open-source project receives support.

Core Content

Architectural Design Comparison

Before translating code one-to-one, we first had to understand the architectural design of both SDKs. You have to understand both sides before you can port them well.

The core architecture of the TypeScript SDK looks like this:

Codex (entry class)
  └── CodexExec (executor, manages child processes)
      └── Thread (conversation thread)
          ├── run() / runStreamed() (buffered/streamed execution, both asynchronous)
          └── event stream parsing

The C# SDK keeps the same architectural layering, but adapts the implementation details. The overall idea is straightforward: preserve API consistency while fully leveraging C# language features in the implementation.

Type System Conversion

This is the most fundamental and also the most important part of the work. If the foundation is weak, everything that follows becomes harder.

TypeScript’s type system is more flexible than C#’s, and that is simply a fact. We needed to find an appropriate mapping strategy:

TypeScript	C#	Notes
interface / type	record	C# uses record for immutable data structures
string | null	string?	Nullable reference type
boolean | undefined	bool?	Nullable Boolean
AsyncGenerator	IAsyncEnumerable	Async iterator

The event type system is a typical example. TypeScript uses union types to define events:

export type ThreadEvent =
  | ThreadStartedEvent
  | TurnStartedEvent
  | TurnCompletedEvent
  | ...

In C#, we use an inheritance hierarchy and pattern matching to achieve a similar effect:

public abstract record ThreadEvent(string Type);

public sealed record ThreadStartedEvent(string ThreadId) : ThreadEvent("thread.started");
public sealed record TurnStartedEvent() : ThreadEvent("turn.started");
public sealed record TurnCompletedEvent(Usage Usage) : ThreadEvent("turn.completed");
// ...

We chose record instead of class because event objects should be immutable, which matches the intent behind using plain objects in TypeScript. The sealed keyword also prevents additional inheritance and gives the compiler room to optimize.
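
For comparison, this is roughly what the TypeScript side gets for free from union types: the compiler narrows on the type field and can check that every case is handled. The event shapes below are simplified assumptions for illustration (the usage field in particular is trimmed down), not the SDK's full definitions.

```typescript
// Simplified event shapes; the real SDK defines more events and fields.
type ThreadStarted = { type: "thread.started"; thread_id: string };
type TurnStarted = { type: "turn.started" };
type TurnCompleted = { type: "turn.completed"; usage: { input_tokens: number } };
type SketchEvent = ThreadStarted | TurnStarted | TurnCompleted;

function describeEvent(e: SketchEvent): string {
  switch (e.type) {
    case "thread.started":
      return `thread ${e.thread_id} started`; // narrowed to ThreadStarted
    case "turn.started":
      return "turn started";
    case "turn.completed":
      return `turn completed, input tokens: ${e.usage.input_tokens}`;
    default: {
      // Compile-time exhaustiveness check: adding a new event type without
      // a matching case makes this assignment fail to type-check.
      const unreachable: never = e;
      return unreachable;
    }
  }
}
```

C# has no direct equivalent of the never check, which is one reason the C# parser keeps an explicit fallback case for unknown events.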

Key Porting Points

1. Event parser

Event parsing is the core of the entire SDK, because it determines whether we can correctly understand every message returned by the Codex CLI. If parsing is wrong, everything after that is wasted effort.

The TypeScript version uses JSON.parse() to parse each line of JSON:

export function parseEvent(line: string): ThreadEvent {
  const data = JSON.parse(line);
  // Handle different event types...
}

The C# version uses System.Text.Json.JsonDocument instead:

public static ThreadEvent Parse(string line)
{
    using var document = JsonDocument.Parse(line);
    var root = document.RootElement;
    var type = GetRequiredString(root, "type", "event.type");

    return type switch
    {
        "thread.started" => new ThreadStartedEvent(GetRequiredString(root, "thread_id", ...)),
        "turn.started" => new TurnStartedEvent(),
        "turn.completed" => new TurnCompletedEvent(ParseUsage(...)),
        // ...
        _ => new UnknownThreadEvent(type, root.Clone()),
    };
}

There is one small but important trick here: root.Clone() is required, because elements from JsonDocument become invalid after the document is disposed. We need to retain a copy for unknown event types. That is simply one of the differences between C# JSON handling and JavaScript.
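
The TypeScript excerpt above elides the body of parseEvent, so here is a hedged sketch of what a parser with the same unknown-event fallback could look like. ParsedEvent, requireString, and parseEventLine are illustrative names rather than the SDK's actual code. Note that no Clone step is needed here, because JSON.parse returns plain objects that stay valid for as long as they are referenced.

```typescript
type ParsedEvent =
  | { type: "thread.started"; thread_id: string }
  | { type: "turn.started" }
  | { type: "unknown"; rawType: string; raw: unknown };

// Read a required string field or fail with a descriptive error.
function requireString(obj: Record<string, unknown>, key: string): string {
  const value = obj[key];
  if (typeof value !== "string") {
    throw new Error(`Missing or non-string field: ${key}`);
  }
  return value;
}

// Parse one JSONL line into a typed event, keeping unknown events intact.
function parseEventLine(line: string): ParsedEvent {
  const data: unknown = JSON.parse(line);
  if (typeof data !== "object" || data === null || Array.isArray(data)) {
    throw new Error("Event line must be a JSON object");
  }
  const obj = data as Record<string, unknown>;
  const eventType = requireString(obj, "type");
  switch (eventType) {
    case "thread.started":
      return { type: "thread.started", thread_id: requireString(obj, "thread_id") };
    case "turn.started":
      return { type: "turn.started" };
    default:
      // Preserve the raw payload so newer CLI versions do not break older clients.
      return { type: "unknown", rawType: eventType, raw: obj };
  }
}
```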

2. Process management differences

This is where the two SDKs differ the most. Node.js and .NET have different runtime conventions, so the implementation has to adapt.

TypeScript uses Node.js’s spawn() function:

const child = spawn(this.executablePath, commandArgs, { env, signal });

C# uses .NET’s System.Diagnostics.Process:

using var process = new Process { StartInfo = startInfo };
process.Start();

// stdin/stdout/stderr must be managed manually

More specifically, the C# version needs to configure the process like this:

var startInfo = new ProcessStartInfo
{
    FileName = _executablePath,
    RedirectStandardInput = true,
    RedirectStandardOutput = true,
    RedirectStandardError = true,
    UseShellExecute = false,
    CreateNoWindow = true,
};

The biggest difference is the cancellation mechanism. TypeScript uses AbortSignal, which is part of the Web API and very convenient to work with:

const child = spawn(cmd, args, { signal: cancellationSignal });

C# uses CancellationToken instead:

public async IAsyncEnumerable<string> RunAsync(
    CodexExecArgs args,
    [EnumeratorCancellation] CancellationToken cancellationToken = default)
{
    // Check cancellation status inside the loop
    while (!cancellationToken.IsCancellationRequested)
    {
        // Process output...
    }

    // Terminate the process when cancellation is requested
    if (cancellationToken.IsCancellationRequested)
    {
        try { process.Kill(entireProcessTree: true); } catch { }
    }
}

At a high level, this is just another example of the difference between the Web API ecosystem and the .NET ecosystem.

3. Preserving configuration serialization

Both SDKs implement the logic that converts JSON configuration into TOML configuration, because the Codex CLI accepts configuration overrides in TOML format. This part must remain completely consistent, otherwise the same configuration will behave differently in the two SDKs.

That is the kind of detail you cannot compromise on. Success or failure often comes down to details like this.
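
To show what "JSON configuration to TOML overrides" means in practice, here is a minimal TypeScript sketch that flattens a nested object into dotted key=value strings with TOML-style values. The function names and the sample keys are assumptions. It also assumes bare keys, finite numbers, and strings whose JSON escaping is also valid TOML, so it should not be read as the SDK's actual serializer.

```typescript
type ConfigValue = string | number | boolean | ConfigValue[] | { [key: string]: ConfigValue };

// Render a single value as a TOML literal.
function toTomlValue(value: ConfigValue): string {
  if (typeof value === "string") return JSON.stringify(value); // quoted, escaped string
  if (typeof value === "number" || typeof value === "boolean") return String(value);
  if (Array.isArray(value)) return `[${value.map(toTomlValue).join(", ")}]`;
  // Objects nested inside arrays become TOML inline tables.
  const entries = Object.entries(value).map(([k, v]) => `${k} = ${toTomlValue(v)}`);
  return `{${entries.join(", ")}}`;
}

// Flatten nested objects into dotted "path=value" override strings.
function flattenOverrides(config: { [key: string]: ConfigValue }, prefix = ""): string[] {
  const out: string[] = [];
  for (const [key, value] of Object.entries(config)) {
    const path = prefix ? `${prefix}.${key}` : key;
    if (typeof value === "object" && !Array.isArray(value)) {
      out.push(...flattenOverrides(value, path));
    } else {
      out.push(`${path}=${toTomlValue(value)}`);
    }
  }
  return out;
}

// Sample keys are hypothetical and only illustrate the output shape.
const flatOverrides = flattenOverrides({ model: "gpt-4.1", sandbox: { network_access: false }, retries: 3 });
console.log(flatOverrides);
```

Whatever the exact rules are, the C# port has to reproduce them byte for byte, which makes this a good candidate for shared test vectors between the two SDKs.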

Implementation Details

Project structure

We created the following project structure:

CodexSdk/
├── CodexSdk.csproj
├── Codex.cs           # Entry class
├── CodexThread.cs     # Conversation thread
├── CodexExec.cs       # Executor
├── Events.cs          # Event type definitions
├── Items.cs           # Item type definitions
├── EventParser.cs     # Event parser
├── OutputSchemaTempFile.cs  # Temporary file management
└── ...

It is a fairly clean structure, and that helped a lot during the port.

Usage example

The basic usage remains consistent with the TypeScript SDK:

using CodexSdk;

// Create a Codex instance
var codex = new Codex();
var thread = codex.StartThread();

// Execute a query
var result = await thread.RunAsync("Summarize this repository.");
Console.WriteLine(result.FinalResponse);

Streaming event handling takes advantage of C# pattern matching:

await foreach (var @event in thread.RunStreamedAsync("Analyze the code."))
{
    switch (@event)
    {
        case ItemCompletedEvent itemCompleted
            when itemCompleted.Item is AgentMessageItem msg:
            Console.WriteLine($"Assistant: {msg.Text}");
            break;
        case TurnCompletedEvent completed:
            Console.WriteLine($"Tokens: in={completed.Usage.InputTokens}");
            break;
        // Items arrive wrapped in item events, so match on the Item property
        case ItemCompletedEvent { Item: CommandExecutionItem command }:
            Console.WriteLine($"Command: {command.Command}");
            break;
    }
}

Notes

During implementation, we collected several practical lessons:

Process management: The C# version must manage the full process lifecycle manually, including process termination during cancellation. Use Kill(entireProcessTree: true) to make sure child processes are also cleaned up.
Error handling: We use InvalidOperationException to throw parsing errors, keeping the error handling style similar to the TypeScript SDK.
Resource cleanup: OutputSchemaTempFile implements IAsyncDisposable to ensure temporary files are cleaned up correctly.
Environment variables: The C# version supports fully overriding process environment variables through CodexOptions.Env. It is a small feature, but a very practical one.
Platform differences: The C# version does not include the TypeScript version’s logic for automatically locating binaries inside npm packages. Since .NET projects typically do not depend on npm, the path to the codex executable must be specified via the CODEX_EXECUTABLE environment variable or CodexPathOverride.
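
The executable lookup described in the last note can be sketched as a small resolution function. This TypeScript version is only an illustration: the precedence (explicit override first, then CODEX_EXECUTABLE) is an assumption, and resolveCodexExecutable is a hypothetical name.

```typescript
// Resolve the codex executable: explicit override first, then CODEX_EXECUTABLE.
function resolveCodexExecutable(
  pathOverride: string | undefined,
  env: Record<string, string | undefined>,
): string {
  const candidate = pathOverride?.trim() || env["CODEX_EXECUTABLE"]?.trim();
  if (!candidate) {
    // Fail early with an actionable message instead of failing at process start.
    throw new Error("codex executable not configured: set CodexPathOverride or CODEX_EXECUTABLE");
  }
  return candidate;
}
```

Passing the environment in as a parameter, rather than reading it globally, keeps the function easy to test.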

Conclusion

Porting a mature TypeScript SDK to C# is not just a matter of syntax conversion - it also requires understanding the design philosophies of both languages. TypeScript’s flexibility and JavaScript ecosystem features such as AbortSignal need appropriate counterparts in C#.

The key takeaway is this: maintaining API consistency matters more than maintaining implementation-level consistency. Users care about whether the interface is easy to use, not whether the internal implementation is identical. That sounds simple, but making those trade-offs takes judgment.

If you are working on a similar cross-language port, our advice is to understand the original SDK architecture fully first, then translate it module by module, and finally use a complete test suite to verify behavioral consistency. This kind of work cannot be rushed.

Everything will work out in the end.

References

Official TypeScript SDK: github.com/openai/codex
C# SDK source code: github.com/HagiCode-org/site/tree/main/repos/playground/CodexDotnet
Official Codex documentation: codex.docs.anysphere.co

If this article helped you:

Give us a star on GitHub: github.com/HagiCode-org/site
Visit the official website to learn more: hagicode.com
Watch the 30-minute live demo: www.bilibili.com/video/BV1pirZBuEzq/
Try one-click installation: docs.hagicode.com/installation/docker-compose
Quick install for the Desktop client: hagicode.com/desktop/
The public beta has started, and you are welcome to try it

This content was created with AI-assisted collaboration, reviewed by the author, and reflects the author’s own views and position.

Author: newbe36524
Article link: https://docs.hagicode.com/blog/2026-03-07-codex-sdk-typescript-to-csharp-porting-guide/
Copyright notice: Unless otherwise stated, all articles on this blog are licensed under BY-NC-SA. Please include the source when reposting.