Full GLM-5.1 Support and Gemini CLI Integration: HagiCode's Path of Multi-Model Evolution

This article introduces two major recent updates to the HagiCode platform: full support for the Zhipu AI GLM-5.1 model and the successful integration of Gemini CLI as the tenth Agent CLI. Together, these updates further strengthen the platform’s multi-model capabilities and multi-CLI ecosystem.

Time really does move fast. New large language models have been springing up like bamboo shoots after rain. Not long ago, we were still cheering for “an AI that can write code.” Now we are already in an era of multi-model collaboration and multi-tool integration. Is that exciting? Perhaps. After all, what developers need has never been just the tool itself, but the ease of adapting to different scenarios and switching flexibly when needed.

As an AI-assisted coding platform, HagiCode has recently welcomed two important developments: first, the full integration of Zhipu AI’s GLM-5.1 model; second, the official addition of Gemini CLI as the tenth supported Agent CLI. These two updates may not sound earth-shaking, but they are unquestionably good news for the platform’s continued maturation.

GLM-5.1 is Zhipu AI’s latest flagship model. Compared with GLM-5.0, it offers stronger reasoning, deeper code understanding, and smoother tool calling. More importantly, it is the first GLM model in HagiCode’s catalog with image input enabled. What does that mean? It means users can let the AI look directly at a screenshot instead of struggling to describe the problem in words. Once you’ve used that convenience, you immediately understand its value.

At the same time, through the HagiCode.Libs.Providers architecture, HagiCode successfully integrated Gemini CLI into the platform. This is now the tenth Agent CLI. To be honest, getting to this point does bring a modest sense of accomplishment.

It is also worth mentioning that HagiCode’s image upload feature lets users communicate with AI directly through screenshots. Even when running GLM 4.7, the platform still works well and has already helped complete many important build tasks. As for GLM-5.1, naturally, it goes one step further.

The approach shared in this article comes from our practical experience in the HagiCode project. HagiCode is an open-source AI-assisted coding platform designed to provide developers with a flexible and powerful AI programming assistant through a multi-model, multi-CLI architecture. Project repository: github.com/HagiCode-org/site

One of HagiCode’s core strengths is its support for multiple AI programming CLI tools through a unified abstraction layer. The advantage of this design is actually quite simple: new tools can come in, old tools can stay, and the codebase does not turn into chaos. To be fair, that is how everyone would like life to work.

The platform defines supported CLI provider types through the AIProviderType enum:

public enum AIProviderType
{
    ClaudeCodeCli = 0,  // Claude Code CLI
    CodexCli = 1,       // Codex CLI
    GitHubCopilot = 2,  // GitHub Copilot
    CodebuddyCli = 3,   // Codebuddy CLI
    OpenCodeCli = 4,    // OpenCode CLI
    IFlowCli = 5,       // IFlow CLI
    HermesCli = 6,      // Hermes CLI
    QoderCli = 7,       // Qoder CLI
    KiroCli = 8,        // Kiro CLI
    KimiCli = 9,        // Kimi CLI
    GeminiCli = 10,     // Gemini CLI (new)
}

As you can see, Gemini CLI joins this family with the enum value 10. Each CLI has its own distinct characteristics and usage scenarios, so users can choose flexibly based on their needs. After all, many roads lead to Rome; some are simply easier than others.

HagiCode.Libs.Providers provides a unified Provider interface that makes each CLI integration standardized and concise. Taking Gemini CLI as an example:

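public class GeminiProvider : ICliProvider<GeminiOptions>
{
    // Excerpt: the constructor, the _executableResolver field, and the remaining
    // ICliProvider members are omitted.
    private static readonly string[] DefaultExecutableCandidates = ["gemini", "gemini-cli"];
    private const string ManagedBootstrapArgument = "--acp";

    public string Name => "gemini";

    // Available when at least one candidate executable can be resolved.
    public bool IsAvailable => _executableResolver.ResolveFirstAvailablePath(DefaultExecutableCandidates) is not null;
}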

The benefits of this design are:

  • Integrating a new CLI only requires implementing one Provider class
  • Unified lifecycle management and session pooling
  • Automated alias resolution and executable discovery

Put plainly, this design turns complicated things into simpler ones and makes life a bit easier.

The Provider Registry automatically handles alias mapping and registration:

if (provider is GeminiProvider)
{
    registry.Register(provider.Name, provider, ["gemini-cli"]);
    continue;
}

This means users can invoke Gemini CLI with either gemini or gemini-cli, and the system will recognize it automatically. It is like a friend with both a formal name and a nickname - either way, people know who you mean.
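
To make the alias idea concrete, here is a minimal TypeScript sketch of a registry that stores one provider under its canonical name plus any aliases. It is not the actual HagiCode.Libs.Providers code: `ProviderRegistry`, `CliProvider`, and the case-insensitive lookup are assumptions made purely for illustration.

```typescript
// Hypothetical types: not the real HagiCode.Libs.Providers API.
interface CliProvider {
  readonly name: string;
}

class ProviderRegistry {
  // Prototype-free dictionary so keys such as "constructor" cannot collide.
  private readonly byKey: Record<string, CliProvider> = Object.create(null);

  // Register a provider under its canonical name and every alias.
  register(name: string, provider: CliProvider, aliases: string[] = []): void {
    for (const key of [name, ...aliases]) {
      const normalized = key.trim().toLowerCase();
      if (normalized in this.byKey) {
        throw new Error(`Provider key already registered: ${normalized}`);
      }
      this.byKey[normalized] = provider;
    }
  }

  // Resolve either the canonical name or an alias (assumed case-insensitive).
  resolve(key: string): CliProvider | undefined {
    return this.byKey[key.trim().toLowerCase()];
  }
}

const providerRegistry = new ProviderRegistry();
const geminiProvider: CliProvider = { name: "gemini" };
providerRegistry.register(geminiProvider.name, geminiProvider, ["gemini-cli"]);
```

With this shape, `providerRegistry.resolve("gemini")` and `providerRegistry.resolve("gemini-cli")` return the same object, which is the formal-name-and-nickname behavior described above.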

GLM-5.1 is Zhipu AI’s latest flagship model, and HagiCode has completed full support for it.

HagiCode manages all supported models through the Secondary Professions Catalog. Here is the configuration for the GLM series:

| Model ID | Name | SupportsImage | Compatible CLI Families |
| --- | --- | --- | --- |
| glm-4.7 | GLM 4.7 | - | claude, codebuddy, hermes, qoder, kiro |
| glm-5 | GLM 5 | - | claude, codebuddy, hermes, qoder, kiro |
| glm-5-turbo | GLM 5 Turbo | - | claude, codebuddy, hermes, qoder, kiro |
| glm-5.0 | GLM 5.0 (Legacy) | - | claude, codebuddy, hermes, qoder, kiro |
| glm-5.1 | GLM 5.1 | true | claude, codebuddy, hermes, qoder, kiro |

The key characteristics of GLM-5.1 can be summarized as follows:

  • A standalone version identifier with no legacy baggage
  • The first GLM model in HagiCode’s catalog with image input enabled
  • Stronger reasoning and code understanding
  • Broad multi-CLI compatibility

At the code level, the key difference between GLM-5.1 and GLM-5.0 is shown here:

// GLM-5.0 (Legacy) - contains special retention logic
private const string Glm50CodebuddySecondaryProfessionId = "secondary-glm-5-codebuddy";
private const string Glm50CodebuddyModelValue = "glm-5.0";
// GLM-5.1 - standalone new model identifier
private const string Glm51SecondaryProfessionId = "secondary-glm-5-1";
private const string Glm51ModelValue = "glm-5.1";

GLM-5.0 carries the “Legacy” label because it is an old version identifier retained for backward compatibility. GLM-5.1, by contrast, is a brand-new standalone version with no historical burden. Some things stay in the past; others travel lighter and move faster.

Here is a configuration example for using GLM-5.1 in HagiCode:

{
"primaryProfessionId": "profession-claude-code",
"secondaryProfessionId": "secondary-glm-5-1",
"model": "glm-5.1",
"reasoning": "high"
}

HagiCode’s image support is implemented through the SupportsImage property on SecondaryProfession:

public class HeroSecondaryProfessionSettingDto
{
public bool SupportsImage { get; set; }
}

In the Secondary Professions Catalog, the GLM-5.1 configuration looks like this:

{
"id": "secondary-glm-5-1",
"supportsImage": true
}

This means users can upload screenshots directly for AI analysis, such as:

  • Screenshots of error messages
  • Problems in a UI screen
  • Data visualization charts
  • Code execution results

There is no longer any need to describe everything manually - just upload the screenshot. The convenience of this feature is obvious once you have used it. Sometimes one look says more than a long explanation.
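
As a rough illustration of how a client could gate the upload button on this flag, the sketch below checks a small in-memory copy of the GLM catalog. The `CatalogEntry` shape and the `canUploadImage` helper are hypothetical, and treating the "-" entries in the table as `false` is an assumption.

```typescript
// Hypothetical, simplified view of a Secondary Professions Catalog entry.
interface CatalogEntry {
  modelId: string;
  supportsImage: boolean;
  compatibleFamilies: string[];
}

const glmFamilies = ["claude", "codebuddy", "hermes", "qoder", "kiro"];

// Values copied from the GLM table in this article ("-" is assumed to mean false).
const glmCatalog: CatalogEntry[] = [
  { modelId: "glm-4.7", supportsImage: false, compatibleFamilies: glmFamilies },
  { modelId: "glm-5", supportsImage: false, compatibleFamilies: glmFamilies },
  { modelId: "glm-5-turbo", supportsImage: false, compatibleFamilies: glmFamilies },
  { modelId: "glm-5.0", supportsImage: false, compatibleFamilies: glmFamilies },
  { modelId: "glm-5.1", supportsImage: true, compatibleFamilies: glmFamilies },
];

// Image upload is allowed only when the model is listed for the CLI family
// and its catalog entry sets supportsImage.
function canUploadImage(modelId: string, cliFamily: string): boolean {
  for (const entry of glmCatalog) {
    if (entry.modelId === modelId) {
      return entry.supportsImage && entry.compatibleFamilies.indexOf(cliFamily) >= 0;
    }
  }
  return false; // Unknown models are treated as not image-capable.
}
```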

As the tenth Agent CLI, Gemini CLI is integrated into HagiCode through the standard Provider architecture.

Gemini CLI supports a rich set of configuration options:

public class GeminiOptions
{
public string? ExecutablePath { get; set; }
public string? WorkingDirectory { get; set; }
public string? SessionId { get; set; }
public string? Model { get; set; }
public string? AuthenticationMethod { get; set; }
public string? AuthenticationToken { get; set; }
public Dictionary<string, string?> AuthenticationInfo { get; set; }
public Dictionary<string, string?> EnvironmentVariables { get; set; }
public string[] ExtraArguments { get; set; }
public TimeSpan? StartupTimeout { get; set; }
public CliPoolSettings? PoolSettings { get; set; }
}

These options cover everything from basic setup to advanced features, giving users the flexibility to configure the CLI around their own needs. Everyone’s workflow is different, so a little flexibility is always welcome.

Gemini CLI supports ACP (Agent Client Protocol), which HagiCode uses as its unified CLI communication standard. Through ACP, different CLIs can interact with the platform in a consistent way, greatly simplifying integration work. In short, it standardizes the complicated parts so everyone can work more easily.

To use Zhipu AI models, you need to configure the corresponding environment variables.

Zhipu AI (ZAI):

export ANTHROPIC_AUTH_TOKEN="***"
export ANTHROPIC_BASE_URL="https://open.bigmodel.cn/api/anthropic"

Alibaba Cloud DashScope:

export ANTHROPIC_AUTH_TOKEN="your-a...-key"
export ANTHROPIC_BASE_URL="https://coding.dashscope.aliyuncs.com/apps/anthropic"

Once configured, HagiCode can call the GLM-5.1 model normally. It is neither especially hard nor especially easy - you just need to follow the setup as intended.

Speaking of real-world practice, the best example is the HagiCode platform’s own build workflow. HagiCode’s development process has already made full use of AI capabilities.

HagiCode’s platform design is well optimized, so it can still provide a good development experience even with GLM 4.7. The platform has already helped complete multiple important build projects, including:

  • Integration of multiple CLI Providers
  • Implementation of the image upload feature
  • Documentation generation and content publishing

That is actually a good thing. Not everyone needs the newest thing all the time. What suits you best is often what matters most.

After upgrading to GLM-5.1, these capabilities become even stronger:

  • Stronger code understanding, reducing back-and-forth communication
  • More accurate dependency analysis, pointing in the right direction immediately
  • More efficient error diagnosis, locating issues faster
  • Support for image input, accelerating problem descriptions

It is like switching from a bicycle to a car. You can still reach the same destination, but the speed and comfort are not the same.

HagiCode.Libs.Providers provides a unified mechanism for registration and usage:

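using Microsoft.Extensions.DependencyInjection;
var services = new ServiceCollection();
services.AddHagiCodeLibs();
using var serviceProvider = services.BuildServiceProvider();
var gemini = serviceProvider.GetRequiredService<ICliProvider<GeminiOptions>>();
var codebuddy = serviceProvider.GetRequiredService<ICliProvider<CodebuddyOptions>>();
var hermes = serviceProvider.GetRequiredService<ICliProvider<HermesOptions>>();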

This dependency injection design keeps usage across different CLIs very concise and also makes unit testing and mocking more convenient. Clean code is a way of being responsible to yourself.

There are a few things to keep in mind in actual use:

  1. API key configuration: Make sure ANTHROPIC_AUTH_TOKEN is set correctly, or the model cannot be called
  2. Model availability: GLM-5.1 needs to be enabled by the corresponding model provider
  3. Image feature: Only models with supportsImage: true can use image upload
  4. CLI installation: Before using Gemini CLI, make sure gemini or gemini-cli is in the system PATH

These may be small details, but small details handled poorly can turn into big problems, so they are worth paying attention to.
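
The locally checkable items on that list can be turned into a small preflight function. The sketch below is only an illustration: `PreflightInput` and `preflight` are hypothetical names, and model availability (item 2) is left out because it depends on the provider account rather than on local state.

```typescript
// Hypothetical preflight input; none of these names come from HagiCode itself.
interface PreflightInput {
  env: Record<string, string | undefined>;
  modelSupportsImage: boolean; // the catalog's supportsImage flag
  wantsImageUpload: boolean;
  needsGeminiCli: boolean;
  executablesOnPath: string[]; // executable names already discovered on PATH
}

// Returns a list of human-readable problems; an empty list means "good to go".
function preflight(input: PreflightInput): string[] {
  const problems: string[] = [];

  const token = input.env["ANTHROPIC_AUTH_TOKEN"];
  if (token === undefined || token.trim() === "") {
    problems.push("ANTHROPIC_AUTH_TOKEN is not set");
  }

  if (input.wantsImageUpload && !input.modelSupportsImage) {
    problems.push("selected model does not have supportsImage: true");
  }

  if (input.needsGeminiCli) {
    const found =
      input.executablesOnPath.indexOf("gemini") >= 0 ||
      input.executablesOnPath.indexOf("gemini-cli") >= 0;
    if (!found) {
      problems.push("neither gemini nor gemini-cli was found on PATH");
    }
  }

  return problems;
}
```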

With full support for GLM-5.1 and the successful integration of Gemini CLI, HagiCode further strengthens its capabilities as a multi-model, multi-CLI AI programming platform. These updates not only give users more choices, but also demonstrate HagiCode’s forward-looking architecture and scalability.

GLM-5.1’s image support, combined with HagiCode’s screenshot upload feature, makes it possible to let the AI “understand from the image” and greatly reduces the cost of describing problems. And with support for ten CLIs, users can flexibly choose the AI programming assistant that best fits their preferences and scenarios. More choice is almost always a good thing.

Most importantly, HagiCode’s own build practice proves that the platform can already run well and complete complex tasks even with GLM 4.7, while upgrading to GLM-5.1 can further improve development efficiency. Life is often like that too: you do not always need the absolute best, only what suits you. Of course, if what suits you can become even better, then so much the better.

If you are interested in a multi-model, multi-CLI AI programming platform, give HagiCode a try - open source, free, and still evolving. Trying it costs nothing, and it may turn out to be exactly what you need.


If this article helped you:

Thank you for reading. If you found this article useful, feel free to like, bookmark, and share it to show your support. This content was created with AI-assisted collaboration, and the final content was reviewed and confirmed by the author.

HagiCode and GLM-5.1 Multi-CLI Integration Guide

In the HagiCode project, users can choose from multiple CLI tools to drive AI programming assistants, including Claude Code CLI, GitHub Copilot, OpenCode CLI, Codebuddy CLI, Hermes CLI, and more. These CLI tools are general-purpose AI programming tools on their own, but through HagiCode’s abstraction layer, they can flexibly connect to different AI model providers.

Zhipu AI (ZAI) provides an interface compatible with the Anthropic Claude API, allowing these CLI tools to directly use the China-developed GLM series models. Among them, GLM-5.1 is Zhipu’s latest large language model release, with significant improvements over GLM-5.0.

HagiCode defines 11 CLI provider types through the AIProviderType enum, covering mainstream AI programming CLI tools:

public enum AIProviderType
{
    ClaudeCodeCli = 0,  // Claude Code CLI
    CodexCli = 1,       // Codex CLI
    GitHubCopilot = 2,  // GitHub Copilot
    CodebuddyCli = 3,   // Codebuddy CLI
    OpenCodeCli = 4,    // OpenCode CLI
    IFlowCli = 5,       // IFlow CLI
    HermesCli = 6,      // Hermes CLI
    QoderCli = 7,       // Qoder CLI
    KiroCli = 8,        // Kiro CLI
    KimiCli = 9,        // Kimi CLI
    GeminiCli = 10,     // Gemini CLI
}

Each CLI has a corresponding set of managed model parameters. Most support both the model and reasoning parameters, while Gemini CLI supports only model:

private static readonly IReadOnlyDictionary<AIProviderType, IReadOnlyList<string>> ManagedModelParameterKeysByProvider =
    new Dictionary<AIProviderType, IReadOnlyList<string>>
    {
        [AIProviderType.ClaudeCodeCli] = ["model", "reasoning"],
        [AIProviderType.CodexCli] = ["model", "reasoning"],
        [AIProviderType.OpenCodeCli] = ["model", "reasoning"],
        [AIProviderType.HermesCli] = ["model", "reasoning"],
        [AIProviderType.CodebuddyCli] = ["model", "reasoning"],
        [AIProviderType.QoderCli] = ["model", "reasoning"],
        [AIProviderType.KiroCli] = ["model", "reasoning"],
        [AIProviderType.GeminiCli] = ["model"], // Gemini does not support the reasoning parameter
        // ...
    };
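
One way such a table might be used is to drop any parameter a provider does not manage before the configuration reaches the CLI. The TypeScript sketch below shows that idea for the providers listed above; `filterManagedParameters` is a hypothetical helper, not HagiCode's actual code path.

```typescript
// Provider names mirror the AIProviderType entries listed in the table above.
type ProviderType =
  | "ClaudeCodeCli" | "CodexCli" | "OpenCodeCli" | "HermesCli"
  | "CodebuddyCli" | "QoderCli" | "KiroCli" | "GeminiCli";

const managedKeysByProvider: Record<ProviderType, readonly string[]> = {
  ClaudeCodeCli: ["model", "reasoning"],
  CodexCli: ["model", "reasoning"],
  OpenCodeCli: ["model", "reasoning"],
  HermesCli: ["model", "reasoning"],
  CodebuddyCli: ["model", "reasoning"],
  QoderCli: ["model", "reasoning"],
  KiroCli: ["model", "reasoning"],
  GeminiCli: ["model"], // no reasoning parameter
};

// Hypothetical helper: keep only the parameters the provider manages.
function filterManagedParameters(
  provider: ProviderType,
  parameters: Record<string, string>,
): Record<string, string> {
  const result: Record<string, string> = {};
  for (const key of managedKeysByProvider[provider]) {
    const value = parameters[key];
    if (value !== undefined) {
      result[key] = value;
    }
  }
  return result;
}
```

Passing `{ model, reasoning }` for `GeminiCli` returns only `model`, which matches the note that Gemini ignores the reasoning setting.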

HagiCode’s Secondary Professions Catalog defines complete support for the GLM model series:

| Model ID | Name | Default Reasoning | Compatible CLI Families |
| --- | --- | --- | --- |
| glm-4.7 | GLM 4.7 | high | claude, codebuddy, hermes, qoder, kiro |
| glm-5 | GLM 5 | high | claude, codebuddy, hermes, qoder, kiro |
| glm-5-turbo | GLM 5 Turbo | high | claude, codebuddy, hermes, qoder, kiro |
| glm-5.0 | GLM 5.0 (Legacy) | high | claude, codebuddy, hermes, qoder, kiro |
| glm-5.1 | GLM 5.1 | high | claude, codebuddy, hermes, qoder, kiro |

Key differences between GLM-5.1 and GLM-5.0

From the implementation in AcpSessionModelBootstrapper.cs, we can clearly see the differences between GLM-5.1 and GLM-5.0:

GLM-5.1 is a standalone new model identifier with no legacy handling logic:

private const string Glm51ModelValue = "glm-5.1";

Definition in the Secondary Professions Catalog:

{
"id": "secondary-glm-5-1",
"name": "GLM 5.1",
"family": "anthropic",
"summary": "hero.professionCopy.secondary.glm51.summary",
"sourceLabel": "hero.professionCopy.sources.aiSharedAnthropicModel",
"sortOrder": 64,
"supportsImage": true,
"compatiblePrimaryFamilies": [
"claude",
"codebuddy",
"hermes",
"qoder",
"kiro"
],
"defaultParameters": {
"model": "glm-5.1",
"reasoning": "high"
}
}

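Zhipu AI provides the most complete GLM model support. In the provider definition below, the name field reads “Zhipu AI” and the description field reads “Claude API-compatible service provided by Zhipu AI”: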

{
"providerId": "zai",
"name": "智谱 AI",
"description": "智谱 AI 提供的 Claude API 兼容服务",
"category": "china-providers",
"apiUrl": {
"codingPlanForAnthropic": "https://open.bigmodel.cn/api/anthropic"
},
"recommended": true,
"region": "cn",
"defaultModels": {
"sonnet": "glm-4.7",
"opus": "glm-5",
"haiku": "glm-4.5-air"
},
"supportedModels": [
"glm-4.7",
"glm-5",
"glm-4.5-air",
"qwen3-coder-next",
"qwen3-coder-plus"
],
"features": ["experimental-agent-teams"],
"authTokenEnv": "ANTHROPIC_AUTH_TOKEN",
"referralUrl": "https://www.bigmodel.cn/claude-code?ic=14BY54APZA",
"documentationUrl": "https://open.bigmodel.cn/dev/api"
}

Features:

  • Supports the widest variety of GLM model variants
  • Provides default mapping across the Sonnet/Opus/Haiku tiers
  • Supports the experimental-agent-teams feature

Claude Code CLI is one of HagiCode’s core CLIs and is configured through the Hero configuration system:

{
"primaryProfessionId": "profession-claude-code",
"secondaryProfessionId": "secondary-glm-5-1",
"model": "glm-5.1",
"reasoning": "high"
}

Corresponding HeroEquipmentCatalogItem configuration:

{
id: 'secondary-glm-5-1',
name: 'GLM 5.1',
family: 'anthropic',
kind: 'model',
primaryFamily: 'claude',
compatiblePrimaryFamilies: ['claude', 'codebuddy', 'hermes', 'qoder', 'kiro'],
defaultParameters: {
model: 'glm-5.1',
reasoning: 'high'
}
}

OpenCode CLI is the most flexible CLI and supports specifying any model in the provider/model format:

Method 1: Use the ZAI provider prefix

{
"primaryProfessionId": "profession-opencode",
"model": "zai/glm-5.1",
"reasoning": "high"
}

Method 2: Use the model ID directly

{
"model": "glm-5.1"
}

Method 3: Frontend configuration UI

In HeroModelEquipmentForm.tsx, OpenCode CLI has a dedicated placeholder hint:

const OPEN_CODE_MODEL_PLACEHOLDER = 'myprovider/glm-4.7';
const modelPlaceholder = primaryProviderType === PCode_Models_AIProviderType.OPEN_CODE_CLI
? OPEN_CODE_MODEL_PLACEHOLDER
: 'gpt-5.4';

Users can enter:

zai/glm-5.1
glm-5.1

OpenCode CLI model parsing logic:

internal OpenCodeModelSelection? ResolveModelSelection(string? rawModel)
{
    var normalized = NormalizeOptionalValue(rawModel);
    if (normalized == null) return null;

    var slashIndex = normalized.IndexOf('/');
    if (slashIndex < 0)
    {
        // No slash: use the model ID directly
        return new OpenCodeModelSelection
        {
            ProviderId = string.Empty,
            ModelId = normalized,
        };
    }

    // Slash exists: parse the provider/model format
    var providerId = normalized[..slashIndex].Trim();
    var modelId = normalized[(slashIndex + 1)..].Trim();
    return new OpenCodeModelSelection
    {
        ProviderId = providerId,
        ModelId = modelId,
    };
}
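
For readers who want to experiment with the parsing rule outside the C# codebase, here is a TypeScript sketch of the same logic. It assumes that `NormalizeOptionalValue` trims its input and maps blank values to null, which the excerpt above does not show.

```typescript
interface OpenCodeModelSelection {
  providerId: string;
  modelId: string;
}

// Sketch of the provider/model parsing rule; splits on the FIRST slash only.
function resolveModelSelection(rawModel?: string | null): OpenCodeModelSelection | null {
  // Assumption: normalization means "trim, and treat blank input as missing".
  const normalized = rawModel?.trim() ?? "";
  if (normalized === "") {
    return null;
  }

  const slashIndex = normalized.indexOf("/");
  if (slashIndex < 0) {
    // No slash: the whole value is the model ID and the provider is left empty.
    return { providerId: "", modelId: normalized };
  }

  return {
    providerId: normalized.slice(0, slashIndex).trim(),
    modelId: normalized.slice(slashIndex + 1).trim(),
  };
}
```

Because only the first slash is significant, an input such as `a/b/c` yields provider `a` and model `b/c`.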

Codebuddy CLI has dedicated legacy handling logic:

{
"primaryProfessionId": "profession-codebuddy",
"model": "glm-5.1",
"reasoning": "high"
}

Note: for Codebuddy, the legacy glm-5.0 identifier is kept as-is; for every other provider, it is normalized to glm-5-turbo:

return !string.Equals(providerName, "CodebuddyCli", StringComparison.OrdinalIgnoreCase)
    && string.Equals(normalizedModel, LegacyGlm5TurboModelValue, StringComparison.OrdinalIgnoreCase)
        ? Glm5TurboModelValue
        : normalizedModel;
// For CodebuddyCli, glm-5.0 is not normalized to glm-5-turbo
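
The same rule can be sketched in TypeScript. The constant values are assumptions inferred from the comment above (legacy value `glm-5.0`, normalized value `glm-5-turbo`); the C# excerpt does not show them.

```typescript
// Assumed values, inferred from the comment in the excerpt above.
const LEGACY_GLM5_TURBO_MODEL = "glm-5.0";
const GLM5_TURBO_MODEL = "glm-5-turbo";

// Every provider except CodebuddyCli maps the legacy identifier to glm-5-turbo.
function normalizeLegacyModel(providerName: string, model: string): string {
  const isCodebuddy = providerName.toLowerCase() === "codebuddycli";
  const isLegacy = model.toLowerCase() === LEGACY_GLM5_TURBO_MODEL;
  return !isCodebuddy && isLegacy ? GLM5_TURBO_MODEL : model;
}
```

So `normalizeLegacyModel("ClaudeCodeCli", "glm-5.0")` yields `glm-5-turbo`, while the same model under `CodebuddyCli` passes through unchanged.
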

Zhipu AI (ZAI):

# Set the API key
export ANTHROPIC_AUTH_TOKEN="***"
# Optional: specify the API endpoint (ZAI uses this endpoint by default)
export ANTHROPIC_BASE_URL="https://open.bigmodel.cn/api/anthropic"

Alibaba Cloud DashScope:

# Set the API key
export ANTHROPIC_AUTH_TOKEN="your-a...-key"
# Specify the Alibaba Cloud endpoint
export ANTHROPIC_BASE_URL="https://coding.dashscope.aliyuncs.com/apps/anthropic"

According to Zhipu’s official release information, GLM-5.1 brings the following significant improvements over GLM-5.0:

  • Stronger code understanding: More accurate analysis of complex code structures
  • Longer context comprehension: Supports longer conversational context
  • Enhanced tool calling: Higher success rate for MCP tool calls
  • Output stability: Reduces randomness and hallucinations

GLM-5.1 is compatible with the following CLI families supported by HagiCode:

compatiblePrimaryFamilies: [
"claude", // Claude Code CLI
"codebuddy", // Codebuddy CLI
"hermes", // Hermes CLI
"qoder", // Qoder CLI
"kiro" // Kiro CLI
]

Make sure the ANTHROPIC_AUTH_TOKEN environment variable is set correctly. It is the required credential for every CLI to connect to the model.

GLM-5.1 needs to be enabled by the corresponding model provider:

  • The Zhipu AI ZAI platform supports it by default
  • Alibaba Cloud DashScope may require a separate application

When using the provider/model format, make sure the provider ID is correct:

  • Zhipu AI: zai or zhipuai
  • Alibaba Cloud: aliyun or dashscope

When setting the reasoning level:

  • high is recommended for the best code generation results
  • Gemini CLI does not support the reasoning parameter and will ignore this configuration automatically

Through a unified abstraction layer, HagiCode enables flexible integration between GLM-5.1 and multiple CLIs. Developers can choose the CLI tool that best fits their preferences and usage scenarios, then use the latest GLM-5.1 model through simple configuration.

As Zhipu’s latest model version, GLM-5.1 offers clear improvements over GLM-5.0:

  • An independent version identifier with no legacy burden
  • Stronger reasoning and code understanding
  • Broad multi-CLI compatibility
  • Flexible reasoning level configuration

With the correct environment variables and Hero equipment configured, users can fully unlock the power of GLM-5.1 across different CLI environments.

If you want to put GLM-5.1, multi-CLI orchestration, and HagiCode’s configuration model into real use, these are the fastest entry points:

Once you compare Kimi, Claude Code, OpenCode, and other CLIs inside the same abstraction layer, questions about model switching, parameter mapping, and engineering boundaries tend to become much easier to reason about.

HagiCode Desktop Hybrid Distribution Architecture Explained: How P2P Accelerates Large File Downloads

I held this article back for a long time before finally writing it, and I am still not sure whether it reads well. Technical writing is easy enough to produce, but hard to make truly engaging. Then again, I am no great literary master, so I might as well just set down this plain explanation.

Teams building desktop applications will all run into the same headache sooner or later: how do you distribute large files?

It is an awkward problem. Traditional HTTP/HTTPS direct downloads can still hold up when files are small and the number of users is limited. But time is rarely kind. As a project keeps growing, the installation packages grow with it: Desktop ZIP packages, portable packages, web deployment archives, and more. Then the issues start to surface:

  • Download speed is limited by origin bandwidth: no matter how much bandwidth a single server has, it still struggles when everyone downloads at once.
  • Resume support is often missing: unless both the client and the server handle HTTP Range requests, an interrupted download has to start over from the beginning. That wastes both time and bandwidth.
  • The origin server takes all the pressure: all traffic flows back to a central server, bandwidth costs keep rising, and scalability becomes a real problem.

The HagiCode Desktop project was no exception. When we designed the distribution system, we kept asking ourselves: can we introduce a hybrid distribution approach without changing the existing index.json control plane? In other words, can we use the distributed nature of P2P networks to accelerate downloads while still keeping HTTP origin fallback so the system remains usable in constrained environments such as enterprise networks?

The impact of that decision turned out to be larger than you might expect. Let us walk through it step by step.

The approach shared in this article comes from our real-world experience in the HagiCode project. HagiCode is an open-source AI coding assistant project focused on helping development teams improve engineering efficiency. The project spans multiple subsystems, including the frontend, backend, desktop launcher, documentation, build pipeline, and server deployment.

The Desktop hybrid distribution architecture is exactly the kind of solution HagiCode refined through real operational experience and repeated optimization. If this design proves useful, then perhaps it also shows that HagiCode itself is worth paying attention to.

The project’s GitHub repository is HagiCode-org/site. If it interests you, feel free to give it a Star and save it for later.

Core Design Philosophy: P2P First, HTTP Fallback

At its heart, the hybrid distribution model can be summarized in a single sentence: P2P first, HTTP fallback.

The key lies in the word “hybrid.” This is not about simply adding BitTorrent and calling it a day. The point is to make the two delivery methods work together and complement each other:

  • The P2P network provides distributed acceleration. The more people download, the more peers join, and the faster the transfer becomes.
  • WebSeed/HTTP fallback guarantees availability, so downloads can still work behind enterprise firewalls and in internal network environments.
  • The control plane remains simple. We do not change the core logic of index.json; we only add a few optional metadata fields.

The real benefit is straightforward: users feel that “downloads are faster,” while the engineering team does not have to shoulder too much extra complexity. After all, the BitTorrent protocol is already mature, and there is little reason to reinvent the wheel.

Let us start with the overall architecture diagram to build a high-level mental model:

┌─────────────────────────────────────┐
│ Renderer (UI layer)                 │
├─────────────────────────────────────┤
│ IPC/Preload (bridge layer)          │
├─────────────────────────────────────┤
│ VersionManager (version manager)    │
├─────────────────────────────────────┤
│ HybridDownloadCoordinator (coord.)  │
│   ├── DistributionPolicyEvaluator   │
│   ├── DownloadEngineAdapter         │
│   ├── CacheRetentionManager         │
│   └── SHA256 Verifier               │
├─────────────────────────────────────┤
│ WebTorrent (download engine)        │
└─────────────────────────────────────┘

As the diagram shows, the system uses a layered design. The reason for separating responsibilities this clearly is simple: testability and replaceability.

  • The UI layer is responsible for displaying download progress and the sharing acceleration toggle. It is the surface.
  • The coordination layer is the core. It contains policy evaluation, engine adaptation, cache management, and integrity verification.
  • The engine layer encapsulates the concrete download implementation. At the moment, it uses WebTorrent.

The engine layer is abstracted behind the DownloadEngineAdapter interface. If we ever want to swap in a different BT engine later, or move the implementation into a sidecar process, that becomes much easier.

Separation of Control Plane and Data Plane

HagiCode Desktop keeps index.json as the sole control plane, and that design is critical. The control plane is responsible for version discovery, channel selection, and centralized policy, while the data plane is where the actual file transfer happens.

The new fields added to index.json are optional:

{
"asset": {
"torrentUrl": "https://cdn.example.com/app.torrent",
"infoHash": "abc123...",
"webSeeds": [
"https://cdn.example.com/app.zip",
"https://backup.example.com/app.zip"
],
"sha256": "def456...",
"directUrl": "https://cdn.example.com/app.zip"
}
}

All of these fields are optional. If they are missing, the client falls back to the traditional HTTP download mode. The advantage of this design is backward compatibility: older clients are completely unaffected.
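
The fallback behavior can be sketched as a small planning function. This is an illustration only: `AssetMetadata` mirrors the JSON above, while `DownloadPlan` and `planDownload` are hypothetical names, and the rule that torrentUrl, webSeeds, and sha256 must all be present is taken from the policy conditions this article lists for hybrid downloads.

```typescript
// Mirrors the optional fields of the "asset" object in index.json.
interface AssetMetadata {
  torrentUrl?: string;
  infoHash?: string;
  webSeeds?: string[];
  sha256?: string;
  directUrl?: string;
}

// Hypothetical plan type; the mode names reuse the article's download modes.
type DownloadPlan =
  | { mode: "shared-acceleration"; torrentUrl: string; webSeeds: string[]; sha256: string }
  | { mode: "http-direct"; url: string | undefined };

function planDownload(asset: AssetMetadata): DownloadPlan {
  const { torrentUrl, webSeeds, sha256, directUrl } = asset;
  // Hybrid mode needs the complete set of metadata; anything less falls back.
  if (torrentUrl && webSeeds && webSeeds.length > 0 && sha256) {
    return { mode: "shared-acceleration", torrentUrl, webSeeds, sha256 };
  }
  return { mode: "http-direct", url: directUrl };
}
```

An index entry that carries only `directUrl`, as older entries would, therefore still produces a valid plain HTTP plan.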

Not every file is worth distributing through P2P.

DistributionPolicyEvaluator is responsible for evaluating the policy. Provided that sharing acceleration is enabled in the user’s settings, only files that meet all of the following conditions will use hybrid download:

  1. The source type must be an HTTP index: direct GitHub downloads or local folder sources do not use this path.
  2. The file size must be at least 100 MB: for smaller files, the overhead of P2P outweighs the benefit.
  3. Complete hybrid metadata must be present: torrentUrl, webSeeds, and sha256 are all required.
  4. Only the latest desktop package and web deployment package are eligible: historical versions continue to use the traditional distribution path.

class DistributionPolicyEvaluator {
  evaluate(version: Version, settings: SharingAccelerationSettings): HybridDownloadPolicy {
    // Check source type
    if (version.sourceType !== 'http-index') {
      return { useHybrid: false, reason: 'not-http-index' };
    }
    // Check metadata completeness
    if (!version.hybrid) {
      return { useHybrid: false, reason: 'not-eligible' };
    }
    // Check whether the feature is enabled
    if (!settings.enabled) {
      return { useHybrid: false, reason: 'shared-disabled' };
    }
    // Check asset type (latest desktop/web packages only)
    if (!version.hybrid.isLatestDesktopAsset && !version.hybrid.isLatestWebAsset) {
      return { useHybrid: false, reason: 'latest-only' };
    }
    return { useHybrid: true, reason: 'shared-enabled' };
  }
}
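
The excerpt above does not show where the 100 MB size rule is enforced. As a sketch of how it could sit alongside the other checks, the standalone function below adds an explicit size comparison. The `sizeBytes` field, the `below-threshold` reason, and the 1 MB = 1024 × 1024 bytes conversion are assumptions for illustration; the other reason strings come from the code above.

```typescript
// Simplified, hypothetical shapes; only the fields this sketch needs.
interface HybridInfo {
  isLatestDesktopAsset: boolean;
  isLatestWebAsset: boolean;
}
interface VersionInfo {
  sourceType: string;
  sizeBytes: number; // assumed field: package size in bytes
  hybrid?: HybridInfo;
}
interface AccelerationSettings {
  enabled: boolean;
  hybridThresholdMb: number;
}
interface PolicyDecision {
  useHybrid: boolean;
  reason: string;
}

function evaluatePolicy(version: VersionInfo, settings: AccelerationSettings): PolicyDecision {
  if (version.sourceType !== "http-index") {
    return { useHybrid: false, reason: "not-http-index" };
  }
  if (!version.hybrid) {
    return { useHybrid: false, reason: "not-eligible" };
  }
  if (!settings.enabled) {
    return { useHybrid: false, reason: "shared-disabled" };
  }
  // Size rule: small files are not worth the P2P overhead.
  const thresholdBytes = settings.hybridThresholdMb * 1024 * 1024;
  if (version.sizeBytes < thresholdBytes) {
    return { useHybrid: false, reason: "below-threshold" }; // hypothetical reason code
  }
  if (!version.hybrid.isLatestDesktopAsset && !version.hybrid.isLatestWebAsset) {
    return { useHybrid: false, reason: "latest-only" };
  }
  return { useHybrid: true, reason: "shared-enabled" };
}
```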

This gives the system predictable behavior. Both developers and users can clearly understand which files will use P2P and which will not.

Let us start with the type definitions, because they form the foundation of the entire system.

// Hybrid distribution metadata
interface HybridDistributionMetadata {
torrentUrl?: string; // Torrent file URL
infoHash?: string; // InfoHash
webSeeds: string[]; // WebSeed list
sha256?: string; // File hash
directUrl?: string; // HTTP direct link (for origin fallback)
eligible: boolean; // Whether hybrid distribution is applicable
thresholdBytes: number; // Threshold in bytes
assetKind: VersionAssetKind;
isLatestDesktopAsset: boolean;
isLatestWebAsset: boolean;
}
// Sharing acceleration settings
interface SharingAccelerationSettings {
enabled: boolean; // Master switch
uploadLimitMbps: number; // Upload bandwidth limit
cacheLimitGb: number; // Cache limit
retentionDays: number; // Retention period
hybridThresholdMb: number; // Hybrid distribution threshold
onboardingChoiceRecorded: boolean;
}
// Download progress
interface VersionDownloadProgress {
current: number;
total: number;
percentage: number;
stage: VersionInstallStage; // queued, downloading, backfilling, verifying, extracting, completed, error
mode: VersionDownloadMode; // http-direct, shared-acceleration, source-fallback
peers?: number; // Number of connected peers
p2pBytes?: number; // Bytes received from P2P
fallbackBytes?: number; // Bytes received from fallback
verified?: boolean; // Whether verification has completed
}

Once the type system is clear, the rest of the implementation follows naturally.

HybridDownloadCoordinator orchestrates the entire download workflow. It coordinates policy evaluation, engine execution, SHA256 verification, and cache management.

class HybridDownloadCoordinator {
  async download(
    version: Version,
    cachePath: string,
    packageSource: PackageSource,
    onProgress?: DownloadProgressCallback,
  ): Promise<HybridDownloadResult> {
    // Note: `settings` and `cacheSize` are used below but not declared in this excerpt;
    // they are assumed to be resolved earlier in the full implementation.

    // 1. Evaluate the policy: should hybrid download be used?
    const policy = this.policyEvaluator.evaluate(version, settings);

    // 2. Execute the download
    if (policy.useHybrid) {
      await this.engine.download(version, cachePath, settings, onProgress);
    } else {
      await packageSource.downloadPackage(version, cachePath, onProgress);
    }

    // 3. SHA256 verification (hard gate)
    const verified = await this.verify(version, cachePath, onProgress);
    if (!verified) {
      await this.cacheRetentionManager.discard(version.id, cachePath);
      throw new Error(`sha256 verification failed for ${version.id}`);
    }

    // 4. Mark as trusted cache and begin controlled seeding
    await this.cacheRetentionManager.markTrusted({
      versionId: version.id,
      cachePath,
      cacheSize,
    }, settings);

    return { cachePath, policy, verified };
  }
}

There is one especially important point here: SHA256 verification is a hard gate. A downloaded file must pass verification before it can enter the installation flow. If verification fails, the cache is discarded to ensure that an incorrect file never causes installation problems.

DownloadEngineAdapter is an abstract interface that defines the methods every engine must implement:

interface DownloadEngineAdapter {
  download(
    version: Version,
    destinationPath: string,
    settings: SharingAccelerationSettings,
    onProgress?: (progress: VersionDownloadProgress) => void,
  ): Promise<void>;
  stopAll(): Promise<void>;
}

The V1 implementation is based on WebTorrent and is wrapped in InProcessTorrentEngineAdapter:

class InProcessTorrentEngineAdapter implements DownloadEngineAdapter {
  // Excerpt: the parameter list is elided, and `torrentId` and `hybrid`
  // are resolved in code omitted here.
  async download(...) {
    const client = this.getClient(settings); // Apply upload rate limiting
    const torrent = client.add(torrentId, {
      path: path.dirname(destinationPath),
      destroyStoreOnDestroy: false,
      maxWebConns: 8,
    });

    // Add WebSeed sources
    torrent.on('ready', () => {
      for (const seed of hybrid.webSeeds) {
        torrent.addWebSeed(seed);
      }
      if (hybrid.directUrl) {
        torrent.addWebSeed(hybrid.directUrl);
      }
    });

    // Progress reporting - distinguish P2P from origin fallback
    torrent.on('download', () => {
      const hasP2PPeer = torrent.wires.some(w => w.type !== 'webSeed');
      const mode = hasP2PPeer ? 'shared-acceleration' : 'source-fallback';
      // ... report progress
    });
  }
}

A pluggable engine design makes future optimization much easier. For example, V2 could run the engine in a helper process to avoid bringing down the main process if the engine crashes.

At the UI layer, the thing users care about most is simple: “am I currently downloading through P2P or through HTTP fallback?” InProcessTorrentEngineAdapter determines that by checking the types inside torrent.wires:

const hasP2PPeer = torrent.wires.some((wire) => wire.type !== 'webSeed');
const hasFallbackWire = torrent.wires.some((wire) => wire.type === 'webSeed');

const mode = hasP2PPeer ? 'shared-acceleration'
  : hasFallbackWire ? 'source-fallback'
  : 'shared-acceleration';

const stage = hasP2PPeer ? 'downloading'
  : hasFallbackWire ? 'backfilling'
  : 'downloading';

The logic looks simple, but it is a key part of the user experience. Users can clearly see whether the current state is “sharing acceleration” or “origin backfilling,” which makes the behavior easier to understand.
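
The same decision can be pulled into a pure function, which makes it easy to unit-test without a running torrent client. This is only a sketch: `classifyWires` and the `WireLike` shape are hypothetical names, reduced to the single `type` field that the snippet above inspects.

```typescript
// Hypothetical helper: a wire is reduced to the one field the UI logic reads.
type WireLike = { type: string };
type DownloadMode = "shared-acceleration" | "source-fallback";
type DownloadStage = "downloading" | "backfilling";

function classifyWires(wires: WireLike[]): { mode: DownloadMode; stage: DownloadStage } {
  const hasP2PPeer = wires.some((wire) => wire.type !== "webSeed");
  const hasFallbackWire = wires.some((wire) => wire.type === "webSeed");

  // Any real peer wins, even if WebSeed connections are also active.
  if (hasP2PPeer) return { mode: "shared-acceleration", stage: "downloading" };
  // Only WebSeed connections: the origin is filling in the data.
  if (hasFallbackWire) return { mode: "source-fallback", stage: "backfilling" };
  // No wires yet: same optimistic default as the snippet above.
  return { mode: "shared-acceleration", stage: "downloading" };
}
```

Keeping the classification separate from the event handler means the progress callback only has to forward the result.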

Integrity verification uses Node.js’s crypto module to compute the hash in a streaming manner, which avoids loading the entire file into memory:

// At the top of the module:
import { createHash } from 'node:crypto';
import fs from 'node:fs';

// Inside the coordinator class:
private async computeSha256(filePath: string): Promise<string> {
  const hash = createHash('sha256');
  await new Promise<void>((resolve, reject) => {
    const stream = fs.createReadStream(filePath);
    stream.on('data', (chunk) => hash.update(chunk)); // Feed the hash chunk by chunk
    stream.on('error', reject);
    stream.on('end', resolve);
  });
  return hash.digest('hex').toLowerCase();
}

This implementation is especially friendly for large files. Imagine downloading a 2 GB installation package and then trying to load the whole thing into memory just to verify it. Streaming solves that cleanly.
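
For readers who want to try the idea in isolation, here is a self-contained sketch of streaming verification. `computeSha256` mirrors the method above; `passesSha256Gate` is a hypothetical helper, and normalizing the expected digest (trimming and lowercasing) is an assumption rather than something the article specifies.

```typescript
import { createHash } from "node:crypto";
import { createReadStream } from "node:fs";

// Streams the file through the hash so memory use stays flat regardless of file size.
function computeSha256(filePath: string): Promise<string> {
  return new Promise((resolve, reject) => {
    const hash = createHash("sha256");
    createReadStream(filePath)
      .on("data", (chunk) => hash.update(chunk))
      .on("error", reject)
      .on("end", () => resolve(hash.digest("hex")));
  });
}

// Hypothetical gate: true only when the file's digest equals the expected one.
async function passesSha256Gate(filePath: string, expected: string): Promise<boolean> {
  const actual = await computeSha256(filePath);
  return actual === expected.trim().toLowerCase();
}
```

A caller would discard the cached file whenever `passesSha256Gate` resolves to `false`, which matches the hard-gate behavior described earlier.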

The full data flow looks like this:

┌──────────────────────────────────────────────┐
│ User clicks install on a large-file version  │
└──────────────────────┬───────────────────────┘
                       ▼
┌──────────────────────────────────────────────┐
│ VersionManager invokes the coordinator       │
│ HybridDownloadCoordinator.download()         │
└──────────────────────┬───────────────────────┘
                       ▼
┌──────────────────────────────────────────────┐
│ DistributionPolicyEvaluator.evaluate()       │
│ Checks: source, metadata, switch, asset type │
└──────────────────────┬───────────────────────┘
                       ▼
      ┌──────────────────────────────────┐
      │            useHybrid?            │
      └───┬─────────────────────────┬────┘
      yes │                         │ no
          ▼                         ▼
┌────────────────────┐  ┌──────────────────────┐
│ P2P + WebSeed      │  │ HTTP direct download │
│ Hybrid download    │  │ (compatibility path) │
└─────────┬──────────┘  └───────────┬──────────┘
          └────────────┬────────────┘
                       ▼
             ┌────────────────────┐
             │ SHA256 verify      │
             │ (hard gate)        │
             └─────────┬──────────┘
                       ▼
      ┌──────────────────────────────────┐
      │             Passed?              │
      └───┬─────────────────────────┬────┘
      yes │                         │ no
          ▼                         ▼
┌────────────────────┐  ┌──────────────────────┐
│ Extract +          │  │ Drop cache +         │
│ install +          │  │ return error         │
│ seed safely        │  └──────────────────────┘
└────────────────────┘

The flow is very clear end to end, and every step has a well-defined responsibility. When something goes wrong, it is much easier to pinpoint the failing stage.

Even the best technical design will fall flat if the user experience is poor. HagiCode Desktop invested a fair amount of effort in productizing this capability.

Most users do not know what BitTorrent or InfoHash means. So at the product level, we present the feature using the phrase “sharing acceleration”:

  • The feature is called “sharing acceleration,” not P2P download.
  • The setting is called “upload limit,” not seeding.
  • The progress label says “origin backfilling,” not WebSeed fallback.

This lowers the cognitive burden of the terminology and makes the feature easier to accept.

Enabled by Default in the First-Run Wizard


When new users launch the desktop app for the first time, they see a wizard page introducing sharing acceleration:

To improve download speed, we share the portions you have already downloaded with other users while your own download is in progress. This is completely optional, and you can turn it off at any time in Settings.

It is enabled by default, but users are given a clear way to opt out. If enterprise users do not want it, they can simply disable it during onboarding.

The settings page exposes three tunable parameters:

| Parameter | Default | Description |
| --- | --- | --- |
| Upload limit | 2 MB/s | Prevents excessive upstream bandwidth usage |
| Cache limit | 10 GB | Controls disk space consumption |
| Retention days | 7 days | Automatically cleans old cache after this period |

These parameters all have sensible defaults. Most users never need to change them, while advanced users can adjust them based on their own network environment.

Looking back at the overall solution, several design decisions are worth calling out.

Why not start with a sidecar or helper process right away? The reason is simple: ship quickly. An in-process design has a shorter development cycle and is easier to debug. The first priority is to get the feature running, then improve stability afterward.

Of course, this decision comes with a cost: if the engine crashes, it can affect the main process. We reduce that risk through adapter boundaries and timeout controls, and we also keep a migration path open so V2 can move into a separate process more easily.

We use SHA256 instead of MD5 or CRC32 because it is far harder to forge. CRC32 is only an error-detecting checksum, and practical collision attacks against MD5 are well known. If someone maliciously crafted a fake installation package, the consequences could be severe. SHA256 costs more to compute, but the security gain is worth it.

Scenarios such as GitHub downloads and local folder sources do not use hybrid distribution. This is not a technical limitation; it is about avoiding unnecessary complexity. BT protocols add limited value inside private network scenarios and would only increase code complexity.
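
To make the policy concrete, here is a minimal sketch of an evaluator that checks source, switch, metadata, and size in that order. It is not the project's `DistributionPolicyEvaluator`: the `SourceKind` values, the `PolicyInput` shape, and the `reason` strings are all hypothetical, and only the metadata field names are taken from the types shown earlier.

```typescript
// Hypothetical source kinds; the real package-source model is not shown in the article.
type SourceKind = "official-index" | "github" | "local-folder";

// Subset of the hybrid metadata fields used for the decision.
interface HybridMeta {
  torrentUrl?: string;
  infoHash?: string;
  sha256?: string;
  eligible: boolean;
  thresholdBytes: number;
}

interface PolicyInput {
  sourceKind: SourceKind;
  sizeBytes: number;
  sharingEnabled: boolean;
  hybrid?: HybridMeta;
}

interface PolicyDecision {
  useHybrid: boolean;
  reason: string;
}

function evaluatePolicy(input: PolicyInput): PolicyDecision {
  // GitHub and local-folder sources always take the plain HTTP path.
  if (input.sourceKind !== "official-index") return { useHybrid: false, reason: "source-not-supported" };
  if (!input.sharingEnabled) return { useHybrid: false, reason: "sharing-disabled" };

  const meta = input.hybrid;
  // Missing optional fields mean an older index: fall back to HTTP.
  if (!meta || !meta.eligible || !(meta.torrentUrl || meta.infoHash) || !meta.sha256) {
    return { useHybrid: false, reason: "metadata-incomplete" };
  }
  if (input.sizeBytes < meta.thresholdBytes) return { useHybrid: false, reason: "below-threshold" };
  return { useHybrid: true, reason: "eligible" };
}
```

Returning a reason alongside the boolean is a design choice that makes it easier to log why a particular download fell back to HTTP.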

Inside SharingAccelerationSettingsStore, every numeric value must go through bounds checking and normalization:

private normalize(settings: SharingAccelerationSettings): SharingAccelerationSettings {
  return {
    enabled: Boolean(settings.enabled),
    uploadLimitMbps: this.clampNumber(settings.uploadLimitMbps, 1, 200, DEFAULT_SETTINGS.uploadLimitMbps),
    cacheLimitGb: this.clampNumber(settings.cacheLimitGb, 1, 500, DEFAULT_SETTINGS.cacheLimitGb),
    retentionDays: this.clampNumber(settings.retentionDays, 1, 90, DEFAULT_SETTINGS.retentionDays),
    hybridThresholdMb: DEFAULT_SETTINGS.hybridThresholdMb, // Fixed value, not user-configurable
    onboardingChoiceRecorded: Boolean(settings.onboardingChoiceRecorded),
  };
}

private clampNumber(value: number, min: number, max: number, fallback: number): number {
  if (!Number.isFinite(value)) {
    return fallback;
  }
  return Math.min(max, Math.max(min, Math.round(value)));
}

This prevents users from manually editing the configuration file into invalid values.
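
A quick worked example shows what the clamping does to out-of-range input. The function body is the same as `clampNumber` above, rewritten as a standalone function; the sample values and the fallback of 2 are illustrative assumptions.

```typescript
function clampNumber(value: number, min: number, max: number, fallback: number): number {
  if (!Number.isFinite(value)) return fallback; // NaN and ±Infinity fall back to the default
  return Math.min(max, Math.max(min, Math.round(value)));
}

// Values a hand-edited settings file might contain (illustrative only).
const samples = [0, 2.6, 9999, Number.NaN];
const normalized = samples.map((value) => clampNumber(value, 1, 200, 2));
console.log(normalized); // [ 1, 3, 200, 2 ]
```

Too-small and too-large values snap to the nearest bound, fractional values are rounded, and anything that is not a finite number is replaced by the default.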

CacheRetentionManager.prune() is responsible for cleaning expired or oversized cache entries. The cleanup strategy uses LRU (least recently used):

const records = [...this.listRecords()].sort(
  (left, right) =>
    new Date(left.lastUsedAt).getTime() - new Date(right.lastUsedAt).getTime(),
);

// When over the limit, evict the least recently used entries first
while (totalBytes > maxBytes && retainedEntries.length > 0) {
  const evicted = records.find((record) => retainedEntries.includes(record.versionId));
  if (!evicted) {
    break; // No record matches a retained entry; avoid dereferencing undefined
  }
  retainedEntries.splice(retainedEntries.indexOf(evicted.versionId), 1);
  removedEntries.push(evicted.versionId);
  totalBytes -= evicted.cacheSize;
  await fs.rm(evicted.cachePath, { force: true });
}

This logic ensures disk space is used efficiently while preserving historical versions that the user might still need.
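
The eviction decision can also be expressed as a pure planning step that combines the retention period with the size limit. The sketch below is an assumption about how the two rules might interact, not the actual `CacheRetentionManager.prune()` code; `planEviction` and the trimmed-down `CacheRecord` are hypothetical.

```typescript
interface CacheRecord {
  versionId: string;
  cacheSize: number;  // bytes
  lastUsedAt: string; // ISO timestamp
}

// Decides which entries to evict; deleting the files is left to the caller.
function planEviction(
  records: CacheRecord[],
  maxBytes: number,
  retentionDays: number,
  now: Date,
): string[] {
  const cutoff = now.getTime() - retentionDays * 24 * 60 * 60 * 1000;
  const oldestFirst = [...records].sort(
    (a, b) => new Date(a.lastUsedAt).getTime() - new Date(b.lastUsedAt).getTime(),
  );

  const evicted: string[] = [];
  let totalBytes = records.reduce((sum, record) => sum + record.cacheSize, 0);

  for (const record of oldestFirst) {
    const expired = new Date(record.lastUsedAt).getTime() < cutoff;
    // Entries are sorted oldest first, so once one is fresh and the total fits, the rest stay.
    if (!expired && totalBytes <= maxBytes) break;
    evicted.push(record.versionId);
    totalBytes -= record.cacheSize;
  }
  return evicted;
}
```

Separating the plan from the file deletion keeps the policy testable with plain data and a fixed clock.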

When the user turns off sharing acceleration, the app must immediately stop seeding and destroy the torrent client:

async disableSharingAcceleration(): Promise<void> {
  this.settingsStore.updateSettings({ enabled: false });
  await this.cacheRetentionManager.stopAllSeeding(); // Stop seeding
  await this.engine.stopAll();                       // Destroy the torrent client
}

If a user disables the feature, the product should no longer consume any P2P resources. That is basic product etiquette.

There is no perfect solution, and hybrid distribution is no exception. These are the main trade-offs:

Crash isolation is weaker than a sidecar: V1 uses an in-process engine, so an engine crash can affect the main process. Adapter boundaries and timeout controls reduce the risk, but they are not a fundamental fix. V2 includes a planned migration path to a helper process.

Enabled-by-default resource usage: the default settings of 2 MB/s upload, 10 GB cache, and 7-day retention do consume some machine resources. User expectations are managed through onboarding copy and transparent settings.

Enterprise network compatibility: automatic WebSeed/HTTPS fallback preserves usability in enterprise networks, but it can reduce the acceleration gains from P2P. This is an intentional trade-off that prioritizes availability.

Backward-compatible metadata: all new fields are optional. If they are missing, the system falls back to HTTP mode. Older clients are completely unaffected, making upgrades smooth.

This article walked through the hybrid distribution architecture used in the HagiCode Desktop project. The key takeaways are:

  1. Layered architecture: the control plane and data plane are separated, and the engine is abstracted behind a pluggable interface for easier testing and extension.

  2. Policy-driven behavior: not every file uses P2P. Hybrid distribution is enabled only for large files that meet the required conditions.

  3. Integrity verification: SHA256 serves as a hard gate, and streaming verification avoids memory pressure.

  4. Productized presentation: BT terminology is hidden behind the phrase “sharing acceleration,” and the feature is enabled by default during onboarding.

  5. User control: upload limits, cache limits, retention days, and other parameters remain user-adjustable.

This architecture has already been implemented in the HagiCode Desktop project. If you try it out, we would love to hear your feedback after installation and real-world use.



Maybe we are all just ordinary people making our way through the world of technology, but that is fine. Ordinary people can still be persistent, and that persistence matters.

Thank you for reading. If you found this article useful, feel free to like, save, and share it. This content was created with AI-assisted collaboration, with the final version reviewed and approved by the author.

Running AI CLI Tools in Docker Containers: A Practical Guide to User Isolation and Persistent Volumes


Integrating AI coding tools like Claude Code, Codex, and OpenCode into containerized environments sounds simple, but there are hidden complexities everywhere. This article takes a deep dive into how the HagiCode project solves core challenges in Docker deployments, including user permissions, configuration persistence, and version management, so you can avoid the common pitfalls.

When we decided to run AI coding CLI tools inside Docker containers, the most intuitive thought was probably: “Aren’t containers just root? Why not install everything directly and call it done?” In reality, that seemingly simple idea hides several core problems that must be solved.

The first hurdle is security restrictions. Take Claude CLI as an example: it explicitly forbids running as the root user. This is a mandatory security check, and if root is detected, it refuses to start. You might think you can just switch users with the USER directive, but it is not that simple: the non-root user inside the container still has to map correctly onto user permissions on the host machine.

The second trap is state persistence. Claude Code requires login, Codex has its own configuration, and OpenCode also has a cache directory. If you have to reconfigure everything every time the container restarts, the whole idea of “automation” loses its meaning. We need these configurations to persist beyond the lifecycle of the container.

The third problem is permission consistency. Can processes inside the container access configuration files created by the host user? UID/GID mismatches often cause file permission errors, and this is extremely common in real deployments.

These problems may look independent, but in practice they are tightly connected. During HagiCode’s development, we gradually worked out a practical solution. Next, I will share the technical details and the lessons learned from those pitfalls.

The solution shared in this article comes from our practical experience in the HagiCode project. HagiCode is an open-source AI-assisted programming platform that integrates multiple mainstream AI coding assistants, including Claude Code, Codex, and OpenCode. As a project that needs cross-platform and highly available deployment, HagiCode has to solve the full range of challenges involved in containerized deployment.

If you find the technical solution in this article valuable, that is a sign HagiCode has something real to offer in engineering practice. In that case, the HagiCode official website and GitHub repository are both worth following.

There is a common misunderstanding here: Docker containers run as root by default, so why not just install the tools as root? If you think that way, Claude CLI will quickly teach you otherwise.

# Run Claude CLI directly as root? No.
docker run --rm -it --user root myimage claude
# Output: Error: This command cannot be run as root user

This is a hard security restriction in Claude CLI. The reason is simple: these CLI tools read and write sensitive user configuration, including API tokens, local caches, and even scripts written by the user. Running them with root privileges introduces too much risk.

So the question becomes: how can we satisfy the CLI’s security requirements while keeping container management flexible? We need to change the way we think about it: instead of switching users at runtime, create a dedicated user during the image build stage.

Creating a dedicated user: more than just changing a name


You might think that adding a single USER line to the Dockerfile is enough. That is indeed the simplest approach, but it is not robust enough.

HagiCode’s approach is to create a hagicode user with UID 1000, which usually matches the default user on most host machines:

RUN groupadd -o -g 1000 hagicode && \
    useradd -o -u 1000 -g 1000 -s /bin/bash -m hagicode && \
    mkdir -p /home/hagicode/.claude && \
    chown -R hagicode:hagicode /home/hagicode

But this only solves the built-in user inside the image. What if the host user is UID 1001? You still need to support dynamic mapping when the container starts.

docker-entrypoint.sh contains the key logic:

if [ -n "$PUID" ] && [ -n "$PGID" ]; then
  if ! id hagicode >/dev/null 2>&1; then
    groupadd -g "$PGID" hagicode
    useradd -u "$PUID" -g "$PGID" -s /bin/bash -m hagicode
  fi
fi

The advantage of this design is clear: use the default UID 1000 at image build time, then adjust dynamically at runtime through the PUID and PGID environment variables. No matter what UID the host user has, ownership of configuration files remains correct.

The design philosophy of persistent volumes


Each AI CLI tool has its own preferred configuration directory, so they need to be mapped one by one:

| CLI Tool | Path in Container | Named Volume |
| --- | --- | --- |
| Claude | /home/hagicode/.claude | claude-data |
| Codex | /home/hagicode/.codex | codex-data |
| OpenCode | /home/hagicode/.config/opencode | opencode-config-data |

Why use named volumes instead of bind mounts? Three reasons:

  1. Simpler management: Named volumes are managed automatically by Docker, so you do not need to create host directories manually.
  2. Permission isolation: The initial contents of the volumes are created by the user inside the container, avoiding permission conflicts with the host.
  3. Independent migration: Volumes can exist independently of containers, so data is not lost when images are upgraded.

docker-compose-builder-web automatically generates the corresponding volume configuration:

volumes:
  claude-data:
  codex-data:
  opencode-config-data:

services:
  hagicode:
    volumes:
      - claude-data:/home/hagicode/.claude
      - codex-data:/home/hagicode/.codex
      - opencode-config-data:/home/hagicode/.config/opencode
    user: "${PUID:-1000}:${PGID:-1000}"

Pay attention to the user field here: PUID and PGID are injected through environment variables to ensure that processes inside the container run with an identity that matches the host user. This detail matters because permission issues are painful to debug once they appear.
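
As a rough illustration of how a generator could produce that fragment from a list of tools, consider the sketch below. The internals of docker-compose-builder-web are not shown in this article, so `CliVolume`, `CLI_VOLUMES`, and `renderComposeFragment` are hypothetical names; only the volume names and paths come from the table above.

```typescript
// Hypothetical descriptor for one CLI tool's persistent directory.
interface CliVolume {
  volumeName: string;
  containerPath: string;
}

const CLI_VOLUMES: CliVolume[] = [
  { volumeName: "claude-data", containerPath: "/home/hagicode/.claude" },
  { volumeName: "codex-data", containerPath: "/home/hagicode/.codex" },
  { volumeName: "opencode-config-data", containerPath: "/home/hagicode/.config/opencode" },
];

// Renders the top-level volume declarations plus the service's mounts and user mapping.
function renderComposeFragment(volumes: CliVolume[]): string {
  const lines: string[] = ["volumes:"];
  for (const volume of volumes) lines.push(`  ${volume.volumeName}:`);
  lines.push("services:", "  hagicode:", "    volumes:");
  for (const volume of volumes) {
    lines.push(`      - ${volume.volumeName}:${volume.containerPath}`);
  }
  // Single quotes keep ${...} literal so Docker Compose, not TypeScript, expands it.
  lines.push('    user: "${PUID:-1000}:${PGID:-1000}"');
  return lines.join("\n");
}

console.log(renderComposeFragment(CLI_VOLUMES));
```

With this shape, supporting another CLI tool only requires one more entry in the list.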

Version management: baked-in versions with runtime overrides


Pinning Docker image versions is essential for reproducibility. But in real development, we often need to test a newer version or urgently fix a bug. If we had to rebuild the image every time, the workflow would be far too inefficient.

HagiCode’s strategy is fixed versions as the default, with runtime overrides as an extension mechanism. It is a pragmatic engineering compromise between stability and flexibility.

Dockerfile.template pins versions here:

USER hagicode
WORKDIR /home/hagicode

# Configure the global npm install path
RUN mkdir -p /home/hagicode/.npm-global && \
    npm config set prefix '/home/hagicode/.npm-global'

# Install CLI tools using pinned versions
RUN npm install -g @anthropic-ai/claude-code@2.1.71 && \
    npm install -g @openai/codex@0.112.0 && \
    npm install -g opencode-ai@1.2.25 && \
    npm cache clean --force

docker-entrypoint.sh supports runtime overrides:

install_cli_override_if_needed() {
  local package_name="$2"
  local override_version="$5"
  if [ -n "$override_version" ]; then
    gosu hagicode npm install -g "${package_name}@${override_version}"
  fi
}

# Example usage
install_cli_override_if_needed "" "@anthropic-ai/claude-code" "" "" "${CLAUDE_CODE_CLI_VERSION}"

This lets you test a new version through an environment variable without rebuilding the image:

docker run -e CLAUDE_CODE_CLI_VERSION=2.2.0 myimage

This design is practical because nobody wants to rebuild an image every time they test a new feature.
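
The override rule itself is small enough to model as a pure function: a package is reinstalled only when its environment variable is set to a non-empty value. The sketch below is illustrative and is not part of the entrypoint script. Apart from `CLAUDE_CODE_CLI_VERSION`, which appears above, the environment variable names are assumptions.

```typescript
interface PinnedCli {
  pkg: string;
  pinnedVersion: string;
  envVar: string;
}

// Pinned versions mirror the Dockerfile excerpt; the last two envVar names are hypothetical.
const PINNED_CLIS: PinnedCli[] = [
  { pkg: "@anthropic-ai/claude-code", pinnedVersion: "2.1.71", envVar: "CLAUDE_CODE_CLI_VERSION" },
  { pkg: "@openai/codex", pinnedVersion: "0.112.0", envVar: "CODEX_CLI_VERSION" },
  { pkg: "opencode-ai", pinnedVersion: "1.2.25", envVar: "OPENCODE_CLI_VERSION" },
];

// Returns the npm package specs that would need a runtime reinstall.
function resolveOverrides(env: Record<string, string | undefined>): string[] {
  const specs: string[] = [];
  for (const cli of PINNED_CLIS) {
    const override = (env[cli.envVar] ?? "").trim();
    // Skip unset values and values equal to what the image already contains.
    if (override !== "" && override !== cli.pinnedVersion) {
      specs.push(`${cli.pkg}@${override}`);
    }
  }
  return specs;
}
```

Skipping overrides that equal the pinned version is a choice made in this sketch to avoid a pointless reinstall; the shell function above installs whenever the variable is non-empty.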

In addition to configuring CLI tools manually, some scenarios require automatic configuration injection. The most typical example is an API token.

if [ -n "$ANTHROPIC_AUTH_TOKEN" ]; then
  mkdir -p /home/hagicode/.claude
  cat > /home/hagicode/.claude/settings.json <<EOF
{
  "env": {
    "ANTHROPIC_AUTH_TOKEN": "${ANTHROPIC_AUTH_TOKEN}"
  }
}
EOF
  # Restrict the token file to its owner
  chmod 600 /home/hagicode/.claude/settings.json
  chown -R hagicode:hagicode /home/hagicode/.claude
fi

Two things matter here: pass sensitive information through environment variables instead of hard-coding it into the image, and make sure the ownership of configuration files is set correctly, otherwise the CLI tools will not be able to read them.

This is the easiest trap to fall into. The host user has UID 1001, while the container uses 1000, so files created on one side cannot be accessed on the other.

# Correct approach: make the container match the host user
docker run \
  -e PUID=$(id -u) \
  -e PGID=$(id -g) \
  myimage

This issue is very common, and it can be frustrating the first time you run into it.

Configuration disappears after container restart


If you find yourself logging in again after every restart, check whether you forgot to mount a persistent volume:

volumes:
  - claude-data:/home/hagicode/.claude

Nothing is more frustrating than carefully setting up a configuration only to see it disappear.

Do not run npm install -g directly inside a running container. The correct approaches are:

  1. Set an environment variable to trigger override installation.
  2. Or rebuild the image.
# Option 1: runtime override
docker run -e CLAUDE_CODE_CLI_VERSION=2.2.0 myimage
# Option 2: rebuild the image
docker build -t myimage:v2 .

There is more than one road to Rome, but some roads are smoother than others.

  • Pass API tokens through environment variables instead of writing them into the image.
  • Set configuration file permissions to 600.
  • Always run the application as a non-root user.
  • Update CLI versions regularly to fix security vulnerabilities.

Security is always important, but the real challenge is consistently enforcing it in practice.
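
The first two checklist items can be combined in a few lines. The sketch below writes a token into a settings file with owner-only permissions. It is a TypeScript illustration of the idea rather than HagiCode's entrypoint logic: `writeClaudeSettings` is a hypothetical helper, and the file layout simply follows the settings.json example shown earlier.

```typescript
import { chmodSync, mkdirSync, writeFileSync } from "node:fs";
import { join } from "node:path";

// Writes <claudeDir>/settings.json containing the token, readable only by its owner.
function writeClaudeSettings(claudeDir: string, token: string): string {
  mkdirSync(claudeDir, { recursive: true });
  const file = join(claudeDir, "settings.json");
  // JSON.stringify escapes quotes and backslashes that raw string interpolation would not.
  const body = JSON.stringify({ env: { ANTHROPIC_AUTH_TOKEN: token } }, null, 2);
  writeFileSync(file, body + "\n", { mode: 0o600 });
  chmodSync(file, 0o600); // also tightens a file that already existed with looser permissions
  return file;
}
```

The token would come from `process.env.ANTHROPIC_AUTH_TOKEN` at container start, so it never ends up in an image layer.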

If you want to support a new CLI tool in the future, there are only three steps:

  1. Dockerfile.template: add the installation step.
  2. docker-entrypoint.sh: add the version override logic.
  3. docker-compose-builder-web: add the persistent volume mapping.

This template-based design makes extension simple without changing the core logic.

Running AI CLI tools in Docker containers involves three core challenges: user permissions, configuration persistence, and version management. By combining dedicated users, named-volume isolation, and environment-variable-based overrides, the HagiCode project built a deployment architecture that is both secure and flexible.

Key design points:

  • User isolation: Create a dedicated user during the image build stage, with runtime support for dynamic PUID/PGID mapping.
  • Persistence strategy: Each CLI tool gets its own named volume, so restarts do not affect configuration.
  • Version flexibility: Fixed defaults ensure reproducibility, while runtime overrides provide room for testing.
  • Automated configuration: Sensitive configuration can be injected automatically through environment variables.

This solution has been running stably in the HagiCode project for some time, and I hope it offers useful reference points for developers with similar needs.

Thank you for reading. If you found this article useful, you are welcome to like, bookmark, and share it. This content was created with AI-assisted collaboration, and the final content was reviewed and confirmed by the author.

Technical Analysis of the HagiCode Soul Platform: The Evolution from Emerging Needs to an Independent Platform


Writing technical articles is not really such a grand thing. It is mostly just a matter of organizing the pitfalls you have run into and the detours you have taken. We have all been inexperienced before, after all. This article takes an in-depth look at the design philosophy, architectural evolution, and core technical implementation of Soul in the HagiCode project, and explores how an independent platform can provide a more focused experience for creating and sharing Agent personas.

In the practice of building AI Agents, we often run into a question that looks simple but is actually crucial: how do we give different Agents stable and distinctive language styles and personality traits?

It is a slightly frustrating question, honestly. In the early Hero system of HagiCode, different Heroes (Agent instances) were mainly distinguished through profession settings and generic prompts. That approach came with some fairly obvious pain points, and anyone who has tried something similar has probably felt the same.

First, language style was difficult to keep consistent. The same “developer engineer” role might sound professional and rigorous one day, then casual and loose the next. This was not a model problem so much as the absence of an independent personality configuration layer to constrain and guide the output style.

Second, the sense of character was generally weak. When we described an Agent’s traits, we often had to rely on vague adjectives like “friendly,” “professional,” or “humorous,” without concrete language rules to support those abstract descriptions. Put plainly, it sounded nice in theory, but there was little to hold onto in practice.

Third, persona configurations were almost impossible to reuse. Suppose we carefully designed the speaking style of a “catgirl waitress” and wanted to reuse that expression style in another business scenario. In practice, we would almost have to configure it again from scratch. Sometimes you do not want to possess something beautiful, only reuse it a little… and even that turns out to be hard.

To solve those real problems, we introduced the Soul mechanism: an independent language style configuration layer separate from equipment and descriptions. Soul can define an Agent’s speaking habits, tone preferences, and wording boundaries, can be shared and reused across multiple Heroes, and can also be injected into the system prompt automatically on the first Session call.

Some people might say that this is just configuring a few prompts. But sometimes the real question is not whether something can be done; it is how to do it more elegantly. As Soul matured, we realized it had enough depth to develop independently. A dedicated Soul platform could let users focus on creating, sharing, and browsing interesting persona configurations without being distracted by the rest of the Hero system. That is how the standalone platform at soul.hagicode.com came into being.

HagiCode is an open-source AI coding assistant project built with a modern technology stack and aimed at giving developers a smooth intelligent programming experience. The Soul platform approach shared in this article comes from our own hands-on exploration while building HagiCode to solve the practical problem of Agent persona management. If you find the approach valuable, then it probably means we have accumulated a certain amount of engineering judgment in practice, and the HagiCode project itself may also be worth a closer look.

The Technical Architecture Evolution of the Soul Platform


The Soul platform did not appear all at once. It went through three clear stages. The story began abruptly and concluded naturally.

Phase 1: Soul Configuration Embedded in Hero


The earliest Soul implementation existed as a functional module inside the Hero workspace. We added an independent SOUL editing area to the Hero UI, supporting both preset application and text fine-tuning.

Preset application let users choose from classic persona templates such as “professional developer engineer” and “catgirl waitress.” Text fine-tuning let users personalize those presets further. On the backend, the Hero entity gained a Soul field, with SoulCatalogId used to identify its source.

This stage solved the question of whether the capability existed at all, and it grew forward somewhat awkwardly, like anything young does. But as Soul content became richer, the limitations of an architecture tightly coupled with the Hero system started to show.

To provide a better Soul discovery and reuse experience, we built a SOUL Marketplace catalog page with support for browsing, searching, viewing details, and favoriting.

At this stage, we introduced a combinatorial design built from 50 main Catalogs (base roles) and 10 orthogonal rules (expression styles). The main Catalogs defined the Agent’s core persona, with abstract character settings such as “Mistport Traveler” and “Night Hunter.” The orthogonal rules defined how the Agent expressed itself, with language style traits such as “Concise & Professional” and “Verbose & Friendly.”

50 x 10 = 500 possible combinations gave users a wide configuration space for personas. It is not an overwhelming number, but it is not small either. There are many roads to Rome, after all; some are simply easier to walk than others. On the backend, the full SOUL catalog was generated through catalog-sources.json, while the frontend presented those catalog entries as an interactive card list.
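
The combinatorial space is simply a cross product of the two lists. The sketch below is illustrative only: the entry shape and the id format are assumptions, and it does not reflect the actual schema of catalog-sources.json, which is not shown here.

```typescript
interface CatalogEntry {
  id: string;
  title: string;
}

interface SoulCombination {
  id: string;
  mainId: string;
  ruleId: string;
}

// Pairs every main catalog with every orthogonal rule.
function combine(mains: CatalogEntry[], rules: CatalogEntry[]): SoulCombination[] {
  const combinations: SoulCombination[] = [];
  for (const main of mains) {
    for (const rule of rules) {
      combinations.push({ id: `${main.id}--${rule.id}`, mainId: main.id, ruleId: rule.id });
    }
  }
  return combinations;
}

// Placeholder data standing in for the 50 base roles and 10 expression rules.
const mains = Array.from({ length: 50 }, (_, i) => ({ id: `main-${i + 1}`, title: `Main ${i + 1}` }));
const rules = Array.from({ length: 10 }, (_, i) => ({ id: `rule-${i + 1}`, title: `Rule ${i + 1}` }));
console.log(combine(mains, rules).length); // 500
```

Because the two axes are independent, adding one new rule grows the space by 50 combinations without touching any existing main catalog.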

The in-site Marketplace was a good transitional solution, but only that: transitional. It was still attached to the main system, and for users who only wanted Soul functionality, the access path remained too deep. Not everyone wants to take the scenic route just to do something simple.

Phase 3: Splitting into an Independent Platform


In the end, we decided to move Soul into an independent repository (repos/soul). The Marketplace in the original main system was changed into an external jump guide, while the new platform adopted a Builder-first design philosophy: the homepage is the creation workspace by default, so users can start building their own persona configuration the moment they open the site.

The technology stack was also comprehensively upgraded in this stage: Vite 8 + React 19 + TypeScript 5.9, a unified design language through the shadcn/ui component system, and Tailwind CSS 4 theme variables. The improvement in frontend engineering laid a solid foundation for future feature iteration.

Everything faded away… no, actually, everything was only just beginning.

One core design principle of the Soul platform is local-first. That means the homepage must remain fully functional without a backend, and failure to load remote materials must never block page entry.

There is nothing especially miraculous about that. It simply means thinking one step further when designing the system. Using a local snapshot as the baseline and remote data as enhancement lets the product remain basically usable under any network condition. Concretely, we implemented a two-layer material architecture:

export async function loadBuilderMaterials(): Promise<BuilderMaterials> {
  const localMaterials = createLocalMaterials(snapshot) // local baseline
  try {
    const inspirationFragments = await fetchMarketplaceItems() // remote enhancement
    return { ...localMaterials, inspirationFragments, remoteState: "ready" }
  } catch (error) {
    return { ...localMaterials, remoteState: "fallback" } // graceful degradation
  }
}

Local materials come from build-time snapshots of the main system documentation and include the complete data for 50 base roles and 10 expression rules. Remote materials come from Souls published by users and fetched through the Marketplace API. Together, they give users a full spectrum of materials, from official templates to community creativity. If that sounds dramatic, it really is just local plus remote.

The core data abstraction of Soul is the SoulFragment:

export type SoulFragment = {
fragmentId: string
group: "main-catalog" | "expression-rule" | "published-soul"
title: string
summary: string
content: string
keywords: string[]
localized?: Partial<Record<AppLocale, LocalizedFragmentContent>>
sourceRef: SoulFragmentSourceRef
meta: SoulFragmentMeta
}

The group field distinguishes fragment types: the main catalog defines the character core, orthogonal rules define expression style, and user-published Souls are marked as published-soul. The localized field supports multilingual presentation, allowing the same fragment to display different titles and descriptions in different language environments. Internationalization is something you really want to think about early, and in this case we actually did.

The Builder draft state encapsulates the user’s current editing state:

export type SoulBuilderDraft = {
draftId: string
name: string
selectedMainFragmentId: string | null
selectedRuleFragmentId: string | null
inspirationSoulId: string | null
mainSlotText: string
ruleSlotText: string
customPrompt: string
previewText: string
updatedAt: string
}

Each fragment selected in the editor has its content concatenated into the corresponding slot, forming the final preview text. mainSlotText corresponds to the main role content, ruleSlotText corresponds to the expression rule content, and customPrompt is the user’s additional instruction text.

Preview compilation is the core capability of Soul Builder. It assembles user-selected fragments and custom text into a system prompt that can be copied directly:

export function compilePreview(
draft: Pick<SoulBuilderDraft, "mainSlotText" | "ruleSlotText" | "customPrompt">,
fragments: {
mainFragment: SoulFragment | null
ruleFragment: SoulFragment | null
inspirationFragment: SoulFragment | null
}
): PreviewCompilation {
// Assembly logic: main role + expression rule + inspiration reference + custom content
}

The compilation result is shown in the central preview panel, where users can see the final effect in real time and copy it to the clipboard with one click. It sounds simple, and it is. But simple things are often the most useful.
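
The article leaves the body of compilePreview elided, so here is a minimal sketch of what the assembly step could look like. The FragmentLike and DraftSlots types are simplified stand-ins, and the blank-line separator, the section order, and the rule that edited slot text wins over the raw fragment content are all assumptions for illustration, not HagiCode's actual logic.

```typescript
// Simplified stand-ins for SoulFragment / SoulBuilderDraft (assumptions for this sketch).
type FragmentLike = { content: string };
type DraftSlots = { mainSlotText: string; ruleSlotText: string; customPrompt: string };

function compilePreviewSketch(
  draft: DraftSlots,
  fragments: {
    mainFragment: FragmentLike | null;
    ruleFragment: FragmentLike | null;
    inspirationFragment: FragmentLike | null;
  },
): string {
  const parts = [
    // Assumed precedence: text the user edited in a slot wins over the fragment's own content.
    draft.mainSlotText || fragments.mainFragment?.content || "",
    draft.ruleSlotText || fragments.ruleFragment?.content || "",
    fragments.inspirationFragment?.content ?? "",
    draft.customPrompt,
  ];
  // Trim each section and drop the empty ones so the prompt has no stray blank blocks.
  return parts
    .map((part) => part.trim())
    .filter((part) => part.length > 0)
    .join("\n\n");
}
```

Keeping the function pure (draft and fragments in, string out) is what makes a real-time preview cheap: it can simply be recomputed on every change.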

Frontend state management in Soul Builder follows one important principle: clear separation of state boundaries. More specifically, drawer state is not persisted and does not write directly into the draft. Only explicit Builder actions trigger meaningful state changes.

// Domain state (useSoulBuilder)
export function useSoulBuilder() {
// Material loading and caching
// Slot aggregation and preview compilation
// Copy actions and feedback messages
// Locale-safe descriptors
}
// Presentation state (useHomeEditorState)
export function useHomeEditorState() {
// activeSlot, drawerSide, drawerOpen
// default focus behavior
}

That separation ensures both edit-state safety and responsive UI behavior. Opening and closing the drawer is purely a UI interaction and should not trigger complicated persistence logic. It may sound obvious, but it matters: UI state and business state should be separated clearly so interface interactions do not pollute the core data model.

Soul Builder uses a single-drawer mode: only one slot drawer may be open at a time. Clicking the mask, pressing the ESC key, or switching slots automatically closes the current drawer. This simplifies state management and also matches common drawer interaction patterns on mobile.

Closing the drawer does not clear the current editing content, so when users come back, their context is preserved. This kind of “lightweight” drawer design avoids interrupting the user’s flow. Nobody wants carefully written content to disappear because of one accidental click.
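
To make the single-drawer rule concrete, here is a small reducer sketch. The state fields borrow the names activeSlot and drawerOpen from useHomeEditorState, but the action shapes and the reducer itself are hypothetical. It also models "switching slots" as replacing the active slot in one step, which is one possible reading of the behavior described above.

```typescript
type Slot = "main" | "rule";
type DrawerState = { activeSlot: Slot | null; drawerOpen: boolean };
type DrawerAction =
  | { type: "open"; slot: Slot }
  | { type: "close" }; // mask click and ESC would both dispatch this

const initialDrawerState: DrawerState = { activeSlot: null, drawerOpen: false };

function drawerReducer(state: DrawerState, action: DrawerAction): DrawerState {
  switch (action.type) {
    case "open":
      // There is only one activeSlot, so opening a slot implicitly closes any other drawer.
      return { activeSlot: action.slot, drawerOpen: true };
    case "close":
      // Only the open flag changes; the draft lives elsewhere and is never touched here.
      return { ...state, drawerOpen: false };
  }
}
```

Because the reducer never sees the draft, an accidental click on the mask cannot lose any text.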

Internationalization is an important capability of the Soul platform. System copy fully supports bilingual switching, while user draft text is never rewritten when the language changes, because draft text is user-authored free input rather than system-translated content.

Official inspiration cards (Marketplace Souls) keep the upstream display name while also providing a best-effort English summary. For Souls with Chinese names, we generate English versions through predefined mapping rules:

// English name mapping for main roles
const mainNameEnglishMap = {
"雾港旅人": "Mistport Traveler",
"夜航猎手": "Night Hunter",
// ...
}
// English name mapping for orthogonal rules
const ruleNameEnglishMap = {
"简洁干练": "Concise & Professional",
"啰嗦亲切": "Verbose & Friendly",
// ...
}

The mapping table itself looks simple enough, but keeping it in good shape still takes care. There are 50 main roles and 10 orthogonal rules, which means 500 combinations in total. That is not huge, but it is enough to deserve respect.
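
As a sketch of how the localized field and the mapping table might work together at display time, consider the lookup below. The locale codes, the FragmentText shape, and the fallback order (explicit localized title, then the English mapping table, then the upstream name) are assumptions made for this example.

```typescript
type AppLocale = "zh-CN" | "en-US"; // assumed locale codes
type LocalizedText = { title?: string; summary?: string };
type FragmentText = {
  title: string;
  summary: string;
  localized?: Partial<Record<AppLocale, LocalizedText>>;
};

function displayTitle(
  fragment: FragmentText,
  locale: AppLocale,
  englishNameMap: Record<string, string>,
): string {
  // 1) An explicit localized title always wins.
  const explicit = fragment.localized?.[locale]?.title;
  if (explicit) return explicit;
  // 2) For English, fall back to the predefined mapping table.
  const mapped = englishNameMap[fragment.title];
  if (locale === "en-US" && mapped) return mapped;
  // 3) Otherwise keep the upstream display name unchanged.
  return fragment.title;
}
```

The last branch is the "best-effort" part: a Soul with no mapping still renders, just under its original name.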

Bulk generation of the Soul Catalog happens on the backend, where C# is used to automate the creation of 50 x 10 = 500 combinations:

foreach (var main in source.MainCatalogs)
{
foreach (var orthogonal in source.OrthogonalCatalogs)
{
var catalogId = $"soul-{main.Index:00}-{orthogonal.Index:00}";
var displayName = BuildNickname(main, orthogonal);
var soulSnapshot = BuildSoulSnapshot(main, orthogonal);
// Write to the database...
}
}
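
For readers who think in TypeScript, the same enumeration can be sketched as below. It assumes 1-based indexes (the C# snippet does not say where Index starts) and mirrors the two-digit zero padding of the `:00` format specifier.

```typescript
function buildCatalogIds(mainCount: number, ruleCount: number): string[] {
  // Two-digit zero padding, matching the "00" format in the C# loop.
  const pad = (n: number): string => (n < 10 ? "0" + n : String(n));
  const ids: string[] = [];
  for (let main = 1; main <= mainCount; main++) {
    for (let rule = 1; rule <= ruleCount; rule++) {
      ids.push(`soul-${pad(main)}-${pad(rule)}`);
    }
  }
  return ids;
}
```

With 50 main roles and 10 rules this yields 500 distinct IDs, and because each ID is a pure function of the two indexes, regenerating the catalog produces the same IDs every time.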

The nickname generation algorithm combines the main role name with the expression rule name to create imaginative Agent codenames:

private static readonly string[] MainHandleRoots = [
"雾港", "夜航", "零帧", "星渊", "霓虹", "断云", ...
];
private static readonly string[] OrthogonalHandleSuffixes = [
"旅人", "猎手", "术师", "行者", "星使", ...
];
// Combination examples: 雾港旅人, 夜航猎手, 零帧术师...

Soul snapshot assembly follows a fixed template format that combines the main role core, signature traits, expression rule core, and output constraints together:

private static string BuildSoulSnapshot(main, orthogonal) => string.Join('\n', [
$"你的人设内核来自「{main.Name}」:{main.Core}",
$"保持以下标志性语言特征:{main.Signature}",
$"你的表达规则来自「{orthogonal.Name}」:{orthogonal.Core}",
$"必须遵循这些输出约束:{orthogonal.Signature}"
]);

Template assembly may sound terribly dull, but without that sort of dull work, interesting products rarely appear.

After splitting Soul from the main system into an independent platform, one important challenge was handling existing user data. It is a familiar problem: splitting things apart is easy, migration is not. We adopted three safeguards:

Backward compatibility protection. Previously saved Hero SOUL snapshots remain visible, and historical snapshots can still be previewed even if they no longer have a Marketplace source ID. In other words, none of the user’s prior configurations are lost; only where they appear has changed.

Main system API deprecation. The in-site Marketplace API returns HTTP status 410 Gone together with a migration notice that guides users to soul.hagicode.com.

Hero SOUL form refactoring. A migration notice block was added to the Hero Soul editing area to clearly tell users that the Soul platform is now independent and to provide a one-click jump button:

HeroSoulForm.tsx
<div className="rounded-2xl border border-orange-200/70 bg-orange-50/80 p-4">
<div>{t('hero.soul.migrationTitle')}</div>
<p>{t('hero.soul.migrationDescription')}</p>
<Button onClick={onOpenSoulPlatform}>
{t('hero.soul.openSoulPlatformAction')}
</Button>
</div>
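
On the client side, the 410 Gone response from the retired Marketplace API has to be treated as "moved" rather than as a generic failure. A minimal sketch follows; the response body fields (message, migrateTo) and the outcome type are hypothetical, since the article only says that a migration notice is returned.

```typescript
// Hypothetical body shape; only the 410 status and the target site come from the article.
type GoneBody = { message?: string; migrateTo?: string };

type MarketplaceOutcome =
  | { kind: "ok" }
  | { kind: "moved"; notice: string; url: string }
  | { kind: "error"; status: number };

function interpretMarketplaceStatus(status: number, body: GoneBody = {}): MarketplaceOutcome {
  if (status === 410) {
    return {
      kind: "moved",
      notice: body.message ?? "The Marketplace has moved to the Soul platform.",
      // Fall back to the address named in the article if the body carries none.
      url: body.migrateTo ?? "https://soul.hagicode.com",
    };
  }
  if (status >= 200 && status < 300) return { kind: "ok" };
  return { kind: "error", status };
}
```

Separating "moved" from "error" lets the UI show the migration block instead of a red failure toast.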

Looking back at the development of the Soul platform as a whole, there are a few practical lessons worth sharing. They are not grand principles, just things learned from real mistakes.

Local-first runtime assumptions. When designing features that depend on remote data, always assume the network may be unavailable. Using local snapshots as the baseline and remote data as enhancement ensures the product remains basically usable under any network condition.

Clear separation of state boundaries. UI state and business state should be distinguished clearly so interface interactions do not pollute the core data model. Drawer toggles are purely UI state and should not be mixed with draft persistence.

Design for internationalization early. If your product has multilingual requirements, it is best to think about them during the data model design phase. The localized field adds some structural complexity, but it greatly reduces the long-term maintenance cost of multilingual content.

Automate the material synchronization workflow. Local materials for the Soul platform come from the main system documentation. When upstream documentation changes, there needs to be a mechanism to sync it into frontend snapshots. We designed the npm run materials:sync script to automate that process and keep materials aligned with upstream.

Based on the current architecture, the Soul platform could move in several directions in the future. These are only tentative ideas, but perhaps they can be useful as a starting point.

Community sharing ecosystem. Support user uploads and sharing of custom Souls, with rating, commenting, and recommendation mechanisms so excellent Soul configurations can be discovered and reused by more people.

Multimodal expansion. Beyond text style, the platform could also support dimensions such as voice style configuration, emoji usage preferences, and code style and formatting rules. It sounds attractive in theory; implementation may tell a more complicated story.

Intelligent assistance. Automatically recommend Souls based on usage scenarios, support style transfer and fusion, and even run A/B tests on the real-world effectiveness of different Souls. There is no better way to know than to try.

Cross-platform synchronization. Support importing persona configurations from other AI platforms, provide a standardized Soul export format, and integrate with mainstream Agent frameworks.

This article shares the full evolution of the HagiCode Soul platform from its earliest emerging need to an independent platform. We discussed why a Soul mechanism is needed to solve Agent persona consistency, analyzed the three stages of architectural evolution (embedded configuration, in-site Marketplace, and independent platform), examined the core data model, state management, preview compilation, and internationalization design in depth, and summarized practical migration lessons.

The essence of Soul is an independent persona configuration layer separated from business logic. It makes the language style of AI Agents definable, reusable, and shareable. From a technical perspective, the design itself is not especially complicated, but the problem it solves is real and broadly relevant.

If you are also building AI Agent products, it may be worth asking whether your persona configuration solution is flexible enough. The Soul platform’s practical experience may offer a few useful ideas.

Perhaps one day you will run into a similar problem as well. If this article can help a little when that happens, that is probably enough.


If you found this article helpful, feel free to give the project a Star on GitHub. The public beta has already started, and you are welcome to install it and try it out.

Thank you for reading. If you found this article useful, likes, bookmarks, and shares are all appreciated. This content was created with AI-assisted collaboration, and the final content was reviewed and confirmed by the author.

Technical Analysis of the HagiCode Skill System: Building a Scalable AI Skill Management Platform

This article takes an in-depth look at the architecture and implementation of the Skill management system in the HagiCode project, covering the technical details behind four core capabilities: local global management, marketplace search, intelligent recommendations, and trusted provider management.

In the field of AI coding assistants, how to extend the boundaries of AI capabilities has always been a core question. Claude Code itself is already strong at code assistance, but different development teams and different technology stacks often need specialized capabilities for specific scenarios, such as handling Docker deployments, database optimization, or frontend component generation. That is exactly where a Skill system becomes especially important.

During the development of the HagiCode project, we ran into a similar challenge: how do we let Claude Code “learn” new professional skills like a person would, while still maintaining a solid user experience and good engineering maintainability? This problem is both hard and simple in its own way. Around that question, we designed and implemented a complete Skill management system.

This article walks through the technical architecture and core implementation of the system in detail. It is intended for developers interested in AI extensibility and command-line tool integration. It might be useful to you, or it might not, but at least it is written down now.

The approach shared in this article comes from our practical experience in the HagiCode project. HagiCode is an open-source AI coding assistant project designed to help development teams improve engineering efficiency. The project’s stack includes ASP.NET Core, the Orleans distributed framework, a TanStack Start + React frontend, and the Skill management subsystem introduced in this article.

The GitHub repository is HagiCode-org/site. If you find the technical approach in this article valuable, feel free to give it a Star. More Stars tend to improve the mood, after all.

The Skill system uses a frontend-backend separated architecture. There is nothing especially mysterious about that.

The frontend uses TanStack Start + React to build the user interface, with Redux Toolkit managing state. The four main capabilities map directly to four Tab components: Local Skills, Skill Gallery, Intelligent Recommendations, and Trusted Providers. In the end, the design is mostly about making the user experience better.

The backend is based on ASP.NET Core + ABP Framework, using Orleans Grains for distributed state management. Access to the remote skill catalog service goes through the IOnlineApiClient interface.

The overall architectural principle is to separate command execution from business logic. Through the adapter pattern, the implementation details of npm/npx command execution are hidden inside independent modules. After all, nobody really wants command-line calls scattered all over the codebase.

Core Capability 1: Local Global Management

Local global management is the most basic module. It is responsible for listing installed skills and supporting uninstall operations. There is nothing overly complicated here; it is mostly about doing the basics well.

The implementation lives in LocalSkillsTab.tsx and LocalSkillCommandAdapter.cs. The core idea is to wrap the npx skills command, parse its JSON output, and convert it into internal data structures. It sounds simple, and in practice it mostly is.

public async Task<IReadOnlyList<LocalSkillInventoryResponseDto>> GetLocalSkillsAsync(
CancellationToken cancellationToken = default)
{
var result = await _commandAdapter.ListGlobalSkillsAsync(cancellationToken);
return result.Skills.Select(skill => new LocalSkillInventoryResponseDto
{
Name = skill.Name,
Version = skill.Version,
Source = skill.Source,
InstalledPath = skill.InstalledPath,
Description = skill.Description
}).ToList();
}

The data flow is very clear: the frontend sends a request -> SkillGalleryAppService receives it -> LocalSkillCommandAdapter executes the npx command -> the JSON result is parsed -> a DTO is returned. Each step follows naturally from the previous one.

Skill uninstallation uses the npx skills remove -g <skillName> -y command, and the system automatically handles dependencies and cleanup. Installation metadata is stored in managed-install.json inside the skill directory, recording information such as install time and source version for later updates and auditing. Some things are simply worth recording.

Skill installation requires several coordinated steps. In truth, it is not especially complicated:

public async Task<SkillInstallResultDto> InstallAsync(
    SkillInstallRequestDto request,
    CancellationToken cancellationToken = default)
{
    // 1. Normalize the installation reference
    var normalized = _referenceNormalizer.Normalize(
        request.SkillId,
        request.Source,
        request.SkillSlug,
        request.Version);
    // 2. Check prerequisites
    await _prerequisiteChecker.CheckAsync(cancellationToken);
    // 3. Acquire the installation lock (released when the method exits)
    using var installLock = await _lockProvider.AcquireAsync(normalized.SkillId);
    // 4. Execute installation command
    var result = await _installCommandRunner.ExecuteAsync(
        new SkillInstallCommandExecutionRequest
        {
            Command = $"npx skills add {normalized.FullReference} -g -y",
            Timeout = TimeSpan.FromMinutes(4)
        },
        cancellationToken);
    // 5. Persist installation metadata, but only for installs that actually succeeded
    if (result.Success)
    {
        await _metadataStore.WriteAsync(normalized.SkillPath, request);
    }
    return new SkillInstallResultDto { Success = result.Success };
}

Several key design patterns are used here: the reference normalizer converts different input formats, such as tanweai/pua and @opencode/docker-skill, into a unified internal representation; the installation lock mechanism ensures only one installation operation can run for the same skill at a time; and streaming output pushes installation progress to the frontend in real time through Server-Sent Events, so users can watch terminal-like logs as they happen.

In the end, all of these patterns are there for one purpose: to keep the system simpler to use and maintain.
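
The installation lock deserves a closer look. The backend's _lockProvider is not shown in the article, so the following is only a sketch of the general technique, a per-key async lock, written in TypeScript. The class name KeyedLock and its run method are made up for this example.

```typescript
class KeyedLock {
  // For each key, the promise that resolves when the most recently queued task finishes.
  private readonly tails = new Map<string, Promise<void>>();

  async run<T>(key: string, task: () => Promise<T>): Promise<T> {
    const previous = this.tails.get(key) ?? Promise.resolve();
    let release!: () => void;
    const current = new Promise<void>((resolve) => {
      release = resolve;
    });
    // Whoever queues next on this key waits for everything before it, including us.
    const tail = previous.then(() => current);
    this.tails.set(key, tail);

    await previous;
    try {
      return await task();
    } finally {
      release();
      // Clean up when nobody queued behind us, so the map does not grow forever.
      if (this.tails.get(key) === tail) this.tails.delete(key);
    }
  }
}
```

Two installs of the same skill run one after the other, while installs of different skills proceed independently. A real deployment with several server nodes would need a distributed lock instead; this sketch only covers a single process.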

Core Capability 2: Marketplace Search

Marketplace search lets users discover and install skills from the community. One person’s ability is always limited; collective knowledge goes much further.

The search feature relies on the online API https://api.hagicode.com/v1/skills/search. To improve response speed, the system implements caching. Cache is a bit like memory: if you keep useful things around, you do not have to think so hard the next time.

private async Task<IReadOnlyList<SkillGallerySkillDto>> SearchCatalogAsync(
    string query,
    CancellationToken cancellationToken,
    IReadOnlySet<string>? allowedSources = null)
{
    // Sort the sources so the same set always produces the same key, and keep
    // "no filter" (null) distinct from an empty allow-list.
    var sourcesKey = allowedSources is null
        ? "*"
        : string.Join(",", allowedSources.OrderBy(s => s, StringComparer.Ordinal));
    var cacheKey = $"skill_search:{query}:{sourcesKey}";
    if (_memoryCache.TryGetValue(cacheKey, out IReadOnlyList<SkillGallerySkillDto>? cached)
        && cached is not null)
        return cached;
    var response = await _onlineApiClient.SearchAsync(
        new SearchSkillsRequest
        {
            Query = query,
            Limit = _options.LimitPerQuery,
        },
        cancellationToken);
    var results = response.Skills
        .Where(skill => allowedSources is null || allowedSources.Contains(skill.Source))
        .Select(skill => new SkillGallerySkillDto { ... })
        .ToList();
    // Cache the filtered results for 10 minutes
    _memoryCache.Set(cacheKey, results, TimeSpan.FromMinutes(10));
    return results;
}

Search results support filtering by trusted sources, so users only see skill sources they trust. Seed queries such as popular and recent are used to initialize the catalog, allowing users to see recommended popular skills the first time they open it. First impressions still matter.
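
The caching idea is independent of .NET. As a rough sketch, here is a tiny TTL cache in TypeScript together with a cache key that does not depend on the order of the trusted sources. TtlCache and searchCacheKey are illustrative names, and the injectable clock exists only to make expiry easy to test.

```typescript
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: evict and report a miss
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

function searchCacheKey(query: string, allowedSources?: ReadonlySet<string>): string {
  // Sorting makes {a, b} and {b, a} share one entry; "*" marks "no source filter".
  const sources = allowedSources ? Array.from(allowedSources).sort().join(",") : "*";
  return `skill_search:${query}:${sources}`;
}
```

Distinguishing "no filter" from an empty allow-list matters because the two produce different result sets and must not share a cache entry.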

Core Capability 3: Intelligent Recommendations

Intelligent recommendations are the most complex part of the system. They can automatically recommend the most suitable skills based on the current project context. Complex as it is, it is still worth building.

The full recommendation flow is divided into five stages:

1. Build project context
2. AI generates search queries
3. Search the online catalog in parallel
4. AI ranks the candidates
5. Return the recommendation list

First, the system analyzes characteristics such as the project’s technology stack, programming languages, and domain structure to build a “project profile.” That profile is a bit like a resume, recording the key traits of the project.

Then an AI Grain is used to generate targeted search queries. This design is actually quite interesting: instead of directly asking the AI, “What skills should I recommend?”, we first ask it to think about “What search terms are likely to find relevant skills?” Sometimes the way you ask the question matters more than the answer itself:

var queryGeneration = await aiGrain.GenerateSkillRecommendationQueriesAsync(
projectContext, // Project context
locale, // User language preference
maxQueries, // Maximum number of queries
effectiveSearchHero); // AI model selection

Next, those search queries are executed in parallel to gather a candidate skill list. Parallel processing is, at the end of the day, just a way to save time.
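
The parallel search step can be sketched as follows. The Candidate shape and the search callback are placeholders, and two choices here are assumptions rather than documented behavior: a failed query contributes no candidates instead of failing the whole batch, and candidates are de-duplicated by id.

```typescript
type Candidate = { id: string; source: string };

async function searchAllQueries(
  queries: string[],
  search: (query: string) => Promise<Candidate[]>,
): Promise<Candidate[]> {
  // Start every query at once; a failing query degrades to an empty result.
  const batches = await Promise.all(
    queries.map((query) => search(query).catch(() => [] as Candidate[])),
  );
  // Different queries often return the same skill, so keep the first occurrence only.
  const byId = new Map<string, Candidate>();
  for (const batch of batches) {
    for (const candidate of batch) {
      if (!byId.has(candidate.id)) byId.set(candidate.id, candidate);
    }
  }
  return Array.from(byId.values());
}
```

The total latency is then roughly that of the slowest query rather than the sum of all of them.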

Finally, another AI Grain ranks the candidate skills. This step considers factors such as skill relevance to the project, trust status, and user historical preferences:

var ranking = await aiGrain.RankSkillRecommendationsAsync(
projectContext,
candidates,
installedSkillNames,
locale,
maxRecommendations,
effectiveRankingHero);
response.Items = MergeRecommendations(projectContext, candidates, ranking, maxRecommendations);

AI models can respond slowly or become temporarily unavailable. Even the best systems stumble sometimes. For that reason, the system includes a deterministic fallback mechanism: when the AI service is unavailable, it uses a rule-based heuristic algorithm to generate recommendations, such as inferring likely required skills from dependencies in package.json.

Put plainly, this fallback mechanism is simply a backup plan for the system.
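
As an illustration of what such a rule-based fallback might look like, the sketch below maps dependency names from package.json to search keywords. The hint table is entirely hypothetical; the article does not list the real rules.

```typescript
// Hypothetical dependency-to-keyword rules, for illustration only.
const dependencyHints: Array<{ match: RegExp; keyword: string }> = [
  { match: /^react(-dom)?$/, keyword: "react" },
  { match: /^(pg|mysql2|prisma)$/, keyword: "database" },
  { match: /^dockerode$/, keyword: "docker" },
];

type PackageJsonLike = {
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
};

function fallbackKeywords(packageJson: PackageJsonLike): string[] {
  const names = Object.keys({
    ...packageJson.dependencies,
    ...packageJson.devDependencies,
  });
  const found = new Set<string>();
  for (const name of names) {
    for (const hint of dependencyHints) {
      if (hint.match.test(name)) found.add(hint.keyword);
    }
  }
  // Sorted output keeps the fallback deterministic for the same input.
  return Array.from(found).sort();
}
```

The keywords can then be fed into the same catalog search the AI path uses, so the rest of the pipeline does not need to know which path produced them.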

Core Capability 4: Trusted Provider Management

Trusted provider management allows users to control which skill sources are considered trustworthy. Trust is still something users should be able to define for themselves.

Trusted providers support two matching rules: exact match (exact) and prefix match (prefix).

public static TrustedSkillProviderResolutionSnapshot Resolve(
    TrustedSkillProviderSnapshot snapshot,
    string source)
{
    var normalizedSource = Normalize(source);
    foreach (var entry in snapshot.Entries.OrderBy(e => e.SortOrder))
    {
        if (!entry.IsEnabled) continue;
        foreach (var rule in entry.MatchRules)
        {
            bool isMatch = rule.MatchType switch
            {
                TrustedSkillProviderMatchRuleType.Exact
                    => string.Equals(normalizedSource, Normalize(rule.Value),
                        StringComparison.OrdinalIgnoreCase),
                TrustedSkillProviderMatchRuleType.Prefix
                    => normalizedSource.StartsWith(Normalize(rule.Value) + "/",
                        StringComparison.OrdinalIgnoreCase),
                _ => false
            };
            if (isMatch)
                return new TrustedSkillProviderResolutionSnapshot
                {
                    IsTrustedSource = true,
                    ProviderId = entry.ProviderId,
                    DisplayName = entry.DisplayName
                };
        }
    }
    return new TrustedSkillProviderResolutionSnapshot { IsTrustedSource = false };
}

Built-in trusted providers include well-known organizations and projects such as Vercel, Azure, anthropics, Microsoft, and browser-use. Custom providers can be added through configuration files by specifying a provider ID, display name, badge label, matching rules, and more. The world is large enough that only trusting a few built-ins would never be enough.

Trusted configuration is persisted using an Orleans Grain:

public class TrustedSkillProviderGrain : Grain<TrustedSkillProviderState>,
    ITrustedSkillProviderGrain
{
    public async Task UpdateConfigurationAsync(TrustedSkillProviderSnapshot snapshot)
    {
        State.Snapshot = snapshot;
        await WriteStateAsync();
    }

    public Task<TrustedSkillProviderSnapshot> GetConfigurationAsync()
    {
        return Task.FromResult(State.Snapshot);
    }
}

The benefit of this approach is that configuration changes are automatically synchronized across all nodes, without any need to refresh caches manually. Automation is, ultimately, about letting people worry less.

The Skill system needs to execute various npx commands. If that logic were scattered everywhere, the code would quickly become difficult to maintain. That is why we designed an adapter interface. Design patterns, in the end, exist to make code easier to maintain:

public interface ISkillInstallCommandRunner
{
Task<SkillInstallCommandExecutionResult> ExecuteAsync(
SkillInstallCommandExecutionRequest request,
CancellationToken cancellationToken = default);
}

Different commands have different executor implementations, but all of them implement the same interface, making testing and replacement straightforward.

Installation progress is pushed to the frontend in real time through Server-Sent Events:

public async Task InstallWithProgressAsync(
    SkillInstallRequestDto request,
    IServerStreamWriter<SkillInstallProgressEventDto> stream,
    CancellationToken cancellationToken)
{
    using var process = new Process
    {
        StartInfo = new ProcessStartInfo
        {
            FileName = "npx",
            Arguments = $"skills add {request.FullReference} -g -y",
            RedirectStandardOutput = true,
            RedirectStandardError = true,
            UseShellExecute = false
        }
    };
    process.OutputDataReceived += async (sender, e) =>
    {
        // e.Data is null when the stream closes, so skip that final callback
        if (e.Data is null) return;
        await stream.WriteAsync(new SkillInstallProgressEventDto
        {
            EventType = "output",
            Data = e.Data
        });
    };
    process.Start();
    process.BeginOutputReadLine();
    // Drain stderr as well: a redirected stream that is never read can fill
    // its buffer and stall the child process
    process.BeginErrorReadLine();
    await process.WaitForExitAsync(cancellationToken);
}

On the frontend, users can see terminal-like output in real time, which makes the experience very intuitive. Real-time feedback helps people feel at ease.

Take installing the pua skill as an example (it is a popular community skill):

  1. Open the Skills drawer and switch to the Skill Gallery tab
  2. Enter pua in the search box
  3. Click the search result to view the skill details
  4. Click the Install button
  5. Switch to the Local Skills tab to confirm the installation succeeded

The installation command is npx skills add tanweai/pua -g -y, and the system handles all the details automatically. There are not really that many steps once you take them one by one.

If your team has its own skill repository, you can add it as a trusted source:

providerId: "my-team"
displayName: "My Team Skills"
badgeLabel: "MyTeam"
isEnabled: true
sortOrder: 100
matchRules:
  - matchType: "prefix"
    value: "my-team/"
  - matchType: "exact"
    value: "my-team/special-skill"

This way, all skills from your team will display a trusted badge, making users more comfortable installing them. Labels and signals do help people feel more confident.
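
To see how the two rules above would resolve, here is a TypeScript sketch of the exact and prefix matching. One detail is worth noticing: the C# Resolve method appends "/" to the normalized rule value, so a prefix value written as "my-team/" only matches if Normalize strips the trailing slash. The article does not show Normalize, so the version below (trim, lowercase, strip trailing slashes) is an assumption.

```typescript
type MatchRule = { matchType: "exact" | "prefix"; value: string };

// Assumed normalization; the real Normalize implementation is not shown in the article.
const normalizeSource = (raw: string): string =>
  raw.trim().toLowerCase().replace(/\/+$/, "");

function isTrustedSource(source: string, rules: MatchRule[]): boolean {
  const normalized = normalizeSource(source);
  return rules.some((rule) => {
    const value = normalizeSource(rule.value);
    return rule.matchType === "exact"
      ? normalized === value
      // Requiring the "/" keeps "my-team-evil/x" from matching the "my-team" prefix.
      : normalized.startsWith(value + "/");
  });
}
```

If the real Normalize leaves trailing slashes in place, the prefix value in the configuration should be written as "my-team" instead.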

Creating a custom skill requires the following structure:

my-skill/
├── SKILL.md # Skill metadata (YAML front matter)
├── index.ts # Skill entry point
├── agents/ # Supported agent configuration
└── references/ # Reference resources

An example SKILL.md format:

---
name: my-skill
description: A brief description of what this skill does
---
# My Skill
Detailed documentation...

A few things to keep in mind:

  1. Network requirements: skill search and installation require access to api.hagicode.com and the npm registry
  2. Node.js version: Node.js 18 or later is recommended
  3. Permission requirements: global npm installation permissions are required
  4. Concurrency control: only one install or uninstall operation can run for the same skill at a time
  5. Timeout settings: the default timeout for installation is 4 minutes, but complex scenarios may require adjustment

These notes exist, ultimately, to help things go smoothly.

This article introduced the complete implementation of the Skill management system in the HagiCode project. Through a frontend-backend separated architecture, the adapter pattern, Orleans-based distributed state management, and related techniques, the system delivers:

  • Local global management: a unified skill management interface built by wrapping npx skills commands
  • Marketplace search: rapid discovery of community skills through the online API and caching mechanisms
  • Intelligent recommendations: AI-powered skill recommendations based on project context
  • Trust management: a flexible configuration system that lets users control trust boundaries

This design approach is not only applicable to Skill management. It is also useful as a reference for any scenario that needs to integrate command-line tools while balancing local storage and online services.

If this article helped you, feel free to give us a Star on GitHub: github.com/HagiCode-org/site. You can also visit the official site to learn more: hagicode.com.

You may think this system is well designed, or you may not. Either way, that is fine. Once code is written, someone will use it, and someone will not.

Thank you for reading. If you found this article useful, feel free to like, bookmark, and share it. This content was produced with AI-assisted collaboration, and the final content was reviewed and approved by the author.

I Might Be Replaced by an Agent, So I Ran the Numbers

Quantifying AI replacement risk with data: a deep dive into how the HagiCode team uses six core formulas to redefine how knowledge workers evaluate their competitiveness.

With AI technology advancing at breakneck speed, every knowledge worker is facing an urgent question: In the AI era, will I be replaced?

It sounds a little alarmist, but plenty of people are quietly uneasy about it. You just finish learning a new framework, and AI is already telling you your role might be automated away; you finally master a language, and then discover that someone using AI is producing three times as much as you. If you are reading this, you have probably felt at least some of that anxiety.

And honestly, that anxiety is not irrational. No one wants to admit that the skills they spent years building could be outperformed by a single ChatGPT session. Still, anxiety is one thing; life goes on.

Traditional discussions usually start from the question of “what AI can do,” but that framing misses two critical dimensions:

  1. The business perspective: whether a company is willing to equip an employee with AI tools depends on whether AI costs make economic sense relative to labor costs. It is not enough for AI to be capable of replacing a role; the company also has to run the numbers. Capital is not a charity, and every dollar has to count.
  2. The efficiency perspective: AI-driven productivity gains need to be quantified instead of being reduced to the vague claim that “using AI makes you stronger.” Maybe your efficiency doubles with AI, but someone else gets a 5x improvement. That gap matters. It is like school: everyone sits in the same class, but some score 90 while others barely pass.

So the real question is: how do we turn this fuzzy anxiety into measurable indicators?

It is always better to know where you stand than to fumble around in the dark. That is what we are talking about today: the design logic behind the AI productivity calculator built by the HagiCode team.

So I made a site: https://cost.hagicode.com.

HagiCode is an open-source AI coding assistant project built to help developers code more efficiently.

What is interesting is that while building their own product, the HagiCode team accumulated a lot of hands-on experience around AI productivity. They realized that the value of an AI tool cannot be assessed in isolation from a company’s employment costs. Based on that insight, the team decided to build a productivity calculator to help knowledge workers evaluate their competitiveness in the AI era more scientifically.

Plenty of people could build something like this. The difference is that very few are willing to do it seriously. The HagiCode team spent time on it as a way of giving something back to the developer community.

The design shared in this article is a summary of HagiCode’s experience applying AI in real engineering work. If you find this evaluation framework valuable, it suggests that HagiCode really does have something to offer in engineering practice. In that case, the HagiCode project itself is also worth paying attention to.

A company’s real cost for an employee is far more than salary alone. A lot of people only realize this when changing jobs: you negotiate a monthly salary of 20,000 CNY, but take home only 14,000. On the company side, the spend is not just 20,000 either. Social insurance, housing fund contributions, training, and recruiting costs all have to be included.

According to the implementation in calculate-ai-risk.ts:

Total annual employment cost = Annual salary x (1 + city coefficient) + Annual salary / 12

The city coefficient reflects differences in hiring and retention costs across cities:

| City tier | Representative cities | Coefficient |
| --- | --- | --- |
| Tier 1 | Beijing / Shanghai / Shenzhen / Guangzhou | 0.4 |
| New Tier 1 | Hangzhou / Chengdu / Suzhou / Nanjing | 0.3 |
| Tier 2 | Wuhan / Xi’an / Tianjin / Zhengzhou | 0.2 |
| Other | Yichang / Luoyang and others | 0.1 |

A Tier 1 city coefficient of 0.4 means the company needs to pay roughly 40% extra in recruiting, training, insurance, and similar overhead. The all-in cost of hiring someone in Beijing really is much higher than in a Tier 2 city.
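The cost formula and the coefficient table can be sketched in a few lines of TypeScript. This is a minimal illustration, not the actual contents of calculate-ai-risk.ts: only the "tier1" key appears in the article's sample input, so the other tier keys and the function name are assumptions.

```typescript
// City coefficients copied from the table above.
// Only "tier1" is confirmed by the article; the other keys are hypothetical.
type CityTier = "tier1" | "newTier1" | "tier2" | "other";

const CITY_COEFFICIENTS: Record<CityTier, number> = {
  tier1: 0.4,
  newTier1: 0.3,
  tier2: 0.2,
  other: 0.1,
};

// Total annual employment cost = salary x (1 + city coefficient) + salary / 12
function totalEmploymentCostCny(annualSalaryCny: number, tier: CityTier): number {
  return annualSalaryCny * (1 + CITY_COEFFICIENTS[tier]) + annualSalaryCny / 12;
}
```

For a salary of 240,000 CNY in a Tier 2 city, this gives 288,000 + 20,000 = 308,000 CNY per year.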

The cost of living in major cities is high too. You could think of it as another version of a “drifter tax.”

AI models price input and output tokens separately, and the gap between the two can be large. In coding scenarios, the input-to-output token ratio is roughly 3:1: you might hand the AI a large block of code to review, while its analysis is usually much shorter than the input.

The blended unit price formula is:

Blended unit price = (input-output ratio x input price + output price) / (input-output ratio + 1)

Take GPT-5 as an example:

  • Input: $2.5/1M tokens
  • Output: $15/1M tokens
  • Blended = (3 x 2.5 + 15) / 4 = $5.625/1M tokens

For models priced in USD, you also need to convert using an exchange rate. The HagiCode team currently sets that rate to 7.25 and updates it as the market changes.
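As a rough sketch, the blended price and the currency conversion might be combined like this. The interface and function names are assumptions made for illustration; the 3:1 ratio and the 7.25 exchange rate are the values quoted in the text.

```typescript
// Per-model pricing, expressed per 1M tokens (hypothetical shape).
interface ModelPricing {
  inputPerMTokens: number;
  outputPerMTokens: number;
  currency: "USD" | "CNY";
}

const INPUT_OUTPUT_RATIO = 3; // roughly 3:1 in coding scenarios
const USD_TO_CNY = 7.25;      // rate quoted in the article; it changes over time

// Blended unit price = (ratio x input price + output price) / (ratio + 1),
// converted to CNY when the model is priced in USD.
function blendedPriceCny(
  pricing: ModelPricing,
  ratio: number = INPUT_OUTPUT_RATIO,
  usdToCny: number = USD_TO_CNY,
): number {
  const blended =
    (ratio * pricing.inputPerMTokens + pricing.outputPerMTokens) / (ratio + 1);
  return pricing.currency === "USD" ? blended * usdToCny : blended;
}
```

With the GPT-5 prices above, the blended price is 5.625 USD, or about 40.78 CNY per 1M tokens at a rate of 7.25.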

Exchange rates are like the stock market: no one can predict them exactly. You just follow the trend.

Average daily AI cost = Average daily token demand (M) x blended unit price (CNY/1M)
Annual AI cost = Average daily AI cost x 264 working days

264 = 22 days/month x 12 months, which is the number of working days in a standard year. Why not use 365? Because you have to account for weekends, holidays, sick leave, and so on.
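The two cost formulas above translate directly into code. The following is a small sketch with assumed names, taking the daily token demand in millions and the blended price in CNY per 1M tokens.

```typescript
// 22 working days per month x 12 months
const WORKING_DAYS_PER_YEAR = 22 * 12; // 264

// Annual AI cost = daily token demand (M) x blended unit price (CNY/1M) x 264
function annualAiCostCny(dailyTokensM: number, blendedPriceCnyPerM: number): number {
  const dailyCostCny = dailyTokensM * blendedPriceCnyPerM;
  return dailyCostCny * WORKING_DAYS_PER_YEAR;
}
```

For example, 50M tokens per day at 7.125 CNY per 1M tokens costs 356.25 CNY per day, or 94,050 CNY per year.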

We are not robots, after all. AI may not need rest, but people still need room to breathe.

4. The Core Innovation: Equivalent Headcount


This is the heart of the whole evaluation system, and also where the HagiCode team’s insight shows most clearly.

Affordable workflow count = Total annual employment cost / Annual AI cost
Affordability ratio = min(affordable workflow count, 1)
Equivalent headcount = 1 + (productivity multiplier - 1) x affordability ratio

That formula looks a little abstract, so let me unpack it.

The traditional view would simply say, “your efficiency improved by 2x.” But this formula introduces a crucial constraint: is the company’s AI budget sustainable?

For example, Xiao Ming improves his efficiency by 3x, but his annual AI usage costs 300,000 CNY while the company is only paying him a salary of 200,000 CNY. In that case, his personal productivity may be impressive, but it is not sustainable. No company is going to lose money just to keep him operating at peak efficiency.

That is what the affordability ratio means. If the company can only afford 0.5 of an AI workflow, then Xiao Ming’s equivalent headcount is 1 + (3 - 1) x 0.5 = 2 people, not 3.

The key insight: what matters is not just how large your productivity multiplier is, but whether the company can afford the AI investment required to sustain that multiplier.
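The three formulas can be sketched as a single function. The names below are assumptions, and the handling of a zero AI cost (treated as an unlimited number of affordable workflows) is also an assumption that the article does not specify.

```typescript
interface HeadcountResult {
  affordableWorkflows: number;
  affordabilityRatio: number;
  equivalentHeadcount: number;
}

function computeEquivalentHeadcount(
  totalEmploymentCostCny: number,
  annualAiCostCny: number,
  productivityMultiplier: number,
): HeadcountResult {
  // Assumption: with no AI spend there is nothing to afford, so the ratio caps at 1.
  const affordableWorkflows =
    annualAiCostCny > 0 ? totalEmploymentCostCny / annualAiCostCny : Infinity;
  // The company never gets credit for more than one fully funded workflow.
  const affordabilityRatio = Math.min(affordableWorkflows, 1);
  const equivalentHeadcount = 1 + (productivityMultiplier - 1) * affordabilityRatio;
  return { affordableWorkflows, affordabilityRatio, equivalentHeadcount };
}
```

With a total employment cost of 150,000 CNY, an annual AI cost of 300,000 CNY, and a 3x multiplier, the affordable workflow count is 0.5 and the equivalent headcount is 2, matching the Xiao Ming example. Once the AI cost drops below the employment cost, the ratio caps at 1 and the full multiplier counts.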

The logic is simple once you see it. Most people just do not think from that angle. We are used to looking at the world from our own side, not from the boss’s side, where money does not come out of thin air either.

AI cost ratio = Annual AI cost / Total annual employment cost
Productivity gain = Productivity multiplier - 1
Cost-benefit ratio = Productivity gain / AI cost ratio
  • Cost-benefit ratio < 1: the AI investment is not worth it; the productivity gain does not justify the cost
  • Cost-benefit ratio 1-2: barely worth it
  • Cost-benefit ratio > 2: high return, strongly recommended

This metric is especially useful for managers because it helps them quickly judge whether a given role is worth equipping with AI tools.
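A minimal sketch of the cost-benefit calculation and its three bands follows. The names are assumptions, a ratio of exactly 2 is placed in the "barely worth it" band (the article leaves that boundary open), and the sketch assumes a non-zero AI cost.

```typescript
type CostBenefitVerdict = "not worth it" | "barely worth it" | "high return";

interface CostBenefitResult {
  aiCostRatio: number;
  costBenefitRatio: number;
  verdict: CostBenefitVerdict;
}

function evaluateCostBenefit(
  totalEmploymentCostCny: number,
  annualAiCostCny: number, // assumed to be greater than zero
  productivityMultiplier: number,
): CostBenefitResult {
  const aiCostRatio = annualAiCostCny / totalEmploymentCostCny;
  const productivityGain = productivityMultiplier - 1;
  const costBenefitRatio = productivityGain / aiCostRatio;
  let verdict: CostBenefitVerdict;
  if (costBenefitRatio < 1) {
    verdict = "not worth it";
  } else if (costBenefitRatio <= 2) {
    verdict = "barely worth it";
  } else {
    verdict = "high return";
  }
  return { aiCostRatio, costBenefitRatio, verdict };
}
```

For instance, a 1.5x multiplier bought with AI spend equal to half the employment cost gives a gain of 0.5 over a cost ratio of 0.5, so the ratio is 1 and the verdict is "barely worth it".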

At the end of the day, ROI is what matters. You can talk about higher efficiency all you want, but if the cost explodes, no one is going to buy the argument.

Risk is categorized according to equivalent headcount:

| Equivalent headcount | Risk level | Conclusion |
| --- | --- | --- |
| >= 2.0 | High risk | If your coworkers gain the same conditions, they become a serious threat to you |
| 1.5 - 2.0 | Warning | Coworkers have begun to build a clear productivity advantage |
| < 1.5 | Safe | For now, you can still maintain a gap |

After seeing that table, you probably have a rough sense of where you stand. Still, there is no point in panicking. Anxiety does not solve problems. It is better to think about how to raise your own productivity multiplier.
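The risk thresholds map onto a simple classifier. This is a sketch with assumed names; the thresholds are the ones in the risk table, with 2.0 counted as high risk and 1.5 counted as warning.

```typescript
type RiskLevel = "high risk" | "warning" | "safe";

// Classify an equivalent headcount using the thresholds from the risk table.
function classifyRisk(equivalentHeadcount: number): RiskLevel {
  if (equivalentHeadcount >= 2.0) return "high risk";
  if (equivalentHeadcount >= 1.5) return "warning";
  return "safe";
}
```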

To make the results more fun, the calculator introduces a system of seven special titles. These titles are persisted through localStorage, allowing users to unlock and display their own “achievements.”

| Title ID | Name | Unlock condition |
| --- | --- | --- |
| craftsman-spirit | Craftsman Spirit | Average daily token usage = 0 |
| prompt-alchemist | Prompt Alchemist | Daily tokens <= 20M and productivity multiplier >= 6 |
| all-in-operator | All-In Operator | Daily tokens >= 150M and productivity multiplier >= 3 |
| minimalist-runner | Minimalist Runner | Daily tokens <= 5M and productivity multiplier >= 2 |
| cost-tamer | Cost Tamer | Cost-benefit ratio >= 2.5 and AI cost ratio <= 15% |
| danger-oracle | Danger Oracle | Equivalent headcount >= 2.5 or entering the high-risk zone |
| budget-coordinator | Budget Coordinator | Affordable workflow count >= 8 |

Each title also carries a hidden meaning:

| Title | Hidden meaning |
| --- | --- |
| Craftsman Spirit | You can still do fine without AI, but you need unique competitive strengths |
| Prompt Alchemist | You achieve high output with very few tokens; a classic power-user profile |
| All-In Operator | High input, high output; suitable for high-frequency scenarios |
| Minimalist Runner | Lightweight AI usage; suitable for light-assistance scenarios |
| Cost Tamer | Extremely high ROI; the kind of employee companies love |
| Danger Oracle | You are already, or soon will be, in a high-risk group |
| Budget Coordinator | You can operate multiple AI workflows at the same time |

Gamification is really just a way to make dry data a little more entertaining. After all, who does not like collecting achievements? Like badges in a game, they may not have much practical value, but they still feel good to earn.
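The unlock conditions lend themselves to a small rule table. The sketch below is one possible shape, not the calculator's actual code: the metric field names, the storage key, and the KeyValueStore interface are all assumptions. The interface matches the getItem/setItem part of the browser's localStorage, so localStorage could be passed in directly. "Entering the high-risk zone" is read here as an equivalent headcount of at least 2.0, which already covers the 2.5 case.

```typescript
interface TitleMetrics {
  dailyTokensM: number;
  productivityMultiplier: number;
  costBenefitRatio: number;
  aiCostRatio: number; // 0.15 means 15%
  equivalentHeadcount: number;
  affordableWorkflows: number;
}

// Title IDs and conditions follow the unlock table above.
const TITLE_RULES: { id: string; unlocked: (m: TitleMetrics) => boolean }[] = [
  { id: "craftsman-spirit", unlocked: (m) => m.dailyTokensM === 0 },
  { id: "prompt-alchemist", unlocked: (m) => m.dailyTokensM <= 20 && m.productivityMultiplier >= 6 },
  { id: "all-in-operator", unlocked: (m) => m.dailyTokensM >= 150 && m.productivityMultiplier >= 3 },
  { id: "minimalist-runner", unlocked: (m) => m.dailyTokensM <= 5 && m.productivityMultiplier >= 2 },
  { id: "cost-tamer", unlocked: (m) => m.costBenefitRatio >= 2.5 && m.aiCostRatio <= 0.15 },
  // Assumption: the high-risk zone starts at 2.0, which subsumes ">= 2.5".
  { id: "danger-oracle", unlocked: (m) => m.equivalentHeadcount >= 2.0 },
  { id: "budget-coordinator", unlocked: (m) => m.affordableWorkflows >= 8 },
];

// Subset of the Web Storage API, so tests can use an in-memory stand-in.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const TITLES_STORAGE_KEY = "unlocked-titles"; // hypothetical key

// Merge newly earned titles into the persisted list and return the full list.
function unlockTitles(metrics: TitleMetrics, store: KeyValueStore): string[] {
  const previous: string[] = JSON.parse(store.getItem(TITLES_STORAGE_KEY) ?? "[]");
  const earned = TITLE_RULES.filter((rule) => rule.unlocked(metrics)).map((rule) => rule.id);
  const merged = previous.concat(earned.filter((id) => previous.indexOf(id) === -1));
  store.setItem(TITLES_STORAGE_KEY, JSON.stringify(merged));
  return merged;
}
```

Because earned titles are merged into the stored list rather than replacing it, a title stays unlocked even if a later calculation no longer meets its condition.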

Data Sources: An Authoritative Pricing System


The calculator’s pricing data comes from multiple official API pricing pages to keep the results authoritative and up to date.

This data is updated regularly, with the latest refresh on 2026-03-19.

Data only matters when it is current. Once it is outdated, it stops being useful. On that front, the HagiCode team has been quite responsible about keeping things updated.

Suppose you are a developer in Beijing with an annual salary of 400,000 CNY, using Claude Sonnet 4.6, consuming 50M tokens per day on average, and estimating that AI gives you a 3x productivity boost. The simulated input looks like this:

```typescript
const input = {
  annualIncomeCny: 400000,
  cityTier: "tier1", // Beijing
  modelId: "claude-sonnet-4-6",
  performanceMultiplier: 3.0,
  dailyTokenUsageM: 50,
};

// Calculation process
// Total annual employment cost = 400k x (1 + 0.4) + 400k / 12 ~= 593.3k
// Annual AI cost = 50 x 7.125 x 264 ~= 94k
// Affordable workflow count ~= 593.3 / 94 ~= 6.3 workflows
// Affordability ratio = min(6.3, 1) = 1
// Equivalent headcount = 1 + (3 - 1) x 1 = 3 people
```

Conclusion: if one of your coworkers has the same conditions, their output would be equivalent to three people. You are already in the high-risk zone.

If you discover that your current AI usage is “not worth it” (cost-benefit ratio < 1), you can consider:

  1. Reducing token usage: use more efficient prompts and cut down ineffective requests
  2. Choosing a more cost-effective model: for example, DeepSeek-V3 (priced in CNY and cheaper)
  3. Increasing your productivity multiplier: learn advanced Agent usage techniques and truly turn AI into productivity

In the end, all of this comes down to the art of balance. Use too much and you waste money; use too little and nothing changes. The key is finding the sweet spot.

When designing this calculator, the HagiCode team made several engineering decisions worth learning from:

  1. Pure frontend computation: all calculations run in the browser, with no backend API dependency, which protects user privacy
  2. Configuration-driven: all formulas, pricing, and role data are centralized in configuration files, so future updates do not require changing core code logic
  3. Multilingual support: supports both Chinese and English
  4. Instant feedback: results update in real time as soon as the user changes inputs
  5. Detailed formula display: every result includes the full calculation formula to help users understand it

This design makes the calculator easy to maintain and extend, while also serving as a reference template for similar data-driven applications.

Good architecture, like good code, takes time to build up. The HagiCode team put real thought into it.

The core value of the AI productivity calculator is that it turns the vague anxiety of an “AI replacement threat” into metrics that can be quantified and compared.

The equivalent headcount formula, 1 + (productivity multiplier - 1) x affordability ratio, is the core innovation of the entire framework. It considers not only productivity gains, but also whether a company can afford the AI cost, making the evaluation much closer to reality.

This framework tells us one thing clearly: in the AI era, not knowing where you stand is the most dangerous position of all.

Instead of worrying, let the data speak.

A lot of fear comes from the unknown. Once you quantify everything, the situation no longer feels quite so terrifying. At worst, you improve yourself or change tracks. Life is long, and there is no need to hang everything on a single tree.


Visit cost.hagicode.com now and complete your AI productivity assessment.



Data source: cost.hagicode.com | Powered by HagiCode

In the end, a line of poetry came to mind: “This feeling might have become a thing to remember, yet even then one was already lost.” The AI era is much the same. Instead of waiting until you are replaced and filled with regret, it is better to start taking action now…

Thank you for reading. If you found this article useful, likes, bookmarks, and shares are all welcome. This content was created with AI-assisted collaboration, and the final version was reviewed and confirmed by the author.

Hagicode.Libs: Engineering Practice for Unified Integration of Multiple AI Coding Assistant CLIs


During the development of the HagiCode project, we needed to integrate multiple AI coding assistant CLIs at the same time, including Claude Code, Codex, and CodeBuddy. Each CLI has different interfaces, parameters, and output formats, and the repeated integration code made the project harder and harder to maintain. In this article, we share how we built a unified abstraction layer with HagiCode.Libs to solve this engineering pain point. You could also say it is simply some hard-earned experience gathered from the pitfalls we have already hit.

The market for AI coding assistants is quite lively now. Besides Claude Code, there are also OpenAI’s Codex, Tencent’s CodeBuddy, and more. As an AI coding assistant project, HagiCode needs to integrate these different CLI tools across multiple subprojects, including desktop, backend, and web.

At first, the problem was manageable. Integrating one CLI was only a few hundred lines of code. But as the number of CLIs we needed to support kept growing, things started to get messy.

Each CLI has its own command-line argument format, different environment variable requirements, and a wide variety of output formats. Some output JSON, some output streaming JSON, and some output plain text. On top of that, there are cross-platform compatibility issues. Executable discovery and process management work very differently between Windows and Unix systems, so code duplication kept increasing. In truth, it was just a bit more Ctrl+C and Ctrl+V, but maintenance quickly became painful.

The most frustrating part was that every time we wanted to add support for a new CLI capability, we had to change the same code in several projects. That approach was clearly not sustainable in the long run. Code has a temper too; duplicate it too many times and it starts causing trouble.

The approach shared in this article comes from our practical experience in the HagiCode project. HagiCode is an open-source AI coding assistant project that needs to maintain multiple subprojects at the same time, including a frontend VSCode extension, backend AI services, and a cross-platform desktop client. In a way, it was exactly this complex, multi-language, multi-platform environment that led to the birth of HagiCode.Libs. You could say we were forced into it, and so be it.

Although these AI coding assistant CLIs each have their own characteristics, from a technical perspective they share several obvious traits:

Similar interaction patterns: they all start a CLI process, send a prompt, receive streaming responses, parse messages, and then either end or continue the session. At the end of the day, the whole flow follows the same basic mold.

Similar configuration needs: they all need API key authentication, working directory setup, model selection, tool permission control, and session management. After all, everyone is making a living from APIs; the differences are mostly a matter of flavor.

The same cross-platform challenges: they all need to solve executable path resolution (claude vs claude.exe vs /usr/local/bin/claude), process startup and environment variable handling, shell command escaping, and argument construction. Cross-platform work is painful no matter how you describe it. Only people who have stepped into the traps really understand the difference between Windows and Unix.
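To make the executable-resolution problem concrete, here is a small TypeScript sketch that lists the paths a discovery routine might probe. It is an illustration only, not the HagiCode.Libs implementation (which is written in .NET). The function name, the parameters, and the default extension list are assumptions; the platform and PATH are passed in as arguments so the logic can be tested on any machine.

```typescript
type PlatformName = "win32" | "linux" | "darwin";

// List candidate paths for a command, in PATH order.
// On Windows, a bare command name is expanded with each PATHEXT-style extension.
function executableCandidates(
  command: string,
  platform: PlatformName,
  pathValue: string,
  pathExt: string = ".COM;.EXE;.BAT;.CMD",
): string[] {
  const isWindows = platform === "win32";
  const listSeparator = isWindows ? ";" : ":";
  const dirSeparator = isWindows ? "\\" : "/";
  const dirs = pathValue.split(listSeparator).filter((d) => d.length > 0);

  const hasExtension = command.indexOf(".") !== -1;
  const fileNames =
    isWindows && !hasExtension
      ? pathExt
          .split(";")
          .filter((ext) => ext.length > 0)
          .map((ext) => command + ext.toLowerCase())
      : [command];

  const candidates: string[] = [];
  for (const dir of dirs) {
    // Drop trailing separators so the join does not double them.
    const cleanDir = dir.replace(/[\\\/]+$/, "");
    for (const fileName of fileNames) {
      candidates.push(cleanDir + dirSeparator + fileName);
    }
  }
  return candidates;
}
```

A real resolver would then check each candidate for existence and, on Unix, for the executable bit. Including script extensions such as .cmd matters on Windows because npm-installed CLIs are usually exposed through shim scripts rather than .exe files.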

Based on this analysis, we needed a unified abstraction layer that could provide a consistent interface, encapsulate cross-platform CLI discovery logic, handle streaming output parsing, and support both dependency injection and non-DI scenarios. It is the kind of problem that makes your head hurt just thinking about it, but you still have to face it. After all, it is our own project, so we have to finish it even if we have to cry our way through it.

We created HagiCode.Libs, a lightweight .NET 10 library workspace released under the MIT license and now published on GitHub. It may not be some world-shaking masterpiece, but it is genuinely useful for solving real problems.

```text
HagiCode.Libs/
├── src/
│   ├── HagiCode.Libs.Core/              # Core capabilities
│   │   ├── Discovery/                   # CLI executable discovery
│   │   ├── Process/                     # Cross-platform process management
│   │   ├── Transport/                   # Streaming message transport
│   │   └── Environment/                 # Runtime environment resolution
│   ├── HagiCode.Libs.Providers/         # Provider implementations
│   │   ├── ClaudeCode/                  # Claude Code provider
│   │   ├── Codex/                       # Codex provider
│   │   └── Codebuddy/                   # CodeBuddy provider
│   ├── HagiCode.Libs.ConsoleTesting/    # Testing framework
│   ├── HagiCode.Libs.ClaudeCode.Console/
│   ├── HagiCode.Libs.Codex.Console/
│   └── HagiCode.Libs.Codebuddy.Console/
└── tests/                               # xUnit tests
```

When designing HagiCode.Libs, we followed a few principles. They all came from lessons learned the hard way:

Zero heavy framework dependencies: it does not depend on ABP or any other large framework, which keeps it lightweight. These days, the fewer dependencies you have, the fewer headaches you get. Most people have already been beaten up by dependency hell at least once.

Cross-platform support: native support for Windows, macOS, and Linux, without writing separate code for different platforms. One codebase that runs everywhere is a pretty good thing.

Streaming processing: CLI output is handled with asynchronous streams, which fits modern .NET programming patterns much better. Times change, and async is king.

Flexible integration: it supports dependency injection scenarios while also allowing direct instantiation. Different people have different preferences, so we wanted it to be convenient either way.

If your project already uses dependency injection, such as ASP.NET Core or the generic host, you can integrate it directly. It is a small thing, but a well-behaved one:

```csharp
using HagiCode.Libs.Providers;
using Microsoft.Extensions.DependencyInjection;

var services = new ServiceCollection();
services.AddHagiCodeLibs();
await using var provider = services.BuildServiceProvider();

var claude = provider.GetRequiredService<ICliProvider<ClaudeCodeOptions>>();
var options = new ClaudeCodeOptions
{
    ApiKey = "your-api-key",
    Model = "claude-sonnet-4-20250514"
};

await foreach (var message in claude.ExecuteAsync(options, "Hello, Claude!"))
{
    Console.WriteLine($"{message.Type}: {message.Content}");
}
```

If you are writing a simple script or working in a non-DI scenario, creating an instance directly also works. Put simply, it depends on your personal preference:

```csharp
var claude = new ClaudeCodeProvider();
var options = new ClaudeCodeOptions
{
    ApiKey = "sk-ant-xxx",
    Model = "claude-sonnet-4-20250514"
};

await foreach (var message in claude.ExecuteAsync(options, "Help me write a quicksort"))
{
    // Handle messages
}
```

Both approaches use the same underlying implementation, so you can choose the integration style that best fits your project. There is no universal right answer in this world. What suits you is the best option. It may sound cliché, but it is true.

Each provider has its own dedicated testing console project, making it easier to validate the integration independently. Testing is one of those things where if you are going to do it, you should do it properly:

```bash
# Claude Code tests
dotnet run --project src/HagiCode.Libs.ClaudeCode.Console -- --test-provider
dotnet run --project src/HagiCode.Libs.ClaudeCode.Console -- --test-all claude

# CodeBuddy tests
dotnet run --project src/HagiCode.Libs.Codebuddy.Console -- --test-provider codebuddy-cli

# Codex tests
dotnet run --project src/HagiCode.Libs.Codex.Console -- --test-provider codex-cli
```

The testing scenarios cover several key cases:

  • Ping: health check to confirm the CLI is available
  • Simple Prompt: basic prompt test
  • Complex Prompt: multi-turn conversation test
  • Session Restore/Resume: session recovery test
  • Repository Analysis: repository analysis test

This standalone testing console design is especially useful during debugging because it lets us quickly identify whether the issue is in the HagiCode.Libs layer or in the CLI itself. Debugging is really just about finding where the problem is. Once the direction is right, you are already halfway there.

Cross-platform compatibility is one of the core goals of HagiCode.Libs. We configured the GitHub Actions workflow .github/workflows/cli-discovery-cross-platform.yml to run real CLI discovery validation across ubuntu-latest, macos-latest, and windows-latest.

This ensures that no code change silently breaks cross-platform compatibility. During local development, you can reproduce the check with the following commands. After all, you cannot ask CI to take the blame for everything. Your local environment should be able to run it too:

```bash
npm install --global @anthropic-ai/claude-code@2.1.79
HAGICODE_REAL_CLI_TESTS=1 dotnet test --filter "Category=RealCli"
```

HagiCode.Libs uses asynchronous streams to process CLI output. Compared with traditional callback or event-based approaches, this fits the asynchronous programming style of modern .NET much better. In the end, this is simply how technology moves forward, whether anyone likes it or not:

```csharp
public async IAsyncEnumerable<CliMessage> ExecuteAsync(
    TOptions options,
    string prompt,
    [EnumeratorCancellation] CancellationToken cancellationToken = default)
{
    // Start the CLI process
    // Parse streaming JSON output
    // Yield the CliMessage sequence
}
```

The message types include:

  • user: user message
  • assistant: assistant response
  • tool_use: tool invocation
  • result: session end

This design lets callers handle streaming output flexibly, whether for real-time display, buffered post-processing, or forwarding to other services. Why worry whether the sky is sunny or cloudy? What matters is that once the idea opens up, you can use it however you like.
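The "parse streaming JSON output" step can be illustrated with a small incremental parser, sketched here in TypeScript. It is not the HagiCode.Libs transport code. It assumes the CLI emits one JSON object per line and that each object carries a type and a content string, which is a simplification of what real CLIs print. The class and function names are hypothetical.

```typescript
type CliMessageType = "user" | "assistant" | "tool_use" | "result";

interface CliMessage {
  type: CliMessageType;
  content: string;
}

const KNOWN_MESSAGE_TYPES: string[] = ["user", "assistant", "tool_use", "result"];

// Narrow an arbitrary parsed value to the assumed message shape.
function isCliMessage(value: unknown): value is CliMessage {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as { type?: unknown; content?: unknown };
  return (
    typeof candidate.type === "string" &&
    KNOWN_MESSAGE_TYPES.indexOf(candidate.type) !== -1 &&
    typeof candidate.content === "string"
  );
}

// Accepts arbitrary stdout chunks and returns the complete messages seen so far.
class JsonLineParser {
  private buffer = "";

  push(chunk: string): CliMessage[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is an incomplete line (or ""); keep it for the next chunk.
    this.buffer = lines.pop() ?? "";

    const messages: CliMessage[] = [];
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed.length === 0) continue;
      let parsed: unknown;
      try {
        parsed = JSON.parse(trimmed);
      } catch {
        continue; // skip banners, progress output, and other non-JSON noise
      }
      if (isCliMessage(parsed)) messages.push(parsed);
    }
    return messages;
  }
}
```

Buffering the trailing partial line is the important detail: process output arrives in arbitrary chunks, so a message may be split across two reads and must not be parsed until its newline arrives.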

The HagiCode.Libs.Exploration module provides Git repository discovery and status checking, which is especially useful in repository analysis scenarios. This feature was also born out of necessity, because HagiCode needs to analyze repositories:

```csharp
// Discover Git repositories
var repositories = await GitRepositoryDiscovery.DiscoverAsync("/path/to/search");

// Get repository information
var info = await GitRepository.GetInfoAsync(repoPath);
Console.WriteLine($"Branch: {info.Branch}, Remote: {info.RemoteUrl}");
Console.WriteLine($"Has uncommitted changes: {info.HasUncommittedChanges}");
```

HagiCode’s code analysis capabilities use this module to identify project structure and Git status. It is a good example of making full use of what we built.

Based on our practice in the HagiCode project, there are several points that deserve special attention. They are all real issues that need to be handled carefully:

API key security: do not hardcode API keys in your code. Use environment variables or configuration management instead. HagiCode.Libs supports passing configuration through Options objects, making it easier to integrate with different configuration sources. When it comes to security, there is no such thing as being too careful.

CLI version pinning: in CI/CD, we pin specific versions, such as @anthropic-ai/claude-code@2.1.79, to reduce uncertainty caused by version drift. It is also a good idea to use fixed versions in local development. Versioning can be painful. If you do not pin versions, the problem will teach you a lesson very quickly.

Test categorization: default tests use fake providers to keep them deterministic and fast, while real CLI tests must be enabled explicitly. This gives CI fast feedback while still allowing real-environment validation when needed. Striking that balance is never easy. Speed and stability always require trade-offs.

Session management: different CLIs have different session recovery mechanisms. Claude Code uses the .claude/ directory to store sessions, while Codex and CodeBuddy each have their own approaches. When using them, be sure to check their respective documentation and understand the details of their session persistence mechanisms. There is no harm in understanding it clearly.

HagiCode.Libs is the unified abstraction layer we built during the development of HagiCode to solve the repeated engineering work involved in multi-CLI integration. By providing a consistent interface, encapsulating cross-platform details, and supporting flexible integration patterns, it greatly reduces the engineering complexity of integrating multiple AI coding assistants. Much may fade away, but the experience remains.

If you also need to integrate multiple AI CLI tools in your project, or if you are interested in cross-platform process management and streaming message handling, feel free to check it out on GitHub. The project is released under the MIT license, and contributions and feedback are welcome. In the end, it is a happy coincidence that we met here, so since you are already here, we might as well become friends.

The approach shared in this article was shaped by real pitfalls and real optimization work inside HagiCode. What else could we do? Running into pitfalls is normal. If you think this solution is valuable, then perhaps our engineering work is doing all right. And HagiCode itself may also be worth your attention. You might even find a pleasant surprise.



Thank you for reading. If you found this article useful, you are welcome to like, bookmark, and share it. This content was created with AI-assisted collaboration, and the final content was reviewed and confirmed by the author.

Why HagiCode Chose Hermes as Its Integrated Agent Core


When building an AI-assisted coding platform, choosing the right Agent core directly determines the upper limit of the system’s capabilities. Some things simply cannot be forced; pick the wrong framework, and no amount of effort will make it feel right. This article shares the thinking behind HagiCode’s technical selection and our hands-on experience integrating Hermes Agent.

When building an AI-assisted coding product, one of the hardest parts is choosing the underlying Agent framework. There are actually quite a few options on the market, but some are too limited in functionality, some are overly complex to deploy, and others simply do not scale well enough. What we needed was a solution that could run on a $5 VPS while also being able to connect to a GPU cluster. That requirement may not sound extreme, but it is enough to scare plenty of teams away.

In practice, many so-called “all-in-one Agents” either only run in the cloud or require absurdly high local deployment costs. After spending two weeks researching different approaches, we made a bold decision: rebuild the entire Agent core around Hermes as the underlying engine for our integrated Agent.

Everything that followed may simply have been fate.

The approach shared in this article comes from real-world experience in the HagiCode project. HagiCode is an AI-assisted coding platform that provides developers with an intelligent coding assistant through a VSCode extension, a desktop client, and web services. You may have used similar tools before and felt they were just missing that final touch; we understand that feeling well.

Before diving into Hermes itself, it helps to explain why HagiCode needed something like it in the first place. Things rarely work exactly the way you want, so you need a practical reason to commit to a technical direction.

As an AI coding assistant, HagiCode needs to support several usage scenarios at the same time:

  • Local development environments: developers want to run it on their own machines so data never leaves the local environment. These days, data security is never a trivial concern.
  • Team collaboration environments: small teams should be able to share an Agent deployment running on a server. Saving money matters, and everyone has limits.
  • Elastic cloud expansion: when handling complex tasks, the system should automatically scale out to a GPU cluster. It is always better to be prepared.

This “we want everything at once” requirement is what led us to Hermes. Whether it was the perfect choice, I cannot say for sure, but at the time we did not see a better option.

Hermes Agent is an autonomous AI Agent created by Nous Research. Some readers may not be familiar with Nous Research; they are the lab behind open-source large models such as Hermes, Nomos, and Psyché. They have built many excellent things, even if they are still more underappreciated than they deserve.

Unlike traditional IDE coding assistants or simple API chat wrappers, Hermes has a defining trait: the longer it runs, the more capable it becomes. It is not designed to complete a task once and stop; it keeps learning and accumulating experience over long-running operation. In that sense, it feels a little like a person.

Several of Hermes’s core capabilities happen to align very closely with HagiCode’s needs.

This means HagiCode can choose the most suitable deployment model based on each user’s scenario: individuals run it locally, teams deploy it on servers, and complex tasks use GPU resources. One codebase handles all of it. In a world this busy, saving one layer of complexity is already a win.

Multi-platform messaging gateway: Hermes natively supports Telegram, Discord, Slack, WhatsApp, and more. For HagiCode, this means we can support AI assistants on those channels much more easily in the future. More paths forward are always welcome.

Rich tool system: Hermes comes with 40+ built-in tools and supports MCP (Model Context Protocol) extensions. This is essential for a coding assistant: executing shell commands, working with the file system, and calling Git all depend on tool support. An Agent without tools is like a bird without wings.

Cross-session memory: Hermes includes a persistent memory system and uses FTS5 full-text search to recall historical conversations. That allows the Agent to remember prior context instead of “losing its memory” every time. Sometimes people wish they could forget things that easily, but reality is usually less generous.

Now that the “why” is clear, let us look at the “how.” Once something makes sense in theory, the next step is to build it.

In HagiCode’s architecture, all AI Providers implement a unified IAIProvider interface:

```csharp
public sealed class HermesCliProvider : IAIProvider, IVersionedAIProvider
{
    public ProviderCapabilities Capabilities { get; } = new ProviderCapabilities
    {
        SupportsStreaming = true,       // Supports streaming output
        SupportsTools = true,           // Supports tool invocation
        SupportsSystemMessages = true,  // Supports system prompts
        SupportsArtifacts = false
    };
}
```

This abstraction layer allows HagiCode to switch seamlessly between different AI Providers. Whether the backend is OpenAI, Claude, or Hermes, the upper-layer calling pattern stays exactly the same. In plain terms, it keeps things simple.

Hermes communicates through ACP (Agent Client Protocol), a protocol for communication between a client application and an Agent. Its main methods include:

| Method | Description |
| --- | --- |
| initialize | Initialize the connection and obtain the protocol version and client capabilities |
| authenticate | Handle authentication and support multiple authentication methods |
| session/new | Create a new session and configure the working directory and MCP servers |
| session/prompt | Send a prompt and receive a response |

HagiCode implements the ACP transport layer through StdioAcpTransport, launching a Hermes subprocess and communicating with it over standard input and output. It may sound complicated, but in practice it is manageable as long as you have enough patience.
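To give a feel for what travels over standard input, the sketch below builds the four requests from the method table as JSON-RPC 2.0 messages, one JSON document per line. It is written in TypeScript for illustration and is not HagiCode's StdioAcpTransport. The JSON-RPC envelope, the newline framing, and every parameter name (protocolVersion, clientCapabilities, methodId, cwd, mcpServers, sessionId, prompt) are assumptions based on a reading of the public ACP schema and should be checked against the protocol version you target.

```typescript
// Minimal JSON-RPC 2.0 request envelope (assumed framing for ACP over stdio).
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params: Record<string, unknown>;
}

// Returns a builder that hands out increasing request ids.
function createRequestBuilder(): (method: string, params: Record<string, unknown>) => JsonRpcRequest {
  let nextRequestId = 1;
  return (method, params) => ({ jsonrpc: "2.0", id: nextRequestId++, method, params });
}

// One JSON document per line on the subprocess's stdin (assumed framing).
function encodeFrame(request: JsonRpcRequest): string {
  return JSON.stringify(request) + "\n";
}

const buildRequest = createRequestBuilder();

// The four calls from the method table, in order. Parameter names are assumptions.
const acpHandshake: JsonRpcRequest[] = [
  buildRequest("initialize", { protocolVersion: 1, clientCapabilities: {} }),
  buildRequest("authenticate", { methodId: "api-key" }),
  buildRequest("session/new", { cwd: "/path/to/project", mcpServers: [] }),
  // The session id is a placeholder for the value returned by session/new.
  buildRequest("session/prompt", {
    sessionId: "session-id-from-session-new",
    prompt: [{ type: "text", text: "Summarize this repository" }],
  }),
];

const acpFrames: string[] = acpHandshake.map(encodeFrame);
```

In a real client each request is sent only after the previous response arrives, because session/prompt needs the session id returned by session/new and authentication depends on the methods advertised in the initialize response.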

Configuration is managed through the HermesPlatformConfiguration class:

```csharp
public sealed class HermesPlatformConfiguration : IAcpPlatformConfiguration
{
    public string ExecutablePath { get; set; } = "hermes";
    public string Arguments { get; set; } = "acp";
    public int StartupTimeoutMs { get; set; } = 5000;
    public string ClientName { get; set; } = "HagiCode";
    public HermesAuthenticationConfiguration Authentication { get; set; }
    public HermesSessionDefaultsConfiguration SessionDefaults { get; set; }
}
```

Configure Hermes in appsettings.json:

{
  "Providers": {
    "HermesCli": {
      "ExecutablePath": "hermes",
      "Arguments": "acp",
      "StartupTimeoutMs": 10000,
      "ClientName": "HagiCode",
      "Authentication": {
        "PreferredMethodId": "api-key",
        "MethodInfo": {
          "api-key": "your-api-key-here"
        }
      },
      "SessionDefaults": {
        "Model": "claude-sonnet-4-20250514",
        "ModeId": "default"
      }
    }
  }
}

Configuration often looks simple on paper, but getting every detail right still takes real effort.
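One detail worth seeing in code is how values in the JSON file override the class defaults (for example, StartupTimeoutMs is 5000 in the class but 10000 in the file). The sketch below is a simplified illustration using only System.Text.Json; HermesSettingsSketch and HermesSettingsLoader are hypothetical names, and a real application would more likely use the .NET configuration binder instead.

```csharp
using System.Text.Json;

// Hypothetical, trimmed-down mirror of the configuration class, for illustration only.
public sealed class HermesSettingsSketch
{
    public string ExecutablePath { get; set; } = "hermes";
    public string Arguments { get; set; } = "acp";
    public int StartupTimeoutMs { get; set; } = 5000;
    public string ClientName { get; set; } = "HagiCode";
}

public static class HermesSettingsLoader
{
    // Reads the Providers > HermesCli section from an appsettings-style JSON document.
    public static HermesSettingsSketch Load(string json)
    {
        using var document = JsonDocument.Parse(json);
        if (!document.RootElement.TryGetProperty("Providers", out var providers) ||
            !providers.TryGetProperty("HermesCli", out var section))
        {
            // Section missing: fall back to the defaults declared on the class.
            return new HermesSettingsSketch();
        }
        // Properties absent from the JSON keep their defaults; unknown ones are ignored.
        return section.Deserialize<HermesSettingsSketch>() ?? new HermesSettingsSketch();
    }
}
```

With the configuration shown above, Load returns StartupTimeoutMs = 10000, while an empty document falls back to 5000.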

HagiCode uses Orleans to build its distributed system, and the Hermes integration is implemented through the following components:

  • HermesGrain: An Orleans Grain implementation that handles session execution
  • HermesPlatformConfiguration: Platform-specific configuration
  • HermesAcpSessionAdapter: ACP session adapter
  • HermesConsole: A dedicated validation console

The name Orleans does have a certain charm to it. Even if this Orleans has nothing to do with the legendary city, a good name never hurts.

The following is the core execution logic of the Hermes Provider:

private async IAsyncEnumerable<AIStreamingChunk> StreamCoreAsync(
    AIRequest request,
    string? embeddedCommandPrompt,
    [EnumeratorCancellation] CancellationToken cancellationToken)
{
    // 1. Create transport layer and launch Hermes subprocess
    await using var transport = new StdioAcpTransport(
        platformConfiguration.GetExecutablePath(),
        platformConfiguration.GetArguments(),
        platformConfiguration.GetEnvironmentVariables(),
        platformConfiguration.GetStartupTimeout(),
        _loggerFactory.CreateLogger<StdioAcpTransport>());
    await transport.ConnectAsync(cancellationToken);

    // 2. Initialize and obtain protocol version and authentication methods
    var initializeResult = await SendHermesRequestAsync(
        transport, nextRequestId++, "initialize",
        BuildInitializeParameters(platformConfiguration), cancellationToken);

    // 3. Handle authentication
    var authMethods = ParseAuthMethods(initializeResult);
    if (!isAuthenticated)
    {
        var methodId = platformConfiguration.Authentication.ResolveMethodId(authMethods);
        await SendHermesRequestAsync(transport, nextRequestId++, "authenticate", ...);
    }

    // 4. Create session
    var newSessionResult = await SendHermesRequestAsync(
        transport, nextRequestId++, "session/new",
        BuildNewSessionParameters(platformConfiguration, workingDirectory, model), cancellationToken);
    var sessionId = ParseSessionId(newSessionResult);

    // 5. Execute prompt and collect streaming responses
    await foreach (var payload in transport.ReceiveMessagesAsync(cancellationToken))
    {
        // Handle session/update notifications and convert them into streaming chunks
        if (TryParseSessionNotification(root, out var notification))
        {
            if (_responseMapper.TryConvertToStreamingChunk(notification, out var chunk))
            {
                yield return chunk;
            }
        }
    }
}

With code, the details eventually become familiar. What matters most is the overall approach.

To ensure Hermes remains available, HagiCode implements a health check mechanism:

public async Task<ProviderTestResult> PingAsync(CancellationToken cancellationToken = default)
{
    // Time the round trip so the result can report latency (requires using System.Diagnostics)
    var stopwatch = Stopwatch.StartNew();
    var response = await ExecuteAsync(
        new AIRequest
        {
            Prompt = "Reply with exactly PONG.",
            SessionId = null,
            AllowedTools = Array.Empty<string>(),
            WorkingDirectory = ResolveWorkingDirectory(null)
        },
        cancellationToken);
    stopwatch.Stop();

    // The check passes only when the model answers with the expected token
    var success = string.Equals(response.Content.Trim(), "PONG", StringComparison.OrdinalIgnoreCase);
    return new ProviderTestResult
    {
        ProviderName = Name,
        Success = success,
        ResponseTimeMs = stopwatch.ElapsedMilliseconds,
        ErrorMessage = success ? null : $"Unexpected Hermes ping response: '{response.Content}'."
    };
}

That is roughly what a “health check” looks like here. In some ways, people are not so different: it helps to check in from time to time, even if no one tells us exactly what to look for.

There are a few pitfalls worth understanding before integrating Hermes. Everyone steps into a few traps sooner or later.

Hermes supports multiple authentication methods, including API keys and tokens, so you need to choose based on the actual deployment scenario. Misconfiguration can cause connection failures, and the resulting error messages are not always intuitive. Sometimes the reported error is far away from the real root cause, which means slow and careful debugging is unavoidable.

When creating a session, you can configure a list of MCP servers so Hermes can call external tools. But keep the following points in mind:

  • MCP server addresses must be reachable
  • Timeouts must be configured reasonably
  • The system needs degradation handling when a server is unavailable

In practice, defensive thinking matters more than people expect.

Each session must specify a working directory so Hermes can access project files correctly. In multi-project scenarios, the working directory needs to switch dynamically. It sounds straightforward, but there are more edge cases than you might think.

Hermes responses may be split across session/update notifications and the final result, so they must be merged correctly. Otherwise, content may be lost.
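The merging step can be sketched as a small accumulator. This is only an illustration under stated assumptions: ResponseAccumulator is a hypothetical type, and the reconciliation policy (the final result may be empty, may repeat the streamed text, or may extend it) is a guess at the cases worth handling, not HagiCode's actual response mapper.

```csharp
using System;
using System.Text;

// Hypothetical accumulator: collects streamed text and reconciles it with the final result.
public sealed class ResponseAccumulator
{
    private readonly StringBuilder _streamed = new();

    // Called for every text fragment carried by a session/update notification.
    public void OnUpdate(string? textChunk)
    {
        if (!string.IsNullOrEmpty(textChunk))
        {
            _streamed.Append(textChunk);
        }
    }

    // Called once with the text of the final result, if there is any.
    public string Complete(string? finalText)
    {
        var streamed = _streamed.ToString();
        if (string.IsNullOrEmpty(finalText)) return streamed;                            // only streamed content
        if (finalText.StartsWith(streamed, StringComparison.Ordinal)) return finalText;  // final repeats or extends it
        if (streamed.EndsWith(finalText, StringComparison.Ordinal)) return streamed;     // final is already included
        return streamed + finalText;                                                     // disjoint parts: keep both
    }
}
```

The point of the design is that no branch discards text: when in doubt, both parts are kept.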

Runtime errors should be returned explicitly instead of silently falling back to another Provider. That way, users know the issue came from Hermes rather than wondering why the system suddenly switched models behind the scenes.

HagiCode’s decision to use Hermes as its integrated Agent core was not a casual impulse. It was a careful choice based on practical requirements and the technical characteristics of the framework. Whether it proves to be the perfect long-term answer is still too early to say, but so far it has been serving us well.

Hermes gives HagiCode the flexibility to adapt to a wide range of scenarios. Its powerful tool system and MCP support allow the AI assistant to do real work, while the ACP protocol and Provider abstraction layer keep the integration process clear and controllable.

If you are choosing an Agent framework for your own AI project, I hope this article offers a useful reference. Picking the right underlying architecture can make everything that follows much easier.

Thank you for reading. If you found this article useful, you are welcome to support it with a like, bookmark, or share. This content was created with AI-assisted collaboration, and the final content was reviewed and confirmed by the author.

.NET Code Protection in Practice: From Obfuscation to Virtual Machine Protection


This article explains how to implement a multi-layered code protection strategy in .NET projects, covering the full path from basic obfuscation to professional virtual machine protection.

In .NET application development, protecting core code such as license validation, business logic, and sensitive configuration from decompilation and reverse engineering is, frankly, a topic you cannot avoid. As the .NET ecosystem has matured, developers have gained access to a range of protection options, from built-in obfuscation attributes to professional virtualization-based protection tools.

As a complex multilingual monorepo project, HagiCode includes desktop applications, build systems, and license management capabilities. The code inevitably contains license validation logic, sensitive configuration such as API keys and product IDs, and business-critical logic. Those parts need serious protection, because no one wants their hard work to be exposed so easily.

This article shares the code protection approach we actually adopted in the HagiCode project and summarizes the full journey from early pitfalls to later optimization. Hopefully it gives you some useful ideas.

HagiCode is an open source AI coding assistant project dedicated to providing developers with an intelligent programming experience. The project uses a monorepo architecture and simultaneously maintains a VSCode extension, backend AI services, a cross-platform desktop client, and more. That multi-language, multi-platform complexity makes code protection an engineering challenge we have to face head-on.

The approach shared in this article is the result of real trial and error during HagiCode development. If you want to see how we solved these technical problems, keep reading. You may find a few unexpected takeaways.

1. Microsoft’s Built-in Obfuscation Attribute


.NET provides a built-in ObfuscationAttribute (written [Obfuscation] when applied), which is the most basic and commonly used code obfuscation marker. The attribute lives in the System.Reflection namespace and lets you mark code for baseline protection without adding third-party dependencies to the project.

Core features:

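  • Feature property: Names the obfuscation feature to apply. The string is interpreted by the obfuscation tool: "all" (full obfuscation) is the attribute's default, while values such as "ultra" (high obfuscation) are defined by the tool
  • Exclude property: true (the default) means exclude from obfuscation, and false means apply obfuscation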
  • Can be applied to classes, methods, properties, and other type members

In the HagiCode project, you can see it used like this:

[Obfuscation(Feature = "ultra", Exclude = false)]
public async Task<LicenseValidationResult?> ValidateLicenseAsync(...)

The advantages of this approach are fairly obvious:

  • No extra dependencies; the attribute ships with .NET out of the box
  • Can be recognized and processed by third-party obfuscation tools
  • Does not significantly increase the size of the compiled assembly

That said, it also has limitations. It is only a marker, and the actual obfuscation result depends on the tool implementation. It cannot provide virtual machine protection-level security.
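Because the attribute is only metadata, it can be read back with plain reflection, and the compiled method behaves exactly as before until a tool acts on the marker. The following is a minimal sketch of that idea; LicenseServiceSketch and ObfuscationMarkerInspector are hypothetical names introduced for illustration.

```csharp
using System;
using System.Reflection;

public class LicenseServiceSketch
{
    // The marker is recorded in metadata; it does not change the compiled method.
    [Obfuscation(Feature = "ultra", Exclude = false)]
    public bool Validate(string key) => !string.IsNullOrWhiteSpace(key);
}

public static class ObfuscationMarkerInspector
{
    // Returns the Feature string recorded on a method, or null when the method
    // is unmarked or explicitly excluded from obfuscation.
    public static string? GetFeature(Type type, string methodName)
    {
        var method = type.GetMethod(methodName);
        var marker = method?.GetCustomAttribute<ObfuscationAttribute>();
        return marker is { Exclude: false } ? marker.Feature : null;
    }
}
```

An obfuscation tool does essentially the same lookup at build time and then decides what to do with each marked member.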

VMProtect (VMP) is a professional code protection tool that provides high-level protection by compiling code into virtual machine instructions. Unlike simple name obfuscation, VMP actually transforms code logic into a form that conventional decompilers cannot reconstruct.

Protection level classification:

| Level | Virtualization | Mutation | Anti-debugging | String Encryption | Use Cases |
| --- | --- | --- | --- | --- | --- |
| HIGH | full | high | enabled | enabled | License validation, session concurrency, sensitive constants |
| MEDIUM | partial | medium | enabled | enabled | Business logic, domain models |
| LOW | none | low | disabled | disabled | Utility classes, non-critical code |

The HagiCode project defines a declarative attribute system for marking code that needs protection:

// High-priority protection
[VmProtect(VmProtectionPriority.High, Reason = "Contains license verification logic")]
public class KeygenClient { ... }

// Exclude from protection
[VmExclude(Reason = "Public API that must remain unchanged")]
public class PublicApi { ... }

// Inherited protection
[VmProtect(VmProtectionPriority.High, ProtectDerived = true)]
public class BaseLicenseValidator { ... }

VMP protection does not only matter at runtime. It also needs to be automated as part of the build pipeline, because doing it manually would be far too tedious. HagiCode’s build system supports several modes:

  • Native Windows mode: Invoke the VMProtect tool directly
  • Linux Docker container mode: Run VMP inside a container to solve cross-platform compatibility issues
  • Attribute scanning: Automatically discover protection markers in code
  • Validation mechanism: Confirm that protection has been applied successfully

Taken together, these capabilities make the process much easier to manage.

1. Using Microsoft’s Built-in Obfuscation Attribute


Apply ObfuscationAttribute directly in code:

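using System;
using System.Reflection;
using System.Threading.Tasks;

[Obfuscation(Feature = "ultra", Exclude = false)]
public class LicenseService
{
    [Obfuscation(Feature = "ultra", Exclude = false)]
    public async Task<bool> ValidateLicenseAsync(string key)
    {
        // License validation logic goes here (placeholder so the sample compiles)
        throw new NotImplementedException();
    }

    [Obfuscation(Feature = "flow", Exclude = false)]
    private string DecryptToken(string encrypted)
    {
        // Decryption logic goes here (placeholder so the sample compiles)
        throw new NotImplementedException();
    }
}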

Sometimes you need to let test assemblies access internal members while still keeping production code secure:

// AssemblyInfo.cs
using System.Runtime.CompilerServices;

[assembly: InternalsVisibleTo("HagiCode.Application.Tests")]
[assembly: InternalsVisibleTo("DynamicProxyGenAssembly2")] // for Moq

This makes testing much more convenient, because the code still needs to be tested properly.

2. Custom Attribute Definitions for VMP Protection


Create custom protection attributes to control VMP behavior:

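using System;

[AttributeUsage(AttributeTargets.Class | AttributeTargets.Method | AttributeTargets.Property)]
public class VmProtectAttribute : Attribute
{
    // Allows the bare form [VmProtect] with named arguments only
    public VmProtectAttribute() { }

    // Allows the positional form used elsewhere in this article: [VmProtect(VmProtectionPriority.High)]
    public VmProtectAttribute(VmProtectionPriority priority)
    {
        Priority = priority;
    }

    public VmProtectionPriority Priority { get; set; }
    public string? Reason { get; set; }
    public bool ProtectDerived { get; set; }
}

[AttributeUsage(AttributeTargets.Class | AttributeTargets.Method | AttributeTargets.Property)]
public class VmExcludeAttribute : Attribute
{
    public string? Reason { get; set; }
}

public enum VmProtectionPriority
{
    None = 0,
    Low = 1,
    Medium = 2,
    High = 3
}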

Custom attributes are often easier to work with because they reflect your own protection requirements directly.
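The attribute scanning step mentioned earlier can be sketched with plain reflection. This is a minimal illustration, not HagiCode's build tooling: ProtectionScanner and SampleLicenseClient are hypothetical, and the attribute and enum are re-declared in compact form only so the sketch compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public enum VmProtectionPriority { None = 0, Low = 1, Medium = 2, High = 3 }

// Compact re-declaration of the marker, for a self-contained example.
[AttributeUsage(AttributeTargets.Class | AttributeTargets.Method)]
public sealed class VmProtectAttribute : Attribute
{
    public VmProtectAttribute(VmProtectionPriority priority) => Priority = priority;
    public VmProtectionPriority Priority { get; }
}

[VmProtect(VmProtectionPriority.High)]
public class SampleLicenseClient
{
    [VmProtect(VmProtectionPriority.Medium)]
    public void Refresh() { }

    public void Unmarked() { }
}

public static class ProtectionScanner
{
    // Lists types and methods whose marker is at or above the requested priority.
    public static IReadOnlyList<string> Scan(Assembly assembly, VmProtectionPriority minimum)
    {
        const BindingFlags flags = BindingFlags.Public | BindingFlags.NonPublic |
                                   BindingFlags.Instance | BindingFlags.Static |
                                   BindingFlags.DeclaredOnly;
        var found = new List<string>();
        foreach (var type in assembly.GetTypes())
        {
            var typeMarker = type.GetCustomAttribute<VmProtectAttribute>();
            if (typeMarker is not null && typeMarker.Priority >= minimum)
            {
                found.Add(type.FullName ?? type.Name);
            }
            foreach (var method in type.GetMethods(flags))
            {
                var marker = method.GetCustomAttribute<VmProtectAttribute>();
                if (marker is not null && marker.Priority >= minimum)
                {
                    found.Add($"{type.FullName}.{method.Name}");
                }
            }
        }
        return found;
    }
}
```

A build step could feed the resulting names to the protection tool. How the names map onto the tool's own project format is outside the scope of this sketch.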

# vmp_config.yml
protection:
  priority_mode: "attribute" # Attribute-based priority
  default_level: "medium"
tools:
  - name: "vmprotect"
    path: "C:\\Program Files\\VMProtect Ultimate\\VMProtect.exe"
protection_levels:
  high:
    virtualization: "full"
    mutation: "high"
    anti_debug: true
    anti_dump: true
    encrypt_strings: true
    encrypt_resources: true
  medium:
    virtualization: "partial"
    mutation: "medium"
    anti_debug: true
    encrypt_strings: true
  low:
    virtualization: "none"
    mutation: "low"
    anti_debug: false

A clearer configuration makes the system much easier to maintain later.

1. Protection Practices for Critical Components


According to HagiCode’s code-protection specification, the following components must use HIGH-priority protection:

// Production constants - must be encrypted and protected by VMP
[VmProtect(VmProtectionPriority.High, Reason = "Production constants")]
public static class ProductionConstants
{
    // Encrypted string accessor, protected by VMP
    [VmProtect(VmProtectionPriority.High)]
    public static string GetLicenseServerUrl(IOptions<LicenseOptions> options) => ...;
}

// License validation logic
[VmProtect(VmProtectionPriority.High, Reason = "License verification logic")]
public class KeygenClient : IKeygenClient
{
    [Obfuscation(Feature = "ultra", Exclude = false)]
    public async Task<LicenseValidationResult?> ValidateLicenseAsync(...) { ... }
}

// Machine fingerprint service
[VmProtect(VmProtectionPriority.High)]
public class MachineFingerprintService : IMachineFingerprintService { ... }

Critical code deserves stronger protection, because exposing core logic would cause real problems.

2. String Encryption and Runtime Decryption


Encrypt strings at build time and decrypt them at runtime:

// Requires: using System.IO; using System.Security.Cryptography;
public static class StringDecryption
{
    [VmProtect(VmProtectionPriority.High, Reason = "CRITICAL SECURITY")]
    public static string DecryptString(byte[] encryptedData, byte[] key, byte[] iv)
    {
        using var aes = Aes.Create();
        aes.Key = key;
        aes.IV = iv;
        using var decryptor = aes.CreateDecryptor();
        using var ms = new MemoryStream(encryptedData);
        using var cs = new CryptoStream(ms, decryptor, CryptoStreamMode.Read);
        using var reader = new StreamReader(cs);
        return reader.ReadToEnd();
    }
}

// Production constant accessor (lazy loading + caching)
public static class ProductionConstants
{
    private static string? _cachedLicenseServerUrl;

    public static string GetLicenseServerUrl(IOptions<LicenseOptions> options)
    {
        if (_cachedLicenseServerUrl == null)
        {
#if DEBUG
            // DEBUG builds read the plain value from configuration
            _cachedLicenseServerUrl = options.Value.PrimaryServer.Url;
#else
            // RELEASE builds decrypt the embedded value on first access
            var encrypted = GetEncryptedLicenseServerUrl();
            _cachedLicenseServerUrl = StringDecryption.DecryptString(
                encrypted,
                GetEncryptionKey(),
                GetEncryptionIV());
#endif
        }
        return _cachedLicenseServerUrl;
    }
}

This step matters because sensitive information should never be left in plaintext.
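For completeness, here is a minimal sketch of the build-time half: an EncryptString counterpart that produces bytes the DecryptString method above can read back. StringEncryptionSketch is a hypothetical name, and the sketch relies on the default AES settings (CBC mode with PKCS7 padding); how HagiCode actually generates and stores the key and IV is not covered by the article.

```csharp
using System.IO;
using System.Security.Cryptography;
using System.Text;

public static class StringEncryptionSketch
{
    // Build-time side: AES-encrypts a UTF-8 string with the given key and IV.
    public static byte[] EncryptString(string plainText, byte[] key, byte[] iv)
    {
        using var aes = Aes.Create();
        aes.Key = key;
        aes.IV = iv;
        using var encryptor = aes.CreateEncryptor();
        using var ms = new MemoryStream();
        using (var cs = new CryptoStream(ms, encryptor, CryptoStreamMode.Write))
        using (var writer = new StreamWriter(cs, new UTF8Encoding(false)))
        {
            writer.Write(plainText);
        } // disposing the writer flushes the final padded block into ms
        return ms.ToArray();
    }

    // Runtime side, same logic as the DecryptString method shown in the article.
    public static string DecryptString(byte[] encryptedData, byte[] key, byte[] iv)
    {
        using var aes = Aes.Create();
        aes.Key = key;
        aes.IV = iv;
        using var decryptor = aes.CreateDecryptor();
        using var ms = new MemoryStream(encryptedData);
        using var cs = new CryptoStream(ms, decryptor, CryptoStreamMode.Read);
        using var reader = new StreamReader(cs);
        return reader.ReadToEnd();
    }
}
```

A round trip with a 32-byte key and a 16-byte IV returns the original string, which makes a handy unit test for the build step.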

After the build, you must verify whether protection was applied successfully; otherwise, you cannot be sure it is actually working:

// Example verification helper (requires using System; using System.IO; using System.Text)
public bool VerifyProtection(string assemblyPath)
{
    // 1. Check the VMP signature: look for the marker as one contiguous byte sequence.
    //    Testing bytes one at a time would match almost any file.
    ReadOnlySpan<byte> bytes = File.ReadAllBytes(assemblyPath);
    ReadOnlySpan<byte> vmpSignature = Encoding.ASCII.GetBytes("VMProtect");
    if (bytes.IndexOf(vmpSignature) >= 0)
    {
        return true;
    }

    // 2. Check for file size changes (the protected file is usually larger)
    var originalInfo = new FileInfo(Path.ChangeExtension(assemblyPath, ".bak"));
    if (originalInfo.Exists && originalInfo.Length > 0)
    {
        var sizeRatio = (double)new FileInfo(assemblyPath).Length / originalInfo.Length;
        return sizeRatio > 1.1;
    }
    return false;
}

Verification is always worth doing, because otherwise problems can slip through unnoticed.

There are several pitfalls here that deserve special attention:

  1. Do not obfuscate all code: Public APIs, interface definitions, and DTO classes usually do not need protection. Excessive obfuscation can hurt performance and debugging. The HagiCode project learned this the hard way.

  2. Protect key accessors: Methods that retrieve encryption keys must receive the same or a higher protection level than the encrypted data itself; otherwise the whole setup loses its value.

  3. Balance testing and production: DEBUG builds should skip encryption to make development and debugging easier, while RELEASE builds should enable full protection. Remember to separate them with conditional compilation such as #if DEBUG.

  4. Consider the Docker environment: Running VMP on Linux requires a containerized approach to ensure tool compatibility. HagiCode uses a Wine + VMP container solution to solve the cross-platform problem.

  5. Verification is mandatory: After the build finishes, you must verify that protection was applied successfully. Otherwise sensitive code may still be exposed, and the verification code shown earlier exists for exactly this purpose.

With this multi-layer protection strategy, HagiCode built a comprehensive code security system that spans from baseline obfuscation to virtual machine protection:

  • Layer 1: Use ObfuscationAttribute for baseline marking and provide hints to third-party tools
  • Layer 2: Use custom VmProtectAttribute declarations to express protection intent and priority
  • Layer 3: Use VMP virtual machine protection to transform critical code into virtual machine instructions that are very hard to reverse
  • Layer 4: Automatically scan and apply protection during the build, then verify the result

This approach can resist ordinary decompilation tools while also standing up better against advanced reverse engineering attacks. If you are building a .NET application that needs code protection, I hope this gives you at least a useful reference point.



Building an AI Adventure Party: A Practical Guide to Multi-Agent Collaboration Configuration in HagiCode


In modern software development, a single AI Agent is no longer enough for complex needs. How can multiple AI assistants from different companies collaborate within the same project? This article shares the multi-Agent collaboration configuration approach that the HagiCode project developed through real-world practice.

Many developers have likely had this experience: bringing an AI assistant into a project really does improve coding efficiency. But as requirements grow more complex, one AI Agent starts to fall short. You want it to handle code review, documentation generation, unit tests, and more at the same time, but the result is often that it cannot balance everything well, and output quality becomes inconsistent.

What is even more frustrating is that once you try to introduce multiple AI assistants, things get more complicated. Each Agent has its own configuration method, API interface, and execution logic, and they may even conflict with one another. It is like a sports team where every player is individually strong, but nobody knows how to coordinate, so the whole match turns into chaos.

The HagiCode project ran into the same problem during development. It is a complex project spanning a frontend VSCode extension, backend AI services, and a cross-platform desktop client, and as of the 2026-03 version we needed to integrate multiple AI assistants from different companies at once: Claude Code, Codex, CodeBuddy, iFlow, and more. Figuring out how to let them coexist harmoniously in the same project while making the best use of their individual strengths became a critical problem we had to solve.

That alone would already be enough trouble. After all, who wants to deal with a group of AI tools fighting each other every day?

The approach shared in this article is the multi-Agent collaboration configuration practice we developed in the HagiCode project through real trial and error and repeated optimization. If you are also struggling with multiple AI assistants working together, this article may give you some ideas. Maybe. Every project is different, after all.

HagiCode is an AI coding assistant project that adopts an “adventure party” model in which multiple AI engines work together. Project repository: github.com/HagiCode-org/site.

The multi-Agent configuration approach shared here is one of the core techniques that allows HagiCode to maintain efficient development in complex projects. There is nothing especially mystical about it - it just turns a group of AIs into an adventure party that can actually coordinate.

HagiCode’s Multi-Agent Architecture Design


From “Going Solo” to “Team Collaboration”


In the early days of the HagiCode project, we also tried using a single AI Agent to handle everything. We quickly discovered a clear bottleneck in that approach: different tasks demand different strengths. Some tasks require stronger contextual understanding, while others need more precise code editing. One Agent has a hard time excelling at all of them.

That made us realize that multiple Agents had to work together. But the problem was this: how do you let AI products from different companies coexist peacefully in the same project? We needed to solve several core issues:

  1. Configuration management complexity: each Agent has different configuration methods, API interfaces, and execution modes
  2. Unified communication protocol: we need a standardized way for different Agents to exchange data
  3. Task coordination and division of labor: how do we assign work reasonably so each Agent can play to its strengths

With those questions in mind, we started designing HagiCode’s multi-Agent architecture. It was not really that complicated in the end; we just had to think it through clearly.

After multiple iterations, this is the architecture we settled on:

┌──────────────────────────────────────────────────────────────────┐
│                        AIProviderFactory                         │
│   (Factory pattern for unified management of all AI Providers)   │
├─────────────────┬──────────────┬─────────────────┬───────────────┤
│  ClaudeCodeCli  │   CodexCli   │  CodebuddyCli   │   IFlowCli    │
│  (Anthropic)    │   (OpenAI)   │  (Zhipu GLM)    │   (Zhipu)     │
└─────────────────┴──────────────┴─────────────────┴───────────────┘

The core idea is to let different AI Agents be managed by the same code through a unified Provider interface. At the same time, the factory pattern is used to dynamically create and configure these Providers, ensuring scalability and flexibility across the system.

It is like division of labor in daily life. Everyone has a role; here we simply turned that idea into code architecture.

Agent Types and Division of Responsibilities


Based on HagiCode’s real-world experience, we assigned different responsibilities to each Agent:

| Agent | Provider | Model | Primary Use |
| --- | --- | --- | --- |
| ClaudeCodeCli | Anthropic | glm-5-turbo | Generate technical solutions and Proposals |
| CodexCli | OpenAI/Zed | gpt-5.4 | Execute precise code changes |
| CodebuddyCli | Zhipu | glm-4.7 | Refine proposal descriptions and documentation |
| IFlowCli | Zhipu | glm-4.7 | Archive proposals and historical records (configuration at the time; now legacy-compatible only) |
| OpenCodeCli | - | - | General-purpose code editing |
| GitHubCopilot | Microsoft | - | Assisted programming and code completion |

The logic behind this division of labor is simple: every Agent has its own area of strength. Claude Code performs well at understanding and analyzing complex requirements, so it handles early solution design. Codex is more precise when modifying code, so it is better suited for concrete implementation work. CodeBuddy offers strong cost performance, which makes it a great fit for refining documentation.

After all, the right tool for the right job is usually the best choice. There are many roads to Rome; some are simply easier to walk than others.

To manage different AI Agents in a unified way, we first need to define a common interface. In HagiCode, that interface looks like this:

public interface IAIProvider
{
    // Unified Provider interface
    Task<IAIProvider?> GetProviderAsync(AIProviderType providerType);
    Task<IAIProvider?> GetProviderAsync(string providerName, CancellationToken cancellationToken);
}

The interface looks simple, but it is the foundation of the entire multi-Agent system. With a unified interface, we can call AI products from different companies in exactly the same way, no matter what is underneath.

This is really just a matter of making complex things simple. Simple is beautiful, after all.

Once the interface is unified, the next question is how to create these Provider instances. HagiCode uses the factory pattern:

private IAIProvider? CreateProvider(AIProviderType providerType, ProviderConfiguration config)
{
    return providerType switch
    {
        AIProviderType.ClaudeCodeCli =>
            ActivatorUtilities.CreateInstance<ClaudeCodeCliProvider>(_serviceProvider, Options.Create(config)),
        AIProviderType.CodebuddyCli =>
            ActivatorUtilities.CreateInstance<CodebuddyCliProvider>(_serviceProvider, Options.Create(config)),
        AIProviderType.CodexCli =>
            ActivatorUtilities.CreateInstance<CodexCliProvider>(_serviceProvider, Options.Create(config)),
        AIProviderType.IFlowCli =>
            ActivatorUtilities.CreateInstance<IFlowCliProvider>(_serviceProvider, Options.Create(config)),
        _ => null // Unknown provider types are not created
    };
}

This uses dependency injection through ActivatorUtilities.CreateInstance, which can dynamically create Provider instances at runtime while automatically injecting dependencies. The benefit of this design is that when a new Agent type is added, you only need to add the corresponding Provider class and one more case branch in the factory method. The existing Providers and the code that calls them stay untouched.

That is reason enough. Who wants to rewrite a pile of old code every time a new feature is added?

To make configuration more flexible, we also implemented a type-mapping mechanism:

public static class AIProviderTypeExtensions
{
    // Case-insensitive lookup from configuration names to enum values
    private static readonly Dictionary<string, AIProviderType> _typeMap = new(
        StringComparer.OrdinalIgnoreCase)
    {
        ["ClaudeCodeCli"] = AIProviderType.ClaudeCodeCli,
        ["CodebuddyCli"] = AIProviderType.CodebuddyCli,
        ["CodexCli"] = AIProviderType.CodexCli,
        ["IFlowCli"] = AIProviderType.IFlowCli,
        // ...more type mappings
    };
}

The purpose of this mapping table is to convert string-form Provider names into enum types. This allows configuration files to use intuitive string names, while the internal code uses type-safe enums for processing.
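A lookup on top of such a table can be sketched as follows. ProviderTypeParser and TryParse are hypothetical names (the article does not show HagiCode's lookup method), and the enum is re-declared with only the four members listed above so the example stands alone.

```csharp
using System;
using System.Collections.Generic;

public enum AIProviderType { ClaudeCodeCli, CodebuddyCli, CodexCli, IFlowCli }

public static class ProviderTypeParser
{
    // Same idea as the mapping table above: case-insensitive name to enum value.
    private static readonly Dictionary<string, AIProviderType> TypeMap =
        new(StringComparer.OrdinalIgnoreCase)
        {
            ["ClaudeCodeCli"] = AIProviderType.ClaudeCodeCli,
            ["CodebuddyCli"] = AIProviderType.CodebuddyCli,
            ["CodexCli"] = AIProviderType.CodexCli,
            ["IFlowCli"] = AIProviderType.IFlowCli
        };

    // Returns false for null, blank, or unknown names instead of throwing,
    // so a typo in a configuration file can be reported cleanly by the caller.
    public static bool TryParse(string? name, out AIProviderType type)
    {
        type = default;
        return !string.IsNullOrWhiteSpace(name) && TypeMap.TryGetValue(name.Trim(), out type);
    }
}
```

Because the comparer ignores case, "codexcli" and "CodexCli" in a configuration file resolve to the same enum value.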

Configuration should be as intuitive as possible. Nobody wants to memorize a pile of obscure code names.

In practice, everything can be configured in appsettings.json. The hierarchy is shown here in YAML notation:

AI:
  Providers:
    Providers:
      ClaudeCodeCli:
        Enabled: true
        Model: glm-5-turbo
        WorkingDirectory: /path/to/project
      CodebuddyCli:
        Enabled: true
        Model: glm-4.7
      CodexCli:
        Enabled: true
        Model: gpt-5.4
      IFlowCli:
        Enabled: true
        Model: glm-4.7

Each Provider can independently configure parameters such as enablement, model version, and working directory. This design preserves flexibility while remaining easy to manage and maintain.

In some ways, configuration files are like life’s options: you can choose to enable or disable certain things. The only difference is that code choices are easier to regret later.

With the unified technical architecture in place, the next step is making multiple Agents work together. HagiCode designed a task flow mechanism so different Agents can handle different stages of the work:

Proposal creation (user)
[Claude Code] ──generate proposal──▶ Proposal document
      │                                     │
      │                                     ▼
      │               [Codebuddy] ──refine description──▶ Refined proposal
      │                                     │
      │                                     ▼
      │               [Codex] ──execute changes──▶ Code changes
      │                                     │
      │                                     ▼
      └──────────────────────▶ [iFlow] ──archive──▶ Historical records

The benefit of this division of labor is that each Agent only needs to focus on the tasks it does best, rather than trying to do everything. Claude Code generates proposals from scratch. Codebuddy makes proposal descriptions clearer. Codex turns proposals into actual code changes. iFlow archives and preserves those changes.

This is really just teamwork, the same as in daily life. Everyone has a role, and only together can something big get done. Here, the team members just happen to be AIs.

In actual operation, we summarized the following lessons:

1. Agent selection strategy matters

Tasks should not be assigned casually; they should be matched to each Agent’s strengths:

  • Proposal generation: use Claude Code, because it has stronger contextual understanding
  • Code execution: use Codex, because it is more precise for code modification
  • Proposal refinement: use Codebuddy, because it offers strong cost performance
  • Archival storage: use iFlow, because it is stable and reliable

After all, putting the right person on the right task is a timeless principle.

2. Configuration isolation ensures stability

Each Agent’s configuration is managed independently, supports environment-variable overrides, and uses separate working directories. As a result, a configuration error in one Agent does not affect the others.

This is like personal boundaries in life. Everyone needs their own space; non-interference makes coexistence possible.

3. Error-handling mechanism

A failure in a single Agent should not affect the overall workflow. We implemented a fallback strategy: when one Agent fails, the system can automatically switch to a backup plan or skip that step and continue with later tasks. At the same time, complete logging makes troubleshooting easier afterward.
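The fallback strategy described above can be sketched as a small pipeline runner. This is an illustration under assumptions, not HagiCode's implementation: AgentStep and AgentPipeline are hypothetical names, and the in-memory list stands in for whatever logging the real system uses.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical step: a primary action plus an optional backup action.
public sealed record AgentStep(string Name, Func<Task> Primary, Func<Task>? Backup = null);

public static class AgentPipeline
{
    // Runs steps in order. A failing step tries its backup; if that is missing
    // or also fails, the step is skipped and the pipeline continues.
    public static async Task<List<string>> RunAsync(IEnumerable<AgentStep> steps)
    {
        var log = new List<string>();
        foreach (var step in steps)
        {
            try
            {
                await step.Primary();
                log.Add($"{step.Name}: ok");
            }
            catch (Exception primaryError)
            {
                log.Add($"{step.Name}: primary failed ({primaryError.Message})");
                if (step.Backup is null)
                {
                    log.Add($"{step.Name}: skipped");
                    continue;
                }
                try
                {
                    await step.Backup();
                    log.Add($"{step.Name}: backup ok");
                }
                catch (Exception backupError)
                {
                    log.Add($"{step.Name}: skipped ({backupError.Message})");
                }
            }
        }
        return log;
    }
}
```

Every outcome is recorded, including skipped steps, so a run that degraded quietly can still be diagnosed afterward.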

Nobody can guarantee that errors will never happen. The key is how you handle them. Life works much the same way.

4. Monitoring and observability

Through the ACP protocol (our custom communication protocol based on JSON-RPC 2.0), we can track the execution status of each Agent. Session isolation ensures concurrency safety, while dynamic caching improves performance.

The things you cannot see are often the ones most likely to go wrong. Some visibility is always better than flying blind.

After adopting this multi-Agent collaboration configuration, the HagiCode project’s development efficiency improved significantly. Specifically:

  1. Task-handling capacity doubled: in the past, one Agent had to handle many kinds of tasks at once; now tasks can be processed in parallel, and throughput has increased dramatically
  2. More stable output quality: each Agent focuses only on what it does best, so consistency and quality both improve
  3. Lower maintenance cost: unified interfaces and configuration management make the whole system easier to maintain and extend
  4. Adding new Agents is simple: to integrate a new AI product, you only need to implement the interface and add configuration, without changing the core logic

This approach not only solved HagiCode’s own problems, but also proved that multi-Agent collaboration is a viable architectural choice.

The gains were quite noticeable. The process was just a bit of a hassle.

This article shared the HagiCode project’s practical experience with multi-Agent collaboration configuration. The main takeaways include:

  1. Standardized interfaces: IAIProvider unifies the behavior of different Agents, allowing the code to ignore which company’s product is underneath
  2. Factory pattern: ActivatorUtilities.CreateInstance dynamically creates Provider instances, supporting runtime configuration and dependency injection
  3. Protocol unification: the ACP protocol provides standardized communication between Agents through a bidirectional mechanism based on JSON-RPC 2.0
  4. Task routing: assign work reasonably across different Agents so each can play to its strengths, instead of expecting one Agent to do everything

This design not only solves the problem of “multiple Agents fighting each other,” but also uses the adventure party task flow mechanism to make the development process more automated and specialized.

If you are also considering introducing multiple AI assistants, I hope this article gives you some useful reference points. Of course, every project is different, and the specific approach still needs to be adjusted to the actual situation. There is no one-size-fits-all solution; the best solution is the one that fits you.

Beautiful things or people do not need to be possessed. As long as they remain beautiful, simply appreciating that beauty is enough. Technical solutions are the same: the one that suits you is the best one…

Building an AI Adventure Party: HagiCode Multi-Agent Collaboration Configuration in Practice


In modern software development, a single AI Agent is no longer enough to meet complex requirements. How can multiple AI assistants from different companies collaborate within the same project? This article shares the multi-Agent collaboration configuration approach that the HagiCode project developed through real-world practice.

Many developers have probably had this experience: after introducing an AI assistant into a project, productivity really does improve. But as requirements become more and more complex, one AI Agent starts to feel insufficient. You want it to handle code review, documentation generation, unit testing, and other tasks at the same time, but the result is often that it cannot keep everything balanced, and the output quality becomes inconsistent.

What is even more frustrating is that once you try to bring in multiple AI assistants, the problem becomes more complicated. Each Agent has its own configuration method, API interface, and execution logic, and they may even conflict with one another. It is like a sports team in which every player is talented, but nobody knows how to work together, so the match turns into a mess.

The HagiCode project ran into the same challenge during development. Because HagiCode is a complex project involving a frontend VSCode extension, backend AI services, and a cross-platform desktop client, we needed to connect multiple AI assistants from different companies at the same time: Claude Code, Codex, CodeBuddy, iFlow, and more. Letting them coexist in the same project while making the most of each one's strengths became a key problem we had to solve.

That alone would already be enough trouble. After all, who wants to deal with a bunch of fighting AIs every day?

The approach shared in this article is the multi-Agent collaboration configuration practice that we developed in the HagiCode project through real trial and error and repeated optimization. If you are also struggling with multiple AI assistants working together, this article may give you some inspiration. Maybe. Every project is different, after all.

HagiCode is an AI coding assistant project that adopts an “adventure party” model in which multiple AI engines work together. Project repository: github.com/HagiCode-org/site.

The multi-Agent configuration approach shared in this article is one of the core technologies that allows HagiCode to maintain efficient development in complex projects. There is nothing especially magical about it; it simply turns a group of AIs into an adventure party that can actually coordinate.

HagiCode’s Multi-Agent Architecture Design


From “Going Solo” to “Team Collaboration”


In the early days of the HagiCode project, we also tried using a single AI Agent to handle every task. We soon discovered a clear bottleneck in that approach: different tasks require different strengths. Some tasks need stronger contextual understanding, while others need more precise code modification capabilities. One Agent has a hard time excelling at everything.

That made us realize that multiple Agents had to work together. But the problem was this: how do you let AI products from different companies coexist peacefully in the same project? We needed to solve several core issues:

  1. Configuration management complexity: each Agent has different configuration methods, API interfaces, and execution modes
  2. Unified communication protocol: we need a standardized way for different Agents to exchange data
  3. Task coordination and division of labor: how do we assign work reasonably so each Agent can play to its strengths

With those questions in mind, we started designing HagiCode’s multi-Agent architecture. It was not actually that complicated; we just had to think it through clearly.

After multiple iterations, this is the architecture we settled on:

┌─────────────────────────────────────────────────────────────────┐
│ AIProviderFactory │
│ (Factory pattern for unified management of all AI Providers) │
├─────────────────────────────────────────────────────────────────┤
│ ClaudeCodeCli │ CodexCli │ CodebuddyCli │ IFlowCli │
│ (Anthropic) │ (OpenAI) │ (Zhipu GLM) │ (Zhipu) │
└─────────────────────────────────────────────────────────────────┘

The core idea is to let different AI Agents be managed by the same set of code through a unified Provider interface. At the same time, the factory pattern is used to dynamically create and configure these Providers, ensuring scalability and flexibility across the system.

It is like division of labor in everyday life. Everyone has their own role; here we simply turned that idea into code architecture.

Agent Types and Division of Responsibilities


Based on HagiCode’s real-world experience, we assigned different responsibilities to each Agent:

Agent           Provider     Model         Primary Use
ClaudeCodeCli   Anthropic    glm-5-turbo   Generate technical solutions and Proposals
CodexCli        OpenAI/Zed   gpt-5.4       Execute precise code changes
CodebuddyCli    Zhipu        glm-4.7       Refine proposal descriptions and documentation
IFlowCli        Zhipu        glm-4.7       Archive proposals and historical records
OpenCodeCli     -            -             General-purpose code editing
GitHubCopilot   Microsoft    -             Assisted programming and code completion

The logic behind this division of labor is simple: every Agent has its own area of strength. Claude Code performs well at understanding and analyzing complex requirements, so it handles early solution design. Codex is more precise when modifying code, so it is better suited for concrete implementation work. CodeBuddy is cost-effective, which makes it a good fit for refining proposal text and documentation.

After all, the right tool for the right job is the best choice. There are many roads to Rome; some are simply easier to walk than others.

To manage different AI Agents in a unified way, we first need to define a common interface. In HagiCode, that interface looks like this:

public interface IAIProvider
{
    // Unified Provider interface
    Task<IAIProvider?> GetProviderAsync(AIProviderType providerType);
    Task<IAIProvider?> GetProviderAsync(string providerName, CancellationToken cancellationToken);
}

The interface looks simple, but it is the foundation of the entire multi-Agent system. With a unified interface, we can call AI products from different companies in the same way regardless of which company is behind them.

This is really just about making complex things simple. Simple is beautiful, after all.

Once the interface is unified, the next question is how to create these Provider instances. HagiCode uses the factory pattern:

private IAIProvider? CreateProvider(AIProviderType providerType, ProviderConfiguration config)
{
    return providerType switch
    {
        AIProviderType.ClaudeCodeCli =>
            ActivatorUtilities.CreateInstance<ClaudeCodeCliProvider>(_serviceProvider, Options.Create(config)),
        AIProviderType.CodebuddyCli =>
            ActivatorUtilities.CreateInstance<CodebuddyCliProvider>(_serviceProvider, Options.Create(config)),
        AIProviderType.CodexCli =>
            ActivatorUtilities.CreateInstance<CodexCliProvider>(_serviceProvider, Options.Create(config)),
        AIProviderType.IFlowCli =>
            ActivatorUtilities.CreateInstance<IFlowCliProvider>(_serviceProvider, Options.Create(config)),
        _ => null
    };
}

This uses dependency injection through ActivatorUtilities.CreateInstance, which creates Provider instances at runtime and resolves their remaining constructor dependencies from the service provider. The benefit of this design is that adding a new Agent type only requires a new Provider class and one more case branch in the factory method; the existing Providers and the code that calls them stay untouched.

That is reason enough. Who wants to rewrite a pile of old code every time a new feature is added?

To make configuration more flexible, we also implemented a type-mapping mechanism:

public static class AIProviderTypeExtensions
{
    // Case-insensitive lookup from configuration names to enum values
    private static readonly Dictionary<string, AIProviderType> _typeMap = new(
        StringComparer.OrdinalIgnoreCase)
    {
        ["ClaudeCodeCli"] = AIProviderType.ClaudeCodeCli,
        ["CodebuddyCli"] = AIProviderType.CodebuddyCli,
        ["CodexCli"] = AIProviderType.CodexCli,
        ["IFlowCli"] = AIProviderType.IFlowCli,
        // ...more type mappings
    };
}

The purpose of this mapping table is to convert string-form Provider names into enum types. This allows configuration files to use intuitive string names, while the internal code uses type-safe enums for processing.

Configuration should be as intuitive as possible. Nobody wants to memorize a pile of complicated code names.

In practice, everything can be configured in appsettings.json:

AI:
  Providers:
    Providers:
      ClaudeCodeCli:
        Enabled: true
        Model: glm-5-turbo
        WorkingDirectory: /path/to/project
      CodebuddyCli:
        Enabled: true
        Model: glm-4.7
      CodexCli:
        Enabled: true
        Model: gpt-5.4
      IFlowCli:
        Enabled: true
        Model: glm-4.7

Each Provider can independently configure parameters such as enablement, model version, and working directory. This design preserves flexibility while remaining easy to manage and maintain.

Configuration files are a bit like life’s options: you can choose to enable or disable certain things. The only difference is that code choices are easier to regret later.

With the unified technical architecture in place, the next step is making multiple Agents work together. HagiCode designed a task flow mechanism so different Agents can handle different stages of the work:

Proposal creation (user)
[Claude Code] ──generate proposal──▶ Proposal document
│ │
│ ▼
│ [Codebuddy] ──refine description──▶ Refined proposal
│ │
│ ▼
│ [Codex] ──execute changes──▶ Code changes
│ │
│ ▼
└──────────────────────▶ [iFlow] ──archive──▶ Historical records

The benefit of this division of labor is that each Agent only needs to focus on the tasks it does best, rather than trying to do everything. Claude Code is responsible for generating proposals from scratch. Codebuddy makes proposal descriptions clearer. Codex turns proposals into actual code changes. iFlow archives and preserves those changes.

This is really just teamwork, much like in everyday life. Everyone has their own role, and only together can something big get done. The only difference is that the team members here happen to be AIs.
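The task flow above can be sketched as a small pipeline in which each stage hands its output to the next. This is a minimal TypeScript illustration, not HagiCode's actual implementation: the `Stage` names, `runProposalFlow`, and the injected `AgentRunner` callback are all assumptions made for the example, and only the Agent names come from the article.

```typescript
// Hypothetical stage names mirroring the flow: generate -> refine -> execute -> archive.
type Stage = 'generate' | 'refine' | 'execute' | 'archive';

const stageOrder: Stage[] = ['generate', 'refine', 'execute', 'archive'];

// Which Agent handles which stage, following the division of labor in the article.
const stageAgents: Record<Stage, string> = {
  generate: 'ClaudeCodeCli',
  refine: 'CodebuddyCli',
  execute: 'CodexCli',
  archive: 'IFlowCli',
};

// The caller supplies how an Agent is actually invoked (assumed to be synchronous here).
type AgentRunner = (agent: string, stage: Stage, input: string) => string;
type StageResult = { stage: Stage; agent: string; output: string };

function runProposalFlow(request: string, runAgent: AgentRunner): StageResult[] {
  const results: StageResult[] = [];
  let input = request;
  for (const stage of stageOrder) {
    const agent = stageAgents[stage];
    const output = runAgent(agent, stage, input);
    results.push({ stage, agent, output });
    input = output; // the next stage consumes what this stage produced
  }
  return results;
}
```

Keeping the stage-to-Agent mapping in one table means reassigning a stage to a different Agent is a one-line change.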

In actual operation, we summarized the following lessons:

1. Agent selection strategy matters

Tasks should not be assigned casually; they should be matched to each Agent’s strengths:

  • Proposal generation: use Claude Code, because it has stronger contextual understanding
  • Code execution: use Codex, because it is more precise for code modification
  • Proposal refinement: use Codebuddy, because it is cost-effective
  • Archival storage: use iFlow, because it is stable and reliable

After all, putting the right person on the right task is a timeless principle.

2. Configuration isolation ensures stability

Each Agent’s configuration is managed independently, supports environment-variable overrides, and uses separate working directories. As a result, a configuration error in one Agent does not affect the others.

This is like personal boundaries in life. Everyone needs their own space; non-interference makes harmonious coexistence possible.
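One way to picture per-Agent configuration with environment-variable overrides is the sketch below. It is only an illustration in TypeScript: the `HAGICODE_<PROVIDER>_<FIELD>` naming scheme and the `resolveProviderConfig` helper are hypothetical and are not taken from HagiCode, whose backend may use a different override convention.

```typescript
type ProviderConfig = {
  enabled: boolean;
  model: string;
  workingDirectory: string;
};

// Hypothetical naming scheme: HAGICODE_<PROVIDER>_<FIELD>, e.g. HAGICODE_CODEXCLI_MODEL.
// The environment is passed in explicitly so the function stays pure and easy to test.
function resolveProviderConfig(
  providerName: string,
  base: ProviderConfig,
  env: Record<string, string | undefined>,
): ProviderConfig {
  const prefix = `HAGICODE_${providerName.toUpperCase()}_`;
  const enabledRaw = env[`${prefix}ENABLED`];
  // A new object is returned, so one Agent's overrides never leak into another's config.
  return {
    enabled: enabledRaw === undefined ? base.enabled : enabledRaw.toLowerCase() === 'true',
    model: env[`${prefix}MODEL`] ?? base.model,
    workingDirectory: env[`${prefix}WORKINGDIRECTORY`] ?? base.workingDirectory,
  };
}
```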

3. Error-handling mechanism

A failure in a single Agent should not affect the overall workflow. We implemented a fallback strategy: when one Agent fails, the system can automatically switch to a backup plan or skip that step and continue with later tasks. At the same time, complete logging makes troubleshooting easier afterward.

Nobody can guarantee that errors will never happen. The key is how you handle them. Life works much the same way.
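The fallback strategy described above (try the Agent, switch to a backup plan, or skip the step and keep going) could look roughly like this. It is a sketch under assumptions, written in TypeScript for illustration: `AgentStep`, `StepOutcome`, and `runWithFallback` are hypothetical names, not HagiCode's real types.

```typescript
// Hypothetical step shape: a primary action plus an optional backup plan.
type AgentStep<T> = {
  name: string;
  run: () => Promise<T>;
  fallback?: () => Promise<T>;
};

type StepOutcome<T> =
  | { name: string; status: 'ok' | 'fallback'; value: T }
  | { name: string; status: 'skipped'; error: string };

// Runs steps in order. A failing step is logged and either replaced by its
// fallback or skipped, so it never aborts the rest of the workflow.
async function runWithFallback<T>(
  steps: AgentStep<T>[],
  log: (message: string) => void = console.log,
): Promise<StepOutcome<T>[]> {
  const outcomes: StepOutcome<T>[] = [];
  for (const step of steps) {
    try {
      outcomes.push({ name: step.name, status: 'ok', value: await step.run() });
    } catch (primaryError) {
      log(`${step.name} failed: ${String(primaryError)}`);
      if (step.fallback) {
        try {
          outcomes.push({ name: step.name, status: 'fallback', value: await step.fallback() });
          continue;
        } catch (fallbackError) {
          log(`${step.name} fallback failed: ${String(fallbackError)}`);
        }
      }
      outcomes.push({ name: step.name, status: 'skipped', error: String(primaryError) });
    }
  }
  return outcomes;
}
```

Returning an outcome per step, rather than throwing, keeps a record of what was skipped so it can be reviewed later alongside the logs.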

4. Monitoring and observability

Through the ACP protocol (our custom communication protocol based on JSON-RPC 2.0), we can track the execution status of each Agent. Session isolation ensures concurrency safety, while dynamic caching improves performance.

The things you cannot see are often the ones most likely to go wrong. Some visibility is always better than flying blind.
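Since ACP is described as being built on JSON-RPC 2.0, the message envelopes and per-session tracking can be illustrated as follows. The envelope shapes follow the public JSON-RPC 2.0 specification; everything else (the `SessionTracker` class and method names such as `session/prompt`) is hypothetical and does not describe the real ACP message set.

```typescript
// JSON-RPC 2.0 envelopes: requests carry an id, notifications do not.
type JsonRpcId = string | number;
type JsonRpcRequest = { jsonrpc: '2.0'; id: JsonRpcId; method: string; params?: unknown };
type JsonRpcNotification = { jsonrpc: '2.0'; method: string; params?: unknown };
type JsonRpcResponse =
  | { jsonrpc: '2.0'; id: JsonRpcId; result: unknown }
  | { jsonrpc: '2.0'; id: JsonRpcId | null; error: { code: number; message: string; data?: unknown } };

let nextRequestId = 1;

function makeRequest(method: string, params?: unknown): JsonRpcRequest {
  return { jsonrpc: '2.0', id: nextRequestId++, method, ...(params === undefined ? {} : { params }) };
}

function makeNotification(method: string, params?: unknown): JsonRpcNotification {
  return { jsonrpc: '2.0', method, ...(params === undefined ? {} : { params }) };
}

// Hypothetical helper: remembers which session issued each in-flight request,
// so a response can be routed back without sessions seeing each other's traffic.
class SessionTracker {
  private pending = new Map<JsonRpcId, string>();

  track(sessionId: string, request: JsonRpcRequest): void {
    this.pending.set(request.id, sessionId);
  }

  resolve(response: JsonRpcResponse): string | undefined {
    if (response.id === null) return undefined;
    const sessionId = this.pending.get(response.id);
    this.pending.delete(response.id);
    return sessionId;
  }

  get inFlight(): number {
    return this.pending.size;
  }
}
```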

After adopting this multi-Agent collaboration configuration, the HagiCode project’s development efficiency improved significantly. Specifically:

  1. Task-handling capacity doubled: in the past, one Agent had to handle many kinds of tasks at once; now tasks can be processed in parallel, and throughput has increased dramatically
  2. More stable output quality: each Agent focuses only on what it does best, so consistency and quality both improve
  3. Lower maintenance cost: unified interfaces and configuration management make the whole system easier to maintain and extend
  4. Adding new Agents is simple: to integrate a new AI product, you only need to implement the interface and add configuration, without changing the core logic

This approach not only solved HagiCode’s own problems, but also proved that multi-Agent collaboration is a viable architectural choice.

The gains were quite noticeable. The process was just a bit of a hassle.

This article shared the HagiCode project’s practical experience with multi-Agent collaboration configuration. The main takeaways include:

  1. Standardized interfaces: IAIProvider unifies the behavior of different Agents, allowing the code to ignore which company’s product is underneath
  2. Factory pattern: ActivatorUtilities.CreateInstance dynamically creates Provider instances, supporting runtime configuration and dependency injection
  3. Protocol unification: the ACP protocol provides standardized communication between Agents through a bidirectional mechanism based on JSON-RPC 2.0
  4. Task routing: assign work reasonably across different Agents so each can play to its strengths, instead of expecting one Agent to do everything

This design not only solves the problem of “multiple Agents fighting each other,” but also uses the adventure party task flow mechanism to make the development process more automated and specialized.

If you are also considering introducing multiple AI assistants, I hope this article gives you some useful reference points. Of course, every project is different, and the specific approach still needs to be adjusted to the actual situation. There is no one-size-fits-all solution; the best solution is the one that fits you.

Beautiful things or people do not need to be possessed. As long as they remain beautiful, simply appreciating that beauty is enough. Technical solutions are the same: the one that suits you is the best one…


If this article was helpful to you, feel free to give the project a Star on GitHub. Your support is what keeps us sharing more. The public beta has already started, and you are welcome to install it and give it a try.


Thank you for reading. If you found this article useful, please click the like button below so more people can discover it.

This content was created with AI-assisted collaboration, reviewed by me, and reflects my own views and position.

How Gamification Design Makes AI Coding More Fun


Traditional AI coding tools are actually quite powerful; they just lack a bit of warmth. When we were building HagiCode, we thought: if we are going to write code anyway, why not turn it into a game?

Anyone who has used an AI coding assistant has probably had this experience: at first it feels fresh and exciting, but after a while it starts to feel like something is missing. The tool itself is powerful, capable of code generation, autocomplete, and bug fixes, but it does not feel very warm, and over time it can become monotonous and dull.

That alone is enough to make you wonder who wants to stare at a cold, impersonal tool every day.

It is a bit like playing a game. If all you do is finish a task list, with no character growth, no achievement unlocks, and no team coordination, it quickly stops being fun. Beautiful things and people do not need to be possessed to be appreciated; their beauty is enough on its own. Programming tools do not even offer that kind of beauty, so it is easy to lose heart.

We ran into exactly this problem while developing HagiCode. As a multi-AI assistant collaboration platform, HagiCode needs to keep users engaged over the long term. But in reality, even a great tool is hard to stick with if it lacks any emotional connection.

To solve this pain point, we made a bold decision: turn programming into a game. Not the superficial kind with a simple points leaderboard, but a true role-playing gamified experience. The impact of that decision may be even bigger than you imagine.

After all, people need a bit of ritual in their lives.

The ideas shared in this article come from our practical experience on the HagiCode project. HagiCode is a multi-AI assistant collaboration platform that supports Claude Code, Codex, Copilot, OpenCode, and other AI assistants working together. If you are interested in multi-AI collaboration or gamified programming, visit github.com/HagiCode-org/site to learn more.

There is nothing especially mysterious about it. We simply turned programming into an adventure.

The essence of gamification is not just “adding a leaderboard.” It is about building a complete incentive system so users can feel growth, achievement, and social recognition while doing tasks.

HagiCode’s gamification design revolves around one core idea: every AI assistant is a “Hero,” and the user is the captain of this Hero team. You lead these Heroes to conquer various “Dungeons” (programming tasks). Along the way, Heroes gain experience, level up, unlock abilities, and your team earns achievements as well.

This is not a gimmick. It is a design grounded in human behavioral psychology. When tasks are given meaning and progress feedback, people’s engagement and persistence increase significantly.

As the old saying goes, “This feeling can become a memory, though at the time it left us bewildered.” We bring that emotional experience into the tool, so programming is no longer just typing code, but a journey worth remembering.

Hero is the core concept in HagiCode’s gamification system. Each Hero represents one AI assistant. For example, Claude Code is a Hero, and Codex is also a Hero.

A Hero has three equipment slots, and the design is surprisingly elegant:

  1. CLI slot (main class): Determines the Hero’s base ability, such as whether it is Claude Code or Codex
  2. Model slot (secondary class): Determines which model is used, such as Claude 4.5 or Claude 4.6
  3. Style slot (style): Determines the Hero’s behavior style, such as “Fengluo Strategist” or another style

The combination of these three slots creates unique Hero configurations. Much like equipment builds in games, you choose the right setup based on the task. After all, what suits you best is what matters most. Life is similar: many roads lead to Rome, but some are smoother than others.

Each Hero has its own XP and level:

type HeroProgressionSnapshot = {
  currentLevel: number; // Current level
  totalExperience: number; // Total experience
  currentLevelStartExperience: number; // Experience at the start of the current level
  nextLevelExperience: number; // Experience required for the next level
  experienceProgressPercent: number; // Progress percentage
  remainingExperienceToNextLevel: number; // Experience still needed for the next level
  lastExperienceGain: number; // Most recent experience gained
  lastExperienceGainAtUtc?: string | null; // Time when experience was gained
};

Levels are divided into four stages, and each stage has an immersive name:

export const resolveHeroProgressionStage = (level?: number | null): HeroProgressionStage => {
  const normalizedLevel = Math.max(1, level ?? 1);
  if (normalizedLevel <= 100) return 'rookieSprint'; // Rookie sprint
  if (normalizedLevel <= 300) return 'growthRun'; // Growth run
  if (normalizedLevel <= 700) return 'veteranClimb'; // Veteran climb
  return 'legendMarathon'; // Legend marathon
};

From “rookie” to “legend,” this growth path gives users a clear sense of direction and achievement. It mirrors personal growth in life, from confusion to maturity, only made more tangible here.
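The derived fields in the progression snapshot (progress percentage and remaining experience) can plausibly be computed from the three raw experience values. The sketch below shows one way to do that. It assumes that `nextLevelExperience` is the cumulative XP threshold at which the next level begins, which the article does not state explicitly, and `computeLevelProgress` is a hypothetical helper name.

```typescript
type LevelWindow = {
  totalExperience: number;
  currentLevelStartExperience: number;
  nextLevelExperience: number; // assumed: cumulative XP at which the next level starts
};

type LevelProgress = {
  experienceProgressPercent: number;
  remainingExperienceToNextLevel: number;
};

function computeLevelProgress(w: LevelWindow): LevelProgress {
  // Width of the current level in XP; guarded so a malformed window cannot divide by zero.
  const span = Math.max(1, w.nextLevelExperience - w.currentLevelStartExperience);
  // XP earned inside the current level, clamped to the range [0, span].
  const earned = Math.min(span, Math.max(0, w.totalExperience - w.currentLevelStartExperience));
  return {
    experienceProgressPercent: Math.round((earned / span) * 100),
    remainingExperienceToNextLevel: Math.max(0, w.nextLevelExperience - w.totalExperience),
  };
}
```

For example, a Hero with 150 total XP in a level that spans 100 to 300 XP is 25 percent of the way through the level, with 150 XP still to earn.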

To create a Hero, you need to configure three slots:

const heroDraft: HeroDraft = {
  name: 'Athena',
  icon: 'hero-avatar:storm-03',
  description: 'A brilliant strategist',
  executorType: AIProviderType.CLAUDE_CODE_CLI,
  slots: {
    cli: {
      id: 'profession-claude-code',
      parameters: { /* CLI-related parameters */ }
    },
    model: {
      id: 'secondary-claude-4-sonnet',
      parameters: { /* Model-related parameters */ }
    },
    style: {
      id: 'fengluo-strategist',
      parameters: { /* Style-related parameters */ }
    }
  }
};

Every Hero has a unique avatar, description, and professional identity, which gives what would otherwise be a cold AI assistant more personality and warmth. After all, who wants to work with a tool that has no character?

A “Dungeon” is a classic game concept representing a challenge that requires a team to clear. In HagiCode, each workflow is a Dungeon.

The Dungeon system organizes workflows into separate Dungeons:

  • Proposal generation dungeon: Responsible for generating technical proposals
  • Proposal execution dungeon: Responsible for executing tasks in proposals
  • Proposal archive dungeon: Responsible for organizing and archiving completed proposals

Each dungeon has its own Captain Hero, and the captain is automatically chosen as the first enabled Hero.

This is really just division of labor, like in everyday life, except turned into a game mechanic.
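The captain rule ("the first enabled Hero") is simple enough to sketch in a few lines. The `enabled` flag on a roster member and the `pickCaptain` helper are assumptions made for this illustration; the article does not show how enablement is stored.

```typescript
// Hypothetical roster member shape with an explicit enabled flag.
type RosterMember = { heroId: string; name: string; enabled: boolean };

// Returns the first enabled Hero in roster order, or undefined if none is enabled.
function pickCaptain(members: RosterMember[]): RosterMember | undefined {
  return members.find((member) => member.enabled);
}
```

Because the choice depends on roster order, reordering members is enough to change the captain without any extra setting.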

You can configure different Hero squads for different dungeons:

const dungeonRoster: HeroDungeonRoster = {
  scriptKey: 'proposal.generate',
  displayName: 'Proposal Generation',
  members: [
    { heroId: 'hero-1', name: 'Athena', executorType: 'ClaudeCode' },
    { heroId: 'hero-2', name: 'Apollo', executorType: 'Codex' }
  ]
};

For example, you can use Athena for generating proposals because it is good at strategy, and Apollo for implementing code because it is good at execution. That way, every Hero can play to its strengths. It is like forming a band: each person has an instrument, and together they create something beautiful.

Dungeon uses fixed scriptKey values to identify different workflows:

// Script keys map to different workflows
const dungeonScripts = {
  'proposal.generate': 'Proposal Generation',
  'proposal.execute': 'Proposal Execution',
  'proposal.archive': 'Proposal Archive'
};

The task state flow is: queued (waiting) -> dispatching (being assigned) -> dispatched (assigned). The whole process is automated and requires no manual intervention. That is also part of our lazy side, because who wants to manage this stuff by hand?
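The three-step state flow can be modeled as a tiny state machine that only allows forward transitions. This is an illustrative sketch: the state names come from the article, while `nextDispatchState` and `advanceDispatch` are hypothetical names.

```typescript
type DispatchState = 'queued' | 'dispatching' | 'dispatched';

// Each state maps to its only legal successor; null marks the terminal state.
const nextDispatchState: Record<DispatchState, DispatchState | null> = {
  queued: 'dispatching',
  dispatching: 'dispatched',
  dispatched: null,
};

function advanceDispatch(state: DispatchState): DispatchState {
  const next = nextDispatchState[state];
  if (next === null) {
    throw new Error(`Task is already in the terminal state "${state}"`);
  }
  return next;
}
```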

XP is the core feedback mechanism in the gamification system. Users gain XP by completing tasks, XP levels up Heroes, and leveling up unlocks new abilities, forming a positive feedback loop.

In HagiCode, XP can be earned through the following activities:

  • Completing code execution
  • Successfully calling tools
  • Generating proposals
  • Session management operations
  • Project operations

Every time a valid action is completed, the corresponding Hero gains XP. Just like growth in life, every step counts, only here that growth is quantified.
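A sketch of how XP might be credited per activity is shown below. The activity keys mirror the list above, but the XP amounts are placeholder values chosen purely for illustration, and `awardXp` is a hypothetical helper, not HagiCode's actual scoring logic.

```typescript
type XpActivity =
  | 'codeExecution'
  | 'toolCall'
  | 'proposalGenerated'
  | 'sessionOperation'
  | 'projectOperation';

// Placeholder amounts for illustration only; the real values are not given in the article.
const xpPerActivity: Record<XpActivity, number> = {
  codeExecution: 20,
  toolCall: 5,
  proposalGenerated: 30,
  sessionOperation: 2,
  projectOperation: 10,
};

type HeroXp = { heroId: string; totalExperience: number; lastExperienceGain: number };

// Returns an updated copy, so the previous snapshot stays available for comparison.
function awardXp(hero: HeroXp, activity: XpActivity): HeroXp {
  const gain = xpPerActivity[activity];
  return { ...hero, totalExperience: hero.totalExperience + gain, lastExperienceGain: gain };
}
```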

XP and level progress are visualized in real time:

type HeroDungeonMember = {
  heroId: string;
  name: string;
  icon?: string | null;
  executorType: PCode_Models_AIProviderType;
  currentLevel?: number; // Current level
  totalExperience?: number; // Total experience
  experienceProgressPercent?: number; // Progress percentage
};

Users can always see each Hero’s level and progress, and that immediate feedback is the key to gamification design. People need feedback, otherwise how would they know they are improving?

Achievements are another important element in gamification. They provide long-term goals and milestone-driven satisfaction.

HagiCode supports multiple types of achievements:

  • Code generation achievements: Generate X lines of code, generate Y files
  • Session management achievements: Complete Z conversations
  • Project operation achievements: Work across W projects

These achievements are really like milestones in life, except we have turned them into a game mechanic.

Achievements have three states:

type AchievementStatus = 'unlocked' | 'in-progress' | 'locked';

The three states have clear visual distinctions:

  • Unlocked: Gold gradient with a halo effect
  • In progress: Blue pulse animation
  • Locked: Gray, with unlock conditions shown

Each achievement clearly displays its trigger condition, so users know what to do next. When people feel lost, a little guidance always helps.
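One plausible way to derive the three achievement states from a progress counter is sketched here. The rule that "locked" means no progress has been made yet is an assumption of this example; the article only names the states, and `resolveAchievementStatus` is a hypothetical helper.

```typescript
type AchievementStatus = 'unlocked' | 'in-progress' | 'locked';

// Assumed rule: reaching the target unlocks, any partial progress counts as
// in-progress, and zero progress leaves the achievement locked.
function resolveAchievementStatus(current: number, target: number): AchievementStatus {
  if (current >= target) return 'unlocked';
  return current > 0 ? 'in-progress' : 'locked';
}
```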

When an achievement is unlocked, a celebration animation is triggered. That kind of positive reinforcement gives users the satisfying feeling of “I did it” and motivates them to keep going. Small rewards in life work the same way: they may be small, but the happiness can last a long time.

Battle Report is one of HagiCode’s signature features. At the end of each day, it generates a full-screen battle-style report.

Battle Report displays the following information:

type HeroBattleReport = {
  reportDate: string;
  summary: {
    totalHeroCount: number; // Total number of Heroes
    activeHeroCount: number; // Number of active Heroes
    totalBattleScore: number; // Total battle score
    mvp: HeroBattleHero; // Most valuable Hero
  };
  heroes: HeroBattleHero[]; // Detailed data for all Heroes
};

  • Total team score
  • Number of active Heroes
  • Number of tool calls
  • Total working time
  • MVP (Most Valuable Hero)
  • Detailed card for each Hero

The MVP is the best-performing Hero of the day and is highlighted in the report. This is not just data statistics, but a form of honor and recognition. After all, who does not want to be recognized?
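The report summary, including the MVP pick, could be assembled along these lines. This sketch assumes each Hero carries a single numeric `battleScore` and that a Hero counts as active when that score is above zero; both the field and the rule are assumptions, as is the `summarizeBattle` helper.

```typescript
// Hypothetical per-Hero record; the real HeroBattleHero type is not shown in the article.
type BattleHero = { heroId: string; name: string; battleScore: number };

type BattleSummary = {
  totalHeroCount: number;
  activeHeroCount: number;
  totalBattleScore: number;
  mvp: BattleHero | undefined; // undefined when the team is empty
};

function summarizeBattle(heroes: BattleHero[]): BattleSummary {
  let mvp: BattleHero | undefined;
  let totalBattleScore = 0;
  let activeHeroCount = 0;
  for (const hero of heroes) {
    totalBattleScore += hero.battleScore;
    if (hero.battleScore > 0) activeHeroCount += 1;
    // Strict comparison keeps the earliest Hero when scores tie.
    if (mvp === undefined || hero.battleScore > mvp.battleScore) mvp = hero;
  }
  return { totalHeroCount: heroes.length, activeHeroCount, totalBattleScore, mvp };
}
```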

Each Hero card includes:

  • Level progress
  • XP gained
  • Number of executions
  • Usage time

These metrics help users clearly understand how the team is performing. Seeing the results of your own effort is satisfying in itself.

HagiCode’s gamification system uses a modern technology stack and design patterns. There is nothing especially magical about it; we just chose tools that fit the job.

// React + TypeScript for the frontend
import React from 'react';
// Framer Motion for animations
import { AnimatePresence, motion } from 'framer-motion';
// Redux Toolkit for state management
import { useAppDispatch, useAppSelector } from '@/store';
// shadcn/ui for UI components
import { Dialog, DialogContent } from '@/components/ui/dialog';

Framer Motion handles all animation effects, shadcn/ui provides the foundational UI components, and Redux Toolkit manages the complex gamification state. Good tools make good work.

HagiCode uses a Glassmorphism + Tech Dark design style:

/* Primary gradient */
background: linear-gradient(135deg, #22C55E 0%, #25c2a0 50%, #06b6d4 100%);
/* Glass effect */
backdrop-filter: blur(12px);
/* Glow effect */
background: radial-gradient(circle at center, rgba(34, 197, 94, 0.15) 0%, transparent 70%);

The green gradient combined with glassmorphism creates a technical, futuristic atmosphere. Visual beauty is part of the user experience too.

Framer Motion is used to create smooth entrance animations:

<motion.div
  animate={{ opacity: 1, y: 0 }}
  initial={{ opacity: 0, y: 18 }}
  transition={{ duration: 0.35, ease: 'easeOut', delay: index * 0.08 }}
  className="card"
>
  {/* Card content */}
</motion.div>

Each card enters one after another with a delay of 0.08 seconds, creating a fluid visual effect. Smooth animation improves the experience. That part is hard to argue with.

Gamification data is stored using the Grain storage system to ensure state consistency. Even fine-grained data like accumulated Hero XP can be persisted accurately. No one wants to lose the experience they worked hard to earn.

Creating your first Hero is actually quite simple:

  1. Go to the Hero management page
  2. Click the “Create Hero” button
  3. Configure the three slots (CLI, Model, Style)
  4. Give the Hero a name and description
  5. Save it, and your first Hero is born

It is like meeting a new friend: you give them a name, learn what makes them special, and then head off on an adventure together.

Building a team is also simple:

  1. Go to the Dungeon management page
  2. Choose the dungeon you want to configure, such as “Proposal Generation”
  3. Select members from your Hero list
  4. The system automatically selects the first enabled Hero as Captain
  5. Save the configuration

This is simply the process of forming a team, much like building a team in real life where everyone has their own role.

At the end of each day, you can view the day’s Battle Report:

  1. Click the “Battle Report” button
  2. View the day’s work results in a full-screen display
  3. Check the MVP and the detailed data for each Hero
  4. Share it with team members if you want

This is also a kind of ritual, a way to see how much effort you put in today and how far you still are from your goal.

Use React.memo to avoid unnecessary re-renders:

const HeroCard = React.memo(({ hero }: { hero: HeroDungeonMember }) => {
  // Component implementation
});

Performance matters too. No one wants to use a laggy tool.

Detect the user’s motion preference settings and provide a simplified experience for motion-sensitive users:

const prefersReducedMotion = useReducedMotion();
const duration = prefersReducedMotion ? 0 : 0.35;

Not everyone likes animation, and respecting user preferences is part of good design.

Keep legacyIds to support migration from older versions:

type HeroDungeonMember = {
  heroId: string;
  legacyIds?: string[]; // Supports legacy ID mapping
  // ...
};

No one wants to lose data just because of a version upgrade.
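A lookup that honors `legacyIds` might work like the sketch below: match the current ID first, then fall back to any legacy ID. The `findHeroById` helper and the trimmed-down `HeroRef` type are hypothetical and exist only to illustrate the idea.

```typescript
// Minimal shape needed for the lookup; real Hero records carry more fields.
type HeroRef = { heroId: string; legacyIds?: string[] };

// Prefers an exact match on the current ID, then falls back to legacy IDs,
// so references saved by older versions still resolve after an upgrade.
function findHeroById<T extends HeroRef>(heroes: T[], id: string): T | undefined {
  return (
    heroes.find((hero) => hero.heroId === id) ??
    heroes.find((hero) => hero.legacyIds?.includes(id) ?? false)
  );
}
```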

Use i18n translation keys for all text to make multi-language support easy:

const localizedName = t(`dungeon.${scriptKey}`, { defaultValue: displayName });

Language should never be a barrier to using the product.

Gamification is not just a simple points leaderboard, but a complete incentive system. Through the Hero system, Dungeon system, XP and level system, achievement system, and Battle Report, HagiCode transforms programming work into a heroic journey full of adventure.

The core value of this system lies in:

  • Emotional connection: Giving cold AI assistants personality
  • Positive feedback: Every action produces immediate feedback
  • Long-term goals: Levels and achievements provide a growth path
  • Team identity: A sense of collaboration within Dungeon teams
  • Honor and recognition: Battle Report and MVP showcases

Gamification design turns programming from a dull routine into an adventure. While completing coding tasks, users also experience the fun of character growth, team collaboration, and achievement unlocking, which is meant to improve retention and activity.

At its core, programming is already an act of creation. We just made the creative process a little more fun.


Thank you for reading. If you found this article useful, please click the like button below so more people can discover it.

This content was created with AI-assisted collaboration, reviewed by me, and reflects my own views and position.

ImgBin CLI Tool Design: HagiCode's Image Asset Management Approach

This article explains how to build an automatable image asset pipeline from scratch, covering CLI tool design, a Provider Adapter architecture, and metadata management strategies.

Honestly, I did not expect image asset management to keep us tangled up for this long.

During HagiCode development, we ran into a problem that looked simple on the surface but was surprisingly thorny in practice: generating and managing image assets. In a way, it was like the dramas of adolescence - calm on the outside, turbulent underneath.

As the project accumulated more documentation and marketing materials, we needed a large number of supporting images. Some had to be AI-generated, some had to be selected from an existing asset library, and others needed AI recognition plus automatic labeling. The problem was that all of this had long been handled through scattered scripts and manual steps. Every time we generated an image, we had to run a script by hand, organize metadata by hand, and create thumbnails by hand. That alone was annoying enough, but the bigger issue was that everything was scattered everywhere. When we wanted to find something, we could not. When we needed to reuse something, we could not.

The pain points were concrete:

  1. No unified entry point: the logic for image generation was spread across different scripts, so batch execution was basically impossible.
  2. Missing metadata: generated images had no unified metadata.json, which meant no reliable searchability or traceability.
  3. High manual organization cost: titles and tags had to be sorted out one by one by hand, which was inefficient.
  4. No automation: automatically generating visual assets in a CI/CD pipeline? Not a chance.

We did think about just leaving it alone. But projects still need to move forward. Since we could not avoid the problem, we figured we might as well solve it. So we decided to upgrade ImgBin from a set of scattered scripts into an image asset pipeline that can be executed automatically. Some problems, after all, do not disappear just because you look away.

The approach shared in this article comes from our hands-on experience in the HagiCode project. HagiCode is an AI coding assistant project that simultaneously maintains multiple components, including a VSCode extension, backend AI services, and a cross-platform desktop client. In a complex, multilingual, cross-platform environment like this, standardized image asset management becomes a key part of improving development efficiency.

You could say this was one of those small growing pains in HagiCode’s journey. Every project has moments like that: a minor issue that looks insignificant, yet somehow manages to take up half the day.

HagiCode’s build system is based on the TypeScript + Node.js ecosystem, so ImgBin naturally adopted the same tech stack to keep the project technically consistent. Once you are used to one stack, switching to something else just feels like unnecessary trouble.


ImgBin uses a layered architecture that cleanly separates CLI commands, application services, third-party API adapters, and the infrastructure layer:

Component hierarchy
├── CLI Entry (cli.ts)        Global argument parsing, command routing
├── Commands (commands/*)     generate | batch | annotate | thumbnail
├── Application Services      job-runner | metadata | thumbnail | asset-writer
├── Provider Adapters         image-api-provider | vision-api-provider
└── Infrastructure Layer      config | logger | paths | schema

The benefit of this layered design is clear responsibility boundaries. It also makes testing easier because external dependencies can be mocked cleanly. In practice, it just means each layer does its own job without getting in the way of the others, so when something breaks, it is easier to figure out why.

ImgBin uses a model of “one asset, one directory.” Every time an image is generated, it creates a structure like this:

library/
└── 2026-03/
    └── orange-dashboard/
        ├── original.png      # Original image
        ├── thumbnail.webp    # 512x512 thumbnail
        └── metadata.json     # Structured metadata

The advantages of this model are:

  1. Self-contained: all files for a single asset live in the same directory, making migration and backup convenient.
  2. Traceable: metadata.json makes it possible to trace generation time, prompt, model, and other details.
  3. Extensible: if more variants are needed later, such as thumbnails in multiple sizes, we can simply add new files in the same directory.

Beautiful things do not always need to be possessed. Sometimes it is enough that they remain beautiful, and that you can quietly appreciate them. That may sound a little far afield, but the logic still holds here: once images are kept together, they are more pleasant to look at and much easier to find.

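metadata.json is the core of the entire system. Alongside identity, path, and status fields, it uses a layered storage strategy that separates descriptive data into three categories: generated, recognized, and manual. The sample below contains only the first two: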

{
  "schemaVersion": 2,
  "assetId": "orange-dashboard",
  "slug": "orange-dashboard",
  "title": "Orange Dashboard",
  "tags": ["dashboard", "hero", "orange"],
  "source": { "type": "generated" },
  "paths": {
    "assetDir": "library/2026-03/orange-dashboard",
    "original": "original.png",
    "thumbnail": "thumbnail.webp"
  },
  "generated": {
    "prompt": "orange dashboard for docs hero",
    "provider": "azure-openai-image-api",
    "model": "gpt-image-1.5"
  },
  "recognized": {
    "title": "Orange Dashboard",
    "tags": ["dashboard", "ui", "orange"],
    "description": "A modern orange dashboard with charts and metrics"
  },
  "status": {
    "generation": "succeeded",
    "recognition": "succeeded",
    "thumbnail": "succeeded"
  },
  "timestamps": {
    "createdAt": "2026-03-11T04:01:19.570Z",
    "updatedAt": "2026-03-11T04:02:09.132Z"
  }
}

  • generated: records the original information from image generation, such as the prompt, provider, and model.
  • recognized: stores AI recognition results, such as auto-generated titles, tags, and descriptions.
  • manual: stores manually curated results. Data in this area has the highest priority and will not be overwritten by AI recognition.

This layered strategy resolves one of our earlier core conflicts: when AI recognition and manual curation disagree, which one should win? The answer is manual input. AI recognition is there to assist, not to decide. That question also became clearer over time - machines are still machines, and in the end, people still need to make the call.
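
The "manual wins" rule can be sketched as a small read-time resolver. The types and the `resolveField` helper below are assumptions for illustration and only model the two layers that can disagree; ImgBin's real schema may differ.

```typescript
// Descriptive fields that can appear in more than one layer.
interface DescriptiveFields {
  title?: string;
  tags?: string[];
  description?: string;
}

// Reduced, hypothetical view of metadata.json: only the layers compared here.
interface LayeredMetadata {
  recognized?: DescriptiveFields;
  manual?: DescriptiveFields;
}

// Manual curation wins; AI recognition is only a fallback.
function resolveField<K extends keyof DescriptiveFields>(
  meta: LayeredMetadata,
  key: K,
): DescriptiveFields[K] | undefined {
  return meta.manual?.[key] ?? meta.recognized?.[key];
}

const meta: LayeredMetadata = {
  recognized: { title: "Orange Dashboard", tags: ["dashboard", "ui", "orange"] },
  manual: { title: "Docs Hero Dashboard" },
};

console.log(resolveField(meta, "title")); // prints Docs Hero Dashboard
console.log(resolveField(meta, "tags")); // falls back to the recognized tags
```

Resolving per field rather than per layer means a human can correct only the title and still benefit from AI-generated tags.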


Another core part of ImgBin is the Provider Adapter pattern. We abstract external APIs behind a unified interface so that even if we switch AI service providers, we do not need to change the business logic.

In a way, it is a bit like relationships - outward appearances can change, but what matters is that the inner structure stays the same. Once the interface is fixed, the internal implementation can vary freely.

interface ImageGenerationProvider {
  // Generate an image and return its Buffer
  generate(options: GenerateOptions): Promise<Buffer>;
  // Get the list of supported models
  getSupportedModels(): Promise<string[]>;
}

interface GenerateOptions {
  prompt: string;
  model?: string;
  size?: '1024x1024' | '1792x1024' | '1024x1792';
  quality?: 'standard' | 'hd';
  format?: 'png' | 'webp' | 'jpeg';
}

interface VisionRecognitionProvider {
  // Recognize image content and return structured metadata
  recognize(imageBuffer: Buffer): Promise<RecognitionResult>;
  // Get the list of supported models
  getSupportedModels(): Promise<string[]>;
}

interface RecognitionResult {
  title?: string;
  tags: string[];
  description?: string;
  confidence: number;
}

The advantages of this interface design are:

  1. Testable: in unit tests, we can pass in mock providers instead of making real external API calls.
  2. Extensible: adding a new provider only requires implementing the interface; caller code does not need to change.
  3. Replaceable: production can use Azure OpenAI while testing can use a local model, with configuration being the only thing that changes.

Sometimes project work feels like that too. On the surface it looks like we just swapped an API, but the internal logic remains exactly the same, and that makes the whole thing a lot less scary.
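
To make the testability point concrete, here is a minimal sketch of a mock provider. It restates a reduced form of the interface and uses `Uint8Array` instead of `Buffer` so the snippet has no Node.js dependency; `MockImageProvider` and `generateAsset` are hypothetical names, not ImgBin code.

```typescript
interface GenerateOptions {
  prompt: string;
  model?: string;
}

// Reduced version of the interface above (Uint8Array stands in for Buffer).
interface ImageGenerationProvider {
  generate(options: GenerateOptions): Promise<Uint8Array>;
  getSupportedModels(): Promise<string[]>;
}

// Hypothetical in-memory provider for tests: records calls, returns fixed bytes.
class MockImageProvider implements ImageGenerationProvider {
  readonly calls: GenerateOptions[] = [];

  async generate(options: GenerateOptions): Promise<Uint8Array> {
    this.calls.push(options);
    return new Uint8Array([0x89, 0x50, 0x4e, 0x47]); // first bytes of the PNG signature
  }

  async getSupportedModels(): Promise<string[]> {
    return ["mock-model"];
  }
}

// Caller code depends only on the interface, so any provider can be passed in.
async function generateAsset(
  provider: ImageGenerationProvider,
  prompt: string,
): Promise<number> {
  const image = await provider.generate({ prompt });
  return image.byteLength;
}
```

A test can pass `new MockImageProvider()` to `generateAsset` and inspect `calls` afterwards, without any network access or API key.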


ImgBin provides four core commands to cover different usage scenarios:

# Simplest usage
imgbin generate --prompt "orange dashboard for docs hero"
# Generate a thumbnail and AI annotations at the same time
imgbin generate --prompt "orange dashboard" --annotate --thumbnail
# Specify an output directory
imgbin generate --prompt "orange dashboard" --output ./library

Batch jobs are defined through YAML or JSON manifest files, which makes them suitable for CI/CD workflows:

# assets/jobs/launch.yaml
defaults:
  annotate: true
  thumbnail: true
  libraryRoot: ./library
jobs:
  - prompt: "orange dashboard hero"
    slug: orange-dashboard
    tags: [dashboard, hero, orange]
  - prompt: "pricing grid for docs"
    slug: pricing-grid
    tags: [pricing, grid, docs]

Run the command:

imgbin batch assets/jobs/launch.yaml

The batch job design supports failure isolation: items in the manifest are processed one by one, and a failure in one item does not affect the others. You can also preview the job with --dry-run without actually executing it.

And the best part is that it tells you exactly what succeeded and what failed. Unlike some things in life, where failure happens and you are left not even knowing how it happened.

Run AI recognition on existing images to automatically generate titles, tags, and descriptions:

# Annotate a single image
imgbin annotate ./library/2026-03/orange-dashboard
# Annotate an entire directory in batch
imgbin annotate ./library/2026-03/

Generate thumbnails for existing images:

# Generate a thumbnail
imgbin thumbnail ./library/2026-03/orange-dashboard

The manifest format for batch jobs supports flexible configuration. Defaults can be set globally, and individual jobs can override them:

# Global defaults
defaults:
  annotate: true    # Enable AI annotation by default
  thumbnail: true   # Generate thumbnails by default
  libraryRoot: ./library
  model: gpt-image-1.5
jobs:
  # Minimal configuration: only provide a prompt
  - prompt: "first image"
  # Full configuration
  - prompt: "second image"
    slug: custom-slug
    tags: [tag1, tag2]
    annotate: false   # Do not run AI annotation for this job
    model: dall-e-3   # Use a different model for this job

When executed, ImgBin processes jobs one by one. The result of each job is written to its corresponding metadata.json. Even if one job fails, the others are unaffected. After all jobs complete, the CLI outputs a summary report:

✓ orange-dashboard (succeeded)
✓ pricing-grid (succeeded)
✗ hero-banner (failed: API rate limit exceeded)
2/3 succeeded, 1 failed

Some things cannot be rushed. Taking them one at a time is often the steadier path. Maybe that is the philosophy behind batch jobs.
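
The two behaviors described above, per-job overrides of the manifest defaults and failure isolation with a final summary, can be sketched as follows. This is an illustration under assumed types (`Manifest`, `JobOptions`, `JobResult`) and hypothetical helpers, not ImgBin's actual job runner.

```typescript
interface JobOptions {
  prompt: string;
  slug?: string;
  annotate?: boolean;
  thumbnail?: boolean;
  model?: string;
}

interface Manifest {
  defaults?: Partial<JobOptions>;
  jobs: JobOptions[];
}

interface JobResult {
  label: string;
  ok: boolean;
  error?: string;
}

// Per-job values override the manifest defaults.
function resolveJob(manifest: Manifest, job: JobOptions): JobOptions {
  return { ...manifest.defaults, ...job };
}

// Runs jobs one by one; a failure is recorded and the loop continues.
async function runBatch(
  manifest: Manifest,
  runJob: (job: JobOptions) => Promise<void>,
): Promise<JobResult[]> {
  const results: JobResult[] = [];
  for (const raw of manifest.jobs) {
    const job = resolveJob(manifest, raw);
    const label = job.slug ?? job.prompt;
    try {
      await runJob(job);
      results.push({ label, ok: true });
    } catch (err) {
      const error = err instanceof Error ? err.message : String(err);
      results.push({ label, ok: false, error });
    }
  }
  return results;
}

// Builds a summary in the same shape as the report shown above.
function summarize(results: JobResult[]): string {
  const failed = results.filter((r) => !r.ok).length;
  const lines = results.map((r) =>
    r.ok ? `✓ ${r.label} (succeeded)` : `✗ ${r.label} (failed: ${r.error})`,
  );
  lines.push(`${results.length - failed}/${results.length} succeeded, ${failed} failed`);
  return lines.join("\n");
}
```

A CLI wrapper could then print `summarize(results)` and exit with a non-zero code whenever `results.some((r) => !r.ok)` is true, so CI can detect partial failures.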


ImgBin supports flexible configuration through environment variables:

# ImgBin working directory
IMGBIN_WORKDIR=/path/to/imgbin
# Executable path (for invocation inside scripts)
IMGBIN_EXECUTABLE=/path/to/imgbin/dist/cli.js
# Asset library root
IMGBIN_LIBRARY_ROOT=./.imgbin-library
# Azure OpenAI configuration (if using the Azure provider)
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_OPENAI_API_KEY=***
AZURE_OPENAI_IMAGE_DEPLOYMENT=gpt-image-1

Configuration is one of those things that can feel both important and not that important at the same time. In the end, whatever feels comfortable and fits your workflow best is usually the right choice.
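
As a rough sketch of how these variables might be read, the loader below takes the environment as a plain object so it stays easy to test. The `ImgBinConfig` shape, the `loadConfig` helper, and both fallback values are assumptions borrowed from the example values above, not ImgBin's documented defaults.

```typescript
interface ImgBinConfig {
  libraryRoot: string;
  azure?: { endpoint: string; apiKey: string; deployment: string };
}

// Hypothetical loader; the environment is passed in rather than read globally.
function loadConfig(env: Record<string, string | undefined>): ImgBinConfig {
  const config: ImgBinConfig = {
    // Assumed fallback, taken from the example value shown above.
    libraryRoot: env["IMGBIN_LIBRARY_ROOT"] ?? "./.imgbin-library",
  };

  const endpoint = env["AZURE_OPENAI_ENDPOINT"];
  const apiKey = env["AZURE_OPENAI_API_KEY"];
  // Only enable the Azure provider when both required values are present.
  if (endpoint && apiKey) {
    config.azure = {
      endpoint,
      apiKey,
      deployment: env["AZURE_OPENAI_IMAGE_DEPLOYMENT"] ?? "gpt-image-1", // assumed fallback
    };
  }
  return config;
}
```

In a Node.js entry point this would typically be called as `loadConfig(process.env)`.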


During implementation, we summarized a few key points:

Interface definitions should be clear and complete, including input parameters, return values, and error handling. It is also a good idea to provide both synchronous and asynchronous invocation styles for different scenarios.

That is one small piece of hard-earned experience. Once an interface is set, nobody wants to keep changing it later.

When one item fails in a batch job, the CLI should:

  1. Write detailed error information to a separate log file.
  2. Continue executing other jobs instead of interrupting the whole process.
  3. Return a non-zero exit code at the end to indicate that some jobs failed.
  4. Clearly display the execution result of every job in the summary report.

Some failures are just failures. There is no point pretending otherwise. It is better to acknowledge them openly and then figure out how to solve them. The same logic applies to projects and to life.

Recognition results are written to the recognized section by default, while manually edited fields are marked in manual. Metadata updates follow an append-only strategy: unless --force is explicitly passed, existing manually curated results are not overwritten.

That point became clear too - some things, once overwritten, are just gone. It is often better to preserve them, because the record itself has value.
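
The write path can be sketched the same way. The exact effect of --force is not spelled out here, so the version below assumes it lets recognized values replace the matching manual fields; `applyRecognition` and its types are hypothetical.

```typescript
interface Description {
  title?: string;
  tags?: string[];
  description?: string;
}

interface StoredMetadata {
  recognized?: Description;
  manual?: Description;
}

// Recognition output always lands in `recognized`; `manual` changes only with force.
function applyRecognition(
  stored: StoredMetadata,
  result: Description,
  options: { force?: boolean } = {},
): StoredMetadata {
  const next: StoredMetadata = { ...stored, recognized: { ...result } };
  if (options.force && stored.manual) {
    // Assumed meaning of --force: recognized values replace matching manual fields.
    next.manual = { ...stored.manual, ...result };
  }
  return next;
}

const stored: StoredMetadata = { manual: { title: "Docs Hero Dashboard" } };
const recognition: Description = { title: "Orange Dashboard", tags: ["dashboard"] };

const kept = applyRecognition(stored, recognition);
console.log(kept.manual?.title); // prints Docs Hero Dashboard

const forced = applyRecognition(stored, recognition, { force: true });
console.log(forced.manual?.title); // prints Orange Dashboard
```

The function returns a new object instead of mutating its input, which makes it straightforward to diff the old and new metadata before writing to disk.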

Use fs.mkdir(path, { recursive: true }) so that directory creation is idempotent: the call succeeds even when the directory already exists, which avoids the check-then-create race condition in concurrent scenarios.

Maybe that is what security feels like - being stable when stability matters, moving fast when speed matters, and never getting stuck second-guessing.


As the core tool for image asset management in the HagiCode project, ImgBin solves our problems through the following design choices:

  1. Unified entry point: the CLI covers generation, annotation, thumbnails, and all other core operations.
  2. Metadata-driven: every asset has a complete metadata.json, enabling search and traceability.
  3. Provider Adapter: flexible abstraction for external APIs, making testing and extension easier.
  4. Batch job support: batch image generation can be automated within CI/CD workflows.

Everything else may have faded, but this approach really did end up proving useful.

This solution not only improves HagiCode’s own development efficiency, but also forms a reusable framework for image asset management. If you are building a similarly multi-component project, I believe ImgBin’s design ideas may give you some inspiration.

Youth is all about trying things and making a bit of a mess. If you never put yourself through that, how would you know what you are really capable of?



Thank you for reading. If you found this article helpful, please click the like button below so more people can discover it.

This content was produced with AI-assisted collaboration, reviewed by me, and reflects my own views and position.

Primary profession management in hero settings

  • Hero settings now include a dedicated Primary Professions tab to toggle availability.
  • Enablement is persisted at the system level and gates hero availability in dungeon selection and status checks.
  • CLI detection surfaces availability and version; enablement toggles stay locked until the CLI is detected.

Practical Guide to Integrating CodeBuddy CLI into a C# Backend

This article walks through a complete approach to integrating CodeBuddy CLI into a C# backend project so you can deliver AI coding assistant capabilities end to end.

In modern AI coding assistant development, a single AI Provider often cannot satisfy complex and changing development scenarios. HagiCode, as a multifunctional AI coding assistant, needs to support multiple AI Providers to deliver a better user experience. Users should have enough freedom to choose. In early 2026, the project faced a key decision: how to restore CodeBuddy ACP (Agent Client Protocol) integration capabilities in the C# backend.

The project had previously implemented CodeBuddy integration, but the related code was removed during a refactor. There is not much to complain about there; during iterative development, something always gets left behind. The goal of this technical solution was to fully restore that capability and improve the architecture so it would be more robust and maintainable.

If you are also considering connecting multiple AI coding assistants to your own project, the approach below may give you some ideas. It reflects lessons we summarized after stepping into plenty of pitfalls, and maybe it can help you avoid a few detours.

The approach shared in this article comes from our practical experience in the HagiCode project. HagiCode is an open-source AI coding assistant project that supports multiple AI Providers and cross-platform operation. To satisfy different user preferences, we need to switch flexibly among different AI coding assistants, which is exactly why we built the CodeBuddy integration described here.

HagiCode uses a modular design, with AI Providers implemented as pluggable components. This architecture lets us add new AI support easily without affecting existing features. When a design is done well up front, it saves a lot of trouble later. If you are interested in our technical architecture, you can view the full source code on GitHub.

The integration between C# and CodeBuddy uses a clear layered architecture. This design makes responsibilities explicit and makes long-term maintenance much easier:

┌────────────────────────────────────────────────────┐
│  Provider Contract Layer                           │
│    AIProviderType enum + extension methods         │
├────────────────────────────────────────────────────┤
│  Provider Factory Layer                            │
│    AIProviderFactory dependency injection factory  │
├────────────────────────────────────────────────────┤
│  Provider Implementation Layer                     │
│    CodebuddyCliProvider concrete implementation    │
├────────────────────────────────────────────────────┤
│  ACP Infrastructure Layer                          │
│    ACPSessionManager / StdioAcpTransport           │
│    AcpRpcClient / AcpAgentClient                   │
└────────────────────────────────────────────────────┘

What are the benefits of this layering? Put simply, each layer stays out of the others’ way. If we later want to change the communication mechanism, for example from stdio to WebSocket, we only need to modify the bottom layer, and the business logic above it stays untouched. Nobody wants a communication change to ripple through the entire codebase.

The Provider contract layer is the foundation of the entire architecture. We define the AIProviderType enum, where CodebuddyCli = 3 is used as the enum value, and implement bidirectional mapping between strings and enums through extension methods. That allows strings in configuration files to be converted conveniently into enums, and enums to be converted back to strings for debugging output.

The Provider factory layer is responsible for creating the corresponding Provider instance based on configuration. It uses .NET dependency injection together with ActivatorUtilities.CreateInstance for dynamic creation. The advantage of the factory pattern is that when adding a new Provider, you only need to add the creation logic instead of modifying existing code.

The Provider implementation layer is where the actual work happens. CodebuddyCliProvider implements the IAIProvider interface and provides two invocation modes: ExecuteAsync for non-streaming calls and StreamAsync for streaming calls.

The ACP infrastructure layer provides the communication foundation underneath. This layer handles all protocol details, including process management, message serialization, and response parsing. It is the foundation that keeps everything above it stable.

CodeBuddy uses Stdio (standard input/output) to communicate with external processes. The startup command is simple:

codebuddy --acp

After that, JSON-RPC messages are exchanged through standard input and output. This approach has several advantages:

  1. Fast startup: local process communication avoids network latency
  2. Simple configuration: you only need to specify the executable path
  3. Environment isolation: each session runs in an independent process, so they do not affect one another

Environment variable injection is supported during communication. Common examples include:

  • CODEBUDDY_API_KEY: API key authentication
  • CODEBUDDY_INTERNET_ENVIRONMENT: network environment configuration

As with communication between people, it helps to choose a convenient channel first.

ACP is based on JSON-RPC 2.0. The message format looks roughly like this:

// Request message
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "agent/prompt",
  "params": {
    "prompt": "Help me write a sorting algorithm",
    "sessionId": "session-123"
  }
}

// Response message
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "content": "Here is the AI response..."
  }
}

In the real implementation, we encapsulate all of these protocol details so the upper business layer only needs to care about the prompt and response.
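
The core of that encapsulation is matching each response to its request by `id`. The sketch below shows the idea in TypeScript for brevity, although the real implementation is in C#; `RpcCorrelator` is a hypothetical class, and one JSON message per line is an assumed framing.

```typescript
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: unknown;
}

interface JsonRpcResponse {
  jsonrpc: "2.0";
  id: number;
  result?: unknown;
  error?: { code: number; message: string };
}

interface PendingCall {
  resolve: (value: unknown) => void;
  reject: (reason: Error) => void;
}

// Hypothetical client core: assigns ids and matches responses to pending requests.
class RpcCorrelator {
  private nextId = 1;
  private readonly pending = new Map<number, PendingCall>();

  // Builds a request and returns the line to write to the child process's stdin.
  createRequest(
    method: string,
    params?: unknown,
  ): { line: string; response: Promise<unknown> } {
    const id = this.nextId++;
    const request: JsonRpcRequest = { jsonrpc: "2.0", id, method, params };
    const response = new Promise<unknown>((resolve, reject) => {
      this.pending.set(id, { resolve, reject });
    });
    return { line: JSON.stringify(request) + "\n", response };
  }

  // Feeds one line read from the child process's stdout back into the correlator.
  handleLine(line: string): void {
    const message = JSON.parse(line) as JsonRpcResponse;
    const call = this.pending.get(message.id);
    if (!call) return; // not a response to a pending request
    this.pending.delete(message.id);
    if (message.error) call.reject(new Error(message.error.message));
    else call.resolve(message.result);
  }
}
```

With this piece in place, the layer above only awaits `response` and never sees ids or raw JSON.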

First, restore the CodeBuddy type in the enum file:

// PCode.Models/AIProviderType.cs
public enum AIProviderType
{
    ClaudeCodeCli = 0,
    CodexCli = 1,
    GitHubCopilot = 2,
    CodebuddyCli = 3, // Restore this enum value
    OpenCodeCli = 4,
    IFlowCli = 5,
}

Then add string mapping in the extension methods so the configuration file can specify the Provider by string:

// AIProviderTypeExtensions.cs
private static readonly Dictionary<string, AIProviderType> _typeMap = new(
    StringComparer.OrdinalIgnoreCase)
{
    ["CodebuddyCli"] = AIProviderType.CodebuddyCli,
    ["Codebuddy"] = AIProviderType.CodebuddyCli,
    ["codebuddy"] = AIProviderType.CodebuddyCli,
    // ... Mappings for other providers
};

Add a CodeBuddy creation branch in the factory class:

// AIProviderFactory.cs
private IAIProvider? CreateProvider(AIProviderType providerType, ProviderConfiguration config)
{
    return providerType switch
    {
        AIProviderType.CodebuddyCli =>
            ActivatorUtilities.CreateInstance<CodebuddyCliProvider>(
                _serviceProvider,
                Options.Create(config)),
        // ... Other providers
        _ => throw new NotSupportedException($"Provider {providerType} not supported")
    };
}

This uses dependency injection through ActivatorUtilities, which automatically handles constructor parameter injection and is very convenient.

Below is the core implementation of CodebuddyCliProvider, covering both streaming and non-streaming invocation modes:

public class CodebuddyCliProvider : IAIProvider
{
    private readonly ILogger<CodebuddyCliProvider> _logger;
    private readonly IACPSessionManager _sessionManager;
    private readonly ProviderConfiguration _config;

    public string Name => "CodebuddyCli";
    public bool SupportsStreaming => true;
    public ProviderCapabilities Capabilities { get; }

    public CodebuddyCliProvider(
        ILogger<CodebuddyCliProvider> logger,
        IACPSessionManager sessionManager,
        IOptions<ProviderConfiguration> config)
    {
        _logger = logger;
        _sessionManager = sessionManager;
        _config = config.Value;
        // Define the capabilities of the current Provider
        Capabilities = new ProviderCapabilities
        {
            SupportsStreaming = true,
            SupportsTools = true,
            SupportsSystemMessages = true,
            SupportsArtifacts = false,
            MaxTokens = 8192
        };
    }

    // Non-streaming call: return all results together after completion
    public async Task<AIResponse> ExecuteAsync(
        AIRequest request,
        CancellationToken cancellationToken = default)
    {
        // Create an independent session for the request
        var session = await _sessionManager.CreateSessionAsync(
            "CodebuddyCli",
            request.WorkingDirectory,
            cancellationToken,
            request.SessionId);
        try
        {
            var fullPrompt = BuildPrompt(request);
            await session.SendPromptAsync(fullPrompt, cancellationToken);
            var responseBuilder = new StringBuilder();
            var toolCalls = new List<AIToolCall>();
            // Collect all response chunks
            await foreach (var chunk in StreamFromSession(session, cancellationToken))
            {
                if (!string.IsNullOrEmpty(chunk.Content))
                {
                    responseBuilder.Append(chunk.Content);
                }
                // Handle tool calls...
            }
            return new AIResponse
            {
                Content = AIResultContentSanitizer.SanitizeResultContent(
                    responseBuilder.ToString()),
                ToolCalls = toolCalls,
                Provider = Name,
                Model = string.Empty
            };
        }
        finally
        {
            // Release session resources
            await session.DisposeAsync();
        }
    }

    // Streaming call: return response chunks in real time
    public async IAsyncEnumerable<AIStreamingChunk> StreamAsync(
        AIRequest request,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        var session = await _sessionManager.CreateSessionAsync(
            "CodebuddyCli",
            request.WorkingDirectory,
            cancellationToken);
        try
        {
            var fullPrompt = BuildPrompt(request);
            await session.SendPromptAsync(fullPrompt, cancellationToken);
            await foreach (var chunk in StreamFromSession(session, cancellationToken))
            {
                yield return chunk;
            }
        }
        finally
        {
            await session.DisposeAsync();
        }
    }

    private async IAsyncEnumerable<AIStreamingChunk> StreamFromSession(
        IACPSession session,
        [EnumeratorCancellation] CancellationToken cancellationToken)
    {
        // Iterate through all updates in the session
        await foreach (var notification in session.ReceiveUpdatesAsync(cancellationToken))
        {
            switch (notification.Update)
            {
                case AgentMessageChunkSessionUpdate agentMessage:
                    // Handle text content chunks
                    if (agentMessage.Content is AcpImp.TextContentBlock textContent)
                    {
                        yield return new AIStreamingChunk
                        {
                            Content = textContent.Text,
                            Type = StreamingChunkType.ContentDelta,
                            IsComplete = false
                        };
                    }
                    break;
                case ToolCallSessionUpdate toolCall:
                    // Handle tool calls
                    yield return new AIStreamingChunk
                    {
                        Content = string.Empty,
                        Type = StreamingChunkType.ToolCallDelta,
                        ToolCallDelta = new AIToolCallDelta
                        {
                            Id = toolCall.ToolCallId,
                            Name = toolCall.Kind.ToString(),
                            Arguments = toolCall.RawInput?.ToString()
                        }
                    };
                    break;
                case AcpImp.PromptCompletedSessionUpdate:
                    // Response complete
                    yield break;
            }
        }
    }

    // Build the full prompt
    private string BuildPrompt(AIRequest request, string? embeddedCommandPrompt = null)
    {
        var sb = new StringBuilder();
        // Embedded command prompt, if present
        if (!string.IsNullOrEmpty(embeddedCommandPrompt))
        {
            sb.AppendLine(embeddedCommandPrompt);
            sb.AppendLine();
        }
        // System message
        if (!string.IsNullOrEmpty(request.SystemMessage))
        {
            sb.AppendLine(request.SystemMessage);
            sb.AppendLine();
        }
        // User prompt
        sb.Append(request.Prompt);
        return sb.ToString();
    }
}

There are several key points in this code:

  1. Session management: each request creates an independent session and releases resources after the request completes. This is a lesson learned through trial and error. If session reuse is not handled well, state pollution appears easily.

  2. Streaming processing: IAsyncEnumerable allows the response to be returned while it is still being generated, instead of waiting for all content to finish. This is especially important for long-text scenarios and significantly improves the user experience.

  3. Tool calls: CodeBuddy supports tool calling (Function Calling), handled through ToolCallSessionUpdate. This capability is critical for complex code editing tasks.

  4. Content filtering: AIResultContentSanitizer is used to filter Think block content and keep the output clean.
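
The first two points, one session per request with guaranteed cleanup and a non-streaming mode built by collecting streamed chunks, are language-independent. Here is a compact TypeScript sketch of the same pattern using async generators; `Session`, `streamChunks`, `execute`, and `fakeSession` are hypothetical stand-ins, not the C# types above.

```typescript
interface Chunk {
  content: string;
}

// Hypothetical stand-in for an ACP session.
interface Session {
  updates(): AsyncIterable<Chunk>;
  dispose(): Promise<void>;
}

// Streaming mode: yield chunks as they arrive and always release the session.
async function* streamChunks(
  createSession: () => Promise<Session>,
): AsyncGenerator<Chunk> {
  const session = await createSession();
  try {
    for await (const chunk of session.updates()) {
      yield chunk;
    }
  } finally {
    await session.dispose(); // runs on completion, error, or early exit by the consumer
  }
}

// Non-streaming mode: collect every chunk, then return the full text.
async function execute(createSession: () => Promise<Session>): Promise<string> {
  let text = "";
  for await (const chunk of streamChunks(createSession)) {
    text += chunk.content;
  }
  return text;
}

// In-memory session used only to demonstrate the flow.
function fakeSession(parts: string[], log: string[]): Session {
  return {
    async *updates() {
      for (const part of parts) yield { content: part };
    },
    async dispose() {
      log.push("disposed");
    },
  };
}
```

Placing the cleanup in `finally` inside the generator means the session is released even when a consumer stops reading halfway through the stream.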

Add the related services during module registration:

// PCodeClaudeHelperModule.cs
public void ConfigureModule(IServiceCollection services)
{
    // Register Provider
    services.AddTransient<CodebuddyCliProvider>();
    // Register ACP infrastructure
    services.AddSingleton<IACPSessionManager, ACPSessionManager>();
    services.AddSingleton<IAcpPlatformConfigurationResolver, AcpPlatformConfigurationResolver>();
    services.AddSingleton<IAIRequestToAcpMapper, AIRequestToAcpMapper>();
    services.AddSingleton<IAcpToAIResponseMapper, AcpToAIResponseMapper>();
}

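Add CodeBuddy-related configuration to the application settings. The snippet below uses YAML notation; in appsettings.json the same keys are written as nested JSON objects: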

AI:
  # Default Provider to use
  DefaultProvider: "CodebuddyCli"
  # Provider configuration
  Providers:
    CodebuddyCli:
      Type: "CodebuddyCli"
      WorkingDirectory: "C:/projects/my-app"
      ExecutablePath: "C:/tools/codebuddy.cmd"
  # Platform-specific configuration
  PlatformConfigurations:
    CodebuddyCli:
      ExecutablePath: "C:/tools/codebuddy.cmd"
      Arguments: "--acp"
      StartupTimeoutMs: 5000
      EnvironmentVariables:
        CODEBUDDY_API_KEY: "${CODEBUDDY_API_KEY}"
        CODEBUDDY_INTERNET_ENVIRONMENT: "production"

The corresponding configuration model definition:

public class CodebuddyPlatformConfiguration : IAcpPlatformConfiguration
{
    public string ProviderName => "CodebuddyCli";
    public AcpTransportType TransportType => AcpTransportType.Stdio;
    public string ExecutablePath { get; set; } = "codebuddy";
    public string Arguments { get; set; } = "--acp";
    public int StartupTimeoutMs { get; set; } = 5000;
    public Dictionary<string, string?>? EnvironmentVariables { get; set; }
}

We ran into several typical pitfalls during implementation, and sharing them here may help others avoid the same detours:

  1. Session leak issue: at first, sessions were not released correctly, which exhausted process resources. The solution was to use try-finally to ensure resources are released for every request.

  2. Environment variable passing: Windows and Linux use different environment variable syntax, so we later standardized on Dictionary<string, string?> to handle this.

  3. Timeout configuration: CLI startup takes time, so we set a 5-second startup timeout so that requests do not fail before the process is ready.

  4. Encoding issues: on Windows, the default encoding may cause garbled Chinese text, so UTF-8 encoding is explicitly specified when starting the process.

Beyond the pitfalls, a few performance optimizations are worth considering:

  1. Session pool: for frequent short requests, consider implementing a session pool to reuse processes
  2. Connection cache: the factory class already supports caching Provider instances
  3. Async first: use asynchronous programming throughout to avoid blocking threads

Performance is always worth optimizing. The longer users wait, the worse the experience becomes.

This article introduced a complete solution for integrating CodeBuddy CLI into a C# backend, covering the entire process from architecture design to concrete implementation. Through a layered architecture, we separate protocol details from business logic, making the code clearer and easier to maintain.

Key takeaways:

  • Use a layered architecture with a Provider contract layer, factory layer, implementation layer, and infrastructure layer
  • Use JSON-RPC over Stdio for inter-process communication
  • Implement flexible configuration and extensibility through dependency injection
  • Provide both streaming and non-streaming invocation modes

This approach is not only suitable for CodeBuddy; adding new AI Providers can follow the same pattern. If you are also building a similar multi-AI-Provider integration, I hope this article gives you a useful reference.



Practical Multi-AI Provider Architecture in the HagiCode Platform

This article shares the technical approach we used under the Orleans Grain architecture to integrate two AI tools, iflow and OpenCode, through a unified IAIProvider interface, and compares the implementation differences between WebSocket and HTTP communication in detail.

The motivation was practical. While building HagiCode, we found that users wanted to work with different AI tools. That is hardly surprising, since everyone has their own habits. Some prefer Claude Code, some love GitHub Copilot, and some teams use tools they developed themselves.

Our initial solution was simple and direct: write dedicated integration code for each AI tool. But the drawbacks showed up quickly. The codebase filled up with if-else branches, every change required testing in multiple places, and every new tool meant writing another pile of logic from scratch.

Later, I realized it would be better to create a unified IAIProvider interface and abstract the capabilities shared by all AI providers. That way, no matter which tool is used underneath, the upper layers can call it in the same way.

Recently, the project needed to integrate two new tools: iflow and OpenCode. Both support the ACP protocol, but their communication styles are different. iflow uses WebSocket, while OpenCode uses an HTTP API. That became a useful architectural test: adapt two different transport modes behind one unified interface.

The approach shared in this article comes from our practical experience in the HagiCode project. HagiCode is an AI-assisted development platform built on the Orleans Grain architecture. It integrates with different AI providers through a unified IAIProvider interface, allowing users to flexibly choose the AI tools they prefer.

First, we defined the IAIProvider interface and abstracted the capabilities that every AI provider needs to implement:

public interface IAIProvider
{
string Name { get; }
bool SupportsStreaming { get; }
ProviderCapabilities Capabilities { get; }
Task<AIResponse> ExecuteAsync(AIRequest request, CancellationToken cancellationToken = default);
IAsyncEnumerable<AIStreamingChunk> StreamAsync(AIRequest request, CancellationToken cancellationToken = default);
Task<ProviderTestResult> PingAsync(CancellationToken cancellationToken = default);
IAsyncEnumerable<AIStreamingChunk> SendMessageAsync(AIRequest request, string? embeddedCommandPrompt = null, CancellationToken cancellationToken = default);
}

This interface includes several key methods:

  • ExecuteAsync: execute a one-shot AI request
  • StreamAsync: get streaming responses for real-time display
  • PingAsync: perform a health check to verify whether the provider is available
  • SendMessageAsync: send a message with support for embedded commands
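
To make the contract more concrete, here is a reduced TypeScript mirror of it. This is a sketch under assumptions: the type names follow the C# interface loosely, EchoProvider is a made-up provider, and building the one-shot call on top of the streaming call is one possible design, not something the article states about the real providers.

```typescript
// TypeScript mirror of the C# contract, for illustration only.
interface AIRequest { prompt: string; sessionId?: string; }
interface AIStreamingChunk { type: "ContentDelta" | "Metadata"; content: string; isComplete: boolean; }
interface AIResponse { content: string; }

interface AIProvider {
  readonly name: string;
  stream(request: AIRequest): AsyncIterable<AIStreamingChunk>;
  execute(request: AIRequest): Promise<AIResponse>;
}

// Collect a stream into a single response by concatenating content deltas.
async function collect(chunks: AsyncIterable<AIStreamingChunk>): Promise<AIResponse> {
  let content = "";
  for await (const chunk of chunks) {
    if (chunk.type === "ContentDelta") content += chunk.content;
  }
  return { content };
}

// Hypothetical provider that echoes the prompt word by word.
class EchoProvider implements AIProvider {
  readonly name = "Echo";

  async *stream(request: AIRequest): AsyncGenerator<AIStreamingChunk> {
    for (const word of request.prompt.split(" ")) {
      yield { type: "ContentDelta", content: word + " ", isComplete: false };
    }
    yield { type: "Metadata", content: "", isComplete: true };
  }

  execute(request: AIRequest): Promise<AIResponse> {
    return collect(this.stream(request));
  }
}
```

Callers depend only on AIProvider, so swapping EchoProvider for a WebSocket-backed or HTTP-backed implementation would not change the calling code.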

IFlowCliProvider: A WebSocket-Based Implementation

iflow uses WebSocket for ACP communication. The overall architecture looks like this:

IFlowCliProvider → ACPSessionManager → WebSocketAcpTransport → iflow CLI
Dynamic port allocation + process management

The core flow is also fairly straightforward:

  1. ACPSessionManager creates and manages ACP sessions.
  2. WebSocketAcpTransport handles WebSocket communication.
  3. A port is allocated dynamically, and the iflow process is started with iflow --experimental-acp --port.
  4. IAIRequestToAcpMapper and IAcpToAIResponseMapper convert requests and responses.

Here is the core code:

private async IAsyncEnumerable<AIStreamingChunk> StreamCoreAsync(
AIRequest request,
string? embeddedCommandPrompt,
[EnumeratorCancellation] CancellationToken cancellationToken)
{
// Resolve working directory
var resolvedWorkingDirectory = ResolveWorkingDirectory(request);
var effectiveRequest = ApplyEmbeddedCommandPrompt(request, embeddedCommandPrompt);
// Create ACP session
await using var session = await _sessionManager.CreateSessionAsync(
Name,
resolvedWorkingDirectory,
cancellationToken,
request.SessionId);
// Send prompt
var prompt = _requestMapper.ToPromptString(effectiveRequest);
var promptResponse = await session.SendPromptAsync(prompt, cancellationToken);
// Receive streaming response
await foreach (var notification in session.ReceiveUpdatesAsync(cancellationToken))
{
if (_responseMapper.TryConvertToStreamingChunk(notification, out var chunk))
{
if (chunk.Type == StreamingChunkType.Metadata && chunk.IsComplete)
{
yield return chunk;
yield break;
}
yield return chunk;
}
}
}

There are a few design points worth calling out here:

  • Use await using to ensure the session is released correctly and avoid resource leaks.
  • Return streaming responses through IAsyncEnumerable, which naturally supports async streams.
  • Use Metadata chunks to determine completion and ensure the full response has been received.
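
The completion rule from the last point can be sketched in TypeScript with an async generator. The Chunk type and untilComplete are hypothetical names, and the chunk shape is simplified compared with AIStreamingChunk.

```typescript
type Chunk =
  | { type: "ContentDelta"; content: string }
  | { type: "Metadata"; isComplete: boolean };

// Forward chunks until a Metadata chunk marks the response as complete.
async function* untilComplete(updates: AsyncIterable<Chunk>): AsyncGenerator<Chunk> {
  for await (const chunk of updates) {
    yield chunk;
    if (chunk.type === "Metadata" && chunk.isComplete) {
      return; // Exiting the loop closes the upstream iterator as well.
    }
  }
}
```

Because leaving a for await loop early calls the upstream iterator's return method, any cleanup in the source (closing a socket, for instance) still runs when the consumer stops at the completion marker.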

OpenCodeCliProvider: An HTTP API-Based Implementation

OpenCode provides its service through an HTTP API, so the architecture is slightly different:

OpenCodeCliProvider → OpenCodeRuntimeManager → OpenCodeClient → OpenCode HTTP API
OpenCodeProcessManager → opencode process management

A notable feature of the OpenCode integration is that session bindings are persisted in an SQLite database, so a bound session and its prompt responses can be recovered later:

private async Task<OpenCodePromptExecutionResult> ExecutePromptAsync(
AIRequest request,
string? embeddedCommandPrompt,
CancellationToken cancellationToken)
{
var prompt = BuildPrompt(request, embeddedCommandPrompt);
var resolvedWorkingDirectory = ResolveWorkingDirectory(request.WorkingDirectory);
var client = await _runtimeManager.GetClientAsync(resolvedWorkingDirectory, cancellationToken);
var bindingSessionId = request.SessionId;
var boundSession = TryGetBinding(bindingSessionId, resolvedWorkingDirectory);
// Try to use the already bound session
if (boundSession is not null)
{
try
{
return await PromptSessionAsync(
client,
boundSession,
BuildPromptRequest(request, prompt, CreatePromptMessageId()),
request.Model ?? _settings.Model,
cancellationToken);
}
catch (OpenCodeApiException ex) when (IsStaleBinding(ex))
{
// The session has expired, remove the binding
RemoveBinding(bindingSessionId);
}
}
// Create a new session
var session = await client.Session.CreateAsync(new OpenCodeSessionCreateRequest
{
Title = BuildSessionTitle(request)
}, cancellationToken);
BindSession(bindingSessionId, session.Id, resolvedWorkingDirectory);
return await PromptSessionAsync(client, session.Id, ...);
}

This implementation has several interesting highlights:

  • Session binding mechanism: the same SessionId reuses the same OpenCode session, avoiding repeated session creation.
  • Expiration handling: when a session is found to be expired, the binding is automatically cleaned up.
  • Database persistence: bindings are stored in SQLite and remain effective after restart.
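
The bind, reuse, and clean-up cycle described above can be reduced to a few lines. The sketch below is TypeScript rather than C#, keeps bindings in a Map instead of SQLite, and uses hypothetical names throughout (SessionClient, SessionBindings, staleSessionError).

```typescript
// Hypothetical stand-in for an API error meaning "this session no longer exists".
function staleSessionError(id: string): Error {
  const error = new Error(`session ${id} no longer exists`);
  error.name = "StaleSessionError";
  return error;
}

function isStale(error: unknown): boolean {
  return error instanceof Error && error.name === "StaleSessionError";
}

interface SessionClient {
  createSession(): Promise<string>;
  prompt(sessionId: string, text: string): Promise<string>;
}

class SessionBindings {
  // In-memory stand-in for the SQLite table described in the article.
  private readonly bindings = new Map<string, string>();

  async prompt(client: SessionClient, bindingId: string, text: string): Promise<string> {
    const bound = this.bindings.get(bindingId);
    if (bound !== undefined) {
      try {
        return await client.prompt(bound, text); // Reuse the bound session.
      } catch (error) {
        if (!isStale(error)) throw error;
        this.bindings.delete(bindingId); // Drop the stale binding and fall through.
      }
    }
    const created = await client.createSession();
    this.bindings.set(bindingId, created);
    return client.prompt(created, text);
  }
}
```

Only errors recognized as stale trigger a new session; anything else is rethrown so real failures are not hidden behind a silent retry.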

The two implementations compare as follows:

| Aspect | IFlowCliProvider | OpenCodeCliProvider |
| --- | --- | --- |
| Communication | WebSocket (ACP) | HTTP API |
| Process management | ACPSessionManager | OpenCodeProcessManager |
| Port allocation | Dynamic port | No port (uses HTTP) |
| Session management | ACPSession | OpenCodeSession |
| Persistence | In-memory cache | SQLite database |
| Startup command | iflow --experimental-acp --port | opencode |
| Latency | Lower (long-lived connection) | Relatively higher (HTTP requests) |

Which approach you choose depends mainly on your needs. WebSocket is better for scenarios with high real-time requirements, while an HTTP API is simpler and easier to debug.

First, enable the two providers in the configuration file:

AI:
  Providers:
    IFlowCli:
      Type: "IFlowCli"
      Enabled: true
      ExecutablePath: "iflow"
      Model: null
      WorkingDirectory: null
    OpenCodeCli:
      Type: "OpenCodeCli"
      Enabled: true
      ExecutablePath: "opencode"
      Model: "anthropic/claude-sonnet-4"
      WorkingDirectory: null
OpenCode:
  Enabled: true
  BaseUrl: "http://localhost:38376"
  ExecutablePath: "opencode"
  StartupTimeoutSeconds: 30
  RequestTimeoutSeconds: 120

With the configuration in place, IFlowCliProvider can be used like this:

// Get provider through the factory
var provider = await _providerFactory.GetProviderAsync(AIProviderType.IFlowCli);
// Execute an AI request
var request = new AIRequest
{
Prompt = "Please help me refactor this function",
WorkingDirectory = "/path/to/project",
Model = "claude-sonnet-4"
};
// Get the complete response
var response = await provider.ExecuteAsync(request, cancellationToken);
Console.WriteLine(response.Content);
// Or use streaming responses
await foreach (var chunk in provider.StreamAsync(request, cancellationToken))
{
if (chunk.Type == StreamingChunkType.ContentDelta)
{
Console.Write(chunk.Content);
}
}

OpenCodeCliProvider is used in the same way:

// Get provider through the factory
var provider = await _providerFactory.GetProviderAsync(AIProviderType.OpenCodeCli);
var request = new AIRequest
{
Prompt = "Please help me analyze this error",
WorkingDirectory = "/path/to/project",
Model = "anthropic/claude-sonnet-4"
};
var response = await provider.ExecuteAsync(request, cancellationToken);
Console.WriteLine(response.Content);

Before startup or before use, you can check whether the provider is available:

var iflowResult = await iflowProvider.PingAsync(cancellationToken);
if (!iflowResult.Success)
{
Console.WriteLine($"IFlow is unavailable: {iflowResult.ErrorMessage}");
return;
}
var openCodeResult = await openCodeProvider.PingAsync(cancellationToken);
if (!openCodeResult.Success)
{
Console.WriteLine($"OpenCode is unavailable: {openCodeResult.ErrorMessage}");
return;
}

Both providers support embedded commands, such as /file:xxx:

var request = new AIRequest
{
Prompt = "Analyze the problems in this file",
SystemMessage = "You are a code analysis expert"
};
await foreach (var chunk in provider.SendMessageAsync(
request,
embeddedCommandPrompt: "/file:src/main.cs",
cancellationToken))
{
Console.Write(chunk.Content);
}

IFlow uses long-lived WebSocket connections, so resource management deserves special attention:

  • Use await using to ensure sessions are released properly.
  • Cancellation triggers process cleanup.
  • ACPSessionManager supports a maximum session count limit.

OpenCode process management is relatively simpler, and OpenCodeRuntimeManager handles it automatically.

Both providers handle errors, but they surface them differently:

  • IFlow errors are propagated through ACP session updates.
  • OpenCode errors are thrown through OpenCodeApiException.
  • It is recommended that the caller catch and handle these exceptions.

A few performance considerations:

  • IFlow WebSocket communication has lower latency than HTTP.
  • OpenCode session reuse can reduce the overhead of HTTP requests.
  • The factory cache mechanism avoids repeatedly creating providers.
  • In high-concurrency scenarios, pay close attention to the limits on process count and connection count.
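
One way to keep process and connection counts bounded under load is a small concurrency limiter in front of the provider calls. The sketch below is in TypeScript and is not taken from HagiCode; ConcurrencyLimiter is a hypothetical helper that shows the general technique (a counting semaphore).

```typescript
// Minimal counting semaphore: at most `limit` tasks run at the same time.
class ConcurrencyLimiter {
  private active = 0;
  private readonly waiting: Array<() => void> = [];
  private readonly limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  async run<T>(task: () => Promise<T>): Promise<T> {
    if (this.active >= this.limit) {
      // Wait until a finishing task hands its slot over.
      await new Promise<void>((resolve) => this.waiting.push(resolve));
    } else {
      this.active += 1;
    }
    try {
      return await task();
    } finally {
      const next = this.waiting.shift();
      if (next) {
        next(); // Transfer the slot directly; `active` stays unchanged.
      } else {
        this.active -= 1;
      }
    }
  }
}
```

Handing the slot straight to the next waiter, instead of decrementing and re-incrementing, prevents a newly arriving call from jumping ahead of queued ones.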

The executable path is validated at startup, but runtime issues can still happen. PingAsync is a useful tool for verifying whether the configuration is correct:

// Check at startup
var provider = await _providerFactory.GetProviderAsync(providerType);
var result = await provider.PingAsync(cancellationToken);
if (!result.Success)
{
_logger.LogError("Provider {ProviderType} is unavailable: {Error}", providerType, result.ErrorMessage);
}

This article shares the technical approach used by the HagiCode platform when integrating the two AI tools iflow and OpenCode. Through a unified IAIProvider interface, we adapted different communication styles, WebSocket and HTTP, while keeping the upper-layer calling pattern consistent.

The core idea is actually quite simple:

  1. Define a unified interface abstraction.
  2. Build adapter layers for different implementations.
  3. Manage everything uniformly through the factory pattern.

That gives the system good extensibility. When a new AI tool needs to be integrated later, all we need to do is implement the IAIProvider interface without changing too much existing code.

If you are also working on multi-AI-tool integration, I hope this article is helpful.


Complete Guide to Codex SDK Console Message Parsing

This article explains the Codex SDK event stream mechanism, message type parsing, and best practices in real projects, helping developers quickly master the core skills behind AI execution services.

When building an AI execution service based on the Codex SDK, we inevitably run into a practical question: how should we handle the streamed event messages returned by Codex? These messages contain important information such as execution status, output content, and error details, so they deserve careful handling.

As part of the HagiCode project, we needed a reliable executor for AI coding assistant scenarios. That is exactly why we decided to study the Codex SDK event stream mechanism in depth. After all, only by understanding how the underlying messages work can we build a truly enterprise-grade AI execution platform.

The Codex SDK is a programming-assistance SDK released by OpenAI. It returns execution results through an Event Stream. Unlike the traditional request-response model, Codex uses streamed events so that we can:

  • Get execution progress in real time
  • Handle errors promptly
  • Obtain detailed token usage statistics
  • Support long-running complex tasks

Understanding these event types and parsing them correctly is essential for implementing a fully capable AI executor. In the end, nobody wants to work with a black box.

The solution shared in this article comes from our practical experience in the HagiCode project. HagiCode is an open-source AI coding assistant project dedicated to providing developers with intelligent coding support. During development, we needed to build a reliable AI execution service to handle user code execution requests, which is the direct reason we introduced the Codex SDK.

As an AI coding assistant, HagiCode needs to deal with a variety of complex code execution scenarios: getting execution progress in real time, handling errors promptly, and collecting detailed token usage statistics. By deeply understanding the Codex SDK event stream mechanism, we can build an executor that meets production environment requirements. Ultimately, whether it is software or real life, everything benefits from steady accumulation and refinement.

The Codex SDK uses the thread.runStreamed() method to return an asynchronous event iterator:

import { Codex } from '@openai/codex-sdk';
const client = new Codex({
apiKey: process.env.CODEX_API_KEY,
baseUrl: process.env.CODEX_BASE_URL,
});
const thread = client.startThread({
workingDirectory: '/path/to/project',
skipGitRepoCheck: false,
});
const { events } = await thread.runStreamed('your prompt here', {
outputSchema: {
type: 'object',
properties: {
output: { type: 'string' },
status: { type: 'string', enum: ['ok', 'action_required'] },
},
required: ['output', 'status'],
},
});
for await (const event of events) {
// Handle each event
}

The main event types are:

| Event Type | Description | Key Data |
| --- | --- | --- |
| thread.started | Thread started successfully | thread_id |
| item.updated | Message content updated | item.text |
| item.completed | Message completed | item.text |
| turn.completed | Execution completed | usage (token usage) |
| turn.failed | Execution failed | error.message |
| error | Error event | message |
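
The event types listed above lend themselves to a discriminated union and a single reducer. The sketch below declares its own simplified ThreadEvent type with only the fields named in the table; these shapes are assumptions for illustration and are not copied from the SDK's type definitions.

```typescript
// Assumed, simplified event shapes limited to the fields listed in the table.
type ThreadEvent =
  | { type: "thread.started"; thread_id: string }
  | { type: "item.updated" | "item.completed"; item: { type: string; text?: string } }
  | { type: "turn.completed"; usage: Record<string, number> }
  | { type: "turn.failed"; error: { message: string } }
  | { type: "error"; message: string };

interface RunState {
  threadId: string | null;
  text: string;
  usage: Record<string, number> | null;
  error: string | null;
}

const initialState: RunState = { threadId: null, text: "", usage: null, error: null };

// Fold one event into the accumulated state; the switch is exhaustive over `type`.
function reduceEvent(state: RunState, event: ThreadEvent): RunState {
  switch (event.type) {
    case "thread.started":
      return { ...state, threadId: event.thread_id };
    case "item.updated":
    case "item.completed":
      return event.item.type === "agent_message" && event.item.text !== undefined
        ? { ...state, text: event.item.text }
        : state;
    case "turn.completed":
      return { ...state, usage: event.usage };
    case "turn.failed":
      return { ...state, error: event.error.message };
    case "error":
      return { ...state, error: event.message };
  }
}
```

Keeping the handling in one pure function makes it easy to replay a recorded event sequence in a unit test.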

In real projects, HagiCode’s executor component is built on top of these event types. We need to handle each kind of event carefully to ensure a smooth user experience. Good systems are built by taking details seriously.

Message content is extracted through an event handler:

private handleThreadEvent(event: ThreadEvent, onMessage: (content: string) => void): void {
// Only handle message update and completion events
if (event.type !== 'item.updated' && event.type !== 'item.completed') {
return;
}
// Only handle agent message content
if (event.item.type !== 'agent_message') {
return;
}
// Extract text content
onMessage(event.item.text);
}

Key points:

  • Only handle item.updated and item.completed events
  • Only handle content of type agent_message
  • The message content is in the event.item.text field
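
If event.item.text carries the full message so far, as the executor code in this article assumes, a streaming UI needs only the newly added part. The helper below is a hypothetical sketch of that step; the restart on a non-matching prefix is an added safeguard, not behavior described in the article.

```typescript
// Turns cumulative snapshots ("He", "Hello", ...) into deltas ("He", "llo", ...).
class DeltaEmitter {
  private emitted = "";

  next(snapshot: string): string {
    if (!snapshot.startsWith(this.emitted)) {
      // The text was rewritten rather than extended; restart from the new snapshot.
      this.emitted = snapshot;
      return snapshot;
    }
    const delta = snapshot.slice(this.emitted.length);
    this.emitted = snapshot;
    return delta;
  }
}
```

An empty return value means the snapshot added nothing new, so the caller can skip the UI update.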

Codex supports JSON structured output. You can specify the return format through the outputSchema parameter:

const DEFAULT_OUTPUT_SCHEMA = {
type: 'object',
properties: {
output: { type: 'string' },
status: { type: 'string', enum: ['ok', 'action_required'] },
},
required: ['output', 'status'],
additionalProperties: false,
} as const;

The parsing function attempts to parse JSON, and if that fails it falls back to the raw text.

function toStructuredOutput(raw: string): StructuredOutput {
try {
const parsed = JSON.parse(raw) as Partial<StructuredOutput>;
if (typeof parsed.output === 'string') {
return {
output: parsed.output,
status: parsed.status === 'action_required' ? 'action_required' : 'ok',
};
}
} catch {
// JSON parsing failed, fall back to the raw text
}
return {
output: raw,
status: 'ok',
};
}

The streaming loop ties these pieces together and adds a timeout:

private async runWithStreaming(
thread: Thread,
input: CodexStageExecutionInput
): Promise<{ output: string; usage: Usage | null }> {
const abortController = new AbortController();
const timeoutHandle = setTimeout(() => {
abortController.abort();
}, Math.max(1000, input.timeoutMs));
let latestMessage = '';
let usage: Usage | null = null;
let emittedLength = 0;
try {
const { events } = await thread.runStreamed(input.prompt, {
outputSchema: DEFAULT_OUTPUT_SCHEMA,
signal: abortController.signal,
});
for await (const event of events) {
// Handle message content
this.handleThreadEvent(event, (nextContent) => {
const delta = nextContent.slice(emittedLength);
if (delta.length > 0) {
emittedLength = nextContent.length;
input.callbacks?.onChunk?.(delta); // Streaming callback
}
latestMessage = nextContent;
});
// Process different data based on the event type
if (event.type === 'thread.started') {
this.threadId = event.thread_id;
} else if (event.type === 'turn.completed') {
usage = event.usage;
} else if (event.type === 'turn.failed') {
throw new CodexExecutorError('gateway_unavailable', event.error.message, true);
} else if (event.type === 'error') {
throw new CodexExecutorError('gateway_unavailable', event.message, true);
}
}
} catch (error) {
if (abortController.signal.aborted) {
throw new CodexExecutorError(
'upstream_timeout',
`Codex stage timed out after ${input.timeoutMs}ms`,
true
);
}
throw error;
} finally {
clearTimeout(timeoutHandle);
}
const structured = toStructuredOutput(latestMessage);
return { output: structured.output, usage };
}

Map specific error patterns to concrete error codes so the upper layers can handle them more easily:

function mapError(error: unknown): CodexExecutorError {
if (error instanceof CodexExecutorError) {
return error;
}
const message = error instanceof Error ? error.message : String(error);
const normalized = message.toLowerCase();
// Authentication errors - not retryable
if (normalized.includes('401') ||
normalized.includes('403') ||
normalized.includes('api key') ||
normalized.includes('auth')) {
return new CodexExecutorError('auth_invalid', message, false);
}
// Rate limit errors - retryable
if (normalized.includes('429') || normalized.includes('rate limit')) {
return new CodexExecutorError('rate_limited', message, true);
}
// Timeout errors - retryable
if (normalized.includes('timeout') || normalized.includes('aborted')) {
return new CodexExecutorError('upstream_timeout', message, true);
}
// Default error
return new CodexExecutorError('gateway_unavailable', message, true);
}

The error codes and the error class are defined as follows:

export type CodexErrorCode =
| 'auth_invalid' // Authentication failure
| 'upstream_timeout' // Upstream timeout
| 'rate_limited' // Rate limited
| 'gateway_unavailable'; // Gateway unavailable
export class CodexExecutorError extends Error {
readonly code: CodexErrorCode;
readonly retryable: boolean;
constructor(code: CodexErrorCode, message: string, retryable: boolean) {
super(message);
this.name = 'CodexExecutorError';
this.code = code;
this.retryable = retryable;
}
}

Working Directory and Environment Configuration

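By default, the Codex SDK requires the working directory to be a valid Git repository. The check can be skipped with skipGitRepoCheck, so the validation below mirrors that behavior: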

export function validateWorkingDirectory(
workingDirectory: string,
skipGitRepoCheck: boolean
): void {
const resolvedWorkingDirectory = path.resolve(workingDirectory);
if (!existsSync(resolvedWorkingDirectory)) {
throw new CodexExecutorError(
'gateway_unavailable',
'Working directory does not exist.',
false
);
}
if (!statSync(resolvedWorkingDirectory).isDirectory()) {
throw new CodexExecutorError(
'gateway_unavailable',
'Working directory is not a directory.',
false
);
}
if (skipGitRepoCheck) {
return;
}
const gitDir = path.join(resolvedWorkingDirectory, '.git');
if (!existsSync(gitDir)) {
throw new CodexExecutorError(
'gateway_unavailable',
'Working directory is not a git repository.',
false
);
}
}

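The executor loads environment variables from an interactive login shell so that commands run by the AI Agent see the same PATH and tooling as the user's terminal:

import { spawnSync } from 'node:child_process';

function parseEnvironmentOutput(output: Buffer): Record<string, string> {
  const parsed: Record<string, string> = {};
  // `env -0` separates entries with NUL bytes, so values may contain newlines
  for (const entry of output.toString('utf8').split('\0')) {
    if (!entry) continue;
    const separatorIndex = entry.indexOf('=');
    if (separatorIndex <= 0) continue;
    const key = entry.slice(0, separatorIndex);
    const value = entry.slice(separatorIndex + 1);
    if (key.length > 0) {
      parsed[key] = value;
    }
  }
  return parsed;
}

function tryLoadEnvironmentFromShell(shellPath: string): Record<string, string> | null {
  // -i: interactive, -l: login, -c: run the command and exit
  const result = spawnSync(shellPath, ['-ilc', 'env -0'], {
    env: process.env,
    stdio: ['ignore', 'pipe', 'pipe'],
    timeout: 5000,
  });
  if (result.error || result.status !== 0) {
    return null;
  }
  return parseEnvironmentOutput(result.stdout);
}

// Assumption: a minimal stand-in for the loader, which this article does not show.
// It tries the user's shell and falls back to an empty result on failure.
function loadConsoleEnvironmentFromShell(): Record<string, string> {
  return tryLoadEnvironmentFromShell(process.env.SHELL ?? '/bin/bash') ?? {};
}

export function createExecutorEnvironment(
  envOverrides: Record<string, string> = {}
): Record<string, string> {
  // Load environment variables from the login shell
  const consoleEnv = loadConsoleEnvironmentFromShell();
  // process.env values may be undefined, so drop those before merging
  const baseEnv: Record<string, string> = {};
  for (const [key, value] of Object.entries(process.env)) {
    if (value !== undefined) baseEnv[key] = value;
  }
  return {
    ...baseEnv,
    ...consoleEnv,
    ...envOverrides,
  };
}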
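
The merged environment above passes everything from process.env through. If only selected variables should reach the agent, a filter along these lines is one option; pickEnvironment and its parameters are hypothetical and not part of the SDK or of HagiCode.

```typescript
// Keep only whitelisted keys, then apply explicit overrides on top.
function pickEnvironment(
  source: Record<string, string | undefined>,
  allowedKeys: readonly string[],
  overrides: Record<string, string> = {},
): Record<string, string> {
  const picked: Record<string, string> = {};
  for (const key of allowedKeys) {
    const value = source[key];
    if (value !== undefined) picked[key] = value;
  }
  return { ...picked, ...overrides };
}
```

Listing allowed keys explicitly means a newly added secret in the parent environment is excluded by default rather than leaked by default.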

In the HagiCode project, we use the following approach to initialize the Codex client and execute tasks:

import { Codex } from '@openai/codex-sdk';
async function executeWithCodex(prompt: string, workingDir: string) {
const client = new Codex({
apiKey: process.env.CODEX_API_KEY,
env: { PATH: process.env.PATH },
});
const thread = client.startThread({
workingDirectory: workingDir,
});
const { events } = await thread.runStreamed(prompt);
let result = '';
for await (const event of events) {
if ((event.type === 'item.updated' || event.type === 'item.completed') && event.item.type === 'agent_message') {
result = event.item.text;
}
if (event.type === 'turn.completed') {
console.log('Token usage:', event.usage);
}
}
// Try to parse JSON output
try {
const parsed = JSON.parse(result);
return parsed.output;
} catch {
return result;
}
}

A fuller executor wraps the same flow in a class and adds retry handling:

export class CodexSdkExecutor {
private readonly config: CodexRuntimeConfig;
private readonly client: Codex;
private threadId: string | null = null;
async executeStage(input: CodexStageExecutionInput): Promise<CodexStageExecutionResult> {
const startedAt = Date.now(); // Used for latencyMs below
const maxAttempts = Math.max(1, this.config.retryCount + 1);
let attempt = 0;
let lastError: CodexExecutorError | null = null;
while (attempt < maxAttempts) {
attempt += 1;
try {
const thread = this.getThread(input.workingDirectory);
const { output, usage } = await this.runWithStreaming(thread, input);
return {
output,
usage,
threadId: this.threadId!,
attempts: attempt,
latencyMs: Date.now() - startedAt,
};
} catch (error) {
const mappedError = mapError(error);
lastError = mappedError;
// Non-retryable error or max retry attempts reached
if (!mappedError.retryable || attempt >= maxAttempts) {
throw mappedError;
}
// Wait before retrying; the delay grows linearly (1 s, 2 s, ...)
await new Promise(resolve => setTimeout(resolve, 1000 * attempt));
}
}
throw lastError!;
}
}

A few practices have proved useful in this setup.

Working directory:

  • Make sure the working directory is a valid Git repository
  • Use the PROJECT_ROOT environment variable to specify it explicitly
  • During debugging, you can set CODEX_SKIP_GIT_REPO_CHECK=true to skip the check

Environment variables:

  • Pass only the required environment variables through a whitelist mechanism
  • Use the login shell to load the full environment
  • Avoid passing sensitive information

Timeouts and retries:

  • Set reasonable timeouts based on task complexity
  • Implement exponential backoff for retryable errors
  • Record retry counts and reasons

Error handling:

  • Distinguish between retryable and non-retryable errors
  • Provide clear error messages and suggestions
  • Use unified error codes so upper layers can handle them consistently

Streaming output:

  • Implement incremental output callbacks to improve user experience
  • Correctly handle incremental message updates
  • Record token usage for cost analysis
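
The executor shown earlier waits 1000 * attempt milliseconds between attempts, which is linear. For the exponential backoff recommended in the list, a sketch could look like the following; retryWithBackoff, backoffDelay, and RetryOptions are hypothetical names, and the jitter strategy is a common choice rather than something the article prescribes.

```typescript
interface RetryOptions {
  maxAttempts: number;
  baseDelayMs: number;
  maxDelayMs: number;
  isRetryable: (error: unknown) => boolean;
  sleep?: (ms: number) => Promise<void>; // Injectable so tests need not wait.
  random?: () => number;
}

// Delay before retry number `attempt` (1-based): base * 2^(attempt - 1), capped, with full jitter.
function backoffDelay(attempt: number, baseDelayMs: number, maxDelayMs: number, random: () => number): number {
  const ceiling = Math.min(maxDelayMs, baseDelayMs * 2 ** (attempt - 1));
  return Math.floor(random() * ceiling);
}

async function retryWithBackoff<T>(task: () => Promise<T>, options: RetryOptions): Promise<T> {
  const sleep = options.sleep ?? ((ms: number) => new Promise<void>((r) => setTimeout(r, ms)));
  const random = options.random ?? Math.random;
  for (let attempt = 1; ; attempt += 1) {
    try {
      return await task();
    } catch (error) {
      // Give up on non-retryable errors or once the attempt budget is spent.
      if (!options.isRetryable(error) || attempt >= options.maxAttempts) throw error;
      await sleep(backoffDelay(attempt, options.baseDelayMs, options.maxDelayMs, random));
    }
  }
}
```

With the error class from this article, isRetryable could simply read the retryable flag of a mapped CodexExecutorError.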

In the actual production environment of the HagiCode project, we have already verified the effectiveness of the best practices above. This approach has helped us build a stable and reliable AI execution service. In the end, practical validation matters more than theory alone.

The Codex SDK event stream mechanism provides strong capabilities for building AI execution services. By correctly parsing different kinds of events, we can:

  • Get execution status and output in real time
  • Implement reliable error handling and retry mechanisms
  • Obtain detailed execution statistics
  • Build a full-featured AI execution platform

The core concepts and code samples introduced in this article can be applied directly in real projects, helping developers get started quickly with Codex SDK integration. If you find this approach valuable, it also reflects the strength of HagiCode’s engineering practice and makes HagiCode itself worth following.


This content was created with AI-assisted collaboration, reviewed by the author, and reflects the author’s own views and positions.

HagiCode Multi-AI Provider Switching and Interoperability Implementation Plan

In the modern developer-tooling ecosystem, developers often need to use different AI coding assistants to support their work. Anthropic’s Claude Code CLI and OpenAI’s Codex CLI each have their own strengths: Claude is known for outstanding code understanding and long-context handling, while Codex excels at code generation and tool usage.

This article takes an in-depth look at how the HagiCode project achieves seamless switching and interoperability across multiple AI providers, including the core architectural design, key implementation details, and practical considerations.

The core challenge faced by the HagiCode project is supporting multiple AI CLIs on the same platform, so users can:

  1. Flexibly switch between AI providers based on their needs
  2. Maintain session continuity during provider switching
  3. Unify the API differences across different CLIs behind a common abstraction
  4. Reserve extension points for adding new AI providers in the future

Doing so raises several technical challenges:

  1. Unifying interface differences: Claude Code CLI is invoked through command-line calls, while Codex CLI uses a JSON event stream
  2. Handling streaming responses: Both providers support streaming responses, but with different data formats
  3. Tool-calling semantics: Claude and Codex differ in how they represent tool calls and manage their lifecycle
  4. Session lifecycle: The system must correctly manage session creation, restoration, and termination for each provider

HagiCode uses the Provider Pattern combined with the Factory Pattern to abstract AI service invocation. The core ideas of this design are:

  1. Unified interface abstraction: Define the IAIProvider interface as the common abstraction for all AI providers
  2. Factory-created instances: Use AIProviderFactory to dynamically create the corresponding provider instance based on type
  3. Intelligent selection logic: Use AIProviderSelector to automatically select the most suitable provider based on scenario and configuration
  4. Session state management: Persist the binding relationship between sessions and CLI threads in the database

The main components are:

| Component | Responsibility | Language |
| --- | --- | --- |
| IAIProvider | Unified provider interface | C# |
| AIProviderFactory | Create and manage provider instances | C# |
| AIProviderSelector | Select providers intelligently | C# |
| ClaudeCodeCliProvider | Claude Code CLI implementation | C# |
| CodexCliProvider | Codex CLI implementation | C# |
| AgentCliManager | Desktop-side CLI management | TypeScript |

The IAIProvider interface defines the unified provider abstraction:

public interface IAIProvider
{
/// <summary>
/// Provider display name
/// </summary>
string Name { get; }
/// <summary>
/// Whether streaming responses are supported
/// </summary>
bool SupportsStreaming { get; }
/// <summary>
/// Provider capability description
/// </summary>
ProviderCapabilities Capabilities { get; }
/// <summary>
/// Execute a single AI request
/// </summary>
Task<AIResponse> ExecuteAsync(AIRequest request, CancellationToken cancellationToken = default);
/// <summary>
/// Execute a streaming AI request
/// </summary>
IAsyncEnumerable<AIStreamingChunk> StreamAsync(AIRequest request, CancellationToken cancellationToken = default);
/// <summary>
/// Check provider connectivity and responsiveness
/// </summary>
Task<ProviderTestResult> PingAsync(CancellationToken cancellationToken = default);
/// <summary>
/// Send a message with an embedded command
/// </summary>
IAsyncEnumerable<AIStreamingChunk> SendMessageAsync(
AIRequest request,
string? embeddedCommandPrompt = null,
CancellationToken cancellationToken = default);
}

Key characteristics of this interface design:

  • Unified request/response model: All providers use the same AIRequest and AIResponse types
  • Streaming support: Standardize streaming output through IAsyncEnumerable<AIStreamingChunk>
  • Capability description: ProviderCapabilities describes the features supported by the provider (streaming, tools, maximum tokens, and so on)
  • Embedded commands: SendMessageAsync supports embedding OpenSpec commands into prompts

The supported provider types are enumerated in AIProviderType:

public enum AIProviderType
{
ClaudeCodeCli, // Anthropic Claude Code
OpenCodeCli, // Other CLIs (extensible)
GitHubCopilot, // GitHub Copilot
CodebuddyCli, // Codebuddy
CodexCli // OpenAI Codex
}

This enum provides a type-safe representation for all providers supported by the system.

The AIProviderFactory is responsible for creating and managing provider instances:

public class AIProviderFactory : IAIProviderFactory
{
private readonly ConcurrentDictionary<AIProviderType, IAIProvider> _cache;
private readonly IOptions<AIProviderOptions> _options;
private readonly IServiceProvider _serviceProvider;
private readonly ILogger<AIProviderFactory> _logger; // Used by the warning below
public Task<IAIProvider?> GetProviderAsync(AIProviderType providerType)
{
// Use caching to avoid duplicate creation
if (_cache.TryGetValue(providerType, out var cached))
return Task.FromResult<IAIProvider?>(cached);
// Get provider configuration from settings
var aiOptions = _options.Value;
if (!aiOptions.Providers.TryGetValue(providerType, out var config))
{
_logger.LogWarning("Provider '{ProviderType}' not found in configuration", providerType);
return Task.FromResult<IAIProvider?>(null);
}
// Create provider by type
var provider = providerType switch
{
AIProviderType.ClaudeCodeCli =>
_serviceProvider.GetService(typeof(ClaudeCodeCliProvider)) as IAIProvider,
AIProviderType.CodexCli =>
_serviceProvider.GetService(typeof(CodexCliProvider)) as IAIProvider,
AIProviderType.GitHubCopilot =>
_serviceProvider.GetService(typeof(CopilotAIProvider)) as IAIProvider,
_ => null
};
if (provider != null)
{
_cache[providerType] = provider;
}
return Task.FromResult<IAIProvider?>(provider);
}
}

Advantages of the factory pattern:

  • Instance caching: Avoid repeatedly creating the same type of provider
  • Dependency injection: Create instances through IServiceProvider, with dependency injection support
  • Configuration-driven: Read provider settings from configuration files
  • Graceful failure: Return null when a provider is not configured or cannot be resolved, making it easier for upper layers to handle the miss

The AIProviderSelector implements provider-selection strategies:

public class AIProviderSelector : IAIProviderSelector
{
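private readonly BusinessLayerConfiguration _configuration;
private readonly IAIProviderFactory _providerFactory;
private readonly IMemoryCache _cache;
// Logger used below; the original snippet omits this field
private readonly ILogger<AIProviderSelector> _logger;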
public async Task<AIProviderType> SelectProviderAsync(
BusinessScenario scenario,
CancellationToken cancellationToken = default)
{
// 1. Try getting a provider from scenario mapping
if (_configuration.ScenarioProviderMapping.TryGetValue(scenario, out var providerType))
{
if (await IsProviderAvailableAsync(providerType, cancellationToken))
{
_logger.LogDebug("Selected provider '{Provider}' for scenario '{Scenario}'",
providerType, scenario);
return providerType;
}
_logger.LogWarning("Configured provider '{Provider}' for scenario '{Scenario}' is not available",
providerType, scenario);
}
// 2. Try the default provider
if (await IsProviderAvailableAsync(_configuration.DefaultProvider, cancellationToken))
{
_logger.LogDebug("Using default provider '{Provider}' for scenario '{Scenario}'",
_configuration.DefaultProvider, scenario);
return _configuration.DefaultProvider;
}
// 3. Try the fallback chain
foreach (var fallbackProvider in _configuration.FallbackChain)
{
if (await IsProviderAvailableAsync(fallbackProvider, cancellationToken))
{
_logger.LogInformation("Using fallback provider '{Provider}' for scenario '{Scenario}'",
fallbackProvider, scenario);
return fallbackProvider;
}
}
// 4. No available provider can be found
throw new InvalidOperationException(
$"No available AI provider found for scenario '{scenario}'");
}
public async Task<bool> IsProviderAvailableAsync(
AIProviderType providerType,
CancellationToken cancellationToken = default)
{
var cacheKey = $"provider_available_{providerType}";
// Use caching to reduce Ping calls
if (_configuration.EnableCache &&
_cache.TryGetValue<bool>(cacheKey, out var cached))
{
return cached;
}
var provider = await _providerFactory.GetProviderAsync(providerType);
var isAvailable = provider != null;
if (_configuration.EnableCache && isAvailable)
{
_cache.Set(cacheKey, isAvailable,
TimeSpan.FromSeconds(_configuration.CacheExpirationSeconds));
}
return isAvailable;
}
}

Selector strategy:

  • Scenario mapping first: First check whether the business scenario has a specific provider mapping
  • Fallback to default provider: Use the default provider if scenario mapping fails
  • Fallback chain as a final safeguard: Try providers in the fallback chain one by one
  • Availability caching: Cache provider availability checks to reduce Ping calls
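
The priority order described above can be reduced to a small pure function, which also makes the strategy easy to unit test. This is only a minimal sketch under stated assumptions: ProviderSelectionSketch, the string-based provider names, and the isAvailable callback are hypothetical stand-ins for AIProviderSelector, AIProviderType, and IsProviderAvailableAsync, not HagiCode's actual code.

```csharp
using System;
using System.Collections.Generic;

public static class ProviderSelectionSketch
{
    // Returns the first available candidate in priority order:
    // scenario mapping, then the default provider, then the fallback chain.
    public static string Select(
        string scenario,
        IReadOnlyDictionary<string, string> scenarioMapping,
        string defaultProvider,
        IEnumerable<string> fallbackChain,
        Func<string, bool> isAvailable)
    {
        var candidates = new List<string>();
        if (scenarioMapping.TryGetValue(scenario, out var mapped))
        {
            candidates.Add(mapped);
        }
        candidates.Add(defaultProvider);
        candidates.AddRange(fallbackChain);

        foreach (var candidate in candidates)
        {
            if (isAvailable(candidate))
            {
                return candidate;
            }
        }

        // Mirrors step 4 of the selector: nothing usable was found
        throw new InvalidOperationException(
            $"No available AI provider found for scenario '{scenario}'");
    }
}
```

With a mapping of CodeGeneration to CodexCli and ClaudeCodeCli as the default, the sketch returns CodexCli while it is available and falls back to ClaudeCodeCli once it is not.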

5. Claude Code CLI Provider Implementation

public class ClaudeCodeCliProvider : IAIProvider
{
private readonly ILogger<ClaudeCodeCliProvider> _logger;
private readonly IClaudeStreamManager _streamManager;
private readonly ProviderConfiguration _config;
public string Name => "ClaudeCodeCli";
public bool SupportsStreaming => true;
public ProviderCapabilities Capabilities { get; }
public async Task<AIResponse> ExecuteAsync(AIRequest request, CancellationToken cancellationToken = default)
{
_logger.LogInformation("Executing AI request with provider: {Provider}", Name);
var sessionOptions = ClaudeRequestMapper.MapToSessionOptions(request, _config);
var messages = _streamManager.SendMessageAsync(request.Prompt, sessionOptions, cancellationToken);
var responseBuilder = new StringBuilder();
ResultMessage? finalResult = null;
await foreach (var streamMessage in messages)
{
switch (streamMessage.Message)
{
case ResultMessage result:
finalResult = result;
responseBuilder.Append(result.Result);
break;
}
}
if (finalResult != null)
{
return ClaudeResponseMapper.MapToAIResponse(finalResult, Name);
}
return new AIResponse
{
Content = responseBuilder.ToString(),
FinishReason = FinishReason.Unknown,
Provider = Name
};
}
}

Characteristics of the Claude Code CLI provider:

  • Streaming manager integration: Use IClaudeStreamManager to communicate with the Claude CLI
  • CessionId session isolation: Use CessionId as the unique session identifier, distinct from the system sessionId
  • Working directory configuration: Support configuration of the working directory, permission mode, and more
  • Tool support: Support tool-permission settings such as AllowedTools and DisallowedTools
public class CodexCliProvider : IAIProvider
{
private readonly ILogger<CodexCliProvider> _logger;
private readonly CodexSettings _settings;
private readonly ConcurrentDictionary<string, string> _sessionThreadBindings;
public string Name => "CodexCli";
public bool SupportsStreaming => true;
public ProviderCapabilities Capabilities { get; }
public async IAsyncEnumerable<AIStreamingChunk> StreamAsync(
AIRequest request,
[EnumeratorCancellation] CancellationToken cancellationToken = default)
{
_logger.LogInformation("Executing streaming AI request with provider: {Provider}", Name);
var codex = CreateCodexClient();
var thread = ResolveThread(codex, request);
var currentTurn = 0;
var activeToolCalls = new Dictionary<string, AIToolCallDelta>();
await foreach (var threadEvent in thread.RunStreamedAsync(BuildPrompt(request), cancellationToken))
{
if (threadEvent is TurnStartedEvent)
{
currentTurn++;
}
switch (threadEvent)
{
case ItemCompletedEvent { Item: AgentMessageItem message }:
var messageText = message.Text ?? string.Empty;
yield return new AIStreamingChunk
{
Content = messageText,
Type = StreamingChunkType.ContentDelta,
IsComplete = false
};
break;
case ItemStartedEvent or ItemUpdatedEvent or ItemCompletedEvent:
var toolChunk = BuildToolChunk(threadEvent, currentTurn);
if (toolChunk?.ToolCallDelta != null)
{
yield return toolChunk;
}
break;
case TurnCompletedEvent turnCompleted:
activeToolCalls.Clear();
yield return new AIStreamingChunk
{
Content = string.Empty,
Type = StreamingChunkType.Metadata,
IsComplete = true,
Usage = MapUsage(turnCompleted.Usage)
};
break;
}
}
BindSessionThread(request.SessionId, thread.Id);
}
private CodexThread ResolveThread(Codex codex, AIRequest request)
{
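var sessionId = request.SessionId;
// Hypothetical helper: the original snippet does not show how threadOptions is built
var threadOptions = BuildThreadOptions(request);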
// Check whether there is already a bound thread
if (!string.IsNullOrWhiteSpace(sessionId) &&
_sessionThreadBindings.TryGetValue(sessionId, out var threadId) &&
!string.IsNullOrWhiteSpace(threadId))
{
_logger.LogInformation("Resuming Codex thread {ThreadId} for session {SessionId}", threadId, sessionId);
return codex.ResumeThread(threadId, threadOptions);
}
_logger.LogInformation("Starting new Codex thread for session {SessionId}", sessionId ?? "(none)");
return codex.StartThread(threadOptions);
}
}

Characteristics of the Codex CLI provider:

  • JSON event-stream handling: Parse Codex JSON event streams (TurnStarted, ItemStarted, TurnCompleted, and so on)
  • Session-thread binding: Persist the binding between sessions and threads with an SQLite database
  • Thread reuse: Support resuming existing threads to maintain session continuity
  • Tool-call tracking: Track active tool-call state and correctly handle the tool lifecycle

Codex CLI uses an SQLite database to persist the binding between sessions and threads:

public class CodexCliProvider : IAIProvider
{
private const int SessionThreadBindingRetentionDays = 30;
private readonly ConcurrentDictionary<string, string> _sessionThreadBindings;
private readonly string _sessionThreadBindingDatabaseConnectionString;
private readonly string _sessionThreadBindingDatabasePath;
private void BindSessionThread(string? sessionId, string? threadId)
{
if (string.IsNullOrWhiteSpace(sessionId) || string.IsNullOrWhiteSpace(threadId))
{
return;
}
// In-memory cache
_sessionThreadBindings.AddOrUpdate(sessionId, threadId, (_, _) => threadId);
// Persist to SQLite
PersistSessionThreadBinding(sessionId, threadId);
}
private void PersistSessionThreadBinding(string sessionId, string threadId)
{
try
{
using var connection = new SqliteConnection(_sessionThreadBindingDatabaseConnectionString);
connection.Open();
using var upsertCommand = connection.CreateCommand();
upsertCommand.CommandText =
"""
INSERT INTO SessionThreadBindings (SessionId, ThreadId, CreatedAtUtc, UpdatedAtUtc)
VALUES ($sessionId, $threadId, $createdAtUtc, $updatedAtUtc)
ON CONFLICT(SessionId) DO UPDATE SET
ThreadId = excluded.ThreadId,
UpdatedAtUtc = excluded.UpdatedAtUtc;
""";
var nowUtc = DateTimeOffset.UtcNow.ToString("O");
upsertCommand.Parameters.AddWithValue("$sessionId", sessionId);
upsertCommand.Parameters.AddWithValue("$threadId", threadId);
upsertCommand.Parameters.AddWithValue("$createdAtUtc", nowUtc);
upsertCommand.Parameters.AddWithValue("$updatedAtUtc", nowUtc);
upsertCommand.ExecuteNonQuery();
}
catch (Exception ex)
{
_logger.LogWarning(
ex,
"Failed to persist Codex session-thread binding for session {SessionId} to {DatabasePath}",
sessionId,
_sessionThreadBindingDatabasePath);
}
}
private void LoadPersistedSessionThreadBindings()
{
using var connection = new SqliteConnection(_sessionThreadBindingDatabaseConnectionString);
connection.Open();
using var loadCommand = connection.CreateCommand();
loadCommand.CommandText = "SELECT SessionId, ThreadId FROM SessionThreadBindings;";
using var reader = loadCommand.ExecuteReader();
while (reader.Read())
{
var sessionId = reader.GetString(0);
var threadId = reader.GetString(1);
_sessionThreadBindings[sessionId] = threadId;
}
}
}

Advantages of session-thread binding:

  • Session restoration: Previous sessions can be restored after a system restart
  • Thread reuse: The same session can reuse an existing Codex thread
  • Automatic cleanup: Bindings older than 30 days are cleaned up automatically
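
The snippet above shows the 30-day retention constant but not the cleanup itself. A minimal sketch of the expiry check might look like the following. It assumes that UpdatedAtUtc is stored in the round-trip "O" format, as PersistSessionThreadBinding does; BindingRetentionSketch and FindExpired are hypothetical names, and a real implementation would more likely issue a DELETE against the SQLite table than scan an in-memory dictionary.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class BindingRetentionSketch
{
    public const int RetentionDays = 30;

    // Returns the session IDs whose UpdatedAtUtc timestamp is older than the retention window.
    public static List<string> FindExpired(
        IReadOnlyDictionary<string, string> updatedAtUtcBySession,
        DateTimeOffset nowUtc)
    {
        var cutoff = nowUtc.AddDays(-RetentionDays);
        return updatedAtUtcBySession
            .Where(pair => DateTimeOffset.ParseExact(
                pair.Value, "O", CultureInfo.InvariantCulture) < cutoff)
            .Select(pair => pair.Key)
            .ToList();
    }
}
```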

hagicode-desktop manages CLI selection through AgentCliManager:

export enum AgentCliType {
ClaudeCode = 'claude-code',
Codex = 'codex',
// Future extensions: other CLIs such as Aider and Cursor
}
export class AgentCliManager {
private static readonly STORE_KEY = 'agentCliSelection';
private static readonly EXECUTOR_TYPE_MAP: Record<AgentCliType, string> = {
[AgentCliType.ClaudeCode]: 'ClaudeCodeCli',
[AgentCliType.Codex]: 'CodexCli',
};
constructor(private store: any) {}
async saveSelection(cliType: AgentCliType): Promise<void> {
const selection: StoredAgentCliSelection = {
cliType,
isSkipped: false,
selectedAt: new Date().toISOString(),
};
this.store.set(AgentCliManager.STORE_KEY, selection);
}
loadSelection(): StoredAgentCliSelection {
return this.store.get(AgentCliManager.STORE_KEY, {
cliType: null,
isSkipped: false,
selectedAt: null,
});
}
getCommandName(cliType: AgentCliType): string {
switch (cliType) {
case AgentCliType.ClaudeCode:
return 'claude';
case AgentCliType.Codex:
return 'codex';
default:
return 'claude';
}
}
getExecutorType(cliType: AgentCliType | null): string {
if (!cliType) return 'ClaudeCodeCli';
// Static members must be accessed through the class, not through `this`
return AgentCliManager.EXECUTOR_TYPE_MAP[cliType] || 'ClaudeCodeCli';
}
}

Example desktop-side IPC handler:

ipcMain.handle('llm:call-api', async (event, manifestPath, region) => {
if (!state.llmInstallationManager) {
return { success: false, error: 'LLM Installation Manager not initialized' };
}
try {
const prompt = await state.llmInstallationManager.loadPrompt(manifestPath, region);
// Determine the CLI command based on the user's selection
let commandName = 'claude';
if (state.agentCliManager) {
const selectedCliType = state.agentCliManager.getSelectedCliType();
if (selectedCliType) {
commandName = state.agentCliManager.getCommandName(selectedCliType);
}
}
// Execute with the selected CLI
const result = await state.llmInstallationManager.callApi(
prompt.filePath,
event.sender,
commandName
);
return result;
} catch (error) {
return {
success: false,
error: error instanceof Error ? error.message : 'Unknown error'
};
}
});

9. Codex’s Internal Model Provider System

Codex itself also supports multiple model providers via ModelProviderInfo configuration:

pub const OPENAI_PROVIDER_NAME: &str = "OpenAI";
pub const OLLAMA_OSS_PROVIDER_ID: &str = "ollama";
pub const LMSTUDIO_OSS_PROVIDER_ID: &str = "lmstudio";
pub fn built_in_model_providers() -> HashMap<String, ModelProviderInfo> {
use ModelProviderInfo as P;
[
("openai", P::create_openai_provider()),
(OLLAMA_OSS_PROVIDER_ID, create_oss_provider(DEFAULT_OLLAMA_PORT, WireApi::Responses)),
(LMSTUDIO_OSS_PROVIDER_ID, create_oss_provider(DEFAULT_LMSTUDIO_PORT, WireApi::Responses)),
]
.into_iter()
.map(|(k, v)| (k.to_string(), v))
.collect()
}
pub struct ModelProviderInfo {
pub name: String,
pub base_url: Option<String>,
pub env_key: Option<String>,
pub query_params: Option<HashMap<String, String>>,
pub http_headers: Option<HashMap<String, String>>,
pub request_max_retries: Option<u64>,
pub stream_max_retries: Option<u64>,
pub stream_idle_timeout_ms: Option<u64>,
pub requires_openai_auth: bool,
pub supports_websockets: bool,
}

Codex model-provider support includes:

  • Built-in providers: OpenAI, Ollama, and LM Studio
  • Custom providers: Users can add custom providers in config.toml
  • Retry strategy: Configurable retry counts for requests and streams
  • WebSocket support: Some providers support WebSocket transport

Configure multiple providers in appsettings.json:

{
"AI": {
"Providers": {
"DefaultProvider": "ClaudeCodeCli",
"Providers": {
"ClaudeCodeCli": {
"Type": "ClaudeCodeCli",
"Model": "claude-sonnet-4-20250514",
"WorkingDirectory": "/path/to/workspace",
"PermissionMode": "acceptEdits",
"AllowedTools": ["file-edit", "command-run", "bash"]
},
"CodexCli": {
"Type": "CodexCli",
"Model": "gpt-4.1",
"ExecutablePath": "codex",
"SandboxMode": "enabled",
"WebSearchMode": "auto",
"NetworkAccessEnabled": false
}
},
"ScenarioProviderMapping": {
"CodeAnalysis": "ClaudeCodeCli",
"CodeGeneration": "CodexCli",
"Refactoring": "ClaudeCodeCli",
"Debugging": "CodexCli"
},
"FallbackChain": ["CodexCli", "ClaudeCodeCli"]
},
"Selector": {
"EnableCache": true,
"CacheExpirationSeconds": 300
}
}
}
public class AIOrchestrator
{
private readonly IAIProviderFactory _providerFactory;
private readonly IAIProviderSelector _providerSelector;
private readonly ILogger<AIOrchestrator> _logger;
public AIOrchestrator(
IAIProviderFactory providerFactory,
IAIProviderSelector providerSelector,
ILogger<AIOrchestrator> logger)
{
_providerFactory = providerFactory;
_providerSelector = providerSelector;
_logger = logger;
}
public async Task<AIResponse> ProcessRequestAsync(
AIRequest request,
BusinessScenario scenario)
{
_logger.LogInformation("Processing request for scenario: {Scenario}", scenario);
try
{
// Select a provider intelligently
var providerType = await _providerSelector.SelectProviderAsync(scenario, request.CancellationToken);
// Get the provider instance
var provider = await _providerFactory.GetProviderAsync(providerType);
if (provider == null)
{
throw new InvalidOperationException($"Provider {providerType} not available");
}
_logger.LogInformation("Using provider: {Provider} for request", provider.Name);
// Execute the request
var response = await provider.ExecuteAsync(request, request.CancellationToken);
_logger.LogInformation("Request completed with provider: {Provider}, tokens used: {Tokens}",
provider.Name,
response.Usage?.TotalTokens ?? 0);
return response;
}
catch (Exception ex)
{
_logger.LogError(ex, "Failed to process request for scenario: {Scenario}", scenario);
throw;
}
}
}
public async IAsyncEnumerable<AIStreamingChunk> StreamResponseAsync(
AIRequest request,
BusinessScenario scenario)
{
var providerType = await _providerSelector.SelectProviderAsync(scenario);
var provider = await _providerFactory.GetProviderAsync(providerType);
if (provider == null)
{
throw new InvalidOperationException($"Provider {providerType} not available");
}
await foreach (var chunk in provider.StreamAsync(request))
{
// Process streaming chunks
switch (chunk.Type)
{
case StreamingChunkType.ContentDelta:
// Show text content in real time
await SendToClientAsync(chunk.Content);
break;
case StreamingChunkType.ToolCallDelta:
// Handle tool calls
await HandleToolCallAsync(chunk.ToolCallDelta);
break;
case StreamingChunkType.Metadata:
// Handle completion events and stats
if (chunk.IsComplete)
{
_logger.LogInformation("Stream completed, usage: {@Usage}", chunk.Usage);
}
break;
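case StreamingChunkType.Error:
// Handle errors
_logger.LogError("Stream error: {Error}", chunk.ErrorMessage);
throw new InvalidOperationException(chunk.ErrorMessage);
}
// An async iterator must yield; forward each chunk so callers can consume the stream
yield return chunk;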
}
}
public async Task<string> ExecuteOpenSpecCommandAsync(
string command,
string arguments,
BusinessScenario scenario)
{
var providerType = await _providerSelector.SelectProviderAsync(scenario);
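var provider = await _providerFactory.GetProviderAsync(providerType);
if (provider == null)
{
throw new InvalidOperationException($"Provider {providerType} not available");
}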
// Build an embedded command prompt
var commandPrompt = $"""
Execute the following OpenSpec command:
Command: {command}
Arguments: {arguments}
Please execute this command and return the results.
""";
var request = new AIRequest
{
Prompt = "Process this command request",
EmbeddedCommandPrompt = commandPrompt,
WorkingDirectory = Directory.GetCurrentDirectory()
};
var response = await provider.SendMessageAsync(request, commandPrompt);
return response.Content;
}

Before switching providers, it is recommended to call PingAsync first to ensure the target provider is available:

public async Task<bool> IsProviderHealthyAsync(AIProviderType providerType)
{
var provider = await _providerFactory.GetProviderAsync(providerType);
if (provider == null) return false;
var testResult = await provider.PingAsync();
return testResult.Success &&
testResult.ResponseTimeMs < 5000; // A response within 5 seconds is considered healthy
}

Use CessionId (Claude) or ThreadId (Codex) to ensure session isolation:

  • Claude Code CLI: use CessionId as the unique session identifier
  • Codex CLI: use ThreadId as the session identifier
// Claude Code CLI session options
var claudeSessionOptions = new ClaudeSessionOptions
{
CessionId = CessionId.New(), // Generate a unique ID
WorkingDirectory = workspacePath,
AllowedTools = allowedTools,
PermissionMode = PermissionMode.acceptEdits
};
// Codex thread options
var codexThreadOptions = new ThreadOptions
{
Model = "gpt-4.1",
SandboxMode = "enabled",
WorkingDirectory = workspacePath
};

Fallback mechanisms must be robust when a provider is unavailable, ensuring that at least one provider remains usable:

public async Task<AIResponse> ExecuteWithFallbackAsync(
AIRequest request,
List<AIProviderType> preferredProviders)
{
Exception? lastException = null;
foreach (var providerType in preferredProviders)
{
try
{
var provider = await _providerFactory.GetProviderAsync(providerType);
if (provider == null) continue;
// Try execution
return await provider.ExecuteAsync(request);
}
catch (Exception ex)
{
_logger.LogWarning(ex, "Provider {ProviderType} failed, trying next", providerType);
lastException = ex;
}
}
// All providers failed
throw new InvalidOperationException(
"All preferred providers failed. Last error: " + lastException?.Message,
lastException);
}

Validate settings for all configured providers at startup to avoid runtime errors:

public void ValidateConfiguration(AIProviderOptions options)
{
foreach (var (providerType, config) in options.Providers)
{
// Validate executable paths (for CLI-based providers)
if (IsCliBasedProvider(providerType))
{
if (string.IsNullOrWhiteSpace(config.ExecutablePath))
{
throw new ConfigurationException(
$"Provider {providerType} requires ExecutablePath");
}
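// A bare command name such as "codex" is resolved through PATH, so only check rooted paths on disk
if (Path.IsPathRooted(config.ExecutablePath) && !File.Exists(config.ExecutablePath))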
{
throw new ConfigurationException(
$"Executable not found for {providerType}: {config.ExecutablePath}");
}
}
// Validate API keys (for API-based providers)
if (IsApiBasedProvider(providerType))
{
if (string.IsNullOrWhiteSpace(config.ApiKey))
{
throw new ConfigurationException(
$"Provider {providerType} requires ApiKey");
}
}
// Validate model names
if (string.IsNullOrWhiteSpace(config.Model))
{
_logger.LogWarning("No model configured for {ProviderType}, using default", providerType);
}
}
}

Provider instances are cached, so pay attention to lifecycle management and memory usage:

// Clean up the cache periodically
public void ClearInactiveProviders(TimeSpan inactiveThreshold)
{
var now = DateTimeOffset.UtcNow;
var keysToRemove = new List<AIProviderType>();
foreach (var (type, instance) in _cache)
{
// Assume providers have a LastUsedTime property
if (instance.LastUsedTime.HasValue &&
now - instance.LastUsedTime.Value > inactiveThreshold)
{
keysToRemove.Add(type);
}
}
foreach (var key in keysToRemove)
{
_cache.TryRemove(key, out _);
_logger.LogInformation("Cleared inactive provider: {Provider}", key);
}
}

Log provider selection, switching, and execution in detail to make debugging easier:

public class AIProviderLogging
{
private readonly ILogger _logger;
public void LogProviderSelection(
BusinessScenario scenario,
AIProviderType selectedProvider,
SelectionReason reason)
{
_logger.LogInformation(
"[ProviderSelection] Scenario={Scenario}, Provider={Provider}, Reason={Reason}",
scenario,
selectedProvider,
reason);
}
public void LogProviderSwitch(
AIProviderType fromProvider,
AIProviderType toProvider,
string reason)
{
_logger.LogWarning(
"[ProviderSwitch] From={FromProvider} To={ToProvider}, Reason={Reason}",
fromProvider,
toProvider,
reason);
}
public void LogProviderError(
AIProviderType provider,
Exception error,
AIRequest request)
{
_logger.LogError(error,
"[ProviderError] Provider={Provider}, RequestLength={Length}, Error={Message}",
provider,
request.Prompt.Length,
error.Message);
}
}

Using concurrent collections such as ConcurrentDictionary ensures thread safety:

public class ThreadSafeProviderCache
{
private readonly ConcurrentDictionary<AIProviderType, IAIProvider> _cache = new();
private readonly ReaderWriterLockSlim _lock = new();
public IAIProvider? GetProvider(AIProviderType type)
{
// Read operations do not require a lock
if (_cache.TryGetValue(type, out var provider))
return provider;
// Creation requires a write lock
_lock.EnterWriteLock();
try
{
// Double-check
if (_cache.TryGetValue(type, out provider))
return provider;
var newProvider = CreateProvider(type);
if (newProvider != null)
{
_cache[type] = newProvider;
}
return newProvider;
}
finally
{
_lock.ExitWriteLock();
}
}
}
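
As an alternative to the explicit lock, the same once-only creation guarantee can usually be had from ConcurrentDictionary alone by storing Lazy<T> values. The following is a minimal sketch, not HagiCode's implementation: LazyProviderCache and its generic parameters are hypothetical, and unlike the version above it does not model a factory that returns null. Note that with ExecutionAndPublication a factory exception is cached and rethrown on later reads.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public sealed class LazyProviderCache<TKey, TProvider>
    where TKey : notnull
    where TProvider : class
{
    private readonly ConcurrentDictionary<TKey, Lazy<TProvider>> _cache = new();
    private readonly Func<TKey, TProvider> _create;

    public LazyProviderCache(Func<TKey, TProvider> create) => _create = create;

    // GetOrAdd may build several Lazy wrappers under contention, but only one is stored,
    // and only the stored wrapper's Value is ever evaluated, so _create runs once per key.
    public TProvider GetOrCreate(TKey key) =>
        _cache.GetOrAdd(
            key,
            k => new Lazy<TProvider>(
                () => _create(k),
                LazyThreadSafetyMode.ExecutionAndPublication))
        .Value;
}
```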

When the session-thread binding database schema changes, data migration must be considered:

public class SessionThreadMigration
{
public async Task MigrateAsync(string dbPath)
{
var version = await GetSchemaVersionAsync(dbPath);
if (version >= 2) return; // Already the latest version
// SqliteConnection expects a connection string, not a bare file path
using var connection = new SqliteConnection($"Data Source={dbPath}");
connection.Open();
// Migrate to v2: add the CreatedAtUtc column
if (version < 2)
{
_logger.LogInformation("Migrating SessionThreadBindings to v2...");
using var addColumnCommand = connection.CreateCommand();
addColumnCommand.CommandText = "ALTER TABLE SessionThreadBindings ADD COLUMN CreatedAtUtc TEXT;";
addColumnCommand.ExecuteNonQuery();
using var backfillCommand = connection.CreateCommand();
backfillCommand.CommandText =
"""
UPDATE SessionThreadBindings
SET CreatedAtUtc = COALESCE(NULLIF(UpdatedAtUtc, ''), $nowUtc)
WHERE CreatedAtUtc IS NULL OR CreatedAtUtc = '';
""";
backfillCommand.Parameters.AddWithValue("$nowUtc", DateTimeOffset.UtcNow.ToString("O"));
backfillCommand.ExecuteNonQuery();
}
await UpdateSchemaVersionAsync(dbPath, 2);
_logger.LogInformation("Migration to v2 completed");
}
}

HagiCode combines the provider pattern, factory pattern, and selector pattern to implement a flexible and extensible multi-AI provider architecture:

  • Unified interface abstraction: The IAIProvider interface hides the differences between CLIs
  • Dynamic instance creation: AIProviderFactory supports runtime creation of provider instances
  • Intelligent selection strategy: AIProviderSelector implements scenario-driven provider selection
  • Session state persistence: Database bindings ensure session continuity
  • Desktop integration: AgentCliManager supports user selection and configuration

The advantages of this architecture are:

  1. Extensibility: Adding a new AI provider only requires implementing the IAIProvider interface
  2. Testability: Providers can be tested and mocked independently
  3. Maintainability: Each provider implementation is isolated and has a single responsibility
  4. User-friendliness: Support both scenario-based automatic selection and manual switching

With this design, HagiCode successfully enables seamless switching and interoperability between Claude Code CLI and Codex CLI, giving developers a flexible and powerful AI coding assistant experience.


This content was created with AI-assisted collaboration, reviewed by me, and reflects my own views and position.

From TypeScript to C#: Cross-Language Porting Practice for the Codex SDK

This article is something of a labor of love for us: it records the full process of porting the official TypeScript Codex SDK to C#. Calling it a “port” makes it sound too easy. It was more like a long adventure, because the two languages have very different personalities, and we had to find a way to make them cooperate.

Codex is the AI Agent CLI tool released by OpenAI, and it is genuinely powerful. The official team provides a TypeScript SDK in the @openai/codex-sdk package. It interacts with the Codex CLI by calling the codex exec --experimental-json command and parsing a JSONL event stream.

The problem is that in the HagiCode project, we need to use it in a pure .NET environment - specifically in C# backend services and desktop applications. We could not reasonably introduce a Node.js runtime into a .NET project just to call a CLI tool. That would be far too cumbersome.

So we were left with two choices: maintain a complex Node.js bridge layer, or build a native C# SDK ourselves.

We chose the latter.

This article also comes directly from our hands-on experience in the HagiCode project. HagiCode is an open-source AI coding assistant project. In plain terms, it means maintaining multiple components at once: a VSCode extension on the frontend, AI services on the backend, and a cross-platform desktop client. That multi-language, multi-platform complexity is exactly why we needed a native C# SDK - we really did not want to run Node.js inside a .NET project.

If you find this article helpful, feel free to give us a star on GitHub: github.com/HagiCode-org/site. You can also visit the official website to learn more: hagicode.com. It is always encouraging when an open-source project receives support.

Before translating code one-to-one, we first had to understand the architectural design of both SDKs. You have to understand both sides before you can port them well.

The core architecture of the TypeScript SDK looks like this:

Codex (entry class)
└── CodexExec (executor, manages child processes)
    └── Thread (conversation thread)
        ├── run() / runStreamed() (buffered/streamed execution, both asynchronous)
        └── event stream parsing

The C# SDK keeps the same architectural layering, but adapts the implementation details. The overall idea is straightforward: preserve API consistency while fully leveraging C# language features in the implementation.

This is the most fundamental and also the most important part of the work. If the foundation is weak, everything that follows becomes harder.

TypeScript’s type system is more flexible than C#’s, and that is simply a fact. We needed to find an appropriate mapping strategy:

TypeScript            C#                 Notes
interface / type      record             C# uses record for immutable data structures
string | null         string?            Nullable reference type
boolean | undefined   bool?              Nullable Boolean
AsyncGenerator        IAsyncEnumerable   Async iterator

The event type system is a typical example. TypeScript uses union types to define events:

export type ThreadEvent =
| ThreadStartedEvent
| TurnStartedEvent
| TurnCompletedEvent
| ...

In C#, we use an inheritance hierarchy and pattern matching to achieve a similar effect:

public abstract record ThreadEvent(string Type);
public sealed record ThreadStartedEvent(string ThreadId) : ThreadEvent("thread.started");
public sealed record TurnStartedEvent() : ThreadEvent("turn.started");
public sealed record TurnCompletedEvent(Usage Usage) : ThreadEvent("turn.completed");
// ...

We chose record instead of class because event objects should be immutable, which matches the intent behind using plain objects in TypeScript. The sealed keyword also prevents additional inheritance and gives the compiler room to optimize.
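
To make the comparison with union types concrete, here is a minimal sketch of how a consumer can branch over the record hierarchy with a switch expression. The event records repeat the ones above; the Usage shape (InputTokens, OutputTokens) and the EventDescriber helper are assumptions for illustration only.

```csharp
using System;

public abstract record ThreadEvent(string Type);
public sealed record ThreadStartedEvent(string ThreadId) : ThreadEvent("thread.started");
public sealed record TurnStartedEvent() : ThreadEvent("turn.started");
public sealed record Usage(int InputTokens, int OutputTokens); // assumed shape
public sealed record TurnCompletedEvent(Usage Usage) : ThreadEvent("turn.completed");

public static class EventDescriber
{
    // Type and property patterns play the role of narrowing on a TypeScript union's tag.
    public static string Describe(ThreadEvent evt) => evt switch
    {
        ThreadStartedEvent started => $"thread {started.ThreadId} started",
        TurnStartedEvent => "turn started",
        TurnCompletedEvent { Usage: var usage } =>
            $"turn completed ({usage.InputTokens} in, {usage.OutputTokens} out)",
        _ => $"unhandled event '{evt.Type}'",
    };
}
```

Because these are records, two events with the same data also compare equal, which is handy in tests: new ThreadStartedEvent("t1") == new ThreadStartedEvent("t1") evaluates to true.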

Event parsing is the core of the entire SDK, because it determines whether we can correctly understand every message returned by the Codex CLI. If parsing is wrong, everything after that is wasted effort.

The TypeScript version uses JSON.parse() to parse each line of JSON:

export function parseEvent(line: string): ThreadEvent {
const data = JSON.parse(line);
// Handle different event types...
}

The C# version uses System.Text.Json.JsonDocument instead:

public static ThreadEvent Parse(string line)
{
using var document = JsonDocument.Parse(line);
var root = document.RootElement;
var type = GetRequiredString(root, "type", "event.type");
return type switch
{
"thread.started" => new ThreadStartedEvent(GetRequiredString(root, "thread_id", ...)),
"turn.started" => new TurnStartedEvent(),
"turn.completed" => new TurnCompletedEvent(ParseUsage(...)),
// ...
_ => new UnknownThreadEvent(type, root.Clone()),
};
}

There is one small but important trick here: root.Clone() is required, because elements from JsonDocument become invalid after the document is disposed. We need to retain a copy for unknown event types. That is simply one of the differences between C# JSON handling and JavaScript.
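
The lifetime rule is easy to see in isolation. The sketch below uses only System.Text.Json; JsonCloneSketch and ParseDetached are hypothetical names, not part of the SDK.

```csharp
using System;
using System.Text.Json;

public static class JsonCloneSketch
{
    // Returns an element that stays valid after the JsonDocument is disposed.
    public static JsonElement ParseDetached(string line)
    {
        using var document = JsonDocument.Parse(line);
        // Without Clone(), the returned element would still reference the document's
        // pooled buffer, and reading it after Dispose throws ObjectDisposedException.
        return document.RootElement.Clone();
    }
}
```

A caller can then keep the raw payload of an unknown event around for as long as it likes, for example reading ParseDetached(line).GetProperty("type").GetString() long after parsing has finished.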

This is where the two SDKs differ the most. Node.js and .NET have different runtime conventions, so the implementation has to adapt.

TypeScript uses Node.js’s spawn() function:

const child = spawn(this.executablePath, commandArgs, { env, signal });

C# uses .NET’s System.Diagnostics.Process:

using var process = new Process { StartInfo = startInfo };
process.Start();
// stdin/stdout/stderr must be managed manually

More specifically, the C# version needs to configure the process like this:

var startInfo = new ProcessStartInfo
{
FileName = _executablePath,
RedirectStandardInput = true,
RedirectStandardOutput = true,
RedirectStandardError = true,
UseShellExecute = false,
CreateNoWindow = true,
};

The biggest difference is the cancellation mechanism. TypeScript uses AbortSignal, which is part of the Web API and very convenient to work with:

const child = spawn(cmd, args, { signal: cancellationSignal });

C# uses CancellationToken instead:

public async IAsyncEnumerable<string> RunAsync(
CodexExecArgs args,
[EnumeratorCancellation] CancellationToken cancellationToken = default)
{
// Check cancellation status inside the loop
while (!cancellationToken.IsCancellationRequested)
{
// Process output...
}
// Terminate the process when cancellation is requested
if (cancellationToken.IsCancellationRequested)
{
try { process.Kill(entireProcessTree: true); } catch { }
}
}

At a high level, this is just another example of the difference between the Web API ecosystem and the .NET ecosystem.
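
The RunAsync snippet above is schematic, so here is a small runnable sketch of the same idea: an async iterator that reads lines and honors a CancellationToken. It is an illustration under assumptions, not the SDK's code. LineStreamSketch is a hypothetical name, the reader stands in for process.StandardOutput, and the ReadLineAsync(CancellationToken) overload requires .NET 7 or later.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Runtime.CompilerServices;
using System.Threading;

public static class LineStreamSketch
{
    // Yields lines until the reader is exhausted. If the token is cancelled,
    // enumeration ends with an OperationCanceledException.
    public static async IAsyncEnumerable<string> ReadLinesAsync(
        TextReader reader,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        while (true)
        {
            cancellationToken.ThrowIfCancellationRequested();
            var line = await reader.ReadLineAsync(cancellationToken);
            if (line is null)
            {
                yield break; // end of stream, e.g. the child process closed stdout
            }
            yield return line;
        }
    }
}
```

With a real child process, the caller would pass process.StandardOutput as the reader and call Kill(entireProcessTree: true) when the enumeration ends through cancellation, which is the manual cleanup that AbortSignal handles for spawn() in Node.js.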

Both SDKs implement the logic that converts JSON configuration into TOML configuration, because the Codex CLI accepts configuration overrides in TOML format. This part must remain completely consistent, otherwise the same configuration will behave differently in the two SDKs.

That is the kind of detail you cannot compromise on. Success or failure often comes down to details like this.
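
To give a feel for what that conversion involves, here is a rough sketch that flattens a JSON object into dotted key=value overrides with TOML-style values. It is not the actual logic of either SDK: TomlOverrideSketch and its methods are hypothetical, keys are assumed to be bare TOML keys that need no quoting, and strings reuse their JSON escaping, which is valid TOML for typical values but not for every edge case.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

public static class TomlOverrideSketch
{
    // Flattens nested objects into "parent.child=value" strings, in document order.
    public static List<string> Flatten(JsonElement config, string prefix = "")
    {
        var overrides = new List<string>();
        foreach (var property in config.EnumerateObject())
        {
            var key = prefix.Length == 0 ? property.Name : $"{prefix}.{property.Name}";
            if (property.Value.ValueKind == JsonValueKind.Object)
            {
                overrides.AddRange(Flatten(property.Value, key));
            }
            else
            {
                overrides.Add($"{key}={ToTomlValue(property.Value)}");
            }
        }
        return overrides;
    }

    private static string ToTomlValue(JsonElement value) => value.ValueKind switch
    {
        JsonValueKind.String => value.GetRawText(), // keeps the surrounding quotes
        JsonValueKind.Number => value.GetRawText(),
        JsonValueKind.True => "true",
        JsonValueKind.False => "false",
        JsonValueKind.Array =>
            "[" + string.Join(", ", value.EnumerateArray().Select(ToTomlValue)) + "]",
        // TOML has no null, and objects inside arrays are left out of this sketch
        _ => throw new NotSupportedException($"Cannot express {value.ValueKind} as a TOML value"),
    };
}
```

Whatever the real rules are, running the same input through both SDKs and comparing the resulting override strings is a cheap way to catch drift between the two implementations.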

We created the following project structure:

CodexSdk/
├── CodexSdk.csproj
├── Codex.cs # Entry class
├── CodexThread.cs # Conversation thread
├── CodexExec.cs # Executor
├── Events.cs # Event type definitions
├── Items.cs # Item type definitions
├── EventParser.cs # Event parser
├── OutputSchemaTempFile.cs # Temporary file management
└── ...

It is a fairly clean structure, and that helped a lot during the port.

The basic usage remains consistent with the TypeScript SDK:

using CodexSdk;
// Create a Codex instance
var codex = new Codex();
var thread = codex.StartThread();
// Execute a query
var result = await thread.RunAsync("Summarize this repository.");
Console.WriteLine(result.FinalResponse);

Streaming event handling takes advantage of C# pattern matching:

await foreach (var @event in thread.RunStreamedAsync("Analyze the code."))
{
    switch (@event)
    {
        case ItemCompletedEvent itemCompleted
            when itemCompleted.Item is AgentMessageItem msg:
            Console.WriteLine($"Assistant: {msg.Text}");
            break;
        case TurnCompletedEvent completed:
            Console.WriteLine($"Tokens: in={completed.Usage.InputTokens}");
            break;
        // Items arrive wrapped in an event, so match on the event's Item
        case ItemCompletedEvent { Item: CommandExecutionItem command }:
            Console.WriteLine($"Command: {command.Command}");
            break;
    }
}

During implementation, we collected several practical lessons:

  1. Process management: The C# version must manage the full process lifecycle manually, including process termination during cancellation. Use Kill(entireProcessTree: true) to make sure child processes are also cleaned up.

  2. Error handling: We throw InvalidOperationException for parsing errors, which keeps the error-handling style close to that of the TypeScript SDK.

  3. Resource cleanup: OutputSchemaTempFile implements IAsyncDisposable to ensure temporary files are cleaned up correctly.

  4. Environment variables: The C# version supports fully overriding process environment variables through CodexOptions.Env. It is a small feature, but a very practical one.

  5. Platform differences: The C# version does not include the TypeScript version’s logic for automatically locating binaries inside npm packages. Since .NET projects typically do not depend on npm, the path to the codex executable must be specified via the CODEX_EXECUTABLE environment variable or CodexPathOverride.
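Points 4 and 5 can be sketched as two small functions. This is written in TypeScript for comparison with the original SDK and is not the C# implementation: the LaunchOptions shape, the function names, and in particular the precedence of the explicit override over the CODEX_EXECUTABLE variable are assumptions made for illustration.

```typescript
type EnvMap = { [key: string]: string | undefined };

interface LaunchOptions {
  env?: { [key: string]: string }; // hypothetical counterpart of CodexOptions.Env
  codexPathOverride?: string; // hypothetical counterpart of CodexPathOverride
}

// If env is supplied it fully replaces the inherited environment;
// otherwise the child inherits the parent's variables.
function buildChildEnv(
  options: LaunchOptions,
  inherited: EnvMap,
): { [key: string]: string } {
  if (options.env) {
    return { ...options.env };
  }
  const copy: { [key: string]: string } = {};
  for (const key of Object.keys(inherited)) {
    const value = inherited[key];
    if (value !== undefined) {
      copy[key] = value;
    }
  }
  return copy;
}

// Assumed precedence: explicit override first, then CODEX_EXECUTABLE.
function resolveExecutable(options: LaunchOptions, inherited: EnvMap): string {
  const path = options.codexPathOverride || inherited["CODEX_EXECUTABLE"];
  if (!path) {
    throw new Error("codex executable path is not configured");
  }
  return path;
}

// Usage with a fake parent environment.
const parentEnv = { PATH: "/usr/bin", CODEX_EXECUTABLE: "/opt/codex/bin/codex" };
const isolatedEnv = buildChildEnv({ env: { ONLY: "this" } }, parentEnv);
const fromEnvVar = resolveExecutable({}, parentEnv);
const fromOverride = resolveExecutable({ codexPathOverride: "/custom/codex" }, parentEnv);
```

Failing fast when neither source provides a path gives a clearer error than letting process startup fail later.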

Porting a mature TypeScript SDK to C# is not just a matter of syntax conversion - it also requires understanding the design philosophies of both languages. TypeScript’s flexibility and JavaScript ecosystem features such as AbortSignal need appropriate counterparts in C#.

The key takeaway is this: maintaining API consistency matters more than maintaining implementation-level consistency. Users care about whether the interface is easy to use, not whether the internal implementation is identical. That sounds simple, but making those trade-offs takes judgment.

If you are working on a similar cross-language port, our advice is to understand the original SDK's architecture fully first, then translate it module by module, and finally use a complete test suite to verify behavioral consistency. This kind of work cannot be rushed.

Everything will work out in the end.




This content was created with AI-assisted collaboration, reviewed by the author, and reflects the author’s own views and position.