
Architecture Design

3 posts with the tag “Architecture Design”

Building an AI Adventure Party: A Practical Guide to Multi-Agent Collaboration Configuration in HagiCode

In modern software development, a single AI Agent is no longer enough for complex needs. How can multiple AI assistants from different companies collaborate within the same project? This article shares the multi-Agent collaboration configuration approach that the HagiCode project developed through real-world practice.

Many developers have likely had this experience: bringing an AI assistant into a project really does improve coding efficiency. But as requirements grow more complex, one AI Agent starts to fall short. You want it to handle code review, documentation generation, unit tests, and more at the same time, but the result is often that it cannot balance everything well, and output quality becomes inconsistent.

What is even more frustrating is that once you try to introduce multiple AI assistants, things get more complicated. Each Agent has its own configuration method, API interface, and execution logic, and they may even conflict with one another. It is like a sports team where every player is individually strong, but nobody knows how to coordinate, so the whole match turns into chaos.

The HagiCode project ran into the same problem during development. As a complex project involving a frontend VSCode extension, backend AI services, and a cross-platform desktop client, HagiCode (as of the 2026-03 version) needed to integrate multiple AI assistants from different companies at once: Claude Code, Codex, CodeBuddy, iFlow, and more. Figuring out how to let them coexist harmoniously in the same project while making the best use of their individual strengths became a critical problem we had to solve.

That alone would already be enough trouble. After all, who wants to deal with a group of AI tools fighting each other every day?

The approach shared in this article is the multi-Agent collaboration configuration practice we developed in the HagiCode project through real trial and error and repeated optimization. If you are also struggling with multiple AI assistants working together, this article may give you some ideas. Maybe. Every project is different, after all.

HagiCode is an AI coding assistant project that adopts an “adventure party” model in which multiple AI engines work together. Project repository: github.com/HagiCode-org/site.

The multi-Agent configuration approach shared here is one of the core techniques that allows HagiCode to maintain efficient development in complex projects. There is nothing especially mystical about it: it just turns a group of AIs into an adventure party that can actually coordinate.

HagiCode’s Multi-Agent Architecture Design

From “Going Solo” to “Team Collaboration”

In the early days of the HagiCode project, we also tried using a single AI Agent to handle everything. We quickly discovered a clear bottleneck in that approach: different tasks demand different strengths. Some tasks require stronger contextual understanding, while others need more precise code editing. One Agent has a hard time excelling at all of them.

That made us realize that multiple Agents had to work together. But the problem was this: how do you let AI products from different companies coexist peacefully in the same project? We needed to solve several core issues:

  1. Configuration management complexity: each Agent has different configuration methods, API interfaces, and execution modes
  2. Unified communication protocol: we need a standardized way for different Agents to exchange data
  3. Task coordination and division of labor: how do we assign work reasonably so each Agent can play to its strengths

With those questions in mind, we started designing HagiCode’s multi-Agent architecture. It was not really that complicated in the end; we just had to think it through clearly.

After multiple iterations, this is the architecture we settled on:

┌─────────────────────────────────────────────────────────────────┐
│ AIProviderFactory │
│ (Factory pattern for unified management of all AI Providers) │
├─────────────────────────────────────────────────────────────────┤
│ ClaudeCodeCli │ CodexCli │ CodebuddyCli │ IFlowCli │
│ (Anthropic) │ (OpenAI) │ (Zhipu GLM) │ (Zhipu) │
└─────────────────────────────────────────────────────────────────┘

The core idea is to let different AI Agents be managed by the same code through a unified Provider interface. At the same time, the factory pattern is used to dynamically create and configure these Providers, ensuring scalability and flexibility across the system.

It is like division of labor in daily life. Everyone has a role; here we simply turned that idea into code architecture.

Agent Types and Division of Responsibilities

Based on HagiCode’s real-world experience, we assigned different responsibilities to each Agent:

| Agent | Provider | Model | Primary Use |
| --- | --- | --- | --- |
| ClaudeCodeCli | Anthropic | glm-5-turbo | Generate technical solutions and Proposals |
| CodexCli | OpenAI/Zed | gpt-5.4 | Execute precise code changes |
| CodebuddyCli | Zhipu | glm-4.7 | Refine proposal descriptions and documentation |
| IFlowCli | Zhipu | glm-4.7 | Archive proposals and historical records (configuration at the time; now legacy-compatible only) |
| OpenCodeCli | - | - | General-purpose code editing |
| GitHubCopilot | Microsoft | - | Assisted programming and code completion |

The logic behind this division of labor is simple: every Agent has its own area of strength. Claude Code performs well at understanding and analyzing complex requirements, so it handles early solution design. Codex is more precise when modifying code, so it is better suited for concrete implementation work. CodeBuddy offers strong cost performance, which makes it a great fit for refining documentation.

After all, the right tool for the right job is usually the best choice. There are many roads to Rome; some are simply easier to walk than others.

To manage different AI Agents in a unified way, we first need a common way to obtain them. In HagiCode, the factory-facing interface for resolving providers looks like this:

public interface IAIProviderFactory
{
    // Resolves the configured IAIProvider implementation for a given engine
    Task<IAIProvider?> GetProviderAsync(AIProviderType providerType);
    Task<IAIProvider?> GetProviderAsync(string providerName, CancellationToken cancellationToken);
}

The interface looks simple, but it is the foundation of the entire multi-Agent system. With a unified interface, we can call AI products from different companies in exactly the same way, no matter what is underneath.

This is really just a matter of making complex things simple. Simple is beautiful, after all.
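To make the call side concrete, here is a hypothetical consumer. The `ProposalService` name and the dependency-injection wiring are our illustration, not code from the HagiCode repository; it only assumes the resolution methods shown above.

```csharp
// Hypothetical call site: the caller depends only on the resolution
// interface, never on a concrete vendor. Names are illustrative.
public sealed class ProposalService
{
    private readonly IAIProviderFactory _providers;

    public ProposalService(IAIProviderFactory providers) => _providers = providers;

    public async Task<IAIProvider?> PickEngineAsync(CancellationToken ct)
    {
        // The call shape is the same whether the name comes from
        // configuration or from a user setting.
        return await _providers.GetProviderAsync("ClaudeCodeCli", ct);
    }
}
```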

Once the interface is unified, the next question is how to create these Provider instances. HagiCode uses the factory pattern:

private IAIProvider? CreateProvider(AIProviderType providerType, ProviderConfiguration config)
{
    return providerType switch
    {
        AIProviderType.ClaudeCodeCli =>
            ActivatorUtilities.CreateInstance<ClaudeCodeCliProvider>(_serviceProvider, Options.Create(config)),
        AIProviderType.CodebuddyCli =>
            ActivatorUtilities.CreateInstance<CodebuddyCliProvider>(_serviceProvider, Options.Create(config)),
        AIProviderType.CodexCli =>
            ActivatorUtilities.CreateInstance<CodexCliProvider>(_serviceProvider, Options.Create(config)),
        AIProviderType.IFlowCli =>
            ActivatorUtilities.CreateInstance<IFlowCliProvider>(_serviceProvider, Options.Create(config)),
        _ => null
    };
}

This uses dependency injection through ActivatorUtilities.CreateInstance, which can dynamically create Provider instances at runtime while automatically injecting dependencies. The benefit of this design is that when a new Agent type is added, you only need to add the corresponding Provider class and then add one more case branch in the factory method. There is no need to modify the existing code at all.

That is reason enough. Who wants to rewrite a pile of old code every time a new feature is added?
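For context, `ActivatorUtilities.CreateInstance` pulls any constructor arguments it is not given explicitly from an `IServiceProvider`. A minimal sketch of how such a factory might be wired up at the composition root (the specific registrations are our assumption):

```csharp
using Microsoft.Extensions.DependencyInjection;

// Minimal composition-root sketch. ActivatorUtilities.CreateInstance will
// resolve each Provider's remaining constructor arguments from this
// container; only Options.Create(config) is supplied explicitly.
var services = new ServiceCollection();
services.AddLogging();                      // a typical Provider dependency
services.AddSingleton<AIProviderFactory>(); // the factory shown above
using var serviceProvider = services.BuildServiceProvider();

var factory = serviceProvider.GetRequiredService<AIProviderFactory>();
```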

To make configuration more flexible, we also implemented a type-mapping mechanism:

public static class AIProviderTypeExtensions
{
    private static readonly Dictionary<string, AIProviderType> _typeMap = new(
        StringComparer.OrdinalIgnoreCase)
    {
        ["ClaudeCodeCli"] = AIProviderType.ClaudeCodeCli,
        ["CodebuddyCli"] = AIProviderType.CodebuddyCli,
        ["CodexCli"] = AIProviderType.CodexCli,
        ["IFlowCli"] = AIProviderType.IFlowCli,
        // ...more type mappings
    };
}

The purpose of this mapping table is to convert string-form Provider names into enum types. This allows configuration files to use intuitive string names, while the internal code uses type-safe enums for processing.

Configuration should be as intuitive as possible. Nobody wants to memorize a pile of obscure code names.
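A mapping table like this is usually exposed through a small lookup helper. A sketch (the method name is ours):

```csharp
// Case-insensitive lookup from configuration string to enum. Returning
// false for unknown names, instead of throwing, lets bad config fail
// gracefully.
public static bool TryParse(string providerName, out AIProviderType providerType)
    => _typeMap.TryGetValue(providerName, out providerType);
```

Because the dictionary uses `StringComparer.OrdinalIgnoreCase`, "codexcli", "CodexCli", and "CODEXCLI" all resolve to the same enum value.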

In practice, everything can be configured in appsettings.json:

{
  "AI": {
    "Providers": {
      "Providers": {
        "ClaudeCodeCli": {
          "Enabled": true,
          "Model": "glm-5-turbo",
          "WorkingDirectory": "/path/to/project"
        },
        "CodebuddyCli": { "Enabled": true, "Model": "glm-4.7" },
        "CodexCli": { "Enabled": true, "Model": "gpt-5.4" },
        "IFlowCli": { "Enabled": true, "Model": "glm-4.7" }
      }
    }
  }
}

Each Provider can independently configure parameters such as enablement, model version, and working directory. This design preserves flexibility while remaining easy to manage and maintain.

In some ways, configuration files are like life’s options: you can choose to enable or disable certain things. The only difference is that code choices are easier to regret later.
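To connect this file to the `ProviderConfiguration` object the factory method receives, a binding step along these lines is typical. The property names are our assumption based on the settings shown above; the section path mirrors the configuration layout:

```csharp
using Microsoft.Extensions.Configuration;

// Hypothetical options type matching the per-Provider settings above.
public sealed class ProviderConfiguration
{
    public bool Enabled { get; set; }
    public string? Model { get; set; }
    public string? WorkingDirectory { get; set; }
}

// Bind one Provider's section (requires the configuration binder package):
// var config = configuration
//     .GetSection("AI:Providers:Providers:CodexCli")
//     .Get<ProviderConfiguration>();
```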

With the unified technical architecture in place, the next step is making multiple Agents work together. HagiCode designed a task flow mechanism so different Agents can handle different stages of the work:

Proposal creation (user)
[Claude Code] ──generate proposal──▶ Proposal document
│ │
│ ▼
│ [Codebuddy] ──refine description──▶ Refined proposal
│ │
│ ▼
│ [Codex] ──execute changes──▶ Code changes
│ │
│ ▼
└──────────────────────▶ [iFlow] ──archive──▶ Historical records

The benefit of this division of labor is that each Agent only needs to focus on the tasks it does best, rather than trying to do everything. Claude Code generates proposals from scratch. Codebuddy makes proposal descriptions clearer. Codex turns proposals into actual code changes. iFlow archives and preserves those changes.

This is really just teamwork, the same as in daily life. Everyone has a role, and only together can something big get done. Here, the team members just happen to be AIs.
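The four-stage flow above can be sketched as a simple sequential orchestration. All method names here are hypothetical; HagiCode's real task flow is richer than this:

```csharp
// Each stage hands its output to the next Agent; the token lets the user
// cancel the whole pipeline at any stage.
public async Task RunProposalPipelineAsync(string userRequest, CancellationToken ct)
{
    var proposal = await _claudeCode.GenerateProposalAsync(userRequest, ct); // design
    var refined  = await _codebuddy.RefineAsync(proposal, ct);               // polish text
    var changes  = await _codex.ApplyChangesAsync(refined, ct);              // edit code
    await _iflow.ArchiveAsync(refined, changes, ct);                         // keep history
}
```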

In actual operation, we summarized the following lessons:

1. Agent selection strategy matters

Tasks should not be assigned casually; they should be matched to each Agent’s strengths:

  • Proposal generation: use Claude Code, because it has stronger contextual understanding
  • Code execution: use Codex, because it is more precise for code modification
  • Proposal refinement: use Codebuddy, because it offers strong cost performance
  • Archival storage: use iFlow, because it is stable and reliable

After all, putting the right person on the right task is a timeless principle.

2. Configuration isolation ensures stability

Each Agent’s configuration is managed independently, supports environment-variable overrides, and uses separate working directories. As a result, a configuration error in one Agent does not affect the others.

This is like personal boundaries in life. Everyone needs their own space; non-interference makes coexistence possible.

3. Error-handling mechanism

A failure in a single Agent should not affect the overall workflow. We implemented a fallback strategy: when one Agent fails, the system can automatically switch to a backup plan or skip that step and continue with later tasks. At the same time, complete logging makes troubleshooting easier afterward.

Nobody can guarantee that errors will never happen. The key is how you handle them. Life works much the same way.
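A fallback loop in the spirit described here might look like this. The provider ordering, the `ExecuteAsync` call, and the logger usage are our illustration of the strategy, not HagiCode's exact code:

```csharp
// Try providers in preference order; a failure is logged and the next
// candidate is tried, so one broken Agent never stalls the workflow.
foreach (var type in new[] { AIProviderType.CodexCli, AIProviderType.ClaudeCodeCli })
{
    var provider = await _factory.GetProviderAsync(type);
    if (provider is null) continue;          // not configured or not enabled

    try
    {
        return await provider.ExecuteAsync(request, ct);
    }
    catch (Exception ex)
    {
        // Complete logging makes post-hoc troubleshooting possible.
        _logger.LogWarning(ex, "Provider {Provider} failed; falling back", type);
    }
}
return null;  // every candidate failed: skip this step, later tasks continue
```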

4. Monitoring and observability

Through the ACP protocol (our custom communication protocol based on JSON-RPC 2.0), we can track the execution status of each Agent. Session isolation ensures concurrency safety, while dynamic caching improves performance.

The things you cannot see are often the ones most likely to go wrong. Some visibility is always better than flying blind.

After adopting this multi-Agent collaboration configuration, the HagiCode project’s development efficiency improved significantly. Specifically:

  1. Task-handling capacity doubled: in the past, one Agent had to handle many kinds of tasks at once; now tasks can be processed in parallel, and throughput has increased dramatically
  2. More stable output quality: each Agent focuses only on what it does best, so consistency and quality both improve
  3. Lower maintenance cost: unified interfaces and configuration management make the whole system easier to maintain and extend
  4. Adding new Agents is simple: to integrate a new AI product, you only need to implement the interface and add configuration, without changing the core logic

This approach not only solved HagiCode’s own problems, but also proved that multi-Agent collaboration is a viable architectural choice.

The gains were quite noticeable. The process was just a bit of a hassle.

This article shared the HagiCode project’s practical experience with multi-Agent collaboration configuration. The main takeaways include:

  1. Standardized interfaces: IAIProvider unifies the behavior of different Agents, allowing the code to ignore which company’s product is underneath
  2. Factory pattern: ActivatorUtilities.CreateInstance dynamically creates Provider instances, supporting runtime configuration and dependency injection
  3. Protocol unification: the ACP protocol provides standardized communication between Agents through a bidirectional mechanism based on JSON-RPC 2.0
  4. Task routing: assign work reasonably across different Agents so each can play to its strengths, instead of expecting one Agent to do everything

This design not only solves the problem of “multiple Agents fighting each other,” but also uses the adventure party task flow mechanism to make the development process more automated and specialized.

If you are also considering introducing multiple AI assistants, I hope this article gives you some useful reference points. Of course, every project is different, and the specific approach still needs to be adjusted to the actual situation. There is no one-size-fits-all solution; the best solution is the one that fits you.

Beautiful things or people do not need to be possessed. As long as they remain beautiful, simply appreciating that beauty is enough. Technical solutions are the same: the one that suits you is the best one…

If this article was helpful to you, feel free to give the project a Star on GitHub. Your support is what keeps us sharing more. The public beta has already started, and you are welcome to install it and give it a try.


This content was created with AI-assisted collaboration, reviewed by me, and reflects my own views and position.

Practical Multi-AI Provider Architecture in the HagiCode Platform

This article shares the technical approach we used under the Orleans Grain architecture to integrate two AI tools, iflow and OpenCode, through a unified IAIProvider interface, and compares the implementation differences between WebSocket and HTTP communication in detail.

There is nothing especially mysterious about it. While building HagiCode, we ran into a very practical problem: users wanted to work with different AI tools. That is hardly surprising, since everyone has their own habits. Some prefer Claude Code, some love GitHub Copilot, and some teams use tools they developed themselves.

Our initial solution was simple and direct: write dedicated integration code for each AI tool. But the drawbacks showed up quickly. The codebase filled up with if-else branches, every change required testing in multiple places, and every new tool meant writing another pile of logic from scratch.

Later, I realized it would be better to create a unified IAIProvider interface and abstract the capabilities shared by all AI providers. That way, no matter which tool is used underneath, the upper layers can call it in the same way.

Recently, the project needed to integrate two new tools: iflow and OpenCode. Both support the ACP protocol, but their communication styles are different. iflow uses WebSocket, while OpenCode uses an HTTP API. That became a useful architectural test: adapt two different transport modes behind one unified interface.

The approach shared in this article comes from our practical experience in the HagiCode project. HagiCode is an AI-assisted development platform built on the Orleans Grain architecture. It integrates with different AI providers through a unified IAIProvider interface, allowing users to flexibly choose the AI tools they prefer.

First, we defined the IAIProvider interface and abstracted the capabilities that every AI provider needs to implement:

public interface IAIProvider
{
    string Name { get; }
    bool SupportsStreaming { get; }
    ProviderCapabilities Capabilities { get; }

    Task<AIResponse> ExecuteAsync(AIRequest request, CancellationToken cancellationToken = default);
    IAsyncEnumerable<AIStreamingChunk> StreamAsync(AIRequest request, CancellationToken cancellationToken = default);
    Task<ProviderTestResult> PingAsync(CancellationToken cancellationToken = default);
    IAsyncEnumerable<AIStreamingChunk> SendMessageAsync(AIRequest request, string? embeddedCommandPrompt = null, CancellationToken cancellationToken = default);
}

This interface includes several key methods:

  • ExecuteAsync: execute a one-shot AI request
  • StreamAsync: get streaming responses for real-time display
  • PingAsync: perform a health check to verify whether the provider is available
  • SendMessageAsync: send a message with support for embedded commands
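
The AIRequest, AIResponse, and AIStreamingChunk types are not shown in this article. Based purely on how they are used in the code below, a minimal sketch might look like this; the names match the article, but the exact shapes are assumptions, and the real HagiCode types almost certainly carry more fields:

```csharp
// Hypothetical minimal shapes inferred from usage in this article.
public enum StreamingChunkType { ContentDelta, Metadata }

public sealed record AIRequest
{
    public required string Prompt { get; init; }
    public string? SystemMessage { get; init; }
    public string? WorkingDirectory { get; init; }
    public string? Model { get; init; }
    public string? SessionId { get; init; }
}

public sealed record AIResponse(string Content);

public sealed record AIStreamingChunk(
    StreamingChunkType Type,
    string Content,
    bool IsComplete);

public sealed record ProviderTestResult(bool Success, string? ErrorMessage);
```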

IFlowCliProvider: A WebSocket-Based Implementation

Section titled “IFlowCliProvider: A WebSocket-Based Implementation”

iflow uses WebSocket for ACP communication. The overall architecture looks like this:

IFlowCliProvider → ACPSessionManager → WebSocketAcpTransport → iflow CLI
Dynamic port allocation + process management

The core flow is also fairly straightforward:

  1. ACPSessionManager creates and manages ACP sessions.
  2. WebSocketAcpTransport handles WebSocket communication.
  3. A port is allocated dynamically, and the iflow process is started with iflow --experimental-acp --port.
  4. IAIRequestToAcpMapper and IAcpToAIResponseMapper convert requests and responses.
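
Step 3 can be sketched roughly as follows. The FindFreePort helper and the exact argument shape are illustrative assumptions, not the actual HagiCode implementation:

```csharp
using System.Diagnostics;
using System.Net;
using System.Net.Sockets;

static class IFlowLauncher
{
    // A common trick: bind to port 0 and let the OS pick a free port.
    // Note this is inherently racy; a real implementation may retry on startup failure.
    public static int FindFreePort()
    {
        var listener = new TcpListener(IPAddress.Loopback, 0);
        listener.Start();
        var port = ((IPEndPoint)listener.LocalEndpoint).Port;
        listener.Stop();
        return port;
    }

    // Start the iflow CLI in ACP mode on the allocated port (argument shape assumed).
    public static Process StartIFlow(string executablePath, int port) =>
        Process.Start(new ProcessStartInfo
        {
            FileName = executablePath,
            Arguments = $"--experimental-acp --port {port}",
            UseShellExecute = false,
        })!;
}
```

The WebSocket transport would then connect to the iflow process on localhost at the allocated port once it is ready.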

Here is the core code:

private async IAsyncEnumerable<AIStreamingChunk> StreamCoreAsync(
    AIRequest request,
    string? embeddedCommandPrompt,
    [EnumeratorCancellation] CancellationToken cancellationToken)
{
    // Resolve working directory
    var resolvedWorkingDirectory = ResolveWorkingDirectory(request);
    var effectiveRequest = ApplyEmbeddedCommandPrompt(request, embeddedCommandPrompt);

    // Create ACP session
    await using var session = await _sessionManager.CreateSessionAsync(
        Name,
        resolvedWorkingDirectory,
        cancellationToken,
        request.SessionId);

    // Send prompt
    var prompt = _requestMapper.ToPromptString(effectiveRequest);
    var promptResponse = await session.SendPromptAsync(prompt, cancellationToken);

    // Receive streaming response
    await foreach (var notification in session.ReceiveUpdatesAsync(cancellationToken))
    {
        if (_responseMapper.TryConvertToStreamingChunk(notification, out var chunk))
        {
            if (chunk.Type == StreamingChunkType.Metadata && chunk.IsComplete)
            {
                yield return chunk;
                yield break;
            }
            yield return chunk;
        }
    }
}

There are a few design points worth calling out here:

  • Use await using to ensure the session is released correctly and avoid resource leaks.
  • Return streaming responses through IAsyncEnumerable, which naturally supports async streams.
  • Use Metadata chunks to determine completion and ensure the full response has been received.

OpenCodeCliProvider: An HTTP API-Based Implementation

Section titled “OpenCodeCliProvider: An HTTP API-Based Implementation”

OpenCode provides its service through an HTTP API, so the architecture is slightly different:

OpenCodeCliProvider → OpenCodeRuntimeManager → OpenCodeClient → OpenCode HTTP API
OpenCodeProcessManager → opencode process management

A notable feature of OpenCode is that it uses an SQLite database to persist session bindings. That makes session recovery and prompt-response recovery possible:

private async Task<OpenCodePromptExecutionResult> ExecutePromptAsync(
    AIRequest request,
    string? embeddedCommandPrompt,
    CancellationToken cancellationToken)
{
    var prompt = BuildPrompt(request, embeddedCommandPrompt);
    var resolvedWorkingDirectory = ResolveWorkingDirectory(request.WorkingDirectory);
    var client = await _runtimeManager.GetClientAsync(resolvedWorkingDirectory, cancellationToken);
    var bindingSessionId = request.SessionId;
    var boundSession = TryGetBinding(bindingSessionId, resolvedWorkingDirectory);

    // Try to use the already bound session
    if (boundSession is not null)
    {
        try
        {
            return await PromptSessionAsync(
                client,
                boundSession,
                BuildPromptRequest(request, prompt, CreatePromptMessageId()),
                request.Model ?? _settings.Model,
                cancellationToken);
        }
        catch (OpenCodeApiException ex) when (IsStaleBinding(ex))
        {
            // The session has expired, remove the binding
            RemoveBinding(bindingSessionId);
        }
    }

    // Create a new session
    var session = await client.Session.CreateAsync(new OpenCodeSessionCreateRequest
    {
        Title = BuildSessionTitle(request)
    }, cancellationToken);
    BindSession(bindingSessionId, session.Id, resolvedWorkingDirectory);
    return await PromptSessionAsync(client, session.Id, ...);
}

This implementation has several interesting highlights:

  • Session binding mechanism: the same SessionId reuses the same OpenCode session, avoiding repeated session creation.
  • Expiration handling: when a session is found to be expired, the binding is automatically cleaned up.
  • Database persistence: bindings are stored in SQLite and remain effective after restart.
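
The binding mechanism above can be illustrated with an in-memory sketch. The real implementation persists the same mapping to SQLite so bindings survive restarts; the class and method names here are assumptions for illustration:

```csharp
using System.Collections.Concurrent;

// Illustrative in-memory version of the SessionId -> OpenCode session binding.
// HagiCode persists this mapping in SQLite; swapping the dictionary for a table
// keyed on (bindingId, workingDirectory) gives the same behavior across restarts.
public sealed class SessionBindingStore
{
    private readonly ConcurrentDictionary<(string BindingId, string WorkDir), string> _bindings = new();

    public void Bind(string bindingId, string openCodeSessionId, string workDir) =>
        _bindings[(bindingId, workDir)] = openCodeSessionId;

    public string? TryGet(string bindingId, string workDir) =>
        _bindings.TryGetValue((bindingId, workDir), out var id) ? id : null;

    // Called when the provider detects a stale binding (e.g. the session expired server-side).
    public void Remove(string bindingId, string workDir) =>
        _bindings.TryRemove((bindingId, workDir), out _);
}
```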

| Aspect | IFlowCliProvider | OpenCodeCliProvider |
| --- | --- | --- |
| Communication | WebSocket (ACP) | HTTP API |
| Process management | ACPSessionManager | OpenCodeProcessManager |
| Port allocation | Dynamic port | No port (uses HTTP) |
| Session management | ACPSession | OpenCodeSession |
| Persistence | In-memory cache | SQLite database |
| Startup command | iflow --experimental-acp --port | opencode |
| Latency | Lower (long-lived connection) | Relatively higher (HTTP requests) |

Which approach you choose depends mainly on your needs. WebSocket is better for scenarios with high real-time requirements, while an HTTP API is simpler and easier to debug.

First, enable the two providers in the configuration file:

AI:
  Providers:
    IFlowCli:
      Type: "IFlowCli"
      Enabled: true
      ExecutablePath: "iflow"
      Model: null
      WorkingDirectory: null
    OpenCodeCli:
      Type: "OpenCodeCli"
      Enabled: true
      ExecutablePath: "opencode"
      Model: "anthropic/claude-sonnet-4"
      WorkingDirectory: null
  OpenCode:
    Enabled: true
    BaseUrl: "http://localhost:38376"
    ExecutablePath: "opencode"
    StartupTimeoutSeconds: 30
    RequestTimeoutSeconds: 120

Then, in code, obtain a provider through the factory and invoke it:

// Get the provider through the factory
var provider = await _providerFactory.GetProviderAsync(AIProviderType.IFlowCli);

// Execute an AI request
var request = new AIRequest
{
    Prompt = "Please help me refactor this function",
    WorkingDirectory = "/path/to/project",
    Model = "claude-sonnet-4"
};

// Get the complete response
var response = await provider.ExecuteAsync(request, cancellationToken);
Console.WriteLine(response.Content);

// Or use streaming responses
await foreach (var chunk in provider.StreamAsync(request, cancellationToken))
{
    if (chunk.Type == StreamingChunkType.ContentDelta)
    {
        Console.Write(chunk.Content);
    }
}

The OpenCode provider is used the same way:

// Get the provider through the factory
var provider = await _providerFactory.GetProviderAsync(AIProviderType.OpenCodeCli);

var request = new AIRequest
{
    Prompt = "Please help me analyze this error",
    WorkingDirectory = "/path/to/project",
    Model = "anthropic/claude-sonnet-4"
};

var response = await provider.ExecuteAsync(request, cancellationToken);
Console.WriteLine(response.Content);

Before startup or before use, you can check whether the provider is available:

var iflowResult = await iflowProvider.PingAsync(cancellationToken);
if (!iflowResult.Success)
{
    Console.WriteLine($"IFlow is unavailable: {iflowResult.ErrorMessage}");
    return;
}

var openCodeResult = await openCodeProvider.PingAsync(cancellationToken);
if (!openCodeResult.Success)
{
    Console.WriteLine($"OpenCode is unavailable: {openCodeResult.ErrorMessage}");
    return;
}

Both providers support embedded commands, such as /file:xxx:

var request = new AIRequest
{
    Prompt = "Analyze the issues in this file",
    SystemMessage = "You are a code analysis expert"
};

await foreach (var chunk in provider.SendMessageAsync(
    request,
    embeddedCommandPrompt: "/file:src/main.cs",
    cancellationToken))
{
    Console.Write(chunk.Content);
}

IFlow uses long-lived WebSocket connections, so resource management deserves special attention:

  • Use await using to ensure sessions are released properly.
  • Cancellation triggers process cleanup.
  • ACPSessionManager supports a maximum session count limit.

OpenCode process management is relatively simpler, and OpenCodeRuntimeManager handles it automatically.

Both providers have complete error handling:

  • IFlow errors are propagated through ACP session updates.
  • OpenCode errors are thrown as OpenCodeApiException.
  • Callers should catch and handle these exceptions.

On the performance side:

  • IFlow's WebSocket communication has lower latency than HTTP.
  • OpenCode session reuse reduces the overhead of repeated HTTP requests.
  • The factory's cache avoids repeatedly creating providers.
  • In high-concurrency scenarios, watch the limits on process and connection counts.

The executable path is validated at startup, but runtime issues can still happen. PingAsync is a useful tool for verifying whether the configuration is correct:

// Check at startup
var provider = await _providerFactory.GetProviderAsync(providerType);
var result = await provider.PingAsync(cancellationToken);
if (!result.Success)
{
    _logger.LogError("Provider {ProviderType} is unavailable: {Error}", providerType, result.ErrorMessage);
}

This article shares the technical approach used by the HagiCode platform when integrating the two AI tools iflow and OpenCode. Through a unified IAIProvider interface, we adapted different communication styles, WebSocket and HTTP, while keeping the upper-layer calling pattern consistent.

The core idea is actually quite simple:

  1. Define a unified interface abstraction.
  2. Build adapter layers for different implementations.
  3. Manage everything uniformly through the factory pattern.
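
The factory in point 3 can be sketched as a small cached registry. The GetProviderAsync name mirrors the article's _providerFactory calls, but the implementation details below are assumptions:

```csharp
using System.Collections.Concurrent;

public enum AIProviderType { IFlowCli, OpenCodeCli }

// Illustrative provider factory with caching: each provider is created once
// and reused, which avoids repeatedly spawning CLI processes or connections.
public sealed class AIProviderFactory
{
    private readonly ConcurrentDictionary<AIProviderType, Task<IAIProvider>> _cache = new();
    private readonly Func<AIProviderType, Task<IAIProvider>> _create;

    public AIProviderFactory(Func<AIProviderType, Task<IAIProvider>> create) => _create = create;

    // Adding a new AI tool means registering one more IAIProvider implementation;
    // callers never change.
    public Task<IAIProvider> GetProviderAsync(AIProviderType type) =>
        _cache.GetOrAdd(type, _create);
}
```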

That gives the system good extensibility. When a new AI tool needs to be integrated later, all we need to do is implement the IAIProvider interface without changing too much existing code.

If you are also working on multi-AI-tool integration, I hope this article is helpful.

