

Hermes Agent Integration Practice: From Protocol to Production


Sharing the complete HagiCode experience of integrating Hermes Agent, including core lessons around ACP protocol adaptation, session pool management, and front-end/back-end contract synchronization.

While building HagiCode, an AI-assisted coding platform, our team needed to integrate an Agent framework that could run locally and also scale to the cloud. After research, Hermes Agent from Nous Research was chosen as the underlying engine for our general-purpose Agent capabilities.

In truth, technology selection is neither especially hard nor especially easy. There are plenty of strong Agent frameworks on the market, but Hermes stood out because its ACP protocol and tool system fit HagiCode’s demanding requirements particularly well: local development, team collaboration, and cloud expansion. Still, bringing Hermes into a real production system meant solving a long list of engineering problems. This part was anything but trivial.

HagiCode’s stack uses Orleans to build a distributed system, while the front end is built with React + TypeScript. Integrating Hermes meant preserving architectural consistency while making Hermes a first-class executor alongside ClaudeCode and OpenCode. It sounds simple enough, but implementation always tells the real story.

This article shares our practical experience integrating Hermes Agent into HagiCode, and we hope it offers useful reference material for teams facing similar needs. After all, once you’ve fallen into a pit, there is no reason to let someone else fall into the same one.

The solution described in this article comes from our hands-on work in the HagiCode project. HagiCode is an AI-driven coding assistance platform that supports unified access to and management of multiple AI Providers. During the Hermes Agent integration, we designed a generic Provider abstraction layer so new Agent types could plug into the existing system seamlessly.

If you’re interested in HagiCode, feel free to visit GitHub to learn more. The more people who pay attention, the stronger the momentum.

HagiCode’s Hermes integration uses a clear layered architecture, with each layer focused on its own responsibilities:

Back-end core layer

  • HermesCliProvider: implements the IAIProvider interface as the unified AI Provider entry point
  • HermesPlatformConfiguration: manages Hermes executable path, arguments, authentication, and related settings
  • ICliProvider<HermesOptions>: the low-level CLI abstraction provided by HagiCode.Libs for handling subprocess lifecycles

Transport layer

  • StdioAcpTransport: communicates with the Hermes ACP subprocess through standard input and output
  • ACP protocol methods: initialize, authenticate, session/new, session/prompt

Runtime layer

  • HermesGrain: Orleans Grain implementation that handles distributed session execution
  • CliAcpSessionPool: session pool that reuses ACP subprocesses to avoid frequent startup overhead

Front-end layer

  • ExecutorAvatar: Hermes visual identity and icon
  • executorTypeAdapter: Provider type mapping logic
  • SignalR real-time messaging: maintains Hermes identity consistency throughout the message stream

This layered design allows each layer to evolve independently. For example, if we want to add a new transport mechanism in the future, such as WebSocket, we only need to modify the transport layer; there is no need to overhaul the whole system just because one transport changes.
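To make that swap concrete, here is a minimal TypeScript sketch (not taken from HagiCode's codebase; `AcpTransport` and `InMemoryTransport` are hypothetical names) of the kind of transport seam this layering implies — stdio today, WebSocket tomorrow, same interface either way:

```typescript
// Minimal transport seam: any transport (stdio, WebSocket, in-memory)
// satisfies the same interface, so layers above never change.
interface AcpTransport {
  connect(): Promise<void>;
  send(message: object): Promise<void>;
  receive(): AsyncIterable<string>; // raw JSON-RPC lines
  close(): Promise<void>;
}

// Toy in-memory transport, handy as a test stand-in for the real subprocess.
class InMemoryTransport implements AcpTransport {
  private queue: string[] = [];
  async connect(): Promise<void> {}
  async send(message: object): Promise<void> {
    // Echo the method name back, simulating a trivial agent.
    const m = message as { method?: string };
    this.queue.push(JSON.stringify({ echo: m.method ?? null }));
  }
  async *receive(): AsyncIterable<string> {
    while (this.queue.length > 0) yield this.queue.shift()!;
  }
  async close(): Promise<void> {}
}
```

Swapping in a WebSocket transport would then mean writing one new class, not touching the Provider or Grain layers.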

All AI Providers implement the IAIProvider interface, which is one of the core design choices in HagiCode’s architecture:

public interface IAIProvider
{
    string Name { get; }
    ProviderCapabilities Capabilities { get; }

    IAsyncEnumerable<AIStreamingChunk> StreamAsync(
        AIRequest request,
        CancellationToken cancellationToken = default);

    Task<AIResponse> ExecuteAsync(
        AIRequest request,
        CancellationToken cancellationToken = default);
}

HermesCliProvider implements this interface and stands on equal footing with ClaudeCodeProvider, OpenCodeProvider, and others. The benefits of this design include:

  1. Replaceability: switching Providers does not affect upper-layer business logic
  2. Testability: Providers can be mocked easily for unit testing
  3. Extensibility: adding a new Provider only requires implementing the interface

In the end, interfaces are a lot like rules. Once the rules are in place, everyone can coexist harmoniously, play to their strengths, and avoid stepping on each other. There is a certain elegance in that.

HermesCliProvider is the core of the entire integration. It coordinates the various components needed to complete a single AI invocation:

public sealed class HermesCliProvider : IAIProvider, IVersionedAIProvider
{
    private readonly ICliProvider<LibsHermesOptions> _provider;
    private readonly ConcurrentDictionary<string, string> _sessionBindings;
    // (_platformConfiguration and _responseMapper fields elided in this excerpt)

    public ProviderCapabilities Capabilities { get; } = new()
    {
        SupportsStreaming = true,
        SupportsTools = true,
        SupportsSystemMessages = true,
        SupportsArtifacts = false
    };

    public async IAsyncEnumerable<AIStreamingChunk> StreamAsync(
        AIRequest request,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        // 1. Resolve the session binding key
        var bindingKey = ResolveBindingKey(request.SessionId);

        // 2. Get or create a Hermes session through the session pool
        var options = new HermesOptions
        {
            ExecutablePath = _platformConfiguration.ExecutablePath,
            Arguments = _platformConfiguration.Arguments,
            SessionId = _sessionBindings.TryGetValue(bindingKey, out var sessionId) ? sessionId : null,
            WorkingDirectory = request.WorkingDirectory,
            Model = request.Model
        };

        // 3. Execute and collect the streaming response
        await foreach (var message in _provider.ExecuteAsync(options, request.Prompt, cancellationToken))
        {
            // 4. Map ACP messages to AIStreamingChunk
            if (_responseMapper.TryConvertToStreamingChunk(message, out var chunk))
            {
                yield return chunk;
            }
        }
    }
}

Several design points are especially important here:

  1. Session binding: uses the request's SessionId to bind multiple requests to the same Hermes subprocess, preserving context continuity across multi-turn conversations
  2. Response mapping: converts Hermes ACP message format into the unified AIStreamingChunk format
  3. Streaming support: uses IAsyncEnumerable to support true streaming responses

Session binding is a bit like human relationships. Once a connection is established, future communication has context, so you do not need to start from zero each time. Of course, that relationship still has to be maintained.

Hermes uses ACP (Agent Communication Protocol), which differs from a traditional HTTP API. ACP is a protocol based on standard input and output, and it has several characteristics:

  1. Startup marker: after the Hermes process starts, it outputs the //ready marker
  2. Dynamic authentication: authentication methods are not fixed and must be negotiated through the protocol
  3. Session reuse: established sessions are reused through SessionId
  4. Fragmented responses: a complete response may be split across multiple session/update notifications

HagiCode handles these characteristics through StdioAcpTransport:

public class StdioAcpTransport
{
    public async Task InitializeAsync(CancellationToken cancellationToken)
    {
        // Wait for the //ready marker
        var readyLine = await _outputReader.ReadLineAsync(cancellationToken);
        if (readyLine != "//ready")
        {
            throw new InvalidOperationException("Hermes did not send ready signal");
        }

        // Send the initialize request
        await SendRequestAsync(new
        {
            jsonrpc = "2.0",
            id = 1,
            method = "initialize",
            @params = new
            {
                protocolVersion = "2024-11-05",
                capabilities = new { },
                clientInfo = new { name = "HagiCode", version = "1.0.0" }
            }
        }, cancellationToken);
    }
}

Protocols are a bit like mutual understanding between people. Once that understanding is there, communication flows much more smoothly. Building it just takes time.

Starting Hermes subprocesses frequently is expensive, so we implemented a session pool mechanism:

services.AddSingleton(static _ =>
{
    var registry = new CliProviderPoolConfigurationRegistry();
    registry.Register("hermes", new CliPoolSettings
    {
        MaxActiveSessions = 50,
        IdleTimeout = TimeSpan.FromMinutes(10)
    });
    return registry;
});

Key session pool parameters:

  • MaxActiveSessions: controls the concurrency limit to avoid exhausting resources
  • IdleTimeout: idle timeout that balances startup cost against memory usage

In practice, we found that:

  1. If the idle timeout is too short, sessions restart frequently; if it is too long, memory remains occupied
  2. The concurrency limit must be tuned according to actual load, because setting it too high can make the system sluggish
  3. Session pool utilization needs monitoring so parameters can be adjusted in time
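To illustrate how those two knobs interact, here is a hedged TypeScript sketch of a pool with a concurrency cap and idle-timeout eviction. `SessionPool` is a toy stand-in for illustration only, not HagiCode's actual `CliAcpSessionPool`:

```typescript
// Toy session pool: enforces a MaxActiveSessions-style cap and evicts
// sessions that have sat idle past the timeout, mirroring the trade-off
// between startup cost and memory discussed above.
class SessionPool<T> {
  private idle: { session: T; lastUsed: number }[] = [];
  private active = 0;

  constructor(
    private readonly create: () => T,
    private readonly maxActiveSessions: number,
    private readonly idleTimeoutMs: number,
  ) {}

  acquire(now: number): T {
    // Evict sessions whose idle time exceeds the timeout.
    this.idle = this.idle.filter(e => now - e.lastUsed < this.idleTimeoutMs);
    const reused = this.idle.pop();
    if (reused) {
      this.active++;
      return reused.session;
    }
    if (this.active >= this.maxActiveSessions) {
      throw new Error("MaxActiveSessions reached");
    }
    this.active++;
    return this.create(); // cold start: the expensive path we try to avoid
  }

  release(session: T, now: number): void {
    this.active--;
    this.idle.push({ session, lastUsed: now });
  }
}
```

A short idle timeout makes the `create()` branch fire often (frequent restarts); a long one keeps the `idle` array occupied (memory held). The sketch makes that trade-off literal.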

This is much like many choices in life: being too aggressive creates problems, while being too conservative misses opportunities. The goal is simply to find the right balance.

The front end needs to correctly identify the Hermes Provider and display the corresponding visual elements:

executorTypeAdapter.ts
export const resolveExecutorVisualTypeFromProviderType = (
  providerType: PCode_Models_AIProviderType | null | undefined
): ExecutorVisualType => {
  switch (providerType) {
    case PCode_Models_AIProviderType.HERMES_CLI:
      return 'Hermes';
    default:
      return 'Unknown';
  }
};

Hermes has its own icon and color identity:

ExecutorAvatar.tsx
const renderExecutorGlyph = (executorType: ExecutorVisualType, iconSize: number) => {
  switch (executorType) {
    case 'Hermes':
      return (
        <svg viewBox="0 0 24 24" fill="none" className="h-4 w-4">
          <rect x="4" y="4" width="16" height="16" rx="4" fill="currentColor" opacity="0.16" />
          <path d="M8 7v10M16 7v10M8 12h8" stroke="currentColor" strokeWidth="2" strokeLinecap="round" />
        </svg>
      );
    default:
      return <DefaultAvatar />;
  }
};

After all, beautiful things deserve beautiful presentation. Making sure that beauty is actually visible still depends on front-end craftsmanship.

The front end and back end keep their contract aligned through OpenAPI generation. The back end defines the AIProviderType enum:

public enum AIProviderType
{
    Unknown,
    ClaudeCode,
    OpenCode,
    HermesCli // Newly added
}

The front end generates the corresponding TypeScript type through OpenAPI, ensuring enum values stay consistent. This is the key to preventing the front end from displaying Unknown.
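One way to make that consistency enforceable at compile time is an exhaustive switch over the generated enum. The sketch below is illustrative TypeScript with hypothetical names (the real enum comes from OpenAPI codegen, and HagiCode's actual member is `HERMES_CLI`):

```typescript
// Hypothetical mirror of the generated enum; in practice OpenAPI codegen
// produces this from the backend's AIProviderType.
enum AIProviderType {
  Unknown = "Unknown",
  ClaudeCode = "ClaudeCode",
  OpenCode = "OpenCode",
  HermesCli = "HermesCli",
}

type ExecutorVisualType = "ClaudeCode" | "OpenCode" | "Hermes" | "Unknown";

// Exhaustive switch: if the backend adds an enum value and the generated
// type gains a member this switch does not handle, the `never` assignment
// becomes a compile-time error instead of a silent "Unknown" in the UI.
function resolveExecutorVisualType(p: AIProviderType): ExecutorVisualType {
  switch (p) {
    case AIProviderType.ClaudeCode: return "ClaudeCode";
    case AIProviderType.OpenCode: return "OpenCode";
    case AIProviderType.HermesCli: return "Hermes";
    case AIProviderType.Unknown: return "Unknown";
    default: {
      const unreachable: never = p; // fails to compile if a case is missing
      return unreachable;
    }
  }
}
```

Compared with a plain `default: return 'Unknown'`, this turns contract drift into a build error rather than a runtime surprise.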

A contract is a lot like a promise. Once agreed, it has to be honored, otherwise you end up in awkward situations like Unknown.

Hermes configuration is managed through appsettings.json:

{
  "Providers": {
    "HermesCli": {
      "ExecutablePath": "hermes",
      "Arguments": "acp",
      "StartupTimeoutMs": 10000,
      "ClientName": "HagiCode",
      "Authentication": {
        "PreferredMethodId": "api-key",
        "MethodInfo": {
          "api-key": "your-api-key-here"
        }
      },
      "SessionDefaults": {
        "Model": "claude-sonnet-4-20250514",
        "ModeId": "default"
      }
    }
  }
}

This configuration-driven design brings flexibility:

  • executable paths can be overridden, which is convenient for development and testing
  • startup arguments can be customized to match different Hermes versions
  • authentication information can be configured to support multiple authentication methods

Configuration is a bit like multiple-choice questions in life. If enough options are available, there is usually one that fits. That said, too many options can create decision fatigue of their own.

Building a reliable Provider requires comprehensive health checks:

public async Task<ProviderTestResult> PingAsync(CancellationToken cancellationToken = default)
{
    var stopwatch = Stopwatch.StartNew();

    var response = await ExecuteAsync(new AIRequest
    {
        Prompt = "Reply with exactly PONG.",
        SessionId = null,
        AllowedTools = Array.Empty<string>(),
        WorkingDirectory = ResolveWorkingDirectory(null)
    }, cancellationToken);

    var success = string.Equals(response.Content.Trim(), "PONG", StringComparison.OrdinalIgnoreCase);

    return new ProviderTestResult
    {
        ProviderName = Name,
        Success = success,
        ResponseTimeMs = stopwatch.ElapsedMilliseconds,
        ErrorMessage = success ? null : $"Unexpected Hermes ping response: '{response.Content}'."
    };
}

Points to watch in health checks:

  1. Use simple test cases and avoid overly complex scenarios
  2. Set reasonable timeout values
  3. Record response time to support performance analysis

Just as people need physical checkups, systems need health checks too. The sooner issues are found, the easier they are to fix.

HagiCode provides a dedicated console for validating Hermes integration:

Terminal window
# Basic validation
HagiCode.Libs.Hermes.Console --test-provider
# Full suite (including repository analysis)
HagiCode.Libs.Hermes.Console --test-provider-full --repo .
# Custom executable
HagiCode.Libs.Hermes.Console --test-provider-full --executable /path/to/hermes

This tool is extremely useful during development because it lets us quickly verify whether the integration is correct. After all, no one wants to wait until a problem surfaces before remembering to test.

Authentication failure

  • Check whether Authentication.PreferredMethodId matches the authentication method Hermes actually supports
  • Confirm the authentication information format is correct, such as API Key or Bearer Token

Session timeout

  • Increase the StartupTimeoutMs value
  • Check MCP server reachability
  • Review system resource utilization

Incomplete response

  • Ensure session/update notifications and the final result are aggregated correctly
  • Check cancellation logic in streaming handling
  • Verify error handling is complete

Front end displays Unknown

  • Confirm OpenAPI generation already includes the HermesCli enum value
  • Check whether type mapping is correct
  • Clear browser cache and regenerate types

Problems will always exist. When they appear, the important thing is not to panic. Trace the cause step by step, and in the end, most of them can be solved.

  1. Use the session pool: reuse ACP subprocesses to reduce startup overhead
  2. Set timeouts appropriately: balance memory use against startup cost
  3. Reuse session IDs: use the same SessionId for batch tasks
  4. Configure MCP on demand: avoid unnecessary tool invocations

Performance is a lot like efficiency in daily life. When you get it right, you achieve more with less; when you get it wrong, effort multiplies while results shrink. Finding that “right” point takes both experience and luck.

Integrating Hermes Agent into a production system requires considering problems across multiple dimensions:

  1. Architecture: design a unified Provider interface and implement a replaceable component architecture
  2. Protocol: correctly handle ACP-specific behavior such as startup markers and dynamic authentication
  3. Performance: reuse resources through the session pool and balance startup cost against memory usage
  4. Front end: ensure contract synchronization and provide a consistent visual experience

HagiCode’s experience shows that with good layered design and configuration-driven implementation, a complex Agent system can be integrated seamlessly into an existing architecture.

These principles sound simple when described in words, but actual implementation always introduces many different kinds of problems. That is fine. If a problem gets solved, it becomes experience. If it does not, it becomes a lesson. Either way, it still has value.

Beautiful things or people do not need to be possessed; as long as they remain beautiful, it is enough to quietly appreciate that beauty. Technology is much the same. If it helps make the system better, then the specific framework or protocol matters far less than people sometimes think.

Thank you for reading. If you found this article helpful, feel free to like, bookmark, and share it. This content was created with AI-assisted collaboration, and the final content was reviewed and confirmed by the author.

Why HagiCode Chose Hermes as Its Integrated Agent Core


When building an AI-assisted coding platform, choosing the right Agent core directly determines the upper limit of the system’s capabilities. Some things simply cannot be forced; pick the wrong framework, and no amount of effort will make it feel right. This article shares the thinking behind HagiCode’s technical selection and our hands-on experience integrating Hermes Agent.

When building an AI-assisted coding product, one of the hardest parts is choosing the underlying Agent framework. There are actually quite a few options on the market, but some are too limited in functionality, some are overly complex to deploy, and others simply do not scale well enough. What we needed was a solution that could run on a $5 VPS while also being able to connect to a GPU cluster. That requirement may not sound extreme, but it is enough to scare plenty of teams away.

In practice, many so-called “all-in-one Agents” either only run in the cloud or require absurdly high local deployment costs. After spending two weeks researching different approaches, we made a bold decision: rebuild the entire Agent core around Hermes as the underlying engine for our integrated Agent.

Everything that followed may simply have been fate.

The approach shared in this article comes from real-world experience in the HagiCode project. HagiCode is an AI-assisted coding platform that provides developers with an intelligent coding assistant through a VSCode extension, a desktop client, and web services. You may have used similar tools before and felt they were just missing that final touch; we understand that feeling well.

Before diving into Hermes itself, it helps to explain why HagiCode needed something like it in the first place. Things rarely work exactly the way you want, so you need a practical reason to commit to a technical direction.

As an AI coding assistant, HagiCode needs to support several usage scenarios at the same time:

  • Local development environments: developers want to run it on their own machines so data never leaves the local environment. These days, data security is never a trivial concern.
  • Team collaboration environments: small teams should be able to share an Agent deployment running on a server. Saving money matters, and everyone has limits.
  • Elastic cloud expansion: when handling complex tasks, the system should automatically scale out to a GPU cluster. It is always better to be prepared.

This “we want everything at once” requirement is what led us to Hermes. Whether it was the perfect choice, I cannot say for sure, but at the time we did not see a better option.

Hermes Agent is an autonomous AI Agent created by Nous Research. Some readers may not be familiar with Nous Research; they are the lab behind open-source large models such as Hermes, Nomos, and Psyché. They have built many excellent things, even if they are still more underappreciated than they deserve.

Unlike traditional IDE coding assistants or simple API chat wrappers, Hermes has a defining trait: the longer it runs, the more capable it becomes. It is not designed to complete a task once and stop; it keeps learning and accumulating experience over long-running operation. In that sense, it feels a little like a person.

Several of Hermes’s core capabilities happen to align very closely with HagiCode’s needs.

This means HagiCode can choose the most suitable deployment model based on each user’s scenario: individuals run it locally, teams deploy it on servers, and complex tasks use GPU resources. One codebase handles all of it. In a world this busy, saving one layer of complexity is already a win.

Multi-platform messaging gateway Hermes natively supports Telegram, Discord, Slack, WhatsApp, and more. For HagiCode, this means we can support AI assistants on those channels much more easily in the future. More paths forward are always welcome.

Rich tool system Hermes comes with 40+ built-in tools and supports MCP (Model Context Protocol) extensions. This is essential for a coding assistant: executing shell commands, working with the file system, and calling Git all depend on tool support. An Agent without tools is like a bird without wings.

Cross-session memory Hermes includes a persistent memory system and uses FTS5 full-text search to recall historical conversations. That allows the Agent to remember prior context instead of “losing its memory” every time. Sometimes people wish they could forget things that easily, but reality is usually less generous.

Now that the “why” is clear, let us look at the “how.” Once something makes sense in theory, the next step is to build it.

In HagiCode’s architecture, all AI Providers implement a unified IAIProvider interface:

public sealed class HermesCliProvider : IAIProvider, IVersionedAIProvider
{
    public ProviderCapabilities Capabilities { get; } = new ProviderCapabilities
    {
        SupportsStreaming = true,       // Supports streaming output
        SupportsTools = true,           // Supports tool invocation
        SupportsSystemMessages = true,  // Supports system prompts
        SupportsArtifacts = false
    };
}

This abstraction layer allows HagiCode to switch seamlessly between different AI Providers. Whether the backend is OpenAI, Claude, or Hermes, the upper-layer calling pattern stays exactly the same. In plain terms, it keeps things simple.

Hermes communicates through ACP (Agent Communication Protocol). This protocol is designed specifically for Agent communication, and its main methods include:

  • initialize: initialize the connection and obtain the protocol version and client capabilities
  • authenticate: handle authentication, supporting multiple authentication methods
  • session/new: create a new session and configure the working directory and MCP servers
  • session/prompt: send a prompt and receive a response
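As a rough illustration of that call order, the following TypeScript sketch builds the four requests as JSON-RPC 2.0 messages. Only the method names come from the protocol description above; the parameter shapes (`clientInfo`, `methodId`, `cwd`, `prompt`) are assumptions for illustration:

```typescript
// Build JSON-RPC 2.0 requests in the order the method list gives:
// initialize -> authenticate -> session/new -> session/prompt.
let nextId = 0;

function acpRequest(method: string, params: object): string {
  return JSON.stringify({ jsonrpc: "2.0", id: ++nextId, method, params });
}

const handshake = [
  acpRequest("initialize", { clientInfo: { name: "HagiCode" } }),
  acpRequest("authenticate", { methodId: "api-key" }),
  acpRequest("session/new", { cwd: "/repo" }),
  acpRequest("session/prompt", { prompt: "hello" }),
];
```

Each line would be written to the subprocess's stdin, with responses read back from stdout and matched by `id`.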

HagiCode implements the ACP transport layer through StdioAcpTransport, launching a Hermes subprocess and communicating with it over standard input and output. It may sound complicated, but in practice it is manageable as long as you have enough patience.

Configuration is managed through the HermesPlatformConfiguration class:

public sealed class HermesPlatformConfiguration : IAcpPlatformConfiguration
{
    public string ExecutablePath { get; set; } = "hermes";
    public string Arguments { get; set; } = "acp";
    public int StartupTimeoutMs { get; set; } = 5000;
    public string ClientName { get; set; } = "HagiCode";
    public HermesAuthenticationConfiguration Authentication { get; set; }
    public HermesSessionDefaultsConfiguration SessionDefaults { get; set; }
}

Configure Hermes in appsettings.json:

{
  "Providers": {
    "HermesCli": {
      "ExecutablePath": "hermes",
      "Arguments": "acp",
      "StartupTimeoutMs": 10000,
      "ClientName": "HagiCode",
      "Authentication": {
        "PreferredMethodId": "api-key",
        "MethodInfo": {
          "api-key": "your-api-key-here"
        }
      },
      "SessionDefaults": {
        "Model": "claude-sonnet-4-20250514",
        "ModeId": "default"
      }
    }
  }
}

Configuration often looks simple on paper, but getting every detail right still takes real effort.

HagiCode uses Orleans to build its distributed system, and the Hermes integration is implemented through the following components:

  • HermesGrain: An Orleans Grain implementation that handles session execution
  • HermesPlatformConfiguration: Platform-specific configuration
  • HermesAcpSessionAdapter: ACP session adapter
  • HermesConsole: A dedicated validation console

The name Orleans does have a certain charm to it. Even if this Orleans has nothing to do with the legendary city, a good name never hurts.

The following is the core execution logic of the Hermes Provider:

private async IAsyncEnumerable<AIStreamingChunk> StreamCoreAsync(
    AIRequest request,
    string? embeddedCommandPrompt,
    [EnumeratorCancellation] CancellationToken cancellationToken)
{
    var nextRequestId = 1;
    // (this excerpt elides some locals: isAuthenticated, workingDirectory, model, root)

    // 1. Create transport layer and launch Hermes subprocess
    await using var transport = new StdioAcpTransport(
        platformConfiguration.GetExecutablePath(),
        platformConfiguration.GetArguments(),
        platformConfiguration.GetEnvironmentVariables(),
        platformConfiguration.GetStartupTimeout(),
        _loggerFactory.CreateLogger<StdioAcpTransport>());
    await transport.ConnectAsync(cancellationToken);

    // 2. Initialize and obtain protocol version and authentication methods
    var initializeResult = await SendHermesRequestAsync(
        transport, nextRequestId++, "initialize",
        BuildInitializeParameters(platformConfiguration), cancellationToken);

    // 3. Handle authentication
    var authMethods = ParseAuthMethods(initializeResult);
    if (!isAuthenticated)
    {
        var methodId = platformConfiguration.Authentication.ResolveMethodId(authMethods);
        await SendHermesRequestAsync(transport, nextRequestId++, "authenticate", ...);
    }

    // 4. Create session
    var newSessionResult = await SendHermesRequestAsync(
        transport, nextRequestId++, "session/new",
        BuildNewSessionParameters(platformConfiguration, workingDirectory, model), cancellationToken);
    var sessionId = ParseSessionId(newSessionResult);

    // 5. Execute prompt and collect streaming responses
    await foreach (var payload in transport.ReceiveMessagesAsync(cancellationToken))
    {
        // Handle session/update notifications and convert them into streaming chunks
        if (TryParseSessionNotification(root, out var notification))
        {
            if (_responseMapper.TryConvertToStreamingChunk(notification, out var chunk))
            {
                yield return chunk;
            }
        }
    }
}

With code, the details eventually become familiar. What matters most is the overall approach.

To ensure Hermes remains available, HagiCode implements a health check mechanism:

public async Task<ProviderTestResult> PingAsync(CancellationToken cancellationToken = default)
{
    var stopwatch = Stopwatch.StartNew();

    var response = await ExecuteAsync(
        new AIRequest
        {
            Prompt = "Reply with exactly PONG.",
            SessionId = null,
            AllowedTools = Array.Empty<string>(),
            WorkingDirectory = ResolveWorkingDirectory(null)
        },
        cancellationToken);

    var success = string.Equals(response.Content.Trim(), "PONG", StringComparison.OrdinalIgnoreCase);

    return new ProviderTestResult
    {
        ProviderName = Name,
        Success = success,
        ResponseTimeMs = stopwatch.ElapsedMilliseconds,
        ErrorMessage = success ? null : $"Unexpected Hermes ping response: '{response.Content}'."
    };
}

That is roughly what a “health check” looks like here. In some ways, people are not so different: it helps to check in from time to time, even if no one tells us exactly what to look for.

There are a few pitfalls worth understanding before integrating Hermes. Everyone steps into a few traps sooner or later.

Hermes supports multiple authentication methods, including API keys and tokens, so you need to choose based on the actual deployment scenario. Misconfiguration can cause connection failures, and the resulting error messages are not always intuitive. Sometimes the reported error is far away from the real root cause, which means slow and careful debugging is unavoidable.

When creating a session, you can configure a list of MCP servers so Hermes can call external tools. But keep the following points in mind:

  • MCP server addresses must be reachable
  • Timeouts must be configured reasonably
  • The system needs degradation handling when a server is unavailable

In practice, defensive thinking matters more than people expect.

Each session must specify a working directory so Hermes can access project files correctly. In multi-project scenarios, the working directory needs to switch dynamically. It sounds straightforward, but there are more edge cases than you might think.

Hermes responses may be split across session/update notifications and the final result, so they must be merged correctly. Otherwise, content may be lost.
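A minimal sketch of that merging logic, assuming each notification carries a text delta (the notification shape here is hypothetical, not Hermes's exact schema):

```typescript
// Merge fragmented output: each session/update carries a delta, and a
// final result message marks completion. Dropping any fragment loses text.
interface SessionUpdate {
  kind: "update" | "result";
  text: string;
}

function aggregateResponse(messages: SessionUpdate[]): string {
  let buffer = "";
  for (const m of messages) {
    buffer += m.text;             // append deltas in arrival order
    if (m.kind === "result") break; // stop once the final result arrives
  }
  return buffer;
}
```

The point is simply that the final result must be concatenated with, not substituted for, the updates that preceded it.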

Runtime errors should be returned explicitly instead of silently falling back to another Provider. That way, users know the issue came from Hermes rather than wondering why the system suddenly switched models behind the scenes.

HagiCode’s decision to use Hermes as its integrated Agent core was not a casual impulse. It was a careful choice based on practical requirements and the technical characteristics of the framework. Whether it proves to be the perfect long-term answer is still too early to say, but so far it has been serving us well.

Hermes gives HagiCode the flexibility to adapt to a wide range of scenarios. Its powerful tool system and MCP support allow the AI assistant to do real work, while the ACP protocol and Provider abstraction layer keep the integration process clear and controllable.

If you are choosing an Agent framework for your own AI project, I hope this article offers a useful reference. Picking the right underlying architecture can make everything that follows much easier.

Thank you for reading. If you found this article useful, you are welcome to support it with a like, bookmark, or share. This content was created with AI-assisted collaboration, and the final content was reviewed and confirmed by the author.

Practical Guide to Integrating CodeBuddy CLI into a C# Backend


This article walks through a complete approach to integrating CodeBuddy CLI into a C# backend project so you can deliver AI coding assistant capabilities end to end.

In modern AI coding assistant development, a single AI Provider often cannot satisfy complex and changing development scenarios. HagiCode, as a multifunctional AI coding assistant, needs to support multiple AI Providers to deliver a better user experience. Users should have enough freedom to choose. In early 2026, the project faced a key decision: how to restore CodeBuddy ACP (Agent Communication Protocol) integration capabilities in the C# backend.

The project had previously implemented CodeBuddy integration, but the related code was removed during a refactor. There is not much to complain about there; during iterative development, something always gets left behind. The goal of this technical solution was to fully restore that capability and improve the architecture so it would be more robust and maintainable.

If you are also considering connecting multiple AI coding assistants to your own project, the approach below may give you some ideas. It reflects lessons we summarized after stepping into plenty of pitfalls, and maybe it can help you avoid a few detours.

The approach shared in this article comes from our practical experience in the HagiCode project. HagiCode is an open-source AI coding assistant project that supports multiple AI Providers and cross-platform operation. To satisfy different user preferences, we need to switch flexibly among different AI coding assistants, which is exactly why we built the CodeBuddy integration described here.

HagiCode uses a modular design, with AI Providers implemented as pluggable components. This architecture lets us add new AI support easily without affecting existing features. When a design is done well up front, it saves a lot of trouble later. If you are interested in our technical architecture, you can view the full source code on GitHub.

The integration between C# and CodeBuddy uses a clear layered architecture. This design makes responsibilities explicit and makes long-term maintenance much easier:

┌──────────────────────────────────────────────────┐
│  Provider Contract Layer                         │
│    AIProviderType enum + extension methods       │
├──────────────────────────────────────────────────┤
│  Provider Factory Layer                          │
│    AIProviderFactory dependency injection factory│
├──────────────────────────────────────────────────┤
│  Provider Implementation Layer                   │
│    CodebuddyCliProvider concrete implementation  │
├──────────────────────────────────────────────────┤
│  ACP Infrastructure Layer                        │
│    ACPSessionManager / StdioAcpTransport         │
│    AcpRpcClient / AcpAgentClient                 │
└──────────────────────────────────────────────────┘

What are the benefits of this layering? Put simply, each layer stays out of the others’ way. If we later want to change the communication mechanism, for example from stdio to WebSocket, we only need to modify the bottom layer, and the business logic above it stays untouched. Nobody wants a communication change to ripple through the entire codebase.

The Provider contract layer is the foundation of the entire architecture. We define the AIProviderType enum (with CodebuddyCli = 3 as its value) and implement bidirectional mapping between strings and enums through extension methods. That allows strings in configuration files to be converted conveniently into enums, and enums to be converted back to strings for debugging output.

The Provider factory layer is responsible for creating the corresponding Provider instance based on configuration. It uses .NET dependency injection together with ActivatorUtilities.CreateInstance for dynamic creation. The advantage of the factory pattern is that when adding a new Provider, you only need to add the creation logic instead of modifying existing code.

The Provider implementation layer is where the actual work happens. CodebuddyCliProvider implements the IAIProvider interface and provides two invocation modes: ExecuteAsync for non-streaming calls and StreamAsync for streaming calls.

The ACP infrastructure layer provides the communication foundation underneath. This layer handles all protocol details, including process management, message serialization, and response parsing. It is the foundation that keeps everything above it stable.

CodeBuddy uses Stdio (standard input/output) to communicate with external processes. The startup command is simple:

codebuddy --acp

After that, JSON-RPC messages are exchanged through standard input and output. This approach has several advantages:

  1. Fast startup: local process communication avoids network latency
  2. Simple configuration: you only need to specify the executable path
  3. Environment isolation: each session runs in an independent process, so they do not affect one another

Environment variable injection is supported during communication. Common examples include:

  • CODEBUDDY_API_KEY: API key authentication
  • CODEBUDDY_INTERNET_ENVIRONMENT: network environment configuration

As with communication between people, it helps to choose a convenient channel first.

ACP is based on JSON-RPC 2.0. The message format looks roughly like this:

// Request message
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "agent/prompt",
  "params": {
    "prompt": "Help me write a sorting algorithm",
    "sessionId": "session-123"
  }
}

// Response message
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "content": "Here is the AI response..."
  }
}

In the real implementation, we encapsulate all of these protocol details so the upper business layer only needs to care about the prompt and response.
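To make the encapsulation concrete, here is a minimal sketch of the envelope handling alone: building a request line and parsing a response line. It deliberately omits process spawning and id correlation, which the real AcpRpcClient handles; the message shapes mirror the example above, and everything else is illustrative.

```csharp
using System;
using System.Text.Json;

// Sketch of the JSON-RPC plumbing the upper layers never see: one request
// line written to the CLI's stdin, one response line read from its stdout.
// The response here is canned; the real client correlates responses by id.
var request = new
{
    jsonrpc = "2.0",
    id = 1,
    method = "agent/prompt",
    @params = new { prompt = "Help me write a sorting algorithm", sessionId = "session-123" }
};
string requestLine = JsonSerializer.Serialize(request);

// A response line as it would arrive on stdout:
string responseLine = """{"jsonrpc":"2.0","id":1,"result":{"content":"Here is the AI response..."}}""";
using var doc = JsonDocument.Parse(responseLine);
int id = doc.RootElement.GetProperty("id").GetInt32();
string? content = doc.RootElement.GetProperty("result").GetProperty("content").GetString();
```

With this in place, the business layer only ever touches the prompt string and the extracted content.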

First, restore the CodeBuddy type in the enum file:

PCode.Models/AIProviderType.cs
public enum AIProviderType
{
    ClaudeCodeCli = 0,
    CodexCli = 1,
    GitHubCopilot = 2,
    CodebuddyCli = 3, // Restore this enum value
    OpenCodeCli = 4,
    IFlowCli = 5,
}

Then add string mapping in the extension methods so the configuration file can specify the Provider by string:

AIProviderTypeExtensions.cs
private static readonly Dictionary<string, AIProviderType> _typeMap = new(
    StringComparer.OrdinalIgnoreCase)
{
    ["CodebuddyCli"] = AIProviderType.CodebuddyCli,
    ["Codebuddy"] = AIProviderType.CodebuddyCli,
    ["codebuddy"] = AIProviderType.CodebuddyCli,
    // ... mappings for other providers
};
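The dictionary above covers only the string-to-enum direction. A self-contained sketch of the full bidirectional mapping might look like the following; the method names (`ToProviderType`, `ToConfigString`) are assumptions, and the real extension class may differ. The enum is abridged from the definition above.

```csharp
using System;
using System.Collections.Generic;

// Abridged copy of the enum above, so the sketch is self-contained.
public enum AIProviderType { ClaudeCodeCli = 0, CodebuddyCli = 3 }

// Hedged sketch of the bidirectional mapping; unknown strings fail fast
// rather than silently falling back to a default provider.
public static class AIProviderTypeExtensions
{
    private static readonly Dictionary<string, AIProviderType> _typeMap =
        new(StringComparer.OrdinalIgnoreCase)
        {
            ["CodebuddyCli"] = AIProviderType.CodebuddyCli,
            ["Codebuddy"] = AIProviderType.CodebuddyCli,
        };

    // Configuration string -> enum (case-insensitive)
    public static AIProviderType ToProviderType(this string name) =>
        _typeMap.TryGetValue(name, out var type)
            ? type
            : throw new ArgumentException($"Unknown provider: {name}");

    // Enum -> canonical string, e.g. for debug output
    public static string ToConfigString(this AIProviderType type) => type.ToString();
}
```

Failing fast on an unknown string surfaces configuration typos at startup instead of at first use.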

Add a CodeBuddy creation branch in the factory class:

AIProviderFactory.cs
private IAIProvider? CreateProvider(AIProviderType providerType, ProviderConfiguration config)
{
    return providerType switch
    {
        AIProviderType.CodebuddyCli =>
            ActivatorUtilities.CreateInstance<CodebuddyCliProvider>(
                _serviceProvider,
                Options.Create(config)),
        // ... other providers
        _ => throw new NotSupportedException($"Provider {providerType} not supported")
    };
}

This uses dependency injection through ActivatorUtilities, which automatically handles constructor parameter injection and is very convenient.

Below is the core implementation of CodebuddyCliProvider, covering both streaming and non-streaming invocation modes:

public class CodebuddyCliProvider : IAIProvider
{
    private readonly ILogger<CodebuddyCliProvider> _logger;
    private readonly IACPSessionManager _sessionManager;
    private readonly ProviderConfiguration _config;

    public string Name => "CodebuddyCli";
    public bool SupportsStreaming => true;
    public ProviderCapabilities Capabilities { get; }

    public CodebuddyCliProvider(
        ILogger<CodebuddyCliProvider> logger,
        IACPSessionManager sessionManager,
        IOptions<ProviderConfiguration> config)
    {
        _logger = logger;
        _sessionManager = sessionManager;
        _config = config.Value;

        // Define the capabilities of the current Provider
        Capabilities = new ProviderCapabilities
        {
            SupportsStreaming = true,
            SupportsTools = true,
            SupportsSystemMessages = true,
            SupportsArtifacts = false,
            MaxTokens = 8192
        };
    }

    // Non-streaming call: return all results together after completion
    public async Task<AIResponse> ExecuteAsync(
        AIRequest request,
        CancellationToken cancellationToken = default)
    {
        // Create an independent session for the request
        var session = await _sessionManager.CreateSessionAsync(
            "CodebuddyCli",
            request.WorkingDirectory,
            cancellationToken,
            request.SessionId);
        try
        {
            var fullPrompt = BuildPrompt(request);
            await session.SendPromptAsync(fullPrompt, cancellationToken);

            var responseBuilder = new StringBuilder();
            var toolCalls = new List<AIToolCall>();

            // Collect all response chunks
            await foreach (var chunk in StreamFromSession(session, cancellationToken))
            {
                if (!string.IsNullOrEmpty(chunk.Content))
                {
                    responseBuilder.Append(chunk.Content);
                }
                // Handle tool calls...
            }

            return new AIResponse
            {
                Content = AIResultContentSanitizer.SanitizeResultContent(
                    responseBuilder.ToString()),
                ToolCalls = toolCalls,
                Provider = Name,
                Model = string.Empty
            };
        }
        finally
        {
            // Release session resources
            await session.DisposeAsync();
        }
    }

    // Streaming call: return response chunks in real time
    public async IAsyncEnumerable<AIStreamingChunk> StreamAsync(
        AIRequest request,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        var session = await _sessionManager.CreateSessionAsync(
            "CodebuddyCli",
            request.WorkingDirectory,
            cancellationToken);
        try
        {
            var fullPrompt = BuildPrompt(request);
            await session.SendPromptAsync(fullPrompt, cancellationToken);

            await foreach (var chunk in StreamFromSession(session, cancellationToken))
            {
                yield return chunk;
            }
        }
        finally
        {
            await session.DisposeAsync();
        }
    }

    private async IAsyncEnumerable<AIStreamingChunk> StreamFromSession(
        IACPSession session,
        [EnumeratorCancellation] CancellationToken cancellationToken)
    {
        // Iterate through all updates in the session
        await foreach (var notification in session.ReceiveUpdatesAsync(cancellationToken))
        {
            switch (notification.Update)
            {
                case AgentMessageChunkSessionUpdate agentMessage:
                    // Handle text content chunks
                    if (agentMessage.Content is AcpImp.TextContentBlock textContent)
                    {
                        yield return new AIStreamingChunk
                        {
                            Content = textContent.Text,
                            Type = StreamingChunkType.ContentDelta,
                            IsComplete = false
                        };
                    }
                    break;

                case ToolCallSessionUpdate toolCall:
                    // Handle tool calls
                    yield return new AIStreamingChunk
                    {
                        Content = string.Empty,
                        Type = StreamingChunkType.ToolCallDelta,
                        ToolCallDelta = new AIToolCallDelta
                        {
                            Id = toolCall.ToolCallId,
                            Name = toolCall.Kind.ToString(),
                            Arguments = toolCall.RawInput?.ToString()
                        }
                    };
                    break;

                case AcpImp.PromptCompletedSessionUpdate:
                    // Response complete
                    yield break;
            }
        }
    }

    // Build the full prompt
    private string BuildPrompt(AIRequest request, string? embeddedCommandPrompt = null)
    {
        var sb = new StringBuilder();

        // Embedded command prompt, if present
        if (!string.IsNullOrEmpty(embeddedCommandPrompt))
        {
            sb.AppendLine(embeddedCommandPrompt);
            sb.AppendLine();
        }

        // System message
        if (!string.IsNullOrEmpty(request.SystemMessage))
        {
            sb.AppendLine(request.SystemMessage);
            sb.AppendLine();
        }

        // User prompt
        sb.Append(request.Prompt);
        return sb.ToString();
    }
}

There are several key points in this code:

  1. Session management: each request creates an independent session and releases resources after the request completes. This is a lesson learned through trial and error. If session reuse is not handled well, state pollution appears easily.

  2. Streaming processing: IAsyncEnumerable allows the response to be returned while it is still being generated, instead of waiting for all content to finish. This is especially important for long-text scenarios and significantly improves the user experience.

  3. Tool calls: CodeBuddy supports tool calling (Function Calling), handled through ToolCallSessionUpdate. This capability is critical for complex code editing tasks.

  4. Content filtering: AIResultContentSanitizer is used to filter Think block content and keep the output clean.
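The internals of AIResultContentSanitizer are not shown in this article. A minimal version of the idea could look like the following sketch, assuming the model wraps its reasoning in `<think>...</think>` tags; the tag name and class name here are assumptions, and the real sanitizer may handle more cases.

```csharp
using System.Text.RegularExpressions;

// Minimal sketch of the idea behind AIResultContentSanitizer: strip any
// <think>...</think> reasoning blocks so only the final answer remains.
public static class ResultContentSanitizer
{
    private static readonly Regex ThinkBlock = new(
        @"<think>.*?</think>\s*",
        RegexOptions.Singleline | RegexOptions.IgnoreCase);

    public static string Sanitize(string content) =>
        ThinkBlock.Replace(content, string.Empty).Trim();
}
```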

Add the related services during module registration:

PCodeClaudeHelperModule.cs
public void ConfigureModule(IServiceCollection services)
{
    // Register the Provider
    services.AddTransient<CodebuddyCliProvider>();

    // Register the ACP infrastructure
    services.AddSingleton<IACPSessionManager, ACPSessionManager>();
    services.AddSingleton<IAcpPlatformConfigurationResolver, AcpPlatformConfigurationResolver>();
    services.AddSingleton<IAIRequestToAcpMapper, AIRequestToAcpMapper>();
    services.AddSingleton<IAcpToAIResponseMapper, AcpToAIResponseMapper>();
}

Add the CodeBuddy-related configuration to the application settings (shown here in YAML; the same structure maps directly onto appsettings.json):

AI:
  # Default Provider to use
  DefaultProvider: "CodebuddyCli"

  # Provider configuration
  Providers:
    CodebuddyCli:
      Type: "CodebuddyCli"
      WorkingDirectory: "C:/projects/my-app"
      ExecutablePath: "C:/tools/codebuddy.cmd"

  # Platform-specific configuration
  PlatformConfigurations:
    CodebuddyCli:
      ExecutablePath: "C:/tools/codebuddy.cmd"
      Arguments: "--acp"
      StartupTimeoutMs: 5000
      EnvironmentVariables:
        CODEBUDDY_API_KEY: "${CODEBUDDY_API_KEY}"
        CODEBUDDY_INTERNET_ENVIRONMENT: "production"

The corresponding configuration model definition:

public class CodebuddyPlatformConfiguration : IAcpPlatformConfiguration
{
    public string ProviderName => "CodebuddyCli";
    public AcpTransportType TransportType => AcpTransportType.Stdio;
    public string ExecutablePath { get; set; } = "codebuddy";
    public string Arguments { get; set; } = "--acp";
    public int StartupTimeoutMs { get; set; } = 5000;
    public Dictionary<string, string?>? EnvironmentVariables { get; set; }
}

We ran into several typical pitfalls during implementation, and sharing them here may help others avoid the same detours:

  1. Session leak issue: at first, sessions were not released correctly, which exhausted process resources. The solution was to use try-finally to ensure resources are released for every request.

  2. Environment variable passing: Windows and Linux use different environment variable syntax, so we later standardized on Dictionary<string, string?> to handle this.

  3. Timeout configuration: CLI startup takes time, so we set a 5-second startup timeout to avoid fast request failures.

  4. Encoding issues: on Windows, the default encoding may cause garbled Chinese text, so UTF-8 encoding is explicitly specified when starting the process.
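The fix for pitfall 4 comes down to a few properties on ProcessStartInfo. This is a sketch of the fix, not the project's exact startup code; the explicit encodings require `UseShellExecute = false` with redirected streams.

```csharp
using System.Diagnostics;
using System.Text;

// Force UTF-8 on the redirected pipes so non-ASCII text (e.g. Chinese)
// survives the round trip on Windows, where the default console code page
// would otherwise garble it.
var psi = new ProcessStartInfo("codebuddy", "--acp")
{
    UseShellExecute = false,
    RedirectStandardInput = true,
    RedirectStandardOutput = true,
    RedirectStandardError = true,
    StandardInputEncoding = Encoding.UTF8,
    StandardOutputEncoding = Encoding.UTF8,
    StandardErrorEncoding = Encoding.UTF8
};
```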

A few directions for performance optimization:

  1. Session pool: for frequent short requests, consider implementing a session pool to reuse processes
  2. Connection cache: the factory class already supports caching Provider instances
  3. Async first: use asynchronous programming throughout to avoid blocking threads
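The session-pool idea from point 1 can be sketched with a small bounded pool. Everything here (names, the `TSession` placeholder for IACPSession, the cap) is illustrative, not HagiCode's actual implementation; health checks and disposal are omitted.

```csharp
using System;
using System.Collections.Concurrent;

// Illustrative bounded session pool: reuse idle sessions instead of paying
// process-startup cost on every short request.
public sealed class SessionPool<TSession>
{
    private readonly ConcurrentBag<TSession> _idle = new();
    private readonly Func<TSession> _factory;
    private readonly int _maxIdle;

    public SessionPool(Func<TSession> factory, int maxIdle)
    {
        _factory = factory;
        _maxIdle = maxIdle;
    }

    // Hand out an idle session if one exists, otherwise create a fresh one.
    public TSession Rent() => _idle.TryTake(out var session) ? session : _factory();

    // Keep the session for reuse up to the cap; beyond it, the caller disposes.
    public bool Return(TSession session)
    {
        if (_idle.Count >= _maxIdle) return false;
        _idle.Add(session);
        return true;
    }
}
```

A real pool would also need to validate that a returned session's process is still alive before handing it out again, which is exactly the state-pollution risk mentioned earlier.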

Performance is always worth optimizing. The longer users wait, the worse the experience becomes.

This article introduced a complete solution for integrating CodeBuddy CLI into a C# backend, covering the entire process from architecture design to concrete implementation. Through a layered architecture, we separate protocol details from business logic, making the code clearer and easier to maintain.

Key takeaways:

  • Use a layered architecture with a Provider contract layer, factory layer, implementation layer, and infrastructure layer
  • Use JSON-RPC over Stdio for inter-process communication
  • Implement flexible configuration and extensibility through dependency injection
  • Provide both streaming and non-streaming invocation modes

This approach is not only suitable for CodeBuddy; adding new AI Providers can follow the same pattern. If you are also building a similar multi-AI-Provider integration, I hope this article gives you a useful reference.




Practical Multi-AI Provider Architecture in the HagiCode Platform

This article shares the technical approach we used under the Orleans Grain architecture to integrate two AI tools, iflow and OpenCode, through a unified IAIProvider interface, and compares the implementation differences between WebSocket and HTTP communication in detail.

There is nothing especially mysterious about it. While building HagiCode, we ran into a very practical problem: users wanted to work with different AI tools. That is hardly surprising, since everyone has their own habits. Some prefer Claude Code, some love GitHub Copilot, and some teams use tools they developed themselves.

Our initial solution was simple and direct: write dedicated integration code for each AI tool. But the drawbacks showed up quickly. The codebase filled up with if-else branches, every change required testing in multiple places, and every new tool meant writing another pile of logic from scratch.

Later, I realized it would be better to create a unified IAIProvider interface and abstract the capabilities shared by all AI providers. That way, no matter which tool is used underneath, the upper layers can call it in the same way.

Recently, the project needed to integrate two new tools: iflow and OpenCode. Both support the ACP protocol, but their communication styles are different. iflow uses WebSocket, while OpenCode uses an HTTP API. That became a useful architectural test: adapt two different transport modes behind one unified interface.

The approach shared in this article comes from our practical experience in the HagiCode project. HagiCode is an AI-assisted development platform built on the Orleans Grain architecture. It integrates with different AI providers through a unified IAIProvider interface, allowing users to flexibly choose the AI tools they prefer.

First, we defined the IAIProvider interface and abstracted the capabilities that every AI provider needs to implement:

public interface IAIProvider
{
    string Name { get; }
    bool SupportsStreaming { get; }
    ProviderCapabilities Capabilities { get; }

    Task<AIResponse> ExecuteAsync(
        AIRequest request,
        CancellationToken cancellationToken = default);

    IAsyncEnumerable<AIStreamingChunk> StreamAsync(
        AIRequest request,
        CancellationToken cancellationToken = default);

    Task<ProviderTestResult> PingAsync(
        CancellationToken cancellationToken = default);

    IAsyncEnumerable<AIStreamingChunk> SendMessageAsync(
        AIRequest request,
        string? embeddedCommandPrompt = null,
        CancellationToken cancellationToken = default);
}

This interface includes several key methods:

  • ExecuteAsync: execute a one-shot AI request
  • StreamAsync: get streaming responses for real-time display
  • PingAsync: perform a health check to verify whether the provider is available
  • SendMessageAsync: send a message with support for embedded commands

IFlowCliProvider: A WebSocket-Based Implementation


iflow uses WebSocket for ACP communication. The overall architecture looks like this:

IFlowCliProvider → ACPSessionManager → WebSocketAcpTransport → iflow CLI
Dynamic port allocation + process management

The core flow is also fairly straightforward:

  1. ACPSessionManager creates and manages ACP sessions.
  2. WebSocketAcpTransport handles WebSocket communication.
  3. A port is allocated dynamically, and the iflow process is started with iflow --experimental-acp --port.
  4. IAIRequestToAcpMapper and IAcpToAIResponseMapper convert requests and responses.
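Step 3's dynamic port allocation is commonly done by binding to port 0 and reading back the port the OS assigned. The sketch below shows that technique; it is not necessarily how WebSocketAcpTransport implements it, and note the small race window between releasing the port and the iflow process binding it.

```csharp
using System;
using System.Net;
using System.Net.Sockets;

// Ask the OS for a free TCP port by binding to port 0, then release it and
// hand the number to the CLI on startup.
static int GetFreeTcpPort()
{
    var listener = new TcpListener(IPAddress.Loopback, 0);
    listener.Start();
    int port = ((IPEndPoint)listener.LocalEndpoint).Port;
    listener.Stop();
    return port;
}

int port = GetFreeTcpPort();
string startCommand = $"iflow --experimental-acp --port {port}";
```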

Here is the core code:

private async IAsyncEnumerable<AIStreamingChunk> StreamCoreAsync(
    AIRequest request,
    string? embeddedCommandPrompt,
    [EnumeratorCancellation] CancellationToken cancellationToken)
{
    // Resolve the working directory
    var resolvedWorkingDirectory = ResolveWorkingDirectory(request);
    var effectiveRequest = ApplyEmbeddedCommandPrompt(request, embeddedCommandPrompt);

    // Create an ACP session
    await using var session = await _sessionManager.CreateSessionAsync(
        Name,
        resolvedWorkingDirectory,
        cancellationToken,
        request.SessionId);

    // Send the prompt
    var prompt = _requestMapper.ToPromptString(effectiveRequest);
    var promptResponse = await session.SendPromptAsync(prompt, cancellationToken);

    // Receive the streaming response
    await foreach (var notification in session.ReceiveUpdatesAsync(cancellationToken))
    {
        if (_responseMapper.TryConvertToStreamingChunk(notification, out var chunk))
        {
            if (chunk.Type == StreamingChunkType.Metadata && chunk.IsComplete)
            {
                yield return chunk;
                yield break;
            }
            yield return chunk;
        }
    }
}

There are a few design points worth calling out here:

  • Use await using to ensure the session is released correctly and avoid resource leaks.
  • Return streaming responses through IAsyncEnumerable, which naturally supports async streams.
  • Use Metadata chunks to determine completion and ensure the full response has been received.

OpenCodeCliProvider: An HTTP API-Based Implementation


OpenCode provides its service through an HTTP API, so the architecture is slightly different:

OpenCodeCliProvider → OpenCodeRuntimeManager → OpenCodeClient → OpenCode HTTP API
OpenCodeProcessManager → opencode process management

A notable feature of OpenCode is that it uses an SQLite database to persist session bindings. That makes session recovery and prompt-response recovery possible:

private async Task<OpenCodePromptExecutionResult> ExecutePromptAsync(
    AIRequest request,
    string? embeddedCommandPrompt,
    CancellationToken cancellationToken)
{
    var prompt = BuildPrompt(request, embeddedCommandPrompt);
    var resolvedWorkingDirectory = ResolveWorkingDirectory(request.WorkingDirectory);
    var client = await _runtimeManager.GetClientAsync(resolvedWorkingDirectory, cancellationToken);

    var bindingSessionId = request.SessionId;
    var boundSession = TryGetBinding(bindingSessionId, resolvedWorkingDirectory);

    // Try to use the already bound session
    if (boundSession is not null)
    {
        try
        {
            return await PromptSessionAsync(
                client,
                boundSession,
                BuildPromptRequest(request, prompt, CreatePromptMessageId()),
                request.Model ?? _settings.Model,
                cancellationToken);
        }
        catch (OpenCodeApiException ex) when (IsStaleBinding(ex))
        {
            // The session has expired; remove the binding
            RemoveBinding(bindingSessionId);
        }
    }

    // Create a new session
    var session = await client.Session.CreateAsync(new OpenCodeSessionCreateRequest
    {
        Title = BuildSessionTitle(request)
    }, cancellationToken);

    BindSession(bindingSessionId, session.Id, resolvedWorkingDirectory);
    return await PromptSessionAsync(client, session.Id, ...);
}

This implementation has several interesting highlights:

  • Session binding mechanism: the same SessionId reuses the same OpenCode session, avoiding repeated session creation.
  • Expiration handling: when a session is found to be expired, the binding is automatically cleaned up.
  • Database persistence: bindings are stored in SQLite and remain effective after restart.
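The shape of that binding store can be sketched as below. This in-memory version only shows the lookup contract; the real store persists the same mapping in SQLite so bindings survive restarts. All names here are illustrative, not HagiCode's actual types.

```csharp
using System.Collections.Generic;

// HagiCode's SessionId maps to an OpenCode session id, keyed also by working
// directory so the same logical session in another directory gets its own
// OpenCode session.
public sealed record SessionBinding(string OpenCodeSessionId, string WorkingDirectory);

public sealed class BindingStore
{
    private readonly Dictionary<string, SessionBinding> _bindings = new();

    // Return the bound OpenCode session id, or null if there is no binding
    // for this session in this working directory.
    public string? TryGet(string sessionId, string workingDirectory) =>
        _bindings.TryGetValue(sessionId, out var b) && b.WorkingDirectory == workingDirectory
            ? b.OpenCodeSessionId
            : null;

    public void Bind(string sessionId, string openCodeSessionId, string workingDirectory) =>
        _bindings[sessionId] = new SessionBinding(openCodeSessionId, workingDirectory);

    // Called when a stale binding is detected (OpenCodeApiException).
    public void Remove(string sessionId) => _bindings.Remove(sessionId);
}
```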

| Aspect | IFlowCliProvider | OpenCodeCliProvider |
| --- | --- | --- |
| Communication | WebSocket (ACP) | HTTP API |
| Process management | ACPSessionManager | OpenCodeProcessManager |
| Port allocation | Dynamic port | No port (uses HTTP) |
| Session management | ACPSession | OpenCodeSession |
| Persistence | In-memory cache | SQLite database |
| Startup command | iflow --experimental-acp --port | opencode |
| Latency | Lower (long-lived connection) | Relatively higher (HTTP requests) |

Which approach you choose depends mainly on your needs. WebSocket is better for scenarios with high real-time requirements, while an HTTP API is simpler and easier to debug.

First, enable the two providers in the configuration file:

AI:
  Providers:
    IFlowCli:
      Type: "IFlowCli"
      Enabled: true
      ExecutablePath: "iflow"
      Model: null
      WorkingDirectory: null
    OpenCodeCli:
      Type: "OpenCodeCli"
      Enabled: true
      ExecutablePath: "opencode"
      Model: "anthropic/claude-sonnet-4"
      WorkingDirectory: null
  OpenCode:
    Enabled: true
    BaseUrl: "http://localhost:38376"
    ExecutablePath: "opencode"
    StartupTimeoutSeconds: 30
    RequestTimeoutSeconds: 120
Then obtain a provider through the factory and invoke it:

// Get the provider through the factory
var provider = await _providerFactory.GetProviderAsync(AIProviderType.IFlowCli);

// Execute an AI request
var request = new AIRequest
{
    Prompt = "Help me refactor this function",
    WorkingDirectory = "/path/to/project",
    Model = "claude-sonnet-4"
};

// Get the complete response
var response = await provider.ExecuteAsync(request, cancellationToken);
Console.WriteLine(response.Content);

// Or use streaming responses
await foreach (var chunk in provider.StreamAsync(request, cancellationToken))
{
    if (chunk.Type == StreamingChunkType.ContentDelta)
    {
        Console.Write(chunk.Content);
    }
}
The OpenCode provider is called in exactly the same way:

// Get the provider through the factory
var provider = await _providerFactory.GetProviderAsync(AIProviderType.OpenCodeCli);

var request = new AIRequest
{
    Prompt = "Help me analyze this error",
    WorkingDirectory = "/path/to/project",
    Model = "anthropic/claude-sonnet-4"
};

var response = await provider.ExecuteAsync(request, cancellationToken);
Console.WriteLine(response.Content);

Before startup or before use, you can check whether the provider is available:

var iflowResult = await iflowProvider.PingAsync(cancellationToken);
if (!iflowResult.Success)
{
    Console.WriteLine($"IFlow is unavailable: {iflowResult.ErrorMessage}");
    return;
}

var openCodeResult = await openCodeProvider.PingAsync(cancellationToken);
if (!openCodeResult.Success)
{
    Console.WriteLine($"OpenCode is unavailable: {openCodeResult.ErrorMessage}");
    return;
}

Both providers support embedded commands, such as /file:xxx:

var request = new AIRequest
{
    Prompt = "Analyze the problems in this file",
    SystemMessage = "You are a code analysis expert"
};

await foreach (var chunk in provider.SendMessageAsync(
    request,
    embeddedCommandPrompt: "/file:src/main.cs",
    cancellationToken))
{
    Console.Write(chunk.Content);
}

IFlow uses long-lived WebSocket connections, so resource management deserves special attention:

  • Use await using to ensure sessions are released properly.
  • Cancellation triggers process cleanup.
  • ACPSessionManager supports a maximum session count limit.

OpenCode process management is relatively simpler, and OpenCodeRuntimeManager handles it automatically.

Both providers have complete error handling:

  • IFlow errors are propagated through ACP session updates.
  • OpenCode errors are thrown as OpenCodeApiException.
  • Callers should catch and handle these exceptions.

On the performance side:

  • IFlow's WebSocket communication has lower latency than HTTP.
  • OpenCode session reuse reduces the overhead of repeated HTTP requests.
  • The factory's cache mechanism avoids repeatedly creating providers.
  • In high-concurrency scenarios, pay close attention to process-count and connection-count limits.

The executable path is validated at startup, but runtime issues can still happen. PingAsync is a useful tool for verifying whether the configuration is correct:

// Check at startup
var provider = await _providerFactory.GetProviderAsync(providerType);
var result = await provider.PingAsync(cancellationToken);
if (!result.Success)
{
    _logger.LogError("Provider {ProviderType} is unavailable: {Error}", providerType, result.ErrorMessage);
}

This article shares the technical approach used by the HagiCode platform when integrating the two AI tools iflow and OpenCode. Through a unified IAIProvider interface, we adapted different communication styles, WebSocket and HTTP, while keeping the upper-layer calling pattern consistent.

The core idea is actually quite simple:

  1. Define a unified interface abstraction.
  2. Build adapter layers for different implementations.
  3. Manage everything uniformly through the factory pattern.

That gives the system good extensibility. When a new AI tool needs to be integrated later, all we need to do is implement the IAIProvider interface without changing too much existing code.

If you are also working on multi-AI-tool integration, I hope this article is helpful.

