
Hermes Agent Integration Practice: From Protocol to Production

Sharing the complete HagiCode experience of integrating Hermes Agent, including core lessons around ACP protocol adaptation, session pool management, and front-end/back-end contract synchronization.

While building HagiCode, an AI-assisted coding platform, our team needed to integrate an Agent framework that could run locally and also scale to the cloud. After research, Hermes Agent from Nous Research was chosen as the underlying engine for our general-purpose Agent capabilities.

In truth, technology selection is neither especially hard nor especially easy. There are plenty of strong Agent frameworks on the market, but Hermes stood out because its ACP protocol and tool system fit HagiCode’s demanding requirements particularly well: local development, team collaboration, and cloud expansion. Still, bringing Hermes into a real production system meant solving a long list of engineering problems. This part was anything but trivial.

HagiCode’s stack uses Orleans to build a distributed system, while the front end is built with React + TypeScript. Integrating Hermes meant preserving architectural consistency while making Hermes a first-class executor alongside ClaudeCode and OpenCode. It sounds simple enough, but implementation always tells the real story.

This article shares our practical experience integrating Hermes Agent into HagiCode, and we hope it offers useful reference material for teams facing similar needs. After all, once you’ve fallen into a pit, there is no reason to let someone else fall into the same one.

The solution described in this article comes from our hands-on work in the HagiCode project. HagiCode is an AI-driven coding assistance platform that supports unified access to and management of multiple AI Providers. During the Hermes Agent integration, we designed a generic Provider abstraction layer so new Agent types could plug into the existing system seamlessly.

If you’re interested in HagiCode, feel free to visit GitHub to learn more. The more people who pay attention, the stronger the momentum.

HagiCode’s Hermes integration uses a clear layered architecture, with each layer focused on its own responsibilities:

Back-end core layer

  • HermesCliProvider: implements the IAIProvider interface as the unified AI Provider entry point
  • HermesPlatformConfiguration: manages Hermes executable path, arguments, authentication, and related settings
  • ICliProvider<HermesOptions>: the low-level CLI abstraction provided by HagiCode.Libs for handling subprocess lifecycles

Transport layer

  • StdioAcpTransport: communicates with the Hermes ACP subprocess through standard input and output
  • ACP protocol methods: initialize, authenticate, session/new, session/prompt

Runtime layer

  • HermesGrain: Orleans Grain implementation that handles distributed session execution
  • CliAcpSessionPool: session pool that reuses ACP subprocesses to avoid frequent startup overhead

Front-end layer

  • ExecutorAvatar: Hermes visual identity and icon
  • executorTypeAdapter: Provider type mapping logic
  • SignalR real-time messaging: maintains Hermes identity consistency throughout the message stream

This layered design allows each layer to evolve independently. For example, if we want to add a new transport mechanism in the future, such as WebSocket, we only need to modify the transport layer. There is no need to overhaul the whole system just because one transport changes.

All AI Providers implement the IAIProvider interface, which is one of the core design choices in HagiCode’s architecture:

public interface IAIProvider
{
    string Name { get; }
    ProviderCapabilities Capabilities { get; }

    IAsyncEnumerable<AIStreamingChunk> StreamAsync(
        AIRequest request,
        CancellationToken cancellationToken = default);

    Task<AIResponse> ExecuteAsync(
        AIRequest request,
        CancellationToken cancellationToken = default);
}

HermesCliProvider implements this interface and stands on equal footing with ClaudeCodeProvider, OpenCodeProvider, and others. The benefits of this design include:

  1. Replaceability: switching Providers does not affect upper-layer business logic
  2. Testability: Providers can be mocked easily for unit testing
  3. Extensibility: adding a new Provider only requires implementing the interface

In the end, interfaces are a lot like rules. Once the rules are in place, everyone can coexist harmoniously, play to their strengths, and avoid stepping on each other. There is a certain elegance in that.
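
To make the testability point concrete, here is a hedged TypeScript sketch of the same idea: a minimal provider contract plus a mock. The interface shape loosely mirrors the C# IAIProvider above, but every name here is illustrative rather than an actual HagiCode type.

```typescript
// Hypothetical TypeScript analogue of the IAIProvider contract,
// shown only to illustrate why a unified interface makes mocking easy.
interface AIRequest {
  prompt: string;
  sessionId?: string;
}

interface AIResponse {
  content: string;
}

interface AIProvider {
  readonly name: string;
  execute(request: AIRequest): Promise<AIResponse>;
}

// A mock provider: upper-layer logic can be tested without any subprocess.
class MockProvider implements AIProvider {
  readonly name = "mock";
  async execute(request: AIRequest): Promise<AIResponse> {
    return { content: `echo: ${request.prompt}` };
  }
}

// Business logic depends only on the interface, so swapping providers
// (Hermes, ClaudeCode, a mock) does not change this function at all.
async function summarize(provider: AIProvider, text: string): Promise<string> {
  const response = await provider.execute({ prompt: `Summarize: ${text}` });
  return response.content;
}
```

Because `summarize` never names a concrete provider, a unit test can pass in `MockProvider` while production wires in the real Hermes-backed implementation.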

HermesCliProvider is the core of the entire integration. It coordinates the various components needed to complete a single AI invocation:

public sealed class HermesCliProvider : IAIProvider, IVersionedAIProvider
{
    private readonly ICliProvider<LibsHermesOptions> _provider;
    private readonly ConcurrentDictionary<string, string> _sessionBindings;

    public ProviderCapabilities Capabilities { get; } = new()
    {
        SupportsStreaming = true,
        SupportsTools = true,
        SupportsSystemMessages = true,
        SupportsArtifacts = false
    };

    public async IAsyncEnumerable<AIStreamingChunk> StreamAsync(
        AIRequest request,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        // 1. Resolve the session binding key
        var bindingKey = ResolveBindingKey(request.SessionId);

        // 2. Get or create a Hermes session through the session pool
        var options = new HermesOptions
        {
            ExecutablePath = _platformConfiguration.ExecutablePath,
            Arguments = _platformConfiguration.Arguments,
            SessionId = _sessionBindings.TryGetValue(bindingKey, out var sessionId) ? sessionId : null,
            WorkingDirectory = request.WorkingDirectory,
            Model = request.Model
        };

        // 3. Execute and collect the streaming response
        await foreach (var message in _provider.ExecuteAsync(options, request.Prompt, cancellationToken))
        {
            // 4. Map ACP messages to AIStreamingChunk
            if (_responseMapper.TryConvertToStreamingChunk(message, out var chunk))
            {
                yield return chunk;
            }
        }
    }
}

Several design points are especially important here:

  1. Session binding: uses SessionId to bind multiple requests to the same Hermes subprocess, preserving context continuity across multi-turn conversations
  2. Response mapping: converts Hermes ACP message format into the unified AIStreamingChunk format
  3. Streaming support: uses IAsyncEnumerable to support true streaming responses

Session binding is a bit like human relationships. Once a connection is established, future communication has context, so you do not need to start from zero each time. Of course, that relationship still has to be maintained.

Hermes uses ACP (Agent Communication Protocol), which differs from a traditional HTTP API. ACP is a protocol based on standard input and output, and it has several characteristics:

  1. Startup marker: after the Hermes process starts, it outputs the //ready marker
  2. Dynamic authentication: authentication methods are not fixed and must be negotiated through the protocol
  3. Session reuse: established sessions are reused through SessionId
  4. Fragmented responses: a complete response may be split across multiple session/update notifications

HagiCode handles these characteristics through StdioAcpTransport:

public class StdioAcpTransport
{
    public async Task InitializeAsync(CancellationToken cancellationToken)
    {
        // Wait for the //ready marker
        var readyLine = await _outputReader.ReadLineAsync(cancellationToken);
        if (readyLine != "//ready")
        {
            throw new InvalidOperationException("Hermes did not send ready signal");
        }

        // Send the initialize request
        await SendRequestAsync(new
        {
            jsonrpc = "2.0",
            id = 1,
            method = "initialize",
            @params = new
            {
                protocolVersion = "2024-11-05",
                capabilities = new { },
                clientInfo = new { name = "HagiCode", version = "1.0.0" }
            }
        }, cancellationToken);
    }
}

Protocols are a bit like mutual understanding between people. Once that understanding is there, communication flows much more smoothly. Building it just takes time.
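
The fourth characteristic, fragmented responses, is worth a concrete sketch. Assuming each session/update notification carries a text delta and a final result message marks completion (the field names below are illustrative, not the exact ACP schema), aggregation could look like this in TypeScript:

```typescript
// Illustrative shapes for ACP-style notifications; the real ACP schema
// may differ — this only demonstrates the aggregation pattern.
interface SessionUpdate {
  method: "session/update";
  params: { sessionId: string; delta: string };
}

interface SessionResult {
  method: "session/result";
  params: { sessionId: string };
}

type AcpNotification = SessionUpdate | SessionResult;

// Accumulate deltas per session until the final result arrives.
class ResponseAggregator {
  private buffers = new Map<string, string[]>();

  // Returns the full response once complete, otherwise null.
  handle(notification: AcpNotification): string | null {
    const { sessionId } = notification.params;
    if (notification.method === "session/update") {
      const parts = this.buffers.get(sessionId) ?? [];
      parts.push(notification.params.delta);
      this.buffers.set(sessionId, parts);
      return null;
    }
    const full = (this.buffers.get(sessionId) ?? []).join("");
    this.buffers.delete(sessionId);
    return full;
  }
}
```

The important property is that no consumer ever sees a half-assembled response: updates are buffered per session and only the completion message releases the joined text.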

Starting Hermes subprocesses frequently is expensive, so we implemented a session pool mechanism:

services.AddSingleton(static _ =>
{
    var registry = new CliProviderPoolConfigurationRegistry();
    registry.Register("hermes", new CliPoolSettings
    {
        MaxActiveSessions = 50,
        IdleTimeout = TimeSpan.FromMinutes(10)
    });
    return registry;
});

Key session pool parameters:

  • MaxActiveSessions: controls the concurrency limit to avoid exhausting resources
  • IdleTimeout: idle timeout that balances startup cost against memory usage

In practice, we found that:

  1. If the idle timeout is too short, sessions restart frequently; if it is too long, memory remains occupied
  2. The concurrency limit must be tuned according to actual load, because setting it too high can make the system sluggish
  3. Session pool utilization needs monitoring so parameters can be adjusted in time
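
As a rough TypeScript sketch of the trade-offs above — warm reuse inside the idle window, a hard concurrency ceiling, and expiry after the timeout — consider the following. This is a deliberate simplification, not the actual CliAcpSessionPool:

```typescript
// A minimal idle-timeout session pool. The clock is injectable so the
// expiry behaviour is testable without real waiting.
interface PoolSettings {
  maxActiveSessions: number;
  idleTimeoutMs: number;
}

interface PooledSession {
  id: string;
  lastUsedAt: number; // epoch millis
}

class SessionPool {
  private sessions = new Map<string, PooledSession>();

  constructor(
    private settings: PoolSettings,
    private now: () => number = Date.now,
  ) {}

  // Reuse a live session for the key, or create one if missing/expired.
  acquire(key: string): PooledSession {
    const existing = this.sessions.get(key);
    if (existing && this.now() - existing.lastUsedAt < this.settings.idleTimeoutMs) {
      existing.lastUsedAt = this.now();
      return existing; // warm reuse: no subprocess startup cost
    }
    if (existing) this.sessions.delete(key); // idle timeout expired

    if (this.sessions.size >= this.settings.maxActiveSessions) {
      throw new Error("session pool exhausted"); // concurrency ceiling hit
    }
    const created: PooledSession = { id: `${key}-${this.now()}`, lastUsedAt: this.now() };
    this.sessions.set(key, created);
    return created;
  }
}
```

Tuning then becomes visible in code: a larger `idleTimeoutMs` keeps more warm sessions (more memory, fewer cold starts), while `maxActiveSessions` caps how many subprocesses can exist at once.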

This is much like many choices in life: being too aggressive creates problems, while being too conservative misses opportunities. The goal is simply to find the right balance.

The front end needs to correctly identify the Hermes Provider and display the corresponding visual elements:

executorTypeAdapter.ts
export const resolveExecutorVisualTypeFromProviderType = (
  providerType: PCode_Models_AIProviderType | null | undefined
): ExecutorVisualType => {
  switch (providerType) {
    case PCode_Models_AIProviderType.HERMES_CLI:
      return 'Hermes';
    default:
      return 'Unknown';
  }
};

Hermes has its own icon and color identity:

ExecutorAvatar.tsx
const renderExecutorGlyph = (executorType: ExecutorVisualType, iconSize: number) => {
  switch (executorType) {
    case 'Hermes':
      return (
        <svg viewBox="0 0 24 24" fill="none" className="h-4 w-4">
          <rect x="4" y="4" width="16" height="16" rx="4" fill="currentColor" opacity="0.16" />
          <path d="M8 7v10M16 7v10M8 12h8" stroke="currentColor" strokeWidth="2" strokeLinecap="round" />
        </svg>
      );
    default:
      return <DefaultAvatar />;
  }
};

After all, beautiful things deserve beautiful presentation. Making sure that beauty is actually visible still depends on front-end craftsmanship.

The front end and back end keep their contract aligned through OpenAPI generation. The back end defines the AIProviderType enum:

public enum AIProviderType
{
    Unknown,
    ClaudeCode,
    OpenCode,
    HermesCli // Newly added
}

The front end generates the corresponding TypeScript type through OpenAPI, ensuring enum values stay consistent. This is the key to avoiding the front end displaying Unknown.

A contract is a lot like a promise. Once agreed, it has to be honored, otherwise you end up in awkward situations like Unknown.
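
On the front end, TypeScript can even turn this contract into a compile-time check. In the sketch below, the local enum stands in for the OpenAPI-generated type; the exhaustiveness trick with `never` means that if the backend adds an enum member and the mapping is not updated after regeneration, the code stops compiling instead of silently rendering Unknown:

```typescript
// Stand-in for the OpenAPI-generated enum; in the real project this type
// is regenerated from the backend's AIProviderType.
enum AIProviderType {
  Unknown = "Unknown",
  ClaudeCode = "ClaudeCode",
  OpenCode = "OpenCode",
  HermesCli = "HermesCli",
}

type ExecutorVisualType = "ClaudeCode" | "OpenCode" | "Hermes" | "Unknown";

// Exhaustive switch: an unmapped enum member makes the `never`
// assignment in the default branch a compile error.
function resolveVisualType(t: AIProviderType): ExecutorVisualType {
  switch (t) {
    case AIProviderType.ClaudeCode:
      return "ClaudeCode";
    case AIProviderType.OpenCode:
      return "OpenCode";
    case AIProviderType.HermesCli:
      return "Hermes";
    case AIProviderType.Unknown:
      return "Unknown";
    default: {
      const unreachable: never = t; // fails to compile if a case is missing
      return unreachable;
    }
  }
}
```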

Hermes configuration is managed through appsettings.json:

{
  "Providers": {
    "HermesCli": {
      "ExecutablePath": "hermes",
      "Arguments": "acp",
      "StartupTimeoutMs": 10000,
      "ClientName": "HagiCode",
      "Authentication": {
        "PreferredMethodId": "api-key",
        "MethodInfo": {
          "api-key": "your-api-key-here"
        }
      },
      "SessionDefaults": {
        "Model": "claude-sonnet-4-20250514",
        "ModeId": "default"
      }
    }
  }
}

This configuration-driven design brings flexibility:

  • executable paths can be overridden, which is convenient for development and testing
  • startup arguments can be customized to match different Hermes versions
  • authentication information can be configured to support multiple authentication methods

Configuration is a bit like multiple-choice questions in life. If enough options are available, there is usually one that fits. That said, too many options can create decision fatigue of their own.

Building a reliable Provider requires comprehensive health checks:

public async Task<ProviderTestResult> PingAsync(CancellationToken cancellationToken = default)
{
    var stopwatch = Stopwatch.StartNew();
    var response = await ExecuteAsync(new AIRequest
    {
        Prompt = "Reply with exactly PONG.",
        SessionId = null,
        AllowedTools = Array.Empty<string>(),
        WorkingDirectory = ResolveWorkingDirectory(null)
    }, cancellationToken);

    var success = string.Equals(response.Content.Trim(), "PONG", StringComparison.OrdinalIgnoreCase);
    return new ProviderTestResult
    {
        ProviderName = Name,
        Success = success,
        ResponseTimeMs = stopwatch.ElapsedMilliseconds,
        ErrorMessage = success ? null : $"Unexpected Hermes ping response: '{response.Content}'."
    };
}

Points to watch in health checks:

  1. Use simple test cases and avoid overly complex scenarios
  2. Set reasonable timeout values
  3. Record response time to support performance analysis
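
As a hedged illustration of the second point, here is a generic timeout wrapper around an async health probe in TypeScript. The names are illustrative, not HagiCode's actual implementation:

```typescript
// Wrap any async health probe in a timeout so a hung subprocess
// cannot stall the health check indefinitely.
interface PingResult {
  success: boolean;
  responseTimeMs: number;
  errorMessage?: string;
}

async function pingWithTimeout(
  probe: () => Promise<string>,
  timeoutMs: number,
): Promise<PingResult> {
  const startedAt = Date.now();
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`ping timed out after ${timeoutMs}ms`)), timeoutMs);
  });
  try {
    // Whichever settles first wins: the probe's answer or the timeout.
    const content = await Promise.race([probe(), timeout]);
    const success = content.trim().toUpperCase() === "PONG";
    return {
      success,
      responseTimeMs: Date.now() - startedAt, // recorded for performance analysis
      errorMessage: success ? undefined : `Unexpected ping response: '${content}'`,
    };
  } catch (err) {
    return { success: false, responseTimeMs: Date.now() - startedAt, errorMessage: String(err) };
  } finally {
    clearTimeout(timer); // always release the pending timer
  }
}
```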

Just as people need physical checkups, systems need health checks too. The sooner issues are found, the easier they are to fix.

HagiCode provides a dedicated console for validating Hermes integration:

# Basic validation
HagiCode.Libs.Hermes.Console --test-provider
# Full suite (including repository analysis)
HagiCode.Libs.Hermes.Console --test-provider-full --repo .
# Custom executable
HagiCode.Libs.Hermes.Console --test-provider-full --executable /path/to/hermes

This tool is extremely useful during development because it lets us quickly verify whether the integration is correct. After all, no one wants to wait until a problem surfaces before remembering to test.

Authentication failure

  • Check whether Authentication.PreferredMethodId matches the authentication method Hermes actually supports
  • Confirm the authentication information format is correct, such as API Key or Bearer Token

Session timeout

  • Increase the StartupTimeoutMs value
  • Check MCP server reachability
  • Review system resource utilization

Incomplete response

  • Ensure session/update notifications and the final result are aggregated correctly
  • Check cancellation logic in streaming handling
  • Verify error handling is complete

Front end displays Unknown

  • Confirm OpenAPI generation already includes the HermesCli enum value
  • Check whether type mapping is correct
  • Clear browser cache and regenerate types

Problems will always exist. When they appear, the important thing is not to panic. Trace the cause step by step, and in the end, most of them can be solved.

  1. Use the session pool: reuse ACP subprocesses to reduce startup overhead
  2. Set timeouts appropriately: balance memory use against startup cost
  3. Reuse session IDs: use the same SessionId for batch tasks
  4. Configure MCP on demand: avoid unnecessary tool invocations

Performance is a lot like efficiency in daily life. When you get it right, you achieve more with less; when you get it wrong, effort multiplies while results shrink. Finding that “right” point takes both experience and luck.

Integrating Hermes Agent into a production system requires considering problems across multiple dimensions:

  1. Architecture: design a unified Provider interface and implement a replaceable component architecture
  2. Protocol: correctly handle ACP-specific behavior such as startup markers and dynamic authentication
  3. Performance: reuse resources through the session pool and balance startup cost against memory usage
  4. Front end: ensure contract synchronization and provide a consistent visual experience

HagiCode’s experience shows that with good layered design and configuration-driven implementation, a complex Agent system can be integrated seamlessly into an existing architecture.

These principles sound simple when described in words, but actual implementation always introduces many different kinds of problems. That is fine. If a problem gets solved, it becomes experience. If it does not, it becomes a lesson. Either way, it still has value.

Beautiful things or people do not need to be possessed; as long as they remain beautiful, it is enough to quietly appreciate that beauty. Technology is much the same. If it helps make the system better, then the specific framework or protocol matters far less than people sometimes think.

Thank you for reading. If you found this article helpful, feel free to like, bookmark, and share it. This content was created with AI-assisted collaboration, and the final content was reviewed and confirmed by the author.

From CLI Calls to SDK Integration: Best Practices for GitHub Copilot in .NET Projects

The upgrade path from command-line invocation to official SDK integration has been quite a journey. Today, I want to share the pitfalls we ran into and what we learned while building HagiCode.

After the GitHub Copilot SDK was officially released in 2025, we started integrating it into our AI capability layer. Before that, the project mainly used GitHub Copilot by directly invoking the Copilot CLI command-line tool, but that approach had several obvious issues:

  • Complex process management: We had to manually manage the CLI process lifecycle, startup timeouts, and process cleanup. Processes can crash without warning.
  • Incomplete event handling: Raw CLI invocation makes it hard to capture fine-grained events from model reasoning and tool execution. It is like seeing only the result without the thinking process.
  • Difficult session management: There was no effective mechanism for session reuse and recovery, so every interaction had to start over.
  • Compatibility problems: CLI arguments changed frequently, which meant we constantly had to maintain compatibility logic for those parameters.

These issues became increasingly apparent in day-to-day development, especially when we needed to track model reasoning (thinking) and tool execution status in real time. At that point, it became clear that we needed a lower-level and more complete integration approach.

The approach shared in this article comes from our practical experience in the HagiCode project. HagiCode is an open-source AI coding assistant project, and during development we needed deep integration with GitHub Copilot capabilities, from basic code completion to complex multi-turn conversations and tool calling. Those real-world requirements pushed us to move from CLI invocation to the official SDK.

If these requirements sound familiar, there is a good chance our engineering experience can help you, and the HagiCode project itself may be worth checking out. You can find more project information and links at the end of this article.

The project uses a layered architecture to address the limitations of CLI invocation:

┌─────────────────────────────────────────────────────────┐
│ hagicode-core (Orleans Grains + AI Provider Layer) │
│ - CopilotAIProvider: Converts AIRequest to CopilotOptions │
│ - GitHubCopilotGrain: Orleans distributed execution interface │
└─────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│ HagiCode.Libs (Shared Provider Layer) │
│ - CopilotProvider: CLI Provider interface implementation │
│ - ICopilotSdkGateway: SDK invocation abstraction │
│ - GitHubCopilotSdkGateway: SDK session management and event dispatch │
└─────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│ GitHub Copilot SDK (Official .NET SDK) │
│ - CopilotClient: SDK client │
│ - CopilotSession: Session management │
│ - SessionEvent: Event stream │
└─────────────────────────────────────────────────────────┘

This layered design brings several practical technical advantages:

  1. Separation of concerns: Core business logic is decoupled from SDK implementation details.
  2. Testability: The ICopilotSdkGateway interface makes unit testing straightforward.
  3. Reusability: HagiCode.Libs can be referenced by multiple projects.
  4. Maintainability: SDK upgrades only require changes in the Gateway layer, while upper layers remain untouched.

Authentication is the first and most important step of SDK integration. If authentication fails, nothing else matters. We designed a flexible authentication configuration that supports multiple authentication sources:

// CopilotProvider.cs - Authentication source configuration
public class CopilotOptions
{
    public bool UseLoggedInUser { get; set; } = true;
    public string? GitHubToken { get; set; }
    public string? CliUrl { get; set; }
}

// Convert to SDK request
return new CopilotSdkRequest(
    GitHubToken: options.AuthSource == CopilotAuthSource.GitHubToken
        ? options.GitHubToken
        : null,
    UseLoggedInUser: options.AuthSource != CopilotAuthSource.GitHubToken
);

The benefits of this design are fairly clear:

  • It supports logged-in user mode without requiring a token, which fits desktop scenarios well.
  • It supports GitHub Token mode, which is suitable for server-side deployments and centralized management.
  • It supports overriding the Copilot CLI URL, which helps with enterprise proxy configuration.

In practice, this flexible authentication model greatly simplified configuration across different deployment scenarios. The desktop client can use each user’s own Copilot login state, while the server side can manage authentication centrally through tokens.

One of the most powerful capabilities of the SDK is complete event stream capture. We implemented an event dispatch system that can handle all kinds of SDK events in real time:

// GitHubCopilotSdkGateway.cs - Core logic for event dispatch
internal static SessionEventDispatchResult DispatchSessionEvent(
    SessionEvent evt, bool sawDelta)
{
    switch (evt)
    {
        case AssistantReasoningEvent reasoningEvent:
            // Capture the model reasoning process
            events.Add(new CopilotSdkStreamEvent(
                CopilotSdkStreamEventType.ReasoningDelta,
                Content: reasoningEvent.Data.Content));
            break;
        case ToolExecutionStartEvent toolStartEvent:
            // Capture the start of a tool invocation
            events.Add(new CopilotSdkStreamEvent(
                CopilotSdkStreamEventType.ToolExecutionStart,
                ToolName: toolStartEvent.Data.ToolName,
                ToolCallId: toolStartEvent.Data.ToolCallId));
            break;
        case ToolExecutionCompleteEvent toolCompleteEvent:
            // Capture tool completion and its result
            events.Add(new CopilotSdkStreamEvent(
                CopilotSdkStreamEventType.ToolExecutionEnd,
                Content: ExtractToolExecutionContent(toolCompleteEvent)));
            break;
        default:
            // Preserve unhandled events as RawEvent
            events.Add(new CopilotSdkStreamEvent(
                CopilotSdkStreamEventType.RawEvent,
                RawEventType: evt.GetType().Name));
            break;
    }
}

The value of this implementation is significant:

  • Complete capture of the model reasoning process (thinking): users can see how the AI is reasoning, not just the final answer.
  • Real-time tracking of tool execution status: we know which tools are running, when they finish, and what they return.
  • Zero event loss: by falling back to RawEvent, we ensure every event is recorded.

In HagiCode, these fine-grained events help users understand how the AI works internally, especially when debugging complex tasks.

After migrating from CLI invocation to the SDK, we found that some existing CLI parameters no longer applied in the SDK. To preserve backward compatibility, we implemented a parameter filtering system:

// CopilotCliCompatibility.cs - Argument filtering
private static readonly Dictionary<string, string> RejectedFlags = new()
{
    ["--headless"] = "Unsupported startup argument",
    ["--model"] = "Passed through an SDK-native field",
    ["--prompt"] = "Passed through an SDK-native field",
    ["--interactive"] = "Interaction is managed by the provider",
};

public static CopilotCliArgumentBuildResult BuildCliArgs(CopilotOptions options)
{
    // Filter out unsupported arguments and keep compatible ones
    // Generate diagnostic information
}

This gives us several benefits:

  • It automatically filters incompatible CLI arguments to prevent runtime errors.
  • It generates clear diagnostic messages to help developers locate problems quickly.
  • It keeps the SDK stable and insulated from changes in CLI parameters.

During the upgrade, this compatibility mechanism helped us transition smoothly. Existing configuration files could still be used and only needed gradual adjustments based on the diagnostic output.
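
The filtering idea can be sketched roughly as follows. This is a TypeScript simplification for illustration, not the actual CopilotCliCompatibility code:

```typescript
// Drop flags the SDK path no longer supports and record a diagnostic
// for each rejected one, so callers can see why an argument vanished.
const rejectedFlags: Record<string, string> = {
  "--headless": "Unsupported startup argument",
  "--model": "Passed through an SDK-native field",
  "--prompt": "Passed through an SDK-native field",
  "--interactive": "Interaction is managed by the provider",
};

interface ArgumentBuildResult {
  args: string[];
  diagnostics: string[];
}

function buildCliArgs(requested: string[]): ArgumentBuildResult {
  const args: string[] = [];
  const diagnostics: string[] = [];
  for (const arg of requested) {
    const flag = arg.split("=")[0]; // handle both "--model" and "--model=gpt-5"
    const reason = rejectedFlags[flag];
    if (reason) {
      diagnostics.push(`${flag} ignored: ${reason}`);
    } else {
      args.push(arg); // compatible argument passes through unchanged
    }
  }
  return { args, diagnostics };
}
```

The key property is that incompatible arguments are dropped silently from the command line but loudly in the diagnostics, which is what made the gradual migration of old configuration files practical.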

Creating Copilot SDK sessions is relatively expensive, and creating and destroying sessions too frequently hurts performance. To solve that, we implemented a session pool management system:

// CopilotProvider.cs - Session pool management
await using var lease = await _poolCoordinator.AcquireCopilotRuntimeAsync(
    request,
    async ct => await _gateway.CreateRuntimeAsync(sdkRequest, ct),
    cancellationToken);

if (lease.IsWarmLease)
{
    // Reuse an existing session
    yield return CreateSessionReusedMessage();
}

await foreach (var eventData in lease.Entry.Resource.SendPromptAsync(...))
{
    yield return MapEvent(eventData);
}

The benefits of session pooling include:

  • Session reuse: requests with the same sessionId can reuse existing sessions and reduce startup overhead.
  • Session recovery support: after a network interruption, previous session state can be restored.
  • Automatic pooling management: expired sessions are cleaned up automatically to avoid resource leaks.

In HagiCode, session pooling noticeably improved responsiveness, especially for continuous conversations.

HagiCode uses Orleans as its distributed framework, and we integrated the Copilot SDK into Orleans Grains:

// GitHubCopilotGrain.cs - Distributed execution
public async IAsyncEnumerable<GitHubCopilotResponse> ExecuteCommandStreamAsync(
    string command,
    CancellationToken token = default)
{
    var provider = await aiProviderFactory.GetProviderAsync(AIProviderType.GitHubCopilot);
    await foreach (var chunk in provider.SendMessageAsync(request, null, token))
    {
        // Map to the unified response format
        yield return BuildChunkResponse(chunk, startedAt);
    }
}

The advantages of Orleans integration are substantial:

  • Unified AI Provider abstraction: it becomes easy to switch between different AI providers.
  • Multi-tenant isolation: Copilot sessions for different users remain isolated from one another.
  • Persistent session state: session state can be restored even after server restarts.

For scenarios that need to handle a large number of concurrent requests, Orleans provides strong scalability.

Here is a complete configuration example:

{
  "AI": {
    "Providers": {
      "GitHubCopilot": {
        "Enabled": true,
        "ExecutablePath": "copilot",
        "Model": "gpt-5",
        "WorkingDirectory": "/path/to/project",
        "Timeout": 7200,
        "StartupTimeout": 30,
        "UseLoggedInUser": true,
        "NoAskUser": true,
        "Permissions": {
          "AllowAllTools": false,
          "AllowedTools": ["Read", "Bash", "Grep"],
          "DeniedTools": ["Edit"]
        }
      }
    }
  }
}

In real-world usage, we summarized several points worth paying attention to:

Startup timeout configuration: The first startup of Copilot CLI can take a relatively long time, so we recommend setting StartupTimeout to at least 30 seconds. If this is the first login, it may take even longer.

Permission management: In production environments, avoid using AllowAllTools: true. Use the AllowedTools allowlist to control which tools are available, and use the DeniedTools denylist to block dangerous operations. This effectively prevents the AI from executing risky commands.

Session management: Requests with the same sessionId automatically reuse sessions. Session state is persisted through ProviderSessionId. Cancellation is propagated via CancellationTokenSource.

Diagnostic output: Incompatible CLI arguments generate messages of type diagnostic. Raw SDK events are preserved as event.raw. Error messages include categories such as startup timeout and argument incompatibility to make troubleshooting easier.

Based on our practical experience, here are a few best practices:

1. Use a tool allowlist

var request = new AIRequest
{
    Prompt = "Analyze this file",
    AllowedTools = new[] { "Read", "Grep", "Bash(git:*)" }
};

Explicitly specifying the allowed tools through an allowlist helps prevent unexpected AI actions. This is especially important for tools with write permissions, such as Edit, which should be handled with extra care.

2. Set reasonable timeouts

options.Timeout = 3600; // 1 hour
options.StartupTimeout = 60; // 1 minute

Set appropriate timeout values based on task complexity. If the value is too short, tasks may be interrupted. If it is too long, resources may be wasted waiting on unresponsive requests.

3. Enable session reuse

options.SessionId = "my-session-123";

Using the same sessionId for related tasks lets you reuse prior session context and improve response speed.

4. Handle streaming responses

await foreach (var chunk in provider.StreamAsync(request))
{
    switch (chunk.Type)
    {
        case StreamingChunkType.ThinkingDelta:
            // Handle the reasoning process
            break;
        case StreamingChunkType.ToolCallDelta:
            // Handle tool invocation
            break;
        case StreamingChunkType.ContentDelta:
            // Handle text output
            break;
    }
}

Streaming responses let you show AI processing progress in real time, which improves the user experience. This is especially valuable for long-running tasks.

5. Error handling and retries

try
{
    await foreach (var chunk in provider.StreamAsync(request))
    {
        // Handle the response
    }
}
catch (CopilotSessionException ex)
{
    // Handle session exceptions
    logger.LogError(ex, "Copilot session failed");
    // Decide whether to retry based on the exception type
}

Proper error handling and retry logic improve overall system stability.
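
One way to layer retries on top of that catch block is a generic backoff helper, sketched here in TypeScript. The backoff policy is an assumption for illustration, not HagiCode's actual implementation:

```typescript
// A generic retry helper with exponential backoff. Whether a failure is
// retryable is left to the caller, mirroring "decide based on exception type".
async function withRetries<T>(
  operation: () => Promise<T>,
  options: {
    maxAttempts: number;
    baseDelayMs: number;
    isRetryable: (err: unknown) => boolean;
  },
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= options.maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (err) {
      lastError = err;
      // Give up on the final attempt or on non-retryable errors.
      if (attempt === options.maxAttempts || !options.isRetryable(err)) throw err;
      // Exponential backoff: base, 2x base, 4x base, ...
      const delay = options.baseDelayMs * 2 ** (attempt - 1);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError; // unreachable; kept for type completeness
}
```

The `isRetryable` predicate is where session-specific knowledge belongs: a startup timeout is usually worth retrying, while an authentication failure is not.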

Upgrading from CLI invocation to SDK integration delivered substantial value to the HagiCode project:

  • Improved stability: the SDK provides a more stable interface that is not affected by CLI version changes.
  • More complete functionality: it captures the full event stream, including reasoning and tool execution status.
  • Higher development efficiency: the type-safe SDK interface makes development more efficient and reduces runtime errors.
  • Better user experience: real-time event feedback gives users a clearer understanding of what the AI is doing.

This upgrade was not just a replacement of technical implementation. It was also an architectural optimization of the entire AI capability layer. Through layered design and abstraction interfaces, we gained better maintainability and extensibility.

If you are considering integrating GitHub Copilot into your own .NET project, I hope these practical lessons help you avoid unnecessary detours. The official SDK is indeed more stable and complete than CLI invocation, and it is worth the time to understand and adopt it.


That brings this article to a close. Technical writing never really ends, because technology keeps evolving and we keep learning. If you have questions or suggestions while using HagiCode, feel free to contact us anytime.

Thank you for reading. If you found this article useful, you are welcome to like, bookmark, and share it. This content was created with AI-assisted collaboration, and the final version was reviewed and approved by the author.

Technical Analysis of the HagiCode Skill System: Building a Scalable AI Skill Management Platform

This article takes an in-depth look at the architecture and implementation of the Skill management system in the HagiCode project, covering the technical details behind four core capabilities: local global management, marketplace search, intelligent recommendations, and trusted provider management.

In the field of AI coding assistants, how to extend the boundaries of AI capabilities has always been a core question. Claude Code itself is already strong at code assistance, but different development teams and different technology stacks often need specialized capabilities for specific scenarios, such as handling Docker deployments, database optimization, or frontend component generation. That is exactly where a Skill system becomes especially important.

During the development of the HagiCode project, we ran into a similar challenge: how do we let Claude Code “learn” new professional skills like a person would, while still maintaining a solid user experience and good engineering maintainability? This problem is both hard and simple in its own way. Around that question, we designed and implemented a complete Skill management system.

This article walks through the technical architecture and core implementation of the system in detail. It is intended for developers interested in AI extensibility and command-line tool integration. It might be useful to you, or it might not, but at least it is written down now.

The approach shared in this article comes from our practical experience in the HagiCode project. HagiCode is an open-source AI coding assistant project designed to help development teams improve engineering efficiency. The project’s stack includes ASP.NET Core, the Orleans distributed framework, a TanStack Start + React frontend, and the Skill management subsystem introduced in this article.

The GitHub repository is HagiCode-org/site. If you find the technical approach in this article valuable, feel free to give it a Star. More Stars tend to improve the mood, after all.

The Skill system uses a frontend-backend separated architecture. There is nothing especially mysterious about that.

Frontend uses TanStack Start + React to build the user interface, with Redux Toolkit managing state. The four main capabilities map directly to four Tab components: Local Skills, Skill Gallery, Intelligent Recommendations, and Trusted Providers. In the end, the design is mostly about making the user experience better.

Backend is based on ASP.NET Core + ABP Framework, using Orleans Grain for distributed state management. The online API client wraps the IOnlineApiClient interface to communicate with the remote skill catalog service.

The overall architectural principle is to separate command execution from business logic. Through the adapter pattern, the implementation details of npm/npx command execution are hidden inside independent modules. After all, nobody really wants command-line calls scattered all over the codebase.

Core Capability 1: Local Global Management

Section titled “Core Capability 1: Local Global Management”

Local global management is the most basic module. It is responsible for listing installed skills and supporting uninstall operations. There is nothing overly complicated here; it is mostly about doing the basics well.

The implementation lives in LocalSkillsTab.tsx and LocalSkillCommandAdapter.cs. The core idea is to wrap the npx skills command, parse its JSON output, and convert it into internal data structures. It sounds simple, and in practice it mostly is.

public async Task<IReadOnlyList<LocalSkillInventoryResponseDto>> GetLocalSkillsAsync(
    CancellationToken cancellationToken = default)
{
    var result = await _commandAdapter.ListGlobalSkillsAsync(cancellationToken);
    return result.Skills.Select(skill => new LocalSkillInventoryResponseDto
    {
        Name = skill.Name,
        Version = skill.Version,
        Source = skill.Source,
        InstalledPath = skill.InstalledPath,
        Description = skill.Description
    }).ToList();
}

The data flow is very clear: the frontend sends a request -> SkillGalleryAppService receives it -> LocalSkillCommandAdapter executes the npx command -> the JSON result is parsed -> a DTO is returned. Each step follows naturally from the previous one.

Skill uninstallation uses the npx skills remove -g <skillName> -y command, and the system automatically handles dependencies and cleanup. Installation metadata is stored in managed-install.json inside the skill directory, recording information such as install time and source version for later updates and auditing. Some things are simply worth recording.
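For illustration, here is a small TypeScript sketch of what building and reading that metadata file could look like. The field names are assumptions; the article does not spell out the real managed-install.json schema.

```typescript
// Hypothetical shape of managed-install.json (install time and source version),
// illustrative only -- not the real schema.
interface ManagedInstallMetadata {
  skillName: string;
  source: string;
  version: string;
  installedAt: string; // ISO-8601 timestamp
}

function buildManagedInstallMetadata(
  skillName: string,
  source: string,
  version: string,
  now: Date = new Date(),
): ManagedInstallMetadata {
  return { skillName, source, version, installedAt: now.toISOString() };
}

function parseManagedInstallMetadata(json: string): ManagedInstallMetadata {
  const raw = JSON.parse(json) as Partial<ManagedInstallMetadata>;
  if (!raw.skillName || !raw.source || !raw.version || !raw.installedAt) {
    throw new Error("managed-install.json is missing required fields");
  }
  return raw as ManagedInstallMetadata;
}
```

Keeping the metadata a plain JSON record makes later updates and audits straightforward: the updater only has to compare the recorded source version against the catalog.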

Skill installation requires several coordinated steps. In truth, it is not especially complicated:

public async Task<SkillInstallResultDto> InstallAsync(
    SkillInstallRequestDto request,
    CancellationToken cancellationToken = default)
{
    // 1. Normalize the installation reference
    var normalized = _referenceNormalizer.Normalize(
        request.SkillId,
        request.Source,
        request.SkillSlug,
        request.Version);

    // 2. Check prerequisites
    await _prerequisiteChecker.CheckAsync(cancellationToken);

    // 3. Acquire installation lock
    using var installLock = await _lockProvider.AcquireAsync(normalized.SkillId);

    // 4. Execute installation command
    var result = await _installCommandRunner.ExecuteAsync(
        new SkillInstallCommandExecutionRequest
        {
            Command = $"npx skills add {normalized.FullReference} -g -y",
            Timeout = TimeSpan.FromMinutes(4)
        },
        cancellationToken);

    // 5. Persist installation metadata
    await _metadataStore.WriteAsync(normalized.SkillPath, request);

    return new SkillInstallResultDto { Success = result.Success };
}

Several key design patterns are used here: the reference normalizer converts different input formats, such as tanweai/pua and @opencode/docker-skill, into a unified internal representation; the installation lock mechanism ensures only one installation operation can run for the same skill at a time; and streaming output pushes installation progress to the frontend in real time through Server-Sent Events, so users can watch terminal-like logs as they happen.

In the end, all of these patterns are there for one purpose: to keep the system simpler to use and maintain.
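To make the normalizer idea concrete, here is a TypeScript sketch that maps both reference styles mentioned above to one canonical owner/name form. It is a simplified stand-in for the real normalizer; the two accepted formats are taken from the examples in the text, and everything else about the shape is an assumption.

```typescript
interface NormalizedSkillReference {
  owner: string;         // publisher or npm scope
  name: string;          // skill name
  fullReference: string; // canonical "owner/name" form passed to `npx skills add`
}

function normalizeSkillReference(input: string): NormalizedSkillReference {
  const trimmed = input.trim();
  // npm-scope style: "@opencode/docker-skill" -> owner "opencode", name "docker-skill"
  if (trimmed.startsWith("@")) {
    const [scope, name] = trimmed.slice(1).split("/");
    if (!scope || !name) throw new Error(`Invalid scoped reference: ${input}`);
    return { owner: scope, name, fullReference: `${scope}/${name}` };
  }
  // plain "owner/name" style: "tanweai/pua"
  const parts = trimmed.split("/");
  if (parts.length !== 2 || !parts[0] || !parts[1]) {
    throw new Error(`Invalid skill reference: ${input}`);
  }
  return { owner: parts[0], name: parts[1], fullReference: `${parts[0]}/${parts[1]}` };
}
```

Everything downstream (locking, metadata, command construction) only ever sees the canonical form, which is what keeps the rest of the pipeline simple.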

Marketplace search lets users discover and install skills from the community. One person’s ability is always limited; collective knowledge goes much further.

The search feature relies on the online API https://api.hagicode.com/v1/skills/search. To improve response speed, the system implements caching. Cache is a bit like memory: if you keep useful things around, you do not have to think so hard the next time.

private async Task<IReadOnlyList<SkillGallerySkillDto>> SearchCatalogAsync(
    string query,
    CancellationToken cancellationToken,
    IReadOnlySet<string>? allowedSources = null)
{
    var cacheKey = $"skill_search:{query}:{string.Join(",", allowedSources ?? Array.Empty<string>())}";
    if (_memoryCache.TryGetValue(cacheKey, out var cached))
        return (IReadOnlyList<SkillGallerySkillDto>)cached!;

    var response = await _onlineApiClient.SearchAsync(
        new SearchSkillsRequest
        {
            Query = query,
            Limit = _options.LimitPerQuery,
        },
        cancellationToken);

    var results = response.Skills
        .Where(skill => allowedSources is null || allowedSources.Contains(skill.Source))
        .Select(skill => new SkillGallerySkillDto { ... })
        .ToList();

    _memoryCache.Set(cacheKey, results, TimeSpan.FromMinutes(10));
    return results;
}

Search results support filtering by trusted sources, so users only see skill sources they trust. Seed queries such as popular and recent are used to initialize the catalog, allowing users to see recommended popular skills the first time they open it. First impressions still matter.
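The seed-query initialization boils down to merging several result batches into one de-duplicated catalog. A TypeScript sketch of that merge step, kept pure so it is easy to test (the skill names in the test data are made up; actual fetching goes through the cached search shown above):

```typescript
interface CatalogSkill {
  name: string;
  source: string;
}

// Merges the result batches of the seed queries ("popular", "recent", ...) into
// one catalog, dropping duplicates and keeping first-seen order.
function mergeSeedBatches(batches: CatalogSkill[][]): CatalogSkill[] {
  const seen = new Set<string>();
  const merged: CatalogSkill[] = [];
  for (const batch of batches) {
    for (const skill of batch) {
      const key = `${skill.source}/${skill.name}`;
      if (!seen.has(key)) {
        seen.add(key);
        merged.push(skill);
      }
    }
  }
  return merged;
}
```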

Core Capability 3: Intelligent Recommendations

Section titled “Core Capability 3: Intelligent Recommendations”

Intelligent recommendations are the most complex part of the system. They can automatically recommend the most suitable skills based on the current project context. Complex as it is, it is still worth building.

The full recommendation flow is divided into five stages:

1. Build project context
2. AI generates search queries
3. Search the online catalog in parallel
4. AI ranks the candidates
5. Return the recommendation list

First, the system analyzes characteristics such as the project’s technology stack, programming languages, and domain structure to build a “project profile.” That profile is a bit like a resume, recording the key traits of the project.

Then an AI Grain is used to generate targeted search queries. This design is actually quite interesting: instead of directly asking the AI, “What skills should I recommend?”, we first ask it to think about “What search terms are likely to find relevant skills?” Sometimes the way you ask the question matters more than the answer itself:

var queryGeneration = await aiGrain.GenerateSkillRecommendationQueriesAsync(
projectContext, // Project context
locale, // User language preference
maxQueries, // Maximum number of queries
effectiveSearchHero); // AI model selection

Next, those search queries are executed in parallel to gather a candidate skill list. Parallel processing is, at the end of the day, just a way to save time.

Finally, another AI Grain ranks the candidate skills. This step considers factors such as skill relevance to the project, trust status, and user historical preferences:

var ranking = await aiGrain.RankSkillRecommendationsAsync(
projectContext,
candidates,
installedSkillNames,
locale,
maxRecommendations,
effectiveRankingHero);
response.Items = MergeRecommendations(projectContext, candidates, ranking, maxRecommendations);

AI models can respond slowly or become temporarily unavailable. Even the best systems stumble sometimes. For that reason, the system includes a deterministic fallback mechanism: when the AI service is unavailable, it uses a rule-based heuristic algorithm to generate recommendations, such as inferring likely required skills from dependencies in package.json.

Put plainly, this fallback mechanism is simply a backup plan for the system.
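As an illustration of such a heuristic, here is a TypeScript sketch that maps package.json dependencies to skill suggestions. The hint table is entirely hypothetical; the real fallback rules in HagiCode are not published in this article.

```typescript
// Hypothetical dependency-to-skill hints; a real rule table would be far richer.
const dependencyHints: Record<string, string> = {
  react: "frontend-component-generation",
  dockerode: "docker-deployment",
  pg: "database-optimization",
};

// Rule-based fallback: scan declared dependencies and collect matching hints.
function fallbackRecommendations(packageJsonText: string): string[] {
  const manifest = JSON.parse(packageJsonText);
  const deps: Record<string, string> = {
    ...(manifest.dependencies ?? {}),
    ...(manifest.devDependencies ?? {}),
  };
  const recommendations = new Set<string>();
  for (const dep of Object.keys(deps)) {
    const hint = dependencyHints[dep];
    if (hint) recommendations.add(hint);
  }
  return [...recommendations];
}
```

Because the fallback is deterministic, it always returns something sensible even when every AI call times out.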

Core Capability 4: Trusted Provider Management

Section titled “Core Capability 4: Trusted Provider Management”

Trusted provider management allows users to control which skill sources are considered trustworthy. Trust is still something users should be able to define for themselves.

Trusted providers support two matching rules: exact match (exact) and prefix match (prefix).

public static TrustedSkillProviderResolutionSnapshot Resolve(
    TrustedSkillProviderSnapshot snapshot,
    string source)
{
    var normalizedSource = Normalize(source);
    foreach (var entry in snapshot.Entries.OrderBy(e => e.SortOrder))
    {
        if (!entry.IsEnabled) continue;

        foreach (var rule in entry.MatchRules)
        {
            bool isMatch = rule.MatchType switch
            {
                TrustedSkillProviderMatchRuleType.Exact
                    => string.Equals(normalizedSource, Normalize(rule.Value),
                        StringComparison.OrdinalIgnoreCase),
                TrustedSkillProviderMatchRuleType.Prefix
                    => normalizedSource.StartsWith(Normalize(rule.Value) + "/",
                        StringComparison.OrdinalIgnoreCase),
                _ => false
            };

            if (isMatch)
                return new TrustedSkillProviderResolutionSnapshot
                {
                    IsTrustedSource = true,
                    ProviderId = entry.ProviderId,
                    DisplayName = entry.DisplayName
                };
        }
    }
    return new TrustedSkillProviderResolutionSnapshot { IsTrustedSource = false };
}

Built-in trusted providers include well-known organizations and projects such as Vercel, Azure, anthropics, Microsoft, and browser-use. Custom providers can be added through configuration files by specifying a provider ID, display name, badge label, matching rules, and more. The world is large enough that only trusting a few built-ins would never be enough.

Trusted configuration is persisted using an Orleans Grain:

public class TrustedSkillProviderGrain : Grain<TrustedSkillProviderState>,
    ITrustedSkillProviderGrain
{
    public async Task UpdateConfigurationAsync(TrustedSkillProviderSnapshot snapshot)
    {
        State.Snapshot = snapshot;
        await WriteStateAsync();
    }

    public Task<TrustedSkillProviderSnapshot> GetConfigurationAsync()
    {
        return Task.FromResult(State.Snapshot);
    }
}

The benefit of this approach is that configuration changes are automatically synchronized across all nodes, without any need to refresh caches manually. Automation is, ultimately, about letting people worry less.

The Skill system needs to execute various npx commands. If that logic were scattered everywhere, the code would quickly become difficult to maintain. That is why we designed an adapter interface. Design patterns, in the end, exist to make code easier to maintain:

public interface ISkillInstallCommandRunner
{
    Task<SkillInstallCommandExecutionResult> ExecuteAsync(
        SkillInstallCommandExecutionRequest request,
        CancellationToken cancellationToken = default);
}

Different commands have different executor implementations, but all of them implement the same interface, making testing and replacement straightforward.
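One concrete payoff is testability. Here is a TypeScript sketch of the same idea, with a fake runner that records commands instead of shelling out to npx. The interface shape is an assumption mirroring the C# interface above, not the actual contract.

```typescript
interface CommandExecutionRequest {
  command: string;
  timeoutMs: number;
}

interface CommandExecutionResult {
  success: boolean;
  stdout: string;
}

// One interface, many executors -- production executors spawn npx, test doubles do not.
interface SkillInstallCommandRunner {
  execute(request: CommandExecutionRequest): Promise<CommandExecutionResult>;
}

// Test double: records every request and returns a canned result, so the logic
// around installation can be exercised without touching the command line.
class FakeCommandRunner implements SkillInstallCommandRunner {
  readonly requests: CommandExecutionRequest[] = [];
  constructor(private readonly cannedResult: CommandExecutionResult) {}

  async execute(request: CommandExecutionRequest): Promise<CommandExecutionResult> {
    this.requests.push(request); // runs synchronously, before the first await
    return this.cannedResult;
  }
}
```

In a unit test you inject the fake, drive the service, and then assert on `runner.requests` instead of inspecting a real shell.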

Installation progress is pushed to the frontend in real time through Server-Sent Events:

public async Task InstallWithProgressAsync(
    SkillInstallRequestDto request,
    IServerStreamWriter<SkillInstallProgressEventDto> stream,
    CancellationToken cancellationToken)
{
    var process = new Process
    {
        StartInfo = new ProcessStartInfo
        {
            FileName = "npx",
            Arguments = $"skills add {request.FullReference} -g -y",
            RedirectStandardOutput = true,
            RedirectStandardError = true,
            UseShellExecute = false
        }
    };

    process.OutputDataReceived += async (sender, e) =>
    {
        await stream.WriteAsync(new SkillInstallProgressEventDto
        {
            EventType = "output",
            Data = e.Data ?? string.Empty
        });
    };

    process.Start();
    process.BeginOutputReadLine();
    // Drain stderr too: a redirected but unread stream can fill its buffer and block the child process.
    process.BeginErrorReadLine();
    await process.WaitForExitAsync(cancellationToken);
}

On the frontend, users can see terminal-like output in real time, which makes the experience very intuitive. Real-time feedback helps people feel at ease.
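Under the hood, each SSE frame is just "event:" and "data:" lines separated by a blank line. Here is a TypeScript sketch of a minimal frame parser; the event name mirrors the EventType = "output" payload shown above, and in a browser you would normally let EventSource do this parsing for you.

```typescript
interface InstallProgressEvent {
  eventType: string;
  data: string;
}

// Parses a single Server-Sent Events frame into an event object.
function parseSseFrame(frame: string): InstallProgressEvent {
  let eventType = "message"; // SSE default when no "event:" line is present
  const dataLines: string[] = [];
  for (const line of frame.split("\n")) {
    if (line.startsWith("event:")) {
      eventType = line.slice("event:".length).trim();
    } else if (line.startsWith("data:")) {
      dataLines.push(line.slice("data:".length).trimStart());
    }
  }
  return { eventType, data: dataLines.join("\n") };
}
```

On the page, the equivalent is `new EventSource(url)` plus an `addEventListener("output", ...)` handler that appends each chunk to the terminal view.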

Take installing the pua skill as an example (it is a popular community skill):

  1. Open the Skills drawer and switch to the Skill Gallery tab
  2. Enter pua in the search box
  3. Click the search result to view the skill details
  4. Click the Install button
  5. Switch to the Local Skills tab to confirm the installation succeeded

The installation command is npx skills add tanweai/pua -g -y, and the system handles all the details automatically. There are not really that many steps once you take them one by one.

If your team has its own skill repository, you can add it as a trusted source:

providerId: "my-team"
displayName: "My Team Skills"
badgeLabel: "MyTeam"
isEnabled: true
sortOrder: 100
matchRules:
  - matchType: "prefix"
    value: "my-team/"
  - matchType: "exact"
    value: "my-team/special-skill"

This way, all skills from your team will display a trusted badge, making users more comfortable installing them. Labels and signals do help people feel more confident.

Creating a custom skill requires the following structure:

my-skill/
├── SKILL.md # Skill metadata (YAML front matter)
├── index.ts # Skill entry point
├── agents/ # Supported agent configuration
└── references/ # Reference resources

An example SKILL.md format:

---
name: my-skill
description: A brief description of what this skill does
---
# My Skill
Detailed documentation...
A few operational notes:

  1. Network requirements: skill search and installation require access to api.hagicode.com and the npm registry
  2. Node.js version: Node.js 18 or later is recommended
  3. Permission requirements: global npm installation permissions are required
  4. Concurrency control: only one install or uninstall operation can run for the same skill at a time
  5. Timeout settings: the default timeout for installation is 4 minutes, but complex scenarios may require adjustment

These notes exist, ultimately, to help things go smoothly.

This article introduced the complete implementation of the Skill management system in the HagiCode project. Through a frontend-backend separated architecture, the adapter pattern, Orleans-based distributed state management, and related techniques, the system delivers:

  • Local global management: a unified skill management interface built by wrapping npx skills commands
  • Marketplace search: rapid discovery of community skills through the online API and caching mechanisms
  • Intelligent recommendations: AI-powered skill recommendations based on project context
  • Trust management: a flexible configuration system that lets users control trust boundaries

This design approach is not only applicable to Skill management. It is also useful as a reference for any scenario that needs to integrate command-line tools while balancing local storage and online services.

If this article helped you, feel free to give us a Star on GitHub: github.com/HagiCode-org/site. You can also visit the official site to learn more: hagicode.com.

You may think this system is well designed, or you may not. Either way, that is fine. Once code is written, someone will use it, and someone will not.

Thank you for reading. If you found this article useful, feel free to like, bookmark, and share it. This content was produced with AI-assisted collaboration, and the final content was reviewed and approved by the author.

Practical Multi-AI Provider Architecture in the HagiCode Platform


Section titled “Practical Multi-AI Provider Architecture in the HagiCode Platform”

This article shares the technical approach we used under the Orleans Grain architecture to integrate two AI tools, iflow and OpenCode, through a unified IAIProvider interface, and compares the implementation differences between WebSocket and HTTP communication in detail.

There is nothing especially mysterious about it. While building HagiCode, we ran into a very practical problem: users wanted to work with different AI tools. That is hardly surprising, since everyone has their own habits. Some prefer Claude Code, some love GitHub Copilot, and some teams use tools they developed themselves.

Our initial solution was simple and direct: write dedicated integration code for each AI tool. But the drawbacks showed up quickly. The codebase filled up with if-else branches, every change required testing in multiple places, and every new tool meant writing another pile of logic from scratch.

Later, I realized it would be better to create a unified IAIProvider interface and abstract the capabilities shared by all AI providers. That way, no matter which tool is used underneath, the upper layers can call it in the same way.

Recently, the project needed to integrate two new tools: iflow and OpenCode. Both support the ACP protocol, but their communication styles are different. iflow uses WebSocket, while OpenCode uses an HTTP API. That became a useful architectural test: adapt two different transport modes behind one unified interface.

The approach shared in this article comes from our practical experience in the HagiCode project. HagiCode is an AI-assisted development platform built on the Orleans Grain architecture. It integrates with different AI providers through a unified IAIProvider interface, allowing users to flexibly choose the AI tools they prefer.

First, we defined the IAIProvider interface and abstracted the capabilities that every AI provider needs to implement:

public interface IAIProvider
{
    string Name { get; }
    bool SupportsStreaming { get; }
    ProviderCapabilities Capabilities { get; }

    Task<AIResponse> ExecuteAsync(AIRequest request, CancellationToken cancellationToken = default);
    IAsyncEnumerable<AIStreamingChunk> StreamAsync(AIRequest request, CancellationToken cancellationToken = default);
    Task<ProviderTestResult> PingAsync(CancellationToken cancellationToken = default);
    IAsyncEnumerable<AIStreamingChunk> SendMessageAsync(AIRequest request, string? embeddedCommandPrompt = null, CancellationToken cancellationToken = default);
}

This interface includes several key methods:

  • ExecuteAsync: execute a one-shot AI request
  • StreamAsync: get streaming responses for real-time display
  • PingAsync: perform a health check to verify whether the provider is available
  • SendMessageAsync: send a message with support for embedded commands

IFlowCliProvider: A WebSocket-Based Implementation

Section titled “IFlowCliProvider: A WebSocket-Based Implementation”

iflow uses WebSocket for ACP communication. The overall architecture looks like this:

IFlowCliProvider → ACPSessionManager → WebSocketAcpTransport → iflow CLI
Dynamic port allocation + process management

The core flow is also fairly straightforward:

  1. ACPSessionManager creates and manages ACP sessions.
  2. WebSocketAcpTransport handles WebSocket communication.
  3. A port is allocated dynamically, and the iflow process is started with iflow --experimental-acp --port.
  4. IAIRequestToAcpMapper and IAcpToAIResponseMapper convert requests and responses.

Here is the core code:

private async IAsyncEnumerable<AIStreamingChunk> StreamCoreAsync(
    AIRequest request,
    string? embeddedCommandPrompt,
    [EnumeratorCancellation] CancellationToken cancellationToken)
{
    // Resolve working directory
    var resolvedWorkingDirectory = ResolveWorkingDirectory(request);
    var effectiveRequest = ApplyEmbeddedCommandPrompt(request, embeddedCommandPrompt);

    // Create ACP session
    await using var session = await _sessionManager.CreateSessionAsync(
        Name,
        resolvedWorkingDirectory,
        cancellationToken,
        request.SessionId);

    // Send prompt
    var prompt = _requestMapper.ToPromptString(effectiveRequest);
    var promptResponse = await session.SendPromptAsync(prompt, cancellationToken);

    // Receive streaming response
    await foreach (var notification in session.ReceiveUpdatesAsync(cancellationToken))
    {
        if (_responseMapper.TryConvertToStreamingChunk(notification, out var chunk))
        {
            if (chunk.Type == StreamingChunkType.Metadata && chunk.IsComplete)
            {
                yield return chunk;
                yield break;
            }
            yield return chunk;
        }
    }
}

There are a few design points worth calling out here:

  • Use await using to ensure the session is released correctly and avoid resource leaks.
  • Return streaming responses through IAsyncEnumerable, which naturally supports async streams.
  • Use Metadata chunks to determine completion and ensure the full response has been received.

OpenCodeCliProvider: An HTTP API-Based Implementation

Section titled “OpenCodeCliProvider: An HTTP API-Based Implementation”

OpenCode provides its service through an HTTP API, so the architecture is slightly different:

OpenCodeCliProvider → OpenCodeRuntimeManager → OpenCodeClient → OpenCode HTTP API
OpenCodeProcessManager → opencode process management

A notable feature of OpenCode is that it uses an SQLite database to persist session bindings. That makes session recovery and prompt-response recovery possible:

private async Task<OpenCodePromptExecutionResult> ExecutePromptAsync(
    AIRequest request,
    string? embeddedCommandPrompt,
    CancellationToken cancellationToken)
{
    var prompt = BuildPrompt(request, embeddedCommandPrompt);
    var resolvedWorkingDirectory = ResolveWorkingDirectory(request.WorkingDirectory);
    var client = await _runtimeManager.GetClientAsync(resolvedWorkingDirectory, cancellationToken);

    var bindingSessionId = request.SessionId;
    var boundSession = TryGetBinding(bindingSessionId, resolvedWorkingDirectory);

    // Try to use the already bound session
    if (boundSession is not null)
    {
        try
        {
            return await PromptSessionAsync(
                client,
                boundSession,
                BuildPromptRequest(request, prompt, CreatePromptMessageId()),
                request.Model ?? _settings.Model,
                cancellationToken);
        }
        catch (OpenCodeApiException ex) when (IsStaleBinding(ex))
        {
            // The session has expired, remove the binding
            RemoveBinding(bindingSessionId);
        }
    }

    // Create a new session
    var session = await client.Session.CreateAsync(new OpenCodeSessionCreateRequest
    {
        Title = BuildSessionTitle(request)
    }, cancellationToken);

    BindSession(bindingSessionId, session.Id, resolvedWorkingDirectory);
    return await PromptSessionAsync(client, session.Id, ...);
}

This implementation has several interesting highlights:

  • Session binding mechanism: the same SessionId reuses the same OpenCode session, avoiding repeated session creation.
  • Expiration handling: when a session is found to be expired, the binding is automatically cleaned up.
  • Database persistence: bindings are stored in SQLite and remain effective after restart.
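The binding logic can be sketched independently of SQLite. Here is a TypeScript stand-in that keeps bindings in memory but follows the same bind / lookup / remove-stale life cycle described above; the names are illustrative, not the real implementation.

```typescript
interface SessionBinding {
  openCodeSessionId: string;
  workingDirectory: string;
}

// In-memory stand-in for the SQLite-backed binding store.
class SessionBindingStore {
  private readonly bindings = new Map<string, SessionBinding>();

  bind(requestSessionId: string, openCodeSessionId: string, workingDirectory: string): void {
    this.bindings.set(requestSessionId, { openCodeSessionId, workingDirectory });
  }

  // Returns the bound session only if it targets the same working directory.
  tryGet(requestSessionId: string, workingDirectory: string): SessionBinding | undefined {
    const binding = this.bindings.get(requestSessionId);
    if (binding && binding.workingDirectory === workingDirectory) return binding;
    return undefined;
  }

  // Called when the remote API reports the session no longer exists.
  removeStale(requestSessionId: string): void {
    this.bindings.delete(requestSessionId);
  }
}
```

Swapping the Map for a SQLite table is what makes the bindings survive a restart.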
| Aspect | IFlowCliProvider | OpenCodeCliProvider |
| --- | --- | --- |
| Communication | WebSocket (ACP) | HTTP API |
| Process management | ACPSessionManager | OpenCodeProcessManager |
| Port allocation | Dynamic port | No port (uses HTTP) |
| Session management | ACPSession | OpenCodeSession |
| Persistence | In-memory cache | SQLite database |
| Startup command | iflow --experimental-acp --port | opencode |
| Latency | Lower (long-lived connection) | Relatively higher (HTTP requests) |

Which approach you choose depends mainly on your needs. WebSocket is better for scenarios with high real-time requirements, while an HTTP API is simpler and easier to debug.

First, enable the two providers in the configuration file:

AI:
  Providers:
    IFlowCli:
      Type: "IFlowCli"
      Enabled: true
      ExecutablePath: "iflow"
      Model: null
      WorkingDirectory: null
    OpenCodeCli:
      Type: "OpenCodeCli"
      Enabled: true
      ExecutablePath: "opencode"
      Model: "anthropic/claude-sonnet-4"
      WorkingDirectory: null
    OpenCode:
      Enabled: true
      BaseUrl: "http://localhost:38376"
      ExecutablePath: "opencode"
      StartupTimeoutSeconds: 30
      RequestTimeoutSeconds: 120

With the providers enabled, obtain one through the factory and send requests. Using iflow:

// Get provider through the factory
var provider = await _providerFactory.GetProviderAsync(AIProviderType.IFlowCli);

// Execute an AI request
var request = new AIRequest
{
    Prompt = "Please help me refactor this function",
    WorkingDirectory = "/path/to/project",
    Model = "claude-sonnet-4"
};

// Get the complete response
var response = await provider.ExecuteAsync(request, cancellationToken);
Console.WriteLine(response.Content);

// Or use streaming responses
await foreach (var chunk in provider.StreamAsync(request, cancellationToken))
{
    if (chunk.Type == StreamingChunkType.ContentDelta)
    {
        Console.Write(chunk.Content);
    }
}

The same pattern applies to OpenCode:

// Get provider through the factory
var provider = await _providerFactory.GetProviderAsync(AIProviderType.OpenCodeCli);

var request = new AIRequest
{
    Prompt = "Please help me analyze this error",
    WorkingDirectory = "/path/to/project",
    Model = "anthropic/claude-sonnet-4"
};

var response = await provider.ExecuteAsync(request, cancellationToken);
Console.WriteLine(response.Content);

Before startup or before use, you can check whether the provider is available:

var iflowResult = await iflowProvider.PingAsync(cancellationToken);
if (!iflowResult.Success)
{
    Console.WriteLine($"IFlow is unavailable: {iflowResult.ErrorMessage}");
    return;
}

var openCodeResult = await openCodeProvider.PingAsync(cancellationToken);
if (!openCodeResult.Success)
{
    Console.WriteLine($"OpenCode is unavailable: {openCodeResult.ErrorMessage}");
    return;
}

Both providers support embedded commands, such as /file:xxx:

var request = new AIRequest
{
    Prompt = "Analyze the problems in this file",
    SystemMessage = "You are a code analysis expert"
};

await foreach (var chunk in provider.SendMessageAsync(
    request,
    embeddedCommandPrompt: "/file:src/main.cs",
    cancellationToken))
{
    Console.Write(chunk.Content);
}

IFlow uses long-lived WebSocket connections, so resource management deserves special attention:

  • Use await using to ensure sessions are released properly.
  • Cancellation triggers process cleanup.
  • ACPSessionManager supports a maximum session count limit.

OpenCode process management is relatively simpler, and OpenCodeRuntimeManager handles it automatically.

Both providers have complete error handling:

  • IFlow errors are propagated through ACP session updates.
  • OpenCode errors are thrown through OpenCodeApiException.
  • It is recommended that the caller catch and handle these exceptions.
A few performance notes:

  • IFlow WebSocket communication has lower latency than HTTP.
  • OpenCode session reuse can reduce the overhead of HTTP requests.
  • The factory cache mechanism avoids repeatedly creating providers.
  • In high-concurrency scenarios, pay close attention to the limits on process count and connection count.

The executable path is validated at startup, but runtime issues can still happen. PingAsync is a useful tool for verifying whether the configuration is correct:

// Check at startup
var provider = await _providerFactory.GetProviderAsync(providerType);
var result = await provider.PingAsync(cancellationToken);
if (!result.Success)
{
    _logger.LogError("Provider {ProviderType} is unavailable: {Error}", providerType, result.ErrorMessage);
}

This article shares the technical approach used by the HagiCode platform when integrating the two AI tools iflow and OpenCode. Through a unified IAIProvider interface, we adapted different communication styles, WebSocket and HTTP, while keeping the upper-layer calling pattern consistent.

The core idea is actually quite simple:

  1. Define a unified interface abstraction.
  2. Build adapter layers for different implementations.
  3. Manage everything uniformly through the factory pattern.

That gives the system good extensibility. When a new AI tool needs to be integrated later, all we need to do is implement the IAIProvider interface without changing too much existing code.
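To show how small the surface area for a new tool is, here is a TypeScript mirror of the idea: a trimmed provider interface and a toy EchoProvider implementing it. Both the interface shape and the provider are illustrative, not the actual IAIProvider contract.

```typescript
interface AIRequest {
  prompt: string;
  workingDirectory?: string;
  model?: string;
}

interface AIResponse {
  content: string;
}

// Trimmed-down provider contract: every tool exposes the same calls.
interface AIProvider {
  readonly name: string;
  readonly supportsStreaming: boolean;
  execute(request: AIRequest): Promise<AIResponse>;
  ping(): Promise<{ success: boolean; errorMessage?: string }>;
}

// Toy provider: integrating a new tool means filling in these methods.
class EchoProvider implements AIProvider {
  readonly name = "echo";
  readonly supportsStreaming = false;

  async execute(request: AIRequest): Promise<AIResponse> {
    return { content: `echo: ${request.prompt}` };
  }

  async ping(): Promise<{ success: boolean }> {
    return { success: true }; // a real provider would probe its CLI or endpoint here
  }
}
```

Callers depend only on `AIProvider`, so swapping EchoProvider for a real integration requires no changes upstream.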

If you are also working on multi-AI-tool integration, I hope this article is helpful.



AI Compose Commit: Using AI to Intelligently Refactor the Git Commit Workflow


Section titled “AI Compose Commit: Using AI to Intelligently Refactor the Git Commit Workflow”

In the software development process, committing code is a routine task every programmer faces every day. But have you ever run into this situation: at the end of a workday, you open Git, see dozens of unstaged modified files, and have no idea how to organize them into sensible commits?

The traditional approach is to manually stage files in batches, commit them one by one, and write commit messages. This process is both time-consuming and error-prone. We often waste quite a bit of time on this, and after all, nobody wants to worry about these tedious chores late at night when they are already tired.

In the HagiCode project, we introduced a new feature - AI Compose Commit - designed to completely transform this workflow. By using AI to intelligently analyze all uncommitted changes in the working tree, it automatically groups them into multiple logical commits and performs standards-compliant commit operations. In this article, we will take a deep dive into the implementation principles, technical architecture, and the challenges and solutions we encountered in practice.

The approach shared in this article comes from our practical experience in the HagiCode project.

As a version control system, Git gives developers powerful code management capabilities. But in real-world usage, committing often becomes a bottleneck in the development workflow:

  1. Manual grouping is time-consuming: When there are many file changes, developers need to inspect each file one by one and decide which changes belong to the same feature. That takes a lot of mental effort.
  2. Inconsistent commit message quality: Writing commit messages that follow the Conventional Commits specification requires experience and skill, and beginners often produce non-standard commits.
  3. Complex multi-repository management: In a monorepo environment, switching between different repositories adds operational complexity.
  4. Interrupted workflow: Committing code interrupts your train of thought and hurts coding efficiency.

These issues are especially obvious in large projects and collaborative team environments. A good development tool should let developers focus on core coding work instead of getting bogged down in a cumbersome commit workflow.

In recent years, AI has been used more and more widely in software development. From code completion and bug detection to automatic documentation generation, AI is gradually reaching every stage of the development process. In Git workflows, while some tools already support commit message generation, most are limited to single-commit scenarios and lack the ability to intelligently analyze and group changes across the entire working tree.

HagiCode encountered these pain points during development as well. We tried many tools, but each had one limitation or another. Either the functionality was incomplete, or the user experience was not good enough. That is why we ultimately decided to implement AI Compose Commit ourselves.

HagiCode’s AI Compose Commit feature was created to fill that gap. It does not just generate commit messages - it takes over the entire process from file analysis to commit execution.

While implementing AI Compose Commit, we faced several technical challenges:

  1. File semantic understanding: The AI needs to understand semantic relationships between file changes and decide which files belong to the same functional module. This requires deep analysis of file content, directory structure, and change context.

  2. Commit grouping strategy: How should a reasonable grouping standard be defined? By feature, by module, or by file type? Different projects may need different strategies.

  3. Real-time feedback and asynchronous processing: Git operations can take a long time, especially when handling a large number of files. How can we complete complex operations while preserving a good user experience?

  4. Multi-repository support: In a monorepo architecture, operations need to be routed correctly between the main repository and sub-repositories.

  5. Error handling and rollback: If one commit fails, how should already executed commits be handled? Do already staged files need to be rolled back?

  6. Commit message consistency: Generated commit messages need to match the project’s existing style and remain consistent with historical commits.

AI processing over a large number of file changes consumes significant time and compute resources. We needed to optimize in the following areas:

  • Reduce unnecessary AI calls
  • Optimize how file context is constructed
  • Implement efficient Git operation batching

These issues all appeared in real HagiCode usage, and we only arrived at a relatively complete solution through repeated iteration and optimization. If you are building a similar tool, we hope our experience gives you some inspiration.

We adopted a layered architecture to implement AI Compose Commit, ensuring good scalability and maintainability:

1. API Layer (Controller Layer)

Section titled “1. API Layer (Controller Layer)”

GitController provides the POST /api/git/auto-compose-commit endpoint as the entry point. To optimize user experience, we adopted a fire-and-forget asynchronous pattern:

  • After the client sends a request, the server immediately returns HTTP 202 Accepted
  • The actual AI processing runs asynchronously in the background
  • When processing finishes, the client is notified through SignalR

This design ensures that even if AI processing takes several minutes, users still get an immediate response and do not feel that the system is frozen.

2. Application Service Layer (Application Layer)

Section titled “2. Application Service Layer (Application Layer)”

GitAppService is responsible for the core business logic:

  • Repository detection: supports multi-repository management in a monorepo
  • Lock management: prevents conflicts caused by concurrent operations
  • File staging coordination: interacts with the AI processing flow
  • Error rollback: restores state when failures occur

3. Distributed Computing Layer (Orleans Grains)

Section titled “3. Distributed Computing Layer (Orleans Grains)”

AIGrain serves as the core execution unit for AI operations. It implements the AutoComposeCommitAsync method from the IAIGrain interface:

// Define the interface method for AI-powered automatic commit composition
// Parameter notes:
// - projectId: unique project identifier
// - unstagedFiles: list of unstaged files, including file paths and status information
// - projectPath: project root directory path (optional), used to access project context
// Return value: a response object containing execution results, including success/failure status and detailed information
[Alias("AutoComposeCommitAsync")]
[ResponseTimeout("00:20:00")] // 20-minute timeout, suitable for handling large change sets
Task<AutoComposeCommitResponseDto> AutoComposeCommitAsync(
    string projectId,
    GitFileStatusDto[] unstagedFiles,
    string? projectPath = null);

This method sets a 20-minute timeout to handle large change sets. In real-world HagiCode usage, we found that some projects can involve hundreds of changed files in a single pass, requiring more processing time.

Through the abstract IAIService interface, we implemented a pluggable AI service architecture. We currently use the Claude Helper service, but it can be easily switched to other AI providers.

The AI needs to understand the state of each file before it can make intelligent decisions. We build file context through the BuildFileChangesXml method:

/// <summary>
/// Build an XML representation of file changes to provide the AI with complete file context information
/// </summary>
/// <param name="stagedFiles">List of staged files, including file path, status, and old path (for rename operations)</param>
/// <returns>A formatted XML string containing metadata for all files</returns>
private static string BuildFileChangesXml(GitFileStatusDto[] stagedFiles)
{
    var sb = new StringBuilder();
    sb.AppendLine("<files>");
    foreach (var file in stagedFiles)
    {
        sb.AppendLine("  <file>");
        // Use XML escaping to ensure special characters do not break the XML structure
        sb.AppendLine($"    <path>{System.Security.SecurityElement.Escape(file.Path)}</path>");
        sb.AppendLine($"    <status>{System.Security.SecurityElement.Escape(file.Status)}</status>");
        // Handle file rename scenarios and record the old path so the AI can understand change relationships
        if (!string.IsNullOrEmpty(file.OldPath))
        {
            sb.AppendLine($"    <oldPath>{System.Security.SecurityElement.Escape(file.OldPath)}</oldPath>");
        }
        sb.AppendLine("  </file>");
    }
    sb.AppendLine("</files>");
    return sb.ToString();
}

This XML-based context includes file paths, statuses, and old paths for rename operations, giving the AI complete metadata. With a structured XML format, we ensure that the AI can accurately understand the state and change type of each file.

To let the AI execute Git operations directly, we configured comprehensive tool permissions:

// Define the set of tools the AI can use, including file operations and Git command execution permissions
// Read/Write/Edit: file reading, writing, and editing capabilities
// Bash(git:*): permission to execute all Git commands
// Other Bash commands: used to inspect file contents and directory structure so the AI can understand context
var allowedTools = new[]
{
    "Read", "Write", "Edit",
    "Bash(git:*)", "Bash(cat:*)", "Bash(ls:*)", "Bash(find:*)",
    "Bash(grep:*)", "Bash(head:*)", "Bash(tail:*)", "Bash(wc:*)"
};

// Build the complete AI request object
var request = new AIRequest
{
    Prompt = prompt,                                      // Complete prompt template, including task instructions and constraints
    WorkingDirectory = projectPath ?? GetTempDirectory(), // Working directory, ensuring the AI runs in the correct project context
    AllowedTools = allowedTools,                          // Allowed tool set
    PermissionMode = PermissionMode.bypassPermissions,    // Bypass permission checks so Git operations can run directly
    LanguagePreference = languagePreference               // Language preference setting, ensuring commit messages match user expectations
};

Here we use PermissionMode.bypassPermissions, which allows the AI to execute Git commands directly without user confirmation. This is central to the feature design, but it also requires strict input validation to prevent abuse. In HagiCode’s production deployment, we ensured the safety of this mechanism through backend parameter validation and log monitoring.
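To illustrate what that backend validation might look like, the sketch below matches a requested tool invocation against allowlist patterns of the form shown above. This is a hedged TypeScript sketch; the `isToolAllowed` helper and the invocation string format are illustrative assumptions, not part of HagiCode.

```typescript
// Hypothetical validator: checks a requested invocation such as
// "Bash(git commit -m msg)" against allowlist patterns like "Bash(git:*)".
function isToolAllowed(invocation: string, allowed: string[]): boolean {
  for (const pattern of allowed) {
    // Exact tool names such as "Read" or "Edit"
    if (pattern === invocation) return true;
    // Wildcard Bash patterns such as "Bash(git:*)"
    const m = pattern.match(/^Bash\(([^:]+):\*\)$/);
    if (m) {
      const cmd = m[1];
      // Accept "Bash(git)" or "Bash(git <args>)", but not "Bash(gitfoo ...)"
      if (invocation === `Bash(${cmd})` || invocation.startsWith(`Bash(${cmd} `)) {
        return true;
      }
    }
  }
  return false;
}
```

Anything that falls outside the allowlist, such as `Bash(rm -rf /)`, is rejected before it reaches execution.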

After the AI finishes execution, it returns structured results. We implemented a dual parsing strategy to ensure compatibility:

/// <summary>
/// Parse commit execution results returned by the AI, supporting both delimiter format and regex format
/// </summary>
/// <param name="aiResponse">Raw response content returned by the AI</param>
/// <returns>A parsed list of commit results, where each result includes the commit hash and execution status</returns>
private List<CommitResultDto> ParseCommitExecutionResults(string aiResponse)
{
    var results = new List<CommitResultDto>();

    // Prefer delimiter-based parsing (new format), which is more explicit and reliable
    if (aiResponse.Contains("---"))
    {
        logger.LogDebug("Using delimiter-based parsing for AI response");
        results = ParseDelimitedFormat(aiResponse);
        if (results.Count > 0)
        {
            return results; // Successfully parsed, return the results directly
        }
        logger.LogWarning("Delimiter-based parsing produced no results, falling back to regex");
    }
    else
    {
        logger.LogDebug("No delimiter found, using legacy regex-based parsing");
    }

    // Fall back to regex parsing (old format) to ensure backward compatibility
    return ParseLegacyFormat(aiResponse);
}

The delimiter format uses --- to separate commits, making the structure clear and easy to parse:

---
Commit 1: abc123def456
feat(auth): add user login functionality
Implement JWT-based authentication with login form and API endpoints.
Co-Authored-By: Hagicode <noreply@hagicode.com>
---
Commit 2: 789ghi012jkl
docs(readme): update installation instructions
Add new setup steps for Docker environment.
Co-Authored-By: Hagicode <noreply@hagicode.com>
---

This format makes parsing simple and reliable, while also remaining easy for humans to read.
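The idea behind the delimiter-based parser can be sketched as a small pure function. The following is an illustrative TypeScript sketch of the technique, not HagiCode's actual C# `ParseDelimitedFormat` implementation; the record shape is an assumption.

```typescript
interface CommitResult {
  hash: string;
  subject: string;
  body: string;
}

// Split the AI response on "---" delimiter lines, then read each block as
// a header line ("Commit N: <hash>"), a subject line, and a free-form body.
function parseDelimitedCommits(aiResponse: string): CommitResult[] {
  return aiResponse
    .split(/^---$/m)                      // split on delimiter-only lines
    .map((block) => block.trim())
    .filter((block) => block.length > 0)  // drop empty segments at the edges
    .flatMap((block) => {
      const lines = block.split("\n");
      const header = lines[0].match(/^Commit \d+:\s*(\S+)$/i);
      if (!header) return [];             // skip blocks that are not commits
      return [{
        hash: header[1],
        subject: lines[1] ?? "",
        body: lines.slice(2).join("\n").trim(),
      }];
    });
}
```

Because each block is self-delimited, a malformed block can be skipped without corrupting the parse of its neighbors, which is exactly why this format is more robust than a single regex over the whole response.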

To prevent state conflicts caused by concurrent operations, we implemented a repository lock mechanism:

// Acquire the repository lock to prevent concurrent operations
// Parameter notes:
// - fullPath: full repository path, used to identify different repository instances
// - requestedBy: requester identifier, used for tracking and logging
await _autoComposeLockService.AcquireLockAsync(fullPath, requestedBy);
try
{
    // Execute the AI Compose Commit operation
    // This section calls an Orleans Grain method to perform the actual AI processing and Git operations
    await aiGrain.AutoComposeCommitAsync(projectId, unstagedFiles, projectPath);
}
finally
{
    // Ensure the lock is released whether the operation succeeds or fails
    // Using a finally block guarantees lock release even when exceptions occur, preventing deadlocks
    await _autoComposeLockService.ReleaseLockAsync(fullPath);
}

The lock has a 20-minute timeout, matching the timeout used for AI operations. If the operation fails or times out, the system automatically releases the lock to avoid permanent blocking. In real HagiCode usage, we found this lock mechanism to be extremely important, especially in collaborative environments where multiple developers may trigger AI Compose Commit at the same time.

After processing completes, the system sends a notification to the frontend through SignalR:

/// <summary>
/// Send a notification when automatic commit composition is complete
/// </summary>
/// <param name="projectId">Project identifier, used to route the notification to the correct client</param>
/// <param name="totalCount">Total number of commits, including successes and failures</param>
/// <param name="successCount">Number of successful commits</param>
/// <param name="failureCount">Number of failed commits</param>
/// <param name="success">Whether the overall operation succeeded</param>
/// <param name="error">Error message (if the operation failed)</param>
private async Task SendAutoComposeCommitNotificationAsync(
    string projectId,
    int totalCount,
    int successCount,
    int failureCount,
    bool success,
    string? error)
{
    try
    {
        // Build the notification DTO containing detailed execution results
        var notification = new AutoComposeCommitCompletedDto
        {
            ProjectId = projectId,
            TotalCount = totalCount,
            SuccessCount = successCount,
            FailureCount = failureCount,
            Success = success,
            Error = error
        };

        // Broadcast the notification to all connected clients through the SignalR Hub
        await messageService.SendAutoComposeCommitCompletedAsync(notification);

        logger.LogInformation(
            "Auto compose commit notification sent for project {ProjectId}: {SuccessCount}/{TotalCount} succeeded",
            projectId, successCount, totalCount);
    }
    catch (Exception ex)
    {
        // Log notification errors without affecting the main operation flow
        // A notification failure should not cause the entire operation to fail
        logger.LogError(ex, "Failed to send auto compose commit notification for project {ProjectId}", projectId);
    }
}

After the frontend receives the notification, it can update the UI to show whether the commit succeeded or failed, improving the user experience. This real-time feedback mechanism was well received by HagiCode users, who can clearly see when the operation finishes and what the outcome is.

AI behavior is entirely determined by the prompt, so we carefully designed the prompt template for Auto Compose Commit. Taking the Chinese version as an example (auto-compose-commit.zh-CN.hbs):

At the beginning of the prompt, we explicitly declare support for non-interactive execution mode, which is a critical requirement for CI/CD and automation scripts:

**Important Note**: This prompt may run in a non-interactive environment (such as CI/CD or automation scripts).

**Non-Interactive Mode**:
- Do not use AskUserQuestion or any interactive tools
- When user input is required:
  - Use sensible defaults (for example, use feat as the commit type)
  - Skip optional confirmation steps
  - Record any assumptions made

This design ensures that AI Compose Commit can be used not only in interactive IDE environments, but also integrated into CI/CD pipelines to deliver a fully automated commit workflow.

To prevent the AI from executing dangerous operations, we added strict branch protection rules to the prompt:

**Branch Protection**:
- Do not perform any branch switching operations (git checkout, git switch)
- All `git commit` commands must run on the current branch
- Do not create, delete, or rename branches
- Do not modify untracked files or unstaged changes
- If branch switching is required to complete the operation, return an error instead of executing it

By constraining the AI’s tool usage scope, these rules ensure operational safety. In HagiCode’s practical testing, we verified the effectiveness of these constraints: when the AI encounters a situation that would require a branch switch, it safely returns an error instead of taking dangerous action.

The prompt defines the decision logic for file grouping in detail:

**File Grouping Decision Tree**:
├── Is it a configuration file (package.json, tsconfig.json, .env, etc.)?
│   ├── Yes -> separate commit (type: chore or build)
│   └── No -> continue
├── Is it a documentation file (README.md, *.md, docs/**)?
│   ├── Yes -> separate commit (type: docs)
│   └── No -> continue
├── Is it related to the same feature?
│   ├── Yes -> merge into the same commit
│   └── No -> commit separately
└── Is it a cross-module change?
    ├── Yes -> group by module
    └── No -> group by feature

This decision tree gives the AI clear grouping logic, ensuring the generated commits remain semantically reasonable. In real HagiCode usage, we found that this decision tree can handle the vast majority of common scenarios, and the grouping results match developer expectations.
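The first two branches of the decision tree are mechanical enough to express as a pure classification function. The sketch below is illustrative TypeScript, not HagiCode code; in the real system this logic lives inside the AI prompt, and the function name and group labels are assumptions.

```typescript
type CommitGroup = "chore" | "docs" | "feature";

// Classify a changed file along the first branches of the decision tree:
// config files -> chore, documentation -> docs, everything else is grouped
// by feature (with cross-module grouping decided at a higher level).
function classifyFile(path: string): CommitGroup {
  const configFiles = ["package.json", "tsconfig.json", ".env"];
  const base = path.split("/").pop() ?? path;

  if (configFiles.includes(base)) return "chore";
  if (base.toLowerCase() === "readme.md" || path.endsWith(".md") || path.startsWith("docs/")) {
    return "docs";
  }
  return "feature";
}
```

Running `classifyFile` over the change set first keeps the genuinely hard part, deciding which "feature" files belong together, as the only question the AI has to answer.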

To keep commit messages consistent with project history, the prompt requires the AI to analyze recent commit history before generation:

**Historical Format Consistency**: Before generating commit messages, you **must** analyze the current repository's commit history to match the existing style.
1. Use `git log -n 15 --pretty=format:"%H|%s|%b%n---%n"` to get the recent commit history
2. Analyze the commits to identify:
   - Structural patterns: does the project use multi-paragraph messages? Are there `Changes:` or `Capabilities:` sections?
   - Language patterns: are commit messages in English, Chinese, or mixed?
   - Common types: which commit types are most often used (`feat`, `fix`, `docs`, etc.)?
   - Special formatting: are there `Co-Authored-By` lines? Any other project-specific conventions?
3. Generate commit messages that follow the detected patterns

This analysis ensures that AI-generated commit messages do not feel out of place, but instead remain stylistically aligned with the project’s history. In HagiCode’s multilingual projects, this feature is especially important because it can automatically choose the appropriate language and format based on commit history.
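One part of that style analysis, detecting the most common Conventional Commits type, can be sketched as a small frequency count. This is an illustrative TypeScript sketch under assumed names; in HagiCode the analysis is performed by the AI itself from the `git log` output.

```typescript
// Given commit subject lines, count Conventional Commits types
// ("type(scope): subject" or "type: subject") and return the dominant one.
// Falls back to "feat", the sensible default from the non-interactive rules.
function dominantCommitType(subjects: string[]): string {
  const counts = new Map<string, number>();
  for (const subject of subjects) {
    const m = subject.match(/^([a-z]+)(\([^)]*\))?:/);
    if (m) counts.set(m[1], (counts.get(m[1]) ?? 0) + 1);
  }
  let best = "feat";
  let bestCount = 0;
  for (const [type, count] of counts) {
    if (count > bestCount) {
      best = type;
      bestCount = count;
    }
  }
  return best;
}
```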

Every commit must include Co-Authored-By information:

**Important**: Every commit must include Co-Authored-By information
- Use the following format: `git commit -m "type(scope): subject" -m "" -m "Co-Authored-By: Hagicode <noreply@hagicode.com>"`
- Or include the `Co-Authored-By` line directly in the commit message

This is not only for contribution compliance, but also for tracing AI-assisted commit history. HagiCode treats this as a mandatory rule to ensure that all AI-generated commits carry a clear source marker.

The full AI Compose Commit workflow is as follows:

  1. User trigger: The user clicks the “AI Auto Compose Commit” button in the Git Status panel or Quick Actions Zone.
  2. API request: The frontend sends a POST request to the /api/git/auto-compose-commit endpoint.
  3. Immediate response: The server returns HTTP 202 Accepted without waiting for processing to finish.
  4. Background processing:
    • GitAppService acquires the repository lock
    • Calls AIGrain.AutoComposeCommitAsync
    • Builds the file context XML
    • Executes the AI prompt so the AI can analyze and perform commits
  5. AI execution:
    • Uses Git commands to obtain all unstaged changes
    • Reads file contents to understand the nature of the changes
    • Groups files by semantic relationship
    • Executes git add and git commit for each group
  6. Result parsing: Parses the execution results returned by the AI.
  7. Notification delivery: Notifies the frontend through SignalR.
  8. Lock release: Releases the repository lock whether the operation succeeds or fails.

This workflow is designed so that users can continue with other work immediately after initiating the operation, without waiting for the AI to finish. Feedback from HagiCode users shows that this asynchronous processing model greatly improves the workflow experience.

We implemented multi-layer error handling:

// Validate request parameters to prevent invalid requests from reaching backend processing logic
if (request.UnstagedFiles == null || request.UnstagedFiles.Count == 0)
{
    return BadRequest(new
    {
        message = "No unstaged files provided. Please make changes in the working directory first.",
        status = "validation_failed"
    });
}

If an error occurs during AI processing, the system performs a rollback operation and unstages files that were already staged, preventing an inconsistent state from being left behind. In real HagiCode usage, this mechanism saved us from multiple unexpected interruptions and ensured repository state integrity.

The 20-minute timeout ensures that long-running operations do not block resources indefinitely. After a timeout, the system releases the lock and notifies the user that the operation failed. In real HagiCode usage, we found that most operations complete within 2 to 5 minutes, and only extremely large change sets approach the timeout limit.

Best Practices for Using AI Compose Commit

Section titled “Best Practices for Using AI Compose Commit”

AI Compose Commit is best suited for the following scenarios:

  • At the end of a workday, when you need to process changes across many files in one batch
  • After a refactoring operation, when several related files need to be committed separately
  • After a feature is completed, when related changes need to be grouped into commits

It is not suitable for the following scenarios:

  • Quick commits for a single file (a normal commit is faster)
  • Scenarios requiring precise control over commit content
  • Commits containing sensitive information that require human review

Although AI-powered intelligent grouping is powerful, developers should still review the generated commits:

  • Check whether the grouping matches expectations
  • Verify the accuracy of commit messages
  • Confirm that no files were omitted or incorrectly included

If you find an unreasonable grouping, you can use git reset --soft HEAD~N to undo it and regroup. HagiCode’s experience shows that even when AI grouping is smart, manual review is still valuable, especially for important feature commits.

Make sure your project’s Git configuration supports Conventional Commits:

# Install commitlint
npm install -g @commitlint/cli @commitlint/config-conventional
# Configure commitlint
echo "module.exports = {extends: ['@commitlint/config-conventional']}" > commitlint.config.js

This lets you validate commit message format in CI/CD workflows and keeps it aligned with the format generated by AI Compose Commit.

If you want to implement a similar AI-assisted commit feature in your own project, here are our suggestions:

Begin with single commit message generation, then gradually expand to multi-commit grouping. This makes it easier to validate and iterate. HagiCode followed the same path: early versions only supported single commits, and later expanded to intelligent grouping across multiple commits.

Do not implement AI invocation logic from scratch. Using an existing SDK reduces development time and potential bugs. We used the Claude Helper service, which provides a stable interface and robust error handling.

Prompt quality directly determines output quality. Spend time designing a detailed prompt, including:

  • Clear task descriptions
  • Specific output format requirements
  • Rules for handling edge cases
  • Illustrative examples

HagiCode invested heavily in prompt design, and this was one of the key reasons the feature succeeded.

AI operations can fail for many reasons, such as network issues, API rate limits, or content moderation. Make sure your system can handle these errors gracefully and provide meaningful error information.

Do not automate everything completely. Leave users in control. Provide options to review grouping results, adjust groups, and manually edit commit messages to balance automation and flexibility. Although HagiCode supports automatic execution, it still preserves preview and adjustment capabilities.

When constructing file context, filter out files that do not need AI analysis:

// Filter out generated files and excessively large files to reduce the AI processing burden
var relevantFiles = stagedFiles
    .Where(f => !IsGeneratedFile(f.Path))
    .Where(f => !IsLargeFile(f.Path))
    .ToArray();

If multiple independent repositories are supported, commits in different repositories can be processed in parallel to improve overall efficiency.

Cache project commit history analysis results to avoid re-analyzing them every time. Historical format preferences can be stored in configuration files to reduce AI calls.

AI Compose Commit represents a deep application of AI technology in software development tools. By intelligently analyzing file changes, automatically grouping commits, and generating standards-compliant commit messages, it significantly improves the efficiency of Git workflows and allows developers to focus more on core coding work.

During implementation, we learned several important lessons:

  1. User feedback is critical: Early versions used synchronous waiting, and users reported a poor experience. After switching to a fire-and-forget model, satisfaction improved significantly.
  2. Prompt design determines quality: A carefully designed prompt does more to guarantee AI output quality than a complex algorithm.
  3. Safety always comes first: Granting the AI permission to execute Git commands directly improves efficiency, but it must be paired with strict constraints and validation.
  4. Progressive improvement works best: Starting with simple scenarios and gradually increasing complexity is more likely to succeed than trying to implement everything at once.

In the future, we plan to further optimize AI Compose Commit, including:

  • Supporting more commit grouping strategies (by time, by developer, and so on)
  • Integrating code review workflows to trigger review automatically before commits
  • Supporting custom commit message templates to meet the personalized needs of different projects

If you find the approach shared in this article valuable, give HagiCode a try and experience how this feature works in real development. After all, practice is the only criterion for testing truth.


Thank you for reading. If you found this article helpful, please click the like button below so more people can discover it.

This content was created with AI-assisted collaboration, reviewed by me, and reflects my own views and positions.

.NET Core Dual-Database in Practice: Best Practices for Elegantly Combining PostgreSQL and SQLite

.NET Core Dual-Database in Practice: Let PostgreSQL and SQLite Coexist Peacefully

Section titled “.NET Core Dual-Database in Practice: Let PostgreSQL and SQLite Coexist Peacefully”

When building modern applications, we often face this trade-off: development environments want something lightweight and convenient, while production environments demand high concurrency and high availability. This article shares how to elegantly support both PostgreSQL and SQLite in a .NET Core project and implement the best practice of “SQLite for development, PostgreSQL for production.”

In software development, differences between environments have always been one of the hardest problems for engineering teams. Take the HagiCode platform we are building as an example: it is an AI-assisted development system based on ASP.NET Core 10 and React, with Orleans integrated internally for distributed state management. The stack is modern and fairly sophisticated.

Early in the project, we ran into a classic engineering pain point: developers wanted the local environment to work out of the box, without having to install and configure a heavy PostgreSQL database. But in production, we needed to handle high-concurrency writes and complex JSON queries, and that is exactly where lightweight SQLite starts to show its limits.

How can we keep a single codebase while allowing the application to benefit from SQLite’s portability like a desktop app, and also leverage PostgreSQL’s powerful performance like an enterprise-grade service? That is the core question this article explores.

The dual-database adaptation approach shared in this article comes directly from our hands-on experience in the HagiCode project. HagiCode is a next-generation development platform that integrates AI prompt management and the OpenSpec workflow. It was precisely to balance developer experience with production stability that we arrived at this proven architectural pattern.

Feel free to visit our GitHub repository to see the full project: HagiCode-org/site.

Core Topic 1: Architecture Design and Unified Abstraction

Section titled “Core Topic 1: Architecture Design and Unified Abstraction”

To support two databases in .NET Core, the key idea is to depend on abstractions rather than concrete implementations. We need to separate database selection from business code and let the configuration layer decide.

  1. Unified interface: All business logic should depend on the DbContext base class or custom interfaces, rather than a specific PostgreSqlDbContext.
  2. Configuration-driven: Use configuration items in appsettings.json to dynamically decide which database provider to load at application startup.
  3. Feature isolation: Add adaptation logic for PostgreSQL-specific capabilities, such as JSONB, so the application can still degrade gracefully on SQLite.

Code Implementation: Dynamic Context Configuration

Section titled “Code Implementation: Dynamic Context Configuration”

In ASP.NET Core’s Program.cs, we should not hard-code UseNpgsql or UseSqlite. Instead, we should read configuration and decide dynamically.

First, define the configuration class:

public class DatabaseSettings
{
    public const string SectionName = "Database";

    // Database type: PostgreSQL or SQLite
    public string DbType { get; set; } = "PostgreSQL";

    // Connection string
    public string ConnectionString { get; set; } = string.Empty;
}
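For reference, the matching section in appsettings.Development.json could look like this (a sketch; the SQLite file name is an assumption):

```json
{
  "Database": {
    "DbType": "SQLite",
    "ConnectionString": "Data Source=hagicode.db"
  }
}
```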

Then register the service in Program.cs based on configuration:

// Read configuration
var databaseSettings = builder.Configuration
    .GetSection(DatabaseSettings.SectionName)
    .Get<DatabaseSettings>();

// Register DbContext
builder.Services.AddDbContext<ApplicationDbContext>(options =>
{
    if (databaseSettings?.DbType?.ToLower() == "sqlite")
    {
        // SQLite configuration
        options.UseSqlite(databaseSettings.ConnectionString);
        // Handling SQLite's concurrent write limitations
        // Note: in production, enabling WAL mode is recommended to improve concurrency performance
    }
    else
    {
        // PostgreSQL configuration (default)
        options.UseNpgsql(databaseSettings.ConnectionString, npgsqlOptions =>
        {
            // Enable JSONB support, which is very useful when handling AI conversation records
            npgsqlOptions.UseJsonNet();
            // The retry policy is configured on the provider-specific options builder
            npgsqlOptions.EnableRetryOnFailure(3);
        });
    }
});

Core Topic 2: Handling Differences and Migration Strategy

Section titled “Core Topic 2: Handling Differences and Migration Strategy”

Although PostgreSQL and SQLite both support the SQL standard, they differ significantly in specific capabilities and behavior. If these differences are not handled carefully, you can easily end up with the awkward situation where everything works locally but fails after deployment.

In HagiCode, we need to store a large amount of prompts and AI metadata, which usually involves JSON columns.

  • PostgreSQL: Has a native JSONB type with excellent query performance.
  • SQLite: Does not have a native JSON type (newer versions include the JSON1 extension, but object mapping still differs), so data is usually stored as TEXT.

Solution: In EF Core entity mapping, we configure it as a convertible type.

protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    base.OnModelCreating(modelBuilder);

    // Configure entity
    modelBuilder.Entity<PromptTemplate>(entity =>
    {
        entity.Property(e => e.Metadata)
            .HasColumnType("jsonb") // PG uses jsonb
            .HasConversion(
                v => JsonSerializer.Serialize(v, (JsonSerializerOptions?)null),
                v => JsonSerializer.Deserialize<Dictionary<string, object>>(v, (JsonSerializerOptions?)null));
    });
}

When SQLite is used, HasColumnType("jsonb") may be ignored or trigger a warning. However, because HasConversion is configured, the data is still serialized and deserialized correctly as strings stored in a TEXT field, ensuring compatibility.

Never try to make one set of Migration scripts work for both PostgreSQL and SQLite at the same time. Differences in primary key generation strategies, index syntax, and other database details will inevitably cause failures.

Recommended practice: Maintain two migration branches or projects. In the HagiCode development workflow, this is how we handle it:

  1. Development stage: Work mainly with SQLite. Use Add-Migration Init_Sqlite -OutputDir Migrations/Sqlite.
  2. Adaptation stage: After developing a feature, switch the connection string to PostgreSQL and run Add-Migration Init_Postgres -OutputDir Migrations/Postgres.
  3. Automation scripts: Write a simple PowerShell or Bash script to automatically apply the correct migration based on the current environment variable.
# Pseudocode for simple deployment logic
if [ "$DATABASE_PROVIDER" = "PostgreSQL" ]; then
  dotnet ef database update --project Migrations.Postgres
else
  dotnet ef database update --project Migrations.Sqlite
fi

Core Topic 3: Lessons Learned from HagiCode in Production

Section titled “Core Topic 3: Lessons Learned from HagiCode in Production”

While refactoring HagiCode from a single-database model to dual-database support, we hit a few pitfalls and gathered some important lessons that may help you avoid the same mistakes.

1. Differences in Concurrency and Transactions

Section titled “1. Differences in Concurrency and Transactions”

PostgreSQL uses a server-client architecture and supports high-concurrency writes with powerful transaction isolation levels. SQLite uses file locking, so write operations lock the entire database file unless WAL mode is enabled.

Recommendation: When writing business logic that involves frequent writes, such as real-time saving of a user’s editing state, you must take SQLite’s locking model into account. When designing the OpenSpec collaboration module in HagiCode, we introduced a “merge before write” mechanism to reduce the frequency of direct database writes, allowing us to maintain good performance on both databases.
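The "merge before write" idea can be sketched as a small in-memory coalescing buffer: rapid updates to the same key overwrite each other in memory and are flushed to the database as a single write per key. This is an illustrative TypeScript sketch under assumed names, not HagiCode's actual OpenSpec implementation.

```typescript
// Coalesces rapid updates so each key produces at most one database write
// per flush, which matters under SQLite's whole-file write lock.
class WriteCoalescer<T> {
  private pending = new Map<string, T>();

  // Record an update; repeated updates to the same key merge in memory
  // instead of each becoming a separate database write.
  update(key: string, value: T): void {
    this.pending.set(key, value);
  }

  // Flush all merged updates in one pass (e.g. on a timer or on save),
  // returning how many rows were actually written.
  flush(write: (key: string, value: T) => void): number {
    for (const [key, value] of this.pending) {
      write(key, value);
    }
    const flushed = this.pending.size;
    this.pending.clear();
    return flushed;
  }
}
```

With a flush interval of a few hundred milliseconds, ten keystroke-level updates to one editing state collapse into one write, keeping the lock hold time short on SQLite while costing PostgreSQL nothing.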

2. Lifecycle Management of Connection Strings

Establishing a PostgreSQL connection is relatively expensive and depends on connection pooling. SQLite connections are very lightweight, but if they are not released promptly, file locks may cause later operations to time out.

In Program.cs, we can fine-tune behavior for each database:

if (databaseSettings?.DbType?.ToLower() == "sqlite")
{
    // SQLite: keeping connections open can improve performance, but watch out for file locks
    options.UseSqlite(connectionString, sqliteOptions =>
    {
        // Set command timeout
        sqliteOptions.CommandTimeout(30);
    });
}
else
{
    // PG: make full use of connection pooling
    options.UseNpgsql(connectionString, npgsqlOptions =>
    {
        npgsqlOptions.MaxBatchSize(100);
        npgsqlOptions.CommandTimeout(30);
    });
}

3. Run Tests Against Both Databases

Many developers, including some early members of our team, tend to make one common mistake: running unit tests only in the development environment, which is usually SQLite.

In HagiCode’s CI/CD pipeline, we enforced a GitHub Actions step to make sure every pull request runs PostgreSQL integration tests.

# Example snippet from .github/workflows/test.yml
- name: Run Integration Tests (PostgreSQL)
  run: |
    docker-compose up -d db_postgres
    dotnet test --filter "Category=Integration"

This helped us catch countless bugs related to SQL syntax differences and case sensitivity.
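
The `db_postgres` service the workflow brings up can be as small as a single compose entry. A hedged sketch of such a config fragment, where the image tag, port, and credentials are all placeholders rather than HagiCode's real values:

```yaml
# Illustrative docker-compose service for the CI database; all values are assumptions.
services:
  db_postgres:
    image: postgres:16
    environment:
      POSTGRES_USER: hagicode
      POSTGRES_PASSWORD: hagicode
      POSTGRES_DB: hagicode_test
    ports:
      - "5432:5432"
```

Keeping the CI database definition in the same repository as the tests means syntax and case-sensitivity differences surface on every pull request instead of in production.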

By introducing an abstraction layer and configuration-driven dependency injection, we successfully implemented a dual-track PostgreSQL and SQLite setup in the HagiCode project. This not only greatly lowered the onboarding barrier for new developers by removing the need to install PostgreSQL, but also provided strong performance guarantees for production.

To recap the key points:

  1. Abstraction first: Business code should not depend on concrete database implementations.
  2. Separate configuration: Use different appsettings.json files for development and production.
  3. Separate migrations: Do not try to make one Migration set work everywhere.
  4. Feature degradation: Prioritize compatibility in SQLite and performance in PostgreSQL.

This architectural pattern is not only suitable for HagiCode, but for any .NET project that needs to strike a balance between lightweight development and heavyweight production.


If this article helped you, feel free to give us a Star on GitHub, or experience the efficient development workflow brought by HagiCode directly:

The public beta has started. Welcome to install it and give it a try!


Thank you for reading. If you found this article useful, please click the like button below 👍 so more people can discover it.

This content was created with AI-assisted collaboration, reviewed by me, and reflects my own views and position.