HagiCode

26 posts with the tag “HagiCode”

How to Implement Automatic Retry for Agent CLIs Like Claude Code and Codex


The phrase automatic retry looks like a small toggle switch, but once you put it into a real engineering environment, it is nothing like that. Hello everyone, I am HagiCode creator Yu Kun. Today, I do not want to trade in empty talk. I want to talk about how automatic retry for Agent CLIs such as Claude Code and Codex should actually be done, so it can both recover from exceptions and avoid dragging the system into endless repeated execution.

If you have also been working on AI coding lately, you have probably already run into this kind of problem: the task does not fail immediately, but breaks halfway through execution.

In an ordinary HTTP request, that often just means sending it again, maybe with some exponential backoff. But Agent CLIs are different. Tools like Claude Code and Codex usually execute in a streaming manner, pushing output out chunk by chunk. During that process, they may also bind to a thread, session, or resume token. In other words, the question is not simply, “Did this request fail or not?” It becomes:

  • Does the content that was already emitted still count?
  • Can the current context continue running?
  • Should this failure be recovered automatically?
  • If it should be recovered, how long should we wait before retrying, what should we send during the retry, and should we still reuse the original context?

The first time many teams build this part, they instinctively write the most naive version: if an error occurs, try once more. That idea is perfectly natural, but once it reaches a real project, one problem after another starts surfacing.

  • Some errors are clearly temporary failures, yet get treated as final failures
  • Some errors are not worth retrying at all, yet the system replays them over and over
  • Requests with a thread and requests without a thread get treated exactly the same
  • The backoff strategy has no boundary, and background requests overload themselves

While integrating multiple Agent CLIs, HagiCode also stepped into these traps. On the Codex side in particular, the first issue we exposed was that a certain type of reconnect message was not recognized as a retryable terminal state, so the recovery mechanism we already had never got a chance to take effect. To put it plainly, it was not that the system lacked automatic retry. The system simply failed to recognize that this particular failure was worth retrying.

So the core point of this article is very clear: automatic retry is not a button, but a layered design.

The approach shared in this article comes from real practice in our HagiCode project. What HagiCode is trying to do is not just connect one model and call it a day. It is about unifying the streaming messages, tool calls, failure recovery, and session context of multiple Agent CLIs into one execution model that can be maintained over the long term.

One of the things I care about most is how to make AI coding truly land in real engineering work. Writing a demo is not hard. The hard part is turning that demo into something a team is genuinely willing to use for a long time. HagiCode takes automatic retry seriously not because the feature looks sophisticated, but because if long-running, streaming, resumable CLI execution is not stable, what users see is not an intelligent assistant, but a command wrapper that drops the connection halfway through every other run.

If you want to look at the project entry points first, here are two:

Taking it one step further, HagiCode is also on Steam now. If you use Steam, feel free to add it to your wishlist first:

Why Automatic Retry for Agent CLIs Is Harder Than Ordinary Retry


This is a very practical question, so let us go straight to the conclusion: the difficulty of automatic retry for Agent CLIs is not “try again after a few seconds,” but “can it still continue in the original context?”

You can think of it as a long conversation. Ordinary API retry is more like redialing when the phone line is busy. Agent CLI retry is more like the signal dropping while the other party is halfway through a sentence, and then you have to decide whether to call back, whether to start over when you do, and whether the other party still remembers where the conversation stopped. These are not the same kind of engineering problem at all.

More concretely, there are four especially typical difficulties.

Once output has already been sent to the user, you can no longer handle failure the way you would with an ordinary request, where you silently swallow it and quietly try again. That is because the earlier content has already been seen. If the replay strategy is wrong, the frontend can easily show duplicated text and inconsistent state, and the lifecycle of tool calls can become tangled as well. This is not metaphysics. It is engineering.

Providers like Codex bind to a thread, and implementations like Claude Code also have a continuation target or an equivalent resumable context. The real prerequisite for automatic retry is not just that the error looks like a temporary failure, but also that there is still a carrier that allows this execution to continue.

Network jitter, SSE idle timeout, and temporary upstream failures are usually worth another try. But if what you are facing is authentication failure, lost context, or a provider that has no resume capability at all, then retrying is usually not recovery. It is noise generation.

Unlimited automatic retry is almost always wrong. Technology trends can be noisy for a while, but engineering laws often remain stable for many years. One of them is that failure recovery must have boundaries. The system has to know how many times it can retry at most, how long it should wait each time, and when it should stop and admit that this one is really not going to recover.
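To make that boundary concrete, here is a small Python sketch (illustrative only; HagiCode itself is C#, and the error kinds here are hypothetical) of a retry loop where both limits are explicit: which errors count as retryable, and how many attempts the backoff schedule allows.

```python
import time

# Hypothetical classification: only clearly temporary failures are retryable.
RETRYABLE = {"network_jitter", "sse_idle_timeout", "upstream_unavailable"}

def run_with_bounded_retry(execute, backoff_schedule=(10, 20, 60), sleep=time.sleep):
    """Retry retryable failures only, at most len(backoff_schedule) times."""
    for delay in list(backoff_schedule) + [None]:
        try:
            return execute()
        except RuntimeError as err:
            if str(err) not in RETRYABLE or delay is None:
                raise  # non-retryable, or retry budget exhausted
            sleep(delay)
```

The schedule length is the retry budget; when it runs out, the failure surfaces instead of being silently replayed forever.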

Because of these characteristics, HagiCode ultimately did not implement automatic retry as a few lines of try/catch inside a specific provider. Instead, we extracted it into a shared capability layer. In the end, engineering problems still need to be solved with engineering methods.

HagiCode’s Approach: Pull Retry Out of the Provider


HagiCode’s current real-world implementation can be compressed into one sentence:

The shared layer manages the retry flow uniformly, and each concrete provider is only responsible for answering two questions: is this terminal state worth retrying, and can the current context still continue?

This is not complicated, but it is critical. Once responsibilities are separated this way, Claude Code, Codex, and even other Agent CLIs can all reuse the same skeleton. Models will change, tools will evolve, workflows will be upgraded, but the engineering foundation remains there.

Layer 1: Use a unified coordinator to manage the retry loop


The core implementation fragment in the project looks roughly like this:

internal static class ProviderErrorAutoRetryCoordinator
{
    public static async IAsyncEnumerable&lt;CliMessage&gt; ExecuteAsync(
        string prompt,
        ProviderErrorAutoRetrySettings? settings,
        Func&lt;string, IAsyncEnumerable&lt;CliMessage&gt;&gt; executeAttemptAsync,
        Func&lt;bool&gt; canRetryInSameContext,
        Func&lt;TimeSpan, CancellationToken, Task&gt; delayAsync,
        Func&lt;CliMessage, bool&gt; isRetryableTerminalMessage,
        [EnumeratorCancellation] CancellationToken cancellationToken)
    {
        var normalizedSettings = ProviderErrorAutoRetrySettings.Normalize(settings);
        var retrySchedule = normalizedSettings.Enabled
            ? normalizedSettings.GetRetrySchedule()
            : [];

        for (var attempt = 0; ; attempt++)
        {
            // First attempt sends the original prompt; later attempts send
            // the unified continuation prompt instead.
            var attemptPrompt = attempt == 0
                ? prompt
                : ProviderErrorAutoRetrySettings.ContinuationPrompt;

            CliMessage? terminalFailure = null;
            await foreach (var message in executeAttemptAsync(attemptPrompt)
                .WithCancellation(cancellationToken))
            {
                if (isRetryableTerminalMessage(message))
                {
                    terminalFailure = message;
                    break;
                }

                yield return message;
            }

            if (terminalFailure is null)
            {
                yield break;
            }

            if (attempt >= retrySchedule.Count || !canRetryInSameContext())
            {
                yield return terminalFailure;
                yield break;
            }

            await delayAsync(retrySchedule[attempt], cancellationToken);
        }
    }
}

What this code does is actually very straightforward, but also very effective.

  • Do not pass intermediate failures through directly at first; the coordinator decides whether recovery is still possible
  • Only when the retry budget is exhausted does the final failure actually return to the upper layer
  • Starting from the second attempt, the original prompt is no longer sent; a continuation prompt is sent uniformly instead

That is why I kept stressing earlier that automatic retry is not simply “make the request again.” It is not just patching an exception branch. It is managing the life cycle of an execution. That may sound like product-manager language, but in engineering terms, that is exactly what it is.

Layer 2: Turn the retry policy into a snapshot that travels with the request

Another issue that is very easy to overlook is this: who decides whether automatic retry is enabled for this request?

HagiCode’s answer is not to depend on some “current global configuration,” but to turn the policy into a snapshot and let it travel together with this request. That way, session queuing, message persistence, execution forwarding, and provider adaptation will not lose the policy along the way. One successful run is not a system. Sustained success is a system.

The core structure can be simplified into this:

public sealed record ProviderErrorAutoRetrySnapshot
{
    public const string DefaultStrategy = "default";

    public bool Enabled { get; init; }
    public string Strategy { get; init; } = DefaultStrategy;

    public static ProviderErrorAutoRetrySnapshot Normalize(bool? enabled, string? strategy)
    {
        return new ProviderErrorAutoRetrySnapshot
        {
            Enabled = enabled ?? true,
            Strategy = string.IsNullOrWhiteSpace(strategy)
                ? DefaultStrategy
                : strategy.Trim()
        };
    }
}

Then on the execution side, it is mapped into the settings object actually consumed by the provider. The value of this approach is very direct:

  • The business layer decides whether retry should be allowed
  • The runtime decides how retry should be performed

Each side manages its own concern without colliding with the other. Many problems are not impossible to solve. Their cost simply has not been made explicit. Turning the policy into a snapshot is essentially a way of accounting for that cost in advance.
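As a cross-language illustration of the same idea (a Python sketch; the names are made up for this post, not HagiCode's API), the policy is normalized once at request time and then carried along as an immutable value:

```python
from dataclasses import dataclass

DEFAULT_STRATEGY = "default"

@dataclass(frozen=True)
class RetrySnapshot:
    enabled: bool
    strategy: str

    @staticmethod
    def normalize(enabled=None, strategy=None):
        # Capture safe defaults up front, so downstream layers
        # (queueing, persistence, providers) never consult global config.
        text = (strategy or "").strip()
        return RetrySnapshot(
            enabled=True if enabled is None else enabled,
            strategy=text if text else DEFAULT_STRATEGY,
        )
```

Because the snapshot is frozen, nothing downstream can quietly mutate the policy halfway through the execution chain.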

Layer 3: Providers only decide terminal state and context viability

Section titled “Layer 3: Providers only decide terminal state and context viability”

Once we reach the concrete Claude Code or Codex provider, the responsibility here actually becomes very thin. You can think of it as enhancement, not replacement.

Taking Codex as an example, when it hooks into the shared coordinator, it really only needs to provide three things:

await foreach (var message in ProviderErrorAutoRetryCoordinator.ExecuteAsync(
    prompt,
    options.ProviderErrorAutoRetry,
    retryPrompt => ExecuteCodexAttemptAsync(...),
    () => !string.IsNullOrWhiteSpace(resolvedThreadId),
    DelayAsync,
    IsRetryableTerminalFailure,
    cancellationToken))
{
    yield return message;
}

You will notice that the provider-specific decisions are really only these two:

  • IsRetryableTerminalFailure
  • canRetryInSameContext

Codex checks whether the thread can still continue, while Claude Code checks whether the continuation target still exists. Backoff policy, retry count, and follow-up prompts should not be reinvented by every provider separately.
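In Python pseudocode (the message shapes here are hypothetical, not the real wire format), the full provider contract can be as thin as two predicates:

```python
# Retryable terminal kinds are matched structurally, not by one literal string.
RETRYABLE_TERMINAL_KINDS = {"stream_disconnected", "reconnect_required"}

def is_retryable_terminal(message: dict) -> bool:
    """Provider decision 1: is this terminal state worth retrying?"""
    return (message.get("type") == "terminal"
            and message.get("kind") in RETRYABLE_TERMINAL_KINDS)

def can_retry_in_same_context(thread_id) -> bool:
    """Provider decision 2 (Codex-style): is there a live thread to resume into?"""
    return bool(thread_id and thread_id.strip())
```

Everything else, including backoff, budget, and the continuation prompt, stays in the shared layer.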

Once this layer is separated out, the cost of integrating more CLIs into HagiCode drops a lot. You do not have to duplicate an entire retry state machine. You only need to plug in the boundary conditions of that provider. Writing quickly is not the same as writing robustly. Being able to connect something is not the same as connecting it well. Getting it to run is also not the same as making it maintainable over time.

An Easy Mistake to Make: Do Not Treat Every Error as Retryable


In this analysis, the point I most want to single out is not “how to implement retry,” but “how to avoid the wrong retries.”

The original entry point into the problem was that Codex failed to recognize one reconnect message. By intuition, many people would pick the smallest possible fix: add one more string prefix to the whitelist. That idea is not exactly wrong, but it feels more like a demo-stage solution than a long-term maintainable one.

From the current HagiCode implementation, the system has already taken a step in a more robust direction. It no longer stares only at one literal string. Instead, it hands recoverable terminal states over to the shared coordinator uniformly. The benefits are obvious:

  • It is less likely to fail completely because of a small wording change in one message
  • Test coverage can be built around the terminal-state envelope rather than a single hard-coded text line
  • Retry logic becomes more consistent within the same provider

Of course, there needs to be a firm boundary here: being more general does not mean being more permissive. If the current context cannot continue, then even if the error looks like a temporary failure, it should not be replayed blindly.

This point is critical. What really makes people trust a system is not that it occasionally works, but that it is reliable most of the time. If a flow can only be maintained by experts, then it is still a long way from real adoption.

The Three Most Valuable Lessons to Keep in Practice


At this point, it makes sense to start bringing the discussion back down to implementation practice. If you are planning to build a similar capability in your own project, these are the three rules I most strongly recommend protecting first.

1. Backoff must have boundaries

HagiCode’s current default backoff rhythm is:

  • 10 seconds
  • 20 seconds
  • 60 seconds

This rhythm may not fit every system, but the existence of boundaries must remain. Otherwise, automatic retry quickly stops being a recovery mechanism and turns into an incident amplifier. Do not rush to give it an impressive name. First make sure the thing can survive two iterations inside a real team.
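One useful sanity check on any schedule like this is its worst case. The list length is the retry budget, and the total time spent waiting is bounded by the sum of the steps:

```python
schedule = [10, 20, 60]          # seconds, as in HagiCode's default rhythm

max_retries = len(schedule)      # the list length is the retry budget: 3
worst_case_wait = sum(schedule)  # at most 90 seconds spent in backoff
```

If either number surprises you for your own system, change the schedule before shipping, not after the first incident.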

2. The continuation prompt should be unified


The project uses a fixed continuation prompt so that later attempts clearly follow the path of continuing the current context rather than starting a brand-new complete request. This capability is not flashy, but when you build a real project, you cannot do without it. Many things that look like magic are, once broken apart, just a polished engineering process.

3. Both the shared library and the adapter layer need mirrored tests


I especially want to say a little more about this point. Many teams will write one layer of tests in the shared runtime and think that is probably enough. It is not.

The reason I feel relatively confident about HagiCode’s implementation is that both layers have test coverage:

  • The shared provider tests whether automatic continuation really happened
  • The adapter layer tests whether final errors and streaming messages were preserved correctly
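To show what "mirrored" means without reproducing the real C# test suites, here is a deliberately tiny synchronous stand-in for the coordinator, with one assertion per layer (everything here is illustrative):

```python
def coordinate(attempts, is_retryable, can_continue, budget):
    """Toy stand-in for the shared retry coordinator (synchronous, simplified)."""
    for attempt, stream in enumerate(attempts):
        failure = None
        for message in stream:
            if is_retryable(message):
                failure = message
                break
            yield message
        if failure is None:
            return  # clean completion
        if attempt >= budget or not can_continue():
            yield failure  # budget spent: surface the final failure
            return

is_retry = lambda m: m == "RETRY"

# Shared-layer test: continuation really happened, the retry marker was swallowed.
ok = list(coordinate([["a", "RETRY"], ["b"]], is_retry, lambda: True, budget=1))
# Adapter-layer test: with no budget, the final error is preserved for the caller.
failed = list(coordinate([["a", "RETRY"]], is_retry, lambda: True, budget=0))
```

The point is that each layer owns one property: the shared layer proves continuation, and the adapter layer proves nothing was dropped.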

This time I also reran the two related test groups, and all 31 test cases across them passed. That result alone does not prove the design is perfect, but it proves at least one thing: the current automatic retry is not a paper design. It is a capability constrained by both code and tests. Talk is cheap. Show me the code. It fits perfectly here.

If the entire article had to be compressed into one sentence, it would be this:

For Agent CLIs such as Claude Code and Codex, automatic retry should not be implemented as a local trick hidden inside one provider. It should be built as a combination of a shared coordinator, policy snapshot, context viability checks, and mirrored tests.

The benefits of doing it this way are very practical:

  • The logic is written once and reused across multiple providers
  • Whether a request is allowed to retry can travel stably with the execution chain
  • Continue running when context exists, and stop in time when it does not
  • What the frontend ultimately sees is a stable completed state or failed state, not a pile of abandoned intermediate noise

This solution was polished little by little while HagiCode was integrating multiple Agent CLIs in real scenarios. Who says AI-assisted programming is not the new era of pair programming? Models help you get started, complete code, and branch out, but what often determines the upper bound of the experience is still context, process, and constraints.

If this article was helpful to you, you are also welcome to look at HagiCode’s public entry points:

HagiCode is already on Steam now. This is not vaporware, and I have put the link right here. If you use Steam, go ahead and add it to your wishlist. Clicking in to take a look yourself is more direct than hearing me say ten more lines about it here.

That is enough on this topic for now. We will keep meeting inside real projects.

Thank you for reading. If you found this article useful, you are welcome to like, bookmark, and share it. This content was created with AI-assisted collaboration, and the final content was reviewed and confirmed by the author.

SQLite Sharding in Practice: An In-Depth Comparison of Three Sharding Strategies


When a single-file SQLite database hits concurrency bottlenecks, how do we break through? This article shares three SQLite sharding approaches from the HagiCode project across different scenarios, helping you understand how to choose the right sharding strategy.

Hello everyone, I am Yu Kun, the creator of HagiCode.

When building high-performance applications, single-file SQLite databases run into very practical problems. Once user count and data volume grow, these issues start lining up one after another:

  • Write operations start queueing up, and response times visibly increase
  • Query performance drops as data volume grows
  • Frequent “database is locked” errors appear under multithreaded access

Many people instinctively ask: should we just migrate directly to PostgreSQL or MySQL? That can solve the problem, but deployment complexity rises sharply. Is there a lighter-weight option?

The answer is sharding. In the end, engineering problems should still be solved with engineering methods. By distributing data across multiple SQLite files, we can significantly improve concurrency and query performance while preserving SQLite’s lightweight characteristics.

The approaches shared in this article come from our practical experience in the HagiCode project. As an AI coding assistant project, HagiCode needs to handle a large volume of conversation messages, state persistence, and event history records. It was through solving these real problems that we summarized three sharding approaches for different scenarios.

Good tools matter, but how you use them depends on the work you actually need to do.

Our code repository is at github.com/HagiCode-org/site. Feel free to take a deeper look if you are interested.

After analyzing the HagiCode codebase, we identified three SQLite sharding approaches for different business scenarios:

  1. Session Message sharded storage: storage for AI conversation messages, characterized by high-frequency writes and session-based isolated queries
  2. Orleans Grain sharded storage: state persistence for a distributed framework, characterized by cross-node access and the need for deterministic routing
  3. Hero History sharded storage: historical event records for a gamified system, characterized by event sourcing and the need for migration compatibility

Although their business scenarios differ, all three follow the same core design principles:

  • Deterministic routing: calculate the shard directly from the business ID, without a metadata table
  • Transparent access: upper layers use a unified interface and remain unaware of the underlying shards
  • Independent storage: each shard is a fully independent SQLite file
  • Concurrency optimization: WAL mode plus busy_timeout reduces lock contention

Many people ask: why not build one generic sharding solution? That is a very practical question, and the conclusion is straightforward: in engineering, there is no universal solution, only the one that best fits the current business scenario. Next, we will compare the concrete implementations of these three approaches in depth.

| Aspect | Session Message | Orleans Grain | Hero History |
| --- | --- | --- | --- |
| Shard count | 256 (16²) | 100 | 10 |
| Naming rule | Hexadecimal (00-ff) | Decimal (00-99) | Decimal (0-9) |
| Storage directory | DataDir/messages/ | DataDir/orleans/grains/ | DataDir/hero-history/ |
| Filename pattern | {shard}.db | grains-{shard}.db | {shard}.db |

Why is there such a large difference in shard counts? It depends on business characteristics. Put another way, models will change, tools will evolve, and workflows will be upgraded, but the engineering fundamentals remain the same: first understand the problem you are actually trying to solve.

  • Session Message uses 256 shards because conversation messages have the highest write frequency and need more shards to spread the load
  • Orleans Grain uses 100 shards, balancing concurrency performance and operational complexity
  • Hero History uses only 10 shards because historical event writes are less frequent and migration cost must be considered

The routing algorithm is the core of a sharding scheme. It determines how data is distributed across shards. The three approaches use different routing strategies:

// Session Message: last two hexadecimal characters of the GUID
var normalized = Guid.Parse(sessionId.Value).ToString("N").ToLowerInvariant();
return normalized[^2..];

// Orleans Grain: extract digits, then use the last two digits modulo shard count
var digits = ExtractDigits(grainId);
var lastTwoDigits = (digits[^2] * 10) + digits[^1];
return lastTwoDigits % shardCount;

// Hero History: modulo 10 using the character code of the last character
return heroId[^1] % 10;

Design analysis:

  • Session Message IDs are GUIDs. After converting to hexadecimal, taking the last two characters gives an even distribution across 256 shards
  • Orleans Grain IDs do not have a consistent format and may contain both letters and digits, so all digits are extracted before taking the modulo
  • Hero History IDs are strings, so the ASCII value of the last character is used directly with modulo. It is simple, but the distribution may be less uniform

Key point: regardless of which algorithm you use, the same ID must always map to the same shard. This is one of the most fundamental requirements in distributed systems. Otherwise, data inconsistency is inevitable. If routing is unstable, every other effort collapses to zero.
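The Session Message rule is easy to property-check. Here is a Python sketch of that invariant (illustrative, mirroring the C# logic above):

```python
import uuid

def shard_of(session_id: str) -> str:
    # Same rule as the C# snippet: last two hex characters of the GUID.
    return uuid.UUID(session_id).hex[-2:]

sid = str(uuid.uuid4())
# Determinism: the same ID always maps to the same shard.
assert all(shard_of(sid) == shard_of(sid) for _ in range(100))
# Range: a two-hex-character suffix yields exactly 256 possible shards.
assert 0 <= int(shard_of(sid), 16) <= 255
```

A check like this belongs in the test suite, because a routing regression is the one bug a sharded store cannot recover from.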

| Aspect | Session Message | Orleans Grain | Hero History |
| --- | --- | --- | --- |
| Initialization timing | Lazy-loaded on demand | Full parallel initialization at startup | Lazy-loaded on demand |
| Concurrency control | Lazy&lt;Task&gt; prevents duplicate initialization | Parallel.ForEachAsync | Lazy&lt;Task&gt; prevents duplicate initialization |

Why does Orleans Grain choose full initialization at startup?

Because Orleans is a distributed framework, a Grain may be scheduled to any node. If a shard file is discovered to be missing only at runtime, requests can fail. Full initialization at startup extends startup time, but it guarantees runtime stability. Getting it running is only the beginning; keeping it maintainable is the real skill.

Advantages of lazy loading:

For Session Message and Hero History, lazy loading reduces startup time. Files and schema are created only when a shard is actually needed. Using Lazy<Task> also prevents race conditions during concurrent initialization. The design looks simple, but in real projects it saves a lot of unnecessary trouble.
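A Python/asyncio analogue of the Lazy&lt;Task&gt; pattern (a sketch, not HagiCode's implementation) shows why it prevents duplicate initialization: the task is created before any await point, so concurrent first accesses all await the same task:

```python
import asyncio

class LazyShardInit:
    def __init__(self, initialize):
        self._initialize = initialize
        self._tasks = {}  # shard key -> the one initialization task

    async def ensure(self, shard_key):
        if shard_key not in self._tasks:
            # No await between the check and the insert, so in asyncio's
            # single-threaded model each shard gets exactly one init task.
            self._tasks[shard_key] = asyncio.create_task(self._initialize(shard_key))
        await self._tasks[shard_key]
```

Every caller that races on the same shard key ends up awaiting the same task, exactly like concurrent awaits on one Lazy&lt;Task&gt; in .NET.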

The schema designs of the three approaches reflect their respective business characteristics:

Session Message:

  • Supports the Event Sourcing model (event table plus snapshot table)
  • Includes a child table for message content blocks (MessageContentBlocks)
  • Has compression and compression-flag fields to support future optimizations

Orleans Grain:

  • Minimalist design: a single GrainState table
  • Stores state as serialized JSON
  • Uses ETag-based optimistic concurrency control

Hero History:

  • Timeline query optimization indexes
  • A unique DedupeKey constraint prevents duplication
  • Supports multiple event types and statuses

These designs show that schema design should stay tightly aligned with business requirements rather than chasing genericity. Orleans Grain is simple precisely because it only needs to store serialized state and does not require complex query capabilities. This is not mysticism. It is engineering. Do not rush to give something a grand name before checking whether it can survive two iterations inside a real team.

All three approaches use the same SQLite concurrency optimization settings:

PRAGMA journal_mode=WAL; -- Write-ahead logging mode
PRAGMA synchronous=NORMAL; -- Reduce persistence overhead
PRAGMA busy_timeout=5000; -- 5-second busy wait
PRAGMA foreign_keys=ON; -- Foreign key constraints

Advantages of WAL mode:

Traditional rollback journal mode causes lock contention during writes, while WAL mode allows reads and writes to proceed concurrently. In concurrent read/write scenarios, this can significantly improve throughput. Many developers overlook this setting, but it matters far more than they think.

The tradeoff of synchronous=NORMAL:

Setting it to FULL provides maximum safety, but it significantly reduces performance. NORMAL strikes a balance between safety and performance, making it the right choice for most applications. There is no need to overthink this one. NORMAL is enough.

Based on the analysis of HagiCode’s three approaches, we can summarize the following decision matrix:

High-throughput scenarios -> more shards (for example, Message uses 256)
Simple maintainability -> fewer shards (for example, Hero History uses 10)
Mostly numeric IDs -> modulo algorithm (Orleans Grain)
Mostly GUIDs -> hexadecimal suffix (Session Message)
String IDs -> ASCII modulo (Hero History)

Rules of thumb for choosing shard counts:

  • Too few (< 10): limited concurrency improvement, making sharding less meaningful
  • Too many (> 1000): file management becomes complex and connection-pool overhead rises
  • Rule of thumb: 10 to 100 shards fit most scenarios
  • Extremely high concurrency scenarios: 256 shards can be considered
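These rules of thumb are cheap to sanity-check before committing to a shard count. The simulation below (Python, illustrative) routes 100,000 random GUIDs with the hex-suffix rule and confirms the load spreads evenly across all 256 shards:

```python
import uuid
from collections import Counter

# Route 100,000 random GUIDs by their last two hex characters.
counts = Counter(uuid.uuid4().hex[-2:] for _ in range(100_000))

assert len(counts) == 256                     # every shard receives traffic
expected = 100_000 / 256                      # ~390 IDs per shard on average
assert max(counts.values()) < expected * 1.5  # no shard is a severe hot spot
```

Running the same simulation with your real ID format is a quick way to catch skew early, which matters especially for schemes like the last-character modulo used by Hero History.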

If you only look at demos, it is easy to get carried away. But once you enter production, every cost has to be calculated carefully. Many things are not impossible, just not honestly priced.

public interface IShardResolver&lt;TId&gt;
{
    string ResolveShardKey(TId id);
}

// Hexadecimal sharding (for GUIDs)
public class HexSuffixShardResolver : IShardResolver&lt;string&gt;
{
    private readonly int _suffixLength;

    public HexSuffixShardResolver(int suffixLength = 2)
    {
        _suffixLength = suffixLength;
    }

    public string ResolveShardKey(string id)
    {
        var normalized = id.Replace("-", "").ToLowerInvariant();
        return normalized[^_suffixLength..];
    }
}

// Numeric modulo sharding (for purely numeric IDs)
public class NumericModuloShardResolver : IShardResolver&lt;long&gt;
{
    private readonly int _shardCount;

    public NumericModuloShardResolver(int shardCount)
    {
        _shardCount = shardCount;
    }

    public string ResolveShardKey(long id)
    {
        return (id % _shardCount).ToString("D2");
    }
}
public class ShardedConnectionFactory&lt;TOptions, TDbContext&gt;
    where TDbContext : class
{
    private readonly ConcurrentDictionary&lt;string, Lazy&lt;Task&gt;&gt; _initializationTasks = new();
    private readonly TOptions _options;
    private readonly IShardSchemaInitializer _initializer;

    public ShardedConnectionFactory(
        TOptions options,
        IShardSchemaInitializer initializer)
    {
        _options = options;
        _initializer = initializer;
    }

    public async Task&lt;TDbContext&gt; CreateAsync(string shardKey, CancellationToken ct)
    {
        var connectionString = BuildConnectionString(shardKey);
        // Use Lazy&lt;Task&gt; to prevent concurrent initialization of the same shard
        var initTask = _initializationTasks.GetOrAdd(
            connectionString,
            _ =&gt; new Lazy&lt;Task&gt;(() =&gt; InitializeShardAsync(connectionString, ct)));
        await initTask.Value;
        return CreateDbContext(connectionString);
    }

    private async Task InitializeShardAsync(string connectionString, CancellationToken ct)
    {
        await _initializer.InitializeAsync(connectionString, ct);
    }

    private string BuildConnectionString(string shardKey)
    {
        // Assumes TOptions exposes a BaseDirectory for shard files
        var shardPath = Path.Combine(_options.BaseDirectory, $"{shardKey}.db");
        return $"Data Source={shardPath}";
    }

    private TDbContext CreateDbContext(string connectionString)
    {
        // Create the DbContext according to the specific ORM
        return Activator.CreateInstance(typeof(TDbContext), connectionString) as TDbContext;
    }
}
public class SqliteShardInitializer : IShardSchemaInitializer
{
    public async Task InitializeAsync(string connectionString, CancellationToken ct)
    {
        await using var connection = new SqliteConnection(connectionString);
        await connection.OpenAsync(ct);

        // Concurrency optimization settings
        await connection.ExecuteAsync("""
            PRAGMA journal_mode=WAL;
            PRAGMA synchronous=NORMAL;
            PRAGMA busy_timeout=5000;
            PRAGMA foreign_keys=ON;
            """);

        // Create table schema
        await connection.ExecuteAsync("""
            CREATE TABLE IF NOT EXISTS Entities (
                Id TEXT PRIMARY KEY,
                CreatedAt TEXT NOT NULL,
                UpdatedAt TEXT NOT NULL,
                Data TEXT NOT NULL,
                ETag TEXT
            );
            """);

        // Create indexes
        await connection.ExecuteAsync("""
            CREATE INDEX IF NOT EXISTS IX_Entities_CreatedAt
                ON Entities(CreatedAt DESC);
            CREATE INDEX IF NOT EXISTS IX_Entities_UpdatedAt
                ON Entities(UpdatedAt DESC);
            """);
    }
}

1. Routing stability

The routing algorithm must guarantee that the same ID always maps to the same shard. Avoid random or time-dependent calculations, and do not introduce mutable parameters into the algorithm.

2. Choosing the shard count

The number of shards should be decided during the design phase. Changing it later is extremely difficult. Consider:

  • Current and future concurrency volume
  • The management cost of each shard
  • The complexity of data migration

3. Migration planning

The Hero History approach demonstrates a complete migration path:

  1. Build the new sharded storage infrastructure
  2. Implement a migration service to copy data from the primary database into the shards
  3. Verify query compatibility after migration
  4. Switch read and write paths to the shards
  5. Clean up legacy tables in the primary database
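Step 3 above, verifying after copying, deserves a concrete shape. Here is a minimal Python sketch of the check, assuming in-memory snapshots of the source table and the shard files (the data layout is hypothetical):

```python
def verify_migration(source_rows, shard_contents, route):
    """source_rows: {id: payload}; shard_contents: {shard_key: {id: payload}}."""
    # Row counts must match exactly: nothing lost, nothing duplicated.
    migrated_total = sum(len(rows) for rows in shard_contents.values())
    if migrated_total != len(source_rows):
        return False
    # Every row must sit in the shard the router says it belongs to.
    for row_id, payload in source_rows.items():
        if shard_contents.get(route(row_id), {}).get(row_id) != payload:
            return False
    return True
```

Checking placement against the router, not just row counts, is what catches the worst failure mode: data that migrated but landed in the wrong shard.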

Future migration requirements need to be considered while designing the sharding scheme. Talk is cheap. Show me the code. But code alone is not enough. You also need a complete migration path. A one-time success is not a system; sustained success is.

4. Monitoring and operations

  • Monitor size distribution across shards to detect data skew early
  • Set alerts for shard hot spots to prevent a single shard from becoming the bottleneck
  • Regularly inspect WAL file sizes to avoid excessive disk usage
  • Establish shard health-check mechanisms

5. Test coverage

  • Test boundary conditions such as empty IDs, special characters, and overly long IDs
  • Verify routing determinism to ensure the same ID always maps to the same shard
  • Run concurrent write stress tests to confirm lock contention is effectively reduced
  • Run migration tests to ensure data integrity and consistency

By comparing the three SQLite sharding approaches in the HagiCode project, we can see that:

  1. There is no universal solution: different business scenarios need different sharding strategies
  2. The core principles are shared: deterministic routing, transparent access, independent storage, and concurrency optimization
  3. Design should face the future: consider migration paths and operational costs

If your project is using SQLite and has started hitting concurrency bottlenecks, I hope this article gives you some useful ideas. There is no need to rush into migrating to a heavyweight database. Sometimes the right sharding strategy is enough to solve the problem.

Of course, sharding is not a silver bullet. Before choosing a sharding strategy, first make sure that:

  • You have already optimized single-table query performance
  • You have already added appropriate indexes
  • You have already enabled WAL mode

Only after these optimizations are done, and a performance bottleneck still remains, should you consider introducing sharding. Doing simple things well is a capability in itself.

Sometimes doing the work once says more than explaining it ten times. From here, let the engineering results speak for themselves.

Thank you for reading. If you found this article useful, feel free to like, bookmark, and share it. This content was created with AI-assisted collaboration, and the final content was reviewed and confirmed by the author.

VSCode and code-server: Choosing a Browser-Based Code Editing Solution


When building browser-based code editing capabilities, developers face a key choice: use VSCode’s official code serve-web feature, or adopt the community-driven code-server solution? This decision affects not only the technical architecture, but also license compliance and deployment flexibility.

Technical selection is a lot like choosing a path in life. Once you pick one, you usually have to keep walking it, and switching later can become very expensive.

In the era of AI-assisted programming, browser-based code editing is becoming increasingly important. Users expect that after an AI assistant finishes analyzing code, they can immediately open an editor in the same browser session and make changes without switching applications. That kind of seamless experience should simply be there when you need it.

However, when implementing this feature, developers face a critical technical choice: should they use VSCode’s official code serve-web feature, or the community-driven code-server solution?

Each option has its own strengths and trade-offs, and choosing poorly can create a lot of trouble later. Licensing is one example: if you only discover after launch that your product is not license-compliant, it is already too late. Deployment is another: a solution might work perfectly in development, then run into all kinds of problems once moved into containers. These are exactly the kinds of pitfalls teams want to avoid.

The approach shared in this article comes from our practical experience in the HagiCode project. HagiCode is an AI-driven coding assistant. While implementing browser-based code editing, we studied both solutions in depth and ultimately designed our architecture to support both, while choosing code-server as the default.

Project repository: github.com/HagiCode-org/site

This is the most fundamental difference between the two solutions, and the first factor we considered during evaluation. When making a technical choice, it is important to understand the legal risks up front.

code-server

  • MIT license, fully open source
  • Maintained by Coder.com with an active community
  • Free to use commercially, modify, and distribute
  • No restrictions on usage scenarios

VSCode code serve-web

  • Part of the Microsoft VSCode product
  • Uses Microsoft’s license (the VS Code license includes restrictions on commercial use)
  • Primarily intended for individual developers
  • Enterprise deployment may require additional commercial licensing review

From a licensing perspective, code-server is more friendly to commercial projects. This is something you need to think through during product planning, because migrating later can become very costly.

Once licensing is settled, the next issue is deployment. That directly affects your operations cost and architectural design.

code-server

  • A standalone Node.js application that can be deployed independently
  • Supports multiple runtime sources:
    • Directly specifying the executable path
    • Looking it up through the system PATH
    • Automatic detection of an NVM Node.js 22.x environment
  • No need to install the VSCode desktop application on the server
  • Easier to deploy in containers

VSCode code serve-web

  • Must depend on a locally installed VSCode CLI
  • Requires an available code command on the host machine
  • The system filters out VS Code Remote CLI wrappers
  • Primarily designed for local development scenarios

code-server is better suited for server and container deployment scenarios. If your product needs to run in Docker, or your users do not have VSCode installed, code-server is usually the right choice.

The two solutions also differ in a few feature parameters. The differences are not huge, but they can create integration friction in real-world usage.

Feature          | code-server                                  | code serve-web
Public base path | / (configurable)                             | /vscode-server (fixed)
Authentication   | --auth parameter with multiple modes         | --connection-token / --without-connection-token
Data directory   | {DataDir}/code-server                        | {DataDir}/vscode-serve-web
Telemetry        | Disabled by default with --disable-telemetry | Depends on VSCode settings
Update checks    | Can be disabled with --disable-update-check  | Depends on VSCode settings

These differences need special attention during integration. For example, different URL paths mean your frontend code needs dedicated handling.

When implementing editor switching, the availability detection logic also differs.

code-server

  • Always returned as a visible implementation
  • Still shown even when unavailable, with an install-required status
  • Supports automatic detection of an NVM Node.js 22.x environment

code serve-web

  • Only visible when a local code CLI is detected
  • If unavailable, the frontend automatically hides this option
  • Depends on the local VSCode installation state

This difference directly affects the user experience. code-server is more transparent: users can see the option and understand that installation is still required. code serve-web is more hidden: users may not even realize the option exists. Which approach is better depends on the product positioning.

HagiCode’s Dual-Implementation Architecture


After in-depth analysis, the HagiCode project adopted a dual-implementation architecture that supports both solutions at the architectural level.

// The default active implementation is code-server
// If an explicit activeImplementation is saved, try that implementation first
// If the requested implementation is unavailable, the resolver tries the other one
// If a fallback occurs, return fallbackReason
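The resolver logic sketched in those comments could look roughly like this in TypeScript; the type and function names are illustrative, not HagiCode's actual API.

```typescript
// Illustrative resolver: default to code-server, honor a saved preference,
// and fall back to the other implementation when the preferred one is down.
type Implementation = "code-server" | "serve-web";

interface ResolvedImplementation {
  implementation: Implementation;
  fallbackReason?: string;
}

function resolveImplementation(
  saved: Implementation | undefined,
  isAvailable: (impl: Implementation) => boolean,
): ResolvedImplementation {
  // No explicit preference saved: default to code-server
  const preferred: Implementation = saved ?? "code-server";
  if (isAvailable(preferred)) {
    return { implementation: preferred };
  }
  // Preferred implementation unavailable: try the other one and record why
  const other: Implementation =
    preferred === "code-server" ? "serve-web" : "code-server";
  if (isAvailable(other)) {
    return { implementation: other, fallbackReason: `${preferred} is unavailable` };
  }
  throw new Error("No VSCode Web implementation is available");
}
```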

We default to code-server mainly because of licensing and deployment flexibility. However, for users who already have a local VSCode environment, code serve-web is also a solid option.

CodeServerImplementationResolver is responsible for unifying:

  • Implementation selection during startup warm-up
  • Implementation selection when reading status
  • Implementation selection when opening projects
  • Implementation selection when opening Vaults

This design allows the system to respond flexibly to different scenarios, and users can choose the implementation that best matches their environment.

// When localCodeAvailable=false, do not show code serve-web
// When localCodeAvailable=true, show the code serve-web configuration

The frontend automatically shows available options based on the environment, so users are not confused by features they cannot use.

After all that theory, what should you pay attention to during actual deployment? In the end, implementation is what matters.

For containerized deployment, code-server is the better choice:

# Use the official code-server image directly
FROM codercom/code-server:latest
# Or install through npm
RUN npm install -g code-server

This solves the problem in a single layer without requiring an additional VSCode installation.

code-server configuration

{
  "vscodeServer": {
    "enabled": true,
    "activeImplementation": "code-server",
    "codeServer": {
      "host": "0.0.0.0",
      "port": 8080,
      "executablePath": "",
      "authMode": "none"
    }
  }
}

code serve-web configuration

{
  "vscodeServer": {
    "enabled": true,
    "activeImplementation": "serve-web",
    "serveWeb": {
      "host": "0.0.0.0",
      "port": 8080,
      "executablePath": "/usr/local/bin/code"
    }
  }
}

Configuration can be a bit tedious the first time, but once it is in place, things become much easier to maintain.

code-server

http://localhost:8080/?folder=/path/to/project&vscode-lang=zh-CN

code serve-web

http://localhost:8080/vscode-server/?folder=/path/to/project&tkn=xxx&vscode-lang=zh-CN

Pay attention to the differences in paths and parameters. You need to handle them separately during integration.
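A hedged sketch of that dedicated handling, based only on the two URL formats above (buildLaunchUrl is a hypothetical helper, not the project's API):

```typescript
// Build a launch URL per implementation: serve-web nests under the fixed
// /vscode-server base path, while code-server serves from the root.
function buildLaunchUrl(
  impl: "code-server" | "serve-web",
  base: string, // e.g. "http://localhost:8080"
  folder: string,
  token?: string,
  lang?: string,
): string {
  const root = impl === "serve-web" ? `${base}/vscode-server/` : `${base}/`;
  const url = new URL(root);
  url.searchParams.set("folder", folder);
  if (token) url.searchParams.set("tkn", token);
  if (lang) url.searchParams.set("vscode-lang", lang);
  return url.toString();
}
```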

The system supports runtime switching and automatically stops the previous implementation when switching:

// VsCodeServerManager automatically handles mutual exclusion
// When switching activeImplementation, the old implementation will not keep running in the background

This design lets users try different implementations at any time and find the option that works best for them.
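The mutual exclusion can be sketched as a small switcher that always stops the old implementation before starting the new one; ImplementationSwitcher and its start/stop hooks are assumptions for illustration, not the real C# VsCodeServerManager.

```typescript
// Illustrative mutual-exclusion switcher: at most one implementation runs,
// and switching always stops the previous one first.
type Impl = "code-server" | "serve-web";

class ImplementationSwitcher {
  private active: Impl | null = null;

  constructor(
    private start: (impl: Impl) => void,
    private stop: (impl: Impl) => void,
  ) {}

  switchTo(next: Impl): void {
    if (this.active === next) return; // already running, nothing to do
    if (this.active !== null) this.stop(this.active); // old impl never lingers
    this.start(next);
    this.active = next;
  }
}
```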

const { settings, runtime } = await getVsCodeServerSettings();
// runtime.activeImplementation: "code-server" | "serve-web"
// runtime.fallbackReason: reason for switching
// runtime.status: "running" | "starting" | "stopped" | "unhealthy"

When status is visible, users can quickly determine whether a problem comes from the server side or from their own operation.

Comparison Dimension   | code-server               | code serve-web                     | Recommendation
License                | MIT (commercial-friendly) | Microsoft (restricted)             | code-server
Deployment flexibility | Independent deployment    | Depends on local VSCode            | code-server
Server suitability     | Designed for servers      | Mainly for local development       | code-server
Containerization       | Native support            | Requires VSCode installation       | code-server
Feature completeness   | Close to desktop edition  | Official complete version          | code serve-web
Maintenance activity   | Active community          | Officially maintained by Microsoft | Both have strengths

Recommended strategy: Use code-server first, and consider code serve-web when you need full official functionality and already have a local VSCode environment.

The approach shared in this article is distilled from HagiCode’s real development experience. If you find this solution valuable, that is also a good sign that HagiCode itself is worth paying attention to.



Thank you for reading. If you found this article useful, feel free to like, bookmark, and share it. This content was created with AI-assisted collaboration, with the final version reviewed and approved by the author.

Fast Code Editing in the Browser: VSCode Web Integration in Practice


After AI finishes analyzing code, how do you immediately open an editor in the browser and start making changes? This article shares our practical experience integrating code-server in the HagiCode project to create a seamless bridge between the AI assistant and the code editing experience.

In the era of AI-assisted programming, developers often need to inspect and edit code quickly. The traditional workflow is simple: open the project in a desktop IDE, locate the file, edit it, and save. But in some situations, that flow always feels slightly off.

Scenario one: remote development. When using an AI assistant like HagiCode, the backend may be running on a remote server or inside a container, and local machines cannot directly access the project files. Every time you need to inspect or modify code, you have to connect through SSH or another method, and the experience feels fragmented. It is like wanting to meet someone through a thick pane of glass: you can see them, but you cannot reach them.

Scenario two: quick previews. After the AI assistant analyzes the code, the user may only want to quickly browse a file or make a small change. Launching a full desktop IDE feels heavy, while a lightweight in-browser editor better fits the need for a “quick look.” After all, who wants to mobilize an entire toolchain just to take a glance?

Scenario three: cross-device collaboration. When working across different devices, a browser-based editor provides a unified access point without requiring every machine to be configured with a development environment. That alone saves a lot of trouble. Life is short; why repeat the same setup work over and over?

To solve these pain points, we integrated VSCode Web into the HagiCode project. This lets the AI assistant and the code editing experience connect seamlessly: after AI analyzes the code, users can immediately open an editor and make changes in the same browser session, without switching applications. It is the kind of experience where, when you need it, it is simply there.

The solution shared in this article comes from our practical experience in the HagiCode project. HagiCode is an AI-driven coding assistant designed to improve development efficiency through natural language interaction. During development, we found that users often need to switch quickly between AI analysis and code editing, which pushed us to explore how to integrate the editor directly into the browser.

Project repository: github.com/HagiCode-org/site

Among the many VSCode Web solutions available, we chose code-server. There were a few concrete reasons behind that decision.

Feature completeness. code-server is the web version of VSCode and supports most desktop features, including the extension system, intelligent suggestions, debugging, and more. That means users can get an editing experience in the browser that is very close to the desktop version. After all, who really wants to compromise on functionality?

Flexible deployment. code-server can run as an independent service and also supports Docker-based deployment, which fits well with HagiCode’s architecture. Our backend is written in C#, the frontend uses React, and the two communicate with the code-server service through REST APIs. It is like building with blocks: every piece has its place.

Secure authentication. code-server includes a built-in connection-token mechanism to prevent unauthorized access. Each session has a unique token so that only authorized users can open the editor. Security is one of those things you only fully appreciate once you have it.

HagiCode’s VSCode Web integration uses a front-end/back-end separated architecture.

The frontend wraps interactions with the backend through vscodeServerService.ts:

// Open project
export async function openProjectInCodeServer(
  id: string,
  currentInterfaceLanguage?: string,
): Promise<VsCodeServerLaunchResponseDto>

// Open vault
export async function openVaultInCodeServer(
  id: string,
  path?: string,
  currentInterfaceLanguage?: string,
): Promise<VsCodeServerLaunchResponseDto>

The difference between these two methods is straightforward: openProjectInCodeServer opens the entire project, while openVaultInCodeServer opens a specific path inside a Vault. For MonoSpecs multi-repository projects, the system automatically creates a workspace file. Clear responsibilities are often enough when each part does its own job well.

The backend VaultAppService.cs implements the core logic:

public async Task<VsCodeServerLaunchResponseDto> OpenInCodeServerAsync(
    string id,
    string? relativePath = null,
    string? currentInterfaceLanguage = null,
    CancellationToken cancellationToken = default)
{
    // 1. Get settings and check whether the feature is enabled
    var settings = await _vsCodeServerSettingsService.GetResolvedSettingsAsync(cancellationToken);
    if (!settings.Enabled)
    {
        throw new BusinessException(VsCodeServerErrorCodes.Disabled, "VSCode Server is disabled.");
    }

    // 2. Get vault and resolve the launch directory
    var vault = await RequireVaultAsync(id, cancellationToken);
    var launchDirectory = ResolveLaunchDirectory(vault, relativePath);

    // 3. Ensure code-server is running and get runtime info
    var runtime = await _vsCodeServerManager.EnsureStartedAsync(settings, cancellationToken);

    // 4. Resolve language settings
    var language = _vsCodeServerSettingsService.ResolveLaunchLanguage(
        settings.Language,
        currentInterfaceLanguage);

    // 5. Build launch URL
    return new VsCodeServerLaunchResponseDto
    {
        LaunchUrl = AppendQueryString(runtime.BaseUrl, new Dictionary<string, string?>
        {
            ["folder"] = launchDirectory,
            ["tkn"] = runtime.ConnectionToken,
            ["vscode-lang"] = language,
        }),
        ConnectionToken = runtime.ConnectionToken,
        OpenMode = "folder",
        Runtime = VsCodeServerSettingsService.MapRuntime(
            await _vsCodeServerManager.GetRuntimeSnapshotAsync(cancellationToken)),
    };
}

This method has a very clear responsibility: check settings, resolve paths, start the service, and build the URL. Among them, the ResolveLaunchDirectory method performs path security checks to prevent path traversal attacks. Code can feel a little like poetry when every line has a purpose.

The backend manages the code-server process through VsCodeServerManager:

  • Check process status
  • Automatically start stopped services
  • Return runtime snapshots such as port, process ID, and start time

This design lets the system automatically handle the code-server lifecycle, so users do not need to manage service processes manually. Life is already complicated enough; anything that can be automated should be.

HagiCode supports a multilingual interface, and code-server needs to follow that setting. The system supports three language modes:

  • follow: follow the current interface language
  • zh-CN: fixed to Chinese
  • en-US: fixed to English

The setting is passed to code-server through the vscode-lang URL parameter so that the editor language stays consistent with the HagiCode interface. Language feels best when it is unified.
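A minimal sketch of the three modes, assuming a fallback to en-US when no interface language is known; resolveLaunchLanguage here only mirrors the C# ResolveLaunchLanguage in spirit, not its actual signature.

```typescript
// Illustrative language resolution for the vscode-lang URL parameter.
type LanguageMode = "follow" | "zh-CN" | "en-US";

function resolveLaunchLanguage(mode: LanguageMode, interfaceLanguage?: string): string {
  if (mode === "follow") {
    // Follow the current HagiCode interface language; en-US is an assumed default
    return interfaceLanguage ?? "en-US";
  }
  return mode; // fixed to zh-CN or en-US regardless of the interface
}
```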

For MonoSpecs projects, which contain multiple sub-repositories inside one monorepo, the system automatically creates a .code-workspace file:

private async Task<string> CreateWorkspaceFileAsync(Project project, Guid projectId)
{
    var folders = await ResolveWorkspaceFoldersAsync(project.Path);
    var workspaceDocument = new
    {
        folders = folders.Select(path => new { path }).ToArray(),
    };
    // Generate workspace file...
}

This makes it possible to edit multiple sub-repositories in the same code-server instance, which is especially practical for large monorepo projects. Multiple repositories in one window can feel like multiple stories gathered in the same book.
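The generation step elided above boils down to serializing a folders array in VS Code's .code-workspace format; buildWorkspaceJson is a hypothetical TypeScript stand-in for the C# code, shown here only to make the file shape concrete.

```typescript
// Serialize a list of sub-repository paths into .code-workspace JSON,
// i.e. { "folders": [{ "path": "..." }, ...] }.
function buildWorkspaceJson(folders: string[]): string {
  const workspaceDocument = {
    folders: folders.map((path) => ({ path })),
  };
  return JSON.stringify(workspaceDocument, null, 2);
}
```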

The HagiCode frontend uses React + TypeScript, and integrating code-server is not especially complicated.

Add a Code Server button to the project card:

QuickActionsZone.tsx
<Button
  size="sm"
  variant="default"
  onClick={() => onAction({ type: 'open-code-server' })}
>
  <Globe className="h-3 w-3 mr-1" />
  <span className="text-xs">{t('project.openCodeServer')}</span>
</Button>

This button triggers the open action and calls the backend API to obtain the launch URL. One button, one action, direct and simple.

const handleAction = async (action: ProjectAction) => {
  if (action.type === 'open-code-server') {
    const response = await openProjectInCodeServer(project.id, i18n.language);
    window.open(response.launchUrl, '_blank', 'noopener,noreferrer');
  }
};

Use window.open to open code-server in a new tab. The noopener,noreferrer parameters provide extra security. When it comes to security, there is no such thing as being too careful.

Add a similar edit button in the Vault list:

const handleEditVault = async (vault: VaultItemDto) => {
  const response = await openVaultInCodeServer(vault.id);
  window.open(response.launchUrl, '_blank', 'noopener,noreferrer');
};

Projects and Vaults use the same open mechanism, which keeps the interaction consistent. Consistency matters almost as much as the feature itself.

The URL format for code-server has a few details worth noting.

Folder mode:

http://{host}:{port}/?folder={path}&tkn={token}&vscode-lang={lang}

Workspace mode:

http://{host}:{port}/?workspace={workspacePath}&tkn={token}&vscode-lang={lang}

Here, tkn is the connection token. It is generated automatically every time code-server starts, ensuring secure access. The vscode-lang parameter controls the editor UI language. Every one of these parameters has a role to play.

The user talks with HagiCode, the AI analyzes the project code and finds a potential issue, and then the user clicks the “Open in Code Server” button to open the editor directly in the browser, inspect the affected file, fix it, and return to HagiCode to continue the conversation. The entire flow happens in the browser without switching applications. It feels smooth in the way running water feels smooth.

Scenario Two: Editing Study Materials in a Vault


A user creates a Vault for studying an open source project and wants to add study notes under the docs/ directory. With code-server, they can edit Markdown files directly in the browser, save them, and let HagiCode immediately read the updated notes. This is especially useful for building a personal knowledge base. Knowledge only becomes more valuable the more you accumulate it.

Scenario Three: MonoSpecs Multi-Repository Development


A MonoSpecs project contains multiple sub-repositories, and code-server automatically creates a multi-folder workspace. In the browser, users can edit code across several repositories at once and then commit changes back to their respective Git repositories. This workflow is particularly well suited for changes that need to span multiple repositories. Editing several repositories together takes a bit of technique, just like handling multiple tasks at the same time.

When implementing code-server integration, security deserves special attention. If security goes wrong, you always notice too late.

The connection-token is generated randomly and should not be exposed. It is best used under HTTPS to prevent the token from being intercepted by a man-in-the-middle. Sensitive information is worth protecting properly.

The backend implements path traversal checks:

private static string ResolveLaunchDirectory(VaultRegistryEntry vault, string? relativePath)
{
    var vaultRoot = EnsureTrailingSeparator(Path.GetFullPath(vault.PhysicalPath));
    var combinedPath = Path.GetFullPath(Path.Combine(vaultRoot, relativePath ?? "."));

    if (!combinedPath.StartsWith(vaultRoot, StringComparison.OrdinalIgnoreCase))
    {
        throw new BusinessException(VaultRelativePathTraversalCode, "Relative path traversal detected.");
    }

    return combinedPath;
}

This code ensures that users cannot use ../ or similar patterns to access files outside the Vault directory. Boundary checks are always better done than skipped.

The code-server process should run with appropriate user permissions so that it cannot access sensitive system files. It is best to run the code-server service under a dedicated user. Permission control is one of those fundamentals you should always keep in place.

code-server consumes server resources, so here are a few optimization suggestions:

  • Monitor CPU and memory usage, and adjust resource limits when necessary
  • Large projects may require longer timeouts
  • Implement automatic session timeout cleanup to release resources
  • Consider caching to reduce repeated computation

HagiCode provides a runtime status monitoring API, and the frontend can call getVsCodeServerSettings() to retrieve the current state:

const { settings, runtime } = await getVsCodeServerSettings();
// runtime.status: 'disabled' | 'stopped' | 'starting' | 'running' | 'unhealthy'
// runtime.baseUrl: "http://localhost:8080"
// runtime.processId: 12345

This design allows users to clearly understand the health status of code-server and quickly locate problems when something goes wrong. When the status is visible, people feel more in control.

During implementation, we discovered a few details that noticeably affect the user experience and deserve extra attention.

Opening code-server for the first time may require waiting for startup, and that delay can range from a few seconds to half a minute. It is a good idea to show a loading state in the frontend so users know the system is still working. Waiting is easier when there is feedback.

Browsers may block the popup, so users should be prompted to allow it manually. On first launch, HagiCode displays guidance that explains how to grant the necessary browser permissions. User experience often lives in exactly these small details.

It is also a good idea to display runtime status such as starting, running, or error, so that when problems occur, users can quickly tell whether the issue is on the server side or in their own operation. Knowing where the problem is at least gives you a place to start.

The configuration for code-server is not complicated:

{
  "vscodeServer": {
    "enabled": true,
    "host": "0.0.0.0",
    "port": 8080,
    "language": "follow"
  }
}

enabled controls whether the feature is turned on, host and port define the listening address, and language sets the language mode. These settings can be modified through the UI and take effect immediately. Simple things are often the easiest to use.

HagiCode’s VSCode Web integration provides an elegant solution: it lets the AI assistant and the code editing experience connect seamlessly. By integrating code-server into the browser, users can quickly act on AI analysis results and complete the full flow from analysis to editing in the same browser session.

This solution brings several key advantages: a unified experience, because projects and Vaults use the same open mechanism; multi-repository support, because MonoSpecs projects automatically create workspaces; and controllable security, thanks to runtime status monitoring and path safety checks.

The approach shared in this article is something HagiCode distilled from real development work. If you find this solution valuable, that suggests our engineering practice is doing something right, and HagiCode itself may be worth a closer look. Good tools deserve to be seen by more people.

  • HagiCode GitHub: github.com/HagiCode-org/site
  • HagiCode official website: hagicode.com
  • code-server official website: coder.com/code-server
  • Related code files:
    • repos/web/src/services/vscodeServerService.ts
    • repos/hagicode-core/src/PCode.Application/Services/VaultAppService.cs
    • repos/hagicode-core/src/PCode.Application/ProjectAppService.VsCodeServer.cs


Thank you for reading. If you found this article useful, feel free to like it, save it, and share it. This content was created with AI-assisted collaboration, and the final content was reviewed and approved by the author.

Building a Cross-Project Knowledge Base for the AI Era with the Vault System


Learning by studying and reproducing real projects is becoming mainstream, but scattered learning materials and broken context make it hard for AI assistants to deliver their full value. This article introduces the Vault system design in the HagiCode project: through a unified storage abstraction layer, AI assistants can understand and access all learning resources, enabling true cross-project knowledge reuse.

In the AI era, the way we learn new technologies is quietly changing. Traditional approaches like reading books and watching videos still matter, but “studying and reproducing projects” - deeply researching the code, architecture, and design patterns of excellent open source projects - is clearly becoming more efficient. Running and modifying high-quality open source projects directly is one of the fastest ways to understand real-world engineering practice.

But this approach also brings new challenges.

Learning materials are too scattered. Notes might live in Obsidian, code repositories may be spread across different folders, and an AI assistant’s conversation history becomes a separate data island. Every time you need AI help analyzing a project, you have to manually copy code snippets and organize context, which is quite tedious.

Context keeps getting lost. AI assistants cannot directly access local learning resources, so every conversation starts with re-explaining background information. The code repositories you study update quickly, and manual synchronization is error-prone. Worse still, knowledge is hard to share across multiple learning projects - the design patterns learned in project A are completely unknown to the AI when it works on project B.

At the core, these issues are all forms of “data islands.” If there were a unified storage abstraction layer that let AI assistants understand and access all learning resources, the problem would be solved.

To address these pain points, we made a key design decision while developing HagiCode: build a Vault system as a unified knowledge storage abstraction layer. The impact of that decision may be even greater than you expect - more on that shortly.

The approach shared in this article comes from practical experience in the HagiCode project. HagiCode is an AI coding assistant based on the OpenSpec workflow. Its core idea is that AI should not only be able to “talk,” but also be able to “do” - directly operate on code repositories, execute commands, and run tests. GitHub: github.com/HagiCode-org/site

During development, we found that AI assistants need frequent access to many kinds of user learning resources: code repositories, notes, configuration files, and more. If users had to provide everything manually each time, the experience would be terrible. That led us to design the Vault system.

HagiCode’s Vault system supports four types, each corresponding to different usage scenarios:

Type           | Purpose                                          | Typical Scenario
folder         | General-purpose folder type                      | Temporary learning materials, drafts
coderef        | Designed specifically for studying code projects | Systematically learning an open source project
obsidian       | Integrates with Obsidian note-taking software    | Reusing an existing notes library
system-managed | Managed automatically by the system              | Project configuration, prompt templates, and more

Among them, the coderef type is the most commonly used in HagiCode. It provides a standardized directory structure and AI-readable metadata descriptions for code-study projects. Why design this type specifically? Because studying an open source project is not as simple as “downloading code.” You also need to manage the code itself, learning notes, configuration files, and other content at the same time, and coderef standardizes all of that.

The Vault registry is persisted to the file system as JSON:

_registryFilePath = Path.Combine(absoluteDataDir, "personal-data", "vaults", "registry.json");

This design may look simple, but it was carefully considered:

Simple and reliable. JSON is human-readable, making it easy to debug and modify manually. When something goes wrong, you can open the file directly to inspect the state or even repair it by hand - especially useful during development.

Reduced dependencies. File system storage avoids the complexity of a database. There is no need to install and configure an extra database service, which reduces system complexity and maintenance cost.

Concurrency-safe. SemaphoreSlim is used to guarantee thread safety. In an AI coding assistant scenario, multiple operations may access the Vault registry at the same time, so concurrency control is necessary.

The system’s core capability is that it can automatically inject Vault information into the context of AI proposals:

export function buildTargetVaultsText(
  vaults: VaultForText[],
  template: VaultPromptTemplate = DEFAULT_VAULT_PROMPT_TEMPLATE,
): string {
  const readOnlyVaults = vaults.filter((vault) => vault.accessType === 'read');
  const editableVaults = vaults.filter((vault) => vault.accessType === 'write');
  const sections = [
    buildVaultSection(readOnlyVaults, template.reference),
    buildVaultSection(editableVaults, template.editable),
  ].filter(Boolean);
  return `\n\n### ${template.heading}\n\n${sections.join('\n')}`;
}
```

This allows the AI assistant to automatically understand which learning resources are available, without requiring the user to provide context manually every time. It makes the HagiCode experience feel especially natural - tell the AI, “Help me analyze React concurrent rendering,” and it can automatically find the previously registered React learning Vault instead of asking you to paste code over and over again.
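
To make the injection step concrete, here is a self-contained sketch of calling `buildTargetVaultsText`. The template values and the `buildVaultSection` helper are illustrative assumptions for this sketch; the real implementations live in the HagiCode codebase:

```typescript
// Minimal, self-contained sketch; helper shapes below are assumptions.
interface VaultForText {
  name: string;
  physicalPath: string;
  accessType: 'read' | 'write';
}

interface VaultPromptTemplate {
  heading: string;
  reference: string;
  editable: string;
}

const DEFAULT_VAULT_PROMPT_TEMPLATE: VaultPromptTemplate = {
  heading: 'Target Vaults',
  reference: 'Read-only reference vaults',
  editable: 'Editable vaults',
};

// Assumed helper: renders one labeled group of vaults, or '' when empty.
function buildVaultSection(vaults: VaultForText[], label: string): string {
  if (vaults.length === 0) return '';
  const lines = vaults.map((v) => `- ${v.name} (${v.physicalPath})`);
  return `**${label}**\n${lines.join('\n')}`;
}

function buildTargetVaultsText(
  vaults: VaultForText[],
  template: VaultPromptTemplate = DEFAULT_VAULT_PROMPT_TEMPLATE,
): string {
  const readOnlyVaults = vaults.filter((vault) => vault.accessType === 'read');
  const editableVaults = vaults.filter((vault) => vault.accessType === 'write');
  const sections = [
    buildVaultSection(readOnlyVaults, template.reference),
    buildVaultSection(editableVaults, template.editable),
  ].filter(Boolean);
  return `\n\n### ${template.heading}\n\n${sections.join('\n')}`;
}

const prompt = buildTargetVaultsText([
  { name: 'React Learning Vault', physicalPath: '/vaults/react-learning', accessType: 'read' },
]);
// prompt now contains a "### Target Vaults" section listing the read-only vault.
```

The resulting text is appended to the proposal context, which is how the AI learns about available vaults without the user pasting anything.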

The system divides Vaults into two access types:

  • reference (read-only): AI can only use the content for analysis and understanding, without modifying it
  • editable (modifiable): AI can modify the content as needed for the task

This distinction tells the AI which content is “read-only reference” and which content it is allowed to modify, reducing the risk of accidental changes. For example, if you register an open source project’s Vault as learning material, you definitely do not want AI casually editing the code inside it - so mark it as reference. But if it is your own project Vault, you can mark it as editable and let AI help modify the code.

Standardized Structure for a CodeRef Vault


For coderef Vaults, the system provides a standardized directory structure:

my-coderef-vault/
├── index.yaml # vault metadata description
├── AGENTS.md # operating guide for AI assistants
├── docs/ # stores learning notes and documentation
└── repos/ # manages referenced code repositories through Git submodules

What is the design philosophy behind this structure?

docs/ stores learning notes, using Markdown to record your understanding of the code, architecture analysis, and lessons from debugging. These notes are not only for you - AI can understand them too, and will automatically reference them when handling related tasks.

repos/ manages the studied repositories through Git submodules rather than by copying code directly. This has two benefits: first, it stays in sync with upstream, and a single git submodule update fetches the latest code; second, it saves space, because multiple Vaults can reference different versions of the same repository.

index.yaml contains Vault metadata so the AI assistant can quickly understand its purpose and contents. It is essentially a “self-introduction” for the Vault: the first time the AI sees it, it knows what the Vault is for.

AGENTS.md is a guide written specifically for AI assistants, explaining how to handle the content inside the Vault. You can tell the AI things like: “When analyzing this project, focus on code related to performance optimization” or “Do not modify test files.”

Creating a CodeRef Vault is simple:

const createCodeRefVault = async () => {
  const response = await VaultService.postApiVaults({
    requestBody: {
      name: "React Learning Vault",
      type: "coderef",
      physicalPath: "/Users/developer/vaults/react-learning",
      gitUrl: "https://github.com/facebook/react.git"
    }
  });
  // The system will automatically:
  // 1. Clone the React repository to vault/repos/react
  // 2. Create the docs/ directory for notes
  // 3. Generate index.yaml metadata
  // 4. Create the AGENTS.md guide file
  return response;
};

Then reference this Vault in an AI proposal:

const proposal = composeProposalChiefComplaint({
  chiefComplaint: "Help me analyze React's concurrent rendering mechanism",
  repositories: [
    { id: "react", gitUrl: "https://github.com/facebook/react.git" }
  ],
  vaults: [
    {
      id: "react-learning",
      name: "React Learning Vault",
      type: "coderef",
      physicalPath: "/vaults/react-learning",
      accessType: "read" // AI can only read, not modify
    }
  ],
  quickRequestText: "Pay special attention to the Fiber architecture and scheduler implementation"
});

Scenario 1: Systematically studying open source projects

Create a CodeRef Vault, manage the target repository through Git submodules, and record learning notes in the docs/ directory. AI can access both the code and the notes at the same time, providing more accurate analysis. Notes written while studying a module are automatically referenced by the AI when it later analyzes related code - like having an “assistant” that remembers your previous thinking.

Scenario 2: Reusing an Obsidian notes library

If you are already using Obsidian to manage notes, just register your existing Vault in HagiCode directly. AI can access your knowledge base without manual copy-paste. This feature is especially practical because many people have years of accumulated notes, and once connected, AI can “read” and understand that knowledge system.

Scenario 3: Cross-project knowledge reuse

Multiple AI proposals can reference the same Vault, enabling knowledge reuse across projects. For example, you can create a “design patterns learning Vault” that contains notes and code examples for many design patterns. No matter which project the AI is analyzing, it can refer to the content in that Vault - knowledge does not need to be accumulated repeatedly.

The system strictly validates paths to prevent path traversal attacks:

private static string ResolveFilePath(string vaultRoot, string relativePath)
{
    var rootPath = EnsureTrailingSeparator(Path.GetFullPath(vaultRoot));
    var combinedPath = Path.GetFullPath(Path.Combine(rootPath, relativePath));
    if (!combinedPath.StartsWith(rootPath, StringComparison.OrdinalIgnoreCase))
    {
        throw new BusinessException(VaultRelativePathTraversalCode,
            "Vault file paths must stay inside the registered vault root.");
    }
    return combinedPath;
}

This ensures all file operations stay within the Vault root directory and prevents malicious path access. Security is not something to take lightly. If an AI assistant is going to operate on the file system, the boundaries must be clearly defined.

When using the HagiCode Vault system, there are several things to pay special attention to:

  1. Path safety: Make sure custom paths stay within the allowed scope, otherwise the system will reject the operation. This prevents accidental misuse and potential security risks.

  2. Git submodule management: CodeRef Vaults are best managed with Git submodules instead of directly copying code. The benefits were covered earlier - keeping in sync and saving space. That said, submodules have their own workflow, so first-time users may need a little time to get familiar with them.

  3. File preview limits: The system limits file size (256KB) and quantity (500 files), so oversized files need to be handled in batches. This limit exists for performance reasons. If you run into very large files, you can split them manually or process them another way.

  4. Diagnostic information: Creating a Vault returns diagnostic information that can be used for debugging on failure. Check the diagnostics first when you run into issues - in most cases, that is where you will find the clue.

The HagiCode Vault system is fundamentally solving a simple but profound problem: how to let AI assistants understand and use local knowledge resources.

Through a unified storage abstraction layer, a standardized directory structure, and automated context injection, it delivers a knowledge management model of “register once, reuse everywhere.” Once a Vault is created, AI can automatically access and understand learning notes, code repositories, and documentation resources.

The experience improvement from this design is obvious. There is no longer any need to manually copy code snippets or repeatedly explain background information - the AI assistant becomes more like a teammate who truly understands the project and can provide more valuable help based on existing knowledge.

The Vault system shared in this article is a solution shaped through real trial and error and real optimization during HagiCode development. If you think this design is valuable, that says something about the engineering behind it - and HagiCode itself is worth checking out as well.

If this article helped you: the public beta has started, and you are welcome to install HagiCode and give it a try.

Thank you for reading. If you found this article useful, please like, save, and share it. This content was created with AI-assisted collaboration, and the final version was reviewed and confirmed by the author.

Edit DESIGN.md Directly in the Web Interface: From Idea to Implementation



In the MonoSpecs project management system, DESIGN.md carries the architectural design and technical decisions of a project. But the traditional editing workflow forces users to jump out to an external editor. That fragmented experience is like being interrupted in the middle of reading a poem: the inspiration is gone, and so is the mood. This article shares the solution we put into practice in the HagiCode project: editing DESIGN.md directly in the web interface, with support for importing templates from an online design site. After all, who does not enjoy the feeling of completing everything in one flow?

As the core carrier of project design documents, DESIGN.md holds key information such as architecture design, technical decisions, and implementation guidance. However, the traditional editing approach requires users to switch to an external editor such as VS Code, manually locate the physical path, and then edit the file. It is not especially complicated, but after repeating the process a few times, it becomes tiring.

The problems mainly show up in the following ways:

  • Fragmented workflow: users must constantly switch between the web management interface and a local editor, breaking the continuity of their workflow, much like having the music cut out in the middle of a song.
  • Hard to reuse: the design site already publishes a rich library of design templates, but they cannot be integrated directly into the project editing workflow. The good stuff exists, but you still cannot use it where you need it.
  • Missing experience loop: there is no closed loop for “preview-select-import,” so users must copy and paste manually, which increases the risk of mistakes.
  • Collaboration friction: keeping design documents and code implementation in sync becomes a high-friction process, which hurts team efficiency.

To solve these pain points, we decided to add direct editing support for DESIGN.md in the web interface and allow one-click template import from an online design site. It was not some earth-shaking decision. We simply wanted to make the development experience smoother.

The solution shared in this article comes from our hands-on experience in the HagiCode project. HagiCode is an AI-driven coding assistant project, and during development we frequently need to maintain project design documents. To help the team collaborate more efficiently, we explored and implemented this online editing and import solution. There is nothing mysterious about it. We ran into a problem and worked out a way to solve it.

This solution uses a frontend-backend separated architecture with a same-origin proxy, mainly composed of the following layers. In practice, the design can be summed up as “each part doing its own job”:

1. Frontend editor layer

repos/web/src/components/project/DesignMdManagementDrawer.tsx
// Core component: DesignMdManagementDrawer
// Responsibility: handle editing, saving, version conflict detection, and import flow

2. Backend service layer

ProjectAppService.DesignMd
// Location: repos/hagicode-core/src/PCode.Application/ProjectAppService.DesignMd.cs
// Responsibility: path resolution, file read/write, and version management

3. Same-origin proxy layer

ProjectAppService.DesignMdSiteIndex
// Location: repos/hagicode-core/src/PCode.Application/ProjectAppService.DesignMdSiteIndex.cs
// Responsibility: proxy design site resources, preview image caching, and security validation

We use a single global drawer instead of local pop-up layers, with state managed through layoutSlice, which gives users a consistent experience across views (classic and kanban). No matter which view the user opens the editor from, they get the same interaction model. A consistent experience makes people feel more at ease instead of getting disoriented when they switch views.

We mounted DESIGN.md-related endpoints under ProjectController, reusing the existing project permission boundary and avoiding the complexity of adding a separate controller. This makes permission handling clearer and also aligns with RESTful resource organization. Sometimes reuse is more meaningful than creating something new from scratch.

We derive an opaque version from the file system’s LastWriteTimeUtc, which gives us lightweight optimistic concurrency control. When multiple users edit the same file at once, the system can detect conflicts and prompt the user to refresh. This design does not block editing, while still protecting data consistency.

We use IHttpClientFactory to proxy external design-site resources, avoiding both cross-origin issues and SSRF risks. This keeps the system secure while also simplifying frontend calls. You can hardly be too careful with security.

The backend is mainly responsible for path resolution, file read/write, and version management. These tasks are basic, but indispensable, like the foundation of a house:

// Path resolution and security validation
private Task<string> ResolveDesignDocumentDirectoryAsync(string projectPath, string? repositoryPath)
{
    if (string.IsNullOrWhiteSpace(repositoryPath))
    {
        return Task.FromResult(Path.GetFullPath(projectPath));
    }
    return ValidateSubPathAsync(projectPath, repositoryPath);
}

// Version generation (based on file system timestamp and size)
private static string BuildDesignDocumentVersion(string path)
{
    var fileInfo = new FileInfo(path);
    fileInfo.Refresh();
    return string.Create(
        CultureInfo.InvariantCulture,
        $"{fileInfo.LastWriteTimeUtc.Ticks:x}-{fileInfo.Length:x}");
}

The version design is interesting in its simplicity: we use the file’s last modified time and size to generate a unique version identifier. It is lightweight and reliable, with no extra version database to maintain. Simple solutions are often the most effective.
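
The same scheme is easy to reproduce in any runtime with file metadata. Here is a TypeScript sketch mirroring the C# code above (an illustration of the idea, not the actual service): an opaque token built from the file's last-write time and size in hex.

```typescript
import { statSync, writeFileSync } from 'node:fs';
import { join } from 'node:path';
import { tmpdir } from 'node:os';

// Build an opaque version token from last-write time and file size, in hex.
function buildDesignDocumentVersion(filePath: string): string {
  const info = statSync(filePath);
  return `${Math.trunc(info.mtimeMs).toString(16)}-${info.size.toString(16)}`;
}

// Any write that changes the timestamp or the length produces a new token,
// which is all optimistic concurrency needs to detect a conflict.
const docPath = join(tmpdir(), 'DESIGN.md');
writeFileSync(docPath, '# Design\n');
const v1 = buildDesignDocumentVersion(docPath);
```

The token is compared with `expectedVersion` on save; if they differ, the file changed on disk and the save is rejected.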

On the frontend, we implement dirty-state detection and save logic. This design helps users understand whether their changes have been saved and reduces the anxiety of “what if I lose it?”:

// Dirty-state detection and save logic
const [draft, setDraft] = useState('');
const [savedDraft, setSavedDraft] = useState('');
const isDirty = draft !== savedDraft;

const handleSave = useCallback(async () => {
  const result = await saveProjectDesignMdDocument({
    ...activeTarget,
    content: draft,
    expectedVersion: document.version, // optimistic concurrency control
  });
  setSavedDraft(draft); // update saved state
}, [activeTarget, document, draft]);

In this implementation, we maintain two pieces of state: draft is the content currently being edited, while savedDraft is the saved content. Comparing them tells us whether there are unsaved changes. The design is simple, but it gives people peace of mind. Nobody wants the thing they worked hard on to disappear.

2. Import Design Files from an Online Source

repos/index/
└── src/data/public/design.json # Design template index
repos/awesome-design-md-site/
├── vendor/awesome-design-md/ # Upstream design templates
│ └── design-md/
│ ├── clickhouse/
│ │ └── DESIGN.md
│ ├── linear/
│ │ └── DESIGN.md
│ └── ...
└── src/lib/content/
└── awesomeDesignCatalog.ts # Content pipeline

The index file on the design site defines all available templates. With this index, users can choose the template they want as easily as ordering from a menu:

{
  "entries": [
    {
      "slug": "linear.app",
      "title": "Linear Inspired Design System",
      "summary": "AI Product / Dark Feel",
      "detailUrl": "/designs/linear.app/",
      "designDownloadUrl": "/designs/linear.app/DESIGN.md",
      "previewLightImageUrl": "...",
      "previewDarkImageUrl": "..."
    }
  ]
}

Each entry includes the template’s basic information and download links. The backend reads the list of available templates from this index and presents them for the user to choose from. That makes selection intuitive instead of forcing people to feel their way around in the dark.

To keep things secure, the backend performs strict validation on access to the design site. You cannot be too cautious about security:

// Safe slug validation
private static readonly Regex SafeDesignSiteSlugRegex =
    new("^[A-Za-z0-9](?:[A-Za-z0-9._-]{0,127})$", RegexOptions.Compiled);

private static string NormalizeDesignSiteSlug(string slug)
{
    var normalizedSlug = slug?.Trim() ?? string.Empty;
    if (!IsSafeDesignSiteSlug(normalizedSlug))
    {
        throw new BusinessException(
            ProjectDesignSiteIndexErrorCodes.InvalidSlug,
            "Design site slug must be a single safe path segment.");
    }
    return normalizedSlug;
}

// Preview image caching (OS temp directory)
private static string ComputePreviewCacheKey(string slug, string theme, string previewUrl)
{
    var raw = $"{slug}|{theme}|{previewUrl}";
    var bytes = SHA256.HashData(Encoding.UTF8.GetBytes(raw));
    return Convert.ToHexString(bytes).ToLowerInvariant();
}

We do two things here: first, we validate the slug format strictly with a regular expression to prevent path traversal attacks; second, we cache preview images to reduce pressure on the external site. The former is protection, the latter is optimization, and both matter.
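
For readers more at home on the frontend side, here is a TypeScript rendering of the same two checks (illustrative, mirroring the C# above rather than reproducing HagiCode's backend):

```typescript
import { createHash } from 'node:crypto';

// Mirror of the C# regex: first char alphanumeric, then up to 127 safe chars.
const SAFE_DESIGN_SITE_SLUG = /^[A-Za-z0-9](?:[A-Za-z0-9._-]{0,127})$/;

function normalizeDesignSiteSlug(slug: string | null | undefined): string {
  const normalized = (slug ?? '').trim();
  if (!SAFE_DESIGN_SITE_SLUG.test(normalized)) {
    throw new Error('Design site slug must be a single safe path segment.');
  }
  return normalized;
}

// Cache key: SHA-256 over slug, theme, and preview URL, hex-encoded.
function computePreviewCacheKey(slug: string, theme: string, previewUrl: string): string {
  return createHash('sha256').update(`${slug}|${theme}|${previewUrl}`).digest('hex');
}
```

With these, `normalizeDesignSiteSlug('linear.app')` passes, while an input like `'../etc'` throws before it can ever reach the file system.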

// 1. Open the import drawer
const handleRequestImportDrawer = useCallback(() => {
  setIsImportDrawerOpen(true);
}, []);

// 2. Select and import
const handleImportRequest = useCallback((entry) => {
  if (isDirty) {
    setPendingImportEntry(entry);
    setConfirmMode('import'); // overwrite confirmation
    return;
  }
  void executeImport(entry);
}, [isDirty]);

// 3. Execute import
const executeImport = useCallback(async (entry) => {
  const result = await getProjectDesignMdSiteImportDocument(
    activeTarget.projectId,
    entry.slug
  );
  setDraft(result.content); // replace editor text only, do not save automatically
  setIsImportDrawerOpen(false);
}, [activeTarget?.projectId]);

The import flow follows a “user confirmation” principle: after import, only the editor content is updated, and nothing is saved automatically. Users can inspect the imported content and save it manually only after confirming it looks right. The final decision should stay in the hands of the user.

Scenario 1: Creating DESIGN.md in the Project Root


When DESIGN.md does not exist, the backend returns a virtual document state. This lets the frontend avoid special handling for the “file does not exist” case, and a unified API simplifies the code logic:

return new ProjectDesignDocumentDto
{
    Path = targetPath,
    Exists = false, // virtual document state
    Content = string.Empty,
    Version = null
};

// Automatically create the file on first save
public async Task<SaveProjectDesignDocumentResultDto> SaveDesignDocumentAsync(...)
{
    Directory.CreateDirectory(targetDirectory);
    await File.WriteAllTextAsync(targetPath, input.Content);
    return new SaveProjectDesignDocumentResultDto { Created = !exists };
}

By hiding the "file does not exist" complexity in the backend, the frontend can focus on the user experience instead of special-case handling.

Scenario 2: Import a Template from the Design Site


After the user selects the “Linear” design template in the import drawer, the system fetches the DESIGN.md content through the backend proxy. The whole process is transparent to the user: they only choose a template, and the system handles the network requests and data transformation automatically.

// 1. The system fetches DESIGN.md content through the backend proxy
GET /api/project/{id}/design-md/site-index/linear.app
// 2. The backend validates the slug and fetches content from upstream
var entry = FindDesignSiteEntry(catalog, "linear.app");
using var upstreamResponse = await httpClient.SendAsync(request);
var content = await upstreamResponse.Content.ReadAsStringAsync();
// 3. The frontend replaces the editor text
setDraft(result.content);
// The user reviews it and then saves it manually to disk

That is the experience we want: simple, but powerful.

When multiple users edit the same DESIGN.md at the same time, the system detects version conflicts. This optimistic concurrency control mechanism preserves data consistency without blocking the user’s edits:

if (!string.Equals(currentVersion, expectedVersion, StringComparison.Ordinal))
{
    throw new BusinessException(
        ProjectDesignDocumentErrorCodes.VersionConflict,
        $"DESIGN.md at '{targetPath}' changed on disk.");
}

The frontend catches this error and prompts the user:

// Frontend prompts the user to refresh and retry
<Alert>
  <AlertTitle>Version conflict</AlertTitle>
  <AlertDescription>
    The file was modified by another process. Please refresh to get the latest version and try again.
  </AlertDescription>
</Alert>

Conflicts are unavoidable, but at least users should know what happened instead of silently losing their changes.

Always validate repositoryPath to prevent path traversal attacks. You can never do too much when it comes to security:

// Always validate repositoryPath to prevent path traversal attacks
return ValidateSubPathAsync(projectPath, repositoryPath);
// Reject dangerous inputs such as "../" and absolute paths
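
A minimal sketch of that check in TypeScript (the names here are illustrative; the real validation lives in `ValidateSubPathAsync` on the backend): resolve the candidate path against the project root and reject anything that escapes it.

```typescript
import * as path from 'node:path';

// Resolve a candidate repository path and reject anything that escapes the
// project root, including "../" segments and absolute paths.
function resolveSubPath(projectRoot: string, repositoryPath: string): string {
  const root = path.resolve(projectRoot) + path.sep;
  const combined = path.resolve(projectRoot, repositoryPath);
  if (!(combined + path.sep).startsWith(root)) {
    throw new Error('repositoryPath must stay inside the project root');
  }
  return combined;
}
```

Resolving first and then comparing prefixes is what defeats `../` tricks: by the time the comparison runs, all relative segments have already been collapsed.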

Cache preview images for 24 hours, with a maximum of 160 files. Moderate caching improves performance, but balance still matters:

// Cache preview images for 24 hours, with a maximum of 160 files
private static readonly TimeSpan PreviewCacheTtl = TimeSpan.FromHours(24);
private const int PreviewCacheMaxFiles = 160;
// Periodically clean up expired cache
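
One way to sketch the expiry side of this in TypeScript (illustrative; the actual cleanup lives in the backend service): a cached preview file counts as fresh only while its modification time is within the TTL.

```typescript
import { statSync, writeFileSync, utimesSync } from 'node:fs';
import { join } from 'node:path';
import { tmpdir } from 'node:os';

const PREVIEW_CACHE_TTL_MS = 24 * 60 * 60 * 1000; // 24 hours

// A cached preview file counts as fresh while its mtime is within the TTL.
function isCacheEntryFresh(filePath: string, now = Date.now()): boolean {
  return now - statSync(filePath).mtimeMs < PREVIEW_CACHE_TTL_MS;
}

const cached = join(tmpdir(), 'preview-cache-demo.png');
writeFileSync(cached, 'fake image bytes');
// A file written just now is fresh; back-dating its mtime past the TTL expires it.
const freshNow = isCacheEntryFresh(cached);
const old = new Date(Date.now() - 25 * 60 * 60 * 1000);
utimesSync(cached, old, old);
const freshAfterBackdate = isCacheEntryFresh(cached);
```

Driving freshness off the file's own mtime means no separate expiry index is needed; a periodic sweep can simply delete any file that fails this check.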

Gracefully degrade when the upstream site is unavailable. This design ensures that even if an external dependency fails, the core editing functionality still works normally:

// Gracefully degrade when the upstream site is unavailable
try {
  const catalog = await getProjectDesignMdSiteImportCatalog(projectId);
} catch (error) {
  toast.error(t('project.designMd.siteImport.feedback.catalogLoadFailed'));
  // The main editing drawer remains available
}

A system should be resilient instead of collapsing the moment something goes wrong.

Confirm overwrites before importing, and do not save automatically after import. Users should stay in control of their own actions:

// Confirm overwrite before import
if (isDirty) {
  setConfirmMode('import');
  return;
}

// Do not save automatically after import; let the user confirm
setDraft(result.content); // update draft only
// The content is written to disk only after the user reviews it and clicks Save

Use an HTTP client factory to avoid creating too many connections. Resource management may seem small, but doing it well can make a big difference:

// Use an HTTP client factory to avoid creating too many connections
private const string DesignSiteProxyClientName = "ProjectDesignSiteProxy";
private static readonly TimeSpan DesignSiteProxyTimeout = TimeSpan.FromSeconds(8);

Looking ahead, there are several improvements worth exploring:

  1. Markdown enhancement: we currently use a basic Textarea, but we could upgrade to CodeMirror for syntax highlighting and keyboard shortcuts. When the editor feels better, writing documentation feels better too.
  2. Preview mode: add real-time Markdown preview to improve the editing experience. What-you-see-is-what-you-get always gives people more confidence.
  3. Diff merge: implement an intelligent merge algorithm instead of simple full-text replacement. Conflicts are inevitable, but the conflict-resolution process does not have to be painful.
  4. Local caching: cache design.json in the database to reduce dependency on the external site. The fewer dependencies a system has, the more stable it tends to be.

In the HagiCode project, we implemented a complete online editing and import solution for DESIGN.md through frontend-backend collaboration. The core value of this solution lies in the following points:

  • Higher efficiency: no need to switch tools; editing and importing design documents can happen in one unified web interface.
  • Lower barrier to entry: one-click design template import helps new projects get started quickly.
  • Secure and reliable: path validation, version conflict detection, and graceful degradation mechanisms keep the system stable.
  • Better user experience: the global drawer, dirty-state detection, and confirmation dialogs refine the overall interaction experience.

This solution is already running in the HagiCode project and has solved the team’s pain points around design document management. If you are facing similar problems, I hope this article gives you some useful ideas. There is no particularly profound theory here, only the practical work of running into a problem and finding a way to solve it.

If this article helped you, feel free to give the project a Star on GitHub. The public beta has already started, and you can join the experience right after installing it. Open-source projects always need more feedback and encouragement, and if you found this useful, it is worth helping more people discover it.


“Beautiful things or people do not have to belong to you. As long as they remain beautiful, it is enough to quietly appreciate that beauty.”

The same goes for a DESIGN.md editor. It does not need to be overly complex. If it helps you work efficiently, that is already enough.

Thank you for reading. If you found this article useful, please consider liking, bookmarking, and sharing it. This content was created with AI-assisted collaboration, and the final version was reviewed and approved by the author.

How to Reproduce Projects in the AI Era: Vault, a Cross-Project Persistent Storage System



In the era of AI-assisted development, how can we help AI assistants better understand our learning resources? The HagiCode project built the Vault system as a unified knowledge storage abstraction layer that AI can understand, greatly improving the efficiency of learning through project reproduction.

In the AI era, the way developers learn new technologies and architectures is changing profoundly. “Reproducing projects” - that is, deeply studying and learning from the code, architecture, and design patterns of excellent open source projects - has become an efficient way to learn. Compared with traditional methods like reading books or watching videos, directly reading and running high-quality open source projects helps you understand real-world engineering practices much faster.

Still, this learning method comes with quite a few challenges.

Learning materials are too scattered. Your notes may live in Obsidian, code repositories may be scattered across different folders, and your AI assistant’s conversation history becomes yet another isolated data island. When you want AI to help analyze a project, you have to manually copy code snippets and organize context, which is rather tedious.

What is even more troublesome is the broken context. AI assistants cannot directly access your local learning resources, so you have to provide background information again in every conversation. On top of that, reproduced code repositories update quickly, manual syncing is error-prone, and knowledge is hard to share across multiple learning projects.

At the root, all of these problems come from “data islands.” If there were a unified storage abstraction layer that allowed AI assistants to understand and access all your learning resources, the problem would be solved neatly.

The Vault system shared in this article is exactly the solution we developed while building HagiCode. HagiCode is an AI coding assistant project, and in our daily development work we often need to study and refer to many different open source projects. To help AI assistants better understand these learning resources, we designed Vault, a cross-project persistent storage system.

This solution has already been validated in real use inside HagiCode. If you are facing similar knowledge management challenges, I hope these experiences offer some inspiration. After all, once you have stumbled through a few pitfalls yourself, it is worth leaving some notes behind for the next person.

The core idea of the Vault system is simple: create a unified knowledge storage abstraction layer that AI can understand. From an implementation perspective, the system has several key characteristics.

The system supports four vault types, each corresponding to a different usage scenario:

// folder: general-purpose folder type
export const DEFAULT_VAULT_TYPE = 'folder';
// coderef: a type specifically for reproduced code projects
export const CODEREF_VAULT_TYPE = 'coderef';
// obsidian: integrated with Obsidian note-taking software
export const OBSIDIAN_VAULT_TYPE = 'obsidian';
// system-managed: vault automatically managed by the system
export const SYSTEM_MANAGED_VAULT_TYPE = 'system-managed';

Among them, the coderef type is the most commonly used in HagiCode. It is specifically designed for reproduced code projects, providing a standardized directory structure and AI-readable metadata descriptions.

The Vault registry is stored persistently in JSON format, ensuring that the configuration remains available after the application restarts:

public class VaultRegistryStore : IVaultRegistryStore
{
    private readonly string _registryFilePath;

    public VaultRegistryStore(IConfiguration configuration, ILogger<VaultRegistryStore> logger)
    {
        var dataDir = configuration["DataDir"] ?? "./data";
        var absoluteDataDir = Path.IsPathRooted(dataDir)
            ? dataDir
            : Path.GetFullPath(Path.Combine(Directory.GetCurrentDirectory(), dataDir));
        _registryFilePath = Path.Combine(absoluteDataDir, "personal-data", "vaults", "registry.json");
    }
}

The advantage of this design is that it is simple and reliable. JSON is human-readable, which makes debugging and manual editing easier; filesystem storage avoids the complexity of a database and reduces system dependencies. After all, sometimes the simplest option really is the best one.

Most importantly, the system can automatically inject vault information into the context of AI proposals:

export function buildTargetVaultsText(
  vaults: VaultForText[],
  template: VaultPromptTemplate = DEFAULT_VAULT_PROMPT_TEMPLATE,
): string {
  const readOnlyVaults = vaults.filter((vault) => vault.accessType === 'read');
  const editableVaults = vaults.filter((vault) => vault.accessType === 'write');
  if (readOnlyVaults.length === 0 && editableVaults.length === 0) {
    return '';
  }
  const sections = [
    buildVaultSection(readOnlyVaults, template.reference),
    buildVaultSection(editableVaults, template.editable),
  ].filter(Boolean);
  return `\n\n### ${template.heading}\n\n${sections.join('\n')}`;
}

This enables an important capability: AI assistants can automatically understand the available learning resources without users manually providing context. You could say that counts as a kind of tacit understanding.
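
The helper `buildVaultSection` called above is not shown in the excerpt. A minimal sketch of what it might look like, with the section-template shape and Markdown formatting being assumptions rather than HagiCode's actual implementation:

```typescript
// Shapes assumed for illustration; only accessType mirrors the excerpt above.
interface VaultForText {
  id: string;
  name: string;
  type: string;
  physicalPath: string;
  accessType: 'read' | 'write';
}

// Hypothetical per-section template (the excerpt passes template.reference / template.editable).
interface VaultSectionTemplate {
  title: string;
}

// Render one group of vaults ("reference" or "editable") as a Markdown list.
function buildVaultSection(
  vaults: VaultForText[],
  template: VaultSectionTemplate,
): string {
  if (vaults.length === 0) return '';
  const lines = vaults.map(
    (v) => `- ${v.name} (${v.type}): ${v.physicalPath}`,
  );
  return `#### ${template.title}\n${lines.join('\n')}`;
}
```

Returning an empty string for an empty group matters: `buildTargetVaultsText` relies on `.filter(Boolean)` to drop sections with no vaults.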

The standardized structure of CodeRef Vault

Section titled “The standardized structure of CodeRef Vault”

For the coderef type of vault, HagiCode provides a standardized directory structure:

my-coderef-vault/
├── index.yaml # vault metadata description
├── AGENTS.md # operating guide for AI assistants
├── docs/ # stores study notes and documents
└── repos/ # manages reproduced code repositories through Git submodules

When creating a vault, the system automatically initializes this structure:

private async Task EnsureCodeRefStructureAsync(
    string vaultName,
    string physicalPath,
    ICollection<VaultBootstrapDiagnosticDto> diagnostics,
    CancellationToken cancellationToken)
{
    Directory.CreateDirectory(physicalPath);

    var indexPath = Path.Combine(physicalPath, CodeRefIndexFileName);
    var docsPath = Path.Combine(physicalPath, CodeRefDocsDirectoryName);
    var reposPath = Path.Combine(physicalPath, CodeRefReposDirectoryName);

    // Create the standard directory structure
    if (!Directory.Exists(docsPath))
    {
        Directory.CreateDirectory(docsPath);
    }
    if (!Directory.Exists(reposPath))
    {
        Directory.CreateDirectory(reposPath);
    }

    // Create the AGENTS.md guide
    await EnsureCodeRefAgentsDocumentAsync(physicalPath, cancellationToken);

    // Create the index.yaml metadata (mergedDocument is prepared earlier, omitted from this excerpt)
    await WriteCodeRefIndexDocumentAsync(indexPath, mergedDocument, cancellationToken);
}

This structure is carefully designed as well:

  • docs/ stores your study notes, where you can record your understanding of the code, architecture analysis, lessons learned, and so on in Markdown
  • repos/ manages reproduced repositories through Git submodules instead of copying code directly, which keeps the code in sync and saves space
  • index.yaml contains the vault metadata so AI assistants can quickly understand the purpose and contents of the vault
  • AGENTS.md is a guide written specifically for AI assistants, explaining how to handle the contents of the vault

Organized this way, perhaps AI can understand what you have in mind a little more easily.

Automatic initialization for system-managed vaults

Section titled “Automatic initialization for system-managed vaults”

In addition to manually created vaults, HagiCode also supports system-managed vaults:

public async Task<IReadOnlyList<VaultRegistryEntry>> EnsureAllSystemManagedVaultsAsync(
    CancellationToken cancellationToken = default)
{
    var definitions = GetAllResolvedDefinitions();
    var entries = new List<VaultRegistryEntry>(definitions.Count);

    foreach (var definition in definitions)
    {
        entries.Add(await EnsureResolvedSystemManagedVaultAsync(definition, cancellationToken));
    }

    return entries;
}

The system automatically creates and manages the following vaults:

  • hagiprojectdata: project data storage used to save project configuration and state
  • personaldata: personal data storage used to save user preferences
  • hbsprompt: a prompt template library used to manage commonly used AI prompts

These vaults are initialized automatically when the system starts, so users do not need to configure them manually. Some things are simply better left to the system instead of humans worrying about them.

An important part of the design is access control. The system divides vaults into two access types:

export interface VaultForText {
  id: string;
  name: string;
  type: string;
  physicalPath: string;
  accessType: 'read' | 'write'; // Key: distinguish read-only from editable
}
  • reference (read-only): the AI uses the vault only for analysis and understanding and cannot modify its content. Suitable for referenced open source projects, documents, and similar materials
  • editable (writable): the AI can modify content as needed for the task. Suitable for your own notes, drafts, and similar materials

This distinction matters. It tells AI which content is “read-only reference” and which content is “safe to edit,” reducing the risk of accidental changes. After all, nobody wants their hard work to disappear because of an unintended edit.

Now that we’ve covered the ideas, let’s look at how it works in practice.

Here is a complete frontend call example:

const createCodeRefVault = async () => {
  const response = await VaultService.postApiVaults({
    requestBody: {
      name: "React Learning Vault",
      type: "coderef",
      physicalPath: "/Users/developer/vaults/react-learning",
      gitUrl: "https://github.com/facebook/react.git"
    }
  });

  // The system will automatically:
  // 1. Clone the React repository into vault/repos/react
  // 2. Create the docs/ directory for notes
  // 3. Generate the index.yaml metadata
  // 4. Create the AGENTS.md guide file
  return response;
};

This API call completes a series of actions: creating the directory structure, initializing Git submodules, generating metadata files, and more. You only need to provide the basic information and let the system handle the rest. It is honestly a fairly worry-free approach.

After creating the vault, you can reference it in an AI proposal:

const proposal = composeProposalChiefComplaint({
  chiefComplaint: "Help me analyze React's concurrent rendering mechanism",
  repositories: [
    { id: "react", gitUrl: "https://github.com/facebook/react.git" }
  ],
  vaults: [
    {
      id: "react-learning",
      name: "React Learning Vault",
      type: "coderef",
      physicalPath: "/vaults/react-learning",
      accessType: "read" // AI can only read, not modify
    }
  ],
  quickRequestText: "Focus on the Fiber architecture and scheduler implementation"
});

The system automatically injects vault information into the AI context, letting AI know which learning resources are available. When AI can understand what you have in mind, that kind of tacit understanding is hard to come by.

While using the Vault system, we have summarized a few lessons learned.

The system strictly validates paths to prevent path traversal attacks:

private static string ResolveFilePath(string vaultRoot, string relativePath)
{
    var rootPath = EnsureTrailingSeparator(Path.GetFullPath(vaultRoot));
    var combinedPath = Path.GetFullPath(Path.Combine(rootPath, relativePath));

    if (!combinedPath.StartsWith(rootPath, StringComparison.OrdinalIgnoreCase))
    {
        throw new BusinessException(VaultRelativePathTraversalCode,
            "Vault file paths must stay inside the registered vault root.");
    }

    return combinedPath;
}

This is important. If you customize a vault path, make sure it stays within the allowed range, otherwise the system will reject the operation. You really cannot overemphasize security.
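
For readers working in Node.js, the same containment rule can be sketched in a few lines. The function name and error handling below are illustrative, not HagiCode's API; only the normalize-then-prefix-check idea mirrors the C# code above:

```typescript
import * as path from 'path';

// Sketch of the containment check: resolve both paths, then require the
// combined path to stay under the vault root (or be the root itself).
function resolveFilePath(vaultRoot: string, relativePath: string): string {
  const root = path.resolve(vaultRoot) + path.sep;
  const combined = path.resolve(root, relativePath);
  if (!combined.startsWith(root) && combined + path.sep !== root) {
    throw new Error('Vault file paths must stay inside the registered vault root.');
  }
  return combined;
}
```

Note the trailing separator on `root`: without it, `/vault-evil` would pass a naive `startsWith('/vault')` check.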

CodeRef Vault recommends Git submodules instead of directly copying code:

private static string BuildCodeRefAgentsContent()
{
    return """
        # CodeRef Vault Guide

        Repositories under `repos/` should be maintained through Git submodules
        rather than copied directly into the vault root.
        Keep this structure stable so assistants and tools can understand the vault quickly.
        """ + Environment.NewLine;
}

This brings several advantages: keeping code synchronized with upstream, saving disk space, and making it easier to manage multiple versions of the code. After all, who wants to download the same thing again and again?

To prevent performance problems, the system limits file size and type:

private const int FileEnumerationLimit = 500;
private const int PreviewByteLimit = 256 * 1024; // 256KB

If your vault contains a large number of files or very large files, preview performance may be affected. In that case, you can consider processing files in batches or using specialized search tools. Sometimes when something gets too large, it becomes harder to handle, not easier.
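
The snippet above only declares the limits; how they are applied is not shown. A sketch of how enumeration and size limits might be enforced when selecting files for preview (the types and function name are assumptions, only the two constants mirror the excerpt):

```typescript
// Constants mirror FileEnumerationLimit / PreviewByteLimit from the excerpt.
const FILE_ENUMERATION_LIMIT = 500;
const PREVIEW_BYTE_LIMIT = 256 * 1024; // 256 KB

interface VaultFile {
  path: string;
  sizeBytes: number;
}

// Cap the number of files considered, then drop anything too large to preview.
function selectPreviewableFiles(files: VaultFile[]): VaultFile[] {
  return files
    .slice(0, FILE_ENUMERATION_LIMIT)
    .filter((f) => f.sizeBytes <= PREVIEW_BYTE_LIMIT);
}
```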

When creating a vault, the system returns diagnostic information to help with debugging:

List<VaultBootstrapDiagnosticDto> bootstrapDiagnostics = [];
if (IsCodeRefVaultType(normalizedType))
{
    bootstrapDiagnostics = await EnsureCodeRefBootstrapAsync(
        normalizedName,
        normalizedPhysicalPath,
        normalizedGitUrl,
        cancellationToken);
}

If creation fails, you can inspect the diagnostic information to understand the specific cause. When something goes wrong, checking the diagnostics is often the most direct way forward.

Through a unified storage abstraction layer, the Vault system solves several core pain points of reproducing projects in the AI era:

  • Centralized knowledge management: all learning resources are gathered in one place instead of scattered everywhere
  • Automatic AI context injection: AI assistants can automatically understand the available learning resources without manual context setup
  • Cross-project knowledge reuse: knowledge can be shared and reused across multiple learning projects
  • Standardized directory structure: a consistent directory layout lowers the learning curve

This solution has already been validated in the HagiCode project. If you are also building tools related to AI-assisted development, or facing similar knowledge management problems, I hope these experiences can serve as a useful reference.

In truth, the value of a technical solution does not lie in how complicated it is, but in whether it solves real problems. The core idea of the Vault system is very simple: build a unified knowledge storage layer that AI can understand. Yet it is precisely this simple abstraction that improved our development efficiency quite a bit.

Sometimes the simple approach really is the best one. After all, complicated things often hide even more pitfalls…


If this article helped you, feel free to give the project a Star on GitHub, or visit the official website to learn more about HagiCode. The public beta has already started, and you can experience the full AI coding assistant features as soon as you install it.

Maybe you should give it a try as well…

Thank you for reading. If you found this article useful, feel free to like, bookmark, and share it. This content was created with AI-assisted collaboration, and the final content was reviewed and confirmed by the author.

Progressive Disclosure: Improving Human-Computer Interaction in AI Products with the Less Is More Philosophy

Progressive Disclosure: Improving Human-Computer Interaction in AI Products with the “Less Is More” Philosophy

Section titled “Progressive Disclosure: Improving Human-Computer Interaction in AI Products with the “Less Is More” Philosophy”

In AI product design, the quality of user input often determines the quality of the output. This article shares a “progressive disclosure” interaction approach we practiced in the HagiCode project. Through step-by-step guidance, intelligent completion, and instant feedback, it turns users’ brief and vague inputs into structured technical proposals, significantly improving human-computer interaction efficiency.

Anyone building AI products has probably seen this situation: a user opens your app and enthusiastically types a one-line request, only for the AI to return something completely off target. It is not that the AI is not smart enough. The user simply did not provide enough information. Mind-reading is hard for anyone.

This issue became especially obvious while we were building HagiCode. HagiCode is an AI-driven coding assistant where users describe requirements in natural language to create technical proposals and sessions. In actual use, we found that user input often has these problems:

  • Inconsistent input quality: some users type only a few words, such as “optimize login” or “fix bug”, without the necessary context
  • Inconsistent technical terminology: different users use different terms for the same thing; some say “frontend” while others say “FE”
  • Missing structured information: there is no project background, repository scope, or impact scope, even though these are critical details
  • Repeated problems: the same types of requests appear again and again, and each time they need to be explained from scratch

The direct result is predictable: the AI has a harder time understanding the request, proposal quality becomes unstable, and the user experience suffers. Users think, “This AI is not very good,” while we feel unfairly blamed. If you give me only one sentence, how am I supposed to guess what you really want?

In truth, this is understandable. Even people need time to understand one another, and machines are no exception.

To solve these pain points, we made a bold decision: introduce the design principle of “progressive disclosure” to improve human-computer interaction. The changes this brought were probably larger than you would imagine. To be honest, we did not expect it to be this effective at the time.

The approach shared in this article comes from our practical experience in the HagiCode project. HagiCode is an open-source AI coding assistant designed to help developers complete tasks such as code writing, technical proposal generation, and code review through natural language interaction. Project link: github.com/HagiCode-org/site.

We developed this progressive disclosure approach through multiple rounds of iteration and optimization during real product development. If you find it valuable, that at least suggests our engineering is doing something right. In that case, HagiCode itself may also be worth a look. Good tools are meant to be shared.

“Progressive disclosure” is a design principle from the field of HCI (human-computer interaction). Its core idea is simple: do not show users all information and options at once. Instead, reveal only what is necessary step by step, based on the user’s actions and needs.

This principle is especially suitable for AI products because AI interaction is naturally progressive. The user says a little, the AI understands a little, then the user adds more, and the AI understands more. It is very similar to how people communicate with each other: understanding usually develops gradually.

In HagiCode’s scenario, we applied progressive disclosure in four areas:

1. Description optimization mechanism: let AI help you say things more clearly

Section titled “1. Description optimization mechanism: let AI help you say things more clearly”

When a user enters a short description, we do not send it directly to the AI for interpretation. Instead, we first trigger a “description optimization” flow. The core of this flow is “structured output”: converting the user’s free text into a standard format. It is like stringing loose pearls into a necklace so everything becomes easier to understand.

The optimized description must include the following standard sections:

  • Background: the problem background and context
  • Analysis: technical analysis and reasoning
  • Solution: the solution and implementation steps
  • Practice: concrete code examples and notes

At the same time, we automatically generate a Markdown table showing information such as the target repository, paths, and edit permissions, making subsequent AI operations easier. A clear directory always makes things easier to find.
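
The table generation itself is not shown in the article. A minimal sketch of what such a Markdown table builder could look like (the row type, column names, and function name are all assumptions):

```typescript
// Hypothetical row shape for the repository/path/permission table described above.
interface RepoRow {
  repository: string;
  path: string;
  access: 'read' | 'write';
}

// Emit a GitHub-flavored Markdown table: header, separator, then one line per row.
function buildRepoTable(rows: RepoRow[]): string {
  const header = '| Repository | Path | Access |\n| --- | --- | --- |';
  const body = rows
    .map((r) => `| ${r.repository} | ${r.path} | ${r.access} |`)
    .join('\n');
  return `${header}\n${body}`;
}
```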

Here is the actual implementation:

// Core method in ProposalDescriptionMemoryService.cs
public async Task<string> OptimizeDescriptionAsync(
string title,
string description,
string locale = "zh-CN",
DescriptionOptimizationMemoryContext? memoryContext = null,
CancellationToken cancellationToken = default)
{
// Build query parameters
var queryContext = BuildQueryContext(title, description);
// Retrieve historical context
var memoryContext = await RetrieveHistoricalContextAsync(queryContext, cancellationToken);
// Generate a structured prompt
var prompt = await BuildOptimizationPromptAsync(
title,
description,
memoryContext,
cancellationToken);
// Call AI for optimization
return await _aiService.CompleteAsync(prompt, cancellationToken);
}

The key to this flow is “memory injection”. We inject historical context such as project conventions, similar cases, and negative patterns into the prompt, allowing the AI to reference past experience during optimization. Experience should not go to waste.

Notes:

  • Make sure the current input takes priority over historical memory, so explicitly specified user information is not overridden
  • HagIndex references must be treated as factual sources and must not be altered by historical cases
  • Low-confidence correction suggestions should not be injected as strong constraints
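
The first note, "current input takes priority over historical memory," reduces to a merge order. A toy sketch of the rule (not HagiCode's actual code; the flat key-value shape is an assumption):

```typescript
// Merge historical memory with the user's current input.
// Spreading `current` last means explicit user values win on key collisions.
function mergeContext(
  current: Record<string, string>,
  memory: Record<string, string>,
): Record<string, string> {
  return { ...memory, ...current };
}
```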

2. Voice input capability: speaking is more natural than typing

Section titled “2. Voice input capability: speaking is more natural than typing”

In addition to text input, we also support voice input. This is especially useful for describing complex requirements. Typing a technical request can take minutes, while saying it out loud may take only a few dozen seconds.

The key design focus for voice input is “state management”. Users must clearly know what state the system is currently in. We defined the following states:

  • Idle: the system is ready and recording can start
  • Waiting upstream: the system is connecting to the backend service
  • Recording: the user’s voice is being recorded
  • Processing: speech is being converted to text
  • Error: an error occurred and needs user attention

The frontend state model looks roughly like this:

interface VoiceInputState {
  status: 'idle' | 'waiting-upstream' | 'recording' | 'processing' | 'error';
  duration: number;
  startTime?: number; // set when recording starts
  error?: string;
  deletedSet: Set<string>; // Fingerprint set of deleted results
}

// State transition when recording starts
const handleVoiceInputStart = async () => {
  // Enter waiting state first and show a loading animation
  setState({ status: 'waiting-upstream' });

  // Wait for backend readiness confirmation
  const isReady = await waitForBackendReady();
  if (!isReady) {
    setState({ status: 'error', error: 'Backend service is not ready' });
    return;
  }

  // Start recording
  setState({ status: 'recording', startTime: Date.now() });
};

// Handle recognition results
const handleRecognitionResult = (result: RecognitionResult) => {
  const fingerprint = normalizeFingerprint(result.text);

  // Check whether it has already been deleted
  if (state.deletedSet.has(fingerprint)) {
    return; // Skip deleted content
  }

  // Merge the result into the text box
  appendResult(result);
};

There is an important detail here: we use a “fingerprint set” to manage deletion synchronization. When speech recognition returns multiple results, users may delete some of them. We store the fingerprints of deleted content so that if the same content appears again later, it is skipped automatically. It is essentially a way to remember what the user has already rejected.
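
`normalizeFingerprint` itself is not shown in the excerpt. One plausible implementation of the fingerprint idea, an assumption rather than the project's code, would normalize whitespace and case so near-identical recognition results map to the same key:

```typescript
// Hypothetical fingerprint: trim, lowercase, and collapse runs of whitespace.
function normalizeFingerprint(text: string): string {
  return text.trim().toLowerCase().replace(/\s+/g, ' ');
}

// Deleted-set bookkeeping mirroring handleRecognitionResult above.
const deletedSet = new Set<string>();

function markDeleted(text: string): void {
  deletedSet.add(normalizeFingerprint(text));
}

function shouldSkip(text: string): boolean {
  return deletedSet.has(normalizeFingerprint(text));
}
```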

3. Prompt management system: externalize the AI’s “brain”

Section titled “3. Prompt management system: externalize the AI’s “brain””

HagiCode has a flexible prompt management system in which all prompts are stored as files:

prompts/
├── metadata/
│ ├── optimize-description.zh-CN.json
│ └── optimize-description.en-US.json
└── templates/
├── optimize-description.zh-CN.hbs
└── optimize-description.en-US.hbs

Each prompt consists of two parts:

  • Metadata file (.json): defines information such as the prompt scenario, version, and parameters
  • Template file (.hbs): the actual prompt content, written with Handlebars syntax

The metadata file format looks like this:

{
  "scenario": "optimize-description",
  "locale": "zh-CN",
  "version": "1.0.0",
  "syntax": "handlebars",
  "syntaxVersion": "1.0",
  "parameters": [
    {
      "name": "title",
      "type": "string",
      "required": true,
      "description": "Proposal title"
    },
    {
      "name": "description",
      "type": "string",
      "required": true,
      "description": "Original description"
    }
  ],
  "author": "HagiCode Team",
  "description": "Optimize the user's technical proposal description",
  "lastModified": "2026-04-05",
  "tags": ["optimization", "nlp"]
}

The template file uses Handlebars syntax and supports parameter injection:

You are a technical proposal expert.
<task>
Generate a structured technical proposal description based on the following information.
</task>
<input>
<title>{{title}}</title>
<description>{{description}}</description>
{{#if memoryContext}}
<memory_context>
{{memoryContext}}
</memory_context>
{{/if}}
</input>
<output_format>
## Background
[Describe the problem background and context, including project information, repository scope, and so on]
## Analysis
[Provide the technical analysis and reasoning process, and explain why this change is needed]
## Solution
[Provide the solution and implementation steps, listing the key code locations]
## Practice
[Provide concrete code examples and notes]
</output_format>

The benefits of this design are clear:

  • prompts can be version-controlled just like code
  • multiple languages are supported and can be switched automatically based on user preference
  • the parameterized design allows context to be injected dynamically
  • completeness can be validated at startup, avoiding runtime errors

If knowledge stays only in someone’s head, it is easy to lose. Recording it in a structured way from the beginning is much safer.
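
The "completeness can be validated at startup" point can be sketched as a check of supplied values against the metadata's `parameters` array. The metadata shape follows the JSON example above; the validator itself is hypothetical:

```typescript
// Parameter shape taken from the metadata file example above.
interface PromptParameter {
  name: string;
  type: string;
  required: boolean;
}

// Return the names of required parameters that have no value supplied;
// a non-empty result at startup means the prompt cannot be rendered safely.
function missingRequiredParams(
  params: PromptParameter[],
  values: Record<string, unknown>,
): string[] {
  return params
    .filter((p) => p.required && values[p.name] === undefined)
    .map((p) => p.name);
}
```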

4. Progressive wizard: split complex tasks into small steps

Section titled “4. Progressive wizard: split complex tasks into small steps”

For complex tasks, such as first-time installation and configuration, we use a multi-step wizard design. Each step requests only the necessary information and provides clear progress indicators. Large tasks become much more manageable when handled one step at a time.

The wizard state model:

interface WizardState {
  currentStep: number; // 0-3, corresponding to 4 steps
  steps: WizardStep[];
  canGoNext: boolean;
  canGoBack: boolean;
  isLoading: boolean;
  error: string | null;
}

interface WizardStep {
  id: number;
  title: string;
  description: string;
  completed: boolean;
}

// Step navigation logic
const goToNextStep = () => {
  if (wizardState.currentStep < wizardState.steps.length - 1) {
    // Validate input for the current step
    if (validateCurrentStep()) {
      wizardState.currentStep++;
      wizardState.steps[wizardState.currentStep - 1].completed = true;
    }
  }
};

const goToPreviousStep = () => {
  if (wizardState.currentStep > 0) {
    wizardState.currentStep--;
  }
};

Each step has its own validation logic, and completed steps receive clear visual markers. Canceling opens a confirmation dialog to prevent users from losing progress through accidental actions.

Looking back at HagiCode’s progressive disclosure practice, we can summarize several core principles:

  1. Step-by-step guidance: break complex tasks into smaller steps and request only the necessary information at each stage
  2. Intelligent completion: use historical context and project knowledge to fill in information automatically
  3. Instant feedback: give every action clear visual feedback and status hints
  4. Fault-tolerance mechanisms: allow users to undo and reset so mistakes do not lead to irreversible loss
  5. Input diversity: support multiple input methods such as text and voice

In HagiCode, the practical result of this approach was clear: the average length of user input increased from fewer than 20 characters to structured descriptions of 200-300 characters, the quality of AI-generated proposals improved significantly, and user satisfaction increased along with it.

This is not surprising. The more information users provide, the more accurately the AI can understand them, and the better the results it can return. In that sense, it is not very different from communication between people.

If you are also building AI-related products, I hope these experiences offer some useful inspiration. Remember: users do not necessarily refuse to provide information. More often, the product has not yet asked the right questions in the right way. The core of progressive disclosure is finding the best timing and form for those questions.


If this article helped you, feel free to give us a Star on GitHub and follow the continued development of the HagiCode project. Public beta has already started, and you can experience the full feature set right now by installing it:

Thank you for reading. If you found this article useful, feel free to like, save, and share it. This content was created with AI-assisted collaboration, and the final content was reviewed and confirmed by the author.

AI Output Token Optimization: Practicing an Ultra-Minimal Classical Chinese Mode

AI Output Token Optimization: Practicing an Ultra-Minimal Classical Chinese Mode

Section titled “AI Output Token Optimization: Practicing an Ultra-Minimal Classical Chinese Mode”

In AI application development, token consumption directly affects cost. In the HagiCode project, we implemented an “ultra-minimal Classical Chinese output mode” through the SOUL system. Without sacrificing information density, it reduces output tokens by roughly 30-50%. This article shares the implementation details of that approach and the lessons we learned using it.

In AI application development, token consumption is an unavoidable cost issue. This becomes especially painful in scenarios where the AI needs to produce large amounts of content. How do you reduce output tokens without sacrificing information density? The more you think about it, the more frustrating the problem can get.

Traditional optimization ideas mostly focus on the input side: trimming system prompts, compressing context, or using more efficient encoding. But these methods eventually hit a ceiling. Push compression too far, and you start hurting the AI’s comprehension and output quality. That is basically just deleting content, which is not very meaningful.

So what about the output side? Could we get the AI to express the same meaning more concisely?

The question sounds simple, but there is quite a bit hidden beneath it. If you directly ask the AI to “be concise,” it may really give you only a few words. If you add “keep the information complete,” it may drift back to the original verbose style. Constraints that are too strong hurt usability; constraints that are too weak do nothing. Where exactly is the balance point? No one can say for sure.

To solve these pain points, we made a bold decision: start from language style itself and design a configurable, composable constraint system for expression. The impact of that decision may be even larger than you expect. I will get into the details shortly, and the result may surprise you a little.

The approach shared in this article comes from our practical experience in the HagiCode project.

HagiCode is an open-source AI coding assistant that supports multiple AI models and custom configuration. During development, we discovered that AI output token usage was too high, so we designed a solution for it. If you find this approach valuable, that probably says something good about our engineering work. And if that is the case, HagiCode itself may also be worth your attention. Code does not lie.

The full name of the SOUL system is Soul Oriented Universal Language. It is the configuration system used in the HagiCode project to define the language style of an AI Hero. Its core idea is simple: by constraining how the AI expresses itself, it can output content in a more concise linguistic form while preserving informational completeness.

It is a bit like putting a linguistic mask on the AI… though honestly, it is not quite that mystical.

The SOUL system uses a frontend-backend separated architecture:

Frontend (Soul Builder):

  • Built with React + TypeScript + Vite
  • Located in the repos/soul/ directory
  • Provides a visual Soul building interface
  • Supports bilingual use (zh-CN / en-US)

Backend:

  • Built on .NET (C#) + the Orleans distributed runtime
  • The Hero entity includes a Soul field (maximum 8000 characters)
  • Injects Soul into the system prompt through SessionSystemMessageCompiler

Agent Templates generation:

  • Generated from reference materials
  • Output to the /agent-templates/soul/templates/ directory
  • Includes 50 main Catalog groups and 10 orthogonal dimensions

When a Session executes for the first time, the system reads the Hero’s Soul configuration and injects it into the system prompt:

sequenceDiagram
    participant UI as User Interface
    participant Session as SessionGrain
    participant Hero as Hero Repository
    participant AI as AI Executor
    UI->>Session: Send message (bind Hero)
    Session->>Hero: Read Hero.Soul
    Session->>Session: Cache Soul snapshot
    Session->>AI: Build AIRequest (inject Soul)
    AI-->>Session: Execution result
    Session-->>UI: Stream response

The injected system prompt format is:

<hero_soul>
[User-defined Soul content]
</hero_soul>

This injection mechanism is implemented in SessionSystemMessageCompiler.cs:

internal static string? BuildSystemMessage(
    string? existingSystemMessage,
    string? languagePreference,
    IReadOnlyList<HeroTraitDto>? traits,
    string? soul)
{
    var segments = new List<string>();

    // ... language preference and Traits handling ...

    var normalizedSoul = NormalizeSoul(soul);
    if (!string.IsNullOrWhiteSpace(normalizedSoul))
    {
        segments.Add($"<hero_soul>\n{normalizedSoul}\n</hero_soul>");
    }

    // ... other system messages ...
    return segments.Count == 0 ? null : string.Join("\n\n", segments);
}

Once you have seen the code and understood the principle, that is really all there is to it.

Ultra-minimal Classical Chinese mode is the most representative token-saving strategy in the SOUL system. Its core principle is to use the high semantic density of Classical Chinese to compress output length while preserving complete information.

Classical Chinese has several natural advantages:

  1. Semantic compression: the same meaning can be expressed with fewer characters.
  2. Redundancy removal: Classical Chinese naturally omits many conjunctions and particles common in modern Chinese.
  3. Concise structure: each sentence carries high information density, making it well suited as a vehicle for AI output.

Here is a concrete example:

Modern Chinese output (about 80 characters):

Based on your code analysis, I found several issues. First, on line 23, the variable name is too long and should be shortened. Second, on line 45, you did not handle null values and should add conditional logic. Finally, the overall code structure is acceptable, but it can be further optimized.

Ultra-minimal Classical Chinese output (about 35 characters, saving 56%):

Code reviewed: line 23 variable name verbose, abbreviate; line 45 lacks null handling, add checks. Overall structure acceptable; minor tuning suffices.

The gap is large enough to make you stop and think.
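
The 56% figure is simple arithmetic: (80 − 35) / 80 ≈ 56%. As a sanity check, here is the calculation spelled out (a trivial helper, not part of HagiCode):

```typescript
// Percentage of output saved when character (or token) count drops from
// `before` to `after`, rounded to the nearest whole percent.
function savingsPercent(before: number, after: number): number {
  return Math.round(((before - after) / before) * 100);
}
```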

The complete Soul configuration for ultra-minimal Classical Chinese mode is as follows:

{
  "id": "soul-orth-11-classical-chinese-ultra-minimal-mode",
  "name": "Ultra-Minimal Classical Chinese Output Mode",
  "summary": "Use relatively readable Classical Chinese to compress semantic density, convey the meaning with as few words as possible, and retain only conclusions, judgments, and necessary actions, thereby significantly reducing output tokens.",
  "soul": "Your persona core comes from the \"Ultra-Minimal Classical Chinese Output Mode\": use relatively readable Classical Chinese to compress semantic density, convey the meaning with as few words as possible, and retain only conclusions, judgments, and necessary actions, thereby significantly reducing output tokens.\nMaintain the following signature language traits: 1. Prefer concise Classical Chinese sentence patterns such as \"can\", \"should\", \"do not\", \"already\", \"however\", and \"therefore\", while avoiding obscure and difficult wording;\n2. Compress each sentence to 4-12 characters whenever possible, removing preamble, pleasantries, repeated explanation, and ineffective modifiers;\n3. Do not expand arguments unless necessary; if the user does not ask a follow-up, provide only conclusions, steps, or judgments;\n4. Do not alter the core persona of the main Catalog; only compress the expression into restrained, classical, ultra-minimal short sentences."
}

There are several key points in this template design:

  1. Clear constraints: 4-12 characters per sentence, remove redundancy, prioritize conclusions.
  2. Avoid obscurity: use concise Classical Chinese sentence patterns and avoid rare, difficult wording.
  3. Preserve persona: only change the mode of expression, not the core persona.

When you keep adjusting configuration, it all comes down to a few parameters in the end.

Besides the Classical Chinese mode, the HagiCode SOUL system also provides several other token-saving modes:

Telegraph-style ultra-minimal output mode (soul-orth-02):

  • Keep every sentence strictly within 10 characters
  • Prohibit decorative adjectives
  • No modal particles, exclamation marks, or reduplication throughout

Short fragmented muttering mode (soul-orth-01):

  • Keep sentences within 1-5 characters
  • Simulate fragmented self-talk
  • Weaken explicit logic and prioritize emotional transmission

Guided Q&A mode (soul-orth-03):

  • Use questions to guide the user’s thinking
  • Reduce direct output content
  • Lower token usage through interaction

Each of these modes emphasizes a different design direction, but the core goal is the same: reduce output tokens while preserving information quality. There are many roads to Rome; some are simply easier to walk than others.

One powerful feature of the SOUL system is support for cross-combining main Catalogs and orthogonal dimensions:

  • 50 main Catalog groups: define the base persona (such as healing style, top-student style, aloof style, and so on)
  • 10 orthogonal dimensions: define the mode of expression (such as Classical Chinese, telegraph-style, Q&A style, and so on)
  • Combination effect: can generate 500+ unique language-style combinations

For example, you can combine “Professional Development Engineer” with “Ultra-Minimal Classical Chinese Output Mode” to create an AI assistant that is both professional and concise. This flexibility allows the SOUL system to adapt to many different scenarios. You can mix and match however you like; there are more combinations than you are likely to exhaust.
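The 50 × 10 cross-combination is easy to sketch in code. The types and function below are purely illustrative, not HagiCode's actual API; they only show that every main Catalog pairs with every orthogonal dimension:

```typescript
// Illustrative sketch only: names are assumptions, not HagiCode's API.
// Every main Catalog pairs with every orthogonal dimension.
type Catalog = { id: string; persona: string };
type Dimension = { id: string; style: string };

function combine(catalogs: Catalog[], dimensions: Dimension[]) {
  return catalogs.flatMap((c) =>
    dimensions.map((d) => ({
      id: `${c.id}+${d.id}`,
      // The dimension only changes how the persona expresses itself.
      soul: `${c.persona}\n${d.style}`,
    })),
  );
}

const catalogs: Catalog[] = Array.from({ length: 50 }, (_, i) => ({ id: `cat-${i + 1}`, persona: "…" }));
const dimensions: Dimension[] = Array.from({ length: 10 }, (_, i) => ({ id: `orth-${i + 1}`, style: "…" }));

console.log(combine(catalogs, dimensions).length); // 500
```

The point is just the multiplication: 50 personas times 10 expression modes gives 500 distinct souls without authoring any of them by hand.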

Visit soul.hagicode.com and follow these steps:

  1. Select a main Catalog (for example, “Professional Development Engineer”)
  2. Select an orthogonal dimension (for example, “Ultra-Minimal Classical Chinese Output Mode”)
  3. Preview the generated Soul content
  4. Copy the generated Soul configuration

It is mostly just point-and-click, so there is probably not much more to say.

Apply the Soul configuration to a Hero through the web interface or API:

// Hero Soul update example
const heroUpdate = {
  soul: "Your persona core comes from the \"Ultra-Minimal Classical Chinese Output Mode\": ...",
  soulCatalogId: "soul-orth-11-classical-chinese-ultra-minimal-mode",
  soulDisplayName: "Ultra-Minimal Classical Chinese Output Mode",
  soulStyleType: "orthogonal-dimension",
  soulSummary: "Use relatively readable Classical Chinese to compress semantic density..."
};

await updateHero(heroId, heroUpdate);

Users can fine-tune a preset template or write one from scratch. Here is a custom example for a code review scenario:

You are a code reviewer who pursues extreme concision.
All output must follow these rules:
1. Only point out specific problems and line numbers
2. Each issue must not exceed 15 characters
3. Use concise terms such as "should", "must", and "do not"
4. Do not provide extra explanation
Example output:
- Line 23: variable name too long, should abbreviate
- Line 45: null not handled, must add checks
- Line 67: logic redundant, can simplify

You can revise the template however you like. A template is only a starting point anyway.

Compatibility:

  • Classical Chinese mode works with all 50 main Catalog groups
  • Can be combined with any base persona
  • Does not change the core persona of the main Catalog

Caching mechanism:

  • Soul is cached when the Session executes for the first time
  • The cache is reused within the same SessionId
  • Modifying Hero configuration does not affect Sessions that have already started

Constraints and limits:

  • The maximum length of the Soul field is 8000 characters
  • Heroes without a Soul field in historical data can still be used normally
  • Soul and style equipment slots are independent and do not overwrite each other
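The caching and length rules above can be condensed into a few lines. The class and constant names below are my own for illustration; only the behavior (cache on first use per SessionId, 8000-character cap) follows the article:

```typescript
// Illustrative sketch: the soul is cached per SessionId on first execution
// and reused afterwards; the 8000-character limit comes from the article.
const MAX_SOUL_LENGTH = 8000;

class SoulCache {
  private cache = new Map<string, string>();

  resolve(sessionId: string, heroSoul: string): string {
    if (heroSoul.length > MAX_SOUL_LENGTH) {
      throw new Error(`Soul exceeds ${MAX_SOUL_LENGTH} characters`);
    }
    const cached = this.cache.get(sessionId);
    // Later Hero edits do not affect a Session that has already started.
    if (cached !== undefined) return cached;
    this.cache.set(sessionId, heroSoul);
    return heroSoul;
  }
}
```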

According to real test data from the project, the results after enabling ultra-minimal Classical Chinese mode are as follows:

| Scenario | Original output tokens | Classical Chinese mode | Savings |
| --- | --- | --- | --- |
| Code review | 850 | 420 | 51% |
| Technical Q&A | 620 | 380 | 39% |
| Solution suggestions | 1100 | 680 | 38% |
| Average | - | - | 30-50% |

The data comes from actual usage statistics in the HagiCode project, and exact results vary by scenario. Still, the saved tokens add up, and your wallet will appreciate it.
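The savings figures reduce to one formula, savings = 1 - after / before, rounded to a whole percent:

```typescript
// Recomputing the savings column from the raw token counts.
const savings = (before: number, after: number) =>
  Math.round((1 - after / before) * 100);

console.log(savings(850, 420));  // 51 (code review)
console.log(savings(620, 380));  // 39 (technical Q&A)
console.log(savings(1100, 680)); // 38 (solution suggestions)
```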

The HagiCode SOUL system offers an innovative way to optimize AI output: reduce token consumption by constraining expression rather than compressing the information itself. As its most representative approach, ultra-minimal Classical Chinese mode has delivered 30-50% token savings in real-world use.

The core value of this approach lies in the following:

  1. Preserve information quality: instead of simply truncating output, it expresses the same content more efficiently.
  2. Flexible and composable: supports 500+ combinations of personas and expression styles.
  3. Easy to use: Soul Builder provides a visual interface, so no coding is required.
  4. Production-grade stability: validated in the project and capable of large-scale use.

If you are also building AI applications, or if you are interested in the HagiCode project, feel free to reach out. The meaning of open source lies in progressing together, and we also look forward to seeing your own innovative uses. The saying may be old, but it remains true: one person may go fast, but a group goes farther.


If this article helped you:

Thank you for reading. If you found this article useful, you are welcome to like, bookmark, and share it. This content was created with AI-assisted collaboration, and the final version was reviewed and confirmed by the author.

The Hallucination Problem in AI Coding Assistants: How to Achieve Specification-driven Development with OpenSpec


Section titled “The Hallucination Problem in AI Coding Assistants: How to Achieve Specification-driven Development with OpenSpec”

AI coding assistants are powerful, but they often generate code that does not match real requirements or violates project conventions. This article shares how the HagiCode project uses the OpenSpec workflow to implement specification-driven development and significantly reduce the risk of AI hallucinations through a structured proposal mechanism.

Anyone who has used GitHub Copilot or ChatGPT to write code has probably had this experience: the code generated by AI looks polished, but once you actually use it, problems show up everywhere. Maybe it uses the wrong component from the project, maybe it ignores the team’s coding standards, or maybe it writes a large chunk of logic based on assumptions that do not even exist.

This is the so-called “AI hallucination” problem. In programming, it appears as code that seems reasonable on the surface but does not actually fit the real state of the project.

There is also something a bit frustrating about this. As AI coding assistants become more widespread, the problem becomes more serious. After all, AI lacks an understanding of project history, architectural decisions, and coding conventions, and when given too much freedom it can “creatively” generate code that does not match reality. It is a bit like writing an article: without structure, it is easy to wander off into imagination, even though the real situation is far more grounded.

To solve these pain points, we made a bold decision: instead of trying to make AI smarter, we put it inside a “specification” cage. The change this decision brought was probably bigger than you might expect, and I will explain that shortly.

The approach shared in this article comes from our practical experience in the HagiCode project. HagiCode is an open-source AI coding assistant project dedicated to solving real problems in AI programming through structured engineering practices.

Before diving into the solution, let us first look at where the problem actually comes from. After all, if you understand both yourself and your opponent, you can fight a hundred battles without defeat. Applied to AI, that saying is still surprisingly fitting.

AI models are trained on public code repositories, but your project has its own history, conventions, and architectural decisions. AI cannot directly access this kind of “implicit knowledge,” so the code it generates is often disconnected from the actual project.

This is not entirely the AI’s fault. It has never lived inside your project, so how could it know all of your unwritten rules? Like a brand-new intern, not understanding the local customs is normal. The only issue is that the cost can be rather high.

When you ask AI, “Help me implement a user authentication feature,” it may generate code in almost any form. Without clear constraints, AI will implement things in the way it “thinks” is reasonable instead of following your project’s requirements.

That is like asking someone who has never learned your project standards to improvise freely. How could that not cause trouble? It is not even that the AI is being irresponsible; it just has no idea what responsibility means in this context.

After AI generates code, if there is no structured review process, code based on false assumptions can go directly into the repository. By the time the problem is discovered in testing or even in production, the cost is already far too high.

That is like trying to mend the pen after the sheep are already gone. The principle is obvious, but in practice people often still find the extra work bothersome. Before things go wrong, who really wants to spend more time up front?

OpenSpec: The Answer to Specification-driven Development

Section titled “OpenSpec: The Answer to Specification-driven Development”

HagiCode chose OpenSpec as the solution. The core idea is simple: all code changes must go through a structured proposal workflow, turning abstract ideas into executable implementation plans.

That may sound grand, but in plain terms it just means making AI write the requirements document before writing the code. As the old saying goes, preparation leads to success, and lack of preparation leads to failure.

OpenSpec is an npm-based command-line tool (@fission-ai/openspec) that defines a standard proposal file structure and validation mechanism. Put simply, it makes AI “write the requirements document” before it writes code.

A three-step workflow to prevent hallucinations

Section titled “A three-step workflow to prevent hallucinations”

OpenSpec ensures proposal quality through a three-step workflow:

  1. Initialize the proposal - set the session state to Openspecing
  2. Intermediate processing - keep the Openspecing state while gradually refining the artifacts
  3. Complete the proposal - transition to the Reviewing state

There is a clever detail in this design: the first step uses the ProposalGenerationStart type, and completing it does not trigger a state transition. This ensures that the review stage is not entered too early before the entire multi-step workflow is finished.

This detail is actually quite interesting. It is like cooking: if you lift the lid before the heat is right, nothing will turn out well. Only by moving step by step with a bit of patience can you end up with a good dish.

// Implementation in the HagiCode project
public enum MessageAssociationType
{
    ProposalGeneration = 2,
    ProposalExecution = 3,

    /// <summary>
    /// Marks the start of the three-step proposal generation workflow
    /// Does not transition to the Reviewing state when completed
    /// </summary>
    ProposalGenerationStart = 5
}
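In TypeScript terms, the rule reads roughly like this. The enum mirrors the C# code above; the predicate is my paraphrase of the described behavior, not HagiCode's actual implementation:

```typescript
// Mirrors the C# MessageAssociationType enum; the predicate paraphrases
// the described behavior rather than quoting HagiCode's real code.
enum MessageAssociationType {
  ProposalGeneration = 2,
  ProposalExecution = 3,
  ProposalGenerationStart = 5,
}

function transitionsToReviewing(type: MessageAssociationType): boolean {
  // Only completing the final generation step moves Openspecing -> Reviewing;
  // ProposalGenerationStart completes without triggering any transition.
  return type === MessageAssociationType.ProposalGeneration;
}
```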

Every OpenSpec proposal follows the same directory structure:

openspec/
├── changes/ # Active and archived changes
│ ├── {change-name}/
│ │ ├── proposal.md # Proposal description
│ │ ├── design.md # Design document
│ │ ├── specs/ # Technical specifications
│ │ └── tasks.md # Executable task list
│ └── archive/ # Archived changes
└── specs/ # Standalone specification library

According to statistics from the HagiCode project, there are already more than 4,000 archived changes and over 150,000 lines of specification files. This historical accumulation not only gives AI clear guidance to follow, but also provides the team with a valuable knowledge base.

It is a bit like the classics left behind by earlier generations. Read enough of them and patterns begin to emerge. The only difference is that these classics are stored in files instead of written on bamboo slips.

The system implements multiple layers of validation to ensure proposal quality:

// Validate that required files exist
ValidateProposalFiles()
// Validate prerequisites before execution
ValidateExecuteAsync()
// Validate start conditions
ValidateStartAsync()
// Validate archive conditions
ValidateArchiveAsync()
// Validate proposal name format (kebab-case)
ValidateNameFormat()

These validations are like gatekeepers at multiple checkpoints. Only truly qualified proposals can pass through. It may look tedious, but it is still much better than letting poor code enter the repository.

When AI runs inside HagiCode, it uses predefined Handlebars templates. These templates contain explicit step-by-step instructions and protective guardrails. For example:

  • Do not continue before understanding the user’s intent
  • Do not generate unvalidated code
  • Require the user to provide the name again if it is invalid
  • If the change already exists, suggest using the continue command instead of recreating it

This way of “dancing in shackles” actually helps AI focus more on understanding requirements and generating code that follows standards. Constraints are not always a bad thing. Sometimes too much freedom is exactly what creates chaos.

Practice: How to Use OpenSpec in a Project

Section titled “Practice: How to Use OpenSpec in a Project”
Terminal window
npm install -g @fission-ai/openspec@1
openspec --version # Verify the installation

The openspec/ folder structure will be created automatically in the project root.

There is not much mystery in this step. It is just tool installation, which everyone understands. Just remember to use @fission-ai/openspec@1; newer versions may have pitfalls, and stability matters most.

In the HagiCode conversation interface, use the shortcut command:

/opsx:new

Or specify a change name and target repository:

/opsx:new "add-user-auth" --repos "repos/web"

Creating a proposal is like outlining an article before writing it. Once you have an outline, the rest becomes much easier. Many people prefer to jump straight into writing, only to realize halfway through that the idea does not hold together. That is when the real headache begins.

Use /opsx:continue to generate the required artifacts step by step:

proposal.md - Describes the purpose and scope of the change

# Proposal: Add User Authentication
## Why
The current system lacks user authentication and cannot protect sensitive APIs.
## What Changes
- Add JWT authentication middleware
- Implement login/registration APIs
- Update frontend integration

design.md - Detailed technical design

# Design: Add User Authentication
## Context
The system currently uses public APIs, so anyone can access them...
## Decisions
1. Choose JWT instead of Session...
2. Use the HS256 algorithm...
## Risks
- Risk of token leakage...
- Mitigation measures...

specs/ - Technical specifications and test scenarios

# user-auth Specification
## Requirements
### Requirement: JWT Token Generation
The system SHALL use the HS256 algorithm to generate JWT tokens.
#### Scenario: Valid login
- WHEN the user provides valid credentials
- THEN the system SHALL return a valid JWT token

tasks.md - Executable task list

# Tasks: Add User Authentication
## 1. Backend Changes
- [ ] 1.1 Create AuthController
- [ ] 1.2 Implement JWT middleware
- [ ] 1.3 Add unit tests

These artifacts are a lot like drafts for an article. Once the draft is complete, the main text flows naturally. Many people dislike writing drafts because they think it wastes time, but in reality that is often where the clearest thinking happens.
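As a rough sketch, a runner that consumes such a tasks.md checklist only needs to recognize the checkbox lines. The parsing rule below is an assumption for illustration, not OpenSpec's actual parser:

```typescript
// Parse "- [ ] 1.1 Create AuthController" style checklist lines.
const taskLine = /^- \[( |x)\] (\S+) (.+)$/;

function parseTasks(markdown: string) {
  return markdown
    .split("\n")
    .map((line) => taskLine.exec(line.trim()))
    .filter((m): m is RegExpExecArray => m !== null)
    .map((m) => ({ id: m[2], title: m[3], done: m[1] === "x" }));
}
```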

After all artifacts are complete:

/opsx:apply

AI will read all context files and execute tasks step by step according to the checklist in tasks.md. At this point, because the specification is already clear, the quality of the generated code is much higher.

By this stage, half the work is already done. Once there is a clear task list, the rest is simply executing it step by step. The problem is that many people skip the earlier steps and jump straight here, and then quality naturally becomes hard to guarantee.

After the change is completed:

/opsx:archive

Move the completed change into the archive/ directory so it can be reviewed and reused later.

Archiving matters. It is like carefully storing away a finished article. When a similar problem appears in the future, looking back through old records may provide the answer. Many people find it troublesome, but these accumulated materials are often the most valuable assets.

Use kebab-case, start with a letter, and include only lowercase letters, numbers, and hyphens:

  • add-user-auth (valid)
  • AddUserAuth (invalid: not kebab-case)
  • add--user-auth (invalid: consecutive hyphens)

Naming rules may seem minor, but consistency is always worth something. In software, consistency matters even when people do not always pay attention to it.
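The rule fits in a single regular expression. This is my own formulation of the stated constraints, not copied from OpenSpec's source:

```typescript
// kebab-case: starts with a lowercase letter; lowercase letters and digits,
// separated by single hyphens. Derived from the stated rules, not from
// OpenSpec's actual validator.
const isValidChangeName = (name: string): boolean =>
  /^[a-z][a-z0-9]*(-[a-z0-9]+)*$/.test(name);
```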

  1. Using the wrong type in step 1 of the three-step workflow - This causes the state to transition too early
  2. Forgetting to trigger the state transition in the final step - This leaves the workflow stuck in the Openspecing state
  3. Skipping review and executing directly - You should validate that all artifacts are complete first

These mistakes are all common for beginners. Experienced people naturally know how to avoid them. Still, everyone becomes experienced eventually, and taking a few detours is part of the process. The only hope is to avoid taking too many.

OpenSpec supports managing multiple proposals at the same time, which is especially useful for large features:

Terminal window
# View all active changes
openspec list
# Switch to a specific change
openspec apply "add-user-auth"
# View change status
openspec status --change "add-user-auth"

Managing multiple changes is like writing several articles at once. It takes some technique and patience, but once you get used to it, it becomes natural enough.

Understanding state transitions helps with troubleshooting:

Init → Drafting → Openspecing → Reviewing → Executing → ExecutionCompleted → Completed → Archived
  • Openspecing: Generating the plan
  • Reviewing: Under review (artifacts can be revised repeatedly)
  • Executing: In execution (applying tasks.md)

A state machine is, in the end, just a set of rules. Rules can feel annoying at times, but more often they are useful. As the saying goes, without rules, nothing can be accomplished properly.
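The linear chain above can be written down as a simple lookup table (illustrative only; the Reviewing state additionally allows repeated revision in place):

```typescript
// The transition chain from the article, as an explicit table.
const next: Record<string, string> = {
  Init: "Drafting",
  Drafting: "Openspecing",
  Openspecing: "Reviewing",
  Reviewing: "Executing",
  Executing: "ExecutionCompleted",
  ExecutionCompleted: "Completed",
  Completed: "Archived",
};

const canTransition = (from: string, to: string): boolean => next[from] === to;
```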

Through the OpenSpec workflow, the HagiCode project has achieved significant results in addressing the AI hallucination problem:

  1. Fewer hallucinations - AI must follow a structured specification instead of generating code arbitrarily
  2. Higher quality - Multi-layer validation ensures changes comply with project standards
  3. Faster collaboration - Archived changes provide references for future development
  4. Traceability - Every change has a complete record of proposal, design, specification, and tasks

This approach does not make AI smarter. It puts AI inside a “specification” cage. Practice has shown that dancing in shackles can actually lead to a better performance.

The principle is simple. Constraints are not necessarily bad. Like writing, having a format to follow often makes it easier to produce good work. Many people dislike constraints because they think constraints limit creativity, but creativity also needs the right soil to grow.

If you are also using AI coding assistants and have run into similar problems, give OpenSpec a try. Specification-driven development may seem to add extra steps, but that early investment pays back many times over in code quality and maintenance efficiency.

Sometimes slowing down a little is actually the fastest way forward. Many people just do not realize it yet.


If this article helped you, feel free to give us a Star on GitHub. The HagiCode public beta has already started, and you can join the experience by installing it now.


That is about enough for this article. There is nothing especially profound here, just a summary of a few practical lessons. I hope it is useful to everyone. Sharing is a good thing: you learn something yourself, and others learn something too.

Still, an article is only an article. Practice is what really matters. Knowledge from the page always feels shallow until you apply it yourself.

Thank you for reading. If you found this article useful, feel free to like, bookmark, and share it. This content was created with AI-assisted collaboration, and the final content was reviewed and approved by the author.


Full GLM-5.1 Support and Gemini CLI Integration: HagiCode’s Path of Multi-Model Evolution

Section titled “Full GLM-5.1 Support and Gemini CLI Integration: HagiCode’s Path of Multi-Model Evolution”

This article introduces two major recent updates to the HagiCode platform: full support for the Zhipu AI GLM-5.1 model and the successful integration of Gemini CLI as the tenth Agent CLI. Together, these updates further strengthen the platform’s multi-model capabilities and multi-CLI ecosystem.

Time really does move fast. The development of large language models has been rising like bamboo in spring. Not long ago, we were still cheering for “an AI that can write code.” Now we are already in an era of multi-model collaboration and multi-tool integration. Is that exciting? Perhaps. After all, what developers need has never been just the tool itself, but the ease of adapting to different scenarios and switching flexibly when needed.

As an AI-assisted coding platform, HagiCode has recently welcomed two important developments: first, the full integration of Zhipu AI’s GLM-5.1 model; second, the official addition of Gemini CLI as the tenth supported Agent CLI. These two updates may not sound earth-shaking, but they are unquestionably good news for the platform’s continued maturation.

GLM-5.1 is Zhipu AI’s latest flagship model. Compared with GLM-5.0, it offers stronger reasoning, deeper code understanding, and smoother tool calling. More importantly, it is the first GLM model to support image input. What does that mean? It means users can let the AI look directly at a screenshot instead of struggling to describe the problem in words. Once you’ve used that convenience, you immediately understand its value.

At the same time, through the HagiCode.Libs.Providers architecture, HagiCode successfully integrated Gemini CLI into the platform. This is now the tenth Agent CLI. To be honest, getting to this point does bring a modest sense of accomplishment.

It is also worth mentioning that HagiCode’s image upload feature lets users communicate with AI directly through screenshots. Even when running GLM 4.7, the platform still works well and has already helped complete many important build tasks. As for GLM-5.1, naturally, it goes one step further.

The approach shared in this article comes from our practical experience in the HagiCode project. HagiCode is an open-source AI-assisted coding platform designed to provide developers with a flexible and powerful AI programming assistant through a multi-model, multi-CLI architecture. Project repository: github.com/HagiCode-org/site

One of HagiCode’s core strengths is its support for multiple AI programming CLI tools through a unified abstraction layer. The advantage of this design is actually quite simple: new tools can come in, old tools can stay, and the codebase does not turn into chaos. To be fair, that is how everyone would like life to work.

The platform defines supported CLI provider types through the AIProviderType enum:

public enum AIProviderType
{
    ClaudeCodeCli = 0,  // Claude Code CLI
    CodexCli = 1,       // Codex CLI
    GitHubCopilot = 2,  // GitHub Copilot
    CodebuddyCli = 3,   // Codebuddy CLI
    OpenCodeCli = 4,    // OpenCode CLI
    IFlowCli = 5,       // IFlow CLI
    HermesCli = 6,      // Hermes CLI
    QoderCli = 7,       // Qoder CLI
    KiroCli = 8,        // Kiro CLI
    KimiCli = 9,        // Kimi CLI
    GeminiCli = 10,     // Gemini CLI (new)
}

As you can see, Gemini CLI joins this family as the tenth member. Each CLI has its own distinct characteristics and usage scenarios, so users can choose flexibly based on their needs. After all, many roads lead to Rome; some are simply easier than others.

HagiCode.Libs.Providers provides a unified Provider interface that makes each CLI integration standardized and concise. Taking Gemini CLI as an example:

public class GeminiProvider : ICliProvider<GeminiOptions>
{
    private static readonly string[] DefaultExecutableCandidates = ["gemini", "gemini-cli"];
    private const string ManagedBootstrapArgument = "--acp";

    public string Name => "gemini";

    public bool IsAvailable =>
        _executableResolver.ResolveFirstAvailablePath(DefaultExecutableCandidates) is not null;
}

The benefits of this design are:

  • Integrating a new CLI only requires implementing one Provider class
  • Unified lifecycle management and session pooling
  • Automated alias resolution and executable discovery

Put plainly, this design turns complicated things into simpler ones and makes life a bit easier.
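The candidate-resolution idea above (try gemini, then gemini-cli) can be sketched like this; the function here is a stand-in for the real _executableResolver, not its actual implementation:

```typescript
// First-match resolution over executable name candidates.
function resolveFirstAvailable(candidates: string[], onPath: Set<string>): string | null {
  for (const name of candidates) {
    if (onPath.has(name)) return name; // first candidate found on PATH wins
  }
  return null;
}
```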

The Provider Registry automatically handles alias mapping and registration:

if (provider is GeminiProvider)
{
    registry.Register(provider.Name, provider, ["gemini-cli"]);
    continue;
}

This means users can invoke Gemini CLI with either gemini or gemini-cli, and the system will recognize it automatically. It is like a friend with both a formal name and a nickname - either way, people know who you mean.
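A minimal sketch of what that alias registration buys (the class shape is assumed for illustration; only the name-or-alias lookup behavior follows the article):

```typescript
// Both the canonical name and every alias map back to the same provider name.
class ProviderRegistry {
  private byName = new Map<string, string>();

  register(name: string, aliases: string[] = []): void {
    this.byName.set(name, name);
    for (const alias of aliases) this.byName.set(alias, name);
  }

  resolve(nameOrAlias: string): string | undefined {
    return this.byName.get(nameOrAlias);
  }
}
```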

GLM-5.1 is Zhipu AI’s latest flagship model, and HagiCode has completed full support for it.

HagiCode manages all supported models through the Secondary Professions Catalog. Here is the configuration for the GLM series:

| Model ID | Name | Supports Image | Compatible CLI Families |
| --- | --- | --- | --- |
| glm-4.7 | GLM 4.7 | - | claude, codebuddy, hermes, qoder, kiro |
| glm-5 | GLM 5 | - | claude, codebuddy, hermes, qoder, kiro |
| glm-5-turbo | GLM 5 Turbo | - | claude, codebuddy, hermes, qoder, kiro |
| glm-5.0 | GLM 5.0 (Legacy) | - | claude, codebuddy, hermes, qoder, kiro |
| glm-5.1 | GLM 5.1 | true | claude, codebuddy, hermes, qoder, kiro |

The key characteristics of GLM-5.1 can be summarized as follows:

  • A standalone version identifier with no legacy baggage
  • The first GLM model to support image input
  • Stronger reasoning and code understanding
  • Broad multi-CLI compatibility

At the code level, the key difference between GLM-5.1 and GLM-5.0 is shown here:

// GLM-5.0 (Legacy) - contains special retention logic
private const string Glm50CodebuddySecondaryProfessionId = "secondary-glm-5-codebuddy";
private const string Glm50CodebuddyModelValue = "glm-5.0";
// GLM-5.1 - standalone new model identifier
private const string Glm51SecondaryProfessionId = "secondary-glm-5-1";
private const string Glm51ModelValue = "glm-5.1";

GLM-5.0 carries the “Legacy” label because it is an old version identifier retained for backward compatibility. GLM-5.1, by contrast, is a brand-new standalone version with no historical burden. Some things stay in the past; others travel lighter and move faster.

Here is a configuration example for using GLM-5.1 in HagiCode:

{
  "primaryProfessionId": "profession-claude-code",
  "secondaryProfessionId": "secondary-glm-5-1",
  "model": "glm-5.1",
  "reasoning": "high"
}

HagiCode’s image support is implemented through the SupportsImage property on SecondaryProfession:

public class HeroSecondaryProfessionSettingDto
{
    public bool SupportsImage { get; set; }
}

In the Secondary Professions Catalog, the GLM-5.1 configuration looks like this:

{
  "id": "secondary-glm-5-1",
  "supportsImage": true
}

This means users can upload screenshots directly for AI analysis, such as:

  • Screenshots of error messages
  • Problems in a UI screen
  • Data visualization charts
  • Code execution results

There is no longer any need to describe everything manually - just upload the screenshot. The convenience of this feature is obvious once you have used it. Sometimes one look says more than a long explanation.
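A client could use that flag to decide whether to offer the upload control at all. The field name follows the article; the surrounding shape is an assumption:

```typescript
// Only models flagged supportsImage accept screenshot uploads.
type ModelEntry = { id: string; supportsImage: boolean };

const models: ModelEntry[] = [
  { id: "glm-4.7", supportsImage: false },
  { id: "glm-5.1", supportsImage: true },
];

const imageCapable = models.filter((m) => m.supportsImage).map((m) => m.id);
console.log(imageCapable); // ["glm-5.1"]
```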

As the tenth Agent CLI, Gemini CLI is integrated into HagiCode through the standard Provider architecture.

Gemini CLI supports a rich set of configuration options:

public class GeminiOptions
{
    public string? ExecutablePath { get; set; }
    public string? WorkingDirectory { get; set; }
    public string? SessionId { get; set; }
    public string? Model { get; set; }
    public string? AuthenticationMethod { get; set; }
    public string? AuthenticationToken { get; set; }
    public Dictionary<string, string?> AuthenticationInfo { get; set; }
    public Dictionary<string, string?> EnvironmentVariables { get; set; }
    public string[] ExtraArguments { get; set; }
    public TimeSpan? StartupTimeout { get; set; }
    public CliPoolSettings? PoolSettings { get; set; }
}

These options cover everything from basic setup to advanced features, giving users the flexibility to configure the CLI around their own needs. Everyone’s workflow is different, so a little flexibility is always welcome.

Gemini CLI supports the ACP (Agent Communication Protocol), which is HagiCode’s unified CLI communication standard. Through ACP, different CLIs can interact with the platform in a consistent way, greatly simplifying integration work. In short, it standardizes the complicated parts so everyone can work more easily.

To use Zhipu AI models, you need to configure the corresponding environment variables.

Terminal window
# Zhipu AI (open.bigmodel.cn)
export ANTHROPIC_AUTH_TOKEN="***"
export ANTHROPIC_BASE_URL="https://open.bigmodel.cn/api/anthropic"
Terminal window
# Alibaba Cloud DashScope Anthropic-compatible endpoint
export ANTHROPIC_AUTH_TOKEN="your-a...-key"
export ANTHROPIC_BASE_URL="https://coding.dashscope.aliyuncs.com/apps/anthropic"

Once configured, HagiCode can call the GLM-5.1 model normally. It is neither especially hard nor especially easy - you just need to follow the setup as intended.

Speaking of real-world practice, the best example is the HagiCode platform’s own build workflow. HagiCode’s development process has already made full use of AI capabilities.

HagiCode’s platform design is well optimized, so it can still provide a good development experience even with GLM 4.7. The platform has already helped complete multiple important build projects, including:

  • Integration of multiple CLI Providers
  • Implementation of the image upload feature
  • Documentation generation and content publishing

That is actually a good thing. Not everyone needs the newest thing all the time. What suits you best is often what matters most.

After upgrading to GLM-5.1, these capabilities become even stronger:

  • Stronger code understanding, reducing back-and-forth communication
  • More accurate dependency analysis, pointing in the right direction immediately
  • More efficient error diagnosis, locating issues faster
  • Support for image input, accelerating problem descriptions

It is like switching from a bicycle to a car. You can still reach the same destination, but the speed and comfort are not the same.

HagiCode.Libs.Providers provides a unified mechanism for registration and usage:

services.AddHagiCodeLibs();
var serviceProvider = services.BuildServiceProvider();

var gemini = serviceProvider.GetRequiredService<ICliProvider<GeminiOptions>>();
var codebuddy = serviceProvider.GetRequiredService<ICliProvider<CodebuddyOptions>>();
var hermes = serviceProvider.GetRequiredService<ICliProvider<HermesOptions>>();

This dependency injection design keeps usage across different CLIs very concise and also makes unit testing and mocking more convenient. Clean code is a way of being responsible to yourself.

There are a few things to keep in mind in actual use:

  1. API key configuration: Make sure ANTHROPIC_AUTH_TOKEN is set correctly, or the model cannot be called
  2. Model availability: GLM-5.1 needs to be enabled by the corresponding model provider
  3. Image feature: Only models with supportsImage: true can use image upload
  4. CLI installation: Before using Gemini CLI, make sure gemini or gemini-cli is in the system PATH

These may be small details, but small details handled poorly can turn into big problems, so they are worth paying attention to.
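
As a sketch of checks 1 and 3 above, a small preflight helper can surface configuration problems before a session starts. This is not part of HagiCode's API; the names here (`preflightIssues`, `PreflightInput`) are hypothetical and only illustrate the idea.

```typescript
// Hypothetical preflight helper -- not a HagiCode API, just an illustration
// of validating points 1 and 3 from the checklist above.
interface PreflightInput {
  env: Record<string, string | undefined>; // e.g. process.env
  supportsImage: boolean;                  // from the model catalog entry
  wantsImageUpload: boolean;               // whether the user plans to upload images
}

function preflightIssues(input: PreflightInput): string[] {
  const issues: string[] = [];
  // 1. API key configuration: required, or the model cannot be called
  if (!input.env['ANTHROPIC_AUTH_TOKEN']) {
    issues.push('ANTHROPIC_AUTH_TOKEN is not set');
  }
  // 3. Image feature: only models with supportsImage: true can use image upload
  if (input.wantsImageUpload && !input.supportsImage) {
    issues.push('selected model does not support image upload');
  }
  return issues;
}
```

Running a check like this at startup turns a confusing mid-session failure into an immediate, readable error list.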

With full support for GLM-5.1 and the successful integration of Gemini CLI, HagiCode further strengthens its capabilities as a multi-model, multi-CLI AI programming platform. These updates not only give users more choices, but also demonstrate HagiCode’s forward-looking architecture and scalability.

GLM-5.1’s image support, combined with HagiCode’s screenshot upload feature, makes it possible to let the AI “understand from the image” and greatly reduces the cost of describing problems. And with support for ten CLIs, users can flexibly choose the AI programming assistant that best fits their preferences and scenarios. More choice is almost always a good thing.

Most importantly, HagiCode’s own build practice proves that the platform can already run well and complete complex tasks even with GLM 4.7, while upgrading to GLM-5.1 can further improve development efficiency. Life is often like that too: you do not always need the absolute best, only what suits you. Of course, if what suits you can become even better, then so much the better.

If you are interested in a multi-model, multi-CLI AI programming platform, give HagiCode a try - open source, free, and still evolving. Trying it costs nothing, and it may turn out to be exactly what you need.


If this article helped you:

Thank you for reading. If you found this article useful, feel free to like, bookmark, and share it to show your support. This content was created with AI-assisted collaboration, and the final content was reviewed and confirmed by the author.

Hagicode and GLM-5.1 Multi-CLI Integration Guide

Hagicode and GLM-5.1 Multi-CLI Integration Guide

Section titled “Hagicode and GLM-5.1 Multi-CLI Integration Guide”

In the Hagicode project, users can choose from multiple CLI tools to drive AI programming assistants, including Claude Code CLI, GitHub Copilot, OpenCode CLI, Codebuddy CLI, Hermes CLI, and more. These CLI tools are general-purpose AI programming tools on their own, but through Hagicode’s abstraction layer, they can flexibly connect to different AI model providers.

Zhipu AI (ZAI) provides an interface compatible with the Anthropic Claude API, allowing these CLI tools to directly use domestic GLM series models. Among them, GLM-5.1 is Zhipu’s latest large language model release, with significant improvements over GLM-5.0.

Hagicode defines 11 CLI provider types through the AIProviderType enum, covering mainstream AI programming CLI tools:

public enum AIProviderType
{
    ClaudeCodeCli = 0,  // Claude Code CLI
    CodexCli = 1,       // OpenAI Codex CLI
    GitHubCopilot = 2,  // GitHub Copilot
    CodebuddyCli = 3,   // Codebuddy CLI
    OpenCodeCli = 4,    // OpenCode CLI
    IFlowCli = 5,       // IFlow CLI
    HermesCli = 6,      // Hermes CLI
    QoderCli = 7,       // Qoder CLI
    KiroCli = 8,        // Kiro CLI
    KimiCli = 9,        // Kimi CLI
    GeminiCli = 10,     // Gemini CLI
}

Each CLI has corresponding model parameter configuration and supports the model and reasoning parameters:

private static readonly IReadOnlyDictionary<AIProviderType, IReadOnlyList<string>> ManagedModelParameterKeysByProvider =
    new Dictionary<AIProviderType, IReadOnlyList<string>>
    {
        [AIProviderType.ClaudeCodeCli] = ["model", "reasoning"],
        [AIProviderType.CodexCli] = ["model", "reasoning"],
        [AIProviderType.OpenCodeCli] = ["model", "reasoning"],
        [AIProviderType.HermesCli] = ["model", "reasoning"],
        [AIProviderType.CodebuddyCli] = ["model", "reasoning"],
        [AIProviderType.QoderCli] = ["model", "reasoning"],
        [AIProviderType.KiroCli] = ["model", "reasoning"],
        [AIProviderType.GeminiCli] = ["model"], // Gemini does not support the reasoning parameter
        // ...
    };

Hagicode’s Secondary Professions Catalog defines complete support for the GLM model series:

| Model ID | Name | Default Reasoning | Compatible CLI Families |
| --- | --- | --- | --- |
| glm-4.7 | GLM 4.7 | high | claude, codebuddy, hermes, qoder, kiro |
| glm-5 | GLM 5 | high | claude, codebuddy, hermes, qoder, kiro |
| glm-5-turbo | GLM 5 Turbo | high | claude, codebuddy, hermes, qoder, kiro |
| glm-5.0 | GLM 5.0 (Legacy) | high | claude, codebuddy, hermes, qoder, kiro |
| glm-5.1 | GLM 5.1 | high | claude, codebuddy, hermes, qoder, kiro |
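
The catalog above can be queried mechanically. The following is a minimal sketch with the catalog reduced to two entries and simplified types; `modelsForFamily` is a hypothetical helper for illustration, not a HagiCode API:

```typescript
// Simplified catalog entries mirroring two rows of the table above.
interface CatalogModel {
  id: string;
  compatiblePrimaryFamilies: string[];
}

const glmCatalog: CatalogModel[] = [
  { id: 'glm-4.7', compatiblePrimaryFamilies: ['claude', 'codebuddy', 'hermes', 'qoder', 'kiro'] },
  { id: 'glm-5.1', compatiblePrimaryFamilies: ['claude', 'codebuddy', 'hermes', 'qoder', 'kiro'] },
];

// Return the model IDs usable with a given primary CLI family.
function modelsForFamily(catalog: CatalogModel[], family: string): string[] {
  return catalog
    .filter((entry) => entry.compatiblePrimaryFamilies.includes(family))
    .map((entry) => entry.id);
}
```

Filtering by `claude` returns both GLM models, while a family absent from the table (such as `gemini`) returns nothing.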

Key differences between GLM-5.1 and GLM-5.0

Section titled “Key differences between GLM-5.1 and GLM-5.0”

From the implementation in AcpSessionModelBootstrapper.cs, we can clearly see the differences between GLM-5.1 and GLM-5.0:

GLM-5.1 is a standalone new model identifier with no legacy handling logic:

private const string Glm51ModelValue = "glm-5.1";

Definition in the Secondary Professions Catalog:

{
  "id": "secondary-glm-5-1",
  "name": "GLM 5.1",
  "family": "anthropic",
  "summary": "hero.professionCopy.secondary.glm51.summary",
  "sourceLabel": "hero.professionCopy.sources.aiSharedAnthropicModel",
  "sortOrder": 64,
  "supportsImage": true,
  "compatiblePrimaryFamilies": [
    "claude",
    "codebuddy",
    "hermes",
    "qoder",
    "kiro"
  ],
  "defaultParameters": {
    "model": "glm-5.1",
    "reasoning": "high"
  }
}

Zhipu AI provides the most complete GLM model support:

{
  "providerId": "zai",
  "name": "智谱 AI",
  "description": "智谱 AI 提供的 Claude API 兼容服务",
  "category": "china-providers",
  "apiUrl": {
    "codingPlanForAnthropic": "https://open.bigmodel.cn/api/anthropic"
  },
  "recommended": true,
  "region": "cn",
  "defaultModels": {
    "sonnet": "glm-4.7",
    "opus": "glm-5",
    "haiku": "glm-4.5-air"
  },
  "supportedModels": [
    "glm-4.7",
    "glm-5",
    "glm-4.5-air",
    "qwen3-coder-next",
    "qwen3-coder-plus"
  ],
  "features": ["experimental-agent-teams"],
  "authTokenEnv": "ANTHROPIC_AUTH_TOKEN",
  "referralUrl": "https://www.bigmodel.cn/claude-code?ic=14BY54APZA",
  "documentationUrl": "https://open.bigmodel.cn/dev/api"
}

Features:

  • Supports the widest variety of GLM model variants
  • Provides default mapping across the Sonnet/Opus/Haiku tiers
  • Supports the experimental-agent-teams feature

Claude Code CLI is one of Hagicode’s core CLIs and is configured through the Hero configuration system:

{
  "primaryProfessionId": "profession-claude-code",
  "secondaryProfessionId": "secondary-glm-5-1",
  "model": "glm-5.1",
  "reasoning": "high"
}

Corresponding HeroEquipmentCatalogItem configuration:

{
  id: 'secondary-glm-5-1',
  name: 'GLM 5.1',
  family: 'anthropic',
  kind: 'model',
  primaryFamily: 'claude',
  compatiblePrimaryFamilies: ['claude', 'codebuddy', 'hermes', 'qoder', 'kiro'],
  defaultParameters: {
    model: 'glm-5.1',
    reasoning: 'high'
  }
}

OpenCode CLI is the most flexible CLI and supports specifying any model in the provider/model format:

Method 1: Use the ZAI provider prefix

{
  "primaryProfessionId": "profession-opencode",
  "model": "zai/glm-5.1",
  "reasoning": "high"
}

Method 2: Use the model ID directly

{
  "model": "glm-5.1"
}

Method 3: Frontend configuration UI

In HeroModelEquipmentForm.tsx, OpenCode CLI has a dedicated placeholder hint:

const OPEN_CODE_MODEL_PLACEHOLDER = 'myprovider/glm-4.7';
const modelPlaceholder = primaryProviderType === PCode_Models_AIProviderType.OPEN_CODE_CLI
  ? OPEN_CODE_MODEL_PLACEHOLDER
  : 'gpt-5.4';

Users can enter:

zai/glm-5.1
glm-5.1

OpenCode CLI model parsing logic:

internal OpenCodeModelSelection? ResolveModelSelection(string? rawModel)
{
    var normalized = NormalizeOptionalValue(rawModel);
    if (normalized == null) return null;

    var slashIndex = normalized.IndexOf('/');
    if (slashIndex < 0)
    {
        // No slash: use the model ID directly
        return new OpenCodeModelSelection
        {
            ProviderId = string.Empty,
            ModelId = normalized,
        };
    }

    // Slash exists: parse the provider/model format
    var providerId = normalized[..slashIndex].Trim();
    var modelId = normalized[(slashIndex + 1)..].Trim();
    return new OpenCodeModelSelection
    {
        ProviderId = providerId,
        ModelId = modelId,
    };
}
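
For illustration, the same split rule can be written as a standalone TypeScript function. This is a hypothetical mirror of the C# logic above, not code from the project:

```typescript
interface ModelSelection {
  providerId: string;
  modelId: string;
}

// Mirror of the C# ResolveModelSelection logic (illustrative only):
// no slash means "model ID only"; otherwise split on the first slash.
function resolveModelSelection(rawModel: string | null): ModelSelection | null {
  const normalized = rawModel?.trim();
  if (!normalized) return null;
  const slashIndex = normalized.indexOf('/');
  if (slashIndex < 0) {
    return { providerId: '', modelId: normalized };
  }
  return {
    providerId: normalized.slice(0, slashIndex).trim(),
    modelId: normalized.slice(slashIndex + 1).trim(),
  };
}
```

So `zai/glm-5.1` resolves to provider `zai` with model `glm-5.1`, while a bare `glm-5.1` leaves the provider empty and lets the CLI apply its default.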

Codebuddy CLI has dedicated legacy handling logic:

{
"primaryProfessionId": "profession-codebuddy",
"model": "glm-5.1",
"reasoning": "high"
}

Note: Codebuddy retains special handling for GLM-5.0 and does not use legacy normalization:

return !string.Equals(providerName, "CodebuddyCli", StringComparison.OrdinalIgnoreCase)
    && string.Equals(normalizedModel, LegacyGlm5TurboModelValue, StringComparison.OrdinalIgnoreCase)
        ? Glm5TurboModelValue
        : normalizedModel;
// For CodebuddyCli, glm-5.0 is not normalized to glm-5-turbo

Environment variables for Zhipu AI (ZAI):

Terminal window
# Set the API key
export ANTHROPIC_AUTH_TOKEN="***"
# Optional: specify the API endpoint (ZAI uses this endpoint by default)
export ANTHROPIC_BASE_URL="https://open.bigmodel.cn/api/anthropic"

Environment variables for Alibaba Cloud DashScope:

Terminal window
# Set the API key
export ANTHROPIC_AUTH_TOKEN="your-a...-key"
# Specify the Alibaba Cloud endpoint
export ANTHROPIC_BASE_URL="https://coding.dashscope.aliyuncs.com/apps/anthropic"

Compared with GLM-5.0, GLM-5.1 brings the following significant improvements:

According to Zhipu’s official release information, improvements in GLM-5.1 include:

  • Stronger code understanding: More accurate analysis of complex code structures
  • Longer context comprehension: Supports longer conversational context
  • Enhanced tool calling: Higher success rate for MCP tool calls
  • Output stability: Reduces randomness and hallucinations

GLM-5.1 covers all mainstream CLIs supported by Hagicode:

compatiblePrimaryFamilies: [
  "claude",    // Claude Code CLI
  "codebuddy", // Codebuddy CLI
  "hermes",    // Hermes CLI
  "qoder",     // Qoder CLI
  "kiro"       // Kiro CLI
]

Make sure the ANTHROPIC_AUTH_TOKEN environment variable is set correctly. It is the required credential for every CLI to connect to the model.

GLM-5.1 needs to be enabled by the corresponding model provider:

  • The Zhipu AI ZAI platform supports it by default
  • Alibaba Cloud DashScope may require a separate application

When using the provider/model format, make sure the provider ID is correct:

  • Zhipu AI: zai or zhipuai
  • Alibaba Cloud: aliyun or dashscope

For the reasoning parameter:

  • high is recommended for the best code generation results
  • Gemini CLI does not support the reasoning parameter and will ignore this configuration automatically

Through a unified abstraction layer, Hagicode enables flexible integration between GLM-5.1 and multiple CLIs. Developers can choose the CLI tool that best fits their preferences and usage scenarios, then use the latest GLM-5.1 model through simple configuration.

As Zhipu’s latest model version, GLM-5.1 offers clear improvements over GLM-5.0:

  • An independent version identifier with no legacy burden
  • Stronger reasoning and code understanding
  • Broad multi-CLI compatibility
  • Flexible reasoning level configuration

With the correct environment variables and Hero equipment configured, users can fully unlock the power of GLM-5.1 across different CLI environments.

If you want to put GLM-5.1, multi-CLI orchestration, and HagiCode’s configuration model into real use, the platform itself is the fastest entry point.

Once you compare Kimi, Claude Code, OpenCode, and other CLIs inside the same abstraction layer, questions about model switching, parameter mapping, and engineering boundaries tend to become much easier to reason about.

HagiCode Desktop Hybrid Distribution Architecture Explained: How P2P Accelerates Large File Downloads

HagiCode Desktop Hybrid Distribution Architecture Explained: How P2P Accelerates Large File Downloads

Section titled “HagiCode Desktop Hybrid Distribution Architecture Explained: How P2P Accelerates Large File Downloads”

I held this article back for a long time before finally writing it, and I am still not sure whether it reads well. Technical writing is easy enough to produce, but hard to make truly engaging. Then again, I am no great literary master, so I might as well just set down this plain explanation.

Teams building desktop applications will all run into the same headache sooner or later: how do you distribute large files?

It is an awkward problem. Traditional HTTP/HTTPS direct downloads can still hold up when files are small and the number of users is limited. But time is rarely kind. As a project keeps growing, the installation packages grow with it: Desktop ZIP packages, portable packages, web deployment archives, and more. Then the issues start to surface:

  • Download speed is limited by origin bandwidth: no matter how much bandwidth a single server has, it still struggles when everyone downloads at once.
  • Resume support is nearly nonexistent: if an HTTP download is interrupted, you often have to start over from the beginning. That wastes both time and bandwidth.
  • The origin server takes all the pressure: all traffic flows back to a central server, bandwidth costs keep rising, and scalability becomes a real problem.

The HagiCode Desktop project was no exception. When we designed the distribution system, we kept asking ourselves: can we introduce a hybrid distribution approach without changing the existing index.json control plane? In other words, can we use the distributed nature of P2P networks to accelerate downloads while still keeping HTTP origin fallback so the system remains usable in constrained environments such as enterprise networks?

The impact of that decision turned out to be larger than you might expect. Let us walk through it step by step.

The approach shared in this article comes from our real-world experience in the HagiCode project. HagiCode is an open-source AI coding assistant project focused on helping development teams improve engineering efficiency. The project spans multiple subsystems, including the frontend, backend, desktop launcher, documentation, build pipeline, and server deployment.

The Desktop hybrid distribution architecture is exactly the kind of solution HagiCode refined through real operational experience and repeated optimization. If this design proves useful, then perhaps it also shows that HagiCode itself is worth paying attention to.

The project’s GitHub repository is HagiCode-org/site. If it interests you, feel free to give it a Star and save it for later.

Core Design Philosophy: P2P First, HTTP Fallback

Section titled “Core Design Philosophy: P2P First, HTTP Fallback”

At its heart, the hybrid distribution model can be summarized in a single sentence: P2P first, HTTP fallback.

The key lies in the word “hybrid.” This is not about simply adding BitTorrent and calling it a day. The point is to make the two delivery methods work together and complement each other:

  • The P2P network provides distributed acceleration. The more people download, the more peers join, and the faster the transfer becomes.
  • WebSeed/HTTP fallback guarantees availability, so downloads can still work in enterprise firewalls and internal network environments.
  • The control plane remains simple. We do not change the core logic of index.json; we only add a few optional metadata fields.

The real benefit is straightforward: users feel that “downloads are faster,” while the engineering team does not have to shoulder too much extra complexity. After all, the BT protocol is already mature, and there is little reason to reinvent the wheel.

Let us start with the overall architecture diagram to build a high-level mental model:

┌─────────────────────────────────────┐
│ Renderer (UI layer) │
├─────────────────────────────────────┤
│ IPC/Preload (bridge layer) │
├─────────────────────────────────────┤
│ VersionManager (version manager) │
├─────────────────────────────────────┤
│ HybridDownloadCoordinator (coord.) │
│ ├── DistributionPolicyEvaluator │
│ ├── DownloadEngineAdapter │
│ ├── CacheRetentionManager │
│ └── SHA256 Verifier │
├─────────────────────────────────────┤
│ WebTorrent (download engine) │
└─────────────────────────────────────┘

As the diagram shows, the system uses a layered design. The reason for separating responsibilities this clearly is simple: testability and replaceability.

  • The UI layer is responsible for displaying download progress and the sharing acceleration toggle. It is the surface.
  • The coordination layer is the core. It contains policy evaluation, engine adaptation, cache management, and integrity verification.
  • The engine layer encapsulates the concrete download implementation. At the moment, it uses WebTorrent.

The engine layer is abstracted behind the DownloadEngineAdapter interface. If we ever want to swap in a different BT engine later, or move the implementation into a sidecar process, that becomes much easier.

Separation of Control Plane and Data Plane

Section titled “Separation of Control Plane and Data Plane”

HagiCode Desktop keeps index.json as the sole control plane, and that design is critical. The control plane is responsible for version discovery, channel selection, and centralized policy, while the data plane is where the actual file transfer happens.

The new fields added to index.json are optional:

{
  "asset": {
    "torrentUrl": "https://cdn.example.com/app.torrent",
    "infoHash": "abc123...",
    "webSeeds": [
      "https://cdn.example.com/app.zip",
      "https://backup.example.com/app.zip"
    ],
    "sha256": "def456...",
    "directUrl": "https://cdn.example.com/app.zip"
  }
}

All of these fields are optional. If they are missing, the client falls back to the traditional HTTP download mode. The advantage of this design is backward compatibility: older clients are completely unaffected.
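
That completeness rule can be sketched as a simple guard. The field names follow the index.json asset fragment above; the function itself is illustrative, not the project's actual code:

```typescript
// Optional hybrid fields, as in the index.json asset entry above.
interface AssetMetadata {
  torrentUrl?: string;
  infoHash?: string;
  webSeeds?: string[];
  sha256?: string;
  directUrl?: string;
}

// A client only attempts hybrid download when the required fields
// (torrentUrl, webSeeds, sha256) are all present; anything less
// falls back to plain HTTP.
function hasCompleteHybridMetadata(asset: AssetMetadata): boolean {
  return Boolean(
    asset.torrentUrl &&
      asset.sha256 &&
      asset.webSeeds &&
      asset.webSeeds.length > 0,
  );
}
```

Because the guard only ever adds the P2P path and never removes the HTTP one, an older index file simply makes every client behave as before.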

Not every file is worth distributing through P2P.

DistributionPolicyEvaluator is responsible for evaluating the policy. Only files that meet all of the following conditions will use hybrid download:

  1. The source type must be an HTTP index: direct GitHub downloads or local folder sources do not use this path.
  2. The file size must be at least 100 MB: for smaller files, the overhead of P2P outweighs the benefit.
  3. Complete hybrid metadata must be present: torrentUrl, webSeeds, and sha256 are all required.
  4. Only the latest desktop package and web deployment package are eligible: historical versions continue to use the traditional distribution path.

class DistributionPolicyEvaluator {
  evaluate(version: Version, settings: SharingAccelerationSettings): HybridDownloadPolicy {
    // Check source type
    if (version.sourceType !== 'http-index') {
      return { useHybrid: false, reason: 'not-http-index' };
    }
    // Check metadata completeness
    if (!version.hybrid) {
      return { useHybrid: false, reason: 'not-eligible' };
    }
    // Check whether the feature is enabled
    if (!settings.enabled) {
      return { useHybrid: false, reason: 'shared-disabled' };
    }
    // Check asset type (latest desktop/web packages only)
    if (!version.hybrid.isLatestDesktopAsset && !version.hybrid.isLatestWebAsset) {
      return { useHybrid: false, reason: 'latest-only' };
    }
    return { useHybrid: true, reason: 'shared-enabled' };
  }
}

This gives the system predictable behavior. Both developers and users can clearly understand which files will use P2P and which will not.

Let us start with the type definitions, because they form the foundation of the entire system.

// Hybrid distribution metadata
interface HybridDistributionMetadata {
  torrentUrl?: string;    // Torrent file URL
  infoHash?: string;      // InfoHash
  webSeeds: string[];     // WebSeed list
  sha256?: string;        // File hash
  directUrl?: string;     // HTTP direct link (for origin fallback)
  eligible: boolean;      // Whether hybrid distribution is applicable
  thresholdBytes: number; // Threshold in bytes
  assetKind: VersionAssetKind;
  isLatestDesktopAsset: boolean;
  isLatestWebAsset: boolean;
}

// Sharing acceleration settings
interface SharingAccelerationSettings {
  enabled: boolean;          // Master switch
  uploadLimitMbps: number;   // Upload bandwidth limit
  cacheLimitGb: number;      // Cache limit
  retentionDays: number;     // Retention period
  hybridThresholdMb: number; // Hybrid distribution threshold
  onboardingChoiceRecorded: boolean;
}

// Download progress
interface VersionDownloadProgress {
  current: number;
  total: number;
  percentage: number;
  stage: VersionInstallStage; // queued, downloading, backfilling, verifying, extracting, completed, error
  mode: VersionDownloadMode;  // http-direct, shared-acceleration, source-fallback
  peers?: number;             // Number of connected peers
  p2pBytes?: number;          // Bytes received from P2P
  fallbackBytes?: number;     // Bytes received from fallback
  verified?: boolean;         // Whether verification has completed
}

Once the type system is clear, the rest of the implementation follows naturally.

HybridDownloadCoordinator orchestrates the entire download workflow. It coordinates policy evaluation, engine execution, SHA256 verification, and cache management.

class HybridDownloadCoordinator {
  async download(
    version: Version,
    cachePath: string,
    packageSource: PackageSource,
    onProgress?: DownloadProgressCallback,
  ): Promise<HybridDownloadResult> {
    // 1. Evaluate the policy: should hybrid download be used?
    const policy = this.policyEvaluator.evaluate(version, settings);

    // 2. Execute the download
    if (policy.useHybrid) {
      await this.engine.download(version, cachePath, settings, onProgress);
    } else {
      await packageSource.downloadPackage(version, cachePath, onProgress);
    }

    // 3. SHA256 verification (hard gate)
    const verified = await this.verify(version, cachePath, onProgress);
    if (!verified) {
      await this.cacheRetentionManager.discard(version.id, cachePath);
      throw new Error(`sha256 verification failed for ${version.id}`);
    }

    // 4. Mark as trusted cache and begin controlled seeding
    await this.cacheRetentionManager.markTrusted({
      versionId: version.id,
      cachePath,
      cacheSize,
    }, settings);

    return { cachePath, policy, verified };
  }
}

There is one especially important point here: SHA256 verification is a hard gate. A downloaded file must pass verification before it can enter the installation flow. If verification fails, the cache is discarded to ensure that an incorrect file never causes installation problems.

DownloadEngineAdapter is an abstract interface that defines the methods every engine must implement:

interface DownloadEngineAdapter {
  download(
    version: Version,
    destinationPath: string,
    settings: SharingAccelerationSettings,
    onProgress?: (progress: VersionDownloadProgress) => void,
  ): Promise<void>;
  stopAll(): Promise<void>;
}

The V1 implementation is based on WebTorrent and is wrapped in InProcessTorrentEngineAdapter:

class InProcessTorrentEngineAdapter implements DownloadEngineAdapter {
  async download(...) {
    const client = this.getClient(settings); // Apply upload rate limiting
    const torrent = client.add(torrentId, {
      path: path.dirname(destinationPath),
      destroyStoreOnDestroy: false,
      maxWebConns: 8,
    });

    // Add WebSeed sources
    torrent.on('ready', () => {
      for (const seed of hybrid.webSeeds) {
        torrent.addWebSeed(seed);
      }
      if (hybrid.directUrl) {
        torrent.addWebSeed(hybrid.directUrl);
      }
    });

    // Progress reporting - distinguish P2P from origin fallback
    torrent.on('download', () => {
      const hasP2PPeer = torrent.wires.some(w => w.type !== 'webSeed');
      const mode = hasP2PPeer ? 'shared-acceleration' : 'source-fallback';
      // ... report progress
    });
  }
}

A pluggable engine design makes future optimization much easier. For example, V2 could run the engine in a helper process to avoid bringing down the main process if the engine crashes.

At the UI layer, the thing users care about most is simple: “am I currently downloading through P2P or through HTTP fallback?” InProcessTorrentEngineAdapter determines that by checking the types inside torrent.wires:

const hasP2PPeer = torrent.wires.some((wire) => wire.type !== 'webSeed');
const hasFallbackWire = torrent.wires.some((wire) => wire.type === 'webSeed');
const mode = hasP2PPeer
  ? 'shared-acceleration'
  : hasFallbackWire
    ? 'source-fallback'
    : 'shared-acceleration';
const stage = hasP2PPeer
  ? 'downloading'
  : hasFallbackWire
    ? 'backfilling'
    : 'downloading';

The logic looks simple, but it is a key part of the user experience. Users can clearly see whether the current state is “sharing acceleration” or “origin backfilling,” which makes the behavior easier to understand.

Integrity verification uses Node.js’s crypto module to compute the hash in a streaming manner, which avoids loading the entire file into memory:

private async computeSha256(filePath: string): Promise<string> {
  const hash = createHash('sha256');
  await new Promise<void>((resolve, reject) => {
    const stream = fs.createReadStream(filePath);
    stream.on('data', (chunk) => hash.update(chunk));
    stream.on('error', reject);
    stream.on('end', resolve);
  });
  return hash.digest('hex').toLowerCase();
}

This implementation is especially friendly for large files. Imagine downloading a 2 GB installation package and then trying to load the whole thing into memory just to verify it. Streaming solves that cleanly.

The full data flow looks like this:

┌────────────────────────────────────────────────────────────────────┐
│ User clicks install on a large-file version │
└────────────────────────────────────────────────────────────────────┘
┌────────────────────────────────────────────────────────────────────┐
│ VersionManager invokes the coordinator │
│ HybridDownloadCoordinator.download() │
└────────────────────────────────────────────────────────────────────┘
┌────────────────────────────────────────────────────────────────────┐
│ DistributionPolicyEvaluator.evaluate() │
│ Checks: source, metadata, switch, and asset type │
└────────────────────────────────────────────────────────────────────┘
┌───────────┴───────────┐
│ useHybrid? │
└───────────┬───────────┘
yes │ │ no
▼ ▼
┌──────────────────┐ ┌─────────────────────┐
│ P2P + WebSeed │ │ HTTP direct download│
│ Hybrid download │ │ (compatibility path)│
└──────────────────┘ └─────────────────────┘
┌──────────────────┐
│ SHA256 verify │
│ (hard gate) │
└────────┬─────────┘
┌────────┴─────────┐
│ Passed? │
└────────┬─────────┘
yes │ │ no
▼ ▼
┌────────────┐ ┌────────────────┐
│ Extract + │ │ Drop cache + │
│ install + │ │ return error │
│ seed safely│ └────────────────┘
└────────────┘

The flow is very clear end to end, and every step has a well-defined responsibility. When something goes wrong, it is much easier to pinpoint the failing stage.

Even the best technical design will fall flat if the user experience is poor. HagiCode Desktop invested a fair amount of effort in productizing this capability.

Most users do not know what BitTorrent or InfoHash means. So at the product level, we present the feature using the phrase “sharing acceleration”:

  • The feature is called “sharing acceleration,” not P2P download.
  • The setting is called “upload limit,” not seeding.
  • The progress label says “origin backfilling,” not WebSeed fallback.

This lowers the cognitive burden of the terminology and makes the feature easier to accept.

Enabled by Default in the First-Run Wizard

Section titled “Enabled by Default in the First-Run Wizard”

When new users launch the desktop app for the first time, they see a wizard page introducing sharing acceleration:

To improve download speed, we share the portions you have already downloaded with other users while your own download is in progress. This is completely optional, and you can turn it off at any time in Settings.

It is enabled by default, but users are given a clear way to opt out. If enterprise users do not want it, they can simply disable it during onboarding.

The settings page exposes three tunable parameters:

| Parameter | Default | Description |
| --- | --- | --- |
| Upload limit | 2 MB/s | Prevents excessive upstream bandwidth usage |
| Cache limit | 10 GB | Controls disk space consumption |
| Retention days | 7 days | Automatically cleans old cache after this period |

These parameters all have sensible defaults. Most users never need to change them, while advanced users can adjust them based on their own network environment.
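
The defaults in the table map naturally onto a constant like the `DEFAULT_SETTINGS` referenced by the settings-normalization code shown later in this article. The following is a sketch assuming the SharingAccelerationSettings shape from earlier, not the project's literal source:

```typescript
// Sketch of the defaults from the table above, using the field names of
// the SharingAccelerationSettings interface. The threshold value comes
// from the 100 MB policy rule and is not user-configurable.
const DEFAULT_SETTINGS = {
  enabled: true,                  // on by default, opt-out during onboarding
  uploadLimitMbps: 2,             // "Upload limit: 2 MB/s"
  cacheLimitGb: 10,               // "Cache limit: 10 GB"
  retentionDays: 7,               // "Retention days: 7 days"
  onboardingChoiceRecorded: false,
  hybridThresholdMb: 100,         // fixed hybrid-distribution threshold
};
```

Keeping the defaults in one constant also gives the normalization code a single fallback source when a stored value is out of range.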

Looking back at the overall solution, several design decisions are worth calling out.

Why not start with a sidecar or helper process right away? The reason is simple: ship quickly. An in-process design has a shorter development cycle and is easier to debug. The first priority is to get the feature running, then improve stability afterward.

Of course, this decision comes with a cost: if the engine crashes, it can affect the main process. We reduce that risk through adapter boundaries and timeout controls, and we also keep a migration path open so V2 can move into a separate process more easily.

We use SHA256 instead of MD5 or CRC32 because SHA256 is more secure. The collision cost for MD5 and CRC32 is too low. If someone maliciously crafted a fake installation package, the consequences could be severe. SHA256 costs more to compute, but the security gain is worth it.

Scenarios such as GitHub downloads and local folder sources do not use hybrid distribution. This is not a technical limitation; it is about avoiding unnecessary complexity. BT protocols add limited value inside private network scenarios and would only increase code complexity.

Inside SharingAccelerationSettingsStore, every numeric value must go through bounds checking and normalization:

private normalize(settings: SharingAccelerationSettings): SharingAccelerationSettings {
  return {
    enabled: Boolean(settings.enabled),
    uploadLimitMbps: this.clampNumber(settings.uploadLimitMbps, 1, 200, DEFAULT_SETTINGS.uploadLimitMbps),
    cacheLimitGb: this.clampNumber(settings.cacheLimitGb, 1, 500, DEFAULT_SETTINGS.cacheLimitGb),
    retentionDays: this.clampNumber(settings.retentionDays, 1, 90, DEFAULT_SETTINGS.retentionDays),
    hybridThresholdMb: DEFAULT_SETTINGS.hybridThresholdMb, // Fixed value, not user-configurable
    onboardingChoiceRecorded: Boolean(settings.onboardingChoiceRecorded),
  };
}

private clampNumber(value: number, min: number, max: number, fallback: number): number {
  if (!Number.isFinite(value)) {
    return fallback;
  }
  return Math.min(max, Math.max(min, Math.round(value)));
}

This prevents users from manually editing the configuration file into invalid values.

CacheRetentionManager.prune() is responsible for cleaning expired or oversized cache entries. The cleanup strategy uses LRU (least recently used):

// Sort cache records by last-used time, oldest first
const records = [...this.listRecords()].sort(
  (left, right) => new Date(left.lastUsedAt).getTime() - new Date(right.lastUsedAt).getTime(),
);

// When over the limit, evict the least recently used entries first
while (totalBytes > maxBytes && retainedEntries.length > 0) {
  const evicted = records.find((record) => retainedEntries.includes(record.versionId));
  if (!evicted) break; // nothing left to evict
  retainedEntries.splice(retainedEntries.indexOf(evicted.versionId), 1);
  removedEntries.push(evicted.versionId);
  totalBytes -= evicted.cacheSize;
  await fs.rm(evicted.cachePath, { force: true });
}

This logic ensures disk space is used efficiently while preserving historical versions that the user might still need.

When the user turns off sharing acceleration, the app must immediately stop seeding and destroy the torrent client:

async disableSharingAcceleration(): Promise<void> {
  this.settingsStore.updateSettings({ enabled: false });
  await this.cacheRetentionManager.stopAllSeeding(); // Stop seeding
  await this.engine.stopAll();                       // Destroy the torrent client
}

If a user disables the feature, the product should no longer consume any P2P resources. That is basic product etiquette.

There is no perfect solution, and hybrid distribution is no exception. These are the main trade-offs:

Crash isolation is weaker than a sidecar: V1 uses an in-process engine, so an engine crash can affect the main process. Adapter boundaries and timeout controls reduce the risk, but they are not a fundamental fix. V2 includes a planned migration path to a helper process.

Enabled-by-default resource usage: the default settings of 2 MB/s upload, 10 GB cache, and 7-day retention do consume some machine resources. User expectations are managed through onboarding copy and transparent settings.

Enterprise network compatibility: automatic WebSeed/HTTPS fallback preserves usability in enterprise networks, but it can reduce the acceleration gains from P2P. This is an intentional trade-off that prioritizes availability.

Backward-compatible metadata: all new fields are optional. If they are missing, the system falls back to HTTP mode. Older clients are completely unaffected, making upgrades smooth.
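The shape of that fallback can be sketched in a few lines; the field names below are assumptions, not HagiCode's actual metadata schema:

```typescript
// Hypothetical release metadata: the P2P fields are optional, and their absence
// (e.g. metadata published by an older pipeline) silently selects plain HTTP.
type ReleaseMetadata = {
  url: string;        // always present: the plain HTTP(S) download
  magnetUri?: string; // optional: only set when hybrid distribution is enabled
  sha256?: string;    // optional: the integrity gate required for P2P mode
};

function pickTransport(meta: ReleaseMetadata): "p2p" | "http" {
  return meta.magnetUri && meta.sha256 ? "p2p" : "http";
}
```

Older clients simply ignore fields they do not recognize, and newer clients degrade to HTTP whenever a field is missing, which is what keeps the upgrade path smooth.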

This article walked through the hybrid distribution architecture used in the HagiCode Desktop project. The key takeaways are:

  1. Layered architecture: the control plane and data plane are separated, and the engine is abstracted behind a pluggable interface for easier testing and extension.

  2. Policy-driven behavior: not every file uses P2P. Hybrid distribution is enabled only for large files that meet the required conditions.

  3. Integrity verification: SHA256 serves as a hard gate, and streaming verification avoids memory pressure.

  4. Productized presentation: BT terminology is hidden behind the phrase “sharing acceleration,” and the feature is enabled by default during onboarding.

  5. User control: upload limits, cache limits, retention days, and other parameters remain user-adjustable.
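Point 3 above deserves a concrete shape. A streaming SHA256 check in Node.js hashes the file chunk by chunk instead of reading it into memory all at once; this is a generic sketch, not HagiCode's actual verifier:

```typescript
import { createHash } from "node:crypto";
import { createReadStream } from "node:fs";

// Streams the file through the hash so even multi-GB artifacts never load fully
// into memory; resolves true only when the digest matches the expected value.
function verifySha256(filePath: string, expectedHex: string): Promise<boolean> {
  return new Promise((resolve, reject) => {
    const hash = createHash("sha256");
    createReadStream(filePath)
      .on("data", (chunk) => hash.update(chunk))
      .on("end", () => resolve(hash.digest("hex") === expectedHex.toLowerCase()))
      .on("error", reject);
  });
}
```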

This architecture has already been implemented in the HagiCode Desktop project. If you try it out, we would love to hear your feedback after installation and real-world use.


If this article helped you:

Maybe we are all just ordinary people making our way through the world of technology, but that is fine. Ordinary people can still be persistent, and that persistence matters.

Thank you for reading. If you found this article useful, feel free to like, save, and share it. This content was created with AI-assisted collaboration, with the final version reviewed and approved by the author.

Running AI CLI Tools in Docker Containers: A Practical Guide to User Isolation and Persistent Volumes


Integrating AI coding tools like Claude Code, Codex, and OpenCode into containerized environments sounds simple, but there are hidden complexities everywhere. This article takes a deep dive into how the HagiCode project solves core challenges in Docker deployments, including user permissions, configuration persistence, and version management, so you can avoid the common pitfalls.

When we decided to run AI coding CLI tools inside Docker containers, the most intuitive thought was probably: “Aren’t containers just root? Why not install everything directly and call it done?” In reality, that seemingly simple idea hides several core problems that must be solved.

Security restrictions are the first hurdle. Take Claude CLI as an example: it explicitly forbids running as the root user. This is a mandatory security check: if root is detected, the CLI refuses to start. You might think, can’t I just switch users with the USER directive? It is not that simple. There is still a mapping problem between the non-root user inside the container and the user’s permissions on the host machine.

State persistence is the second trap. Claude Code requires login, Codex has its own configuration, and OpenCode also has a cache directory. If you have to reconfigure everything every time the container restarts, the whole idea of “automation” loses its meaning. We need these configurations to persist beyond the lifecycle of the container.

The third problem is permission consistency. Can processes inside the container access configuration files created by the host user? UID/GID mismatches often cause file permission errors, and this is extremely common in real deployments.

These problems may look independent, but in practice they are tightly connected. During HagiCode’s development, we gradually worked out a practical solution. Next, I will share the technical details and the lessons learned from those pitfalls.

The solution shared in this article comes from our practical experience in the HagiCode project. HagiCode is an open-source AI-assisted programming platform that integrates multiple mainstream AI coding assistants, including Claude Code, Codex, and OpenCode. As a project that needs cross-platform and highly available deployment, HagiCode has to solve the full range of challenges involved in containerized deployment.

If you find the technical solution in this article valuable, that is a sign HagiCode has something real to offer in engineering practice. In that case, the HagiCode official website and GitHub repository are both worth following.

There is a common misunderstanding here: Docker containers run as root by default, so why not just install the tools as root? If you think that way, Claude CLI will quickly teach you otherwise.

# Run Claude CLI directly as root? No.
docker run --rm -it --user root myimage claude
# Output: Error: This command cannot be run as root user

This is a hard security restriction in Claude CLI. The reason is simple: these CLI tools read and write sensitive user configuration, including API tokens, local caches, and even scripts written by the user. Running them with root privileges introduces too much risk.

So the question becomes: how can we satisfy the CLI’s security requirements while keeping container management flexible? We need to change the way we think about it: instead of switching users at runtime, create a dedicated user during the image build stage.

Creating a dedicated user: more than just changing a name


You might think that adding a single USER line to the Dockerfile is enough. That is indeed the simplest approach, but it is not robust enough.

HagiCode’s approach is to create a hagicode user with UID 1000, which usually matches the default user on most host machines:

RUN groupadd -o -g 1000 hagicode && \
    useradd -o -u 1000 -g 1000 -s /bin/bash -m hagicode && \
    mkdir -p /home/hagicode/.claude && \
    chown -R hagicode:hagicode /home/hagicode

But this only solves the built-in user inside the image. What if the host user is UID 1001? You still need to support dynamic mapping when the container starts.

docker-entrypoint.sh contains the key logic:

if [ -n "$PUID" ] && [ -n "$PGID" ]; then
  if ! id hagicode >/dev/null 2>&1; then
    groupadd -g "$PGID" hagicode
    useradd -u "$PUID" -g "$PGID" -s /bin/bash -m hagicode
  fi
fi

The advantage of this design is clear: use the default UID 1000 at image build time, then adjust dynamically at runtime through the PUID and PGID environment variables. No matter what UID the host user has, ownership of configuration files remains correct.

The design philosophy of persistent volumes


Each AI CLI tool has its own preferred configuration directory, so they need to be mapped one by one:

CLI Tool   | Path in Container               | Named Volume
Claude     | /home/hagicode/.claude          | claude-data
Codex      | /home/hagicode/.codex           | codex-data
OpenCode   | /home/hagicode/.config/opencode | opencode-config-data

Why use named volumes instead of bind mounts? Three reasons:

  1. Simpler management: Named volumes are managed automatically by Docker, so you do not need to create host directories manually.
  2. Permission isolation: The initial contents of the volumes are created by the user inside the container, avoiding permission conflicts with the host.
  3. Independent migration: Volumes can exist independently of containers, so data is not lost when images are upgraded.

docker-compose-builder-web automatically generates the corresponding volume configuration:

volumes:
  claude-data:
  codex-data:
  opencode-config-data:

services:
  hagicode:
    volumes:
      - claude-data:/home/hagicode/.claude
      - codex-data:/home/hagicode/.codex
      - opencode-config-data:/home/hagicode/.config/opencode
    user: "${PUID:-1000}:${PGID:-1000}"

Pay attention to the user field here: PUID and PGID are injected through environment variables to ensure that processes inside the container run with an identity that matches the host user. This detail matters because permission issues are painful to debug once they appear.

Version management: baked-in versions with runtime overrides


Pinning Docker image versions is essential for reproducibility. But in real development, we often need to test a newer version or urgently fix a bug. If we had to rebuild the image every time, the workflow would be far too inefficient.

HagiCode’s strategy is fixed versions as the default, with runtime overrides as an extension mechanism. It is a pragmatic engineering compromise between stability and flexibility.

Dockerfile.template pins versions here:

USER hagicode
WORKDIR /home/hagicode

# Configure the global npm install path
RUN mkdir -p /home/hagicode/.npm-global && \
    npm config set prefix '/home/hagicode/.npm-global'

# Install CLI tools using pinned versions
RUN npm install -g @anthropic-ai/claude-code@2.1.71 && \
    npm install -g @openai/codex@0.112.0 && \
    npm install -g opencode-ai@1.2.25 && \
    npm cache clean --force

docker-entrypoint.sh supports runtime overrides:

install_cli_override_if_needed() {
  local package_name="$2"
  local override_version="$5"
  if [ -n "$override_version" ]; then
    gosu hagicode npm install -g "${package_name}@${override_version}"
  fi
}

# Example usage
install_cli_override_if_needed "" "@anthropic-ai/claude-code" "" "" "${CLAUDE_CODE_CLI_VERSION}"

This lets you test a new version through an environment variable without rebuilding the image:

docker run -e CLAUDE_CODE_CLI_VERSION=2.2.0 myimage

This design is practical because nobody wants to rebuild an image every time they test a new feature.

In addition to configuring CLI tools manually, some scenarios require automatic configuration injection. The most typical example is an API token.

if [ -n "$ANTHROPIC_AUTH_TOKEN" ]; then
  mkdir -p /home/hagicode/.claude
  cat > /home/hagicode/.claude/settings.json <<EOF
{
  "env": {
    "ANTHROPIC_AUTH_TOKEN": "${ANTHROPIC_AUTH_TOKEN}"
  }
}
EOF
  chown -R hagicode:hagicode /home/hagicode/.claude
fi

Two things matter here: pass sensitive information through environment variables instead of hard-coding it into the image, and make sure the ownership of configuration files is set correctly, otherwise the CLI tools will not be able to read them.

This is the easiest trap to fall into. The host user has UID 1001, while the container uses 1000, so files created on one side cannot be accessed on the other.

# Correct approach: make the container match the host user
docker run \
  -e PUID=$(id -u) \
  -e PGID=$(id -g) \
  myimage

This issue is very common, and it can be frustrating the first time you run into it.

Configuration disappears after container restart


If you find yourself logging in again after every restart, check whether you forgot to mount a persistent volume:

volumes:
  - claude-data:/home/hagicode/.claude

Nothing is more frustrating than carefully setting up a configuration only to see it disappear.

Do not run npm install -g directly inside a running container. The correct approaches are:

  1. Set an environment variable to trigger override installation.
  2. Or rebuild the image.
# Option 1: runtime override
docker run -e CLAUDE_CODE_CLI_VERSION=2.2.0 myimage
# Option 2: rebuild the image
docker build -t myimage:v2 .

There is more than one road to Rome, but some roads are smoother than others.

  • Pass API tokens through environment variables instead of writing them into the image.
  • Set configuration file permissions to 600.
  • Always run the application as a non-root user.
  • Update CLI versions regularly to fix security vulnerabilities.

Security is always important, but the real challenge is consistently enforcing it in practice.

If you want to support a new CLI tool in the future, there are only three steps:

  1. Dockerfile.template: add the installation step.
  2. docker-entrypoint.sh: add the version override logic.
  3. docker-compose-builder-web: add the persistent volume mapping.

This template-based design makes extension simple without changing the core logic.

Running AI CLI tools in Docker containers involves three core challenges: user permissions, configuration persistence, and version management. By combining dedicated users, named-volume isolation, and environment-variable-based overrides, the HagiCode project built a deployment architecture that is both secure and flexible.

Key design points:

  • User isolation: Create a dedicated user during the image build stage, with runtime support for dynamic PUID/PGID mapping.
  • Persistence strategy: Each CLI tool gets its own named volume, so restarts do not affect configuration.
  • Version flexibility: Fixed defaults ensure reproducibility, while runtime overrides provide room for testing.
  • Automated configuration: Sensitive configuration can be injected automatically through environment variables.

This solution has been running stably in the HagiCode project for some time, and I hope it offers useful reference points for developers with similar needs.

Thank you for reading. If you found this article useful, you are welcome to like, bookmark, and share it. This content was created with AI-assisted collaboration, and the final content was reviewed and confirmed by the author.

Technical Analysis of the HagiCode Soul Platform: The Evolution from Emerging Needs to an Independent Platform


Writing technical articles is not really such a grand thing. It is mostly just a matter of organizing the pitfalls you have run into and the detours you have taken. We have all been inexperienced before, after all. This article takes an in-depth look at the design philosophy, architectural evolution, and core technical implementation of Soul in the HagiCode project, and explores how an independent platform can provide a more focused experience for creating and sharing Agent personas.

In the practice of building AI Agents, we often run into a question that looks simple but is actually crucial: how do we give different Agents stable and distinctive language styles and personality traits?

It is a slightly frustrating question, honestly. In the early Hero system of HagiCode, different Heroes (Agent instances) were mainly distinguished through profession settings and generic prompts. That approach came with some fairly obvious pain points, and anyone who has tried something similar has probably felt the same.

First, language style was difficult to keep consistent. The same “developer engineer” role might sound professional and rigorous one day, then casual and loose the next. This was not a model problem so much as the absence of an independent personality configuration layer to constrain and guide the output style.

Second, the sense of character was generally weak. When we described an Agent’s traits, we often had to rely on vague adjectives like “friendly,” “professional,” or “humorous,” without concrete language rules to support those abstract descriptions. Put plainly, it sounded nice in theory, but there was little to hold onto in practice.

Third, persona configurations were almost impossible to reuse. Suppose we carefully designed the speaking style of a “catgirl waitress” and wanted to reuse that expression style in another business scenario. In practice, we would almost have to configure it again from scratch. Sometimes you do not want to possess something beautiful, only reuse it a little… and even that turns out to be hard.

To solve those real problems, we introduced the Soul mechanism: an independent language style configuration layer separate from equipment and descriptions. Soul can define an Agent’s speaking habits, tone preferences, and wording boundaries, can be shared and reused across multiple Heroes, and can also be injected into the system prompt automatically on the first Session call.
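In code terms, the injection step can be pictured like this; the names and shapes below are illustrative, not the actual HagiCode API:

```typescript
// Hypothetical sketch: a shared Soul is prepended to the system prompt once,
// on the first call of a Session, and reused by any Hero that references it.
type Soul = { soulId: string; snapshot: string };

function buildSystemPrompt(basePrompt: string, soul: Soul | null, isFirstCall: boolean): string {
  if (!soul || !isFirstCall) return basePrompt; // later calls keep the existing context
  return `${soul.snapshot}\n\n${basePrompt}`;
}
```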

Some people might say that this is just configuring a few prompts. But sometimes the real question is not whether something can be done; it is how to do it more elegantly. As Soul matured, we realized it had enough depth to develop independently. A dedicated Soul platform could let users focus on creating, sharing, and browsing interesting persona configurations without being distracted by the rest of the Hero system. That is how the standalone platform at soul.hagicode.com came into being.

HagiCode is an open-source AI coding assistant project built with a modern technology stack and aimed at giving developers a smooth intelligent programming experience. The Soul platform approach shared in this article comes from our own hands-on exploration while building HagiCode to solve the practical problem of Agent persona management. If you find the approach valuable, then it probably means we have accumulated a certain amount of engineering judgment in practice, and the HagiCode project itself may also be worth a closer look.

The Technical Architecture Evolution of the Soul Platform


The Soul platform did not appear all at once. It went through three clear stages. The story began abruptly and concluded naturally.

Phase 1: Soul Configuration Embedded in Hero


The earliest Soul implementation existed as a functional module inside the Hero workspace. We added an independent SOUL editing area to the Hero UI, supporting both preset application and text fine-tuning.

Preset application let users choose from classic persona templates such as “professional developer engineer” and “catgirl waitress.” Text fine-tuning let users personalize those presets further. On the backend, the Hero entity gained a Soul field, with SoulCatalogId used to identify its source.

This stage solved the question of whether the capability existed at all, and it grew forward somewhat awkwardly, like anything young does. But as Soul content became richer, the limitations of an architecture tightly coupled with the Hero system started to show.

Phase 2: The In-Site SOUL Marketplace

To provide a better Soul discovery and reuse experience, we built a SOUL Marketplace catalog page with support for browsing, searching, viewing details, and favoriting.

At this stage, we introduced a combinatorial design built from 50 main Catalogs (base roles) and 10 orthogonal rules (expression styles). The main Catalogs defined the Agent’s core persona, with abstract character settings such as “Mistport Traveler” and “Night Hunter.” The orthogonal rules defined how the Agent expressed itself, with language style traits such as “Concise & Professional” and “Verbose & Friendly.”

50 x 10 = 500 possible combinations gave users a wide configuration space for personas. It is not an overwhelming number, but it is not small either. There are many roads to Rome, after all; some are simply easier to walk than others. On the backend, the full SOUL catalog was generated through catalog-sources.json, while the frontend presented those catalog entries as an interactive card list.

The in-site Marketplace was a good transitional solution, but only that: transitional. It was still attached to the main system, and for users who only wanted Soul functionality, the access path remained too deep. Not everyone wants to take the scenic route just to do something simple.

Phase 3: Splitting into an Independent Platform


In the end, we decided to move Soul into an independent repository (repos/soul). The Marketplace in the original main system was changed into an external jump guide, while the new platform adopted a Builder-first design philosophy: the homepage is the creation workspace by default, so users can start building their own persona configuration the moment they open the site.

The technology stack was also comprehensively upgraded in this stage: Vite 8 + React 19 + TypeScript 5.9, a unified design language through the shadcn/ui component system, and Tailwind CSS 4 theme variables. The improvement in frontend engineering laid a solid foundation for future feature iteration.

Everything faded away… no, actually, everything was only just beginning.

One core design principle of the Soul platform is local-first. That means the homepage must remain fully functional without a backend, and failure to load remote materials must never block page entry.

There is nothing especially miraculous about that. It simply means thinking one step further when designing the system. Using a local snapshot as the baseline and remote data as enhancement lets the product remain basically usable under any network condition. Concretely, we implemented a two-layer material architecture:

export async function loadBuilderMaterials(): Promise<BuilderMaterials> {
  const localMaterials = createLocalMaterials(snapshot) // local baseline
  try {
    const inspirationFragments = await fetchMarketplaceItems() // remote enhancement
    return { ...localMaterials, inspirationFragments, remoteState: "ready" }
  } catch (error) {
    return { ...localMaterials, remoteState: "fallback" } // graceful degradation
  }
}

Local materials come from build-time snapshots of the main system documentation and include the complete data for 50 base roles and 10 expression rules. Remote materials come from Souls published by users and fetched through the Marketplace API. Together, they give users a full spectrum of materials, from official templates to community creativity. If that sounds dramatic, it really is just local plus remote.

The core data abstraction of Soul is the SoulFragment:

export type SoulFragment = {
  fragmentId: string
  group: "main-catalog" | "expression-rule" | "published-soul"
  title: string
  summary: string
  content: string
  keywords: string[]
  localized?: Partial<Record<AppLocale, LocalizedFragmentContent>>
  sourceRef: SoulFragmentSourceRef
  meta: SoulFragmentMeta
}

The group field distinguishes fragment types: the main catalog defines the character core, orthogonal rules define expression style, and user-published Souls are marked as published-soul. The localized field supports multilingual presentation, allowing the same fragment to display different titles and descriptions in different language environments. Internationalization is something you really want to think about early, and in this case we actually did.

The Builder draft state encapsulates the user’s current editing state:

export type SoulBuilderDraft = {
  draftId: string
  name: string
  selectedMainFragmentId: string | null
  selectedRuleFragmentId: string | null
  inspirationSoulId: string | null
  mainSlotText: string
  ruleSlotText: string
  customPrompt: string
  previewText: string
  updatedAt: string
}

Each fragment selected in the editor has its content concatenated into the corresponding slot, forming the final preview text. mainSlotText corresponds to the main role content, ruleSlotText corresponds to the expression rule content, and customPrompt is the user’s additional instruction text.

Preview compilation is the core capability of Soul Builder. It assembles user-selected fragments and custom text into a system prompt that can be copied directly:

export function compilePreview(
  draft: Pick<SoulBuilderDraft, "mainSlotText" | "ruleSlotText" | "customPrompt">,
  fragments: {
    mainFragment: SoulFragment | null
    ruleFragment: SoulFragment | null
    inspirationFragment: SoulFragment | null
  }
): PreviewCompilation {
  // Assembly logic: main role + expression rule + inspiration reference + custom content
}

The compilation result is shown in the central preview panel, where users can see the final effect in real time and copy it to the clipboard with one click. It sounds simple, and it is. But simple things are often the most useful.
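The assembly body is elided above; as a rough idea of what the concatenation could look like, here is a hypothetical sketch, not the platform's implementation:

```typescript
// Hypothetical assembly: join the non-empty slots in a fixed order, separated by
// blank lines, so the preview reads main role → expression rule → custom text.
type PreviewDraft = { mainSlotText: string; ruleSlotText: string; customPrompt: string };

function compilePreviewSketch(draft: PreviewDraft): string {
  return [draft.mainSlotText, draft.ruleSlotText, draft.customPrompt]
    .map((section) => section.trim())
    .filter((section) => section.length > 0)
    .join("\n\n");
}
```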

Frontend state management in Soul Builder follows one important principle: clear separation of state boundaries. More specifically, drawer state is not persisted and does not write directly into the draft. Only explicit Builder actions trigger meaningful state changes.

// Domain state (useSoulBuilder)
export function useSoulBuilder() {
  // Material loading and caching
  // Slot aggregation and preview compilation
  // Copy actions and feedback messages
  // Locale-safe descriptors
}

// Presentation state (useHomeEditorState)
export function useHomeEditorState() {
  // activeSlot, drawerSide, drawerOpen
  // default focus behavior
}

That separation ensures both edit-state safety and responsive UI behavior. Opening and closing the drawer is purely a UI interaction and should not trigger complicated persistence logic. It may sound obvious, but it matters: UI state and business state should be separated clearly so interface interactions do not pollute the core data model.

Soul Builder uses a single-drawer mode: only one slot drawer may be open at a time. Clicking the mask, pressing the ESC key, or switching slots automatically closes the current drawer. This simplifies state management and also matches common drawer interaction patterns on mobile.

Closing the drawer does not clear the current editing content, so when users come back, their context is preserved. This kind of “lightweight” drawer design avoids interrupting the user’s flow. Nobody wants carefully written content to disappear because of one accidental click.
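The single-drawer rule reduces to a tiny state transition; a minimal sketch, with illustrative names:

```typescript
// Only one slot drawer can be open; opening another slot implicitly closes the
// current one, and closing never clears any draft text (drafts live elsewhere).
type SlotId = "main" | "rule";
type DrawerState = { activeSlot: SlotId | null };

function toggleSlot(state: DrawerState, slot: SlotId): DrawerState {
  return { activeSlot: state.activeSlot === slot ? null : slot };
}

function closeDrawer(): DrawerState {
  return { activeSlot: null }; // mask click or ESC
}
```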

Internationalization is an important capability of the Soul platform. System copy fully supports bilingual switching, while user draft text is never rewritten when the language changes, because draft text is user-authored free input rather than system-translated content.

Official inspiration cards (Marketplace Souls) keep the upstream display name while also providing a best-effort English summary. For Souls with Chinese names, we generate English versions through predefined mapping rules:

// English name mapping for main roles
const mainNameEnglishMap = {
"雾港旅人": "Mistport Traveler",
"夜航猎手": "Night Hunter",
// ...
}
// English name mapping for orthogonal rules
const ruleNameEnglishMap = {
"简洁干练": "Concise & Professional",
"啰嗦亲切": "Verbose & Friendly",
// ...
}

The mapping table itself looks simple enough, but keeping it in good shape still takes care. There are 50 main roles and 10 orthogonal rules, which means 500 combinations in total. That is not huge, but it is enough to deserve respect.

Bulk generation of the Soul Catalog happens on the backend, where C# is used to automate the creation of 50 x 10 = 500 combinations:

foreach (var main in source.MainCatalogs)
{
    foreach (var orthogonal in source.OrthogonalCatalogs)
    {
        var catalogId = $"soul-{main.Index:00}-{orthogonal.Index:00}";
        var displayName = BuildNickname(main, orthogonal);
        var soulSnapshot = BuildSoulSnapshot(main, orthogonal);
        // Write to the database...
    }
}

The nickname generation algorithm combines the main role name with the expression rule name to create imaginative Agent codenames:

private static readonly string[] MainHandleRoots = [
    "雾港", "夜航", "零帧", "星渊", "霓虹", "断云", ...
];
private static readonly string[] OrthogonalHandleSuffixes = [
    "旅人", "猎手", "术师", "行者", "星使", ...
];
// Combination examples: 雾港旅人 (Mistport Traveler), 夜航猎手 (Night Hunter), 零帧术师...
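BuildNickname itself is not shown above; rendered as a hypothetical TypeScript sketch of the idea (using the two combinations the platform's own mapping confirms, Mistport Traveler and Night Hunter):

```typescript
// Hypothetical sketch of nickname generation: main-role root + rule suffix.
// Per the platform's mapping, 雾港旅人 = "Mistport Traveler" and 夜航猎手 = "Night Hunter".
const mainHandleRoots = ["雾港", "夜航", "零帧"];
const orthogonalHandleSuffixes = ["旅人", "猎手", "术师"];

function buildNickname(mainIndex: number, ruleIndex: number): string {
  return `${mainHandleRoots[mainIndex]}${orthogonalHandleSuffixes[ruleIndex]}`;
}
```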

Soul snapshot assembly follows a fixed template format that combines the main role core, signature traits, expression rule core, and output constraints together:

private static string BuildSoulSnapshot(main, orthogonal) => string.Join('\n', [
    $"你的人设内核来自「{main.Name}」:{main.Core}",               // "Your persona core comes from {Name}: {Core}"
    $"保持以下标志性语言特征:{main.Signature}",                   // "Keep these signature language traits: {Signature}"
    $"你的表达规则来自「{orthogonal.Name}」:{orthogonal.Core}",    // "Your expression rules come from {Name}: {Core}"
    $"必须遵循这些输出约束:{orthogonal.Signature}"                // "You must follow these output constraints: {Signature}"
]);

Template assembly may sound terribly dull, but without that sort of dull work, interesting products rarely appear.

After splitting Soul from the main system into an independent platform, one important challenge was handling existing user data. It is a familiar problem: splitting things apart is easy, migration is not. We adopted three safeguards:

Backward compatibility protection. Previously saved Hero SOUL snapshots remain visible, and historical snapshots can still be previewed even if they no longer have a Marketplace source ID. In other words, none of the user’s prior configurations are lost; only where they appear has changed.

Main system API deprecation. The in-site Marketplace API returns HTTP status 410 Gone together with a migration notice that guides users to soul.hagicode.com.
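Sketched as a response shape (the error and payload fields below are assumptions, not the actual API):

```typescript
// Hypothetical 410 response for the deprecated in-site Marketplace endpoints:
// 410 Gone signals intentional, permanent removal (unlike 404), and the body
// carries the migration target so clients can surface it to users.
type GoneResponse = { status: number; body: { error: string; migrateTo: string } };

function marketplaceGone(): GoneResponse {
  return {
    status: 410,
    body: { error: "soul_marketplace_moved", migrateTo: "https://soul.hagicode.com" },
  };
}
```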

Hero SOUL form refactoring. A migration notice block was added to the Hero Soul editing area to clearly tell users that the Soul platform is now independent and to provide a one-click jump button:

HeroSoulForm.tsx
<div className="rounded-2xl border border-orange-200/70 bg-orange-50/80 p-4">
  <div>{t('hero.soul.migrationTitle')}</div>
  <p>{t('hero.soul.migrationDescription')}</p>
  <Button onClick={onOpenSoulPlatform}>
    {t('hero.soul.openSoulPlatformAction')}
  </Button>
</div>

Looking back at the development of the Soul platform as a whole, there are a few practical lessons worth sharing. They are not grand principles, just things learned from real mistakes.

Local-first runtime assumptions. When designing features that depend on remote data, always assume the network may be unavailable. Using local snapshots as the baseline and remote data as enhancement ensures the product remains basically usable under any network condition.

Clear separation of state boundaries. UI state and business state should be distinguished clearly so interface interactions do not pollute the core data model. Drawer toggles are purely UI state and should not be mixed with draft persistence.

Design for internationalization early. If your product has multilingual requirements, it is best to think about them during the data model design phase. The localized field adds some structural complexity, but it greatly reduces the long-term maintenance cost of multilingual content.

Automate the material synchronization workflow. Local materials for the Soul platform come from the main system documentation. When upstream documentation changes, there needs to be a mechanism to sync it into frontend snapshots. We designed the npm run materials:sync script to automate that process and keep materials aligned with upstream.

Based on the current architecture, the Soul platform could move in several directions in the future. These are only tentative ideas, but perhaps they can be useful as a starting point.

Community sharing ecosystem. Support user uploads and sharing of custom Souls, with rating, commenting, and recommendation mechanisms so excellent Soul configurations can be discovered and reused by more people.

Multimodal expansion. Beyond text style, the platform could also support dimensions such as voice style configuration, emoji usage preferences, and code style and formatting rules. It sounds attractive in theory; implementation may tell a more complicated story.

Intelligent assistance. Automatically recommend Souls based on usage scenarios, support style transfer and fusion, and even run A/B tests on the real-world effectiveness of different Souls. There is no better way to know than to try.

Cross-platform synchronization. Support importing persona configurations from other AI platforms, provide a standardized Soul export format, and integrate with mainstream Agent frameworks.

This article shares the full evolution of the HagiCode Soul platform from its earliest emerging need to an independent platform. We discussed why a Soul mechanism is needed to solve Agent persona consistency, analyzed the three stages of architectural evolution (embedded configuration, in-site Marketplace, and independent platform), examined the core data model, state management, preview compilation, and internationalization design in depth, and summarized practical migration lessons.

The essence of Soul is an independent persona configuration layer separated from business logic. It makes the language style of AI Agents definable, reusable, and shareable. From a technical perspective, the design itself is not especially complicated, but the problem it solves is real and broadly relevant.

If you are also building AI Agent products, it may be worth asking whether your persona configuration solution is flexible enough. The Soul platform’s practical experience may offer a few useful ideas.

Perhaps one day you will run into a similar problem as well. If this article can help a little when that happens, that is probably enough.


If you found this article helpful, feel free to give the project a Star on GitHub. The public beta has already started, and you are welcome to install it and try it out.

Thank you for reading. If you found this article useful, likes, bookmarks, and shares are all appreciated. This content was created with AI-assisted collaboration, and the final content was reviewed and confirmed by the author.

Technical Analysis of the HagiCode Skill System: Building a Scalable AI Skill Management Platform


Section titled “Technical Analysis of the HagiCode Skill System: Building a Scalable AI Skill Management Platform”

This article takes an in-depth look at the architecture and implementation of the Skill management system in the HagiCode project, covering the technical details behind four core capabilities: local global management, marketplace search, intelligent recommendations, and trusted provider management.

In the field of AI coding assistants, how to extend the boundaries of AI capabilities has always been a core question. Claude Code itself is already strong at code assistance, but different development teams and different technology stacks often need specialized capabilities for specific scenarios, such as handling Docker deployments, database optimization, or frontend component generation. That is exactly where a Skill system becomes especially important.

During the development of the HagiCode project, we ran into a similar challenge: how do we let Claude Code “learn” new professional skills like a person would, while still maintaining a solid user experience and good engineering maintainability? This problem is both hard and simple in its own way. Around that question, we designed and implemented a complete Skill management system.

This article walks through the technical architecture and core implementation of the system in detail. It is intended for developers interested in AI extensibility and command-line tool integration. It might be useful to you, or it might not, but at least it is written down now.

The approach shared in this article comes from our practical experience in the HagiCode project. HagiCode is an open-source AI coding assistant project designed to help development teams improve engineering efficiency. The project’s stack includes ASP.NET Core, the Orleans distributed framework, a TanStack Start + React frontend, and the Skill management subsystem introduced in this article.

The GitHub repository is HagiCode-org/site. If you find the technical approach in this article valuable, feel free to give it a Star. More Stars tend to improve the mood, after all.

The Skill system uses a frontend-backend separated architecture. There is nothing especially mysterious about that.

The frontend uses TanStack Start + React to build the user interface, with Redux Toolkit managing state. The four main capabilities map directly to four Tab components: Local Skills, Skill Gallery, Intelligent Recommendations, and Trusted Providers. In the end, the design is mostly about making the user experience better.

The backend is based on ASP.NET Core + ABP Framework, using Orleans Grains for distributed state management. The online API client wraps the IOnlineApiClient interface to communicate with the remote skill catalog service.

The overall architectural principle is to separate command execution from business logic. Through the adapter pattern, the implementation details of npm/npx command execution are hidden inside independent modules. After all, nobody really wants command-line calls scattered all over the codebase.

Core Capability 1: Local Global Management

Section titled “Core Capability 1: Local Global Management”

Local global management is the most basic module. It is responsible for listing installed skills and supporting uninstall operations. There is nothing overly complicated here; it is mostly about doing the basics well.

The implementation lives in LocalSkillsTab.tsx and LocalSkillCommandAdapter.cs. The core idea is to wrap the npx skills command, parse its JSON output, and convert it into internal data structures. It sounds simple, and in practice it mostly is.

public async Task<IReadOnlyList<LocalSkillInventoryResponseDto>> GetLocalSkillsAsync(
    CancellationToken cancellationToken = default)
{
    var result = await _commandAdapter.ListGlobalSkillsAsync(cancellationToken);
    return result.Skills.Select(skill => new LocalSkillInventoryResponseDto
    {
        Name = skill.Name,
        Version = skill.Version,
        Source = skill.Source,
        InstalledPath = skill.InstalledPath,
        Description = skill.Description
    }).ToList();
}

The data flow is very clear: the frontend sends a request -> SkillGalleryAppService receives it -> LocalSkillCommandAdapter executes the npx command -> the JSON result is parsed -> a DTO is returned. Each step follows naturally from the previous one.

Skill uninstallation uses the npx skills remove -g <skillName> -y command, and the system automatically handles dependencies and cleanup. Installation metadata is stored in managed-install.json inside the skill directory, recording information such as install time and source version for later updates and auditing. Some things are simply worth recording.

Skill installation requires several coordinated steps. In truth, it is not especially complicated:

public async Task<SkillInstallResultDto> InstallAsync(
    SkillInstallRequestDto request,
    CancellationToken cancellationToken = default)
{
    // 1. Normalize the installation reference
    var normalized = _referenceNormalizer.Normalize(
        request.SkillId,
        request.Source,
        request.SkillSlug,
        request.Version);

    // 2. Check prerequisites
    await _prerequisiteChecker.CheckAsync(cancellationToken);

    // 3. Acquire installation lock
    using var installLock = await _lockProvider.AcquireAsync(normalized.SkillId);

    // 4. Execute installation command
    var result = await _installCommandRunner.ExecuteAsync(
        new SkillInstallCommandExecutionRequest
        {
            Command = $"npx skills add {normalized.FullReference} -g -y",
            Timeout = TimeSpan.FromMinutes(4)
        },
        cancellationToken);

    // 5. Persist installation metadata
    await _metadataStore.WriteAsync(normalized.SkillPath, request);

    return new SkillInstallResultDto { Success = result.Success };
}

Several key design patterns are used here: the reference normalizer converts different input formats, such as tanweai/pua and @opencode/docker-skill, into a unified internal representation; the installation lock mechanism ensures only one installation operation can run for the same skill at a time; and streaming output pushes installation progress to the frontend in real time through Server-Sent Events, so users can watch terminal-like logs as they happen.

In the end, all of these patterns are there for one purpose: to keep the system simpler to use and maintain.
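To make the normalization idea concrete, here is a minimal TypeScript sketch. The real normalizer is a C# service; `normalizeSkillReference` and the `NormalizedSkillReference` shape below are hypothetical names invented for this illustration, not the actual API:

```typescript
// Hypothetical sketch: fold references like "tanweai/pua" and
// "@opencode/docker-skill" into one internal representation.
interface NormalizedSkillReference {
  owner: string;
  slug: string;
  fullReference: string;
}

function normalizeSkillReference(raw: string): NormalizedSkillReference {
  // Drop a leading npm-style scope marker, then split owner/slug.
  const cleaned = raw.startsWith("@") ? raw.slice(1) : raw;
  const [owner, slug] = cleaned.split("/");
  if (!owner || !slug) {
    throw new Error(`Unrecognized skill reference: ${raw}`);
  }
  return { owner, slug, fullReference: `${owner}/${slug}` };
}
```

Whatever the exact rules are, the point is that everything downstream (the lock key, the install command, the metadata path) works against one canonical reference rather than the user's raw input.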

Marketplace search lets users discover and install skills from the community. One person’s ability is always limited; collective knowledge goes much further.

The search feature relies on the online API https://api.hagicode.com/v1/skills/search. To improve response speed, the system implements caching. A cache is a bit like memory: if you keep useful things around, you do not have to think so hard the next time.

private async Task<IReadOnlyList<SkillGallerySkillDto>> SearchCatalogAsync(
    string query,
    CancellationToken cancellationToken,
    IReadOnlySet<string>? allowedSources = null)
{
    var cacheKey = $"skill_search:{query}:{string.Join(",", allowedSources ?? Array.Empty<string>())}";
    if (_memoryCache.TryGetValue(cacheKey, out var cached))
        return (IReadOnlyList<SkillGallerySkillDto>)cached!;

    var response = await _onlineApiClient.SearchAsync(
        new SearchSkillsRequest
        {
            Query = query,
            Limit = _options.LimitPerQuery,
        },
        cancellationToken);

    var results = response.Skills
        .Where(skill => allowedSources is null || allowedSources.Contains(skill.Source))
        .Select(skill => new SkillGallerySkillDto { ... })
        .ToList();

    _memoryCache.Set(cacheKey, results, TimeSpan.FromMinutes(10));
    return results;
}

Search results support filtering by trusted sources, so users only see skill sources they trust. Seed queries such as popular and recent are used to initialize the catalog, allowing users to see recommended popular skills the first time they open it. First impressions still matter.

Core Capability 3: Intelligent Recommendations

Section titled “Core Capability 3: Intelligent Recommendations”

Intelligent recommendations are the most complex part of the system. They can automatically recommend the most suitable skills based on the current project context. Complex as it is, it is still worth building.

The full recommendation flow is divided into five stages:

1. Build project context
2. AI generates search queries
3. Search the online catalog in parallel
4. AI ranks the candidates
5. Return the recommendation list

First, the system analyzes characteristics such as the project’s technology stack, programming languages, and domain structure to build a “project profile.” That profile is a bit like a resume, recording the key traits of the project.

Then an AI Grain is used to generate targeted search queries. This design is actually quite interesting: instead of directly asking the AI, “What skills should I recommend?”, we first ask it to think about “What search terms are likely to find relevant skills?” Sometimes the way you ask the question matters more than the answer itself:

var queryGeneration = await aiGrain.GenerateSkillRecommendationQueriesAsync(
    projectContext,        // Project context
    locale,                // User language preference
    maxQueries,            // Maximum number of queries
    effectiveSearchHero);  // AI model selection

Next, those search queries are executed in parallel to gather a candidate skill list. Parallel processing is, at the end of the day, just a way to save time.

Finally, another AI Grain ranks the candidate skills. This step considers factors such as skill relevance to the project, trust status, and user historical preferences:

var ranking = await aiGrain.RankSkillRecommendationsAsync(
    projectContext,
    candidates,
    installedSkillNames,
    locale,
    maxRecommendations,
    effectiveRankingHero);

response.Items = MergeRecommendations(projectContext, candidates, ranking, maxRecommendations);

AI models can respond slowly or become temporarily unavailable. Even the best systems stumble sometimes. For that reason, the system includes a deterministic fallback mechanism: when the AI service is unavailable, it uses a rule-based heuristic algorithm to generate recommendations, such as inferring likely required skills from dependencies in package.json.

Put plainly, this fallback mechanism is simply a backup plan for the system.
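As a rough illustration of what such a rule-based heuristic could look like, here is a TypeScript sketch. The mapping table, skill names, and `fallbackRecommendations` function are invented for this example, not taken from the actual codebase:

```typescript
// Hypothetical fallback: map well-known package.json dependencies
// to likely skill recommendations when the AI service is down.
const DEPENDENCY_SKILL_HINTS: Record<string, string> = {
  react: "frontend-component-generation",
  dockerode: "docker-deployment",
  pg: "database-optimization",
};

function fallbackRecommendations(dependencies: string[]): string[] {
  // Use a Set so the same skill is not recommended twice.
  const seen = new Set<string>();
  for (const dep of dependencies) {
    const hint = DEPENDENCY_SKILL_HINTS[dep];
    if (hint) seen.add(hint);
  }
  return [...seen];
}
```

A table like this is far less clever than the AI ranking path, but it is deterministic, instant, and never unavailable, which is exactly what you want from a backup plan.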

Core Capability 4: Trusted Provider Management

Section titled “Core Capability 4: Trusted Provider Management”

Trusted provider management allows users to control which skill sources are considered trustworthy. Trust is still something users should be able to define for themselves.

Trusted providers support two matching rules: exact match (exact) and prefix match (prefix).

public static TrustedSkillProviderResolutionSnapshot Resolve(
    TrustedSkillProviderSnapshot snapshot,
    string source)
{
    var normalizedSource = Normalize(source);
    foreach (var entry in snapshot.Entries.OrderBy(e => e.SortOrder))
    {
        if (!entry.IsEnabled) continue;
        foreach (var rule in entry.MatchRules)
        {
            bool isMatch = rule.MatchType switch
            {
                TrustedSkillProviderMatchRuleType.Exact
                    => string.Equals(normalizedSource, Normalize(rule.Value),
                        StringComparison.OrdinalIgnoreCase),
                TrustedSkillProviderMatchRuleType.Prefix
                    => normalizedSource.StartsWith(Normalize(rule.Value) + "/",
                        StringComparison.OrdinalIgnoreCase),
                _ => false
            };
            if (isMatch)
                return new TrustedSkillProviderResolutionSnapshot
                {
                    IsTrustedSource = true,
                    ProviderId = entry.ProviderId,
                    DisplayName = entry.DisplayName
                };
        }
    }
    return new TrustedSkillProviderResolutionSnapshot { IsTrustedSource = false };
}

Built-in trusted providers include well-known organizations and projects such as Vercel, Azure, anthropics, Microsoft, and browser-use. Custom providers can be added through configuration files by specifying a provider ID, display name, badge label, matching rules, and more. The world is large enough that only trusting a few built-ins would never be enough.

Trusted configuration is persisted using an Orleans Grain:

public class TrustedSkillProviderGrain : Grain<TrustedSkillProviderState>,
    ITrustedSkillProviderGrain
{
    public async Task UpdateConfigurationAsync(TrustedSkillProviderSnapshot snapshot)
    {
        State.Snapshot = snapshot;
        await WriteStateAsync();
    }

    public Task<TrustedSkillProviderSnapshot> GetConfigurationAsync()
    {
        return Task.FromResult(State.Snapshot);
    }
}

The benefit of this approach is that configuration changes are automatically synchronized across all nodes, without any need to refresh caches manually. Automation is, ultimately, about letting people worry less.

The Skill system needs to execute various npx commands. If that logic were scattered everywhere, the code would quickly become difficult to maintain. That is why we designed an adapter interface. Design patterns, in the end, exist to make code easier to maintain:

public interface ISkillInstallCommandRunner
{
    Task<SkillInstallCommandExecutionResult> ExecuteAsync(
        SkillInstallCommandExecutionRequest request,
        CancellationToken cancellationToken = default);
}

Different commands have different executor implementations, but all of them implement the same interface, making testing and replacement straightforward.

Installation progress is pushed to the frontend in real time through Server-Sent Events:

public async Task InstallWithProgressAsync(
    SkillInstallRequestDto request,
    IServerStreamWriter<SkillInstallProgressEventDto> stream,
    CancellationToken cancellationToken)
{
    var process = new Process
    {
        StartInfo = new ProcessStartInfo
        {
            FileName = "npx",
            Arguments = $"skills add {request.FullReference} -g -y",
            RedirectStandardOutput = true,
            RedirectStandardError = true,
            UseShellExecute = false
        }
    };
    process.OutputDataReceived += async (sender, e) =>
    {
        await stream.WriteAsync(new SkillInstallProgressEventDto
        {
            EventType = "output",
            Data = e.Data ?? string.Empty
        });
    };
    process.Start();
    process.BeginOutputReadLine();
    // Drain stderr too: with RedirectStandardError enabled, an unread
    // stderr buffer can fill up and block the child process.
    process.BeginErrorReadLine();
    await process.WaitForExitAsync(cancellationToken);
}

On the frontend, users can see terminal-like output in real time, which makes the experience very intuitive. Real-time feedback helps people feel at ease.

Take installing the pua skill as an example (it is a popular community skill):

  1. Open the Skills drawer and switch to the Skill Gallery tab
  2. Enter pua in the search box
  3. Click the search result to view the skill details
  4. Click the Install button
  5. Switch to the Local Skills tab to confirm the installation succeeded

The installation command is npx skills add tanweai/pua -g -y, and the system handles all the details automatically. There are not really that many steps once you take them one by one.

If your team has its own skill repository, you can add it as a trusted source:

providerId: "my-team"
displayName: "My Team Skills"
badgeLabel: "MyTeam"
isEnabled: true
sortOrder: 100
matchRules:
  - matchType: "prefix"
    value: "my-team/"
  - matchType: "exact"
    value: "my-team/special-skill"

This way, all skills from your team will display a trusted badge, making users more comfortable installing them. Labels and signals do help people feel more confident.

Creating a custom skill requires the following structure:

my-skill/
├── SKILL.md # Skill metadata (YAML front matter)
├── index.ts # Skill entry point
├── agents/ # Supported agent configuration
└── references/ # Reference resources

An example SKILL.md format:

---
name: my-skill
description: A brief description of what this skill does
---
# My Skill
Detailed documentation...
A few operational notes to keep in mind:

  1. Network requirements: skill search and installation require access to api.hagicode.com and the npm registry
  2. Node.js version: Node.js 18 or later is recommended
  3. Permission requirements: global npm installation permissions are required
  4. Concurrency control: only one install or uninstall operation can run for the same skill at a time
  5. Timeout settings: the default timeout for installation is 4 minutes, but complex scenarios may require adjustment

These notes exist, ultimately, to help things go smoothly.

This article introduced the complete implementation of the Skill management system in the HagiCode project. Through a frontend-backend separated architecture, the adapter pattern, Orleans-based distributed state management, and related techniques, the system delivers:

  • Local global management: a unified skill management interface built by wrapping npx skills commands
  • Marketplace search: rapid discovery of community skills through the online API and caching mechanisms
  • Intelligent recommendations: AI-powered skill recommendations based on project context
  • Trust management: a flexible configuration system that lets users control trust boundaries

This design approach is not only applicable to Skill management. It is also useful as a reference for any scenario that needs to integrate command-line tools while balancing local storage and online services.

If this article helped you, feel free to give us a Star on GitHub: github.com/HagiCode-org/site. You can also visit the official site to learn more: hagicode.com.

You may think this system is well designed, or you may not. Either way, that is fine. Once code is written, someone will use it, and someone will not.

Thank you for reading. If you found this article useful, feel free to like, bookmark, and share it. This content was produced with AI-assisted collaboration, and the final content was reviewed and approved by the author.

I Might Be Replaced by an Agent, So I Ran the Numbers


Section titled “I Might Be Replaced by an Agent, So I Ran the Numbers”

Quantifying AI replacement risk with data: a deep dive into how the HagiCode team uses six core formulas to redefine how knowledge workers evaluate their competitiveness.

With AI technology advancing at breakneck speed, every knowledge worker is facing an urgent question: In the AI era, will I be replaced?

It sounds a little alarmist, but plenty of people are quietly uneasy about it. You just finish learning a new framework, and AI is already telling you your role might be automated away; you finally master a language, and then discover that someone using AI is producing three times as much as you. If you are reading this, you have probably felt at least some of that anxiety.

And honestly, that anxiety is not irrational. No one wants to admit that the skills they spent years building could be outperformed by a single ChatGPT session. Still, anxiety is one thing; life goes on.

Traditional discussions usually start from the question of “what AI can do,” but that framing misses two critical dimensions:

  1. The business perspective: whether a company is willing to equip an employee with AI tools depends on whether AI costs make economic sense relative to labor costs. It is not enough for AI to be capable of replacing a role; the company also has to run the numbers. Capital is not a charity, and every dollar has to count.
  2. The efficiency perspective: AI-driven productivity gains need to be quantified instead of being reduced to the vague claim that “using AI makes you stronger.” Maybe your efficiency doubles with AI, but someone else gets a 5x improvement. That gap matters. It is like school: everyone sits in the same class, but some score 90 while others barely pass.

So the real question is: how do we turn this fuzzy anxiety into measurable indicators?

It is always better to know where you stand than to fumble around in the dark. That is what we are talking about today: the design logic behind the AI productivity calculator built by the HagiCode team.

So I made a site: https://cost.hagicode.com.

HagiCode is an open-source AI coding assistant project built to help developers code more efficiently.

What is interesting is that while building their own product, the HagiCode team accumulated a lot of hands-on experience around AI productivity. They realized that the value of an AI tool cannot be assessed in isolation from a company’s employment costs. Based on that insight, the team decided to build a productivity calculator to help knowledge workers evaluate their competitiveness in the AI era more scientifically.

Plenty of people could build something like this. The difference is that very few are willing to do it seriously. The HagiCode team spent time on it as a way of giving something back to the developer community.

The design shared in this article is a summary of HagiCode’s experience applying AI in real engineering work. If you find this evaluation framework valuable, it suggests that HagiCode really does have something to offer in engineering practice. In that case, the HagiCode project itself is also worth paying attention to.

A company’s real cost for an employee is far more than salary alone. A lot of people only realize this when changing jobs: you negotiate a monthly salary of 20,000 CNY, but take home only 14,000. On the company side, the spend is not just 20,000 either. Social insurance, housing fund contributions, training, and recruiting costs all have to be included.

According to the implementation in calculate-ai-risk.ts:

Total annual employment cost = Annual salary x (1 + city coefficient) + Annual salary / 12

The city coefficient reflects differences in hiring and retention costs across cities:

| City tier  | Representative cities                     | Coefficient |
| ---------- | ----------------------------------------- | ----------- |
| Tier 1     | Beijing / Shanghai / Shenzhen / Guangzhou | 0.4         |
| New Tier 1 | Hangzhou / Chengdu / Suzhou / Nanjing     | 0.3         |
| Tier 2     | Wuhan / Xi’an / Tianjin / Zhengzhou       | 0.2         |
| Other      | Yichang / Luoyang and others              | 0.1         |

A Tier 1 city coefficient of 0.4 means the company needs to pay roughly 40% extra in recruiting, training, insurance, and similar overhead. The all-in cost of hiring someone in Beijing really is much higher than in a Tier 2 city.

The cost of living in major cities is high too. You could think of it as another version of a “drifter tax.”
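In TypeScript, the cost formula and the coefficient table translate into something like this. This is a sketch; the real calculate-ai-risk.ts may structure it differently, and the tier key names (`tier1`, `newTier1`, and so on) are assumptions:

```typescript
// City coefficients from the table above: extra hiring/retention
// overhead as a fraction of annual salary.
const CITY_COEFFICIENTS: Record<string, number> = {
  tier1: 0.4,
  newTier1: 0.3,
  tier2: 0.2,
  other: 0.1,
};

// Total annual employment cost =
//   annual salary x (1 + city coefficient) + annual salary / 12
function totalAnnualEmploymentCost(annualSalaryCny: number, cityTier: string): number {
  const coefficient = CITY_COEFFICIENTS[cityTier] ?? 0.1;
  return annualSalaryCny * (1 + coefficient) + annualSalaryCny / 12;
}
```

For a 400,000 CNY salary in a Tier 1 city, this gives 400,000 x 1.4 + 33,333 ≈ 593,333 CNY per year.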

Different AI models have separate input and output pricing, and the gap can be huge. In coding scenarios, the input/output ratio is roughly 3:1. You might give the AI a block of code to review, while its analysis is usually much shorter than the input.

The blended unit price formula is:

Blended unit price = (input-output ratio x input price + output price) / (input-output ratio + 1)

Take GPT-5 as an example:

  • Input: $2.5/1M tokens
  • Output: $15/1M tokens
  • Blended = (3 x 2.5 + 15) / 4 = $5.625/1M tokens

For models priced in USD, you also need to convert using an exchange rate. The HagiCode team currently sets that rate to 7.25 and updates it as the market changes.

Exchange rates are like the stock market: no one can predict them exactly. You just follow the trend.

Average daily AI cost = Average daily token demand (M) x blended unit price (CNY/1M)
Annual AI cost = Average daily AI cost x 264 working days

264 = 22 days/month x 12 months, which is the number of working days in a standard year. Why not use 365? Because you have to account for weekends, holidays, sick leave, and so on.

We are not robots, after all. AI may not need rest, but people still need room to breathe.
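The blended-price and annual-cost formulas above can be sketched directly. Again this is an illustration, not the calculator's actual code; the 3:1 ratio, 7.25 exchange rate, and 264 working days are the values stated in this article:

```typescript
const WORKING_DAYS_PER_YEAR = 264; // 22 days/month x 12 months
const USD_TO_CNY = 7.25;           // rate currently used by the calculator

// Blended unit price = (ratio x input price + output price) / (ratio + 1)
function blendedPricePerMillionUsd(inputUsd: number, outputUsd: number, ioRatio = 3): number {
  return (ioRatio * inputUsd + outputUsd) / (ioRatio + 1);
}

// Annual AI cost = daily tokens (M) x blended price (CNY/1M) x working days
function annualAiCostCny(dailyTokensM: number, blendedUsd: number): number {
  const blendedCny = blendedUsd * USD_TO_CNY;
  return dailyTokensM * blendedCny * WORKING_DAYS_PER_YEAR;
}
```

Plugging in the GPT-5 example from above, `blendedPricePerMillionUsd(2.5, 15)` reproduces the $5.625/1M figure.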

4. The Core Innovation: Equivalent Headcount

Section titled “4. The Core Innovation: Equivalent Headcount”

This is the heart of the whole evaluation system, and also where the HagiCode team’s insight shows most clearly.

Affordable workflow count = Total annual employment cost / Annual AI cost
Affordability ratio = min(affordable workflow count, 1)
Equivalent headcount = 1 + (productivity multiplier - 1) x affordability ratio

That formula looks a little abstract, so let me unpack it.

The traditional view would simply say, “your efficiency improved by 2x.” But this formula introduces a crucial constraint: is the company’s AI budget sustainable?

For example, Xiao Ming improves his efficiency by 3x, but his annual AI usage costs 300,000 CNY while the company is only paying him a salary of 200,000 CNY. In that case, his personal productivity may be impressive, but it is not sustainable. No company is going to lose money just to keep him operating at peak efficiency.

That is what the affordability ratio means. If the company can only afford 0.5 of an AI workflow, then Xiao Ming’s equivalent headcount is 1 + (3 - 1) x 0.5 = 2 people, not 3.

The key insight: what matters is not just how large your productivity multiplier is, but whether the company can afford the AI investment required to sustain that multiplier.

The logic is simple once you see it. Most people just do not think from that angle. We are used to looking at the world from our own side, not from the boss’s side, where money does not come out of thin air either.
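The three formulas transcribe almost line for line into code. A minimal sketch:

```typescript
// Equivalent headcount: the productivity multiplier only counts to
// the extent the company can actually afford the AI spend behind it.
function equivalentHeadcount(
  employmentCostCny: number,
  annualAiCostCny: number,
  productivityMultiplier: number,
): number {
  const affordableWorkflows = employmentCostCny / annualAiCostCny;
  const affordabilityRatio = Math.min(affordableWorkflows, 1);
  return 1 + (productivityMultiplier - 1) * affordabilityRatio;
}
```

With Xiao Ming's numbers (a 3x multiplier but AI costs double his employment cost would support), the cap bites and the result drops below 3.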

AI cost ratio = Annual AI cost / Total annual employment cost
Productivity gain = Productivity multiplier - 1
Cost-benefit ratio = Productivity gain / AI cost ratio
  • Cost-benefit ratio < 1: the AI investment is not worth it; the productivity gain does not justify the cost
  • Cost-benefit ratio 1-2: barely worth it
  • Cost-benefit ratio > 2: high return, strongly recommended

This metric is especially useful for managers because it helps them quickly judge whether a given role is worth equipping with AI tools.

At the end of the day, ROI is what matters. You can talk about higher efficiency all you want, but if the cost explodes, no one is going to buy the argument.
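A sketch of the ratio and its interpretation bands, following the thresholds listed above (the function names are invented for this illustration; `aiCostRatio` is a fraction, e.g. 0.15 for 15%):

```typescript
// Cost-benefit ratio = (multiplier - 1) / (annual AI cost / employment cost)
function costBenefitRatio(
  productivityMultiplier: number,
  annualAiCostCny: number,
  employmentCostCny: number,
): number {
  const aiCostRatio = annualAiCostCny / employmentCostCny;
  const productivityGain = productivityMultiplier - 1;
  return productivityGain / aiCostRatio;
}

// Interpretation bands from the article: <1 not worth it,
// 1-2 barely worth it, >2 high return.
function costBenefitVerdict(ratio: number): string {
  if (ratio < 1) return "not worth it";
  if (ratio <= 2) return "barely worth it";
  return "high return";
}
```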

Risk is categorized according to equivalent headcount:

| Equivalent headcount | Risk level | Conclusion                                                                      |
| -------------------- | ---------- | ------------------------------------------------------------------------------- |
| >= 2.0               | High risk  | If your coworkers gain the same conditions, they become a serious threat to you |
| 1.5 - 2.0            | Warning    | Coworkers have begun to build a clear productivity advantage                    |
| < 1.5                | Safe       | For now, you can still maintain a gap                                           |

After seeing that table, you probably have a rough sense of where you stand. Still, there is no point in panicking. Anxiety does not solve problems. It is better to think about how to raise your own productivity multiplier.
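Mapping equivalent headcount to those risk bands is a simple threshold check, sketched here for completeness:

```typescript
// Risk bands from the table above.
function riskLevel(equivalentHeadcount: number): string {
  if (equivalentHeadcount >= 2.0) return "high risk";
  if (equivalentHeadcount >= 1.5) return "warning";
  return "safe";
}
```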

To make the results more fun, the calculator introduces a system of seven special titles. These titles are persisted through localStorage, allowing users to unlock and display their own “achievements.”

| Title ID           | Name               | Unlock condition                                           |
| ------------------ | ------------------ | ---------------------------------------------------------- |
| craftsman-spirit   | Craftsman Spirit   | Average daily token usage = 0                              |
| prompt-alchemist   | Prompt Alchemist   | Daily tokens <= 20M and productivity multiplier >= 6       |
| all-in-operator    | All-In Operator    | Daily tokens >= 150M and productivity multiplier >= 3      |
| minimalist-runner  | Minimalist Runner  | Daily tokens <= 5M and productivity multiplier >= 2        |
| cost-tamer         | Cost Tamer         | Cost-benefit ratio >= 2.5 and AI cost ratio <= 15%         |
| danger-oracle      | Danger Oracle      | Equivalent headcount >= 2.5 or entering the high-risk zone |
| budget-coordinator | Budget Coordinator | Affordable workflow count >= 8                             |

Each title also carries a hidden meaning:

| Title              | Hidden meaning                                                              |
| ------------------ | --------------------------------------------------------------------------- |
| Craftsman Spirit   | You can still do fine without AI, but you need unique competitive strengths |
| Prompt Alchemist   | You achieve high output with very few tokens; a classic power-user profile  |
| All-In Operator    | High input, high output; suitable for high-frequency scenarios              |
| Minimalist Runner  | Lightweight AI usage; suitable for light-assistance scenarios               |
| Cost Tamer         | Extremely high ROI; the kind of employee companies love                     |
| Danger Oracle      | You are already, or soon will be, in a high-risk group                      |
| Budget Coordinator | You can operate multiple AI workflows at the same time                      |

Gamification is really just a way to make dry data a little more entertaining. After all, who does not like collecting achievements? Like badges in a game, they may not have much practical value, but they still feel good to earn.
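Each unlock condition is just a predicate over the calculation result. A sketch of a few of them, with hypothetical field names (`aiCostRatio` as a fraction, not a percentage):

```typescript
// Assumed shape of the calculator's result; field names are
// illustrative, not the real schema.
interface CalcResult {
  dailyTokensM: number;
  productivityMultiplier: number;
  costBenefitRatio: number;
  aiCostRatio: number; // fraction, e.g. 0.15 means 15%
}

// Check three of the seven title conditions from the table above.
function unlockedTitles(r: CalcResult): string[] {
  const titles: string[] = [];
  if (r.dailyTokensM === 0) titles.push("craftsman-spirit");
  if (r.dailyTokensM > 0 && r.dailyTokensM <= 20 && r.productivityMultiplier >= 6) {
    titles.push("prompt-alchemist");
  }
  if (r.costBenefitRatio >= 2.5 && r.aiCostRatio <= 0.15) titles.push("cost-tamer");
  return titles;
}
```

The real implementation persists whatever this returns into localStorage, so titles survive page reloads.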

Data Sources: An Authoritative Pricing System

Section titled “Data Sources: An Authoritative Pricing System”

The calculator’s pricing data comes from multiple official API pricing pages to keep the results authoritative and up to date.

This data is updated regularly, with the latest refresh on 2026-03-19.

Data only matters when it is current. Once it is outdated, it stops being useful. On that front, the HagiCode team has been quite responsible about keeping things updated.

Suppose you are a developer in Beijing with an annual salary of 400,000 CNY, using Claude Sonnet 4.6, consuming 50M tokens per day on average, and estimating that AI gives you a 3x productivity boost. The simulated input looks like this:

const input = {
  annualIncomeCny: 400000,
  cityTier: "tier1", // Beijing
  modelId: "claude-sonnet-4-6",
  performanceMultiplier: 3.0,
  dailyTokenUsageM: 50,
}

// Calculation process
// Total annual employment cost = 400k x (1 + 0.4) + 400k/12 ~= 593.3k
// Annual AI cost ~= 50 x 7.125 x 264 ~= 94k
// Affordable workflow count ~= 593.3 / 94 ~= 6.3 workflows
// Equivalent headcount = 1 + (3 - 1) x 1 = 3 people

Conclusion: if one of your coworkers has the same conditions, their output would be equivalent to three people. You are already in the high-risk zone.

If you discover that your current AI usage is “not worth it” (cost-benefit ratio < 1), you can consider:

  1. Reducing token usage: use more efficient prompts and cut down ineffective requests
  2. Choosing a more cost-effective model: for example, DeepSeek-V3 (priced in CNY and cheaper)
  3. Increasing your productivity multiplier: learn advanced Agent usage techniques and truly turn AI into productivity

In the end, all of this comes down to the art of balance. Use too much and you waste money; use too little and nothing changes. The key is finding the sweet spot.

When designing this calculator, the HagiCode team made several engineering decisions worth learning from:

  1. Pure frontend computation: all calculations run in the browser, with no backend API dependency, which protects user privacy
  2. Configuration-driven: all formulas, pricing, and role data are centralized in configuration files, so future updates do not require changing core code logic
  3. Multilingual support: supports both Chinese and English
  4. Instant feedback: results update in real time as soon as the user changes inputs
  5. Detailed formula display: every result includes the full calculation formula to help users understand it

This design makes the calculator easy to maintain and extend, while also serving as a reference template for similar data-driven applications.

Good architecture, like good code, takes time to build up. The HagiCode team put real thought into it.

The core value of the AI productivity calculator is that it turns the vague anxiety of an “AI replacement threat” into metrics that can be quantified and compared.

The equivalent headcount formula, 1 + (productivity multiplier - 1) x affordability ratio, is the core innovation of the entire framework. It considers not only productivity gains, but also whether a company can afford the AI cost, making the evaluation much closer to reality.

This framework tells us one thing clearly: in the AI era, not knowing where you stand is the most dangerous position of all.

Instead of worrying, let the data speak.

A lot of fear comes from the unknown. Once you quantify everything, the situation no longer feels quite so terrifying. At worst, you improve yourself or change tracks. Life is long, and there is no need to stake everything on a single path.


Visit cost.hagicode.com now and complete your AI productivity assessment.



Data source: cost.hagicode.com | Powered by HagiCode

In the end, a line of poetry came to mind: “This feeling might have become a thing to remember, yet even then one was already lost.” The AI era is much the same. Instead of waiting until you are replaced and filled with regret, it is better to start taking action now…

Thank you for reading. If you found this article useful, likes, bookmarks, and shares are all welcome. This content was created with AI-assisted collaboration, and the final version was reviewed and confirmed by the author.

Why HagiCode Chose Hermes as Its Integrated Agent Core

Why HagiCode Chose Hermes as Its Integrated Agent Core

Section titled “Why HagiCode Chose Hermes as Its Integrated Agent Core”

When building an AI-assisted coding platform, choosing the right Agent core directly determines the upper limit of the system’s capabilities. Some things simply cannot be forced; pick the wrong framework, and no amount of effort will make it feel right. This article shares the thinking behind HagiCode’s technical selection and our hands-on experience integrating Hermes Agent.

When building an AI-assisted coding product, one of the hardest parts is choosing the underlying Agent framework. There are actually quite a few options on the market, but some are too limited in functionality, some are overly complex to deploy, and others simply do not scale well enough. What we needed was a solution that could run on a $5 VPS while also being able to connect to a GPU cluster. That requirement may not sound extreme, but it is enough to scare plenty of teams away.

In practice, many so-called “all-in-one Agents” either only run in the cloud or require absurdly high local deployment costs. After spending two weeks researching different approaches, we made a bold decision: rebuild the entire Agent core around Hermes as the underlying engine for our integrated Agent.

Everything that followed may simply have been fate.

The approach shared in this article comes from real-world experience in the HagiCode project. HagiCode is an AI-assisted coding platform that provides developers with an intelligent coding assistant through a VSCode extension, a desktop client, and web services. You may have used similar tools before and felt they were just missing that final touch; we understand that feeling well.

Before diving into Hermes itself, it helps to explain why HagiCode needed something like it in the first place. Things rarely work exactly the way you want, so you need a practical reason to commit to a technical direction.

As an AI coding assistant, HagiCode needs to support several usage scenarios at the same time:

  • Local development environments: developers want to run it on their own machines so data never leaves the local environment. These days, data security is never a trivial concern.
  • Team collaboration environments: small teams should be able to share an Agent deployment running on a server. Saving money matters, and everyone has limits.
  • Elastic cloud expansion: when handling complex tasks, the system should automatically scale out to a GPU cluster. It is always better to be prepared.

This “we want everything at once” requirement is what led us to Hermes. Whether it was the perfect choice, I cannot say for sure, but at the time we did not see a better option.

Hermes Agent is an autonomous AI Agent created by Nous Research. Some readers may not be familiar with Nous Research; they are the lab behind open-source large models such as Hermes, Nomos, and Psyché. They have built many excellent things, even if they are still more underappreciated than they deserve.

Unlike traditional IDE coding assistants or simple API chat wrappers, Hermes has a defining trait: the longer it runs, the more capable it becomes. It is not designed to complete a task once and stop; it keeps learning and accumulating experience over long-running operation. In that sense, it feels a little like a person.

Several of Hermes’s core capabilities happen to align very closely with HagiCode’s needs.

This means HagiCode can choose the most suitable deployment model based on each user’s scenario: individuals run it locally, teams deploy it on servers, and complex tasks use GPU resources. One codebase handles all of it. In a world this busy, saving one layer of complexity is already a win.

Multi-platform messaging gateway: Hermes natively supports Telegram, Discord, Slack, WhatsApp, and more. For HagiCode, this means we can support AI assistants on those channels much more easily in the future. More paths forward are always welcome.

Rich tool system: Hermes comes with 40+ built-in tools and supports MCP (Model Context Protocol) extensions. This is essential for a coding assistant: executing shell commands, working with the file system, and calling Git all depend on tool support. An Agent without tools is like a bird without wings.

Cross-session memory: Hermes includes a persistent memory system and uses FTS5 full-text search to recall historical conversations. That allows the Agent to remember prior context instead of “losing its memory” every time. Sometimes people wish they could forget things that easily, but reality is usually less generous.
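As a rough illustration of what FTS5-backed recall looks like, Python's built-in sqlite3 module is enough. The table and column names below are invented for the example and have nothing to do with Hermes's actual schema:

```python
import sqlite3

# In-memory database with an FTS5 virtual table; schema is illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE memory USING fts5(session_id, content)")
conn.executemany(
    "INSERT INTO memory VALUES (?, ?)",
    [
        ("s1", "user asked about retry backoff for streaming agents"),
        ("s2", "discussed provider factories and session adapters"),
    ],
)

# Full-text query: rank matching rows so earlier context can be re-injected.
rows = conn.execute(
    "SELECT session_id, content FROM memory WHERE memory MATCH ? ORDER BY rank",
    ("retry",),
).fetchall()
print(rows[0][0])  # s1
```

The real system would also handle session scoping and result pruning, but the core idea is just this: index past conversations, then query them at prompt time.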

Now that the “why” is clear, let us look at the “how.” Once something makes sense in theory, the next step is to build it.

In HagiCode’s architecture, all AI Providers implement a unified IAIProvider interface:

public sealed class HermesCliProvider : IAIProvider, IVersionedAIProvider
{
    public ProviderCapabilities Capabilities { get; } = new ProviderCapabilities
    {
        SupportsStreaming = true,      // Supports streaming output
        SupportsTools = true,          // Supports tool invocation
        SupportsSystemMessages = true, // Supports system prompts
        SupportsArtifacts = false
    };
}

This abstraction layer allows HagiCode to switch seamlessly between different AI Providers. Whether the backend is OpenAI, Claude, or Hermes, the upper-layer calling pattern stays exactly the same. In plain terms, it keeps things simple.

Hermes communicates through ACP (Agent Communication Protocol). This protocol is designed specifically for Agent communication, and its main methods include:

  • initialize: initialize the connection and obtain the protocol version and client capabilities
  • authenticate: handle authentication; supports multiple authentication methods
  • session/new: create a new session and configure the working directory and MCP servers
  • session/prompt: send a prompt and receive a response
HagiCode implements the ACP transport layer through StdioAcpTransport, launching a Hermes subprocess and communicating with it over standard input and output. It may sound complicated, but in practice it is manageable as long as you have enough patience.
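To make the stdio exchange concrete, here is a minimal Python sketch of the JSON-RPC 2.0 message shape. The actual framing StdioAcpTransport uses may differ; newline-delimited JSON is only an assumption for this illustration:

```python
import json

def make_request(request_id: int, method: str, params: dict) -> bytes:
    # One JSON-RPC 2.0 request per line (illustrative framing, not
    # necessarily what the real ACP transport does on the wire).
    msg = {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    return (json.dumps(msg) + "\n").encode()

def parse_message(line: bytes) -> dict:
    return json.loads(line)

# Round-trip the kind of "initialize" call described above.
req = make_request(1, "initialize", {"clientName": "HagiCode"})
echo = parse_message(req)
print(echo["method"])  # initialize
```

In the real integration, these bytes go to the Hermes subprocess's stdin and responses are read back from its stdout, matched to requests by `id`.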

Configuration is managed through the HermesPlatformConfiguration class:

public sealed class HermesPlatformConfiguration : IAcpPlatformConfiguration
{
    public string ExecutablePath { get; set; } = "hermes";
    public string Arguments { get; set; } = "acp";
    public int StartupTimeoutMs { get; set; } = 5000;
    public string ClientName { get; set; } = "HagiCode";
    public HermesAuthenticationConfiguration Authentication { get; set; }
    public HermesSessionDefaultsConfiguration SessionDefaults { get; set; }
}

Configure Hermes in appsettings.json:

{
  "Providers": {
    "HermesCli": {
      "ExecutablePath": "hermes",
      "Arguments": "acp",
      "StartupTimeoutMs": 10000,
      "ClientName": "HagiCode",
      "Authentication": {
        "PreferredMethodId": "api-key",
        "MethodInfo": {
          "api-key": "your-api-key-here"
        }
      },
      "SessionDefaults": {
        "Model": "claude-sonnet-4-20250514",
        "ModeId": "default"
      }
    }
  }
}

Configuration often looks simple on paper, but getting every detail right still takes real effort.

HagiCode uses Orleans to build its distributed system, and the Hermes integration is implemented through the following components:

  • HermesGrain: An Orleans Grain implementation that handles session execution
  • HermesPlatformConfiguration: Platform-specific configuration
  • HermesAcpSessionAdapter: ACP session adapter
  • HermesConsole: A dedicated validation console

The name Orleans does have a certain charm to it. Even if this Orleans has nothing to do with the legendary city, a good name never hurts.

The following is the core execution logic of the Hermes Provider:

private async IAsyncEnumerable<AIStreamingChunk> StreamCoreAsync(
    AIRequest request,
    string? embeddedCommandPrompt,
    [EnumeratorCancellation] CancellationToken cancellationToken)
{
    // 1. Create transport layer and launch Hermes subprocess
    await using var transport = new StdioAcpTransport(
        platformConfiguration.GetExecutablePath(),
        platformConfiguration.GetArguments(),
        platformConfiguration.GetEnvironmentVariables(),
        platformConfiguration.GetStartupTimeout(),
        _loggerFactory.CreateLogger<StdioAcpTransport>());
    await transport.ConnectAsync(cancellationToken);

    // 2. Initialize and obtain protocol version and authentication methods
    var nextRequestId = 1;
    var initializeResult = await SendHermesRequestAsync(
        transport, nextRequestId++, "initialize",
        BuildInitializeParameters(platformConfiguration), cancellationToken);

    // 3. Handle authentication
    var authMethods = ParseAuthMethods(initializeResult);
    if (!isAuthenticated)
    {
        var methodId = platformConfiguration.Authentication.ResolveMethodId(authMethods);
        await SendHermesRequestAsync(transport, nextRequestId++, "authenticate", ...);
    }

    // 4. Create session
    var newSessionResult = await SendHermesRequestAsync(
        transport, nextRequestId++, "session/new",
        BuildNewSessionParameters(platformConfiguration, workingDirectory, model), cancellationToken);
    var sessionId = ParseSessionId(newSessionResult);

    // 5. Execute prompt and collect streaming responses
    await foreach (var payload in transport.ReceiveMessagesAsync(cancellationToken))
    {
        // Handle session/update notifications and convert them into streaming chunks
        // (JSON parsing of payload into `root` is elided in this excerpt)
        if (TryParseSessionNotification(root, out var notification))
        {
            if (_responseMapper.TryConvertToStreamingChunk(notification, out var chunk))
            {
                yield return chunk;
            }
        }
    }
}

With code, the details eventually become familiar. What matters most is the overall approach.

To ensure Hermes remains available, HagiCode implements a health check mechanism:

public async Task<ProviderTestResult> PingAsync(CancellationToken cancellationToken = default)
{
    var stopwatch = Stopwatch.StartNew(); // System.Diagnostics.Stopwatch
    var response = await ExecuteAsync(
        new AIRequest
        {
            Prompt = "Reply with exactly PONG.",
            SessionId = null,
            AllowedTools = Array.Empty<string>(),
            WorkingDirectory = ResolveWorkingDirectory(null)
        },
        cancellationToken);
    var success = string.Equals(response.Content.Trim(), "PONG", StringComparison.OrdinalIgnoreCase);
    return new ProviderTestResult
    {
        ProviderName = Name,
        Success = success,
        ResponseTimeMs = stopwatch.ElapsedMilliseconds,
        ErrorMessage = success ? null : $"Unexpected Hermes ping response: '{response.Content}'."
    };
}

That is roughly what a “health check” looks like here. In some ways, people are not so different: it helps to check in from time to time, even if no one tells us exactly what to look for.

There are a few pitfalls worth understanding before integrating Hermes. Everyone steps into a few traps sooner or later.

Hermes supports multiple authentication methods, including API keys and tokens, so you need to choose based on the actual deployment scenario. Misconfiguration can cause connection failures, and the resulting error messages are not always intuitive. Sometimes the reported error is far away from the real root cause, which means slow and careful debugging is unavoidable.

When creating a session, you can configure a list of MCP servers so Hermes can call external tools. But keep the following points in mind:

  • MCP server addresses must be reachable
  • Timeouts must be configured reasonably
  • The system needs degradation handling when a server is unavailable

In practice, defensive thinking matters more than people expect.

Each session must specify a working directory so Hermes can access project files correctly. In multi-project scenarios, the working directory needs to switch dynamically. It sounds straightforward, but there are more edge cases than you might think.

Hermes responses may be split across session/update notifications and the final result, so they must be merged correctly. Otherwise, content may be lost.
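A minimal sketch of that merge logic, in Python for brevity; the event shape here is illustrative, not the real ACP payload:

```python
def merge_stream(events):
    """Accumulate text from session/update notifications, falling back to
    the final result only when the stream carried no content."""
    parts = []
    final = None
    for event in events:
        if event.get("method") == "session/update":
            parts.append(event["params"]["text"])
        elif event.get("result") is not None:
            final = event["result"].get("text")
    streamed = "".join(parts)
    return streamed if streamed else (final or "")

events = [
    {"method": "session/update", "params": {"text": "Hello, "}},
    {"method": "session/update", "params": {"text": "world"}},
    {"result": {"text": "Hello, world"}},
]
print(merge_stream(events))  # Hello, world
```

The important decision is which source wins when both carry content; here the streamed chunks win, so nothing the user already saw gets rewritten after the fact.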

Runtime errors should be returned explicitly instead of silently falling back to another Provider. That way, users know the issue came from Hermes rather than wondering why the system suddenly switched models behind the scenes.

HagiCode’s decision to use Hermes as its integrated Agent core was not a casual impulse. It was a careful choice based on practical requirements and the technical characteristics of the framework. Whether it proves to be the perfect long-term answer is still too early to say, but so far it has been serving us well.

Hermes gives HagiCode the flexibility to adapt to a wide range of scenarios. Its powerful tool system and MCP support allow the AI assistant to do real work, while the ACP protocol and Provider abstraction layer keep the integration process clear and controllable.

If you are choosing an Agent framework for your own AI project, I hope this article offers a useful reference. Picking the right underlying architecture can make everything that follows much easier.

Thank you for reading. If you found this article useful, you are welcome to support it with a like, bookmark, or share. This content was created with AI-assisted collaboration, and the final content was reviewed and confirmed by the author.

Building an AI Adventure Party: A Practical Guide to Multi-Agent Collaboration Configuration in HagiCode

Building an AI Adventure Party: A Practical Guide to Multi-Agent Collaboration Configuration in HagiCode

Section titled “Building an AI Adventure Party: A Practical Guide to Multi-Agent Collaboration Configuration in HagiCode”

In modern software development, a single AI Agent is no longer enough for complex needs. How can multiple AI assistants from different companies collaborate within the same project? This article shares the multi-Agent collaboration configuration approach that the HagiCode project developed through real-world practice.

Many developers have likely had this experience: bringing an AI assistant into a project really does improve coding efficiency. But as requirements grow more complex, one AI Agent starts to fall short. You want it to handle code review, documentation generation, unit tests, and more at the same time, but the result is often that it cannot balance everything well, and output quality becomes inconsistent.

What is even more frustrating is that once you try to introduce multiple AI assistants, things get more complicated. Each Agent has its own configuration method, API interface, and execution logic, and they may even conflict with one another. It is like a sports team where every player is individually strong, but nobody knows how to coordinate, so the whole match turns into chaos.

The HagiCode project ran into the same problem during development. As a complex project involving a frontend VSCode extension, backend AI services, and a cross-platform desktop client, in the 2026-03 version at that time we needed to integrate multiple AI assistants from different companies at once: Claude Code, Codex, CodeBuddy, iFlow, and more. Figuring out how to let them coexist harmoniously in the same project while making the best use of their individual strengths became a critical problem we had to solve.

That alone would already be enough trouble. After all, who wants to deal with a group of AI tools fighting each other every day?

The approach shared in this article is the multi-Agent collaboration configuration practice we developed in the HagiCode project through real trial and error and repeated optimization. If you are also struggling with multiple AI assistants working together, this article may give you some ideas. Maybe. Every project is different, after all.

HagiCode is an AI coding assistant project that adopts an “adventure party” model in which multiple AI engines work together. Project repository: github.com/HagiCode-org/site.

The multi-Agent configuration approach shared here is one of the core techniques that allows HagiCode to maintain efficient development in complex projects. There is nothing especially mystical about it; it just turns a group of AIs into an adventure party that can actually coordinate.

HagiCode’s Multi-Agent Architecture Design

Section titled “HagiCode’s Multi-Agent Architecture Design”

From “Going Solo” to “Team Collaboration”

Section titled “From “Going Solo” to “Team Collaboration””

In the early days of the HagiCode project, we also tried using a single AI Agent to handle everything. We quickly discovered a clear bottleneck in that approach: different tasks demand different strengths. Some tasks require stronger contextual understanding, while others need more precise code editing. One Agent has a hard time excelling at all of them.

That made us realize that multiple Agents had to work together. But the problem was this: how do you let AI products from different companies coexist peacefully in the same project? We needed to solve several core issues:

  1. Configuration management complexity: each Agent has different configuration methods, API interfaces, and execution modes
  2. Unified communication protocol: we need a standardized way for different Agents to exchange data
  3. Task coordination and division of labor: how do we assign work reasonably so each Agent can play to its strengths

With those questions in mind, we started designing HagiCode’s multi-Agent architecture. It was not really that complicated in the end; we just had to think it through clearly.

After multiple iterations, this is the architecture we settled on:

┌─────────────────────────────────────────────────────────────────┐
│ AIProviderFactory │
│ (Factory pattern for unified management of all AI Providers) │
├─────────────────────────────────────────────────────────────────┤
│ ClaudeCodeCli │ CodexCli │ CodebuddyCli │ IFlowCli │
│ (Anthropic) │ (OpenAI) │ (Zhipu GLM) │ (Zhipu) │
└─────────────────────────────────────────────────────────────────┘

The core idea is to let different AI Agents be managed by the same code through a unified Provider interface. At the same time, the factory pattern is used to dynamically create and configure these Providers, ensuring scalability and flexibility across the system.

It is like division of labor in daily life. Everyone has a role; here we simply turned that idea into code architecture.

Agent Types and Division of Responsibilities

Section titled “Agent Types and Division of Responsibilities”

Based on HagiCode’s real-world experience, we assigned different responsibilities to each Agent:

Agent | Provider | Model | Primary Use
ClaudeCodeCli | Anthropic | glm-5-turbo | Generate technical solutions and Proposals
CodexCli | OpenAI/Zed | gpt-5.4 | Execute precise code changes
CodebuddyCli | Zhipu | glm-4.7 | Refine proposal descriptions and documentation
IFlowCli | Zhipu | glm-4.7 | Archive proposals and historical records (configuration at the time; now legacy-compatible only)
OpenCodeCli | - | - | General-purpose code editing
GitHubCopilot | Microsoft | - | Assisted programming and code completion

The logic behind this division of labor is simple: every Agent has its own area of strength. Claude Code performs well at understanding and analyzing complex requirements, so it handles early solution design. Codex is more precise when modifying code, so it is better suited for concrete implementation work. CodeBuddy offers strong cost performance, which makes it a great fit for refining documentation.

After all, the right tool for the right job is usually the best choice. There are many roads to Rome; some are simply easier to walk than others.

To manage different AI Agents in a unified way, we first need to define a common interface. In HagiCode, that interface looks like this:

public interface IAIProvider
{
    // Unified Provider interface
    Task<IAIProvider?> GetProviderAsync(AIProviderType providerType);
    Task<IAIProvider?> GetProviderAsync(string providerName, CancellationToken cancellationToken);
}

The interface looks simple, but it is the foundation of the entire multi-Agent system. With a unified interface, we can call AI products from different companies in exactly the same way, no matter what is underneath.

This is really just a matter of making complex things simple. Simple is beautiful, after all.

Once the interface is unified, the next question is how to create these Provider instances. HagiCode uses the factory pattern:

private IAIProvider? CreateProvider(AIProviderType providerType, ProviderConfiguration config)
{
    return providerType switch
    {
        AIProviderType.ClaudeCodeCli =>
            ActivatorUtilities.CreateInstance<ClaudeCodeCliProvider>(_serviceProvider, Options.Create(config)),
        AIProviderType.CodebuddyCli =>
            ActivatorUtilities.CreateInstance<CodebuddyCliProvider>(_serviceProvider, Options.Create(config)),
        AIProviderType.CodexCli =>
            ActivatorUtilities.CreateInstance<CodexCliProvider>(_serviceProvider, Options.Create(config)),
        AIProviderType.IFlowCli =>
            ActivatorUtilities.CreateInstance<IFlowCliProvider>(_serviceProvider, Options.Create(config)),
        _ => null
    };
}

This uses dependency injection through ActivatorUtilities.CreateInstance, which can dynamically create Provider instances at runtime while automatically injecting dependencies. The benefit of this design is that when a new Agent type is added, you only need to add the corresponding Provider class and then add one more case branch in the factory method. There is no need to modify the existing code at all.

That is reason enough. Who wants to rewrite a pile of old code every time a new feature is added?

To make configuration more flexible, we also implemented a type-mapping mechanism:

public static class AIProviderTypeExtensions
{
    private static readonly Dictionary<string, AIProviderType> _typeMap = new(
        StringComparer.OrdinalIgnoreCase)
    {
        ["ClaudeCodeCli"] = AIProviderType.ClaudeCodeCli,
        ["CodebuddyCli"] = AIProviderType.CodebuddyCli,
        ["CodexCli"] = AIProviderType.CodexCli,
        ["IFlowCli"] = AIProviderType.IFlowCli,
        // ...more type mappings
    };
}

The purpose of this mapping table is to convert string-form Provider names into enum types. This allows configuration files to use intuitive string names, while the internal code uses type-safe enums for processing.

Configuration should be as intuitive as possible. Nobody wants to memorize a pile of obscure code names.

In practice, everything can be configured in appsettings.json:

{
  "AI": {
    "Providers": {
      "Providers": {
        "ClaudeCodeCli": {
          "Enabled": true,
          "Model": "glm-5-turbo",
          "WorkingDirectory": "/path/to/project"
        },
        "CodebuddyCli": {
          "Enabled": true,
          "Model": "glm-4.7"
        },
        "CodexCli": {
          "Enabled": true,
          "Model": "gpt-5.4"
        },
        "IFlowCli": {
          "Enabled": true,
          "Model": "glm-4.7"
        }
      }
    }
  }
}

Each Provider can independently configure parameters such as enablement, model version, and working directory. This design preserves flexibility while remaining easy to manage and maintain.

In some ways, configuration files are like life’s options: you can choose to enable or disable certain things. The only difference is that code choices are easier to regret later.

With the unified technical architecture in place, the next step is making multiple Agents work together. HagiCode designed a task flow mechanism so different Agents can handle different stages of the work:

Proposal creation (user)
[Claude Code] ──generate proposal──▶ Proposal document
│ │
│ ▼
│ [Codebuddy] ──refine description──▶ Refined proposal
│ │
│ ▼
│ [Codex] ──execute changes──▶ Code changes
│ │
│ ▼
└──────────────────────▶ [iFlow] ──archive──▶ Historical records

The benefit of this division of labor is that each Agent only needs to focus on the tasks it does best, rather than trying to do everything. Claude Code generates proposals from scratch. Codebuddy makes proposal descriptions clearer. Codex turns proposals into actual code changes. iFlow archives and preserves those changes.

This is really just teamwork, the same as in daily life. Everyone has a role, and only together can something big get done. Here, the team members just happen to be AIs.
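The four-stage flow above can be sketched as a plain sequential pipeline. The stage functions below are stand-ins for the real Agent calls, not HagiCode's implementation:

```python
# Each function represents one Agent's stage in the flow described above.
def generate_proposal(request):   # Claude Code: draft the proposal
    return f"proposal({request})"

def refine(proposal):             # Codebuddy: clarify the description
    return f"refined({proposal})"

def execute(proposal):            # Codex: turn the proposal into changes
    return f"changes({proposal})"

archive = []                      # iFlow: historical record store

def run_pipeline(request):
    proposal = generate_proposal(request)
    proposal = refine(proposal)
    changes = execute(proposal)
    archive.append(changes)
    return changes

print(run_pipeline("add retry"))  # changes(refined(proposal(add retry)))
```

The point of the sketch is the shape, not the strings: each stage consumes the previous stage's output, so each Agent only ever sees the kind of input it is best at handling.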

In actual operation, we summarized the following lessons:

1. Agent selection strategy matters

Tasks should not be assigned casually; they should be matched to each Agent’s strengths:

  • Proposal generation: use Claude Code, because it has stronger contextual understanding
  • Code execution: use Codex, because it is more precise for code modification
  • Proposal refinement: use Codebuddy, because it offers strong cost performance
  • Archival storage: use iFlow, because it is stable and reliable

After all, putting the right person on the right task is a timeless principle.

2. Configuration isolation ensures stability

Each Agent’s configuration is managed independently, supports environment-variable overrides, and uses separate working directories. As a result, a configuration error in one Agent does not affect the others.

This is like personal boundaries in life. Everyone needs their own space; non-interference makes coexistence possible.

3. Error-handling mechanism

A failure in a single Agent should not affect the overall workflow. We implemented a fallback strategy: when one Agent fails, the system can automatically switch to a backup plan or skip that step and continue with later tasks. At the same time, complete logging makes troubleshooting easier afterward.

Nobody can guarantee that errors will never happen. The key is how you handle them. Life works much the same way.
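A minimal sketch of such a fallback chain, in Python; the agent functions here are hypothetical, not HagiCode's actual code:

```python
import logging

logging.basicConfig(level=logging.WARNING)
log = logging.getLogger("agents")

def call_with_fallback(task, agents):
    """Try each (name, agent) pair in order; log failures and fall through.
    Returns None when every agent fails, so the caller can skip the step."""
    for name, agent in agents:
        try:
            return agent(task)
        except Exception as exc:
            log.warning("agent %s failed on %r: %s", name, task, exc)
    return None

def flaky(task):
    raise RuntimeError("quota exceeded")

def backup(task):
    return f"done:{task}"

print(call_with_fallback("review", [("primary", flaky), ("backup", backup)]))
# done:review
```

The warning log is the part that pays off later: when the backup silently carried the load for a week, the logs are the only place that fact is visible.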

4. Monitoring and observability

Through the ACP protocol (our custom communication protocol based on JSON-RPC 2.0), we can track the execution status of each Agent. Session isolation ensures concurrency safety, while dynamic caching improves performance.

The things you cannot see are often the ones most likely to go wrong. Some visibility is always better than flying blind.

After adopting this multi-Agent collaboration configuration, the HagiCode project’s development efficiency improved significantly. Specifically:

  1. Task-handling capacity doubled: in the past, one Agent had to handle many kinds of tasks at once; now tasks can be processed in parallel, and throughput has increased dramatically
  2. More stable output quality: each Agent focuses only on what it does best, so consistency and quality both improve
  3. Lower maintenance cost: unified interfaces and configuration management make the whole system easier to maintain and extend
  4. Adding new Agents is simple: to integrate a new AI product, you only need to implement the interface and add configuration, without changing the core logic

This approach not only solved HagiCode’s own problems, but also proved that multi-Agent collaboration is a viable architectural choice.

The gains were quite noticeable. The process was just a bit of a hassle.

This article shared the HagiCode project’s practical experience with multi-Agent collaboration configuration. The main takeaways include:

  1. Standardized interfaces: IAIProvider unifies the behavior of different Agents, allowing the code to ignore which company’s product is underneath
  2. Factory pattern: ActivatorUtilities.CreateInstance dynamically creates Provider instances, supporting runtime configuration and dependency injection
  3. Protocol unification: the ACP protocol provides standardized communication between Agents through a bidirectional mechanism based on JSON-RPC 2.0
  4. Task routing: assign work reasonably across different Agents so each can play to its strengths, instead of expecting one Agent to do everything

This design not only solves the problem of “multiple Agents fighting each other,” but also uses the adventure party task flow mechanism to make the development process more automated and specialized.

If you are also considering introducing multiple AI assistants, I hope this article gives you some useful reference points. Of course, every project is different, and the specific approach still needs to be adjusted to the actual situation. There is no one-size-fits-all solution; the best solution is the one that fits you.

Beautiful things or people do not need to be possessed. As long as they remain beautiful, simply appreciating that beauty is enough. Technical solutions are the same: the one that suits you is the best one…

Building an AI Adventure Party: HagiCode Multi-Agent Collaboration Configuration in Practice

Building an AI Adventure Party: HagiCode Multi-Agent Collaboration Configuration in Practice

Section titled “Building an AI Adventure Party: HagiCode Multi-Agent Collaboration Configuration in Practice”

In modern software development, a single AI Agent is no longer enough to meet complex requirements. How can multiple AI assistants from different companies collaborate within the same project? This article shares the multi-Agent collaboration configuration approach that the HagiCode project developed through real-world practice.

Many developers have probably had this experience: after introducing an AI assistant into a project, productivity really does improve. But as requirements become more and more complex, one AI Agent starts to feel insufficient. You want it to handle code review, documentation generation, unit testing, and other tasks at the same time, but the result is often that it cannot keep everything balanced, and the output quality becomes inconsistent.

What is even more frustrating is that once you try to bring in multiple AI assistants, the problem becomes more complicated. Each Agent has its own configuration method, API interface, and execution logic, and they may even conflict with one another. It is like a sports team in which every player is talented, but nobody knows how to work together, so the match turns into a mess.

The HagiCode project ran into the same challenge during development. As a complex project involving a frontend VSCode extension, backend AI services, and a cross-platform desktop client, we needed to connect multiple AI assistants from different companies at the same time: Claude Code, Codex, CodeBuddy, iFlow, and more. How to let them coexist harmoniously in the same project and make the most of their strengths became a key problem we had to solve.

That alone would already be enough trouble. After all, who wants to deal with a bunch of fighting AIs every day?

The approach shared in this article is the multi-Agent collaboration configuration practice that we developed in the HagiCode project through real trial and error and repeated optimization. If you are also struggling with multiple AI assistants working together, this article may give you some inspiration. Maybe. Every project is different, after all.

HagiCode is an AI coding assistant project that adopts an “adventure party” model in which multiple AI engines work together. Project repository: github.com/HagiCode-org/site.

The multi-Agent configuration approach shared in this article is one of the core technologies that allows HagiCode to maintain efficient development in complex projects. There is nothing especially magical about it; it simply turns a group of AIs into an adventure party that can actually coordinate.

HagiCode’s Multi-Agent Architecture Design

Section titled “HagiCode’s Multi-Agent Architecture Design”

From “Going Solo” to “Team Collaboration”

Section titled “From “Going Solo” to “Team Collaboration””

In the early days of the HagiCode project, we also tried using a single AI Agent to handle every task. We soon discovered a clear bottleneck in that approach: different tasks require different strengths. Some tasks need stronger contextual understanding, while others need more precise code modification capabilities. One Agent has a hard time excelling at everything.

That made us realize that multiple Agents had to work together. But the problem was this: how do you let AI products from different companies coexist peacefully in the same project? We needed to solve several core issues:

  1. Configuration management complexity: each Agent has different configuration methods, API interfaces, and execution modes
  2. Unified communication protocol: we need a standardized way for different Agents to exchange data
  3. Task coordination and division of labor: how do we assign work reasonably so each Agent can play to its strengths

With those questions in mind, we started designing HagiCode’s multi-Agent architecture. It was not actually that complicated; we just had to think it through clearly.

After multiple iterations, this is the architecture we settled on:

┌─────────────────────────────────────────────────────────────────┐
│ AIProviderFactory │
│ (Factory pattern for unified management of all AI Providers) │
├─────────────────────────────────────────────────────────────────┤
│ ClaudeCodeCli │ CodexCli │ CodebuddyCli │ IFlowCli │
│ (Anthropic) │ (OpenAI) │ (Zhipu GLM) │ (Zhipu) │
└─────────────────────────────────────────────────────────────────┘

The core idea is to let different AI Agents be managed by the same set of code through a unified Provider interface. At the same time, the factory pattern is used to dynamically create and configure these Providers, ensuring scalability and flexibility across the system.

It is like division of labor in everyday life. Everyone has their own role; here we simply turned that idea into code architecture.

Agent Types and Division of Responsibilities

Section titled “Agent Types and Division of Responsibilities”

Based on HagiCode’s real-world experience, we assigned different responsibilities to each Agent:

| Agent | Provider | Model | Primary Use |
| --- | --- | --- | --- |
| ClaudeCodeCli | Anthropic | glm-5-turbo | Generate technical solutions and Proposals |
| CodexCli | OpenAI/Zed | gpt-5.4 | Execute precise code changes |
| CodebuddyCli | Zhipu | glm-4.7 | Refine proposal descriptions and documentation |
| IFlowCli | Zhipu | glm-4.7 | Archive proposals and historical records |
| OpenCodeCli | - | - | General-purpose code editing |
| GitHubCopilot | Microsoft | - | Assisted programming and code completion |

The logic behind this division of labor is simple: every Agent has its own area of strength. Claude Code performs well at understanding and analyzing complex requirements, so it handles early solution design. Codex is more precise when modifying code, so it is better suited for concrete implementation work. CodeBuddy offers strong cost performance, which makes it ideal for refining proposal text and documentation.

After all, the right tool for the right job is the best choice. There are many roads to Rome; some are simply easier to walk than others.

To manage different AI Agents in a unified way, we first need to define a common interface. In HagiCode, that interface looks like this:

public interface IAIProvider
{
    // Unified Provider interface
    Task<IAIProvider?> GetProviderAsync(AIProviderType providerType);
    Task<IAIProvider?> GetProviderAsync(string providerName, CancellationToken cancellationToken);
}

The interface looks simple, but it is the foundation of the entire multi-Agent system. With a unified interface, we can call AI products from different companies in the same way regardless of which company is behind them.

This is really just about making complex things simple. Simple is beautiful, after all.

Once the interface is unified, the next question is how to create these Provider instances. HagiCode uses the factory pattern:

private IAIProvider? CreateProvider(AIProviderType providerType, ProviderConfiguration config)
{
    return providerType switch
    {
        AIProviderType.ClaudeCodeCli =>
            ActivatorUtilities.CreateInstance<ClaudeCodeCliProvider>(_serviceProvider, Options.Create(config)),
        AIProviderType.CodebuddyCli =>
            ActivatorUtilities.CreateInstance<CodebuddyCliProvider>(_serviceProvider, Options.Create(config)),
        AIProviderType.CodexCli =>
            ActivatorUtilities.CreateInstance<CodexCliProvider>(_serviceProvider, Options.Create(config)),
        AIProviderType.IFlowCli =>
            ActivatorUtilities.CreateInstance<IFlowCliProvider>(_serviceProvider, Options.Create(config)),
        _ => null
    };
}

This uses dependency injection through ActivatorUtilities.CreateInstance, which can dynamically create Provider instances at runtime while automatically injecting dependencies. The benefit of this design is that when a new Agent type is added, you only need to add the corresponding Provider class and then add one more case branch in the factory method. There is no need to modify the existing code at all.

That is reason enough. Who wants to rewrite a pile of old code every time a new feature is added?

To make configuration more flexible, we also implemented a type-mapping mechanism:

public static class AIProviderTypeExtensions
{
    private static readonly Dictionary<string, AIProviderType> _typeMap = new(
        StringComparer.OrdinalIgnoreCase)
    {
        ["ClaudeCodeCli"] = AIProviderType.ClaudeCodeCli,
        ["CodebuddyCli"] = AIProviderType.CodebuddyCli,
        ["CodexCli"] = AIProviderType.CodexCli,
        ["IFlowCli"] = AIProviderType.IFlowCli,
        // ...more type mappings
    };
}

The purpose of this mapping table is to convert string-form Provider names into enum types. This allows configuration files to use intuitive string names, while the internal code uses type-safe enums for processing.

Configuration should be as intuitive as possible. Nobody wants to memorize a pile of complicated code names.
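To make the lookup behavior concrete, here is a minimal TypeScript sketch of the same case-insensitive name-to-type mapping. The enum values mirror the C# mapping table above, but the helper name `parseProviderType` is an illustrative assumption, not HagiCode's actual API.

```typescript
// Illustrative sketch of a case-insensitive name-to-type mapping,
// mirroring the C# dictionary with StringComparer.OrdinalIgnoreCase.
enum AIProviderType {
  ClaudeCodeCli = 'ClaudeCodeCli',
  CodebuddyCli = 'CodebuddyCli',
  CodexCli = 'CodexCli',
  IFlowCli = 'IFlowCli',
}

// Normalize keys to lower case so config files can use any casing.
const typeMap = new Map<string, AIProviderType>(
  Object.values(AIProviderType).map((t) => [t.toLowerCase(), t]),
);

// Hypothetical helper: returns undefined for unknown provider names.
function parseProviderType(name: string): AIProviderType | undefined {
  return typeMap.get(name.trim().toLowerCase());
}
```

This way a config value like `"claudecodecli"` or `"ClaudeCodeCli"` resolves to the same enum member, while typos simply fail to resolve instead of crashing.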

In practice, everything can be configured in appsettings.json:

AI:
  Providers:
    Providers:
      ClaudeCodeCli:
        Enabled: true
        Model: glm-5-turbo
        WorkingDirectory: /path/to/project
      CodebuddyCli:
        Enabled: true
        Model: glm-4.7
      CodexCli:
        Enabled: true
        Model: gpt-5.4
      IFlowCli:
        Enabled: true
        Model: glm-4.7

Each Provider can independently configure parameters such as enablement, model version, and working directory. This design preserves flexibility while remaining easy to manage and maintain.

Configuration files are a bit like life’s options: you can choose to enable or disable certain things. The only difference is that code choices are easier to regret later.
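At startup, a loader can walk this config section and keep only the providers that are switched on. The sketch below is illustrative, assuming field names shaped like the sample config above; `enabledProviders` is not a real HagiCode function.

```typescript
// Hedged sketch: selecting enabled providers from a config section shaped
// like the appsettings example above (field names are from that sample).
type ProviderConfig = {
  Enabled: boolean;
  Model: string;
  WorkingDirectory?: string;
};

type ProvidersSection = Record<string, ProviderConfig>;

// Return the names of providers whose Enabled flag is true.
function enabledProviders(section: ProvidersSection): string[] {
  return Object.entries(section)
    .filter(([, cfg]) => cfg.Enabled)
    .map(([name]) => name);
}

const sample: ProvidersSection = {
  ClaudeCodeCli: { Enabled: true, Model: 'glm-5-turbo', WorkingDirectory: '/path/to/project' },
  CodebuddyCli: { Enabled: true, Model: 'glm-4.7' },
  CodexCli: { Enabled: false, Model: 'gpt-5.4' },
};
```

Disabling a provider then only requires flipping its `Enabled` flag; no code changes are involved.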

With the unified technical architecture in place, the next step is making multiple Agents work together. HagiCode designed a task flow mechanism so different Agents can handle different stages of the work:

Proposal creation (user)
[Claude Code] ──generate proposal──▶ Proposal document
│ │
│ ▼
│ [Codebuddy] ──refine description──▶ Refined proposal
│ │
│ ▼
│ [Codex] ──execute changes──▶ Code changes
│ │
│ ▼
└──────────────────────▶ [iFlow] ──archive──▶ Historical records

The benefit of this division of labor is that each Agent only needs to focus on the tasks it does best, rather than trying to do everything. Claude Code is responsible for generating proposals from scratch. Codebuddy makes proposal descriptions clearer. Codex turns proposals into actual code changes. iFlow archives and preserves those changes.

This is really just teamwork, much like in everyday life. Everyone has their own role, and only together can something big get done. The only difference is that the team members here happen to be AIs.
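The four-stage flow above can be sketched as a simple sequential pipeline, where each stage's output feeds the next stage. The stage names follow the diagram; the step functions here are placeholders, not real Agent calls.

```typescript
// Minimal sketch of the proposal pipeline: generate -> refine -> execute -> archive.
// Each step is a placeholder that tags its input, standing in for an Agent call.
type Stage = (input: string) => Promise<string>;

const pipeline: Array<[name: string, step: Stage]> = [
  ['claude-code', async (req) => `proposal(${req})`], // generate proposal
  ['codebuddy', async (doc) => `refined(${doc})`],    // refine description
  ['codex', async (doc) => `changes(${doc})`],        // execute changes
  ['iflow', async (doc) => `archived(${doc})`],       // archive
];

async function runPipeline(request: string): Promise<string> {
  let current = request;
  for (const [, step] of pipeline) {
    current = await step(current); // each Agent handles exactly one stage
  }
  return current;
}
```

Because each stage only sees the previous stage's output, swapping one Agent for another never disturbs the rest of the chain.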

In actual operation, we summarized the following lessons:

1. Agent selection strategy matters

Tasks should not be assigned casually; they should be matched to each Agent’s strengths:

  • Proposal generation: use Claude Code, because it has stronger contextual understanding
  • Code execution: use Codex, because it is more precise for code modification
  • Proposal refinement: use Codebuddy, because it offers strong cost performance
  • Archival storage: use iFlow, because it is stable and reliable

After all, putting the right person on the right task is a timeless principle.

2. Configuration isolation ensures stability

Each Agent’s configuration is managed independently, supports environment-variable overrides, and uses separate working directories. As a result, a configuration error in one Agent does not affect the others.

This is like personal boundaries in life. Everyone needs their own space; non-interference makes harmonious coexistence possible.

3. Error-handling mechanism

A failure in a single Agent should not affect the overall workflow. We implemented a fallback strategy: when one Agent fails, the system can automatically switch to a backup plan or skip that step and continue with later tasks. At the same time, complete logging makes troubleshooting easier afterward.

Nobody can guarantee that errors will never happen. The key is how you handle them. Life works much the same way.
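The fallback strategy described above can be sketched as a small wrapper: try the primary Agent, fall back to a backup, and if both fail, skip the step so later stages can still run. All names here are illustrative assumptions.

```typescript
// Hedged sketch of the fallback strategy: primary -> backup -> skip.
type StepFn = (input: string) => Promise<string>;

async function withFallback(
  input: string,
  primary: StepFn,
  backup?: StepFn,
): Promise<string> {
  try {
    return await primary(input);
  } catch {
    if (backup) {
      try {
        return await backup(input); // switch to the backup plan
      } catch {
        // backup also failed; fall through to skip
      }
    }
    // Skip this step: return the input unchanged so later stages continue.
    return input;
  }
}
```

In a real system each `catch` would also write a log entry, which is what makes the after-the-fact troubleshooting mentioned above possible.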

4. Monitoring and observability

Through the ACP protocol (our custom communication protocol based on JSON-RPC 2.0), we can track the execution status of each Agent. Session isolation ensures concurrency safety, while dynamic caching improves performance.

The things you cannot see are often the ones most likely to go wrong. Some visibility is always better than flying blind.

After adopting this multi-Agent collaboration configuration, the HagiCode project’s development efficiency improved significantly. Specifically:

  1. Task-handling capacity doubled: in the past, one Agent had to handle many kinds of tasks at once; now tasks can be processed in parallel, and throughput has increased dramatically
  2. More stable output quality: each Agent focuses only on what it does best, so consistency and quality both improve
  3. Lower maintenance cost: unified interfaces and configuration management make the whole system easier to maintain and extend
  4. Adding new Agents is simple: to integrate a new AI product, you only need to implement the interface and add configuration, without changing the core logic

This approach not only solved HagiCode’s own problems, but also proved that multi-Agent collaboration is a viable architectural choice.

The gains were quite noticeable. The process was just a bit of a hassle.

This article shared the HagiCode project’s practical experience with multi-Agent collaboration configuration. The main takeaways include:

  1. Standardized interfaces: IAIProvider unifies the behavior of different Agents, allowing the code to ignore which company’s product is underneath
  2. Factory pattern: ActivatorUtilities.CreateInstance dynamically creates Provider instances, supporting runtime configuration and dependency injection
  3. Protocol unification: the ACP protocol provides standardized communication between Agents through a bidirectional mechanism based on JSON-RPC 2.0
  4. Task routing: assign work reasonably across different Agents so each can play to its strengths, instead of expecting one Agent to do everything

This design not only solves the problem of “multiple Agents fighting each other,” but also uses the adventure party task flow mechanism to make the development process more automated and specialized.

If you are also considering introducing multiple AI assistants, I hope this article gives you some useful reference points. Of course, every project is different, and the specific approach still needs to be adjusted to the actual situation. There is no one-size-fits-all solution; the best solution is the one that fits you.

Beautiful things or people do not need to be possessed. As long as they remain beautiful, simply appreciating that beauty is enough. Technical solutions are the same: the one that suits you is the best one…


If this article was helpful to you, feel free to give the project a Star on GitHub. Your support is what keeps us sharing more. The public beta has already started, and you are welcome to install it and give it a try.


Thank you for reading. If you found this article useful, please click the like button below so more people can discover it.

This content was created with AI-assisted collaboration, reviewed by me, and reflects my own views and position.

How Gamification Design Makes AI Coding More Fun

How Gamification Design Makes AI Coding More Fun

Section titled “How Gamification Design Makes AI Coding More Fun”

Traditional AI coding tools are actually quite powerful; they just lack a bit of warmth. When we were building HagiCode, we thought: if we are going to write code anyway, why not turn it into a game?

Anyone who has used an AI coding assistant has probably had this experience: at first it feels fresh and exciting, but after a while it starts to feel like something is missing. The tool itself is powerful, capable of code generation, autocomplete, and Bug fixes, but… it does not feel very warm, and over time it can become monotonous and dull.

That alone is enough to make you wonder who wants to stare at a cold, impersonal tool every day.

It is a bit like playing a game. If all you do is finish a task list, with no character growth, no achievement unlocks, and no team coordination, it quickly stops being fun. Beautiful things and people do not need to be possessed to be appreciated; their beauty is enough on its own. Programming tools do not even offer that kind of beauty, so it is easy to lose heart.

We ran into exactly this problem while developing HagiCode. As a multi-AI assistant collaboration platform, HagiCode needs to keep users engaged over the long term. But in reality, even a great tool is hard to stick with if it lacks any emotional connection.

To solve this pain point, we made a bold decision: turn programming into a game. Not the superficial kind with a simple points leaderboard, but a true role-playing gamified experience. The impact of that decision may be even bigger than you imagine.

After all, people need a bit of ritual in their lives.

The ideas shared in this article come from our practical experience on the HagiCode project. HagiCode is a multi-AI assistant collaboration platform that supports Claude Code, Codex, Copilot, OpenCode, and other AI assistants working together. If you are interested in multi-AI collaboration or gamified programming, visit github.com/HagiCode-org/site to learn more.

There is nothing especially mysterious about it. We simply turned programming into an adventure.

The essence of gamification is not just “adding a leaderboard.” It is about building a complete incentive system so users can feel growth, achievement, and social recognition while doing tasks.

HagiCode’s gamification design revolves around one core idea: every AI assistant is a “Hero,” and the user is the captain of this Hero team. You lead these Heroes to conquer various “Dungeons” (programming tasks). Along the way, Heroes gain experience, level up, unlock abilities, and your team earns achievements as well.

This is not a gimmick. It is a design grounded in human behavioral psychology. When tasks are given meaning and progress feedback, people’s engagement and persistence increase significantly.

As the old saying goes, “This feeling can become a memory, though at the time it left us bewildered.” We bring that emotional experience into the tool, so programming is no longer just typing code, but a journey worth remembering.

Hero is the core concept in HagiCode’s gamification system. Each Hero represents one AI assistant. For example, Claude Code is a Hero, and Codex is also a Hero.

A Hero has three equipment slots, and the design is surprisingly elegant:

  1. CLI slot (main class): Determines the Hero’s base ability, such as whether it is Claude Code or Codex
  2. Model slot (secondary class): Determines which model is used, such as Claude 4.5 or Claude 4.6
  3. Style slot (style): Determines the Hero’s behavior style, such as “Fengluo Strategist” or another style

The combination of these three slots creates unique Hero configurations. Much like equipment builds in games, you choose the right setup based on the task. After all, what suits you best is what matters most. Life is similar: many roads lead to Rome, but some are smoother than others.

Each Hero has its own XP and level:

type HeroProgressionSnapshot = {
  currentLevel: number;                    // Current level
  totalExperience: number;                 // Total experience
  currentLevelStartExperience: number;     // Experience at the start of the current level
  nextLevelExperience: number;             // Experience required for the next level
  experienceProgressPercent: number;       // Progress percentage
  remainingExperienceToNextLevel: number;  // Experience still needed for the next level
  lastExperienceGain: number;              // Most recent experience gained
  lastExperienceGainAtUtc?: string | null; // Time when experience was gained
};

Levels are divided into four stages, and each stage has an immersive name:

export const resolveHeroProgressionStage = (level?: number | null): HeroProgressionStage => {
  const normalizedLevel = Math.max(1, level ?? 1);
  if (normalizedLevel <= 100) return 'rookieSprint';  // Rookie sprint
  if (normalizedLevel <= 300) return 'growthRun';     // Growth run
  if (normalizedLevel <= 700) return 'veteranClimb';  // Veteran climb
  return 'legendMarathon';                            // Legend marathon
};

From “rookie” to “legend,” this growth path gives users a clear sense of direction and achievement. It mirrors personal growth in life, from confusion to maturity, only made more tangible here.
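The boundary behavior of the stage resolver is easy to check directly. The snippet below copies the function from above verbatim (adding only the stage type so it runs standalone): levels 1-100 are the rookie sprint, 101-300 the growth run, 301-700 the veteran climb, and everything beyond is the legend marathon, with missing levels normalized to 1.

```typescript
// Standalone copy of the stage resolver shown above, plus its stage type.
type HeroProgressionStage = 'rookieSprint' | 'growthRun' | 'veteranClimb' | 'legendMarathon';

const resolveHeroProgressionStage = (level?: number | null): HeroProgressionStage => {
  const normalizedLevel = Math.max(1, level ?? 1);
  if (normalizedLevel <= 100) return 'rookieSprint';
  if (normalizedLevel <= 300) return 'growthRun';
  if (normalizedLevel <= 700) return 'veteranClimb';
  return 'legendMarathon';
};

// Boundary levels map as:
// 1..100 → rookieSprint, 101..300 → growthRun,
// 301..700 → veteranClimb, 701+ → legendMarathon.
```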

To create a Hero, you need to configure three slots:

const heroDraft: HeroDraft = {
  name: 'Athena',
  icon: 'hero-avatar:storm-03',
  description: 'A brilliant strategist',
  executorType: AIProviderType.CLAUDE_CODE_CLI,
  slots: {
    cli: {
      id: 'profession-claude-code',
      parameters: { /* CLI-related parameters */ }
    },
    model: {
      id: 'secondary-claude-4-sonnet',
      parameters: { /* Model-related parameters */ }
    },
    style: {
      id: 'fengluo-strategist',
      parameters: { /* Style-related parameters */ }
    }
  }
};

Every Hero has a unique avatar, description, and professional identity, which gives what would otherwise be a cold AI assistant more personality and warmth. After all, who wants to work with a tool that has no character?

A “Dungeon” is a classic game concept representing a challenge that requires a team to clear. In HagiCode, each workflow is a Dungeon.

Dungeon organizes workflows into different “Dungeons”:

  • Proposal generation dungeon: Responsible for generating technical proposals
  • Proposal execution dungeon: Responsible for executing tasks in proposals
  • Proposal archive dungeon: Responsible for organizing and archiving completed proposals

Each dungeon has its own Captain Hero; the first enabled Hero is automatically chosen as captain.

This is really just division of labor, like in everyday life, except turned into a game mechanic.

You can configure different Hero squads for different dungeons:

const dungeonRoster: HeroDungeonRoster = {
  scriptKey: 'proposal.generate',
  displayName: 'Proposal Generation',
  members: [
    { heroId: 'hero-1', name: 'Athena', executorType: 'ClaudeCode' },
    { heroId: 'hero-2', name: 'Apollo', executorType: 'Codex' }
  ]
};

For example, you can use Athena for generating proposals because it is good at strategy, and Apollo for implementing code because it is good at execution. That way, every Hero can play to its strengths. It is like forming a band: each person has an instrument, and together they create something beautiful.

Dungeon uses fixed scriptKey values to identify different workflows:

// Script keys map to different workflows
const dungeonScripts = {
'proposal.generate': 'Proposal Generation',
'proposal.execute': 'Proposal Execution',
'proposal.archive': 'Proposal Archive'
};

The task state flow is: queued (waiting) -> dispatching (being assigned) -> dispatched (assigned). The whole process is automated and requires no manual intervention. That is also part of our lazy side, because who wants to manage this stuff by hand?
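That three-step flow can be modeled as a tiny state machine. The transition table below is inferred from the description; the `advance` helper is illustrative, not HagiCode's actual scheduler code.

```typescript
// Sketch of the queued -> dispatching -> dispatched task flow.
type TaskState = 'queued' | 'dispatching' | 'dispatched';

// Each state has exactly one successor; null marks the terminal state.
const transitions: Record<TaskState, TaskState | null> = {
  queued: 'dispatching',
  dispatching: 'dispatched',
  dispatched: null,
};

function advance(state: TaskState): TaskState {
  const next = transitions[state];
  if (next === null) {
    throw new Error(`Task is already in terminal state: ${state}`);
  }
  return next;
}
```

Encoding the flow as a lookup table keeps illegal transitions impossible by construction, which is what lets the whole process run without manual intervention.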

XP is the core feedback mechanism in the gamification system. Users gain XP by completing tasks, XP levels up Heroes, and leveling up unlocks new abilities, forming a positive feedback loop.

In HagiCode, XP can be earned through the following activities:

  • Completing code execution
  • Successfully calling tools
  • Generating proposals
  • Session management operations
  • Project operations

Every time a valid action is completed, the corresponding Hero gains XP. Just like growth in life, every step counts, only here that growth is quantified.
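To make the XP bookkeeping concrete, here is a hedged sketch of how awarding XP could update the snapshot fields defined earlier (`currentLevel`, `totalExperience`, `experienceProgressPercent`, and so on). The flat 1000-XP-per-level curve is purely an assumption for illustration; HagiCode's real level formula is not shown in this article.

```typescript
// Illustrative XP award using a subset of HeroProgressionSnapshot's fields.
type Snapshot = {
  currentLevel: number;
  totalExperience: number;
  currentLevelStartExperience: number;
  nextLevelExperience: number;
  experienceProgressPercent: number;
};

const XP_PER_LEVEL = 1000; // assumed flat curve, not HagiCode's real formula

function awardXp(s: Snapshot, gained: number): Snapshot {
  const total = s.totalExperience + gained;
  const level = Math.floor(total / XP_PER_LEVEL) + 1;
  const levelStart = (level - 1) * XP_PER_LEVEL;
  const nextLevel = level * XP_PER_LEVEL;
  return {
    currentLevel: level,
    totalExperience: total,
    currentLevelStartExperience: levelStart,
    nextLevelExperience: nextLevel,
    // Percentage of the way from the current level to the next one.
    experienceProgressPercent: ((total - levelStart) / (nextLevel - levelStart)) * 100,
  };
}
```

Deriving all fields from `totalExperience` keeps the snapshot internally consistent no matter how many small XP awards accumulate.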

XP and level progress are visualized in real time:

type HeroDungeonMember = {
  heroId: string;
  name: string;
  icon?: string | null;
  executorType: PCode_Models_AIProviderType;
  currentLevel?: number;              // Current level
  totalExperience?: number;           // Total experience
  experienceProgressPercent?: number; // Progress percentage
};

Users can always see each Hero’s level and progress, and that immediate feedback is the key to gamification design. People need feedback, otherwise how would they know they are improving?

Achievements are another important element in gamification. They provide long-term goals and milestone-driven satisfaction.

HagiCode supports multiple types of achievements:

  • Code generation achievements: Generate X lines of code, generate Y files
  • Session management achievements: Complete Z conversations
  • Project operation achievements: Work across W projects

These achievements are really like milestones in life, except we have turned them into a game mechanic.

Achievements have three states:

type AchievementStatus = 'unlocked' | 'in-progress' | 'locked';

The three states have clear visual distinctions:

  • Unlocked: Gold gradient with a halo effect
  • In progress: Blue pulse animation
  • Locked: Gray, with unlock conditions shown

Each achievement clearly displays its trigger condition, so users know what to do next. When people feel lost, a little guidance always helps.

When an achievement is unlocked, a celebration animation is triggered. That kind of positive reinforcement gives users the satisfying feeling of “I did it” and motivates them to keep going. Small rewards in life work the same way: they may be small, but the happiness can last a long time.
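The three states above can be resolved with a simple threshold check: locked until progress starts, in progress until the target is met, unlocked afterwards. That threshold logic is an assumption for illustration; the `AchievementStatus` type comes from the snippet above.

```typescript
// Illustrative mapping from progress to the three achievement states.
type AchievementStatus = 'unlocked' | 'in-progress' | 'locked';

function resolveStatus(progress: number, target: number): AchievementStatus {
  if (progress >= target) return 'unlocked';   // gold gradient + halo
  if (progress > 0) return 'in-progress';      // blue pulse animation
  return 'locked';                             // gray, conditions shown
}
```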

Battle Report is one of HagiCode’s signature features. At the end of each day, it generates a full-screen battle-style report.

Battle Report displays the following information:

type HeroBattleReport = {
  reportDate: string;
  summary: {
    totalHeroCount: number;   // Total number of Heroes
    activeHeroCount: number;  // Number of active Heroes
    totalBattleScore: number; // Total battle score
    mvp: HeroBattleHero;      // Most valuable Hero
  };
  heroes: HeroBattleHero[];   // Detailed data for all Heroes
};
  • Total team score
  • Number of active Heroes
  • Number of tool calls
  • Total working time
  • MVP (Most Valuable Hero)
  • Detailed card for each Hero

The MVP is the best-performing Hero of the day and is highlighted in the report. This is not just data statistics, but a form of honor and recognition. After all, who does not want to be recognized?
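Selecting the MVP is essentially a reduce over the day's hero data. The sketch below assumes a `battleScore` field for ranking; the real `HeroBattleHero` shape may differ.

```typescript
// Hedged sketch: pick the highest-scoring Hero of the day as MVP.
// The battleScore field is an assumption about HeroBattleHero's shape.
type HeroBattleHero = { name: string; battleScore: number };

function pickMvp(heroes: HeroBattleHero[]): HeroBattleHero | undefined {
  return heroes.reduce<HeroBattleHero | undefined>(
    (best, hero) =>
      best === undefined || hero.battleScore > best.battleScore ? hero : best,
    undefined, // an empty day has no MVP
  );
}
```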

Each Hero card includes:

  • Level progress
  • XP gained
  • Number of executions
  • Usage time

These metrics help users clearly understand how the team is performing. Seeing the results of your own effort is satisfying in itself.

HagiCode’s gamification system uses a modern technology stack and design patterns. There is nothing especially magical about it; we just chose tools that fit the job.

// React + TypeScript for the frontend
import React from 'react';
// Framer Motion for animations
import { AnimatePresence, motion } from 'framer-motion';
// Redux Toolkit for state management
import { useAppDispatch, useAppSelector } from '@/store';
// shadcn/ui for UI components
import { Dialog, DialogContent } from '@/components/ui/dialog';

Framer Motion handles all animation effects, shadcn/ui provides the foundational UI components, and Redux Toolkit manages the complex gamification state. Good tools make good work.

HagiCode uses a Glassmorphism + Tech Dark design style:

/* Primary gradient */
background: linear-gradient(135deg, #22C55E 0%, #25c2a0 50%, #06b6d4 100%);
/* Glass effect */
backdrop-filter: blur(12px);
/* Glow effect */
background: radial-gradient(circle at center, rgba(34, 197, 94, 0.15) 0%, transparent 70%);

The green gradient combined with glassmorphism creates a technical, futuristic atmosphere. Visual beauty is part of the user experience too.

Framer Motion is used to create smooth entrance animations:

<motion.div
  animate={{ opacity: 1, y: 0 }}
  initial={{ opacity: 0, y: 18 }}
  transition={{ duration: 0.35, ease: 'easeOut', delay: index * 0.08 }}
  className="card"
>
  {/* Card content */}
</motion.div>

Each card enters one after another with a delay of 0.08 seconds, creating a fluid visual effect. Smooth animation improves the experience. That part is hard to argue with.

Gamification data is stored using the Grain storage system to ensure state consistency. Even fine-grained data like accumulated Hero XP can be persisted accurately. No one wants to lose the experience they worked hard to earn.

Creating your first Hero is actually quite simple:

  1. Go to the Hero management page
  2. Click the “Create Hero” button
  3. Configure the three slots (CLI, Model, Style)
  4. Give the Hero a name and description
  5. Save it, and your first Hero is born

It is like meeting a new friend: you give them a name, learn what makes them special, and then head off on an adventure together.

Building a team is also simple:

  1. Go to the Dungeon management page
  2. Choose the dungeon you want to configure, such as “Proposal Generation”
  3. Select members from your Hero list
  4. The system automatically selects the first enabled Hero as Captain
  5. Save the configuration

This is simply the process of forming a team, much like building a team in real life where everyone has their own role.

At the end of each day, you can view the day’s Battle Report:

  1. Click the “Battle Report” button
  2. View the day’s work results in a full-screen display
  3. Check the MVP and the detailed data for each Hero
  4. Share it with team members if you want

This is also a kind of ritual, a way to see how much effort you put in today and how far you still are from your goal.

Use React.memo to avoid unnecessary re-renders:

const HeroCard = React.memo(({ hero }: { hero: HeroDungeonMember }) => {
  // Component implementation
});

Performance matters too. No one wants to use a laggy tool.

Detect the user’s motion preference settings and provide a simplified experience for motion-sensitive users:

const prefersReducedMotion = useReducedMotion();
const duration = prefersReducedMotion ? 0 : 0.35;

Not everyone likes animation, and respecting user preferences is part of good design.

Keep legacyIds to support migration from older versions:

type HeroDungeonMember = {
  heroId: string;
  legacyIds?: string[]; // Supports legacy ID mapping
  // ...
};

No one wants to lose data just because of a version upgrade.

Use i18n translation keys for all text to make multi-language support easy:

const displayName = t(`dungeon.${scriptKey}`, { defaultValue: scriptKey });

Language should never be a barrier to using the product.

Gamification is not just a simple points leaderboard, but a complete incentive system. Through the Hero system, Dungeon system, XP and level system, achievement system, and Battle Report, HagiCode transforms programming work into a heroic journey full of adventure.

The core value of this system lies in:

  • Emotional connection: Giving cold AI assistants personality
  • Positive feedback: Every action produces immediate feedback
  • Long-term goals: Levels and achievements provide a growth path
  • Team identity: A sense of collaboration within Dungeon teams
  • Honor and recognition: Battle Report and MVP showcases

Gamification design makes programming no longer dull, but an interesting adventure. While completing coding tasks, users also experience the fun of character growth, team collaboration, and achievement unlocking, which improves retention and activity.

At its core, programming is already an act of creation. We just made the creative process a little more fun.

If this article helped you:


Thank you for reading. If you found this article useful, please click the like button below so more people can discover it.

This content was created with AI-assisted collaboration, reviewed by me, and reflects my own views and position.

Practical Multi-AI Provider Architecture in the HagiCode Platform

Practical Multi-AI Provider Architecture in the HagiCode Platform

Section titled “Practical Multi-AI Provider Architecture in the HagiCode Platform”

This article shares the technical approach we used under the Orleans Grain architecture to integrate two AI tools, iflow and OpenCode, through a unified IAIProvider interface, and compares the implementation differences between WebSocket and HTTP communication in detail.

There is nothing especially mysterious about it. While building HagiCode, we ran into a very practical problem: users wanted to work with different AI tools. That is hardly surprising, since everyone has their own habits. Some prefer Claude Code, some love GitHub Copilot, and some teams use tools they developed themselves.

Our initial solution was simple and direct: write dedicated integration code for each AI tool. But the drawbacks showed up quickly. The codebase filled up with if-else branches, every change required testing in multiple places, and every new tool meant writing another pile of logic from scratch.

Later, I realized it would be better to create a unified IAIProvider interface and abstract the capabilities shared by all AI providers. That way, no matter which tool is used underneath, the upper layers can call it in the same way.

Recently, the project needed to integrate two new tools: iflow and OpenCode. Both support the ACP protocol, but their communication styles are different. iflow uses WebSocket, while OpenCode uses an HTTP API. That became a useful architectural test: adapt two different transport modes behind one unified interface.

The approach shared in this article comes from our practical experience in the HagiCode project. HagiCode is an AI-assisted development platform built on the Orleans Grain architecture. It integrates with different AI providers through a unified IAIProvider interface, allowing users to flexibly choose the AI tools they prefer.

First, we defined the IAIProvider interface and abstracted the capabilities that every AI provider needs to implement:

public interface IAIProvider
{
    string Name { get; }
    bool SupportsStreaming { get; }
    ProviderCapabilities Capabilities { get; }

    Task<AIResponse> ExecuteAsync(AIRequest request, CancellationToken cancellationToken = default);
    IAsyncEnumerable<AIStreamingChunk> StreamAsync(AIRequest request, CancellationToken cancellationToken = default);
    Task<ProviderTestResult> PingAsync(CancellationToken cancellationToken = default);
    IAsyncEnumerable<AIStreamingChunk> SendMessageAsync(AIRequest request, string? embeddedCommandPrompt = null, CancellationToken cancellationToken = default);
}

This interface includes several key methods:

  • ExecuteAsync: execute a one-shot AI request
  • StreamAsync: get streaming responses for real-time display
  • PingAsync: perform a health check to verify whether the provider is available
  • SendMessageAsync: send a message with support for embedded commands

IFlowCliProvider: A WebSocket-Based Implementation

Section titled “IFlowCliProvider: A WebSocket-Based Implementation”

iflow uses WebSocket for ACP communication. The overall architecture looks like this:

IFlowCliProvider → ACPSessionManager → WebSocketAcpTransport → iflow CLI
Dynamic port allocation + process management

The core flow is also fairly straightforward:

  1. ACPSessionManager creates and manages ACP sessions.
  2. WebSocketAcpTransport handles WebSocket communication.
  3. A port is allocated dynamically, and the iflow process is started with iflow --experimental-acp --port.
  4. IAIRequestToAcpMapper and IAcpToAIResponseMapper convert requests and responses.

Here is the core code:

private async IAsyncEnumerable<AIStreamingChunk> StreamCoreAsync(
    AIRequest request,
    string? embeddedCommandPrompt,
    [EnumeratorCancellation] CancellationToken cancellationToken)
{
    // Resolve working directory
    var resolvedWorkingDirectory = ResolveWorkingDirectory(request);
    var effectiveRequest = ApplyEmbeddedCommandPrompt(request, embeddedCommandPrompt);

    // Create ACP session
    await using var session = await _sessionManager.CreateSessionAsync(
        Name,
        resolvedWorkingDirectory,
        cancellationToken,
        request.SessionId);

    // Send prompt
    var prompt = _requestMapper.ToPromptString(effectiveRequest);
    var promptResponse = await session.SendPromptAsync(prompt, cancellationToken);

    // Receive streaming response
    await foreach (var notification in session.ReceiveUpdatesAsync(cancellationToken))
    {
        if (_responseMapper.TryConvertToStreamingChunk(notification, out var chunk))
        {
            if (chunk.Type == StreamingChunkType.Metadata && chunk.IsComplete)
            {
                yield return chunk;
                yield break;
            }

            yield return chunk;
        }
    }
}

There are a few design points worth calling out here:

  • Use await using to ensure the session is released correctly and avoid resource leaks.
  • Return streaming responses through IAsyncEnumerable, which naturally supports async streams.
  • Use Metadata chunks to determine completion and ensure the full response has been received.
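To illustrate the last point, a caller can treat the Metadata completion chunk as the end-of-stream marker. This is a self-contained sketch with a stand-in chunk shape (a value tuple), not the real AIStreamingChunk type:

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading.Tasks;

// Accumulate content chunks until the provider signals completion
// with a Metadata chunk.
static async Task<string> CollectAsync(
    IAsyncEnumerable<(string Content, bool IsMetadata, bool IsComplete)> stream)
{
    var sb = new StringBuilder();
    await foreach (var chunk in stream)
    {
        if (chunk.IsMetadata && chunk.IsComplete)
            break; // completion marker: the full response has arrived
        sb.Append(chunk.Content);
    }
    return sb.ToString();
}

// Fake provider stream for demonstration.
static async IAsyncEnumerable<(string Content, bool IsMetadata, bool IsComplete)> FakeStream()
{
    await Task.Yield();
    yield return ("Hello, ", false, false);
    yield return ("world", false, false);
    yield return ("", true, true); // Metadata completion chunk
}

Console.WriteLine(await CollectAsync(FakeStream())); // Hello, world
```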

OpenCodeCliProvider: An HTTP API-Based Implementation

Section titled “OpenCodeCliProvider: An HTTP API-Based Implementation”

OpenCode provides its service through an HTTP API, so the architecture is slightly different:

OpenCodeCliProvider → OpenCodeRuntimeManager → OpenCodeClient → OpenCode HTTP API
OpenCodeProcessManager → opencode process management

A notable feature of OpenCode is that it uses an SQLite database to persist session bindings. That makes session recovery and prompt-response recovery possible:

private async Task<OpenCodePromptExecutionResult> ExecutePromptAsync(
    AIRequest request,
    string? embeddedCommandPrompt,
    CancellationToken cancellationToken)
{
    var prompt = BuildPrompt(request, embeddedCommandPrompt);
    var resolvedWorkingDirectory = ResolveWorkingDirectory(request.WorkingDirectory);
    var client = await _runtimeManager.GetClientAsync(resolvedWorkingDirectory, cancellationToken);
    var bindingSessionId = request.SessionId;
    var boundSession = TryGetBinding(bindingSessionId, resolvedWorkingDirectory);
    // Try to use the already bound session
    if (boundSession is not null)
    {
        try
        {
            return await PromptSessionAsync(
                client,
                boundSession,
                BuildPromptRequest(request, prompt, CreatePromptMessageId()),
                request.Model ?? _settings.Model,
                cancellationToken);
        }
        catch (OpenCodeApiException ex) when (IsStaleBinding(ex))
        {
            // The session has expired, remove the binding
            RemoveBinding(bindingSessionId);
        }
    }
    // Create a new session
    var session = await client.Session.CreateAsync(new OpenCodeSessionCreateRequest
    {
        Title = BuildSessionTitle(request)
    }, cancellationToken);
    BindSession(bindingSessionId, session.Id, resolvedWorkingDirectory);
    return await PromptSessionAsync(client, session.Id, ...);
}

This implementation has several interesting highlights:

  • Session binding mechanism: the same SessionId reuses the same OpenCode session, avoiding repeated session creation.
  • Expiration handling: when a session is found to be expired, the binding is automatically cleaned up.
  • Database persistence: bindings are stored in SQLite and remain effective after restart.

The two implementations compare as follows:

| Aspect | IFlowCliProvider | OpenCodeCliProvider |
| --- | --- | --- |
| Communication | WebSocket (ACP) | HTTP API |
| Process management | ACPSessionManager | OpenCodeProcessManager |
| Port allocation | Dynamic port | No port (uses HTTP) |
| Session management | ACPSession | OpenCodeSession |
| Persistence | In-memory cache | SQLite database |
| Startup command | iflow --experimental-acp --port | opencode |
| Latency | Lower (long-lived connection) | Higher (per-request HTTP) |

Which approach you choose depends mainly on your needs. WebSocket is better for scenarios with high real-time requirements, while an HTTP API is simpler and easier to debug.

First, enable the two providers in the configuration file:

AI:
  Providers:
    IFlowCli:
      Type: "IFlowCli"
      Enabled: true
      ExecutablePath: "iflow"
      Model: null
      WorkingDirectory: null
    OpenCodeCli:
      Type: "OpenCodeCli"
      Enabled: true
      ExecutablePath: "opencode"
      Model: "anthropic/claude-sonnet-4"
      WorkingDirectory: null
      OpenCode:
        Enabled: true
        BaseUrl: "http://localhost:38376"
        ExecutablePath: "opencode"
        StartupTimeoutSeconds: 30
        RequestTimeoutSeconds: 120
// Get provider through the factory
var provider = await _providerFactory.GetProviderAsync(AIProviderType.IFlowCli);
// Execute an AI request
var request = new AIRequest
{
    Prompt = "Please help me refactor this function",
    WorkingDirectory = "/path/to/project",
    Model = "claude-sonnet-4"
};
// Get the complete response
var response = await provider.ExecuteAsync(request, cancellationToken);
Console.WriteLine(response.Content);
// Or use streaming responses
await foreach (var chunk in provider.StreamAsync(request, cancellationToken))
{
    if (chunk.Type == StreamingChunkType.ContentDelta)
    {
        Console.Write(chunk.Content);
    }
}

// Get provider through the factory
var provider = await _providerFactory.GetProviderAsync(AIProviderType.OpenCodeCli);
var request = new AIRequest
{
    Prompt = "Please help me analyze this error",
    WorkingDirectory = "/path/to/project",
    Model = "anthropic/claude-sonnet-4"
};
var response = await provider.ExecuteAsync(request, cancellationToken);
Console.WriteLine(response.Content);

Before startup or before use, you can check whether the provider is available:

var iflowResult = await iflowProvider.PingAsync(cancellationToken);
if (!iflowResult.Success)
{
    Console.WriteLine($"IFlow is unavailable: {iflowResult.ErrorMessage}");
    return;
}
var openCodeResult = await openCodeProvider.PingAsync(cancellationToken);
if (!openCodeResult.Success)
{
    Console.WriteLine($"OpenCode is unavailable: {openCodeResult.ErrorMessage}");
    return;
}

Both providers support embedded commands, such as /file:xxx:

var request = new AIRequest
{
    Prompt = "Analyze the problems in this file",
    SystemMessage = "You are a code analysis expert"
};
await foreach (var chunk in provider.SendMessageAsync(
    request,
    embeddedCommandPrompt: "/file:src/main.cs",
    cancellationToken))
{
    Console.Write(chunk.Content);
}

IFlow uses long-lived WebSocket connections, so resource management deserves special attention:

  • Use await using to ensure sessions are released properly.
  • Cancellation triggers process cleanup.
  • ACPSessionManager supports a maximum session count limit.

OpenCode process management is relatively simpler, and OpenCodeRuntimeManager handles it automatically.

Both providers have complete error handling:

  • IFlow errors are propagated through ACP session updates.
  • OpenCode errors are thrown through OpenCodeApiException.
  • It is recommended that the caller catch and handle these exceptions.

On the performance side:

  • IFlow WebSocket communication has lower latency than HTTP.
  • OpenCode session reuse can reduce the overhead of HTTP requests.
  • The factory cache mechanism avoids repeatedly creating providers.
  • In high-concurrency scenarios, pay close attention to the limits on process count and connection count.

The executable path is validated at startup, but runtime issues can still happen. PingAsync is a useful tool for verifying whether the configuration is correct:

// Check at startup
var provider = await _providerFactory.GetProviderAsync(providerType);
var result = await provider.PingAsync(cancellationToken);
if (!result.Success)
{
    _logger.LogError("Provider {ProviderType} is unavailable: {Error}", providerType, result.ErrorMessage);
}

This article shares the technical approach used by the HagiCode platform when integrating the two AI tools iflow and OpenCode. Through a unified IAIProvider interface, we adapted different communication styles, WebSocket and HTTP, while keeping the upper-layer calling pattern consistent.

The core idea is actually quite simple:

  1. Define a unified interface abstraction.
  2. Build adapter layers for different implementations.
  3. Manage everything uniformly through the factory pattern.

That gives the system good extensibility: integrating a new AI tool later only requires implementing the IAIProvider interface, with little change to existing code.

If you are also working on multi-AI-tool integration, I hope this article is helpful.



HagiCode Multi-AI Provider Switching and Interoperability Implementation Plan

HagiCode Multi-AI Provider Switching and Interoperability Implementation Plan

Section titled “HagiCode Multi-AI Provider Switching and Interoperability Implementation Plan”

In the modern developer-tooling ecosystem, developers often need to use different AI coding assistants to support their work. Anthropic’s Claude Code CLI and OpenAI’s Codex CLI each have their own strengths: Claude is known for outstanding code understanding and long-context handling, while Codex excels at code generation and tool usage.

This article takes an in-depth look at how the HagiCode project achieves seamless switching and interoperability across multiple AI providers, including the core architectural design, key implementation details, and practical considerations.

The core challenge faced by the HagiCode project is supporting multiple AI CLIs on the same platform, so users can:

  1. Flexibly switch between AI providers based on their needs
  2. Maintain session continuity during provider switching
  3. Unify the API differences across different CLIs behind a common abstraction
  4. Reserve extension points for adding new AI providers in the future

Meeting these goals raises several technical challenges:

  1. Unifying interface differences: Claude Code CLI is invoked through command-line calls, while Codex CLI uses a JSON event stream
  2. Handling streaming responses: Both providers support streaming responses, but with different data formats
  3. Tool-calling semantics: Claude and Codex differ in how they represent tool calls and manage their lifecycle
  4. Session lifecycle: The system must correctly manage session creation, restoration, and termination for each provider

HagiCode uses the Provider Pattern combined with the Factory Pattern to abstract AI service invocation. The core ideas of this design are:

  1. Unified interface abstraction: Define the IAIProvider interface as the common abstraction for all AI providers
  2. Factory-created instances: Use AIProviderFactory to dynamically create the corresponding provider instance based on type
  3. Intelligent selection logic: Use AIProviderSelector to automatically select the most suitable provider based on scenario and configuration
  4. Session state management: Persist the binding relationship between sessions and CLI threads in the database

The main components involved:

| Component | Responsibility | Language |
| --- | --- | --- |
| IAIProvider | Unified provider interface | C# |
| AIProviderFactory | Create and manage provider instances | C# |
| AIProviderSelector | Select providers intelligently | C# |
| ClaudeCodeCliProvider | Claude Code CLI implementation | C# |
| CodexCliProvider | Codex CLI implementation | C# |
| AgentCliManager | Desktop-side CLI management | TypeScript |

The IAIProvider interface defines the unified provider abstraction:

public interface IAIProvider
{
    /// <summary>
    /// Provider display name
    /// </summary>
    string Name { get; }

    /// <summary>
    /// Whether streaming responses are supported
    /// </summary>
    bool SupportsStreaming { get; }

    /// <summary>
    /// Provider capability description
    /// </summary>
    ProviderCapabilities Capabilities { get; }

    /// <summary>
    /// Execute a single AI request
    /// </summary>
    Task<AIResponse> ExecuteAsync(AIRequest request, CancellationToken cancellationToken = default);

    /// <summary>
    /// Execute a streaming AI request
    /// </summary>
    IAsyncEnumerable<AIStreamingChunk> StreamAsync(AIRequest request, CancellationToken cancellationToken = default);

    /// <summary>
    /// Check provider connectivity and responsiveness
    /// </summary>
    Task<ProviderTestResult> PingAsync(CancellationToken cancellationToken = default);

    /// <summary>
    /// Send a message with an embedded command
    /// </summary>
    IAsyncEnumerable<AIStreamingChunk> SendMessageAsync(
        AIRequest request,
        string? embeddedCommandPrompt = null,
        CancellationToken cancellationToken = default);
}

Key characteristics of this interface design:

  • Unified request/response model: All providers use the same AIRequest and AIResponse types
  • Streaming support: Standardize streaming output through IAsyncEnumerable<AIStreamingChunk>
  • Capability description: ProviderCapabilities describes the features supported by the provider (streaming, tools, maximum tokens, and so on)
  • Embedded commands: SendMessageAsync supports embedding OpenSpec commands into prompts
The supported providers are modeled as an enum:

public enum AIProviderType
{
    ClaudeCodeCli,  // Anthropic Claude Code
    OpenCodeCli,    // Other CLIs (extensible)
    GitHubCopilot,  // GitHub Copilot
    CodebuddyCli,   // Codebuddy
    CodexCli        // OpenAI Codex
}

This enum provides a type-safe representation for all providers supported by the system.

The AIProviderFactory is responsible for creating and managing provider instances:

public class AIProviderFactory : IAIProviderFactory
{
    private readonly ConcurrentDictionary<AIProviderType, IAIProvider> _cache;
    private readonly IOptions<AIProviderOptions> _options;
    private readonly IServiceProvider _serviceProvider;

    public Task<IAIProvider?> GetProviderAsync(AIProviderType providerType)
    {
        // Use caching to avoid duplicate creation
        if (_cache.TryGetValue(providerType, out var cached))
            return Task.FromResult<IAIProvider?>(cached);
        // Get provider configuration from settings
        var aiOptions = _options.Value;
        if (!aiOptions.Providers.TryGetValue(providerType, out var config))
        {
            _logger.LogWarning("Provider '{ProviderType}' not found in configuration", providerType);
            return Task.FromResult<IAIProvider?>(null);
        }
        // Create provider by type
        var provider = providerType switch
        {
            AIProviderType.ClaudeCodeCli =>
                _serviceProvider.GetService(typeof(ClaudeCodeCliProvider)) as IAIProvider,
            AIProviderType.CodexCli =>
                _serviceProvider.GetService(typeof(CodexCliProvider)) as IAIProvider,
            AIProviderType.GitHubCopilot =>
                _serviceProvider.GetService(typeof(CopilotAIProvider)) as IAIProvider,
            _ => null
        };
        if (provider != null)
        {
            _cache[providerType] = provider;
        }
        return Task.FromResult<IAIProvider?>(provider);
    }
}

Advantages of the factory pattern:

  • Instance caching: Avoid repeatedly creating the same type of provider
  • Dependency injection: Create instances through IServiceProvider, with dependency injection support
  • Configuration-driven: Read provider settings from configuration files
  • Exception handling: Return null when creation fails, making it easier for upper layers to handle errors
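For the IServiceProvider resolution above to work, each concrete provider must be registered with the container. A hedged wiring sketch using standard Microsoft.Extensions.DependencyInjection calls; the exact registrations and configuration section name in HagiCode may differ:

```csharp
// Assumed wiring, not taken verbatim from the HagiCode source:
services.AddSingleton<ClaudeCodeCliProvider>();
services.AddSingleton<CodexCliProvider>();
services.AddSingleton<CopilotAIProvider>();
services.AddSingleton<IAIProviderFactory, AIProviderFactory>();
// Bind AIProviderOptions from configuration so the factory can read it
services.Configure<AIProviderOptions>(configuration.GetSection("AI"));
```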

The AIProviderSelector implements provider-selection strategies:

public class AIProviderSelector : IAIProviderSelector
{
    private readonly BusinessLayerConfiguration _configuration;
    private readonly IAIProviderFactory _providerFactory;
    private readonly IMemoryCache _cache;

    public async Task<AIProviderType> SelectProviderAsync(
        BusinessScenario scenario,
        CancellationToken cancellationToken = default)
    {
        // 1. Try getting a provider from scenario mapping
        if (_configuration.ScenarioProviderMapping.TryGetValue(scenario, out var providerType))
        {
            if (await IsProviderAvailableAsync(providerType, cancellationToken))
            {
                _logger.LogDebug("Selected provider '{Provider}' for scenario '{Scenario}'",
                    providerType, scenario);
                return providerType;
            }
            _logger.LogWarning("Configured provider '{Provider}' for scenario '{Scenario}' is not available",
                providerType, scenario);
        }
        // 2. Try the default provider
        if (await IsProviderAvailableAsync(_configuration.DefaultProvider, cancellationToken))
        {
            _logger.LogDebug("Using default provider '{Provider}' for scenario '{Scenario}'",
                _configuration.DefaultProvider, scenario);
            return _configuration.DefaultProvider;
        }
        // 3. Try the fallback chain
        foreach (var fallbackProvider in _configuration.FallbackChain)
        {
            if (await IsProviderAvailableAsync(fallbackProvider, cancellationToken))
            {
                _logger.LogInformation("Using fallback provider '{Provider}' for scenario '{Scenario}'",
                    fallbackProvider, scenario);
                return fallbackProvider;
            }
        }
        // 4. No available provider can be found
        throw new InvalidOperationException(
            $"No available AI provider found for scenario '{scenario}'");
    }

    public async Task<bool> IsProviderAvailableAsync(
        AIProviderType providerType,
        CancellationToken cancellationToken = default)
    {
        var cacheKey = $"provider_available_{providerType}";
        // Use caching to reduce Ping calls
        if (_configuration.EnableCache &&
            _cache.TryGetValue<bool>(cacheKey, out var cached))
        {
            return cached;
        }
        var provider = await _providerFactory.GetProviderAsync(providerType);
        var isAvailable = provider != null;
        if (_configuration.EnableCache && isAvailable)
        {
            _cache.Set(cacheKey, isAvailable,
                TimeSpan.FromSeconds(_configuration.CacheExpirationSeconds));
        }
        return isAvailable;
    }
}

Selector strategy:

  • Scenario mapping first: First check whether the business scenario has a specific provider mapping
  • Fallback to default provider: Use the default provider if scenario mapping fails
  • Fallback chain as a final safeguard: Try providers in the fallback chain one by one
  • Availability caching: Cache provider availability checks to reduce Ping calls
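The three-step order boils down to a small pure function. This is a simplified, synchronous sketch of our own, not the real selector; `available` stands in for the cached PingAsync-backed availability check:

```csharp
using System;
using System.Collections.Generic;

// Selection order: scenario mapping → default provider → fallback chain.
static string? SelectProvider(
    string scenario,
    IReadOnlyDictionary<string, string> scenarioMap,
    string defaultProvider,
    IReadOnlyList<string> fallbackChain,
    Func<string, bool> available)
{
    // 1. Scenario mapping first
    if (scenarioMap.TryGetValue(scenario, out var mapped) && available(mapped))
        return mapped;
    // 2. Fall back to the default provider
    if (available(defaultProvider))
        return defaultProvider;
    // 3. Walk the fallback chain
    foreach (var candidate in fallbackChain)
        if (available(candidate))
            return candidate;
    return null; // the real selector throws InvalidOperationException here
}

var map = new Dictionary<string, string> { ["CodeGeneration"] = "CodexCli" };
var chain = new[] { "CodexCli", "ClaudeCodeCli" };
// The mapped provider is down, so the default wins:
Console.WriteLine(SelectProvider("CodeGeneration", map, "ClaudeCodeCli", chain,
    p => p == "ClaudeCodeCli")); // ClaudeCodeCli
```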

5. Claude Code CLI Provider Implementation

Section titled “5. Claude Code CLI Provider Implementation”
public class ClaudeCodeCliProvider : IAIProvider
{
    private readonly ILogger<ClaudeCodeCliProvider> _logger;
    private readonly IClaudeStreamManager _streamManager;
    private readonly ProviderConfiguration _config;

    public string Name => "ClaudeCodeCli";
    public bool SupportsStreaming => true;
    public ProviderCapabilities Capabilities { get; }

    public async Task<AIResponse> ExecuteAsync(AIRequest request, CancellationToken cancellationToken = default)
    {
        _logger.LogInformation("Executing AI request with provider: {Provider}", Name);
        var sessionOptions = ClaudeRequestMapper.MapToSessionOptions(request, _config);
        var messages = _streamManager.SendMessageAsync(request.Prompt, sessionOptions, cancellationToken);
        var responseBuilder = new StringBuilder();
        ResultMessage? finalResult = null;
        await foreach (var streamMessage in messages)
        {
            switch (streamMessage.Message)
            {
                case ResultMessage result:
                    finalResult = result;
                    responseBuilder.Append(result.Result);
                    break;
            }
        }
        if (finalResult != null)
        {
            return ClaudeResponseMapper.MapToAIResponse(finalResult, Name);
        }
        return new AIResponse
        {
            Content = responseBuilder.ToString(),
            FinishReason = FinishReason.Unknown,
            Provider = Name
        };
    }
}

Characteristics of the Claude Code CLI provider:

  • Streaming manager integration: Use IClaudeStreamManager to communicate with the Claude CLI
  • CessionId session isolation: Use CessionId as the unique session identifier, distinct from the system sessionId
  • Working directory configuration: Support configuration of the working directory, permission mode, and more
  • Tool support: Support tool-permission settings such as AllowedTools and DisallowedTools
The CodexCliProvider consumes Codex's JSON event stream and maps it to the unified chunk model:

public class CodexCliProvider : IAIProvider
{
    private readonly ILogger<CodexCliProvider> _logger;
    private readonly CodexSettings _settings;
    private readonly ConcurrentDictionary<string, string> _sessionThreadBindings;

    public string Name => "CodexCli";
    public bool SupportsStreaming => true;
    public ProviderCapabilities Capabilities { get; }

    public async IAsyncEnumerable<AIStreamingChunk> StreamAsync(
        AIRequest request,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        _logger.LogInformation("Executing streaming AI request with provider: {Provider}", Name);
        var codex = CreateCodexClient();
        var thread = ResolveThread(codex, request);
        var currentTurn = 0;
        var activeToolCalls = new Dictionary<string, AIToolCallDelta>();
        await foreach (var threadEvent in thread.RunStreamedAsync(BuildPrompt(request), cancellationToken))
        {
            if (threadEvent is TurnStartedEvent)
            {
                currentTurn++;
            }
            switch (threadEvent)
            {
                case ItemCompletedEvent { Item: AgentMessageItem message }:
                    var messageText = message.Text ?? string.Empty;
                    yield return new AIStreamingChunk
                    {
                        Content = messageText,
                        Type = StreamingChunkType.ContentDelta,
                        IsComplete = false
                    };
                    break;
                case ItemStartedEvent or ItemUpdatedEvent or ItemCompletedEvent:
                    var toolChunk = BuildToolChunk(threadEvent, currentTurn);
                    if (toolChunk?.ToolCallDelta != null)
                    {
                        yield return toolChunk;
                    }
                    break;
                case TurnCompletedEvent turnCompleted:
                    activeToolCalls.Clear();
                    yield return new AIStreamingChunk
                    {
                        Content = string.Empty,
                        Type = StreamingChunkType.Metadata,
                        IsComplete = true,
                        Usage = MapUsage(turnCompleted.Usage)
                    };
                    break;
            }
        }
        BindSessionThread(request.SessionId, thread.Id);
    }

    private CodexThread ResolveThread(Codex codex, AIRequest request)
    {
        var sessionId = request.SessionId;
        // threadOptions (model, sandbox mode, working directory) is built from _settings; elided here
        // Check whether there is already a bound thread
        if (!string.IsNullOrWhiteSpace(sessionId) &&
            _sessionThreadBindings.TryGetValue(sessionId, out var threadId) &&
            !string.IsNullOrWhiteSpace(threadId))
        {
            _logger.LogInformation("Resuming Codex thread {ThreadId} for session {SessionId}", threadId, sessionId);
            return codex.ResumeThread(threadId, threadOptions);
        }
        _logger.LogInformation("Starting new Codex thread for session {SessionId}", sessionId ?? "(none)");
        return codex.StartThread(threadOptions);
    }
}

Characteristics of the Codex CLI provider:

  • JSON event-stream handling: Parse Codex JSON event streams (TurnStarted, ItemStarted, TurnCompleted, and so on)
  • Session-thread binding: Persist the binding between sessions and threads with an SQLite database
  • Thread reuse: Support resuming existing threads to maintain session continuity
  • Tool-call tracking: Track active tool-call state and correctly handle the tool lifecycle

Codex CLI uses an SQLite database to persist the binding between sessions and threads:

public class CodexCliProvider : IAIProvider
{
    private const int SessionThreadBindingRetentionDays = 30;
    private readonly ConcurrentDictionary<string, string> _sessionThreadBindings;
    private readonly string _sessionThreadBindingDatabaseConnectionString;
    private readonly string _sessionThreadBindingDatabasePath;

    private void BindSessionThread(string? sessionId, string? threadId)
    {
        if (string.IsNullOrWhiteSpace(sessionId) || string.IsNullOrWhiteSpace(threadId))
        {
            return;
        }
        // In-memory cache
        _sessionThreadBindings.AddOrUpdate(sessionId, threadId, (_, _) => threadId);
        // Persist to SQLite
        PersistSessionThreadBinding(sessionId, threadId);
    }

    private void PersistSessionThreadBinding(string sessionId, string threadId)
    {
        try
        {
            using var connection = new SqliteConnection(_sessionThreadBindingDatabaseConnectionString);
            connection.Open();
            using var upsertCommand = connection.CreateCommand();
            upsertCommand.CommandText =
                """
                INSERT INTO SessionThreadBindings (SessionId, ThreadId, CreatedAtUtc, UpdatedAtUtc)
                VALUES ($sessionId, $threadId, $createdAtUtc, $updatedAtUtc)
                ON CONFLICT(SessionId) DO UPDATE SET
                    ThreadId = excluded.ThreadId,
                    UpdatedAtUtc = excluded.UpdatedAtUtc;
                """;
            var nowUtc = DateTimeOffset.UtcNow.ToString("O");
            upsertCommand.Parameters.AddWithValue("$sessionId", sessionId);
            upsertCommand.Parameters.AddWithValue("$threadId", threadId);
            upsertCommand.Parameters.AddWithValue("$createdAtUtc", nowUtc);
            upsertCommand.Parameters.AddWithValue("$updatedAtUtc", nowUtc);
            upsertCommand.ExecuteNonQuery();
        }
        catch (Exception ex)
        {
            _logger.LogWarning(
                ex,
                "Failed to persist Codex session-thread binding for session {SessionId} to {DatabasePath}",
                sessionId,
                _sessionThreadBindingDatabasePath);
        }
    }

    private void LoadPersistedSessionThreadBindings()
    {
        using var connection = new SqliteConnection(_sessionThreadBindingDatabaseConnectionString);
        connection.Open();
        using var loadCommand = connection.CreateCommand();
        loadCommand.CommandText = "SELECT SessionId, ThreadId FROM SessionThreadBindings;";
        using var reader = loadCommand.ExecuteReader();
        while (reader.Read())
        {
            var sessionId = reader.GetString(0);
            var threadId = reader.GetString(1);
            _sessionThreadBindings[sessionId] = threadId;
        }
    }
}

Advantages of session-thread binding:

  • Session restoration: Previous sessions can be restored after a system restart
  • Thread reuse: The same session can reuse an existing Codex thread
  • Automatic cleanup: Bindings older than 30 days are cleaned up automatically
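A hedged sketch of what that 30-day cleanup can compute; the schema matches the snippet above, but the exact statement HagiCode runs is not shown in the article. Because UpdatedAtUtc is stored in round-trip ("O") format, UTC timestamps compare correctly as plain strings:

```csharp
using System;

const int RetentionDays = 30; // matches SessionThreadBindingRetentionDays
// "O" (round-trip) timestamps for UTC values sort lexicographically in time
// order, so a plain string comparison in SQL is sufficient.
var cutoffUtc = DateTimeOffset.UtcNow.AddDays(-RetentionDays).ToString("O");
var sql = "DELETE FROM SessionThreadBindings WHERE UpdatedAtUtc < $cutoffUtc;";
Console.WriteLine($"-- would run: {sql} with $cutoffUtc = {cutoffUtc}");
```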

hagicode-desktop manages CLI selection through AgentCliManager:

export enum AgentCliType {
  ClaudeCode = 'claude-code',
  Codex = 'codex',
  // Future extensions: other CLIs such as Aider and Cursor
}

export class AgentCliManager {
  private static readonly STORE_KEY = 'agentCliSelection';
  private static readonly EXECUTOR_TYPE_MAP: Record<AgentCliType, string> = {
    [AgentCliType.ClaudeCode]: 'ClaudeCodeCli',
    [AgentCliType.Codex]: 'CodexCli',
  };

  constructor(private store: any) {}

  async saveSelection(cliType: AgentCliType): Promise<void> {
    const selection: StoredAgentCliSelection = {
      cliType,
      isSkipped: false,
      selectedAt: new Date().toISOString(),
    };
    this.store.set(AgentCliManager.STORE_KEY, selection);
  }

  loadSelection(): StoredAgentCliSelection {
    return this.store.get(AgentCliManager.STORE_KEY, {
      cliType: null,
      isSkipped: false,
      selectedAt: null,
    });
  }

  getCommandName(cliType: AgentCliType): string {
    switch (cliType) {
      case AgentCliType.ClaudeCode:
        return 'claude';
      case AgentCliType.Codex:
        return 'codex';
      default:
        return 'claude';
    }
  }

  getExecutorType(cliType: AgentCliType | null): string {
    if (!cliType) return 'ClaudeCodeCli';
    // EXECUTOR_TYPE_MAP is static, so access it through the class
    return AgentCliManager.EXECUTOR_TYPE_MAP[cliType] || 'ClaudeCodeCli';
  }
}

Example desktop-side IPC handler:

ipcMain.handle('llm:call-api', async (event, manifestPath, region) => {
  if (!state.llmInstallationManager) {
    return { success: false, error: 'LLM Installation Manager not initialized' };
  }
  try {
    const prompt = await state.llmInstallationManager.loadPrompt(manifestPath, region);
    // Determine the CLI command based on the user's selection
    let commandName = 'claude';
    if (state.agentCliManager) {
      const selectedCliType = state.agentCliManager.getSelectedCliType();
      if (selectedCliType) {
        commandName = state.agentCliManager.getCommandName(selectedCliType);
      }
    }
    // Execute with the selected CLI
    const result = await state.llmInstallationManager.callApi(
      prompt.filePath,
      event.sender,
      commandName
    );
    return result;
  } catch (error) {
    return {
      success: false,
      error: error instanceof Error ? error.message : 'Unknown error'
    };
  }
});

9. Codex’s Internal Model Provider System

Section titled “9. Codex’s Internal Model Provider System”

Codex itself also supports multiple model providers via ModelProviderInfo configuration:

pub const OPENAI_PROVIDER_NAME: &str = "OpenAI";
pub const OLLAMA_OSS_PROVIDER_ID: &str = "ollama";
pub const LMSTUDIO_OSS_PROVIDER_ID: &str = "lmstudio";

pub fn built_in_model_providers() -> HashMap<String, ModelProviderInfo> {
    use ModelProviderInfo as P;
    [
        ("openai", P::create_openai_provider()),
        (OLLAMA_OSS_PROVIDER_ID, create_oss_provider(DEFAULT_OLLAMA_PORT, WireApi::Responses)),
        (LMSTUDIO_OSS_PROVIDER_ID, create_oss_provider(DEFAULT_LMSTUDIO_PORT, WireApi::Responses)),
    ]
    .into_iter()
    .map(|(k, v)| (k.to_string(), v))
    .collect()
}

pub struct ModelProviderInfo {
    pub name: String,
    pub base_url: Option<String>,
    pub env_key: Option<String>,
    pub query_params: Option<HashMap<String, String>>,
    pub http_headers: Option<HashMap<String, String>>,
    pub request_max_retries: Option<u64>,
    pub stream_max_retries: Option<u64>,
    pub stream_idle_timeout_ms: Option<u64>,
    pub requires_openai_auth: bool,
    pub supports_websockets: bool,
}

Codex model-provider support includes:

  • Built-in providers: OpenAI, Ollama, and LM Studio
  • Custom providers: Users can add custom providers in config.toml
  • Retry strategy: Configurable retry counts for requests and streams
  • WebSocket support: Some providers support WebSocket transport
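As a concrete illustration of the custom-provider option, an entry in Codex's config.toml looks roughly like the fragment below. The field names follow the ModelProviderInfo struct above; the provider id `local-ollama` and the values are illustrative, so check the Codex documentation for the exact schema:

```toml
[model_providers.local-ollama]
name = "Ollama"
base_url = "http://localhost:11434/v1"
# env_key names the environment variable holding the API key, if one is needed
# env_key = "OLLAMA_API_KEY"
request_max_retries = 4
stream_max_retries = 5
```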

Configure multiple providers in appsettings.json:

{
  "AI": {
    "Providers": {
      "DefaultProvider": "ClaudeCodeCli",
      "Providers": {
        "ClaudeCodeCli": {
          "Type": "ClaudeCodeCli",
          "Model": "claude-sonnet-4-20250514",
          "WorkingDirectory": "/path/to/workspace",
          "PermissionMode": "acceptEdits",
          "AllowedTools": ["file-edit", "command-run", "bash"]
        },
        "CodexCli": {
          "Type": "CodexCli",
          "Model": "gpt-4.1",
          "ExecutablePath": "codex",
          "SandboxMode": "enabled",
          "WebSearchMode": "auto",
          "NetworkAccessEnabled": false
        }
      },
      "ScenarioProviderMapping": {
        "CodeAnalysis": "ClaudeCodeCli",
        "CodeGeneration": "CodexCli",
        "Refactoring": "ClaudeCodeCli",
        "Debugging": "CodexCli"
      },
      "FallbackChain": ["CodexCli", "ClaudeCodeCli"]
    },
    "Selector": {
      "EnableCache": true,
      "CacheExpirationSeconds": 300
    }
  }
}
With the pieces above in place, an orchestrator combines the selector and the factory:

public class AIOrchestrator
{
    private readonly IAIProviderFactory _providerFactory;
    private readonly IAIProviderSelector _providerSelector;
    private readonly ILogger<AIOrchestrator> _logger;

    public AIOrchestrator(
        IAIProviderFactory providerFactory,
        IAIProviderSelector providerSelector,
        ILogger<AIOrchestrator> logger)
    {
        _providerFactory = providerFactory;
        _providerSelector = providerSelector;
        _logger = logger;
    }

    public async Task<AIResponse> ProcessRequestAsync(
        AIRequest request,
        BusinessScenario scenario)
    {
        _logger.LogInformation("Processing request for scenario: {Scenario}", scenario);
        try
        {
            // Select a provider intelligently
            var providerType = await _providerSelector.SelectProviderAsync(scenario, request.CancellationToken);
            // Get the provider instance
            var provider = await _providerFactory.GetProviderAsync(providerType);
            if (provider == null)
            {
                throw new InvalidOperationException($"Provider {providerType} not available");
            }
            _logger.LogInformation("Using provider: {Provider} for request", provider.Name);
            // Execute the request
            var response = await provider.ExecuteAsync(request, request.CancellationToken);
            _logger.LogInformation("Request completed with provider: {Provider}, tokens used: {Tokens}",
                provider.Name,
                response.Usage?.TotalTokens ?? 0);
            return response;
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "Failed to process request for scenario: {Scenario}", scenario);
            throw;
        }
    }
}
public async IAsyncEnumerable<AIStreamingChunk> StreamResponseAsync(
    AIRequest request,
    BusinessScenario scenario)
{
    var providerType = await _providerSelector.SelectProviderAsync(scenario);
    var provider = await _providerFactory.GetProviderAsync(providerType);
    if (provider == null)
    {
        throw new InvalidOperationException($"Provider {providerType} not available");
    }
    await foreach (var chunk in provider.StreamAsync(request))
    {
        // Process streaming chunks
        switch (chunk.Type)
        {
            case StreamingChunkType.ContentDelta:
                // Show text content in real time
                await SendToClientAsync(chunk.Content);
                break;
            case StreamingChunkType.ToolCallDelta:
                // Handle tool calls
                await HandleToolCallAsync(chunk.ToolCallDelta);
                break;
            case StreamingChunkType.Metadata:
                // Handle completion events and stats
                if (chunk.IsComplete)
                {
                    _logger.LogInformation("Stream completed, usage: {@Usage}", chunk.Usage);
                }
                break;
            case StreamingChunkType.Error:
                // Handle errors
                _logger.LogError("Stream error: {Error}", chunk.ErrorMessage);
                throw new InvalidOperationException(chunk.ErrorMessage);
        }
    }
}
public async Task<string> ExecuteOpenSpecCommandAsync(
    string command,
    string arguments,
    BusinessScenario scenario)
{
    var providerType = await _providerSelector.SelectProviderAsync(scenario);
    var provider = await _providerFactory.GetProviderAsync(providerType);
    // Build an embedded command prompt
    var commandPrompt = $"""
        Execute the following OpenSpec command:
        Command: {command}
        Arguments: {arguments}
        Please execute this command and return the results.
        """;
    var request = new AIRequest
    {
        Prompt = "Process this command request",
        EmbeddedCommandPrompt = commandPrompt,
        WorkingDirectory = Directory.GetCurrentDirectory()
    };
    // SendMessageAsync returns a stream, so accumulate the chunks into text
    var builder = new StringBuilder();
    await foreach (var chunk in provider.SendMessageAsync(request, commandPrompt))
    {
        builder.Append(chunk.Content);
    }
    return builder.ToString();
}

Before switching providers, it is recommended to call PingAsync first to ensure the target provider is available:

public async Task<bool> IsProviderHealthyAsync(AIProviderType providerType)
{
    var provider = await _providerFactory.GetProviderAsync(providerType);
    if (provider == null) return false;
    var testResult = await provider.PingAsync();
    return testResult.Success &&
           testResult.ResponseTimeMs < 5000; // A response within 5 seconds is considered healthy
}

Use CessionId (Claude) or ThreadId (Codex) to ensure session isolation:

  • Claude Code CLI: use CessionId as the unique session identifier
  • Codex CLI: use ThreadId as the session identifier
// Claude Code CLI session options
var claudeSessionOptions = new ClaudeSessionOptions
{
    CessionId = CessionId.New(), // Generate a unique ID
    WorkingDirectory = workspacePath,
    AllowedTools = allowedTools,
    PermissionMode = PermissionMode.acceptEdits
};

// Codex thread options
var codexThreadOptions = new ThreadOptions
{
    Model = "gpt-4.1",
    SandboxMode = "enabled",
    WorkingDirectory = workspacePath
};

Fallback mechanisms must be robust when a provider is unavailable, ensuring that at least one provider remains usable:

public async Task<AIResponse> ExecuteWithFallbackAsync(
    AIRequest request,
    List<AIProviderType> preferredProviders)
{
    Exception? lastException = null;
    foreach (var providerType in preferredProviders)
    {
        try
        {
            var provider = await _providerFactory.GetProviderAsync(providerType);
            if (provider == null) continue;

            // Try execution
            return await provider.ExecuteAsync(request);
        }
        catch (Exception ex)
        {
            _logger.LogWarning(ex, "Provider {ProviderType} failed, trying next", providerType);
            lastException = ex;
        }
    }

    // All providers failed
    throw new InvalidOperationException(
        "All preferred providers failed. Last error: " + lastException?.Message,
        lastException);
}

Validate settings for all configured providers at startup to avoid runtime errors:

public void ValidateConfiguration(AIProviderOptions options)
{
    foreach (var (providerType, config) in options.Providers)
    {
        // Validate executable paths (for CLI-based providers)
        if (IsCliBasedProvider(providerType))
        {
            if (string.IsNullOrWhiteSpace(config.ExecutablePath))
            {
                throw new ConfigurationException(
                    $"Provider {providerType} requires ExecutablePath");
            }
            if (!File.Exists(config.ExecutablePath))
            {
                throw new ConfigurationException(
                    $"Executable not found for {providerType}: {config.ExecutablePath}");
            }
        }

        // Validate API keys (for API-based providers)
        if (IsApiBasedProvider(providerType))
        {
            if (string.IsNullOrWhiteSpace(config.ApiKey))
            {
                throw new ConfigurationException(
                    $"Provider {providerType} requires ApiKey");
            }
        }

        // Validate model names
        if (string.IsNullOrWhiteSpace(config.Model))
        {
            _logger.LogWarning("No model configured for {ProviderType}, using default", providerType);
        }
    }
}

Provider instances are cached, so pay attention to lifecycle management and memory usage:

// Clean up the cache periodically
public void ClearInactiveProviders(TimeSpan inactiveThreshold)
{
    var now = DateTimeOffset.UtcNow;
    var keysToRemove = new List<AIProviderType>();

    foreach (var (type, instance) in _cache)
    {
        // Assume providers have a LastUsedTime property
        if (instance.LastUsedTime.HasValue &&
            now - instance.LastUsedTime.Value > inactiveThreshold)
        {
            keysToRemove.Add(type);
        }
    }

    foreach (var key in keysToRemove)
    {
        _cache.TryRemove(key, out _);
        _logger.LogInformation("Cleared inactive provider: {Provider}", key);
    }
}

Log provider selection, switching, and execution in detail to make debugging easier:

public class AIProviderLogging
{
    private readonly ILogger _logger;

    public void LogProviderSelection(
        BusinessScenario scenario,
        AIProviderType selectedProvider,
        SelectionReason reason)
    {
        _logger.LogInformation(
            "[ProviderSelection] Scenario={Scenario}, Provider={Provider}, Reason={Reason}",
            scenario,
            selectedProvider,
            reason);
    }

    public void LogProviderSwitch(
        AIProviderType fromProvider,
        AIProviderType toProvider,
        string reason)
    {
        _logger.LogWarning(
            "[ProviderSwitch] From={FromProvider} To={ToProvider}, Reason={Reason}",
            fromProvider,
            toProvider,
            reason);
    }

    public void LogProviderError(
        AIProviderType provider,
        Exception error,
        AIRequest request)
    {
        _logger.LogError(error,
            "[ProviderError] Provider={Provider}, RequestLength={Length}, Error={Message}",
            provider,
            request.Prompt.Length,
            error.Message);
    }
}

Concurrent collections such as ConcurrentDictionary give us lock-free thread-safe reads, while a write lock guards provider creation so each instance is built only once:

public class ThreadSafeProviderCache
{
    private readonly ConcurrentDictionary<AIProviderType, IAIProvider> _cache;
    private readonly ReaderWriterLockSlim _lock = new();

    public IAIProvider? GetProvider(AIProviderType type)
    {
        // Read operations do not require a lock
        if (_cache.TryGetValue(type, out var provider))
            return provider;

        // Creation requires a write lock
        _lock.EnterWriteLock();
        try
        {
            // Double-check
            if (_cache.TryGetValue(type, out provider))
                return provider;

            var newProvider = CreateProvider(type);
            if (newProvider != null)
            {
                _cache[type] = newProvider;
            }
            return newProvider;
        }
        finally
        {
            _lock.ExitWriteLock();
        }
    }
}

When the session-thread binding database schema changes, data migration must be considered:

public class SessionThreadMigration
{
    public async Task MigrateAsync(string dbPath)
    {
        var version = await GetSchemaVersionAsync(dbPath);
        if (version >= 2) return; // Already the latest version

        // Note: SqliteConnection expects a connection string, not a bare file path
        using var connection = new SqliteConnection($"Data Source={dbPath}");
        connection.Open();

        // Migrate to v2: add the CreatedAtUtc column
        if (version < 2)
        {
            _logger.LogInformation("Migrating SessionThreadBindings to v2...");

            using var addColumnCommand = connection.CreateCommand();
            addColumnCommand.CommandText = "ALTER TABLE SessionThreadBindings ADD COLUMN CreatedAtUtc TEXT;";
            addColumnCommand.ExecuteNonQuery();

            using var backfillCommand = connection.CreateCommand();
            backfillCommand.CommandText =
                """
                UPDATE SessionThreadBindings
                SET CreatedAtUtc = COALESCE(NULLIF(UpdatedAtUtc, ''), $nowUtc)
                WHERE CreatedAtUtc IS NULL OR CreatedAtUtc = '';
                """;
            backfillCommand.Parameters.AddWithValue("$nowUtc", DateTimeOffset.UtcNow.ToString("O"));
            backfillCommand.ExecuteNonQuery();
        }

        await UpdateSchemaVersionAsync(dbPath, 2);
        _logger.LogInformation("Migration to v2 completed");
    }
}

HagiCode combines the provider pattern, factory pattern, and selector pattern to implement a flexible and extensible multi-AI provider architecture:

  • Unified interface abstraction: The IAIProvider interface hides the differences between CLIs
  • Dynamic instance creation: AIProviderFactory supports runtime creation of provider instances
  • Intelligent selection strategy: AIProviderSelector implements scenario-driven provider selection
  • Session state persistence: Database bindings ensure session continuity
  • Desktop integration: AgentCliManager supports user selection and configuration
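The interplay of the three patterns can be sketched in a few lines of TypeScript. This is an illustrative sketch only: the class names, scenario values, and the selection mapping here are stand-ins for HagiCode's actual C# types, not the real implementation.

```typescript
// Hypothetical mirror of the architecture: interface + factory + selector.
type ProviderType = "ClaudeCode" | "Codex";
type Scenario = "CodeReview" | "SpecExecution";

interface AIProvider {
  execute(prompt: string): Promise<string>;
}

class ProviderFactory {
  private cache = new Map<ProviderType, AIProvider>();
  constructor(private create: (t: ProviderType) => AIProvider) {}

  get(type: ProviderType): AIProvider {
    let provider = this.cache.get(type);
    if (!provider) {
      provider = this.create(type); // create on first use, then reuse
      this.cache.set(type, provider);
    }
    return provider;
  }
}

class ProviderSelector {
  // Scenario-driven selection: each business scenario maps to a provider.
  select(scenario: Scenario): ProviderType {
    return scenario === "CodeReview" ? "ClaudeCode" : "Codex";
  }
}
```

The point of the sketch is the division of labor: the interface hides CLI differences, the factory owns instance lifetime, and the selector owns the scenario-to-provider policy.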

The advantages of this architecture are:

  1. Extensibility: Adding a new AI provider only requires implementing the IAIProvider interface
  2. Testability: Providers can be tested and mocked independently
  3. Maintainability: Each provider implementation is isolated and has a single responsibility
  4. User-friendliness: Supports both scenario-based automatic selection and manual switching

With this design, HagiCode successfully enables seamless switching and interoperability between Claude Code CLI and Codex CLI, giving developers a flexible and powerful AI coding assistant experience.


Thank you for reading. If you found this article useful, please click the like button below 👍 so more people can discover it.

This content was created with AI-assisted collaboration, reviewed by me, and reflects my own views and position.

Guide to Implementing Hotword Support for Doubao Speech Recognition


This article explains in detail how to implement hotword support for Doubao speech recognition in the HagiCode project. By using both custom hotwords and platform hotword tables, you can significantly improve recognition accuracy for domain-specific vocabulary.

Speech recognition technology has developed for many years, yet one problem has consistently bothered developers. General-purpose speech recognition models can cover everyday language, but they often fall short when it comes to professional terminology, product names, and personal names. Think about it: a voice assistant in the medical field needs to accurately recognize terms like “hypertension,” “diabetes,” and “coronary heart disease”; a legal system needs to precisely capture terms such as “cause of action,” “defense,” and “burden of proof.” In these scenarios, a general-purpose model is trying its best, but that is often not enough.

We ran into the same challenge in the HagiCode project. As a multifunctional AI coding assistant, HagiCode needs to handle speech recognition for a wide range of technical terminology. However, the Doubao speech recognition API, in its default configuration, could not fully meet our accuracy requirements for specialized terms. It is not that Doubao is not good enough; rather, every domain has its own terminology system. After some research and technical exploration, we found that the Doubao speech recognition API actually provides hotword support. With a straightforward configuration, it can significantly improve the recognition accuracy of specific vocabulary. In a sense, once you tell it which words to pay attention to, it listens for them more carefully.

What this article shares is the complete solution we used in the HagiCode project to implement Doubao speech recognition hotwords. Both modes, custom hotwords and platform hotword tables, are available, and they can also be combined. With this solution, developers can flexibly configure hotwords based on business scenarios so the speech recognition system can better “recognize” professional, uncommon, yet critical vocabulary.

The solution shared in this article comes from our practical experience in the HagiCode project. HagiCode is an open-source AI coding assistant project with a modern technology stack, designed to provide developers with an intelligent programming assistance experience. As a complex multilingual, multi-platform project, HagiCode needs to handle speech recognition scenarios involving many technical terms, which in turn drove our research into and implementation of the hotword feature.

If you are interested in HagiCode’s technical implementation, you can visit the GitHub repository for more details, or check out our official documentation for the complete installation and usage guide.

The Doubao speech recognition API provides two ways to configure hotwords, and each one has its own ideal use cases and advantages.

Custom hotword mode lets us pass hotword text directly through the corpus.context field. This approach is especially suitable for scenarios where you need to quickly configure a small number of hotwords, such as temporarily recognizing a product name or a person’s name. In HagiCode’s implementation, we parse the multi-line hotword text entered by the user into a list of strings, then format it into the context_data array required by the Doubao API. This approach is very direct: you simply tell the system which words to pay attention to, and it does exactly that.

Platform hotword table mode uses the corpus.boosting_table_id field to reference a preconfigured hotword table in the Doubao self-learning platform. This approach is suitable for scenarios where you need to manage a large number of hotwords. We can create and maintain hotword tables on the Doubao self-learning platform, then reference them by ID. For a project like HagiCode, where specialized terms need to be continuously updated and maintained, this mode offers much better manageability. Once the number of hotwords grows, having a centralized place to manage them is far better than entering them manually every time.

Interestingly, these two modes can also be used together. The Doubao API supports including both custom hotwords and a platform hotword table ID in the same request, with the combination strategy controlled by the combine_mode parameter. This flexibility allows HagiCode to handle a wide range of complex professional terminology recognition needs. Sometimes, combining multiple approaches produces better results.
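As a rough sketch of what the combined payload looks like, the following TypeScript builds the corpus object for both sources. The field names (context, context_type, context_data, boosting_table_id) come from the snippets in this article; the exact wire format of the combine-mode flag is not shown here and should be confirmed against the Doubao API documentation.

```typescript
// Sketch: building the `corpus` field when both hotword sources are configured.
interface CorpusPayload {
  context?: {
    context_type: string;
    context_data: { text: string }[];
  };
  boosting_table_id?: string;
}

function buildCorpus(hotwords: string[], tableId: string | null): CorpusPayload {
  const corpus: CorpusPayload = {};
  if (hotwords.length > 0) {
    corpus.context = {
      context_type: "dialog_ctx",
      // Each hotword becomes a { text } entry, as the Doubao API expects
      context_data: hotwords.map(text => ({ text })),
    };
  }
  if (tableId) {
    corpus.boosting_table_id = tableId;
  }
  return corpus; // empty object when neither source is configured
}
```

When only one source is configured, the other field is simply absent, which is what keeps the request backward compatible.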

In HagiCode’s frontend implementation, we defined a complete set of hotword configuration types and validation logic. The first part is the type definition:

export interface HotwordConfig {
  contextText: string; // Multi-line hotword text
  boostingTableId: string; // Doubao platform hotword table ID
  combineMode: boolean; // Whether to use both together
}

This simple interface contains all configuration items for the hotword feature. Among them, contextText is the part users interact with most directly: we allow users to enter one hotword phrase per line, which is very intuitive. Asking users to enter one term per line is much easier than making them understand a complicated configuration format.
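The later usage snippet in this article imports a parseContextText helper; a minimal version, consistent with the backend's Split/Trim/Where chain shown further down, might look like this (a sketch, not the exact HagiCode source):

```typescript
// Split multi-line input into trimmed, non-empty hotword entries.
export function parseContextText(contextText: string): string[] {
  return contextText
    .split('\n')
    .map(line => line.trim())
    .filter(line => line.length > 0);
}
```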

Next comes the validation function. Based on the Doubao API limitations, we defined strict validation rules: at most 100 lines of hotword text, up to 50 characters per line, and no more than 5000 characters in total; boosting_table_id can be at most 200 characters and may contain only letters, numbers, underscores, and hyphens. These limits are not arbitrary; they come directly from the official Doubao documentation. API limits are API limits, and we have to follow them.

export function validateContextText(contextText: string): HotwordValidationResult {
  if (!contextText || contextText.trim().length === 0) {
    return { isValid: true, errors: [] };
  }

  const lines = contextText.split('\n').filter(line => line.trim().length > 0);
  const errors: string[] = [];

  if (lines.length > 100) {
    errors.push(`Hotword line count cannot exceed 100 lines; current count is ${lines.length}`);
  }

  const totalChars = contextText.length;
  if (totalChars > 5000) {
    errors.push(`Total hotword character count cannot exceed 5000; current count is ${totalChars}`);
  }

  for (let i = 0; i < lines.length; i++) {
    if (lines[i].length > 50) {
      errors.push(`Hotword on line ${i + 1} exceeds the 50-character limit`);
    }
  }

  return { isValid: errors.length === 0, errors };
}

export function validateBoostingTableId(boostingTableId: string): HotwordValidationResult {
  if (!boostingTableId || boostingTableId.trim().length === 0) {
    return { isValid: true, errors: [] };
  }

  const errors: string[] = [];
  if (boostingTableId.length > 200) {
    errors.push(`boosting_table_id cannot exceed 200 characters; current count is ${boostingTableId.length}`);
  }
  if (!/^[a-zA-Z0-9_-]+$/.test(boostingTableId)) {
    errors.push('boosting_table_id can contain only letters, numbers, underscores, and hyphens');
  }

  return { isValid: errors.length === 0, errors };
}

These validation functions run immediately when the user configures hotwords, ensuring that problems are caught as early as possible. From a user experience perspective, this kind of instant feedback is very important. It is always better for users to know what is wrong while they are typing rather than after they submit.

In HagiCode’s frontend implementation, we chose to use the browser’s localStorage to store hotword configuration. There were several considerations behind this design decision. First, hotword configuration is highly personalized, and different users may have different domain-specific needs. Second, this approach simplifies the backend implementation because it does not require extra database tables or API endpoints. Finally, after users configure it once in the browser, the settings can be loaded automatically on subsequent uses, which is very convenient. Put simply, it is the easiest approach.

const HOTWORD_STORAGE_KEYS = {
  contextText: 'hotword-context-text',
  boostingTableId: 'hotword-boosting-table-id',
  combineMode: 'hotword-combine-mode',
} as const;

export const DEFAULT_HOTWORD_CONFIG: HotwordConfig = {
  contextText: '',
  boostingTableId: '',
  combineMode: false,
};

// Load hotword configuration
export function loadHotwordConfig(): HotwordConfig {
  const contextText = localStorage.getItem(HOTWORD_STORAGE_KEYS.contextText) || '';
  const boostingTableId = localStorage.getItem(HOTWORD_STORAGE_KEYS.boostingTableId) || '';
  const combineMode = localStorage.getItem(HOTWORD_STORAGE_KEYS.combineMode) === 'true';
  return { contextText, boostingTableId, combineMode };
}

// Save hotword configuration
export function saveHotwordConfig(config: HotwordConfig): void {
  localStorage.setItem(HOTWORD_STORAGE_KEYS.contextText, config.contextText);
  localStorage.setItem(HOTWORD_STORAGE_KEYS.boostingTableId, config.boostingTableId);
  localStorage.setItem(HOTWORD_STORAGE_KEYS.combineMode, String(config.combineMode));
}

The logic in this code is straightforward and clear. We read from localStorage when loading configuration, and write to localStorage when saving it. We also provide a default configuration so the system can still work properly when no configuration exists yet. There has to be a sensible default, after all.

In HagiCode’s backend implementation, we needed to add hotword-related properties to the SDK configuration class. Taking C# language characteristics and usage patterns into account, we used List<string> to store custom hotword contexts:

public class DoubaoVoiceConfig
{
    /// <summary>
    /// App ID
    /// </summary>
    public string AppId { get; set; } = string.Empty;

    /// <summary>
    /// Access token
    /// </summary>
    public string AccessToken { get; set; } = string.Empty;

    /// <summary>
    /// Service URL
    /// </summary>
    public string ServiceUrl { get; set; } = string.Empty;

    /// <summary>
    /// Custom hotword context list
    /// </summary>
    public List<string>? HotwordContexts { get; set; }

    /// <summary>
    /// Doubao platform hotword table ID
    /// </summary>
    public string? BoostingTableId { get; set; }
}

The design of this configuration class follows HagiCode’s usual concise style. HotwordContexts is a nullable list type, and BoostingTableId is a nullable string, so when there is no hotword configuration, these properties have no effect on the request at all. If you are not using the feature, it should stay out of the way.

Payload construction is the core of the entire hotword feature. Once we have hotword configuration, we need to format it into the JSON structure required by the Doubao API. This process happens before the SDK sends the request:

private void AddCorpusToRequest(Dictionary<string, object> request)
{
    var corpus = new Dictionary<string, object>();

    // Add custom hotwords
    if (Config.HotwordContexts != null && Config.HotwordContexts.Count > 0)
    {
        corpus["context"] = new Dictionary<string, object>
        {
            ["context_type"] = "dialog_ctx",
            ["context_data"] = Config.HotwordContexts
                .Select(text => new Dictionary<string, object> { ["text"] = text })
                .ToList()
        };
    }

    // Add platform hotword table ID
    if (!string.IsNullOrEmpty(Config.BoostingTableId))
    {
        corpus["boosting_table_id"] = Config.BoostingTableId;
    }

    // Add corpus to the request only when it is not empty
    if (corpus.Count > 0)
    {
        request["corpus"] = corpus;
    }
}

This code shows how to dynamically construct the corpus field based on configuration. The key point is that we add the corpus field only when hotword configuration actually exists. This design ensures backward compatibility: when no hotwords are configured, the request structure remains exactly the same as before. Backward compatibility matters; adding a feature should not disrupt existing logic.

Between the frontend and backend, hotword parameters are passed through WebSocket control messages. HagiCode is designed so that when the frontend starts recording, it loads the hotword configuration from localStorage and sends it to the backend through a WebSocket message.

const controlMessage = {
  type: 'control',
  payload: {
    command: 'StartRecognition',
    contextText: '高血压\n糖尿病\n冠心病',
    boosting_table_id: 'medical_table',
    combineMode: false
  }
};

There is one detail to note here: the frontend passes multi-line text separated by newline characters, and the backend needs to parse it. The backend WebSocket handler parses these parameters and passes them to the SDK:

private async Task HandleControlMessageAsync(
    string connectionId,
    DoubaoSession session,
    ControlMessage message)
{
    if (message.Payload is SessionControlRequest controlRequest)
    {
        // Parse hotword parameters
        string? contextText = controlRequest.ContextText;
        string? boostingTableId = controlRequest.BoostingTableId;
        bool? combineMode = controlRequest.CombineMode;

        // Parse multi-line text into a hotword list
        if (!string.IsNullOrEmpty(contextText))
        {
            var hotwords = contextText
                .Split('\n', StringSplitOptions.RemoveEmptyEntries)
                .Select(s => s.Trim())
                .Where(s => s.Length > 0)
                .ToList();
            session.HotwordContexts = hotwords;
        }

        session.BoostingTableId = boostingTableId;
    }
}

With this design, passing hotword configuration from frontend to backend becomes clear and efficient. There is nothing especially mysterious about it; the data is simply passed through layer by layer.

In real usage, configuring custom hotwords is very simple. Open the speech recognition settings page in HagiCode and find the “Hotword Configuration” section. In the “Custom Hotword Text” input box, enter one hotword phrase per line.

For example, if you are developing a medical-related application, you could configure it like this:

高血压
糖尿病
冠心病
心绞痛
心肌梗死
心力衰竭

After you save the configuration, these hotwords are automatically passed to the Doubao API every time speech recognition starts. In our tests, once hotwords were configured, recognition accuracy for related professional terms improved noticeably, a real and clearly audible improvement over the default behavior.

If you need to manage a large number of hotwords, or if the hotwords need frequent updates, the platform hotword table mode is a better fit. First, create a hotword table on the Doubao self-learning platform and obtain the generated boosting_table_id, then enter this ID on the HagiCode settings page.

The Doubao self-learning platform provides capabilities such as bulk import and categorized management for hotwords, which is very practical for teams that need to manage large sets of specialized terminology. By managing hotwords on the platform, you can maintain them centrally and roll out updates consistently. Once the hotword list becomes large, having a single place to manage it is much more practical than manual entry every time.

In some complex scenarios, you may need to use both custom hotwords and a platform hotword table at the same time. In that case, simply configure both in HagiCode and enable the “Combination Mode” switch.

In combination mode, the Doubao API considers both hotword sources at the same time, so recognition accuracy is usually higher than using either source alone. However, it is worth noting that combination mode increases request complexity, so it is best to decide whether to enable it after practical testing. More complexity is only worth it if the real-world results justify it.

Integrating the hotword feature into the HagiCode project is very straightforward. Here are some commonly used code snippets:

import {
  loadHotwordConfig,
  saveHotwordConfig,
  validateHotwordConfig,
  parseContextText,
  getEffectiveHotwordMode,
  type HotwordConfig
} from '@/types/hotword';

// Load and validate configuration
const config = loadHotwordConfig();
const validation = validateHotwordConfig(config);
if (!validation.isValid) {
  console.error('Hotword configuration validation failed:', validation.errors);
  return;
}

// Parse hotword text
const hotwords = parseContextText(config.contextText);
console.log('Parsed hotwords:', hotwords);

// Get effective hotword mode
const mode = getEffectiveHotwordMode(config);
console.log('Current hotword mode:', mode);

Backend usage is similarly concise:

var config = new DoubaoVoiceConfig
{
    AppId = "your_app_id",
    AccessToken = "your_access_token",
    ServiceUrl = "wss://openspeech.bytedance.com/api/v3/sauc/bigmodel_async",
    // Configure custom hotwords
    HotwordContexts = new List<string>
    {
        "高血压",
        "糖尿病",
        "冠心病"
    },
    // Configure platform hotword table
    BoostingTableId = "medical_table_v1"
};

var client = new DoubaoVoiceClient(config, logger);
await client.ConnectAsync();
await client.SendFullClientRequest();

There are several points that deserve special attention when implementing and using the hotword feature.

First is the character limit. The Doubao API has strict restrictions on hotwords, including line count, characters per line, and total character count. If any limit is exceeded, the API returns an error. In HagiCode’s frontend implementation, we check these constraints during user input through validation functions, which prevents invalid configurations from being sent to the backend. Catching problems early is always better than waiting for the API to fail.

Second is the format of boosting_table_id. This field allows only letters, numbers, underscores, and hyphens, and it cannot contain spaces or other special characters. When creating a hotword table on the Doubao self-learning platform, be sure to follow the naming rules. That kind of strict format validation is common for APIs.

Third is backward compatibility. Hotword parameters are entirely optional. If no hotwords are configured, the system behaves exactly as it did before. This design ensures that existing users are not affected in any way, and it also makes gradual migration and upgrades easier. Adding a feature should not disrupt the previous logic.

Finally, there is error handling. When hotword configuration is invalid, the Doubao API returns corresponding error messages. HagiCode’s implementation records detailed logs to help developers troubleshoot issues. At the same time, the frontend displays validation errors in the UI to help users correct the configuration. Good error handling naturally leads to a better user experience.

Through this article, we have provided a detailed introduction to the complete solution for implementing Doubao speech recognition hotwords in the HagiCode project. This solution covers the entire process from requirement analysis and technical selection to code implementation, giving developers a practical example they can use for reference.

The key points can be summarized as follows. First, the Doubao API supports both custom hotwords and platform hotword tables, and they can be used independently or in combination. Second, the frontend uses localStorage to store configuration in a simple and efficient way. Third, the backend passes hotword parameters by dynamically constructing the corpus field, preserving strong backward compatibility. Fourth, comprehensive validation logic ensures configuration correctness and avoids invalid requests. Overall, the solution is not complicated; it simply follows the API requirements carefully.

Implementing the hotword feature further strengthens HagiCode’s capabilities in the speech recognition domain. By flexibly configuring business-related professional terms, developers can help the speech recognition system better understand content from specific domains and therefore provide more accurate services. Ultimately, technology should serve real business needs, and solving practical problems is what matters most.

If you found this article helpful, feel free to give HagiCode a Star on GitHub. Your support motivates us to keep sharing technical practice and experience. In the end, writing and sharing technical content that helps others is a pleasure in itself.


Thank you for reading. If you found this article useful, click the like button below 👍 so more people can discover it.

This content was created with AI-assisted collaboration, reviewed by me, and reflects my own views and positions.

AI Compose Commit: Using AI to Intelligently Refactor the Git Commit Workflow


In the software development process, committing code is a routine task every programmer faces every day. But have you ever run into this situation: at the end of a workday, you open Git, see dozens of unstaged modified files, and have no idea how to organize them into sensible commits?

The traditional approach is to manually stage files in batches, commit them one by one, and write commit messages. This process is both time-consuming and error-prone. We often waste quite a bit of time on this, and after all, nobody wants to worry about these tedious chores late at night when they are already tired.

In the HagiCode project, we introduced a new feature - AI Compose Commit - designed to completely transform this workflow. By using AI to intelligently analyze all uncommitted changes in the working tree, it automatically groups them into multiple logical commits and performs standards-compliant commit operations. In this article, we will take a deep dive into the implementation principles, technical architecture, and the challenges and solutions we encountered in practice.

The approach shared in this article comes from our practical experience in the HagiCode project.

As a version control system, Git gives developers powerful code management capabilities. But in real-world usage, committing often becomes a bottleneck in the development workflow:

  1. Manual grouping is time-consuming: When there are many file changes, developers need to inspect each file one by one and decide which changes belong to the same feature. That takes a lot of mental effort.
  2. Inconsistent commit message quality: Writing commit messages that follow the Conventional Commits specification requires experience and skill, and beginners often produce non-standard commits.
  3. Complex multi-repository management: In a monorepo environment, switching between different repositories adds operational complexity.
  4. Interrupted workflow: Committing code interrupts your train of thought and hurts coding efficiency.

These issues are especially obvious in large projects and collaborative team environments. A good development tool should let developers focus on core coding work instead of getting bogged down in a cumbersome commit workflow.

In recent years, AI has been used more and more widely in software development. From code completion and bug detection to automatic documentation generation, AI is gradually reaching every stage of the development process. In Git workflows, while some tools already support commit message generation, most are limited to single-commit scenarios and lack the ability to intelligently analyze and group changes across the entire working tree.

HagiCode encountered these pain points during development as well. We tried many tools, but each had one limitation or another. Either the functionality was incomplete, or the user experience was not good enough. That is why we ultimately decided to implement AI Compose Commit ourselves.

HagiCode’s AI Compose Commit feature was created to fill that gap. It does not just generate commit messages - it takes over the entire process from file analysis to commit execution.

While implementing AI Compose Commit, we faced several technical challenges:

  1. File semantic understanding: The AI needs to understand semantic relationships between file changes and decide which files belong to the same functional module. This requires deep analysis of file content, directory structure, and change context.

  2. Commit grouping strategy: How should a reasonable grouping standard be defined? By feature, by module, or by file type? Different projects may need different strategies.

  3. Real-time feedback and asynchronous processing: Git operations can take a long time, especially when handling a large number of files. How can we complete complex operations while preserving a good user experience?

  4. Multi-repository support: In a monorepo architecture, operations need to be routed correctly between the main repository and sub-repositories.

  5. Error handling and rollback: If one commit fails, how should already executed commits be handled? Do already staged files need to be rolled back?

  6. Commit message consistency: Generated commit messages need to match the project’s existing style and remain consistent with historical commits.

AI processing over a large number of file changes consumes significant time and compute resources. We needed to optimize in the following areas:

  • Reduce unnecessary AI calls
  • Optimize how file context is constructed
  • Implement efficient Git operation batching

These issues all appeared in real HagiCode usage, and we only arrived at a relatively complete solution through repeated iteration and optimization. If you are building a similar tool, we hope our experience gives you some inspiration.

We adopted a layered architecture to implement AI Compose Commit, ensuring good scalability and maintainability:

GitController provides the POST /api/git/auto-compose-commit endpoint as the entry point. To optimize user experience, we adopted a fire-and-forget asynchronous pattern:

  • After the client sends a request, the server immediately returns HTTP 202 Accepted
  • The actual AI processing runs asynchronously in the background
  • When processing finishes, the client is notified through SignalR

This design ensures that even if AI processing takes several minutes, users still get an immediate response and do not feel that the system is frozen.
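From the client's side, the fire-and-forget contract boils down to: fire the request, accept HTTP 202, and wait for the push notification. Below is a minimal sketch of the trigger step; the endpoint path comes from the article, while the helper name, the `projectId` query parameter, and the injectable fetch function are our own illustrative assumptions:

```typescript
// Minimal shape of a fetch-like function, injected so the helper is testable.
type FetchLike = (url: string, init?: { method?: string }) => Promise<{ status: number }>;

// Hypothetical client helper: fire the request and treat 202 Accepted as
// "the server took the job"; the real outcome arrives later over SignalR.
async function triggerAutoCompose(fetchImpl: FetchLike, projectId: string): Promise<boolean> {
  const res = await fetchImpl(
    `/api/git/auto-compose-commit?projectId=${encodeURIComponent(projectId)}`,
    { method: 'POST' },
  );
  return res.status === 202; // anything else means the job was not accepted
}
```

The caller does not await the AI result at all; it only learns whether the background job was accepted, then listens for the completion event.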

2. Application Service Layer (Application Layer)


GitAppService is responsible for the core business logic:

  • Repository detection: supports multi-repository management in a monorepo
  • Lock management: prevents conflicts caused by concurrent operations
  • File staging coordination: interacts with the AI processing flow
  • Error rollback: restores state when failures occur

3. Distributed Computing Layer (Orleans Grains)


AIGrain serves as the core execution unit for AI operations. It implements the AutoComposeCommitAsync method from the IAIGrain interface:

// Define the interface method for AI-powered automatic commit composition
// Parameter notes:
// - projectId: unique project identifier
// - unstagedFiles: list of unstaged files, including file paths and status information
// - projectPath: project root directory path (optional), used to access project context
// Return value: a response object containing execution results, including success/failure status and detailed information
[Alias("AutoComposeCommitAsync")]
[ResponseTimeout("00:20:00")] // 20-minute timeout, suitable for handling large change sets
Task<AutoComposeCommitResponseDto> AutoComposeCommitAsync(
    string projectId,
    GitFileStatusDto[] unstagedFiles,
    string? projectPath = null);

This method sets a 20-minute timeout to handle large change sets. In real-world HagiCode usage, we found that some projects can involve hundreds of changed files in a single pass, requiring more processing time.

Through the abstract IAIService interface, we implemented a pluggable AI service architecture. We currently use the Claude Helper service, but it can be easily switched to other AI providers.

The AI needs to understand the state of each file before it can make intelligent decisions. We build file context through the BuildFileChangesXml method:

/// <summary>
/// Build an XML representation of file changes to provide the AI with complete file context information
/// </summary>
/// <param name="stagedFiles">List of staged files, including file path, status, and old path (for rename operations)</param>
/// <returns>A formatted XML string containing metadata for all files</returns>
private static string BuildFileChangesXml(GitFileStatusDto[] stagedFiles)
{
    var sb = new StringBuilder();
    sb.AppendLine("<files>");
    foreach (var file in stagedFiles)
    {
        sb.AppendLine("  <file>");
        // Use XML escaping to ensure special characters do not break the XML structure
        sb.AppendLine($"    <path>{System.Security.SecurityElement.Escape(file.Path)}</path>");
        sb.AppendLine($"    <status>{System.Security.SecurityElement.Escape(file.Status)}</status>");
        // Handle file rename scenarios and record the old path so the AI can understand change relationships
        if (!string.IsNullOrEmpty(file.OldPath))
        {
            sb.AppendLine($"    <oldPath>{System.Security.SecurityElement.Escape(file.OldPath)}</oldPath>");
        }
        sb.AppendLine("  </file>");
    }
    sb.AppendLine("</files>");
    return sb.ToString();
}

This XML-based context includes file paths, statuses, and old paths for rename operations, giving the AI complete metadata. With a structured XML format, we ensure that the AI can accurately understand the state and change type of each file.

To let the AI execute Git operations directly, we configured comprehensive tool permissions:

// Define the set of tools the AI can use, including file operations and Git command execution permissions
// Read/Write/Edit: file reading, writing, and editing capabilities
// Bash(git:*): permission to execute all Git commands
// Other Bash commands: used to inspect file contents and directory structure so the AI can understand context
var allowedTools = new[]
{
    "Read", "Write", "Edit",
    "Bash(git:*)", "Bash(cat:*)", "Bash(ls:*)", "Bash(find:*)",
    "Bash(grep:*)", "Bash(head:*)", "Bash(tail:*)", "Bash(wc:*)"
};

// Build the complete AI request object
var request = new AIRequest
{
    Prompt = prompt, // Complete prompt template, including task instructions and constraints
    WorkingDirectory = projectPath ?? GetTempDirectory(), // Working directory, ensuring the AI runs in the correct project context
    AllowedTools = allowedTools, // Allowed tool set
    PermissionMode = PermissionMode.bypassPermissions, // Bypass permission checks so Git operations can run directly
    LanguagePreference = languagePreference // Language preference setting, ensuring commit messages match user expectations
};

Here we use PermissionMode.bypassPermissions, which allows the AI to execute Git commands directly without user confirmation. This is central to the feature design, but it also requires strict input validation to prevent abuse. In HagiCode’s production deployment, we ensured the safety of this mechanism through backend parameter validation and log monitoring.

After the AI finishes execution, it returns structured results. We implemented a dual parsing strategy to ensure compatibility:

/// <summary>
/// Parse commit execution results returned by the AI, supporting both delimiter format and regex format
/// </summary>
/// <param name="aiResponse">Raw response content returned by the AI</param>
/// <returns>A parsed list of commit results, where each result includes the commit hash and execution status</returns>
private List<CommitResultDto> ParseCommitExecutionResults(string aiResponse)
{
    var results = new List<CommitResultDto>();

    // Prefer delimiter-based parsing (new format), which is more explicit and reliable
    if (aiResponse.Contains("---"))
    {
        logger.LogDebug("Using delimiter-based parsing for AI response");
        results = ParseDelimitedFormat(aiResponse);
        if (results.Count > 0)
        {
            return results; // Successfully parsed, return the results directly
        }
        logger.LogWarning("Delimiter-based parsing produced no results, falling back to regex");
    }
    else
    {
        logger.LogDebug("No delimiter found, using legacy regex-based parsing");
    }

    // Fall back to regex parsing (old format) to ensure backward compatibility
    return ParseLegacyFormat(aiResponse);
}

The delimiter format uses --- to separate commits, making the structure clear and easy to parse:

---
Commit 1: abc123def456
feat(auth): add user login functionality
Implement JWT-based authentication with login form and API endpoints.
Co-Authored-By: Hagicode <noreply@hagicode.com>
---
Commit 2: 789ghi012jkl
docs(readme): update installation instructions
Add new setup steps for Docker environment.
Co-Authored-By: Hagicode <noreply@hagicode.com>
---

This format makes parsing simple and reliable, while also remaining easy for humans to read.
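The article does not show `ParseDelimitedFormat` itself, so here is a hedged sketch of what such a parser can look like, written in TypeScript for brevity; the `CommitResult` shape and the exact header regex are our assumptions:

```typescript
interface CommitResult {
  hash: string;    // hash from the "Commit N: <hash>" header line
  subject: string; // first message line, e.g. "feat(auth): add user login functionality"
  body: string;    // remaining message lines, joined
}

// Split the AI response on "---" and parse each block's header line.
// Blocks that do not start with a "Commit N: <hash>" header are skipped.
function parseDelimitedFormat(aiResponse: string): CommitResult[] {
  const results: CommitResult[] = [];
  for (const block of aiResponse.split('---')) {
    const lines = block.split('\n').map((l) => l.trim()).filter((l) => l.length > 0);
    if (lines.length === 0) continue;
    const header = lines[0].match(/^Commit\s+\d+:\s*(\S+)$/i);
    if (!header) continue; // defensive: not a commit block
    results.push({ hash: header[1], subject: lines[1] ?? '', body: lines.slice(2).join('\n') });
  }
  return results;
}
```

As in the C# version, a caller would fall back to regex-based parsing whenever this returns an empty list.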

To prevent state conflicts caused by concurrent operations, we implemented a repository lock mechanism:

// Acquire the repository lock to prevent concurrent operations
// Parameter notes:
// - fullPath: full repository path, used to identify different repository instances
// - requestedBy: requester identifier, used for tracking and logging
await _autoComposeLockService.AcquireLockAsync(fullPath, requestedBy);
try
{
    // Execute the AI Compose Commit operation
    // This section calls an Orleans Grain method to perform the actual AI processing and Git operations
    await aiGrain.AutoComposeCommitAsync(projectId, unstagedFiles, projectPath);
}
finally
{
    // Ensure the lock is released whether the operation succeeds or fails
    // Using a finally block guarantees lock release even when exceptions occur, preventing deadlocks
    await _autoComposeLockService.ReleaseLockAsync(fullPath);
}

The lock has a 20-minute timeout, matching the timeout used for AI operations. If the operation fails or times out, the system automatically releases the lock to avoid permanent blocking. In real HagiCode usage, we found this lock mechanism to be extremely important, especially in collaborative environments where multiple developers may trigger AI Compose Commit at the same time.
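The lock service itself is not shown in the article. As a sketch, one simple way to get the "auto-release on timeout" behavior described above is a lease with an expiry timestamp; the class and method names below are hypothetical:

```typescript
// In-memory repository lock with an expiry timestamp. tryAcquire succeeds if
// the lock is free or the previous holder's lease has expired.
class RepoLockService {
  private locks = new Map<string, { owner: string; expiresAt: number }>();

  constructor(private timeoutMs: number) {}

  tryAcquire(path: string, owner: string, now: number = Date.now()): boolean {
    const held = this.locks.get(path);
    if (held && held.expiresAt > now) return false; // still held, refuse
    this.locks.set(path, { owner, expiresAt: now + this.timeoutMs });
    return true;
  }

  release(path: string): void {
    this.locks.delete(path);
  }
}
```

Because the lease carries its own expiry, a crashed or timed-out operation cannot block the repository forever: the next caller simply takes over an expired lock.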

After processing completes, the system sends a notification to the frontend through SignalR:

/// <summary>
/// Send a notification when automatic commit composition is complete
/// </summary>
/// <param name="projectId">Project identifier, used to route the notification to the correct client</param>
/// <param name="totalCount">Total number of commits, including successes and failures</param>
/// <param name="successCount">Number of successful commits</param>
/// <param name="failureCount">Number of failed commits</param>
/// <param name="success">Whether the overall operation succeeded</param>
/// <param name="error">Error message (if the operation failed)</param>
private async Task SendAutoComposeCommitNotificationAsync(
    string projectId,
    int totalCount,
    int successCount,
    int failureCount,
    bool success,
    string? error)
{
    try
    {
        // Build the notification DTO containing detailed execution results
        var notification = new AutoComposeCommitCompletedDto
        {
            ProjectId = projectId,
            TotalCount = totalCount,
            SuccessCount = successCount,
            FailureCount = failureCount,
            Success = success,
            Error = error
        };

        // Broadcast the notification to all connected clients through the SignalR Hub
        await messageService.SendAutoComposeCommitCompletedAsync(notification);
        logger.LogInformation(
            "Auto compose commit notification sent for project {ProjectId}: {SuccessCount}/{TotalCount} succeeded",
            projectId, successCount, totalCount);
    }
    catch (Exception ex)
    {
        // Log notification errors without affecting the main operation flow
        // A notification failure should not cause the entire operation to fail
        logger.LogError(ex, "Failed to send auto compose commit notification for project {ProjectId}", projectId);
    }
}

After the frontend receives the notification, it can update the UI to show whether the commit succeeded or failed, improving the user experience. This real-time feedback mechanism has been very well received by HagiCode users, who can clearly see when the operation finishes and what the outcome is.

AI behavior is entirely determined by the prompt, so we carefully designed the prompt template for Auto Compose Commit. Taking the Chinese version as an example (auto-compose-commit.zh-CN.hbs):

At the beginning of the prompt, we explicitly declare support for non-interactive execution mode, which is a critical requirement for CI/CD and automation scripts:

**Important Note**: This prompt may run in a non-interactive environment (such as CI/CD or automation scripts).

**Non-Interactive Mode**:
- Do not use AskUserQuestion or any interactive tools
- When user input is required:
  - Use sensible defaults (for example, use feat as the commit type)
  - Skip optional confirmation steps
  - Record any assumptions made

This design ensures that AI Compose Commit can be used not only in interactive IDE environments, but also integrated into CI/CD pipelines to deliver a fully automated commit workflow.

To prevent the AI from executing dangerous operations, we added strict branch protection rules to the prompt:

**Branch Protection**:
- Do not perform any branch switching operations (git checkout, git switch)
- All `git commit` commands must run on the current branch
- Do not create, delete, or rename branches
- Do not modify untracked files or unstaged changes
- If branch switching is required to complete the operation, return an error instead of executing it

By constraining the AI’s tool usage scope, these rules ensure operational safety. In HagiCode’s practical testing, we verified the effectiveness of these constraints: when the AI encounters a situation that would require a branch switch, it safely returns an error instead of taking dangerous action.

The prompt defines the decision logic for file grouping in detail:

**File Grouping Decision Tree**:
├── Is it a configuration file (package.json, tsconfig.json, .env, etc.)?
│   ├── Yes -> separate commit (type: chore or build)
│   └── No -> continue
├── Is it a documentation file (README.md, *.md, docs/**)?
│   ├── Yes -> separate commit (type: docs)
│   └── No -> continue
├── Is it related to the same feature?
│   ├── Yes -> merge into the same commit
│   └── No -> commit separately
└── Is it a cross-module change?
    ├── Yes -> group by module
    └── No -> group by feature

This decision tree gives the AI clear grouping logic, ensuring the generated commits remain semantically reasonable. In real HagiCode usage, we found that this decision tree can handle the vast majority of common scenarios, and the grouping results match developer expectations.
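The same decision tree can be expressed directly in code. The sketch below covers only the first two branches (config files and documentation); the function name, the specific file lists, and the `GroupKind` labels are simplifying assumptions, and the grouping the prompt actually induces is richer than this:

```typescript
type GroupKind = 'chore' | 'docs' | 'feature';

// First branches of the grouping decision tree: config and docs files get
// their own commits; everything else falls through to feature/module grouping.
function classifyFile(path: string): GroupKind {
  const name = path.split('/').pop() ?? path;
  const configFiles = ['package.json', 'tsconfig.json', '.env'];
  if (configFiles.includes(name)) return 'chore';
  if (path.startsWith('docs/') || name.toLowerCase().endsWith('.md')) return 'docs';
  return 'feature';
}
```

Grouping then becomes a matter of bucketing files by the returned label before deciding how to split the "feature" bucket further.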

To keep commit messages consistent with project history, the prompt requires the AI to analyze recent commit history before generation:

**Historical Format Consistency**: Before generating commit messages, you **must** analyze the current repository's commit history to match the existing style.
1. Use `git log -n 15 --pretty=format:"%H|%s|%b%n---%n"` to get the recent commit history
2. Analyze the commits to identify:
- Structural patterns: does the project use multi-paragraph messages? Are there `Changes:` or `Capabilities:` sections?
- Language patterns: are commit messages in English, Chinese, or mixed?
- Common types: which commit types are most often used (`feat`, `fix`, `docs`, etc.)?
- Special formatting: are there `Co-Authored-By` lines? Any other project-specific conventions?
3. Generate commit messages that follow the detected patterns

This analysis ensures that AI-generated commit messages do not feel out of place, but instead remain stylistically aligned with the project’s history. In HagiCode’s multilingual projects, this feature is especially important because it can automatically choose the appropriate language and format based on commit history.
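As an illustration of step 2 above, detecting the project's most common commit type from the `git log` output can be as simple as counting Conventional Commits prefixes. The function name and return shape here are our own; the input lines follow the article's `%H|%s|%b` log format:

```typescript
// Each input line follows the log format "<hash>|<subject>|<body>".
// Returns the most frequent conventional-commit type, or null if none is found.
function dominantCommitType(logLines: string[]): string | null {
  const counts = new Map<string, number>();
  for (const line of logLines) {
    const subject = line.split('|')[1] ?? '';
    // Match "type", "type(scope)", and breaking-change "type!" prefixes.
    const match = subject.match(/^([a-z]+)(\([^)]*\))?(!)?:/);
    if (match) counts.set(match[1], (counts.get(match[1]) ?? 0) + 1);
  }
  let best: string | null = null;
  let bestCount = 0;
  for (const [type, count] of counts) {
    if (count > bestCount) { best = type; bestCount = count; }
  }
  return best;
}
```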

Every commit must include Co-Authored-By information:

**Important**: Every commit must include Co-Authored-By information
- Use the following format: `git commit -m "type(scope): subject" -m "" -m "Co-Authored-By: Hagicode <noreply@hagicode.com>"`
- Or include the `Co-Authored-By` line directly in the commit message

This is not only for contribution compliance, but also for tracing AI-assisted commit history. HagiCode treats this as a mandatory rule to ensure that all AI-generated commits carry a clear source marker.

The full AI Compose Commit workflow is as follows:

  1. User trigger: The user clicks the “AI Auto Compose Commit” button in the Git Status panel or Quick Actions Zone.
  2. API request: The frontend sends a POST request to the /api/git/auto-compose-commit endpoint.
  3. Immediate response: The server returns HTTP 202 Accepted without waiting for processing to finish.
  4. Background processing:
    • GitAppService acquires the repository lock
    • Calls AIGrain.AutoComposeCommitAsync
    • Builds the file context XML
    • Executes the AI prompt so the AI can analyze and perform commits
  5. AI execution:
    • Uses Git commands to obtain all unstaged changes
    • Reads file contents to understand the nature of the changes
    • Groups files by semantic relationship
    • Executes git add and git commit for each group
  6. Result parsing: Parses the execution results returned by the AI.
  7. Notification delivery: Notifies the frontend through SignalR.
  8. Lock release: Releases the repository lock whether the operation succeeds or fails.

This workflow is designed so that users can continue with other work immediately after initiating the operation, without waiting for the AI to finish. Feedback from HagiCode users shows that this asynchronous processing model greatly improves the workflow experience.

We implemented multi-layer error handling:

// Validate request parameters to prevent invalid requests from reaching backend processing logic
if (request.UnstagedFiles == null || request.UnstagedFiles.Count == 0)
{
    return BadRequest(new
    {
        message = "No unstaged files provided. Please make changes in the working directory first.",
        status = "validation_failed"
    });
}

If an error occurs during AI processing, the system performs a rollback operation and unstages files that were already staged, preventing an inconsistent state from being left behind. In real HagiCode usage, this mechanism saved us from multiple unexpected interruptions and ensured repository state integrity.

The 20-minute timeout ensures that long-running operations do not block resources indefinitely. After a timeout, the system releases the lock and notifies the user that the operation failed. In real HagiCode usage, we found that most operations complete within 2 to 5 minutes, and only extremely large change sets approach the timeout limit.
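In TypeScript terms, "bound every long operation with a timeout" is a `Promise.race` between the work and a timer. A small sketch (the helper name is ours, not HagiCode's API):

```typescript
// Reject if the wrapped promise does not settle within timeoutMs.
function withTimeout<T>(work: Promise<T>, timeoutMs: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${timeoutMs} ms`)), timeoutMs);
  });
  // Whichever settles first wins; always clear the timer so the process can exit.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}
```

A real implementation would pair the rejection with the cleanup steps the article describes: releasing the repository lock and sending a failure notification.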

Best Practices for Using AI Compose Commit


AI Compose Commit is best suited for the following scenarios:

  • At the end of a workday, when you need to process changes across many files in one batch
  • After a refactoring operation, when several related files need to be committed separately
  • After a feature is completed, when related changes need to be grouped into commits

It is not suitable for the following scenarios:

  • Quick commits for a single file (a normal commit is faster)
  • Scenarios requiring precise control over commit content
  • Commits containing sensitive information that require human review

Although AI-powered intelligent grouping is powerful, developers should still review the generated commits:

  • Check whether the grouping matches expectations
  • Verify the accuracy of commit messages
  • Confirm that no files were omitted or incorrectly included

If you find an unreasonable grouping, you can use git reset --soft HEAD~N to undo it and regroup. HagiCode’s experience shows that even when AI grouping is smart, manual review is still valuable, especially for important feature commits.

Make sure your project’s Git configuration supports Conventional Commits:

# Install commitlint
npm install -g @commitlint/cli @commitlint/config-conventional
# Configure commitlint
echo "module.exports = {extends: ['@commitlint/config-conventional']}" > commitlint.config.js

This lets you validate commit message format in CI/CD workflows and keeps it aligned with the format generated by AI Compose Commit.

If you want to implement a similar AI-assisted commit feature in your own project, here are our suggestions:

Begin with single commit message generation, then gradually expand to multi-commit grouping. This makes it easier to validate and iterate. HagiCode followed the same path: early versions only supported single commits, and later expanded to intelligent grouping across multiple commits.

Do not implement AI invocation logic from scratch. Using an existing SDK reduces development time and potential bugs. We used the Claude Helper service, which provides a stable interface and robust error handling.

Prompt quality directly determines output quality. Spend time designing a detailed prompt, including:

  • Clear task descriptions
  • Specific output format requirements
  • Rules for handling edge cases
  • Illustrative examples

HagiCode invested heavily in prompt design, and this was one of the key reasons the feature succeeded.

AI operations can fail for many reasons, such as network issues, API rate limits, or content moderation. Make sure your system can handle these errors gracefully and provide meaningful error information.

Do not automate everything completely. Leave users in control. Provide options to review grouping results, adjust groups, and manually edit commit messages to balance automation and flexibility. Although HagiCode supports automatic execution, it still preserves preview and adjustment capabilities.

When constructing file context, filter out files that do not need AI analysis:

// Filter out generated files and excessively large files to reduce the AI processing burden
var relevantFiles = stagedFiles
    .Where(f => !IsGeneratedFile(f.Path))
    .Where(f => !IsLargeFile(f.Path))
    .ToArray();

If multiple independent repositories are supported, commits in different repositories can be processed in parallel to improve overall efficiency.
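That fan-out can be sketched with `Promise.allSettled`, so one repository's failure does not abort the others; all names here are illustrative:

```typescript
interface RepoResult {
  repo: string;
  ok: boolean;
  error?: string;
}

// Run one compose-commit job per independent repository in parallel and
// collect per-repo outcomes instead of failing the whole batch.
async function composeAcrossRepos(
  repos: string[],
  composeOne: (repo: string) => Promise<void>,
): Promise<RepoResult[]> {
  const settled = await Promise.allSettled(repos.map((repo) => composeOne(repo)));
  return settled.map((result, i) =>
    result.status === 'fulfilled'
      ? { repo: repos[i], ok: true }
      : { repo: repos[i], ok: false, error: String(result.reason) },
  );
}
```

Note that this only applies across independent repositories; within a single repository, the per-repo lock described earlier still serializes operations.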

Cache project commit history analysis results to avoid re-analyzing them every time. Historical format preferences can be stored in configuration files to reduce AI calls.

AI Compose Commit represents a deep application of AI technology in software development tools. By intelligently analyzing file changes, automatically grouping commits, and generating standards-compliant commit messages, it significantly improves the efficiency of Git workflows and allows developers to focus more on core coding work.

During implementation, we learned several important lessons:

  1. User feedback is critical: Early versions used synchronous waiting, and users reported a poor experience. After switching to a fire-and-forget model, satisfaction improved significantly.
  2. Prompt design determines quality: A carefully designed prompt does more to guarantee AI output quality than a complex algorithm.
  3. Safety always comes first: Granting the AI permission to execute Git commands directly improves efficiency, but it must be paired with strict constraints and validation.
  4. Progressive improvement works best: Starting with simple scenarios and gradually increasing complexity is more likely to succeed than trying to implement everything at once.

In the future, we plan to further optimize AI Compose Commit, including:

  • Supporting more commit grouping strategies (by time, by developer, and so on)
  • Integrating code review workflows to trigger review automatically before commits
  • Supporting custom commit message templates to meet the personalized needs of different projects

If you find the approach shared in this article valuable, give HagiCode a try and experience how this feature works in real development. After all, practice is the only criterion for testing truth.


Thank you for reading. If you found this article helpful, please click the like button below so more people can discover it.

This content was created with AI-assisted collaboration, reviewed by me, and reflects my own views and positions.

GitHub Issues Integration

Building a GitHub Issues Integration from Scratch: HagiCode’s Frontend Direct Connection Practice


This article documents the full process of integrating GitHub Issues into the HagiCode platform. We will explore how to use a “frontend direct connection + minimal backend” architecture to achieve secure OAuth authentication and efficient issue synchronization while keeping the backend lightweight.

As an AI-assisted development platform, HagiCode’s core value lies in connecting ideas with implementation. But in actual use, we found that after users complete a Proposal in HagiCode, they often need to manually copy the content into GitHub Issues for project tracking.

This creates several obvious pain points:

  1. Fragmented workflow: Users need to switch back and forth between two systems. The experience is not smooth, and key information can easily be lost during copy and paste.
  2. Inconvenient collaboration: Other team members are used to checking tasks on GitHub and cannot directly see proposal progress inside HagiCode.
  3. Repeated manual work: Every time a proposal is updated, someone has to manually update the corresponding issue on GitHub, adding unnecessary maintenance cost.

To solve this problem, we decided to introduce the GitHub Issues Integration feature, connecting HagiCode sessions with GitHub repositories to enable “one-click sync.”

Hey, let us introduce what we are building

We are building HagiCode — an AI-powered coding assistant that makes development smarter, easier, and more enjoyable.

  • Smarter — AI assists throughout the entire journey, from idea to code, multiplying development efficiency.
  • Easier — Multi-threaded concurrent operations make full use of resources and keep the development workflow smooth.
  • More enjoyable — Gamification and an achievement system make coding less tedious and more rewarding.

The project is iterating quickly. If you are interested in technical writing, knowledge management, or AI-assisted development, feel free to check us out on GitHub.


Technical Choice: Frontend Direct Connection vs Backend Proxy


When designing the integration approach, we had two options in front of us: the traditional “backend proxy model” and the more aggressive “frontend direct connection model.”

In the traditional backend proxy model, every request from the frontend must first go through our backend, which then calls the GitHub API. This centralizes the logic, but it also puts a significant burden on the backend:

  1. Bloated backend: We would need to write a dedicated GitHub API client wrapper and also handle the complex OAuth state machine.
  2. Token risk: The user’s GitHub token would have to be stored in the backend database. Even with encryption, this still increases the security surface.
  3. Development cost: We would need database migrations to store tokens and an additional synchronization service to maintain.

The frontend direct connection model is much lighter. In this approach, we use the backend only for the most sensitive “secret exchange” step (the OAuth callback). After obtaining the token, we store it directly in the browser’s localStorage. Later operations such as creating issues and updating comments are sent directly from the frontend to GitHub over HTTP.

| Comparison Dimension | Backend Proxy Model | Frontend Direct Connection Model |
| --- | --- | --- |
| Backend complexity | Requires a full OAuth service and GitHub API client | Only needs an OAuth callback endpoint |
| Token management | Must be encrypted and stored in the database, with leakage risk | Stored in the browser and visible only to the user |
| Implementation cost | Requires database migrations and multi-service development | Primarily frontend work |
| User experience | Centralized logic, but server latency may be slightly higher | Extremely fast response with direct GitHub interaction |

Because we wanted rapid integration and minimal backend changes, we ultimately chose the “frontend direct connection model”. It is like giving the browser a “temporary pass.” Once it gets the pass, the browser can go handle things on GitHub by itself without asking the backend administrator for approval every time.


After settling on the architecture, we needed to design the specific data flow. The core of the synchronization process is how to obtain the token securely and use it efficiently.

The whole system can be abstracted into three roles: the browser (frontend), the HagiCode backend, and GitHub.

+--------------+ +--------------+ +--------------+
| Frontend | | Backend | | GitHub |
| React | | ASP.NET | | REST API |
| | | | | |
| +--------+ | | | | |
| | OAuth |--+--------> /callback | | |
| | Flow | | | | | |
| +--------+ | | | | |
| | | | | |
| +--------+ | | +--------+ | | +--------+ |
| | GitHub | +------------> Session | +----------> Issues | |
| | API | | | |Metadata| | | | | |
| | Direct | | | +--------+ | | +--------+ |
| +--------+ | | | | |
+--------------+ +--------------+ +--------------+

The key point is: only one small step in OAuth (exchanging the code for a token) needs to go through the backend. After that, the heavy lifting (creating issues) is handled directly between the frontend and GitHub.

When the user clicks the “Sync to GitHub” button in the HagiCode UI, a series of complex actions takes place:

User clicks "Sync to GitHub"
1. Frontend checks localStorage for the GitHub token
2. Format issue content (convert the Proposal into Markdown)
3. Frontend directly calls the GitHub API to create/update the issue
4. Call the HagiCode backend API to update Session.metadata (store issue URL and other info)
5. Backend broadcasts the SessionUpdated event via SignalR
6. Frontend receives the event and updates the UI to show the "Synced" state

Security is always the top priority when integrating third-party services. We made the following considerations:

  1. Defend against CSRF attacks: Generate a random state parameter during the OAuth redirect and store it in sessionStorage. Strictly validate the state in the callback to prevent forged requests.
  2. Isolated token storage: The token is stored only in the browser’s localStorage. Using the Same-Origin Policy, only HagiCode scripts can read it, avoiding the risk of a server-side database leak affecting users.
  3. Error boundaries: We designed dedicated handling for common GitHub API errors (such as 401 expired token, 422 validation failure, and 429 rate limiting), so users receive friendly feedback.
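Point 1 hinges on the `state` value being unpredictable and checked on every callback. A sketch using Node's `crypto` module (the helper names are ours; in the actual browser code, `crypto.getRandomValues` would play the same role):

```typescript
import { randomBytes } from 'crypto';

// Unpredictable state value for the OAuth redirect; hex keeps it URL-safe.
function generateState(): string {
  return randomBytes(16).toString('hex'); // 32 hex chars, 128 bits of entropy
}

// The callback check: the value stored before the redirect must exactly
// match the value GitHub echoes back, and a missing saved value must fail.
function validateState(saved: string | null, received: string): boolean {
  return saved !== null && saved === received;
}
```

Storing the value in `sessionStorage`, as the article does, additionally scopes it to the current tab and clears it when the tab closes.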

In Practice: Implementation Details in Code


Theory only goes so far. Let us look at how the code actually works.

The backend only needs to do two things: store synchronization information and handle the OAuth callback.

Database changes: We only need to add a Metadata column to the Sessions table to store extension data in JSON format.

-- Add metadata column to Sessions table
ALTER TABLE "Sessions" ADD COLUMN "Metadata" text NULL;

Entity and DTO definitions

src/HagiCode.DomainServices.Contracts/Entities/Session.cs
public class Session : AuditedAggregateRoot<SessionId>
{
    // ... other properties ...

    /// <summary>
    /// JSON metadata for storing extension data like GitHub integration
    /// </summary>
    public string? Metadata { get; set; }
}

// DTO definition for easier frontend serialization
public class GitHubIssueMetadata
{
    public required string Owner { get; set; }
    public required string Repo { get; set; }
    public int IssueNumber { get; set; }
    public required string IssueUrl { get; set; }
    public DateTime SyncedAt { get; set; }
    public string LastSyncStatus { get; set; } = "success";
}

public class SessionMetadata
{
    public GitHubIssueMetadata? GitHubIssue { get; set; }
}

This is the entry point of the connection. We use the standard Authorization Code Flow.

src/HagiCode.Client/src/services/githubOAuth.ts
// OAuth App Client ID. It is public (only the ClientSecret is sensitive);
// here we assume it comes from build-time config — adjust to your setup.
const clientId = import.meta.env.VITE_GITHUB_CLIENT_ID;

// Generate a random string for CSRF protection
function generateRandomString(): string {
  const bytes = new Uint8Array(16);
  crypto.getRandomValues(bytes);
  return Array.from(bytes, (b) => b.toString(16).padStart(2, '0')).join('');
}

// Generate the authorization URL and redirect
export function generateAuthUrl(): string {
  const state = generateRandomString();
  sessionStorage.setItem('hagicode_github_state', state);
  const params = new URLSearchParams({
    client_id: clientId,
    redirect_uri: window.location.origin + '/settings?tab=github&oauth=callback',
    scope: ['repo', 'public_repo'].join(' '),
    state: state,
  });
  return `https://github.com/login/oauth/authorize?${params.toString()}`;
}

// Handle the code-to-token exchange on the callback page
export async function exchangeCodeForToken(code: string, state: string): Promise<GitHubToken> {
  // 1. Validate state to prevent CSRF
  const savedState = sessionStorage.getItem('hagicode_github_state');
  if (state !== savedState) throw new Error('Invalid state parameter');

  // 2. Call the backend API to exchange the token
  // Note: this must go through the backend because the ClientSecret cannot be exposed to the frontend
  const response = await fetch('/api/GitHubOAuth/callback', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      code,
      state,
      redirectUri: window.location.origin + '/settings?tab=github&oauth=callback',
    }),
  });
  if (!response.ok) throw new Error('Failed to exchange token');
  const token = await response.json();

  // 3. Save into LocalStorage
  saveToken(token);
  return token;
}
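On the callback page itself, the first step is pulling code and state out of the query string before handing them to exchangeCodeForToken. A small sketch, with a helper name of our own choosing:

```typescript
// Hypothetical helper: extract the OAuth parameters from the callback URL's
// query string. Returns null when this page load is not an OAuth redirect.
export function extractOAuthParams(search: string): { code: string; state: string } | null {
  const params = new URLSearchParams(search);
  const code = params.get('code');
  const state = params.get('state');
  if (!code || !state) return null;
  return { code, state };
}
```

On the settings page, call this on mount with `window.location.search`; a non-null result means GitHub redirected back and the token exchange should run.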

Once we have the token, we need a solid tool for calling the GitHub API.

src/HagiCode.Client/src/services/githubApiClient.ts
const GITHUB_API_BASE = 'https://api.github.com';
// Core request wrapper
async function githubApi<T>(endpoint: string, options: RequestInit = {}): Promise<T> {
  const token = localStorage.getItem('gh_token');
  if (!token) throw new Error('Not connected to GitHub');
  const response = await fetch(`${GITHUB_API_BASE}${endpoint}`, {
    ...options,
    headers: {
      ...options.headers,
      Authorization: `Bearer ${token}`,
      Accept: 'application/vnd.github.v3+json', // Specify the API version
    },
  });
  // Error handling logic
  if (!response.ok) {
    if (response.status === 401) throw new Error('GitHub token expired, please reconnect');
    if (response.status === 403) throw new Error('No permission to access this repository or rate limit exceeded');
    if (response.status === 422) throw new Error('Issue validation failed, the title may be duplicated');
    throw new Error(`GitHub API Error: ${response.statusText}`);
  }
  return response.json();
}

// Create issue
export async function createIssue(owner: string, repo: string, data: { title: string; body: string; labels: string[] }) {
  return githubApi(`/repos/${owner}/${repo}/issues`, {
    method: 'POST',
    body: JSON.stringify(data),
  });
}
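The 403 branch above lumps permission errors together with rate limiting. GitHub reports the remaining quota in its documented X-RateLimit-* response headers, so a small reader helper (our own addition, not part of the client above) lets the UI warn users before they hit the wall:

```typescript
interface RateLimitInfo {
  remaining: number;
  limit: number;
  resetAt: Date; // X-RateLimit-Reset is a Unix timestamp in seconds
}

// Sketch of a rate-limit reader. The header names are GitHub's documented
// X-RateLimit-* response headers; it accepts any Headers-like object so it
// can be unit-tested without a real fetch Response.
function readRateLimit(headers: { get(name: string): string | null }): RateLimitInfo | null {
  const remaining = headers.get('x-ratelimit-remaining');
  const limit = headers.get('x-ratelimit-limit');
  const reset = headers.get('x-ratelimit-reset');
  if (remaining === null || limit === null || reset === null) return null;
  return {
    remaining: Number(remaining),
    limit: Number(limit),
    resetAt: new Date(Number(reset) * 1000),
  };
}
```

Inside githubApi you could call this on every response and surface a warning once `remaining` drops below some threshold, instead of waiting for a hard 403.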

The final step is to convert HagiCode session data into the format of a GitHub issue. It is a bit like translation work.

// Convert a Session object into a Markdown string
function formatIssueForSession(session: Session): string {
  let content = `# ${session.title}\n\n`;
  content += `**> HagiCode Session:** #${session.code}\n`;
  content += `**> Status:** ${session.status}\n\n`;
  content += `## Description\n\n${session.description || 'No description provided.'}\n\n`;
  // If this is a Proposal session, add extra fields
  if (session.type === 'proposal') {
    content += `## Chief Complaint\n\n${session.chiefComplaint || ''}\n\n`;
    // Add a deep link so users can jump back from GitHub to HagiCode
    content += `---\n\n**[View in HagiCode](hagicode://sessions/${session.id})**\n`;
  }
  return content;
}

// Main logic when clicking the sync button
const handleSync = async (session: Session) => {
  try {
    const repoInfo = parseRepositoryFromUrl(session.repoUrl); // Parse the repository URL
    if (!repoInfo) throw new Error('Invalid repository URL');
    toast.loading('Syncing to GitHub...');

    // 1. Format content
    const issueBody = formatIssueForSession(session);

    // 2. Call API
    const issue = await githubApiClient.createIssue(repoInfo.owner, repoInfo.repo, {
      title: `[HagiCode] ${session.title}`,
      body: issueBody,
      labels: ['hagicode', 'proposal', `status:${session.status}`],
    });

    // 3. Update Session Metadata (save the issue link)
    await SessionsService.patchApiSessionsSessionId(session.id, {
      metadata: {
        githubIssue: {
          owner: repoInfo.owner,
          repo: repoInfo.repo,
          issueNumber: issue.number,
          issueUrl: issue.html_url,
          syncedAt: new Date().toISOString(),
        },
      },
    });
    toast.success('Synced successfully!');
  } catch (err) {
    console.error(err);
    toast.error('Sync failed, please check your token or network');
  }
};

With this “frontend direct connection” approach, we achieved seamless GitHub Issues integration with minimal backend code.

  1. High development efficiency: Backend changes are minimal, mainly one extra database field and a simple OAuth callback endpoint. Most logic is completed on the frontend.
  2. Strong security: The token does not pass through the server database, reducing leakage risk.
  3. Great user experience: Requests are initiated directly from the frontend, so response speed is fast and there is no need for backend forwarding.

There are a few pitfalls to keep in mind during real deployment:

  • OAuth App settings: Remember to enter the correct Authorization callback URL in your GitHub OAuth App settings (usually http://localhost:3000/settings?tab=github&oauth=callback).
  • Rate limits: GitHub API limits unauthenticated requests quite strictly, but with a token the quota is usually sufficient (5000 requests/hour).
  • URL parsing: Users enter all kinds of repository URLs, so make sure your regex can match .git suffixes, SSH formats, and similar cases.
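The parseRepositoryFromUrl helper used in handleSync might look roughly like this; a sketch covering the https, .git-suffixed, and SSH forms mentioned above, not the exact implementation:

```typescript
interface RepoInfo {
  owner: string;
  repo: string;
}

// Sketch of parseRepositoryFromUrl: accepts https://github.com/owner/repo,
// the same URL with a .git suffix, and the SSH form git@github.com:owner/repo.git.
export function parseRepositoryFromUrl(url: string): RepoInfo | null {
  const patterns = [
    /^https?:\/\/github\.com\/([^/]+)\/([^/]+?)(?:\.git)?\/?$/,
    /^git@github\.com:([^/]+)\/([^/]+?)(?:\.git)?$/,
  ];
  for (const pattern of patterns) {
    const match = url.trim().match(pattern);
    if (match) return { owner: match[1], repo: match[2] };
  }
  return null;
}
```

The lazy `([^/]+?)` group is what lets the optional `(?:\.git)?` strip the suffix instead of swallowing it into the repository name.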

The current feature is still one-way synchronization (HagiCode -> GitHub). In the future, we plan to implement two-way synchronization through GitHub Webhooks. For example, if an issue is closed on GitHub, the session state on the HagiCode side could also update automatically. That will require us to expose a webhook endpoint on the backend, which will be an interesting next step.
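The heart of that future webhook endpoint will be mapping GitHub issues webhook actions to session status updates. A hypothetical sketch of that mapping — the action names are GitHub's; the status strings are placeholders of ours:

```typescript
// Hypothetical mapping from a GitHub `issues` webhook action to a
// HagiCode session status update; null means "ignore this event".
type SessionStatusUpdate = { status: string } | null;

function mapIssueEventToSessionUpdate(action: string): SessionStatusUpdate {
  switch (action) {
    case 'closed':
      return { status: 'completed' };
    case 'reopened':
      return { status: 'active' };
    default:
      // Ignore edits, labels, comments, etc. for now
      return null;
  }
}
```

Keeping this mapping as a small pure function means the webhook controller only has to verify the signature, parse the payload, and apply whatever update comes back.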

We hope this article gives you a bit of inspiration for your own third-party integration work. If you have questions, feel free to open an issue on the HagiCode GitHub repository for discussion.