S-Expresso - Kanban Coding Team
May 22, 2026
Current top-tier coding harnesses, e.g., Claude Code, OpenCode, Copilot and Codex, adopt a conversational approach to developing software.
I regularly forget that my role in this is not just to converse, but also to guide the development and make sure the coding doesn’t go off the rails. In other words, I expect Claude Code to magically know and do whatever I intend it to do.
Spoiler alert: typically it does not read my mind well.
Thus, I turned to agents and skills to improve the abilities of the coding agent. But, as laziness is a virtue in software development, I tend to forget to use them, or the LLM ignores them.
Secondly, why do we rely on free-form text and the harness’ ability to compose a context? Furthermore, most coding harnesses are not really transparent about what goes into their context; Pi.dev being a notable exception.
Thirdly, why are we using very large language models that can translate Mandarin to Swahili and write a poem about it while doing so? “Coincidentally” they can also write Python or Java code. Why don’t we use optimized coding models for writing software?
Therefore, my take on coding harnesses, called S-Expresso:
- Be constrained: built-in bounds and checks.
- Use Mistral Medium 3.5, a “small” LLM optimized for code.
- Explicit context: Kanban board and tickets.
Do note, this is an experiment, not a fully fledged open source project. I’ve developed some CLI tools with this coding harness but those are toy examples.
S-Expresso: Defining Features

In short, S-Expresso is a coding harness that mimics a development team using a Kanban board. This constrains the interactions and the context creation. Each ticket is a context. Each role before the developer adds relevant information to the ticket, to make it self-sufficient. As a result, the developer role in the coding harness can start from the ticket and know enough. Obviously, the developer can read files to add information it needs to the context.
Each role focuses on a small piece of the puzzle, yet has access to information in other tickets in case this is needed.
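To make "each ticket is a context" concrete, here is a minimal sketch of a ticket that collects the contributions of each role and renders them as the developer's opening context. The `Ticket` type, its fields and `to_context` are hypothetical names chosen for illustration, not taken from the S-Expresso code base; only the section labels mirror the developer context shown later in this post.

```rust
// Hypothetical sketch: these type and field names are assumptions,
// not taken from the S-Expresso code base.

/// A Kanban ticket that accumulates context as it moves across the board.
#[derive(Debug, Default)]
struct Ticket {
    id: String,
    title: String,
    description: String,
    acceptance_criteria: Vec<String>,
    technical_elaboration: Option<String>,
    /// (role, note) pairs left by reviewers and other roles.
    notes: Vec<(String, String)>,
}

impl Ticket {
    /// Render the ticket as the opening context for the developer role.
    fn to_context(&self) -> String {
        let mut out = format!(
            "Implement ticket {}:\nTitle: {}\nDescription:\n{}\n",
            self.id, self.title, self.description
        );
        out.push_str("Acceptance Criteria:\n");
        for criterion in &self.acceptance_criteria {
            out.push_str(&format!("- {}\n", criterion));
        }
        if let Some(tech) = &self.technical_elaboration {
            out.push_str(&format!("Technical Elaboration:\n{}\n", tech));
        }
        if !self.notes.is_empty() {
            out.push_str("Notes from other roles:\n");
            for (role, note) in &self.notes {
                out.push_str(&format!("[{}] {}\n", role, note));
            }
        }
        out
    }
}

fn main() {
    let mut ticket = Ticket {
        id: "smooth-frog-45".into(),
        title: "CLI tool for ASCII art animal messages".into(),
        description: "Build a CLI tool named `asciipet`.".into(),
        ..Ticket::default()
    };
    // Each role before the developer enriches the same ticket.
    ticket.acceptance_criteria.push("Accepts `--animal` and `--message`.".into());
    ticket.technical_elaboration = Some("Use `clap` for argument parsing.".into());
    ticket.notes.push(("team_lead".into(), "Ready for QUEUED.".into()));
    println!("{}", ticket.to_context());
}
```

The point of the design is that the developer prompt is a pure function of the ticket: nothing enters the context that is not written down on the board.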
The second defining feature is that each “generation” step in the process is followed by a “review” step. The output of one role (i.e., an LLM with a prompt) is reviewed by another role (i.e., an LLM with a different prompt). The reviewer is tasked specifically with checking completeness, succinctness and correctness. This is to (try to) keep agents from going off on tangents, and to catch errors, mistakes and hallucinations as early as possible.
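The generate-then-review pairing can be sketched as a small loop. This is an illustration under assumptions, not the harness's actual control flow: `Verdict`, `generate_with_review` and the `max_rounds` cap are hypothetical, and the two closures stand in for LLM calls with different prompts.

```rust
// Hypothetical sketch of a generate/review loop; the names are assumptions.

/// Outcome of a review step.
enum Verdict {
    Approved,
    ChangesRequested(String),
}

/// Ask `generate` for a draft, let `review` judge it, and feed the
/// reviewer's feedback into the next attempt, for at most `max_rounds`.
fn generate_with_review<G, R>(
    mut generate: G,
    mut review: R,
    max_rounds: usize,
) -> Result<String, String>
where
    G: FnMut(Option<&str>) -> String,
    R: FnMut(&str) -> Verdict,
{
    let mut feedback: Option<String> = None;
    for _ in 0..max_rounds {
        let draft = generate(feedback.as_deref());
        match review(&draft) {
            Verdict::Approved => return Ok(draft),
            Verdict::ChangesRequested(why) => feedback = Some(why),
        }
    }
    Err(format!("no approved draft after {} rounds", max_rounds))
}

fn main() {
    // Stand-ins for two LLM roles: the generator only adds acceptance
    // criteria after the reviewer has asked for them.
    let result = generate_with_review(
        |feedback: Option<&str>| match feedback {
            None => String::from("Description only"),
            Some(_) => String::from("Description + acceptance criteria"),
        },
        |draft: &str| {
            if draft.contains("acceptance criteria") {
                Verdict::Approved
            } else {
                Verdict::ChangesRequested(String::from("acceptance criteria are missing"))
            }
        },
        3,
    );
    println!("{:?}", result);
}
```

Capping the number of rounds is one way to keep a disagreeing generator and reviewer from looping forever; whether S-Expresso does this is not something the post states.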
Example Development: asciipet
As an example, let’s build a CLI tool in the spirit of cowsay. But
let’s make it a bit more flexible: we define a set of animals that can
be requested. I call the CLI tool asciipet.
S-Expresso is specialized in building applications and tools in the
Rust programming language. This avoids having the LLM use a set of
different programming languages, which I hope makes the process more
predictable. It also helps me in learning Rust. Last but not least, it
allows me to introduce tools specific to the Rust ecosystem. E.g.,
cargo test and cargo fmt are built-in tools. The LLM doesn’t need
to know about them; the harness offers them with a
description. Another approach would be to introduce skills, but I
deemed that too complex for this experiment.
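One way to picture built-in tools that are "offered with a description" is a fixed registry the harness owns. The sketch below is an assumption about how that could look, not the real implementation: `Tool`, `TOOLS`, `describe_tools` and `run_tool` are hypothetical names. Only `cargo test` and `cargo fmt` come from the text above.

```rust
use std::process::{Command, Output};

// Hypothetical sketch: the registry layout and names are assumptions.

/// A tool the harness exposes to the LLM by name and description.
struct Tool {
    name: &'static str,
    description: &'static str,
    program: &'static str,
    args: &'static [&'static str],
}

/// The closed set of tools; anything not listed here cannot be run.
const TOOLS: &[Tool] = &[
    Tool {
        name: "cargo_test",
        description: "Run the project's test suite.",
        program: "cargo",
        args: &["test"],
    },
    Tool {
        name: "cargo_fmt",
        description: "Format all Rust sources in the project.",
        program: "cargo",
        args: &["fmt"],
    },
];

/// Text the harness could place in the prompt so the LLM knows what exists.
fn describe_tools() -> String {
    TOOLS
        .iter()
        .map(|t| format!("{}: {}", t.name, t.description))
        .collect::<Vec<_>>()
        .join("\n")
}

/// Run a tool by name; unknown names are rejected before anything executes.
fn run_tool(name: &str) -> Result<Output, String> {
    let tool = TOOLS
        .iter()
        .find(|t| t.name == name)
        .ok_or_else(|| format!("unknown tool: {}", name))?;
    Command::new(tool.program)
        .args(tool.args)
        .output()
        .map_err(|e| e.to_string())
}

fn main() {
    println!("{}", describe_tools());
    // A request outside the registry never reaches the shell.
    match run_tool("rm") {
        Ok(_) => println!("unexpected: tool ran"),
        Err(e) => println!("rejected: {}", e),
    }
}
```

Because the LLM only ever names a tool, the exact command line stays under the harness's control, which fits the "be constrained" goal.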
Running s-expresso web starts a project with an empty Kanban board. There
are also a CLI and a TUI, which I introduced at the start. However, I came
to prefer the web UI, so these modes are not really maintained.
Next, I can explain the idea I have to the product owner in the S-Expresso Kanban team. Something akin to:
I want to build a CLI similar to cowsay, that accepts an animal
and a text. It generates an ASCII art rendition of the animal
saying the text. The tool as a pre-defined set of animals.
Here is an example of a dragon:
, ,
\\ \\
) \\ \\ _p_
)^\))\)) / *\
\_|| || / /^`-'
__ -\ \\--/ /
<' \\___/ ___. )'
`====\ )___/\\
// `"
\\ / \
`"
Here is an example of a dog:
__
/ \
/ ..|\
(_\ |_)
/ \@'
/ \
_ / ` |
\\/ \ | _\
\ /_ || \\_
\____)|_) \_)
Example invocation: asciipet --animal dragon --message "Rust is awesome!"

The Product Owner will pick this up and analyze my request. The “ask user” tool is used to pose clarifying questions:

Once all questions have been answered, the PO will write down a Functional Elaboration and Functional Acceptance Criteria in the ticket.
For this idea, the PO has decided to create a single ticket (because it is a relatively small CLI tool), and the functional elaboration is reviewed:

Next, the Architect writes out some technical notes on how to
implement the ticket. The Team Lead reviews the technical
elaboration. For a small story like this, this is rather trivial. The
architect opts for clap for command line argument parsing and
anyhow for error handling. These are two well-known choices within
the Rust ecosystem.
The developer can then kick in and start development of the ticket. The developer’s context starts with the following:
Implement ticket smooth-frog-45:
Title: CLI tool for ASCII art animal messages
Description:
Build a CLI tool named `asciipet` that accepts an animal and
a text message, then generates an ASCII art rendition of the animal
saying the text. The tool must support a pre-defined set of animals:
dragon, dog, dragonfly, and penguin.
**Functionality:**
- Accept `--animal` and `--message` as command-line arguments.
- Render the ASCII art of the specified animal with the message displayed
in a speech bubble or adjacent to the animal.
- Support the following animals: dragon, dog, dragonfly, penguin.
**Example Invocation:**
`asciipet --animal dragon --message "Rust is awesome!"`
**Example Output for Dragon:**
, ,
\ \
) \ \ _p_
)^\\)) / *\
\_|| || / /^`-'
__ -\ \--/ /
<' \___/ ___. )'
`====\ )___/\\
// `"
\\ / \
`"
Rust is awesome!
**Example Output for Dog:**
__
/ \
/ ..|\
(_\ |_)
/ \@'
/ \
_ / ` |
\\/ \ | _\
\ /_ || \\_
\____)|_) \_)
Rust is awesome!
**Scope:**
- Implement argument parsing for `--animal` and `--message`.
- Store ASCII art templates for each animal: dragon, dog, dragonfly, penguin.
- Render the ASCII art with the message.
- Handle errors for unsupported animals or missing arguments.
Acceptance Criteria:
- Tool is named `asciipet` and is a single Rust binary.
- Accepts `--animal` and `--message` as command-line arguments.
- Supports the following animals: dragon, dog, dragonfly, penguin.
- Renders the ASCII art of the specified animal with the message
displayed adjacent to the animal.
- Handles errors for unsupported animals or missing arguments with
clear error messages.
- Example invocation: `asciipet --animal dragon --message "Rust is awesome!"`
produces the expected ASCII art output for the dragon with the message.
Technical Elaboration:
**Binary name**: asciipet
Use `clap` for argument parsing. Store ASCII art templates as `&'static str`
constants in a module. Use `match` on the animal argument to select the template.
Print the template followed by the message.
Use `anyhow` for error handling (unsupported animal, missing args). (N=50)
Notes from other roles:
[functional_reviewer] Functional Review Feedback:
**Strengths:**
- The problem statement is clear and well-scoped.
- The description explicitly defines the CLI tool's purpose,
required arguments, and supported animals.
- The acceptance criteria are specific, testable, and unambiguous.
- Examples (invocation and output) are provided, which clarify expectations.
- Error handling requirements (unsupported animals, missing arguments) are
explicitly mentioned.
**Minor Observations (Non-Blocking):**
- The phrase "displayed in a speech bubble or adjacent to the animal"
in the description could be slightly ambiguous, but the examples clarify
that the message is displayed adjacent to the animal. This does not require
revision.
- The acceptance criteria do not explicitly mention the penguin and dragonfly
examples, but the scope and description cover all four animals. This is
acceptable as the examples are illustrative.
**Conclusion:**
The ticket is ready for TECHNICAL_ANALYSIS. All requirements are clear,
testable, and unambiguous.
[team_lead] **Technical Review Feedback:**
**Strengths:**
- Binary name, argument parsing library (`clap`),
and error handling approach (`anyhow`) are explicitly specified.
- ASCII art storage approach (`&'static str` constants in a module)
is clear and idiomatic.
- Implementation strategy (match on animal argument, print template + message)
is straightforward and sufficient for the scope.
- All required dependencies and Rust-specific details are defined upfront.
**Assessment:**
The technical elaboration is complete and unambiguous.
A developer can implement this without asking questions.
The scope is small and self-contained, and the approach is well-defined.
**Decision:** Ready for QUEUED.
All the necessary requirements and context are written down in the ticket. The developer can access the list of all tickets, to get more information about specific aspects if needed.
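To give a feel for what the technical elaboration asks for, here is a minimal sketch of the template lookup and rendering. It is not the code the developer role produced. To keep it dependency-free, hand-rolled argument parsing and a small error enum are swapped in for `clap` and `anyhow`, the art constants are placeholders, and `PetError`, `template`, `render` and `parse_args` are hypothetical names.

```rust
use std::fmt;

// Placeholder art: the real templates would be the multi-line drawings.
const DRAGON: &str = "<dragon art>";
const DOG: &str = "<dog art>";
const DRAGONFLY: &str = "<dragonfly art>";
const PENGUIN: &str = "<penguin art>";

/// Errors the ticket asks for; stands in for `anyhow` in this sketch.
#[derive(Debug, PartialEq)]
enum PetError {
    UnsupportedAnimal(String),
    MissingArgument(&'static str),
}

impl fmt::Display for PetError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            PetError::UnsupportedAnimal(a) => write!(f, "unsupported animal: {}", a),
            PetError::MissingArgument(flag) => write!(f, "missing argument: {}", flag),
        }
    }
}

/// `match` on the animal name to select a `&'static str` template.
fn template(animal: &str) -> Result<&'static str, PetError> {
    match animal {
        "dragon" => Ok(DRAGON),
        "dog" => Ok(DOG),
        "dragonfly" => Ok(DRAGONFLY),
        "penguin" => Ok(PENGUIN),
        other => Err(PetError::UnsupportedAnimal(other.to_string())),
    }
}

/// Print the template followed by the message, as the ticket specifies.
fn render(animal: &str, message: &str) -> Result<String, PetError> {
    Ok(format!("{}\n{}", template(animal)?, message))
}

/// Hand-rolled stand-in for `clap`: reads `--animal X --message Y`.
fn parse_args<I: Iterator<Item = String>>(mut args: I) -> Result<(String, String), PetError> {
    let (mut animal, mut message) = (None, None);
    while let Some(flag) = args.next() {
        match flag.as_str() {
            "--animal" => animal = args.next(),
            "--message" => message = args.next(),
            _ => {} // unknown arguments are ignored in this sketch
        }
    }
    Ok((
        animal.ok_or(PetError::MissingArgument("--animal"))?,
        message.ok_or(PetError::MissingArgument("--message"))?,
    ))
}

fn main() {
    // Fixed arguments keep the sketch reproducible; a real binary would
    // pass `std::env::args().skip(1)` instead.
    let args = vec!["--animal", "dog", "--message", "Rust is awesome!"];
    let parsed = parse_args(args.into_iter().map(String::from));
    match parsed.and_then(|(animal, message)| render(&animal, &message)) {
        Ok(out) => println!("{}", out),
        Err(e) => eprintln!("Error: {}", e),
    }
}
```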
After a while, the developer states: Task Completed.
Running the project:
$ cargo r -- --animal dragon --message "Hello, World"
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.02s
Running `target/debug/asciipet --animal dragon --message 'Hello, World'`
, ,
\ \
) \ \ _p_
)^\\)) / *\
\_|| || / /^`-'
__ -\ \--/ /
<' \___/ ___. )'
`====\ )___/\\
// `"
\\ / \
`"
Hello, World
The dragonfly seems to be less of a success …
/\\\\
/ o o \
( == ^ == )
\ \___/ /
\______/
Hello, World
Finally, the Reviewer has a go at the code base. It has three minor comments, which are added to the source code file. However, it decides that the code is good enough at this stage and moves the ticket to done!

Great! The ticket is implemented, the software seems to work.
What happened?
To make the development process even more transparent, I can follow the messages being sent to the LLM and the tools being called in the History tab of the web UI.
There is a timeline view that gives a Git-log-like view of the interactions, and a log view that gives more detail on the exact prompts that are provided to the LLM.

This time with Claude Code
For comparison, I gave Claude Code (Sonnet 4.6) the same instructions:
cargo r -- --animal dragonfly --message "Hello, World"
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.02s
Running `target/debug/asciipet --animal dragonfly --message 'Hello, World'`
Error: Unsupported animal: "dragonfly". Use --list to see available animals.
and
cargo r -- --animal penguin --message "Hello, World"
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.02s
Running `target/debug/asciipet --animal penguin --message 'Hello, World'`
______________
| Hello, World |
--------------
____
/ \
/ __ \
| / \ |
| | | |
\ \__/ /
\______/
In my opinion, S-Expresso did pretty well, given that it was vibe-coded by one engineer and uses a model that is 10x to 20x smaller than the frontier models of other labs.
Token Usage
| Metric | S-Expresso (Mistral Medium 3.5) | Claude Code (Claude Sonnet 4.6) |
|---|---|---|
| Input tokens | 1,416,370 | 439,900 |
| Cached tokens | 1,160,554 | 438,500 |
| Output tokens | 30,921 | 12,900 |
| Price Input | $1.5/Mtok | $3/Mtok |
| Price Output | $7.5/Mtok | $15/Mtok |
S-Expresso used a lot more tokens to create asciipet. Cost-wise,
with token counts in Mtok and cached tokens billed at $0.15/Mtok,
I’ve spent (1.416370 - 1.160554) x $1.5 + 1.160554 x $0.15 +
0.030921 x $7.5 ≈ $0.79 on Mistral tokens and about $0.46 on
Claude tokens.
On this front, there is definitely some improvement possible for S-Expresso. Although, who knows how fair/correct the API pricing is at this point?
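The Mistral cost can be recomputed from the table with a few lines of Rust. This is a sketch under two assumptions: cached tokens are a subset of the input tokens (so only the remainder is billed at the full input price), and cached tokens cost $0.15/Mtok, the rate used in the calculation above. `cost_usd` is a hypothetical helper.

```rust
/// Cost in USD from raw token counts and per-million-token prices.
/// Assumes `cached` is a subset of `input`.
fn cost_usd(
    input: u64,
    cached: u64,
    output: u64,
    price_input: f64,
    price_cached: f64,
    price_output: f64,
) -> f64 {
    // Convert every count to millions of tokens the same way.
    let mtok = |tokens: u64| tokens as f64 / 1_000_000.0;
    mtok(input.saturating_sub(cached)) * price_input
        + mtok(cached) * price_cached
        + mtok(output) * price_output
}

fn main() {
    // Token counts and prices for S-Expresso from the table;
    // the $0.15/Mtok cached rate is taken from the text.
    let mistral = cost_usd(1_416_370, 1_160_554, 30_921, 1.5, 0.15, 7.5);
    println!("Mistral Medium 3.5: ${:.2}", mistral); // prints "Mistral Medium 3.5: $0.79"
}
```

The Claude figure is not recomputed here because the post does not state the cached-token price it was billed at.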
What Did I Learn?

- Building a coding harness is harder than you might think. It’s a fun and educational experiment and experience nonetheless.
- Guardrails, guardrails, and feedback loops: add as many automated tests, verification steps, E2E manual testing (e.g., Playwright), and as much verification in general to your project as you can. Make them fast, make the coding harness run them often!
  This will give you a chance of getting to a result that kind of resembles what you intended.
- You are the driver. If you don’t take the reins and make sure the coding harness is chopping up tasks, running verifications, double checking and running tests all the time, it will go off on a tangent.
  Don’t treat a coding harness as a magic wand; you still have to put in the hard work:
  - Clearly specifying requirements and expectations.
  - Setting up guardrails.
  - Verifying, reviewing and testing.
- Terminal coding agents are not the end game: I do believe at this point that innovation will happen in the harnesses and the way we interact with them, and probably less in the evolution of the models.
The Code
The code is open sourced on SourceHut.