Announcing Development on Flirt | annotated by Colton

I've started working on Flirt, which stands for "Fabulous, Legendary, Incremental Review Tool". Firstly, what is it and why might you be interested in it?

It avoids the need to review the same code multiple times when the code author amends or rebases their commits. This is relevant for people who value good commit history and see it as something to be iterated on during code review.
It's agnostic with respect to the code sharing / code review platform. That means: You can jump between open-source projects using GitHub, a mailing list etc. and your code review experience stays consistent.
It's a local-first tool, so it integrates seamlessly with your other tools. Using your editor to read, test and comment on code you review is a breeze.

If any of those points tickle your neurons, keep reading!

Austin Seipp wrote a great explanation of what he calls "interdiff" code review. I find the term a little technical, since it refers to how it's implemented. Instead, I chose the term "incremental code review", which emphasises the practical benefit it provides. It's the same thing, though. Since Austin already explained it so well, I won't reiterate all the details of his post. If you are interested in learning more, do click on that link above.

First, let's illustrate the problem we're trying to solve. Why, or when, is regular code review "non-incremental"? It depends on your workflow.

I call this the PR-workflow, because it was popularized by GitHub with its "pull request" feature.

The PR-workflow encourages code authors to not touch commits that were already reviewed and only add new ones at the tip of their branch. This makes it easy for reviewers and their tools to keep track of what's already been reviewed. In any review cycle, a reviewer only has to look at the diff between their last-reviewed commit and the most recent one.

The PR-workflow is already "incremental". People who use it (and are happy with it) won't benefit as much from using Flirt. They may, however, be interested to learn why other people prefer a different workflow!

I call this the patch-series-workflow, because the way it is practiced on mailing lists stands in strong contrast to the PR-workflow. However, it is also practiced on GitHub, for example by the Jujutsu project.

The patch-series-workflow encourages code authors to amend, squash, fixup and rebase their commits continuously over each review cycle. The primary aim of this is to work towards a high-quality commit history. A carefully crafted commit history has many benefits. For example, while searching for the source of a bug, a clean history makes it easier to identify and revert the commit which introduced it.

However, operations that modify existing commits (amend, squash, fixup, rebase) completely change the hash (identifier) of a commit. Afterwards, it's not easy to determine how the commit was modified. Most code review tools like GitHub's built-in pull request UI struggle to mitigate this. Band-aid features like the "changes since last review" button or "viewed" checkboxes on files help, but they fail completely in many normal situations. Other tools struggle as well, though sometimes in different ways. For example, people using mailing lists may be familiar with git range-diff, which uses heuristics to compare two versions of a patch series.

The relevant consequence is that reviewers often have to review the same code twice. Imagine a commit in a patch series changes 100 lines. 99 of the lines are good, but the first round of review suggests adding a missing semicolon on one of the lines. The author does that by amending the existing commit, to avoid an unseemly "add missing semicolon" commit ending up in the project history. After a force-push (or new patch series submission), reviewers are generally presented with the full 100-line diff to review, because the tools don't understand that only one line changed since the last review. I call such tools "non-incremental" review tools, since they basically ask the reviewer to start from scratch every review cycle. This is frustrating and inefficient!
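To put numbers on the semicolon example, here is a minimal Python sketch using the standard difflib module. The file contents and the helper name count_changed are made up for illustration; it only counts changed lines and says nothing about how any real review tool computes its diffs.

```python
import difflib


def count_changed(old: list[str], new: list[str]) -> tuple[int, int]:
    """Return (removed, added) line counts between two versions of a file."""
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    removed = added = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            removed += i2 - i1
            added += j2 - j1
    return removed, added


# Hypothetical data: the commit adds a new 100-line file.
base: list[str] = []
first_version = [f"let x{i} = {i};" for i in range(100)]
first_version[41] = "let x41 = 41"  # the missing semicolon

# After review, the author amends the commit to fix that one line.
amended_version = list(first_version)
amended_version[41] = "let x41 = 41;"

# A non-incremental tool shows the whole commit again.
print(count_changed(base, amended_version))           # (0, 100)
# An incremental tool only shows what changed since the last review.
print(count_changed(first_version, amended_version))  # (1, 1)
```

The second call is the view the reviewer actually wants in the second round: one line removed and one line added, instead of 100.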

People who use the patch-series-workflow (or want to switch to it, once its problems are solved) are the target audience of Flirt.

The solution is enabled by the concept of a "change-id": an alternative commit identifier. In contrast to the commit-hash, the change-id stays the same when a commit is amended or rebased.

A review tool can remember the commits that have already been reviewed with their change-id. Once a new commit is proposed for review, the tool can pair it up with a previously-reviewed commit with the same change-id. Calculating the "interdiff" between these two commits will show exactly those changes which the reviewer has not yet seen. That's why Austin calls it "interdiff" code review.
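The pairing step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Commit class, the pair_for_review function and the example hashes and change-ids are all hypothetical, not Flirt's actual data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Commit:
    commit_hash: str  # changes whenever the commit is amended or rebased
    change_id: str    # stays the same across amends and rebases


def pair_for_review(
    reviewed: list[Commit], proposed: list[Commit]
) -> list[tuple[Optional[Commit], Commit]]:
    """Pair each proposed commit with the reviewed commit sharing its change-id.

    Returns (old, new) pairs that still need attention. `old` is None when
    the reviewer has never seen that change. Commits whose hash is identical
    to the reviewed one are skipped entirely.
    """
    last_seen = {commit.change_id: commit for commit in reviewed}
    pairs = []
    for new in proposed:
        old = last_seen.get(new.change_id)
        if old is not None and old.commit_hash == new.commit_hash:
            continue  # untouched since the last review
        pairs.append((old, new))
    return pairs


# Hypothetical review state: two commits were reviewed in the first round.
reviewed = [Commit("aaa1", "kxyz"), Commit("bbb1", "mnop")]
# Second round: one commit untouched, one amended, one brand new.
proposed = [Commit("aaa1", "kxyz"), Commit("bbb2", "mnop"), Commit("ccc1", "qrst")]

for old, new in pair_for_review(reviewed, proposed):
    old_hash = old.commit_hash if old else "(new)"
    print(f"{new.change_id}: {old_hash} -> {new.commit_hash}")
```

Each (old, new) pair is then the input for an interdiff. A commit that was only rebased would still show up as a pair here, but its interdiff would ideally be empty.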

Edit: When I first posted this, I forgot to mention where this change-id is coming from. Jujutsu is a Git-compatible VCS that sets this change-id in a custom commit header. That means Flirt will only be able to take advantage of this mechanism if the author uses Jujutsu. Since a lot of people haven't made the jump yet and are still using Git, Flirt needs some fallback solution. Conceptually, it would be doing the same thing as git range-diff, although the heuristics would be much simpler at first. Nevertheless, the experience of using Flirt will improve as Jujutsu adoption grows.
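To give a feel for what a very simple fallback heuristic could look like when no change-id is available, here is a Python sketch that pairs old and new commits by the textual similarity of their patches. It is an assumption for illustration only: the function match_commits, the 0.6 threshold and the sample patches are invented, and neither git range-diff nor Flirt is claimed to work this way.

```python
import difflib
from typing import Optional


def match_commits(
    old: dict[str, str], new: dict[str, str], threshold: float = 0.6
) -> dict[str, Optional[str]]:
    """Greedily pair new commits with old ones by patch similarity.

    `old` and `new` map commit hashes to patch text. The result maps each
    new hash to the best-matching old hash, or None if nothing is similar
    enough (the commit is treated as brand new).
    """
    scored = []
    for old_hash, old_patch in old.items():
        for new_hash, new_patch in new.items():
            ratio = difflib.SequenceMatcher(
                a=old_patch.splitlines(), b=new_patch.splitlines(), autojunk=False
            ).ratio()
            scored.append((ratio, old_hash, new_hash))

    scored.sort(reverse=True)  # most similar pairs first
    matches: dict[str, str] = {}
    used_old: set[str] = set()
    for ratio, old_hash, new_hash in scored:
        if ratio < threshold:
            break  # everything after this is even less similar
        if old_hash in used_old or new_hash in matches:
            continue  # each commit can be paired at most once
        matches[new_hash] = old_hash
        used_old.add(old_hash)
    return {new_hash: matches.get(new_hash) for new_hash in new}


# Hypothetical patches from two versions of a series.
old = {
    "a1": "+def add(a, b):\n+    return a + b\n+print(add(1, 2))",
    "b1": "+import sys\n+sys.exit(0)",
}
new = {
    "a2": "+def add(a, b):\n+    return a + b\n+print(add(2, 3))",  # amended
    "b2": "+import sys\n+sys.exit(0)",                              # only rebased
    "c1": "+# a brand-new commit",                                  # added
}
print(match_commits(old, new))
```

A heuristic like this can guess wrong when a commit is rewritten heavily or when two commits look alike, which is exactly why a stable change-id is the more reliable signal.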

That's the core feature of Flirt. It remembers what you reviewed and by default only shows you what changed in comparison to that. Let's see that in action!

The basic logic of this "incremental review" is implemented. Please don't judge the "user interface" of Flirt yet! It's just a CLI to show this basic function. The main UI will probably end up being a TUI, with the core application logic being exposed as a library. That way, other people can make fancier GUIs, editor plugins etc. on top.

Before you watch the demo, I have to explain one more thing: Flirt has its own name for "pull request", "patch series" or "unit of review". It's "Spirit", which stands for "Set of Patches Intended for Review and Intense Testing". As you can see, I'm trying to be consistent with the silly acronyms.

The demo is made using asciinema. It's not a video, you can even select & copy the text from it! My typing is probably too slow for your reading speed, so I recommend skipping ahead in little jumps with the right arrow key.

One code review tool I haven't mentioned in the incremental review section is Gerrit. The review UI built into Gerrit basically already accomplishes incremental review with clean history using a change-id. That's exactly what Flirt is trying to do, so why reinvent the wheel? My main argument is this:

I don't control the code sharing platform of all the projects I work on. My workplace and many of my favorite open-source projects use GitHub. I have never seriously used Gerrit. Not because I don't want to, but because none of the projects I work on are hosted on Gerrit. I don't even know if Gerrit is as great as people say it is. Not that I don't believe it, I just haven't experienced it.

I find it tragic that our code review tools are tied together with our code sharing platforms. It means we have to put up with whatever review tool is imposed upon us. To solve that problem is the second motivation for making Flirt. I want to use a good review tool on every project, without having to ask some enterprise admin to pretty-please rebuild their entire tech-stack from scratch.

So, the idea is pretty simple. Flirt is going to have multiple backends for communicating review information. If the project is hosted on GitHub, making a review comment in Flirt will hit the GitHub API so that the comment appears to other GitHub users as if it was made in the browser. If the project uses a mailing list, making a comment will... send an email... where the comment is sandwiched inside a patch... possibly inside a thread of other comments at multiple levels of indentation... (No, I'm not looking forward to implementing this. But I still have some naive optimism left in the tank.)

I don't think making a backend-agnostic design will be too difficult, just a lot of dirty work for each individual backend. The only challenge I foresee is the fact that different backends will support different features. For example, a comment thread on GitHub can be marked as "resolved". That's not a thing on a mailing list. (Unless you send an email literally saying "I think this is resolved", but that would be super spammy.) Having to navigate that will require some careful design work. The goal is to develop Flirt with a lean set of features such that it is not held back by any particular backend. At the same time, Flirt shouldn't have many or significant features that are optional and don't work for every backend. That would lead to an inconsistent experience as users switch from one project to another.
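One way to picture the backend-agnostic idea is a small common interface plus an explicit flag for each feature that not every backend supports. The Python sketch below is purely hypothetical: ReviewBackend, Comment, can_resolve_threads and the two fake backends are invented names, nothing here talks to GitHub or sends email, and Flirt's real design may look nothing like it.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Comment:
    path: str
    line: int
    body: str


class ReviewBackend(ABC):
    """Hypothetical common interface for all review backends."""

    # Optional feature: can comment threads be marked as resolved?
    can_resolve_threads: bool = False

    @abstractmethod
    def post_comment(self, comment: Comment) -> str:
        """Publish a comment; return a description of what was sent."""


class FakeGitHubBackend(ReviewBackend):
    can_resolve_threads = True

    def post_comment(self, comment: Comment) -> str:
        # A real backend would call the GitHub API here.
        return f"review comment on {comment.path}:{comment.line}: {comment.body}"


class FakeMailingListBackend(ReviewBackend):
    def post_comment(self, comment: Comment) -> str:
        # A real backend would compose an email reply quoting the patch.
        return f"> {comment.path}:{comment.line}\n{comment.body}"


def works_everywhere(backends: list[ReviewBackend]) -> bool:
    """True if thread resolution is available on every given backend."""
    return all(backend.can_resolve_threads for backend in backends)


backends = [FakeGitHubBackend(), FakeMailingListBackend()]
comment = Comment("src/main.rs", 12, "Missing semicolon.")
for backend in backends:
    print(backend.post_comment(comment))
print(works_everywhere(backends))  # False: mailing lists cannot resolve threads
```

A check like works_everywhere makes the design tension visible: a feature that is not supported by every backend either stays out of the core feature set or becomes one of the optional features the post wants to keep to a minimum.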

One particularly interesting backend I've got in mind is a "Git native" backend. With that, Flirt would store all of its data worthy of sharing in a custom file format and dump it in a commit. That commit can be pushed to and pulled from a Git remote. This would allow small teams to enjoy the full feature set of Flirt without having to host any code sharing platform. A bare repo on a server would suffice. That being said, every contributor would be forced to use Flirt as their review tool, so it's not a viable option for open-source projects.

What particular backends am I planning to support? All of them! GitHub, mailing lists, Forgejo (Codeberg), GitLab, Gerrit... plus the other ones you're thinking of right now. That's where the open-source community comes in. If people enjoy Flirt, I'm hoping they will show up to help make it great for the backends they care about. More details about the plans to open-source are in the roadmap.

You probably already have an amazing code review tool locally. It's called a code editor. These things typically have very advanced features for code review:

Viewing the diff of the Git HEAD and your worktree.
Navigating and inspecting the code with LSP integration or language plugins.
Commenting the code by using language-specific comment-syntax.
Running the code and its tests.

I'm guessing you use your editor at least to review your own code. Many people also use it already to review other people's code. (gh co <pr-number> is such a time-saver) But at present, there is a big gap in the workflow: You leave your editor, open a browser, navigate to some page or tab, then navigate to the exact same place in the code you were just looking at, click on a line to open a text box, and finally you can write your comment. Worst of all, that text box may not support Helix motions!

I'm planning for Flirt to work as much as possible with your code editor. Concrete features I have in mind include:

To review an "interdiff", Flirt will move your Git HEAD to the "from" part of the interdiff and write the "to" part to your worktree. That way, your editor will show you the same thing as jj interdiff.
You write comments in your normal language syntax. Flirt will slurp those up, strip whitespace and comment markers, and let you post that to the GitHub API at the push of a button.
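The comment "slurping" idea can be sketched as follows, assuming the tool knows the file as it was presented for review and the file as the reviewer left it. The function extract_review_comments, its "#" default marker and the sample file are assumptions for illustration, built on Python's standard difflib module; this is not Flirt's implementation.

```python
import difflib


def extract_review_comments(
    original: list[str], edited: list[str], marker: str = "#"
) -> list[tuple[int, str]]:
    """Collect comment blocks the reviewer inserted into a file.

    Returns (line_number, text) pairs, where line_number is the 1-based
    line of `original` that the comment block was written directly above.
    Whitespace and the comment marker are stripped from the text.
    """
    comments = []
    matcher = difflib.SequenceMatcher(a=original, b=edited, autojunk=False)
    for tag, i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag != "insert":
            continue  # only pure insertions count as review comments
        texts = []
        for line in edited[j1:j2]:
            stripped = line.strip()
            if stripped.startswith(marker):
                texts.append(stripped[len(marker):].strip())
        if texts:
            comments.append((i1 + 1, " ".join(texts)))
    return comments


# Hypothetical file under review, before and after the reviewer's notes.
original = [
    "def area(r):",
    "    return 3.14 * r * r",
]
edited = [
    "def area(r):",
    "    # Consider math.pi here.",
    "    # 3.14 loses precision.",
    "    return 3.14 * r * r",
]
print(extract_review_comments(original, edited))
# [(2, 'Consider math.pi here. 3.14 loses precision.')]
```

The resulting (line, text) pairs are the kind of data a backend would need in order to post the comment in the right place. A real tool would also have to handle other comment syntaxes and comments appended at the very end of a file.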

That's not an exhaustive list, but it should get the mindset across. Flirt will leverage your code editor for what it's good at, instead of forcing you to do the same thing in custom text boxes and drop-downs.

I haven't fully decided whether Flirt is going to duplicate some of these features with a "regular" code review UI (e.g. text boxes for writing comments). This will probably be decided when such a need becomes apparent.

You might be wondering: Where exe? I completely understand.

This is my master's thesis. I'm studying part-time and will graduate in the Summer of 2026. Until then, I will be working on it by myself (closed-source). The upside is that I can share my roadmap, because I don't have to coordinate with other contributors:

By the end of November: Implement a proof of concept for incremental review. This is already complete, resulting in the demo above.
By the end of January: Develop a reasonably detailed specification of the feature set, taking care to support a broad variety of backends while keeping the user experience across backends consistent. Implement this feature set for the "Git native" backend.
By the end of March: Implement the GitHub and mailing list backends.
By the end of May: Provide a polished user experience.
By the end of July: Hand in my thesis.

What that means for you is: I might decide to publish some alpha builds in early 2026. At the latest, I should have something usable to share in June.

Lastly, Flirt will likely become open-source in August 2026. That's when you will be able to help out by adding support for your favorite backend. (I have a Forgejo instance running on my Raspberry Pi, so that will be my next backend of choice to work on.) I haven't decided on the license yet. GPL? Unlicense? Both extremes are ideologically appealing to me, each in their own way. I'd love to hear your opinion on this. You might just tip the scale.

As I hit (or miss) these milestones, there's a good chance I'll post about it here. If you'd like to be notified, subscribe to the Atom feed or follow me on Bluesky: @buenzli.dev.

I'm hoping Flirt will end up being useful to you and others. I can't wait to share it with the community when it's done. Please let me know if you have ideas about how to make it the best code review tool for you!

Here are some discussion threads on various platforms where you can discuss this post with me and others: