Claude Todo: The Missing Layer Between AI Sessions

There's a shift happening in how software gets built at AI-forward companies, and if your company constrains your AI budget, you might not see it. It's not the shift we read about in X posts, the one where an agent writes a whole feature from a single prompt. That shift is real, but it obscures a more immediate, more human problem: the cognitive cost of managing the growing volume of work that AI is doing on our behalf.

More Sessions, More Context, More Switches

If a company has successfully made the shift to tools like Claude Code, the number of concurrent tasks its individual developers are managing has gone up. It's not because we're writing more code (in many cases we're writing less); it's because the bottleneck has moved. We're no longer the person implementing a fix. We're the person who scopes the fix, launches the session, reviews the output, and provides the judgment call when the agent deviates from the plan. Then we switch to the next task and do it all again. Multiply that by two or three, because we're doing it in parallel with other tasks, not sequentially.

Each of these tasks carries its own context and, in my case, can even span multiple codebases. The database migration for the API service has a different set of constraints than the design system refactor, which has a different set of constraints than the flaky e2e test investigation. As agents get better and take on larger, more architecturally significant work, the context per task gets heavier, not lighter.

The tools reduce implementation time, but they increase the cognitive overhead of orchestration. We're each running a small team of agents, and we're the only project manager.

The Factory Floor, Eventually

I think the industry is heading toward what some are calling software factories (I don't like the dystopian imagery the term creates). That said, if we tier the kinds of tasks engineering teams handle (critical bug fixes to read paths, fixes that mutate state, minor enhancements, configuration changes, dependency bumps), there's a tier at the bottom where a well-configured agent with the right guardrails can ship a change end-to-end without a human in the loop. The agent reads the ticket, understands the codebase, writes the fix, runs the tests, and opens the PR. Automatic quality gates provide confidence. Novel checks, such as plan-deviation detection, flag risk. A human might then review it, or another agent might.

However, this shift is deeply contextual to the project. It requires mature test suites, well-documented conventions, stable APIs, and a level of codebase legibility that many production systems don't have. Not every company has Stripe's level of sophistication. As of March 2026, for most real-world production codebases, we're not there yet. We're in the messy middle: agents are powerful enough to do serious work, but they still need a human to steer, to prioritize, and, critically, to context-switch between them.

The Missing Project Management Layer

What struck me was that there's no good answer to a simple question: what are all my active Claude sessions doing right now, and which ones need me?

I find myself with 3-5 terminals open. One finished ten minutes ago and I didn't notice. Another is waiting for permission to proceed. A third is running fine, but I've lost track of what I asked it to do. It's a project management problem dressed up as a terminal management problem.

So as a weekend project, I built Claude Todo.

The idea is deliberately simple: a native desktop app that gives developers a to-do list view of their Claude Code sessions. Not a dashboard, not a workflow engine… just a to-do list. It's the most basic representation of "here's what needs to get done". The developer can add tasks, configure them with a project directory and CLI flags, and launch Claude Code sessions directly from the app. The app tracks which sessions are running, which have finished, and which need attention. When a session requires input, it raises a notification. When the developer is ready to switch context, they can focus the right terminal window with a click.
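In sketch form, a task is just a handful of fields. The names below are illustrative rather than the app's exact schema:

```ts
// Simplified sketch of the task model (field names are illustrative).
type TaskStatus = "not-started" | "in-progress" | "done";

interface Task {
  title: string;
  projectDir?: string; // working directory for the Claude Code session
  cliFlags?: string[]; // e.g. ["--dangerously-skip-permissions"]
  sessionId?: string;  // captured once a session starts, used for resuming
  status: TaskStatus;
}
```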

It's a continuous stream. Developers complete tasks and add more. They pause sessions and resume them later without starting from scratch. They control their environment through a project management layer instead of through raw terminal juggling.

Why Tauri Made This Possible in a Weekend

I want to spend a moment on Tauri, because it genuinely surprised me.

I've tried to build native desktop apps before. In my experience it's many days of fighting platform-specific build toolchains, discovering that the thing that works on macOS doesn't compile on Windows, debugging cryptic linker errors, and eventually shipping something that feels like a web page in a thick frame. I almost didn't want to bother again.

Tauri 2.0 is different in a way I'd almost describe as revolutionary. The core model, a Rust backend and a web frontend communicating through typed commands, turns out to be an extraordinarily productive architecture. The Rust side handles system-level concerns (window management, file watching, process spawning) with the performance and safety guarantees you'd expect from Rust. The frontend is just React. Developers get native window chrome, native file dialogs, and platform-specific behavior without writing platform-specific JavaScript.
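To give a flavor of what "typed commands" means in practice, here's a minimal sketch of the frontend calling into Rust. The command name `focus_terminal` is illustrative, not a Tauri built-in:

```ts
import { invoke } from "@tauri-apps/api/core";

// Ask the Rust backend to bring a terminal window to the foreground.
// `focus_terminal` stands in for a #[tauri::command] defined on the Rust side.
async function focusTerminal(pid: number): Promise<boolean> {
  return invoke<boolean>("focus_terminal", { pid });
}
```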

For Claude Todo, this meant I could use Rust's windows-rs crate to enumerate and focus terminal windows on Windows, AppleScript via osascript on macOS, and expose both through the same TypeScript API. The permission model is explicit: every filesystem path and every shell command the app can invoke is declared in a capabilities file. It's secure by default in a way that Electron never was.
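A trimmed sketch of what such a capabilities file can look like (the permission identifiers here are representative; a real file also scopes specific paths and allowed binaries):

```json
{
  "identifier": "main-capability",
  "windows": ["main"],
  "permissions": [
    "fs:allow-read-text-file",
    "fs:allow-write-text-file",
    "shell:allow-spawn",
    "dialog:allow-open"
  ]
}
```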

The plugin ecosystem helped too. @tauri-apps/plugin-fs for file operations, @tauri-apps/plugin-shell for spawning Claude Code processes, @tauri-apps/plugin-dialog for native folder pickers — these are well-documented, stable, and they just work. I spent my time building features instead of fighting the framework.
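To give a sense of how little glue is involved, here's roughly how those three plugins show up in code (simplified, and assuming the capabilities file allows each call):

```ts
import { readTextFile, writeTextFile, BaseDirectory } from "@tauri-apps/plugin-fs";
import { Command } from "@tauri-apps/plugin-shell";
import { open } from "@tauri-apps/plugin-dialog";

async function demo() {
  // Native folder picker for a task's project directory.
  const projectDir = await open({ directory: true });

  // Read and rewrite the task list file.
  const todo = await readTextFile("claude-todo.md", { baseDir: BaseDirectory.Home });
  await writeTextFile("claude-todo.md", todo, { baseDir: BaseDirectory.Home });

  // Spawn a Claude Code process (the `claude` binary must be allowed in capabilities).
  const child = await Command.create("claude", ["--version"]).spawn();
  console.log(projectDir, child.pid);
}
```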

What Claude Todo Actually Does

The data model is intentionally minimal. Tasks live in a markdown file with three sections (Not Started, In Progress, Done), and metadata is stored inline as pipe-delimited key-value pairs. This means the data is human-readable, version-controllable, and recoverable even if the app disappears.
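An illustrative snapshot of the file (the exact keys and checkbox syntax are representative, not a spec):

```md
## Not Started
- [ ] Investigate flaky e2e test | dir=~/code/web | flags=--dangerously-skip-permissions

## In Progress
- [ ] Migrate users table | dir=~/code/api | session=abc123

## Done
- [x] Bump lodash
```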

When a developer creates a task, they give it a title, optionally point it at a project directory, and configure any desired CLI flags (like --dangerously-skip-permissions for trusted tasks). When they start a task, the app launches a Claude Code session in a new terminal window, moves the task to In Progress, and begins monitoring the terminal process. A status indicator shows which sessions are running. When a session needs input, the app surfaces a notification.
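In sketch form, starting a task looks something like this. It's simplified: the real flow opens a separate terminal window rather than spawning the process directly, error handling is omitted, and it reuses the illustrative Task shape from earlier:

```ts
import { Command } from "@tauri-apps/plugin-shell";

// Hypothetical helpers; the real app wires these to notifications and window focus.
const notifyFinished = (task: Task) => console.log(`done: ${task.title}`);
const pids = new Map<Task, number | undefined>();

async function startTask(task: Task): Promise<void> {
  const cmd = Command.create("claude", task.cliFlags ?? [], {
    cwd: task.projectDir,
  });

  // When the process exits, surface a "finished / needs review" signal.
  cmd.on("close", () => notifyFinished(task));

  const child = await cmd.spawn();
  task.status = "in-progress";
  // Remember the pid so the focus button can find the right window later.
  pids.set(task, child.pid);
}
```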

The focus button brings the right terminal to the foreground, so there's no more hunting through a stack of windows to find the session that's waiting. When a unit of work is done, marking the task complete closes the terminal and archives the item. If the developer needs to pause, stopping a task preserves the session ID so the session can be resumed exactly where it left off.
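Resuming rides on Claude Code's own --resume flag. Roughly:

```ts
import { Command } from "@tauri-apps/plugin-shell";

// Resume a paused task using the session ID saved when it was stopped
// (again reusing the illustrative Task shape from earlier).
async function resumeTask(task: Task): Promise<void> {
  if (!task.sessionId) return;
  await Command.create("claude", ["--resume", task.sessionId], {
    cwd: task.projectDir,
  }).spawn();
  task.status = "in-progress";
}
```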

It's a small tool that solves a specific problem: keeping a human effectively in the loop across multiple concurrent AI sessions.

The Broader Point

The tooling conversation around AI-assisted development has been almost entirely about making agents more capable. Bigger context windows, better tool use, longer autonomy. That's important work. But there's a parallel conversation: how do we make the human side of this collaboration more sustainable?

As agents take on more, the human role shifts from implementer to orchestrator. And orchestration at scale — even a personal scale of five or six concurrent sessions — requires its own tooling. Not because the work is hard, but because the context-switching cost compounds. Every time you lose track of a session, every time you forget what you asked an agent to do, every time you relaunch a session because you didn't realize the old one was still waiting — that's friction that erodes the productivity gains the agent was supposed to provide.

Claude Todo is a small weekend answer to that problem. A to-do list for your AI sessions. Simple enough to explain in a sentence, useful enough that I now can't work without it.
