Inspired by rtk-ai/rtk.
Source: rtk-ai/rtk. Reverse-engineered from a real GitHub workflow.
CLI proxy that reduces LLM token consumption by 60-90% on common dev commands. Single Rust binary, zero dependencies.
I need to build a CLI proxy tool in Rust that intercepts and compresses command outputs to reduce LLM token consumption. The core idea is to act as middleware between the terminal and the LLM context window: when a user runs commands like `ls`, `git status`, `cat`, or test runners, rtk should filter and compress the output intelligently before it reaches the LLM, saving 60-90% of tokens.
Here's what I'm building:
The tool should be a single Rust binary with zero external dependencies. It needs to recognize 100+ common dev commands (git, cargo, npm, docker, python testing tools, grep, tree, etc.) and apply specific compression strategies to each. For example, `ls` output should remove noise and format concisely, `git diff` should summarize changes without full diffs, `cargo test` should keep only failures and summary, and `cat` should truncate very large files intelligently.
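To make the `cargo test` strategy concrete, here is a minimal sketch of a filter that keeps only failures and the summary. The function name `filter_cargo_test` is hypothetical, and the line patterns it matches (`test <name> ... ok`, `running N tests`, `test result:`) are assumptions about the default test harness output, not a complete parser.

```rust
/// Hypothetical filter: drop passing tests, banners, and blank lines from
/// `cargo test` output, keeping failures, panic details, and the summary.
fn filter_cargo_test(raw: &str) -> String {
    let mut kept: Vec<&str> = Vec::new();
    for line in raw.lines() {
        let trimmed = line.trim();
        // Assumed shape of a passing test line: "test some::path ... ok".
        let is_pass = trimmed.starts_with("test ") && trimmed.ends_with("... ok");
        // Assumed shape of the banner: "running 3 tests" / "running 1 test".
        let is_banner = trimmed.starts_with("running ")
            && (trimmed.ends_with(" tests") || trimmed.ends_with(" test"));
        if trimmed.is_empty() || is_pass || is_banner {
            continue;
        }
        // Everything else (FAILED lines, panic messages, "test result:") stays.
        kept.push(line);
    }
    kept.join("\n")
}
```

Because the filter only drops lines it positively recognizes as noise, unfamiliar output is passed through rather than lost, which seems like the safer default for a tool whose job is to preserve error messages.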
The architecture should work like this: when a command runs through rtk, it intercepts the output, matches it against known command patterns, applies the appropriate filter/compression rules, and returns the condensed version. Each command type has its own optimization strategy: keep error messages, strip verbose metadata, collapse repetitive output, and truncate large sections while preserving context.
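One of the strategies named above, collapsing repetitive output, could be sketched with stdlib only as follows. The name `collapse_repeats` and the ` (xN)` suffix format are illustrative assumptions, not a fixed output format.

```rust
/// Hypothetical helper: collapse runs of identical consecutive lines into a
/// single line followed by a repeat count, e.g. "warn (x3)".
fn collapse_repeats(raw: &str) -> String {
    let mut out: Vec<String> = Vec::new();
    let mut lines = raw.lines().peekable();
    while let Some(line) = lines.next() {
        let mut count = 1usize;
        // Consume following lines for as long as they equal the current one.
        while lines.peek() == Some(&line) {
            lines.next();
            count += 1;
        }
        if count > 1 {
            out.push(format!("{line} (x{count})"));
        } else {
            out.push(line.to_string());
        }
    }
    out.join("\n")
}
```

Only exact, adjacent duplicates are merged in this sketch; near-duplicates (for example lines that differ only by a timestamp) would need a normalization step first.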
For the tech stack: pure Rust with no dependencies means I'm using only stdlib. The binary should be extremely fast (sub-10ms overhead) and have minimal footprint so it works everywhere. I should support installation via Homebrew, curl install script, and cargo.
The main features needed: command detection (pattern matching), output filtering rules for each command family (git commands, test runners, file operations, etc.), configurable output limits and formats, and an easy way to define new command handlers. There should also be documentation explaining what each command filter does and why.
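Configurable output limits could be modeled as a small struct passed to a truncation helper. The sketch below keeps the first and last lines of a long output and replaces the middle with a marker; `Limits`, its field names, and the marker text are all hypothetical.

```rust
/// Hypothetical output limits; field names are illustrative only.
#[derive(Debug, Clone, Copy)]
struct Limits {
    head_lines: usize,
    tail_lines: usize,
}

/// Keep the first `head_lines` and last `tail_lines` lines, replacing the
/// middle with a one-line marker that states how many lines were dropped.
fn truncate_middle(raw: &str, limits: Limits) -> String {
    let lines: Vec<&str> = raw.lines().collect();
    let budget = limits.head_lines + limits.tail_lines;
    if lines.len() <= budget {
        // Short enough already: return the lines unchanged.
        return lines.join("\n");
    }
    let omitted = lines.len() - budget;
    let mut out: Vec<String> = Vec::with_capacity(budget + 1);
    out.extend(lines[..limits.head_lines].iter().map(|s| s.to_string()));
    out.push(format!("... [{omitted} lines omitted] ..."));
    out.extend(lines[lines.len() - limits.tail_lines..].iter().map(|s| s.to_string()));
    out.join("\n")
}
```

Keeping both ends preserves context for outputs where the interesting part is at the top (headers, the command echoed back) or at the bottom (summaries, final errors), and the marker tells the LLM that content was removed rather than absent.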
Think about how to structure this: maybe a central command registry that maps command patterns to filter functions, where each filter knows how to intelligently compress its output type. The CLI itself should be invisible to the user: they just run `rtk <command>` and get the same UX but with optimized output.
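A central registry along these lines might look like the sketch below. It assumes command patterns are plain prefixes of the command line and that the longest matching prefix wins; `Registry`, `Filter`, and the two toy filters are hypothetical names introduced for illustration.

```rust
/// A filter takes raw command output and returns the condensed version.
type Filter = fn(&str) -> String;

/// Hypothetical registry mapping command prefixes to filter functions.
struct Registry {
    entries: Vec<(&'static str, Filter)>,
}

impl Registry {
    fn new() -> Self {
        Registry { entries: Vec::new() }
    }

    fn register(&mut self, prefix: &'static str, filter: Filter) {
        self.entries.push((prefix, filter));
    }

    /// Find the handler whose prefix matches the command line on a word
    /// boundary. The longest prefix wins, so "git status" beats "git".
    fn lookup(&self, argv: &[&str]) -> Option<Filter> {
        let cmdline = argv.join(" ");
        self.entries
            .iter()
            .filter(|(prefix, _)| {
                cmdline == *prefix || cmdline.starts_with(&format!("{prefix} "))
            })
            .max_by_key(|(prefix, _)| prefix.len())
            .map(|(_, filter)| *filter)
    }
}

/// Toy filter: remove blank lines.
fn drop_blank_lines(raw: &str) -> String {
    raw.lines()
        .filter(|l| !l.trim().is_empty())
        .collect::<Vec<_>>()
        .join("\n")
}

/// Toy filter: keep only the first line.
fn first_line_only(raw: &str) -> String {
    raw.lines().next().unwrap_or("").to_string()
}
```

Adding a new command handler is then a single `register` call, and a `lookup` that returns `None` is the natural signal to pass the output through untouched.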
Include proper error handling so failures in filtering don't break the command's functionality, and make sure the tool works cross-platform (Linux, macOS, maybe Windows). The goal is for developers using Claude Code, Cursor, or other LLM IDEs to drop this in and automatically reduce their token spending without changing their workflow.
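The "filtering must never break the command" requirement can be sketched as a wrapper that falls back to the raw output whenever a filter returns an error or panics. The names `filter_or_passthrough`, `strict_filter`, and `second_line` are hypothetical. `std::panic::catch_unwind` only catches unwinding panics, so this assumes the binary is not built with `panic = "abort"`.

```rust
use std::panic;

/// Run a fallible filter. If it returns Err or panics, return the raw
/// output unchanged so the user still sees what the command printed.
fn filter_or_passthrough(raw: &str, filter: fn(&str) -> Result<String, String>) -> String {
    // Assumption: panics unwind (the default); with panic = "abort" this
    // cannot recover and the process would terminate instead.
    match panic::catch_unwind(|| filter(raw)) {
        Ok(Ok(filtered)) => filtered,
        _ => raw.to_string(),
    }
}

/// Example filter that reports unexpected input through Err.
fn strict_filter(raw: &str) -> Result<String, String> {
    if raw.starts_with("OK") {
        Ok("ok".to_string())
    } else {
        Err("unrecognized output format".to_string())
    }
}

/// Example of a buggy filter: indexing panics on single-line input.
fn second_line(raw: &str) -> Result<String, String> {
    let lines: Vec<&str> = raw.lines().collect();
    Ok(lines[1].to_string())
}
```

A caught panic still prints its message to stderr through the default panic hook. Separately from this wrapper, the proxy would also want to forward the wrapped command's own exit code so that scripts and agents relying on it behave the same with and without rtk.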