Caveman: Cut 80% of AI Bloat — the Tool That Makes Your Coding Assistant Shut Up and Ship

John Doe
Agent , Dev Tools
06 Jun, 2026

You are staring at a terminal while Claude Code produces a 300-word explanation for a question that needed 30. You know exactly what this is costing you — every token is metered, and over half of them are “That’s an excellent question,” “Let me provide additional context,” and “I hope this helps.” You have learned to tune it out. The real question is: what are you actually paying for?

This is not anecdotal. A developer benchmarked it: explaining a React re-render bug took 1,180 tokens in normal mode and 159 tokens in Caveman mode. Same answer. That is an 87% cut. A GitHub project called Caveman has 44.8K stars for making exactly one decision — flipping the mode switch to minimal and rejecting every drop of social lubrication.

Caveman’s approach is brutal to the point of elegance. No model parameters are changed. No architecture is swapped. No fine-tuning. It does one thing: in the system prompt, it tells the AI, “You are not a civilized assistant. You are a caveman. No greetings. No setup. No wrap-up. Just the point.” It sounds like a joke, but the results are real. Research shows that when models are forced into concise output, accuracy does not degrade — it sometimes improves, because the cognitive load of maintaining a “polite persona” is eliminated.

It is not a blunt instrument. Caveman offers three compression levels. Lite strips filler words while preserving full sentences — suitable for formal documentation and external syncs. Full is the default, cutting articles, fragmenting expressions, full caveman style — ideal for daily development and rapid debugging. Ultra goes full telegraph mode with aggressive abbreviations and symbolic mappings for maximum token savings. What matters most: it knows the difference between noise and code. Technical terms, function names, file paths, and code blocks remain untouched. Only the connective tissue of natural language gets pulverized.

Picture a typical afternoon: three bugs, four back-and-forth turns each. Debugging an auth middleware expiration costs 704 tokens. Configuring a PostgreSQL connection pool somehow burns 2,347 tokens. Implementing a React error boundary component hits 3,454 tokens. In Caveman mode, those numbers become 121, 380, and 456. An hour of dense conversation that would clock 100,000 tokens now stays under 20,000. The money saved is the superficial win. The real savings are the waiting time that no longer breaks your flow, the scrolling you no longer do to find the actual answer.

Three scenarios illustrate where it shines. Scenario one: you hit an unfamiliar error and want the root cause and fix — nothing else. Caveman Full gives you a fragmented technical verdict that cuts to the chase in half a sentence. Scenario two: you are reviewing code for security vulnerabilities, and Caveman’s compression ratio drops to about 41%, well below the 80%+ of typical Q&A. Code review requires complete logical chains — the tool recognizes this and preserves what matters. Scenario three: you are drafting team-facing technical documentation that needs to be concise but grammatically complete. Lite mode keeps the syntax and structure intact while stripping only pure filler words and redundant statements.

Caveman has limits. It is not built for interactions that require conversational warmth — onboarding a junior engineer, simulating an interview, brainstorming. Over-compression in complex architectural discussions can drop essential reasoning steps. Ultra mode abbreviations may leave teammates reading your chat logs completely lost. Under the hood, Caveman is a system-prompt-level constraint, not a hard-coded truncator — some models comply better than others. At its core, it is outsourced prompt engineering, and its effectiveness depends on how obedient the underlying model is to its directives.

Installation is a one-liner. Claude Code users install via the plugin marketplace: claude plugin marketplace add JuliusBrussee/caveman then claude plugin install caveman@caveman. Cursor, Copilot, Windsurf, Cline, and others use the universal path: npx skills add JuliusBrussee/caveman. Once installed, type /caveman to enter caveman mode, and stop caveman to return to civilization. Switching compression levels is a single command: /caveman ultra.