
Token economy: how Timo cuts the real cost of working with AI

A technical guide for people who know a token is a currency.

The token is the billing unit of AI services, and providers publish precise rates: Anthropic charges 3 dollars per million input tokens and 15 dollars per million output tokens on Sonnet, while OpenAI charges 2.50 and 10 dollars respectively on GPT-4o.

A serious working session with a frontier model can cost between 1 and 3 dollars. A thousand sessions a month, across a team of ten people, add up to 12,000–36,000 dollars a year in tokens alone.
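The arithmetic behind these figures can be sketched in a few lines. The rates are the ones quoted above; the function names and the token counts of the sample session are illustrative assumptions, not measurements.

```python
# Prices in dollars per million tokens, as quoted above.
# Provider pricing changes over time, so treat these as a snapshot.
RATES = {
    "sonnet": {"input": 3.00, "output": 15.00},
    "gpt-4o": {"input": 2.50, "output": 10.00},
}


def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a call or session with the given token counts."""
    rate = RATES[model]
    total = input_tokens * rate["input"] + output_tokens * rate["output"]
    return total / 1_000_000


def yearly_cost(cost_per_session: float, sessions_per_month: int) -> float:
    """Project a per-session cost over twelve months."""
    return cost_per_session * sessions_per_month * 12


# Hypothetical session: 400,000 input tokens and 20,000 output tokens on Sonnet.
session = cost_usd("sonnet", 400_000, 20_000)
print(f"one session: ${session:.2f}")
print(f"one year at 1,000 sessions a month: ${yearly_cost(session, 1_000):,.0f}")
```

Note that input tokens dominate the bill in this sample session even though output tokens cost five times more each.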

The invisible problem: broadcasting the context

Language models have no persistent memory between API calls, so anything the model should remember has to be sent again. Today there are two main ways of doing that.

The first uses a large system prompt containing rules and historical decisions. This text is resent to the model at every single turn. If the system prompt weighs 8,000 tokens and the conversation lasts 50 turns, you pay for 400,000 input tokens dedicated to memory alone.
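As a quick check of that figure, here is a minimal sketch. It assumes every resend is billed at the full Sonnet input rate quoted earlier; the function name is made up for illustration.

```python
INPUT_RATE_USD_PER_MTOK = 3.00  # Sonnet input rate quoted earlier


def resent_prompt_tokens(system_prompt_tokens: int, turns: int) -> int:
    """Input tokens spent re-sending the same system prompt on every turn."""
    return system_prompt_tokens * turns


tokens = resent_prompt_tokens(system_prompt_tokens=8_000, turns=50)
cost = tokens * INPUT_RATE_USD_PER_MTOK / 1_000_000

print(f"{tokens:,} input tokens, ${cost:.2f} for memory alone")
```

The cost is linear in both the prompt size and the number of turns, so doubling either one doubles the bill.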

The second approach inserts relevant documents into the chat messages. The context grows over time: by turn 30, the model is re-reading for the thirtieth time everything you wrote in earlier turns. Paid every time.
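The growth of a resent conversation can be made concrete with a small model. This is a sketch under simple assumptions: every turn adds a user message and a reply of fixed size (the 200 and 300 token figures are hypothetical), and the full history is sent as input on each turn.

```python
def cumulative_input_tokens(turns: int, user_tokens: int, reply_tokens: int) -> int:
    """Total input tokens billed when the whole history is resent every turn."""
    total = 0
    history = 0
    for _ in range(turns):
        history += user_tokens   # the new user message joins the history
        total += history         # the whole history is read as input
        history += reply_tokens  # the reply joins the history for the next turn
    return total


turns = 30
billed = cumulative_input_tokens(turns, user_tokens=200, reply_tokens=300)
written = turns * (200 + 300)

print(f"tokens actually written: {written:,}")
print(f"input tokens billed:     {billed:,}")
```

Under these assumptions the billed input grows roughly with the square of the number of turns, while the text actually written grows only linearly.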

This is broadcasting the context: you fire everything at every turn, hoping what you need is in there. It is an economically unsustainable method at scale.

How Timo changes the equation

Timo provides a structured persistent memory, exposed to the model as a tool. Instead of sitting inside the context window, the memory lives in a separate space indexed for smart search. The model queries it only when needed and retrieves only what it needs.

The cost of a Timo query has three components: the output tokens of the tool call itself (50–150 tokens), the payload returned by the server, which the model reads as input, and the model's final output. A smart search with top_k=10 typically returns 2,000–5,000 tokens. These sound like big numbers, but they are discrete, controllable numbers, retrieved only once.
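The three components can be priced separately. The sketch below is not Timo's API: the class name is invented for illustration, the 300-token final answer is an assumption, and the rates are the Sonnet rates quoted earlier.

```python
from dataclasses import dataclass


@dataclass
class QueryCost:
    tool_call_tokens: int  # output tokens the model spends writing the tool call
    payload_tokens: int    # tokens returned by the server, read by the model as input
    answer_tokens: int     # output tokens of the final answer

    def usd(self, input_rate: float = 3.00, output_rate: float = 15.00) -> float:
        """Dollar cost of one query, with rates in dollars per million tokens."""
        input_cost = self.payload_tokens * input_rate
        output_cost = (self.tool_call_tokens + self.answer_tokens) * output_rate
        return (input_cost + output_cost) / 1_000_000


# Low and high ends of the ranges given above, with an assumed 300-token answer.
low = QueryCost(tool_call_tokens=50, payload_tokens=2_000, answer_tokens=300)
high = QueryCost(tool_call_tokens=150, payload_tokens=5_000, answer_tokens=300)

print(f"one query costs between ${low.usd():.4f} and ${high.usd():.4f}")
```

Because output tokens cost five times more than input tokens on Sonnet, a few hundred tokens of answer weigh about as much as a couple of thousand tokens of payload.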

The comparison, in numbers

Scenario: retrieve a decision about a specific client recorded three months ago in an 8KB note inside a knowledge base of 200 notes.

With Timo: a targeted query retrieves 3 relevant chunks (~1,500 input tokens) and the model writes a concise answer (~300 output tokens). Total: 1,800 tokens, costing 0.009 dollars on Sonnet.

Without Timo, knowledge base in the system prompt: 200 notes compressed to 50 tokens each = 10,000 tokens in the system prompt, paid at every turn. A 5-turn conversation: 50,000 cumulative input tokens and ~2,000 cumulative output tokens. Total: 52,000 tokens, costing 0.18 dollars on Sonnet.

Ratio: 20x in cost. Over 1,000 conversations a month you save about 170 dollars. Annually per user: about 2,000 dollars. For a team of 10 people: about 20,000 dollars a year.
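The comparison can be reproduced directly from the token counts in the scenario. This is only a worked version of the arithmetic above, using the Sonnet rates quoted earlier; the variable names are illustrative.

```python
INPUT_RATE = 3.00    # dollars per million input tokens (Sonnet)
OUTPUT_RATE = 15.00  # dollars per million output tokens (Sonnet)


def cost_usd(input_tokens: int, output_tokens: int) -> float:
    return (input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE) / 1_000_000


# With Timo: 3 chunks in, one concise answer out.
with_timo = cost_usd(input_tokens=1_500, output_tokens=300)

# Without Timo: 200 notes at 50 tokens each, resent on each of 5 turns.
system_prompt_tokens = 200 * 50
without_timo = cost_usd(input_tokens=system_prompt_tokens * 5, output_tokens=2_000)

ratio = without_timo / with_timo
monthly_saving = (without_timo - with_timo) * 1_000  # 1,000 conversations a month
yearly_saving = monthly_saving * 12

print(f"with Timo:    ${with_timo:.3f}")
print(f"without Timo: ${without_timo:.2f}")
print(f"ratio:        {ratio:.0f}x")
print(f"saving:       ${monthly_saving:,.0f} a month, ${yearly_saving:,.0f} a year per user")
```

The rounded figures in the text (170, 2,000 and 20,000 dollars) come from these values.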

Technical honesty: when Timo isn't worth it

In very short conversations on a single topic with heavy notes, going through Timo might consume more tokens than pasting the note directly. Timo's advantage shows up over duration and repetition.

If the model runs 3–4 hybrid_search calls with poorly formulated queries before finding the right chunk, each one costs its payload. The quality of the prompt that instructs the AI on how to use Timo is crucial.
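A rough sketch of how wasted searches add up. It counts each payload once, as the paragraph above does; the payload and tool call sizes are hypothetical midpoints of the ranges given earlier, and the rates are the Sonnet rates.

```python
INPUT_RATE = 3.00    # dollars per million input tokens (Sonnet)
OUTPUT_RATE = 15.00  # dollars per million output tokens (Sonnet)


def search_cost_usd(searches: int, payload_tokens: int = 3_500,
                    tool_call_tokens: int = 100) -> float:
    """Cost of running several searches, each paying for its call and its payload."""
    per_search = payload_tokens * INPUT_RATE + tool_call_tokens * OUTPUT_RATE
    return searches * per_search / 1_000_000


for n in (1, 2, 4):
    print(f"{n} search(es): ${search_cost_usd(n):.3f}")
```

Under these assumptions, four searches where one would do quadruple the retrieval cost of that question, which is why the instructions given to the model matter.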

Disorganized spaces, meaning long notes without sensible chunking, lead to retrieving information of little use. Chunking is where a careful setup makes the difference.

Timo is worth it when the knowledge base is large, when only a small fraction is needed for each question, and when sessions are recurring and draw on different portions of the same base.

The projection at scale

For individual users the saving is measured in hundreds of dollars a year.

For those running AI sessions for clients — agencies and consultants — the saving becomes structural. Each client chat pays only for the actual queries, without a fat initial system prompt.

For those building products on APIs, such as specialized chatbots and copilots, the saving scales linearly with the number of conversations served. The TCO of the AI component of a SaaS product is, today, a heavy line item.

The conclusion, in brief

Broadcasting the context is the amateur way to give an AI a memory. It works, it costs a lot, it scales badly.

Targeted retrieval is the professional approach. Timo is an implementation of this second pattern, mature enough to be used in production, simple enough to run on a Raspberry Pi in a drawer.

The token is your currency. Timo makes you spend less of it. Here's how.