# Tokens and context windows: why AI forgets

> Two pieces of jargon explain most of your chatbot frustrations. Here they are, without the maths.

Author: Lulu Kiritu

Published: 2026-06-26T08:00:00.000Z
Updated: 2026-06-26T08:00:00.000Z
Canonical: /explainers/tokens-context-windows-why-ai-forgets

## Why it matters

Understanding why an AI loses the thread halfway through a long chat helps you get better answers, and stops you blaming yourself when it forgets what you said.

## Story

A token is a chunk of text an AI reads and writes, and a context window is how much of that text it can hold in mind at once. Get those two ideas and a lot of confusing chatbot behaviour suddenly makes sense.

Start with tokens. An AI model does not read whole words the way we do. It breaks text into small pieces called tokens, which are often words or parts of words. A short message is a handful of tokens. A long document is thousands. Everything the model reads from you, and everything it writes back, is counted in tokens. That counting is also how pay-as-you-go AI tools usually bill: you pay per token in, and per token out.

Now the context window. This is the amount of text, measured in tokens, that the model can pay attention to at one time. Picture a desk. The context window is how much paper fits on it. While your conversation is small, everything sits on the desk and the model can see it all. As the chat grows, the desk fills up, and to make room, the oldest papers slide off the edge. That is the moment your chatbot forgets what you said at the start.
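For the curious, the desk can be sketched in a few lines of Python. This is a rough illustration only, not how any particular chatbot works: it assumes one word counts as one token, and the names `count_tokens` and `fit_to_window` are made up for this example. It keeps the newest messages that fit and lets the oldest slide off.

```python
def count_tokens(message: str) -> int:
    # Assumption for illustration: one whitespace-separated word = one token.
    # Real tokenisers split text differently.
    return len(message.split())


def fit_to_window(messages: list[str], window_tokens: int) -> list[str]:
    """Keep the most recent messages whose total size fits in the window."""
    kept: list[str] = []
    used = 0
    # Walk from the newest message back towards the oldest.
    for message in reversed(messages):
        cost = count_tokens(message)
        if used + cost > window_tokens:
            break  # the desk is full: this message and everything older falls off
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


chat = ["my name is Ada", "tell me about desks", "what is my name"]
print(fit_to_window(chat, window_tokens=8))
# ['tell me about desks', 'what is my name']
```

With room for only eight tokens, the first message, the one containing the name, is no longer on the desk when the final question arrives.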

In 2026 these desks have become enormous. Top models advertise context windows of a million tokens or more, enough to hold whole books. That is genuinely useful. But bigger is not automatically better. A huge window costs more to use, and stuffing it with everything can actually make a model less focused, the way a cluttered desk makes it harder to find the one page that matters.

## How to get better answers

Put the important bit close to your question. Summarise long threads. Start fresh for a new topic. Do not over-stuff the prompt with the entire folder unless the task truly needs all of it.

## Go deeper

Tokenisation is why AI sometimes miscounts letters or mishandles unusual words: it sees tokens, not characters. As a rough guide, one token is around three-quarters of an English word, so a thousand tokens is roughly 750 words. When a tool quotes a price per million tokens, that is the unit being counted, for both your input and the model's output, and the two are often priced at different rates. Knowing this makes pricing pages far less mysterious.
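If you want to try the sums yourself, here is a minimal Python sketch. It relies on the three-quarters rule of thumb above, which is only an approximation, and the prices in the example are invented placeholders, not any real provider's rates. The function names are likewise made up for illustration.

```python
WORDS_PER_TOKEN = 0.75  # rule of thumb for English text, not an exact figure


def estimate_tokens(text: str) -> int:
    """Rough token estimate from a word count."""
    words = len(text.split())
    return round(words / WORDS_PER_TOKEN)


def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Cost of one request when prices are quoted per million tokens."""
    total = (
        input_tokens * input_price_per_million
        + output_tokens * output_price_per_million
    )
    return total / 1_000_000


document = " ".join(["word"] * 750)  # a stand-in for a 750-word document
tokens_in = estimate_tokens(document)
print(tokens_in)  # 1000

# Hypothetical prices: 2.00 per million input tokens, 10.00 per million output tokens.
print(estimate_cost(tokens_in, 500, 2.00, 10.00))
```

Swap in the figures from a real pricing page and the same two steps apply: estimate the tokens, then scale by the price per million.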



