Token Counter

Estimate how many tokens your text uses for GPT-4o, Claude, Llama, and other LLMs. Updates in real time.

🔒 100% client-side. Everything runs in your browser; no data is sent to any server.

How are LLM tokens counted?

Large language models don't process text character by character or word by word. They work with tokens: subword chunks produced by a tokenizer, most commonly a variant of Byte-Pair Encoding (BPE). BPE starts from individual bytes or characters and repeatedly merges the most frequent adjacent pair in a training corpus into a new token, building a vocabulary that ranges from tens of thousands to a few hundred thousand subword units depending on the model.

For typical English prose, one token is roughly 4 characters, or about ¾ of a word. Code often needs more tokens per character, because operators, brackets, and short identifiers rarely merge into long tokens. Non-Latin scripts such as Chinese or Arabic also tend to need more tokens for the same amount of text (often only 1–3 characters per token), because their character sequences are underrepresented in the English-heavy data the merges were learned from.

The estimates on this page use the 4 chars/token heuristic, so treat them as approximations, especially for code and non-English text. For exact counts, use the tiktoken library (OpenAI models) or the Anthropic token-counting API endpoint (Claude models).
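To make the two ideas above concrete, here is a minimal Python sketch. It is not this page's actual implementation: the function names `estimate_tokens` and `learn_bpe_merges` are hypothetical, rounding up is an assumed choice, and the BPE loop is a toy that works on whole characters, whereas production tokenizers typically operate on bytes and apply pre-tokenization rules first.

```python
import math
from collections import Counter


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough estimate: character count / chars_per_token, rounded up (assumed)."""
    if not text:
        return 0
    return math.ceil(len(text) / chars_per_token)


def learn_bpe_merges(words: list[str], num_merges: int) -> list[tuple[str, str]]:
    """Toy BPE training: repeatedly fuse the most frequent adjacent symbol pair."""
    # Start with every word split into single-character symbols.
    corpus = Counter(tuple(word) for word in words)
    merges: list[tuple[str, str]] = []

    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by how often each word occurs.
        pairs: Counter = Counter()
        for symbols, freq in corpus.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += freq
        if not pairs:
            break  # every word is already a single symbol

        best = max(pairs, key=pairs.get)
        merges.append(best)

        # Rewrite each word with the chosen pair fused into one symbol.
        new_corpus: Counter = Counter()
        for symbols, freq in corpus.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_corpus[tuple(merged)] += freq
        corpus = new_corpus

    return merges


if __name__ == "__main__":
    print(estimate_tokens("Hello, world!"))           # 13 chars / 4, rounded up
    print(learn_bpe_merges(["aab", "aab", "aac"], 2))  # learned merge rules, in order
```

The heuristic needs only the text length, which is why it can run instantly in a browser, but it knows nothing about the actual vocabulary. The toy trainer shows why frequent sequences end up as single tokens while rare ones stay split into smaller pieces.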

Count tokens in Python with tiktoken

    import tiktoken

    # Look up the encoding used by the model
    enc = tiktoken.encoding_for_model("gpt-4o")

    text = "Your prompt text here..."
    tokens = enc.encode(text)

    print(f"Token count: {len(tokens)}")
    print(f"Tokens: {tokens[:10]}...")  # first 10 token IDs

    # Decode back to the original string
    decoded = enc.decode(tokens)

Count tokens for Claude (Anthropic API)

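    import anthropic

    # Reads the API key from the ANTHROPIC_API_KEY environment variable
    client = anthropic.Anthropic()

    # Count input tokens without generating a response
    # (this still sends the prompt to Anthropic's token-counting endpoint)
    response = client.messages.count_tokens(
        model="claude-opus-4-5",
        messages=[{"role": "user", "content": "Your prompt here"}],
    )
    print(f"Input tokens: {response.input_tokens}")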