LLM and Transformer online: training, attention and text generation on Intellect


Mini LLM: Transformer, training and generation

Enter a corpus, train the educational model, and see how tokens become next-token probabilities.

Training text

Attention heads

Transformer layers

Vector size

Training epochs

Learning rate

Temperature

Text start

Continuation generated by the LLM

How many tokens to add

Head for heatmap

The model is not trained yet. Click "Train model".

Line thickness in the attention block shows average attention between tokens in the current context.

The row is the current token, and the column is the token it attends to. A darker cell means a higher attention weight.

How many tokens to show

Show all tokens in the current context

In this educational visualization, each circle ("neuron") in a layer stands for a whole token vector, not for a single vector element. One circle is one token with a vector of the configured "Vector size"; the separate elements v1, v2, v3 are not drawn.

Generation result

The calculation shows an educational estimate for JavaScript Float64 numbers: 8 bytes per number. A real browser uses more because of objects, arrays, and runtime structures.
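
An estimate of this kind can be sketched in a few lines of Python. This page does not state which arrays the tool counts, so the breakdown below (one vector per vocabulary token plus a square "current token -> next token" logits matrix) and the name `estimate_bytes` are assumptions made for illustration only.

```python
BYTES_PER_FLOAT64 = 8  # a JavaScript Number is an IEEE 754 double-precision float


def estimate_bytes(vocab_size: int, vector_size: int) -> int:
    """Lower-bound estimate: raw Float64 storage only, no runtime overhead."""
    # Assumed layout: one embedding vector per vocabulary token.
    embedding_numbers = vocab_size * vector_size
    # Assumed layout: a vocab_size x vocab_size matrix of transition logits.
    logits_numbers = vocab_size * vocab_size
    return (embedding_numbers + logits_numbers) * BYTES_PER_FLOAT64


print(estimate_bytes(vocab_size=100, vector_size=16))
```

The square logits matrix grows with the square of the vocabulary size, so under these assumptions it dominates the estimate for anything but a tiny corpus.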

This tab shows the trainable weights of the educational model: the logits matrix for the "current token -> next token" transition. Attention on the "Layers" tab is calculated during the text pass, while these weights are the ones changed during training.

An interactive educational simulator showing the full LLM path: text becomes tokens, tokens become numeric vectors, positions are added, and Transformer layers with multi-head attention produce next-token probabilities. You can train a mini model on your own text and test generation immediately.
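
The first steps of that path (text to tokens, tokens to vectors, positions added) might look roughly like the Python sketch below. The tool's actual tokenizer and positional features are not described here; lowercase whitespace splitting, random initial vectors, and sinusoidal position features are all assumptions, and every function name is hypothetical.

```python
import math
import random


def tokenize(text):
    # Assumption: lowercase whitespace splitting; the tool's tokenizer may differ.
    return text.lower().split()


def build_vocab(tokens):
    # Map each distinct token to an integer id, in order of first appearance.
    vocab = {}
    for tok in tokens:
        vocab.setdefault(tok, len(vocab))
    return vocab


def positional_features(position, size):
    # Assumption: sinusoidal features, sine on even indices and cosine on odd ones.
    return [
        math.sin(position / 10000 ** (i / size)) if i % 2 == 0
        else math.cos(position / 10000 ** ((i - 1) / size))
        for i in range(size)
    ]


def embed(tokens, vocab, table):
    # Add the position vector to the token vector, element by element.
    out = []
    for pos, tok in enumerate(tokens):
        vec = table[vocab[tok]]
        pe = positional_features(pos, len(vec))
        out.append([v + p for v, p in zip(vec, pe)])
    return out


random.seed(0)
tokens = tokenize("the cat sat on the mat")
vocab = build_vocab(tokens)
vector_size = 4  # plays the role of the "Vector size" setting
table = [[random.uniform(-0.1, 0.1) for _ in range(vector_size)] for _ in vocab]
inputs = embed(tokens, vocab, table)
```

Both occurrences of "the" share one row of `table`, yet their entries in `inputs` differ because the position vectors differ. That is how word order reaches the later layers.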

This tool is a visual educational model of a large language model. It does not call external neural networks and does not send text to the server: training and generation run in the browser. The service shows tokenization, vocabulary, embeddings, positional features, multi-head attention, feed-forward processing, next-token probabilities, and step-by-step generation.
Real LLMs have billions of parameters trained on huge corpora. This simulator uses a small educational model: it builds a vocabulary from the training corpus, creates numeric token vectors, and trains a next-token transition matrix with softmax gradient steps. It is simplified, but preserves the core idea: the model receives previous tokens and computes probabilities for the next token.
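
A next-token transition matrix trained with softmax gradient steps can be sketched as follows. This is a minimal illustration, not the tool's code: the zero initialisation, the plain per-pair update, the default hyperparameters, and the names `softmax` and `train_bigram` are assumptions.

```python
import math


def softmax(logits):
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_bigram(ids, vocab_size, epochs=50, learning_rate=0.5):
    # logits[a][b] is the score for token b following token a.
    logits = [[0.0] * vocab_size for _ in range(vocab_size)]
    for _ in range(epochs):
        for cur, nxt in zip(ids, ids[1:]):
            probs = softmax(logits[cur])
            for j in range(vocab_size):
                # Cross-entropy gradient for logit j: predicted minus target.
                target = 1.0 if j == nxt else 0.0
                logits[cur][j] -= learning_rate * (probs[j] - target)
    return logits


tokens = "the cat sat on the cat".split()
vocab = {tok: i for i, tok in enumerate(dict.fromkeys(tokens))}
ids = [vocab[t] for t in tokens]
logits = train_bigram(ids, len(vocab))
probs_after_the = softmax(logits[vocab["the"]])
```

In this toy corpus "the" is always followed by "cat", so after training the row for "the" puts most of its probability on "cat". Each update moves the logits in the direction that reduces the prediction error, which is the sense of error-based training.
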
The attention visualization shows which tokens influence the current token most. Multiple attention heads can highlight different relationships: nearby words, similar numeric vectors, repeated tokens, and context structure. Layer cards show how data moves through a Transformer: attention mixes information between tokens, then the feed-forward block transforms each token separately.
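
The weights behind such a heatmap can be illustrated with a single attention head in plain Python. This sketch makes two simplifying assumptions that the page does not confirm for the tool: queries, keys, and values are the token vectors themselves (no learned projections), and each token attends only to itself and earlier tokens.

```python
import math


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def causal_self_attention(vectors):
    """Return (weights, mixed) for one attention head.

    Simplification: queries, keys and values are the token vectors
    themselves; real Transformers apply learned projections first.
    """
    size = len(vectors[0])
    weights, mixed = [], []
    for i, query in enumerate(vectors):
        # Row i may only attend to columns 0..i (itself and earlier tokens).
        scores = [
            sum(q * k for q, k in zip(query, vectors[j])) / math.sqrt(size)
            for j in range(i + 1)
        ]
        row = softmax(scores) + [0.0] * (len(vectors) - i - 1)
        weights.append(row)
        # Each output vector is a weighted average of the attended vectors.
        mixed.append([
            sum(row[j] * vectors[j][d] for j in range(len(vectors)))
            for d in range(size)
        ])
    return weights, mixed


vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
weights, mixed = causal_self_attention(vectors)
```

Here `weights[i][j]` corresponds to a heatmap cell with row `i` (the current token) and column `j` (the token it attends to). The third token has the same vector as the first, so its row gives the first and third columns equal weight and the second column less.
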
Settings let you change the number of layers and attention heads, vector size, learning rate, number of epochs, generation temperature, and output length. A low temperature favors the most likely continuations; a high temperature explores rarer variants. Varying it shows that an LLM generates text by repeatedly choosing the next token from a probability distribution.
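
The effect of temperature can be sketched by dividing the logits by the temperature before the softmax and then sampling. The function names and the example logits below are hypothetical, and the tool may implement sampling differently.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    # Dividing by the temperature sharpens the distribution when it is
    # below 1 and flattens it when it is above 1.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_next(logits, temperature, rng):
    # Draw one token id according to the temperature-adjusted probabilities.
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


def generate(start_id, logits_matrix, steps, temperature, rng):
    # Repeated next-token choice: each step conditions on the last token only.
    ids = [start_id]
    for _ in range(steps):
        ids.append(sample_next(logits_matrix[ids[-1]], temperature, rng))
    return ids


logits = [2.0, 1.0, 0.0]
cold = softmax_with_temperature(logits, 0.5)  # low temperature
hot = softmax_with_temperature(logits, 2.0)   # high temperature
```

With these logits, `cold` concentrates more probability on the first token than `hot` does, while `hot` gives the least likely token a larger share. `generate` repeats the same choice once per added token.
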
The service is useful for students and beginning developers: it explains why word order matters, why vectors are needed, how attention connects tokens to each other, what error-based training means, and why text generation is a repeated next-token choice.
