How to Build a Small Language Model from Scratch

DevBlog

Jun 1, 2026 · 2 min read · 27 views

Here's exactly how it works.
No APIs. No pre-trained weights. Just PyTorch and math.

Step 1: Feed it text

I grabbed 50,000 children's stories from TinyStories, a dataset hosted on Hugging Face. Then I used a tokenizer to convert the text into tokens, each represented by an integer ID.
"Once upon a time" → [7406, 3504, 257, 640]
Computers don't read words. They read numbers.
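
To make the text-to-numbers step concrete, here is a minimal sketch of a word-level tokenizer in plain Python. It is a toy stand-in, not the tokenizer used for this project (the post doesn't say which one that was). Real tokenizers usually split text into subword pieces rather than whole words. The names `build_vocab`, `encode`, and `decode` are hypothetical.

```python
def build_vocab(texts):
    """Assign an integer ID to every distinct lowercase word (toy word-level vocab)."""
    vocab = {"<unk>": 0}  # ID 0 is reserved for words never seen while building the vocab
    for text in texts:
        for word in text.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab


def encode(text, vocab):
    """Turn a string into a list of token IDs."""
    return [vocab.get(word, vocab["<unk>"]) for word in text.lower().split()]


def decode(ids, vocab):
    """Turn a list of token IDs back into a string."""
    id_to_word = {i: w for w, i in vocab.items()}
    return " ".join(id_to_word[i] for i in ids)


stories = [
    "Once upon a time there was a little boy",
    "The boy had a red ball",
]
vocab = build_vocab(stories)
ids = encode("Once upon a time", vocab)
print(ids)                 # [1, 2, 3, 4]
print(decode(ids, vocab))  # once upon a time
```

The IDs here are small because the toy vocabulary only has a dozen entries. A production tokenizer has tens of thousands, which is why real IDs run into the thousands.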

Step 2: Build the brain

I coded a GPT-style Transformer from scratch with help from Claude. It has 3 key ingredients:

→ Embeddings: convert each token ID into a rich vector that captures meaning

→ Self-Attention: lets each token look at the tokens that came before it to understand context. This is how the model can tell whether "bank" means a riverbank or a place for money.

→ 6 stacked Transformer Blocks: each one learns deeper patterns, from grammar all the way to storytelling
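
The self-attention ingredient can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not the project's code. It uses a single head, skips the learned query/key/value projections (the same vectors are passed in for all three), and applies the causal mask that GPT-style models use so a position only attends to itself and earlier positions. The function name `causal_self_attention` is hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def causal_self_attention(queries, keys, values):
    """Scaled dot-product attention where position t sees only positions 0..t."""
    d = len(queries[0])
    outputs, all_weights = [], []
    for t, q in enumerate(queries):
        # Similarity of the current query with every visible key, scaled by sqrt(d).
        scores = [
            sum(qi * ki for qi, ki in zip(q, keys[s])) / math.sqrt(d)
            for s in range(t + 1)
        ]
        weights = softmax(scores)
        # Output is the attention-weighted average of the visible value vectors.
        out = [
            sum(w * values[s][j] for s, w in enumerate(weights))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Three toy "token vectors" of width 2 (assumed values, for illustration only).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out, weights = causal_self_attention(x, x, x)
for t, w in enumerate(weights):
    print(t, [round(v, 3) for v in w])
```

Each row of weights sums to 1, and the first token can only attend to itself. In a real Transformer block, learned linear layers produce the queries, keys, and values, and several heads run in parallel.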
30 million parameters total. All built by hand.
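
For a sense of where a figure like 30 million comes from, here is a rough parameter count for a GPT-style model. Only the 6 blocks come from this post. The vocabulary size (50,257), context length (256), and embedding width (384) are assumptions chosen for illustration, as is the choice to share the output layer's weights with the token embedding. The actual configuration may differ.

```python
def gpt_param_count(vocab_size, context_len, d_model, n_layers):
    """Approximate parameter count for a GPT-style model with a tied output head."""
    tok_emb = vocab_size * d_model           # token embedding table
    pos_emb = context_len * d_model          # learned position embeddings
    # Attention: Q, K, V and output projections, each d x d, plus biases.
    attn = 4 * d_model * d_model + 4 * d_model
    # MLP: d -> 4d and 4d -> d linear layers, plus biases.
    mlp = 8 * d_model * d_model + 5 * d_model
    norms = 2 * 2 * d_model                  # two LayerNorms per block (scale + shift)
    per_block = attn + mlp + norms
    final_norm = 2 * d_model
    return tok_emb + pos_emb + n_layers * per_block + final_norm


# Assumed hyperparameters; only n_layers=6 is taken from the post.
total = gpt_param_count(vocab_size=50257, context_len=256, d_model=384, n_layers=6)
print(f"{total:,}")  # 30,044,544
```

Under these assumptions, about two thirds of the parameters sit in the token embedding table, and the six Transformer blocks account for most of the rest.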

Step 3: Teach it to predict
The training loop is brutally simple:
→ Show the model: "Once upon a"
→ The model guesses the next word and gets it wrong.
→ Calculate how wrong it was (the loss).
→ Adjust every single weight slightly in the right direction.
→ Repeat 5,000 times.
That's it. Grammar, story structure, common sense: all of it emerges from this one loop.
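
The loop above can be sketched end to end in plain Python. This is a toy stand-in, not the project's training code. Instead of a Transformer it trains a tiny bigram model (one row of scores per previous token), with the gradients written out by hand. In PyTorch, `loss.backward()` and an optimizer's `step()` would do that part. The function name `train_bigram`, the step count, and the learning rate are assumptions for illustration.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_bigram(token_ids, vocab_size, steps=200, lr=0.5, seed=0):
    """Train next-token scores with gradient descent on cross-entropy loss."""
    rng = random.Random(seed)
    # weights[i][j] is the score for "token j follows token i".
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(vocab_size)]
               for _ in range(vocab_size)]
    pairs = list(zip(token_ids[:-1], token_ids[1:]))  # (previous token, next token)
    losses = []
    for _ in range(steps):
        grads = [[0.0] * vocab_size for _ in range(vocab_size)]
        loss = 0.0
        for prev, target in pairs:
            probs = softmax(weights[prev])        # the model's guess
            loss += -math.log(probs[target])      # how wrong the guess was
            for j in range(vocab_size):
                # Gradient of cross-entropy with respect to each score.
                grads[prev][j] += probs[j] - (1.0 if j == target else 0.0)
        for i in range(vocab_size):
            for j in range(vocab_size):
                # Nudge every weight slightly in the direction that lowers the loss.
                weights[i][j] -= lr * grads[i][j] / len(pairs)
        losses.append(loss / len(pairs))
    return weights, losses


tokens = [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3]  # a repeating toy "story"
weights, losses = train_bigram(tokens, vocab_size=4)
print(f"first loss {losses[0]:.3f}, last loss {losses[-1]:.3f}")
```

The loss starts near ln(4), which is what random guessing over four tokens gives, and falls as the model learns that 1 follows 0, 2 follows 1, and so on.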

Step 4: Run it on a GPU
I connected the code to Modal.com and trained on an NVIDIA A10G cloud GPU.
30 minutes. Less than $1.

The result?
"๐™Š๐™ฃ๐™˜๐™š ๐™ช๐™ฅ๐™ค๐™ฃ ๐™– ๐™ฉ๐™ž๐™ข๐™š, ๐™ฉ๐™๐™š๐™ง๐™š ๐™ฌ๐™–๐™จ ๐™– ๐™ก๐™ž๐™ฉ๐™ฉ๐™ก๐™š ๐™—๐™ค๐™ฎ ๐™ฃ๐™–๐™ข๐™š๐™™ ๐™๐™ž๐™ข๐™ข๐™ฎ. ๐™ƒ๐™š ๐™›๐™š๐™ก๐™ก ๐™™๐™ค๐™ฌ๐™ฃ ๐™–๐™ฃ๐™™ ๐™๐™ช๐™ง๐™ฉ ๐™๐™ž๐™จ ๐™ ๐™ฃ๐™š๐™š. ๐™ƒ๐™ž๐™จ ๐™ข๐™ค๐™ข ๐™œ๐™–๐™ซ๐™š ๐™๐™ž๐™ข ๐™– ๐™—๐™–๐™ฃ๐™™-๐™–๐™ž๐™™. ๐™๐™ž๐™ข๐™ข๐™ฎ ๐™›๐™š๐™ก๐™ฉ ๐™—๐™š๐™ฉ๐™ฉ๐™š๐™ง."
For the codebase, email amaanprogramming@gmail.com.
A model I built myself is writing coherent stories.
If you want to truly understand AI, don't just use the tools. Build them.