Demystifying AI Parameters
Before AI, we had simple math. Imagine you want to estimate apartment rent. You have a formula: Rent = (SqFt * A) + (Floor * B).
But what if you don't know A or B? You only have historical data (the dots below). Your job is to find the "Parameters" (A and B) that make the line fit the dots.
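Here is that job as code, a minimal sketch. The historical data is invented for illustration; a least-squares solve finds the A and B that best fit the dots.

```python
import numpy as np

# Invented historical data: (square feet, floor) -> observed rent.
# These are the "dots" the line has to fit.
sqft  = np.array([450.0, 600.0, 750.0, 900.0, 1100.0])
floor = np.array([1.0, 3.0, 2.0, 5.0, 4.0])
rent  = np.array([1400.0, 1950.0, 2200.0, 2900.0, 3300.0])

# Rent = (SqFt * A) + (Floor * B): solve for the best-fitting A and B.
X = np.column_stack([sqft, floor])
(A, B), *_ = np.linalg.lstsq(X, rent, rcond=None)

print(f"A = {A:.2f} $/sqft, B = {B:.2f} $/floor")
print(f"Predicted rent for 800 sqft on floor 3: {800*A + 3*B:.0f}")
```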
In AI, a single "Neuron" is just a fancy Rent Calculator. But instead of 2 inputs, it might have hundreds. Think of it like a sound mixing board.
The Weights are the volume faders. The Bias is the master gain.
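As code, the whole mixing board collapses into one line of math. A minimal sketch, with fader positions invented for illustration (and yes, faders can go negative):

```python
import numpy as np

inputs  = np.array([0.8, 0.1, 0.5, 0.9])   # the signal on each channel
weights = np.array([1.2, -0.4, 0.0, 2.0])  # fader positions (negatives allowed)
bias    = 0.5                              # the master gain

# Same shape as the rent formula, just with more inputs:
output = np.dot(inputs, weights) + bias
print(output)  # ~3.22
```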
Training is just moving these faders until the output matches what we want. Inference is locking the faders and playing the music.
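A toy version of that, with invented data: a training loop nudges the faders downhill on the error, then we lock them and just play.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))                 # 100 examples, 4 channels each
target_w, target_b = np.array([1.2, -0.4, 0.0, 2.0]), 0.5
y = X @ target_w + target_b                   # the output we want to match

w, b = np.zeros(4), 0.0                       # faders start flat
lr = 0.1
for _ in range(200):                          # TRAINING: move the faders
    err = (X @ w + b) - y
    w -= lr * (X.T @ err) / len(y)            # slide each fader downhill
    b -= lr * err.mean()                      # ...and the master gain too

# INFERENCE: the faders are now locked; we only run the forward pass.
print(np.round(w, 2), round(b, 2))            # recovers ~[1.2 -0.4 0. 2.] 0.5
```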
Modern AI isn't one mixer. It's a mixer... mixing the outputs of other mixers. This layering, plus a small non-linear squash between the layers, lets it learn complex patterns, not just straight lines.
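A sketch of the layering, with invented weights: three small mixers feed one top mixer, and the non-linear squash (a ReLU here) between the layers is what lets the output bend instead of staying a straight line.

```python
import numpy as np

x = np.array([0.8, 0.1, 0.5, 0.9])         # 4 input channels

W1 = np.array([[ 1.0, -0.5,  0.3,  0.0],   # mixer 1's faders
               [ 0.2,  0.8, -1.0,  0.4],   # mixer 2's faders
               [-0.3,  0.1,  0.6,  0.9]])  # mixer 3's faders
b1 = np.array([0.1, -0.2, 0.0])

W2 = np.array([0.7, -1.1, 0.5])            # the top mixer's faders
b2 = 0.3

hidden = np.maximum(0, W1 @ x + b1)        # ReLU: clip negatives to zero
output = W2 @ hidden + b2                  # the top mixer mixes the mixers
print(output)
```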
When you click "Run Training", imagine 10,000 sound engineers furiously adjusting their sliders at once to minimize the error.
ChatGPT doesn't have a brain. It has a Next-Token Prediction engine. It looks at the context and assigns a probability to every possible next token (roughly, the next word or word fragment).
Every time a token is generated, it flies back into the input, and the process repeats. It's an auto-complete that reads its own writing.
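Here is the loop in miniature. The probability table is invented; in a real model, those probabilities come out of the trillions of parameters discussed next.

```python
import random

# Invented probability table; a real model computes these on the fly.
next_token_probs = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "dog": {"sat": 0.2, "ran": 0.8},
    "sat": {"down": 1.0},
    "ran": {"away": 1.0},
}

tokens = ["the"]
while tokens[-1] in next_token_probs:
    probs = next_token_probs[tokens[-1]]              # look at the context
    words, weights = zip(*probs.items())
    tokens.append(random.choices(words, weights=weights)[0])  # pick a token
    # ...and the new token flies back into the input for the next round.

print(" ".join(tokens))  # e.g. "the cat sat down"
```

(The toy only looks at the last token; a real model conditions on the entire context.)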
We looked at a mixer with 4 sliders. Then a network with 50. GPT-4 is reported to have roughly 1.7 trillion parameters (a leaked estimate OpenAI has never confirmed).
Scroll the box below. 1 pixel height = 1 Million Parameters.
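A quick back-of-the-envelope on that scale, assuming the reported 1.7 trillion figure and a roughly 1,000-pixel-tall screen:

```python
params_total  = 1.7e12   # reported GPT-4 parameter count (unconfirmed)
params_per_px = 1e6      # the scroll box's scale: 1 pixel = 1M parameters
px_per_screen = 1_000    # assume a roughly 1000-pixel-tall screen

pixels  = params_total / params_per_px
screens = pixels / px_per_screen
print(f"{pixels:,.0f} px, about {screens:,.0f} full screens of scrolling")
# 1,700,000 px, about 1,700 full screens
```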
Emergent Intelligence is a conjuring trick born from the sheer magnitude of these sliders.