Imagine learning to throw darts. Every time you miss the bullseye, you get a score: 12 centimeters off to the left, 3 centimeters too high. That score tells you exactly how wrong you were, and you use it to adjust your next throw. Without that feedback number, you'd just be throwing blind.
Neural networks learn in a surprisingly similar way. They make predictions, measure how wrong those predictions are, and use that measurement to improve. The mechanism that produces that "how wrong" measurement has a name: the loss function. It's not glamorous, it's not the part anyone writes headlines about — but it is the mathematical heartbeat of every AI system that learns.
What Is a Loss Function?
When a neural network is being trained, it's shown an input — say, a photograph of a cat — and it produces an output: maybe "70% cat, 30% dog." Somewhere, the correct answer is stored: this is, in fact, a cat. The loss function compares those two things — what the model said versus what was true — and collapses that comparison into a single number. If the model said "70% cat," the loss might be 0.36. If it had said "1% cat," the loss would be much higher. The higher the loss, the more the model needs to learn.
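To make the "single number" concrete, here is a minimal sketch. It assumes the loss is cross-entropy with the natural logarithm, which is what yields roughly 0.36 for "70% cat". The function name `cross_entropy` is chosen for illustration only.

```python
import math

def cross_entropy(prob_of_correct_class: float) -> float:
    """Loss for one example: negative natural log of the probability
    the model assigned to the true class."""
    return -math.log(prob_of_correct_class)

loss_mostly_right = cross_entropy(0.70)  # model said "70% cat", truth is cat
loss_very_wrong = cross_entropy(0.01)    # model said "1% cat", truth is cat

print(round(loss_mostly_right, 2))  # 0.36
print(round(loss_very_wrong, 2))    # 4.61
```

The closer the probability of the correct answer gets to 1, the closer the loss gets to 0.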
The genius of reducing everything to one number is that it gives the training process something to optimize. You can't easily tell a neural network "be more accurate" — that's too vague. But you can tell it: make this number smaller. That's a concrete, mathematical goal.
How the Network Uses That Score to Improve
A neural network is built from millions (or billions) of individual numerical settings called weights. Think of weights as dials. Each dial controls a tiny aspect of how the network processes information. When the network is first created, those dials are set randomly, which is why untrained models produce nonsense. Training is the process of adjusting all those dials until the network reliably produces good answers.
But how does the network know which dials to turn, and in which direction? This is where two interconnected algorithms come in.
Backpropagation: Tracing the Blame
After the loss function produces its score, the network needs to figure out which weights were responsible for the error. That job falls to an algorithm called backpropagation.
"Backpropagation" is short for "backward propagation of errors." Here's the intuition: information flows forward through the network (input → hidden layers → output → prediction), and then after the loss is calculated, blame flows backward through the same path. Using calculus — specifically a technique called the chain rule — the algorithm works out precisely how much adjusting any single weight would raise or lower the loss. Every weight in the network gets its own responsibility score.
This is remarkable when you think about it. A modern neural network might have hundreds of billions of weights. Backpropagation efficiently assigns a gradient — a directional nudge — to every single one of them in a single backward pass.
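Here is a rough sketch of the chain rule at work on a deliberately tiny network: one input, two weights, no activation functions, and a squared-error loss. All of these are simplifying assumptions made for illustration, and the names `forward` and `backward` are hypothetical. Real networks do the same bookkeeping across many more weights.

```python
def forward(w1, w2, x, target):
    """Forward pass: input -> hidden value -> output -> loss."""
    hidden = w1 * x
    output = w2 * hidden
    loss = (output - target) ** 2
    return hidden, output, loss

def backward(w1, w2, x, target):
    """Backward pass: apply the chain rule from the loss back to each weight."""
    hidden, output, _ = forward(w1, w2, x, target)
    dloss_doutput = 2 * (output - target)   # how the loss changes with the output
    dloss_dw2 = dloss_doutput * hidden      # output = w2 * hidden
    dloss_dhidden = dloss_doutput * w2      # blame passed back to the hidden value
    dloss_dw1 = dloss_dhidden * x           # hidden = w1 * x
    return dloss_dw1, dloss_dw2

w1, w2, x, target = 0.5, -1.5, 2.0, 1.0
grad_w1, grad_w2 = backward(w1, w2, x, target)

# Sanity check: nudge w1 slightly and measure how the loss actually changes.
eps = 1e-6
numeric_grad_w1 = (forward(w1 + eps, w2, x, target)[2]
                   - forward(w1 - eps, w2, x, target)[2]) / (2 * eps)

print(grad_w1, grad_w2)  # 15.0 -5.0
```

The numerical check at the end measures the same slope by brute force. Backpropagation gets the same answer for every weight without having to nudge each one separately.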
Gradient Descent: Acting on the Feedback
Now that each weight has its nudge, something has to act on it. That something is gradient descent.
The word "gradient" here means slope, like the slope of a hill. Imagine the loss function as a hilly landscape, and the network's current settings as a ball sitting somewhere on that landscape. The goal is to roll the ball downhill toward the lowest valley it can reach (low loss). Gradient descent does exactly this: it looks at the slope at the current position and takes a small step in the downhill direction.
"Gradient" is the slope; "descent" means going down. Put them together and you have an algorithm that repeatedly asks: given where I am now, which small adjustment to my weights will reduce the loss the most? Then it makes that adjustment, and asks again, and again, millions of times across millions of training examples.
The size of each step is called the learning rate. Too large and the ball overshoots the valley. Too small and training takes forever. Getting this right is one of the core practical challenges in training AI systems.
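A small sketch shows how much the learning rate matters. It assumes a toy one-dial "landscape", loss = (w - 3)^2, whose lowest point sits at w = 3. The landscape, the function names, and the three learning rates are all chosen for illustration.

```python
def gradient(w):
    """Slope of the toy loss (w - 3)**2 at position w."""
    return 2 * (w - 3)

def descend(learning_rate, steps=50, start=0.0):
    """Take `steps` downhill steps from `start` and return the final position."""
    w = start
    for _ in range(steps):
        w = w - learning_rate * gradient(w)  # step against the slope
    return w

w_reasonable = descend(0.1)    # settles very close to 3
w_too_large = descend(1.1)     # overshoots further on every step and diverges
w_too_small = descend(0.001)   # has barely moved after 50 steps

print(w_reasonable, w_too_large, w_too_small)
```

With the same number of steps, only the middle-of-the-road learning rate ends up near the bottom of the valley.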
The Full Learning Loop
Put it all together and the training loop looks like this:
- Forward pass: The network sees an input and produces a prediction.
- Loss calculation: The loss function compares that prediction to the correct answer and produces a single score.
- Backpropagation: The algorithm traces back through the network, computing each weight's contribution to the error.
- Gradient descent: The weights are all nudged slightly in the direction that reduces the loss.
- Repeat: The next training example comes in, and the loop begins again.
This cycle runs billions or trillions of times during the training of a large model. Each individual step is tiny. The cumulative effect of all those tiny corrections is a network that has genuinely learned patterns in data.
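The five steps above can be sketched in a few lines. This toy version assumes a "network" with just two dials (a weight `w` and a bias `b`), a squared-error loss, and made-up data that follows y = 2x + 1. The dials start at zero rather than at random values so the run is repeatable.

```python
# Made-up training data following y = 2x + 1 (an assumption for illustration).
inputs = [0.0, 1.0, 2.0, 3.0, 4.0]
targets = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = 0.0, 0.0          # the two dials, before any training
learning_rate = 0.02

for epoch in range(1000):                       # Repeat
    for x, y_true in zip(inputs, targets):
        prediction = w * x + b                  # 1. Forward pass
        loss = (prediction - y_true) ** 2       # 2. Loss calculation
        dloss_dpred = 2 * (prediction - y_true) # 3. Backpropagation (chain rule)
        grad_w = dloss_dpred * x
        grad_b = dloss_dpred
        w -= learning_rate * grad_w             # 4. Gradient descent
        b -= learning_rate * grad_b

print(f"w={w:.3f}, b={b:.3f}")
```

No single step does much, but after enough passes the two dials end up very close to the values that generated the data.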
Different Tasks Need Different Loss Functions
Not all problems are the same, so not all loss functions are the same. The right choice of loss function depends on what you're trying to predict.
For predicting a continuous number — like a house price — a common choice is mean squared error, which penalizes large mistakes much more heavily than small ones (by squaring the difference). For classifying inputs into categories — spam or not spam, cat or dog — other loss functions are better suited because they work with probabilities rather than raw numbers.
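As a quick illustration of how squaring punishes big misses, here is a minimal mean squared error sketch. The house prices (in thousands) are invented for the example.

```python
def mean_squared_error(predictions, actuals):
    """Average of the squared differences between predictions and true values."""
    squared_errors = [(p - a) ** 2 for p, a in zip(predictions, actuals)]
    return sum(squared_errors) / len(squared_errors)

predicted_prices = [310.0, 450.0, 200.0]  # model's guesses, in thousands
actual_prices = [300.0, 400.0, 200.0]     # true sale prices, in thousands

mse = mean_squared_error(predicted_prices, actual_prices)

# The miss of 50 contributes 2500 to the total; the miss of 10 contributes only 100.
print(mse)
```

A mistake that is 5 times larger costs 25 times more, which pushes the model to fix its worst errors first.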
Cross-Entropy Loss and Language Models
For language models, the family of AI systems that includes ChatGPT and similar tools, one specific loss function dominates: cross-entropy loss, applied to a task called next-token prediction.
Here's what "next-token prediction" means: the model is shown a sequence of words (or word-pieces, called tokens) and must predict what word comes next. "The cat sat on the ___" — what's the most likely next word? Cross-entropy loss measures how well the model's probability distribution over all possible next words matches reality. If the correct next word was "mat" and the model assigned it a high probability, the loss is low. If the model was confidently wrong — assigning "mat" near-zero probability — the loss is high.
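Here is a rough sketch of that calculation, assuming a made-up four-word vocabulary and invented raw scores (often called logits). The softmax step, which turns raw scores into probabilities, is the standard way to do this, but the specific numbers and function names are illustrative only.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def next_token_loss(logits, correct_index):
    """Cross-entropy: negative log of the probability given to the true next token."""
    probs = softmax(logits)
    return -math.log(probs[correct_index])

vocab = ["mat", "sofa", "moon", "dog"]   # hypothetical tiny vocabulary
correct = vocab.index("mat")             # the true next word

loss_good = next_token_loss([3.0, 1.0, 0.0, -1.0], correct)  # "mat" scored highest
loss_bad = next_token_loss([-4.0, 3.0, 1.0, 0.0], correct)   # confidently wrong

print(loss_good, loss_bad)
```

A real language model does the same thing over a vocabulary of tens of thousands of tokens, at every position in the text.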
When Human Preferences Become the Loss Signal
Standard loss functions compare predictions to ground truth labels — predetermined correct answers. But for many real-world AI tasks, there isn't a clean correct answer. "Write me a helpful summary of this article" doesn't have one objectively correct output. How do you measure loss in that case?
This is the problem that Reinforcement Learning from Human Feedback (RLHF) was designed to solve.
Here's how it works in broad strokes: human raters compare pairs of AI outputs and indicate which one is better. A separate model, the reward model, is trained on those human judgments to predict what humans would prefer. This reward model then acts as a proxy loss function, scoring the AI's outputs not against a fixed label, but against an approximation of human preferences. The main model is then trained to maximize those scores.
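The description above doesn't say how a judgment like "A is better than B" becomes a number the reward model can learn from. One common formulation, assumed here rather than taken from this article, is a pairwise loss: the reward model gives each output a score, and the loss is small when the human-preferred output scores higher than the rejected one. The function name and the example scores are hypothetical.

```python
import math

def preference_loss(score_preferred: float, score_rejected: float) -> float:
    """Pairwise loss: -log(sigmoid(score_preferred - score_rejected)).
    Written as log(1 + exp(-margin)), which is the same quantity."""
    margin = score_preferred - score_rejected
    return math.log(1.0 + math.exp(-margin))

# Reward model agrees with the human rater: preferred output scores higher.
loss_agrees = preference_loss(2.0, -1.0)

# Reward model disagrees with the human rater: rejected output scores higher.
loss_disagrees = preference_loss(-1.0, 2.0)

print(loss_agrees, loss_disagrees)
```

Training the reward model means pushing this number down across many human comparisons, using the same backpropagation and gradient descent machinery described earlier.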
This is how AI assistants are tuned to be helpful, harmless, and honest — the loss signal itself becomes a learned human preference, not just a mathematical distance from a labeled answer.
Why This All Matters for Understanding AI
Once you understand loss functions, a lot of otherwise mysterious things about AI start making sense.
It explains why AI systems need so much data: each example supplies only a small correction, so the network needs a great many of them before its weights settle on reliable patterns. It explains why training is so computationally expensive: you're running this feedback loop billions of times. It explains why AI can fail in surprising ways: if the loss function doesn't fully capture what you actually care about, the model will optimize for the score, not for your real goal. The general principle is known as Goodhart's Law (when a measure becomes a target, it stops being a good measure); in AI contexts the resulting behavior is often called "reward hacking."
Most fundamentally, it reframes how we think about AI "learning." There's no moment of understanding, no insight, no aha experience. There's just a number — the loss — being slowly, relentlessly, stubbornly pushed downward, one tiny weight adjustment at a time. Everything an AI knows is the residue of that process.
The Takeaway
The loss function is the simplest important idea in machine learning: compare what you predicted to what was correct, express that gap as a number, and use that number to get better. Backpropagation figures out who's responsible for the error. Gradient descent acts on that information. Cross-entropy loss handles language. RLHF extends the idea to human preferences.
None of these are magic. They're elegant engineering built on calculus and statistics. But chained together and run at enormous scale, they produce systems that can write, reason, and converse — all because a number kept being pushed, very slowly, downhill.

