I asked artificial intelligence to solve a 2,400-year-old philosophical puzzle. The answer surprised me. Not because the AI found the truth, but because of what its confusion revealed about language itself.
Start with a single grain of sand. No one would call that a heap. Add another grain. Still not a heap. Keep adding, one grain at a time, and at some point you'll have an undeniable heap of sand in front of you. But here's the problem: at what exact moment did it become one?
This is the Sorites Paradox, named after the Greek word soros, meaning "heap." The ancient philosopher Eubulides of Miletus posed it around 400 BCE, and it has bedeviled thinkers ever since.
The logic seems airtight: if one grain isn't a heap, and adding a single grain can't turn a non-heap into a heap, then no accumulation of sand, no matter how enormous, should ever qualify as one. Yet we know that's absurd. A million grains of sand is obviously a heap.
Something in the argument must be wrong. But what?
The paradox isn't really about sand. It's about the fuzzy boundaries of language: words like "tall," "bald," "rich," and "old" that resist precise definition. These vague predicates are everywhere in human communication, and they've proven remarkably resistant to philosophical resolution.
Large language models like ChatGPT don't think in simple yes-or-no answers. Under the hood, they assign a probability to every possible next word. When you ask a binary question of an open-source model, you can peer inside and see exactly how confident it is in each response.
This gave me an idea: What if I asked an AI, at every quantity of sand from 1 grain to 100 million, whether it constituted a heap? By examining the model's internal confidence levels, I could plot a "heapness curve," a mathematical function showing how the AI's certainty changes as the pile grows.
Specifically, the probability of answering "Yes" is a two-way softmax over the model's scores for the two words:

P(heap) = exp(ℓYes) / (exp(ℓYes) + exp(ℓNo))

Here, ℓYes and ℓNo are the model's internal scores (called logits) for each answer: a higher score means the model is more inclined to choose that word.
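In code, that conversion is small enough to fit in a tweet. A minimal sketch in plain Python, with placeholder logit values standing in for real model output:

import math

def p_heap(logit_yes: float, logit_no: float) -> float:
    """Two-way softmax: probability of "Yes" given the logits for "Yes" and "No".
    Equivalent to a sigmoid of the logit difference."""
    return math.exp(logit_yes) / (math.exp(logit_yes) + math.exp(logit_no))

print(p_heap(2.3, 1.1))  # ~0.77: leaning toward "Yes"
print(p_heap(0.5, 0.5))  # 0.50: maximally uncertain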
In my imagination, the results would produce a beautiful S-shaped curve, rising from near-zero for small quantities to near-certainty for massive piles. I'd find the inflection point and declare: This is where the heap begins.
Reality, as it often does, had other ideas.
My first attempt used a straightforward prompt:
"There is a heap of 1,000,000 grains of sand. I remove n grains.
Is this still a heap? Answer yes or no."
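Scoring each prompt meant grabbing the model's next-token logits for "Yes" and "No." Here's a rough sketch of that step, using the Hugging Face transformers library and Mistral-7B; the exact checkpoint ID and the token handling are simplifications on my part:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "mistralai/Mistral-7B-v0.1"  # assumed checkpoint; any Mistral-7B variant works the same way
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.float16, device_map="auto")

def yes_no_logits(prompt: str) -> tuple[float, float]:
    """Return the model's next-token logits for "Yes" and "No" after the prompt."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        next_token_logits = model(**inputs).logits[0, -1]
    # Real tokenizers may split or prefix these words differently; this is simplified.
    yes_id = tokenizer.encode("Yes", add_special_tokens=False)[0]
    no_id = tokenizer.encode("No", add_special_tokens=False)[0]
    return next_token_logits[yes_id].item(), next_token_logits[no_id].item()

n = 500_000
prompt = (f"There is a heap of 1,000,000 grains of sand. I remove {n:,} grains.\n"
          "Is this still a heap? Answer yes or no.")
l_yes, l_no = yes_no_logits(prompt)
print(p_heap(l_yes, l_no))  # p_heap from the earlier sketch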
I varied n from 1 to over a million, expecting to watch the model's confidence erode as the pile shrank. Instead, I got this:
Across all values of n, the probability stayed around 0.67, barely moving regardless of whether I removed one grain or the entire pile.
By stating "there is a heap" in the prompt, I had already anchored the model's expectations. It was simply deferring to my framing rather than reasoning about the quantity. I had built an expensive machine for confirming my own assumptions.
The second attempt required a different approach. Instead of relying on the model's pre-existing notion of "heap," I would define the concept through examples, a technique called few-shot prompting.
You are a strict classifier that decides whether a pile of sand is a heap.
Answer each question with exactly "Yes" or "No".
Q: There is a pile of 1 grain of sand. Is this a heap?
A: No
Q: There is a pile of 2 grains of sand. Is this a heap?
A: No
Q: There is a pile of 1,000,000 grains of sand. Is this a heap?
A: Yes
Q: There is a pile of 999,999 grains of sand. Is this a heap?
A: Yes
Q: There is a pile of N grains of sand. Is this a heap?
A:
With these examples, I was establishing clear boundary conditions: tiny amounts are definitely not heaps; enormous amounts definitely are. For everything in between, the model would have to interpolate.
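Concretely, the sweep reuses the yes_no_logits and p_heap helpers sketched earlier, wrapping the examples above into a prompt template and stepping through powers of ten:

FEW_SHOT = (
    'You are a strict classifier that decides whether a pile of sand is a heap.\n'
    'Answer each question with exactly "Yes" or "No".\n\n'
    "Q: There is a pile of 1 grain of sand. Is this a heap?\nA: No\n"
    "Q: There is a pile of 2 grains of sand. Is this a heap?\nA: No\n"
    "Q: There is a pile of 1,000,000 grains of sand. Is this a heap?\nA: Yes\n"
    "Q: There is a pile of 999,999 grains of sand. Is this a heap?\nA: Yes\n"
)

def heap_prompt(n: int) -> str:
    """Append the question for n grains to the few-shot examples."""
    grains = f"{n:,} grain" + ("s" if n != 1 else "")
    return FEW_SHOT + f"Q: There is a pile of {grains} of sand. Is this a heap?\nA:"

curve = []
for exponent in range(9):  # 1 grain up to 100 million, in powers of ten
    n = 10 ** exponent
    l_yes, l_no = yes_no_logits(heap_prompt(n))
    curve.append((n, p_heap(l_yes, l_no)))

for n, p in curve:
    print(f"{n:>12,} grains -> P(heap) = {p:.2f}")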
I sampled at 1, 10, 100, 1,000, and so on up to 100 million, rather than checking every single integer. This time, the results were striking:
Below a few dozen grains, the probability hovered in the 0.25–0.35 range; by the time I reached thousands of grains, it had climbed into the 0.65–0.75 range. It wasn't a perfect sigmoid, but the shape was unmistakable: the model was genuinely wrestling with the boundary between heap and non-heap.
Now I had data worth analyzing. But what did it actually mean?
Philosophers have proposed various solutions to the Sorites Paradox over the centuries. Remarkably, the same heapness curve can support three entirely different interpretations.
One school of thought rejects the binary assumption altogether. Instead of "heap" being simply true or false, it exists on a spectrum. The statement "this pile is a heap" might be 30% true, or 70% true, or anywhere in between.
Under this interpretation, the model's probability curve is the meaning of "heap." It serves as a membership function describing how strongly any given quantity belongs to the category. The paradox dissolves because we stop demanding sharp boundaries where none exist.
A more counterintuitive position holds that there is a precise point where heap begins (perhaps at exactly 37,421 grains), but human knowledge can never pinpoint it. The curve represents our uncertainty, not the underlying reality.
If you embrace this view, you'd find the value of n where the model's confidence is closest to 50% and treat that as your best estimate of the true threshold. Classical logic survives; we just acknowledge epistemic humility about vague predicates.
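Operationalizing that reading takes one line over the sampled curve from the sweep above: pick the n whose probability sits closest to 0.5.

# Epistemicist reading: the sampled n nearest to 50% confidence is the best guess at the hidden threshold.
best_n, best_p = min(curve, key=lambda point: abs(point[1] - 0.5))
print(f"Best guess at the threshold: ~{best_n:,} grains (P = {best_p:.2f})")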
A third approach says that vague predicates create genuine gaps in truth. Instead of one sharp boundary, imagine a range of permissible boundaries: different reasonable cutoffs that competent English speakers might adopt.
Under this framework, the curve defines three zones: a low range where "heap" is false on every reasonable cutoff (definitely not a heap), a high range where it's true on every reasonable cutoff (definitely a heap), and a middle range where the verdict depends on which cutoff you choose, so the statement is neither true nor false.
The Sorites argument fails because you can't chain implications through a region where truth values don't exist.
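You can read those zones straight off the sampled curve, though the probability cutoffs below (0.4 and 0.6) are purely my own choice, not anything the framework dictates:

def zone(p: float) -> str:
    """Map a heapness probability to a supervaluationist zone (cutoffs are arbitrary)."""
    if p < 0.4:
        return "definitely not a heap"
    if p > 0.6:
        return "definitely a heap"
    return "truth-value gap"

for n, p in curve:
    print(f"{n:>12,} grains: {zone(p)}")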
The experiment raised a natural question: how sensitive is this curve to the specific examples I provided? To find out, I tried three variations.
I replaced the high-end examples, originally set at around one million grains, with one billion. That's a thousandfold increase. The resulting curve shifted only slightly.
Why? Both quantities live in the same conceptual bucket: "unimaginably huge pile." Once you've established that some astronomically large number counts as a heap, making it even larger provides no additional information. The model's decision boundary is shaped far more by the low-end examples and its pre-existing intuitions about language.
The initial experiments used a small open-source model called Mistral-7B. When I repeated them with a similar model, DeepSeek-7B, the overall shape remained similar, but the inflection point (where probability crosses 50%) shifted by tens of thousands of grains.
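To put a number on each model's inflection point, one option is to fit a logistic curve in log10 of the grain count to its samples and read off the midpoint. A sketch using scipy, where the starting guesses are arbitrary and curve is the list of (n, probability) pairs from the sweep:

import numpy as np
from scipy.optimize import curve_fit

def logistic(log_n, midpoint, slope, low, high):
    """Four-parameter logistic in log10 of the grain count."""
    return low + (high - low) / (1.0 + np.exp(-slope * (log_n - midpoint)))

def inflection_point(samples):
    """Estimate the grain count at the curve's steepest point."""
    log_n = np.log10([n for n, _ in samples])
    p = np.array([prob for _, prob in samples])
    popt, _ = curve_fit(logistic, log_n, p, p0=[4.0, 1.0, 0.3, 0.7], maxfev=10_000)
    return 10 ** popt[0]

print(f"Estimated inflection point: ~{inflection_point(curve):,.0f} grains")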
This is philosophically revealing: if "heap" had a fixed, precise boundary in English, you'd expect well-trained models to converge on roughly the same cutoff. They don't. Each model has absorbed slightly different patterns from its training data, leading to different fuzzy zones.
In a sense, this confirms what the Sorites Paradox has always suggested: vague predicates are underdetermined by usage. There's no single correct answer lurking in the data.
The most illuminating result came from Llama-3-8B, whose curve looked unlike the others:
The probability never dropped much below 0.35 or climbed much above 0.55. From a single grain to a hundred million, the model remained perpetually uncertain, never strongly committing to "heap" or "not heap."
Under the fuzzy interpretation, this model is saying that every pile of sand possesses roughly equal claim to heap-hood. Under the supervaluationist view, nearly the entire range falls into the truth-value gap. And if you insist on an epistemicist reading, you can still force a sharp cutoff, but the data is essentially screaming that it doesn't support one.
Did I solve the Sorites Paradox? No. Twenty-four hundred years of philosophy wasn't going to crumble before a weekend project with a language model.
But the exercise revealed something valuable about how AI systems handle the fundamental vagueness built into human language. Here's what I learned:
The boundary between heap and non-heap isn't encoded somewhere in the model's weights, waiting for the right prompt to extract it. The boundary shifts with context, with examples, with the specific model queried. This suggests that "heap" genuinely doesn't have a single precise meaning, not even in the statistical patterns of human language that these models capture.
When I gave the model examples defining heap, I wasn't uncovering its latent understanding. I was constructing a new mini-concept on the fly. The model interpolated my examples using its priors, producing a curve that reflected my inputs as much as any deeper truth.
One heapness curve can be read as evidence for fuzzy logic, epistemic uncertainty, or truth-value gaps. The empirical results don't favor any particular resolution to the paradox. They just give each school new numbers to work with.
Different models, different prompts, same basic pattern: tiny piles aren't heaps, huge piles are, and there's a vast fuzzy middle where everyone hedges. This is exactly what you'd predict if natural language systematically fails to pin down predicates like "heap." The models aren't broken; they're faithfully reflecting the ambiguity inherent in the concept.
As takes on a 2,400-year-old paradox go, that's not the worst I've heard.
Perhaps the Sorites Paradox doesn't reveal a bug in human language that needs fixing. Perhaps it reveals a feature: our words are designed to be flexible, contextual, negotiable. The boundaries of "heap" aren't fixed because they don't need to be. We understand each other well enough without them.
And that's something not even the most sophisticated language model can change.