(Audio of the talk here).
In 1874 the mathematician Georg Cantor published a paper that claimed to prove the existence of an infinite hierarchy of infinities, each more vast than the infinity before it, stretching out forever like some vast alien landscape.
Cantor had achieved the seemingly impossible feat of counting beyond infinity. Of course, if true, this is astonishing intellectual achievement.
However, the existence of higher infinities is, on the face of it, absurd, for one very simple reason: infinity, by definition, is bigger than anything, and therefore there cannot be anything bigger than it. Quite naturally, therefore, Cantor’s contemporaries reacted with a mixture of bewilderment and disbelief.
But Cantor had a mathematical proof. So, after tumultuous debate, mathematicians came to accept what is now known as Cantor’s theorem. David Hilbert, one of the most influential mathematicians of his day, famously asserted that “No one will drive us from the paradise which Cantor created for us”. And to this day the majority of mathematicians are content to live, if not fully within Cantor’s paradise, at least in its divine light.
Should we celebrate Cantor’s higher infinities, like we celebrate the invention of the calculus, or should we jeer, as we jeer at the medieval proofs for the existence of God? What is the relationship between very abstract mathematical thought and material reality? Does the idea of higher infinities help or hinder our intellectual progress? These are the kinds of wider questions I’d like to keep in mind as we examine Cantor’s ideas.
This post is divided into two parts. We cannot critically examine Cantor’s argument unless we understand it. So I’m going to take the time to present the mathematics. Luckily, Cantor’s proof can be easily understood with just a little effort. In the second part, we’ll dive into some of the controversies.
The power set
To understand the proof we need some set theory. Luckily for us, we don’t need much. So let’s begin by defining a set.
A set is simply an unordered collection of distinct elements, where an element can be anything. Consider a set with two elements, where the first element is an apple and the second element is an orange. We can think of subsets of this set. A subset is another set which contains some elements of the original set. In fact, our set, containing an apple and an orange, has 4 distinct subsets. There is the empty set. There is the set containing only an apple. And a set containing only an orange. And finally there is the set containing an apple and an orange.
Now we can combine these 4 subsets into a single set. This is allowed because elements can be anything. So we form a new set of 4 elements, where each element is a subset of our original set.
What we have just done is construct what’s called a power set. A power set is the set of all subsets of a set. That’s the first concept we need to understand Cantor’s proof.
The next concept we need is a definition of the size of a set.
The size of a set
The size of the set containing an apple and orange is 2 because it has 2 elements. And the size of its power set is 4, as we have just seen.
Measuring the size of finite sets is straightforward: we simply count the number of elements. But measuring the size of infinite sets is not straightforward, for the simple reason that we cannot count to infinity.
But infinite sets nonetheless pose questions about their relative size. For example, the set of all even numbers is obviously infinitely large. But we might think that the set of whole numbers must surely be larger, because it contains both odd and even numbers.
But can we really be sure? What we need is a well-defined method for comparing the sizes of sets that contain elements that we cannot count. Cantor discovered a method for precisely doing that.
He observed that two sets are of the same size if we can pair up all their elements with one another, with none left over. This idea corresponds to a very practical method of comparing quantities. Let’s say I need to distribute a collection of spears to my fellow hunters. They line up and I give them one spear each. If, at the end, there are hunters that lack spears, or spears that lack hunters, then the set of spears and the set of hunters were not equal in size. But if they can be paired up, perfectly, then we know they were equal.
Set theory formalizes this idea of pairing-up two sets, with nothing left over, in terms of a bijective function between two sets. We then say that sets A and B have equal size if we can define a bijection between them.
So let’s return to our example, where we tried to compare the size of the even numbers with the size of all numbers. It turns out that we can pair off every whole number with an even number. The bijection that pairs the number n to the number 2 times n will do the job. For example it pairs 1 with 2, 2 with 4, 3 with 6 and so on. Any number we choose gets uniquely paired. This means that the size of the whole numbers is identical to the size of the even numbers. So our initial intuition was quite wrong. The set of even numbers is not smaller than the set of all numbers.
What this shows is that our intuitions regarding infinite sets are easily confounded.
So now we have all the concepts we need to tackle Cantor’s theorem. We understand a set and its power set. And we understand how we can compare the sizes of sets, even when we can’t count the individual elements.
Cantor’s main result is that a power-set is always bigger than the set it’s generated from.
The details are contained in this PDF (might be worth opening in a new tab for reference): infinityHandout
Before we tackle infinite sets, let’s think about finite sets. In this case Cantor’s theorem is clearly true. Take a look at the first picture on this handout:
On the left we have a set with 3 elements. On the right we have its power-set, which has 8 elements. By simply counting, we know that the power-set must be larger. The power-set must be larger because its built from all the possible combinations of the original set.
The finite case might already convince you that the power-set must always be bigger. But the crucial question is whether our intuition generalizes to infinite sets. This is the problem that Cantor set out to solve.
Cantor’s gives a proof by contradiction that works for any set. He assumes the opposite – that a set and its power-set are of equal size – and then shows that this assumption leads to a logical contradiction. So the assumption must be false.
The proof has 5 steps, which I’ve numbered.
- Step 1 is a minor technicality, so we will ignore it.
- Step 2 looks complex but simply defines the properties that any bijective function must satisfy. We’re assuming such a bijection exists.
- Step 3 is where the proof actually gets going. Now take a look at the second sketch on the handout, which again shows a set, which we will call S, on the LHS, and its power-set, on the RHS.
- Cantor asks us to consider a special set. Let’s call it big D. D is defined as the set that contains the elements of S that are paired with elements in the power-set that do not contain that element itself.
- I’ve drawn an element x as an example. Let’s say it’s paired to a set in the power-set that contains elements a and b. Now x isn’t itself in that set. So x will appear in our special set D.
- And x stands for any element. So D may well have many elements, depending on the precise details of the bijection. We don’t need to know what actual elements appear in D. All we need to know is that the bijection – which we’re assuming, as a hypothesis, exists – entirely determines the
contents of D.
- Turn to step 4 of the proof. Here we notice that set D must itself be a subset of S. And therefore D also appears in the power-set of S. That’s why I’ve drawn D on the RHS of the diagram.
- The bijection therefore also pairs D to an element in the set S, on the LHS. So I’ve drawn a connection between D and some element of S, which I’ve labelled small d.
- So the element small-d is paired with big D on the RHS. All these conclusions follow from assuming that the bijection exists.
- And now we get to the punchline. We ask a simple question: is small d an element of big D or not? We’ve got two cases to consider.
- Let’s consider the first case, where small d is an element of big D. Well, we said that any element in big D, by definition, is not paired with a set that contains it. So assuming that small d is in big D implies that it is not in big D. That’s a contradiction.
- Let’s consider the second case, where small d is not in big D. Well, if it’s not in big D, it must be paired with a set that does contain it. So assuming that small d is not in big D implies that it is in big D. That’s another contradiction.
- In other words, assuming that a bijection exists leads us, inescapably, to contradiction.
- So our initial hypothesis was false. There isn’t a bijection between these sets. And therefore, S and its power-set are not the same size.
- In fact, as step 5 of the proof shows, the power-set is always bigger. And we’ve proved this without counting.
If you step through these steps, with the handout in front of you, you will understand the logical force of Cantor’s argument.
Cantor’s proof has lots of consequences. Here we’ll just note Corollary 1.
The set of whole numbers if of course infinite. It represents the infinity of the counting numbers. Now imagine constructing the power-set of the set of whole numbers. That’s an interesting mental exercise to perform. You immediately realize that this power-set must contain an infinite number of subsets that are themselves of infinite size.
And now we know, from Cantor’s proof, that a power-set is always bigger than the set it was generated from. So the size of the power-set of the whole numbers is an infinity that is bigger than the infinity of the counting numbers.
So we have discovered our first higher infinity.
And we can take the power-set of this power-set of the whole numbers. And we can imagine applying this operation as many times as we want. And every time we do, we define abstract structures with sizes of increasingly higher infinities, going on and on, forever.
Cantor appears to have discovered a new universe of infinities that was hidden by our initial, and arguably naive, concept of infinity.
The proof on the handout contains an argument that many mathematicians
consider revolutionary. If correct, it represents the first time the human mind has seen a glimpse of things that exist beyond the infinite.
So, in this second part, we’ll review some of the critical reactions to Cantor’s theorem.
Aristotle and potential versus actual infinities
The first kind of objection derives from classical philosophy, in particular Aristotle. Aristotle argued that infinity, as a conceptual necessity, is essentially incomplete. We can keep counting forever, but at any actual moment, the number we have counted up to is finite. Aristotle called this infinite process a potential infinity.
In contrast, if we imagine that an infinite process actually completes and comes to an end, then we would be stating a contradiction in terms. It would be like thinking of a square circle.
So potential infinities are OK. But thinking of infinity as an actual thing, a determinate whole, is not OK.
Aristotle applied this distinction to dissolve Zeno’s paradoxes. In fact, actual infinities were viewed with extreme suspicion right up to the 17th Century.
The invention of the calculus, by Liebniz and Newton, began to weaken this orthodoxy, since it introduced talk of infinitesimals, and limiting gradients of infinite sequences of tangent lines. But suspicion of actual infinity still held sway, which accounts for many of the contemporary rejections of Cantors’ work. For example, the mathematician Poincare stated, “There is no actual infinite; the Cantorians have forgotten this, and that is why they have fallen into contradiction”.
A typical Aristotelean, when presented with Cantor’s proof, will view the power-set of an infinite set as an incoherent idea. Since the totality of elements of an infinite set are only potential, it doesn’t make sense to form its power-set, or to think of all its possible subsets as an actual thing. So it’s simply a philosophical mistake to posit such an object. And, so, from the point-of-view of classical philosophy, Cantor’s proof is suspect.
We may want to follow Aristotle and reject actual infinities. But before we do this, we should check what else we might have to throw away.
About two-and-a-half thousand years ago the Pythagoreans thought that the entire universe, and all that exists within it, was reducible to whole numbers and their ratios. This world-view was undermined by the discovery that some straight lines have lengths that cannot be represented by the ratio of any whole numbers.
One of the oldest, and most famous, proofs in mathematics is that the square root of 2 is not the ratio of any whole numbers. It’s a short proof by contradiction. Famously, its discover was thrown into the sea, such was the outrage.
The Greeks called such lengths inexpressible magnitudes. Today we call them irrational numbers. Irrational numbers have some peculiar properties. The decimal expansion of any ratio of whole numbers is finite, or repeats in a periodic pattern of finite length. In contrast, the decimal expansion of an irrational number is infinite, and the numbers never repeat on any scale. The most famous example is the number pi. We know the value of pi to many millions of decimal places. But any finite decimal expansion is necessarily an approximation.
So we are faced with a surprising fact: We cannot know the precise value of an irrational number. However, we can define an irrational number as the limit point of an infinite sequence of ratios. So we can get closer and closer to its value, by computing increasingly better approximations, but we can never complete this infinite process and arrive at its actual value.
Hegel suggested, perhaps poetically, that this kind of approximating sequence is a process where quantity turns into quality. At the infinite limit we breakthrough to a qualitatively new kind of object, with new kinds of properties.
And this certainly seems to be happening with irrational numbers. In this sense they are completed, actual, infinites.
The point is this: if you accept the real number line, which contains irrational numbers, then you already accept the existence of abstract objects that embody actual infinities. And you already accept the existence of numbers that have magnitudes that cannot be measured.
So to be consistent Aristotelians we must also reject the real numbers, or at least define them in a radically new way.
And this is precisely the line the mathematician Brouwer took in the early 20th Century, when he founded a new school of mathematics that came to be known as constructivism.
Constructivism claims that a mathematical object exists only if we can specify how to construct it. We can construct a mathematical object if we can write a computer program that generates it as output. In contrast, classical mathematics claims that an object exists if we can show that assuming it does not exist leads to a contradiction.
The key difference is that constructivism rejects a logical rule, called the law of excluded middle. The law of excluded middle states that a proposition P must be either true or false. And, of course, this seems like a reasonable rule to adopt.
For example, if P is the proposition that a certain object does not exist, then according to the law of excluded middle, if we can prove that P leads to contradiction, then we are justified in claiming that the object in fact exists.
Cantor’s proof relies on the law of the excluded middle. He argues that if we assume that a power-set, whose size is a higher infinity, does not exist, then we derive a contradiction. But at no point does Cantor actually construct a set whose size is a higher infinity.
So constructivists reject Cantor’s proof. They note that a proposition P, in addition to being either true or false, may in fact be undecidable. A simple example is the well-known liar’s paradox.
And they reject many other parts of classical mathematical analysis. For example, they propose a new definition of the real numbers that avoids actual infinities. Instead, they work with the computable reals, which are real numbers that can be computed to any degree of precision by an algorithm guaranteed to terminate in finite time.
These computable reals turn out to be a subset of the traditional real numbers. And the size of the infinite set of computable reals is the same as that of the whole numbers. So there is no space here for higher infinities.
Constructivism appears to be a very strong basis, internal to mathematics itself, from which to reject the idea of higher infinities. But before planting our flag on the field of constructivism, we need to sharpen the contradictions further. In particular, we need to recognize that we already compute with objects that can’t be constructed.
‘Unobservable’ mathematical objects
Even though we cannot know the actual magnitude of most real numbers this does not imply that all our calculations are approximate. For example, say we want to know the ratio of the areas of two circles, one with radius 4cm and the other 2cm. We know the area of a circle is pi times the square of its radius. So we can deduce that one circle is exactly 4 times the area of the other.
During this calculation the irrational number, pi, cancelled out. We mainly take this for granted because we do it so often. But what actually happened is that we temporarily visited a theoretical realm, populated by exotic objects, in this case an irrational number, and then returned to a more familiar realm with an ordinary answer in our hand.
How can we apply a theory about the dimensions of a circle without needing to know the magnitude of pi? We can do this because pi is more than just a magnitude.
This is particularly clear when we look at modern computer algebra systems, such as Mathematica. Mathematica computes with symbols rather than magnitudes. Symbolic computation avoids premature approximation when applying mathematical theorems. In a sense, symbolic computation replicates what mathematicians do. It treats irrational numbers, such as pi, or e, or the square root of 2, as theoretical terms within a large interlocking network of mathematical theories.
In physical theories, such as particle physics, we also find terms that refer to difficult-to- observe or inaccessible objects. We normally take a realist position and assume that the terms refer, however imperfectly, to actual things-in-the-world that exist independently of us.
Mathematical Platonists take a similar attitude to mathematical objects. Real numbers, for the Platonist, are abstract objects that just happen to have some properties that are not fully observable in finite time.
But in what sense can abstract objects be real? Surely they are merely mental constructions?
On this point, I think we can clearly say no. Hard-nosed materialism doesn’t preclude the mind-independent existence of abstract objects. In fact there are millions of workers manipulating abstract objects every day. A matrix, to a software engineer, is just as real as a brick to a bricklayer.
Indeed, the discovery of computation has transformed our understanding of how abstract objects may be implemented in mechanical devices, or, to state the same thing in a different way, how abstract objects are reducible to physical processes. So realism applied to mathematical objects is perhaps not so mystical as it once appeared to be.
A Platonist, when defending their realism, will also point to the unreasonable effectiveness of mathematics in science, particularly the calculus over real numbers. This is an enormously successful formalism, with huge practical consequences, and yet is stuffed full of the limits of infinite sequences.
Perhaps the same applies to the higher infinities. They may be partially observable, but nevertheless, real entities, postulated by a theory that happens to be very indirectly tethered to our current material practice.
Since Cantor’s higher infinities are terms of a theory we can also symbolically compute with them. So we can write Mathematica programs that manipulate higher infinities. Of course, it’s an weird feeling to manipulate symbols that claim to refer to unimaginably vast objects that nonetheless seem to have no discernible consequences, other than the formal, purely syntactic, transformations of symbols.
The crucial question for materialists
And this, then, seems to be the crucial question: Are there any practical consequences of Cantor’s theory? Can these symbols be linked to some kind of successful practice?
As of today, there are no examples of practical applications.
Roger Penrose, the physicist, notes that almost all physical theories only require sizes of infinity equal to the real numbers. Even the sizes of higher dimensional vector spaces happen to be the same as that of the real number line.
Since we can almost certainly reformulate all these theories in terms of the computable reals it appears that physics only needs one type of infinity – the ordinary one.
But there are other ways to measure practical success. The logician and Platonist, Kurt Godel, suggested that Cantor’s theory should be judged in terms of its explanatory power within mathematics itself. For example, we can use Cantor’s theory to give simpler proofs for already established theorems. The simpler proofs often yield more insight.
In fact, a large body of mathematical theory has been built on top of Cantor’s theory. So Cantor’s theory is, at least within mathematics, not a dead end.
Nonetheless, Cantor’s higher infinities, especially when compared to the invention of the calculus, or the theory of complex numbers, seem to be especially devoid of practical implications.
A practical experiment
In the sciences we ultimately decide between fact and fiction in terms of practical success. To quote Marx:
The question whether objective truth can be attributed to human thinking is not a question of theory but is a practical question. Man must prove the truth — i.e. the reality and power, the this-sidedness of his thinking in practice. The dispute over the reality or non-reality of thinking that is isolated from practice is a purely scholastic question.
So what future practice might lead us to decide the status of the higher infinities? That’s a very difficult question, and not one I can answer.
But we can make some suggestive observations. Cantor’s higher infinities give a very simple proof of the existence of uncomputable functions. Uncomputable functions are those that cannot be mechanized in any known way. For example, we know that it’s not possible to construct an algorithm that decides whether any given set of tiles will tessellate the plane.
So, rather remarkably, the Platonic existence of higher infinities directly implies fundamental limits to the causal powers of computing machines.
But, just as remarkably, the Physical existence of higher infinities implies that we can transcend those limits, and build hyper-computers. A small minority of theorists have proposed various designs for hyper-computers that, on paper, have the power to solve uncomputable problems. However, on closer inspection their causal powers always derive from some hidden actual infinity, such as performing operations on infinite precision real numbers.
But physical reality, as far as we know, cannot store infinite amounts of information in finite space. In consequence, hyper-computers remain an elaborate fiction since they cannot be physically constructed.
So one route to demonstrating the reality of higher infinities would be the construction of a hyper-computer. But as of today this prospect seems very doubtful.
The social function of higher infinities
Some form of Platonism is the dominant philosophy of mathematics today. Mathematicians might prefer a constructive proof, if they can get one, but otherwise are willing to accept the existence of objects that can’t be constructed. Constructivism remains a minority pursuit because it prohibits proof methods that most believe are reasonable and justified. For example, Hilbert complained that “Taking the principle of excluded middle from the mathematician would be the same, say, as proscribing the telescope to the astronomer or to the boxer the use of his fists”.
Also, constructivism is, in some ways, an anthropocentric philosophy of mathematics. It seeks to reduce existence to the finite powers of our cognition or current technology. And this only really works if you think mathematics deals entirely with invented, rather than discovered, objects.
And so we are faced with a contradiction. On the one hand, Cantor’s higher infinities suggest that our reason can transcend our physical limitations and gain knowledge of real, but practically inaccessible, objects. On the other hand, mathematics is a language in which it’s perfectly possible to talk about things that don’t exist, to speak a kind of nonsense. As Wittgeinstein remarked, “if one person can see the higher infinities as a paradise for mathematicians, why should not another see it as a joke?”
And finally we need to acknowledge the hand of God in all this. Cantor believed it was God himself who revealed the infinite to him. He promoted his ideas within the Catholic church, including writing letters to the Pope. And Cantor’s diagonal argument has a similar logical structure to Anselm’s ontological argument for the existence of God.
When we think about the infinite, religion and theology are always close by. So have mathematicians actually discovered a rational kernel within this mystical shell, or have they merely reproduced the idealist yearning to escape an often painful and disappointing material reality?