We now set out to prove the general pattern that was conjectured to hold in lesson 3. Namely, we set out to show that if a set has N elements in it—where N is a finite number—then that set always has exactly distinct subsets. Although this lesson is primarily intended for mathematical rigor and completeness (where I try to minimize the amount of times you have to “take things on faith”), it will also provide some good examples of how we need to get creative when we prove things. There are several ways to prove that this pattern holds, and some of these methods require fancier machinery than we currently have available. However, with a little creativity, we can make the result obvious just from the theory of sets and functions that we’ve developed so far.

Before I begin, however, let me wax philosophical for just a bit longer. In particular, let me (somewhat) briefly discuss what a mathematical proof really is.

This is a very deep question—one that I will have much more to say about in other parts of this site, and one which has no real clear answer. One relatively well agreed upon definition of a** **proof, and the definition that I’ll go with here, is a finite sequence of logically deduced and unassailably true statements which demonstrates the validity of a final statement. Now, there are lots of terms in there that are ill-defined and/or contentious, and there’s nothing I can do about that here. As I said, this is a huge question—one which I have no pretention of being able to do justice to here. One thing we **can** gather from this definition, however, is the notion that a proof should kind of lay a red carpet out for your mind, leading your brain one step at a time through a supremely convincing argument for the validity of some statement, using previously determined truths and logical manipulations. Thus, when proving a statement it does not matter whether one needs to invoke some high powered theorem about braided monoidal categories or can attain the desired result using nothing but basic arithmetic. All that matters is rigor and clarity. In fact, it is often the simple short proofs that inject the beauty into math, and not the 150 page tours de force that some theorems seem to require.

The above paragraph is only there to make explicit what I mean by “proving the pattern”. Namely, I will provide a sequence of statements that make it “obvious” that a set with N elements **must** have distinct subsets. Before beginning, however, I must introduce one small piece of machinery. We’ve already defined sets and saw that we could just as well define a **set of sets**, whose elements are themselves sets (see lesson 4). Now let us consider **sets of functions**. This isn’t so bad, it’s just a set whose individual elements are functions! This is possible since we have a way of distinguishing two functions from each other: we say two functions f and g are **the same**, or **equal**, if their **domains** and **codomains ** are the same, and if for each in the domain. This is just the technical way of saying that if the functions act on the same domain in the same exact way, then they’re the same. This is an obvious definition of equality.

Let’s see an example of a **set of functions**. Consider the sets and . There are only possible 4 functions that we can define from A to B, since we can have

,

or ,

or ,

or .

Any other function from A to B must be equal to one of these functions, so if we label these four functions as and then we can form the set of functions from A to B.

Now we’re ready to prove the pattern. How does all of this “sets of functions” business relate to the distinct subsets of a set? Well, in a very natural way actually! For let A be a set with N elements and let B be a subset of A. We can then fully characterize B by considering a clever function from A to the set as follows. Let’s define the function (just a label) from A to to be the function which sends an element in A to 0 if that element is **not **in B, and to 1 if that element **is** in B. Thus, the elements in B are simply the elements in A that get sent to 1 by the function .

We can clearly do this for any subset of A that we’d like. In particular, if we chose the empty set as our subset, then our function would simply be the function that maps **everything** in A to 0, since **nothing** in A is in this subset, because this subset is empty! Similarly, since we know that A is a subset of itself, it must correspond to some function from A to . Sure enough, the corresponding function is just the function that sends everything in A to 1, since now **everything** in A is in the subset that we’re considering. It is clear that if you hand me a subset of A, I can construct such a function. Moreover, if you hand me a function , then I can define a subset simply by the elements that get sent to 1. Thus, there is a **bijective **correspondence between subsets of A and functions . Therefore, if we can count the number of distinct functions from A to , then we’ll have counted the number of distinct subsets of A.

But is counting functions any easier than counting subsets? In this case, the answer is yes (if it weren’t then I wouldn’t have wasted your time with the above paragraphs!). The reason for this is that if we know how many distinct subsets (i.e., distinct functions to ) a set with N elements has, then we automatically know how many subsets a set with elements has. Let’s see this in general.

Suppose a set with N elements has M possible functions to (don’t let the letters confuse you: N is the number of elements in the set, and M is the number of distinct subsets of that set). Recall that we don’t know exactly what M is yet, because that’s what we’re trying to find out! However, **whatever** M is, let us consider how many subsets a set with N+1 elements will have. For simplicity, let’s just consider the original set with one extra element in it. Well, we need to look at how many functions from this new N+1-element set to there are. We clearly get all the functions that we had before we added this element, but now we need to send the new element to either . We therefore get the M functions that we had before with the new element sent to 0, and another set of M functions with the new element sent to 1. We therefore now have functions. Thus, we’ve established that if a set with N elements has M distinct subsets, then a set with N+1 elements has distinct subsets. We’re almost there!

Now consider a set with 1 element. We can either send this element to 0 or to 1, therefore there are 2 distinct subsets: the empty set and the set itself. From the above paragraph, this means that a set with 2 elements has distinct subsets, because we just multiply the number of distinct subsets that a set with one less element has by two. Similarly, a 3-element set must have twice as many subsets as a two element set, so a 3-element set has elements. We clearly get an extra factor of 2 every time we raise the cardinality of the set by 1. Since a 1 element set has one factor of 2’s worth of subsets, an N element set has N factors of 2’s worth of subsets. But this is the very definition of !

QED

Nice job! This is a relatively tedious proof, but it does show how we are at liberty in interpreting our mathematical structures, so long as these interpretations are completely rigorous. Namely, the key to the proof was realizing that we can associate subsets to special functions in a bijective way. This association then made the counting a lot more obvious. Note also that this proof helps to clarify **why** the statement is true. This is another sign of mathematical beauty (not that this proof is particularly beautiful). If a proof simply involves quoting some high-powered statement from some unrelated sub-field of math, then it is neither helpful, enlightening, or beautiful. However, if a proof can help to make clear **why** the statement is true, and **why **it couldn’t be any other way, then we see the beauty in math directly—we see how logic drives us into the inevitable and eternal truths that we are so fortunate to experience.

Now that we’re done here, let us move back to our general study of sets and the various constructions that they allow. As we’ll be seeing, there are very few constructions in mathematics that don’t involve sets and functions in some way.

Hi there! First off, thank your very much for this website. I’ve got no mathematical knowledge whatsoever, and your lessons have been very very helpful. I’ve been following each of them very carefully, and I understand pretty much everything (including Cantor’s diagonal….mind blowing!!!) So…. you’re a great teacher!! Now today was the exception (I failed as a student), which is funny since the reasoning about the cardinality of the power set seems to be very simple, right? For some reason, though, I can’t get it!! While I do understand each specific steps of the reasoning, I can’t make sense of it as a whole. I don’t see how ascribing a binary value [0,1] to each element of a set S can give you the cardinality of P(S)?? I understand everything quite mechanically, as it were. If you have S = {1,2,3}, then you go: 1: either 0 or 1; 2: either 0 or 1; 3: either 0 or 1, then, since each member either it is in S or not, 2x2x2 should give you |P(S)|. But this is precisely what I don’t (truly) understand… help! Thanks again, and kind regards from Switzerland!!

another related question (i’m still struggling!): when you say “Thus, there is a bijective correspondence between subsets of A and functions f : A\rightarrow \{0,1\}”, you mean that the bijection is with every function (0 and 1) or only with those that send an element in A to 1??

First of all I’m glad to hear you like the site, and I hope you continue to do so! 🙂 I’m also glad to hear that you’re able to follow the lessons even without much background, as that is precisely what the goal is. Now on to your questions (which are great questions, by the way, and thanks for asking about your confusion instead of giving up, that’s very admirable (and well in the spirit of math!)).

Let us begin with the bijective function from the set of subsets of A to the set of functions from A to {0,1}. The function is as follows. Take ANY subset of A, let’s call it B, and define the following function from A to {0,1}. It’ll be the function that sends each element of A that is in B to 1, and each element of A that is not in B to 0. This gives us a perfectly good function from A to {0,1}. For example, if A={1, 2, 3} and B={2, 3}, then our function would be f(1)=0, f(2)=1, and f(3)=1. As another example, the function corresponding to the empty set (which is a subset of A) would be f(1)=0, f(2)=0, and f(3)=0.

Conversely, we can take ANY function from A to {0,1} and define a subset of A. Namely, we can define the subset of A by only putting in those elements that get sent to 1 by this function into our subset. For example, if A={1, 2, 3} and the enemy hands us the function from A to {0,1} defined by f(1)=0, f(2)=1, f(3)=1, then we can define the subset {2, 3}. Note that these two operations are “inverses” to each other. Namely, if we start with a subset of A and define its corresponding function, and then take this function and define its corresponding subset, then we get back to the same subset that we started with. All of this is saying that for every subset of A, there is a function from A to {0, 1}, and for every function from A to {0, 1}, there is a subset of A. This is then nothing but saying that there is a one-to-one correspondence (or bijective correspondence) between subsets of A and functions from A to {0,1}. And what all of this is saying is that there are always the same number of subsets of A as there are functions from A to {0, 1}.

So finally our job comes down to simply counting the functions from A to {0,1}. Suppose A has N elements (where N is some finite number). We need to count ALL POSSIBLE functions from A to {0,1} (which we now know is the same as counting ALL POSSIBLE subsets of A). To count these functions, we see that in defining ANY function from A to {0,1}, we have the choice of sending the “first” element of A to either 0 or 1, the “second” element of A to either 0 or 1, the “third” element of A to either 0 or 1, and so on, until we send the N^th element of A to either 0 or 1. Thus there are a total of 2^N choices to make in defining some given function, and therefore there are 2^N total POSSIBLE functions (and therefore 2^N total possible subsets). (As an example, explicitly write out all the function from A={1, 2, 3} to {0,1}).

All this talk about functions is nothing but rephrasing the following. To construct ANY subset of A we must go to each element of A and ask whether or not we put it in our subset. EACH POSSIBLE such choice corresponds to a different subset. Namely, if YES means “yes, put it in the subset” and NO means “no, don’t put it in the subset,” then each possible string of “YES, NO, NO, NO, YES, …” where we are just going to each element of A and answering yes or no corresponds to a different subset. We then see that we have two choices for the first element, two choice for the second element (so a total of four choices for the first two elements), two choices for the third element (so a total of eight choices for the first three elements), two choices for the fourth element (so a total of 16 choices for the first four elements), …, and two choices for the N^th element (so a total of 2^N choices for all of the N elements). Thus, 2^N subsets.

Does that help at all? If it still isn’t crystal clear then just let me know and I’ll try to help some more!

Cheers!

That was very helpful, thanks!! Now I see that my confusion was a “linguistic” one rather than mathematical. I thought that each binary value was a function (so: 101 = three functions: 1-0-1) instead of thinking of the whole series as one (so: 101 = one function, 001, another function, etc.). So now I get the point that you count subsets by counting functions (which was what troubled me). Again, thanks! Let us move on to lesson 16 then!!!!

Awesome, glad it all makes sense! I hope you enjoy lesson 16 and beyond! 🙂 And don’t hesitate to keep asking questions (thumbs up emoji)