How can you find one fake coin among many, by comparing the weights of various sets of coins, using the smallest possible number of comparisons?
Puzzles like this have a long history, but somehow I never got around to thinking about them until I read this fun set of balance puzzles by Quanta magazine’s columnist Pradeep Mutalik. If you want to read and solve those Quanta puzzles for yourself, DO NOT keep reading the page you are on now, as I am about to reveal a whole lot of spoilers.
The idea is that we have a collection of n coins, and while the genuine coins are all identical in weight, some number of fake coins among them will have a different weight. We have a balance that lets us directly compare the weights of any two sets of coins from the collection, by placing them in the two pans of the balance and seeing whether one side drops, proving it to be heavier, or whether the pans remain level, proving that they contain equal weights. To be precise, by “any two sets” we mean any two disjoint sets, i.e. sets with no elements in common, since a coin can’t be in both pans at once. But apart from it being impossible to perform such a measurement, the contribution of the shared coins to each set would cancel out, so even in theory there would be no advantage at all to allowing overlapping sets.
In what follows, we will focus on the versions of the puzzle where we know for sure that there is exactly one fake coin and n–1 genuine coins. There are then four possible variations:
Variations 1 and 2 are essentially the same; if we can solve Variation 1 by a series of measurements, we can solve Variation 2 by doing the same thing but swapping every mention of “lighter” and “heavier” in our procedure.
Variation 1 is fairly easy to solve.
For larger values of n, we adopt a strategy of repeatedly dividing up the set of coins that must contain the fake into three parts. Two of these parts need to have the same number of coins, so that if we compare them they will either be the same weight, which will tell us that the third part contains the fake, or one of the first two parts will be lighter, so it will contain the fake.
How, exactly, should we divide up the coins? If n is divisible by 3, we can divide it into three equal parts. If there is a remainder of 1 after dividing by 3, we put the extra coin in the third set. If there is a remainder of 2, we put an extra coin in each of the first two sets. So, 9 → (3,3,3), 10 → (3,3,4), 11 → (4,4,3), etc.
Each comparison shrinks the size of the set that must contain the fake. If we started out with n = 3p coins, every comparison would shrink the set by precisely a factor of 3 every time, so that after p comparisons we would have pinned the fake down to a set of size one, locating it exactly. But in general, the number of comparisons that we can guarantee are enough to find the fake will be the lowest value of p such that 3p ≥ n. So for 10 coins, which we split into two sets of 3 and one set of 4, we might be lucky and pin the fake down to one of the sets of 3, then settle the question with just one more comparison, but if the fake is in the set of 4, we might need two more comparisons, for a total of 3 comparisons.
Variation 2 can be solved in the same way, except that when two sets have different weights, it is the heavier of the two that must contain the fake.
What about Variation 3, where we have no advance knowledge as to whether the fake coin is heavier or lighter than the genuine coins? If n=1, that single coin must be the fake, but if n=2 it is impossible to tell whether the lighter of the two coins is a genuine coin accompanied by a heavier fake, or a lighter fake accompanying the genuine coin. So it really only makes sense to talk about a strategy for n ≥ 3.
For n=3, it is impossible to be sure of finding the fake in just one measurement. We can only compare two coins, and if they are the same then that pins down the fake to the third coin. But if the coins we compare have different weights, we have no way of knowing which is the fake; this is the same situation as n=2, where we were stuck. The only solution is to perform another comparison. Luckily, we now know that the third coin is genuine, so we can compare it with either of the first two, and that will settle the matter: any coin matching the third coin will be genuine, making the other fake, while any coin disagreeing with the third coin will itself be fake.
We can illustrate this strategy with the following decision tree:
Here, we colour a set of coins grey if we know nothing about them except that they might include a fake of either weight. We colour a set yellow if we know they are all genuine. We colour a set light blue if they might include a lighter fake, but not a heavier fake. And we colour a set pink if they might include a heavier fake, but not a lighter fake.
We will call the totality of grey sets F for fake, the totality of yellow sets R for reference [we know they are genuine, so we can use them as reference weights to check other coins], the totality of light blue sets L for light, and the totality of pink sets H for heavy. We will use f, r, l and h for the number of elements in F, R, L and H respectively.
Whenever we perform a measurement comparing two sets A and B (such as the sets of coins marked on our tree by the green and magenta dots), the F, R, L and H sets change as follows:
Once we have pinned down the set containing the fake to a single coin, we give its number as a “leaf” on the decision tree.
It’s easy to extend the strategy we gave for n=3 to n=4. The only difference is that when the initial pair of coins we compare turn out to weigh the same, that no longer identifies the fake coin immediately, and we need one more comparison to reach the answer. So this time we locate the fake with exactly two comparisons in every case.
What about higher values of n? Before trying to answer the question systematically, let’s look at one specific case. The Quanta balance puzzles asked for a solution to n=12, using at most 3 comparisons. After an awful lot of trial-and-error, I finally came up with this decision tree:
To keep the tree diagram from being needlessly large, we have omitted the branch where the first comparison finds the green set to be lighter than the magenta set, because everything we do for the branch where the opposite is true can easily be adapted for this outcome, just by swapping the two sets 1-4 and 5-8.
Here is a description of the strategy in words:
Phew! While most of the individual steps here were fairly easy to come up with, one comparison that sticks out as being a lot less obvious than the others is measurement 2B in the written description, which is the top 2nd-level comparison in the tree diagram. Here, we compare two sets of coins that are each drawn from two different categories: 3 coins from H and 2 from L are compared with 1 from H and 4 from R.
We can extend this strategy to n=13, still with just 3 comparisons. Again, compare four of the coins with four others as the first measurement, and if they have different weights everything works in essentially the same way as for n=12. If they have the same weights, leaving us with 8 reference coins and 5 potential fakes, we now compare 3 of the potential fakes with 3 reference coins. If those 3 potential fakes weigh the same as the reference coins, we have narrowed down the possibilities to the other 2 potential fakes, with the culprit easily located with one more comparison. And if the 3 potential fakes are either heavier or lighter than the reference coins, the comparison tells us definitively whether the fake is heavier or lighter, which takes us back to n=3 for Variation 1 or Variation 2, which is settled by comparing one pair of coins from the 3 candidates.
To see how we can approach the general case, we will start by describing all the situations from which we can reach a result with just one more comparison:
The requirement that r ≥ 1 is not very limiting; while r = 0 at the very beginning, that will only persist if we make some very specific choices that aren’t particularly helpful, let alone obligatory. So apart from those circumstances, it will generally be true that if f ≤ 2 or l + h ≤ 3 we can reach a result with at most one more comparison.
Now let’s consider a comparison of the most general kind. When we have coins in the grey set F, this is:
w coins from F are compared with v from F, w–v from R, while setting aside f–w–v from F
After a measurement, we end up with one of the following results:
When we have coins in the blue and pink sets L and H, the most general comparison is:
a coins from L, b from H are compared with c from L, d from H, a+b–c–d from R, while setting aside l–a–c from L, h–b–d from H
After a measurement, we end up with one of the following results:
Now, we ought to be able to divide and conquer in a similar, if slightly more complicated, fashion as we did with Variation 1. We can divide the coins in L and H into three groups: all equal in size if the total number is divisible by 3, with one extra in one group if there is a remainder of 1, and one extra in each of two groups if there is a remainder of 2. This will always result in at least two of the groups being equal in size, but we will not commit in advance to any particular pair of groups being equal.
These three groups will specify the possible new sets of elements of L and H after the next comparison we perform. That comparison will be between (A) all the current L elements in the first group and all the H elements in the second group, and (B) all the L elements in the second group and all the H elements in the first group. If one of these sets is smaller than the other, we will need to make up the difference by padding out the smaller set with reference coins from R. If the comparison tells us that A weighs less than B, that shrinks both L and H down to their intersection with group 1; if A weighs more than B, that shrinks them to their intersection with group 2; and if A and B weigh the same, that shrinks them to their intersection with group 3.
The maximum value of l + h that we can reduce to 1 in c comparisons this way will be 3c, but the initial value of l + h will always be even, because l and h will always start out being equal the first time they are non-zero (e.g. they are both 4 in our decision tree for n=12). So 3c–1 will be the actual maximum. The question then is how many coins we should compare in the very first comparison, and how many we should set aside. The number we compare will determine the value shared by l and h if the comparison finds the sets have different weights, while the number we set aside will determine the value of f if the sets turn out to weigh the same.
Suppose we initially compare two sets of w = (3c–1)/2 coins. How many additional coins could we set aside without being at risk of falling behind the c comparisons that we know will suffice to locate the fake coin if it lies in the two sets that are initially compared?
If we look at the ceiling we can place on l + h, after the initial value of 3c–1 this falls to 3c–1; for example, for c=3 we have 3c–1 = 26, which we would then split into sets of 9, 9 and 8. At each subsequent comparison we have a ceiling on l + h of 3c–k. So, suppose we have an additional (3c+1)/2 coins at the start, so that after the first comparison we might end up with f = (3c+1)/2. If we then take 3c–1 of these and compare them with reference coins, then if they turn out to be heavier or lighter, we get either l or h equal to 3c–1 and the other zero, so l + h would match the ceiling we get at the same stage by a different route. And if they turn out to weigh the same as the reference coins, what we are left with as the new value of f is:
(3c+1)/2 – 3c–1
So we can continue this strategy for as long as we need to: when f is (3c–k+1)/2, the ceiling on l + h one step deeper into the decision tree will be 3c–k–1, and we can ensure that we keep pace with it by comparing that many of the coins in F with reference coins.
What we have sketched so far suggests that, with a total of c comparisons, including the first one that we haven’t been counting until now, we ought to be able to solve Variation 3 for:
n ≤ 3c–1–1 + (3c–1+1)/2 = (3c–1)/2
This formula takes the values 1, 4, 13, 40, 121, ... though as we have noted, it only really makes sense to start at n = 3.
To make this concrete, we will need to spell out an algorithm for choosing the comparisons at each point in the decision tree, for any n ≥ 3. Suppose we are at a point in the decision tree with:
l + h = 3 t + u
where u is either 0, 1 or 2
Depending on the value of the remainder u we get after dividing l + h by 3, we will have three possible triplets of values for the size of the three groups into which we divide the l + h coins:
But we don’t want to commit yet to which of these are g1, g2, g3, the size of the first, second and third groups. The first group is made up of a coins from L and d coins from H, and the second group is made up of c coins from L and b coins from H; the comparison we perform will then be between a coins from L together with b coins from H, against c coins from L together with d coins from H, with as many reference coins added to one side as we need to ensure that we are comparing equal numbers of coins.
This leads to the following equations and inequalities:
a + d = g1
b + c = g2
0 ≤ a ≤ min(g1, l)
0 ≤ b ≤ min(g2, h)
d = g1 – a
c = g2 – b
a + c = a – b + g2 ≤ l
b + d = b – a + g1 ≤ h
g1 – h ≤ a – b ≤ l – g2
So, for a given choice of g1 and g2, we have (a, b) constrained to a rectangle in the coordinate plane by the first set of inequalities, and then further restricted to a band running at 45° to the coordinate axes by the second set of inequalities. If we then choose a particular point (a, b), that determines the other parameters of the comparison, (c, d), and the number of reference coins we need to use:
ρ = |a + b – (c + d)| = |2a + 2b – (g1 + g2)|
In considering our choices for g1 and g2 [when there is more than one possibility] we need to check that the rectangle and the 45° band have a non-empty intersection. Then, along with our choice of (a, b) we will try to make ρ as small as possible, to ensure that we will have as many reference coins as we need. If we ignored this last criterion, there is at least one scenario where we could make the wrong choice: for n=3, in the second step we only have one reference coin available, so a strategy that suggests we compare the two other coins (one in L and one in H) with two reference coins, while it would work in principle, would be impossible to implement.
First, we note that the 45° band itself can’t be empty, because the number of integer values it allows for a – b is:
l – g2 – (g1 – h) + 1 = l + h – (g1 + g2) + 1 = g3 + 1
The range for a – b encompassed by the top left and bottom right corners of the rectangle is:
–min(g2, h) ≤ a – b ≤ min(g1, l)
For the intersection to be empty, one of these ranges would have to lie entirely outside the other. If the rectangle’s range is to the left of the 45° band:
min(g1, l) < g1 – h
This is clearly impossible unless it is l that is the minimum in min(g1, l), so it would require:
l < g1 – h
which in turns leads to the impossible requirement:
g1 > l + h
Similarly, it turns out to be impossible for the rectangle’s range to lie to the right of the 45° band. So, we are free to choose g1 and g2 without any risk of making the solution set for (a, b) empty.
That leaves us with the problem of picking definite values for g1 and g2 and (a, b) in order to minimise:
ρ = |2a + 2b – (g1 + g2)|
subject to the constraints of the rectangle and the 45° band. The sum 2a + 2b takes constant values on each line that cuts across the plane at –45°, so to minimise ρ, we want the point (a, b) to lie as close as possible to the particular line where 2a + 2b = g1 + g2.
Because g1 and g2 will either be equal or will differ by 1, the only way we can end up using no reference coins at all is if they are equal. So, let’s tentatively assume that we make the choice g1 = g2, and see if we hit any obstacles.
What particular point could we choose for (a, b)? The point P1 = (g1, 0) will lie on the line that gives us ρ = 0, but it might not meet the constraints we need. We need:
g1 – h ≤ (a – b = g1) ≤ l – g1
Clearly it will satisfy the first inequality, but it might or might not satisfy the second. If 2 g1 > l, consider the alternative point:
P2 = (floor(l/2), g1 – floor(l/2))
(a – b = 2 floor(l/2) – g1) ≤ l – g1
Does P2 meet the first inequality? To violate it would require:
g1 – h > 2 floor(l/2) – g1
2 g1 > 2 floor(l/2) + h
If l is even, the RHS here is l + h, and it is impossible for the sum of two equal-sized groups, out of the three into which l + h is divided, to have a larger size. If l is odd, the inequality becomes:
2 g1 > l + h – 1
This is only possible if g1 = g2 = 1 and g3 = 0, with l + h = 2. Since we also require 2 g1 > l, the only way this can arise is with l = h = 1. We have already noted that this might be a case that needs special treatment, so we will set it aside for now and check whether the other constraints are satisfied for all other cases.
If 2 g1 ≤ l, P1 = (g1, 0) lies at the bottom right-hand corner of the rectangle, because min(g1, l) = g1. So it meets all the constraints.
If 2 g1 > l, we have P2 = (floor(l/2), g1 – floor(l/2)). Both coordinates are clearly non-negative, and a = floor(l/2) is less than or equal to both g1 and l. We also need:
b = g1 – floor(l/2) ≤ min(g1, h)
Now, we can rewrite h to obtain:
h = 2 g1 + g3 – l > g3
g3 ≥ g1 – 1
It follows that h ≥ g1, so the RHS of our constraint on b is min(g1, h) = g1, and it becomes manifestly true.
We now have a reasonably simple algorithm for deciding what comparison to perform when we are dealing with the sets L and H:
if l = 1 and h = 1 thenSet g1 = 1, g2 = 0, g3 = 1else
Set a = 1, b = 0, c = 0, d = 0 and ρ = 1
[We compare the single coin in L with a single reference coin.]Set g1 = g2 = round((l+h)/3)
if 2 g1 ≤ l thenSet a = g1, b = 0, c = g1, d = 0 and ρ = 0else
[We compare g1 coins from L with g1 coins from H.]Set a = floor(l/2), b = g1 – floor(l/2), c = floor(l/2), d = g1 – floor(l/2) and ρ = 0
[We compare floor(l/2) coins from L and g1 – floor(l/2) coins from H with a second set containing the same numbers of coins from L and H.]
In the initial comparison, we can split up the coins in exactly the way we did for Variation 1. But we also need to deal with the general case where we still have the set F rather than L and H. If n ≤ (3c–1)/2 [where c is the smallest integer for which this is true], we need to ensure that after the kth comparison [starting with k=1 for the initial comparison], we do not allow l + h to be greater than 3c–k. So we compare this many coins in F with reference coins, or all of F if it contains fewer coins.
Variation 4 can be solved using essentially the same algorithm, but the number of coins it can deal with for a given number of comparisons will be slightly less. The only way the status of the fake coin as lighter or heavier can remain unresolved in Variation 3 is if the set F persists to the end of the process, finally reduced to a size of 1. If we always take 3c–k coins from F to compare, for k = 2, ..., c, this adds up to (3c–1–1)/2, which is one less than the number of coins we set aside in the first comparison when n is exactly the upper limit, (3c–1)/2. If n was one less, we would set aside one less coin, and we would never have any coins in F by the end. So for Variation 4, the maximum number of coins we can deal with using c comparisons becomes (3c–1)/2 – 1 = (3c–3)/2.
You can see how the algorithm for Variation 3 works to locate the coin by selecting the number of coins, the location of the fake coin, and the weight of the fake coin, then hitting the LOCATE FAKE button.
|Coins that might include a fake of EITHER weight||Coins that might include a LIGHT fake||Coins that might include a HEAVY fake||Coins that are known to be GENUINE|