Colin Foster's Mathematics Education Blog

23 June 2022

Lines of not-very-good fit

Does anyone teach lines of best fit 'properly' in lower secondary school? I think whenever I’ve seen this concept taught, or taught it myself, it’s always been a bit wrong.

Typically, students are given a scatterplot, or they draw one themselves, and are asked to draw a straight line on top of it, by eye, but the instructions for how they are supposed to draw this line can be a bit vague. Maybe the teacher says something like, “The 'line of best fit' goes roughly through the middle of all the scatter points on a graph” (BBC Bitesize). I guess this is kind of right, but I think that any student hearing this is bound to misinterpret what this is supposed to mean.

Suppose you give students the $x$-$y$ scatterplot below (Note 1), and ask them to draw the best straight line they can that takes account of all these points. 

Of course, they could draw something like this, which “goes roughly through the middle of all the scatter points” (10 points on either side).

But, unless they are trying to be awkward, they will probably be much more likely to draw something like this.

It 'goes through the middle' and is the kind of thing that the teacher is wanting (Note 2).

But, if you then display an accurate trend line, say using Excel (in black below), then it will be a bit off from what the students have drawn.

Here they are together, so you can see the difference:

It is easy to put this discrepancy down to human error. The computer draws the best possible line, and the line we draw by eye is bound to be not quite right. Students might over-attend to a few prominent outliers, rather than really base their line on where the overall mass of the points is located. So there is nothing to worry about.

But there is more than random error going on here. I claim that the students are not even trying to draw the line that the computer is drawing. For example, if we switch the variables around (interchange the axes), presumably this would/should make no difference at all to the line that the students are trying to draw, relative to the positions of the points – it should just be a reflection of their line in the diagonal $y=x$. But the computer will give you a completely different regression line, because the regression line of $y$ on $x$ is in general quite different from the regression line of $x$ on $y$ - and sometimes dramatically so. The regression line of $x$ on $y$ is shown in blue below, on top of the black regression line of $y$ on $x$.

The black line minimises the sum of the squares of the vertical distances of the points from the line, whereas the blue line minimises the sum of the squares of the horizontal distances of the points from the line. We should not expect the resulting lines to be the same. The black line gives the best linear prediction of the $y$ value, given the $x$ value; the blue line gives the best linear prediction of the $x$ value, given the $y$ value. The two lines answer two different questions.

And neither of these questions is likely to be what the students are thinking about. The line the students are likely to be aiming for is the principal axis of the data. If we draw an ellipse around our data points, what the students are presumably trying to do is essentially find the major axis of this ellipse.

If we compare the principal axis (in red below) with the correct regression line of $y$ on $x$ (in black), we can see that they are not the same.

If you consider thin, vertical slices of the ellipse, the black line approximately bisects these, and is close to the mean $y$-value of the points that are near to that value of $x$. Relative to this, the red line underestimates $y$ for low values of $x$ and overestimates it for high values of $x$.

In school mathematics, lines of best fit are used to predict one variable from the other, so it's really regression lines that we need, not principal axes. (And, indeed, really we should use a different line to predict $y$ from $x$ [part (b) of the typical exam question, in which part (a) is to draw the line of best fit] from the line we use to predict $x$ from $y$ [part (c) of the typical exam question].) Even when the regression line and the principal axis happen to be close to each other, conceptually they are quite different. The principal axis minimises the sum of the squares of the (perpendicular) distances from the points to the line, whereas the ordinary-least-squares regression line minimises the sum of the squared vertical distances from the points to the line. It can be interesting to devise scenarios where these two lines are very similar and others where they are very different.
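To make the three lines concrete, here is a minimal Python sketch that works out the slope of each one (as a gradient on the usual $x$-$y$ axes) from the sums of squares and products about the means. The four data points are made up purely for illustration and are not the data behind the plots in this post; the function name `three_slopes` is hypothetical, and the sketch assumes that the covariance is non-zero and that the $x$ values are not all equal.

```python
import math

def three_slopes(xs, ys):
    """Return the gradients of the y-on-x line, the x-on-y line and the principal axis.

    Assumes the covariance is non-zero and the x values are not all equal.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    # Minimises the squared vertical distances.
    y_on_x = sxy / sxx
    # Minimises the squared horizontal distances (gradient on the same x-y axes).
    x_on_y = syy / sxy
    # Minimises the squared perpendicular distances: the direction of the
    # leading eigenvector of the 2 x 2 covariance matrix, in closed form.
    principal = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    return y_on_x, x_on_y, principal

# Made-up illustrative data (an assumption, not the data set used in this post).
xs = [0, 1, 2, 3]
ys = [0, 2, 1, 3]
y_on_x, x_on_y, principal = three_slopes(xs, ys)
print(y_on_x, x_on_y, principal)
```

For these four points the gradients come out as 0.8, 1.25 and 1 respectively. All three lines pass through the mean point, and with a positive correlation the principal axis always lies between the two regression lines, which is why it looks steeper than the $y$-on-$x$ line.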

From a school teaching point of view, does this matter, or is it unnecessary quibbling? I have found that this discussion comes up sometimes when students complain that the line of best fit that the computer is producing ‘looks wrong’, especially when there are lots of points, and the correlation is fairly strong. They think they can draw a better one, and are puzzled why the computer is clearly giving them 'wrong' lines. The problem here is that the students have been misled about which line they should be aiming for, and Gelman and Nolan (2017, Chapter 4) have a nice approach to addressing this.

Maybe it is a relatively small point to worry about, but surely it would be a bit of a problem if students drawing something closer to the black line above were being penalised or criticised over those drawing something more like the red line.

Questions to reflect on

1. How concerned are you about this distinction between regression lines and principal axes?

2. What, realistically, might be done to address this in school-level mathematics?

3. Are there other examples in school mathematics where it is usual to teach things 'a bit wrong'?


1. The data used in this blogpost is available at:

2. Students sometimes have a strong tendency to avoid going directly through any of the points. They have been told that they are not meant to 'join up' the points, and, as if to prove this, they try to keep away from any actual points altogether. Similarly, they may feel that it would be wrong to allow the line to pass directly through the origin, so they act as though the origin must be avoided at all costs.


Gelman, A., & Nolan, D. (2017). Teaching statistics: A bag of tricks. Oxford University Press.

09 June 2022

Motivation for measurement

Optical illusions are almost universally intriguing. Young children can completely get them, but they can fascinate adults too. There is something captivating about being tricked by your eyes. And I think they can provide a great opportunity for motivating some geometry.

Topics in mathematics that involve accurate measurement can sometimes feel a bit unmathematical - more science than mathematics. For example, why is 'scale drawing' a topic in mathematics? Is this just a hangover from the days when 'technical drawing' was a marketable skill that was taught in schools? Converting scales is a useful application of ratio, but what is the mathematical purpose of making accurate drawings? Loci and construction are important topics for understanding concepts in geometry, and the central idea that compass constructions are 'exact in principle' seems to me to be important. But should it matter whether students can execute a perfect circle with their compasses or draw an angle of $35^\circ \pm 1^\circ$ using a protractor? Arguably, making neat constructions may depend more on the quality of the student's instruments (such as how well-tightened the screw on their compasses is) and on basic dexterity than on any mathematically-relevant skill. The beauty of mathematics is the ability to carry out exact calculations that mean that a correct mathematical sketch not drawn to scale is generally as useful as, or more useful than, an accurate scale drawing. For example, in an astronomical scenario (e.g., calculating the distance to the sun) we can sketch a 1 by 20,000 right-angled triangle, and this is much more helpful than trying to draw this to scale! In mathematics, we develop ways to calculate so that we don't need to make accurate drawings, so perhaps the main purpose of scale drawing is to show students how slow and tiresome things would be without mathematics (a Dan Meyer 'headache-aspirin' situation, see Meyer, 2015)?

Nevertheless, there are times when we need students to measure lengths and angles, and it is great when we can find purposeful ways to practise these skills (see Andrews, 2002). I think finding scenarios where there is a real (i.e., uncontrived) need to measure can be quite difficult, but optical illusions can be really helpful for this - and are fun in their own right.

There is a good list of many kinds of optical illusions online, and this includes things that are extremely weird, such as the Ames room. These might be fun to look at and talk about. However, I focus below on some examples of optical illusions that could have obvious, immediate relevance in motivating some primary or secondary school geometry and measurement.

The Ebbinghaus illusion is a nice one. The orange discs below are equal in size, but don't look it.

This is just crying out for some measurement with a ruler. Can the diameters really be equal? But the centres of the circles are not marked, so how could you be sure you were accurately measuring the diameter, and not some other chord?

The Delboeuf illusion is similar. The black discs are in fact equal, but the right-hand one looks larger:

The Moon illusion is a nice variation on this.

Without being prompted to do so, when presented with these illusions, students reach for their rulers. And so, of course, if you want every student to do the measuring, then you need to provide the images on paper, as displaying them on the screen allows only one student to do it on behalf of everyone else.

Students could also attempt to create their own drawings, some of which are illusory (something looks bigger but isn't) and some not (something looks bigger and is), and see if other students can decide which are which by eye - followed by measuring to check.

The Müller-Lyer illusion provides motivation for measuring the lengths of line segments. The two horizontal portions below are equal in length, but don't look it!

Similar opportunities are provided by the Ponzo illusion,

Tony Philips, National Aeronautics and Space Adm., Public domain, via Wikimedia Commons

the Sander illusion, where the two purple diagonals below are actually equal in length,
and the vertical-horizontal illusion, where the vertical line segment looks longer, but isn't.

Asking students to attempt to draw a square by eye, using a straight edge (i.e., not a scaled ruler), can be revealing. Then the students measure everyone's and do some statistics to see whether, among the various drawings produced, 'squares' that are tall/narrow are more prevalent than ones which are short/wide. (The easiest way of keeping track of the orientation of each piece of paper is to have the students write their name at the top.) There are similar opportunities for statistical analysis in devising a way to decide how to judge the quality of people's freehand circles (see Foster, 2015, and Bryant & Sangwin, 2011).

The café wall illusion is a bit more sophisticated, and this can be a good opportunity to encourage students to use precise language. What exactly do they mean by 'wavy', 'wonky' or 'not straight'? Do they mean sloping straight lines or curves? "Say what you see" can be a really useful prompt to use with these figures, and you can follow up with requests for greater clarity.

Fibonacci, CC BY-SA 3.0, via Wikimedia Commons

The Zöllner illusion provides another opportunity for students to check parallelness,

Fibonacci, CC BY-SA 3.0, via Wikimedia Commons

and the Hering illusion is another example:

Fibonacci, CC BY-SA 3.0, via Wikimedia Commons

For working with polygons, the Ehrenstein illusion is useful. What do we need to check to see if the shape really is a square? Is it enough just to measure the lengths of the four sides? Is it enough just to check that the angles are all right angles (and how many do we need to measure to do this?)? (The Orbison illusion provides similar opportunities.)

Often, students address measurement objectives by spending lesson time measuring arbitrary line lengths or angles on a sheet, merely to improve their skill at measurement. Optical illusions can provide a rich context for doing similar work, where there is a motivation to discover whether, say, two lengths really are the same or not. I find that students will measure much more accurately, and with considerably more enthusiasm, when it has some purpose behind it, and I would call tasks like these mathematical etudes for measurement.

Questions to reflect on

1. Do you find these optical illusions engaging? Would your students?

2. How could you use these ideas to promote a need for measurement with one of your classes?

3. What other tasks make measurement a meaningful mathematical activity?


Andrews, P. (2002). Angle measurement: An opportunity for equity. Mathematics in School, 31(5), 16–18.

Bryant, J., & Sangwin, C. (2011). How round is your circle? Princeton University Press.

Foster, C. (2015). Exploiting unexpected situations in the mathematics classroom. International Journal of Science and Mathematics Education, 13(5), 1065–1088.

Meyer, D. (2015, June 17). If math is the aspirin, then how do you create the headache? [Blog post].

26 May 2022

Are two cars better than one?

So many ideas in probability are really quite unintuitive. How can we help learners make better sense of how simple probabilities combine, without relying on arbitrary rules and mysterious formulae?

I recently heard someone talking about whether they should get a second car for their household (Note 1). They were doubtful that it was a good idea. In addition to the cost and environmental impact, they said, “If you have two cars, there’s twice as much chance that something will go wrong with one of them.” This seemed self-evidently true, and everybody nodded sadly, and the conversation moved on.

I began thinking about how I might respond mathematically. I knew what they meant, and their statement might indeed be approximately true, but it couldn’t be exactly true. Even if all that you know about probabilities is that they are capped at 1 (i.e., 100% is the highest possible probability), it is clear that you cannot just go around doubling probabilities. Doubling any probability greater than 0.5 will give you a total probability greater than 1, which is impossible. And, even if you have super-reliable cars with a very small probability $p$ of failing, you would only need to ensure that you buy more than $\frac{1}{p}$ of them for the total probability to exceed 1.

So, why is such a plausible-sounding statement not right? And under what assumptions might the statement be approximately true?

Suppose that both cars have the same probability $p$ of ‘something going wrong with them’ in a certain time interval. Would these be independent events? If both cars are parked outside the same house, then they are likely to be subject to similar weather conditions and other factors, so it seems unlikely that failure of one would be completely unrelated to failure of the other. But let’s ignore this and suppose that the two events are independent, and also that the two cars are equally likely to fail. This would mean that the probability of both cars failing would be $p^2$. And this means that when we double $p$ we are overcounting by $p^2$ (see Figure 1), because we are counting the same situation of ‘something goes wrong’ when car A fails and counting it again when car B fails, for those occasions when they both fail. In the extreme case, where you had two completely useless cars, you would have a 100% chance of not being able to drive anywhere, but not a 200% chance! Now, if $p$ is very small, then $p^2$ will be very very small, and so we can perhaps ignore the overlap region. But, if $p$ is not very small, then $p^2$ will be non-negligible. This means that the correct probability for either (or both) of the cars failing is $2p-p^2$.

Figure 1. $p(A∪B)$

Looking at the expression $2p-p^2$, we might wonder whether we can be absolutely sure that it is always less than 1 for all values of $p$. Is the $p^2$ definitely always sufficiently large to bring back $2p$ to below 1 whenever $p>0.5$? One way to see this is by completing the square, to obtain $1-(1-p)^2$, meaning that $2p-p^2$ is equal to '$1-$something that is never negative'. Figure 2 shows the graph $y=2p-p^2=p(2-p)$, and the curve has its maximum value of 1 at $p=1$, and so it never exceeds 1 for any value of $p$ (Note 2).

Figure 2. $y=2p$ and $y=2p-p^2$

We can also see in Figure 2 that the line $y=2p$ is indeed a good approximation to the curve for small values of $p$. So, just doubling the probability is a reasonable approximation if you are considering very reliable cars.
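The comparison in Figure 2 can also be tabulated numerically. The short Python sketch below assumes, as above, two independent cars that each fail with the same probability $p$; the function name `either_fails` and the particular values of $p$ are chosen just for illustration.

```python
def either_fails(p):
    """P(at least one of two independent cars fails), each failing with probability p."""
    return 2 * p - p ** 2

# Compare simple doubling with the exact value for a range of p.
for p in (0.01, 0.1, 0.5, 0.9, 1.0):
    doubled = 2 * p
    exact = either_fails(p)
    print(f"p={p:<5} doubled={doubled:.4f} exact={exact:.4f} overcount={doubled - exact:.4f}")
```

The overcount is always exactly $p^2$: negligible for $p=0.01$, where doubling gives 0.02 against an exact value of 0.0199, but large for $p=0.9$, where doubling gives an impossible 1.8.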

Using the algebra, this reasoning is quite straightforward for anyone comfortable with quadratics and elementary probability. In probability, we can only add the probability of events A and B if they are mutually exclusive (i.e., $p(A∩B)=0$), so that the ‘Venn diagram identity’ $p(A∪B)\equiv p(A)+p(B)-p(A∩B)$ reduces to $p(A∪B)=p(A)+p(B)$. In other cases, we have to subtract the intersection, so as not to double count it.

I was happy with all of this, but I wanted to say something that didn't sound technical or rely on set theory or even Venn diagrams. Could I say in words why doubling was not quite right? I found it hard to come up with a good way to explain to my friend why the possibility of both cars going wrong was relevant to their statement, and why this indeed made their statement technically wrong, even if you were willing to make assumptions about things like independence and so on. If I had said that having two cars that might go wrong is not quite twice as bad as having one car that might go wrong, because, at least some times, both cars will simultaneously go wrong, I think they would be quite surprised! The commonsense response is that there is no consolation in having both cars fail on the same day - that is the worst possible nightmare, and indeed one of the reasons for contemplating having a second car was to try to be sure that they would always have one working car! They would probably respond that “When I said 'either-or', I was including 'both'!”, which misses the point. Yes, we want to include the chance of both cars failing - the point is that we want to include that possibility only once, not twice (Figure 3)! It is still true that $p^2<p$, and possibly dramatically so ($p^2 \ll p$), so the chance of having at least one working car has indeed risen from $1-p$ to $1-p^2$. The point is that if last week Car A failed, say, on Monday, Thursday and Saturday, whereas Car B failed on Tuesday, Saturday and Sunday, our daily frequency of car trouble would have been $\frac{5}{7}$ and not $\frac{6}{7}$, because we don't double count Saturday just because it was a double-failure day.

Figure 3. $(A∪B)-(A∩B)$
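The week of failures described above can be checked directly with sets, which makes the 'count Saturday only once' point quite visible. This is just an illustrative Python sketch of that one example; the variable names are arbitrary.

```python
from fractions import Fraction

car_a = {"Mon", "Thu", "Sat"}   # days on which Car A failed
car_b = {"Tue", "Sat", "Sun"}   # days on which Car B failed

trouble_days = car_a | car_b            # union: at least one car failed
double_failure_days = car_a & car_b     # intersection: both cars failed

naive_count = len(car_a) + len(car_b)   # counts Saturday twice
trouble_rate = Fraction(len(trouble_days), 7)
days_with_a_working_car = 7 - len(double_failure_days)

print(naive_count, len(trouble_days), trouble_rate, days_with_a_working_car)
```

Adding the two lists gives 6, but the union contains only 5 days, because Saturday is in both sets. There was still a working car on 6 days out of 7.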

Perhaps this is part of a broader theme in mathematics of situations in which things can’t be simplistically added up. Other examples could include vectors that are not in the same direction, numerators of fractions that have different denominators, and dimensionally-incompatible quantities, such as distance and time (Foster, 2019). However, there is something particular about probability for me. I enjoy probability very much, but I think it's the area of mathematics in which I'm most likely to struggle to truly make sense or to be able to explain concepts clearly to others without using technical language and symbols (see Foster, 2021). Rarely when I'm doing a probability calculation do I have a rough ballpark estimate of the answer I should be getting, and, if I obtain an answer like $\frac{127}{351}$, it would be scarcely worth my while converting it to a decimal to see if it looked a 'reasonable' size, as I would have no idea how to tell reasonable from unreasonable. Do other people share this sense?

Questions to reflect on

1. Do you have a better way of explaining why doubling is invalid here?

2. Do you have other examples of 'similar' situations to this?

3. Do you share my sense that probability is often 'harder to explain' than other areas of mathematics?


1. With apologies for the highly 'middle-class' nature of this 'first-world problem'!

2. The expression $1-(1-p)^2$ can also be obtained intuitively by saying that the required probability is the complement of the probability that both cars are working properly. Since the probability that either car is working properly is $1-p$, the probability that both (assumed independent) work properly is $(1-p)^2$, and so the probability that this is not the case must be $1-(1-p)^2$.


Foster, C. (2019). Questions pupils ask: Why can’t it be distance plus time? Mathematics in School, 48(1), 15–17.

Foster, C. (2021). In a spin. Teach Secondary, 10(1), 11.

12 May 2022

Learning times tables efficiently

Times tables can be a controversial subject. Can we help students to learn their tables in ways that promote conceptual understanding? This is my take on teaching times tables. I imagine there will be some strong opinions...

For many children, learning the times tables feels like a huge mountain to climb. And for those who have tried and feel that they have failed, going back and trying again fills them with dread. Perhaps all seems to go well in the beginning, with the 2s, 5s and 10s, say, but before long we reach the 6s, 7s and 8s, and it feels like every new fact that is mastered displaces an old fact that then becomes lost. As more and more facts are covered, the potential for muddling them up increases (e.g., $7 \times 8$: Is it $54$, or maybe $48$?), until the student really doesn’t have much idea which things they know and which they don’t. In the worst-case scenario, the only thing the child really trusts is skip-counting up from zero every time. And with skip-counting you only have to make one mistake for all of your remaining numbers to be wrong.

Teachers are highly strategic in the order in which they teach the tables: often 2s, 5s, 10s, 4s, … etc. But the effect of this is that the ‘hard stuff’ (6s, 7s, etc.) is delayed, so that when it arrives it can feel overwhelming and as though it is coming at learners far too quickly. I am not sure that learning one table after another like this – however carefully planned the sequence – is ideal (Note 1).

Here is a different way, that tries to build up from the multiplicative connections between the facts and deliberately avoids any addition/subtraction/skip-counting approaches, so as to build on the multiplicative structure of the tables and work more in harmony with that.

At first sight, there are 144 facts to learn: 

But, of course, this is highly deceptive, and it is nowhere near as bad as this. Because of the commutativity of multiplication ($a \times b \equiv b \times a$, see Foster, 2022), we can immediately delete nearly half of these facts: 

Everyone knows their 1-times table (which is almost as easy as the 0-times table, Note 2), so we can grey those out. And I will assume that the 2s (the even numbers), 5s, 10s and 11s (at least up to $9 \times 11$) are also known, or easily learned, and so I’ve marked those in green below:

So, from the original 144, this now leaves just 30 which need some teaching. And these are the tougher ones. Because of how the picture looks at this point, the best way to tackle this, I think, is not to go table by table (Note 3), but to exploit the structure a bit more strategically. In particular, we want to begin with the highest-leverage multiplication facts – the ones that help most with getting others. When students arrive, say, at secondary school and clearly ‘do not know their tables’, it is basically these 30 that are the problem. Convincing them that their difficulty is not a functionally infinite number of unknown facts but a relatively small number can be helpful. (It really is not like having to memorise a telephone directory!) And starting with the ones most likely to be of immediate help seems to make sense.
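The counting from 144 down to 30 can be reproduced with a few lines of Python. This is only a sketch of the bookkeeping described above: the function name `assumed_known` is hypothetical, and it encodes the stated assumption that the 1s, 2s, 5s and 10s, and the 11s up to $9 \times 11$, are already known.

```python
N = 12

all_facts = [(a, b) for a in range(1, N + 1) for b in range(1, N + 1)]
# Commutativity: keep only one of a x b and b x a.
unordered = [(a, b) for (a, b) in all_facts if a <= b]

def assumed_known(a, b):
    """Facts treated as already known: 1s, 2s, 5s, 10s, and 11s up to 9 x 11."""
    if any(k in (a, b) for k in (1, 2, 5, 10)):
        return True
    return 11 in (a, b) and min(a, b) <= 9

tricky = [fact for fact in unordered if not assumed_known(*fact)]

print(len(all_facts), len(unordered), len(tricky))
print(tricky)
```

The three counts are 144, 78 and 30, and the 30 remaining facts are every pairing of 3, 4, 6, 7, 8, 9 and 12 with each other, together with $11 \times 11$ and $11 \times 12$.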

Big claim: The most useful of these remaining products to know are the eight squares in red below:

In desperate circumstances, where students have repeatedly tried without success to master tables, I have been known to (reluctantly) settle for just knowing the squares. The beauty of the squares is that they march diagonally through the table, and so they really take you deep in amongst all the difficult facts. If you know the squares, the difficult products you don’t know are often only a step away.

For example, if you know that $8 \times 8 = 64$, then $7 \times 8$ must be $64-8 = 56$.
Or, if you know that $7 \times 7 = 49$, then $7 \times 8$ must be $49+7 = 56$.

So, the squares are really high-leverage facts to know, and I wouldn’t do anything else on the multiplication facts until the student knows these 8 squares. However, I am not really advocating pushing things like $7 \times 8 = 8 \times 8-8$, because students find this reasoning hard (Do I subtract 8 or 7?), and it breaks with the multiplicative theme.

So, instead, I would build differently from the squares:

$6 \times 3$ is half of $6 \times 6$ or double $3 \times 3$
$4 \times 8$ is half of $8 \times 8$ or double $4 \times 4$
$6 \times 12$ is half of $12 \times 12$ or double $6 \times 6$

This is really powerful. Mental doubling and halving may need some work, but that is very important anyway, so I am happy to be dependent on that (see Francome, 2020).

So, now we have three more facts, in orange below:

Knowing that $6 \times 6 = 36$ is the single most powerful fact in the entire tables square, so long as you are able to mentally break down the 6s into 2s and 3s. Students who haven’t had much practice doing this ‘prime decomposition’ find it initially difficult, but this is at the heart of how multiplication works, so is an important awareness, and, with practice, it allows students to see why all the 36s in the table are equal (there are no 'coincidences' in the multiplication table):

$4 \times 9 = (2 \times 2) \times (3 \times 3) = (2 \times 3) \times (2 \times 3) = 6 \times 6 = 36$
So, $8 \times 9 = 2 \times 4 \times 9 = 2 \times 36 = 72$ and
$3 \times 12 = 6 \times 6$ (double the 3, halve the 12) $= 36$
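As a small check on the claim that the 36s are not a coincidence, the Python sketch below finds every product in the 12 by 12 table that equals 36 and breaks each factor into primes. The helper name `prime_factors` is just for illustration.

```python
def prime_factors(n):
    """Return the prime factors of n in increasing order, with repeats."""
    factors = []
    d = 2
    while n > 1:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    return factors

# Every pair a x b = 36 in the table, with a <= b.
pairs_36 = [(a, b) for a in range(1, 13) for b in range(a, 13) if a * b == 36]

for a, b in pairs_36:
    combined = sorted(prime_factors(a) + prime_factors(b))
    print(f"{a} x {b}: {prime_factors(a)} and {prime_factors(b)} -> {combined}")
```

The three products found are $3 \times 12$, $4 \times 9$ and $6 \times 6$, and in each case the combined prime factors are two 2s and two 3s, just grouped differently.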

These are in gold below:

Next, I would do 12, 24, 48 and 96. If you learn that $3 \times 4 = 12$ (which most students will know), then $3 \times 8$ (double the 4), $6 \times 4$ (double the 3), $12 \times 4$ (double the 3 twice), $6 \times 8$ (double the 3 and double the 4) and $12 \times 8$ (double three times) all come along without too much trouble if students are fluent doublers - and only the last one of these involves any 'carrying' when doubling.

This means that when students are stuck on $6 \times 8$, the prompt would not be to count up in 6s or work from the nearest multiple of 6 or 8 they can think of (e.g., $6 \times 10$). It would be: Do you know $3 \times 4$? (Both numbers are doubled, so the answer must be 12 double-doubled, which can be done easily mentally, without any 'carrying'.)
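The doubling route from $3 \times 4$ can be written out as a tiny Python sketch, which may help in seeing how many doublings each fact needs. The function name `from_three_times_four` is hypothetical, and it only handles the facts discussed here (first factor 3, 6 or 12; second factor 4 or 8).

```python
def from_three_times_four(a, b):
    """Work out a x b by repeatedly doubling 3 x 4 = 12.

    Only intended for a in {3, 6, 12} and b in {4, 8}.
    Returns the answer and the number of doublings used.
    """
    x, y, answer, doublings = 3, 4, 12, 0
    while x < a:          # double the 3 until it becomes a
        x *= 2
        answer *= 2
        doublings += 1
    while y < b:          # double the 4 until it becomes b
        y *= 2
        answer *= 2
        doublings += 1
    return answer, doublings

for a in (3, 6, 12):
    for b in (4, 8):
        answer, doublings = from_three_times_four(a, b)
        print(f"{a} x {b} = {answer} ({doublings} doubling(s) of 12)")
```

So $6 \times 8$ is 12 doubled twice (48), and $12 \times 8$ is 12 doubled three times (96).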

These six are in blue below, so, by this point, we have dealt with 20 of the tricky ones and there are just 10 left.

The remaining ones are all ‘hard’, and we need to take time and care over these. I think I would spend 50% of my total energies on these 10.

There is 21, 42, 84 and 63 (in purple below), which all come from $3 \times 7$, which therefore needs to be learned. Then, given $3 \times 7$, we can do $6 \times 7$ (double), $12 \times 7$ (double twice) and $9 \times 7$ (triple). (None of these scalings is hard to do quickly mentally, as none involves any 'carrying'.)

Then there is 28 and 56 (in yellow below), where $8 \times 7$ is just double $4 \times 7$ (which is just double $2 \times 7$).

And then we have 27, 54 and 108 (in pink below), which come from $3 \times 9$, which needs to be learned (perhaps as $3^3$). We have $6 \times 9$ (double) and $12 \times 9$ (double twice).

Which just leaves 132 to remember (or know as $11^2 + 11$, which is possibly easier than double $6 \times 11$). I think this is probably the least connected of all of the multiplication facts, and so perhaps the hardest to remember.

So, in conclusion, this means that the only ones that potentially 'need' memorising are these 12:

$3^2, 4^2, 6^2, 7^2, 8^2, 9^2, 11^2, 12^2, 3 \times 4, 3 \times 7, 3 \times 9$ and $11 \times 12.$
And, if you have the 3-times table, then that reduces this list to just the other 7 squares and $11 \times 12$, which really feels manageable. It does show the power you get from knowing the squares (Note 4).
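One way to check that these 12 facts really are enough is to let a program 'derive' new facts from known ones and see whether all 30 tricky facts are reached. The Python sketch below is one possible formalisation, not the exact route described above: it allows doubling, halving or tripling either factor (staying within the 12 by 12 table), and it assumes the 2-times table is already known. In particular, it reaches $4 \times 9$ by tripling the 3 in $3 \times 4$, rather than by regrouping the 2s and 3s in $6 \times 6$. All of the names are hypothetical.

```python
# The 12 base facts listed above, stored as pairs (a, b) with a <= b.
BASE = {(3, 3), (4, 4), (6, 6), (7, 7), (8, 8), (9, 9), (11, 11), (12, 12),
        (3, 4), (3, 7), (3, 9), (11, 12)}
# Assumption: the 2-times table is already known (used for 4 x 7 = double 2 x 7).
TWOS = {(2, n) for n in range(2, 13)}

# The 30 tricky facts: all pairings of these seven numbers, plus 11 x 11 and 11 x 12.
HARD = [3, 4, 6, 7, 8, 9, 12]
TRICKY = {(a, b) for a in HARD for b in HARD if a <= b} | {(11, 11), (11, 12)}

def ordered(a, b):
    """Store each fact with the smaller factor first."""
    return (min(a, b), max(a, b))

def one_step(a, b):
    """Facts reachable from a x b by doubling, halving or tripling one factor."""
    steps = set()
    for x, y in ((a, b), (b, a)):
        if 2 * x <= 12:
            steps.add(ordered(2 * x, y))
        if 3 * x <= 12:
            steps.add(ordered(3 * x, y))
        if x % 2 == 0:
            steps.add(ordered(x // 2, y))
    return steps

def reachable_from(start):
    """All facts obtainable from the starting facts by repeated single steps."""
    known = set(start)
    frontier = list(start)
    while frontier:
        for fact in one_step(*frontier.pop()):
            if fact not in known:
                known.add(fact)
                frontier.append(fact)
    return known

missing = TRICKY - reachable_from(BASE | TWOS)
print(len(TRICKY), sorted(missing))
```

No tricky fact is left unreached. Because the rules here are quite generous, this shows only that the 12 base facts are sufficient, not that every one of them is strictly necessary.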

I think the key to supporting all of this is in the kind of prompts that you provide when a student is stuck. Rather than asking them to figure it out from ‘anything relevant that you know’, or waiting patiently while they skip-count up from zero, with this approach you have a clear plan for how they might be getting from known things to unknown. With practice, figuring out something like $7 \times 8$ by saying ‘double $7 \times 4$, which is 28, so that's 56’ can be extremely quick, and the more you do this the more you are incidentally practising doubling. (And this is one of the trickiest ones, because doubling 28 involves a mental 'carry'.) Of course, nothing will be as fast as ‘just knowing’, but, where that has repeatedly failed for a student, then this kind of approach may help. And I would teach it to everyone for the sake of understanding the multiplicative connections (Note 5). I certainly prefer to spend energy on this than on those one-off mnemonics, like ‘5-6-7-8’ for $56 = 7 \times 8$, which are flukes that don't generalise.

Here is my attempt at a (rather messy) summary of where everything comes from (pdf version):

In conclusion, I am not suggesting that any of this is easy, especially with students who have experienced repeated failure with tables or have developed ‘tables anxiety’. There is no quick, easy fix. And I’m not saying that I think this approach is definitely the best (e.g., I don’t make anything much of the 9s, which can be fairly easy to learn). But, I think that if you work through in this order you at least get the highest-leverage facts (e.g., the squares) before the lower-leverage ones (e.g., $11 \times 12$). However, if you have a better order - or entire approach - please put it in the comments below!

Questions to reflect on

1. What are your best strategies for teaching the multiplication tables? Do you work differently with older learners who have previously been unsuccessful learning their multiplication tables? 

2. What are the pros and cons of the different approaches you have tried? 

3. What do you think of the scheme I have outlined above? Please respond in the comments if you can improve on it.


1. For some great tables tasks that focus on conceptual understanding, see Faux (2018). See also the Position Statement on 'The Teaching and Learning of Multiplication Bonds' from the Joint ATM/MA Primary Group:

2. Of course, except for the $1 \times n = 1 + n$ error, which seems to be particularly common with $1 \times 1 = 2$.

3. In the time of the Numeracy Strategies in the UK, everyone seemed to be chanting up and down in multiples on 'counting sticks', but I worry that that doesn't always help learners to remember which numbers belong in which tables. Once you move on to a new table, you trample all over numbers that have been learned in previous tables, with different but similar numbers appearing, and this interference makes it highly muddling for many students. It also feels 'additive' rather than 'multiplicative'.

4. Of course, some of the squares in this list could be derived from others in the list (e.g., $6^2$ is double-double $3^2$), but I tend to think that they are all important enough to know in their own right. But, if you disagree, then you could further reduce the list of base facts to just these nine: $3^2, 4^2, 7^2, 9^2, 11^2, 3 \times 4, 3 \times 7, 3 \times 9$ and $11 \times 12$, and get everything from just them.

5. The other advantage of 'just knowing' the tables, rather than working them out (even very quickly) is, of course, that you can work backwards, and when you see, say, 56, you immediately think 7s and 2s ($2^3 \times 7$). I think the kind of approach I've outlined here, focused on scaling up, rather than repeated addition, potentially helps with this, because, when you see 56, you are more likely to think 'double 28', and that can take you back to $14=2 \times 7$ and, via $4 \times 7$, to $8 \times 7$, so all the 'reverse doubling' helps to make visible the multiplicative structure that is there. Whereas thinking that 56 may be 7 or 8 more or less than some other half-remembered number doesn't do much for you.


Faux, G. (2018). Tables together. Association of Teachers of Mathematics.

Foster, C. (2022). Getting multiplication the right way round. Mathematics in School, 51(2), 16–17.

Francome, T. (2020). Random chants: Generating a lot from a little using Excel. Mathematics Teaching, 274, 28–30.

28 April 2022

Tangible contexts for mathematics

Do contexts help students to understand mathematics or do they just make it harder for them to untangle the mathematics from all the extraneous information? I think the answer is yes – both of these happen on different occasions. So, what is it that gives some contexts the potential to be powerfully illuminating?

I think the answer is not ‘relevance’ to a student’s personal interests. Relevance might be motivating, possibly, but it doesn’t necessarily make the context more illuminating of the mathematics. That way lies a ‘learning styles’ kind of fallacy, that every student needs a different context that is just right for them, and the magical right context will somehow make everything clear to them. I don’t think that’s right. And anyway, students often seem more switched on by contexts that take them out of their existing worlds (e.g., spaceships, dinosaurs, unicorns) than those which merely reference things they are already familiar/bored with. So I don't think matching personal interests is the most helpful approach. I think it's more likely that generally most students are helped by the same illuminating, well-chosen contexts, and not really so much by others. 

Ratio and multiplicative/proportional reasoning

Let’s take ratio or proportional/multiplicative reasoning as an example. This is widely acknowledged to be a (or possibly ‘the’) central concept in lower secondary mathematics. And something that many students really have a weak grasp of. If you wanted a concrete context to help students make sense of this area, what would you pick? If you opened a textbook at the ‘ratio’ chapter, what contexts would you expect to find?

Of course, ratio can be applied to all sorts of contexts, and it is important to do this and let students see how ratio can be relevant and important in a wide range of areas. That is fine. But what I am thinking about here is contexts that are deliberately used to try to develop students’ understanding of what ratio is and how it operates.

The problem for me with, for example, money as context is that if the ratio of money spent by, say, Usha and Sam is 3:1, and the ratio of money spent by Dave and Priya is also 3:1, it is quite hard to capture in words (or in pretty much any other way) what specifically it is about Usha/Sam and Dave/Priya that is the same, given that these ratios are the same. The ‘same ratio’ is a highly abstract concept here. So, although I think that money might at some point be a worthwhile context for using ratio, I don’t think it’s helpful for understanding what ratio is. My test is that I need to be able to complete the sentence: “When the ratios are the same, the _____ is the same” with something highly tangible and familiar (not mathematical) going into the blank space. For this reason, I think that most discrete ratios (money, different coloured beads on a string, different kinds of animals on a farm, boys and girls in a class, etc.) are not so useful.

Tangible context: paint

Instead, I think the ratios of continuous quantities are much more useful to begin with, and, in particular, my go-tos are always drinks (Foster, 2007) and paint. The fact that most students probably never mix their own drinks, and even professional decorators rarely mix pots of paint together to make new colours (and when students mix their paint in art, this would be by eye) is irrelevant. The point of the context is not that it’s something students do every day, or even ever. The point is that it’s easy to imagine (what Realistic Mathematics Education calls 'realistic', and which means something closer to 'realisable').

The reason that I think these contexts are useful is that:

  • “When the ratios of red paint to white paint, say, are the same, the paint is the same colour.” and
  • “When the ratios of orange juice to lemonade, say, are the same, the drinks taste the same.”
And everyone knows what these things mean. This means that you can have a discussion about various hypothetical mixtures of red and white paint, or fizzy orange, and you can initially completely avoid the word ratio and any 'rules' about when 'ratios' are or aren’t equal. You can just ask: “Would they be the same colour?” or "Would they taste the same?", and everyone knows what you mean and can engage in the thinking that you want them to do.

With paint, I find that having the two colours as red and white is particularly useful, because you then have the word ‘pink’ available, in addition to talking about ‘redness’ and ‘darker/lighter’. This all helps the discussion to focus initially on the mathematical thinking, rather than terminology. Once students appreciate that 2:3 and 20:30 and 1:1.5 and 4:6 are all ‘the same colour’, then it is natural to try to capture this ‘sameness’, and we can use a word like ‘ratio’ to do so. But doing it the other way round, beginning by stating that 'We say that' 2:3 and 20:30 and 1:1.5 and 4:6 are all ‘the same ratio’ invites students to ask, “What do you mean?” And that puts the teacher in the position of having to do the justifying, whereas really you want the students to be doing this, based on something that they have already gained a sense of.
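For readers who want to check the 'same colour' claim numerically, here is a small Python sketch. It assumes that 'same colour' can be modelled as 'same fraction of the mixture is red', and the helper name `red_share` is just an illustrative choice.

```python
from fractions import Fraction


def red_share(red, white) -> Fraction:
    """Exact fraction of the whole mixture that is red paint."""
    # Going via str() keeps decimals such as 1.5 exact (Fraction('1.5') is 3/2).
    red, white = Fraction(str(red)), Fraction(str(white))
    return red / (red + white)


# The four red:white mixtures mentioned in the text.
mixtures = [(2, 3), (20, 30), (1, 1.5), (4, 6)]
shares = {red_share(red, white) for red, white in mixtures}
print(shares)  # a single value means all four are 'the same colour'
```

All four mixtures come out with two-fifths red, whereas a mixture such as 3:4 (which has 'one more white than red', like 2:3) gives three-sevenths red and so would be a different pink.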

Tangible context: fizzy orange

For the same reason, making fizzy orange using orange juice and lemonade can be another really illuminating context (and you could possibly even do this one for real in the classroom; see Foster, 2007). Lemonade is better than water, I think, not just because the mixture tastes better, but because then you can ask, “Which mixture will be fizzier?” as well as “Which mixture will be more orangey?” Really tangible contexts like these do a lot of the work for you. Every child knows that adding more orange juice won’t necessarily make the mixture taste more orangey, if you are also adding more lemonade.

I would often begin a discussion of this scenario by suggesting a few possible mixtures of orange juice and lemonade (as in the table below), and asking students which mixtures would taste the same, and which would taste different. For any ones that they think would taste different, I would ask them which would taste more orangey, and I find that that sometimes causes them to change their minds. You often get to a situation where they think one mixture would taste more orangey, but also more fizzy, and so that causes them to go back and think again. As the discussion progresses, further possible mixtures are usually suggested by the students, and I would add these to the list. The point is to avoid telling students whether they are right or wrong, but to draw on their common sense and life experience to let them figure it out. They know everything they need to know to do this. This then forms a really good basis for more formal teaching of ratio.

For example:

Teacher: Would any of these mixtures taste the same? Are there any you’re sure would taste different?
Student 1: D and E would taste different.
Teacher: Why do you say that?
Student 1: D would taste stronger than E because there’s less lemonade in it.
Teacher: But D and E have the same amount of orange, don’t they, so shouldn’t they be equally orangey?
Student 2: No, because the orange is spread out in more lemonade in E.
Teacher: Can someone else explain what Student 2 is saying?
Teacher: Would any of these mixtures taste the same as each other?
Student 3: A and B would taste the same.
Teacher: Why do you say that?
Student 3: Because they both have 1 more lemonade than orange.
Teacher: Are there any other mixtures with 1 more litre of lemonade than orange?
Student 4: Mixtures C and D.
Teacher: So, would mixtures A, B, C and D all taste the same?
Students: Yes.

It’s likely at this point that some student will raise some doubt, perhaps relating to C being ‘nearly fifty-fifty’. Multiplicative language or thinking tends to appear around this point, if it hasn't already, which can then develop into getting the students to order A, B, C, D and E by ‘orangeyness’.

If this doesn't happen, then the teacher can be more proactive:

Teacher: Suppose I took two containers of Mixture A. How many litres would there be in each?
Student 3: 5 litres.
Teacher: What would happen if I mixed them together?

Every student will appreciate that mixing identical mixtures will lead to twice as much mixture, but that it will taste exactly the same. So this gives us Mixture E. And students will have already agreed that Mixture E must be less orangey than Mixture D, so this provides the nudge for everyone to think more deeply. Mixtures A and D can't taste the same if mixtures A and E taste the same and mixtures D and E don't! The idea that mixing 'identically-tasting mixtures' (still avoiding the use of the word ‘ratio’) will lead to a new mixture with exactly the same taste is highly intuitive, and nobody will ever doubt this. And that kind of knowledge is all that is needed to develop all the necessary ideas of ratio through this kind of discussion.
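The argument above can be sketched in a few lines of Python. The quantities for Mixtures A, D and E are not quoted from the table; they are inferred from the dialogue (A is 5 litres with one more litre of lemonade than orange, E is two lots of A, and D has the same orange as E with one more litre of lemonade than orange), so treat them as assumptions. The function names are illustrative only.

```python
from fractions import Fraction

# (orange, lemonade) in litres; values inferred from the dialogue, not quoted.
mixtures = {"A": (2, 3), "D": (4, 5), "E": (4, 6)}


def combine(first, second):
    """Pour two mixtures into one container."""
    return (first[0] + second[0], first[1] + second[1])


def orangeyness(mixture) -> Fraction:
    """Fraction of the drink that is orange juice."""
    orange, lemonade = mixture
    return Fraction(orange, orange + lemonade)


two_lots_of_a = combine(mixtures["A"], mixtures["A"])
print(two_lots_of_a == mixtures["E"])  # two lots of A make E
for name, mixture in mixtures.items():
    print(name, orangeyness(mixture))
```

On these inferred values, A and E are both two-fifths orange, while D is four-ninths orange, so D is the more orangey one even though it shares the 'one more lemonade than orange' pattern with A.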

Tangible context: chromatography

Finally, I think a really helpful science context is chromatography and retardation factor ($R_f$) values (Note 2). There could be potential for some cross-curricular practical work with chromatography paper and water-soluble marker pens. Different inks dotted along a pencil line at the bottom of a sheet of chromatography paper will move at different rates as the solvent soaks up the sheet (Figure 1). Each component will travel at a fixed fraction of the speed of the solvent, and the $R_f$ value of each is defined as

$$R_f = \frac {\text{distance travelled by the substance}} {\text{distance travelled by the solvent}}$$

Figure 1. Calculating the $R_f$ for the red substance.

This seems to me like a perfect, dynamic scenario for understanding ratio, because molecules of a substance are highly obliging, and obey the rules perfectly (unlike, say, two runners in a race, running at different speeds, who need to negotiate bends and are likely to get tired at different rates). Here, when the ratio is the same, the height above the baseline on the chromatogram is the same (and the substance is likely to be the same). I would be keen to hear from anyone who has used this context as a way to explore ratio with students.
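To illustrate the $R_f$ definition given earlier, here is a short Python sketch. The distances used are made-up numbers chosen for easy arithmetic, not measurements from a real chromatogram, and the function name `retardation_factor` is an illustrative choice.

```python
def retardation_factor(substance_distance: float, solvent_distance: float) -> float:
    """R_f = distance travelled by the substance / distance travelled by the solvent."""
    if solvent_distance <= 0:
        raise ValueError("the solvent must have travelled a positive distance")
    if not 0 <= substance_distance <= solvent_distance:
        raise ValueError("the substance cannot travel further than the solvent")
    return substance_distance / solvent_distance


# Hypothetical readings (in mm) of the same spot at two different moments.
early = retardation_factor(30, 40)
later = retardation_factor(60, 80)
print(early, later)
```

Both readings give 0.75: the spot has moved twice as far by the later moment, but so has the solvent, so the ratio is unchanged.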

Questions to reflect on

1. What examples of illuminating contexts do you use - for ratio, or for other topics? What is so good about them?

2. When do you feel that contexts do and do not work well? Why?


1. For a free lesson plan based on the fizzy orange idea, see

2. People's recent familiarity with Covid lateral flow tests may also make this easier to grasp.


Foster, C. (2007, May 24). Make maths sparkle. SecEd, 12.

14 April 2022

Intro-ducing and outro-ducing methods

Welcome to my first blogpost as President of the Mathematical Association. I am aiming to post to this blog every other Thursday during my year as President, and to address a variety of issues that will hopefully be of interest to MA Members and others across the whole range from early years up to university. That is a tall order, and means that I won't always know what I'm talking about, so please engage with the blog in the comments underneath, put me right when I'm talking nonsense, and make this a conversation. I will try to encourage this by being a bit provocative and controversial at times!

So, let’s get started. And thinking about how to introduce the first post of the blog got me thinking about how teachers introduce methods in mathematics - and also particularly the opposite: namely, how we help learners to move on from methods. And yes, I've invented the word ‘outro-ducing’, because my thesaurus couldn’t find me a word that really captured the opposite of intro-ducing!

I think teachers spend a lot of careful thought on how they will introduce methods to learners, but much less consideration is given to how particular methods might enjoy a dignified exit. You have probably taught many lessons where the main aim was 'to introduce X'. But how often has your main lesson aim been 'to outro-duce' something? “This is the last day on which you will do X - we won't be doing that any more after today." Is that something you would ever do?

It may sound negative to be thinking about removing methods from learners' toolboxes. Why would anyone want to do that? Surely the more methods learners have access to the better? But I don't think that's realistic. Picture having a cluttered toolbox, with new tools constantly pouring in at the top. Some tools are genuinely additive - they expand the range of things learners can do. But others really should displace older tools - and we ought to throw out those older tools that we no longer need. If we don't prune our toolbox, we make it harder for ourselves to find what we need. Do we ever talk about this kind of thing with learners (Note 1)?

Counting on

The problem is that learners at any level can get stuck on an inefficient method, which becomes comfortable for them through familiarity, and it can then be hard for them to 'move on' to more powerful approaches. I will take an example from primary/secondary school, but please substitute your own example that is relevant for the ages you work with.

It is important for young children to learn ‘counting on’ as a powerful strategy - far more powerful than 'counting all'. So, to work out $5 + 3$ they would begin with the larger number, 5, and say “6, 7, 8”, so the answer is 8. For young children, this is not trivial, and there are all kinds of pitfalls, such as starting counting the 1 on the 5, rather than the 6, and obtaining an answer of 7. The business of counting up to 3 while saying “6, 7, 8”, rather than “1, 2, 3”, is really quite complicated. This is all important to take time over and work on carefully. However, what do we do when we find older primary or even secondary-age learners who still seem wedded to counting on as their preferred method? Of course, they have been introduced to many other, more efficient, methods over the years. But they trust ‘counting on’ more than any of these. They are more comfortable with it, and believe that, for them, it is more reliable. This favoured method then becomes a barrier to other methods, and, the more they use it, the more alien other methods feel ("I don't do it that way; I prefer my method").

When I watch a child counting on to work out something like $14 + 14$ (“15, 16, 17, …”), I can't help feeling that we are wasting their time. Suppose they reach the answer 29 - what does the teacher do? It is easy to feel sorry for the child and say something like, “Ooh, nearly. Try that again.” More wasted time as they repeat the process, and, even if they get it right second time, what do they learn? A systematic error, such as a fencepost error (Note 2), getting 27 because they ‘count the 14’, should be addressed explicitly, but if their working memory has simply been overwhelmed by the task, or they just made a slip, then what does repeating it achieve? Errors like this are a feature of the method, rather than the child, when used on numbers as large as this; if I had to do $14 + 14$ that way, I would also be slow and possibly inaccurate. Assuming that the learner just needs more practice simply traps them, and lots of mathematics lesson time can be consumed while apparently ‘low-attaining’ learners endlessly 'count on', while learning nothing except that they are apparently not good at mathematics. Until they can succeed with this method (by some measure), they are deemed not yet 'ready' to be urged onto a more sophisticated method.
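The difference between correct counting on and the fencepost version can be sketched in Python. This is only a toy model of what a learner says aloud; the function name `count_on` and the `count_the_start` flag are illustrative inventions.

```python
def count_on(start: int, steps: int, count_the_start: bool = False) -> list[int]:
    """Return the numbers spoken aloud when counting on `steps` from `start`.

    With count_the_start=True the learner says the starting number as the
    first count, which is the fencepost error described in the text.
    """
    first_spoken = start if count_the_start else start + 1
    return list(range(first_spoken, first_spoken + steps))


print(count_on(5, 3))                               # correct counting on for 5 + 3
print(count_on(5, 3, count_the_start=True))         # the '1' is counted on the 5
print(count_on(14, 14)[-1])                         # last number said for 14 + 14
print(count_on(14, 14, count_the_start=True)[-1])   # the learner 'counts the 14'
```

The systematic error always lands exactly one short (7 instead of 8, and 27 instead of 28), whereas an answer like 29 fits no such pattern. Even when done correctly, $14 + 14$ needs fourteen separate counts to be tracked.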

Getting over the hump

But what is the teacher to do? The learner has been taught more powerful methods but claims not to understand them, or not to like them, or just to be more comfortable with their counting-on method. The problem is that if we simply allow learners to stay for as long as they wish with whichever method they feel most comfortable with, then they are very likely to get stuck on inefficient methods. Any new method is going to feel hard at first, simply because it's new and unfamiliar. Mastering a new method is bound to be challenging initially, even if, ultimately, it might feel far more comfortable than where you were beforehand. Transitioning to a new method is hard because it’s unfamiliar. You are stepping away from your comfort zone, so learners should expect to find the new method harder at first, as there is a hump to get over before you feel the benefits (Figure 1).

Figure 1. Getting over the hump when learning a new method. You cannot expect to experience the benefits immediately.

Learners need to understand that they can't judge whether they like a new method the first time they see it or try it - it is only when some fluency with it has been developed that they will be in a position to say what they think of it. "This may be your future favourite method, but you can't know that yet!" If we just introduce a new method and ask them what they prefer, they may be very likely to prefer the old method, simply because it’s familiar and it has served them well in the past, particularly if they have low confidence and a history of lack of success in mathematics. We may have to be a bit more pushy than just introducing new methods and hoping they will catch on. We can phrase this positively: “I know you’re really good at doing this by counting on. I’d like to see if you can do it using tens and ones, and I want you to try this method today."

If we do this, we need to be tolerant of the fact that learners trying a new method may initially be less reliable than they were with their old method, since they are not yet fluent with it. A new method may not give instant benefits. So, we might need to expect more errors (or perhaps different ones), at least at the start. So we need to praise the fact that they're trying the new method and not let them feel like they have failed because they are slower and less accurate than they were previously: “Great that you’re using tens and ones to do this. You'll get more accurate as you work at it.” Otherwise, if we (even subtly) reward speed and accuracy, they will want to revert to the old ways ("Counting on just suits me better"). Learning the new method is an investment that will most likely take time to pay off.

Fading out?

People often talk in terms of 'scaffolding and fading', but I think it is not enough just to introduce new methods and hope that the old ones will 'fade away' in the shadow of these new, more powerful methods. Often the old methods will persist, and we need to help learners by actively 'outro-ducing' those methods. Letting some learners spend all lesson working out a handful of calculations like $34 + 27$ by counting on is not teaching them anything useful. It is not helping them withdraw from 'counting on' and transition to 'tens and ones' approaches - it is just reinforcing their dependency on something that is ultimately no longer helping them. It is just contributing to the problem, as they fall further and further behind their peers, who are accessing the more powerful methods. We need to be helping learners through a managed withdrawal from methods that have outlived their usefulness: “I would like to support you in moving from this method to this other method that I know will be harder at the start but in the end I think will really help you.” This doesn’t have to be flicking a switch overnight ‘banning counting on’. The word 'fading' suggests something gradual - which is helpful - but also perhaps something that happens naturally, without any intervention - which is, I think, less helpful. If you think of a toddler who has got hold of something they are not allowed to have, like a pair of scissors, then the ideal thing to do may be to distract them away from it with something bright and even shinier. But, while doing this with one hand, you might still need to use your other hand to gently prise their fingers away from the scissors. The attraction to the shinier object might do some of the work, but not all.

Planning the outros

None of this is saying that certain methods should never have been taught (see Foster & Ollerton, 2020). Outro-ducing a method your colleague painstakingly intro-duced years previously is no reflection on their judgment as a fellow professional. If you had been the teacher then, you would also have taught that method. Many methods are important - necessary even - for a time, and then the point comes when they need to be retired. The learner who is now wedded to 'counting on' was probably previously committed to 'counting all', and somehow made the shift from that. So, moving on from 'counting on' is certainly not saying that ‘counting on is bad’. But, when a particular method seems to have passed its use-by date for a particular learner, our role may be to help them say goodbye to it. All of this is obviously a matter of judgment for the teacher. We don’t want to accelerate learners prematurely onto formal methods like column addition that they don’t understand (Foster, 2019). But building understanding of place value needs to be actively worked on, and leaving learners counting on for years doesn't do this.

Two of the most important parts of a piece of music are the intro and the outro. The quality of the intro determines whether the listener will continue listening or click ‘next’; the quality of the outro has a strong influence on the listener’s memory and overall perception of the piece. I suspect that good outros are harder to write than good intros - look at how many popular songs end by looping a repeat of a couple of lines and fading down the volume. Ending things well can be difficult - personal relationships sometimes drift along because neither person knows quite how to end it. A good host at a party needs not just to be hospitable and welcome all the guests in, but also occasionally to boot out guests who’ve overstayed or had too much to drink! Waving goodbye to methods that learners have become accustomed to over years is hard and may feel like pulling teeth, but it is just as important as introducing them to new methods.

Questions to reflect on

1. Are there methods that your learners use that you wish they would move on from?

2. Do you recognise the challenge of 'the hump' (Figure 1) when learners encounter new methods?

3. How might you help your learners to let go of mathematical methods that seem to have outlived their usefulness for them?


1. Of course, one exception to throwing out old tools is when you happen to be a mathematics teacher. You still need to know how to do things in 'less sophisticated' ways, because you will have learners who are working in those ways. Saying, "Oh but I don't do it that way" doesn't work if you are a teacher!

2. A fencepost error is an "off-by-one error" caused by incorrectly including or excluding a boundary value (e.g., for 4 fence panels you need 5, not 4, fence posts).


Foster, C., & Ollerton, M. (2020). Mathematical white lies. Mathematics Teaching, 272, 24–25.

Foster, C. (2019). Doing it with understanding. Mathematics Teaching, 267, 8–10.
