This is a work in progress on how model-based thinking can be used informally to make you smarter in everyday life.

It's kind of hard to explain what "informal model-based thinking" actually means, but to a first approximation it covers "situations where looking at some simplified, abstract, number-free graphs can help us understand something about normal, everyday life", often (but not always) by helping us distinguish cases that look quite similar but are actually very different.

TOPIC 1: Linear Versus Exponential, Or, Are you learning anything useful?

TOPIC 2: Rare Bad Events, Or, The risk of getting hit with the mother of all parking fines.

TOPIC 3: Post-Selection and High-Variance Strategies, or, why many of the world's most famous business strategists are selling you snake-oil.

TOPIC 4: Small harms add up, if you actually care about them, or, how much do you care about bugging people?

TOPIC 5: Discontinuities, Or, are you learning vocab or learning to ride a bike?

TOPIC 6: Diminishing Returns, Or, Why sometimes you've got to stop studying and just go to bed.

TOPIC 1: Linear Versus Exponential, Or, Are you learning anything useful?

A lot of important things in the world work exponentially. There's more about exponentials here, but long story short, for our purposes, an exponential curve is one that starts slowly and grows very rapidly. You may have heard of this as a "hockey stick curve" (especially if you hang out a lot with startupistas).
By contrast, a lot of things in the world work linearly. Whatever rate of change they have to start with they'll continue having later on.

What do you notice about these two curves? That's right: for a very long time (though "a very long time" depends completely on the gradients involved and how far you zoom in) these curves will look indistinguishable.

The conclusion? It might often be hard to tell whether you're doing something exponential or something linear.
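If you'd like to see this with numbers rather than pictures, here is a minimal sketch. Everything in it is an assumption made up for illustration: the 1% growth rate, the starting value of 1, and the function names `linear`, `exponential` and `relative_gap`.

```python
import math

def linear(t, rate=0.01):
    """Value that grows by the same fixed amount every step."""
    return 1 + rate * t

def exponential(t, rate=0.01):
    """Value that grows by the same fixed percentage every step."""
    return math.exp(rate * t)

def relative_gap(t, rate=0.01):
    """Gap between the two curves, as a fraction of the linear value."""
    return (exponential(t, rate) - linear(t, rate)) / linear(t, rate)

# Early on the two columns are nearly identical; much later they are not.
for t in (1, 10, 50, 500):
    print(t, round(linear(t), 3), round(exponential(t), 3))
```

With these made-up numbers the two curves agree to within about half a percent at t = 10, but differ by more than a factor of twenty at t = 500. How long "early on" lasts depends entirely on the rate you pick.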

What kind of domains might this apply in? I think education/training is a very good one. A lot of successful types of education/training seem (to me) to have exponential returns; a lot of unsuccessful types of education/training seem to have linear returns. And I think that, for a lot of activities, you can easily spend 5-7 years learning to do something and not necessarily be able to tell, based on returns, whether you're doing something linear or exponential. Similarly, you might compare [...]. This is a problem. I'm not sure what to do about it.

TOPIC 2: Rare Bad Events, Or, The risk of getting hit with the mother of all parking fines.

Imagine a thing you can do which, 24 times out of 25 times, has a small positive outcome for you.

What happens on the 25th time? Well, it's hard to know. There are two very different models that might apply.

On the one side I would give the example model of parking fines. If you park your car illegally a lot then maybe many times in a row you'll get a mild benefit (not paying for parking), but every so often you'll get caught and have a massive negative outcome (a huge fine for parking illegally).

On the other side I would give the example model of eating lots of apples. Every time you eat an apple you get a mild benefit (one apple). Some apples are better than others, but nothing ever goes seriously wrong: you get 24 mild benefits in a row and then the 25th time you also get a mild benefit. [It would be neat to have two example models which were thematically similar to each other; if you come up with something like that please let me know].

In real life, it's really hard to tell which type of model you're dealing with. And that's a big problem in life: a lot of things that look like really good/smart things to do are actually blown up by the possibility that they're a "parking fine" model in disguise.
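Here is a rough sketch of why the distinction matters. The payoff numbers (a gain of 5 most times, a fine of 200) and the helper name `average_outcome` are hypothetical, chosen only to make the arithmetic visible.

```python
def average_outcome(usual_gain, rare_outcome, rare_every=25):
    """Average result per attempt when 1 attempt in `rare_every`
    gives `rare_outcome` and all the others give `usual_gain`."""
    usual_count = rare_every - 1
    return (usual_count * usual_gain + rare_outcome) / rare_every

# Both models look identical for the first 24 attempts: a gain of 5 each time.
apples = average_outcome(usual_gain=5, rare_outcome=5)      # 25th time is also mild
parking = average_outcome(usual_gain=5, rare_outcome=-200)  # 25th time is a big fine

print("eating-apples model:", apples)
print("parking-fine model: ", parking)
```

Under these assumed numbers the apples model averages a gain of 5 per attempt while the parking model averages a loss of about 3.2 per attempt, even though nothing in the first 24 observations tells the two apart.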

One example: if you ever go to the bank and ask about investing in shares (in Britain or the U.S.) you will probably have a slick-looking 20-something dude tell you confidently that "the stock market as a whole has never lost money over a seven-year period since 17XX". What he's saying is that investing in an index fund is certainly going to give you a mild benefit (you won't lose money, even if you don't make as much money as you possibly could have in some riskier investment). The problem is that it's possible this is actually a parking-fine model: that eventually the stock market, as a whole, will crash completely, and that everyone who invests in stocks will lose all their money altogether. Is the stock market a parking-fine model or an eating-apples model? It's not as obvious as some people will tell you.

[This model underlies the thesis of Nassim Nicholas Taleb's The Black Swan].

TOPIC 3: Post-Selection and High-Variance Strategies, or, why many of the world's most famous business strategists are selling you snake-oil.

You're trying to decide between two different strategies: one which has high variance in outcomes (sometimes the outcomes are outstanding, sometimes the outcomes are terrible) and one which has low variance in outcomes (the outcomes are never outstanding and never terrible; they're only ever mildly-good or mildly-bad).

There might be good reasons to choose the high-variance strategy – maybe you're a risk-loving kinda person, and you're willing to tolerate some occasional terrible outcomes for the possibility of maybe having an outstanding outcome. That's fine, there's no problem, it's just a choice to make about how much you like risk. But that's the point: you are taking a risk if you pick the high-variance strategy.

Here's where the problem happens. Suppose you focus on a very small part of the data: the data from previous people who did really well.

What do we notice? Well, a lot of the data from both strategies goes missing. But while all the data from the low-variance strategy goes missing (there's never been a person who took the low-variance strategy and had outstanding results), some of the data from the high-variance strategy survives the cut (some of the people who take the high-variance strategy have outstanding results, even if most don't... and some people who take the high-variance strategy have terrible results).

Now, if we phrase it as "let's ignore everything below this threshold for some reason" it sounds really silly. But what if instead we said something like "we've put together this useful list of billionaires – let's look at how each of them got rich, and see what we can learn from the strategies they used!" Suddenly that sounds like a pretty successful article in a bunch of different magazines. But at the end of the day it is equivalent to saying "let's ignore all the data below a certain threshold (anyone who is worth less than a billion) and infer things about strategies from the people who are left." And then it might well turn out that the strategies you think are "successful" are actually just high-variance.
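A small simulation makes the filtering effect concrete. This is only a sketch under assumptions I've invented for the purpose: both strategies have the same average payoff (zero), outcomes are drawn from bell curves with spreads of 1 and 10, and "doing really well" means scoring above 20. The function name `simulate` is likewise made up.

```python
import random

def simulate(n_people=10_000, threshold=20.0, seed=0):
    """Give both strategies the same average payoff (0) but different
    spreads, then keep only the people whose outcome clears the threshold."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(n_people):
        outcomes.append(("low-variance", rng.gauss(0, 1)))
        outcomes.append(("high-variance", rng.gauss(0, 10)))
    survivors = [name for name, result in outcomes if result > threshold]
    return outcomes, survivors

outcomes, survivors = simulate()
print("low-variance survivors: ", survivors.count("low-variance"))
print("high-variance survivors:", survivors.count("high-variance"))
```

Everyone on the resulting "list of winners" used the high-variance strategy, even though by construction it is no better on average than the low-variance one.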

Jerker Denrell writes about how various business gurus/prognosticators/thinkers/forecasters make this mistake in a bunch of different ways.

TOPIC 4: Small harms add up if you can't ignore them, or, how much do you care about bugging people?

Suppose you're trying out some kind of strategy that generally has mildly-bad outcomes but occasionally has pretty-good ones. It could easily turn out that, despite the occasional good outcomes, the strategy isn't worth it for you because the many mildly-bad outcomes make the average come out negative.

Now... suppose you could get rid of those mildly-bad outcomes? Ah, everything would be different then.

Ok, but... you can't just randomly ignore bits of data, can you? Either a bad outcome exists or it doesn't, right?

I actually think this situation comes up pretty often in real life: that "bad outcomes" can often be a question of framing. For example... suppose that, for some activity you do, the Small Harms happen to someone other than you (while the Pretty-Good outcomes happen to you). Do you care about small harms that happen to those other people? If you do care then you're living in the first model, and you won't use the strategy. If you don't care then you're living in the second model, and you will use the strategy.
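The two models can be put side by side in a few lines. The payoffs below (nine outcomes of -1 for every outcome of +5) and the name `average_value` are hypothetical, picked just so the sign flips.

```python
def average_value(outcomes, count_small_harms=True):
    """Average payoff of a strategy. If `count_small_harms` is False,
    every negative outcome is treated as if it were zero."""
    counted = [o if (o > 0 or count_small_harms) else 0 for o in outcomes]
    return sum(counted) / len(counted)

# Hypothetical strategy: nine mildly-bad outcomes for every pretty-good one.
outcomes = [-1] * 9 + [5]

print("caring about the small harms:", average_value(outcomes, True))
print("ignoring the small harms:    ", average_value(outcomes, False))
```

Same strategy, same events: the average is negative if the small harms count and positive if they don't, so the decision turns entirely on whether you care about them.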

What are we talking about, concretely, here? Well, imagine someone who stands in the middle of the street and tries to collect donations for a charity. Occasionally people stop to give money, and maybe in those cases this is a Pretty Good outcome for everyone – maybe the person was genuinely convinced about the charitable cause and wanted to donate, and the charity got money, and the charity-collector gets a bonus for every sign-up, so everyone's happy. But it's also clear that a lot of people don't like mid-street charity-collections: they avoid eye-contact, and cross the street, and generally feel uncomfortable. There's obviously a question of whether the benefits outweigh the harms, but the easiest way to be a charity-collector is if you can ignore the mild harms that happen to other people completely – either because you think those harms are illegitimate, or because you just don't care.

TOPIC 5: Discontinuities, Or, are you learning vocab or learning to ride a bike?

A lot of the time we expect input/benefits to act like this: for every extra unit of input invested we expect to get a sensibly-scaled extra amount of benefit. The line might not be perfectly straight – the first unit of input might not get us exactly as much benefit as the third unit, or the eighth, or the fifty-seventh – but the increase is always "smooth" and reasonable:

However, in real life things aren't always like this. Sometimes, a tiny bit of extra input can have a vast amount more benefit in a way that basically means there is no "continuity" from one point on the graph to the neighbouring point.

A toy example: suppose you're really hungry and you go to the pizza-shop. The people there sell two things (garlic bread and pizza) and they sell them by weight. Garlic bread costs 1 penny and upwards (you can buy even a crumb of it) but the smallest pizza costs $5. You really want some pizza – you'll only settle for garlic bread if you can't avoid it. Therefore, you really hope that when you reach into your pocket and pull out some coins you'll discover they add up to $5+. Finding $4.99 in your pocket will give you significantly less value because for $4.99 you can only buy large amounts of garlic bread – you can't get any amount of pizza. With one penny more you can start buying pizza, and then you're back in a situation where each extra penny adds a "sensible" extra amount of value. But the one-penny difference in input between $4.99 and $5.00 makes a huge difference in the benefit you receive. This is the discontinuity.
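The pizza example can be written as a tiny function with a jump in it. The "benefit" units, the size of the jump (`pizza_bonus`) and the name `meal_value` are all assumptions for illustration; only the $5 threshold comes from the story above.

```python
def meal_value(cents, pizza_price=500, pizza_bonus=1000):
    """Hypothetical benefit of the food you can buy with `cents`."""
    value = cents  # every extra cent buys a sensible bit more food
    if cents >= pizza_price:
        value += pizza_bonus  # the jump: pizza instead of garlic bread
    return value

# Benefit gained from one extra cent, just below, at, and above the threshold.
for cents in (499, 500, 501):
    print(cents, meal_value(cents) - meal_value(cents - 1))
```

One extra cent is worth 1 unit everywhere except at the step from $4.99 to $5.00, where (with this made-up bonus) it is worth 1001.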

A real-life example: while I don't have hard evidence for this, I tend to think that some kinds of learning are continuous but many others involve discontinuities, and that it's helpful to understand in advance what kind of learning you're about to do.

For example, suppose you're learning a new language. Some language-learning activities are continuous: for example, when learning new vocabulary. Don't get me wrong: some vocab will be easier to learn and some harder; maybe overall it gets easier with time, or harder, I don't know. In general, though, if you put in a little bit more time/effort into learning vocab you'll come out a sensibly-little bit better at the language.

There are other parts of learning, though, where "a-ha" moments really matter. Suddenly, in one moment, a whole new topic or ability "just clicks" and suddenly you're able to do or understand something that you couldn't before. Maybe riding a bike is like this, or certain kinds of maths, or (going back to language-learning) learning the tones in a tonal language.

One key thing about discontinuities is that they completely shake up our predictions of how something will change in future: just looking at past trends is a really bad guide if we're about to hit a discontinuity, because discontinuities are (by definition) a sharp jump away from anything that came before.

TOPIC 6: Diminishing Returns, Or, Why sometimes you've got to stop studying and just go to bed.

So, exciting times: after years of day-dreaming and hoping and wishing, you're finally launching your dream business: a frying-pan factory. (Look, it's not my fault you have weird dreams). At the end of the first day you have 0 finished frying-pans: you've forgotten to actually hire anyone.

Alright, no worries, everyone makes mistakes. For day 2 you hire your first employee: she spends all day running around the factory trying to single-handedly construct frying-pans. It's hard work, and she has to follow each single pan all the way down the production line which is very inefficient. She doesn't make a lot of pans... but at least it's more than zero.

Over the next week you hire a bunch more employees and for a while productivity just goes up and up.

Like grade-school relationships, though, the good times just can't last. You look at the graph and think "wow, every time we hire more employees we get more benefit. I'll hire a bunch more employees." That's when things stop working the way you'd hope:

The new employees just don't add as much as the first few did; there just isn't all that much left to do. Each new employee adds less and less extra value. This is called diminishing returns.

Let's suppose that, for some reason, you just keep hiring more employees anyway. At some point things might get even worse than diminishing returns; you might see this:

You now have so many employees on the factory floor that people can't even move. No-one can get from the conveyor belt to the handle-bin to grab more handles without tripping over a bunch of newbie employees. At this point you would literally be better off if the new employees stayed home: they are creating negative returns. Some people confuse diminishing returns and negative returns so remember:

  • diminishing returns just means that the extra value from adding more inputs is getting less and less (but not necessarily negative), whereas

  • negative returns means that the new inputs are actually destroying value, leaving you worse off in absolute terms than if you didn't have the input to begin with.
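To make the difference between the two concrete, here is a sketch using an invented output curve for the frying-pan factory. The formula in `pans_per_day` is purely an assumption (the real factory's numbers aren't given anywhere), and `marginal_pans` is a hypothetical helper name.

```python
def pans_per_day(employees):
    """Hypothetical output curve: rises, flattens out, then falls
    as the factory floor gets too crowded."""
    return 20 * employees - employees ** 2

def marginal_pans(employees):
    """Extra pans produced by the most recently hired employee."""
    return pans_per_day(employees) - pans_per_day(employees - 1)

for n in (1, 5, 10, 11, 15):
    print(n, pans_per_day(n), marginal_pans(n))
```

On this made-up curve the 1st, 5th and 10th hires each add fewer pans than the one before (diminishing returns, but still positive), while the 11th hire onwards makes total output go down (negative returns).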

Now: where do diminishing returns happen in real life? One common example, in my opinion, is studying for an exam. The day before the exam-date (let's be real here) you finally crack open your textbook. You read through the key points and learn all the big ideas: for every unit of effort you put in you get a serious amount of benefit in return. By 11 o'clock at night your eyes are tired and your brain keeps drifting; it takes you 2 minutes just to finish a sentence. You've run into serious diminishing returns: each unit of input is bringing less and less benefit. It's possible that at some point you even reach negative returns, where you're better off just giving up and going to bed because the extra time you're putting in is actually making you worse off: the new information is pushing more important information out of your head (or something). So there you have it: a model-based reason to get some sleep.

This is a work in progress; no guarantees that it's free of even very basic mistakes. Thoughts, suggestions, feedback?
