First Order Approximation

I used this phrase in a post two weeks ago (Models – How do computers play chess?), and the debate it caused made me realise that it carries subtly different, and important, meanings. This has implications for how we approach problems, and links to some of the root causes of the mutual incomprehension I often come across.


Some of the different meanings:

1). General usage
It is used interchangeably with “first approximation” or a “ballpark estimate”: an educated guess built on a few assumptions and, most likely, a simple model. I think this is the most common usage. It tells you nothing about the nature of the model, as this varies by context.

2). Engineering/physics
Often used as an indication of the level of accuracy, like significant figures, with the accuracy improving at higher orders of approximation. For example, if estimating the number of residents of a town, the answers could be:

1st order approximation / 1 s.f.: 40,000
2nd order approximation / 2 s.f.: 37,000
etc.
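
In code, this is nothing more than rounding to significant figures. A minimal Python sketch (the helper function and the head-count are my own illustrations):

```python
import math

def round_sig(x: float, sig: int) -> float:
    """Round x to `sig` significant figures (helper written for this post)."""
    if x == 0:
        return 0.0
    digits = sig - int(math.floor(math.log10(abs(x)))) - 1
    return round(x, digits)

population = 37_412  # an assumed exact head-count
print(round_sig(population, 1))  # 40000 -- the "1st order" estimate
print(round_sig(population, 2))  # 37000 -- the "2nd order" refinement
```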

3). Mathematics
1st order often refers to linearity. For example, the Taylor series of a function gives:

1st order approximation: a + bx (also called a linear approximation)

2nd order approximation: a + bx + cx^2 (also called quadratic)
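
A quick sketch of the difference, using e^x around zero as an illustrative choice of function:

```python
import math

# 1st- and 2nd-order Taylor approximations of e^x around x = 0:
#   1st order: 1 + x
#   2nd order: 1 + x + x**2 / 2
x = 0.5
print(f"exact:     {math.exp(x):.4f}")       # 1.6487
print(f"1st order: {1 + x:.4f}")             # 1.5000
print(f"2nd order: {1 + x + x**2 / 2:.4f}")  # 1.6250 -- closer
```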

Order also exists in statistics: the arithmetic mean and variance are known as the first and second order statistics of a sample. Skew and kurtosis are the third and fourth orders, and can be thought of as shape parameters telling you how far from a normal distribution you are.
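
For example, on simulated data (purely illustrative, using NumPy and SciPy):

```python
import numpy as np
from scipy import stats

# The four "orders" of a sample, shown on simulated data.
rng = np.random.default_rng(0)
sample = rng.normal(loc=0.0, scale=1.0, size=10_000)

print("1st order (mean):    ", np.mean(sample))
print("2nd order (variance):", np.var(sample))
print("3rd order (skew):    ", stats.skew(sample))      # ~0 for a normal sample
print("4th order (kurtosis):", stats.kurtosis(sample))  # excess kurtosis; ~0 for a normal sample
```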

4). Financial derivatives pricing
For option pricing, it refers to the order of differentiation, which is helpful when thinking about the sensitivity of the price to changes in the underlying:

1st order approximation: Delta (1st derivative of the option price with respect to the underlying price)
2nd order approximation: Gamma (2nd derivative of the option price)

For bond pricing, the second order approximation is also called the convexity adjustment, which again helps in understanding the non-linearity of bond prices.
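
A sketch of both ideas, with made-up sensitivities rather than real market numbers:

```python
# Delta-gamma approximation of an option price change: the 1st-order (delta)
# term is linear in the underlying move dS; the 2nd-order (gamma) term adds
# the curvature. All numbers are illustrative assumptions, not market data.
delta, gamma, dS = 0.55, 0.04, 2.0

delta_only = delta * dS
delta_gamma = delta * dS + 0.5 * gamma * dS**2
print(f"delta-only estimate:  {delta_only:.3f}")   # 1.100
print(f"delta-gamma estimate: {delta_gamma:.3f}")  # 1.180

# The bond analogue: duration is the 1st-order term, convexity the 2nd.
duration, convexity, dy = 7.0, 60.0, 0.01  # assumed duration, convexity, yield move
pct_price_change = -duration * dy + 0.5 * convexity * dy**2
print(f"bond price change: {pct_price_change:+.2%}")  # -6.70%
```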

5). It can refer to the number and importance of the variables in a model.

A first order approximation may only deal with primary drivers. A second order model would include secondary drivers used to refine the estimate.

For example, a first order approximation of the time taken for a ball to drop would use Newton’s second law, F = ma. A second order approximation might add some allowance for air resistance.
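
A rough numerical sketch of the two levels of approximation (all constants are illustrative assumptions):

```python
# First order: constant gravity only (F = ma). Second order: add a simple
# quadratic drag term. The drag coefficient is an illustrative assumption.
G = 9.81        # gravitational acceleration, m/s^2
HEIGHT = 100.0  # drop height, m
DT = 0.001      # Euler time step, s

def fall_time(drag_per_mass: float) -> float:
    """Integrate the fall with simple Euler steps; return seconds to impact."""
    v, h, t = 0.0, HEIGHT, 0.0
    while h > 0:
        a = G - drag_per_mass * v * v  # drag opposes the motion
        v += a * DT
        h -= v * DT
        t += DT
    return t

print(f"no drag:   {fall_time(0.0):.2f} s")   # ~4.52 s, matching sqrt(2h/g)
print(f"with drag: {fall_time(0.02):.2f} s")  # longer, as drag slows the fall
```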

The meanings may not align

These definitions might appear to be much the same thing. You can easily argue that a simple linear model using only the most important drivers will produce a decent ball-park estimate to one significant figure.

But this apparent similarity means a common trap is failing to notice when the definitions diverge.

  • A first order approximation can be quadratic

To estimate the height of a cannonball after firing, we need to draw a parabola, not a straight line. Often the non-linearity is so important that any linear model is awful.

(see How Not to Be Wrong: The Hidden Maths of Everyday Life by Jordan Ellenberg).

  • Making estimates more “accurate” by using higher power terms may make the model worse
    In maths or physics textbooks this approach always works, because you already know the mathematical function or have an underlying relationship which is stable. But in the world of economics and finance it can lead to a huge methodological error: thinking that the better our model fits the data, the better the model. I have worked with many analysts who struggled with this and kept producing models with wonderful correlations and R^2. This led them to believe they had a model which “explains” the price action as well as possible. But these models are invariably useless, have no predictive value, and need to be “recalibrated” to refit new data as it comes in (see the sketch after this list).
  • It takes judgement to know which variables to use.
    For instance, in the falling-object example above, Newton’s second law will do an excellent job on a ball bearing dropped from 10m but a pretty poor job on a parachutist. Which drivers matter in financial markets varies over time, and it takes a lot of flexibility to stay open-minded as to potential outcomes.
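
To make the second bullet concrete, here is the overfitting trap in miniature, on simulated data (the data and polynomial degrees are my own illustrative choices):

```python
import numpy as np

# Fit polynomials of rising order to noisy data whose true relationship is
# linear, then compare in-sample fit with out-of-sample error. The data is
# simulated purely for illustration.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 40)
y = 2.0 * x + rng.normal(scale=0.3, size=x.size)

train, test = slice(0, 30), slice(30, 40)  # hold out the last 10 points
for degree in (1, 3, 9):
    coeffs = np.polyfit(x[train], y[train], degree)
    mse_in = np.mean((np.polyval(coeffs, x[train]) - y[train]) ** 2)
    mse_out = np.mean((np.polyval(coeffs, x[test]) - y[test]) ** 2)
    print(f"degree {degree}: in-sample MSE {mse_in:.3f}, "
          f"out-of-sample MSE {mse_out:.3f}")
# In-sample error always falls as the degree rises; out-of-sample error
# typically blows up -- the "wonderful R^2" trap described above.
```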

Models – How do computers play chess?

Chess computers have been good enough to beat me my whole life. It took until 1997 for one to beat a reigning World Champion, when Deep Blue beat Kasparov. They can now comfortably beat all the best players in the world. But the development of Go computers has been far slower: it was only this year that AlphaGo defeated Lee Sedol, a 9-dan professional. What is the difference?

The most common reason given for why Go is harder is that it is more complicated. In chess there is a choice of 20 first moves; in Go the choice is 361. So in Go the permutations, and as a result the number of possible games, are far higher than in chess. Since computers play these games by simulating permutations, it makes intuitive sense that this is easier in chess than in Go.

This argument is logical but highly misleading. The problem is that BOTH games are insanely complex and unsolvable by brute force. Imagine I am trying to move a pair of large rocks. One weighs 100 tonnes. The other weighs 10,000 tonnes. Is it sensible to say that the second is 100 times harder to move? Or simply that both are immovable? Degrees of impossibility are not a very useful concept.

There is a more important difference between the two games. In chess we can build a simple model that acts as an excellent first approximation to evaluate who is winning: just count the material, using a simple scoring of queen = 9 pawns, rook = 5 pawns, etc., to come up with a single total for each side. The one with the higher number is winning. This is how beginners think of the game: the aim is to grab material. Thus it is easy to code a simple model to get the computer started. Once this first order approximation is worked out, second order models can be added, such as pawn structure, space advantage or use of open files. In Go there is no such simple evaluation metric, and how a computer was eventually programmed to win is a fascinating topic, and likely a separate post on AI.
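
A sketch of such a material-count evaluation (the position encoding is a made-up convention for this illustration, not how real engines represent the board):

```python
# A minimal material-count evaluation, the "first order approximation" for
# chess. Uppercase letters are White's pieces, lowercase are Black's.
PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9, "K": 0}

def evaluate(pieces: str) -> int:
    """Return material balance in pawns; positive favours White."""
    score = 0
    for piece in pieces:
        value = PIECE_VALUES.get(piece.upper(), 0)
        score += value if piece.isupper() else -value
    return score

# White is up "the exchange" (a rook for a bishop):
print(evaluate("KQRRBPPP" + "kqrbbppp"))  # +2: White is winning on material
```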

A good first order approximation often gets you a decent way towards a solution. Without one, you may be unable to find a solution in anything less than an infinite amount of time, as the early versions of Go computers found.

This has an interesting link to the way I approach financial markets and economics. I think it is most important to spend time thinking about appropriate first order approximations, to help with a general understanding of what is really going on. But the influences around us, such as the news or overly complex analysis, often obscure this.


Reminiscences of a Stock Operator

I first read this book when I was 17, but it took me many years of trading and painful experiences to realise that the character who had the most to teach me was Partridge: the older, experienced trader who, whenever presented with a stock tip by an excited young trader, would always reply

“Well, this is a bull market, you know!” as though he were giving you a priceless talisman wrapped inside a million-dollar accident-insurance policy. And of course, I did not get his meaning.


Useful trading models

In economics and finance, it is the development and understanding of first approximation models that is most useful and most important. This is primarily the method I am using for the models of asset market pricing described in other posts. Far too much effort and time is spent on more “complex” analysis and models, which often focus on second or third order drivers by assuming away the first order ones. The “news” constantly blaring out on cable TV is at best a focus on factors causing minute differences in asset prices. At worst, it is just distracting white noise. Precise directions for the last 100m of your trip are not much use if you are not sure which town you are going to. It is far better to be approximately right than to be precisely wrong.

Trade Ideas

A common type of trade idea proposed to me will be in this form:

  1. There is a recent development or upcoming event which matters for the Australian dollar (substitute in any other market) e.g. a piece of economic data
  2. We should buy/sell it

What is rarely done, however, is to consider how important this driver is in context. Commonly the idea is logical but essentially rests on the assumption that the current market price is already correct. This approach fits well with many people’s education, in which assumptions of efficient markets are often embedded without them realising. The reason these trade ideas often fail is that the new information will dominate market movements only if the more important drivers of the currency are already correctly priced. Instead of assuming the market is fairly priced, I would rather question whether this first order approximation is appropriate before moving on.

Australian Dollar Example

To use an Australian dollar example: the value of the currency doubled between 2001 and 2008.

It did not rise like this because of a succession of incremental pieces of random news which happened to cumulate in a massive movement. It happened because the currency was, by first order approximation, far too cheap. A useful first order approximation model for currencies is Purchasing Power Parity (indices are freely available, calculated by the OECD). The chart below shows that the PPP value of the AUD was very steady at around US$0.70. In 2001 the currency was very cheap, and when it approached parity it was very expensive. Capturing these kinds of moves is where I spend my time, and historically where I have made my biggest profits.
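
As a sketch, the model reduces to a one-line misvaluation check (the numbers below are illustrative assumptions, not actual OECD figures):

```python
# A first order "fair value" check against Purchasing Power Parity. The PPP
# rate and spot are illustrative; the OECD publishes the actual PPP indices.
ppp_fair_value = 0.70  # assumed long-run PPP value of AUD/USD
spot = 0.50            # assumed market price

misvaluation = (spot - ppp_fair_value) / ppp_fair_value
print(f"AUD vs PPP fair value: {misvaluation:+.0%}")  # -29% => very cheap
```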

Conclusion

I have learned to focus on the bigger picture and look for large market movements. In my experience, these are most likely to happen when the market price is a long way from a good first approximation model. I therefore put time into building these first approximation models across asset classes, as I have briefly described so far for fixed income; posts on currencies and equities will follow. Just as in chess, a good understanding of a first approximation model can get you a long way. Focusing on very new information, or on complex models of what are actually third order features, while neglecting the first order drivers only leads to confusion and major mistakes.

Decision making – idea generation

“When you have eliminated the impossible, whatever remains, however improbable, must be the truth” – Sherlock Holmes, “The Sign of the Four”

I am a Sherlock fan, but I am afraid this statement is nonsense and reflects a common error in how we make decisions. It is incorrect because what is left after eliminating the impossible is not only the improbable ideas you have, but also all the ideas you did not come up with. It is more likely that the answer is something you did not think of, because you limited your possibilities too early in the process.

I saw a good friend of mine this week who has been very worried for the past two weeks about a business problem. He had put a lot of time and energy into preparing a space to display artwork, but now it seems the lease might fall through due to issues beyond his control. He appeared stuck in a loop, thinking he had no control of this situation. I suggested we use an approach that I use all the time when looking at investment ideas.

  1. Write down all the things that could happen.
    This is an exhaustive list of all the possibilities you can think of. Include items which you think are “impossible” but are in fact just ones you dismiss because they seem too hard. I led the way by insisting on putting down an idea he had previously told me was ridiculous. With that as the benchmark for how bad an idea could be and still make the list, he came up with a dozen ideas in less than 10 minutes.
  2. Go back to each one and write 2 or 3 sentences about why this outcome would be good.
    Do not evaluate them; do not mention any potential obstacles or negatives. I refused to listen to these and cut him off every time he started. I kept telling him that we would get to the negatives next, and then he could explain to me why each idea was stupid. I call this stage “suspension of disbelief”. Once he released his imagination, he became very animated about many of the ideas, including the ones he had not wanted to even put on the list because they were “impossible”.
  3. I lied. We are not going to look at the negatives now. Make action points!
    Instead, go back over each of the items and work out the action point (AP) to take it forward. This is the information-gathering phase. After you have done each of these, write down what you have discovered; then we can talk again and start to evaluate the ideas.

If you go back to my first two posts, you can see the same basic idea:
to be creative, you need to separate the idea generation phase from the analytical evaluation phase.
I hope to use this blog over time to share some of my investigations of ideas which even I may think hard to justify. But to come up with really good ideas, you have to be willing to entertain a lot of really bad ones.

“It is a capital mistake to theorize before you have all the evidence. It biases the judgement.” – Sherlock Holmes, “A Study in Scarlet”

Well said Sherlock.

Creativity

The advice I always give my analyst team is to send me their work BEFORE they have completely finished it, and especially before they have polished it. I want to see the spreadsheets, the workings and the ideas before they have settled on an answer.

The advantages of sharing early

  • I get to see all the underlying data.
    Given time, they will clean it up and present only the “relevant” data. Seeing all the data and associated thoughts, I may assign different importance to information they have discounted, or draw different conclusions.
  • I get to see a variety of possible ideas and views.
    If they are given time to polish the work, only one view will be presented.
  • We can have a vigorous, enjoyable and creative conversation.
    At this stage, other people’s input and ideas are useful as the answer is not fully known.

If I get the final product later

  • The final version will be well argued and compelling
    After all, my team are smart! It will be full of supporting data and information. The best analysts may also present some counter-arguments, but no one seems to represent the messiness of reality or admit that they have no idea what is going on.
  • They will be proud of the work they have done.
    This means they will be protective, taking comments as criticism, and most likely as personal criticism. This leads to a conversation which will likely be professional, non-creative and pretty dull.

How I like to work

In the context of my previous piece on writing, I like work where there is an active response but no evaluation or criticism from allies and peers (the middle-centre box in the grid), not a piece that has been written for evaluation by the boss (the top-right box). I understand that this may not be standard; in fact it is the cultural opposite of how Ray Dalio describes Bridgewater in his “Principles”. This perhaps shows that many different approaches can be successful, but it is important to know which one works for you and to stick to it.

In this blog, I will try to follow my own advice. The posts may not be analytically perfect, well-footnoted or accurately referenced. The views presented will be ones that are liable to refinement and even complete reversal as more information and better analysis is included.

I hope this does not turn out like Charles Foster Kane’s Declaration of Principles.