Attributing Climate

illuminated-clouds

Climate sensitivity is the thing in climate science right now. A part of the climate sensitivity debate is just how sensitive the climate is to CO2, and the arguments seem to be getting more vitriolic and personal as the days pass by. It really wasn’t helped by a Daily Mail article (the Daily Mail being a tabloid in the UK).

Whilst the errors in that article are far too numerous to even be bothered with, and the journalist who authored it has done it before, it is interesting to note that the Met Office has started to publish rebuttals with, evidently, more to come. I would personally have ignored it (and him). It seems perfectly reasonable to refer to this particular tabloid as the Daily Fail.

However, it occurred to me that, whilst I don’t have access to any amount of supercomputer resources, or a wealth of top-end climate researchers at my fingertips, it must be possible to make a ballpark estimate of climate sensitivity with just publicly available data, a spreadsheet, a cup of coffee, and a little bit of time.

If we want to know how sensitive the climate is, we need to figure out what drives it – ie climate parameter attribution. So, what do we know about climate attribution? Here’s the IPCC’s view,

climate_attribution

Essentially, what we have here is the range of possible forcings and their quantitative forcing value in W m⁻² (a measure of irradiance, Watts per square metre). Here’s the list as a table,

Forcing                          Min      Max      IPCC     Mean     Diff     Fraction of total
CO2                              1.490    1.830    1.660    1.660    0.000    0.902
CH4                              0.430    0.530    0.480    0.480    0.000    0.261
Ozone (Tropospheric)             0.250    0.650    0.350    0.450   −0.100    0.190
Halocarbons                      0.310    0.370    0.340    0.340    0.000    0.185
N2O                              0.140    0.180    0.160    0.160    0.000    0.087
Solar Irradiance                 0.060    0.300    0.120    0.180   −0.060    0.065
Surface Albedo (Black Carbon)    0.000    0.200    0.100    0.100    0.000    0.054
Water Vapour                     0.020    0.120    0.070    0.070    0.000    0.038
Contrails                        0.003    0.030    0.010    0.017   −0.007    0.005
Ozone (Stratospheric)           −0.150    0.050   −0.050   −0.050    0.000   −0.027
Surface Albedo (Land Use)       −0.400    0.000   −0.200   −0.200    0.000   −0.109
Aerosol (Direct effect)         −0.900   −0.100   −0.500   −0.500    0.000   −0.272
Aerosol (Cloud albedo)          −1.800   −0.300   −0.700   −1.050    0.350   −0.380

Here, we note that for the most part the IPCC has used the arithmetic mean of the forcing range as the actual forcing value; in some cases this isn’t true, and I’ve noted the difference in the ‘Diff’ column. I’ve also calculated, in the final column, each parameter’s share of the sum of the IPCC forcings.
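As a sanity check on that last column, here’s a minimal Python sketch (the numbers are simply the IPCC best-estimate values from the table above) that recomputes each forcing’s share of the net total:

```python
# IPCC best-estimate forcings in W/m^2, taken from the table above
forcings = {
    "CO2": 1.66, "CH4": 0.48, "Ozone (Tropospheric)": 0.35,
    "Halocarbons": 0.34, "N2O": 0.16, "Solar Irradiance": 0.12,
    "Surface Albedo (Black Carbon)": 0.10, "Water Vapour": 0.07,
    "Contrails": 0.01, "Ozone (Stratospheric)": -0.05,
    "Surface Albedo (Land Use)": -0.20, "Aerosol (Direct effect)": -0.50,
    "Aerosol (Cloud albedo)": -0.70,
}

net = sum(forcings.values())  # ~1.84 W/m^2
for name, f in sorted(forcings.items(), key=lambda kv: -kv[1]):
    print(f"{name:32s} {f:6.2f}  {f / net:6.1%}")
# CO2 comes out at ~90% of the net forcing, as in the table
```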

One thing we can take from this (I may comment on the differences between the forcing values and the arithmetic mean of the forcing range at some later date) is that the IPCC states categorically that CO2 is by far the biggest contributor to climate forcing – by one hell of a margin. I was surprised by just how much the emphasis was on CO2.

Given that – and please remember this is a cup of coffee, back-of-a-cigarette-packet study – I am going to initially assume that the entire climate signal, and therefore climate sensitivity, is down to CO2. I know that’s wrong, and, for sure, there will be some who pick up on it: but this is the paragraph that says ‘I already know’. Remember, we are looking for ballpark figures here.

Given our primary (incorrect, but good enough) assumption, then, let’s get some data. As usual I will be using the HadCrut4 climate data, and for CO2 I will be using the Earth System Research Laboratory dataset. Honestly, I didn’t really know whether I should be picking in-situ CO2 measurements or flasks; I opted for flasks pretty much on a hunch,

Co2_vs_temp

Here you can see that it’s actually a pretty good fit; it seems perfectly reasonable to say that as CO2 increases, so does temperature anomaly. I put a straight line fit through to see what the r² value was. This value tells us how good the fit between CO2 and temperature anomaly actually is: it is the square of the correlation coefficient r, which is +1 for a perfectly positive correlation, −1 for a perfectly negative correlation, and zero when the variables do not correlate whatsoever. So, pretty much, the closer r² gets to 1, the better the fit you’ve got.
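For anyone who wants to reproduce the fit outside a spreadsheet, here’s a minimal Python sketch; the file name and two-column layout are placeholders of my own – the real ESRL and HadCrut4 files need their own parsing and annual averaging first:

```python
import numpy as np
from scipy import stats

# Placeholder: a two-column CSV of annual mean CO2 (ppmv) and
# HadCRUT4 temperature anomaly (deg C) -- the real ESRL and
# HadCRUT4 files need their own parsing and averaging first.
co2, anomaly = np.loadtxt("co2_vs_temp.csv", delimiter=",", unpack=True)

fit = stats.linregress(co2, anomaly)
print(f"slope = {fit.slope:.4f} C per ppmv")
print(f"r^2   = {fit.rvalue ** 2:.3f}")   # compare with the value quoted below
```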

Here we get up to 0.635, which is a value worthy of investigation. However, if you look at the way the dots are arranged, it looks like a curve. A log curve, to be precise. Here’s the finished chart with all the titles, labels, and a log fit curve,

Co2_vs_temp_log

That’s better, but it doesn’t provide a significantly better fit. The reason is that CO2’s logarithmic relationship with temperature is much more evident over very long periods of time – lengths of time that are longer than a human lifetime, for sure. I think we’ll stick with the logarithmic fit, even though for the time scales we’re investigating the relationship might as well be linear.

Least-squares fitting, via a trial-and-error algorithm, eventually transforms the data we have into the following equation, and this gives us a simple model with which we can estimate climate sensitivity based on what little empirical observation we can get our hands on.

y = a + b \cdot \left( \operatorname{sgn}(x-c) \cdot \ln \left( \operatorname{sgn}(x-c) \cdot (x-c) \right) \right)
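Here’s a rough equivalent of that fitting step in Python, using scipy’s curve_fit rather than Gnumeric’s solver – a sketch only, with the merged CO2/anomaly data assumed loaded as in the earlier snippet:

```python
import numpy as np
from scipy.optimize import curve_fit

def log_model(x, a, b, c):
    """y = a + b * sgn(x - c) * ln(sgn(x - c) * (x - c)).
    The sgn() juggling mirrors the Gnumeric form; it only matters
    if the fitted c ends up above the smallest CO2 value."""
    s = np.sign(x - c)
    return a + b * s * np.log(s * (x - c))

# co2 (ppmv) and anomaly (deg C) assumed loaded as in the earlier sketch
params, _ = curve_fit(log_model, co2, anomaly, p0=(0.0, 1.0, 250.0))
a, b, c = params
print(f"a = {a:.3f}, b = {b:.3f}, c = {c:.3f}")
```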

The sgn function is a requirement of the internal model that Gnumeric uses; it makes the equation look all the more awkward, but, hey, if someone else wants to reproduce this, it really matters that you spell it out. Here’s the chart of the idealised CO2 vs Temperature Anomaly curve,

Co2_vs_temp_modelled

Using this model we can take the difference between the modelled anomalies at 280 ppmv and 560 ppmv, which are −2.503 and 1.219 respectively, giving a climate sensitivity of 3.722 °C per doubling of CO2. 90% of that value – since that’s the extent of the IPCC CO2 forcing – is 3.35 °C. In the grand scheme of things, that’s about a middle-of-the-road ballpark figure; neither particularly high, nor particularly low.
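Using the fitted parameters from the sketch above, the doubling arithmetic looks like this:

```python
# Climate sensitivity: modelled anomaly at 560 ppmv minus that at 280 ppmv,
# using log_model() and the fitted (a, b, c) from the previous sketch.
sensitivity = log_model(560.0, a, b, c) - log_model(280.0, a, b, c)
print(f"full sensitivity : {sensitivity:.3f} C")        # the post gets 3.722 C
print(f"CO2 share (90.2%): {0.902 * sensitivity:.3f} C")  # the post quotes ~3.35 C
```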

One of the most interesting features of this chart is that it implies that most of the (fast) warming has already been done, and that we’re at the beginning of the rather flat section of the curve (we’re currently at ~400 ppmv). In other words, polluting our planet with more CO2 will progressively have less and less of an effect. For instance, between 400 ppmv and 500 ppmv the increase is about another 0.5 °C, and between 500 ppmv and 600 ppmv even less, coming in at around 0.2 °C. And horror story of horror stories, a tripling of CO2 will add a measly 1.2 °C – which, given today’s anomaly of ~0.6 °C, adds up to ~1.8 °C, still less than the 2 °C threshold for dangerous climate change.

In all respects this is likely wrong, given far too many simplifications and assumptions; but it is in the ballpark. I suggest that it’s rather too high, given that the period the data covers contains two very distinct features: the sharp rise in temperature at the end of the last century, and the hiatus during the 21st century, both of which are going to skew things. Also, of course, this doesn’t account for any feedbacks or feed-forwards.

There you have it: a back-of-a-cigarette-packet, cup-of-coffee analysis of climate sensitivity. I doubt some British journalists go to as much trouble during their coffee breaks.


Encoding Wind

16841619-wind-rose

Wind is a curious thing.

We all know what it is: it is, of course, the movement of air; the faster the air moves, the windier it is. Wind is also a perfect example with which to describe a mathematical concept called vectors.

Quantities are easy to understand. If you have 100 metres of something you can visualise it. For sure, when you get into the extremes of physics, visualisation becomes a little bit of a problem: as I write this it’s been announced that Voyager moved out of our solar system, our neighbourhood, in August 2012. That it’s gone some 19,000,000,000,000 metres is incredibly difficult to visualise. I reckon I can probably manage something in the region of 800 metres, given that that’s twice around a standard athletics track.

The problem gets odder once we come across a thing that has more than one quantity associated with it. Wind is such a quantity: it includes both speed and direction. We talk of a strong South-Westerly (particularly in Britain) – that’s our English-language way of saying that there is a wind that is quite fast, coming from about 225 degrees (assuming North is 0 degrees). Such a concept is easy to visualise, mainly because we can experience it directly, but it is harder to reason about mathematically. That’s to say, it’s harder to treat the thing called wind, which is composed of two things, as one thing.

This is where the concept of vectors becomes useful. Vectors help us treat something that is made of more than one quantity as if it were one quantity. Schematically, they look something like this,

vector

So we have a as the direction of the vector, and the length of the arrow |v| as the magnitude – ie wind direction and wind speed respectively. Of course, when one wants to think of wind direction one must think of the complete compass face.

The main problem – and, to be fair, I’ve looked across the web for suitable ‘cookbook’ examples – is that it doesn’t seem at all clear how to encode speed and direction into a vector, and how to decode it back to speed and direction. So, I’m going to have a go.

We will formally define our vector as,

\textbf{w} = (u,v)

The vector notation in use is curiously variable. Some people put arrows on top of the algebraic symbol, \textbf{w}, some people use bold; some people even use bold and italics. I am opting for something simple – I will simply use bold. The reason (u,v) is chosen is convention in the meteorological literature. If you are trying to convert, for instance, output from a gridded weather model, these parameters will be expressed as (u,v).

It is important to note that (u,v) are not synonymous with speed and direction. That’s to say you can’t directly assign speed to u, and direction to v. Well, you could, but, unfortunately, that would be wrong. We have to encode the information into a vector so we can use it as one quantity. If we simply copied the details over, it would be a special class of something called an n-tuple – in this case a pair – which isn’t the same thing as a vector. It’s also the case that I’ve committed a cardinal sin and used v both for the vector in the schematic above and for one of the vector’s internal quantities. I will edit this at some point.

Well, fortunately, the transformation of information to and from vectors is actually reasonably easy. Assuming that we define all of our directions to lie within 0 to 360 degrees, and we call speed, s, and direction, d, we compute u and v as,

u = s \cdot \sin((d \cdot \frac{\pi}{180}) - \pi)

v = s \cdot \cos((d \cdot \frac{\pi}{180}) - \pi)

This is slightly different from other authors, primarily because of the -\pi inside the trigonometric functions; this, with a bit of care, keeps the transform the other way within the bounds of 0..360 degrees. It will always be within this range: even if, say, you have a direction of 380 degrees (which could happen as the result of some computation), the reverse process will return 20 degrees. It’s rather like adding 3 hours to 11pm: the answer isn’t 14pm, the answer is 2am. That’s just the effect we need.
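Here is a minimal Python sketch of that encoding (the function name and units are mine, not any standard library’s):

```python
import math

def wind_to_uv(speed, direction_deg):
    """Encode wind speed and direction (degrees, 'coming from',
    with 0 = North) into a (u, v) vector. The '- pi' matches the
    formulas above and keeps the reverse transform within 0..360."""
    rad = math.radians(direction_deg) - math.pi
    return speed * math.sin(rad), speed * math.cos(rad)

# A strong south-westerly: 15 m/s from 225 degrees
u, v = wind_to_uv(15.0, 225.0)   # u and v both come out at ~ +10.6
```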

From u and v we can get the original answer back (but keeping within our 0..360 bounds) by,

d = (\frac{180}{\pi}) \cdot atan2(v,u) + 180

s = \sqrt{u^2 + v^2}

Atan2 isn’t really a classical mathematical function; it’s a function born of the computer age, but it’s so widespread now that we might as well think of it as a mathematical function. You have to be careful when you use it, because implementations differ in the order of the parameters: spreadsheet-style ATAN2 takes the x-coordinate first, ie (v,u) as written above, whereas C-style atan2 takes the y-coordinate first, ie (u,v).
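And the reverse transform as a Python sketch, with the atan2 caveat made explicit – Python’s math.atan2 takes (y, x), so the components go in as (u, v) here, whereas a spreadsheet’s ATAN2 wants them as (v, u):

```python
import math

def uv_to_wind(u, v):
    """Decode a (u, v) wind vector back into (speed, direction),
    with direction in 0..360 degrees ('coming from', 0 = North).
    Note the argument order: Python's math.atan2 is atan2(y, x),
    so the call here is atan2(u, v); a spreadsheet-style ATAN2
    takes its arguments the other way round, as (v, u)."""
    speed = math.hypot(u, v)
    direction_deg = math.degrees(math.atan2(u, v)) + 180.0
    return speed, direction_deg

# Round trip using wind_to_uv() from the sketch above:
# uv_to_wind(*wind_to_uv(15.0, 380.0)) -> (15.0, 20.0)
# the out-of-range 380 degrees comes back as 20 degrees, as intended
```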

And there you have it. This doesn’t seem all that useful at first glance, since we can store wind as either u and v, or s and d – it really doesn’t seem to be more efficient to do one or the other, apart from, perhaps, that we can represent wind with one algebraic symbol. It might, at this juncture, seem to be simply a waste of time.

Fortunately, there’s a lot more to all this than this post, but that’s for another time.


Trends

My copy of the Concise Oxford English Dictionary defines a trend as

n. 1 a general direction in which something is developing or changing. 2 a fashion. v (especially of a geographical feature) bend in a specified direction, change or develop in a general direction.

Of course, we further define trend, in mathematics, as the direction a plot of data takes. More formally, we take the equation of a straight line,

y=mx+c

where y is the value we’re interested in, m is the gradient, which is how steep the line is, and c is the value where the line crosses the y-axis. For this post I will concentrate on m – the steepness (or otherwise) of the line; ie its trend.

On the face of it, m is pretty uninteresting: if it’s positive, the line goes upwards, and if it’s negative, the line goes downwards. The more interesting part is how we can take some time-series data and reduce it to a straight-line equation, thus deducing the trend of the data. This is a very commonplace technique, and it’s used widely across the blogosphere.

Possibly the most used technique for converting time-series data into a straight-line trend is linear regression using linear least squares. There’s loads of stuff around that shows you how to do it, including that Wikipedia page, and one of the most convenient methods is to use matrices. I would defer to a computer programmer sufficiently informed about numerical methods if you want to use the matrix method, since calculating the inverse can sometimes be numerically unstable when using floating-point types.

Anyway, it all boils down to computing two equations. The first one computes the intercept,

c = \frac{\sum{y_i}\sum{{x_i}^2} - \sum{x_i} \sum {x_i y_i}}{n \sum{{x_i}^2} - {\left( \sum{x_i} \right)}^2}

The second one computes the gradient,

m = \frac{n \sum {x_i y_i} - \sum{x_i} \sum {y_i}} {n \sum{{x_i}^2} - {\left( \sum{x_i} \right)}^2}
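For completeness, here’s that pair of formulas turned into a small Python function – a plain sketch; for real work you’d normally reach for a library routine instead:

```python
def linear_trend(xs, ys):
    """Ordinary least squares fit of y = m*x + c using the
    summation formulas above. Returns (m, c)."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    denom = n * sxx - sx * sx
    m = (n * sxy - sx * sy) / denom
    c = (sy * sxx - sx * sxy) / denom
    return m, c

# e.g. years and annual HadCRUT4 anomalies as two equal-length lists:
# m, c = linear_trend(years, anomalies)
```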

All very well and good. Let’s have a look at some climate data. I will use HadCrut4 as my source, with the data here. In the data, we are only interested in the year and the first column, which is the computed temperature anomaly.

This gives us quite a nice graphic when charted using Gnumeric,

hadcrut4

We don’t even need to go through the effort of using those equations to compute the linear trend; we can simply get Gnumeric to do it,

hadcrut4_trend

This chart even has the equation of the straight line we’ve reduced the data to (upper left). What’s actually going on here? Well, the data can be considered a composite,

y_i = t_i+s_i+n_i

which is effectively the same as saying the data is the trend + the seasonality + the noise (or error). Linear least squares effectively removes the seasonality and the noise, leaving only the linear trend line.

I’ve heard it asked, albeit not in a long time, why we can’t just get a ruler out and draw the line ourselves. Well, human beings are pretty bad at doing that, so we should really use the mathematics.

One might think: well, that’s that, then. The trend is up, the climate is warming, and it’s based on well-understood mathematics that is pretty much beyond reproach. Except that isn’t true.

Firstly, this series is an example of cherry-picking. I realise that that’s quite controversial given that I’ve used all the available data; but cherry-picking it is. The reason is quite simple – I’ve used all of the available data, but not all of the data. We know that climate goes back a long time before 1850, so really, I should be using any number of available climate series that go back as far as I can.

Except you can’t. The technique of linear least squares requires all of the data to be homogeneous. That’s to say the data is collected/observed in exactly the same manner. This clearly wouldn’t be true if we bolted on to the front a climate series produced from proxy data like tree-rings, and then switched to using the temperature record in 1850. I note that the HadCrut4 series might not be perfectly homogeneous, but if you read the paper linked, you’ll find that the boys and girls of the Hadley Centre have gone to extraordinary lengths, as they should, to mitigate this factor. Indeed, they’ve gone to such lengths that the HadCrut4 series might as well be homogeneous – and I will treat it as such.

Here’s another example of cherry-picking,

hadcrut4_cherrypick

This one is normally used with a straight line with no trend, or a slightly decreasing trend, to ‘show’ that the climate has ceased warming. But, as with the previous example, it isn’t just not using all the data, it isn’t even using all the available data. This is so bad that I didn’t bother to put a trend line on it (and I’m slightly embarrassed about even posting it).

Another point that’s often made is that linear least squares is robust. This doesn’t mean that it is such a strong technique that it evades criticism; it means that it successfully ‘ignores’ outliers. An outlier is a bit of data that is so far away from the general theme of things that it might be considered an incorrect measurement or suchlike. Here’s a made-up series (using Gnumeric’s RNG function),

random

Whilst there is a measurable trend, it isn’t significant, and, with a good random number generator, an infinite number of data points would leave a trend of precisely zero. I simply got as close as I could to a zero trend line. Now, hypothetically speaking, if we added, say, fifty outlier points – I will use Rand() + 1 to put them in – the trend should still be zero, because they are outliers, and do not express the overall position of the data. Here’s the chart,

hadcrut4_trend_outliers

There is now an appreciable trend. Even though it is a series of random numbers throughout, with some outliers, the outliers have forced in a downwards trend. This is not mathematically robust. If we add a fictional warm decade at the front of the HadCrut4 series, the effect is quite obvious,

hadcrut4_fictional_decade

In essence, that fictional decade has reduced the trend by half, even though it is just 6% of the series. This is certainly not mathematically robust!
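If you want to reproduce the flavour of these experiments without a spreadsheet, a sketch along the following lines shows the same lack of robustness – numpy rather than Gnumeric’s RNG, so the exact numbers will differ, and I’ve assumed the fifty outliers go at the start of the series, which is where they do the sort of damage described above:

```python
import numpy as np

rng = np.random.default_rng(42)

# A series of uniform random numbers: the underlying trend is zero
y = rng.random(500)
x = np.arange(y.size)
print(np.polyfit(x, y, 1)[0])      # slope: effectively zero

# Replace the first fifty points with outliers (rand + 1)
y_out = y.copy()
y_out[:50] = rng.random(50) + 1.0
print(np.polyfit(x, y_out, 1)[0])  # slope: now clearly negative

# The same trick works on real data: bolting a fictionally warm
# decade onto the front of HadCRUT4 drags the fitted trend down
```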

You might recall from plenty of posts throughout the blogosphere that linear trends are frequently used to show that the climate is warming; some will say, ‘but look at the trend, it’s up, therefore we are warming’. A cursory glance confirms this, but again, does it stand the test of experiment?

Well, unfortunately, the answer is no. If we pretend that, for some reason, the temperature record posts some 50 years of a −0.5 degree anomaly, would we still say that the climate is warming? We’d know that it wasn’t – we’d be cold, going hungry, and no doubt wars would be fought over important things like food, water, etc. What does the chart look like? Well, here it is,

hadcrut4_fictional_coldspell

The trend line is fairly flat, but if you look at the equation of the trend line (top-right) you’ll note that, technically, least squares regression is still reporting that the climate is warming.

Would you say in this case that the climate was still warming?

I didn’t think so.
