An example of misusing statistics.
A couple of days ago, a fellow user on another forum challenged me with the following claims while disputing whether the Earth is still warming:
"For RSS the warming is not significant for over 23 years.
For RSS: +0.127 +/- 0.136 C/decade at the two sigma level from 1990
For UAH, the warming is not significant for over 19 years.
For UAH: 0.143 +/- 0.173 C/decade at the two sigma level from 1994
For Hadcrut3, the warming is not significant for over 19 years.
For Hadcrut3: 0.098 +/- 0.113 C/decade at the two sigma level from 1994
For Hadcrut4, the warming is not significant for over 18 years.
For Hadcrut4: 0.095 +/- 0.111 C/decade at the two sigma level from 1995
For GISS, the warming is not significant for over 17 years.
For GISS: 0.116 +/- 0.122 C/decade at the two sigma level from 1996"
Looks like no global warming, right? I mean, it can't be statistically significant if the error terms are larger than the average rate of rise, right? Wrong. This is a prime example of lying by misusing statistics and hoping that the other person doesn't know enough to catch the lie. The lie in this case? Those error terms. They're far too large. I'll explain why, using the "GISS shows no significant warming since 1996" claim as an example.
It's well known that global temperature data show autocorrelation, wherein the temperature one month is correlated with the temperatures of previous months. Most of the statistics we use in science assume that there is no correlation between data points, a condition we call "white" noise. Autocorrelation means that the noise isn't white but red noise and that the standard errors and p-values calculated with standard statistics will be far too small. We compensate for autocorrelation using ARIMA to model the red noise and correct the standard errors and p-values. For an example of how to use ARIMA in regression models in R, see this post.
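To make that concrete, here is a minimal Monte Carlo sketch (in Python rather than R, with made-up values for the AR coefficient and series length, chosen only for illustration) showing that the usual white-noise standard error badly understates the true sampling spread of a trend estimate when the noise is red:

```python
import numpy as np

rng = np.random.default_rng(0)
n, phi, n_sims = 240, 0.6, 2000   # 20 years of monthly data; AR(1) coefficient (illustrative)
t = np.arange(n)
X = np.column_stack([np.ones(n), t])

slopes, naive_ses = [], []
for _ in range(n_sims):
    # Build AR(1) "red" noise with no trend at all.
    e = rng.standard_normal(n)
    y = np.empty(n)
    y[0] = e[0]
    for i in range(1, n):
        y[i] = phi * y[i - 1] + e[i]
    # Ordinary least squares fit, with the usual white-noise SE formula.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    s2 = resid @ resid / (n - 2)
    cov = s2 * np.linalg.inv(X.T @ X)
    slopes.append(beta[1])
    naive_ses.append(np.sqrt(cov[1, 1]))

true_spread = np.std(slopes)     # actual sampling spread of the slope estimates
claimed_se = np.mean(naive_ses)  # what OLS reports under the white-noise assumption
print(true_spread / claimed_se)  # well above 1: the white-noise SEs are too small
```

For AR(1) noise with coefficient φ the inflation is roughly √((1+φ)/(1−φ)), about a factor of 2 here, which is exactly the kind of underestimate the ARIMA/GLS correction exists to fix.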
From January 1996 to May 2013, GISS shows second-order red noise, with a best fit of ARIMA(2,0,0). Using generalized least squares (GLS) regression to incorporate the ARIMA model into the regression and compensate for the autocorrelation gives us this:
Generalized least squares fit by REML
Model: GISS.96 ~ time(GISS.96)
Data: NULL
AIC BIC logLik
-329.9748 -313.2871 169.9874
Correlation Structure: ARMA(2,0)
Formula: ~1
Parameter estimate(s):
Phi1 Phi2
0.4412208 0.2299560
Coefficients:
Value Std.Error t-value p-value
(Intercept) -21.047794 8.357168 -2.518532 0.0125
time(GISS.96) 0.010767 0.004169 2.582892 0.0105
Correlation:
(Intr)
time(GISS.96) -1
Standardized residuals:
Min Q1 Med Q3 Max
-2.62471410 -0.72152714 0.01100717 0.61391719 3.00939636
Residual standard error: 0.1307708
Degrees of freedom: 210 total; 208 residual
Converting the standard error to a 95% confidence interval, we get the following trend ± 95% confidence interval for GISS since 1996:
GISS: 0.10767 ± 0.08171ºC per decade
Note that the trend is statistically significant, with a p-value of 0.0105. Now, I don't know where my challenger got his "2 sigma" error term of ±0.122, but it's not from any sort of regression analysis: neither ordinary regression nor regression corrected for autocorrelation gives an error term anywhere near that large. My data are a bit more recent than his, which accounts for the small differences in the calculated warming rates, but that alone could not have inflated his error term to half again the proper size. So just where did he get it?
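The conversion from the GLS coefficient table to the quoted per-decade figure is just arithmetic, and it checks out. A quick sketch (the slope and standard error are in ºC per year; ×10 gives per decade, and ×1.96 is the normal-approximation multiplier for a two-sided 95% interval, which reproduces the quoted numbers):

```python
# Reproduce the GISS trend ± 95% CI from the GLS coefficient table.
slope, se = 0.010767, 0.004169          # per-year estimate and SE from the gls output
trend_per_decade = 10 * slope           # 0.10767
ci95_per_decade = 10 * 1.96 * se        # 0.08171
print(f"{trend_per_decade:.5f} ± {ci95_per_decade:.5f} ºC/decade")
```

The same arithmetic applies to the UAH and HadCRUT4 intervals below.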
It's not just his GISS error term that is problematic, either. UAH also shows autocorrelation of ARIMA(2,0,0) since his claimed start point of 1994. Plugging that into GLS shows this:
Generalized least squares fit by REML
Model: UAH.94 ~ time(UAH.94)
Data: NULL
AIC BIC logLik
-349.2611 -332.0274 179.6306
Correlation Structure: ARMA(2,0)
Formula: ~1
Parameter estimate(s):
Phi1 Phi2
0.6099461 0.2143155
Coefficients:
Value Std.Error t-value p-value
(Intercept) -29.943202 13.467363 -2.223390 0.0272
time(UAH.94) 0.015004 0.006721 2.232317 0.0265
Correlation:
(Intr)
time(UAH.94) -1
Standardized residuals:
Min Q1 Med Q3 Max
-2.33041950 -0.62247393 -0.03907016 0.48623254 3.49613665
Residual standard error: 0.1778559
Degrees of freedom: 234 total; 232 residual
The trend ± 95% confidence interval for UAH since 1994 is 0.15004 ± 0.13173ºC per decade, again far below his claimed error of ±0.173. Oh yes, and it's statistically significant as well, despite his claim.
The only one he came close to getting right? HadCRUT4 shows autocorrelation of ARIMA(1,0,1) since 1995. GLS analysis incorporating that ARIMA model shows the following:
Generalized least squares fit by REML
Model: HadCRUT4.95 ~ time(HadCRUT4.95)
Data: NULL
AIC BIC logLik
-395.5914 -378.6461 202.7957
Correlation Structure: ARMA(1,1)
Formula: ~1
Parameter estimate(s):
Phi1 Theta1
0.8800097 -0.4207543
Coefficients:
Value Std.Error t-value p-value
(Intercept) -15.566915 10.54182 -1.476682 0.1412
time(HadCRUT4.95) 0.007982 0.00526 1.517508 0.1306
Correlation:
(Intr)
time(HadCRUT4.95) -1
Standardized residuals:
Min Q1 Med Q3 Max
-2.241658378 -0.671252575 0.008621077 0.593698960 2.877875246
Residual standard error: 0.1306724
Degrees of freedom: 221 total; 219 residual
The trend ± 95% confidence interval this time is 0.07982 ± 0.10310ºC per decade. This is, quite frankly, the only error term he states where my analysis agrees with his: HadCRUT4 just doesn't show a statistically significant trend since 1995. Start from 1994, however, and the noise becomes ARIMA(2,0,0), as GISS and UAH show over their time frames, and that trend IS statistically significant (0.12089 ± 0.08283ºC per decade, p = 0.0046).
While I'm leaving HadCRUT3 and RSS out for now, there are good reasons for doing so. HadCRUT3 has been replaced by HadCRUT4, making analyzing both HadCRUT3 and HadCRUT4 redundant. RSS shows false cooling since 2000 due to orbital decay, as Roy Spencer pointed out two years ago, making it unreliable. However, what I have done should give the general idea. His error terms are generally larger than they should be.
I don't know exactly where he's getting his error terms (he's so far refused to divulge that information), but I suspect that he's (mis)using the trend calculator at Skeptical Science. Under its Advanced Options there's an option to correct for autocorrelation, with a default start date of 1980 and a default end date of 2010. I suspect that he's calculating the trend for different time periods while leaving the autocorrelation correction at that default period. That's invalid because, just as the trend changes depending on the time period, the best-fit autocorrelation model changes depending on the time period. You cannot estimate the autocorrelation in GISS for 1980–2010 and mindlessly apply it to the trend for 1996–2013. If I'm correct about the source of his statistics, then what this really shows is the danger of using online tools when you don't understand how the underlying statistics work. And my challenger doesn't understand: he didn't even know what an ARIMA model was.
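To see why the choice of period matters, note that under the simple AR(1) approximation the autocorrelation correction inflates standard errors by √((1+ρ)/(1−ρ)), where ρ is the lag-1 autocorrelation estimated from the chosen window. A small sketch with made-up ρ values (purely illustrative; these are not fitted to any dataset) shows how sensitive the correction is to which window ρ comes from:

```python
import math

def se_inflation(rho):
    """Factor by which a white-noise SE must be inflated under AR(1) noise."""
    return math.sqrt((1 + rho) / (1 - rho))

rho_window_a = 0.5   # hypothetical best fit over one period, e.g. 1980-2010
rho_window_b = 0.7   # hypothetical best fit over a shorter, different period
print(se_inflation(rho_window_a))  # ~1.73
print(se_inflation(rho_window_b))  # ~2.38
```

A modest shift in the estimated ρ changes the error bars substantially, which is exactly why the autocorrelation must be refit to the same period as the trend.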
Article edited on Aug. 23, 2013 to reflect new information.