R语言检验空间自相关Moran's I和运行SAR模型

空间位置越接近的点,相应的变量也可能越相近,这种现象称为空间自相关。空间自相关能够改变人们对某些事物原因的判断。举例来说,一个地点物种数高,临近的点物种数也相应得会高些,但这并不一定是由于两个地点的环境条件都十分优越,而纯粹是由于两个地点之间在空间上更为接近。点与点之间空间关系的依赖性使得各点之间数据并不是独立的。

为此,需要从空间关系检验空间自相关是否存在。空间自相关的检验,用Moran’s I 表示。 Moran’s I 与Pearson相关系数的算法类似,不过其计算时相关程度考虑的是点与点本身。在R的ape程序包中,有关于Moran’s I的详细的说明。为了检验Moran’s I的显著性,空间统计学家运动Randomization的方法,获得Moran’s I的零分布,继而求的各距离段中的Moran’s I是否显著偏离于零分布。

为了尽量去除空间自相关的影响,统计学家开发出了空间自回归模型,SAR(Spatial Auto Regressive Model),该模型在R的spdep中能够较为方便的实现。当然,也还有众多的程序包,如宏生态学数据分析的SAM程序包等。

以下是在spdep程序包中如何计算和检验Moran’s I的详细过程,以及调用SAR模型,进行相应的统计推断等,供感兴趣的老师同学参考。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
library(spdep)
## Loading required package: sp
## Loading required package: Matrix
## Loading required package: spData
library(maps)
library(mapproj)
library(spdep)
library(RANN)
library(mapdata)

map("china", col = "gray40", ylim = c(18,54), xlab ="Longitude", ylab
="Latitude", main ="Sampling sites")

map.axes()
map.scale()## 为了检验nsp是否具有空间自相关
lat <-sort(runif(50, min=24, max=35), decreasing =TRUE)

long <-runif(50, min=95, max=120)

nsp <-sort(round(rnorm(50, 800, 200)))

Pre <-sort(rnorm(50, 1500, 300))

Elev <-sort(rnorm(50, 1000, 200), decreasing =TRUE)

Time <-factor(rep(c(1, 2, 3), c(12, 18, 20)))

test0 <- data.frame(long, lat, nsp, Pre,Elev, Time)

points(long, lat, cex =exp(nsp/mean(nsp)), col = "red")

nsp <- test0$nsp

## 将test数据集转换成Spatial格式
test <- test0[,c(1,2)]
sptest <- SpatialPoints(test, proj4string = CRS("+proj=longlat +datum=WGS84"))

## 计算每个点最近的几个neighbour(这里k = 1,表示只计算一个的)
nbk1 <- knn2nb(knearneigh(sptest, k =5, longlat =TRUE))

## 将nbk1转换成 spatial weight linkage object 对象
snbk1 <- make.sym.nb(nbk1)

### n.comp.nb() finds the number of disjoint connected subgraphs
### in the graphdepicted by nb.obj - a spatial neighbours list object.
### 查看每个点不相接的相邻点数量
n.comp.nb(snbk1)$nc
## [1] 1
### 查看各点链接情况
plot(nb2listw(snbk1), cbind(test$long, test$lat), add =TRUE)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
### Moran's Test检验该数据集是否存在显著的空间自相关
### Moran's I testunder randomisation

moran.test(nsp, nb2listw(snbk1))
##
## Moran I test under randomisation
##
## data: nsp
## weights: nb2listw(snbk1)
##
## Moran I statistic standard deviate = 8.8677, p-value < 2.2e-16
## alternative hypothesis: greater
## sample estimates:
## Moran I statistic Expectation Variance
## 0.671936139 -0.020408163 0.006095733
### Moran's ICorrelograms

### par(mfrow = c(1,3))

nsp.Moron.I <- sp.correlogram(snbk1, nsp, order=6, method="I", zero.policy =TRUE)

plot(nsp.Moron.I)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
### SAR model
### Saddlepointapproximation for global Moran's I (Barndorff-Nielsen formula)
lm.morantest.sad(lm(nsp~1),nb2listw(snbk1))
##
## Saddlepoint approximation for global Moran's I (Barndorff-Nielsen
## formula)
##
## data:
## model:lm(formula = nsp ~ 1)
## weights: nb2listw(snbk1)
##
## Saddlepoint approximation = 6.7733, p-value = 6.292e-12
## alternative hypothesis: greater
## sample estimates:
## Observed Moran I
## 0.6719361
## sacsarlm
COL.sacW.eig <- sacsarlm(nsp ~ Pre + Elev +factor(Time),
data= test0, nb2listw(snbk1, style="W"))

summary(COL.sacW.eig,correlation=TRUE)
##
## Call:sacsarlm(formula = nsp ~ Pre + Elev + factor(Time), data = test0,
## listw = nb2listw(snbk1, style = "W"))
##
## Residuals:
## Min 1Q Median 3Q Max
## -103.204 -18.260 2.159 19.739 52.779
##
## Type: sac
## Coefficients: (asymptotic standard errors)
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 637.31705 298.77823 2.1331 0.032918
## Pre 0.33743 0.10993 3.0697 0.002143
## Elev -0.43635 0.16205 -2.6927 0.007088
## factor(Time)2 -23.80507 18.68448 -1.2741 0.202644
## factor(Time)3 -8.52493 30.23387 -0.2820 0.777970
##
## Rho: 0.16737
## Asymptotic standard error: 0.06605
## z-value: 2.5339, p-value: 0.011279
## Lambda: -0.12972
## Asymptotic standard error: 0.26622
## z-value: -0.48726, p-value: 0.62607
##
## LR test value: 5.9974, p-value: 0.049852
##
## Log likelihood: -243.6922 for sac model
## ML residual variance (sigma squared): 994.25, (sigma: 31.532)
## Number of observations: 50
## Number of parameters estimated: 8
## AIC: 503.38, (AIC for lm: 505.38)
##
## Correlation of coefficients
## sigma rho lambda (Intercept) Pre Elev factor(Time)2
## rho -0.04
## lambda 0.06 -0.29
## (Intercept) 0.00 0.04 -0.01
## Pre 0.01 -0.27 0.08 -0.94
## Elev 0.00 -0.11 0.03 -0.99 0.92
## factor(Time)2 0.01 -0.25 0.07 0.27 -0.37 -0.17
## factor(Time)3 0.01 -0.31 0.09 0.24 -0.36 -0.11 0.87