Lab 6: Spatial Regression
NOTE: if you’re having trouble getting the right results (i.e. the
linear regression has significant spatial autocorrelation in the residuals and
the spatial regression does not) because of data issues, you can download the
correct version of the data, including the H2Odens variable, at Data_2006\NR245\NR245_backup.mdb\BG_GF_LC_census2
- Making
a neighbor matrix: Open S-Plus
and use file>>import data>>from database to import your
BG.GF.census2 layer. Now enable the S-Plus spatial module (file>>load
module>>spatial). You should now see a spatial menu item appear in
the menu bar. Click on spatial>>spatial neighbors. Choose Nearest neighborhoods
as the source, BG.GF.Census2 as the data set, X Centroid as variable 1 and
Y Centroid as variable 2. Keep the number of neighbors at 3 and the metric
as Euclidean. Under Save in, type BGneighbor3. This will create the
neighbor matrix used to spatially weight observations in the spatial
regression.
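To make the idea of a nearest-neighbor matrix concrete, here is a minimal sketch in Python (not S-Plus, and not necessarily how the spatial module builds its neighbor object). The function name and the centroid coordinates are made up for illustration; the logic simply finds, for each point, the 3 other points with the smallest Euclidean distance.

```python
import math

def k_nearest_neighbors(points, k=3):
    """Return a dict mapping each point index to the indices of its k nearest other points."""
    neighbors = {}
    for i, (xi, yi) in enumerate(points):
        # Euclidean distance from point i to every other point j
        dists = [(math.hypot(xi - xj, yi - yj), j)
                 for j, (xj, yj) in enumerate(points) if j != i]
        dists.sort()
        neighbors[i] = [j for _, j in dists[:k]]
    return neighbors

# Hypothetical centroids (x, y); not taken from the lab data
centroids = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
nbrs = k_nearest_neighbors(centroids, k=3)
```

Note that nearest-neighbor relationships need not be symmetric: in this toy data, point 3 is one of the three nearest neighbors of point 0, but point 0 is not one of the three nearest neighbors of point 3.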
- Linear
regression: Go to
statistics>>regression>>linear. Choose BG.GF.Census2 as the
dataset, and input the following model:
P.coarse.veg~
H2Odens+P.HS.+MED.HH.INC+P.SFDH+P.Protland+log(d2down)+d2ramp. Under “result” check residuals and choose
to save residuals in BG.GF.Census2. Hit apply and then, in the table for
BG.GF.Census2, right click on the heading of the newly created field “residuals” and
hit properties. Change the Name to resid1. Copy and paste the regression
results into your document.
- Spatial
regression: Now go to
spatial>>spatial regression. Choose BG.GF.Census2 as the data set,
input the same model as above, choose SAR as the covariance type, choose
BGneighbor3 as the neighbor object and under the results tab check
“residuals” and choose to save in BG.GF.Census2. Click OK. Copy and paste
the regression results (at the beginning of the output, not including the
variance-covariance matrix of coefficients or correlation matrix). Note
the differences, if any, between the coefficients of this version and the
regular, non-spatial regression from the previous step. Open the BG.GF.Census2 table
and right click on the new residuals field heading and click properties.
Change the name to resid2.
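For reference, a simultaneous autoregressive (SAR) error model is commonly written in the form below. This is the general textbook form, given here as an assumption about what the SAR covariance option fits; the exact parameterization used by S-Plus may differ. Here W stands for the weights derived from the neighbor matrix (BGneighbor3 in this lab), and rho is the spatial dependence parameter.

```latex
\begin{aligned}
y &= X\beta + u, \\
u &= \rho W u + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2 I)
\end{aligned}
```

When rho equals 0 the error term has no spatial structure and the model reduces to ordinary linear regression.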
- Moran
test: Now you’ll see why we
did this. Go to spatial>>spatial correlations, choose
BG.GF.Census2 as the data set, resid1 as the variable, BGneighbor3 as
the neighborhood matrix and moran as the statistic. Click apply. Report
the Moran statistic and P value. Now do the same thing but for the
variable resid2. Again report the Moran statistic and P value. Are they
different? What does this tell you about spatial regression?
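As a rough illustration of what the Moran statistic measures, the sketch below computes Moran's I in Python with binary neighbor weights (weight 1 if j is a neighbor of i, otherwise 0). This is a simplified textbook version under that assumption, not the S-Plus implementation, and it does not compute a P value. The neighbor structure and values are invented for the example.

```python
def morans_i(values, neighbors):
    """Moran's I with binary weights; neighbors maps index i to a list of neighbor indices."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]  # deviations from the mean
    # Sum of cross-products of deviations over all (i, j) neighbor pairs
    cross = sum(z[i] * z[j] for i, js in neighbors.items() for j in js)
    s0 = sum(len(js) for js in neighbors.values())  # total weight
    denom = sum(zi * zi for zi in z)
    return (n / s0) * cross / denom

# Hypothetical: six locations along a line, each linked to its adjacent locations
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
clustered = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]     # similar values sit together
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]   # neighbors always differ

i_clustered = morans_i(clustered, chain)
i_alternating = morans_i(alternating, chain)
```

In this toy data the clustered values give a positive statistic and the alternating values give a negative one, which is the contrast to look for when comparing resid1 and resid2.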
- View
in Arc Map: Export the two
residual columns plus the BKG.KEY column from S-Plus using
file>>export data>>to database. Choose MS Access database as
the To Data Target, BG.GF.Census2 as the From Data frame, type residsp as
the Table Name, and under the filter tab, shift-click to select
BKG.KEY, resid1, and resid2. Click OK. This should now be a new table in
your geodatabase. In Arc Map, load up any block group layer, such as
BG.GF.Census2, and the new table, and do a tabular join to join that table
to the block group layer. Go to the symbology window and choose
quantities>>graduated color, with resid1 as the value. Hit classify
and choose standard deviation as the method. Click OK. Back in the
symbology window choose any color ramp, then click OK. Take a screen capture and
caption it. Do the same for resid2.
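The idea behind a standard deviation classification can be sketched in Python as follows. Each value is converted to its distance from the mean in standard-deviation units and then assigned to a class. The break points used here (at plus or minus 0.5 and 1.5 standard deviations) and the residual values are assumptions for illustration; the breaks Arc Map chooses may differ.

```python
import bisect
import statistics

# Assumed class boundaries, in standard-deviation units from the mean
BREAKS = [-1.5, -0.5, 0.5, 1.5]

def sd_classes(values):
    """Return a class index (0 = lowest class) for each value."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    # bisect_right counts how many break points lie at or below the z-score
    return [bisect.bisect_right(BREAKS, (v - mean) / sd) for v in values]

resids = [-3.0, -1.0, 0.0, 1.0, 3.0]  # made-up residuals
classes = sd_classes(resids)
```

With these breaks, class 2 is the middle class (within half a standard deviation of the mean), so lower class numbers mark unusually negative residuals and higher ones unusually positive residuals.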
- Second
example: Now let’s try this
with a new layer. On the share drive go to Data_2006\Database\Analysis\Analysis.mdb
and copy and paste the sample_props feature class to your nr245 geodatabase.
If you view it in Arc Map you’ll notice it’s a point layer of properties
with a bunch of variables. Load that table into S-Plus using
file>>import data>>from database. Now run the following model
as a regular regression (statistics>>regression>>linear):
price~NFMIMPVL+ACRES+SQFTSTRC+YEAROLD+TREES.PER+DWTWN.DIST+INSTE.DIST
Under the Results tab, make sure to save residuals in your sample.props
table. After doing this, right click on the residuals column heading and change
the name to resid1. Copy and paste the coefficient table and briefly note which
variables are significant or not significant. Now, create a neighbor matrix for
this layer. Go to spatial>>spatial neighbors, choose sample.props as the data set, X as
variable 1, Y as variable 2, 5 as the number of neighbors (we’re choosing more
because housing points are closer together than block groups) and choose to
save the matrix as propneighbor1. Now do a spatial regression
(spatial>>spatial regression), with the model just given as the formula,
propneighbor1 as the neighbor object, SAR as the covariance type, and under the
results tab choose to save the residuals in sample.props. Copy and paste the coefficient
results and describe what is now significant and how that differs from the
previous result. You should see one change in significance of a coefficient.
Explain why that might make sense in this case. Rename the column heading for
the new residuals to resid2. Now, using the instructions from earlier, in S-Plus
run a Moran’s I analysis for both sets of residuals and report whether one or both are autocorrelated.
Does this result make sense?
- Display
in Arc Map: Export using
“Export to database” (this may require refreshing the interface, by
rechoosing MS Access Database as a Data Target, and sample.props as the
data frame), choosing only resid1, resid2, and PROPID as columns. Load
that table in ArcGIS and join it to sample_props. Before plotting, get rid of
the outline. Do this by right clicking in the classification window and
hitting “properties for all symbols”.
In the resulting window, double click on
the point shown under “preview” and in the subsequent window uncheck “use
outline.” Click OK twice. Then, back in the symbology window (where all
points should be the same color) hit the classification button and choose
the standard deviation classification method. Use whatever color ramp you
want. Take a screen capture and caption it. Do this again for resid2. Interpret what these maps are showing.
What do negative or positive values mean? Which appears to have more
clustering of high and low values in space? Which appears more randomly
distributed? What does it mean from a statistical perspective if
residuals of similar value tend to be near each other?
- Assemble
materials in document and upload.