Lab 6: Spatial Regression

Due Feb 29

 

  1. Use your BG_GF_Census table from last week (if you still have it in S-Plus you renamed the table BG.GF.Census2). If you’re not sure which one it is, go back to NR245\download.mdb and download BG_GF_Census2 (this is the same as BG_GF_Census, but it has the stream density variable).

 

  2. Making a neighbor matrix: Open S-Plus and use file>>import data>>from database to import your BG.GF.Census layer. Now enable the S-Plus spatial module (file>>load module>>spatialstats). You should now see a Spatial item appear in the menu bar. Click on Spatial>>neighbors. Choose Nearest neighborhoods as the source, BG.GF.Census2 as the data set, X Centroid as variable 1 and Y Centroid as variable 2. Set the number of neighbors to 3 and the metric to Euclidean. Under Save in, type BGneighbor3. This will create the neighbor matrix used to spatially weight observations in the spatial regression.
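To make concrete what a 3-nearest-neighbor matrix holds, here is a minimal Python sketch. It is not what S-Plus runs internally; the function name knn_neighbors and the centroid coordinates are made up for illustration.

```python
import math

def knn_neighbors(coords, k=3):
    """For each point index, return the indices of its k nearest points (Euclidean)."""
    neighbors = {}
    for i, (xi, yi) in enumerate(coords):
        # Distance from point i to every other point; a point is not its own neighbor.
        dists = [(math.hypot(xi - xj, yi - yj), j)
                 for j, (xj, yj) in enumerate(coords) if j != i]
        dists.sort()
        neighbors[i] = [j for _, j in dists[:k]]
    return neighbors

# Made-up centroids for illustration only (not the lab data).
centroids = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5)]
nbrs = knn_neighbors(centroids, k=3)
print(nbrs[0])  # the three points closest to point 0
print(nbrs[3])  # the three points closest to point 3
```

Note that nearest-neighbor relationships are not necessarily symmetric: in this toy data, point 3 is one of point 0's three nearest neighbors, but point 0 is not one of point 3's.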

 

  3. Linear regression: Go to statistics>>regression>>linear. Choose BG.GF.Census2 as the data set, and input the following model:

P.coarseveg~H2Odens+P.HS.+MED.HH.INC+P.SFDH+P.Protland+log(d2down)+d2ramp

Under “result”, check residuals and choose to save the residuals in BG.GF.Census2. Hit apply, then in the table for BG.GF.Census2, right click on the heading of the newly created field “residuals” and hit properties. Change the Name to resid1. [Q1] Copy and paste the regression results into your document and put asterisks by the significant variables (* for 95% and ** for 99%).
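The asterisk convention (* for 95%, ** for 99%) corresponds to p-value cutoffs of 0.05 and 0.01. A minimal Python sketch of that rule follows; the function name and the p-values are placeholders, not results from this lab.

```python
def significance_stars(p_value):
    """Return '**' for p < 0.01, '*' for p < 0.05, and '' otherwise."""
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""

# Placeholder p-values for illustration only; read the real ones from your output.
example_p = {"term_a": 0.003, "term_b": 0.04, "term_c": 0.20}
for term, p in example_p.items():
    print(f"{term:8s} p={p:.3f} {significance_stars(p)}")
```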

 

  4. Spatial regression: Now go to spatial>>spatial regression. Using the same data set, input the same model as above, choose SAR as the covariance type, choose BGneighbor3 as the neighbor object, and under the results tab check “residuals” and choose to save them in your data table. Click OK. Chances are it won’t work. This is because you have an NA value in the H2Odens variable. Look for that NA and replace it with a 0. Now rerun. It should work. [Q2] Copy and paste the regression results with asterisks again (the beginning of the output only, not including the variance-covariance matrix of coefficients or the correlation matrix). [Q3] What are the differences, if any, between the coefficients of this version and those of the regular, non-spatial regression from step 3? Open the BG.GF.Census2 table, right click on the new residuals field heading and click properties. Change the name to resid2.
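The NA fix in this step is done by hand in the S-Plus data table. Purely as an illustration of the same operation, this Python sketch finds missing entries in a column and replaces them with 0; the function name and the values are invented.

```python
import math

def replace_missing(values, fill=0.0):
    """Return a copy of values with None/NaN replaced by fill, plus the affected row positions."""
    cleaned, missing_rows = [], []
    for row, v in enumerate(values):
        is_missing = v is None or (isinstance(v, float) and math.isnan(v))
        if is_missing:
            missing_rows.append(row)
            cleaned.append(fill)
        else:
            cleaned.append(v)
    return cleaned, missing_rows

# Made-up stream density values with one missing entry (not the lab data).
h2odens = [0.8, 1.2, float("nan"), 0.4]
cleaned, missing_rows = replace_missing(h2odens)
print(missing_rows)  # positions that held a missing value
print(cleaned)       # same column with the gap filled by 0
```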

 

  5. Moran test: Now you’ll see why we did this. Go to spatial>>spatial correlations, choose BG.GF.Census2 as the data set, resid1 as the variable, BGneighbor3 as the neighborhood matrix and Moran as the statistic. Click apply. Now do the same thing for the variable resid2. [Q4] Report the Moran statistic and P value for both tests. Are they different? What does this tell you about spatial regression?
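For intuition about what the Moran statistic measures, the sketch below computes global Moran's I with row-standardized neighbor weights, plus a one-sided permutation p-value. This is an assumption-laden illustration: S-Plus may weight neighbors and compute its P value differently, and the data and function names here are invented.

```python
import random

def morans_i(values, neighbors):
    """Global Moran's I; each observation's neighbors share a total weight of 1."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    cross = 0.0  # weighted sum of products of neighboring deviations
    s0 = 0.0     # sum of all weights
    for i, nbrs in neighbors.items():
        if not nbrs:
            continue
        w = 1.0 / len(nbrs)
        for j in nbrs:
            cross += w * z[i] * z[j]
            s0 += w
    return (n / s0) * cross / sum(zi * zi for zi in z)

def permutation_p_value(values, neighbors, n_perm=999, seed=0):
    """Share of random reshufflings whose Moran's I is at least the observed value."""
    rng = random.Random(seed)
    observed = morans_i(values, neighbors)
    shuffled = list(values)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if morans_i(shuffled, neighbors) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Toy example: five values along a line, each linked to its adjacent positions.
toy_values = [1, 2, 3, 4, 5]
toy_neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
toy_i = morans_i(toy_values, toy_neighbors)
toy_p = permutation_p_value(toy_values, toy_neighbors)
print(toy_i, toy_p)
```

Similar values sitting next to each other push I above zero, which is the pattern to look for when comparing resid1 with resid2.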

 

  6. View in Arc Map: Export the two residual columns plus the BKG_KEY1 column from S-Plus using file>>export data>>to database. Choose MS Access database as the To Data Target, your data table as the From Data frame, type residsp as the Table Name, and under the filter tab, shift click to select BKG.KEY, resid1, and resid2. Click OK. This should now be a new table in your geodatabase. In Arc Map, load any block group layer (BG_GF_Census) and the new table, and do a tabular join to join that table to the block group layer. Then go to toolbox>>spatial statistics and run Cluster and Outlier Analysis (Local Moran) on each residual field. Choose BG_GF_Census as the input, resid1 as the input field, save your output feature class as BGresid1LISA in a geodatabase, and choose “inverse distance” as the “conceptualization”, “Euclidean distance” as the distance method, “row” as the standardization and 2000 as the distance band. Look at the output. Then run another Local Moran with all the same settings except that the input field is resid2. The output should have the HH, LL, HL, LH symbology. If you want to see both maps on the same layout, create a second data frame (insert>>data frame) and copy and paste one of the layers into the second data frame. Then go to layout view and adjust the two map frames so they’re roughly equal size (this is optional). Take a screen capture and caption it. Do the same for resid2. [Q5] Briefly describe the difference between the two maps. Why does this difference make sense in light of the fact that one set of residuals is from a standard OLS regression and the other is from a spatially adjusted regression?
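The HH, LL, HL and LH labels compare each feature's own value with the average of its neighbors. The Python sketch below shows only that labeling logic, under simplifying assumptions: equal neighbor weights and no significance test, whereas the ArcGIS tool labels only statistically significant features. The data and function name are invented.

```python
def lisa_quadrants(values, neighbors):
    """Label each observation by (own deviation, neighbors' mean deviation): HH, LL, HL or LH."""
    mean = sum(values) / len(values)
    z = [v - mean for v in values]
    labels = []
    for i in range(len(values)):
        nbrs = neighbors.get(i, [])
        # Spatial lag: average deviation of the neighbors (0 if there are none).
        lag = sum(z[j] for j in nbrs) / len(nbrs) if nbrs else 0.0
        own = "H" if z[i] > 0 else "L"
        around = "H" if lag > 0 else "L"
        labels.append(own + around)
    return labels

# Made-up residuals and neighbor lists for illustration only.
toy_resid = [10, 9, 1, 2, 8]
toy_nbrs = {0: [1], 1: [0], 2: [3], 3: [2], 4: [2, 3]}
toy_labels = lisa_quadrants(toy_resid, toy_nbrs)
print(toy_labels)
```

In this toy data the last observation is a high value surrounded by low values, so it is labeled HL (an outlier), while the others fall into HH or LL clusters.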

 

  7. Second example: Now let’s try this with a new layer. Import your sample_props (not sample_props2) dataset from before (if you don’t have it, copy sample_props from NR245/lab_data.mdb). Load it into S-Plus using file>>import data>>from database. Now run the following model as a regular regression (statistics>>regression>>linear):

price~NFMIMPVL+ACRES+SQFTSTRC+YEAROLD+TREES.PER+DWTWN.DIST+INSTE.DIST

Under the Results tab, make sure to save the residuals in your sample.props table. After doing this, right click on the residuals column heading and change the name to resid1. [Q6] Copy and paste the coefficient table and add asterisks. Now create a neighbor matrix for this layer. Go to spatial>>spatial neighbors, choose sample.props as the data set, X as variable 1, Y as variable 2, and 5 as the number of neighbors (we’re choosing more because housing points are closer together than block groups), and choose to save the matrix as propneighbor1. Now run a spatial regression (spatial>>spatial regression), with the model just given as the formula, propneighbor1 as the neighbor object, SAR as the covariance type, and under the results tab choose to save the residuals in sample.props. [Q7] Copy and paste the coefficient results with asterisks and describe what is now not significant that was significant before. You should see one change in the significance of a coefficient. Explain why that might make sense in this case. Rename the column heading for the new residuals to resid2. Now, using the instructions from earlier, run a Moran’s I analysis in S-Plus for both sets of residuals. [Q8] Report the results and state whether one or both sets of residuals are autocorrelated.
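When comparing the two coefficient tables for Q7, it can help to list which terms cross the significance threshold. A minimal Python sketch of that comparison follows; the function name, term names and p-values are placeholders and do not come from this model.

```python
def significance_changes(p_ols, p_sar, alpha=0.05):
    """Return terms whose significance at level alpha differs between the two models."""
    changes = {}
    for term in p_ols:
        was_sig = p_ols[term] < alpha
        now_sig = p_sar[term] < alpha
        if was_sig != now_sig:
            changes[term] = (
                "significant" if was_sig else "not significant",
                "significant" if now_sig else "not significant",
            )
    return changes

# Placeholder p-values for illustration only; substitute the values from your output.
p_ols_example = {"term_a": 0.001, "term_b": 0.03}
p_sar_example = {"term_a": 0.002, "term_b": 0.12}
changed = significance_changes(p_ols_example, p_sar_example)
print(changed)
```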

 

  8. Display in Arc Map: Export using “Export to database” (this may require refreshing the interface by rechoosing MS Access Database as the Data Target and sample.props as the data frame), choosing only resid1, resid2, and PROPID as columns. Load that table in ArcGIS and join it to sample_props. Then run Local Moran on both resid1 and resid2. Use “Fixed distance band”, no standardization, Euclidean distance and a 200 m distance band (houses are at a much smaller scale). Take a screen capture of each and caption it. The differences might be subtle, but you should see some visible differences, particularly for one of the four categories (HH, LL, HL or LH). Now quantify this difference by getting a count of how many points there are in each category for each layer. The easiest way to do this is to open the symbology window for each layer and click on the “count” heading. [Q9] Report the count of each category for both resid1 and resid2. Which group (not including “not significant”) had the biggest change between the two maps? Does the change in count make sense and if so, why?
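The count comparison for Q9 can also be written out as a short calculation. This Python sketch assumes you have one category label per point for each map; the labels here are invented, and "NS" is a made-up stand-in for “not significant”.

```python
from collections import Counter

CATEGORIES = ("HH", "LL", "HL", "LH")

def biggest_change(labels_1, labels_2):
    """Count each category in both maps and find the category whose count changed most."""
    counts_1, counts_2 = Counter(labels_1), Counter(labels_2)
    diffs = {cat: counts_2[cat] - counts_1[cat] for cat in CATEGORIES}
    top = max(CATEGORIES, key=lambda cat: abs(diffs[cat]))
    return counts_1, counts_2, diffs, top

# Invented labels for illustration only (not results from the lab data).
resid1_labels = ["HH", "HH", "HH", "LL", "NS", "HL"]
resid2_labels = ["HH", "NS", "NS", "LL", "NS", "HL"]
counts_1, counts_2, diffs, top = biggest_change(resid1_labels, resid2_labels)
print(diffs)
print(top)
```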

 

  9. Assemble materials in document and upload.