Lab 1 (Week 1)

Summarizing land cover by Block Group, Histogram, Box Plot and ANOVA

 Due Jan 24


  1. Please sign your data agreement!!
  2. Map a network drive to \\zoofiles\gradgis.
  3. Open ArcCatalog. In ArcCatalog, create a new folder on your Z drive called nr245. Also in ArcCatalog, copy the Gwynn’s Falls land cover layer (Data_2010\Database\LULC\lulc_gf_6clsr_1999) into your folder. In that folder, create a new personal geodatabase (.mdb) called NR245. Copy the Gwynn’s Falls watershed (GFW) boundary (NR245\lab_data.mdb\Watershed_GF) and the census block groups (NR245\lab_data.mdb\PRIZM_2003_MSA) to that geodatabase. To do so in ArcCatalog, right click on each layer, click “copy,” then right click on the geodatabase (not the feature dataset within the geodatabase) and click “paste.”
  4. Load the land cover layer and import the symbology (.lyr) file in the symbology window for the layer. Zoom in and explore the data.
  5. Do an intersection of the block group layer and the GFW boundary. MAKE SURE TO SAVE IT AS A FEATURE CLASS IN YOUR NEW GEODATABASE. That is, click the little folder icon in the geoprocessing wizard and, when you specify the output, choose “geodatabase feature class” as the type in the “save as” drop-down box and save it within your existing geodatabase. Call it something like BG_GF_inter. The reason for saving it in the geodatabase is that polygon feature classes created in a geodatabase get an area field generated automatically. Also, we’ll be doing lots of table editing in Access, which can be done with geodatabases but not with shapefiles, unless they are converted.
  6. Do Select by Attributes on BG_GF_inter to find all the really small sliver polygons created by the intersection. Use Shape_Area<20000 (the units of this and most other layers are meters, so areas are in square meters). Then delete these polygons by starting the editor (editor>>start editing), making sure to choose your NR245 geodatabase as the editable space. Then click delete. All the selected polygons will disappear. Click Stop Editing and choose to save. Note that the area values should have been generated automatically, because this is a geodatabase.
  7. Next, in preparation for tabulating areas (which involves a tabular join), we need to make sure we have a reliable join field. Create a new field in the table called join1 (set to integer). Use the field calculator to set it equal to ObjectID_1 (or, if you don’t have ObjectID_1, to ObjectID).
  8. Now tabulate each of these six cover types by block group. To do this, open ArcToolbox and click on Spatial Analyst Tools>>Zonal>>Tabulate Areas. Choose BG_GF_inter as the input raster or zone data layer, join1 as the zone field, gf6cls as the input raster data set and value as the class field. Save the output table in nr245.mdb.
  9. Now do a tabular join (right click on BG_GF_inter>>join; then choose “join attribute from table” with join1 as the join field and the tabulate areas table as the “from” table). To make the join permanent we’ll need to make a duplicate layer. Right click on BG_GF_inter in the Table of Contents (TOC) and click data>>export data, accept the defaults and save the output as a feature class called BG_GF_LC. MAKE SURE YOU SAVE IT AS A FEATURE CLASS WITHIN YOUR NR245 GEODATABASE! Add it to the map and remove BG_GF_inter.
  10. Now use graduated color symbology to plot out the block groups by the amount of buildings (value1; use the green to blue color ramp, fifth from the last) and take a screencapture (for info on how to screencapture, go to [link]). Insert the screencapture in your Word document and make sure that you caption this and all subsequent screencaptures. You’ll note that the really large block groups in the north appear to have the most buildings, but this is misleading because they’re the largest by far. So, normalize the symbology by area. In the symbology window choose “Shape_Area” under the “Normalization” combo box. Now the display is essentially showing you a percentage. Take a screencapture of the plot.
  11. The only problem with this is that you don’t have percentages in the attribute table for doing analysis. You could correct this manually by creating a bunch of new fields and using the field calculator to divide land cover area by shape area, but instead we’re going to build a model to automate this somewhat. In ArcCatalog, right click on your NR245 geodatabase and click new>>toolbox. Call it nr245. Go ahead and drag it from ArcCatalog into the ArcToolbox window in ArcGIS (you can also right click in the ArcToolbox window, click “add toolbox” and browse to this toolbox). You should now see that toolbox in the toolbox window. Right click on it and click new>>model. The model editor should come up. We’ll now create a model that adds a field and calculates it as the area of a given land cover type divided by total area. If you ever close the model and need to get back into edit mode, just right click on it and click “edit.”
  12. We’ll start by adding the “Add Field” tool. Click the “Search” tab below the toolbox window, enter “add field” in the search line and hit “Search.” Then click “Locate” to see where it is in ArcToolbox. Drag the “Add Field” tool into the model window. You should see the tool appear in the model [figure]. Double click “Add Field.” Choose BG_GF_LC as the input table, set the field name to “P_building” and the field type to double, and click OK. Now we’ll add the field calculator tool. In ArcToolbox, do another search, this time for “Calculate Field.” Locate it and drag it into the model window. Now draw a model connector (using the connector tool [icon]) from the oval with BG_GF_LC to the rectangle that says Calculate Field [figure]. Double click on the Calculate Field box and you should see that BG_GF_LC is the input. Choose P_building as the field name and click on the calculator next to the expression field. Using it, set the equation to [VALUE1]/[Shape_Area]. Click OK twice. Now all boxes should be colored in. Next we’ll clone this little model five times and change the parameters. Click Edit>>select all, then press Control-C to copy and Control-V to paste. The new set of boxes will be pasted over the old set, so drag them down below so they don’t overlap. Then delete the first blue oval of the second group (if it gives you another oval underneath, delete that too). You should now have just two boxes and two ovals in the second group. Then click on the model connector tool again and draw a connection between the last oval of the first group and the first box of the second group. The result should look like this [figure]. To add the four remaining model groupings (consisting of two boxes and two ovals each), select that last model grouping and copy it four times: use the arrow tool to draw a rectangle just around the last model grouping (the last two boxes and ovals), like this [figure], and then press Control-C followed by Control-V four times.
Again, separate the groupings so they do not overlap and then connect them with the connector tool as you did above. You may want to arrange them in two side-by-side columns of three groups. Now change the expressions and field names. Double click on Add Field (2) and change the field name from P_building to P_coarseveg. Double click on Calculate Field (2), change the field name to P_coarseveg (scroll to the bottom of that list and P_coarseveg should be there to be chosen) and, in Expression, change VALUE1 to VALUE2. Now change the parameters for the third, fourth, fifth and sixth groups in the same way, using whatever field names you think are appropriate. Finally, right click on the first oval of the entire model and check “model parameter.” Now save the model, take a screencapture of the model diagram, and close it. Back in ArcToolbox, right click on this model (“model1”) and rename it to “calc_percent.” Before running it, close ArcCatalog, just in case of segmentation violations. Then double click the model icon to run it. Click OK at the next screen. Now check your attribute table to make sure it worked correctly. If you’re having trouble getting this to work, download the prebuilt model [link], load it into the toolbox and run it.
  13. Go to the symbology tab for the census block group layer (BG_GF_LC) and, using graduated color, plot out P_pavement using 6 classes and natural breaks. Take a screencapture.
  14. Now we’ll make a few quick graphs in JMP statistical software (a graphically friendly program). Open JMP, then click Open Data Table. Where it says “Data files” in the lower left, use the drop-down to choose “MS Access” database as the type. Then browse to your NR245 database and double click. It will bring up a listing of all the tables in your geodatabase. Choose the one called BG_GF_LC. It should open the table you just created in ArcMap. Now click Analyze>>Distribution. Click P_pavement once in the interface and click OK. You should get a window with a sideways histogram and a bunch of other info. [1] Report the mean, standard deviation, median, maximum and minimum values. Explain what a median is. Next, make the histogram more presentable by pivoting and enlarging it. Click on the arrow next to P_pavement. Click Histogram options and uncheck “vertical.” The graph should now be horizontal. Enlarge the window and drag the corner of the graph to enlarge it. Then under Histogram Options go to “set bin width” and change it from .1 to .025. Go to the options again and add a “count axis.” You should now have a much clearer histogram. Take a screencapture. [2] Explain what this histogram is showing you. Next, double click on one of the columns in the histogram. Note that it shows you all the records in the table from that bin.
  15. Now we’ll do a box plot and a quick Analysis of Variance to see if there are systematic differences in the percentage of pavement by PRIZM group. Go to Analyze>>Fit Y by X. Click on P_pavement and then Y, Response. Then click DESC15 (that is the descriptive variable for PRIZM15) and click X, Factor. Click OK. You should get a window. Enlarge that window. Now enlarge the graph by clicking and dragging on the corner of the white part of the graph. Click on the downward arrow and then “Quantiles.” This should now give you a box plot over a dot plot. Note the width of each box. Now click Display options>>X axis proportional. Explain the difference between these two graphs. Now, go down to Display options and uncheck “points.” This will get rid of the dots. Take a screencapture of the resulting box plot. [3] Explain what the middle line in the box, the top and bottom lines of the box, and the “whiskers” are showing you. See [link] for help in interpreting a box plot. Don’t close the window.
  16. Next comes the ANOVA. With the window still open, click the downward arrow and hit “Means/ANOVA.” [4] Report the mean pavement percentage and number of observations for Affluentials and UrbanMidscale. Then report the F statistic and its P value, and state whether the F statistic is significant. What does this tell you about the difference in pavement between groups? (See [link] for help with ANOVA.)
  17. Now do a pairwise comparison: click the downward arrow, then “compare means”>>“All pairs-Tukey’s HSD.” Scroll to the bottom and interpret the results based on what we discussed in class. [5] List five pairs that are significantly different and give the differences and p-values for each. [6] Finally, based on the comparison list (the one with the As, Bs and Cs), which category has a mean that is significantly different from the most other group means? Now save this model. Click file>>save. Give it a name.
  18. Save the Word document with your answers and screencaptures as a PDF and then upload it at [link], using nr243 as the option for the course. See [link] for help on screencapturing and making PDFs.
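
Optional check for steps 11 and 12: the calc_percent model simply divides each tabulated cover area by Shape_Area. The sketch below does the same arithmetic in plain Python so you can sanity-check a few rows by hand. It is only an illustration, not part of the assignment: the two rows and their numbers are made up, and the function name add_cover_fractions is hypothetical. Only the field names (Shape_Area, VALUE1, VALUE2, P_building, P_coarseveg) come from the steps above.

```python
# Made-up rows standing in for the BG_GF_LC attribute table (values are hypothetical).
rows = [
    {"join1": 1, "Shape_Area": 100000.0, "VALUE1": 25000.0, "VALUE2": 50000.0},
    {"join1": 2, "Shape_Area": 40000.0, "VALUE1": 10000.0, "VALUE2": 4000.0},
]

# New field name -> tabulated area field it is computed from.
FIELD_MAP = {"P_building": "VALUE1", "P_coarseveg": "VALUE2"}


def add_cover_fractions(rows, field_map, area_field="Shape_Area"):
    """Return copies of the rows with cover area / polygon area added per field."""
    out = []
    for row in rows:
        new_row = dict(row)
        area = row[area_field]
        for out_field, value_field in field_map.items():
            # Same expression as [VALUE1]/[Shape_Area] in the Calculate Field tool.
            new_row[out_field] = row[value_field] / area if area else None
        out.append(new_row)
    return out


result = add_cover_fractions(rows, FIELD_MAP)
for r in result:
    print(r["join1"], r["P_building"], r["P_coarseveg"])
```

Note that the result is a proportion between 0 and 1 rather than a percentage between 0 and 100, which is why the histogram bin widths in step 14 are values like .1 and .025.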
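
Optional check for steps 14 and 15: the numbers JMP reports in the Distribution window, and the lines drawn in the box plot, can be reproduced roughly with Python's standard statistics module. The seven P_pavement values below are made up for illustration. JMP may use a slightly different rule for computing quartiles than statistics.quantiles does by default, so treat this as a sketch of the idea and not as a way to reproduce JMP's output exactly.

```python
import statistics

# Made-up P_pavement proportions for seven block groups (hypothetical values).
p_pavement = [0.05, 0.10, 0.12, 0.18, 0.20, 0.25, 0.40]

summary = {
    "mean": statistics.mean(p_pavement),
    "std_dev": statistics.stdev(p_pavement),   # sample standard deviation
    "median": statistics.median(p_pavement),   # middle value of the sorted data
    "min": min(p_pavement),
    "max": max(p_pavement),
}

# Quartiles: q1 and q3 are the bottom and top of the box, q2 is the middle line.
q1, q2, q3 = statistics.quantiles(p_pavement, n=4)
iqr = q3 - q1  # height of the box (interquartile range)

for name, value in summary.items():
    print(f"{name}: {value:.4f}")
print(f"q1={q1:.4f} median={q2:.4f} q3={q3:.4f} iqr={iqr:.4f}")
```

Half of the observations fall below the median and half above it, which is why the median can differ noticeably from the mean when a few block groups have very high pavement cover.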
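
Optional check for step 16: the F statistic in the Means/ANOVA report is the ratio of between-group variance to within-group variance. The sketch below computes it by hand for a tiny made-up data set. The group names Affluentials and UrbanMidscale come from the lab, the third group name and all of the values are hypothetical, and the function name one_way_anova is just for illustration. It does not compute the P value, which JMP derives from the F distribution.

```python
# Made-up P_pavement values by PRIZM group (hypothetical numbers).
groups = {
    "Affluentials": [0.10, 0.12, 0.14],
    "UrbanMidscale": [0.30, 0.32, 0.34],
    "OtherGroup": [0.20, 0.22, 0.24],  # hypothetical third group
}


def one_way_anova(groups):
    """Return (F statistic, df between groups, df within groups)."""
    all_values = [v for vals in groups.values() for v in vals]
    grand_mean = sum(all_values) / len(all_values)

    ss_between = 0.0  # variation of group means around the grand mean
    ss_within = 0.0   # variation of observations around their own group mean
    for vals in groups.values():
        group_mean = sum(vals) / len(vals)
        ss_between += len(vals) * (group_mean - grand_mean) ** 2
        ss_within += sum((v - group_mean) ** 2 for v in vals)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


f_stat, df_between, df_within = one_way_anova(groups)
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

A large F means the group means differ by much more than the scatter within groups would suggest. It does not say which groups differ, which is what the Tukey comparison in step 17 is for.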