NR 385 - Analysis of Natural Resources Data

Spring 2005

General Information

Goal of the Course

The primary focus of the seminar is to gain proficiency with model selection and multi-model averaging using data sets chosen by the participants.  We will also briefly address alternative approaches.  For example, when are “frequentist” statistics appropriate?  When is a Bayesian approach preferable?

Schedule

The course meets on Mondays, from 10 AM to 1 PM, in 301 Aiken, during the first half of the Spring 2005 semester.  We will meet seven times: January 24, January 31, February 7, February 14, February 28, March 7, and March 14.

Prerequisites

Prerequisites are: a course in multivariate statistics, and attendance at Anderson’s seminar in Nov 2004.  In addition, you need a thorough understanding of the data set you will be analyzing; if it is not your data set, you should know how the data were collected and why.  You are also expected to be familiar with a software package that can calculate the log likelihood of statistical models derived from your data set.

Grading

Grades are based on weekly written assignments (30%), class participation (30%), the final presentation (20%), and the final project (20%).
Audits and sit-ins are welcome (up to a maximum class size of 12 people)!  Please keep in mind, though, that you will get much more out of the course if you are working through a data set and discussing your progress in class.

Readings

1)    Burnham, K. P. and D. R. Anderson. 2002. Model Selection and Multimodel Inference. 2nd edition. Springer, New York, New York, USA.
2)    Other readings as noted below in the schedule.

Assignments

Weekly readings and class discussions will focus on the theory and application of the information-theoretic approach to model selection and averaging. 
Written assignments will focus on data analysis and preparation of a manuscript describing your analysis.  New sections of the manuscript will be due periodically, and new sections should be turned in with copies of the previously written section(s).  Electronic submission is fine, unless the syllabus specifically requests hard copies.

Final Presentation

Final presentations will be on March 14.  Presentations should be composed in PowerPoint.  You will have 20 minutes for your presentation, including a question-and-answer period.  Follow the typical format for a scientific presentation, with an introduction, methods, and results/discussion section.

Final Paper

The final written project (due April 1 – no joke) will be structured like a standard scientific paper, with the following sections: Introduction, Methods, Results, Discussion, and Literature Cited.  Methods and Results should be publication-quality.  The Introduction and Discussion need to be well written and must cover everything relevant to the analysis (e.g. a statement of the research question and hypotheses, essential background information, and interpretation of your results), but you are not required to research and cite the literature exhaustively (for example, I won’t quibble about unsubstantiated background, nor do I expect you to fully compare your results with the existing literature).
Your final project should address feedback received on previous assignments.
Your final paper may be submitted electronically.

Office Hours and Contact Information

Office hours are by arrangement.  Contact me with 2 or 3 potential meeting times. 
Office: 205 Aiken
E-mail: brian.mitchell@uvm.edu
Phone: 802-656-2496

Schedule

Week 1 (Class meets January 24, 2005)

Readings
    Chapter 1 in: Burnham, K. P. and D. R. Anderson. 2002. Model Selection and Multimodel Inference. 2nd edition. Springer, New York, New York, USA.
    Johnson, J. B. and K. S. Omland. 2004. Model selection in ecology and evolution. Trends in Ecology and Evolution 19(2): 101-108.  

Assignment
    Come to class prepared to describe your data set, including the research question, the data collected, and proposed statistical methods (e.g. ANOVA, logistic regression, program MARK). 
    Begin thinking about your model set: what models/hypotheses do you want to address?  Is there enough existing knowledge for a confirmatory analysis, or will this be exploratory?


During Class
    Welcome and introductions
    Data sets
    Discuss readings
    Approaches to data analysis
    Approaches to model selection and averaging

Lecture Notes from January 24, 2005

Week 2 (Class meets January 31, 2005) 

Readings
    Chapter 2 in: Burnham, K. P. and D. R. Anderson. 2002. Model Selection and Multimodel Inference. 2nd edition. Springer, New York, New York, USA.
    Hurlbert, S. H. 1984. Pseudoreplication and the design of ecological field experiments. Ecological Monographs 54:187-211.
    Oksanen, L. 2001. Logic of experiments in ecology: is pseudoreplication a pseudoissue? Oikos 94:27-38.
    OPTIONAL: Hurlbert, S. H. 2004. On misinterpretations of pseudoreplication and related matters: a reply to Oksanen. Oikos 104(3):591-597.

Assignment
    Write an introduction for your research.  You do NOT need exhaustive citations, but you should at a minimum: A) describe the research question, B) provide important background information, and C) list the objectives of the research and the hypotheses you will be testing.  Bring hard copies for the entire class.
    Develop a draft model set based on your understanding of the system you are studying, and be prepared to present it to the class.  Bring hard copies for the entire class.

During Class
    Discuss model sets
    Sample size and pseudoreplication
       Carl found this PowerPoint presentation about pseudoreplication

Lecture Notes from January 31, 2005

Week 3 (Class meets February 7, 2005)

Readings
    Chapter 3 in: Burnham, K. P. and D. R. Anderson. 2002. Model Selection and Multimodel Inference. 2nd edition. Springer, New York, New York, USA.
    SKIM: Burnham, K. P. and D. R. Anderson. 2004. Multimodel inference: Understanding AIC and BIC in model selection.  Sociological Methods and Research 33(2): 261-304.

Assignment
    Review all introductions and model sets, and be prepared to discuss them and suggest changes to the model sets during class.

During Class
    Discuss model sets
    Versions of AIC
    Model selection issues
    AIC versus BIC
    Examples of model selection

Lecture Notes from February 7, 2005

Week 4 (Class meets February 14, 2005)

Readings
    Chapter 4 in: Burnham, K. P. and D. R. Anderson. 2002. Model Selection and Multimodel Inference. 2nd edition. Springer, New York, New York, USA.

Assignment
    Finalize your model set (turn in)
    Update your introduction (turn in)
    Find a model selection paper in your research field that you would like to see critiqued in class; turn in a digital or hard copy.

During Class
    Model parameter averaging
    Point estimates and variance
    Confidence intervals
    Goodness-of-fit testing – statistical methods

Regression data set

Discriminant analysis data set

Cohen's Kappa spreadsheet

Logistic regression data set

Lecture notes from February 14, 2005

NO CLASS ON FEBRUARY 21, 2005

Week 5 (Class meets February 28, 2005)

Readings
    Anderson, D. R., W. A. Link, D. H. Johnson, and K. P. Burnham. 2001. Suggestions for presenting the results of data analyses. Journal of Wildlife Management 65:373-378.

Assignment
    Test the goodness of fit (GOF) of your most general model (include the write-up in your Methods section)
    Calculate appropriate AIC stats for all models in your model set (turn in)
    Write Methods section (turn in)
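If your software package reports only log likelihoods, the AIC statistics for a model set can be computed by hand.  The Python sketch below is one possible way to do it, not a required tool: it calculates AICc, the AICc differences, and the Akaike weights.  The model names, log likelihoods, parameter counts, and sample size are hypothetical values chosen for illustration, and the function names (aicc, akaike_table) are my own.

```python
import math


def aicc(log_lik, k, n):
    """Small-sample AIC: -2 ln(L) + 2K + 2K(K + 1) / (n - K - 1)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > K + 1")
    return -2.0 * log_lik + 2.0 * k + (2.0 * k * (k + 1)) / (n - k - 1)


def akaike_table(models, n):
    """Build a model selection table.

    models maps a model name to (log likelihood, number of parameters K).
    Returns rows of (name, AICc, delta AICc, Akaike weight), best model first.
    """
    scores = {name: aicc(ll, k, n) for name, (ll, k) in models.items()}
    best = min(scores.values())
    deltas = {name: s - best for name, s in scores.items()}
    # Relative likelihood of each model, given the data and the model set.
    rel = {name: math.exp(-0.5 * d) for name, d in deltas.items()}
    total = sum(rel.values())
    rows = [(name, scores[name], deltas[name], rel[name] / total)
            for name in scores]
    return sorted(rows, key=lambda row: row[1])


# Hypothetical model set: name -> (log likelihood, K).
example = {
    "null": (-120.0, 2),
    "habitat": (-112.5, 3),
    "habitat+year": (-111.9, 4),
}
table = akaike_table(example, n=60)
for name, score, delta, weight in table:
    print(f"{name:<14}{score:>9.2f}{delta:>8.2f}{weight:>8.3f}")
```

Remember that K counts every estimated parameter, including the intercept and, for least squares models, the residual variance.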


During Class
    Tools to facilitate model selection and parameter averaging
    Writing about and presenting model selection results

Spreadsheet for use during class

Lecture notes from February 28, 2005

Week 6 (Class meets March 7, 2005)

Readings
    Anderson, D. R. and K. P. Burnham. 2002. Avoiding pitfalls when using information-theoretic methods. Journal of Wildlife Management 66:912-918.
    Gibson, L. A., B. A. Wilson, D. M. Cahill, and J. Hill. 2004. Spatial prediction of rufous bristlebird habitat in a coastal heathland: a GIS-based approach. Journal of Applied Ecology 41:213-223. Discussion Leaders: Jeremy and Turner
    Maestas, J. D., R. L. Knight, and W. C. Gilgert. 2003. Biodiversity across a rural land-use gradient. Conservation Biology 17(5):1425-1434. Discussion Leaders: Rebecca and Kathryn
    Miyakoshi, Y., M. Nagata, and S. Kitada. 2001. Effect of smolt size on postrelease survival of hatchery-reared masu salmon Oncorhynchus masou. Fisheries Science 67:134-137. Discussion Leader: Eric

Assignment
    Calculate model-averaged parameters and outcomes
    Write Results section (turn in)
    Prepare discussion points for the paper assigned to you
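For a parameter that appears in every model in your set, the model-averaged estimate is the Akaike-weighted mean of the per-model estimates, and its unconditional standard error adds a model selection component to each model's conditional variance (see Chapter 4 of Burnham and Anderson 2002).  The Python sketch below illustrates the calculation under that assumption; the function name and the numbers in the example are hypothetical, and it is not a substitute for the course spreadsheets.

```python
import math


def model_average(estimates, std_errors, weights):
    """Model-averaged estimate and unconditional standard error.

    estimates  : per-model estimates of the same parameter
    std_errors : per-model (conditional) standard errors
    weights    : Akaike weights for the same models, in the same order
    """
    if not (len(estimates) == len(std_errors) == len(weights)):
        raise ValueError("inputs must have the same length")
    total = sum(weights)
    # Renormalize in case the weights cover only a subset of the model set.
    w = [x / total for x in weights]
    theta_bar = sum(wi * ti for wi, ti in zip(w, estimates))
    # Unconditional SE: sum of w_i * sqrt(var_i + (theta_i - theta_bar)^2).
    se = sum(
        wi * math.sqrt(si ** 2 + (ti - theta_bar) ** 2)
        for wi, ti, si in zip(w, estimates, std_errors)
    )
    return theta_bar, se


# Hypothetical slope estimates from three models.
theta_bar, se = model_average(
    estimates=[0.42, 0.55, 0.50],
    std_errors=[0.10, 0.12, 0.11],
    weights=[0.60, 0.30, 0.10],
)
print(f"model-averaged estimate = {theta_bar:.3f}, unconditional SE = {se:.3f}")
```

The unconditional SE grows when the per-model estimates disagree, which is exactly the model selection uncertainty that a single best model ignores.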

During Class
    Discuss and critique examples of model selection and averaging
    Goodness of fit testing – Logistic regression

Lecture notes from March 7, 2005

Week 7 (Class meets March 14, 2005)

Readings
    The following resources provide useful tips and techniques for giving presentations in general, and for giving PowerPoint presentations in particular.  I highly recommend that you take some time to explore these links before you start working on your presentation!  Note that the links below can be found along with a number of other resources at DePauw University's Next Slide Please page; but I felt that these were the most useful:

Assignment
    Prepare for presentation

During Class
    Goodness of fit testing – Mark-recapture models
    Presentations
       Kathryn
       Turner
       Jeremy
       Rebecca
       Carl
       Eric
    Course Evaluations

Lecture notes from March 14, 2005

Final Paper Due April 1, 2005

Demonstrations

NOTE: All demos should be run as slideshows (as opposed to the presentation editor) because of the presence of transitions within each slide.  Just start the slideshow and click your mouse to advance.

Week 2

Parsimony

Week 3

Maximum Likelihood vs. Least Squares

Grouped vs. Ungrouped Predictor Variables

Spreadsheets

Model selection spreadsheet - with example data

Model selection spreadsheet - with instructions and sheets for maximum likelihood and least squares data

Model selection and predictor variable averaging - with instructions and some example data (this spreadsheet incorporates what we worked on in class on February 14, plus some bells and whistles). NOTE: this file was updated on 24 Feb 2005 to correct an error that produced incorrect parameter averaging results for least squares data.

Model selection, predictor averaging, and response averaging - with instructions.  This spreadsheet is in a .zip archive to save download time. Last updated on 12 April 2005.  The update corrects a serious error in the response averaging variance calculation.  Earlier versions will NOT give valid results.