Organizing or clustering data into natural groups is one of the most fundamental aspects
of understanding and mining information. The recent explosion in sensor networks and data
storage associated with hydrological monitoring has created a huge potential for automating
data analysis and classification of large, high-dimensional data sets. In this work, we
develop a new classification tool that couples a Naïve Bayesian classifier with a neural
network clustering algorithm, the Kohonen Self-Organizing Map (SOM). The combined
Bayesian-SOM algorithm reduces classification error by combining the Bayesian classifier’s
ability to accommodate parameter uncertainty with the SOM’s ability to reduce high-dimensional
data to lower dimensions. The resulting algorithm is data-driven, nonparametric, and as
computationally efficient as a Naïve Bayesian classifier due to its parallel architecture. We
apply, evaluate and test the Bayesian-SOM network using two real-world hydrological data
sets. The first uses genetic data to classify the state of disease in native fish populations in
the upper Madison River, MT, USA. The second uses stream geomorphic and water quality
data measured at 2500 Vermont stream reaches to predict habitat conditions. The new
classification tool has substantial benefits over traditional classification methods due to its
ability to dynamically update prior information, assess the uncertainty/confidence of the
posterior probability values, and visualize both the input data and the resulting probabilistic
clusters on two-dimensional maps to better assess nonlinear mappings between the two.