Data mining is a broad area that integrates methods from several fields including machine learning, statistics, pattern recognition, and database systems, for the analysis of large volumes of data. The faculty in the Department of Computer Science at the University of Vermont include internationally recognized researchers in this area whose work is widely published in international journals and conferences.
Our goal is to build on our established research and apply it to large, noisy real-world problems. While fostering academic research in data mining methods and tool development, we also advocate collaboration across the academic-industrial divide and promote interdisciplinary collaboration among faculty members in Computer Science, Statistics, Engineering, Biology, and Medicine.
In recent years the power of machine learning and statistics techniques to discover interesting patterns in raw data has manifested itself in the widespread application of decision trees, rule induction, Bayesian networks, association analysis, and sequential patterns. As these techniques have matured in sophistication and power, industry has become directly involved in their promotion and use, particularly in various conferences on Data Mining. The University of Vermont has a strong contingent of researchers in this area.
Our faculty publish in the leading forums in data mining as well as other, related leading journals and conferences, such as IEEE Transactions on Information Theory, ACM Transactions on Information Systems (TOIS), Information Systems, IEEE Intelligent Systems, IJCAI, AAAI, ICML, COLT, and WWW.
Dr. Xindong Wu is the Editor-in-Chief of TKDE (IEEE Transactions on Knowledge and Data Engineering), and the Steering Committee Chair for ICDM (IEEE International Conference on Data Mining). He was one of the two Program Committee Co-Chairs (with Rich Caruana, Cornell University) for KDD-07, the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, held in San Jose, California.
Dr. Jason Moore was recently appointed founding Co-Editor-in-Chief of BioData Mining.
Areas of Excellence
- Data mining from multiple data sources (Xindong Wu)
- Emerging data mining applications in bioinformatics, engineering, and medicine (Abdullah Arslan, Jeff Bond, Yves Dubief, Marc Greenblatt, Larry Haugh, Yuichi Motai, Jason Moore, Jim Vigoreaux, and Xindong Wu)
- Image analysis (Richard Foote, Gagan Mirchandani, and Robert Snapp)
- Noise detection and cleansing in large, distributed data environments (Jeff Bond, Xindong Wu, and Xingquan Zhu)
- Ontology-based information extraction and knowledge discovery (Serguei Krivov and Xindong Wu)
- Pattern discovery in data streams (Byung Lee, Sean Wang, Xindong Wu, and Xingquan Zhu)
- Pattern matching and mining (Abdullah Arslan, Robert Snapp, Xindong Wu, and Xingquan Zhu)
Distributed Systems
Many research-oriented CS departments have a distributed systems component. However, most concentrate on networking aspects or on particular applications. What distinguishes this group is its emphasis on "vertical slices" of distributed systems: from underlying hardware support (robots, cameras, and sensors) to application-level research (security, distributed control, and monitoring). That is, we target applications that use distributed systems and, at the same time, study the tools and theories needed to support those applications at every level from software to hardware. Hence, the group is interdisciplinary by nature.
We envision a world in which many kinds of systems, including sensors and actuators, mobile phones, PDAs, web servers and clients, and large databases, work together to help people monitor environments, obtain and provide healthcare, make decisions, play games, and in general live and work. Computer Science research has much to contribute to this vision, and faculty in the CS and ECE departments are involved in research on several of its aspects, specifically:
- Sensor networks and their security
- Security of distributed systems
- Distributed databases and data stream processing
- Data mining from distributed data
- Multiple robotics and distributed visual computing
Distributed systems are used in many applications. Especially important is the connection between this group's research and UVM's two areas of emphasis: environmental and healthcare studies. The connection with the ECE and Math departments is obvious. We also see great potential for distributed systems in the future Transportation Center. The Vermont Advanced Computing Center (VACC), now under construction, has also decided to emphasize sensor systems, and we believe our group can contribute substantially to that effort. Current projects include:
- Trace Effect Analysis for Software Security. Funding source: DoD AFOSR.
- A framework for optimal approximate query evaluation based on workload forecasting. Funding source: NSF CISE/IIS.
- Controlled Release of Information Based on Contents. Funding source: NSF CISE/IIS.
- Efficient and flexible processing of aggregation join queries on data streams. Funding source: Vermont EPSCoR.
- Modeling the cost of a user-defined function. Funding source: DoE Office of Science.
- Privacy-Aware Information Release Control. Funding source: NSF CISE/IIS.
- Wireless Sensor Network Optimization with Application QoS Requirements. Funding source: Vermont EPSCoR.
Areas of Excellence
- Wireless networking (Jeff Frolik)
- Data stream processing (Byung Lee and Sean Wang)
- Security and privacy (Alan Ling, Sean Wang, and Christian Skalka)
- Distributed data mining (Hill Zhu and Sean Wang)
- Computer vision (Hill Zhu)
- Distributed databases (Byung Lee)
Evolutionary & Agent-Based Computing
Computer Science faculty members and their graduate students are actively involved in a variety of research projects focusing on systems of autonomous and/or evolving agents. Some of this research focuses on developing and studying computational intelligence using nature-inspired computing paradigms, such as evolutionary computation, artificial neural networks, and autonomously interacting computational agents. Other projects are focused on applying these methods to domain-specific problems in areas such as robotics, biology, ecology, economics, social networks, psychology, and transportation.
The general area of Evolutionary and Agent-Based Computing has strong overlaps with our other departmental focus areas in Data Mining and Distributed Systems, and supports the College thrust in Complex Systems.
- Using genetic programming to evolve models of cyanobacterial population growth (Bongard & Eppstein, in collaboration with Natural Resources Faculty Watson)
- Using stochastic cellular automata to model mechanisms of invasiveness in plant species (Eppstein, in collaboration with Plant Biology Faculty Molofsky)
- Developing a multi-scale agent-based model, using spiking neural networks for up-scaling, to model the economic market for alternative transportation energy (Eppstein, in collaboration with Engineering Faculty Rizzo & Marshall)
- Developing computational evolutionary models of self-organizing biological speciation due to multi-scale nonlinear genetic interactions (Eppstein, in collaboration with Biology Faculty Goodnight)
- Using multi-objective evolutionary methods for optimizing management strategies for surface water runoff and innovizing design principles (Eppstein, in collaboration with Natural Resources Faculty Bowden)
- Using population-based search methods to detect epistatic genetic interactions that predispose to complex disease traits (Eppstein, in collaboration with computational geneticist and adjunct Faculty Moore)
Areas of Excellence
- Automated design of robot morphology and control, using evolutionary computation and artificial neural networks (Bongard)
- Active learning in social robots (Bongard)
- Evolving artificial neural networks to model human decision making (Bongard, in collaboration with Math Faculty Dodds & Danforth)
- Studying evolutionary and system dynamics on complex networks (Eppstein)
- Design of semantic webs for ecological data (Krivov)
- Distributed control algorithms for autonomous sensor networks (Wang & Lee, in collaboration with Engineering Faculty Frolik)
- Pattern recognition in medical imaging (Snapp)
- Modeling genetic regulatory networks (Bongard)