|Team Members:||Rosemary K. Le¹, Christopher V. Rackauckas², Anne S. Ross³, and Nehemias Ulloa⁴|
|Graduate Assistant:||Sai K. Popuri⁵|
|Faculty Mentor:||Dr. Nagaraj K. Neerchal⁵|
|Client:||Dr. Brian R. Smith⁶|
¹ Department of Applied Mathematics, Brown University,
² Department of Mathematics, Oberlin College,
³ Departments of Computer Science and of Statistics, Colorado State University,
⁴ Department of Mathematics, California State University, Bakersfield,
⁵ Department of Mathematics and Statistics, University of Maryland, Baltimore County,
⁶ Maryland Department of Natural Resources
Team 2, from left to right: Christopher V. Rackauckas, Rosemary K. Le, Anne S. Ross, Nehemias Ulloa
About the Team
Our team is part of the REU Site: Interdisciplinary Program in High Performance Computing, based in the Department of Mathematics and Statistics at UMBC. We, Rosemary Le, Anne Ross, Christopher Rackauckas, and Nehemias Ulloa, worked on a project concerning the Chesapeake Bay and its surrounding tributaries. The project was coordinated by Dr. Brian Smith of the Maryland Department of Natural Resources (DNR). It gave us practical real-world experience while letting us explore non-traditional statistical methods. We were wonderfully assisted by our graduate assistant Sai K. Popuri and our faculty mentor Dr. Nagaraj K. Neerchal, both of whom provided guidance when necessary.
The Chesapeake Bay and its surrounding tributaries house over 3,600 species of plants and animals. As the largest estuary in the United States, the Chesapeake Bay serves as a valuable commercial and recreational resource for the people who live in its basin. To assess its health, the Maryland Department of Natural Resources monitors parameters such as dissolved oxygen, turbidity (a measure of water clarity), and chlorophyll (a measurement used to estimate algae growth) at monitoring stations located throughout the bay and its associated tributaries.
While percent failure is one indicator of a station’s performance, statistical tests take inaccuracies and missed readings into account. We employed the Wilcoxon Signed-Rank Test to classify each continuous monitoring station as “Good,” “Bad,” or “Borderline.” The Wilcoxon Signed-Rank Test is a non-parametric test on the median of a set of differences (in this case, the differences between one station’s readings and the failure threshold). When using the test, one assumes that the population from which the samples are drawn is symmetric. If this is not the case, i.e., the data are skewed, the true Type I Error may be inflated. Our simulation study (using Γ(α,β) distributions to cover a wide range of skewness values) shows that log-transformation substantially reduces the Type I Error.
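As a rough illustration of the test described above, the following Python sketch runs a one-sample Wilcoxon signed-rank test of a station's readings against a failure threshold, optionally on the log scale. It is not the team's implementation: the function names `wilcoxon_signed_rank` and `classify_station`, the `higher_is_better` flag, the 0.05 significance level, and the rule that maps a test result to “Good,” “Bad,” or “Borderline” are all assumptions made for this example. The p-value uses the normal approximation, which is only reasonable for about ten or more readings.

```python
import math


def wilcoxon_signed_rank(readings, threshold, log_transform=True):
    """One-sample Wilcoxon signed-rank test of readings against a threshold.

    Returns (w_plus, z, p_two_sided); the p-value comes from the normal
    approximation with a correction for tied absolute differences.
    """
    if log_transform:
        # Compare on the log scale to reduce right skew (needs positive data).
        diffs = [math.log(x) - math.log(threshold) for x in readings]
    else:
        diffs = [x - threshold for x in readings]
    diffs = [d for d in diffs if d != 0.0]  # zero differences carry no sign
    n = len(diffs)
    if n == 0:
        raise ValueError("all readings equal the threshold")

    # Rank |d| from smallest to largest; tied values share their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    start = 0
    while start < n:
        end = start
        while end + 1 < n and abs(diffs[order[end + 1]]) == abs(diffs[order[start]]):
            end += 1
        size = end - start + 1
        average_rank = (start + end) / 2 + 1  # mean of the 1-based ranks
        for k in range(start, end + 1):
            ranks[order[k]] = average_rank
        tie_term += size ** 3 - size
        start = end + 1

    # W+ is the sum of ranks belonging to positive differences.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (w_plus - mean) / math.sqrt(variance)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return w_plus, z, p_two_sided


def classify_station(readings, threshold, alpha=0.05, higher_is_better=True):
    """Hypothetical labelling rule (an assumption, not taken from the report):
    significant on the good side -> "Good", significant on the bad side ->
    "Bad", not significant -> "Borderline"."""
    _, z, p = wilcoxon_signed_rank(readings, threshold)
    if p >= alpha:
        return "Borderline"
    above = z > 0
    return "Good" if above == higher_is_better else "Bad"
```

For a parameter such as dissolved oxygen, where readings above the threshold are desirable, one might call `classify_station(readings, 5.0)`; for a parameter where lower values are better, `higher_is_better=False` flips the labels.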
The picture on the left shows an example of the skew in some of the parameters; the picture on the right shows the stations’ statuses from the Wilcoxon test on the log-transformed data with the Benjamini-Hochberg rejection method.
Our ranking methods focus on multiple comparison tests. When computing a ranking, an ordinary pairwise test can be used, but when it is applied to many comparisons at once, the overall Type I error can get out of hand. Multiple comparison tests help account for and correct this problem. We used Tukey’s Test (percent failure), the Bonferroni method (percent failure), the Benjamini-Hochberg method (mean, percent failure), and a Bayesian ranking method (percent failure).
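To make the difference between two of these corrections concrete, here is a minimal Python sketch of the Bonferroni and Benjamini-Hochberg rejection rules applied to a list of p-values from pairwise comparisons. The function names and the sample p-values are made up for illustration; the sketch covers only the rejection step, not how the team turned rejections into station ranks.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject a hypothesis only if its p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg_reject(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up rule controlling the false discovery rate.

    Sort the p-values, find the largest rank k with p_(k) <= (k / m) * alpha,
    and reject the k hypotheses with the smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank  # keep the largest rank that passes
    reject = [False] * m
    for idx in order[:k_max]:
        reject[idx] = True
    return reject


# Illustrative p-values only; these are not results from the study.
sample_p_values = [0.001, 0.012, 0.025, 0.045, 0.2]
bonferroni_flags = bonferroni_reject(sample_p_values)
bh_flags = benjamini_hochberg_reject(sample_p_values)
```

With these sample values, Bonferroni rejects only the first comparison while Benjamini-Hochberg rejects the first three, which shows how the stricter Bonferroni cutoff leaves more comparisons unresolved.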
Dissolved oxygen (5 mg/L threshold): ranks of monitoring stations with respect to percent failure under the Tukey Test (TT), the Bonferroni Test (Bonf), the Benjamini-Hochberg Method (BH), and the Bayesian Ranking Method (BRM).
|Station Name||% Fail||TT||Bonf||BH||BRM|
|Havre de Grace||0||2||1||1||2||2|
This table is an example of the results of running our ranking methods for all parameters. It is a portion of the ranking table for dissolved oxygen with the 5 mg/L threshold. The stations are listed in order of increasing percent failure in dissolved oxygen, that is, from faring best to faring worst. The second column gives this percent failure. The following columns show a segment of the assigned rankings, one column per method. From this and similar tables, we concluded that the Benjamini-Hochberg method was the most useful because it shows ties in the ranking without being too conservative. The Bonferroni method was too conservative and thus produced many ties, whereas Tukey’s Test was quite the opposite and produced few or no ties. Although the Bayesian Ranking Method does not produce any ties, it is still useful because the probabilities of the rankings are not uniform: the posterior probabilities separating rank 1 from rank 2 are not necessarily the same as those separating rank 2 from rank 3.
This project also included the creation of a GUI to make graphs of the data easily accessible, providing scientific researchers and the general public with a powerful tool to aid in understanding Maryland’s tidal waterways. The GUI was developed using the R statistical package. However, no installation of R is required to run it: one simply unpacks the compressed folder, which contains an executable that runs the GUI. The GUI has three modes:
- 2011 Mode: Runs the GUI using the pre-made plots for the summer of 2011.
- MDB Mode: Runs the GUI using the Microsoft Access database provided by DNR’s Eyes on the Bay site.
- Batch Mode: Runs MDB Mode, but instead of displaying a single plot, it generates all of the plots from the given database.
The picture above is an example of the GUI showing a scatter plot matrix for the St. Georges station.
We hope DNR will distribute our GUI to interested researchers and other individuals. Because it is designed around the Microsoft Access databases that DNR currently uses to hold monitoring data, the GUI provides easy access to many insightful graphs of the data from both past and future years. These include the levels of any reported parameter (dissolved oxygen, chlorophyll, etc.) in the water over time and plots of two parameters against each other, allowing for speculation about trends and relationships among the parameters and between the parameters and time. We believe that this tool has the potential to aid current research, spark new research questions, and help DNR spread interest in and understanding of the health of the Chesapeake Bay and the surrounding waterways.
The full study and more details are available in the technical report.
Rosemary Le, Anne Ross, Christopher Rackauckas, Nehemias Ulloa, Sai K. Popuri, Nagaraj K. Neerchal, and Brian Smith. Water Quality Monitoring of Maryland’s Tidal Waterways. Technical Report HPCF-2012-12, UMBC High Performance Computing Facility, University of Maryland, Baltimore County, 2012. Reprint in HPCF publications list.