Team 2 – Quantifying Uncertainty in Estimates of Baseflow of Watersheds For the Chesapeake Bay

Team Members: Christian Dixon1, Gabriel Martinez Lazaro2, Hwan Hee Park3, and Maddie Rainey4
Graduate Assistants: Sai Popuri1 and Nadeesri Wijekoon1
Faculty Mentor: Kofi Adragni1
Client: Jeff Raffensperger5

1 Department of Mathematics and Statistics, University of Maryland, Baltimore County
2 Department of Mathematics, California State University, Fullerton
3 Computational and Applied Mathematics, Rice University
4 Department of Mathematics, Grand Valley State University
5 U.S. Geological Survey (USGS)

About the Team

Christian Dixon is a rising junior at UMBC, majoring in Mathematics. Gabriel Martinez is a rising senior at California State University, Fullerton, majoring in Applied Mathematics. Hwan Hee Park is a rising senior at Rice University, majoring in Applied Mathematics. Maddie Rainey is a rising senior at Grand Valley State University, majoring in Mathematics and minoring in Mathematical Statistics and Data Science.
The team was mentored by Dr. Kofi Adragni and by graduate assistants Nadeesri Wijekoon and Sai Popuri. The client, Dr. Jeff Raffensperger, is from the United States Geological Survey (USGS). The team worked on quantifying the variability around the baseflow estimate calculated using Eckhardt’s Recursive Digital Filter (RDF).

Problem

Baseflow is the portion of streamflow sustained by groundwater discharge, estimated from the streamflow recorded at a measuring site. Baseflow estimates inform decisions about water regulation and aquatic ecosystems. The USGS National Water-Quality Assessment Project used several methods to estimate baseflow for 225 sites in the Chesapeake Bay watersheds. One of these methods, Eckhardt’s (2005) Recursive Digital Filter (RDF), produced significant results. The purpose of this study is to quantify the variance of the baseflow estimates produced by the RDF with respect to its parameters.

Methodology

The variance of baseflow depends on the variability of two numeric parameters: alpha (the recession constant) and beta (the maximum baseflow index). Two statistical methods were implemented to estimate the variability of baseflow: the Bootstrap Method and the Delta Method. The Bootstrap Method is a computationally intensive resampling method that samples with replacement. The Delta Method is a linear approximation based on a Taylor series expansion. Figures 1-2 outline both algorithms.
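For reference, Eckhardt's filter is commonly written as b_k = ((1 - beta) * alpha * b_{k-1} + (1 - alpha) * beta * y_k) / (1 - alpha * beta), where y_k is the streamflow on day k, and b_k is capped so that it never exceeds y_k. The following is a minimal Python sketch of that recursion. The function name `eckhardt_filter`, the starting value, and the sample numbers are illustrative assumptions, not the team's code or data.

```python
def eckhardt_filter(streamflow, alpha, beta):
    """Daily baseflow from daily streamflow via Eckhardt's recursive filter.

    alpha: recession constant, beta: maximum baseflow index (both in (0, 1)).
    """
    if not (0.0 < alpha < 1.0 and 0.0 < beta < 1.0):
        raise ValueError("alpha and beta must lie strictly between 0 and 1")

    # Assumed initialisation (not stated in the source): beta times day-one flow.
    baseflow = [min(beta * streamflow[0], streamflow[0])]

    for y in streamflow[1:]:
        prev = baseflow[-1]
        b = ((1.0 - beta) * alpha * prev + (1.0 - alpha) * beta * y) / (1.0 - alpha * beta)
        # Baseflow can never exceed the total streamflow on the same day.
        baseflow.append(min(b, y))
    return baseflow


if __name__ == "__main__":
    # Hypothetical streamflow values, chosen only to exercise the function.
    flows = [10.0, 12.0, 30.0, 20.0, 15.0, 12.0, 11.0]
    for day, b in enumerate(eckhardt_filter(flows, alpha=0.97, beta=0.8)):
        print(day, round(b, 3))
```

Because the recursion depends on alpha at every step, any uncertainty in alpha propagates into every daily baseflow value, which is what the two methods below set out to measure.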

Delta Method Algorithm
Figure 1: The Delta Method estimates the variance of baseflow as the product of the squared derivative of the RDF and the variance of the recession constant (alpha). This diagram displays the algorithm that computes a 95% confidence interval around the baseflow estimate.
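The Delta Method calculation can be sketched in Python as follows. This is an illustration under stated assumptions rather than the team's implementation: the derivative of the RDF with respect to alpha is approximated by a central finite difference, the sample variance of the alpha values stands in for the variance of the recession constant, and the interval uses the normal quantile 1.96. A compact version of the Eckhardt filter is included so the block runs on its own; all function names are hypothetical.

```python
import statistics


def eckhardt_filter(q, alpha, beta):
    """Compact Eckhardt filter; assumed start value is beta * q[0]."""
    b = [min(beta * q[0], q[0])]
    for y in q[1:]:
        nxt = ((1 - beta) * alpha * b[-1] + (1 - alpha) * beta * y) / (1 - alpha * beta)
        b.append(min(nxt, y))
    return b


def delta_method_ci(q, alpha_samples, beta, h=1e-5, z=1.96):
    """Return (lower, estimate, upper) per day for a 95% interval."""
    alpha_hat = statistics.median(alpha_samples)   # point estimate of alpha
    var_alpha = statistics.variance(alpha_samples)  # assumed variance of alpha

    base = eckhardt_filter(q, alpha_hat, beta)
    up = eckhardt_filter(q, alpha_hat + h, beta)
    down = eckhardt_filter(q, alpha_hat - h, beta)

    intervals = []
    for b, b_up, b_down in zip(base, up, down):
        deriv = (b_up - b_down) / (2 * h)           # numerical d(baseflow)/d(alpha)
        variance = deriv ** 2 * var_alpha           # first-order Taylor approximation
        half_width = z * variance ** 0.5
        intervals.append((b - half_width, b, b + half_width))
    return intervals


if __name__ == "__main__":
    # Hypothetical inputs for illustration only.
    flows = [10.0, 12.0, 30.0, 20.0, 15.0, 12.0, 11.0]
    alphas = [0.95, 0.96, 0.97, 0.975, 0.98]
    for lower, est, upper in delta_method_ci(flows, alphas, beta=0.8):
        print(round(lower, 3), round(est, 3), round(upper, 3))
```

The cap that keeps baseflow at or below streamflow makes the filter non-smooth in alpha on days where the cap is active, so a finite-difference derivative can be unreliable there.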
Bootstrap Method Algorithm
Figure 2: The Bootstrap Method randomly samples the recession constant data (alpha) with replacement. This process returns a vector of variances, one for each day. This diagram displays the algorithm that computes a 95% confidence interval around the baseflow estimate.
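A minimal Python sketch of the bootstrap procedure is given below. It assumes that each resample of the alpha data is summarised by its median, that the filter is rerun once per resample, and that the 95% interval is built from the per-day bootstrap variance with the normal quantile 1.96. These choices, the function names, and the sample data are illustrative assumptions; the source does not give the team's code.

```python
import random
import statistics


def eckhardt_filter(q, alpha, beta):
    """Compact Eckhardt filter; assumed start value is beta * q[0]."""
    b = [min(beta * q[0], q[0])]
    for y in q[1:]:
        nxt = ((1 - beta) * alpha * b[-1] + (1 - alpha) * beta * y) / (1 - alpha * beta)
        b.append(min(nxt, y))
    return b


def bootstrap_baseflow(q, alpha_samples, beta, n_boot=500, seed=0, z=1.96):
    """Return (lower, estimate, upper, variance) per day."""
    rng = random.Random(seed)
    n = len(alpha_samples)

    runs = []
    for _ in range(n_boot):
        # Resample the alpha data with replacement, same size as the original.
        resample = rng.choices(alpha_samples, k=n)
        runs.append(eckhardt_filter(q, statistics.median(resample), beta))

    estimate = eckhardt_filter(q, statistics.median(alpha_samples), beta)

    results = []
    for day, b in enumerate(estimate):
        # Variance of the baseflow value for this day across all replicates.
        variance = statistics.variance([run[day] for run in runs])
        half_width = z * variance ** 0.5
        results.append((b - half_width, b, b + half_width, variance))
    return results


if __name__ == "__main__":
    # Hypothetical inputs for illustration only.
    flows = [10.0, 12.0, 30.0, 20.0, 15.0, 12.0, 11.0]
    alphas = [0.95, 0.96, 0.97, 0.975, 0.98]
    for lower, est, upper, var in bootstrap_baseflow(flows, alphas, beta=0.8):
        print(round(lower, 4), round(est, 4), round(upper, 4), var)
```

The cost grows with the number of replicates times the length of the streamflow record, since the filter is rerun in full for every resample.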

Results

Figures 3-4 show the variability obtained from each method for the first 100 days. The Delta Method is not recommended for quantifying the variability of baseflow, because the recession constant data do not have an asymptotically normal distribution and the derivative of the RDF is not continuous everywhere. The Bootstrap Method showed that the margin of error around baseflow is very small for a 95% confidence interval, so the variability attributable to using the median of the recession constant, alpha, is negligible. The Bootstrap Method was therefore recommended to the client.

Delta Method Plot
Figure 3: The Delta Method produced very large variance values, which resulted in a large margin of error around each baseflow estimate.
Bootstrap Method Plot
Figure 4: The Bootstrap Method produced a very small margin of error for each value of baseflow. The margin is so small that the individual lines on this graph are not discernible.

Parallelization

The Delta Method consistently runs in less time than the parallelized Bootstrap Method. The data set consists of 31,046 entries. The parallelized Bootstrap Method processes the entire data set in 63 minutes, while the unparallelized Delta Method completes it in 6.37 minutes (Figure 5).
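Bootstrap replicates are independent of one another, which is what makes the method straightforward to parallelize. The sketch below shows one way to split replicates across processes in Python using the standard `concurrent.futures` module. The source does not state which language or parallel framework the team used, so the chunking scheme, the per-chunk seeds, and all function names are assumptions for illustration.

```python
import random
import statistics
from concurrent.futures import ProcessPoolExecutor


def eckhardt_filter(q, alpha, beta):
    """Compact Eckhardt filter; assumed start value is beta * q[0]."""
    b = [min(beta * q[0], q[0])]
    for y in q[1:]:
        nxt = ((1 - beta) * alpha * b[-1] + (1 - alpha) * beta * y) / (1 - alpha * beta)
        b.append(min(nxt, y))
    return b


def run_chunk(task):
    """Run one chunk of bootstrap replicates with its own random seed."""
    q, alpha_samples, beta, n_reps, seed = task
    rng = random.Random(seed)
    runs = []
    for _ in range(n_reps):
        resample = rng.choices(alpha_samples, k=len(alpha_samples))
        runs.append(eckhardt_filter(q, statistics.median(resample), beta))
    return runs


def parallel_bootstrap_variance(q, alpha_samples, beta, n_boot=320,
                                n_chunks=16, n_workers=1, seed=0):
    """Per-day bootstrap variance; the result does not depend on n_workers."""
    # Spread replicates as evenly as possible over a fixed number of chunks.
    sizes = [n_boot // n_chunks + (1 if i < n_boot % n_chunks else 0)
             for i in range(n_chunks)]
    tasks = [(q, alpha_samples, beta, size, seed + i)
             for i, size in enumerate(sizes)]

    if n_workers == 1:
        chunks = [run_chunk(task) for task in tasks]      # serial path
    else:
        with ProcessPoolExecutor(max_workers=n_workers) as pool:
            chunks = list(pool.map(run_chunk, tasks))     # parallel path

    runs = [run for chunk in chunks for run in chunk]
    return [statistics.variance([run[day] for run in runs])
            for day in range(len(q))]
```

Calling `parallel_bootstrap_variance(flows, alphas, 0.8, n_workers=4)` from a script guarded by `if __name__ == "__main__":` would distribute the 16 chunks over four processes. Fixing the number of chunks and seeding each chunk separately keeps the output identical across worker counts, which makes timing comparisons such as the 1, 4, and 16 process runs in Figure 5 easier to interpret.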

Performance Study Plot
Figure 5: Performance study for Bootstrap Method using 1, 4, and 16 processes compared to an unparallelized Delta Method.

Conclusion

The Bootstrap Method is recommended for estimating the variance of baseflow when streamflow is constant. The results showed that the variance attributable to using the median of the recession constant, alpha, is negligible. Finding the total variance for a non-constant streamflow also requires the variability introduced when measuring streamflow.

Links

Christian Dixon, Gabriel Martinez Lazaro, Hwan Hee Park, Maddie Rainey, Kofi Adragni, Sai Kumar Popuri, Nadeesri Wijekoon, and Jeff Raffensperger. Quantifying Uncertainty in Estimates of Baseflow of Watersheds For the Chesapeake Bay. Technical Report HPCF-2017-12, UMBC High Performance Computing Facility, University of Maryland, Baltimore County, 2016. Reprint in HPCF publications list

Poster presented at the Summer Undergraduate Research Fest (SURF)
