|Team Members:||Ross Flieger-Allison1, Lois Miller2, Danielle Sykes3, and Pablo Valle4|
|Graduate Assistants:||Sai K. Popuri3 and Nadeesri Wijekoon3|
|Faculty Mentor:||Nagaraj K. Neerchal3|
1Department of Computer Science and Department of Statistics, Williams College,
2Department of Mathematics, DePauw University,
3Department of Mathematics and Statistics, University of Maryland, Baltimore County,
4Department of Mathematical Sciences, Kean University,
5Joint Center for Earth System Technology (JCET)
About the Team
Team 3 consisted of Ross Flieger-Allison, Lois Miller, Danielle Sykes, and Pablo Valle. We worked with our faculty mentor, Dr. Nagaraj K. Neerchal, along with our graduate assistants, Sai K. Popuri and Nadeesri Wijekoon, who provided guidance and useful background knowledge throughout the research process. Our project focused on implementing and evaluating a weather forecasting model that uses a statistical tool called Sliced Inverse Regression (SIR) to reduce the dimensionality of a large set of covariates, then uses a modified Nadaraya-Watson estimator (NWE) to produce rainfall predictions.
Climate conditions, especially rainfall, have an important impact on agricultural yields in several regions of the United States. The Missouri River Basin (MRB) is a significant agricultural region that is not irrigated and is thus highly dependent on rainfall. The UMBC-JCET team, as well as the UMBC-REU teams from 2014-2015, previously used daily and monthly weather data provided by NASA and several Global Climate Models (GCMs) to produce predictions for a number of climate variables. Their precipitation predictions, however, proved inaccurate, both because precipitation data are semi-continuous (many exact zeros alongside positive amounts) and because the models used were simple linear regressions. This year’s project focused on implementing a more complex forecasting model (using SIR and NWE) in hopes of improving upon previous years’ predictions.
The data we used was provided by NASA and included weather observations made at approximately 21,000 locations across 57 years (1949-2005). Our model assumes that precipitation at any given location s depends on a large number of covariates: current and past values of monthly precipitation, sea-level pressure, relative humidity, and maximum/minimum temperature at s and its neighboring locations.
We implemented a data-analytical tool called Sliced Inverse Regression (SIR) to reduce the dimensionality of the large set of covariates that influence precipitation in the Missouri River Basin. In the final stage, to obtain our predictions from the GCM data, we used a simple non-parametric technique, the Nadaraya-Watson estimator (NWE), modified to account for the semi-continuous nature of the precipitation data.
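The two steps described above can be illustrated with a minimal Python sketch. It is not the team's code: the function names `sir_directions` and `hurdle_nw`, the parameters `n_slices`, `n_dirs`, `bandwidth`, and `threshold`, the Gaussian kernel, and in particular the hurdle rule used to handle zeros are all assumptions made for illustration. The report does not say how the team modified the estimator, so the rule here (predict zero unless the kernel-weighted share of wet neighbors reaches a threshold) is only one plausible variant.

```python
import numpy as np


def sir_directions(X, y, n_slices=10, n_dirs=2):
    """Estimate SIR directions (as columns) for covariates X and response y."""
    n, p = X.shape
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    # Inverse square root of the covariance via its eigendecomposition.
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = (X - mean) @ inv_sqrt  # standardized covariates
    # Slice the observations by sorted response value.
    order = np.argsort(y, kind="stable")
    M = np.zeros((p, p))
    for idx in np.array_split(order, n_slices):
        m = Z[idx].mean(axis=0)  # slice mean of the standardized covariates
        M += (len(idx) / n) * np.outer(m, m)
    # Leading eigenvectors of M, mapped back to the original covariate scale.
    w, v = np.linalg.eigh(M)
    top = v[:, np.argsort(w)[::-1][:n_dirs]]
    return inv_sqrt @ top


def hurdle_nw(U_train, y_train, U_new, bandwidth=0.5, threshold=0.5):
    """Nadaraya-Watson prediction with an assumed hurdle rule for zeros."""
    # Squared distances between new and training points in the reduced space.
    d2 = ((U_new[:, None, :] - U_train[None, :, :]) ** 2).sum(axis=2)
    K = np.exp(-0.5 * d2 / bandwidth ** 2)  # Gaussian kernel weights
    wet = (y_train > 0).astype(float)
    eps = 1e-12  # guards against division by zero
    # Kernel-weighted share of wet (nonzero) neighbors.
    p_wet = (K @ wet) / (K.sum(axis=1) + eps)
    # Kernel-weighted mean amount among wet neighbors only.
    amount = (K @ (wet * y_train)) / (K @ wet + eps)
    # Assumed hurdle: predict exactly zero when wet neighbors are a minority.
    return np.where(p_wet >= threshold, amount, 0.0)
```

A typical use would be `B = sir_directions(X, y)` followed by `hurdle_nw(X @ B, y, X_new @ B)`, so that the kernel smoothing runs on a few reduced covariates instead of the full set. The hurdle is needed in this sketch because multiplying the wet share by the wet-only mean, with the same kernel and bandwidth, collapses back to the ordinary Nadaraya-Watson estimate and never returns an exact zero.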
We demonstrated that the SIR and NWE methods can be implemented to work on a large dataset, and we improved upon the predictive accuracy of previous years’ models. Additionally, we showed that parallelizing the SIR and NWE code greatly increases computational efficiency on the subregion, and also improves efficiency for the entire MRB region when using up to 16 processes. Further study is needed on the implementation of SIR and NWE for daily precipitation data, and on other methods that could further improve accuracy.
Ross Flieger-Allison, Lois Miller, Danielle Sykes, Pablo Valle, Sai K. Popuri, Nadeesri Wijekoon, Nagaraj K. Neerchal, and Amita Mehta. Dimensionality Reduction Using Sliced Inverse Regression in Modeling Large Climate Data. Technical Report HPCF-2016-13, UMBC High Performance Computing Facility, University of Maryland, Baltimore County, 2016. (HPCF machines used: maya.) Reprint in HPCF publications list