Hybrid Ensemble Modeling with Application to Hydrological


Sumi Sirajum Monira

pp. 1-100 (2013.1.17)

Accurate rainfall forecasting is one of the most important problems in hydrological research: early warnings of severe weather, made possible by timely and accurate forecasts, can help prevent the casualties and damage caused by natural disasters. The intricacy of the atmospheric processes that generate rainfall makes physical models of rainfall overly parameterized. A possible solution is to build the rainfall forecasting system from the data, so that the sub-processes of rainfall can be captured more accurately. In this study we propose a hybrid ensemble modeling framework in which multiple machine learning models are built on inputs selected by a statistical criterion.
The main objectives of this research are two-fold: first, to investigate the possibilities and different architectures for integrating conventional statistical techniques with computationally intelligent models for rainfall forecasting; and second, to test these forecasting models on different case studies. In our approach we construct the forecasting model following a "hybrid ensemble modeling" framework. The term hybrid characterizes the heterogeneity in the design of the constructed ensemble model. The construction of the ensemble can be generalized into three steps: first we select inputs, then we construct the sub-models (component models) on the selected inputs, and finally we select sub-models based on their statistical relevance to the outputs to construct the ensemble. To maximize the diversity among the sub-models of the ensemble, we use expert models from different domains: stepwise linear regression (SLR) and multivariate adaptive regression splines (MARS) from statistics, and artificial neural networks (ANN) and support vector regression (SVR) from machine learning. The main reason for using expert models from different areas is to capture the complex process of rainfall as accurately as possible.
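The final step of the framework, selecting sub-models and combining them, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: ranking is done by mean squared error on a validation set, the kept sub-models are combined by an equal-weight average, and the forecast values are made up. The thesis does not state its combination rule in this abstract, so this is not the author's implementation.

```python
from statistics import fmean


def l2_loss(y_true, y_pred):
    """Mean squared error between observations and forecasts."""
    return fmean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def build_ensemble(y_val, forecasts, keep=2):
    """Rank sub-models by validation loss and return the best `keep` names.

    `forecasts` maps a sub-model name to its validation forecasts.
    Ranking by squared error alone is an illustrative simplification.
    """
    ranked = sorted(forecasts, key=lambda name: l2_loss(y_val, forecasts[name]))
    return ranked[:keep]


def combine(forecasts, members):
    """Equal-weight average of the selected sub-models (an assumed rule)."""
    return [fmean(vals) for vals in zip(*(forecasts[m] for m in members))]


# Toy validation data; all numbers are hypothetical.
y_val = [0.0, 2.0, 5.0, 1.0]
forecasts = {
    "slr": [0.5, 1.5, 4.0, 1.5],
    "mars": [0.0, 2.5, 5.5, 0.5],
    "ann": [3.0, 3.0, 3.0, 3.0],
    "svr": [0.2, 2.1, 4.6, 1.1],
}

members = build_ensemble(y_val, forecasts, keep=2)
ensemble_forecast = combine(forecasts, members)
print(members, ensemble_forecast)
```

The dictionary keys reuse the sub-model abbreviations from the text only as labels; no actual SLR, MARS, ANN, or SVR model is fitted here.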
In the first step of the ensemble, an input selection technique is used to select appropriate inputs (variables). A statistical criterion, linear correlation analysis (LCA), and an information-theoretic criterion, average mutual information (AMI), give similar results in selecting the inputs. In the second step, the sub-models are trained on the selected inputs using an adaptive training strategy, in which past outcomes are taken into account. Finally, the constructed sub-models are ranked and then selected to construct the ensemble model. For ranking the sub-models, one statistical variable selection method, least angle regression (LARS), and one information-theoretic measure, mutual information (MI), are used. For faster implementation of the MI-based ranking, a projection technique, independent component analysis (ICA), is used. The accuracy of the higher-ranked models is then checked using the L2 loss function. In this way the ensemble is built from sub-models with higher accuracy and better conformity with the original outputs.
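As a rough illustration of the input selection step, the sketch below scores candidate lags of a single series with both criteria: the absolute Pearson correlation (standing in for LCA) and a histogram estimate of mutual information (standing in for AMI). The function names, the number of bins, the periodic toy series, and the choice to treat lagged values of the same series as candidate inputs are all assumptions for this example, not details taken from the thesis.

```python
import math
from collections import Counter


def pearson(x, y):
    """Pearson linear correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mutual_information(x, y, bins=4):
    """Histogram estimate of mutual information, in nats."""
    def binned(v):
        lo, hi = min(v), max(v)
        width = (hi - lo) / bins or 1.0  # guard against a constant series
        return [min(int((a - lo) / width), bins - 1) for a in v]

    bx, by = binned(x), binned(y)
    n = len(x)
    pxy, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum(
        (c / n) * math.log(c * n / (px[i] * py[j])) for (i, j), c in pxy.items()
    )


def select_lags(series, max_lag, top=2):
    """Score each lag by |correlation| and by mutual information."""
    scores = {}
    for lag in range(1, max_lag + 1):
        past, present = series[:-lag], series[lag:]
        scores[lag] = (abs(pearson(past, present)), mutual_information(past, present))
    by_corr = sorted(scores, key=lambda k: scores[k][0], reverse=True)[:top]
    by_ami = sorted(scores, key=lambda k: scores[k][1], reverse=True)[:top]
    return by_corr, by_ami, scores


# Hypothetical series with a period of four time steps.
series = [0.0, 1.0, 4.0, 1.0] * 6
by_corr, by_ami, scores = select_lags(series, max_lag=4)
print(by_corr, by_ami)
```

On this toy series both criteria pick the same pair of lags, which mirrors the agreement between LCA and AMI reported above. Real rainfall series are noisier, and the histogram estimate is sensitive to the bin count.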
The hybridized ensembles are applied to two rainfall series, from Japan and India. The experimental results show the advantage of hybridizing the combination scheme of models. This thesis contributes to hydrological rainfall forecasting, and we hope its findings can be used in building more effective rainfall and flood forecasting systems.

Key Words
Forecast Combination, Ensemble Approach, One-step-ahead Forecasting, Extreme Rainfall Event.


