NP3.1/CL6.12/SSS0.6 – Scales, scaling and extremes in the geosciences
European Geosciences Union, General Assembly, Vienna, Austria, 27 April – 02 May 2014, Vol. 16, EGU, 2014

CONFIDENCE INTERVALS IN FLOW FORECASTING BY USING ARTIFICIAL NEURAL NETWORKS

Dionysia Panagoulia, Department of Water Resources, School of Civil Engineering, National Technical University of Athens, 9 Heroon Polytechneiou Street, 15780 Zografou, Athens, Greece
George J. Tsekouras, Department of Electrical & Computer Science, Hellenic Naval Academy, Terma Hatzikyriakou, 18539 Piraeus, Greece

Abstract
One of the major inadequacies in the implementation of Artificial Neural Networks (ANNs) for flow forecasting is the development of confidence intervals, because, in contrast to classical forecasting methods, the relevant estimation cannot be carried out directly. The variation in the ANN output is a measure of the uncertainty in the model predictions based on the training data set. The aim of this paper is to present the re-sampling method for ANN prediction models and to apply it to forecasting the flow of the next day.

Introduction – Uncertainty analysis for ANNs
Different methods for uncertainty analysis, such as bootstrap, Bayesian and Monte Carlo methods, have already been proposed for hydrologic and geophysical models [1-2], while methods for confidence intervals, such as error output, re-sampling and multi-linear regression adapted to ANN, have so far been used only for power load forecasting [3-5]:
(1) Error output: The ANN model finally has two outputs for each output variable: the first is the forecast mean value and the second is the respective estimated absolute percentage error. After the training process a larger confidence interval is determined by multiplying the initial one by a proper factor, in order to obtain the required confidence degree, which cannot be predefined in this technique.
(2) Re-sampling technique: The prediction and the respective error are calculated for each set (training, evaluation, test) separately and for all m available input vectors of each set. These errors are sorted in ascending order, keeping their signs, and the cumulative sample distribution function of the prediction errors can be estimated by the empirical distribution function Sm(z) = (number of errors zi with zi ≤ z) / m. When m is large enough, Sm(z) is a good approximation of the true cumulative probability distribution F(z). The confidence interval is estimated by keeping the intermediate values zr and discarding the extreme ones, according to the desired confidence degree (a short sketch of this procedure is given after the method comparison below). The intervals are computed so as to be symmetrical in probability (not necessarily symmetric in z). The number of discarded cases in each tail of the prediction error distribution is ⌊m·p⌋, where p is the probability in each tail. If the cumulative probability distribution F(z(p)) equals p, then there is a probability p that an error is less than or equal to z(p), which indicates that z(p) is the lower confidence limit. Similarly, z(1-p) is the upper limit, and [z(p), z(1-p)] is a (1-2p) confidence interval for future errors.
(3) Multi-linear regression adapted to ANN: This technique can be applied only if linear activation functions are used for the output layer. In this case a multi-linear regression model with pc coefficients can be implemented for each output neuron. The inputs of the regression model are taken as the outputs of the hidden neurons and the regression coefficients are taken as the connection weights of the output neuron. The confidence interval for a point prediction can be computed using the prediction error variance, the ANN inputs and the desired confidence degree; it follows a Student's t-distribution with (m-pc) degrees of freedom, where m is the number of available input vectors of the respective set.
The theoretical superiority of the re-sampling technique can be proved easily [5]: the error output technique doubles the number of the ANN outputs, which increases the number of ANN weights and the computational time significantly, while the multi-linear regression technique allows only a linear activation function for the output layer, which deteriorates the ANN results. On the contrary, the re-sampling technique affects neither the computational time nor the ANN results.
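The re-sampling interval of method (2) can be summarized in a short sketch. The function and variable names below, as well as the synthetic errors in the usage example, are illustrative assumptions and not part of the original implementation in [4-5].

```python
import numpy as np

def resampling_confidence_interval(errors, confidence):
    """Confidence interval for future prediction errors via re-sampling.

    errors     : prediction errors of one set (training, evaluation or test)
    confidence : desired confidence degree (1 - 2p), e.g. 0.95
    Returns (lower, upper), i.e. (z(p), z(1-p)), symmetrical in probability.
    """
    z = np.sort(np.asarray(errors, dtype=float))   # ascending order, signs kept
    m = z.size
    p = (1.0 - confidence) / 2.0                   # tail probability on each side
    k = int(np.floor(m * p))                       # cases discarded in each tail
    return z[k], z[m - 1 - k]

# Illustrative usage with synthetic errors (not the Mesochora data)
rng = np.random.default_rng(0)
fake_errors = rng.normal(loc=-1.0, scale=18.0, size=4380)
print(resampling_confidence_interval(fake_errors, 0.95))
```

Sorting the signed errors, rather than their magnitudes, is what makes the resulting interval symmetrical in probability rather than in error size.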
ANN method for flow forecasting
The selection of the input variables of a classical ANN structure has been studied in [6]. The input variables are historical data of previous days, such as flows, nonlinearly weather-related temperatures and nonlinearly weather-related rainfalls, chosen through correlation analysis between the flow under prediction and each candidate input variable of different ANN structures. After input selection through a step-wise, multi-stage process (the respective work is under review [7]), the ANN is formed. The ANN training algorithm is the stochastic back-propagation process with decreasing functions of the learning rate and the momentum term, for which an optimization process is conducted over the crucial parameter values, such as the number of neurons, the kind of activation functions, and the initial values and time parameters of the learning rate and momentum term (see Fig. 1).
The performance of each ANN structure is evaluated by a voting analysis based on eleven criteria: the root mean square error (RMSE), the correlation index (R), the mean absolute percentage error (MAPE), the mean percentage error (MPE), the mean error (ME), the percentage error in volume (VE), the percentage error in peak flow (MF), the normalized mean bias error (NMBE), the normalized root mean square error (NRMSE), the Nash-Sutcliffe model efficiency coefficient (E) and the modified Nash-Sutcliffe model efficiency coefficient (E1). The next-day flow for the test set is calculated using the model of the best ANN structure. Afterwards the re-sampling technique is applied to the three sets (training, evaluation, test) for the respective uncertainty analysis.
Fig. 1: Flow chart of the ANN optimization method (input variables selection, ANN parameters range selection, data pre-processing, training and evaluation processes in an optimization loop, selection of the respective parameters via the statistical voting analysis, flow forecast for the test set, and uncertainty analysis based on re-sampling).
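To make the voting step concrete, the sketch below computes three of the eleven criteria with their standard definitions and applies a simple one-vote-per-criterion rule. The voting rule, the function names and the hypothetical structure names are assumptions for illustration; the exact scheme belongs to the methodology in [7].

```python
import numpy as np

def criteria(obs, sim):
    """Three of the eleven criteria, with their standard definitions."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    err = obs - sim
    rmse = float(np.sqrt(np.mean(err ** 2)))                                # RMSE
    mape = float(100.0 * np.mean(np.abs(err) / np.abs(obs)))                # MAPE (%)
    e_ns = float(1.0 - np.sum(err ** 2) / np.sum((obs - obs.mean()) ** 2))  # Nash-Sutcliffe E
    return {"RMSE": rmse, "MAPE": mape, "E": e_ns}

def vote(scores):
    """Assumed rule: each criterion gives one vote to the structure that scores best on it."""
    lower_is_better = {"RMSE", "MAPE"}
    votes = {name: 0 for name in scores}
    for crit in next(iter(scores.values())):
        best = (min if crit in lower_is_better else max)(scores, key=lambda s: scores[s][crit])
        votes[best] += 1
    return votes

# Hypothetical comparison of two candidate structures on the evaluation set:
# scores = {"ANN_A": criteria(q_obs, q_sim_a), "ANN_B": criteria(q_obs, q_sim_b)}
# print(vote(scores))   # e.g. {'ANN_A': 2, 'ANN_B': 1}
```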
Study catchment & observed data
The Mesochora catchment, drained by the Acheloos river in central-western Greece, was selected for this study due to the partial diversion of the river flow in order to irrigate the arid Thessaly plain and boost hydropower generation in the surrounding region. The catchment has an area of 633 km² (Fig. 2) and extends nearly 32 km from north (39°42') to south (39°25'), with an average width of about 20 km. Daily precipitation was available at 12 stations, while mean daily temperature was collected from 4 stations. The precipitation and temperature variability at the stations was determined by conditioning on circulation pattern (CP) types [8].
Fig. 2: The Mesochora catchment, Greece: topography and recording stations.
For the ANN training process three sets are formed:
• Training set: 80% of the vectors of one time period,
• Evaluation set: 20% of the vectors of the same period,
• Test set: 100% of the vectors of a separate time period.

ANN selection
After the application of the methodology, the input variables are the flows of the previous days Q(t-1), Q(t-2), Q(t-3), Q(t-4), the precipitation of the current day P(t) and of the previous days P(t-1), P(t-2), P(t-3), and the temperature of the current day T(t) and of the previous days T(t-1), T(t-2), T(t-3), while the output is Q(t). The ANN parameters are: 3 neurons in one hidden layer; initial value and time parameter of the momentum term 0.35 and 800, respectively; initial value and time parameter of the learning rate 0.65 and 300, respectively; tanh(x) activation function for the hidden layer; and tanh(0.35x) activation function for the output layer.
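A minimal forward-pass sketch of the selected structure is given below, assuming the twelve inputs and the target flow are scaled to the range of the tanh activations and reading tanh(0.35x) as a tanh with slope parameter 0.35 on the net input; the weights are random placeholders standing in for the trained values, which are not reported here.

```python
import numpy as np

# Selected structure: 12 inputs -> 3 hidden neurons with tanh(x) -> 1 output with tanh(0.35x).
# Inputs: Q(t-1)..Q(t-4), P(t)..P(t-3), T(t)..T(t-3); output: Q(t), all suitably scaled.
rng = np.random.default_rng(0)
W_hidden = rng.normal(size=(3, 12))   # placeholder weights (the trained values are not reported)
b_hidden = np.zeros(3)
W_out = rng.normal(size=(1, 3))
b_out = np.zeros(1)

def forward(x):
    """Forward pass of the selected ANN for one scaled input vector of length 12."""
    h = np.tanh(W_hidden @ x + b_hidden)        # hidden layer: tanh(x)
    q = np.tanh(0.35 * (W_out @ h + b_out))     # output layer: tanh(0.35x)
    return float(q[0])                          # scaled next-day flow Q(t)

print(forward(rng.uniform(-1.0, 1.0, size=12)))
```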
Application of re-sampling technique
For each set (training, evaluation, test) the re-sampling technique is applied for different (1-2p) confidence degrees. Table 1 reports, for the training set, the median, the mean value, the standard deviation and the limits of the confidence intervals obtained from the true cumulative probability distribution and from the approximate normal cumulative probability distribution for different confidence degrees. Fig. 3 presents the flow error cumulative probability function obtained from the real values together with the approximate normal cumulative probability function for the training set, from which it is obvious that the confidence intervals of the normal function are broader than the respective ones obtained by the re-sampling technique. The median is closer to zero than the mean value. The same conclusions arise for the other two sets.
Fig. 3: Flow error cumulative probability function by real values and by the approximate normal distribution function for the training set.

Table 1: Flow error confidence intervals of the training set for different confidence degrees (training set population: 4380 vectors).
Confidence   From real-value CDF          From approximate normal CDF
degree       Lower        Upper           Lower        Upper
100%                      316.29          -∞           +∞
95%          -27.87       24.09           -35.99       35.51
90%          -17.32       10.34           -30.24       29.76
85%          -11.44       5.86            -26.50       26.01
80%          -8.50        3.70            -23.61       23.13
75%          -6.56        2.65            -21.22       20.74
70%          -5.05        2.08            -19.14       18.66
65%          -3.95        1.68            -17.29       16.81
60%          -3.17        1.41            -15.59       15.11
55%          -2.60        1.20            -14.01       13.54
50%          -2.03        1.04            -12.54       12.06

Generalization of the results
In Fig. 4 the flow error cumulative probability functions of the training, evaluation and test sets based on the re-sampling technique are presented. It is obvious that the confidence limits of the training set (or of the evaluation set) are wider than the respective ones of the test set. This means that the confidence limits of the training set can safely be used in place of the respective ones of the test set, as the latter can here be calculated from the respective real values.
Fig. 4: Flow error cumulative probability function by real values for the training set, the evaluation set and the test set.

Conclusions
The cumulative probability function of the flow error based on the re-sampling technique is similar to that of the normal distribution, but it is narrower. The respective median value is closer to zero than the mean value of the normal distribution for all sets. The re-sampling technique can successfully determine the confidence interval, as the confidence interval based on the test set is slightly overlapped by that of the training set. This finding supports the generalization of ANN flow forecasting by using confidence intervals of the training and evaluation sets.

References
[1] R.K. Srivastav, K.P. Sudheer, I. Chaubey: "A simplified approach to quantifying predictive and parametric uncertainty in artificial neural network hydrologic models", Water Resources Research, Vol. 43, W10407, 2007.
[2] S. Maiti, R.K. Tiwari: "Neural network modelling and an uncertainty analysis in Bayesian framework: A case study from the KTB borehole site", Journal of Geophysical Research, Vol. 115, 2010.
[3] H.S. Hippert, C.E. Pedreira, R.C. Souza: "Neural networks for short-term load forecasting: A review and evaluation", IEEE Transactions on Power Systems, Vol. 16, No. 1, 2001.
[4] G.J. Tsekouras, N.E. Mastorakis, F.D. Kanellos, V.T. Kontargyri, C.D. Tsirekis, I.S. Karanasiou, Ch.N. Elias, A.D. Salis, P.A. Kontaxis, A.A. Gialketsi: "Short term load forecasting in Greek interconnected power system using ANN: Confidence interval using a novel re-sampling technique with corrective factor", WSEAS International Conference on Circuits, Systems, Electronics, Control & Signal Processing (CSECS '10), Vouliagmeni, Athens, Greece, December 29-31, 2010.
[5] N.E. Mastorakis, G.J. Tsekouras: "Short term load forecasting in Greek interconnected power system using ANN: The confidence interval", Advanced Aspects of Theoretical Electrical Engineering, Sozopol, Bulgaria, 2010.
[6] D. Panagoulia, I. Trichakis, G.J. Tsekouras: "Flow forecasting via Artificial Neural Networks - A study for input variables conditioned on atmospheric circulation", European Geosciences Union, General Assembly 2012 (NH1.1/AS1.16 - Extreme meteorological and hydrological events induced by severe weather and climate change), Vienna, Austria, April 2012.
[7] D. Panagoulia, G.J. Tsekouras: "Flow forecasting via Artificial Neural Networks: A multi-stage methodology for selection of input variables conditioned on atmospheric circulation", under review.
[8] D. Panagoulia, A. Grammatikogiannis, A. Bárdossy: "An automated classification method of daily circulation patterns for surface climate data downscaling based on optimised fuzzy rules", Global Nest Journal, Vol. 8, No. 3, pp. 218-223, 2006.