An Bayesian Learning and Nonlinear Regression Model for Photovoltaic Power Output Forecasting
Published Online: Sep 15, 2020
Page range: 531 - 542
Received: Feb 24, 2020
Accepted: May 26, 2020
DOI: https://doi.org/10.2478/amns.2020.2.00032
© 2020 Wengen Gao et al., published by Sciendo
This work is licensed under the Creative Commons Attribution 4.0 International License.
Renewable energy sources are becoming increasingly important in current and future power supply systems [1, 2]. In particular, photovoltaic (PV) power techniques have achieved tremendous progress in both industry and research. In the past years, the total cumulative solar PV power capacity has reached 178 GW [3, 4]. Moreover, PV power accounted for 8% of gross power consumption in Italy and 7.1% in Germany in 2015 [5, 6]. The large-scale deployment of PV systems brings surging demand for management and scheduling operations on the PV power system, which depend greatly on forecasting the PV system power outputs [7, 8, 9]. Generally, PV power outputs are determined by the randomness of solar irradiance in the area of interest, which means that the power outputs are variable. Therefore, a number of models and methods have been proposed to approximate the PV power outputs under different conditions.
In [10], the power regression is modeled based on the analysis of weather images taken at UC San Diego, which provides a good testbed for solar energy. Similarly, the analysis of cloud or weather images is employed in [11, 12, 13]. However, these methods require expensive equipment to obtain the cloud or weather images, which does not help to lower the price of PV power systems. In [14], the forecasting of PV power output is implemented based on the real-time collection of solar irradiance through an irradiance sensor network. In [15], images of cloud motion are obtained from a geostationary satellite to predict medium- and short-term solar radiation. The above approaches can provide good performance in forecasting PV power outputs, but they need additional hardware or complex operations.
Besides approaches that rely on equipment and complex operations, various forecasting algorithms have also been proposed. Common approaches employ machine learning classification and regression methods. In [16], the aerosol index, which has an evident linear correlation with solar radiation attenuation, is used to train an artificial neural network (ANN) and forecast the power outputs for the next 24 hours. Similarly, the ANN method is employed to forecast PV power outputs in [17, 18]. The support vector machine (SVM) is employed to learn and model the relationship between input data, such as solar radiation, and the PV power output in [19, 20, 21]. In [22, 23], multiple linear regression (MLR) models the power outputs of the PV system based on the features of solar radiation and weather data. In [24], K-nearest neighbour (K-NN) is employed to build the forecast model based on the non-common data. In [25], ANN, SVM, K-NN and MLR are compared, and the effect of selecting input data for the learning algorithms is analyzed.
Another approach uses a probabilistic model to forecast the probability density function associated with PV power outputs based on the features of the input data [26]. In [27], a versatile probability method based on pair copula construction is used to model the PV power system. Similarly, a chronological probability model is employed to model the output of the PV power system based on conditional probability and nonparametric kernel density estimation in [28]. Moreover, the conditional probability associated with the PV power outputs is also utilized to predict future outputs. In [29], Bayesian sparse learning incorporates the features of the input data to learn the likelihood function of the PV power outputs. In the above probabilistic models, the predicted PV power outputs inevitably include negative values, because the models do not enforce the positivity of the outputs. Therefore, a sparse Bayesian learning algorithm that guarantees the positivity of the outputs and approximates the relevance between the input features and the power outputs is proposed in this paper.
The rest of the paper is organized as follows. In Section II, the forecasting problem is modeled as a Poisson regression problem, and the regression is implemented on the basis of sparse Bayesian learning. The simulation results on the forecasting performance of the proposed algorithm are presented in Section III. The conclusion and acknowledgement are given in Section IV and Section V, respectively.
Generally, the basic principle of photovoltaics is the photovoltaic effect, which transforms solar energy into electrical energy in semiconductors. The output power of the photovoltaics is modeled as follows,
The above model reveals the fact the output power
Based on the previous discussion, the outputs of a PV system are nonnegative and can be regarded as integers (with low resolution in large-scale systems). Such outputs cannot be modeled by a simple support vector machine, Gaussian process or relevance vector machine, which would lead to negative output predictions. To alleviate this problem, we employ a generalized linear model: a Poisson regression built on hierarchical Bayesian learning.
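The nonnegativity argument can be illustrated with a minimal sketch, which is not the hierarchical Bayesian model of this paper. It assumes a plain Poisson regression with a log link, fitted by Newton's method on synthetic counts; the function name `fit_poisson_glm`, the synthetic data and the extrapolation grid are all hypothetical. The Poisson forecast is the exponential of a linear predictor and is therefore always positive, whereas an ordinary least-squares line fitted to the same data becomes negative when extrapolated.

```python
import numpy as np

def fit_poisson_glm(X, y, n_iter=50, tol=1e-10):
    """Maximum-likelihood Poisson regression, E[y | x] = exp(x @ w), via Newton's method.

    Assumes column 0 of X is an intercept column of ones.
    """
    w = np.zeros(X.shape[1])
    w[0] = np.log(y.mean())          # start at the constant-rate solution
    for _ in range(n_iter):
        mu = np.exp(X @ w)           # predicted Poisson rates, always > 0
        grad = X.T @ (y - mu)        # gradient of the log-likelihood
        hess = X.T @ (mu[:, None] * X)  # negative Hessian (positive definite)
        step = np.linalg.solve(hess, grad)
        w = w + step
        if np.max(np.abs(step)) < tol:
            break
    return w

# Synthetic count data (an assumption for illustration, not the paper's data).
rng = np.random.default_rng(0)
t = np.linspace(-1.0, 1.0, 200)
X = np.column_stack([np.ones_like(t), t])
y = rng.poisson(np.exp(2.0 + 1.0 * t)).astype(float)

w_poisson = fit_poisson_glm(X, y)
w_linear, *_ = np.linalg.lstsq(X, y, rcond=None)

# Forecast on a grid that extends below the training range.
t_new = np.linspace(-3.0, 1.0, 9)
X_new = np.column_stack([np.ones_like(t_new), t_new])
poisson_forecast = np.exp(X_new @ w_poisson)   # positive by construction
linear_forecast = X_new @ w_linear             # can drop below zero
```

In this sketch every entry of `poisson_forecast` is strictly positive, while `linear_forecast` contains negative values at the low end of the grid.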
In the regression of PV power outputs, a training set
The index
Assume that the power output is a linear combination of the input vector, which is given by
Based on (5), the likelihood function can be formulated as follows,
In Bayesian learning, the weight parameters
By combining equations (8) and (9), the posterior of
By substituting the details, the posterior distribution can be given by
Similarly, the posterior distribution
By assuming that
By following the results in [33],
In this subsection, the Sparse Bayesian Learning method is proposed for the Poisson regression of PV system power outputs.
Given equations (4) and (5), we redefine the natural parameter
So the Poisson distribution can be reformulated as
Taking the logarithm of both sides of (21), the log-posterior of
By simple manipulations, the above complicated posterior formulation can be rewritten as
Hence, the posterior distribution can be formulated as
So the Bayesian estimation of
Based on Bayesian rule, the posterior distribution of
Without any extra information of
Then, the above likelihood can be given by
By using the approximation results in (21), it yields
By taking the logarithm of both sides of (30) and taking derivatives with respect to
Given the estimation results of
Due to
Following the results in [34], the likelihood function of prediction can be approximated as
Based on the above derivations, the detailed algorithm can be formulated as
Poisson Kernel Regression Based Sparse Bayesian Learning
1: Input the training set
2: Set the convergence criterion for
3: Set
4: Initialize the parameter
5: Initialize the threshold value
6: Initialize the RVs matrix by setting
7:
8: Create the kernel matrix according to (6);
9: Calculate the inverse covariance matrix of
10: Calculate the mean vector according to (25);
11: Update the hyper-parameter as
12: Eliminate the
13: Update the kernel matrix by using the eliminated samples;
14:
15: Output the estimation of
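The overall loop structure of such an algorithm can be sketched as follows. This is a hedged illustration rather than the authors' exact procedure: it assumes a Gaussian (RBF) kernel, a Laplace approximation of the weight posterior found by damped Newton steps, the relevance-vector-machine style update alpha_j = gamma_j / w_j^2 for the hyper-parameters, and a fixed pruning threshold. All function names, the kernel width, the threshold and the synthetic data are hypothetical.

```python
import numpy as np

def rbf_kernel(a, b, width):
    """Gaussian kernel matrix between 1-D input arrays a and b (assumed kernel)."""
    return np.exp(-((a[:, None] - b[None, :]) ** 2) / (2.0 * width ** 2))

def log_posterior(w, Phi, y, alpha):
    """Poisson log-likelihood plus zero-mean Gaussian log-prior, up to constants."""
    eta = np.clip(Phi @ w, -30.0, 30.0)
    return np.sum(y * eta - np.exp(eta)) - 0.5 * np.sum(alpha * w ** 2)

def posterior_mode(Phi, y, alpha, w, n_newton=25):
    """Laplace approximation: posterior mode and covariance of the weights."""
    for _ in range(n_newton):
        mu = np.exp(np.clip(Phi @ w, -30.0, 30.0))
        grad = Phi.T @ (y - mu) - alpha * w
        hess = Phi.T @ (mu[:, None] * Phi) + np.diag(alpha)  # inverse covariance
        step = np.linalg.solve(hess, grad)
        t, base = 1.0, log_posterior(w, Phi, y, alpha)
        while t > 1e-6 and log_posterior(w + t * step, Phi, y, alpha) < base:
            t *= 0.5                                         # step halving
        w = w + t * step
        if np.max(np.abs(t * step)) < 1e-8:
            break
    mu = np.exp(np.clip(Phi @ w, -30.0, 30.0))
    sigma = np.linalg.inv(Phi.T @ (mu[:, None] * Phi) + np.diag(alpha))
    return w, sigma

def fit_poisson_sbl(Phi, y, n_outer=50, prune_at=1e6, tol=1e-3):
    """Alternate between the posterior mode and hyper-parameter updates, pruning as we go."""
    active = np.arange(Phi.shape[1])
    alpha = np.ones(Phi.shape[1])
    w = np.zeros(Phi.shape[1])
    for _ in range(n_outer):
        w, sigma = posterior_mode(Phi[:, active], y, alpha, w)
        gamma = np.clip(1.0 - alpha * np.diag(sigma), 1e-10, 1.0)
        new_alpha = np.minimum(gamma / np.maximum(w ** 2, 1e-16), 1e12)
        change = np.max(np.abs(np.log(new_alpha) - np.log(alpha)))
        keep = new_alpha < prune_at
        keep[0] = True                 # assumption: the bias column is never pruned
        active, alpha, w = active[keep], new_alpha[keep], w[keep]
        if change < tol and keep.all():
            break
    w, _ = posterior_mode(Phi[:, active], y, alpha, w)
    return active, w

# Synthetic bell-shaped daily profile (illustrative only, not measured PV data).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 60)
y = rng.poisson(1.0 + 30.0 * np.exp(-((t - 0.5) / 0.18) ** 2)).astype(float)
Phi = np.column_stack([np.ones_like(t), rbf_kernel(t, t, width=0.1)])

active, w = fit_poisson_sbl(Phi, y)
forecast = np.exp(Phi[:, active] @ w)  # predicted Poisson rates, strictly positive
```

Here `active` holds the indices of the basis functions that survive pruning, so forecasting on new inputs only needs the kernel columns of the retained samples.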
Based on the combination of Poisson regression and SBL, the power output of the PV system can be formulated as a regression problem. Based on the strength of solar radiation in different types of weather, the regression problem can be classified into three sub-problems.
In each regression problem, the weights of the input vectors are governed by independent zero-mean Gaussian distributions, which differs from a Bayesian prior with identical Gaussian distributions. Meanwhile, the sparsity of the weights is guaranteed by the zero mean and the variance parameter
On the one hand, the complexity of the proposed algorithm is dominated by step 9 in
On the other hand, the complexity of prediction is proportional to the number of RV samples. Comparing this complexity analysis with that of other algorithms, it is concluded that the complexity of the proposed algorithm is lower than that of related kernel count-data regression models, including Kernel Probabilistic Regression and Probabilistic Regression [35].
In this section, data collected from the real PV power system on the Anhui Polytechnic University PV power platform are used. The installed capacity of the platform reaches 100 kWh, and the platform is deployed on the roof of the main administration building on campus. PV power data and the corresponding weather data were collected over one season. Fig. 1 shows the collected PV power data on seven different days.
Fig. 1
The collected PV power outputs data

To make the weather data clear, the data are shifted by one unit in the vertical direction, as shown in Fig. 2. The RMSE results are obtained from 1000 independent Monte Carlo experiments, and the RMSE is defined as
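Assuming the standard definition of the root-mean-square error, that is, the square root of the mean squared difference between forecast and measured outputs, a minimal sketch of the computation is given below. Averaging the per-trial RMSE over the Monte Carlo runs is an assumption here, and the helper names `rmse`, `monte_carlo_rmse` and `example_trial` are hypothetical.

```python
import numpy as np

def rmse(forecast, measured):
    """Root-mean-square error between two equally long sequences."""
    forecast = np.asarray(forecast, dtype=float)
    measured = np.asarray(measured, dtype=float)
    return float(np.sqrt(np.mean((forecast - measured) ** 2)))

def monte_carlo_rmse(run_trial, n_trials=1000, seed=0):
    """Average the RMSE over independent trials.

    run_trial(rng) must return a (forecast, measured) pair for one trial.
    """
    rng = np.random.default_rng(seed)
    return float(np.mean([rmse(*run_trial(rng)) for _ in range(n_trials)]))

def example_trial(rng):
    """Hypothetical trial: noisy counts compared against a constant reference."""
    measured = np.full(10, 5.0)
    forecast = rng.poisson(5.0, size=10)
    return forecast, measured

average_rmse = monte_carlo_rmse(example_trial, n_trials=20)
```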
Fig. 2
The collected Quantized weather data

In Fig. 3, the data were collected on sunny days. The forecast values based on Poisson regression are close to the true data and contain no negative outputs, whereas SVM regression produces negative outputs and has a larger error, as listed in Table I.
Fig. 3
The forecasting PV power outputs in sunny days

Table I. RMSE of the two regression methods in three weather situations
Situations | Sunny | Sunny/Cloudy | Rainy/Cloudy |
---|---|---|---|
RMSE of PR-SBL | 1.145 | 11.861 | 8.343 |
RMSE of SVM | 22.290 | 22.281 | 18.715 |
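The gap between the two methods can be expressed as a relative RMSE reduction. The short calculation below uses only the values from the table; the variable names are illustrative.

```python
# RMSE values copied from the table above.
rmse_pr_sbl = {"Sunny": 1.145, "Sunny/Cloudy": 11.861, "Rainy/Cloudy": 8.343}
rmse_svm = {"Sunny": 22.290, "Sunny/Cloudy": 22.281, "Rainy/Cloudy": 18.715}

# Relative reduction of RMSE achieved by PR-SBL with respect to SVM.
reduction = {k: 1.0 - rmse_pr_sbl[k] / rmse_svm[k] for k in rmse_pr_sbl}

for situation, value in reduction.items():
    print(f"{situation}: RMSE is {100.0 * value:.1f}% lower than with SVM")
```

The reduction is roughly 95% on sunny days, 47% on sunny/cloudy days and 55% on rainy/cloudy days, so the advantage is largest under stable sunny conditions.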
Fig. 4 and Fig. 5 show the simulation results in hybrid weather. In both situations, the Poisson regression based on the SBL algorithm achieves better forecasting performance and preserves nonnegativity.
Fig. 4
The forecasting PV power outputs in sunny/cloudy days

Fig. 5
The forecasting PV power outputs in rainy/cloudy days

In the simulation results, PV power regression is more complicated under hybrid weather conditions. In super-short-term regression, other factors, such as environmental temperature and wind speed, can be regarded as stable within a single weather type, so only the time-sequence correlation is considered. The proposed Poisson regression based on SBL can also incorporate environmental temperature and wind speed into the input data; the input data then form a vector, and the SBL algorithm can still provide good performance, as discussed in [35].
Taken together, the simulation results show that the proposed PR-SBL algorithm provides accurate and nonnegative forecasts of PV power and outperforms the SVM algorithm in both respects. This superiority results from the Poisson distribution assumption and the statistical learning mechanism. Specifically, SBL is a data-driven iterative algorithm that updates the hyper-parameters in a hierarchical way, which gives it an advantage over the SVM. Moreover, the assumption that the PV power outputs follow a Poisson distribution guarantees the nonnegativity of the predicted data. Furthermore, this assumption can be motivated by the maximum entropy principle according to the physical situation.
Forecasting is of vital significance for management and scheduling in renewable energy sources such as PV power systems. Traditional nonparametric regression methods cannot guarantee the nonnegativity of the output. In this paper, a regression model based on the Poisson distribution and the sparse Bayesian learning algorithm is proposed to solve the nonnegative PV power forecasting problem. The detailed principles of the PR-SBL algorithm and the simulation results are presented. The simulation results demonstrate the superiority and accuracy of the proposed algorithm. Moreover, the proposed algorithm is applicable to exponential-family distributions other than the Poisson distribution, which deserves further investigation in the future.