Research on the impact analysis and control strategy of load fluctuation in distribution banding operation based on segmented linearization method
Published Online: Mar 19, 2025
Received: Nov 17, 2024
Accepted: Feb 21, 2025
DOI: https://doi.org/10.2478/amns-2025-0443
© 2025 Ying Xu et al., published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
With rapid social and economic development and the continuous improvement of living standards, the demand for electric energy keeps growing and expectations for power quality keep rising; ensuring an uninterrupted power supply has become an important indicator of the quality of service that power companies provide to customers [1–3]. This requires a high-quality, reliable power supply system. The 10 kV distribution network is an important part of the power system: it is the infrastructure that directly faces users, and it is characterized by large power consumption, high load density, many supply points and wide coverage. At the same time, the insulation level of 10 kV distribution equipment is relatively low and the equipment is numerous, so faults are more frequent, the maintenance workload is large and outages are common [4–7]. To minimize the impact of prolonged outages on customers' electricity consumption and the loss of load for power supply enterprises, live-line (banding) operation has been vigorously promoted in 10 kV distribution network maintenance in recent years. Banding operation refers to maintenance work carried out on high-voltage electrical equipment and facilities without interrupting the power supply [8–11]. It allows most maintenance and expansion work to be completed without an outage, greatly reducing construction downtime, and it can also handle most fault restoration work, significantly reducing both the duration and the scope of outages and effectively improving the reliability of power supply in the distribution network [12–14]. However, the load during distribution banding operation is volatile, which can adversely affect banding operation under the current conditions of insufficient generation reserve capacity and tight power supply [15–17].
Effective regulation and control of the grid is therefore needed. The segmented linearization method can be used to analyze the impact of load fluctuations in distribution banding operations and to derive corresponding control strategies, so as to accurately predict changes in grid load, effectively identify potential operational risks, and enhance the responsiveness and accuracy of grid regulation and control [18–20].
In this paper, we first start from the load data of distribution banding operation and preprocess it, including filling missing load data, detecting and handling abnormal data, and standardizing the load data. We then analyze the fluctuation patterns of the preprocessed load data, including daily, weekly, seasonal and holiday characteristics. Regression models are used to analyze the correlation between different influencing factors and banding-operation loads, and targeted load control strategies are proposed based on the results.
Regional-level electric load data are the superposition of many residential electric load datasets; exploring them therefore first requires preprocessing and characterizing the residential load data, so that the whole can be glimpsed through its parts. This chapter first introduces the acquisition and preprocessing of power load data, then analyzes the characteristics of regional-level load data based on the characterization of a residential load dataset, providing data support for the subsequent modeling and example simulations.
With the popularization of smart meters, collecting residential power load data has become convenient, but during data acquisition and transmission the data are inevitably affected by other factors, resulting in missing valid data or spurious abnormal data sequences. Data preprocessing is therefore needed to extract or derive the valuable and meaningful parts.
The experimental dataset in this paper is derived from a publicly available dataset on the official website of the Australian government, which launched the Smart Grid, Smart City (SGSC) program in 2009. The program collects load data from about 10,000 residential dwellings in New South Wales, Australia, at half-hourly intervals, i.e., load profiles with 48 sampling points per day. To explore the characteristics of the load data, 100 residential load series from January 1, 2013 to June 30, 2013 were randomly selected as the experimental sample data.
To eliminate load data anomalies or gaps caused by the smart meter itself, abnormal transmission records, or human factors, the data must be preprocessed; this mainly includes missing value processing, outlier detection, and data normalization.

Missing data processing

Common methods for handling missing values in power load data are hot-deck filling, neighbouring-point mean filling, and regression filling. Hot-deck filling searches the complete dataset for samples similar to the series containing the missing data and fills the gaps with the values of those similar objects. The degree of similarity is generally determined with a correlation coefficient, but different correlation coefficients emphasize different characteristics of a series, so the similarity criterion carries a degree of subjectivity. Neighbouring-point mean filling replaces a missing value with the average of the data at the preceding and following moments, which changes neither the overall sequence nor its fluctuation trend much. Regression filling builds a regression equation between power load and time from the load curve, estimates the missing value from the known data, and fills the gap with this estimate; however, when the series is highly volatile or weakly correlated, the estimate can deviate substantially from the true data. In the residential electricity load dataset, because each user's consumption is random, the similarity of consumption among users is low and the overall proportion of missing values is small.
In summary, this paper adopts the neighbouring-point mean filling method. Since residential electricity consumption does not increase or decrease abruptly over a short period but shows a gradual trend, and is closely related to the consumption at the moments immediately before and after the current moment, it is reasonable to take the average of the neighbouring moments as the missing value.

Data outlier processing

Outliers are usually deleted or treated as missing values, but outliers in the residential load data recorded by smart meters are inevitable and, as an objective feature of the data, should not be interfered with too much. Moreover, residential load data have a temporal order; deleting points for subjective reasons would break the temporal order and the uncertainty of the data, changing what the model learns from the data and thus affecting the prediction results. Therefore, the long runs of continuous, dense abnormal data in this dataset are deleted outright and excluded from the model simulation, so as not to destroy the temporal and uncertainty characteristics of the data, while the small number of isolated load anomalies are corrected by detecting the deviation rate of the data: assuming the residential load series is given, the deviation rate of each point from its neighbouring values is computed and compared with a preset threshold, and points exceeding the threshold are corrected.

Data normalization

Data normalization unifies the scales of different data attributes so that the data are smoothed, the similarity of the data distribution in model training is improved, and the convergence of the model is accelerated, making predictions more accurate [22].
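The deviation-rate correction described above can be sketched as follows. This is a minimal illustration, not the paper's exact procedure: the precise deviation-rate formula and the threshold value are assumptions, since the paper's equations are not reproduced in this excerpt. Here the deviation rate of an interior point is taken relative to the mean of its two neighbours.

```python
def deviation_rate(series, i):
    """Deviation of point i from the mean of its two neighbours (assumed formula)."""
    neighbour_mean = (series[i - 1] + series[i + 1]) / 2.0
    if neighbour_mean == 0:
        return 0.0
    return abs(series[i] - neighbour_mean) / neighbour_mean


def repair_outliers(series, threshold=1.0):
    """Replace interior points whose deviation rate exceeds the threshold
    with the mean of their neighbours; boundary points are left untouched."""
    repaired = list(series)
    for i in range(1, len(series) - 1):
        if deviation_rate(series, i) > threshold:
            repaired[i] = (series[i - 1] + series[i + 1]) / 2.0
    return repaired
```

For example, `repair_outliers([1.0, 1.1, 9.0, 1.2, 1.3])` replaces the isolated spike at 9.0 with 1.15 while leaving the smooth points unchanged.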
In this paper, min-max normalization is used to map the load data onto [0, 1], as shown in Equation (6):

x* = (x − x_min) / (x_max − x_min)    (6)

where x is an original load value and x_min and x_max are the minimum and maximum of the load series.
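The two preprocessing steps adopted in this paper, neighbouring-point mean filling and the min-max normalization of Equation (6), can be sketched as below. The sketch assumes isolated interior gaps (a missing value is marked `None` and has valid neighbours); consecutive or boundary gaps would need extra handling.

```python
def fill_missing(series):
    """Neighbouring-point mean filling: replace an isolated None with the
    mean of the previous and next values, as adopted in this paper."""
    filled = list(series)
    for i, x in enumerate(filled):
        if x is None:
            filled[i] = (filled[i - 1] + filled[i + 1]) / 2.0
    return filled


def min_max_normalize(series):
    """Equation (6): map each value onto [0, 1] via (x - min) / (max - min)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0 for _ in series]  # constant series: no spread to scale
    return [(x - lo) / (hi - lo) for x in series]
```

For instance, `min_max_normalize(fill_missing([2.0, None, 6.0]))` fills the gap with 4.0 and then maps the series to [0.0, 0.5, 1.0].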
After data processing, it is found that the regular characteristics of power load fluctuation are mainly reflected in its regular changes over time. The following analyzes these regularities from four aspects: daily, weekly, seasonal, and holiday characteristics.
The general daily load curve has two peaks, the “morning peak” and the “evening peak”, of which the “morning peak” is usually caused by the surge of electricity consumption of enterprises starting work, and the “evening peak” is usually caused by the electricity consumption of households. The morning and evening peaks coincide with people’s daily routines and working hours, and can shift with the seasons. The load profile for a day in June 2006 in the NSW dataset is shown in Figure 1.

Daily load curve
It can be seen that the “morning peak” peaks around 08:20 to 08:30 and the “evening peak” peaks around 18:00 to 19:00, then the load gradually decreases and reaches a trough around 04:30 in the morning.
Generally speaking, the weekday load curves within the same week fluctuate in approximately the same way, with a clear daily cycle, and weekend loads are significantly lower than weekday loads. This is because industrial loads usually account for a large proportion of total regional load, and some plant shutdowns on weekends cause the total load to drop. The load fluctuation curve for a randomly selected week (the 20th week of 2006) in the NSW dataset is shown in Figure 2. The graph shows that the trend and magnitude of load fluctuations from Monday to Friday are almost identical, and that weekend loads are significantly lower than weekday loads. Further analysis reveals that the morning peaks from Monday to Friday coincide very well, while the morning peaks on Saturday and Sunday are delayed relative to the weekdays and are significantly lower than the weekday morning peaks. The evening peak occurs at similar times on weekdays and weekends, because it is usually caused by household electricity consumption, and people's living habits do not change much in the short term.

Week load curve
The climate of New South Wales is temperate, with seasons opposite to those of the Northern Hemisphere: summer runs from December to February and winter from June to August, giving four distinct seasons and plenty of sunshine. As the seasons change, the times of day at which the morning and evening peaks occur, as well as the peaks and valleys of the load, also change. Figure 3 shows the mean load curves for each month of 2006 in the NSW dataset.

Year load curve
It can be seen that the morning and evening peaks around the summer months are less pronounced, owing to the increase in air conditioning load as temperatures rise around midday in summer. Normally the load between the "morning peak" and the "evening peak" first decreases and then increases, whereas in summer the temperature rises gradually after sunrise and reaches its maximum around 14:00 to 16:00; the increase in cooling load during this period changes that trend. The morning and evening peaks in winter are more pronounced and significantly higher than in other seasons due to heating demand.
Load fluctuation patterns for the same legal holidays are roughly the same across years, and holiday loads drop significantly relative to adjacent weekdays. Figure 4 gives the load profile for the week of April 21 to April 27, 2006 in the NSW dataset, where April 25, a public holiday, is marked with a red rectangle. The graph shows that the load on that day is significantly lower than on the adjacent weekdays.

Holiday load curve
Unlike ordinary segmented linear regression, owing to the special characteristics of the flow-control process, the analytical goal of this paper is to correct the relationship between the valve position command and the corresponding flow rate to a linear one by fitting a segmented linear relationship between the valve position command and the valve opening. Instead of simply constructing a regression model that describes the relationship between the two variables in the historical data, the mapping is adjusted appropriately to change the linearity between the associated variables so as to achieve the correction, and the loss function in this process also differs accordingly.
Let the independent variable be x (the valve position command) and the dependent variable be y (the valve opening), with a breakpoint c. The two-segment linear regression model can then be expressed as:

y = a1·x + b1 for x ≤ c,  y = a2·x + b2 for x > c

If the data are divided into multiple segments by breakpoints c1 < c2 < … < c(k−1), the regression model is:

y = ai·x + bi for c(i−1) < x ≤ ci,  i = 1, 2, …, k

The above model is fitted by dividing the raw data into k subsets at the breakpoints and estimating the slope ai and intercept bi of each segment.
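As a concrete illustration of fitting the segmented model with a known breakpoint, the sketch below estimates each segment's slope and intercept by ordinary least squares. This is a minimal sketch of the generic technique; the paper itself fits the model with an improved particle swarm algorithm, which is not reproduced here.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one segment."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def fit_piecewise(xs, ys, breakpoint):
    """Fit y = a_i * x + b_i on each side of a single breakpoint c."""
    left = [(x, y) for x, y in zip(xs, ys) if x <= breakpoint]
    right = [(x, y) for x, y in zip(xs, ys) if x > breakpoint]
    return fit_line(*zip(*left)), fit_line(*zip(*right))
```

For data that follow y = x up to x = 3 and y = 2x − 3 beyond it, `fit_piecewise` recovers slopes 1 and 2 with intercepts 0 and −3.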
Figure 5 shows the analysis process of flow-curve correction in this paper. Before modeling, the large-scale historical data are cleaned and sampled, and steady-state data suitable for modeling are selected according to the characteristics of the working conditions. After the data screening, the segmented linear regression model is established in three steps: first, the data are partitioned reasonably, and breakpoint locations that accurately reflect the data distribution are obtained through repeated evaluation of the segmentation effect; second, the regression model is built with an improved particle swarm algorithm, its construction completed iteratively through analysis of the optimization process and the fitting effect; finally, multiple models are compared and evaluated, and model fusion is used to further improve the fitting accuracy and obtain the final segmented linear regression model.

Piecewise linear regression analysis process
One challenge of segmented linear regression is determining the location and number of breakpoints. Different numbers and locations of breakpoints yield different model results: too few breakpoints cannot capture the fine characteristics of the fitted curve, while too many increase the modeling time and reduce the robustness of the model. A segmentation algorithm must therefore be designed before modeling.
Segmentation algorithms must meet three requirements: (1) applicability: they can effectively identify mutation locations and segment the data according to its characteristics; (2) efficiency: they use as few breakpoints as possible to characterize the data and reduce the time of subsequent regression modeling; and (3) tunability: they allow domain experts to incorporate domain knowledge and add, delete or adjust breakpoints. Based on these needs, this paper designs three segmentation algorithms, two unsupervised and one supervised, to support data segmentation from multiple perspectives [24].
The unsupervised methods used in this paper are equal-width (isometric) and equal-frequency division. Equal-width division splits the range of the data into segments of equal span, while equal-frequency division places an approximately equal number of samples in each segment.
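The two unsupervised divisions can be sketched in a few lines. This is a generic illustration of equal-width and equal-frequency binning, not the paper's exact implementation; both functions return the k − 1 interior breakpoints.

```python
def equal_width_breakpoints(xs, k):
    """Equal-width (isometric) division: k segments of equal span over [min, max]."""
    lo, hi = min(xs), max(xs)
    step = (hi - lo) / k
    return [lo + step * i for i in range(1, k)]


def equal_frequency_breakpoints(xs, k):
    """Equal-frequency division: each segment holds roughly the same sample count."""
    s = sorted(xs)
    n = len(s)
    return [s[(n * i) // k] for i in range(1, k)]
```

On skewed data the two disagree: equal-width follows the value range, while equal-frequency follows the sample density, which is why both views are offered to the modeler.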
The CART regression tree is a method that builds a binary tree by top-down recursive partitioning. The core of the CART algorithm is the partitioning criterion used to construct the decision tree, and its main problem is how to find, among a large amount of continuous data, the optimal threshold best suited for segmentation. The concept of the coefficient of variation is therefore introduced to measure the degree of difference of the sample features between partitions, i.e., it serves as the partition attribute index.
For a given sample set, the coefficient of variation is computed for each candidate partition. The smaller its value, the more homogeneous the samples within the partition, and the better the corresponding split.
Figure 6 shows the construction process of the CART regression tree. First, the partition index is calculated for each candidate split point and the optimal threshold is selected; the data are then partitioned recursively in the same way until the stopping criterion is met.

CART regression tree construction process
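A single CART-style split search along these lines can be sketched as follows. This is an illustrative sketch under the assumption, stated in the text, that the summed coefficient of variation of the two candidate partitions is the quantity to minimize; the paper's full recursive tree construction and stopping criterion are not reproduced.

```python
def coeff_of_variation(values):
    """Standard deviation divided by the mean; 0 for a zero-mean partition."""
    n = len(values)
    mean = sum(values) / n
    if mean == 0:
        return 0.0
    var = sum((v - mean) ** 2 for v in values) / n
    return (var ** 0.5) / abs(mean)


def best_split(xs, ys):
    """Scan midpoints between consecutive sorted x values and pick the
    threshold minimising the summed coefficient of variation of the
    two partitions (one CART split step)."""
    pairs = sorted(zip(xs, ys))
    best_t, best_score = None, float("inf")
    for i in range(1, len(pairs)):
        t = (pairs[i - 1][0] + pairs[i][0]) / 2.0
        left = [y for x, y in pairs if x <= t]
        right = [y for x, y in pairs if x > t]
        score = coeff_of_variation(left) + coeff_of_variation(right)
        if score < best_score:
            best_t, best_score = t, score
    return best_t
```

For two well-separated clusters the search lands on the gap between them, which is exactly the "mutation location" the segmentation algorithm is meant to find.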
The results of the segmented linear regression model quantifying distribution-operation load fluctuation are shown in Table 1. Calculation verifies that higher fitting accuracy is obtained by using a wet-bulb temperature of 26.4 °C or an air enthalpy of 86 kJ/kg as the segmentation point of the linear fit. For convenience, the fits with WT > 26.4 °C or AH > 86 kJ/kg are abbreviated below as the high-temperature segment, and those with WT < 26.4 °C or AH < 86 kJ/kg as the low-temperature segment. In addition, White's test was performed on the multivariate segmented regression model using Stata; the results show no significant heteroscedasticity in the model at the p = 0.05 significance level. Comparing the four models, all segmented linear regressions achieve high prediction accuracy with R² > 0.96 in the low-temperature segment, but R² in the high-temperature segment of the WT and AH models is only about 0.86, which demonstrates the limitation of single-factor segmented models within their applicable interval. To further compare the error distributions of the four models, the absolute relative error (ARE) index is used to analyze the errors of the four segmentation models.
Combined results of different load volatility models
| N | Fitting result | R² | Remark |
|---|---|---|---|
| 1 | Distribution load = 0.341·WT − 3.164 (WT < 26.4 °C) | 0.9687 | High-temperature fitting: fair |
| | Distribution load = 1.028·WT − 21.508 (WT > 26.4 °C) | 0.8625 | |
| 2 | Distribution load = 0.092·AH − 1.572 (AH < 86 kJ/kg) | 0.9742 | High-temperature fitting: fair |
| | Distribution load = 0.238·AH − 12.278 (AH > 86 kJ/kg) | 0.8635 | |
| 3 | Distribution load = 0.339·DT + 0.0644·RH − 8.89 (WT < 26.4 °C) | 0.9764 | Fitting: good |
| | Distribution load = 0.842·DT + 0.173·RH − 33.346 (WT > 26.4 °C) | 0.9534 | |
| 4 | Distribution load = 0.325·DT + 0.328·WT − 4.295 (WT < 26.4 °C) | 0.9795 | Fitting: good |
| | Distribution load = 0.095·DT + 0.922·WT − 22.371 (WT > 26.4 °C) | 0.9186 | |
Figure 7 shows the ARE probability distributions. Figure 7 confirms the feasibility of the model for quantifying load fluctuations under a variety of working conditions, such as different strategies and different DR durations, and, considering the difficulty of collecting meteorological data, the simplicity of the model form, and the quantification accuracy, using the model to quantify load fluctuations is appropriate. Although the model enables simple and fast quantification of load fluctuation under general working conditions, a small number of segmented linear quantification models clearly cannot comprehensively quantify dynamic load fluctuations under different internal disturbances, different human behaviors, and different events. This method is therefore only applicable to scenarios in which the operating conditions of the distribution banding point change little throughout the year, where operation managers can quickly quantify power load fluctuations from forecasts, historical system operation data and empirical formulas to support reasonable decision-making.

ARE probability distribution of different DR events
In this paper, Matlab programming is used to analyze the correlation between different influencing factors and the load of distribution banding operations.
Load data for 365 days of a year were selected, sampled every 15 minutes, i.e., 96 load points per day; the data are shown in Figure 8.

Annual daily load fluctuation
ANOVA was used to analyze the correlation between the day's maximum temperature, minimum temperature, average temperature, average relative humidity and rainfall, respectively, and the load at each moment of that day; the results are shown in Tables 2 and 3.
ANOVA analysis of maximum temperature vs. load
| | SS | df | MS | F | Prob > F |
|---|---|---|---|---|---|
| Regression | 4.0915×10^8 | 187 | 2223209.7 | 4.85 | 1.630875×10^-29 |
| Residual | 8.61294×10^7 | 183 | 475859.6 | | |
| Total | 4.967×10^8 | 372 | | | |
ANOVA p-values of the different influencing factors
| Influencing factor | Prob > F |
|---|---|
| Maximum temperature | 1.630875×10^-29 |
| Minimum temperature | 2.59187×10^-28 |
| Mean temperature | 9.00492×10^-29 |
| Relative humidity (average) | 0.1589 |
| Rainfall | 0.4738 |
According to the tables, the maximum, minimum and average temperatures have a significant effect on load fluctuation, so they can be used as inputs to the model. In addition, Figure 8 shows that the load level and fluctuation at each moment roughly follow the previous day's pattern, so the load at the same moment of the previous day can also be used as a model input.
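The factor screening above rests on the one-way ANOVA F statistic, which can be computed without any statistics package. The sketch below is a generic illustration of that statistic (between-group mean square over within-group mean square), not the paper's Matlab code; mapping it onto the tables also requires binning the continuous temperature values into groups, which is omitted here.

```python
def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA over a list of sample groups:
    between-group mean square divided by within-group mean square."""
    all_values = [v for g in groups for v in g]
    n, k = len(all_values), len(groups)
    grand_mean = sum(all_values) / n
    # Between-group sum of squares: group sizes times squared mean offsets.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: squared deviations from each group mean.
    ss_within = sum(sum((v - sum(g) / len(g)) ** 2 for v in g) for g in groups)
    df_between, df_within = k - 1, n - k
    return (ss_between / df_between) / (ss_within / df_within)
```

A large F (equivalently, a tiny Prob > F as in Table 3 for the temperature factors) indicates that the group means differ far more than the within-group scatter would explain, which is the basis for selecting the temperatures as model inputs.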
Based on the characteristics of the live-load fluctuation patterns obtained above, and on the analysis of the factors influencing load fluctuation, the following targeted load stabilization control strategies are proposed.
In order to reduce the probability of the governor operating in the open mode and to accurately distinguish between power fluctuations and power transmitter faults, in addition to configuring two sets of high-performance power transmitters that can be automatically switched, it is also necessary to introduce a frequency condition into the fault criterion.
Some power stations choose the open mode as the normal operation mode of the governor, and the governor also switches to open mode when the power transmitter fails or during isolated operation. In open mode, closed-loop regulation of active power is performed by the monitoring system through increase and decrease pulses, and its regulation performance is far inferior to that of the governor's power mode. Since the governor is likely to be working in open mode when the hydraulic pressure fluctuates, the closed-loop regulation program of the monitoring system is optimized as follows, in order to efficiently stabilize the unit output within a reasonable range without affecting the primary FM function.
When the governor works in open mode and the actual active power is between −3%Pe and 100%Pe, if the difference between the actual power and the set value stays below the regulation dead zone for 10 s, the regulating pulse output is blocked. After blocking, three cases arise. (1) During primary frequency modulation (FM) and within 20 s after it, the unit may experience load fluctuations that make the difference between the actual power and the set value too large; this is the normal action of the primary FM, during which the regulating pulse remains blocked, closed-loop active power regulation by the monitoring system does not intervene, and "regulation blocked" is displayed on the monitoring screen. (2) If, with the primary FM inactive, the power difference exceeds the dead zone for 5 s, the monitoring system releases the blocking of the regulating pulse, displays "normal adjustment", and sends pulses to bring the power difference back into the dead zone. (3) At any time, once a new set value is given to the monitoring system, the blocking is lifted immediately and closed-loop regulation brings the actual value to the new set value.
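The blocking rules just listed amount to a small decision procedure. The sketch below is an illustrative simplification, not the plant's actual control code: the function name, arguments and return codes are hypothetical, and the −3%Pe to 100%Pe power band check is folded into the caller.

```python
def pulse_action(mode, power, setpoint, dead_zone, out_of_zone_secs,
                 fm_active, secs_since_fm, new_setpoint_given):
    """Decide whether the monitoring system blocks or releases the regulating
    pulse in governor open mode, following the three cases in the text."""
    if mode != "open":
        return "normal"            # open-mode logic does not apply
    if new_setpoint_given:
        return "release"           # case 3: a new set value lifts the blocking
    if fm_active or secs_since_fm < 20:
        return "blocked"           # case 1: primary FM acting, do not intervene
    diff = abs(power - setpoint)
    if diff <= dead_zone:
        return "blocked"           # inside the dead zone: keep pulses blocked
    if out_of_zone_secs >= 5:
        return "release"           # case 2: sustained deviation, resume regulation
    return "blocked"               # deviation not yet sustained for 5 s
```

The ordering of the checks matters: a new set value overrides everything, and the primary FM window is tested before the dead-zone logic, matching the priority of the three cases in the text.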
A program is added to the monitoring system to automatically record the most recent valid active power set value. When the unit power exceeds Pe, i.e., when an abnormality drives the unit over its rated power, the program automatically restores this recorded value as the current set value. Thus, whenever the unit exceeds rated power, regardless of whether the primary FM is active, the monitoring system resets a normal set value and sends load-reduction pulses to bring the unit output back to it.
The tripping order is set according to two principles: (1) if one of the two units of the same hydraulic unit is running and the other is shut down, the running unit is tripped first; (2) if both units of the same hydraulic unit are running, the first unit of each hydraulic unit is tripped first and the second unit afterwards. This reduces the probability of two diversion units tripping simultaneously and prevents water-hammer damage to the diversion hydraulic units from excessive pressure.
When the water-hammer pressure fluctuates, the operating unit adjusts the guide vanes as the volute pressure rises or falls in order to keep the load stable. The guide-vane opening limit should therefore be set to +10% of the opening corresponding to the current maximum load at the given head, to avoid the situation where, when the volute pressure drops to a trough, the guide-vane opening cannot be adjusted in time because of its limit, leading to a large load deviation.
When load rejection by one unit causes hydraulic fluctuations, a rapid large load adjustment by the other operating unit at the same time would superimpose the two fluctuations. If the other machine increases or sheds load just as the volute pressure reaches a trough or peak, the amplitude of the original hydraulic fluctuation is aggravated. Within the volute pressure fluctuation cycle, the pressure can be observed and load changes can be staggered against the hydraulic fluctuation so as to shave peaks and fill valleys: when the volute pressure falls to a trough or rises to a peak, the load is reduced or increased correspondingly, and the load may also be increased slowly.
According to the statistics of hydraulic fluctuations under single-unit load rejection, the maximum amplitude occurs when the unit rejects full load and is directly proportional to the rejected load. When both units of the same hydraulic unit reject load simultaneously, however, the maximum amplitude does not necessarily occur at maximum load; there should be a load range that produces the amplitude spike, which needs to be verified by subsequent simulation tests. The test results should be taken into account in the load distribution of the units, and unit output within this load range should be avoided as far as possible.
In this paper, based on an analysis of the characteristics of distribution load fluctuation patterns, the segmented linear regression method is used to study the impact of load fluctuation in distribution banding operation, and targeted control strategies are proposed accordingly. It is found that the general daily load curve has a "morning peak" and an "evening peak": the morning peak occurs around 08:20 to 08:30 and the evening peak around 18:00 to 19:00, after which the load gradually decreases and reaches a trough around 04:30 in the morning. The morning and evening peaks coincide with people's living patterns and working hours and shift slightly with the seasons. The trend and magnitude of load fluctuation from Monday to Friday are similar, and weekend loads are significantly lower than weekday loads. It is worth noting that, although the method in this paper enables simple and fast quantification of load fluctuation under general working conditions, it cannot comprehensively quantify dynamic load fluctuations under different internal disturbances, human behaviors and events. It is therefore only applicable to scenarios in which the operating conditions of distribution banding points change little throughout the year, where operation managers can quickly quantify power load fluctuations from forecasts, historical system operation data and empirical formulas to support reasonable decision-making.
