Research on Intelligent Recognition Method of Risk Levels of Electricity Marketing Field Operation Types Based on Data Mining

In the current business model of power companies, marketing occupies an increasingly important position, and marketing activity inevitably carries risk. Because power marketing involves the vital interests of the enterprise, this risk needs to be minimized. In this context, establishing a power marketing risk evaluation model makes it possible to identify and assess risks effectively, take preventive measures against risky behaviors, and ensure that marketing work is carried out safely and efficiently [1-2].

Marketing risk security evaluation refers to the risk assessment of marketing-related work that an electric power enterprise carries out on the basis of the security information it has collected [3]. In security risk management, security identification is the core task; its main purpose is to discover potential risks in the marketing of electric power enterprises in a timely manner by identifying the relevant security information [4-5]. Security identification is a proven security management method in security risk assessment. It also takes diverse forms in practice: the appropriate identification method should be chosen according to the nature of the evaluation object and the needs of the work in order to improve efficiency [6-8]. In the process of security identification, the relevant security information is first summarized, and the collated information is then checked and compared with historical records to determine from the differences whether risk factors exist in the power marketing work. Once unstable factors are found, they should be analyzed in a focused manner, and the risk level of the safety hazards should be determined according to the risk assessment standards.

The marketing risks of electric power enterprises include market risk, operation risk, power supply and use risk, power supply service risk, and operation safety risk. First, literature [9] proposes to establish an intelligent hydrogen storage parking lot for electric vehicles to help electric power enterprises alleviate the problems caused by the high penetration rate of electric vehicles and adopts the information gap decision theory to model the uncertainty of the electricity market price and solve the optimal bidding curve, which optimizes the enterprise’s multiple risk strategies. Literature [10] formulated a decision model for hydropower energy investment risk by identifying different investment risks and using spherical fuzzy multi-attribute ideal-realistic comparative evaluation to rank the identification results and implement an effective strategy for the development of power enterprises. This means that power companies must always adjust to changes in the market environment to respond to environmental needs. Secondly, literature [11] introduces demand-side management into the transformation of future power grids, which will help to improve the reliability and financial performance of power systems in the face of increasing power loads. Literature [12] designed a home energy management system with battery storage that is associated with power company tariff information, allowing it to purchase power when power prices are low and manage the temperature of electrical appliances to limit power consumption during peak hours, which experimentally proved to reduce huge electricity bills without sacrificing the user experience. This implies that electric power companies should fully consider consumer groups in their business process to reduce the risk of corporate image damage. 
Further, literature [13] shows that as variable renewable energy (VRE) continues to grow, the power system needs to adapt to variable, uncertain, and location-dependent VRE output, and it creates an analytical framework that integrates market barriers and market efficiencies in order to level the playing field in the electricity market. Literature [14] addresses the stress that EV charging loads place on the distribution system by proposing an integrated photovoltaic system with a hybrid optimized energy storage management algorithm to minimize the operating costs of EV charging stations. This implies that power companies should strengthen power construction in their business processes to reduce the power supply risk caused by equipment defects.

Operational errors by workers in surveying, metering, and similar tasks are very common at the operation site; with the development of remote technology and related tools, marketing managers are expected to make fewer mistakes and incur fewer losses in controlling safety risks. Literature [15] emphasizes the importance of maintaining appropriate safety distances and adopting suitable work procedures when carrying out non-electrical activities around power lines, and suggests requirements such as strict implementation of work procedures, worker information training, respect for time, careful handling, and concentration to protect workers from the resulting risks. Literature [16] used a particle swarm optimization algorithm for feature selection and parameter tuning of gradient boosting, which can significantly eliminate redundant features in the dataset; experiments showed that the proposed model has a high ability to predict the combinations of risks that can lead to accidents in power infrastructure projects. Literature [17] developed a full-participation closed-loop safety management method for offshore wind power (OWP) construction sites and an OWP safety management system based on social media platforms. The former prevents management deficiencies caused by information loss by increasing the participation of management personnel, while its closed-loop mechanism ensures that safety hazards are corrected in a timely manner; the latter establishes an interconnected communication channel so that different builders can report safety hazards as quickly as possible. Literature [18] addresses safety hazards and safety requirements at electric power construction sites and applies information technology to project management, construction planning, safety supervision, and standard development, which helps to further raise the level of electric power infrastructure construction.
Literature [19] considers the risk of electrical damage in electric power construction and utilizes fuzzy hierarchical analysis to assess the safety risk of common activities in the power distribution industry. Empirical research shows that the safety risk index derived from this method has a high degree of reliability and provides theoretical insights into the risk management of electric power enterprises. Literature [20] points out that construction safety monitoring relying on manual observation not only involves a large workload but also has a low rate of correct identification. It constructs a cloud-computing-based safety monitoring system framework for real-time on-site applications, which uses sensing technology to collect and analyze the behavior of workers exposed to predefined risk events, assists construction site monitoring and control, and helps to improve construction site safety.

At the beginning of the article, the basic concepts and process of data mining are introduced, which provides a theoretical foundation for the subsequent example analysis. The definition of association rules is then introduced, and the FP-growth algorithm is chosen as the core algorithm of this paper by comparing the advantages and disadvantages of two commonly used algorithms. Subsequently, the causes of power marketing field operation accidents and the correlations between them are quantitatively analyzed, and an accident causation model for power marketing field operations is constructed based on the analysis results. The risk level identification method for power marketing field operation types is then explored, and a behavioral assessment model based on power marketing field operation scenarios is constructed, which is used to assess whether violations occur during power marketing field operations and to grade those violations. After the model is trained and the performance of the proposed method is tested on video acquisition data, the application process and effect of the proposed method are demonstrated using actual historical power marketing field operation events of a power grid company as a sample.

2

Theoretical analysis of association rule mining

2.1

Data Mining Theory

2.1.1

Overview of data mining

Data mining is the process of using algorithms to obtain hidden information from a large amount of incomplete, fuzzy, noisy, and random data. Data mining tasks are typically divided into two main types: prediction tasks and description tasks. A prediction task predicts the value of the target (dependent) variable based on the values of the descriptive (independent) variables. A description task discovers underlying patterns in the given data; such tasks are usually exploratory and require interpretation of the results [21]. The four main tasks of data mining are predictive modeling, association analysis, cluster analysis, and anomaly detection.

2.1.2

Data mining process

The process of data mining generally refers to the process of automatically searching for information hidden in a large amount of data that has a special relational nature. Data mining is a cyclic process. When one of the steps does not achieve the desired goal, it is necessary to go back to the previous step and repeat the execution until the result meets the user’s needs. The whole process of data mining is shown in Figure 1.

1)

Data collection

Analyze the user’s needs, determine the target data for mining through the needs, and store the collected data in the database.

2)

Data Preprocessing

Convert the original data in the database into a form suitable for data mining. The steps involved in data preprocessing mainly include: data cleaning, which handles missing values and duplicate values [22]; data extraction, which splits and extracts data records by setting different indexes; data merging, which fuses data from different sources; and data normalization, which transforms the data into a specified structure suitable for data mining tasks.

3)

Data Mining

Select different data mining algorithms according to different data characteristics to derive useful information for analysis.

4)

Data post-processing

Post-processing of data is used to ensure that the final output is valid and useful. Examples of data post-processing are visualization, statistical metrics, pattern representation, and so on.

5)

Export of information

Combined with the initial data mining objectives, the results from data mining are analyzed and evaluated, and the information is stored in the final knowledge base.
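The preprocessing step of the process above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record layout (`id`, `factors`), the two sources, and the choice to normalize each record into a set of factor codes are assumptions made only for illustration.

```python
def preprocess(*sources):
    """Merge, clean, de-duplicate, and normalize hypothetical accident records."""
    # Data merging: fuse records from different sources into one list.
    merged = [record for source in sources for record in source]
    # Data cleaning: drop records with a missing id or no recorded factors.
    cleaned = [r for r in merged if r.get("id") is not None and r.get("factors")]
    # Data cleaning: keep only the first record for each id (duplicate removal).
    seen, unique = set(), []
    for r in cleaned:
        if r["id"] not in seen:
            seen.add(r["id"])
            unique.append(r)
    # Data normalization: one set of factor codes per record, ready for mining.
    return [frozenset(r["factors"]) for r in unique]


# Hypothetical input data (not taken from the paper's accident database).
source_a = [
    {"id": 1, "factors": ["A13", "C21"]},
    {"id": 2, "factors": None},            # missing value, dropped
    {"id": 1, "factors": ["A13", "C21"]},  # duplicate, dropped
]
source_b = [
    {"id": 3, "factors": ["F12", "A13"]},
    {"id": None, "factors": ["F14"]},      # missing id, dropped
]

transactions = preprocess(source_a, source_b)
print(len(transactions))
```

The output of this step is a list of transactions in the sense used by association analysis, which is why the normalization target here is a set of codes rather than a numeric rescaling.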

2.2

Association Rule Algorithm

2.2.1

Concept of association rules

Association analysis is used to discover meaningful connections hidden in large datasets, which are usually represented in the form of association rules or frequent item sets.

Definition 1 (Itemset): Let L = {x₁, x₂, ⋯, x_n} be the set of all items in the dataset and T = {t₁, t₂, …, t_m} be the set of all transactions. Each transaction t_i contains an itemset that is a subset of L. In association analysis, a set containing 0 or more items is called an itemset. If an itemset contains k items, it is called a k-itemset.

Definition 2 (Association Rule): An association rule is an implication expression of the form X → Y, where X is called the rule antecedent, Y is called the rule consequent, and X and Y are disjoint itemsets, i.e., X ∩ Y = ∅. The strength of an association rule can be measured in terms of support and confidence.

Definition 3 (Support): Support determines how frequently a rule applies to a given data set. Support is an important measure because rules with low support may appear only by chance and are mostly meaningless. Support is calculated as: $S(X) = \frac{n(X)}{n(T)}$ (1)

where n(X) is the number of transactions that contain itemset X in all transactions, also known as the support count for itemset X, and n(T) is the number of all transactions.

Definition 4 (Confidence Level): The confidence level determines how often Y occurs in the transactions containing X. For a given rule X → Y, the higher the confidence level, the more likely it is that Y will occur in transactions containing X. The confidence level is calculated as: $C(X \to Y) = \frac{S(X \cup Y)}{S(X)}$ (2)

Definition 5 (Lift): Since the confidence measure ignores the support of the itemset in the rule consequent, a high-confidence rule may still be meaningless. This paper therefore adds an objective measure called the interest factor, or lift, which compensates to some extent for the shortcomings of the “support-confidence” framework. It is calculated as: $I(X \to Y) = \frac{n(T) \cdot n(X \cup Y)}{n(X) \cdot n(Y)}$ (3)
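As a worked illustration of Eqs. (1) to (3), the following Python sketch computes support, confidence, and lift on a small set of transactions. The transactions are hypothetical and are not drawn from the accident database; the factor codes are reused only as labels. Exact fractions are used so that the values can be checked by hand.

```python
from fractions import Fraction

# Hypothetical toy transactions; each one is a set of factor codes.
transactions = [
    {"A13", "C21", "A32"},
    {"A13", "C21", "A32", "F12"},
    {"A13", "F12"},
    {"C21", "F14"},
    {"A13", "C21", "F14"},
]


def support_count(itemset, transactions):
    """n(X): number of transactions that contain every item of X."""
    return sum(1 for t in transactions if itemset <= t)


def support(itemset, transactions):
    """S(X) = n(X) / n(T), Eq. (1)."""
    return Fraction(support_count(itemset, transactions), len(transactions))


def confidence(x, y, transactions):
    """C(X -> Y) = S(X u Y) / S(X), Eq. (2). Assumes S(X) > 0."""
    return support(x | y, transactions) / support(x, transactions)


def lift(x, y, transactions):
    """I(X -> Y) = n(T) * n(X u Y) / (n(X) * n(Y)), Eq. (3). Assumes n(X), n(Y) > 0."""
    numerator = len(transactions) * support_count(x | y, transactions)
    denominator = support_count(x, transactions) * support_count(y, transactions)
    return Fraction(numerator, denominator)


x, y = {"A13", "C21"}, {"A32"}
print(support(x | y, transactions))   # 2/5
print(confidence(x, y, transactions)) # 2/3
print(lift(x, y, transactions))       # 5/3
```

In this toy data, the rule {A13, C21} → {A32} has a lift greater than 1, meaning that A32 occurs more often together with the antecedent than it would if the two were independent.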

Property 1: All non-empty subsets of a frequent itemset are frequent.

Property 2: If a (k − 1)-itemset X (k > 1) is not frequent, then any k-itemset Y that contains X is also not frequent. This property is used to terminate the search for frequent itemsets.

2.2.2

Comparison of Association Rule Mining Algorithms

The most commonly used association rule mining algorithms are the Apriori algorithm and the FP-growth algorithm.

1)

Apriori algorithm

The Apriori algorithm is one of the earliest association rule mining algorithms. It adopts a layer-by-layer search strategy: it first uses the support threshold to filter out all frequent 1-itemsets, then joins the remaining frequent itemsets pairwise to form candidates, prunes the candidates by support, and repeats these steps until no frequent (n + 1)-itemsets can be found. As the number of items in the itemsets increases, the number of dataset scans and the storage space required gradually increase, and the running time appears to increase exponentially with the number of transactions.
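The layer-by-layer search can be sketched in a few lines of Python. This is a minimal, unoptimized illustration of the join and prune steps, not a reference implementation; the function name, the toy transactions, and the 0.4 support threshold are all assumptions.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for all frequent itemsets (level-wise search)."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def keep_frequent(candidates):
        # One pass over the dataset per level to count candidate support.
        result = {}
        for c in candidates:
            s = sum(1 for t in transactions if c <= t) / n
            if s >= min_support:
                result[c] = s
        return result

    # Level 1: frequent 1-itemsets.
    level = keep_frequent({frozenset([i]) for t in transactions for i in t})
    all_frequent = dict(level)
    k = 2
    while level:
        prev = set(level)
        # Join step: combine frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step (Property 2): drop candidates with an infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        level = keep_frequent(candidates)
        all_frequent.update(level)
        k += 1
    return all_frequent


# Hypothetical toy transactions (not from the paper's database).
toy = [
    {"A13", "C21", "A32"},
    {"A13", "C21", "A32", "F12"},
    {"A13", "F12"},
    {"C21", "F14"},
    {"A13", "C21", "F14"},
]
frequent = apriori(toy, min_support=0.4)
print(len(frequent))
```

Each level requires another scan of the dataset to count the candidates, which is the repeated-scan cost described above.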

2)

FP-growth algorithm

The FP-growth algorithm explores the FP-tree in a bottom-up manner and generates frequent itemsets from it. Its main idea is to decompose a large problem into smaller subproblems. Unlike the “trial and error” strategy of the Apriori algorithm, FP-growth needs to scan the original database only twice, compressing the data into a tree structure, which makes it highly efficient.

2.2.3

FP-growth algorithm

The core of the FP-growth algorithm is a compact data structure called the FP-tree, which is used to organize and process the data. At the start of construction, the FP-tree has only a root node, NullSet. The transactions in the original database are then read one by one, and each transaction is mapped onto a path in the FP-tree. The construction of the FP-tree is shown in Fig. 2. The prefix paths of element D in the FP-tree in the figure are {A, B} and {B}; these D nodes are called similar nodes, and a red dotted line is used to connect the similar nodes in the FP-tree.
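The two-scan construction of the FP-tree can be sketched as follows. This sketch covers only tree construction and prefix-path extraction, not the full recursive mining step. The transactions are hypothetical and were chosen so that item D ends up with the prefix paths {A, B} and {B}; they are not the transactions behind Fig. 2. The header table, which here stores a list of nodes per item, stands in for the links between similar nodes.

```python
from collections import Counter


class FPNode:
    """One node of the FP-tree; the root has item None."""

    def __init__(self, item, parent):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}


def build_fp_tree(transactions, min_count):
    # First scan: count each item and keep only the frequent ones.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in counts.items() if c >= min_count}

    root = FPNode(None, None)
    header = {}  # item -> list of nodes carrying that item (similar nodes)
    # Second scan: insert each transaction as a path, most frequent item first.
    for t in transactions:
        ordered = sorted(
            (i for i in set(t) if i in frequent),
            key=lambda i: (-frequent[i], i),
        )
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
                header.setdefault(item, []).append(child)
            child.count += 1
            node = child
    return root, header


def prefix_paths(item, header):
    """Return (path from root, count) for every node that carries `item`."""
    paths = []
    for node in header.get(item, []):
        path, parent = [], node.parent
        while parent is not None and parent.item is not None:
            path.append(parent.item)
            parent = parent.parent
        paths.append((path[::-1], node.count))
    return paths


# Hypothetical transactions, chosen for illustration only.
toy = [["A", "B", "D"], ["B", "D"], ["A", "B"], ["A", "C"], ["A", "B", "C"]]
root, header = build_fp_tree(toy, min_count=2)
print(prefix_paths("D", header))
```

Transactions that share a prefix share nodes in the tree, which is where the compression comes from.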

3

Construction of an association rule-based accident causation model for power marketing field operations

3.1

Analysis of the correlation between the causes of accidents in power marketing field operations

Following the association rule mining steps, the maximal frequent itemsets were searched from the power marketing field operation accident database. The resulting frequent itemsets of power operation accident causes are shown in Table 1.

Table 1.

Frequent itemsets of power operation accident causes

Accident category	Frequent set	Support
Electrical shock (R1)	{A13,C22,D13,D22,D32,F12,F13,R1}	0.185
	{A13,A33,B21,C13,D12,F12,F13,R1}	0.196
	{A13,A32,C21,E13,F12,F13,F14,R1}	0.221
Object strike (R2)	{A13,A33,B22,C11,D41,F12,F13,R2}	0.301
	{A13,A33,B22,C11,D14,F12,F13,R2}	0.337
High fall (R3)	{A22,B23,C21,D21,D31,E14,F14,R3}	0.168
	{A12,A51,B14,D11,E22,F11,F12,R3}	0.184
	{A13,A33,C12,D11,D33,F12,F13,R3}	0.335

Each frequent itemset of accident causes contains a chain of causes spanning multiple levels, from corporate management to the work site, and each of these factors is an important feature for understanding the operational dynamics of power marketing sites. To further clarify how accidents evolve, the confidence level between the factors in each frequent itemset is calculated to determine the degree of association between the causal factors. To ensure that the mined rules are highly reliable, the minimum confidence level is set to 0.8. The confidence levels of the association rules for the causes of accidents in electric power operations are shown in Table 2. The confidence level reflects how strongly the result of an association rule depends on its preconditions: the higher the confidence level, the higher the probability that the result also occurs when the preconditions occur. For example, the association rule “{A13, C21}=>A32” with serial number 11 in the table has a confidence level of 0.944, indicating that if two unsafe factors, weak safety awareness (A13) and lack of safety shelter (C21), are present at the same time during a power marketing field operation, the operator is very likely to expand the scope of operation without authorization (A32). Each strong association rule mined from the historical accident data of power marketing field operations is a basis for understanding the evolution of such accidents and guides their prevention.

Table 2.

Confidence of the association rules for power operation accident causes

Accident category	Serial number	Precondition	Results	Confidence
Electrical shock (R1)	1	A13,D13	C22	0.902
	2	C22,D32	R1	0.991
	3	D22	D32	0.829
	4	F12,F13	A13	0.969
	5	D12	B21	0.993
	6	A13,D12	A33	0.92
	7	A33,D12	C13	0.944
	8	B21,C13	R1	0.997
	9	F14	C21	0.84
	10	A32,E13	R1	0.996
	11	A13,C21	A32	0.944
Object strike(R2)	12	A33,D41	C11	0.944
	13	A13	A33	0.815
	14	F12,F13	A13	0.964
	15	D14	B22	1.000
	16	A33,D14	C11	0.938
	17	B22,C11	R2	1.000
High fall(R3)	18	D21,D31	A22	0.959
	19	F14	C21	0.927
	20	C21,E14	B23	0.998
	21	A22,B23	R3	1.000
	22	F11	A12	0.901
	23	E22	B14	0.92
	24	A12	A51	0.983
	25	A51	B14	0.841
	26	D11,B14	R3	0.996
	27	D33,A13	A33	0.936
	28	F12,F13	A13	0.964
	29	A33,D11	C12	0.82
	30	C12,D11	R3	1.000

3.2

Modeling the Causation of Accidents in Electric Power Operations

In order to reflect the evolution of power marketing field operation accidents more intuitively, this section describes the occurrence mechanisms of electrocution, fall-from-height, and object striking accidents, which occur frequently in power operations, by combining the results of the accident cause association analysis. The preconditions and results of the accident cause association rules are taken as the nodes of the accident causation model, and directed edges are used to indicate the causality between preconditions and results. The resulting typical accident causation model of electric power operations is shown in Fig. 3. According to the actual meaning of the factor represented by each node, the causation nodes can be divided into management behavior nodes, habitual behavior nodes, and unsafe behavior and unsafe physical state nodes.

1)

Management behavior analysis

In electric power operations, the lack of on-site supervision (F12) and safety training (F13) are the most common management failures in various types of accidents, which can easily lead to a lack of safety awareness among operators. Lax examination of operator qualifications (F11) is an important cause of fall accidents in work-at-height. Work-at-height belongs to special operations, and operators must undergo appropriate technical training and practical operational skills assessment to obtain a special operations license. Therefore, unlicensed personnel who participate in work-at-height without examination are more likely to make operational errors during operation, increasing the chance of accidents. Work safety measures are initiatives taken to prevent operational risks. Failure to implement necessary work safety measures (F14) leads to increased operational risks, which, in combination with unsafe human behavior, are more likely to evolve into personal injury or death accidents.

2)

Analysis of habitual behavior

In addition to the influence of genetic factors and the social environment, the habitual behavior of operating personnel is, to a large extent, the result of organizational management factors, and habitual behavior manifests itself explicitly as specific unsafe behaviors in the production process. As can be seen from the figure, at the level of habitual behavior, the low skill level of operators (A12), poor safety awareness (A13), and poor mental state (A22) are the main factors leading to accidents in electric power operations. Electric power operation is a high-risk production activity; if the personnel engaged in it have not mastered the appropriate safe operation skills, operational errors may occur and threaten the personal safety of the operators. In the production process, operators’ loss of safety awareness and inadequate understanding of the potential risks in the operation area greatly increase the possibility of unsafe behaviors and unsafe physical conditions. When working near wells, pits, ditches, and holes, the operator’s poor mental state and inability to correctly identify safety hazards increase the likelihood of accidental falls.

3)

Analysis of unsafe behavior and unsafe physical state

In the process of electric power production, the spatial span of the work area is large, and the potential risk factors in different operation areas are different, so the specific manifestations of unsafe behavior and unsafe physical state differ between types of accidents. The direct causes of electrocution accidents can be divided into three groups:

(1)

In a power-outage operation, the de-energized equipment or line is not connected to a grounding wire (C22), and the equipment or line is suddenly energized outside the planned operation period, resulting in the electrocution of the operator.

(2)

During non-electrification work, the operator touches energized equipment or line (B21) without wearing insulated gloves or boots (C13), resulting in electric shock.

(3)

When working near energized equipment, the lack of safety cover (C21) causes the operator to expand the scope of work without authorization (A32), resulting in accidental contact with a nearby energized body.

There is only one combination of unsafe behavior and unsafe object in an object striking accident: during a vertical cross operation or lifting operation, the operator enters the construction site without wearing a helmet (C11), and equipment or a facility accidentally falls from height (B23) and strikes the operator.

Unsafe behavior and unsafe physical manifestations in fall-from-height accidents are as follows: (1) In work-at-height, physical structural damage occurs to the tower or work surface where the operator is located (B14), causing protection such as safety belts to fail and the operator to fall. (2) In aerial work, the operator is not wearing a safety belt (A12), loses his footing, and falls from a high place. (3) When working near the edge of wells, pits, ditches, and holes, the safety cover is missing (C21), and the operator’s mental state is poor (A22), so the operator is unable to correctly recognize the safety hazards, and an accidental fall occurs.
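The node-and-edge construction used for the causation model in this section can be sketched as a small directed graph. The rules below are the electric shock (R1) rows of Table 2; the graph representation and the helper names (`build_graph`, `direct_causes`, `root_causes`) are assumptions for illustration and are not the authors' implementation of Fig. 3.

```python
from collections import defaultdict

# Electric shock (R1) rules from Table 2: (serial, preconditions, result, confidence).
rules_r1 = [
    (1, ("A13", "D13"), "C22", 0.902),
    (2, ("C22", "D32"), "R1", 0.991),
    (3, ("D22",), "D32", 0.829),
    (4, ("F12", "F13"), "A13", 0.969),
    (5, ("D12",), "B21", 0.993),
    (6, ("A13", "D12"), "A33", 0.92),
    (7, ("A33", "D12"), "C13", 0.944),
    (8, ("B21", "C13"), "R1", 0.997),
    (9, ("F14",), "C21", 0.84),
    (10, ("A32", "E13"), "R1", 0.996),
    (11, ("A13", "C21"), "A32", 0.944),
]


def build_graph(rules, min_conf=0.8):
    """Directed edges precondition -> result for rules meeting the minimum confidence."""
    edges = defaultdict(set)
    for _, preconditions, result, conf in rules:
        if conf >= min_conf:
            for factor in preconditions:
                edges[factor].add(result)
    return edges


def direct_causes(edges, accident):
    """Factors with an edge pointing straight at the accident node."""
    return sorted(node for node, targets in edges.items() if accident in targets)


def root_causes(edges):
    """Factors that have outgoing edges but no incoming edge."""
    has_incoming = {t for targets in edges.values() for t in targets}
    return sorted(node for node in edges if node not in has_incoming)


edges = build_graph(rules_r1)
print(direct_causes(edges, "R1"))
print(root_causes(edges))
```

In this sketch the root nodes include the management factors F12, F13, and F14, which matches the role of management behavior described in item 1) as the origin of the causal chains.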

4

Risk level identification methodology for power marketing field operation types

4.1

Power marketing metering field operation violation identification methods

4.1.1

Power Marketing Metering Field Operation Violation Feature Extraction

After the image is preprocessed, it is further analyzed to extract the features of field operation violations. The image is first filtered: the feature extraction layer uses a 3×3 convolution kernel for deep processing of the image to extract the key features in the operation image. The deep processing is expressed as: $D = \Gamma(a)$ (4)

where a is the initial value of the nonlinear mapping used to extract the features of the power marketing metering field operation image.

Then, an adaptive filter, which requires no a priori knowledge of the signal or the noise, is selected to reject the extracted noise features, and the image is reconstructed using the effective features [23]. The power marketing metering field operation violation parameter f(x) is: $f(x) = D(x) + k(x)$ (5)

where D(x) is the modulation parameter of the input signal in the adaptive filter. k(x) is the original error signal.

The sampled data is used as the core of the weighted average to reduce the error. The phase frequency weighting and averaging parameter d(x) of the low-pass filter is: $d(x) = \frac{\sum_{i=1}^{n} s u (x - p)}{\sum_{i=1}^{n} h p}$ (6)

where s is the output error of the filter. p is the optimal convergence signal. h is the least squares parameter of the smooth signal.

The three information connection layers play a key role in the convolutional neural network; they are responsible for fusing the shallow features, deep features, and overall features of the power marketing metering field operation images, respectively. Each information connection layer contains an enhancement module and a compression module, and through the cooperation of these two modules, the output of the three information connection layers is: $D_{n} = g(x)\, d(x)_{n-1}, \quad n = 1, 2, 3$ (7)

where g is the nonlinear mapping of the 3 information connection layers. D_n is the output of the 3 information connection layers, and the output result is the fusion result of power marketing metering field operation violation features.

4.1.2

Constructing a model for identifying irregularities in power marketing metering field operations

After the violation features of power marketing metering field operations have been extracted, the violation recognition model is constructed to realize the automatic recognition of violations. Assuming that there are multiple network nodes in the machine vision network architecture, the power marketing metering field operation violation recognition model is: $M = \frac{\varphi}{1 + \sigma e}$ (8)

where σ is the regression parameter. e is the power marketing metering field operation violation data vector.

Violation training and identification can be realized with the help of this model. The core of the model lies in the accurate use of the loss function, which realizes the efficient and accurate identification of violations through the accurate calculation of operational losses. The loss detection function is: $N = L_{\epsilon} + \delta e + \varepsilon L + \theta L_{k}$ (9)

where L_ϵ is the characteristic value of the movement for each individual. e is the coordinate error value on the image data. L_k is the critical point loss function. δ, ε and θ are the weighted values of the above parameters.

To ensure the safe use of construction equipment, the overall loss characteristic of the construction can be calculated according to Eq. (9) and compared with the maximum loss value of the construction. If the combined loss characteristic is outside the safety limits, it can be determined that there is a violation. Thus, the identification of power marketing metering field operation violations is completed.
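The decision step described above, a weighted sum following Eq. (9) compared against a maximum loss value, can be written as a short sketch. All numeric values below are hypothetical, the term L is used exactly as it appears in Eq. (9) without further interpretation, and the maximum loss value `n_max` is an assumed parameter; the paper does not report these values.

```python
def combined_loss(l_eps, e, l, l_k, delta, eps, theta):
    """N = L_eps + delta * e + eps * L + theta * L_k, following Eq. (9)."""
    return l_eps + delta * e + eps * l + theta * l_k


def is_violation(n, n_max):
    """Flag a violation when the combined loss exceeds the maximum allowed value."""
    return n > n_max


# Hypothetical component losses and weights, for illustration only.
n = combined_loss(l_eps=0.2, e=0.5, l=0.1, l_k=0.4, delta=0.5, eps=1.0, theta=0.25)
print(round(n, 2), is_violation(n, n_max=0.6))
```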

4.2

Behavioral assessment model based on power marketing field operation scenarios

Before constructing the power marketing site operation violation assessment model, an assessment method must be formulated. Analysis of the various types of operation violations shows that certain violations are common to all marketing site operation scenarios; for example, failure to wear a helmet or failure to wear overalls may be a violation in every scenario. These items therefore need to be detected in any scenario. Likewise, smoking and similar behaviors are explicitly prohibited in any scenario and trigger an alarm when detected. For work scenarios such as electric and gas welding, additional assessment methods need to be set on top of those for regular operations [24].

4.2.1

Evaluation functions

According to the violation determination standard, situations can arise in which the operator has brought a helmet into the workplace but has put it aside rather than wearing it correctly. To handle such cases, this paper adopts the intersection over union (IoU) to judge the degree of positional overlap between recognition boxes as an additional condition for determining safe operation. The calculation of the IoU is shown in Fig. 4: it is the ratio of the intersection to the union of the two target rectangular boxes. As the figure shows, the value is 0 when the two target boxes do not intersect and 1 when they completely overlap. In this paper, the degree of positional overlap between the person and a key item is calculated as: $\mathrm{IoU} = \frac{M(x_{p}) \cap M(x_{n})}{M(x_{p}) \cup M(x_{n})}$ (10)

where M(x_p) represents the pixel area of the operator’s “person” target box and M(x_n) represents the pixel area of the target box of a key item that the operator needs to wear, such as a helmet, goggles, or protective gloves.

From the above equation, the behavioral assessment function for the regular operation category of “working without helmet” can be defined as: $J_{helmet} = \frac{M(x_{p}) \cap M(x_{helmet})}{M(x_{p}) \cup M(x_{helmet})}$ (11)

When multiple additional conditions need to be satisfied, the product of the individual IoU values is defined as a logical judgment function and used as the criterion for detecting whether there is a violation. Since not wearing goggles and protective gloves during electric and gas welding work is a violation, these objects need to be identified, and to ensure that they are worn correctly, each key item must have a certain positional relationship with the target box of the operator. The behavioral assessment function for the scenario-specific operation category “electric and gas welding work scene” is therefore defined as: $J_{welding} = \frac{M(x_{p}) \cap M(x_{goggles})}{M(x_{p}) \cup M(x_{goggles})} \times \frac{M(x_{p}) \cap M(x_{S\_gloves})}{M(x_{p}) \cup M(x_{S\_gloves})}$ (12)

where M(x_goggles) and M(x_S_gloves) represent the pixel areas of the “goggles” and “S_gloves” identification boxes, respectively, and J_welding represents the degree of overlap between the operator and the goggles and protective gloves.

If J is 0, i.e., there is no intersection between the target frames, it means that the helmet, goggles, and protective gloves are not worn correctly according to the regulations.
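The product form of Eqs. 11 and 12 can be sketched as below. This is an illustration under stated assumptions rather than the authors’ implementation: the box format `(x_min, y_min, x_max, y_max)`, the function names, and the example coordinates are all hypothetical.

```python
from math import prod
from typing import Iterable, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = w * h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def assessment_score(person: Box, items: Iterable[Box]) -> float:
    """Behavioral assessment function J: product of person-item IoU values.

    With one item this reduces to J_helmet (Eq. 11); with goggles and
    gloves it is J_welding (Eq. 12). A single disjoint item makes J zero.
    """
    return prod(iou(person, item) for item in items)


if __name__ == "__main__":
    person = (0.0, 0.0, 100.0, 200.0)
    goggles = (40.0, 10.0, 60.0, 20.0)        # inside the person box
    gloves_worn = (10.0, 100.0, 30.0, 120.0)  # inside the person box
    gloves_aside = (200.0, 200.0, 220.0, 220.0)  # away from the person
    print(assessment_score(person, [goggles, gloves_worn]))
    print(assessment_score(person, [goggles, gloves_aside]))  # J is 0
```

Because the factors are multiplied, one missing or misplaced item is enough to drive J to 0, which is the case described above.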

To construct a scientific and reasonable behavioral assessment model for electric power operations, a threshold I must be defined for the assessment function. When the behavioral assessment function J reaches the threshold I, the operator is considered to be wearing the key items specified for the current operation scenario correctly, and the operation is judged safe; otherwise it is regarded as a violation. The definition of the threshold I therefore directly affects the accuracy of the violation judgment, and setting a reasonable I requires knowing how sensitive the IoU of small targets is. In the COCO dataset, targets smaller than 32 × 32 pixels are defined as small targets; they are more likely than medium and large targets to be overlooked during target detection, and their IoU sensitivity differs.

Assume that boxes A, B, and C each measure 6 × 6 pixels and that B and C are offset from A along the diagonal by 1 and 4 pixels, respectively. The IoU of the small target with the 1-pixel offset is:

$IoU = \frac{A \cap B}{A \cup B} = 0.53$ (13)

With the 4-pixel offset it changes to:

$IoU = \frac{A \cap C}{A \cup C} = 0.06$ (14)

Now assume that A, B, and C each measure 36 × 36 pixels, with the same diagonal offsets of 1 and 4 pixels. The IoU of the normal-size target with the 1-pixel offset is:

$IoU = \frac{A \cap B}{A \cup B} = 0.90$ (15)

With the 4-pixel offset it changes to:

$IoU = \frac{A \cap C}{A \cup C} = 0.65$ (16)

This example shows that the IoU of the small target drops sharply, from 0.53 to 0.06, whereas the IoU of the normal-size target falls only from 0.90 to 0.65, a much smaller rate of change. The reasonableness of the threshold I therefore depends mainly on small targets. The related literature shows that small targets occupy a small area of the image and that their IoU rarely reaches 0.5. The threshold in this paper is therefore selected from the range 0 to 0.5.
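The four values in Eqs. 13 to 16 can be reproduced with a short calculation, assuming that a diagonal deviation of d pixels means a shift of d pixels along both the x and y axes. This reading is an interpretation, but it matches the reported figures. The function name is illustrative.

```python
def shifted_square_iou(side: int, shift: int) -> float:
    """IoU of two side x side squares, one shifted by `shift` pixels in x and y."""
    overlap = max(0, side - shift) ** 2  # the overlap is a smaller square
    union = 2 * side ** 2 - overlap
    return overlap / union


if __name__ == "__main__":
    for side in (6, 36):
        for shift in (1, 4):
            value = shifted_square_iou(side, shift)
            print(f"{side}x{side} box, {shift}-pixel offset: IoU = {value:.2f}")
    # Prints 0.53 and 0.06 for the 6x6 box, 0.90 and 0.65 for the 36x36 box.
```

For the 6 × 6 case with a 1-pixel offset, the overlap is 5 × 5 = 25 pixels and the union is 36 + 36 − 25 = 47 pixels, giving 25/47 ≈ 0.53.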

4.2.2

Grading of violations

Constructing a sound and scientific assessment method is the first condition for realizing the power operation violation assessment model. The traditional LEC method of operating-condition hazard assessment measures the hazard level D of operating conditions along three dimensions: the likelihood L that an accident (including a hazardous event) occurs, the frequency E with which operators are exposed to the hazardous environment, and the probable consequence C of an occurrence. The LEC method is mainly aimed at construction conditions at the operating site, such as common hidden dangers, failure to wear helmets, and failure to implement daily inspections.

The functional relationship between the indicators in the traditional LEC method is:

$D = L \cdot E \cdot C$ (17)

To better describe the hazardousness of operating conditions, L takes values from 0 to 10, E from 0 to 10, and C from 1 to 100. Because the assessment of electric power operation violations differs in certain respects from the LEC assessment, this paper adjusts the assessment parameters on the basis of the LEC method: the frequency of violations is defined as F, the probability that a violation leads to an accident as P, and the severity of the resulting accident as C. These three quantities serve as the measurement parameters of the assessment method in this paper.

The violation level is defined by calculating the D-value. For violation i with corresponding violation level D_i, D_i is given by:

$D_{i} = F_{i} \cdot P_{i} \cdot C_{i}$ (18)

Where D_i indicates the danger level of violation i. F_i denotes the frequency of occurrence of violation event i. P_i denotes the probability that the violation event i causes an accident. C_i denotes the severity of the violation event i causing an accident.
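Eq. 18 is a plain product, which the following sketch makes concrete. The class name, the violation names, and every F, P, and C score are hypothetical placeholders chosen for illustration; the scoring scales and the cut-offs that map D_i to a grade are not given in this section and are not assumed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Violation:
    """One violation type with its three assessment parameters."""
    name: str
    frequency: float    # F_i: how often violation i occurs
    probability: float  # P_i: probability that violation i causes an accident
    severity: float     # C_i: severity of the accident caused by violation i

    @property
    def danger_level(self) -> float:
        """D_i = F_i * P_i * C_i (Eq. 18)."""
        return self.frequency * self.probability * self.severity


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    violations = [
        Violation("no_helmet", frequency=6, probability=3, severity=15),
        Violation("welding_without_goggles", frequency=2, probability=6, severity=7),
    ]
    # Rank violations from most to least dangerous.
    for v in sorted(violations, key=lambda v: v.danger_level, reverse=True):
        print(v.name, v.danger_level)
```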

4.2.3

Modeling

In this paper, we utilize the improved YOLOv5s model for multi-task recognition and detection of six types of targets in the self-constructed electric power operation dataset: people, helmets, smoke, welding, goggles, and protective gloves. Based on the relationship between the person and the remaining five types of targets, the location relationship is judged, and the operator is evaluated to see if he is in a safe operating environment. If there is a violation, it is graded according to the violation grading method. The flowchart of the power operation violation assessment model is shown in Figure 5.

5

Power marketing field operation type risk level intelligent identification experiment

5.1

Model Performance Evaluation Experiments

5.1.1

Training Model Evaluation

The training configuration parameters are shown in Table 3. The Kinetics-skeleton behavior dataset was used for training and testing. It consists of 3D-like skeleton sequences obtained by extracting skeleton features from the Kinetics-400 dataset with the OpenPose algorithm and then using the confidence as the Z-axis. It contains about 240,000 video clips covering 400 action classes, with at least 600 clips of about 10 seconds each per action class. The cross-validation set contains 20,000 short video clips; each clip comes from a unique YouTube video and has been manually annotated over multiple rounds, and the set covers a wide range of actions. The environment was deployed on a deep learning server, and the model proposed in this paper was trained for a total of 60 iteration rounds on the training set. The model was evaluated on the cross-validation set every 10 rounds to record the Top-1 and Top-5 classification accuracy. The training results are shown in Table 4.

Table 3.

Training parameters

Training parameter	Parameter value
Training data set	35
Test data set	35
Learning rate	0.3
Iteration number	50

Table 4.

Data training results

Iteration round	Mean loss	Top-1 accuracy	Top-5 accuracy
10	4.6245	11.26%	32.76%
20	4.5156	14.45%	29.43%
30	3.8516	19.57%	42.72%
40	3.457	26.77%	48.67%
50	3.3162	25.66%	50.42%
60	3.562	31.22%	54.10%
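The Top-1 and Top-5 columns in Table 4 follow the usual top-k definition: a sample counts as correct when its true class is among the k highest-scoring predicted classes. A minimal sketch is given below; the function name and the toy scores are illustrative and are not taken from the experiment.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]], labels: Sequence[int], k: int
) -> float:
    """Fraction of samples whose true label is among the k top-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score; keep the first k.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        hits += label in ranked[:k]
    return hits / len(labels)


if __name__ == "__main__":
    # Toy class scores for three samples over three classes.
    scores = [
        [0.1, 0.7, 0.2],
        [0.5, 0.3, 0.2],
        [0.2, 0.3, 0.5],
    ]
    labels = [1, 1, 0]
    print(top_k_accuracy(scores, labels, k=1))
    print(top_k_accuracy(scores, labels, k=2))
```

Top-k accuracy can only stay equal or grow as k increases, which is why the Top-5 column is always at least as high as the Top-1 column.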

5.1.2

Behavioral Recognition Results

The robustness of the algorithm was tested in different environments involving strong light, low light, sparse scenes, and crowded scenes. The experimental scenarios are shown in Fig. 6, and precision rate and recall rate were used as performance metrics. The abnormal behavior recognition performance is shown in Table 5. As the table shows, the algorithm has the highest precision rate under strong light and sparse conditions and the lowest under low light and crowded conditions. The precision degradation shows that, with other conditions unchanged, crowding causes more false detections than low light. The recall rate differs little across conditions, indicating that low light and crowding do not cause more missed detections. The reason is that low light and occlusion in crowded scenes both lead to incomplete skeleton extraction, which causes more false detections: a normal behavior skeleton may be misdetected as an abnormal one, but an abnormal behavior skeleton is not detected as a normal one, so missed detections remain rare. Overall, the performance indicators of the algorithm differ little across environments, and the precision rate exceeds 90% in all of them, which shows that the algorithm has good robustness.

Table 5.

Abnormal behavior recognition performance

Influencing factor	Precision rate (%)	Recall rate (%)
Weak light	91.4	96.9
Weak light, sparse	95.7	96.4
Strong light	92.4	97.0
Strong light, sparse	97.3	96.4
Total	94.2	96.5

Abnormal behavior identification was also carried out using the optical flow method, Motion Instability, a support vector machine, and a Markov random field, and the results were compared with those of this paper’s algorithm. The comparison is shown in Table 6. As the table shows, this paper’s algorithm performs best in abnormal behavior identification. Motion Instability discriminates abnormal behavior by calculating motion instability from trajectories, and accurate tracking is difficult under serious occlusion. The optical flow method ignores the temporal dynamic correlation information of the moving target and uses an iterative solution method, which requires longer computation time and is more affected by noise. The support vector machine method has more difficulty selecting features under occlusion and is not applicable to environments with multiple bursts. The Markov random field can reflect only simple changes in the direction and speed of the target, so its recognition accuracy is low in complex motion scenes. The experimental results show that the algorithm in this paper performs better in abnormal behavior recognition.

Table 6.

Abnormal behavior identification performance comparison

Method	Accuracy rate (%)	Processing speed (FPS)
Optical flow method	90.52	8.82
Motion Instability	92.36	11.69
Support vector machine	80.63	16.80
Markov random field	83.55	15.06
This algorithm	93.87	18.33

5.2

Testing of model runtime effects

5.2.1

Evaluation of model performance

The model accuracy is shown in Fig. 7. The validation curve shows an overall model accuracy of 0.845, which indicates that the model in this paper correctly predicts most of the behavioral actions, although the validation curve is not smooth enough.

The precision and recall for the three types of actions, sitting (label 0), standing (label 1), and falling (label 2), are shown in Table 7. For the falling action, which is the focus of the model, the precision Prcn is 0.69 and the recall Rcll is 0.96. Prcn is the ratio of the number of correct predictions of falling to the number of all samples predicted as falling. Rcll is the ratio of the number of correctly predicted falls to the number of samples in the dataset that are actually falls.

Table 7.

Precision and recall for the three types of actions

Action	Precision	Recall
0. Sitting	0.71	0.81
1. Standing	0.94	0.86
2. Falling	0.69	0.96
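The Prcn and Rcll definitions used for Table 7 can be sketched from a confusion matrix as follows. The matrix layout (rows are true labels, columns are predicted labels), the function name, and the counts are assumptions made for illustration; the counts are invented and are not the values behind Table 7 or Figure 8.

```python
from typing import List, Tuple


def precision_recall(confusion: List[List[int]], cls: int) -> Tuple[float, float]:
    """Precision and recall of class `cls`.

    confusion[t][p] is the number of samples with true label t predicted as p.
    """
    true_positive = confusion[cls][cls]
    predicted_as_cls = sum(row[cls] for row in confusion)  # column sum
    actually_cls = sum(confusion[cls])                     # row sum
    precision = true_positive / predicted_as_cls if predicted_as_cls else 0.0
    recall = true_positive / actually_cls if actually_cls else 0.0
    return precision, recall


if __name__ == "__main__":
    # Invented counts; labels: 0 = sitting, 1 = standing, 2 = falling.
    confusion = [
        [8, 1, 1],
        [2, 7, 1],
        [0, 1, 9],
    ]
    prcn, rcll = precision_recall(confusion, cls=2)
    print(f"falling: Prcn = {prcn:.2f}, Rcll = {rcll:.2f}")
```

In this layout, sitting or standing samples predicted as falling lower the falling precision without affecting its recall, which depends only on how the true falling samples are classified.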

The confusion matrix is shown in Figure 8. Of the 250 samples that were actually falls, 195 were successfully predicted. Because the number of correctly recognized falls is high, the low Prcn for the falling action is in fact due to the low recognition accuracy for the sitting and standing actions. The recall rate is therefore the index of greater concern when judging whether the model can effectively recognize abnormal behaviors. The recall rate shows that only a very small number of actions labeled as falling in the dataset were recognized as other actions, and more than ninety percent of the falling samples were successfully recognized as falling. In actual electric power operation scenarios, it is sufficient that abnormal behavior be identified reliably so that it can serve to prevent, warn, and protect operating personnel. The recall rate for falling behavior reaches 0.96, so the vast majority of abnormal falling behaviors can be identified at this precision.

5.2.2

Comparison of feature values for each action

To evaluate the reasonableness of the three features, video samples from the Fall Detection Dataset are selected for testing, and the data of the three types of features are output separately.

The relative height coefficient of the human head is shown in Fig. 9, which plots the coefficient against the video frame number for the test video. The coefficient stays between 0.6 and 0.8 from frame 0 to frame 200, shows a clear upward trend as the video approaches frame 300, and rises to nearly 1.0 after frame 300. At around frame 350, it drops abruptly to nearly 0 and then recovers to about 0.19. The first change is consistent with the increase in head height when a person moves from sitting to standing. The second change, a drop to 0 followed by a partial recovery, is consistent with the abrupt decline in head height when the person falls from a standing position and the slight adjustment of head height as the person tries to regain posture after the fall. This analysis shows that the relative head height can effectively differentiate the three types of actions and highlights the difference between the abnormal behavior of falling and the other two routine actions.

The relative height coefficient of the human knee joint is shown in Fig. 10, which plots the coefficient against the video frame number for the test video. The relative knee height remains between 0.28 and 0.35 from frame 0 to frame 200, while the tester maintains the sitting action. From frame 250 to frame 350, which corresponds to the tester’s standing behavior, the relative knee height decreases somewhat. After frame 350, it decreases rapidly and remains at a very low value. The values in the first 200 frames and after frame 350 are consistent with the expected relative knee height in the sitting and falling scenarios. The overall trend of the curve shows that this feature can effectively distinguish the abnormal behavior of falling from the routine action of sitting down, but cannot distinguish between the two routine actions of standing up and sitting down.

The human leg angle cosine is shown in Fig. 11, which plots the cosine of the human leg angle against the video frame number for the test video.

From frame 0 to frame 200, the human body stays in a sitting position and the cosine value stays between -0.3 and 0. Between frames 250 and 350, the cosine drops sharply to -1 as the body changes from sitting to standing. Note that the cosine rises briefly before the body falls to the ground and then decreases and stabilizes after frame 330.

The cosine remains near 0 from frame 0 to frame 200, which is consistent with the near-right angle between the thigh and calf when the body is sitting. The cosine drops sharply to around -1 after the body changes to a standing position, which is consistent with the legs being close to a straight angle (180 degrees) at this time; the tester also made a bending motion that brought the angle between the thigh and calf below 180 degrees. The final cosine value stays at -1 because the testers in the sample kept their legs straight when they fell to the ground, but this is not representative, because the leg posture of a person who has fallen cannot be determined in advance. This analysis demonstrates that the thigh-calf angle cosine is effective in distinguishing between standing and sitting postures but is not a representative descriptor of the abnormal movement of falling to the ground.
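As a rough sketch of how such a leg angle cosine might be computed from 2D skeleton keypoints, the angle can be taken at the knee between the thigh vector (knee to hip) and the calf vector (knee to ankle). This choice of vectors is an assumption; it reproduces the values described above (about 0 when sitting, about -1 with straight legs). The function name and the keypoint coordinates are hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]  # (x, y) keypoint in image coordinates


def leg_angle_cosine(hip: Point, knee: Point, ankle: Point) -> float:
    """Cosine of the angle at the knee between the thigh and the calf."""
    thigh = (hip[0] - knee[0], hip[1] - knee[1])
    calf = (ankle[0] - knee[0], ankle[1] - knee[1])
    norm = math.hypot(*thigh) * math.hypot(*calf)
    if norm == 0.0:
        raise ValueError("hip, knee, and ankle must be distinct points")
    dot = thigh[0] * calf[0] + thigh[1] * calf[1]
    return dot / norm


if __name__ == "__main__":
    # Straight leg: hip, knee, and ankle on one vertical line.
    print(leg_angle_cosine(hip=(0.0, 0.0), knee=(0.0, 1.0), ankle=(0.0, 2.0)))
    # Seated leg: thigh horizontal, calf vertical.
    print(leg_angle_cosine(hip=(0.0, 0.0), knee=(1.0, 0.0), ankle=(1.0, 1.0)))
```

A straight leg gives a cosine of -1 whether the person is standing or lying on the ground, which illustrates why this feature alone cannot characterize a fall.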

In summary, this section analyzes how the three types of feature values change as the human posture changes. The relative head height can effectively differentiate the three types of actions, the relative knee height can effectively differentiate the routine action of sitting from the abnormal behavior of falling, and the thigh-calf angle cosine can effectively differentiate between standing up and sitting down. Fusing the three types of features makes it possible to distinguish the three types of actions accurately and to achieve the goal of accurately identifying abnormal behaviors. However, because some abnormal fluctuations remain in the feature values, more action classes and features need to be added to describe the transitions between actions.

5.2.3

Effect of the algorithm in practice

This subsection tests the performance of the algorithm using the COCO image dataset and the Fall Detection Dataset video dataset.

Video dataset testing: three videos are selected for testing. Video #1 and video #2 show the same scene from different angles, and video #3 shows a different scene from videos #1 and #2. The total numbers of frames of the test videos are 351, 261, and 427, respectively. The actual running data of the algorithm are shown in Table 8. The table shows that the recognition algorithm in this paper performs well on all three test videos, with a recognition accuracy for the falling action above 96% in each.

Table 8.

Actual running data of the algorithm

Video	Total frames	Fall action frames	Correctly identified frames	Correctly identified fall frames	Overall accuracy	Fall recognition accuracy
1	342	98	334	95	0.928	0.965
2	262	75	248	78	0.936	0.966
3	428	167	387	155	0.932	0.963

The test data for the electric power operation scene are shown in Table 9. Because little material on abnormal behaviors is available for the electric power operation scenario, the number of detected frames is taken as the total number of samples, and the number of frames in which the action is successfully recognized is used to evaluate the performance of the algorithm. The overall success rate of the algorithm in real operation scenarios, provided the key points of the operators are not occluded, reaches 0.942.

Table 9.

Test data for the power operation scenario

Test frame number	Correct number of action frames	Success rate
1086	1023	0.942

5.3

Identify and analyze the risk level of the power marketing field operation process

To analyze the risk SA of this paper’s method during the operation process, a 35 kV line maintenance operation from the accident sample set is selected as an example, with a monitoring interval of Δt = 5 min. The risk key element information is input to obtain the risk state of the operation process, and the change over time of the risk contribution value of each type of level 1 factor is shown in Figure 12. The figure shows that the initial risk level of this power marketing field operation is low and that the dominant risk elements are the personnel attribute class and the nature of operation class. As the operation went on, the operation time changed from daytime to nighttime, the weather conditions deteriorated, and the operation personnel became fatigued and their working mood dropped. Because other emergency repair operations drew personnel away, staffing reached only 78% of the requirement, and only cross operations could be carried out. In the end, an unsupervised operator did not follow the instruction steps of the operation form, transcribed the nameplate of the arrester on his own, and touched the charged part of the arrester by mistake, resulting in an accident.

The risk trend prediction curve for this operational process is shown in Figure 13. The graph effectively reflects the changes in operational risk during the operation. The final risk prediction level is 1, which is consistent with the real situation. If, during the initial rise in operational risk at monitoring points 50-60, operational adjustment decisions for key elements of operational management and psychological and behavioral risk are proposed in accordance with the direction of the risk profile, the risk level of the operation can be effectively reduced, and accidents can be avoided.

6

Conclusion

With the help of data mining technology, this paper studies an intelligent method for identifying the risk level of operation types, taking power marketing field operation types as the object.

Association rule analysis of the risk factors of power marketing field operations finds that the association rule with serial number 11 has a confidence level of 0.944. That is, if two unsafe factors, weak safety awareness of the operators and the lack of safety cover at the operation site, occur at the same time in a power marketing field operation, they easily lead to unauthorized expansion of the scope of the operation by the operators.

The recall rate of this paper’s method for recognizing falling behavior during operation reaches 96%, and the overall recognition success rate in the actual operation scenario reaches 94.2%. The experimental results show that the method in this paper distinguishes human behavioral feature values well and achieves good recognition performance.

Based on the above experimental results, the method proposed in this paper can effectively assess and analyze the key causes of electric power operation accidents, avoid the adverse effects of subjective factors in traditional operation risk assessment, accurately capture the risk status during the operation process, effectively track the risk development trend, and provide technical support for risk control over the whole process of electric power marketing on-site operation, with broad prospects for online application.

Lihua Zhang

Shujun Jing

Xu Chen

Xiang Wu

Yinzhe Xu

Published online: 21 Mar 2025

Received: 20 Nov 2024

Accepted: 18 Feb 2025

DOI: https://doi.org/10.2478/amns-2025-0560

Keywords: Association rules, Power marketing, Field operation, Risk level identification

© 2025 Lihua Zhang et al., published by Sciendo

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
