A multi-task scheduling algorithm for heterogeneous information security in the Internet of Things for electricity
Published online: 26 March 2025
Received: 10 Nov. 2024
Accepted: 16 Feb. 2025
DOI: https://doi.org/10.2478/amns-2025-0800
Keywords
© 2025 Xinghua Wang et al., published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
In recent years, with continued socio-economic development and advances in network information technology, Internet technologies have been applied ever more widely across all sectors of society [1-2]. Against this background, power consumption in every industry keeps growing, a situation that is not conducive to the stable, smooth, and sustainable development of the power industry [3-5]. At present, the application of the Internet of Things (IoT) for electric power in mainland China still carries many security risks [6], for example at the perception layer, the network layer, and the application layer [7-8]. For these specific risks in the application of the power IoT, power enterprises need to formulate scientific, reasonable, and effective information security protection measures tailored to their actual situation, so as to maximize the information security of the power IoT; only in this way can power enterprises operate and develop in a more sustainable, healthier, and more orderly manner under fierce market competition [9-13].
With the construction of the new power system, new source-load devices represented by clean energy and electric vehicles are being connected to the grid and multiple types of power services are being deployed, so the new power system faces many network threats such as terminal access security, data security, and communication security [14-16]. There is therefore an urgent need to study the multi-service characteristics of power IoT information security and to build a high-computing-power operation system for security service tasks. Cloud computing data centers provide powerful computing capacity and make cloud-based cryptographic control and security service hardware systems feasible [17-19].
Different operations of the new power system have different information security level requirements. Highly classified or core computing tasks must be executed in dedicated physical cryptographic machines (PCMs), which have a complete key protection mechanism and guaranteed security but low flexibility, computational efficiency, and scalability [20-22]. Virtual cryptographic machines (VCMs), on the other hand, provide higher computational power and easily scalable resources, but they cannot meet the security level requirements of all services. To resolve the shortage of computational resources for different security service tasks, the power IoT must adopt a new hybrid underlying hardware infrastructure in which VCMs and PCMs coexist, which is fundamentally different from ordinary cloud computing platforms [23-25]. There is therefore an urgent need to study task scheduling algorithms for this new underlying hardware platform, so as to combine the security advantage of PCMs with the computational efficiency advantage of VCMs [26-27].
In this paper, based on the selection of cloud services for the data information security system, a secure transmission model is constructed, and the security coefficient of power data transmitted to each cloud service layer is calculated, forming the secure-transmission data set of all power data. The information entropy of the power data transmission nodes is calculated from the network characteristics of those nodes, particle swarm optimization (PSO) combined with a decision tree is used to discretize the data, and secure transmission of power data is realized by constructing a security system. Time model constraints are then proposed for scheduling multiple task DAGs on a heterogeneous platform, two optimization objectives are defined, the mathematical model of task scheduling is given, and the load balancing in the scheduling model is designed. Multi-task clustering of security services is used to enhance the importance of task feature attributes. According to the actual application of power IoT information security services, computing nodes are created and matched; after the best match between tasks and VCM classes is obtained, the quantum particle swarm optimization algorithm is used to complete this part of the task scheduling and optimize the task completion time. Finally, a simulation environment is built, and the performance of the scheduling algorithm is analyzed from the perspectives of task clustering results, security, and completion time.
Assuming that the number of information samples of computer cloud service data information subjected to cyber attacks is
According to equation (1), the security factor
According to the calculation results of equations (2) and (3) above, the securely transmitted power data are integrated and managed, and the collected data are analyzed with the cluster analysis method to obtain the secure-transmission fuzzy set, whose expression is as follows:
where
If the set of measurement indexes of the security transmission model of electric power data information is
Where
The secure-transmission weight information represents the secure-transmission result of all power data in the cloud under attack. Its weight is directly proportional to the security coefficient of the data transmitted in the cloud: the higher the weight value, the higher the probability that the power data is transmitted securely. The secure-transmission weight matrix is calculated as follows:
Where, when
According to the above equation, the influence of the power data secure-transmission weight matrix on secure data transmission can be calculated, from which the secure-transmission measure is obtained as follows:
Based on the calculation of the above measurements, the comparative analysis of the security of power data information is obtained:
where
Assuming that there is
Combining the node entropy values obtained above, the power data in the computer cloud service environment is discretized. The state of power data transmission in the cloud service environment is determined from the access state of each node in the PSO decision tree, and the shortest distance between a transmission node and the root of the decision tree is obtained. The specific process is as follows:
When the number of transmission nodes is
The scope of security during the secure transmission of power data information is described by Equation (14):
Where,
Assuming that the number of power data information transmissions is
In the formula, the result of hierarchical classification of transmitted power data information using decision tree is
Through the information gain method described above, all information attributes in the process of power data transmission in the computer cloud service environment are obtained, the characteristics of the target information are accurately described, the power data is comprehensively inspected according to these characteristics, and secure transmission of power data is realized by constructing a security system.
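As a concrete illustration of the information gain calculation referred to above, the following sketch computes the gain of one discretized node attribute over a labelled set of transmission records; the type names and the toy data are assumptions made for illustration, not the paper's implementation.

```go
package main

import (
	"fmt"
	"math"
)

// Record is one transmission sample: a discretized node attribute and a security label.
type Record struct {
	Attr  string
	Label string
}

// entropy computes the Shannon entropy (in bits) of a label set.
func entropy(labels []string) float64 {
	counts := map[string]int{}
	for _, l := range labels {
		counts[l]++
	}
	h, n := 0.0, float64(len(labels))
	for _, c := range counts {
		p := float64(c) / n
		h -= p * math.Log2(p)
	}
	return h
}

// Gain returns H(S) minus the weighted entropy of the subsets induced by Attr.
func Gain(records []Record) float64 {
	var all []string
	groups := map[string][]string{}
	for _, r := range records {
		all = append(all, r.Label)
		groups[r.Attr] = append(groups[r.Attr], r.Label)
	}
	g := entropy(all)
	for _, labels := range groups {
		g -= float64(len(labels)) / float64(len(all)) * entropy(labels)
	}
	return g
}

func main() {
	recs := []Record{
		{"low", "secure"}, {"low", "secure"},
		{"high", "attacked"}, {"high", "secure"},
	}
	fmt.Printf("information gain of the attribute: %.3f\n", Gain(recs))
}
```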
Before establishing the time model, the tasks on this heterogeneous platform need to be described. A heterogeneous computing platform contains a large number of machines and users, so scheduling concerns multiple task DAGs; the platform also contains many processor types with different characteristics, so the task types each processor suits are not the same. On such a platform, a complex task submitted by a user can be divided into a number of subtasks whose computational characteristics are known, so that the processor type best suited to each subtask can be found. Figure 1 shows a complex task divided into subtasks with known computational characteristics, represented as a directed acyclic graph (DAG); different shapes of subtasks represent different computational characteristics and therefore different suitable computing devices. In the example, one task is divided into six subtasks: three whose computational characteristics suit the CPU, two that suit the GPU, and one that suits the FPGA. After division, independent subtasks can be executed in parallel, so they can be scheduled onto different computing devices to minimize the user's waiting time.

Figure 1. A complex task is divided into several subtasks with different computational characteristics
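To make the structure in Figure 1 concrete, the sketch below shows one possible way to represent such a task as a DAG of subtasks, each tagged with the device type that suits its computational characteristics; the type and field names are illustrative, not taken from the platform described here.

```go
package main

import "fmt"

type Device int

const (
	CPU Device = iota
	GPU
	FPGA
)

// Subtask is one node of the task DAG.
type Subtask struct {
	ID       int
	Suited   Device // device type matching the subtask's computational characteristics
	Preds    []int  // IDs of predecessor subtasks (DAG edges)
	Workload float64
}

// Task is one user-submitted complex task, divided into subtasks.
type Task struct {
	ID       int
	Subtasks []Subtask
}

func main() {
	// The six-subtask example from the text: three CPU, two GPU, one FPGA.
	t := Task{ID: 1, Subtasks: []Subtask{
		{ID: 0, Suited: CPU}, {ID: 1, Suited: CPU, Preds: []int{0}},
		{ID: 2, Suited: GPU, Preds: []int{0}}, {ID: 3, Suited: GPU, Preds: []int{0}},
		{ID: 4, Suited: CPU, Preds: []int{1, 2}}, {ID: 5, Suited: FPGA, Preds: []int{3, 4}},
	}}
	fmt.Printf("task %d has %d subtasks\n", t.ID, len(t.Subtasks))
}
```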
For heterogeneous computing platforms, the most intuitive thing that users can see after submitting their own tasks is when their tasks are completed, so for this time model, the optimization objective considered is to minimize the average task completion time of the system, i.e., minimize
To calculate the actual completion time of all tasks in the system, the completion time of each task must be calculated; and since the smallest scheduling unit in the system is the subtask, the actual completion time of each task requires calculating the actual completion time of each subtask into which that task is divided. To calculate the actual completion time JAFTime of a subtask, both the subtask's actual start execution time and its actual execution time on the server are needed, and for each subtask it must first be confirmed to which computing device it is scheduled before the next step of the calculation can be carried out. The calculation of the actual start execution time of subtask
Where
Knowing the actual start execution time of a subtask and adding its actual execution time on the computing device gives the subtask's actual completion time. The actual execution time of a subtask differs across computing devices because their computing power differs, so the actual execution time of a subtask is calculated as shown in Equation (20):
where
Therefore the calculation of the actual completion time
For the transmission time in Eq. (21), since it has already gone through the task merging phase, here if for subtask
where
Therefore, for the calculation of the transmission time in Eq. (19), for the subtasks in the set of precursor subtasks that have a constraint relationship with
After the above derivation, for task
For the whole system, the average task completion time requires the sum of the execution times of all tasks in the system; after the actual completion time of each scheduled task is obtained, we can calculate the sum of the execution times of all tasks in the system
Therefore the average completion time
Therefore the optimization objective of this time model is to minimize the average completion time of the task, see equation (27):
With the optimization objectives of maximizing the Task Safety Factor (TSF)
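Because the equations of the time model above did not survive extraction, the sketch below only illustrates, under common list-scheduling assumptions, the bookkeeping the text describes: a subtask starts no earlier than its predecessors finish (plus any transmission time) and no earlier than its device becomes free, its actual finish time adds its execution time on that device, and a task completes when its last subtask completes; the average completion time is then taken over all tasks. All names are illustrative, not the paper's notation.

```go
package main

import "fmt"

// Sub is one scheduled subtask.
type Sub struct {
	Exec     float64   // execution time on the assigned device
	Preds    []int     // predecessor subtask indices
	Transfer []float64 // transmission time from each predecessor (0 if co-located)
	Device   int       // assigned computing device
}

// finishTimes computes the actual finish time of every subtask of one task,
// assuming the slice is given in a valid topological order.
func finishTimes(subs []Sub, deviceFree map[int]float64) []float64 {
	aft := make([]float64, len(subs))
	for i, s := range subs {
		start := deviceFree[s.Device] // device availability
		for k, p := range s.Preds {   // predecessor finish + transmission
			if t := aft[p] + s.Transfer[k]; t > start {
				start = t
			}
		}
		aft[i] = start + s.Exec
		deviceFree[s.Device] = aft[i]
	}
	return aft
}

func main() {
	subs := []Sub{
		{Exec: 2, Device: 0},
		{Exec: 3, Preds: []int{0}, Transfer: []float64{1}, Device: 1},
		{Exec: 1, Preds: []int{0, 1}, Transfer: []float64{0, 2}, Device: 0},
	}
	aft := finishTimes(subs, map[int]float64{})
	taskFinish := aft[len(aft)-1] // a task completes when its last subtask completes
	fmt.Printf("subtask finish times %v, task completion %.1f\n", aft, taskFinish)
}
```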
Load balancing plays a vital role in scheduling systems, and a load-balanced system brings multiple benefits to network applications and services. Load balancing effectively optimizes system performance by distributing tasks across multiple servers. This decentralized approach prevents any single server from being overloaded and becoming a system bottleneck, and ensures that each server can handle requests efficiently, thus improving the responsiveness of the entire system.
In addition, load balancing optimizes resource utilization to ensure that each server is properly loaded, improving the overall efficiency of the system. In the face of growing computing demands, load balancing provides flexible scalability for the system, allowing new servers to be easily added and tasks to be automatically assigned without the need for large-scale changes to the entire system. Load balancing effectively prevents overloading of servers and uses different algorithms and rules to decide how to allocate tasks to meet system performance and availability requirements.
In summary, the role of load balancing in a scheduling system is multifaceted, covering performance optimization, high availability, resource utilization, scalability, fault recovery, and global load distribution to ensure that the system is able to respond robustly to a variety of challenges and provide efficient and reliable services.
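As one example of the allocation rules mentioned above, the sketch below implements weighted least connections, the kind of rule referred to later in the experiments; it is an illustrative assumption only and is not the load balancing module designed in this paper.

```go
package main

import "fmt"

// Server is one computing node in the pool.
type Server struct {
	Name   string
	Weight float64 // relative capacity
	Active int     // connections (tasks) currently being served
}

// pick returns the index of the server with the smallest Active/Weight ratio.
func pick(servers []Server) int {
	best := 0
	for i, s := range servers {
		if float64(s.Active)/s.Weight < float64(servers[best].Active)/servers[best].Weight {
			best = i
		}
	}
	return best
}

func main() {
	// Hypothetical node names used only for this example.
	pool := []Server{{"pcm-1", 1, 4}, {"vcm-1", 3, 7}, {"vcm-2", 2, 5}}
	i := pick(pool)
	pool[i].Active++
	fmt.Println("dispatch to", pool[i].Name)
}
```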
In order to amplify the importance of security service task feature attributes and make the task clustering effect better meet the needs of the computing system, the task feature attribute data is first normalized, that is:
Where the calculation proceeds in three steps: calculate the weight of each attribute value, calculate the information entropy, and calculate the entropy weights.
The smaller the degree of variation of a task feature attribute in the sample set, the larger its information entropy, the smaller its influence on the clustering effect, and the smaller its weight. Conversely, the smaller the information entropy, the greater the influence on the clustering effect and the greater the weight.
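The sketch below illustrates the entropy weight calculation described above under its usual definitions (normalize each attribute column, compute its information entropy, then weight each attribute in proportion to one minus its entropy); the function and variable names are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// entropyWeights takes a samples x attributes matrix of non-negative feature values
// (at least two samples) and returns one weight per attribute.
func entropyWeights(x [][]float64) []float64 {
	n, m := len(x), len(x[0])
	h := make([]float64, m)
	for j := 0; j < m; j++ {
		col, sum := make([]float64, n), 0.0
		for i := 0; i < n; i++ {
			col[i] = x[i][j]
			sum += col[i]
		}
		// information entropy of the normalized column
		for i := 0; i < n; i++ {
			if p := col[i] / sum; p > 0 {
				h[j] -= p * math.Log(p)
			}
		}
		h[j] /= math.Log(float64(n)) // scale to [0,1]
	}
	w, total := make([]float64, m), 0.0
	for j := 0; j < m; j++ {
		w[j] = 1 - h[j] // low entropy (high variation) -> high weight
		total += w[j]
	}
	for j := 0; j < m; j++ {
		w[j] /= total
	}
	return w
}

func main() {
	x := [][]float64{{1, 10}, {1.1, 2}, {0.9, 8}} // 3 tasks, 2 feature attributes
	fmt.Println(entropyWeights(x))
}
```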
Assuming that the feature attributes in the samples are not related to each other, define the weighted Mahalanobis distance
Where:
The feature weights of embedding point
In this section, based on the actual application of the information security service of electric power IoT, the type of electric power business served by the master station is used as the task decomposition principle to encapsulate the associated cryptographic tasks, so that the tasks in the whole task sequence are independent of each other, and the computing nodes are created based on the task computing attributes.
Analyze the ratio of tasks of each resource type in the same power business to the total number of tasks submitted for that business based on historical cryptographic task data, and create a VCM class that matches the resource attributes of the tasks. Let
Set the computational attributes of each VCM in the VCM class to be consistent. The comprehensive matching degree of the security service task with the VCM class consists of the matching degree of each computational resource, and the matching degree of the
Where:
The combined match between the
where:
The set of minimal cryptographic computing units
Where:
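Because the matching-degree formulas themselves were lost in extraction, the following sketch only illustrates the idea described above under an assumed ratio-based form: each computational resource contributes a per-resource matching degree, and the comprehensive matching degree of a task with a VCM class combines them with weights. None of the names or the exact formula are taken from the paper.

```go
package main

import "fmt"

// resourceMatch returns a value in (0, 1], equal to 1 when demand and capacity coincide.
func resourceMatch(demand, capacity float64) float64 {
	if demand <= 0 || capacity <= 0 {
		return 0
	}
	if demand < capacity {
		return demand / capacity
	}
	return capacity / demand
}

// comprehensiveMatch is a weighted combination of per-resource matching degrees.
func comprehensiveMatch(demands, capacities, weights []float64) float64 {
	total := 0.0
	for i := range demands {
		total += weights[i] * resourceMatch(demands[i], capacities[i])
	}
	return total
}

func main() {
	task := []float64{2, 4}     // e.g. CPU cores and memory (GB) required by a cryptographic task
	vcmClass := []float64{2, 8} // capacities configured for one VCM class
	w := []float64{0.5, 0.5}
	fmt.Printf("comprehensive matching degree: %.2f\n", comprehensiveMatch(task, vcmClass, w))
}
```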
After getting the best match between tasks and VCM classes, these tasks have similar computational resource properties but different execution times and deadlines. The initial priority of tasks is determined according to their urgency to guarantee that all tasks can be executed on schedule. As shown in Eq. (45), as the waiting time of task
Where:
Then, this paper adopts the quantum particle swarm optimization (QPSO) algorithm to complete the task scheduling and optimize the task completion time. Unlike the traditional particle swarm optimization algorithm, QPSO uses a wave function to describe the positional state of particles in quantum space and uses only a position parameter to determine a particle's convergence speed and position information. QPSO therefore has the advantages of efficient global search, a simpler algorithm structure, and a wider range of solvable problems. Real number encoding is used, and the position information of a particle represents the mapping relationship between Task
Where:
Where:
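As a hedged illustration of the QPSO position update described above, the sketch below follows the standard formulation (each particle is attracted to a point between its personal best and the global best, and its new position is sampled around the swarm's mean best position with a logarithmic spread). Decoding a particle's real-valued position into a concrete task-to-VCM mapping is problem specific and omitted, and all names are illustrative.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
)

// qpsoStep updates all particle positions in place.
// x: positions, pbest: personal bests, gbest: global best, alpha: contraction-expansion coefficient.
func qpsoStep(x, pbest [][]float64, gbest []float64, alpha float64) {
	n, d := len(x), len(gbest)
	// mean of the personal best positions (mbest)
	mbest := make([]float64, d)
	for _, pb := range pbest {
		for j := 0; j < d; j++ {
			mbest[j] += pb[j] / float64(n)
		}
	}
	for i := 0; i < n; i++ {
		for j := 0; j < d; j++ {
			phi := rand.Float64()
			p := phi*pbest[i][j] + (1-phi)*gbest[j] // local attractor
			u := 1 - rand.Float64()                 // in (0, 1]
			step := alpha * math.Abs(mbest[j]-x[i][j]) * math.Log(1/u)
			if rand.Float64() < 0.5 {
				x[i][j] = p + step
			} else {
				x[i][j] = p - step
			}
		}
	}
}

func main() {
	// Two particles, three tasks: each dimension encodes which node a task maps to.
	x := [][]float64{{0.2, 1.7, 2.4}, {1.1, 0.3, 2.9}}
	pbest := [][]float64{{0.2, 1.7, 2.4}, {1.1, 0.3, 2.9}}
	gbest := []float64{0.2, 1.7, 2.4}
	qpsoStep(x, pbest, gbest, 0.75)
	fmt.Println(x)
}
```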
In the experiments, the power system used is the IEEE 118-bus grid; the remote IEDs use a high-end single-board processor, and the corresponding control software includes gray-box temperature elements, which are devices of the power system.
The Java platform is used to implement the software part of the power system computing platform proposed in this paper. Java Servlets, JSP, and JavaBeans are used to build the web components that connect to the Apache web server and to realize the logic processing, data storage, and presentation layers of the MVC pattern. The hardware resources of the computing engine use a hierarchical structure, connected by 100 Mbps Fast Ethernet; each machine has a Pentium IV 2.8 GHz CPU and 1 GB of memory, and the computing engine is implemented with Java RMI and ProActive. The storage system uses JDBC to query local and remote power system data files through a stable, platform-independent interface. Each component is accessed remotely over a TCP connection, and IEDs and domain power system parameters are accessed via the HTTP protocol. Finally, Matlab is used as the mathematical and statistical engine; it is installed on top of the specific power system, and using Matlab as the statistical module of the computing engine allows easy integration with other commercial software platforms and a better display of the results of the online security analysis of the power system.
In order to test the task processing time data collection method and the resource data collection method proposed in this chapter, five microservices are implemented in the Go language: a fault diagnosis service, a grid management service, an information monitoring service, a measurement data query service, and a topology data query service. For convenience of analysis, they are named Services A, B, C, D, and E. Before conducting the test, Prometheus and Jaeger need to be deployed on the physical machine. The specific steps are as follows:
First, the Prometheus-related components are deployed on the physical machine, including the Prometheus server, which collects and stores all resource monitoring data; cAdvisor, which collects container resource monitoring data; Node-exporter, which collects physical machine resource monitoring data; and Grafana, which visualizes the monitoring data.
Then, the Jaeger components are deployed on the physical machine, including jaeger-agent, which collects local call chain data; jaeger-collector, which collects and stores all call chain data; and jaeger-query, which queries and displays call chain data. Since jaeger-collector needs to store call chain data in a database, and Cassandra is chosen in this paper, Cassandra must be deployed on the physical machine before jaeger-collector is started.
To facilitate use and management, all of the above components are deployed as Docker containers, so no runtime environment needs to be configured on the physical machine; a single Docker command is enough to start each component.
After completing the above deployment work, Services A, B, C, D, and E are compiled into Docker images and started on the physical machine with Docker commands. After starting, the services run as containers on the server. A request is made to Service A through the client program User, which then receives a response indicating that the request has been processed. While these services are running, the resource usage of their corresponding containers can be viewed through Grafana's visualization interface. Since the resource usage graphs for Services B, C, D, and E are similar to those for Service A, they are not shown here to avoid redundancy.
Figure 2 shows the resource usage of the container corresponding to Service A: panel (a) shows CPU usage, (b) memory usage, (c) disk I/O, and (d) network bandwidth. In the five minutes from 10:20 to 10:25, the CPU, memory, disk I/O, and network bandwidth of the container corresponding to Service A are used more heavily between 10:23 and 10:24 than in the other periods, especially CPU and disk I/O: CPU usage reaches 5% and the maximum disk I/O reaches 4 MiB/s in this period. The reason is that the client program User initiated a request at 10:23:18, and Service A's processing of this request (from 10:23 to 10:24) consumed a certain amount of resources; after the request was processed, the resources used to handle it were released, and the resource usage of Service A decreased. From Fig. 2(b) and (d), it can be seen that the memory and network bandwidth usage of Service A is still not zero when it is not processing any request, because the service itself occupies a certain amount of memory, and the service must periodically communicate with the registry center so that the registry center can keep track of the online status and health of all services.

Figure 2. Resource usage of the container corresponding to Service A
Through the visualization interface, the resource usage of the server can also be viewed in real time, as shown in Figure 3, where (a) shows CPU usage, (b) memory usage, (c) disk I/O, and (d) network bandwidth. The CPU, memory, and network bandwidth of the physical machine remain at low values during the five minutes from 10:20 to 10:25, with CPU utilization below 1.9% and memory usage below 280 GiB. This is because few processes are running on the physical machine, whose total CPU, memory, and network bandwidth resources are very large; by comparison, the amount of CPU, memory, and network bandwidth that Services A-E use to process requests is almost negligible. Meanwhile, the maximum disk I/O usage of the physical machine is close to 1.9 MB/s, mainly because the container corresponding to Service A uses more disk I/O resources to process tasks.

Figure 3. Physical machine resource usage
For the three selected data transmission methods, the amount of data that failed to be transmitted is compared, as shown in Fig. 4; A, B, and C in the figure denote the shared transmission of grid information model data, the multi-channel power station security data collection and transmission, and the task scheduling method of the heterogeneous computing platform, respectively. Figure 4 analyzes the results of the secure transmission of power dispatching big data. Compared with the shared transmission of grid information model data and the multi-channel power station security data collection and transmission, the model designed in this paper ultimately produces a relatively small amount of failed data transmission: after three cycles of testing, the amount of failed data transmission is 3, below the controllable standard of 7. This shows that the power dispatching big data secure transmission method designed here is more stable, secure, and efficient; the actual transmission protection effect is better and more targeted, and the practical application value is significantly improved.

Figure 4. Comparison of secure transmission results of power dispatching big data
For the setting of cryptographic task weights, the first step in this study is to take cryptographic server A as the baseline criterion. Its processing capability under different cryptographic task requests was tested, and the results are shown in Table 1: the execution times of the different cryptographic task types range from 120 ms to 580 ms, with short response times, which shows that the algorithm in this paper is well suited to task scheduling for massive information security services in the power Internet of Things (IoT).
Table 1. Cryptographic task execution time

| Cryptographic task type | Execution time |
|---|---|
| Symmetric encryption request | 240 ms |
| Symmetric decryption request | 210 ms |
| Asymmetric encryption request | 380 ms |
| Asymmetric decryption request | 360 ms |
| Signature request | 420 ms |
| Verification request | 580 ms |
| Hash request | 120 ms |
Figure 5 shows the relationship between transmission power and interception probability before and after VCM control, comparing the interception probability with and without the VCM threshold transmission control scheme. As can be seen in Fig. 5, the interception probability increases as the transmission power

Figure 5. Relationship between transmission power and interception probability before and after VCM control
For the optimization objective of minimum average task completion time, the relationship between the transmission power and the arrival rate is analyzed; Fig. 6 shows the relationship between the optimization objective and the transmission power and arrival rate, on the premise that the system also needs a good security factor. It can be observed that, in its current state, the system has an optimal power of 10.5 W, an optimal arrival rate of 0.19, and an optimization objective J of 0.0049, where the optimal power and arrival rate lie approximately at the midpoint of the given interval. Therefore, on the premise of ensuring good timeliness, a moderate transmission power and a moderate arrival rate are required to achieve the comprehensively optimal performance of the system.

Figure 6. Relationship between the optimization objective and transmission power and arrival rate
For the experiments on the real-time load of computing nodes, this study first set a specific cryptographic task request duration based on the heterogeneous information of the power IoT, so as to simulate the scenario in which users continuously initiate cryptographic tasks. In this experiment, 60 seconds was chosen as the duration of cyclic cryptographic task sending. The experimental results are shown in Fig. 7: panels (a)-(c) show the task scheduling method of the heterogeneous computing platform, the shared transmission of grid information model data, and the multi-channel power station security data collection and transmission, respectively. The results are analyzed below.

Figure 7. Real-time load of each algorithm
Comparing the real-time load of this paper's algorithm with that of the shared transmission of grid information model data and the multi-channel power station security data collection and transmission, the real-time load of this paper's algorithm fluctuates within 58% to 78% and changes relatively smoothly, while the real-time load of the shared transmission of grid information model data ranges from 51% to 89%, showing that its load is unstable over certain periods. The real-time load of the multi-channel power station security data collection and transmission lies between 54% and 81%; although its load is also relatively stable, its weighted least connections algorithm may require more load redistribution operations when nodes are added or removed, resulting in lower efficiency. In summary, the algorithm in this paper is better suited to this scheduling scenario.
This paper establishes a transmission model for electric power information security and explores the construction principles of an electric power information security system. Time model constraints are proposed; for heterogeneous computing platforms, a task scheduling data model is constructed with maximizing the task security coefficient and minimizing the total system completion time as the optimization objectives, and the load balancing module in the scheduling model is designed. Using the quantum particle swarm optimization algorithm, tasks are mapped, and minimizing the completion time of the cryptographic tasks exported to VCMs is taken as the fitness function.
A simulation environment is built to analyze the performance of the scheduling algorithm. Analyzing the resource usage of the container corresponding to Service A, the system's CPU and disk I/O usage between 10:23 and 10:24 is higher than in the other periods, with CPU usage reaching 5% and maximum disk I/O of 4 MiB/s. Comparing the task security guarantees of the three data transmission methods, the model designed in this paper produces a relatively small amount of failed data transmission: after three cycles of testing, the amount of failed data transmission is 3, below the controllable standard of 7, indicating that the power dispatching big data secure transmission method designed here is more stable and secure. The execution times of the different cryptographic task types range from 120 ms to 580 ms, with short response times, meeting the requirements of task scheduling for massive information security services in the power IoT.
