
AI-driven resource scheduling optimisation model and its system architecture design for digital library management

Mar 17, 2025


Introduction

Given the great importance China attaches to the concept of "Internet Plus", building digital libraries around the application of digital technology has become one of the directions for the innovative development of many libraries. The application of digital technology has considerable practical value. It not only improves the management of collection resources and reduces unnecessary management costs [1-4], but also ensures that readers have a good experience when using library services, and, at the overall level, helps the library perform its social and public cultural service duties in an orderly manner. Libraries should therefore form a correct understanding of the application of digital technology in their management and provide strong support for it [5-8], ensuring that all types of emerging digital technology can be applied effectively to improve the efficiency and effectiveness of library management [9-11].

The digital library grid contains a large number and wide variety of information resources and provides users with many kinds of services. Coordinating and configuring these resources and services requires a regulation and control system, just as a cable television network needs a large number of signal base stations and other facilities to regulate it [12-15]. In the digital library grid, this system, sitting between information resources and users, is called the digital library grid information resource scheduling system. It releases the information resources required by users to the "edge" of the grid closest to them [16-19], so that users can obtain the resources they need nearby, accelerating users' access to the digital library grid and their use of its information resources; this is content distribution.

This paper combines the ARIMA and LSTM models with the characteristics of the resource time series arising in library digitization management, designs a decomposition-based ARIMA-LSTM resource prediction model, and realizes resource scheduling for the library digitization system based on load balancing. The prediction model serves as the resource prediction module of the Kubernetes platform; target Nodes are selected by judging the high-load type of each Node and by regularly maintaining a queue of low-load Nodes for each resource indicator type, which ultimately improves resource utilization and reliability. Resource scheduling experiments with the model verify that it improves load balancing.

Method
Library resource scheduling optimization problem

With the continuous development of cloud computing, it has become the main way for libraries to build, deploy, and run digital platform applications. Cloud computing provides rich computing, storage, and service resources, enabling users to deploy applications rapidly to meet business needs. However, the traditional monolithic architecture cannot fully exploit on-demand resource allocation and elastic scaling, and it carries the risk of a single point of failure; these limitations become increasingly apparent in the cloud. To overcome these challenges, distributed systems have emerged. A distributed system spreads software, hardware, and network resources across multiple computers to achieve collaborative work and resource sharing. Unlike monolithic architectures, distributed systems split applications into independent services running on different nodes, which improves resource utilization through network communication and collaboration.

With the development of information technology, application deployment has gone through several stages. In the early days, applications ran directly on physical hardware, which wasted resources, complicated management, and limited application portability and scalability. Later, the rise of virtualization technology made virtual machines a popular deployment method. VMs provide isolated virtual environments for applications, improve resource utilization and flexibility, allow multiple VMs to share physical hardware resources, and enhance portability and scalability. However, virtual machines simulate the entire operating system and hardware environment, so they remain resource-intensive and achieve poor resource utilization, especially for resource-intensive applications such as digital libraries.

Container technology is lighter and more efficient than virtualization. Containers encapsulate applications and their dependencies and can be deployed across environments; their advantages are light weight, rapid deployment, and isolation. Containers share the host operating system and kernel, consume fewer resources, and start up faster. Docker is a representative container technology, providing tools for developing, deploying, and running containers. It uses Linux kernel features to provide process, network, file, and user isolation, enhancing application security and stability. However, Docker carries security risks, and network and storage isolation requirements still need to be considered; in addition, Docker itself lacks auto-scaling and orchestration mechanisms. To solve the container management problem, container orchestration tools such as Kubernetes are needed: Kubernetes manages containers across hosts and clusters and provides auto-scaling, service discovery, load balancing, and network and storage isolation mechanisms that meet the security and isolation requirements of cloud environments. In cloud computing, Kubernetes suits high-performance servers, but its own resource consumption makes it difficult to apply in constrained environments.

Based on the above problems, this paper designs a resource scheduling algorithm based on an AI neural network, aiming to improve the multi-dimensional resource utilization and multi-task deployment efficiency in library digital resource management clusters.

Decomposition-based ARIMA-LSTM resource prediction
Time series modeling

ARIMA model

The Autoregressive Integrated Moving Average model (ARIMA) treats a time-varying data series as a stochastic data series, constructs a model that describes it approximately, and then uses the past and present values of the series to predict its future values [20].

Autoregressive modeling

An autoregressive model (AR) expresses the value of the time series at the current time as a linear combination of its values at previous times plus a random error. Its model expression is: $x_t = \phi_0 + \phi_1 x_{t-1} + \phi_2 x_{t-2} + \cdots + \phi_p x_{t-p} + \xi_t$

Where $p$ denotes the highest order of the model, $\phi_i$ denotes the autoregressive coefficients, and $\xi_t$ denotes the error term, i.e., the random fluctuation component of the time series.

Moving average models

The moving average model (MA) is not a linear combination of historical values but a linear combination of the historical white noise terms that affect the value at the current time. Its model expression is: $x_t = \xi_t - \theta_1 \xi_{t-1} - \theta_2 \xi_{t-2} - \cdots - \theta_q \xi_{t-q} + \nu$

Where $q$ denotes the highest order of the model, $\theta_i$ denotes the moving average coefficients, and $\nu$ denotes a constant term.

Autoregressive moving average models

The autoregressive moving average model (ARMA) is a combination of AR and MA. Its model expression is: $x_t = \phi_0 + \phi_1 x_{t-1} + \phi_2 x_{t-2} + \cdots + \phi_p x_{t-p} + \xi_t - \theta_1 \xi_{t-1} - \theta_2 \xi_{t-2} - \cdots - \theta_q \xi_{t-q}$

When $p = 0$ the model reduces to an MA model, and when $q = 0$ it reduces to an AR model; thus AR and MA are special cases of the ARMA model.

Autoregressive integrated moving average models

In practice, time series are usually non-stationary. Time series analysis therefore introduces a differencing operation that transforms a non-stationary series into a stationary one. The model expression is: $$\begin{cases} \Phi(B)\nabla^d x_t = \Theta(B)\xi_t \\ \nabla^d = (1-B)^d \\ \Phi(B) = 1 - \phi_1 B - \cdots - \phi_p B^p \\ \Theta(B) = 1 - \theta_1 B - \cdots - \theta_q B^q \end{cases}$$

Where $d$ denotes the order of differencing and $B$ denotes the backshift operator.

ARIMA model construction is mainly divided into four stages: data preprocessing, model identification, model validation and estimation, and model prediction, and its construction flow chart is shown in Figure 1.

Figure 1.

ARIMA model construction flowchart

The basic characteristics of a stationary time series are twofold:

First, the mean of the time series is constant.

Second, the autocovariance function depends only on the lag $k$ and not on the absolute time, with the following expressions: $$\gamma(t,s) = \gamma(k, k+s-t), \quad \forall t, s, k \in T$$ $$\gamma(s-t) = \gamma(t,s), \quad \forall t, s \in T$$ $$\gamma(0) = \gamma(t,t) = E(x_t - \mu)^2 = \sigma^2, \quad \forall t \in T$$

From which it can be seen that the variance of the stationary time series is $\sigma^2$.

In the model identification stage, the main task is to select an appropriate order for the model to be fitted, usually by the Akaike information criterion (AIC): the order that minimizes the AIC is the most appropriate model order. The AIC can be expressed as: $\mathrm{AIC} = n \ln L + 2(p + q + 1)$
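To illustrate this identification stage, the following minimal sketch (assuming Python with the statsmodels package; the series name, search ranges, and fixed differencing order are illustrative choices, not from the paper) grid-searches candidate orders and keeps the one with the smallest AIC:

```python
# Minimal AIC-based order selection sketch, assuming a univariate
# pandas Series `series` of resource utilisation observations.
import itertools

import pandas as pd
from statsmodels.tsa.arima.model import ARIMA


def select_arima_order(series: pd.Series, max_p: int = 3, d: int = 1, max_q: int = 3):
    """Fit ARIMA(p, d, q) candidates and return the order minimising AIC."""
    best_aic, best_order = float("inf"), None
    for p, q in itertools.product(range(max_p + 1), range(max_q + 1)):
        try:
            res = ARIMA(series, order=(p, d, q)).fit()
        except Exception:
            continue  # some candidate orders fail to converge; skip them
        if res.aic < best_aic:
            best_aic, best_order = res.aic, (p, d, q)
    return best_order, best_aic
```

A brute-force grid is adequate here because the search space $(p+1)(q+1)$ is small; the selected order is then refitted once on the full training window before prediction.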

Gray model

GM(1, 1) is the basic prediction model of gray system theory. To construct the GM(1, 1) prediction model, first assume that the original time series is $X^{(0)} = \{x^{(0)}(1), x^{(0)}(2), \ldots, x^{(0)}(r)\}$, where $r$ is the length of the original time series. The series is then processed by first-order accumulation to obtain the generated series, computed as follows: $$X^{(1)} = \{x^{(1)}(1), x^{(1)}(2), \ldots, x^{(1)}(r)\}$$ $$x^{(1)}(k) = \sum_{i=1}^{k} x^{(0)}(i), \quad k = 1, 2, \ldots, r$$

Combining the original time series with Eq. (9) yields the first-order differential equation, which is the whitening equation of the GM(1, 1) prediction model, with the following computational expression: $$\frac{dx^{(1)}(t)}{dt} + ax^{(1)}(t) = c$$

Where $a$ and $c$ are the model parameters of the GM(1, 1) prediction model, estimated by the least squares method with the following computational expressions: $$\hat{b} = [a, c]^T = (B^T B)^{-1} B^T Y$$ $$Y = \begin{bmatrix} x^{(0)}(2) \\ x^{(0)}(3) \\ \vdots \\ x^{(0)}(r) \end{bmatrix}, \quad B = \begin{bmatrix} -z^{(1)}(2) & 1 \\ -z^{(1)}(3) & 1 \\ \vdots & \vdots \\ -z^{(1)}(r) & 1 \end{bmatrix}$$

Where $z^{(1)}(k)$ denotes the mean series of the first-order accumulated series in the GM(1, 1) forecasting model, with the following computational expression: $$z^{(1)}(k) = \frac{1}{2}x^{(1)}(k-1) + \frac{1}{2}x^{(1)}(k), \quad k = 2, 3, \ldots, r$$

Combining the model parameters obtained from Eq. (11), the whitening equation can be solved to obtain the time response series with the following computational expression: $$\hat{x}^{(1)}(k) = \left(x^{(0)}(1) - \frac{c}{a}\right)e^{-a(k-1)} + \frac{c}{a}$$

The time response series is processed by first-order cumulative reduction to obtain the restored value of the GM(1, 1) prediction model, with the following computational expression: $$\hat{x}^{(0)}(k) = \left(x^{(0)}(1) - \frac{c}{a}\right)\left(1 - e^{a}\right)e^{-a(k-1)}$$

Where the first-order cumulative reduction of the generated series in the GM(1, 1) prediction model is defined as follows: $$X^{(0)} = \{x^{(0)}(1), x^{(0)}(2), \ldots, x^{(0)}(r)\}$$ $$x^{(0)}(k) = x^{(1)}(k) - x^{(1)}(k-1), \quad k = 1, 2, \ldots, r$$

Finally, the predicted values of the GM(1, 1) prediction model can be obtained according to Equation (14).
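As a worked illustration of the procedure above, the following NumPy sketch (function and variable names are illustrative, not from the paper) accumulates the series, estimates $a$ and $c$ by least squares, solves the time response, and restores the predictions:

```python
# GM(1, 1) sketch: accumulate, fit [a, c] by least squares,
# evaluate the time response, and reduce back to the original scale.
import numpy as np


def gm11_predict(x0: np.ndarray, steps: int = 1) -> np.ndarray:
    """Predict `steps` future values of the original series x0."""
    r = len(x0)
    x1 = np.cumsum(x0)                       # X^(1): first-order accumulation
    z1 = 0.5 * (x1[:-1] + x1[1:])            # z^(1): mean (background) series
    B = np.column_stack([-z1, np.ones(r - 1)])
    Y = x0[1:]
    a, c = np.linalg.lstsq(B, Y, rcond=None)[0]   # least-squares [a, c]
    k = np.arange(1, r + steps + 1)
    x1_hat = (x0[0] - c / a) * np.exp(-a * (k - 1)) + c / a   # time response
    x0_hat = np.concatenate([[x1_hat[0]], np.diff(x1_hat)])   # cumulative reduction
    return x0_hat[r:]
```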

Neural network model

RNN model

A Recurrent Neural Network (RNN) is a class of directed recursive artificial neural networks in which each neuron is connected to itself; it is a time-series-correlated neural network. The RNN model's computational expressions are given below: $$h_t = f(ux_t + wh_{t-1} + b)$$ $$y_t = f(vh_t + b)$$

Where $u$, $w$, $v$ denote the weight coefficients of the input, hidden, and output layers of the RNN; $x_t$, $h_t$, $y_t$ denote the input layer sequence, the state of the hidden layer, and the output layer sequence, respectively; and $f$ denotes the activation function used by the network.

RNN forward propagation

The data vector $x_t = [x_1, x_2, \ldots, x_t]$, corresponding to the feature sequence up to time $t$, is used as the input of the RNN. The hidden layer extracts the features of the sequence at time $t$ while preserving the sequence information of the previous moment, with the following computational expression: $$h_t = f(ux_t + wh_{t-1} + b_h)$$

Finally, the output sequence at time $t$ is obtained through the output layer of the RNN with the following computational expression: $$y_t = f(vh_t + b_t)$$
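As a concrete reading of the two expressions above, here is a minimal NumPy sketch of the forward pass (tanh is assumed as the activation $f$; weight shapes and names are illustrative):

```python
# Simple RNN forward pass over a sequence of input vectors xs.
import numpy as np


def rnn_forward(xs, u, w, v, b_h, b_y):
    """u: (hidden, in), w: (hidden, hidden), v: (out, hidden)."""
    h = np.zeros(w.shape[0])                  # initial hidden state h_0
    ys = []
    for x_t in xs:
        h = np.tanh(u @ x_t + w @ h + b_h)    # hidden state update
        ys.append(np.tanh(v @ h + b_y))       # output at time t
    return np.array(ys), h
```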

RNN backward propagation

The backward propagation of an RNN superimposes the errors produced by gradient descent in the input and hidden layers onto the final output sequence, which in turn drives the re-updating of the weight coefficients.

Long Short-Term Memory (LSTM) is a recurrent neural network specialized in processing long data sequences; it is suitable for medium-to-long time sequences and is capable of learning temporal dependencies across time steps [21].

An LSTM is divided into three gates: the forget gate, the input gate, and the output gate, and it introduces a state (cell) unit to coordinate the network.

The forget gate decides how much of the previous state the LSTM module retains; its inputs are the time series of the current moment and the time series data and state of the previous moment. Its computational expression is as follows: $$f_t = \sigma(u_f x_t + w_f h_{t-1} + b_f)$$

After passing through the forget gate, which removes invalid information, the input data enters the input gate. The input gate selects and generates candidate values for the input time series and obtains the recombined input data; it determines the degree to which the time series input at time $t$ is retained, selectively updating the cell state and storing new information. The computational expression is as follows: $$i_t = \sigma(u_i x_t + w_i h_{t-1} + b_i)$$

Where $u_i$, $w_i$, $b_i$ denote the input gate parameters. After a linear transformation of the input at time $t$ and the output of the hidden layer, the candidate input state at time $t$ is obtained using the Tanh function, which integrates the state information of the input and output at time $t$ into a new state quantity. The expression is as follows: $$\tilde{c}_t = \mathrm{Tanh}(u_c x_t + w_c h_{t-1} + b_c)$$

The state unit persists inside the LSTM and provides the state update for the current time $t$: the data state information of the previous moment is retained and integrated with that of the current moment to produce the updated state. The calculation expression is as follows: $$c_t = c_{t-1} * f_t + \tilde{c}_t * i_t$$

After selecting and updating the data at the input gate, the LSTM obtains the new time series data, and finally the output at time $t$ is produced through the output gate acting on the current cell state. The computational expressions are as follows: $$o_t = \sigma(u_o x_t + w_o h_{t-1} + b_o)$$ $$h_t = o_t * \mathrm{Tanh}(c_t)$$

Where $u_o$, $w_o$, $b_o$ denote the output gate parameters; the output data at time $t$ is obtained by the joint action of the output gate $o_t$ and the state unit $c_t$.
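The five gate equations above can be read directly as code. Below is a didactic NumPy sketch of a single LSTM time step (parameter names mirror the equations; production frameworks fuse these operations into single matrix multiplies):

```python
# One LSTM time step following the gate equations above.
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, p):
    """p maps names 'u_f', 'w_f', 'b_f', ... to the gate parameters."""
    f_t = sigmoid(p["u_f"] @ x_t + p["w_f"] @ h_prev + p["b_f"])      # forget gate
    i_t = sigmoid(p["u_i"] @ x_t + p["w_i"] @ h_prev + p["b_i"])      # input gate
    c_tilde = np.tanh(p["u_c"] @ x_t + p["w_c"] @ h_prev + p["b_c"])  # candidate state
    c_t = c_prev * f_t + c_tilde * i_t                                # cell state update
    o_t = sigmoid(p["u_o"] @ x_t + p["w_o"] @ h_prev + p["b_o"])      # output gate
    h_t = o_t * np.tanh(c_t)                                          # hidden output
    return h_t, c_t
```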

Decomposition-based ARIMA-LSTM resource prediction

Empirical Mode Decomposition

Empirical Mode Decomposition (EMD) is a method for processing signals with time-varying instantaneous frequency. The EMD workflow has the following four steps:

For the original time series, find the local maxima and minima at each point and calculate the mean of the upper and lower envelopes, with the expression: $$M_i(t) = \frac{S_i(t) + X_i(t)}{2}$$

Where $S_i(t)$ and $X_i(t)$ denote the upper and lower envelopes fitted through all local maxima and minima, respectively, and $i$ denotes the number of iterations.

Find the time series that satisfies the IMF (intrinsic mode function) condition.

Subtracting the IMF obtained above from the original time series gives the residual series.

Repeating this process, the original time series is decomposed into multiple IMFs that satisfy the condition plus the residual component, as sketched below.
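In practice the sifting procedure is rarely hand-coded. The sketch below assumes the third-party PyEMD package (installable as `EMD-signal`); the input file name is an illustrative placeholder:

```python
# Decompose a resource-utilisation series into IMFs plus a residue.
import numpy as np
from PyEMD import EMD

signal = np.loadtxt("cpu_utilisation.csv")    # illustrative input series
emd = EMD()
emd.emd(signal)                               # run the sifting procedure
imfs, residue = emd.get_imfs_and_residue()    # IMF_1 ... IMF_n and the residual
```

Each IMF (and the residue) can then be forecast separately, which is what motivates the decomposition-based model in the next subsection.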

Decomposition-based resource forecasting models

The decomposition-based model combines the features of the ARIMA and LSTM prediction models and can model complex time series with higher accuracy than either model alone. The specific framework of the model is shown in Figure 2.

Figure 2.

Frame diagram of the decomposition-based ARIMA-LSTM resource prediction model

The specific steps of the decomposition-based ARIMA-LSTM resource prediction model are as follows:

The input to the resource prediction model is a continuous time series, which is represented by the following equation: $$y_t = L_t + E_t$$

Where $L_t$ and $E_t$ are the linear and nonlinear components of the time series, respectively.

The ARIMA model is first fitted to the series to capture the linear component and produce the residual (error) sequence. The residual sequence is then fitted by training the LSTM model to obtain the trend sequence prediction value $E'_t$. The final error value prediction sequence is obtained as: $$E'_t = f(e_{t-1}, e_{t-2}, \ldots, e_{t-n}) + \xi_t$$

Probability estimation is carried out on the error value prediction sequence, and values in the time series that are too large or too small are adjusted, ensuring the accuracy of the prediction model. Finally, the predictions are superimposed to obtain the predicted resource utilization, with the result sequence expressed as: $$y'_t = L'_t + L''_t + E'_t$$
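Putting the steps together, the following simplified sketch (assuming statsmodels and Keras; the ARIMA order, window size, and network shape are illustrative choices, and the two linear terms of the equation above are collapsed into one) fits ARIMA to the series and an LSTM to its residuals, then superimposes the two predictions:

```python
# Simplified hybrid ARIMA-LSTM one-step forecast sketch.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from tensorflow import keras


def hybrid_forecast(series: np.ndarray, window: int = 10) -> float:
    # 1) ARIMA captures the linear component L_t.
    arima = ARIMA(series, order=(2, 1, 2)).fit()
    linear_pred = arima.forecast(steps=1)[0]
    residuals = series - arima.predict(start=0, end=len(series) - 1)

    # 2) An LSTM is trained on sliding windows of the residuals (E_t).
    X = np.array([residuals[i:i + window] for i in range(len(residuals) - window)])
    y = residuals[window:]
    model = keras.Sequential([
        keras.Input(shape=(window, 1)),
        keras.layers.LSTM(32),
        keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")
    model.fit(X[..., None], y, epochs=20, verbose=0)
    residual_pred = model.predict(residuals[-window:][None, :, None], verbose=0)[0, 0]

    # 3) Superimpose the linear and nonlinear predictions.
    return linear_pred + residual_pred
```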

Load balancing based resource scheduling

The resource requirements of services are learned by the Kubernetes system beforehand. However, the default scheduling algorithm in the current version of Kubernetes is static and performs poorly in a changing production environment. This paper therefore studies and analyzes the Kubernetes resource scheduling strategy, improves and optimizes it, and proposes a load-balancing-based resource scheduling strategy for Pods that achieves static scheduling in a stable environment and dynamic scheduling in a changing one.

The specific dynamic resource scheduling model is shown in Figure 3. It is divided into a timer module, a judgment module, a prediction module, a selection module, and a migration module. The timer module periodically collects Node load data, periodically updates the dynamic adaptive thresholds, and periodically maintains the three low-load Node queues. The judgment module determines whether a Node is a high-load Node and identifies its load type. The prediction module predicts the short-term future load of each Node, avoiding unnecessary Pod migrations triggered by transient load peaks. The selection module selects the Pod to be migrated from a high-load Node and selects the target Node. The migration module migrates the selected Pod to the target Node.

Figure 3.

Dynamic resource scheduling process

Dynamic Resource Scheduling Trigger Timing

When the utilization rate of a resource metric of a Node exceeds the threshold, the Node is already in a high-load state, and dynamic resource scheduling should be triggered to migrate some of the Pods running on this high-load Node to other low-load Nodes. If the threshold is set too high, some Nodes already in a high-load state will not exceed it and dynamic resource scheduling will never be triggered; if it is set too low, Pods on the Nodes will be migrated excessively, degrading the user experience and consuming many unnecessary system resources. Whether the threshold is set reasonably is therefore particularly important.

There are two general ways to set the threshold: the static threshold approach and the dynamic adaptive threshold approach. The static threshold method is simpler to implement, but its disadvantage is obvious: the threshold never changes once set, so if the whole cluster is under high load, a Pod to be migrated may be unable to find a target Node. In this paper, the dynamic adaptive threshold is updated periodically, and its size is adjusted according to the overall load of the cluster and the failure rate of Pod migration. When the overall cluster load is high, the threshold is increased; likewise, when the Pod migration failure rate is high, too many Pod migrations are inadvisable, so the threshold is also increased, similar to a negative feedback mechanism in biology.

Taking CPU utilization as an example (the thresholds for the other resource metrics are calculated in the same way), let $HC_{th}$ be the threshold triggered by CPU-type high load, $C_a$ the average cluster CPU utilization, and $F_C$ the failure rate of Pod migrations triggered by CPU-type high load. The calculation of the new threshold $newHC_{th}$ is shown in Eq. (29) and Eq. (30): $$HC_{th} = \begin{cases} HC_{th} + (1 - HC_{th}) \cdot (C_a - 50\%) & C_a > 50\% \\ HC_{th} & \text{otherwise} \end{cases}$$ $$newHC_{th} = HC_{th} + (1 - HC_{th}) \cdot F_C$$
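Eqs. (29) and (30) translate directly into a small update function; the sketch below assumes utilizations and failure rates expressed as fractions in [0, 1]:

```python
# Dynamic adaptive threshold update for the CPU trigger threshold.
def update_cpu_threshold(hc_th: float, cluster_avg_cpu: float,
                         migration_fail_rate: float) -> float:
    """Raise HC_th when the cluster is busy or Pod migrations keep failing."""
    if cluster_avg_cpu > 0.5:                                   # Eq. (29)
        hc_th = hc_th + (1 - hc_th) * (cluster_avg_cpu - 0.5)
    return hc_th + (1 - hc_th) * migration_fail_rate            # Eq. (30)
```

Because both corrections scale with $(1 - HC_{th})$, the threshold approaches but never reaches 1, so the trigger is dampened rather than disabled under sustained load.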

Load Prediction for Node Nodes

The smoothed values of the cubic exponential smoothing method are given in Eq. (31) to Eq. (33): $$S_t^{(1)} = \alpha y_t + (1-\alpha)S_{t-1}^{(1)}$$ $$S_t^{(2)} = \alpha S_t^{(1)} + (1-\alpha)S_{t-1}^{(2)}$$ $$S_t^{(3)} = \alpha S_t^{(2)} + (1-\alpha)S_{t-1}^{(3)}$$

The prediction formula of the cubic exponential smoothing method is shown in Eq. (34):

$$\hat{y}_{t+T} = A_t + B_t T + C_t T^2$$

Where $A_t$, $B_t$, $C_t$ are the prediction parameters for period $t$, calculated as shown in Eq. (35):

$$\begin{cases} A_t = 3S_t^{(1)} - 3S_t^{(2)} + S_t^{(3)} \\ B_t = \dfrac{\alpha}{2(1-\alpha)^2}\left[(6-5\alpha)S_t^{(1)} - 2(5-4\alpha)S_t^{(2)} + (4-3\alpha)S_t^{(3)}\right] \\ C_t = \dfrac{\alpha^2}{2(1-\alpha)^2}\left(S_t^{(1)} - 2S_t^{(2)} + S_t^{(3)}\right) \end{cases}$$

The smoothing coefficient $\alpha$ influences the accuracy of the predicted load data: the larger the value of $\alpha$, the greater the impact of recent values of the time series on the prediction. The choice of $\alpha$ is closely tied to the application scenario. If the time series is relatively smooth with small fluctuations, $\alpha$ is generally chosen between 0.05 and 0.3; if the fluctuations are large, $\alpha$ is generally chosen between 0.6 and 0.9. In essence, $\alpha$ is an empirical value, and several candidates are usually tested experimentally to see which yields the smallest mean squared error.
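For reference, the following sketch implements Eqs. (31)-(35) directly (initializing all three smoothed values to the first observation, a common but not unique choice):

```python
# Cubic (triple) exponential smoothing forecast per Eqs. (31)-(35).
import numpy as np


def triple_exp_smoothing_forecast(y: np.ndarray, alpha: float, T: int = 1) -> float:
    s1 = s2 = s3 = y[0]                        # simple initialisation choice
    for v in y[1:]:
        s1 = alpha * v + (1 - alpha) * s1      # Eq. (31)
        s2 = alpha * s1 + (1 - alpha) * s2     # Eq. (32)
        s3 = alpha * s2 + (1 - alpha) * s3     # Eq. (33)
    a_t = 3 * s1 - 3 * s2 + s3                 # Eq. (35)
    b_t = (alpha / (2 * (1 - alpha) ** 2)) * (
        (6 - 5 * alpha) * s1 - 2 * (5 - 4 * alpha) * s2 + (4 - 3 * alpha) * s3
    )
    c_t = (alpha ** 2 / (2 * (1 - alpha) ** 2)) * (s1 - 2 * s2 + s3)
    return a_t + b_t * T + c_t * T ** 2        # Eq. (34)
```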

High Load Types of Node Nodes

In order to better place the Pods to be migrated, this section categorizes high-load Nodes into CPU-type, memory-type, and network-bandwidth-type high-load Nodes. For the same Node, multiple resource metrics may show a high-load status, in which case rules are needed to decide which high-load type the Node belongs to [22]. Taking the case where both the CPU and memory resource indicators display a high-load state, the specific rules are as follows:

Calculate the load degree of each resource indicator from its utilization rate and the corresponding threshold. Taking the CPU as an example, the calculation formula is shown below: $$S_{cpu} = \frac{U_{cpu} - HC_{th}}{HC_{th}}$$

Where $S_{cpu}$ is the load degree of the CPU resource indicator, $U_{cpu}$ is the CPU utilization of the Node, and $HC_{th}$ is the trigger threshold corresponding to the CPU in the cluster.

Compare the load degrees of the two resource metrics, CPU and memory, and select the metric with the greater load degree as the high-load type of this Node. For example, when the CPU load degree calculated by the above formula is greater than that of memory, the Node is a CPU-type high-load Node.
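The two-step rule generalizes naturally to all three metrics; the sketch below uses an illustrative dict layout (metric names and thresholds are placeholders):

```python
# Classify a high-load Node: the metric whose relative overshoot of its
# threshold is largest determines the Node's high-load type.
def classify_high_load(utilisation: dict, thresholds: dict) -> str:
    """Both dicts map metric names ('cpu', 'mem', 'net') to fractions."""
    degrees = {
        metric: (utilisation[metric] - thresholds[metric]) / thresholds[metric]
        for metric in utilisation
    }
    return max(degrees, key=degrees.get)    # e.g. 'cpu' for a CPU-type node
```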

Selection of Pods to be Migrated

For the selection of Pods to be migrated, the load of the Node should be reduced as much and as evenly as possible. Suppose, for example, that both the CPU and memory utilization of a Node exceed their thresholds and the Node is a CPU-type high-load Node. If the Pods are sorted from largest to smallest by CPU utilization and migrated in sequence until the Node's CPU utilization falls below the threshold, the memory utilization may still exceed its threshold; and if migration is then repeated according to the memory indicator, the efficiency of Pod migration is greatly reduced. Pods are therefore selected according to a combined contribution degree.

The specific rule for calculating the contribution degree $W$ of a Pod is shown in Eq. (37): $$W = a \cdot PU_{cpu} + b \cdot PU_{mem} + c \cdot PU_{net}$$

Where $PU_{cpu}$ is the CPU utilization occupied by the Pod, $PU_{mem}$ the memory utilization, and $PU_{net}$ the network bandwidth utilization; $a$, $b$, $c$ are the weights for CPU, memory, and network bandwidth, respectively, calculated as shown in Eq. (38): $$\begin{cases} a = \dfrac{U_{cpu}}{U_{cpu} + U_{mem} + U_{net}} \\ b = \dfrac{U_{mem}}{U_{cpu} + U_{mem} + U_{net}} \\ c = \dfrac{U_{net}}{U_{cpu} + U_{mem} + U_{net}} \end{cases}$$

Where $U_{cpu}$, $U_{mem}$, and $U_{net}$ are the CPU, memory, and network bandwidth utilization of the Node, respectively.
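Eqs. (37) and (38) combine into a short scoring function; the sketch below uses illustrative dict inputs for per-Pod and per-Node utilization:

```python
# Pod contribution degree W: weights follow the Node's own utilisation
# mix, so the dominant bottleneck dominates the ranking.
def pod_contribution(pod_util: dict, node_util: dict) -> float:
    total = node_util["cpu"] + node_util["mem"] + node_util["net"]
    return sum(
        (node_util[m] / total) * pod_util[m]    # a, b, c from Eq. (38)
        for m in ("cpu", "mem", "net")
    )
```

Pods on a high-load Node would then be sorted by $W$, largest first, and migrated in that order until the Node drops below its thresholds.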

Determination of Target Node Nodes

In this paper, the target Node is selected based on the high-load type of the source Node where the Pod to be migrated resides. Three low-load queues are maintained regularly: a CPU-type low-load queue, a memory-type low-load queue, and a network-bandwidth-type low-load queue. Taking the CPU-type queue as an example, an ordered list of CPU-type low-load Nodes is obtained by sorting all Nodes by CPU utilization from smallest to largest and excluding those whose overall load degree is greater than the average overall load degree of all Nodes in the cluster. When selecting the target Node for a Pod on a CPU-type high-load Node, the Node at the head of this ordered list is chosen as the migration target [23]. Here, the overall load degree $LB$ of a Node and the average overall load degree $\overline{LB}$ of all $k$ Nodes in the cluster are calculated as follows: $$LB = e_x U_{cpu} + e_y U_{mem} + e_z U_{net}$$ $$\overline{LB} = \frac{\sum_{i=1}^{k} LB_i}{k}$$
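A simplified sketch of this queue construction follows; for readability it uses an unweighted sum in place of the weighted $LB$ above, and the node list layout is illustrative:

```python
# Build the CPU-type low-load queue and return its head as the target.
def pick_target_node(nodes):
    """nodes: list of dicts with 'cpu', 'mem', 'net' utilisation fractions."""
    lb_values = [n["cpu"] + n["mem"] + n["net"] for n in nodes]  # simplified LB
    lb_avg = sum(lb_values) / len(lb_values)
    candidates = [n for n, lb in zip(nodes, lb_values) if lb <= lb_avg]
    queue = sorted(candidates, key=lambda n: n["cpu"])           # ascending CPU use
    return queue[0] if queue else None                           # None: no target
```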

Results and Discussion

The AI neural network-based resource scheduling model constructed in this paper is tested to examine whether its performance meets the design expectations. Static scheduling in a stable environment and dynamic scheduling in a changing environment are tested respectively to verify the prediction level and load balancing capability of the model.

Static resource scheduling test

In order to verify the practicality and reliability of the resource prediction and load balancing strategy in this paper, a Kubernetes cluster (version 1.15) is built on Linux, consisting of 1 master node and 4 Nodes used for running Pods.

The Pod task corresponds to a common web application request on the server, developed using the SSM framework. When the program receives a task request, it writes characters and numbers through character-stream operations, stores them in a corresponding file, and returns them to the client. The web application is then packaged as a Docker image, and the Kubernetes cluster deploys the Pods to the Node worker nodes through the resource scheduling policy, based on YAML commands. When a Pod task is deployed, a more suitable Node is selected for each Pod through a series of pre-selection, preference, and scoring mechanisms, making full use of the nodes' resources.

In order to observe the trend across the whole experiment, the number of Pod tasks is increased linearly from 5 to 50 in steps of 5. When a new round of scheduling is performed, all Pod tasks created in the previous round are deleted, and the cluster records the difference in the deployment time of each schedule.

In this resource scheduling experiment, the monitoring module designed in this paper monitors the utilization of each type of resource and the scheduling time of the deployed Node worker nodes. A comparison of resource utilization across the Nodes is shown in Fig. 4, where (a) to (d) correspond to Node1, Node2, Node3, and Node4, respectively.

Figure 4.

Comparison of node resource utilization

In the load balancing comparison experiment, we compare three scheduling strategies: the Kubernetes default resource scheduling strategy, the IGAACO resource scheduling strategy based on an improved genetic algorithm, and the neural network resource scheduling strategy proposed in this paper. As the figure shows, in the comparison of Node resource utilization, the Kubernetes default strategy yields uneven overall CPU and memory utilization across Node1~Node4. Although the IGAACO-based strategy is slightly better in the overall utilization of the various resource types, the utilization of certain resources still differs too much between nodes. In contrast, the neural network resource scheduling model in this paper balances the load of each node in the cluster and improves its load capacity.

Also, in this experiment, the scheduling times of the three scheduling strategies for different numbers of Pod tasks in the cluster are compared, and the respective changes are plotted. The specific changes are shown in Figure 5.

Figure 5.

Time consumption contrast diagram

From the figure, it can be seen that as the number of Pods increases, the scheduling time trends gradually upward. The Kubernetes default scheduling policy has the shortest overall scheduling time because its algorithm is relatively simple. The resource scheduling times of the IGAACO algorithm and the model in this paper are close when the number of Pod tasks is small, but the gap widens as the number of Pod tasks increases: with more Pod tasks, the model presented in this paper requires less scheduling time.

Dynamic resource scheduling test

The dynamic network scenario experiments are performed next. The settings for dynamic resource scheduling triggers are as follows:

The system is initialized, and parameters such as system resource specification factor, demand weight factor, dynamic scheduling threshold, and load information collection period are set.

The Kubelet component obtains the load information of each working node at regular intervals according to the set period, generally 20-60 seconds, and stores the collected load information in the etcd database.

The master node reads the load information from the etcd database and obtains the scheduling scheme under the dynamic scenario according to the strategy of dynamic load balancing and delay optimization.

The scheduler completes the migration of Pods and access requests and assigns them to the specified Node according to the scheduling scheme.
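The four steps above can be condensed into a control loop. The sketch below is purely illustrative: the helper functions are hypothetical stand-ins for the modules described in this paper, not real Kubernetes client calls, and their stub bodies return placeholder data.

```python
# Hedged sketch of the dynamic scheduling trigger loop; all helpers are
# hypothetical stand-ins for the timer/judgment/prediction/selection/
# migration modules, not actual Kubernetes APIs.
import time

PERIOD_S = 30   # load collection period, within the 20-60 s range above


def collect_node_loads():            # stand-in for Kubelet -> etcd (step 2)
    return {"node1": 0.82, "node2": 0.35, "node3": 0.40}   # placeholder data


def predict_load(node):              # stand-in for the prediction module
    return 0.8                                              # placeholder value


def select_pod(node):                # stand-in: Pod with highest W on node
    return f"pod-on-{node}"


def pick_target_node(loads):         # stand-in: head of the low-load queue
    return min(loads, key=loads.get)


def migrate_pod(pod, target):        # stand-in for the migration module
    print(f"migrating {pod} -> {target}")


def scheduling_loop(threshold: float):
    while True:
        loads = collect_node_loads()
        for node, util in loads.items():                    # step 3
            # Migrate only if both current and predicted load are high,
            # avoiding triggers on transient peaks.
            if util > threshold and predict_load(node) > threshold:
                target = pick_target_node(loads)
                if target != node:
                    migrate_pod(select_pod(node), target)    # step 4
        time.sleep(PERIOD_S)
```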

On a small scale, three simulation experiments were designed with three different total access scales. In all three experiments, the number of nodes is set to three, and the total number of Pod types to be deployed each time is set to four.

A comparison of the load balancing degree and delay condition of the three simulation results under the small-scale dynamic scenario is shown in Fig. 6 and Fig. 7.

Figure 6.

Dynamic small-scale simulation load equalization

Figure 7.

Dynamic small-scale simulation of delay changes

A load balance of 1 means that every node in the cluster has the same CPU and memory utilization and the cluster load is optimal; conversely, the lower the load balance value, the more unevenly resources are distributed among the nodes. From Fig. 6 it can be seen that the cluster's resource load balance is very low before Pods are dynamically migrated; after redeployment by the dynamic scheduling model, the cluster's load balance rises significantly and is slightly better than the dynamic migration scheme of the IGAACO algorithm. From Fig. 7 it can be seen that, owing to the growing number of accesses, the overall delay of the cluster is also at a high level; after the accesses are reallocated by the dynamic scheduling model, the overall delay of the cluster is reduced to a certain extent, and the reduction achieved by this paper's model is better than that of the IGAACO algorithm.

Conclusion

In this paper, we design a decomposition-based ARIMA-LSTM resource prediction model and implement resource scheduling for library digitization systems based on load balancing. Static and dynamic scheduling experiments were conducted, and they show that Kubernetes' default resource scheduling strategy yields uneven overall CPU and memory utilization across Node1 to Node4. Although the resource scheduling strategy based on the IGAACO algorithm achieves slightly better overall utilization of all resource types, the utilization of certain resources still differs significantly between nodes. In contrast, the neural-network-based resource scheduling model in this paper makes the load of each cluster node more balanced and improves the cluster's load capacity. After the dynamic scheduling model reallocates accesses, the overall latency of the cluster is reduced, and this paper's model achieves a better overall reduction than the IGAACO algorithm.
