Research on the fault diagnosis method of offshore oil and gas field equipment combined with deep reinforcement learning

In recent years, growing global energy demand has made deep-sea oil and gas exploitation one of the most active areas of the international energy industry. Exploiting offshore oil and gas resources effectively involves a wide variety of extraction equipment. Deepwater drilling equipment is one example, the most common types being semi-submersible drilling rigs and dynamically positioned drillships. A semi-submersible drilling platform has a ship-like structure and adapts to changes in water depth in the deep sea by floating up and down [1]. Dynamically positioned drillships, on the other hand, use a dynamic positioning system to hold a specific position during drilling operations [2]. Such equipment not only keeps drilling operations stable in deep-sea environments but also provides good operating conditions in rough seas. In addition, there are subsea production systems such as oil and gas well control equipment, horizontal separators, and oil and gas storage equipment [3-5]. Because these systems are core equipment in offshore petroleum engineering, their design and use must pay special attention to environmental factors, safety, and sustainable development. Finally, the offshore pipeline system, the main channel for transporting oil extracted at sea to land, has to cope with more complicated and arduous challenges, such as irregular seabed topography, the harsh marine environment, and the protection of marine ecology [6].

Standardizing equipment can improve its safety, reliability and efficiency, reduce the risk and cost of oil and gas extraction, and promote sustainable development [7-9]. However, the geopolitical, ecological and safety risks faced by deep-sea oil and gas exploitation also make standardization work difficult [10-12]. Countries have therefore strengthened research and practice on the standardization of offshore oil and gas exploitation equipment, continuously promoting technological innovation and industrial development as well as technical exchange and economic development between countries. For example, the trend toward intelligent, automated and digitalized equipment is becoming increasingly evident. The application of advanced technologies such as sensor technology, big data analysis, and artificial intelligence has made the monitoring, control, and maintenance of equipment more convenient and efficient, further improving the safety and reliability of oil and gas extraction in the complex deep-sea environment [13-16]. However, the complexity and harshness of the deep-sea environment and the defects of the equipment itself increase the risk of equipment failure [17]. To develop deep-sea oil and gas resources effectively and safely, equipment fault diagnosis in offshore oil and gas fields has therefore become particularly important.

With the rapid progress of global industrial technology, the safety of equipment in offshore oil and gas fields has received wide attention, and diagnosing the health status of this equipment helps protect both enterprise economics and personnel safety. This article proposes an acquisition process and a fusion method for multi-source heterogeneous data of offshore oil and gas field equipment, extracts equipment fault features with a PSO-optimized NLM algorithm, and fills in missing equipment fault data with a cubic spline interpolation algorithm. A deep domain adaptive fault diagnosis model for offshore oil and gas field equipment is then constructed using the DDPG reinforcement learning framework together with a sparse denoising autoencoder (SDAE). The effectiveness of the model is validated with offshore oil and gas field equipment fault data.

2

Multi-source heterogeneous data fusion and processing

Equipment fault diagnosis technology acquires vibration, process, electrical and other signals from mechanical equipment and, through graphical processing and mechanism-model analysis, assesses the working state of the equipment, gives timely warning of faults that are difficult to find through instruments and personnel experience alone, and predicts the trend of fault development. In its early days, equipment fault diagnosis relied mainly on human perception: personnel used their own experience to judge whether the equipment had an "apparent fault". After years of development, and with advances in data acquisition and signal processing technology, a relatively complete theoretical discipline has formed.

2.1

Multi-source heterogeneous data acquisition and fusion

2.1.1

Multi-source heterogeneous data acquisition

Because offshore oil and gas field equipment operates in a special geographic location, relying on personnel experience alone cannot detect equipment failures in a timely manner, which may lead to safety problems in the field. Data acquisition for offshore oil and gas field equipment is therefore particularly important. The main task of multi-source data acquisition is to collect data with various sensors and convert the collected physical signals into current or voltage signals, which an A/D converter then digitizes for computer recognition.

Multi-source data acquisition of offshore oil and gas field equipment is the basic part to carry out equipment fault diagnosis, and the storage of the collected characteristic parameters can provide the data basis for data display, data analysis and fault diagnosis. The specific data acquisition process is as follows: 1)

Select sensors according to the types of parameters the system monitors. Once the sensors are arranged, current, displacement, pressure and other types of sensors acquire the physical signals of the monitoring parameters of offshore oil and gas field equipment, such as motor current, brake oil pressure, vibration acceleration, gate spacing and gate force.

2)

Since the industrial computer can directly recognize only digital signals, the physical signals must be converted. The acquisition board performs the A/D conversion, and, combined with the data acquisition program, the industrial computer reads the digital signals of the monitoring parameters and stores them in the database.

3)

Through the access function of Configuration King, the on-site database can be accessed and connected, and the table contents of the database can be called to realize real-time data collection and display.

4)

By connecting to the server switch, the local database can be transferred to the remote data server, and an ODBC data source can be established on the application side to connect to the database, so that the data can be used on the computer to realize remote real-time collection, display and storage of multi-source data of offshore oil and gas field equipment. The data can also be transmitted remotely through a communication protocol for remote monitoring, data storage, data analysis and so on.

With the above data acquisition process, the data acquisition system of this paper is basically established. In addition to the key monitoring parameters mentioned above, the overall monitoring parameters of the system include analog and switching quantities as well as alarm information. The collected data are stored in a database for offshore oil and gas field equipment monitoring and fault diagnosis.

2.1.2

Multi-source heterogeneous data fusion

Once the multi-source sensor data of offshore oil and gas field equipment have been acquired, they need to be fused effectively, because different types of data carry different fault characteristics. Fusion lets the machine model learn the data characteristics better and provides a complete data structure for data analysis, fault diagnosis, etc. [18].

The constructed multi-source heterogeneous data model based on granular computing is shown in Figure 1. The granularity structure is defined according to the data types and characteristics, and the granularity of different integrated levels in the granularity structure is calculated to give a formalized description of the multi-structured granularity. Based on multi-scale sampling to achieve the fusion of data, a rule table is generated, and the rules in the table are used to perform comprehensive calculation on the basic data to form views with different granularity sizes and complete data modeling.

Some values in the actual data deviate from the expected situation. If these values are included in the statistics together with normal values, they greatly affect the correctness of the data processing; if they are simply eliminated, however, important characteristic information may be overlooked. In this paper, an outlier is defined as a measured value whose deviation from the mean is more than two times the standard deviation, and a measured value whose deviation from the mean is more than three times the standard deviation is referred to as a highly abnormal outlier. When processing the data, highly abnormal outliers should be eliminated.
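The two thresholds above can be illustrated with a minimal sketch. It assumes a one-dimensional array of measurements and the population standard deviation; the function names `classify_outliers` and `drop_highly_abnormal` are hypothetical and not part of the original system.

```python
import numpy as np

def classify_outliers(values, mild_k=2.0, severe_k=3.0):
    """Label samples: 0 = normal, 1 = outlier (> 2 sigma), 2 = highly abnormal (> 3 sigma)."""
    values = np.asarray(values, dtype=float)
    deviation = np.abs(values - values.mean())
    sigma = values.std()  # population standard deviation (assumption)
    labels = np.zeros(values.shape, dtype=int)
    labels[deviation > mild_k * sigma] = 1
    labels[deviation > severe_k * sigma] = 2
    return labels

def drop_highly_abnormal(values, severe_k=3.0):
    """Remove only the highly abnormal outliers; ordinary outliers are kept."""
    values = np.asarray(values, dtype=float)
    labels = classify_outliers(values, severe_k=severe_k)
    return values[labels < 2]
```

Keeping the samples labelled 1 reflects the concern stated above that simply discarding every deviating value may lose characteristic fault information.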

2.2

Equipment Fault Characterization and Handling

2.2.1

Fault Feature Selection Extraction

Mining the fault features contained in the multi-source data of offshore oil and gas field equipment is crucial for diagnosing equipment faults. To this end, this paper proposes a non-local means (NLM) denoising algorithm optimized by the PSO algorithm for denoising and extracting the fault features of offshore oil and gas field equipment.

The NLM algorithm can effectively retain the detail information and sharp edges of the signal, which can be used to deal with one-dimensional vibration signals. The algorithm extracts the periodic shock components in the signal based on the local similarity between the neighborhoods of the sample points, and achieves filtering and noise reduction by averaging the similar sample points.

The filtering and noise reduction effect of the NLM algorithm is strongly influenced by the radius of the search window $M$, the bandwidth parameter $\lambda$ and the radius of the similarity window $P$. The larger $M$ is, the better the noise reduction, but too large a value increases the computation time. If $\lambda$ is too small, noise fluctuations remain; if it is too large, the signal becomes too smooth and loses important information. If $P$ is too large, the signal loses key information; if it is too small, the similar blocks are unrepresentative and noisy. If the parameter combination is set manually, some fault characteristics may be lost from the filtered signal. Therefore, this paper optimizes the $[M, \lambda, P]$ parameter combination of NLM with an improved PSO algorithm to improve noise reduction and fault feature extraction.
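To make the roles of $M$, $\lambda$ and $P$ concrete, the following is a minimal sketch of a one-dimensional NLM filter. It assumes a common Gaussian weighting of the squared patch distance, normalized by the patch length, and reflective padding at the signal ends; the paper does not state its exact weighting, and the function name `nlm_denoise_1d` is hypothetical.

```python
import numpy as np

def nlm_denoise_1d(signal, search_radius, patch_radius, bandwidth):
    """Non-local means for a 1-D signal.

    search_radius -> M, patch_radius -> P, bandwidth -> lambda.
    """
    x = np.asarray(signal, dtype=float)
    n = x.size
    patch_len = 2 * patch_radius + 1
    # Reflect the ends so that every sample has a full similarity window.
    padded = np.pad(x, patch_radius, mode="reflect")
    # patches[i] is the similarity window centred on sample i.
    patches = np.lib.stride_tricks.sliding_window_view(padded, patch_len)
    out = np.empty(n)
    for i in range(n):
        lo = max(0, i - search_radius)
        hi = min(n, i + search_radius + 1)
        # Squared distance between the patch at i and each patch in the search window.
        d2 = np.sum((patches[lo:hi] - patches[i]) ** 2, axis=1)
        weights = np.exp(-d2 / (2.0 * patch_len * bandwidth ** 2))
        # Weighted average of the samples whose neighbourhoods look similar.
        out[i] = np.dot(weights, x[lo:hi]) / weights.sum()
    return out
```

A larger `search_radius` averages over more candidates at a higher cost per sample, while `bandwidth` controls how quickly dissimilar patches are down-weighted, which mirrors the trade-offs described above.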

When the PSO algorithm searches for the optimum, it explores candidate solutions according to the value of the fitness function, so the construction of the fitness function directly affects the quality of the optimization result. The kurtosis index highlights the impact characteristics of the denoised signal, but its stability is poor. The energy entropy index evaluates the uniformity and complexity of the energy distribution in the signal; it is stable and strongly resistant to noise.

Therefore, this paper combines the advantages of the kurtosis and energy entropy indexes to construct the minimum energy entropy-kurtosis ratio index $M_{HK}$, whose expression is: (1) $$M_{HK} = \min \left( H_{En} \Bigg/ \sum\limits_{i = 1}^{n} K_i \right)$$ (2) $$H_{En} = - \frac{1}{\lg k} \sum\limits_{i = 1}^{k} p_i \lg p_i$$

Where $H_{En}$ is the energy entropy of the reconstructed signal after noise reduction, $\sum_{i = 1}^{n} K_i$ is the sum of the kurtosis values of the IMF components in the reconstructed signal after noise reduction, $p_i = \varepsilon = \frac{E_{IMF}(i)}{E}$ is the ratio of the energy of the $i$th IMF component to the total energy $E$, and $k$ is the number of modes.
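As a hedged illustration of Eqs. (1) and (2), the sketch below evaluates the ratio for a set of already decomposed components, reading the index $K$ as the kurtosis of each component. The decomposition itself (for example EMD) is not shown; the array layout (one component per row), the non-excess kurtosis definition and all function names are assumptions.

```python
import numpy as np

def kurtosis(x):
    """Fourth standardized moment (non-excess kurtosis) of one component."""
    x = np.asarray(x, dtype=float)
    centred = x - x.mean()
    return np.mean(centred ** 4) / np.mean(centred ** 2) ** 2

def energy_entropy(imfs):
    """Normalized energy entropy of Eq. (2); needs at least two components."""
    imfs = np.asarray(imfs, dtype=float)
    energies = np.sum(imfs ** 2, axis=1)
    p = energies / energies.sum()
    p = p[p > 0]                      # skip empty components to avoid log(0)
    k = imfs.shape[0]
    return -np.sum(p * np.log10(p)) / np.log10(k)

def entropy_kurtosis_ratio(imfs):
    """Value inside the min(...) of Eq. (1); smaller is better."""
    return energy_entropy(imfs) / sum(kurtosis(c) for c in imfs)
```

In an optimization loop, this value would be computed for the signal filtered with each candidate $[M, \lambda, P]$ and the candidate with the smallest value retained.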

The smaller the $M_{HK}$ value of the NLM-filtered signal, the more obvious the fault impact characteristics in the signal and the less noise interference it contains. In this paper, $M_{HK}$ is selected as the fitness function of the PSO algorithm, and the steps by which the PSO algorithm optimizes the NLM parameter combination are as follows: 1)

Population initialization. Based on existing research, set the PSO parameters: learning factors $C_1 = C_2 = 2$, a population size of 50, $K = 50$ iterations, and inertia weights $\omega_{max} = 0.85$ and $\omega_{min} = 0.35$. Considering noise reduction performance and computational efficiency, set the optimization range of the NLM parameter $M$ to $[500, 1500]$, $\lambda$ to $[0.3, 3]$, and $P$ to $[10, 30]$.

2)

Setting the fitness function. Take the minimum of M_HK value of the signal after NLM filtering as the fitness function, and calculate the first generation of individual extreme values and global extreme values.

3)

Iterative optimization. Update the position and velocity of each particle in the population according to the particle update formula of the PSO algorithm, calculate the $M_{HK}$ value of the new particles, and search for the individual best values and the global best value of the population through cyclic iteration.

4)

Output NLM optimal parameter combinations. Stop searching when the number of evolutionary generations reaches the number of iterations K and output the NLM optimal [M, λ, P] parameter combinations.

5)

Denoise the reconstructed signals with the PSO-NLM algorithm and perform envelope spectrum analysis to extract the fault characteristics of offshore oil and gas field equipment [19].
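The optimization steps above can be sketched as a generic bounded PSO loop. The inertia weight is assumed to decrease linearly from $\omega_{max}$ to $\omega_{min}$, which the text does not specify, and `fitness` stands for any callable that maps a parameter vector such as $[M, \lambda, P]$ to a value to be minimized; in the method described here it would return the $M_{HK}$ value of the NLM-filtered signal. The function name `pso_minimize` is hypothetical.

```python
import numpy as np

def pso_minimize(fitness, lower, upper, n_particles=50, n_iter=50,
                 c1=2.0, c2=2.0, w_max=0.85, w_min=0.35, seed=0):
    """Minimize `fitness` inside the box [lower, upper] with a basic PSO."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size
    # Step 1: population initialization inside the search ranges.
    pos = rng.uniform(lower, upper, size=(n_particles, dim))
    vel = np.zeros_like(pos)
    # Step 2: first-generation individual and global bests.
    pbest = pos.copy()
    pbest_val = np.array([fitness(p) for p in pos])
    g = np.argmin(pbest_val)
    gbest, gbest_val = pbest[g].copy(), pbest_val[g]
    # Step 3: iterative optimization.
    for it in range(n_iter):
        # Linearly decreasing inertia weight (assumption).
        w = w_max - (w_max - w_min) * it / max(n_iter - 1, 1)
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lower, upper)
        vals = np.array([fitness(p) for p in pos])
        improved = vals < pbest_val
        pbest[improved] = pos[improved]
        pbest_val[improved] = vals[improved]
        g = np.argmin(pbest_val)
        if pbest_val[g] < gbest_val:
            gbest, gbest_val = pbest[g].copy(), pbest_val[g]
    # Step 4: output the best parameter combination found.
    return gbest, gbest_val
```

Because $M$ and $P$ are window radii, a caller would presumably round those two coordinates to integers inside `fitness` before running the filter.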

2.2.2

Cubic spline interpolation algorithm

After the PSO-NLM algorithm has denoised and filtered the offshore oil and gas field equipment data, some data values may be missing. If these are not handled effectively and reasonably, erroneous data will be constructed and the fault diagnosis results will be distorted. Therefore, this paper proposes to fill in incomplete data with the cubic spline interpolation (CSI) algorithm.

The CSI algorithm is a high-precision interpolation algorithm [20]. The algorithm has the following two characteristics: 1)

When there are $N + 1$ data points, they delimit $N$ intervals, which requires $N$ curve segments, each of which is a cubic polynomial.

2)

The values, slopes, and curvatures of neighboring curve segments are continuous, so neighboring segments connect smoothly. That is, assuming there are $N + 1$ data points, $N$ small curves are needed to connect them, the $i$th curve connects the $i$th and $(i + 1)$th points, and the expression of the $i$th curve is: (3) $$g_i(x) = a_i \left( x - x_i \right)^3 + b_i \left( x - x_i \right)^2 + c_i \left( x - x_i \right) + d_i$$

Each curve must satisfy the following four conditions: 1)

The endpoints of the curve pass through the data points, viz: (4) $$g_i \left( x_i \right) = y_i, \quad g_i \left( x_{i + 1} \right) = y_{i + 1}$$

2)

The values of neighboring curves are continuous, i.e.: (5) $$g_i \left( x_{i + 1} \right) = g_{i + 1} \left( x_{i + 1} \right)$$

3)

The slopes of neighboring curves are continuous, i.e.: (6) $$g'_i \left( x_{i + 1} \right) = g'_{i + 1} \left( x_{i + 1} \right)$$

4)

The curvatures of neighboring curves are continuous, i.e.: (7) $$g''_i \left( x_{i + 1} \right) = g''_{i + 1} \left( x_{i + 1} \right)$$

Let $S_i = g''_i \left( x_i \right)$ and let the step be $h_i = x_{i + 1} - x_i$. The coefficients $a_i$, $b_i$, $c_i$, $d_i$ can then be expressed in terms of $S_i$, $h_i$, $y_i$. Since $g''_i(x) = 6 a_i (x - x_i) + 2 b_i$, evaluating at $x_i$ and $x_{i + 1}$ and applying the above conditions gives: (8) $$\left\{ \begin{array}{l} d_i = y_i \\ b_i = \frac{S_i}{2} \\ a_i = \frac{S_{i + 1} - S_i}{6 h_i} \\ c_i = \frac{y_{i + 1} - y_i}{h_i} - \frac{2 h_i S_i + h_i S_{i + 1}}{6} \end{array} \right.$$

As: (9) $$\left\{ \begin{array}{l} g'_i \left( x_i \right) = c_i \\ g'_{i - 1} \left( x_i \right) = 3 a_{i - 1} h_{i - 1}^2 + 2 b_{i - 1} h_{i - 1} + c_{i - 1} \end{array} \right.$$

Substituting the expressions for $a_i$, $b_i$, $c_i$, and $d_i$ into Eq. (9) and applying the slope continuity condition $g'_i \left( x_i \right) = g'_{i - 1} \left( x_i \right)$ gives: (10) $$h_{i - 1} S_{i - 1} + 2 \left( h_{i - 1} + h_i \right) S_i + h_i S_{i + 1} = 6 \left( f \left[ x_i, x_{i + 1} \right] - f \left[ x_{i - 1}, x_i \right] \right)$$

where $f \left[ x_i, x_{i + 1} \right] = \left( y_{i + 1} - y_i \right) / h_i$.

The commonly used boundary conditions are: 1)

Natural boundary condition: $g''_0 \left( x_0 \right) = g''_{n - 1} \left( x_n \right) = 0$, i.e. $S_0 = S_n = 0$.

2)

Fixed boundary condition: $g'_0 \left( x_0 \right) = A, \ g'_{n - 1} \left( x_n \right) = B$.

3)

Not-a-knot boundary condition: $g'''_0 \left( x_1 \right) = g'''_1 \left( x_1 \right), \ g'''_{n - 2} \left( x_{n - 1} \right) = g'''_{n - 1} \left( x_{n - 1} \right)$.

If natural boundary conditions are used, $S_0 = S_n = 0$ and Eq. (10), written for $i = 1, \dots, n - 1$, can be expressed in matrix form, i.e.: (11) $$\left[ \begin{array}{cccccc} 2 \left( h_0 + h_1 \right) & h_1 & 0 & \cdots & 0 & 0 \\ h_1 & 2 \left( h_1 + h_2 \right) & h_2 & \cdots & 0 & 0 \\ 0 & h_2 & 2 \left( h_2 + h_3 \right) & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & h_{n - 2} & 2 \left( h_{n - 2} + h_{n - 1} \right) \end{array} \right] \left[ \begin{array}{c} S_1 \\ S_2 \\ S_3 \\ \vdots \\ S_{n - 1} \end{array} \right] = 6 \left[ \begin{array}{c} f \left[ x_1, x_2 \right] - f \left[ x_0, x_1 \right] \\ f \left[ x_2, x_3 \right] - f \left[ x_1, x_2 \right] \\ f \left[ x_3, x_4 \right] - f \left[ x_2, x_3 \right] \\ \vdots \\ f \left[ x_{n - 1}, x_n \right] - f \left[ x_{n - 2}, x_{n - 1} \right] \end{array} \right]$$
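A minimal sketch of the natural-boundary case follows. It assembles the tridiagonal system for $S_1, \dots, S_{n-1}$, derives the segment coefficients from the continuity conditions (4)-(7), and uses the resulting spline to fill values marked as NaN. Marking gaps with NaN, using a dense solver instead of a dedicated tridiagonal one, and the function names are assumptions made for brevity.

```python
import numpy as np

def natural_cubic_spline(x, y):
    """Return a function evaluating the natural cubic spline through (x, y)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size - 1                       # number of segments
    h = np.diff(x)                       # h_i = x_{i+1} - x_i
    slope = np.diff(y) / h               # f[x_i, x_{i+1}]
    S = np.zeros(n + 1)                  # natural boundary: S_0 = S_n = 0
    if n > 1:
        A = np.zeros((n - 1, n - 1))
        rhs = 6.0 * (slope[1:] - slope[:-1])
        for r in range(n - 1):
            i = r + 1                    # row r holds the equation for S_i
            A[r, r] = 2.0 * (h[i - 1] + h[i])
            if r > 0:
                A[r, r - 1] = h[i - 1]
            if r < n - 2:
                A[r, r + 1] = h[i]
        S[1:n] = np.linalg.solve(A, rhs)
    # Segment coefficients expressed through S, h and y.
    a = (S[1:] - S[:-1]) / (6.0 * h)
    b = S[:-1] / 2.0
    c = slope - (2.0 * h * S[:-1] + h * S[1:]) / 6.0
    d = y[:-1]

    def evaluate(xq):
        xq = np.asarray(xq, dtype=float)
        # Index of the segment containing each query point.
        idx = np.clip(np.searchsorted(x, xq, side="right") - 1, 0, n - 1)
        t = xq - x[idx]
        return ((a[idx] * t + b[idx]) * t + c[idx]) * t + d[idx]

    return evaluate

def fill_missing(t, values):
    """Replace NaN entries of `values` with spline estimates at times `t`."""
    t = np.asarray(t, dtype=float)
    values = np.asarray(values, dtype=float)
    known = ~np.isnan(values)
    spline = natural_cubic_spline(t[known], values[known])
    filled = values.copy()
    filled[~known] = spline(t[~known])
    return filled
```

For long records, a banded solver would be the more economical choice, since the system matrix is tridiagonal.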

Because the CSI algorithm is highly accurate, the sampling interval used for offshore oil and gas field equipment data can be made somewhat larger than the minimum sampling interval required by the sampling theorem, and the slightly undersampled data are still recovered well by the CSI algorithm. In other cases, linear interpolation is sufficient.

3

Fault diagnosis model for offshore oil and gas field equipment

With the development of science and technology, the post-digital oilfield era has arrived, and digital technology has become key to the exploration, development, and production of oil and gas field enterprises. Offshore oil and gas field facilities and equipment are also becoming more complex in structure and more complete in function. During production, many unavoidable factors cause facilities and equipment to suffer all kinds of failures, which degrade or eliminate their intended function and can even cause serious accidents and environmental pollution. This not only affects production safety but also causes great economic losses. Developing and applying a fault diagnosis model for offshore oil and gas field equipment can reduce accidental shutdowns and safety accidents caused by equipment failure, increase equipment availability and production efficiency, and improve the safety of offshore oil and gas field equipment.

3.1

DDPG and Sparse Denoising Autoencoder

3.1.1

DDPG Reinforcement Learning Framework

DDPG is a deep reinforcement learning algorithm suited to continuous action control problems. The algorithm combines the advantages of deep Q-networks and Actor-Critic algorithms and is both off-policy and model-free [21]. In DDPG, the Actor policy network is responsible for exploring the environment and outputting action decisions. This is critical for continuous action control in real-world problems such as rolling bearing life prediction, which typically have highly complex action spaces. The Critic evaluation network evaluates the strengths and weaknesses of each action and provides feedback for gradient updates, which guides the training direction of the Actor policy network.

In the DDPG algorithm, the Actor and the Critic each comprise two neural networks of the same structure: an online network and a target network. When the algorithm runs, the parameters $\theta_Q$ and $\theta_\mu$ of the online Q network and the online policy network are randomly initialized, and the parameters $\theta_{Q'}$ and $\theta_{\mu'}$ of the corresponding target networks $Q'$ and $\mu'$ are then initialized by copying them. The initialization is expressed as: (12) $$\theta_{Q'} \leftarrow \theta_Q, \quad \theta_{\mu'} \leftarrow \theta_\mu$$

The agent selects action $a_t$ according to the current policy, and the goal of the Actor network is to output the optimal action. This is done by the online policy network $\mu \left( s | \theta_\mu \right)$, with the expression: (13) $$a_t = \mu \left( s_t | \theta_\mu \right)$$

where s_t is the current state and a_t is the action outputted by the Actor network, which will be used to interact with the environment and obtain a reward.

In the decision-making process of the DDPG algorithm, the action $a_t$ is executed, the feedback reward $r_t$ and the new observed state $s_{t + 1}$ are obtained, and the interaction experience $\left( s_t, a_t, r_t, s_{t + 1} \right)$ is stored in the experience pool. Once the pool holds enough experience, $N$ interaction experiences are randomly sampled from it to train and update the network parameters. The target Q value of the Critic network is calculated with the target networks as: (14) $$y_t = r_t + \gamma Q' \left( s_{t + 1}, \mu' \left( s_{t + 1} | \theta_{\mu'} \right) | \theta_{Q'} \right)$$

The Critic network is updated by minimizing the TD error, which is the difference between the current estimate and the target value and represents the error of the Critic network in estimating the value of the action. The update expression is: (15) $$L \left( \theta_Q \right) = \frac{1}{N} \sum\limits_{i = 1}^{N} \left( Q \left( s_i, a_i | \theta_Q \right) - y_i \right)^2$$

This process enhances the Critic network’s accurate estimation of action values, which provides more reliable gradient information for the Actor network and promotes the optimization and improvement of strategies. It is through the Actor-Critic structure that the entire DDPG algorithm iteratively optimizes the policy and value functions to achieve efficient learning in continuous action control problems.

The Actor policy network updates its parameters along the policy gradient with the goal of maximizing the long-term cumulative reward, and the update expression is: (16) $$\nabla_{\theta_\mu} J \approx \frac{1}{N} \sum\limits_{i = 1}^{N} \nabla_a Q \left( s_i, a | \theta_Q \right) \Big|_{a = \mu \left( s_i | \theta_\mu \right)} \nabla_{\theta_\mu} \mu \left( s_i | \theta_\mu \right)$$

where $N$ is the number of samples drawn, $\nabla_a Q \left( s_i, a | \theta_Q \right)$ is the gradient of the Critic network with respect to the action, and $\nabla_{\theta_\mu} \mu \left( s_i | \theta_\mu \right)$ is the gradient of the Actor network with respect to the policy parameters.

The Critic's target network is updated as: (17) $$\theta_{Q'} \leftarrow \tau \theta_Q + (1 - \tau) \theta_{Q'}$$

where $\theta_{Q'}$ is the parameter of the target Critic network, $\theta_Q$ is the parameter of the online Critic network, and $\tau$ is the soft-update hyperparameter.

The Actor's target network is updated as: (18) $$\theta_{\mu'} \leftarrow \tau \theta_\mu + (1 - \tau) \theta_{\mu'}$$

where $\theta_{\mu'}$ is the parameter of the target Actor network and $\theta_\mu$ is the parameter of the online Actor network.

DDPG adopts a soft update method, so that the parameters of the target network maintain a certain degree of smoothness when updating, avoiding too drastic changes, and the whole DDPG algorithm realizes the joint learning of the action value and the strategy by means of this update method of the target network.
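The target computation, Critic loss and soft update can be written compactly as below. This is a framework-free sketch: `target_actor` and `target_critic` are hypothetical callables standing in for the target networks $\mu'$ and $Q'$, parameters are represented as plain lists of arrays, and gradient steps for the online networks are left out.

```python
import numpy as np

def td_targets(rewards, next_states, target_actor, target_critic, gamma):
    """Target Q values y = r + gamma * Q'(s', mu'(s')) for a sampled batch."""
    next_actions = target_actor(next_states)
    return rewards + gamma * target_critic(next_states, next_actions)

def critic_loss(q_values, targets):
    """Mean squared TD error over the batch."""
    return np.mean((q_values - targets) ** 2)

def soft_update(target_params, online_params, tau):
    """Move each target parameter a fraction tau toward its online counterpart."""
    return [tau * w + (1.0 - tau) * w_t
            for w, w_t in zip(online_params, target_params)]
```

A small `tau` keeps the targets in `td_targets` changing slowly, which is the smoothing effect described above.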

3.1.2

Sparse Denoising Autoencoder

The sparse denoising autoencoder (SDAE) consists of two parts: an encoder and a decoder. The SDAE first adds noise to the original input data to obtain damaged data samples; the encoder extracts features from the damaged samples, and the decoder reconstructs the undamaged data from the extracted higher-order features. This training process greatly improves the model's noise immunity. The higher-order features extracted by the SDAE are extremely robust and can express the high-dimensional original features abstractly without being disturbed by noise, so the SDAE model has very good noise immunity [22].

The SDAE first "destroys and contaminates" the original input data sample $x \in R^m$ to obtain the damaged data sample $\hat x \in R^m$, and the encoder then maps $\hat x$ to the hidden layer vector $z \in R^h$ by the following formula, i.e.: (19) $$z = f_\varphi (\hat x) = \frac{1}{1 + e^{- \left( W \hat x + B \right)}}$$

Where $W \in R^{h \times m}$ is the weight matrix connecting the input and hidden layers and $B \in R^{h \times 1}$ is the bias vector connecting the input and hidden layers.

The decoder maps the hidden layer vector $z$ to an output layer of the same dimension as the input layer by the following equation to obtain the reconstructed original input feature $x'$, i.e.: (20) $$x' = q_\alpha (z) = \frac{1}{1 + e^{- \left( W' z + B' \right)}}$$

where $W' \in R^{m \times h}$ is the weight matrix connecting the hidden and output layers and $B' \in R^{m \times 1}$ is the bias vector connecting the hidden and output layers.

The goal of SDAE training is to reconstruct the undamaged samples from the damaged samples $\hat x$. This is achieved by minimizing the following reconstruction loss, which measures the error between the reconstructed features $x'$ and the undamaged original features $x$. Namely: (21) $$L_{reconstruction} = \frac{1}{m} \sum\limits_{i = 1}^{m} \left( x'_i - x_i \right)^2$$

where $x'_i$ is the $i$th-dimensional feature quantity of the reconstructed input features and $x_i$ is the $i$th-dimensional feature quantity of the undamaged original input features.

The correlation between information such as fault characteristics of offshore oil and gas field equipment leads to a large number of redundant features in the high-dimensional original input information. Adding sparsification constraints to the SDAE model can force the SDAE model to automatically remove redundant features during the training process and extract features with high differentiation for equipment fault diagnosis, thus improving the performance of equipment fault diagnosis.
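A forward pass and loss for one SDAE layer might look as follows. The corruption is assumed to be masking noise (inputs set to zero with a given probability), and the sparsity constraint is assumed to be a KL-divergence penalty on the mean hidden activation, a common choice that the text does not confirm. The names `corrupt`, `sdae_forward`, `sdae_loss`, `rho` and `beta` are hypothetical, and the sign convention is sigmoid$(W \hat x + B)$.

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def corrupt(x, noise_rate, rng):
    """Masking noise: each input is zeroed with probability noise_rate."""
    mask = rng.random(x.shape) >= noise_rate
    return x * mask

def sdae_forward(x_noisy, W, B, W_dec, B_dec):
    """Encode the damaged batch (rows are samples) and decode it again."""
    z = sigmoid(x_noisy @ W.T + B)          # hidden features, shape (batch, h)
    x_rec = sigmoid(z @ W_dec.T + B_dec)    # reconstruction, shape (batch, m)
    return z, x_rec

def sdae_loss(x, x_rec, z, rho=0.05, beta=1e-3):
    """Reconstruction error against the clean input plus a KL sparsity penalty."""
    recon = np.mean((x_rec - x) ** 2)
    rho_hat = np.clip(z.mean(axis=0), 1e-8, 1.0 - 1e-8)  # mean activation per unit
    kl = np.sum(rho * np.log(rho / rho_hat)
                + (1.0 - rho) * np.log((1.0 - rho) / (1.0 - rho_hat)))
    return recon + beta * kl
```

Note that the loss compares the reconstruction with the clean input `x`, not with the corrupted input, which is what makes the autoencoder denoising.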

3.2

Fault diagnosis model for oil and gas field equipment

3.2.1

Deep Domain Adaptation

In this paper, we construct a deep mapping network $F$ based on the DDPG reinforcement learning framework combined with a sparse denoising autoencoder to reduce the domain distribution differences and enhance the cross-domain invariance of features. By adjusting the hyperparameter set $\{L, N\}$ of the deep domain adaptive network, the source and target domains can be synchronously projected into different deep hidden layers (common space), producing different domain adaptation effects. Here $L$ denotes the number of hidden layers of the deep domain adaptive network, and $N$ denotes the additive noise rate (i.e., the probability of randomly setting an input to zero).

In order to better realize the fault diagnosis of offshore oil and gas field equipment, this paper proposes a deep domain adaptive fault diagnosis model based on DDPG-SDAE, which evaluates the cross-domain adaptability of the deep mapping network through a comprehensive evaluation function, which consists of a domain difference penalty term and a class distance penalty term.

First, the MMD metric is chosen to measure and minimize the distributional difference between the two domains, i.e.: (22) $$MMD \left( X_s, X_t \right) = \left\| \frac{1}{n_s} \sum\limits_{i = 1}^{n_s} F \left( x_s^i \right) - \frac{1}{n_t} \sum\limits_{j = 1}^{n_t} F \left( x_t^j \right) \right\|$$

where $MMD \left( X_s, X_t \right)$ is the marginal distribution difference between the source domain feature set $X_s$ and the target domain feature set $X_t$ (with $P \left( X_s \right) \neq P \left( X_t \right)$), $n_s$ and $n_t$ denote the number of samples in the source and target domains, respectively, and $F$ denotes the mapping matrix, which in this paper refers specifically to a $t$-layer SDAE network, i.e.: (23) $$F = h_t = \tanh \left( W^t h \right), \quad t \geq 1$$

From the above equation, it is clear that the difference between the distributions of two domains can be assessed empirically in terms of the distance between the centers of their samples. The larger the MMD, the greater the difference between the two distributions; conversely, the closer the MMD is to 0, the more similar the two distributions are. Therefore, the domain difference penalty term is defined as:

(24) $$J_{MMD} = \arg\min MMD\left( X_s, X_t \right)$$
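The distance between sample centers can be sketched in a few lines of Python. This is a minimal illustration, assuming the norm is Euclidean and representing the mapping F by an arbitrary callable (the identity by default). The names `mean_vector` and `mmd` and the toy data are illustrative, not part of the paper.

```python
import math


def mean_vector(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def mmd(source, target, mapping=lambda x: x):
    """Euclidean distance between the mean mapped source and target features."""
    mu_s = mean_vector([mapping(x) for x in source])
    mu_t = mean_vector([mapping(x) for x in target])
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(mu_s, mu_t)))


# Hypothetical two-dimensional features for the two domains.
source = [[0.0, 0.0], [2.0, 2.0]]  # center (1, 1)
target = [[3.0, 4.0], [5.0, 6.0]]  # center (4, 5)
print(mmd(source, target))  # 5.0, the distance between the two centers
```

A value of 0 is returned when both domains share the same center, which matches the interpretation that a small MMD indicates similar distributions.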

Then, a class distance metric is introduced on the source domain samples to ensure that the discriminative information of DDPG reinforcement learning is preserved as far as possible in the feature mapping. Assuming an n-class classification problem, $x^k$ denotes the input features belonging to class k faults, and $h_t\left( x^k \right)$ denotes the features after mapping to the common space (the output of SDAE layer t). The expectation and variance of $h_t\left( x^k \right)$ are denoted as $E\left[ h_t\left( x^k \right) \right]$ and $Var\left[ h_t\left( x^k \right) \right]$, respectively. Thus, the intra-class distance, i.e., the distance between samples of the same class, can be computed as:

(25) $$D_{intra} = \sum\limits_{k = 1}^{n} \sqrt{Var\left[ h_t\left( x^k \right) \right]}$$

The inter-class distance, i.e., the distance between the samples of class i and class j, can be calculated as:

(26) $$D_{inter} = \left\| E\left[ h_t\left( x^i \right) \right] - E\left[ h_t\left( x^j \right) \right] \right\|_2 - \sqrt{Var\left[ h_t\left( x^i \right) \right]} - \sqrt{Var\left[ h_t\left( x^j \right) \right]}$$

where $\left\| \cdot \right\|_2$ denotes the Euclidean distance and 1 ≤ i ≤ j ≤ n. $D_{intra}$ and $D_{inter}$ denote the intra-class compactness and inter-class separation of the samples in the source domain, respectively. To facilitate classification, samples of the same class should be as close together as possible, while samples of different classes should be as far apart as possible. Therefore, the class distance penalty term, which minimizes $D_{intra}$ while maximizing $D_{inter}$, can be calculated as:

(27) $$J_{class} = \arg\min\left( \lambda D_{intra} - D_{inter} \right)$$
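The two class distances can be sketched as follows. This is a minimal illustration under two assumptions that the text does not spell out: the variance of a vector-valued feature is taken as the mean squared Euclidean distance to the class mean, and $D_{inter}$ is summed over all class pairs with i < j. The function names and the toy data are illustrative.

```python
import math
from itertools import combinations


def class_stats(features):
    """Return (mean vector, spread) for one class.

    Assumption: spread is the square root of the mean squared Euclidean
    distance to the class mean, standing in for sqrt(Var[h_t(x^k)]).
    """
    n = len(features)
    mean = [sum(col) / n for col in zip(*features)]
    var = sum(sum((v - m) ** 2 for v, m in zip(row, mean)) for row in features) / n
    return mean, math.sqrt(var)


def class_distance_penalty(classes, lam=1.0):
    """Return (lam * D_intra - D_inter, D_intra, D_inter) for mapped features."""
    stats = {label: class_stats(rows) for label, rows in classes.items()}
    d_intra = sum(spread for _, spread in stats.values())
    d_inter = 0.0
    # Assumption: the pairwise inter-class distances are summed over i < j.
    for i, j in combinations(sorted(stats), 2):
        (mean_i, spread_i), (mean_j, spread_j) = stats[i], stats[j]
        d_inter += math.dist(mean_i, mean_j) - spread_i - spread_j
    return lam * d_intra - d_inter, d_intra, d_inter


# Hypothetical mapped features for two fault classes.
classes = {
    "Q1": [[0.0, 0.0], [0.0, 2.0]],
    "Q2": [[10.0, 0.0], [10.0, 2.0]],
}
penalty, d_intra, d_inter = class_distance_penalty(classes, lam=1.0)
print(penalty, d_intra, d_inter)
```

Compact, well-separated classes give a small $D_{intra}$ and a large $D_{inter}$, so the penalty becomes more negative as the mapping improves.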

Finally, Eqs. (24) and (27) are integrated to obtain the final evaluation function as:

(28) $$\begin{array}{rcl} J_{final} &=& \arg\min\left( \mu J_{MMD} + J_{class} \right) \\ &=& \arg\min\left( \mu J_{MMD} + \lambda D_{intra} - D_{inter} \right) \end{array}$$

where μ and λ are non-negative scaling coefficients, $J_{MMD}$ is used to correct the overall displacement of the feature space, and $J_{class}$ is used to retain more distinguishable clustering regions. Because the evaluation function quantitatively measures the effect of domain adaptation, it can be used with the grid search method to search the preset parameter space and select the optimal parameter set $\left\{ L, N \right\}$ that gives the best cross-domain learning effect.
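The grid search over {L, N} can be sketched as below. This is only an outline: `toy_evaluate` is a hypothetical stand-in for the evaluation function of Eq. (28), which in practice would require training the mapping network for each candidate pair, and the candidate values are illustrative.

```python
from itertools import product


def grid_search(layer_options, noise_options, evaluate):
    """Return (best score, L, N) minimizing evaluate(L, N) over the grid."""
    best = None
    for n_layers, noise_rate in product(layer_options, noise_options):
        score = evaluate(n_layers, noise_rate)
        if best is None or score < best[0]:
            best = (score, n_layers, noise_rate)
    return best


def toy_evaluate(n_layers, noise_rate):
    # Hypothetical surrogate for J_final; a real evaluation would train the
    # network with these hyperparameters and compute Eq. (28).
    return (n_layers - 3) ** 2 + (noise_rate - 0.2) ** 2


best_score, best_L, best_N = grid_search([1, 2, 3, 4, 5], [0.1, 0.2, 0.3], toy_evaluate)
print(best_L, best_N, best_score)
```

Exhaustive search is practical here because the parameter space has only two dimensions and a handful of candidate values per dimension.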

3.2.2

Cross-condition fault diagnosis process

The flowchart of offshore oil and gas field equipment fault diagnosis based on DDPG-SDAE algorithm is shown in Fig. 2, and its specific realization steps are as follows:

Step1 Collect multi-source heterogeneous data of offshore oil and gas field equipment, make samples, and add labels to the samples.

Step2 Preprocess the samples and divide them into training set and test set in the ratio of 7:3.

Step3 Set the network parameters, such as the number of neurons in each layer, the number of iterations, the sparse penalty term parameter and the edge restriction parameter.

Step4 Take the training set as the input of the first layer network and train the first MSDA network.

Step5 Take the output of the hidden layer of the DDPG front network as the input of the next layer network and train this layer network. Iterate the operation in this way until the last layer is trained.

Step6 Take the output of the last hidden layer of the SDAE network as the input of the SoftMax classifier and use the raw data labels to train the network parameters of SoftMax.

Step7 Use the obtained initialized network parameter values as the initial values of each parameter in the model, and then fine-tune the whole network by back-propagation algorithm to complete the training.

Step8 The test samples are input into the fine-tuned DDPG-SDAE network, and then the extracted features are input into the trained SVM classifier to calculate the fault diagnosis rate of the algorithm.
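The 7:3 division in Step2 can be sketched as follows. This is a minimal example that assumes a stratified random split, so that each fault class keeps the same ratio in both sets; the paper does not state how the split is drawn. The function name, the seed and the toy labels are illustrative.

```python
import random
from collections import defaultdict


def stratified_split(samples, labels, train_ratio=0.7, seed=0):
    """Split (sample, label) pairs so each class keeps roughly train_ratio in training."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)
    train_idx, test_idx = [], []
    for label in sorted(by_label):
        idxs = by_label[label]
        rng.shuffle(idxs)  # random order within the class
        cut = round(len(idxs) * train_ratio)
        train_idx.extend(idxs[:cut])
        test_idx.extend(idxs[cut:])
    train = [(samples[i], labels[i]) for i in train_idx]
    test = [(samples[i], labels[i]) for i in test_idx]
    return train, test


# Hypothetical dataset: 10 samples for each of two classes.
samples = list(range(20))
labels = ["Q1"] * 10 + ["Q2"] * 10
train_set, test_set = stratified_split(samples, labels)
print(len(train_set), len(test_set))  # 14 6
```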

4

Troubleshooting validation of offshore oil and gas field equipment

At present, offshore oil and gas fields have initially established the provision of equipment for oil and gas production and have promoted awareness of proper equipment management. To adapt to the low-oil-price economic situation and the rapid development of the oil and gas Internet of Things, offshore oil and gas field enterprises have an increasingly strong demand to establish equipment life-cycle management specifications, extend equipment life and maintenance cycles, and improve the efficiency of equipment operation and maintenance and the timeliness of equipment failure analysis. It is therefore urgent to improve the standardization, automation and intelligent management of oil and gas production equipment, and to make such management an indispensable application platform in the construction of the “smart oilfield”.

4.1

Equipment Fault Data Processing

4.1.1

Effectiveness of fault feature extraction

To evaluate the performance of the proposed PSO-NLM algorithm in extracting fault features of offshore oil and gas field equipment, this paper analyzes a periodic Gaussian pulse signal with a frequency of 90 Hz, a bandwidth of 50% and a repetition frequency of 15 Hz, to which random noise is added. The sampling frequency of the signal is 120 kHz, and the pulse train lasts 2 s. The composite signal and the Gaussian pulse train are shown in Fig. 3: the composite signal is dominated by noise, and the Gaussian pulse train is almost completely submerged in it.
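A test signal of this kind can be generated with the short sketch below. It is an approximation under stated assumptions: the 50% bandwidth is interpreted as a fractional bandwidth measured at -6 dB (the usual convention for Gaussian-modulated pulses), 90 Hz is treated as the center frequency, and each sample receives only the contribution of the nearest pulse because neighboring pulses barely overlap at these settings. Function names are illustrative, and the noise term is left out.

```python
import math


def gaussian_pulse(t, fc, bw, bwr_db=-6.0):
    """Gaussian-modulated cosine with center frequency fc and fractional bandwidth bw."""
    ref = 10.0 ** (bwr_db / 20.0)
    # Envelope exp(-a t^2); a is chosen so the spectrum is bwr_db down at fc*(1 ± bw/2).
    a = -((math.pi * fc * bw) ** 2) / (4.0 * math.log(ref))
    return math.exp(-a * t * t) * math.cos(2.0 * math.pi * fc * t)


def pulse_train(fs, duration, fc, bw, rep_rate):
    """Sample a periodic Gaussian pulse train with pulses centered at k / rep_rate."""
    n = int(fs * duration)
    period = 1.0 / rep_rate
    signal = []
    for i in range(n):
        t = i / fs
        # Time offset from the nearest pulse center (nearest-pulse approximation).
        tau = t - round(t / period) * period
        signal.append(gaussian_pulse(tau, fc, bw))
    return signal


# Values quoted in the text: 120 kHz sampling, 2 s length, 90 Hz, 50 %, 15 Hz.
train = pulse_train(fs=120_000, duration=2.0, fc=90.0, bw=0.5, rep_rate=15.0)
print(len(train), train[0])
```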

The PSO-NLM algorithm proposed in this study is used to filter and denoise the composite signal. To verify its effectiveness and superiority, two commonly used fast filtering algorithms, median filtering (MF) and wavelet threshold denoising (WTD), are compared with it. Because PSO-NLM is an improvement on the NLM method, NLM is also applied to the original signal for comparison. In the noise reduction process, the window length, which is the only parameter of median filtering, is set to 35. Wavelet threshold filtering uses the optimal predictor variable threshold selection with soft thresholding, the sym3 wavelet basis function and 4 decomposition layers. For NLM and PSO-NLM, the structural block half-width is 90, the search region half-width is 6000, and the filter parameter is set to 1. The denoising results of the different algorithms are shown in Fig. 4, where Fig. 4(a)~(d) show the feature extraction results of MF, WTD, NLM and PSO-NLM, respectively. As can be seen from the figure, all four methods filter out a large amount of noise and highlight the Gaussian pulse sequence. Median filtering leaves the most residual noise, while wavelet threshold denoising seriously distorts the waveform. NLM and PSO-NLM recover a cleaner Gaussian pulse sequence with less distortion.
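The three NLM parameters named above (structural block half-width, search region half-width and filter parameter) can be seen in a plain one-dimensional NLM sketch. This is the basic NLM filter only, not the PSO-tuned variant proposed in the paper; the weight formula, the reflective edge handling and the small demonstration values are assumptions chosen to keep the example short.

```python
import math
import random


def nlm_1d(signal, patch_half, search_half, h):
    """Plain 1-D non-local means: average samples whose surrounding patches look alike."""
    n = len(signal)
    if patch_half + search_half >= n:
        raise ValueError("patch_half + search_half must be smaller than the signal length")

    def at(i):
        # Reflect indices that fall outside the signal.
        if i < 0:
            i = -i
        if i >= n:
            i = 2 * (n - 1) - i
        return signal[i]

    patch_len = 2 * patch_half + 1
    out = []
    for i in range(n):
        weight_sum = 0.0
        value_sum = 0.0
        for j in range(i - search_half, i + search_half + 1):
            # Mean squared difference between the patches centered at i and j.
            d2 = sum((at(i + k) - at(j + k)) ** 2
                     for k in range(-patch_half, patch_half + 1)) / patch_len
            w = math.exp(-d2 / (h * h))  # similar patches receive larger weights
            weight_sum += w
            value_sum += w * at(j)
        out.append(value_sum / weight_sum)
    return out


# Small hypothetical demonstration (much smaller than the settings in the text).
rng = random.Random(0)
clean = [math.sin(2.0 * math.pi * i / 100.0) for i in range(300)]
noisy = [v + rng.gauss(0.0, 0.3) for v in clean]
denoised = nlm_1d(noisy, patch_half=3, search_half=10, h=0.5)
```

The cost grows with the product of the signal length, the search width and the patch width, which is why the choice of these parameters affects running time as well as accuracy.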

In order to quantitatively compare the denoising performance of the four filtering methods, signal-to-noise ratio (SNR) and mean absolute error (MAE) are introduced as signal denoising evaluation indexes. White noise from -10dB to 20dB SNR is sequentially added to the original Gaussian pulse sequence, and MAE is used as a quantitative index to examine the denoising performance of median filtering, wavelet threshold filtering, NLM and PSO-NLM in different noise environments. Figure 5 shows the feature extraction effect of different algorithms under different signal-to-noise ratios.
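The evaluation procedure can be sketched as follows: white Gaussian noise is scaled so that the noisy signal has a prescribed SNR, and the MAE is computed against the clean reference. The function names and the sinusoidal test signal are illustrative; the denoising step itself is omitted, so the printed MAE values describe the unfiltered noisy signal only.

```python
import math
import random


def add_white_noise(signal, snr_db, rng):
    """Add zero-mean Gaussian noise scaled to give the requested SNR in dB."""
    power = sum(v * v for v in signal) / len(signal)
    noise_power = power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [v + rng.gauss(0.0, sigma) for v in signal]


def mean_absolute_error(reference, estimate):
    """Mean absolute difference between two equal-length sequences."""
    return sum(abs(r - e) for r, e in zip(reference, estimate)) / len(reference)


def measured_snr_db(clean, noisy):
    """Empirical SNR of a noisy signal relative to its clean reference."""
    signal_power = sum(c * c for c in clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy))
    return 10.0 * math.log10(signal_power / noise_power)


rng = random.Random(0)
reference = [math.sin(2.0 * math.pi * i / 50.0) for i in range(20000)]  # hypothetical signal
for snr_db in (-10, 0, 10, 20):
    noisy = add_white_noise(reference, snr_db, rng)
    print(snr_db, round(mean_absolute_error(reference, noisy), 3))
```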

It can be seen that the PSO-NLM algorithm maintains high filtering accuracy across the tested signal-to-noise ratios and remains robust even at low signal-to-noise ratios. When the signal-to-noise ratio increases from -10 dB to 20 dB, the feature extraction MAE of the PSO-NLM algorithm falls from 0.574 to 0.081. In addition, the running time of each algorithm is recorded in this paper: all four algorithms run in less than 0.8 s, so each can serve as a feature extraction method for fault diagnosis in offshore oil and gas fields. Among them, the PSO-NLM algorithm has the shortest running time, only 0.051 s, which is about 81.48% shorter than that of the NLM algorithm. Therefore, applying the PSO-NLM algorithm to extract fault features from offshore oil and gas field equipment gives better extraction efficiency.

4.1.2

Validity of data interpolation

In order to ensure the integrity of multi-source heterogeneous data of offshore oil and gas field equipment, this paper introduces the CSI algorithm to interpolate and fill them. In order to further understand the effect of this algorithm for interpolation filling, this paper carries out simulation verification of the nearest neighbor interpolation, linear interpolation, cubic interpolation and CSI algorithm with corresponding simulation signals.

The given simulation signal S(t) is superimposed by two signals S₁ and S₂, where S₁ is a sinusoidal signal with a frequency of 20 Hz, and S₂ is a cosine FM signal with a frequency range of 0 to 20 Hz. Fig. 6 shows the corresponding time-domain waveform result of the simulation signal S(t), which has a period of 2 s and an amplitude range of ±3 m/s².

Based on the simulated signal, nearest neighbor interpolation (CI), linear interpolation (LI), cubic interpolation (CUI) and the CSI algorithm are used to obtain the upper and lower envelopes of the simulated signal, as shown in Fig. 7, where Figs. 7(a)~(d) show the interpolation results of CI, LI, CUI and CSI, respectively. Based on these simulation results, the interpolation methods are analyzed and evaluated in terms of smoothness and similarity.

1)

Smoothness. In theory, smoothness is described mathematically by the existence and continuity of the k-th order derivatives of a function or curve; a curve whose derivatives up to order k exist and are continuous is said to have k-th order smoothness. The higher the order, the better the smoothness. Nearest neighbor interpolation and linear interpolation both have zero-order smoothness, i.e., they are not smooth, whereas cubic interpolation and the CSI algorithm achieve higher-order smoothness using low-degree polynomials. Theoretically, therefore, the four interpolation methods ranked from least to most smooth should be CI<LI<CUI<CSI. Comparing the smoothness of the upper and lower envelopes in the figure, the CSI algorithm gives the smoothest envelope, cubic interpolation is second, and nearest neighbor and linear interpolation are the worst. The simulation results are in good agreement with the theoretical analysis.

2)

Degree of similarity. The degree of similarity, also known as the degree of approximation, represents how close the envelope is to the real signal. An envelope generated by interpolation should meet two basic requirements: it should enclose all the data points, and it should reflect the trend of the real signal well. Judged against these two requirements, the results in the figure show that nearest neighbor interpolation and linear interpolation not only have poor smoothness, but also produce an envelope that is cut off at the extreme points closest to the endpoints. The data between each endpoint and its nearest extreme point are not interpolated, so the resulting envelope does not enclose all the data points of the signal. The envelope obtained by the CUI algorithm is relatively poor in smoothness, and its overall trend is relatively flat. The envelope of the CSI algorithm is relatively smooth and shows the overall trend more clearly. Therefore, the CSI algorithm better reflects the trend of the signal, and using it to interpolate offshore oil and gas field equipment failure signals is feasible and effective.
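The smoothness argument can be made concrete with a small spline sketch. The code below assumes that CSI refers to cubic spline interpolation and implements a natural cubic spline (zero second derivative at both ends), which is one common variant; the boundary condition used in the paper is not stated. The knot values standing in for local maxima of a signal are hypothetical.

```python
from bisect import bisect_right


def natural_cubic_spline(xs, ys):
    """Return a function evaluating the natural cubic spline through (xs, ys)."""
    n = len(xs) - 1
    if n < 1 or any(xs[i + 1] <= xs[i] for i in range(n)):
        raise ValueError("xs must contain at least two strictly increasing values")
    h = [xs[i + 1] - xs[i] for i in range(n)]
    alpha = [0.0] * (n + 1)
    for i in range(1, n):
        alpha[i] = 3.0 * (ys[i + 1] - ys[i]) / h[i] - 3.0 * (ys[i] - ys[i - 1]) / h[i - 1]
    # Forward sweep of the tridiagonal system for the second-derivative terms.
    l = [1.0] * (n + 1)
    mu = [0.0] * (n + 1)
    z = [0.0] * (n + 1)
    for i in range(1, n):
        l[i] = 2.0 * (xs[i + 1] - xs[i - 1]) - h[i - 1] * mu[i - 1]
        mu[i] = h[i] / l[i]
        z[i] = (alpha[i] - h[i - 1] * z[i - 1]) / l[i]
    # Back substitution for the polynomial coefficients on each interval.
    b = [0.0] * n
    c = [0.0] * (n + 1)
    d = [0.0] * n
    for j in range(n - 1, -1, -1):
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = (ys[j + 1] - ys[j]) / h[j] - h[j] * (c[j + 1] + 2.0 * c[j]) / 3.0
        d[j] = (c[j + 1] - c[j]) / (3.0 * h[j])

    def evaluate(x):
        j = min(max(bisect_right(xs, x) - 1, 0), n - 1)
        dx = x - xs[j]
        return ys[j] + b[j] * dx + c[j] * dx * dx + d[j] * dx ** 3

    return evaluate


# Hypothetical local maxima (time, amplitude) used as knots of an upper envelope.
peak_t = [0.0, 0.4, 0.9, 1.5, 2.0]
peak_v = [2.1, 2.8, 2.4, 2.9, 2.2]
upper_envelope = natural_cubic_spline(peak_t, peak_v)
print([round(upper_envelope(t), 3) for t in (0.2, 0.65, 1.2, 1.75)])
```

Because adjacent cubic pieces share first and second derivatives at every knot, the resulting envelope has no corners, unlike the piecewise-constant and piecewise-linear alternatives.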

4.2

Equipment Troubleshooting Validation

4.2.1

Comparison of Fault Diagnosis Effectiveness

Rolling bearings, the most common component in offshore oil and gas field equipment, are used as an example to analyze the effectiveness of the proposed method. Rolling bearing data collected by the Electrical Engineering Laboratory of Case Western Reserve University (CWRU) are selected as the identification samples. Localized damage was introduced into the bearings by electrical discharge machining, giving four states: no fault (Q1), outer ring fault (Q2), inner ring fault (Q3) and rolling element fault (Q4). The signals of the four states are processed by feature extraction and interpolation filling using the PSO-NLM algorithm and the CSI algorithm described in the previous sections, yielding 470 groups of no-fault bearing data and 320, 180 and 120 groups of rolling element fault, inner ring fault and outer ring fault data, respectively. The first 370 groups of no-fault data, the first 200 groups of rolling element fault data, the first 100 groups of inner ring fault data and the first 60 groups of outer ring fault data form the training data, and the remaining data are used to test the recognition rate of the model. The G-DBM and PSO-G-DBM models are selected for comparison. The fault diagnosis results of the different models are shown in Fig. 8, where Figs. 8(a)~(c) show the results of the G-DBM, PSO-G-DBM and DDPG-SDAE models, respectively.

The confusion matrices of the different models show that, after noise reduction and interpolation filling of the signals, the different types of equipment faults are recognized well. All three models recognize the normal-state bearing signal with an accuracy above 90%, and the DDPG-SDAE model designed in this paper reaches 98.95%. For outer ring faults and rolling element faults, the diagnostic accuracy of the G-DBM and PSO-G-DBM models is below 90%, whereas the model designed in this paper reaches 92.64% and 94.84%, respectively. This demonstrates that the proposed fault diagnosis model, which incorporates deep reinforcement learning, has a better diagnostic effect and can better support the overhaul of offshore oil and gas field equipment and help guarantee its safety.
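For reference, per-class accuracies of this kind are read from a confusion matrix as sketched below. The matrix entries are hypothetical and are not the values behind Fig. 8; only the row totals (100, 60, 80 and 120 test groups for Q1 to Q4) follow from the training and test split described above.

```python
def per_class_accuracy(confusion):
    """confusion[i][j] counts class-i test samples predicted as class j."""
    return [row[i] / sum(row) for i, row in enumerate(confusion)]


def overall_accuracy(confusion):
    """Fraction of all test samples that lie on the diagonal."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


# Hypothetical counts; rows and columns are ordered Q1, Q2, Q3, Q4.
confusion = [
    [99, 0, 1, 0],    # Q1: no fault, 100 test groups
    [2, 56, 1, 1],    # Q2: outer ring fault, 60 test groups
    [0, 1, 78, 1],    # Q3: inner ring fault, 80 test groups
    [1, 3, 2, 114],   # Q4: rolling element fault, 120 test groups
]
print([round(a, 4) for a in per_class_accuracy(confusion)])
print(round(overall_accuracy(confusion), 4))
```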

4.2.2

Troubleshooting Migration Results

The DDPG-SDAE model designed in this paper is capable of fault diagnosis across domains and operating conditions. To validate its fault diagnosis migration effect, this paper takes three different types of equipment fault datasets from an offshore oil and gas field platform as an example and, considering both single-sensor migration and multi-sensor information fusion migration, designs a total of nine source-domain to target-domain migration tasks, as shown in Table 1. For the single-source-domain migration tasks, the data of any one sensor are taken as the source domain and the data of each of the other two sensors as a target domain. Because the distribution of the target domain test data differs from that of the source domain training data, this gives six single-sensor migration tasks. For multi-source-domain migration, in order to make full use of the historical data acquired from multiple source domains, the data of any two of the three sensors are treated as the source domain and the data of the remaining sensor as the target domain, giving three multi-sensor fusion migration tasks.

Table 1.

Different migration task descriptions

Description	Source domain	Target domain	Migration task
Single-sensor migration task	Sensor 1	Sensor 3	1→3
	Sensor 2	Sensor 3	2→3
	Sensor 1	Sensor 2	1→2
	Sensor 3	Sensor 2	3→2
	Sensor 2	Sensor 1	2→1
	Sensor 3	Sensor 1	3→1
Multi-sensor migration task	Sensor 1+2	Sensor 3	1+2→3
	Sensor 1+3	Sensor 2	1+3→2
	Sensor 2+3	Sensor 1	2+3→1
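The task counts in Table 1 follow directly from the combinatorics of three sensors, as the short sketch below shows: ordered pairs of distinct sensors give the six single-sensor tasks, and each unordered pair paired with the remaining sensor gives the three multi-sensor tasks. The variable names are illustrative.

```python
from itertools import combinations, permutations

sensors = [1, 2, 3]

# Single-sensor tasks: every ordered (source, target) pair of distinct sensors.
single_tasks = [f"{s}→{t}" for s, t in permutations(sensors, 2)]

# Multi-sensor tasks: two sensors form the source, the remaining one is the target.
multi_tasks = []
for pair in combinations(sensors, 2):
    target = next(s for s in sensors if s not in pair)
    multi_tasks.append(f"{pair[0]}+{pair[1]}→{target}")

print(single_tasks)
print(multi_tasks)
```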

To verify the effectiveness and superiority of the proposed method, it is tested on the single-sensor and multi-sensor migration tasks and compared with the CNN and shallow migration DACNN models. Fig. 9 shows the fault diagnosis migration results of the different models, where Fig. 9(a)~(b) show the results of the single-sensor and multi-sensor migration tasks, respectively.

On the single-sensor migration tasks, the classification performance of the CNN fluctuates widely between sensors. For example, its diagnostic accuracies on task 1→3 (sensor 1 to sensor 3) and task 2→3 (sensor 2 to sensor 3) are 86.96% and 97.82%, whereas on 2→1 and 3→1 they are only 73.07% and 64.92%, respectively. This is because the fault occurs inside the planetary gear, and the signal is transmitted through gear meshing and shaft vibration to the case surface, where it is picked up by the sensor. Because of the vibration signal transmission path, sensors at different locations obtain clearly different information, which leads to a large distribution difference between the data from different sensors. The diagnostic accuracy from sensor 2 to sensor 3 is significantly higher than that from sensor 1 to sensor 3, and the diagnostic accuracy from sensor 2 to sensor 1 is also significantly higher than that from sensor 3 to sensor 1. This shows that the closer two sensors are installed, the more similar the diagnostic information they obtain and the smaller the difference between the two data distributions, which in turn favors better generalization. In addition, both the DACNN and the proposed method, which use the migration learning technique, achieve significantly higher classification accuracy than the CNN. Across the migration tasks, the lowest diagnostic accuracy of the proposed DDPG-SDAE model is 96.77% (3→1) and the highest is 99.05% (3→2). Compared with the shallow migration CNN, its fault diagnosis accuracy is improved by about 10~32 percentage points.

This shows that, by utilizing unlabeled datasets in the target domain and minimizing the classifier output error, the model in this paper can significantly improve the migration capability across different sensor data and obtain more stable and higher diagnostic accuracy. In addition, on the multi-sensor migration tasks, the classification accuracy of this paper’s model is more competitive than in the single-sensor results, and the highest classification accuracy is obtained by combining deep reinforcement learning, domain adaptation techniques and minimization of the classifier output error.

5

Conclusion

In order to improve the fault diagnosis and identification of offshore oil and gas field equipment, this paper proposes a deep domain adaptive oil and gas field equipment fault diagnosis model based on the deep reinforcement learning DDPG framework and SDAE, and carries out a validation analysis of the effectiveness of the model.

1)

The PSO-NLM algorithm is used to extract the fault features of offshore oil and gas field equipment. Its MAE falls from 0.574 to 0.081 when the signal-to-noise ratio increases from -10 dB to 20 dB, and its running time is only 0.051 s. Extracting equipment fault features with the PSO-NLM algorithm is therefore efficient, and it also fully guarantees the denoising effect on the fault signal.

2)

Interpolation filling of the fault features with the CSI algorithm yields a smoother curve that stays closer to the original signal. Therefore, using the CSI algorithm to interpolate and fill the fault characteristic data better retains the changes of the original data sequence and provides reliable data for accurate equipment fault diagnosis.

3)

The recognition accuracy of the DDPG-SDAE model for normal equipment data can reach 98.95%, and the fault diagnosis accuracy under cross-domain is 10~32 percentage points higher than that of shallow migration CNN.

In summary, introducing deep reinforcement learning into the fault diagnosis of offshore oil and gas field equipment can significantly improve the accuracy of equipment fault diagnosis and help ensure the stable operation of offshore oil and gas field related equipment.

Language: English

Publication timeframe: 1 time per year
Journal Subjects: Life Sciences, Life Sciences, other, Mathematics, Applied Mathematics, General Mathematics, Physics, Physics, other


Research on the fault diagnosis method of offshore oil and gas field equipment combined with deep reinforcement learning

Dexi Zhao

Feng Zhu

Published Online: Mar 26, 2025

Received: Nov 09, 2024

Accepted: Feb 07, 2025

DOI: https://doi.org/10.2478/amns-2025-0804

Keywords: Deep reinforcement learning, Multi-source heterogeneous data, Adaptive NLM algorithm, Equipment fault diagnosis

© 2025 Dexi Zhao et al., published by Sciendo

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
