Dynamic analysis of railroad track faults based on neural network algorithms

As a main mode of transportation, the railroad plays an indispensable role in national economic development and national defense. To meet the growing demand for travel, countries around the world have raised railroad transportation capacity by increasing operating speed and train density [1]. In China, with the introduction of the “Fuxing” train sets, high-speed railroads have successively resumed operation at 350 km/h. The higher the operating speed, the more pronounced the damage that trains inflict on the track structure. On the one hand, higher speeds intensify wheel-rail dynamic interaction, so rail fasteners become highly susceptible to fatigue damage; the resulting loss of fastener clamping force changes the track stiffness, which further aggravates the damage that trains cause to the infrastructure beneath the rail and creates a vicious cycle of deteriorating track condition. On the other hand, as traffic density increases, track maintenance and repair become more frequent, yet the maintenance windows on busy main lines are short, so the time available for maintaining and repairing high-speed railroads is limited and it is difficult to find and repair line damage in time [2-3].

At the same time, China’s high-speed railroad has moved from the stage of large-scale “design and construction” to the stage of long-term “operation and maintenance”. Because China’s high-speed lines were built on short construction cycles and have been in service for only a short time, infrastructure deterioration on operating lines has now become evident, and local damage is serious in some sections. Investigations show that ballastless track has developed many kinds of structural and component damage, such as cracking of components (track slab, base, support layer), bonding failure between components, and fracture of fastener elastic clips, all of which have a non-negligible impact on the service performance of the track structure and on running safety [4-5]. Therefore, studying railroad track fault detection methods based on neural network algorithms, so as to strengthen monitoring of the track structure state, detect track faults in time, correctly guide line maintenance and repair, prevent further deterioration of track defects, and ensure the safety of railroad transportation, is a necessary step toward the intelligent modernization of China’s railroads and has great economic and strategic significance [6-7].

In this paper, the rolling bearings of the railroad track are analyzed from the perspective of kinematics and dynamics, and after establishing the dynamics analysis model with five degrees of freedom, the dynamics model of the rolling bearings failure of the railroad track fault is established. The wavelet transform is used to convert one-dimensional signals into two-dimensional images, so as to provide richer information for the analysis of railroad track faults. The attention mechanism is introduced into the CNN-BiLSTM network, which makes the fault feature extraction process more efficient and realizes the reinforcement of key information. Meanwhile, in order to prevent the overfitting phenomenon during the training process, this paper adopts the Dropout technique to improve the generalization ability and performance of the railroad track fault diagnosis model. The railroad track fault diagnosis model constructed in this paper accelerates the fault diagnosis speed, improves the diagnosis accuracy, and has a better application prospect in railroad track fault diagnosis.

2

Overview

When a railroad track develops a fault, it is important to identify the fault type and repair it as quickly as possible. Literature [8] builds a deep fusion model for the detection and semantic segmentation of railroad track faults; comprehensive tests verify that the model can inspect the track effectively by detecting and segmenting faults, which in turn reduces derailments and other railway accidents. Literature [9] proposes a high-speed train bogie fault diagnosis method based on an LSTM neural network and verifies its effectiveness and feasibility experimentally; the work is significant for maintaining the safety and stability of the railroad. Literature [10] proposes a new method for detecting defects in railroad track fasteners based on image processing and deep learning, and verifies it through fastener defect detection tests; the work is significant for ensuring rail safety and reducing maintenance costs. Literature [11] introduces the YOLOv4 deep neural network into working condition monitoring and fault detection and verifies the soundness of the method by evaluating the model on test data, which safeguards railroad track safety to a certain extent. Literature [12] proposes a new data-driven technique for the automatic detection of railroad track faults using three object detection models, YOLOv5, Faster RCNN, and EfficientDet, and verifies through model training that the technique detects railroad track faults more accurately.
Literature [13] designed a railroad track fault detection and localization scheme based on IoT, and took the data of Pakistan railroad line as an example for experimental analysis, and the experimental results verified the applicability of the proposed method, which helps to ensure the safe and reliable operation of trains.

In addition, literature [14] proposes a Pandrol track fastener defect detection method based on a local convolutional neural network; experiments verify that it effectively detects fastener defects and whether bolts have loosened, which in turn reduces railroad track safety accidents. Literature [15] develops an automatic detection technique for fastener defects in catenary support devices based on a deep convolutional neural network; extensive experiments and comparisons verify that the technique has a high detection rate, good adaptability, and good robustness, which is significant for ensuring the safety of railroad operation and reducing costs. Literature [16] proposes a railroad track fault detection method based on MobileNet image processing that achieves good performance in classifying railroad track faults and, to a certain extent, ensures track safety and reduces maintenance costs. Literature [17] proposes a cost-effective, image-processing-based railroad track fault detection method that captures images of the track and checks for small cracks; experiments verify that the method can both estimate crack length and report on the health of the track, which in turn reduces railroad accidents. Literature [18] points out that cracks in the railroad track can cause railroad accidents and designs a deep neural network-based fault detection system for tracks and fasteners; system performance tests verify that it can detect track faults and cracks early so that they can be corrected.
Literature [19] proposed and verified the effectiveness of a vision-based crack detection system, which can identify the presence of cracks on the track more quickly than other existing systems and does not affect the normal operation of the train during operation. Literature [20] proposed a machine learning automatic fault detection technology based on artificial neural network, and applied it to railroad track fault detection, and verified its practicality through empirical analysis, which can effectively reduce the occurrence of train accidents.

3

Dynamic modeling of railroad track failures

As a core component of the railroad track, the rolling bearing directly affects track operation through its working condition. At the same time, as mechanical equipment becomes larger in scale and more clustered, railroad track fault monitoring has to deal with large volumes of widely distributed data. Developing a rolling bearing fault monitoring model suited to these needs is therefore a prerequisite for a more comprehensive track fault monitoring scheme. The first step is to model the dynamics of the rolling bearing.

3.1

Model assumptions and motion analysis

The main contact relationships and motion characteristics of rolling bearings are the basis for the following assumptions:

Balls in rolling bearings are distributed equidistantly between the inner and outer raceways.

The inner ring of the rolling bearing rotates with the shaft and the outer ring is fixed in the housing.

There is pure rolling between the balls and the raceways.

The influence of oil-film lubrication and frictional heating is neglected.

The waviness of the bearing raceway surfaces and the rotational inertia are neglected.

Only damage and elastic contact between the rolling elements and the raceways are considered; other geometric errors are ignored.

Under these assumptions, the bearing clearance and the nonlinear contact forces produce varying compliance (VC) vibration during operation, so the total stiffness of the bearing changes cyclically. Figure 1 is a schematic of the rolling bearing motion model.

As shown in Fig. 1, let the radius of the inner ring of the rolling bearing be R_n, the linear velocity of the contact point between the ball and the inner raceway be v_n, the angular velocity of the inner ring be ω_n, the radius of the outer ring be R_w, the linear velocity of the contact point between the ball and the outer raceway be v_w, the angular velocity of the outer ring be ω_w, and the number of balls be N_b. From the kinematic relationships: (1) \[\left\{ \begin{array}{l} v_{n}=\omega_{n}\times R_{n} \\ v_{w}=\omega_{w}\times R_{w} \end{array} \right.\]

The cage (which can be equated to the center of the rolling element) has a linear velocity of: (2) \[v_{cage}=\frac{v_{n}+v_{w}}{2}\]

Since the rotational speed of the outer ring of the bearing is taken to be 0: (3) \[v_{cage}=\frac{v_{n}}{2}=\frac{\omega_{n}R_{n}}{2}\]

This gives the cage angular velocity: (4) \[\omega_{cage}=\frac{v_{cage}}{(R_{n}+R_{w})/2}=\frac{(\omega_{n}\times R_{n})/2}{(R_{n}+R_{w})/2}=\frac{\omega_{n}\times R_{n}}{R_{n}+R_{w}}\]

The inner ring rotates at the same speed as the spindle rotor, i.e. ω_n = ω_shaft, so: (5) \[\omega_{cage}=\frac{\omega_{shaft}\times R_{n}}{R_{n}+R_{w}}\]

For the varying compliance (VC) vibration generated by the changes in contact load and clearance that accompany the periodic motion of the bearing, the VC frequency can be expressed as: (6) \[f_{vc}=\omega_{cage}\times N_{b}\]

The angular position φ_j of the jth ball after time t is: (7) \[\varphi_{j}=\omega_{cage}\times t+\frac{2\pi}{N_{b}}(j-1),\quad j=1,2,\ldots,N_{b}\]
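The kinematic relations in Eqs. (5)-(7) can be evaluated directly. The following is a minimal sketch, not the authors' implementation; the function name `cage_kinematics` and the numerical values (radii, ball count) are hypothetical and chosen only for illustration.

```python
import math

def cage_kinematics(omega_shaft, r_inner, r_outer, n_balls, t):
    """Evaluate Eqs. (5)-(7) for a bearing whose outer ring is fixed."""
    # Eq. (5): cage angular velocity from the shaft (inner ring) speed
    omega_cage = omega_shaft * r_inner / (r_inner + r_outer)
    # Eq. (6): VC frequency, in the same angular units as omega_cage
    f_vc = omega_cage * n_balls
    # Eq. (7): angular position of each ball j = 1..N_b at time t
    phi = [omega_cage * t + 2.0 * math.pi * (j - 1) / n_balls
           for j in range(1, n_balls + 1)]
    return omega_cage, f_vc, phi

# Hypothetical example: 900 rpm shaft speed expressed in rad/s
omega_shaft = 2.0 * math.pi * 15.0
omega_cage, f_vc, phi = cage_kinematics(
    omega_shaft, r_inner=0.02, r_outer=0.03, n_balls=9, t=0.0)
```

At t = 0 the balls are spaced evenly by 2π/N_b, which reflects the first modeling assumption.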

3.2

Dynamic analysis of rolling bearings

Based on the above modeling assumptions and the characteristics of the drive spindle system, a five-degree-of-freedom model of the rolling bearing is developed. Four basic degrees of freedom describe the displacements of the inner and outer rings in the x and y directions; a fifth degree of freedom, a unit-mass resonator, is added to adjust the stiffness and damping coefficients. Figure 2 shows the rolling bearing dynamics model [21].

According to Newton’s second law [22], the equations of motion of the model can be expressed as: (8) \[\left\{ \begin{array}{l} m_{n}\ddot{x}_{n}+c_{n}\dot{x}_{n}+k_{n}x_{n}+f_{x}=0 \\ m_{n}\ddot{y}_{n}+c_{n}\dot{y}_{n}+k_{n}y_{n}+f_{y}=F_{r} \\ m_{w}\ddot{x}_{w}+c_{w}\dot{x}_{w}+k_{w}x_{w}-f_{x}=0 \\ m_{w}\ddot{y}_{w}+(c_{w}+c_{r})\dot{y}_{w}+(k_{w}+k_{r})y_{w}-k_{r}y_{b}-c_{r}\dot{y}_{b}-f_{y}=0 \\ m_{r}\ddot{y}_{b}+c_{r}(\dot{y}_{b}-\dot{y}_{w})+k_{r}(y_{b}-y_{w})=0 \end{array} \right.\]

The principal curvatures of the ball-raceway contact are defined as: (9) \[\rho_{11}=\frac{2}{D_{ball}},\quad \rho_{12}=\frac{2}{D_{ball}},\quad \rho_{21}=-\frac{1}{fD_{ball}},\quad \rho_{22}=\frac{2}{D_{ball}}\left(\frac{\lambda}{1\pm\lambda}\right)\]

where λ = D_ball cosθ/D_p; D_ball is the ball diameter; f is the raceway curvature radius coefficient; θ is the bearing contact angle; and D_p is the radius of the bearing pitch circle. For the outer ring, ρ₂₂ takes the + sign; for the inner ring, it takes the - sign.

From the contact and geometric relationships of the rolling bearing, the separation of the two surfaces along the Z-axis in the neighborhood of the contact point O, expanded to second order, is: (10) \[Z=\frac{1}{2}(\rho_{11}+\rho_{21})X^{2}+\frac{1}{2}(\rho_{12}+\rho_{22})Y^{2}\]

When elastic bodies obeying Hooke’s law come into contact and deform elastically, an elliptical contact area forms about the normal through the contact center O, and the contact load is likewise distributed elliptically over this area.

Let the total load on the contact area be F and the ratio of the elliptical semi-axes be c = a/b. Integrating Eq. (10) and solving the transcendental equation for c gives the dimensions of the elliptical region and the contact deformation: (11) \[a=\left(\frac{2\Sigma}{\pi c^{2}}\right)^{1/3}\cdot\left(\frac{3F}{\sum\rho}\left(\frac{1-v^{2}}{E}\right)\right)^{1/3}\] (12) \[b=\left(\frac{2k\Sigma}{\pi}\right)^{1/3}\cdot\left(\frac{3F}{\sum\rho}\left(\frac{1-v^{2}}{E}\right)\right)^{1/3}\] (13) \[\zeta=\left(\frac{\pi k^{2}}{2\Sigma}\right)^{1/3}\cdot\frac{2\Gamma}{\pi}\cdot\left(\frac{3F^{2}\sum\rho}{8}\left(\frac{1-v^{2}}{E}\right)^{2}\right)^{1/3}\]

In a rolling bearing, the normals of the inner- and outer-raceway contact areas both pass through the center of the ball, so the total contact deformation is the sum of the inner and outer raceway deformations: (14) \[\zeta=\zeta_{n}+\zeta_{w}\]

where $\zeta_{n}=(F/c_{n})^{2/3}$ and $\zeta_{w}=(F/c_{w})^{2/3}$.

Thus, the total contact deformation can be expressed as: (15) \[\zeta=(F/c_{sum})^{2/3}\]

where $c_{sum}=\frac{1}{\left(c_{n}^{-2/3}+c_{w}^{-2/3}\right)^{3/2}}$.
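That the combined stiffness c_sum reproduces the sum in Eq. (14) can be checked numerically. The sketch below is illustrative only; the stiffness and load values are hypothetical and are not taken from the paper.

```python
def combined_stiffness(c_n, c_w):
    """Combined contact stiffness c_sum defined after Eq. (15)."""
    return (c_n ** (-2.0 / 3.0) + c_w ** (-2.0 / 3.0)) ** (-1.5)

def deformation(force, stiffness):
    """Hertz point-contact deformation, zeta = (F / c)^(2/3)."""
    return (force / stiffness) ** (2.0 / 3.0)

# Hypothetical inner/outer contact stiffnesses and load
c_n, c_w, force = 8.0e9, 1.0e10, 500.0

c_sum = combined_stiffness(c_n, c_w)
zeta_total = deformation(force, c_sum)                           # Eq. (15)
zeta_parts = deformation(force, c_n) + deformation(force, c_w)   # Eq. (14)
```

The two deformations agree to rounding error, and c_sum is smaller than either individual stiffness, as expected for two contacts acting in series.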

From the above analysis, once the basic parameters of the rolling bearing and the load are determined, the Hertz point-contact solution can be obtained for use in the subsequent dynamic analysis.

3.3

Rolling Bearing Fault Modeling

In the actual operation process, the failure of rolling bearings is generally due to surface spalling, wear, pitting, and gluing.

In this paper, local defects of inner and outer rings are introduced into the above rolling bearing dynamics model to simulate the vibration response signals of rolling bearings under failure conditions.

Based on the kinematic and dynamic analysis of the bearing model in the previous sections, let the displacements of the raceway center in the transverse and longitudinal directions be x and y, the initial radial clearance be ζ₀, and the deformation released when a rolling element passes through the damage zone be ζ_e. The normal contact deformation between the jth rolling element and the raceway is then: (16) \[\zeta_{j}=x\cos\varphi_{j}+y\sin\varphi_{j}-\zeta_{0}-\tau_{j}\zeta_{e}\]

In the formula, τ_j is the judgment quantity, which is used to determine whether the rolling element enters the damage zone and generates damage deformation.

According to nonlinear Hertz contact theory, when the jth rolling element and the raceway undergo contact deformation, the contact force is: (17) \[F_{j}=c\zeta_{j}^{n}\sigma_{j}\]

where c is the contact stiffness; n is the load-deflection exponent of the rolling bearing contact, taken as 1.5 for the bearings studied in this paper; and σ_j is the contact switching quantity.

Since a force is generated only when the two elastic bodies are in contact, σ_j is used as the switching quantity: (18) \[\sigma_{j}=\left\{ \begin{array}{ll} 0, & \zeta_{j}\le 0 \\ 1, & \zeta_{j}>0 \end{array} \right.\]

Combining the above with the kinematic equations, the components of the rolling bearing contact force in the x and y directions are: (19) \[\left\{ \begin{array}{l} F_{x}=c\sum\sigma_{j}\zeta_{j}^{1.5}\cos\varphi_{j} \\ F_{y}=c\sum\sigma_{j}\zeta_{j}^{1.5}\sin\varphi_{j} \end{array} \right.\]
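The contact-force calculation in Eqs. (16)-(19) can be sketched as follows. This is an illustrative sketch rather than the authors' code: it reads the damage term of Eq. (16) as τ_j ζ_e, following the definitions of τ_j and ζ_e in the surrounding text (an assumption), and the function name `contact_forces` and all numerical values are hypothetical.

```python
import math

def contact_forces(x, y, phi, zeta_0, zeta_e, tau, c, n=1.5):
    """Sum the Hertz contact forces of all rolling elements, Eqs. (16)-(19)."""
    fx = fy = 0.0
    for phi_j, tau_j in zip(phi, tau):
        # Eq. (16): normal contact deformation of element j
        # (damage term read as tau_j * zeta_e, an assumption)
        zeta_j = (x * math.cos(phi_j) + y * math.sin(phi_j)
                  - zeta_0 - tau_j * zeta_e)
        # Eq. (18): force only when the element is actually in contact
        if zeta_j > 0:
            f_j = c * zeta_j ** n              # Eq. (17)
            fx += f_j * math.cos(phi_j)        # Eq. (19)
            fy += f_j * math.sin(phi_j)
    return fx, fy

# Hypothetical example: four elements, ring displaced along +y
phi = [0.0, math.pi / 2, math.pi, 1.5 * math.pi]
healthy = contact_forces(0.0, 2e-5, phi, 5e-6, 1e-5, [0, 0, 0, 0], 1e10)
damaged = contact_forces(0.0, 2e-5, phi, 5e-6, 1e-5, [0, 1, 0, 0], 1e10)
```

In this example only the element at π/2 carries load; when it sits over the defect (τ_j = 1) its deformation shrinks by ζ_e and the vertical force drops, which is the source of the periodic impact in the fault response.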

In the dynamic model of inner ring failure, the inner ring and shaft segment of a rolling bearing rotate with the shaft. At the same time, the rolling body is also rotating at high speed in synchronization with the constraint of the cage. The contact between the damaged area of the inner ring and the rolling element shows a periodic change. Let the width of the crater damage be L_n, the depth of the damage be H_n, and the radius of the rolling element be r_b.

When the damage condition $L_{n}\le 2\sqrt{2r_{b}H_{n}-H_{n}^{2}}$ is satisfied, the rolling element experiences a transient gap and suddenly gains a clearance increment of a certain depth. The damage simulated in this paper meets this condition, so the change in bearing clearance caused by the rolling element moving over the local defect is ζ_e = h_n, that is: (20) \[\zeta_{e}=r_{b}-\sqrt{r_{b}^{2}-\left(\frac{L_{n}}{2}\right)^{2}}\]

In the fault model, the relative position of the rolling element and the inner ring changes continuously. Let the damage zone span an arc β with initial angular position θ₀. The angular position θ_j of the jth rolling element at time t follows from the kinematic analysis, and the corresponding position of the inner-ring defect is θ_n = ω_n t + θ₀. When θ_n < θ_j < θ_n + β, the rolling element is inside the damage zone and the bearing clearance changes.

Thus, the switching quantity in Eq. (16) can be expressed as: (21) \[\tau_{j}=\left\{ \begin{array}{ll} 1, & \left|(\theta_{n}-\theta_{j})\bmod 2\pi\right|<\beta \\ 0, & \text{otherwise} \end{array} \right.\]

Where, β = 2 arcsin(L_n/D_n), D_n is the inner ring diameter.

Combining the above, the inner ring fault dynamics are analyzed by substituting the clearance change for the current fault size, together with the switching quantity, into the dynamic model, which yields the vibration response of the inner ring fault.
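The defect geometry and the damage-zone switch can be sketched as below. The code is a hedged illustration: the MOD in Eq. (21) is read here as wrapping the angular difference into [-π, π) before taking the absolute value, which is one possible interpretation and an assumption; the function names and dimensions are hypothetical.

```python
import math

def defect_drop(r_b, width):
    """Eq. (20): clearance change when a ball of radius r_b spans a pit."""
    return r_b - math.sqrt(r_b ** 2 - (width / 2.0) ** 2)

def satisfies_damage_condition(r_b, width, depth):
    """Check L <= 2 * sqrt(2 * r_b * H - H^2) stated before Eq. (20)."""
    return width <= 2.0 * math.sqrt(2.0 * r_b * depth - depth ** 2)

def in_damage_zone(theta_defect, theta_ball, beta):
    """Eq. (21): 1 if the ball is within the damage arc, else 0.

    The difference is wrapped into [-pi, pi) first (an assumption).
    """
    diff = (theta_defect - theta_ball + math.pi) % (2.0 * math.pi) - math.pi
    return 1 if abs(diff) < beta else 0

# Hypothetical dimensions in metres
r_b, width, depth, d_inner = 4.0e-3, 1.0e-3, 1.0e-4, 0.04
beta = 2.0 * math.asin(width / d_inner)   # arc of the damage zone
drop = defect_drop(r_b, width)
```

For these hypothetical dimensions the condition holds and the drop is smaller than the pit depth, so the ball bridges the pit without touching its bottom.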

Outer ring failure dynamics model. The outer ring of the bearing is fixed in the housing, the rotational speed is 0, and the rest of the cases are similar to the inner ring failure. When the rolling body enters the damage area, it will release a certain amount of deformation and produce periodic shock vibration. Let the width of the crater damage be L_w, the depth of the damage be H_w, and the radius of the rolling element be r_b.

Similar to the inner ring failure, when the rolling element enters the damage zone the bearing clearance changes by ζ_e = h_w, that is: (22) \[\zeta_{e}=r_{b}-\sqrt{r_{b}^{2}-\left(\frac{L_{w}}{2}\right)^{2}}\]

Let the damage zone span an arc β at angular position θ_w.

Thus, the switching quantity in Eq. (16) can be expressed as: (23) \[\tau_{j}=\left\{ \begin{array}{ll} 1, & \left|(\theta_{w}-\theta_{j})\bmod 2\pi\right|<\beta \\ 0, & \text{otherwise} \end{array} \right.\]

where β = 2 arcsin(L_w/D_w) and D_w is the outer ring diameter.

Having determined the variation of the clearance between the rolling element and the raceway, the dynamics of the failure model can be solved in conjunction with the dynamics model.

4

Fault diagnosis network modeling

4.1

Wavelet transform

The wavelet transform is well suited to signal processing [23]. It can handle non-stationary raw data and, by converting the signal into a time-frequency image, makes full use of the strengths of two-dimensional convolutional neural networks in image feature extraction. The wavelet transform W_φ(a,b) of a signal is generally written as: (24) \[W_{\varphi}(a,b)=\frac{1}{\sqrt{a}}\int x(t)\,\varphi^{*}\left(\frac{t-b}{a}\right)dt,\quad a>0\]

where x(t) is the signal being analyzed, φ is the mother wavelet, φ^* is its complex conjugate, a is the scale factor, and b is the time-shift factor. The key to the wavelet transform is the choice of wavelet basis function: signal components that resemble the basis are emphasized, while components with different characteristics are suppressed. The Morlet wavelet is symmetric and smooth, and it is chosen as the basis function of the wavelet transform in this study.
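A direct discretization of Eq. (24) shows how a one-dimensional signal becomes a two-dimensional scale-time image. The sketch below is a naive pure-Python illustration, not the authors' pipeline: the center frequency w0 = 6, the sampling rate, the test signal, and the function names are all assumptions, and a production implementation would normally use an FFT-based routine from a signal-processing library.

```python
import cmath
import math

def morlet(u, w0=6.0):
    """Complex Morlet mother wavelet (small correction term neglected)."""
    return math.pi ** -0.25 * cmath.exp(1j * w0 * u) * math.exp(-0.5 * u * u)

def cwt_magnitude(signal, scales, dt):
    """Riemann-sum version of Eq. (24); returns |W(a, b)| as rows of an image."""
    n = len(signal)
    image = []
    for a in scales:                      # one image row per scale a
        row = []
        for k in range(n):                # time shift b = k * dt
            acc = 0j
            for i in range(n):
                u = (i - k) * dt / a
                acc += signal[i] * morlet(u).conjugate()
            row.append(abs(acc) * dt / math.sqrt(a))
        image.append(row)
    return image

# Hypothetical test signal: 20 Hz sine sampled at 256 Hz
fs = 256.0
dt = 1.0 / fs
freqs = [5.0, 10.0, 20.0, 40.0]
scales = [6.0 / (2.0 * math.pi * f) for f in freqs]   # a = w0 / (2*pi*f)
signal = [math.sin(2.0 * math.pi * 20.0 * i * dt) for i in range(128)]
image = cwt_magnitude(signal, scales, dt)
```

Each row of `image` corresponds to one scale, so the row whose scale matches the 20 Hz component carries the largest magnitude; it is this kind of scale-time map that is passed to the two-dimensional CNN.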

4.2

Convolutional Neural Networks

Convolutional neural network (CNN), as a classical model for deep learning, is a deep feed-forward neural network characterized by weight sharing and local connectivity. It consists of several parts: input layer, convolutional layer, pooling layer, fully connected layer and output layer.

In the convolutional layer, each convolutional kernel, with its own weights, extracts one kind of feature. For a one-dimensional vibration signal of a motor, a given receptive field can be obtained by stacking small convolutional kernels, but this increases the number of training parameters. Deep convolutional structures borrowed from computer vision are therefore not well suited to motor bearing fault diagnosis, and a wide convolutional kernel is used instead: (25) \[y_{l+1,m}(n)=w_{l,m}*x_{l}(n)+b_{l,m}\]

where w_l,m is the weight matrix of the mth filter in layer l, b_l,m is the bias term, and x_l(n) is the nth input; the wide convolution is computed by the above equation.

The activation layer strengthens the nonlinearity of the convolution output and thereby increases the representational power of the model. The activation functions commonly used in fault diagnosis are the Sigmoid and ReLU functions; this network uses the latter. ReLU significantly reduces risks that arise during training, such as vanishing and exploding gradients; it also sets the output of some neurons to 0, which helps to avoid overfitting and accelerates the convergence and training of the CNN. Its expression is: (26) \[a_{l+1,m}(n)=\max\{0,y_{l+1,m}(n)\}\]

where a_l+1,m(n) is the output value after y_l+1,m(n) activation and max(·) denotes the ReLU activation function.

If the convolutional layer's features are used directly for classification, the amount of computation is large and overfitting is encouraged; this is usually mitigated by adding a pooling layer after the convolutional layer. Pooling reduces the dimensions of the feature maps while preserving feature invariance. Max pooling has two main advantages: it reduces the number of parameters, and it preserves as much as possible the fault features of periodic time-frequency signals. Its expression is: (27) \[p_{l+1}(n)=\max_{(n-1)H+1\le i\le nH}\{q_{l,m}(i)\}\]

where q_l,m(i) is the value of the ith neuron of the mth filter in layer l, with i ∈ [(n-1)H+1, nH]; H is the width of the pooling region; and p_l+1(n) is the output of the pooling operation.
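Eqs. (25)-(27) can be traced on a short sequence. The following is a minimal one-dimensional sketch with a single filter; the kernel, bias, and input values are hypothetical, and the convolution is implemented as the sliding dot product (cross-correlation) used by most deep learning frameworks, which is an assumption about the convention intended in Eq. (25).

```python
def conv1d(x, w, b):
    """Eq. (25): slide kernel w over x (no padding) and add bias b."""
    k = len(w)
    return [sum(w[i] * x[n + i] for i in range(k)) + b
            for n in range(len(x) - k + 1)]

def relu(y):
    """Eq. (26): element-wise max(0, y)."""
    return [max(0.0, v) for v in y]

def max_pool(q, width):
    """Eq. (27): maximum over non-overlapping windows of the given width H."""
    return [max(q[i:i + width]) for i in range(0, len(q) - width + 1, width)]

# Hypothetical input, kernel, and bias
x = [1.0, -2.0, 3.0, 0.0, -1.0, 4.0, 2.0, -3.0]
y = conv1d(x, w=[1.0, 0.0, -1.0], b=0.5)
a = relu(y)
p = max_pool(a, width=2)
```

Here the six convolution outputs are reduced to three pooled values, and the two positive peaks survive pooling, which illustrates how max pooling keeps impact-like features while shrinking the feature map.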

4.3

Bidirectional Long and Short-Term Memory Networks

The Long Short-Term Memory network (LSTM) is a variant of the Recurrent Neural Network (RNN) for processing sequential data. It can retain information over long periods and alleviates the exploding and vanishing gradient problems of recurrent neural networks.

The gate structure of the LSTM comprises forget gates, input gates, and output gates. To enhance the learning ability of the LSTM and improve the extraction of temporal features, this paper uses a bidirectional long short-term memory network (BiLSTM), which computes the effect of the current input on the forward hidden state (forward pass) and also accounts for the effect of subsequent information on the current state (backward pass). The computation is given in Eq. (28): (28) \[\left. \begin{array}{l} \overrightarrow{h}_{t}=f\left(w_{1}x_{t}+w_{2}\overrightarrow{h}_{t-1}\right) \\ \overleftarrow{h}_{t}=f\left(w_{3}x_{t}+w_{4}\overleftarrow{h}_{t+1}\right) \\ Y_{t}=g\left(w_{5}\overrightarrow{h}_{t}+w_{6}\overleftarrow{h}_{t}\right) \end{array} \right\}\]

where $\overrightarrow{h}_{t}$, $\overleftarrow{h}_{t}$ and Y_t are the forward hidden state, the backward hidden state, and the output of the hidden layer at time t, respectively; x_t is the input vector at time t; f is the LSTM unit function; and w_1 to w_6 are the weights.
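The structure of Eq. (28) can be illustrated with a deliberately simplified sketch. It uses scalar inputs and weights, replaces the gated LSTM unit function f with a plain tanh, and takes g as the identity; all of these are simplifying assumptions, so the code shows only the bidirectional data flow, not a full BiLSTM.

```python
import math

def bidirectional_pass(xs, w, f=math.tanh, g=lambda z: z):
    """Eq. (28) with scalar weights w = (w1, ..., w6) and zero initial states."""
    w1, w2, w3, w4, w5, w6 = w
    steps = len(xs)
    h_fwd = [0.0] * steps
    h_bwd = [0.0] * steps
    prev = 0.0
    for t in range(steps):               # forward pass uses h_{t-1}
        prev = f(w1 * xs[t] + w2 * prev)
        h_fwd[t] = prev
    nxt = 0.0
    for t in reversed(range(steps)):     # backward pass uses h_{t+1}
        nxt = f(w3 * xs[t] + w4 * nxt)
        h_bwd[t] = nxt
    # Output combines both directions at each time step
    return [g(w5 * hf + w6 * hb) for hf, hb in zip(h_fwd, h_bwd)]

# Hypothetical input sequence and weights
xs = [1.0, 0.0, 0.0]
weights = (1.0, 0.5, 1.0, 0.5, 1.0, 1.0)
ys = bidirectional_pass(xs, weights)
```

With identical weights in both directions, reversing the input reverses the output, which makes explicit that each Y_t depends on both the preceding and the following samples.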

4.4

Attention mechanisms

This paper uses the soft attention mechanism [24], a fully differentiable, deterministic mechanism, so that gradients propagate through the attention mechanism and the rest of the network. Its core idea is to assign appropriate weights to different feature vectors, highlighting key features and ignoring irrelevant information, so as to improve the prediction accuracy of the model. The procedure is as follows:

1) The fault features pass through the attention layer, where they are multiplied by the weights, added to the bias, and mapped by the hyperbolic tangent to the range [-1, 1]; the feature weight coefficients are calculated as in Eq. (29): (29) \[u_{i}=\tanh(wy_{i}+b)\]

where tanh(·) is the hyperbolic tangent function, w is the weight of the neuron, and b is the bias of the neuron.

2) The Softmax function normalizes the weight coefficients of the different fault features into a probability distribution that sums to 1, as shown in Eq. (30): (30) \[\left\{ \begin{array}{l} a_{i}=\frac{\exp(u_{i})}{\sum_{i}\exp(u_{i})} \\ \sum a_{i}=1 \end{array} \right.\]

3) The fault features are filtered and fused by weighting through the attention mechanism to obtain the optimized fault representation F_c: (31) \[F_{c}=\sum\left(a_{i}*y_{i}\right)\]

From the above analysis, it can be seen that in this mechanism, by adjusting the weights, the soft-attention mechanism can dynamically allocate the attention according to the different parts of the input data in order to better capture the key information in the data.
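The three steps in Eqs. (29)-(31) can be sketched as follows. The sketch assumes that each y_i is a feature vector, that w is a weight vector of the same length, and that b is a scalar, so each u_i is a scalar score; these shapes, the function name, and the example values are assumptions made for illustration.

```python
import math

def soft_attention(features, w, b):
    """Return the attention weights a_i and the fused feature F_c."""
    # Eq. (29): scalar score for every feature vector y_i
    scores = [math.tanh(sum(wk * yk for wk, yk in zip(w, y)) + b)
              for y in features]
    # Eq. (30): softmax (shifted by the maximum for numerical stability)
    top = max(scores)
    exps = [math.exp(u - top) for u in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    # Eq. (31): weighted sum of the feature vectors
    dim = len(features[0])
    fused = [sum(a * y[k] for a, y in zip(alphas, features))
             for k in range(dim)]
    return alphas, fused

# Hypothetical features and parameters
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
alphas, fused = soft_attention(features, w=[0.5, 1.0], b=0.0)
```

The feature with the largest score receives the largest weight, and because the weights sum to 1 the fused vector is a convex combination of the inputs.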

4.5

How Dropout Works

Dropout is a commonly used regularization technique for preventing overfitting in neural networks. A portion of the neuron outputs is randomly set to zero in each training step, which is equivalent to training several different sub-networks and improves the generalization ability of the model. The calculation formula is as follows:

\[ r = m \odot a\left( Wv \right) \tag{32} \]

where m is a random binary mask whose elements are kept (not set to 0) with probability 1-P, P is the dropout probability, a is the activation function, W is the weight matrix, v is the input of the layer, r is the output of the layer, and $\odot$ denotes element-wise multiplication of m and a(Wv).
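A minimal sketch of Eq. (32) in plain Python is given below. It assumes that the mask is drawn independently for each output with keep probability 1-P and that the activation is ReLU; the function names, the weight matrix, and the input vector are hypothetical and serve only as an illustration.

```python
import random

def relu(x):
    return max(0.0, x)

def dropout_layer(W, v, p, rng):
    """Compute r = m * a(Wv) with a random binary keep-mask m (Eq. 32).

    W: weight matrix as a list of rows, v: input vector,
    p: probability of setting an output to zero,
    rng: a random.Random instance (passed in so runs are reproducible).
    """
    # a(Wv): matrix-vector product followed by the activation function
    activated = [relu(sum(wij * vj for wij, vj in zip(row, v))) for row in W]
    # m_k = 1 with probability 1 - p, otherwise 0
    mask = [1.0 if rng.random() >= p else 0.0 for _ in activated]
    # element-wise product of the mask and the activations
    return [m * a for m, a in zip(mask, activated)], mask

# Illustrative values only.
W = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]]
v = [1.0, 2.0]
r, m = dropout_layer(W, v, p=0.5, rng=random.Random(0))
```

Many deep learning frameworks additionally rescale the kept outputs by 1/(1-P) during training so that the expected output is unchanged; Eq. (32) does not show such a factor, so the sketch omits it.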

4.6

Model Network Structure and Diagnostic Process

In this paper, we propose a CNN-Attention-BiLSTM network model. Its main components are four convolutional and pooling layers, one fully connected layer, a hidden layer, and a Softmax layer, where the hidden layer is a BiLSTM layer of size 128 combined with the attention mechanism. The activation function is ReLU and the pooling function is max pooling.

In bearing fault diagnosis, the bearing signal data are collected by sensors and divided into training, validation, and test samples. During fault diagnosis, batch normalization is introduced to stabilize the feature distribution, BiLSTM and the attention mechanism are used to extract key features, and Dropout enhances the generalization ability of the model. Finally, the training set is loaded for feature fusion and Softmax classification to determine the fault type, and the model performance is optimized continuously.

Because bearings work in different operating environments, their operation is unusually complex and their fault conditions are unpredictable, which makes the signals more difficult to interpret and analyze. Traditional deep learning models are therefore prone to overfitting and underfitting. The proposed model has high recognition accuracy under variable operating conditions and in noisy environments, which effectively improves the accuracy and stability of bearing fault diagnosis. The fault diagnosis process is shown in Figure 3.

5

Experimental analysis of fault diagnosis models

5.1

Dynamics analysis modeling validation

In this paper, the railroad track failure dynamics model is used to simulate the vibration response of a rolling bearing at speeds of 300-1800 rpm in four states: no fault, outer ring fault, inner ring fault, and rolling element fault. The calculation results are accurate; at 900 rpm, for example, the simulated fault characteristic frequencies of the faulty bearings agree closely with the theoretical fault characteristic frequencies of the rolling bearing. The calculation results are shown in Table 1. Combined time-domain and frequency-domain analysis shows that, at every speed, the vibration acceleration signals of bearings with outer ring, inner ring, and rolling element faults differ clearly from those of the fault-free bearing. The signals of the faulty bearings contain pronounced impacts, which are caused by the structural fault in the rolling bearing, and the impact frequency is the fault characteristic frequency. The simulated fault characteristic frequencies are very close to the theoretical values: the maximum error is 0.363% for outer ring faults, 0.813% for inner ring faults, and 0.968% for rolling element faults, all within a reasonable range. These results indicate that the simulation of the rolling bearing vibration response based on the railroad track failure dynamics model is accurate.

Table 1.

Error analysis of the fault characteristic frequency simulation

Fault element	Rotational speed (rpm)	Simulated frequency (Hz)	Theoretical frequency (Hz)	Error (%)
Outer ring	300	25.3	25.36	0.237
	600	50.58	50.71	0.257
	900	75.84	76.09	0.330
	1200	101.07	101.41	0.336
	1500	126.42	126.75	0.261
	1800	151.61	152.16	0.363
Inner ring	300	39.35	39.67	0.813
	600	78.71	79.22	0.648
	900	118.12	118.88	0.643
	1200	157.51	158.56	0.667
	1500	196.86	198.19	0.676
	1800	236.23	237.79	0.660
Rolling element	300	21.39	21.58	0.888
	600	42.88	43.28	0.933
	900	64.3	64.91	0.949
	1200	85.75	86.58	0.968
	1500	107.17	108.19	0.952
	1800	128.7	129.87	0.909
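The error column of Table 1 can be checked with a short calculation. The sketch below assumes that the error is the relative deviation |simulated - theoretical| / theoretical, expressed in percent; the paper does not state the formula explicitly, and the names `TABLE_1` and `relative_error_percent` are introduced here only for illustration.

```python
# (simulated Hz, theoretical Hz) pairs copied from Table 1,
# ordered by speed: 300, 600, 900, 1200, 1500, 1800 rpm
TABLE_1 = {
    "outer ring": [(25.3, 25.36), (50.58, 50.71), (75.84, 76.09),
                   (101.07, 101.41), (126.42, 126.75), (151.61, 152.16)],
    "inner ring": [(39.35, 39.67), (78.71, 79.22), (118.12, 118.88),
                   (157.51, 158.56), (196.86, 198.19), (236.23, 237.79)],
    "rolling element": [(21.39, 21.58), (42.88, 43.28), (64.3, 64.91),
                        (85.75, 86.58), (107.17, 108.19), (128.7, 129.87)],
}

def relative_error_percent(simulated, theoretical):
    """Relative deviation of the simulated from the theoretical value, in %."""
    return abs(simulated - theoretical) / theoretical * 100.0

# Largest relative error per fault element over all speeds
max_errors = {
    element: max(relative_error_percent(s, t) for s, t in pairs)
    for element, pairs in TABLE_1.items()
}
```

Because the frequencies in the table are rounded to two decimals, the recomputed maxima can differ slightly from the tabulated 0.363%, 0.813%, and 0.968%, but they remain below 1% for all three fault elements.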

5.2

Parameter design

The parameters of the railroad track fault diagnosis network model mainly comprise the CNN parameters: the convolution kernel size k and the number of convolution kernels N. Both have an important impact on model training. To find the optimal kernel size and number, the bearing data under a 735 W load are taken as an example and a grid search is performed. The convolution stride, pooling stride, and window size are set to 4, the number of augmentation nodes to 130, the nonlinear activation function to tanh, and the regularization coefficient to $\lambda = 2 \times 10^{-30}$, and the impact of the kernel size and number on fault classification accuracy is analyzed. The diagnostic results for different kernel sizes and numbers are shown in Fig. 4. Large convolution kernels give higher diagnostic accuracy than small ones, and the accuracy increases gradually with the number of kernels N. When N exceeds 18, however, the accuracy changes little and overfitting becomes likely. Therefore, the parameters in this study are set to k=27 and N=18.
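The grid search over the kernel size k and the kernel count N can be outlined as follows. This is only a schematic sketch: the candidate grids and the scoring function `toy_score` are hypothetical placeholders and do not reproduce the paper's search ranges or results. In practice the `evaluate` callable would train the network with the given (k, N) and return its classification accuracy.

```python
from itertools import product

def grid_search(kernel_sizes, kernel_counts, evaluate):
    """Return (best_k, best_n, best_score) over all (k, N) combinations."""
    best = None
    for k, n in product(kernel_sizes, kernel_counts):
        score = evaluate(k, n)  # e.g. validation accuracy of a trained model
        if best is None or score > best[2]:
            best = (k, n, score)
    return best

def toy_score(k, n):
    # Hypothetical stand-in for training and evaluating a model;
    # it is NOT derived from the paper's data and peaks at an arbitrary point.
    return -(abs(k - 5) + abs(n - 4))

# Hypothetical candidate grids, chosen only to make the example runnable.
best_k, best_n, best_score = grid_search([3, 5, 7], [2, 4, 6], toy_score)
```

An exhaustive search of this kind evaluates every combination once, so its cost grows with the product of the two grid sizes.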

5.3

Comparison of classification methods under different additive modules

To verify the classification performance of the railroad track fault diagnosis network model on different faults and the reliability of the results, bearing data under three loads (735, 1470, and 2205 W) are selected for the experiments. F1 is the model proposed in this paper, F2 is the model without the attention mechanism, F3 is the model with Dropout in the CNN, and F4 is the traditional CNN model. The average value is taken as the final experimental result. The average accuracy of the four models under the different loads is shown in Table 2. Comparing F1 and F2, the average accuracy of F1 is 99.50%, which is 3.83 percentage points higher than that of F2, showing that adding the attention mechanism improves the classification ability of the model. The average accuracy of F3 is 2.64 percentage points lower than that of F1, and the accuracy of F4 is clearly lower than that of the other three models. The railroad track fault diagnosis network model with the attention mechanism and with Dropout in the BiLSTM therefore performs better in bearing fault diagnosis.

Table 2.

Average accuracy of the models with different added modules

Model	735 W (%)	1470 W (%)	2205 W (%)	Mean (%)
F1	99.12	99.56	99.83	99.50
F2	95.45	95.68	95.89	95.67
F3	96.58	96.89	97.12	96.86
F4	91.53	92.13	92.45	92.04
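The mean values in Table 2 and the accuracy gaps quoted in the text can be reproduced directly from the per-load accuracies. The sketch below copies the numbers from the table; the variable names are introduced here for illustration only.

```python
# Accuracy (%) at 735 W, 1470 W, and 2205 W, copied from Table 2
ACCURACY = {
    "F1": [99.12, 99.56, 99.83],
    "F2": [95.45, 95.68, 95.89],
    "F3": [96.58, 96.89, 97.12],
    "F4": [91.53, 92.13, 92.45],
}

# Mean accuracy per model, rounded to two decimals as in the table
mean_accuracy = {
    model: round(sum(values) / len(values), 2)
    for model, values in ACCURACY.items()
}

# Gap between the proposed model F1 and each baseline, in percentage points
gap_to_f1 = {
    model: round(mean_accuracy["F1"] - mean_accuracy[model], 2)
    for model in ("F2", "F3", "F4")
}
```

The gaps are differences between two percentages, so they are percentage points rather than relative improvements.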

To study the training and testing efficiency of these methods, the training and testing times of the four models are analyzed, and the results are shown in Fig. 5. Although F2 trains faster than F1, the difference in training time is only 0.02 s and F1 is more accurate. The training times of F3 and F4 are 0.15 s and 0.25 s longer than that of F1, respectively, and F1 is also more accurate. This indicates that the railroad track fault diagnosis network model in this paper achieves higher accuracy at a small cost in training time.

To further visualize the classification ability of the railroad track fault diagnosis network model for different railroad track faults, the features extracted by the model are reduced to a two-dimensional plane with t-distributed stochastic neighbor embedding (t-SNE). The raw data features are visualized in Fig. 6, where different colors represent different bearing state categories. The distribution of the raw data is chaotic, and the fault categories are difficult to distinguish. Fig. 7 and Fig. 8 show the fault feature visualizations after adding the BiLSTM network and after adding both the BiLSTM network and the attention mechanism, respectively. Fig. 7 shows that adding the BiLSTM network clearly separates the 10 kinds of fault features, and Fig. 8 shows that adding the BiLSTM network and the attention mechanism together makes the distribution of the 10 kinds of railroad track fault features more tightly clustered. The results show that the proposed model, with the BiLSTM network and attention mechanism, has strong feature extraction capability and classification performance and can distinguish the different kinds of railroad track faults.

5.4

Comparison of different fault diagnosis methods

To verify the effectiveness of the model, it is compared experimentally with a fault diagnosis model based on a BP network, and the comparison results are shown in Fig. 9. During parameter optimization, both the initial fitness value and the converged value of the proposed model are smaller than those of the BP network-based model. Moreover, the proposed model converges after about 11 iterations, whereas the BP network-based model converges after about 23 iterations, indicating that the proposed model converges faster.

6

Conclusion

To improve the accuracy and reliability of railroad track fault diagnosis, this paper builds a dynamics model of railroad track faults, analyzes the causes of these faults, and then proposes a railroad track fault diagnosis model based on a neural network.

According to the dynamics model of railroad track faults, the fault characteristic frequencies obtained are very close to the theoretical results: the maximum errors for bearings with outer ring, inner ring, and rolling element faults are 0.363%, 0.813%, and 0.968%, respectively, all within a reasonable range. The results show that the proposed dynamics modeling method for railroad track failure simulates the vibration response of rolling bearings accurately, and that the fault causes analyzed on this basis are reliable.

The raw data are ambiguous, and the characteristics of railroad track faults are not clearly visible in them. After the BiLSTM network is added, the 10 kinds of railroad track fault features are clearly separated, and after the BiLSTM network and the attention mechanism are added together, the distribution of the 10 kinds of fault features becomes more tightly clustered. This shows that the proposed model with a BiLSTM network and attention mechanism has better feature extraction and classification performance and a stronger fault diagnosis ability for railroad tracks.

Language: English

Publication frequency: 1 time per year
Journal subjects: Biological sciences, Life sciences, other, Mathematics, Applied mathematics, General mathematics, Physics, Physics, other


Dynamic analysis of railroad track faults based on neural network algorithms

Yue Lyu

Jian Sun

Yifei Cao

Jie Sun

Meng Tian

Hongchang Wang

Feilong Wu

Published online: 17 Mar 2025

Received: 17 Oct 2024

Accepted: 07 Feb 2025

DOI: https://doi.org/10.2478/amns-2025-0300

Keywords: Convolutional neural network, BiLSTM, Soft attention mechanism, Wavelet transform, Dynamics analysis

© 2025 Yue Lyu et al., published by Sciendo

This work is licensed under the Creative Commons Attribution 4.0 International License.
