Combined Application of Speech Recognition and Natural Language Processing Technologies in the Electric Power Industry
Published Online: Mar 19, 2025
Received: Nov 02, 2024
Accepted: Feb 02, 2025
DOI: https://doi.org/10.2478/amns-2025-0530
© 2025 Tao Xu et al., published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Applying speech recognition technology in the power industry can improve the collaborative efficiency of power grids at all levels and reduce the workload of dispatchers, making it one of the key technologies in the intelligent development of power grids. In this study, a power speech recognition model is designed by combining a Transformer-based out-of-set word model with an n-gram language-model-based error-checking model. A training set is first used to train the model and to compare candidate input features. A power speech dataset is then created and used for model comparison to validate the effectiveness of the proposed algorithms. A system built on the proposed algorithms processes real-time speech, speech files, and speech from telephone terminals. The results show that the Spectrogram feature of the speech signal is the more suitable input feature for the proposed model, as it reduces the word error rate of the speech recognition model. The proposed model performs best on all four metrics: Accuracy, Precision, Recall, and F1. The proposed method has a parameter count of 25, a word error rate (WER) of 8.21%, and a real-time factor (RTF) of 0.017, which indicates that the algorithm generalizes well on the power speech dataset.
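For readers unfamiliar with the two headline metrics, the following is a minimal sketch of how WER and RTF are conventionally computed. It assumes the standard definitions (WER as word-level edit distance divided by the number of reference words, RTF as processing time divided by audio duration); the abstract does not state the exact evaluation procedure used in the paper, and the function names and example sentences below are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumes whitespace tokenization; tokenization for other languages
    (e.g. character-level scoring for Chinese) would differ.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Decoding time divided by audio duration; below 1 is faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


if __name__ == "__main__":
    # Hypothetical dispatch command with one misrecognized word out of four.
    print(word_error_rate("close breaker number two", "close breaker number too"))
    # Hypothetical timing: 0.17 s of decoding for 10 s of audio.
    print(round(real_time_factor(0.17, 10.0), 3))
```

Under these definitions, an RTF of 0.017 would correspond to roughly 17 ms of processing per second of audio, and a WER of 8.21% to about 8 word errors per 100 reference words.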