A study of paraphrase meta-language in linguistic domains in the age of artificial intelligence
Published Online: Feb 26, 2024
Received: Jan 15, 2024
Accepted: Jan 22, 2024
DOI: https://doi.org/10.2478/amns-2024-0610
© 2024 Tongtong Peng, published by Sciendo
This work is licensed under the Creative Commons Attribution 4.0 International License.
This study examines paraphrase meta-language in linguistic domains in the age of artificial intelligence. The method combines text preprocessing, text representation based on the vector space model, statistical disambiguation, feature selection, and LDA topic modeling. The results indicate that these methods can effectively extract and interpret paraphrase meta-language. LDA analysis of 4,623,552 Twitter data items and 532,565 linguistic documents reveals the thematic distribution and dynamic changes of paraphrase meta-language. In addition, an empirical analysis of paraphrase meta-language based on lexical understanding finds that average annotator accuracy falls within the expected range for all types of polysemous words. In the era of artificial intelligence, the study of paraphrase meta-language can offer new insights for linguistics, particularly for understanding and processing large-scale linguistic data.
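
As a rough illustration of the first stages named in the abstract (preprocessing, feature selection, and vector space representation), the following minimal Python sketch may help. It is not the study's implementation: the stop-word list, the document-frequency threshold `min_df`, the smoothed TF-IDF weighting, and the three example sentences are all assumptions chosen for illustration. The LDA step is omitted here; in practice it would typically be delegated to a topic-modeling library.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list, for illustration only (not from the study).
STOP_WORDS = {"the", "a", "an", "of", "is", "in", "and", "to"}


def preprocess(text):
    """Lowercase, tokenize on runs of letters, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def select_features(tokenized_docs, min_df=2):
    """Keep terms that occur in at least min_df documents.

    Document-frequency filtering is one simple feature-selection rule;
    the threshold is an assumed value.
    """
    df = Counter(term for doc in tokenized_docs for term in set(doc))
    return sorted(term for term, n in df.items() if n >= min_df)


def tfidf_vectors(tokenized_docs, vocab):
    """Represent each document as a TF-IDF vector over the selected vocabulary."""
    n_docs = len(tokenized_docs)
    df = Counter(term for doc in tokenized_docs for term in set(doc))
    # Assumed weighting: idf = ln(N / df) + 1, so terms in every document keep weight 1.
    idf = {t: math.log(n_docs / df[t]) + 1.0 for t in vocab}
    vectors = []
    for doc in tokenized_docs:
        counts = Counter(doc)
        total = len(doc) or 1  # guard against empty documents
        vectors.append([counts[t] / total * idf[t] for t in vocab])
    return vectors


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Made-up example sentences built around the polysemous word "bank".
docs = [
    "The bank of the river is muddy.",
    "The bank approved the loan.",
    "A loan is money lent by a bank.",
]
tokens = [preprocess(d) for d in docs]
vocab = select_features(tokens, min_df=2)
vectors = tfidf_vectors(tokens, vocab)
```

In this toy corpus only "bank" and "loan" survive feature selection, and the two financial sentences end up closer to each other in the vector space than either is to the river sentence, which is the kind of signal a downstream disambiguation or topic-modeling step could build on.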
