Open access

Development Dilemma and Innovation Path of Chinese Language and Literature in the Information Age

  
17 Mar 2025

Introduction

As an important part of traditional Chinese culture, Chinese language and literature still plays an irreplaceable role in the network era [1-2]. We should correctly recognize the influence of network language on Chinese language and literature, take active measures to promote its innovation and development [3], and strengthen its inheritance and protection, so that more people can understand and learn Chinese language and literature and contribute to the inheritance and development of Chinese culture [4].

With the rapid development of information technology, society has entered the information age, and the network plays a pivotal role in people’s daily work and life [5-6]. Internet language depends on the network and has become popular as the network has developed rapidly, which brings both positive and negative impacts on the development of Chinese language and culture [7-8].

Whether the continuing development of network language plays a positive or a negative role, Chinese language and literature should, on the basis of a full understanding of it, seize the opportunity to develop and improve itself so that it gains new vitality in the new era [9-10]. It is worth noting, however, that some network expressions are short-lived and fade once their popularity has passed; such expressions are regarded as detached from the network context. If Chinese language and literature fails to identify network expressions and incorporate them rationally into the traditional language system, an embarrassing situation will arise [11]. For this reason, when exploring the joint development of network language and Chinese language and literature in the new media environment, both the overall development goals and the overall direction of development need to be made clear [12]. It should also be recognized that network language plays an important role in the development and innovation of Chinese language and literature, and that absorbing the best of network language can enrich Chinese language and literature [13]. Where conditions permit, network language should be regulated. While network language enriches the expression of Chinese language and literature and promotes the industrialization of language, it also has a negative impact on Chinese language and literature [14-15]. Effective measures are needed to cope with this negative impact, so as to maximize the role of network language in Chinese language and literature and promote the development of Chinese language and literature in the information age [16].

In this study, data mining technology is used to analyze performance data related to the development level of Chinese language and literature. First, a mining model of the Chinese language development level is constructed using K-modes clustering and anomalous data detection algorithms. Then, taking students majoring in Chinese language and literature as samples, the model is used for clustering, and the Chinese language and literature levels of the students are divided into five categories. On this basis, the dilemma of Chinese language and literature development is discussed, and a dual-line teaching model for the innovative cultivation of Chinese language and literature is designed, consisting of three major links: teaching front-end analysis, teaching resource design and teaching process design. The proposed teaching mode and the traditional teaching mode are then applied in practice to compare their effects. The feasibility of the teaching model is examined through teaching effect and student satisfaction, and an innovative path for Chinese language and literature informatization is proposed from the aspects of traditional culture integration, protection of minority languages and cultures, and international promotion.

Data Mining of Chinese Language and Literature Informatization Development Levels
Mining process and categorization

Data mining for Chinese language and literature development is the process of extracting knowledge from a large amount of data about Chinese language and literature. As research has deepened, data mining has developed a mature process, shown in Figure 1. Its main steps are data acquisition, data preprocessing, feature extraction, feature selection, data mining, and model evaluation.

Figure 1.

Data mining process

Data mining plays an important role in decision-making problems and has been applied successfully in several fields. Its purpose is to build models from preprocessed data, mainly predictive and descriptive models. Predictive models predict unknown data from known data, while descriptive models discover new patterns or structures by analyzing the data. Classification, clustering and association rule algorithms are the three main categories of data mining algorithms. According to the attributes of the data and the goal of the task, this paper focuses on classification algorithms and, specifically, on the support vector machine algorithm for analyzing and predicting the development of Chinese language and literature.

Support vector machines (SVMs) have been widely studied as data mining classifiers. Consider first the linearly separable case. Given a set of training samples $Z = \{(x_1,y_1),(x_2,y_2),\dots,(x_n,y_n)\}$, $y_i \in \{+1,-1\}$, the decision boundary can be described by a linear discriminant equation $w^T x + b = 0$ in an $n$-dimensional space. Assume that there are three candidate hyperplanes A, B and C, and that crosses and circles represent the positive and negative samples, respectively. The SVM tries to find an optimal hyperplane that separates the positive and negative classes while minimizing the classification error and maximizing the margin. Here hyperplane B should be chosen, because it satisfies the conditions of maximum classification margin and minimum empirical risk. The optimal classification hyperplane is $f(x)=\sum_{i=1}^{n}\alpha_i^{*}y_i x_i^{T}x+b^{*}$, where the samples $x_i$ with $\alpha_i^{*}>0$ are the support vectors.
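For reference, the maximum-margin criterion described above is usually written as the following optimization problem. This is the standard textbook hard-margin formulation, given here as background rather than taken from this paper; it uses the same symbols $w$, $b$, $x_i$ and $y_i$ as the discriminant equation above.

```latex
\min_{w,\,b}\ \frac{1}{2}\lVert w\rVert^{2}
\quad\text{subject to}\quad
y_i\,(w^{T}x_i+b)\ \ge\ 1,\qquad i=1,\dots,n .
```

The distance between the two class boundaries $w^{T}x+b=\pm 1$ is $2/\lVert w\rVert$, so minimizing $\lVert w\rVert^{2}$ is equivalent to maximizing the margin.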

In most cases, the training data are not linearly separable, e.g. face images or text documents. Slack variables and a penalty factor can be added to the constraints, and a mapping function $\Phi:\mathbb{R}^{d}\to F$ from the low-dimensional feature space to a high-dimensional feature space can be defined. By introducing the kernel function $K(x_i,x_j)$, the SVM evaluates the inner product in the new feature space, converting a linearly inseparable problem into a linearly separable one without carrying out the mapping into the higher-dimensional space explicitly. The optimal classification hyperplane then becomes $f(x)=\sum_{i=1}^{n}\alpha_i^{*}y_i K(x_i,x)+b^{*}$. In addition, the choice of kernel function and its parameters has a considerable impact on the classification result. Commonly used kernel functions include the linear kernel, the polynomial kernel, the Gaussian (radial basis function) kernel and the sigmoid kernel.
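The four kernel functions named above, and the kernelized decision value, can be sketched in a few lines of plain Python. This is a minimal illustration, not the paper's implementation: the function names and the default parameters (`degree`, `gamma`, `coef0`) are assumptions chosen for the example, and the multipliers `alphas` and the `bias` are assumed to come from an SVM that has already been trained.

```python
import math


def linear_kernel(x, z):
    """K(x, z) = x . z"""
    return sum(a * b for a, b in zip(x, z))


def polynomial_kernel(x, z, degree=2, coef0=1.0):
    """K(x, z) = (x . z + coef0) ** degree (illustrative defaults)."""
    return (linear_kernel(x, z) + coef0) ** degree


def gaussian_kernel(x, z, gamma=0.5):
    """K(x, z) = exp(-gamma * ||x - z||^2) (illustrative default gamma)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def sigmoid_kernel(x, z, gamma=0.1, coef0=0.0):
    """K(x, z) = tanh(gamma * x . z + coef0) (illustrative defaults)."""
    return math.tanh(gamma * linear_kernel(x, z) + coef0)


def decision_value(x, support_vectors, labels, alphas, bias, kernel):
    """f(x) = sum_i alpha_i * y_i * K(x_i, x) + b for an already trained SVM."""
    total = sum(
        alpha * y * kernel(sv, x)
        for sv, y, alpha in zip(support_vectors, labels, alphas)
    )
    return total + bias
```

A sample is assigned to the positive class when `decision_value(...)` is positive and to the negative class otherwise; swapping the `kernel` argument changes the feature space without changing the rest of the code.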

K-modes clustering algorithm

In traditional taxonomy, classification relied mainly on people’s perception of things. With the continuous progress of human society and the rapid development of science and technology, many fields have placed higher demands on classification, so mathematics was introduced into taxonomy as a tool, forming numerical taxonomy with quantitative classification. This gradually developed into the cluster analysis techniques widely used today.

The k-means clustering algorithm is a partitioning method that originated in signal processing and is now widely used in data mining. Its core idea is to group n objects into k clusters according to the nearest-distance principle, so that each object is assigned to the cluster with the closest mean. The clustering must therefore satisfy two basic conditions: every cluster contains at least one sample, and every sample belongs to one and only one cluster.

Specifically, given a set of objects $(x_1,x_2,\cdots,x_n)$, each of which is an $m$-dimensional vector, the algorithm divides them into $k$ sets $S=\{C_1,C_2,\cdots,C_k\}$ $(k\le n)$ such that the sum of the squared differences between each object in a set and the center of that set is minimized, i.e., it finds the clusters $C_i$ that satisfy

$$\arg\min_{S}\sum_{i=1}^{k}\sum_{x\in C_i}\lVert x-\mu_i\rVert^{2}$$

where $\mu_i$ is the mean of the objects in $C_i$.
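To make the objective concrete, the sketch below evaluates the within-cluster sum of squares for a given partition. It only scores a partition; it does not search for the best one. The function names and the toy data are illustrative assumptions.

```python
def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(points[0])
    return [sum(p[j] for p in points) / len(points) for j in range(dim)]


def within_cluster_ss(clusters):
    """Sum over clusters of squared Euclidean distances to the cluster mean."""
    total = 0.0
    for points in clusters:
        mu = centroid(points)
        for p in points:
            total += sum((p[j] - mu[j]) ** 2 for j in range(len(mu)))
    return total


# Toy partition: two points around (2, 1) and one isolated point.
example = [[[1.0, 1.0], [3.0, 1.0]], [[10.0, 10.0]]]
print(within_cluster_ss(example))  # prints 2.0
```

K-means alternates between assigning points to the nearest mean and recomputing the means, and each of those two steps can only lower or keep this value.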

Overview of the K-modes clustering algorithm

The K-modes clustering algorithm is also a classical partitioning clustering algorithm, and its basic idea and procedure are essentially the same as those of k-means. K-means is simple and practical, but it cannot handle datasets that contain categorical variables. K-modes was proposed as an improvement of k-means to handle categorical data. It uses the simple matching dissimilarity (SMD) measure for categorical variables, replaces the mean with the mode, and uses the Hamming distance to compute the distance between two sample points. Its objective function is defined as the sum of the distances between the sample points and their corresponding cluster centers.

The most essential difference between K-modes and k-means is that K-modes weakens the similarity of the data within a class and ignores the intrinsic connections between the attributes of the same data. This makes its similarity distance metric simple and easy to use, and it is therefore widely applied in many fields. However, this similarity measure often leads to a loss of information, making it difficult for the clustering results to achieve the desired effect. For this reason, it may need to be modified to some extent in many applications.

Distance calculation formula of the K-modes clustering algorithm

Let $X=\{x_1,x_2,\cdots,x_n\}$ be the set of sample points, where each sample $x_i=\{x_{i1},x_{i2},\cdots,x_{im}\}$ is described by the attributes $\{A_1,A_2,\cdots,A_m\}$, and attribute $A_i$ takes values in $Dom(A_i)=\{a_1^{i},a_2^{i},\cdots,a_l^{i}\}$ $(l\ge 2)$. The K-modes distance between any two sample points $x_i$ and $x_j$ is:

$$D(x_i,x_j)=\sum_{l=1}^{m}d(x_{il},x_{jl})$$

where

$$d(x_{il},x_{jl})=\begin{cases}0, & x_{il}=x_{jl}\\ 1, & x_{il}\neq x_{jl}\end{cases}$$
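The distance $D(x_i,x_j)$ defined above simply counts the attributes on which two samples disagree. A minimal sketch follows; the function name and the example records are hypothetical.

```python
def kmodes_distance(x_i, x_j):
    """Number of attributes on which two categorical samples differ."""
    if len(x_i) != len(x_j):
        raise ValueError("samples must have the same number of attributes")
    return sum(1 for a, b in zip(x_i, x_j) if a != b)


# Two hypothetical student records with three categorical attributes each.
print(kmodes_distance(("A", "pass", "high"), ("A", "fail", "high")))  # prints 1
```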

Workflow of the K-modes clustering algorithm

1) Determine the number of categories K and randomly generate K cluster centers.

2) Calculate the distance from each sample point to each cluster center and, following the minimum-distance principle, assign the sample point to the closest class. If several centers are equally close, assign it to one of them at random.

3) Calculate the average distance between the sample points and their corresponding center in each cluster, and update the corresponding cluster center based on the result.

4) Check whether the cluster centers and sample assignments are consistent with the previous clustering result. If they are, the clustering is complete and the algorithm ends; otherwise, repeat steps 2) and 3) and continue iterating.
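The workflow above can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's code: initial centers are drawn at random from the samples, the center update uses the per-attribute mode of each cluster (the usual K-modes update, in line with replacing the mean by the mode), and `max_iter` and `seed` are hypothetical parameters added so the loop always terminates and runs are repeatable.

```python
import random
from collections import Counter


def hamming(a, b):
    """Number of attributes on which two categorical samples differ."""
    return sum(1 for u, v in zip(a, b) if u != v)


def column_modes(rows):
    """Most frequent value of each attribute across the given rows."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))


def k_modes(samples, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(samples, k)          # pick K initial centers
    labels = None
    for _ in range(max_iter):
        new_labels = []
        for s in samples:                     # assign to the nearest center
            dists = [hamming(s, c) for c in centers]
            nearest = min(dists)
            ties = [i for i, d in enumerate(dists) if d == nearest]
            new_labels.append(rng.choice(ties))  # random choice among ties
        if new_labels == labels:              # assignments unchanged: stop
            break
        labels = new_labels
        for c in range(k):                    # update centers (assumed: modes)
            members = [s for s, lab in zip(samples, labels) if lab == c]
            if members:
                centers[c] = column_modes(members)
    return labels, centers
```

Because ties are broken at random, assignments can keep changing between iterations on some data, which is why the sketch caps the number of iterations instead of relying only on the consistency check.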

Anomalous data detection algorithm based on information entropy

The concept of information entropy originated as a measure of the amount of information, so information entropy is often used to quantify the information content of a system in order to optimize or characterize the system. It is defined as follows:

$$H(X)=-\sum_{x\in X}p_i\log p_i$$

where $p_i=p(x_i\mid X)$ is the probability of $x_i$, $i=1,2,\dots,n$, and satisfies

$$\sum_{i=1}^{n}p_i=1$$
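The quantity $H(X)$ defined above is the Shannon entropy. The sketch below estimates it from observed values, taking each $p_i$ as a relative frequency. The base of the logarithm is not stated in the text, so base 2 is an assumption here, and the function name is illustrative.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (base 2, assumed) of the empirical value distribution."""
    n = len(values)
    counts = Counter(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


print(entropy(["a", "a", "b", "b"]))  # prints 1.0
```

A column in which every value is the same has zero entropy, and the entropy grows as the values become more evenly spread, which is the sense in which it measures how ordered a dataset is.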

Based on this, an anomalous data detection algorithm based on information entropy, a greedy algorithm, is proposed. Its main idea is to use information entropy to measure the orderliness of a dataset: the smaller the entropy of the dataset, the more ordered it is; conversely, the larger the entropy, the more chaotic the data.

The greedy algorithm traverses the dataset, computes the entropy of the dataset after removing each data point in turn, finds the data point whose removal causes the largest change in the entropy of the dataset, and moves it from the dataset into the anomalous dataset. This is repeated until all anomalies in the dataset have been found. Because finding each anomaly requires a full traversal of the dataset, the time complexity of the algorithm is relatively high when the dataset is very large or contains many anomalies.
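The greedy procedure described above can be sketched as follows. Several details are assumptions made for the example because the text does not fix them: the entropy of a multi-attribute dataset is taken as the sum of the per-attribute entropies (base 2), the "largest change" is interpreted as the largest decrease in entropy, and the number of anomalies to extract is passed in as `n_outliers`.

```python
import math
from collections import Counter


def dataset_entropy(rows):
    """Sum of per-attribute Shannon entropies (an assumed definition)."""
    if not rows:
        return 0.0
    n = len(rows)
    total = 0.0
    for col in zip(*rows):
        for count in Counter(col).values():
            p = count / n
            total -= p * math.log2(p)
    return total


def greedy_outliers(rows, n_outliers):
    """Repeatedly remove the record whose removal lowers the entropy most."""
    remaining = list(rows)
    outliers = []
    for _ in range(min(n_outliers, len(remaining))):
        base = dataset_entropy(remaining)
        best_idx, best_drop = 0, float("-inf")
        for i in range(len(remaining)):       # full pass for every outlier
            rest = remaining[:i] + remaining[i + 1:]
            drop = base - dataset_entropy(rest)
            if drop > best_drop:
                best_idx, best_drop = i, drop
        outliers.append(remaining.pop(best_idx))
    return outliers, remaining
```

The nested loop makes the cost visible: each extracted anomaly needs one entropy evaluation per remaining record, which is the source of the high time complexity on large datasets.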

The basic idea of the frequency-based anomalous data detection algorithm is to calculate, for each sample, how frequently each of its attribute values occurs for that attribute, and then to compute the attribute value frequency (AVF) of the sample from these frequencies. The smaller the AVF of a sample, the more anomalous it is. The AVF is calculated as follows:

$$AVF(x_i)=\frac{1}{m}\sum_{j=1}^{m}f(x_{ij})$$

where $x_i$ is a sample, $m$ is the dimension of the sample data, and $f(x_{ij})$ is the frequency with which the value $x_{ij}$ of sample $x_i$ occurs on attribute $j$.
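A minimal sketch of the AVF score defined above, taking $f(x_{ij})$ as the raw count of the value within its attribute column (relative frequencies would rank the samples identically). The function name and the toy records are illustrative.

```python
from collections import Counter


def avf_scores(rows):
    """AVF of each row: mean count of its attribute values over all attributes."""
    m = len(rows[0])
    freq = [Counter(col) for col in zip(*rows)]   # one counter per attribute
    return [sum(freq[j][row[j]] for j in range(m)) / m for row in rows]


records = [("a", "x"), ("a", "x"), ("a", "y"), ("b", "z")]
print(avf_scores(records))  # prints [2.5, 2.5, 2.0, 1.0]
```

The last record has the lowest score because both of its values are rare, so it would be flagged first. Unlike the greedy entropy method, this needs only one pass to count values and one pass to score the rows.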

Cosine-based similarity calculation model

The cosine-based similarity method is a distance-based method that uses the cosine of the angle between two vectors in the vector space to measure the difference between two samples. The larger the cosine value, the more similar the two samples are; conversely, the smaller the cosine value, the greater the difference between them. Unlike the Euclidean distance measure, which emphasizes differences in position and absolute distance between two samples, the cosine measure ignores position and emphasizes differences in direction. The two measures therefore each have their own advantages and suit different models. For any two samples X and Y, the cosine similarity is calculated as follows:

$$Sim(X,Y)=\cos(X,Y)=\frac{\sum_{i=1}^{m}(x_i\times y_i)}{\sqrt{\sum_{i=1}^{m}x_i^{2}}\times\sqrt{\sum_{i=1}^{m}y_i^{2}}}=\frac{X\cdot Y}{\lVert X\rVert\times\lVert Y\rVert}$$

where $m$ is the dimension of the sample data.
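The contrast with Euclidean distance drawn above can be seen on a small example. The sketch below implements both measures in plain Python; the function names and the grade-like vectors are illustrative.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_x * norm_y)


def euclidean_distance(x, y):
    """Straight-line distance between two equal-length numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Same direction, different magnitude: cosine sees them as identical,
# Euclidean distance does not.
u, v = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
print(round(cosine_similarity(u, v), 6), round(euclidean_distance(u, v), 4))
```

Two students whose score profiles have the same shape but different overall levels would therefore be close under the cosine measure and far apart under the Euclidean one.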

Analysis of the development of Chinese language and literature based on data mining

This section mines and summarizes data on the development of Chinese language and literature education in the information age, to support the discussion of innovation in Chinese language and literature development later in the text.

Analysis of Potential Capacity in Chinese Language and Literature

In this study, the grades of 20 students in Class A of the Chinese Language and Literature major at S College were selected as the initial data. The data cover 11 courses: (A1) Introduction to Chinese Culture, (A2) Twentieth Century Western Literature, (A3) Educational Technology, (A4) History of Contemporary Chinese Literature, (A5) Ancient Chinese, (A6) Introduction to Linguistics, (A7) Theory of Language Teaching, (A8) Introduction to Literature, (A9) Modern Chinese, (A10) Information Processing of Language and Literature, and (A11) History of Modern Chinese Literature. Factor analysis was performed on the students’ grades, and the factor loading matrix was calculated using varimax (maximum variance) rotation. The factor loadings and the factor structure are shown in Table 1. The factor structure is obtained by assigning to each factor the variables whose loadings on it are greater than 0.5. The five factors reflect different potential abilities. Factor 1 is dominated by basic theory courses, which reflect students’ cognitive ability, rational thinking ability and ability to use theories to analyze specific phenomena. Factor 2 is dominated by language courses, which reflect students’ ability to study language problems and use language correctly. Factor 3 mainly contains word processing courses, which reflect students’ word processing skills. Factor 4 covers the basic skills courses for language teachers, reflecting students’ ability to teach language. Factor 5 covers the literary history courses, reflecting students’ ability to find patterns in historical knowledge and to think coherently about history.

Table 1. Factor loading results and factor structure

| Course | Factor 1 | Factor 2 | Factor 3 | Factor 4 | Factor 5 |
|---|---|---|---|---|---|
| A1 | 0.906 | 0.246 | -0.113 | 0.147 | 0.194 |
| A2 | 0.147 | 0.115 | 0.989 | 0.044 | 0.159 |
| A3 | 0.334 | 0.126 | 0.024 | 0.995 | 0.262 |
| A4 | 0.938 | -0.066 | 0.092 | 0.34 | 0.353 |
| A5 | 0.261 | 0.486 | 0.197 | 0.673 | 0.579 |
| A6 | 0.683 | 0.615 | 0.345 | 0.223 | -0.136 |
| A7 | -0.012 | 0.309 | 0.696 | 0.724 | 0.035 |
| A8 | 0.377 | 0.838 | -0.031 | 0.452 | 0.299 |
| A9 | -0.029 | 0.905 | 0.556 | 0.052 | 0.141 |
| A10 | -0.212 | 0.532 | 0.716 | 0.225 | 0.226 |
| A11 | 0.240 | 0.159 | 0.189 | 0.249 | 0.929 |

Factor structure

| | Factor 1 | Factor 2 | Factor 3 | Factor 4 | Factor 5 |
|---|---|---|---|---|---|
| Courses | A1, A4, A6 | A6, A8, A9 | A2, A7, A10 | A3, A5, A7 | A5, A11 |
| Ability | Cognitive ability, rational thinking ability | Language competence | Verbal ability | Language teaching ability | Historical thinking ability |

Cluster analysis of Chinese language and literature development

After the student grade data are standardized, the students are denoted by serial numbers 1 to 20. The hierarchical clustering results are shown in Figure 2. According to the clustering dendrogram, the 20 students can be divided into five categories: K1 = (1, 3, 4, 2), four students; K2 = (5, 9, 7, 6, 8, 10), six students; K3 = (11, 12, 13), three students; K4 = (14, 15, 16), three students; K5 = (17, 20, 18, 19), four students.

Figure 2.

Hierarchical clustering results

The cluster analysis of students’ grades classified the 20 students into five categories, which can be combined with the model presented in this paper to characterize the development of Chinese language and literature.

K1 = (1, 3, 4, 2): four students have poor Chinese language ability and Chinese language teaching ability, but outstanding cognitive ability, rational thinking ability and word processing ability.

K2 = (5, 9, 7, 6, 8, 10): six students are relatively balanced in all abilities, with language ability, cognitive ability and rational thinking ability standing out.

K3 = (11, 12, 13): three students have poor Chinese language teaching ability and strong word processing ability, and are relatively balanced in the remaining abilities.

K4 = (14, 15, 16): three students have poor rational thinking ability and poor language ability, strong word processing ability, and balanced remaining abilities.

K5 = (17, 20, 18, 19): four students are prominent in all five potential abilities.

In summary, only 20% of the students majoring in Chinese Language and Literature have outstanding potential ability across the board, while the potential abilities of the remaining students are unbalanced. This shows that the current development of Chinese language and literature is unbalanced and severely mismatched.
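The 20% figure follows directly from the cluster sizes listed above. As a small arithmetic check, the sketch below computes the share of each cluster from the reported memberships; the variable names are illustrative.

```python
# Cluster memberships as reported for the 20 students (serial numbers 1 to 20).
clusters = {
    "K1": [1, 3, 4, 2],
    "K2": [5, 9, 7, 6, 8, 10],
    "K3": [11, 12, 13],
    "K4": [14, 15, 16],
    "K5": [17, 20, 18, 19],
}

total = sum(len(members) for members in clusters.values())
shares = {name: 100 * len(members) / total for name, members in clusters.items()}

for name, share in shares.items():
    print(f"{name}: {share:.0f}%")
```

K5, the only group that is strong in all five abilities, accounts for 4 of the 20 students, and K2 is the largest group with 6 of 20.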

Difficulties and Countermeasures for the Development of Chinese Language and Literature
Analysis of the Dilemma of Chinese Language and Literature Development

First, in today’s information age, the proliferation of Internet buzzwords has greatly hindered the further development of Chinese language and literature. Internet buzzwords do not follow formal Chinese grammatical structure; they are coined and spread by netizens seeking novelty and fun, and so do not belong to the Chinese language system. However, because they are usually easy to read and somewhat humorous, they spread constantly on major social networking platforms. These witty buzzwords are used frequently online and even in real life, which not only seriously affects the development and dissemination of Chinese language and literature but also harms the physical and mental health of minors. Society therefore needs to strengthen its guidance so that minors can learn Chinese language and literature efficiently and to a high standard. Second, the decline of handwriting and written language is serious. As the pace of life accelerates and social software is used ever more frequently, handwritten letters and time spent together have become memories, and people have fewer and fewer opportunities to write. Finally, Chinese language and literature education lacks modernized, reformed theoretical knowledge and teaching practice; together with the rich connotation and complex structure of Chinese language and literature itself, this makes its popularization and promotion more difficult.

Innovative Cultivation Path of Chinese Language and Literature

Chinese language and literature has profound connotation and rich structure. Improving and refining its teaching curriculum can better promote the development of Chinese language and literature and advance the core literacy teaching goals of the discipline. To combine emotional and value education organically with the content of Chinese language and literature, to make teaching fuller and more vivid, and to adapt to the information age, a Chinese language teaching mode based on dual-line teaching is constructed, as shown in Figure 3. In dual-line teaching, the Chinese language and literature course selects online and offline teaching resources in a considered way, applies targeted measures during teaching to connect online and offline teaching organically, and strives to implement the complete educational chain of knowledge, emotion, intention and behavior. Guided by the principles of blended teaching, curriculum teaching and Chinese language and literature, based on the basic content of the course, and taking into account the current dual-line teaching environment, a dual-line teaching model for the Chinese language and literature course is constructed. The model includes three major links, teaching front-end analysis, teaching resource design and teaching process design, which are progressive and cyclical.

Figure 3.

The teaching model of Chinese language and literature based on dual-line teaching

Analysis of the effect of the application of the dual-line teaching model

The purpose of this experiment is to apply both the traditional teaching mode and the dual-line teaching mode proposed in this paper in Chinese language and literature classroom teaching and to compare the two in depth. The subjects are 35 Chinese language and literature students in Class A of College S, with a difference of 0.1 in each level of Chinese language and literature, divided equally into two groups, A and B. Group A is taught with the dual-line teaching mode of this paper, while Group B receives no intervention and is taught in the traditional mode, serving only as the control group. After a one-semester teaching experiment, the two groups are compared on N1 cognitive ability, N2 rational thinking ability, N3 theoretical analysis ability, N4 language ability, N5 word processing ability and N6 historical thinking ability, in order to verify the effect of applying the innovative path of Chinese language and literature.

Comparison of the effects of Chinese language and literature teaching models

After one semester of the teaching experiment, the competence levels of the students in Groups A and B are compared in Table 2. Overall, the P values for all six competence dimensions are less than 0.05, indicating significant differences, and the competence levels of Group A after the implementation of the innovative path are 0.563 to 1.263 points higher than those of Group B under the traditional teaching mode. In terms of the composite score, the composite mean of Group A is 4.340, which is 0.855 higher than the composite mean of 3.485 for Group B under the traditional mode; with P less than 0.001, the difference is significant. This indicates that the composite mean of Group A is significantly higher than that of Group B, and that Chinese language and literature teaching based on the dual-line teaching model of this paper is more effective in improving students’ competence in each dimension.
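As an arithmetic check on the figures quoted above, the following sketch recomputes the per-dimension gaps and the composite means from the group means reported in Table 2. The variable names are illustrative; the numbers are the table's.

```python
# Mean scores per competence dimension, copied from Table 2.
group_a = {"N1": 4.252, "N2": 4.043, "N3": 4.236, "N4": 4.399, "N5": 4.548, "N6": 4.562}
group_b = {"N1": 3.306, "N2": 3.385, "N3": 3.646, "N4": 3.836, "N5": 3.285, "N6": 3.452}

# Gap between Group A and Group B on each dimension.
gains = {dim: round(group_a[dim] - group_b[dim], 3) for dim in group_a}

# Composite score taken as the simple average of the six dimension means.
mean_a = round(sum(group_a.values()) / len(group_a), 3)
mean_b = round(sum(group_b.values()) / len(group_b), 3)

print(min(gains.values()), max(gains.values()))  # smallest and largest gap
print(mean_a, mean_b, round(mean_a - mean_b, 3))
```

The smallest gap is on N4 (0.563) and the largest on N5 (1.263), and the simple averages of the six dimension means reproduce the reported composite means of 4.340 and 3.485.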

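Table 2. Comparison of the ability levels of the two groups

| Dimension | Group | N | Mean | Standard deviation | T | P |
|---|---|---|---|---|---|---|
| N1 | A | 32 | 4.252 | 0.396 | 2.279 | 0.000 |
| | B | 32 | 3.306 | 0.162 | | |
| N2 | A | 32 | 4.043 | 0.263 | 2.015 | 0.000 |
| | B | 32 | 3.385 | 0.391 | | |
| N3 | A | 32 | 4.236 | 0.266 | 1.796 | 0.001 |
| | B | 32 | 3.646 | 0.172 | | |
| N4 | A | 32 | 4.399 | 0.295 | 0.983 | 0.002 |
| | B | 32 | 3.836 | 0.267 | | |
| N5 | A | 32 | 4.548 | 0.402 | 2.115 | 0.000 |
| | B | 32 | 3.285 | 0.406 | | |
| N6 | A | 32 | 4.562 | 0.176 | 2.483 | 0.000 |
| | B | 32 | 3.452 | 0.115 | | |
| Composite mean | A | 43 | 4.340 | 0.867 | 2.376 | 0.000 |
| | B | 43 | 3.485 | 0.788 | | |

Comparison of Satisfaction with Chinese Language and Literature Teaching Models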

This section further investigates the effectiveness of the two teaching modes by comparing students’ satisfaction with them. After the teaching experiment, a satisfaction survey was conducted among the 64 participants on six aspects: P1 teacher guidance, P2 teaching method, P3 teaching effect, P4 course situation, P5 goal completion and P6 acceptance. The questionnaire was scored on a scale of 1 to 5, representing “very dissatisfied”, “dissatisfied”, “average”, “satisfied” and “very satisfied”, in that order. The satisfaction scores of the two groups are shown in Figure 4. On the six satisfaction indicators, the scores of Group A, taught with the model of this paper, are 0.18 to 0.94 points higher than those of Group B, taught traditionally. Overall, the composite mean satisfaction scores of Groups A and B are 4.258 and 3.736, respectively, so the composite mean of Group A is about 0.521 points higher. This means that students’ satisfaction with the dual-line teaching mode of Chinese language and literature in this paper lies between “satisfied” and “very satisfied”; compared with the traditional teaching mode, it is better recognized by students and receives a higher satisfaction rating. This further verifies the advantages of the dual-line Chinese language and literature teaching mode in practical application, which can effectively promote the development of Chinese language and literature education.

Figure 4.

Student satisfaction scores

In addition, as information network technology becomes increasingly integrated with social production and life, novel vocabulary and expressions emerge rapidly and in large numbers online, enriching people’s means of expression, and the public even tends to use network language for daily communication and emotional expression. This brings new ideas for the development of Chinese language and literature. To inherit and develop Chinese language culture and Chinese language and literature, it is necessary to strengthen integration with traditional culture in the process of inheritance, and to draw on the profound heritage of traditional culture to promote the common development of the two in the new era.

Conclusion

This paper first constructs a data mining model of the development level of Chinese language and literature using a clustering algorithm, then uses the mined data to explore the dilemma of Chinese language and literature informatization, and finally proposes and applies a path for its innovative development.

1) Using factor analysis, with the variables whose loadings on each factor exceed 0.5, combined with the model of this paper, the students of Chinese language and literature can be divided into five categories, K1, K2, K3, K4 and K5. K1 students have outstanding word processing ability. K2 students are relatively balanced in all abilities and make up the largest share (30%). K3 students have poor Chinese language teaching ability. K4 students have poor rational thinking ability and poor language ability. K5 students are prominent in all five potential abilities.

2) Group A, taught with the teaching mode of this paper, scored 0.563 to 1.263 points higher at each ability level than Group B, taught with the traditional mode. The composite mean of Group A was 4.340, which is 0.855 higher than that of Group B. This shows that the dual-line teaching mode of Chinese language and literature in this paper can effectively improve students’ abilities in all dimensions and performs better than the traditional mode in terms of teaching effect.

3) The six satisfaction index scores of Group A, taught with the teaching mode of this paper, are 0.18 to 0.94 points higher than those of Group B. The mean satisfaction score of Group A is 4.258, about 0.521 points higher than that of Group B. This shows that the dual-line teaching mode of Chinese language and literature in this paper is better recognized by students and more in line with the needs of practical teaching.

The insights and methods provided in this paper have theoretical and practical implications for the development of Chinese language and literature. In future work, it will be necessary to deepen the data analysis, expand the sample size, combine theoretical discussion more closely with empirical analysis, and further verify and extend the findings of this study.

Fund project:

This article is sponsored by the second batch of “Three Comprehensive Education” comprehensive reform pilot colleges (departments) in Hunan Province