Open Access

Research on multi-label short text categorization method for online education under deep learning

  
March 19, 2025

Figure 1.

Overall process of the multi-label short text classification task

Figure 2.

CNN’s network structure

Figure 3.

BERT-LSTM-CNN model structure

Figure 4.

Loss and Macro-P curves

Figure 5.

Comparison of model time consumption

Impact of convolution kernel size

Kernel sizes   THCNEWS Train   THCNEWS Test   EduData Train   EduData Test
[1,2,3]        96.34           96.42          97.18           96.89
[2,3,4]        98.06           98.13          97.42           97.26
[3,4,5]        97.83           97.57          97.63           97.54
[4,5,6]        97.69           97.42          97.71           97.98
[5,6,7]        97.75           97.68          98.25           98.37
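
Each row of the kernel-size table lists three window widths that are applied in parallel. The sketch below illustrates, in a TextCNN-style setting, how such a triple yields one pooled feature per width. It is an assumption-laden illustration, not the paper's implementation: the function names, the toy embeddings, and the fixed averaging filter (a trained model would learn its filter weights) are all hypothetical.

```python
from typing import List, Sequence


def conv_max_pool(embeddings: Sequence[Sequence[float]],
                  kernel: Sequence[Sequence[float]]) -> float:
    """Slide one filter spanning len(kernel) tokens over the sequence,
    apply ReLU, and keep the maximum response (max-over-time pooling)."""
    k = len(kernel)
    scores = []
    for start in range(len(embeddings) - k + 1):
        window = embeddings[start:start + k]
        response = sum(w * x
                       for w_row, x_row in zip(kernel, window)
                       for w, x in zip(w_row, x_row))
        scores.append(max(0.0, response))  # ReLU
    # A text shorter than the window produces no responses at all.
    return max(scores) if scores else 0.0


def multi_kernel_features(embeddings: Sequence[Sequence[float]],
                          kernel_sizes: Sequence[int]) -> List[float]:
    """Return one pooled feature per kernel size, e.g. [2, 3, 4]."""
    dim = len(embeddings[0])
    features = []
    for k in kernel_sizes:
        # Hypothetical fixed filter that averages the window; stands in
        # for learned weights purely for illustration.
        kernel = [[1.0 / (k * dim)] * dim for _ in range(k)]
        features.append(conv_max_pool(embeddings, kernel))
    return features


# Toy input: 5 tokens with 2-dimensional embeddings (made-up values).
tokens = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [0.0, 0.0], [1.0, 1.0]]
features = multi_kernel_features(tokens, [2, 3, 4])
```

Wider windows cover more tokens per position but yield fewer positions on a short text, which is one plausible reason the best kernel triple differs between datasets.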

Experimental comparison of different models

Model       THCNEWS                          EduData
            Macro-P   Macro-R   Macro-F1     Macro-P   Macro-R   Macro-F1
TextCNN     83.64%    82.01%    0.835        84.45%    83.13%    0.891
BERT        88.06%    84.75%    0.871        89.09%    86.06%    0.932
RoBERTa     87.57%    83.28%    0.882        89.24%    86.28%    0.935
MacBERT     88.25%    85.16%    0.878        89.46%    86.71%    0.937
ERNIE       88.48%    85.34%    0.883        89.87%    87.15%    0.941
ERNIE-CNN   89.73%    86.43%    0.894        91.16%    88.49%    0.948
CRC-MHA     90.12%    88.85%    0.901        91.49%    89.27%    0.953
Ours        92.09%    90.14%    0.915        92.08%    90.38%    0.962
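
For readers unfamiliar with the column headings, the following minimal sketch shows one common way to compute macro-averaged precision, recall, and F1 for multi-label predictions: each metric is computed per label and then averaged with equal weight. Whether the paper averages per-label F1 or derives F1 from the macro precision and recall is not stated here, so the per-label convention used below is an assumption. The function name and the toy label matrices are hypothetical.

```python
from typing import List, Tuple


def macro_prf(y_true: List[List[int]],
              y_pred: List[List[int]]) -> Tuple[float, float, float]:
    """Macro-averaged precision, recall, and F1 for multi-hot label rows.

    Each row is one text; each column is one label (1 = label assigned).
    """
    n_labels = len(y_true[0])
    precisions, recalls, f1s = [], [], []
    for j in range(n_labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        # Define a metric as 0 when its denominator is empty.
        prec = tp / (tp + fp) if (tp + fp) else 0.0
        rec = tp / (tp + fn) if (tp + fn) else 0.0
        f1 = 2 * prec * rec / (prec + rec) if (prec + rec) else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    return (sum(precisions) / n_labels,
            sum(recalls) / n_labels,
            sum(f1s) / n_labels)


# Toy example: 4 texts, 3 labels (made-up values).
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 1, 1]]
macro_p, macro_r, macro_f1 = macro_prf(y_true, y_pred)
```

Because every label contributes equally, rare labels influence the macro scores as much as frequent ones, which makes the macro average a strict measure on imbalanced label sets.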

Impact of different word embeddings on the results

Embedding   THCNEWS Train   THCNEWS Test   EduData Train   EduData Test
Word2Vec    93.32           92.64          93.75           93.98
ELMo        94.06           94.83          94.27           94.46
GloVe       94.27           94.78          94.96           94.83
BERT        96.48           96.32          96.71           96.59

Language: English
Publication frequency: 1 issue per year
Journal subjects: Biology, Biology, other, Mathematics, Applied Mathematics, Mathematics, General, Physics, Physics, other