Integrating Content Analysis and LDA Thematic Modeling to Analyze the Presentation of Youth Culture in Urban Cinema
About this article
Published online: 26 Sep 2025
Received: 22 Dec 2024
Accepted: 18 Apr 2025
DOI: https://doi.org/10.2478/amns-2025-1030
© 2025 Hao Liu, published by Sciendo.
This work is licensed under the Creative Commons Attribution 4.0 International License.
Figure 1.

Figure 2.

Figure 3.

Figure 4.

Figure 5.

Running time of each model for different types of topic keywords
| Model | Running time (s): general keywords | Running time (s): synonymous keywords | Running time (s): polysemous keywords |
|---|---|---|---|
| LSA | 5.0944 | 5.0498 | 5.7439 |
| PLSA | 4.2886 | 3.7515 | 4.6910 |
| STM | 4.3783 | 4.7040 | 5.1960 |
| CNN | 5.0109 | 5.2280 | 6.2341 |
| ERNIE | 4.337 | 4.5253 | 4.8669 |
| LDA | 5.5752 | 6.1680 | 6.7025 |
| LSTM | 5.0323 | 5.4087 | 6.6815 |
| BERT-base | 3.3305 | 3.8339 | 4.6453 |
| LDA-Kmeans | 0.2744 | 0.3241 | 0.4543 |
Topic division results for the youth "involution" ("inner volume") text data
| Topic | Weighting (%) | Core keywords | Topic description |
|---|---|---|---|
| Topic 5 | 34.26 | Society, industry, company, time, work, competition, market, enterprise, Internet, young people, graduation, opportunity | Severe involution |
| Topic 6 | 29.04 | Oneself, hard work, life, work, study, anxiety, lying flat, things, overtime, life, examination and investigation | Life stress |
| Topic 7 | 21.17 | Serious, after-work, anti-involution, evening, likes, colleagues, work, support, milk tea, mobile phone, game, star | Resisting involution |
| Topic 8 | 15.53 | Education, school, students, parents, training, teachers, institutions, winter and summer holidays, policies, college entrance exams, universities, supplementary classes | Involution in education |
Test results on three different data sets
| Data set | Model | Accuracy rate | Recall rate | F1 value |
|---|---|---|---|---|
| YCT | CNN | 0.9015 | 0.8996 | 0.9219 |
| YCT | LSTM | 0.8638 | 0.8605 | 0.8646 |
| YCT | BERT-base | 0.9164 | 0.9169 | 0.9131 |
| YCT | LDA-Kmeans | 0.9613 | 0.9844 | 0.9702 |
| YCT | LDA | 0.9248 | 0.9118 | 0.9206 |
| YCT | ERNIE | 0.9474 | 0.9465 | 0.9434 |
| Weibo1 | CNN | 0.8891 | 0.8858 | 0.9078 |
| Weibo1 | LSTM | 0.8489 | 0.8472 | 0.8526 |
| Weibo1 | BERT-base | 0.9031 | 0.9015 | 0.8977 |
| Weibo1 | LDA-Kmeans | 0.9789 | 0.9699 | 0.9545 |
| Weibo1 | LDA | 0.9063 | 0.8981 | 0.9056 |
| Weibo1 | ERNIE | 0.9341 | 0.9321 | 0.9295 |
| Online2 | CNN | 0.873 | 0.8694 | 0.9026 |
| Online2 | LSTM | 0.8381 | 0.8207 | 0.8458 |
| Online2 | BERT-base | 0.8942 | 0.8882 | 0.8861 |
| Online2 | LDA-Kmeans | 0.9679 | 0.9789 | 0.9625 |
| Online2 | LDA | 0.8954 | 0.8848 | 0.8948 |
| Online2 | ERNIE | 0.9228 | 0.908 | 0.9166 |
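
The table above reports accuracy, recall, and F1 per model, but this page does not say how they were computed. As a minimal sketch, assuming the standard binary-classification definitions, the metrics can be derived from confusion-matrix counts as follows. The `Confusion` container and the toy counts are hypothetical and are not taken from the article.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Binary confusion-matrix counts (hypothetical container for illustration)."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives


def accuracy(c: Confusion) -> float:
    """Share of all predictions that are correct."""
    total = c.tp + c.fp + c.fn + c.tn
    return (c.tp + c.tn) / total if total else 0.0


def precision(c: Confusion) -> float:
    """Share of predicted positives that are truly positive."""
    denom = c.tp + c.fp
    return c.tp / denom if denom else 0.0


def recall(c: Confusion) -> float:
    """Share of true positives that the model recovers."""
    denom = c.tp + c.fn
    return c.tp / denom if denom else 0.0


def f1_score(c: Confusion) -> float:
    """Harmonic mean of precision and recall."""
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Toy counts, invented for illustration only (not the article's data).
toy = Confusion(tp=90, fp=10, fn=5, tn=95)
print(f"accuracy={accuracy(toy):.4f} recall={recall(toy):.4f} f1={f1_score(toy):.4f}")
```

Under these definitions F1 always lies between precision and recall, which gives a quick sanity check when reading a results table.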
Comparison of each model's accuracy across keyword types
| Model | Accuracy (%): general keywords | Accuracy (%): synonymous keywords | Accuracy (%): polysemous keywords |
|---|---|---|---|
| LSA | 44.37 | 41.64 | 38.77 |
| PLSA | 56.88 | 53.37 | 42.34 |
| STM | 64.51 | 58.32 | 40.23 |
| CNN | 47.74 | 43.04 | 37.67 |
| ERNIE | 52.78 | 51.38 | 50.69 |
| LDA | 57.16 | 51.52 | 39.03 |
| LSTM | 65.75 | 60.85 | 54.08 |
| BERT-base | 55.85 | 49.45 | 49.21 |
| LDA-Kmeans | 93.88 | 90.12 | 88.54 |
Topic division results for the youth "lying flat" text data
| Topic | Weighting (%) | Core keywords | Topic description |
|---|---|---|---|
| Topic 1 | 35.78 | Choices, problems, young people, society, children, life, future, ability, lying flat, opportunity, education fund | Causes of lying flat |
| Topic 2 | 24.97 | Self, effort, work, no desire, things, anxiety, learning, salted fish, rejection, giving up, resting | Inner emotion |
| Topic 3 | 21.03 | Like, teacher, friend, forever, hope, lovely, good-looking, enter the pit, game, pit, thank you, stage | Resisting involution ("inner volume") |
| Topic 4 | 18.22 | Happy, home, weekend, day, comfort, sleep, mobile phone, refueling, sports, summer holidays, happiness, air conditioning | Enjoy life |
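
The Weighting (%) column in each topic table sums to 100, which suggests a normalized share per topic. This page does not state how the weights were obtained. One common convention, shown here purely as an assumption, is to average each topic's probability over all documents and rescale to percentages. The function name `topic_weights` and the toy document-topic matrix are hypothetical.

```python
from __future__ import annotations


def topic_weights(doc_topic: list[list[float]]) -> list[float]:
    """Return each topic's share of total probability mass, in percent.

    doc_topic[d][k] is the probability of topic k in document d, e.g. the
    document-topic matrix produced by an LDA implementation (assumed input).
    """
    if not doc_topic:
        return []
    n_topics = len(doc_topic[0])
    totals = [0.0] * n_topics
    for row in doc_topic:
        if len(row) != n_topics:
            raise ValueError("all documents must have the same number of topics")
        for k, p in enumerate(row):
            totals[k] += p
    grand_total = sum(totals)
    if grand_total == 0:
        raise ValueError("document-topic matrix has no probability mass")
    return [100.0 * t / grand_total for t in totals]


# Toy matrix: 3 documents x 3 topics, invented for illustration only.
toy = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.4, 0.4, 0.2],
]
weights = topic_weights(toy)
for k, w in enumerate(weights, start=1):
    print(f"Topic {k}: {w:.2f}%")
```

An alternative convention counts each document once under its dominant topic; the two can give noticeably different shares when many documents mix topics.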
High-frequency vocabulary in the youth "lying flat" text data
| Rank | Word | Frequency | Rank | Word | Frequency |
|---|---|---|---|---|---|
| 1 | Lying flat* | 76500 | 16 | Question* | 7014 |
| 2 | Self* | 71034 | 17 | Child* | 6757 |
| 3 | Life* | 50377 | 18 | Go home | 6549 |
| 4 | Effort* | 30549 | 19 | Learning* | 6321 |
| 5 | Work* | 22147 | 20 | Anxiety* | 6218 |
| 6 | Suffer* | 19053 | 21 | Young man | 6210 |
| 7 | Like* | 9987 | 22 | Fatigue | 6138 |
| 8 | Eat | 9654 | 23 | Friend | 6022 |
| 9 | Select | 9014 | 24 | World | 5317 |
| 10 | Time* | 8326 | 25 | At home | 5015 |
| 11 | Get up | 8059 | 26 | Society* | 5004 |
| 12 | Hope* | 8011 | 27 | Teacher* | 4932 |
| 13 | Happiness | 7877 | 28 | Go to work | 2714 |
| 14 | Joyfulness | 7656 | 29 | China* | 1999 |
| 15 | Thing* | 7325 | 30 | Tomorrow | 1934 |
High-frequency vocabulary in the youth "involution" ("inner volume") text data
| Rank | Word | Frequency | Rank | Word | Frequency |
|---|---|---|---|---|---|
| 1 | Involution | 54870 | 16 | Anxiety* | 6891 |
| 2 | Self* | 49423 | 17 | Question* | 6624 |
| 3 | Education* | 28762 | 18 | Hope* | 6434 |
| 4 | Child* | 18941 | 19 | Time* | 6194 |
| 5 | Work* | 11547 | 20 | China* | 6097 |
| 6 | Severity | 10562 | 21 | Company | 6082 |
| 7 | Effort* | 9848 | 22 | Stars | 6012 |
| 8 | Life* | 9534 | 23 | School | 5884 |
| 9 | Society* | 8889 | 24 | Student | 5190 |
| 10 | Teacher* | 8193 | 25 | Age | 4879 |
| 11 | Donation | 7925 | 26 | Overtime | 4873 |
| 12 | Lying flat* | 7884 | 27 | Competition | 4809 |
| 13 | Like* | 7754 | 28 | Parent | 2584 |
| 14 | Industry | 7526 | 29 | Thing* | 1877 |
| 15 | Learning* | 7206 | 30 | Money | 1813 |
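
A ranked frequency table like the ones above can be produced from word-segmented text with a simple counter. The sketch below assumes the documents are already tokenized (the segmentation tool used in the article is not named on this page). The function name `top_words`, the optional stopword set, and the toy documents are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Sequence, Tuple


def top_words(
    docs: Iterable[Sequence[str]],
    n: int = 30,
    stopwords: frozenset = frozenset(),
) -> List[Tuple[str, int]]:
    """Count tokens across pre-segmented documents and return the n most frequent."""
    counts: Counter = Counter()
    for tokens in docs:
        # Skip stopwords so function words do not crowd out content words.
        counts.update(t for t in tokens if t not in stopwords)
    return counts.most_common(n)


# Toy pre-segmented documents, invented for illustration only.
toy_docs = [
    ["lying flat", "self", "work"],
    ["self", "life", "work"],
    ["self", "anxiety"],
]
ranked = top_words(toy_docs, n=3)
for rank, (word, freq) in enumerate(ranked, start=1):
    print(f"| {rank} | {word} | {freq} |")
```

Running the same function on two corpora and intersecting the resulting word sets is one way to find vocabulary shared between them.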
