Integrating Content Analysis and LDA Thematic Modeling to Analyze the Presentation of Youth Culture in Urban Cinema
About this article
Published Online: Sep 26, 2025
Received: Dec 22, 2024
Accepted: Apr 18, 2025
DOI: https://doi.org/10.2478/amns-2025-1030
© 2025 Hao Liu, published by Sciendo.
This work is licensed under the Creative Commons Attribution 4.0 International License.
Running time of each model for different topic keyword types
| Model | Running time (s), general keywords | Running time (s), synonymous keywords | Running time (s), polysemous keywords |
|---|---|---|---|
| LSA | 5.0944 | 5.0498 | 5.7439 |
| PLSA | 4.2886 | 3.7515 | 4.6910 |
| STM | 4.3783 | 4.7040 | 5.1960 |
| CNN | 5.0109 | 5.2280 | 6.2341 |
| ERNIE | 4.337 | 4.5253 | 4.8669 |
| LDA | 5.5752 | 6.1680 | 6.7025 |
| LSTM | 5.0323 | 5.4087 | 6.6815 |
| BERT-base | 3.3305 | 3.8339 | 4.6453 |
| LDA-Kmeans | 0.2744 | 0.3241 | 0.4543 |
Topic division results for the "involution" text data
| Topic | Weight (%) | Core keywords | Topic description |
|---|---|---|---|
| Topic 5 | 34.26 | Society, industry, company, time, work, competition, market, enterprise, Internet, young people, graduation, opportunity | Involution is very serious |
| Topic 6 | 29.04 | Oneself, hard work, life, work, study, anxiety, lying flat, things, overtime, life, examination and investigation | Life stress |
| Topic 7 | 21.17 | Serious, after-work, anti-involution, evening, likes, colleagues, work, support, milk tea, mobile phone, game, star | Resistance to involution |
| Topic 8 | 15.53 | Education, school, students, parents, training, teachers, institutions, winter and summer holidays, policies, college entrance exams, universities, supplementary classes | Education involution |
Test results on three different data sets
| Data set | Model | Accuracy rate | Recall rate | F1 value |
|---|---|---|---|---|
| YCT | CNN | 0.9015 | 0.8996 | 0.9219 |
| YCT | LSTM | 0.8638 | 0.8605 | 0.8646 |
| YCT | BERT-base | 0.9164 | 0.9169 | 0.9131 |
| YCT | LDA-Kmeans | 0.9613 | 0.9844 | 0.9702 |
| YCT | LDA | 0.9248 | 0.9118 | 0.9206 |
| YCT | ERNIE | 0.9474 | 0.9465 | 0.9434 |
| Weibo1 | CNN | 0.8891 | 0.8858 | 0.9078 |
| Weibo1 | LSTM | 0.8489 | 0.8472 | 0.8526 |
| Weibo1 | BERT-base | 0.9031 | 0.9015 | 0.8977 |
| Weibo1 | LDA-Kmeans | 0.9789 | 0.9699 | 0.9545 |
| Weibo1 | LDA | 0.9063 | 0.8981 | 0.9056 |
| Weibo1 | ERNIE | 0.9341 | 0.9321 | 0.9295 |
| Online2 | CNN | 0.873 | 0.8694 | 0.9026 |
| Online2 | LSTM | 0.8381 | 0.8207 | 0.8458 |
| Online2 | BERT-base | 0.8942 | 0.8882 | 0.8861 |
| Online2 | LDA-Kmeans | 0.9679 | 0.9789 | 0.9625 |
| Online2 | LDA | 0.8954 | 0.8848 | 0.8948 |
| Online2 | ERNIE | 0.9228 | 0.908 | 0.9166 |
Accuracy of each model for different topic keyword types
| Model | Accuracy (%), general keywords | Accuracy (%), synonymous keywords | Accuracy (%), polysemous keywords |
|---|---|---|---|
| LSA | 44.37 | 41.64 | 38.77 |
| PLSA | 56.88 | 53.37 | 42.34 |
| STM | 64.51 | 58.32 | 40.23 |
| CNN | 47.74 | 43.04 | 37.67 |
| ERNIE | 52.78 | 51.38 | 50.69 |
| LDA | 57.16 | 51.52 | 39.03 |
| LSTM | 65.75 | 60.85 | 54.08 |
| BERT-base | 55.85 | 49.45 | 49.21 |
| LDA-Kmeans | 93.88 | 90.12 | 88.54 |
Topic division results for the "lying flat" text data
| Topic | Weight (%) | Core keywords | Topic description |
|---|---|---|---|
| Topic 1 | 35.78 | Choices, problems, young people, society, children, life, future, ability, lying flat, opportunity, education fund | Causes of lying flat |
| Topic 2 | 24.97 | Self, effort, work, no desire, things, anxiety, learning, salted fish, rejection, giving up, resting | Inner emotion |
| Topic 3 | 21.03 | Like, teacher, friend, forever, hope, lovely, good-looking, enter the pit, game, pit, thank you, stage | Resistance to involution |
| Topic 4 | 18.22 | Happy, home, weekend, day, comfort, sleep, mobile phone, refueling, sports, summer holidays, happiness, air conditioning | Enjoy life |
High-frequency vocabulary in the text data on young people "lying flat"
| Rank | Word | Frequency | Rank | Word | Frequency |
|---|---|---|---|---|---|
| 1 | Lying flat * | 76500 | 16 | Question * | 7014 |
| 2 | Self * | 71034 | 17 | Child * | 6757 |
| 3 | Life * | 50377 | 18 | Go home | 6549 |
| 4 | Effort * | 30549 | 19 | Learning * | 6321 |
| 5 | Work * | 22147 | 20 | Anxiety * | 6218 |
| 6 | Suffer * | 19053 | 21 | Young people | 6210 |
| 7 | Like * | 9987 | 22 | Fatigue | 6138 |
| 8 | Eat | 9654 | 23 | Friend | 6022 |
| 9 | Select | 9014 | 24 | World | 5317 |
| 10 | Time * | 8326 | 25 | At home | 5015 |
| 11 | Get up | 8059 | 26 | Society * | 5004 |
| 12 | Hope * | 8011 | 27 | Teacher * | 4932 |
| 13 | Happiness | 7877 | 28 | Go to work | 2714 |
| 14 | Joyfulness | 7656 | 29 | China * | 1999 |
| 15 | Thing * | 7325 | 30 | Tomorrow | 1934 |
High-frequency vocabulary in the text data on youth "involution"
| Rank | Word | Frequency | Rank | Word | Frequency |
|---|---|---|---|---|---|
| 1 | Involution | 54870 | 16 | Anxiety * | 6891 |
| 2 | Self * | 49423 | 17 | Question * | 6624 |
| 3 | Education * | 28762 | 18 | Hope * | 6434 |
| 4 | Child * | 18941 | 19 | Time * | 6194 |
| 5 | Work * | 11547 | 20 | China * | 6097 |
| 6 | Severity | 10562 | 21 | Company | 6082 |
| 7 | Effort * | 9848 | 22 | Stars | 6012 |
| 8 | Life * | 9534 | 23 | School | 5884 |
| 9 | Society * | 8889 | 24 | Student | 5190 |
| 10 | Teacher * | 8193 | 25 | Age | 4879 |
| 11 | Donation | 7925 | 26 | Overtime | 4873 |
| 12 | Lying flat * | 7884 | 27 | Competition | 4809 |
| 13 | Like * | 7754 | 28 | Parent | 2584 |
| 14 | Industry | 7526 | 29 | Thing * | 1877 |
| 15 | Learning * | 7206 | 30 | Money | 1813 |
