A Study on the Evolution of Language Style in Japanese Academic Articles Based on Text Mining
17 Mar 2025
About this article
Published: 17 Mar 2025
Received: 16 Oct 2024
Accepted: 08 Feb 2025
DOI: https://doi.org/10.2478/amns-2025-0319
Keywords
© 2025 Xueyang Yin, published by Sciendo
This work is licensed under the Creative Commons Attribution 4.0 International License.
Distribution of sentence length of Japanese academic articles
| Period | Item | 1~15 | 16~30 | 31~45 | 46~60 | 61~75 | >75 | Total |
|---|---|---|---|---|---|---|---|---|
| 1981-1985 | Sentence number | 11876 | 8250 | 6144 | 4601 | 3368 | 2135 | 36374 |
| | Proportion | 32.65% | 22.68% | 16.89% | 12.65% | 9.26% | 5.87% | 100% |
| 1986-1990 | Sentence number | 12530 | 9344 | 6642 | 5170 | 3551 | 2440 | 39677 |
| | Proportion | 31.58% | 23.55% | 16.74% | 13.03% | 8.95% | 6.15% | 100% |
| 1991-1995 | Sentence number | 13779 | 9841 | 7338 | 5675 | 3630 | 2597 | 42860 |
| | Proportion | 32.15% | 22.96% | 17.12% | 13.24% | 8.47% | 6.06% | 100% |
| 1996-2000 | Sentence number | 13357 | 11024 | 7080 | 6237 | 4330 | 2334 | 44362 |
| | Proportion | 30.11% | 24.85% | 15.96% | 14.06% | 9.76% | 5.26% | 100% |
| 2001-2005 | Sentence number | 13577 | 10327 | 8490 | 8142 | 5647 | 2166 | 48349 |
| | Proportion | 28.08% | 21.36% | 17.56% | 16.84% | 11.68% | 4.48% | 100% |
| 2006-2010 | Sentence number | 14343 | 10728 | 9594 | 8796 | 5873 | 3908 | 53242 |
| | Proportion | 26.94% | 20.15% | 18.02% | 16.52% | 11.03% | 7.34% | 100% |
| 2011-2015 | Sentence number | 16058 | 11881 | 9756 | 8608 | 5850 | 4369 | 56522 |
| | Proportion | 28.41% | 21.02% | 17.26% | 15.23% | 10.35% | 7.73% | 100% |
| 2016-2020 | Sentence number | 17761 | 13345 | 10628 | 9664 | 6236 | 2615 | 60249 |
| | Proportion | 29.48% | 22.15% | 17.64% | 16.04% | 10.35% | 4.34% | 100% |
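
The proportion rows in the table above can be recomputed from the sentence counts. The following is a minimal sketch for the 1981-1985 row only; the function name `bin_proportions` is an illustrative choice, not something taken from the article.

```python
# Sentence counts per length bin for 1981-1985, copied from the table above.
# Bin order: 1~15, 16~30, 31~45, 46~60, 61~75, >75.
counts_1981_1985 = [11876, 8250, 6144, 4601, 3368, 2135]


def bin_proportions(counts):
    """Return each bin's share of the total as a percentage (2 decimals)."""
    total = sum(counts)
    return [round(100 * c / total, 2) for c in counts]


total_1981_1985 = sum(counts_1981_1985)
proportions_1981_1985 = bin_proportions(counts_1981_1985)

print(total_1981_1985)
print(proportions_1981_1985)
```

The same function can be applied to any other period's counts to check its proportion row.
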
ALW and DLW of Japanese academic articles from 1981 to 2020
| Period | Total character number | Total word number | Average length of word | Dispersion length of word |
|---|---|---|---|---|
| 1981-1985 | 1585188 | 864852 | 1.8329 | 0.358 |
| 1986-1990 | 1733027 | 931985 | 1.8595 | 0.362 |
| 1991-1995 | 1881979 | 997286 | 1.8871 | 0.346 |
| 1996-2000 | 2032242 | 1086470 | 1.8705 | 0.351 |
| 2001-2005 | 2245315 | 1187746 | 1.8904 | 0.338 |
| 2006-2010 | 2487462 | 1298868 | 1.9151 | 0.345 |
| 2011-2015 | 2723730 | 1406522 | 1.9365 | 0.346 |
| 2016-2020 | 2968347 | 1521683 | 1.9507 | 0.346 |
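
The average word length column in the table above is consistent with dividing the total character number by the total word number. The sketch below checks this for two periods. It does not attempt to reproduce the dispersion column, because that measure's definition is not given in this excerpt. The names used are illustrative.

```python
# (total characters, total words) copied from the table above.
corpus_totals = {
    "1981-1985": (1585188, 864852),
    "2016-2020": (2968347, 1521683),
}


def average_word_length(total_chars, total_words):
    """Average length of word (ALW), assumed here to be characters per word."""
    return total_chars / total_words


alw = {
    period: round(average_word_length(chars, words), 4)
    for period, (chars, words) in corpus_totals.items()
}
print(alw)
```
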
Statistical results of segmented sentence length of Japanese academic articles
| Period | Character number | Segmented sentence number | Segmented sentence length |
|---|---|---|---|
| 1981-1985 | 1585188 | 206674 | 7.67 |
| 1986-1990 | 1733027 | 225949 | 7.67 |
| 1991-1995 | 1881979 | 304527 | 6.18 |
| 1996-2000 | 2032242 | 347392 | 5.85 |
| 2001-2005 | 2245315 | 418122 | 5.37 |
| 2006-2010 | 2487462 | 515002 | 4.83 |
| 2011-2015 | 2723730 | 599941 | 4.54 |
| 2016-2020 | 2968347 | 711834 | 4.17 |
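
The table above does not state how segmented sentences were delimited. As a rough sketch, assuming a segmented sentence is a stretch of text between punctuation marks such as 、 and 。, the mean segment length could be computed as follows. The punctuation set, the sample sentence, and the function names are all assumptions made for illustration. Note also that the tabulated values equal total characters divided by segment count, which may count punctuation, whereas this sketch excludes it.

```python
import re

# Assumed clause delimiters; the article's actual set is not given here.
DELIMITERS = r"[、。，．！？]"


def split_segments(text):
    """Split text at the assumed delimiters and drop empty pieces."""
    return [seg for seg in re.split(DELIMITERS, text) if seg.strip()]


def mean_segment_length(text):
    """Mean number of characters per segment (punctuation excluded)."""
    segs = split_segments(text)
    if not segs:
        raise ValueError("text contains no segments")
    return sum(len(seg) for seg in segs) / len(segs)


# Hypothetical sample sentence, not taken from the corpus.
sample = "本研究は、日本語の文体を分析し、その変化を論じる。"
sample_segments = split_segments(sample)
print(sample_segments)
print(mean_segment_length(sample))
```
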
Statistical results of type-token ratio of Japanese academic articles of 1981-2020
| Period | Total word number | Type number | Type-token ratio |
|---|---|---|---|
| 1981-1985 | 864852 | 24109 | 35.8726 |
| 1986-1990 | 931985 | 64061 | 14.5484 |
| 1991-1995 | 997286 | 63524 | 15.6994 |
| 1996-2000 | 1086470 | 70239 | 15.4682 |
| 2001-2005 | 1187746 | 171101 | 6.9418 |
| 2006-2010 | 1298868 | 171165 | 7.5884 |
| 2011-2015 | 1406522 | 87635 | 16.0498 |
| 2016-2020 | 1521683 | 100517 | 15.1386 |
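
The values in the type-token ratio column above match the total word number divided by the type number (tokens per type), not types divided by tokens. The sketch below checks that reading against two rows of the table. It is an observation about the tabulated numbers, not a statement of the author's formula, and the function name is illustrative.

```python
# (total words, type number, tabulated ratio) copied from the table above.
ttr_rows = {
    "1981-1985": (864852, 24109, 35.8726),
    "1986-1990": (931985, 64061, 14.5484),
}


def tokens_per_type(total_words, type_number):
    """Ratio as it appears in the table: tokens divided by types."""
    return total_words / type_number


for period, (words, types, tabulated) in ttr_rows.items():
    computed = tokens_per_type(words, types)
    # Agreement to 4 decimals supports the tokens/types reading.
    print(period, round(computed, 4), tabulated)
```

Under this reading, a lower value means more distinct words per running word, so the 2001-2005 and 2006-2010 rows show the most varied vocabulary.
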
Word density of Japanese academic articles of 1981-2020
| Period | Real word | Total word number | Word density |
|---|---|---|---|
| 1981-1985 | 751470 | 864852 | 0.8689 |
| 1986-1990 | 811107 | 931985 | 0.8703 |
| 1991-1995 | 791446 | 997286 | 0.7936 |
| 1996-2000 | 941209 | 1086470 | 0.8663 |
| 2001-2005 | 1034646 | 1187746 | 0.8711 |
| 2006-2010 | 1081308 | 1298868 | 0.8325 |
| 2011-2015 | 1159677 | 1406522 | 0.8245 |
| 2016-2020 | 1292213 | 1521683 | 0.8492 |
Distribution of word length
| Period | Monosyllable frequency | Two-syllable frequency | Trisyllable frequency | Four-syllable frequency | Above-four-syllable frequency |
|---|---|---|---|---|---|
| 1981-1985 | 70.62% | 27.36% | 1.26% | 0.64% | 0.12% |
| 1986-1990 | 68.52% | 26.92% | 2.73% | 1.29% | 0.54% |
| 1991-1995 | 67.05% | 26.79% | 2.76% | 2.06% | 1.34% |
| 1996-2000 | 66.93% | 26.84% | 3.37% | 2.21% | 0.65% |
| 2001-2005 | 66.65% | 26.68% | 3.73% | 2.38% | 0.56% |
| 2006-2010 | 64.44% | 25.52% | 4.24% | 4.27% | 1.53% |
| 2011-2015 | 63.79% | 24.92% | 5.36% | 4.84% | 1.09% |
| 2016-2020 | 62.78% | 23.55% | 5.96% | 6.03% | 1.68% |
Character and sentence number and average sentence length of Japanese academic articles
| Period | Character number | Sentence number | Average sentence length |
|---|---|---|---|
| 1981-1985 | 1585188 | 36374 | 43.58 |
| 1986-1990 | 1733027 | 39677 | 43.68 |
| 1991-1995 | 1881979 | 42860 | 43.91 |
| 1996-2000 | 2032242 | 44362 | 45.81 |
| 2001-2005 | 2245315 | 48349 | 46.44 |
| 2006-2010 | 2487462 | 53242 | 46.72 |
| 2011-2015 | 2723730 | 56522 | 48.19 |
| 2016-2020 | 2968347 | 60249 | 49.27 |
Statistical results of single occurrence word of Japanese academic articles of 1981-2020
| Period | Single occurrence word number | Total word number | Accumulative frequency |
|---|---|---|---|
| 1981-1985 | 10724 | 864852 | 0.0124 |
| 1986-1990 | 19292 | 931985 | 0.0207 |
| 1991-1995 | 25431 | 997286 | 0.0255 |
| 1996-2000 | 31073 | 1086470 | 0.0286 |
| 2001-2005 | 106185 | 1187746 | 0.0894 |
| 2006-2010 | 93909 | 1298868 | 0.0723 |
| 2011-2015 | 69763 | 1406522 | 0.0496 |
| 2016-2020 | 76388 | 1521683 | 0.0502 |
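
The frequency column above equals the number of single occurrence words divided by the total word number. The sketch below shows one way such a count could be obtained from an already segmented token list. The token list is a hypothetical toy example and `hapax_stats` is an illustrative name. The last lines recheck the 1981-1985 row from the table.

```python
from collections import Counter


def hapax_stats(tokens):
    """Return (number of words occurring exactly once, their share of all tokens)."""
    if not tokens:
        raise ValueError("token list is empty")
    counts = Counter(tokens)
    hapax = sum(1 for c in counts.values() if c == 1)
    return hapax, hapax / len(tokens)


# Hypothetical segmented sentence; only "研究" occurs more than once.
toy_tokens = ["研究", "は", "言語", "の", "研究", "で", "ある"]
toy_hapax, toy_share = hapax_stats(toy_tokens)
print(toy_hapax, toy_share)

# Recheck of the 1981-1985 row: 10724 single occurrence words, 864852 words.
row_1981_1985 = round(10724 / 864852, 4)
print(row_1981_1985)
```
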
Lexical richness of Japanese academic articles of 1981-2020
| Period | Word density | Type-token ratio | Single occurrence word frequency |
|---|---|---|---|
| 1981-1985 | 0.8689 | 35.8726 | 0.0124 |
| 1986-1990 | 0.8703 | 14.5484 | 0.0207 |
| 1991-1995 | 0.7936 | 15.6994 | 0.0255 |
| 1996-2000 | 0.8663 | 15.4682 | 0.0286 |
| 2001-2005 | 0.8711 | 6.9418 | 0.0894 |
| 2006-2010 | 0.8325 | 7.5884 | 0.0723 |
| 2011-2015 | 0.8245 | 16.0498 | 0.0496 |
| 2016-2020 | 0.8492 | 15.1386 | 0.0502 |
