Open Access

A Study on the Evolution of Language Style in Japanese Academic Articles Based on Text Mining

  
17 March 2025

Distribution of sentence length of Japanese academic articles

| Period | Item | 1-15 | 16-30 | 31-45 | 46-60 | 61-75 | >75 | Total |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1981-1985 | Sentence number | 11876 | 8250 | 6144 | 4601 | 3368 | 2135 | 36374 |
| 1981-1985 | Proportion | 32.65% | 22.68% | 16.89% | 12.65% | 9.26% | 5.87% | 100% |
| 1986-1990 | Sentence number | 12530 | 9344 | 6642 | 5170 | 3551 | 2440 | 39677 |
| 1986-1990 | Proportion | 31.58% | 23.55% | 16.74% | 13.03% | 8.95% | 6.15% | 100% |
| 1991-1995 | Sentence number | 13779 | 9841 | 7338 | 5675 | 3630 | 2597 | 42860 |
| 1991-1995 | Proportion | 32.15% | 22.96% | 17.12% | 13.24% | 8.47% | 6.06% | 100% |
| 1996-2000 | Sentence number | 13357 | 11024 | 7080 | 6237 | 4330 | 2334 | 44362 |
| 1996-2000 | Proportion | 30.11% | 24.85% | 15.96% | 14.06% | 9.76% | 5.26% | 100% |
| 2001-2005 | Sentence number | 13577 | 10327 | 8490 | 8142 | 5647 | 2166 | 48349 |
| 2001-2005 | Proportion | 28.08% | 21.36% | 17.56% | 16.84% | 11.68% | 4.48% | 100% |
| 2006-2010 | Sentence number | 14343 | 10728 | 9594 | 8796 | 5873 | 3908 | 53242 |
| 2006-2010 | Proportion | 26.94% | 20.15% | 18.02% | 16.52% | 11.03% | 7.34% | 100% |
| 2011-2015 | Sentence number | 16058 | 11881 | 9756 | 8608 | 5850 | 4369 | 56522 |
| 2011-2015 | Proportion | 28.41% | 21.02% | 17.26% | 15.23% | 10.35% | 7.73% | 100% |
| 2016-2020 | Sentence number | 17761 | 13345 | 10628 | 9664 | 6236 | 2615 | 60249 |
| 2016-2020 | Proportion | 29.48% | 22.15% | 17.64% | 16.04% | 10.35% | 4.34% | 100% |
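
The proportions in this table follow directly from the bin counts. The sketch below shows one way to bin sentence lengths and recompute the 1981-1985 proportions. It assumes that sentence length is measured in characters and that each bin includes its upper bound; the names `BIN_UPPER`, `bin_sentence_lengths` and `proportions` are illustrative and do not come from the article.

```python
from bisect import bisect_left

# Upper bounds of the first five bins; anything longer falls into ">75".
BIN_UPPER = [15, 30, 45, 60, 75]
BIN_LABELS = ["1-15", "16-30", "31-45", "46-60", "61-75", ">75"]


def bin_sentence_lengths(lengths):
    """Count how many sentence lengths fall into each bin (upper bound inclusive)."""
    counts = [0] * len(BIN_LABELS)
    for n in lengths:
        counts[bisect_left(BIN_UPPER, n)] += 1
    return counts


def proportions(counts):
    """Return each bin's share of the total, in percent, rounded to 2 decimals."""
    total = sum(counts)
    return [round(100 * c / total, 2) for c in counts]


# Bin counts for 1981-1985, copied from the table above.
counts_1981 = [11876, 8250, 6144, 4601, 3368, 2135]
shares_1981 = proportions(counts_1981)

for label, share in zip(BIN_LABELS, shares_1981):
    print(f"{label}: {share}%")
```

Applying `proportions` to any other period's counts gives a quick consistency check on that row.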

ALW and DLW of Japanese academic articles from 1981 to 2020

| Period | Total character number | Total word number | Average length of word (ALW) | Dispersion length of word (DLW) |
| --- | --- | --- | --- | --- |
| 1981-1985 | 1585188 | 864852 | 1.8329 | 0.358 |
| 1986-1990 | 1733027 | 931985 | 1.8595 | 0.362 |
| 1991-1995 | 1881979 | 997286 | 1.8871 | 0.346 |
| 1996-2000 | 2032242 | 1086470 | 1.8705 | 0.351 |
| 2001-2005 | 2245315 | 1187746 | 1.8904 | 0.338 |
| 2006-2010 | 2487462 | 1298868 | 1.9151 | 0.345 |
| 2011-2015 | 2723730 | 1406522 | 1.9365 | 0.346 |
| 2016-2020 | 2968347 | 1521683 | 1.9507 | 0.346 |
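
The ALW column is consistent with total characters divided by total words. The sketch below recomputes it from the two count columns; the names `ALW_ROWS` and `average_word_length` are illustrative. The DLW column is not recomputed, because this section does not state how the dispersion is defined.

```python
# (total characters, total words, reported ALW) per period, from the table above.
ALW_ROWS = {
    "1981-1985": (1585188, 864852, 1.8329),
    "1986-1990": (1733027, 931985, 1.8595),
    "1991-1995": (1881979, 997286, 1.8871),
    "1996-2000": (2032242, 1086470, 1.8705),
    "2001-2005": (2245315, 1187746, 1.8904),
    "2006-2010": (2487462, 1298868, 1.9151),
    "2011-2015": (2723730, 1406522, 1.9365),
    "2016-2020": (2968347, 1521683, 1.9507),
}


def average_word_length(total_chars, total_words):
    """Average word length in characters per word."""
    return total_chars / total_words


# Recompute ALW for every period, rounded like the table (4 decimals).
recomputed_alw = {
    period: round(average_word_length(chars, words), 4)
    for period, (chars, words, _reported) in ALW_ROWS.items()
}

for period, value in recomputed_alw.items():
    print(period, value, "reported:", ALW_ROWS[period][2])
```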

Statistical results of segmented sentence length of Japanese academic articles

| Period | Character number | Segmented sentence number | Segmented sentence length |
| --- | --- | --- | --- |
| 1981-1985 | 1585188 | 206674 | 7.67 |
| 1986-1990 | 1733027 | 225949 | 7.67 |
| 1991-1995 | 1881979 | 304527 | 6.18 |
| 1996-2000 | 2032242 | 347392 | 5.85 |
| 2001-2005 | 2245315 | 418122 | 5.37 |
| 2006-2010 | 2487462 | 515002 | 4.83 |
| 2011-2015 | 2723730 | 599941 | 4.54 |
| 2016-2020 | 2968347 | 711834 | 4.17 |

Statistical results of type-token ratio of Japanese academic articles of 1981-2020

| Period | Total word number | Type number | Type-token ratio (tokens per type) |
| --- | --- | --- | --- |
| 1981-1985 | 864852 | 24109 | 35.8726 |
| 1986-1990 | 931985 | 64061 | 14.5484 |
| 1991-1995 | 997286 | 63524 | 15.6994 |
| 1996-2000 | 1086470 | 70239 | 15.4682 |
| 2001-2005 | 1187746 | 171101 | 6.9418 |
| 2006-2010 | 1298868 | 171165 | 7.5884 |
| 2011-2015 | 1406522 | 87635 | 16.0498 |
| 2016-2020 | 1521683 | 100517 | 15.1386 |
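
The values in the ratio column equal total words divided by the number of types (for example, 864852 / 24109 ≈ 35.87), that is, tokens per type. The conventional type-token ratio is the reciprocal, types divided by tokens. The sketch below computes both for a token list; the function names and the toy sample are illustrative, and tokenization is assumed to have been done already.

```python
def tokens_per_type(tokens):
    """Tokens divided by distinct types: the quantity tabulated above."""
    return len(tokens) / len(set(tokens))


def type_token_ratio(tokens):
    """Conventional type-token ratio: distinct types divided by tokens."""
    return len(set(tokens)) / len(tokens)


# Toy sample (hypothetical): 5 tokens, 3 distinct types.
sample_tokens = ["研究", "の", "方法", "の", "研究"]
print(tokens_per_type(sample_tokens))
print(type_token_ratio(sample_tokens))

# Consistency check against the 1981-1985 row of the table.
ratio_1981 = 864852 / 24109
print(round(ratio_1981, 4))
```

Under this reading, a lower value in the table means more distinct words per running word, so 2001-2010 shows the most varied vocabulary.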

Word density of Japanese academic articles of 1981-2020

| Period | Real word number | Total word number | Word density |
| --- | --- | --- | --- |
| 1981-1985 | 751470 | 864852 | 0.8689 |
| 1986-1990 | 811107 | 931985 | 0.8703 |
| 1991-1995 | 791446 | 997286 | 0.7936 |
| 1996-2000 | 941209 | 1086470 | 0.8663 |
| 2001-2005 | 1034646 | 1187746 | 0.8711 |
| 2006-2010 | 1081308 | 1298868 | 0.8325 |
| 2011-2015 | 1159677 | 1406522 | 0.8245 |
| 2016-2020 | 1292213 | 1521683 | 0.8492 |
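
Word density in this table equals the number of real words divided by the total number of words. The sketch below computes it from part-of-speech-tagged tokens. It assumes that "real word" denotes content words, and the tag set `CONTENT_POS`, the tag names, and the toy sample are all hypothetical; the article's own criterion is not given in this section.

```python
# Hypothetical tag set: which parts of speech count as "real" (content) words
# is an assumption made for illustration only.
CONTENT_POS = {"noun", "verb", "adjective", "adverb"}


def word_density(tagged_tokens, content_pos=CONTENT_POS):
    """Share of tokens whose part-of-speech tag is in the content-word set."""
    real = sum(1 for _word, pos in tagged_tokens if pos in content_pos)
    return real / len(tagged_tokens)


# Toy sample (hypothetical tags): 3 content words out of 5 tokens.
tagged_sample = [
    ("言語", "noun"),
    ("の", "particle"),
    ("変化", "noun"),
    ("を", "particle"),
    ("調べる", "verb"),
]
sample_density = word_density(tagged_sample)
print(sample_density)

# Consistency check against the 1981-1985 row: real words / total words.
density_1981 = 751470 / 864852
print(round(density_1981, 4))
```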

Distribution of word length

| Period | Monosyllable frequency | Two-syllable frequency | Trisyllable frequency | Four-syllable frequency | Above-four-syllable frequency |
| --- | --- | --- | --- | --- | --- |
| 1981-1985 | 70.62% | 27.36% | 1.26% | 0.64% | 0.12% |
| 1986-1990 | 68.52% | 26.92% | 2.73% | 1.29% | 0.54% |
| 1991-1995 | 67.05% | 26.79% | 2.76% | 2.06% | 1.34% |
| 1996-2000 | 66.93% | 26.84% | 3.37% | 2.21% | 0.65% |
| 2001-2005 | 66.65% | 26.68% | 3.73% | 2.38% | 0.56% |
| 2006-2010 | 64.44% | 25.52% | 4.24% | 4.27% | 1.53% |
| 2011-2015 | 63.79% | 24.92% | 5.36% | 4.84% | 1.09% |
| 2016-2020 | 62.78% | 23.55% | 5.96% | 6.03% | 1.68% |

Character number, sentence number, and average sentence length of Japanese academic articles

| Period | Character number | Sentence number | Average sentence length (characters) |
| --- | --- | --- | --- |
| 1981-1985 | 1585188 | 36374 | 43.58 |
| 1986-1990 | 1733027 | 39677 | 43.68 |
| 1991-1995 | 1881979 | 42860 | 43.91 |
| 1996-2000 | 2032242 | 44362 | 45.81 |
| 2001-2005 | 2245315 | 48349 | 46.44 |
| 2006-2010 | 2487462 | 53242 | 46.72 |
| 2011-2015 | 2723730 | 56522 | 48.19 |
| 2016-2020 | 2968347 | 60249 | 49.27 |
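
Both the average sentence length here and the segmented sentence length reported earlier are consistent with the character count divided by the number of units (sentences or segmented sentences). The sketch below recomputes both for the first and last periods; the names `LENGTH_ROWS` and `mean_unit_length` are illustrative.

```python
# period: (characters, sentences, segmented sentences), copied from the two tables.
LENGTH_ROWS = {
    "1981-1985": (1585188, 36374, 206674),
    "2016-2020": (2968347, 60249, 711834),
}


def mean_unit_length(char_count, unit_count):
    """Average number of characters per unit, rounded to 2 decimals."""
    return round(char_count / unit_count, 2)


# For each period: (average sentence length, average segmented sentence length).
lengths = {
    period: (mean_unit_length(chars, sents), mean_unit_length(chars, segs))
    for period, (chars, sents, segs) in LENGTH_ROWS.items()
}

for period, (sentence_len, segment_len) in lengths.items():
    print(period, "sentence:", sentence_len, "segment:", segment_len)
```

Read together, the two columns move in opposite directions: full sentences grow longer over the period while the segments within them grow shorter.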

Statistical results of single-occurrence words in Japanese academic articles of 1981-2020

| Period | Single-occurrence word number | Total word number | Cumulative frequency |
| --- | --- | --- | --- |
| 1981-1985 | 10724 | 864852 | 0.0124 |
| 1986-1990 | 19292 | 931985 | 0.0207 |
| 1991-1995 | 25431 | 997286 | 0.0255 |
| 1996-2000 | 31073 | 1086470 | 0.0286 |
| 2001-2005 | 106185 | 1187746 | 0.0894 |
| 2006-2010 | 93909 | 1298868 | 0.0723 |
| 2011-2015 | 69763 | 1406522 | 0.0496 |
| 2016-2020 | 76388 | 1521683 | 0.0502 |
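
The frequency column equals the number of single-occurrence words divided by the total number of words. The sketch below counts words that occur exactly once in a token list and computes that share; the function name and the toy sample are illustrative, and tokenization is assumed to have been done already.

```python
from collections import Counter


def single_occurrence_frequency(tokens):
    """Share of running words accounted for by words that occur exactly once."""
    counts = Counter(tokens)
    singles = sum(1 for c in counts.values() if c == 1)
    return singles / len(tokens)


# Toy sample (hypothetical): 7 tokens, 3 of which occur only once.
sample_tokens = ["文体", "の", "変化", "と", "語彙", "の", "変化"]
sample_freq = single_occurrence_frequency(sample_tokens)
print(sample_freq)

# Consistency check against the 1981-1985 row of the table.
freq_1981 = 10724 / 864852
print(round(freq_1981, 4))
```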

Lexical richness of Japanese academic articles of 1981-2020

| Period | Word density | Type-token ratio (tokens per type) | Single-occurrence word frequency |
| --- | --- | --- | --- |
| 1981-1985 | 0.8689 | 35.8726 | 0.0124 |
| 1986-1990 | 0.8703 | 14.5484 | 0.0207 |
| 1991-1995 | 0.7936 | 15.6994 | 0.0255 |
| 1996-2000 | 0.8663 | 15.4682 | 0.0286 |
| 2001-2005 | 0.8711 | 6.9418 | 0.0894 |
| 2006-2010 | 0.8325 | 7.5884 | 0.0723 |
| 2011-2015 | 0.8245 | 16.0498 | 0.0496 |
| 2016-2020 | 0.8492 | 15.1386 | 0.0502 |