Open Access

Assessment of Science and Technology Innovation Capability of Digital Creative Industries Based on Empirical Analysis of Inputs and Outputs

  
21 Mar 2025


Introduction

The digital creative industry is a collection of industries supported by digital technology, with creative content and services as the core output. It covers a wide range of fields, including digital film and television, music, games, animation, publishing, art design, and advertising. With the accelerating integration of emerging technologies such as artificial intelligence, virtual reality, and blockchain into the creative industry, the global value chain of the digital creative industry has become increasingly complex and dynamic [1-4]. Enterprises have used digital technology to integrate global creative resources, accelerating the expansion and upgrading of the value chain [5]. An in-depth understanding of the structural characteristics and evolutionary laws of this global value chain is of great significance for promoting the high-quality development of the industry and enhancing the international competitiveness of enterprises [6-7].

The rapid development of artificial intelligence, cloud computing, the Internet of Things, 5G, blockchain, and the metaverse is driving the transformation and growth of digital creative industries [8]. The world is undergoing changes unseen in a century. Developed countries attach great importance to digital creative industries and seek to control the emerging global value chain of these industries, as they have done with traditional industries. At the same time, the Chinese government is also vigorously cultivating and developing digital creative industries [9-11].

In the 13th Five-Year Plan, the state for the first time listed the digital creative industry as one of the five key industries to be cultivated, describing it as one of the areas that “represents the direction of the new round of scientific and technological revolution and industrial change, and is the key area to cultivate new kinetic energy for development and to gain a new competitive advantage in the future”. The Plan proposes to “promote the vigorous development of digital creative industries and create and lead new consumption”, and sets specific goals, namely “to promote the accelerated development of cultural creativity and innovative design and other industries with digital technology and advanced concepts, and to promote the in-depth fusion of culture and science and technology as well as the interpenetration of related industries” [12-15], and “by 2020, a digital creative industry development pattern with cultural leadership, advanced technology and complete chain will be formed, and the output value of related industries will reach 8 trillion yuan.” After the release of the Plan, digital creative industries and design industries in major cities developed to varying degrees in 2016 [16-18].

The development of creative industries increasingly depends on the support of new technologies, and the integration of cultural and creative content with modern science and technology is carrying creative industries into a new stage of development. The design industry strengthens the content support, creativity, and design enhancement that culture brings to the information industry and accelerates two-way deep integration between industries. Major cities therefore cultivate the design industry as an integrated application that promotes the two-way integration of culture with science and technology [19-21].

This paper divides the assessment indexes of the scientific and technological innovation capacity of the digital creative industry into input indexes and output indexes, establishes the corresponding assessment index system, and on this basis proposes a combined factor analysis and DEA method to study that capacity in depth. Factor analysis extracts the common factors of the indexes in the assessment system: the correlation coefficient matrix is constructed and solved, the number of common factors is determined, and the dimensions that explain the independent variables well are retained as common factors. With the common factors as input data, a three-stage DEA model serves as the main assessment method. In the first stage, the efficiency of digital creative industries is measured by the EBM model under constant returns to scale and input orientation. In the second stage, the first-stage slack variables are regressed using SFA and the input variables are adjusted. In the third stage, the efficiency of each decision-making unit is measured again with the adjusted input-output variables, which eliminates the influence of environmental and random factors. Based on the digital creative industry data in the China Economic Survey Yearbook (2022), indicators X1-X11 with correlation coefficients greater than 0.3 are selected, the KMO and Bartlett tests are conducted, the common factors are extracted, and DEA is applied to analyze the scientific and technological innovation capacity of the digital creative industry.

Indicator System for Assessing the Scientific and Technological Innovation Capability of Digital Creative Industries

The digital creative industry is an innovative industry based on digital and Internet technology; its essence is to combine culture, creativity, and technology to achieve industrial upgrading and transformation. Compared with the traditional creative industry, the digital creative industry is more innovative, interactive, and personalized, so it can better meet the needs and preferences of consumers and represents a new direction for the industry's future development. Evaluating the scientific and technological innovation capability of digital creative industries not only helps digital creative enterprises enhance their competitiveness, but also promotes industrial upgrading and optimizes industrial configuration. Based on this, this paper proposes a system of indicators for assessing the scientific and technological innovation capacity of digital creative industries from the perspective of inputs and outputs, in preparation for the subsequent in-depth analysis.

The assessment indexes of the scientific and technological innovation capacity of the digital creative industry are selected for scientific soundness and applicability, as shown in Table 1. The final evaluation index system comprises input indexes and output indexes. The input indexes are R&D expense input, full-year equivalents of R&D personnel, enterprise management costs, number of enterprises in the system, and per capita public financial expenditure on education. The output indexes are contract turnover, number of patent applications and authorizations, growth rate of enterprise industrial output value, profit rate of enterprise business income, growth rate of enterprise technology income, and growth rate of enterprise product sales income.

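Table 1. The index system of the innovation ability of digital creative industry

| Primary indicator | Secondary indicator | Code number |
| --- | --- | --- |
| Technology innovation input | R&D expenses investment | X1 |
| Technology innovation input | R&D personnel annual equivalent | X2 |
| Technology innovation input | Enterprise management cost | X3 |
| Technology innovation input | Number of enterprises in the system | X4 |
| Technology innovation input | Per capita public financial expenditure on education | X5 |
| Technology innovation output | Contract volume | X6 |
| Technology innovation output | Number of patent application authorizations | X7 |
| Technology innovation output | Growth rate of enterprise industrial output value | X8 |
| Technology innovation output | Profit rate of enterprise operating income | X9 |
| Technology innovation output | Growth rate of enterprise technology income | X10 |
| Technology innovation output | Growth rate of enterprise product sales revenue | X11 |
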
Assessment Methods of Scientific and Technological Innovation Capability of Digital Creative Industries

This paper combines factor analysis with DEA to analyze empirically the scientific and technological innovation capacity of digital creative industries [22].

Factor Analysis
Raw data and correlation matrix

In essence, studying an object by factor analysis means studying the latent relationships among its attributes. In geology, for example, the study of the tectonic processes of a particular ore body can be viewed as a study of the relationships among the contents of its m chemical elements, such as Cu, Pb, Zn, and Ag. These contents can all be regarded as random variables, which we study through sample values.

The raw data are the sample values. Consider two random variables x and y, representing two attributes A and B, whose content values are measured on n specimens:
\[\vec{x}=(x_{1},x_{2},\cdots ,x_{n})\]
\[\vec{y}=(y_{1},y_{2},\cdots ,y_{n})\]

The samples are first standardized. The mean and variance are calculated according to the following formulas:
\[\bar{x}=\frac{1}{n}\sum\limits_{i=1}^{n}{x_{i}},\quad \bar{y}=\frac{1}{n}\sum\limits_{i=1}^{n}{y_{i}}\]
\[\sigma _{x}^{2}=\frac{1}{n}\sum\limits_{i=1}^{n}{(x_{i}-\bar{x})^{2}}\]
\[\sigma _{y}^{2}=\frac{1}{n}\sum\limits_{i=1}^{n}{(y_{i}-\bar{y})^{2}}\]

Then let:
\[x_{i}^{\prime }=\frac{x_{i}-\bar{x}}{\sigma _{x}},\quad y_{i}^{\prime }=\frac{y_{i}-\bar{y}}{\sigma _{y}},\quad i=1,2,\cdots ,n\]

The standardized sample satisfies the following conditions:
\[\bar{x}^{\prime }=\frac{1}{n}\sum\limits_{i=1}^{n}{x_{i}^{\prime }}=0,\quad \bar{y}^{\prime }=\frac{1}{n}\sum\limits_{i=1}^{n}{y_{i}^{\prime }}=0\]
\[\sigma _{x^{\prime }}^{2}=\frac{1}{n}\sum\limits_{i=1}^{n}{x_{i}^{\prime 2}}=1,\quad \sigma _{y^{\prime }}^{2}=\frac{1}{n}\sum\limits_{i=1}^{n}{y_{i}^{\prime 2}}=1\]
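The standardization step can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the sample values and the helper name `standardize` are illustrative and do not come from the paper. Note that the variance uses the divisor n, as in the formulas above, not n - 1.

```python
import numpy as np

def standardize(values):
    """Centre a sample and scale it by the population standard deviation (divisor n)."""
    values = np.asarray(values, dtype=float)
    mean = values.mean()
    sigma = np.sqrt(((values - mean) ** 2).mean())  # 1/n, not 1/(n - 1)
    return (values - mean) / sigma

# Illustrative sample (hypothetical values, not data from the paper).
x = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
x_std = standardize(x)

print(x_std)                  # standardized values
print(x_std.mean())           # sample mean after standardization
print((x_std ** 2).mean())    # sample variance after standardization
```

After this transformation the sample mean is zero and the mean of squares is one, which is exactly the pair of conditions stated above.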

Here, $\vec{x}$ and $\vec{y}$ still denote the samples after standardization. Their variances and correlation coefficient are calculated according to the following formulas:
\[\left\{ \begin{matrix} \sigma _{x}^{2}=\frac{1}{n}\sum\limits_{i=1}^{n}{x_{i}^{2}}=\frac{1}{n}{\vec{x}}^{\prime }\vec{x}=1 \\ \sigma _{y}^{2}=\frac{1}{n}\sum\limits_{i=1}^{n}{y_{i}^{2}}=\frac{1}{n}{\vec{y}}^{\prime }\vec{y}=1 \\ r_{xy}=\frac{1}{n}\sum\limits_{i=1}^{n}{x_{i}y_{i}}=\frac{1}{n}{\vec{x}}^{\prime }\vec{y} \\ \end{matrix} \right.\]

When the random variables $\vec{x}$ and $\vec{y}$ are uncorrelated, $r_{xy}=0$. Algebraically, this means that their inner product ${\vec{x}}^{\prime }\vec{y}$ is zero; geometrically, the two vectors are orthogonal.

For n samples with m variables each, the original data matrix is as follows:
\[X=\left[ \begin{matrix} x_{11} & x_{12} & \cdots & x_{1m} \\ x_{21} & x_{22} & \cdots & x_{2m} \\ \cdots & \cdots & \cdots & \cdots \\ x_{n1} & x_{n2} & \cdots & x_{nm} \\ \end{matrix} \right]=\left[ \begin{matrix} {\vec{x}}_{1},{\vec{x}}_{2},\cdots ,{\vec{x}}_{m} \\ \end{matrix} \right]\]

Each column vector on the right-hand side of the equation is:
\[{\vec{x}}_{j}={(x_{1j},x_{2j},\cdots ,x_{nj})}^{\prime },\quad j=1,2,\cdots ,m\]

The observations of the j-th variable on the n samples can be viewed as a point or vector in n-dimensional Euclidean space, denoted here by ${\vec{x}}_{j}$. The relationship between the original variables is studied by examining the positional relationship of these m points or vectors.

If the sample data are standardized, i.e., X is a standardized matrix, then:
\[{\bar{x}}_{j}=\frac{1}{n}\sum\limits_{i=1}^{n}{x_{ij}}=0\]
\[\sigma _{j}^{2}=\frac{1}{n}\sum\limits_{i=1}^{n}{x_{ij}^{2}}=\frac{1}{n}\vec{x}_{j}^{\prime }{\vec{x}}_{j}=1,\quad j=1,2,\cdots ,m\]

Then the correlation coefficient between ${\vec{x}}_{j}$ and ${\vec{x}}_{k}$ is:
\[r_{jk}=\frac{1}{n}\sum\limits_{i=1}^{n}{x_{ij}x_{ik}}=\frac{1}{n}\vec{x}_{j}^{\prime }{\vec{x}}_{k},\quad j,k=1,2,\cdots ,m\]

The correlation coefficient matrix R consists of the pairwise correlation coefficients between the m variables [23]:
\[R=\left[ \begin{matrix} r_{11} & r_{12} & \cdots & r_{1m} \\ r_{21} & r_{22} & \cdots & r_{2m} \\ \cdots & \cdots & \cdots & \cdots \\ r_{m1} & r_{m2} & \cdots & r_{mm} \\ \end{matrix} \right]=\frac{1}{n}{X}^{\prime }X\]

The correlation coefficient matrix R is symmetric and at least positive semidefinite, which means that all of its eigenvalues are non-negative.
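As a quick numerical illustration of these properties, the sketch below builds R from a standardized data matrix and checks symmetry, the unit diagonal, and the sign of the eigenvalues. It assumes NumPy; the synthetic data and variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 50, 4

# Synthetic raw data (illustrative, not the paper's data); column 1 is
# deliberately correlated with column 0.
raw = rng.normal(size=(n, m))
raw[:, 1] += 0.8 * raw[:, 0]

# Standardize each column with the 1/n variance (np.std uses ddof=0 by default).
X = (raw - raw.mean(axis=0)) / raw.std(axis=0)

# Correlation matrix as (1/n) X'X.
R = X.T @ X / n

# eigvalsh is intended for symmetric matrices and returns real eigenvalues.
eigenvalues = np.linalg.eigvalsh(R)
print(np.round(R, 3))
print(np.round(eigenvalues, 3))
```

The result agrees with `np.corrcoef(raw, rowvar=False)`, since the correlation coefficient does not depend on whether the variance divisor is n or n - 1.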

The correlation coefficient matrix is the starting point of the factor analysis method, and an important part of factor analysis is to study the structure of the correlation matrix. Factor analysis also often involves the correlation coefficient matrix between two groups of variables. Assume that, in addition to the previous m random variables, there are another p random variables, with the data matrix:
\[Y=\left[ \begin{matrix} y_{11} & y_{12} & \cdots & y_{1p} \\ y_{21} & y_{22} & \cdots & y_{2p} \\ \cdots & \cdots & \cdots & \cdots \\ y_{n1} & y_{n2} & \cdots & y_{np} \\ \end{matrix} \right]=\left[ \begin{matrix} {\vec{y}}_{1},{\vec{y}}_{2},\cdots ,{\vec{y}}_{p} \\ \end{matrix} \right]\]

Assuming that all data are standardized, the correlation coefficient between ${\vec{y}}_{k}$ and ${\vec{x}}_{j}$ from equation (16) is:
\[S_{kj}=\frac{1}{n}\vec{y}_{k}^{\prime }{\vec{x}}_{j},\quad k=1,2,\cdots ,p;\; j=1,2,\cdots ,m\]

Written in matrix form:
\[\begin{aligned} S_{p\times m} &=\left[ \begin{matrix} S_{11} & S_{12} & \cdots & S_{1m} \\ S_{21} & S_{22} & \cdots & S_{2m} \\ \cdots & \cdots & \cdots & \cdots \\ S_{p1} & S_{p2} & \cdots & S_{pm} \\ \end{matrix} \right] \\ &=\left[ \begin{matrix} \frac{1}{n}\vec{y}_{1}^{\prime }{\vec{x}}_{1} & \frac{1}{n}\vec{y}_{1}^{\prime }{\vec{x}}_{2} & \cdots & \frac{1}{n}\vec{y}_{1}^{\prime }{\vec{x}}_{m} \\ \frac{1}{n}\vec{y}_{2}^{\prime }{\vec{x}}_{1} & \frac{1}{n}\vec{y}_{2}^{\prime }{\vec{x}}_{2} & \cdots & \frac{1}{n}\vec{y}_{2}^{\prime }{\vec{x}}_{m} \\ \cdots & \cdots & \cdots & \cdots \\ \frac{1}{n}\vec{y}_{p}^{\prime }{\vec{x}}_{1} & \frac{1}{n}\vec{y}_{p}^{\prime }{\vec{x}}_{2} & \cdots & \frac{1}{n}\vec{y}_{p}^{\prime }{\vec{x}}_{m} \\ \end{matrix} \right] \\ &=\frac{1}{n}\left[ \begin{matrix} \vec{y}_{1}^{\prime } \\ \vec{y}_{2}^{\prime } \\ \vdots \\ \vec{y}_{p}^{\prime } \\ \end{matrix} \right]\left[ \begin{matrix} {\vec{x}}_{1},{\vec{x}}_{2},\cdots ,{\vec{x}}_{m} \\ \end{matrix} \right] \\ &=\frac{1}{n}{Y}^{\prime }X \end{aligned}\]
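The cross-correlation matrix between two groups of standardized variables can be computed in one line. The sketch below assumes NumPy; the helper `standardize_columns` and the synthetic data are hypothetical and serve only to show the shapes involved (Y is n x p, X is n x m, S is p x m).

```python
import numpy as np

def standardize_columns(data):
    """Standardize each column using the 1/n variance."""
    data = np.asarray(data, dtype=float)
    return (data - data.mean(axis=0)) / data.std(axis=0)

rng = np.random.default_rng(1)
n = 200

# m = 3 variables in the first group, p = 2 in the second (synthetic data).
x_raw = rng.normal(size=(n, 3))
y_raw = np.column_stack([
    x_raw[:, 0] + 0.5 * rng.normal(size=n),    # tracks the first x variable
    -x_raw[:, 2] + 0.5 * rng.normal(size=n),   # moves against the third x variable
])

X = standardize_columns(x_raw)
Y = standardize_columns(y_raw)

# S[k, j] is the correlation between the k-th y variable and the j-th x variable.
S = Y.T @ X / n
print(np.round(S, 2))
```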

Mathematical model for factor analysis

The common factors of factor analysis can be expressed in the following linear algebraic form [24]:
\[\left\{ \begin{matrix} {\vec{x}}_{1}=a_{11}{\vec{f}}_{1}+a_{21}{\vec{f}}_{2}+\cdots +a_{p1}{\vec{f}}_{p}+u_{1}{\vec{\varepsilon }}_{1} \\ {\vec{x}}_{2}=a_{12}{\vec{f}}_{1}+a_{22}{\vec{f}}_{2}+\cdots +a_{p2}{\vec{f}}_{p}+u_{2}{\vec{\varepsilon }}_{2} \\ \cdots \\ {\vec{x}}_{m}=a_{1m}{\vec{f}}_{1}+a_{2m}{\vec{f}}_{2}+\cdots +a_{pm}{\vec{f}}_{p}+u_{m}{\vec{\varepsilon }}_{m} \\ \end{matrix} \right.\]

Abbreviated as:
\[{\vec{x}}_{j}=\sum\limits_{k=1}^{p}{a_{kj}{\vec{f}}_{k}}+u_{j}{\vec{\varepsilon }}_{j},\quad j=1,2,\cdots ,m\]
where ${\vec{f}}_{1},{\vec{f}}_{2},\ldots ,{\vec{f}}_{p}$ and ${\vec{\varepsilon }}_{1},{\vec{\varepsilon }}_{2},\ldots ,{\vec{\varepsilon }}_{m}$ are the new variables sought. The former are the common factors, which capture what the variables share; the latter are the single factors, which capture what is specific to each variable. The positive integer p is the number of common factors and is much smaller than the number of original variables m, so the m original variables are reduced to a small number of factors. The coefficients $a_{kj}$ and $u_{j}$ (k = 1,2,…,p; j = 1,2,…,m) are called factor loadings: $a_{kj}$ is a common factor loading and $u_{j}$ is a single factor loading. Since we are concerned only with the common factors, the term factor loading usually refers to the former.

Notation:
\[A={\left[ \begin{matrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \cdots & \cdots & \cdots & \cdots \\ a_{p1} & a_{p2} & \cdots & a_{pm} \\ \end{matrix} \right]}_{p\times m}\]
where $a_{kj}$ is the loading of the j-th variable on the k-th factor (k = 1,2,…,p; j = 1,2,…,m);
\[F=\left[ \begin{matrix} {\vec{f}}_{1},{\vec{f}}_{2},\cdots ,{\vec{f}}_{p} \\ \end{matrix} \right]={\left[ \begin{matrix} f_{11} & f_{12} & \cdots & f_{1p} \\ f_{21} & f_{22} & \cdots & f_{2p} \\ \cdots & \cdots & \cdots & \cdots \\ f_{n1} & f_{n2} & \cdots & f_{np} \\ \end{matrix} \right]}_{n\times p}\]
where column k holds the values of the k-th factor on each specimen; this matrix is called the factor score matrix;
\[U={\left[ \begin{matrix} u_{1} & 0 & \cdots & 0 \\ 0 & u_{2} & \cdots & 0 \\ \cdots & \cdots & \cdots & \cdots \\ 0 & 0 & \cdots & u_{m} \\ \end{matrix} \right]}_{m\times m}\]

U is the diagonal matrix of order m whose j-th diagonal element $u_{j}$ is the loading of variable ${\vec{x}}_{j}$ on the single factor ${\vec{\varepsilon }}_{j}$ (j = 1,2,…,m);
\[E=\left[ \begin{matrix} {\vec{\varepsilon }}_{1},{\vec{\varepsilon }}_{2},\cdots ,{\vec{\varepsilon }}_{m} \\ \end{matrix} \right]=\left[ \begin{matrix} \varepsilon _{11} & \varepsilon _{12} & \cdots & \varepsilon _{1m} \\ \varepsilon _{21} & \varepsilon _{22} & \cdots & \varepsilon _{2m} \\ \cdots & \cdots & \cdots & \cdots \\ \varepsilon _{n1} & \varepsilon _{n2} & \cdots & \varepsilon _{nm} \\ \end{matrix} \right]\]
where column j holds the values of ${\vec{\varepsilon }}_{j}$ on each specimen. Then equation (24) can be rewritten in the following form:
\[\begin{aligned} \left[ \begin{matrix} {\vec{x}}_{1},{\vec{x}}_{2},\cdots ,{\vec{x}}_{m} \\ \end{matrix} \right] &=\left[ \begin{matrix} {\vec{f}}_{1},{\vec{f}}_{2},\cdots ,{\vec{f}}_{p} \\ \end{matrix} \right]\left[ \begin{matrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \cdots & \cdots & \cdots & \cdots \\ a_{p1} & a_{p2} & \cdots & a_{pm} \\ \end{matrix} \right] \\ &\quad +\left[ \begin{matrix} {\vec{\varepsilon }}_{1},{\vec{\varepsilon }}_{2},\cdots ,{\vec{\varepsilon }}_{m} \\ \end{matrix} \right]\left[ \begin{matrix} u_{1} & 0 & \cdots & 0 \\ 0 & u_{2} & \cdots & 0 \\ \cdots & \cdots & \cdots & \cdots \\ 0 & 0 & \cdots & u_{m} \\ \end{matrix} \right] \end{aligned}\]
that is,
\[X=FA+EU\]

Factor loadings

The original variables ${\vec{x}}_{j}$ in Eq. (27) are all standardized variables. Now assume further that the common factors ${\vec{f}}_{k}$ (k = 1,2,…,p) and the single factors ${\vec{\varepsilon }}_{j}$ (j = 1,2,…,m) to be solved are also standardized variables, and that the correlation coefficients between different common factors, between different single factors, and between common and single factors are all 0. The following relationships then hold:
\[\left\{ \begin{matrix} \frac{1}{n}\sum\limits_{i=1}^{n}{f_{ik}}=0,\quad k=1,2,\cdots ,p \\ \frac{1}{n}\sum\limits_{i=1}^{n}{\varepsilon _{ij}}=0,\quad j=1,2,\cdots ,m \\ \frac{1}{n}\vec{f}_{k}^{\prime }{\vec{f}}_{\ell }=\delta _{k\ell }=\left\{ \begin{matrix} 1, & k=\ell \\ 0, & k\ne \ell \\ \end{matrix} \right.\quad k,\ell =1,2,\cdots ,p \\ \frac{1}{n}\vec{\varepsilon }_{j}^{\prime }{\vec{\varepsilon }}_{q}=\delta _{jq}=\left\{ \begin{matrix} 1, & j=q \\ 0, & j\ne q \\ \end{matrix} \right.\quad j,q=1,2,\cdots ,m \\ \frac{1}{n}\vec{f}_{k}^{\prime }{\vec{\varepsilon }}_{j}=0,\quad k=1,2,\cdots ,p;\; j=1,2,\cdots ,m \\ \end{matrix} \right.\]

Writing these relations in matrix form gives the correlation matrix between the common factors:
\[\begin{aligned} \frac{1}{n}{F}^{\prime }F &=\frac{1}{n}\left[ \begin{matrix} \vec{f}_{1}^{\prime } \\ \vec{f}_{2}^{\prime } \\ \vdots \\ \vec{f}_{p}^{\prime } \\ \end{matrix} \right]\left[ \begin{matrix} {\vec{f}}_{1},{\vec{f}}_{2},\cdots ,{\vec{f}}_{p} \\ \end{matrix} \right] \\ &=\left[ \begin{matrix} \frac{1}{n}\vec{f}_{1}^{\prime }{\vec{f}}_{1} & \frac{1}{n}\vec{f}_{1}^{\prime }{\vec{f}}_{2} & \cdots & \frac{1}{n}\vec{f}_{1}^{\prime }{\vec{f}}_{p} \\ \frac{1}{n}\vec{f}_{2}^{\prime }{\vec{f}}_{1} & \frac{1}{n}\vec{f}_{2}^{\prime }{\vec{f}}_{2} & \cdots & \frac{1}{n}\vec{f}_{2}^{\prime }{\vec{f}}_{p} \\ \cdots & \cdots & \cdots & \cdots \\ \frac{1}{n}\vec{f}_{p}^{\prime }{\vec{f}}_{1} & \frac{1}{n}\vec{f}_{p}^{\prime }{\vec{f}}_{2} & \cdots & \frac{1}{n}\vec{f}_{p}^{\prime }{\vec{f}}_{p} \\ \end{matrix} \right] \\ &=\left[ \begin{matrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \cdots & \cdots & \cdots & \cdots \\ 0 & 0 & \cdots & 1 \\ \end{matrix} \right]=I_{p} \end{aligned}\]
where $I_{p}$ is the identity matrix of order p. Similarly, the correlation matrix between the single factors is:
\[\frac{1}{n}{E}^{\prime }E=I_{m}\]

The correlation matrix between the common factors and the single factors is then:
\[\begin{aligned} \frac{1}{n}{F}^{\prime }E &=\frac{1}{n}\left[ \begin{matrix} \vec{f}_{1}^{\prime } \\ \vec{f}_{2}^{\prime } \\ \vdots \\ \vec{f}_{p}^{\prime } \\ \end{matrix} \right]\left[ \begin{matrix} {\vec{\varepsilon }}_{1},{\vec{\varepsilon }}_{2},\cdots ,{\vec{\varepsilon }}_{m} \\ \end{matrix} \right] \\ &=\left[ \begin{matrix} \frac{1}{n}\vec{f}_{1}^{\prime }{\vec{\varepsilon }}_{1} & \frac{1}{n}\vec{f}_{1}^{\prime }{\vec{\varepsilon }}_{2} & \cdots & \frac{1}{n}\vec{f}_{1}^{\prime }{\vec{\varepsilon }}_{m} \\ \frac{1}{n}\vec{f}_{2}^{\prime }{\vec{\varepsilon }}_{1} & \frac{1}{n}\vec{f}_{2}^{\prime }{\vec{\varepsilon }}_{2} & \cdots & \frac{1}{n}\vec{f}_{2}^{\prime }{\vec{\varepsilon }}_{m} \\ \cdots & \cdots & \cdots & \cdots \\ \frac{1}{n}\vec{f}_{p}^{\prime }{\vec{\varepsilon }}_{1} & \frac{1}{n}\vec{f}_{p}^{\prime }{\vec{\varepsilon }}_{2} & \cdots & \frac{1}{n}\vec{f}_{p}^{\prime }{\vec{\varepsilon }}_{m} \\ \end{matrix} \right] \\ &=\left[ \begin{matrix} 0 & \cdots & 0 \\ \cdots & \cdots & \cdots \\ 0 & \cdots & 0 \\ \end{matrix} \right]=H \end{aligned}\]
where H is the p × m zero matrix.

According to equation (30), the correlation matrix between the common factors and the original variables is:
\[\begin{aligned} \left[ \begin{matrix} \frac{1}{n}\vec{f}_{1}^{\prime }{\vec{x}}_{1} & \frac{1}{n}\vec{f}_{1}^{\prime }{\vec{x}}_{2} & \cdots & \frac{1}{n}\vec{f}_{1}^{\prime }{\vec{x}}_{m} \\ \frac{1}{n}\vec{f}_{2}^{\prime }{\vec{x}}_{1} & \frac{1}{n}\vec{f}_{2}^{\prime }{\vec{x}}_{2} & \cdots & \frac{1}{n}\vec{f}_{2}^{\prime }{\vec{x}}_{m} \\ \cdots & \cdots & \cdots & \cdots \\ \frac{1}{n}\vec{f}_{p}^{\prime }{\vec{x}}_{1} & \frac{1}{n}\vec{f}_{p}^{\prime }{\vec{x}}_{2} & \cdots & \frac{1}{n}\vec{f}_{p}^{\prime }{\vec{x}}_{m} \\ \end{matrix} \right] &=\frac{1}{n}\left[ \begin{matrix} \vec{f}_{1}^{\prime } \\ \vec{f}_{2}^{\prime } \\ \vdots \\ \vec{f}_{p}^{\prime } \\ \end{matrix} \right]\left[ \begin{matrix} {\vec{x}}_{1},{\vec{x}}_{2},\cdots ,{\vec{x}}_{m} \\ \end{matrix} \right] \\ &=\frac{1}{n}{F}^{\prime }X=\frac{1}{n}{F}^{\prime }FA+\frac{1}{n}{F}^{\prime }EU=I_{p}A+HU=A \end{aligned}\]

This shows that each element of the factor loading matrix A is the correlation coefficient between a common factor and an original variable:
\[\frac{1}{n}\vec{f}_{k}^{\prime }{\vec{x}}_{j}=a_{kj},\quad k=1,2,\cdots ,p;\; j=1,2,\cdots ,m\]
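The identity between loadings and factor-variable correlations can be verified numerically. The sketch below uses principal-component extraction from the eigen-decomposition of R, which is one common way to obtain loadings; the paper does not state that this is its exact extraction procedure, so treat the method, the synthetic data, and all variable names as assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)
n, m, p = 300, 5, 2

# Synthetic data driven by two latent factors (illustrative only).
latent = rng.normal(size=(n, p))
mixing = rng.normal(size=(p, m))
raw = latent @ mixing + 0.5 * rng.normal(size=(n, m))
X = (raw - raw.mean(axis=0)) / raw.std(axis=0)   # standardized, 1/n variance

R = X.T @ X / n
eigvals, eigvecs = np.linalg.eigh(R)             # eigenvalues in ascending order
order = np.argsort(eigvals)[::-1][:p]            # indices of the p largest
lam, V = eigvals[order], eigvecs[:, order]       # V has shape (m, p)

# Principal-component loadings (p x m) and standardized factor scores (n x p).
A = np.sqrt(lam)[:, None] * V.T
F = X @ V / np.sqrt(lam)

print(np.allclose(F.T @ F / n, np.eye(p)))   # common factors standardized and uncorrelated
print(np.allclose(F.T @ X / n, A))           # loadings equal factor-variable correlations
```

With this extraction the residual `X - F @ A` is also uncorrelated with the factor scores, which mirrors the zero matrix H in the derivation above.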

The factor loading $a_{kj}$ reflects the link between factor ${\vec{f}}_{k}$ and variable ${\vec{x}}_{j}$. When $a_{kj}>0$, factor ${\vec{f}}_{k}$ and variable ${\vec{x}}_{j}$ are positively correlated; when $a_{kj}<0$, they are inversely correlated; and when $a_{kj}\approx 0$, the link between them is weak.

The correlation matrix R can be expressed as:
\[\begin{aligned} R &=\frac{1}{n}{X}^{\prime }X=\frac{1}{n}{(FA+EU)}^{\prime }(FA+EU) \\ &=\frac{1}{n}{A}^{\prime }{F}^{\prime }FA+\frac{1}{n}{U}^{\prime }{E}^{\prime }FA+\frac{1}{n}{A}^{\prime }{F}^{\prime }EU+\frac{1}{n}{U}^{\prime }{E}^{\prime }EU \\ &={A}^{\prime }\left(\frac{1}{n}{F}^{\prime }F\right)A+{U}^{\prime }\left(\frac{1}{n}{E}^{\prime }F\right)A+{A}^{\prime }\left(\frac{1}{n}{F}^{\prime }E\right)U+{U}^{\prime }\left(\frac{1}{n}{E}^{\prime }E\right)U \\ &={A}^{\prime }A+{U}^{\prime }U={R}^{*}+\left[ \begin{matrix} u_{1}^{2} & 0 & \cdots & 0 \\ 0 & u_{2}^{2} & \cdots & 0 \\ \cdots & \cdots & \cdots & \cdots \\ 0 & 0 & \cdots & u_{m}^{2} \\ \end{matrix} \right] \end{aligned}\]

Here:
\[{R}^{*}={A}^{\prime }A\]
is called the approximate correlation matrix. Its off-diagonal elements are the correlation coefficients of the original variables, i.e., they are the same as those of R:
\[r_{ij}=\sum\limits_{k=1}^{p}{a_{ki}a_{kj}},\quad i,j=1,2,\cdots ,m;\; i\ne j\]

The diagonal elements of R* are:
\[h_{j}^{2}=\sum\limits_{k=1}^{p}{a_{kj}^{2}}=1-u_{j}^{2},\quad j=1,2,\cdots ,m\]
$h_{j}^{2}$ is called the common factor variance (communality) of variable ${\vec{x}}_{j}$. It represents the share of the variance of the original variable that is accounted for by the common factors, and it is numerically equal to the sum of the squares of the elements in column j of A. The common factor variance measures the extent to which an original variable can be explained by the p common factors. It takes values between 0 and 1, and the larger the value, the better the variable is explained.

The sum of the common factor variances is:
\[\sum\limits_{j=1}^{m}{h_{j}^{2}}=\sum\limits_{j=1}^{m}{\sum\limits_{k=1}^{p}{a_{kj}^{2}}}=\sum\limits_{k=1}^{p}{\sum\limits_{j=1}^{m}{a_{kj}^{2}}}=\sum\limits_{k=1}^{p}{S_{k}^{2}}\]

Here:
\[S_{k}^{2}=\sum\limits_{j=1}^{m}{a_{kj}^{2}},\quad k=1,2,\cdots ,p\]
is called the variance contribution of factor ${\vec{f}}_{k}$ and is numerically equal to the sum of the squares of the elements in row k of A. It indicates the importance of factor ${\vec{f}}_{k}$ among all of the common factors: the larger the value, the greater the contribution.
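The column sums and row sums of the squared loadings can be illustrated with a small hypothetical loading matrix. The numbers below are invented for the example and are not results from the paper; the sketch assumes NumPy.

```python
import numpy as np

# Hypothetical p x m loading matrix (p = 2 factors, m = 4 variables).
A = np.array([
    [0.9, 0.8, 0.1, 0.2],
    [0.1, 0.3, 0.7, 0.6],
])

communality = (A ** 2).sum(axis=0)     # h_j^2: one value per variable (column of A)
uniqueness = 1.0 - communality         # u_j^2 = 1 - h_j^2
contribution = (A ** 2).sum(axis=1)    # S_k^2: one value per factor (row of A)

# Share of the total variance (m for standardized variables) carried by each factor.
share = contribution / A.shape[1]

R_star = A.T @ A                       # approximate correlation matrix R*

print(communality)     # approximately [0.82 0.73 0.5 0.4]
print(contribution)    # approximately [1.5 0.95]
print(share)
```

Both `communality.sum()` and `contribution.sum()` equal 2.45 here, which is the double-sum identity above, and the diagonal of `R_star` reproduces the communalities.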

Three-stage DEA model
Theory of the first-stage non-expectation EBM model

Data Envelopment Analysis (DEA) models can be divided into three kinds according to orientation: input-oriented, output-oriented, and non-oriented [25]. In the efficiency study of multi-input, multi-output systems, the orientation should be chosen according to the purpose and direction of the study. This paper adopts the input-oriented model. For any decision unit, the BCC model in dual form under input orientation is as follows:
\[\begin{aligned} & \min \; \theta -\varepsilon ({\hat{e}}^{T}{S}^{-}+{e}^{T}{S}^{+}) \\ & \text{subject to}\; \sum\limits_{j=1}^{n}{X_{j}\lambda _{j}}+{S}^{-}=\theta X_{0} \\ & \sum\limits_{j=1}^{n}{Y_{j}\lambda _{j}}-{S}^{+}=Y_{0} \\ & \sum\limits_{j=1}^{n}{\lambda _{j}}=1 \\ & \lambda _{j}\ge 0,\; {S}^{-}\ge 0,\; {S}^{+}\ge 0 \end{aligned}\]
where j = 1,2,…,n indexes the decision units, X and Y are the input and output indicator variables, respectively, ${S}^{-}$ and ${S}^{+}$ are the input and output slack variables, ε is a non-Archimedean infinitesimal, and θ is the efficiency value of the BCC model. The convexity constraint $\sum_{j}\lambda _{j}=1$ is what distinguishes the BCC model from the CCR model. The following definitions apply generally in this model:

If θ = 1 and ${S}^{+}={S}^{-}=0$, the decision unit is DEA efficient;

If θ = 1 and ${S}^{+}\ne 0$ or ${S}^{-}\ne 0$, the decision unit is weakly DEA efficient;

If θ < 1, the decision unit is not DEA efficient.
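To make the efficiency score concrete, the sketch below solves the radial part of the input-oriented envelopment model as a linear program with `scipy.optimize.linprog`. It is a simplified illustration under stated assumptions: the ε-weighted slack term is omitted (only θ is minimized), the function name `input_oriented_efficiency` and the four decision units are hypothetical, and the `vrs` flag switches the convexity constraint on (BCC) or off (CCR).

```python
import numpy as np
from scipy.optimize import linprog

def input_oriented_efficiency(inputs, outputs, o, vrs=True):
    """Radial input-oriented DEA score theta for decision unit o.

    inputs:  array of shape (n, m), one row per decision unit
    outputs: array of shape (n, s)
    vrs:     True adds sum(lambda) = 1 (BCC); False drops it (CCR)
    """
    inputs = np.asarray(inputs, dtype=float)
    outputs = np.asarray(outputs, dtype=float)
    n, m = inputs.shape
    s = outputs.shape[1]

    # Decision vector: [theta, lambda_1, ..., lambda_n]; minimise theta.
    c = np.r_[1.0, np.zeros(n)]
    # Inputs:  sum_j lambda_j x_ij - theta * x_io <= 0
    A_in = np.hstack([-inputs[o][:, None], inputs.T])
    # Outputs: -sum_j lambda_j y_rj <= -y_ro
    A_out = np.hstack([np.zeros((s, 1)), -outputs.T])
    A_ub = np.vstack([A_in, A_out])
    b_ub = np.r_[np.zeros(m), -outputs[o]]

    A_eq = np.r_[0.0, np.ones(n)][None, :] if vrs else None
    b_eq = [1.0] if vrs else None

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * (n + 1), method="highs")
    return res.fun

# Four hypothetical decision units with one input and one output.
x = [[2.0], [4.0], [8.0], [6.0]]
y = [[1.0], [4.0], [6.0], [3.0]]

bcc_scores = [input_oriented_efficiency(x, y, o, vrs=True) for o in range(4)]
ccr_scores = [input_oriented_efficiency(x, y, o, vrs=False) for o in range(4)]
print(np.round(bcc_scores, 4))
print(np.round(ccr_scores, 4))
```

In this toy data the first unit is efficient under variable returns to scale but not under constant returns to scale, which shows how much the convexity constraint can change the classification.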

From an economic perspective, DEA methods fall into two types, radial and non-radial. Radial methods such as DEA-CCR and DEA-BCC are based on Debreu-Farrell economic theory, while non-radial methods such as DEA-SBM are based on Pareto-Koopmans economic theory. The advantage of the DEA-SBM model is that it allows different inputs or outputs to be adjusted in unequal proportions, which is an improvement over the BCC model. The input-oriented SBM model is given below in dual form:
\[\begin{aligned} & \min \; \rho =1-\frac{1}{m}\sum\limits_{i=1}^{m}{\frac{s_{i}^{-}}{x_{io}}} \\ & \text{subject to}\; \sum\limits_{j=1}^{n}{x_{ij}\lambda _{j}}+s_{i}^{-}=x_{io},\quad i=1,\ldots ,m \\ & \sum\limits_{j=1}^{n}{y_{rj}\lambda _{j}}-s_{r}^{+}=y_{ro},\quad r=1,\ldots ,s \\ & \lambda _{j},\; s_{i}^{-},\; s_{r}^{+}\ge 0,\quad \forall i,j,r \end{aligned}\]

This paper adopts the EBM model based on constant returns to scale and input orientation, and uses it to measure the efficiency of the digital creative industry in each region. The EBM model combines the advantages of the SBM and CCR models while making up for their shortcomings: the equal-proportion reduction of all input factors assumed by the traditional DEA (CCR) model is at odds with practice, and the SBM model loses the original proportional information of the factors on the efficiency frontier. The EBM model can therefore evaluate the efficiency of each region more effectively. The theoretical formula is as follows:
\[\begin{aligned} & \min \; \theta _{o}-\varepsilon \frac{1}{\sum\limits_{i=1}^{m}{w_{i}^{-}}}\sum\limits_{i=1}^{m}{\frac{w_{i}^{-}s_{i}^{-}}{x_{io}}} \\ & \text{subject to}\; \sum\limits_{j=1}^{n}{x_{ij}\lambda _{j}}+s_{i}^{-}=\theta _{o}x_{io},\quad i=1,\ldots ,m \\ & \sum\limits_{j=1}^{n}{y_{rj}\lambda _{j}}-s_{r}^{+}=y_{ro},\quad r=1,\ldots ,s \\ & \lambda _{j},\; s_{i}^{-},\; s_{r}^{+}\ge 0,\quad \forall i,j,r \end{aligned}\]

In this model, θ0 is the efficiency of the digital creative industry in each region, and Min θ0 gives the optimal solution for the evaluated DMU. The indices i and r denote the input and output indicators respectively, and r also represents the number of panel data periods of each region. $w_{i}^{-}$ represents the relative importance of the input indicators. λj is the weight of each DMU. ε is a key parameter that takes values in [0, 1]. $s_{i}^{-}$ and $s_{r}^{+}$ are the slack variables for the input and output indicators. The objective value θ0 lies in [0, 1]. When θ0 = 1, the decision unit is located on the efficiency frontier, i.e., it is relatively DEA-efficient; when θ0 < 1, the decision unit has an efficiency loss.
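As a minimal illustration of what a radial, input-oriented efficiency score means, the sketch below covers only the special case of one input and one output under constant returns to scale, where the score reduces to each unit's output/input ratio divided by the best ratio in the sample. It is not the EBM model itself, which requires a linear-programming solver; the function name `radial_efficiency` and the sample data are hypothetical.

```python
def radial_efficiency(inputs, outputs):
    """Input-oriented radial efficiency for one input and one output (CRS).

    Each score is the unit's productivity (output / input) relative to the
    best productivity in the sample, so scores lie in (0, 1].
    """
    if len(inputs) != len(outputs) or not inputs:
        raise ValueError("inputs and outputs must be non-empty and equal length")
    productivity = [y / x for x, y in zip(inputs, outputs)]
    best = max(productivity)
    return [p / best for p in productivity]


# Hypothetical data: three decision-making units (DMUs).
x = [2.0, 4.0, 5.0]  # input of each DMU
y = [2.0, 2.0, 5.0]  # output of each DMU
theta = radial_efficiency(x, y)
```

In this toy sample the second unit scores 0.5, meaning it could in principle produce its output with half of its input; this is the sense in which a score below 1 signals an efficiency loss.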

Since the third stage uses the adjusted variables for the analysis, the theoretical treatment of slack improvement is not given here; the specific model for the slack variables is described together with the third-stage EBM model.

Second-stage SFA regression theory

In this paper, two choices are made before regressing the first-stage slack variables using SFA. The first is whether to adjust inputs and outputs simultaneously or to adjust only inputs or only outputs. This is determined by the orientation of the first-stage EBM model: since that model is input-oriented, this paper performs the SFA regression decomposition only on the input slack variables of the digital creative industries and adjusts the input variables. The second concerns the number of regressions: this paper stacks all the slack variables and thus estimates only a single SFA regression. This approach gives the model more degrees of freedom.

In summary, the following SFA regression function is constructed to correct the slack variables of the first-stage EBM model: \[S_{ni}=f(Z_{i};\beta_{n})+v_{ni}+\mu_{ni};\; i=1,2,\cdots ,I;\; n=1,2,\cdots ,N\] where $S_{ni}$ is the slack value of the n-th input in the i-th province or city; $Z_{i}$ is the environmental variable of each province or city, and $\beta_{n}$ is the coefficient of the environmental variable; $v_{ni}+\mu_{ni}$ is the composite error term, in which $v_{ni}$ represents random disturbance and $\mu_{ni}$ represents managerial inefficiency. $v\sim N(0,\sigma_{v}^{2})$ is the random error term, which captures the effect of random disturbances on the input slack variable; μ is the managerial inefficiency term, which captures the effect of management factors on the input slack variable and is assumed in this paper to follow a normal distribution truncated at zero, i.e. $\mu \sim N^{+}(0,\sigma_{\mu}^{2})$.

In order to separate the managerial inefficiency term, this paper uses the following separation equation: \[E(\mu |\varepsilon )=\sigma_{*}\left[ \frac{\phi \left(\frac{\lambda \varepsilon}{\sigma}\right)}{\Phi \left(\frac{\lambda \varepsilon}{\sigma}\right)}+\frac{\lambda \varepsilon}{\sigma} \right]\] where $\sigma_{*}=\frac{\sigma_{\mu}\sigma_{v}}{\sigma}$, $\sigma =\sqrt{\sigma_{\mu}^{2}+\sigma_{v}^{2}}$, and $\lambda =\sigma_{\mu}/\sigma_{v}$. Here ε denotes the composite error $v_{ni}+\mu_{ni}$ (not the EBM parameter of the first stage), and φ and Φ are the standard normal density and distribution functions.

To calculate the random error term v in the model, the following formula is used in this paper: \[E[v_{ni}|v_{ni}+\mu_{ni}]=S_{ni}-f(Z_{i};\beta_{n})-E[\mu_{ni}|v_{ni}+\mu_{ni}]\]
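The separation step can be sketched numerically as follows. This is a minimal sketch that transcribes the separation equation as printed in this paper and then subtracts the result from the composite residual; the function names and the numeric inputs are hypothetical, and in practice the residuals and variance parameters would come from the estimated SFA regression.

```python
import math


def std_normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_inefficiency(eps, sigma_mu, sigma_v):
    """E(mu | eps), following the separation equation as written in the text."""
    sigma = math.sqrt(sigma_mu ** 2 + sigma_v ** 2)
    sigma_star = sigma_mu * sigma_v / sigma
    lam = sigma_mu / sigma_v
    z = lam * eps / sigma
    return sigma_star * (std_normal_pdf(z) / std_normal_cdf(z) + z)


def random_error(slack, fitted, sigma_mu, sigma_v):
    """E[v | v + mu] = S - f(Z; beta) - E[mu | v + mu]."""
    eps = slack - fitted  # composite residual v + mu
    return eps - expected_inefficiency(eps, sigma_mu, sigma_v)


# Hypothetical values, for illustration only.
e_mu = expected_inefficiency(eps=0.0, sigma_mu=1.0, sigma_v=1.0)
v_hat = random_error(slack=1.5, fitted=1.5, sigma_mu=1.0, sigma_v=1.0)
```

With a zero residual and equal standard deviations, the expected inefficiency is positive and the implied random error is its negative, which shows how the composite residual is split into the two components.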

The purpose of the SFA regression is to eliminate the effects of environmental and random factors on the efficiency measures, so that all decision units are adjusted to the same external environment. The SFA adjustment formula in this paper is as follows: \[\begin{aligned} & X_{ni}^{A}=X_{ni}+[\max (f(Z_{i};\hat{\beta}_{n}))-f(Z_{i};\hat{\beta}_{n})]+[\max (v_{ni})-v_{ni}] \\ & i=1,2,\cdots ,I;\; n=1,2,\cdots ,N \end{aligned}\] where $X_{ni}^{A}$ is the adjusted input; $X_{ni}$ is the input before adjustment; $[\max (f(Z_{i};\hat{\beta}_{n}))-f(Z_{i};\hat{\beta}_{n})]$ is the adjustment for external environmental factors; and $[\max (v_{ni})-v_{ni}]$ adjusts for random disturbance, so that all decision-making units are placed in the same external environment.
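The adjustment formula above can be sketched for a single input as follows. The function name `adjust_inputs` and the numbers are hypothetical; `fitted_env` stands for the fitted environmental effect and `noise` for the estimated random error of each decision unit.

```python
def adjust_inputs(x, fitted_env, noise):
    """X^A = X + [max(f) - f] + [max(v) - v] for a single input."""
    if not (len(x) == len(fitted_env) == len(noise)):
        raise ValueError("all sequences must have the same length")
    max_env = max(fitted_env)
    max_noise = max(noise)
    return [
        xi + (max_env - fi) + (max_noise - vi)
        for xi, fi, vi in zip(x, fitted_env, noise)
    ]


# Hypothetical data for three decision units.
x = [10.0, 12.0, 8.0]          # original input
fitted_env = [1.0, 3.0, 2.0]   # fitted environmental effect f(Z; beta)
noise = [0.5, -0.5, 0.0]       # estimated random error v
x_adj = adjust_inputs(x, fitted_env, noise)
```

Both bracketed terms are non-negative, so every adjusted input is at least as large as the original: units in favourable environments or with favourable random shocks are adjusted upward the most.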

Third-stage modified EBM regression theory

Using the adjusted input and output variables, the efficiency of each decision-making unit is measured again. Because the influence of environmental and random factors has been eliminated, this efficiency is relatively real and accurate. In order to study how each indicator can be improved, this paper adopts slack variable analysis, from which the target value and the improvement value can be obtained.

The input and output slack variables are the improvement values in the projection analysis, and the target value is the sum of the improvement value and the original value. All three values cover both input and output indicators. The following formula is used: \[\begin{aligned} & s_{i}^{-}+s_{io}=s_{ip} \\ & s_{r}^{+}+s_{ro}=s_{rp} \end{aligned}\] where $s_{i}^{-}$ is the input redundancy value, i.e. the input improvement value; $s_{io}$ is the original input value and $s_{ip}$ is the input target value; $s_{r}^{+}$ is the output insufficiency value, i.e. the output improvement value; $s_{ro}$ is the original output value and $s_{rp}$ is the output target value. Projection analysis mainly identifies the input redundancy and output insufficiency values, which then guide each region in making the corresponding improvements to achieve DEA efficiency in its digital creative industries.

Some provinces may reach an efficiency value of 1, which makes it difficult to rank and compare them. This paper therefore extends the ordinary EBM model with a super-efficiency model, specified as follows: \[\begin{aligned} & \min \theta_{0}-\varepsilon \frac{1}{\sum\limits_{i=1}^{m} w_{i}^{-}}\sum\limits_{i=1}^{m}\frac{w_{i}^{-}s_{i}^{-}}{x_{io}} \\ & \text{subject to } \sum\limits_{j=1}^{n} x_{ij}\lambda_{j}+s_{i}^{-}=\theta_{0}x_{io},\; i=1,\ldots ,m \\ & \sum\limits_{j=1}^{n} y_{rj}\lambda_{j}\ge y_{ro},\; r=1,\ldots ,s \\ & \lambda_{j},s_{i}^{-},s_{r}^{+}\ge 0,\;\forall i,j,r.k \end{aligned}\]

In this super-efficiency EBM model, θ0 is the efficiency of the digital creative industry in each region, and i and r denote the input and output indicators respectively. r also represents the number of panel data periods in each region; since the number of years is 1, r takes the value 1. Min θ0 gives the optimal solution for the evaluated DMU, and $w_{i}^{-}$ represents the relative importance of the input indicators. The objective value satisfies θ0 > 0. The model uses desired outputs, input orientation, and variable returns to scale. Inputs and outputs are treated as one system, and the system allows multiple input and multiple output variables.
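The general idea behind super-efficiency scoring can be sketched in the simplest setting. The sketch assumes the usual construction in which the evaluated unit is removed from its own reference set, and it handles only one input and one output under constant returns to scale; it is an illustration of the ranking idea, not the super-efficiency EBM model above, and the function name and data are hypothetical.

```python
def super_efficiency(inputs, outputs):
    """Score each DMU against the best productivity among the other DMUs.

    Assumption: the evaluated unit is excluded from its own reference set,
    so a frontier unit can receive a score above 1.
    """
    n = len(inputs)
    if n != len(outputs) or n < 2:
        raise ValueError("need at least two DMUs with matching inputs/outputs")
    productivity = [y / x for x, y in zip(inputs, outputs)]
    scores = []
    for o in range(n):
        best_other = max(p for j, p in enumerate(productivity) if j != o)
        scores.append(productivity[o] / best_other)
    return scores


# Hypothetical data for three DMUs.
x = [2.0, 4.0, 4.0]
y = [4.0, 4.0, 6.0]
scores = super_efficiency(x, y)
```

The first unit would score exactly 1 in an ordinary model; here it scores above 1, which is what allows efficient units to be ranked against one another.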

Analysis of Factors Influencing the Scientific and Technological Innovation Capability of Digital Creative Industries

In the preceding sections, this paper built an evaluation index system for the factors influencing the digital creative industry. In this chapter, factor analysis is used to extract the common factors underlying the scientific and technological innovation capability of the digital creative industry. These common factors are then used as the input and output data of the subsequent three-stage DEA method to analyze the scientific and technological innovation capability of the digital creative industry.

The data are extracted from the China Economic Survey Yearbook (2022). Before extracting common factors, it is necessary to check whether the data are suitable for factor analysis. The available methods include the KMO test, Bartlett's test, and inspection of the correlation coefficient matrix.

Correlation matrix

The raw data were first analyzed to obtain the correlation coefficient matrix, which is used to examine whether the variables are correlated strongly enough for factor analysis. The results are shown in Figure 1. Generally speaking, correlation coefficients should be greater than 0.3 for the data to be worth analyzing; below this threshold, factor analysis is of little significance. As can be seen from the figure, most of the correlation coefficients between the selected indicators X1-X11 are greater than 0.3, so the data are suitable for factor analysis.

Figure 1. Correlation matrix
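The correlation screening described above can be sketched in a few lines. The helper names and the three toy indicator columns are hypothetical stand-ins for the X1-X11 indicators; the 0.3 threshold is the one quoted in the text, applied here to absolute values.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)


def correlation_matrix(columns):
    """Correlation matrix of a list of indicator columns."""
    return [[pearson(ci, cj) for cj in columns] for ci in columns]


def share_above(matrix, threshold=0.3):
    """Share of off-diagonal |r| values that exceed the threshold."""
    p = len(matrix)
    values = [abs(matrix[i][j]) for i in range(p) for j in range(i + 1, p)]
    return sum(v > threshold for v in values) / len(values)


# Hypothetical indicator columns (five observations each).
columns = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.0],
    [5.0, 3.0, 4.0, 1.0, 2.0],
]
R = correlation_matrix(columns)
share = share_above(R)
```

A share close to 1 suggests that most indicator pairs are correlated strongly enough for factor extraction to be meaningful.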

KMO test and Bartlett’s test

The basic principle of the KMO test is to compare the squared simple correlation coefficients with the squared partial correlation coefficients between the variables. A KMO value greater than 0.9 indicates that the data are very suitable, 0.8-0.9 is good, 0.5 or more is acceptable, and a value below 0.5 indicates that the data are not suitable for factor analysis.
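For reference, the KMO statistic is usually written as follows, where $r_{ij}$ is the simple correlation between variables i and j and $p_{ij}$ is their partial correlation (these two symbols are introduced here for illustration only and are not used elsewhere in this paper): \[\text{KMO}=\frac{\sum\limits_{i\ne j} r_{ij}^{2}}{\sum\limits_{i\ne j} r_{ij}^{2}+\sum\limits_{i\ne j} p_{ij}^{2}}\] When the partial correlations are small relative to the simple correlations, the statistic approaches 1 and the variables share enough common variance for factor analysis; when the partial correlations dominate, it approaches 0.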

The purpose of Bartlett's test of sphericity is to test whether the correlation matrix is an identity matrix. The null hypothesis is that the correlation matrix of the data is an identity matrix; if this hypothesis cannot be rejected, the data are not suitable for factor analysis. Therefore, the smaller the significance value (Sig.) of Bartlett's test of sphericity, the better: a smaller value makes it more likely that there is a meaningful relationship between the original variables.
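A minimal sketch of the test statistic is given below, assuming the standard form of Bartlett's statistic, $\chi^{2}=-\left(n-1-\frac{2p+5}{6}\right)\ln |R|$ with $p(p-1)/2$ degrees of freedom, where n is the number of observations and p the number of variables. The function names and the 2x2 example matrix are hypothetical; converting the statistic to a significance value requires a chi-square distribution function, which is not included here.

```python
import math


def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def bartlett_sphericity(corr, n_obs):
    """Return (chi-square statistic, degrees of freedom).

    Raises ValueError if the correlation matrix is singular.
    """
    p = len(corr)
    stat = -(n_obs - 1 - (2 * p + 5) / 6.0) * math.log(determinant(corr))
    df = p * (p - 1) // 2
    return stat, df


# Hypothetical example: two variables with correlation 0.5, ten observations.
corr = [[1.0, 0.5], [0.5, 1.0]]
stat, df = bartlett_sphericity(corr, n_obs=10)
```

For an identity correlation matrix the determinant is 1 and the statistic is 0, so the null hypothesis cannot be rejected; the stronger the correlations, the smaller the determinant and the larger the statistic.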

The test results are shown in Table 2. The KMO value is 0.516, which indicates that the data can be factor analyzed, although the fit is weak, possibly because the sample is not large enough. The significance of Bartlett's test of sphericity is 0.000, which shows that there are significant correlations among the original variables. Factor analysis can therefore be applied to these data.

Table 2. KMO and Bartlett tests

KMO measure of sampling adequacy 0.516
Bartlett's test of sphericity Approx. chi-square 257.628
df 60
Sig. 0.000

Common factor selection

The next step is to identify the common factors. The results calculated with SPSS are shown in Table 3. The usual criterion is to retain components with eigenvalues greater than 1. The table shows four components with eigenvalues greater than 1, so four dimensions can be extracted. In particular, the cumulative contribution rate of the first four components is 82.403%, which indicates that the four dimensions adequately explain the original variables. The four dimensions are extracted as F1, F2, F3, and F4, and together they account for most of the variance in the factors influencing the digital creative industries.

Table 3. Total variance explained

Component Initial eigenvalues Extraction sums of squared loadings Rotation sums of squared loadings
(each group has three columns: Total, Variance (%), Cumulative (%))
1 3.104 28.224 28.224 3.104 28.224 28.224 2.716 24.565 24.565
2 2.592 23.605 51.829 2.592 23.605 51.829 2.201 20.154 44.719
3 2.097 19.043 70.872 2.097 19.043 70.872 2.152 19.56 64.279
4 1.259 11.531 82.403 1.259 11.531 82.403 1.995 18.124 82.403
5 0.795 7.15 89.553 - - - - - -
6 0.441 3.942 93.495 - - - - - -
7 0.303 2.801 96.296 - - - - - -
8 0.162 1.389 97.685 - - - - - -
9 0.144 1.25 98.935 - - - - - -
10 0.089 0.815 99.75 - - - - - -
11 0.02 0.25 100 - - - - - -
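
The selection rule described above (retain components whose eigenvalue exceeds 1) can be checked directly against the initial eigenvalues and variance percentages reported in Table 3. The helper names below are hypothetical; the numbers are copied from the table.

```python
# Initial eigenvalues and variance percentages as reported in Table 3.
eigenvalues = [3.104, 2.592, 2.097, 1.259, 0.795, 0.441,
               0.303, 0.162, 0.144, 0.089, 0.02]
variance_pct = [28.224, 23.605, 19.043, 11.531, 7.15, 3.942,
                2.801, 1.389, 1.25, 0.815, 0.25]


def kaiser_count(values, cutoff=1.0):
    """Number of components whose eigenvalue exceeds the cutoff."""
    return sum(v > cutoff for v in values)


def cumulative(values):
    """Running total of a sequence."""
    total, out = 0.0, []
    for v in values:
        total += v
        out.append(total)
    return out


n_factors = kaiser_count(eigenvalues)
cum_pct = cumulative(variance_pct)
explained = cum_pct[n_factors - 1]  # cumulative share of retained components
```

Four components pass the cutoff, and their cumulative variance share matches the 82.403% reported in the table.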

In order to obtain the specific composition of dimensions F1, F2, F3 and F4, varimax (variance-maximizing) rotation is applied. This clarifies the relationship between the principal component loadings and the evaluation index system, and clearly shows how the indicators are distributed across the common factors. The results are shown in Table 4.

Table 4. Rotated component matrix

Variable Component 1 Component 2 Component 3 Component 4
X1 0.007 0.083 0.96 0.065
X2 -0.004 -0.06 -0.928 -0.16
X3 0.554 0.079 -0.18 0.626
X4 0.289 0.84 0.085 0.268
X5 -0.01 0.124 0.231 0.806
X6 0.483 -0.466 -0.284 0.548
X7 0.415 0.777 0.295 0.095
X8 -0.933 -0.097 0.09 0.071
X9 0.904 0.143 0.169 -0.011
X10 -0.437 0.187 0.286 0.694
X11 0.215 -0.775 0.082 0.215

The common factors that influence the scientific and technological innovation capacity of the digital creative industry are finally composed as follows.

The first common factor, F1, contains the growth rate of enterprise industrial output value (X8) and the profit margin of enterprise business income (X9). F1 is named capital structure.

The second common factor, F2, contains the number of enterprises in the system (X4), the number of patent applications and authorizations (X7), and the growth rate of enterprise product sales revenue (X11). F2 is named industry competition.

The third common factor, F3, contains the survival status of firms in the industry (X1) and the investment heat of the industry (X2). F3 is named investment heat.

The fourth common factor, F4, contains the degree of enterprise management costs (X3), per capita public expenditure on education (X5), the volume of contract turnover (X6), and the growth rate of enterprise technology revenue (X10). F4 is named innovation strength.
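The grouping above is consistent with assigning each indicator to the component on which it has the largest absolute loading in Table 4. The sketch below applies that rule to the reported loadings; the rule itself is an assumption about how the grouping was obtained, and the function name is hypothetical.

```python
# Rotated loadings as reported in Table 4 (components 1 to 4).
loadings = {
    "X1": [0.007, 0.083, 0.96, 0.065],
    "X2": [-0.004, -0.06, -0.928, -0.16],
    "X3": [0.554, 0.079, -0.18, 0.626],
    "X4": [0.289, 0.84, 0.085, 0.268],
    "X5": [-0.01, 0.124, 0.231, 0.806],
    "X6": [0.483, -0.466, -0.284, 0.548],
    "X7": [0.415, 0.777, 0.295, 0.095],
    "X8": [-0.933, -0.097, 0.09, 0.071],
    "X9": [0.904, 0.143, 0.169, -0.011],
    "X10": [-0.437, 0.187, 0.286, 0.694],
    "X11": [0.215, -0.775, 0.082, 0.215],
}


def assign_factors(table):
    """Group variables by the component with the largest absolute loading."""
    groups = {}
    for variable, row in table.items():
        best = max(range(len(row)), key=lambda k: abs(row[k]))
        groups.setdefault(f"F{best + 1}", []).append(variable)
    return groups


groups = assign_factors(loadings)
```

Indicators such as X3 and X6 load on more than one component with similar magnitude, so their assignment to F4 is less clear-cut than, for example, that of X1 to F3.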

DEA Analysis of Science and Technology Innovation Capability of Digital Creative Industries

The common factors obtained from the factor analysis in the previous chapter, F1 capital structure, F2 industry competition, F3 investment heat and F4 innovation strength, are used as input indicators. In this chapter, the three-stage DEA model proposed in this paper is applied, using DEAP2.1 software, to measure the level of scientific and technological innovation capability of the digital creative industries; the results are shown in Table 5. The table reports the main results of the DEA analysis, i.e., comprehensive efficiency, pure technical efficiency, scale efficiency, and returns to scale. On the whole, most industries within the digital creative industry show good efficiency.

Table 5. Measurement of the level of scientific and technological innovation capability

Industry code Comprehensive efficiency Pure technical efficiency Scale efficiency Returns to scale
8910 0.902 0.991 0.91 Decreasing
8920 1 1 1 Decreasing
8930 1 1 1 Unchanged
8940 0.873 0.911 0.985 Decreasing
6100 0.881 1 0.881 Decreasing
6200 0.807 1 0.807 Unchanged
9200 1 1 1 Incremental
7480 0.904 1 0.904 Incremental
8400 0.746 1 0.923 Unchanged
9010 0.831 0.882 0.942 Incremental
9070 1 1 1 Unchanged
9080 1 1 1 Unchanged
9090 1 1 1 Unchanged
4210 1 1 1 Unchanged
8240 1 1 1 Unchanged
7440 0.843 0.872 0.967 Decreasing
4900 0.476 0.538 0.885 Incremental
7410 1 1 1 Unchanged
7420 0.807 0.852 0.948 Incremental
7430 0.552 0.617 0.895 Decreasing
7450 0.722 0.746 0.969 Incremental
7470 1 1 1 Unchanged
8810 0.893 0.953 0.938 Unchanged
8820 1 1 1 Decreasing
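
The three efficiency columns are linked by the standard DEA decomposition, in which comprehensive efficiency equals pure technical efficiency multiplied by scale efficiency. The sketch below assumes this decomposition applies to the reported figures and illustrates it with the values for industry codes 8910 and 6100 from Table 5; the function names are hypothetical.

```python
def comprehensive_efficiency(pure_technical, scale):
    """Comprehensive efficiency = pure technical efficiency x scale efficiency."""
    return pure_technical * scale


def scale_efficiency(comprehensive, pure_technical):
    """Scale efficiency recovered from the other two measures."""
    return comprehensive / pure_technical


# Industry 8910: pure technical efficiency 0.991, scale efficiency 0.91.
ce_8910 = comprehensive_efficiency(0.991, 0.91)
# Industry 6100: comprehensive efficiency 0.881, pure technical efficiency 1.
se_6100 = scale_efficiency(0.881, 1.0)
```

When pure technical efficiency equals 1, any shortfall in comprehensive efficiency is attributable entirely to scale, which is the reading applied to the industry-level results that follow.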

Next, the five main branches of the digital creative industry (the digital cultural and creative equipment industry, the digital cultural and creative content production industry, the design service industry, the digital creativity and integration service industry, and the digital content processing industry) are taken as the main research objects, and their technological innovation efficiency is measured and analyzed in depth.

Results and Analysis of Phase I Measurements

Using DEAP2.1 software, the efficiency of these five industries is measured for each year from 2016 to 2022, and the results are shown in Table 6.

1) Digital cultural and creative equipment industry. As can be seen from the table, in the first stage of DEA the decision-making unit (year) that achieves DEA effectiveness is 2017, and all other years are DEA-ineffective.

2) Digital cultural and creative content production industry. The table shows that in the first stage of DEA, the DEA-effective decision-making units (years) are 2016 and 2018. Over the seven years 2016-2022, pure technical efficiency equals 1 in four years and averages 0.967, while scale efficiency equals 1 in two years and averages 0.725 with large fluctuations. This indicates that low scale efficiency is the main reason for the ineffective technical efficiency, and consideration should be given to expanding or reducing the size of enterprises.

3) Design service industry. In the first stage of DEA, the DEA-effective years are 2017 and 2021. The mean value of pure technical efficiency is 0.948 and the mean value of scale efficiency is 0.838; pure technical efficiency equals 1 in four years and scale efficiency equals 1 in two years. Pure technical efficiency is slightly higher than scale efficiency, which indicates that, from an overall point of view and without considering external influencing factors, the main reason for DEA ineffectiveness is the scale factor.

4) Digital creativity and integration service industry. In the first stage of DEA, the DEA-effective years are 2021 and 2022. Pure technical efficiency reaches 1 in five years (2016, 2019, 2020, 2021 and 2022), with a mean value of 0.874. Scale efficiency reaches 1 in 2021 and 2022, with a mean value of 0.859, which is very close to the mean of pure technical efficiency. This suggests that the industry has advantages in both scale and technical efficiency, and that making every year DEA-effective requires comprehensive consideration of both the allocation of STI resources and the scale factor.

5) Digital content processing industry. In the first stage of DEA, the decision units are DEA-effective in three years: 2017, 2018 and 2020. Pure technical efficiency is effective except in 2019 and 2022. Scale efficiency is effective in three years: 2017, 2018 and 2020. In 2019 and 2022, scale efficiency is higher than pure technical efficiency, indicating that in these two years the main reason for DEA ineffectiveness is the irrational allocation of innovation resources.

Table 6. Measurement of technological efficiency (first stage)

Digital cultural and creative equipment industry
Decision unit (year) Comprehensive efficiency Pure technical efficiency Scale efficiency Returns to scale
2016 0.397 0.466 0.845 Drs
2017 0.673 0.664 0.996 -
2018 0.38 0.398 0.985 Drs
2019 0.394 0.44 0.874 Drs
2020 0.38 0.408 0.956 Drs
2021 0.288 0.31 0.959 Irs
2022 0.652 0.69 0.97 Irs
Mean 0.452 0.482286 0.940714
Digital cultural and creative content production industry
Decision unit (year) Comprehensive efficiency Pure technical efficiency Scale efficiency Returns to scale
2016 1 1 1 -
2017 0.713 0.855 0.855 Irs
2018 1 1 1 -
2019 0.372 1 0.382 Irs
2020 0.762 0.966 0.79 Irs
2021 0.451 1 0.444 Irs
2022 0.548 0.95 0.602 Drs
Mean 0.69228571 0.967286 0.724714
Design service industry
Decision unit (year) Comprehensive efficiency Pure technical efficiency Scale efficiency Returns to scale
2016 0.422 1 0.423 Drs
2017 1 1 1 -
2018 0.803 0.844 0.952 Drs
2019 0.668 0.903 0.745 Irs
2020 0.772 0.888 0.868 Irs
2021 1 1 1 -
2022 0.878 1 0.878 Drs
Mean 0.79185714 0.947857 0.838
Digital creativity and integration service industry
Decision unit (year) Comprehensive efficiency Pure technical efficiency Scale efficiency Returns to scale
2016 0.569 1 0.548 Drs
2017 0.437 0.448 0.962 Drs
2018 0.53 0.672 0.85 Drs
2019 0.77 1 0.745 Drs
2020 0.909 1 0.905 Drs
2021 1 1 1 -
2022 1 1 1 -
Mean 0.745 0.874286 0.858571
Digital content processing industry
Decision unit (year) Comprehensive efficiency Pure technical efficiency Scale efficiency Returns to scale
2016 1 1 1 -
2017 1 1 1 -
2018 1 1 1 -
2019 0.742 0.743 0.995 Drs
2020 1 1 1 -
2021 0.917 0.941 0.972 Irs
2022 0.518 0.558 0.926 Irs
Mean 0.88242857 0.891714 0.984714 Irs
Second-stage regression analysis

The SFA estimation results of the second stage are shown in Table 7. The one-sided LR test values of the two regressions are 50.436 and 53.768, both significant at the 1% level, indicating that the SFA model is appropriate. This paper further analyzes the SFA estimation results in the table from four perspectives: industry asset size, number of enterprises in the industry, government support, and number of R&D institutions.

Table 7. Results of SFA estimation

Variable R&D investment slack variable Environmental support slack variable
Constant term -2.512(3.118) -1.231(1.889)
Asset size 1.356(1.015) 1.209*(1.428)
Number of enterprises -0.957**(0.847) 0.25*(0.13)
Government support 2.311**(1.927) 2.913**(2.0572)
Number of R & D institutions 3.209*(1.416) -1.941**(1.019)
sigma-squared 5.448***(2.126) 5.089***(2.364)
gamma 0.565***(0.137) 0.605***(0.116)
Log-likelihood -212.482 -196.757
One-sided LR test 50.436*** 53.768***
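
The sigma-squared and gamma rows of Table 7 can be read together. Assuming the common parameterization in which sigma-squared is the total variance $\sigma_{\mu}^{2}+\sigma_{v}^{2}$ and gamma is the share $\sigma_{\mu}^{2}/(\sigma_{\mu}^{2}+\sigma_{v}^{2})$, the two variance components can be recovered as sketched below. The paper does not state its parameterization, so this is an assumption, and the function name is hypothetical.

```python
def variance_components(sigma_squared, gamma):
    """Split total variance into inefficiency and noise parts.

    Assumes sigma_squared = sigma_mu^2 + sigma_v^2 and
    gamma = sigma_mu^2 / sigma_squared.
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    sigma_mu2 = gamma * sigma_squared
    sigma_v2 = (1.0 - gamma) * sigma_squared
    return sigma_mu2, sigma_v2


# Values reported in Table 7.
rd = variance_components(5.448, 0.565)   # R&D investment slack regression
env = variance_components(5.089, 0.605)  # environmental support slack regression
```

Under this reading, a gamma above 0.5 in both regressions means that managerial inefficiency accounts for more of the residual variance than random noise, which supports separating the two before adjusting the inputs.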

Analyzed from the perspective of industry asset size, the impact of asset size on R&D investment is not significant, which indicates that the increase in the size of enterprise assets does not lead to an increase in R&D investment, probably due to the inherent differences in the industries that lead to different asset sizes of enterprises. The impact of asset size on environmental support is significant and negative.

Analyzing from the perspective of the number of enterprises in the industry, the impact of the number of enterprises in the industry on R&D investment is significant and positive. This indicates that the increase in the number of enterprises in the industry is conducive to improving the competitive awareness of enterprises, so that the R&D investment (personnel and capital) in each industry increases, thus making the technical efficiency of high-tech industry increase. But the impact of the number of enterprises on environmental support is unfavorable.

Analyzed from the perspective of government support, the effects of government expenditure on the slack variables of both R&D investment and environmental support are significant and positive, which indicates that government support is not conducive to improving the allocation of innovation resources.

Analyzed from the perspective of the number of R&D institutions, the effect of having too many R&D institutions on the R&D investment slack variable is negative. On the other hand, the effect of the number of R&D institutions on the environmental support slack variable is positive: more R&D institutions allow more new technologies to be introduced, absorbed, purchased and adapted, which leads to a wide variety of technological research and development within the industry, and the accompanying increase in fixed assets and investment provides a better innovation-driven environment.

Efficiency analysis of the DEA methodology in phase III

Based on the second-stage stochastic frontier (SFA) regression results, the random disturbance and the influence of external factors are removed, and the adjusted inputs are substituted for the original inputs. DEAP2.1 software is then used to measure again the technical efficiency of the five major industries of the digital creative industry, such as the digital cultural and creative equipment industry and the digital cultural and creative content production industry. The results are shown in Table 8.

Table 8. Third-stage efficiency measurement

Digital cultural and creative equipment industry
Decision unit (year) Comprehensive efficiency Pure technical efficiency Scale efficiency Returns to scale
2016 0.38704 0.46 0.826 Drs
2017 0.844 1 0.836 -
2018 0.4206 0.48 0.881 Drs
2019 0.48239 0.58 0.827 Drs
2020 0.65154 0.66 0.952 Irs
2021 0.42548 0.47 0.897 Irs
2022 0.50253 0.58 0.868 Irs
Mean 0.531 0.604 0.870 -
Digital cultural and creative content production industry
Decision unit (year) Comprehensive efficiency Pure technical efficiency Scale efficiency Returns to scale
2016 0.613 1 0.611 Irs
2017 0.675 0.811 0.832 Irs
2018 1 1 1 -
2019 0.403 1 0.401 Irs
2020 0.58 0.781 0.752 Irs
2021 0.397 1 0.391 Irs
2022 0.564 0.94 0.606 Drs
Mean 0.605 0.933 0.656 -
Design service industry
Decision unit (year) Comprehensive efficiency Pure technical efficiency Scale efficiency Returns to scale
2016 0.403 1 0.398 Irs
2017 0.754 1 0.754 Irs
2018 0.805 0.903 0.893 Irs
2019 0.693 1 0.7 Irs
2020 0.775 0.955 0.803 Irs
2021 1 1 1 -
2022 0.732 1 0.735 Irs
Mean 0.737 0.980 0.755 -
Digital creativity and integration service industry
Decision unit (year) Comprehensive efficiency Pure technical efficiency Scale efficiency Returns to scale
2016 0.504 1 0.522 Drs
2017 0.63 0.782 0.818 Drs
2018 0.474 0.68 0.725 Irs
2019 0.696 1 0.693 Irs
2020 0.697 0.814 0.859 Irs
2021 0.897 1 0.909 Irs
2022 1 1 1 -
Mean 0.7 0.897 0.789 -
Digital content processing industry
Decision unit (year) Comprehensive efficiency Pure technical efficiency Scale efficiency Returns to scale
2016 0.322 1 0.322 Irs
2017 1 1 1 -
2018 1 1 1 -
2019 0.743 0.852 0.872 Irs
2020 1 1 1 -
2021 0.791 1 0.791 Irs
2022 0.562 0.634 0.881 Irs
Mean 0.774 0.927 0.838 -
Digital cultural and creative equipment industry

From 2016 to 2019, returns to scale are decreasing, and from 2020 to 2022 they are increasing, which indicates that after 2020 the technical efficiency of the digital cultural and creative equipment industry rises as innovation inputs increase.

Digital cultural and creative content production industry

The only DEA-effective year is 2018, and the mean value of pure technical efficiency is higher than the mean value of scale efficiency, which indicates that the main reason for DEA ineffectiveness is the scale factor.

Design service industry

Except for the DEA-effective year, all other years show increasing returns to scale, which indicates that the lower comprehensive efficiency of the design service industry is caused by the scale factor, i.e., the scale of enterprises' technological innovation resource input has not reached the optimum.

Digital Creativity and Integration Services Industry

From an overall perspective, the average value of scale efficiency from 2018-2022 also decreases from 0.893 to 0.755, which indicates that the cause of ineffective technical efficiency is the scale factor, i.e., the scale of enterprises should be adjusted.

Digital content processing industry

Overall, the mean value of pure technical efficiency has increased, the mean value of scale efficiency has decreased, and comprehensive efficiency has also fallen. This indicates that the lower comprehensive efficiency of the digital content processing industry is caused by scale, which is still some distance from its optimal state.

To summarize, the main factor restricting the development of the scientific and technological innovation capability of the digital creative industry is the scale factor. Insufficient input scale and a poor scale state adversely affect both the technological development and the overall development of the industry, so its input scale should be expanded at the right time.
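This comparison can be made explicit with the third-stage means from Table 8. The sketch below uses a simple heuristic, which is an assumption rather than a formal test: whichever of pure technical efficiency and scale efficiency has the lower mean is treated as the weaker component. The function name is hypothetical.

```python
# Third-stage means from Table 8: (pure technical efficiency, scale efficiency).
third_stage_means = {
    "Digital cultural and creative equipment": (0.604, 0.870),
    "Digital cultural and creative content production": (0.933, 0.656),
    "Design service": (0.980, 0.755),
    "Digital creativity and integration service": (0.897, 0.789),
    "Digital content processing": (0.927, 0.838),
}


def weaker_component(pure_technical, scale):
    """Name the efficiency component with the lower mean value."""
    if scale < pure_technical:
        return "scale"
    if pure_technical < scale:
        return "pure technical"
    return "tie"


diagnosis = {
    name: weaker_component(*means) for name, means in third_stage_means.items()
}
scale_limited = [name for name, d in diagnosis.items() if d == "scale"]
```

Under this heuristic, scale efficiency is the weaker component in four of the five industries; the exception is the digital cultural and creative equipment industry, where mean pure technical efficiency is the lower of the two.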

Conclusion

Based on the input-output perspective, this paper constructs an index system for assessing the scientific and technological innovation capability of the digital creative industry, and proposes an analysis method that combines factor analysis with the three-stage DEA method to explore and assess this capability.

In the factor analysis, a correlation matrix is first constructed from the influencing-factor indicators in the assessment index system. Most of the correlation coefficients are greater than 0.3, indicating strong correlation, so the indicators are suitable for factor analysis. Secondly, the KMO test and Bartlett's test are carried out: the KMO value is 0.516 and the significance of Bartlett's test of sphericity is 0.000, so factor analysis can proceed. Based on the cumulative contribution rate of 82.403% for the first four components in the SPSS results, four common factors F1, F2, F3 and F4 are determined and named capital structure, industry competition, investment heat and innovation strength, respectively.

The common factors F1 capital structure, F2 industry competition, F3 investment heat and F4 innovation strength are taken as input indicators for the DEA analysis of the scientific and technological innovation capacity of the digital creative industry. In the first stage, DEAP2.1 software is applied to measure the five major industries of the digital creative industry: the digital cultural and creative equipment industry, the digital cultural and creative content production industry, the design service industry, the digital creativity and integration service industry, and the digital content processing industry. The digital cultural and creative equipment industry is DEA-effective only in 2017. The average scale efficiency of the digital cultural and creative content production industry is 0.725, with large fluctuations. The average pure technical efficiency and scale efficiency of the design service industry are 0.948 and 0.838, respectively. The average pure technical efficiency and scale efficiency of the digital creativity and integration service industry are very close to each other. The digital content processing industry is DEA-ineffective in both 2019 and 2022, when scale efficiency is higher than pure technical efficiency. In the second stage, the SFA estimation results are analyzed from four perspectives, namely industry asset size, the number of enterprises in the industry, government support, and the number of R&D institutions, and the random disturbance and the influence of external factors are removed before the third-stage DEA efficiency analysis.

In the third stage, the returns to scale of the digital cultural and creative equipment industry first decrease and then increase. The digital cultural and creative content production industry is DEA-effective in 2018, and its mean pure technical efficiency is higher than its mean scale efficiency, which confirms that the scale factor is the main reason for DEA ineffectiveness. Except for the DEA-effective year, the design service industry shows increasing returns to scale in all years, while the mean scale efficiency of the digital creativity and integration service industry shows an overall decreasing trend from 2018 to 2022; both confirm that scale is the reason for ineffective technical efficiency. For the digital content processing industry, the mean pure technical efficiency increases and the mean scale efficiency decreases, while comprehensive efficiency also decreases, which indicates that scale is the reason for its lower comprehensive efficiency. In general, the main factor restricting the development of the scientific and technological innovation capability of the digital creative industry is the scale factor. When promoting this capability, attention should be paid to the scale of industrial input in order to reduce the adverse impact of the scale state on industrial innovation.

Funding:

This research was supported by the National Natural Science Foundation of China: “Research on the Prevention and Control Capabilities and Management Strategies during the Critical Period of Sudden Major Infectious Epidemics” (72074110).