Open Access

Construction of Time Series Prediction Models for Event Influence and Revenue Growth in Sports Industry

Mar 21, 2025


Introduction

The influence of events in the sports industry spans economic, social, cultural, urban-image and environmental dimensions, and matters greatly to the host country or city as well as to participants and spectators [1-3]. Sports events usually attract large numbers of spectators and considerable media attention, increase tourism revenue and media exposure, and stimulate hotels, restaurants, retail and other services, bringing substantial economic benefits to the local economy [4-5]. The impact and revenue of such events can be predicted from past data and current conditions, and time series forecasting models are well suited to this task.

A time series is the sequence formed by arranging the values of a variable in chronological order; its time unit can be minutes, hours, days, weeks, months, quarters, years, decades, etc. [6]. A time series forecasting model is a statistical model for analyzing and forecasting time series data: it builds a mathematical model from the series itself, is used mainly for short-term forecasting, and belongs to the family of trend forecasting methods [7-9]. The data used to build such a model are called time series data and are common in many real-world problems, such as sales data, stock prices, temperature changes and social reactions [10]. Widely used time series forecasting models include the moving average model, the autoregressive model, the ARIMA model and the LSTM model [11-14]. Because many global sports events now enjoy broad public participation, it is worthwhile to construct a model dedicated to events in the sports industry.

Several factors need to be considered to construct a suitable time series forecasting model, including data characteristics, model complexity, accuracy and so on. Constructing a suitable model according to the actual situation can lead to better prediction results.

At present, most research on prediction in the sports industry concerns match results, while the influence and revenue of the sports events themselves have received less attention. For example, literature [15] used a time series model to predict the results of soccer matches in the national league, and literature [16] used an LSTM model to predict, in most cases, a player’s training state and the peak of his execution ability, helping coaches and players formulate individual and team training plans. Literature [17] used Autoregressive Integrated Moving Average (ARIMA) models and Recurrent Neural Networks (RNNs) to predict and analyze players’ behaviors across seasons and teams, which assisted management. The popularity and revenue of a sporting event are also broadly predictable: literature [18] achieved profitability against market odds by calibrating existing expert and experimental information with time series model predictions, reducing the uncertainty caused by expert judgment bias. Literature [19] used real-time and historical Google Trends data on the relative popularity of search terms to evaluate the influence of sports leagues, and identified future popularity trends with three models: trend plus seasonal regression, the Holt-Winters multiplicative method (HWMM), and Seasonal Autoregressive Integrated Moving Average (SARIMA). Based on such trends, companies or organizers can place relevant advertisements to attract potential consumers during the corresponding period, so these trend predictions are useful to advertisers, companies and merchants at the host venue. In addition, literature [20] notes that robust predictive models for time series analysis help event managers make informed decisions and predictions about the success and profitability of the sports industry, enhancing the technical competitiveness of organizers.

Before studying the relationship between the influence of sports events and revenue growth in the sports industry, this paper first constructs an ARIMA model for measuring and forecasting economic revenue: a unit root test is applied to the time series, and non-stationary series are transformed into stationary series by differencing, so that the economic revenue growth of the sports industry can be measured and predicted. On the basis of these measurements, the relationship between revenue growth and the influence of sports events is further explored, and an improved time series prediction model, the SVAR model, is built. Within the framework of the VAR model, OLS is used to obtain unbiased estimates of the parameters of the reduced-form (induced) equations, the reduced-form equations are converted into structural equations, and variable analysis methods such as impulse response and variance decomposition are derived from the structural form. Taking data on China’s sports industry from 2013 to 2023 as the sample, the relationship between event influence and revenue growth is then analyzed with the time series prediction model constructed in this paper.

Model for measuring and forecasting the economic returns of the sports industry

Traditional econometric methods are based on economic theory to describe models of variable relationships. However, economic theory is usually insufficient to provide a rigorous description of the dynamic linkages between variables, and endogenous variables can appear at both the left and right ends of the equation, complicating estimation and inference. To address these issues, an unstructured approach to modeling the relationships between variables has emerged, such as vector autoregressive (VAR) and vector error correction (VEC) models.

Classical regression modeling focuses on establishing a functional (causal) relationship between different variables in order to examine the connections between things. It is equally important to ask how time series data themselves can be used to build models that reveal the laws governing a variable’s own development and to forecast its future accordingly. In practice, one often needs to study how something develops over time, which requires examining its historical record. Many quantities, such as interest rates, yields and the stock indices that reflect market conditions, are naturally expressed as time series, and studying these data reveals the patterns of change in the underlying economic variables. For some variables, too many factors affect their development, or data on the main influencing variables are hard to collect, so a regression model is difficult to establish. Here time series analysis shows its advantage: it does not require a causal model, only the data of the variable itself. In time series analysis, the ARIMA model is the most typical and commonly used model.

ARIMA contains three components: AR, I and MA. AR denotes autoregression, i.e., the autoregressive model [21]. I denotes integration, i.e., the order of integration. A time series must be stationary before an econometric model can be built on it, and the ARIMA model is no exception, so a unit root test is first applied to the series; if the series is non-stationary, it is transformed into a stationary series by differencing, and the number of differences required is called the order of integration. MA denotes the moving average model. The ARIMA model is thus a combination of the AR model and the MA model applied to the differenced series. In an ARIMA(p,d,q) model, AR(p) is the autoregressive process and p is the number of autoregressive terms; MA(q) is the moving average process and q is the number of moving average terms; d is the number of differences needed to make the series stationary.

Autoregressive process AR(p) model

The AR model describes the relationship between the current value and historical values, using the variable’s own history to predict itself. The autoregressive model must satisfy the stationarity requirement. The p-th order autoregressive process is expressed as equation (1):

$$y_t = \mu + \sum_{i=1}^{p} \gamma_i y_{t-i} + \varepsilon_t \tag{1}$$

where $y_t$ is the current value, $\mu$ is the constant term, $p$ is the order, $\gamma_i$ is the autocorrelation coefficient, and $\varepsilon_t$ is the error, i.e., white noise.

Written out term by term, with $\mu_t$ denoting the random perturbation term, the formula is shown in (2):

$$X_t = \phi_1 X_{t-1} + \phi_2 X_{t-2} + \cdots + \phi_p X_{t-p} + \mu_t \tag{2}$$

If the random perturbation term is white noise ($\mu_t = \varepsilon_t$) and there is no constant term ($\mu = 0$), the AR model is said to be a pure AR(p) process, denoted as equation (3):

$$X_t = \phi_1 X_{t-1} + \phi_2 X_{t-2} + \cdots + \phi_p X_{t-p} + \varepsilon_t \tag{3}$$

As the formula shows, the current value is predicted from historical values; the order p indicates how many periods of historical values are used to predict the current value.
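As an illustration of equation (1), the following minimal sketch simulates an AR(p) series and forms a one-step-ahead forecast from the last p values. The function names, the zero start-up values and the Gaussian noise are assumptions made for this example, not part of the paper's method.

```python
import random


def simulate_ar(coeffs, mu, n, sigma=1.0, seed=0):
    """Simulate y_t = mu + sum_i coeffs[i-1] * y_{t-i} + eps_t (Gaussian white noise)."""
    rng = random.Random(seed)
    p = len(coeffs)
    y = [0.0] * p  # zero start-up values (an assumption of this sketch)
    for _ in range(n):
        # y[-i] is the value i periods back, i.e. y_{t-i}
        lagged = sum(c * y[-i] for i, c in enumerate(coeffs, start=1))
        y.append(mu + lagged + rng.gauss(0.0, sigma))
    return y[p:]  # drop the start-up values


def predict_next(history, coeffs, mu):
    """One-step-ahead forecast: the unknown noise term is replaced by its mean, zero."""
    return mu + sum(c * history[-i] for i, c in enumerate(coeffs, start=1))


if __name__ == "__main__":
    series = simulate_ar([0.6, 0.2], mu=1.0, n=50)
    print(predict_next(series, [0.6, 0.2], mu=1.0))
```

With p = 2, only the two most recent observations enter the forecast, which is the sense in which p counts the periods of history used.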

Moving Average Process MA(q)

The moving average model is concerned with the accumulation of error terms in the autoregressive model. In the AR model, if $\mu_t$ is not white noise, it is usually taken to be a moving average process MA(q) of order q, as shown in Eq. (4):

$$\mu_t = \beta_1 \varepsilon_{t-1} + \beta_2 \varepsilon_{t-2} + \cdots + \beta_q \varepsilon_{t-q} + \varepsilon_t \tag{4}$$

where $\varepsilon_t$ denotes a white noise sequence. In particular, Equation (5) is obtained when $\mu_t = X_t$, i.e., the current value of the time series is unrelated to its historical values and depends only on a linear combination of current and past white noise:

$$X_t = \beta_1 \varepsilon_{t-1} + \beta_2 \varepsilon_{t-2} + \cdots + \beta_q \varepsilon_{t-q} + \varepsilon_t \tag{5}$$
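Equation (5) can be sketched in the same way. The example below simulates an MA(q) series and computes its theoretical autocovariance, which vanishes beyond lag q because only q past noise terms enter each value. Function names and the Gaussian noise are illustrative assumptions.

```python
import random


def simulate_ma(betas, n, sigma=1.0, seed=0):
    """Simulate X_t = eps_t + sum_j betas[j-1] * eps_{t-j}."""
    rng = random.Random(seed)
    q = len(betas)
    eps = [rng.gauss(0.0, sigma) for _ in range(n + q)]  # q extra draws for start-up
    return [
        eps[t] + sum(b * eps[t - j] for j, b in enumerate(betas, start=1))
        for t in range(q, n + q)
    ]


def ma_autocovariance(betas, lag, sigma=1.0):
    """Theoretical autocovariance of an MA(q) process at the given lag."""
    psi = [1.0] + list(betas)  # weight 1 on the current noise term
    lag = abs(lag)
    if lag >= len(psi):
        return 0.0  # values more than q periods apart share no noise terms
    return sigma ** 2 * sum(psi[k] * psi[k + lag] for k in range(len(psi) - lag))


if __name__ == "__main__":
    print(len(simulate_ma([0.5], 20)))
    print([ma_autocovariance([0.5], h) for h in range(3)])
```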

ARIMA Model

In ARIMA, AR and MA are the autoregressive and moving average models, respectively, and I is the differencing step, which makes the data stationary.

Combining the autoregressive model (AR), the moving average model (MA) and differencing (I) gives the autoregressive integrated moving average model ARIMA(p,d,q), where d is the number of times the data need to be differenced; ARIMA is thus the ARMA model applied to the differenced series.
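The differencing step can be sketched as follows: the series is differenced d times before an ARMA model is fitted, and forecasts of the differenced series are mapped back to the original scale by cumulative summation. The helper names are hypothetical, and the ARMA fit itself is omitted.

```python
def difference(series, d=1):
    """Apply first differencing d times."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def undifference(diffed, initial_values):
    """Invert differencing.

    initial_values[k] is the first value of the k-times-differenced series,
    for k = 0 .. d-1 (so initial_values[0] is the first original observation).
    """
    out = list(diffed)
    for start in reversed(initial_values):
        restored = [start]
        for step in out:
            restored.append(restored[-1] + step)  # cumulative sum
        out = restored
    return out


if __name__ == "__main__":
    quadratic = [t * t for t in range(6)]  # quadratic trend: needs d = 2
    print(difference(quadratic, 1))
    print(difference(quadratic, 2))
    print(undifference(difference(quadratic, 2), [0, 1]))
```

A series with a quadratic trend becomes constant after two differences, which mirrors the case d = 2.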

Forecast analysis of the economic returns of the sports industry

In this chapter, the ARIMA-based econometric model and forecasting methodology constructed in the previous section are applied to quarterly year-on-year economic index data for China from Q1 2006 to Q2 2023. The data cover the sporting goods manufacturing industry (sporting goods and sports facilities production) and the sports services industry (provision of sports services, sports media and information services, and sports tourism and recreation), and are used to measure the economic returns of China’s sports industry. The volatility time series of the economic return index of China’s sports industry is then calculated in order to analyze its dynamics systematically and comprehensively. The time paths of the economic return indices of China’s sporting goods manufacturing industry and sports service industry are shown in Figure 1, with panels (a) and (b) representing the sporting goods manufacturing industry and the sports service industry, respectively.

Figure 1. Sports industry: (a) sporting goods manufacturing industry; (b) sports service industry

Figure (a) shows that the economic return index of China’s sporting goods manufacturing industry fluctuates widely over the sample period: it peaked in 2009, fell to its lowest level in 2013, and then rose rapidly. Since 2015, however, the index has declined slowly, reaching a trough in 2020. From 2021 to the present, it has again shown slow upward momentum.

Figure (b) shows that the economic return index of the sports services industry (provision of sports services, sports media and information services, sports tourism and entertainment) also fluctuates widely over the sample period: it peaked in 2012, fell to its lowest level in 2015, and then rose rapidly to its maximum in 2016. Since 2016, the index has declined slowly with small fluctuations, showing a small peak in 2021. From 2021 to the present, it has also shown slow upward momentum.

The descriptive statistics of the sports industry economic return index series are shown in Table 1. The skewness and kurtosis estimates show a pronounced “sharp peak and thick tail” distribution: every kurtosis value exceeds 3, the value for a normal distribution. The J-B normality test statistics and their P-values (all below 0.05) indicate that the economic return index series do not follow a normal distribution, which is consistent with these distributional characteristics.
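The J-B statistic can be recomputed from the skewness and kurtosis columns of Table 1. The sketch below uses the standard formula JB = n/6 · (S² + (K − 3)²/4) and the chi-square(2) tail probability exp(−JB/2). The function name is illustrative. The sample size n = 70 (the number of quarters from Q1 2006 to Q2 2023) is an assumption of this example: it comes close to the reported statistics, whereas the tabulated sample size of 80 does not.

```python
import math


def jarque_bera(n, skewness, kurtosis):
    """Jarque-Bera statistic and p-value from sample size, skewness and kurtosis."""
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    p_value = math.exp(-jb / 2.0)  # survival function of chi-square with 2 d.o.f.
    return jb, p_value


# (skewness, kurtosis) pairs copied from Table 1.
TABLE_1 = {
    "Sporting goods": (0.3291, 4.7819),
    "Sports facilities": (0.6313, 4.5542),
    "Sports service": (0.9757, 4.7898),
    "Sports media and information services": (-1.6485, 9.5655),
    "Sports and entertainment": (0.7149, 3.8629),
}

if __name__ == "__main__":
    for name, (s, k) in TABLE_1.items():
        jb, p = jarque_bera(70, s, k)  # n = 70 is an assumption, see the lead-in
        print(f"{name}: JB = {jb:.4f}, p = {p:.4f}")
```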

Table 1. Statistical estimates

| Industry | Sequence | Quantity of samples | Skewness | Kurtosis | J-B statistic | J-B probability P |
|---|---|---|---|---|---|---|
| Sporting goods manufacturing industry | Sporting goods | 80 | 0.3291 | 4.7819 | 10.5215 | 0.0053 |
| | Sports facilities | 80 | 0.6313 | 4.5542 | 11.691 | 0.0028 |
| Sports service industry | Sports service | 80 | 0.9757 | 4.7898 | 20.455 | 0.0000 |
| | Sports media and information services | 80 | -1.6485 | 9.5655 | 157.4709 | 0.0000 |
| | Sports and entertainment | 80 | 0.7149 | 3.8629 | 8.1432 | 0.0172 |

Time-series forecasting model of the impact and revenue growth of the event

The vector autoregressive (VAR) model can describe the dynamic relationships among multiple variables, which makes it suitable for analyzing the interaction between event influence and revenue growth in the sports industry within a single framework [22]. However, a general VAR model cannot capture the contemporaneous correlation between variables and therefore cannot support inferences about economic structure, since such inferences require distinguishing correlation from causation. The SVAR model improves on this: it extracts the contemporaneous correlations hidden in the error terms and gives them economic meaning by imposing constraints, so that the dynamic impact of random disturbances on the variable system can be further analyzed [23].

VAR modeling framework

A VAR model is a set of equations describing the interrelationships among multiple time series. It is classified as structural or reduced-form (induced) according to whether the right-hand side of each equation contains other variables contemporaneous with the dependent variable on the left. The two forms can be converted into each other. Taking the bivariate first-order lagged VAR model as an example, its structural form is given first:

$$\begin{cases} y_{1t} = a_{10} + a_{12} y_{2t} + \beta_{11} y_{1,t-1} + \beta_{12} y_{2,t-1} + \varepsilon_{1t} \\ y_{2t} = a_{20} + a_{21} y_{1t} + \beta_{21} y_{1,t-1} + \beta_{22} y_{2,t-1} + \varepsilon_{2t} \end{cases} \tag{6}$$

where $y_{1t}$ and $y_{2t}$ are stationary stochastic processes, and $\varepsilon_{1t}$ and $\varepsilon_{2t}$ are mutually uncorrelated random disturbance terms with variances $\sigma_1^2$ and $\sigma_2^2$, respectively. The right-hand side of each equation contains a contemporaneous effect on the left-hand side, i.e., $y_{2t}$ affects $y_{1t}$ and $y_{1t}$ affects $y_{2t}$, which accords with reality. If it is determined that there is no contemporaneous influence between the series, the coefficient of the corresponding variable is simply set to zero. This raises the question of how Eq. (6) is to be estimated. OLS requires the regressors to be uncorrelated with the error term in order to give unbiased estimates. Suppose $\varepsilon_{1t}$ is disturbed: the first equation of Eq. (6) shows that this affects $y_{1t}$, and the second equation shows that $y_{1t}$ in turn affects $y_{2t}$, so $\varepsilon_{1t}$ and $y_{2t}$ are correlated and Eq. (6) cannot be estimated by OLS.

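If there were no contemporaneous effects in the equations, this problem would not arise, so the contemporaneous variables are eliminated by a transformation. First, equation (6) is rewritten in matrix form:

$$\begin{bmatrix} 1 & -a_{12} \\ -a_{21} & 1 \end{bmatrix} \begin{bmatrix} y_{1t} \\ y_{2t} \end{bmatrix} = \begin{bmatrix} a_{10} \\ a_{20} \end{bmatrix} + \begin{bmatrix} \beta_{11} & \beta_{12} \\ \beta_{21} & \beta_{22} \end{bmatrix} \begin{bmatrix} y_{1,t-1} \\ y_{2,t-1} \end{bmatrix} + \begin{bmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \end{bmatrix} \tag{7}$$

The first matrix on the left-hand side of Eq. (7) is the identification matrix N introduced in the literature review. Let:

$$\begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = N^{-1} \begin{bmatrix} a_{10} \\ a_{20} \end{bmatrix}, \quad \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} = N^{-1} \begin{bmatrix} \beta_{11} & \beta_{12} \\ \beta_{21} & \beta_{22} \end{bmatrix}, \quad \begin{bmatrix} e_{1t} \\ e_{2t} \end{bmatrix} = N^{-1} \begin{bmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \end{bmatrix} \tag{8}$$

Multiplying Eq. (7) on the left by $N^{-1}$ and substituting Eq. (8) gives equation (9):

$$\begin{cases} y_{1t} = a_1 + b_{11} y_{1,t-1} + b_{12} y_{2,t-1} + e_{1t} \\ y_{2t} = a_2 + b_{21} y_{1,t-1} + b_{22} y_{2,t-1} + e_{2t} \end{cases} \tag{9}$$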
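The conversion from the structural form to the reduced form in the bivariate case can be sketched numerically: build N from a12 and a21, invert it, and premultiply the intercepts and the lag coefficient matrix. The function names and the numerical values are illustrative assumptions.

```python
def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def reduced_form(a12, a21, intercepts, beta):
    """Return (N^-1, reduced-form intercepts, reduced-form lag matrix)."""
    n_matrix = [[1.0, -a12], [-a21, 1.0]]  # contemporaneous terms moved to the left
    n_inv = inverse_2x2(n_matrix)
    a = [row[0] for row in matmul(n_inv, [[v] for v in intercepts])]
    b = matmul(n_inv, beta)
    return n_inv, a, b


if __name__ == "__main__":
    n_inv, a, b = reduced_form(0.5, 0.0, [1.0, 2.0], [[0.4, 0.1], [0.2, 0.3]])
    print(n_inv, a, b)
```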

There is no longer any contemporaneous influence in the system above, which is called the reduced-form (induced) set of equations and can be estimated without bias by OLS. The key to recovering the coefficients of the structural equations from those of the reduced-form equations is to obtain N. Taking the covariance matrix of both sides of the third equality in equation (8) gives:

$$N^{-1} \begin{bmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \end{bmatrix} \begin{bmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \end{bmatrix}^{T} \left(N^{-1}\right)^{T} = \begin{bmatrix} e_{1t} \\ e_{2t} \end{bmatrix} \begin{bmatrix} e_{1t} \\ e_{2t} \end{bmatrix}^{T} \tag{10}$$

By assumption, the error terms in the structural equations are uncorrelated, so taking expectations gives:

$$N^{-1} \begin{bmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{bmatrix} \left(N^{-1}\right)^{T} = \begin{bmatrix} \mathrm{Var}(e_{1t}) & \mathrm{Cov}(e_{1t}, e_{2t}) \\ \mathrm{Cov}(e_{1t}, e_{2t}) & \mathrm{Var}(e_{2t}) \end{bmatrix} \tag{11}$$

where Var(X) denotes the sample variance and Cov(X,Y) denotes the covariance between variables. Since the parameters of the reduced-form equations can be estimated by OLS, the right-hand side of equation (11) is known. The problem of converting from reduced-form to structural equations is also known as the SVAR model identification problem.

Identification of SVAR models

From the previous subsection, the key to converting the reduced-form (induced) VAR into the structural VAR is to find N, which requires imposing additional constraints. Identification schemes are classified according to the constraints imposed, as briefly introduced in the literature review. Different constraints correspond to different economic meanings; this paper illustrates only one special case of short-term constraints, the Cholesky decomposition [24].

Let $a_{21} = 0$ in Eq. (6); the corresponding structural equations are:

$$\begin{cases} y_{1t} = a_{10} + a_{12} y_{2t} + \beta_{11} y_{1,t-1} + \beta_{12} y_{2,t-1} + \varepsilon_{1t} \\ y_{2t} = a_{20} + \beta_{21} y_{1,t-1} + \beta_{22} y_{2,t-1} + \varepsilon_{2t} \end{cases} \tag{12}$$

This states that $y_{1t}$ has no contemporaneous impact on $y_{2t}$. From equation (11) it follows that:

$$a_{12} = \frac{\mathrm{Cov}(e_{1t}, e_{2t})}{\mathrm{Var}(e_{2t})} \tag{13}$$

From equation (8):

$$\begin{cases} \varepsilon_{1t} = -a_{12} e_{2t} + e_{1t} \\ \varepsilon_{2t} = e_{2t} \end{cases} \tag{14}$$

It follows that $a_{21} = 0$ has the following meaning:

The error term of $y_{2t}$ in the structural equation is identical to its error term in the reduced-form (induced) equation.

The error term $\varepsilon_{1t}$ of $y_{1t}$ in the structural equation is the residual from regressing the reduced-form error $e_{1t}$ on $e_{2t}$, with $a_{12}$ as the regression coefficient.

$y_{1t}$ has no contemporaneous effect on $y_{2t}$.

The multivariate case can be reasoned analogously. The Cholesky decomposition expresses a symmetric positive definite matrix as the product of a lower triangular matrix and its transpose; it is a special case of the LU triangular decomposition when A is a symmetric positive definite matrix:

$$A = LL^{T} \tag{15}$$

A symmetric positive definite matrix can also be decomposed as:

$$A = Q \Lambda Q^{T} \tag{16}$$

where $\Lambda$ is a real diagonal matrix. In the eigendecomposition Q is orthogonal; in the correspondence used here, Q is instead a triangular matrix with unit diagonal (the $LDL^{T}$ form), so that $L = Q\Lambda^{1/2}$ is triangular. Eqs. (16) and (11) then correspond to each other: A corresponds to $\begin{bmatrix} \mathrm{Var}(e_{1t}) & \mathrm{Cov}(e_{1t}, e_{2t}) \\ \mathrm{Cov}(e_{1t}, e_{2t}) & \mathrm{Var}(e_{2t}) \end{bmatrix}$, Q corresponds to $N^{-1}$, $\Lambda$ corresponds to $\begin{bmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{bmatrix}$, and $L = Q\Lambda^{1/2}$. After A is obtained from OLS estimation of the reduced-form (induced) equations, the Cholesky decomposition (with the variables ordered so that L is lower triangular) yields L, from which the elements of Q and $\Lambda$ follow. In other words, imposing zero constraints on some elements of the original equations so that $N^{-1}$ becomes triangular, and solving for the remaining elements, is equivalent to a Cholesky decomposition of the covariance matrix of the reduced-form equations: the elements of Q are the corresponding coefficients of the identification matrix, and $\Lambda$ is the covariance matrix of the structural error terms. This identification method is therefore referred to as Cholesky decomposition. Imposing zero or other constraints on other coefficients or variances of the structural equations, in order to express different economic meanings, gives the general short-term constraints, of which the Cholesky decomposition is a special case.
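For the bivariate case with a21 = 0, the identification step reduces to a few lines of arithmetic. The sketch below recovers a12 and the structural variances from the reduced-form covariance matrix using equations (11) and (13), and checks the result by rebuilding the covariance. Function names and numbers are illustrative assumptions.

```python
def identify_recursive(var_e1, cov_e12, var_e2):
    """Recover (a12, sigma1^2, sigma2^2) from the reduced-form covariance, given a21 = 0."""
    a12 = cov_e12 / var_e2                   # equation (13)
    sigma2_sq = var_e2                       # eps_2t = e_2t
    sigma1_sq = var_e1 - a12 ** 2 * var_e2   # variance of e_1t - a12 * e_2t
    return a12, sigma1_sq, sigma2_sq


def reduced_covariance(a12, sigma1_sq, sigma2_sq):
    """Covariance of (e_1t, e_2t) implied by equation (11) with N^-1 = [[1, a12], [0, 1]]."""
    var_e1 = sigma1_sq + a12 ** 2 * sigma2_sq
    cov_e12 = a12 * sigma2_sq
    var_e2 = sigma2_sq
    return var_e1, cov_e12, var_e2


if __name__ == "__main__":
    cov = reduced_covariance(0.5, 1.0, 4.0)
    print(cov)
    print(identify_recursive(*cov))
```

The round trip shows that one zero restriction is enough to pin down the three remaining unknowns from the three distinct entries of the covariance matrix.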

Impulse Response Function

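The impulse response function describes the effect of a one-unit change in the current error term $\varepsilon_{it}$, with t as the independent variable and $y_t$ (denoting the set of $y_{1t}, y_{2t}, \ldots, y_{pt}$) as the dependent variable [25]. For a more intuitive description, the VAR model is converted into a vector moving average (VMA) model. Still using the bivariate first-order lag model as an example, let:

$$y_t = \begin{bmatrix} y_{1t} \\ y_{2t} \end{bmatrix}, \quad \beta = \begin{bmatrix} \beta_{11} & \beta_{12} \\ \beta_{21} & \beta_{22} \end{bmatrix}, \quad \varepsilon_t = \begin{bmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \end{bmatrix} \tag{17}$$

where $\varepsilon_{1t}$ and $\varepsilon_{2t}$ are uncorrelated and the intercept term is ignored. Then:

$$y_t = \beta y_{t-1} + \varepsilon_t \tag{18}$$

Introducing the lag operator L, with $y_{t-1} = L y_t$:

$$(I - \beta L) y_t = \varepsilon_t \tag{19}$$

Multiplying both sides by $(I - \beta L)^{-1}$:

$$y_t = (I - \beta L)^{-1} \varepsilon_t \tag{20}$$

Multiplying $(I + \beta L + \beta^2 L^2 + \cdots + \beta^{n-1} L^{n-1})$ by $(I - \beta L)$ gives $(I - \beta^n L^n)$. As n tends to infinity, $\beta^n$ tends to zero provided the eigenvalues of $\beta$ lie inside the unit circle (the stationarity condition), so $(I - \beta^n L^n)$ tends to I, i.e., $(I - \beta L)^{-1} = I + \beta L + \beta^2 L^2 + \beta^3 L^3 + \cdots$. Substituting into Eq. (20) gives:

$$y_t = (I + \beta L + \beta^2 L^2 + \beta^3 L^3 + \cdots) \varepsilon_t \tag{21}$$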

Expanded, this moving average form of the VAR model is:

$$\begin{cases} y_{1t} = \sum_{k=0}^{\infty} \beta_{11}^{(k)} \varepsilon_{1,t-k} + \sum_{k=0}^{\infty} \beta_{12}^{(k)} \varepsilon_{2,t-k} \\ y_{2t} = \sum_{k=0}^{\infty} \beta_{21}^{(k)} \varepsilon_{1,t-k} + \sum_{k=0}^{\infty} \beta_{22}^{(k)} \varepsilon_{2,t-k} \end{cases} \tag{22}$$

where $\beta_{11}^{(k)}$, $\beta_{12}^{(k)}$, $\beta_{21}^{(k)}$ and $\beta_{22}^{(k)}$ are the corresponding elements of the k-th power $\beta^k$ of the matrix $\beta$ (note that $\beta_{ij}^{(k)} \neq (\beta_{ij})^k$; Zhao Hong provides a detailed derivation of its computation), i.e.:

$$\beta^k = \begin{bmatrix} \beta_{11}^{(k)} & \beta_{12}^{(k)} \\ \beta_{21}^{(k)} & \beta_{22}^{(k)} \end{bmatrix} \tag{23}$$

Thus $y_{1t}$ and $y_{2t}$ each have two impulse response functions, describing how they respond to shocks in the error terms $\varepsilon_{1t}$ and $\varepsilon_{2t}$, respectively. This shows why $\varepsilon_{1t}$ and $\varepsilon_{2t}$ must be uncorrelated: if they were correlated, a perturbation of $\varepsilon_{1t}$ might be accompanied by a perturbation of $\varepsilon_{2t}$, and the effect on $y_{1t}$ would no longer be given by the coefficients on $\varepsilon_{1t}$ alone. Since only the error terms of the structural equations are uncorrelated, the reduced-form (induced) equations must be converted into structural equations before impulse response analysis is performed. It can further be deduced that an n-variable first-order lag VAR model has $n^2$ impulse response functions. For a VAR model with higher-order lags, the moving average form can be obtained in the same way by factorization.
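The impulse responses of a first-order VAR are the entries of the successive powers of the lag matrix. The following sketch computes them for an arbitrary coefficient matrix; the matrix values and function names are illustrative assumptions.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def impulse_responses(beta, horizon):
    """Return [beta^0, beta^1, ..., beta^horizon].

    Entry [k][i][j] is the response of variable i, k periods after a
    one-unit shock to error term j.
    """
    n = len(beta)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # beta^0 = I
    out = [power]
    for _ in range(horizon):
        power = matmul(power, beta)
        out.append(power)
    return out


if __name__ == "__main__":
    beta = [[0.5, 0.1], [0.0, 0.4]]
    for k, m in enumerate(impulse_responses(beta, 3)):
        print(k, m)
```

For this matrix, the (1,2) entry of the second power is 0.5·0.1 + 0.1·0.4 = 0.09 rather than 0.1² = 0.01, which illustrates that the k-th power element is not the k-th power of the element.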

Variance decomposition

Variance decomposition decomposes the variance of the prediction error of a VAR model in order to characterize the contribution of each random perturbation to the dependent variable [26]. Taking the n-variable first-order lagged VAR model as an example, the general expression for the i-th variable follows by analogy with the previous section:

$$y_{it} = \sum_{p=1}^{n} \left( \sum_{k=0}^{\infty} \beta_{ip}^{(k)} \varepsilon_{p,t-k} \right) \tag{24}$$

The n error terms $\varepsilon_{pt}$ (p = 1, 2, …, n) are uncorrelated with each other and have variances $\sigma_p^2$, so that:

$$\mathrm{Var}(y_{it}) = \sum_{p=1}^{n} \left( \sum_{k=0}^{\infty} \left(\beta_{ip}^{(k)}\right)^2 \sigma_p^2 \right) \tag{25}$$

That is, the variance of $y_{it}$ can be decomposed into n uncorrelated effects. To determine the size of the contribution of each perturbation term, the relative variance contribution of the p-th perturbation term to the i-th variable is defined as:

$$RVC_{p \to i} = \frac{\sum_{k=0}^{\infty} \left(\beta_{ip}^{(k)}\right)^2 \sigma_p^2}{\sum_{j=1}^{n} \left( \sum_{k=0}^{\infty} \left(\beta_{ij}^{(k)}\right)^2 \sigma_j^2 \right)} \tag{26}$$
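The relative variance contribution can be approximated by truncating the infinite sums at a finite horizon. The sketch below does this for a first-order VAR; the truncation horizon, the function names and the example matrix are assumptions made for illustration.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def matrix_powers(beta, horizon):
    """Return [beta^0, ..., beta^horizon], the moving average coefficient matrices."""
    n = len(beta)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    out = [power]
    for _ in range(horizon):
        power = matmul(power, beta)
        out.append(power)
    return out


def relative_variance_contributions(beta, variances, i, horizon=200):
    """Share of the variance of variable i attributed to each structural shock.

    The infinite sums over k are truncated at `horizon` (an approximation).
    """
    powers = matrix_powers(beta, horizon)
    contributions = [
        variances[p] * sum(m[i][p] ** 2 for m in powers) for p in range(len(beta))
    ]
    total = sum(contributions)
    return [c / total for c in contributions]


if __name__ == "__main__":
    beta = [[0.5, 0.0], [0.3, 0.4]]
    print(relative_variance_contributions(beta, [1.0, 1.0], i=0))
    print(relative_variance_contributions(beta, [1.0, 1.0], i=1))
```

In this example the first variable never responds to the second shock, so its variance is attributed entirely to its own shock, while the second variable's variance is split between the two.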

Study on the time-series relationship between the impact of tournaments and revenue growth

In the previous chapter, this paper proposed a time series forecasting model for event impact and revenue growth, which will be used in this chapter to analyze the dynamic relationship between event impact and revenue growth variables in time series data in the context of the sports industry.

Many indicators can be used to measure the influence of sports industry events; commonly used ones include event attention, professionalism, contribution and economic benefits. This paper chooses “Tournament Economic Benefit (TYC)” as the indicator of event impact; it mainly refers to the final results of the impact of the sports events organized by all resident units of a country (or region) in a given period. Sports industry GDP statistics are compiled year by year and are continuous, so they objectively reflect the overall operation and development of the sports industry economy in recent years. Therefore, “GDP of the sports industry” is adopted as the indicator of sports industry revenue growth.

In summary, this chapter analyzes the relationship between event influence and revenue growth in the sports industry, taking “economic benefits of events (TYC)” and “gross domestic product (GDP) of the sports industry” as the analytical variables. The sample period is 2013-2023, and the data are mainly obtained from the China Statistical Yearbook, the China Tertiary Industry Statistical Bulletin, and the Sports Industry Statistical Bulletin of the State General Administration of Sports.

Parameter Estimation of a VAR Model of Race Impact and Revenue Growth

Since “Tournament Economic Benefits (TYC)” and “Gross Domestic Product (GDP)” are both time series, heteroskedasticity may be present. Therefore, before the empirical validation of the VAR model, the natural logarithm of each variable is taken, giving “lnTYC” and “lnGDP”. This transformation does not change the cointegration relationship between the original variables, linearizes the trend of the series, and mitigates heteroskedasticity.

Stability tests

To avoid spurious regression (“pseudo-regression”) when building a VAR model from the two time series “tournament economic benefits (TYC)” and “sports industry gross domestic product (GDP)”, a unit root test is required. This paper uses the ADF test for stationarity; the results are shown in Table 2. The ADF test values of the series lnTYC and lnGDP are greater than the critical value at the 10% significance level, so both series have a unit root and are non-stationary. After first-order differencing, the ADF test values are still greater than the critical value at the 10% significance level, so the first-difference series ΔlnTYC and ΔlnGDP (Δ is the first-order difference operator) are also non-stationary. After second-order differencing, the ADF test values are all less than the critical value at the 1% significance level, so the second-difference series Δ2lnTYC and Δ2lnGDP are stationary. The series lnTYC and lnGDP thus become stationary after second-order differencing and satisfy the stationarity condition.

Table 2. Stability result

| Variable | ADF test value | Test type (c,t,k) | 1% critical value | 5% critical value | 10% critical value | P | Stability |
|---|---|---|---|---|---|---|---|
| lnGDP | -0.32084 | c,0,1 | -5.29544 | -4.0082 | -3.46078 | P>0.1 | nonstationary |
| lnTYC | 9.401743 | c,0,1 | -2.81664 | -1.98227 | -1.6012 | P>0.1 | nonstationary |
| ΔlnGDP | -1.04696 | c,0,1 | -2.84732 | -1.9883 | -1.60021 | P>0.1 | nonstationary |
| ΔlnTYC | 0.208104 | c,0,1 | -2.84724 | -1.98812 | -1.60022 | P>0.1 | nonstationary |
| Δ2lnGDP | -3.53499 | c,0,1 | -2.93721 | -2.00628 | -1.59815 | P<0.01 | stationary |
| Δ2lnTYC | -3.00523 | c,0,1 | -2.88611 | -1.99586 | -1.59906 | P<0.01 | stationary |

Δ denotes the first-order difference series.

Δ2 denotes the second-order difference series.

c is the intercept term.

t is the trend term.

k is the lag order.
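The decision rule behind Table 2 can be made explicit: the unit-root null hypothesis is rejected when the ADF test value lies below the critical value. The sketch below applies this rule to the values copied from Table 2; the function and variable names are illustrative, and the ADF statistics themselves are taken as given rather than recomputed.

```python
def adf_rejection_level(statistic, critical_values):
    """Strictest level at which the unit-root null is rejected, or None if never."""
    for level in ("1%", "5%", "10%"):
        if statistic < critical_values[level]:
            return level
    return None


# (ADF test value, critical values) copied from Table 2; d1/d2 mark first/second differences.
TABLE_2 = {
    "lnGDP": (-0.32084, {"1%": -5.29544, "5%": -4.0082, "10%": -3.46078}),
    "lnTYC": (9.401743, {"1%": -2.81664, "5%": -1.98227, "10%": -1.6012}),
    "d1 lnGDP": (-1.04696, {"1%": -2.84732, "5%": -1.9883, "10%": -1.60021}),
    "d1 lnTYC": (0.208104, {"1%": -2.84724, "5%": -1.98812, "10%": -1.60022}),
    "d2 lnGDP": (-3.53499, {"1%": -2.93721, "5%": -2.00628, "10%": -1.59815}),
    "d2 lnTYC": (-3.00523, {"1%": -2.88611, "5%": -1.99586, "10%": -1.59906}),
}


def classify(table):
    """Label each series as stationary if the null is rejected at any listed level."""
    return {
        name: "stationary" if adf_rejection_level(stat, cv) else "nonstationary"
        for name, (stat, cv) in table.items()
    }


if __name__ == "__main__":
    for name, label in classify(TABLE_2).items():
        print(name, label)
```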

Johansen cointegration test

The ADF test shows that the series lnTYC and lnGDP are not themselves stationary, but a linear combination of them may be stationary. Such a combination reflects a possible long-term stable relationship between the series, called a “cointegration relationship (equilibrium relationship)”. To establish whether this long-run relationship exists, a cointegration test is required; the results are shown in Table 3. The null hypothesis “None” states that there is no cointegration relationship between lnTYC and lnGDP; the trace test statistic under this hypothesis is 25.60196, which is greater than the 5% critical value of 14.2635 (P<0.01), so the hypothesis is rejected and at least one cointegration relationship is considered to exist. The null hypothesis “At most 1” states that lnTYC and lnGDP have at most one cointegration relationship; the trace test statistic under this hypothesis is 0.008918, which is smaller than the 5% critical value of 3.841477 reported in Table 3 (P>0.05), so the hypothesis is accepted. The fit of the cointegration equation is good and reflects the long-term equilibrium relationship between the two variables: there is a positive correlation between the economic benefits of the event (TYC) and the gross domestic product (GDP) of the sports industry.
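The trace statistics in the cointegration test are functions of the estimated eigenvalues: for the hypothesis of at most r cointegration relationships, trace(r) = −T · Σ ln(1 − λ_i), summed over the eigenvalues ranked below the r largest. The sketch below evaluates this with the eigenvalues from Table 3. The effective sample size T = 9 is an assumption inferred for this example (the paper does not state it); with it, the formula gives values close to the reported statistics.

```python
import math


def trace_statistics(eigenvalues, n_obs):
    """Johansen trace statistics for r = 0, 1, ... from the estimated eigenvalues."""
    lam = sorted(eigenvalues, reverse=True)
    return [
        -n_obs * sum(math.log(1.0 - x) for x in lam[r:])
        for r in range(len(lam))
    ]


def decisions(stats, critical_values):
    """Reject the null of at most r relationships when the statistic exceeds its critical value."""
    return ["reject" if s > c else "accept" for s, c in zip(stats, critical_values)]


if __name__ == "__main__":
    stats = trace_statistics([0.941735, 0.000987], n_obs=9)  # T = 9 is an assumption
    print(stats)
    print(decisions(stats, [14.2635, 3.841477]))  # critical values from Table 3
```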

Table 3. Results of the cointegration test

| Null hypothesis | Eigenvalue | Trace statistic | Trace test critical value (5%) | P | Conclusion |
|---|---|---|---|---|---|
| None | 0.941735 | 25.60196 | 14.2635 | 0.0007 | Reject |
| At most 1 | 0.000987 | 0.008918 | 3.841477 | 0.9245 | Accept |

Vector Error Modeling Tests

Although there is a cointegration relationship between lnTYC and lnGDP, i.e., a long-run stable equilibrium relationship, in the short term various factors may push the series away from the equilibrium path. Therefore, a vector error correction (VEC) model is constructed on the basis of the cointegration analysis to link the short-term dynamics of the two series with their long-term equilibrium. When short-term fluctuations are large, the VEC model restricts the behavior of the endogenous variables so that they converge to the cointegration relationship. The estimation results of the vector error correction model are shown in Table 4. In the VEC model, the coefficients of ΔlnTYC and ΔlnTYCt-1 are -0.206788 and -0.529791, respectively, which shows a negative correlation between tournament economic benefits (TYC) and the gross product of the sports industry (GDP) in the short term. This is opposite to the positive long-term equilibrium relationship and indicates that the short-term interaction between TYC and GDP is characterized by large fluctuations. To maintain the long-term positive equilibrium, the error correction term adjusts the current period by -1.74408 times the previous period’s disequilibrium (deviation) between the variables, pulling the system back toward long-term equilibrium.
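The entries of Table 4 can be read as a short-run equation with an error correction term. The sketch below recomputes the t statistics as coefficient divided by standard error and evaluates the fitted equation for given inputs. Treating ΔlnGDP_t as the dependent variable is an assumption of this example, as are the variable and function names.

```python
# (coefficient, standard error) pairs copied from Table 4.
TABLE_4 = {
    "dlnTYC": (-0.206788, 0.113925),
    "C": (0.168401, 0.038592),
    "dlnGDP_lag1": (0.933958, 0.169932),
    "dlnTYC_lag1": (-0.529791, 0.181398),
    "ECM_lag1": (-1.74408, 0.253961),
}


def t_statistics(table):
    """t statistic of each coefficient: estimate divided by its standard error."""
    return {name: coef / se for name, (coef, se) in table.items()}


def fitted_dlngdp(dlntyc, dlngdp_lag1, dlntyc_lag1, ecm_lag1, table=TABLE_4):
    """Fitted short-run change, assuming the dependent variable is dlnGDP_t."""
    return (
        table["C"][0]
        + table["dlnTYC"][0] * dlntyc
        + table["dlnGDP_lag1"][0] * dlngdp_lag1
        + table["dlnTYC_lag1"][0] * dlntyc_lag1
        + table["ECM_lag1"][0] * ecm_lag1
    )


if __name__ == "__main__":
    for name, t in t_statistics(TABLE_4).items():
        print(f"{name}: t = {t:.4f}")
    # A positive deviation from equilibrium in the previous period pulls the fitted change down.
    print(fitted_dlngdp(0.0, 0.0, 0.0, 0.1))
```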

Table 4.

Vector error correction model results

| Variable | Coefficient estimate | Standard error | T statistic | P |
| --- | --- | --- | --- | --- |
| ΔlnTYC | -0.206788 | 0.113925 | -1.81445 | 0.1427 |
| C | 0.168401 | 0.038592 | 4.364773 | 0.013 |
| ΔlnGDPt-1 | 0.933958 | 0.169932 | 5.495774 | 0.0052 |
| ΔlnTYCt-1 | -0.529791 | 0.181398 | -2.92057 | 0.0435 |
| ECMt-1 | -1.74408 | 0.253961 | -6.86723 | 0.0026 |
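
As a quick consistency check on Table 4, each T statistic can be recomputed as the coefficient divided by the value in the third column. The sketch below uses only the numbers printed in the table; the variable and function names are illustrative, and small gaps are expected because the published figures are rounded.

```python
# (variable, coefficient, third-column value, reported T statistic) from Table 4
rows = [
    ("ΔlnTYC", -0.206788, 0.113925, -1.81445),
    ("C", 0.168401, 0.038592, 4.364773),
    ("ΔlnGDPt-1", 0.933958, 0.169932, 5.495774),
    ("ΔlnTYCt-1", -0.529791, 0.181398, -2.92057),
    ("ECMt-1", -1.74408, 0.253961, -6.86723),
]


def t_ratio(coefficient, standard_error):
    """Ratio of a coefficient estimate to its standard error."""
    return coefficient / standard_error


for name, coef, se, reported in rows:
    print(f"{name:10s} recomputed t = {t_ratio(coef, se):8.4f}, reported = {reported}")

# Largest absolute gap between recomputed and reported T statistics.
max_gap = max(abs(t_ratio(coef, se) - reported) for _, coef, se, reported in rows)
print("largest gap:", round(max_gap, 4))
```

The recomputed ratios agree with the reported T statistics to about two decimal places, which is what one would expect if the third column holds coefficient standard errors.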
Analysis of the relationship between event influence and revenue growth
Impulse Response Analysis

The impulse response function traces how a one-unit shock to one endogenous variable affects the current and future values of the other endogenous variables, and so shows the dynamic interaction between the variables directly. In this study, impulse responses are used to analyze the dynamic relationship between sports event influence and revenue growth, as shown in Figure 2. Figure 2(a) shows the response of earnings growth (GDP) to a shock in tournament influence (TYC), while Figure 2(b) shows the response of tournament influence (TYC) to a shock in earnings growth (GDP).
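
The idea of an impulse response can be illustrated with a small two-variable VAR(1) sketch. The coefficient matrix `A` below is purely hypothetical and is not estimated from the paper's data; the sketch only shows how a one-unit shock to the first variable propagates to both variables over time.

```python
def matvec(matrix, vector):
    """Multiply a small matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def impulse_response(coef, shock, horizon):
    """Responses of a VAR(1) y_t = A y_{t-1} + u_t to a one-time shock:
    the response h periods after the shock is A**h applied to the shock."""
    responses = [list(shock)]
    for _ in range(horizon):
        responses.append(matvec(coef, responses[-1]))
    return responses


# Hypothetical coefficients, variable order [TYC, GDP]; NOT the paper's estimates.
A = [[0.5, 0.2],
     [0.3, 0.6]]

# One-unit shock to TYC at period 0, followed for 8 periods.
irf_tyc_shock = impulse_response(A, shock=[1.0, 0.0], horizon=8)
for h, (tyc, gdp) in enumerate(irf_tyc_shock):
    print(h, round(tyc, 4), round(gdp, 4))
```

With these illustrative numbers the GDP response rises for two periods and then decays toward zero, because both eigenvalues of `A` (0.8 and 0.3) lie inside the unit circle.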

Figure 2.

Impulse response analysis

Figure 2(a) shows that when event influence is first shocked, economic earnings grow in the short term. Over a longer horizon, the positive effect of event influence on earnings growth remains relatively stable and weakens slowly as the period lengthens. The development of event influence can therefore promote economic growth, but its effect on the growth of economic income diminishes over time. To promote stable economic growth through sports, it is necessary to develop distinctive sports, create sports projects with regional characteristics, actively build sports-related industrial bases, and keep injecting new vitality into the sports industry so that its development is sustainable.

Figure 2(b) shows that sports event influence responds positively to a one-standard-deviation shock in economic revenue growth, so the development of sports rises in the short term. The response peaks in the third period and then declines slowly, but remains at a high level. This shows that economic revenue growth promotes the development of tournament influence; the effect is marked and fades only slowly. It follows that, for sports to develop soundly, the government needs to increase its financial investment in sports, plan financial resources rationally, and keep that investment on a stable cycle.

Granger causality test

The Granger causality test measures whether one set of series is exogenous to another, i.e., whether the past values of one variable help predict the other in a statistically significant way. The results of the Granger causality test for TYC and GDP are shown in Table 5. When tournament influence is the dependent variable, the p-value for earnings growth is 0.0488, which is less than 0.05, indicating that earnings growth is a Granger cause of tournament influence. When earnings growth is the dependent variable, the p-value for tournament influence is 0.122, which is greater than 0.05, indicating that tournament influence is not a Granger cause of earnings growth. The causal relationship between tournament influence and earnings growth is therefore unidirectional, running from earnings growth to tournament influence.

Table 5.

Results of the Granger causality test

| Dependent variable | Exogenous variable | Chi-sq statistic | P |
| --- | --- | --- | --- |
| TYC | GDP | 3.845082 | 0.0488** |
| TYC | ALL | 3.845082 | 0.0488** |
| GDP | TYC | 2.256788 | 0.122 |
| GDP | ALL | 2.256788 | 0.122 |
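
The way the two p-values in Table 5 translate into a causal direction can be written as a small decision rule. The helper `granger_direction` below is a hypothetical illustration, not part of any library and not the authors' code; it only compares the reported p-values with a chosen significance level.

```python
def granger_direction(name_x, name_y, p_x_to_y, p_y_to_x, alpha=0.05):
    """Classify the direction of Granger causality from two p-values.

    p_x_to_y: p-value of the test that x does NOT Granger-cause y.
    p_y_to_x: p-value of the test that y does NOT Granger-cause x.
    """
    x_causes_y = p_x_to_y < alpha
    y_causes_x = p_y_to_x < alpha
    if x_causes_y and y_causes_x:
        return f"{name_x} <-> {name_y}"
    if x_causes_y:
        return f"{name_x} -> {name_y}"
    if y_causes_x:
        return f"{name_y} -> {name_x}"
    return "no Granger causality"


# p-values copied from Table 5.
result_5pct = granger_direction("GDP", "TYC", p_x_to_y=0.0488, p_y_to_x=0.122)
result_1pct = granger_direction("GDP", "TYC", 0.0488, 0.122, alpha=0.01)
print(result_5pct)
print(result_1pct)
```

At the 5% level the rule returns a one-way relationship from GDP to TYC. Since 0.0488 is only just below 0.05, the same rule finds no Granger causality in either direction at the 1% level, so the conclusion depends on the chosen threshold.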
Variance decomposition

On the basis of the impulse response analysis and the Granger causality test, variance decomposition is used to measure how much each variable contributes to its own fluctuations and to those of the other variable. The results for TYC and GDP over the first 10 periods are shown in Table 6. The contribution of tournament influence to itself is largest in the first period, at 100%, and falls to 93.77% by the tenth period. The contribution of earnings growth to tournament influence is zero in the first period and rises to 6.23% by the tenth period. The contribution of earnings growth to itself is 99.81% in the first period and then decreases steadily to 76.69% in the tenth period. The contribution of tournament influence to earnings growth is 0.19% in the first period and grows to 23.31% by the tenth period. Both tournament influence and earnings growth therefore contribute most to themselves and less to each other.

Table 6.

Variance decomposition

| Period | TYC variance due to TYC (%) | TYC variance due to GDP (%) | GDP variance due to GDP (%) | GDP variance due to TYC (%) |
| --- | --- | --- | --- | --- |
| 1 | 100 | 0 | 99.81126 | 0.18874 |
| 2 | 94.38029 | 5.61971 | 88.85596 | 11.14404 |
| 3 | 94.44463 | 5.55537 | 83.70355 | 16.29645 |
| 4 | 93.84035 | 6.15965 | 81.0841 | 18.9159 |
| 5 | 93.94914 | 6.05086 | 79.55577 | 20.44423 |
| 6 | 93.82881 | 6.17119 | 78.56684 | 21.43316 |
| 7 | 93.826 | 6.174 | 77.87858 | 22.12142 |
| 8 | 93.81337 | 6.18663 | 77.37445 | 22.62555 |
| 9 | 93.78918 | 6.21082 | 76.99104 | 23.00896 |
| 10 | 93.77399 | 6.22601 | 76.6908 | 23.3092 |
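
A short sanity check on Table 6 is that, for each variable and period, the own share and the cross share should sum to 100%. The sketch below copies the table values and also reports how the cross contributions change between the first and tenth periods; the names `table6` and `shares_sum_to_100` are illustrative only.

```python
# Rows copied from Table 6, in percent:
# (period, TYC due to TYC, TYC due to GDP, GDP due to GDP, GDP due to TYC)
table6 = [
    (1, 100.0, 0.0, 99.81126, 0.18874),
    (2, 94.38029, 5.61971, 88.85596, 11.14404),
    (3, 94.44463, 5.55537, 83.70355, 16.29645),
    (4, 93.84035, 6.15965, 81.0841, 18.9159),
    (5, 93.94914, 6.05086, 79.55577, 20.44423),
    (6, 93.82881, 6.17119, 78.56684, 21.43316),
    (7, 93.826, 6.174, 77.87858, 22.12142),
    (8, 93.81337, 6.18663, 77.37445, 22.62555),
    (9, 93.78918, 6.21082, 76.99104, 23.00896),
    (10, 93.77399, 6.22601, 76.6908, 23.3092),
]


def shares_sum_to_100(rows, tol=1e-6):
    """True if own and cross shares add up to 100% for both variables."""
    return all(
        abs(tyc_own + tyc_cross - 100.0) < tol and abs(gdp_own + gdp_cross - 100.0) < tol
        for _, tyc_own, tyc_cross, gdp_own, gdp_cross in rows
    )


consistent = shares_sum_to_100(table6)
first, last = table6[0], table6[-1]
gdp_share_in_tyc_change = last[2] - first[2]  # GDP's share of TYC variance
tyc_share_in_gdp_change = last[4] - first[4]  # TYC's share of GDP variance

print("shares sum to 100%:", consistent)
print("change in GDP share of TYC variance:", round(gdp_share_in_tyc_change, 2))
print("change in TYC share of GDP variance:", round(tyc_share_in_gdp_change, 2))
```

The check passes for every period, and the cross contribution of TYC to GDP grows considerably more over ten periods than the cross contribution of GDP to TYC.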
Conclusion

This paper uses quarterly year-on-year data on the economic indices of the sporting goods manufacturing industry and the sports service industry from the first quarter of 2006 to the second quarter of 2023, together with the ARIMA-based economic return measurement and forecasting model proposed here, to measure and forecast the economic returns of China's sports industry. The time series of the sports economic income index shows a pronounced “sharp peak and thick tail” distribution, so it does not follow a normal distribution. Based on the measured economic revenue of the sports industry, a time series prediction model was constructed to investigate the relationship between the revenue growth of the sports industry and the influence of events. Within the VAR framework, the series for “Tournament Impact (TYC)” and “Earnings Growth (GDP)” are preprocessed; lnTYC and lnGDP become stationary after second-order differencing and pass the stationarity test, and the Johansen cointegration test verifies a positive long-run relationship between TYC and GDP. The vector error correction model shows a negative relationship between TYC and GDP in the short run, with large short-run fluctuations in their interaction, and an error correction coefficient of -1.74408 that pulls the variables back toward the long-run equilibrium. Impulse response analysis, the Granger causality test, and variance decomposition are then used to analyze the relationship between TYC and GDP. In the impulse response analysis, both TYC and GDP respond positively to the initial shock; over a longer horizon their mutual positive effects remain relatively stable but weaken slowly as the period lengthens.
In the Granger causality test, P<0.05 for GDP when TYC is the dependent variable, i.e., GDP is a Granger cause of TYC, whereas P>0.05 for TYC when GDP is the dependent variable, i.e., tournament influence is not a Granger cause of earnings growth. The causal relationship between the two variables is therefore one-way. Building on the impulse response and Granger causality analyses, a variance decomposition was conducted last. Both tournament influence and earnings growth contribute most to themselves: in the tenth period their own contributions are still 93.77% and 76.69%, respectively, while their contributions to each other are lower.
