The authors have declared that no competing interests exist.

Since patients with unexplained pneumonia were first admitted to Jinyintan Hospital in Wuhan, China in December 2019, the novel coronavirus disease (COVID-19) has spread rapidly within Wuhan, across the whole of China, and into some neighboring countries. We establish a dynamic model of infectious disease and time series models to predict the trend of COVID-19 transmission and to make short-term forecasts, which will support intervention and prevention by departments at all levels in mainland China and buy more time for clinical trials.

Based on the transmission mechanism of COVID-19 in the population and the prevention and control measures implemented, we establish a six-compartment dynamic model, and we establish time series models based on different mathematical formulas according to the variation pattern of the original data.

The results based on time series analysis and dynamic model analysis show that the cumulative number of confirmed COVID-19 cases in mainland China can reach 36,343 after one week (February 8, 2020), and the basic reproduction number can reach 4.01. The cumulative number of confirmed cases will reach a peak of 87,701 on March 15, 2020; the basic reproduction number in Wuhan will reach 4.3, and the cumulative number of confirmed cases in Wuhan will peak at 76,982 on March 20. In both mainland China and Wuhan, the infection rate and the basic reproduction number of COVID-19 continue to decline. The sensitivity analysis shows that the time it takes for a suspected case to be diagnosed as a confirmed case has a significant impact on the peak size and duration of the cumulative number of confirmed cases. Increased mortality leads to additional cases, while the cumulative number of confirmed cases is not sensitive to increases in the cure rate.

Chinese governments at various levels have intervened in many ways to control the epidemic. According to the results of the model analysis, we believe that the emergency intervention measures adopted in the early stage of the epidemic, such as locking down Wuhan, restricting the movement of people in Hubei Province, and increasing support to Wuhan, had a crucial restraining effect on the initial spread of the epidemic.

Continuing to increase investment in medical resources, so that suspected patients can be diagnosed and treated in a timely manner, is a very effective prevention and treatment method.

Based on the results of the sensitivity analysis, we believe that enhanced handling of the bodies of deceased patients can help ensure that neither the bodies themselves nor the handling process results in additional viral infections. Once COVID-19 pneumonia patients are cured, the antibodies left in their bodies may protect them from reinfection for a longer period of time.

Since December 2019, many unexplained cases of pneumonia with cough, dyspnea, fatigue, and fever as the main symptoms have occurred in Wuhan, China in a short period of time.

China's health authorities and CDC quickly identified the pathogen of such cases as a new type of coronavirus, which the World Health Organization (WHO) named COVID-19 on January 10, 2020.

On January 22, 2020, the Information Office of the State Council of the People's Republic of China held a press conference to introduce the situation regarding the prevention and control of pneumonia caused by the new coronavirus infection.

On the same day, the People's Republic of China's CDC released a plan for the prevention and control of pneumonia caused by the new coronavirus infection, covering COVID-19 epidemic research, specimen collection and testing, tracking and management of close contacts, and public information, education and risk communication.

Wuhan, China is the origin of COVID-19 and one of the cities most affected by it. The Mayor of Wuhan stated at a press conference on January 31, 2020 that Wuhan was urgently building Vulcan Mountain Hospital and Thunder Mountain Hospital, which would officially admit patients on February 3 and February 6, respectively.

In fact, there are many urgent questions about the spread of COVID-19. How many people will be infected tomorrow? When will the inflection point of the infection rate appear? How many people will be infected during the peak period? Can existing interventions effectively control COVID-19? What mathematical models are available to help us answer these questions?

COVID-19 is caused by a novel coronavirus that was only discovered in December 2019, so data on the outbreak are still insufficient, and medical approaches such as clinical trials are still at a difficult exploratory stage.

Recently, COVID-19 suddenly struck in Wuhan, the seventh largest city of the People's Republic of China. The daily epidemic announcement provides us with basic data of epidemiological research. We obtained the epidemic data from the National Health Commission of the People's Republic of China from January 10, 2020 to February 9, 2020, including the cumulative number of cases, the cumulative number of suspected cases, the cumulative number of people in recovery, the cumulative number of deaths and the cumulative number of people in quarantine in the Chinese mainland.

Based on the collected epidemic data, we tried to find the transmission pattern of COVID-19, predict the epidemic situation, and then propose effective control and prevention methods. There are generally three kinds of methods for studying the pattern of infectious disease transmission. The first is to establish a dynamic model of the infectious disease. The second is statistical modeling based on stochastic processes, time series analysis and other statistical methods. The third is to use data mining techniques to extract information from the data and discover the epidemic pattern of the disease.

After the outbreak of the COVID-19 epidemic, the Chinese government took many effective measures to combat it, such as screening and quarantine, isolation treatment, city lockdowns, and closing main roads to traffic.

| Classes | Explanations for different classes |
| --- | --- |
| S | People who may be infected by the virus |
| E | Infected with the virus but without the typical symptoms of infection |
| I | Infected with the virus and highly infectious but not quarantined |
| D | Diagnosed and quarantined |
| Q | Suspected cases of infection or potential victims |
| R | People who are cured after infection |

Since the incubation period of the COVID-19 is as long as 2 to 14 days, there are already infected but undetected people (

As shown in _{qd} at which suspected patients are converted into confirmed cases represents a measure of quarantine intensity due to the constant changes in medical procedures. At the same time, some highly infectious people in the free environment will be transferred to confirmed cases at the rate of _{id}, while others will be moved out at the rate of _{} due to lack of timely treatment.

The incidence rate of the susceptible population _{eq}. After medical diagnosis, some of the suspected cases were confirmed, while others were not detected and returned to the susceptible population with a ratio of _{qs}. The susceptible population has also been converted to suspected cases at a rate of _{qs}. According to the above-mentioned population classification and parameter definitions, we have established an SEIQDR model based on SEIR, which can better reflect the spread of COVID-19 in the population.

To study the transmission pattern of COVID-19 in more depth, we analyze some parameters in detail and transform the degree of infection into a form better suited to the data.

Among them, we refer to the infection rate coefficients of latent and freely infected people in susceptible populations as _{E} and _{I}. At this stage, the epidemic caused by COVID-19 may still be in the early stages of spreading among the population. We need to fit and estimate the above parameters using the original data published by the National Health Commission of China. Therefore, we simplify the formula to a certain extent: f(t) =

The infection rate

Among them,

Both the exponential smoothing method and the ARIMAX model are time series analysis methods, and these methods are often used in statistical modeling to analyze changes that occur over time.

COVID-19 has been spreading in the population for less than two months, so statistical modeling cannot yet reveal any seasonal fluctuation. Simple inspection of the data suggests that the spread of COVID-19 may follow a trend, so for time series prediction based on exponential smoothing we used the Brown linear exponential smoothing model to analyze the COVID-19 data. Compared with the single exponential smoothing model, the Brown linear exponential smoothing model contains two exponential smoothing terms.

Brown's linear exponential smoothing formula is as follows:

Among them, _{t }^{(1)} is the exponential smoothing value of the model, _{t }^{(2)} is the quadratic exponential smoothing value of the model.

In the above formula,
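The Brown linear exponential smoothing procedure can be illustrated with a short sketch. It uses the standard textbook form of the method: a single smoothed value, a double smoothed value, and a level and trend derived from the two. The function name `brown_forecast`, the smoothing constant `alpha`, and the choice to initialise both smoothed values with the first observation are assumptions made for illustration, not details taken from this paper.

```python
def brown_forecast(series, alpha, horizon):
    """Forecast `horizon` steps ahead with Brown's linear exponential smoothing.

    Illustrative sketch only: `alpha` is a hypothetical smoothing constant,
    and both smoothed values are initialised with the first observation.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    s1 = s2 = float(series[0])
    for y in series[1:]:
        s1 = alpha * y + (1 - alpha) * s1   # single exponential smoothing value
        s2 = alpha * s1 + (1 - alpha) * s2  # double (quadratic) smoothing value
    level = 2 * s1 - s2
    trend = alpha / (1 - alpha) * (s1 - s2)
    # The m-step-ahead forecast extrapolates the level along the trend.
    return [level + trend * m for m in range(1, horizon + 1)]


if __name__ == "__main__":
    # Synthetic, made-up series used only to show the call.
    print(brown_forecast([1, 2, 3, 4, 5, 6], alpha=0.5, horizon=2))
```

Because the forecast is a straight line through the latest level and trend, the method suits series with a trend but no seasonality, which matches the reason given above for choosing it.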

Time series analysis based on exponential smoothing is a statistical modeling method that cannot take input variables into account. Because the numbers of confirmed COVID-19 cases, cumulative deaths, and cumulative recoveries may be related in value, we also perform time series analysis based on ARIMAX models on these variables for mainland China, Hubei Province, Wuhan City and some surrounding cities, and we compare and select among these time series models according to their test statistics and prediction performance [24]. To predict COVID-19 in the population accurately, the formula of the ARIMAX model is as follows

The ARIMAX model can be seen as an ARIMA model with an intervention (input) sequence. Taking the number of confirmed cases, the quantity we are most concerned about as COVID-19 spreads through the population, as an example, the model represents the value of this series over a period of time as a combination of random fluctuations, past values of the number of confirmed cases, and past values of the input sequence. The input sequence may consist of variables that are highly correlated with the number of confirmed cases, such as the number of deaths, the cumulative number of recoveries, and the number of suspected cases.
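To make the role of the input sequence concrete, the following is a minimal sketch of the simplest case, an ARIMAX(0,1,0) model, written as a regression of the differenced target on the differenced input. This is a simplification chosen for illustration, not the fitting procedure used in this paper. The function names, the single input series, and the synthetic numbers are all hypothetical.

```python
def difference(series):
    """First-order difference of a sequence."""
    return [b - a for a, b in zip(series, series[1:])]


def fit_arimax_010(y, x):
    """Fit dy_t = const + beta * dx_t by ordinary least squares.

    Sketch of an ARIMAX(0,1,0) model with one input series `x`:
    no autoregressive or moving-average terms, only first differencing.
    """
    dy, dx = difference(y), difference(x)
    n = len(dy)
    mean_dx, mean_dy = sum(dx) / n, sum(dy) / n
    sxx = sum((u - mean_dx) ** 2 for u in dx)
    if sxx == 0:
        raise ValueError("the differenced input series has no variation")
    sxy = sum((u - mean_dx) * (v - mean_dy) for u, v in zip(dx, dy))
    beta = sxy / sxx
    const = mean_dy - beta * mean_dx
    return const, beta


def forecast_next(y, x, x_next, const, beta):
    """One-step forecast; needs the next value of the input series."""
    return y[-1] + const + beta * (x_next - x[-1])


if __name__ == "__main__":
    # Synthetic data (not epidemic data): y rises by 2 + 3 * (rise in x).
    x = [0, 1, 3, 6, 10, 15]
    y = [10, 15, 23, 34, 48, 65]
    const, beta = fit_arimax_010(y, x)
    print(const, beta, forecast_next(y, x, 21, const, beta))
```

The sketch also shows a practical limitation of models with input sequences: forecasting the target requires a known or separately forecast future value of the input.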

Due to the long duration of the COVID-19 epidemic, it was still in an ascending period as of February 9. Taking one day as the minimum time unit, we obtain a discrete model from the continuous model according to its practical meaning, use the epidemic data of Hubei Province and the whole country to estimate how its parameters change, and carry out numerical simulation. With one day as the time unit, the continuous model can be discretized as:

According to the discrete model, once the parameters are determined, the model can be evaluated numerically from the initial value of each variable. We set the initial values of some parameters according to the references and the latest outbreak information, and we use the least squares method to estimate the variables and parameters that cannot be determined directly. According to the literature
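The daily-step discretization and the least squares fitting can be illustrated with a simplified sketch. It deliberately uses a plain four-compartment SEIR model rather than the six-compartment model of this paper, and it fits a single transmission coefficient by grid search. All names (`simulate_seir`, `fit_beta`, `beta`, `sigma`, `gamma`) and all numbers are hypothetical.

```python
def simulate_seir(beta, sigma, gamma, s0, e0, i0, rec0, days):
    """Discrete SEIR simulation with a one-day step.

    Returns the daily number of infectious people, starting at day 0.
    beta: transmission coefficient, sigma: E -> I rate, gamma: I -> R rate.
    """
    population = s0 + e0 + i0 + rec0
    s, e, i, r = float(s0), float(e0), float(i0), float(rec0)
    infectious = [i]
    for _ in range(days):
        new_exposed = beta * s * i / population
        new_infectious = sigma * e
        new_removed = gamma * i
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_removed
        r += new_removed
        infectious.append(i)
    return infectious


def fit_beta(observed, candidates, **fixed):
    """Pick the candidate beta with the smallest sum of squared errors."""
    def sse(beta):
        simulated = simulate_seir(beta, days=len(observed) - 1, **fixed)
        return sum((a - b) ** 2 for a, b in zip(simulated, observed))
    return min(candidates, key=sse)


if __name__ == "__main__":
    fixed = dict(sigma=0.2, gamma=0.1, s0=990, e0=5, i0=5, rec0=0)
    # Synthetic observations generated from the model itself.
    observed = simulate_seir(0.6, days=30, **fixed)
    print(fit_beta(observed, [0.2, 0.4, 0.6, 0.8], **fixed))
```

A grid search keeps the sketch dependency-free; in practice a numerical optimizer would usually replace it, and several parameters would be fitted at once.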

In this paper, the least squares method is used to fit the unknown parameters, but the value of

We use sequence plots and autocorrelation functions of the original data to assess the stationarity of these time series, and we stabilize any series whose mean and variance are not constant. For the exponential smoothing method, we apply a natural logarithmic transformation to the series. For the ARIMA and ARIMAX models, we apply first-order or second-order differencing to the original series. With this processing, we obtain the summary information of the time series models for the number of confirmed cases in mainland China as shown in

| Method | Stability treatment | Model | Stationary R-squared | R-squared | Normalized BIC | Ljung-Box Q(18) Sig. | Number of outliers | Serial number |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Exponential smoothing method | None | Brown | 0.245 | 0.996 | 11.338 | 0.787 | 0 | 1 |
| Exponential smoothing method | Natural logarithmic transformation | Brown | 0.605 | 0.992 | 12.039 | 0.958 | 0 | 2 |
| ARIMA | First-order difference | ARIMA(0,1,0) | 0.235 | 0.961 | 13.644 | 0.000 | 0 | 3 |
| ARIMA | Second-order difference | ARIMA(0,2,0) | 0.687 | 0.997 | 11.223 | 0.912 | 0 | 4 |
| ARIMAX | First-order difference | ARIMAX(0,1,0) | 0.977 | 0.999 | 10.634 | 0.987 | 0 | 5 |
| ARIMAX | Second-order difference | ARIMAX(0,2,0) | 0.208 | 0.997 | 11.793 | 0.997 | 0 | 6 |
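The stability treatments and the residual check behind the Ljung-Box column can be sketched in a few lines. The code shows differencing, the sample autocorrelation function, and the Ljung-Box Q statistic in its standard form. The function names are hypothetical, the input is synthetic, and the significance value (which needs a chi-square distribution) is not computed here.

```python
import math


def log_transform(series):
    """Natural logarithmic transformation (all values must be positive)."""
    return [math.log(v) for v in series]


def difference(series, order=1):
    """Apply first-order differencing `order` times."""
    out = list(series)
    for _ in range(order):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((v - mean) ** 2 for v in series)
    if denom == 0:
        raise ValueError("a constant series has no defined autocorrelation")
    num = sum((series[t] - mean) * (series[t + lag] - mean)
              for t in range(n - lag))
    return num / denom


def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic over lags 1..max_lag."""
    n = len(residuals)
    return n * (n + 2) * sum(
        autocorrelation(residuals, k) ** 2 / (n - k)
        for k in range(1, max_lag + 1)
    )


if __name__ == "__main__":
    squares = [1, 4, 9, 16, 25]
    print(difference(squares, order=2))           # constant second difference
    print(ljung_box_q([1, -1, 1, -1, 1, -1], 1))  # strongly autocorrelated
```

A large Q relative to the chi-square reference indicates that the residuals still contain structure the model has not captured; a small Q is consistent with white-noise residuals.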

As shown in

In the function graph, we can judge whether the residuals still contain valid information of the time series. Through

Then the number of confirmed cases in mainland China is predicted. Based on the predictions in

We also made six time series models of cumulative confirmed cases in Hubei Province as shown in

| Method | Stability treatment | Model | Stationary R-squared | R-squared | Normalized BIC | Ljung-Box Q(18) Sig. | Number of outliers | Serial number |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Exponential smoothing method | None | Brown | 0.136 | 0.994 | 11.228 | 0.958 | 0 | 7 |
| Exponential smoothing method | Natural logarithmic transformation | Brown | 0.021 | 0.987 | 11.928 | 0.845 | 0 | 8 |
| ARIMA | First-order difference | ARIMA(0,1,0) | 0.205 | 0.958 | 13.141 | 0.000 | 0 | 9 |
| ARIMA | Second-order difference | ARIMA(0,2,0) | 0.264 | 0.994 | 11.200 | 0.977 | 0 | 10 |
| ARIMAX | None | ARIMAX(0,0,0) | 0.999 | 0.999 | 10.383 | 0.806 | 0 | 11 |
| ARIMAX | First-order difference | ARIMAX(0,1,0) | 0.973 | 0.999 | 10.279 | 0.655 | 0 | 12 |

According to

According to the data released by the National Health Commission of China, we take the data of January 10 as the initial values. On January 10, transmission of COVID-19 had occurred only in Hubei Province, where 41 people were confirmed, 0 were suspected, 2 were cured, 2 were infected with COVID-19 but not yet sick, and 0 were ill but not isolated, namely:

Since COVID-19 originated in Wuhan, Hubei, the above initial values can also be used as the national initial values. Based on this, we use the least squares method to calculate

_{H}= 0.815 , _{H}= 2236 .

_{N}= 0.553 , _{N}= 2527 .

Among them,

We use the SEIQDR model to predict the spread of COVID-19 in mainland China and Wuhan, and obtain the development trend of the cumulative number of confirmed cases in mainland China and Wuhan as shown in

Observing the development trend of the cumulative number of confirmed cases in mainland China, we believe that once the cumulative number of confirmed cases in Wuhan reaches its peak, the growth rate of confirmed cases in mainland China will decrease rapidly, and the cumulative number of confirmed cases in mainland China will peak around March 15, 2020, at 87,701. To give a clearer mathematical description of the spread of COVID-19 in Wuhan and mainland China, we analyze the infection rate and the basic reproduction number of the new coronavirus. We fit the infection rate using the historical data and the formula for the infection rate given above, and we let the basic reproduction number of COVID-19 be

As can be seen from the above formula, the basic reproduction number is closely related to the infection rate. In the above formula, t is the average latency of COVID-19, and we set its initial value to 7. From this, we obtain the trends over time of the infection rate and the basic reproduction number of COVID-19 in Wuhan and mainland China as shown in

Based on the results of the SEIQDR model, we can also study the development trend of the COVID-19 epidemic in mainland China, Hubei Province, and Wuhan in more detail. Without loss of generality, we keep the other parameters of the SEIQDR model at their initial settings and vary selected parameters for mainland China to explore the possible changes in the cumulative number of confirmed cases. We plotted the trend of the number of confirmed cases in mainland China as the average time required for a suspected case to become a confirmed case changes, as shown in

In

Finally, we analyzed the sensitivity of mortality

We found that when the mortality rate gradually increased from 2% to 10%, the cumulative number of confirmed cases increased by about 2,000. We believe this means that the bodies of patients killed by COVID-19 may still carry a certain amount of infectious new coronavirus. According to this simulation, we suggest that mainland China, especially Hubei Province and Wuhan City, pay attention to the handling of the bodies of deceased patients and try to ensure that neither the body itself nor the handling process causes additional contagion. We also gradually increased the cure rate from 2% to 10% and found that the cumulative number of confirmed cases hardly changed. We think this means that once pneumonia patients infected with the new coronavirus are cured, the antibodies left in their bodies may remove them from the population susceptible to the new coronavirus. In the early stage of COVID-19 transmission, when little was known about COVID-19 and clinical trials were difficult to start immediately, our modeling and results provided reference values for intervention, prevention, and clinical trials at all levels.
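The one-at-a-time sensitivity analysis used in this section can be sketched as follows: hold every parameter fixed, vary one, and compare the resulting cumulative number of confirmed cases. The sketch uses a deliberately reduced model in which infectious people are confirmed and isolated after an average delay. It is not the SEIQDR model, and the function name, parameter names and values are all hypothetical.

```python
def cumulative_confirmed(delay_days, beta=0.5, sigma=0.2,
                         population=1_000_000, i0=10, days=200):
    """Cumulative confirmed cases after `days` in a reduced daily-step model.

    delay_days: assumed average time from becoming infectious to being
    confirmed and isolated; a longer delay leaves people infectious longer.
    """
    s, e, i, confirmed = float(population - i0), 0.0, float(i0), 0.0
    for _ in range(days):
        new_exposed = beta * s * i / population
        new_infectious = sigma * e
        new_confirmed = i / delay_days
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_confirmed
        confirmed += new_confirmed
    return confirmed


if __name__ == "__main__":
    # Vary only the confirmation delay; all other parameters stay fixed.
    for delay in (2, 4, 6):
        print(delay, round(cumulative_confirmed(delay)))
```

In this reduced model a longer confirmation delay gives a larger cumulative count, which is the same direction of effect reported above for the time needed to convert suspected cases into confirmed cases.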

There is no doubt that the spread of COVID-19 in the population is affected by many interacting factors. In the early stage of the spread, it is difficult to establish a dynamic transmission model with parameters to be estimated and to obtain fairly accurate simulation results. However, preliminary estimates of parameters such as average latency and mortality from existing data may help in solving for important parameters such as the infection rate and the recovery rate, which gives us a more accurate grasp of the transmission trend of COVID-19. On the other hand, statistical modeling of the spread of new coronavirus pneumonia based on time series analysis can be carried out immediately after the latest data arrive each day, because a time series model is built on the pattern of the data itself. Although this method usually requires sufficient data, in the early stages of epidemic transmission it can still predict the indicators of epidemic transmission fairly accurately in the short term, and so it can support intervention, control, and policy implementation by departments at all levels with short-term emergency prevention programs.

This article will inevitably make some assumptions when building the model. When we build a dynamic discrete model for a certain period of time for COVID-19, we ignore the impact of factors such as population birth rate and natural mortality.

To simplify the calculations, we also assume that the latent population of COVID-19 and the infected but not yet isolated population have the same range of activities and capabilities, that is, we assume that for COVID-19, the population

This work was supported by the Philosophical and Social Sciences Research Project of Hubei Education Department (19Y049), and the Staring Research Foundation for the Ph.D. of Hubei University of Technology (BSQD2019054), Hubei Province, China.