Spatiotemporal Characteristics of Traffic Accidents in China, 2016–2019

This study analyzed in-depth investigation reports for 418 traffic accidents with at least five deaths (TALFDs) in China from 2016 to 2019. Statistical analysis methods including hierarchical cluster analysis were employed to examine the distribution characteristics of these accidents. Accidents were found to be concentrated in July and August, and the distribution over the seven days of the week was relatively uniform; only Sunday had a higher number of accidents and deaths. In terms of 24-hour distribution, the one-hour periods with the most accidents and deaths were 8:00–9:00, 10:00–11:00, 14:00–15:00, and 18:00–19:00. Tibet, Qinghai, and Ningxia had the highest death rates per 10,000 vehicles as well as the highest death rates per 100,000 inhabitants in TALFDs. In addition, the provinces with the most accidents and deaths were Sichuan, Henan, and Yunnan. Accidents on ordinary highways accounted for approximately 70% of the total, with the death toll on those roads accounting for approximately 64% of total deaths. Accidents on expressways accounted for approximately a quarter of all traffic accidents while the number of deaths accounted for more than 30% of the total. These results can guide traffic management departments to adopt better planning and management strategies to help reduce the number of traffic accidents and deaths.


Introduction
The World Health Organization's (WHO) Global Status Report on Road Safety 2018 noted that the number of traffic-related deaths had continued to rise worldwide. In 2016, the number of deaths from traffic accidents worldwide reached 1.35 million. On average, one person dies in a traffic accident every 24 seconds, and the number of people injured is as high as 50 million. Traffic-related injuries and deaths are therefore a significant public health concern. The same WHO report estimated that the number of deaths from traffic accidents in China in 2016 was 256,180, accounting for approximately 19% of global traffic-related deaths that year [1]. Given this severe traffic-safety situation, China needs to focus on the management and prevention of accidents, strengthen the analysis of accidents, and increase research on accident-prevention measures. In this regard, accidents must be investigated in a timely manner so that hidden safety hazards can be identified effectively and scientifically and overall accident prevention can be improved.
Using statistical methods to analyze the spatiotemporal distribution of traffic accidents in Nigeria, Jegede [2] identified six traffic areas that could be considered accident black spots; in particular, March, September, and December were found to have the highest incidence of accidents. Plug et al. [3] studied the spatiotemporal distribution of single-vehicle accidents in Australia, providing useful information to help decision-makers formulate traffic-safety strategies. Singh [4] analyzed the spatiotemporal distribution of accidents at the national, state, and metropolitan levels in India in 2003 and 2013. Most accidents occurred from May to June and from December to January; further, most accidents occurred between 9:00 and 21:00. Additionally, traffic accident risk differed greatly across regions; specifically, 16 of India's 35 states and federal territories had risks above the national average. Albayati et al. [5] analyzed the monthly and road-type distribution of accidents in Iraq from 2002 to 2015; October had the highest number of accidents, and accidents on arterial roads accounted for 59% of accidents on all types of roads. Kang et al. [6] analyzed the spatiotemporal characteristics of traffic accidents involving the elderly in Seoul. Accident black spots were found in the hiking and climbing areas of Seoul during the spring and fall. Studying the spatiotemporal characteristics of traffic accidents in highway tunnels in China, Ma et al. [7] and Sun et al. [8] found that most accidents occurred during special holidays (especially the Chinese New Year), and most occurred near the entrances and exits of tunnels. Using correlation analysis to examine accidents on a typical continuous downhill section of an expressway, Ma et al. [9] found that accident, death, and injury rates were the highest between 8:00 and 10:00. Moreover, Ma et al. [10] used big data for traffic accidents in Suzhou to analyze their spatiotemporal distribution characteristics; they found that the incidence of traffic accidents was positively correlated with road grade. Pleerux [11] used accident data for 2012-2017 collected from Thailand's Road Accident Data Center to identify road-accident patterns and distribution, and found a high density of accidents in three main areas of the Sri Racha district of Chon Buri Province. Ramírez et al. [12] used 2014-2016 data from the traffic police department of Bogota, Colombia, regarding incidents with injuries or fatalities to identify critical zones in the city that required more attention.
Clustering algorithms partition the objects in a dataset into several groups based on their characteristics [13]. The most commonly used clustering algorithms include K-means [14], fuzzy C-means [15], two-step [16], and hierarchical clustering [17]. In recent years, cluster analysis methods have been widely used in the field of traffic accident analysis [18–25]. In hierarchical clustering, a dataset is recursively partitioned into successively smaller clusters [26]. Moreover, hierarchical clustering does not require any predefined parameters and is therefore more suitable for handling real-world data than other methods [27].
Although there have been studies of the spatiotemporal characteristics of traffic accidents in various regions of China, data can be difficult to obtain for all of China. In China's national standard Road Traffic Accident Information Investigation (GA/T 1082-2013) [28], fatal accidents are divided into four levels: general, major, severe, and particularly severe; these correspond to accidents causing 1-2, 3-9, 10-29, and 30 or more deaths, respectively. According to the Regulations for the Handling of Road Traffic Accidents [29], implemented on May 1, 2018, in the event of a traffic accident with at least five deaths (TALFD), the traffic administrative department of the public security bureau (TADPSB) should report it to the local government and to the Traffic Management Bureau of the Ministry of Public Security. In addition, the Provisions on Procedures for Handling Road Traffic Accidents (order no. 146 of the Ministry of Public Security) [30] stipulates that the TADPSB should investigate accidents causing three or more deaths and produce in-depth reports. Accordingly, we collected in-depth investigation reports of TALFDs in China from 2016 to 2019 that had been delivered to the Traffic Management Bureau of the Ministry of Public Security. In this study, statistical analysis methods, including hierarchical cluster analysis, were employed using SPSS software to explore the spatiotemporal characteristics of TALFDs.

Cluster Method
Cluster methods are distinguished primarily by their different linkage rules for the formation of clusters. Single linkage, complete linkage, average linkage, and Ward's method are widely used across various disciplines. Ward's method eliminates small clusters and produces clusters of comparable sizes corresponding to a homogenized subset of the selected data file, which makes the study of the examined objects more accurate than the other methods [31]. This method was therefore selected for the present study. Squared Euclidean distance (SED) was used for Ward's method in this study, which is expressed as follows:
$$d_{ij}^{2} = \sum_{k=1}^{n} \left( x_{ik} - x_{jk} \right)^{2}$$
where $d_{ij}^{2}$ is the SED between observations $i$ and $j$, $x_{ik}$ is the value of the $k$th variable for the $i$th observation, $x_{jk}$ is the value of the $k$th variable for the $j$th observation, and $n$ is the total number of variables.
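The linkage rule described above can be illustrated with a minimal sketch in plain Python. It is not the SPSS procedure used in this study; the function names and the toy observations are assumptions made purely for illustration. The merge cost follows the usual statement of Ward's criterion: the increase in within-cluster sum of squares, which for two clusters of sizes a and b equals ab/(a+b) times the SED between their centroids.

```python
from itertools import combinations

def squared_euclidean(x, y):
    """Squared Euclidean distance (SED) between two equal-length observations."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def centroid(points):
    """Component-wise mean of a list of observations."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b are merged."""
    na, nb = len(a), len(b)
    return na * nb / (na + nb) * squared_euclidean(centroid(a), centroid(b))

def ward_clusters(points, k):
    """Naive agglomeration: repeatedly merge the cheapest pair until k clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return clusters

# Hypothetical standardized observations (not data from this study)
toy = [(0.0, 0.1), (0.2, 0.0), (3.0, 3.1), (3.2, 2.9), (6.0, 0.0)]
groups = sorted(sorted(g) for g in ward_clusters(toy, 3))
for g in groups:
    print(g)
```

This naive version recomputes every pairwise cost at each step, which is acceptable for a dozen months or a few dozen regions; statistical packages use more efficient update formulas for larger datasets.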

Standardization of Variables
Variable standardization is necessary when the values for different variables are in different units. The Z-score is the most frequently used approach for variable standardization, which is carried out as follows:
$$z_{kj} = \frac{x_{kj} - \bar{x}_{j}}{S_{j}}$$
where $z_{kj}$ is the $k$th standardized value of the $j$th variable, $x_{kj}$ is the $k$th original data value of the $j$th variable, $\bar{x}_{j}$ is the mean value of the $j$th variable, and $S_{j}$ is the standard deviation of the $j$th variable.
The monthly figures for traffic accidents and deaths during the study period were entered into SPSS as variables, and Z-scores were used to standardize the variables. Ward's method was used to sort the samples into classes, and rescaled distance was used to obtain a hierarchical analysis dendrogram (Fig. 1). The clustering results were divided into poor, medium, and relatively good, as shown in Tab. 2. The clustering results suggest that, during the study period, July and August fell into the "poor" class for TALFDs in China. Specifically, the shares of accidents in July and August were 1.96 and 1.72 percentage points higher, respectively, than the monthly average of 8.33%; further, the shares of deaths were 2.41 and 3.37 percentage points higher, respectively, than the monthly average of 8.33%. The possible reasons for this are as follows. First, July and August belong to China's summer vacation period, which is the peak season for tourism. Self-driving trips, long-distance bus trips, and long-distance tourist bus trips are at their peak. Thus, the risk of accidents is higher than in other months. Flat tires and spontaneous combustion are characteristic causes of accidents that occur frequently in the summer. For example, on July 1, 2016, 26 people were killed and 4 injured in an accident on the Tianjin Jinji Expressway attributed to a punctured tire; the vehicle fell into the Yandong Canal, and the 26 victims drowned. Second, in July and August, it is very hot in most parts of China, and some drivers travel at night to avoid the high temperatures, which increases the risk of accidents due to fatigue. For example, on August 10, 2017, at 23:30, an accident attributed to driver fatigue resulted in 36 deaths and 13 injuries on the Beijing-Kunming Expressway in Ankang, Shaanxi.
Fig. 2 shows the percentage of TALFDs and percentage of deaths in China from 2016 to 2019 by day of the week. During the study period, accidents and fatalities remained relatively uniform in terms of distribution by day of the week. Only Sunday had a higher number of accidents and fatalities. The shares of accidents and fatalities on Sunday were, respectively, 2.46 and 3.31 percentage points higher than the weekly average of 14.29%. This could be because Sunday comes at the end of the weekend break, and the total traffic volume is higher than on other days.
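The baselines quoted above (8.33% per month, 14.29% per day of the week) are simply the shares that a perfectly uniform distribution would give, and the reported offsets are percentage-point gaps from those baselines. The following sketch makes that arithmetic explicit. The function names are illustrative, and the two observed shares are not read from the source tables; they are assumptions obtained by adding each reported offset back to its baseline.

```python
def uniform_share(n_categories):
    """Percentage each category would hold if events were spread evenly."""
    return 100.0 / n_categories

def excess_points(share_percent, n_categories):
    """Percentage-point gap between an observed share and the uniform share."""
    return share_percent - uniform_share(n_categories)

monthly_baseline = uniform_share(12)  # about 8.33%
weekly_baseline = uniform_share(7)    # about 14.29%

# Assumed shares, reconstructed as baseline + reported offset
july_accident_share = 8.33 + 1.96
sunday_accident_share = 14.29 + 2.46

print(round(monthly_baseline, 2), round(weekly_baseline, 2))
print(round(excess_points(july_accident_share, 12), 2))
print(round(excess_points(sunday_accident_share, 7), 2))
```

Working in percentage points rather than relative percentages keeps the monthly and weekly comparisons on the same additive scale, although the baselines themselves differ because the number of categories differs.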

Distribution Characteristics by Hour of the Day
Similarly, the hourly figures for TALFDs and deaths during the study period were entered into SPSS as variables, and Z-scores were used to standardize the variables. Ward's method was used to sort the samples into classes, and rescaled distance was used to obtain a hierarchical analysis dendrogram (Fig. 3). As shown in Tab. 3, the clustering results were divided into relatively good, medium, poor, and worse. The numbers of accidents and fatalities were the highest during the four periods of 8:00-9:00, 10:00-11:00, 14:00-15:00, and 18:00-19:00, which accounted, respectively, for 5.74% and 6.35%, 5.74% and 6.61%, 6.46% and 5.90%, and 6.70% and 5.76% of TALFDs and fatalities. The likely reasons are as follows: 8:00-9:00 and 18:00-19:00 are peak travel periods, and the increased travel volume increases safety risks. The high figures for 10:00-11:00 and 14:00-15:00 could be attributable to fatigue setting in around noon.
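The Z-score standardization applied before clustering can be sketched as follows. This is only an illustration: the study standardized the full set of 24 hourly values in SPSS, whereas the sketch uses just the four peak-period shares quoted above, and the use of the sample standard deviation (n − 1 denominator) is an assumption about the software's default.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize values to zero mean and unit sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

# Shares (%) of TALFDs and fatalities for the four peak periods quoted in the text
periods = ["8:00-9:00", "10:00-11:00", "14:00-15:00", "18:00-19:00"]
accident_share = [5.74, 5.74, 6.46, 6.70]
fatality_share = [6.35, 6.61, 5.90, 5.76]

# Each variable is standardized separately, then paired up per period
standardized = list(zip(periods, z_scores(accident_share), z_scores(fatality_share)))
for name, za, zf in standardized:
    print(f"{name}: accidents z={za:+.2f}, fatalities z={zf:+.2f}")
```

Because both variables here are already percentages, standardization mainly equalizes their spread; it matters more in the regional analysis, where counts and rates with very different magnitudes are clustered together.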

Distribution Characteristics by Region
There are 31 administrative regions in mainland China, including 22 provinces, 5 autonomous regions, and 4 municipalities. These regions differ significantly in terms of socioeconomic development, climate, geography, and road transportation infrastructure. This leads to obvious differences in the traffic-safety situation of each region. It is customary to divide mainland China into six regions: Northeast, North, East, Northwest, Southwest, and South Central. Conditions within a given region are usually relatively similar. The Northeast includes Heilongjiang, Jilin, and Liaoning (all provinces); the North includes Beijing, Tianjin, Hebei, Shanxi, and Inner Mongolia (two provinces, two municipalities, one autonomous region); the East includes Shanghai, Jiangsu, Zhejiang, Anhui, Fujian, Jiangxi, and Shandong (six provinces, one municipality); the Northwest includes Shaanxi, Gansu, Qinghai, Ningxia, and Xinjiang (three provinces, two autonomous regions); the Southwest includes Chongqing, Sichuan, Guizhou, Yunnan, and Tibet (three provinces, one municipality, one autonomous region); and South Central includes Henan, Hubei, Hunan, Guangdong, Guangxi, and Hainan (five provinces, one autonomous region).
Using population and vehicle ownership data [32] for each region as of the end of 2018, we calculated the death rate per 10,000 vehicles and per 100,000 inhabitants in TALFDs from 2016 to 2019 (Fig. 4). Fig. 4 shows that the four areas with the most fatalities per 10,000 vehicles in TALFDs were Tibet (0.65), Qinghai ...
The total number of TALFDs and deaths, fatalities per 10,000 vehicles, and fatalities per 100,000 inhabitants were entered into SPSS as variables, and Z-scores were used to standardize the variables. Ward's method was used to sort the samples into classes, and rescaled distance was used to obtain a hierarchical analysis dendrogram (Fig. 5). The clustering results were divided into better, relatively good, medium, poor, and worse, as shown in Tab. ...
The clustering results indicate that Tibet, Qinghai, and Ningxia fell into the "worse" class for TALFDs during the study period because of their high fatality rates per 10,000 vehicles and per 100,000 inhabitants. Meanwhile, Sichuan, Henan, and Yunnan fell into the "poor" class because of their high numbers of traffic accidents and deaths. Sichuan and Yunnan are in Southwest China, and some TALFDs in those areas could be related to specific geographical and climatic conditions. For example, in 2016, the "11.15" severe traffic accident in Zhaotong, Yunnan Province, caused 10 deaths. In 2017, the "3.02" severe traffic accident in Lincang, Yunnan Province, caused 10 deaths and 37 injuries. Then, in 2019, the "1.12" major traffic accident in Liangshan Prefecture, Sichuan Province, caused multiple injuries and deaths. All of those accidents involved vehicles falling off cliffs, which is relatively rare in other regions. Among other regions, Heilongjiang (Northeast), Hebei (North), Shandong (East), Shaanxi (Northwest), Yunnan (Southwest), and Henan (South Central) ranked first in their regions in the number of accidents and deaths. Thus, attention should be paid to accidents in those regions.
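The two normalized rates used in this section are straightforward ratios, and a short sketch may help clarify how they differ from raw counts. The function names and the example region are hypothetical; the input numbers are invented for illustration and do not correspond to any province in the study.

```python
def rate_per(deaths, base, per):
    """Deaths per `per` units of `base` (registered vehicles or inhabitants)."""
    if base <= 0:
        raise ValueError("base must be positive")
    return deaths / base * per

def talfd_rates(deaths, vehicles, population):
    """Return both normalized death rates for one region."""
    return {
        "per_10k_vehicles": rate_per(deaths, vehicles, 10_000),
        "per_100k_inhabitants": rate_per(deaths, population, 100_000),
    }

# Hypothetical region: 50 TALFD deaths, 2 million vehicles, 40 million inhabitants
example = talfd_rates(deaths=50, vehicles=2_000_000, population=40_000_000)
for name, value in example.items():
    print(name, round(value, 3))
```

A region with few vehicles and a small population can therefore rank high on both rates even when its absolute number of deaths is modest, which is consistent with the contrast drawn above between the "worse" and "poor" classes.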

Distribution Characteristics for Different Road Types
The Road Traffic Safety Law of the People's Republic of China broadly divides roads into two categories: highways and urban roads. The standard Roadway Marking Rules and National Highway Numbering (GB/T 917-2017) [33], implemented in 2017, classifies highways into five grades: expressway, first-class highway, second-class highway, third-class highway, and fourth-class highway. Among these, highways other than expressways are called ordinary highways. The present study divided roads into three categories: urban roads, expressways, and ordinary highways.
Figs. 6 and 7 show the number of TALFDs and their fatalities by road type during the study period. Both were the highest on ordinary highways, which accounted for 70.10% of TALFDs and 64.32% of fatalities. TALFDs on expressways accounted for approximately one-quarter of the total, but their death toll exceeded 30% of the total. Urban roads accounted for only approximately 5% of TALFDs and deaths, the lowest share among the three road types. The reasons for these findings are as follows. At the end of 2018, the total mileage of roads in China was 4,846,500 km, with expressways accounting for 142,600 km [34]. The total mileage of ordinary highways was approximately 33 times that of expressways. Moreover, China's territory is vast, and road infrastructure and traffic-safety facilities on ordinary highways vary greatly from place to place. Therefore, ordinary highways, especially mountain highways in the Southwest, involve a higher risk of TALFDs. Expressways account for a larger share of fatalities than of accidents because expressway traffic is dense, driving speeds are high, and multivehicle crashes are prone to occur under severe weather conditions. During the study period, three traffic accidents involving more than 30 deaths occurred across the country: the 2016 "6.26" particularly severe traffic accident on Yifeng Expressway in Chenzhou, Hunan; the 2017 "8.10" particularly severe traffic accident on the Beijing-Kunming Expressway in Ankang, Shaanxi; and the 2019 "9.28" particularly severe traffic accident on the Changchun-Shenzhen Expressway in Wuxi, Jiangsu. All occurred on expressways. On urban roads, TALFDs and their fatalities accounted for only approximately 5% of the total.
Although the traffic volume on urban roads is large, the speed limit is generally low, traffic safety facilities are abundant, and traffic management measures are more stringent than for highways. Therefore, compared to expressways and ordinary highways, the risk of TALFDs is lower on urban roads.

Conclusion
(1) Regarding time distribution, a high number of accidents occurred in July and August: 1.96 and 1.72 percentage points higher, respectively, than the monthly average. The death toll during those months was 2.41 and 3.37 percentage points higher, respectively, than the monthly average. The distribution of accidents by day of the week was relatively uniform. Only Sunday had a higher number of accidents and deaths: 2.46 and 3.31 percentage points higher, respectively, than the weekly average. Regarding the time of day, accidents and fatalities were the highest during the one-hour periods of 8:00-9:00, 10:00-11:00, 14:00-15:00, and 18:00-19:00.
(2) Regarding spatial distribution, Tibet, Qinghai, and Ningxia had the highest death rates per 10,000 vehicles and per 100,000 inhabitants in TALFDs. However, the number of accidents and the number of deaths were the highest in Sichuan, Henan, and Yunnan. In terms of road type, accidents on ordinary highways accounted for approximately 70% of the total while deaths accounted for approximately 64%. Accidents on expressways accounted for approximately one-fourth of the total, but the death toll exceeded 30%. Additionally, urban roads only accounted for approximately 5% of accidents and fatalities.
This study examined the spatiotemporal distribution characteristics of 418 TALFDs in China from 2016 to 2019. The findings can help make accident-prevention work more effective and more scientifically grounded. It should be noted that the study period is short and the data sample is small. Therefore, a long-term follow-up study should be undertaken in the future. In addition, GIS technology has been a popular tool for the visualization of accident data and hotspot analysis in recent years. In future research, analysis methods based on GIS should be employed to describe the spatiotemporal characteristics of TALFDs.