Digital twin modeling and intelligent optimization for rail operation safety assessment

Train energy consumption is currently often estimated with train parameter prediction models built from fixed empirical parameters and traditional algorithms. However, because train-related parameters change over time, the estimates of this method deviate significantly from the actual values and cannot reflect the true state of train operation. This affects the subsequent maintenance and repair of trains and, in turn, the safety of the entire subway operation system. To address these issues, a train traction power flow model based on digital twin technology is proposed. A swarm intelligence optimization algorithm is used to correct the model parameters. By also correcting the deviation of the train passenger load, accurate prediction of train energy consumption can be achieved. The experimental results show that the power flow model constructed in this study can reduce the power error of trains in a single interval by over 11.91%. The average DC voltage error of the train power flow model after parameter correction is 0.19%, and the average DC current error is 12.07%. The root mean square error of passenger capacity for the similar-day neural network model lies in the range [0.0521, 0.0811]. These results indicate that the constructed model can accurately predict train energy consumption and help ensure the safety of the subway operation system.


Introduction
As one of the modern modes of transportation, the subway provides great convenience for human life. The subway system generally runs in a relatively closed environment, either underground or on elevated bridges. Therefore, when a safety accident occurs, it can cause serious damage. The subway operation safety system is a complex, high-tech electromechanical system [1]. In addition to requiring reasonable cooperation between personnel operation and equipment operation, the impact of the operating environment needs to be considered. Among the subway safety indicators, research that starts from the equipment and facilities aspect and constructs train status models to ensure overall subway operation safety has become mature. However, due to factors such as line conditions, signal systems, and manual operations, it is difficult to accurately evaluate the state values of trains with empirical parameters combined with traditional algorithms. In addition, the wear and tear experienced by the train during use gradually becomes severe, and the corresponding resistance coefficient also changes during train operation. As a result, an already constructed model can no longer accurately evaluate the current train state parameters and may become unsuitable for the actual situation. To address these issues, digital twin technology is used to model train power flow deduction. Swarm intelligence optimization algorithms are used to perform fidelity processing on the train model parameters, which solves the problem of error accumulation caused by fixed parameters in the model. Finally, a similar-day deep neural network model is proposed to address the deviation in the calculated operating state caused by using the rated load in the train model, so that a precise value of train passenger capacity can be calculated. This study is divided into four parts. The first part discusses relevant work in recent years. The second part constructs the train-related models using digital twin technology and swarm intelligence optimization algorithms; the deep neural network and the grey comprehensive evaluation method are used to solve the problem of train load deviation. The third part verifies the performance of the constructed train model. The fourth part draws conclusions.

Related works
It is very common to model subway trains to obtain relevant parameter information. Liu et al. used a multi-mass dynamic model combined with an adaptive iterative learning control algorithm to calculate the speed and position of the subway [2]. Cai et al. proposed a model based on the finite element method and vehicle-rail interaction to study the influence of elastic cushion stiffness on vehicle speed [3]. Yang et al. used a mixed geographically weighted regression model to explore the spatially varying relationship between subway independent variables and subway passenger volume [4]. Dawood et al. proposed using continuous operation to simplify image processing technology and machine intelligence systems to automatically detect and quantify soil surface defects in subway networks. The proposed algorithm was tested in subsequent experiments [5]. Wang et al. proposed a model-free adaptive fault-tolerant control scheme to solve the fault-tolerant control of subway trains [6]. In the application of digital twin technology, Siyi constructed a three-step model using digital twin technology. The effectiveness of the model was verified in simulation experiments [7]. Yu et al. proposed a model based on digital twin technology combined with production line scheduling to solve workshop scheduling problems. The model was evaluated in simulation experiments [8]. Chen et al. proposed an automatic painting monitoring and remote operation system based on digital twin technology to solve the problems encountered during the painting of rotating parts [9]. Chen et al. proposed a teacher competency assessment model based on machine learning and digital twin technology to improve teachers' teaching abilities [10]. Xu et al. proposed an intelligent lighting system based on digital twins to enhance the energy-saving effect of museum exhibits [11]. To better evaluate the effectiveness of train circulation, management rules, and operational processes, Jean Christophe et al. used digital twin technology to model railway line areas. Different information sources were analyzed to verify the potential effects of influencing factors [12]. Jostein et al. utilized digital twin technology to analyze new and existing assets in order to maximize production benefits. In actual production, digital twin technology was introduced to maximize profits [13]. Venkat et al. used digital twin technology to inspect and evaluate assets in service. By intervening when asset deterioration is minimal, operational asset management achieves high-precision management [14]. Zheng H et al. proposed an identification and quantification model based on digital twin technology to help companies find new business opportunities. This model was validated in practical applications [15]. In summary, digital twin technology performs well and is widely applied in various industries. However, research on subway train modeling is not yet in-depth enough. Therefore, a subway train power flow prediction model based on digital twin technology and oriented towards rail operation safety is proposed. The particle swarm optimization algorithm, one of the swarm intelligence optimization algorithms, is used to adjust empirical parameters. The grey comprehensive evaluation method and a deep neural network are used to correct the load deviation in the subway power flow model.

Construction of train DC traction power flow model and passenger capacity testing model
Digital twin technology enables mapping in virtual space so that the virtual model and the actual situation can interact. Based on this, a traction power flow model for the train is constructed. A swarm intelligence optimization algorithm is used to correct the empirical parameters, making the predictions of the constructed model more accurate. Finally, the grey comprehensive evaluation method is combined with a deep neural network to construct a similar-day neural network model that resolves the train load deviation.

Construction of metro train system power flow model based on the digital twin theory framework
The safety of rail operation concerns the normal operation of trains and the guarantee of passenger travel, and it is an indispensable part of modern transportation. According to the mode of rail operation, rail transit can be divided into three types: subway, light rail, and tram. The subway plays an important role in transportation. Due to the complexity and diversity of the subway transportation system itself, its safety situation has a huge impact. A small mistake can bring a huge disaster to the operation of the entire subway, and the manpower and material resources required for repair are incalculable. Therefore, it is very important to evaluate the safety of subway traffic. Usually, the indicators for subway operation safety evaluation are analyzed from the aspects of organizational management, personnel operation, equipment and facilities, and operating environment to construct a corresponding safety evaluation indicator system. The subway operation safety evaluation index system is shown in Figure 1.
From Figure 1, various factors affect the safe operation of the subway. Organizational management, personnel management, and the operational environment can be continuously adjusted through human intervention and later planning. However, in terms of equipment and facilities, the status of the train must be continuously clarified to ensure the safety of the entire subway operation. In terms of the power supply system, the system power flow reflects the operating status of the urban rail power supply system, which is one of the conditions for the normal operation of the subway. At present, conventional modeling methods for train power flow prediction often rely on ideal driving strategies, fixed empirical parameters, and a limited discretization of passenger load. A model constructed in this way may deviate from the true values. Digital twin technology can use virtual space to map actual situations. By integrating information and physics, a hybrid-driven data model is constructed that best represents the actual situation. Therefore, digital twin technology is used to model the power flow deduction of the train power supply system. The flowchart of digital twin modeling is shown in Figure 2.
From Figure 2, digital twin modeling is divided into six parts, each with a different function. Data collection covers the collection and organization of sensor data, operation records, and maintenance log data. The main function of feature extraction and selection is to reduce model complexity and improve accuracy. The constructed model needs to be tested and have its parameters adjusted before it can be put into practical use. The focus of this study is the construction of a highly realistic digital twin model for subway trains. Therefore, the emphasis is on precise load input values. Based on the distribution mode of the subway traction power supply system, a feedforward power flow model for the DC-side system is constructed. The modeling of subway DC traction power flow is shown in Figure 3.
From Figure 3, the traction power supply DC system of the subway is independent of the distribution method. The structure of the DC system mainly involves three elements: the traction substation, the traction network, and the train. Among them, the traction substation includes the entire traction substation system, which mainly consists of the diode rectifier group, the medium voltage energy feedback device, and the DC traction network. The diode rectifier group provides the DC supply of the DC traction power supply system through voltage reduction. When conducting power flow calculations, the diode rectifier unit is analyzed as a Thevenin equivalent circuit. The calculation formula involved is shown in (1).
In equation (1), i is the external current at the port of the one-port network. R_oi is the equivalent resistance of the one-port network when all of its independent sources are set to zero. u_oc is the voltage produced at the port by all independent sources within the one-port network. After the power distribution form is converted by the diode rectifier unit, the train traction power supply system can be treated as a typical twelve-pulse rectifier unit main circuit. Its voltage-current output characteristic is divided into two stages. In the first stage, only one rectifier bridge operates at each moment. In the second stage, two rectifier bridges operate at each moment. When the voltage-current output characteristic is in the first stage, the current of the rectifier group is less than the critical current I_dg. The corresponding no-load voltage calculation formula is shown in (2).
In equation (2), U_2L is the effective value of the secondary-side line voltage of the transformer. The output characteristics of the rectifier unit are shown in (3).
In equation (3), X_c is the commutation reactance. The equivalent internal resistance of the commutation reactance in the Thevenin equivalent circuit is 3X_c/π. When the voltage-current output characteristic is in the second stage, the current of the rectifier group is greater than the critical current I_dg. The calculation formula for the output characteristics of the diode rectifier group is shown in (4).
In equation (4), U_d1 is the ideal no-load voltage, with U_d1 = 1.35 U_2L. The medium voltage energy feedback device can recover and reuse the excess energy generated when the traction motor switches to the power generation state. Through a specially customized transformer, the medium voltage energy feedback device can provide an AC voltage with a specific phase angle to the inverter. The common medium voltage energy feedback device is a three-phase voltage source inverter. The schematic diagram of its circuit structure is shown in Figure 4.
From Figure 4, the three-phase voltage source inverter contains six equivalent switching elements. The phase voltage and line voltage of the motor depend on the state of the six power switches on the inverter arms. The working characteristics of the medium voltage energy feedback device depend on the control method. Generally, the three-phase coordinate system is converted into a two-phase rotating coordinate system to achieve dual closed-loop control with an outer voltage loop and an inner current loop. In the two-phase rotating coordinate system, feedforward decoupling is used to remove the coupling between the two axis components. The main function of the DC traction network is to provide the power supply circuit for trains. The contact resistance values on both sides of the train are shown in (5).
In equation (5), r_i represents the equivalent resistance per unit length of the contact network. x_i represents the distance between the train and each end of the power supply section.
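The elements described above can be combined in a small numerical sketch. It assumes, purely for illustration, a single train modelled as a constant-current sink fed from both ends of a supply section, with each substation reduced to a Thevenin source (open-circuit voltage and internal resistance) in series with a contact-line resistance equal to resistance per unit length times distance. The function names, the neglect of rail return resistance and diode blocking, and the numerical values are hypothetical and are not taken from the paper.

```python
def contact_resistance(r_per_km, distance_km):
    """Contact-line resistance between the train and one end of the section."""
    return r_per_km * distance_km


def train_node_voltage(u_oc_a, r_o_a, u_oc_b, r_o_b,
                       r_per_km, x_a_km, x_b_km, i_train):
    """Voltage at a train fed from two Thevenin-equivalent substations.

    Assumptions (illustrative only): the train draws a fixed current,
    rail return resistance is ignored, and both rectifiers conduct.
    """
    # Total series resistance on each side: source resistance + contact line.
    r_a = r_o_a + contact_resistance(r_per_km, x_a_km)
    r_b = r_o_b + contact_resistance(r_per_km, x_b_km)
    g_a, g_b = 1.0 / r_a, 1.0 / r_b

    # Kirchhoff's current law at the train node:
    # (u_oc_a - v) * g_a + (u_oc_b - v) * g_b = i_train
    v = (u_oc_a * g_a + u_oc_b * g_b - i_train) / (g_a + g_b)

    # Current supplied by each substation.
    i_a = (u_oc_a - v) * g_a
    i_b = (u_oc_b - v) * g_b
    return v, i_a, i_b


if __name__ == "__main__":
    # Hypothetical values: 800 V sources, 0.02 ohm internal resistance,
    # 0.02 ohm/km contact line, train 1 km from each end, 1000 A draw.
    v, i_a, i_b = train_node_voltage(800.0, 0.02, 800.0, 0.02,
                                     0.02, 1.0, 1.0, 1000.0)
    print(v, i_a, i_b)
```

A full power flow calculation would iterate this kind of nodal solution with a power-based train load and the two-stage rectifier characteristic; the sketch only shows how the Thevenin source and the distance-dependent contact resistance enter the node equation.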

Parameter correction of traction power flow model and construction of similar neural network model
After a DC traction power flow model for subway trains has been constructed, an accurate load status is needed to deduce the system power flow more precisely and to reflect train energy consumption. However, the collected parameters are usually simulated as preset values, which cannot accurately reflect the actual state of the train. Moreover, empirical parameters are introduced during the calculation process. The gap between these fixed empirical parameters and the actual parameters of the components used in the train grows over time, which increases the error between the prediction model and the actual situation. The data used in high-precision train models should be as close to the true values as possible. Therefore, the fidelity of the train model parameters needs to be ensured. The factors that most affect the train load status are selected for fidelity processing. After consulting relevant materials and comparing relevant parameters [16], the total weight of the train and the unit basic resistance coefficient of the train are selected for correction. The main approach to fidelity processing of train model parameters is interaction between the actual system and the virtual model. Optimization algorithms are used to search autonomously for the optimal solution of the model, minimizing the error between the predicted results of the model and the actual situation. For this purpose, a swarm intelligence optimization algorithm is introduced. Swarm intelligence optimization algorithms are mainly divided into two types: the ant colony algorithm and particle swarm optimization (PSO). PSO converges quickly and places low requirements on the choice of initial values. Therefore, PSO is used for parameter correction of the train model. The PSO schematic is shown in Figure 5.
From Figure 5, PSO requires setting initialization conditions and initializing the population. The fitness of each particle can be calculated only after the value range of the relevant parameters is determined. In terms of data preprocessing, factors such as external signal interference, sensor errors, and noise often cause anomalies in some of the collected data values. Therefore, the abnormal data needs to be preprocessed to improve the efficiency of data mining. The methods involved in data preprocessing include judging and replacing outliers, supplementing missing data, denoising data, and normalizing data. The formula used for judging and replacing outlier data is shown in (6).
In equation (6), MAD is the median of all absolute deviations. median(X) is the median of the observed data sequence X. x_i represents each point in the observation sequence. Data normalization limits the processed data to the range [0, 1], transforming dimensional expressions into dimensionless expressions. The relevant formula is shown in (7).
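The two preprocessing steps just described can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's exact procedure: the outlier threshold of 3 scaled MADs, the consistency factor 1.4826, and the choice to replace outliers with the median are common conventions assumed here, and the function names are hypothetical.

```python
from statistics import median


def replace_outliers_mad(values, threshold=3.0):
    """Replace points far from the median (in scaled-MAD units) by the median.

    MAD = median(|x_i - median(X)|). The factor 1.4826 and the threshold
    of 3 are assumed conventions, not values from the source.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return list(values)  # no spread: nothing can be flagged
    limit = threshold * 1.4826 * mad
    return [med if abs(v - med) > limit else v for v in values]


def min_max_normalize(values):
    """Scale values linearly into [0, 1], removing their physical dimension."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


if __name__ == "__main__":
    raw = [10, 11, 9, 10, 100, 10, 11]   # hypothetical current samples
    cleaned = replace_outliers_mad(raw)  # the spike at 100 is replaced
    print(cleaned)
    print(min_max_normalize(cleaned))
```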
After data preprocessing, the optimal position of each individual is calculated. In the PSO algorithm, particles represent individuals. The formula for selecting and updating the optimal position of particles is shown in (8).
In equation (8), x_id represents the position information of a single particle. p_id is the optimal position found during the iteration process. p_best is the optimal position. R is the optimal position searched by the population from a certain moment to the current iteration. g_d limits the feasible solution space of particles. Usually, two methods are used to balance the exploration and exploitation capabilities of the algorithm: boundary absorption and boundary rebound. The function expression for boundary absorption is shown in (9).
The function expression for boundary rebound is shown in (10).
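The search procedure and the two boundary strategies can be illustrated with a compact PSO sketch. It uses the standard velocity and position update with an inertia weight; the weight `w`, the acceleration factors `c1` and `c2`, the population size, and the quadratic toy fitness are assumptions for illustration and do not reproduce the paper's train model or its exact boundary formulas. Here absorption clamps a particle to the violated bound and zeroes its velocity, while rebound reflects it back inside and reverses its velocity.

```python
import random


def absorb(x, v, lo, hi):
    """Boundary absorption: stick to the violated bound, zero the velocity."""
    if x < lo:
        return lo, 0.0
    if x > hi:
        return hi, 0.0
    return x, v


def rebound(x, v, lo, hi):
    """Boundary rebound: reflect back into range, reverse the velocity."""
    if x < lo:
        return min(hi, 2 * lo - x), -v
    if x > hi:
        return max(lo, 2 * hi - x), -v
    return x, v


def pso(fitness, bounds, n_particles=30, n_iter=50,
        w=0.7, c1=1.5, c2=1.5, handle=absorb, seed=0):
    """Minimize `fitness` over box `bounds` with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                # personal best positions
    pbest_val = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda k: pbest_val[k])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # population best

    for _ in range(n_iter):
        for k in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[k][d] = (w * vel[k][d]
                             + c1 * r1 * (pbest[k][d] - pos[k][d])
                             + c2 * r2 * (gbest[d] - pos[k][d]))
                # Move, then apply the chosen boundary strategy.
                pos[k][d], vel[k][d] = handle(pos[k][d] + vel[k][d],
                                              vel[k][d], lo, hi)
            val = fitness(pos[k])
            if val < pbest_val[k]:
                pbest[k], pbest_val[k] = pos[k][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[k][:], val
    return gbest, gbest_val


if __name__ == "__main__":
    # Toy stand-in for the model error: distance to a hypothetical
    # (weight coefficient, resistance coefficient) pair.
    target = (0.5, 0.002)
    err = lambda p: (p[0] - target[0]) ** 2 + (p[1] - target[1]) ** 2
    print(pso(err, [(0.0, 1.0), (0.0, 0.01)], handle=rebound))
```

In a parameter-correction setting, the fitness function would instead run the power flow model with the candidate parameters and return the error against measured data.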
In equation (10), v_id represents the flight speed of the particle. After PSO performs fidelity processing on the input total train weight and unit basic resistance coefficient, the train energy consumption can be predicted with the constructed train power flow model. However, because vehicle weight affects the results during testing, the load of the constructed model also has to be adjusted. The weight of a train depends not only on the weight of the train itself but also on its passenger load. By establishing a mapping between the factors that affect train load and the train load itself, the corresponding passenger load can be obtained, and from it the accurate total weight of the train. The selected influencing factors include time and weather. Considering the combined impact of these factors on train passenger load, a train passenger load modeling method based on similar days is adopted. The grey correlation method is used to select days similar to the test day as training data, which reduces redundant information in the model. By combining the selected similar days with a Deep Neural Network (DNN) model, a Similar Day-Deep Neural Network (SD-DNN) is constructed to mine more hidden features and improve the computational accuracy and efficiency of the model.
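Similar-day selection by grey correlation can be sketched with the standard grey relational grade. The example below is an assumption-laden illustration: the feature encoding (for example day type, temperature, and weather scaled to [0, 1]), the distinguishing coefficient `rho = 0.5`, and the function names are hypothetical and not specified in the source.

```python
def grey_relational_grades(reference, candidates, rho=0.5):
    """Grey relational grade of each candidate day against the reference day.

    Features are assumed to be normalized to comparable scales beforehand.
    """
    # Absolute differences between the reference and each candidate.
    deltas = [[abs(r - c) for r, c in zip(reference, cand)]
              for cand in candidates]
    flat = [d for row in deltas for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0:
        return [1.0] * len(candidates)  # all candidates equal the reference
    # Relational coefficient per feature, averaged into one grade per day.
    return [sum((d_min + rho * d_max) / (d + rho * d_max) for d in row)
            / len(row)
            for row in deltas]


def select_similar_days(reference, candidates, top_k=2, rho=0.5):
    """Indices of the top_k candidates with the highest grade, plus all grades."""
    grades = grey_relational_grades(reference, candidates, rho)
    order = sorted(range(len(candidates)),
                   key=lambda i: grades[i], reverse=True)
    return order[:top_k], grades


if __name__ == "__main__":
    test_day = [0.2, 0.5, 1.0]        # hypothetical normalized features
    history = [[0.2, 0.5, 1.0],
               [0.9, 0.1, 0.0],
               [0.25, 0.45, 0.9]]
    print(select_similar_days(test_day, history))
```

The days returned by `select_similar_days` would then form the training set of the neural network, so that the network sees only days whose conditions resemble the day being predicted.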

Result analysis of system power flow deduction model and similar deep neural network model
Taking the Beijing Metro Line 2 train as the experimental object, the corresponding data is collected using data sensors. The data types include pedestrian flow, train DC current, and DC voltage. The four preprocessing methods mentioned above are applied to the collected data. The processing results are shown in Figure 6. The sampling period of the actual data is set, and the step size used to calculate the train power is determined from it. The initial population of the PSO algorithm is 200. The iterative results of the algorithm are obtained by searching the four-dimensional space. The iteration results indicate that the algorithm searches quickly within the first 10 iterations, during which the PSO convergence value decreases from 4.0 to 2.98. Once the number of iterations reaches 25, the PSO fitness no longer changes, indicating that training is complete. The number of iterations is therefore set to 30, and model training stops when this number is reached. The convergence of PSO particles in the total train weight correction experiment is shown in Figure 7.
From Figure 7, the particles of the initial population are the most widely dispersed. After 10 iterations, the total weight coefficient of the train begins to stabilize, with values fluctuating within [0.18, 0.74]. After 30 iterations, the total weight coefficient of the train has largely stabilized, fluctuating around 0.5. The comparison between the measured power of a train in a single section and the estimated power after correcting the parameters is shown in Figure 8.
From Figure 8, the train system power flow model without parameter correction estimates a train power higher than the measured power, with a maximum of 0.98 MW. The relative error range is 10%. As time goes on, the train power predicted by the uncorrected model gradually aligns with the measured power. However, there is still a significant difference between the predicted and measured results of the model in the first 150 s. With the corrected-parameter model, the predicted train power curve fits the measured curve well. The difference between the predicted and true values of current and voltage before and after parameter correction is shown in Figure 9.
From Figure 9, the current and voltage predicted by the model before parameter correction do not fit the actual current and voltage curves well. In Figure 9a, there is a significant discrepancy between the current predicted by the model with uncorrected parameters and the actual current. The maximum relative error is 10%, and the numerical fluctuations show no clear pattern. In terms of voltage prediction, Figure 9b shows that the model with uncorrected parameters still has a significant error between the predicted voltage and the actual value. The maximum relative error is 0.5%. The overall fluctuation range is slightly smaller than that of the current prediction curve. The current and voltage curves predicted by the model with corrected parameters are well aligned with the actual curves. When a fixed-parameter model is used to estimate the various indicators of trains, the results differ significantly from the true values. To further clarify the error range before and after parameter correction, the parameter correction results of some train sections and the train power error under their influence are verified. The experimental results are shown in Figure 10.
From Figure 10a, the resistance coefficient of a single train varies after adjusting for different interval parameters. In Figure 10b, as the weight of the train section increases, the power error of the model with uncorrected parameters gradually increases. When the train interval is 30, the power error can reach 37.98%. After adjusting the parameters, the power error of the model decreased from 9.42% to 3.61%. The estimation of the actual train load by the SD-DNN model is then validated. The obtained results are compared with those of the DNN and with the actual situation, as shown in Figure 11.
From Figure 11, compared to the DNN, the SD-DNN model fits the actual train passenger load curve better. On weekdays, the SD-DNN model predicts two peaks of train passenger load, at 8 am and 7 pm. From 13:00 to 15:00, the passenger load of the train is relatively low, which is in line with typical working-day travel. In comparison, the peak load predicted by the DNN is smaller than both the actual load and the load predicted by SD-DNN, while its off-peak prediction is higher than both. This may be because the DNN-based train passenger load estimation model cannot extract specific features from massive data and therefore cannot produce results close to the actual situation. The SD-DNN model selects training sets whose features are similar to those of the predicted day, so it learns changes in train load status more accurately and its results are closer to the actual ones. The SD-DNN model is also used to predict the train load during holidays, and the DNN model is compared with the actual train load at the same time. The experimental results are shown in Figure 12.
From Figure 12, the DNN-based model gives relatively fixed results for the passenger load estimate of different train sections. The passenger loads predicted by the DNN model in Figures 12a and 12b are similar. The passenger load during small peak hours is about 100 tons. Compared with the actual load, there is still a certain gap in the passenger load prediction of the DNN model: the actual load shows no significant small peak, and the distribution of passenger load during holidays is relatively uniform. The prediction results of the SD-DNN model constructed in this study are in good agreement with the actual load curve. The passenger load prediction results of the constructed train model during three consecutive time periods, namely holidays, working days, and weekends, are compared. The mean absolute percentage error (MAPE) and root mean square error (RMSE) are used as indicators of the difference between the SD-DNN model prediction and the actual load. The DNN model is compared as well. The experimental results are shown in Table 1.
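For reference, the two indicators can be computed as in the following sketch, which uses their standard definitions. The load values are hypothetical, and whether the paper evaluates RMSE on normalized or raw load is not stated, so the scale of the example output should not be compared with the reported figures.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a)
                       for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error, in the same unit as the inputs."""
    return math.sqrt(sum((a - p) ** 2
                         for a, p in zip(actual, predicted)) / len(actual))


if __name__ == "__main__":
    actual_load = [100.0, 200.0, 400.0]      # hypothetical passenger loads
    predicted_load = [110.0, 180.0, 400.0]
    print(mape(actual_load, predicted_load))
    print(rmse(actual_load, predicted_load))
```

MAPE expresses the error relative to the actual load, which makes days with different traffic levels comparable, while RMSE penalizes large individual deviations more strongly.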
From Table 1, the prediction accuracy of the SD-DNN model is higher than that of the DNN model. In terms of MAPE, the SD-DNN model improves on the DNN model by over 9.57%. The corresponding RMSE improvement is 0.0412. The MAPE of the SD-DNN model on Saturday and Sunday is 11.72% and 14.65%, respectively, an increase of 2.93 percentage points from Saturday to Sunday. The MAPE for holidays is 12.88%. The average MAPE for the four working days is 12.36%. The MAPE on weekdays is relatively stable, whereas there are significant fluctuations on Saturday and Sunday. The RMSE value of the SD-DNN model fluctuates in the range of [0.05, 0.08]. The RMSE value for weekdays is within the range of [0.06, 0.07]. The RMSE values of the SD-DNN model are stable and small, indicating good model performance. Compared with the results obtained from the DNN model, the SD-DNN model is an improvement.

Conclusion
To better ensure subway traffic safety, a power flow deduction model of the subway power supply system based on digital twin technology is constructed, starting from the equipment and facilities aspect of the subway operation safety indicators. The particle swarm optimization algorithm is used to correct the model parameters. The grey comprehensive evaluation method is used to select similar days. The SD-DNN model is constructed to evaluate the passenger load of trains. Compared with the predictions of fixed-parameter models, the model with corrected parameters constructed in this study has higher estimation accuracy. Its predicted curve fits the actual parameter curve well, which can reduce the error of the conventional model by more than 10%. By correcting the parameters of a single train and of the trains on the entire line, the power error of a train in a single section is reduced by 11.91%. The power error of trains in 20 sections is reduced to 4.12%, an improvement of 22.28% over the model with fixed parameters. Similarly, the reduction in power error for 30 intervals is 34.37%. The average DC voltage error of the corrected-parameter model for the entire train line is 0.19%. The average DC current error is 12.07%. For passenger capacity, the curve predicted by the SD-DNN model fits the actual load curve well. The MAPE values for passenger capacity in different time periods lie between 11.72% and 14.65%. The range of RMSE values is [0.0521, 0.0811]. The constructed model therefore performs well. However, the study only considers the safety of subway operation from the perspective of equipment and facilities. Other factors that affect the safe operation of the subway have not been considered. Therefore, there is still room for improvement.

Fig. 7. Convergence of particles during the correction process of total train weight.

Fig. 10. Parameter correction results of some sections of the train and the train power error under their influence.