Great Western Coffee Shop

Journey by Journey => Wales local journeys => Topic started by: SandTEngineer on February 21, 2018, 17:40:06



Title: Cambrian ERTMS - Loss of TSR Data (20/10/2017)
Post by: SandTEngineer on February 21, 2018, 17:40:06
To a signal engineer this looks extremely serious, and I'm not surprised RAIB has decided in the circumstances to undertake an investigation.

Welcome to the Digital Railway.....

Quote
Cambrian line
Investigation into the loss of speed restriction data to trains on the Cambrian line, 20 October 2017.

Published 21 February 2018
From:
Rail Accident Investigation Branch

During the morning of Friday 20 October 2017, a train driver travelling on the Cambrian coast line in North Wales reported that long standing temporary speed restrictions were not indicated on their in-cab display. As signalling staff at the control centre in Machynlleth investigated this report, they became aware that this failure applied to several trains under their control. The temporary speed restrictions were required on the approach to level crossings so that people crossing the line had sufficient warning of an approaching train.

The Cambrian lines were equipped in 2011 with a pilot installation of the European Rail Traffic Management System (ERTMS), a form of railway signalling. ERTMS removes the need for signals along the track by transmitting data directly to the train. This data is used to display movement authorities and other information such as temporary and permanent speed restrictions, on a screen in front of the driver.

Subsequent investigation found that the signalling system stopped transmitting temporary speed restriction data after a routine shutdown and restart at around 23:10 hrs the previous evening. The signallers had no indication of an abnormal condition and signalling control centre displays showed these restrictions as being applied correctly.

The RAIB has decided to undertake an independent investigation because to date, the signalling system supplier has not identified the cause of the failure. It is possible that finding the cause would have been assisted by downloading of suitable data from the signalling system before it was restarted during correction of the failure.

An additional procedure, since introduced at the control centre, is intended to identify and avoid any recurrence of the failure.

The RAIB investigation will consider:

- the geographic extent of the failure and the effect it had on the safety of railway operations
- why trains were permitted to operate without information about temporary speed restrictions
- practices for the gathering of data needed for investigation before restarting computer based signalling systems after a potentially unsafe failure.


Title: Re: Cambrian ERTMS - Loss of TSR Data (20/10/2017)
Post by: stuving on February 21, 2018, 18:06:29
Quote
To a signal engineer this looks extremely serious, and I'm not surprised RAIB has decided in the circumstances to undertake an investigation.

Welcome to the Digital Railway.....

Indeed. We all know software has design faults, and I've observed corrosion, but it's not meant to suffer from gales, vandalism, attacks by livestock, or whatever else removes lineside signs.

Now, who are these "suppliers"? RAIB should tell us when they report, but all I can find is that Systra claim to have done systems integration (on- and off-board). That might or might not involve writing "glue" software, but I'm sure all the boxes were made by other companies.


Title: Re: Cambrian ERTMS - Loss of TSR Data (20/10/2017)
Post by: grahame on February 21, 2018, 18:15:28
Quote
The RAIB has decided to undertake an independent investigation because to date, the signalling system supplier has not identified the cause of the failure. It is possible that finding the cause would have been assisted by downloading of suitable data from the signalling system before it was restarted during correction of the failure.

I wonder when the RAIB were informed and whether "to date" means up to the date of the new report - 4 months.

It may sound like a not very dangerous circumstance - a long standing alert that all the drivers must have received multiple times in a day for months and all knew about - but it was a wrong side failure of the signalling system ...


Title: Re: Cambrian ERTMS - Loss of TSR Data (20/10/2017)
Post by: ellendune on February 21, 2018, 18:56:59
Quote
It may sound like a not very dangerous circumstance - a long standing alert that all the drivers must have received multiple times in a day for months and all knew about - but it was a wrong side failure of the signalling system ...

The driver noticed the long-standing one. But what about any recent ones the driver did not know about, if there were any?


Title: Re: Cambrian ERTMS - Loss of TSR Data (20/10/2017)
Post by: SandTEngineer on February 21, 2018, 19:05:29
....from the careful wording by RAIB (as always, not making any assumptions) I have a hunch that more than TSR data may have been involved.  I hope it wasn't signalling data....


Title: Re: Cambrian ERTMS - Loss of TSR Data (20/10/2017)
Post by: Trowres on February 21, 2018, 21:07:26
Why would ERTMS have required a "routine shutdown and restart"?


Title: Re: Cambrian ERTMS - Loss of TSR Data (20/10/2017)
Post by: welshman on February 23, 2018, 08:16:21
Windows Update obviously.   ;D ;D


Title: Re: Cambrian ERTMS - Loss of TSR Data (20/10/2017)
Post by: GBM on February 23, 2018, 08:32:41
Quote
Why would ERTMS have required a "routine shutdown and restart"?

Also frequently used on First Kernow buses!


Title: Re: Cambrian ERTMS - Loss of TSR Data (20/10/2017)
Post by: Trowres on October 21, 2018, 21:57:41
RAIB has now published an interim report:
https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/749430/IR012018_181018_Cambrian_TSRs.pdf

Quote
Just after 23:00 hrs and near the end of passenger service on 19 October 2017, a software reset occurred in the computer based signalling system controlling the Cambrian lines and located at Machynlleth signalling control centre. This automatic reset, known as a rollover, was triggered when the equipment on-board a train at Machynlleth station, automatically requested a movement authority already allocated to another train.

Such software rollovers occur between 10 and 12 times each year, and the signalling staff at the control centre followed their established processes for returning to normal service. It was necessary to stop movement authorities being given to the three trains within the area controlled by the signalling system during the rollover but normal working resumed around 23:19 hrs. The three trains continued to their respective destinations after a short delay. These were the last trains of the day.

During a rollover, the signalling system at Machynlleth takes information from a support computer system (the GEST server described at paragraph 25). This should include information about temporary speed restrictions. However, during the rollover on the 19 October 2017, the data relating to temporary speed restrictions between Dovey Junction and Pwllheli failed to reload from the support system to the signalling control system. Staff working in the signalling control centre were unaware of this when they subsequently permitted trains to begin operating again.

The remainder of the report discusses at some length the ongoing attempts to discover why the GEST server repeatedly failed to provide details of the TSRs to the signalling system.
What it doesn't discuss is the so-called "rollover". Why would a train's on-board equipment automatically request "a movement authority already allocated to another train"? Is this part of the ERTMS spec? And why, if this causes a system reboot that halts trains under the control of the Machynlleth system for about 15 minutes, has it been tolerated "between 10 and 12 times each year"?
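
For illustration only, here is a minimal Python sketch of the kind of check that could catch a failed reload after a restart: compare the restrictions the support system says should be active against what the signalling system actually holds, and refuse to resume if any are missing. The record layout and function names are my own assumptions, not anything taken from the report or from the real GEST/RBC software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeedRestriction:
    """Hypothetical record for one temporary speed restriction."""
    start_km: float
    end_km: float
    limit_kmh: int


def missing_after_restart(expected, loaded):
    """Return restrictions that should be active but were not reloaded."""
    return sorted(set(expected) - set(loaded), key=lambda r: r.start_km)


def safe_to_resume(expected, loaded):
    """True only if every expected restriction is present after the restart."""
    return not missing_after_restart(expected, loaded)
```

The point of the sketch is that the comparison is made against an independent record of what should be there, rather than trusting the control centre display alone.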


Title: Re: Cambrian ERTMS - Loss of TSR Data (20/10/2017)
Post by: eightf48544 on October 22, 2018, 18:25:27
I agree with you: it seems very odd that a train can request an already-issued movement authority.

I also agree with your point that the RAIB don't seem to have queried why this happens around once a month and requires a rollover which stops the trains for 15 minutes. Do they just come to a halt anywhere on the line? That is not necessarily a good idea given some of the gradients.

Can the driver override it (on instruction)?



Title: Re: Cambrian ERTMS - Loss of TSR Data (20/10/2017)
Post by: stuving on October 22, 2018, 23:53:43
Before asking those questions, I'd want to know more about how ERTMS (aka ETCS) works - such as what constitutes the same movement authority? If a train hasn't been issued with it, how can it request it, rather than a different one for the same track section? I wonder if this is because the MA is already pre-allocated as part of a route.

But of course ERTMS is complicated stuff - no, more complicated than that!

I can see that this is a cobbled-together experimental system only fit for a lightly-used line, so its shortcomings may have been acceptable. But I am puzzled that it (the GEST) was based on an existing system in use in Spain, yet Ansaldo have design data for it.

Remember, this is an interim report.


Title: Re: Cambrian ERTMS - Loss of TSR Data (20/10/2017)
Post by: grahame on December 20, 2019, 01:45:05
Report published yesterday ... summary from the BBC (https://www.bbc.co.uk/news/uk-wales-50857909) - not sure if this is the same incident, related but different, or disconnected and at about the same time

Quote
A passenger train approached a level crossing at nearly three times the speed limit because of computer problems, a report said.

The train was travelling at 80km/h (50mph) in Gwynedd, breaking the limit of 30km/h (19mph), which aims to give people time to cross safely.

No accidents were caused due to the speed between Barmouth and Llanaber.

But the chief inspector of railways said the industry had to learn "important lessons".

Simon French said nobody running the railway at the time had any idea anything was "amiss".

The Rail Accident Investigation Branch (RAIB) report into the incident made five recommendations, including improving computer safety measures.

It found information about temporary speed limits failed to be transmitted to four trains using the line on the morning of 20 October 2017.

It is fortunate there was no accident. This serves to stress the importance of signalling data being complete and correct - and how the completeness and correctness of that data must be in a different league to the correctness (sometimes lack thereof) of data we see on passenger information systems.


Title: Re: Cambrian ERTMS - Loss of TSR Data (20/10/2017)
Post by: SandTEngineer on December 20, 2019, 10:29:30
It's the same incident, Grahame. We will never know the full truth of what happened, as it seems the data logger was reset as part of the incident and all recorded data was lost. Interesting that RAIB had to call in the Norwegian equivalent to provide technical expertise.

Full report here: https://www.gov.uk/government/news/report-172019-loss-of-safety-critical-signalling-data-on-the-cambrian-coast-line?utm_source=9bee70f2-baf6-40de-80ac-7578bc4febdb&utm_medium=email&utm_campaign=govuk-notifications&utm_content=immediate

As I said in my original post that started this topic way back in February 2018, 'Welcome to the Digital Railway' ::)


Title: Re: Cambrian ERTMS - Loss of TSR Data (20/10/2017)
Post by: Celestial on December 20, 2019, 10:41:20
One thing I find slightly surprising is that the temporary speed restriction had been in place for 3 years because of level crossing sighting. I wonder at what point a TSR becomes a PSR, and is there any plan to address the issue.

On that note, it also struck me that there were several of these reducing the speed to 19mph. I would imagine the total effect of these is considerable in terms of increasing journey times, and thus making the overall journey less attractive, and probably less reliable. 


Title: Re: Cambrian ERTMS - Loss of TSR Data (20/10/2017)
Post by: SandTEngineer on December 20, 2019, 10:49:02
Quote
One thing I find slightly surprising is that the temporary speed restriction had been in place for 3 years because of level crossing sighting. I wonder at what point a TSR becomes a PSR, and is there any plan to address the issue.

That's a very good question. Probably the problem here is that the Cambrian ERTMS system is quite bespoke and there is little, if any, technical support for getting software changes made. Just my personal thoughts.


Title: Re: Cambrian ERTMS - Loss of TSR Data (20/10/2017)
Post by: eXPassenger on December 20, 2019, 15:35:36
Quote
That's a very good question. Probably the problem here is that the Cambrian ERTMS system is quite bespoke and there is little, if any, technical support for getting software changes made. Just my personal thoughts.

Surely a change from a TSR to a PSR would be a change in the operational data.  It would be horrendous programming practice to hard code this type of information.


Title: Re: Cambrian ERTMS - Loss of TSR Data (20/10/2017)
Post by: SandTEngineer on December 20, 2019, 15:39:43
Quote
Surely a change from a TSR to a PSR would be a change in the operational data. It would be horrendous programming practice to hard code this type of information.

I agree that imposing a TSR and removing same would be 'Operational Data' but making it permanent would require the system software to be permanently altered (well, that's my limited understanding of the capabilities of the ERTMS system).


Title: Re: Cambrian ERTMS - Loss of TSR Data (20/10/2017)
Post by: eXPassenger on December 20, 2019, 15:53:23
Quote
I agree that imposing a TSR and removing same would be 'Operational Data' but making it permanent would require the system software to be permanently altered (well, that's my limited understanding of the capabilities of the ERTMS system).

As a retired IT Director that horrifies me.  All the operational data should be in a database with strict controls and auditing of changes.  The programs should then apply that data.  By analogy your Sat Nav system is not re-issued when a speed limit changes, there is just an updated map file.
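
To make the data-versus-program point concrete, here is a rough Python sketch under my own assumptions: the record fields, the store class and the user names are all hypothetical and have nothing to do with how the Cambrian equipment is actually built. Restrictions live in a data store, every change is written to an audit log, and the code that works out the speed limit just applies whatever data is there. Making a restriction permanent is then a data change, not a software change.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Restriction:
    """Hypothetical speed restriction record."""
    start_km: float
    end_km: float
    limit_kmh: int
    permanent: bool = False


@dataclass
class RestrictionStore:
    """In-memory stand-in for a controlled, audited database."""
    restrictions: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)

    def _record(self, user, action, key):
        # Append-only log: who changed what, and when.
        self.audit_log.append((datetime.now(timezone.utc), user, action, key))

    def add(self, key, restriction, user):
        self.restrictions[key] = restriction
        self._record(user, "add", key)

    def make_permanent(self, key, user):
        self.restrictions[key].permanent = True
        self._record(user, "make_permanent", key)

    def limit_at(self, km, line_speed_kmh):
        """Lowest limit applying at this position, given the line speed."""
        limits = [r.limit_kmh for r in self.restrictions.values()
                  if r.start_km <= km <= r.end_km]
        return min([line_speed_kmh, *limits])
```

A safety-critical system would obviously need far more than this (validation, independent checking, controlled release of the data), but the separation of data from program is the same idea.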


Title: Re: Cambrian ERTMS - Loss of TSR Data (20/10/2017)
Post by: Worcester_Passenger on December 20, 2019, 18:26:17

Quote
As a retired IT Director that horrifies me.  All the operational data should be in a database with strict controls and auditing of changes.  The programs should then apply that data.  By analogy your Sat Nav system is not re-issued when a speed limit changes, there is just an updated map file.

Agree completely.


Title: Re: Cambrian ERTMS - Loss of TSR Data (20/10/2017)
Post by: stuving on December 20, 2019, 18:52:25
Quote
As a retired IT Director that horrifies me.  All the operational data should be in a database with strict controls and auditing of changes.  The programs should then apply that data.  By analogy your Sat Nav system is not re-issued when a speed limit changes, there is just an updated map file.

Agree completely.

What the report describes for PSRs is that they are kept in non-volatile memory, but it doesn't say what process is involved in changing that. It could be similar to flashing the BIOS in a PC - take RBC out of service, load loader and data, burn into NVM, restart RBC processor and run read-back mode to check data, restart in service - something like that.
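
As a sketch of the "burn, then read back to check" step I'm guessing at, here is a toy version in Python. The classes are stand-ins I have made up (a real RBC would write through hardware and use its own integrity checks); the SHA-256 comparison just illustrates verifying that what was read back matches what was meant to be written.

```python
import hashlib


class FakeNvm:
    """Stand-in for non-volatile memory (hypothetical, in-memory only)."""

    def __init__(self):
        self._cells = b""

    def write(self, data: bytes) -> None:
        self._cells = bytes(data)

    def read(self) -> bytes:
        return self._cells


class FaultyNvm(FakeNvm):
    """Drops the last byte on write, to show the check failing."""

    def write(self, data: bytes) -> None:
        super().write(data[:-1])


def burn_and_verify(nvm, data: bytes) -> bool:
    """Write data, read it back, and compare digests of intended vs stored."""
    expected = hashlib.sha256(data).hexdigest()
    nvm.write(data)
    actual = hashlib.sha256(nvm.read()).hexdigest()
    return actual == expected
```

With this, burn_and_verify(FakeNvm(), data) succeeds and burn_and_verify(FaultyNvm(), data) fails for any non-empty data, which is the signal not to put the unit back into service.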

For the first user of the equipment, on the LGV to Alsace, TSRs were also held in NVM, because they were managed by track maintenance staff not signallers. The report does not say whether this was the same memory as for PSRs, or another one, nor whether there were different change procedures. I imagine the work methods for this task will be those familiar to signalling technicians, which might not be how IT would approach it. 

 



This page is printed from the "Coffee Shop" forum at http://gwr.passenger.chat which is provided by a customer of Great Western Railway. Views expressed are those of the individual posters concerned. Visit www.gwr.com for the official Great Western Railway website. Please contact the administrators of this site if you feel that content provided contravenes our posting rules ( see http://railcustomer.info/1761 ). The forum is hosted by Well House Consultants - http://www.wellho.net