Paris, 19 July 1996
ARIANE 5
Flight 501 Failure
Report by the Inquiry Board
The Chairman of the Board :
Prof. J. L. LIONS
On 4 June 1996, the maiden flight of the Ariane 5 launcher ended in a failure. Only about 40 seconds after initiation of the flight sequence, at an altitude of about 3700 m, the launcher veered off its flight path, broke up and exploded. Engineers from the Ariane 5 project teams of CNES and Industry immediately started to investigate the failure. Over the following days, the Director General of ESA and the Chairman of CNES set up an independent Inquiry Board and nominated the following members :
- Prof. Jacques-Louis Lions (Chairman), Académie des Sciences (France)
- Dr. Lennart Lübeck (Vice-Chairman), Swedish Space Corporation (Sweden)
- Mr. Jean-Luc Fauquembergue, Délégation Générale pour l'Armement (France)
- Mr. Gilles Kahn, Institut National de Recherche en Informatique et en Automatique (INRIA) (France)
- Prof. Dr. Ing. Wolfgang Kubbat, Technical University of Darmstadt (Germany)
- Dr. Ing. Stefan Levedag, Daimler Benz Aerospace (Germany)
- Dr. Ing. Leonardo Mazzini, Alenia Spazio (Italy)
- Mr. Didier Merle, Thomson CSF (France)
- Dr. Colin O'Halloran, Defence Evaluation and Research Agency (DERA) (U.K.)
The terms of reference assigned to the Board requested it
- to determine the causes of the launch failure,
- to investigate whether the qualification tests and acceptance tests were appropriate in relation to the problem encountered,
- to recommend corrective action to remove the causes of the anomaly and other possible weaknesses of the systems found to be at fault.
The Board started its work on 13 June 1996. It was assisted by a Technical Advisory Committee composed of :
- Dr Mauro Balduccini (BPD)
- Mr Yvan Choquer (Matra Marconi Space)
- Mr Remy Hergott (CNES)
- Mr Bernard Humbert (Aerospatiale)
- Mr Eric Lefort (ESA)
In accordance with its terms of reference, the Board concentrated its investigations on the causes of the failure, the systems supposed to be responsible, any failures of similar nature in similar systems, and events that could be linked to the accident. Consequently, the recommendations made by the Board are limited to the areas examined. The report contains the analysis of the failure, the Board's conclusions and its recommendations for corrective measures, most of which should be undertaken before the next flight of Ariane 5. There is in addition a report for restricted circulation in which the Board's findings are documented in greater technical detail. Although it consulted the telemetry data recorded during the flight, the Board has not undertaken an evaluation of those data. Nor has it made a complete review of the whole launcher and all its systems.
This report is the result of a collective effort by the Commission, assisted by the members of the Technical Advisory Committee.
We have all worked hard to present a very precise explanation of the reasons for the failure and to make a contribution towards the improvement of Ariane 5 software. This improvement is necessary to ensure the success of the programme.
The Board's findings are based on thorough and open presentations from the Ariane 5 project teams, and on documentation which has demonstrated the high quality of the Ariane 5 programme as regards engineering work in general and completeness and traceability of documents.
Chairman of the Board
On the basis of the documentation made available and the information presented to the Board, the following has been observed:
The weather at the launch site at Kourou on the morning of 4 June 1996 was acceptable for a launch that day, and presented no obstacle to the transfer of the launcher to the launch pad. In particular, there was no risk of lightning since the strength of the electric field measured at the launch site was negligible. The only uncertainty concerned fulfilment of the visibility criteria.
The countdown, which also comprises the filling of the core stage, went smoothly until H0-7 minutes when the launch was put on hold since the visibility criteria were not met at the opening of the launch window (08h35 local time). Visibility conditions improved as forecast and the launch was initiated at H0 = 09h 33mn 59s local time (= 12h 33mn 59s UT). Ignition of the Vulcain engine and the two solid boosters was nominal, as was lift-off. The vehicle performed a nominal flight until approximately H0 + 37 seconds. Shortly after that time, it suddenly veered off its flight path, broke up, and exploded. A preliminary investigation of flight data showed:
The origin of the failure was thus rapidly narrowed down to the flight control system and more particularly to the Inertial Reference Systems, which obviously ceased to function almost simultaneously at around H0 + 36.7 seconds.
The information available on the launch includes:
- telemetry data received on the ground until H0 + 42 seconds
- trajectory data from radar stations
- optical observations (IR camera, films)
- inspection of recovered material.
The whole of the telemetry data received in Kourou was transferred to CNES/Toulouse where the data were converted into parameter-over-time plots. CNES provided a copy of the data to Aerospatiale, which carried out analyses concentrating mainly on the data concerning the electrical system.
The self-destruction of the launcher occurred near to the launch pad, at an altitude of approximately 4000 m. Therefore, all the launcher debris fell back onto the ground, scattered over an area of approximately 12 km² east of the launch pad. Recovery of material proved difficult, however, since this area is nearly all mangrove swamp or savanna.
Nevertheless, it was possible to retrieve from the debris the two Inertial Reference Systems. Of particular interest was the one which had worked in active mode and stopped functioning last, and for which, therefore, certain information was not available in the telemetry data (provision for transmission to ground of this information was confined to whichever of the two units might fail first). The results of the examination of this unit were very helpful to the analysis of the failure sequence.
Post-flight analysis of telemetry has shown a number of anomalies which have been reported to the Board. They are mostly of minor significance and such as to be expected on a demonstration flight.
One anomaly which was brought to the particular attention of the Board was the gradual development, starting at H0 + 22 seconds, of variations in the hydraulic pressure of the actuators of the main engine nozzle. These variations had a frequency of approximately 10 Hz.
There are some preliminary explanations as to the cause of these variations, which are now under investigation.
After consideration, the Board has formed the opinion that this anomaly, while significant, has no bearing on the failure of Ariane 501.
In general terms, the Flight Control System of the Ariane 5 is of a standard design. The attitude of the launcher and its movements in space are measured by an Inertial Reference System (SRI). It has its own internal computer, in which angles and velocities are calculated on the basis of information from a "strap-down" inertial platform, with laser gyros and accelerometers. The data from the SRI are transmitted through the databus to the On-Board Computer (OBC), which executes the flight program and controls the nozzles of the solid boosters and the Vulcain cryogenic engine, via servovalves and hydraulic actuators.
In order to improve reliability there is considerable redundancy at equipment level. There are two SRIs operating in parallel, with identical hardware and software. One SRI is active and one is in "hot" stand-by, and if the OBC detects that the active SRI has failed it immediately switches to the other one, provided that this unit is functioning properly. Likewise there are two OBCs, and a number of other units in the Flight Control System are also duplicated.
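The switch-over rule described in the preceding paragraph can be sketched as follows. This is only an illustration in Python, not the flight software: the names `InertialUnit` and `select_unit` are hypothetical, and the actual OBC logic is not given in this report.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InertialUnit:
    """Hypothetical stand-in for one SRI as seen by the OBC."""
    name: str
    healthy: bool = True


def select_unit(active: InertialUnit, standby: InertialUnit) -> Optional[InertialUnit]:
    """Pick the unit to read from: the active one, else the stand-by if it works."""
    if active.healthy:
        return active
    if standby.healthy:
        return standby
    return None  # no valid attitude source is left


sri_1 = InertialUnit("SRI 1")
sri_2 = InertialUnit("SRI 2")

sri_1.healthy = False  # a single failure is absorbed by the stand-by
after_one_failure = select_unit(sri_1, sri_2)

sri_2.healthy = False  # a second failure leaves nothing to switch to
after_two_failures = select_unit(sri_1, sri_2)
```

A rule of this kind covers the loss of one unit. It offers no protection when both units stop for the same reason.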
The design of the Ariane 5 SRI is practically the same as that of an SRI which is presently used on Ariane 4, particularly as regards the software.
Based on the extensive documentation and data on the Ariane 501 failure made available to the Board, the following chain of events, their inter-relations and causes have been established, starting with the destruction of the launcher and tracing back in time towards the primary cause.
The SRI internal events that led to the failure have been reproduced by simulation calculations. Furthermore, both SRIs were recovered during the Board's investigation and the failure context was precisely determined from memory readouts. In addition, the Board has examined the software code which was shown to be consistent with the failure scenario. The results of these examinations are documented in the Technical Report.
Therefore, it is established beyond reasonable doubt that the chain of events set out above reflects the technical causes of the failure of Ariane 501.
In the failure scenario, the primary technical causes are the Operand Error when converting the horizontal bias variable BH, and the lack of protection of this conversion which caused the SRI computer to stop.
It has been stated to the Board that not all the conversions were protected because a maximum workload target of 80% had been set for the SRI computer. To determine the vulnerability of unprotected code, an analysis was performed on every operation which could give rise to an exception, including an Operand Error. In particular, the conversion of floating point values to integers was analysed and operations involving seven variables were at risk of leading to an Operand Error. This led to protection being added to four of the variables, evidence of which appears in the Ada code. However, three of the variables were left unprotected. No reference to justification of this decision was found directly in the source code. Given the large amount of documentation associated with any industrial application, the assumption, although agreed, was essentially obscured, though not deliberately, from any external review.
The reason for the three remaining variables, including the one denoting horizontal bias, being unprotected was that further reasoning indicated that they were either physically limited or that there was a large margin of safety, a reasoning which in the case of the variable BH turned out to be faulty. It is important to note that the decision to protect certain variables but not others was taken jointly by project partners at several contractual levels.
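The difference between an unprotected and a protected conversion can be illustrated with a short sketch. It is written in Python rather than Ada and rests on assumptions not stated in this section: a signed 16-bit target, and clamping as the form of protection (the report does not say which form was used for the four protected variables). The function names and `OperandError` class are hypothetical.

```python
INT16_MIN, INT16_MAX = -32768, 32767  # assumed target range


class OperandError(ArithmeticError):
    """Stand-in for the exception raised by an out-of-range conversion."""


def to_int16_unprotected(x: float) -> int:
    """Convert by truncation; raise if the result does not fit in 16 bits."""
    n = int(x)
    if n < INT16_MIN or n > INT16_MAX:
        raise OperandError(f"{x!r} does not fit in a signed 16-bit integer")
    return n


def to_int16_protected(x: float) -> int:
    """Clamp to the representable range first, so the conversion cannot overflow.

    NaN is not handled here; a real implementation would need a policy for it.
    """
    clamped = min(max(x, INT16_MIN), INT16_MAX)
    return int(clamped)
```

The protected form costs two extra comparisons per conversion, which is the kind of overhead that a processor workload target would bring into question.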
There is no evidence that any trajectory data were used to analyse the behaviour of the unprotected variables, and it is even more important to note that it was jointly agreed not to include the Ariane 5 trajectory data in the SRI requirements and specification.
Although the source of the Operand Error has been identified, this in itself did not cause the mission to fail. The specification of the exception-handling mechanism also contributed to the failure. In the event of any kind of exception, the system specification stated that: the failure should be indicated on the databus, the failure context should be stored in an EEPROM memory (which was recovered and read out for Ariane 501), and finally, the SRI processor should be shut down.
It was the decision to cease the processor operation which finally proved fatal. Restart is not feasible since attitude is too difficult to re-calculate after a processor shutdown; therefore the Inertial Reference System becomes useless. The reason behind this drastic action lies in the culture within the Ariane programme of only addressing random hardware failures. From this point of view exception - or error - handling mechanisms are designed for a random hardware failure which can quite rationally be handled by a backup system.
Although the failure was due to a systematic software design error, mechanisms can be introduced to mitigate this type of problem. For example the computers within the SRIs could have continued to provide their best estimates of the required attitude information. There is reason for concern that a software exception should be allowed, or even required, to cause a processor to halt while handling mission-critical equipment. Indeed, the loss of a proper software function is hazardous because the same software runs in both SRI units. In the case of Ariane 501, this resulted in the switch-off of two still healthy critical units of equipment.
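A minimal sketch of the "best estimate" idea follows. It assumes, purely for illustration, that holding the last good value and flagging it as degraded is an acceptable fallback; the report does not say how a best estimate should be formed. `Reading`, `BestEffortSensor` and `scale` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Reading:
    value: float
    degraded: bool  # True when the value is a fallback, not a fresh computation


class BestEffortSensor:
    """Wrap a computation so an arithmetic exception degrades the output instead of stopping it."""

    def __init__(self, compute: Callable[[float], float], initial: float = 0.0) -> None:
        self._compute = compute
        self._last_good = initial

    def read(self, raw: float) -> Reading:
        try:
            value = self._compute(raw)
        except ArithmeticError:
            # Confine the exception: keep reporting, but flag the data as degraded.
            return Reading(self._last_good, degraded=True)
        self._last_good = value
        return Reading(value, degraded=False)


def scale(raw: float) -> float:
    """Hypothetical computation that fails on large inputs."""
    if abs(raw) > 32767:
        raise OverflowError("input out of range")
    return raw * 0.5


sensor = BestEffortSensor(scale)
first = sensor.read(100.0)         # fresh value
second = sensor.read(1_000_000.0)  # computation fails, last good value is reused
```

The consumer of the data can then decide what to do with a degraded reading, instead of losing the sensor altogether.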
The original requirement accounting for the continued operation of the alignment software after lift-off was brought forward more than 10 years ago for the earlier models of Ariane, in order to cope with the rather unlikely event of a hold in the count-down, e.g. between -9 seconds, when flight mode starts in the SRI of Ariane 4, and -5 seconds, when certain events are initiated in the launcher which take several hours to reset. The period selected for this continued alignment operation, 50 seconds after the start of flight mode, was based on the time needed for the ground equipment to resume full control of the launcher in the event of a hold.
This special feature made it possible, with the earlier versions of Ariane, to restart the count-down without waiting for normal alignment, which takes 45 minutes or more, so that a short launch window could still be used. In fact, this feature was used once, in 1989 on Flight 33.
The same requirement does not apply to Ariane 5, which has a different preparation sequence, and it was maintained for commonality reasons, presumably based on the view that, unless proven necessary, it was not wise to make changes in software which worked well on Ariane 4.
Even in those cases where the requirement is found to be still valid, it is questionable for the alignment function to be operating after the launcher has lifted off. Alignment of mechanical and laser strap-down platforms involves complex mathematical filter functions to properly align the x-axis to the gravity axis and to find north direction from Earth rotation sensing. The assumption of preflight alignment is that the launcher is positioned at a known and fixed position. Therefore, the alignment function is totally disrupted when performed during flight, because the measured movements of the launcher are interpreted as sensor offsets and other coefficients characterising sensor behaviour.
Returning to the software error, the Board wishes to point out that software is an expression of a highly detailed design and does not fail in the same sense as a mechanical system. Furthermore software is flexible and expressive and thus encourages highly demanding requirements, which in turn lead to complex implementations which are difficult to assess.
An underlying theme in the development of Ariane 5 is the bias towards the mitigation of random failure. The supplier of the SRI was only following the specification given to it, which stipulated that in the event of any detected exception the processor was to be stopped. The exception which occurred was not due to random failure but a design error. The exception was detected, but inappropriately handled because the view had been taken that software should be considered correct until it is shown to be at fault. The Board has reason to believe that this view is also accepted in other areas of Ariane 5 software design. The Board is in favour of the opposite view, that software should be assumed to be faulty until applying the currently accepted best practice methods can demonstrate that it is correct.
This means that critical software - in the sense that failure of the software puts the mission at risk - must be identified at a very detailed level, that exceptional behaviour must be confined, and that a reasonable back-up policy must take software failures into account.
2.3 THE TESTING AND QUALIFICATION PROCEDURES
The Flight Control System qualification for Ariane 5 follows a standard procedure and is performed at the following levels :
- Equipment qualification
- Software qualification (On-Board Computer software)
- Stage integration
- System validation tests.
The logic applied is to check at each level what could not be achieved at the previous level, thus eventually providing complete test coverage of each sub-system and of the integrated system.
Testing at equipment level was in the case of the SRI conducted rigorously with regard to all environmental factors and in fact beyond what was expected for Ariane 5. However, no test was performed to verify that the SRI would behave correctly when being subjected to the count-down and flight time sequence and the trajectory of Ariane 5.
It should be noted that for reasons of physical law, it is not feasible to test the SRI as a "black box" in the flight environment, unless one makes a completely realistic flight test, but it is possible to do ground testing by injecting simulated accelerometric signals in accordance with predicted flight parameters, while also using a turntable to simulate launcher angular movements. Had such a test been performed by the supplier or as part of the acceptance test, the failure mechanism would have been exposed.
The main explanation for the absence of this test has already been mentioned above, i.e. the SRI specification (which is supposed to be a requirements document for the SRI) does not contain the Ariane 5 trajectory data as a functional requirement.
The Board has also noted that the systems specification of the SRI does not indicate operational restrictions that emerge from the chosen implementation. Such a declaration of limitation, which should be mandatory for every mission-critical device, would have served to identify any non-compliance with the trajectory of Ariane 5.
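How a declared limitation could be checked against trajectory data is sketched below. Everything numeric here is invented for illustration: the limit range, the linear profiles and their rates are not flight data, and the factor of five between the two profiles merely echoes the ratio quoted in the Board's findings. `DeclaredLimit`, `find_violations` and `linear_profile` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Sample = Tuple[int, float]  # (time in seconds, value of the variable)


@dataclass(frozen=True)
class DeclaredLimit:
    """An operational restriction published with the equipment."""
    variable: str
    low: float
    high: float


def find_violations(limit: DeclaredLimit, profile: Sequence[Sample]) -> List[Sample]:
    """Return every sample of the predicted profile that falls outside the declared range."""
    return [(t, v) for t, v in profile if v < limit.low or v > limit.high]


def linear_profile(rate: float, seconds: int) -> List[Sample]:
    """Invented profile: the variable grows linearly with time."""
    return [(t, rate * t) for t in range(seconds + 1)]


limit = DeclaredLimit("BH", -32768.0, 32767.0)  # assumed range
slow = linear_profile(500.0, 40)    # stays inside the declared range
fast = linear_profile(2500.0, 40)   # five times faster build-up

slow_violations = find_violations(limit, slow)
fast_violations = find_violations(limit, fast)
```

With these invented numbers the slower profile produces no violations, while the faster one leaves the declared range well before the 40-second mark. A check of this kind needs both the declared limit and the trajectory data to be part of the specification.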
The other principal opportunity to detect the failure mechanism beforehand was during the numerous tests and simulations carried out at the Functional Simulation Facility ISF, which is at the site of the Industrial Architect. The scope of the ISF testing is to qualify :
- the guidance, navigation and control performance in the whole flight envelope,
- the sensors redundancy operation,
- the dedicated functions of the stages,
- the flight software (On-Board Computer) compliance with all equipment of the Flight Control Electrical System.
A large number of closed-loop simulations of the complete flight simulating ground segment operation, telemetry flow and launcher dynamics were run in order to verify :
- the nominal trajectory
- trajectories degraded with respect to internal launcher parameters
- trajectories degraded with respect to atmospheric parameters
- equipment failures and the subsequent failure isolation and recovery
In these tests many equipment items were physically present and exercised but not the two SRIs, which were simulated by specifically developed software modules. Some open-loop tests, to verify compliance of the On-Board Computer and the SRI, were performed with the actual SRI. It is understood that these were just electrical integration tests and "low-level" (bus communication) compliance tests.
It is not mandatory, even if preferable, that all the parts of the subsystem are present in all the tests at a given level. Sometimes this is not physically possible or it is not possible to exercise them completely or in a representative way. In these cases it is logical to replace them with simulators but only after a careful check that the previous test levels have covered the scope completely.
This procedure is especially important for the final system test before the system is operationally used (the tests performed on the 501 launcher itself are not addressed here since they are not specific to the Flight Control Electrical System qualification).
In order to understand the explanations given for the decision not to have the SRIs in the closed-loop simulation, it is necessary to describe the test configurations that might have been used.
Because it is not possible to simulate the large linear accelerations of the launcher in all three axes on a test bench (as discussed above), there are two ways to put the SRI in the loop:
A) To put it on a three-axis dynamic table (to stimulate the Ring Laser Gyros) and to substitute the analog output of the accelerometers (which cannot be stimulated mechanically) by simulation via a dedicated test input connector and an electronic board designed for this purpose. This is similar to the method mentioned in connection with possible testing at equipment level.
B) To substitute both the analog output of the accelerometers and that of the Ring Laser Gyros, via a dedicated test input connector, with signals produced by simulation.
The first approach is likely to provide an accurate simulation (within the limits of the three-axis dynamic table bandwidth) and is quite expensive; the second is cheaper and its performance depends essentially on the accuracy of the simulation. In both cases a large part of the electronics and the complete software are tested in the real operating environment.
When the project test philosophy was defined, the importance of having the SRIs in the loop was recognized and a decision was taken to select method B above. At a later stage of the programme (in 1992), this decision was changed. It was decided not to have the actual SRIs in the loop for the following reasons :
The opinion of the Board is that these arguments were technically valid, but since the purpose of a system simulation test is not only to verify the interfaces but also to verify the system as a whole for the particular application, there was a definite risk in assuming that critical equipment such as the SRI had been validated by qualification on its own, or by previous use on Ariane 4.
While high accuracy of a simulation is desirable, in the ISF system tests it is clearly better to compromise on accuracy but achieve all other objectives, amongst them to prove the proper system integration of equipment such as the SRI. The precision of the guidance system can be effectively demonstrated by analysis and computer simulation.
Under this heading it should be noted finally that the overriding means of preventing failures are the reviews which are an integral part of the design and qualification process, and which are carried out at all levels and involve all major partners in the project (as well as external experts). In a programme of this size, literally thousands of problems and potential failures are successfully handled in the review process and it is obviously not easy to detect software design errors of the type which were the primary technical cause of the 501 failure. Nevertheless, it is evident that the limitations of the SRI software were not fully analysed in the reviews, and it was not realised that the test coverage was inadequate to expose such limitations. Nor were the possible implications of allowing the alignment software to operate during flight realised. In these respects, the review process was a contributory factor in the failure.
In accordance with its terms of reference, the Board has examined possible other weaknesses, primarily in the Flight Control System. No weaknesses were found which were related to the failure, but in spite of the short time available, the Board has conducted an extensive review of the Flight Control System based on experience gained during the failure analysis.
The review has covered the following areas :
- The design of the electrical system,
- Embedded on-board software in subsystems other than the Inertial Reference System,
- The On-Board Computer and the flight program software.
In addition, the Board has made an analysis of methods applied in the development programme, in particular as regards software development methodology.
The results of these efforts have been documented in the Technical Report and it is the hope of the Board that they will contribute to further improvement of the Ariane 5 Flight Control System and its software.
The Board reached the following findings:
a) During the launch preparation campaign and the count-down no events occurred which were related to the failure.
b) The meteorological conditions at the time of the launch were acceptable and did not play any part in the failure. No other external factors have been found to be of relevance.
c) Engine ignition and lift-off were essentially nominal and the environmental effects (noise and vibration) on the launcher and the payload were not found to be relevant to the failure. Propulsion performance was within specification.
d) 22 seconds after H0 (command for main cryogenic engine ignition), variations of 10 Hz frequency started to appear in the hydraulic pressure of the actuators which control the nozzle of the main engine. This phenomenon is significant and has not yet been fully explained, but after consideration it has not been found relevant to the failure.
e) At 36.7 seconds after H0 (approx. 30 seconds after lift-off) the computer within the back-up inertial reference system, which was working on stand-by for guidance and attitude control, became inoperative. This was caused by an internal variable related to the horizontal velocity of the launcher exceeding a limit which existed in the software of this computer.
f) Approx. 0.05 seconds later the active inertial reference system, identical to the back-up system in hardware and software, failed for the same reason. Since the back-up inertial system was already inoperative, correct guidance and attitude information could no longer be obtained and loss of the mission was inevitable.
g) As a result of its failure, the active inertial reference system transmitted essentially diagnostic information to the launcher's main computer, where it was interpreted as flight data and used for flight control calculations.
h) On the basis of those calculations the main computer commanded the booster nozzles, and somewhat later the main engine nozzle also, to make a large correction for an attitude deviation that had not occurred.
i) A rapid change of attitude occurred which caused the launcher to disintegrate at 39 seconds after H0 due to aerodynamic forces.
j) Destruction was automatically initiated upon disintegration, as designed, at an altitude of 4 km and a distance of 1 km from the launch pad.
k) The debris was spread over an area of 5 x 2.5 km². Amongst the equipment recovered were the two inertial reference systems. They have been used for analysis.
l) The post-flight analysis of telemetry data has listed a number of additional anomalies which are being investigated but are not considered significant to the failure.
m) The inertial reference system of Ariane 5 is essentially common to a system which is presently flying on Ariane 4. The part of the software which caused the interruption in the inertial system computers is used before launch to align the inertial reference system and, in Ariane 4, also to enable a rapid realignment of the system in case of a late hold in the countdown. This realignment function, which does not serve any purpose on Ariane 5, was nevertheless retained for commonality reasons and allowed, as in Ariane 4, to operate for approx. 40 seconds after lift-off.
n) During design of the software of the inertial reference system used for Ariane 4 and Ariane 5, a decision was taken that it was not necessary to protect the inertial system computer from being made inoperative by an excessive value of the variable related to the horizontal velocity, a protection which was provided for several other variables of the alignment software. When taking this design decision, it was not analysed or fully understood which values this particular variable might assume when the alignment software was allowed to operate after lift-off.
o) In Ariane 4 flights using the same type of inertial reference system there has been no such failure because the trajectory during the first 40 seconds of flight is such that the particular variable related to horizontal velocity cannot reach, with an adequate operational margin, a value beyond the limit present in the software.
p) Ariane 5 has a high initial acceleration and a trajectory which leads to a build-up of horizontal velocity which is five times more rapid than for Ariane 4. The higher horizontal velocity of Ariane 5 generated, within the 40-second timeframe, the excessive value which caused the inertial system computers to cease operation.
q) The purpose of the review process, which involves all major partners in the Ariane 5 programme, is to validate design decisions and to obtain flight qualification. In this process, the limitations of the alignment software were not fully analysed and the possible implications of allowing it to continue to function during flight were not realised.
r) The specification of the inertial reference system and the tests performed at equipment level did not specifically include the Ariane 5 trajectory data. Consequently the realignment function was not tested under simulated Ariane 5 flight conditions, and the design error was not discovered.
s) It would have been technically feasible to include almost the entire inertial reference system in the overall system simulations which were performed. For a number of reasons it was decided to use the simulated output of the inertial reference system, not the system itself or its detailed simulation. Had the system been included, the failure could have been detected.
t) Post-flight simulations have been carried out on a computer with software of the inertial reference system and with a simulated environment, including the actual trajectory data from the Ariane 501 flight. These simulations have faithfully reproduced the chain of events leading to the failure of the inertial reference systems.
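Finding g) describes diagnostic information being read as flight data. The sketch below illustrates that mechanism in general terms only. The report does not describe the databus message format, so the 4-byte frame, the float encoding, the error code and the tag byte are all hypothetical, and the tagged variant is shown for contrast, not as a measure proposed by the Board.

```python
import struct
from typing import Optional

KIND_ATTITUDE = 0x01    # hypothetical tag values
KIND_DIAGNOSTIC = 0x02


def decode_untagged(frame: bytes) -> float:
    """Treat any 4-byte frame as a big-endian 32-bit float attitude value."""
    return struct.unpack(">f", frame)[0]


def encode_tagged(kind: int, payload: bytes) -> bytes:
    """Prefix the 4-byte payload with one byte saying what it contains."""
    return bytes([kind]) + payload


def decode_tagged(frame: bytes) -> Optional[float]:
    """Return an attitude value only if the frame says it carries one."""
    if frame[0] != KIND_ATTITUDE:
        return None
    return struct.unpack(">f", frame[1:5])[0]


diagnostic_word = struct.pack(">I", 0x42C80000)  # hypothetical error code

# The receiver cannot tell the bit pattern apart from a plausible number.
misread = decode_untagged(diagnostic_word)

# With an explicit tag the same bytes are rejected instead of being used.
rejected = decode_tagged(encode_tagged(KIND_DIAGNOSTIC, diagnostic_word))
accepted = decode_tagged(encode_tagged(KIND_ATTITUDE, struct.pack(">f", 1.5)))
```

In this example the error code happens to share its bit pattern with the 32-bit float 100.0, so `misread` looks like an ordinary measurement.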
The failure of the Ariane 501 was caused by the complete loss of guidance and attitude information 37 seconds after start of the main engine ignition sequence (30 seconds after lift-off). This loss of information was due to specification and design errors in the software of the inertial reference system.
The extensive reviews and tests carried out during the Ariane 5 Development Programme did not include adequate analysis and testing of the inertial reference system or of the complete flight control system, which could have detected the potential failure.
On the basis of its analyses and conclusions, the Board makes the following recommendations.
R1 Switch off the alignment function of the inertial reference system immediately after lift-off. More generally, no software function should run during flight unless it is needed.
R2 Prepare a test facility including as much real equipment as technically feasible, inject realistic input data, and perform complete, closed-loop, system testing. Complete simulations must take place before any mission. A high test coverage has to be obtained.
R3 Do not allow any sensor, such as the inertial reference system, to stop sending best effort data.
R4 Organize, for each item of equipment incorporating software, a specific software qualification review. The Industrial Architect shall take part in these reviews and report on complete system testing performed with the equipment. All restrictions on use of the equipment shall be made explicit for the Review Board. Make all critical software a Configuration Controlled Item (CCI).
R5 Review all flight software (including embedded software), and in particular :
R6 Wherever technically feasible, consider confining exceptions to tasks and devise backup capabilities.
R7 Provide more data to the telemetry upon failure of any component, so that recovering equipment will be less essential.
R8 Reconsider the definition of critical components, taking failures of software origin into account (particularly single point failures).
R9 Include external (to the project) participants when reviewing specifications, code and justification documents. Make sure that these reviews consider the substance of arguments, rather than check that verifications have been made.
R10 Include trajectory data in specifications and test requirements.
R11 Review the test coverage of existing equipment and extend it where it is deemed necessary.
R12 Give the justification documents the same attention as code. Improve the technique for keeping code and its justifications consistent.
R13 Set up a team that will prepare the procedure for qualifying software, propose stringent rules for confirming such qualification, and ascertain that specification, verification and testing of software are of a consistently high quality in the Ariane 5 programme. Including external RAMS experts is to be considered.
R14 A more transparent organisation of the cooperation among the partners in the Ariane 5 programme must be considered. Close engineering cooperation, with clear cut authority and responsibility, is needed to achieve system coherence, with simple and clear interfaces between partners.
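Recommendation R1 can be read as a rule that each software function declares the mission phases in which it is needed. The following sketch shows one way of expressing that rule; the phase names, the function names and the table `NEEDED_IN` are hypothetical and do not describe the SRI software.

```python
from enum import Enum, auto
from typing import Dict, List, Set


class Phase(Enum):
    PRE_LAUNCH = auto()
    FLIGHT = auto()


# Each function lists the phases in which it serves a purpose.
NEEDED_IN: Dict[str, Set[Phase]] = {
    "alignment": {Phase.PRE_LAUNCH},
    "attitude_output": {Phase.PRE_LAUNCH, Phase.FLIGHT},
}


def functions_to_run(phase: Phase) -> List[str]:
    """Return, in a stable order, only the functions needed in the given phase."""
    return sorted(name for name, phases in NEEDED_IN.items() if phase in phases)


before_lift_off = functions_to_run(Phase.PRE_LAUNCH)
after_lift_off = functions_to_run(Phase.FLIGHT)
```

Keeping the table explicit makes it a reviewable item: a function that runs in flight has to be listed there, with a reason that can be challenged.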
- END -