1.5. Reliability prediction during project life cycle#

1.5.1. Reliability prediction framework underlying this handbook#

Any reliability prediction should be clear, specific and useful.

  • “Clear” means that the assumptions underlying the prediction are clearly defined and communicated together with the prediction results, allowing for a meaningful interpretation.

  • “Specific” means that the characteristics of the product and mission under analysis are considered appropriately, and the prediction accounts for the relevant variables.

  • “Useful” means that the prediction gives answers to the questions that are relevant for the required use of the prediction results.

To achieve this, the framework of this handbook starts the prediction process with a discussion of the assumptions underlying the prediction (Section 1.5.2), allowing the prediction to be specific to the problem at hand while remaining clear regarding the meaning and interpretation of the prediction results. A key concept is the intended use of the prediction, which drives the required scope and focus of the prediction so that it is useful for the trade-offs it will support (Section 1.5.3). Throughout the project life cycle, the prediction serves different purposes, and different reliability prediction tasks are performed to coordinate the prediction throughout the system development (Section 1.5.4).

1.5.2. Assumptions and ground rules for the prediction#

When planning a reliability prediction, the ground rules and key assumptions underlying the prediction must be agreed upon as a first step. Which assumptions are appropriate depends on the following aspects:

  • the characteristics of the system or item under analysis,

  • the project life cycle phase in which the prediction is performed and

  • the intended use of the prediction results.

Thus, sufficient information on these three points should be available before starting with the prediction, and even before defining any ground rules and assumptions.

Table 1.5.1 gives an overview of different areas in which assumptions need to be made before starting a system level reliability prediction. The associated ground rules give a rough indication of the general principles that should be followed to achieve maximum accuracy (i.e. realistic predictions). It is not mandatory to be compliant with each ground rule, but the assumptions made should be justified considering the characteristics of the system, the project life cycle phase in which the prediction is made and the intended use of the prediction results.

It should be clear that the assumptions made can have a tremendous impact on the results. If the main goal is to obtain reliability predictions that are comparable (e.g. between different manufacturers or suppliers), one must ensure that the same assumptions are used. At least the key assumptions with the largest impact on the results should be agreed by all parties, and/or specified in the supply chain.

Any reliability prediction report should provide full information on the assumptions made for the prediction, as listed in Table 1.5.1. This ensures that the required elements of a reliability prediction, as defined in [NR_METHODO_11], Clause 4, are provided with the prediction. The relation between each of these required elements, the assumptions listed in Table 1.5.1 and the relevant sections of this handbook is given in Table 1.5.2.

Table 1.5.1 List of assumptions to be agreed upon and associated ground rules for predictions at system level (equipment level or higher).#
| No. | Assumption | Ground rule |
|---|---|---|
| 1 | Basic information needed to define the assumptions | |
| 1.1 | System definition | The system or item under analysis shall be clearly defined, with a description of all relevant characteristics needed for performing the prediction. |
| 1.2 | Project life cycle phase | The project life cycle phase during which the prediction is performed, as well as the associated reviews, shall be clearly defined. |
| 1.3 | Intended use of the prediction | The reliability prediction objectives or intended use(s) of the prediction results shall be clearly defined. |
| 2 | Assumptions related to the reliability prediction coverage | |
| 2.1 | Mission phases coverage | The prediction shall cover all mission phases that affect reliability and are relevant for the supported reliability prediction use cases. |
| 2.2 | Elements coverage | The prediction shall cover all spacecraft elements unless their contribution to overall system (un-)reliability is negligible. |
| 2.3 | Failure modes coverage | The prediction shall cover all failure modes with a relevant effect on the state (or performance) of the overall system. |
| 2.4 | Failure mechanisms coverage | The prediction shall cover all failure mechanisms with a relevant contribution to the occurrence of the considered failure modes. |
| 2.5 | Failure root causes coverage | The coverage in terms of failure root causes shall be defined depending on the reliability prediction use. |
| 3 | Assumptions related to the reliability prediction input | |
| 3.1 | Mission definition | The mission definition shall specify the functions as well as the performance levels (degraded system modes) to be analysed. |
| 3.2 | Design lifetime | The design lifetime shall be clearly specified. For lifetime extensions, the analysis shall account for accumulated time and stresses. |
| 3.3 | Operational conditions | In each project phase, the operational conditions shall be defined or updated based on all available information. |
| 3.4 | Environmental conditions | In each project phase, the environmental conditions shall be defined or updated based on all available information. |
| 3.5 | Product design information | The available product design information, as of the current project phase, shall be used to build or update the reliability model. |
| 3.6 | Methods, models | The prediction methods and models shall be selected based on the technologies, use conditions and available information. |
| 3.7 | Data | Relevant test or field return data should be used to build or update the reliability models (when available). |
| 4 | Assumptions related to the reliability modelling | |
| 4.1 | Redundancy considerations | Redundancies shall be modelled considering the specific type of redundancy and appropriate input from lower levels. |
| 4.2 | Degraded system modes | Degraded system modes shall be modelled explicitly, considering reliability as a function of the required performance level. |
| 4.3 | Dormant phases | Dormant phase modelling shall account for the difference between the stresses in active and passive mode. |
| 4.4 | Common cause effects | Common cause effects shall be considered, accounting for the system layout, use conditions and considered categories of failures. |
| 4.5 | Distribution functions | The selection of distribution functions to model reliability in time shall be justified considering the technologies and relevant failure mechanisms. |
| 5 | Assumptions related to the reliability prediction outputs | |
| 5.1 | Prediction metrics | The prediction metrics (e.g. failure rate, reliability in time, probability of failure on demand) shall be consistent with the reliability modelling. |
| 5.2 | Prediction uncertainties | The most relevant epistemic uncertainties associated with the prediction shall be identified and communicated together with the prediction results. |
| 5.3 | Conservatism | The required accuracy or conservatism (realistic vs. conservative prediction) shall depend on the intended use of the prediction. |
Table 1.5.2 Relation between the assumptions listed in Table 1.5.1, the required elements of a prediction according to IEEE 1413:2010 and the relevant sections of this handbook.#

| Required element of a reliability prediction ([NR_METHODO_11], Clause 4) | Table 1.5.1 assumptions | Handbook sections |
|---|---|---|
| Identification and description of the item for which the prediction is made and the life cycle phase upon which the prediction is performed | System definition; Project life cycle phase | Section 1.4.1.1, Section 1.5.4 |
| Intended use of the prediction results | Intended use of the prediction | Section 1.5.3.2 |
| RP coverage: no required element in [NR_METHODO_11] | Mission phases coverage; Elements coverage; Failure modes coverage; Failure mechanisms coverage; Failure root causes coverage | Section 1.4.1, Section 1.4.2, Section 2.4, Section 1.4.3, Section 1.4.4 |
| List of inputs used for the selected methodologies | Mission definition; Design lifetime; Operational conditions; Environmental conditions; Product design information; Methods, models; Data | Section 1.4.3.4, Section 2.4 |
| Modelling: no required element in [NR_METHODO_11] | Redundancy considerations; Degraded system modes; Dormant phases; Common cause effects; Distribution functions | Section 9 |
| Prediction metrics: definitions and values | Prediction metrics | Section 2.4 |
| Uncertainties and limitations of the prediction | Prediction uncertainties | Section 2.6 |
| Statistical confidence in the prediction | Conservatism | Section 2.6 |

1.5.3. Scope and focus of the prediction for different reliability prediction uses#

The scope and focus of a reliability prediction can be defined in terms of different axes, see Table 1.5.2 above (RP coverage).

As a general rule, all elements and associated failure modes and mechanisms need to be covered by the prediction, unless their contribution to overall system (un-)reliability – or to the decision that will be supported – may be assumed to be negligible for practical purposes. Similar considerations hold for the coverage of mission phases during the prediction. The required coverage in terms of root causes (failure categories, see Section 1.4.3 for classification) depends on the intended use of the prediction. This is discussed in the following subsections.

1.5.3.1. Reliability prediction versus reliability management#

Achieving a high reliability product is an important objective during the design and production of any space system. Considering all root causes of failure is a prerequisite, and different mitigation processes are in place to avoid the occurrence of each of them, see Table 1.5.3 for examples. Apart from measures to avoid the different root causes, system level design aims at mitigating the effect of lower level failures on the success of the mission.

The objective of reliability predictions is to provide quantitative estimates for the (remaining) probability of failure despite the implementation of these measures. Some of the mitigation measures are explicitly considered in the prediction, e.g. quality level of EEE parts, or redundancy at system level. Others may be used as a justification to neglect certain root causes in the prediction, provided that the mitigation measures are sufficiently effective to avoid their occurrence. To give an example, calculations from radiation engineering may provide evidence that the rate of destructive Single Event Effects is negligible compared to the random failure rate. Similar considerations become relevant for wear-out failures of EEE components, which can in most cases be effectively avoided by safe life qualification (with appropriate margins), at least when the prediction is limited to the specified design lifetime.

In addition, depending on the intended use of the prediction, there may be no added value in making a quantitative prediction for a certain root cause if it does not make a difference for the trade-offs that will be supported by the prediction. These aspects are discussed in the following sections.

Table 1.5.3 Failure categories (from Table 4-2) with examples of mitigation measures.#

| Failure category | Root cause | Mitigation |
|---|---|---|
| RANDOM FAILURE (RF) | Unknown residual defect / weakness: consistent with quality level; under normal stresses (refer to data sheet); one-off event | Space qualification; part quality selection; derating; redundancy; FDIR |
| SYSTEMATIC FAILURE (SF) | Design error; manufacturing error; operations error | Robust design; quality assurance (during design, manufacturing and operations); qualification & verification processes |
| WEAR-OUT FAILURE (WO) | Normal physical process → time/equivalent time: operations-related (e.g. on/off, duty cycle); environment-related (e.g. radiation) | Components and materials selection; design calculations and margins; lifetime qualification with margins |
| EXTRINSIC FAILURE (EF) | Vacuum (outgassing, cold welding, heat transfer); thermal (solar radiation, solar albedo, Earth IR radiation); magnetic field; mechanical vibrations / shocks (launcher, pyro activation); atomic oxygen (erosion → considered as WO); radiation (cumulated effects → considered as WO); UV (degradation → considered as WO); plasma (ESD); SEE (destructive / non-destructive); micrometeorites; debris | Components and materials selection; design calculations and margins; qualification and verification testing; thermal control; shielding (thermal, radiation, debris); radiation engineering; debris impact predictions; avoidance manoeuvres; ... |

1.5.3.2. Overview on different reliability prediction uses#

Different reliability prediction uses become relevant throughout the project life cycle of a space mission; see Table 1.5.4 for an overview. The table includes some classical reliability prediction uses, related to the management and verification of reliability requirements or to the support of design trade-off decisions. However, some newer needs are also addressed, e.g. related to the design of constellations or to the safe disposal of satellites.

Table 1.5.4 Possible reliability prediction uses throughout the project life cycle.#

| Reliability prediction use | Description | Phases |
|---|---|---|
| Input to the design, support of trade-offs and comparisons | During the design phase, reliability prediction may be used to compare competing designs or trade-off options, to identify weak parts of the design and to assess the impact of design changes. The level of detail increases with the project phases. | A - D |
| Establishment, management and verification of quantitative reliability requirements | The purpose of quantitative reliability requirements and their management and verification is to ensure acceptable (as specified) reliability of space products through contractual specification. | 0 - D |
| Support of decisions on the choice of engineering design margins | Reducing excessive margins is one way to reduce cost, but should be justified by an appropriate rationale or analysis. On the other hand, it may sometimes be reasonable to increase a specific margin to avoid a catastrophic single point failure. | A - C |
| Choosing a test strategy at part, equipment or higher levels | Another way to reduce cost is to reduce the effort dedicated to testing. Additional tests may be useful, e.g. to verify a design for identified stresses and avoid costly redesigns. These decisions, too, should be justified by an appropriate rationale or analysis. | D |
| Support of business planning for single spacecraft and for the design of constellations | Reliability predictions - or specified reliability requirements - form an important input to a space customer's business planning. This holds particularly for the design of constellations as "systems of systems", e.g. to decide on the number of spares, replenishment scenarios or redundancy management. | 0 - E |
| Health monitoring and decision making on lifetime extension vs. safe disposal | Spacecraft are designed to be reliable for a specified lifetime, and any lifetime extension needs to be justified based on a reliability prediction. To support the decision, the reliability prediction needs to be revisited, in particular for the functions relevant for satellite safe disposal (space debris mitigation). | E - F |

The reliability prediction methodology presented in this handbook intends to embrace different reliability prediction uses, although the focus is clearly on the "classical" uses related to the development and design of a single spacecraft. The first use listed in Table 1.5.4 - reliability prediction for design support - is considered the base case. Recommendations on the root cause coverage required for this use are provided in Section 1.5.3.2.1, followed by a discussion of the remaining uses in the subsequent subsections.

1.5.3.2.1. Reliability prediction as input to the design#

Providing input to the design of a spacecraft may be seen as the classical use for reliability predictions in space applications, and is the focus of this handbook. To be useful for design, the methodology needs to account for the relevant design variables in order to support the required trade-offs, and for the affected categories of failures.

Guidance on the root cause coverage required for this reliability prediction use is given in Table 1.5.5 below. It should be noted that the recommendation made for systematic failure modelling is driven by the limitations of the available modelling approaches, which do not account for the relevant decision variables (e.g. impact of maturity category, test strategy). Other design decisions, such as redundancy sizing or the margin policy, are not effective in avoiding systematic failures. For these reasons, the added value of considering this failure category for design support is small, despite its clear relevance for the overall failure count.

Table 1.5.5 Root causes coverage for reliability prediction as input to the design.#

| Failure category | Required coverage for reliability prediction as design support |
|---|---|
| Random failures | Full coverage required. |
| Systematic failures | Not generally required, unless full root causes coverage is needed, e.g. to support the design of a constellation (Section 1.5.3.2.5). |
| Wear-out failures | Wear-out after the specified lifetime is out of scope for this use (see Section 1.5.3.2.6 for lifetime extensions). Premature wear-out (excluding systematic failures) needs to be considered for technologies for which safe life qualification is not possible, or not fully effective. |
| Extrinsic failures | Relevant stress contributors resulting from the spacecraft environment should be considered in the prediction of random and wear-out failures. Explicit consideration of extrinsic failures with dedicated models is only required if the rate of occurrence of additional failure modes (e.g. destructive SEE, space debris impact) cannot be neglected when compared to the random failure rate. |

The recommendations regarding root causes coverage are generally valid also for preliminary reliability predictions, e.g. for the Preliminary Design Review (PDR). However, the level of detail used in the modelling can be reduced in this context, to limit the prediction effort and to account for the limited input available in early project phases.

1.5.3.2.2. Establishment and verification of quantitative reliability requirements#

To allow for a meaningful verification, the specification of quantitative reliability requirements should always go hand in hand with a clear definition of the requirement’s scope in terms of failure categories, as well as elements coverage.

The recommendations on root cause coverage for design support in Table 1.5.5 may be used also in this context. The quantitative reliability requirements specified between customers, prime contractors and suppliers can then directly be used as a design driver, with the goal to find the best architecture and detailed design to comply with the requirements under the given schedule and budget constraints.

After completion of the detailed design, the prediction may be extended to account for systematic failures as well, e.g. when full root causes coverage is needed to support the business planning for the owner of a single satellite or a satellite constellation (Section 1.5.3.2.5).

1.5.3.2.3. Reliability prediction supporting the choice of engineering design margins#

The recommendations made in Table 1.5.5 are based on the assumption that the occurrence of certain root causes is effectively reduced by different mitigation measures, as listed in Table 1.5.3. However, one possible use of quantitative reliability predictions is to assess the risk associated with a reduction of design margins, or the benefit of increasing a specific margin. The reliability prediction then needs to account for the effect of this margin policy decision, which may require the consideration of additional root causes. Safe life qualification to avoid wear-out failures before end-of-life is a case in point: wear-out models based on Physics of Failure allow quantifying the effect of the associated margins (e.g. radiation margins) on the item's reliability. More generally, the effect of design margins can be quantified with the aid of any reliability model that considers the effect of the stress contributors addressed by the margin.
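As a minimal illustration of this idea, the sketch below assumes a single wear-out mechanism following a Weibull law whose characteristic life scales with the applied safe-life margin; all parameter values are illustrative assumptions, not handbook data.

```python
import math

def weibull_unreliability(t, eta, beta):
    """Probability of wear-out failure before time t (Weibull CDF)."""
    return 1.0 - math.exp(-((t / eta) ** beta))

t_design = 15.0   # design lifetime in years (assumed value)
beta = 8.0        # steep Weibull shape typical of wear-out (assumed value)

# Characteristic life eta scaled by the safe-life qualification margin:
for margin in (1.5, 2.0, 3.0):
    eta = margin * t_design
    pof = weibull_unreliability(t_design, eta, beta)
    print(f"margin {margin:.1f}: P(wear-out before end-of-life) = {pof:.2e}")
```

Under these assumptions, larger margins drive the probability of wear-out before end-of-life down rapidly, which is the quantitative effect a margin trade-off would weigh against mass or cost.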

1.5.3.2.4. Reliability prediction supporting the choice of a test strategy#

The choice of a suitable test strategy may also be based on quantitative reliability predictions, to assess the effect of testing on the reliability of the flight item. Part level tests, such as lot acceptance tests, reliability tests, lifetime tests or radiation tests, generally have a clear relation to a specific failure category, and the risk associated with a specific test plan (e.g. sample size, duration) can be quantified using statistical methods.
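For instance, for a zero-failure (success-run) test plan, the classical binomial relation links the sample size to the reliability demonstrated at a given confidence level; the sketch below is a minimal illustration, with the numeric values chosen only as an example.

```python
import math

def zero_failure_sample_size(r_demonstrated, confidence):
    """Units to test without failure (success-run relation) to demonstrate
    a reliability of r_demonstrated over the test duration at the given
    confidence level: n >= ln(1 - C) / ln(R)."""
    return math.ceil(math.log(1.0 - confidence) / math.log(r_demonstrated))

# e.g. demonstrating R >= 0.99 with 90 % confidence needs 230 failure-free units
print(zero_failure_sample_size(0.99, 0.90))
```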

System level qualification and verification tests are performed mainly as a means of quality control, to identify possible design and manufacturing errors that may otherwise lead to systematic failures during operations in orbit. However, with the approaches for systematic failure modelling presented in the current handbook, it is not possible to consider the effect of testing and thus to quantify the risk associated with a specific test strategy.

1.5.3.2.5. Reliability prediction as input for business planning and design of constellations#

To support business planning on the customer side (for single satellites, and especially for constellations), or for the insurance of space systems, reliability predictions need to be as realistic as possible. To achieve this, the scope of the prediction should follow the recommendations made for design support (Table 1.5.5), with an extension to account for systematic failures as well.

1.5.3.2.6. Reliability prediction to support decisions about life time extensions#

The limited coverage of wear-out effects proposed in Table 1.5.5 for design support is justified by the fact that components are generally qualified for the specified lifetime. This condition is violated in the case of a lifetime extension, requiring additional considerations for the associated reliability predictions.

To support decisions on life time extensions, it may not be required to revisit the scope of the prediction for all spacecraft elements; e.g. for space debris mitigation only the functions needed for safe disposal are of interest, and health monitoring may be used to better assess the risk of failure in redundant system architectures.

Where a quantitative prediction is required, the scope of the prediction needs to be extended to account for additional wear-out failures that may become relevant due to the lifetime extension.

1.5.4. Reliability prediction during project life cycle#

This section explains how the system reliability prediction process interacts with the system development process and which activities and deliverables should be performed throughout the system life cycle. The typical system life cycle of space products according to [BR_METHODO_1] consists of seven phases, as shown in Fig. 1.5.1 (see also Section 1.4.1.1).

Fig. 1.5.1 Interaction of the reliability process with the system life cycle#

The establishment and cascading of reliability requirements is explained in Section 1.5.4.2.1. The evaluation of system architectures from a reliability perspective in the early design phase is introduced in Section 1.5.4.3. The verification of system reliability requirements is covered in Section 1.5.4.4, and the aspects of reliability prediction for lifetime extension and safe disposal are handled in Section 1.5.4.5. The contribution of reliability prediction to the system development during the different life cycle phases is shown in Section 1.5.4.1.

1.5.4.1. Deliverables of the reliability prediction process during Life Cycle#

The scope and aim of each system life cycle phase is highlighted in Fig. 1.5.2, taken from [BR_METHODO_1].

Fig. 1.5.2 System Life Cycle.#

The contribution of reliability prediction to each phase and the associated reviews is explained in Table 1.5.7. Table 1.5.6 gives an overview of the reliability documents to be provided per review during the system life cycle.

Table 1.5.6 Reliability Deliverables per Project Milestones/Reviews#
| Document title | ECSS document | Delivered at (review) |
|---|---|---|
| Failure modes and effects analysis / failure modes, effects and criticality analysis (as input for system level analysis, e.g. FTA) | ECSS-Q-ST-30-02 | SRR *), PDR, CDR |
| Fault tree analysis (FTA) (to support reliability prediction) | ECSS-Q-ST-40-12C | PDR, CDR, QR, AR |
| Reliability prediction | ECSS-Q-ST-30C | PRR *), SRR *), PDR, CDR, QR, AR |

*) Although ECSS-Q-ST-30C mentions a reliability prediction at PRR and SRR, this is not done for all projects and may be used only to assist the apportionment of requirements to lower levels. An FMEA at SRR may be required for specific missions and can be used e.g. to assist safety analysis.

Table 1.5.7 Contribution of reliability prediction in different mission phases#
Phase 0: Mission analysis - needs identification

Main objectives:

  • Define mission needs and expected performance

  • Identify constraints and boundary conditions with respect to the physical and operational environment

  • Define possible mission concepts

Associated reviews: Mission Definition Review (MDR): definition of the mission baseline.

Reliability prediction tasks: In Phase 0 the reliability prediction activities focus on capturing top level requirements and boundary conditions. The MDR provides the mission profile that is to be used for the reliability prediction. Top level reliability requirements are derived from customer needs. A first high level reliability prediction may be performed, e.g. for quotation.

Inputs: Mission profile. At this stage of the development process no system architecture is available; only data from similar projects can be used.

Outputs: Top level requirements included in the preliminary technical specification; a first rough reliability estimate.

Phase A: Feasibility

Main objectives:

  • Assess the technical and programmatic feasibility of the possible concepts by identifying constraints related to implementation, costs, schedule, organization, operations, maintenance, production and disposal.

  • Identify critical technologies and propose pre-development activities.

Associated reviews: Preliminary Requirement Review (PRR): assess the feasibility of user requirements to allow a solid start of the preliminary design.

Reliability prediction tasks: In Phase A the reliability activities address the following points:

  • Assessment of the feasibility of achieving the system level reliability requirement

  • Breakdown of the system level reliability requirement to lower levels, to establish a requirement basis for the preliminary definition in Phase B, as input for the PRR

Inputs: A concept of the system architecture, as the detailed design is usually not yet available, nor an FMEA/FMECA; system reliability requirements. To perform the reliability assessment and the partitioning of reliability requirements, historical data of similar systems and components are often used as initial values.

Outputs: Reliability requirement breakdown to lower levels. A reliability prediction in this early phase may be performed as a rough estimate to check feasibility and to support the apportionment of requirements to lower levels.

Phase B: Preliminary definition

Main objectives:

  • Conduct trade-off studies and select the preferred system concept, together with the preferred technical solution(s) for this concept.

  • Establish a preliminary design definition for the selected system concept and the preferred technical solution(s).

Associated reviews: System Requirements Review (SRR): freeze of high level requirements. Preliminary Design Review (PDR): freeze of the mission baseline and of requirements down to subsystem level; freeze of the design concept at system level.

Reliability prediction tasks: In this early phase of the development, quantitative methods are used to support the definition of the system and to refine the allocation of requirements. During the development activities, decisions and trade-offs are supported by reliability prediction. For each proposed system solution a preliminary reliability prediction is performed as decision basis to support the PDR. Preliminary versions of the FMEA/FMECA are to be prepared as input for system level analysis.

Inputs: Preliminary system architecture, preliminary versions of the FMEA/FMECA and preliminary reliability data at sub-system, equipment and component level; top level reliability requirements; system level reliability requirements.

Outputs: The allocation of system level reliability requirements to lower levels, refined as the system architecture evolves; for the SRR the reliability requirements are to be validated. Preliminary reliability prediction, including FTA, to support the PDR.

Phase C: Detailed definition

Main objectives:

  • Completion of the detailed design definition at all levels in the customer-supplier chain.

  • Production, development testing and pre-qualification of selected critical elements and components.

  • Production and development testing of engineering models, as required by the selected model philosophy and verification approach.

Associated reviews: Critical Design Review (CDR): confirmation of the detailed design, release of the final design; authorisation to complete qualification and build flight units.

Reliability prediction tasks: During Phase C, the reliability assessment is updated based on the detailed system definition to demonstrate that the reliability requirements are met, supporting the CDR. FMEA/FMECA and component reliability predictions are to be prepared based on the detailed design.

Inputs: Updated input data based on the detailed system architecture, including FMEA, FMECA, FMES and component level reliability predictions.

Outputs: Reliability prediction including fault tree analysis or other equivalent methods for system reliability assessment.

Phase D: Qualification and production

Main objectives:

  • Complete qualification testing and associated verification activities.

  • Complete manufacturing, assembly and testing of flight hardware/software and associated ground support hardware/software.

Associated reviews: Qualification Review (QR): demonstrate that the system meets all requirements and that the verification proof is complete. Acceptance Review (AR): acceptance of the system by the customer. Operational Readiness Review (ORR): verify the readiness of the operational teams and procedures and their compatibility with the flight system.

Reliability prediction tasks: The reliability prediction is updated for customer acceptance, considering test results from qualification.

Inputs: Final system design with updated FMEA, FMECA and FMES.

Outputs: Reliability prediction including an updated fault tree analysis or other equivalent methods for system reliability assessment.

Phase E: Operations / utilization

Main objectives:

  • Prepare the launch of the system

  • Perform launch and in-orbit testing

  • Perform in-orbit operations

Associated reviews: Flight Readiness Review (FRR): verify that the flight and ground segments are ready for launch. Launch Readiness Review (LRR): performed right before launch to provide the authorization to proceed with the launch. Commissioning Results Review (CRR): verify system performance after in-orbit testing. End-of-Life Review (ELR): verify that the system has completed its useful life and ensure safe disposal.

Reliability prediction tasks: Depending on the results of in-orbit testing during commissioning, for example if redundancies are not available, the reliability assessment may have to be re-evaluated to support the CRR. Further tasks are the update of the reliability prediction based on in-orbit feedback and the reliability assessment for safe disposal.

Inputs: FMEA, FMES; in-orbit test results; in-orbit reliability data.

Outputs: Updated reliability prediction with in-orbit data for the deorbiting function, to support the End-of-Life Review (ELR) and decision making on life extension.

1.5.4.2. Management of reliability requirements#

1.5.4.2.1. Establishment of reliability requirements#

In the following, the process steps to support the establishment of appropriate reliability requirements are explained according to [BR_METHODO_5]. The first step is the classification of the type of space mission, as shown in Table 1.5.8. Details on the coverage of this handbook in terms of mission types can be found in Section 1.4.2.1.

Table 1.5.8 Space Mission Classification#

| Class | Mission type |
|---|---|
| Class A | Human flight or material transport flight |
| Class B | Telecommunication, observation and navigation for applications with high integrity requirements |
| Class C | Telecommunication, observation missions, space probes |
| Class D | Test and demonstration missions |
| Class L | Launchers, launch bases |

Within each class, the missions can be further categorized based on the following criteria [BR_METHODO_5]:

Table 1.5.9 Space Mission Categorization#
| Criterion | Category 1 | Category 2 | Category 3 |
|---|---|---|---|
| Development time | More than 2 years | More than 2 years | Less than 2 years |
| Criticality to customer strategic objectives | High | Medium | Low |

The type of mission as well as the mission category within each class (Table 1.5.9) should be taken into account for the establishment of reliability requirements. As a general rule, the larger the economic loss a failure of the mission would cause, the more stringent the reliability requirement should be. Furthermore, the criticality of mission success to the strategic objectives of the customer may justify a higher reliability requirement.

Reliability requirements are expressed as an expected probability of success over a given time period, under consideration of:

  • Customer commercial objectives, revenue and return on investment

  • Insurer requirements

  • Regulation, e.g. avoidance of space debris and safe disposal

  • Technical feasibility

  • Cost, weight and volume constraints

For example:

The probability that the satellite achieves its performance requirements shall be better than 90% after a mission duration of 10 years in orbit.
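Assuming, purely for illustration, a constant failure rate (exponential model) at system level - an assumption that is not part of the requirement itself - such a requirement can be translated into an equivalent maximum failure rate:

\[R(t) = e^{-\lambda t} \geq 0.90 \quad \Rightarrow \quad \lambda \leq -\frac{\ln 0.90}{10\,\text{yr}} \approx 1.05 \times 10^{-2}\,\text{yr}^{-1} \approx 1.2 \times 10^{-6}\,\text{h}^{-1} \;\text{(about 1200 FIT)}\]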

Each quantitative requirement should be linked to an explanation to ensure correct interpretation. This includes definition of the scope, principles and boundary conditions that are to be applied for the reliability prediction, as discussed with the ground rules and assumptions for reliability prediction in Section 1.5.2.

The reliability requirements could also be defined considering partial losses leading to reduced functional capability and graceful degradation, for example:

The probability that at least \(k\) out of \(n\) antenna links are operative after \(y\) years in orbit shall be greater than 98%.

where \(n\) denotes the total number of units installed and \(k\) defines an acceptable degraded mode, with \(k < n\). The duration \(y\) is usually defined in years and corresponds to the end of mission, end of in-orbit testing or end of LEOP, but it can also refer to operating cycles or hours.
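Under the simplifying assumption of independent and identical units, this type of requirement can be evaluated with the binomial distribution; a minimal sketch (the unit reliability of 0.95 and the 10-out-of-12 configuration are illustrative assumptions):

```python
import math

def k_out_of_n_reliability(k, n, r):
    """Probability that at least k of n independent, identical units are
    operative, each having reliability r at the time of interest."""
    return sum(math.comb(n, j) * r**j * (1.0 - r)**(n - j)
               for j in range(k, n + 1))

# e.g. at least 10 of 12 antenna links operative, each with R(y) = 0.95:
print(f"{k_out_of_n_reliability(10, 12, 0.95):.4f}")  # 0.9804
```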

1.5.4.2.2. Allocation of reliability requirements#

The allocation of requirements may start by assigning historical data from similar systems at sub-system and equipment level, and is then refined as more details become available.

The allocation of reliability requirements then consists of the following steps:

  1. Analysis of the input requirements to formulate functional and performance requirements.

  2. Definition of the functional architecture to ensure system performance.

  3. Functional failure analysis at system and sub-system level to identify failure scenarios that would lead to a violation of reliability requirements.

  4. Creation of a high level system model that consists of the relevant subsystems, based on the functional failure analysis.

  5. Assignment of reliability targets to sub-functions. Besides the use of historical data of similar systems, different approaches can be used to assign initial reliability targets, for example:

    • Equal allocation

    • Proportional allocation (ARINC method)

    • Feasibility-Of-Objectives (FOO) method

  6. Review of the sub-system targets with regard to feasibility, cost, schedule etc., refining the allocation if deemed necessary. This may involve iterations to find a well-balanced apportionment of reliability targets.

The allocation of requirements to lower levels starts with the identification of system functions. Based on the customer's top level reliability requirements, an overall system function for the design can be identified. This overall system function is decomposed into its sub-functions [BR_METHODO_4]. The functional failure analysis determines the failure effects and repercussions at system level. The following generic failures can be used as a guideline to assess each function and sub-function:

  • Total loss of function

  • Partial loss of function

  • Un-commanded or spurious functioning

  • Erroneous functioning

The results of the functional failure analysis allow identifying the functions whose failures would affect the ability to perform the required system function. The cascading of the top-level system reliability requirement to the contributing sub-functions should consider the results of the functional failure analysis. That is, the reliability targets allocated to functions and sub-functions should consider the relevant failure modes to ensure functional integrity.

An example of a functional failure analysis for the power supply system is shown in Table 1.5.10. A similar analysis needs to be performed for all functions of the satellite; in early phases of the development the main functions are considered, and as the design evolves the functional breakdown is refined and more details are included.

Table 1.5.10 Example Functional Failure Analysis Power Supply#

Function: Provide electrical power

| Function / Sub-function | Functional failure | Failure effect |
|---|---|---|
| 1. Provide electrical power | Total loss of electrical power | Total loss of power supply. No data communication. Loss of satellite control. |
| 1.1 Photovoltaics | Total loss of photovoltaic capabilities | Total loss of power supply. No payload. Total loss of satellite. |
| 1.1 Photovoltaics | Partial loss of photovoltaic capabilities | Degraded performance; battery not fully charged; power interruption in Earth's shadow possible. Payload interruptions. |
| 1.2 Charge battery | Loss of battery charging | No power supply in Earth shadow in one orbit. Interruption of payload. Possible loss of satellite. |
| 1.2 Charge battery | Overcharge of battery | Permanent damage to battery possible. No power supply in Earth shadow in one orbit. Interruption of payload. Possible loss of satellite. |
| 1.3 Output voltage regulation | Erroneous output regulation - voltage too low | Insufficient voltage supply to satellite. No data communication. |
| 1.3 Output voltage regulation | Erroneous output regulation - voltage too high | Permanent damage to electronic components possible if no over-voltage protection is implemented. |

The analysis of functional failures can also be represented with a reliability block diagram. Quantitative methods are used to support the system definition and to refine the allocation of requirements to sub-system level.
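As an illustration of how a reliability block diagram translates into numbers, the sketch below combines independent blocks in series and in (hot-redundant) parallel; the architecture and the block reliabilities are illustrative assumptions only:

```python
def series(*blocks):
    """Series blocks: all must work, so reliabilities multiply."""
    r = 1.0
    for b in blocks:
        r *= b
    return r

def parallel(*blocks):
    """Hot-redundant parallel blocks: the group fails only if all blocks fail."""
    q = 1.0
    for b in blocks:
        q *= (1.0 - b)
    return 1.0 - q

# e.g. a power function: solar array in series with a redundant battery pair
print(f"{series(0.995, parallel(0.98, 0.98)):.5f}")  # 0.99460
```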

In the following, different approaches for requirement allocation are introduced, including:

  • Equal allocation

  • Proportional allocation (ARINC method)

  • Feasibility-Of-Objectives (FOO) Method

These methods are applicable to serial system structures only. For more complex system architectures, the reliability allocation should also make use of system level reliability assessment methods (see Section 9).

The apportionment of the system level reliability target to sub-systems is based on the following relation for a serial system:

Equation

(1.5.1)#\[\hat{R}_{i}(t) = (\hat{R}_{S}(t))^{w_{i}}\]
  • \(\hat{R}_{S}(t)\) denotes the reliability target on system level

  • \(\hat{R}_{i}(t)\) denotes the reliability target on sub-system level, and

  • \(w_{i}\) denotes the weighting factor of sub-system \(i\)

Note that different combinations of sub-system reliabilities \(R_{i}(t)\) can yield the same system reliability \(R_{S}(t)\), so there are infinitely many ways to allocate the system level reliability target to the sub-system level. The weighting factors must sum to one, since \(\prod_{i}\hat{R}_{i}(t) = (\hat{R}_{S}(t))^{\sum_{i}w_{i}}\) has to recover the system target. Practical methods are therefore needed to assist the system design in the apportionment of reliability requirements.

Equal Allocation

This method distributes the system reliability target equally over all sub-systems below system level [MIL-HDBK-338B]. The weighting factor is the same for each sub-system and equals the reciprocal of the number of sub-systems, \(w_{i} = 1/n\). The reliability target for the sub-systems is given by

Equation

(1.5.2)#\[\hat{R}_{i}(t) = (\hat{R}_{S}(t))^{\frac{1}{n}}\]
  • \(\hat{R}_{S}(t)\) denotes the reliability target on system level

  • \(\hat{R}_{i}(t)\) denotes the reliability target on sub-system level, and

  • \(n\) denotes the number of sub-systems.

For example, a reliability target of 0.9 at system level results in a target of 0.9826 for each sub-system if the system consists of 6 sub-systems. If the target at system level is given as a failure probability or a failure rate, the target at sub-system level is obtained from the following equations (the failure probability relation is an approximation valid for small failure probabilities; the failure rate relation is exact for constant failure rates).

Equation

(1.5.3)#\[\hat{F}_{i}(t) = \frac{\hat{F}_{S}(t)}{n}\]
  • \(\hat{F}_{S}(t)\) denotes the failure probability target on system level

  • \(\hat{F}_{i}(t)\) denotes the failure probability target on sub-system level

  • \(n\) denotes the number of sub-systems.

Equation

(1.5.4)#\[\hat{\lambda}_{i}(t) = \frac{\hat{\lambda}_{S}(t)}{n}\]
  • \(\hat{\lambda}_{S}(t)\) denotes the failure rate target on system level

  • \(\hat{\lambda}_{i}(t)\) denotes the failure rate target on sub-system level, and

  • \(n\) denotes the number of sub-systems.

The equal allocation is very easy to apply, but does not consider technical feasibility or experience from similar projects. This can result in very stringent requirements for some sub-systems that cannot be achieved, or only with disproportionate effort. Thus, it might be necessary to refine the allocation in order to arrive at a more balanced share between sub-systems.
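The equal allocation relations above are straightforward to apply; a minimal sketch reproducing the numeric example from the text:

```python
def equal_allocation(r_system_target, n):
    """Equal allocation per Eq. (1.5.2): identical target for each sub-system."""
    return r_system_target ** (1.0 / n)

# A system target of 0.9 shared equally among 6 sub-systems:
print(f"{equal_allocation(0.9, 6):.4f}")  # 0.9826, as in the example above
```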

Proportional Allocation (ARINC Method)

The proportional allocation, also known as the ARINC method, takes historical data on sub-system reliability into account when distributing the reliability target to sub-system level. The weighting factor is determined by the ratio of the observed sub-system failure probability to the total system failure probability, as shown in Eq. (1.5.5). The new reliability target is allocated to each sub-system in proportion to this factor.

Equation

(1.5.5)#\[w_{i} = \frac{F_{i.old}(t)}{F_{S.old}(t)}\]
  • \(F_{S.old}(t)\) denotes the historical failure probability on system level

  • \(F_{i.old}(t)\) denotes the historical failure probability on sub-system level

Given that 15% of the system failures are caused by failures of the power supply sub-system, a reliability target of 98.43% is derived for the power supply system using Eq. (1.5.1), to achieve a reliability of 90% at system level, as shown in Table 1.5.11. Note that in this example the failure probability targets of the sub-systems were derived from the minimal cut set approximation to simplify the calculation.

Table 1.5.11 Example of proportional allocation of reliability targets to sub-systems#

| Sub-system | Weighting factor \(w_{i}\) | Failure probability target (approximation) | Reliability target \(\hat{R}_{i}(t) = (\hat{R}_{S}(t))^{w_{i}}\) |
|---|---|---|---|
| Power | 0.15 | 0.015 | 0.9843 |
| Tele-Command/Telemetry | 0.20 | 0.02 | 0.9791 |
| Propulsion | 0.10 | 0.01 | 0.9895 |
| Orbit Control | 0.15 | 0.015 | 0.9843 |
| Structure | 0.10 | 0.01 | 0.9895 |
| Data Communication (Payload) | 0.30 | 0.03 | 0.9689 |
| System level | 1.00 | 0.10 | 0.90 |
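A minimal sketch reproducing the proportional allocation of Table 1.5.11 (the weighting factors are the historical failure shares given in the table):

```python
# Proportional (ARINC) allocation, Eq. (1.5.1) with historical weights.
weights = {
    "Power": 0.15, "Tele-Command/Telemetry": 0.20, "Propulsion": 0.10,
    "Orbit Control": 0.15, "Structure": 0.10, "Data Communication": 0.30,
}
r_system_target = 0.90
for name, w in weights.items():
    r_i = r_system_target ** w            # reliability target, Eq. (1.5.1)
    f_i = w * (1.0 - r_system_target)     # minimal cut set approximation
    print(f"{name:24s} w={w:.2f}  F_i~{f_i:.3f}  R_i={r_i:.4f}")
```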

Feasibility-Of-Objectives (FOO) Method

The FOO method allows users to assign grading factors to sub-systems and their components in order to determine how reliability targets are cascaded from top level to lower levels. A sub-system with high grading factors is allocated a lower reliability target than a sub-system with low grading factors. Default grading categories are complexity, technology level (state of the art), operating time and environmental conditions; users may change these categories. Each category is ranked on a scale from 1 to 10, estimated using both design engineering and expert judgment [NR_METHODO_1]:

  1. System complexity. Complexity is evaluated by considering the probable number of parts or components making up the sub-system, and is also judged by the assembled intricacy of these parts or components. The least complex sub-system is rated at 1, and a highly complex sub-system is rated at 10.

  2. Technology level. The state of present engineering progress in all fields is considered. The least developed design or method receives a value of 10, and the most highly developed is assigned a value of 1.

  3. Operating Time. An element that operates for the entire mission time is rated 10, and an element that operates the least time during the mission is rated 1.

  4. Environmental conditions are also rated from 10 through 1. Elements expected to experience harsh and very severe environments during their operation are rated as 10, and those expected to encounter the least severe environments are rated as 1.

The first stage for this allocation method is to calculate the total grading value for each sub-system. This is obtained by multiplying the grading factors from each category:

Equation

(1.5.6)#\[G_{i} = \prod_{j}^{n} g_{ij}\]

Equation

(1.5.7)#\[w_{i} = \frac{\prod_{j}^{n} g_{ij}}{\sum_{i} G_{i}}\]

where \(g_{ij}\) denotes the grading value of sub-system \(i\) in category \(j\). An example of reliability allocation using the FOO method is shown in Table 1.5.12. The system level reliability target of 0.9 is distributed to the sub-systems based on the weighting factors obtained from Eq. (1.5.7).

Table 1.5.12 Example of graded allocation of reliability targets to sub-systems#
| Sub-system | Complexity | Technology level | Operating time | Environmental conditions | Weighting factor \(w_{i}\) | Sub-system target \(\hat{R}_{i}(t)\) |
|---|---|---|---|---|---|---|
| Power | 6 | 5 | 5 | 5 | 0.12550 | 0.98686 |
| Tele-Command/Telemetry | 6 | 4 | 7 | 6 | 0.16867 | 0.98239 |
| Propulsion | 5 | 6 | 5 | 5 | 0.12550 | 0.98686 |
| Orbit Control | 8 | 8 | 5 | 7 | 0.37483 | 0.96128 |
| Structure | 4 | 2 | 10 | 8 | 0.10710 | 0.98878 |
| Payload | 7 | 6 | 7 | 2 | 0.09839 | 0.98969 |
| System level | | | | | 1.0 | 0.90 |
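A minimal sketch reproducing the graded allocation of Table 1.5.12 from the grading values and Eqs. (1.5.6) and (1.5.7):

```python
import math

# Grading values per sub-system: (complexity, technology level,
# operating time, environmental conditions), each ranked 1..10.
gradings = {
    "Power": (6, 5, 5, 5),
    "Tele-Command/Telemetry": (6, 4, 7, 6),
    "Propulsion": (5, 6, 5, 5),
    "Orbit Control": (8, 8, 5, 7),
    "Structure": (4, 2, 10, 8),
    "Payload": (7, 6, 7, 2),
}
r_system_target = 0.90

G = {name: math.prod(g) for name, g in gradings.items()}  # Eq. (1.5.6)
total = sum(G.values())
for name, g_total in G.items():
    w = g_total / total                                    # Eq. (1.5.7)
    print(f"{name:24s} w={w:.5f}  R_i={r_system_target ** w:.5f}")
```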

1.5.4.3. Reliability assessment for system architecture development during conceptual design#

The outcome of the conceptual design phase is a set of concepts that will be implemented during the next stages of the system development. The selection of the preferred system architecture is essentially a trade-off among the various architecture options. According to [BR_METHODO_3], a trade-off report should contain the result of the evaluation of every identified alternative design solution with regard to the key technical requirements. For each alternative design solution, the following should be performed:

  • Assessment of all the key technical requirements / evaluation criteria,

  • Presentation of the pros and cons of the design solution, and

  • Identification of the technical and programmatic risks.

Reliability prediction is an important part of these trade-off studies and can be applied to support system engineering, subsystem engineering and equipment level design engineering. It provides a quantitative assessment of alternative design solutions regarding the achievement of reliability requirements. In a trade-off study, a sensitivity analysis can support system engineering by quantifying how system level reliability changes if certain parameters change; importance measures (see Section 9) can be used to perform such a sensitivity analysis. Furthermore, the reliability prediction for the trade-off should identify the equipment failure modes that significantly impact system reliability. If the correlation between equipment failure modes and system reliability can be established, the aim of system reliability improvement is to eliminate or significantly reduce these failure modes by improving equipment quality or by reconfiguring the system architecture. Different design concepts may introduce different types of failure modes, and system designers should be aware of the underlying failure causes to achieve a robust design. To achieve the desired quality, the development process should be accompanied by an appropriate quality assurance procedure, see e.g. [BR_METHODO_2].
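One way to perform such a sensitivity analysis is the Birnbaum importance measure, the partial derivative of system reliability with respect to each block's reliability. The sketch below evaluates it numerically for an illustrative two-block series architecture; the structure function and the reliability values are assumptions for demonstration only.

```python
def system_reliability(r):
    """Illustrative RBD structure function: two blocks in series."""
    return r[0] * r[1]

def birnbaum_importance(i, r, eps=1e-6):
    """Numerical Birnbaum importance: dR_system / dR_i."""
    r_up = list(r)
    r_up[i] += eps
    return (system_reliability(r_up) - system_reliability(r)) / eps

r = [0.95, 0.99]
for i in range(len(r)):
    print(f"block {i}: Birnbaum importance = {birnbaum_importance(i, r):.4f}")
# The block with the higher importance is where a reliability improvement
# pays off most at system level.
```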

1.5.4.4. Reliability requirement verification for compliance demonstration#

The verification activities should demonstrate that the design and architecture are compliant with the reliability requirements at the corresponding level. This includes demonstrating that:

  • All requirements have been taken into account and that the design is compliant with the requirements.

  • There is sufficient confidence that the final product will meet the requirements.

  • There is sufficient confidence in the correctness of the design on system level so that the specification and design of the next lower level can progress.

Verification should also demonstrate that the sub-systems on the next lower level are compliant with the requirements for that level. It is important that compliance of each building block with its requirements is substantiated and that potential non-compliances are identified.

For reliability requirements, verification is performed through quantitative analysis, see Section 9. To finally demonstrate compliance with the top level reliability requirements, the final reliability prediction should be performed once the failure behaviour data for the sub-systems and components are available. Depending on the performance requirements, a consideration of degraded system operability might be useful.

1.5.4.5. Reliability assessment for life time extension and safe disposal#

To support lifetime extension decisions, the reliability prediction may need to be updated; in particular, the reliability of the functions needed for safe disposal has to be considered to demonstrate compliance with space debris mitigation requirements, see also Section 1.5.3.2.6.

The following aspects should be considered:

  • If the life extension exceeds the qualification time, the assumptions made at the beginning of the project have to be revisited.

  • Wear-out may need to be considered for lifetimes beyond the qualification time.

  • The results of in-orbit testing and reliability estimates based on in-orbit feedback have to be taken into account to support the decision on the lifetime extension.

It is important to note that the probability of success of safe disposal has to be demonstrated already during the design phase, as part of the space debris mitigation requirements. The analysis for safe disposal is then reassessed when the in-orbit lifetime is completed and is to be extended further, or when a failure has occurred during the lifetime. In other words, the requirements for safe disposal determine to what extent a lifetime extension is possible.