A new flyer on safely managing the emergent properties of complex systems

To develop a system that is safe, a sufficient understanding of its properties is needed. For a complex system, this must include emergent properties, without which understanding is not complete and confidence in its safety cannot be claimed. Our new flyer has been created to help managers and engineers understand complexity and emergent properties to guide systems more clearly and safely through their life cycles. In doing so, there is greater potential to develop safe products that are fit for purpose, produced efficiently, and supported effectively. Download the flyer for free: Safely managing the emergent properties of complex systems

[Diagram: how to navigate complex systems safely]

Our flyer also presents the objectives for engineering managers, which include sustainable thinking and exploiting technology for deeper management insights, as well as the objectives for engineers, which include a better understanding of emergent properties and of when to take action.

All feedback on this paper is welcome. Log in to your IET EngX account and leave your comments below.

  • There are two problem domains associated with emergence

    1. understanding the processes that create emergent properties

    2. managing a system that has emergent properties.

    The Millennium Bridge is a classic example.

    The swaying of the bridge was the result of a combination of processes including the way people walk, the way people react when what they are walking on moves, and the design of the bridge that allowed it to sway. The emergent property resulted from the way they all interacted through various forms of feedback processes.
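
    As a rough illustration of that kind of feedback, here is a toy one-mode model in Python. It assumes each walker pushes sideways in proportion to how fast the deck is moving, so the crowd acts like negative damping. Every number and name in it (modal mass, damping ratio, `per_person_gain`, `sway_growth`) is a made-up assumption for the sketch, not data from the actual bridge.

```python
import math

# Illustrative assumptions only, not Millennium Bridge data.
MASS = 1.0e5             # modal mass of the deck, kg
FREQ_HZ = 1.0            # lateral natural frequency, Hz
DAMPING_RATIO = 0.01     # structural damping ratio
PER_PERSON_GAIN = 300.0  # hypothetical force fed back per walker, N per (m/s)

OMEGA = 2.0 * math.pi * FREQ_HZ
STIFFNESS = MASS * OMEGA ** 2
DAMPING = 2.0 * DAMPING_RATIO * MASS * OMEGA


def sway_growth(n_pedestrians, duration_s=60.0, dt=0.001):
    """Return final / initial sway amplitude after duration_s seconds."""
    x, v = 0.001, 0.0  # start with a 1 mm sideways nudge, at rest
    x0 = x
    for _ in range(round(duration_s / dt)):
        # Feedback: the more the deck moves, the harder the crowd pushes.
        crowd_force = n_pedestrians * PER_PERSON_GAIN * v
        a = (crowd_force - DAMPING * v - STIFFNESS * x) / MASS
        v += a * dt  # semi-implicit Euler: update velocity first,
        x += v * dt  # then position using the new velocity
    amplitude = math.hypot(x, v / OMEGA)
    return amplitude / x0


def critical_crowd():
    """Crowd size at which the crowd's push cancels structural damping."""
    return DAMPING / PER_PERSON_GAIN
```

    With these invented numbers `critical_crowd()` is about 42 walkers: `sway_growth(10)` comes out below 1 (the nudge dies away) and `sway_growth(100)` above 1 (it grows). No single walker or bridge component has that threshold; it only appears when they interact. The model is linear, so growth never levels off, which a real crowd and structure would not allow.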

    Managing the swaying, i.e. changing the design of the bridge to reduce the effect, was a completely different problem.

    The models associated with each domain are different and require different levels of abstraction. The first depends on the nature of the emergent properties. The second is nearly always a top down approach which models the emergent properties as observed - i.e. the system as a whole.

    Unfortunately, in both domains, the problems are most probably non-linear and unpredictable, even if the project looks very much like lots of other problems.

    IMHO, it all comes down to dealing with uncertainty. The best remedy is test and repeat through processes of proof of concept, pilot, prototype, trials etc. To minimise the risks, you need a few people with a wide range of knowledge and experience - they've hopefully seen it before, or at least recognise when things aren't right.

    Models are essential but they need to be validated.

  • [Here I quote Bernard Robertson-Dunn. I write it this way to get the SW to behave the way I want]

    There are two problem domains associated with emergence

    1. understanding the processes that create emergent properties

    2. managing a system that has emergent properties.

    The Millennium Bridge is a classic example.

    [End quote]

    I would add a third issue to your two, and it comes first: recognising and characterising the emergent property. Sometimes it is obvious enough, as with the Millennium Bridge. Sometimes it isn't, as when a swarm of drones doesn't quite do what you are hoping it would do. First question: what *is* it doing?

  • Peter,

    I agree and include it as an issue in both. I might even rephrase it as "First understand the problem".

  • I wouldn't disagree, but I would caveat: test and repeat is good for reliability issues, less so for safety issues. In fact it can give a false sense of security. The problem of course is that it's incredibly hard to "prove" safety by testing: you might find bugs, but it's much harder to prove there aren't any. So we'd also say, every time you make a change as a result of a test, also go back and revisit the safety pathways (HAZID etc) to consider unintended consequences of that change. It's actually the same mindset: test and repeat, and in parallel revisit safety analysis and repeat. And again, be aware of unknown unknowns: if in doubt, involve more of the inter-related teams than you think you need in that re-analysis work.

    It's particularly deeply scary when a complex safety related system is about to go into service and the engineering team say "oh we did find this bug, but it's ok, we tested it and that problem's now fixed" - yes but how do you know you didn't inadvertently affect anything else??? The big issue of emergent properties.

    But absolutely re your basic principle Bernard - it's surprising how often teams miss the fact that in the classic V-model there are intended to be verification (test and repeat) steps at every stage. Missing these adds some risk on a simple project, but considerable risk on a project with potential emergent properties.

    Cheers,

    Andy

  • In a system development project there are two types of requirements: Functional and Non-Functional.

    Non-Functional requirements include things like security, reliability, resilience, maintainability, and safety.

    In a non-linear complex system, potentially everything could affect anything else - hence your question "but how do you know you didn't inadvertently affect anything else??? "

    Which is why the role of Systems Engineer was invented. A Systems Engineer is supposed to make sure the complete system works as intended and from the perspective of all stakeholders. A Systems Engineer is different from a Control Engineer, in that the latter is concerned with the dynamics of the system.

    Similarly, a Project Manager is usually responsible for administration, cost and schedule.

    My experience in the information system world is that there are many Project Managers who think they know all about what is being developed - but don't. IS project managers are trained only in project management.

    This is different from engineering project managers who start off as engineers and move into project management. Depending on the project they can handle the systems engineer's role as well - the danger there is conflict of interest. Often money and time trump everything else.

    Getting back to dynamics, development is a dynamic process and a well educated/experienced control engineer should have the technical skills to treat the development itself as a complex system that needs managing.

    It's well known that controlling a non-linear dynamic process requires feedback loops, which generalise the "test and repeat" concept.

  • I wouldn't classify RAMSS properties as non-functional, necessarily. The property of a level-crossing control system to ensure only road traffic or only rail traffic occupies the crossing space at any given time is very much a functional property of the control system. It is also obviously a safety property. Ensuring that it operates in a fail-safe manner is also a functional property, as well as a safety property.
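
    To make that concrete, here is a minimal Python sketch of the level-crossing example. It is only an illustration: the names `User` and `LevelCrossing`, the `sensors_healthy` flag, and the choice to refuse every request when the sensors are suspect are all assumptions of mine, not taken from any real interlocking design or standard.

```python
from enum import Enum


class User(Enum):
    ROAD = "road"
    RAIL = "rail"


class LevelCrossing:
    """Toy interlock: only one kind of traffic may occupy the crossing."""

    def __init__(self):
        self.kind = None  # which kind of traffic currently holds the crossing
        self.count = 0    # how many users of that kind are on it

    def request(self, user, sensors_healthy=True):
        """Return True if the user may enter the crossing space."""
        if not sensors_healthy:
            # Fail-safe behaviour (assumed): if the state is unknown, refuse.
            return False
        if self.count == 0 or self.kind is user:
            self.kind = user
            self.count += 1
            return True
        # The other kind of traffic is on the crossing: mutual exclusion.
        return False

    def release(self, user):
        """Record that one user of the given kind has left the crossing."""
        if self.kind is user and self.count > 0:
            self.count -= 1
            if self.count == 0:
                self.kind = None
```

    Both the mutual exclusion and the fail-safe refusal are plain behaviours of `request`, which is the sense in which they are functional properties as well as safety properties.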

    RAMSS properties are in my experience not subject to the same kind of process as system development from functional requirements. The two "SS"s arise largely because of what is in the Anglosphere usually called duty of care. In other words, they become legal requirements. They involve system development processes as well as products (usually documentation) and are thus different from other requirements which only specify things you can observe in the built+operational state. For example, risk analysis is required (which consists of hazard identification and analysis and risk assessment, as specified generally in IEC Guide 51, which says that all standards related to safety shall specify such processes and documentation products). The way this works in most European countries is that there are laws specifying that applicable technical standards are to be applied, and those applicable technical standards will include sector-specific (or if there are none, generic) safety and cybersecurity standards (the cybersecurity part is at present woefully underdeveloped). In England, it has been the case for some decades that HSE has general oversight of engineering safety; if something bad happens they will investigate, and if you haven't (for example) followed the requirements of IEC 61508 they will prosecute (private prosecutions by organisations such as HSE are brought in English law). 

    However, it is not the case that the engineers responsible for ensuring the strictures of IEC 61508 are followed are "System Engineers". It is most often the case that people are called "System Engineers" if they do system engineering, and system engineering is that series of tasks described in the INCOSE Handbook. It essentially involves top-down hierarchical system development. Most large engineering companies in Europe don't (yet) approach their engineering projects this way, but some do, in particular in France. There are real difficulties in that the strictures of, for example, IEC 61508 are very much component-based and are not really compatible with INCOSE-type system engineering as in ISO/IEC/IEEE 15288. There is concern about this within the IEC 61508 maintenance team, for example. This most particularly affects Hazard and Risk Analysis (HRA, as it is called in IEC 61508) and how this is performed. A systems approach will necessarily include emergent properties in its HRA. A component-based HRA with additional work on "system integration" is generally regarded as likely to miss some. The situation is plausibly argued by some to be worse wrt cybersecurity.

  • Perhaps accidentally, this has highlighted another issue with this flyer: what is meant by "Safely" in the title? Because I work in functional safety systems, exactly as Peter describes in his opening paragraph, I had assumed this referred to functional safety (because that's what I think about in the day job!). But as Bernard's post suggests, there are systems, in fact very many (but not all!), where there are no functional safety requirements. So is "Safely managing..." relating to technical safety, i.e. the risk that mis-engineering kills people, or e.g. business safety, i.e. the risk that mis-engineering bankrupts the business? I assume it's the former; if so, this should probably be clarified.

    That will also make it clearer that for safety related systems (systems with functional safety requirements) this doesn't just need systems engineers, it needs systems engineers who also understand functional safety. Some would suggest instead that it needs "systems engineers and safety engineers" - but that's not an approach I personally like at all. For the successful development of functional safety systems, my experience is that it works far, far better if the expertise is integrated in the same people. Which I guess is what this programme is starting to try to achieve.

    Thanks,

    Andy

  • Andy, to clarify, the paper is focused on technical safety, rather than business safety.  Clearly there is a link in that mis-engineering may impact people's welfare, business reputation and sales which can affect the viability of the company.  However, we are coming at the issue from an engineering rather than a commercial angle.