ALARP for Engineers

Hi all,

The IMechE published (in 2024) a document titled "ALARP for Engineers: A Technical Safety Guide".

It's a really useful document that I'm finding helpful in our support to our customer. Does anyone know of any equivalent documents produced by other institutes (the IET would be great) that would help in the application of safety techniques to complex systems that include software and humans in the loop?

I am also starting to look into STPA (I'm a Tim Kelly disciple, so working with Nancy Leveson's processes feels disloyal, lol). I have the STPA Handbook (all 188 pages of it). Can anyone recommend additional resources that would help develop capability in this area, please? It seems that a step change in how we perceive hazards and risk is both required and inevitable. Applying a top-level safety target to a lower-level system is driving me mad, and STPA changes the approach to quantification of hazards into something far more sensible and manageable from a system-level point of view.

Many Thanks,

SJ

Parents
  • Hi Steven,

    Well, that's a big question - but a very good one. In my experience (I've been working in this field for something over 30 years) there's remarkably little documented guidance, and what there is (with apologies to Nancy Leveson, for example!) I find can be very specific and, for many problems, over-complex. Yes, for complex systems, for software and for human factors, best-practice processes are important and valuable, but when training organisations in safety management I tend to focus much more on safety culture. It's failures of safety culture, whether in the development environment or in the application environment, that tend to actually result in accidents. And, conversely, if you have a good safety culture in the development team then everyone will be aware of the need to use those best-practice processes anyway. But I've never found good guidance on safety culture (in the sense of functional safety) - if you find any I'd love to know so I can recommend it. The US Navy did fantastic work in this field in (IIRC) the 1980s/1990s; it might be worth looking that up, as it will still be relevant.

    For embedded software I do like the book "Embedded Software Development for Safety-Critical Systems" by Chris Hobbs; I find it very pragmatic.

    You wrote: "Applying a top-level safety target to a lower-level system is driving me mad, and STPA changes the approach to quantification of hazards into something far more sensible and manageable from a system-level point of view."

    Now, I'll admit I'd never heard of STPA before your post! I don't think it's made inroads into the rail safety world yet. (I've had a brief look, but I'll refrain from commenting further until I've looked into it a bit more deeply.) However, regarding safety targets, the problem is that without them it is very hard to know when to stop spending money. It is always possible to spend more money to make a system safer; in the UK, ALARP / SFAIRP sets a legal principle that you keep spending money until the cost is "disproportionate" (the definition of which is a legal nightmare in itself), and trying to prove disproportion for every subsystem without having a safety target is... challenging. So we have the IEC 61508 approach of apportioning safety targets down and, when they can't be calculated (e.g. for software), assigning SILs. That all said, yes, it is complex, and in practice it doesn't happen anything like often enough. The poor sub-system supplier doesn't get a safety target assigned to them because the system integrator hasn't had a safety target assigned to them, so people use SILs instead ("it'll kill lots of people if it fails, so it's SIL 4") and then reverse-engineer a hardware safety target from that. Which is not the intention at all, but we all do it (some of us through gritted teeth). Where safety targets are assigned properly they work really well - in my experience that's where there's a relatively short path from the piece of hardware you need a target for to the top level of the overall system - so they are always worth pushing for, but I agree that sometimes you just have to concede that you are not going to be assigned a sensible one.
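    Just to make that apportionment step concrete, here's a rough sketch in Python of the arithmetic involved (my own illustration, not taken from the standard - the top-level figure, the equal split and the subsystem names are made up for the example; real apportionment falls out of the hazard analysis, and the bands shown are the IEC 61508 high-demand PFH ones):

    ```python
    # Illustrative only: split a top-level tolerable hazard rate across subsystems
    # and see which IEC 61508 high-demand SIL band each budget lands in.

    # IEC 61508 high/continuous demand mode bands: average frequency of a dangerous
    # failure per hour (PFH). Tuples are (SIL, lower bound inclusive, upper bound exclusive).
    SIL_BANDS_PFH = [
        (4, 1e-9, 1e-8),
        (3, 1e-8, 1e-7),
        (2, 1e-7, 1e-6),
        (1, 1e-6, 1e-5),
    ]

    def sil_for_pfh(pfh):
        """Return the SIL whose PFH band contains the target, or None if out of range."""
        for sil, lower, upper in SIL_BANDS_PFH:
            if lower <= pfh < upper:
                return sil
        return None

    def apportion_equally(top_level_thr, subsystems):
        """Naive equal split of a top-level tolerable hazard rate (per hour)."""
        share = top_level_thr / len(subsystems)
        return {name: share for name in subsystems}

    if __name__ == "__main__":
        # Hypothetical: a top-level target of 1e-7 dangerous failures per hour,
        # shared by three independent subsystems whose failure each leads to the hazard.
        budgets = apportion_equally(1e-7, ["sensing", "control logic", "actuation"])
        for name, pfh in budgets.items():
            print(f"{name}: PFH budget {pfh:.1e}/h -> SIL {sil_for_pfh(pfh)}")
    ```

    Run that and each subsystem gets a budget of about 3.3e-8/h, which falls in the SIL 3 band - exactly the kind of short, traceable path from top-level target down to a hardware target that makes the approach workable.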

    (Not really relevant, but: I got the chief safety officer for a delivery project for a new railway line very angry when - as the Independent Safety Assessor - I pushed him to say what his safety target was. Eventually he said, furiously, "no-one is going to die on my railway". Which was a silly thing to say: of course people are going to die on it during its service life, and at least some of those deaths would be preventable by spending more money. For a start the railway had level crossings, and any railway with level crossings will have fatalities - by spending more on bridges they could have been removed. By saying "it's not going to happen" he signalled that he wasn't looking at this seriously enough. I've summarised, of course; there were many other reasons why we were aware that he didn't have a clear view of what his risks were.)

    One thing that is crucial when looking at guidance is that "best practice" varies wildly between industries. I believe you're in the military equipment world, so it's important to seek best (or at least accepted) practice in that field. Of course we all learn from each other, and ideally we would do FAR more of that, but in practice a process that would be considered essential for ALARP / SFAIRP in one industry can be considered OTT (or even inappropriate) in another.

    I will ask my Human Factors colleagues about guidance in that area as that's not my field, and I'll post here if they come up with anything. 

    I have threatened that when I retire in a year or two I'll write a book about this (from a particular perspective - all safety engineers have particular angles; never trust anyone who says "there is only one way to engineer safely"), but whether I'll actually get around to it when there are guitars to play and a shed to potter in, I don't know!

    Again, really good questions,

    Thanks,

    Andy 

Children
  • Hello Andy:

    In the latest IEEE Spectrum magazine (April 2026) there is a very interesting article titled "AI Mistakes Are Very Different From Human Mistakes", which makes the point that new safety procedures need to be developed when dealing with AI-based systems.

    Quote: "AI errors come at seemingly random times, without any clustering around particular topics. The mistakes tend to be more evenly distributed through the knowledge space; an LLM might be equally likely to make a mistake on a calculus question as it is to propose that cabbages eat goats."

    Peter Brooks

    Palm Bay FL