Values, guidance, NICE and the ESVS.

This is a transcript of a 7 minute talk I was invited to give at the Cardiovascular and Interventional Radiological Society of Europe’s [CIRSE] annual conference in Barcelona, as part of a session on “Controversies in Standard Endovascular Aneurysm Repair [EVAR] within IFU” [instructions for use].

This talk, “NICE guidelines best inform clinical practice”, was one side of a debate: my opponent’s title was “European Society for Vascular Surgery [ESVS] guidelines should be the standard of care”.

If you have on-demand access to CIRSE2022 content, you can view a recording of the session here.

Barcelona. Spain. 13th September 2022. 15:00h

Thanks. My name is Chris Hammond, and I’m Clinical Director for radiology in Leeds. I was on the NICE AAA guideline development committee from 2015-2019.

I have no declarations, financial or otherwise. We’ll come onto that in a bit more detail later.

This talk is not going to be about data. I hope we are all familiar with the published evidence about AAA repair. No. This talk is about values. Specifically, the values that NICE brings to bear in its analysis and processes to create recommendations and why these values mean NICE guidelines best inform clinical practice. What are those values?

Rigour, diversity, context.

Let’s unpick those a little.

NICE is known for academic rigour. Before any development happens, the questions that need answering are clearly and precisely identified in a scoping exercise. A PICO question is created, the outcomes of interest are defined, and the types of evidence we are prepared to accept are stipulated in advance.

The scope and research questions are then published and sent out for consultation – another vital step.

After the technical teams have done their work, their results are referred explicitly back to the scope. Conclusions and recommendations unrelated to the scope are not allowed.

This process is transparent and documented, and it means committee members cannot change their minds on the importance of a subject if they do not like the evidence eventually produced.

It’s impossible to tell from the ESVS document what their guideline development process was. A few paragraphs at the beginning of the document are all we get. The ESVS does not publish its scope, research questions, search strategies or results. How can we be assured, therefore, that their conclusions are not biased by hindsight, reinterpreting or de-emphasising outcomes that are not expedient?

We can’t.

For example, data on cost effectiveness and outcomes for people unsuitable for open repair are inconvenient for EVAR enthusiasts. I’ll let you decide the extent to which these data are highlighted in the ESVS document.

Moreover, in failing to define acceptable levels of evidence for specific questions, the ESVS ends up making recommendations based on weak data. Recommendations are graded using the European Society of Cardiology criteria, which conflate evidence and opinion. Which is it? Evidence or opinion?

Opinions may be widely held and still be wrong: the sun does not orbit the earth. Formulating an opinion into a guideline lends that opinion an illegitimate validity.

Finally, there is the rigour in dealing with potential conflicts of interest. These are the ESVS committee declarations – which I had to ask for. The NICE declarations are in the public domain on the NICE website. Financial conflicts of interest are not unexpected, though one might argue that the extensive and financially substantial relationships with industry of some of the ESVS guideline authors do raise an eyebrow.

The question though is what to do about them. NICE has a written policy on how to deal with a conflict, including exclusion of an individual from part of the guidance development where a conflict may be substantial. This occurred during NICE’s guideline development.

The ESVS has no such policy. I know because I have asked to see it. Which makes one wonder: why collect the declarations in the first place?

How can we then be assured these conflicts of interest did not influence guideline development, consciously or subconsciously?

We can’t.

What about diversity? 

This is the author list of the ESVS guideline. All 15 of the authors, all 13 of the document reviewers and all 10 of the guideline committee are medically qualified vascular specialists. They are all likely to have had similar training, attended similar conferences and educational events, and to have broadly similar perspectives. It’s a monoculture.

Where are the patients in this? The ESVS asked for patient review of the plain English summaries it wrote to support its document, but patients were not involved in the development of scoping criteria, outcomes of importance or in the drafting of the guideline itself.

Where is the diversity of clinical opinion? Where are the care of the elderly specialists to provide a holistic view? Where is anaesthesia? Primary care? Nursing?

Where is the representation of the people who pay for vascular services: infrastructure, salaries, devices? And of those who indirectly pay for all this, maybe for your meal out last night, for the cappuccino you’ve just drunk? Where is their perspective when they also have to fund the panoply of modern healthcare?

NICE committees have representation of all these groups, and their input into the development of the AAA guidance was pivotal.  The NICE guidance was very controversial, but the consistency of arguments advanced by diverse committee members with no professional vested interest was persuasive.

Finally, we come to context.

An understanding of the ethical and social context underpinning a guideline is essential.

We cannot divorce the treatments we offer from the societal context in which we operate. We live in a society which emphasises individual freedom and choice, and which is comfortable with some people having more choices than others, usually based on wealth. Does this apply equally in healthcare? In aneurysm care? What if offering expensive choices for aneurysm repair means we don’t spend money on social care, nursing homes, cataracts or claudicants?

To what extent should guidelines interfere with the doctor-patient relationship? Limit it or the choices on offer? What is the cost of clinical freedom and who bears it?

NICE makes very clear the social context in which it makes its recommendations. It takes a society-wide perspective, and its social values and principles are explicit. You can find them on the NICE website. Even if you don’t agree with its philosophical approach, you know what it is.

We don’t know any of this for the ESVS guideline. We don’t know how ESVS values choice over cost, the individual over the collective. Healthcare over health. This means that the ESVS guideline ends up being a technical document, written by technicians for technicians, devoid of context and wider social relevance.

The ESVS guideline is not an independent dispassionate analysis, and it never could be, because its development within an organisation so financially reliant on funding from the medical devices industry was not openly and transparently underpinned by NICE’s values of rigour, diversity and context. 

Rigour. Diversity. Context.

That’s why NICE guidelines best inform clinical practice.

Thanks for your attention.


Decisions, QALYs and the value of a life

Here’s a well known thought experiment:

A runaway train is on course to collide with and kill five people who are stuck at a crossing down the track. There is a railway point, and you can pull a lever to reroute the train onto a siding, bypassing the people stuck at the crossing but killing two siding workers.

What would you do? What is the ethical thing to do? Why? What if one of the siding workers was related to you? What if the people stuck at the crossing were convicted murderers on their way to prison? What if the people on the crossing were not killed but permanently maimed?

Have a think about it before reading on.

Unless you answered that you did not accept the situation at face value (like Captain Kirk and the Kobayashi Maru simulation), or refused to choose, you will have made some judgements about the relative value of the choices on offer and perhaps the relative value of the lives at risk. You are not alone in this: in a 2018 Nature paper, millions of people from 233 countries and territories contributed nearly 40 million decisions to similar dilemmas. On average there were preferences to save the young over the old, the many over the few and the lawful over the unlawful, though with some interesting regional and cultural variations.

Making value judgements about people in a thought experiment is one thing, but making them in the real world, with impacts on real people and their lives, is another. Ascribing value to a person’s life has grim historical and moral connotations. If someone is deemed somehow less valuable than someone else, there is a risk that this is used to justify stigmatisation, discrimination, persecution and even genocide. We therefore need to be extremely careful about the moral context in which such judgements are made and the language we use to discuss them. Human rights, justice and the fundamental equivalence of the life and interests of different people must be central.

Decisions which affect the health, livelihoods and welfare of citizens are (and need to be) made all the time. In some cases decisions affect length or quality of life, or liberty. Decision making during the pandemic (whether about lockdown, reopening, isolation, mask wearing or travel restrictions) is a potent recent example. Few people would argue that no decisions were necessary, even if they may disagree with the details of some (or all) of the actual decisions made.

But if everyone’s life and interests are equivalent, how do we avoid becoming paralysed when faced with choices which inevitably will have (sometimes significant) consequences for different individuals? We do this by understanding that the values we ascribe to the people affected by a decision are not absolute measures of their worth, but merely tokens which allow us to undertake some accounting. If the process by which we allocate these tokens is transparent, just and humane then their use to inform a decision is morally defensible. Choosing to switch the points because this results in the least worst outcome on average is morally very different from choosing to switch them because you have a seething hatred of railway engineers.

What tokens can we use in healthcare to help us make decisions?

There have been attempts to provide a quantitative framework for measuring health. The most commonly recognised token of health is the Quality Adjusted Life Year (QALY), though there are others (e.g. the Disability Adjusted Life Year [DALY]). One QALY is a year lived in full health. A year lived in less than full health results in less than one QALY, as does less than one year lived in full health. How much we scale a QALY for less than full health is determined by studies asking members of the public to imagine themselves ill or disabled and then enquiring (for example) how much length of life they’d trade to be restored to full health (time trade-off) or what risk of death they’d accept for a hazardous cure (standard gamble).
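The arithmetic behind these tokens is simple enough to sketch. The following is an illustrative toy, not any real valuation study: the function names and all utility values are invented for the example.

```python
def qalys(years: float, utility: float) -> float:
    """QALYs = years lived multiplied by a utility weight
    (1.0 = full health, 0.0 = dead)."""
    return years * utility

def tto_utility(years_traded: float, remaining_life: float) -> float:
    """Time trade-off: if a respondent would give up `years_traded` of
    `remaining_life` years to be restored to full health, the implied
    utility of their current state is (remaining - traded) / remaining."""
    return (remaining_life - years_traded) / remaining_life

# Two years in full health and four years at utility 0.5 both yield 2 QALYs.
print(qalys(2, 1.0))       # 2.0
print(qalys(4, 0.5))       # 2.0

# A respondent prepared to trade 2 of 10 remaining years for a cure
# implies a utility of 0.8 for their current health state.
print(tto_utility(2, 10))  # 0.8
```

Note how the equivalence of two full-health years and four half-utility years is exactly the accounting-token property discussed above: the QALY counts health, not people.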

The QALY is a crude and clumsy tool. It has been criticised for relying on functional descriptions of health states (like pain, mobility and self care) rather than manifestations of human thriving (stability, attachment, autonomy, enjoyment), for systematically biasing against the elderly or the disabled and for failing to take into account that health gains to the already healthy may be valued less than health gains to the already unhealthy (prospect theory). The scalar quantities contributing to a QALY (‘utilities’) reflect the perceptions of those surveyed during QALY development, validation and revalidation. These perceptions may be clouded by fear or ignorance and may have little relation to the real experiences of people living with a health impairment or handicap. Some have argued that QALYs have poor internal validity and are therefore a spurious measure.

These are important, though arguable, technical criticisms and to some extent explain the marked international variation in the use of the QALY: they are used in the UK and some Commonwealth countries, but have been rejected as a basis for health technology assessment in the US and Germany. And yet, decisions need to be made. If not QALYs then what else?

But the most emotionally charged criticism of QALYs is that they somehow inherently rank people’s value according to how healthy they are or that the health of people who gain fewer QALYs from an intervention is somehow worth less than the health of those who gain more. This is a misunderstanding. QALYs (like the assessments you made of the lives at risk from the runaway train) are accounting tokens. When fairly, justly and transparently allocated (and technical criticisms might be important here) they merely allow a quantitative assessment of outcome. The rationale underlying QALY assessment is explicit that the value of a QALY is the same no matter who it accrues to: there is no moral component in the calculation. And nor is there the requirement that efficiency of QALY allocation be the sole (or even most important) driver of decision making.

A QALY calculation is fundamentally contingent on the interaction of the intervention with the people being intervened on. Someone’s capacity to benefit (which is what QALYs measure) depends not just on their characteristics but also on those of the intervention. An absolute ranking of QALYs as an empirical assessment of someone’s ‘value’ based on their health is therefore meaningless: a different intervention on the same set of people can result in a totally different estimate of outcome.
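The contingency can be shown numerically. In this hypothetical sketch (every figure is invented for illustration), the same cohort yields completely different QALY gains under two different interventions, so the gain measures the interaction, not the worth of the people:

```python
def expected_qalys(years: float, utility: float) -> float:
    """Expected QALYs: expected years of survival x utility weight."""
    return years * utility

# One hypothetical cohort: 5 years' expected survival at utility 0.6 untreated.
baseline = expected_qalys(5, 0.6)           # 3.0 QALYs

# Hypothetical intervention A extends survival and improves quality of life.
gain_a = expected_qalys(7, 0.7) - baseline  # ~1.9 QALYs gained

# Hypothetical intervention B improves quality of life only modestly.
gain_b = expected_qalys(5, 0.65) - baseline # 0.25 QALYs gained

print(round(gain_a, 2), round(gain_b, 2))
```

Identical people, an almost eightfold difference in measured benefit: the number ranks the match between intervention and cohort, not the cohort.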

Consider if, rather than five people on the crossing, there was only one (and still two siding workers). A pure utilitarian consequentialist would switch from ploughing the train into the siding to smashing it into the crossing. But this doesn’t mean she has suddenly changed her mind about the value of the lives of the people involved, merely that the situation, and therefore the most efficient outcome, has changed.

QALYs don’t ascribe a value to someone’s life. They are accounting tokens, providing a (perhaps flawed) quantitative estimate of health outcome in a specific circumstance – usually that of evaluation of an intervention relative to an alternative in an identifiable group of people. This is not to say that some people might not be harmed by a decision based on a QALY assessment, but that, of itself, does not make the decision unfair or unjust.

Alongside utilitarian efficiency and QALYs, egalitarian considerations of fairness and equity, distributional factors, affordability, and political priorities may (and often do) feed into the decisions that are ultimately made. 

In praise of doing nothing

“Don’t just do something, stand there”

A man in his early 80s is diagnosed with a large hepatocellular carcinoma. This was detected incidentally on an abdominal ultrasound performed for an episode of renal colic. The liver lesion is completely asymptomatic. The ultrasound prompted an MRI of the liver, then a staging CT, a visit to the liver surgeon and ultimately a plan for hemihepatectomy. As there will not be enough residual liver after this, he is referred to me for portal vein embolisation to get the future remnant liver to grow before resection.

Modern medicine is remarkably safe. Portal vein embolisation carries negligible risk. Major liver resection in octogenarians is associated with a perioperative mortality of about 5%. There have been dramatic improvements in the safety of many invasive procedures facilitated by improvements in anaesthetic care, pre- and post-operative medical management and rehabilitation, nutrition and the advent of minimally invasive surgery and imaging guided techniques. This is unarguable progress.

However, when procedures are so safe that there is very little downside to undertaking them, we create a set of ethical and evidential questions which we are poorly equipped to answer. These relate to long term outcomes (and predicting these on a patient-by-patient basis) and the appropriate use of resource. As questions about ‘can we do this’ become easier, questions about ‘should we do this’ become more complex. These dilemmas are not unique to surgery and intervention: decisions about (for example) starting renal replacement therapy are similar. 

Differing perspectives and different information are needed to answer questions about ‘can we’ and ‘should we’.

‘Can’ questions tend to be technical. Is the procedure possible? How might we do it? What are the immediate peri-procedural risks? Outcomes of interest are usually focussed on procedure, process, operator or institution. They are easy to measure and so are supported by a large literature of mainly cohort and registry based data. We know whether we can or not.

‘Should’ questions are more nebulous, patient centred and frequently more philosophical. Why are we doing this intervention? What are we trying to achieve? What outcomes are relevant and for whom? What do patients think? What does this patient think? What happens if we do nothing? And more widely, is it worth the cost and on whose terms? Producing evidence to answer these questions is much more difficult. It requires engagement with patients, long term follow up and an assessment of the natural history of both patient and pathology with and without the intervention. This is time consuming, expensive and sometimes ethically difficult where professional norms have progressed beyond the evidence base. Research methodology advances such as the IDEAL collaboration (and perhaps routine dataset linkages) are mitigating some of these implementation barriers, but adoption is slow.

Furthermore, even if we have long term meaningful outcome data for a cohort of patients, how does this relate to our patient? Risk prediction models and decision tools are frequently inaccurate or so cumbersome as to be clinically useless (or both!).

The relative absence of ‘should’ information makes informed consent with a patient about whether to proceed or not very difficult. Inevitably, fundamental personal, professional and philosophical inclinations will colour evidence interpretation and consent conversations. It’s easy to influence a patient toward a favoured decision, especially if they are minded to leave the ultimate decision to the doctor. Clinicians vary in their willingness to intervene: some are more conservative than others, and it’s a well trodden path that youthful exuberance is tempered as a clinician gets older. We should be honest (with the patient and with ourselves) about the extent to which our personal philosophy and practice style impact shared decision making.

But it’s not just personal philosophy that influences decisions in the common situation where the evidence base is limited. Setting aside the distorting influence of medical device marketing (conservative management rarely generates revenue), there are structural features within healthcare which promote intervention over doing nothing.

Specialties like interventional radiology exist (unfortunately still) almost entirely in the context of doing procedures, rather than managing patient pathways. Therefore not doing something can be seen as an existential threat. What value is added if not by doing interventions?

Secondly, with the best intentions, we benchmark some outcomes as failure, creating pressure to avoid them at all costs. But heroic interventions to prevent these eventualities are not always good healthcare. Major amputation in peripheral vascular disease is not always inappropriate. Dying with, or even of, a malignancy does not always represent a bad outcome.

Third, there is a psychological tendency, when faced with an individual at risk, to see doing something as a moral obligation and always better than doing nothing even when the benefit is uncertain. When the risk of intervening (‘can we do this’) is very low what is there to lose? The obligation is exacerbated by public perception of modern healthcare as almost omnipotent. This ‘Rule of Rescue’ is a powerful motivator for healthcare staff who are expected to treat, and for patients who expect to be treated.

Finally, follow up is often delegated away from the operator. Without the immediate and personal experience of our patients’ outcomes longer term, it’s easy to equate safety with efficacy but this is a reductionist fallacy. A person’s outcome depends on more than the successful management of a single pathological entity. The literature is replete with examples of successfully performed interventions that make no difference to outcome. And there is ample evidence that clinicians often overstate the benefits and understate the risks of the interventions they offer.

These structural features nudge modern healthcare towards ever increasing intervention. But in the rush to intervene we lose sight of the option of doing nothing and thereby risk becoming more technical, less humane and ultimately less effective as healthcare providers. Sometimes (often?) a sensitive conversation, compassion, empathy and reassurance are preferable to virtuoso technique. The Fat Man’s 13th Rule is as relevant now as it was 40 years ago: The delivery of good medical care is to do as much nothing as possible.

My general position has slowly become increasingly conservative and I declined to do the portal vein embolisation after a long discussion with the patient. The referring surgeon (someone I respect greatly) and I had a ‘frank exchange of views’. Fourteen months later the patient remained well and his lesion was no larger.  Clearly this single anecdote cannot be used to justify an overall philosophy: he might have died of the lesion. He may still. I don’t think that is necessarily a failure.