Vioxx and Bextra: Out of the frying pan into the fire? (BMJ 2004)

Some time ago in a discussion about possible future projects Iain Chalmers mentioned to me that he would like to see more work done around the question of whether new treatments were better than older established ones. It seems that the general impression conveyed by the marketers and advertisers is that new is better than old (which may have more than a little to do with the fact that manufacturers make much of their profit when new products are protected by patent).

Whilst I was in Canada at the 12th Annual Cochrane Colloquium the news broke that rofecoxib (Vioxx) had been withdrawn by the manufacturers and on my return I discovered that this was due to the results from a large long-term study on Vioxx to see what effect it had on polyps in the large bowel. The absolute increase in serious thromboembolic events was about 1% over 18 months, which was a doubling of the risk in the placebo group.

The problem is that these events are not very common and short-term trials looking at how well Vioxx performs as an anti-inflammatory agent would not have sufficient power to detect differences of this magnitude.
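To give a rough sense of scale, the sketch below estimates how many patients a trial would need in each arm to detect a doubling of risk from 1% to 2%. The 1% and 2% figures are assumptions read off the approximate numbers quoted above. The 5% two-sided significance level and 80% power are conventional choices, not values taken from the Vioxx study. The formula is the usual normal approximation for comparing two proportions.

```python
from math import ceil
from statistics import NormalDist


def patients_per_arm(p_control, p_treated, alpha=0.05, power=0.80):
    """Approximate sample size per arm for comparing two proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)           # desired power
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_treated - p_control) ** 2)


# Assumed risks over 18 months: about 1% on placebo, about 2% on treatment.
n = patients_per_arm(0.01, 0.02)
print(n)
```

Under these assumptions the answer is in the region of 2,300 patients per arm, all followed for the full 18 months, which is far more than a short efficacy trial would normally recruit.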

So in 2004 we had to decide what to do with patients who had been taking Vioxx. The crucial question is whether this is a class effect (which would mean an increased thromboembolic risk with all Cox-2 inhibitors) or whether it is a particular problem with Vioxx. Two different questions could be asked:

1. Have other Cox-2 drugs been shown to increase thromboembolic risk?

2. Have other Cox-2 drugs been shown to be better than Vioxx for thromboembolic risk?

We did not know the answer to either question, but my guess is that those with a vested interest in the other Cox-2 inhibitors will major on the fact that we do not know the answer to question one, and those who are more concerned about safety will focus on the second question. The latter will also remember the issues related to delayed publication of adverse effects from SSRIs in adolescents mentioned in my last newsletter.

One alternative to Vioxx is Bextra (valdecoxib) and I was interested to read in today’s BMJ that concern has been voiced about delay in releasing information about Bextra following the withdrawal of Vioxx. Guess what one of the members of the FDA’s Safety and Risk Management committee has found?

“The first of the two trials was published but in a manner that obscured the risks, according to Dr Furberg. He said, ‘They listed each [adverse] event individually and said the numbers were too small to analyse. But I added up heart attacks, strokes, and deaths and found a statistically significant fourfold increase over placebo.’”

So do you think a class effect is likely? If you do then changing patients from Vioxx to another Cox-2 inhibitor may not be the right thing to do. This may be an instance where new is not better after all.

Publication Bias – How much of a problem is it for SSRIs? (Lancet 2004)

The term “Publication Bias” can be used in a variety of different ways and includes problems ranging from non-publication of trials, to delayed publication, publication in less accessible journals or languages, and selective publication of the outcomes in papers that do get published. The problem is that if the results influence where, when and what is published, then what is published is a biased sample of the whole data, and any systematic review of the published results may be misleading.

Human nature is such that trialists and sponsors have an agenda in what they publish and in the way the data is presented. For example, in small trials there are often baseline imbalances between the active treatment and control groups. This can be exploited by choosing whether the data is presented as change from baseline or as absolute post-treatment values (lung function data being one example). Sometimes only the outcome favouring the product is presented, or it may be emphasised over the alternative. An example of this was found in the systematic review comparing metered dose inhalers in the BMJ (Brocklebank D, Wright J, Cates C. Systematic review of clinical effectiveness of pressurised metered dose inhalers versus other hand held inhaler devices for delivering corticosteroids in asthma. BMJ 2001;323:896-.) An apparent group difference between delivery devices disappeared when the FEV1 data from some of the research papers was adjusted from absolute to change from baseline values (or vice versa).
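A small worked example may make the baseline problem concrete. The figures below are invented purely for illustration (they are not taken from the inhaler review): two devices produce exactly the same improvement in FEV1, but one group happened to start from a higher baseline.

```python
# Hypothetical mean FEV1 values in millilitres, invented for illustration only.
device_a = {"baseline": 2200, "post": 2400}
device_b = {"baseline": 2000, "post": 2200}


def post_treatment_difference(a, b):
    """Difference between groups in absolute post-treatment values."""
    return a["post"] - b["post"]


def change_from_baseline_difference(a, b):
    """Difference between groups in the change from their own baseline."""
    return (a["post"] - a["baseline"]) - (b["post"] - b["baseline"])


print(post_treatment_difference(device_a, device_b))        # 200
print(change_from_baseline_difference(device_a, device_b))  # 0
```

Reported as post-treatment values, device A appears 200 ml better. Reported as change from baseline, the two devices are identical, so the choice of presentation alone decides whether a "difference" is shown.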

With care systematic reviewers can be on the lookout for such problems, and sometimes adjust for them. However when outcomes are not published at all, or even whole trials do not reach the public domain, this represents a serious threat to the combined results of clinical trials that are collected together in Systematic Reviews.

A particularly worrying example of this was published this year in the Lancet.

Here is the abstract and reference:


“BACKGROUND: Questions concerning the safety of selective serotonin reuptake inhibitors (SSRIs) in the treatment of depression in children led us to compare and contrast published and unpublished data on the risks and benefits of these drugs. METHODS: We did a meta-analysis of data from randomised controlled trials that evaluated an SSRI versus placebo in participants aged 5-18 years and that were published in a peer-reviewed journal or were unpublished and included in a review by the Committee on Safety of Medicines. The following outcomes were included: remission, response to treatment, depressive symptom scores, serious adverse events, suicide-related behaviours, and discontinuation of treatment because of adverse events. FINDINGS: Data for two published trials suggest that fluoxetine has a favourable risk-benefit profile, and unpublished data lend support to this finding. Published results from one trial of paroxetine and two trials of sertraline suggest equivocal or weak positive risk-benefit profiles. However, in both cases, addition of unpublished data indicates that risks outweigh benefits. Data from unpublished trials of citalopram and venlafaxine show unfavourable risk-benefit profiles. INTERPRETATION: Published data suggest a favourable risk-benefit profile for some SSRIs; however, addition of unpublished data indicates that risks could outweigh benefits of these drugs (except fluoxetine) to treat depression in children and young people. Clinical guideline development and clinical decisions about treatment are largely dependent on an evidence base published in peer-reviewed journals. Non-publication of trials, for whatever reason, or the omission of important data from published trials, can lead to erroneous recommendations for treatment. Greater openness and transparency with respect to all intervention studies is needed.”

Reference: Whittington CJ, Kendall T, Fonagy P, Cottrell D, Cotgrove A, Boddington E. Selective serotonin reuptake inhibitors in childhood depression: systematic review of published versus unpublished data. Lancet 2004;363(9418):1341-5.

The Editor of the Lancet has expressed strong views about the way the data was withheld.

(Depressing research. Lancet 2004;363(9418):1335.)

The public prosecutor in New York is taking one of the manufacturers of an SSRI to court over this issue. If the case is successful perhaps this will provide a lever for change and a greater level of transparency in papers reporting the results of medical research.

(Is GSK guilty of fraud? Lancet 2004;363(9425):1919.)

The Moral of the Story

I think we have to assume that publication bias is always present (rather than the opposite) and the tests that we carry out, such as funnel plots, should be looked at to see if there is reasonable evidence that publication bias is NOT present!

The problem is particularly acute for the reporting of adverse events, as in the example above. There are already moves afoot to encourage the registration of all clinical trials, but this will not address the problem of selective reporting of results from trials (for example adverse event recording). I wonder if it would help to ask that when the results of controlled trials are analysed, the groups should be concealed from the statistician and authors so that significant differences are reported as such, and the direction of effect (for or against the active treatment) is then added later. Perhaps consideration should be given to including such an approach into the next revision of the Consort statement?

Using Evidence in Practice – An introduction to Evidence Based Medicine (Prescriber 2003)

An introduction to Evidence Based Medicine

What do you mean by evidence-based medicine?

Whilst the term evidence based medicine (EBM) is probably familiar to most readers, it is worth pausing initially to think about what we understand by the term. The claim that a position is “evidence based” can be used to try to silence any questions or argument. On the contrary, asking questions about the evidence for any suggested course of action is at the heart of EBM philosophy. I can do no better than to quote the introduction to one of my favourite books in this area, Follies and Fallacies in Medicine(1), in which the authors describe themselves as suffering from incurable “scepticaemia”.

The aim of our book is to reach inquisitive minds, particularly those who are still young and uncorrupted by dogma. We offer no solutions to the problems we raise because we do not pretend to know of any. Both of us have been thought to suffer from scepticaemia* but are happy to regard this affliction, paradoxically, as a health promoting state. Should we succeed in infecting others we will be well content.

*Scepticaemia: An uncommon generalised disorder of low infectivity. Medical school education is likely to confer life-long immunity.

The first step towards using EBM to inform our daily practice is to be prepared to question whether we always know the best course of action or have looked at the evidence that underpins the decisions that we make.

We are certainly influenced by our own past experience, what our colleagues do and what experts tell us. These often enlighten us and inform our practice, but we must also be aware that experiences are subject to chance variation, and that the person who is closest at hand may not give the best advice. For example, the experience of the last patient with a condition is not necessarily the best pointer for the next one. What we were taught in medical school may also now be out of date. We do well, however, to remember that our own experience and those of our patients are always important and worth exploring. How many times have you had the experience of suddenly understanding why a patient has presented with a longstanding headache when they let slip that a friend at work had been diagnosed as having a brain tumour?

What EBM is not

Whilst it is invaluable to know what the evidence is in relation to problems that we have to investigate and treat, you may be surprised to learn that the advocates of EBM would be the first to agree that evidence is only a small part of making clinical decisions (see box).

"First, evidence alone is never sufficient to make a clinical decision. Decision-makers must always trade the benefits and risks, inconvenience, and costs associated with alternative management strategies, and in doing so consider the patient's values."

Users Guides to the Medical Literature(2)

EBM is not a kind of cookbook medicine full of easy answers to difficult questions, and it can be quite time-consuming. In general as we dig into the evidence we find that there is much that is unknown, but tolerance of uncertainty is well known to us in primary care, and in my experience sharing this uncertainty carefully with patients is often surprisingly well received.

'For every complex problem there is a simple answer, and it's wrong.'
HL Mencken

Why is EBM important?

There is an ever-increasing quantity of medical literature published each week and keeping up to date is a huge challenge. It is simply not possible to read all the relevant literature (even in our areas of special interest), so how can we stay in touch with recent developments? If you have written a personal learning plan I wonder whether this is a recognised problem and how you plan to address it?

Increasingly we are put under pressure by patients who have read about a new treatment in the paper or found an article on the Internet, or by consultants who advocate particular referral or treatment pathways for patients with particular symptom presentations. So how are we to respond?

The medical literature is a powerful resource for us, but we have to recognise that it serves many different needs. Those who commission and carry out medical research need somewhere to publish the findings of their work. This may be of high or low quality, and it is not necessarily safe to assume that publication of a paper in a peer-reviewed journal means you can believe all that the authors say. Just look at the subsequent correspondence if you want to see what I mean!

The bottom line is whether this paper means that I should change what I am currently doing, and in order to assess this some basic skills are needed. Many of these, including some explanation of statistical concepts, will be covered in later articles in this series, but the first useful skill is being able to turn a vague concern into an answerable question.

We need to be able to pose a question that reliable research studies can answer. The structure of such a question in relation to treatment options will have four parts to it and can be summarised using the acronym PICO. We need to consider the Patient’s problem, the Intervention suggested, the possible Comparative treatments and the Outcomes that matter (see Box).

Thus “Does my child need antibiotics for this ear infection?” might be rephrased “In children with acute otitis media, how much difference do antibiotics make in comparison with paracetamol alone, in terms of duration of pain, deafness, recurrent infections and serious complications”.

Once we have determined the question that we want to ask, we can move on to decide what is the most valid evidence to answer the question and how to find it.

Archie Cochrane’s Challenge

I was impressed as a student by Archie Cochrane’s book ‘Effectiveness and Efficiency’ in which he pointed out that we could be as efficient as we like in providing medical care, but that if it is not effective care we are wasting our time(3). He set out a challenge in 1979 as follows(4):

It is surely a great criticism of our profession that we have not organised a critical summary, by specialty or subspecialty, updated periodically, of all relevant randomised controlled trials.

In response to this challenge the Cochrane Collaboration prepares and updates such summaries in the form of systematic reviews of the best evidence available, and there are now over 1,000 of these on the Cochrane Library. Whilst there will inevitably be gaps in this database for some time to come, increasing numbers of reviews do address issues related to primary care.

I would be the first to admit that Cochrane reviews are not light reading, but a later article in this series will address the subject of how to understand systematic reviews. Moreover, part of the purpose of publications such as Clinical Evidence is to summarise the results of Cochrane reviews in a concise, understandable format.

EBM in daily practice

If we want to practise better medicine we will need to keep up to date with new developments and decide how to integrate them into our practice. The concept of Clinical Governance challenges us to demonstrate whether we have been able to measure changes in our practice as a result. This can be challenging and exciting but we have to be realistic about how much can be achieved in the face of numerous demands made upon us and the volume of uncertainties that we face every day. We also need to avoid efficiently implementing treatments that are not effective!

There is little point wasting time looking for answers that probably do not exist, and in my experience the quickest place to start looking is in a synopsis of published research that has already been assessed for quality, such as Clinical Evidence or Best Evidence (an electronic summary of Evidence Based Medicine Journal and ACP Journal Club). Whilst searching Medline may be more familiar, the best data tends to be buried in a sea of other material. Again this will be dealt with in more depth in a future article.

So if all this sounds like hard work – it is! But it is worth it and it can be fun, so look out for the future topics in this series that may change the way you read journals and perhaps even how you practise in the future.


1. Skrabanek P, McCormick J. Follies and Fallacies in Medicine. 3 ed: Tarragon Press; 1998.
2. Guyatt G, Rennie D. Users’ Guides to the Medical Literature: AMA Press; 2001.
3. Cochrane A. Effectiveness and Efficiency: The Nuffield Provincial Hospitals Trust; 1971.
4. Cochrane A. 1931-1971: a critical review, with particular reference to the medical profession. In: Medicine for the year 2000. London: Office of Health Economics; 1979. p. 1-11.

Reproduced with permission.

The perils and pitfalls of sub-group analysis (Pulse Article 2001)

This article is part of a series on Critical Reading.

Controlled clinical trials are designed to investigate the effect of a treatment in a given population of patients, for example aspirin is given to patients with ischaemic heart disease. Inevitably there will be differences between the patients included in the trial (men versus women, older versus younger, hypertensive versus non-hypertensive).

It is tempting to look at the effects of treatment separately in different types of patient in order to decide who will benefit most from being given the treatment. Although this analysis of the sub-groups of patients is widely carried out in the medical literature, it is not very reliable. And the ISIS-2 trial gives a clear example of how this can be misleading [1]. The trial looked at the effect of aspirin given after acute myocardial infarction, and when the results were reported the editorial team at the Lancet wished to publish a table of sub-group analyses. The authors agreed as long as the first line in the table compared the effects in patients with different birth signs [2].

The analysis showed that aspirin was beneficial in all patients except those with the star signs of Libra and Gemini. This served as a warning against the over-interpretation of the results of the other sub-groups reported in the paper. The problem is that the play of chance can lead to apparently significant differences between sub-groups, and these are really only helpful in very large trials which show really big overall differences between the treatment and control groups.
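The play of chance can be put into rough numbers. The sketch below assumes that each sub-group is tested separately at the conventional 5% significance level, that the tests are independent, and that the treatment truly works equally well in every sub-group. These are simplifying assumptions for illustration, not a description of how the ISIS-2 table was analysed.

```python
def chance_of_false_positive(n_subgroups, alpha=0.05):
    """Probability that at least one sub-group test is 'significant' by chance alone."""
    return 1 - (1 - alpha) ** n_subgroups


for k in (1, 5, 12):
    print(k, round(chance_of_false_positive(k), 2))
# 1 0.05
# 5 0.23
# 12 0.46
```

With twelve star signs there is, on these assumptions, close to an even chance that at least one sign will appear to behave differently from the rest even when nothing real is going on.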

Two examples of the use of sub-group analysis are somewhat contentious. The first was reported in the Lancet and looked at the evidence from different trials of mammography to try to reduce deaths from breast cancer[3]. The overall result from all the trials together showed mammography to be of significant benefit, but the authors looked at the characteristics of the trials and felt that some were more reliable than others. The data from these selected trials did not show a benefit from mammography. On this basis the authors concluded that screening for breast cancer was unjustified.

Use of aspirin

Similarly a recent paper in the BMJ suggested that aspirin may not be useful for primary prevention in patients with mildly elevated blood pressure on the basis of the results of patients in this sub-group [4]. I would suggest that before deciding about aspirin for such patients you ask yourself whether you would still treat those with the Libra and Gemini birth signs with aspirin following an MI. Moreover if patients on aspirin for secondary prevention of ischaemic heart disease ask whether they should stop if their blood pressure is up a bit, my answer would be no.

The bottom line is that the best overall estimate of the effect of a treatment comes from the average effect on all the patients and not from the individual sub-groups [5]. Sub-group analysis is generally best restricted to the realm of generating hypotheses for further testing rather than evidence that should change practice.


1. Horton R. From star signs to trial guidelines. Lancet 2000;355:1033-34

2. ISIS-2 Collaboration group. Randomised trial of intravenous streptokinase, oral aspirin, both, or neither among 17,187 cases of suspected myocardial infarction. Lancet 1988; ii:39-60

3. Gotzsche PC, Olsen O. Is screening for breast cancer with mammography justifiable? Lancet 2000;355:129-34

4. Meade TW, Brennan PJ, on behalf of the MRC General Practice Research framework. Determination of who may derive the most benefit from aspirin in primary prevention; subgroup results from a randomised controlled trial. BMJ 2000; 321:13-7.

5. Gotzsche PC. Why we need a broad perspective on meta-analysis. BMJ 2000; 321:585-6

Relative or Absolute measures of effect (Pulse Article 2001)

This article is part of a series on Critical Reading.

Measuring outcomes in clinical trials can be done in a variety of ways, and presentation of the results may influence the way that readers respond. For any trial that reports dichotomous outcomes (that is where patients can only be in one of two categories, such as dead or alive, pregnant or not) the results can be shown simply as a two-by-two table.

              Non-pregnant    Pregnant
Levonelle          976            11
Yuzpe              997            31

The data shown indicates that 1% of patients given Levonelle-2 for post-coital contraception become pregnant in comparison with 3% of those who are given the older Yuzpe method (1). This can be reported in different ways. The Relative Risk of becoming pregnant is obtained by dividing the risks of pregnancy in the treated and untreated groups and comes out as 0.33 if you have Levonelle-2 (or in other words your risk of becoming pregnant is one third of the risk with Yuzpe). This sounds impressive in comparison to the Risk Difference which is obtained by subtracting the risk in the two groups and is only 0.02 because the pregnancy rate is low in both groups. The Number Needed to Treat (NNT) with Levonelle-2 rather than Yuzpe to avoid one extra pregnancy is the inverse of the Risk Difference and in this case works out as 63 patients.(2)
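For readers who like to check the arithmetic, the three measures can be sketched in a few lines. The code reads the table's two columns as counts of non-pregnant and pregnant women in each group, which is an assumption about how the table should be interpreted. Because it works from the unrounded counts and not from the rounded 1% and 3%, its results will be close to, but not identical with, the figures quoted in the text.

```python
def risk(events, non_events):
    """Proportion of a group in which the event (here pregnancy) occurred."""
    return events / (events + non_events)


# Counts as printed in the two-by-two table above.
risk_levonelle = risk(events=11, non_events=976)
risk_yuzpe = risk(events=31, non_events=997)

relative_risk = risk_levonelle / risk_yuzpe    # ratio of the two risks
risk_difference = risk_yuzpe - risk_levonelle  # absolute reduction in risk
number_needed_to_treat = 1 / risk_difference   # inverse of the risk difference

print(f"Risk with Levonelle-2: {risk_levonelle:.3f}")
print(f"Risk with Yuzpe:       {risk_yuzpe:.3f}")
print(f"Relative Risk:         {relative_risk:.2f}")
print(f"Risk Difference:       {risk_difference:.3f}")
print(f"NNT:                   {number_needed_to_treat:.0f}")
```

The point to notice is how sensitive the NNT is to small changes in a small Risk Difference, which is one reason why it helps to quote the underlying risks alongside it.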

Each measure has its advantages and disadvantages. The relative risk of a given treatment (such as statins for the prevention of ischaemic heart disease) tends to be independent of the risk of the patients being treated. This makes it a good measure to use when combining the results of different trials in a meta-analysis (3).

Risk difference on the other hand is helpful when considering treatments for individual patients, as the amount of difference a treatment will make to them depends on their level of risk. A good example of this comes from the comparison of different oral contraceptive pills in terms of the risk of deep vein thrombosis. Safety studies have indicated that third generation oral contraceptive pills carry twice the risk of venous thromboembolism of the older pills; this is a relative risk of 2, and it caused a great deal of alarm amongst pill takers. However the absolute risks are very low, 1 in 10,000 with the older pills and 2 in 10,000 with the third generation ones; the risk difference is 0.0001. Put another way, 10,000 women would have to take the third generation pills for one year before one of them suffered thromboembolic disease as a consequence, giving a Number Needed to Harm (NNH) of 10,000.
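The same relative risk can therefore mean very different things at different levels of baseline risk. The sketch below uses the pill figures quoted above, together with a second baseline risk of 1 in 100 that is purely hypothetical and included only for contrast. Exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction


def number_needed_to_harm(baseline_risk, relative_risk):
    """NNH is the inverse of the absolute increase in risk."""
    risk_difference = baseline_risk * relative_risk - baseline_risk
    return 1 / risk_difference


# Figures from the text: 1 in 10,000 on the older pills, relative risk of 2.
nnh_pill = number_needed_to_harm(Fraction(1, 10_000), 2)

# Hypothetical higher-risk setting with the same relative risk of 2.
nnh_high_risk = number_needed_to_harm(Fraction(1, 100), 2)

print(nnh_pill)       # 10000
print(nnh_high_risk)  # 100
```

A relative risk of 2 is alarming when the baseline risk is 1 in 100, and of much less practical consequence when it is 1 in 10,000.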

Of course the interpretation of Numbers Needed to Treat may be dependent on how important the consequences are and some women opted to change pill to minimise their risks whilst others were happy to continue, as the individual risk to them was so low.

Those of you who are interested in seeing some examples of graphical displays of Numbers Needed to Treat in different clinical scenarios related to primary care will find examples in the Cates plots in other articles on this site (such as Vitamin D for asthma).

There are related articles on this topic (Relatively Absolute and Communicating Risk).


1. Task Force on Postovulatory Methods of Fertility Regulation. Randomised controlled trial of levonorgestrel versus Yuzpe regimen of combined oral contraceptives for emergency contraception. Lancet 1998;352:428-33.

2. Cates C. Which postcoital contraceptive? BMJ 2000;321:664

3. Egger M, Davey Smith G, Phillips AN. Meta-analysis: principles and procedures. BMJ 1997; 315: 1533-1537.

Evidence from Randomised Trials and Systematic Reviews (Pulse Article 2001)

This article is part of a series on Critical Reading.

The main threat to validity in non-randomised studies is bias due to differences in the populations of patients who do and do not receive the experimental treatment. Randomisation should overcome this problem because the random allocation of patients to the treatment or control groups should create an equal spread of known and unknown risk factors between the two groups. Whilst statistical techniques can be used to adjust for known confounding factors in non-randomised studies, by definition the unknown ones can only be overcome with randomisation.
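A short simulation can illustrate why random allocation balances risk factors that nobody has measured. Everything here is hypothetical: the 10,000 patients, the 30% prevalence of a hidden risk factor and the fixed random seed are chosen only to make the sketch reproducible.

```python
import random


def hidden_risk_gap(n_patients=10_000, prevalence=0.3, seed=1):
    """Randomise patients to two arms and return the difference between arms
    in the proportion carrying an unmeasured risk factor."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n_patients):
        has_risk_factor = rng.random() < prevalence   # unknown to the trialist
        arm = treated if rng.random() < 0.5 else control
        arm.append(has_risk_factor)
    return abs(sum(treated) / len(treated) - sum(control) / len(control))


print(round(hidden_risk_gap(), 3))
```

In a trial of this size the gap between the arms is small, even though the allocation took no account of the risk factor at all. In a small trial the same process can leave sizeable imbalances by chance, which is why baseline characteristics are still worth checking.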

Allocation concealment

Even in randomised controlled trials it is important to check that the allocation of patients to the active and comparison groups is well concealed. The quality of allocation concealment is routinely used by the Cochrane Collaboration in grading trials included in systematic reviews because empirical research has shown that studies which do not have well concealed allocation tend to show more inflated results than those that do. Why should this be? The problem is selection bias: if I was carrying out a randomised trial of my favourite wart paint it is important that I do not know which treatment the next patient will receive, otherwise I can influence the results by choosing the milder wart infections for treatment with the paint. This is quite easy in practice as I would only have to find an excuse to rule the next patient out of the trial if they were due to have the paint and had a horrendous crop of warts!

Similarly if I know the treatment used I may be more optimistic in deciding that a wart has completely gone if the patient had my special paint than if they did not. This is a form of detection bias. Secure double blinding (using an identical wart paint substitute prepared by an outside agency) will overcome both problems, and again has been shown to reduce the size of treatment effects compared with the results of unblinded (open) studies.

What about other trials?

Finally, after checking all these quality measures for the paper, do not forget that the study being reported is only one of a larger group of studies that have been carried out on the same topic all over the world. It is for this reason that the Cochrane Collaboration has set out to collect together all the evidence from controlled clinical trials that has a bearing on questions related to clinical practice and to publish the results in the Cochrane Database. Systematic reviews of this kind are one way to combat the increasing volume of papers published each year, but I am often asked what exactly a systematic review is and how it differs from a meta-analysis.

Narrative reviews

Traditionally reviews of interesting topics have been commissioned by journals that ask an expert in the area to give a viewpoint; the problem is that all experts have their favourite approach to a topic and will tend to be most familiar with those papers that support their own view. (How often do you keep a copy of something that you have read that you think is wrong?) This type of narrative review is therefore inherently likely to be biased.

It is helpful to think of a review as being a scientific investigation but of papers rather than patients. Would you trust a trial that reported the results of a new drug where only a few of those treated have their data for you to see and the choice of which ones is entirely up to the investigator? I certainly would not, and in the same way caution is needed when reading the results of narrative reviews.

Systematic Reviews

So what exactly is a Systematic Review? Mulrow has defined a Systematic review as “an efficient scientific technique to identify and summarise evidence on the effectiveness of interventions and to allow the generalisability and consistency of research findings to be assessed and data inconsistencies to be explored.” (1)

The difference is that the review sets out to find all the appropriate evidence on a topic, not just the bits that suit the writer. Ideally the review should start with a protocol that is decided in advance, and for Cochrane reviews these are also published on the Cochrane database. This helps to avoid data-dredging for results that happen to show ‘statistical significance’. Post hoc analysis done after the data is collected is equivalent to firing an arrow into a large wooden wall and then drawing a target around the place the arrow lands – much easier than drawing the target first and then hitting the bulls-eye!

The methods section of the systematic review should make clear how the search for evidence was carried out, how the identified trials were selected for inclusion or exclusion from the review, and how the data from the trials was combined. The data pooling is termed Meta-analysis and is no more than using mathematical techniques to combine the results from two or more individual trials. A systematic review sometimes does not include Meta-analysis if the data is not suitable for pooling, and nor does a Meta-analysis mean that all the data has been systematically searched out.
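To show that the pooling step really is just arithmetic, here is a minimal sketch of one common approach: a fixed-effect, inverse-variance meta-analysis of relative risks. It is offered as an illustration of the general idea and not as the method used in any particular review (Cochrane reviews often use other weighting schemes). The three trials are hypothetical and their counts are invented for the example.

```python
from math import exp, log, sqrt

# Hypothetical trials, invented for illustration only:
# (events_treated, total_treated, events_control, total_control)
trials = [(10, 100, 20, 100), (15, 200, 25, 200), (5, 50, 8, 50)]


def log_rr_and_variance(a, n1, c, n2):
    """Log relative risk for one trial and its approximate variance."""
    log_rr = log((a / n1) / (c / n2))
    variance = 1 / a - 1 / n1 + 1 / c - 1 / n2
    return log_rr, variance


def pooled_relative_risk(trials):
    """Weight each trial by the inverse of its variance and average on the log scale."""
    weighted_sum = 0.0
    total_weight = 0.0
    for trial in trials:
        log_rr, variance = log_rr_and_variance(*trial)
        weight = 1 / variance   # more precise trials count for more
        weighted_sum += weight * log_rr
        total_weight += weight
    pooled = weighted_sum / total_weight
    se = sqrt(1 / total_weight)
    # Return the pooled estimate with an approximate 95% confidence interval.
    return exp(pooled), exp(pooled - 1.96 * se), exp(pooled + 1.96 * se)


rr, lower, upper = pooled_relative_risk(trials)
print(f"Pooled RR {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Larger trials have smaller variances and therefore carry more weight, so the pooled estimate sits closest to the results of the most informative studies and its confidence interval is narrower than that of any single trial.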

In other articles I unpack some of the techniques used in Meta-analysis and explore the use of meta-analysis in systematic reviews.


1. Mulrow CD. Rationale for systematic reviews. BMJ 1994;309:597-9

Do I need to change my practice? (Pulse Article 2001)

This article is part of a series on Critical Reading.

When speaking to registrars about critical appraisal, one of the commonest questions is “How do I decide whether the paper is good enough to warrant a change in my current practice?” In the article on asking a good question I described how to break down the question addressed by a research paper into its four components, and having done this you next have to decide whether the findings of the paper are likely to be important to you and especially to your patients.

Is it valid?

In particular, is the approach being described in the paper worth trying on the next patient who presents with the relevant condition? To answer this we need to look at issues relating to the validity of the paper in question. Two types of validity have been described: internal validity, which relates to the mechanisms of the study itself, and external validity, which is more to do with whether the results of the paper can be extrapolated to the patients in our own practice. In the rest of this article I will concentrate on issues of internal validity, using as an example an imaginary study of olive oil for children with acute otitis media.

Choosing controls

The key issue to think about in relation to internal validity is to look at how a comparison group is chosen in relation to the patients who are given the experimental treatment. In a case-series (for example a set of 6 patients who are given a new treatment in routine practice) there may be no comparison group at all, so the immediate concern is that they might have achieved a good result anyway. For example I might tell you that I have treated a series of 100 children with acute otitis media with warm olive oil and that 85 were better in a few days. This sounds impressive until you look at the results of placebo treatment in antibiotic trials for this condition and find a similar recovery rate.

Better than a case series would be a case-control study in which the records of patients who had prolonged pain following ear infections were checked to see how many had been given olive oil; this proportion receiving olive oil could then be compared to the proportion of olive oil use in other patients who did not have prolonged pain. The problem now is being sure that the children do not have other differences influencing the olive oil usage, and this is rarely possible.
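As a rough sketch of the comparison a case-control study makes, the Python below uses invented counts for the imaginary olive oil study (none of these numbers come from a real trial, and the function names are only illustrative). It compares the proportion of olive oil use among children with prolonged pain (cases) with the proportion among those without (controls), and expresses the contrast as an odds ratio.

```python
from fractions import Fraction

# Hypothetical counts for the imaginary olive oil study (illustration only).
cases_total = 50        # children with prolonged pain
cases_given_oil = 10
controls_total = 100    # children without prolonged pain
controls_given_oil = 40


def proportion(exposed, total):
    """Proportion of a group whose records show olive oil use."""
    return Fraction(exposed, total)


def odds(exposed, total):
    """Odds of olive oil use: exposed divided by unexposed."""
    return Fraction(exposed, total - exposed)


oil_in_cases = proportion(cases_given_oil, cases_total)
oil_in_controls = proportion(controls_given_oil, controls_total)
odds_ratio = odds(cases_given_oil, cases_total) / odds(controls_given_oil, controls_total)

print(f"Olive oil use in cases:    {float(oil_in_cases):.0%}")
print(f"Olive oil use in controls: {float(oil_in_controls):.0%}")
print(f"Odds ratio: {float(odds_ratio):.3f}")
```

An odds ratio below 1 in a table like this would look favourable to olive oil, but it says nothing about whether the two groups differed in other ways that influenced who was given the oil, which is the weakness described above.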

Better still, a group of children could be compared by offering parents the choice of whether they use the oil or not. This would constitute a prospective cohort study, but uncertainty remains about possible important differences between those who choose to have the oil and those who refuse it.

Overcoming Bias

In both the case-control study and the cohort study design the threat to internal validity is related to bias in the choice of the comparison group (selection bias), as well as other possible biases which may be present because both the patient and the doctor are well aware of the treatment that they have received. It will be no surprise to you that the only secure way around these biases is to use a randomised controlled trial that is preferably double-blind, and these will be addressed in the next article.

HRT and heart disease

So are any of these biases important? They certainly can be, and a couple of examples may help to show how. In the early non-randomised studies of Hormone Replacement Therapy the results suggested that women on HRT had lower rates of heart disease, and HRT has therefore been advocated as a measure to reduce the risk of ischaemic heart disease (1). Some of the authors of these early studies did point out that there were some problems, particularly as the rates of road traffic accident deaths were also lower in the group receiving HRT. The more recent evidence from randomised controlled trials (such as the HERS study (2)) has not confirmed the protective effect, and it is probable that the women who opted for HRT had other differences from the control group and may have had generally lower risk factors for heart disease.

Preventing Teenage Pregnancy

Another example of this was a cross-sectional survey in the BMJ reporting the association between teenage pregnancies and practice characteristics in different areas (3). The results include this statement: “On multivariate analysis, practices with at least one female doctor, a young doctor, or more practice nurse time had significantly lower teenage pregnancy rates. Deprivation and fundholding remained significantly associated with higher teenage pregnancy rates.” The problem here is that we have no evidence that the age or sex of the doctors caused the lower rates of pregnancy, and the unexplained association of fundholding practices with higher pregnancy rates should perhaps ring some alarm bells. No one suggested that the end of fundholding would solve the teenage pregnancy problem!

A fuller discussion of association and causation can be found in Follies and Fallacies in Medicine (Tarragon Press) (4), which I would recommend as both amusing and informative background reading for all registrars.


1. Barrett-Connor E, Grady D. Hormone replacement therapy, heart disease and other considerations. Annu Rev Public Health 1998;19:55-72

2. Hulley S, Grady D, Bush T et al. Randomised trial of estrogen plus progestin for secondary prevention of coronary heart disease in postmenopausal women. JAMA 1998;280:605-13

3. Hippisley-Cox J, Allen J, Pringle M, Ebdon D, McPhearson M, Churchill D, Bradley S. Association between teenage pregnancy rates and the age and sex of general practitioners: cross sectional survey in Trent 1994-7. BMJ 2000;320:842-5

4. Skrabanek P, McCormick J. Follies and Fallacies in Medicine. Tarragon Press.

Asking a good question (Pulse Article 2001)

This article is part of a series on Critical Reading.

Where do you start when trying to judge papers in medical journals? All too often we are in a hurry and glance briefly at the title and then the conclusion of the abstract. However, I would suggest that you try to get inside the mind of the writer of the article; try to work out why they carried out this piece of work. It is easier to do this if you have a structure to work to, and I suggest using a four-part question at this point.

  1. What are the characteristics of the Patients in the trial?
  2. What is the Intervention being studied?
  3. What is it Compared with?
  4. What Outcomes are measured?

Take a piece of paper and jot down the answers to the four questions shown in the box and you will have a neat summary of the question that your paper is trying to answer. You should have a note of the characteristics of the patients in the trial, the main intervention studied, what it was compared with and what outcomes were measured. You can remember the headings using the acronym PICO (Patient, Intervention, Comparison, and Outcome).
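For readers who keep notes electronically, the same four headings can be held in a small record. The sketch below is only an illustration (the class and field names are assumptions, not an established tool), filled in with the nebulised steroid example discussed later in this article.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PicoQuestion:
    """The four parts of the question a paper is trying to answer."""
    patients: str
    intervention: str
    comparison: str
    outcomes: tuple

    def summary(self):
        """Return the question as a single sentence."""
        return (f"In {self.patients}, does {self.intervention} "
                f"compared with {self.comparison} "
                f"change {', '.join(self.outcomes)}?")


# The question answered by the placebo-controlled trial described later.
published = PicoQuestion(
    patients="people with severe asthma",
    intervention="nebulised fluticasone",
    comparison="placebo",
    outcomes=("oral steroid requirement",),
)

# The same question with the comparison a prescriber may care more about.
preferred = replace(published, comparison="steroid via spacer and metered-dose inhaler")

print(published.summary())
print(preferred.summary())
```

Writing the question out this way makes it obvious when a paper differs from the question you wanted answered in only one of the four parts, here the comparison.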

Is this an important question?

If you have been able to identify the four parts of the question that the paper is trying to answer, the next thing to ask yourself is whether the answer is going to be relevant to you and the patients that you are looking after. Much research is driven by academic or industrial interest and the question may not be relevant to you.

All too often the outcomes chosen are surrogates that are easy to measure but may not reliably indicate whether the treatment will be of real benefit to the patient. Also the comparison may be with the wrong alternative treatment, or the patients in the trial may not be representative of those seen in your practice. Two examples may help to illustrate the point.

Antibiotics for Acute Otitis Media

There is no shortage of randomised controlled trials that have compared one antibiotic with another for the treatment of acute otitis media, and this is an important issue for pharmaceutical companies introducing new antibiotics. However, the first question to answer is whether any antibiotic is needed at all, and this cannot be assessed by comparing two antibiotics with each other. What is needed is evidence from trials comparing antibiotic with placebo to decide how much overall difference they make, and indeed the evidence from all identified trials of this type showed limited benefit of antibiotics balanced by side effects from the treatment (1).

Nebulised Steroids in Asthma

Here again the crucial question is what nebulised steroids are compared with; the obvious alternative delivery method is a spacer and metered-dose inhaler, since the two delivery methods appear to be equally effective when used for delivery of beta-agonists in acute asthma (2). In spite of this there are very few randomised controlled trials that compare these two delivery methods for steroids. Nebulised fluticasone has been shown to reduce the requirements for oral steroids in severe asthmatics when compared with placebo, but to my mind this is not really the key issue. After all, nebulised steroids cost considerably more than spacer delivery, so we need clear evidence of superiority against spacers, not placebos, in this instance.

In a nutshell

So in summary, use the four-part question to summarise what the paper is about and then decide if it is a question that is worth spending the time to read in more detail. Consider if the question is an important one, and if it is, you will then need to think about the validity of the research method used before taking too much notice of the results; this will be the subject of the next article in this series.


1. Del Mar C, Glasziou P, Hayem M. Are antibiotics indicated as initial treatment for children with acute otitis media? A meta-analysis. BMJ 1997;314:1526-9

2. Cates C J, Rowe BH. Holding chambers versus nebulisers for beta-agonist treatment of acute asthma (Cochrane Review). In: The Cochrane Library, Issue 2, 2000. Oxford: Update Software.

Antibiotics and Ear Infections – Patient Handout (BMJ 1999)

Ear infections in children will often get better without needing to use antibiotics; the collected evidence from trials performed in several different countries has shown that most children with ear infections given Paracetamol suspension (such as Calpol) were better in a few days. In fact 17 out of 20 children got better in this way without the use of an antibiotic. In comparison if all 20 children took antibiotics only one extra child got better over the same period, and at present there is no way of knowing which one of the 20 given antibiotics would benefit. Also if the 20 children were all given antibiotics, one was likely to suffer a side-effect as a consequence (such as a rash, diarrhoea or vomiting).
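The figures quoted above can be turned into a number needed to treat with a few lines of arithmetic. This is a minimal sketch in Python using only those counts; the function name is illustrative, and it assumes the one side-effect in 20 is an extra event caused by the antibiotic.

```python
from fractions import Fraction

GROUP_SIZE = 20
better_without_antibiotics = 17    # "17 out of 20 children got better"
better_with_antibiotics = 18       # "only one extra child got better"
side_effects_with_antibiotics = 1  # assumed to be extra events due to treatment


def number_needed(extra_events, group_size):
    """Patients treated for one extra event: 1 / absolute risk difference."""
    risk_difference = Fraction(extra_events, group_size)
    if risk_difference == 0:
        raise ValueError("no difference between groups")
    return 1 / risk_difference


nnt_benefit = number_needed(better_with_antibiotics - better_without_antibiotics, GROUP_SIZE)
nnt_harm = number_needed(side_effects_with_antibiotics, GROUP_SIZE)

print(f"Treat {nnt_benefit} children for one extra recovery")
print(f"Treat {nnt_harm} children for one extra side-effect")
```

On these counts the number needed to treat for benefit and the number needed to treat for one side-effect come out the same, which is the balance the handout describes.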

Antibiotics did not reduce pain in the first 24 hours and there was also no difference in the likelihood of a further ear infection or hearing difficulty. In the Netherlands antibiotics have not been used routinely for some years for ear infections; they have less of a problem with antibiotic resistance than in this country.

Change of Policy

In view of the above evidence we have changed our policy and no longer give antibiotics routinely for ear infections in children. We would recommend treatment with Paracetamol suspension, which will reduce pain and fever. It should be given at full dose until the earache is gone. If the ear infection persists, or the child is particularly unwell, then antibiotics may be tried. This will be discussed on an individual basis with you during your consultation with the doctor.

Cates C. An evidence based approach to reducing antibiotic use in children with acute otitis media: controlled before and after study. BMJ 1999;318(7185):715-6 doi: 10.1136/bmj.318.7185.715

A 2000 paper that changed our contraceptive practice

In 2000 my senior partner presented the results of a paper published in the Lancet (1) comparing the standard combined oestrogen and progesterone method (Yuzpe) for post-coital contraception with two doses of progesterone (levonorgestrel) only. Until then women had to take large numbers of tablets, but a formulation in two single tablets had become available in the UK (Levonelle-2). The comparison was quite clear cut: less vomiting following the progesterone only regimen and also fewer pregnancies.

I decided to check this out further on the Cochrane Library and found a review covering emergency contraception (2), which was updated in March 1999. The review found two randomised controlled trials which compared levonorgestrel and Yuzpe (including the WHO study in the Lancet). The pooled results are displayed graphically below as Cates plots using Visual Rx (version 4). In this case Yuzpe and levonorgestrel have been compared and the graphical displays represent 100 patients who are treated.

Figure 1 demonstrates the pregnancy rates; the green faces are patients who do not fall pregnant whichever regimen they receive, and the one red patient will fall pregnant anyway. The single yellow face represents a patient who would be pregnant if given Yuzpe but not with levonorgestrel. This represents a Number Needed to Treat of 63 (95% CI 45-193) with progesterone only compared to Yuzpe to prevent one extra pregnancy.

Figure 2 looks at the numbers of patients who will vomit; again the green faces will not be sick with either treatment, and the red ones are sick with both. Here the 14 yellow faces will be patients who do not vomit with levonorgestrel but would have done so with Yuzpe. The Number Needed to Treat is 7 (95% CI 7-8) to prevent one patient vomiting.
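The link between a Number Needed to Treat and a display of 100 patients is simply 100 divided by the NNT. The sketch below applies that to the figures quoted in the text; it is a back-conversion from rounded NNTs, not a reproduction of how Visual Rx draws its plots, and the function name is an assumption.

```python
def events_prevented_per_100(nnt):
    """Expected number of patients per 100 treated who avoid the event."""
    return 100 / nnt


# NNT point estimates with their 95% confidence limits, as quoted in the text.
outcomes = {
    "pregnancy": (63, 45, 193),
    "vomiting": (7, 7, 8),
}

for name, (nnt, nnt_low, nnt_high) in outcomes.items():
    point = events_prevented_per_100(nnt)
    # A larger NNT means a smaller benefit, so the limits swap over.
    fewest = events_prevented_per_100(nnt_high)
    most = events_prevented_per_100(nnt_low)
    print(f"{name}: {point:.1f} per 100 (range {fewest:.1f} to {most:.1f})")
```

For vomiting this gives roughly 14 per 100, in line with the 14 yellow faces, and for pregnancy a figure between one and two per 100.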

Although the new treatment was more expensive, we estimated that switching to levonorgestrel should save between one and two pregnancies in one hundred patients attending for post-coital contraception. The extra cost of levonorgestrel in the UK was about £200 per pregnancy prevented; in France it was already available to patients directly from the chemist. For us the extra prescribing cost compared well with the alternative cost and inconvenience of terminations of pregnancy!
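A cost per event prevented is the extra cost of treating one patient multiplied by the Number Needed to Treat. The sketch below runs that relationship backwards from the two numbers quoted in the text (about £200 and an NNT of 63) to show the extra cost per patient they imply. That per-patient figure is a back-calculation for illustration, not a price taken from the source, and the function names are assumptions.

```python
def cost_per_event_prevented(extra_cost_per_patient, nnt):
    """Extra spending needed to prevent one event."""
    return extra_cost_per_patient * nnt


def implied_extra_cost_per_patient(cost_per_event, nnt):
    """Invert the relationship to recover the per-patient extra cost."""
    return cost_per_event / nnt


NNT_PREGNANCY = 63                    # quoted in the text
COST_PER_PREGNANCY_PREVENTED = 200.0  # pounds, "about £200" in the text

implied = implied_extra_cost_per_patient(COST_PER_PREGNANCY_PREVENTED, NNT_PREGNANCY)
check = cost_per_event_prevented(implied, NNT_PREGNANCY)

print(f"Implied extra cost per patient: about £{implied:.2f}")
print(f"Check: £{check:.0f} per pregnancy prevented")
```

Because the result scales directly with the NNT, the same per-patient price difference would give a much higher cost per pregnancy prevented at the upper confidence limit of 193.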

We abandoned Yuzpe in our practice and switched to levonorgestrel instead. The only unhappy member of the practice team was one of my other partners, who had the topic lined up for her own presentation a few weeks later and had to find a new topic to present!


1. Task Force on Postovulatory Methods of Fertility Regulation. Randomised controlled trial of levonorgestrel versus Yuzpe regimen of combined oral contraceptives for emergency contraception. Lancet 1998;352:428-33

2. Cheng L, Gülmezoglu AM, Ezcurra E, Van Look PFA. Interventions for emergency contraception. (Cochrane Review). In: The Cochrane Library, Issue 1, 2000. Oxford: Update Software.


Figure 1: Levonorgestrel v Yuzpe – Patients who became pregnant

Figure 2: Levonorgestrel v Yuzpe – Patients who suffered vomiting