Medical vs N95 Masks in Healthcare Workers
The Study of the Week breaks down the recent Medical vs N95 mask study in the Annals of Internal Medicine
I recently sat through my N95 fit test. All clinicians had to comply. And we have a lot of clinicians. It is a big commitment in time and money and person-power to fit these special masks.
The idea is simple: clinicians caring for patients with COVID-19 are especially susceptible to infection. N95s have to be better than medical masks, which leave huge gaps for viruses to float in and about. The reason surgeons wear medical masks during surgery is so that they don’t sneeze or drool into a wound, not to stop tiny viruses.
What’s weird, though, is that while the CDC recommends N95 masks for routine care of patients with COVID-19, the WHO recommends only medical masks.
So, there is tension. Two groups of experts disagree. We call this equipoise.
One way to resolve the tension is to do what one group believes and ignore the other (masking young children, for instance). The better way to resolve equipoise is to do a randomized clinical trial.
The McMaster group in Canada led a multi-center (29 centers) trial in four countries comparing the use of medical masks vs N95s during routine care of patients with COVID-19. (N95 masks were used in both groups during procedures that create aerosols.) The Annals of Internal Medicine published the study, which is open access.
About 1000 healthcare workers were randomized equally during the heat of the pandemic. These were highly susceptible individuals. The authors excluded workers who had a previous SARS-CoV-2 infection or those who had been vaccinated.
The primary endpoint was a positive PCR test for SARS-CoV-2.
52 of 497 (10.5%) participants in the medical mask group versus 47 of 507 (9.3%) in the N95 respirator group tested positive. That is an absolute risk difference of 1.2 percentage points. The authors reported a hazard ratio of 1.14, meaning a 14% higher rate in the medical mask group.
But that is the point estimate. The 95% confidence interval ranged from 0.77 (a 23% lower rate) to 1.69 (a 69% higher rate).
The statistical plan, which is set out beforehand, called for a noninferiority analysis. If the worst-case scenario, the upper bound of the confidence interval, was less than 2 (or twice as bad), then medical masks would be deemed noninferior to N95s. Clearly that was the case here.
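For readers who like to check the arithmetic, here is a minimal Python sketch using only the counts reported above. One caveat: the trial reported a hazard ratio from a time-to-event model, which I cannot reproduce from summary counts. So the sketch computes a crude risk ratio with a simple Wald interval instead, and it will be close to, but not the same as, the published 1.14 (0.77 to 1.69). The helper names are my own and are not from the paper.

```python
import math

# Counts reported in the article.
events_medical, n_medical = 52, 497
events_n95, n_n95 = 47, 507

# Noninferiority margin described in the article: the upper bound must be below 2.
NI_MARGIN = 2.0


def crude_risk_ratio_ci(a, n1, b, n2, z=1.96):
    """Crude risk ratio with a Wald 95% interval on the log scale.

    Hypothetical helper for illustration; this is NOT the hazard ratio
    model the trial authors used.
    """
    rr = (a / n1) / (b / n2)
    se_log_rr = math.sqrt(1 / a - 1 / n1 + 1 / b - 1 / n2)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


def is_noninferior(upper_bound, margin=NI_MARGIN):
    """Noninferiority holds when the worst case stays below the margin."""
    return upper_bound < margin


risk_medical = events_medical / n_medical
risk_n95 = events_n95 / n_n95
risk_difference = risk_medical - risk_n95
rr, lower, upper = crude_risk_ratio_ci(events_medical, n_medical, events_n95, n_n95)

print(f"Medical mask risk: {risk_medical:.1%}")
print(f"N95 risk:          {risk_n95:.1%}")
print(f"Risk difference:   {risk_difference * 100:.1f} percentage points")
print(f"Crude risk ratio:  {rr:.2f} ({lower:.2f} to {upper:.2f})")
print(f"Crude upper bound below {NI_MARGIN}? {is_noninferior(upper)}")
print(f"Published upper bound 1.69 below {NI_MARGIN}? {is_noninferior(1.69)}")
```

Either way you slice it, the upper bound sits below the margin of 2, which is all the noninferiority test asks.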
Investigators choose noninferiority designs when the active arm offers something desirable. In drug studies, noninferiority designs are used when the new drug is more convenient to use (direct-acting anticoagulant vs warfarin); in surgery studies, noninferiority designs were used to study transcatheter aortic valve implantation (TAVI) vs surgery because TAVI is less invasive.
Here the medical mask is clearly easier to use and less costly, so a noninferiority design works well.
The authors then added a subgroup analysis based on country. They called it an “unplanned analysis.” This was likely forced on them by reviewers or editors.
I would ignore it. Even in the best case scenarios, say when a trial finds a highly positive result and the subgroups are pre-specified, subgroups are difficult to interpret.
This is because a trial is “powered” to sort out signal from noise in the main results. When you slice the trial into smaller subgroups, you increase the rate of false positives, i.e., finding noise, not signal.
Richard Peto famously showed this in a landmark cardiology trial called ISIS-2, which found a benefit to aspirin after heart attack. Editors wanted to know which group had more or less benefit. Peto refused. But the Lancet forced him.
So, to make his point, he did an analysis of aspirin effects based on astrological sign. He found that Libra or Gemini patients got no benefit from aspirin, but all other signs had massive benefit.
This was a beautiful demonstration of how subgroups pick up noise.
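You can see the multiple-comparisons part of the problem with a few lines of Python. This is a rough sketch under simplifying assumptions that are mine, not Peto's: there is no true effect in any subgroup, the subgroup tests are independent, and each test's p-value is uniformly distributed under the null. The function names are made up for illustration.

```python
import random

ALPHA = 0.05  # conventional significance threshold


def familywise_error(n_subgroups, alpha=ALPHA):
    """Chance that at least one of n independent null tests looks 'significant'."""
    return 1 - (1 - alpha) ** n_subgroups


def simulate_any_false_positive(n_subgroups, n_trials=20000, alpha=ALPHA, seed=0):
    """Monte Carlo check of the formula above.

    Assumption: under the null, each subgroup p-value is a uniform draw on [0, 1).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        if any(rng.random() < alpha for _ in range(n_subgroups)):
            hits += 1
    return hits / n_trials


for k in (1, 4, 12):
    print(f"{k:>2} subgroups: formula {familywise_error(k):.2f}, "
          f"simulated {simulate_any_false_positive(k):.2f}")
```

With 12 subgroups, one per astrological sign, the chance of at least one spurious "finding" is a little under one in two, even when nothing real is going on.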
This is not complicated. Medical masks were noninferior to N95s in preventing healthcare workers from turning positive for SARS-CoV-2 infection.
This study created quite a stir on the Internet.
I’ve already addressed one line of criticism, the subgroups. Critics say the N95 masks work better in Canada. But we’ve already set out that subgroups are fraught due to smaller numbers.
Another criticism holds that the reason why there were no significant differences is that healthcare workers could get infected outside the hospital. Pediatrician and voice of reason Alasdair Munro had a nice explanation on his Substack.
Munro points out that the question of the study is not: do masks work? We know that in a physics lab, the N95 filters more virus than a medical mask. Heck, you don’t even need a physics lab; just look at the profile of someone wearing a medical mask. Masks aren’t used in physics labs; they are used in the messy real world.
This study asked the question of how the two masks function in the real world. Of course, healthcare workers have lives outside the hospital. And of course, many healthcare workers have been exposed to the virus and have some immunity.
I call these competing factors affecting the primary endpoint.
This is why cardiology (and cancer) studies enroll near perfect patients. You want to minimize the role of competing causes of the primary outcome—which is usually mortality. Let’s say you did a study of a heart drug in 90-year-olds. It could be an amazing drug but it would not reduce mortality in people this old, because there are oodles of things that can cause death in these patients.
It’s the same with healthcare workers and turning positive on a PCR test.
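Here is a toy Python sketch of how a competing source of infection dilutes the measured difference between masks. Every number in it is hypothetical and chosen only for illustration; none come from the trial. It also assumes that infection at work and infection outside work are independent events.

```python
def combined_risk(p_work, p_outside):
    """Chance of infection from either source, assuming the two are independent."""
    return 1 - (1 - p_work) * (1 - p_outside)


def observed_risk_ratio(p_work_medical, p_work_n95, p_outside):
    """Risk ratio a trial would see once outside-of-work infections are mixed in."""
    return (combined_risk(p_work_medical, p_outside)
            / combined_risk(p_work_n95, p_outside))


# Hypothetical at-work risks: pretend the N95 halves the risk at the bedside.
P_WORK_MEDICAL = 0.04
P_WORK_N95 = 0.02

for p_outside in (0.0, 0.05, 0.10, 0.20):
    ratio = observed_risk_ratio(P_WORK_MEDICAL, P_WORK_N95, p_outside)
    print(f"Outside-of-work risk {p_outside:.0%}: observed risk ratio {ratio:.2f}")
```

In this made-up scenario the mask is twice as good at the bedside, yet the ratio a trial would observe slides toward 1 as the outside-of-work risk grows.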
Before this study, CDC experts felt that the exposure to SARS-CoV-2 was SO HIGH when caring for infected patients that it would overwhelm the cumulative exposure outside of work. And, therefore, we need to use the better mask. The WHO did not feel this was the case.
This study finds that the WHO experts were correct. It’s great to know that. We can change policy and deliver care more efficiently.
Keep in mind that if this study were repeated now, in the presence of even higher levels of vaccination and immunity (i.e., more competing factors), medical masks would surely be noninferior again.
I only wish we had done more of these types of studies during the pandemic. Gosh, we would know so much more than we do now.