This is a plain-language summary of our peer-reviewed research
Measuring the fitted filtration efficiency of cloth masks, medical masks and respirators
Amanda A Tomkins, Gurleen Dulai, Ranmeet Dulai, Sarah Rassenberg, Darren Lawless, Scott Laengert, Rebecca S Rudman, Shiblul Hasan, Charles-Francois de Lannoy, Ken G Drouillard, Catherine M Clase
Background. There were a number of studies of different types of masks, but they had the following problems:
Not many participants; some studies are based on one male volunteer
No ethnic diversity and women were not well represented among the participants
Only a few types of masks studied on the same participant(s) at the same time
The cloth masks studied were haphazardly chosen and poorly described
These studies are informative and they are summarized on our page Mask Types.
However, it is always preferable to compare data within the same study, under the same conditions, rather than to draw comparisons across different studies. When we study things in a planned way on multiple people, we can use statistical testing to say whether the differences in percentage filtration are likely to be real, or whether they are likely to have arisen by chance. The short-hand for this idea is ‘statistical significance’.
Our work aimed to fill this knowledge gap.
These are our main results as a graph.
The vertical axis is the filtration efficiency as a %. This is the percentage filtration provided by the mask as worn by the participants.
The central line on each bar shows the median value for the participants tested.
When two sets of data were statistically the same, they share a letter.
For example, the cloth Essex mask worn on earloops shares the letter ‘a’ with the cloth Essex mask worn on overhead ties, with both level 1 (L1) medical masks, with one of the level 3 (L3) medical masks, and with one of the KN95s. The interpretation is that, within the limits of our study, we could not detect a difference in filtration between these masks.
Bars that are not marked with an ‘a’ filtered better than cloth masks and level 1 medical masks. These bars represent one of the two level 3 masks, 4 out of 5 of the KN95/KF94s and both of the N95s.
The N95s are marked with a letter ‘g’ which is not shared by any other mask, to show that the better filtration of the N95s is statistically significantly different from that of all the other masks tested.
We compared different types of masks using a Portacount particle counter, the same technology that occupational health departments use to assess how well respirators fit. We used particles between 0.02 and 1 micron to assess filtration. For comparison, a typical virus is 0.1 micron, and a human hair is 70 microns. Viruses are excreted in respiratory particles that may contain several viruses, and also contain mucus and shed cells.
Aerosols are particles that remain suspended in the air. Respiratory particles up to 100 microns are aerosols. Most infectious particles are between 0.5 and 5 microns in size. Particles that are 2.5 microns or less are measured as PM2.5, which is an important measure of outdoor air quality, both in general and in the context of wildfires. PM10, also used to assess air quality, reflects particles 10 microns or less.
In testing 0.02 to 1 micron, we are testing at the lower end of the relevant size range for infectious aerosols, that is, the smaller particles that are harder to filter. We are also testing at the lower end of the relevant size range for wildfires and air quality.
Perfect filtration is 100% and no filtration 0%.
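As a rough illustration of how a percentage like this can be derived from particle counts, here is a minimal Python sketch. It assumes that fitted filtration efficiency is one minus the ratio of the particle concentration inside the mask to the concentration in the room, expressed as a percentage. The function name and the example concentrations are hypothetical and are not taken from the study.

```python
def fitted_filtration_efficiency(inside_count: float, room_count: float) -> float:
    """Percentage of room particles that did not reach the inside of the mask.

    Assumed formula (not quoted from the study):
        FFE = 100 * (1 - inside / room)
    """
    if room_count <= 0:
        raise ValueError("room_count must be positive")
    if inside_count < 0:
        raise ValueError("inside_count must not be negative")
    return 100.0 * (1.0 - inside_count / room_count)


# Hypothetical particle concentrations (particles per cubic centimetre).
print(fitted_filtration_efficiency(0, 5000))     # nothing gets in: perfect filtration
print(fitted_filtration_efficiency(2500, 5000))  # half of the particles get in
print(fitted_filtration_efficiency(5000, 5000))  # everything gets in: no filtration
```

On this assumed definition, a mask that lets in half of the room particles scores 50%, which is roughly where the cloth and level 1 medical masks in this study sit.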
In this second experiment, we looked at mask hacks - minor modifications to masks in the hope of improving efficiency.
Previous studies have suggested that some of these hacks may be effective, but the studies are small and involved few participants.
Wearing the earloops on an earguard or scrubcap, or using the knot & tuck method did not improve filtration for medical level 1 or level 3 masks. It’s also useful to know that they don't make fitted filtration worse: many people find these modifications more comfortable.
However, the three different mask braces we tested all improved fitted filtration by reducing the edge leak and improving the seal. Level 3 masks worn with braces resulted in fitted filtration >90%. These are among the highest results we observed other than N95s.
A medical mask with a mask brace is a good alternative if N95s are not available, for example, in disasters and low-resource settings.
However, in our assessment of discomfort, combinations with mask braces generally were scored significantly worse than when the same mask was worn alone (supplementary figure 4).
In the third and final experiment, we tested the effect of double-masking, wearing a cloth mask over a level 1 or level 3 medical mask.
Level 1 and level 3 medical masks are made of material that is an excellent filter. The middle layer of these masks is meltblown polypropylene. These haphazardly arranged tiny fibres are excellent at trapping tiny particles. Unfortunately, meltblown is fragile, so the sheet of meltblown is supported on either side by an outer and an inner layer of spunbond polypropylene. Meltblown is also not washable. These masks, when worn, filter imperfectly (50% and 50-70% respectively) because of leaks around the edge.
Well-designed cloth masks have a complementary problem. The two layers of woven material are not as good a filter as meltblown. However, because of the good design and the flexibility of the cloth, they can make a good seal to the face. These masks, when worn, filter imperfectly (50%) because of particles coming through the material.
The idea of overmasking is to combine the better edge seal of the cloth mask with the better filtration material of the medical mask.
This idea was borne out. There was a statistically significant difference in fitted filtration, for level 1 and level 3 masks, when a cloth mask was worn over the top. This happened whether the overmask was on earloops or on ties, though in general, fitted filtration was highest when the overmask was on ties.
Across all the experiments, we asked the wearers to assess each mask for glasses fog and to report how severe it was, using a standardized scale. For participants who didn’t normally wear glasses, we provided safety glasses for this assessment. We also asked them to report the leakiness of the mask, again using a standardized scale. These subjective assessments were made before we formally tested the mask, so at that point neither the participants nor the tester knew what the fitted filtration result would be.
We found a relationship between measured filtration and the subjective assessment of glasses fog, and a stronger relationship between measured filtration and the subjective assessment of overall leak.
When we looked at the data in more detail, some differences emerged. All respirators were assigned the lowest leak score of 1. In keeping with their combination of excellent filtration material and excellent seal, these masks have the highest filtration efficiencies.
The relationship between subjective leak and measured fitted filtration efficiency was strongest for level 1 and level 3 masks, and was weak for cloth masks and for KN95/KF94s. This suggests that for level 1 and level 3 masks, a wearer’s impression of how good the edge seal is (that is, how little edge leak they report) predicts filtration. If a wearer thinks a mask is leaking badly, it likely is.
A volunteer testing an Essex cloth mask on overhead ties. The Portacount counts particles in the 0.02 to 1 micron range sampled from the ends of the tubes. The tubing is double - a clear tube is attached to a small metal device called the probe, which allows it to sample air from inside the masks. The blue tube ends at chest height and samples air in the room. We supplement the particles in the room with salt particles from a particle generator.
Figure 2. Box and whisker plot showing the effect of minor modifications, or hacks, to a certified level 1 Polar Bear mask and to a certified level 3 Halyard mask. 10 participants, 1 replicate. N = 10. Data were not normal by the Lilliefors test. Kruskal-Wallis with Conover-Inman post hoc comparisons. Boxes show interquartile range and whiskers minimum and maximum. Letters denote groups which are statistically similar and dissimilar: a shared letter for two mask types signifies no difference between those types; absence of a shared letter signifies a significant difference p<0.05. Neoprene brace made using downloadable, public domain, template from Fix The Mask and recommended materials; silicone brace designed at McMaster University; FTM brace - proprietary Fix-The-Mask brace. L1 and L3 controls were retested on these participants as part of this panel; estimates differ slightly from those in Fig 1.
Fig 3. Effects of overmasking with Essex masks on earloops and on overhead ties on fitted filtration efficiency. The top graph shows (left) the Essex mask on earloops (Earloop) and on ties (Ties) worn alone, followed by Essex-on-earloop with a second Essex-on-earloop as an overmask (Earloop-Earloop), and by Essex-on-earloop with a second Essex mask on ties as an overmask (Earloop-Ties). The centre panel shows the level 1 certified Polar Bear mask worn alone (L1), with an Essex-on-earloop as an overmask (L1-Earloop), and with an Essex-on-ties as an overmask (L1-Ties). The right panel shows the level 3 certified Halyard mask worn alone (L3), with an Essex-on-earloop as an overmask (L3-Earloop), and with an Essex-on-ties as an overmask (L3-Ties). N = 6, 3 replicates. Data were normal by the Lilliefors test. ANOVA with Tukey’s honestly significant difference for post hoc comparisons was used. Mean and median; boxes show one standard deviation (SD); whiskers show 95% confidence intervals. Letters denote groups which are statistically similar and dissimilar: a shared letter for two mask types signifies no difference between those types; absence of a shared letter signifies a difference. Earloop: Essex mask worn on elastic earloops. Ties: Essex mask worn on overhead cloth ties. L1 level 1; L3 level 3.
Fig 4. Relationship between fitted filtration efficiency and glasses fog, and between fitted filtration efficiency and subjective leak. Glasses fog and subjective leak were assessed before fitted filtration efficiency was measured. Top graphic fitted filtration efficiency against glasses fog score across data generated for each sub-study. Open circles are raw data, squares are means and whiskers are standard deviations for each score category. Dashed line is the linear regression fit: FFE = -1.78±0.48*Glasses Fog Score + 79.3±1.7; R2 = 0.04; p<0.001, df = 338. Bottom graphic presents data against leak score. Dashed line regression fit: FFE = -5.5±0.6*Leak Score + 88.5±1.7; R2=0.22; p<0.001, df =338.
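To make the two regression lines in the Fig 4 caption concrete, here is a minimal Python sketch that evaluates them at a few scores. The slopes and intercepts are the central estimates quoted in the caption, and the ± uncertainties are ignored. The function name and the scores tried are hypothetical choices for illustration, since the full range of each scale is not given here.

```python
# Central estimates from the Fig 4 caption, as (slope, intercept).
GLASSES_FOG_FIT = (-1.78, 79.3)
LEAK_FIT = (-5.5, 88.5)


def predicted_ffe(score: float, fit: tuple) -> float:
    """Evaluate the straight-line fit FFE = slope * score + intercept."""
    slope, intercept = fit
    return slope * score + intercept


# Hypothetical scores, chosen only to show how the fitted lines behave.
for score in (1, 2, 3):
    fog = predicted_ffe(score, GLASSES_FOG_FIT)
    leak = predicted_ffe(score, LEAK_FIT)
    print(f"score {score}: fog fit {fog:.1f}%, leak fit {leak:.1f}%")
```

The leak line falls about three times faster per point than the glasses fog line, which matches the stronger relationship reported for subjective leak. With R2 values of 0.04 and 0.22, both lines describe an average trend and are poor predictors for any one wearer.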
When a new respirator design is brought forward, it must be tested on a panel of people who are meant to represent the general population. The data used to define this panel were drawn from predominantly white men some decades ago. They don’t represent the general population and certainly don’t represent health workers, among whom women and people of diverse ancestry are highly represented. The data are also based on anthropometric measurements made with calipers, ie, they are vertical distances through the head. We thought that when trying to fit a mask to the surface of the face it would make more sense to measure the surface distance, as if fitting clothes. We measured our participants both ways, and examined the relationship between the caliper measurements and measurements made with tape: there was no correlation.
In conclusion, we showed that well-designed 2-layer woven-cotton cloth masks and medical masks performed about the same, at 50% filtration. Some level 3 masks, and most KN95/KF94 masks, performed around 70%. In times of shortage, we recommend wearing a cloth mask on overhead ties over a medical mask (close to 90%), or a mask brace over a medical mask (above 90%). Even without fit testing, N95s performed best, > 95% in all cases.
The differences in filtration are more marked when considered as exposure. 30% leak around a KN95 and 5% leak around an unfitted N95 results in 6 times the exposure with the KN95. 50% leak around a well-designed cloth mask or a level 1 medical mask is 10 times the exposure of the 5% leak around an unfitted N95 and 50 times the < 1% leak that is permissible for a fitted N95.
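The arithmetic behind these comparisons is a simple ratio of leak percentages. A minimal Python sketch, using the leak figures quoted above (the function name is hypothetical):

```python
def exposure_ratio(leak_a_pct: float, leak_b_pct: float) -> float:
    """How many times more exposure mask A allows than mask B.

    Each argument is the percentage of particles that gets past the mask,
    ie, 100 minus the fitted filtration efficiency.
    """
    if leak_b_pct <= 0:
        raise ValueError("leak_b_pct must be positive")
    return leak_a_pct / leak_b_pct


print(exposure_ratio(30, 5))  # KN95 (30% leak) vs unfitted N95 (5% leak)
print(exposure_ratio(50, 5))  # cloth or level 1 mask (50% leak) vs unfitted N95
print(exposure_ratio(50, 1))  # cloth or level 1 mask vs fitted N95 at the 1% limit
```

The three calls give 6, 10 and 50. Filtration figures of 70% and 95% can look fairly close, while the corresponding exposures differ several-fold.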
Exposure is relevant because the higher the exposure, the higher the likelihood of developing clinical disease. For some diseases, including Covid, higher exposures likely lead to more severe disease, as well as an increased chance of developing clinical disease.
There is no doubt that for the protection of the wearer, the best mask is a respirator (N95 or FFP3). In regulated environments, individuals are fit tested with specific masks, to ensure <1% leak. In our volunteers, the N95s we tested all filtered above 95%. However, our sample size was limited to 4 individuals, and we studied only 2 different respirators. Not all N95s will fit all individuals this well.
However, all masks work. We studied aerosols, not droplets, and aerosols at the lower end of the relevant range. We studied protection of the wearer, not source control; all masks also provide source control, but that is a different question. Knowing that imperfect masks work, to some extent, explains the epidemiologic data showing reductions in transmission during periods of mask mandates, at times when the masks worn were largely cloth masks, neither optimized nor standardized. It explains the CDC data showing that protection of the individual varied with mask type. (These studies are listed on our Why Masks Matter page). It explains why hospitals experience decreases in outbreaks when they institute mask mandates and higher rates when they remove them, even if the masks in question are largely level 1 medical masks.
Our data also offer insight into the randomized literature on masks. If medical masks don’t work very well, then a trial of medical masks worn by individuals to protect themselves will need to be very large to have a reasonable chance of showing a difference, if one exists. The negative result of the Danmask trial, which studied wearing medical masks when out in the community, likely reflects a power issue (ie, not enough participants, not enough time, not enough events), among other problems. If medical masks work to some extent, and N95s work very well, then when they are compared, a trial will need to be very large to show a difference, particularly if the N95s are not worn continuously. The negative result of the Loeb trial, which studied switching to an N95 when a high-risk situation had been identified, compared with continuing to wear a medical mask, likely reflects the same kind of power issue, among other problems. Finally, the ability of cloth and medical masks to filter aerosols and reduce exposure explains why the largest trial conducted on this question, a community implementation of cloth masks and medical masks in Bangladesh, did show a difference in its primary outcome, a reduction in symptoms and evidence of COVID-19 infection.
Fig 1. Fitted filtration efficiency for cloth (2), L1 (2) and L3 (2) certified medical masks, a non-certified Kegis 3D mask (purchased as a KN95 look-alike), KF94s (2), KN95s (3), and for respirators, N95 and CaN99. N = 4. Bars present mean and standard deviation (SD) and whiskers show 5 - 95 confidence values. Data were normal by the Lilliefors test. Letters above whiskers indicate statistical groupings according to Tukey’s post hoc comparisons. A shared letter for two mask types signifies no difference between those types; absence of a shared letter signifies a significant difference p<0.05.