I admit it. Early in my career I was a Pump Jockey. I had received my basic training in air monitoring and was quite proud of the fact that I could calibrate a pump and work out all the logistics of air monitoring. Indeed, it was somewhat magical and heady for me to realize that we could sample the air and actually determine the concentration of specific chemical species within the breathing zone of workers. Armed with my list of Exposure Limits (both ACGIH TLVs and my company's internal limits), I was ready to take on the world of Industrial Hygiene.
I was hot stuff! I understood the basic premise that a ratio of Exposure/Exposure Limit less than one earned a happy face ☺ while an exposure above the exposure limit required some action ☹. Confidence was high and self-doubt and introspection relatively low. If an exposure limit was 10 ppm and I measured a breathing zone exposure of 20 ppm, I was pretty sure that an overexposure had occurred ☹. If I took a single measurement of a breathing zone exposure of 2.1 ppm for the same compound, I would tend to declare the situation "safe" ☺ and not consider doing any more testing. If I was the least bit unsure and took another sample (same scenario, same worker, different day) and got 4.2 ppm, I would still tend to think that this average exposure, less than 50% of the OEL, was safe. If for some crazy reason I took a third sample and got 8.4 ppm, my confidence might be shaken somewhat, but I could still rationalize that the mean and median measured exposures were still below 50% of the OEL, often considered to be the "action level" or point where you would do something to control the exposure ☹.
Enter statistical analysis and my introduction to reality. Indeed, I eventually learned that exposures in essentially all workplace environments are quite variable, even for the same worker doing the same job. I learned that most exposures are well described by either a normal or a lognormal distribution. The normal distribution is the "bell shaped curve" that assigns a likelihood to every possible exposure value. The area from the top of the bell to the left (toward negative infinity) contains 50% of the exposure values, and the area to the right (toward positive infinity) contains the other 50%. So if the population of exposure numbers is highly scattered or diverse, then the width or spread of the bell is relatively broad. It should be noted that the numbers never end; they run to negative infinity on the left and positive infinity on the right. So there is always some finite (but often vanishingly small) chance of any exposure value in this distribution.
A lognormal distribution is one in which the logarithms of the exposures follow a normal distribution. A lognormal distribution of exposures is bounded on the left by zero (just like the real world) and extends to positive infinity on the right. It is asymmetrical, with most exposure values concentrated near zero and a long tail stretching to the right (just like the real world). Indeed, in general, the lognormal distribution does a much better job of describing the distribution of real-world exposures in any homogeneous scenario and should be used by default as long as the data pass a fit test of the lognormal assumption.
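The defining property above, that the logarithms of lognormal exposures are normally distributed, can be checked with a short simulation. Here is a minimal Python sketch; the GM of 4.2 ppm and GSD of 2.0 are illustrative values chosen to match the example data later in this post, not results from any real survey:

```python
import math
import random
import statistics

random.seed(1)

# Illustrative lognormal exposure population: geometric mean 4.2 ppm, GSD 2.0,
# i.e. ln(exposure) is normal with mean ln(4.2) and standard deviation ln(2.0)
mu, sigma = math.log(4.2), math.log(2.0)
exposures = [random.lognormvariate(mu, sigma) for _ in range(10_000)]

# Bounded on the left by zero, just like real-world exposures
assert min(exposures) > 0

# Taking logs recovers the underlying normal parameters
logs = [math.log(x) for x in exposures]
print(f"mean of logs  ~ {statistics.mean(logs):.2f} (true mu    = {mu:.2f})")
print(f"stdev of logs ~ {statistics.stdev(logs):.2f} (true sigma = {sigma:.2f})")

# The long right-hand tail pulls the arithmetic mean above the median (skew)
assert statistics.mean(exposures) > statistics.median(exposures)
```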
The above is statistical reality, but what we folks in the field need is a user-friendly statistical tool to put this rubber to the road. There have been a number of candidates over the years, but the latest and, in my opinion, the greatest is IH STAT, developed by Dr. John Mulhausen, Director of Corporate Safety and Industrial Hygiene at 3M Company. John developed the original spreadsheet program, which has since been modified into its current multilingual version by Daniel Drolet. You can get it at: http://www.aiha.org/get-involved/VolunteerGroups/Pages/Exposure-Assessment-Strategies-Committee.aspx For us English speakers, I suggest downloading the "macro free version" for ease of use.
As an exercise, let's put our data (2.1, 4.2 and 8.4 ppm) into IH STAT and see what we get. The program advises that the data fit both the normal and the lognormal distribution but fit the lognormal better. The error bands around the estimates of the mean are very broad, primarily because we only have three samples. Statistically, the model is much "happier" with 6 or more samples, but that was frankly unheard of in my pump jockey days.
The statistical lognormal fitted model has a geometric standard deviation (GSD) of 2.0. This represents the width of the lognormal curve as discussed above, and a value of 2 is pretty typical. Indeed, it is not until the GSD gets to be greater than 3 that the process is considered to be out of control or the exposure group poorly defined.
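For the three samples above, the geometric mean and GSD that IH STAT reports can be reproduced by hand: take the logs of the measurements, compute their mean and sample standard deviation, and exponentiate. A minimal Python sketch:

```python
import math

# Breathing-zone samples from the post (ppm)
samples = [2.1, 4.2, 8.4]

# Work in log space: the lognormal model treats ln(exposure) as normal
logs = [math.log(x) for x in samples]
n = len(logs)
mean_log = sum(logs) / n
# Sample standard deviation of the logs (n - 1 denominator)
sd_log = math.sqrt(sum((v - mean_log) ** 2 for v in logs) / (n - 1))

gm = math.exp(mean_log)   # geometric mean
gsd = math.exp(sd_log)    # geometric standard deviation

print(f"GM  = {gm:.2f} ppm")   # GM  = 4.20 ppm
print(f"GSD = {gsd:.2f}")      # GSD = 2.00
```

Note that each sample here is exactly double the last, so the logs are evenly spaced and the GSD works out to exactly 2.0.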
What is most interesting about this analysis is that the lognormal distribution predicts that the OEL will be exceeded more than 10% of the time in this exposure scenario. That would mean that for more than 25 days in a 250-day working year the exposure in this scenario would be predicted to exceed the exposure limit (OEL). If I had known this in my heady days as a pump jockey, it would have given me pause. Indeed, there was advice around even in those days from NIOSH that if the GSD was 2 then the "action level" should be about 10% of the OEL. Thus, the above data were all above this recommended action level. Unfortunately, absent wonderful tools like IH STAT, few were doing detailed statistical analysis in those days (the 1970s) and I certainly was not.
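That "greater than 10%" exceedance figure follows directly from the fitted lognormal model: convert the OEL to a z-score in log space and take the upper tail of the standard normal distribution. A minimal Python sketch using the example numbers (OEL 10 ppm, GM 4.2 ppm, GSD 2.0):

```python
import math

def lognormal_exceedance(oel, gm, gsd):
    """Fraction of exposures predicted to exceed the OEL under a lognormal model."""
    z = (math.log(oel) - math.log(gm)) / math.log(gsd)
    # Upper tail of the standard normal via the complementary error function
    return 0.5 * math.erfc(z / math.sqrt(2))

frac = lognormal_exceedance(oel=10.0, gm=4.2, gsd=2.0)
print(f"exceedance fraction: {frac:.1%}")           # about 10.5%
print(f"days in a 250-day year: {250 * frac:.0f}")  # about 26
```

So even though every measured value sat comfortably below the OEL, the fitted distribution predicts roughly 26 overexposure days per working year.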
The Pennsylvania Dutch have a wonderful saying: "Too soon old and too late smart". It is definitely not too late for you to rise from pump jockey status to that of exposure assessor using this remarkable tool.
Mike
Well said. As a laboratorian, one familiar with the variability of instrumental measurements, not to mention the variability of exposure you so succinctly described, it was always troubling when a client treated a single result as unequivocal evidence of an exposure, one way or the other. As a commercial laboratory, you can imagine the raised eyebrows when advising clients to take more samples to truly characterize the exposure. Best regards, Bob Lieckfield, Jr., CIH, Bureau Veritas North America.
I have never been in a position where statistical modeling like this was necessary. I've always been curious, though, and I apologize if this is a dumb question: How well validated is the lognormal model? To use the above example, how many times has anyone performed 250 consecutive days of sampling on the same task to document the lognormal exposure distribution and the 10% overexposure prediction? Would such sampling not be likely to reduce the GSD and change the percentage of overexposures predicted?
It is not a dumb question. Indeed, it has been discussed quite recently among folks with more statistical knowledge than myself. It is my understanding that the lognormal distribution does fit large data sets as well as or better than most other distributions. Given the uncertainty from just 3 samples in the example, a sample of 250 days will almost certainly be different from the 10% predicted. This was just the best prediction based on the available data. There does remain, however, a lot of uncertainty, which is also statistically estimated. Indeed, the GSD could go up or down; it was simply the best estimate from only 3 samples.
Hi Mike,
Thank you very much for a very informative piece.
I tried downloading the mentioned file but it seems the link is not functioning. Is there another portal that can be used?
Dear Anonymous,
Sorry you could not get the files. Tell me which you want and I will send them to you. mjayjock@gmail.com
Mike
Too bad you were not my teacher in college and/or trainer in the workplace!
This guy knows what he is talking about.