I admit it. Early in
my career I was a Pump Jockey. I had
received my basic training in air monitoring and was quite proud of the fact
that I could calibrate a pump and work out all the logistics of air
monitoring. Indeed, it was somewhat
magical and heady for me to realize that we can sample the air and actually
determine the concentration of specific chemical species within the air of the
breathing zone of workers.

Armed with my list of **Exposure Limits** (both ACGIH TLVs and my company's internal limits) I was ready to take on the world of Industrial Hygiene. I was hot stuff! I understood the basic premise that a ratio of **Exposure/Exposure Limit** less than one had a happy face ☺ while an exposure above the exposure limit required some action ☹. Confidence was high and self-doubt and introspection relatively low. If an exposure limit was 10 ppm and I measured a breathing zone **exposure** of 20 ppm I was pretty sure that an overexposure had occurred ☹. If I took a single measurement of a breathing zone **exposure** of 2.1 ppm for the same compound I would tend to declare the situation "safe" ☺ and not consider doing any more testing. If I was the least bit unsure and took another sample (same scenario, same worker, different day) and got 4.2 ppm I would still tend to think that this average **exposure**, less than 50% of the OEL, was safe. If for some crazy reason I took a third sample and got 8.4 ppm my confidence might be shaken somewhat, but I could still rationalize that the mean and median measured **exposures** were still below 50% of the OEL, often considered to be the "action level" or point where you would do something to control the **exposure** ☹.
Enter statistical analysis and my introduction to reality. Indeed, I eventually learned that **exposures** in essentially all workplace environments are quite variable, even for the same worker doing the same job. I learned that most **exposures** are well described by either a normal or lognormal distribution.

The normal distribution is the "bell-shaped curve" that assigns a likelihood to every possible **exposure** value. The area under the bell from its peak to the left (toward negative infinity) holds 50% of the **exposure** values, and the area to the right (toward positive infinity) holds the other 50%. If the population of **exposure** numbers is highly scattered or diverse, then the width or spread of the bell is relatively broad. It should be noted that the numbers never end; they run to negative infinity on the left and positive infinity on the right, so there is always some finite (but often vanishingly small) chance of any **exposure** in this distribution.

A lognormal distribution is simply a distribution whose logarithm is normally distributed. The **exposures** in a lognormal distribution are bounded on the left by zero (just like the real world) and extend to positive infinity on the right. It is skewed, or asymmetrical, with more **exposure** values concentrated near zero and a long tail stretching to the right (just like the real world). Indeed, in general, the lognormal distribution does a much better job of describing the distribution of real-world **exposures** in any homogeneous scenario and should be used by default as long as the data pass a fit test of the lognormal assumption.
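To make those two shapes concrete, here is a minimal simulation sketch in Python (standard library only; it is not part of IH STAT, and the geometric mean of 4.2 ppm and GSD of 2.0 are simply illustrative values). It draws lognormal "exposures" and checks that they are bounded below by zero, skewed toward zero with a long right tail, and approximately normal once logged:

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the illustration is reproducible

# Simulated exposures: lognormal with geometric mean 4.2 ppm and GSD 2.0,
# i.e. the logs are normal with mean ln(4.2) and sd ln(2.0).
mu, sigma = math.log(4.2), math.log(2.0)
samples = [random.lognormvariate(mu, sigma) for _ in range(10_000)]

print(min(samples) > 0)   # True: bounded on the left by zero
print(statistics.mean(samples) > statistics.median(samples))  # True: right-skewed

# Taking logs recovers an (approximately) normal distribution
# centered near mu:
logs = [math.log(x) for x in samples]
print(abs(statistics.mean(logs) - mu) < 0.05)  # True
```

The mean exceeding the median is the signature of the right skew described above: a few large exposures pull the arithmetic mean upward while most values sit near zero.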
The above is statistical reality, but what we folks in the field need is a user-friendly statistical tool to put this rubber to the road. There have been a number of candidates over the years, but the latest and, in my opinion, the greatest is IH STAT, developed by Dr. John Mulhausen, who is the Director of Corporate Safety and Industrial Hygiene at 3M Company. John developed the original spreadsheet program, and over the years it has been modified into its current multilingual version by Daniel Drolet. You can get it at: http://www.aiha.org/get-involved/VolunteerGroups/Pages/Exposure-Assessment-Strategies-Committee.aspx For us English speakers, I suggest downloading the "macro free version" for ease of use.

As an exercise let’s put our data 2.1, 4.2 and 8.4 ppm into
IH STAT and see what we get. The
program advises that the data fit both the normal and lognormal distributions
but fit the lognormal better. The error
bands around the estimates of the mean are very broad primarily because we only
have three samples. Statistically, the
model is much “happier” with 6 or more samples but that was frankly unheard of
in my pump jockey days.

The statistical lognormal fitted model has a geometric
standard deviation (GSD) of 2.0. This represents the width of the lognormal
curve as discussed above and a value of 2 is pretty typical. Indeed, it is not until the GSD gets to be
greater than 3 that the process is considered to be out of control or the **exposure** group poorly defined.
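For readers who want to check IH STAT's numbers by hand, here is a minimal sketch in Python (standard library only) that computes the geometric mean and geometric standard deviation from the three example measurements. The geometric mean is the exponential of the mean of the logs, and the GSD is the exponential of the sample standard deviation of the logs:

```python
import math
import statistics

exposures_ppm = [2.1, 4.2, 8.4]  # the three breathing-zone samples

logs = [math.log(x) for x in exposures_ppm]
gm = math.exp(statistics.mean(logs))    # geometric mean of the exposures
gsd = math.exp(statistics.stdev(logs))  # geometric standard deviation
                                        # (sample sd of the logs, n-1)

print(f"GM  = {gm:.1f} ppm")  # GM  = 4.2 ppm
print(f"GSD = {gsd:.1f}")     # GSD = 2.0
```

Because each sample here is exactly double the previous one, the logs are evenly spaced and the GSD works out to exactly 2.0, matching the fitted model.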
What is most interesting about this analysis is that the
lognormal distribution predicts that greater than 10% of the time the OEL will
be exceeded in this exposure scenario.
That would mean that for more than 25 days in a 250-day working year the **exposure** in this scenario would be predicted to exceed the **exposure limit** (OEL). If I had known this in my heady days as a pump jockey it would have given me pause. Indeed, there was advice around even in those days from NIOSH that if the GSD was 2 then the "action level" should be about 10% of the OEL. Thus, the above data were all above this recommended action level. Unfortunately, absent wonderful tools like IH STAT, few were doing detailed statistical analysis in those days (the 1970s) and I certainly was not.
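That exceedance prediction can be reproduced the same way: on the log scale the fitted lognormal becomes a normal distribution, and the fraction of days above the OEL is just its upper tail beyond ln(OEL). A minimal stdlib-only Python sketch (`statistics.NormalDist` requires Python 3.8+):

```python
import math
from statistics import NormalDist

exposures_ppm = [2.1, 4.2, 8.4]
oel_ppm = 10.0

logs = [math.log(x) for x in exposures_ppm]
mu = sum(logs) / len(logs)  # mean of the logs = ln(GM)
sigma = math.sqrt(            # sample sd of the logs = ln(GSD)
    sum((v - mu) ** 2 for v in logs) / (len(logs) - 1)
)

# P(exposure > OEL) = P(ln X > ln OEL) under the fitted lognormal
exceedance = 1 - NormalDist(mu, sigma).cdf(math.log(oel_ppm))

print(f"Exceedance fraction: {exceedance:.1%}")  # about 10.5%
print(f"Expected days over OEL in a 250-day year: "
      f"{exceedance * 250:.0f}")                 # about 26
```

So even though every measurement was below half the OEL, the fitted distribution predicts the limit is exceeded roughly one working day in ten, which is exactly the "pause" the post describes.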
The Pennsylvania Dutch have a wonderful saying: "Too soon old and too late smart." It is definitely not too late for you to rise from pump jockey status to that of exposure assessor using this remarkable tool.

Mike

Well said. As a laboratorian, one familiar with the variability of instrumental measurements, not to mention the variability of exposure you so succinctly described, it was always troubling when a client treated a single result as unequivocal evidence of an exposure, one way or the other. As a commercial laboratory, you can imagine the raised eyebrows when advising clients to take more samples to truly characterize the exposure. Best regards, Bob Lieckfield, Jr., CIH, Bureau Veritas North America.

I have never been in a position where statistical modeling like this was necessary. I've always been curious, though, and I apologize if this is a dumb question: How well validated is the lognormal model? To use the above example, how many times has anyone performed 250 consecutive days of sampling on the same task to document the lognormal exposure distribution and the 10% overexposure prediction? Would such sampling not be likely to reduce the GSD and change the percentage of overexposures predicted?

It is not a dumb question. Indeed, it has been discussed quite recently among folks with more statistical knowledge than myself. It is my understanding that the lognormal distribution does fit large data sets as well as or better than most other distributions. Given the uncertainty from just 3 samples in the example, a sample of 250 days will almost certainly be different from the 10% predicted. This was just the best prediction based on the available data. There does remain, however, a lot of uncertainty, which is also statistically estimated. Indeed, the GSD could go up or down; it was simply the best estimate from only 3 samples.

Hi Mike,

Thank you very much for a very informative piece.

I tried downloading the mentioned file, but it seems the link is not functioning. Is there another portal that can be used?

Dear Anonymous,

Sorry you could not get the files. Tell me which you want and I will send them to you. mjayjock@gmail.com

Mike

Too bad you were not my teacher in college and/or trainer in the workplace!
