Wednesday, June 12, 2013

Describing Uncertainty

Uncertainty is everywhere. Indeed, there is no thing or quantity we can measure that does not carry some level of uncertainty. For example, the length of any "standard" object for measuring 1 yard (i.e., a yardstick) is not exactly 1 yard (except by definition); there is always some finite tolerance, plus or minus (however small), around the 1 yard mark. We cannot be completely certain about any measurement. Measurement tolerance is one form of uncertainty. Quantities that vary naturally are another form. For example, the weights of adult females in Pennsylvania vary, and therefore the weight of any PA female adult will fall within a range of uncertainty. Indeed, any individual's weight varies over time. The other source of uncertainty in risk assessment, which is typically the most important and dominant, is a lack of information about the value of interest. I am going to use the weight of my dog Libby in an attempt to show the interaction between natural variation and lack of knowledge. In this whimsical example let us assume that Libby's weight is an important value in a risk assessment and that if we overestimate her weight we overestimate the risk. Similarly, if we underestimate her weight we underestimate the risk. How much does Libby weigh?
When I pose this question to students they usually ask me to identify Libby's breed. I tell them only that Libby is a purebred dog and is fully grown. I tell them that they all have some information about the universe of dog weights on Earth. At that point, the students usually present a range of weights to represent Libby, for example 5-200 lbs. This implies that there is no chance Libby could weigh less than 5 or more than 200 lbs. (Worst case estimated weight as risk surrogate = 200 lbs). I then tell them that Libby is an English Springer Spaniel. Those with online-capable phones might find that female Springers typically weigh between 35 and 45 lbs. The students could choose 35-45 lbs as the new range of estimated weights, one born of more information, but some will remember the condition that if we underestimate Libby's weight we underestimate the risk. These wise students then may (and have) come up with an estimated range of 35 to 70 lbs. The next step might be to go to my home and look into the window of our garage (where Libby is) to see her. Such an examination will disclose an overweight dog, perhaps as much as 20-30 lbs overweight. The new estimated range from these observations might be 55-70 lbs. The final step might be to go into my garage and weigh Libby daily over the period of a month. Here the data might show her weight to vary between 62 and 64 lbs. This range is now a much better data-based estimate of lowest to highest, one that reflects the natural variation of her weight, while all the previous estimates were plagued by a greater lack of knowledge. Forced to make a deterministic (single value) estimate of her weight, one might use 65 lbs to guard against getting her on a day when she ate or retained somewhat more than normal.
The point here is that we could ALWAYS provide some useful quantitative estimate of Libby's weight and the uncertainty around it even when the available data was meager.  The estimate got better (more useful) with more data.
Just to stretch this analogy even further: what if I said I have an animal in my garage, and I will not tell you what species of animal, but you need to estimate its weight as proportional to risk! Here the uncertainty born of ignorance is MUCH greater than the 40-fold range of estimated weight above. Indeed, given such a large range one might rightly question the utility of the estimate, but the point remains that the uncertainty at any stage of the assessment should be estimated and disclosed. Such uncertainty analysis shows where we might get the best bang for our data buck: in this case, knowing what animal species is in the garage. That specific information would then allow us to make a much narrower and more useful estimate of the range of weight as a surrogate for risk.
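
For readers who like to see the arithmetic, here is a minimal Python sketch that tabulates the ranges quoted above for Libby's weight and shows how each piece of information narrows them. The ranges are the ones given in the post; the names `stages`, `fold_range`, and `summarize` are made up for this illustration, and "fold" here simply means the high end divided by the low end (so 5-200 lbs is the 40-fold range).

```python
# Estimated ranges for Libby's weight (lbs) at each stage, as quoted in the post.
# The stage labels and helper names are illustrative assumptions, not a standard method.
stages = [
    ("fully grown dog, breed unknown", 5, 200),
    ("Springer Spaniel, guarding against underestimate", 35, 70),
    ("seen through the garage window", 55, 70),
    ("weighed daily for a month", 62, 64),
]


def fold_range(low, high):
    """Ratio of the highest to the lowest plausible value."""
    return high / low


def summarize(stages):
    """Return (label, width in lbs, fold range rounded to 2 places) per stage."""
    rows = []
    for label, low, high in stages:
        rows.append((label, high - low, round(fold_range(low, high), 2)))
    return rows


if __name__ == "__main__":
    for label, width, fold in summarize(stages):
        print(f"{label}: width {width} lbs, {fold}-fold")
```

The width shrinks from 195 lbs to 2 lbs as information is added, and the last range is the only one dominated by natural variation rather than lack of knowledge.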


  1. An interesting site to get students more comfortable with estimation is the Fermi Questions site:

    The object is to estimate metrics of things you actually have some idea about, do mental math using conversion factors, and get within an order of magnitude of the result. It's a fun exercise.

  2. Great stuff Frank - thanks for letting us know!

    1. I got a score of 1.6 on five questions. Really not sure how I did that well.