Silicon Dale

Bias is Good!

Does it please you more to win $100 than it hurts you to lose $100? For most people the winning/losing payoffs are not symmetric. OK, so you're rich enough that it is symmetric for you. What if the possible gain or loss were $1 million? And "most people" includes most investors - and hence most financial directors - and hence most resource modellers.

So maybe bias is good ... provided that it is in the right direction. In assessing the potential of a mining project, if a resource is over-estimated by (say) 20% then the project might well proceed, only to make a loss. If the same resource is under-estimated by the same factor of 20%, the company might expect a loss, but if it went ahead with the project it would be pleasantly surprised. In the first case there is a risk of real financial loss; in the second there is just an opportunity cost in the event that the project is not even started. The only real money that has been lost is that spent on the estimation (this could in itself be quite a large sum - but still nothing compared to the costs of starting a full-scale mining operation).

Clearly it is always best to have the most accurate estimate possible, but the costs of positive or negative errors in that estimate are not symmetrical. In the light of this, an unbiased modelling method such as offered by geostatistical estimation is not necessarily the best way to estimate a resource.
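The asymmetry can be made concrete with a small numerical sketch. Everything in it is an assumption chosen for illustration: the grade values are invented, the penalty is taken to be linear in the size of the error, and the 3:1 ratio between the cost of over-estimating and the cost of under-estimating is arbitrary. The names `COST_OVER`, `COST_UNDER` and `total_cost` are hypothetical. The sketch simply searches for the single estimate that would have been cheapest against all of the values.

```python
from statistics import mean, median

# Hypothetical block grades; the last value is a high-grade outlier.
grades = [0.8, 1.0, 1.1, 1.3, 1.6, 2.0, 14.0]

# Assumed penalties per unit of error (illustrative only).
COST_OVER = 3.0   # estimate above the true grade: real money at risk
COST_UNDER = 1.0  # estimate below the true grade: opportunity cost only

def total_cost(estimate, data, cost_over=COST_OVER, cost_under=COST_UNDER):
    """Sum of asymmetric linear penalties for one estimate against all values."""
    total = 0.0
    for x in data:
        if estimate > x:
            total += cost_over * (estimate - x)
        else:
            total += cost_under * (x - estimate)
    return total

# Brute-force search over candidate estimates 0.00, 0.01, ..., 15.00.
candidates = [i / 100 for i in range(0, 1501)]
best_asymmetric = min(candidates, key=lambda c: total_cost(c, grades))
best_symmetric = min(candidates, key=lambda c: total_cost(c, grades, 1.0, 1.0))

print("cheapest estimate, 3:1 penalties:", best_asymmetric)
print("cheapest estimate, 1:1 penalties:", best_symmetric)
print("median:", median(grades), "mean:", round(mean(grades), 2))
```

With equal penalties the cheapest estimate is the median of the values; once over-estimation is penalised more heavily, the cheapest estimate moves below both the median and the mean. In that sense a deliberately low-biased estimate can be the rational choice.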

Kriging, just like ordinary linear regression, is a 'least squares' method. Such methods are known to be sensitive to outliers, and many metalliferous deposits have positively skewed data distributions in which outliers will mostly be high-grade samples. A linear kriged estimate derived from such a data set is globally unbiased, but the mining engineer (and even more the mineral processor) is always wary of the estimated high-grade blocks. Either they will fail to materialise in the mining operation, or even if they do, the result is poor recovery when the ore is delivered to the mill - which above all seeks consistent feed quality, not the highest grades arriving in patches.

Therefore, an estimation method which will automatically downplay the effect of positive outliers might well be preferred. Such a method would be provided, for example (in the usual case of a positively skewed distribution), by minimising the sum of absolute deviations rather than squared deviations. Linear kriging estimation yields the spatial equivalent of the arithmetic mean. An alternative method minimising absolute deviations would yield a spatial equivalent of the median. The median is much more robust with respect to outliers, and gives a better fit to the 'middle' of a skewed distribution.
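The link between the two criteria and the two averages can be checked numerically. The sketch below is non-spatial and is not a kriging implementation: it takes a handful of invented grade values with one high outlier and searches by brute force for the centre value that minimises each criterion. The data and the function names are assumptions made for illustration.

```python
from statistics import mean, median

# Hypothetical grades; the last value is a high-grade outlier.
grades = [0.8, 1.0, 1.1, 1.3, 1.6, 2.0, 14.0]

def sum_squared_dev(centre, data):
    """Least-squares criterion: sum of squared deviations from centre."""
    return sum((x - centre) ** 2 for x in data)

def sum_abs_dev(centre, data):
    """Least-absolute-deviation criterion: sum of |x - centre|."""
    return sum(abs(x - centre) for x in data)

# Brute-force search over candidate centres 0.00, 0.01, ..., 15.00.
candidates = [i / 100 for i in range(0, 1501)]
best_least_squares = min(candidates, key=lambda c: sum_squared_dev(c, grades))
best_least_abs = min(candidates, key=lambda c: sum_abs_dev(c, grades))

print("least squares minimiser:", best_least_squares,
      "| arithmetic mean:", round(mean(grades), 3))
print("least absolute deviation minimiser:", best_least_abs,
      "| median:", median(grades))
```

The least-squares minimiser lands on the arithmetic mean (to within the grid spacing), which the single outlier has dragged above six of the seven values. The least-absolute-deviation minimiser lands on the median, which would stay where it is however high the outlier went.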

But even the median does not represent the most common grade value. In a histogram the highest class is that located at the mode. Although in a symmetrical distribution the mean, median, and mode all coincide, in a skewed distribution they are all different. In most positively skewed distributions the mean is higher than the median which itself is higher than the mode. The processing engineers need to know what will be the most common grade to be delivered to the mill - and this is given by a mode rather than a mean or median. It is possible to devise an alternative spatial estimation procedure which will generate a spatial equivalent of the mode. Such an estimation method has some interesting additional properties. In the case of a bimodal distribution, such as might be found in a set of samples straddling a geological zone boundary, there will be a sudden switch in the estimate from one mode to the other as the two modes change in relative importance when traversing the boundary. In other words, a spatial mode estimator could be used to help identify the boundary location.
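The switching behaviour at a boundary can be illustrated with a deliberately crude sketch: a one-dimensional traverse, a moving window of five samples, and the midpoint of the most populated histogram class taken as the 'mode'. This is only an assumed stand-in for a spatial mode estimator, not the procedure described in the book mentioned at the end of this column; the data, the window size, the class width and the function names are all invented for illustration.

```python
import math
from collections import Counter
from statistics import mean

# Hypothetical traverse: six low-grade samples, then six high-grade samples.
traverse = [1.1, 0.9, 1.2, 1.0, 1.3, 0.8, 5.2, 4.9, 5.1, 4.7, 5.3, 5.0]

CLASS_WIDTH = 2.0  # assumed histogram class width
HALF_WINDOW = 2    # window covers 2 * HALF_WINDOW + 1 = 5 samples

def modal_class_midpoint(data, width):
    """Midpoint of the histogram class that holds the most values."""
    counts = Counter(math.floor(x / width) for x in data)
    modal_index, _ = counts.most_common(1)[0]
    return (modal_index + 0.5) * width

window_modes = []
window_means = []
for i in range(HALF_WINDOW, len(traverse) - HALF_WINDOW):
    window = traverse[i - HALF_WINDOW : i + HALF_WINDOW + 1]
    window_modes.append(modal_class_midpoint(window, CLASS_WIDTH))
    window_means.append(mean(window))

print("window modes:", window_modes)
print("window means:", [round(m, 2) for m in window_means])
```

Along this traverse the window mean drifts through intermediate values that match neither zone, while the window mode takes only two values and jumps from one to the other at the point where the high-grade samples first outnumber the low-grade ones in the window.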

Using either a spatial median or a spatial mode estimation method, the effect of outliers is greatly diminished. The estimates obtained will be biased, but the bias will be conservative in that the magnitude of any over-estimation is reduced. The result is that the risk of a bad decision that costs real money is reduced by comparison with the risk of a bad decision that incurs only paper opportunity costs. Or to view it another way, if the project does proceed on the basis of such estimates made by these biased methods, there is more chance of a pleasant surprise than an unpleasant one.

Unfortunately, the nice mathematical constructs used for least-squares methods cannot be used for least-absolute-deviation methods. There are certain other complications also, as discussed in my book Nonparametric Geostatistics. This is now out of print, but I have 30 remaining copies available.

Stephen Henley
Matlock, England

Copyright © 2002 Stephen Henley
Bias is Good: Earth Science Computer Applications, v. 17, no. 6, p. 1-2