Some thoughts on discontinuities
Geostatistics - or for that matter almost any other spatial estimation method - assumes that the region within which estimates are being computed is at least continuous. It may not be homogeneous, the mean value and the variogram itself may not be stationary, but at least it should not be cut in two by a geological boundary (or any other boundary that introduces a discontinuity in the grade value field). If there are such boundaries, then one is expected to divide the data into separate sets and carry out the estimation exercise separately in each zone.
This is the principle, at least. The practice is not always so easy. You may not be able to see the discontinuities (faults, say, or lithological contacts). You may not even be certain they exist: it may be that you only have a supposition that discontinuities might be present as a result of detailed interpretation of the data.
Unfortunately geostatistics is virtually silent on the best course of action to take in such circumstances. That is because, however distribution-free, or however tolerant to non- stationarity, particular kriging methods might be, none will properly handle data that are derived from a field which is not even continuous. If even the number and location of discontinuities cannot reliably be identified from the drillhole data - as is common in many complex mineral deposits - how can you obtain resource estimates at the vital early stage when decisions on possible further drilling must be taken ? The purist would answer that you really need to have enough understanding of the geology before you can estimate anything. While that is true, it doesn't satisfy the project manager who is still pressing for a decision yesterday.
There is a possible method, though it is not in any sense geostatistical. All of kriging is derived from systems which are weighted moving averages (i.e. weighted arithmetic means) of the available data. Now even elementary statistical textbooks illustrate that there are at least three measures of the ‘centre' of a distribution, of which the arithmetic mean is only one. There is also the median - the value which partitions the data set into two equal parts (it is either the middle value or the mean of the two middle values, depending on whether there are an odd or even number of data values). I have previously proposed a range of methods derived from a weighted moving median as a nonparametric alternative to geostatistics. But there is yet a third measure: the mode.
The mode is conventionally described as the ‘most frequent value': it is a value corresponding to the highest class in the histogram. It is possible to develop a weighted moving mode technique, with weights defined by distance from the point being estimated and modified by de-clustering (these will be discussed in a later Silicon Dale item). Given a set of weighted data values for each point to be estimated, it is possible to generate a histogram or cumulative frequency curve which represents the local data distribution. The mode of this distribution then represents an estimate of the grade at this point.
This is where it starts to get interesting. Unlike the mean and the median, the mode is not necessarily unique. A distribution may display two or more histogram peaks. It will then have two or more modes. This can of course occur in the distribution of local weighted data values, and indeed is very likely to occur in the vicinity of a discontinuity - when values are higher on one side than the other so that the local distribution represents a mixture of two different distributions.
In this case, as you approach a discontinuity, a second mode will appear and steadily increase in importance (i.e. histogram height) until the discontinuity is crossed. Beyond this point, the new mode will become dominant and the original mode will progressively disappear. What does this mean for your estimation ? It is possible to set up a simple rule: ‘use the highest peak'. Then a real discontinuity will be represented by a discontinuity in the estimates as you cross it (from a set of lower values to a set of higher values for example).
I developed this method in 1979 to solve a problem encountered while working on the Botswana Geochemical Atlas. The problem was that some of the moving average maps showed blurred features which looked like linear anomalies that had no relationship with any known geological features. Some of these anomalies were straight lines trending north-south or east-west, which provided the clue: that they represented lease boundaries, and of course data from different companies operating in different lease areas were assayed by different laboratories. Applying a moving mode estimation process, the discontinuities were highlighted, while the estimated surface form over the interior of each lease area was little affected by the change in algorithm. As a result, it was possible to estimate the systematic differences in assay values from one lease area to another. Although it was of course impossible to know the correct absolute values, we could at least compensate for the inter-laboratory differences and eliminate these spurious linear effects.
In due course I built the algorithm into an optional exploration module within Datamine. I believe they no longer offer it for sale, but the original Fortran implementation of the algorithm is available free of charge (just email me on steve@SiliconDale.co.uk) - with all the ‘user beware' warnings of course: you use it entirely at your own risk.