## Black Swans

Let us reflect on idiograph as signature. Each signature is unique, of course - it remains the basis on which shops verify our possession of a valid credit card, for example. But if we begin to think about signatures, we can begin to distinguish styles - 'Timothy P. Burt', 'T.P. Burt' and 'Tim Burt' in my case. We can also think of those that are legible and those that are not! We have begun to classify, but it does not take us very far. We cannot predict how the next person, say, Noel Castree, will sign on the basis of all we have seen before. However, it is not far from this position to one type of science - induction - in which we use evidence as the basis for generalization.

Many scientific investigations proceed by slowly and carefully building up a set of measurements about the phenomenon of interest. Usually, through repeated exploration of the data, a regular pattern becomes apparent. Where possible, scientists try to express this regularity in the form of an equation; in many cases this is a regression equation, summarizing (regression) and quantifying (correlation) the degree of association between a dependent variable and one or more controlling factors. Take the case where we have paired observations of an independent variable (X) that is considered to control a dependent variable (Y). A scattergram describes the relationship between X and Y. The regression line (of the form Y = a + bX) defines the best-fit relationship between X and Y, while the dimensionless correlation coefficient (r) quantifies the goodness-of-fit (degree of scatter around the line) of the regression equation. An example might be the relationship between rain gauge altitude and average annual rainfall in the Northern Pennine hills, UK (Figure 7.1). Knowing something about the general nature of orographic rainfall, I expect there to be a simple and straightforward relationship between these two variables for any similar environment. Observations of rainfall gradients lead to explanations that lead to expectations.
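The regression line and correlation coefficient described above can be computed directly from paired observations. What follows is only a minimal sketch in Python, assuming ordinary least squares; the function name `fit_line` is hypothetical, and the altitude and rainfall values are invented for illustration and are not the Northern Pennine data of Figure 7.1.

```python
from math import sqrt


def fit_line(xs, ys):
    """Least-squares fit of Y = a + bX; returns (a, b, r)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products about the means
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("X and Y must each vary")
    b = sxy / sxx                # slope: change in Y per unit of X
    a = mean_y - b * mean_x      # intercept: fitted Y at X = 0
    r = sxy / sqrt(sxx * syy)    # dimensionless correlation coefficient
    return a, b, r


# Invented illustrative values (NOT measured data)
altitude_m = [100, 200, 300, 400, 500]
rainfall_mm = [800, 1000, 1150, 1400, 1600]

a, b, r = fit_line(altitude_m, rainfall_mm)
print(f"Y = {a:.1f} + {b:.2f} X, r = {r:.3f}")
```

The slope b is expressed in the units of Y per unit of X (here, millimetres of rainfall per metre of altitude), whereas r carries no units; this is why r alone says nothing about how steep the rainfall gradient is.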

Figure 7.1 The relationship between rain gauge altitude and average annual rainfall in the Northern Pennine hills, UK. (Horizontal axis: rain gauge altitude, m.)

In the grand scheme of things, this kind of relationship is hardly a 'law of nature'. Nevertheless it is a 'rule' of some sort and as such has some value. It provides a (limited) basis for further work: we can try to explain why the relationship exists, and we can attempt to make predictions. In terms of explanation, we know that altitude does not directly 'cause' rainfall - X is not the true cause of Y. In our case, X is the cause of Y only via several intermediate variables. Nevertheless, the linkage is easy enough to explain and might well form the foundation for deductive investigations (see below). Our regression equation also allows predictions to be made - about rainfall totals at places where no measurements have so far been made. This is where, using the inductive method, we must make a leap of faith (Mitchell, 1985) - the reliance of a general rule on a set of observations. Probably, our regression equation would yield reasonable estimations of average annual rainfall within the Tees or Wear basins in England, where the data were collected, but it would be less reliable as we moved to different localities (e.g. the Lake District) or outside the range of observations. For example, might the relationship change for the very highest global elevations and the greatest changes in elevation? And indeed it does. Continuing our rainfall example, we know well enough that a British rainfall gradient should not be expected to hold elsewhere - rainfall gradients reverse at high altitude in mountainous areas, for example. And at the micro-topographic scale, the distribution of rainfall is likely to be much more related to slope angle and aspect, rather than to altitude per se.
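The caution about predicting outside the range of observations can be made explicit whenever a regression equation is used. The sketch below is illustrative only: the function `predict_rainfall`, the coefficients and the observed altitude range are all hypothetical values chosen for the example, not results from the Pennine data.

```python
def predict_rainfall(altitude, a, b, observed_altitudes):
    """Estimate Y = a + bX and report whether X lies within the observed range.

    Returns (estimate, within_range). An estimate with within_range False is
    an extrapolation and rests on the inductive 'leap of faith' alone.
    """
    lowest = min(observed_altitudes)
    highest = max(observed_altitudes)
    estimate = a + b * altitude
    within_range = lowest <= altitude <= highest
    return estimate, within_range


# Hypothetical coefficients and gauge altitudes, for illustration only
a, b = 590.0, 2.0
gauges_m = [100, 200, 300, 400, 500]

inside = predict_rainfall(300, a, b, gauges_m)     # interpolation
outside = predict_rainfall(3000, a, b, gauges_m)   # far beyond any gauge

for label, (estimate, ok) in (("300 m", inside), ("3000 m", outside)):
    note = "within observed range" if ok else "extrapolated: treat with caution"
    print(f"{label}: {estimate:.0f} mm ({note})")
```

A check of this kind guards only against leaving the observed range of X. It cannot detect a move to a different locality or a different scale, where the same altitude may be associated with quite different rainfall.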

Our simple example of upland rainfall demonstrates the tension between the idiographic and nomothetic approaches - the difficulty of using specific cases as the basis for generalization. In geography, very often the problem is compounded by the need to apply the results of one scale of analysis at different scales. This may entail upscaling of results from smaller to larger areas, for example, extending results from small catchment studies to large river basins. Or, in some circumstances, it can involve downscaling, for example, applying the results of general circulation models (global scale) to particular regions. It has long been known that generalizations made at one level do not necessarily hold at another, and that conclusions derived at one scale may be invalid at another (Haggett, 1965). One common approach to the upscaling problem in catchment hydrology is to use 'nested' experiments, each one designed to fit neatly inside the next. Thus, we might move from bounded plots through instrumented hillslopes and small catchments to a large river basin study. In this way we can show how small-scale processes have a more general impact; on the other hand, as scale changes, so too do the main controlling variables. Thus, in small basins, hillslope topography is the major control of storm runoff response, whereas, in large basins, the nature of the channel network is more likely to control flood response (see Anderson and Burt, 1978, and Burt, 1989, for examples).

The standard textbook example of induction - extended empirical generalization - is: 'All swans are white'. Despite countless observations that all swans were white, David Hume (1711-1776), the Scottish philosopher, pointed out that the truth of the statement could not be guaranteed because not all swans had been observed. Eventually, black swans were discovered in Australia. This shows how difficult it is to generalize on the basis of specific investigations; empirical generalizations can only be proved beyond doubt if each and every possibility can be examined.