Data analysis and data getting in the process of scientific investigation

Between 1919 and 1928 an iterative sequence occurred that went through three main stages, each leading logically to the next via interaction of theory and practice. The analysis of existing records led to the analysis of experimental trials which then led to the design of experimental trials. There were different but interactive aspects to this development. We can see (i) sequential evolution of the new methods in response to unfolding realizations of need, (ii) the persuading of practitioners to try the new techniques, and (iii) the changing role of the statistician implied by the development.

3.12 Evolution of the New Methods

Fisher’s attempts to analyze experimental data quickly led him to the essential principles of experimental design: the need for randomization to achieve validity, for replication to provide a valid estimate of error, and for blocking of extraneous sources of disturbance to achieve accuracy. Blocking in two directions simultaneously (by randomized Latin squares) was particularly appealing. Fisher would have been brought to see the enormous advantages of the unorthodox factorial arrangements, as an economical way to assess the effects of variables in combination, by, for example, his early attempts to impart meaning to the differences associated with the 13 differently manured Broadbalk plots, to which fertilizers had been applied in a highly nonbalanced manner. However, while the efficiency of factorial designs could be increased by packing in more factors, larger factorial designs required bigger blocks and hence produced greater inhomogeneity in the experimental material, giving larger experimental errors. The answer, which quickly followed, was confounding: giving up information on selected high-order interactions by letting them coincide with block differences, so that blocks could be kept small.
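As an illustration of blocking in two directions with randomization, the following is a minimal sketch of how a randomized Latin square layout might be generated. It is not a reconstruction of Fisher's own procedure: the function names `randomized_latin_square` and `is_latin_square` are hypothetical, and the method (randomly permuting the rows, columns, and treatment labels of a cyclic square) is an assumption chosen for simplicity. It does not sample uniformly from all Latin squares of a given size.

```python
import random


def randomized_latin_square(treatments, seed=None):
    """Return an n x n layout in which each treatment occurs exactly once
    in every row and every column (a Latin square), randomized by
    permuting rows, columns, and treatment labels of a cyclic square.

    Assumption: this simple scheme is for illustration only; it draws from
    the squares reachable from the cyclic square, not from all squares.
    """
    rng = random.Random(seed)
    labels = list(treatments)
    n = len(labels)

    # Cyclic starting square: cell (i, j) holds treatment index (i + j) mod n.
    square = [[(i + j) % n for j in range(n)] for i in range(n)]

    # Randomly permute the rows (first blocking direction).
    rng.shuffle(square)

    # Randomly permute the columns (second blocking direction).
    cols = list(range(n))
    rng.shuffle(cols)
    square = [[row[c] for c in cols] for row in square]

    # Randomly assign treatment labels to the indices.
    rng.shuffle(labels)
    return [[labels[k] for k in row] for row in square]


def is_latin_square(square):
    """Check that every row and column contains each symbol exactly once."""
    n = len(square)
    symbols = set(square[0])
    if len(symbols) != n:
        return False
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all(set(col) == symbols for col in zip(*square))
    return rows_ok and cols_ok


if __name__ == "__main__":
    layout = randomized_latin_square(["A", "B", "C", "D"], seed=1924)
    for row in layout:
        print(" ".join(row))
    print("valid Latin square:", is_latin_square(layout))
```

Because every treatment appears once in each row and once in each column, differences between rows and between columns of the field can be removed from the treatment comparisons, while the random permutations supply the randomization on which the validity of the error estimate rests.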

3.13 Persuading Practitioners

The blessings of feedback were only available if scientists would try out his designs but, not surprisingly, Fisher at first did not have an easy job selling his revolutionary ideas at Rothamsted. Indeed, the first design run to his specification (in 1924) was not done at Rothamsted at all. It was a randomized Latin square design run at Bagshot for the Forestry Commission, who had asked for and acted on his advice. But between 1924 and 1929, as described in "Studies in Crop Variation IV and VI" [5, 6], there was a rapid development of ideas which were quickly put into practice. It is clear that Eden had become a convinced disciple during this period, and it is refreshing, but alas unfamiliar, to see publication of new designs simultaneous with data obtained from their successful use. By the end of this period data were being collected from designs of great accuracy and beauty which included all of Fisher’s ideas. In spite of all this, in 1926 the Director of Rothamsted, Sir John Russell, wrote a paper [16] in the Journal of the Ministry of Agriculture about agricultural experimentation which almost totally ignored the ideas of his protégé. However, in the next issue [9], in a paper notable for its brevity and clarity, Fisher outlined his philosophy on the subject, setting to rights his boss and anyone else who would listen.

3.14 A New Heritage for Statisticians

The original concept that the research station needed a statistician was revolutionary, but certainly the role initially envisaged in 1919 for the statistician was a passive and possibly even a temporary one. Russell wondered if anything more could be extracted from the existing records. Fisher’s work gradually made clear that the statistician’s job did not begin when all the work was over; it began long before the work was started. The statistician was not a curator of dusty relics. His responsibility to the scientific team was that of the architect, with the crucial job of ensuring that the investigational structure of a brand new experiment was sound and economical. The latter role is much more fun than the former. Fisher himself relished it, and we should thank him for bequeathing it to us. It calls for abilities of a high order. It requires, among other things, the wit to comprehend complicated scientific problems, the patience to listen, the penetration to ask the right questions, and the wisdom to see what is, and what is not, important. Finally, it requires from the statistician the courage to wager his reputation each time an experiment is run. For the time must come when all the data are in and conclusions must be drawn; at this stage oversights in the design, if they exist, will become embarrassingly evident.