CHAPTER I-2

ON STATISTICS TEACHING, TEACHERS, AND CURRICULA

Everyone agrees that an understanding of statistics and probability is a crucial element in a person's quantitative skills. But for decades, the public and the relevant professions have lamented that statistics teaching desperately needs to be improved. Though various educational palliatives have been offered, none has shown any evidence of success.

The problem -- really a disease -- in the practice and teaching of statistics has been diagnosed by many writers: obfuscation of teeth-gritting students by teachers who repeat by rote the body of complex algebra and tables that only a rare expert understands down to the roots. Years ago, Allen Wallis and Harry Roberts had it right: "The great ideas of statistics are lost in a sea of algebra" (1956, p. viii).

Just as surely as there is a problem, the resampling method can contribute mightily to solving it. Indeed, it is the only scientifically tested and proven cure for the malady. A quarter century ago, claims for resampling were derided as ridiculous. Now it is agreed that resampling is theoretically valid. (Its new respectability is shown by its passage from "ridiculous" to "everyone always knew that.") Some writers (Edgington, referred to by Manly, 1991, p. 17) even suggest that in many cases the classical tests are simply approximations to resampling methods, and that the parametric methods are only a second-choice substitute for resampling methods. Yet resampling is still mostly passed by when the curriculum is set, though a few leaders in the profession have now come out in its favor.

A long-time problem in promoting resampling has been to make clear that it is a basic tool for researchers and decision-makers. I emphasize "tool" because resampling often gets confused with Monte Carlo simulation as a way of teaching conventional parametric methods. The subject here is resampling as a substitute for and complement to conventional methods, and as the method of first choice in handling actual everyday problems. The subject is not a device to improve the standard pedagogy.

THE CRITICISMS OF STATISTICS PRACTICE AND EDUCATION

To set the scene, here are some comments by thoughtful critics of statistics education. Please forgive me for multiplying the quotations, but it is only the fact that this sentiment is widespread, and held by respected statisticians, that lends authority to the criticism.

The introductory statistics course is troublesome. Many readers will surely confirm that assertion with their own knowledge of what students and teachers say about the subject. Here are some examples from the written testimony:

1. Garfield (1991): "A review of the professional literature over the past thirty years reveals a consistent dissatisfaction with the way introductory statistics courses are taught" (p. 1). Garfield asserts (referring to her dissertation, 1981, and to work by Wise) that "It is a well known fact that many students have negative attitudes and anxiety about taking statistics courses" (p. 1). "Students enrolled in an introductory statistics course have criticized the course as being boring and unexciting... Instructors have also expressed concern that after completing the course many students are not able to solve statistical problems..." (1981, quoting Duchastel, 1974).
2. Dallal (1990, p. 266): "[T]he field of statistics is littered with students who are frustrated by their courses, finish with no useful skills, and are turned off to the subject for life."

3. Hey (1983, p. xii): "For more years than I care to recall, I have been teaching introductory statistics and econometrics to economics students. As many teachers and students are all too aware, this can be a painful experience for all concerned. Many will be familiar with the apparently never-ending quest for ways of reducing the pain - by redesigning courses and by using different texts or writing new ones. But the changes all too often turn out to be purely cosmetic, with the fundamental problem left unchanged."

4. Barlow (1989, Preface): "Many science students acquire a distinctly negative attitude towards the subject of statistics... As a student I was no different from any other in this respect."

5. Hogg: "[S]tudents frequently view statistics as the worst course taken in college." He explains that "many of us are lousy teachers, and our efforts to improve are feeble" (1991, p. 342).

6. Vaisrub (1990), about her attempt to teach medical students conventional statistical methods: "I gazed into the sea of glazed eyes and forlorn faces, shocked by the looks of naked fear my appearance at the lectern prompted."

7. Freedman et al., noting that most students of probability and statistics simply memorize the rules: "Blindly plugging into statistical formulas has caused a lot of confusion" (1991, p. xv).

8. Ruberg (1992): "It seems that many people are deeply afraid of statistics. [They say] 'Statistics was my worst subject' or 'All those formulas'... I wish they had a deeper understanding of the statistical method... rather than the general confusion about which formulas are most appropriate for a particular data set."

9. Freedman et al.: "[W]hen we started writing, we tried to teach the conventional notation... But it soon became clear that the algebra was getting in the way. For students with limited technical ability, mastering the notation demands so much effort that nothing is left over for the ideas. To make the point by analogy, it is as if most of the undergraduates on the campus were required to take a course in Chinese history--and the history department insisted on teaching them in Chinese." (From the introduction to the first edition.)

10. Based on their review of the literature, Garfield and Ahlgren say that "students appear to have difficulties developing correct intuitions about fundamental ideas of probability," and they proceed to offer reasons why this is so (1988, p. 45).

These sorts of negative comment are not commonly heard about other subjects and other groups of students; both the nature of the criticisms and their volume with respect to statistics are unusual, we believe. One of us has been teaching economics, business, and demography for three decades without hearing such complaints.

11. Statisticians have long worried about the unthinking use of parametric tests whose foundations are poorly understood. Students are often given the false impression that "easy-to-use packages can be a substitute for a proper knowledge of statistical methodology" (Dallal, 1990, p. 266, quoting Searle, 1989). And now the readily available computer packages, which perform conventional tests with a single command, exacerbate this problem.
As the Encyclopedia of Statistics notes, under "Use of Inappropriate 'Canned Programs'":

As statisticians we find all too frequently that an experimenter takes data directly to a computer center programmer (usually called an analyst) for "statistical analysis." The programmer pulls a canned statistical program out of the file and there may result extensive machine outputs, all of which are irrelevant to the purpose of the experiment. This deplorable situation can be avoided only through having competent statistical advice, preferably in the design stage and certainly in the analysis stage. ("Computers and Statistics," Vol. 2, p. 95)

Blindly picking formulae for inferential procedures has always afflicted statistics. But now such computerized routines have made the problem even worse. The statistics user does not even feel the need to learn the conditions under which a test may or may not be appropriate. To illustrate, SigmaStat's advertisement in Science magazine has a researcher say: "Dear SigmaStat Advisor ... I need a foolproof way to pick the best statistical tests on my research data." The advertisement then goes on to say, "The SigmaStat Advisor automatically picks the best test for you. SigmaStat automatically points you to the appropriate tests. You simply type in your goals, and SigmaStat's Advisor automatically suggests the best statistical test for your data."

To confirm that most conventional teaching of statistics is a miserable failure, check for yourself the state of understanding of even an advanced student or user of statistics. Present to a researcher or student a set of data for a standard situation - say, four groups of rats given different drugs. Ask a series of questions such as: What kind of statistical analysis will you do? What test will you use? Why an F test (or whatever)? What exactly does the F statistic (or whatever) mean? What is the meaning of the table that you will use? What do the numbers in the table signify? How are the numbers derived? Is the Normal distribution relevant here? What has that particular distribution got to do with your data? What is the formula for a Normal distribution? What is the reason for each of the elements in the formula? Why are they combined in the fashion they are?

Unless the user can answer every one of those questions clearly and confidently, the person is at risk of proceeding inappropriately with the data for lack of full understanding. To develop an appropriate resampling test for the same data, a person must likewise have a full intuitive understanding of the statistical process and must be able to answer a comparable set of questions - but those questions are much easier to handle. A user of resampling methods therefore is less at risk of simply plugging in an inappropriate procedure. All the more need for resampling, then.
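To make the contrast concrete, here is one minimal sketch of a resampling approach to the four-groups-of-rats situation, written in Python rather than RESAMPLING STATS; the response values, group sizes, and the choice of test statistic are invented purely for illustration. Every step has a physical answer: pool the observations, re-deal them to four sham groups, and count how often chance alone spreads the group means as far apart as the real drugs did.

    # A minimal sketch of a resampling test for four groups of rats given
    # different drugs.  The response values below are hypothetical.
    import random

    groups = {
        "drug A": [54, 61, 58, 66, 49],
        "drug B": [71, 68, 75, 62, 70],
        "drug C": [55, 59, 52, 64, 57],
        "drug D": [63, 74, 69, 72, 66],
    }

    def spread_of_means(samples):
        # Test statistic: largest group mean minus smallest group mean.
        means = [sum(s) / len(s) for s in samples]
        return max(means) - min(means)

    observed = spread_of_means(groups.values())

    pooled = [x for sample in groups.values() for x in sample]
    sizes = [len(s) for s in groups.values()]
    n_trials = 10_000
    count = 0
    for _ in range(n_trials):
        random.shuffle(pooled)            # re-deal all responses to four sham groups
        sham_groups, start = [], 0
        for size in sizes:
            sham_groups.append(pooled[start:start + size])
            start += size
        if spread_of_means(sham_groups) >= observed:
            count += 1                    # chance alone did as well as the drugs

    print("observed spread of group means:", round(observed, 2))
    print("estimated p-value:", count / n_trials)

The spread of the group means is only one of several statistics one might count; the point is that each step corresponds to something the researcher could do by hand with slips of paper, and can therefore be questioned and defended in physical terms.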
WHAT IS THE NATURE OF THE DIFFICULTY?

Many writers have discussed the nature of the difficulty with conventional methods. Hogg says:

Statistics teaching is often stagnant; statistics teachers resent change. The most popular elementary texts evolve but slowly over decades. Meanwhile, statistics is progressing rapidly. (1990, p. 20)

Hollander and Proschan assert that the source of the difficulty in teaching statistics is "Simple -- the books are written in a foreign language... The books primarily explain the mechanics of statistics using the language of algebra" (1984, p. v). They "aim to narrate in plain English words" the nature of statistics. But this leads them to eliminate completely from their instruction the core element, probabilistic inferential statistics, because plain English does not enable them to present the conventional apparatus for significance testing and confidence limits.

In the last decade or so, the discipline's greybeards have decided that teaching probabilistic statistics is just too tough a nut to crack, and have concluded that students should be taught mainly descriptive statistics -- tables and graphs -- rather than how to draw inferences probabilistically. For example, Scheaffer (1990, p. 90) calls for "a more empirical, data-oriented approach to statistics, sometimes termed exploratory data analysis." And Moore and Roberts (1989) and Singer and Willett (1990) suggest that actual rather than hypothetical data will increase student interest, and they offer suggestions about data sets. But probability and inferential statistics clearly are the heart of the matter. A statistics course without inferential statistics is like Hamlet without the Prince.

Gardner suggests that all mathematics is inherently difficult to teach:

A teacher of mathematics, no matter how much he loves his subject and how strong his desire to communicate, is perpetually faced with one overwhelming difficulty: How can he keep his students awake? (Gardner, 1977, p. x)

Efron says conventional statistics is a very difficult theory:

The theory that's usually taught to elementary [statistics] students is a watered-down version of a very complicated theory that was developed in order to avoid a great deal of numerical calculation... It's really quite a hard theory, and should be taught second, not first. (Quoted by Peterson, 1991, p. 56)

Elsewhere Efron (with Tibshirani) says: "The traditional road to statistical knowledge is blocked, for most, by a formidable wall of mathematics" (1993, p. xiv). The inherent difficulty of statistical inference is discussed at greater length in Chapter 00.

Hogg argues that the formal equational approach is unsound not only because it is difficult, but also because it points the student away from deep understanding of scientific-statistical problems. Statistics is often presented as a branch of mathematics, and good statistics is often equated with mathematical rigor or purity, rather than with careful thinking. There is little attempt to measure what statistics courses accomplish (1990).

Hey says (in touting Bayesian statistics as the answer):

I was aware of the real problem for some time, but it was not until about three years ago that I finally admitted it to myself. The fundamental malaise with most statistics and econometrics courses is that they use the Classical approach to inference. Students find this unnatural and contorted. It is not intuitively acceptable and does not accord with the way that people assimilate information... (1983, p. xi)

And Kempthorne writes:

...there has been a failure in the teaching of statistics that originates with a failure of the teaching of teachers of statistics... Part of the malaise that I see occurs, I believe, because it is easy to think of counting and of areas and volumes, so rather than teach something about statistics, one takes the easy route of teaching a species of mathematics. And one can get a partial justification because this species of mathematics is a critical part of the whole area.
What must happen is that the ideas and aims of statistics must determine the mathematics of statistics that is taught and not vice versa. Mathematics is surely a beautiful art form (in addition to being useful). If the statistics that is taught is to have this good form, then its form is determined by its mathematical form. And then, I suggest, form wins out over content, and essential ideas of statistics are lost... (1980, p. 19)

All the above comments may be correct. But I believe that statistics - as distinguished from probability theory - has some very special and very great difficulties, and that the core of the problem is this: There is no way to induce students to enjoy the body of conventional inferential statistics because there is no way to make the ideas intuitively clear and perfectly understood. And even more fundamental than whether the students enjoy the material is whether they will acquire a set of techniques that they can put to effective use. The trouble in statistics teaching is in the product, not in the packaging and advertising. Sooner or later the conventional teaching of statistics founders on the body of complex algebra and tables. As Freedman et al. say, the algebra gets in the way: for students with limited technical ability, mastering the notation demands so much effort that nothing is left over for the ideas -- "it is as if most of the undergraduates on the campus were required to take a course in Chinese history--and the history department insisted on teaching them in Chinese."

The various devices that have been suggested to mitigate the problem certainly can be valuable. But - and please forgive me if I am very blunt - such devices are like band-aids on internal bleeding.

One must note a certain schizophrenia. The very statisticians who assert that the problem is the "wall of algebra" proceed to use this tool heavily themselves - even in discussions of resampling, which renders much (if not all) of the formulaic approach nugatory (see for example Efron and Tibshirani, 1993; Hall, 1992; Westfall and Young, 1993).

THE SOLUTION TO THE PROBLEM: RESAMPLING

The resampling approach mitigates the problem, especially in connection with the facilitating computer program RESAMPLING STATS. A physical process necessarily precedes any statistical procedure. Resampling methods stick close to the underlying physical process by simulating it, requiring less abstraction than classical methods. The abstruse structure of classical mathematical formulas, used with tables based on restrictive assumptions concerning data distributions, tends to separate the user from the actual data or physical process under consideration; this is a major source of statistical error.

Resampling has most commonly been used when classical methods are not promising. In contrast, I argue that resampling should be used, and so taught, as the tool of first resort in everyday practice of statistical inference, mainly because there is a greater chance that a wrong classical test will be used than a wrong resampling test. That is, the likelihood of "Type 4 error" - choosing an inappropriate procedure for the problem at hand - decreases when the user is oriented to resampling.

The situation here is like people suffering for years with a serious disease, clearly diagnosed. There is a treatment, and it has been shown to work.
But people refuse the treatment for ideological or religious or esthetic reasons.

A CRITERION FOR CHOOSING A STATISTICAL METHOD AND FOR DECIDING WHETHER TO TEACH RESAMPLING

For operational comparison of methods we need a criterion. I suggest "statistical utility." By this I mean a composite of (a) the likelihood that an appropriate test will be used (that is, avoidance of Type 4 error), plus (b) the technical characteristics of the tests. This criterion is like an overall cost-benefit analysis, or a loss-function approach, to choosing methods.

The point of view of Wallis and Roberts squares with focusing on statistical utility. They worry that "Techniques and details, beyond a comparatively small range of fairly basic methods, are likely to do more harm than good in the hands of beginners... The great ideas... are lost... nonparametric [methods] involving simpler computations, are more nearly foolproof in the hands of the beginner" (1956, pp. viii, xi). And Wallis and Roberts were prepared to accept some loss of power for this purpose. The same argument applies in a much stronger way to resampling, because it is not less powerful in general, and because it is even less subject to error, being entirely rather than partially intuitive.

Singh makes a similar point about operations research: "In operations research, in particular, the danger of applying unwarily the wrong procedure or method is great because it is quite likely that the assumptions underlying the methods do not hold in the case of the problem under study. The only safeguard against such misapplication is a general understanding of the ideas underlying operations-research methods" (Singh, 1972, pp. 20-21).

The statistical utility of resampling and other methods is an empirical issue, and the test population should be non-statisticians. American Statistical Association president Arnold Zellner wrote that at a meeting on statistics education

I challenged participants to design and perform controlled experiments to show that their proposed solutions to the suffering problem actually work. Perhaps you can do society a service by developing the methodology and showing that your resampling approach produces significantly less suffering and more statistical educational value than other approaches. Such scientific, positive approaches would be extremely valuable, in my opinion. (Correspondence, November 12, 1990)

But such studies were done, and long ago, by my colleagues and me at the University of Illinois, and published in a very widely read journal. Controlled experiments (Simon, Atkinson, and Shevokas, 1976) found better results for resampling methods, even without the use of computers. Students handle more problems correctly, and like statistics much better, with resampling than with conventional methods. This should constitute a prima facie case for resampling. And as Table I-1-1 in Chapter I-1 shows, instruction at both the introductory and the graduate level, from Frederick (Maryland) Junior College to Stanford, yields positive evaluations from students. But more empirical study would be welcome.
This historical note is interesting. When statistics was in its infancy, W. S. Gosset replied as follows to an explanation of the sampling distribution of the partial correlation coefficient by R. A. Fisher [from letter No. 6, May 5, 1922, in Letters From W. S. Gosset to R. A. Fisher 1915-1936, Arthur Guinness Sons and Company, Ltd., Dublin; issued for private circulation]: "...I fear that I can't conscientiously claim to understand it, but I take it for granted that you know what you are talking about and thankfully use the results! It's not so much the mathematics, I can often say 'Well, of course, that's beyond me, but we'll take it as correct' but when I come to 'Evidently' I know that means two hours hard work at least before I can see why." As Dowdy and Wearden remark, "Considering that the original 'Student' of statistics was concerned about whether he could understand the mathematical underpinnings of the discipline, it is reasonable that today's students have similar misgivings. Lest this concern keep our students from appreciating the importance of statistics in research, we consciously avoid theoretical mathematical discussions" (1991, pp. xv-xvi). If Gosset himself could not understand even such simple formulaic material, what should we expect of ordinary students?

WHY RESAMPLING SUCCEEDS

We can see resampling's greatest strength by considering the now-famous problem of the three doors. For almost everyone, logical mathematical deduction is grossly inadequate for arriving at the right answer to that problem. Simulation, however -- and hands-on simulation with physical symbols, rather than computer simulation -- is a surefire way of finding and showing the correct solution (Simon, forthcoming). Furthermore, the explanation soon appears when one examines the simulation results for the three-door problem. Important from the mathematician's point of view, such simulation provides sound insights into why the process is what it is.

It is much the same with other problems in probability and statistics. Simulation can provide not only answers but also insight, whereas for most non-mathematicians, formulas produce obfuscation and confusion.
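By way of illustration, here is a minimal sketch of the kind of simulation meant here, in Python rather than with physical symbols or RESAMPLING STATS. It assumes the standard statement of the problem: the prize is equally likely to be behind any of the three doors, and the host always opens a losing door other than the one the contestant first picked.

    # A minimal simulation sketch of the three-doors problem.
    import random

    n_trials = 10_000
    stay_wins = switch_wins = 0
    for _ in range(n_trials):
        doors = [1, 2, 3]
        prize = random.choice(doors)
        first_pick = random.choice(doors)
        # The host opens a door that is neither the pick nor the prize.
        opened = random.choice([d for d in doors if d != first_pick and d != prize])
        switched = next(d for d in doors if d != first_pick and d != opened)
        if first_pick == prize:
            stay_wins += 1
        if switched == prize:
            switch_wins += 1

    print("win rate if you stay:  ", stay_wins / n_trials)    # settles near 1/3
    print("win rate if you switch:", switch_wins / n_trials)  # settles near 2/3

Run a few thousand trials and the answer announces itself: staying wins about one time in three, switching about two times in three -- and reviewing the trial-by-trial records begins to show why.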
The resampling method is not really taught by an instructor. Rather, it is learned by the students. With a bit of guidance from the instructor, the students invent - from scratch - resampling methods of doing statistics problems. Through a process of self-discovery students develop useful operating definitions of necessary concepts such as "universe," "trial," "estimate," and so on. And together they invent -- after false starts and then moves in new directions -- sound approaches for easy and not-so-easy problems in probability and statistics. For example, with a bit of guidance an average university class can be brought to reinvent such devices as the resampling version of Fisher's randomization test.

The students learn more than how to do problems. They gain the excitement of true intellectual discovery. And they come to understand something of the nature of mathematics and its creation.

Of course, this "discovery" method of teaching causes difficulties for some teachers. It requires that the teacher react spontaneously and let the discussion find its own path, rather than having everything prepared in advance. For some teachers this requires practice. Others may never find it congenial. But for the teacher who is open and responsive and a bit inventive, teaching the resampling method in this fashion is wonderfully exciting. Perhaps most exciting is to see ordinary students inventing solutions to problems that conventional probability theory did not discover for many centuries.

The openness of resampling learning also bothers some students, especially at first. They miss the comfort students derive from a notebook full of well-organized, cut-and-dried formulae; the apparent lack of course structure worries some. But after a few weeks the average student comes to like the resampling approach better, as the controlled experiments of Shevokas and Atkinson show.

OTHER BENEFITS OF RESAMPLING

Resampling has benefits beyond its greater statistical utility. It has many characteristics that contemporary educators (correctly, in my view) call for to improve the basic quality of student learning in mathematics and science education, as discussed in the guidelines of the National Council of Teachers of Mathematics (NCTM), Professional Standards for Teaching Mathematics.

1. NCTM has urged greater use of simulation in teaching probability and statistics:

Concepts of probability ... should be taught intuitively. ... The focus of instructional time should be shifted from the selection of the correct counting technique to analysis of the problem situation and design of an appropriate simulation procedure. ... students should value both [theoretical and simulation] approaches. What should not be taught is that only the theoretical approach yields the "right" solution. (NCTM, 1989)

2. There also are calls for active and hands-on learning rather than a passive process. The National Research Council says that "in reality no one can teach mathematics, and that effective teachers are actually those who can stimulate students to learn mathematics" (1989, p. 58, quoted from Garfield, 1991, p. 5). Herbert Simon writes that "learning results from things the student does, and not (except indirectly) from things a teacher does" (1991, p. 284) - witness the fact that the single sentence a student is likely to remember from a course is the sentence the student him/herself spoke in class (is it true for you?). In his famous How to Solve It, Polya writes:

A great discovery solves a great problem but there is a grain of discovery in the solution of any problem. Your problem may be modest; but if it challenges your curiosity and brings into play your inventive faculties, and if you solve it by your own means, you may experience the tension and enjoy the triumph of discovery. Such experiences at a susceptible age may create a taste for mental work and leave their imprint on mind and character for a lifetime. Thus, a teacher of mathematics has a great opportunity. If he fills his allotted time with drilling his students in routine operations he kills their interest, hampers their intellectual development, and misuses his opportunity. But if he challenges the curiosity of his students by setting them problems proportionate to their knowledge, and helps them to solve their problems with stimulating questions, he may give them a taste for, and some means of, independent thinking. (1957, p. v)

Resampling in the classroom is as active as any learning can be.

3. One of the arguments given for studying formal deductive mathematics and logic has been that the material teaches sound thinking processes. Perhaps so. But there are also reasons to believe that a person's general intellectual development can benefit from learning to handle quantitative problems by empirical methods such as resampling. These are some of the reasons:
a. The step-by-step resampling process resembles the historical process by which mathematicians commonly develop more general abstract ideas. Mathematical invention often is empirical. That is, one may have a mathematical idea, first try it out with numerical experiments, and only later generalize and formalize it. Or, one may solve a set of individual problems one by one with numerical methods, and only later see the thread that runs through them and arrive at the general abstract solution. Resampling has much in common with the first part of this process. Littlewood (1986, p. 97) tells us that the great Indian-English mathematician Ramanujan often worked just that way: "by empirical induction from particular numerical cases." [(He was not a "Martian" a la Barrow's scenario in Chapter 00, because he did not stop with the induction in all cases, though sometimes he did.)] Mathematicians often lament that students see a mathematical result only in its logical form, sanitized of this production process, and therefore learners do not know what goes into the making of mathematics. This lament should produce some sympathy for teaching resampling when teaching probability.

Consider as an example Mosteller's problem 39 in his Fifty Challenging Problems. In a laboratory, each of a handful of thin 9-inch glass rods had one tip marked with a blue dot and the other with a red. When the laboratory assistant tripped and dropped them onto the concrete floor, many broke into three pieces. For these, what was the average length of the fragment with the blue dot? If one does not at first see the general answer, one may approach the problem empirically with the following steps:

1. Let the numbers 1-900 model a nearly continuous (discrete marks 1/100 inch apart) glass rod nine inches long.

2. Select two numbers at random from (1), without replacement (because a break can only take place once at a given spot). (This makes the same unrealistic assumption implicit in Professor Mosteller's problem and solution, that break points are equally probable everywhere along the rod. I'd guess that very small pieces are less likely than pieces somewhat less than 1/3 of the rod, say. Working up a simulated approach, or even more so an actual experiment, is more likely to spotlight such an unrealistic assumption - if it is indeed so - than proceeding with formulae. Practical people should care about this.)

3. Consider that the blue dot is at the high-numbered end of the rod. Hence subtract the larger of the two numbers from 900 and record the result - the length of the blue-dot fragment in hundredths of an inch.

4. Repeat steps 2-3, say, 1000 times, and take the mean of the recorded lengths to get the answer sought.

After doing the simulation, one may notice that all three pieces average one-third of the total length. By reviewing the results of the specific trials one may then understand why this is so, and then find the "explanation" for the result, which constitutes the more abstractly-based answer.
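For concreteness, here is a minimal sketch of those four steps in Python rather than by-hand trials or RESAMPLING STATS; the number of trials is arbitrary.

    # A sketch of the rod-breaking simulation described in steps 1-4 above.
    # The rod is modeled as the numbers 1-900 (hundredths of an inch), with
    # the blue dot at the high-numbered end.
    import random

    n_trials = 10_000
    total_length = 0
    for _ in range(n_trials):
        breaks = random.sample(range(1, 901), 2)   # two break points, no repeats
        blue_piece = 900 - max(breaks)             # blue-dot fragment, in hundredths
        total_length += blue_piece

    average_inches = total_length / n_trials / 100
    print("average length of blue-dot piece:", round(average_inches, 2), "inches")

The average settles near 3 inches - one-third of the rod - which is the clue that sends one looking for the general explanation.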
b. Teachers of problem-solving often focus on "breakthrough" processes that involve putting aside assumptions that restrict thinking and keep one from getting to the solution; a well-known example is the invisible boundary around a set of points that keeps one from going outside a square to connect the nine points with a given number of straight lines. Another focus of such "creativity" instruction is increasing the flow of ideas, as in the brainstorming procedure. But there is reason to think that teaching people how to overcome the limits on the amount of material our unaided brains can handle at once is also crucial. These are some pieces of evidence:

i) Experience with dynamic programming (a.k.a. backward induction, and the decision tree) shows that in a very large proportion of cases where the technique is useful, the benefit comes not from computation (which gets done in only about 20 percent of the cases) but rather from reducing to paper the cloud of vague thoughts about the issue at hand - the various alternatives, probabilities, and consequences that float around our brains all at once. Many of our problems seem too difficult for "rational" systematic thought - for example, many life decisions such as what to choose for an occupation - because there are so many connections and so many interactions. But the "tool" of pencil and paper and constructing a decision tree can help greatly.

ii) The famous "Brothers and sisters have I none..." puzzle illustrates the power of simple techniques for mastering information that is too complex to handle by ratiocination alone.

INTELLECTUAL OBJECTIONS TO RESAMPLING, AND REBUTTALS TO THEM

The resistances to teaching resampling include intellectual objections, turf problems, and individuals' investments in their stock of professional knowledge. Indeed, these barriers have been great enough to prevent it from being adopted widely since the late 1960s, when I first began teaching it and publishing about it. However, the intellectual objections are beginning to melt away, in considerable part because of the more recent development of the theory of the bootstrap by Bradley Efron and the many others in his wake. The opponents of resampling now are becoming defensive, to the point of dwelling on such irrelevancies as whether the random number generator is good enough. I'll consider only the intellectual objections here. These are some of the root objections to resampling as a basic tool for everyday use:

1. With card and dice experiments one can make statistical tests without the mathematical theory of probability, but figuring the answer analytically is sometimes quicker. For example, imagine that you want to know how often you will get two aces if you deal a hand of only two cards from a bridge deck. A satisfactory simulation would take some time, but one can figure the answer in a hurry by multiplying 4/52 x 3/51; the formulaic method quickly tells us that, on average, we shall get 2 aces in a hand of 2 cards once in 221 hands. Though analytical methods may produce an answer more quickly than the resampling method, a person who does not expect to use probability and statistics often might find that, in the long run, it is more efficient to spend a bit of extra time on doing resampling trials than to study the analytical methods. However, people who plan scientific careers might well eventually study analytic statistics as well as learning simulation methods, because that study deepens one's intuition about scientific and mathematical relationships.
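For the curious, here is a minimal Python sketch of the simulation side of that comparison: deal two cards from a shuffled deck many times, count how often both are aces, and set the result beside 4/52 x 3/51. The number of trials is arbitrary.

    # The two-aces example: simulation versus the formula 4/52 x 3/51 = 1/221.
    import random

    deck = ["ace"] * 4 + ["other"] * 48     # a 52-card deck, only aces distinguished
    n_trials = 100_000
    both_aces = 0
    for _ in range(n_trials):
        hand = random.sample(deck, 2)       # deal two cards without replacement
        if hand == ["ace", "ace"]:
            both_aces += 1

    print("simulated probability: ", both_aces / n_trials)
    print("formula 4/52 * 3/51:   ", round(4 / 52 * 3 / 51, 5))   # about 0.0045

Both routes arrive at roughly one hand in 221; the formula gets there faster, which is exactly the objection being answered in the text.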
2. The resampling method demands that every step be thought through from first principles. This may seem to be a disadvantage because such thinking about models is hard work. However, this discipline reduces the probability of erroneous calculations that arise from blindly plugging in the wrong method and formula. Moreover, learning and practicing the experimental approach aids problem-solving in general.

A side benefit is that, for the decreasing number of students who come to the class unfamiliar with computers, the resampling method together with the computer program RESAMPLING STATS also serves as a painless introduction to computers and computer programming. Such basic concepts as IF and looping are learned without special instruction because they are seen to be necessary for repeated resampling trials. And general notions such as booting up, menus, and the operating system also are learned without fuss, as natural parts of the process, simply because this is what the students find themselves doing. Fear of computers also is rapidly dispelled in this environment.

3. In earlier years, a frequent objection to resampling -- and the easiest to meet -- was that the estimates are inaccurate. Resampling sample sizes must, of course, be large enough for any desired level of accuracy. But in many situations, an adequate sample of random numbers can be drawn by hand in a few minutes. If not, it is easy to program any of these procedures onto a personal computer using RESAMPLING STATS or other languages, and thereby obtain samples of huge sizes in seconds. With the computer, even probabilities close to 0 or 1 can be estimated accurately in a short span of time. Furthermore, students quickly grasp the importance of inaccuracy due to sampling error as they observe the variation in their resampling samples. This causes them to worry about sample variability -- which is perhaps the most important lesson in all of statistics. Then the students increase the size of their resampling samples until they reach acceptable levels of accuracy.

4. In some statistical testing situations, resampling may be less effective than conventional methods. Mathematical statisticians are now busy investigating the conditions under which the bootstrap is to be preferred by trained statisticians. But this does not alter very much -- if at all -- the role of resampling for that large number of persons who will never come to have a satisfactory command of conventional analytic methods.

5. Resampling may be considered obscurantist, anti-intellectual, and likely to limit the student's advance into formal mathematical analysis. There are several counter-arguments, however:

a. There are many students who will never go forward with the study of statistics and mathematics, whether or not they are exposed to this instruction. For these students, this powerful engine of problem-solving is an important educational bonus.

b. This instruction interests many students. It probably pushes far more students into further study than it keeps out.

c. Perhaps most important, learning resampling is of great intellectual value to those who will make a further study of statistics and probability. The procedure by which one must explicitly structure problems in this method is also necessary when problems are solved analytically, though when using the analytic method the structuring process is too often done implicitly or without awareness, leading to the wrong model or an unsound choice of a cookbook formula. Resampling experience is therefore of great value in teaching what analytic methods are good for, and how to use them correctly.
Consider these remarks by Alvan Feinstein, professor of clinical biostatistics at Yale Medical School, who has spent decades trying to get medical researchers to use statistics effectively: "[E]laborate analyses have sometimes obscured rather than clarified the scientific problems... [they cause people who might otherwise ask hard questions to be] too baffled or awed by the intricate mathematics to want to speak up" (1988, p. 476). And: "The clinician, forgetting the importance of his own contribution to the logic and data of the research, becomes mesmerized by what he does not understand: the statistical analyses. He assumes that the statistical computations will somehow validate the more basic activities, rectifying errors in observation and correcting distorted logic" (1970a, p. 143). Feinstein also quotes with approval P. G. H. Gell: "Mathematics has now matured into a sacred cow which as often as not gets in the way of scientific traffic" (in Feinstein, 1970b, p. 291). Similar laments may be found from wise students of the research and decision-making process in such fields as business, psychology, sociology, and economics, as well as biology.

6. After years of attempting to attract the interest of technically-minded statisticians, and to satisfy them that simulation is not intellectually inferior, resampling advocates now hear thunder on the other side. The growing return to teaching the philosophy and techniques of data analysis in the schools as well as in the universities is itself a welcome development. An editorial in The American Statistician says, "The battle over whether more of the curriculum should be taught from a data-analytic point of view has essentially been won" (Scheaffer, 1990, p. 2). But as discussed earlier, this new emphasis has led some to downgrade the importance of probabilistic tools for determining the reliability of descriptive statistics. The pendulum will surely swing back, though, because probabilistic statistics is too fundamental to be pushed to the side for very long.

To their credit, the professional associations - the National Council of Teachers of Mathematics, and the American Statistical Association - recognize the problem with the old methods. They have even given their blessing to the resampling simulation method, though still in a somewhat muted fashion. But any innovation takes a long time to be fully adopted, and educational innovations are perhaps slowest of all; hence, so far the resampling methods are fully taught at relatively few universities and high schools.

7. The "Shiny Toy" Difficulty. In connection with changing paradigms in economics, Robert Solow comments: "You can only replace one shiny toy with another" (in Klamer, 1983, p. 144). He is addressing himself to the behavior of students, but the remark is equally appropriate with respect to instructors, who enjoy having shiny toys to present to their classes. Resampling done by hand was a very dull occupation. But the computer makes this method a much shinier toy, and may help overcome this barrier to adoption.

[The Battle for Turf. There are two aspects of this barrier - the battle for intellectual turf, and the battle to keep jobs. This anecdote is telling: At a seminar for bio-statisticians at the National Institutes of Health on December 14, 1993, one member of the audience said at the end of the talk: "This is a very egalitarian method. But if users can understand everything that's being done, and can create their own methods, what do they need me for?"
That, of course, is one of the main barriers to the resampling method.]

CONCLUSION

Estimating probabilities with conventional formal mathematical methods usually is so complex that the process scares many people. And properly so, because the complexities frequently cause error. The statistical profession has long expressed grave concern about the widespread use of conventional tests whose foundations are poorly understood. But the easy availability of statistical computer packages that can perform conventional tests with a single command, irrespective of whether the user understands what is going on or whether the test is appropriate, has recently exacerbated this problem. This has led to calls for emphasizing descriptive statistics and even ignoring inferential statistics.

Probabilistic analysis is essential, however. Judgments about whether to allow a new medicine on the market, or whether to re-adjust a screw machine, require more than eyeballing the data to assess chance variability. But the conventional practice and teaching of probabilistic statistics, with its abstruse structure of mathematical formulas cum tables of values based on restrictive assumptions concerning data distributions -- all of which separate the user from the actual data or physical process under consideration -- will not do the job.

Resampling can do the job. But there are large barriers against its widespread adoption in everyday use. (See Chapter 00 on the future of resampling.) When will it be given the job?

REFERENCES

Barlow, Roger, Statistics (New York: Wiley, 1989).

Bisgaard, Soren, "Teaching Statistics to Engineers," The American Statistician, vol. 45, no. 4, November 1991, pp. 274-283.

Dallal, Gerard E., "Statistical Computing Packages: Dare We Abandon Their Teaching to Others?," The American Statistician, vol. 44, no. 4, November 1990, pp. 265-266.

Dowdy, Shirley, and Stanley Wearden, Statistics for Research (New York: Wiley & Sons, 1991).

Edgington, Eugene S., Randomization Tests (New York: Marcel Dekker, 1980), referred to by Manly, 1991, p. 17.

Efron, Bradley, and Robert J. Tibshirani, An Introduction to the Bootstrap (New York: Chapman and Hall, 1993).

Einstein, Albert, Relativity (New York: Crown Press, 1916/1952).

Encyclopedia of Statistics, "Computers and Statistics," Vol. 2, p. 95.

Feinstein, Alvan R., "Clinical Biostatistics - I," Clinical Pharmacology and Therapeutics, vol. 11, no. 1, 1970a, pp. 135-148.

Feinstein, Alvan R., "Clinical Biostatistics - II: Statistics Versus Science in the Design of Experiments," Clinical Pharmacology and Therapeutics, vol. 11, no. 2, 1970b, pp. 282-292.

Feinstein, Alvan R., "Fraud, Distortion, Delusion, and Consensus: The Problems of Human and Natural Deception in Epidemiologic Science," The American Journal of Medicine, vol. 84, March 1988, pp. 475-478.

Freedman, David, Robert Pisani, Roger Purves, and Ani Adhikari, Instructor's Manual for Statistics, second edition (New York: Norton, 1991).

Gardner, Martin, Mathematical Carnival (New York: Vintage Books, 1977).

Garfield, Joan B., "Reforming the Introductory Statistics Course," paper presented at the American Educational Research Association Annual Meeting, Chicago, 1991.

Hall, Peter, The Bootstrap and Edgeworth Expansion (New York: Springer-Verlag, 1992).

Hayek, F. A., New Studies in Philosophy, Politics, Economics and the History of Ideas (Chicago: The University of Chicago Press, 1978).

Hey, John D., Data in Doubt: An Introduction to Bayesian Statistical Inference for Economists (Oxford: Martin Robertson, 1983).

Hogg, Robert V., "Statisticians Gather to Discuss Statistical Education," Amstat News, November 1990.

Hogg, Robert V., "Statistical Education: Improvements Are Badly Needed," The American Statistician, vol. 45, November 1991, pp. 342-343.

Hollander, Myles, and Frank Proschan, The Statistical Exorcist (New York: Marcel Dekker, 1984).
Hotelling, H., "The Teaching of Statistics," Annals of Mathematical Statistics, vol. 11, 1940, pp. 457-472.

Kempthorne, Oscar, "The Teaching of Statistics: Content Versus Form," The American Statistician, vol. 34, no. 1, February 1980, pp. 17-21.

Littlewood, J. E., Littlewood's Miscellany, edited by Bela Bollobas (New York: Cambridge University Press, 1953/1986).

Manly, Bryan F. J., Randomization and Monte Carlo Methods in Biology (New York: Chapman and Hall, 1991).

Moore, Thomas L., and Rosemary A. Roberts, "Statistics at Liberal Arts Colleges," The American Statistician, vol. 43, no. 2, May 1989, pp. 80-85.

Mosteller, Frederick, Fifty Challenging Problems in Probability (New York: Dover, 1965/1987).

Nagel, Ernest, and James R. Newman, "Goedel's Proof," in Newman, 1956, pp. 1668-1695.

National Research Council, 1989 (p. 58, quoted from Garfield, 1991, p. 5).

NCTM, 1989.

Peterson, Ivars, "Pick a Sample," Science News, July 27, 1991, pp. 56-58.

Polanyi, Michael, Knowing and Being, edited by Marjorie Grene (Chicago: The University of Chicago Press, 1969).

Polanyi, Michael, Personal Knowledge (Chicago: The University of Chicago Press, 1962).

Polya, G., How To Solve It: A New Aspect of Mathematical Method (Garden City, New York: Doubleday & Company, Inc., 1957).

Ruberg, Stephen J., Biopharmaceutical Report, vol. 1, Summer 1992.

Scheaffer, Richard L., "Toward a More Quantitatively Literate Citizenry," The American Statistician, vol. 44, February 1990, p. 2.

Simon, Herbert A., 1991, p. 284.

Singer, Judith D., and John B. Willett, "Improving the Teaching of Applied Statistics: Putting the Data Back Into Data Analysis," The American Statistician, vol. 44, no. 3, August 1990, pp. 223-230.

Singh, Jagjit, Great Ideas of Operations Research (New York: Dover Publications, Inc., 1972).

Solow, Robert, in Klamer, Arjo, Conversations With Economists (Rowman & Littlefield, 1983), p. 144.

Vaisrub, Naomie, Chance, Winter 1990, p. 53.

Wallis, W. Allen, and Harry V. Roberts, Statistics: A New Approach (Chicago: Free Press, 1956).

Watts, Donald, "Why Is Introductory Statistics Difficult to Learn? And What Can We Do to Make It Easier?," The American Statistician, vol. 45, no. 4, November 1991, pp. 290-291.

Westfall, Peter H., and S. Stanley Young, Resampling-Based Multiple Testing (New York: Wiley, 1993).

[OUT? IN FUTURE?? As a general matter, the law and the bureaucrats prevent people from learning from the best teachers in the nation, hence preventing intellectual progress and productivity gains in education. Examples: 1) In twenty-seven states, high school students may not receive credit for courses taught by television. 2) In Maryland, the state university may not beam lower-level undergraduate programs to other institutions, including junior colleges - a cartel-like scheme that would be illegal if undertaken by private firms. Even tougher are the informal barriers against presenting the best teaching by the best minds through high-tech media - video cassettes, computer tutorials, and "distance learning" via television. For example, at George Mason University, Dr. X finds that the professors will not participate in "course-sharing" -- that is, bringing in television programs from other universities. All the talk, and all the commissions, intended to improve education are a waste of time if the most important steps that can be taken to improve education are stymied by organizational and individual self-interest on the part of the education establishment. No one should be surprised at the existence of these barriers to the production of the finest education.
Put yourself in the place of teachers, and it's easy enough to imagine yourself not welcoming technical changes which will reduce the demand for your services. Teachers are no different from weavers in the eighteenth century, caboose-riding brakemen in the early twentieth century, and industrial workers right now.]