Tall, Dark, and Mysterious

12/15/2005

So, what IS the point of those introductory college statistics classes, anyway?

File under: Those Who Can't, Queen of Sciences. Posted by Moebius Stripper at 9:06 pm.

For the past few months, I’ve been tutoring a college student in statistics. This tutee, unlike the other one I took on this summer, is good-natured, engaged, reasonably comfortable with basic mathematics, and in general an absolute pleasure to work with. But I figure there’s a reason that TD&M gets more hits in an hour than there are holds on every copy of Pollyanna in BC’s public libraries put together, and why mess with a winning formula? So enough about that student.

Let’s talk instead about the purpose of these statistics requirements for business and social science majors.

I taught such a course last year. My objective going into the course - an objective that made its way onto the syllabus - was for my students to emerge with a decent ability to assess and interpret quantitative data. Every lesson plan was subordinated to this purpose. I taught the standard intro-stats notation and terminology, but only as a means to an end. Each and every one of the ten quizzes and three tests that my students wrote contained at least one question that required that they answer in plain English. It was not enough that they be able to give me the bounds of a confidence interval; to receive full credit, they needed to tell me what it meant. It was not enough for a student to tell me that a sample proportion fell inside the critical region and that we should therefore reject the null hypothesis; they needed to tell me what that meant in terms of a manufacturer’s or politician’s claim.

The class, for the most part, went rather well. Marks weren’t great; but I was confident that a good mark in my statistics class indicated a genuine understanding of statistics, and not just an ability to pluck out numbers that, when plugged into a meaningless equation, will yield the numerical answer that will be marked correct.

The statistics class that my tutee just completed was quite different. It covered the exact same topics as my class - sampling, measures of central tendency, distribution, probability, estimations of means and proportions, hypothesis testing - but this professor’s teaching and testing style was very different from mine. He gave a lot of assignments, and was fond of breaking questions down into multiple parts that lead a student to the answer. I’m not entirely opposed to some amount of that - hell, I think that such questions are probably the best way, at least initially, to deal with students who freeze when confronted with a problem that they can’t solve immediately - but this prof’s multi-part questions were…ill-conceived, to say the least. In particular:

  • Every question on a certain topic followed exactly the same template. No two topics shared the same template. My tutee quickly figured out that an eight-part question in which the first part asked for a sample proportion and the second asked for the claim of the population proportion, was a hypothesis test that called for use of formulas 8.3 and 8.4. She also figured out that you could plug the first, second, third, and fourth numbers given in the question, respectively, into 8.3; 8.4 used the result of 8.3, along with the fifth number given in the question.

  • Even if I had devoted my best efforts to the task, I could not have written questions more leading than this guy’s. For instance:
      …The distribution of student weights is unknown. 42 students are weighed…

      a) …

      b) …

      c) Which of the following applies:

        • We can use Formula 7.3 because the distribution of student weights is normal.
        • We can use Formula 7.3 because the sample size is at least 30.
        • We cannot use Formula 7.3 because the distribution is not normal and the sample size is less than 30.

  • The man was a certifiable jargon/notation fetishist. Tell me, how the hell else do you explain a question of the form “What is the sign (less than, greater than, or not equal to) that appears in the statement of the alternative hypothesis HA?” Or - and this is my bias talking, because I can never for the life of me remember which label goes with which - the query “would this be a Type I or a Type II error?” with no followup.

None of the questions had an “explain in plain English” portion. My tutee, whose term mark was among the highest in her class, could tell me that in Question #4 of Section 9, we reject the null hypothesis because x-bar fell in the critical region, but she could not tell me that what this meant was that the lightbulb manufacturer’s claim was bullshit. She solved the problems on her tests and assignments by pattern-matching on the rigid templates, and by following the leading questions. When I worked with her two nights before her exam, she stated matter-of-factly that she expected to forget everything from the course the next day.
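To make the gap between "x-bar fell in the critical region" and "the claim doesn't hold up" concrete, here is a minimal sketch of a one-sided z-test that ends with a plain-English conclusion. Everything specific in it is an assumption made for illustration: the bulb lifetimes, the known standard deviation, the 5% significance level, and the function names are hypothetical and do not come from the course described above.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_z_test(sample_mean, claimed_mean, sigma, n, alpha=0.05):
    """Test H0: mu = claimed_mean against HA: mu < claimed_mean.

    Assumes the population standard deviation sigma is known and that
    the sample is large enough for the sample mean to be roughly normal.
    """
    z = (sample_mean - claimed_mean) / (sigma / sqrt(n))
    critical = NormalDist().inv_cdf(alpha)  # left-tail cutoff for the chosen alpha
    return z, critical, z < critical


def in_plain_english(reject, claimed_mean):
    """Translate the reject / fail-to-reject decision into a sentence."""
    if reject:
        return (f"The bulbs we tested burned out early enough that the claim "
                f"of {claimed_mean} hours on average is hard to believe.")
    return (f"Our sample does not give us good reason to doubt the claim "
            f"of {claimed_mean} hours on average.")


# Hypothetical data: 36 bulbs averaging 980 hours against a claimed 1000,
# with an assumed known standard deviation of 60 hours.
z, critical, reject = one_sided_z_test(
    sample_mean=980, claimed_mean=1000, sigma=60, n=36
)
print(f"z = {z:.2f}, critical value = {critical:.3f}")
print(in_plain_english(reject, claimed_mean=1000))
```

The arithmetic is the part a template can carry; the last function is the part that the "explain in plain English" questions were meant to test.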

If this is what one of the top students in this statistics class has taken from the course, then I think it’s a pretty safe bet that this statistics class is not preparing students to assess and interpret quantitative data.

But I can’t hold this professor responsible, because he seems to be doing a good job under the circumstances: he’s got a jam-packed curriculum to follow, and is responsible for delivering a bevy of content at the expense of skills. Even though this prof gives plenty of practice problems that prepare for the very predictable tests, and even though he gives excellent notes, and even though he is available for plenty of extra help…despite all that, my student - who has been sick half the term and who, by her own admission, has been slacking off lately - is one of the top students. I’m certainly not going to second-guess what I presume was a conclusion that his students could not handle a more rigorous course, one that aims to train students to assess and interpret non-canned quantitative data.

I can’t blame him for concluding that there’s no way he can deliver such a course successfully, so he might as well not have his students hate him by the end of the term. And if that means that there’s no guarantee that an A student will understand what it means for a poll to be accurate within three percentage points nineteen times out of twenty, then so be it.
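For reference, "accurate within three percentage points nineteen times out of twenty" is a 95% confidence statement about a sample proportion. A minimal sketch of the arithmetic behind it follows, assuming a simple random sample, the normal approximation, and the worst-case proportion p = 0.5; the function names are made up for this illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist


def margin_of_error(n, confidence=0.95, p=0.5):
    """Half-width of a normal-approximation interval for a proportion.

    p = 0.5 is the worst case: it gives the widest possible margin.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sqrt(p * (1 - p) / n)


def sample_size_for(margin, confidence=0.95, p=0.5):
    """Smallest n whose margin of error is at most the requested margin."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z ** 2 * p * (1 - p) / margin ** 2)


# "Nineteen times out of twenty" is 19/20 = 95% confidence.
print(f"Margin with 1000 respondents: {margin_of_error(1000):.3f}")
print(f"Respondents needed for 3 points: {sample_size_for(0.03)}")
```

In words: if the same poll were repeated many times, roughly nineteen in twenty of the resulting intervals (the reported figure plus or minus three points) would contain the true population proportion. This is why polls of a little over a thousand people so often quote that margin.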

And there’s a big problem with that. If there’s one math class in which the question “what’s the point of this?” should never ever come up, surely it is introductory statistics. But I can’t for the life of me see how anyone could justify teaching a statistics course like the one I tutored.

I wish I could design such a course, because having taught it once, I know exactly what I’d do differently if I were granted full control over the format. In two words: less content. Oddly, calling for less content in a math class tends to invite charges of “dumbing down”, and we can’t have that! Never mind that the textbooks of yore contained vastly less content than the ones of today, but emphasized mastery and application.

Here’s what I’d trim out of a single-semester intro stats class:

  • Most of the probability section. I love probability - so much that I spent far too much time on it last term - but it’s easy to underestimate just how much difficulty students have with it. I’d get rid of everything that isn’t necessary for binomial probability applications, and leave those in only because of the normal approximation to them. (Height of stupidity: spending three weeks on permutations and combinations, and then glossing over the connection between probability and statistics. Yes, I did that last year.)

  • The Student’s-t distribution. There’s more than enough you can do with normally-distributed sample means, and if we’re going to wave our hands over the Central Limit Theorem anyway, why confuse matters with the rule that samples of size thirty use Table A5 while samples of size 29 use Table A7? This time would be better spent elsewhere.

  • Though not on the “estimating the standard deviation” section. Estimating means is simpler and more relevant, and students still struggle with it.
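The normal approximation to the binomial mentioned in the first item above can be checked in a few lines. This is only a sketch under stated assumptions: the coin-flip numbers are invented for illustration, the continuity correction of 0.5 is a standard textbook choice rather than anything taken from the course, and the function names are hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist


def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


def normal_approx_cdf(k, n, p):
    """Approximate P(X <= k) with a normal curve of matching mean and spread.

    The + 0.5 is the usual continuity correction for a discrete variable.
    """
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    return NormalDist(mu, sigma).cdf(k + 0.5)


# Hypothetical example: at most 45 heads in 100 flips of a fair coin.
exact = binomial_cdf(45, 100, 0.5)
approx = normal_approx_cdf(45, 100, 0.5)
print(f"exact = {exact:.4f}, normal approximation = {approx:.4f}")
```

For a sample this size the two figures should land very close together, which is the whole reason the binomial material earns its place as a bridge from probability to the statistics that follows.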

The leftover time - and really, there isn’t much when you cover the rest of the course at a reasonable pace - can be used for hands-on activities, which are so natural for a statistics course. It can be used to have students design the sorts of questions that usually appear on tests: the data they encounter when they see the latest polls, or when they precisely weigh a bag of apples, provides suitable fodder for a variety of such problems. It can be used to discuss why one researcher would rather risk Type I errors, and another Type II errors. It can be used to emphasize, over and over and over again, the implications of the material everywhere.

I don’t think that such a course would be at all dumbed-down from the one that my tutee took this year; to the contrary, it would require students to think far more deeply about the material. But such a course would be faithful to what I assume are the reasons for teaching introductory statistics. And if I were to teach it, I’d feel a lot better about the answers I give to what’s the point of this stuff than I would if I were instead responsible for delivering the more content-heavy statistics class that nearly every business and social science department requires its students to take.