I’d like to contrast the way statistics instructors approach data analysis with the way researchers use statistical methods to ask specific questions of their data. I previously took undergraduate statistics, and then two semesters of graduate-level coursework. Both my undergraduate course and the first semester of graduate coursework were geared towards learning tools: here is how to calculate a standard deviation by hand, here is how to calculate a confidence interval from a z-table, here is the equation for a two-sample pooled t-test, and so on. In the later course, the curriculum was built entirely around experimental design, and approached statistics from an applied research perspective instead of a theoretical, quantitative one. There is no question that all of my classmates found the latter course more difficult (the fact that it was taught by a 60-year-old tenured professor who recorded all of his lectures as meticulously timed monotone recitals of equations didn’t exactly help).

In all of the statistics coursework I’ve taken, particularly in graduate school, I’ve noticed that my co-workers (and occasionally I too) have preprogrammed into our brains whatever scientific question we’ve written into our proposals, and just focus on the technique that question requires for the entire semester. For example, my data happened to be very zero-inflated, so I spent a ton of time learning zero-inflated Poisson (ZIP) and negative binomial models. I feel like that approach limits your ability to gain a broader understanding of the entire universe of questions that you can ask of your data or, more importantly, that you can ask of your field before you pull out your calipers or microscope for the first time.

That’s why I really appreciate the much broader, theoretical perspective that Jarrett’s course takes, even if I occasionally complain about all of the simulations that are so ubiquitous in the homework assignments. If you don’t know the theory, then there’s no point in knowing how to use the tools. I walked by a friend’s office last week and we chatted about some statistics problems and simulations. Someone far back in the office commented that she didn’t think she needed to know the equation for standard error because there was code she could simply type in to get it. My friend and I smiled at each other knowingly, realizing that if you don’t know what’s under the hood, then you really shouldn’t be driving the car. Of course, such a thorough approach limits the number of topics that you can explore in a single semester. But the truth is that there are so many types of analysis that you can’t hope to master everything anyway.
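To make the standard error point concrete, here is a minimal Python sketch of what that one line of code is doing under the hood. It is only an illustration, not code from the course: the function names, the seed, and the population parameters (mean 10, standard deviation 2, samples of 50) are all assumptions I picked for the example. The first function is the textbook equation, the sample standard deviation divided by the square root of n. The second is the simulation view: draw many samples, take the mean of each, and measure how much those means spread.

```python
import math
import random
import statistics


def standard_error(sample):
    """Analytic standard error of the mean: s / sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))


def simulated_standard_error(mu, sigma, n, reps, rng):
    """Standard deviation of the sample means from `reps` simulated samples.

    Each sample holds n draws from a normal population (an assumption
    made purely for this illustration).
    """
    means = []
    for _ in range(reps):
        draws = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(draws))
    return statistics.stdev(means)


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the run is repeatable
    sample = [rng.gauss(10.0, 2.0) for _ in range(50)]

    print("equation, from one sample:", standard_error(sample))
    print("simulation, 2000 samples: ", simulated_standard_error(10.0, 2.0, 50, 2000, rng))
    print("theory, sigma / sqrt(n):  ", 2.0 / math.sqrt(50))
```

The three numbers should land close to one another, which is the whole point: the equation is a shortcut for the spread you would see if you could repeat the experiment many times.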

And I think that speaks to the larger problem of specialization in a time of interdisciplinary conglomeration. Does someone who studies gene regulation care about community ecology? Should someone who studies disease transmission in amphibians care about microbial anaerobic metabolism? More applicably, should someone who needs to run a factorial ANOVA with multiple post-hoc comparisons also know how to run a discriminant function analysis? The logical answer I come up with is that if you’re only focused on your data after the fact, after your experiment is already run, then no, it doesn’t really matter. You can pick the right analysis after checking the assumptions about your data’s distribution, and get a quantitative result. But if you’re focused on the question you want to ask in your field, then yes, many types of analyses should be on the table and understood. Similarly, if you’re asking a specific question about gene regulation, and that’s all you care about, then sure, you don’t need to know about community dynamics. But if you’re asking about gene regulation in a specific organism, then yes, maybe community ecology affects selective processes and phenotypic expression from native gene sequences.

“The expert knows more and more about less and less until he knows everything about nothing.” ― Mahatma Gandhi

“Of course, I am interested, but I would not dare to talk about them. In talking about the impact of ideas in one field on ideas in another field, one is always apt to make a fool of oneself. In these days of specialization there are too few people who have such a deep understanding of two departments of our knowledge that they do not make fools of themselves in one or the other.” ― Richard P. Feynman, The Meaning of It All: Thoughts of a Citizen-Scientist

In science, specialization without integration can only get you so far, and having multiple avenues of inquiry, and of subsequent analysis, enables a far broader understanding. One of my PIs, who happens to be the head of her department, is famous for saying that she is a chemist, and doesn’t care at all about biology. Fair enough. After hearing that, I immediately sent her an email with the following quotes:

“A man cannot be professor of zoölogy on one day and of chemistry on the next, and do good work in both. As in a concert all are musicians,—one plays one instrument, and one another, but none all in perfection.” — Louis Agassiz

“When chemists have brought their knowledge out of their special laboratories into the laboratory of the world, where chemical combinations are and have been through all time going on in such vast proportions,—when physicists study the laws of moisture, of clouds and storms, in past periods as well as in the present,—when, in short, geologists and zoologists are chemists and physicists, and vice versa,—then we shall learn more of the changes the world has undergone than is possible now that they are separately studied.” — Louis Agassiz

The more you know, especially about statistics, the more questions you can ask about the world around you. So I’m excited to learn about the things I don’t know, to learn more about the mechanics behind the things I do know, and to learn exactly when I should use a scalpel and not a sledgehammer, a t-test and not a general linear model. To know how each component of an advanced technique or function is built, and to ask questions that require multiple avenues of analysis and that allow a comprehensive understanding of ultimate and not just proximate causation. Top-down and bottom-up understanding. That would be nice, to know math and analysis, to have both theory and action, not merely one or the other. http://wakinglifemovie.net/Transcript/Chapter/12