R Immersion

As the end of the semester nears, I have found myself reflecting on how far I’ve come since the first class. This was the first graduate class I have taken, and after a two-year hiatus from school I had to make quite an adjustment, since I work a full-time research job alongside class. I had no coding experience at all, and I thought I knew enough statistics from my data preparation and embarrassing p-value conclusions in Excel. When the R coding conversations kicked off and I was still having trouble knitting a PDF output, I became very nervous.

I realized that coding in R was like learning a new language, and it reminded me of stories my father told me about moving to Italy in the 1970s without knowing a word of Italian, at a time when English was not very common there. To learn Italian as quickly as possible, he took a language immersion class in which they spoke only Italian. He became fluent in a matter of months, because you essentially force yourself to learn a language when it’s the only option. I tried to treat our class and my time working on homework the same way: thinking in code and ultimately putting the code and the new statistics knowledge together piece by piece.

Most of my research experience is in the biotech realm of protein and molecular biology. However, in the same immersive way that I learned to code, I have learned a lot about ecology and environmental biology just from the related readings and from the ecologists around me in class. I have noticed that ecological studies can have many more variables at play, which often requires the researcher to take a step back and assess the best way to analyze the data. In my own research, I have often found myself pushed toward regimented data analysis that declares a difference “significant” or “insignificant” based on a 0.05 p-value threshold, regardless of the situation. Although data generated in an in vitro laboratory study are usually more controlled and intentionally contain fewer variables than data from most field studies, it is still valuable to think about the relationships among variables and about different models.

I think this default style of data analysis is dangerous and incorrect because, just like science, statistics is an evolving and improving field. This was said perfectly in the introduction of the Ecology Special Section on P Values Forum: “We also need to remember that ‘statistics’ is an active research discipline, not a static tool-box to be opened once and used repeatedly… Continual new developments in statistics allow not only for reexamination of existing data sets and conclusions drawn from their analysis, but also for inclusion of new data in drawing more informative scientific inferences.”

Prior to this class, like many people in immunology and molecular biology, including well-published scientists, I admittedly thought of statistics as always having one right way to do things. Thankfully, being surrounded by ecologists and thinking about complicated data sets has allowed me to break free of this thought process and the robotic analysis style.


3 Responses to R Immersion

  1. martinew says:

    I’m impressed that you have not been put off ecology! I’m pretty sure that a lot of ecologists would think that what you do is fairly complex too. Biotech seems so precise, with many things to control for. You can’t just throw a settlement plate out in the water and go and look at it in a few months’ time. I do love ecology for this reason.

  2. jebyrnes says:

    I’m so glad this has been helpful to you! As I mentioned, I learned all of this in an Ag Stats class. Then, when I learned mixed models, it was in the context of political science and sociology research. Actually, our Psych department does a “Stats Lunch” that I went to a few times, and it was wonderful – y’all might want to think about going in the future. Same problems, same analyses, slightly different lingo. But I learn so much from hearing about how others think about how they build their models and the issues they run into. It’s worth continuing to learn – not just so you can watch as statistics evolves as a science, but to learn bigger and broader ways of thinking about data. Because, frankly, you’ll eventually run into the same issues, just in a different guise some day!

  3. coastalsci says:

    Stuart, nice piece and well-written. When I read this last week I found myself nodding in agreement. I too came into the class thinking of statistics as something rather mechanical – a finite set of tools that one just had to learn well enough to be able to “prove” the validity of one’s research (or learn enough to use a plug-and-play stats package as friends and colleagues I knew did). So long as the p-values were in line, life was good.
    The section you quoted from the P Values Forum is particularly apt. Very cool that as a result of this class, apart from learning a new language(!) and more statistical prowess, you came away with an appreciation for the messiness and challenges of ecological research and with a broader, more sophisticated sense of how you are using statistics in your own world of molecular biology. Good wishes to you in your future work and courses!
