Phil's worst data nightmare...

Posted by Phil Colgan

My worst data nightmare came as I was analyzing my first metagenomic data set for a poster presentation just 3 weeks away. I had run through several tutorials for metagenomic analysis on mock data sets in the past, but this was the first time I was handed a bunch of raw data and expected to deliver a finished, presentable product. Let's just say those tutorials oversimplify things a great deal. You never know what state the data will be in when it's handed to you. The first thing I learned, and the one that made the next project much easier, was the importance of having well-formatted metadata. My metadata was spread out across several tables, each containing data that I may or may not have needed, and I spent a great deal of time just figuring out how to parse each of them into one consolidated table that made the subsequent analysis in R less of a headache. The second thing I learned is that you can do the same task in a given programming language several different ways: some are harder than others, while some are just as easy but take a different approach. This can be pretty confusing to programming noobs like myself. Since then, it has been helpful to resist the urge to assume that the first way I learned to complete a task is the best way, and to keep my eyes open for simpler or more personally intuitive approaches. The last thing I learned is the importance of having a good mentor to guide you through your data wrangling woes. With my amazing lab behind me, I finished the poster on time and proceeded to own the presentation.