Posted by Kelsie Ferin
In his paper “Tidy Data” [http://vita.had.co.nz/papers/tidy-data.pdf], Wickham describes tidy data as datasets that are structured to facilitate analysis. This relates to almost everything that we have discussed in class so far, especially the five normal forms that were discussed last week. For you, your collaborators, and the database that stores your data to make sense of it, the data needs to be kept clean, tidy, and structured. Keeping data tidy has several benefits. For example, if you adopt a tidy format at the beginning of your project, your working environment will be more user friendly and you will hopefully avoid headaches along the way. Also, if your data is already in a tidy format when you finish collecting it, you will not have to do any extra formatting before sharing your work with others. One drawback of tidy data is that you need an analysis package that works well with a tidy format. In addition, any collaborators need to be familiar with both how to keep the data tidy and the analysis package being used so that everything stays consistent, and this can sometimes be a problem.
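To make the idea concrete, here is a minimal sketch in plain Python of reshaping a "wide" table into a tidy one, following Wickham's definition that each variable forms a column and each observation forms a row. The table, its values, the `to_tidy` helper, and its parameter names are all made up for illustration and do not come from the paper or from any particular analysis package.

```python
# Hypothetical "wide" table: one row per site, one column per year.
# The site names and counts are invented for illustration only.
wide = [
    {"site": "A", "2019": 12, "2020": 15},
    {"site": "B", "2019": 7, "2020": 9},
]


def to_tidy(rows, id_column, variable_name, value_name):
    """Reshape wide rows so that each output row holds one observation."""
    tidy_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue  # the identifier is repeated on every tidy row
            tidy_rows.append({
                id_column: row[id_column],
                variable_name: column,  # the old column header becomes a value
                value_name: value,
            })
    return tidy_rows


tidy = to_tidy(wide, id_column="site", variable_name="year", value_name="count")

# Once the data is tidy, a summary per variable is a single pass over the rows.
totals = {}
for record in tidy:
    totals[record["year"]] = totals.get(record["year"], 0) + record["count"]

for record in tidy:
    print(record)
print(totals)
```

In the tidy version, year is an ordinary column instead of being hidden in the column headers, so the same loop would keep working if more years were added. Analysis packages usually provide their own reshaping tools for this (pandas has `melt`, for example), which is part of why the choice of package matters when you commit to a tidy format.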