Tidy data

Posted by Laila Puntel

The author argues that structuring your data (“data tidying”) gives datasets a specific structure that makes them easy to manipulate, model, and visualize. In short, data tidying facilitates the analysis. This has a lot in common with what we have been discussing in class, because the author proposes a way to standardize the data that is closely related to the idea of normal forms in databases, which simplify analysis and processing. Normalization was one of our topics of discussion in class, and it also aims to follow “rules” that allow you to shape your data in a tidy way. In this article he mentions normalization-like rules such as: each attribute must hold a single value, each observation forms a row, and multiple values for one attribute should be avoided.

There are benefits directly related to the storage and use of the data, which are especially important for data sharing. There are also advantages for the development of tools that process the data, such as tools for manipulation, visualization, and modeling. If your data is in a well-known and well-described “tidy” format, then it is easier to use and to develop tidy tools for data analysis! One of the drawbacks of this “tidy” data structure is that people who contribute to or use these databases have to be familiar with the existing tools to work with them. For example, the author mentions that some visualization tools may only work if the input data is tidy.
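To make the "each observation forms a row" rule more concrete, here is a minimal sketch in plain Python. The table, the column names (`site`, `year`, `yield`), the numbers, and the helper `tidy` are all made up for illustration and do not come from the article. It only shows one common tidying step: turning a table with one column per year into a table with one row per observation.

```python
# Hypothetical "messy" table: one row per site, one column per year.
# All names and numbers are invented for illustration.
wide = [
    {"site": "A", "2019": 10.5, "2020": 11.2},
    {"site": "B", "2019": 9.8, "2020": 10.1},
]


def tidy(rows, id_key, var_name, value_name):
    """Return one dict per observation: one id, one variable, one value."""
    out = []
    for row in rows:
        for key, value in row.items():
            if key == id_key:
                continue  # the identifier is repeated on every output row
            out.append({id_key: row[id_key], var_name: key, value_name: value})
    return out


# Each (site, year) pair becomes its own row with a single yield value.
tidy_rows = tidy(wide, "site", "year", "yield")
for r in tidy_rows:
    print(r)
```

In the tidy version every row has the same three attributes, so a tool that expects "one row per observation" can filter, group, or plot by year without knowing in advance which years exist as columns.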