Posted by Jordan Kersey
The split-apply-combine strategy for data analysis lets one divide a large data set into smaller pieces, manipulate each piece on its own, and then put the edited pieces back together. It allows one to concentrate on a more manageable portion of the data in order to do more significant and complex analysis, without all the extra material attached. Once the editing is complete, the pieces are brought back together into a single data set once again.
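As a rough sketch of the three steps, here is a small Python example. The records, the function name `split_apply_combine`, and the choice of a per-group mean are all invented for illustration; they are not taken from the homework or the lab.

```python
from collections import defaultdict

# Hypothetical records: (group, value) pairs made up for this sketch.
rows = [("a", 1), ("b", 10), ("a", 3), ("b", 20), ("c", 5)]

def split_apply_combine(rows, apply):
    # Split: bucket the values by their group key.
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)
    # Apply: run the function on each group independently.
    # Combine: collect the per-group results into one dictionary.
    return {key: apply(values) for key, values in groups.items()}

# Compute the mean of each group.
means = split_apply_combine(rows, lambda values: sum(values) / len(values))
print(means)  # {'a': 2.0, 'b': 15.0, 'c': 5.0}
```

Each group is handled on its own, so the function passed in only ever has to deal with one small piece of the data at a time.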
We used the split-apply-combine idea in homework two, in SQLite. We deconstructed tables and connected them with different pieces in order to analyze certain aspects. Finally, we combined the data back together. It also seems to me that the lab script we worked on October 10th was a version of this. We needed to get rid of some columns and then rename the ones that remained. Finally, we could have recombined the entire data set, which would have completed the process.
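In SQLite, one common way to express this pattern is a GROUP BY query. The sketch below uses Python's built-in `sqlite3` module with a made-up `scores` table. The table, its columns, and the data are assumptions for illustration, not the actual homework tables.

```python
import sqlite3

# In-memory database with a hypothetical table, invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (student TEXT, points INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("ann", 8), ("bob", 5), ("ann", 10), ("bob", 7)],
)

# GROUP BY splits the rows by student, SUM and COUNT are applied to each
# group, and the result set combines one summary row per student.
totals = conn.execute(
    "SELECT student, SUM(points), COUNT(*) "
    "FROM scores GROUP BY student ORDER BY student"
).fetchall()
conn.close()

print(totals)  # [('ann', 18, 2), ('bob', 12, 2)]
```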
I think the main advantage of using this strategy is simplicity in editing data: each piece can be worked on by itself, without the rest of the data set getting in the way.