AGRON590RD Blog#02 (vawalker, 20160902)

Posted by Victoria Walker

Rules for Care and Feeding of Data

  1. Rule I am most likely to follow: #3
    • The analysis that we do is internally reproducible. We make write-up documents (.pdf) for each section of work, so if we’re contacted and asked how something was done, we’d be able to share that information. Not all of the data can be shared, though. Since we don’t “own” it, we can only point people to the proper dissemination servers and the version we used. If the data version has since been updated and only the re-processed products are available, it could be very difficult for someone else to get the same version of the data that we used (it depends on the archive process of that agency).
  2. Rule I may never follow: #8
    • I’m not really sure what we do with the data we actually collect once the project is over. I know it’s typically put into a .csv file and uploaded to Box w/ a scanned copy of the original data sheet, in a folder that is available to the research group. The data we have from other agencies/projects typically comes from FTP servers that require (free) registration. We may have a local copy on hand if we’ve worked on it recently or know that the server status is spotty.
  3. Rule that I’ll probably try and fail to follow: #4
    • I’m still not sure what to do when you’re in the grey area between free and proprietary. If it’s a proprietary data format or MATLAB toolbox, but anyone can access it w/ free registration, can you share it? What if you have freely available data, but the scripts were shared w/ you by an agency? Do you put the person who wants your help in touch w/ the person who helped you? Or do you just credit that person and make sure they’re listed in the header/comments? If a script was emailed to you and you made slight modifications, can you make it available as part of your “workflow” as long as those changes are tracked in the header?
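
For the data-version worry under Rule #3, one low-effort habit might be to save a small manifest next to each write-up that records where every downloaded file came from, which version it claimed to be, and a checksum of its contents. That way someone who re-downloads the data later can at least tell whether they got the same bytes we did. This is only a sketch in Python: the function names and manifest fields below are my own assumptions, not something our group or any agency actually uses.

```python
import hashlib
import json
from datetime import date
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest_entry(path, source, version, retrieved=None):
    """Describe one downloaded file (hypothetical field names).

    source  : where the file was downloaded from (server address or URL)
    version : the version label the agency gave the product
    retrieved : download date; defaults to today if not given
    """
    path = Path(path)
    return {
        "file": path.name,
        "source": source,
        "version": version,
        "retrieved": (retrieved or date.today()).isoformat(),
        "bytes": path.stat().st_size,
        "sha256": sha256_of(path),
    }


def write_manifest(entries, out_path):
    """Save a list of manifest entries as JSON."""
    Path(out_path).write_text(json.dumps(entries, indent=2))
```

Calling `manifest_entry` on each input file and passing the list to `write_manifest` gives a single JSON file that can be shared even when the data itself can't be. It doesn't solve the archive problem, but it does make a version mismatch detectable.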

The enumeration and bullets are my new formatting trick.