Diff or Chop? Github, CSV data files and OpenRefine


OUseful.Info, the blog...

A recent post on the OKFNLabs blog – Diffing and patching tabular data – proposes a visualisation scheme (and some associated tooling) for comparing the differences between two tabular data/CSV files:

csv diff

With Github recently announcing that tabular CSV and TSV files are now previewable as such via a searchable* rendering of the data, I wonder if such view may soon feature on that site? An example of how it might work is described in James Smith’s ODI blogpost Adapting Git for simple data, which also has a recipe for diffing CSV data files in Github as it currently stands.

* Though not column sortable? I guess that would detract from Github’s view of showing files as is…? For more discussion on on the rationale for a “Github for data”, see for example Rufus Pollock’s posts Git and Github for Data and We Need Distributed Revision/Version Control for…

View original post 692 more words

Advertisements