Notes: Data Diffs

I came across “Data diffs: Algorithms for explaining what changed in a dataset”.

  • The two papers discussed are:
    1. Scorpion by Eugene Wu and Sam Madden. Finds common properties of outlier points.
    2. DIFF. A SQL implementation of Scorpion and similar techniques. The computation can be distributed.
  • An author of the DIFF paper, Peter Bailis, founded sisudata.com.
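
To make "common properties of outlier points" concrete, here is a minimal sketch of the general idea. It is not the Scorpion algorithm or the DIFF operator. It only ranks single attribute/value pairs by how much more often they show up in an outlier set than in the remaining rows. The function name `explain_diff`, the `min_support` parameter, and the sample rows are all made up for illustration.

```python
from collections import Counter


def explain_diff(test_rows, control_rows, min_support=0.5):
    """Rank (attribute, value) pairs that are more common in test_rows
    than in control_rows. Each row is a dict of attribute -> value."""
    if not test_rows or not control_rows:
        raise ValueError("both row sets must be non-empty")

    # Count how many rows in each set contain each (attribute, value) pair.
    test_counts = Counter((k, v) for row in test_rows for k, v in row.items())
    control_counts = Counter((k, v) for row in control_rows for k, v in row.items())

    results = []
    for pair, n_test in test_counts.items():
        support = n_test / len(test_rows)
        if support < min_support:
            continue  # too rare among the outliers to be a useful explanation
        control_rate = control_counts[pair] / len(control_rows)
        # Ratio of rates; infinite when the pair never appears in the control set.
        ratio = support / control_rate if control_rate > 0 else float("inf")
        results.append((pair, support, ratio))

    # Highest ratio first: the pairs most over-represented in the outliers.
    results.sort(key=lambda r: r[2], reverse=True)
    return results


# Hypothetical data: rows flagged as outliers vs. everything else.
outliers = [
    {"region": "eu", "version": "2.1"},
    {"region": "us", "version": "2.1"},
    {"region": "eu", "version": "2.1"},
]
normal = [
    {"region": "eu", "version": "2.0"},
    {"region": "us", "version": "2.0"},
    {"region": "us", "version": "2.1"},
    {"region": "eu", "version": "2.0"},
]

results = explain_diff(outliers, normal)
for pair, support, ratio in results:
    print(pair, round(support, 2), round(ratio, 2))
```

On this toy data the `version` = `2.1` pair ranks first, since every outlier row has it but only a quarter of the normal rows do. The real systems go further than this sketch, for example by considering combinations of attributes rather than single pairs.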

The comments on HN list some related work:

Searching for “data diffs” on HN finds related work: