The tdda package provides Python support for test-driven data analysis (see the 1-page summary with references, or the blog).

  • The tdda.referencetest library is used to support the creation of reference tests, based on either unittest or pytest.
  • The tdda.constraints library is used to discover constraints from a (Pandas) DataFrame, write them out as JSON, and verify that datasets meet the constraints in that constraints file. There is also a command-line utility for discovering and verifying constraints.
  • The tdda.rexpy library is a tool for automatically inferring regular expressions from a column in a Pandas DataFrame or from a (Python) list of examples. There is also a command-line utility for Rexpy.
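
To make the discover / write-as-JSON / verify cycle described for tdda.constraints more concrete, here is a minimal sketch of the idea in plain Python. It is not the tdda API and not how tdda is implemented: the functions discover and verify, the constraint keys, and the use of a list of dicts in place of a DataFrame are all assumptions made purely for illustration.

```python
import json


def discover(rows):
    """Infer simple per-column constraints from a list of dict rows.

    Hypothetical helper for illustration only; not part of tdda.
    """
    constraints = {}
    columns = sorted({name for row in rows for name in row})
    for name in columns:
        values = [row.get(name) for row in rows]
        present = [v for v in values if v is not None]
        field = {"allow_null": len(present) < len(values)}
        if present:
            field["type"] = type(present[0]).__name__
            # Only record a range for purely numeric columns.
            if all(isinstance(v, (int, float)) for v in present):
                field["min"] = min(present)
                field["max"] = max(present)
        constraints[name] = field
    return constraints


def verify(rows, constraints):
    """Return (row_index, column, reason) for every violated constraint."""
    failures = []
    for i, row in enumerate(rows):
        for name, field in constraints.items():
            value = row.get(name)
            if value is None:
                if not field["allow_null"]:
                    failures.append((i, name, "null"))
                continue
            if "type" in field and type(value).__name__ != field["type"]:
                failures.append((i, name, "type"))
            elif "min" in field and not field["min"] <= value <= field["max"]:
                failures.append((i, name, "range"))
    return failures


# Discover constraints from known-good data and serialise them as JSON.
training = [{"age": 31, "name": "Ann"}, {"age": 47, "name": "Bob"}]
constraints_json = json.dumps(discover(training), indent=2, sort_keys=True)

# Later: load the JSON and check a new dataset against it.
loaded = json.loads(constraints_json)
new_data = [{"age": 39, "name": "Cat"}, {"age": 120, "name": None}]
failures = verify(new_data, loaded)
print(failures)
```

In this sketch the second row of new_data fails twice: its age lies outside the range seen in the training data, and its name is null although no nulls were seen during discovery. The real library works on DataFrames and supports a richer set of constraints; consult its documentation for the actual function names and file format.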