Overview
The tdda package provides Python support for test-driven data analysis (see the 1-page summary with references, or the blog).
The tdda.referencetest library is used to support the creation of reference tests, based on either unittest or pytest.
The tdda.constraints library is used to discover constraints from a (Pandas) DataFrame, write them out as JSON, and verify that datasets meet the constraints in the constraints file. It also supports tables in a variety of relational databases. There is also a command-line utility for discovering and verifying constraints, and for detecting failing records.
The tdda.rexpy library is a tool for automatically inferring regular expressions from a column in a Pandas DataFrame or from a (Python) list of examples. There is also a command-line utility for Rexpy.
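To make the discover, write, verify, and detect cycle described for tdda.constraints more concrete, here is a minimal sketch of the idea using only the Python standard library. It does not use or reproduce the tdda API: the functions discover and failing_records, the constraint names (type, min, max, allow_nulls), and the sample records are all assumptions made for illustration.

```python
import json


def discover(records):
    """Infer simple per-field constraints from a list of dict records.

    Hypothetical helper for illustration; not part of tdda.
    """
    constraints = {}
    fields = sorted({key for record in records for key in record})
    for field in fields:
        values = [record.get(field) for record in records]
        present = [v for v in values if v is not None]
        types = {type(v).__name__ for v in present}
        rules = {"allow_nulls": len(present) < len(values)}
        if len(types) == 1:
            rules["type"] = types.pop()
        # Only record a range for genuinely numeric fields (bool excluded).
        if present and all(
            isinstance(v, (int, float)) and not isinstance(v, bool)
            for v in present
        ):
            rules["min"] = min(present)
            rules["max"] = max(present)
        constraints[field] = rules
    return constraints


def failing_records(records, constraints):
    """Return (index, field, reason) for each value that breaks a constraint."""
    failures = []
    for index, record in enumerate(records):
        for field, rules in constraints.items():
            value = record.get(field)
            if value is None:
                if not rules.get("allow_nulls", True):
                    failures.append((index, field, "null not allowed"))
                continue
            if "type" in rules and type(value).__name__ != rules["type"]:
                failures.append((index, field, "wrong type"))
                continue
            if "min" in rules and value < rules["min"]:
                failures.append((index, field, "below min"))
            if "max" in rules and value > rules["max"]:
                failures.append((index, field, "above max"))
    return failures


# Discover constraints from data believed to be good, and serialise as JSON.
training = [
    {"age": 34, "country": "GB"},
    {"age": 51, "country": "FR"},
    {"age": 19, "country": "DE"},
]
constraints = discover(training)
constraints_json = json.dumps(constraints, indent=2, sort_keys=True)

# Later: verify a new dataset against the stored constraints.
new_data = [
    {"age": 40, "country": "FR"},
    {"age": 130, "country": "GB"},
    {"age": None, "country": "DE"},
]
failures = failing_records(new_data, json.loads(constraints_json))
print(failures)
```

The point of the JSON round trip is that the constraints become a standalone artefact: they can be reviewed, edited by hand, and applied to datasets that arrive long after the original data was examined.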
Although the library is provided as a Python package, and can be called through its Python API, it also provides command-line tools.
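The notion of a reference test mentioned above can be sketched with plain unittest: the output of an analysis step is compared against a stored, previously verified reference result. This is only an illustration of the concept under stated assumptions and does not use the tdda.referencetest API; generate_report, the report format, and the temporary reference file are hypothetical.

```python
import tempfile
import unittest
from pathlib import Path


def generate_report(values):
    """Hypothetical analysis step whose output we want to pin down."""
    total = sum(values)
    return f"count={len(values)}\ntotal={total}\nmean={total / len(values):.2f}\n"


class ReportReferenceTest(unittest.TestCase):
    def setUp(self):
        # In a real project the reference file would be checked in alongside
        # the tests; it is created here only to keep the sketch self-contained.
        self.tmp = tempfile.TemporaryDirectory()
        self.addCleanup(self.tmp.cleanup)
        self.reference = Path(self.tmp.name) / "report.txt"
        self.reference.write_text("count=3\ntotal=12\nmean=4.00\n")

    def test_report_matches_reference(self):
        # The test passes only if the fresh output matches the reference.
        actual = generate_report([3, 4, 5])
        self.assertEqual(actual, self.reference.read_text())


suite = unittest.defaultTestLoader.loadTestsFromTestCase(ReportReferenceTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Comparing against a stored result, rather than hand-written assertions on individual values, is what lets a single test cover an entire output file or table.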