.. tdda documentation master file, created by
   sphinx-quickstart on Tue Feb 14 15:12:30 2017.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

Test-Driven Data Analysis (Python TDDA library)
===============================================

Version |version|. (`Installation `_)

The TDDA module helps with the testing of data and of code that manipulates
data. It serves as a concrete implementation of the ideas discussed on the
`test-driven data analysis `_ blog.
When installed, the module offers a suite of command-line tools that can be
used with data from any source, not just Python.
It also provides enhanced test methods for Python code, and the new *Gentest*
functionality enables automatic generation of test programs for arbitrary
code (not just Python code).
There is also a full Python API for all functionality.

*Test-driven data analysis* is closely related to
`reproducible research `_,
but with more of a focus on automated testing.
It is best seen as overlapping with, and partly complementary to,
reproducible research.

The major components of the TDDA module are:

.. image:: image/tdda-machines-light.png
   :alt: Machines illustrating the constraint-discovery functionality, which takes data in and produces constraints as output; rexpy, which takes strings in and produces regular expressions as output; and gentest, which takes code in and produces tests as output.

* **Automatic Constraint Generation and Verification:**
  The package includes command-line tools and API calls for

  - *discovery* of constraints that are satisfied by (example) data
    --- ``tdda discover``;
  - *verification* that a dataset satisfies a set of constraints.
    The constraints may have been generated automatically,
    constructed manually, or (most commonly) generated and then
    refined by hand --- ``tdda verify``;
  - *detection* of records, fields and values that fail to satisfy
    constraints (anomaly detection) --- ``tdda detect``.

* **Reference Testing:**
  The TDDA library offers extensions to ``unittest`` and ``pytest`` for
  managing the testing of data analysis pipelines, where the results are
  typically much larger, more complex, and more variable than for
  many other sorts of programs.

* **Automatic Generation of Regular Expressions from Examples:**
  There is a command-line tool (and API) for automatically inferring
  `regular expressions `_
  from (structured) textual data --- ``rexpy``.
  This was developed as part of constraint generation, but has broader
  utility.

* **Automatic Test Generation (Experimental):**
  From version 2.0 on, the TDDA library also includes experimental features
  for automatically generating tests for almost any command-line program
  or script. The code to be tested can take the form of a shell script or
  any other command-line code, and can be written in any language or mix
  of languages.

Contents
========

.. toctree::
   :maxdepth: 2

   overview.txt
   installation.txt
   constraints.txt
   rexpy.txt
   gentest.txt
   referencetest.txt
   constraints-api.txt
   rexpy-api.txt
   windows.txt
   tests.txt
   examples.txt
   changes.txt

Resources
=========

* `Talks & Filmed Tutorials about TDDA etc (Nick Radcliffe) `_
* `TDDA Library (PyCon DE, Eberhard Hansis, 2019) `_
* `Tutorial Video Screencasts on Exercises `_
* `Tutorials YouTube Channel `_
* `Paper: Automatic Constraint Generation and Verification `_
* `1-page summary of ideas `_
* `Quick-reference Guide / Cheat Sheet `_
* `TDDA Blog `_
* `Twitter tdda0 `_
* `Slack `_ (mail/DM on Twitter for invitation)
* `Source Repository (GitHub) `_

Indexes and Search
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`