Recent Changes

This Version

2.1 Upgrade pandas dependency to 2.0 and significantly improve compatibility with Pandas 2.0+.
2.1 Add support for parquet files for input and output data (particularly for constraint generation, verification, and detection). New dependency on pyarrow to support this.
2.1 Deprecate use of .feather files. Support will be removed in a future version, no earlier than 2.2.
2.1 Inference of date formats: the TDDA library now uses its own methods to infer date formats, as Pandas no longer supports this.
2.1 Experimental support for CSV metadata specification files. This is unstable, not fully documented, and subject to change.

Previous Versions

2.0.8 and 2.0.9 Fixes to IP address lookup in gentest.
2.0 Addition of Gentest: functionality for automatically generating Python test code for any command-line program.
2.0 Major overhaul of documentation.
- More descriptive documentation
- Better (though incomplete) separation between user code (particularly the command-line utilities tdda gentest, tdda discover, tdda verify, tdda detect and rexpy).
- Add more external links to resources and fix those that had rusted
- Improve the CSS to make the documentation render better on tdda.readthedocs.io
- Adopt a customized version of the readthedocs theme for the documentation everywhere, so that what you see if you build the documentation locally should be more similar to what you see at tdda.readthedocs.io
2.0 Significant changes to the algorithm used by Rexpy. Should now be faster, but potentially more stochastic.
2.0 Rexpy can now generate many different flavours of regular expressions.
2.0 Planned Deprecation: We plan to move from using .feather files to .parquet files in the 2.1 release, at which point .feather files will immediately be deprecated.

Older Versions

Reference test exercises added.
Escaping of special characters for regular expressions is now done in a way that is uniform across Python2, Python pre-3.7, and Python 3.7+.
JSON is now generated the same for Python2 and Python3 (no trailing blanks at the end of lines, and UTF8-encoded).
Fixed an issue in the previous version where the tdda test command, used to self-test an installation, did not work properly.
Added new option flag --interleave for tdda detect. This causes the _ok detection fields to be interleaved with the original fields that they refer to in the resulting detection dataset, rather than all appearing together at the far right-hand side. This option was actually present in the previous release, but not sufficiently documented.
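The difference in field ordering can be sketched in plain Python. This is only an illustration of the two layouts, not the tdda detect implementation; the field names and the _ok field names below are hypothetical.

```python
# Hypothetical original fields and the _ok detection fields that refer to them.
original_fields = ["age", "name"]
ok_fields_by_field = {
    "age": ["age_max_ok"],
    "name": ["name_min_length_ok"],
}


def default_order(fields, ok_by_field):
    """All _ok fields appear together after the original fields."""
    ok_fields = [ok for f in fields for ok in ok_by_field.get(f, [])]
    return list(fields) + ok_fields


def interleaved_order(fields, ok_by_field):
    """Each original field is followed directly by its own _ok fields."""
    ordered = []
    for f in fields:
        ordered.append(f)
        ordered.extend(ok_by_field.get(f, []))
    return ordered


print(default_order(original_fields, ok_fields_by_field))
print(interleaved_order(original_fields, ok_fields_by_field))
```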
Fix for the --write-all parameter for tdda.referencetest result regeneration, which had regressed slightly in the previous version.
Improved reporting of differences for text files in tdda.referencetest when the actual results do not match the expected file contents. Now fully takes account of the ignore and remove parameters.
The ignore_patterns parameter in assertTextFileCorrect() (and others) in tdda.referencetest now causes only the portion of a line that matches the regular expressions to be ignored; anything else on the line (before or after the part that matches a regular expression) must be identical in the actual and expected results. This means that you are specifying the part of the line that is allowed to differ, rather than marking an entire line to be ignored. This is a change in functionality, but is what had always been intended. For fuller control (and to get the previous behaviour), you can anchor the expressions with ^.*(...).*$, and then they will apply to the entire line.
The ignore_patterns parameter in tdda.referencetest can now accept grouped subexpressions in regular expressions. This allows use of alternations, which were previously not supported.
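The ignore_patterns semantics described in the two entries above can be illustrated with a minimal sketch that uses only Python's re module. It is not the tdda.referencetest implementation: the helper lines_equivalent and the PLACEHOLDER value are hypothetical, and the sample lines are made up.

```python
import re

PLACEHOLDER = "<IGNORED>"  # hypothetical marker standing in for ignored text


def lines_equivalent(actual, expected, ignore_patterns):
    """Compare two lines, ignoring only the portions matched by the patterns."""
    for pattern in ignore_patterns:
        actual = re.sub(pattern, PLACEHOLDER, actual)
        expected = re.sub(pattern, PLACEHOLDER, expected)
    return actual == expected


date = r"\d{4}-\d{2}-\d{2}"
expected = "Run finished at 2023-12-25 with 3 errors"

# Only the date differs, so the lines are treated as equivalent.
same_rest = lines_equivalent(
    "Run finished at 2024-01-02 with 3 errors", expected, [date])

# The error count also differs, and it is outside the matched portion.
diff_rest = lines_equivalent(
    "Run finished at 2024-01-02 with 4 errors", expected, [date])

# Anchoring the expression makes it cover the whole line (previous behaviour).
whole_line = lines_equivalent(
    "Run finished at 2024-01-02 with 4 errors", expected,
    [r"^.*(" + date + r").*$"])

# A grouped subexpression with an alternation.
alternation = lines_equivalent(
    "Backup at 10:15 AM", "Backup at 10:15 PM", [r"(AM|PM)"])
```

In this sketch, same_rest, whole_line and alternation are True, while diff_rest is False because the text after the date differs.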
The ignore_substrings parameter in assertTextFileCorrect() (and others) in tdda.referencetest now only matches lines in the expected file (where you have full control over what will appear there), not in the actual file. This fixes a problem with differences being masked (and not reported as problems) if the actual happened to include unexpected matching content on lines other than where intended.
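A minimal sketch of this rule, assuming a line-by-line comparison: a line is skipped only when the expected line contains one of the substrings. The helper line_ok is hypothetical and is not the tdda.referencetest code.

```python
def line_ok(actual, expected, ignore_substrings):
    """Return True if the line pair should be treated as matching."""
    # Only the expected line is checked for the substrings to ignore.
    if any(s in expected for s in ignore_substrings):
        return True
    return actual == expected


ignore = ["Generated by"]

# The expected line contains the substring, so the difference is ignored.
intended = line_ok("Generated by version 1.3",
                   "Generated by version 1.2", ignore)

# Only the actual line contains the substring, so the difference is reported.
unintended = line_ok("Generated by mistake: total 11", "Total: 10", ignore)
```

Here intended is True and unintended is False; if the actual line were also checked for the substring, the second difference would be masked.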
The tdda.constraints package is now more resilient against unexpected type mismatches. Previously, if the type didn’t match, then in some circumstances exceptions would be (incorrectly) raised for other constraints, rather than failures.
The tdda.constraints package now supports Python datetime.date fields in Pandas DataFrames, in addition to the existing support of datetime.datetime.
The tdda.constraints Python API now provides support for in-memory constraints, by allowing Python dictionaries to be passed in to verify_df() and detect_df(), as an alternative to passing in a .tdda filename. This allows an application using the library to store its constraints however it wants to, rather than having to use the filesystem (e.g. storing them online and fetching them with an HTTP GET).
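As a rough sketch of how in-memory constraints might be used, the example below holds constraints as a dictionary with the same structure as a .tdda file and round-trips them through JSON, standing in for whatever storage the application chooses. The field names and values are made up, and the wrapper verify_in_memory is hypothetical; it assumes tdda and pandas are installed and is not called here.

```python
import json

# Constraints with the same structure as a .tdda file, held as a Python dict.
# The fields "age" and "name" and their values are invented for illustration.
constraints = {
    "fields": {
        "age": {"type": "int", "min": 0, "max": 120, "max_nulls": 0},
        "name": {"type": "string", "min_length": 1},
    }
}

# The dictionary can be stored and fetched by any means, e.g. as JSON text.
stored = json.dumps(constraints)
restored = json.loads(stored)


def verify_in_memory(df, constraints_dict):
    """Verify a DataFrame against constraints held in memory.

    Hypothetical wrapper: requires tdda and pandas to be installed.
    """
    from tdda.constraints import verify_df
    return verify_df(df, constraints_dict)
```

Calling verify_in_memory(df, restored) with a Pandas DataFrame would then verify it without reading a .tdda file from the filesystem.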
The tdda.constraints package can now access MySQL databases using the mysql.connector driver, in addition to the MySQLdb and mysqlclient drivers.
The tdda.rexpy tool can now quote the regular expressions it produces, with the new --quote option flag. This makes it easier to copy the expressions to use them on the command line, or embed them in strings in many programming languages.
The Python API now allows you to import tdda and then refer to its subpackages via tdda.referencetest, tdda.constraints or tdda.rexpy. Previously you had to explicitly import each submodule separately.