Recent Changes
This Version
2.1 Upgrade pandas dependency to 2.0 and significantly improve compatibility with Pandas 2.0+.
2.1 Add support for parquet files for input and output data (particularly for constraint generation, verification, and detection). New dependency on pyarrow to support this.
2.1 Deprecate use of .feather files. Support will be removed in a future version, no earlier than 2.2.
2.1 Inference of date formats: the TDDA library now uses its own methods to infer date formats, as Pandas no longer supports this.
2.1 Experimental support for CSV metadata specification files. This is unstable, not fully documented, and subject to change.
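The 2.1 entry on date formats does not say how TDDA performs the inference. As a rough illustration only of what inferring a date format involves, the sketch below tries a few candidate strptime formats and keeps those that parse every sample value. The function name infer_date_formats and the candidate list are hypothetical and are not part of the TDDA API.

```python
from datetime import datetime

# Hypothetical candidate list; a real implementation would consider many more.
CANDIDATE_FORMATS = ["%Y-%m-%d", "%d/%m/%Y", "%m/%d/%Y", "%Y-%m-%d %H:%M:%S"]


def infer_date_formats(samples, candidates=CANDIDATE_FORMATS):
    """Return the candidate formats that successfully parse every sample."""
    matching = []
    for fmt in candidates:
        try:
            for value in samples:
                datetime.strptime(value, fmt)
        except ValueError:
            continue  # at least one sample does not fit this format
        matching.append(fmt)
    return matching


iso_formats = infer_date_formats(["2023-01-31", "2023-12-01"])
day_first_formats = infer_date_formats(["31/01/2023", "01/12/2023"])
ambiguous_formats = infer_date_formats(["01/02/2023"])
```

A value such as 01/02/2023 fits both day-first and month-first formats, so any approach of this kind needs several sample values, or a tie-breaking rule, to settle on a single format.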
Previous Versions
2.0.8 and 2.0.9 Fixes to IP address lookup in gentest.
2.0 Addition of Gentest: functionality for automatically generating Python test code for any command-line program.
2.0 Major overhaul of documentation.
More descriptive documentation
Better (though incomplete) separation between user code (particularly the command-line utilities tdda gentest, tdda discover, tdda verify, tdda detect and rexpy).
Add more external links to resources and fix those that had rusted
Improve the CSS to make the documentation render better on tdda.readthedocs.io
Adopt a customized version of the readthedocs theme for the documentation everywhere, so that what you see if you build the documentation locally should be more similar to what you see at tdda.readthedocs.io
2.0 Significant changes to the algorithm used by Rexpy. Should now be faster, but potentially more stochastic.
2.0 Rexpy can now generate many different flavours of regular expressions.
2.0 Planned Deprecation: We plan to move from using .feather files to .parquet files in the 2.1 release, at which point .feather files will immediately be deprecated.
Older Versions
Reference test exercises added.
Escaping of special characters for regular expressions is now done in a way that is uniform across Python2, Python pre-3.7, and Python 3.7+.
JSON is now generated the same for Python2 and Python3 (no blank lines at the end of lines, and UTF8-encoded).
Fixed issue with tdda test command not working properly in the previous version, to self-test an installation.
Added new option flag --interleave for tdda detect. This causes the _ok detection fields to be interleaved with the original fields that they refer to in the resulting detection dataset, rather than all appearing together at the far right hand side. This option was actually present in the previous release, but not sufficiently documented.
Fix for the --write-all parameter for tdda.referencetest result regeneration, which had regressed slightly in the previous version.
Improved reporting of differences for text files in tdda.referencetest when the actual results do not match the expected file contents. Now fully takes account of the ignore and remove parameters.
The ignore_patterns parameter in assertTextFileCorrect() (and others) in tdda.referencetest now causes only the portion of a line that matches the regular expressions to be ignored; anything else on the line (before or after the part that matches a regular expression) must be identical in the actual and expected results. This means that you are specifying the part of the line that is allowed to differ, rather than marking an entire line to be ignored. This is a change in functionality, but is what had always been intended. For fuller control (and to get the previous behaviour), you can anchor the expressions with ^.*(...).*$, and then they will apply to the entire line.
The ignore_patterns parameter in tdda.referencetest can now accept grouped subexpressions in regular expressions. This allows use of alternations, which were previously not supported.
The ignore_substrings parameter in assertTextFileCorrect() (and others) in tdda.referencetest now only matches lines in the expected file (where you have full control over what will appear there), not in the actual file. This fixes a problem with differences being masked (and not reported as problems) if the actual happened to include unexpected matching content on lines other than where intended.
The tdda.constraints package is now more resilient against unexpected type mismatches. Previously, if the type didn’t match, then in some circumstances exceptions would be (incorrectly) raised for other constraints, rather than failures.
The tdda.constraints package now supports Python datetime.date fields in Pandas DataFrames, in addition to the existing support of datetime.datetime.
The tdda.constraints Python API now provides support for in-memory constraints, by allowing Python dictionaries to be passed in to verify_df() and detect_df(), as an alternative to passing in a .tdda filename. This allows an application using the library to store its constraints however it wants to, rather than having to use the filesystem (e.g. storing it online and fetching with an HTTP GET).
The tdda.constraints package can now access MySQL databases using the mysql.connector driver, in addition to the MySQLdb and mysqlclient drivers.
The tdda.rexpy tool can now quote the regular expressions it produces, with the new --quote option flag. This makes it easier to copy the expressions to use them on the command line, or embed them in strings in many programming languages.
The Python API now allows you to import tdda and then refer to its subpackages via tdda.referencetest, tdda.constraints or tdda.rexpy. Previously you had to explicitly import each submodule separately.
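The change to ignore_patterns described above can be illustrated with a small standard-library sketch. This is not the tdda.referencetest implementation; it only models the stated behaviour, assuming that a line pair is accepted when the two lines are identical once every portion matching a pattern has been masked. The helper name lines_match and the sample lines are hypothetical.

```python
import re

PLACEHOLDER = "\x00"  # stands in for any portion that is allowed to differ


def lines_match(actual, expected, ignore_patterns):
    """Compare two lines after masking the portions matched by each pattern."""
    for pattern in ignore_patterns:
        actual = re.sub(pattern, PLACEHOLDER, actual)
        expected = re.sub(pattern, PLACEHOLDER, expected)
    return actual == expected


expected_line = "Report generated 2023-12-31 by alice"
date_only = r"\d{4}-\d{2}-\d{2}"
whole_line = r"^.*(\d{4}-\d{2}-\d{2}).*$"  # anchored form: masks the entire line

# Only the date may differ, so a different date alone is accepted.
same_apart_from_date = lines_match(
    "Report generated 2024-01-02 by alice", expected_line, [date_only])

# A difference outside the matched portion is still reported.
differs_elsewhere = lines_match(
    "Report generated 2024-01-02 by bob", expected_line, [date_only])

# Anchoring the pattern restores the older whole-line behaviour.
whole_line_ignored = lines_match(
    "Report generated 2024-01-02 by bob", expected_line, [whole_line])
```

In this model the unanchored pattern marks only the date as allowed to differ, while the anchored pattern makes any line containing a date match any other such line.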