Command Line Reference
tdda
NAME
tdda — test-driven data analysis
SYNOPSIS
tdda discover Generate constraints for data validation
tdda verify Verify (validate) data against constraints
tdda detect Detect data that fails constraints
tdda examples Copy the tdda example data and code
tdda gentest Auto-generate Python tests for code in any language
tdda diff Find difference in datasets in parquet or CSV files
tdda ls List fields in a dataset
tdda cat Display rows from a dataset as a rich table
tdda head Display the first N rows of a dataset
tdda tail Display the last N rows of a dataset
tdda sample Display N random rows from a dataset
tdda serial Convert or infer flat-file metadata in tdda.serial,
CSVW, or Frictionless formats
tdda tag Tag tests that failed in the last reference test run
tdda config Show TDDA configuration
tdda version Print the TDDA version number
tdda help Print this help
tdda help COMMAND Print help on COMMAND (e.g. discover, verify)
tdda installman Install tdda man pages
tdda test Run the tdda library's self-tests.
OPTIONS
-v, --version Print version number (same as tdda version)
-h, -?, --help Print this help
SEE ALSO
rexpy(1), tdda-installman(1)
tdda discover
NAME
tdda discover — automatically generate constraints for data
SYNOPSIS
tdda discover [-h] [-?] [-7] [--no-config] [--colour]
[--no-colour] [-x] [-X] [-g] [-G]
[-r REPORT ...] [-o REPORT_PATH]
[--no-md] [--allowed] [--no-allowed]
[--required] [--no-required] [--no-ar]
[--pandas] [--polars] [--backend BACKEND]
INPUT [CONSTRAINTS]
POSITIONAL ARGUMENTS
INPUT is one of:
a CSV file or other flat file (e.g. .csv, .txt, .psv), optionally using :format to specify flat-file metadata (see the help for tdda serial)
a data frame in a Parquet file (.parquet) e.g. from pandas, polars, R
a table from PostgreSQL databases (e.g. postgres:tablename)
a table from MySQL databases (e.g. mysql:tablename)
a table from SQLite databases (e.g. sqlite:tablename)
Standard input (stdin): Use - to read from stdin
(Use tdda help serial, tdda serial --help, or man tdda-serial
for more information.)
CONSTRAINTS Name of the (JSON) constraints file to create.
Will use the .tdda extension if no extension is specified.
Can be missing or - to write to standard output.
DESCRIPTION
The tdda discover command is used to find constraints that are satisfied
(in most cases) by the input ("training") data provided.
OPTIONS
The following options are available.
* indicates options that are the default behaviours
-h, --help Show this help message and exit
-?, --? Same as -h or --help
-7, --ascii Report without using special characters
-N, --no-config Skip loading ~/.tdda.toml
--colour Use colour in terminal output *
--no-colour Do not use colour in terminal output
-x, --rex Include regular expression generation
-X, --no-rex Exclude regular expression generation *
-g, --group-rex Group regular expression generation
-G, --no-group-rex Do not group regular expression generation *
-r, --report [REPORT ...] Report formats to write, space-separated.
Formats: html, md (markdown), txt (text),
json, yaml, toml.
The stem of the output file is taken from
REPORT_PATH if -o is given, otherwise from
CONSTRAINTS.
-o, --report-path REPORT_PATH Stem path for report files (extension
is replaced by the format).
--no-md Do not create metadata in constraints file
--allowed Create allowed-fields constraint (default)
--no-allowed Do not create allowed-fields constraint
--required Create required-fields constraint (default)
--no-required Do not create required-fields constraint
--no-allowed-required Same as --no-allowed --no-required
--no-ar Same as --no-allowed --no-required
--pandas, --pd Use Pandas as DataFrame engine. *
--polars, --pl Use Polars as DataFrame engine.
--backend, -B BACKEND Backend choice for Pandas
(when dataframe engine is Pandas)
n for numpy_nullable *
a for pyarrow
o for original.
EXAMPLES
The example data can be obtained by running 'tdda examples', which will create various directories, including constraints_examples, containing the source data for these examples.
tdda discover elements.parquet elements.tdda
This command will read data from elements.parquet and (attempt to) find constraints satisfied by every record and by the data collectively. By default this can include minimum and maximum constraints on field values or lengths, nullability constraints, uniqueness constraints, sign constraints, and allowed-values constraints.
The results will be written to elements.tdda in a JSON format,
including metadata. The output constraints file, elements.tdda can be
used with tdda verify to verify that another dataset with the same
structure satisfies the constraints, or with tdda detect to find
which records and/or values fail to satisfy the constraints. The .tdda
file can be edited (carefully) by hand, or programmatically, to add,
remove, tighten, or loosen constraints.
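As one illustration of editing a constraints file programmatically, the following minimal sketch loosens a max constraint using Python's json module. It assumes a .tdda layout in which per-field constraints sit under a top-level fields key; the field name Z, the sample values, and the helper name loosen_max are made up for the example, so check them against a file actually produced by tdda discover.

```python
import json
import os
import tempfile


def loosen_max(path, field, new_max):
    """Raise the 'max' constraint for one field in a .tdda (JSON) file."""
    with open(path) as f:
        constraints = json.load(f)
    # Assumed layout: {"fields": {<field name>: {<constraint>: <value>}}}
    field_constraints = constraints["fields"][field]
    # Only ever loosen: keep the existing max if it is already larger.
    field_constraints["max"] = max(field_constraints.get("max", new_max), new_max)
    with open(path, "w") as f:
        json.dump(constraints, f, indent=4)
    return constraints


# Hypothetical constraints, standing in for the output of tdda discover.
sample = {"fields": {"Z": {"type": "int", "min": 1, "max": 92, "max_nulls": 0}}}
tdda_path = os.path.join(tempfile.mkdtemp(), "elements.tdda")
with open(tdda_path, "w") as f:
    json.dump(sample, f, indent=4)

updated = loosen_max(tdda_path, "Z", 118)
print(updated["fields"]["Z"])
```

Editing through a JSON parser rather than by hand keeps the file well-formed, which is the main risk when changing constraints manually.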
tdda discover elements.csv
This command is almost the same as the first except that it reads data from the CSV file specified, and writes the constraints to the screen (standard output).
The CSV structure and field types will normally be inferred (possibly incorrectly) by TDDA, and if the inference is bad, the command may fail. If you use:
tdda discover elements.csv:format.serial
metadata in format.serial will be used to guide the DataFrame
creation. If you use
tdda discover elements.csv:
it will look for any associated metadata for elements.csv using
naming conventions described in the help for tdda serial.
tdda discover --rex md.serial:elements.parquet
This is similar to the last two except that:
regular expression inference is requested (--rex) for text fields. Rexpy will be used to attempt to infer one or a few regular expressions that characterize each field in the input data.
a metadata file to be used to interpret the .csv file is provided explicitly.
tdda discover elements.parquet elements.tdda -r html -o elements
This discovers constraints as in example 1, and also writes an HTML
report to elements.html.
tdda discover elements.parquet elements.tdda -r md json txt -o elements
This discovers constraints as in example 1, and also writes reports
to elements.md, elements.json, and elements.txt.
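The naming rule described for -r and -o (take the stem from REPORT_PATH if given, otherwise from CONSTRAINTS, and replace the extension with the format) can be sketched in a few lines of Python. This is only an illustration of the rule as documented, not tdda's own code; the function name report_paths is hypothetical.

```python
from pathlib import Path


def report_paths(formats, constraints_path, report_path=None):
    """Return one report filename per requested format."""
    # Use REPORT_PATH when -o is given, otherwise fall back to CONSTRAINTS.
    stem = Path(report_path if report_path is not None else constraints_path)
    # with_suffix replaces an existing extension, or adds one if absent.
    return [str(stem.with_suffix("." + fmt)) for fmt in formats]


print(report_paths(["md", "json", "txt"], "elements.tdda", "elements"))
print(report_paths(["html"], "elements.tdda"))
```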
tdda discover --rex postgres:elements
This is similar again except that now the postgres: specifier will be
interpreted as a database connection file in the user's home
directory, with the name ~/.dbCredential.postgres. This file should
contain connection information for a supported database. The extension
.postgres does not itself mean that this is a PostgreSQL database,
though that is a common convention. Use one of
tdda help db
tdda help database
to get help with the database connection file format.
SEE ALSO
tdda-verify(1),
tdda-detect(1),
tdda-serial(1)
Test Driven Data Analysis, book by Nicholas J. Radcliffe, chapters 2-7.
tdda verify
NAME
tdda verify — Verify that constraints are satisfied by data
SYNOPSIS
tdda verify [-h] [-?] [-7] [--no-config]
[--colour] [--no-colour]
[--epsilon EPSILON] [-a] [-f] [--dense]
[-t {strict,loose}] [--verify-required-fields]
[--verify-allowed-fields] [--no-verify-required-fields]
[--no-verify-allowed-fields] [--varf] [--no-varf]
[--pandas] [--polars] [--backend BACKEND]
INPUT [CONSTRAINTS]
POSITIONAL ARGUMENTS
INPUT is one of:
a CSV file or other flat file (e.g. .csv, .txt, .psv), optionally using :format to specify flat-file metadata (see the help for tdda serial)
a data frame in a Parquet file (.parquet) e.g. from pandas, polars, R
a table from PostgreSQL databases (e.g. postgres:tablename)
a table from MySQL databases (e.g. mysql:tablename)
a table from SQLite databases (e.g. sqlite:tablename)
Standard input (stdin): Use - to read from stdin
CONSTRAINTS, if provided, is a JSON .tdda file containing
constraints.
If no constraints file is provided, a file with the same path as
the input file, with a .tdda extension will be tried.
DESCRIPTION
The tdda verify command is used to check that data conforms
to the constraints specified. Any constraints not satisfied
by the data are reported, together with summary statistics.
The tdda verify command does not report which records and
values cause constraints to be violated: the companion command
tdda detect performs this function.
OPTIONS
-h, --help Show this help message and exit
-?, --? Same as -h or --help
-7, --ascii Report without using special characters
-N, --no-config Skip loading ~/.tdda.toml
--colour Use colour in terminal output
--no-colour Do not use colour in terminal output
--epsilon EPSILON Epsilon fuzziness (tolerance for comparisons)
-a, --all Report all fields, even if there are no
failures
-f, --fields Report only fields with failures
--dense Compact output: less vertical space used
-t, --type_checking {strict,loose}
"loose" means consider all numeric types
equivalent
--verify-required-fields, --vrf
Force verify of required fields
--verify-allowed-fields, --vaf
Force verify of allowed fields
--no-verify-required-fields, --no-vrf
Force no verification of required fields
--no-verify-allowed-fields, --no-vaf
Force no verification of allowed fields
--varf, --vraf Force verification of allowed and required
fields
--no-varf, --no-vraf Force no verification of allowed and required
fields
--pandas, --pd Use Pandas as DataFrame engine.
--polars, --pl Use Polars as DataFrame engine.
--backend, -B BACKEND Backend choice for Pandas
(when dataframe engine is Pandas)
n for numpy_nullable *
a for pyarrow
o for original.
EXAMPLES
The example data can be obtained by running tdda examples, which will
create various directories, including constraints_examples, containing
source data for these examples.
tdda verify elements.parquet elements.tdda
This command reads data from elements.parquet and checks it against the
constraints in elements.tdda, reporting any constraints that are not
satisfied.
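To make the idea of verification concrete, here is a small, self-contained Python sketch that checks a handful of constraints against records and reports which constraints fail, without saying which records caused the failure. It is not tdda's implementation: the constraint names min, max, and max_nulls are assumed to mirror those in a .tdda file, the data is made up, and epsilon and type checking are ignored.

```python
def verify(records, constraints):
    """Return {(field, constraint): passed} for a list of dict records."""
    results = {}
    for field, rules in constraints.items():
        values = [r.get(field) for r in records]
        present = [v for v in values if v is not None]
        if "min" in rules:
            results[(field, "min")] = all(v >= rules["min"] for v in present)
        if "max" in rules:
            results[(field, "max")] = all(v <= rules["max"] for v in present)
        if "max_nulls" in rules:
            n_nulls = len(values) - len(present)
            results[(field, "max_nulls")] = n_nulls <= rules["max_nulls"]
    return results


# Made-up data and constraints, for illustration only.
records = [{"Z": 1, "Name": "Hydrogen"}, {"Z": 119, "Name": None}]
constraints = {"Z": {"min": 1, "max": 118}, "Name": {"max_nulls": 0}}

results = verify(records, constraints)
failures = sorted(key for key, ok in results.items() if not ok)
print(f"{len(results) - len(failures)} passed, {len(failures)} failed: {failures}")
```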
SEE ALSO
tdda-detect(1),
tdda-discover(1),
tdda-serial(1)
Test Driven Data Analysis, book by Nicholas J. Radcliffe, chapters 2-7.
tdda detect
NAME
tdda detect — Detect data that does not obey supplied constraints
SYNOPSIS
tdda detect [-h] [-?] [-7] [--no-config] [--colour] [--no-colour]
[--epsilon EPSILON] [-o REPORT_PATH] [-a] [-f]
[-t {strict,loose}] [--write-all-records]
[--per-constraint] [--no-per-constraint]
[--no-original-fields] [--original-fields]
[--no-output-fields] [--output-fields [OUTPUT_FIELDS ...]]
[-r [REPORT ...]] [--interleave] [--no-interleave]
[--index] [--int] [--key [KEY ...]] [--dense]
[--verify-required-fields] [--verify-allowed-fields]
[--no-verify-required-fields] [--no-verify-allowed-fields]
[--varf] [--no-varf] [--pandas] [--polars]
[--backend BACKEND]
INPUT [CONSTRAINTS [OUTPUT]]
POSITIONAL ARGUMENTS
INPUT is one of:
a CSV file or other flat file (e.g. .csv, .txt, .psv), optionally using :format to specify flat-file metadata (see the help for tdda serial)
a data frame in a Parquet file (.parquet) e.g. from pandas, polars, R
a table from PostgreSQL databases (e.g. postgres:tablename)
a table from MySQL databases (e.g. mysql:tablename)
a table from SQLite databases (e.g. sqlite:tablename)
Standard input (stdin): Use - to read from stdin
CONSTRAINTS, if provided, is a JSON .tdda file containing
constraints.
If no constraints file is provided, a file with the same path as
the input file, with a .tdda extension will be tried.
OUTPUT specifies the destination for detected records.
This is usually a file if the input was a file (e.g. a .csv
file or a parquet file), but does not have to be the same type.
If the input is a database table, the output is always a database
table in the same database.
DESCRIPTION
The tdda detect command finds and reports data that fails to satisfy
the constraints in the CONSTRAINTS file specified. It also performs all
the same functions as tdda verify.
OPTIONS
-h, --help Show this help message and exit
-?, --? Same as -h or --help
-7, --ascii Report without using special characters
-N, --no-config Skip loading ~/.tdda.toml
--colour Use colour in terminal output
--no-colour Do not use colour in terminal output
--epsilon EPSILON Epsilon fuzziness (tolerance for comparisons)
-a, --all Report all fields, even if there are no
failures
-f, --fields Report only fields with failures
-r, --report [REPORT ...]
Report formats to write, space-separated.
Formats: html, md (markdown), txt (text),
json, yaml, toml.
The stem of the output file is taken from
REPORT_PATH if -o is given, otherwise from
OUTPUT.
-t, --type_checking {strict,loose}
"loose" means consider all numeric types
equivalent
-o, --report-path REPORT_PATH
Stem path for report files (extension is
replaced by the format).
--write-all-records Include passing records
--per-constraint Write one flag column per failing constraint in
addition to n_failures. Set by default.
--no-per-constraint Do not write out any per-constraint flag columns
--no-original-fields Do not write out original fields columns
--original-fields Write out original fields columns (default)
--no-output-fields Do not write out any original fields in the output. By
default, all original columns will be included.
--output-fields [OUTPUT_FIELDS ...]
Specify original columns to write out.
--interleave Interleave ok columns with original fields.
--no-interleave Do not interleave ok columns with original fields.
--index Include a row-number index in the output file when
detecting. Rows are usually numbered from 1,
unless the input file already has an index.
--int Write out boolean fields as integers, with 1 for true
and 0 for false.
--key [KEY ...] Key or key fields to use when reporting failures
--dense Compact output: less vertical space used
--verify-required-fields, --vrf
Force verify of required fields
--verify-allowed-fields, --vaf
Force verify of allowed fields
--no-verify-required-fields, --no-vrf
Force no verification of required fields
--no-verify-allowed-fields, --no-vaf
Force no verification of allowed fields
--varf, --vraf Force verification of allowed and required
fields
--no-varf, --no-vraf Force no verification of allowed and required
fields
--pandas, --pd Use Pandas as DataFrame engine.
--polars, --pl Use Polars as DataFrame engine.
--backend, -B BACKEND Backend choice for Pandas
(when dataframe engine is Pandas)
n for numpy_nullable *
a for pyarrow
o for original.
EXAMPLES
The example data can be obtained by running tdda examples, which will
create various directories, including constraints_examples, containing
source data for these examples.
tdda detect elements.parquet elements.tdda elements-failures.parquet
This command reads data from elements.parquet, checks it against the
constraints in elements.tdda, and writes records with one or more
constraint failures to elements-failures.parquet.
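The difference from verification is that detection works record by record. The following Python sketch shows the general shape of such output: each row gains a flag per constraint and an n_failures count, and only failing rows are kept unless all records are requested. It is an illustration only; the flag column names (such as Z_max_ok) are assumptions, the data is invented, and for simplicity a flag is written for every constraint checked rather than only for failing ones.

```python
def detect(records, constraints, write_all_records=False):
    """Return records annotated with per-constraint flags and n_failures."""
    checks = {
        "min": lambda value, limit: value >= limit,
        "max": lambda value, limit: value <= limit,
    }
    output = []
    for record in records:
        row = dict(record)
        n_failures = 0
        for field, rules in constraints.items():
            for name, limit in rules.items():
                value = record.get(field)
                # Nulls are not tested here; null rules would be separate.
                ok = value is None or checks[name](value, limit)
                row[f"{field}_{name}_ok"] = ok
                n_failures += 0 if ok else 1
        row["n_failures"] = n_failures
        if n_failures > 0 or write_all_records:
            output.append(row)
    return output


# Made-up data and constraints, for illustration only.
records = [{"Z": 1}, {"Z": 119}, {"Z": 0}]
constraints = {"Z": {"min": 1, "max": 118}}

failing = detect(records, constraints)
for row in failing:
    print(row)
```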
tdda detect elements.parquet elements.tdda elements-failures.parquet -r html -o elements
As above, and also writes an HTML report to elements.html.
tdda detect elements.parquet elements.tdda elements-failures.parquet -r md json txt -o elements
As above, and also writes reports to elements.md, elements.json,
and elements.txt.
SEE ALSO
tdda-verify(1),
tdda-discover(1),
tdda-serial(1)
Test Driven Data Analysis, book by Nicholas J. Radcliffe, chapters 2-7.
tdda diff
NAME
tdda diff — compare csv or parquet files
SYNOPSIS
tdda diff [--fields FIELD1,FIELD2,...]
[--xfields FIELD1,FIELD2,... ]
[--horizontal] [-H] [--vertical] [-V]
[--find-md] [--no-md]
[--maxdiffs N] [--key FIELD]
[--mono] [--bw] [--colours COLOURS] [-c COLOURS]
[--dps N] [--precision N]
[--AE] [--LR] [--angles] [--pm]
[--prefixes PREFIXES]
[-N] [--no-config]
[--strict] [--medium] [--loose] [--permissive]
LEFT RIGHT
POSITIONAL ARGUMENTS
LEFT The first dataset to be compared, as a parquet or flat file
(e.g. CSV), optionally using : format to specify flat-file metadata
(see the help for tdda serial).
(Normally thought of as left or actual)
RIGHT The second dataset to be compared as a parquet or flat file
(e.g. CSV), optionally using : format to specify flat-file metadata
(see the help for tdda serial).
(Normally thought of as right, expected, reference, etc.)
DESCRIPTION
The tdda diff command compares two tabular datasets in CSV or Parquet
files and shows some or all differences. It uses the same underlying
functionality as the tdda.referencetest assertions such as
assertDataFramesEqual, and provides similar control over what
differences to consider, e.g. which fields, and strictness of type and
numeric comparisons. It also provides a number of options for controlling
the display of differences.
By default, comparisons are row-based and consider all fields (columns), as typed values after reading.
OPTIONS
* indicates options that are the default behaviours
--fields FIELD1,FIELD2,...
Check only these fields (comma-separated list)
--xfields FIELD1,FIELD2,...
Check all fields except these (comma-separated list)
--horizontal, -H
Horizontal display (left and right, side by side)
--vertical, -V
Vertical display (left above right)
--find-md
Attempt to find associated metadata for flat files automatically,
without requiring : colon syntax in the path.
--no-md, --no-find-md
Do not attempt to find associated metadata for flat files (default).
--key FIELD
Use this field as a join key when reporting differences.
--maxdiffs N
Maximum number of differences to show.
--mono
Show monochrome output with different values in bold
and shared values dimmed.
--bw
Show black and white output with different values in bold and shared
values in the terminal's default style.
--colours COLOURS, -c COLOURS
Use colours specified e.g. -c red-blue
--dps N
Number of decimal places to show for floating-point values.
Also sets precision if not specified separately.
--precision N
Precision for floating point comparisons. Two floats a and b will be
considered equal if abs(a - b) < 1e-N.
--AE
Use A: and E: as labels for the two datasets (actual/expected)
--LR
Use L: and R: as labels for the two datasets (left/right)
--angles
Use < and > as labels for the two datasets
--pm
Use + and - as labels for the two datasets
--prefixes PREFIXES
Use prefixes specified as labels for the two datasets
e.g. --prefixes "actual:-ref:" or "actual: -ref: " to include spaces
-N, --no-config
Use default configuration (ignore ~/.tdda.toml)
--strict
Use strict type comparisons
--medium
Use medium-strictness type comparisons
--loose
Use loose (permissive) type comparisons
--permissive
Use loose (permissive) type comparisons
--pandas, --pd Use Pandas as DataFrame engine. *
--polars, --pl Use Polars as DataFrame engine.
--backend, -B BACKEND Backend choice for Pandas
(when dataframe engine is Pandas)
n for numpy_nullable *
a for pyarrow
o for original.
--help, -?, --?
Show help on tdda diff.
EXAMPLES
Data suitable for all examples can be obtained with
tdda examples diff
tdda diff a.csv a.csv
This is the simplest form of the command. It will read a.csv and
convert it to a data frame, using the default back end (Pandas).
tdda diff a.csv b.csv --vertical
Compare two CSV files, stacking left and right values vertically rather than side by side. Useful when there are many columns or long values.
tdda diff before.parquet after.parquet --key Income,Expenditure
Compare two Parquet files using a composite join key. The fields
Income and Expenditure must form a primary key in both datasets.
Rows are matched by key rather than by position.
tdda diff actual.csv expected.csv --AE --bw
Compare two CSV files using A: and E: as markers for actual and
expected, with black-and-white bold highlighting instead of colour.
tdda diff foo.csv: bar.csv:
Compare two CSV files, asking TDDA to find associated metadata files
for each using naming conventions (e.g. @.serial or
foo-metadata.json in the same directory).
tdda diff foo.csv bar.txt:money.serial
Compare foo.csv (loaded with default settings) against bar.txt,
using money.serial as the metadata file describing its format.
tdda diff a.parquet b.csv --loose --dps 3
Compare a Parquet file against a CSV file with loose type matching and floating-point values compared to 3 decimal places.
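The comparison rule given under --precision, that two floats a and b are considered equal if abs(a - b) < 1e-N, can be written directly in Python. The function name floats_equal is hypothetical; the rule itself is the one stated in the options above, and with --dps 3 and no separate --precision the value of N would be 3.

```python
def floats_equal(a, b, precision):
    """True if a and b differ by less than 10 ** -precision."""
    return abs(a - b) < 10 ** -precision


print(floats_equal(1.2344, 1.2349, 3))  # differ by about 0.0005
print(floats_equal(1.234, 1.236, 3))    # differ by about 0.002
```

Note that this is an absolute tolerance, so it treats small and large values alike.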
tdda ls
NAME
tdda ls — List fields in a dataset
SYNOPSIS
tdda ls [-h] [-1|--one-line] [-l] [--pandas] [--polars]
[--backend BACKEND]
INPUT
POSITIONAL ARGUMENTS
INPUT is one of:
a CSV file (or .tsv, .psv, .txt)
a Parquet file (.parquet)
a flat file with colon syntax to trigger metadata lookup (e.g. foo.csv:)
a flat file with an explicit metadata path (e.g. foo.csv:foo.serial)
DESCRIPTION
The tdda ls command lists the fields in a dataset.
Without --long, it prints a one-line summary followed by the field
names, right-aligned.
With --long, it prints a one-line summary followed by a table showing
each field's dtype, minimum value, maximum value, and null count.
For flat files, a second line reports how the file was read and which metadata file was used, if any.
OPTIONS
-h, -?, --help Show this help message and exit
-1, --one-line List all field names on one line, space-separated
-l, --long Show dtype, min, max, and null count per field
--pandas, --pd Use Pandas as DataFrame engine (default)
--polars, --pl Use Polars as DataFrame engine
--backend, -B BACKEND Backend choice for Pandas
n for numpy_nullable *
a for pyarrow
o for original
EXAMPLES
The example data can be obtained by running tdda examples, which will
create various directories, including serial_examples.
tdda ls accounts1k.parquet
List the fields in accounts1k.parquet.
tdda ls -l accounts1k.csv:
Show field details for accounts1k.csv, using any associated metadata
file found automatically.
tdda ls -l accounts1k.csv --polars
Show field details using Polars.
SEE ALSO
tdda-diff(1),
tdda-serial(1),
tdda-verify(1)
tdda cat
NAME
tdda cat — Display rows from a dataset as a rich table
SYNOPSIS
tdda cat [-h] [N | -N | +N] [-s | -S]
[--fields FIELDS] [--xfields FIELDS]
[-r N [--seed SEED]]
[--pandas] [--polars] [--backend BACKEND]
INPUT [FIELD ...]
POSITIONAL ARGUMENTS
INPUT is one of:
a CSV file (or .tsv, .psv, .txt)
a Parquet file (.parquet)
a flat file with colon syntax to trigger metadata lookup (e.g. foo.csv:)
a flat file with an explicit metadata path (e.g. foo.csv:foo.serial)
FIELD ... Field names (or fnmatch wildcard patterns) to display.
Fields appear in the order given. Equivalent to --fields;
both may be combined. Wildcards must be quoted in the shell.
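For readers unfamiliar with fnmatch wildcards, the sketch below shows how such patterns select and exclude field names using Python's standard fnmatch module. It is a guess at the general behaviour, not tdda's code: the helper name select_fields, the sample field names, case-sensitive matching, and applying exclusions (as with --xfields) after inclusions are all assumptions.

```python
from fnmatch import fnmatchcase


def select_fields(all_fields, patterns=(), exclude=()):
    """Pick fields matching patterns (in pattern order), minus exclusions."""
    selected = []
    # With no patterns given, start from every field.
    for pattern in patterns or ["*"]:
        for name in all_fields:
            if fnmatchcase(name, pattern) and name not in selected:
                selected.append(name)
    return [
        name for name in selected
        if not any(fnmatchcase(name, pattern) for pattern in exclude)
    ]


# Hypothetical field names.
fields = ["id", "amount_gbp", "amount_gbp_raw", "name", "amount_usd"]
print(select_fields(fields, ["amount*"], ["*_raw"]))
print(select_fields(fields, ["name", "id"]))
```

Patterns such as amount* need quoting on the command line because the shell would otherwise try to expand them against filenames before tdda sees them.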
DESCRIPTION
The tdda cat command displays rows from a dataset as a rich table.
Without a row count, all rows are shown.
N or -N First N rows
+N Last N rows
Null values are shown as ∅.
OPTIONS
-h, -?, --help Show this help message and exit
--fields FIELDS Show only these fields. FIELDS is a
comma- or space-separated list of field names
or fnmatch wildcard patterns (e.g. eu_*,
[a-z]*). Fields appear in the order
specified. Requires quoting in the shell when
using spaces or wildcards.
--xfields FIELDS Exclude these fields. Same format as
--fields. Fields appear in dataset order.
-s Short headers: column width driven by data;
headers split at word boundaries (punctuation
and lowercase→uppercase transitions) and packed
onto as few lines as possible.
-S Short headers: as -s but split anywhere
(mid-word) to fit the data width.
-r N, --random N Show N random rows instead of a slice.
--seed SEED Random seed for -r. If omitted, a seed is
chosen automatically and printed.
--pandas, --pd Use Pandas as DataFrame engine (default)
--polars, --pl Use Polars as DataFrame engine
--backend, -B BACKEND Backend choice for Pandas
n for numpy_nullable *
a for pyarrow
o for original
EXAMPLES
tdda cat accounts1k.parquet
Display all rows from accounts1k.parquet.
tdda cat -10 accounts1k.csv:
Display the first 10 rows, using any associated metadata file.
tdda cat +10 accounts1k.csv:
Display the last 10 rows.
tdda cat --fields 'name,balance' accounts1k.csv:
Display only the name and balance fields.
tdda cat --fields 'amount*' --xfields '*_raw' accounts1k.csv:
Display fields matching amount*, excluding those ending in _raw.
tdda cat -r 20 --seed 42 accounts1k.csv:
Display 20 random rows with a fixed seed.
tdda cat -s accounts1k.csv:
Display all rows with compact multi-line headers, splitting at word
boundaries (open_date → open date, accountType → account Type).
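The word-boundary rule described for -s (split at punctuation and at lowercase-to-uppercase transitions) can be approximated with a regular expression. This is a sketch under that reading of the rule, not tdda's implementation; the name split_header is hypothetical, and splitting on zero-width matches needs Python 3.7 or later.

```python
import re

# Split at runs of punctuation, underscores or whitespace,
# or at a lowercase-to-uppercase transition (a zero-width match).
WORD_BOUNDARY = re.compile(r"[\W_]+|(?<=[a-z])(?=[A-Z])")


def split_header(header):
    """Split a field name into the words a short header could wrap on."""
    return [part for part in WORD_BOUNDARY.split(header) if part]


print(split_header("open_date"))
print(split_header("accountType"))
```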
SEE ALSO
tdda-head(1),
tdda-tail(1),
tdda-sample(1),
tdda-ls(1),
tdda-diff(1),
tdda-serial(1)
tdda head
NAME
tdda head — Display the first N rows of a dataset
SYNOPSIS
tdda head [-h] [N] [-s | -S]
[--fields FIELDS] [--xfields FIELDS]
[--pandas] [--polars] [--backend BACKEND]
INPUT [FIELD ...]
POSITIONAL ARGUMENTS
INPUT Dataset path (CSV, Parquet, or colon syntax).
FIELD ... Field names (or fnmatch wildcard patterns) to display.
Fields appear in the order given. Equivalent to --fields;
both may be combined. Wildcards must be quoted in the shell.
DESCRIPTION
The tdda head command displays the first N rows of a dataset (default 10)
as a rich table.
Null values are shown as ∅.
OPTIONS
-h, -?, --help Show this help message and exit
N Number of rows to show (default 10)
--fields FIELDS Show only these fields. FIELDS is a
comma- or space-separated list of field names
or fnmatch wildcard patterns (e.g. eu_*,
[a-z]*). Fields appear in the order
specified. Requires quoting in the shell when
using spaces or wildcards.
--xfields FIELDS Exclude these fields. Same format as
--fields. Fields appear in dataset order.
-s Short headers: column width driven by data;
headers split at word boundaries and packed
onto as few lines as possible.
See tdda-cat(1) for details.
-S Short headers: split anywhere to fit data width.
--pandas, --pd Use Pandas as DataFrame engine (default)
--polars, --pl Use Polars as DataFrame engine
--backend, -B BACKEND Backend choice for Pandas
n for numpy_nullable *
a for pyarrow
o for original
EXAMPLES
tdda head accounts1k.parquet
Display the first 10 rows of accounts1k.parquet.
tdda head 20 accounts1k.csv:
Display the first 20 rows, using any associated metadata file.
tdda head --fields 'name,balance' accounts1k.csv:
Display only name and balance for the first 10 rows.
tdda head -s 20 accounts1k.csv:
Display the first 20 rows with compact multi-line headers.
SEE ALSO
tdda-cat(1),
tdda-tail(1),
tdda-sample(1),
tdda-ls(1),
tdda-diff(1),
tdda-serial(1)
tdda tail
NAME
tdda tail — Display the last N rows of a dataset
SYNOPSIS
tdda tail [-h] [N] [-s | -S]
[--fields FIELDS] [--xfields FIELDS]
[--pandas] [--polars] [--backend BACKEND]
INPUT [FIELD ...]
POSITIONAL ARGUMENTS
INPUT Dataset path (CSV, Parquet, or colon syntax).
FIELD ... Field names (or fnmatch wildcard patterns) to display.
Fields appear in the order given. Equivalent to --fields;
both may be combined. Wildcards must be quoted in the shell.
DESCRIPTION
The tdda tail command displays the last N rows of a dataset (default 10)
as a rich table.
Null values are shown as ∅.
OPTIONS
-h, -?, --help Show this help message and exit
N Number of rows to show (default 10)
--fields FIELDS Show only these fields. FIELDS is a
comma- or space-separated list of field names
or fnmatch wildcard patterns (e.g. eu_*,
[a-z]*). Fields appear in the order
specified. Requires quoting in the shell when
using spaces or wildcards.
--xfields FIELDS Exclude these fields. Same format as
--fields. Fields appear in dataset order.
-s Short headers: column width driven by data;
headers split at word boundaries and packed
onto as few lines as possible.
See tdda-cat(1) for details.
-S Short headers: split anywhere to fit data width.
--pandas, --pd Use Pandas as DataFrame engine (default)
--polars, --pl Use Polars as DataFrame engine
--backend, -B BACKEND Backend choice for Pandas
n for numpy_nullable *
a for pyarrow
o for original
EXAMPLES
tdda tail accounts1k.parquet
Display the last 10 rows of accounts1k.parquet.
tdda tail 20 accounts1k.csv:
Display the last 20 rows, using any associated metadata file.
tdda tail --fields 'name,balance' accounts1k.csv:
Display only name and balance for the last 10 rows.
tdda tail -s 20 accounts1k.csv:
Display the last 20 rows with compact multi-line headers.
SEE ALSO
tdda-cat(1),
tdda-head(1),
tdda-sample(1),
tdda-ls(1),
tdda-diff(1),
tdda-serial(1)
tdda sample
NAME
tdda sample — Display N random rows from a dataset
SYNOPSIS
tdda sample [-h] [N] [--seed SEED] [-s | -S]
[--fields FIELDS] [--xfields FIELDS]
[--pandas] [--polars] [--backend BACKEND]
INPUT [FIELD ...]
POSITIONAL ARGUMENTS
INPUT Dataset path (CSV, Parquet, or colon syntax).
FIELD ... Field names (or fnmatch wildcard patterns) to display.
Fields appear in the order given. Equivalent to --fields;
both may be combined. Wildcards must be quoted in the shell.
DESCRIPTION
The tdda sample command displays N randomly selected rows from a dataset
(default 10) as a rich table.
When no --seed is given, a random seed is chosen automatically and printed
so the result can be reproduced.
Null values are shown as ∅.
OPTIONS
-h, -?, --help Show this help message and exit
N Number of random rows to show (default 10)
--seed SEED Random seed. If omitted, a seed is chosen
automatically and printed.
--fields FIELDS Show only these fields. FIELDS is a
comma- or space-separated list of field names
or fnmatch wildcard patterns (e.g. eu_*,
[a-z]*). Fields appear in the order
specified. Requires quoting in the shell when
using spaces or wildcards.
--xfields FIELDS Exclude these fields. Same format as
--fields. Fields appear in dataset order.
-s Short headers: column width driven by data;
headers split at word boundaries and packed
onto as few lines as possible.
See tdda-cat(1) for details.
-S Short headers: split anywhere to fit data width.
--pandas, --pd Use Pandas as DataFrame engine (default)
--polars, --pl Use Polars as DataFrame engine
--backend, -B BACKEND Backend choice for Pandas
n for numpy_nullable *
a for pyarrow
o for original
EXAMPLES
tdda sample accounts1k.parquet
Display 10 random rows from accounts1k.parquet, printing the seed used.
tdda sample 50 accounts1k.csv:
Display 50 random rows, using any associated metadata file.
tdda sample 20 --seed 42 accounts1k.csv:
Display 20 random rows with a fixed seed (reproducible).
tdda sample --fields 'name,balance' accounts1k.csv:
Display 10 random rows showing only name and balance.
tdda sample -s 20 --seed 42 accounts1k.csv:
Display 20 random rows with compact multi-line headers.
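The value of printing the seed is that the same sample can be drawn again later. The Python sketch below illustrates the idea with the standard random module; it does not claim to reproduce the rows tdda sample would pick, and the helper name sample_rows and the choice to keep sampled rows in dataset order are assumptions of the sketch.

```python
import random


def sample_rows(rows, n=10, seed=None):
    """Return (seed, sampled rows); the seed is chosen if not supplied."""
    if seed is None:
        seed = random.randrange(2**32)
    rng = random.Random(seed)
    # Sample row positions, then sort them so rows stay in dataset order.
    positions = sorted(rng.sample(range(len(rows)), min(n, len(rows))))
    return seed, [rows[i] for i in positions]


# Hypothetical dataset of 1000 rows.
rows = [{"id": i} for i in range(1000)]
seed, first = sample_rows(rows, 20)
print(f"seed: {seed}")
_, again = sample_rows(rows, 20, seed=seed)
print(first == again)
```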
SEE ALSO
tdda-cat(1),
tdda-head(1),
tdda-tail(1),
tdda-ls(1),
tdda-diff(1),
tdda-serial(1)
tdda serial
NAME
tdda serial — Converts and generates serial metadata files.
SYNOPSIS
tdda serial [FLAGS] inmetadata outmetadata
tdda serial --to FMT [FLAGS] inmetadata outmetadata
Converts metadata from one metadata format, in inmetadata,
to another, in outmetadata.
tdda serial [FLAGS] indata outmetadata
Creates metadata for indata in outmetadata
tdda serial [FLAGS] inmetadata script.py
Creates Python code, in script.py, for reading a file in the format
described by inmetadata. Often, a reading library is specified, e.g.
tdda serial a.serial a.py --to pd.r
which specifies that the Python script should use pandas.read_csv.
Supported formats FMT:
SHORT FORM LONG FORM/Description
. tdda.serial
pd.r pandas.read_csv
pd.w pandas.DataFrame.to_csv
pl.r polars.read_csv
pl.w polars.DataFrame.write_csv
csv.r python.csv.reader
csv.w python.csv.writer
csvw CSVW
fl frictionless
fless frictionless
fl.r frictionless.resource
fl.p frictionless.package
Multiple formats can be separated by commas.
Format is usually inferred from filename if following common conventions
for tdda.serial, CSVW, and frictionless.
OPTIONS
--to FMT Specify output metadata format (see list of
formats above)
-B BE, --backend BE Specify backend for Pandas flavours:
n: numpy_nullable
a: pyarrow
o: original Pandas backend.
--for FILE Filename for data to use when generating CSVW
or Frictionless data.
(Can also be used for tdda.serial and .py
output)
-N, --no-config Use default configuration (ignore ~/.tdda.toml)
-g, --gen, --generate Generate (infer) metadata for flat file
-q, --quiet Quiet output
-v, --verbose Verbose output
-V, --Verbose More verbose output
Options used primarily or exclusively with --generate/--gen/-g
--sep D, --delimiter D Specify D as the field separator.
--quote-char Q, --quote Q Specify Q as the quote character.
(Q is always " or ' in practice.)
--nulls S Specify null indicator, or comma-separated
list of null indicators.
--escape Use backslash as escape character.
NOTE: Always backslash: does not take
argument.
--no-escape Do not support backslash escaping with -g.
NOTE: This only affects quotes, separators,
and backslashes. Standard escapes for
control sequences (\t, \n, \r, \f)
are always supported.
--stutter Specify quote stuttering.
Usually an alternative to --escape.
--no-stutter Do not use quote stuttering.
Usually used with --escape.
--encoding ENC, -e ENC Specify ENC as encoding.
--date-format D Specify D as the (file-wide default)
date format.
--datetime-format D Specify D as the (file-wide default) format
for datetime fields.
--sample-lines N, -n N Use (up to) N sample lines when inferring
metadata.
--single-field, -1 Inform the metadata inferrer that the file
contains only a single field (column).
--include-path Include path in .serial output
--exclude-path Do not include path in .serial output
--quoting Q Set quoting to Q. Q must be one of:
QUOTE_ALL
QUOTE_MINIMAL
QUOTE_NONNUMERIC
QUOTE_NONE
QUOTE_NOTNULL
QUOTE_STRINGS
QUOTE_STRINGS_ONLY
--use-literal-dates Specifies that date formats should be
written to .serial files with unambiguous
literal examples such as 2000-12-31T12:34:56.
--use-yyyy-dates Specifies that date formats should be
written to .serial files in the form
exemplified by YYYY-MM-DD HH:MM:SS.
--use-pc-dates Specifies that date formats should be
written to .serial files in Python
strftime-compatible % formats, exemplified
by %Y-%m-%dT%H:%M:%S.
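The relationship between the literal style and the strftime-compatible style can be checked with Python's standard datetime module. This is only an illustration using the two examples quoted above; it says nothing about how tdda itself stores or converts the formats.

```python
from datetime import datetime

# The literal example and the %-style format quoted in the option
# descriptions above.
LITERAL = "2000-12-31T12:34:56"
PC_FORMAT = "%Y-%m-%dT%H:%M:%S"

# Parsing the literal example with the %-style format recovers the
# date and time it stands for.
parsed = datetime.strptime(LITERAL, PC_FORMAT)
print(parsed)                                   # 2000-12-31 12:34:56

# Formatting it again reproduces the literal example exactly.
print(parsed.strftime(PC_FORMAT) == LITERAL)    # True
```

The literal example uses a different value in every position (month 12, day 31, hour 12 is the only repeat), which is what lets a reader tell the components apart by sight.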
EXAMPLES
tdda serial a.csv a.serial
Generate tdda.serial metadata describing format of a.csv in a.serial
tdda serial --to . a.csv a.serial
Same as previous, explicitly specifying the default, tdda.serial, output format (. is short for tdda.serial format).
tdda serial a.csv a-metadata.json
Generate CSVW metadata describing format of a.csv in a-metadata.json
tdda serial --to csvw a.csv a.json
Same as previous, explicitly specifying format with non-standard output name
tdda serial a.serial a-metadata.json
Converts tdda.serial metadata to CSVW
tdda serial a-metadata.json a.serial
Converts CSVW metadata to tdda.serial
USING SERIAL METADATA WITH TDDA COMMANDS
For all tdda command-line commands, and in most places within
API calls where a CSV or other flat file is specified, there is the
option to specify the file format using tdda.serial files,
CSVW files, or Frictionless files. This is based on the : (colon) specifier.
When specifying a path to a CSV (or other flat) file:
If the path is used by itself, the tdda library will use either tdda.serial.csv_to_pandas or tdda.serial.csv_to_polars to read it into a DataFrame. The default is currently pandas (with the numpy_nullable back end), but this can be configured (see tdda config) or, in many cases controlled with command line flags (--polars, --pandas, --backend BACKEND (for Pandas only)).
If the path ends in a colon (e.g. foo.csv:), TDDA will search for metadata in the same directory as the file and, if it finds one, pass that to the appropriate csv_to_... function for more accurate DataFrame generation.
In doing this, it will look for the following in priority order, given a file foo.csv:
    foo.csv.serial (tdda.serial metadata)
    foo.serial (tdda.serial metadata). This is actually more common than the previous form, but if there are multiple files with different extensions, the former is more specific, so is checked first.
    Anything that matches foo using @ as a wildcard, e.g. @.serial, f@.serial, f@o.serial, @oo.serial. (@ acts like * in the shell, while avoiding needing * in filenames, which can be awkward.)
    foo-metadata.json, foo-csvmetadata.json, foo-csv-metadata.json, foo.csvmetadata.json, foo.csv-metadata.json (all of which are common conventions for CSVW metadata files).
    The same CSVW patterns with @ wildcards
    foo.serial.json, foo.serial.yaml, foo.resource.json, foo.resource.yaml, foo.package.json, foo.package.yaml, all of which are common for Frictionless metadata files.
    The same patterns for serial or package frictionless files with @ wildcards. Wildcards are not searched in resource files, because in frictionless these always correspond to a single data file.
If the path contains a colon, the part to the right of the colon will be interpreted as a metadata file. So foo.csv:bar.serial will use bar.serial.
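The three forms of the colon specifier and the @ wildcard can be sketched in a few lines of Python. The function names split_specifier and wildcard_matches are hypothetical and this is not tdda's code; it only mirrors the rules described above for flat-file paths, and ignores complications such as drive letters in Windows paths.

```python
import re


def split_specifier(spec):
    """Split a flat-file specifier into (data_path, metadata).

    metadata is None when there is no colon, '' when the path ends in a
    colon (meaning "search for metadata"), or the explicit metadata path.
    """
    data_path, colon, metadata = spec.partition(":")
    return data_path, (metadata if colon else None)


def wildcard_matches(pattern, stem):
    """True if pattern, with @ acting like * in the shell, matches stem."""
    regex = ".*".join(re.escape(part) for part in pattern.split("@"))
    return re.fullmatch(regex, stem) is not None


print(split_specifier("foo.csv"))              # ('foo.csv', None)
print(split_specifier("foo.csv:"))             # ('foo.csv', '')
print(split_specifier("foo.csv:bar.serial"))   # ('foo.csv', 'bar.serial')

# The wildcard stems from the list above all match the stem 'foo'.
print([wildcard_matches(p, "foo") for p in ("@", "f@", "f@o", "@oo")])
# [True, True, True, True]
```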
BUGS
The tdda serial functionality is fairly new, and there are probably
still bugs and undesirable features in the implementation.
SEE ALSO
Test Driven Data Analysis, book by Nicholas J. Radcliffe, chapter 8.
tdda gentest
NAME
tdda gentest — Gentest writes tests, so you don't have to.™
SYNOPSIS
tdda gentest Runs the Gentest Wizard
tdda gentest 'SHELL COMMAND' [OPTIONS] [test_output.py]
[REFERENCE_FILE ...]
POSITIONAL ARGUMENTS
SHELL COMMAND is the command to be tested. It should normally be enclosed in single quotes. It can be any terminal command — a shell built-in, a shell script, an R program, a Python program, or anything else that can be run from the terminal.
test_output.py is the name of the Python test script to generate. If not specified, Gentest derives a name from the command.
REFERENCE_FILE ... are optional additional files or directories that Gentest should monitor for files created or modified during command execution.
DESCRIPTION
Gentest will create Python tests, using tdda's reference-testing capabilities, for terminal-based programs written in any language. For example, the shell command can be a built-in shell command or can run a shell script, an R program, or of course a Python program.
It has a wizard, invoked just by typing tdda gentest, that prompts for
the information it needs before generating the tests.
Alternatively, the command to be tested and optionally other parameters can all be specified on the command line.
Gentest:
Runs the provided command more than once (by default)
Captures output to stdout and stderr
Captures the exit code
Notices any files created in the directory or subdirectories or other specified places
Uses variations in output and other heuristics to identify parts of the output that appear variable and uses rexpy to write reference tests that only test things that appear to be fixed and not system dependent.
Writes a Python test script, using tdda.referencetest, that contains a set of tests of the shell command specified.
The test script can then, of course, be edited by hand.
The test script, when run, executes the command again and checks that its behaviour is as expected (i.e., is “the same” as when Gentest ran originally, except for the variations allowed in the reference test specifications).
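The idea of running a command more than once and keeping only the parts of its output that do not vary can be sketched with the Python standard library. This is a deliberately simplified illustration, not Gentest's method: it compares runs line by line, whereas Gentest uses rexpy and other heuristics. The helper names run_once and stable_lines, and the command itself, are made up for the example.

```python
import subprocess
import sys


def run_once(args):
    """Run a command, returning (stdout, stderr, exit code)."""
    done = subprocess.run(args, capture_output=True, text=True)
    return done.stdout, done.stderr, done.returncode


def stable_lines(outputs):
    """Return the lines of the first run that are identical in every run."""
    runs = [out.splitlines() for out in outputs]
    first, rest = runs[0], runs[1:]
    return [line for i, line in enumerate(first)
            if all(i < len(run) and run[i] == line for run in rest)]


# A command whose second output line changes from run to run.
command = [sys.executable, "-c",
           "import random; print('Hey, cats!'); print(random.random())"]

results = [run_once(command) for _ in range(2)]
fixed = stable_lines([stdout for stdout, _, _ in results])
exit_codes = [code for _, _, code in results]

print(fixed)        # lines that could safely be tested exactly
print(exit_codes)   # the exit code from each run
```

Only the first line survives the comparison here, so a generated test would check it exactly and treat the second line as variable.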
OPTIONS
-h, --help Show this help message and exit
-?, --? Same as -h or --help
-m N, --max-files N Max files to track
-r, --relative-paths Show relative paths wherever possible
-n N, --iterations N Number of times, N, to run the command
(default 2)
-O, --no-stdout Do not generate a test checking output to STDOUT
-E, --no-stderr Do not generate a test checking output to STDERR
-Z, --non-zero-exit Do not require exit status to be 0
-C, --no-clobber Do not overwrite existing test script or
reference directory
-N, --no-config Use default configuration (ignore ~/.tdda.toml)
EXAMPLES
tdda gentest
Runs the Gentest wizard, which presents a dialogue something like this
(where all suggested answers, in square brackets, are accepted by
hitting RETURN). (Obviously, this is an improbably simple command to test;
it's usually a command to run a script or program.)
$ tdda gentest
Enter shell command to be tested: echo "Hey, cats!"
Enter name for test script [test_echo__Hey__cats__]:
Check all files written under $(pwd)?: [y]:
Check all files written under (gentest's) $TMPDIR?: [y]:
Enter other files/directories to be checked, one per line, then a blank line:
Check stdout?: [y]:
Check stderr?: [y]:
Exit code should be zero?: [y]:
Clobber (overwrite) previous outputs (if they exist)?: [y]:
Number of times to run script?: [2]:
Running command 'echo "Hey, cats!"' to generate output (run 1 of 2).
Saved (non-empty) output to stdout to /home/tdda/ref/echo__Hey__cats__/STDOUT.
Saved (empty) output to stderr to /home/tdda/ref/echo__Hey__cats__/STDERR.
Running command 'echo "Hey, cats!"' to generate output (run 2 of 2).
Saved (non-empty) output to stdout to /home/tdda/ref/echo__Hey__cats__/2/STDOUT.
Saved (empty) output to stderr to /home/tdda/ref/echo__Hey__cats__/2/STDERR.
Test script written as /home/tdda/test_echo__Hey__cats__.py
Command execution took: 0.022s
SUMMARY:
Directory to run in: /home/tdda
Shell command: echo "Hey, cats!"
Test script generated: /home/tdda/test_echo__Hey__cats__.py
Reference files: (none)
Check stdout: yes (was 'Hey, cats!\n')
Check stderr: yes (was empty)
Expected exit code: 0
Clobbering permitted: yes
Number of times script ran: 2
Number of tests written: 4
tdda gentest 'echo "Hey, cats!"' 'test_echo.py' -n 3
Same as above except that the command and a custom name for the test script have been supplied, so the wizard does not run, and the number of times to run the command has been increased to three.
The test script produced is almost identical except for the number of times the command is run.
tdda gentest 'diff verifier1.txt verifier2.txt' -Z
Gentest will normally fail if the program produces a non-zero exit
code, generally indicating an error. Commands like diff, however,
produce a non-zero exit code (1) when there are differences. The -Z
option (or --non-zero-exit) allows the exit code to be non-zero, and
Gentest generates a test that checks it is the expected value (1, in
this case, if the two verifier files should be different).
SEE ALSO
rexpy(1), tdda-diff(1)
Test Driven Data Analysis, book by Nicholas J. Radcliffe, chapter 9, and chapters 9-12 for reference testing more generally.
tdda tag
NAME
tdda tag — tag tests that failed in the last reference test run
SYNOPSIS
tdda tag
DESCRIPTION
The tdda tag command reads the log of failing tests written by the
most recent logged tdda.referencetest run and adds @tag decorators
to those tests in their source files. Tagged tests can then be run in
isolation, allowing a rapid edit-test cycle focused on failing
tests. A logged run of tdda.referencetest uses --log-failures or
(for unittest-style tests only) -F.
WORKFLOW
A typical workflow with unittest-style tests (ReferenceTestCase) is:
python tests.py -9 # Remove any existing @tag decorators
python tests.py -F # Run tests, logging failures
tdda tag # Add @tag to failing tests
python tests.py -1 # Run only tagged (failing) tests
When all tests are passing:
python tests.py -9 # Remove @tag decorators
The equivalent workflow with pytest is:
pytest --untag # Remove any existing @tag decorators
pytest --log-failures # Run tests, logging failures
tdda tag # Add @tag to failing tests
pytest --tagged # Run only tagged (failing) tests
When all tests are passing:
pytest --untag # Remove @tag decorators
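The idea behind running only tagged tests can be sketched with plain unittest. The tag decorator and the tagged_only helper below are hypothetical stand-ins written for this example; they are not the tdda.referencetest implementation, which should be used in practice.

```python
import io
import unittest


def tag(test):
    """Stand-in decorator (hypothetical): mark a test method as tagged."""
    test._tagged = True
    return test


class TestTotals(unittest.TestCase):
    def test_untagged(self):
        self.assertEqual(1 + 1, 2)

    @tag
    def test_tagged(self):
        self.assertEqual(sum([1, 2, 3]), 6)


def tagged_only(case):
    """Build a suite holding only the tagged test methods of a TestCase."""
    names = unittest.TestLoader().getTestCaseNames(case)
    return unittest.TestSuite(
        case(name) for name in names
        if getattr(getattr(case, name), "_tagged", False)
    )


suite = tagged_only(TestTotals)
selected = [test.id().rsplit(".", 1)[-1] for test in suite]
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
print(selected, result.wasSuccessful())   # ['test_tagged'] True
```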
SEE ALSO
tdda(1)
tdda examples
NAME
tdda examples — Creates example data for TDDA
SYNOPSIS
tdda examples [OUTDIR]
tdda examples [MODULE...] [OUTDIR]
tdda examples all [OUTDIR]
POSITIONAL ARGUMENTS
MODULE can be any of:
referencetest
constraints
rexpy
gentest
book
If not specified, the first four will all be created, without requiring internet access.
OUTDIR is an optional directory in which to write the example directories; by default this will be the current working directory (.).
If all is specified, or book is included,
the tdda-book-examples will be downloaded from GitHub, which
does require internet access.
DESCRIPTION
Write out example code and data for all examples, by default, or for a particular module if specified.
If no module is specified, examples for all four are written out.
Examples are created in subdirectories of OUTDIR (default: the current directory .).
EXAMPLES
tdda examples
Creates the referencetest, constraints, rexpy, and gentest examples in .
tdda examples gentest
Creates examples_gentest in .
tdda examples gentest book
Creates gentest and book examples in .
tdda examples all
Creates all the examples, four from local files and the book examples from GitHub, in .
tdda version
NAME
tdda version — Reports the (active) installed version of tdda
SYNOPSIS
tdda version
DESCRIPTION
Reports the version number of the (active) TDDA tools.
EXAMPLES
tdda version
tdda config
NAME
tdda config — Shows config settings
SYNOPSIS
tdda config [--annotated|-a] [--current|-c] [--default|-d] [--file|-f]
tdda config [--annotated|-a] current|default|file
DESCRIPTION
Shows configuration information. Use:
-c, --current, or current for the current configuration
-d, --default, or default for the default configuration
-f, --file, or file for the configuration file location and contents.
With no argument, it shows the current configuration.
Use -a or --annotated with any of the above to show allowed values
alongside each parameter.
EXAMPLES
tdda config
tdda config -c
tdda config -d
tdda config -f
PARAMETERS
null_rep
Used to show nulls in some contexts.
Default: "∅"
Allowed: Any string
colour
Controls whether output is colourized.
Default: true
Allowed: true, false
engine
Controls whether pandas or polars is used for CSV files by default.
Default: "pandas"
Allowed: "pandas", "polars"
pandas_backend
Controls default backend for CSV loading etc.
Default: "numpy_nullable"
Allowed: "numpy_nullable" (or "n"), "pyarrow" (or "a"), "original" (or "o")
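The defaults, allowed values and short backend names listed above can be expressed as a small validation sketch. The function name resolve and the use of a plain dictionary are assumptions made for illustration; this does not show how tdda reads ~/.tdda.toml.

```python
# Defaults and allowed values copied from the parameter list above.
DEFAULTS = {
    "null_rep": "∅",
    "colour": True,
    "engine": "pandas",
    "pandas_backend": "numpy_nullable",
}
ALLOWED_ENGINES = {"pandas", "polars"}
BACKEND_ALIASES = {"n": "numpy_nullable", "a": "pyarrow", "o": "original"}


def resolve(settings):
    """Merge settings over the defaults and check the allowed values.

    Hypothetical helper: short backend names are expanded to long ones.
    """
    config = {**DEFAULTS, **settings}
    backend = config["pandas_backend"]
    config["pandas_backend"] = BACKEND_ALIASES.get(backend, backend)
    if config["engine"] not in ALLOWED_ENGINES:
        raise ValueError(f"engine: {config['engine']!r} not allowed")
    if config["pandas_backend"] not in BACKEND_ALIASES.values():
        raise ValueError(f"pandas_backend: {backend!r} not allowed")
    if not isinstance(config["colour"], bool):
        raise ValueError("colour must be true or false")
    return config


config = resolve({"engine": "polars", "pandas_backend": "a"})
print(config["engine"], config["pandas_backend"])   # polars pyarrow
```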
PARAMETERS (referencetest)
left_colour
Colour for left (actual) side of diffs.
Default: "red"
Allowed: A named ANSI colour (red, bright_red etc.) or an RGB hex colour with leading # such as #FF0000 for pure red. Interpreted by the rich library.
right_colour
Colour for right (expected) side of diffs.
Default: "green"
Allowed: A named ANSI colour (red, bright_red etc.) or an RGB hex colour with leading # such as #FF0000 for pure red. Interpreted by the rich library.
failure_colour
Colour used to highlight failures.
Default: "red"
Allowed: A named ANSI colour (red, bright_red etc.) or an RGB hex colour with leading # such as #FF0000 for pure red. Interpreted by the rich library.
mono
Use bold instead of colour for diffs.
Default: false
Allowed: true, false
bw
Black and white mode: no colour or bold.
Default: false
Allowed: true, false
left_prefix
Prefix string for left (actual) diff lines.
Default: "< "
Allowed: Any string
right_prefix
Prefix string for right (expected) diff lines.
Default: "> "
Allowed: Any string
vertical
Show diffs vertically rather than side by side.
Default: false
Allowed: true, false
force_val_prefixes
Always show left/right prefixes on diff lines.
Default: false
Allowed: true, false
type_checking
How strictly to check types in reference test comparisons.
Default: "strict"
Allowed: "strict", "medium", "loose"
log_failures
Log failing test IDs to file for use with tdda tag.
Default: false
Allowed: true, false
PARAMETERS (constraints)
interleave
Interleave pass and fail results in verify output.
Default: true
Allowed: true, false
per_constraint
Report results per constraint rather than per field.
Default: true
Allowed: true, false
detect_passes
Include passing fields in detect output.
Default: true
Allowed: true, false
report_formats
List of additional report formats to generate.
Default: []
Allowed: Any subset of "html", "md", "txt", "json", "yaml", "toml"
write_all_records
Write all records to detect output, not just failures.
Default: false
Allowed: true, false
int_bools
Use integers (0/1) rather than booleans in detect output.
Default: false
Allowed: true, false
verify_required_fields
Verify that all required fields are present.
Default: unset
Allowed: true, false
verify_allowed_fields
Verify that no fields are present outside the allowed set.
Default: unset
Allowed: true, false
write_required_fields
Discover should include the required-fields constraint.
Default: false
Allowed: true, false
write_allowed_fields
Discover should include an allowed-fields constraint.
Default: false
Allowed: true, false
PARAMETERS (tddadiff)
type_checking
How strictly to check types when comparing dataframes.
Default: "medium"
Allowed: "strict", "medium", "loose"
find_md
Infer metadata when comparing dataframes with tdda diff.
Default: true
Allowed: true, false
PARAMETERS (serial)
md_inpath
Path(s) to search for serial metadata files; relative paths are resolved relative to the CSV file.
Default: "./_write.serial"
tdda test
NAME
tdda test — Run the tdda library's self-tests
SYNOPSIS
tdda test
DESCRIPTION
Runs tdda's (internal) self-tests.
NOTE: It is hard to guarantee that all will pass on all systems given that dependencies are not tightly pinned. It is not necessarily a problem if some tests fail, but is a concern if a very large number fail.
SEE ALSO
tdda(1)
tdda help
NAME
tdda help — Provides help on tdda and its sub-commands.
SYNOPSIS
tdda help
tdda help COMMAND
POSITIONAL ARGUMENTS
COMMAND can be any of:
discover
verify
detect
examples
gentest
diff
serial
tag
config
help
version
test
installman
DESCRIPTION
Shows help on a tdda subcommand or topic.
Taking inspiration from git, if the man pages are installed
(see tdda installman), help on main commands can also be obtained with
man tdda-COMMAND
For example:
man tdda-discover
Help can also be obtained on each command with --help, -h or -?, e.g.
tdda discover --help
EXAMPLES
tdda help Shows this help
tdda help gentest Shows help on gentest
SEE ALSO
tdda-installman(1)
tdda installman
NAME
tdda installman — install tdda man pages
SYNOPSIS
tdda installman [--system]
DESCRIPTION
Installs the tdda man pages so they can be accessed with the man command.
Once installed, the main tdda man page is available as:
man tdda
Man pages for tdda subcommands are available as:
man tdda-COMMAND
For example:
man tdda-discover
man tdda-gentest
The rexpy man page is accessed as:
man rexpy
By default, man pages are installed to ~/.local/share/man/man1.
On macOS, this directory may not be in the default man search path;
if so, tdda installman will print the line to add to your shell
config file to make the man pages available in new shells.
With --system, man pages are installed to /usr/local/share/man/man1,
which is in the default search path on most systems but may require
running with sudo.
On Windows, man pages are not supported; consider running tdda under
WSL (Windows Subsystem for Linux).
OPTIONS
--system, -s
Install system-wide to /usr/local/share/man/man1 (may require sudo).
EXAMPLES
tdda installman
Install man pages to ~/.local/share/man/man1.
tdda installman --system
Install man pages system-wide (may require sudo).
SEE ALSO
tdda-help(1)
rexpy
NAME
rexpy — infer regular expressions from example strings
SYNOPSIS
rexpy [FLAGS] [INPUTFILE [OUTPUTFILE]]
DESCRIPTION
rexpy reads a list of strings (one per line) and infers one or more
regular expressions that characterize them.
If INPUTFILE is provided it should contain one string per line; otherwise lines are read from standard input.
If OUTPUTFILE is provided, the regular expressions found will be written there (one per line); otherwise they will be printed to standard output.
OPTIONS
-h, --header
Discard the first line as a header.
-?, --help
Print usage information and exit.
-g, --group
Generate capture groups for each variable fragment of each regular
expression, i.e. surround variable components with parentheses.
e.g. ^[A-Z]+\-[0-9]+$
becomes ^([A-Z]+)\-([0-9]+)$
-q, --quote
Display regular expressions as double-quoted, escaped strings,
suitable for use in Unix shells, JSON, and string literals in many
programming languages.
e.g. ^[A-Z]+\-[0-9]+$
becomes "^[A-Z]+\\-[0-9]+$"
--portable, --grep
Produce maximally portable regular expressions
(e.g. [0-9] rather than \d). This is the default.
--java
Produce Java-style regular expressions (e.g. \p{Digit}).
--posix
Produce POSIX-compliant regular expressions
(e.g. [[:digit:]] rather than \d).
--perl
Produce Perl-style regular expressions (e.g. \d).
-u, --underscore
Allow underscore to be treated as a letter.
Mostly useful for matching identifiers. Alias: -_
-d, --dot, --period
Allow dot to be treated as a letter.
Mostly useful for matching identifiers. Alias: -.
-m, --minus, --hyphen, --dash
Allow minus to be treated as a letter.
Mostly useful for matching identifiers.
-vlf, --variable
Use variable-length fragments.
-flf, --fixed
Use fixed-length fragments.
-v, --version
Print the version number.
-V, --verbose
Set verbosity level to 1.
-VV, --Verbose
Set verbosity level to 2.
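The effect of --group can be checked with Python's re module. The two patterns below are the ones quoted under --group above; the sample strings are made up for the illustration, and no rexpy call is involved.

```python
import re

# Patterns copied from the --group description: without and with
# capture groups around the variable fragments.
plain = re.compile(r"^[A-Z]+\-[0-9]+$")
grouped = re.compile(r"^([A-Z]+)\-([0-9]+)$")

samples = ["AB-123", "XYZ-7", "ab-123"]

# Both forms accept exactly the same strings.
print([bool(plain.match(s)) for s in samples])     # [True, True, False]
print([bool(grouped.match(s)) for s in samples])   # [True, True, False]

# The grouped form additionally exposes the variable fragments.
print(grouped.match("AB-123").groups())            # ('AB', '123')
```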
SEE ALSO
tdda(1), tdda-discover(1)