Reference Tests
The referencetest module provides support for unit tests, allowing them to easily compare test results against saved “known to be correct” reference results.
This is typically useful for testing software that produces any of the following types of output:
a CSV file
a text file (for example: HTML, JSON, logfiles, graphs, tables, etc)
a string
a Pandas DataFrame.
The main features are:
If the comparison between a string and a file fails, the actual string is written to a file and a diff command is suggested for seeing the differences between the actual output and the expected output.
There is support for CSV files, allowing fine control over how the comparison is to be performed. This includes:
the ability to select which columns to compare (and which to exclude from the comparison).
the ability to compare metadata (types of fields) as well as values.
the ability to specify the precision (as number of decimal places) for the comparison of floating-point values.
clear reporting of where the differences are, if the comparison fails.
There is support for ignoring lines within the strings/files that contain particular patterns or regular expressions. This is typically useful for filtering out things like version numbers and timestamps that vary in the output from run to run, but which do not indicate a problem.
There is support for re-writing the reference output with the actual output. This, obviously, should be used only after careful checking that the new output is correct, either because the previous output was in fact wrong, or because the intended behaviour has changed.
It allows you to group your reference results into different kinds. This means you can keep different kinds of reference result files in different locations. It also means that you can selectively choose to only regenerate particular kinds of reference results, if they need to be updated because they turned out to have been wrong or if the intended behaviour has changed. Kinds are strings.
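The first feature above (writing the actual string to a file and suggesting a diff command) can be pictured with the following minimal sketch. It uses only the standard library; the helper names check_string and report_mismatch, and the "actual-" file prefix, are hypothetical and are not part of tdda.

```python
import os
import tempfile


def report_mismatch(actual, ref_path, tmp_dir=None):
    """Write `actual` to a file and return a suggested diff command.

    Hypothetical helper for illustration; not tdda's implementation.
    """
    tmp_dir = tmp_dir or tempfile.gettempdir()
    # Name the actual-output file after the reference file for easy pairing.
    actual_path = os.path.join(tmp_dir, 'actual-' + os.path.basename(ref_path))
    with open(actual_path, 'w') as f:
        f.write(actual)
    return 'diff %s %s' % (actual_path, ref_path)


def check_string(actual, ref_path, tmp_dir=None):
    """Return None if `actual` matches the reference file, else a diff hint."""
    with open(ref_path) as f:
        expected = f.read()
    if actual == expected:
        return None
    return report_mismatch(actual, ref_path, tmp_dir)
```

A test framework would typically include the returned command in its failure message, so that the differences can be inspected straight from the terminal.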
Prerequisites
pandas
optional, required for CSV file support, see https://pandas.pydata.org.
pytest
optional, required for tests based on pytest rather than unittest, see https://docs.pytest.org.
These can be installed with:
pip install pandas
pip install pytest
The module provides interfaces that allow it to be called from unit tests based on either the standard Python unittest framework or on pytest.
Simple Examples
Simple unittest example:
For use with unittest, the ReferenceTest API is provided through the ReferenceTestCase class. This is an extension to the standard unittest.TestCase class, so that the ReferenceTest methods can be called directly from unittest tests.
This example shows how to write a test for a function that generates a CSV file:
from tdda.referencetest import ReferenceTestCase, tag
import my_module

class MyTest(ReferenceTestCase):
    @tag
    def test_my_csv_file(self):
        result = my_module.produce_a_csv_file(self.tmp_dir)
        self.assertCSVFileCorrect(result, 'result.csv')

MyTest.set_default_data_location('testdata')

if __name__ == '__main__':
    ReferenceTestCase.main()
To run the test:
python mytest.py
The test is tagged with @tag, meaning that it will be included if you run the tests with the --tagged option flag to specify that only tagged tests should be run:
python mytest.py --tagged
The first time you run the test, it will produce an error unless you have already created the expected (“reference”) results. You can create the reference results automatically:
python mytest.py --write-all
Having generated the reference results, you should carefully examine the files it has produced in the data output location, to check that they are as expected.
Simple pytest example:
For use with pytest, the ReferenceTest API is provided through the referencepytest module. This is a module that can be imported directly from pytest tests, allowing them to access ReferenceTest methods and properties.
This example shows how to write a test for a function that generates a CSV file:
from tdda.referencetest import referencepytest, tag
import my_module

@tag
def test_my_csv_function(ref):
    resultfile = my_module.produce_a_csv_file(ref.tmp_dir)
    ref.assertCSVFileCorrect(resultfile, 'result.csv')

referencepytest.set_default_data_location('testdata')
You also need a conftest.py file, to define the fixtures and defaults:
import pytest
from tdda.referencetest import referencepytest

def pytest_addoption(parser):
    referencepytest.addoption(parser)

def pytest_collection_modifyitems(session, config, items):
    referencepytest.tagged(config, items)

@pytest.fixture(scope='module')
def ref(request):
    return referencepytest.ref(request)

referencepytest.set_default_data_location('testdata')
To run the test:
pytest
The test is tagged with @tag, meaning that it will be included if you run the tests with the --tagged option flag to specify that only tagged tests should be run:
pytest --tagged
The first time you run the test, it will produce an error unless you have already created the expected (“reference”) results. You can create the reference results automatically:
pytest --write-all -s
Having generated the reference results, you should examine the files it has produced in the data output location, to check that they are as expected.
Methods and Functions
- class tdda.referencetest.referencetest.ReferenceTest(assert_fn)
The ReferenceTest class provides support for comparing results against a set of reference “known to be correct” results.
The functionality provided by this class can be used with:
the standard Python unittest framework, using the ReferenceTestCase class. This is a subclass of, and therefore a drop-in replacement for, unittest.TestCase. It extends that class with all of the methods from the ReferenceTest class.
the pytest framework, using the referencepytest module. This module provides all of the methods from the ReferenceTest class, exposed as functions that can be called directly from tests in a pytest suite.
In addition to the various test-assertion methods, the module also provides some useful instance variables. All of these can be set explicitly in test setup code, using the set_defaults() class method.
- all_fields_except(exclusions)
Helper function for use with the check_data, check_types and check_order parameters to assertion functions for Pandas DataFrames. It returns the names of all of the fields in the DataFrame being checked, apart from the ones given.
exclusions is a list of field names.
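As a rough illustration of how such a helper can work with the check_data, check_types and check_order parameters (which accept a function taking a DataFrame and returning a list of field names), here is a minimal sketch. The name fields_except and the SimpleNamespace stand-in for a DataFrame are assumptions made so that the example runs without Pandas; this is not tdda's own code.

```python
from types import SimpleNamespace


def fields_except(exclusions):
    """Return a function mapping a DataFrame to its non-excluded field names.

    Hypothetical sketch of the idea behind all_fields_except().
    """
    def selector(df):
        # Preserve the DataFrame's own column order.
        return [name for name in list(df.columns) if name not in exclusions]
    return selector


# Stand-in for a DataFrame: only the .columns attribute is used here.
frame = SimpleNamespace(columns=['id', 'name', 'updated_at'])
print(fields_except(['updated_at'])(frame))  # ['id', 'name']
```

With the real helper, a call such as check_data=self.all_fields_except(['updated_at']) would exclude a volatile column from the value comparison.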
- assertBinaryFileCorrect(actual_path, ref_path, kind=None)
Check that a binary file matches the contents from a reference binary file.
- actual_path:
A path for a binary file.
- ref_path:
The name of the reference binary file. The location of the reference file is determined by the configuration via set_data_location().
- kind:
The reference kind, used to locate the reference file.
- assertCSVFileCorrect(actual_path, ref_csv, kind='csv', csv_read_fn=None, check_data=None, check_types=None, check_order=None, condition=None, sortby=None, precision=None, **kwargs)
Legacy convenience method with second parameter called ref_csv. Just calls assertOnDiskDataFrameCorrect.
- assertCSVFilesCorrect(actual_paths, ref_csvs, kind='csv', csv_read_fn=None, check_data=None, check_types=None, check_order=None, condition=None, sortby=None, precision=None, **kwargs)
Legacy method that just calls assertOnDiskDataFramesCorrect.
- assertDataFrameCorrect(df, ref_path, actual_path=None, kind='csv', csv_read_fn=None, check_data=None, check_types=None, check_order=None, condition=None, sortby=None, precision=None, type_matching=None, **kwargs)
Check that an in-memory Pandas DataFrame matches a saved reference DataFrame on disk (Parquet or CSV).
- df:
Actual DataFrame.
- ref_path:
Name of reference file, which can be a .parquet file or a CSV file. The location of the reference file is determined by the configuration via set_data_location(). Renamed from csv_path in version 2.2.
- actual_path:
Optional parameter, giving path for file where actual DataFrame originated, used for error messages.
- kind:
(Optional) reference kind (a string; see above), used to locate the reference CSV file.
- csv_read_fn:
(Optional) function to read a CSV file to obtain a pandas DataFrame. If None, then a default CSV loader is used.
The default CSV loader function is a wrapper around Pandas pd.read_csv(), with default options as follows:
index_col is None
infer_datetime_format is True
quotechar is "
quoting is csv.QUOTE_MINIMAL
escapechar is \ (backslash)
na_values are the empty string, "NaN", and "NULL"
keep_default_na is False
It also accepts the check_data, check_types, check_order, check_extra_cols, sortby, condition and precision optional parameters described in assertDataFramesEqual().
Raises NotImplementedError if Pandas is not available.
- assertDataFramesEqual(df, ref_df, actual_path=None, expected_path=None, check_data=None, check_types=None, check_order=None, condition=None, sortby=None, precision=None, type_matching=None)
Check that an in-memory Pandas DataFrame matches an in-memory reference one.
- df:
Actual DataFrame.
- ref_df:
Expected DataFrame.
- actual_path:
(Optional) path for file where actual DataFrame originated, used for error messages.
- expected_path:
(Optional) path for file where expected DataFrame originated, used for error messages.
- check_data:
(Optional) restriction of fields whose values should be compared. Possible values are:
None or True (to apply the comparison to all fields; this is the default)
False (to skip the comparison completely)
a list of field names (to check only these fields)
a function taking a DataFrame as its single parameter, and returning a list of field names to check.
- check_types:
(Optional) restriction of fields whose types should be compared. See check_data (above) for possible values.
- check_order:
(Optional) restriction of fields whose (relative) order should be compared. See check_data (above) for possible values.
- check_extra_cols:
(Optional) restriction of extra fields in the actual dataset which, if found, will cause the check to fail. See check_data (above) for possible values.
- sortby:
(Optional) specification of fields to sort by before comparing. Possible values are:
None or False (do not sort; this is the default)
True (to sort on all fields based on their order in the reference datasets; you probably don’t want to use this option)
a list of field names (to sort on these fields, in order)
a function taking a DataFrame (which will be the reference data frame) as its single parameter, and returning a list of field names to sort on.
- condition:
(Optional) filter to be applied to datasets before comparing. It can be None, or can be a function that takes a DataFrame as its single parameter and returns a vector of booleans (to specify which rows should be compared).
- precision:
(Optional) number of decimal places to use for floating-point comparisons. Default is not to perform rounding.
- type_matching:
(Optional) one of ‘strict’, ‘medium’ or ‘permissive’.
Raises NotImplementedError if Pandas is not available.
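To make the effect of the precision parameter concrete, the sketch below compares rows of plain dictionaries, rounding floating-point values to a given number of decimal places before testing for equality. It uses only the standard library; values_match and differing_rows are hypothetical names, and the library's actual comparison logic may well differ in detail.

```python
def values_match(actual, expected, precision=None):
    """Compare two floats, optionally rounding both to `precision` places."""
    if precision is None:
        # No rounding by default, mirroring the documented default.
        return actual == expected
    return round(actual, precision) == round(expected, precision)


def differing_rows(actual_rows, expected_rows, fields, precision=None):
    """Return (row_index, field) pairs whose values differ."""
    diffs = []
    for i, (a, e) in enumerate(zip(actual_rows, expected_rows)):
        for field in fields:
            av, ev = a[field], e[field]
            if isinstance(av, float) and isinstance(ev, float):
                same = values_match(av, ev, precision)
            else:
                same = av == ev
            if not same:
                diffs.append((i, field))
    return diffs


actual = [{'id': 1, 'score': 0.12341}, {'id': 2, 'score': 0.5}]
expected = [{'id': 1, 'score': 0.12344}, {'id': 2, 'score': 0.6}]
print(differing_rows(actual, expected, ['id', 'score'], precision=4))
```

Restricting the fields argument plays the role of check_data here, and the returned pairs correspond to the "where the differences are" reporting described earlier. Note that rounding-based comparison can still report a difference for two very close values that fall on either side of a rounding boundary.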
- assertFileCorrect(actual_path, ref_path, kind=None, lstrip=False, rstrip=False, ignore_substrings=None, ignore_patterns=None, remove_lines=None, ignore_lines=None, preprocess=None, max_permutation_cases=0, encoding=None)
Check that a text file matches the contents from a reference text file.
- actual_path:
A path for a text file.
- ref_path:
The name of the reference file. The location of the reference file is determined by the configuration via set_data_location().
It also accepts the kind, lstrip, rstrip, ignore_substrings, ignore_patterns, remove_lines, preprocess and max_permutation_cases optional parameters described in assertStringCorrect().
This should be used for unstructured data such as logfiles, etc. For CSV files, use assertOnDiskDataFrameCorrect() instead.
The ignore_lines parameter exists for backwards compatibility as an alias for remove_lines.
The assertFileCorrect() method can be used as an alias for assertTextFileCorrect(), retained for backwards compatibility.
- assertFilesCorrect(actual_paths, ref_paths, kind=None, lstrip=False, rstrip=False, ignore_substrings=None, ignore_patterns=None, remove_lines=None, ignore_lines=None, preprocess=None, max_permutation_cases=0, encodings=None)
Check that a collection of text files matches the contents from a matching collection of reference text files.
- actual_paths:
A list of paths for text files.
- ref_paths:
A list of names of the matching reference files. The location of the reference files is determined by the configuration via set_data_location().
This should be used for unstructured data such as logfiles, etc. For CSV files, use assertOnDiskDataFramesCorrect() instead.
It also accepts the kind, lstrip, rstrip, ignore_substrings, ignore_patterns, remove_lines, preprocess and max_permutation_cases optional parameters described in assertStringCorrect().
The assertFilesCorrect() method can be used as an alias for assertTextFilesCorrect(), retained for backwards compatibility.
- assertOnDiskDataFrameCorrect(actual_path, ref_path, kind='parquet', csv_read_fn=None, check_data=None, check_types=None, check_order=None, condition=None, sortby=None, precision=None, **kwargs)
Check that a DataFrame on disk (as a Parquet file, or possibly a CSV file) matches a reference DataFrame, also on disk.
Args:
- actual_path:
File containing actual serialized DataFrame
- ref_path:
File containing reference serialized DataFrame. The location of the reference file is determined by the configuration via set_data_location().
- kind:
(Optional) reference kind (a string; see above), used to locate the reference file. CSV is used here, and applies, for now, to parquet as well as CSV.
- csv_read_fn:
(Optional) function to read a CSV file to obtain a pandas DataFrame. If None, then a default CSV loader is used if it is a CSV file.
The default CSV loader function is a wrapper around Pandas pd.read_csv(), with default options as follows:
index_col is None
infer_datetime_format is True
quotechar is "
quoting is csv.QUOTE_MINIMAL
escapechar is \ (backslash)
na_values are the empty string, "NaN", and "NULL"
keep_default_na is False
- **kwargs:
Any additional named parameters are passed straight through to the csv_read_fn function.
It also accepts the check_data, check_types, check_order, check_extra_cols, sortby, condition and precision optional parameters described in assertDataFramesEqual().
Raises NotImplementedError if Pandas is not available.
- assertOnDiskDataFramesCorrect(actual_paths, ref_paths, kind='csv', csv_read_fn=None, check_data=None, check_types=None, check_order=None, condition=None, sortby=None, precision=None, **kwargs)
Check that a set of serialized DataFrames in files match corresponding reference ones.
- actual_paths:
List of actual serialized data frames (Parquet or CSV)
- ref_paths:
List of names of matching reference serialized data frames (Parquet or CSV). The location of the reference files is determined by the configuration via set_data_location().
- kind:
(Optional) reference kind (a string; see above), used to locate the reference CSV or Parquet file.
- csv_read_fn:
(Optional) function to read a CSV file to obtain a pandas DataFrame. If None, then a default CSV loader is used.
The default CSV loader function is a wrapper around Pandas pd.read_csv(), with default options as follows:
index_col is None
infer_datetime_format is True
quotechar is "
quoting is csv.QUOTE_MINIMAL
escapechar is \ (backslash)
na_values are the empty string, "NaN", and "NULL"
keep_default_na is False
- **kwargs:
Any additional named parameters are passed straight through to the csv_read_fn function.
It also accepts the check_data, check_types, check_order, check_extra_cols, sortby, condition and precision optional parameters described in assertDataFramesEqual().
Raises NotImplementedError if Pandas is not available.
- assertStringCorrect(string, ref_path, kind=None, lstrip=False, rstrip=False, ignore_substrings=None, ignore_patterns=None, remove_lines=None, ignore_lines=None, preprocess=None, max_permutation_cases=0)
Check that an in-memory string matches the contents from a reference text file.
- string:
The actual string.
- ref_path:
The name of the reference file. The location of the reference file is determined by the configuration via set_data_location().
- kind:
The reference kind, used to locate the reference file.
- lstrip:
If set to True, both strings are left-stripped before the comparison is carried out. Note: the stripping is on a per-line basis.
- rstrip:
If set to True, both strings are right-stripped before the comparison is carried out. Note: the stripping is on a per-line basis.
- ignore_substrings:
An optional list of substrings; lines containing any of these substrings will be ignored in the comparison.
- ignore_patterns:
An optional list of regular expressions; lines will be considered to be the same if they only differ in substrings that match one of these regular expressions. The expressions should only include explicit anchors if they need to refer to the whole line. Only the matched expression within the line is ignored; any text to the left or right of the matched expression must either be exactly the same on both sides, or be ignorable.
- remove_lines:
An optional list of substrings; lines containing any of these substrings will be completely removed before carrying out the comparison. This is the means by which you would exclude ‘optional’ content.
- preprocess:
An optional function that takes a list of strings and preprocesses it in some way; this function will be applied to both the actual and the expected content.
- max_permutation_cases:
An optional number specifying the maximum number of permutations allowed; if the actual and expected lists differ only in that their lines are permutations of each other, and the number of such permutations does not exceed this limit, then the two are considered to be identical.
The ignore_lines parameter exists for backwards compatibility as an alias for remove_lines.
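The difference between remove_lines (whole lines dropped before comparing) and ignore_patterns (lines treated as equal when they differ only inside a regular-expression match) can be sketched with the standard library alone. This is an illustrative approximation under that reading of the parameter descriptions, not the library's algorithm; the function names and the placeholder token are hypothetical.

```python
import re


def remove_matching_lines(lines, substrings):
    """Drop every line that contains any of the given substrings."""
    return [line for line in lines
            if not any(s in line for s in substrings)]


def lines_equivalent(actual, expected, patterns):
    """Treat two lines as equal if they differ only inside regex matches."""
    for pattern in patterns:
        # Replace each match with the same token on both sides, so that
        # only the text outside the matches has to agree exactly.
        actual = re.sub(pattern, '<IGNORED>', actual)
        expected = re.sub(pattern, '<IGNORED>', expected)
    return actual == expected


actual = ['Report v1.2.3', 'DEBUG: cache warm', 'Total: 10']
expected = ['Report v1.3.0', 'Total: 10']
version = r'v\d+\.\d+\.\d+'

kept_actual = remove_matching_lines(actual, ['DEBUG'])
kept_expected = remove_matching_lines(expected, ['DEBUG'])
same = (len(kept_actual) == len(kept_expected) and
        all(lines_equivalent(a, e, [version])
            for a, e in zip(kept_actual, kept_expected)))
print(same)  # True
```

In a real test the equivalent call would pass remove_lines=['DEBUG'] and ignore_patterns=[r'v\d+\.\d+\.\d+'] to assertStringCorrect().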
- assertTextFileCorrect(actual_path, ref_path, kind=None, lstrip=False, rstrip=False, ignore_substrings=None, ignore_patterns=None, remove_lines=None, ignore_lines=None, preprocess=None, max_permutation_cases=0, encoding=None)
Check that a text file matches the contents from a reference text file.
- actual_path:
A path for a text file.
- ref_path:
The name of the reference file. The location of the reference file is determined by the configuration via set_data_location().
It also accepts the kind, lstrip, rstrip, ignore_substrings, ignore_patterns, remove_lines, preprocess and max_permutation_cases optional parameters described in assertStringCorrect().
This should be used for unstructured data such as logfiles, etc. For CSV files, use assertOnDiskDataFrameCorrect() instead.
The ignore_lines parameter exists for backwards compatibility as an alias for remove_lines.
The assertFileCorrect() method can be used as an alias for assertTextFileCorrect(), retained for backwards compatibility.
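A preprocess function, as described under assertStringCorrect(), takes a list of strings. The sketch below shows one that could be used for logfiles, replacing timestamps with a fixed token. It assumes that the function is expected to return the processed list; the name redact_timestamps and the timestamp format are illustrative choices.

```python
import re

# Matches timestamps such as "2024-01-02 03:04:05" (an assumed log format).
TIMESTAMP = re.compile(r'\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}')


def redact_timestamps(lines):
    """Replace every timestamp with a fixed token, line by line."""
    return [TIMESTAMP.sub('<TIMESTAMP>', line) for line in lines]


print(redact_timestamps(['2024-01-02 03:04:05 started', 'done']))
```

Such a function could then be supplied as preprocess=redact_timestamps, so that it is applied to both the actual and the reference lines before they are compared.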
- assertTextFilesCorrect(actual_paths, ref_paths, kind=None, lstrip=False, rstrip=False, ignore_substrings=None, ignore_patterns=None, remove_lines=None, ignore_lines=None, preprocess=None, max_permutation_cases=0, encodings=None)
Check that a collection of text files matches the contents from a matching collection of reference text files.
- actual_paths:
A list of paths for text files.
- ref_paths:
A list of names of the matching reference files. The location of the reference files is determined by the configuration via set_data_location().
This should be used for unstructured data such as logfiles, etc. For CSV files, use assertOnDiskDataFramesCorrect() instead.
It also accepts the kind, lstrip, rstrip, ignore_substrings, ignore_patterns, remove_lines, preprocess and max_permutation_cases optional parameters described in assertStringCorrect().
The assertFilesCorrect() method can be used as an alias for assertTextFilesCorrect(), retained for backwards compatibility.
- set_data_location(location, kind=None)
Declare the filesystem location for reference files of a particular kind. Typically you would subclass ReferenceTestCase and pass in these locations through its __init__ method when constructing an instance of ReferenceTestCase as a superclass.
If calls to assertTextFileCorrect() (etc) are made for kinds of reference data that haven’t had their location defined explicitly, then the default location is used. This is the location declared for the None kind and this default must be specified.
This method overrides any global defaults set from calls to the ReferenceTest.set_default_data_location() class-method.
If you haven’t even defined the None default, and you make calls to assertTextFileCorrect() (etc) using relative pathnames for the reference data files, then it can’t check correctness, so it will raise an exception.
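The lookup rules described above (a per-kind location, falling back to the location for the None kind, and an exception when a relative path cannot be resolved) can be sketched as follows. The LocationRegistry class is a hypothetical stand-in, the use of ValueError is an assumption (the exception type is not stated here), and treating absolute paths as already resolved is inferred from the wording about relative pathnames.

```python
import os


class LocationRegistry:
    """Hypothetical stand-in for per-kind reference data locations."""

    def __init__(self):
        self.locations = {}  # kind (or None for the default) -> directory

    def set_data_location(self, location, kind=None):
        self.locations[kind] = location

    def resolve(self, ref_path, kind=None):
        # Assumption: absolute paths need no configured location.
        if os.path.isabs(ref_path):
            return ref_path
        if kind in self.locations:
            base = self.locations[kind]
        elif None in self.locations:
            base = self.locations[None]  # fall back to the default kind
        else:
            raise ValueError('No reference data location for kind %r' % (kind,))
        return os.path.join(base, ref_path)


registry = LocationRegistry()
registry.set_data_location('testdata')
registry.set_data_location('graphs', kind='graph')
print(registry.resolve('graph.txt', kind='graph'))
print(registry.resolve('table.txt', kind='table'))
```

Here the 'table' kind has no location of its own, so its reference file is looked up in the default 'testdata' directory.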
- classmethod set_default_data_location(location, kind=None)
Declare the default filesystem location for reference files of a particular kind. This sets the location for all instances of the class it is called on. Subclasses will inherit this default (unless they explicitly override it).
To set the location globally for all tests in all classes within an application, call this method on the ReferenceTest class.
The instance method set_data_location() can be used to set the per-kind data locations for an individual instance of a class.
If calls to assertTextFileCorrect() (etc) are made for kinds of reference data that haven’t had their location defined explicitly, then the default location is used. This is the location declared for the None kind and this default must be specified.
If you haven’t even defined the None default, and you make calls to assertTextFileCorrect() (etc) using relative pathnames for the reference data files, then it can’t check correctness, so it will raise an exception.
- classmethod set_defaults(**kwargs)
Set default parameters, at the class level. These defaults will apply to all instances of the class.
The following parameters can be set:
- verbose:
Sets the boolean verbose flag globally, to control reporting of errors while running tests. Reference tests tend to take longer to run than traditional unit tests, so it is often useful to be able to see information from failing tests as they happen, rather than waiting for the full report at the end. Verbose is set to True by default.
- print_fn:
Sets the print function globally, to specify the function to use to display information while running tests. The function must have the same signature as Python 3’s standard print function. By default, a print function is used which writes unbuffered to sys.stdout.
- tmp_dir:
Sets the tmp_dir property globally, to specify the directory where temporary files are written. Temporary files are created whenever a text file check fails and a ‘preprocess’ function has been specified. It’s useful to be able to see the contents of the files after preprocessing has taken place, so preprocessed versions of the files are written to this directory, and their pathnames are included in the failure messages. If not explicitly set by set_defaults(), the environment variable TDDA_FAIL_DIR is used, or, if that is not defined, it defaults to /tmp, c:\temp or whatever tempfile.gettempdir() returns, as appropriate.
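The fallback order for tmp_dir described above can be summarised in a few lines. This is only a sketch of the documented precedence (explicit setting, then the TDDA_FAIL_DIR environment variable, then the system temporary directory); the function name default_tmp_dir and its environ parameter are hypothetical conveniences for illustration.

```python
import os
import tempfile


def default_tmp_dir(explicit=None, environ=None):
    """Pick a temporary directory using the documented precedence."""
    environ = os.environ if environ is None else environ
    if explicit is not None:
        return explicit  # value given via set_defaults(tmp_dir=...)
    # Fall back to TDDA_FAIL_DIR, then to the platform's temp directory.
    return environ.get('TDDA_FAIL_DIR') or tempfile.gettempdir()


print(default_tmp_dir(environ={'TDDA_FAIL_DIR': '/var/tmp/tdda-fail'}))
```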
- classmethod set_regeneration(kind=None, regenerate=True)
Set the regeneration flag for a particular kind of reference file, globally, for all instances of the class.
If the regenerate flag is set to True, then the framework will regenerate reference data of that kind, rather than comparing.
All of the regeneration flags are set to False by default.
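The "regenerate rather than compare" behaviour can be pictured with a small standard-library sketch that keeps one flag per kind. The names set_flag and check_or_regenerate are hypothetical, and the real framework does considerably more (for example, reporting which files were rewritten).

```python
import os
import tempfile

regeneration = {}  # kind -> bool; kinds not listed default to False


def set_flag(kind=None, regenerate=True):
    """Record whether reference data of this kind should be rewritten."""
    regeneration[kind] = regenerate


def check_or_regenerate(actual, ref_path, kind=None):
    """Rewrite the reference if its kind is flagged, otherwise compare."""
    if regeneration.get(kind, False):
        with open(ref_path, 'w') as f:
            f.write(actual)
        return True
    with open(ref_path) as f:
        return f.read() == actual


ref_path = os.path.join(tempfile.mkdtemp(), 'table.txt')
set_flag('table', True)
check_or_regenerate('a | b\n', ref_path, kind='table')   # writes the reference
set_flag('table', False)
print(check_or_regenerate('a | b\n', ref_path, kind='table'))  # True
print(check_or_regenerate('a | c\n', ref_path, kind='table'))  # False
```

Because a flagged kind always "passes", regenerated files should be reviewed before they are trusted as reference results.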
- tdda.referencetest.referencetest.tag(test)
Decorator for tests, so that you can specify you only want to run a tagged subset of tests, with the -1 or --tagged option.
unittest Framework Support
This module provides the ReferenceTestCase class, which extends the standard unittest.TestCase test-case class, augmenting it with methods for checking correctness of files against reference data.
It also provides a main() function, which can be used to run (and regenerate) reference tests which have been implemented using subclasses of ReferenceTestCase.
For example:
from tdda.referencetest import ReferenceTestCase
import my_module

class TestMyClass(ReferenceTestCase):
    def test_my_csv_function(self):
        result = my_module.my_csv_function(self.tmp_dir)
        self.assertCSVFileCorrect(result, 'result.csv')

    def test_my_pandas_dataframe_function(self):
        result = my_module.my_dataframe_function()
        self.assertDataFrameCorrect(result, 'result.csv')

    def test_my_table_function(self):
        result = my_module.my_table_function()
        self.assertStringCorrect(result, 'table.txt', kind='table')

    def test_my_graph_function(self):
        result = my_module.my_graph_function()
        self.assertStringCorrect(result, 'graph.txt', kind='graph')

TestMyClass.set_default_data_location('testdata')

if __name__ == '__main__':
    ReferenceTestCase.main()
Tagged Tests
If the tests are run with the --tagged or -1 (the digit one) command-line option, then only tests that have been decorated with referencetest.tag are run. This is a mechanism for allowing only a chosen subset of tests to be run, which is useful during development. The @tag decorator can be applied to either test classes or test methods.
If the tests are run with the --istagged or -0 (the digit zero) command-line option, then no tests are run; instead, the framework reports the full module names of any test classes that have been decorated with @tag, or which contain any tests that have been decorated with @tag.
For example:
from tdda.referencetest import ReferenceTestCase, tag
import my_module

class TestMyClass1(ReferenceTestCase):
    @tag
    def test_a(self):
        ...

    def test_b(self):
        ...

@tag
class TestMyClass2(ReferenceTestCase):
    def test_x(self):
        ...

    def test_y(self):
        ...
If run with python mytests.py --tagged, only the tagged tests are run (TestMyClass1.test_a, TestMyClass2.test_x and TestMyClass2.test_y).
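The selection rule in this example (a test runs under --tagged if either it or its class is tagged) can be sketched without any test framework. The tag function below is a stand-in for tdda's @tag, and the _tagged marker attribute is a hypothetical name chosen for illustration; the library's own mechanism may differ.

```python
def tag(test):
    """Mark a test function or class as tagged (stand-in for tdda's @tag)."""
    test._tagged = True  # hypothetical marker attribute
    return test


def is_tagged(cls, method_name):
    """A method counts as tagged if it, or its class, carries the marker."""
    method = getattr(cls, method_name)
    return getattr(cls, '_tagged', False) or getattr(method, '_tagged', False)


class TestMyClass1:
    @tag
    def test_a(self):
        pass

    def test_b(self):
        pass


@tag
class TestMyClass2:
    def test_x(self):
        pass

    def test_y(self):
        pass


selected = [
    '%s.%s' % (cls.__name__, name)
    for cls in (TestMyClass1, TestMyClass2)
    for name in sorted(vars(cls))
    if name.startswith('test_') and is_tagged(cls, name)
]
print(selected)
# ['TestMyClass1.test_a', 'TestMyClass2.test_x', 'TestMyClass2.test_y']
```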
Regeneration of Results
When the main() function is run with --write-all or --write (or -W or -w respectively), it causes the framework to regenerate reference data files. Different kinds of reference results can be regenerated by passing in a comma-separated list of kind names immediately after the --write option. If no list of kind names is provided, then all test results will be regenerated.
To regenerate all reference results (or generate them for the first time):
python my_tests.py --write-all
To regenerate just a particular kind of reference (e.g. table results):
python my_tests.py --write table
To regenerate a number of different kinds of reference (e.g. both table and graph results):
python my_tests.py --write table graph
unittest Integration Details
- class tdda.referencetest.referencetestcase.ReferenceTestCase(*args, **kwargs)
Wrapper around the ReferenceTest class to allow it to operate as a test-case class using the unittest testing framework.
The ReferenceTestCase class is a mix-in of unittest.TestCase and ReferenceTest, so it can be used as the base class for unit tests, allowing the tests to use any of the standard unittest assert methods, and also use any of the referencetest assert extensions.
- static main(module=None, argv=None, **kw)
Wrapper around the unittest.main() entry point.
This is the same as the main() function, and is provided just as a convenience, as it means that tests using the ReferenceTestCase class only need to import that single class on its own.
- tag()
Decorator for tests, so that you can specify you only want to run a tagged subset of tests, with the -1 or --tagged option.
- class tdda.referencetest.referencetestcase.TaggedTestLoader(check, printer=None)
Subclass of TestLoader, which strips out any non-tagged tests.
- getTestCaseNames(testCaseClass)
Return a sorted sequence of method names found within testCaseClass.
- loadTestsFromModule(*args, **kwargs)
Return a suite of all test cases contained in the given module
- loadTestsFromName(*args, **kwargs)
Return a suite of all test cases given a string specifier.
The name may resolve either to a module, a test case class, a test method within a test case class, or a callable object which returns a TestCase or TestSuite instance.
The method optionally resolves the names relative to a given module.
- loadTestsFromNames(*args, **kwargs)
Return a suite of all test cases found using the given sequence of string specifiers. See ‘loadTestsFromName()’.
- loadTestsFromTestCase(*args, **kwargs)
Return a suite of all test cases contained in testCaseClass
- tdda.referencetest.referencetestcase.main()
Wrapper around the unittest.main() entry point.
pytest Framework Support
This provides all of the methods in the ReferenceTest class, in a way that allows them to be used as pytest fixtures.
This allows these functions to be called from tests running from the pytest framework.
For example:
import my_module

def test_my_csv_function(ref):
    resultfile = my_module.my_csv_function(ref.tmp_dir)
    ref.assertCSVFileCorrect(resultfile, 'result.csv')

def test_my_pandas_dataframe_function(ref):
    resultframe = my_module.my_dataframe_function()
    ref.assertDataFrameCorrect(resultframe, 'result.csv')

def test_my_table_function(ref):
    result = my_module.my_table_function()
    ref.assertStringCorrect(result, 'table.txt', kind='table')

def test_my_graph_function(ref):
    result = my_module.my_graph_function()
    ref.assertStringCorrect(result, 'graph.txt', kind='graph')

class TestMyClass:
    # Test methods take self first; the ref fixture is injected after it.
    def test_my_other_table_function(self, ref):
        result = my_module.my_other_table_function()
        ref.assertStringCorrect(result, 'table.txt', kind='table')
with a conftest.py containing:
from tdda.referencetest.pytestconfig import (pytest_addoption,
                                             pytest_collection_modifyitems,
                                             set_default_data_location,
                                             ref)

set_default_data_location('testdata')
This configuration enables the additional command-line options, and also provides a ref fixture, as an instance of the ReferenceTest class.
Of course, for brevity, if you prefer, you can use:
from tdda.referencetest.pytestconfig import *
rather than importing the four individual items if you are not customising anything yourself, but that is less flexible.
This example also sets a default data location which will apply to all reference fixtures. This means that any tests that use ref will automatically be able to locate their “expected results” reference data files.
Reference Fixtures
The default configuration provides a single fixture, ref.
To configure a large suite of tests so that tests do not all have to share a single common reference-data location, you can set up additional reference fixtures, configured differently. For example, to set up a fixture ref_special, whose reference data is stored in ../specialdata, you could include:
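import pytest
from tdda.referencetest import referencepytest

@pytest.fixture(scope='module')
def ref_special(request):
    # Start from a standard reference fixture, then point it elsewhere.
    r = referencepytest.ref(request)
    r.set_data_location('../specialdata')
    return r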
Tests can use this additional fixture:
import my_special_module

def test_something(ref_special):
    result = my_special_module.something()
    ref_special.assertStringCorrect(result, 'something.csv')
Tagged Tests
If the tests are run with the --tagged command-line option, then only tests that have been decorated with referencetest.tag are run. This is a mechanism for allowing only a chosen subset of tests to be run, which is useful during development. The @tag decorator can be applied to test functions, test classes and test methods.
If the tests are run with the --istagged command-line option, then no tests are run; instead, the framework reports the full module names of any test classes or functions that have been decorated with @tag, or classes which contain any test methods that have been decorated with @tag.
For example:
from tdda.referencetest import tag

@tag
def test_a(ref):
    assert 'a' + 'a' == 'aa'

def test_b(ref):
    assert 'b' * 2 == 'bb'

@tag
class TestMyClass:
    def test_x(self):
        assert list('xxx') == ['x', 'x', 'x']

    def test_y(self):
        assert 'y'.upper() == 'Y'
If run with pytest --tagged, only the tagged tests are run (test_a, TestMyClass.test_x and TestMyClass.test_y).
Regeneration of Results
When pytest is run with --write-all or --write, it causes the framework to regenerate reference data files. Different kinds of reference results can be regenerated by passing in a comma-separated list of kind names immediately after the --write option. If no list of kind names is provided, then all test results will be regenerated.
If the -s option is also provided (to disable pytest output capturing), it will report the names of all the files it has regenerated.
To regenerate all reference results (or generate them for the first time):
pytest -s --write-all
To regenerate just a particular kind of reference (e.g. table results):
pytest -s --write table
To regenerate a number of different kinds of reference (e.g. both table and graph results):
pytest -s --write table graph
pytest Integration Details
In addition to all of the methods from ReferenceTest, the following functions are provided, to allow easier integration with the pytest framework.
Typically your test code would not need to call any of these methods directly (apart from set_default_data_location()), as they are all enabled automatically if you import the default ReferenceTest configuration into your conftest.py file:
from tdda.referencetest.pytestconfig import *
- tdda.referencetest.referencepytest.addoption(parser)
Support for the --write and --write-all command-line options.
A test’s conftest.py file should declare extra options by defining a pytest_addoption function which should just call this.
It extends pytest to include --write and --write-all option flags which can be used to control regeneration of reference results.
- tdda.referencetest.referencepytest.ref(request)
Support for dependency injection via a pytest fixture.
A test’s conftest.py should define a fixture function for injecting a ReferenceTest instance, which should just call this function.
This allows tests to get access to a private instance of that class.
- tdda.referencetest.referencepytest.set_default_data_location(location, kind=None)
This provides a mechanism for setting the default reference data location in the ReferenceTest class.
It takes the same parameters as tdda.referencetest.referencetest.ReferenceTest.set_default_data_location().
If you want the same data locations for all your tests, it can be easier to set them with calls to this function, rather than having to set them explicitly in each test (or using set_data_location() in your @pytest.fixture ref definition in your conftest.py file).
- tdda.referencetest.referencepytest.set_defaults(**kwargs)
This provides a mechanism for setting default attributes in the ReferenceTest class.
It takes the same parameters as tdda.referencetest.referencetest.ReferenceTest.set_defaults(), and can be used for setting parameters such as the tmp_dir property.
If you want the same defaults for all your tests, it can be easier to set them with a call to this function, rather than having to set them explicitly in each test (or in your @pytest.fixture ref definition in your conftest.py file).
- tdda.referencetest.referencepytest.tagged(config, items)
Support for @tag to mark tests to be run with --tagged or reported with --istagged.
It extends pytest to recognize the --tagged and --istagged command-line flags, to restrict testing to tagged tests only.
Reference Test Examples
The tdda.referencetest module includes a set of examples, for both unittest and pytest.
To copy these examples, run the command:
tdda examples referencetest [directory]
If directory is not supplied, referencetest-examples will be used.
Alternatively, you can copy all examples using the following command:
tdda examples
which will create a number of separate subdirectories.