Introduction
You expect the results of two queries are the same.
The sets match
is by far the most used and most useful type of expectation in CAT. In fact, you can rewrite any other expectation to sets match
. But that does not mean you should - other types of expectation add clarity and readability to your tests, and that is crucial for their maintenance.
Example
You may want to verify, that all customers from your CRM system are also present in your analytical system:
Tests:
- Name: All customers are loaded
Suite: smoke tests
First Data Source: CRM
First Query: |
SELECT CustomerID
FROM Contacts.Customer
ORDER BY Customer_ID
Second Data Source: DWH
Second Query: |
SELECT CustomerID
FROM Dimension.Customer
ORDER BY CustomerID
Expectation: sets match
Results
If data returned by both queries are the same, there are no missing and no additional rows, the result is Passed
.
If there is at least one difference, the result is Failed
.
If the underlying provider of any of the queries throws an exception, the result is Error
.
Remarks
Both queries must return sets with the same structure. Column names and data types may differ. Number of columns must be the same, otherwiset the test will immediatly end with Failed
result.
CAT iterates simultaneously through the first set and through second set. The iteration goes row by row, that means CAT will NOT read the entire sets to RAM and the comparison is very efficent. You can safely iterate very large query results without worying about the machine resources.
ORDER BY
clause) and CAT can then perform the comparison very quickly.If the sencod set runs out of rows sooner than the first, or viceversa, test result is Failed
.
CAT compares both rows column by column for every row. Order of columns IS important, names of columns are NOT important, data types are NOT important.
When comparing two values, CAT uses .NET representation of the values (exactly the same what underlying provider returned).
CAT first tries to compare if the values are NULL or not - if one is not and the other is, result is Failed
. If both are NULL, all is OK and the evaluation contiunes to the next pair of values.
CAT then tries compare the values as they are, as their string representations, as decimal numbers (if both can be converted to them), as datetimes (if both can be converted to them). Whenever the values are clearly different, CAT marks the test as Failed
, otherwise continues.
Case is ignored. Some Data
is the same as some data
.
When CAT iterates through both sets and does not find any differences, the test result is Passed
.