Introduction to basic concepts and terms in CAT. If you are new to CAT, read carefully.
What is CAT About
If you end up here, we expect you have some data and you want to trust them. CAT is about turning your assumptions about data into confidence - you will not assume the data is correct, you will know it is. Or you will exactly know what is wrong. You achieve that very simply by creating very simple automated tests.
This page summarizes basic terms and concepts used in CAT.
Every company has lots of data. Very often, the data are in various formats:
The data are often “travelling” through the company systems, e.g., they are created in a CRM system, loaded into DWH and from DWH to Power BI wokspace for reporting. It is a best practice to have automated tests, that give you enough confidence that all data are trustworthy.
A Data Source in CAT is a definition of where the data is and how to connect to them. You give them a friendly name (such as CRM, DWH, Power BI, …). This friendly name is then used in your tests, where you define your expectation about the data in the system. This way you dont’ have to repeat all connection information in every test - you just specify, that your test examines data in e.g., DWH.
Example of a Data Source:
Data Sources: - Provider: SqlServer@1 Connection string: data source=localhost;integrated security=true;initial catalog=DWH Name: DWH
The example is in YAML, but you can have them defined also in a relational database table, in MS Excel file, …
For details, see Data Source.
A Test Definition describes your expectations about the data. E.g., you expect that a SQL statement issued against your CRM system (MySQL) returns exactly the same data as another SQL statement issued against your DWH (SQL Server). Or that none of your tables in a Power BI model is empty.
Example of a simple smoke test:
Tests: - Test Suite: Smoke tests Test Name: Dimension.Customer has data Data Source: DWH Query: SELECT * FROM Dimension.Customer Expectation: set is not empty
And again - you can define your tests wherever you want - in YAML (as in the example), in MS Excel, in a relational database table.
For details, see Test Definition.
As mentioned, in CAT you crated named Data Sources - they tell to CAT where your data is. But that does not mean, that CAT can connect to any data source. CAT has predefined set of providers - they define to what types of data is CAT able to connect, such as MS SQL Server, ORACLE, PostgreSQL, MS Excel, CSV, Power BI, …
Note that in the data source definition the provider (and version) must be specified:
Data Sources: - Provider: SqlServer@1 # without this line CAT wouldn't know HOW to connect (what drivers to use) Connection string: data source=localhost;integrated security=true;initial catalog=DWH Name: DWH
For details, see Providers
In an Excpectation you express, what you expect from the dataset(s) returned by a query (or queries) you defined in a Test Definition.
CAT comes with predefined set of expectations, such as SetIsEmpty, SetRowCount or SetsMatch.
For details about Expectations, see Expectations.
For overview of all defined Expectations, see Expectations.
CAT PowerShell module by default outputs the results of tests (and log messages) to console. For interactive testing sessions it is enough, but in various scenarios, you might need something else:
- Developers or business users often need results of tests in MS Excel format
- Azure DevOps or other automation tools need special file formats in order to display nice charts and details of test results
- You may be tasked to automate further processes, such as sending emails, insert results into database, invoke other processes etc.
CAT helps you significantly by creating various Outputs, such as MS Excel files, JSON, TRX or YAML files etc.
Output: xlsx, json
For details, see Outputs.
CAT needs to know somehow what data do you want to tests (data sources) and what exactly you need to check (tests).
For every testing project, you need to create a new CAT project file. You can do it simly with
In simple scenarios, you simly define data sources and tests directly in that project file, something like this:
Data Sources: # where to connect - Provider: SqlServer@1 # it is MS SQL server Connection string: data source=localhost;integrated security=true;initial catalog=DWH Name: DWH # friendly name for this data source Tests: - Name: Active customers have contracts # name for the test Description: Every active customer has at least one active contract. First Data Source: DWH # First Query: | SELECT CustomerID FROM dbo.Customer as cust WHERE cust.IsActive = 1 ORDER BY cust.CustomerID Second Data Source: DWH Second Query: | SELECT DISTINCT c.CustomerID FROM dbo.Contract as c WHERE c.IsActive = 1 ORDER BY c.CustomerID Expectation: SetsMatch # what do I expect from the result datasets Output: xlsx
But often, you may need to load the test definitions (or even data source definitions) from e.g., relational database table or from a MS Excel file. You can do so, just specify what do you want in the project file.
In project files you also define outputs you need (if simple output to console window is not enough).
For full reference, see Project files.