Test examples
I want to see examples of automated tests against data.
On this page
No problem, here you are. Just beware the details are explained later in the documentation. The configuration format is simple and self-explanatory, why not have a look right now…
Uniqueness
This test verifies the uniqueness of your data:
- Name: Check there are no duplicities
Test Suite: Uniqueness Data Quality Dimension
Description: This test finds if a column contains duplicate values
Data Source: CRM
Query: |
SELECT customer_code, COUNT(*)
FROM contacts.customer
GROUP BY customer_code
HAVING COUNT(*) > 1;
Expectation: set is empty
Completness
This tests checks the completeness of your data:
- Name: Check there are no null values
Test Suite: Completeness Data Quality Dimension
Description : This test finds if a column contains null values
Data Source: CRM
Query: |
SELECT *
FROM contacts.customer
WHERE full_name IS NULL OR full_name = '';
Expectation: set is empty
Accuracy
These tests check that your data are accurate according to your standards:
- Name: Status can contain only values as Open, Closed, Suspended
Test Suite: Completeness Data Quality Dimension
Description : This test finds if column status has wrong value
Data Source: Your data source
Query: |
SELECT *
FROM staging.payments
WHERE status NOT IN ('Open', 'Closed', 'Suspended');
Expectation: set is empty
- Name: The code must have 14 characters.
Test Suite: Completeness Data Quality Dimension
Description : >
This test finds if your super important code
hasn't got an icorrect length.
Data Source: DWH
Query: |
SELECT DISTINCT OrderCode
FROM [Staging].[Order]
WHERE LENGTH(OrderCode) <> 14;
Expectation: set is empty
Consistency
This test checks the consistency of your data. We need to make sure that the same values are consistent across our datasets or datasources:
- Name: Same numbers between source file and database
Test Suite: Consistency Data Quality Dimension
Description: >
This tests checks that SUM of salery is same
between source excel file and database
First Data Source: DWH
First Query: |
SELECT SUM(payments_first) FROM staging.payments
Second Data source: ExcelInput
Second Query: |
SELECT SUM(payments_january) FROM "Excel"."Payments"
Expectation: Sets match
Timeliness
This test checks the timeliness of your data by monitoring the last processing day:
- Name: Update of table
Test Suite: Timeliness Data Quality Dimension
Description: >
This is to check if the staging table, with
the column LastProcessDay, was updated today
Data Source: DWH
Query: |
SELECT TOP(5) *
FROM Staging.Shipment
WHERE LastProcessDay > CAST(GETDATE() AS DATE);
Expectation: set is not empty
The tests are simple, straightforward and take seconds to automate. You have a full power of raw SQL or DAX to express your expectations about your data. You can compare data across more different systems (such as MS Excel with ORACLE, SQL server with Power BI semantic model). As you’ll learn later in our tutorial, you can execute all (or part of) your tests with a single command or press of a button.