CAT 0.15.0
On this page
Driverless testing of CSV files
Thanks to a new experimental provider, we are now able to test CSV files without installation of any drivers. Advanced SQL constructs are allowed. Compared to CsvOleDB@1 driver, this one is well documented, advanced, and seems to work well.
See supported SQL constructs, functions etc. here: https://duckdb.org/docs/sql/introduction
As a teaser how it might be used, check the example below. It reads metadata from two CSV files, JOINs them and compares the resultset with a database metadata:
Data Sources:
- Provider: DuckDB@1 # this provider is in preview
Connection string: 'DataSource=:memory:'
Name: duckdb
- Provider: SqlServer@1
Connection string: data source=localhost;integrated security=true;initial catalog=WideWorldImporters;
Name: sql
Tests:
- Name: Check all tables were created
Description: compares metadata with the existing tables in database
First Data Source: duckdb
First Query: |
SELECT t.TABLE_SCHEMA, t.TABLE_NAME, c.COLUMN_NAME
FROM read_csv_auto('metadata_tables.csv', delim=',', header=True) as t
JOIN read_csv_auto('metadata_columns.csv', delim=',', header=True) as c
ON t.TABLE_SCHEMA = c.TABLE_SCHEMA AND t.TABLE_NAME = c.TABLE_NAME
WHERE t.TABLE_TYPE = 'BASE TABLE'
ORDER BY t.TABLE_SCHEMA, t.TABLE_NAME, c.COLUMN_NAME
Second Data Source: sql
Second Query: |
SELECT c.TABLE_SCHEMA, c.TABLE_NAME, c.COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS as c
JOIN INFORMATION_SCHEMA.TABLES as t
ON t.TABLE_SCHEMA = c.TABLE_SCHEMA AND t.TABLE_NAME = c.TABLE_NAME
WHERE t.TABLE_TYPE = 'BASE TABLE'
ORDER BY TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME
Expectation: sets match
Please consider everything related to DuckDB is a preview.
Anything may change, even the concept as a whole. Your feedback is crucial, please try it and let us know.
At the moment, this cannot be used for reading MS Excel files, but soon it will be.
CAT Portal
CAT portal got some new stuff. It now contains link to our new landing page, possibility to ask for a demo and a for for contacting our sales guys.
Quality
We invested considerable time into testing the whole CAT system - and that is and will be continuous ongoing process.