Introduction

Excel@2 Provider lets you run SQL queries against MS Excel files, without any drivers installed.

Mainly for historical reasons, we support more ways to test MS Excel workbooks / worksheets data. This provider, Excel@2, aims to replace them completely. Both Excel@1 and ExcelOleDB@1 had their drawbacks. Those are addressed in this provider.

Excel@2:

  • Does not need any additional driver to be installed. It comes with CAT out-of-the-box and you can use it without any prerequisites.

  • It allows you to use advanced SQL constructs in queries you run against data in MS Excel worksheets.

  • It is based on a maintained and well documented technology.

Define Data Source

In the Connection string, you provide full path to a .xlsx file. Older .xls format is NOT supported.

You also have to specify names of sheets you want to test in CAT. The names are case sensitive. Those sheets will be loaded to memory before the test execution starts.

Data Sources:
- Name:               MyExcelData     # every data source must have a friendly name
  Provider:           Excel@2         # don't confuse with Excel@1 and ExcelOleDB@1
  Connection string:  ./2023.xlsx     # relative or absolute file path
  Sheets:             Sheet1, Sheet2  # !! case sensitive, names of sheets you want to load

In the example above, CAT will create two tables in in-memory database: "MyExcelData"."Sheet1" and "MyExcelData"."Sheet2". The naming convention is "DataSourceName"."SheetName".

Using SQL

In our example, you can write this simple test:

Tests:
- Name:         Check at least something was generated
  Suite:        Smoke tests
  Description:  Sometimes happens the generated sheet is empty, this should alert us
  Data source:  MyExcelData # name of the data source
  Query:        SELECT * FROM "MyExcelData"."Sheet1" LIMIT 1
  Expectation:  set is not empty

Remember all worksheets from all data sources you specified are available as tables in the tests. Feel free to combine them using all kinds of joins or set operations like UNION, UNION ALL, EXCEPT, INTERSECT etc.

The data from sheets is loaded with types (numbers, dates, booleans, …). This is useful because you can also use all kind of functions - for working with date and time, strings, numbers. You can use pattern matching, regular expressions, …

SQL syntax supports almost all you can imagine: aggregate functions, common table expressions (CTEs), windowing functions, grouping sets, pivot and unpivot, corellated subqueries, …

See subsections (in the left menu) for details, such as SELECT syntax. For complete reference what constructs are supported in the queries, refer to this documentation:

https://duckdb.org/docs/archive/0.8.1/sql/query_syntax/select.

Limitations

This provider does NOT offer the “strike-through” feature (only Excel@1 supports this). Please beware that lines with strike-through formatting will be processed as any other rows.