Data Source

Details about Data Sources in CAT

What is a Data Source in CAT and How to Define It

As described in the Introduction, you first need to give “friendly names” to your sources of data. You do this in a Data Source definition.

Example: if you want to test data in your DWH, define a data source in your CAT project file:

Data Sources:
- Name: DWH
  Provider: SqlServer@1
  ConnectionString: Data Source=OurDwhServer;Integrated Security=true;Initial Catalog=OurDwhDatabaseName

As another example, you might want to compare data in a PostgreSQL source system with data in the DWH. In that case, define two data sources:

Data Sources:
- Name: DWH
  Provider: SqlServer@1
  ConnectionString: Data Source=OurDwhServer;Integrated Security=true;Initial Catalog=OurDwhDatabaseName
- Name: OurSourceSystem
  Provider: Postgres@1
  ConnectionString: Server=127.0.0.1;Port=5432;Database=OurSourceSystem;User Id=myUsername;Password=myPassword;

You can then use these friendly names (DWH and OurSourceSystem in the example above) in Test definitions.

Properties of a Data Source

Every Data Source in CAT must have three properties:

Property            Meaning
Name                Friendly name for the given source of data
Provider            Type of data source; see Providers
Connection String   Provider-specific information needed to establish a connection, usually a connection string for a database or a file path for a file

All three properties are mandatory. The property names do not need to be written exactly as shown; see Naming conventions.
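The rule about mandatory properties and flexible property names can be illustrated with a small Python sketch. This is not CAT's implementation: the function names are hypothetical, and the normalization rule used here (ignore case, spaces, underscores and hyphens) is an assumption made for illustration. The actual rules are described in Naming conventions.

```python
# Illustrative sketch only; not CAT's real code.
# Assumption: property names match regardless of case, spaces, "_" and "-".
REQUIRED = ("name", "provider", "connectionstring")


def normalize_key(key: str) -> str:
    """Lower-case a property name and drop spaces, underscores and hyphens."""
    return "".join(ch for ch in key.lower() if ch not in " _-")


def parse_data_source(raw: dict) -> dict:
    """Return the three mandatory properties under canonical keys.

    Raises ValueError if any of them is missing or empty.
    """
    normalized = {normalize_key(k): v for k, v in raw.items()}
    missing = [p for p in REQUIRED if not normalized.get(p)]
    if missing:
        raise ValueError("missing mandatory properties: " + ", ".join(missing))
    return {p: normalized[p] for p in REQUIRED}


source = parse_data_source({
    "Name": "DWH",
    "Provider": "SqlServer@1",
    "Connection string": "Data Source=OurDwhServer;Integrated Security=true;"
                         "Initial Catalog=OurDwhDatabaseName",
})
print(source["provider"])  # prints: SqlServer@1
```

Under this assumed rule, both "ConnectionString" and "Connection string" (the two spellings used in the examples on this page) resolve to the same property.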

Where to Define Data Sources

The easiest and most straightforward option is to define all your data sources in a Project file. The examples above use project files for defining data sources.

In some cases, you might want to define data sources elsewhere, e.g. in a database table or in an MS Excel sheet. You can define data sources in any implemented provider. For example, CAT has a Postgres@1 provider, which means you can also define your data sources (and/or tests) in a PostgreSQL table:

Get List of Data Sources from:
- Provider: Postgres@1
  Connection string: Server=localhost;Port=5432;Database=testing;User Id=test_user;Password=myPassword;
  Query: SELECT * FROM public.data_sources;

If the query SELECT * FROM public.data_sources returns the three mandatory columns described above, CAT will use every returned row as a data source.
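The idea that every returned row becomes one data source can be sketched in Python as follows. This is only an illustration, not how CAT is implemented: it uses an in-memory SQLite database as a stand-in for PostgreSQL so that it runs without a server, and the column names (name, provider, connection_string) and the function load_data_sources are assumptions.

```python
import sqlite3

# Assumed column names; CAT's accepted spellings are covered by Naming conventions.
REQUIRED_COLUMNS = ("name", "provider", "connection_string")


def load_data_sources(connection: sqlite3.Connection, query: str) -> list:
    """Run the query and turn every returned row into a data source dict."""
    cursor = connection.execute(query)
    columns = [description[0].lower() for description in cursor.description]
    missing = [c for c in REQUIRED_COLUMNS if c not in columns]
    if missing:
        raise ValueError("query result lacks columns: " + ", ".join(missing))
    # Extra columns (such as "note" below) are simply ignored.
    return [
        {c: row[columns.index(c)] for c in REQUIRED_COLUMNS}
        for row in cursor.fetchall()
    ]


# Stand-in for the public.data_sources table from the example above.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE data_sources "
    "(name TEXT, provider TEXT, connection_string TEXT, note TEXT)"
)
connection.executemany(
    "INSERT INTO data_sources VALUES (?, ?, ?, ?)",
    [
        ("DWH", "SqlServer@1",
         "Data Source=OurDwhServer;Integrated Security=true;"
         "Initial Catalog=OurDwhDatabaseName", "warehouse"),
        ("OurSourceSystem", "Postgres@1",
         "Server=127.0.0.1;Port=5432;Database=OurSourceSystem;"
         "User Id=myUsername;Password=myPassword;", "source system"),
    ],
)

data_sources = load_data_sources(connection, "SELECT * FROM data_sources")
for source in data_sources:
    print(source["name"], source["provider"])
```

In this sketch, a query that does not return all three columns fails early with an error instead of producing incomplete data sources.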

In the same way, you can define your data sources in MS Excel, an Oracle table or view, another YAML file, a CSV file, etc.

Environment Variables

Regardless of whether you define your data sources in a project file or elsewhere, you can use environment variables. This is one way to avoid storing passwords in plain text in project files.

If you have defined an environment variable named “DWH_CONNECTION_STRING”, you can use it like this:

Data Sources:
- Name: DWH
  Provider: SqlServer@1
  Connection string: "%DWH_CONNECTION_STRING%"
  # mind the percent signs and the quotes around the environment variable name

CAT will automatically substitute the value of the variable and leave your project file untouched. This also works outside of YAML, but only for data sources (sensitive values are not expected in test definitions).
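The substitution of %NAME% placeholders can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not CAT's actual code: the function expand_env is hypothetical, and raising an error for an undefined variable is an assumed behavior that this page does not specify.

```python
import os
import re

# Matches %NAME% where NAME looks like a typical environment variable name.
_PLACEHOLDER = re.compile(r"%([A-Za-z_][A-Za-z0-9_]*)%")


def expand_env(value: str, env=None) -> str:
    """Replace every %NAME% in value with the value of variable NAME.

    env defaults to the process environment; a plain dict can be passed
    instead, which keeps the example independent of the real environment.
    """
    env = os.environ if env is None else env

    def replace(match):
        name = match.group(1)
        if name not in env:
            # Assumption: an undefined variable is treated as an error.
            raise KeyError("environment variable %s is not set" % name)
        return env[name]

    return _PLACEHOLDER.sub(replace, value)


example_env = {
    "DWH_CONNECTION_STRING": "Data Source=OurDwhServer;Integrated Security=true;"
                             "Initial Catalog=OurDwhDatabaseName",
}
print(expand_env("%DWH_CONNECTION_STRING%", example_env))
```

The original string is never modified; the function returns a new string, which mirrors the statement that the project file itself stays untouched.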