Databricks

Does it work in Databricks?

Yes. The integration has been tested successfully on Microsoft Azure Databricks.

The prerequisites are the .NET 8 runtime and the pythonnet package. Both can be installed on your cluster.

Libraries

Go to the Libraries settings of your cluster.

Install the pythonnet package as you would install any other module.

Install datasmartly_cat, but note that it must be installed from the Test PyPI repository (https://test.pypi.org/simple), because CAT’s integration with Python is in preview.
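
If you prefer a scripted install over the Libraries UI, the same requirement can be expressed as a pip command. The sketch below only builds and runs that command. The helper names pip_install_command and install are hypothetical, and adding the regular PyPI index as an extra index (so that dependencies missing from Test PyPI can still be resolved) is an assumption rather than something CAT prescribes.

```python
import subprocess
import sys

TEST_PYPI = "https://test.pypi.org/simple"
PYPI = "https://pypi.org/simple"  # assumption: fallback index for dependencies


def pip_install_command(package, index_url=None, extra_index_url=None):
    """Build a pip install command for the current Python interpreter."""
    cmd = [sys.executable, "-m", "pip", "install"]
    if index_url:
        cmd += ["--index-url", index_url]
    if extra_index_url:
        cmd += ["--extra-index-url", extra_index_url]
    cmd.append(package)
    return cmd


def install(package, **kwargs):
    """Run the install; raises CalledProcessError if pip fails."""
    subprocess.check_call(pip_install_command(package, **kwargs))
```

With these helpers, install("pythonnet") followed by install("datasmartly_cat", index_url=TEST_PYPI, extra_index_url=PYPI) would install both packages into the current Python environment.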

.NET runtime

Create InitCluster.sh in your workspace or repository and add the code below. If you already have a cluster initialization script, you can merge the code into it instead.

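#!/bin/bash
# Cluster init script: installs .NET 8 so that pythonnet can load the runtime.
echo 'Installation of .NET 8 runtime start.'
# The dotnet-sdk-8.0 package includes the .NET 8 runtime.
sudo apt-get update && \
  sudo apt-get install -y dotnet-sdk-8.0
echo 'Installation of .NET 8 runtime end.'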

In the cluster configuration, edit your cluster, go to Advanced, and add the init script (type Workspace).
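
Once the cluster has restarted with the init script, it can be worth confirming from a notebook that both prerequisites are in place. The following is a minimal sketch, not part of CAT: it relies on the dotnet --list-runtimes command, which prints one line per installed runtime in the form "Microsoft.NETCore.App 8.0.x [path]", and on pythonnet 3.x being importable as the pythonnet package. The function names are hypothetical.

```python
import importlib.util
import shutil
import subprocess


def has_dotnet_runtime(listing, major=8):
    """Return True if `dotnet --list-runtimes` output lists
    Microsoft.NETCore.App with the given major version."""
    for line in listing.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "Microsoft.NETCore.App":
            if parts[1].split(".")[0] == str(major):
                return True
    return False


def check_prerequisites():
    """Check this machine for the .NET 8 runtime and the pythonnet package."""
    dotnet = shutil.which("dotnet")
    listing = ""
    if dotnet:
        # Ask the dotnet host which runtimes are installed.
        listing = subprocess.run(
            [dotnet, "--list-runtimes"], capture_output=True, text=True
        ).stdout
    return {
        "dotnet_8_runtime": has_dotnet_runtime(listing),
        # find_spec only locates the package; it does not import it.
        "pythonnet": importlib.util.find_spec("pythonnet") is not None,
    }
```

Calling check_prerequisites() in a notebook cell returns a dictionary with one boolean per prerequisite, which makes it easy to see which of the two steps still needs attention.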

Use CAT

If all of the above works for you, you can create a .cat.yaml project file in your workspace and run it using the invoke_project function. Everything works the same way as when you use CAT from your local machine.
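
As an illustration of this step, the sketch below looks for project files under a folder and hands each one to CAT. Two things here are assumptions and should be checked against the CAT documentation: that invoke_project can be imported from the datasmartly_cat module, and that it accepts the project file path as its only argument. find_cat_projects and run_all are hypothetical helper names.

```python
from pathlib import Path

try:
    # Assumption: invoke_project is exposed at the top level of datasmartly_cat.
    from datasmartly_cat import invoke_project
except ImportError:
    invoke_project = None  # package not installed in this environment


def find_cat_projects(root):
    """Return all *.cat.yaml files under root, sorted by path."""
    return sorted(Path(root).rglob("*.cat.yaml"))


def run_all(root):
    """Invoke every CAT project found under root."""
    if invoke_project is None:
        raise RuntimeError("datasmartly_cat is not installed")
    for project in find_cat_projects(root):
        # Assumption: invoke_project takes the project file path as a string.
        invoke_project(str(project))
```

Keeping the file discovery separate from the call into CAT means the discovery part can be tried out even on a machine where the package is not installed.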