Databricks
Does it work in Databricks?
Yes. The integration was successfully tested on MS Azure Databricks.
The prerequisites are an installed .NET 8 runtime and the pythonnet package. Both can be installed on your cluster.
Libraries
Go to the Libraries settings of your cluster.
Install the pythonnet package as you would install any other module.
Install datasmartly_cat from the Test PyPI repository (https://test.pypi.org/simple), because CAT's integration with Python is in preview.
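If you would rather install from a notebook or a script than through the Libraries UI, the same choice of repository can be expressed as a pip command. The sketch below only builds and prints that command. The helper name build_install_command is hypothetical, and adding the main PyPI index as a fallback (so that dependencies missing from Test PyPI can still be resolved) is an assumption, not something the CAT documentation states.

```python
import sys

# Index URLs: Test PyPI hosts the preview package; main PyPI is added as a
# fallback for dependencies (assumption, adjust if your setup differs).
TEST_PYPI = "https://test.pypi.org/simple/"
PYPI = "https://pypi.org/simple/"


def build_install_command(package="datasmartly_cat"):
    """Build a pip command that installs `package` from Test PyPI."""
    return [
        sys.executable, "-m", "pip", "install",
        "--index-url", TEST_PYPI,
        "--extra-index-url", PYPI,
        package,
    ]


# Print the command instead of running it, so nothing is installed by accident.
print(" ".join(build_install_command()))
```

To actually run the installation, pass the returned list to subprocess.check_call.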
.NET runtime
Create InitCluster.sh in your workspace or repository and add the code below. You can also merge the code into your existing cluster initialization script, if you have one.
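#!/bin/bash
# Cluster init script: installs .NET 8 so that pythonnet can load the runtime.
echo 'Installation of .NET 8 runtime start.'
# The dotnet-sdk-8.0 package includes the .NET 8 runtime.
sudo apt-get update && \
sudo apt-get install -y dotnet-sdk-8.0
echo 'Installation of .NET 8 runtime end.'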
In the cluster configuration, edit your cluster, go to the Advanced options, and add the script as an init script (type Workspace).
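After the cluster restarts, it can be useful to confirm from a notebook that the init script really installed the runtime. The following is a minimal sketch, not part of CAT: it runs dotnet --list-runtimes (a standard .NET CLI command) and looks for a Microsoft.NETCore.App entry with major version 8. The function names are hypothetical.

```python
import shutil
import subprocess


def list_dotnet_runtimes():
    """Return the output of `dotnet --list-runtimes`, or "" if dotnet is not on PATH."""
    dotnet = shutil.which("dotnet")
    if dotnet is None:
        return ""
    result = subprocess.run(
        [dotnet, "--list-runtimes"],
        capture_output=True, text=True, check=False,
    )
    return result.stdout


def has_runtime(listing, major=8):
    """True if the listing contains a Microsoft.NETCore.App runtime of the given major version."""
    for line in listing.splitlines():
        parts = line.split()
        # Each line looks like: "<runtime name> <version> [<install path>]"
        if len(parts) >= 2 and parts[0] == "Microsoft.NETCore.App":
            if parts[1].split(".")[0] == str(major):
                return True
    return False


print(".NET 8 runtime found:", has_runtime(list_dotnet_runtimes()))
```

If this prints False, check the init script logs of the cluster before moving on.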
Use CAT
If all of the above works for you, you can create a .cat.yaml project file in your workspace and run it using the invoke_project function. Everything works the same way as if you were using CAT from your local machine.
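As a rough illustration of the last step, the sketch below locates project files in a workspace folder so that they can be handed to invoke_project. The helper find_cat_projects is hypothetical, and the commented-out import path and call signature of invoke_project are assumptions; check the CAT documentation for the actual usage.

```python
from pathlib import Path


def find_cat_projects(root="."):
    """Return all files ending in .cat.yaml under `root`, sorted for a stable order."""
    return sorted(Path(root).rglob("*.cat.yaml"))


for project in find_cat_projects("."):
    print("Found CAT project:", project)
    # Assumption: invoke_project is importable from datasmartly_cat and
    # accepts the path of a project file. Verify before uncommenting.
    # from datasmartly_cat import invoke_project
    # invoke_project(str(project))
```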