Saturn guarantees that your code developed in Jupyter notebooks will run in an identical environment when you run it on a Dask cluster or in a deployment. This tutorial demonstrates how to customize that environment. In it, you will learn:
- How to set custom environment variables
- How to include external git repositories
- How to add and enable additional JupyterLab extensions
On the Jupyter page of Saturn, there are a number of different options for customizing your instance. This article focuses on three Advanced Options: "Start Script", "Environment Variables", and "postBuild Script".
A "start script" is an optional shell script that runs whenever Saturn creates resources like Jupyter servers, Dask workers, or Custom Deployments. Use it for inexpensive tasks that you want to be dynamic, such as cloning the most recent state of a git repository every time an instance starts up. The example below clones a repo.
To try this out yourself, copy and paste the following into your start script:
echo 'cloning repo'
git clone https://github.com/theislab/scanpy
To confirm the repo was cloned, you can:
- Launch the instance
- Open JupyterLab
- In JupyterLab, open the File menu, then select New > Terminal
You should then see the scanpy repo listed in the terminal (for example, by running ls).
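The same check can also be made from a notebook cell. The following is a minimal sketch, not part of Saturn itself: the helper name repo_is_cloned is hypothetical, and it assumes the start script cloned scanpy into the directory the notebook is running from, which may differ in your setup.

```python
from pathlib import Path


def repo_is_cloned(repo_dir: str = "scanpy") -> bool:
    """Return True if repo_dir looks like a git checkout.

    Assumption: the clone lives relative to the current working
    directory; pass an absolute path if yours is elsewhere.
    """
    # A cloned repository contains a .git entry at its top level.
    return (Path(repo_dir) / ".git").exists()


print(repo_is_cloned())
```

If this prints False, the repo may have been cloned somewhere else, or the start script may not have run.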
You'll notice I have MIN_GENES=200 in the Environment Variables field. You can provide environment variables to further customize your instance and Dask workers.
To access MIN_GENES from within the notebook, use os.environ and confirm it returns "200". Environment variables are always read as strings, so convert the value with int() if you need a number.
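As a minimal sketch of that lookup, the snippet below reads MIN_GENES inside a notebook cell. The helper name read_min_genes and its fallback default are illustrative assumptions, not part of Saturn. The fallback only exists so the cell still runs if the variable was never set.

```python
import os


def read_min_genes(default: int = 200) -> int:
    """Read MIN_GENES from the environment as an integer.

    Assumption: `default` is a hypothetical fallback used only when
    the variable is not set on the instance.
    """
    raw = os.environ.get("MIN_GENES")
    if raw is None:
        return default
    # os.environ values are strings, so convert explicitly.
    return int(raw)


min_genes = read_min_genes()
print(min_genes)
```

Reading the value once near the top of the notebook keeps the string-to-integer conversion in a single place.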
You can further customize your instance with a postBuild script. This is a shell script that runs at the end of building the Docker image used by Saturn resources. Use it for expensive and slow-changing tasks, such as enabling JupyterLab extensions.
To access the postBuild script:
1. Click the project's name in the card at the top of the page. In this workspace, I would click "scanpy-2".
2. Scroll down, click the image dropdown, and select "build your own".
3. The postBuild box will appear. For this example, I will add the system monitor labextension by pasting the following into the postBuild box:
pip install nbresuse==0.3.3
jupyter labextension install jupyterlab-topbar-extension jupyterlab-system-monitor
Note: you will need to explicitly add any other packages you want in your environment through either environment.yml or requirements.txt.
4. Click Build. This will take a few minutes to execute. Then launch JupyterLab from the instance you just customized. Within a notebook, you'll now see a memory usage bar at the top right.
Note: the postBuild script does not have access to user credentials, and the "/home/jovyan" directory does not exist while it runs, so it cannot be used.
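To double-check from a notebook that the postBuild step took effect, one option is to ask Python which version of nbresuse is installed. This is only a sketch: the helper name installed_version is hypothetical, and it assumes the notebook kernel uses the same Python environment that the postBuild script installed into.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# With the postBuild script above, this would be expected to report 0.3.3.
print(installed_version("nbresuse"))
```

A result of None suggests the image was not rebuilt, or that the kernel is running in a different environment.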