Cancelling your running job when it's done:
[aa3301@axon test]$ jobstats.py
User: aa3301
Default Account: zrc
User is part of the following slurm accounts ['zrc']
User Raw Share: 1
User Raw Usage: 476411
Number of Pending Jobs: 0
Number of Running Jobs: 1
Total Jobs Completed: 5
Total Jobs Completed Successfully: 0
Total Jobs Failed: 0
Total Jobs Cancelled: 0
Total Jobs Timeout: 0

                            Running Jobs
________________________________________________________________________________
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
             47257    shared jupyter-   aa3301  R       0:48      1 ax08

                  Running + Pending Jobs
________________________________________________________________________________
             JOBID PARTITION PRIOR     NAME     USER    STATE       TIME  TIME_LIMIT  NODES CPUS TRES_P           START_TIME     NODELIST(REASON)      QOS
             47257    shared   108 jupyter-   aa3301  RUNNING       0:48     5:00:00      1    6  gpu:1  2020-03-05T16:50:33                 ax08   normal

[aa3301@axon test]$ scancel 47257


Running a Jupyter notebook via an SSH tunnel

If you're connecting to Axon through the external SSH connection without the VPN, you won't be able to reach the Jupyter notebook on the compute node directly. To get around this, you can redirect the traffic from the compute node through the login node to your local machine via an SSH tunnel.

Launch a Jupyter notebook:
[aa3301@axon ~]$ srun --pty -c 6 --gres=gpu:1 -t 01:00:00 /bin/bash
[aa3301@ax01 ~]$ ml anaconda3-2019.03
[aa3301@ax01 ~]$ XDG_RUNTIME_DIR=""
[aa3301@ax01 ~]$ jupyter notebook --no-browser --ip=$(hostname -I | awk '{print$1}') --port=$(shuf -i 8888-9000 -n1)
[I 16:36:47.176 NotebookApp] [nb_conda_kernels] enabled, 1 kernels found
[I 16:36:51.677 NotebookApp] JupyterLab extension loaded from /share/apps/anaconda3-2019.03/lib/python3.7/site-packages/jupyterlab
[I 16:36:51.677 NotebookApp] JupyterLab application directory is /share/apps/anaconda3-2019.03/share/jupyter/lab
[I 16:36:51.711 NotebookApp] [nb_conda] enabled
[I 16:36:51.711 NotebookApp] Serving notebooks from local directory: /share/zrc/aa3301
[I 16:36:51.712 NotebookApp] The Jupyter Notebook is running at:
[I 16:36:51.712 NotebookApp] http://10.198.24.12:8944/?token=84cb9ff65b505f63f3e6ffcc03253adc7d133e3e5e063773
[I 16:36:51.712 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 16:36:51.758 NotebookApp]

    To access the notebook, open this file in a browser:
        file:///share/zrc/aa3301/.local/share/jupyter/runtime/nbserver-418198-open.html
    Or copy and paste one of these URLs:
        http://10.198.24.12:8944/?token=84cb9ff65b505f63f3e6ffcc03253adc7d133e3e5e063773

Once you have started the Jupyter notebook, make a note of the IP address and port number in the URL on the last line above. In this case the server IP is 10.198.24.12 and the port number is 8944.
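
If you prefer not to read the values off by eye, the IP and port can be pulled out of the URL with plain bash parameter expansion. This is only a minimal sketch: the variable names (url, hostport, ip, port) are illustrative and not part of the Axon setup, and the URL is the one from the example output above.

```bash
# URL copied from the last line of the jupyter notebook output (example values).
url='http://10.198.24.12:8944/?token=84cb9ff65b505f63f3e6ffcc03253adc7d133e3e5e063773'

# Drop the scheme, then everything from the first "/" onward, leaving "ip:port".
hostport=${url#http://}
hostport=${hostport%%/*}

# Split "ip:port" at the colon.
ip=${hostport%%:*}
port=${hostport##*:}

echo "ip=$ip port=$port"
```

Because the port is chosen at random each time the notebook starts, these values need to be read again for every new session.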

When you have these, you can open a tunnel from your machine to the server by running the following OpenSSH command from a terminal on your machine.

Starting an SSH tunnel from your local machine:
> ssh -L 8080:10.198.24.12:8944 -p 55 aa3301@mbb-nat-vlan415.net.columbia.edu
aa3301@mbb-nat-vlan415.net.columbia.edu's password:
Last login: Mon Apr  6 10:10:02 2020 from adm.rc.zi.columbia.edu
Welcome to the Axon GPU Cluster!
...

You will now have a window that looks like a normal SSH session, but the same session also carries a tunnel from port 8080 on your machine to 10.198.24.12:8944 (in the example above).
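
The general shape of the tunnel command is ssh -L <local port>:<notebook IP>:<notebook port> -p <SSH port> <user>@<login host>. The sketch below assembles it from variables so the pieces are easier to swap out. The variable names are hypothetical and the values are the ones from the example above. It only prints the command, so you can check it before running it yourself.

```bash
# Values from the example above; replace them with your own.
local_port=8080            # port your browser will use on your machine
notebook_ip=10.198.24.12   # compute-node IP from the Jupyter URL
notebook_port=8944         # port from the Jupyter URL
login_user=aa3301
login_host=mbb-nat-vlan415.net.columbia.edu
login_port=55              # SSH port used by the external connection in the example

# -L local_port:notebook_ip:notebook_port forwards the local port
# through the login node to the notebook on the compute node.
tunnel_cmd="ssh -L ${local_port}:${notebook_ip}:${notebook_port} -p ${login_port} ${login_user}@${login_host}"

# Print the command for review instead of running it.
echo "$tunnel_cmd"
```

If port 8080 is already in use on your machine, pick another local port and use that port in the browser URL instead.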

Now open http://localhost:8080 in your web browser and you will see a page like this.

[Image: Jupyter login page with a "Password or token:" field]

If you look back at the output of the jupyter notebook command, you can see the token that was generated at launch embedded in the URL on the last line. Take the portion after token= (which is 84cb9ff65b505f63f3e6ffcc03253adc7d133e3e5e063773 in the example above) and paste it into the "Password or token:" field on the page above, and you will be good to go.
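
As an alternative to pasting the token into the form, the token can be passed in the URL itself, in the same ?token= form that the launch output uses. The sketch below rewrites the compute-node URL so that it goes through the tunnel, assuming the tunnel on local port 8080 from the example above. The variable names are illustrative only.

```bash
# URL copied from the jupyter notebook output (example values).
url='http://10.198.24.12:8944/?token=84cb9ff65b505f63f3e6ffcc03253adc7d133e3e5e063773'
local_port=8080   # local end of the SSH tunnel (assumed from the example above)

# Keep only what follows "token=".
token=${url##*token=}

# Same token, but addressed to the local end of the tunnel.
local_url="http://localhost:${local_port}/?token=${token}"

echo "$local_url"
```

Opening the printed URL in your browser should take you straight to the notebook without the token prompt. Treat the token like a password and avoid sharing the URL.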

Info: The tunnel SSH session needs to stay open for as long as you're using the Jupyter notebook. It may look idle, but it is keeping the tunnel open. You can use this session for other work, but when it closes, your tunnel to Axon closes as well.