=Overview=

[https://rapids.ai/ RAPIDS] is a suite of open-source software libraries from NVIDIA, mainly for executing data science and analytics pipelines in Python on GPUs. It relies on NVIDIA CUDA primitives for low-level compute optimization and provides friendly Python APIs, similar to those in Pandas or Scikit-learn.

The main components are:
* '''cuDF''', a Python GPU DataFrame library (built on the Apache Arrow columnar memory format) for loading, joining, aggregating, filtering, and otherwise manipulating data (a short sketch follows this list).
* '''cuML''', a suite of libraries that implement machine learning algorithms and mathematical primitive functions, sharing compatible APIs with other RAPIDS projects.
* '''cuGraph''', a GPU-accelerated graph analytics library with functionality similar to NetworkX, seamlessly integrated into the RAPIDS data science platform.
* '''Cyber Log Accelerators (CLX or ''clicks'')''', a collection of RAPIDS examples for security analysts, data scientists, and engineers to quickly get started applying RAPIDS and GPU acceleration to real-world cybersecurity use cases.
* '''cuxfilter''', a connector library that links visualization libraries to a GPU DataFrame, letting you combine charts from different libraries in a single interactive dashboard.
* '''cuSpatial''', a GPU-accelerated C++/Python library for GIS workflows, including point-in-polygon, spatial joins, coordinate systems, shape primitives, distances, and trajectory analysis.
* '''cuSignal''', which leverages CuPy, Numba, and the RAPIDS ecosystem for GPU-accelerated signal processing. In some cases, cuSignal is a direct port of SciPy Signal built on CuPy, but it also contains Numba CUDA kernels for additional speedups in selected functions.
* '''cuCIM''', an extensible toolkit providing GPU-accelerated I/O, computer vision, and image processing primitives for N-dimensional images, with a focus on biomedical imaging.
* '''RAPIDS Memory Manager (RMM)''', a central place for all device memory allocations in cuDF (C++ and Python) and other RAPIDS libraries. It also serves as a replacement allocator for CUDA device memory (and CUDA managed memory) and as a pool allocator that makes CUDA device memory allocation and deallocation faster and asynchronous.
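To give a taste of the Pandas-like API mentioned above, the sketch below shows a minimal cuDF session. The data and column names are purely illustrative, and the snippet assumes cuDF is available, e.g., inside one of the RAPIDS containers described below.

<syntaxhighlight lang="python">
import cudf

# Create a DataFrame in GPU memory; the API mirrors Pandas
gdf = cudf.DataFrame({"fruit": ["apple", "pear", "apple"],
                      "count": [3, 5, 2]})

# Familiar Pandas-style operations, executed on the GPU
print(gdf.groupby("fruit")["count"].sum())
print(gdf[gdf["count"] > 2])
</syntaxhighlight>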
=Apptainer images=

To build an Apptainer (formerly called [[Singularity#Please_use_Apptainer_instead|Singularity]]) image for RAPIDS, the first step is to find and select a Docker image provided by NVIDIA.

==Finding a Docker image==

Starting with the RAPIDS v23.08 release, there are two types of RAPIDS Docker images: ''base'' and ''notebooks''. For each type, multiple images are provided for different combinations of RAPIDS and CUDA versions, on either Ubuntu or CentOS. You can find the Docker pull command for a selected image under the '''Tags''' tab on each site.

* [https://catalog.ngc.nvidia.com/orgs/nvidia/teams/rapidsai/containers/base RAPIDS Base]: contains a RAPIDS environment ready for use. Use this type of image if you want to submit a job to the Slurm scheduler.
* [https://catalog.ngc.nvidia.com/orgs/nvidia/teams/rapidsai/containers/notebooks RAPIDS Notebooks]: extends the base image by adding a Jupyter Notebook server and example notebooks. Use this type of image if you want to work interactively with RAPIDS through notebooks and examples.

==Building an Apptainer image==

For example, if the Docker pull command for a selected image is

 docker pull nvcr.io/nvidia/rapidsai/rapidsai:cuda11.0-runtime-centos7

then, on a computer that supports Apptainer, you can build an Apptainer image (here ''rapids.sif'') based on the pull tag with

 [name@server ~]$ apptainer build rapids.sif docker://nvcr.io/nvidia/rapidsai/rapidsai:cuda11.0-runtime-centos7

It usually takes thirty to sixty minutes to build the image. Since the image size is relatively large, you need enough memory and disk space on the server to build it.
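Once the build completes, a quick way to confirm the image is usable is to run a short Python check inside the container. The script below is a minimal, illustrative sketch (the filename ''check_rapids.py'' is hypothetical); on a GPU node, you might run it with <code>apptainer exec --nv rapids.sif python check_rapids.py</code>.

{{File
|name=check_rapids.py
|lang="python"
|contents=
# Minimal sanity check for a RAPIDS container (hypothetical filename).
import cudf

print("cuDF version:", cudf.__version__)

# A tiny computation on the GPU; a failure here usually means the GPU
# is not visible inside the container (check that --nv was used).
s = cudf.Series([1, 2, 3])
print("sum on GPU:", int(s.sum()))
}}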
=Working on clusters with an Apptainer image=

Once you have an Apptainer image for RAPIDS ready in your account, you can request an interactive session on a GPU node or, if your RAPIDS code is ready, submit a batch job to the Slurm scheduler.

==Working interactively on a GPU node==

If the Apptainer image was built from a ''notebooks'' type of Docker image, it includes a Jupyter Notebook server and can be used to explore RAPIDS interactively on a compute node with a GPU.

To request an interactive session on a compute node with a single GPU, e.g. a T4 on Graham, run

 [name@gra-login ~]$ salloc --ntasks=1 --cpus-per-task=2 --mem=10G --gres=gpu:t4:1 --time=1:0:0 --account=def-someuser

Once the requested resource is granted, start the RAPIDS shell on the GPU node with

 [name@gra#### ~]$ module load apptainer
 [name@gra#### ~]$ apptainer shell --nv -B /home -B /project -B /scratch rapids.sif

* the <code>--nv</code> option binds the GPU driver on the host to the container, so the GPU device can be accessed from inside the Apptainer container;
* the <code>-B</code> option binds any filesystem that you would like to access from inside the container.

After the shell prompt changes to <code>Apptainer></code>, you can check the GPU stats in the container to make sure the GPU device is accessible:

 Apptainer> nvidia-smi

You can then launch the Jupyter Notebook server in the RAPIDS environment with the following command; the URL of the Notebook server is displayed after it starts successfully.

 Apptainer> jupyter-lab --ip $(hostname -f) --no-browser

'''NOTE:''' Starting with the RAPIDS v23.08 release, all packages are included in the base conda environment, which is activated by default in the container shell.

As there is no direct Internet connection on a compute node on Graham, you need to set up an SSH tunnel with port forwarding between your local computer and the GPU node. See [[Advanced_Jupyter_configuration#Connecting_to_JupyterLab|detailed instructions for connecting to Jupyter Notebook]].

==Submitting a RAPIDS job to the Slurm scheduler==

Once you have your RAPIDS code ready, you can write a job submission script to submit a job execution request to the Slurm scheduler. It is good practice to [[Using node-local storage|use the local disk]] on a compute node when working via a container. A minimal sketch of what <code>my_rapids_code.py</code> might contain is shown at the end of this page.

'''Submission script'''

{{File
|name=submit.sh
|lang="sh"
|contents=
#!/bin/bash
#SBATCH --gres=gpu:t4:1
#SBATCH --cpus-per-task=2
#SBATCH --mem=10G
#SBATCH --time=DD-HH:MM
#SBATCH --account=def-someuser

module load apptainer

# copy your container image, your RAPIDS code, and your data
# to the local disk on the compute node via $SLURM_TMPDIR
cd $SLURM_TMPDIR
cp /path/to/rapids.sif ./
cp /path/to/my_rapids_code.py ./
cp -r /path/to/your_datasets ./

apptainer exec --nv rapids.sif python ./my_rapids_code.py

# save any results to your /project space before the job ends
cp -r your_results ~/projects/def-someuser/username/
}}

=Helpful links=

* [https://docs.rapids.ai/ RAPIDS Docs]: a collection of all the documentation for RAPIDS, how to stay connected and report issues;
* [https://github.com/rapidsai/notebooks RAPIDS Notebooks]: a collection of example notebooks on GitHub for getting started quickly;
* [https://medium.com/rapids-ai RAPIDS on Medium]: a collection of use cases and blogs for RAPIDS applications.
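=Example: a minimal RAPIDS script=

For reference, here is a minimal, illustrative sketch of what <code>my_rapids_code.py</code> from the submission script above might contain. The dataset path and column names are hypothetical; your actual code will differ.

{{File
|name=my_rapids_code.py
|lang="python"
|contents=
import os

import cudf

# Load a CSV directly into GPU memory (the path is illustrative;
# the submission script stages the data under $SLURM_TMPDIR)
df = cudf.read_csv("your_datasets/data.csv")

# A typical Pandas-like pipeline: filter, group, aggregate
result = (
    df[df["value"] > 0]
    .groupby("category")["value"]
    .mean()
    .reset_index()
)

# Write results to the local disk; the submission script copies
# them back to /project afterwards
os.makedirs("your_results", exist_ok=True)
result.to_csv("your_results/result.csv", index=False)
}}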