AlphaFold

Introduction

In short, AlphaFold is a groundbreaking AI system that is speeding up research in bioinformatics. It takes the amino acid sequence of a protein as input and predicts the protein's three-dimensional structure, and it does so efficiently.

Read more on the AlphaFold official website.

This section explains how to use AlphaFold on Elja.


Getting started

Note: Due to NVIDIA compatibility issues, Elja now requires you to run AlphaFold in a Conda environment.

Setting up the Conda environment

We start by initializing the Conda environment. These are the same steps as in the Conda section:

$ module use /hpcapps/lib-mimir/modules/all 
$ module load Anaconda3/2022.05
$ conda config --add channels defaults
$ conda config --add channels bioconda
$ conda config --add channels conda-forge
$ conda config --set auto_activate_base false
$ conda init
$ bash # You can also log out and in again.

Load AlphaFold

Once Conda is initialized and ready to use, we can load the AlphaFold module.

$ ml use /hpcapps/libbio-gpu/modules/all
$ ml load AlphaFold/2.3.1
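
Before starting a job, it can help to confirm that the module actually loaded. The sketch below assumes that loading the module puts run_alphafold.sh on your PATH, which the commands later on this page suggest but which is not confirmed here. The function name require_cmd is hypothetical and not part of AlphaFold or Elja.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the named command can be found on PATH.
require_cmd() {
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "error: '$1' not found; is the module loaded?" >&2
        return 1
    fi
}

# Example usage after the two ml commands above (assumes the module
# exposes run_alphafold.sh on PATH):
#   require_cmd run_alphafold.sh && echo "AlphaFold module looks loaded"
```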

Run AlphaFold on Elja

To run AlphaFold on Elja you can either run an interactive session or run a batch job.

Starting an interactive session

You can start an interactive session on a GPU node with the srun command. To keep the session running in the background, start it inside a screen or tmux session.

$ srun --job-name "AlphaFold" --partition gpu-1xA100 --time 01:00:00 --pty bash
$ conda activate $env_path
$ run_alphafold.sh -d /AlphaFoldData/AlphaFold/data -o /hpcapps/source/alphafold_non_docker/dummy_test/ -f /hpcapps/source/alphafold_non_docker/example/query.fasta -t 2020-05-14
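
Since GPU time is limited, a quick sanity check of the input file before calling run_alphafold.sh may save a wasted session. The following is a minimal sketch, not part of AlphaFold: check_fasta is a hypothetical helper that only verifies that the file is non-empty and that its first line is a FASTA header. It does not validate the sequence itself.

```shell
#!/bin/bash
# Hypothetical helper (not part of AlphaFold or Elja): basic sanity check
# of a FASTA file before spending GPU time on it.
check_fasta() {
    local fasta="$1"
    # The file must exist and be non-empty.
    if [ ! -s "$fasta" ]; then
        echo "error: '$fasta' is missing or empty" >&2
        return 1
    fi
    # The first line of a FASTA file is a header that starts with '>'.
    if ! head -n 1 "$fasta" | grep -q '^>'; then
        echo "error: '$fasta' does not start with a FASTA header" >&2
        return 1
    fi
    return 0
}

# Example usage inside the interactive session (paths as in the command above):
#   FASTA=/hpcapps/source/alphafold_non_docker/example/query.fasta
#   check_fasta "$FASTA" && run_alphafold.sh -d /AlphaFoldData/AlphaFold/data \
#       -o /hpcapps/source/alphafold_non_docker/dummy_test/ -f "$FASTA" -t 2020-05-14
```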

Running AlphaFold with SBATCH

$ cat submit.slurm
#!/bin/bash
#SBATCH --mail-type=ALL
#SBATCH --mail-user=<MAIL> # for example uname@hi.is
#SBATCH --nodes=1 # number of nodes
#SBATCH --partition=gpu-1xA100
#SBATCH --time=1-00:00:00 # run for 1 day maximum
#SBATCH --output=slurm_job_output.log
#SBATCH --error=slurm_job_errors.log # Logs if job crashes

module use /hpcapps/libbio-gpu/modules/all
module load AlphaFold/2.3.1
conda activate $env_path

# Run the command

run_alphafold.sh -d /AlphaFoldData/AlphaFold/data -o /hpcapps/source/alphafold_non_docker/dummy_test/ -f /hpcapps/source/alphafold_non_docker/example/query.fasta -t 2020-05-14
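
The listing above only shows the contents of the script. One way to submit and follow it is sketched below, assuming the standard Slurm commands sbatch and squeue are available on the login node. With --parsable, sbatch prints either the job ID alone or the job ID followed by a semicolon and the cluster name. The helper job_id_from_parsable is a hypothetical name introduced here for illustration.

```shell
#!/bin/bash
# sbatch --parsable prints "jobid" or "jobid;cluster"; keep only the job ID.
job_id_from_parsable() {
    local out="$1"
    echo "${out%%;*}"
}

# Example usage on a login node (requires Slurm, so it is left commented out):
#   jobid=$(job_id_from_parsable "$(sbatch --parsable submit.slurm)")
#   squeue -j "$jobid"              # show the job's state in the queue
#   tail -f slurm_job_output.log    # follow the output log named in submit.slurm
```

The log file name in the last comment matches the --output setting in submit.slurm. If you change that setting, adjust the tail command accordingly.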