AlphaFold2

From ScientificComputing
Revision as of 18:04, 6 December 2021 by Jarunanp (talk | contribs)

Load modules

AlphaFold2 is installed in the new software stack and can be loaded as follows.

[jarunanp@eu-login-18 ~]$ module load gcc/6.3.0 openmpi/4.0.2 alphafold/2.1.1
Now run 'alphafold_init' to initialize the virtual environment

The following have been reloaded with a version change:
  1) gcc/4.8.5 => gcc/6.3.0

[jarunanp@eu-login-18 ~]$ alphafold_init
(venv_alphafold) [jarunanp@eu-login-18 ~]$ 

Databases

The AlphaFold databases have a total size of 2.2 TB when unzipped. You can download the databases to $SCRATCH if you have enough space. To check your free space, use the command:

$ lquota

However, if there are several AlphaFold users in your group, institute or department, we recommend using group storage so that the databases are downloaded only once.

For D-BIOL members, the AlphaFold databases are currently located at /cluster/work/biol/alphafold.
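
Before submitting a job, it can help to confirm that the database directory you plan to use is complete. The following is a minimal sketch, not an official tool: the function name check_alphafold_dbs is hypothetical, and the list of subdirectories is taken from the database paths used in the job script on this page, so adjust it if your copy is laid out differently.

```shell
#!/usr/bin/env bash
# Pre-flight check: report which expected database subdirectories are missing.
# Assumption: the layout matches the paths used in the job script on this page
# (bfd, uniref90, uniclust30, mgnify, pdb70, pdb_mmcif).

check_alphafold_dbs() {
    local data_dir="$1"
    local missing=0
    local sub
    for sub in bfd uniref90 uniclust30 mgnify pdb70 pdb_mmcif; do
        if [ ! -d "$data_dir/$sub" ]; then
            echo "missing: $data_dir/$sub" >&2
            missing=$((missing + 1))
        fi
    done
    # Succeed only if every subdirectory was found
    [ "$missing" -eq 0 ]
}

# Example call (DATA_DIR is whatever location you downloaded the databases to):
# check_alphafold_dbs "$DATA_DIR" && echo "all database directories found"
```

The check only looks for the top-level directories. It does not verify that the downloads inside them finished or that the files are intact.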


Submit a job

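Here is an example of a job submission script that requests 12 cores, 120 GB of total memory, 120 GB of total local scratch space and one GPU. The mem and scratch values in the rusage string are given per core in MB, so 12 cores × 10000 MB gives 120 GB each.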

#!/usr/bin/bash
#BSUB -n 12
#BSUB -W 24:00
#BSUB -R "rusage[mem=10000, scratch=10000, ngpus_excl_p=1]"
#BSUB -J alphafold

source /cluster/apps/local/env2lmod.sh
module load gcc/6.3.0 openmpi/4.0.2 alphafold/2.1.1
source /cluster/apps/nss/alphafold/venv_alphafold/bin/activate

# Define the path to the databases (adjust this to where your copy is stored)
DATA_DIR="/cluster/scratch/jarunanp/21_10_alphafold_databases"

python /cluster/apps/nss/alphafold/alphafold-2.1.1/run_alphafold.py \
--data_dir=$DATA_DIR \
--output_dir=$TMPDIR \
--max_template_date="2021-12-06" \
--bfd_database_path=$DATA_DIR/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt \
--uniref90_database_path=$DATA_DIR/uniref90/uniref90.fasta \
--uniclust30_database_path=$DATA_DIR/uniclust30/uniclust30_2018_08/uniclust30_2018_08 \
--mgnify_database_path=$DATA_DIR/mgnify/mgy_clusters_2018_12.fa \
--pdb70_database_path=$DATA_DIR/pdb70/pdb70 \
--template_mmcif_dir=$DATA_DIR/pdb_mmcif/mmcif_files \
--obsolete_pdbs_path=$DATA_DIR/pdb_mmcif/obsolete.dat \
--fasta_paths=ubiquitin.fasta

# Copy the results from the local scratch of the compute node
# to the directory the job was submitted from
mkdir -p output
cp -r "$TMPDIR"/* output
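
To adapt the resource request to a different number of cores, the per-core values have to be recomputed from the totals you want. The sketch below shows that arithmetic. The helper name per_core_mb is hypothetical, and it assumes, as the 12-core / 120 GB example implies, that mem and scratch in the rusage string are per-core values in MB.

```shell
#!/usr/bin/env bash
# Convert a desired job total (in GB) into the per-core value (in MB)
# used in the rusage string of the job script.
# Assumption: mem and scratch in rusage[...] are per core, in MB.

per_core_mb() {
    local total_gb="$1" cores="$2"
    # Round up so the job never gets less than the requested total
    echo $(( (total_gb * 1000 + cores - 1) / cores ))
}

cores=12
mem=$(per_core_mb 120 "$cores")       # 120 GB of total memory
scratch=$(per_core_mb 120 "$cores")   # 120 GB of total local scratch

# Print the directives to paste into the job script
echo "#BSUB -n $cores"
echo "#BSUB -R \"rusage[mem=$mem, scratch=$scratch, ngpus_excl_p=1]\""
```

With the values above, this prints the same -n and -R lines as the example script. Assuming the script is saved under a name such as run_alphafold.bsub (a hypothetical file name), it would typically be submitted with bsub < run_alphafold.bsub so that LSF reads the #BSUB directives.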

