AlphaFold2


AlphaFold2 predicts a protein's 3D structure from its amino acid sequence with an accuracy that is competitive with experimental results. This AI-powered structure prediction was recognized as the scientific breakthrough of the year 2021. The AlphaFold package is installed in the new software stack on Euler.

Load modules

The AlphaFold module can be loaded as follows:

$ env2lmod
$ module load gcc/6.3.0 openmpi/4.0.2 alphafold/2.1.1
Now run 'alphafold_init' to initialize the virtual environment

The following have been reloaded with a version change:
  1) gcc/4.8.5 => gcc/6.3.0

$ alphafold_init
(venv_alphafold) [jarunanp@eu-login-18 ~]$ 
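
To double-check that the virtual environment is active, you can verify which Python interpreter is picked up. The path shown is an assumption derived from the activate script used further below, not output we have verified:

(venv_alphafold) [jarunanp@eu-login-18 ~]$ which python
/cluster/apps/nss/alphafold/venv_alphafold/bin/python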

Databases

The AlphaFold databases are available for all cluster users at /cluster/project/alphafold.

If you wish to download the databases separately, you can find the instructions here.
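
To check that the shared databases are in place, you can list the directory. The sketch below shows only the subdirectories referenced by the run script later on this page; the actual contents of the directory may differ:

$ ls /cluster/project/alphafold
bfd  mgnify  pdb70  pdb_mmcif  pdb_seqres  uniclust30  uniprot  uniref90  ...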

Postprocessing

Plots similar to those generated by the ColabFold Jupyter notebook can be created with the alphafold-postprocessing Python script. It is available on Euler as a module:

module load gcc/6.3.0 alphafold-postprocessing
postprocessing.py -o plots/ work_directory/

The above command processes the pkl files generated by AlphaFold in the folder work_directory/ and puts the resulting plots into the folder plots/.

The postprocessing is integrated in the setup script described below.
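
For example, assuming a finished run wrote its pkl files below ${SCRATCH}/protein_name/output as in the job script described next, a postprocessing call could look like this. Both paths are illustrative; point the script at the folder that actually contains the pkl files of your run:

module load gcc/6.3.0 alphafold-postprocessing
postprocessing.py -o ${SCRATCH}/protein_name/plots/ ${SCRATCH}/protein_name/output/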

Create a job script

A job script is a BASH script containing commands to request computing resources, set up the computing environment, run the application and retrieve the results.

Here is a breakdown of a job script called run_alphafold.bsub.

Request computing resources

AlphaFold2 can run with CPUs only, or with CPUs and GPUs, which speeds up the computation significantly. Here we request 12 CPU cores, 120 GB of memory in total, 120 GB of local scratch space in total, and one GPU (the rusage values in the script below are per CPU core, so 12 cores × 10'000 MB give the 120 GB totals).

#!/usr/bin/bash
#BSUB -n 12                                                    # Number of CPUs
#BSUB -W 24:00                                                 # Runtime
#BSUB -R "rusage[mem=10000, scratch=10000]"                    # CPU memory and scratch space per CPU core
#BSUB -R "rusage[ngpus_excl_p=1] select[gpu_mtotal0>=10240]"   # Number of GPUs and GPU memory 
#BSUB -R "span[hosts=1]"                                       # All CPUs in the same host
#BSUB -J alphafold                                             # Job name

Set up a computing environment for AlphaFold

source /cluster/apps/local/env2lmod.sh
module load gcc/6.3.0 openmpi/4.0.2 alphafold/2.1.1
source /cluster/apps/nss/alphafold/venv_alphafold/bin/activate

Enable Unified Memory (if needed)

If the input protein sequence is too large to fit into the memory of a single GPU (roughly more than 1500 amino acids), enable Unified Memory to bridge system memory and GPU memory, so that the memory of a single GPU can be oversubscribed.

...
#BSUB -R "rusage[ngpus_excl_p=4] select[gpu_mtotal0>=10240]"
...
export TF_FORCE_UNIFIED_MEMORY=1
export XLA_PYTHON_CLIENT_MEM_FRACTION="4.0"

Define paths

# Define paths to databases, fasta file and output directory
DATA_DIR="/cluster/project/alphafold"
FASTA_DIR="/cluster/home/jarunanp/fastafiles"
OUTPUT_DIR=${SCRATCH}/protein_name/output

For the output directory, there are two options.

  • Use $SCRATCH (max 2.7TB), $HOME (max. 20GB) or group storage (/cluster/project or /cluster/work), e.g.,
OUTPUT_DIR=${SCRATCH}/protein_name/output
  • Use the local /scratch of the compute node as the output directory. To do so, request scratch space with the BSUB options, e.g., here 120 GB of scratch space in total. At the end of the computation, don't forget to copy the results back from there.
#BSUB -n 12
#BSUB -R "rusage[scratch=10000]"
...
OUTPUT_DIR=${TMPDIR}/output
...
python /path/run_alphafold.py ...
...
cp -r ${TMPDIR}/output ${SCRATCH}/protein_name

Start Multi-Process Service on GPU (version >= 2.1.2)

From version 2.1.2 on, relaxation can be run on the GPU with the option --use_gpu_relax=1. This option tries to create multiple contexts on the GPU, but the default GPU compute mode is exclusive and does not allow multiple contexts. This can be circumvented by starting the Multi-Process Service (MPS) with the command

nvidia-cuda-mps-control -d

Call Python run script

python /cluster/apps/nss/alphafold/alphafold-2.1.1/run_alphafold.py \
--data_dir=$DATA_DIR \
--output_dir=$OUTPUT_DIR \
--max_template_date="2021-12-06" \
--bfd_database_path=$DATA_DIR/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt \
--uniref90_database_path=$DATA_DIR/uniref90/uniref90.fasta \
--uniclust30_database_path=$DATA_DIR/uniclust30/uniclust30_2018_08/uniclust30_2018_08 \
--mgnify_database_path=$DATA_DIR/mgnify/mgy_clusters_2018_12.fa \
--template_mmcif_dir=$DATA_DIR/pdb_mmcif/mmcif_files \
--obsolete_pdbs_path=$DATA_DIR/pdb_mmcif/obsolete.dat \

Then define the input FASTA file, select the model preset (monomer or multimer) and set the paths to the structure databases accordingly (a minimal FASTA example follows the list below).

  • For a monomeric protein
--fasta_paths=$FASTA_DIR/ubiquitin.fasta \
--model_preset=monomer \
--pdb70_database_path=$DATA_DIR/pdb70/pdb70
  • For a multimeric protein
--fasta_paths=$FASTA_DIR/IFGSC_6mer.fasta \
--model_preset=multimer \
--pdb_seqres_database_path=$DATA_DIR/pdb_seqres/pdb_seqres.txt \
--uniprot_database_path=$DATA_DIR/uniprot/uniprot.fasta
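
A FASTA input file is a plain text file with a header line starting with > followed by the amino acid sequence. As an illustration only, a file such as ubiquitin.fasta could look like this (the sequence shown is the 76-residue human ubiquitin):

>ubiquitin
MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQQRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG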

Enable relaxation on GPU (version >= 2.1.2)
From this version on, relaxation can be run on the GPU with the option --use_gpu_relax. See the section above on how to start MPS, which is required for this option.

--use_gpu_relax=1
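
Putting the pieces together, a complete call for a monomeric protein could look as follows. This is only a sketch assembled from the flags shown above; adjust the FASTA file and the template date to your case, and omit --use_gpu_relax=1 for versions earlier than 2.1.2:

python /cluster/apps/nss/alphafold/alphafold-2.1.1/run_alphafold.py \
--data_dir=$DATA_DIR \
--output_dir=$OUTPUT_DIR \
--max_template_date="2021-12-06" \
--bfd_database_path=$DATA_DIR/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt \
--uniref90_database_path=$DATA_DIR/uniref90/uniref90.fasta \
--uniclust30_database_path=$DATA_DIR/uniclust30/uniclust30_2018_08/uniclust30_2018_08 \
--mgnify_database_path=$DATA_DIR/mgnify/mgy_clusters_2018_12.fa \
--template_mmcif_dir=$DATA_DIR/pdb_mmcif/mmcif_files \
--obsolete_pdbs_path=$DATA_DIR/pdb_mmcif/obsolete.dat \
--fasta_paths=$FASTA_DIR/ubiquitin.fasta \
--model_preset=monomer \
--pdb70_database_path=$DATA_DIR/pdb70/pdb70 \
--use_gpu_relax=1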

Disable Multi-Process Service (version >= 2.1.2)

If MPS was enabled before running AlphaFold, disable it at the end of the job with the command

echo quit | nvidia-cuda-mps-control

Submit a job

Submit a job with the command

$ bsub < run_alphafold.bsub
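
While the job is pending or running, you can monitor it with the standard LSF commands, for example (the job ID used here is the one from the example output file below; use the ID reported by bsub at submission):

$ bjobs                   # list your jobs and their status
$ bpeek 195525946         # peek at the screen output of a running job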

The screen output is saved in an output file whose name starts with lsf.o followed by the job ID, e.g., lsf.o195525946. Please see this page for how to read the output file.

From our benchmark, it took around 40 minutes to fold ubiquitin (76 aa) and 2.5 hours to fold T1050 (779 aa).

Setup script

The setup script creates a job script with estimated computing resources based on the input protein sequence. To download the setup script:

git clone https://gitlab.ethz.ch/sis/alphafold_on_euler

Usage:

./setup_alphafold_run_script.sh -f [Fasta file] -w [work directory] --max_template_date yyyy-mm-dd

Example:

$ ./setup_alphafold_run_script.sh -f ../../fastafiles/IFGSC_6mer.fasta -w $SCRATCH
 Reading /cluster/home/jarunanp/alphafold_run/fastafiles/IFGSC_6mer.fasta
 Protein name:            IFGSC_6mer
 Number of sequences:     6
 Protein type:            multimer
 Number of amino acids:
                   sum: 1246
                   max: 242
 Estimate required resources:
   Run time: 24:00
   Number of CPUs: 12
   Total CPU memory: 120000
   Number of GPUs: 1
   Total GPU memory: 20480
   Total scratch space: 120000
 Output an LSF run script for AlphaFold2: /cluster/scratch/jarunanp/run_alphafold.bsub
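
The generated job script can then be submitted exactly as described in the "Submit a job" section above:

$ bsub < /cluster/scratch/jarunanp/run_alphafold.bsub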

Further readings

  • DeepMind Blog post: "AlphaFold: a solution to a 50-year-old grand challenge in biology" (https://deepmind.com/blog/article/alphafold-a-solution-to-a-50-year-old-grand-challenge-in-biology)
  • ETH News: "Computer algorithms are currently revolutionising biology" (https://ethz.ch/en/news-and-events/eth-news/news/2021/08/computer-algorithms-revolutionise-biology.html)
  • AlphaFold2 presentation slides 21 March 2022
  • Downloading AlphaFold databases and benchmark results
