AlphaFold2
Load modules
AlphaFold2 is installed in the new software stack and can be loaded as follows:
[jarunanp@eu-login-18 ~]$ module load gcc/6.3.0 openmpi/4.0.2 alphafold/2.1.1
Now run 'alphafold_init' to initialize the virtual environment

The following have been reloaded with a version change:
  1) gcc/4.8.5 => gcc/6.3.0

[jarunanp@eu-login-18 ~]$ alphafold_init
(venv_alphafold) [jarunanp@eu-login-18 ~]$
Databases
The AlphaFold databases have a total size of 2.2 TB when unzipped. Users can download the databases to $SCRATCH if they have enough space. You can check your free space with the command
$ lquota
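The wiki page does not show the download step itself. One way to fetch the databases is the download helper shipped with the upstream AlphaFold repository; the sketch below assumes you work from a clone of https://github.com/deepmind/alphafold, that aria2c is available, and the target directory is only a placeholder:

# clone the upstream AlphaFold repository and run its database download helper
# (requires aria2c; the target path below is a placeholder, adjust as needed)
git clone https://github.com/deepmind/alphafold.git
cd alphafold
scripts/download_all_data.sh $SCRATCH/alphafold_databases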
However, if there are several users of AlphaFold in your group, institute or department, we recommend using a group storage share instead.

For D-BIOL members, the AlphaFold databases are currently located at /cluster/work/biol/alphafold.
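In that case, only the database path in the job script below needs to change. A minimal sketch for D-BIOL members, assuming read access to the shared directory:

# point the job script at the shared D-BIOL databases instead of a personal copy on $SCRATCH
DATA_DIR="/cluster/work/biol/alphafold"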
Submit a job
Here is an example of a job submission script which requests 12 CPU cores, 120 GB of memory in total, 120 GB of local scratch space in total and one GPU.
#!/usr/bin/bash
#BSUB -n 12
#BSUB -W 24:00
#BSUB -R "rusage[mem=10000, scratch=10000, ngpus_excl_p=1]"
#BSUB -J alphafold

source /cluster/apps/local/env2lmod.sh
module load gcc/6.3.0 openmpi/4.0.2 alphafold/2.1.1
source /cluster/apps/nss/alphafold/venv_alphafold/bin/activate

# Define paths to databases
DATA_DIR="/cluster/scratch/jarunanp/21_10_alphafold_databases"

python /cluster/apps/nss/alphafold/alphafold-2.1.1/run_alphafold.py \
    --data_dir=$DATA_DIR \
    --output_dir=$TMPDIR \
    --max_template_date="2021-12-06" \
    --bfd_database_path=$DATA_DIR/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt \
    --uniref90_database_path=$DATA_DIR/uniref90/uniref90.fasta \
    --uniclust30_database_path=$DATA_DIR/uniclust30/uniclust30_2018_08/uniclust30_2018_08 \
    --mgnify_database_path=$DATA_DIR/mgnify/mgy_clusters_2018_12.fa \
    --pdb70_database_path=$DATA_DIR/pdb70/pdb70 \
    --template_mmcif_dir=$DATA_DIR/pdb_mmcif/mmcif_files \
    --obsolete_pdbs_path=$DATA_DIR/pdb_mmcif/obsolete.dat \
    --fasta_paths=ubiquitin.fasta

# Copy the results from the compute node
mkdir -p output
cp -r $TMPDIR/* output
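The script is then handed to the LSF batch system with bsub, which reads the #BSUB directives from the file. A minimal sketch, assuming the script above was saved as run_alphafold.bsub (an illustrative name) and that ubiquitin.fasta is in the submission directory:

# submit the job script to LSF (run_alphafold.bsub is an illustrative filename)
bsub < run_alphafold.bsub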
A small example such as ubiquitin.fasta took around 40 minutes to finish with the databases stored on $SCRATCH. The screen output is saved in an output file whose name starts with lsf.o followed by the JobID, e.g., lsf.o195525946.
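To follow the job and look at the results afterwards, the standard LSF tools can be used. A minimal sketch, assuming the results were copied back into the output directory as in the script above; the exact subdirectory name and set of output files depend on the AlphaFold version:

# check the job state while it is queued or running
bjobs
# after completion, list the predicted models copied back from the compute node;
# AlphaFold typically writes one subdirectory per FASTA target with ranked model files
ls output/ubiquitin/ranked_*.pdb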