Difference between revisions of "CLC genomics server"

From ScientificComputing
(Created page with "CLC Genomics Server 7.5.1 (for further informations see http://www.clcbio.com/products/clc-genomics-server) is installed on the Euler cluster. == Introduction == The CLC Gen...")
 
(Current version of the server)
 
(9 intermediate revisions by the same user not shown)
Line 1: Line 1:
CLC Genomics Server 7.5.1 (for further information, see http://www.clcbio.com/products/clc-genomics-server) is installed on the Euler cluster.
+
==Introduction==
  
== Introduction ==
+
CLC Genomics Server is a software solution for centralized bioinformatics analysis and sharing of data generated from all High-Throughput Sequencing platforms. It contains the same tools as the CLC Genomics Workbench, such as mapping of reads to a known reference, de novo assembly, and variant calling. With a single click within CLC Genomics Workbench, it is possible to offload resource-demanding tasks that could not be run in a desktop computer environment to an HPC cluster. Please find further information about the CLC Genomics Workbench on the [https://sharepoint.biol.ethz.ch/it/clc/SitePages/Home.aspx sharepoint] page of D-BIOL.
  
The CLC Genomics Workbench (http://www.clcbio.com/products/clc-genomics-workbench/) is a '''next generation sequencing solution''' that provides numerous features within the fields of genomics, transcriptomics and epigenomics and additionally includes all features of CLC Main Workbench. Further information about the CLC Genomics Workbench is provided on the [https://sharepoint.biol.ethz.ch/it/clc/SitePages/Home.aspx sharepoint] page of D-BIOL.
+
==Current version of the server==
 +
10.0.0
  
The CLC Genomics Workbench can be used as a stand-alone application, but for calculations that require larger amounts of computational resources, it may reach its limits. Therefore CLC Bio provides the CLC Genomics Server installation, which allows users to '''offload''' their '''resource-demanding tasks''' from the CLC Genomics Workbench clients '''to the server installation on Euler'''. The jobs are then submitted from the CLC Genomics Workbench and processed on the Euler cluster.
+
==Life time==
 +
2013-
  
== Workbench versions compatible with genomics server on Euler ==
+
==Tutorial==
 
+
[[Using_the_CLC_genomics_service|Using the CLC genomics service]]
*'''8.5.x'''
 
 
 
== Requirements for Using the CLC Genomics Server Installation on Euler ==
 
 
 
To use the CLC Genomics Server installation on Euler, certain requirements need to be fulfilled. First of all, an '''installation of the CLC Genomics Workbench client''' on your local computer is needed. The client software is provided by IDES (www.ides.ethz.ch). Furthermore, you need to install the '''CLC Workbench Client Plugin''', which is used for the communication between the CLC Genomics Workbench and the CLC Genomics Server.
 
 
 
# Start the CLC Genomics Workbench (on Windows computers you have to start it as '''administrator''', i.e., right-click the CLC Workbench icon and choose '''run as administrator''')
 
# Click on the '''Plug-in''' button
 
# Click on the '''Download Plug-ins''' tab, choose the '''CLC Workbench Client Plugin''' and click on '''Download and Install'''
 
 
 
{|
 
|-
 
|[[Image:Eulerclcplugin1.png|thumb|370px|Plug-in button]]
 
|[[Image:Eulerclcplugin2.png|thumb|400px|Download and install plug-in]]
 
 
 
|}
 
 
 
As a last requirement, a '''CLC Genomics Server account''' is needed to use the CLC Genomics Server installation on the Euler cluster. To request an account, please contact cluster-support@id.ethz.ch
 
 
 
== Login to the CLC Genomics Server from the CLC Genomics Workbench Client ==
 
 
 
An SSH tunnel is no longer required to connect the CLC Genomics Workbench client to the CLC Genomics Server. The CLC Genomics Server on Euler runs in a virtual machine, and the clients can connect directly to this virtual machine.
 
 
 
=== Connecting the Client to the Server ===
 
 
 
# Open the CLC Genomics Workbench client (first only the '''local data''' is shown in the menu at the top left)
 
# Open the '''File''' menu and click on the entry '''CLC Server Login'''
 
# Enter the '''username''' and '''password''' of your CLC Genomics Server account
 
# Click on '''Advanced''' and enter '''clc01.hpc-lca.ethz.ch''' as server host and '''7777''' as server port, then click on the '''Login''' button
 
 
 
After the login procedure, the server data locations will be displayed in the '''Navigation Area''' menu. When connected to the CLC Genomics Server, you will be able to see all server data locations (the folders with a blue dot next to them) but not their content. You will only be able to see and use the content of your own data location (unless you explicitly ask us to change the permissions because you would like to share data with other users).
 
 
 
{|
 
|-
 
|[[Image:Clcwiki1.png|thumb|360px|Local data locations]]
 
|[[Image:Clcwiki2.png|thumb|360px|Login option in the "File" menu]]
 
|-
 
|[[Image:Clcwiki3.png|thumb|360px|Enter username, password, server host and port]]
 
|[[Image:Clcwiki4.png|thumb|360px|Server data locations]]
 
|}
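
If the login fails, it can help to first check whether the server host and port given above are reachable from your computer at all. The following is a minimal sketch, not part of the CLC software: the helper name <code>server_reachable</code> and the 5 second timeout are assumptions made for illustration, and the check only tests whether a TCP connection can be opened, not whether your account works.

```python
import socket

# Host and port are the values given in the login instructions above.
CLC_HOST = "clc01.hpc-lca.ethz.ch"
CLC_PORT = 7777


def server_reachable(host: str = CLC_HOST, port: int = CLC_PORT,
                     timeout: float = 5.0) -> bool:
    """Hypothetical helper: True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host name and opens the TCP connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers name-resolution failures, refused connections and timeouts.
        return False
```

Calling <code>server_reachable()</code> without arguments tests the documented host and port. A result of <code>False</code> would point to a network problem (for example, a connection from outside the university network), which is an assumption about the likely cause rather than something this page states.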
 
 
 
=== Connecting to the Server via the Web Interface ===
 
 
 
The CLC Genomics Server provides a web interface that allows users to connect to the server via their browser. The web interface supports user-oriented tasks such as browsing data, uploading and downloading data, accessing and editing metadata, and running data queries.
 
 
 
# Open a web browser
 
# Enter '''clc01.hpc-lca.ethz.ch:7777''' in the address field of your browser
 
# Enter your NETHZ username and password
 
 
 
{|
 
|-
 
|[[Image:Clcweb1.png|thumb|360px|Login screen of the web interface]]
 
|[[Image:Clcweb2.png|thumb|360px|Browsing data in the web interface]]
 
|-
 
|}
 
 
 
== Data Management ==
 
 
 
The '''data sets''' that users would like to use with the CLC Genomics Server installation on Euler need to be imported to the server before they can be used. Therefore we attach a '''server data location''' (one folder) to each CLC Genomics Server account that is created on Euler. Unless a user owns permanent storage space on Euler, the server data locations are considered scratch space that can be used for temporary storage of data and will be purged on a regular basis. After the jobs have finished, the results should be copied back to a local machine or any other storage location. Please note that there is no backup for these data sets.
 
 
 
In general, there are two different ways of importing data to a server data location. On the one hand, the '''data can be imported directly into the CLC Genomics Workbench client''' and then be moved to the server data location by drag-and-drop within the client. For this, one has to click on a file in the local CLC data location and move it to the server data location that is attached to each CLC Genomics Server account. On the other hand, mounting NAS shares from the IT services storage group has been tested on Euler and should work.
 
 
 
== Submitting Jobs from the CLC Genomics Workbench Client to Euler ==
 
 
 
As an example of how to submit a job from the CLC Genomics Workbench client to the Euler cluster, we use a BLAST search. All other tasks that can be run with the CLC Genomics Workbench client work the same way. In principle, '''there is a single difference between running a job on Euler and running it in the CLC Genomics Workbench client''': you have to '''choose the Grid option instead of Workbench''' and then choose a queue.
 
 
 
For CLC on Euler, we have several queues that range from 1 to 24 cores. Please '''be aware that not all of the applications of the CLC Genomics Server can make use of multiple cores'''. <font color="red">Only choose a queue with more than 1 core if the application you would like to use is listed here. Otherwise, please choose the 1 core queue</font>:
 
 
 
*Trim Sequences
 
*Create Alignment
 
*Map Reads to Reference
 
*De Novo Assembly
 
*RNA-Seq Analysis
 
*Probabilistic Variant Detection
 
*Create Sequencing QC Report
 
*Create Detailed Mapping Report
 
*BLAST
 
*Large Gap Read Mapper (currently in beta, part of the Transcript Discovery plug-in)
 
'''When setting up a BLAST search, you can set the number of threads to be used in the workbench. Please set this to 12 when using the 12 core queue.'''
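
The rule above can be written down as a small decision helper. This is only an illustrative sketch: the function name <code>cores_to_request</code> is hypothetical, the application names are copied from the list above, and the helper does not know which queue sizes actually exist between 1 and 24 cores, so the queue itself still has to be picked in the workbench.

```python
# Applications listed above as able to make use of multiple cores.
MULTICORE_APPLICATIONS = frozenset({
    "Trim Sequences",
    "Create Alignment",
    "Map Reads to Reference",
    "De Novo Assembly",
    "RNA-Seq Analysis",
    "Probabilistic Variant Detection",
    "Create Sequencing QC Report",
    "Create Detailed Mapping Report",
    "BLAST",
    "Large Gap Read Mapper",
})

# The queues range from 1 to 24 cores according to the text above.
MIN_CORES, MAX_CORES = 1, 24


def cores_to_request(application: str, wanted_cores: int) -> int:
    """Hypothetical helper: core count of the queue that should be chosen."""
    if not MIN_CORES <= wanted_cores <= MAX_CORES:
        raise ValueError(f"queues range from {MIN_CORES} to {MAX_CORES} cores")
    if application not in MULTICORE_APPLICATIONS:
        # Applications not on the list should use the 1 core queue.
        return 1
    return wanted_cores
```

For example, <code>cores_to_request("BLAST", 12)</code> returns 12, matching the advice to set the BLAST thread count to 12 on the 12 core queue, while an application that is not on the list is always mapped to the 1 core queue.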
 
 
 
 
 
 
 
{|
 
|-
 
|[[Image:Eulerclcsubjob1.png|thumb|360px|Click on data in server location and an application]]
 
|[[Image:Eulerclcsubjob2.png|thumb|360px|Choose '''CLC Server''' option]]
 
|-
 
|[[Image:Eulerclcsubjob3.png|thumb|360px|Job is submitted to cluster]]
 
|[[Image:Eulerclcsubjob4.png|thumb|360px|Job is queued]]
 
|-
 
|[[Image:Eulerclcsubjob5.png|thumb|360px|Job is running]]
 
|[[Image:Eulerclcsubjob6.png|thumb|360px|Job has finished, data can be copied back]]
 
|}
 
 
 
== Local BLAST Searches ==
 
Euler provides a local BLAST database, which is currently static; in the future it will be updated once a week from the NCBI reference. A local BLAST search is much faster than BLAST requests sent to the NCBI. At a later stage of the project, users will also be able to provide their own databases in addition to the BLAST ones.
 
 
 
== Documentation and Tutorials on the CLC Genomics Workbench ==
 
 
 
CLC Bio provides a variety of documentation and tutorials to help users get started:
 
* [http://www.clcbio.com/products/clc-genomics-workbench Main page]
 
* User manual, both [http://www.clcsupport.com/clcgenomicsworkbench/current online] and in [http://www.clcbio.com/files/usermanuals/CLC_Genomics_Workbench_User_Manual.pdf PDF format]
 
* [http://helpdesk.clcbio.com/index.php?pg=kb.book&id=7 FAQ]
 
* [http://www.clcbio.com/desktop-applications/top-features Features of the Genomics Workbench]
 
* [http://www.clcbio.com/support/tutorials Tutorials]
 
* [http://www.clcbio.tv/channel/629214 Video tutorials]
 

Latest revision as of 14:05, 14 December 2017

Introduction

CLC Genomics Server is a software solution for centralized bioinformatics analysis and sharing of data generated from all High-Throughput Sequencing platforms. It contains the same tools as the CLC Genomics Workbench, such as mapping of reads to a known reference, de novo assembly, and variant calling. With a single click within CLC Genomics Workbench, it is possible to offload resource-demanding tasks that could not be run in a desktop computer environment to an HPC cluster. Please find further information about the CLC Genomics Workbench on the sharepoint page of D-BIOL.

Current version of the server

10.0.0

Life time

2013-

Tutorial

Using the CLC genomics service