CLC Assembly Cell
CLC Assembly Cell is a high-performance computing solution for reference assembly of Next Generation Sequencing data.
The command-line interface of CLC Assembly Cell makes it easy to include its functionality in scripts and other Next Generation Sequencing workflows.
CLC Assembly Cell uses SIMD instructions to parallelize and accelerate the assembly algorithms, making the program the fastest Next Generation Sequencing assembler at present.
Some of the main features of CLC Assembly Cell are:
- Reference assembly of Illumina Genome Analyzer, SOLiD, and 454 sequencing data
- Support for both gapped and ungapped alignments when doing short read assemblies
- Support for assembly of paired end reads
- Fast analysis of raw data, including reporting
- Option to combine data from different sources in the same analysis (including data generated by different kinds of sequencing technologies)
- Extraction of data from part(s) of an assembly. Examples are extraction of contigs and reads from an area of interest, or extraction (exclusion) of data from a specific sequencing lane that is suspected not to be of acceptable quality.
- Detection of variations (SNP detection)
- Support for the input file formats FASTA, SFF, and GenBank
- A number of output options, including tables with assembly info
- A "graphical" (ASCII art :-)) assembly viewer to get a quick overview
- Full integration with CLC Genomics Workbench. Output data from CLC Assembly Cell can be imported and further analyzed in CLC Genomics Workbench.
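
To make two of the features above more concrete (the ASCII art assembly view and SNP detection), here is a small Python sketch of the general idea. It is not CLC Assembly Cell code and does not reflect its algorithms, output format, or command-line interface. The function names `render_pileup` and `candidate_variants`, the `min_support` threshold, and the toy data are all hypothetical. It assumes ungapped reads that are already placed at known offsets and lie fully within the reference.

```python
from collections import Counter


def render_pileup(reference, placed_reads):
    """Return an ASCII view: the reference on top, each read indented to its offset.

    Bases that agree with the reference are drawn as '.', mismatches as the read base.
    Assumes every read lies fully within the reference (illustrative simplification).
    """
    lines = [reference]
    for start, read in placed_reads:
        row = "".join(
            "." if reference[start + i] == base else base
            for i, base in enumerate(read)
        )
        lines.append(" " * start + row)
    return "\n".join(lines)


def candidate_variants(reference, placed_reads, min_support=2):
    """List (position, reference_base, alternative_base, read_count) tuples.

    A mismatch is reported only if at least min_support reads show the same
    alternative base at that position (hypothetical, deliberately naive rule).
    """
    counts = Counter()
    for start, read in placed_reads:
        for i, base in enumerate(read):
            if reference[start + i] != base:
                counts[(start + i, base)] += 1
    return sorted(
        (pos, reference[pos], alt, n)
        for (pos, alt), n in counts.items()
        if n >= min_support
    )


# Toy data: (0-based start offset, read sequence); two reads carry a T at position 5.
reference = "ACGTACGTACGT"
placed_reads = [(0, "ACGTAC"), (2, "GTATGT"), (4, "ATGTAC"), (6, "GTACGT")]

if __name__ == "__main__":
    print(render_pileup(reference, placed_reads))
    print(candidate_variants(reference, placed_reads))
```

Real variant detection also has to account for base qualities, coverage, and gapped alignments. The sketch only counts reads that agree on a mismatch.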