Introduction#
Megahit is a de novo assembler for NGS metagenomics data. It is an effective alternative to diamond and bowtie. Here we look at how long it takes to assemble a paired-end read set with megahit version 1.2.9.
Methodology#
We run both single and concurrent jobs on two CPU platforms:
Rome (dual EPYC 7552, 48 cores each, 512GB DDR4 RAM in total) and
Turin (single EPYC 9655, 96 cores, 768GB DDR5 RAM).
```bash
module load megahit/1.2.9
r1=/pasteur/appa/scratch/test_scal/test_PE_R1.fastq
r2=/pasteur/appa/scratch/test_scal/test_PE_R2.fastq
# XX is the thread count under test (24, 32, 48 or 96)
megahit -1 "${r1}" -2 "${r2}" -o megahit_output --num-cpu-threads "${XX}" --min-contig-len 100
```
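The concurrent runs (for example 4 jobs of 24 threads each) could be launched along the following lines. This is only a sketch: the per-job output directory names, the log file names and the `DRY_RUN` switch are assumptions made for illustration and are not taken from the benchmark itself. Each job gets its own output directory because megahit refuses to write into an existing one.

```bash
#!/usr/bin/env bash
# Sketch: start N megahit jobs with T threads each and wait for all of them.
# Assumed for illustration: output/log naming and the DRY_RUN switch.

r1=/pasteur/appa/scratch/test_scal/test_PE_R1.fastq
r2=/pasteur/appa/scratch/test_scal/test_PE_R2.fastq

# Print the megahit command line for job number $1 using $2 threads.
megahit_cmd() {
    echo "megahit -1 ${r1} -2 ${r2} -o megahit_output_$1 --num-cpu-threads $2 --min-contig-len 100"
}

# Launch $1 jobs with $2 threads each in the background, then wait.
run_concurrent() {
    jobs_total=$1
    threads=$2
    i=1
    while [ "$i" -le "$jobs_total" ]; do
        cmd=$(megahit_cmd "$i" "$threads")
        if [ "${DRY_RUN:-0}" = 1 ]; then
            # Only show what would be run.
            echo "$cmd"
        else
            (
                start=$(date +%s)
                # Unquoted on purpose: split the command string into words.
                $cmd > "megahit_${i}.log" 2>&1
                echo "job ${i}: $(( $(date +%s) - start )) s" >> "megahit_${i}.log"
            ) &
        fi
        i=$((i + 1))
    done
    wait
}

# Dry run of the 24x4 configuration: prints the four command lines.
DRY_RUN=1 run_concurrent 4 24
```

Dropping `DRY_RUN=1` starts the jobs for real; the elapsed wall-clock time of each job is appended to its log file.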
Results#
| Threads (x concurrent runs) | Turin | Rome | RAM |
|---|---|---|---|
| 96 | 10h28m | 26h | 26G |
| 48 | 11h35m | 21h47m | 26G |
| 24 | 17h34m | 36h | 26G |
| 24x4 | 21h32m | too long | 4x26G |
| 32x3 | 17h | 31h | 3x26G |
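To compare the two machines row by row, the wall-clock times of the single runs can be converted to minutes and divided. The helper names `to_min` and `speedup` below are hypothetical and exist only for this illustration; the timings are copied from the table.

```bash
#!/usr/bin/env bash
# Sketch: Rome/Turin wall-time ratio for the single-run rows of the table.

# Convert a duration such as "10h28m" or "26h" to minutes.
to_min() {
    echo "$1" | LC_ALL=C awk -F'h' '{ sub(/m$/, "", $2); print $1 * 60 + $2 }'
}

# speedup TURIN_TIME ROME_TIME: how many times longer Rome takes than Turin.
speedup() {
    LC_ALL=C awk -v t="$(to_min "$1")" -v r="$(to_min "$2")" \
        'BEGIN { printf "%.2f\n", r / t }'
}

while read -r threads turin rome; do
    echo "${threads} threads: Rome/Turin = $(speedup "$turin" "$rome")"
done <<'EOF'
96 10h28m 26h
48 11h35m 21h47m
24 17h34m 36h
EOF
# 96 threads: Rome/Turin = 2.48
# 48 threads: Rome/Turin = 1.88
# 24 threads: Rome/Turin = 2.05
```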
Conclusions:#
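Turin is roughly twice as fast as Rome at the same thread count. Do not use more than 48 threads on either machine: on Rome, 96 threads is slower than 48 (26h vs 21h47m), and on Turin it saves only about an hour (10h28m vs 11h35m). Use 48 threads on Rome to fit into the 24h queue.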