Introduction#

MEGAHIT is a de novo assembler for NGS metagenomics data. It is an effective alternative to alignment-based tools such as DIAMOND and Bowtie. Here we measure how long it takes to assemble a paired-end read set with MEGAHIT version 1.2.9.

Methodology#

We run MEGAHIT both as a single job and as several concurrent jobs on two CPU generations:

Rome: dual EPYC 7552 (48 cores each, 96 cores in total) with 512GB DDR4 RAM combined

Turin: single EPYC 9655 (96 cores) with 768GB DDR5 RAM

Code Block (bash)

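# Load the benchmarked MEGAHIT version
module load megahit/1.2.9
# Paired-end input reads (R1 = forward, R2 = reverse)
r1=/pasteur/appa/scratch/test_scal/test_PE_R1.fastq
r2=/pasteur/appa/scratch/test_scal/test_PE_R2.fastq
# XX is the thread count under test (24, 32, 48 or 96); it must be set before this line
megahit -1 "${r1}" -2 "${r2}" -o megahit_output --num-cpu-threads "${XX}" --min-contig-len 100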
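
The concurrent runs (for example 24 threads x 4 jobs) can be sketched as a small wrapper around the command above. This is only an illustration of the idea, not the script used for the benchmark: the variables JOBS, THREADS and DRY_RUN, the function run_job and the megahit_output_<i> directory names are all assumptions. Each job gets its own output directory because MEGAHIT refuses to start when the directory given to -o already exists.

```shell
#!/usr/bin/env bash
# Sketch: start several MEGAHIT assemblies at the same time.
# JOBS, THREADS, DRY_RUN and the output directory names are illustrative
# assumptions, not part of the original benchmark setup.

r1=/pasteur/appa/scratch/test_scal/test_PE_R1.fastq
r2=/pasteur/appa/scratch/test_scal/test_PE_R2.fastq

JOBS=${JOBS:-4}          # number of concurrent assemblies
THREADS=${THREADS:-24}   # threads given to each assembly
DRY_RUN=${DRY_RUN:-1}    # 1 = only print the commands, 0 = really run them

run_job() {
  # Build the command for job number $1, with a separate output directory
  local i=$1
  local cmd=(megahit -1 "${r1}" -2 "${r2}" -o "megahit_output_${i}"
             --num-cpu-threads "${THREADS}" --min-contig-len 100)
  if [ "${DRY_RUN}" = "1" ]; then
    echo "${cmd[*]}"
  else
    "${cmd[@]}"
  fi
}

for i in $(seq 1 "${JOBS}"); do
  run_job "${i}" &       # each job runs in the background
done
wait                     # block until every background job has finished
```

With the defaults this only prints four command lines; setting DRY_RUN=0 (after module load megahit/1.2.9) would launch them. JOBS x THREADS should not exceed the 96 cores available on either machine.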

Results#

Threads (x concurrent jobs)   Turin     Rome       RAM
96                            10h28m    26h        26G
48                            11h35m    21h47m     26G
24                            17h34m    36h        26G
24x4                          21h32m    too long   4x26G
32x3                          17h       31h        3x26G
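
To compare the two machines, the single-job timings above can be converted to minutes and divided. The helper below is a minimal sketch of that arithmetic; the function names to_minutes and ratio are made up for this example, and only the three single-job rows of the table are used.

```shell
#!/usr/bin/env bash
# Sketch: Rome/Turin wall-time ratio for the single-job rows of the table.
# to_minutes and ratio are hypothetical helper names used for illustration.

to_minutes() {
  # Convert a duration such as "10h28m", "21h47" or "26h" to minutes
  local t=$1 h=0 m=0
  if [[ $t =~ ^([0-9]+)h([0-9]+)?m?$ ]]; then
    h=${BASH_REMATCH[1]}
    m=${BASH_REMATCH[2]:-0}
  fi
  # 10# forces base 10 so that values like "08" are not read as octal
  echo $(( 10#$h * 60 + 10#$m ))
}

ratio() {
  # Print $1 / $2 with two decimals
  LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# Rows: threads, Turin time, Rome time (copied from the results table)
for row in "96 10h28m 26h" "48 11h35m 21h47" "24 17h34m 36h"; do
  read -r threads turin rome <<< "${row}"
  echo "${threads} threads: Rome/Turin = $(ratio "$(to_minutes "${rome}")" "$(to_minutes "${turin}")")"
done
```

The ratios come out at roughly 2.5 for 96 threads, 1.9 for 48 threads and 2.0 for 24 threads.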

Conclusions#

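Turin is about twice as fast as Rome (for example 11h35m versus 21h47m at 48 threads). Do not use more than 48 threads on either machine: on Turin, going from 48 to 96 threads saves only about an hour (10h28m versus 11h35m), and on Rome 96 threads are slower than 48 (26h versus 21h47m). Use 48 threads on Rome so that the job fits into the 24h queue.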