Introduction#
When automating a pipeline, you might want to take advantage of Snakemake, a simple yet powerful workflow system in which you can mix Python and shell.
Installation#
Load a Python 3 module and install snakemake. Make sure that $HOME/.local/bin is in your $PATH variable, because pip3 install --user places the snakemake executable there.
install snakemake (bash)
module load Python/3.10.13 # or any more recent 3 version
pip3 install --user snakemake
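The $PATH requirement above can be checked and, if needed, fixed with a few lines of shell. This is a minimal sketch, assuming a POSIX-compatible shell such as bash; the function name ensure_local_bin_in_path is hypothetical and not part of any tool.

```shell
# Hypothetical helper: prepend $HOME/.local/bin to PATH only if it is missing.
ensure_local_bin_in_path() {
    case ":$PATH:" in
        *":$HOME/.local/bin:"*)
            ;;  # already present, nothing to do
        *)
            PATH="$HOME/.local/bin:$PATH"
            export PATH
            ;;
    esac
}

ensure_local_bin_in_path
```

The change only lasts for the current shell session; to keep it, the same lines can be added to a startup file such as ~/.bashrc.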
In most cases, you should make a virtual environment first:
install snakemake (bash)
module load Python/3.10.13 # or any more recent 3 version
cd /pasteur/appa/scratch/<yourloginname>
virtualenv mysnake
source mysnake/bin/activate
pip3 install snakemake snakemake-executor-plugin-slurm
Example 1#
One of the typical tasks is cleaning the sequencing samples and identifying their contents. Here is one way of doing it.
- Put your samples in a subdirectory ./data. Your samples can be .bam or .fastq, single reads or paired reads. Single-read samples have to be called .bam; paired-read samples have to be called R1 .fastq and R2 .fastq. The suffix will be ignored.
- Let us assume that we will be working with the sample mydata.bam.
- Clone the pipeline repository with git clone git@gitlab.pasteur.fr:kpetrov/metal.git (this will create a sub-directory called metal, containing a Snakefile.py file).
- Run:
snakemake (bash)
snakemake -s metal/Snakefile.py --executor=slurm --jobs 16 sample_2bLCA_lca.tsv
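Here 16 is the maximum number of jobs to run at the same time. Replace sample in the target name with the name of your sample, e.g. mydata_2bLCA_lca.tsv for mydata.bam.
- Snakemake will show the number of jobs it thinks it will do and inform you of the progress.
- When completed, you should have the following files: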
files (bash)
mydata_2bLCA_lca.tsv
mydata_report
mydata_2bLCA_tophit_aln
mydata_2bLCA_tophit_report
mydata/
In the sub-directory mydata you will find all the intermediate files and the corresponding log files. The main log file is in the .snakemake/logs subdirectory.
- Do not forget to clean that up once you no longer need it.
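The final cleanup could be scripted along these lines. This is only a sketch under assumptions: clean_sample is a hypothetical helper name, it assumes you run it from the directory where you launched snakemake, and it assumes that "cleaning up" means deleting the intermediate directory of the sample and the .snakemake/logs directory while keeping the result files.

```shell
# Hypothetical helper: remove the intermediate directory of one sample
# and the Snakemake log directory, leaving the final result files in place.
clean_sample() {
    # Refuse an empty name so that rm -rf is never run on the current directory.
    if [ -z "$1" ]; then
        echo "usage: clean_sample <sample>" >&2
        return 1
    fi
    rm -rf -- "./$1" ./.snakemake/logs
}

# Example (assumes the sample from this page): clean_sample mydata
```

Check that you have copied everything you need out of the intermediate directory before running it, since the deletion cannot be undone.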
Example 2#
You can find another useful example in the AlphaFold manual. That example processes every sample file you have.
Do not forget to set the executor to 'slurm' (--executor=slurm).