Metaerg

0. Introduction

Metaerg is an automated pipeline that uses third-party software and a large database to annotate genomes or sets of bins from microbial ecosystems. Examples of such annotation tasks are feature prediction with HMMs, BLAST and DIAMOND.

Learn more about Metaerg in its GitHub repository and in the accompanying article published by Frontiers, which was written by the authors of Metaerg.


1. Getting Started

1.1 Installation

1.2 Required Tools and Libraries

Perl Modules

Dependencies
Archive::Extract
Bio::Perl
Bio::DB::EUtilities
DBD::SQLite
DBI
File::Copy::Recursive
HTML::Entities
LWP::Protocol::https
SWISS::Entry

Table 1. Perl modules required to run Metaerg

Dependencies            Req. version    Version on Elja
antismash               ≥6.0.0          7.0.0
ARAGORN                 x               1.2.41
minced                  x               0.4.2
BLAST+ executables      x               2.13.0
DIAMOND                 2.0.13          2.0.13
GenomeTools             x               1.6.2
HMMER                   3.x.x           3.3.2
Infernal                x               1.1.4
prodigal                x               2.6.3
pyarrow                 x               12.0.0
Python                  x               3.10.4
RepeatMasker            x               4.1.4
RepeatScout             x               1.0.6
signalp                 x               0.5b
tmhmm                   x               2.0c
Tandem Repeats Finder   x               4.09.1

Table 2. Main dependencies required to install Metaerg
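
When installing Metaerg outside Elja, it can help to check which of the tools in Table 2 are already available on your PATH. The following is a minimal sketch, not part of Metaerg itself: missing_tools is a hypothetical helper, and the executable names in the example call are assumptions about how these tools are commonly installed, so adjust them to your system.

```shell
# Hypothetical helper (not part of Metaerg): print every command from the
# argument list that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
}

# Example call. The executable names are assumptions and may differ
# between installations:
# missing_tools aragorn minced blastn diamond hmmsearch cmsearch prodigal
```

An empty output means that every listed command was found. The sketch only checks that a command exists; it does not compare versions against the table.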

2. Run Metaerg on Elja

2.1 Loading Metaerg

Before you can run Metaerg on Elja, you have to load the Metaerg module. To do so, type the following commands in the terminal:

ml use /hpcapps/lib-mimir/modules/all
ml load Metaerg

2.2 Running Metaerg

To run Metaerg, call metaerg with the appropriate parameters, which are listed in the Metaerg documentation. An example Metaerg run looks like this:

[..] $ metaerg --contig_file dir-with-contig-files --database_dir /AlphaFoldData/MetaergData/

Note that --database_dir /AlphaFoldData/MetaergData/ is always required, as this is the location of the Metaerg database on Elja that the pipeline relies on.
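
Because the database path is the same for every run, the command can be wrapped in a small shell function. This is a minimal sketch under stated assumptions: run_metaerg, METAERG_DB and DRY_RUN are hypothetical names introduced here for illustration, and only the --contig_file and --database_dir options and the database path are taken from the example above.

```shell
# Hypothetical wrapper (not part of Metaerg). The database location is the
# Elja path given above; METAERG_DB is an illustrative variable name.
METAERG_DB="${METAERG_DB:-/AlphaFoldData/MetaergData/}"

run_metaerg() {
    contigs="$1"
    if [ -z "$contigs" ]; then
        echo "usage: run_metaerg <dir-with-contig-files>" >&2
        return 2
    fi
    # Fail early if the input path does not exist.
    if [ ! -e "$contigs" ]; then
        echo "error: '$contigs' does not exist" >&2
        return 1
    fi
    # With DRY_RUN=1 the command is only printed, not executed.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "metaerg --contig_file $contigs --database_dir $METAERG_DB"
        return 0
    fi
    metaerg --contig_file "$contigs" --database_dir "$METAERG_DB"
}
```

After loading the Metaerg module, a run would then be started with run_metaerg dir-with-contig-files. Setting DRY_RUN=1 in front of the call prints the command first, so it can be checked before launching a long annotation job.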