Requirements
pyANI-plus relies on several other programs, packages, and tools, both for normal use and for development. Many of these dependencies are installed automatically during setup, but some may need to be downloaded and installed separately. This page lists all required dependencies and explains the role each one plays.
Python3
pyANI-plus is designed to run on Python3, taking advantage of its latest features and improvements. It is not compatible with Python2. Python3 is required for both installation and development.
NCBI-BLAST+
ANIb analysis, which calculates Average Nucleotide Identity using BLASTn+, compares genome sequences with the BLAST+ tool provided by NCBI.
MUMmer
For ANIm (Average Nucleotide Identity using MUMmer) analysis, genome sequences are compared using the nucmer tool from the MUMmer package. The same package provides the dnadiff command to compare and analyze genome sequences.
A key difference between the nucmer and dnadiff methods lies in how intermediate alignments are generated. dnadiff uses the --maxmatch parameter (all anchor matches, regardless of their uniqueness) and -m (many-to-many) alignments to replicate the results reported by the dnadiff wrapper. In contrast, ANIm uses the --mum parameter (anchor matches that are unique in both the reference and query) by default, with the possibility of using the --maxmatch parameter, and the -1 parameter of delta-filter, to generate 1-to-1 alignments. A further difference is that dnadiff uses intermediate processed output from mummer that identifies a golden path through the alignment; nucmer does not use this golden path.
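The contrast between the two methods can be sketched as the command lines each one would build. This is an illustration only, not pyANI-plus's own code: the function names, file names, and output prefixes are hypothetical, and it assumes only the nucmer options --mum, --maxmatch, and -p (output prefix) and the delta-filter options -1 and -m.

```python
from __future__ import annotations

from pathlib import Path


def nucmer_command(ref: Path, query: Path, prefix: str, mode: str = "mum") -> list[str]:
    """Build a nucmer command line; mode is "mum" or "maxmatch" (hypothetical helper)."""
    if mode not in {"mum", "maxmatch"}:
        raise ValueError(f"unknown nucmer mode: {mode}")
    # -p sets the output prefix, so nucmer writes <prefix>.delta
    return ["nucmer", f"--{mode}", "-p", prefix, str(ref), str(query)]


def delta_filter_command(delta: Path, one_to_one: bool) -> list[str]:
    """Build a delta-filter command line: -1 for 1-to-1, -m for many-to-many."""
    return ["delta-filter", "-1" if one_to_one else "-m", str(delta)]


ref, query = Path("ref.fasta"), Path("query.fasta")  # placeholder file names

# ANIm-style: unique anchors (--mum), then 1-to-1 filtering
anim_steps = [
    nucmer_command(ref, query, "anim_out", mode="mum"),
    delta_filter_command(Path("anim_out.delta"), one_to_one=True),
]

# dnadiff-style: all anchors (--maxmatch), then many-to-many filtering
dnadiff_steps = [
    nucmer_command(ref, query, "dnadiff_out", mode="maxmatch"),
    delta_filter_command(Path("dnadiff_out.delta"), one_to_one=False),
]

for step in anim_steps + dnadiff_steps:
    print(" ".join(step))
```

The commands are built as argument lists so they could be passed to subprocess.run without shell quoting. delta-filter writes the filtered alignment to standard output, so a real call would also need to redirect that output to a file.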
sourmash
For sourmash analysis (Average Nucleotide Identity using sourmash), genome sequences are compared using the sourmash tool.
fastANI
For fastANI analysis (Average Nucleotide Identity using fastANI), genome sequences are compared using the fastANI tool.
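Because the external tools described above may need to be installed separately, it can help to confirm they are on the PATH before starting an analysis. The sketch below uses Python's standard shutil.which. The executable names are assumptions about a typical installation and may differ on your system; find_tools is a hypothetical helper, not part of pyANI-plus.

```python
import shutil

# Assumed executable names for the external tools; adjust for your installation
EXTERNAL_TOOLS = ("blastn", "nucmer", "delta-filter", "dnadiff", "fastANI", "sourmash")


def find_tools(names=EXTERNAL_TOOLS):
    """Map each executable name to its full path, or None if it is not on the PATH."""
    return {name: shutil.which(name) for name in names}


for name, path in find_tools().items():
    print(f"{name:<14}{path or 'NOT FOUND'}")
```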
SQLite3
The output generated by pyani-plus analyses is stored in a local database using SQLite3, for rapid querying and recovery. This allows for persistent storage of results without the need to keep the original alignment files, and for incremental addition of new analyses. SQLite3 is installed with Python.
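The idea of persistent, incrementally extended storage can be illustrated with Python's built-in sqlite3 module. The table layout, column names, genome names, and identity values below are invented for the example and do not reflect the actual pyANI-plus schema; an in-memory database stands in for a database file.

```python
import sqlite3

# In-memory database for illustration; a real run would pass a file path instead
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE comparison (
        query_genome   TEXT NOT NULL,
        subject_genome TEXT NOT NULL,
        method         TEXT NOT NULL,
        identity       REAL,
        PRIMARY KEY (query_genome, subject_genome, method)
    )
    """
)


def add_results(conn, rows):
    """Store comparison rows; rows already in the table are left untouched."""
    conn.executemany("INSERT OR IGNORE INTO comparison VALUES (?, ?, ?, ?)", rows)
    conn.commit()


# First run: two made-up ANIm comparisons
add_results(conn, [("A", "B", "ANIm", 0.98), ("B", "A", "ANIm", 0.97)])

# Later run: one repeated comparison (ignored) and one new comparison (added)
add_results(conn, [("A", "B", "ANIm", 0.98), ("A", "B", "fastANI", 0.975)])

# Results are recovered by querying, without any of the original alignment files
anim_rows = conn.execute(
    "SELECT query_genome, subject_genome, identity FROM comparison "
    "WHERE method = ? ORDER BY query_genome",
    ("ANIm",),
).fetchall()
total = conn.execute("SELECT COUNT(*) FROM comparison").fetchone()[0]
print(anim_rows, total)
```

The composite primary key together with INSERT OR IGNORE is one simple way to make repeated runs add only the comparisons that are not yet stored.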
snakemake
By building on snakemake, we maintain a single interface for defining and managing workflows. This allows us to standardize job execution across different environments without needing separate scheduling logic for local, cluster, or cloud execution.
Python Packages
pyANI-plus depends on several other Python packages, and we gratefully acknowledge their contribution:
Matplotlib: for graphical output
intervaltree: for identification of overlaps
Seaborn: for graphical output
NetworkX: for graph calculations and representation
Numpy: for matrix calculations
Pandas: for dataframe operations
SQLAlchemy: for interaction with SQLite3
rich: for progress bars and other CLI elements for user interaction
Development
We rely on a number of additional packages to aid pyani-plus development. If you set up a development environment as recommended in the contributing page, the following Python packages will be installed, or will be expected to be present:
coverage: to generate code coverage output for the codecov.io service
pre-commit: to manage and run pre-commit hooks that enforce code quality and formatting before commits
pytest: to manage and run automated testing
pytest-cov: to integrate pytest with codecov and coverage
pytest-xdist: to enable parallel test execution with pytest, improving test runtime efficiency
Ruff: a Python linter that enforces coding style and helps catch potential issues
types-tqdm: to provide type hints for tqdm, improving type checking and IDE support