Requirements

pyANI-plus relies on several other programs, packages, and tools for normal running and for development. While many of these dependencies are installed automatically during setup, some may need to be downloaded and installed separately. This page provides a list of all required dependencies, along with explanations of their roles and why they are used.

Python3

pyANI-plus runs on Python3 only and is not compatible with Python2. Python3 is required both for installation and for development.

NCBI-BLAST+

ANIb analysis (Average Nucleotide Identity using BLAST) compares genome sequences using the blastn tool from the BLAST+ suite provided by NCBI.

MUMmer

For ANIm (Average Nucleotide Identity using MUMmer) analysis, genome sequences are compared using the nucmer tool from the MUMmer package. The same package provides the dnadiff command, which is also used to compare and analyze genome sequences.

A key difference between the ANIm and dnadiff methods lies in how intermediate alignments are generated. To replicate the results reported by the dnadiff wrapper, the dnadiff method uses the --maxmatch parameter (all anchor matches, regardless of their uniqueness) and -m (many-to-many) alignments. In contrast, ANIm uses the --mum parameter (anchor matches that are unique in both the reference and query) by default, with the option of using --maxmatch instead, and the -1 parameter of delta-filter to generate 1-to-1 alignments. A further difference is that dnadiff uses intermediate processed output from MUMmer that identifies a golden path through the alignment; ANIm does not use this golden path.
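The parameter differences described above can be sketched as command lines. This is a minimal illustration only: the function names, file names, and output prefix are hypothetical, and the exact commands that pyANI-plus assembles internally may differ. The flags themselves (--mum, --maxmatch and -p for nucmer; -1 and -m for delta-filter) are standard MUMmer options. The commands are built but not executed.

```python
def anim_style_commands(ref, query, prefix, maxmatch=False):
    """Build nucmer and delta-filter commands for an ANIm-style comparison.

    Hypothetical helper: shows the flags only, not pyANI-plus's own code.
    """
    # --mum: anchors unique in both reference and query (the default here)
    # --maxmatch: all anchor matches, regardless of uniqueness
    anchor_mode = "--maxmatch" if maxmatch else "--mum"
    nucmer = ["nucmer", anchor_mode, "-p", prefix, ref, query]
    # -1: keep 1-to-1 alignments; delta-filter writes its result to stdout
    delta_filter = ["delta-filter", "-1", f"{prefix}.delta"]
    return [nucmer, delta_filter]


def dnadiff_style_commands(ref, query, prefix):
    """Build commands mirroring the settings attributed to the dnadiff wrapper."""
    nucmer = ["nucmer", "--maxmatch", "-p", prefix, ref, query]
    # -m: keep many-to-many alignments
    delta_filter = ["delta-filter", "-m", f"{prefix}.delta"]
    return [nucmer, delta_filter]


for command in anim_style_commands("ref.fna", "query.fna", "ref_vs_query"):
    print(" ".join(command))
```

Each command is returned as a list of arguments so that it could be passed directly to subprocess.run without shell quoting concerns.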

sourmash

For sourmash (Average Nucleotide Identity using sourmash) analysis, genome sequences are compared using the sourmash tool.

fastANI

For fastANI (Average Nucleotide Identity using fastANI) analysis, genome sequences are compared using the fastANI tool.
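Because the external tools described so far may need to be installed separately, it can be useful to check which of them are visible on the PATH. The following is a minimal sketch, not part of pyANI-plus itself: the helper name is hypothetical, and the executable names are assumptions based on how the upstream tools are usually distributed.

```python
import shutil

# Executable names are assumptions based on the usual upstream distributions
EXTERNAL_TOOLS = {
    "NCBI-BLAST+": ["blastn", "makeblastdb"],
    "MUMmer": ["nucmer", "delta-filter", "dnadiff"],
    "sourmash": ["sourmash"],
    "fastANI": ["fastANI"],
}


def find_missing_tools(tools=EXTERNAL_TOOLS):
    """Return a dict mapping each package to its executables not found on PATH."""
    missing = {}
    for package, executables in tools.items():
        absent = [exe for exe in executables if shutil.which(exe) is None]
        if absent:
            missing[package] = absent
    return missing


for package, executables in find_missing_tools().items():
    print(f"{package}: not found on PATH: {', '.join(executables)}")
```

An empty result means every listed executable was found; it does not confirm that the installed versions are suitable.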

SQLite3

The output generated by pyANI-plus analyses is stored in a local database using SQLite3, for rapid querying and recovery. This allows for persistent storage of results without the need to keep the original alignment files, and for incremental addition of new analyses. SQLite3 support is included with Python.
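To illustrate why a local database suits incremental analyses, here is a minimal sketch using Python's built-in sqlite3 module. The table layout, column names, and helper functions are hypothetical and do not reflect the actual pyANI-plus schema; the identity value is a made-up placeholder.

```python
import sqlite3


def open_results(path=":memory:"):
    """Open (or create) a results database with a hypothetical schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS comparisons (
               query_genome TEXT NOT NULL,
               subject_genome TEXT NOT NULL,
               method TEXT NOT NULL,
               identity REAL,
               PRIMARY KEY (query_genome, subject_genome, method)
           )"""
    )
    return conn


def add_comparison(conn, query, subject, method, identity):
    """Record one pairwise result; an existing row is left untouched."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO comparisons VALUES (?, ?, ?, ?)",
            (query, subject, method, identity),
        )


def missing_pairs(conn, genomes, method):
    """List the (query, subject) pairs not yet stored for a method."""
    rows = conn.execute(
        "SELECT query_genome, subject_genome FROM comparisons WHERE method = ?",
        (method,),
    )
    done = set(rows)
    return [(q, s) for q in genomes for s in genomes if (q, s) not in done]


conn = open_results()
add_comparison(conn, "genomeA", "genomeB", "ANIm", 0.98)  # placeholder value
print(missing_pairs(conn, ["genomeA", "genomeB"], "ANIm"))
```

Because completed comparisons are stored persistently, a later run only needs to compute the pairs that missing_pairs reports, which is the idea behind incremental addition.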

snakemake

pyANI-plus builds on the snakemake workflow manager, which gives us a single interface for defining and managing workflows. This allows us to standardize job execution across different environments without needing separate scheduling logic for local, cluster, or cloud execution.

Python Packages

pyANI-plus depends on several other Python packages, and we gratefully acknowledge their contribution:

  • Matplotlib: for graphical output
  • intervaltree: for identification of overlaps
  • Seaborn: for graphical output
  • NetworkX: for graph calculations and representation
  • NumPy: for matrix calculations
  • Pandas: for dataframe operations
  • SQLAlchemy: for interaction with SQLite3
  • rich: provides progress bars for user interaction and other CLI elements
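Although these packages are normally installed automatically, a quick check that they can be imported may help when diagnosing a broken environment. The sketch below is illustrative only: the helper name is hypothetical, and the import names are assumed to be the lowercase module names under which these packages are normally imported.

```python
from importlib.util import find_spec

# Import names assumed for the packages listed above
REQUIRED_MODULES = [
    "matplotlib",
    "intervaltree",
    "seaborn",
    "networkx",
    "numpy",
    "pandas",
    "sqlalchemy",
    "rich",
]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in modules if find_spec(name) is None]


absent = missing_modules()
if absent:
    print("Missing Python packages:", ", ".join(absent))
else:
    print("All listed Python packages were found.")
```

find_spec locates a module without importing it, so the check stays fast even for large packages.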

Development

We rely on a number of additional packages to aid pyANI-plus development. If you set up a development environment as recommended in the contributing page, then the following Python packages will be installed, or are expected to be present:

  • coverage: to generate code coverage output for the codecov.io service
  • pre-commit: to manage and run pre-commit hooks that enforce code quality and formatting before commits
  • pytest: to manage and run automated testing
  • pytest-cov: to integrate pytest with codecov and coverage
  • pytest-xdist: to run tests in parallel with pytest, reducing test runtime
  • Ruff: to lint Python code, enforcing coding style and helping to catch potential issues
  • types-tqdm: to provide type hints for tqdm, improving type checking and IDE support