Error in Cluster step #29

@wwood

Hi,

I'm trying out MetaCompass, using reads with simulated variants as input. I ran this yesterday, and both the git checkout and the database are up to date. Any ideas? Thanks. (I can supply the reads if needed.)

➜  docker build -t metacompass:2.0-beta .; docker run -it --rm -v `pwd`/RefSeq_V2_db:/data -v `pwd`/kicking_tyres:/input metacompass:2.0-beta
[+] Building 0.9s (17/17) FINISHED                                                                                                                             docker:default
 => [internal] load build definition from Dockerfile                                                                                                                     0.0s
 => => transferring dockerfile: 3.15kB                                                                                                                                   0.0s
 => [internal] load metadata for docker.io/library/ubuntu:22.04                                                                                                          0.8s
 => [internal] load .dockerignore                                                                                                                                        0.0s
 => => transferring context: 2B                                                                                                                                          0.0s
 => [ 1/13] FROM docker.io/library/ubuntu:22.04@sha256:1ec65b2719518e27d4d25f104d93f9fac60dc437f81452302406825c46fcc9cb                                                  0.0s
 => CACHED [ 2/13] RUN apt-get update && apt-get install -y     wget     curl     git     build-essential     ca-certificates     && rm -rf /var/lib/apt/lists/*         0.0s
 => CACHED [ 3/13] RUN wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O /tmp/miniconda.sh &&     bash /tmp/miniconda.sh -b -p /opt/conda &  0.0s
 => CACHED [ 4/13] RUN conda config --remove channels defaults &&     conda config --add channels conda-forge &&     conda config --add channels bioconda &&     conda   0.0s
 => CACHED [ 5/13] WORKDIR /opt                                                                                                                                          0.0s
 => CACHED [ 6/13] RUN git clone https://github.com/marbl/MetaCompass.git                                                                                                0.0s
 => CACHED [ 7/13] WORKDIR /opt/MetaCompass                                                                                                                              0.0s
 => CACHED [ 8/13] RUN conda tos accept --override-channels --channel https://repo.anaconda.com/pkgs/main                                                                0.0s
 => CACHED [ 9/13] RUN conda tos accept --override-channels --channel https://repo.anaconda.com/pkgs/r                                                                   0.0s
 => CACHED [10/13] RUN conda env create -f metacompass_environment.yml                                                                                                   0.0s
 => CACHED [11/13] RUN nextflow -version                                                                                                                                 0.0s
 => CACHED [12/13] RUN mkdir -p /opt/metacompass_db                                                                                                                      0.0s
 => CACHED [13/13] RUN echo '#!/bin/bash\nconda activate metacompass\nexec "$@"' > /usr/local/bin/entrypoint.sh &&     chmod +x /usr/local/bin/entrypoint.sh             0.0s
 => exporting to image                                                                                                                                                   0.0s
 => => exporting layers                                                                                                                                                  0.0s
 => => writing image sha256:ab05ac3157ebf63ec66989da1b798529208afe8b9866e35456a64c271c6a92ce                                                                             0.0s
 => => naming to docker.io/library/metacompass:2.0-beta                                                                                                                  0.0s
root@8a8796107568:/opt/MetaCompass# ls /input
sample_0_reads.1.fq.gz  sample_0_reads.2.fq.gz

root@8a8796107568:/opt/MetaCompass# nextflow run metacompass2.nf \
  --reference_db /data \
  --forward /input/sample_0_reads.1.fq.gz \
  --reverse /input/sample_0_reads.2.fq.gz \
  --output /output --threads 8

 N E X T F L O W   ~  version 25.04.6

Launching `metacompass2.nf` [serene_cajal] DSL2 - revision: f6e5b1e425

Output dir is /opt/MetaCompass/results
executor >  local (47)
[b5/08ebc8] filter_reads     [100%] 1 of 1 ✔
[45/5dec41] map_to_gene (39) [100%] 40 of 40 ✔
[b8/769d55] select_genomes   [100%] 1 of 1 ✔
[d2/67cac2] collect_refs     [100%] 1 of 1 ✔
[36/8e4e6c] SkaniTriangle    [100%] 1 of 1 ✔
[06/52a962] Cluster          [  0%] 0 of 1 ✘
[-        ] ConcatFasta      -
[a8/c035ff] IndexReads       [100%] 1 of 1 ✔
[-        ] ClusterIndex     -
[61/b76977] interleaveReads  [100%] 1 of 1 ✔
[-        ] reduceClusters   -
[-        ] refAssembly      -
[-        ] deNovoAssembly   -
[-        ] createOutputs    -

ERROR ~ Error executing process > 'Cluster'

Caused by:
  Process `Cluster` terminated with an error exit status (1)


Command executed:

  python3 "/opt/MetaCompass/scripts/cluster.py"  skani_matrix_stool.txt 5 .

Command exit status:
  1

Command output:
  (empty)

Command error:
  /opt/MetaCompass/scripts/cluster.py:27: SyntaxWarning: invalid escape sequence '\.'
    modified_items = re.match("(GCA_[0-9]+\.[0-9]+)", matched_genome_name).group(1)
  /opt/MetaCompass/scripts/cluster.py:103: SyntaxWarning: invalid escape sequence '\.'
    modified_items = [re.match("(GCA_[0-9]+\.[0-9]+)", item).group(1) for item in items]
  Traceback (most recent call last):
    File "/opt/MetaCompass/scripts/cluster.py", line 20, in <module>
      matched_genome = matching_files.read_text()
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/opt/conda/envs/metacompass/lib/python3.12/pathlib.py", line 1027, in read_text
      with self.open(mode='r', encoding=encoding, errors=errors) as f:
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/opt/conda/envs/metacompass/lib/python3.12/pathlib.py", line 1013, in open
      return io.open(self, mode, buffering, encoding, errors, newline)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  FileNotFoundError: [Errno 2] No such file or directory: 'matching_files.txt'

Work dir:
  /opt/MetaCompass/work/06/52a962b3f692671a69dba6153dc91a

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`

 -- Check '.nextflow.log' file for details
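
A side note on the log above: the two SyntaxWarnings come from `"\."` written in ordinary (non-raw) string literals, and they look separate from the actual failure, which is the FileNotFoundError for `matching_files.txt`. Below is a minimal sketch of how both spots could be made more robust. I have not seen the rest of `cluster.py`, so the helper names (`extract_accession`, `read_matching_files`) and the idea of reading the file from a given directory are my assumptions, not the script's real structure.

```python
import re
from pathlib import Path

# Raw string: "\." reaches the regex engine unchanged, so Python 3.12
# no longer emits "SyntaxWarning: invalid escape sequence".
ACCESSION_RE = re.compile(r"(GCA_[0-9]+\.[0-9]+)")


def extract_accession(name):
    """Return the leading GCA accession in `name`, or None if there is none.

    Hypothetical helper; the original code calls .group(1) directly, which
    raises AttributeError when the name does not start with a GCA accession.
    """
    match = ACCESSION_RE.match(name)
    return match.group(1) if match else None


def read_matching_files(work_dir):
    """Read matching_files.txt from `work_dir` (hypothetical helper).

    On failure, the error lists what the directory does contain, which makes
    a missing upstream output easier to spot.
    """
    path = Path(work_dir) / "matching_files.txt"
    if not path.is_file():
        present = sorted(p.name for p in Path(work_dir).iterdir())
        raise FileNotFoundError(f"{path} not found; directory contains: {present}")
    return path.read_text()
```

Calling `read_matching_files(".")` from inside the process work dir would at least show which files the `Cluster` step was actually handed. That would help tell whether an earlier step never produced `matching_files.txt` or whether it simply was not staged into the work dir.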

The Dockerfile (largely AI-written):

# MetaCompass Dockerfile
# Based on installation instructions from https://github.com/marbl/MetaCompass
FROM ubuntu:22.04

# Set non-interactive frontend to avoid prompts during package installation
ENV DEBIAN_FRONTEND=noninteractive

# Install system dependencies
RUN apt-get update && apt-get install -y \
    wget \
    curl \
    git \
    build-essential \
    ca-certificates \
    && rm -rf /var/lib/apt/lists/*

# Install Miniconda
ENV CONDA_DIR=/opt/conda
ENV PATH=$CONDA_DIR/bin:$PATH
RUN wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O /tmp/miniconda.sh && \
    bash /tmp/miniconda.sh -b -p $CONDA_DIR && \
    rm /tmp/miniconda.sh

# Set up conda
# RUN conda config --set always_yes yes --set changeps1 no && \
#     conda update -q conda

# Configure conda channels (the TOS is accepted further down, before the environment is created)
RUN conda config --remove channels defaults && \
    conda config --add channels conda-forge && \
    conda config --add channels bioconda && \
    conda config --set channel_priority strict

# Clone MetaCompass repository
WORKDIR /opt
RUN git clone https://github.com/marbl/MetaCompass.git

# Set working directory to MetaCompass
WORKDIR /opt/MetaCompass

# Accept the Anaconda TOS, then create the conda environment from the provided environment file
RUN conda tos accept --override-channels --channel https://repo.anaconda.com/pkgs/main
RUN conda tos accept --override-channels --channel https://repo.anaconda.com/pkgs/r
RUN conda env create -f metacompass_environment.yml

# Make RUN commands use the new environment
SHELL ["conda", "run", "-n", "metacompass", "/bin/bash", "-c"]

# Verify Nextflow installation
RUN nextflow -version

# Create directory for reference database
RUN mkdir -p /opt/metacompass_db

# Set environment variables
ENV CONDA_DEFAULT_ENV=metacompass
ENV CONDA_PREFIX=$CONDA_DIR/envs/metacompass
ENV PATH=$CONDA_PREFIX/bin:$PATH

# Create a script to activate the conda environment
RUN echo '#!/bin/bash\n\
conda activate metacompass\n\
exec "$@"' > /usr/local/bin/entrypoint.sh && \
    chmod +x /usr/local/bin/entrypoint.sh

# Set entrypoint to activate conda environment
# ENTRYPOINT ["/usr/local/bin/entrypoint.sh"]

# Default command
CMD ["/bin/bash"]

# Add labels for documentation
LABEL maintainer="MetaCompass Docker Image" \
      description="Docker image for MetaCompass v2.0-beta - Reference-guided Assembly of Metagenomes" \
      version="2.0-beta" \
      source="https://github.com/marbl/MetaCompass"

# Expose any necessary ports (if needed for web interfaces)
# EXPOSE 8080

# Notes for users:
# 1. To download the pre-built reference database (16GB), run:
#    wget https://obj.umiacs.umd.edu/metacompass-db/RefSeq_V2_db.tar.gz
#    tar -xzf RefSeq_V2_db.tar.gz
#
# 2. To run MetaCompass:
#    nextflow run metacompass2.nf \
#      --reference_db /path/to/RefSeq_V2_db \
#      --forward /path/to/forward_reads.fastq \
#      --reverse /path/to/reverse_reads.fastq \
#      --output /path/to/output_directory \
#      --threads 8
#
# 3. Hardware requirements:
#    - 90GB+ disk space for normal installation
#    - 8GB+ memory (16GB recommended) for Pilon error correction
