Question: Are two reads on opposite strands with the same mapping location regarded as duplicates? #67

@zztin

Hi Andrew,

I have a nanopore sequencing library in which one molecule can produce reads on both the forward and reverse strands. I would like to know whether preseq regards such pairs as "distinct" observations or as "repeated" observations.

The definition of a "distinct" observation is not clearly stated in the documentation, so I tried to find it in the implementation:

Related lines:

curr_gr.get_start() != prev_gr.get_start())

Related class: GenomicRegion.hpp

It seems that get_start() returns the smaller (leftmost) coordinate of a BAM read. Is that right?
If so, does that mean that forward- and reverse-mapped reads at the same location are treated as the same molecule (same 'start')? That would be the desired behavior in my use case.
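
To make the question concrete, here is a minimal sketch of the two behaviors I am trying to tell apart. It is not preseq's code: the `Read` struct and both function names are hypothetical, and I am assuming that `start` is the leftmost mapped coordinate regardless of strand and that the reads are sorted by position, as in a sorted BAM file.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical stand-in for a mapped read (NOT preseq's GenomicRegion).
struct Read {
  std::string chrom;
  std::size_t start;  // assumed: leftmost mapped coordinate on either strand
  char strand;        // '+' or '-'
};

// Behavior A: compare only chromosome and start, so strand is ignored.
// Expects reads sorted by (chrom, start).
std::size_t count_distinct_ignoring_strand(const std::vector<Read> &reads) {
  assert(std::is_sorted(reads.begin(), reads.end(),
                        [](const Read &a, const Read &b) {
                          return std::tie(a.chrom, a.start) <
                                 std::tie(b.chrom, b.start);
                        }));
  std::size_t distinct = 0;
  for (std::size_t i = 0; i < reads.size(); ++i) {
    if (i == 0 || reads[i].chrom != reads[i - 1].chrom ||
        reads[i].start != reads[i - 1].start)
      ++distinct;
  }
  return distinct;
}

// Behavior B: strand is part of a read's identity.
// Takes a copy and sorts it so that equal keys are adjacent.
std::size_t count_distinct_with_strand(std::vector<Read> reads) {
  auto key = [](const Read &r) {
    return std::tie(r.chrom, r.start, r.strand);
  };
  std::sort(reads.begin(), reads.end(),
            [&](const Read &a, const Read &b) { return key(a) < key(b); });
  std::size_t distinct = 0;
  for (std::size_t i = 0; i < reads.size(); ++i) {
    if (i == 0 || key(reads[i]) != key(reads[i - 1])) ++distinct;
  }
  return distinct;
}
```

For a forward and a reverse read that both start at the same position, behavior A counts one distinct observation and behavior B counts two. Behavior A is what I am hoping preseq does.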

I would love to hear from you!

The command I used:

./preseq lc_extrap -B -o ./yield-estimates.txt <input-sorted-bam>

Software version:
v2.0 (downloaded precompiled binary)

Best,
Li-Ting
