fix: remove STAR protrude params#261

Open
kelly-sovacool wants to merge 5 commits into main from iss-260

Conversation

@kelly-sovacool

Member

Changes

Remove the STAR parameters --alignEndsProtrude and --peOverlapNbasesMin, which can cause reads to protrude over transcript boundaries. This was noticed by a user with a custom genome containing HPV, which has very short transcripts.

Issues

fixes #260

PR Checklist

(Strikethrough any points that are not applicable.)

  • This comment contains a description of changes with justifications, with any relevant issues linked.
  • [ ] Update docs if there are any API changes.
  • Update CHANGELOG.md with a short description of any user-facing changes and reference the PR number. Guidelines: https://keepachangelog.com/en/1.1.0/

@github-actions github-actions Bot added the RENEE RepoName label Feb 20, 2026
@codecov

codecov Bot commented Feb 20, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 87.50%. Comparing base (550b5a6) to head (34cc0c3).
⚠️ Report is 10 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #261   +/-   ##
=======================================
  Coverage   87.50%   87.50%           
=======================================
  Files           1        1           
  Lines          56       56           
=======================================
  Hits           49       49           
  Misses          7        7           

@kelly-sovacool

Member Author

Waiting on confirmation from Bahman Afsari that this fixed his error.

@kelly-sovacool kelly-sovacool requested a review from Copilot April 6, 2026 17:13

Copilot AI left a comment

Contributor

Pull request overview

This PR aims to prevent RSEM quantification failures on custom references with short transcripts (e.g., HPV) by removing STAR alignment parameters that can allow alignments to protrude past transcript boundaries.

Changes:

  • Removed --alignEndsProtrude / --peOverlapNbasesMin from some STAR invocations in the single-end and paired-end Snakemake rules.
  • Added a changelog entry describing the user-facing fix and linking to the PR.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| workflow/rules/single-end.smk | Removes protrude-related STAR params for the star_basic path. |
| workflow/rules/paired-end.smk | Removes protrude-related STAR params in specific STAR calls (but not all relevant branches). |
| CHANGELOG.md | Documents the fix as a user-facing change. |

Comment on lines 318 to 323
--sjdbGTFfile {params.gtffile} \
--limitSjdbInsertNsj {params.nbjuncs} \
--quantMode TranscriptomeSAM GeneCounts \
--outSAMtype BAM SortedByCoordinate \
--alignEndsProtrude 10 ConcordantPair \
--peOverlapNbasesMin 10 \
--outTmpDir=${{tmp}}/STARtmp_{wildcards.name} \
--sjdbOverhang ${{readlength}}

Copilot AI Apr 6, 2026

This change removes --alignEndsProtrude / --peOverlapNbasesMin for the star_basic branch, but the same parameters are still present in the multi-sample 2-pass STAR path in this file (e.g., in star1p at ~493–494 and star2p at ~613–614). That means the RSEM “hung over the end of transcript” failure can still occur when options.star_2_pass_basic is false. Please remove these parameters consistently from the other STAR invocations as well (or gate them behind a config option if they’re still needed in some modes).
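
If the parameters do need to stay available in some modes, the gating could look roughly like the sketch below. It is only an illustration: the helper `star_protrude_flags` and the config key `star_allow_protrude` are hypothetical names, not part of this pipeline. The flag values (`10 ConcordantPair` and `10`) are the ones currently in the rules.

```python
def star_protrude_flags(allow_protrude: bool, n_bases: int = 10) -> str:
    """Return the protrude-related STAR flags, or an empty string when disabled.

    Hypothetical helper: the name and the default of 10 bases mirror the
    values seen in the existing rules, not an established pipeline API.
    """
    if not allow_protrude:
        return ""
    return (
        f"--alignEndsProtrude {n_bases} ConcordantPair "
        f"--peOverlapNbasesMin {n_bases}"
    )


# Hypothetical config layout; `star_allow_protrude` is an assumed key.
config = {"options": {"star_allow_protrude": False}}

# Default to False so the flags are off unless a user opts in.
flags = star_protrude_flags(config["options"].get("star_allow_protrude", False))
print(repr(flags))
```

In a rule, the returned string could be exposed as a `params` entry and interpolated into the STAR command, so every STAR invocation would share one switch instead of hard-coding the flags in each branch.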

Comment on lines 474 to 480
--readFilesIn {input.file1} {input.file2} \
--readFilesCommand zcat \
--runThreadN {threads} \
--outFileNamePrefix {params.prefix}. \
--outSAMtype BAM Unsorted \
--alignEndsProtrude 10 ConcordantPair \
--peOverlapNbasesMin 10 \
--sjdbGTFfile {params.gtffile} \
--outTmpDir=${{tmp}}/STARtmp_{wildcards.name} \

Copilot AI Apr 6, 2026

The protrude-related STAR params are removed in this hunk, but --alignEndsProtrude 10 ConcordantPair and --peOverlapNbasesMin 10 are still present in the star_basic command used when options.star_2_pass_basic is enabled (see same file around ~384–385). If users run paired-end with star_2_pass_basic: true, they may still hit the RSEM boundary issue described in #260. Please remove these parameters from the star_basic STAR call as well (and consider also whether the Arriba STAR call at ~689 should keep --peOverlapNbasesMin).
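
To confirm that no STAR call was missed, a small scan of the rule files may help. The sketch below assumes the rules live in `workflow/rules` as `.smk` files (as in the files touched by this PR); the function names `find_flag_lines` and `scan_rules` are hypothetical. It uses a plain substring match, so it would also report flags that appear in comments.

```python
from pathlib import Path

# The two STAR flags this PR is meant to remove.
PROTRUDE_FLAGS = ("--alignEndsProtrude", "--peOverlapNbasesMin")


def find_flag_lines(text, flags=PROTRUDE_FLAGS):
    """Return (line_number, flag) for each line of text that mentions a flag."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for flag in flags:
            if flag in line:
                hits.append((lineno, flag))
    return hits


def scan_rules(rules_dir="workflow/rules"):
    """Scan every .smk file directly under rules_dir; map file path to hits."""
    report = {}
    for path in sorted(Path(rules_dir).glob("*.smk")):
        hits = find_flag_lines(path.read_text())
        if hits:
            report[str(path)] = hits
    return report


if __name__ == "__main__":
    # Print one "file:line: flag" entry per remaining occurrence.
    for path, hits in scan_rules().items():
        for lineno, flag in hits:
            print(f"{path}:{lineno}: {flag}")
```

Run from the repository root, an empty output would suggest the flags are gone from all rule files; any remaining entries point at STAR calls that still need a decision, including the Arriba call mentioned above.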

@kelly-sovacool

Member Author

Bahman confirmed this version works for him.

@kelly-sovacool kelly-sovacool added this to the 2026-04 milestone Apr 17, 2026

Labels

RENEE RepoName

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Fragment is hung over the end of the transcript

2 participants