Juvenile Pacific cod will be reared in three temperatures under feeding and non-feeding conditions, then an integrated genomic approach will identify genes, gene variants, and epigenetic markers that respond to thermal stress and confer resilience. To complement the genomic approaches and further investigate temperature influences on energy resources, we will perform lipid analyses. This work will inform predictions of genetic selection and molecular response of Pacific cod in the Gulf of Alaska under climate change.
- Heatwave juvenile cod rearing experiment data: metadata-MHW-sample_collection_data.csv
- Temperature experiment sampling data: data/temp-experiment.csv
- Liver lipid data from Louise
- 2025 WGS: /volume1/web/nightingales/G_macrocephalus/ (30-1149633765 and 30-1149634506)
  - 30-1149633765: heatwave genetics study of juvenile cod spanning 2008-2023, plus six experimental fish that needed resequencing for the juvenile temperature study
  - 30-1149634506: pilot cod ecotype sequencing (shallow, deep)
The three main working directories are:
data
code
output
The name of every code document should start with a two-digit prefix (e.g., 01-temp-size-analysis.Rmd). All output from that code should go in a sub-directory of output with the same name as the code file. For example, the output of 01-temp-size-analysis.Rmd is written to output/01-temp-size-analysis.Rmd/
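The naming convention above can be checked mechanically. The following is a minimal sketch, not part of the repo: the function name `output_dir_for` and the regular expression are assumptions made for illustration. The pattern is deliberately relaxed to also accept sub-numbered prefixes such as 06.2- or 13.0.0-, which appear elsewhere in this README.

```python
import re
from pathlib import PurePosixPath

# Assumed pattern: two digits, optional sub-numbering (e.g. ".2" or ".0.0"),
# a hyphen, then the rest of the file name.
PREFIX_RE = re.compile(r"^\d{2}[\d.]*-.+")


def output_dir_for(code_file: str, output_root: str = "output") -> str:
    """Return the output sub-directory that matches a code file's name."""
    name = PurePosixPath(code_file).name
    if not PREFIX_RE.match(name):
        raise ValueError(f"{name!r} does not start with a numeric prefix")
    # The sub-directory is named exactly like the code file, extension included.
    return f"{output_root}/{name}/"


print(output_dir_for("code/01-temp-size-analysis.Rmd"))
```

A file without a numeric prefix raises `ValueError`, which makes the helper usable as a quick pre-commit check.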
The repo also contains the following additional directories:
general-notebooks # phenotype/growth analyses (e.g. 0-Phenotypes)
lcWGS # genetic assignment analysis of experimental fish (see lcWGS/README.md)
reports # project reports, presentations, and posters
Compute-heavy jobs assume UW Hyak (klone) paths; most bash chunks are set to eval = FALSE and are run manually rather than during knitting. Raw reads and references live off-repo on Owl and are referenced via variables defined in each notebook's setup chunk.
Fish (n=40 per treatment, N=160) were tagged prior to the experiment, and per-individual growth metrics were collected: specific growth rates based on wet weight (SGRww) and standard length (SGRsl) during the experimental treatments, and Fulton's condition index based on wet weight (Kwet) and hepatosomatic index (HSI) at treatment termination. See the 0-Phenotypes notebook for details.
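For orientation, the growth metrics named above can be sketched with their commonly used textbook definitions. This is an illustration only: the exact formulas, units, and scaling constants used in the 0-Phenotypes notebook may differ, and the function names are hypothetical. The sketch assumes weights in grams and lengths in centimetres.

```python
import math


def specific_growth_rate(initial: float, final: float, days: float) -> float:
    """SGR in percent per day: 100 * (ln(final) - ln(initial)) / days.

    Works for wet weight (SGRww) or standard length (SGRsl).
    """
    if initial <= 0 or final <= 0 or days <= 0:
        raise ValueError("initial, final, and days must all be positive")
    return 100.0 * (math.log(final) - math.log(initial)) / days


def fulton_k(wet_weight_g: float, standard_length_cm: float) -> float:
    """Fulton's condition index: 100 * W / L^3 (assumes g and cm)."""
    return 100.0 * wet_weight_g / standard_length_cm ** 3


def hepatosomatic_index(liver_weight_g: float, wet_weight_g: float) -> float:
    """HSI: liver weight as a percentage of whole-body wet weight."""
    return 100.0 * liver_weight_g / wet_weight_g


# Made-up measurements for a single fish, for illustration only.
print(specific_growth_rate(initial=5.0, final=8.0, days=28))
print(fulton_k(wet_weight_g=8.0, standard_length_cm=10.0))
print(hepatosomatic_index(liver_weight_g=0.2, wet_weight_g=8.0))
```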
The workflow for RNAseq data analysis uses the following steps. Visualizations of results can be viewed in the rendered .md files (e.g., 07-cod-RNAseq-DESeq2.md).
- Perform quality control (QC) and trimming on the raw reads (05-cod-RNAseq-trimming). QC was performed using FastQC/MultiQC, and reads were trimmed using Flexbar.
- Align trimmed reads to the reference transcriptome/genome and generate an estimate of transcript/gene abundance in the form of a gene-level counts matrix (06-cod-RNAseq-alignment, 06.2-cod-RNAseq-alignment-genome). Reads were pseudoaligned to a transcriptome using kallisto, and transcript abundances were summarized to gene-level counts using Trinity abundance_estimates_to_matrix.pl. We also aligned reads to a genome using Hisat2 and summarized to gene-level counts using featureCounts.
- Identify differentially expressed genes (DEGs), and generate associated visualizations (heatmap, volcano plot) (07-cod-RNAseq-DESeq2, 07.2.1-cod-RNAseq-DESeq2genome-exon, 07.2.2-cod-RNAseq-DESeq2-genome-gene). Differential expression analysis was performed with DESeq2.
- Annotate the reference transcriptome/genome to generate a database of transcript/gene IDs and associated gene ontology (GO) terms (03-transcriptome-annotation, 03.2-genome-annotation).
- Identify enriched GO terms (08-cod-RNAseq-GO-annotation, 08.2.1-cod-RNAseq-GO-annotation-genome-exon). Enrichment analysis was performed using DAVID.
An alternative differential expression analysis using edgeR is also available (13.0.0-RNAseq-edgeR).
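To illustrate what "summarized to gene-level counts" means in the alignment step above, here is a minimal sketch that sums transcript-level counts per gene. It is not the Trinity script's implementation, only the underlying idea; the function name, the transcript and gene IDs, and the count values are all hypothetical.

```python
def summarize_to_genes(transcript_counts, tx2gene):
    """Sum per-sample transcript counts into gene-level counts.

    transcript_counts: {transcript_id: [count for each sample]}
    tx2gene:           {transcript_id: gene_id}
    """
    gene_counts = {}
    for tx, counts in transcript_counts.items():
        gene = tx2gene.get(tx)
        if gene is None:
            continue  # skip transcripts with no gene mapping
        if gene not in gene_counts:
            gene_counts[gene] = [0.0] * len(counts)
        for i, count in enumerate(counts):
            gene_counts[gene][i] += count
    return gene_counts


# Hypothetical IDs and counts for two samples.
transcript_counts = {
    "tx1": [10.0, 2.0],
    "tx2": [5.0, 5.0],
    "tx3": [3.0, 4.0],
    "tx_unmapped": [1.0, 1.0],
}
tx2gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}

print(summarize_to_genes(transcript_counts, tx2gene))
```

The resulting gene-by-sample table has the shape that count-based differential expression tools such as DESeq2 or edgeR take as input.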
Experimental fish are assigned to known spawning populations based on low-coverage whole-genome sequencing (lcWGS) data from reference fish. The approach combines code from the AFSC lcWGS pipeline and the WGSassign pipeline. All steps are laid out in 01-lcWGS-WGSassign, with supporting files in the lcWGS/ directory. See lcWGS/README.md for details.
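The final assignment decision can be sketched as follows, assuming a log-likelihood of each individual's genotype data under each reference population is already available. This does not reproduce WGSassign's code or output format; the function name, the equal-prior assumption, and the population labels are illustrative only.

```python
import math


def assign_population(log_likelihoods):
    """Pick the most likely population and compute posterior probabilities.

    log_likelihoods: {population: log-likelihood of the individual's data}
    Assumes equal prior probability for every population.
    """
    max_ll = max(log_likelihoods.values())
    # Subtract the maximum before exponentiating to avoid underflow.
    weights = {pop: math.exp(ll - max_ll) for pop, ll in log_likelihoods.items()}
    total = sum(weights.values())
    posteriors = {pop: w / total for pop, w in weights.items()}
    best = max(posteriors, key=posteriors.get)
    return best, posteriors


# Hypothetical log-likelihoods for one experimental fish.
best, posteriors = assign_population(
    {"pop_A": -1002.0, "pop_B": -1000.0, "pop_C": -1010.0}
)
print(best, posteriors)
```

Reporting the posterior alongside the assigned population makes it easier to flag individuals whose assignment is ambiguous.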
Whole-genome bisulfite sequencing (WGBS) data are processed to characterize the DNA methylation landscape across treatments. The workflow:
- Prepare the bisulfite-converted reference genome with Bismark (16.2-genome-prep-klone, 16-bismark.job).
- Run the nf-core/methylseq pipeline via Nextflow on UW Hyak (klone), using 17.nextflow, 17.config, 17.samplesheet.csv (see setup notes in 17.notes).
- Characterize the genome-wide methylation landscape (18-methylation-landscape) and perform PCA on methylation profiles across samples (20-methylation-pca).
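As a rough illustration of the landscape step, the sketch below computes mean CpG methylation from Bismark-style coverage lines (tab-separated: chromosome, start, end, percent methylation, methylated count, unmethylated count). The function name, the minimum-coverage threshold of 5, and the example records are assumptions; the notebooks may filter and summarize differently.

```python
def mean_methylation(cov_lines, min_coverage=5):
    """Return (mean percent methylation, number of CpGs kept).

    Only CpGs covered by at least min_coverage reads are included.
    """
    total = 0.0
    kept = 0
    for line in cov_lines:
        line = line.strip()
        if not line:
            continue
        _chrom, _start, _end, _pct, n_meth, n_unmeth = line.split("\t")
        meth, unmeth = int(n_meth), int(n_unmeth)
        coverage = meth + unmeth
        if coverage < min_coverage:
            continue  # too few reads for a reliable estimate
        # Recompute the percentage from counts rather than trusting column 4.
        total += 100.0 * meth / coverage
        kept += 1
    mean = total / kept if kept else float("nan")
    return mean, kept


# Made-up coverage records for illustration.
example = [
    "chr1\t100\t100\t80.0\t8\t2",
    "chr1\t200\t200\t50.0\t1\t1",  # coverage 2, filtered out
    "chr2\t50\t50\t20.0\t2\t8",
]
print(mean_methylation(example))
```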
A dedicated track integrates blood RNA-seq and WGBS from the same individuals held at two temperatures (samples 1–36 @ 16 °C, 44–74 @ 0 °C) against GCF_031168955.1, to investigate epigenetic control of gene expression under thermal stress. The plan and notebook index are in 21-RNAseq-methylation.md. Workflow:
- Build sample sheets and the paired-sample set from the Owl listings (25-blood-sample-sheet).
- RNA-seq to gene-level DEGs (16 vs 0 °C) with HISAT2/featureCounts/DESeq2 (26-blood-RNAseq-align-DESeq2).
- WGBS to CpG coverage, DMCs/DMRs (methylKit), and per-gene methylation (27-blood-WGBS-methylation-DMR).
- Matched expression × methylation feature matrices (28-blood-integration-feature-matrices).
- Integration: genome-wide methylation↔expression, per-gene correlation, DEG×DMG overlap, joint PCA/DIABLO (29-blood-integration-analysis).
- Publication figures (30-blood-integration-figures).
- GO/functional enrichment of the DEG×DMG candidate genes (31-blood-integration-GO-enrichment).
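The paired-sample step can be sketched as an intersection of the sample IDs present in both data types, labelled by temperature using the ranges stated above (samples 1–36 at 16 °C, 44–74 at 0 °C). The function names, the label strings, and the example ID lists are hypothetical; the actual logic lives in 25-blood-sample-sheet.

```python
def treatment_for(sample_id: int) -> str:
    """Map a numeric sample ID to its temperature treatment."""
    if 1 <= sample_id <= 36:
        return "16C"
    if 44 <= sample_id <= 74:
        return "0C"
    raise ValueError(f"sample {sample_id} is outside the known ID ranges")


def paired_samples(rnaseq_ids, wgbs_ids):
    """Return (sample_id, treatment) for samples with both RNA-seq and WGBS."""
    shared = sorted(set(rnaseq_ids) & set(wgbs_ids))
    return [(sample_id, treatment_for(sample_id)) for sample_id in shared]


# Hypothetical ID lists, e.g. parsed from file names in the Owl listings.
print(paired_samples(rnaseq_ids=[1, 2, 44, 50], wgbs_ids=[2, 44, 60]))
```

Restricting downstream integration to this paired set keeps expression and methylation values matched by individual.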