FunctionLab · j-funk · Feb 27, 2026 · Mar 26, 2026 · Mar 27, 2026 · Mar 31, 2026
diff --git a/docs/extended-universe.rst b/docs/extended-universe.rst
@@ -0,0 +1,131 @@
+==================================
+Extended universe (legacy mode)
+==================================
+
+The ``extended_universe`` URL parameter reproduces the cross-organism
+term-enrichment behavior used in
+:doc:`Functional Module Detection <modules>` and
+:doc:`Tissue-specific Networks <functional-networks>` on GIANT
+networks between February 2024 and April 2026. New analyses default
+to human-only; MAGE networks have always used a human-only annotation
+universe and the flag does not apply to them.
+
+When to use it
+==============
+
+Use ``extended_universe=true`` only when you need to reproduce a result
+from a publication, figure, or saved link generated between February
+2024 and April 2026.
+
+What it changes
+===============
+
+Term enrichment in :doc:`Functional Module Detection <modules>` uses
+a one-sided Fisher's exact test followed by Benjamini–Hochberg
+correction. Two inputs to that test depend on which annotation
+universe is in effect:
+
+* **Term size (K).** The number of genes annotated to the term.
+* **Background universe (N).** The total number of genes considered
+  available for annotation.
+
+In the current default mode, both K and N are computed from
+human annotations only. In extended-universe mode, K and N include
+annotations from non-human organisms that were carried over from
+the source databases: mouse (*Mus musculus*), zebrafish (*Danio
+rerio*), fruit fly (*Drosophila melanogaster*), nematode worm
+(*Caenorhabditis elegans*), and budding yeast (*Saccharomyces
+cerevisiae*).
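As a rough illustration of why the universe matters, the sketch below evaluates the one-sided (upper-tail) hypergeometric P value for a single term under two hypothetical universes. The helper name `upper_tail_p`, the module size, the overlap, and both (K, N) pairs are invented for this example and are not HumanBase values; only the test itself comes from the text above.

```python
from math import comb


def upper_tail_p(k, N, K, n):
    """P(X >= k) when n genes are drawn from a universe of N genes, K of them annotated."""
    upper = min(K, n)  # the overlap cannot exceed the term size or the module size
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)


# Hypothetical module: 50 genes, 8 of them annotated to the term.
n_module, k_overlap = 50, 8

# Hypothetical (K, N) pairs; real values depend on the loaded annotation tables.
p_human = upper_tail_p(k_overlap, N=18_000, K=120, n=n_module)
p_extended = upper_tail_p(k_overlap, N=60_000, K=700, n=n_module)

print(f"human-only universe: P = {p_human:.3e}")
print(f"extended universe:   P = {p_extended:.3e}")
```

With these made-up numbers the same overlap is less surprising in the extended universe, because the term covers a larger share of the background. The direction and size of the change depend entirely on the actual K and N.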
+
+A note on Q values
+------------------
+
+Functional Module Detection now derives term-enrichment Q values from a
+one-sided Fisher's exact test (the upper-tail probability
+``hypergeom.sf(k - 1)``), as described in Krishnan et al. (2016),
+"Genome-wide prediction and functional characterization of the genetic
+basis of autism spectrum disorder", *Nature Neuroscience*. From
+November 2017 through April 2026, the calculation instead used the
+point probability of the exact observed overlap
+(``hypergeom.pmf(k)``).
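The difference between the two calculations can be sketched with the standard library alone. The function names and the numbers below are assumptions made for illustration; they mirror the quantities that ``hypergeom.pmf(k)`` and ``hypergeom.sf(k - 1)`` denote in the text, not the actual HumanBase code.

```python
from math import comb


def point_probability(k, N, K, n):
    """P(X == k): the legacy calculation, analogous to hypergeom.pmf(k)."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def upper_tail_probability(k, N, K, n):
    """P(X >= k): the current calculation, analogous to hypergeom.sf(k - 1)."""
    return sum(point_probability(i, N, K, n) for i in range(k, min(K, n) + 1))


# Hypothetical inputs: universe N, term size K, module size n, overlap k.
N, K, n, k = 18_000, 120, 50, 8

legacy = point_probability(k, N, K, n)
current = upper_tail_probability(k, N, K, n)
print(f"point probability P(X == {k}) = {legacy:.3e}")
print(f"upper tail        P(X >= {k}) = {current:.3e}")
```

The upper tail always includes the point probability as its first term, so for the same inputs the legacy value can never exceed the current one.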
+
+``extended_universe=true`` restores the point-probability calculation
+along with the cross-organism universe, so any result produced between
+February 2024 and April 2026 can be reproduced exactly. Results from
+before February 2024 will have matching test statistics, but term-size
+and universe definitions will differ from what this flag restores. If
+you need to reproduce a result from before February 2024, please get
+in touch and we will help you recover matching values.
+
+A note on data versioning
+-------------------------
+
+The ``extended_universe`` flag restores the legacy *code path*, but
+it cannot, on its own, restore the legacy *data state*. HumanBase
+imports gene records and term annotations from external sources
+(NCBI, Gene Ontology, MSigDB, MeSH, and others) on its own
+schedule. Two quantities used by the hypergeometric test drift
+between releases independently of any term-release version label:
+
+* the **background universe (N)**: the number of distinct genes with
+  at least one annotation, counted across the loaded organisms; and
+* each **term size (K)**: the number of distinct genes annotated
+  to a given term.
+
+In practice this means a community page rerun today with
+``extended_universe=true`` will reproduce the exact statistical
+calculation used in the legacy code path, but the inputs N and K
+reflect today's annotation tables, not the tables present at the
+time of the original run. Q values can therefore differ from the
+original run, by an amount that depends on how much the underlying
+data has changed since.
+
+This is a known limitation. Long-term reproducibility is on the
+roadmap: the plan is to pin gene and term snapshots per HumanBase
+release so that the data state itself is versioned.
+
+Where it applies
+================
+
+The flag is supported only for **GIANT** networks (the original
+human tissue and biological-process networks). MAGE network analyses
+have used a human-only annotation pipeline from the start, so there
+is no legacy cross-organism behavior to reproduce. Requests
+that combine ``extended_universe=true`` with a MAGE network are
+rejected by the API.
+
+The flag applies to the GIANT version of:
+
+* :doc:`Functional module detection <modules>` — term enrichment for
+  each detected community.
+* :doc:`Tissue-specific networks <functional-networks>` — annotated
+  term tables on gene pages (Process and Tissue tabs).
+
+It does not affect network edge weights, gene-prediction scores, or
+any non-enrichment output.
+
+How to use it
+=============
+
+Append ``extended_universe=true`` to the URL of a community page or
+gene page. For example:
+
+* Functional module detection result::
+
+    https://humanbase.io/module/overview/?body_tag=<job_id>&extended_universe=true
+
+* Gene page::
+
+    https://humanbase.io/gene/3553/blood?extended_universe=true
+
+The interface displays a banner whenever the flag is active, so it
+is always clear whether a page is in extended-universe or
+default mode. On a non-GIANT page the flag is ignored and removed
+from the address bar, and a warning banner explains that the request
+fell back to human-only data.
+
+Programmatic access
+===================
+
+The same parameter is forwarded to the underlying API endpoints
+(``/community/`` and ``/terms/annotated/``). Scripts replaying
+historical analyses through the API should append
+``&extended_universe=true`` to the query string.
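For scripts, one way to add the parameter safely is to rewrite the query string instead of concatenating text, so the flag is never duplicated and a URL without an existing query still gets a ``?``. The helper name `with_extended_universe` and the job id ``abc123`` are hypothetical; the URLs follow the page examples given earlier.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_extended_universe(url):
    """Return url with extended_universe=true set exactly once in its query string."""
    parts = urlsplit(url)
    # Drop any existing extended_universe value so the flag is not duplicated.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "extended_universe"
    ]
    query.append(("extended_universe", "true"))
    return urlunsplit(parts._replace(query=urlencode(query)))


# "abc123" stands in for a real job id.
module_url = with_extended_universe("https://humanbase.io/module/overview/?body_tag=abc123")
gene_url = with_extended_universe("https://humanbase.io/gene/3553/blood")
print(module_url)
print(gene_url)
```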
diff --git a/docs/functional-networks.rst b/docs/functional-networks.rst
@@ -1,24 +1,31 @@
 Tissue-specific Networks
 ===========================
-In order to leverage the vast collections of raw, noisy genomic data, they must be integrated, summarized, and presented in a biologically informative manner. We provide a means of mining tens of thousands of whole-genome experiments by way of functional interaction networks. Each interaction network represents a body of data, probabilistically weighted and integrated, focused on a particular tissue or process context. 
+To leverage the vast collections of raw, noisy genomic data, the data must be integrated, summarized, and presented in a biologically informative manner. We provide a means of mining tens of thousands of whole-genome experiments by way of functional interaction networks. Each interaction network represents a body of data, weighted and integrated, focused on a particular tissue, cell, or process context.
 
-It is important to consider gene relationships within a tissue context as the precise actions of genes are frequently dependent on their tissue context, and human diseases result from the disordered interplay of tissue- and cell lineage–specific processes. These factors combine to make the understanding of tissue-specific gene functions, disease pathophysiology and gene-disease associations particularly challenging.
+It is important to consider gene relationships within a tissue or cell type as the precise actions of genes are frequently dependent on their context, and human diseases result from the disordered interplay of tissue- and cell lineage–specific processes. These factors combine to make the understanding of tissue-specific gene functions, disease pathophysiology and gene-disease associations particularly challenging.
 
 Tissue-specific network construction is described in the following publication: Greene, C. S., Krishnan, A., Wong, A. K., Ricciotti, E., Zelaya, R. A., Himmelstein, D. S., ... & Troyanskaya, O. G. (2015). `Understanding multicellular function and disease with human tissue-specific networks <https://www.nature.com/articles/ng.3259>`_. Nature Genetics.
 
 Method
 ---------------------------
-Briefly, functional integration relies on the construction of process-specific functional relationship networks. These are interaction networks in which each node represents a gene, each edge a functional relationship, and an edge between two genes is probabilistically weighted based on experimental evidence relating to those genes. We integrate evidence from many data sets, with each data set weighted in a process-specific manner. 
+Briefly, functional integration relies on the construction of process-specific functional relationship networks. These are interaction networks in which each node represents a gene and each edge a functional relationship, with the weight of an edge between two genes being a probability based on experimental evidence relating to those genes. We integrate evidence from many data sets, with each data set weighted in a process-specific manner.
 
-One naïve Bayesian classifier is trained per biological area of interest (e.g. a tissue, or a specific biological process), using the appropriate gold standard for the biological context in addition to one global process-unaware classifier trained using the complete gold standard. Each classifier consisted of a class node predicting the binary presence or absence of a functional relationship (FR) between two genes and n nodes conditioned on FR, each representing the value of a data set.
+For GIANT, one naïve Bayesian classifier is trained per biological area of interest (e.g. a tissue, or a specific biological process), using the appropriate gold standard for the biological context, in addition to one global process-unaware classifier trained using the complete gold standard. Each classifier consists of a class node predicting the binary presence or absence of a functional relationship (FR) between two genes and n nodes conditioned on FR, each representing the value of a data set.
 
-Parameter regularization is performed as described in `Steck and Jaakkola (2002) <https://proceedings.neurips.cc/paper_files/paper/2002/file/1819932ff5cf474f4f19e7c7024640c2-Paper.pdf>`_ using mutual information between data sets to estimate a strength of prior belief for each data set. While a large amount of shared information does not guarantee a redundant data set, since the same subset of information could be shared many times, it provides a valuable quantitative estimate of data set uniqueness. 
+Parameter regularization is performed as described in Steck and Jaakkola (2002) using mutual information between data sets to estimate a strength of prior belief for each data set. While a large amount of shared information does not guarantee a redundant data set, since the same subset of information could be shared many times, it provides a valuable quantitative estimate of data set uniqueness.
+
+MAGE constructs networks in two stages.
+In stage 1 (representation learning), each dataset is converted into a gene graph with edges derived from coexpression or protein/gene interactions. MAGE trains a masked graph autoencoder that hides a fraction of edges and learns to reconstruct them using information from neighboring genes in the graph. The decoder outputs a reconstruction probability for each gene pair, which serves as dataset-level evidence for functional relatedness.
+
+In stage 2 (context-specific integration), MAGE learns a tissue- or cell-type-specific mapping from dataset-level evidence to a functional relationship probability. This supervised model is trained using a tissue- or cell-type-specific functional gold standard derived from Gene Ontology biological process relationships together with tissue expression patterns. The output is a tissue- or cell-type-specific functional network where each edge weight is the predicted probability that two genes participate in shared biological processes in that context. 
 
 Data integration
 ---------------------------
-We collected and integrated 987 genome-scale data sets encompassing approximately 38,000 conditions from an estimated 14,000 publications including both expression and interaction measurements. To integrate these data, we automatically assess each data set for its relevance to each of 144 tissue- and cell lineage–specific functional contexts. The resulting functional maps provide a detailed portrait of protein function and interactions in specific human tissues and cell lineages ranging from B lymphocytes to the renal glomerulus and the whole brain. This approach allows us to profile the specialized function of genes in a high-throughput manner, even in tissues and cell lineages for which no or few tissue-specific data exist.
+GIANT integrates 987 genome-scale data sets encompassing approximately 38,000 conditions from an estimated 14,000 publications including both expression and interaction measurements. To integrate these data, we automatically assess each data set for its relevance to each of 144 tissue- and cell lineage-specific functional contexts. The resulting functional maps provide a detailed portrait of protein function and interactions in specific human tissues and cell lineages ranging from B lymphocytes to the renal glomerulus and the whole brain. This approach allows us to profile the specialized function of genes in a high-throughput manner, even in tissues and cell lineages for which no or few tissue-specific data exist.
 
-* Gene co-expression: All gene expression data sets are from NCBI's Gene Expression Omnibus (GEO). Genes with more than 30% of values missing were removed, and remaining missing values were imputed using ten nearest neighbors. Non-log-transformed data sets were log transformed. Expression measurements were summarized to Entrez identifiers, and duplicate identifiers were merged. The Pearson correlation was calculated for each gene pair, normalized with Fisher's z transform, mean subtracted and divided by the standard deviation. 
+MAGE integrates 7,463 genome-scale datasets representing more than 250,000 experiments across multiple data types. These include protein–protein interaction resources, transcription factor binding motif information, perturbation and microRNA target profiles, and large collections of gene expression studies. Each dataset is processed into a graph representation, and the full collection of dataset-level edge evidence is then integrated into 289 tissue and cell-type networks. 
+
+* Gene co-expression: All gene expression data sets are from NCBI's Gene Expression Omnibus (GEO) for GIANT and refine.bio for MAGE. Genes with more than 30% of values missing were removed, and remaining missing values were imputed using ten nearest neighbors. Non-log-transformed data sets were log transformed. Expression measurements were summarized to Entrez identifiers, and duplicate identifiers were merged. The Pearson correlation was calculated for each gene pair, normalized with Fisher's z transform, mean subtracted and divided by the standard deviation. 
 
 * Protein-interaction: Interaction data are collected from BioGRID, IntAct, MINT, and MIPS.
 
@@ -29,6 +36,7 @@ We collected and integrated 987 genome-scale data sets encompassing approximatel
 
 Evidence
 ---------------------------
+For GIANT:
 The "evidence" for an edge is measured as the contribution or "influence" of each dataset on the posterior classification probability. Each dataset contribution is calculated as the posterior probability of a functional relationship given only that dataset, minus the prior probability.
 
 Contribution of dataset D to an edge functional relationship prediction (FR)::
@@ -37,11 +45,20 @@ Contribution of dataset D to an edge functional relationship prediction (FR)::
 
 Note that the contributions will not sum to 1.0, as each contribution is measured separately. Generally, individual gene expression datasets will not contribute much to the posterior probability but cumulatively can make a significant contribution.
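The prose definition of a GIANT dataset contribution (posterior given only that dataset, minus the prior) can be sketched with Bayes' rule for a single observed dataset value. This is an illustrative reading of the sentence above, not the HumanBase implementation; the function name, the prior, and the likelihood pairs are all hypothetical.

```python
def dataset_contribution(prior, p_obs_given_fr, p_obs_given_not_fr):
    """Posterior P(FR | D) from a single data set D, minus the prior P(FR)."""
    evidence = p_obs_given_fr * prior + p_obs_given_not_fr * (1.0 - prior)
    posterior = p_obs_given_fr * prior / evidence
    return posterior - prior


prior_fr = 0.05  # hypothetical prior probability of a functional relationship

# Hypothetical likelihoods of the observed value: (given FR, given no FR).
observed = {
    "coexpression_study": (0.30, 0.20),
    "interaction_screen": (0.60, 0.05),
    "uninformative_study": (0.25, 0.25),
}

contributions = {
    name: dataset_contribution(prior_fr, given_fr, given_not_fr)
    for name, (given_fr, given_not_fr) in observed.items()
}
for name, value in contributions.items():
    print(f"{name}: {value:+.4f}")
```

Because each contribution is computed in isolation, there is no reason for the values to sum to 1.0, and a dataset whose observed value is equally likely with or without a functional relationship contributes nothing.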
 
+For MAGE:
+In each tissue- or cell-type-specific MAGE network, an edge between genes *u* and *v* is assigned a single score produced by the stage 2 (context-specific integration) gradient-boosting integration model (XGBoost). Each gene pair is represented by a 7,463-dimensional feature vector (one feature per dataset) derived from the stage 1 (representation learning) masked-edge reconstruction probabilities, and the boosting model maps these features to a predicted score between 0 and 1, where the score represents the probability of a functional relationship in that context.
+
+The final network edge weight is the predicted score::
+
+    edge_weight(u, v) ∈ [0, 1]
+
+Higher values indicate a higher predicted probability that the two genes participate in a functional relationship in the selected tissue or cell type.  
+
+
 Example
 ---------------------------
 
 IL1B in blood vessel
 ~~~~~~~~~~~~~~~~~~~~~~~~~
 We examined and experimentally verified the tissue-specific molecular response of blood vessel cells to stimulation by IL-1β (IL1B), a pro-inflammatory cytokine. We anticipated that the genes most tightly connected to IL1B in the blood vessel network would be among those responding to IL-1β stimulation in blood vessel cells. We tested this hypothesis by profiling the gene expression of human aortic smooth muscle cells (HASMCs; the predominant cell type in blood vessels) stimulated with IL-1β.
 
-Examination of the genes whose expression was significantly upregulated at 2 h after stimulation showed that 18 of the 20 IL1B network neighbors were among the top 500 most upregulated genes in the experiment (P = 2.07 × 10−23). The blood vessel network was the most accurate tissue network in predicting this experimental outcome; none of the other 143 tissue-specific networks or the tissue-naive network performed as well when evaluated by each network's ability to predict the result of IL-1β stimulation on the cells.
+Examination of the genes whose expression was significantly upregulated at 2 h after stimulation showed that 18 of the 20 IL1B network neighbors were among the top 500 most upregulated genes in the experiment (P = 2.07 × 10\ :sup:`−23`). The blood vessel network was the most accurate GIANT tissue network in predicting this experimental outcome; none of the other 143 GIANT tissue-specific networks or the tissue-naive network performed as well when evaluated by each network's ability to predict the result of IL-1β stimulation on the cells.
diff --git a/docs/img/use-cases/functional-module-3.png b/docs/img/use-cases/functional-module-3.png
diff --git a/docs/img/use-cases/functional-module-4.png b/docs/img/use-cases/functional-module-4.png
diff --git a/docs/index.rst b/docs/index.rst
@@ -35,6 +35,7 @@ Help topics
    use-cases
    functional-networks
    modules
+   extended-universe
    netwas
    deepsea
    sei

diff --git a/docs/modules.rst b/docs/modules.rst
@@ -22,5 +22,5 @@ This approach has two key desirable characteristics:
 
 We use a dynamic :code:`k = min(50, 0.2 * |V|)` to obtain the shared-nearest-neighbor tissue-specific network and apply the Louvain algorithm to cluster this network into distinct modules, where V is the number of query genes. Krishnan et al. (2016) showed that module node membership and cluster sizes are robust by testing a range of values for k from 10 to 100. To stabilize clustering across different runs of the Louvain algorithm, we run the algorithm 100 times and calculate cluster comembership scores for each pair of genes that was equal to the fraction of times (out of 100) the pair was assigned to the same cluster. Genes are assigned to clusters where their comembership score ≥ 0.9.
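The comembership score described above can be sketched as follows. The clustering runs are hard-coded stand-ins for repeated Louvain runs (no clustering is performed here), the function name is hypothetical, and the way stable pairs are turned into final clusters is not shown because the text does not specify it.

```python
from collections import Counter
from itertools import combinations


def comembership_scores(runs):
    """Fraction of runs in which each gene pair was assigned to the same cluster.

    runs: list of dicts mapping gene -> cluster label, one dict per clustering run.
    Pairs that never share a cluster are omitted from the result.
    """
    together = Counter()
    for labels in runs:
        for gene_a, gene_b in combinations(sorted(labels), 2):
            if labels[gene_a] == labels[gene_b]:
                together[(gene_a, gene_b)] += 1
    return {pair: count / len(runs) for pair, count in together.items()}


# Ten hypothetical runs (the text uses 100): A and B co-cluster in nine of them.
runs = [{"A": 0, "B": 0, "C": 1}] * 9 + [{"A": 0, "B": 1, "C": 1}]

scores = comembership_scores(runs)
stable_pairs = sorted(pair for pair, score in scores.items() if score >= 0.9)
print(scores)
print(stable_pairs)
```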
 
-Resulting modules are then tested for functional enrichment using genes annotated to Gene Ontology biological process terms. Representative processes and pathways enriched within each cluster are presented alongside of the cluster with their resulting Q value. The Q value of each term associated to the modules is calculated using one-sided Fisher's exact tests and Benjamini–Hochberg corrections to correct for multiple tests.
+Resulting modules are then tested for functional enrichment using genes annotated to Gene Ontology biological process terms. GIANT networks use annotations from UniProt-GOA (experimental evidence codes), while MAGE networks use annotations from NCBI gene2go (all evidence codes, including computationally inferred ones). Enrichment is also performed against Disease Ontology and MSigDB gene sets. Representative processes and pathways enriched within each cluster are presented alongside the cluster with their resulting Q values. The Q value of each term associated with a module is calculated using a one-sided Fisher's exact test followed by Benjamini-Hochberg correction for multiple testing.
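For readers unfamiliar with the correction step, here is a minimal sketch of the Benjamini-Hochberg procedure that turns a list of P values into Q values. It is a textbook version written for illustration, with made-up P values, and is not taken from the HumanBase code.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted values (Q values) in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending P
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotone Q values.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        q_values[index] = running_min
    return q_values


# Hypothetical P values for four terms tested against one module.
p_values = [0.01, 0.04, 0.03, 0.005]
q_values = benjamini_hochberg(p_values)
print(q_values)
```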