moderndive (Python)

The Python companion package for ModernDive: Statistical Inference via Data Science — a faithful port of the R moderndive and infer packages to a modern Python data-science stack (polars, plotly, plotnine, statsmodels).

📖 Documentation (with runnable examples): https://moderndive.readthedocs.io

It is intentionally pure-Python (no compiled extensions) so it installs under Pyodide via micropip for in-browser execution.

Installation

pip install moderndive          # from PyPI (once published)
# or, from source:
pip install git+https://github.com/moderndive/moderndive-python

What’s inside

A tidy simulation-inference grammar mirroring R infer: specify → hypothesize → generate → calculate, plus fit() for multiple regression, observe(), and assume() (theoretical t/z/F/Chisq). specify() is also available as a DataFrame method, so you can write df.specify(...) just like R’s df %>% specify(...). calculate(stat=...) takes the full infer vocabulary or any custom callable test statistic. Summaries via get_p_value / get_confidence_interval (percentile, SE, bias-corrected); British-spelling and short aliases included.
Dual-engine plots: visualize / shade_p_value / shade_confidence_interval (and every plot helper) take engine="plotly" (default, interactive) or engine="plotnine" — same code, your choice of output.
Theory-based wrapper tests: t_test, prop_test, chisq_test, t_stat, chisq_stat, plus the moderndive.theory module.
Regression & summary helpers mirroring R moderndive: get_regression_table, get_regression_points, get_regression_summaries, get_correlation, pop_sd, tidy_summary, count_missing (built on statsmodels where relevant, returning polars frames), plus the model plots gg_parallel_slopes / geom_parallel_slopes and gg_categorical_model / geom_categorical_model, and pairplot (the GGally::ggpairs analog).
Sampling: rep_slice_sample / rep_sample_n for sampling-distribution activities.
58 datasets: load_*() loaders returning polars DataFrames (the moderndive/infer, nycflights23, gapminder, ISLR2, and FiveThirtyEight datasets used in the book).

Quick start

Are tracks more likely to be popular in metal than in deep house? Compute the observed difference in “popular” rates, then permute the genre labels 1000 times to build a null distribution and read off a p-value.

import moderndive as md
from moderndive import get_p_value, visualize, shade_p_value

spotify = md.load_spotify_metal_deephouse()

# Observed difference in popularity rates (metal − deep house)
obs = (
    spotify
    .specify(formula="popular_or_not ~ track_genre", success="popular")
    .calculate(stat="diff in props", order=("metal", "deep-house"))
)
obs

ObservedStatistic(stat='diff in props', value=0.034)

# Permutation null distribution + p-value
null = (
    spotify
    .specify(formula="popular_or_not ~ track_genre", success="popular")
    .hypothesize(null="independence")
    .generate(reps=1000, type="permute", seed=76)
    .calculate(stat="diff in props", order=("metal", "deep-house"))
)
print(get_p_value(null, obs_stat=obs, direction="right"))

shape: (1, 1)
┌─────────┐
│ p_value │
│ ---     │
│ f64     │
╞═════════╡
│ 0.075   │
└─────────┘

# Visualize — interactive plotly by default; engine="plotnine" for ggplot-style
visualize(null) + shade_p_value(obs_stat=obs, direction="right")

Development

This repo uses uv.

uv sync --extra dev          # create the environment
make test                    # run the test suite (enforces 100% coverage)
make readme                  # re-render README.md from README.qmd (needs Quarto)
make build-data              # rebuild the bundled Parquet datasets (needs R; see tools/)
make build                   # build the wheel/sdist

The test suite is held at 100% statement coverage (enforced in CI via --cov-fail-under=100). Releases are automated on v* tags — see RELEASING.md.

License

MIT. The ModernDive book content is licensed CC-BY-NC-SA 4.0; this software package is MIT-licensed.

Name		Name	Last commit message	Last commit date
Latest commit History 40 Commits
.github		.github
README_files/figure-commonmark		README_files/figure-commonmark
doc		doc
moderndive		moderndive
tests		tests
tools		tools
.gitattributes		.gitattributes
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
.readthedocs.yaml		.readthedocs.yaml
CHANGELOG.md		CHANGELOG.md
CITATION.cff		CITATION.cff
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
README.qmd		README.qmd
RELEASING.md		RELEASING.md
code-of-conduct.md		code-of-conduct.md
codecov.yml		codecov.yml
pyproject.toml		pyproject.toml
uv.lock		uv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

moderndive (Python)

Installation

What’s inside

Quick start

Development

License

About

Uh oh!

Releases 1

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

moderndive (Python)

Installation

What’s inside

Quick start

Development

License

About

Resources

License

Code of conduct

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages