zayd100/TierFlow

TierFlow — Contact Time Heatmap Pipeline

Answers one question: when should your reps call which leads?

Loads your contact_attempts CSV, converts every timestamp to the lead's local timezone, computes connect and meeting rates per (day × hour) cell, masks cells with too few samples, and renders heatmaps segmented by industry and lead tier.


Figures: Connect Rate Heatmap, Meeting Rate Heatmap, SaaS Connect Rate

Folder structure

tierflow_heatmap/
│
├── data/
│   ├── contact_attempts_seed.csv        ← 20-row seed (schema reference)
│   └── contact_attempts_synthetic.csv   ← generated by generate_synthetic.py
│
├── outputs/                             ← all PNGs and CSVs land here (git-ignored)
│   ├── heatmap_connect_rate.png
│   ├── heatmap_connect_rate_saas.png
│   ├── heatmap_meeting_rate.png
│   ├── top_windows_connect_rate.csv
│   └── ...
│
├── notebooks/                           ← Jupyter notebooks go here (optional)
│
├── heatmap_pipeline.py                  ← main pipeline
├── generate_synthetic.py                ← synthetic data generator
├── requirements.txt
└── README.md

Quickstart

# 1. Install dependencies
pip install -r requirements.txt

# 2a. Run on the 20-row seed (heatmap will be mostly grey — not enough data yet)
python heatmap_pipeline.py

# 2b. Or generate synthetic data and run on that
python generate_synthetic.py --rows 3000
python heatmap_pipeline.py --csv data/contact_attempts_synthetic.csv --min-samples 5

Outputs land in outputs/. You get one overall heatmap plus one per industry.


CLI reference

python heatmap_pipeline.py [options]

--csv           Path to your CSV file
                default: data/contact_attempts_seed.csv

--metric        What "best time" means
                connect_rate  → answered + meeting_booked  (default)
                meeting_rate  → meeting_booked only

--industry      Filter to one industry, e.g. --industry SaaS
                Omit to get per-industry breakdown automatically

--tier          Filter to one lead tier, e.g. --tier 1
                Omit to include all tiers

--min-samples   Minimum contact attempts before a cell is shown
                Cells below this show as grey  (default: 10)

--no-annotate   Hide rate/count labels inside cells (cleaner for exporting)

Examples

# Tier 1 SaaS leads — what time gets meetings?
python heatmap_pipeline.py \
  --csv data/contact_attempts_synthetic.csv \
  --metric meeting_rate \
  --industry SaaS \
  --tier 1 \
  --min-samples 3

# Overall connect rate, strict confidence threshold
python heatmap_pipeline.py \
  --csv data/contact_attempts_synthetic.csv \
  --metric connect_rate \
  --min-samples 20
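
The flags documented above could be declared with argparse roughly as follows. This is a sketch reconstructed only from the CLI reference; the function name build_parser and the help strings are assumptions, and the actual heatmap_pipeline.py may be organised differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags and defaults."""
    parser = argparse.ArgumentParser(prog="heatmap_pipeline.py")
    parser.add_argument("--csv", default="data/contact_attempts_seed.csv",
                        help="Path to your CSV file")
    parser.add_argument("--metric", choices=["connect_rate", "meeting_rate"],
                        default="connect_rate",
                        help='What "best time" means')
    parser.add_argument("--industry", default=None,
                        help="Filter to one industry, e.g. SaaS")
    parser.add_argument("--tier", type=int, default=None,
                        help="Filter to one lead tier, e.g. 1")
    parser.add_argument("--min-samples", type=int, default=10,
                        help="Minimum attempts before a cell is shown")
    parser.add_argument("--no-annotate", action="store_true",
                        help="Hide rate/count labels inside cells")
    return parser


# Equivalent to the first example above, minus the --csv flag.
args = build_parser().parse_args(
    ["--metric", "meeting_rate", "--industry", "SaaS",
     "--tier", "1", "--min-samples", "3"]
)
print(args.metric, args.industry, args.tier, args.min_samples, args.no_annotate)
```

argparse converts the dashes in --min-samples and --no-annotate to underscores, so the values are read as args.min_samples and args.no_annotate.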

CSV schema

Your CSV must contain these columns. See data/contact_attempts_seed.csv for a filled example.

| Column | Type | Notes |
|---|---|---|
| attempt_id | int | Unique per row. Auto-increment or UUID. |
| timestamp_utc | datetime | Always UTC. Format: YYYY-MM-DD HH:MM:SS |
| lead_id | str | Foreign key to your leads table. |
| rep_id | str | Rep who made the attempt. |
| rep_role | str | Warmer or Closer |
| contact_channel | str | call / email / sms / linkedin / whatsapp |
| industry | str | Lead's industry vertical. Keep consistent casing. |
| lead_tier | int | Tier at time of attempt (1 = hottest). Snapshot; don't join live. |
| lead_score | int | Score at time of attempt (0–100). Snapshot. |
| outcome | str | answered / no_reply / voicemail / meeting_booked / bounced / wrong_number |
| duration_seconds | int | Call duration. 0 for no-answer / email. |
| lead_timezone | str | IANA timezone string, e.g. America/New_York |
| notes | str | Optional free-text. Leave blank if nothing to add. |
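
A schema check along these lines can catch a malformed CSV before any rates are computed. It is a minimal sketch, not the pipeline's actual loader: load_and_validate is a hypothetical name, notes is assumed to be the only optional column, and the sample row is invented purely for illustration.

```python
import io

import pandas as pd

# Column names come from the schema table; `notes` is assumed to be optional.
REQUIRED_COLUMNS = [
    "attempt_id", "timestamp_utc", "lead_id", "rep_id", "rep_role",
    "contact_channel", "industry", "lead_tier", "lead_score",
    "outcome", "duration_seconds", "lead_timezone",
]
VALID_OUTCOMES = {
    "answered", "no_reply", "voicemail",
    "meeting_booked", "bounced", "wrong_number",
}


def load_and_validate(source) -> pd.DataFrame:
    """Hypothetical loader: read the CSV and fail early on schema problems."""
    df = pd.read_csv(source)
    missing = [c for c in REQUIRED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"Missing required columns: {missing}")
    # Parse as timezone-aware UTC; values not in the documented format raise.
    df["timestamp_utc"] = pd.to_datetime(
        df["timestamp_utc"], format="%Y-%m-%d %H:%M:%S", utc=True
    )
    unknown = set(df["outcome"].dropna()) - VALID_OUTCOMES
    if unknown:
        raise ValueError(f"Unknown outcome values: {sorted(unknown)}")
    return df


# One made-up row in the documented format, used only to exercise the check.
sample_csv = io.StringIO(
    "attempt_id,timestamp_utc,lead_id,rep_id,rep_role,contact_channel,"
    "industry,lead_tier,lead_score,outcome,duration_seconds,lead_timezone,notes\n"
    "1,2024-03-05 14:00:00,L001,R01,Warmer,call,SaaS,1,87,answered,142,"
    "America/New_York,\n"
)
df = load_and_validate(sample_csv)
print(df.dtypes["timestamp_utc"])
```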

How it works

CSV
 └─ load & validate schema
     └─ localise timestamps (UTC → lead's local hour + day-of-week)
         └─ flag outcomes  (connect_rate, meeting_rate)
             └─ pivot table  7 rows (Mon–Sun) × 24 cols (00:00–23:00)
                 └─ mask cells below min_samples
                     └─ render heatmap PNG  (+ per-industry breakdown)
                         └─ export top-5 windows CSV

The key step is timezone localisation. A call logged at 14:00 UTC means 19:00 in Karachi and 09:00 in New York (10:00 while daylight saving time is in effect). The heatmap is built from the lead's local hour, not the rep's, because that is what determines whether someone picks up.
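
The localisation step might look like the sketch below, which converts each timezone group separately because a pandas datetime column carries only one timezone at a time. The function name add_local_time and the output columns local_hour and local_dow are assumptions made for illustration, not names taken from heatmap_pipeline.py.

```python
import pandas as pd


def add_local_time(df: pd.DataFrame) -> pd.DataFrame:
    """Add local_hour (0-23) and local_dow (0 = Mon .. 6 = Sun) per lead."""
    out = df.copy()
    ts = pd.to_datetime(out["timestamp_utc"], utc=True)
    # -1 marks rows that could not be localised (e.g. missing timezone).
    out["local_hour"] = -1
    out["local_dow"] = -1
    # Convert one timezone group at a time using the IANA name in each row.
    for tz, idx in out.groupby("lead_timezone").groups.items():
        local = ts.loc[idx].dt.tz_convert(tz)
        out.loc[idx, "local_hour"] = local.dt.hour
        out.loc[idx, "local_dow"] = local.dt.dayofweek
    return out


# The same 14:00 UTC instant for two leads, plus a summer date for New York.
attempts = pd.DataFrame({
    "timestamp_utc": [
        "2024-01-16 14:00:00",
        "2024-01-16 14:00:00",
        "2024-07-16 14:00:00",
    ],
    "lead_timezone": ["Asia/Karachi", "America/New_York", "America/New_York"],
})
localised = add_local_time(attempts)
print(localised[["lead_timezone", "local_hour", "local_dow"]])
```

The three rows come out as local hours 19, 9 and 10, all on a Tuesday (local_dow 1). The July row shows why a fixed UTC offset per lead would not be enough.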


Reading the output

  • Warm cells (amber → red) — high connect or meeting rate at that hour/day combo
  • Dark cells — low rate
  • Grey cells — fewer than --min-samples attempts; don't draw conclusions from these
  • Annotations: 42%\n(17) means a 42% rate from 17 attempts in that cell
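
The rate, mask, and annotation rules above can be sketched with a pandas groupby. This assumes the data already has local_dow and local_hour columns (hypothetical names) and uses the connect_rate definition from the CLI reference; rate_grid is an illustrative helper, not the pipeline's own function.

```python
import pandas as pd

# connect_rate counts these outcomes as a connect, per the --metric description.
CONNECT_OUTCOMES = {"answered", "meeting_booked"}


def rate_grid(df: pd.DataFrame, min_samples: int = 10):
    """Return (rates, counts) as 7 x 24 frames; thin cells are NaN in rates."""
    flagged = df.assign(connected=df["outcome"].isin(CONNECT_OUTCOMES))
    cells = flagged.groupby(["local_dow", "local_hour"])["connected"]
    days, hours = range(7), range(24)
    counts = (
        cells.size().unstack("local_hour")
        .reindex(index=days, columns=hours).fillna(0).astype(int)
    )
    rates = cells.mean().unstack("local_hour").reindex(index=days, columns=hours)
    # Cells below the threshold become NaN, which a heatmap can draw as grey.
    return rates.where(counts >= min_samples), counts


# Toy data: 12 attempts on Tue 11:00 (6 connects), 2 attempts on Wed 16:00.
toy = pd.DataFrame({
    "local_dow": [1] * 12 + [2] * 2,
    "local_hour": [11] * 12 + [16] * 2,
    "outcome": (["answered"] * 4 + ["meeting_booked"] * 2
                + ["no_reply"] * 6 + ["answered"] * 2),
})
rates, counts = rate_grid(toy, min_samples=10)

# Annotation text in the "rate, newline, (count)" style described above.
label = f"{rates.loc[1, 11]:.0%}\n({counts.loc[1, 11]})"
print(repr(label))
```

In this toy data the Wed 16:00 cell has a 100% raw rate but only 2 attempts, so it is masked instead of showing up as the best slot.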

The console also prints a top 5 windows table:

day  hour  connect_rate   n
Tue 11:00         0.647  17
Thu 09:00         0.647  17
Wed 16:00         0.565  23

The same table is saved to outputs/top_windows_{metric}.csv, ready to feed into your dashboard.
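
Selecting the top windows amounts to filtering out thin cells, sorting by rate, and keeping the first few rows. The sketch below assumes a long-format table with hypothetical columns local_dow, local_hour, connect_rate and n; the tie-break on n is a guess, since the README does not say how equal rates are ordered.

```python
import pandas as pd

DAY_NAMES = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def top_windows(cell_stats: pd.DataFrame, metric: str = "connect_rate",
                min_samples: int = 10, k: int = 5) -> pd.DataFrame:
    """Return the k best (day, hour) cells that meet the sample threshold."""
    eligible = cell_stats[cell_stats["n"] >= min_samples]
    # Highest rate first; larger sample size breaks ties (an assumption).
    top = eligible.sort_values([metric, "n"], ascending=False).head(k).copy()
    top["day"] = top["local_dow"].map(lambda d: DAY_NAMES[int(d)])
    top["hour"] = top["local_hour"].map(lambda h: f"{int(h):02d}:00")
    return top[["day", "hour", metric, "n"]].reset_index(drop=True)


# The three rows from the sample table above, plus one invented thin cell.
cell_stats = pd.DataFrame({
    "local_dow": [1, 3, 2, 0],
    "local_hour": [11, 9, 16, 8],
    "connect_rate": [0.647, 0.647, 0.565, 1.0],
    "n": [17, 17, 23, 2],
})
best = top_windows(cell_stats, min_samples=10)
print(best.to_string(index=False))

# Saving would then be: best.to_csv("outputs/top_windows_connect_rate.csv", index=False)
```

The Mon 08:00 cell has the highest raw rate but only 2 attempts, so it never reaches the table.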


How much data do you need?

| Stage | What you can do |
|---|---|
| 0–500 rows | Build and test the pipeline. Heatmap will be mostly grey. |
| 500–2,000 rows | Overall heatmap starts showing patterns. Per-industry is still thin. |
| 2,000–5,000 rows | Per-industry heatmaps become reliable. Tier filtering works. |
| 5,000+ rows | Full segmentation (industry × tier). Use --min-samples 20. |

Lower --min-samples to see more cells earlier, but treat them as directional hints, not gospel.
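
A rough way to see where these thresholds come from: the grid has 7 × 24 = 168 cells, so the average number of attempts per cell is the row count divided by 168, and by a further factor for each segment. The helper below is a hypothetical back-of-envelope calculation that assumes attempts are spread evenly, which real call data is not (business hours get far more attempts than nights and weekends).

```python
CELLS = 7 * 24  # days x hours in the heatmap grid


def mean_attempts_per_cell(total_rows: int, segments: int = 1) -> float:
    """Average attempts per cell if rows were spread evenly (a simplification)."""
    return total_rows / (CELLS * segments)


for rows in (500, 2000, 5000):
    print(rows, round(mean_attempts_per_cell(rows), 1))
# 500 3.0
# 2000 11.9
# 5000 29.8

# Splitting 5,000 rows evenly across 4 hypothetical industries:
print(round(mean_attempts_per_cell(5000, segments=4), 1))
# 7.4
```

With the default --min-samples of 10, an even spread of 500 rows would leave every cell grey, which matches the first row of the table.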


Connecting to TierFlow

When you're ready to productise, swap the flat CSV read for a database query:

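# Replace load_data() with something like:
import os

import pandas as pd
import sqlalchemy as sa

# DATABASE_URL must be set in the environment; a missing value raises KeyError.
engine = sa.create_engine(os.environ["DATABASE_URL"])

# NOW() - INTERVAL '90 days' is PostgreSQL syntax; adjust for other databases.
df = pd.read_sql("""
    SELECT attempt_id, timestamp_utc, lead_id, rep_id, rep_role,
           contact_channel, industry, lead_tier, lead_score,
           outcome, duration_seconds, lead_timezone
    FROM contact_attempts
    WHERE timestamp_utc >= NOW() - INTERVAL '90 days'
""", engine)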

The rest of the pipeline is unchanged.


Next steps

Once you have enough real data, natural extensions include:

  • Confidence intervals — add Wilson score intervals per cell so dashboards can show error bars
  • Regression model — use scikit-learn to learn which features (hour, day, tier, industry, score) drive connect rate, then score future call slots
  • Rep-level heatmaps — same pipeline, filter by rep_id
  • Decay weighting — weight recent attempts more heavily than old ones
  • Scheduler integration — feed the top windows back into the routing engine to auto-suggest call times per lead
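
As an illustration of the first extension, a Wilson score interval for a single cell can be computed with the standard library alone. This is a sketch of the textbook formula, not code from the pipeline; wilson_interval is a hypothetical name and z = 1.96 assumes a 95% confidence level.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval for a proportion; returns (lower, upper)."""
    if n == 0:
        return (0.0, 1.0)  # no data: the rate could be anything
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))


# Assuming 11 connects out of 17 attempts, which gives a 0.647 rate.
low, high = wilson_interval(11, 17)
print(f"{low:.2f} to {high:.2f}")
```

For 11 of 17 the interval runs from roughly 0.41 to 0.83, a reminder that a cell with 17 attempts is still a wide estimate.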
