ryankellyongh/README.md

Hi, I’m Ryan Kelly 👋

🎓 Data Analytics student at Northeastern University
🌎 Interested in environmental data, public health, clean energy, and sustainability
📊 I use data to identify performance gaps and turn them into actionable insights


🔍 Current Focus

  • Environmental data analysis (BERDO, emissions, energy systems)
  • Machine learning (Logistic Regression, Random Forest)
  • Exploring GIS and spatial data

📁 Featured Projects

🏙️ BERDO Analysis

Analyzed 5,500+ Boston buildings to identify emissions patterns and non-compliance risks.

Key Outcomes:

  • Identified 1,902 records missing valid Site EUI, highlighting data completeness challenges
  • Flagged 1,003 high-priority buildings based on high energy intensity and property complexity
  • Built an interactive Streamlit app that lets users look up any Boston address in the BERDO dataset and see the annual cost of non-compliance.

Impact: Insights directly informed workforce discussions around building performance, emissions reduction, and equitable decarbonization.

Tools: Python (pandas, matplotlib), Streamlit, Excel | Live app →
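The Site EUI completeness check described above could be sketched roughly as follows. This is an illustration only, not the project's actual code: the column name `Site EUI (kBtu/sq ft)`, the toy rows, and the rule that non-numeric or non-positive values count as invalid are all assumptions.

```python
import pandas as pd

# Hypothetical column name; the real BERDO export may label it differently.
EUI_COL = "Site EUI (kBtu/sq ft)"


def split_by_valid_eui(df: pd.DataFrame, eui_col: str = EUI_COL):
    """Return (valid, missing) frames.

    Assumption: a value is "valid" only if it parses as a number and is
    greater than zero. Text placeholders and blanks become NaN and are
    counted as missing.
    """
    eui = pd.to_numeric(df[eui_col], errors="coerce")
    is_valid = eui.notna() & (eui > 0)
    return df[is_valid], df[~is_valid]


# Toy rows for illustration only; not taken from the BERDO dataset.
buildings = pd.DataFrame(
    {
        "Address": ["1 Example St", "2 Example St", "3 Example St", "4 Example St"],
        EUI_COL: [85.2, None, "Not Available", 0],
    }
)

valid, missing = split_by_valid_eui(buildings)
print(f"{len(missing)} of {len(buildings)} records lack a valid Site EUI")
```

Keeping the validity rule in one function makes it easy to change the definition of "valid" (for example, adding an upper bound for implausible values) without touching the rest of the analysis.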

🧬 Antibiotic Resistance Prediction

Built Logistic Regression and Random Forest models to predict cefepime resistance in E. coli.

Key Outcomes:

  • Logistic Regression achieved 87% recall and 0.871 balanced accuracy on the validation set, outperforming Random Forest across both metrics
  • Tuned models using nested cross-validation (5-fold outer, 3-fold inner GridSearchCV) to prevent data leakage during hyperparameter selection
  • Feature coefficient analysis identified key genomic resistance drivers, supporting model interpretability

Impact: Supports faster clinical decision-making for antibiotic selection.

Tools: Python (scikit-learn, pandas), statistical analysis
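The nested cross-validation setup mentioned above (5-fold outer, 3-fold inner `GridSearchCV`) might look something like this minimal sketch. The synthetic data, the `C` grid, the stratified splitters, and the `class_weight="balanced"` setting are assumptions for illustration, not the project's actual configuration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score

# Synthetic stand-in for the gene presence/absence features (assumption).
X, y = make_classification(
    n_samples=300,
    n_features=20,
    n_informative=5,
    weights=[0.7, 0.3],
    random_state=0,
)

# Inner loop selects hyperparameters; outer loop estimates performance.
inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

search = GridSearchCV(
    estimator=LogisticRegression(max_iter=1000, class_weight="balanced"),
    param_grid={"C": [0.01, 0.1, 1.0, 10.0]},  # hypothetical grid
    scoring="balanced_accuracy",
    cv=inner_cv,
)

# Each outer fold refits the whole grid search on its training split only,
# so the held-out fold never influences hyperparameter selection.
scores = cross_val_score(search, X, y, scoring="balanced_accuracy", cv=outer_cv)
print(f"Balanced accuracy per outer fold: {scores.round(3)}")
print(f"Mean balanced accuracy: {scores.mean():.3f}")
```

Passing the `GridSearchCV` object itself to `cross_val_score` is what makes the procedure nested: tuning happens inside each outer training split rather than once on the full dataset.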

⚙️ Portable Liquid Filling Device

Designed a manually operated dispensing device for cost-effective, field-deployable applications.

Key Outcomes:

  • Roughly 68% cost reduction vs. commercial alternatives
  • Modular design prioritizes cleanability and durability
  • CAD models and assembly documentation included

Impact: Enables resource-constrained teams to scale operations.

Tools: FreeCAD, Python, FMEA, Technical Documentation


🛠️ Tools & Skills

  • Python (pandas, scikit-learn, numpy)

  • R (statistical analysis)

  • SQL

  • Tableau

  • Excel

  • Machine Learning

  • Data Visualization


📚 Currently Learning

  • Advanced geospatial analysis (QGIS, ArcGIS)

  • Time-series forecasting for energy demand

  • Climate impact modeling and scenario analysis


💡 Open To

  • Collaborations on environmental data projects and sustainability analytics.

  • Internships in climate tech, renewable energy, or environmental consulting.

  • Conversations about data-driven climate action.


📫 Let’s Connect

https://www.linkedin.com/in/ryankelly10/

Pinned

  1. awesome-northeast-eco (Public)

    Now more than ever, we need to find ways to help one another. Here is a curated list of sustainability resources, services, and organizations for people living in New England and the Northeast US —…

  2. berdo-analysis (Public)

    This project analyzes Boston’s 2025 BERDO building dataset to identify reporting gaps, ownership patterns, and energy-use priorities. Site EUI is used as a screening metric, while official BERDO pe…

    Jupyter Notebook

  3. Exploring-the-Bacterial-Genome-using-Data-Science (Public)

    I trained and evaluated two models: Logistic Regression and Random Forest to predict cefepime resistance in *E. coli* using presence/absence gene features. After comparing performance using balance…

    Jupyter Notebook

  4. Portable-Filling-Machine (Public)

    A compact, manually operated liquid dispensing device for lab and field use. It features a plunger mechanism, O-ring seal, and nozzle for accurate volume control, with a housing built from common, …

    Python