🏔️ Summit County Housing Analysis

A Personal Data Science Project

What started as curiosity became a full-stack data science exploration.

🏔️ Local Context

Understanding the housing market in my area.

🛠️ Data Engineering

Building a production-grade ETL pipeline from raw, messy public records.

🔮 ML Inference

Testing the limits of browser-based ML... and my own skillset.

📊 1. Data Story

Explore the data through interactive visualizations:

  • Distance to ski lifts and resort proximity analysis
  • 20+ years of price trends vs. interest rates
  • Buyer origin patterns (local vs. out-of-state)
  • Seasonal purchase patterns and market cycles
  • Raw property data samples with 20+ attributes
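
The lift-proximity feature can be pictured as a great-circle distance from each property to its nearest lift. The sketch below is only an illustration of that idea: the coordinates, the lift names, and the helper names `haversine_km` and `nearest_lift` are hypothetical, and the actual pipeline may compute distance differently.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_lift(prop, lifts):
    """Return (name, distance_km) for the lift closest to a property."""
    return min(
        ((name, haversine_km(prop[0], prop[1], lat, lon))
         for name, (lat, lon) in lifts.items()),
        key=lambda pair: pair[1],
    )


# Hypothetical coordinates, for illustration only.
LIFTS = {"lift_a": (39.48, -106.07), "lift_b": (39.60, -105.95)}
name, dist_km = nearest_lift((39.50, -106.05), LIFTS)
```

Straight-line distance ignores roads and terrain, so it is best read as a rough proximity signal rather than travel time.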

🧬 2. ML Experiments

Dive into the model development process:

  • Tournament leaderboard comparing 10+ model runs
  • Gradient Boosting vs. Neural Network performance
  • SHAP values showing feature importance
  • Partial Dependence Plots for all numeric features
  • Model selection and version comparison tools
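
For readers new to Partial Dependence Plots, the idea fits in a few lines: force one feature to a fixed value for every row, average the model's predictions, and repeat across a grid of values. The sketch below assumes a hypothetical stand-in model (`toy_predict`) and made-up rows; it is not the project's model or data.

```python
from statistics import mean


def partial_dependence(predict, rows, feature, grid):
    """Average prediction over all rows with `feature` forced to each grid value."""
    curve = []
    for value in grid:
        # Overwrite one feature everywhere; keep the others as observed.
        modified = [{**row, feature: value} for row in rows]
        curve.append((value, mean(predict(r) for r in modified)))
    return curve


# Hypothetical stand-in for a trained model.
def toy_predict(row):
    return 200.0 * row["sqft"] + 15000.0 * row["beds"]


ROWS = [{"sqft": 1000, "beds": 2}, {"sqft": 2000, "beds": 4}]
pd_curve = partial_dependence(toy_predict, ROWS, "sqft", [1000, 1500])
```

Each point on the resulting curve is one (feature value, average prediction) pair, which is what a PDP draws.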

🔮 3. Price Predictor

Test the model with your own scenarios:

  • Interactive "What-If" simulator with real-time predictions
  • Adjust property features (size, beds, location, etc.)
  • Runs entirely in your browser using ONNX Runtime
  • Compare predictions across different model versions
  • No backend required; pure client-side ML inference

🛠️ How to Use the Product

You can explore the dashboard immediately using the cards above or the navigation. The steps below are optional and intended for developers who want to run the data pipeline manually.

For the full instructions, see the public GitHub repo →

1. Data Collection: make scrape

Runs the asynchronous scraper to pull the latest property records.
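
As a rough picture of what an asynchronous scraper does, the sketch below fans out requests while capping how many run at once. Everything here is an assumption for illustration: `fetch_record` fakes the network call with a short sleep, and the parcel IDs and the concurrency limit are invented.

```python
import asyncio


async def fetch_record(parcel_id, semaphore):
    """Placeholder for a real HTTP request; returns a fake record."""
    async with semaphore:  # cap the number of in-flight requests
        await asyncio.sleep(0.01)  # simulate network latency
        return {"parcel_id": parcel_id, "status": "ok"}


async def scrape(parcel_ids, max_concurrency=5):
    """Fetch all records concurrently, preserving input order."""
    semaphore = asyncio.Semaphore(max_concurrency)
    tasks = [fetch_record(pid, semaphore) for pid in parcel_ids]
    return await asyncio.gather(*tasks)


# Hypothetical parcel IDs.
records = asyncio.run(scrape(["P-001", "P-002", "P-003"]))
```

The semaphore keeps the scraper polite toward the source site, which matters when pulling from public-records servers.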

2. ETL Pipeline: make ingest

Resets the local SQLite warehouse and performs complex SQL feature engineering.
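
A minimal sketch of the reset-then-derive pattern, using Python's built-in sqlite3 module. The table names, columns, and derived features (`price_per_sqft`, `sale_month`) are assumptions chosen for illustration; the real schema lives in the repo.

```python
import sqlite3


def rebuild_warehouse(conn, raw_sales):
    """Drop and recreate the tables, then derive features in SQL."""
    conn.executescript("""
        DROP TABLE IF EXISTS sales;
        DROP TABLE IF EXISTS sales_features;
        CREATE TABLE sales (parcel_id TEXT, sale_date TEXT, price REAL, sqft REAL);
    """)
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", raw_sales)
    conn.execute("""
        CREATE TABLE sales_features AS
        SELECT parcel_id,
               price,
               price / NULLIF(sqft, 0) AS price_per_sqft,  -- NULL when sqft is 0
               CAST(strftime('%m', sale_date) AS INTEGER) AS sale_month
        FROM sales
        WHERE price > 0
    """)
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory database for the example
rebuild_warehouse(conn, [
    ("P-001", "2021-02-15", 500000.0, 1000.0),  # hypothetical rows
    ("P-002", "2021-07-01", 750000.0, 0.0),
])
features = conn.execute(
    "SELECT parcel_id, price_per_sqft, sale_month "
    "FROM sales_features ORDER BY parcel_id"
).fetchall()
```

Rebuilding from scratch on every run keeps the warehouse reproducible, at the cost of reprocessing all records each time.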

3. Model Training: make tournament

Triggers a parameter sweep tournament. The best model is promoted.
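
The tournament idea can be sketched as a grid sweep that scores every candidate on held-out data and promotes the one with the lowest error. The example below swaps in a deliberately tiny model (a one-feature ridge slope) so it runs with the standard library alone; the project itself compares gradient boosting and neural networks, and the grid, the data, and the function names here are hypothetical.

```python
import math
from itertools import product


def fit_ridge_slope(xs, ys, alpha):
    """One-feature ridge fit through the origin: w = sum(xy) / (sum(x^2) + alpha)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + alpha)


def rmse(w, xs, ys):
    """Root mean squared error of the prediction w * x."""
    return math.sqrt(sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


def run_tournament(grid, train, valid):
    """Fit one model per parameter combination; return runs sorted by validation RMSE."""
    names = list(grid)
    leaderboard = []
    for combo in product(*(grid[n] for n in names)):
        params = dict(zip(names, combo))
        w = fit_ridge_slope(*train, **params)
        leaderboard.append({"params": params, "rmse": rmse(w, *valid)})
    return sorted(leaderboard, key=lambda run: run["rmse"])


# Hypothetical toy data: y = 2x.
TRAIN = ([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
VALID = ([4.0], [8.0])
leaderboard = run_tournament({"alpha": [0.0, 1.0, 10.0]}, TRAIN, VALID)
champion = leaderboard[0]  # the "promoted" model
```

Scoring on a validation split rather than the training data is what makes the promotion meaningful; otherwise the most overfit run would tend to win.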