Summit County Housing Analysis
A Personal Data Science Project
What started as curiosity became a full-stack data science exploration.
Understanding the housing market in my area.
Building a production-grade ETL pipeline from raw, messy public records.
Testing the limits of browser-based ML... and my own skillset.
1. Data Story
Explore the data through interactive visualizations:
- Distance to ski lifts and resort proximity analysis
- 20+ years of price trends vs. interest rates
- Buyer origin patterns (local vs. out-of-state)
- Seasonal purchase patterns and market cycles
- Raw property data samples with 20+ attributes
2. ML Experiments
Dive into the model development process:
- Tournament leaderboard comparing 10+ model runs
- Gradient Boosting vs. Neural Network performance
- SHAP values showing feature importance
- Partial Dependence Plots for all numeric features
- Model selection and version comparison tools
3. Price Predictor
Test the model with your own scenarios:
- Interactive "What-If" simulator with real-time predictions
- Adjust property features (size, beds, location, etc.)
- Runs entirely in your browser using ONNX Runtime
- Compare predictions across different model versions
- No backend required: pure client-side ML inference
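The "What-If" idea above can be sketched in a few lines. This is only an illustration in Python, not the dashboard's code: the real predictor runs exported models in the browser with ONNX Runtime, whereas here `model_v1` and `model_v2` are hypothetical stand-in functions with invented coefficients, and the feature names (`sqft`, `beds`, `lift_miles`) are assumptions.

```python
from typing import Callable, Dict

Features = Dict[str, float]
Model = Callable[[Features], float]


# Hypothetical stand-ins for two model versions. The coefficients are
# invented for illustration and do not come from the project.
def model_v1(f: Features) -> float:
    return 150_000 + 300 * f["sqft"] + 20_000 * f["beds"]


def model_v2(f: Features) -> float:
    return 120_000 + 320 * f["sqft"] + 15_000 * f["beds"] - 10_000 * f["lift_miles"]


MODELS: Dict[str, Model] = {"v1": model_v1, "v2": model_v2}


def what_if(base: Features, **overrides: float) -> Dict[str, float]:
    """Apply feature overrides to a copy of the base scenario and
    return one prediction per model version."""
    scenario = {**base, **overrides}  # base is left unmodified
    return {name: model(scenario) for name, model in MODELS.items()}


base = {"sqft": 1500.0, "beds": 3.0, "lift_miles": 2.0}
baseline = what_if(base)
bigger = what_if(base, sqft=2000.0)

# Show how each version reacts to the same change in square footage.
for name in MODELS:
    print(name, bigger[name] - baseline[name])
```

Keeping the base scenario immutable and passing changes as overrides makes it easy to compare any number of scenarios against the same baseline.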
How to use the Product
You can explore the dashboard immediately using the cards above or the navigation. The steps below are optional and intended for developers who want to run the data pipeline manually.
For the full instructions, see the Public GitHub Repo →
make scrape
Runs the asynchronous scraper to pull the latest property records.
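For readers curious what an asynchronous scraper of this kind might look like, here is a minimal sketch using only Python's standard `asyncio`. It is not the project's scraper: `fetch_record` is a hypothetical placeholder that sleeps instead of making a network request, and the parcel IDs and `max_concurrency` parameter are assumptions.

```python
import asyncio
from typing import Dict, List


async def fetch_record(parcel_id: str) -> Dict[str, str]:
    # Hypothetical placeholder for a real HTTP request; the sleep
    # simulates network latency.
    await asyncio.sleep(0.01)
    return {"parcel_id": parcel_id, "status": "ok"}


async def scrape(parcel_ids: List[str], max_concurrency: int = 5) -> List[Dict[str, str]]:
    # The semaphore caps how many fetches are in flight at once, which
    # keeps the load on the public records site polite.
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded_fetch(parcel_id: str) -> Dict[str, str]:
        async with semaphore:
            return await fetch_record(parcel_id)

    # gather() returns results in the same order as the inputs.
    return await asyncio.gather(*(bounded_fetch(p) for p in parcel_ids))


records = asyncio.run(scrape([f"P{i:03d}" for i in range(10)]))
print(len(records), records[0])
```

Bounding concurrency with a semaphore is one common choice; a real scraper would likely also need retries and error handling, which are omitted here.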
make ingest
Resets the local SQLite warehouse and performs complex SQL feature engineering.
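As a rough illustration of the reset-then-engineer pattern, the sketch below rebuilds a small SQLite database from scratch and derives a few features in SQL. The table names (`sales`, `sales_features`), columns, and derived features are assumptions chosen for the example; the actual schema lives in the repository.

```python
import sqlite3
from typing import Iterable, Tuple

Row = Tuple[str, str, float, float]  # parcel_id, sale_date, price, sqft


def rebuild_warehouse(conn: sqlite3.Connection, rows: Iterable[Row]) -> None:
    # Reset: drop and recreate the tables so every run starts clean.
    conn.executescript("""
        DROP TABLE IF EXISTS sales_features;
        DROP TABLE IF EXISTS sales;
        CREATE TABLE sales (
            parcel_id TEXT, sale_date TEXT, price REAL, sqft REAL
        );
    """)
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)

    # Feature engineering in SQL: NULLIF avoids division by zero, and
    # strftime extracts the month and year from ISO-formatted dates.
    conn.executescript("""
        CREATE TABLE sales_features AS
        SELECT parcel_id,
               sale_date,
               price,
               price / NULLIF(sqft, 0) AS price_per_sqft,
               CAST(strftime('%m', sale_date) AS INTEGER) AS sale_month,
               CAST(strftime('%Y', sale_date) AS INTEGER) AS sale_year
        FROM sales;
    """)
    conn.commit()


sample_rows = [
    ("P001", "2021-01-15", 500000.0, 1000.0),
    ("P002", "2021-07-04", 900000.0, 0.0),  # missing square footage
]
conn = sqlite3.connect(":memory:")
rebuild_warehouse(conn, sample_rows)
for row in conn.execute("SELECT * FROM sales_features ORDER BY parcel_id"):
    print(row)
```

Dropping and recreating the tables makes the step idempotent: running it twice leaves the warehouse in the same state as running it once.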
make tournament
Triggers a parameter sweep tournament. The best model is promoted.
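The tournament idea can be sketched as a grid sweep that scores every parameter combination and promotes the winner. This is an assumption-laden illustration: the page does not say which sweep strategy or metric the project uses, so the sketch assumes an exhaustive grid and a lower-is-better error. `toy_validation_error` is a hypothetical stand-in for training and validating a model.

```python
import itertools
from typing import Any, Callable, Dict, List, Tuple

Params = Dict[str, Any]


def run_tournament(
    grid: Dict[str, List[Any]],
    score_fn: Callable[[Params], float],
) -> List[Tuple[float, Params]]:
    """Score every combination in the grid and return a leaderboard
    sorted from lowest to highest error."""
    names = list(grid)
    leaderboard = []
    for values in itertools.product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        leaderboard.append((score_fn(params), params))
    leaderboard.sort(key=lambda entry: entry[0])  # lower error ranks first
    return leaderboard


def toy_validation_error(params: Params) -> float:
    # Hypothetical stand-in for "train a model, return validation error".
    return (params["learning_rate"] - 0.1) ** 2 + (params["max_depth"] - 4) ** 2


grid = {"learning_rate": [0.01, 0.1, 0.3], "max_depth": [2, 4, 6]}
leaderboard = run_tournament(grid, toy_validation_error)

# "Promotion" here just means taking the top leaderboard entry.
champion_error, champion_params = leaderboard[0]
print(champion_params)
```

In a real pipeline, promotion would typically also persist the winning model and its parameters so the dashboard can load it.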