
Built for Tuva Analytics Teams

In-Warehouse Predictive Intelligence

Built by healthcare data engineers for analytics teams, Illuminate Predictive Models removes the need to master ML pipelines, point-in-time feature engineering, or model deployment. Instead of spending six figures on vendor risk scores or months building custom ML infrastructure, you get production-ready predictions that run inside your existing dbt workflow.

  • No external ML platform required: runs inside your existing dbt workflow
  • No opaque scoring logic: transparent features, diagnostics, and model metadata
  • No selection bias: trained on your population, your cost structure, and your data completeness

What You Can Predict

Out-of-the-box spend and utilization models, plus configurable targets for your own workflows.

Total Spend

Expected paid amount per member over customizable time horizons

Inpatient Utilization

Predicted encounter rates for acute inpatient admissions

Emergency Department Visits

ED encounter probability and expected frequency

SNF Utilization

Skilled nursing facility encounter predictions

Custom Targets

Fully configurable target policy for any encounter type and time horizon

Overview

Illuminate Predictive Models makes it easy to train and deploy healthcare risk models without having to build ML infrastructure from scratch or depend on opaque third-party vendors. We train gradient-boosted models directly in your data warehouse on your own claims data, producing calibrated spend and utilization predictions as dbt tables with no external infrastructure required.

Comprehensive Feature Engineering

  • Demographics: Age, sex, race, state, enrollment tenure, and cold-start indicators
  • Utilization History: Paid amounts and encounter counts across 3/6/12-month lookback windows by encounter type
  • Chronic Conditions: CMS chronic condition assignments from both claims mart and raw diagnosis codes
  • HCC Risk Scores: Hierarchical Condition Category assignments normalized across payers and plan versions
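To make the lookback-window idea concrete, here is a minimal stdlib-only sketch of point-in-time feature construction: paid amounts are summed over 3/6/12-month windows ending strictly before the anchor month, so no future claims leak into the features. The record layout and column names are illustrative, not the package's actual implementation.

```python
from datetime import date

def lookback_features(claims, member_id, anchor, windows=(3, 6, 12)):
    """Sum paid amounts per lookback window, counting only claims
    strictly before the anchor date (no lookahead bias)."""
    feats = {}
    for months in windows:
        # window start: `months` calendar months before the anchor month
        month_index = (anchor.year * 12 + anchor.month - 1) - months
        start = date(month_index // 12, month_index % 12 + 1, 1)
        feats[f"paid_{months}m"] = sum(
            c["paid"] for c in claims
            if c["member"] == member_id and start <= c["date"] < anchor
        )
    return feats

claims = [
    {"member": "A", "date": date(2024, 11, 15), "paid": 120.0},
    {"member": "A", "date": date(2024, 5, 2), "paid": 80.0},
    {"member": "B", "date": date(2024, 12, 1), "paid": 50.0},
]
print(lookback_features(claims, "A", date(2025, 1, 1)))
```

The same pattern extends to encounter counts per encounter type; the key property is that every feature is a function of data available as of the anchor month only.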

Calibrated Probability Outputs

  • Count Thresholds: P(Y >= k), the probability of at least 1, 2, 3, or 5 encounters in a given category
  • Spend Percentiles: P(spend in top k%), the probability a member falls in the top 1% or 5% of spenders
  • Isotonic Calibration: Predictions calibrated to match aggregate actuals for reliable population-level estimates
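Isotonic calibration fits a monotone step function mapping raw scores to observed outcome rates, which is what makes aggregate predictions match aggregate actuals. A toy pool-adjacent-violators implementation (illustrative only, not the package's code):

```python
def pav_calibrate(scores, outcomes):
    """Pool-adjacent-violators: returns calibrated probabilities (one per
    input, in score order) that are monotone non-decreasing and equal the
    observed outcome rate within each pooled block."""
    pairs = sorted(zip(scores, outcomes))
    blocks = [[y, 1] for _, y in pairs]  # each block: [outcome sum, count]
    i = 0
    while i < len(blocks) - 1:
        if blocks[i][0] / blocks[i][1] > blocks[i + 1][0] / blocks[i + 1][1]:
            # monotonicity violated: merge adjacent blocks and back up
            blocks[i][0] += blocks[i + 1][0]
            blocks[i][1] += blocks[i + 1][1]
            del blocks[i + 1]
            i = max(i - 1, 0)
        else:
            i += 1
    calibrated = []
    for total, count in blocks:
        calibrated.extend([total / count] * count)
    return calibrated

# raw model scores vs. binary outcomes (e.g., did the member have >= 1 ED visit?)
print(pav_calibrate([0.2, 0.4, 0.6, 0.8], [0, 1, 0, 1]))
```

Because pooled blocks average to the observed rate, the mean calibrated probability equals the population outcome rate, which is the property that makes population-level estimates reliable.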

Clinical and Operational Insights

  • Point-in-time feature construction with no lookahead bias or data leakage
  • Person-level train/test splits that prevent information leakage from overlapping monthly windows
  • Claims lag adjustment to account for incomplete recent claims data
  • Feature importance and fill-rate diagnostics to catch data quality issues early
  • Model registry with signature-based reuse to avoid unnecessary retraining
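The person-level split mentioned above can be done deterministically by hashing the person ID, so every monthly window for a given member lands on the same side of the split. A sketch assuming a simple 80/20 split (the hashing scheme is illustrative, not necessarily what the package uses):

```python
import hashlib

def split_side(person_id, test_fraction=0.2):
    """Deterministically assign a person to 'train' or 'test' by hashing
    their ID, so all of that person's monthly rows stay together."""
    digest = hashlib.sha256(person_id.encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return "test" if bucket < test_fraction * 100 else "train"

# monthly anchor rows for two members
rows = [("p1", "2024-01"), ("p1", "2024-02"), ("p2", "2024-01")]
assignments = {pid: split_side(pid) for pid, _ in rows}
# every row for p1 gets the same assignment, regardless of month
print(assignments)
```

Splitting by row instead of by person would let overlapping 3/6/12-month windows from the same member appear on both sides of the split, silently inflating test metrics.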

Purpose-Built for Tuva Users

  • Runs entirely within your dbt workflow with no Jupyter, Airflow, or external ML platforms
  • Trained on your population, your cost structure, and your data completeness, with no selection bias
  • Separate models per data source for multi-payer environments
  • PHI-safe summary exports for non-technical stakeholders
  • Versioned model artifacts with full audit trail

Differentiation: Build vs Vendor vs Illuminate

Feature | Build In-House | Vendor Risk Scores | Illuminate Predictive Models
Training Data | Your own claims population, but requires substantial engineering investment | National averages that may not match your data | Your own claims population with no selection bias
Infrastructure | Pipeline orchestration, model hosting, serving, and monitoring all owned by your team | Separate ML platform, API integrations, or file transfers | Runs in your warehouse via dbt with zero external dependencies
Calibration | Must be designed and maintained internally | Requires manual adjustment factors for your population | Automatically calibrated to your actuals
Transparency | High if your team invests in diagnostics and documentation | Black-box scores with limited explainability | Full feature importance, fill rates, and diagnostics
Customization | Flexible but costly to build and maintain | Fixed model outputs, vendor-controlled roadmap | Configure targets, horizons, features, and thresholds via dbt vars
Updates | Dependent on internal roadmap and staffing | Annual or semi-annual vendor refresh cycles | Retrain anytime on fresh data with a single dbt run
Integration | Custom data products required for activation and BI | CSV drops, API calls, or proprietary formats | Native dbt tables in your warehouse, ready for downstream analytics

Quickstart Path

  1. Add Tuva and illuminate_predictive_models to packages.yml and run dbt deps.
  2. Set minimal vars in dbt_project.yml (for example ml_enabled: true).
  3. Run dbt run --select package:illuminate_predictive_models.
  4. Validate outputs in your ML schema before downstream operationalization.
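In practice, steps 1 and 2 amount to a few lines of YAML. The package coordinates, version pins, and git URL below are placeholders (only ml_enabled comes from the quickstart above), so check the package documentation for the real values:

```yaml
# packages.yml -- coordinates and versions shown are illustrative placeholders
packages:
  - package: tuva_health/the_tuva_project
    version: [">=0.1.0"]
  - git: "https://github.com/your-org/illuminate_predictive_models.git"  # hypothetical source
    revision: main

# dbt_project.yml (excerpt)
vars:
  ml_enabled: true
```

After dbt deps, dbt run --select package:illuminate_predictive_models builds the output tables in your ML schema.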

Core Output Contract

Output Table | Description
train_model_registry | Train/reuse status, artifact URI, diagnostics, and model metadata for the current run
predict_values | Predicted values by person, anchor month, target definition, and prediction horizon
predict_probabilities_long | Threshold and percentile probability outputs, including P(Y >= k) and spend top-percent probabilities
train_metrics_long | Train/test evaluation metrics, including MAE, RMSE, R², AUC, Brier score, and log loss
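As a sanity check on the evaluation outputs, the headline metrics can be recomputed directly from predictions and actuals. A stdlib-only sketch with toy numbers (illustrative, not the package's evaluation code):

```python
import math

def mae(actual, pred):
    """Mean absolute error: average magnitude of the prediction miss."""
    return sum(abs(a - p) for a, p in zip(actual, pred)) / len(actual)

def rmse(actual, pred):
    """Root mean squared error: penalizes large misses more heavily."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, pred)) / len(actual))

def brier(outcomes, probs):
    """Brier score: mean squared error between predicted probabilities
    and 0/1 outcomes; lower is better-calibrated."""
    return sum((y - p) ** 2 for y, p in zip(outcomes, probs)) / len(outcomes)

actual_spend = [100.0, 300.0, 0.0]
pred_spend = [120.0, 250.0, 30.0]
print(round(mae(actual_spend, pred_spend), 2))
print(round(rmse(actual_spend, pred_spend), 2))
print(round(brier([1, 0, 0], [0.7, 0.2, 0.1]), 3))
```

Recomputing a metric or two against predict_values is a quick way to validate outputs before downstream operationalization.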

Bring Predictive Modeling Into Your Existing Tuva Workflow

Keep your data, logic, and operational analytics in one place. Illuminate Predictive Models helps your team move from retrospective reporting to proactive risk targeting without adding a separate ML platform.

Book a Demo