Method 1: Regression Adjustment
Overview
Regression adjustment is the most straightforward causal inference method. It estimates the treatment effect by controlling for confounding variables through linear regression. The key insight is that if you can measure and control for all variables that influence both treatment and outcome, the remaining association between treatment and outcome can be interpreted as the causal effect.
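To make this concrete, here is a minimal simulation (entirely hypothetical data, with a single confounder) showing that the naive difference in means is biased, while the regression-adjusted coefficient recovers the true effect:

# Minimal sketch: one confounder (tenure) drives both adoption and output
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n = 5000
tenure = rng.normal(24, 6, n)                                        # confounder
treatment = (tenure + rng.normal(0, 6, n) > 24).astype(int)          # tenure drives adoption
outcome = 5 + 2.0 * treatment + 0.3 * tenure + rng.normal(0, 1, n)   # true effect = 2.0
sim = pd.DataFrame({'y': outcome, 't': treatment, 'tenure': tenure})

naive = sim.loc[sim.t == 1, 'y'].mean() - sim.loc[sim.t == 0, 'y'].mean()
adjusted = smf.ols('y ~ t + tenure', data=sim).fit().params['t']
print(f"Naive difference:  {naive:.2f}  (biased upward by tenure)")
print(f"Adjusted estimate: {adjusted:.2f}  (close to the true effect of 2.0)")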
When to Use Regression Adjustment
Best for:
- Few, well-understood confounders
- Linear relationships between variables
- When you have domain knowledge about confounding structure
- Initial exploratory causal analysis
Key Assumption:
- No unmeasured confounders (Conditional Ignorability)
- All variables that influence both treatment and outcome are measured and included (this is untestable from data alone; see the balance-check sketch after these lists)
Strengths:
- Simple and interpretable
- Fast to implement and compute
- Transparent modeling assumptions
- Easy to communicate to stakeholders
Limitations:
- Strong assumption about unmeasured confounders
- Sensitive to model misspecification
- May not handle complex interactions well
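The no-unmeasured-confounders assumption cannot be verified from data. What you can check is whether the confounders you did measure are balanced across treatment groups; large imbalances mean the adjustment is doing real work and unmeasured selection is plausible. A minimal sketch, assuming the df, treatment, and confounder columns used throughout this page and a 0/1 treatment:

# Sketch: balance check on *observed* confounders (ignorability itself is untestable)
def check_confounder_balance(df, treatment_col, confounder_cols):
    # Rows: confounders; columns: treatment group means (assumes treatment coded 0/1)
    means = df.groupby(treatment_col)[confounder_cols].mean().T
    # Simple standardized difference using the overall SD of each confounder
    means['std_diff'] = (means[1] - means[0]) / df[confounder_cols].std()
    print(means.round(3))
    flagged = means.index[means['std_diff'].abs() > 0.1].tolist()
    if flagged:
        print(f"⚠️ Imbalanced confounders (|std diff| > 0.1): {flagged}")
    return means

# check_confounder_balance(df, 'copilot_usage', confounders)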
Basic Implementation
Simple Regression Adjustment
import statsmodels.formula.api as smf
import pandas as pd
import numpy as np
def regression_adjustment(df, outcome_col, treatment_col, confounder_cols):
"""
Simple regression adjustment for causal inference
This function implements the fundamental equation:
Y = α + β*Treatment + γ₁*X₁ + γ₂*X₂ + ... + ε
The coefficient β represents our causal estimate of treatment effect.
"""
# Prepare formula
confounders_str = ' + '.join(confounder_cols)
formula = f'{outcome_col} ~ {treatment_col} + {confounders_str}'
print(f"Running regression: {formula}")
# Fit model
model = smf.ols(formula, data=df).fit()
# Extract treatment effect
treatment_effect = model.params[treatment_col]
confidence_interval = model.conf_int().loc[treatment_col]
results = {
'ate': treatment_effect,
'ci_lower': confidence_interval[0],
'ci_upper': confidence_interval[1],
'p_value': model.pvalues[treatment_col],
'r_squared': model.rsquared,
'model': model
}
return results
# Example usage with interpretation
confounders = ['tenure_months', 'job_level', 'team_size', 'manager_span']
result = regression_adjustment(df, 'tickets_per_week', 'copilot_usage', confounders)
print(f"\n=== CAUSAL EFFECT ESTIMATE ===")
print(f"Treatment Effect: {result['ate']:.2f} tickets per week")
print(f"95% Confidence Interval: ({result['ci_lower']:.2f}, {result['ci_upper']:.2f})")
print(f"P-value: {result['p_value']:.4f}")
print(f"Model R²: {result['r_squared']:.3f}")
# Interpret the results
if result['p_value'] < 0.05:
print(f"\n✅ SIGNIFICANT EFFECT DETECTED")
if result['ate'] > 0:
print(f"Copilot usage INCREASES productivity by {result['ate']:.2f} tickets/week")
print(f"Monthly impact: ~{result['ate'] * 4:.1f} additional tickets per user")
else:
print(f"Copilot usage DECREASES productivity by {abs(result['ate']):.2f} tickets/week")
print("Consider investigating implementation or training issues")
else:
print(f"\n❌ NO SIGNIFICANT EFFECT DETECTED")
print(f"Cannot conclude that Copilot has a causal effect on productivity")
print(f"Consider: larger sample size, different model specification, or other methods")
Understanding the Output:
- Treatment Effect (ATE): The average causal effect across all users
- Confidence Interval: Range of plausible true effects (95% confidence)
- P-value: Probability of observing an effect at least this large if the true effect were zero
- R²: How much variation the model explains (model fit quality)
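Beyond these headline numbers, the fitted statsmodels object returned in result['model'] carries the full regression table, which is worth inspecting before trusting the estimate:

# Full coefficient table, standard errors, and fit statistics
print(result['model'].summary())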
Model Diagnostics and Validation
Comprehensive Model Diagnostics
def regression_diagnostics(model, df, outcome_col, treatment_col):
"""
Comprehensive diagnostic testing for regression adjustment
"""
    from scipy import stats
    from statsmodels.stats.diagnostic import het_breuschpagan

    diagnostics = {}

    # 1. Residual analysis
    residuals = model.resid
    fitted_values = model.fittedvalues

    # Normality test (Jarque-Bera)
    jb_stat, normality_p = stats.jarque_bera(residuals)
    diagnostics['normality_test'] = {
        'statistic': jb_stat,
        'p_value': normality_p,
        'normal': normality_p > 0.05
    }

    # Homoscedasticity test (Breusch-Pagan lives in statsmodels, not scipy.stats)
    _, het_p, _, _ = het_breuschpagan(residuals, model.model.exog)
diagnostics['homoscedasticity_test'] = {
'p_value': het_p,
'homoscedastic': het_p > 0.05
}
# 2. Influential observations
influence = model.get_influence()
cooks_d = influence.cooks_distance[0]
leverage = influence.hat_matrix_diag
# Identify outliers
outliers = np.where(cooks_d > 4/len(df))[0]
high_leverage = np.where(leverage > 2 * len(model.params) / len(df))[0]
diagnostics['outliers'] = {
'cooks_d_outliers': len(outliers),
'high_leverage': len(high_leverage),
'outlier_indices': outliers[:10] # First 10 outliers
}
    # 3. Model specification tests
    # Ramsey RESET test for functional form (returns a result object, not a tuple)
    from statsmodels.stats.diagnostic import linear_reset
    reset_result = linear_reset(model, power=2)
    reset_p = reset_result.pvalue
diagnostics['specification_test'] = {
'reset_p_value': reset_p,
'correct_specification': reset_p > 0.05
}
# 4. Multicollinearity check
from statsmodels.stats.outliers_influence import variance_inflation_factor
vif_data = pd.DataFrame()
vif_data["Variable"] = model.model.exog_names[1:] # Exclude intercept
vif_data["VIF"] = [variance_inflation_factor(model.model.exog, i)
for i in range(1, model.model.exog.shape[1])]
diagnostics['multicollinearity'] = {
'max_vif': vif_data['VIF'].max(),
'high_vif_vars': vif_data[vif_data['VIF'] > 5]['Variable'].tolist()
}
# Print diagnostic summary
print("=== REGRESSION DIAGNOSTICS ===")
print(f"Normality test p-value: {diagnostics['normality_test']['p_value']:.4f}")
print(f"Homoscedasticity test p-value: {diagnostics['homoscedasticity_test']['p_value']:.4f}")
print(f"Specification test p-value: {diagnostics['specification_test']['reset_p_value']:.4f}")
print(f"Max VIF: {diagnostics['multicollinearity']['max_vif']:.2f}")
print(f"Outliers detected: {diagnostics['outliers']['cooks_d_outliers']}")
# Warnings
warnings = []
if not diagnostics['normality_test']['normal']:
warnings.append("Non-normal residuals - consider robust standard errors")
if not diagnostics['homoscedasticity_test']['homoscedastic']:
warnings.append("Heteroscedasticity detected - use robust standard errors")
if not diagnostics['specification_test']['correct_specification']:
warnings.append("Model specification issues - consider interactions or polynomials")
if diagnostics['multicollinearity']['max_vif'] > 5:
warnings.append(f"High multicollinearity in: {diagnostics['multicollinearity']['high_vif_vars']}")
if warnings:
print("\n⚠️ WARNINGS:")
for warning in warnings:
print(f" - {warning}")
else:
print("\n✅ All diagnostic tests passed")
return diagnostics
# Run diagnostics
diagnostic_results = regression_diagnostics(
result['model'], df, 'tickets_per_week', 'copilot_usage'
)
Robust Standard Errors
def regression_with_robust_se(df, outcome_col, treatment_col, confounder_cols):
"""
Regression adjustment with robust standard errors
"""
# Prepare formula
confounders_str = ' + '.join(confounder_cols)
formula = f'{outcome_col} ~ {treatment_col} + {confounders_str}'
# Fit model with robust standard errors
model = smf.ols(formula, data=df).fit(cov_type='HC3') # HC3 robust standard errors
# Extract results
treatment_effect = model.params[treatment_col]
robust_se = model.bse[treatment_col]
robust_ci = model.conf_int().loc[treatment_col]
results = {
'ate': treatment_effect,
'robust_se': robust_se,
'ci_lower': robust_ci[0],
'ci_upper': robust_ci[1],
'p_value': model.pvalues[treatment_col],
'model': model
}
print(f"=== ROBUST REGRESSION RESULTS ===")
print(f"Treatment Effect: {results['ate']:.3f}")
print(f"Robust Standard Error: {results['robust_se']:.3f}")
print(f"95% CI: [{results['ci_lower']:.3f}, {results['ci_upper']:.3f}]")
return results
# Run with robust standard errors
robust_result = regression_with_robust_se(
df, 'tickets_per_week', 'copilot_usage', confounders
)
Advanced Regression with Interaction Effects
Heterogeneous Treatment Effects
Sometimes the treatment effect differs across subgroups. Interaction terms let the model estimate these heterogeneous effects, which is crucial for targeted deployment strategies.
def regression_with_interactions(df, outcome_col, treatment_col, confounder_cols, interaction_vars=None):
"""
Regression adjustment with interaction terms for heterogeneous effects
Interaction terms allow the treatment effect to vary by subgroup:
Y = α + β₁*Treatment + β₂*Group + β₃*(Treatment × Group) + controls
The interaction coefficient β₃ tells us how the treatment effect
differs between groups.
"""
base_formula = f'{outcome_col} ~ {treatment_col} + {" + ".join(confounder_cols)}'
# Add interaction terms
if interaction_vars:
interactions = [f'{treatment_col}:{var}' for var in interaction_vars]
formula = base_formula + ' + ' + ' + '.join(interactions)
else:
formula = base_formula
print(f"Running interaction model: {formula}")
model = smf.ols(formula, data=df).fit()
# Calculate marginal effects for different groups
effects = {}
main_effect = model.params[treatment_col]
if interaction_vars:
for var in interaction_vars:
interaction_term = f'{treatment_col}:{var}'
if interaction_term in model.params:
interaction_coeff = model.params[interaction_term]
# Calculate effects for each level of the interaction variable
for level in df[var].unique():
if pd.api.types.is_numeric_dtype(df[var]):
# For continuous variables, show effect at different values
marginal_effect = main_effect + (interaction_coeff * level)
effects[f'{var}_{level}'] = marginal_effect
                    else:
                        # For categorical variables this branch is a simplification:
                        # patsy names categorical interaction terms like
                        # 'treatment:var[T.level]', so the simple lookup above only
                        # matches numeric variables. The reference level is the first
                        # category (alphabetical by default), not the mode.
                        if level == sorted(df[var].unique())[0]:  # Reference category
                            effects[f'{var}_{level}'] = main_effect
                        else:
                            effects[f'{var}_{level}'] = main_effect + interaction_coeff
return model, effects
# Example: Different effects by job level
model, effects = regression_with_interactions(
df, 'tickets_per_week', 'copilot_usage',
confounders, interaction_vars=['job_level']
)
print(f"\n=== HETEROGENEOUS TREATMENT EFFECTS ===")
for group, effect in effects.items():
print(f"{group}: {effect:.2f} tickets/week")
# Interpret interaction effects
print(f"\n=== INTERACTION INTERPRETATION ===")
print("These results show how Copilot's effect varies by job level.")
print("Use this to target deployment to groups with highest returns.")
Marginal Effects Analysis
def calculate_marginal_effects(model, df, treatment_col, continuous_vars):
"""
Calculate marginal effects for continuous interaction variables
"""
marginal_effects = {}
for var in continuous_vars:
interaction_term = f'{treatment_col}:{var}'
if interaction_term in model.params:
# Calculate marginal effect at different percentiles
percentiles = [10, 25, 50, 75, 90]
var_percentiles = np.percentile(df[var], percentiles)
main_effect = model.params[treatment_col]
interaction_coeff = model.params[interaction_term]
effects_at_percentiles = {}
for p, value in zip(percentiles, var_percentiles):
marginal_effect = main_effect + (interaction_coeff * value)
effects_at_percentiles[f'p{p}'] = marginal_effect
marginal_effects[var] = effects_at_percentiles
return marginal_effects
# Calculate marginal effects for continuous variables
if 'tenure_months' in confounders:
marginal_effects = calculate_marginal_effects(
model, df, 'copilot_usage', ['tenure_months']
)
print("\n=== MARGINAL EFFECTS BY TENURE ===")
for var, effects in marginal_effects.items():
print(f"{var}:")
for percentile, effect in effects.items():
print(f" {percentile}: {effect:.3f}")
Polynomial and Non-linear Specifications
Testing Non-linear Relationships
def test_nonlinear_specifications(df, outcome_col, treatment_col, confounder_cols):
"""
Test different functional forms to improve model specification
"""
specifications = {}
# 1. Linear specification (baseline)
linear_formula = f'{outcome_col} ~ {treatment_col} + {" + ".join(confounder_cols)}'
linear_model = smf.ols(linear_formula, data=df).fit()
specifications['linear'] = {
'model': linear_model,
'r_squared': linear_model.rsquared,
'aic': linear_model.aic,
'ate': linear_model.params[treatment_col]
}
    # 2. Quadratic treatment effect
    # (only meaningful for continuous treatments; for a 0/1 treatment,
    # treatment² is identical to treatment and the model is degenerate)
    df_quad = df.copy()
    df_quad['treatment_squared'] = df_quad[treatment_col] ** 2
quad_formula = f'{outcome_col} ~ {treatment_col} + treatment_squared + {" + ".join(confounder_cols)}'
quad_model = smf.ols(quad_formula, data=df_quad).fit()
specifications['quadratic'] = {
'model': quad_model,
'r_squared': quad_model.rsquared,
'aic': quad_model.aic,
'linear_coeff': quad_model.params[treatment_col],
'quad_coeff': quad_model.params['treatment_squared']
}
# 3. Log specification (if outcome is positive)
if (df[outcome_col] > 0).all():
df_log = df.copy()
df_log['log_outcome'] = np.log(df_log[outcome_col])
log_formula = f'log_outcome ~ {treatment_col} + {" + ".join(confounder_cols)}'
log_model = smf.ols(log_formula, data=df_log).fit()
specifications['log'] = {
'model': log_model,
'r_squared': log_model.rsquared,
'aic': log_model.aic,
            'ate_percent': (np.exp(log_model.params[treatment_col]) - 1) * 100  # Exact % effect; β*100 only approximates it for small β
}
# Compare specifications
print("=== MODEL SPECIFICATION COMPARISON ===")
for spec_name, spec_results in specifications.items():
print(f"{spec_name.upper()}:")
print(f" R²: {spec_results['r_squared']:.4f}")
print(f" AIC: {spec_results['aic']:.2f}")
if 'ate' in spec_results:
print(f" Treatment Effect: {spec_results['ate']:.3f}")
elif 'ate_percent' in spec_results:
print(f" Treatment Effect: {spec_results['ate_percent']:.1f}%")
    # Recommend best specification
    # Note: AIC is only comparable across models with the same outcome variable,
    # so the log specification is excluded from this comparison.
    comparable = {k: v for k, v in specifications.items() if k != 'log'}
    best_spec = min(comparable.items(), key=lambda x: x[1]['aic'])
    print(f"\n✅ RECOMMENDED SPECIFICATION: {best_spec[0].upper()} (lowest AIC)")
return specifications
# Test different specifications
specifications = test_nonlinear_specifications(
df, 'tickets_per_week', 'copilot_usage', confounders
)
Sensitivity Analysis
Testing Robustness to Model Specification
def sensitivity_analysis_specification(df, outcome_col, treatment_col, all_confounders):
"""
Test sensitivity of results to different confounder specifications
"""
# Generate different confounder combinations
results = []
# Minimal specification
minimal_confounders = all_confounders[:2]
minimal_result = regression_adjustment(df, outcome_col, treatment_col, minimal_confounders)
results.append({
'specification': 'minimal',
'confounders': minimal_confounders,
'ate': minimal_result['ate'],
'ci_lower': minimal_result['ci_lower'],
'ci_upper': minimal_result['ci_upper'],
'n_confounders': len(minimal_confounders)
})
# Medium specification
medium_confounders = all_confounders[:len(all_confounders)//2]
medium_result = regression_adjustment(df, outcome_col, treatment_col, medium_confounders)
results.append({
'specification': 'medium',
'confounders': medium_confounders,
'ate': medium_result['ate'],
'ci_lower': medium_result['ci_lower'],
'ci_upper': medium_result['ci_upper'],
'n_confounders': len(medium_confounders)
})
# Full specification
full_result = regression_adjustment(df, outcome_col, treatment_col, all_confounders)
results.append({
'specification': 'full',
'confounders': all_confounders,
'ate': full_result['ate'],
'ci_lower': full_result['ci_lower'],
'ci_upper': full_result['ci_upper'],
'n_confounders': len(all_confounders)
})
# Create sensitivity plot
sensitivity_df = pd.DataFrame(results)
print("=== SENSITIVITY TO CONFOUNDER SPECIFICATION ===")
for _, row in sensitivity_df.iterrows():
print(f"{row['specification'].upper()}: ATE = {row['ate']:.3f} "
f"[{row['ci_lower']:.3f}, {row['ci_upper']:.3f}] "
f"({row['n_confounders']} confounders)")
# Check if confidence intervals overlap
ate_range = (sensitivity_df['ate'].min(), sensitivity_df['ate'].max())
ate_stability = (ate_range[1] - ate_range[0]) / abs(sensitivity_df['ate'].mean())
print(f"\nATE range: [{ate_range[0]:.3f}, {ate_range[1]:.3f}]")
print(f"Relative stability: {ate_stability:.1%}")
if ate_stability < 0.2:
print("✅ Results are stable across specifications")
else:
print("⚠️ Results vary significantly across specifications")
print(" Consider robustness checks or alternative methods")
return sensitivity_df
# Run sensitivity analysis
sensitivity_results = sensitivity_analysis_specification(
df, 'tickets_per_week', 'copilot_usage', confounders
)
Business Value Interpretation
Converting Statistical Results to Business Impact
def interpret_business_impact(ate, baseline_mean, confidence_interval, cost_per_license=50):
    """
    Convert regression results to business-relevant metrics

    cost_per_license is the assumed monthly license cost per user (USD).
    """
# Calculate percentage improvement
percentage_improvement = (ate / baseline_mean) * 100
# Calculate business value
weekly_improvement = ate
monthly_improvement = ate * 4.33 # Average weeks per month
annual_improvement = ate * 52
# ROI calculation (simplified)
annual_license_cost = cost_per_license * 12
# Assume each ticket represents $20 in value
ticket_value = 20
annual_value = annual_improvement * ticket_value
roi = ((annual_value - annual_license_cost) / annual_license_cost) * 100
# Effect size interpretation
if abs(ate) < 0.5:
effect_magnitude = "small"
elif abs(ate) < 2:
effect_magnitude = "medium"
else:
effect_magnitude = "large"
interpretation = {
'statistical': {
'ate': ate,
'percentage_improvement': percentage_improvement,
'confidence_interval': confidence_interval,
'effect_magnitude': effect_magnitude
},
'business': {
'weekly_improvement': weekly_improvement,
'monthly_improvement': monthly_improvement,
'annual_improvement': annual_improvement,
'annual_value': annual_value,
'roi_percent': roi
}
}
# Create business summary
print("=== BUSINESS IMPACT SUMMARY ===")
print(f"Copilot Effect: {ate:.2f} additional tickets per week per user")
print(f"Percentage Improvement: {percentage_improvement:.1f}%")
print(f"Monthly Impact: {monthly_improvement:.1f} additional tickets per user")
print(f"Annual Value: ${annual_value:.0f} per user")
print(f"ROI: {roi:.0f}%")
print(f"Effect Size: {effect_magnitude}")
# Uncertainty interpretation
ci_lower, ci_upper = confidence_interval
print(f"\nUncertainty Range:")
print(f"Best case: {ci_upper:.2f} tickets/week ({(ci_upper/baseline_mean)*100:.1f}% improvement)")
print(f"Worst case: {ci_lower:.2f} tickets/week ({(ci_lower/baseline_mean)*100:.1f}% improvement)")
return interpretation
# Interpret business impact
baseline_productivity = df[df['copilot_usage'] == 0]['tickets_per_week'].mean()
business_impact = interpret_business_impact(
ate=result['ate'],
baseline_mean=baseline_productivity,
confidence_interval=(result['ci_lower'], result['ci_upper'])
)
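Because the $20-per-ticket valuation above is an assumption rather than a measured quantity, it is worth checking how the ROI conclusion moves with it. A quick sketch using hypothetical valuations:

# Sketch: ROI sensitivity to the assumed dollar value per ticket (hypothetical values)
for ticket_value in [10, 20, 40]:
    annual_value = result['ate'] * 52 * ticket_value
    annual_cost = 50 * 12  # assumed monthly license cost of $50
    roi = (annual_value - annual_cost) / annual_cost * 100
    print(f"${ticket_value}/ticket → ROI: {roi:.0f}%")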
Best Practices and Recommendations
Implementation Checklist
Before using regression adjustment results (an end-to-end workflow sketch follows this list):
- ✅ Check assumptions: No unmeasured confounders assumption is critical
- ✅ Run diagnostics: Test for normality, homoscedasticity, specification
- ✅ Use robust SEs: Protect against heteroscedasticity
- ✅ Test interactions: Check for heterogeneous treatment effects
- ✅ Sensitivity analysis: Test robustness to model specification
- ✅ Business translation: Convert statistical results to actionable insights
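A minimal end-to-end sketch tying the checklist to the functions defined on this page (same df and column names as in the examples above):

# Sketch: full regression-adjustment workflow using this page's functions
confounders = ['tenure_months', 'job_level', 'team_size', 'manager_span']

result = regression_adjustment(df, 'tickets_per_week', 'copilot_usage', confounders)       # estimate
regression_diagnostics(result['model'], df, 'tickets_per_week', 'copilot_usage')           # diagnose
robust = regression_with_robust_se(df, 'tickets_per_week', 'copilot_usage', confounders)   # robust SEs
sensitivity_analysis_specification(df, 'tickets_per_week', 'copilot_usage', confounders)   # robustness
baseline = df[df['copilot_usage'] == 0]['tickets_per_week'].mean()
interpret_business_impact(robust['ate'], baseline, (robust['ci_lower'], robust['ci_upper']))  # translate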
When to Move to Other Methods
Consider alternative methods if:
- Diagnostic tests fail consistently
- Results are sensitive to model specification
- You suspect unmeasured confounders
- Treatment assignment appears non-random
- Need to handle complex treatment timing
Next Method: Propensity Score Methods →
Regression adjustment provides a solid foundation for causal analysis when assumptions are met. Always validate results through diagnostics and sensitivity analysis before making business decisions.