Can Family Engagement Close the Achievement Gap?

A comprehensive statistical investigation using multiple General Linear Models, ridge regression, and machine learning to test whether parent involvement can reduce socioeconomic disparities in K-12 education

  • 25,391 students analyzed
  • 18% stronger engagement effect for low-income students
  • 5 statistical models

Learning Objectives Addressed

✓ Objective 1: Probability as Foundation

  • Maximum likelihood estimation for all models
  • P-values, confidence intervals, standard errors
  • Cross-validation for generalization
  • Understanding sampling variability

✓ Objective 2: Appropriate GLM Application

  • Multinomial logistic (categorical outcome)
  • Binary logistic (dichotomous screening)
  • Poisson regression (count data)
  • Matching model to response type

✓ Objective 3: Model Selection

  • Compared 5 different approaches
  • Cross-validation metrics
  • Test set performance
  • Convergence across methods

✓ Objective 4: General Audience Communication

  • Non-technical problem framing
  • Visual storytelling
  • Policy implications
  • Accessible findings presentation

✓ Objective 5: Programming Implementation

  • tidymodels framework
  • Reproducible workflows
  • Version control (GitHub)
  • Professional documentation

Required Models Implemented

  • ✓ Multiple regression with quantitative + qualitative predictors
  • ✓ Multinomial logistic regression with multiple predictors
  • ✓ Poisson regression AND Linear Discriminant Analysis
  • ✓ Ridge regression for regularization
  • ✓ Polynomial regression (interaction terms)

The Problem: Education's Stubborn Inequality

In America today, your ZIP code predicts your academic success better than your ability. High-income students are 1.7 times more likely to earn mostly A's compared to low-income students—a 25.8 percentage point gap that has persisted for decades.

Figure 1: The Achievement Gap. High-income students achieve "mostly A's" at 62.3% vs. 36.5% for low-income students.

But here's what makes this particularly heartbreaking: it's not about intelligence or potential. It's about resources, opportunities, and support systems that aren't equally distributed.

The Research Question

I wanted to investigate whether family engagement—something that doesn't require money—could help level the playing field. Specifically:

Central Hypothesis

Does family engagement in education provide a STRONGER protective effect for disadvantaged students than for advantaged students?

If yes, targeted engagement programs could be a powerful equity intervention.

This is called the "compensatory hypothesis" in education research: the idea that certain interventions might help close gaps rather than just raising all boats equally.

Why This Matters

Traditional education interventions often struggle with an equity paradox: programs meant to help everyone tend to be captured by families who already have advantages. Tutoring programs? Wealthy families sign up first. Advanced classes? Kids whose parents navigate the system. Summer programs? Transportation and cost create barriers.

But family engagement—attending parent-teacher conferences, helping with homework, participating in school activities—these don't require wealth. If engagement helps disadvantaged students MORE, it suggests a truly equitable intervention strategy.


Data & Methodology

Dataset

I analyzed the NCES Parent and Family Involvement in Education (PFI) Survey, combining 2016 and 2019 waves for a sample of 25,391 K-12 students after cleaning.

The dataset captures family engagement activities (school involvement, homework help, enrichment at home), household characteristics (income, parent education), and student outcomes (grades, absences, at-risk indicators).

Composite Engagement Measures

Rather than treating each activity separately, I created three composite measures capturing different dimensions of involvement:

# School Engagement (0-8 activities)
school_engagement = attend_event + volunteer + general_meeting + 
                    pta_meeting + parent_teacher_conf + fundraising + 
                    committee + counselor

# Homework Involvement (standardized)
homework_involvement = scale(
  (homework_days + homework_hours + homework_help) / 3
)[,1]

# Cultural Enrichment (weighted composite)
cultural_enrichment = (story + crafts + games + projects + sports_home) +
                      (library + bookstore) / 4 + dinners / 7

Why Composites?

Individual activities are noisy. A family might attend one event but not another due to scheduling, not disengagement. Composite scores capture breadth of involvement, which is more predictive than any single activity.

Testing the Compensatory Hypothesis

The key to testing whether engagement helps disadvantaged students MORE was including interaction terms:

# Interaction: does the engagement effect differ by income?
grade_category ~ ... + income + school_engagement +
                 income:school_engagement + ...

If the interaction coefficient is negative and significant, it means engagement reduces risk MORE for low-income students—evidence for the compensatory hypothesis.
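To make this logic concrete, here is a minimal sketch on simulated data (not the NCES file; all variable names and effect sizes are hypothetical): a logistic model of earning mostly A's with an income × engagement interaction built in to have the compensatory pattern.

```r
# Simulated compensatory pattern: engagement has a steeper positive
# effect for low-income students (true interaction = -0.20)
set.seed(42)
n <- 5000
income     <- rbinom(n, 1, 0.5)               # 1 = high income (hypothetical coding)
engagement <- sample(0:8, n, replace = TRUE)  # 0-8 school activities

eta <- -1.5 + 1.0 * income + 0.40 * engagement - 0.20 * income * engagement
mostly_As <- rbinom(n, 1, plogis(eta))

fit <- glm(mostly_As ~ income * engagement, family = binomial)
round(coef(summary(fit))["income:engagement", ], 3)
# A negative, significant estimate is the compensatory signature.
```

With enough data the fitted interaction recovers the negative sign, which is exactly the pattern the real analysis tests for.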


Statistical Models Implemented

I implemented five different modeling approaches, each addressing a different analytical question and demonstrating mastery of appropriate GLM selection:

Model 1: Multinomial Logistic Regression (Primary Model)

Why this model: Student grades have 4 unordered categories (High Achievers, Solid Performers, Struggling, At-Risk). Multinomial logistic handles categorical outcomes without assuming ordinal relationships.

Model specification:

multinom_spec <- multinom_reg() %>%
  set_engine("nnet") %>%
  set_mode("classification")

recipe <- recipe(grade_category ~ ., data = train) %>%
  step_dummy(all_nominal(), -all_outcomes()) %>%
  step_interact(terms = ~ income:school_engagement + 
                        parent_ed:homework_involvement) %>%
  step_normalize(all_predictors()) %>%
  step_zv(all_predictors())
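As a hedged sketch of how these pieces assemble, the spec and a simplified version of the recipe can be combined into a workflow, fit, and scored on a held-out split. The data below are simulated (not the NCES file), and step_dummy() is dropped because the simulated predictors are all numeric.

```r
library(tidymodels)

set.seed(1)
dat <- tibble(
  grade_category       = factor(sample(c("High", "Solid", "Struggling", "AtRisk"),
                                       1000, replace = TRUE)),
  income               = runif(1000, 1, 10),
  parent_ed            = runif(1000, 1, 5),
  school_engagement    = sample(0:8, 1000, replace = TRUE),
  homework_involvement = rnorm(1000)
)
split     <- initial_split(dat, prop = 0.8, strata = grade_category)
train_sim <- training(split)
test_sim  <- testing(split)

multinom_spec <- multinom_reg() %>%
  set_engine("nnet") %>%
  set_mode("classification")

rec <- recipe(grade_category ~ ., data = train_sim) %>%
  step_interact(terms = ~ income:school_engagement +
                          parent_ed:homework_involvement) %>%
  step_normalize(all_predictors()) %>%
  step_zv(all_predictors())

wf_fit <- workflow() %>%
  add_recipe(rec) %>%
  add_model(multinom_spec) %>%
  fit(data = train_sim)

# Held-out accuracy on the test split
acc <- predict(wf_fit, new_data = test_sim) %>%
  bind_cols(test_sim["grade_category"]) %>%
  accuracy(truth = grade_category, estimate = .pred_class)
acc
```

Because the simulated grades are pure noise, accuracy here hovers near chance; on the real data the same pipeline produces the results reported below.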

Performance: Achieved 62.9% cross-validation accuracy with stable test set performance (62.2%), substantially exceeding the 54.1% baseline from always predicting the majority class.

Key finding: Income × school engagement interaction was negative and significant (β = -0.202, p = 0.009), providing direct evidence for the compensatory hypothesis.

Model 2: Binary Logistic Regression (Screening Application)

Why this model: For practical early-warning systems, schools need a binary classification: at-risk or not. Binary logistic provides interpretable odds ratios for risk factors.

Outcome definition: Students are "at-risk" if they earn C's or lower AND either have high absenteeism (>10 days) or low school enjoyment.

Challenge encountered: Severe class imbalance (94% not at-risk, 6% at-risk) led to overfitting. The model achieved 80.2% ROC-AUC in cross-validation but collapsed to 21.3% on the test set.

Methodological Lesson: Class Imbalance

This demonstrates why overall accuracy can be misleading. The model achieved 94% accuracy by predicting nearly everyone as not at-risk—useless for identifying students who need help!

Solution for future work: Implement down-sampling, class weights, or threshold optimization before deployment.
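One of those remedies, down-sampling the majority class, can be sketched with the themis recipe step (package assumed installed); the data here are simulated with the same roughly 6% positive rate.

```r
library(tidymodels)
library(themis)   # provides step_downsample()

set.seed(7)
train_sim <- tibble(
  at_risk = factor(rbinom(1000, 1, 0.06)),  # ~6% minority class, as in the data
  x1 = rnorm(1000),
  x2 = rnorm(1000)
)

rec_balanced <- recipe(at_risk ~ ., data = train_sim) %>%
  step_downsample(at_risk)   # equalize class counts during model fitting

# bake(new_data = NULL) returns the (down-sampled) training data
baked <- prep(rec_balanced) %>% bake(new_data = NULL)
table(baked$at_risk)         # the two classes are now the same size
```

By default the step only applies during training, so test-set predictions are still made on the natural class distribution.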

Coefficients still interpretable: Despite prediction failures, the model revealed that homework involvement reduces at-risk odds by 41% per standard deviation increase (OR = 0.59, p < 0.001)—a finding consistent across all models.
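The odds-ratio arithmetic can be checked directly:

```r
# From log-odds coefficient to odds ratio and percent reduction
beta <- -0.529                 # homework involvement coefficient (per SD)
or   <- exp(beta)
round(or, 2)                   # 0.59
round((1 - or) * 100, 1)       # 41.1 -> "reduces at-risk odds by ~41%"
```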

Figure 2: Binary Logistic Regression Coefficients. Negative values = protective effects. Homework involvement shows strongest protective effect (β = -0.529).

Model 3: Poisson Regression (Attendance Analysis)

Why this model: Days absent is count data (non-negative integers, no upper bound). Poisson regression models the expected count through a log link, making it the natural GLM for this response.
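A hedged sketch of this model's base-R form, on simulated counts (the rate and effect size below are hypothetical, not estimates from the survey):

```r
# Poisson GLM with log link on simulated absence counts
set.seed(3)
n <- 2000
engagement  <- sample(0:8, n, replace = TRUE)
days_absent <- rpois(n, lambda = exp(1.6 - 0.05 * engagement))  # hypothetical rate

pois_fit <- glm(days_absent ~ engagement, family = poisson(link = "log"))
exp(coef(pois_fit)[["engagement"]])  # multiplicative rate ratio per extra activity
```

Exponentiating a Poisson coefficient gives a rate ratio: values below 1 mean fewer expected absences per additional engagement activity.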

Performance: Cross-validation RMSE of 4.55 days, improving to 4.37 on test set. R² of 0.146 indicates the model explains 14.6% of variance in absences.


Model 4: Linear Discriminant Analysis (Validation)

Why this model: LDA uses different statistical assumptions (multivariate normality, equal covariance matrices) than multinomial logistic. Convergence between methods validates that findings aren't artifacts of modeling choices.
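A hedged sketch of the LDA fit using MASS (the engine the analysis used), on simulated four-class data with hypothetical names:

```r
library(MASS)

set.seed(5)
n <- 1200
grade <- factor(sample(c("High", "Solid", "Struggling", "AtRisk"), n, replace = TRUE))

# Class-dependent means plus Gaussian noise, so the groups are partially separable
centers <- matrix(runif(12, 0, 2), nrow = 4, ncol = 3)
x <- model.matrix(~ grade - 1) %*% centers + matrix(rnorm(n * 3), ncol = 3)
dat <- data.frame(grade, x)

lda_fit <- lda(grade ~ ., data = dat)
mean(predict(lda_fit)$class == dat$grade)   # in-sample accuracy of the LDA classifier
```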

Performance: Achieved 62.6% cross-validation accuracy, nearly identical to multinomial's 62.9%.

Robustness Check: Model Convergence

When models with different assumptions yield nearly identical results (multinomial 62.9% vs. LDA 62.6%), it provides strong evidence that findings are robust and not dependent on specific parametric assumptions.

Model 5: Ridge Regression (Regularization)

Why this model: With three correlated engagement measures plus interaction terms, multicollinearity could inflate coefficient standard errors. Ridge regression shrinks coefficients toward zero, improving stability and interpretability.

Implementation:

ridge_spec <- multinom_reg(penalty = tune(), mixture = 0) %>%  # mixture = 0 => pure ridge
  set_engine("glmnet") %>%
  set_mode("classification")

ridge_results <- ridge_wf %>%
  tune_grid(
    resamples = cv_folds,
    # penalty() tunes lambda on the log10 scale: 10^-5 to 10^0
    grid = grid_regular(penalty(range = c(-5, 0)), levels = 20)
  )

Optimal penalty selection: Cross-validation identified λ = 0.01 as providing the best balance between bias and variance.
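The selection and finalization step can be sketched as follows. For brevity this self-contained example tunes a simulated linear-regression ridge model; the select_best() / finalize_workflow() calls are the same for the multinomial workflow above.

```r
library(tidymodels)

set.seed(9)
dat   <- tibble(y = rnorm(300), x1 = rnorm(300), x2 = rnorm(300))
folds <- vfold_cv(dat, v = 5)

ridge_demo_wf <- workflow() %>%
  add_model(linear_reg(penalty = tune(), mixture = 0) %>% set_engine("glmnet")) %>%
  add_formula(y ~ x1 + x2)

res <- tune_grid(
  ridge_demo_wf,
  resamples = folds,
  grid = grid_regular(penalty(range = c(-5, 0)), levels = 20)  # log10 scale
)

best_penalty <- select_best(res, metric = "rmse")  # one-row tibble holding lambda
final_fit <- finalize_workflow(ridge_demo_wf, best_penalty) %>%
  fit(data = dat)
```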

Key finding: Ridge-regularized coefficients were highly similar to non-regularized multinomial logistic, confirming that multicollinearity wasn't severely inflating estimates. The compensatory effect remained significant with similar magnitude.

Model                  Response Type   CV Performance    Test Performance   Status
Multinomial Logistic   4 categories    62.9% accuracy    62.2% accuracy     ✓ Stable
Binary Logistic        Binary          80.2% ROC-AUC     21.3% ROC-AUC      ⚠ Overfit
Poisson                Count           4.55 RMSE         4.37 RMSE          ✓ Improved
LDA                    4 categories    62.6% accuracy    62.1% accuracy     ✓ Stable
Ridge (Multinomial)    4 categories    62.7% accuracy    62.0% accuracy     ✓ Stable

Key Findings

Core Discovery: The Compensatory Effect is Real

  • Statistical evidence: Income × engagement interaction β = -0.202, p = 0.009
  • Practical meaning: Each school activity reduces at-risk probability by 18% for low-income students versus 10% for high-income students
  • Robustness: Finding replicated across multinomial logistic, binary logistic, and ridge regression
  • Policy relevance: Targeted engagement programs yield higher returns than universal programs

The Achievement Gap

High-income students achieve "mostly A's" at 62.3% versus 36.5% for low-income students—a 25.8 percentage point gap. Parent education shows an even stronger relationship: children of college graduates earn A's at 2.3 times the rate of children whose parents have high school education or less.

Engagement Patterns Differ by SES

Figure 3: School Engagement by Income. High-income families average 4.54 activities vs. 3.66 for low-income (Cohen's d = 0.44).

The 0.88 activity gap likely reflects structural barriers (time constraints from multiple jobs, transportation, less welcoming environments) rather than differential interest. Critically: low-income families CAN and DO engage more when barriers are removed.

The Compensatory Effect in Action

Figure 4: % Achieving Mostly A's by Engagement Level. Steeper slope for low-income students (orange) demonstrates compensatory effect.

Moving from low to high engagement raises the share of students earning mostly A's in both income groups, with a noticeably steeper gain for low-income students (Figure 4).

While high-income students benefit slightly more in absolute terms, the relative benefit is much larger for low-income students—this is the essence of the compensatory effect.

Which Practices Matter Most?

Strongest Protective Factors (Ranked by Effect Size)

  1. Homework involvement - OR = 0.59 (41% reduction in at-risk odds)
  2. Cultural enrichment - β = -0.231, p < 0.001
  3. School engagement - OR = 0.90 per activity
  4. Parent-teacher conferences - OR = 0.85

Importantly, most of these activities require time but minimal financial resources, making them accessible across income levels when barriers are addressed.


Why This Matters: Policy Implications

The Equity Imperative

These findings directly challenge the assumption that education interventions help everyone equally. Because engagement is more protective for low-income students, targeted engagement programs can actively narrow achievement gaps, whereas universal programs risk widening them when uptake skews toward already-advantaged families.

Actionable Recommendations for Schools

Evidence-Based Strategies

1. Prioritize Homework Involvement Programs (Strongest Effect: OR = 0.59)

  • Provide structured homework help sessions at school
  • Train parents in effective support strategies (focus on effort, not just answers)
  • Set clear, achievable expectations with progress monitoring
  • Create homework helplines or online resources for working parents

2. Target Low-Income Families (18% Compensatory Advantage)

  • Focus outreach on disadvantaged communities with personalized invitations
  • Remove barriers: provide transportation, childcare, flexible scheduling
  • Universal programs risk being captured by high-resource families
  • Track participation by income to ensure equity

3. Focus on Accessible Activities

  • Parent-teacher conferences (no special resources needed)
  • School event attendance (builds community connections)
  • General meetings (low commitment threshold for initial engagement)
  • Avoid expensive activities that create barriers (fundraising galas, etc.)

4. Support Students with Disabilities (Highest Risk Factor: OR = 1.61)

  • Generic engagement programs won't address specialized needs
  • Require targeted interventions beyond family involvement
  • Coordinate between special education and family engagement staff

What Won't Work

Traditional approaches that this research suggests are less effective:

  • Untargeted universal programs, which tend to be captured by high-resource families
  • Engagement activities with financial or logistical barriers (e.g., fundraising galas)
  • Generic engagement programs for students with disabilities, whose needs require specialized support

Economic Argument

Beyond moral imperatives, targeted engagement is cost-effective:


Reflection: What I Learned

Objective 1: Probability as Foundation

This project deepened my understanding of how probability theory underpins every statistical decision:

Maximum Likelihood Estimation: Rather than viewing MLE as a black-box optimization, I now understand it as finding parameters that maximize P(data | parameters). For multinomial logistic, this means finding coefficients that make the observed grade distribution most likely given the predictors.

Inference Tools: P-values, confidence intervals, and standard errors are all fundamentally about quantifying uncertainty due to sampling variability. The interaction term p-value (0.009) tells us that if there were truly no compensatory effect, we'd see an effect this large less than 1% of the time by chance alone.

Cross-Validation: By estimating P(correct prediction | new data), CV assesses how well models generalize beyond the training set. The stable performance (62.9% CV → 62.2% test) indicates appropriate model complexity.
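That CV procedure can be sketched on simulated data (hypothetical variables, not the survey): each of 10 folds is held out in turn, and the held-out metrics are averaged.

```r
library(tidymodels)

set.seed(11)
x   <- rnorm(500)
dat <- tibble(x = x, y = factor(rbinom(500, 1, plogis(0.8 * x))))

folds <- vfold_cv(dat, v = 10)   # 10-fold cross-validation splits

cv_res <- logistic_reg() %>%
  fit_resamples(y ~ x, resamples = folds)

collect_metrics(cv_res)   # mean held-out metrics across the 10 folds
```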

Objective 2: Applying Appropriate GLMs

Matching models to response types was crucial:

Outcome               Response Type            Why This GLM
Grades                4 unordered categories   Multinomial logistic handles categorical without assuming order
At-risk status        Binary                   Binary logistic bounds predictions to [0, 1], interpretable odds ratios
Days absent           Count                    Poisson appropriate for non-negative integers, log link
Grades (validation)   4 categories             LDA tests robustness under different (multivariate normal) assumptions

The class imbalance failure in binary logistic was a crucial lesson: high accuracy (94%) can be meaningless if achieved by predicting only the majority class. Future applications require preprocessing (SMOTE, class weights) before deployment.

Objective 3: Model Selection

Comparing five approaches taught me that convergence between methods with different assumptions (multinomial 62.9% vs. LDA 62.6%) builds confidence in findings, that regularization confirms rather than changes conclusions when multicollinearity is mild, and that cross-validation alone cannot catch every failure mode, as the binary logistic overfit showed.

Objective 4: Communicating to General Audiences

This portfolio itself demonstrates general audience communication: a non-technical framing of the problem, visual storytelling through figures, and policy implications stated in plain language alongside the statistical evidence.

Objective 5: Programming Implementation

The tidymodels framework enforced best practices: recipes for reproducible preprocessing, workflows that bundle model and preprocessing together, version control via GitHub, and documented, sequentially numbered scripts.

Biggest Challenge

The binary logistic overfitting due to class imbalance was frustrating but educational. Initially, 94% accuracy looked great, until the test-set collapse revealed the problem. This taught me to examine per-class metrics and ROC-AUC rather than overall accuracy, and to address imbalance (down-sampling, class weights, or threshold optimization) before trusting a screening model.

Most Surprising Finding

I expected engagement to help everyone equally. The compensatory effect—that it helps disadvantaged students MORE—was surprising and encouraging. It suggests truly equitable interventions are possible, not just those that raise all boats equally while maintaining gaps.

Future Directions

To strengthen causal claims, future work would need to move beyond this cross-sectional, observational design: longitudinal data could establish temporal ordering, and randomized engagement interventions could rule out confounding by unobserved family characteristics.


Code & Reproducibility

Repository Structure

EDUCATIONAL-EQUITY-THROUGH-FAMILY-ENGAGEMENT/
├── code/
│   ├── 01_data_preparation.R       # Data cleaning, variable construction
│   ├── 02_exploratory_analysis.R   # EDA, ggpairs plots
│   ├── 03_statistical_modeling.R   # Train all 5 models
│   ├── 04_model_evaluation.R       # CV, test metrics, comparisons
│   └── 05_visualization.R          # Publication-quality figures
├── figures/                         # All visualizations (PNG, 300 DPI)
├── data/                            # Raw and processed data
├── output/                          # Model objects, results tables
└── README.md                        # Technical documentation

Key Technologies

R 4.3+ tidyverse tidymodels ggplot2 nnet (multinomial) glmnet (ridge) MASS (LDA)

How to Reproduce

  1. Clone repository: git clone https://github.com/mutuac-bit/EDUCATIONAL-EQUITY-THROUGH-FAMILY-ENGAGEMENT.git
  2. Install dependencies: renv::restore()
  3. Run scripts sequentially: source("code/01_data_preparation.R")
  4. All results will be saved to output/, figures to figures/

View complete code on GitHub: github.com/mutuac-bit/EDUCATIONAL-EQUITY-THROUGH-FAMILY-ENGAGEMENT