The Generalized Synthetic Control Method: A Powerful ML Algorithm to Produce Counterfactual Estimates for Experimental Data
This post covers the Generalized Synthetic Control (GSC) method, a machine-learning algorithm that is lesser known within Data Science circles. This quasi-experimental approach uses a robust latent-factor model to synthetically produce counterfactuals for treated units. As a result, we can obtain coefficient estimates and standard errors to evaluate the treatment effects of policy interventions. A policy-relevant example using this method is provided below, with replication code available from my GitHub here.
Why Generalized Synthetic Control?
The fundamental challenge of causal inference is that we never observe the counterfactual, and designs such as difference-in-differences lean on the strong parallel trends assumption to get around this. To quote Angrist and Pischke (2009), two-way fixed effects models produce unbiased, causal estimates only in “parallel worlds.” This is a strong assumption, as the unobserved counterfactual can rarely be produced using real data. For example, as a graduate student studying Data Science for Public Policy, it is impossible to know my post-graduation salary had I not attended graduate school. Luckily, we can rely on a combination of econometrics and machine learning to produce these counterfactual estimates while eliminating many of the confounders that may bias the typical difference-in-differences or fixed-effects regression models. This is where the Generalized Synthetic Control model comes in.
What is Generalized Synthetic Control?
The Generalized Synthetic Control model “imputes counterfactuals for each treated unit using control group information based on a linear interactive fixed effects model that incorporates unit-specific intercepts interacted with time-varying coefficients” (Xu, 2017). Essentially, this supervised learning technique combines the interactive fixed-effects model and the synthetic control model, employing a latent-factor approach to provide valid, simulation-based uncertainty estimates.
The model first estimates an interactive fixed-effects model using only the control group data, selecting the number of latent factors, k, by cross-validation: the algorithm picks the k that minimizes the Mean Squared Prediction Error (MSPE). Next, it estimates factor loadings for each treated unit using that unit's pre-treatment data. Lastly, it imputes the treated counterfactuals from the estimated factors and factor loadings. This is the step in which the β coefficients are generated, while standard errors are calculated by bootstrapping. These steps become much clearer once the visualizations are produced.
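The three steps above can be sketched in a few lines of NumPy on simulated data. This is only a toy illustration of the idea: the factors are estimated here by a plain SVD rather than the full interactive fixed-effects estimator, and all names, sizes, and data are made up. For real analyses, Xu's gsynth package implements the method properly.

```python
import numpy as np

rng = np.random.default_rng(0)
T, n_ctrl, n_treat, k = 12, 40, 5, 2           # periods, controls, treated units, factors
T0 = 8                                         # pre-treatment periods for treated units

# Fake outcome panels generated from a latent factor model
F = rng.normal(size=(T, k))                    # time-varying latent factors
Y_ctrl = rng.normal(size=(n_ctrl, k)) @ F.T + rng.normal(scale=0.1, size=(n_ctrl, T))
Y_treat = rng.normal(size=(n_treat, k)) @ F.T + rng.normal(scale=0.1, size=(n_treat, T))
Y_treat[:, T0:] += 2.0                         # a constant treatment effect of 2 after T0

# Step 1: estimate the factors from the control group (here via a simple SVD)
_, _, Vt = np.linalg.svd(Y_ctrl, full_matrices=False)
F_hat = Vt[:k].T                               # (T, k) estimated factors

# Step 2: estimate treated units' loadings using pre-treatment data only
lam, *_ = np.linalg.lstsq(F_hat[:T0], Y_treat[:, :T0].T, rcond=None)

# Step 3: impute counterfactuals and take treated-minus-counterfactual gaps
Y_cf = (F_hat @ lam).T                         # (n_treat, T) imputed counterfactuals
att = (Y_treat - Y_cf).mean(axis=0)            # average gap per period
print(att.round(2))                            # pre-period gaps near 0, post-period near 2
```

The pre-treatment gaps are just fitting residuals and sit near zero, while the post-treatment gaps recover the simulated effect; gsynth additionally bootstraps this last step to get standard errors.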
Case Study: The Test-Optional Movement
Since 2003, institutions have increasingly dropped the standardized-testing requirement for admission. In the 2003–04 admission cycle, over 75% of degree-granting, four-year institutions required either the SAT or ACT upon application. By 2018–19, that share had fallen to just 48.3%, with institutions increasingly shifting towards a policy of neither recommending nor requiring scores.
Though most institutions that adopt this test-optional framework are under private control, the public sector has also seen an uptick in dropped test requirements amid the coronavirus pandemic. With the closing of many test centers, many institutions are waiving testing requirements for the 2021 admissions cycle. Given the uncertainty of the pandemic, however, these policies may last longer than just one year. Using the Generalized Synthetic Control method, I can examine the post-implementation effects of this policy on low-income and minority enrollment.
The theoretical framework holds that less-privileged students face greater difficulty both preparing for standardized tests and taking the tests themselves, while students from more affluent backgrounds have greater access to test preparation and test taking. By removing the burden of standardized testing, the playing field is leveled on this aspect of the application, and postsecondary access may become less challenging for underrepresented students. We can test this framework using the Generalized Synthetic Control method. Be sure to follow along using my replication code, available here.
Dataset
We preprocess the data just as we would for a two-way fixed-effects regression with unique Year-ID pairs. In other words, we want a panel data format such as the following (note this is fake data):
| Year | UnitID | Test-Optional? | Pct. Pell Enrollment | Pct. Minority Enrollment | … |
|------|--------|----------------|----------------------|--------------------------|---|
| 2003 | 367546 | 0 | 0.13 | 0.39 | … |
| 2003 | 485656 | 1 | 0.34 | 0.45 | … |
| 2003 | 324635 | 0 | 0.09 | 0.23 | … |
| … | … | … | … | … | … |
| 2018 | 367546 | 1 | 0.24 | 0.37 | … |
| 2018 | 485656 | 1 | 0.43 | 0.67 | … |
| 2018 | 324635 | 0 | 0.13 | 0.25 | … |
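For reference, a panel in this long format can be assembled with pandas. The column names below mirror the table above and the values are the same fake data, one row per (year, unit) pair:

```python
import pandas as pd

# Fake panel matching the table above: one row per (year, unitid) pair
panel = pd.DataFrame({
    "year":          [2003, 2003, 2003, 2018, 2018, 2018],
    "unitid":        [367546, 485656, 324635, 367546, 485656, 324635],
    "test_optional": [0, 1, 0, 1, 1, 0],
    "pct_pell":      [0.13, 0.34, 0.09, 0.24, 0.43, 0.13],
    "pct_minority":  [0.39, 0.45, 0.23, 0.37, 0.67, 0.25],
})

# Sanity check: the year-unit pairs must be unique, exactly as a
# two-way fixed-effects model expects
assert not panel.duplicated(["year", "unitid"]).any()
print(panel)
```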
Once we have the data in this format, we’re ready to run the model. I’ll be running this algorithm on a sample of over 1,000 four-year colleges and universities, with data from the Integrated Postsecondary Education Data System (IPEDS). Since the algorithm automatically selects the number of latent factors that minimizes MSPE, we do not need to worry as much about control variables; much of the unobserved heterogeneity is captured by the latent factors. However, I do include controls for institutional quality (acceptance rate, yield rate) and other measures that can influence postsecondary choice (tuition, average grant, enrollment).
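The factor-selection step can be sketched as a leave-one-period-out cross-validation: for each candidate k, refit the treated units' loadings with one pre-treatment period held out, predict that period, and keep the k with the lowest MSPE. The sketch below is a rough illustration on simulated data (names and sizes are assumptions, not the gsynth implementation):

```python
import numpy as np

rng = np.random.default_rng(1)
T0, n_ctrl, n_treat, k_true = 10, 60, 8, 3     # pre-periods, controls, treated, true factors

# Simulated pre-treatment panels generated from a 3-factor model
F = rng.normal(size=(T0, k_true))
Y_ctrl = rng.normal(size=(n_ctrl, k_true)) @ F.T + rng.normal(scale=0.1, size=(n_ctrl, T0))
Y_treat = rng.normal(size=(n_treat, k_true)) @ F.T + rng.normal(scale=0.1, size=(n_treat, T0))

# Candidate factors estimated once from the control group, ordered by variance
_, _, Vt = np.linalg.svd(Y_ctrl, full_matrices=False)
F_hat = Vt.T                                   # (T0, T0); use the first k columns

def cv_mspe(k):
    """Leave-one-period-out MSPE for a candidate number of factors k."""
    errs = []
    for t in range(T0):
        keep = [s for s in range(T0) if s != t]
        lam, *_ = np.linalg.lstsq(F_hat[keep, :k], Y_treat[:, keep].T, rcond=None)
        errs.append(np.mean((Y_treat[:, t] - F_hat[t, :k] @ lam) ** 2))
    return np.mean(errs)

mspe = {k: cv_mspe(k) for k in range(1, 6)}
k_star = min(mspe, key=mspe.get)
print(k_star)  # with this simulated data, the minimizer should sit at or near k_true = 3
```

The MSPE drops sharply until all the signal factors are included and then flattens, which is exactly the pattern the algorithm exploits when choosing k.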
Running the Algorithm
With the GSC model, I can relax the parallel trends assumption of causal inference while still obtaining credible uncertainty estimates. Using GSC, there needs to be a balance between the number of years an institution is in the pre-treatment period and the number of years following implementation. The more pre-treatment periods we require of each treated unit, the less biased the results will be; however, as this minimum increases, the number of observations decreases. With that trade-off in mind, I find the best balance in using 6 pre-treatment periods in the Pell model and 11 pre-treatment periods in the URM model. Further, this allows both models to be examined over four years of post-implementation effects.
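The minimum-pre-treatment-periods requirement described above amounts to a simple panel filter. Here is a hypothetical pandas implementation (the column names follow the earlier panel sketch and are assumptions, not part of any package's API):

```python
import pandas as pd

def filter_min_pretreatment(panel: pd.DataFrame, min_pre: int) -> pd.DataFrame:
    """Drop treated units observed for fewer than `min_pre` pre-treatment years."""
    def enough_pre(g):
        treated_years = g.loc[g["test_optional"] == 1, "year"]
        if treated_years.empty:          # never-treated controls always stay
            return True
        first_treat = treated_years.min()
        return (g["year"] < first_treat).sum() >= min_pre
    return panel.groupby("unitid").filter(enough_pre)

# Toy data: unit 1 has 3 pre-treatment years, unit 2 has only 1
panel = pd.DataFrame({
    "unitid":        [1] * 4 + [2] * 4,
    "year":          [2003, 2004, 2005, 2006] * 2,
    "test_optional": [0, 0, 0, 1, 0, 1, 1, 1],
})
kept = filter_min_pretreatment(panel, min_pre=2)
print(sorted(kept["unitid"].unique()))   # unit 2 is dropped
```

Raising `min_pre` reduces bias from poorly fit pre-treatment trajectories, but, as noted above, each increment drops treated units from the sample.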
Results
After running the algorithm, we get the following regression output:
|  | (1) Pct. full-time first-time students: Pell | (2) Pct. degree-seeking first-time students: minority |
|---|---|---|
| Test-optional | 2.166*** (0.488) | 2.287*** (0.728) |
| Controls | Yes | Yes |
| Institution/year fixed effects | Yes | Yes |
| Unobserved factors | 0 | 2 |
| Observations | 1,118 | 1,055 |
| Treated | 134 | 119 |
| Control | 984 | 936 |

Note: Standard errors are based on 2,500 parametric bootstraps, blocked at the institution level. Panel data from Title IV four-year degree-granting institutions. Years included are 2007–2017 in Column (1) and 2003–2017 in Column (2). *p<0.05, **p<0.01, ***p<0.001.
Result Summary
In short, I find that dropping the test requirement is associated with roughly a 2-percentage-point increase in both first-time Pell enrollment and underrepresented minority enrollment. These results are significant at the 0.1% level (p < 0.001), with standard errors based on 2,500 parametric bootstraps blocked at the institution level.
A visualization of this algorithm is displayed below. The area between the treated (solid) and counterfactual (dashed) lines represents the estimated treatment effect of the independent variable of interest (the policy change). Note that we have less data on the Pell treated and counterfactual averages because this variable did not become available in the IPEDS universe until 2008.
The visualization above helps tremendously in explaining how the Generalized Synthetic Control algorithm works. It uses factor loadings to match “treated” institutions with “control” institutions prior to treatment, so we can see where the treated group would have ended up had it not implemented the policy. By utilizing these factor loadings, I believe the GSC algorithm can produce a more credible measure of treatment effects than diff-in-diff or two-way fixed effects in settings where parallel trends is in doubt.
Further, we can investigate the average treatment effects on the institutions with the testoptional policy by period:
These visualizations display the β coefficients on the y-axis, the time relative to treatment on the x-axis, and the 95% confidence interval in the shaded region. While the URM model is a bit noisier than the Pell model, we still find significant treatment effects in years 3 and 4 after policy implementation.
The visualization from the Pell model also suggests that placebo effects may be present: institutions that dropped the test requirement were already seeing a significant increase in Pct. Pell enrollment prior to treatment (especially at time t − 2, where the null hypothesis that β = 0 falls outside the 95% confidence interval). This makes the Pell results potentially spurious, and it is more challenging to attribute the increase in Pell enrollment to the test-optional policy. No such placebo effects are apparent in the Pct. Minority model.
Bootstrapped Standard Errors Visualization
The shaded region, representing the uncertainty estimates (i.e., the 95% confidence interval), changes depending on the number of bootstraps. Bootstrapping is a resampling method that approximates the sampling distribution of an estimate, letting us quantify sampling error. As we increase the number of bootstraps, the uncertainty estimates begin to stabilize. This is illustrated in the GIFs below.
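The stabilization is easy to demonstrate numerically: recompute a percentile confidence interval at increasing numbers of resamples and watch the endpoints settle. The sketch below uses fake treated-minus-counterfactual gaps and a plain nonparametric bootstrap, not the blocked parametric bootstrap from the paper, so it illustrates only the general behavior:

```python
import numpy as np

rng = np.random.default_rng(42)
# Fake per-observation gaps with a true mean effect of 2
gaps = rng.normal(loc=2.0, scale=1.0, size=300)

def boot_ci(data, n_boot):
    """95% percentile interval of the mean from n_boot resamples."""
    means = [rng.choice(data, size=data.size, replace=True).mean()
             for _ in range(n_boot)]
    return np.percentile(means, [2.5, 97.5])

for B in (5, 50, 500, 2500):
    lo, hi = boot_ci(gaps, B)
    print(f"B={B:>4}: [{lo:.3f}, {hi:.3f}]")
```

At B = 5 the endpoints jump around from run to run; by B = 2500 they have essentially converged, which is why the final GIF below barely moves.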
In this first GIF, we see the uncertainty estimates as we increase the number of bootstraps from 5 to 500:
Here, we increase the number of bootstraps from 500 to 1000:
And lastly, just to drive the point home, here’s what our uncertainty estimates look like when we increase the bootstraps from 2,000 to 2,500 (the number I used in my main regression model):
These GIFs illustrate the bootstrap resampling method, making clear how the uncertainty estimates are constructed under the hood.
Conclusions
The Generalized Synthetic Control method is a wonderful supervised-learning approach for producing counterfactual estimates, eliminating much of the confounding found in OLS and fixed-effects regression techniques. Further, the algorithm allows you to produce beautiful visualizations that aid tremendously in explaining how it works to a nontechnical audience.