David McKenzie

This is the third and final post in my series on doing power calculations for regression discontinuity (see part 1 and part 2).
Scenario 3 (SCORE DATA AVAILABLE, AT LEAST PRELIMINARY OUTCOME DATA AVAILABLE; OR SIMULATED DATA USED): The context of data being available seems less usual to me in the planning stages of an impact evaluation, but could be possible in some settings (e.g. you have the score data and administrative data on a few outcomes, and then are deciding whether to collect survey data on other outcomes). But more generally, you will be in this stage once you have collected all your data. Moreover, the methods discussed here can be used with simulated data in cases where you don’t have data.
A new Stata package, rdpower, written by Matias Cattaneo and co-authors, can be really helpful in this scenario (thanks also to him for answering several questions I had on its use). It calculates power and sample sizes, assuming you will then use the rdrobust command to analyze the data. There are two related commands here:
- rdpower: this calculates the power, given your data and sample size for a range of different effect sizes
- rdsampsi: this calculates the sample size you need to get a given power, given your data and that you will be analyzing it with rdrobust.
Another use, once you have data, is to understand the power consequences of choosing different bandwidths for the RD estimation. For example, using the Senate elections data provided as demonstration data with the package:
- rdpower demvoteshfor2 demmv, tau(5) shows that one has 81.8% power to detect a 5 percentage point jump in vote share at the cutoff (tau specifies the size of the jump), using the optimal bandwidth (which is 17.7 here). I can then see what power would be with a smaller bandwidth, say 10: rdpower demvoteshfor2 demmv, tau(5) h(10) tells me I would have only 45.2% power with the smaller bandwidth.
Note: Matias adds that you may want to use the option scaleregul(0) with this command. It is not the default, but it avoids the regularization choosing quite small bandwidths.
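The commands discussed above can be collected into a short do-file. This is only a sketch: it assumes rdpower and rdrobust are already installed and that the Senate demonstration data is already loaded in memory. The rdsampsi call with tau(5) is my own illustration rather than something run in this post, so treat it as an assumption about how you might use it.

```stata
* Sketch only. Assumes rdpower and rdrobust are installed and that the
* Senate demonstration data shipped with the package is already in memory.

* Power to detect a 5 percentage point jump in vote share at the cutoff,
* with the bandwidth chosen by the default data-driven selector
* (the post reports 81.8% power, bandwidth 17.7)
rdpower demvoteshfor2 demmv, tau(5)

* Same effect size, but with the bandwidth fixed at 10
* (the post reports 45.2% power)
rdpower demvoteshfor2 demmv, tau(5) h(10)

* Same effect size, with regularization switched off in bandwidth
* selection, as Matias suggests
rdpower demvoteshfor2 demmv, tau(5) scaleregul(0)

* Illustrative assumption: sample size needed to detect the same jump
* at the command's default target power
rdsampsi demvoteshfor2 demmv, tau(5)
```

Comparing the first two calls is the quickest way to see how much power you give up by narrowing the bandwidth before you commit to a specification.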