    . drop if sat_school >= .;
    (398 observations deleted)

    . reg sat_school hhsize, r;

    Regression with robust standard errors                 Number of obs =     692
                                                           F(  1,   690) =    3.92
                                                           Prob > F      =  0.0482
                                                           R-squared     =  0.0081
                                                           Root MSE      =  .76476

    ------------------------------------------------------------------------------
                 |               Robust
      sat_school |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          hhsize |  -.0140157    .007082    -1.98   0.048    -.0279205   -.0001109
           _cons |   3.476232   .0635027    54.74   0.000      3.35155    3.600914
    ------------------------------------------------------------------------------

We see a relationship that is statistically significant at the 5% level (p = 0.048): larger households are less satisfied with the schooling received by their children. We might be worried that larger families are found in poorer, more rural areas where the overall quality of education is lower. To control for this we can add fixed effects for the census enumeration area or EA (this is the level at which our data are clustered: we have 5 households in each census enumeration area). This controls for the socio-economic status of the community and (in most cases) the school the children attend. Thus we want the model:

*y*_{it} = a + b *x*_{it} + c_{t}

where *y*_{it} is the school satisfaction of household i in EA t, *x*_{it} is its household size, and c_{t} is the fixed effect for EA t.

    . qui tab ea_code, gen(eac_);
    . reg sat_school hhsize eac_*, r;

    Regression with robust standard errors                 Number of obs =     692
                                                           F(187,   484) =       .
                                                           Prob > F      =       .
                                                           R-squared     =  0.4850
                                                           Root MSE      =  .65793

    ------------------------------------------------------------------------------
                 |               Robust
      sat_school |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          hhsize |  -.0194672   .0066937    -2.91   0.004    -.0326195   -.0063149
           eac_1 |   .0389344   .0212165     1.84   0.067    -.0027535    .0806222
           eac_2 |   .0778688   .3477253     0.22   0.823    -.6053689    .7611064
    ...
         eac_205 |   1.111936    .051018    21.79   0.000     1.011692    1.212181
         eac_206 |   .7190059   .2901567     2.48   0.014     .1488836    1.289128
         eac_207 |   1.077869   .0267748    40.26   0.000      1.02526    1.130478
           _cons |   3.038934   .0133874   227.00   0.000      3.01263    3.065239
    ------------------------------------------------------------------------------

If we want to test whether the fixed effects are jointly significant, we would use

    . testparm eac_*;

     ( 1)  eac_1 = 0
     ( 2)  eac_2 = 0
     ...

           F(187,   484) =  523.69
                Prob > F =    0.0000

This method works perfectly well, but it is unwieldy and involves three separate commands.

    . xi: reg sat_school hhsize i.ea_code, r;
    i.ea_code         _Iea_code_11020308-42080602(naturally coded; _Iea_code_11020308 omitted)

    Regression with robust standard errors                 Number of obs =     692
                                                           F(186,   484) =       .
                                                           Prob > F      =       .
                                                           R-squared     =  0.4850
                                                           Root MSE      =  .65793

    ------------------------------------------------------------------------------
                 |               Robust
      sat_school |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          hhsize |  -.0194672   .0066937    -2.91   0.004    -.0326195   -.0063149
    _Ie~11040201 |   .0389344   .3466722     0.11   0.911     -.642234    .7201028
    _Ie~11040503 |  -.0389344   .0212165    -1.84   0.067    -.0806222    .0027535
    ...
    _Ie~42080508 |   .6800716   .2864624     2.37   0.018     .1172081    1.242935
    _Ie~42080602 |   1.038934   .0212165    48.97   0.000     .9972465    1.080622
           _cons |   3.077869   .0314294    97.93   0.000     3.016114    3.139624
    ------------------------------------------------------------------------------

This is the most efficient method when you have a small number of categories and care about the estimated value of the fixed effect for each category.
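In Stata 11 and later, factor-variable notation makes both the `tab, gen()` step and the `xi:` prefix unnecessary. The following is a minimal sketch, assuming the same variable names as in the logs above and a newer Stata release than the one that produced them. It is written without the `;` delimiter used in the logs.

```stata
* Sketch only; assumes Stata 11 or later (not part of the original log).
* i.ea_code creates the EA indicators on the fly, omitting one base category.
regress sat_school hhsize i.ea_code, vce(robust)

* Joint test that all EA fixed effects are zero.
testparm i.ea_code
```

The coefficient on hhsize should match the `xi:` version, since the two approaches differ only in how the indicator variables are generated.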

    . areg sat_school hhsize, a(ea_code) r;

    Regression with robust standard errors                 Number of obs =     692
                                                           F(  1,   484) =    8.46
                                                           Prob > F      =  0.0038
                                                           R-squared     =  0.4850
                                                           Adj R-squared =  0.2648
                                                           Root MSE      =  .65793

    ------------------------------------------------------------------------------
                 |               Robust
      sat_school |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          hhsize |  -.0194672   .0066937    -2.91   0.004    -.0326195   -.0063149
           _cons |   3.522633   .0626253    56.25   0.000     3.399582    3.645684
    -------------+----------------------------------------------------------------
         ea_code |   absorbed                                     (207 categories)

    . xtreg sat_school hhsize, fe i(ea_code);

    Fixed-effects (within) regression               Number of obs      =       692
    Group variable (i): ea_code                     Number of groups   =       207

    R-sq:  within  = 0.0188                         Obs per group: min =         1
           between = 0.0003                                        avg =       3.3
           overall = 0.0081                                        max =         5

                                                    F(1,484)           =      9.29
    corr(u_i, Xb)  = -0.0505                        Prob > F           =    0.0024

    ------------------------------------------------------------------------------
      sat_school |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          hhsize |  -.0194672   .0063855    -3.05   0.002     -.032014   -.0069204
           _cons |   3.522633   .0598293    58.88   0.000     3.405075     3.64019
    -------------+----------------------------------------------------------------
         sigma_u |  .56106439
         sigma_e |  .65793019
             rho |  .42103494   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    F test that all u_i=0:     F(206, 484) =     2.18            Prob > F = 0.0000

Note that for the five households in a given enumeration area t, the model consists of five equations that share the same fixed effect c_{t}:

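(1) *y*_{1t} = a + b *x*_{1t} + c_{t}

(2) *y*_{2t} = a + b *x*_{2t} + c_{t}

(3) *y*_{3t} = a + b *x*_{3t} + c_{t}

(4) *y*_{4t} = a + b *x*_{4t} + c_{t}

(5) *y*_{5t} = a + b *x*_{5t} + c_{t}

Taking the linear combination (1) - 1/5 [(1) + (2) + (3) + (4) + (5)], the constant a and the fixed effect c_{t} cancel, and we see that

*y*_{1t} - *ȳ*_{t} = b (*x*_{1t} - *x̄*_{t})

where *ȳ*_{t} and *x̄*_{t} are the means of *y* and *x* over the five households in EA t. Applying the same combination to equations (2) through (5) gives the corresponding demeaned equation for each of the other households. So b can be estimated by regressing demeaned satisfaction on demeaned household size.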
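The derivation above assumes exactly five households per enumeration area, while the xtreg output reports between 1 and 5 observations per group. As a hedged generalization, the same cancellation can be written for any group size. The group size n_t and the error term e_{it} are notation introduced here for illustration and do not appear in the original equations.

```latex
\begin{align*}
y_{it} &= a + b\,x_{it} + c_t + e_{it}, \qquad i = 1,\dots,n_t \\
\bar{y}_t &= \frac{1}{n_t}\sum_{i=1}^{n_t} y_{it}
           = a + b\,\bar{x}_t + c_t + \bar{e}_t \\
y_{it} - \bar{y}_t &= b\,(x_{it} - \bar{x}_t) + (e_{it} - \bar{e}_t)
\end{align*}
```

An area with a single household (n_t = 1) has demeaned values of exactly zero, so it contributes nothing to the estimate of b.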

    . bys ea_code: egen h_m = mean(hhsize);
    . bys ea_code: egen s_m = mean(sat_school);
    . gen h_dm = hhsize - h_m;
    (10 missing values generated)
    . gen s_dm = sat_school - s_m;
    . reg s_dm h_dm, r;

    Regression with robust standard errors                 Number of obs =     692
                                                           F(  1,   690) =   11.93
                                                           Prob > F      =  0.0006
                                                           R-squared     =  0.0188
                                                           Root MSE      =  .55111

    ------------------------------------------------------------------------------
                 |               Robust
            s_dm |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            h_dm |  -.0194672    .005635    -3.45   0.001     -.030531   -.0084034
           _cons |   -.000289   .0209499    -0.01   0.989    -.0414222    .0408442
    ------------------------------------------------------------------------------

Note that all these models give exactly the same value for b, the coefficient on household size (-.0194672).
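The point estimate is identical across these models, but the reported standard errors are not. One reason is that `reg` on the hand-demeaned variables does not know that a mean was estimated for every enumeration area, so it uses N - 2 residual degrees of freedom (690 here) rather than the 484 used by areg and xtreg. The following is a minimal sketch of the rescaling for conventional (non-robust) standard errors, assuming h_dm and s_dm exist as created above. The scalar name G is introduced here for illustration, and the code is written without the `;` delimiter.

```stata
* Sketch only: rescale the conventional standard error from the
* hand-demeaned regression to the within-estimator degrees of freedom.
quietly regress s_dm h_dm

* Count the enumeration areas in the estimation sample.
quietly tabulate ea_code if e(sample)
scalar G = r(r)

* regress uses N - 2 degrees of freedom; the within estimator uses N - G - 1.
display "adjusted s.e. = " _se[h_dm] * sqrt(e(df_r) / (e(N) - G - 1))
```

If the demeaning was done on the same sample, the adjusted value should be close to the standard error that `xtreg, fe` reports without the robust option.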

contact: djiboliz@gmail.com

last modified: 2 May 2007