This week in methods class, we covered different types of hypothesis tests, depending on the level of measurement for the independent and dependent variables.

Hypothesis Testing

Hypothesis Testing

The agenda in this lab is to demonstrate and execute a basic version of the tests discussed in class, we will be covering four tests, three of which directly come from the matrix above (I’ll include probit/logit in a bonus section but will not cover it.)

  1. Chi-squared test - Tabular Analysis
  2. T-tests - Difference of means test
  3. Correlation vs Covariance - Understanding the correlation coefficient
  4. Bivariate Linear Regression / Ordinary Least Squares (OLS) - Correlation regression

A reminder about hypothesis testing :

\(H_0\): the null hypothesis - any difference observed is just the result of random chance \(H_1\): the research hypothesis - the variables are dependent, there is a relationship between the two variables. Knowing the value of one variable helps to predict the value of the other variable - we want to reject the null hypothesis

1 Chi-squared \(x^2\) test

A chi-square test typically is used to compare two categorical variables. It works by comparing the observed frequencies to the expected frequencies if there was no relationship between the two variables.

You can do it by hand, and Dr. Siegel goes through this in Lecture 2 Part 1, on slide 54 onwards, when he goes through an example of how to conduct a \(x^2\) test by hand. I will show you how to do this in R as well, using the function chisq.test().

I’m going to run a hypothetical test to see if there is a difference in the number of 2008 voters, between the republican party and democratic party. The two variables of interest are interest_voted2008 (did the respondent vote in the 2008 elections?), and prevote_regpty (what is the official party of registration of the respondent?)

1.1 Setting up the data

  • We will be using data from the ANES 2012 survey throughout this lab.
  • The codebook and data will be on sakai
library(sjlabelled)
library(haven) # for opening .sav files
anes2012 <- read_sav("~/Dropbox/Duke/Summer 2021/RBSI Online/Methods Lab/anes_timeseries_2012/anes_timeseries_2012.sav")
  • You always want to start by checking and preparing the data.
    • You can check the codebook to find the levels of a data and how it has been coded.
    • For prevote_regpty the possible responses to the question are:

    • So to keep the analysis as clean as possible, we might just want to restrict our chi-square analysis to those who answered 1. Democratic Party or 2. Republican Party, since we don’t have any logical reason for recoding independents or those who didn’t answer
    • Though note that there is a significant number of respondents who don’t qualify under the restrictions (-1 Inapplicable: 2958; 4 None or Independent: 214; 5 Other: 456)

table(anes2012$prevote_regpty)
## 
##   -9   -8   -1    1    2    4    5 
##   15   24 2958 1443  804  214  456
anes2012$regpty <- ifelse(anes2012$prevote_regpty == 1, 1,
                  ifelse(anes2012$prevote_regpty == 2, 2, NA))
table(anes2012$regpty)
## 
##    1    2 
## 1443  804
summary(anes2012$regpty) # Note the 3667 NAs
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   1.000   1.000   1.000   1.358   2.000   2.000    3667
  • For **interest_voted2008*,the possible responses to the question are:
summary(anes2012$interest_voted2008)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  -9.000   1.000   1.000   1.187   1.000   2.000
table(anes2012$interest_voted2008)
## 
##   -9   -8    1    2 
##    4   18 4585 1307

We only have 22 nonrespondents, so we can do a simple recode for this as well

anes2012$interest_voted2008 <- ifelse(anes2012$interest_voted2008 > 0 , anes2012$interest_voted2008, NA )
table(anes2012$interest_voted2008)
## 
##    1    2 
## 4585 1307

1.2 Chi-squared test + Analysis

Now for the chi-squared test:

chisq.test(anes2012$interest_voted2008, anes2012$regpty)
## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  anes2012$interest_voted2008 and anes2012$regpty
## X-squared = 4.3926, df = 1, p-value = 0.0361
  • Here the p-value is 0.0361, which means we can reject the null hypothesis at the 0.05 level, but not at the 0.01 level.
    • the levels for hypothesis testing tend to be 0.10, 0.05, 0.01, and 0.001.
    • Interpreting this, it means that the number of 2008 voters differs significantly across party registration levels, at the 5 percent significance level.
  • For more on some interesting ways to present a contingency table and the difference between the observed and expected values, see Antoine Soetewey’s blog for some examples.

2 T-test or differences in means test

The t-test or difference of means test is is used to determine if the null hypothesis can be rejected and that we can infer that the differences between the two samples are statistically significant. It is similar to the chi-squared test, but is better employed when the dependent variable is continuous and the independent variable categorical.

Examples:

  1. Do republicans and democrats have different feelings toward the military?
  • independent variable: political affiliation (categorical)
  • dependent variable: feeling thermometer (continuous)
  1. Does college-level literacy vary across states in the U.S.?
  • independent variable: Geographical region (categorical)
  • Dependent variable: percentage of college educated adults

The difference in means test is referenced in Lecture 2, part 1, on slide 109 onwards. In R we perform the test using the function t.test(). Note that what the difference of means t-test, or a two-tailed t-test, concludes is whether there is a difference in means, rather than whether one mean is specifically less than or greater than the other - the latter type of test would be a one-tailed t-test, which R can also conduct, and is one of the options if you check ?t.test.

We will test the first example hypothesis Do republicans and democrats have different feelings toward the military?

Hypothesis: Is the relationship between party ideology pid_x and support for the military ftgr_military statistically significant?

Null Hypothesis:: There is no difference in level of support for the military (proxied by feeling thermometer) between Republicans and Democrats

2.1 Cleaning Data

As always, check the variables and clean up the data.

From the codebook, pid_x should range from 1 to 7. And we know that feeling thermometers should range from 0 to 100.

Hence we need to recode the values of either variable that doesn’t fall within that range. I’ve chosen this time to recode them as new variables ft_military_2 and pid_x_2. In pid_x_2, I’ve chosen to turn it into a binary variable, where 1 captures all respondents who identify as democrat-leaning (pid_x <= 3), and where 2 captures all respondents who identify as republican-leaning (pid_x >= 5).

summary(anes2012$ftgr_military) # Feeling thermometer should range from 0 to  100
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   -9.00   60.00   85.00   74.07  100.00  100.00
summary(anes2012$pid_x) # pid_x should range from 1 to 7.
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  -2.000   1.000   3.000   3.502   5.000   7.000
anes2012$ft_military_2 <- ifelse(anes2012$ftgr_military <0, NA, anes2012$ftgr_military)
anes2012$pid_x <- ifelse(anes2012$pid_x <0, NA, anes2012$pid_x)
anes2012$pid_x_2 <- ifelse(anes2012$pid_x <= 3, 1, 
                    ifelse(anes2012$pid_x >= 5, 2, NA))
table(anes2012$pid_x_2)
## 
##    1    2 
## 3103 1995

2.2 T-test

t.test(anes2012$pid_x_2, anes2012$ft_military_2)
## 
##  Welch Two Sample t-test
## 
## data:  anes2012$pid_x_2 and anes2012$ft_military_2
## t = -278.75, df = 5485.4, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -79.62001 -78.50791
## sample estimates:
## mean of x mean of y 
##   1.39133  80.45529

Interpreting the results:

Here, we see that the p-value is <2.2e-16, which means it is very close to 0. We can reject the null hypothesis at the 0.001 significance level, that there is no difference in respondents’ feelings towards the military, between republican and democrat respondents.

Does this follow through with our expectations of how people actually feel towards the military in real life?

3 Bivariate Regressions

The logic of regression is often thought of in terms of some independent variable \(x\) can cause changes in some dependent variable \(y\). So we can think of this relationship as resembling:

\[Y = \alpha + \beta_1 X\]

Where \(\beta\) is the coefficient that represents how much a unit change in independent variable \(X\) results in a \(\beta\) unit change in the dependent variable \(Y\).

In this section, we will be going over how to perform ordinary least squares (OLS) regessions in R, using the same example of political ideology and its effects on a voter’s feelings towards the military. Here we are interested in how one unit change in political ideology (towards increased ‘democratic-ness’), shifts one’s feelings towards the military.

The Hypothesis: Voters who are recorded as more ‘democratic’ on the political ideology scale, are more likely to have negative feelings towards the military

3.1 Setting up the data

  • We will be using the same data from ANES 2012.

  • The dependent variable is ftgr_military which measures an individual’s feelings towards obama - the ratings are between 0 to 100 degrees, where ratings between 50 and 100 indicate favorable feelings, and 0 to 50 indicates that the participant does not feel favorable towards the military
  • The independent variable is pid_x, a 7 point measure of party ID, where 1 indicates a strong democrat, and 7 indicates a strong republican.

Earlier, we already cleaned up the variables, so we will be using the cleaned variables ft_military_2, and pid_x

summary(anes2012$ft_military_2)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##    0.00   70.00   85.00   80.46  100.00  100.00     434
summary(anes2012$pid_x)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   1.000   1.000   3.000   3.524   5.000   7.000      24

3.2 Running the OLS

To run an ordinary least squares regression, we can use the lm() function to create a linear regression model. If you use ?lm you can see more details about how lm() works but here we use a simple setup:

  • lm(y ~ x, data)
  • where y is the dependent variable, and x is the independent variable
  • after defining the model (and saving it as an object), you use the summary() function to analyze the results of the model
model1 <- lm(ft_military_2 ~ pid_x, data = anes2012)
summary(model1) 
## 
## Call:
## lm(formula = ft_military_2 ~ pid_x, data = anes2012)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -84.977 -11.089   5.207  17.615  22.800 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  75.9044     0.5478 138.562   <2e-16 ***
## pid_x         1.2961     0.1333   9.724   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 20.82 on 5457 degrees of freedom
##   (455 observations deleted due to missingness)
## Multiple R-squared:  0.01703,    Adjusted R-squared:  0.01685 
## F-statistic: 94.56 on 1 and 5457 DF,  p-value: < 2.2e-16

3.2.1 Interpretation

The output for summary() shows the original lm() formula used, the summary statistics for the residuals, the coefficients for the predictor variables used, and performance measures for the model more generally.

  • The coefficient is the \(\beta\) value for \(X\) which we discussed in the beginning, it represents how much a unit change in independent variable \(X\) results in a \(\beta\) unit change in the dependent variable \(Y\).
  • Hence, the summary output for the model predicts an increase of \(1.2961\) in ftmilitary_2 for every unit increase in pid_x - what does that mean in real terms?
    • Remember that every unit increase in pid7 indicates a shift in ideology closer to Strong Republican
  • Statistical Significance: We also discussed statistical significance in class, the model output gives us both the t value, but also automatically conducts a two-tailed test. Hence the \(Pr(>|t|)\) value tells us at what level of significance the null hypothesis can be rejected at, \(<2e-16\) or \(***\) indicates that the threshold of significance is even smaller than 0.001.
  • R-squared: Lastly, the model output also provides some measures of goodness of fit, which will be important when comparing between different models, but does not have much interpretation value when just conducting hypothesis testing.

4 Multivariate Regressions

Next, we move on to multivariate regressions, where we are comparing the effects of multiple independent variables on one dependent variable. Whereas before in the bivariate analysis we only had one \(x\), this time we can have multiple \(x\)s.

\[Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4\]

This time we include two more independent variables, on top of pid_x: 1. Pre family income - inc_incgroup_pre 2. Religious practices - relig_import

summary(anes2012$inc_incgroup_pre)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   -9.00    4.00   12.00   11.86   20.00   28.00
get_labels(anes2012$inc_incgroup_pre) #you will need the package sjlabelled to use the get_labels() function
##  [1] "-9. Refused"                                                                 
##  [2] "-8. Don't know"                                                              
##  [3] "-2. Missing; IWR mistakenly entered '2' in place of DK code for total income"
##  [4] "01. Under $5,000"                                                            
##  [5] "02. $5,000-$9,999"                                                           
##  [6] "03. $10,000-$12,499"                                                         
##  [7] "04. $12,500-$14,999"                                                         
##  [8] "05. $15,000-$17,499"                                                         
##  [9] "06. $17,500-$19,999"                                                         
## [10] "07. $20,000-$22,499"                                                         
## [11] "08. $22,500-$24,999"                                                         
## [12] "09. $25,000-$27,499"                                                         
## [13] "10. $27,500-$29,999"                                                         
## [14] "11. $30,000-$34,999"                                                         
## [15] "12. $35,000-$39,999"                                                         
## [16] "13. $40,000-$44,999"                                                         
## [17] "14. $45,000-$49,999"                                                         
## [18] "15. $50,000-$54,999"                                                         
## [19] "16. $55,000-$59,999"                                                         
## [20] "17. $60,000-$64,999"                                                         
## [21] "18. $65,000-$69,999"                                                         
## [22] "19. $70,000-$74,999"                                                         
## [23] "20. $75,000-$79,999"                                                         
## [24] "21. $80,000-$89,999"                                                         
## [25] "22. $90,000-$99,999"                                                         
## [26] "23. $100,000-$109,999"                                                       
## [27] "24. $110,000-$124,999"                                                       
## [28] "25. $125,000-$149,999"                                                       
## [29] "26. $150,000-$174,999"                                                       
## [30] "27. $175,000-$249,999"                                                       
## [31] "28. $250,000 or more"
summary(anes2012$relig_import)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  -9.000   1.000   1.000   1.249   2.000   2.000
get_labels(anes2012$relig_import)
## [1] "-9. Refused"      "-8. Don't know"   "1. Important"    
## [4] "2. Not important"

4.1 Recoding both variables:

anes2012$faminc <- ifelse(anes2012$inc_incgroup_pre > 0 , anes2012$inc_incgroup_pre, NA)

anes2012$relig_import_2 <- ifelse(anes2012$relig_import > 0, anes2012$relig_import, NA)
fit2 <- lm(ft_military_2 ~ pid_x + faminc + relig_import_2, data = anes2012)
summary(fit2)
## 
## Call:
## lm(formula = ft_military_2 ~ pid_x + faminc + relig_import_2, 
##     data = anes2012)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -86.968 -11.832   4.488  15.724  27.934 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    85.25602    1.09710  77.710   <2e-16 ***
## pid_x           1.25808    0.13943   9.023   <2e-16 ***
## faminc         -0.01035    0.03657  -0.283    0.777    
## relig_import_2 -7.08419    0.63435 -11.168   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 20.57 on 4974 degrees of freedom
##   (936 observations deleted due to missingness)
## Multiple R-squared:  0.04408,    Adjusted R-squared:  0.0435 
## F-statistic: 76.45 on 3 and 4974 DF,  p-value: < 2.2e-16

Here we can make three inferences about the results:

  1. As political ideology increases one unit towards more ‘republican’ levels, the respondents’ feeling towards the military increases by 1.26 units on average, and this effect is statistically significant at the 0.001 level.
  • Also, the inclusion of more variables has not changed the effect of political ideology on feelings towards the military
  1. As respondents’ religious practices shift from important to not important, their feeling towards the military decreases by 7.08 units on average, and this is also significant at the 0.001 level.
  2. Family income has no statistically significant effect on one’s feelings towards the military.

4.2 Putting it together

How can we use these regression predictions to develop a better understanding of the possible effects of political ideology on feelings towards the military? How meaningful are the effects of +1.26 units from political ideology, or -7.08 from (lack of) religious practices?

4.2.1 What does the intercept mean?

Recall the results from model1:

summary(model1)
## 
## Call:
## lm(formula = ft_military_2 ~ pid_x, data = anes2012)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -84.977 -11.089   5.207  17.615  22.800 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  75.9044     0.5478 138.562   <2e-16 ***
## pid_x         1.2961     0.1333   9.724   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 20.82 on 5457 degrees of freedom
##   (455 observations deleted due to missingness)
## Multiple R-squared:  0.01703,    Adjusted R-squared:  0.01685 
## F-statistic: 94.56 on 1 and 5457 DF,  p-value: < 2.2e-16

The intercept (often labeled the constant, or \(\alpha\)) is the expected mean value of Y when all X=0. - Hence, in the model above, the intercept (75.904) indicates the baseline mean level of respondent’s feelings towards the military, if pid_x is held at the lowest level (0).

The estimate, (often labeled the coeffcient, or \(\beta\)) is the expected impact of X on Y for every additional unit of X - Hence, in the model above, the coefficient (1.2961) indicates the upward slope or additional increase in Y, for every unit increase in X

library(ggplot2)
ggplot(anes2012, aes(x = pid_x, y = ft_military_2)) + geom_point(alpha = 0.005) + 
    geom_smooth(method='lm')
## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 455 rows containing non-finite values (stat_smooth).
## Warning: Removed 455 rows containing missing values (geom_point).

4.2.2 Coefficient Comparison

Can we compare between different kinds of coefficients? - Are the results from pid_x and relig_import_2 equally significant or meaningful in fit2? - In the model above, the estimated coefficient for pid_x is 1.25808, which seems sizeably smaller than that of relig_import_2 which is -7.08419. - However, this doesn’t take into account the range and values of pid_x, which ranges from 1 to 7. Compared to relig_import_2, which is only 1 - 2. - So this means that between a respondent who is strongly democrat (pid_x = 1), and a respondent who is strongly republican (pid_x = 7), the difference in predicted level of feelings towards the military is 8.80656, which arguably is greater than the impact of relig_import_2

4.2.3 Model goodness of fit and \(R^2\)

  • You might remember this from slide 57 in Week 2, Pt 2 of the lecture slides.
  • \(R^2\) is the proportion of the variance in the dependent variable that our model explains.
    • It ranges from 0-1.
    • The closer our R2 is to 1, the more of the variation our model explains.
    • The closer our R2 is to 1, the better our model is at predicting the dependent variable.

From the summary() function, when used on lm(), we get some information on the goodness of fit of the model as well. In fit2 above, for example, the \(R^2\) value is 0.0435, which indicates that though the variables we have identified may be significant, they may not be key explanations for the dependent variable of interest.

Talking about the intercept, Rsquare, and confidence intervals

5 Making Tables

Lastly, we are going to go over how to present the results you get from a regression, into the kinds of tables you often see in papers. We use a package called stargazer to present the results.

library(stargazer)
stargazer(model1, fit2, type = "html", title="Effects of Political Ideology on Feelings Towards the Military", digits=1, out="table1.htm", style = "ajps")
## 
## <table style="text-align:center"><caption><strong>Effects of Political Ideology on Feelings Towards the Military</strong></caption>
## <tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"></td><td colspan="2"><strong>ft_military_2</strong></td></tr>
## <tr><td style="text-align:left"></td><td><strong>Model 1</strong></td><td><strong>Model 2</strong></td></tr>
## <tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">pid_x</td><td>1.3<sup>***</sup></td><td>1.3<sup>***</sup></td></tr>
## <tr><td style="text-align:left"></td><td>(0.1)</td><td>(0.1)</td></tr>
## <tr><td style="text-align:left">faminc</td><td></td><td>-0.01</td></tr>
## <tr><td style="text-align:left"></td><td></td><td>(0.04)</td></tr>
## <tr><td style="text-align:left">relig_import_2</td><td></td><td>-7.1<sup>***</sup></td></tr>
## <tr><td style="text-align:left"></td><td></td><td>(0.6)</td></tr>
## <tr><td style="text-align:left">Constant</td><td>75.9<sup>***</sup></td><td>85.3<sup>***</sup></td></tr>
## <tr><td style="text-align:left"></td><td>(0.5)</td><td>(1.1)</td></tr>
## <tr><td style="text-align:left">N</td><td>5459</td><td>4978</td></tr>
## <tr><td style="text-align:left">R-squared</td><td>0.02</td><td>0.04</td></tr>
## <tr><td style="text-align:left">Adj. R-squared</td><td>0.02</td><td>0.04</td></tr>
## <tr><td style="text-align:left">Residual Std. Error</td><td>20.8 (df = 5457)</td><td>20.6 (df = 4974)</td></tr>
## <tr><td style="text-align:left">F Statistic</td><td>94.6<sup>***</sup> (df = 1; 5457)</td><td>76.5<sup>***</sup> (df = 3; 4974)</td></tr>
## <tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td colspan="3" style="text-align:left"><sup>***</sup>p < .01; <sup>**</sup>p < .05; <sup>*</sup>p < .1</td></tr>
## </table>

6 Glossary

The functions we’ve gone over today are:

  1. chisq.test()
  2. t.test()
  3. lm() + summary()
  4. stargazer()