Notes from group discussion 4/21/2017

Presented by Bert and Ryan.

Question: what does it mean to have multiple ways to classify a categorical variable? If we only have two categories, do we just use a t-test? What is an example of a categorical variable that can be classified in different ways?

The number of ways to categorize tells you what order of ANOVA to use (one way of classifying gives a one-way ANOVA, two ways a two-way ANOVA)...but what information does a higher-order ANOVA give you?
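One way to see what a higher-order ANOVA adds is to split the total variation by hand. The sketch below is not from the discussion: it assumes a made-up, balanced design with two ways of classifying students (teaching format and class year, both hypothetical) and separates the total sum of squares into a piece for each classification, a piece for their interaction, and leftover error.

```python
from statistics import mean

# Hypothetical, balanced data: (format, year) -> three scores per cell
data = {
    ("lecture", "first-year"): [70, 74, 68],
    ("lecture", "second-year"): [75, 79, 77],
    ("active", "first-year"): [80, 83, 78],
    ("active", "second-year"): [92, 95, 90],
}
a_levels = sorted({a for a, _ in data})
b_levels = sorted({b for _, b in data})
n = 3  # scores per cell (the formulas below assume every cell has n scores)

all_scores = [x for cell in data.values() for x in cell]
grand = mean(all_scores)
cell_mean = {key: mean(cell) for key, cell in data.items()}
a_mean = {a: mean([x for (ka, _), cell in data.items() if ka == a for x in cell])
          for a in a_levels}
b_mean = {b: mean([x for (_, kb), cell in data.items() if kb == b for x in cell])
          for b in b_levels}

# Variation explained by each classification on its own (main effects)
ss_a = n * len(b_levels) * sum((a_mean[a] - grand) ** 2 for a in a_levels)
ss_b = n * len(a_levels) * sum((b_mean[b] - grand) ** 2 for b in b_levels)
# Variation explained by the combination beyond the two main effects
ss_ab = n * sum((cell_mean[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
                for a in a_levels for b in b_levels)
# Variation left over inside each cell
ss_error = sum((x - cell_mean[key]) ** 2 for key, cell in data.items() for x in cell)
ss_total = sum((x - grand) ** 2 for x in all_scores)

print(f"format: {ss_a:.2f}  year: {ss_b:.2f}  interaction: {ss_ab:.2f}  "
      f"error: {ss_error:.2f}  total: {ss_total:.2f}")
```

The extra information is the interaction term: whether the effect of one classification depends on the level of the other. A one-way ANOVA on either classification alone cannot show that.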

Paula: the best discussion of ANOVA is in the Seltmann (?) book that's online.

This particular test seems to be good for answering the general question "is there something significant going on here?"

Bert: If you do ANOVA with only two categories (i.e. A and B), that's still an F-test, and it gives the same answer as a pooled-variance two-sample t-test (F = t^2).
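The two-category case can be checked numerically. This is a minimal sketch on made-up scores, written in plain Python so each step is visible. The helper names `pooled_t` and `one_way_f` are ours; in practice a library such as SciPy (`scipy.stats.ttest_ind`, `scipy.stats.f_oneway`) would normally be used and would also report p-values.

```python
from statistics import mean

def pooled_t(a, b):
    """Two-sample t statistic assuming equal (pooled) variance."""
    na, nb = len(a), len(b)
    ma, mb = mean(a), mean(b)
    ss_a = sum((x - ma) ** 2 for x in a)
    ss_b = sum((x - mb) ** 2 for x in b)
    pooled_var = (ss_a + ss_b) / (na + nb - 2)
    return (ma - mb) / (pooled_var * (1 / na + 1 / nb)) ** 0.5

def one_way_f(groups):
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    all_values = [x for g in groups for x in g]
    grand = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Made-up scores for two categories, A and B (unequal sizes on purpose)
group_a = [72, 85, 78, 90, 66, 81]
group_b = [88, 92, 79, 95, 84]

t_stat = pooled_t(group_a, group_b)
f_stat = one_way_f([group_a, group_b])
print(f"t = {t_stat:.4f}, t^2 = {t_stat ** 2:.4f}, F = {f_stat:.4f}")
```

With two groups, the squared t statistic and the F statistic agree, so the two tests lead to the same conclusion. The equivalence is with the equal-variance t-test, not the unequal-variance (Welch) version.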

Anne: Does the number in your sets matter for this? Does it matter if the number in one category is much different than another?
Answer: Equal group sizes are not required, but the results can become unreliable if one category has very few students in it.

Lisa: how do you know whether your parent distribution has a covariate?
Ryan: ANCOVA is a way of correcting for some covariance, without changing your question. In other words, we are trying to control for some other possible variable, without doing a two-way ANOVA.

You might use ANCOVA as a second step to sort out which variable actually has an effect, after you've run a two-way ANOVA and figured out that something significant is going on.

You can lower the error you would get on ANOVA by taking out the error due to a known covariate with ANCOVA.
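A small numerical sketch of that error reduction, under stated assumptions: the data are invented (a pretest score as the covariate, a posttest score as the outcome), there are only two groups, and the ANCOVA model uses a single slope shared by both groups. None of these specifics come from our analysis.

```python
from statistics import mean

# Hypothetical data: (pretest covariate, posttest outcome) for each student
groups = {
    "A": [(50, 62), (60, 70), (70, 81), (80, 88)],
    "B": [(55, 72), (65, 80), (75, 91), (85, 99)],
}

# Accumulate within-group sums of squares and cross-products
ss_xx = ss_xy = ss_yy = 0.0
for pairs in groups.values():
    x_bar = mean([x for x, _ in pairs])
    y_bar = mean([y for _, y in pairs])
    ss_xx += sum((x - x_bar) ** 2 for x, _ in pairs)
    ss_xy += sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    ss_yy += sum((y - y_bar) ** 2 for _, y in pairs)

# ANOVA ignores the covariate: all within-group variation counts as error
rss_anova = ss_yy
# ANCOVA fits one common within-group slope and removes what it explains
slope = ss_xy / ss_xx
rss_ancova = ss_yy - slope * ss_xy

print(f"ANOVA error SS:  {rss_anova:.3f}")
print(f"ANCOVA error SS: {rss_ancova:.3f} (common slope {slope:.3f})")
```

The ANCOVA error can never be larger than the ANOVA error, because the slope term can only absorb variation. The cost is one error degree of freedom, so the gain is only worthwhile when the covariate is actually related to the outcome.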

We are still unsure what "adjusting for the covariate" actually means...but the other option is a two-way ANOVA, which doesn't tell us which of the covariates really matters.

Discussion: does the adjustment for the covariate in ANCOVA usually make the variation between groups increase, as it did in our analysis? We know we are adjusting the means of the covariate categories, so maybe this messes with our categorical variable groups...
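One common way to read "adjusting" is that each group mean is shifted to where it would sit if that group had the overall average covariate value: adjusted mean = group mean of y minus slope times (group mean of x minus grand mean of x). The sketch below applies this to two invented data sets and assumes a single slope shared by the groups. The function name `adjusted_means` and all numbers are hypothetical, not our analysis.

```python
from statistics import mean

def adjusted_means(groups):
    """Return {group: (raw mean of y, covariate-adjusted mean of y)}."""
    grand_x = mean([x for pairs in groups.values() for x, _ in pairs])
    ss_xx = ss_xy = 0.0
    for pairs in groups.values():
        x_bar = mean([x for x, _ in pairs])
        y_bar = mean([y for _, y in pairs])
        ss_xx += sum((x - x_bar) ** 2 for x, _ in pairs)
        ss_xy += sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    slope = ss_xy / ss_xx  # common within-group slope
    result = {}
    for name, pairs in groups.items():
        x_bar = mean([x for x, _ in pairs])
        y_bar = mean([y for _, y in pairs])
        result[name] = (y_bar, y_bar - slope * (x_bar - grand_x))
    return result

def gaps(groups):
    """Raw and adjusted difference in means, group B minus group A."""
    m = adjusted_means(groups)
    return m["B"][0] - m["A"][0], m["B"][1] - m["A"][1]

group_a = [(50, 62), (60, 70), (70, 81), (80, 88)]
# Case 1: the higher-scoring group B also started with the higher covariate
case_1 = {"A": group_a, "B": [(55, 72), (65, 80), (75, 91), (85, 99)]}
# Case 2: group B scores higher despite starting with the lower covariate
case_2 = {"A": group_a, "B": [(45, 72), (55, 80), (65, 91), (75, 99)]}

raw_gap_1, adj_gap_1 = gaps(case_1)
raw_gap_2, adj_gap_2 = gaps(case_2)
print(f"case 1: raw gap {raw_gap_1:.3f}, adjusted gap {adj_gap_1:.3f}")
print(f"case 2: raw gap {raw_gap_2:.3f}, adjusted gap {adj_gap_2:.3f}")
```

In this sketch the adjustment shrinks the gap between groups in the first case and widens it in the second, so an increase in between-group variation is possible but not guaranteed. The direction depends on how the group covariate means line up with the group outcome means.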

Contingency tables: Ryan's favorite statistical object.

Assumptions: simple random sample--everyone is equally likely to be selected.
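Expected cell values: each cell count is treated as roughly binomial, and the test relies on the normal approximation to the binomial distribution, which breaks down when expected counts are small. So no cell should be "too low" (a common rule of thumb is an expected count of at least 5). -> We can't split too finely.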
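The "can't split too finely" point can be checked before running any test by computing the expected count for every cell (row total times column total, divided by the grand total). The table below is made up, and the threshold of 5 is the usual rule of thumb rather than something we derived. SciPy's `scipy.stats.chi2_contingency` returns the same expected counts alongside the test itself.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]

def smallest_expected(table):
    """Smallest expected count anywhere in the table."""
    return min(min(row) for row in expected_counts(table))

# Hypothetical counts: rows = two course sections, columns = five grade bins
fine = [
    [2, 9, 7, 8, 4],
    [1, 6, 7, 10, 6],
]
# Same students, with the first three bins merged and the last two merged
coarse = [[sum(row[:3]), sum(row[3:])] for row in fine]

min_fine = smallest_expected(fine)
min_coarse = smallest_expected(coarse)
print(f"finest split:  smallest expected count = {min_fine:.2f}")
print(f"coarser split: smallest expected count = {min_coarse:.2f}")
```

Merging sparse categories raises the smallest expected count, at the price of no longer being able to distinguish the merged categories.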
