R Language Notes: 17
Chapter 15: One Factor Models
Linear models with only categorical predictors (or factors) have traditionally been called analysis of variance (ANOVA) problems. The terminology used in ANOVA-type problems is sometimes different. Predictors are now all qualitative and are typically called factors, which have some number of levels. The regression parameters are now often called effects. We shall consider only models where the parameters are considered fixed but unknown, called fixed-effects models. Random-effects models are used where parameters are taken to be random variables and are not covered in this text.
1. Regression model: testing whether the levels differ significantly (is there any difference among the levels?)
library(faraway) # provides sumary(), which is used further down
data(coagulation, package="faraway")
coagulation
plot(coag ~ diet, coagulation,ylab="coagulation time")
stripchart(coag ~ diet, coagulation, vertical=TRUE, method="stack", xlab="diet",ylab="coagulation time")
(1) Fitting the model with the default (treatment) coding
lmod <- lm(coag ~ diet, coagulation)
summary(lmod)
round(coef(lmod),1) # show the coefficients
model.matrix(lmod) # inspect the design matrix to see how the factor is coded
anova(lmod) # F-test of whether the levels differ
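Under the default treatment coding, the intercept is the mean of the reference level and every other coefficient is the difference between a level mean and that reference mean. A minimal sketch of this relationship, assuming a made-up data frame `toy` (the values are invented for illustration and are not the coagulation data):

```r
# Hypothetical toy data: three groups of three observations (invented values)
toy <- data.frame(
  y = c(10, 12, 11, 15, 17, 16, 20, 19, 21),
  g = factor(rep(c("A", "B", "C"), each = 3))
)

# Request treatment coding explicitly so the result does not depend on options()
fit <- lm(y ~ g, data = toy, contrasts = list(g = "contr.treatment"))

group_means <- tapply(toy$y, toy$g, mean)  # mean of y within each level

# Rebuild the coefficients from the level means:
# intercept = mean of reference level A; gB, gC = differences from A
rebuilt <- c(group_means["A"], group_means[c("B", "C")] - group_means["A"])

print(cbind(coef = coef(fit), from_means = rebuilt))
```

The two columns should agree, which is why the coefficients in summary(lmod) are read as differences from the first diet rather than as level means.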
(2) Fitting the model without an intercept
lmodi <- lm(coag ~ diet -1, coagulation)
sumary(lmodi)
lmnull <- lm(coag ~ 1, coagulation)
anova(lmnull,lmodi)
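Without an intercept the coefficients are the level means themselves, but the R-squared reported by summary() is then computed relative to zero rather than to the overall mean, so the test for a difference among levels is obtained by comparing against the null model. A small sketch, again assuming an invented data frame `toy`:

```r
# Hypothetical toy data (invented values, not the coagulation data)
toy <- data.frame(
  y = c(10, 12, 11, 15, 17, 16, 20, 19, 21),
  g = factor(rep(c("A", "B", "C"), each = 3))
)

full   <- lm(y ~ g, data = toy)      # with intercept
no_int <- lm(y ~ g - 1, data = toy)  # one coefficient per level
null   <- lm(y ~ 1, data = toy)      # common mean only

print(coef(no_int))                  # the three level means

# Both parameterisations give the same fitted values, hence the same F-test
f_full <- anova(full)[["F value"]][1]
f_cmp  <- anova(null, no_int)[["F"]][2]
print(c(f_full = f_full, f_cmp = f_cmp))

# R-squared is not comparable between the two fits
print(c(with_intercept = summary(full)$r.squared,
        no_intercept   = summary(no_int)$r.squared))
```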
(3) Using sum coding
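op <- options(contrasts=c("contr.sum","contr.poly")) # switch to sum coding, keeping the old setting
lmods <- lm(coag ~ diet , coagulation)
sumary(lmods)
options(op) # restore the previous contrasts so later fits are unaffected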
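With sum coding the intercept is the mean of the level means and each coefficient is a level's deviation from it; the effect of the last level is not printed but equals minus the sum of the others. A minimal sketch, assuming an invented data frame `toy` and using the `contrasts` argument of lm() instead of the global option:

```r
# Hypothetical toy data (invented values, not the coagulation data)
toy <- data.frame(
  y = c(10, 12, 11, 15, 17, 16, 20, 19, 21),
  g = factor(rep(c("A", "B", "C"), each = 3))
)

# Sum coding requested for this fit only; options() is left untouched
fit_sum <- lm(y ~ g, data = toy, contrasts = list(g = "contr.sum"))

group_means <- tapply(toy$y, toy$g, mean)
grand <- mean(group_means)            # mean of the level means

print(coef(fit_sum))                  # intercept, effect of A, effect of B

# The effect of the last level C is implied by the sum-to-zero constraint
effect_C <- -sum(coef(fit_sum)[-1])
print(c(effect_C = effect_C, check = unname(group_means["C"] - grand)))
```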
2. Diagnostics
It makes no sense to transform the predictor, but it is reasonable to consider transforming the response.
qqnorm(residuals(lmod))
qqline(residuals(lmod))
plot(jitter(fitted(lmod)),residuals(lmod),xlab="Fitted",ylab="Residuals")
abline(h=0)
# Testing for nonconstant variance (heteroscedasticity)
(1) Using Levene's test
med <- with(coagulation,tapply(coag,diet,median))
ar <- with(coagulation,abs(coag -med[diet]))
anova(lm(ar ~ diet,coagulation)) # a large p-value (> 0.05) gives no evidence of nonconstant variance
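The three steps above (level medians, absolute deviations, one-way ANOVA on the deviations) can be wrapped into a small helper. This is only a sketch: the function name `levene_median` and the data frame `toy` are hypothetical, and the median-centred version shown here is the variant often called the Brown-Forsythe test.

```r
# Hypothetical helper: median-based Levene test for a response y and factor g
levene_median <- function(y, g) {
  g <- factor(g)
  med <- tapply(y, g, median)                       # median of each level
  absdev <- as.numeric(abs(y - med[as.integer(g)])) # |y - median of own level|
  d <- data.frame(absdev = absdev, g = g)
  anova(lm(absdev ~ g, data = d))                   # F-test on the deviations
}

# Invented data in which level B is visibly more spread out than A and C
toy <- data.frame(
  y = c(10, 11, 12, 13, 5, 15, 25, 35, 20, 21, 19, 22),
  g = factor(rep(c("A", "B", "C"), each = 4))
)

res <- levene_median(toy$y, toy$g)
print(res)
```

A small p-value in the first row would point to unequal spread across levels; a large one gives no evidence against constant variance.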
(2) Bartlett's test
bartlett.test(coag ~ diet, coagulation)
3. Pairwise Comparisons (is there a difference between level A and level B?)
After detecting some difference in the levels of the factor, interest centers on which levels or combinations of levels are different.
(tci <- TukeyHSD(aov(coag ~ diet, coagulation))) # outer parentheses print the result
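The TukeyHSD() result is a list holding one matrix per factor, with columns diff, lwr, upr and p adj, so the pairs whose interval excludes zero can be picked out directly. A sketch on an invented, balanced data frame `toy`, which also rebuilds the interval half-width from qtukey() as a check:

```r
# Hypothetical toy data (invented values, balanced design with n = 3 per level)
toy <- data.frame(
  y = c(10, 12, 11, 15, 17, 16, 20, 19, 21),
  g = factor(rep(c("A", "B", "C"), each = 3))
)

fit_aov <- aov(y ~ g, data = toy)
tab <- TukeyHSD(fit_aov)$g            # matrix: diff, lwr, upr, p adj

# Pairs whose 95% interval does not contain zero
sig <- tab[tab[, "lwr"] > 0 | tab[, "upr"] < 0, , drop = FALSE]
print(sig)

# Half-width by hand: q / sqrt(2) * sigma_hat * sqrt(1/n_i + 1/n_j)
k      <- nlevels(toy$g)
df_res <- df.residual(fit_aov)
sigma  <- sqrt(deviance(fit_aov) / df_res)
half   <- qtukey(0.95, k, df_res) / sqrt(2) * sigma * sqrt(1/3 + 1/3)
print(half)
```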
4. False Discovery Rate (comparing each level with the mean)
# One approach is to control the familywise error rate (FWER), which is the overall probability of falsely declaring a difference (where none exists).
pvals <- summary(lmod)$coef[,4]
padj <- p.adjust(pvals, method="bonferroni")
coef(lmod)[padj < 0.05]
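The Bonferroni adjustment simply multiplies each p-value by the number of tests and caps the result at 1. A short sketch with made-up p-values (the vector `p` is hypothetical) showing that this matches p.adjust():

```r
# Hypothetical p-values for four tests (invented for illustration)
p <- c(a = 0.001, b = 0.012, c = 0.03, d = 0.2)
m <- length(p)                        # number of tests

by_hand  <- pmin(1, p * m)            # multiply by m, cap at 1
built_in <- p.adjust(p, method = "bonferroni")

print(rbind(by_hand, built_in))
print(names(p)[built_in < 0.05])      # tests still significant after adjustment
```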
# An alternative approach is to control the false discovery rate (FDR), which is the proportion of effects identified as significant that are not real.
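(1) Comparing the sorted p-values with the thresholds i*0.05/m by hand
m <- length(pvals) # number of tests
names(which(sort(pvals) < (1:m)*0.05/m))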
(2) A more convenient method
padj <- p.adjust(pvals, method="fdr")
coef(lmod)[padj < 0.05]
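The "fdr" method of p.adjust() is the Benjamini-Hochberg step-up procedure: find the largest i with p(i) <= i*0.05/m and reject every hypothesis up to that rank, including any whose own p-value sits above its own threshold. A sketch with made-up p-values (the vector `p` is hypothetical) comparing the rule written out by hand with p.adjust():

```r
# Hypothetical p-values for ten tests (invented for illustration)
p <- c(0.001, 0.012, 0.014, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8)
m <- length(p)
alpha <- 0.05

o <- order(p)                               # ranks the p-values
below <- p[o] <= (1:m) * alpha / m          # p(i) <= i * alpha / m ?
k <- if (any(below)) max(which(below)) else 0

rejected_by_hand <- sort(o[seq_len(k)])     # reject the k smallest p-values
rejected_padj <- which(p.adjust(p, method = "fdr") <= alpha)

print(rejected_by_hand)
print(rejected_padj)
```

In this example the second p-value (0.012) exceeds its own threshold of 0.01 but is still rejected because the third one passes; a direct comparison of sorted p-values with their thresholds, as in (1), would list only the first and third.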