
R Notes: 17

船长阿道克 2019-04-10 23:59:56

Chapter 15: One Factor Models

Linear models with only categorical predictors (or factors) have traditionally been called analysis of variance (ANOVA) problems. The terminology used in ANOVA-type problems is sometimes different. Predictors are now all qualitative and are typically called factors, which have some number of levels. The regression parameters are now often called effects. We shall consider only models where the parameters are considered fixed but unknown, called fixed-effects models. Random-effects models are used where parameters are taken to be random variables and are not covered in this text.

1. Fit the regression model and test whether the factor levels differ significantly

data(coagulation, package="faraway")

coagulation

plot(coag ~ diet, coagulation,ylab="coagulation time")

stripchart(coag ~ diet, coagulation, vertical=TRUE, method="stack", xlab="diet",ylab="coagulation time")

(1) Fit the model with the default (treatment) coding

lmod <- lm(coag ~ diet, coagulation)

summary(lmod)

round(coef(lmod),1) # show the coefficients

model.matrix(lmod) # inspect the coding (design matrix)

anova(lmod) # F-test for differences between levels
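Under the default treatment coding, the intercept is the mean of the reference level (diet A) and every other coefficient is that level's mean minus the reference mean. The sketch below checks this numerically. The data frame is typed in by hand so the block runs without the faraway package; the values are assumed to match `faraway::coagulation`, and `from_means` is an illustrative name.

```r
# Hand-typed copy of the data (assumed to match faraway::coagulation)
coagulation <- data.frame(
  coag = c(62, 60, 63, 59,
           63, 67, 71, 64, 65, 66,
           68, 66, 71, 67, 68, 68,
           56, 62, 60, 61, 63, 64, 63, 59),
  diet = factor(rep(c("A", "B", "C", "D"), times = c(4, 6, 6, 8)))
)

# Pin treatment coding in the call so a global options(contrasts = ...) cannot change it
lmod <- lm(coag ~ diet, coagulation, contrasts = list(diet = "contr.treatment"))

# Mean response per diet, as a plain named vector
grp_means <- sapply(split(coagulation$coag, coagulation$diet), mean)

# Intercept = mean of A; remaining coefficients = difference from the mean of A
from_means <- c(grp_means["A"], grp_means[-1] - grp_means["A"])
cbind(coef = coef(lmod), from_means = from_means)
```

The two columns should agree, which is why a coefficient near zero for diet D means "D looks like A", not "D has no effect".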

(2) Fit the model without an intercept

lmodi <- lm(coag ~ diet -1, coagulation)

faraway::sumary(lmodi) # sumary() comes from the faraway package

lmnull <- lm(coag ~ 1, coagulation)

anova(lmnull,lmodi)
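Dropping the intercept makes each coefficient the mean of its own diet, and comparing that fit with the intercept-only model gives the same F-statistic as the one-way ANOVA table of the default fit. A minimal sketch, again using a hand-typed data frame that is assumed to match `faraway::coagulation`:

```r
# Hand-typed copy of the data (assumed to match faraway::coagulation)
coagulation <- data.frame(
  coag = c(62, 60, 63, 59,
           63, 67, 71, 64, 65, 66,
           68, 66, 71, 67, 68, 68,
           56, 62, 60, 61, 63, 64, 63, 59),
  diet = factor(rep(c("A", "B", "C", "D"), times = c(4, 6, 6, 8)))
)

lmod   <- lm(coag ~ diet, coagulation)      # with intercept
lmodi  <- lm(coag ~ diet - 1, coagulation)  # one coefficient per diet
lmnull <- lm(coag ~ 1, coagulation)         # common mean only

# Coefficients of the no-intercept fit are the group means
grp_means <- sapply(split(coagulation$coag, coagulation$diet), mean)
cbind(coef = coef(lmodi), grp_means = grp_means)

# The F-statistic is the same either way
f_direct  <- anova(lmod)["diet", "F value"]
f_compare <- anova(lmnull, lmodi)$F[2]
c(direct = f_direct, nested = f_compare)
```

The explicit comparison with `lmnull` is needed because the overall F-test that `summary()` prints for a no-intercept model tests against a zero mean rather than a common mean.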

(3) Using sum coding

options(contrasts=c("contr.sum","contr.poly"))

lmods <- lm(coag ~ diet , coagulation)

faraway::sumary(lmods) # sumary() comes from the faraway package
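With sum coding, the intercept is the average of the group means and each coefficient is a level's deviation from that average. The last level gets no coefficient of its own, because its effect is minus the sum of the others. The sketch below recovers all four effects. It passes the coding through the `contrasts` argument of `lm()` rather than through `options()`, so the session default stays untouched; the hand-typed data are assumed to match `faraway::coagulation`.

```r
# Hand-typed copy of the data (assumed to match faraway::coagulation)
coagulation <- data.frame(
  coag = c(62, 60, 63, 59,
           63, 67, 71, 64, 65, 66,
           68, 66, 71, 67, 68, 68,
           56, 62, 60, 61, 63, 64, 63, 59),
  diet = factor(rep(c("A", "B", "C", "D"), times = c(4, 6, 6, 8)))
)

# Sum coding for this fit only
lmods <- lm(coag ~ diet, coagulation, contrasts = list(diet = "contr.sum"))

# Effects for A, B, C are estimated; the effect for D is minus their sum
eff <- coef(lmods)[-1]
all_effects <- c(eff, -sum(eff))
names(all_effects) <- levels(coagulation$diet)
all_effects

# Intercept plus effect gives back each group mean
coef(lmods)[1] + all_effects
```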

2. Diagnostics

It makes no sense to transform a categorical predictor, but it is reasonable to consider transforming the response.

qqnorm(residuals(lmod))

qqline(residuals(lmod))

plot(jitter(fitted(lmod)),residuals(lmod),xlab="Fitted",ylab="Residuals")

abline(h=0)

# Test for heteroscedasticity (unequal variances across levels)

(1) Using Levene's test

med <- with(coagulation,tapply(coag,diet,median))

ar <- with(coagulation,abs(coag -med[diet]))

anova(lm(ar ~ diet,coagulation)) # a large p-value (p > 0.05) gives no evidence of unequal variances
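The three steps above (group medians, absolute deviations from them, one-way ANOVA on the deviations) can be wrapped into a small helper. This is only a sketch: `levene_median` is a hypothetical name, not a function from any package, and the hand-typed data are assumed to match `faraway::coagulation`.

```r
# Hand-typed copy of the data (assumed to match faraway::coagulation)
coagulation <- data.frame(
  coag = c(62, 60, 63, 59,
           63, 67, 71, 64, 65, 66,
           68, 66, 71, 67, 68, 68,
           56, 62, 60, 61, 63, 64, 63, 59),
  diet = factor(rep(c("A", "B", "C", "D"), times = c(4, 6, 6, 8)))
)

# Hypothetical helper: median-based Levene test for a response y and grouping g
levene_median <- function(y, g) {
  g <- factor(g)
  med <- sapply(split(y, g), median)     # median of each group
  ar <- abs(y - med[as.character(g)])    # absolute deviation from own group median
  anova(lm(ar ~ g))                      # do the mean deviations differ by group?
}

res <- levene_median(coagulation$coag, coagulation$diet)
res
p_value <- res["g", "Pr(>F)"]
p_value
```

A small p-value here would point to unequal spread across diets; a large one gives no reason to doubt the constant-variance assumption.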

(2) Bartlett's test

bartlett.test(coag ~ diet, coagulation)

3. Pairwise Comparisons (e.g. whether levels A and B differ)

After detecting some difference in the levels of the factor, interest centers on which levels or combinations of levels are different.

(tci <- TukeyHSD(aov(coag ~ diet, coagulation))) # outer parentheses print the result
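The object returned by `TukeyHSD()` holds one matrix per factor, with columns `diff`, `lwr`, `upr` and `p adj`. A pair of levels differs at the family-wise 5% level exactly when its interval excludes zero. A minimal sketch of pulling those pairs out, with hand-typed data assumed to match `faraway::coagulation`:

```r
# Hand-typed copy of the data (assumed to match faraway::coagulation)
coagulation <- data.frame(
  coag = c(62, 60, 63, 59,
           63, 67, 71, 64, 65, 66,
           68, 66, 71, 67, 68, 68,
           56, 62, 60, 61, 63, 64, 63, 59),
  diet = factor(rep(c("A", "B", "C", "D"), times = c(4, 6, 6, 8)))
)

tci <- TukeyHSD(aov(coag ~ diet, coagulation))

# Matrix of pairwise differences for the diet factor
tab <- tci$diet

# Pairs whose simultaneous 95% interval does not contain zero
sig <- rownames(tab)[tab[, "lwr"] > 0 | tab[, "upr"] < 0]
sig
```

Calling `plot(tci)` draws the same intervals, which is often easier to read than the table.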

4. False Discovery Rate (comparing each level with the overall mean)

# One approach is to control the familywise error rate (FWER) which is the overall probability of falsely declaring a difference (where none exists).

pvals <- summary(lmod)$coef[,4]

padj <- p.adjust(pvals, method="bonferroni")

coef(lmod)[padj < 0.05]
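The Bonferroni adjustment simply multiplies each p-value by the number of tests and caps the result at 1. The sketch below checks that against `p.adjust()`. The p-values are made up for illustration and do not come from any model in these notes.

```r
# Hypothetical p-values, for illustration only
pvals <- c(a = 0.001, b = 0.012, c = 0.03, d = 0.2, e = 0.6)
m <- length(pvals)

# Bonferroni by hand: scale by the number of tests, cap at 1
padj_manual <- pmin(1, pvals * m)

# Same adjustment via p.adjust()
padj <- p.adjust(pvals, method = "bonferroni")

rbind(manual = padj_manual, p.adjust = padj)

# Tests still significant at the 5% family-wise level
names(pvals)[padj < 0.05]
```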

# An alternative approach is to control the false discovery rate (FDR), which is the proportion of effects identified as significant that are not real.

(1)

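names(which(sort(pvals) < seq_along(pvals)*0.05/length(pvals))) # the i-th smallest p-value is compared with i*0.05/m, where m = length(pvals)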

(2) A more convenient method

padj <- p.adjust(pvals, method="fdr")

coef(lmod)[padj < 0.05]
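In `p.adjust()`, `"fdr"` is an alias for the Benjamini-Hochberg method (`"BH"`). That procedure is a step-up rule: find the largest rank k whose sorted p-value is at or below k * alpha / m, then reject every hypothesis up to rank k, including any whose own p-value sits above its own threshold. The sketch below uses made-up p-values chosen so that this distinction matters, and checks the manual rule against `p.adjust()`.

```r
# Hypothetical p-values, for illustration only
pvals <- c(h1 = 0.001, h2 = 0.03, h3 = 0.031, h4 = 0.032, h5 = 0.5)
alpha <- 0.05
m <- length(pvals)

# Step-up rule by hand
sorted <- sort(pvals)
below <- sorted <= seq_len(m) * alpha / m          # rank-wise comparison
k <- if (any(below)) max(which(below)) else 0      # largest rank that passes
rejected_manual <- names(sorted)[seq_len(k)]

# Same decision from the adjusted p-values
rejected_padj <- names(pvals)[p.adjust(pvals, method = "fdr") <= alpha]

below            # h2 and h3 fail their own thresholds ...
rejected_manual  # ... but are still rejected because h4 passes
rejected_padj
```

Listing only the entries where the rank-wise comparison holds can therefore report fewer effects than the adjusted p-values do.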
