jamesbang

V1

2022/08/26阅读:29主题:橙心

🤩 ggstatsplot | 一个满足你日常统计需求的高颜值R包(五)

🤩 ggstatsplot | 一个满足你日常统计需求的高颜值R包(五)


1. 写在前面

点图用处非常广泛,可以展示变量的分布情况,变量之间的相关性,回归结果等 上期介绍了ggstatsplot包中绘制dotplot,scatterplot的相关函数 本期重点介绍ggcoefstats函数, 高颜值展示你的回归结果

2. 用到的包

rm(list=ls())
library(tidyverse)
library(ggstatsplot)
library(lme4)

补充知识: lme4包提供的建模方式
lmerlinear
glmergeneralized linear
nlmernonlinear

3. 示例数据

dat <- movies_long
dat2 <- lung

4. 回归结果可视化

4.1 t-statistic

应用场景1:t-statistic

✅ linear model (lm) and linear mixed-effects model (lmer/lmerMod)

我们先建2个模型 ^_~

# 模型1: lm model
mod1 <- stats::lm(formula = scale(rating) ~ scale(budget), data = movies_long)

# 模型2: merMod model
mod2 <- lme4::lmer(
formula = scale(rating) ~ scale(budget) + (budget | genre),
data = movies_long,
control = lme4::lmerControl(calc.derivs = FALSE)
)

将上面两个模型一起绘图 @(o・ェ・o)@

combine_plots(
plotlist = list(
ggcoefstats(mod1) +
ggplot2::labs(x = parse(text = "'standardized regression coefficient' ~italic(beta)")),
ggcoefstats(mod2) +
ggplot2::labs(
x = parse(text = "'standardized regression coefficient' ~italic(beta)"),
y = "fixed effects"
)
),
plotgrid.args = list(nrow = 2),
annotation.args = list(title = "Relationship between movie budget and its IMDB rating")
)

Note! 需要注意的是, 对于mixed-effects models,只会绘制fixed effects,因为默认random effects是没有置信区间的•﹏•.
补充:可以使用merDeriv包计算random effects的置信区间。


如果想要查看2种效应模型的结果,可以使用parameters包的model_parameters函数

library(parameters)

# 输出结果
parameters::model_parameters(
lme4::lmer(
formula = scale(rating) ~ scale(budget) + (budget | genre),
data = movies_long,
control = lme4::lmerControl(calc.derivs = FALSE)
)
)

4.2 z-statistic

应用场景2:z-statistic
✅ Aalen’s additive regression model

建模 ٩(͡๏̯͡๏)۶

# 这里需要用到生存分析,我们加载一下`survival`包
library(survival)

afit <- survival::aareg(
formula = Surv(time, status) ~ age + sex + ph.ecog,
data = dat2,
dfbeta = TRUE
)

绘图 ✎~~~✐

ggcoefstats(
x = afit,
title = "Aalen's additive regression model",
subtitle = "(for censored data)",
k = 3
)

4.3 𝜒2 -statistic

应用场景3:𝜒2 -statistic
✅ Cox proportional hazards regression model (coxph)

建模 (:◎)≡

mod_coxph <- survival::coxph(
formula = Surv(time, status) ~ age + sex + frailty(inst),
data = dat2
)

绘图 ( 。_ 。) ✎ _

ggcoefstats(
x = mod_coxph,
title = "Proportional Hazards Regression Model\nwith Frailty penalty function"
)

Note! Aalen's additive modelCox model较为相似,满足等比例风险假设,即(Proportional hazard assumption, PH assumption)时,两种model均可选择。

不再满足时,可以选择Aalen's additive model作为Cox model的替代选择。


4.4 F-statistic

应用场景4:F-statistic
✅ omnibus ANOVA (aov)

建模 (=^x^=)

mod_aov <- stats::aov(formula = rating ~ mpaa + genre + length, data = dat)

绘图 (◆゜∀゜)b

ggcoefstats(
x = mod_aov,
effsize = "omega", # changing the effect size estimate being displayed
point.args = list(color = "red", size = 4, shape = 15),
package = "dutchmasters",
palette = "milkmaid",
title = "omnibus ANOVA",
exclude.intercept = TRUE
) +
# ggplot2::scale_y_discrete(labels = c("")) +
ggplot2::labs(x = "effect size estimate (eta-squared)", y = NULL)

Note! 这里需要注意建模时,+*的含义不同,分别为Additive effectMultiplicative effect, 即独立和相互。


4.5 Bayesian models

应用场景4:Bayesian models
✅ 贝叶斯统计不同于一般的统计方法,其不仅利用模型信息和数据信息,而且充分利用先验信息。 可以看出贝叶斯模型不仅利用了前期的数据信息,还加入了决策者的经验和判断等信息,并将客观因素和主观因素结合起来,对异常情况的发生具有较多的灵活性。

需要利用BayesFactor包进行建模, 内容非常多,不做过多介绍了,有兴趣的小伙伴自己去搜一下吧(๑→ܫ←)

library(BayesFactor)

# one sample t-test
mod1 <- ttestBF(mtcars$wt, mu = 3)

# independent t-test
mod2 <- ttestBF(formula = wt ~ am, data = mtcars)

# paired t-test
mod3 <- ttestBF(x = sleep$extra[1:10], y = sleep$extra[11:20], paired = TRUE)

# correlation
mod4 <- correlationBF(y = iris$Sepal.Length, x = iris$Sepal.Width)

# contingency tabs (not supported)
data("raceDolls")
mod5 <- contingencyTableBF(
raceDolls,
sampleType = "indepMulti",
fixedMargin = "cols"
)

# anova
data("puzzles")
mod6 <- anovaBF(
formula = RT ~ shape * color + ID,
data = puzzles,
whichRandom = "ID",
whichModels = "top",
progress = FALSE
)

# regression-1
mod7 <- regressionBF(rating ~ ., data = attitude, progress = FALSE)

# meta-analysis
t <- c(-.15, 2.39, 2.42, 2.43, -.15, 2.39, 2.42, 2.43)
N <- c(100, 150, 97, 99, 99, 97, 100, 150)
mod8 <- meta.ttestBF(t, N, rscale = 1, nullInterval = c(0, Inf))

# proportion test
mod9 <- proportionBF(y = 15, N = 25, p = .5)

# list of plots
combine_plots(
plotlist = list(
ggcoefstats(mod1, title = "one sample t-test"),
ggcoefstats(mod2, title = "independent t-test"),
ggcoefstats(mod3, title = "paired t-test"),
ggcoefstats(mod4, title = "correlation"),
ggcoefstats(mod5, title = "contingency table"),
ggcoefstats(mod6, title = "anova"),
ggcoefstats(mod7, title = "regression-1"),
ggcoefstats(mod8, title = "meta-analysis"),
ggcoefstats(mod9, title = "proportion test")
),
annotation.args = list(title = "Example from `BayesFactor` package")
)


薯片
薯片
最后祝大家早日不卷!~

点个在看吧各位~ ✐.ɴɪᴄᴇ ᴅᴀʏ 〰

分类:

后端

标签:

后端

作者介绍

jamesbang
V1