jamesbang

V1

2022/09/09阅读：32主题：雁栖湖

🧐 lme4 | 这个线性模型对你来说可能更合理

2用到的包

``rm(list = ls())library(tidyverse)library(lme4)library(modelr)library(broom)``

3示例数据

``create_data <- function() {  df <- tibble(    ids = 1:100,    department = rep(c("sociology", "biology", "english", "informatics", "statistics"), 20),    bases = rep(c(40000, 50000, 60000, 70000, 80000), 20) * runif(100, .9, 1.1),    experience = floor(runif(100, 0, 10)),    raises = rep(c(2000, 500, 500, 1700, 500), 20) * runif(100, .9, 1.1)  )  df <- df %>% mutate(    salary = bases + experience * raises  )  df}library(broom.mixed)df <- create_data()``

Note! 参数依次为：
`"ids"`教师代号
`"department`部门
`"experience"`经验
`"raises"`增长率
`"salary"`收入

4线性模型

4.1 建模

``m1 <- lm(salary ~ experience, data = df)m1broom::tidy(m1)``

``df %>% modelr::add_predictions(m1)``

4.2 可视化

``df %>%  add_predictions(m1) %>%  ggplot(aes(x = experience, y = salary)) +  geom_point() +  geom_line(aes(x = experience, y = pred)) +  labs(x = "Experience", y = "Predicted Salary") +  ggtitle("linear model Salary Prediction") +  scale_color_npg()``

5变化的截距

5.1 建模

``m2 <- lmer(salary ~ experience + (1 | department), data = df)m2broom.mixed::tidy(m2, effects = "ran_vals") # "ran_vals" = random effect values``

5.2 可视化

``df %>%  add_predictions(m2) %>%  ggplot(aes(    x = experience, y = salary, group = department,    colour = department  )) +  geom_point() +  geom_line(aes(x = experience, y = pred)) +  labs(x = "Experience", y = "Predicted Salary") +  ggtitle("Varying Intercept Salary Prediction") +  scale_color_npg()``

6变化的斜率

6.1 建模

``m3 <- lmer(salary ~ experience + (0 + experience | department), data = df)m3broom.mixed::tidy(m3, effects = "ran_vals")``

6.2 可视化

``df %>%  add_predictions(m3) %>%  ggplot(aes(    x = experience, y = salary, group = department,    colour = department  )) +  geom_point() +  geom_line(aes(x = experience, y = pred)) +  labs(x = "Experience", y = "Predicted Salary") +  ggtitle("Varying slope Salary Prediction") +  scale_color_npg()``

7变化的斜率和截距

7.1 建模

``m4 <- lmer(salary ~ experience + (1 + experience | department), data = df)m4broom.mixed::tidy(m4, effects = "ran_vals")``

7.2 可视化

``df %>%  add_predictions(m4) %>%  ggplot(aes(    x = experience, y = salary, group = department,    colour = department  )) +  geom_point() +  geom_line(aes(x = experience, y = pred)) +  labs(x = "Experience", y = "Predicted Salary") +  ggtitle("Varying Intercept and Slopes Salary Prediction") +  scale_color_npg()``

8. 小彩蛋

Q: 不同的`department``base`不同，`raise`也不同，我们得出不同的`α``β`

A: 不等价！

jamesbang
V1

wx🔍: Grassssss 卷起来了