jamesbang

2022/11/11

# 🤧 ggmice | Fill in your missing values with this cute little mouse!

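## 2 Packages used

```r
rm(list = ls())    # start from a clean workspace

library(tidyverse) # ggplot2 and friends
library(mice)      # multiple imputation
library(ggmice)    # plotting helpers for mice
```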

## 3 Example data

```r
dat <- airquality
dat[4:9, 3] <- rep(NA, 6) # add missing values to Wind (column 3)
dat[1:4, 4] <- NA         # add missing values to Temp (column 4)
```

## 4 Data overview

### 4.1 Inspecting missing values

```r
summary(dat) # the NA's row shows the number of missing values per variable
```
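If you only want the missing counts, a quick base R sketch like the one below may be easier to read than the full `summary()` output. It rebuilds `dat` exactly as in section 3, so it runs on its own; the name `na_counts` is just an illustrative choice.

```r
# Rebuild the example data from section 3
dat <- airquality
dat[4:9, 3] <- NA # Wind
dat[1:4, 4] <- NA # Temp

# Number of missing values per variable
na_counts <- colSums(is.na(dat))
na_counts

# Share of missing values per variable, in percent
round(100 * colMeans(is.na(dat)), 1)
```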

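### 4.2 Visualizing missing values

Note! `ggmice` provides `plot_pattern()` to visualize the missing data pattern, so you can see at a glance which variables are missing together. Nice! 🤒

```r
plot_pattern(dat,
             square = FALSE,
             rotate = FALSE)
```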
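For a tabular counterpart of this plot, `mice::md.pattern()` returns the same pattern information as a matrix. The sketch below assumes `mice` is installed and rebuilds `dat` so it runs on its own; `pat` is an illustrative name.

```r
library(mice)

# Rebuild the example data from section 3
dat <- airquality
dat[4:9, 3] <- NA # Wind
dat[1:4, 4] <- NA # Temp

# One row per missing data pattern (1 = observed, 0 = missing);
# row names give the number of rows with that pattern,
# the last row counts the missing values per variable
pat <- md.pattern(dat, plot = FALSE)
pat
```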

### 4.3 Influx-outflux plot

```r
plot_flux(dat,
          label = FALSE,
          caption = FALSE)
```
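The numbers behind this kind of plot can be inspected with `mice::flux()`. Roughly speaking, influx describes how well a variable's missing entries are connected to observed data in other variables, and outflux describes how useful its observed data are for imputing other variables. A minimal sketch, assuming `mice` is installed (`fx` is an illustrative name):

```r
library(mice)

# Rebuild the example data from section 3
dat <- airquality
dat[4:9, 3] <- NA # Wind
dat[1:4, 4] <- NA # Temp

# One row per variable: proportion observed, influx and outflux
fx <- flux(dat)
fx[, c("pobs", "influx", "outflux")]
```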

## 5 Let's visualize

### 5.1 Continuous variables

```r
ggmice(dat, aes(Ozone, Solar.R)) +
  geom_point()
```

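### 5.2 Categorical variables

```r
# Month is stored as an integer in airquality; here it acts as a grouping variable
ggmice(dat, aes(Month, Solar.R)) +
  geom_point()
```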

### 5.3 Faceting

```r
ggmice(dat, aes(Month, Solar.R)) +
  geom_point() +
  facet_wrap(~ Month == 5) # optionally add labeller = label_both
```

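## 6 Imputing missing values with mice

### 6.1 Imputation

Note! Available `method` values include: 👇

`pmm` (predictive mean matching, for numeric data),
`logreg` (logistic regression, for binary data),
`polyreg` (polytomous regression, for unordered factors),
`polr` (proportional odds model, for ordered factors)

All variables in `dat` are numeric, so `pmm` is used here.

```r
imp <- mice(dat, m = 3, method = "pmm") # m = 3 imputed datasets
```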
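To check what the imputation produced, you can pull out one completed dataset with `mice::complete()`. The sketch below is one possible way to do it, not part of the original walkthrough: the `seed` value and `printFlag = FALSE` are my own additions for reproducibility and quieter output, and `imp_demo` and `completed` are illustrative names. The function is called as `mice::complete()` because `tidyr` (part of the tidyverse) also exports a `complete()`.

```r
library(mice)

# Rebuild the example data from section 3
dat <- airquality
dat[4:9, 3] <- NA # Wind
dat[1:4, 4] <- NA # Temp

# Same call as above, plus a seed (assumed value) and no iteration log
imp_demo <- mice(dat, m = 3, method = "pmm", seed = 123, printFlag = FALSE)

# Extract the first of the three completed datasets
completed <- mice::complete(imp_demo, 1)

anyNA(dat)       # missing values before imputation?
anyNA(completed) # missing values after imputation?
```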

### 6.2 Continuous variables after imputation

```r
ggmice(imp, aes(Ozone, Solar.R)) +
  geom_point()
```

### 6.3 Categorical variables after imputation

```r
ggmice(imp, aes(Month, Solar.R)) +
  geom_point()
```

### 6.4 Faceting

```r
# Use the imputed object imp here, not the incomplete dat
ggmice(imp, aes(Month, Solar.R)) +
  geom_point() +
  facet_wrap(~ Month == 5) # optionally add labeller = label_both
```

## 7 Visualizing the imputed datasets

### 7.1 Dotplot

```r
# .imp is the imputation number (0 = observed data)
ggmice(imp, aes(x = .imp, y = Temp)) +
  geom_jitter(height = 0, width = 0.25) +
  labs(x = "Imputation number")
```

### 7.2 Boxplot

```r
ggmice(imp, aes(x = .imp, y = Temp)) +
  geom_jitter(height = 0, width = 0.25) +
  geom_boxplot(width = 0.5, size = 1, alpha = 0.75, outlier.shape = NA) +
  labs(x = "Imputation number")
```

## 8 Algorithm convergence

```r
plot_trace(imp)
# plot_trace(imp, "Temp") # restrict the trace plot to one variable
```
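If the trace lines have not settled yet, one option is to rerun the imputation with more iterations through the `maxit` argument of `mice()`. The sketch below assumes `maxit = 20`, which is an arbitrary choice for illustration, as are the `seed` value and the name `imp_long`.

```r
library(mice)

# Rebuild the example data from section 3
dat <- airquality
dat[4:9, 3] <- NA # Wind
dat[1:4, 4] <- NA # Temp

# Same imputation model, with maxit = 20 iterations (assumed value)
imp_long <- mice(dat, m = 3, method = "pmm",
                 maxit = 20, seed = 123, printFlag = FALSE)

# Number of iterations stored in the mids object
imp_long$iteration

# With ggmice loaded, inspect the longer chains:
# plot_trace(imp_long, "Temp")
```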
