forked from epinotes/InjuryEpi
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathR_coding_injury_tbi.Rmd
221 lines (113 loc) · 5.8 KB
/
R_coding_injury_tbi.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
---
title: "R Coding for State Injury Report"
author: "Mamadou Ndiaye - Washington State Department of Health"
date: "August 30, 2016"
output: html_document
subtitle: 2014 TBI Death in Washington State
abstract: 'Steps to provide count of traumatic brain injury (TBI) deaths using R:
(1)Define the injury with regular expressions, (2)Specify the columns with the icd-10
codes of interest, (3) Use a predefined function to match the regular expression
patterns with the icd10 codes , (4) Create a new variable assigning ''1'' to observation
matching the defined regular expression in any of the selected field, assigning
''0'' otherwise.(5) Make the tables of counts after the new variable for the defined
injury is created.'
---
```{r, echo=F,message=FALSE, warning=FALSE}
options(scipen = 3)
knitr::opts_chunk$set(
eval = FALSE,
comment = NA,
error = FALSE,
tidy = FALSE,
cache = F
)
```
## Introduction
### R libraries and custom functions
The R libraries **dplyr**, **tidyr** and **Hmisc** were used in this analysis. They can be loaded with the following codes (assuming they are already installed with the function *"install.packages"*):
```{r}
library(dplyr)
library(tidyr)
library(Hmisc)
```
The functions *"create.diag"* and *"create_cond_diag"* can be loaded from my github page with:
```{r}
source("https://raw.githubusercontent.com/epinotes/InjuryEpi/master/create_diag")
source("https://raw.githubusercontent.com/epinotes/InjuryEpi/master/create_cond_diag")
source("https://raw.githubusercontent.com/epinotes/InjuryEpi/master/onclip")
```
### The Death dataset
The variables used in the death dataset (**dea2014f**) were:
- The diagnosis code columns: *underly*, *mltcse1* to *mltcse20*
- The age group column (11 age groups): *agegrp11*
- The sex column: *sex*
## Injury Definitions with Regular Expressions
### All injuries
Definition from CDC: The ICD-10 codes __V01–Y36, Y85–Y87, Y89, *U01–*U03__ in the underlying cause of death field. They are re-written as regular expressions in the following that save them in the R object **.cdc_inj**.
```{r}
.cdc_inj <- "^V[0-9][1-9]|^[W-X][0-9][0-9]|^Y[0-2][0-9]|^Y3[0-6]|^Y8[5-7|9]|U0[1-3]"
```
Select the column of interest (the underlying cause of death column):
```{r}
colx <- grep("^underly$", names(dea2014f))
```
Create the variable for any injury, *cdc_inj*:
```{r}
dea2014f$cdc_inj <- new.diag_m(data=dea2014f, expr=.cdc_inj, colvec= colx)
```
Check the results with the function *describe()* (get information on the functiion by running *?Hmisc::describe*)
```{r}
describe(dea2014f$cdc_inj)
```
Show the requested CDC table of count of injury deaths by the 11 age groups (assigned to the object *dea2014f_inj*) :
```{r}
dea2014f_inj <- dea2014f %>% filter(cdc_inj == 1) %>% group_by(agegrp11) %>% summarise(deaths = n()) %>% na.omit()
print(dea2014f_inj)
```
we can copy the table in a clipboard then paste it on the CDC injury table templates:
```{r}
onclip(dea2014f_inj)
```
### TBI Injuries
#### 1. All TBI Injuries
Once a case is defined as injury (as previously done); it is TBI if any field of the multiple cause of death file has one of these ICD-10 code:
S01.0–S01.9, S02.0, S02.1, S02.3, S02.7–S02.9, S04.0, S06.0–S06.9, S07.0, S07.1, S07.8, S07.9, S09.7–S09.9, T01.0*, T02.0*, T04.0*, T06.0*, T90.1, T90.2, T90.4, T90.5, T90.8, T90.9
or re-written as a regular expression:
```{r}
.cdc_tbi <- "S01[0-9]|S02[0|1|3|7|8|9]|S040|S06[0-9]|S07[0|1|8|9]|S09[7-9]|T0[1|2|4|6]0|T90[1|2|4|5|8|9]"
```
Then we select the underlying cause column ("underly") and the 20 multiple cause columns ("mltcse1", "mltcse20"):
```{r}
col_m <- grep("underly$|mltcse", names(dea2014f), value = F)
```
Create the TBI variable
```{r}
dea2014f <- dea2014f %>% mutate(cdc_tbi = create_cond_diag(., expr=.cdc_tbi, colvec= col_m, cond.var = cdc_inj))
```
Check the results:
```{r}
describe(dea2014f$cdc_tbi)
```
Create a table for count of TBI deaths by the 11 age groups.
```{r}
dea2014f_tbi11 <- dea2014f %>% filter(!is.na(agegrp11) & cdc_tbi == 1) %>% mutate(agegrp11 = as.numeric(agegrp11)) %>% group_by(agegrp11) %>% summarise(deaths = n()) %>% right_join(data.frame(agegrp11 = 1:11)) %>% complete(agegrp11 = full_seq(agegrp11, period = 1), fill = list(deaths = 0))
```
Print the table or copy it on the clipboard
```{r}
print(dea2014f_tbi11)
onclip(dea2014f_tbi11[,2])
```
Create similar tables restrited to male cases:
```{r}
dea2014f_tbi11_m <- dea2014f %>% filter(!is.na(agegrp11) & cdc_tbi == 1 & sex == "M") %>% mutate(agegrp11 = as.numeric(agegrp11)) %>% group_by(agegrp11) %>% summarise(deaths = n()) %>% right_join(data.frame(agegrp11 = 1:11)) %>% complete(agegrp11 = full_seq(agegrp11, period = 1), fill = list(deaths = 0))
```
Then restricted to female cases
```{r}
dea2014f_tbi11_f <- dea2014f %>% filter(!is.na(agegrp11) & cdc_tbi == 1 & sex == "F") %>% mutate(agegrp11 = as.numeric(agegrp11)) %>% group_by(agegrp11) %>% summarise(deaths = n()) %>% right_join(data.frame(agegrp11 = 1:11)) %>% complete(agegrp11 = full_seq(agegrp11, period = 1), fill = list(deaths = 0))
```
Create a subset (**dea2014_tbi**) of the death file with all the TBI deaths. It will be used to build the other tables of specific injuries with TBI:
```{r}
sel_inj <- c(1, 3, 51, 52, grep("underly$|ecode|mltcse", names(dea2014f ), value = F), 169:180)
dea2014_tbi <- dea2014f[sel_inj] %>% filter(cdc_tbi == 1)
```
#### 2. TBI Injuries and Motor Vehicle Traffic Crashes