FDA Adverse Event Reporting System (FAERS)


The FDA Adverse Event Reporting System (FAERS) compiles the prescription drug-related side effects reported by physicians or patients in the United States. Either party can voluntarily submit a report to the FDA or to the manufacturer, which must then report that event. FAERS is the post-marketing safety surveillance program for drug and therapeutic biological products.

  • Multiple tables, linkable by the primaryid field, covering patient demographics, drug/biologic information, patient outcomes, reporting source, and drug start and end dates.

  • Published quarterly since 2004 with the latest events reported to the FDA; a revised system began in the fourth quarter of 2012.

  • Maintained by the United States Food and Drug Administration (FDA).

Simplified Download and Importation

The R lodown package downloads and imports all available FAERS microdata when you pass "faers" and an output_dir = parameter to the lodown() function. Depending on your internet connection and computer processing speed, you might prefer to run this step overnight.

library(lodown)
lodown( "faers" , output_dir = file.path( path.expand( "~" ) , "FAERS" ) )

Analysis Examples with base R

Load the 2016 fourth-quarter drug, outcome, and demographic tables, then merge them into a single data frame:

faers_drug_df <- 
    readRDS( file.path( path.expand( "~" ) , "FAERS" , "2016 q4/drug16q4.rds" ) )

faers_outcome_df <- 
    readRDS( file.path( path.expand( "~" ) , "FAERS" , "2016 q4/outc16q4.rds" ) )

faers_demo_df <- 
    readRDS( file.path( path.expand( "~" ) , "FAERS" , "2016 q4/demo16q4.rds" ) )

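# inner join on the columns the two tables share, such as primaryid
faers_df <- merge( faers_drug_df , faers_outcome_df )

# left join: keep every row from the step above, even without a demographic match
faers_df <- merge( faers_df , faers_demo_df , all.x = TRUE )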
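To make the behavior of the two merge() calls concrete, here is a minimal sketch on toy data frames. The values are invented for illustration and are not FAERS records. With no by = argument, merge() joins on every column name the two tables share and keeps only matching rows, while all.x = TRUE keeps unmatched rows from the left table. Note that a report with several drugs and several outcomes yields one row per drug-outcome combination, so the merged table is not one row per report.

```r
# toy stand-ins for the drug, outcome, and demographic tables (invented values)
toy_drug <- data.frame(
    primaryid = c( 1 , 1 , 2 , 3 ) ,
    drugname = c( "A" , "B" , "A" , "C" )
)

toy_outc <- data.frame(
    primaryid = c( 1 , 1 , 2 ) ,
    outc_code = c( "HO" , "LT" , "DE" )
)

toy_demo <- data.frame( primaryid = 1 , sex = "F" )

# inner join: primaryid 3 has no outcome record, so it drops out;
# primaryid 1 has two drugs and two outcomes, so it expands to four rows
toy_df <- merge( toy_drug , toy_outc )

# left join: primaryid 2 has no demographic record, so its sex is NA
toy_df <- merge( toy_df , toy_demo , all.x = TRUE )

nrow( toy_df )                  # 5
sum( toy_df$primaryid == 1 )    # 4
sum( is.na( toy_df$sex ) )      # 1
```

If you need one row per report instead, one option is to summarize the drug and outcome tables to the primaryid level before merging.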

Variable Recoding

Add new columns to the data set:

faers_df <- 
    transform( 
        faers_df , 
        
        physician_reported = as.numeric( occp_cod == "MD" ) ,
        
        init_fda_year = as.numeric( substr( init_fda_dt , 1 , 4 ) )
        
    )
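As a quick check of how these two recodes behave, the sketch below applies the same transform() call to a three-row toy data frame (invented values, assuming init_fda_dt is stored as an eight-digit YYYYMMDD number). A missing occupation code produces a missing physician_reported value rather than a zero.

```r
# toy records: a physician, a non-physician, and a missing occupation code
toy_df <- data.frame(
    occp_cod = c( "MD" , "CN" , NA ) ,
    init_fda_dt = c( 20161003 , 20161115 , 20151231 )
)

toy_df <-
    transform(
        toy_df ,

        # 1 for physicians, 0 for everyone else, NA when occp_cod is missing
        physician_reported = as.numeric( occp_cod == "MD" ) ,

        # substr() coerces the number to text, then keeps the first four digits
        init_fda_year = as.numeric( substr( init_fda_dt , 1 , 4 ) )
    )

toy_df$physician_reported    # 1 0 NA
toy_df$init_fda_year         # 2016 2016 2015
```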

Unweighted Counts

Count the unweighted number of records in the table, overall and by groups:

nrow( faers_df )

table( faers_df[ , "outc_code" ] , useNA = "always" )
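The useNA = "always" argument matters because table() silently drops missing values by default. A minimal sketch on an invented vector of outcome codes:

```r
# invented outcome codes, one of them missing
toy_codes <- c( "HO" , "DE" , "HO" , NA , "OT" )

# default behavior: the NA is excluded from the counts
default_counts <- table( toy_codes )

# useNA = "always": missing values get their own cell
full_counts <- table( toy_codes , useNA = "always" )

sum( default_counts )    # 4
sum( full_counts )       # 5, matching length( toy_codes )
```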

Descriptive Statistics

Calculate the mean (average) of a linear variable, overall and by groups:

mean( faers_df[ , "init_fda_year" ] , na.rm = TRUE )

tapply(
    faers_df[ , "init_fda_year" ] ,
    faers_df[ , "outc_code" ] ,
    mean ,
    na.rm = TRUE 
)

Calculate the distribution of a categorical variable, overall and by groups:

prop.table( table( faers_df[ , "sex" ] ) )

prop.table(
    table( faers_df[ , c( "sex" , "outc_code" ) ] ) ,
    margin = 2
)
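The margin = 2 argument computes shares within each column, so here each outcome code's distribution across sex sums to one. A small sketch with invented values shows the effect:

```r
# six invented records
toy_df <- data.frame(
    sex = c( "F" , "F" , "M" , "F" , "M" , "M" ) ,
    outc_code = c( "HO" , "HO" , "HO" , "DE" , "DE" , "DE" )
)

# rows are sex, columns are outc_code; margin = 2 normalizes each column
col_shares <-
    prop.table(
        table( toy_df[ , c( "sex" , "outc_code" ) ] ) ,
        margin = 2
    )

colSums( col_shares )        # DE and HO each sum to 1
col_shares[ "F" , "HO" ]     # 2 of the 3 HO records are F
```

Using margin = 1 instead would normalize within each row, giving the distribution of outcomes within each sex.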

Calculate the sum of a linear variable, overall and by groups:

sum( faers_df[ , "init_fda_year" ] , na.rm = TRUE )

tapply(
    faers_df[ , "init_fda_year" ] ,
    faers_df[ , "outc_code" ] ,
    sum ,
    na.rm = TRUE 
)

Calculate the median (50th percentile) of a linear variable, overall and by groups:

quantile( faers_df[ , "init_fda_year" ] , 0.5 , na.rm = TRUE )

tapply(
    faers_df[ , "init_fda_year" ] ,
    faers_df[ , "outc_code" ] ,
    quantile ,
    0.5 ,
    na.rm = TRUE 
)
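In the tapply() calls above, every argument after the function name is forwarded to that function, which is how 0.5 and na.rm = TRUE reach quantile(). The sketch below, on invented values, shows the grouped median ignoring a missing year:

```r
# invented records: the DE group contains one missing year
toy_df <- data.frame(
    init_fda_year = c( 2014 , 2016 , NA , 2015 , 2016 , 2016 ) ,
    outc_code = c( "DE" , "DE" , "DE" , "HO" , "HO" , "HO" )
)

# 0.5 and na.rm = TRUE are passed through to quantile() for each group
group_medians <-
    tapply(
        toy_df[ , "init_fda_year" ] ,
        toy_df[ , "outc_code" ] ,
        quantile ,
        0.5 ,
        na.rm = TRUE
    )

group_medians[ "DE" ]    # 2015, the midpoint of 2014 and 2016
group_medians[ "HO" ]    # 2016
```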

Subsetting

Limit your data.frame to elderly persons:

sub_faers_df <- subset( faers_df , age_grp == "E" )

Calculate the mean (average) of this subset:

mean( sub_faers_df[ , "init_fda_year" ] , na.rm = TRUE )
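One detail worth knowing when subsetting: subset() drops rows where the condition evaluates to NA, whereas logical bracket indexing keeps them as all-NA rows. A minimal sketch on invented values (age_grp codes other than "E" are placeholders here):

```r
# invented records, one with a missing age group
toy_df <- data.frame(
    age_grp = c( "E" , "A" , NA , "E" ) ,
    init_fda_year = c( 2016 , 2015 , 2016 , 2014 )
)

# subset() keeps only rows where the condition is TRUE
sub_toy_df <- subset( toy_df , age_grp == "E" )

# bracket indexing turns the NA comparison into an extra all-NA row
bracket_df <- toy_df[ toy_df$age_grp == "E" , ]

nrow( sub_toy_df )                                    # 2
nrow( bracket_df )                                    # 3
mean( sub_toy_df[ , "init_fda_year" ] , na.rm = TRUE )    # 2015
```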

Measures of Uncertainty

Calculate the variance, overall and by groups:

var( faers_df[ , "init_fda_year" ] , na.rm = TRUE )

tapply(
    faers_df[ , "init_fda_year" ] ,
    faers_df[ , "outc_code" ] ,
    var ,
    na.rm = TRUE 
)

Regression Models and Tests of Association

Perform a t-test:

t.test( init_fda_year ~ physician_reported , faers_df )

Perform a chi-squared test of association:

this_table <- table( faers_df[ , c( "physician_reported" , "sex" ) ] )

chisq.test( this_table )

Fit a generalized linear model (with no family = argument, glm() uses the gaussian family with an identity link):

glm_result <- 
    glm( 
        init_fda_year ~ physician_reported + sex , 
        data = faers_df
    )

summary( glm_result )
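Because the call above does not set family =, the fitted model is an ordinary linear regression. The sketch below, on invented values, confirms that glm() and lm() return the same coefficients in that case:

```r
# eight invented records
toy_df <- data.frame(
    init_fda_year = c( 2014 , 2015 , 2016 , 2016 , 2015 , 2013 , 2016 , 2014 ) ,
    physician_reported = c( 1 , 0 , 1 , 0 , 1 , 0 , 0 , 1 ) ,
    sex = c( "F" , "F" , "M" , "M" , "F" , "M" , "F" , "M" )
)

# default family is gaussian with an identity link
glm_fit <- glm( init_fda_year ~ physician_reported + sex , data = toy_df )

# the same model fit by ordinary least squares
lm_fit <- lm( init_fda_year ~ physician_reported + sex , data = toy_df )

family( glm_fit )$family                           # "gaussian"
all.equal( coef( glm_fit ) , coef( lm_fit ) )      # TRUE
```

For a binary dependent variable such as physician_reported, a logistic model with family = binomial() would be the more usual choice.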

Analysis Examples with dplyr

The R dplyr library offers an alternative grammar of data manipulation to base R and SQL syntax. dplyr offers many verbs, such as summarize, group_by, and mutate, along with pipe-able functions and the tidyverse style of non-standard evaluation. The dplyr vignette details the available features. As a starting point for FAERS users, this code replicates previously presented examples:

library(dplyr)

# as_tibble() replaces tbl_df(), which is deprecated in current dplyr releases
faers_tbl <- as_tibble( faers_df )

Calculate the mean (average) of a linear variable, overall and by groups:

faers_tbl %>%
    summarize( mean = mean( init_fda_year , na.rm = TRUE ) )

faers_tbl %>%
    group_by( outc_code ) %>%
    summarize( mean = mean( init_fda_year , na.rm = TRUE ) )
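The other base R examples translate in the same way. As a rough sketch, assuming dplyr is installed, the recoding, counting, median, and subsetting steps might look like this on a toy tibble (invented values, not FAERS records):

```r
library(dplyr)

# four invented records
toy_tbl <-
    as_tibble(
        data.frame(
            occp_cod = c( "MD" , "CN" , "MD" , "PH" ) ,
            outc_code = c( "HO" , "HO" , "DE" , "DE" ) ,
            init_fda_year = c( 2015 , 2016 , 2016 , NA )
        )
    )

# variable recoding with mutate() instead of transform()
toy_tbl <-
    toy_tbl %>%
    mutate( physician_reported = as.numeric( occp_cod == "MD" ) )

# unweighted counts by group
toy_counts <- toy_tbl %>% count( outc_code )

# median by group
toy_medians <-
    toy_tbl %>%
    group_by( outc_code ) %>%
    summarize( median = median( init_fda_year , na.rm = TRUE ) )

# subsetting with filter() instead of subset()
toy_sub <- toy_tbl %>% filter( physician_reported == 1 )
```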