--- title: 'Example for a Sick-Sicker-Dead model' author: "Javier Sanchez Alvarez and Valerie Aponte Ribero" date: '`r format(Sys.Date(), "%B %d, %Y")`' output: html_document: number_sections: true toc: true toc_float: true embed-resources: true self-contained-math: true vignette: > %\VignetteIndexEntry{example_ssd} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", echo = TRUE ) ``` # Introduction This document runs a discrete event simulation model in the context of a late oncology model to show how the functions can be used to generate a model in only a few steps. When running a DES, it's important to consider speed. Simulation based models can be computationally expensive, which means that using efficient coding can have a substantial impact on performance. ## Main options ```{r setup} library(WARDEN) library(dplyr) library(ggplot2) library(kableExtra) library(purrr) ``` ```{r main_opt, results='hide', message=FALSE} options(scipen = 999) options(digits=3) options(tibble.print_max = 50) ``` # General inputs with delayed execution Initial inputs and flags that will be used in the model can be defined below. We can define inputs that will only change across scenarios (`sensitivity_inputs`), inputs which are common to all patients (`common_all_inputs`) within a simulation, inputs that are unique to a patient independently of the treatment (e.g. natural time to death, defined in `common_pt_inputs`), and inputs that are unique to that patient and that treatment (`unique_pt_inputs`). Items can be included through the `add_item` function, and can be used in subsequent items. All these inputs are generated before the events and the reaction to events are executed. Furthermore, the program first executes `common_all_inputs`, then `common_pt_inputs` and then `unique_pt_inputs`. So one could use the items generated in `common_all_inputs` in `unique_pt_inputs`. Note that inputs are "reset" after each patient, so if patient 1 arm "noint" changes `util.sick` to be = 2, even if it's a common parameter for everyone, it would be reset to 1 for patient 1 arm "int". Note that time to death is set in the common_pt_inputs, but it could also just be set in the `add_tte` function explained below. The user has full flexibility on how to implement this type of inputs. There are some auxiliary functions to help setting up inputs, like `pick_val_v` and `pick_val` (and `pick_psa`, see below the section on [Sensitivity Analysis](#sensitivity-analysis)). Note that `pick_val_v` and `pick_val` can be directly loaded as parameters (in fact, a named list will be loaded directly by R). A small tweak is needed if it's the first item added, in which the item list must be initiated by using `add_item()` (see below). ```{r input_delayed} #We don't need to use sensitivity_inputs here, so we don't add that object #Put objects here that do not change on any patient or intervention loop common_all_inputs <-add_item( util.sick = 0.8, util.sicker = 0.5, cost.sick = 3000, cost.sicker = 7000, cost.int = 1000, coef_noint = log(0.2), HR_int = 0.8, drc = 0.035, #different values than what's assumed by default drq = 0.035, random_seed_sicker_i = sample.int(100000,1000,replace = FALSE) ) #to be used as seeds to draw the time to event for sicker, to ensure same luck for the same patient independently of the arm #Put objects here that do not change as we loop through treatments for a patient common_pt_inputs <- add_item(death= max(0.0000001,rnorm(n=1, mean=12, sd=3))) #Put objects here that change as we loop through treatments for each patient (e.g. events can affect fl.tx, but events do not affect nat.os.s) unique_pt_inputs <- add_item(fl.sick = 1, q_default = util.sick, c_default = cost.sick + if(arm=="int"){cost.int}else{0}) ``` # Events ## Add Initial Events Events are added below through the `add_tte` function. We use this function once applying to both interventions. We must define several arguments: one to indicate the intervention, one to define the names of the events used, one to define the names of other objects created that we would like to store (optional, maybe we generate an intermediate input which is not an event but that we want to save) and the actual input in which we generate the time to event. Events and other objects will be automatically initialized to `Inf`. We draw the times to event for the patients. Note: the order of the `evts` argument that appears first will be used as a reference of the order in which to process events in the case of ties (so "sick" would be processed before "sicker" if there is a tie in time to event.) Note that the model will use the evnets defined in `evts` argument to look for the objects both defined in the input list and in this expression to allocate time to events. If an event is declared in `evts` but not defined elsewhere, then they would be assumed TTE of `Inf` by default. This chunk is a bit more complex, so it's worth spending a bit of time explaining it. The `init_event_list` object is populated by using the `add_tte` function which applies to both arms, "int" strategy and "noint" strategy. We first declare the `start` time to be `0`. Note this could also be separated by arm if the user wants to have more clarity using two `add_tte` functions (i.e., `add_tte(arm="noint"...) %>% add_tte(arm="int"...)`). We then proceed to generate the actual time to event. We use the `draw_tte()` function to generate the time to event, though one can set this up in any other way (e.g., using `rexp`). One should always be aware of how the competing risks interact with each other. While we have abstracted from these type of corrections here, it is recommended to have an understanding about how these affect the results and have a look at the competing risks/semi-competing risks literature. ```{r model_evts} init_event_list <- add_tte(arm=c("noint","int"), evts = c("sick","sicker","death") ,input={ sick <- 0 sicker <- draw_tte(1,dist="exp", coef1=coef_noint, beta_tx = ifelse(arm=="int",HR_int,1), seed = random_seed_sicker_i[i]) #this way the value would be the same if it wasn't for the HR, effectively "cloning" patients luck }) ``` ## Add Reaction to Those Events Once the initial times of the events have been defined, we also need to declare how events react and affect each other. To do so, we use the `evt_react_list` object and the `add_reactevt` function. This function just needs to state which event is affected, and the actual reaction (usually setting flags to 1 or 0, or creating new/adjusting events). There are a series of objects that can be used in this context to help with the reactions. Apart from the global objects and flags defined above, we can also use `curtime` for the current event time, `prevtime` for the time of the previous event, `cur_evtlist` for the named vector of events that is yet to happen for that patient, `arm` for the current treatment in the loop, `evt` for the current event being processed, `i` expresses the patient iteration, and `simulation` the specific simulation (relevant when the number of simulations is greater than 1). Furthermore, one can also call any other input/item that has been created before or create new ones. For example, we could even modify a cost/utility item by changing it directly, e.g. through `modify_item(list(cost.idfs.tx=500))`. Item | What does it do ------------ | ------------- `curtime` | Current event time (numeric) `prevtime` | Time of the previous event (numeric) `cur_evtlist` | Named vector of events that is yet to happen for that patient (named numeric vector) `evt` | Current event being processed (character) `i` | Patient being iterated (numeric) `arm` | Intervention being iterated (character) `simulation` | Simulation being iterated (numeric) `sens` | Sensitivity analysis being iterated (numeric) The functions to add/modify events and inputs use lists. Whenever several inputs/events are added or modified, it's recommended to group them within one function, as it reduces the computation cost. So rather than use two `modify_item` with a list of one element, it's better to group them into a single `modify_item` with a list of two elements. `new_event`allows to generate events and add them to the vector of events. It accepts more than one event. `modify_event` allows to modify events (e.g. delay death). When adding an event, the name of the events and the time of the events must be defined. When using `modify_event`, one must indicate which events are affected and what are the new times of the events. If the event specified does not exist or has already occurred, it will be ignored. `modify_event` with `create_if_null = TRUE` argument will also generate events if they don't exist. Note that one could potentially omit part of the modeling set in `init_event_list` and actually define new events dynamically through the reactions (we do that below for the `"ae"` event). However, this can have an impact in computation time, so if possible it's always better to use `init_event_list`. `modify_item` and `modify_item_seq`allow to modify and add items. Elements defined within this function are not evaluated sequentially in `modify_item` (i.e. defining `modify_item(list(fl.new = 1, var1 = fl.new * 5)))` will give an error if `fl.new` was not defined outside this function), while `modify_item_seq` will do it sequentially at a slightly bigger computational cost, so both are left as choices for the user. Note that one can modify costs/utilities by using the construction `type_name_category`, where type is either "qaly" or "cost", name is the name (e.g., "default") and category is the category used (e.g., "instant"), so one could pass `cost_default_instant` and modify the cost. The list of relevant functions to be used within `add_reactevt` are: Function | What does it do | How to use it ------------ | ------------- | ------------- `modify_item()` | Adds & Modifies items/flags/variables for future events | `modify_item(list("fl.idfs.ontx"=0,"fl.tx.beva"=0))` `modify_item_seq()` | Adds & Modifies items/flags/variables for future events sequentially | `modify_item_seq(list("fl.idfs.ontx"=0,"fl.tx.beva"=0))` `new_event()` | Adds events to the vector of events for that patient | `new_event(rep(list("ae"=curtime + 0.001),5))` `modify_event()` | Modifies (or creates, with `create_if_null = TRUE`) existing events by changing their time | `modify_event(list("os"=curtime +5, "ttot"=curtime+0.0001))` The model will run until `curtime` is set to `Inf`, so the event that terminates the model (in this case, `os`), should modify `curtime` and set it to `Inf`. Finally, note that there could be two different ways of accumulating continuous outcomes, backwards (i.e., in the example below, we would set q_default = util.sick at the sicker event, and modify the `q_default` value in the death event) and forwards (as in the example below). This option can be modified in the `run_sim` function using the `accum_backwards` argument, which assumes forwards by default. ```{r model_reaction} evt_react_list <- add_reactevt(name_evt = "sick", input = {}) %>% add_reactevt(name_evt = "sicker", input = { modify_item(list(q_default = util.sicker, c_default = cost.sicker + if(arm=="int"){cost.int}else{0}, fl.sick = 0)) }) %>% add_reactevt(name_evt = "death", input = { modify_item(list(q_default = 0, c_default = 0, curtime = Inf)) }) ``` ### Extract Interactions Between Events As an additional optional step, to easily see the interactions between the reactions of the events, we can also now use the `extract_from_reactions` function to obtain a data.frame with all the relationships defined in the reactions in the model. This functions looks at all `modify_item`, `modify_item_seq`, `modify_event` and `new_event` and checks which elements are being defined there, their definition, and whether they are triggered conditionally (e.g., `"if(a==1){modify_item(list(b=2))}"`). Note it would be straightforward to build a network graph showcasing all the interactions between events in terms of events affecting other events, or to show which (and how) events affect specific items. ```{r model_relatioships} df_interactions <- extract_from_reactions(evt_react_list) kable(df_interactions) ``` # Costs and Utilities Costs and utilities are introduced below. However, it's worth noting that the model is able to run without costs or utilities. Utilities/Costs/Other outputs are defined by declaring which object belongs to utilities/costs/other outputs, and whether they need to be discounted continuously or discretely (instantaneous). These will be passed to the `run_sim` function. ## Utilities ```{r utilities} util_ongoing <- "q_default" ``` ## Costs ```{r costs} cost_ongoing <- "c_default" ``` # Model ## Model Execution The model can be run using the function `run_sim` below. We must define the number of patients to be simulated, the number of simulations, whether we want to run a PSA or not, the strategy list, the inputs, events and reactions defined above, utilities, costs and also if we want any extra output and the level of ipd data desired to be exported. It is worth noting that the `psa_bool` argument does not run a PSA automatically, but is rather an additional input/flag of the model that we use as a reference to determine whether we want to use a deterministic or stochastic input. As such, it could also be defined in `common_all_inputs` as the first item to be defined, and the result would be the same. However, we recommend it to be defined in `run_sim`. Note that the distribution chosen, the number of events and the interaction between events can have a substantial impact on the running time of the model. Debugging can be implemented using the argument `debug` in the `run_sim` function. ```{r model_run} #Logic is: per patient, per intervention, per event, react to that event. results <- run_sim( npats=1000, # number of patients to be simulated n_sim=1, # number of simulations to run psa_bool = FALSE, # use PSA or not. If n_sim > 1 and psa_bool = FALSE, then difference in outcomes is due to sampling (number of pats simulated) arm_list = c("int", "noint"), # intervention list common_all_inputs = common_all_inputs, # inputs common that do not change within a simulation common_pt_inputs = common_pt_inputs, # inputs that change within a simulation but are not affected by the intervention unique_pt_inputs = unique_pt_inputs, # inputs that change within a simulation between interventions init_event_list = init_event_list, # initial event list evt_react_list = evt_react_list, # reaction of events util_ongoing_list = util_ongoing, cost_ongoing_list = cost_ongoing, ipd = 1 ) ``` # Post-processing of Model Outputs ## Summary of Results Once the model has been run, we can use the results and summarize them using the `summary_results_det` to print the results of the last simulation (if `nsim = 1`, it's the deterministic case), and `summary_results_sim` to show the PSA results (with the confidence intervals). We can also use the individual patient data generated by the simulation, which we collect here to plot in the `psa_ipd` object. ```{r post-processing_summary} summary_results_det(results[[1]][[1]]) #print first simulation summary_results_sim(results[[1]]) summary_results_sens(results) psa_ipd <- bind_rows(map(results[[1]], "merged_df")) psa_ipd[1:10,] %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive")) ``` We can also check what has been the absolute number of events per strategy. ```{r post-processing_analysis,echo=FALSE, message=FALSE} psa_ipd %>% group_by(arm,evtname) %>% summarise(n=n()) %>% arrange(arm,-n)%>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive")) ``` ## Plots We now use the data output to plot the histograms/densities of the simulation. ```{r post-processing_plots1, fig.width=10, fig.height=8} data_plot <- results[[1]][[1]]$merged_df %>% filter(evtname != "sick") %>% group_by(arm,evtname,simulation) %>% mutate(median = median(evttime)) %>% ungroup() ggplot(data_plot) + geom_density(aes(fill = arm, x = evttime), alpha = 0.7) + geom_vline(aes(xintercept=median,col=arm)) + facet_wrap( ~ evtname, scales = "free") + scale_y_continuous(expand = c(0, 0)) + scale_x_continuous(expand = c(0, 0)) + theme_bw() ``` We can also plot the patient level incremental QALY/costs. ```{r post-processing_plots, fig.width=10, fig.height=8, message=FALSE} data_qaly_cost<- psa_ipd[,.SD[1],by=.(pat_id,arm,simulation)][,.(arm,qaly=total_qalys,cost=total_costs,pat_id,simulation)] data_qaly_cost[,ps_id:=paste(pat_id,simulation,sep="_")] mean_data_qaly_cost <- data_qaly_cost %>% group_by(arm) %>% summarise(across(where(is.numeric),mean)) ggplot(data_qaly_cost,aes(x=qaly, y = cost, col = arm)) + geom_point(alpha=0.15,shape = 21) + geom_point(data=mean_data_qaly_cost, aes(x=qaly, y = cost, fill = arm), shape = 21,col="black",size=3) + scale_y_continuous(expand = c(0, 0)) + scale_x_continuous(expand = c(0, 0)) + theme_bw()+ theme(axis.text.x = element_text(angle = 90, vjust = .5)) ``` # Sensitivity Analysis ## Inputs In this case, inputs must be created first to change across sensitivity analysis. To do so, the item list `sensitivity_inputs` can be used. In this case, we also use `pick_val_v` which allows the model to automatically pick the relevant value (no PSA, PSA or sensitivity analysis) based on the corresponding boolean flags of psa_bool and sensitivity_bool. In this case we also use the `sens` iterator for each sensitivity analysis and the `n_sensitivity` which is an argument in `run_sim`. Note that we have then just changed how the inputs are handled in common_all_inputs, but the same could be done with unique_pt_inputs, but in those cases, as the inputs change per patient, the `pick_val_v` or `pick_val` functions should be applied within unique_pt_inputs to make sure they are evaluated when it correspond. Note that for the psa we are directly calling the distributions and passing the parameters.Note also that the `sens_name_used` is automatically computed by the engine and is accesible to the user (it's the name of the sensitivity analysis, e.g., "scenario 1"). The indicator parameter in `pick_val_v` and `pick_val` is used to determine which parameters are left "as is" and which ones are to be substituted with the sensitivity value. There are two ways to do this, either by setting it in a binary way (1 or 0), or by using the indicator as the number of the parameter values to be varied (useful when several parameters are varied at the same time, or only specific values of a vector are varied). This can be set by using `indicator_sens_binary` argument. Note that `pick_val_v` and `pick_val` can be directly loaded as parameters (in fact, a named list will be loaded directly by R). A small tweak is needed if it's the first item added, in which the item list must be initiated by using `add_item()` (see below). Note that one can use a list of lists in the case where the base_value or any of the other parameters are vectors instead of elements of length 1. In this case, we showcase a list but it could also use a data.frame. `pick_psa` can be used to select the correct PSA distributions. ```{r dsa_inputs} #Load some data list_par <- list(parameter_name = list("util.sick","util.sicker","cost.sick","cost.sicker","cost.int","coef_noint","HR_int"), base_value = list(0.8,0.5,3000,7000,1000,log(0.2),0.8), DSA_min = list(0.6,0.3,1000,5000,800,log(0.1),0.5), DSA_max = list(0.9,0.7,5000,9000,2000,log(0.4),0.9), PSA_dist = list("rnorm","rbeta_mse","rgamma_mse","rgamma_mse","rgamma_mse","rnorm","rlnorm"), a=list(0.8,0.5,3000,7000,1000,log(0.2),0.8), b=lapply(list(0.8,0.5,3000,7000,1000,log(0.2),0.8), function(x) abs(x/5)), scenario_1=list(0.6,0.3,1000,5000,800,log(0.1),0.5), scenario_2=list(0.9,0.7,5000,9000,2000,log(0.4),0.9) ) sensitivity_inputs <-add_item( indicators = if(sensitivity_bool){ create_indicators(sens,n_sensitivity*length(sensitivity_names),rep(1,length(list_par[[1]])))}else{ rep(1,length(list_par[[1]]))} #vector of indicators, value 0 everywhere except at sens, where it takes value 1 (for dsa_min and dsa_max, if not sensitivity analysis, then we activate all of them, i.e., in a PSA) ) common_all_inputs <-add_item() %>% add_item( pick_val_v(base = list_par[["base_value"]], psa = pick_psa(list_par[["PSA_dist"]],rep(1,length(list_par[["PSA_dist"]])),list_par[["a"]],list_par[["b"]]), sens = list_par[[sens_name_used]], psa_ind = psa_bool, sens_ind = sensitivity_bool, indicator = indicators, names_out = list_par[["parameter_name"]] ) ) %>% add_item(random_seed_sicker_i = sample(1:1000,1000,replace = FALSE)) #we don't add this variable ot the sensitivity analysis ``` ## Model Execution The model is executed as before, just adding the `sensitivity_inputs`, `sensitivity_names`, `sensitivity_bool` and `n_sensitivity` arguments. Note that the total number of sensitivity iterations is given not by n_sensitivity, but by n_sensitivity * length(sensitivity_names), so in this case it will be 2 x n_sensitivity, or 2 x 7 = 14. For two scenario analysis it would be 2 x 1 = 2, with the `indicators` variable defined in the previous section taking value 1 for all the variables altered in the scenario, and 0 otherwise. ```{r run_dsa} results <- run_sim( npats=100, # number of patients to be simulated n_sim=1, # number of simulations to run psa_bool = FALSE, # use PSA or not. If n_sim > 1 and psa_bool = FALSE, then difference in outcomes is due to sampling (number of pats simulated) arm_list = c("int", "noint"), # intervention list common_all_inputs = common_all_inputs, # inputs common that do not change within a simulation common_pt_inputs = common_pt_inputs, # inputs that change within a simulation but are not affected by the intervention unique_pt_inputs = unique_pt_inputs, # inputs that change within a simulation between interventions init_event_list = init_event_list, # initial event list evt_react_list = evt_react_list, # reaction of events util_ongoing_list = util_ongoing, cost_ongoing_list = cost_ongoing, sensitivity_inputs = sensitivity_inputs, sensitivity_names = c("DSA_min","DSA_max"), sensitivity_bool = TRUE, n_sensitivity = length(list_par[[1]]), input_out = unlist(list_par[["parameter_name"]]) ) ``` ## Check results We briefly check below that indeed the engine has been changing the corresponding parameter value. ```{r dsa_check} data_sensitivity <- bind_rows(map_depth(results,2, "merged_df")) #Check mean value across iterations as PSA is off data_sensitivity %>% group_by(sensitivity) %>% summarise_at(c("util.sick","util.sicker","cost.sick","cost.sicker","cost.int","coef_noint","HR_int"),mean) ``` ## Model Execution, probabilistic DSA The model is executed as before, just activating the psa_bool option ```{r run_dsa_psa} results <- run_sim( npats=100, n_sim=6, psa_bool = TRUE, arm_list = c("int", "noint"), common_all_inputs = common_all_inputs, common_pt_inputs = common_pt_inputs, unique_pt_inputs = unique_pt_inputs, init_event_list = init_event_list, evt_react_list = evt_react_list, util_ongoing_list = util_ongoing, cost_ongoing_list = cost_ongoing, sensitivity_inputs = sensitivity_inputs, sensitivity_names = c("DSA_min","DSA_max"), sensitivity_bool = TRUE, n_sensitivity = length(list_par[[1]]), input_out = unlist(list_par[["parameter_name"]]) ) ``` ## Check results We briefly check below that indeed the engine has been changing the corresponding parameter value. ```{r dsa_check_psa} data_sensitivity <- bind_rows(map_depth(results,2, "merged_df")) #Check mean value across iterations as PSA is off data_sensitivity %>% group_by(sensitivity) %>% summarise_at(c("util.sick","util.sicker","cost.sick","cost.sicker","cost.int","coef_noint","HR_int"),mean) ``` ## Model Execution, Simple PSA The model is executed as before, just activating the psa_bool option and deactivating the sensitivity_bool and removing sensitivity_names and setting n_sensitivity = 1 ```{r run_psa} results <- run_sim( npats=100, n_sim=10, psa_bool = TRUE, arm_list = c("int", "noint"), common_all_inputs = common_all_inputs, common_pt_inputs = common_pt_inputs, unique_pt_inputs = unique_pt_inputs, init_event_list = init_event_list, evt_react_list = evt_react_list, util_ongoing_list = util_ongoing, cost_ongoing_list = cost_ongoing, sensitivity_inputs = sensitivity_inputs, sensitivity_bool = FALSE, n_sensitivity = 1, input_out = unlist(list_par[["parameter_name"]]) ) ``` ## Check results We briefly check below that indeed the engine has been changing the corresponding parameter values. ```{r check_psa} data_simulation <- bind_rows(map_depth(results,2, "merged_df")) #Check mean value across iterations as PSA is off data_simulation %>% group_by(simulation) %>% summarise_at(c("util.sick","util.sicker","cost.sick","cost.sicker","cost.int","coef_noint","HR_int"),mean) ```