This vignette is a guide to policy_data(). As the name suggests, the function creates a policy_data object with a specific data structure, making it easy to use in combination with policy_def(), policy_learn(), and policy_eval(). The vignette is also a guide to some of the associated S3 functions which transform or access parts of the data; see ?policy_data and methods(class="policy_data").
We start with a simple single-stage example, then consider a fixed two-stage example with varying action sets and data in wide format, and finally look at an example with a stochastic number of stages and data in long format.
Consider a simple single-stage problem with covariates/state variables (Z, L, B), binary action variable A, and utility outcome U. We use sim_single_stage() to simulate data:
(d <- sim_single_stage(n = 5e2, seed=1)) |> head()
#> Z L B A U
#> 1 1.2879704 -1.4795962 0 1 -0.9337648
#> 2 1.6184181 1.2966436 0 1 6.7506026
#> 3 1.2710352 -1.0431352 0 1 -0.3377580
#> 4 -0.2157605 0.1198224 1 0 1.4993427
#> 5 -1.0671588 -1.3663727 0 1 -9.1718727
#> 6 -1.4469746 -0.4018530 0 0 -2.6692961
We tell policy_data() which variables define the action, the state covariates, and the utility:
pd <- policy_data(d, action="A", covariates=list("Z", "B", "L"), utility="U")
pd
#> Policy data with n = 500 observations and maximal K = 1 stages.
#>
#> action
#> stage 0 1 n
#> 1 278 222 500
#>
#> Baseline covariates:
#> State covariates: Z, B, L
#> Average utility: -0.98
In the single-stage case the history H is just (B, Z, L). We access the history and actions using get_history():
get_history(pd)$H |> head()
#> Key: <id, stage>
#> id stage Z B L
#> <int> <int> <num> <num> <num>
#> 1: 1 1 1.2879704 0 -1.4795962
#> 2: 2 1 1.6184181 0 1.2966436
#> 3: 3 1 1.2710352 0 -1.0431352
#> 4: 4 1 -0.2157605 1 0.1198224
#> 5: 5 1 -1.0671588 0 -1.3663727
#> 6: 6 1 -1.4469746 0 -0.4018530
get_history(pd)$A |> head()
#> Key: <id, stage>
#> id stage A
#> <int> <int> <char>
#> 1: 1 1 1
#> 2: 2 1 1
#> 3: 3 1 1
#> 4: 4 1 0
#> 5: 5 1 1
#> 6: 6 1 0
Similarly, we access the utility outcomes U using get_utility().
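A minimal sketch of accessing the utilities with get_utility(), assuming the polle package is installed. The single-stage object pd is rebuilt here so the snippet runs on its own; the object name u is only illustrative, and the printed output is omitted.

```r
library(polle)

# Rebuild the single-stage policy data from above.
d <- sim_single_stage(n = 5e2, seed = 1)
pd <- policy_data(d, action = "A", covariates = list("Z", "B", "L"), utility = "U")

# One row per subject with the subject id and the utility U.
u <- get_utility(pd)
head(u)
```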
Consider a two-stage problem with observations O = (B, BB, L_1, C_1, U_1, A_1, L_2, C_2, U_2, A_2, U_3). Following the general notation introduced in Section 3.1 of (Nordland and Holst 2023), (B, BB) are the baseline covariates, S_k = (L_k, C_k) are the state covariates at stage k, A_k is the action at stage k, and U_k is the reward at stage k. The utility is the sum of the rewards, U = U_1 + U_2 + U_3.
We use sim_two_stage_multi_actions() to simulate data:
d <- sim_two_stage_multi_actions(n=2e3, seed = 1)
colnames(d)
#> [1] "B" "BB" "L_1" "C_1" "A_1" "L_2" "C_2" "A_2" "L_3" "U_1" "U_2" "U_3"
Note that the data is in wide format. The data is transformed using policy_data() with instructions on which variables define the actions, baseline covariates, state covariates, and the rewards:
pd <- policy_data(d,
                  action = c("A_1", "A_2"),
                  baseline = c("B", "BB"),
                  covariates = list(L = c("L_1", "L_2"),
                                    C = c("C_1", "C_2")),
                  utility = c("U_1", "U_2", "U_3"))
pd
#> Policy data with n = 2000 observations and maximal K = 2 stages.
#>
#> action
#> stage default no yes n
#> 1 0 1017 983 2000
#> 2 769 826 405 2000
#>
#> Baseline covariates: B, BB
#> State covariates: L, C
#> Average utility: 0.39
The length of the character vector action determines the number of stages K (in this case 2). If the number of stages is 2 or more, the covariates argument must be a named list. Each element must be a character vector with length equal to the number of stages. If a covariate is not available at a given stage we insert an NA value, e.g., L = c(NA, "L_2").
Finally, the utility argument must be either a single character string (the utility is observed after stage K) or a character vector of length K+1 with the names of the rewards.
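As an illustration of the single-string form of utility, the sketch below sums the three rewards into one column and passes only that column name. It assumes the polle package is installed; the names d_sum and pd_sum and the summed column U are hypothetical and not part of the package. The reward columns are dropped from the copy purely as a precaution against name clashes.

```r
library(polle)

d <- sim_two_stage_multi_actions(n = 2e3, seed = 1)

# Hypothetical copy of the data with the summed rewards stored in one column, U.
d_sum <- as.data.frame(d)
d_sum$U <- d_sum$U_1 + d_sum$U_2 + d_sum$U_3
d_sum[c("U_1", "U_2", "U_3")] <- NULL

# Same call as above, except that utility is a single column name.
pd_sum <- policy_data(d_sum,
                      action = c("A_1", "A_2"),
                      baseline = c("B", "BB"),
                      covariates = list(L = c("L_1", "L_2"),
                                        C = c("C_1", "C_2")),
                      utility = "U")
pd_sum
```

Under these assumptions the average utility should agree with the one reported for the reward-vector version, since the utility is defined as the sum of the rewards.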
In this example, the observed action sets vary for each stage. get_action_set() returns the global action set and get_stage_action_sets() returns the action set for each stage.
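A short sketch of both accessors, assuming the polle package is installed. The two-stage object pd is rebuilt so the snippet is self-contained, and the printed output is omitted.

```r
library(polle)

# Rebuild the two-stage policy data from above.
d <- sim_two_stage_multi_actions(n = 2e3, seed = 1)
pd <- policy_data(d,
                  action = c("A_1", "A_2"),
                  baseline = c("B", "BB"),
                  covariates = list(L = c("L_1", "L_2"),
                                    C = c("C_1", "C_2")),
                  utility = c("U_1", "U_2", "U_3"))

# All action values observed at any stage.
get_action_set(pd)

# A list with one observed action set per stage.
get_stage_action_sets(pd)
```

Given the action table printed above, the first stage should contain only "no" and "yes", while the second stage also contains "default".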
The full histories H_1 = (B, BB, L_1, C_1) and H_2 = (B, BB, L_1, C_1, A_1, L_2, C_2) are available using get_history() with full_history = TRUE:
get_history(pd, stage = 1, full_history = TRUE)$H |> head()
#> Key: <id, stage>
#> id stage L_1 C_1 B BB
#> <int> <num> <num> <num> <num> <char>
#> 1: 1 1 0.9696772 1.7112790 -0.6264538 group2
#> 2: 2 1 -2.1994065 -2.6431237 0.1836433 group1
#> 3: 3 1 1.9480938 2.0619342 -0.8356286 group2
#> 4: 4 1 0.1798532 1.0066957 1.5952808 group2
#> 5: 5 1 0.4150568 0.1538534 0.3295078 group2
#> 6: 6 1 0.6468405 -0.0982121 -0.8204684 group3
get_history(pd, stage = 2, full_history = TRUE)$H |> head()
#> Key: <id, stage>
#> id stage A_1 L_1 L_2 C_1 C_2 B
#> <int> <num> <char> <num> <num> <num> <num> <num>
#> 1: 1 2 yes 0.9696772 -0.7393434 1.7112790 2.4243702 -0.6264538
#> 2: 2 2 no -2.1994065 0.4828756 -2.6431237 -2.6647281 0.1836433
#> 3: 3 2 no 1.9480938 0.4803055 2.0619342 2.4747615 -0.8356286
#> 4: 4 2 yes 0.1798532 -0.3574497 1.0066957 2.0571959 1.5952808
#> 5: 5 2 no 0.4150568 2.0473541 0.1538534 -0.9649004 0.3295078
#> 6: 6 2 yes 0.6468405 -2.3701135 -0.0982121 1.0989523 -0.8204684
#> BB
#> <char>
#> 1: group2
#> 2: group1
#> 3: group2
#> 4: group2
#> 5: group2
#> 6: group3
Similarly, we access the associated actions at each stage via list element A:
get_history(pd, stage = 1, full_history = TRUE)$A |> head()
#> Key: <id, stage>
#> id stage A_1
#> <int> <num> <char>
#> 1: 1 1 yes
#> 2: 2 1 no
#> 3: 3 1 no
#> 4: 4 1 yes
#> 5: 5 1 no
#> 6: 6 1 yes
get_history(pd, stage = 2, full_history = TRUE)$A |> head()
#> Key: <id, stage>
#> id stage A_2
#> <int> <num> <char>
#> 1: 1 2 no
#> 2: 2 2 no
#> 3: 3 2 default
#> 4: 4 2 yes
#> 5: 5 2 yes
#> 6: 6 2 no
Alternatively, the state/Markov type history and actions are available using full_history = FALSE:
get_history(pd, full_history = FALSE)$H |> head()
#> Key: <id, stage>
#> id stage L C B BB
#> <int> <int> <num> <num> <num> <char>
#> 1: 1 1 0.9696772 1.711279 -0.6264538 group2
#> 2: 1 2 -0.7393434 2.424370 -0.6264538 group2
#> 3: 2 1 -2.1994065 -2.643124 0.1836433 group1
#> 4: 2 2 0.4828756 -2.664728 0.1836433 group1
#> 5: 3 1 1.9480938 2.061934 -0.8356286 group2
#> 6: 3 2 0.4803055 2.474761 -0.8356286 group2
get_history(pd, full_history = FALSE)$A |> head()
#> Key: <id, stage>
#> id stage A
#> <int> <int> <char>
#> 1: 1 1 yes
#> 2: 1 2 no
#> 3: 2 1 no
#> 4: 2 2 no
#> 5: 3 1 no
#> 6: 3 2 default
Note that policy_data() overrides the action variable names to A_1, A_2, … in the full history case and A in the state/Markov history case.
As in the single-stage case, we access the utility, i.e., the sum of the rewards, using get_utility().
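A minimal sketch of that call for the two-stage data, assuming the polle package is installed. The object pd is rebuilt so the snippet runs on its own; the name u is illustrative and the printed output is omitted.

```r
library(polle)

# Rebuild the two-stage policy data from above.
d <- sim_two_stage_multi_actions(n = 2e3, seed = 1)
pd <- policy_data(d,
                  action = c("A_1", "A_2"),
                  baseline = c("B", "BB"),
                  covariates = list(L = c("L_1", "L_2"),
                                    C = c("C_1", "C_2")),
                  utility = c("U_1", "U_2", "U_3"))

# One row per subject; U is the sum of the rewards U_1, U_2 and U_3.
u <- get_utility(pd)
head(u)
```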
In this example we illustrate how polle handles decision processes with a stochastic number of stages, see Section 3.5 in (Nordland and Holst 2023). The data is simulated using sim_multi_stage(). Detailed information on the simulation is available in ?sim_multi_stage. We simulate data from 2000 iid subjects.
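A sketch of the simulation call, assuming the polle package is installed. The seed is an assumption chosen to match the earlier examples; the returned list holds the stage_data and baseline_data tables used below.

```r
library(polle)

# Simulate 2000 subjects; seed = 1 is an assumed choice for reproducibility.
d <- sim_multi_stage(2e3, seed = 1)

# The result is a list containing the stage data and the baseline data.
names(d)
```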
As described, the stage data is in long format:
d$stage_data[, -(9:10)] |> head()
#> id stage event t A X X_lead U
#> <num> <num> <num> <num> <char> <num> <num> <num>
#> 1: 1 1 0 0.000000 1 1.3297993 0.0000000 0.0000000
#> 2: 1 2 0 1.686561 1 -0.7926711 1.3297993 0.3567621
#> 3: 1 3 0 3.071768 0 3.5246509 -0.7926711 2.1778778
#> 4: 1 4 1 3.071768 <NA> NA NA 0.0000000
#> 5: 2 1 0 0.000000 1 0.7635935 0.0000000 0.0000000
#> 6: 2 2 0 1.297336 1 -0.5441694 0.7635935 0.5337427
The id variable is important for identifying which rows belong to each subject. The baseline data uses the same id variable:
d$baseline_data |> head()
#> id B
#> <num> <int>
#> 1: 1 0
#> 2: 2 0
#> 3: 3 1
#> 4: 4 1
#> 5: 5 1
#> 6: 6 0
The data is transformed using policy_data() with type = "long". The names of the id, stage, event, action, and utility variables must be specified. The event variable, inspired by the event variable in survival::Surv(), is 0 whenever an action occurs and 1 for a terminal event.
pd <- policy_data(data = d$stage_data,
                  baseline_data = d$baseline_data,
                  type = "long",
                  id = "id",
                  stage = "stage",
                  event = "event",
                  action = "A",
                  utility = "U")
pd
#> Policy data with n = 2000 observations and maximal K = 4 stages.
#>
#> action
#> stage 0 1 n
#> 1 113 1887 2000
#> 2 844 1039 1883
#> 3 956 74 1030
#> 4 72 0 72
#>
#> Baseline covariates: B
#> State covariates: t, X, X_lead
#> Average utility: 2.46
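To connect the event variable to the stage counts in the printed table, the following base R sketch counts the rows with event equal to 0 for each subject. It assumes the polle package is installed and that the data was simulated with seed = 1; the name n_stages is illustrative only.

```r
library(polle)

d <- sim_multi_stage(2e3, seed = 1)
sd <- d$stage_data

# Rows with event == 0 are decision stages; count them per subject id.
n_stages <- table(sd$id[sd$event == 0])

# Distribution of the number of decision stages across subjects.
table(n_stages)

# The largest count corresponds to the maximal K reported by policy_data().
max(n_stages)
```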
In some cases we are only interested in analyzing a subset of the decision stages. partial() trims the maximum number of decision stages.
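A minimal sketch of trimming the multi-stage data to at most three decision stages, assuming the polle package is installed. The choice K = 3 and the name pd3 are illustrative, and the printed output is omitted.

```r
library(polle)

# Rebuild the long-format policy data from above.
d <- sim_multi_stage(2e3, seed = 1)
pd <- policy_data(data = d$stage_data,
                  baseline_data = d$baseline_data,
                  type = "long",
                  id = "id",
                  stage = "stage",
                  event = "event",
                  action = "A",
                  utility = "U")

# Keep at most the first three decision stages (K = 3 is an illustrative choice).
pd3 <- partial(pd, K = 3)
pd3
```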
sessionInfo()
#> R version 4.4.1 (2024-06-14)
#> Platform: aarch64-apple-darwin23.5.0
#> Running under: macOS Sonoma 14.6.1
#>
#> Matrix products: default
#> BLAS: /Users/oano/.asdf/installs/R/4.4.1/lib/R/lib/libRblas.dylib
#> LAPACK: /Users/oano/.asdf/installs/R/4.4.1/lib/R/lib/libRlapack.dylib; LAPACK version 3.12.0
#>
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#>
#> time zone: Europe/Copenhagen
#> tzcode source: internal
#>
#> attached base packages:
#> [1] splines stats graphics grDevices utils datasets methods
#> [8] base
#>
#> other attached packages:
#> [1] ggplot2_3.5.1 data.table_1.15.4 polle_1.5
#> [4] SuperLearner_2.0-29 gam_1.22-4 foreach_1.5.2
#> [7] nnls_1.5
#>
#> loaded via a namespace (and not attached):
#> [1] sass_0.4.9 utf8_1.2.4 future_1.33.2
#> [4] lattice_0.22-6 listenv_0.9.1 digest_0.6.36
#> [7] magrittr_2.0.3 evaluate_0.24.0 grid_4.4.1
#> [10] iterators_1.0.14 mvtnorm_1.2-5 policytree_1.2.3
#> [13] fastmap_1.2.0 jsonlite_1.8.8 Matrix_1.7-0
#> [16] survival_3.6-4 fansi_1.0.6 scales_1.3.0
#> [19] numDeriv_2016.8-1.1 codetools_0.2-20 jquerylib_0.1.4
#> [22] lava_1.8.0 cli_3.6.3 rlang_1.1.4
#> [25] mets_1.3.4 parallelly_1.37.1 future.apply_1.11.2
#> [28] munsell_0.5.1 withr_3.0.0 cachem_1.1.0
#> [31] yaml_2.3.8 tools_4.4.1 parallel_4.4.1
#> [34] colorspace_2.1-0 globals_0.16.3 vctrs_0.6.5
#> [37] R6_2.5.1 lifecycle_1.0.4 pkgconfig_2.0.3
#> [40] timereg_2.0.5 progressr_0.14.0 bslib_0.7.0
#> [43] pillar_1.9.0 gtable_0.3.5 Rcpp_1.0.13
#> [46] glue_1.7.0 xfun_0.45 tibble_3.2.1
#> [49] highr_0.11 knitr_1.47 farver_2.1.2
#> [52] htmltools_0.5.8.1 rmarkdown_2.27 labeling_0.4.3
#> [55] compiler_4.4.1