This function simulates data from a platform trial with a given number of experimental treatment arms entering at given time points and a shared control arm. The primary endpoint is a binary endpoint. The user specifies the timing of adding arms in terms of patients recruited to the trial so far and the sample size per experimental treatment arm.

Usage

datasim_bin(
  num_arms,
  n_arm,
  d,
  period_blocks = 2,
  p0,
  OR,
  lambda,
  trend,
  N_peak,
  n_wave,
  trend_mean = 0,
  trend_var = 0.5,
  full = FALSE,
  check = TRUE
)

Arguments

num_arms

Integer. Number of experimental treatment arms in the trial.

n_arm

Integer. Sample size per experimental treatment arm (assumed equal).

d

Integer vector with timings of adding new arms in terms of number of patients recruited to the trial so far. The first entry must be 0, so that the trial starts with at least one experimental treatment arm, and the entries must be non-decreasing. The vector length equals num_arms.

period_blocks

Integer. Multiplier that defines the block size for the block randomization. The block size in each period equals period_blocks times the number of active arms in the period (see Details). Default=2.

p0

Double. Response probability in the control arm.

OR

Double vector with treatment effects in terms of odds ratios for each experimental treatment arm compared to control. The elements of the vector (odds ratios) are ordered by the entry time of the experimental treatment arms (e.g., the first entry in the vector corresponds to the odds ratio of the first experimental treatment arm). The vector length equals num_arms.

lambda

Vector containing numerical entries or the string "random", indicating the strength of the time trend in each arm, ordered by the entry time of the arms (e.g., the first entry in the vector corresponds to the time trend in the control arm, the second entry to the time trend in the first experimental treatment arm). The vector length equals num_arms+1, as a time trend in the control arm is also allowed. In case of a random time trend, its strength is generated from a normal distribution.

trend

String indicating the time trend pattern ("linear", "linear_2", "stepwise", "stepwise_2", "inv_u" or "seasonal"). See Details for more information.

N_peak

Integer. Timepoint at which the inverted-u time trend switches direction in terms of overall sample size (i.e. after how many recruited participants the trend direction switches).

n_wave

Integer. Number of cycles (waves) the seasonal trend should have.

trend_mean

Double. In case of random time trends, the strength of the time trend will be generated from N(trend_mean, trend_var). Default=0.

trend_var

Double. In case of random time trends, the strength of the time trend will be generated from N(trend_mean, trend_var). Default=0.5.

full

Logical. Indicates whether the output should be in form of a data frame with variables needed for the analysis only (FALSE) or in form of a list containing more information (TRUE). Default=FALSE.

check

Logical. Indicates whether the input parameters should be checked by the function. Default=TRUE, unless the function is called by a simulation function, where the default is FALSE.

Value

Data frame: simulated trial data (if full=FALSE, i.e. default) with the following columns:

  • j - patient recruitment index

  • response - binary response for patient j

  • treatment - index of the treatment patient j was allocated to

  • period - index of the period patient j was recruited in

or List (if full=TRUE) containing the following elements:

  • Data - simulated trial data, including an additional column p with the probability used for simulating the response for patient j

  • n_total - total sample size in the trial

  • n_arm - sample size per arm (assumed equal)

  • num_arms - number of experimental treatment arms in the trial

  • d - timings of adding new arms

  • SS_matrix - matrix with the sample sizes per arm and per period

  • period_blocks - number to multiply the number of active arms with, in order to get the block size per period

  • p0 - response probability in the control arm

  • OR - odds ratios for each experimental treatment arm

  • lambda - strength of time trend in each arm

  • time_dep_effect - time dependent treatment effects for each experimental treatment arm (for computing the bias)

  • trend - time trend pattern

Details

Design assumptions:

  • The simulated platform trial consists of a given number of experimental treatment arms (specified by the argument num_arms) and one control arm that is shared across the whole platform.

  • Participants are indexed by entry order, assuming that at each time unit exactly one participant is recruited and the time of recruitment and observation of the response are equal.

  • All participants are assumed to be eligible for all arms in the trial, i.e. the same inclusion and exclusion criteria apply to all experimental and control arms.

  • Equal sample sizes (given by parameter n_arm) in all experimental treatment arms are assumed.

  • The duration of the trial is divided into so-called periods, defined as time intervals bounded by distinct time points of any treatment arm entering or leaving the platform. Hence, multiple treatment arms entering or leaving at the same time point imply the start of only one additional period.

  • Allocation ratio of 1:1:...:1 in each period. Furthermore, block randomization is used to assign patients to the active arms. Block size in each period = period_blocks * (number of active arms in the period).

  • If the period sample size is not a multiple of the block size, arms for the remaining participants are chosen by sampling without replacement from a vector containing the indices of the active arms, replicated ceiling(remaining sample size/number of active arms) times.
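The allocation scheme described in the last two points can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's internal code: the helper name allocate_period and its arguments n_period and active_arms are hypothetical and were introduced for this example.

```r
# Hypothetical helper (not part of the package): allocate `n_period` patients
# to the arms in `active_arms` using permuted blocks.
allocate_period <- function(n_period, active_arms, period_blocks = 2) {
  n_active   <- length(active_arms)
  block_size <- period_blocks * n_active
  n_blocks   <- n_period %/% block_size
  n_rest     <- n_period %% block_size

  # Complete blocks: each arm appears period_blocks times, in random order.
  # sample.int() permutes positions, so a single arm index is never
  # interpreted by sample() as the range 1:x.
  one_block <- rep(active_arms, times = period_blocks)
  full <- unlist(lapply(seq_len(n_blocks), function(b) {
    one_block[sample.int(block_size)]
  }))

  # Remaining patients: sample without replacement from the arm indices
  # replicated ceiling(n_rest / n_active) times.
  rest <- if (n_rest > 0) {
    pool <- rep(active_arms, times = ceiling(n_rest / n_active))
    pool[sample.int(length(pool), n_rest)]
  } else {
    active_arms[0]
  }

  c(full, rest)
}

set.seed(1)
alloc <- allocate_period(n_period = 20, active_arms = c(0, 1, 2))
table(alloc)
```

With 20 patients and three active arms, the block size is 6, so three complete blocks cover 18 patients and the remaining 2 are drawn from a single copy of the arm indices. The arm counts therefore differ by at most one.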

Data generation:

The binary response \(y_j\) for patient \(j\) is generated according to:

$$g(E(y_j)) = \eta_0 + \sum_{k=1}^K \theta_k \cdot I(k_j=k) + f(j)$$

where \(g(\cdot)\) is the logit link function, \(\eta_0\) (the logit of parameter p0) is the log odds in the control arm, and \(\theta_k\) (the log of the \(k\)-th element of parameter OR) is the log odds ratio of treatment \(k\) compared to control. \(K\) is the total number of experimental treatment arms in the trial (parameter num_arms) and \(k_j\) is the index of the treatment arm patient \(j\) is allocated to.
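As a minimal sketch of this model in base R, the response probabilities can be obtained by adding the log odds ratio of the allocated arm and the time trend to the control log odds, and then applying the inverse logit. The input values and the variable names arm, f_j, eta0 and theta are assumptions made for this illustration. The time trend is set to zero here to keep the example short.

```r
# Illustrative inputs (hypothetical values, chosen only for this sketch)
p0  <- 0.7                 # response probability in the control arm
OR  <- c(1.8, 1.2)         # odds ratios of arms 1 and 2 versus control
arm <- c(0, 1, 2, 0, 1, 2) # k_j: allocated arm per patient (0 = control)
f_j <- rep(0, length(arm)) # time trend f(j); zero here, i.e. no trend

eta0  <- qlogis(p0)        # logit(p0): log odds in the control arm
theta <- c(0, log(OR))     # log odds ratios; 0 for the control arm

# Linear predictor and back-transformation with the inverse logit
eta <- eta0 + theta[arm + 1] + f_j
p   <- plogis(eta)

# Draw one Bernoulli response per patient
set.seed(1)
response <- rbinom(length(p), size = 1, prob = p)
data.frame(j = seq_along(arm), treatment = arm, p = round(p, 3), response)
```

Control patients keep the probability p0, while for the experimental arms the odds p0/(1 - p0) are multiplied by the corresponding odds ratio before being converted back to a probability.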

The function \(f(j)\) denotes the time trend, whose strength is indicated by \(\lambda_{k_j}\) (parameter lambda) and which can have the following patterns (parameter trend):

  • "linear" - trend starts at the beginning of the trial and the log odds increases or decreases linearly with a slope of \(\lambda\), according to the function \(f(j) = \lambda \cdot \frac{j-1}{N-1}\), where \(N\) is the total sample size in the trial

  • "linear_2" - trend starts after the first period (i.e. there is no time trend in the first period) and the log odds increases or decreases linearly with a slope of \(\lambda\), according to the function \(f(j) = \lambda \cdot \frac{j-1}{N-1}\), where \(N\) is the total sample size in the trial

  • "stepwise" - the log odds is constant in each period and increases or decreases by \(\lambda\) each time any treatment arm enters or leaves the trial (i.e. in each period), according to the function \(f(j) = \lambda_{k_j} \cdot (c_j - 1)\), where \(c_j\) is an index of the period patient \(j\) was enrolled in

  • "stepwise_2" - the log odds is constant in each period and increases or decreases by \(\lambda\) each time a new treatment arm is added to the trial, according to the function \(f(j) = \lambda_{k_j} \cdot (w_j - 1)\), where \(w_j\) is an indicator of how many treatment arms have already entered the ongoing trial, when patient \(j\) was enrolled

  • "inv_u" - the log odds increases up to the point \(N_p\) (parameter N_peak) and decreases afterwards, linearly with a slope of \(\lambda\), according to the function \(f(j) = \lambda \cdot \frac{j-1}{N-1} (I(j \leq N_p) - I(j > N_p))\), where \(N_p\) indicates the point at which the trend turns from positive to negative in terms of the sample size (note that for negative \(\lambda\), the log odds decreases first and increases afterwards)

  • "seasonal" - the log odds increases and decreases periodically with a magnitude of \(\lambda\), according to the function \(f(j) = \lambda \cdot \mathrm{sin} \big( \psi \cdot 2\pi \cdot \frac{j-1}{N-1} \big)\), where \(\psi\) indicates how many cycles the time trend should have (parameter n_wave)

Trials with no time trend can be simulated too, by setting all elements of the vector lambda to zero and choosing an arbitrary pattern.
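To make the formulas concrete, three of the patterns ("linear", "stepwise" and "seasonal") can be written as small base R functions. The function names below are hypothetical and only mirror the formulas as stated in this section; they are not taken from the package's code.

```r
# Hypothetical helpers mirroring the documented formulas; j is the patient
# index, N the total sample size, lambda the trend strength.
trend_linear <- function(j, lambda, N) {
  lambda * (j - 1) / (N - 1)
}
trend_stepwise <- function(period, lambda) {
  lambda * (period - 1)           # period = c_j, the period index of patient j
}
trend_seasonal <- function(j, lambda, N, n_wave) {
  lambda * sin(n_wave * 2 * pi * (j - 1) / (N - 1))
}

N <- 101
j <- c(1, 26, 51, 101)
trend_linear(j, lambda = 0.15, N = N)               # 0 at j = 1, lambda at j = N
trend_stepwise(period = c(1, 2, 3), lambda = 0.15)  # one step of lambda per period
trend_seasonal(j, lambda = 0.15, N = N, n_wave = 1) # peaks at a quarter of the trial

# Setting lambda to zero removes the trend regardless of the pattern
trend_linear(j, lambda = 0, N = N)
```

In each case the trend is zero for the first patient, so the first patient's log odds depend only on the control log odds and the treatment effect.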

Author

Pavla Krotka, Marta Bofill Roig

Examples


head(datasim_bin(num_arms = 3, n_arm = 100, d = c(0, 100, 250),
                 p0 = 0.7, OR = rep(1.8, 3), lambda = rep(0.15, 4),
                 trend = "stepwise"))
#>   j response treatment period
#> 1 1        1         1      1
#> 2 2        1         0      1
#> 3 3        0         1      1
#> 4 4        1         0      1
#> 5 5        0         1      1
#> 6 6        1         0      1