The package gscounts provides a collection of functions for planning and analyzing group sequential designs and fixed sample designs with negative binomial outcomes, which includes both recurrent events as well as count data. In particular, the package provides tools to determine

power and sample size of group sequential designs with various structures of follow-up times and accrual times,
critical values for group sequential testing.

The functions provided by the R package gscounts include the functionality for planning group sequential designs which only stop for efficacy as well as the functionality for planning group sequential designs which also stop for binding or non-binding futility.

1 Statistical model

In a group sequential design the hypothesis of interest is tested several times during the course of the clinical trial. At each data look, that is what the interim analyses are referred to, the decision whether to stop for efficacy or futility is made. In the following, we briefly outline the statistical model. For more details, the reader is referred to the manuscript by Mütze et al. (2017). We define $K$ as the total number of data looks. We model the number of events of patient $j=1,\ldots, n_{i}$ in treatment group $i=1,2$ at look $k=1,\ldots, K$ as a homogeneous Poisson point process with rate $\lambda_{ij}$. Let $t_{ijk}$ be the exposure time of patient $j$ in treatment group $i$ at look $k$. After an exposure time of $t_{ijk}$, the number of events $Y_{ijk}$ of patient $j$ in treatment group $i$ at look $k$ given the rate $\lambda_{ij}$ is Poisson distributed with rate $t_{ijk}\lambda_{ij}$, that is \[\begin{align*} Y_{ijk}|\lambda_{ij} \sim \operatorname{Pois}(t_{ijk} \lambda_{ij}). \end{align*}\] In other words, the event rate $\lambda_{ij}$ is subject specific. The between patient heterogeneity in the event rates is modeled by assuming that the rates $\lambda_{ij}$ are Gamma distributed with shape parameter $\alpha=1/\phi$ and rate parameter $\beta=1/(\phi \mu_i)$, i.e. \[\begin{align*} \lambda_{ij} \sim \Gamma\left(\frac{1}{\phi}, \frac{1}{\phi \mu_i}\right). \end{align*}\] Then, the random variable $Y_{ijk}$ is marginally negative binomially distributed with rate $t_{ijk}\mu_i$ and dispersion parameter $\phi$, i.e. \[\begin{align*} Y_{ijk} \sim \operatorname{NB}\left(t_{ijk}\mu_i, \phi\right). \end{align*}\] The probability mass function of $Y_{ijk}$ is given by \[\begin{align*} \mathbb{P}\left(Y_{ijk}=y\right)=\frac{\Gamma(y + 1/\phi)}{\Gamma(1/\phi)y!}\left(\frac{1}{1+\phi t_{ijk} \mu_i}\right)^{1/\phi} \left(\frac{\phi t_{ijk}\mu_i}{1 + \phi t_{ijk}\mu_i }\right)^{y},\qquad y\in \mathcal{N}_{0} \end{align*}\] The expected value and the variance of the random variable $Y_{ij}$ are given by \[\begin{align*} \mathbb{E}\left[Y_{ijk}\right] &= t_{ijk} \mu_i, \\ \operatorname{Var}\left[Y_{ijk}\right] &= t_{ijk} \mu_i (1 + \phi t_{ijk} \mu_i). \end{align*}\] Moreover, as the dispersion parameter approaches zero, the negative binomial distribution converges to a Poisson distribution which also means that the overdispersion increases in $\phi$. In this manuscript we consider smaller values of $\mu_i$ to be better. The statistical hypothesis testing problem of interest is \[\begin{align*} H_0: \frac{\mu_1}{\mu_2}\geq \delta \quad \text{vs.} \quad H_1: \frac{\mu_1}{\mu_2} < \delta. \end{align*}\] Non-inferiority of treatment 1 compared to treatment 2 is tested for $\delta\in (1,\infty)$. Superiority of treatment 1 compared to treatment 2 is tested for for $\delta \in (0,1]$. We denote the alternative is given by $\theta^{*}$. The calculation of the efficacy and (non-)binding futility boundaries are performed under the hypothesis $H_0: \frac{\mu_1}{\mu_2}= \delta$ and under the alternative $H_1: \frac{\mu_1}{\mu_2} = \theta^{*}$

2 Planning group sequential designs with negative binomial outcomes

The functionality of the R package gscounts includes the planning of group sequential designs for both recurrent events modeled by a negative binomial distribution and count data modeled by a negative binomial distribution. In the first part of this section we explain how to plan group sequential designs for recurrent events modeled by a negative binomial distribution. Planning group sequential designs with negative binomial count data is discussed in the second part.

First of all, however, the R package must be loaded.

library(gscounts)

2.1 Negative binomial recurrent events

2.1.1 Sample size for unequal exposure times with uniform recruitment

The first case study focus on determining the sample size for a group sequential design where we assume a rate ratio in the alternative of $\mu_1 / \mu_2 = 0.0875/0.125=0.7$ to be tested with the margin $\delta=1$. The dispersion parameter is chosen to be $\phi = 5$. The target power for the test in the group sequential design is $1-\beta=0.8$, the data looks are performed at information times of $40\%$ and $70\%$ of the maximum information $\mathcal{I}_{max}$, and the O’Brien-Fleming-type error spending function is considered to allocated the significance level $\alpha=0.025$. Furthermore, it is assumed that patients are entering the study uniformly over a period of 1.25 years and that the study ends 2.75 years after the last patient entered, i.e. after four years in total. The respective sample size can then be calculated using the R function design_gsnb.

out <- design_gsnb(rate1 = 0.0875, rate2 = 0.125, dispersion = 5, 
                   ratio_H0 = 1, power = 0.8, sig_level = 0.025, 
                   timing = c(0.4, 0.7, 1), esf = obrien, study_period = 4, 
                   accrual_period = 1.25, random_ratio = 1)
out

## 
##  Group sequential trial with negative binomial outcomes
## 
## Distribution parameters
##  Rate group 1:  0.0875 
##  Rate group 2:  0.125 
##  Dispersion parameter:  5 
## Hypothesis testing
##  Rate ratio under null hypothesis:  1 
##  Rate ratio under alternative:  0.7 
##  Significance level:  0.025 
##  Power group sequential design:  0.8 
##  Power fixed design:  0.8061 
##  Maximum information:  62.66 
##  Number of looks:  3 
##  Information times of looks:  0.4, 0.7, 1.0 
##  Calendar times of looks:  1.333, 2.208, 4.000 
## Sample size and study duration
##  Sample size group 1:  990 
##  Sample size group 2:  990 
##  Accrual period:  1.25 
##  Study duration:  4 
## 
## Critical values and spending at each data look
##                              +------- Efficacy -------+ 
##     Look   Information time     Spending     Boundary
##        1                0.4   0.00039415      -3.3569
##        2                0.7    0.0069903      -2.4445
##        3                  1     0.017616      -2.0005
## 
## Probabilities of stopping for efficacy, i.e. for rejecting H0
##  Rate ratio       Look 1      Look 2     Look 3    Total
##         1.0 0.0003941518 0.006990339 0.01761551 0.025000
##         0.7 0.0580726841 0.410335143 0.33173722 0.800145
## 
## Expected information level
##  Rate ratio     E[I]
##         1.0 62.51748
##         0.7 52.76634

The R function design_gsnb returns an object of class gsnb which contains the relevant design information. To assess the single components and especially the ones which are not shown in the comprehensive output above, for instance the assumed study entry times, the notation out$ followed by one following names is used.

## List of 22
##  $ rate1         : num 0.0875
##  $ rate2         : num 0.125
##  $ dispersion    : num 5
##  $ ratio_H0      : num 1
##  $ power         : num 0.8
##  $ sig_level     : num 0.025
##  $ timing        : num [1:3] 0.4 0.7 1
##  $ random_ratio  : num 1
##  $ efficacy      :List of 3
##   ..$ esf     :function (t, sig_level, ...)  
##   ..$ spend   : num [1:3] 0.000394 0.00699 0.017616
##   ..$ critical: num [1:3] -3.36 -2.44 -2
##  $ ratio_H1      : num 0.7
##  $ calendar      : num [1:3] 1.33 2.21 4
##  $ n             : num 1980
##  $ n1            : num 990
##  $ n2            : num 990
##  $ t_recruit1    : num [1:990] 0 0.00126 0.00253 0.00379 0.00506 ...
##  $ t_recruit2    : num [1:990] 0 0.00126 0.00253 0.00379 0.00506 ...
##  $ max_info      : num 62.7
##  $ accrual_period: num 1.25
##  $ study_period  : num 4
##  $ stop_prob     :List of 1
##   ..$ efficacy:'data.frame': 2 obs. of  5 variables:
##   .. ..$ Rate ratio: num [1:2] 1 0.7
##   .. ..$ Look 1    : num [1:2] 0.000394 0.058073
##   .. ..$ Look 2    : num [1:2] 0.00699 0.41034
##   .. ..$ Look 3    : num [1:2] 0.0176 0.3317
##   .. ..$ Total     : num [1:2] 0.025 0.8
##  $ expected_info :'data.frame':  2 obs. of  2 variables:
##   ..$ Rate ratio: num [1:2] 1 0.7
##   ..$ E[I]      : num [1:2] 62.5 52.8
##  $ power_fix     : num 0.806
##  - attr(*, "class")= chr "gsnb"

By default, no futility boundaries are calculated, however, the function design_gsnb allows calculating both binding and non-binding futility boundaries. To calculate futility boundaries, the type of futility, i.e. binding or non-binding, has to be specified using the futility argument. The futility spending function is set through the argument esf_futility. Next, a group sequential design with binding futility boundaries is planned. The O’Brien-Fleming-type error spending function is considered for the futility spending.

out2 <- design_gsnb(rate1 = 0.0875, rate2 = 0.125, dispersion = 5, 
                    ratio_H0 = 1, power = 0.8, sig_level = 0.025, 
                    timing = c(0.4, 0.7, 1), esf = obrien, study_period = 4, 
                    accrual_period = 1.25, random_ratio = 1, 
                    futility = "binding", esf_futility = obrien)
out2

## 
##  Group sequential trial with negative binomial outcomes and with binding futility
## 
## Distribution parameters
##  Rate group 1:  0.0875 
##  Rate group 2:  0.125 
##  Dispersion parameter:  5 
## Hypothesis testing
##  Rate ratio under null hypothesis:  1 
##  Rate ratio under alternative:  0.7 
##  Significance level:  0.025 
##  Power group sequential design:  0.8 
##  Power fixed design:  0.8248 
##  Maximum information:  65.83 
##  Number of looks:  3 
##  Information times of looks:  0.4, 0.7, 1.0 
##  Calendar times of looks:  1.333, 2.208, 4.000 
## Sample size and study duration
##  Sample size group 1:  1040 
##  Sample size group 2:  1040 
##  Accrual period:  1.25 
##  Study duration:  4 
## 
## Critical values and spending at each data look. The futility spending is binding.
##                              +------- Efficacy -------+ +------- Futility -------+ 
##     Look   Information time     Spending     Boundary     Spending     Boundary
##        1                0.4   0.00039415      -3.3569     0.042733      -0.1108
##        2                0.7    0.0069903      -2.4439     0.082852      -1.2121
##        3                  1     0.017616      -1.9300     0.074415      -1.9300
## 
## Probabilities of stopping for efficacy, i.e. for rejecting H0
##  Rate ratio       Look 1      Look 2     Look 3     Total
##         1.0 0.0003941518 0.006990339 0.01761551 0.0250000
##         0.7 0.0634275108 0.428289634 0.30813899 0.7998561
## 
## Probabilities of stopping for futility, i.e. for accepting H0
##  Rate ratio     Look 1     Look 2     Look 3     Total
##         1.0 0.54410188 0.34879491 0.08210322 0.9750000
##         0.7 0.04276404 0.08291204 0.07446779 0.2001439
## 
## Expected information level
##  Rate ratio     E[I]
##         1.0 37.29628
##         0.7 51.53880

2.1.2 Sample size for equal exposure times

In some cases patients are only followed for a prespecified time period which is identical for all patients. In this case the syntax for determining the sample size is slightly different. We assume a Pocock-type error spending function and a planned follow-up time of 0.5 years for every patient. The data look is performed at $50\%$ of the maximum information time $\mathcal{I}_{max}$. Futility stopping is not considered in this example. The next chunk shows the code for calculating the sample size of the group sequential design in the case of equal follow-up times.

out3 <- design_gsnb(rate1 = 4.2, rate2 = 8.4, dispersion = 3, ratio_H0 = 1, 
                   power = 0.8, sig_level = 0.025, timing = c(0.5, 1), 
                   esf = pocock, followup_max = 0.5, random_ratio = 1)
out3$n
out3$n1
out3$n2

## [1] 246
## [1] 123
## [1] 123

The sample size calculated in output out3 is independent of the time the various subjects enter the study. However, the time the subjects enter the study affects the calendar time of the data looks.

The sample size ratio $n_1 / n_2$ is by default set to be one but can be changed in the input through the argument random_ratio. The next chunk shows an example for which the sample size in group 1 is twice the sample size of group 2, i.e. $n_1 / n_2 = 2$, and a nonbinding futility boundary is included based on the O’Brien-Fleming-type error spending function.

out4 <- design_gsnb(rate1 = 4.2, rate2 = 8.4, dispersion = 3, ratio_H0 = 1, 
                   power = 0.8, sig_level = 0.025, timing = c(0.5, 1), 
                   esf = pocock, followup_max = 0.5, random_ratio = 2,
                   futility = "nonbinding", esf_futility = obrien)
out4$n
out4$n1
out4$n2

## [1] 285
## [1] 190
## [1] 95

2.1.3 Study duration given the study entry times

In the following we assume that the patients’ study entry times are given and we calculate the study duration required to obtain a prespecified power. We assume that the study entry times are identical between the groups.

recruit <- seq(0, 1.25, length.out = 1042)
out5 <- design_gsnb(rate1 = 0.0875, rate2 = 0.125, dispersion = 5, 
                   power = 0.8, timing = c(0.5, 1), esf = obrien,
                   ratio_H0 = 1, sig_level = 0.025, 
                   t_recruit1 = recruit, t_recruit2 = recruit)
out5$study_period

## [1] 3.49812

Under the assumption of recruitment times as defined in the variable recruit, the study period required to obtain a group sequential design with a power of $80\%$ is 3.498. More information on the design are contained in the object out5.

2.2 Negative binomial count data

Planning a group sequential design for negative binomial count data with the function design_gsnb is very similar to the case of planning a group sequential design for negative binomial recurrent events with equal exposure time. In detail, for the negative binomial count data, no exposure times exist and the recruitment times, the study period, and the accrual period are not relevant for planning purposes. Thus, the input argument defining the maximum follow-up time is set to one, that is followup_max = 1 and the input arguments t_recruit1, t_recruit2, study_period, and accrual_period are set to their default value NULL. The next chunk shows the R code for calculating the sample size of a group sequential design with negative binomial count data.

out6 <- design_gsnb(rate1 = 4.2, rate2 = 8.4, dispersion = 3, ratio_H0 = 1, 
                   power = 0.8, sig_level = 0.025, timing = c(0.5, 1), 
                   esf = obrien, followup_max = 1, random_ratio = 1)
out6$n1
out6$n2

## [1] 104
## [1] 104

3 Analyzing group sequential design with negative binomial outcomes

In this section we illustrate how to analyze a group sequential design with negative binomial outcomes. For illustration purposes, we assume that the data comes from the clinical trial example planned in Section 2.1.1. In other words, we have a group sequential design with a maximum of three data looks, no futility stopping, a planned study period of four years, an accrual period of 1.25 years, and a planned accrual of 990 patients per group. It is assumed that the patients enter the study uniformly throughout the accrual period. The package includes the example data set hospitalizations which was generated using R. The data set contains the four columns treatment, pat, t_recruit, and eventtime. The column pat contains the patient number which is unique within the treatment indication listed in the column treatment. The time a patient enters the study is listed in column t_recruit. In the column eventtime, the time of an event (here the events are hospitalizations) of a patient is listed. If a patient has no event, the column contains NA. If a patient has multiple events, the event times are listed over multiple rows. The next chunk shows the first ten columns of the data set.

head(hospitalizations, n = 10)

##     treatment pat   t_recruit eventtime
## 1  experiment   1 0.000000000        NA
## 2  experiment   2 0.001263903 0.5504814
## 3  experiment   3 0.002527806        NA
## 4  experiment   4 0.003791709 3.3024774
## 5  experiment   4 0.003791709 3.5015188
## 6  experiment   5 0.005055612 1.2872186
## 7  experiment   6 0.006319515        NA
## 8  experiment   7 0.007583418        NA
## 9  experiment   8 0.008847321 1.2756962
## 10 experiment   8 0.008847321 2.5748877

When a clinical trial with group sequential design is planned, the decision on the information time of the data look has to be made. This decision is essential in that it affects the required maximum information. In our example clinical trial, the data looks are planned to be performed at information times 0.4, 0.7, 1. There are various potential ways in how to proceed in practice: the information level of the clinical trial could be monitored and the data looks are performed when the respective information levels are achieved, or alternatively, the data looks could be performed at the calendar times which correspond to the planned information times of the data looks under the alternative. When the effect and the nuisance parameters are specified correctly in the planning phase, the calendar times at which the respective information levels are achieved and the calendar times determined during the planning phase will be similar. The function design_gsnb calculates the calendar times under the specified alternative which are 1.333, 2.208, 4 years for the example clinical trial. It is important to emphasize that the critical values have to be recalculated when the information times of the data looks are changed, otherwise the type I error rate will be affected. Moreover, a change of the information time of the data looks will affect the power. In the following, we focus on the case where the information level is monitored. One way of monitoring the information level is to calculate the estimates $\hat{\lambda}_1$, and $\hat{\lambda}_2$, and $\hat{\phi}$ of the rates and the dispersion shape parameter after every new event. Then, the information level after that event is calculated using the function get_info_gsnb(). The next chunk shows the code for monitoring the information level after the event at calendar time t = 1.297 years. This is also the calendar time of the first data look since the information level is $0.4\mathcal{I}_{max}=25.07$.

library(dplyr)
# Filter data to obtain all the events which happened prior to t = 1.2906
t <- 1.2972
rawdata_interim <- dplyr::filter(hospitalizations, t_recruit <= t, 
                       (is.na(eventtime) | eventtime <= t))
# Calculate the number of events of every patient until t
events_interim <- rawdata_interim %>% 
  group_by(treatment, pat, t_recruit) %>% 
  summarise(count = sum(eventtime <= t, na.rm = TRUE)) 
# Exposure time = time of patient with study
events_interim$t_expose <- t - events_interim$t_recruit
# Negative binomial regression
glm_out <- MASS::glm.nb(formula = count ~ treatment + offset(log(t_expose)) - 1, 
                        data = events_interim)
rates <- exp(glm_out$coefficients)
followup1 <- dplyr::filter(events_interim, treatment == 'experiment')$t_expose
followup2 <- dplyr::filter(events_interim, treatment == 'control')$t_expose
# Information level at calendar time t
info_interim <- get_info_gsnb(rate1 = rates["treatmentexperiment"],
                              rate2 = rates["treatmentcontrol"],
                              dispersion = 1/glm_out$theta, 
                              followup1 = followup1, 
                              followup2 = followup2)

The information level at calendar time $t=1.2972$ is given by $\mathcal{I}_{t}=25.361$. Next, we explain how to perform the interim analysis at the calendar time $t=1.2972$. The respective data are given by the list event_interim. To test whether the clinical trial can be stopped early for efficacy, a negative binomial regression is performed. The results are as follows.

glm_interim1 <- MASS::glm.nb(formula = count ~ treatment + offset(log(t_expose)), 
                        data = events_interim)
summary(glm_interim1)$coefficients

##                    Estimate Std. Error    z value     Pr(>|z|)
## (Intercept)      -2.3544742  0.1475891 -15.952898 2.719780e-57
## treatmentcontrol  0.2678467  0.1985713   1.348869 1.773789e-01

The test statistic for the test of efficacy (and futility if stopping for futility is part of the design) is the negative $z$-value of the coefficient treatmentcontrol, i.e. $T_1 = -1.349$. However, the critical value for the first data look as calculated during the planning phase of the study and stored in the object out under out$efficacy$critical, is $c_1 = -3.357$. Thus, since the test statistic is not smaller than the critical value, the clinical trial is not stopped for efficacy.

4 Planning and analyzing a fixed sample design with negative binomial outcomes

The R package gscount, through the function design_nb, also implements the functionality for planning a fixed sample design with negative binomial outcomes. The input arguments of the function design_nb are mostly identical to the input arguments of the function design_gsnb, which we discussed in Section 2. In the following we outline the R code for planning a fixed sample design with the same parameters are the first example.

out_fix <- design_nb(rate1 = 0.0875, rate2 = 0.125, dispersion = 5, 
                 ratio_H0 = 1, power = 0.8, sig_level = 0.025, 
                 study_period = 4, accrual_period = 1.25, random_ratio = 1)
out_fix

## 
##  Fixed sample design with negative binomial outcomes
## 
## Distribution parameters
##  Rate group 1:  0.0875 
##  Rate group 2:  0.125 
##  Dispersion parameter:  5 
## Hypothesis testing
##  Rate ratio under null hypothesis:  1 
##  Rate ratio under alternative:  0.7 
##  Significance level:  0.025 
##  Power fixed design:  0.8 
##  Maximum information:  61.71 
## Sample size and study duration
##  Sample size group 1:  975 
##  Sample size group 2:  975 
##  Accrual period:  1.25 
##  Study duration:  4

The output of the function design_nb is an object of class nb. The printed object contains the distribution parameters, the listed hypothesis testing parameters, and the design parameters, i.e. sample size accrual period, and study duration. Not every parameter in the output object is printed, though. The assumed recruitment times of the patients, are not printed. The next chunk shows the internal structure of an object of class nb.

str(out_fix)

## List of 14
##  $ rate1         : num 0.0875
##  $ rate2         : num 0.125
##  $ dispersion    : num 5
##  $ power         : num 0.8
##  $ ratio_H0      : num 1
##  $ sig_level     : num 0.025
##  $ n             : num 1950
##  $ n1            : num 975
##  $ n2            : num 975
##  $ t_recruit1    : num [1:975] 0 0.00128 0.00257 0.00385 0.00513 ...
##  $ t_recruit2    : num [1:975] 0 0.00128 0.00257 0.00385 0.00513 ...
##  $ max_info      : num 61.7
##  $ accrual_period: num 1.25
##  $ study_period  : num 4
##  - attr(*, "class")= chr "nb"

5 References

T. Mütze, E. Glimm, H. Schmidli, and T. Friede. (2018) Group sequential designs for negative binomial outcomes. Statistical Methods in Medical Research (in press).

Group sequential designs for negative binomial outcomes using gscounts

Tobias Mütze

2021-11-01