Package 'sgpv' reference manual

Title:	Calculate Second-Generation p-Values and Associated Measures
Description:	Computation of second-generation p-values as described in Blume et al. (2018) <doi:10.1371/journal.pone.0188299> and Blume et al. (2019) <doi:10.1080/00031305.2018.1537893>. There are additional functions which provide power and type I error calculations, create graphs (particularly suited for large-scale inference usage), and a function to estimate false discovery rates based on second-generation p-value inference.
Authors:	Valerie Welty [aut, cre], Rebecca Irlmeier [aut], Thomas Stewart [aut], Robert Greevy, Jr. [aut], Lucy D'Agostino McGowan [aut], Jeffrey Blume [aut]
Maintainer:	Valerie Welty <[email protected]>
License:	MIT + file LICENSE
Version:	1.1.0
Built:	2025-03-18 04:17:30 UTC
Source:	https://github.com/weltybiostat/sgpv

False Discovery Risk for Second-Generation p-Values

Description

This function computes the false discovery risk (sometimes called the "empirical Bayes FDR") for a second-generation p-value of 0, or the false confirmation risk for a second-generation p-value of 1.

Usage

fdrisk(
  sgpval = 0,
  null.lo,
  null.hi,
  std.err,
  interval.type,
  interval.level,
  pi0 = 0.5,
  null.weights,
  null.space,
  alt.weights,
  alt.space
)

Arguments

`sgpval`	The observed second-generation p-value. Default is $0$, which gives the false discovery risk.
`null.lo`	The lower bound of the indifference zone (null interval) upon which the second-generation p-value was based.
`null.hi`	The upper bound of the indifference zone (null interval) upon which the second-generation p-value was based.
`std.err`	Standard error of the point estimate
`interval.type`	Class of interval estimate used. This determines the functional form of the power function. Options are `confidence` for a $100(1-\alpha)$% confidence interval and `likelihood` for a $1/k$ likelihood support interval (`credible` not yet supported).
`interval.level`	Level of interval estimate. If `interval.type` is `confidence`, the level is $\alpha$. If `interval.type` is `likelihood`, the level is $1/k$ (not $k$).
`pi0`	Prior probability of the null hypothesis. Default is $0.5$.
`null.weights`	Probability distribution for the null parameter space. Options are currently `Point`, `Uniform`, and `TruncNormal`.
`null.space`	Support of the null probability distribution. If `null.weights` is `Point`, then `null.space` is a scalar. If `null.weights` is `Uniform`, then `null.space` is a vector of length two.
`alt.weights`	Probability distribution for the alternative parameter space. Options are currently `Point`, `Uniform`, and `TruncNormal`.
`alt.space`	Support for the alternative probability distribution. If `alt.weights` is `Point`, then `alt.space` is a scalar. If `alt.weights` is `Uniform`, then `alt.space` is a vector of length two.
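
The relationship between a $1/k$ likelihood support level and a confidence level can be sketched as follows. This assumes an approximately Normal estimator, for which the $1/k$ support interval is the estimate plus or minus $\sqrt{2\log k}$ standard errors. The helper name `si_to_alpha` is hypothetical and not part of the package. Under this assumption a 1/8 support interval is about as wide as a 95.9% confidence interval, which appears to be why the examples in this manual pair `interval.level = 1/8` with `qnorm(1-0.041/2)`.

```r
# Hypothetical helper (not part of sgpv): convert a 1/k likelihood support
# level into the alpha of the Normal-theory confidence interval of equal width.
si_to_alpha <- function(k) {
  z_k <- sqrt(2 * log(k))   # half-width of the 1/k support interval, in SE units
  2 * pnorm(-z_k)           # two-sided tail area beyond +/- z_k
}

alpha_8  <- si_to_alpha(8)
alpha_32 <- si_to_alpha(32)
print(round(c(alpha_8, alpha_32), 4))
```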

Details

When possible, one should compute the second-generation p-value and FDR/FCR on a scale that is symmetric about the null hypothesis. For example, if the parameter of interest is an odds ratio, inputs pt.est, std.err, null.lo, null.hi, null.space, and alt.space are typically on the log scale.

If TruncNormal is used for null.weights, then the distribution used is a truncated Normal distribution with mean equal to the midpoint of null.space, and standard deviation equal to std.err, truncated to the support of null.space. If TruncNormal is used for alt.weights, then the distribution used is a truncated Normal distribution with mean equal to the midpoint of alt.space, and standard deviation equal to std.err, truncated to the support of alt.space. Further customization of these parameters for the truncated Normal is currently not possible, although it may be implemented in future versions.
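
The truncated Normal weighting distribution described above can be written out in a few lines of base R. This is a minimal sketch built from the description only, not the package's internal code, and the function name `trunc_normal_weight` is hypothetical. It renormalizes a Normal density, centered at the midpoint of the support with standard deviation `std.err`, so that it integrates to one over the support.

```r
# Sketch of the truncated Normal weighting density described above.
# `trunc_normal_weight` is a hypothetical name, not a function in sgpv.
trunc_normal_weight <- function(x, space, std.err) {
  mu <- mean(space)                       # midpoint of the support
  # Probability mass of the untruncated Normal that falls inside the support
  mass <- pnorm(space[2], mu, std.err) - pnorm(space[1], mu, std.err)
  inside <- x >= space[1] & x <= space[2]
  ifelse(inside, dnorm(x, mu, std.err) / mass, 0)
}

# Same alternative support as in the TruncNormal example of this manual
alt.space <- 2 + c(-1, 1) * qnorm(1 - 0.041 / 2) * 0.8
total <- integrate(trunc_normal_weight, alt.space[1], alt.space[2],
                   space = alt.space, std.err = 0.8)$value
print(total)  # should be numerically close to 1
```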

Value

Numeric scalar representing the false discovery risk (FDR) or false confirmation risk (FCR) for the observed second-generation p-value. If sgpval = $0$, the function returns the false discovery risk (FDR). If sgpval = $1$, the function returns the false confirmation risk (FCR).
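
To make the returned quantity concrete, the false discovery risk is the posterior probability of the null hypothesis given SGPV = 0, which follows from Bayes' rule. The sketch below covers only the simplest case: point masses for both the null and the alternative, a confidence interval, and a Normal estimator with known standard error. The names `power0`, `fdr_point`, `null.pt`, and `alt.pt` are hypothetical, and this is not the package's source. With Uniform or TruncNormal weights, the two conditional probabilities would instead be averages of the power function over the weighting distribution.

```r
# P(SGPV = 0 | theta) for a (1 - alpha) confidence interval, assuming the
# estimator is Normal with known standard error (an assumption of this sketch).
power0 <- function(theta, null.lo, null.hi, std.err, alpha) {
  z <- qnorm(1 - alpha / 2)
  pnorm((null.lo - theta) / std.err - z) + pnorm((theta - null.hi) / std.err - z)
}

# Bayes' rule with point masses at `null.pt` and `alt.pt` (hypothetical names).
fdr_point <- function(null.pt, alt.pt, null.lo, null.hi, std.err, alpha, pi0 = 0.5) {
  p.h0 <- power0(null.pt, null.lo, null.hi, std.err, alpha)  # P(SGPV = 0 | H0)
  p.h1 <- power0(alt.pt,  null.lo, null.hi, std.err, alpha)  # P(SGPV = 0 | H1)
  p.h0 * pi0 / (p.h0 * pi0 + p.h1 * (1 - pi0))               # P(H0 | SGPV = 0)
}

fdr <- fdr_point(null.pt = 0, alt.pt = 3, null.lo = -1, null.hi = 1,
                 std.err = 1, alpha = 0.05)
print(fdr)
```

The false confirmation risk has the same form, with the probabilities of SGPV = 1 in place of those of SGPV = 0 and the roles of the two hypotheses exchanged.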

References

Blume JD, Greevy RA Jr., Welty VF, Smith JR, Dupont WD (2019). An Introduction to Second-generation p-values. The American Statistician. 73:sup1, 157-167, DOI: https://doi.org/10.1080/00031305.2018.1537893

Blume JD, D’Agostino McGowan L, Dupont WD, Greevy RA Jr. (2018). Second-generation p-values: Improved rigor, reproducibility, & transparency in statistical analyses. PLoS ONE 13(3): e0188299. https://doi.org/10.1371/journal.pone.0188299

Examples


# false discovery risk with 95% confidence level
fdrisk(sgpval = 0,  null.lo = log(1/1.1), null.hi = log(1.1),  std.err = 0.8,
  null.weights = 'Uniform', null.space = c(log(1/1.1), log(1.1)),
  alt.weights = 'Uniform',  alt.space = 2 + c(-1,1)*qnorm(1-0.05/2)*0.8,
  interval.type = 'confidence',  interval.level = 0.05)

# false discovery risk with 1/8 likelihood support level
fdrisk(sgpval = 0,  null.lo = log(1/1.1), null.hi = log(1.1),  std.err = 0.8,
  null.weights = 'Point', null.space = 0,  alt.weights = 'Uniform',
  alt.space = 2 + c(-1,1)*qnorm(1-0.041/2)*0.8,
  interval.type = 'likelihood',  interval.level = 1/8)

## with truncated normal weighting distribution
fdrisk(sgpval = 0,  null.lo = log(1/1.1), null.hi = log(1.1),  std.err = 0.8,
  null.weights = 'Point', null.space = 0,  alt.weights = 'TruncNormal',
  alt.space = 2 + c(-1,1)*qnorm(1-0.041/2)*0.8,
  interval.type = 'likelihood',  interval.level = 1/8)

# false discovery risk with LSI and wider null hypothesis
fdrisk(sgpval = 0,  null.lo = log(1/1.5), null.hi = log(1.5),  std.err = 0.8,
  null.weights = 'Point', null.space = 0,  alt.weights = 'Uniform',
  alt.space = 2.5 + c(-1,1)*qnorm(1-0.041/2)*0.8,
  interval.type = 'likelihood',  interval.level = 1/8)

# false confirmation risk example
fdrisk(sgpval = 1,  null.lo = log(1/1.5), null.hi = log(1.5),  std.err = 0.15,
  null.weights = 'Uniform', null.space = 0.01 + c(-1,1)*qnorm(1-0.041/2)*0.15,
  alt.weights = 'Uniform',  alt.space = c(log(1.5), 1.25*log(1.5)),
  interval.type = 'likelihood',  interval.level = 1/8)



Test Statistics from Golub (1999) Leukemia data set

Description

Data are from 7218 gene-specific t-tests for a difference in mean expression (on the log scale; AML versus ALL) in the Golub data set (1999). Data are from 72 patients using a pooled t-test (df=70). Included in the dataframe are the following: t-statistic (t.stat), p-value (p.value), CI lower limit (ci.lo), CI upper limit (ci.hi), estimate (estimate), standard error (se).

Usage

data(leukstats)

Format

An object of class data.frame. Includes the following: t-statistic (t.stat), p-value (p.value), CI lower limit (ci.lo), CI upper limit (ci.hi), estimate (estimate), standard error (se).

Source

https://github.com/ramhiser/datamicroarray/wiki/Golub-(1999)

References

Golub (1999), as used in Blume et al. (2018), PLoS ONE.

Examples

data(leukstats)
order(leukstats$p.value)

Second-Generation p-Value Plotting

Description

This function displays a modified Manhattan-style plot colored according to second-generation p-value status. Several variations of this plot can be made, depending upon user input for type as well as the set.order and x.show options. These plots allow the user to visualize the overall result of a large-scale analysis succinctly and to visually assess how the results differ when second-generation p-value techniques are used instead of classical p-value techniques.

Usage

plotman(
  est.lo,
  est.hi,
  null.lo,
  null.hi,
  set.order = NA,
  x.show = NA,
  type = "delta-gap",
  p.values = NA,
  ref.lines = NA,
  null.pt = NA,
  int.col = c("cornflowerblue", "firebrick3", "darkslateblue"),
  int.pch = 16,
  int.cex = 0.4,
  title.lab = NA,
  x.lab = "Position (by set.order)",
  y.lab = "Outcome label",
  legend.on = TRUE
)

Arguments

`est.lo`	A numeric vector of lower bounds of interval estimates. Must be of same length as `est.hi`.
`est.hi`	A numeric vector of upper bounds of interval estimates. Must be of same length as `est.lo`.
`null.lo`	A scalar representing the lower bound of the null interval hypothesis (indifference zone). Value must be finite.
`null.hi`	A scalar representing the upper bound of the null interval hypothesis (indifference zone). Value must be finite.
`set.order`	A numeric vector giving the desired order along the x-axis. Alternatively, if `set.order` is set to `"sgpv"`, the second-generation p-value ranking is used. The default option is `NA`, which uses the original input ordering.
`x.show`	A numeric scalar representing the maximum ranking on the x-axis that is displayed. Default is to display all rankings.
`type`	A string specifying the desired Manhattan-style plot to be graphed. This argument specifies the variable on the y-axis. If `type = "delta-gap"`, the delta-gaps are ranked. If `type = "p-value"`, the classic p-values are ranked. If `type = "comparison"`, the classic p-values are ranked by SGPV. Default is `type = "delta-gap"`.
`p.values`	A numeric vector giving the classic p-values. This is required when `type = "p-value"` or `type = "comparison"`, and is not required when `type = "delta-gap"`. The `p.values` input may be any desired transformation of the p-values. For example, if the desired transformation is $-\log_{10}(\text{p-value})$ as in a traditional Manhattan plot, the $-\log_{10}(\text{p-values})$ should be provided for `p.values`. The corresponding x or y axis label(s) should be updated to reflect any transformations.
`ref.lines`	A numeric scalar or vector giving the points on the y-axis at which to add a horizontal reference line. For example, if `p.values` is set to $-\log_{10}(\text{p-values})$ and the type of plot selected shows the (transformed) p-values on the y-axis, possible locations for the reference lines could be at the $-\log_{10}(0.05)$, $-\log_{10}(\text{Bonferroni})$ and $-\log_{10}(\text{FDR})$ significance levels.
`null.pt`	An optional numeric scalar representing a point null hypothesis. Default is `NA`.
`int.col`	Vector of length three specifying the colors of the points according to SGPV result. The first color option corresponds to the $SGPV = 0$ results, the second color option corresponds to the $0 < SGPV < 1$ results, and the third color option corresponds to the $SGPV = 1$ results. Default is `int.col = c("cornflowerblue","firebrick3","darkslateblue")`.
`int.pch`	Plotting symbol for points. Default is `16` for small points.
`int.cex`	Size of plotting symbol for points. Default is `0.4`.
`title.lab`	Title text.
`x.lab`	A title for the x-axis. Default is the generic `"Position (by set.order)"`.
`y.lab`	A title for the y-axis. Default is the generic `"Outcome label"`.
`legend.on`	Toggle for plotting the legend. Default is `TRUE`.

Details

Use set.order to provide the classical p-value ranking. For example, if pvalue.vector is a vector of classical p-values, then set set.order=order(pvalue.vector) to sort the x-axis according to p-value rank.

Use type and p.values to provide the $-\log_{10}(\text{p-values})$ for the y-axis. For example, if pvalue.vector is a vector of classical p-values, then set type="p-value" (or type="comparison") and p.values=-log10(pvalue.vector) to set the y-axis. Then, set the y-axis title to something like y.lab="-log10(p)".
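
The inputs described above can be prepared in base R before calling plotman. The following is a small sketch with made-up p-values (the vector `pvalue.vector` and the names `x.order`, `y.vals`, and `ref` are illustrative only). It builds a candidate for set.order, a transformed vector for p.values, and reference lines at the nominal 0.05 level and at a Bonferroni-adjusted level.

```r
# Hypothetical classical p-values; in practice use your own vector.
pvalue.vector <- c(0.20, 0.0004, 0.03, 0.76, 0.011)
m <- length(pvalue.vector)

x.order <- order(pvalue.vector)          # candidate for set.order
y.vals  <- -log10(pvalue.vector)         # candidate for p.values
ref     <- -log10(c(0.05, 0.05 / m))     # nominal and Bonferroni reference lines

print(x.order)
print(round(y.vals, 2))
```

These objects would then be passed as set.order=x.order, p.values=y.vals, and ref.lines=ref, together with the interval bounds.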

Examples



#  Use leukstats data
data(leukstats)

# ID number on the x-axis, delta-gap on the y-axis, using an interval null hypothesis of
# (-0.3, 0.3) for the log mean difference in expression levels (fold change).
plotman(est.lo=leukstats$ci.lo, est.hi=leukstats$ci.hi,
       null.lo=-0.3, null.hi=0.3,
       set.order=NA,
       type="delta-gap",
       ref.lines=NA,
       int.pch=16, int.cex=0.4,
       title.lab="Leukemia Example",
       y.lab="Delta-gap",
       x.lab="Position (ID)",
       legend.on=TRUE)

# ID number on the x-axis, -log10(classical p-value) on the y-axis, using an interval
# null hypothesis of (-0.3, 0.3) for the log mean difference in expression levels
# (fold change).
plotman(est.lo=leukstats$ci.lo, est.hi=leukstats$ci.hi,
       null.lo=-0.3, null.hi=0.3,
       set.order=NA,
       type="p-value",
       p.values=-log10(leukstats$p.value),
       ref.lines=-log10(0.05),
       int.pch=16, int.cex=0.4,
       title.lab="Leukemia Example",
       y.lab=expression("-log"[10]*"(p-value)"),
       x.lab="Position (ID)",
       legend.on=TRUE)

# Second-generation p-value (SGPV) on the x-axis, -log10(classical p-value) on the
# y-axis, using an interval null hypothesis of (-0.3, 0.3) for the log mean difference
# in expression levels (fold change).
plotman(est.lo=leukstats$ci.lo, est.hi=leukstats$ci.hi,
       null.lo=-0.3, null.hi=0.3,
       set.order="sgpv",
       type="comparison",
       p.values=-log10(leukstats$p.value),
       ref.lines=c(-log10(0.05), -log10(0.001)),
       int.pch=16, int.cex=0.4,
       title.lab="Leukemia Example",
       y.lab=expression("-log"[10]*"(p-value)"),
       x.lab="Second-generation p-value ranking",
       legend.on=TRUE)


Plot power curves for Second-Generation p-Values

Description

This function calculates power and type I error values from significance testing based on second-generation p-values as the inferential metric and plots the power curve to visualize the operating characteristics of the inferential procedure.

Usage

plotsgpower(
  null.lo,
  null.hi,
  std.err,
  alt = NA,
  x.lim = NA,
  interval.type,
  interval.level = 0.05,
  plot.option = 1,
  null.col = rgb(208, 216, 232, maxColorValue = 255),
  pow.col = c("cornflowerblue", "firebrick3", "green4"),
  pow.lty = c(1, 1, 1),
  title.lab = "",
  x.lab = "Parameter",
  y.lab = "Probability",
  legend.on = TRUE,
  null.pt = NA,
  acc = 100
)

Arguments

`null.lo`	A scalar representing the lower bound of the null interval hypothesis (indifference zone) upon which the second-generation p-value is based.
`null.hi`	A scalar representing the upper bound of the null interval hypothesis (indifference zone) upon which the second-generation p-value is based.
`std.err`	Standard error for the distribution of the estimator for the parameter of interest. Note that this is the standard deviation for the estimator, not the standard deviation parameter for the data itself. This will be a function of the sample size(s).
`alt`	Optional scalar or vector of alternative value(s) for the parameter of interest. Default is `NA`. If provided, a blue dotted line (or one at each point) will be plotted and the power will be printed.
`x.lim`	Optional numeric vector of length two giving the lower and upper bounds of the x-axis for the power curve. Default is `NA`, where the x-axis range will be optimized to fit the entirety of the power curve (which is dependent upon the width of the null zone and the standard error of the estimator).
`interval.type`	Class of interval estimate used for calculating the SGPV. Options are `"confidence"` for a $100(1-\alpha)$% confidence interval and `"likelihood"` for a $1/k$ likelihood support interval (`credible` not yet supported).
`interval.level`	Level of interval estimate. If `interval.type = "confidence"` is used, the level is $\alpha$. If `interval.type = "likelihood"` is used, the level is $1/k$ (not $k$).
`plot.option`	Used to specify the type of plot desired. If `plot.option = 1`, the classical power curve and its corresponding SGPV power curve are shown. If `plot.option = 2`, the three power curves provided by `sgpower` are shown. Default is `plot.option = 1`.
`null.col`	Coloring of shading for the null interval hypothesis (indifference zone) region. Default is Hawkes Blue: `null.col = rgb(208, 216, 232, maxColorValue = 255)`.
`pow.col`	Vector of length three specifying the colors for the three power curves given when `plot.option = 2`. The first color option corresponds to the $Pr(SGPV = 0 \mid \theta)$ line, the second color option corresponds to the $Pr(0 < SGPV < 1 \mid \theta)$ line, and the third color option corresponds to the $Pr(SGPV = 1 \mid \theta)$ line. Default is `pow.col = c("cornflowerblue", "firebrick3", "green4")`.
`pow.lty`	Vector of length three specifying the line types (`lty`) for the three power curves given when `plot.option = 2`. The first line type option corresponds to the $Pr(SGPV = 0 \mid \theta)$ line, the second line type option corresponds to the $Pr(0 < SGPV < 1 \mid \theta)$ line, and the third line type option corresponds to the $Pr(SGPV = 1 \mid \theta)$ line. Default is `pow.lty = c(1,1,1)` for solid lines.
`title.lab`	Title text.
`x.lab`	x-axis label.
`y.lab`	y-axis label.
`legend.on`	Toggle for plotting the legend. Default is `TRUE`.
`null.pt`	Optional numeric scalar representing a point null hypothesis. Default is `NA`. If a value is given, it will be plotted as a black dashed line and the type I error at that point will be printed.
`acc`	Optional parameter specifying the resolution of the x-axis. Default is `acc = 100` for plotting the power curve as a sequence of 100 (x, y) points.

Examples


sigma = 5
n = 20

plotsgpower(alt = NA, null.lo = -1, null.hi = 1,
            std.err = sigma/sqrt(n), x.lim = c(-8,8),
            interval.type = 'confidence', interval.level = 0.05,
            plot.option = 2, null.pt = 0)

plotsgpower(alt = c(-4,2),
            null.lo = -1, null.hi = 1, std.err = sigma/sqrt(n),
            x.lim = NA, interval.type = 'confidence',
            interval.level = 0.05, plot.option = 2)

plotsgpower(alt = NA, null.lo = -1, null.hi = 1,
            std.err = sigma/sqrt(n), x.lim = NA,
            interval.type = 'confidence', interval.level = 0.05,
            plot.option = 1, null.pt = NA)

plotsgpower(alt = c(-4,2), null.lo = -1, null.hi = 1,
            std.err = 1, x.lim = NA, interval.type = 'likelihood',
            interval.level = 0.05, plot.option = 1, null.pt = 0)



Second-Generation p-Value Plotting

Description

This function displays user-supplied interval estimates (support intervals, confidence intervals, credible intervals, etc.) according to their associated second-generation p-value ranking.

Usage

plotsgpv(
  est.lo,
  est.hi,
  null.lo,
  null.hi,
  set.order = "sgpv",
  x.show = NA,
  null.col = rgb(208, 216, 232, maxColorValue = 255),
  int.col = c("cornflowerblue", "firebrick3", "darkslateblue"),
  int.pch = NA,
  int.cex = 0.4,
  plot.axis = c(TRUE, TRUE),
  null.pt = NA,
  outline.zone = TRUE,
  title.lab = "Title",
  x.lab = "Position (by set.order)",
  y.lab = "Outcome label",
  legend.on = TRUE
)

Arguments

`est.lo`	A numeric vector of lower bounds of interval estimates. Values must be finite for interval to be drawn. Must be of same length as `est.hi`.
`est.hi`	A numeric vector of upper bounds of interval estimates. Values must be finite for interval to be drawn. Must be of same length as `est.lo`.
`null.lo`	A scalar representing the lower bound of null interval (indifference zone). Value must be finite.
`null.hi`	A scalar representing the upper bound of null interval (indifference zone). Value must be finite.
`set.order`	A numeric vector giving the desired order along the x-axis. If `set.order` is set to `"sgpv"`, the second-generation p-value ranking is used. If `set.order` is set to `NA`, the original input ordering is used.
`x.show`	A scalar representing the maximum ranking on the x-axis that is displayed. Default is to display all intervals.
`null.col`	Coloring of the null interval (indifference zone). Default is Hawkes Blue: `rgb(208,216,232,maxColorValue=255)`.
`int.col`	Coloring of the intervals according to SGPV ranking. Default is `c("cornflowerblue","firebrick3","darkslateblue")` for SGPVs of $0$, in $(0,1)$, and $1$ respectively.
`int.pch`	Plotting symbol for interval endpoints. Default is `NA`, no symbol. Use `16` for small endpoints.
`int.cex`	Size of plotting symbol for interval endpoints. Default is $0.4$.
`plot.axis`	Toggle for default axis plotting. Default is `c(TRUE,TRUE)` for (x-axis, y-axis) respectively.
`null.pt`	A scalar representing a point null hypothesis. Default is `NA`. If set, the function will draw a horizontal dashed black line at this location.
`outline.zone`	Toggle for drawing a slim white outline around the null zone. Helpful visual aid when plotting many intervals. Default is `TRUE`.
`title.lab`	Title text.
`x.lab`	x-axis label.
`y.lab`	y-axis label.
`legend.on`	Toggle for plotting the legend. Default is `TRUE`.

Details

Interval estimates with infinite or undefined limits should be manually truncated or avoided altogether. While the sgpvalue function will handle these cases, this function assumes they have been truncated or removed because there is no standard way to plot them.
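
One way to prepare such intervals for plotting is sketched below in base R. The bounds are made up for illustration, and the plotting range `c(-3, 3)` is an arbitrary assumption; it should be chosen wide enough that clipping does not meaningfully change how the intervals overlap the null zone.

```r
# Hypothetical interval bounds, including infinite and missing limits.
est.lo <- c(-0.5, -Inf, 0.2, NA)
est.hi <- c( 0.4,  1.1, Inf, 0.9)

# Drop intervals with undefined limits, then clip infinite limits to a chosen
# plotting range (the range c(-3, 3) is an arbitrary choice for this sketch).
keep <- !is.na(est.lo) & !is.na(est.hi)
plot.range <- c(-3, 3)
lo.trunc <- pmax(est.lo[keep], plot.range[1])
hi.trunc <- pmin(est.hi[keep], plot.range[2])

print(cbind(lo.trunc, hi.trunc))
```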

Examples


# Use leukstats data
data(leukstats)
plotsgpv(est.lo=leukstats$ci.lo, est.hi=leukstats$ci.hi,
         null.lo=-0.3, null.hi=0.3,
         set.order=order(leukstats$p.value),
         x.show=7000,
         plot.axis=c(TRUE,FALSE),
         null.pt=0, outline.zone=TRUE,
         title.lab="Leukemia Example", y.lab="Fold Change (base 10)",
         x.lab="Classical p-value ranking",
         legend.on=TRUE)
# Custom y-axis: tick positions on the log10 scale, labelled as fold changes
axis(side=2, at=round(log(c(1/1000,1/100,1/10,1/2,1,2,10,100,1000),
     base=10),2), labels=c("1/1000","1/100","1/10","1/2",1,2,10,100,1000),
     las=2)



Power functions for Second-Generation p-Values

Description

Calculate power and type I error values from significance testing based on second-generation p-values as the inferential metric.

Usage

sgpower(true, null.lo, null.hi, std.err = 1, interval.type, interval.level)

Arguments

`true`	The true value for the parameter of interest at which to calculate power. Note that this is on the absolute scale of the parameter, and not the standard deviation or standard error scale.
`null.lo`	The lower bound of the indifference zone (null interval) upon which the second-generation p-value is based.
`null.hi`	The upper bound of the indifference zone (null interval) upon which the second-generation p-value is based.
`std.err`	Standard error for the distribution of the estimator for the parameter of interest. Note that this is the standard deviation for the estimator, not the standard deviation parameter for the data itself. This will be a function of the sample size(s).
`interval.type`	Class of interval estimate used for calculating the SGPV. Options are `confidence` for a $100(1-\alpha)$% confidence interval and `likelihood` for a $1/k$ likelihood support interval (`credible` not yet supported).
`interval.level`	Level of interval estimate. If `interval.type` is `confidence`, the level is $\alpha$. If `interval.type` is `likelihood`, the level is $1/k$ (not $k$).

Value

A list containing the following components:

power.alt: Probability of SGPV = 0 calculated assuming the parameter is equal to true. That is, power.alt $= P(\text{SGPV} = 0 \mid \theta = \text{true})$.
power.inc: Probability of 0 < SGPV < 1 calculated assuming the parameter is equal to true. That is, power.inc $= P(0 < \text{SGPV} < 1 \mid \theta = \text{true})$.
power.null: Probability of SGPV = 1 calculated assuming the parameter is equal to true. That is, power.null $= P(\text{SGPV} = 1 \mid \theta = \text{true})$.
`type I error summaries`: Named vector that includes different ways the type I error may be summarized for an interval null hypothesis. min is the minimum type I error over the range (null.lo, null.hi), which occurs at the midpoint of (null.lo, null.hi). max is the maximum type I error over the range (null.lo, null.hi), which occurs at the boundaries of the null hypothesis, null.lo and null.hi. mean is the average type I error (unweighted) over the range (null.lo, null.hi). If $0$ is included in the null hypothesis region, then `type I error summaries` also contains at 0, the type I error calculated assuming the true parameter value $\theta$ is equal to $0$.
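
The three probabilities above can be derived directly for the confidence-interval case. The sketch below assumes a Normal estimator with known standard error and uses the hypothetical name `sgpower_sketch`; it is an illustration of the definitions, not the package function. SGPV = 0 occurs when the interval lies entirely outside the null zone, SGPV = 1 when it lies entirely inside, and the inconclusive probability is the remainder.

```r
# Sketch of the three SGPV outcome probabilities for a (1 - alpha) confidence
# interval, assuming a Normal estimator with known standard error.
# `sgpower_sketch` is a hypothetical name, not the package function.
sgpower_sketch <- function(true, null.lo, null.hi, std.err, alpha) {
  z <- qnorm(1 - alpha / 2)
  # Interval lies entirely outside the null zone
  p.alt <- pnorm((null.lo - true) / std.err - z) +
    pnorm((true - null.hi) / std.err - z)
  # Interval lies entirely inside the null zone (only possible if it is
  # no wider than the null zone itself)
  fits <- (null.hi - null.lo) >= 2 * z * std.err
  p.null <- if (fits) {
    pnorm((null.hi - true) / std.err - z) - pnorm((null.lo - true) / std.err + z)
  } else {
    0 * true
  }
  list(power.alt = p.alt, power.inc = 1 - p.alt - p.null, power.null = p.null)
}

mid  <- sgpower_sketch(0, null.lo = -1, null.hi = 1, std.err = 0.25, alpha = 0.05)
edge <- sgpower_sketch(1, null.lo = -1, null.hi = 1, std.err = 0.25, alpha = 0.05)
print(c(midpoint = mid$power.alt, boundary = edge$power.alt))
```

Evaluating `power.alt` at the midpoint and at a boundary of the null zone illustrates the min and max type I error summaries described above.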

Examples

sgpower(true=2, null.lo=-1, null.hi=1, std.err=1, interval.type='confidence',
 interval.level=0.05)

sgpower(true=0, null.lo=-1, null.hi=1, std.err=1, interval.type='confidence',
 interval.level=0.05)

# plot the power curve
sigma = 5
n = 20
theta = seq(-10, 10, by=0.1)
power = sgpower(true=theta, null.lo=-1, null.hi=1, std.err=sigma/sqrt(n),
 interval.type='confidence', interval.level=0.05)$power.alt
plot(theta, power, type='l', ylab='power')


Second-Generation p-Values

Description

This function computes the second-generation p-value (SGPV) and its associated delta gaps, as introduced in Blume et al. (2018).

Usage

sgpvalue(
  est.lo,
  est.hi,
  null.lo,
  null.hi,
  inf.correction = 1e-05,
  warnings = TRUE
)

Arguments

`est.lo`	A numeric vector of lower bounds of interval estimates. Values may be finite or `-Inf` or `Inf`. Must be of same length as `est.hi`.
`est.hi`	A numeric vector of upper bounds of interval estimates. Values may be finite or `-Inf` or `Inf`. Must be of same length as `est.lo`.
`null.lo`	A numeric vector of lower bounds of null intervals. Values may be finite or `-Inf` or `Inf`. Must be of same length as `null.hi`.
`null.hi`	A numeric vector of upper bounds of null intervals. Values may be finite or `-Inf` or `Inf`. Must be of same length as `null.lo`.
`inf.correction`	A small scalar to denote a positive but infinitesimally small SGPV. Default is 1e-5. SGPVs that are infinitesimally close to 1 are assigned `1-inf.correction`. This option can only be invoked when one of the intervals has infinite length.
`warnings`	Warnings toggle. Warnings are on by default.

Details

Values of NA or NaN for est.lo, est.hi, null.lo, or null.hi will yield a warning and result in an SGPV of NA or NaN.

When null.hi and null.lo are of length 1, the same null interval is used for every interval estimate of [est.lo, est.hi]. If null.hi is not of length 1, its length must match that of est.hi.

When possible, one should compute the second-generation p-value on a scale that is symmetric about the null hypothesis. For example, if the parameter of interest is an odds ratio, computations are typically done on the log scale. This keeps the magnitudes of positive and negative delta-gaps comparable. Also, recall that the magnitude of a delta-gap is not comparable across different null intervals.
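
The effect of the scale can be illustrated with a minimal base-R sketch. It does not call the package: `gap_in_deltas` is a hypothetical helper, and it assumes the delta-gap is the distance between two non-overlapping intervals expressed in units of delta, the half-length of the null interval, as described in Blume et al. (2018). The odds-ratio intervals [2, 3] and [1/3, 1/2] are reciprocals of each other, so they should sit equally far from a null interval of [1/1.1, 1.1].

```r
# Hypothetical helper (not part of the package): distance between two
# non-overlapping intervals in units of delta, where delta is half the
# length of the null interval. Assumes the intervals are finite and disjoint.
gap_in_deltas <- function(est.lo, est.hi, null.lo, null.hi) {
  gap <- max(est.lo, null.lo) - min(est.hi, null.hi)
  delta <- (null.hi - null.lo) / 2
  gap / delta
}

# Raw odds-ratio scale: the two gaps differ
raw.above <- gap_in_deltas(2, 3, 1/1.1, 1.1)
raw.below <- gap_in_deltas(1/3, 1/2, 1/1.1, 1.1)

# Log odds-ratio scale: the two gaps agree
log.above <- gap_in_deltas(log(2), log(3), log(1/1.1), log(1.1))
log.below <- gap_in_deltas(log(1/3), log(1/2), log(1/1.1), log(1.1))

c(raw.above = raw.above, raw.below = raw.below)
c(log.above = log.above, log.below = log.below)
```

On the raw scale the interval above the null appears much farther away than its reciprocal below the null; on the log scale the two distances coincide.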

Value

A list containing the following components:

p.delta: Vector of second-generation p-values
delta.gap: Vector of delta-gaps. Reported as NA when the corresponding second-generation p-value is not zero.
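
As a rough illustration of how the two components relate, the following base-R sketch computes both quantities for finite intervals of positive length. It is not the package source: `sgpv_sketch` is a hypothetical name, the formula is the one given in Blume et al. (2018), and the handling of infinite bounds, missing values, and `inf.correction` is deliberately left out.

```r
# Hypothetical sketch (not the package implementation); finite intervals only.
sgpv_sketch <- function(est.lo, est.hi, null.lo, null.hi) {
  est.len  <- est.hi - est.lo
  null.len <- null.hi - null.lo

  # Length of the intersection of the two intervals (zero if disjoint)
  overlap <- pmax(pmin(est.hi, null.hi) - pmax(est.lo, null.lo), 0)

  # Fraction of the interval estimate that lies inside the null interval.
  # The second factor caps the value at 1/2 when the interval estimate is
  # more than twice as long as the null interval.
  p.delta <- overlap / est.len * pmax(est.len / (2 * null.len), 1)

  # Distance between the intervals in units of delta = null.len / 2,
  # reported only when the second-generation p-value is zero
  gap <- pmax(est.lo, null.lo) - pmin(est.hi, null.hi)
  delta.gap <- ifelse(p.delta == 0, gap / (null.len / 2), NA_real_)

  list(p.delta = p.delta, delta.gap = delta.gap)
}

# Three log odds ratio intervals against one null interval
lb <- c(log(1.05), log(1.3), log(0.97))
ub <- c(log(1.8), log(1.8), log(1.02))
res <- sgpv_sketch(lb, ub, null.lo = log(1/1.1), null.hi = log(1.1))
res$p.delta    # partial overlap, no overlap, fully inside the null interval
res$delta.gap  # defined only for the second interval
```

Because `pmin` and `pmax` recycle their arguments, a null interval of length 1 is applied to every interval estimate in this sketch.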

References

Examples


## Simple example for three estimated log odds ratios but the same null interval
lb <- c(log(1.05), log(1.3), log(0.97))
ub <- c(log(1.8), log(1.8), log(1.02))
sgpv <- sgpvalue(est.lo = lb, est.hi = ub, null.lo = log(1/1.1), null.hi = log(1.1))
sgpv$p.delta

sgpv$delta.gap

## Works with infinite interval bounds
sgpvalue(est.lo = log(1.3), est.hi = Inf, null.lo = -Inf, null.hi = log(1.1))


sgpvalue(est.lo = log(1.05), est.hi = Inf, null.lo = -Inf, null.hi = log(1.1))



## Example t-test with simulated data
set.seed(1776)
x1 <- rnorm(15,mean=0,sd=2) ; x2 <- rnorm(15,mean=3,sd=2)
ci <- t.test(x1,x2)$conf.int[1:2]
sgpvalue(est.lo = ci[1], est.hi = ci[2], null.lo = -1, null.hi = 1)

set.seed(2019)
x1 <- rnorm(15,mean=0,sd=2) ; x2 <- rnorm(15,mean=3,sd=2)
ci <- t.test(x1,x2)$conf.int[1:2]
sgpvalue(est.lo = ci[1], est.hi = ci[2], null.lo = -1, null.hi = 1)

## Simulated two-group dichotomous data for different parameters
set.seed(1492)
n1 <- n2 <- 30
x1 <- rbinom(1,size=n1,prob=0.15) ; x2 <- rbinom(1,size=n2,prob=0.50)

# On the difference in proportions
ci.p  <- prop.test(c(x1,x2),n=c(n1,n2))$conf.int[1:2]
sgpvalue(est.lo = ci.p[1], est.hi = ci.p[2], null.lo = -0.2, null.hi = 0.2)

# On the log odds ratio scale
a <- x1 ; b <- x2 ; c <- n1-x1 ; d <- n2-x2
ci.or <- log(a*d/(b*c)) + c(-1,1)*1.96*sqrt(1/a+1/b+1/c+1/d)	# Delta-method SE for log odds ratio
sgpvalue(est.lo = ci.or[1], est.hi = ci.or[2], null.lo = log(1/1.5), null.hi = log(1.5))



Help Index

False Discovery Risk for Second-Generation p-Values
Test Statistics from Golub (1999) Leukemia data set
Second-Generation p-Value Plotting
Plot power curves for Second-Generation p-Values
Second-Generation p-Value Plotting
Power functions for Second-Generation p-Values
Second-Generation p-Values