Package 'sgpv'

Title: Calculate Second-Generation p-Values and Associated Measures
Description: Computation of second-generation p-values as described in Blume et al. (2018) <doi:10.1371/journal.pone.0188299> and Blume et al. (2019) <doi:10.1080/00031305.2018.1537893>. There are additional functions which provide power and type I error calculations, create graphs (particularly suited for large-scale inference usage), and a function to estimate false discovery rates based on second-generation p-value inference.
Authors: Valerie Welty [aut, cre], Rebecca Irlmeier [aut], Thomas Stewart [aut], Robert Greevy, Jr. [aut], Lucy D'Agostino McGowan [aut], Jeffrey Blume [aut]
Maintainer: Valerie Welty <[email protected]>
License: MIT + file LICENSE
Version: 1.1.0
Built: 2024-11-18 05:23:32 UTC
Source: https://github.com/weltybiostat/sgpv

Help Index


False Discovery Risk for Second-Generation p-Values

Description

This function computes the false discovery risk (sometimes called the "empirical bayes FDR") for a second-generation p-value of 0, or the false confirmation risk for a second-generation p-value of 1.

Usage

fdrisk(
  sgpval = 0,
  null.lo,
  null.hi,
  std.err,
  interval.type,
  interval.level,
  pi0 = 0.5,
  null.weights,
  null.space,
  alt.weights,
  alt.space
)

Arguments

sgpval

The observed second-generation p-value. Default is 00, which gives the false discovery risk.

null.lo

The lower bound of the indifference zone (null interval) upon which the second-generation p-value was based

null.hi

The upper bound for the indifference zone (null interval) upon which the second-generation p-value was based

std.err

Standard error of the point estimate

interval.type

Class of interval estimate used. This determines the functional form of the power function. Options are confidence for a (1α)100(1-\alpha)100% confidence interval and likelihood for a 1/k1/k likelihood support interval (credible not yet supported).

interval.level

Level of interval estimate. If interval.type is confidence, the level is α\alpha. If interval.type is likelihood, the level is 1/k1/k (not kk).

pi0

Prior probability of the null hypothesis. Default is 0.50.5.

null.weights

Probability distribution for the null parameter space. Options are currently Point, Uniform, and TruncNormal.

null.space

Support of the null probability distribution. If null.weights is Point, then null.space is a scalar. If null.weights is Uniform, then null.space is a vector of length two.

alt.weights

Probability distribution for the alternative parameter space. Options are currently Point, Uniform, and TruncNormal.

alt.space

Support for the alternative probability distribution. If alt.weights is Point, then alt.space is a scalar. If alt.weights is Uniform, then alt.space is a vector of length two.

Details

When possible, one should compute the second-generation p-value and FDR/FCR on a scale that is symmetric about the null hypothesis. For example, if the parameter of interest is an odds ratio, inputs pt.est, std.err, null.lo, null.hi, null.space, and alt.space are typically on the log scale.

If TruncNormal is used for null.weights, then the distribution used is a truncated Normal distribution with mean equal to the midpoint of null.space, and standard deviation equal to std.err, truncated to the support of null.space. If TruncNormal is used for alt.weights, then the distribution used is a truncated Normal distribution with mean equal to the midpoint of alt.space, and standard deviation equal to std.err, truncated to the support of alt.space. Further customization of these parameters for the truncated Normal are currently not possible, although they may be implemented in future versions.

Value

Numeric scalar representing the False discovery risk (FDR) or false confirmation risk (FCR) for the observed second-generation p-value. If sgpval = 00, the function returns false discovery risk (FDR). If sgpval = 11, the function returns false confirmation risk (FCR).

References

Blume JD, Greevy RA Jr., Welty VF, Smith JR, Dupont WD (2019). An Introduction to Second-generation p-values. The American Statistician. 73:sup1, 157-167, DOI: https://doi.org/10.1080/00031305.2018.1537893

Blume JD, D’Agostino McGowan L, Dupont WD, Greevy RA Jr. (2018). Second-generation p-values: Improved rigor, reproducibility, & transparency in statistical analyses. PLoS ONE 13(3): e0188299. https://doi.org/10.1371/journal.pone.0188299

See Also

sgpvalue, sgpower, plotsgpv, FDRestimation::p.fdr

Examples

# false discovery risk with 95% confidence level
fdrisk(sgpval = 0,  null.lo = log(1/1.1), null.hi = log(1.1),  std.err = 0.8,
  null.weights = 'Uniform', null.space = c(log(1/1.1), log(1.1)),
  alt.weights = 'Uniform',  alt.space = 2 + c(-1,1)*qnorm(1-0.05/2)*0.8,
  interval.type = 'confidence',  interval.level = 0.05)

# false discovery risk with 1/8 likelihood support level
fdrisk(sgpval = 0,  null.lo = log(1/1.1), null.hi = log(1.1),  std.err = 0.8,
  null.weights = 'Point', null.space = 0,  alt.weights = 'Uniform',
  alt.space = 2 + c(-1,1)*qnorm(1-0.041/2)*0.8,
  interval.type = 'likelihood',  interval.level = 1/8)

## with truncated normal weighting distribution
fdrisk(sgpval = 0,  null.lo = log(1/1.1), null.hi = log(1.1),  std.err = 0.8,
  null.weights = 'Point', null.space = 0,  alt.weights = 'TruncNormal',
  alt.space = 2 + c(-1,1)*qnorm(1-0.041/2)*0.8,
  interval.type = 'likelihood',  interval.level = 1/8)

# false discovery risk with LSI and wider null hypothesis
fdrisk(sgpval = 0,  null.lo = log(1/1.5), null.hi = log(1.5),  std.err = 0.8,
  null.weights = 'Point', null.space = 0,  alt.weights = 'Uniform',
  alt.space = 2.5 + c(-1,1)*qnorm(1-0.041/2)*0.8,
  interval.type = 'likelihood',  interval.level = 1/8)

# false confirmation risk example
fdrisk(sgpval = 1,  null.lo = log(1/1.5), null.hi = log(1.5),  std.err = 0.15,
  null.weights = 'Uniform', null.space = 0.01 + c(-1,1)*qnorm(1-0.041/2)*0.15,
  alt.weights = 'Uniform',  alt.space = c(log(1.5), 1.25*log(1.5)),
  interval.type = 'likelihood',  interval.level = 1/8)

Test Statistics from Gloub (1999) Leukemia data set

Description

Data are from 7218 gene specific t-tests for a difference in mean expression (on the log scale; AML versus ALL) in the Gloub data set (1999). Data are from 72 patients using a pooled t-test (df=70). Included in the dataframe are the following: t-statistic (t.stat), p-value (p.value), CI lower limit (ci.lo), CI upper limit (ci.hi), estimate (estimate), standard error (se).

Usage

data(leukstats)

Format

An object of class data.frame. Includes the following: t-statistic (t.stat), p-value (p.value), CI lower limit (ci.lo), CI upper limit (ci.hi), estimate (estimate), standard error (se).

Source

https://github.com/ramhiser/datamicroarray/wiki/Golub-(1999)

References

Gloub (1999) and used in Blume et. al. (2018) PlosONE.

Blume JD, D’Agostino McGowan L, Dupont WD, Greevy RA Jr. (2018). Second-generation p-values: Improved rigor, reproducibility, & transparency in statistical analyses. PLoS ONE 13(3): e0188299. https://doi.org/10.1371/journal.pone.0188299

Examples

data(leukstats)
order(leukstats$p.value)

Second-Generation p-Value Plotting

Description

This function displays a modified Manhattan-style plot colored according to second-generation p-value status. There are several variations of this plot that can be made depending upon user input for type as well as the set.order and x.show options. These plots allow the user to visualize the overall result of a large-scale analysis succintly and to visually assess the differences in the results using second-generation p-value techniques as opposed to classical p-value techniques.

Usage

plotman(
  est.lo,
  est.hi,
  null.lo,
  null.hi,
  set.order = NA,
  x.show = NA,
  type = "delta-gap",
  p.values = NA,
  ref.lines = NA,
  null.pt = NA,
  int.col = c("cornflowerblue", "firebrick3", "darkslateblue"),
  int.pch = 16,
  int.cex = 0.4,
  title.lab = NA,
  x.lab = "Position (by set.order)",
  y.lab = "Outcome label",
  legend.on = TRUE
)

Arguments

est.lo

A numeric vector of lower bounds of interval estimates. Must be of same length as est.hi.

est.hi

A numeric vector of upper bounds of interval estimates. Must be of same length as est.lo.

null.lo

A scalar representing the lower bound of the null interval hypothesis (indifference zone). Value must be finite.

null.hi

A scalar representing the upper bound of the null interval hypothesis (indifference zone). Value must be finite.

set.order

A numeric vector giving the desired order along the x-axis. Alternatively, if set.order is set to "sgpv", the second-generation p-value ranking is used. The default option is NA, which uses the original input ordering.

x.show

A numeric scalar representing the maximum ranking on the x-axis that is displayed. Default is to display all rankings.

type

A string specifying the desired Manhattan-style plot to be graphed. This argument specifies the variable on the y-axis. If type = "delta-gap", the delta-gaps are ranked. If type = "p-value", the classic p-values are ranked. If type = "comparison", the classic p-values are ranked by SGPV. Default is type = "delta-gap".

p.values

A numeric vector giving the classic p-values. This is required when type = "p-value" or type = "comparison", and is not required when type = "delta-gap". The p.values input may be any desired transformation of the p-values. For example, if the desired transformation is log10(pvalue)-log10(p-value) as in a traditional Manhattan plot, the log10(pvalues)-log10(p-values) should be provided for p.values. The corresponding x or y axis label(s) should be updated to reflect any transformations.

ref.lines

A numeric scalar or vector giving the points on the y-axis at which to add a horizontal reference line. For example, if p.values is set to log10(pvalues)-log10(p-values) and the type of plot selected shows the (transformed) p-values on the y-axis, possible locations for the reference lines could be at the log10(0.05)-log10(0.05), log10(Bonferroni)-log10(Bonferroni) and log10(FDR)-log10(FDR) significance levels.

null.pt

An optional numeric scalar representing a point null hypothesis. Default is NA.

int.col

Vector of length three specifing the colors of the points according to SGPV result. The first color option corresponds to the SGPV=0SGPV = 0 results, the second color option corresponds to the 0<SGPV<10 < SGPV < 1 results, and the third color option corresponds to the SGPV=1SGPV = 1 results. Default is int.col = c("cornflowerblue","firebrick3","darkslateblue").

int.pch

Plotting symbol for points. Default is 16 for small points.

int.cex

Size of plotting symbol for points. Default is 0.4.

title.lab

Title text.

x.lab

A title for the x-axis. Default is the generic "Position (by set.order)".

y.lab

A title for the y-axis. Default is the generic "Outcome label".

legend.on

Toggle for plotting the legend. Default is TRUE.

Details

Use set.order to provide the classical p-value ranking. For example, if pvalue.vector is a vector of classical p-values, then set set.order=order(pvalue.vector) to sort the x-axis according to p-value rank.

Use type and p.values to provide the log10(pvalues)-log10(p-values) for the y-axis. For example, if pvalue.vector is a vector of classical p-values, then set type="p-value" (or type="comparison") and p.values=-log10(pvalue.vector) to set the y-axis. Then, set the y-axis title to something like y.lab="-log10(p)".

References

Blume JD, Greevy RA Jr., Welty VF, Smith JR, Dupont WD (2019). An Introduction to Second-generation p-values. The American Statistician. 73:sup1, 157-167, DOI: https://doi.org/10.1080/00031305.2018.1537893

Blume JD, D’Agostino McGowan L, Dupont WD, Greevy RA Jr. (2018). Second-generation p-values: Improved rigor, reproducibility, & transparency in statistical analyses. PLoS ONE 13(3): e0188299. https://doi.org/10.1371/journal.pone.0188299

See Also

sgpvalue, plotsgpv, sgpower, plotsgpower

Examples

#  Use leukstats data
data(leukstats)

# ID number on the x-axis, delta-gap on the y-axis, using an interval null hypothesis of
# (-0.3, 0.3) for the log mean difference in expression levels (fold change).
plotman(est.lo=leukstats$ci.lo, est.hi=leukstats$ci.hi,
       null.lo=-0.3, null.hi=0.3,
       set.order=NA,
       type="delta-gap",
       ref.lines=NA,
       int.pch=16, int.cex=0.4,
       title.lab="Leukemia Example",
       y.lab="Delta-gap",
       x.lab="Position (ID)",
       legend.on=TRUE)

# ID number on the x-axis, -log10(classical p-value) on the y-axis, using an interval
# null hypothesis of (-0.3, 0.3) for the log mean difference in expression levels
# (fold change).
plotman(est.lo=leukstats$ci.lo, est.hi=leukstats$ci.hi,
       null.lo=-0.3, null.hi=0.3,
       set.order=NA,
       type="p-value",
       p.values=-log10(leukstats$p.value),
       ref.lines=-log10(0.05),
       int.pch=16, int.cex=0.4,
       title.lab="Leukemia Example",
       y.lab=expression("-log"[10]*"(p-value)"),
       x.lab="Position (ID)",
       legend.on=TRUE)

# Second-generation p-value (SGPV) on the x-axis, -log10(classical p-value) on the
# y-axis, using an interval null hypothesis of (-0.3, 0.3) for the log mean difference
# in expression levels (fold change).
plotman(est.lo=leukstats$ci.lo, est.hi=leukstats$ci.hi,
       null.lo=-0.3, null.hi=0.3,
       set.order="sgpv",
       type="comparison",
       p.values=-log10(leukstats$p.value),
       ref.lines=c(-log10(0.05), -log10(0.001)),
       int.pch=16, int.cex=0.4,
       title.lab="Leukemia Example",
       y.lab=expression("-log"[10]*"(p-value)"),
       x.lab="Second-generation p-value ranking",
       legend.on=TRUE)

Plot power curves for Second-Generation p-Values

Description

This function calculates power and type I error values from significance testing based on second-generation p-values as the inferential metric and plots the power curve to visualize the operating charateristics of the inferential procedure.

Usage

plotsgpower(
  null.lo,
  null.hi,
  std.err,
  alt = NA,
  x.lim = NA,
  interval.type,
  interval.level = 0.05,
  plot.option = 1,
  null.col = rgb(208, 216, 232, maxColorValue = 255),
  pow.col = c("cornflowerblue", "firebrick3", "green4"),
  pow.lty = c(1, 1, 1),
  title.lab = "",
  x.lab = "Parameter",
  y.lab = "Probability",
  legend.on = TRUE,
  null.pt = NA,
  acc = 100
)

Arguments

null.lo

A scalar representing the lower bound of the null interval hypothesis (indifference zone) upon which the second-generation p-value is based.

null.hi

A scalar representing the upper bound of the null interval hypothesis (indifference zone) upon which the second-generation p-value is based.

std.err

Standard error for the distribution of the estimator for the parameter of interest. Note that this is the standard deviation for the estimator, not the standard deviation parameter for the data itself. This will be a function of the sample size(s).

alt

Optional scalar or vector of alternative value(s) for the parameter of interest. Default is NA. If provided, a blue dotted line (or one at each point) will be plotted and the power will be printed.

x.lim

Optional numeric vector of length two giving the lower and upper bounds of the x-axis for the power curve. Default is NA, where the x-axis range will be optimized to fit the entirety of the power curve (which is dependent upon the width of the null zone and the standard error of the estimator).

interval.type

Class of interval estimate used for calculating the SGPV. Options are "confidence" for a (1α)100(1-\alpha)100% confidence interval and "likelihood" for a 1/k1/k likelihood support interval (credible not yet supported).

interval.level

Level of interval estimate. If interval.type = "confidence" is used, the level is α\alpha. If interval.type = "likelihood" is used, the level is 1/k1/k (not kk).

plot.option

Used to specify the type of plot desired. If plot.option = 1, the classical power curve and its corresponding SGPV power curve are shown. If plot.option = 2, the three power curves provided by sgpower are shown. Default is plot.option = 1.

null.col

Coloring of shading for the null interval hypothesis (indifference zone) region. Default is Hawkes Blue: null.col = rgb(208, 216, 232, maxColorValue = 255).

pow.col

Vector of length three specifying the colors for the the three power curves given when plot.option = 2. The first color option corresponds to the Pr(SGPV=0θ)Pr(SGPV = 0 | \theta) line, the second color option corresponds to the Pr(0<SGPV<1θ)Pr(0 < SGPV < 1 | \theta) line, and the third color option corresponds to the Pr(SGPV=1θ)Pr(SGPV = 1 | \theta) line. Default is pow.col = c("cornflowerblue", "firebrick3", "green4").

pow.lty

Vector of length three specifying the line types (lty) for the three power curves given when plot.option = 2. The first line type option corresponds to the Pr(SGPV=0θ)Pr(SGPV = 0 | \theta) line, the second line type option corresponds to the Pr(0<SGPV<1θ)Pr(0 < SGPV < 1 | \theta) line, and the third line type option corresponds to the Pr(SGPV=1θ)Pr(SGPV = 1 | \theta) line. Default is pow.lty = c(1,1,1) for solid lines.

title.lab

Title text.

x.lab

x-axis label.

y.lab

y-axis label.

legend.on

Toggle for plotting the legend. Default is TRUE.

null.pt

Optional numeric scalar representing a point null hypothesis. Default is NA. If a value is given, it will be plotted as a black dashed line and the type I error at that point will be printed.

acc

Optional parameter specifying the resolution of the x-axis. Default is acc = 100 for plotting the power curve as a sequence of 100 (x, y) points.

References

Blume JD, Greevy RA Jr., Welty VF, Smith JR, Dupont WD (2019). An Introduction to Second-generation p-values. The American Statistician. 73:sup1, 157-167, DOI: https://doi.org/10.1080/00031305.2018.1537893

Blume JD, D’Agostino McGowan L, Dupont WD, Greevy RA Jr. (2018). Second-generation p-values: Improved rigor, reproducibility, & transparency in statistical analyses. PLoS ONE 13(3): e0188299. https://doi.org/10.1371/journal.pone.0188299

See Also

fdrisk, sgpvalue, plotsgpv

Examples

sigma = 5
n = 20

plotsgpower(alt = NA, null.lo = -1, null.hi = 1,
            std.err = sigma/sqrt(n), x.lim = c(-8,8),
           interval.type = 'confidence', interval.level = 0.05,
           plot.option = 2, null.pt = 0)

plotsgpower(alt = c(-4,2),
            null.lo = -1, null.hi = 1, std.err = sigma/sqrt(n),
            x.lim = NA, interval.type = 'confidence',
            interval.level = 0.05, plot.option = 2)

plotsgpower(alt = NA, null.lo = -1, null.hi = 1,
            std.err = sigma/sqrt(n), x.lim = NA,
            interval.type = 'confidence', interval.level = 0.05,
            plot.option = 1, null.pt = NA)

plotsgpower(alt = c(-4,2), null.lo = -1, null.hi = 1,
            std.err = 1, x.lim = NA, interval.type = 'likelihood',
            interval.level = 0.05, plot.option = 1, null.pt = 0)

Second-Generation p-Value Plotting

Description

This function displays user supplied interval estimates (support intervals, confidence intervals, credible intervals, etc.) according to its associated second-generation p-value ranking.

Usage

plotsgpv(
  est.lo,
  est.hi,
  null.lo,
  null.hi,
  set.order = "sgpv",
  x.show = NA,
  null.col = rgb(208, 216, 232, maxColorValue = 255),
  int.col = c("cornflowerblue", "firebrick3", "darkslateblue"),
  int.pch = NA,
  int.cex = 0.4,
  plot.axis = c(TRUE, TRUE),
  null.pt = NA,
  outline.zone = TRUE,
  title.lab = "Title",
  x.lab = "Position (by set.order)",
  y.lab = "Outcome label",
  legend.on = TRUE
)

Arguments

est.lo

A numeric vector of lower bounds of interval estimates. Values must be finite for interval to be drawn. Must be of same length as est.hi.

est.hi

A numeric vector of upper bounds of interval estimates. Values must be finite for interval to be drawn. Must be of same length as est.lo.

null.lo

A scalar representing the lower bound of null interval (indifference zone). Value must be finite.

null.hi

A scalar representing the upper bound of null interval (indifference zone). Value must be finite.

set.order

A numeric vector giving the desired order along the x-axis. If set.order is set to sgpv, the second-generation p-value ranking is used. If set.order is set to NA, the original input ordering is used.

x.show

A scalar representing the maximum ranking on the x-axis that is displayed. Default is to display all intervals.

null.col

Coloring of the null interval (indifference zone). Default is Hawkes Blue: rgb(208,216,232,maxColorValue=255).

int.col

Coloring of the intervals according to SGPV ranking. Default is c("cornflowerblue","firebrick3","darkslateblue") for SGPVs of 00, in (0,1)(0,1), and 11 respectively.

int.pch

Plotting symbol for interval endpoints. Default is NA, no symbol. Use 16 for small endpoints.

int.cex

Size of plotting symbol for interval endpoints. Default is 0.40.4.

plot.axis

Toggle for default axis plotting. Default is c(TRUE,TRUE) for (xaxis,yaxis)(x-axis,y-axis) respectively.

null.pt

A scalar representing a point null hypothesis. Default is NA. If set, the function will draw a horizontal dashed black line at this location.

outline.zone

Toggle for drawing a slim white outline around the null zone. Helpful visual aid when plotting many intervals. Default is TRUE.

title.lab

Title text.

x.lab

x-axis label.

y.lab

y-axis label.

legend.on

Toggle for plotting the legend. Default is TRUE.

Details

Use set.order to provide the classical p-value ranking. For example, if pvalue.vector is a vector of classical p-values, then set set.order=order(pvalue.vector) to sort the x-axis according to p-value rank.

Interval estimates with infinite or undefined limits should be manually truncated or avoided altogether. While the sgpvalue funciton will handle these cases, this function assumes they have been truncated or removed because there is no standard way to plot them.

References

Blume JD, Greevy RA Jr., Welty VF, Smith JR, Dupont WD (2019). An Introduction to Second-generation p-values. The American Statistician. 73:sup1, 157-167, DOI: https://doi.org/10.1080/00031305.2018.1537893

Blume JD, D’Agostino McGowan L, Dupont WD, Greevy RA Jr. (2018). Second-generation p-values: Improved rigor, reproducibility, & transparency in statistical analyses. PLoS ONE 13(3): e0188299. https://doi.org/10.1371/journal.pone.0188299

See Also

sgpvalue, sgpower, fdrisk

Examples

# Use leukstats data
data(leukstats)
plotsgpv(est.lo=leukstats$ci.lo, est.hi=leukstats$ci.hi,
		null.lo=-0.3, null.hi=0.3,
		set.order=order(leukstats$p.value),
		x.show=7000,
		plot.axis=c("TRUE","FALSE"),
		null.pt=0, outline.zone=TRUE,
		title.lab="Leukemia Example", y.lab="Fold Change (base 10)",
		x.lab="Classical p-value ranking",
		legend.on=TRUE)
axis(side=2,at=round(log(c(1/1000,1/100,1/10,1/2,1,2,10,100,1000),
	base=10),2),labels=c("1/1000","1/100","1/10","1/2",1,2,10,100,1000),
	las=2)

Power functions for Second-Generation p-Values

Description

Calculate power and type I error values from significance testing based on second-generation p-values as the inferential metric.

Usage

sgpower(true, null.lo, null.hi, std.err = 1, interval.type, interval.level)

Arguments

true

The true value for the parameter of interest at which to calculate power. Note that this is on the absolute scale of the parameter, and not the standard deviation or standard error scale.

null.lo

The lower bound of the indifference zone (null interval) upon which the second-generation p-value is based

null.hi

The upper bound for the indifference zone (null interval) upon which the second-generation p-value is based

std.err

Standard error for the distribution of the estimator for the parameter of interest. Note that this is the standard deviation for the estimator, not the standard deviation parameter for the data itself. This will be a function of the sample size(s).

interval.type

Class of interval estimate used for calculating the SGPV. Options are confidence for a (1α)100(1-\alpha)100% confidence interval and likelihood for a 1/k1/k likelihood support interval (credible not yet supported)

interval.level

Level of interval estimate. If interval.type is confidence, the level is α\alpha. If interval.type is likelihood, the level is 1/k1/k (not kk).

Value

A list containing the following components:

power.alt

Probability of SGPV = 0 calculated assuming the parameter is equal to true. That is, power.alt=P(SGPV=0θ== P(SGPV = 0 | \theta = true).

power.inc

Probability of 0 < SGPV < 1 calculated assuming the parameter is equal to true. That is, power.inc=P(0<SGPV<1θ== P(0 < SGPV < 1 | \theta = true).

power.null

Probability of SGPV = 1 calculated assuming the parameter is equal to true. That is, power.null=P(SGPV=1θ== P(SGPV = 1 | \theta = true).

`type I error summaries`

Named vector that includes different ways the type I error may be summarized for an interval null hypothesis. min is the minimum type I error over the range (null.lo, null.hi), which occurs at the midpoint of (null.lo, null.hi). max is the maximum type I error over the range (null.lo, null.hi), which occurs at the boundaries of the null hypothesis, null.lo and null.hi. mean is the average type I error (unweighted) over the range (null.lo, null.hi). If 00 is included in the null hypothesis region, then `type I error summaries` also contains at 0, the type I error calculated assuming the true parameter value θ\theta is equal to 00.

References

Blume JD, Greevy RA Jr., Welty VF, Smith JR, Dupont WD (2019). An Introduction to Second-generation p-values. The American Statistician. 73:sup1, 157-167, DOI: https://doi.org/10.1080/00031305.2018.1537893

Blume JD, D’Agostino McGowan L, Dupont WD, Greevy RA Jr. (2018). Second-generation p-values: Improved rigor, reproducibility, & transparency in statistical analyses. PLoS ONE 13(3): e0188299. https://doi.org/10.1371/journal.pone.0188299

See Also

fdrisk, sgpvalue, plotsgpv

Examples

sgpower(true=2, null.lo=-1, null.hi=1, std.err=1, interval.type='confidence',
 'interval.level'=0.05)

sgpower(true=0, null.lo=-1, null.hi=1, std.err=1, interval.type='confidence',
 'interval.level'=0.05)

# plot the power curve
sigma = 5
n = 20
theta = seq(-10, 10, by=0.1)
power = sgpower(true=theta, null.lo=-1, null.hi=1, std.err=sigma/sqrt(n),
 interval.type='confidence', interval.level=0.05)$power.alt
plot(theta, power, type='l', ylab='power')

Second-Generation p-Values

Description

This function computes the second-generation p-value (SGPV) and its associated delta gaps, as introduced in Blume et al. (2018).

Usage

sgpvalue(
  est.lo,
  est.hi,
  null.lo,
  null.hi,
  inf.correction = 1e-05,
  warnings = TRUE
)

Arguments

est.lo

A numeric vector of lower bounds of interval estimates. Values may be finite or -Inf or Inf. Must be of same length as est.hi.

est.hi

A numeric vector of upper bounds of interval estimates. Values may be finite or -Inf or Inf. Must be of same length as est.lo.

null.lo

A numeric vector of lower bounds of null intervals. Values may be finite or -Inf or Inf. Must be of same length as null.hi.

null.hi

A numeric vector of upper bounds of null intervals. Values may be finite or -Inf or Inf. Must be of same length as null.lo.

inf.correction

A small scalar to denote a positive but infinitesimally small SGPV. Default is 1e-5. SGPVs that are infinitesimally close to 1 are assigned 1-inf.correction. This option can only be invoked when one of the intervals has infinite length.

warnings

Warnings toggle. Warnings are on by default.

Details

Values of NA or NaN for est.lo, est.hi, null.lo, or null.lo will yield a warning and result in a SGPV of NA or NaN.

When null.hi and null.lo are of length 1, the same null interval is used for every interval estimate of [est.lo, est.hi]. If null.hi is not of length 1, its length must match that of est.hi.

When possible, one should compute the second-generation p-value on a scale that is symmetric about the null hypothesis. For example, if the parameter of interest is an odds ratio, computations are typically done on the log scale. This keeps the magnitude of positive and negative delta-gaps comparable. Also, recall that the delta-gaps magnitude is not comparable across different null intervals.

Value

A list containing the following components:

p.delta

Vector of second-generation p-values

delta.gap

Vector of delta-gaps. Reported as NA when the corresponding second-generation p-value is not zero.

References

Blume JD, Greevy RA Jr., Welty VF, Smith JR, Dupont WD (2019). An Introduction to Second-generation p-values. The American Statistician. 73:sup1, 157-167, DOI: https://doi.org/10.1080/00031305.2018.1537893

Blume JD, D’Agostino McGowan L, Dupont WD, Greevy RA Jr. (2018). Second-generation p-values: Improved rigor, reproducibility, & transparency in statistical analyses. PLoS ONE 13(3): e0188299. https://doi.org/10.1371/journal.pone.0188299

See Also

fdrisk, sgpower, plotsgpv

Examples

## Simple example for three estimated log odds ratios but the same null interval
lb <- c(log(1.05), log(1.3), log(0.97))
ub <- c(log(1.8), log(1.8), log(1.02))
sgpv <- sgpvalue(est.lo = lb, est.hi = ub, null.lo = log(1/1.1), null.hi = log(1.1))
sgpv$p.delta

sgpv$delta.gap

## Works with infinte interval bounds
sgpvalue(est.lo = log(1.3), est.hi = Inf, null.lo = -Inf, null.hi = log(1.1))


sgpvalue(est.lo = log(1.05), est.hi = Inf, null.lo = -Inf, null.hi = log(1.1))



## Example t-test with simulated data
set.seed(1776)
x1 <- rnorm(15,mean=0,sd=2) ; x2 <- rnorm(15,mean=3,sd=2)
ci <- t.test(x1,x2)$conf.int[1:2]
sgpvalue(est.lo = ci[1], est.hi = ci[2], null.lo = -1, null.hi = 1)

set.seed(2019)
x1 <- rnorm(15,mean=0,sd=2) ; x2 <- rnorm(15,mean=3,sd=2)
ci <- t.test(x1,x2)$conf.int[1:2]
sgpvalue(est.lo = ci[1], est.hi = ci[2], null.lo = -1, null.hi = 1)

## Simulated two-group dichotomous data for different parameters
set.seed(1492)
n1 <- n2 <- 30
x1 <- rbinom(1,size=n1,p=0.15) ; x2 <- rbinom(1,size=n2,p=0.50)

# On the difference in proportions
ci.p  <- prop.test(c(x1,x2),n=c(n1,n2))$conf.int[1:2]
sgpvalue(est.lo = ci.p[1], est.hi = ci.p[2], null.lo = -0.2, null.hi = 0.2)

# On the log odds ratio scale
a <- x1 ; b <- x2 ; c <- n1-x1 ; d <- n2-x2
ci.or <- log(a*d/(b*c)) + c(-1,1)*1.96*sqrt(1/a+1/b+1/c+1/d)	# Delta-method SE for log odds ratio
sgpvalue(est.lo = ci.or[1], est.hi = ci.or[2], null.lo = log(1/1.5), null.hi = log(1.5))