Title: | Bayesian Analysis, No Gibbs |
---|---|
Description: | Provides functions for the Bayesian analysis of some simple commonly-used models, without using Markov Chain Monte Carlo (MCMC) methods such as Gibbs sampling. The 'rust' package <https://cran.r-project.org/package=rust> is used to simulate a random sample from the required posterior distribution, using the generalized ratio-of-uniforms method. See Wakefield, Gelfand and Smith (1991) <DOI:10.1007/BF01889987> for details. At the moment three conjugate hierarchical models are available: beta-binomial, gamma-Poisson and a 1-way analysis of variance (ANOVA). |
Authors: | Paul J. Northrop [aut, cre, cph], Benjamin D. Hall [aut, cph] |
Maintainer: | Paul J. Northrop <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.0.4 |
Built: | 2024-11-14 05:00:26 UTC |
Source: | https://github.com/paulnorthrop/bang |
Performs Bayesian analyses using some simple commonly-used models. The multivariate generalized ratio-of-uniforms method is used to simulate random samples from the required posterior distribution. The user can either choose hyperparameter values of a default prior distribution or specify their own prior distribution.
Currently three conjugate hierarchical models are available:
beta-binomial, gamma-Poisson and 1-way Analysis of Variance (ANOVA).
The function hef
produces random posterior samples from for the
beta-binomial and gamma-Poisson models. The function hanova1
does this for the 1-way Analysis of Variance (ANOVA).
The rust package is used to
produce these samples.
See
vignette("bang-a-vignette", package = "bang") for a brief introduction
to the package and
vignette("bang-b-hef-vignette", package = "bang") and
vignette("bang-c-anova-vignette", package = "bang") for illustrations
of the use of the hef
and hanova1
functions.
Maintainer: Paul J. Northrop [email protected] [copyright holder]
Authors:
Benjamin D. Hall [email protected] [copyright holder]
Northrop, P. J. (2017). rust: Ratio-of-Uniforms Simulation with Transformation. R package version 1.2.3. https://cran.r-project.org/package=rust.
hef
for hierarchical exponential family models.
hanova1
for hierarchical one-way analysis of
variance (ANOVA).
set_user_prior
to set a user-defined prior.
Coagulation time in seconds for blood drawn from 24 animals randomly
allocated to four different diets from Box, Hunter, and Hunter (1978).
The data frame coagulation
has 24 rows and 2 columns.
Each row relates to a different animal.
Column 1 contains the coagulation times.
Column 2 contains a label for the type of diet: one of A, B, C or D.
coagulation
coagulation
A data frame with 24 rows and 2 columns.
Table 11.2 of Gelman, A., Carlin, J. B., Stern, H. S. Dunson, D. B., Vehtari, A. and Rubin, D. B. (2014) Bayesian Data Analysis. Chapman & Hall / CRC. http://www.stat.columbia.edu/~gelman/book/
Box, G. E. P., Hunter, W. G., and Hunter, J. S. (1978). Statistics for Experimenters. New York: Wiley.
Produces random samples from the posterior distribution of the parameters of a 1-way hierarchical ANOVA model.
hanova1( n = 1000, resp, fac, ..., prior = "default", hpars = NULL, param = c("trans", "original"), init = NULL, mu0 = 0, sigma0 = Inf, nrep = NULL )
hanova1( n = 1000, resp, fac, ..., prior = "default", hpars = NULL, param = c("trans", "original"), init = NULL, mu0 = 0, sigma0 = Inf, nrep = NULL )
n |
A numeric scalar. The size of posterior sample required. |
resp |
A numeric vector. Response values. |
fac |
A vector of class |
... |
Optional further arguments to be passed to
|
prior |
The log-prior for the parameters of the hyperprior
distribution. If the user wishes to specify their own prior then
|
hpars |
A numeric vector. Used to set parameters (if any) in
an in-built prior. If |
param |
A character scalar.
If |
init |
A numeric vector. Optional initial estimates sent to
|
mu0 , sigma0
|
A numeric scalar. Mean and standard deviation of a
normal prior for |
nrep |
A numeric scalar. If |
Consider independent experiments in which the
responses
from experiment/group
are normally
distributed with mean
and standard deviation
.
The population parameters
1, ...,
are modelled as random samples from a normal
distribution with mean
and standard deviation
. Let
.
Conditionally on
1, ...,
,
1, ...,
are independent of each other and are independent of
.
A hyperprior is placed on
.
The user can either choose parameter values of a default hyperprior or
specify their own hyperprior using
set_user_prior
.
The ru
function in the rust
package is used to draw a random sample from the marginal posterior
of the hyperparameter vector .
Then, conditional on these values, population parameters are sampled
directly from the conditional posterior density of
1, ...,
given
and the data.
See the vignette("bang-c-anova-vignette", package = "bang")
for details.
The following priors are specified up to proportionality.
Priors:
prior = "bda"
(the default):
that is, a uniform prior for
,
for
and
.
The data must contain at least 3 groups, that is,
fac
must have
at least 3 levels, for a proper posterior density to be obtained.
[See Sections 5.7 and 11.6 of Gelman et al. (2014).]
prior = "unif"
:
that is, a uniform prior for
,
for
and
.
[See Section 11.6 of Gelman et al. (2014).]
prior = "cauchy"
: independent half-Cauchy priors for
and
with respective scale parameters
and
, that is,
[See Gelman (2006).] The scale parameters (
,
)
are specified using
hpars
= (,
).
The default setting is
hpars = c(10, 10).
Parameterizations for sampling:
param = "original"
is (),
param = "trans"
(the default) is
.
An object (list) of class "hef"
, which has the same
structure as an object of class "ru" returned from ru
.
In particular, the columns of the n
-row matrix sim_vals
contain the simulated values of .
In addition this list contains the arguments
model
, resp
,
fac
and prior
detailed above and an n
by
matrix
theta_sim_vals
: column contains the simulated
values of
. Also included are
data = cbind(resp, fac)
and summary_stats
a list
containing: the number of groups I
; the numbers of responses
each group ni
; the total number of observations; the sample mean
response in each group; the sum of squared deviations from the
group means s
; the arguments to hanova1
mu0
and
sigma0
; call
: the matched call to hanova1
.
Gelman, A., Carlin, J. B., Stern, H. S. Dunson, D. B., Vehtari, A. and Rubin, D. B. (2014) Bayesian Data Analysis. Chapman & Hall / CRC.
Gelman, A. (2006) Prior distributions for variance parameters in hierarchical models. Bayesian Analysis, 1(3), 515-533. doi:10.1214/06-BA117A.
The ru
function in the rust
package for details of the arguments that can be passed to ru
via
hanova1
.
hef
for hierarchical exponential family models.
set_user_prior
to set a user-defined prior.
# ======= Late 21st Century Global Temperature Data ======= # Extract data for RCP2.6 RCP26_2 <- temp2[temp2$RCP == "rcp26", ] # Sample from the posterior under the default `noninformative' flat prior # for (mu, sigma_alpha, log(sigma)). Ratio-of-uniforms is used to sample # from the marginal posterior for (log(sigma_alpha), log(sigma)). temp_res <- hanova1(resp = RCP26_2[, 1], fac = RCP26_2[, 2]) # Plot of sampled values of (sigma_alpha, sigma) plot(temp_res, params = "ru") # Plot of sampled values of (log(sigma_alpha), log(sigma)) # (centred at (0,0)) plot(temp_res, ru_scale = TRUE) # Plot of sampled values of (mu, sigma_alpha, sigma) plot(temp_res) # Estimated marginal posterior densities of the mean for each GCM plot(temp_res, params = "pop", which_pop = "all", one_plot = TRUE) # Posterior sample quantiles probs <- c(2.5, 25, 50, 75, 97.5) / 100 round(t(apply(temp_res$sim_vals, 2, quantile, probs = probs)), 2) # Ratio-of-uniforms information and posterior sample summaries summary(temp_res) # ======= Coagulation time data, from Table 11.2 Gelman et al (2014) ======= # With only 4 groups the posterior for sigma_alpha has a heavy right tail if # the default `noninformative' flat prior for (mu, sigma_alpha, log(sigma)) # is used. If we try to sample from the marginal posterior for # (sigma_alpha, sigma) using the default generalized ratio-of-uniforms # runing parameter value r = 1/2 then the acceptance region is not bounded. # Two remedies: reparameterize the posterior and/or increase the value of r. # (log(sigma_alpha), log(sigma)) parameterization, ru parameter r = 1/2 coag1 <- hanova1(resp = coagulation[, 1], fac = coagulation[, 2]) # (sigma_alpha, sigma) parameterization, ru parameter r = 1 coag2 <- hanova1(resp = coagulation[, 1], fac = coagulation[, 2], param = "original", r = 1) # Values to compare to those in Table 11.3 of Gelman et al (2014) all1 <- cbind(coag1$theta_sim_vals, coag1$sim_vals) all2 <- cbind(coag2$theta_sim_vals, coag2$sim_vals) round(t(apply(all1, 2, quantile, probs = probs)), 1) round(t(apply(all2, 2, quantile, probs = probs)), 1) # Pairwise plots of posterior samples from the group means plot(coag1, which_pop = "all", plot_type = "pairs") # Independent half-Cauchy priors for sigma_alpha and sigma coag3 <- hanova1(resp = coagulation[, 1], fac = coagulation[, 2], param = "original", prior = "cauchy", hpars = c(10, 1e6))
# ======= Late 21st Century Global Temperature Data ======= # Extract data for RCP2.6 RCP26_2 <- temp2[temp2$RCP == "rcp26", ] # Sample from the posterior under the default `noninformative' flat prior # for (mu, sigma_alpha, log(sigma)). Ratio-of-uniforms is used to sample # from the marginal posterior for (log(sigma_alpha), log(sigma)). temp_res <- hanova1(resp = RCP26_2[, 1], fac = RCP26_2[, 2]) # Plot of sampled values of (sigma_alpha, sigma) plot(temp_res, params = "ru") # Plot of sampled values of (log(sigma_alpha), log(sigma)) # (centred at (0,0)) plot(temp_res, ru_scale = TRUE) # Plot of sampled values of (mu, sigma_alpha, sigma) plot(temp_res) # Estimated marginal posterior densities of the mean for each GCM plot(temp_res, params = "pop", which_pop = "all", one_plot = TRUE) # Posterior sample quantiles probs <- c(2.5, 25, 50, 75, 97.5) / 100 round(t(apply(temp_res$sim_vals, 2, quantile, probs = probs)), 2) # Ratio-of-uniforms information and posterior sample summaries summary(temp_res) # ======= Coagulation time data, from Table 11.2 Gelman et al (2014) ======= # With only 4 groups the posterior for sigma_alpha has a heavy right tail if # the default `noninformative' flat prior for (mu, sigma_alpha, log(sigma)) # is used. If we try to sample from the marginal posterior for # (sigma_alpha, sigma) using the default generalized ratio-of-uniforms # runing parameter value r = 1/2 then the acceptance region is not bounded. # Two remedies: reparameterize the posterior and/or increase the value of r. # (log(sigma_alpha), log(sigma)) parameterization, ru parameter r = 1/2 coag1 <- hanova1(resp = coagulation[, 1], fac = coagulation[, 2]) # (sigma_alpha, sigma) parameterization, ru parameter r = 1 coag2 <- hanova1(resp = coagulation[, 1], fac = coagulation[, 2], param = "original", r = 1) # Values to compare to those in Table 11.3 of Gelman et al (2014) all1 <- cbind(coag1$theta_sim_vals, coag1$sim_vals) all2 <- cbind(coag2$theta_sim_vals, coag2$sim_vals) round(t(apply(all1, 2, quantile, probs = probs)), 1) round(t(apply(all2, 2, quantile, probs = probs)), 1) # Pairwise plots of posterior samples from the group means plot(coag1, which_pop = "all", plot_type = "pairs") # Independent half-Cauchy priors for sigma_alpha and sigma coag3 <- hanova1(resp = coagulation[, 1], fac = coagulation[, 2], param = "original", prior = "cauchy", hpars = c(10, 1e6))
Produces random samples from the posterior distribution of the parameters of certain hierarchical exponential family models.
hef( n = 1000, model = c("beta_binom", "gamma_pois"), data, ..., prior = "default", hpars = NULL, param = c("trans", "original"), init = NULL, nrep = NULL )
hef( n = 1000, model = c("beta_binom", "gamma_pois"), data, ..., prior = "default", hpars = NULL, param = c("trans", "original"), init = NULL, nrep = NULL )
n |
An integer scalar. The size of the posterior sample required. |
model |
A character string. Abbreviated name for the
response-population distribution combination.
For a hierarchical normal model see |
data |
A numeric matrix. The format depends on |
... |
Optional further arguments to be passed to
|
prior |
The log-prior for the parameters of the hyperprior
distribution. If the user wishes to specify their own prior then
|
hpars |
A numeric vector. Used to set parameters (if any) in an in-built prior. |
param |
A character scalar.
If |
init |
A numeric vector of length 2. Optional initial estimates
for the search for the mode of the posterior density of the
hyperparameter vector |
nrep |
A numeric scalar. If |
Conditional on population-specific parameter vectors
1, ...,
the observed response data
1, ...,
J within each
population are modelled as random samples from a distribution in an
exponential family. The population parameters
1, ...,
are modelled as random samples from a common
population distribution, chosen to be conditionally conjugate
to the response distribution, with hyperparameter vector
. Conditionally on
1, ...,
,
1, ...,
are independent of each other and are independent of
.
A hyperprior is placed on
. The user can either
choose parameter values of a default hyperprior or specify their own
hyperprior using
set_user_prior
.
The ru
function in the rust
package is used to draw a random sample
from the marginal posterior of the hyperparameter vector .
Then, conditional on these values, population parameters are sampled
directly from the conditional posterior density of
1, ...,
given
and the data.
We outline each model
, specify the format of the
data
, give the default (log-)priors (up to an additive constant)
and detail the choices of ratio-of-uniforms parameterization
param
.
Beta-binomial: For ,
are i.i.d binomial
,
where
is the probability of success in group
and
is the number of trials in group
.
are i.i.d. beta
, so
and
.
data
is a 2-column matrix: the numbers of successes in column 1
and the corresponding numbers of trials in column 2.
Priors:
prior = "bda"
(the default):
[See Section 5.3 of Gelman et al. (2014).]
prior = "gamma"
: independent gamma priors on
and
, i.e.
where the respective shape (
,
) and rate
(
,
) parameters are specified using
hpars
= . The default setting is
hpars = c(1, 0.01, 1, 0.01).
Parameterizations for sampling:
param = "original"
is (),
param = "trans"
(the default) is
.
See Section 5.3 of Gelman et al. (2014).
Gamma-Poisson: For ,
j are i.i.d Poisson(
j
j),
where
is the exposure in group
, based on the
total length of observation time and/or size of the population at
risk of the event of interest and
j is the mean number
of events per unit of exposure.
j are i.i.d. gamma
, so
.
data
is a 2-column matrix: the counts of the numbers of
events in column 1 and the corresponding exposures
in column 2.
Priors:
prior = "gamma"
(the default): independent gamma priors
on and
, i.e.
where the respective shape (
,
) and rate
(
,
) parameters are specified using
hpars
= . The default setting is
hpars = c(1, 0.01, 1, 0.01).
Parameterizations for sampling:
param = "original"
is (),
param = "trans"
(the default) is
An object (list) of class "hef"
, which has the same
structure as an object of class "ru" returned from ru
.
In particular, the columns of the n
-row matrix sim_vals
contain the simulated values of .
In addition this list contains the arguments
model
, data
and prior
detailed above, an n
by matrix
theta_sim_vals
: column contains the simulated values of
and
call
: the matched call to hef
.
If nrep
is not NULL
then this list also contains
data_rep
, a numerical matrix with nrep
columns.
Each column contains a replication of the first column of the original
data data[, 1]
, simulated from the posterior predictive
distribution.
Gelman, A., Carlin, J. B., Stern, H. S. Dunson, D. B., Vehtari, A. and Rubin, D. B. (2014) Bayesian Data Analysis. Chapman & Hall / CRC. http://www.stat.columbia.edu/~gelman/book/
The ru
function in the rust
package for details of the arguments that can be passed to ru
via
hef
.
hanova1
for hierarchical one-way analysis of
variance (ANOVA).
set_user_prior
to set a user-defined prior.
############################ Beta-binomial ################################# # ------------------------- Rat tumor data ------------------------------- # # Default prior, sampling on (rotated) (log(mean), log(alpha + beta)) scale rat_res <- hef(model = "beta_binom", data = rat) # Hyperparameters alpha and beta plot(rat_res) # Parameterization used for sampling plot(rat_res, ru_scale = TRUE) summary(rat_res) # Choose rats with extreme sample probabilities pops <- c(which.min(rat[, 1] / rat[, 2]), which.max(rat[, 1] / rat[, 2])) # Population-specific posterior samples: separate plots plot(rat_res, params = "pop", plot_type = "both", which_pop = pops) # Population-specific posterior samples: one plot plot(rat_res, params = "pop", plot_type = "dens", which_pop = pops, one_plot = TRUE, add_legend = TRUE) # Default prior, sampling on (rotated) (alpha, beta) scale rat_res <- hef(model = "beta_binom", data = rat, param = "original") plot(rat_res) plot(rat_res, ru_scale = TRUE) summary(rat_res) # To produce a plot akin to Figure 5.3 of Gelman et al. (2014) we # (a) Use the same prior for (alpha, beta) # (b) Don't use axis rotation (rotate = FALSE) # (c) Plot on the scale used for ratio-of-uniforms sampling (ru_scale = TRUE) # (d) Note that the mode is relocated to (0, 0) in the plot rat_res <- hef(model = "beta_binom", data = rat, rotate = FALSE) plot(rat_res, ru_scale = TRUE) # This is the estimated location of the posterior mode rat_res$f_mode # User-defined prior, passing parameters # (equivalent to prior = "gamma" with hpars = c(1, 0.01, 1, 0.01)) user_prior <- function(x, hpars) { return(dexp(x[1], hpars[1], log = TRUE) + dexp(x[2], hpars[2], log = TRUE)) } user_prior_fn <- set_user_prior(user_prior, hpars = c(0.01, 0.01)) rat_res <- hef(model = "beta_binom", data = rat, prior = user_prior_fn) plot(rat_res) summary(rat_res) ############################ Gamma-Poisson ################################# # ------------------------ Pump failure data ------------------------------ # pump_res <- hef(model = "gamma_pois", data = pump) # Hyperparameters alpha and beta plot(pump_res) # Parameterization used for sampling plot(pump_res, ru_scale = TRUE) summary(pump_res) # Choose pumps with extreme sample rates pops <- c(which.min(pump[, 1] / pump[, 2]), which.max(pump[, 1] / pump[, 2])) plot(pump_res, params = "pop", plot_type = "dens", which_pop = pops)
############################ Beta-binomial ################################# # ------------------------- Rat tumor data ------------------------------- # # Default prior, sampling on (rotated) (log(mean), log(alpha + beta)) scale rat_res <- hef(model = "beta_binom", data = rat) # Hyperparameters alpha and beta plot(rat_res) # Parameterization used for sampling plot(rat_res, ru_scale = TRUE) summary(rat_res) # Choose rats with extreme sample probabilities pops <- c(which.min(rat[, 1] / rat[, 2]), which.max(rat[, 1] / rat[, 2])) # Population-specific posterior samples: separate plots plot(rat_res, params = "pop", plot_type = "both", which_pop = pops) # Population-specific posterior samples: one plot plot(rat_res, params = "pop", plot_type = "dens", which_pop = pops, one_plot = TRUE, add_legend = TRUE) # Default prior, sampling on (rotated) (alpha, beta) scale rat_res <- hef(model = "beta_binom", data = rat, param = "original") plot(rat_res) plot(rat_res, ru_scale = TRUE) summary(rat_res) # To produce a plot akin to Figure 5.3 of Gelman et al. (2014) we # (a) Use the same prior for (alpha, beta) # (b) Don't use axis rotation (rotate = FALSE) # (c) Plot on the scale used for ratio-of-uniforms sampling (ru_scale = TRUE) # (d) Note that the mode is relocated to (0, 0) in the plot rat_res <- hef(model = "beta_binom", data = rat, rotate = FALSE) plot(rat_res, ru_scale = TRUE) # This is the estimated location of the posterior mode rat_res$f_mode # User-defined prior, passing parameters # (equivalent to prior = "gamma" with hpars = c(1, 0.01, 1, 0.01)) user_prior <- function(x, hpars) { return(dexp(x[1], hpars[1], log = TRUE) + dexp(x[2], hpars[2], log = TRUE)) } user_prior_fn <- set_user_prior(user_prior, hpars = c(0.01, 0.01)) rat_res <- hef(model = "beta_binom", data = rat, prior = user_prior_fn) plot(rat_res) summary(rat_res) ############################ Gamma-Poisson ################################# # ------------------------ Pump failure data ------------------------------ # pump_res <- hef(model = "gamma_pois", data = pump) # Hyperparameters alpha and beta plot(pump_res) # Parameterization used for sampling plot(pump_res, ru_scale = TRUE) summary(pump_res) # Choose pumps with extreme sample rates pops <- c(which.min(pump[, 1] / pump[, 2]), which.max(pump[, 1] / pump[, 2])) plot(pump_res, params = "pop", plot_type = "dens", which_pop = pops)
plot
method for class "hef".
## S3 method for class 'hef' plot( x, y, ..., params = c("hyper", "ru", "pop"), which_pop = NULL, plot_type = NULL, one_plot = FALSE, add_legend = FALSE, legend_position = "topright", legend_text = NULL, num = 100 )
## S3 method for class 'hef' plot( x, y, ..., params = c("hyper", "ru", "pop"), which_pop = NULL, plot_type = NULL, one_plot = FALSE, add_legend = FALSE, legend_position = "topright", legend_text = NULL, num = 100 )
x |
an object of class "hef", a result of a call to
|
y |
Not used. |
... |
Additional arguments passed to |
params |
A character scalar that determines to which parameters the plots relate.
|
which_pop |
An integer vector or character scalar.
If |
plot_type |
A character scalar that determines the type of plot
produced when
|
one_plot , add_legend , legend_position , legend_text
|
Only relevant if
|
num |
A numeric scalar. If |
See the examples in hef
and hanova1
.
plot.ru
for arguments that may be passed
via ...., in particular ru_scale
.
pp_check
method for class "hef". This provides an interface
to the functions that perform posterior predictive checks in the
bayesplot package. See PPC-overview for
details of these functions.
## S3 method for class 'hef' pp_check(object, fun = NULL, raw = FALSE, nrep = NULL, ...)
## S3 method for class 'hef' pp_check(object, fun = NULL, raw = FALSE, nrep = NULL, ...)
object |
An object of class "hef", a result of a call to
|
fun |
The plotting function to call. Can be any of the functions detailed at PPC-overview. The "ppc_" prefix can optionally be dropped if fun is specified as a string. |
raw |
Only relevant if |
nrep |
The number of predictive replicates to use. If |
... |
Additional arguments passed on to bayesplot functions. See Examples below. |
For details of these functions see PPC-overview. See also the vignettes Conjugate Hierarchical Models, Hierarchical 1-way Analysis of Variance and the bayesplot vignette Graphical posterior predictive checks.
The general idea is to compare the observed data object$data
with a matrix object$data_rep
in which each row is a
replication of the observed data simulated from the posterior predictive
distribution. For greater detail see Chapter 6 of Gelman et al. (2013).
A ggplot object that can be further customized using the ggplot2 package.
Jonah Gabry (2016). bayesplot: Plotting for Bayesian Models. R package version 1.1.0. https://CRAN.R-project.org/package=bayesplot
Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., and Rubin, D. B. (2013). Bayesian Data Analysis. Chapman & Hall/CRC Press, London, third edition. (Chapter 6). http://www.stat.columbia.edu/~gelman/book/
hef
and hanova1
for sampling
from posterior distributions of hierarchical models.
bayesplot functions PPC-overview, PPC-distributions, PPC-test-statistics, PPC-intervals, pp_check.
############################ Beta-binomial ################################# # ------------------------- Rat tumor data ------------------------------- # rat_res <- hef(model = "beta_binom", data = rat, nrep = 50) # Overlaid density estimates pp_check(rat_res) # Overlaid distribution function estimates pp_check(rat_res, fun = "ecdf_overlay") # Multiple histograms pp_check(rat_res, fun = "hist", nrep = 8) # Multiple boxplots pp_check(rat_res, fun = "boxplot") # Predictive medians vs observed median pp_check(rat_res, fun = "stat", stat = "median") # Predictive (mean, sd) vs observed (mean, sd) pp_check(rat_res, fun = "stat_2d", stat = c("mean", "sd")) ############################ Gamma-Poisson ################################# # ------------------------ Pump failure data ------------------------------ # pump_res <- hef(model = "gamma_pois", data = pump, nrep = 50) # Overlaid density estimates pp_check(pump_res) # Predictive (mean, sd) vs observed (mean, sd) pp_check(pump_res, fun = "stat_2d", stat = c("mean", "sd")) ###################### One-way Hierarchical ANOVA ########################## #----------------- Late 21st Century Global Temperature Data ------------- # RCP26_2 <- temp2[temp2$RCP == "rcp26", ] temp_res <- hanova1(resp = RCP26_2[, 1], fac = RCP26_2[, 2], nrep = 50) # Overlaid density estimates pp_check(temp_res) # Predictive (mean, sd) vs observed (mean, sd) pp_check(temp_res, fun = "stat_2d", stat = c("mean", "sd"))
############################ Beta-binomial ################################# # ------------------------- Rat tumor data ------------------------------- # rat_res <- hef(model = "beta_binom", data = rat, nrep = 50) # Overlaid density estimates pp_check(rat_res) # Overlaid distribution function estimates pp_check(rat_res, fun = "ecdf_overlay") # Multiple histograms pp_check(rat_res, fun = "hist", nrep = 8) # Multiple boxplots pp_check(rat_res, fun = "boxplot") # Predictive medians vs observed median pp_check(rat_res, fun = "stat", stat = "median") # Predictive (mean, sd) vs observed (mean, sd) pp_check(rat_res, fun = "stat_2d", stat = c("mean", "sd")) ############################ Gamma-Poisson ################################# # ------------------------ Pump failure data ------------------------------ # pump_res <- hef(model = "gamma_pois", data = pump, nrep = 50) # Overlaid density estimates pp_check(pump_res) # Predictive (mean, sd) vs observed (mean, sd) pp_check(pump_res, fun = "stat_2d", stat = c("mean", "sd")) ###################### One-way Hierarchical ANOVA ########################## #----------------- Late 21st Century Global Temperature Data ------------- # RCP26_2 <- temp2[temp2$RCP == "rcp26", ] temp_res <- hanova1(resp = RCP26_2[, 1], fac = RCP26_2[, 2], nrep = 50) # Overlaid density estimates pp_check(temp_res) # Predictive (mean, sd) vs observed (mean, sd) pp_check(temp_res, fun = "stat_2d", stat = c("mean", "sd"))
print
method for class "hef".
## S3 method for class 'hef' print(x, ...)
## S3 method for class 'hef' print(x, ...)
x |
an object of class "hef", a result of a call to
|
... |
Additional optional arguments. At present no optional arguments are used. |
Prints the original call to hef
or
hanova1
, the name of the model and the number of populations
in the hierarchical model.
The argument x
, invisibly, as for all print
methods.
hef
for hierarchical exponential family models.
hanova1
for hierarchical one-way analysis of
variance (ANOVA).
print
method for class "summary.hef".
## S3 method for class 'summary.hef' print(x, ...)
## S3 method for class 'summary.hef' print(x, ...)
x |
an object of class "summary.hef", a result of a call to
|
... |
Additional optional arguments to be passed to
|
What is printed depends on the argument params
supplied
to summary.hef
.
The argument x
, invisibly, as for all print
methods.
summary.hef
: summary
method for class "hef".
hef
for hierarchical exponential family models.
hanova1
for hierarchical one-way analysis of
variance (ANOVA).
Data on pump failures from Gaver, D. P. and O'Muircheartaigh, I. G. (1987).
The matrix pump
has 10 rows and 2 columns.
Each row relates to a different pump system.
The first column contains the number of pump failures.
The second column contains the length of operating time, in
thousands of hours.
pump
pump
A matrix 10 rows and 2 columns.
Table 3 of Gaver, D. P. and O'Muircheartaigh, I. G. (1987). See also Gelfand, A. E. and Smith, A. F. M. (1990).
Gaver, D. P. and O'Muircheartaigh, I. G. (1987) Robust Empirical Bayes Analyses of Event Rates. Technometrics, 29, 1-15. doi:10.1080/00401706.1987.10488178
Gelfand, A. E. and Smith, A. F. M. (1990) Sampling-Based Approaches to Calculating Marginal Densities. Journal of the American Statistical Association, 85(410), 398-409. doi:10.1080/01621459.1990.10476213
Tumor incidence in 71 groups of rate from Tarone (1982).
The matrix rat
has 71 rows and 2 columns.
Each row relates to a different group of rats.
The first column (y
) contains the number of rats with tumors.
The second column (n
) contains the total number of rats.
rat
rat
A matrix with 71 rows and 2 columns.
Table 5.1 of Gelman, A., Carlin, J. B., Stern, H. S. Dunson, D. B., Vehtari, A. and Rubin, D. B. (2014) Bayesian Data Analysis, Chapman & Hall / CRC. http://www.stat.columbia.edu/~gelman/book/data/rats.asc
Tarone, R. E. (1982) The use of historical information in testing for a trend in proportions. Biometrics, 38, 215-220. doi:10.2307/2530304
Constructs a user-defined prior distribution for use as the argument
prior
in hef
or hanova1
.
set_user_prior( prior, ..., model = c("beta_binom", "gamma_pois", "anova1"), anova_d = 2 )
set_user_prior( prior, ..., model = c("beta_binom", "gamma_pois", "anova1"), anova_d = 2 )
prior |
An R function returning the log of the prior density
for of (perhaps a subset of) the hyperparameter vector |
... |
Further arguments giving the names and values of any
parameters involved in the function |
model |
A character string. Abbreviated name of the model:
"beta_binom" for beta-binomial and "gamma_pois" for gamma-Poisson
(see |
anova_d |
An integer scalar. Only relevant if |
For details of the hyperparameters in see the
Details section of
hef
for the models
beta_binom
and gamma_pois
and of hanova1
for the model anova1
.
A list of class "bang_prior"
. Will contain the component
prior
, the user-supplied function to evaluate the log of the prior,
and any arguments supplied in ....
hef
for hierarchical exponential family models.
hanova1
for hierarchical one-way analysis of
variance (ANOVA).
# User-defined prior, passing parameters # (equivalent to prior = "gamma" with hpars = c(1, 0.01, 1, 0.01)) user_prior <- function(x, hpars) { return(dexp(x[1], hpars[1], log = TRUE) + dexp(x[2], hpars[2], log = TRUE)) } user_prior_fn <- set_user_prior(user_prior, hpars = c(0.01, 0.01))
# User-defined prior, passing parameters # (equivalent to prior = "gamma" with hpars = c(1, 0.01, 1, 0.01)) user_prior <- function(x, hpars) { return(dexp(x[1], hpars[1], log = TRUE) + dexp(x[2], hpars[2], log = TRUE)) } user_prior_fn <- set_user_prior(user_prior, hpars = c(0.01, 0.01))
Simulates nrep
draws from the posterior predictive distribution
of the beta-binomial model described in hef
.
This function is called within hef
when the argument
nrep
is supplied.
sim_pred_beta_binom(theta_sim_vals, data, nrep)
sim_pred_beta_binom(theta_sim_vals, data, nrep)
theta_sim_vals |
A numeric matrix with |
data |
A 2-column numeric matrix: the numbers of successes in column 1 and the corresponding numbers of trials in column 2. |
nrep |
A numeric scalar. The number of replications of the original
dataset simulated from the posterior predictive distribution.
If |
A numeric matrix with nrep
columns. Each column contains
a draw from the posterior predictive distribution of the number of
successes.
rat_res <- hef(model = "beta_binom", data = rat) rat_sim_pred <- sim_pred_beta_binom(rat_res$theta_sim_vals, rat, 50)
rat_res <- hef(model = "beta_binom", data = rat) rat_sim_pred <- sim_pred_beta_binom(rat_res$theta_sim_vals, rat, 50)
Simulates nrep
draws from the posterior predictive distribution
of the beta-binomial model described in hef
.
This function is called within hef
when the argument
nrep
is supplied.
sim_pred_gamma_pois(theta_sim_vals, data, nrep)
sim_pred_gamma_pois(theta_sim_vals, data, nrep)
theta_sim_vals |
A numeric matrix with |
data |
A 2-column numeric matrix: the numbers of successes in column 1 and the corresponding numbers of trials in column 2. |
nrep |
A numeric scalar. The number of replications of the original
dataset simulated from the posterior predictive distribution.
If |
A numeric matrix with nrep
columns. Each column contains
a draw from the posterior predictive distribution of the number of
successes.
pump_res <- hef(model = "gamma_pois", data = pump) pump_sim_pred <- sim_pred_gamma_pois(pump_res$theta_sim_vals, pump, 50)
pump_res <- hef(model = "gamma_pois", data = pump) pump_sim_pred <- sim_pred_gamma_pois(pump_res$theta_sim_vals, pump, 50)
Simulates nrep
draws from the posterior predictive distribution
of the one-way hierarchical ANOVA model described in hanova1
.
This function is called within hanova1
when the argument
nrep
is supplied.
sim_pred_hanova1(theta_sim_vals, sim_vals, fac, nrep)
sim_pred_hanova1(theta_sim_vals, sim_vals, fac, nrep)
theta_sim_vals |
A numeric matrix with |
sim_vals |
A numeric matrix with |
fac |
The argument |
nrep |
A numeric scalar. The number of replications of the original
dataset simulated from the posterior predictive distribution.
If |
A numeric matrix with nrep
columns. Each column contains
a draw from the posterior predictive distribution of the number of
successes.
RCP26_2 <- temp2[temp2$RCP == "rcp26", ] temp_res <- hanova1(resp = RCP26_2[, 1], fac = RCP26_2[, 2]) sim_pred <- sim_pred_hanova1(temp_res$theta_sim_vals, temp_res$sim_vals, RCP26_2[, 2], 50)
RCP26_2 <- temp2[temp2$RCP == "rcp26", ] temp_res <- hanova1(resp = RCP26_2[, 1], fac = RCP26_2[, 2]) sim_pred <- sim_pred_hanova1(temp_res$theta_sim_vals, temp_res$sim_vals, RCP26_2[, 2], 50)
summary
method for class "hef".
## S3 method for class 'hef' summary( object, ..., params = c("hyper", "pop"), which_pop = 1:ncol(object$theta_sim_vals) )
## S3 method for class 'hef' summary( object, ..., params = c("hyper", "pop"), which_pop = 1:ncol(object$theta_sim_vals) )
object |
an object of class "hef", a result of a call to
|
... |
Additional arguments passed on to |
params |
A character scalar. If If |
which_pop |
An integer vector. If |
# Beta-binomial model, rat data rat_res <- hef(model = "beta_binom", data = rat) # Posterior summaries of the hyperparameters alpha and beta summary(rat_res) # Posterior summaries of the binomial probability for rats 1 to 3 summary(rat_res, params = "pop", which_pop = 1:3)
# Beta-binomial model, rat data rat_res <- hef(model = "beta_binom", data = rat) # Posterior summaries of the hyperparameters alpha and beta summary(rat_res) # Posterior summaries of the binomial probability for rats 1 to 3 summary(rat_res, params = "pop", which_pop = 1:3)
Indices of global temperature change from late 20th century (1970-1999) to mid 21st century (2020-2049) based on data produced by the Fifth Coupled Model Intercomparison Project (CMIP5).
temp1
temp1
A data frame with 270 rows and 4 columns.
Column 1, index
: anomaly of 2020-2049 mean relative to
the 1970-1999 mean.
Column 2, GCM
: Abbreviated name of General Circulation
Model.
Column 3, RCP
: Representative Concentration Pathway.
One of rcp26, rcp45, rcp60, rcp85.
Column 4, run
: Simulation run number.
The data frame temp1
data frame has 270 rows and 4 columns.
Each row relates to a climate projection run from one of 38 different
General Circulation Models (GCMs) under a particular
Representative Concentration Pathway (RCP).
Use table(temp1[, c("GCM", "RCP")])
to see the numbers of
runs under each RCP for each GCM.
See Van Vuuren et al (2011) for an overview of RCPs
and Northrop and Chandler (2014) for analyses of a similar
older dataset (CMIP3).
Column 1 contains the anomaly of the mean global temperature over
the time period 2020-2049 relative to the mean global temperature
over 1970-1999, i.e. the latter subtracted from the former.
Column 2 contains an abbreviation for the name of the climate modelling
research group and the GCM.
Column 3 contains the RCP in the format rcpxx
where xx
is a radiative forcing level resulting from an anticipated future
greenhouse gas emissions.
Column 4 is the simulation run number.
The raw data from which the indices are calculated are monthly CMIP5 scenario runs for global surface air temperature (tas) downloaded from the KNMI Climate Explorer (https://climexp.knmi.nl/) on 4/3/2015.
Northrop, P.J. and R.E. Chandler (2014). Quantifying Sources of Uncertainty in Projections of Future Climate. Journal of Climate, 27, 8793-8808. doi:10.1175/JCLI-D-14-00265.1
Van Vuuren, D. P., Edmonds, J., Kainuma, M., Riahi, K. Thomson, A., Hibbard, K., Hurtt, G. C., Kram, T., Krey, V., Lamarque, J.-F. (2011). The representative concentration pathways: an overview. Climatic change, 109, 5-31. doi:10.1007/s10584-011-0148-z
Indices of global temperature change from late 20th century (1970-1999) to late 21st century (2069-2098) based on data produced by the Fifth Coupled Model Intercomparison Project (CMIP5).
temp2
temp2
A data frame with 270 rows and 4 columns.
Column 1, index
: anomaly of 2069-2098 mean relative to
the 1970-1999 mean.
Column 2, GCM
: Abbreviated name of General Circulation
Model.
Column 3, RCP
: Representative Concentration Pathway.
One of rcp26, rcp45, rcp60, rcp85.
Column 4, run
: Simulation run number.
The data frame temp2
data frame has 270 rows and 4 columns.
Each row relates to a climate projection run from one of 38 different
General Circulation Models (GCMs) under a particular
Representative Concentration Pathway (RCP).
Use table(temp2[, c("GCM", "RCP")])
to see the numbers of
runs under each RCP for each GCM.
See Van Vuuren et al (2011) for an overview of RCPs
and Northrop and Chandler (2014) for analyses of a similar
older dataset (CMIP3).
Column 1 contains the anomaly of the mean global temperature over
the time period 2069-2098 relative to the mean global temperature
over 1970-1999, i.e. the latter subtracted from the former.
Column 2 contains an abbreviation for the name of the climate modelling
research group and the GCM.
Column 3 contains the RCP in the format rcpxx
where xx
is a radiative forcing level resulting from an anticipated future
greenhouse gas emissions.
Column 4 is the simulation run number.
The raw data from which the indices are calculated are monthly CMIP5 scenario runs for global surface air temperature (tas) downloaded from the KNMI Climate Explorer (https://climexp.knmi.nl/) on 4/3/2015.
Northrop, P.J. and R.E. Chandler (2014). Quantifying Sources of Uncertainty in Projections of Future Climate. Journal of Climate, 27, 8793-8808. doi:10.1175/JCLI-D-14-00265.1
Van Vuuren, D. P., Edmonds, J., Kainuma, M., Riahi, K. Thomson, A., Hibbard, K., Hurtt, G. C., Kram, T., Krey, V., Lamarque, J.-F. (2011). The representative concentration pathways: an overview. Climatic change, 109, 5-31. doi:10.1007/s10584-011-0148-z
Data from an experiment to study weight gained by 10 rats fed on four different diets, defined by a combination of the amount of protein (low and high) and by the source of protein (beef and cereal).
weight_gain
weight_gain
A data frame with 40 rows and 3 columns.
Column 1, source
: source of protein, a factor with levels
Beef
and Cereal
.
Column 2, type
: amount of protein, a factor with levels
High
and Low
.
Column 3, weightgain
: weight gained, in grams.
D. J. Hand, A. D. Lunn, K. J. McConway, and E. Ostrowski (1994). A Handbook of Small Datasets, Chapman and Hall/CRC, London.