Title: | Estimation of the Extremal Index |
---|---|
Description: | Performs frequentist inference for the extremal index of a stationary time series. Two types of methodology are used. One type is based on a model that relates the distribution of block maxima to the marginal distribution of series and leads to the semiparametric maxima estimators described in Northrop (2015) <doi:10.1007/s10687-015-0221-5> and Berghaus and Bucher (2018) <doi:10.1214/17-AOS1621>. Sliding block maxima are used to increase precision of estimation. A graphical block size diagnostic is provided. The other type of methodology uses a model for the distribution of threshold inter-exceedance times (Ferro and Segers (2003) <doi:10.1111/1467-9868.00401>). Three versions of this type of approach are provided: the iterated weight least squares approach of Suveges (2007) <doi:10.1007/s10687-007-0034-2>, the K-gaps model of Suveges and Davison (2010) <doi:10.1214/09-AOAS292> and a similar approach of Holesovsky and Fusek (2020) <doi:10.1007/s10687-020-00374-3> that we refer to as D-gaps. For the K-gaps and D-gaps models this package allows missing values in the data, can accommodate independent subsets of data, such as monthly or seasonal time series from different years, and can incorporate information from right-censored inter-exceedance times. Graphical diagnostics for the threshold level and the respective tuning parameters K and D are provided. |
Authors: | Paul J. Northrop [aut, cre, cph], Constantinos Christodoulides [aut, cph] |
Maintainer: | Paul J. Northrop <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.2.3 |
Built: | 2024-11-23 04:49:07 UTC |
Source: | https://github.com/paulnorthrop/exdex |
The extremal index is a measure of the degree of local
dependence in the extremes of a stationary process. The exdex
package performs frequentist inference about
using the
methodologies proposed in Northrop (2015), Berghaus and Bucher (2018),
Suveges (2007), Suveges and Davison (2010) and Holesovsky and Fusek (2020).
Functions to implement four estimators of the extremal index are provided, namely
spm
: semiparametric maxima estimator, using block
maxima: (Northrop, 2015; Berghaus and Bucher, 2018)
kgaps
: -gaps estimator, using threshold
inter-exceedance times (Suveges and Davison, 2010)
dgaps
: -gaps estimator, using threshold
inter-exceedance times (Holesovsky and Fusek, 2020))
iwls
: iterated weighted least squares estimator,
using threshold inter-exceedance times: (Suveges, 2007)
The functions choose_b
, choose_uk
and
choose_ud
provide graphical diagnostics for choosing the
respective tuning parameters of the semiparametric maxima, -gaps and
-gaps estimators.
For the -gaps and
-gaps models the 'exdex' package allows
missing values in the data, can accommodate independent subsets of data,
such as monthly or seasonal time series from different years, and can
incorporate information from censored inter-exceedance times.
See vignette("exdex-vignette", package = "exdex")
for an
overview of the package.
Maintainer: Paul J. Northrop [email protected] [copyright holder]
Authors:
Constantinos Christodoulides [copyright holder]
Berghaus, B., Bucher, A. (2018) Weak convergence of a pseudo maximum likelihood estimator for the extremal index. Ann. Statist. 46(5), 2307-2335. doi:10.1214/17-AOS1621
Holesovsky, J. and Fusek, M. Estimation of the extremal index using censored distributions. Extremes 23, 197-213 (2020). doi:10.1007/s10687-020-00374-3
Northrop, P. J. (2015) An efficient semiparametric maxima estimator of the extremal index. Extremes 18(4), 585-603. doi:10.1007/s10687-015-0221-5
Suveges, M. (2007) Likelihood estimation of the extremal index. Extremes, 10, 41-55. doi:10.1007/s10687-007-0034-2
Suveges, M. and Davison, A. C. (2010) Model misspecification in peaks over threshold analysis, Annals of Applied Statistics, 4(1), 203-221. doi:10.1214/09-AOAS292
spm
: semiparametric maxima estimator.
kgaps
: -gaps estimator.
dgaps
: -gaps estimator.
iwls
: iterated weighted least squares estimator.
choose_b
, choose_ud
and
choose_ud
for choosing tuning parameters.
newlyn
, sp500
and
cheeseboro
for example datasets.
Calculates the (sliding) maxima of all blocks of b
contiguous values
and all sets of the maxima of disjoint blocks of b
contiguous values
in the vector x
. This provides the first step of computations in
spm
.
all_max_rcpp(x, b = 1, which_dj = c("all", "first", "last"), ...)
all_max_rcpp(x, b = 1, which_dj = c("all", "first", "last"), ...)
x |
A numeric vector of raw observations. |
b |
A numeric scalar. The block size. |
which_dj |
A character scalar. Determines Which sets of disjoint
maxima are calculated: |
... |
Further arguments to be passed to
|
Sliding maxima. The function
roll_max
in the RcppRoll
package is used.
Disjoint maxima. If n = length(x)
is an integer
multiple of b
, or if which_dj = "first"
or
which_dj = "last"
then only one set of n / b
disjoint
block maxima are returned.
Otherwise, n - floor(n / b) * b + 1
sets of floor(n / b)
disjoint block maxima are returned. Set i
are the disjoint maxima
of x[i:(i + floor(n / b) * b - 1)]
. That is, all possible sets
of contiguous disjoint maxima achieving the maxima length of
floor(n / b)
are calculated.
In both instances na.rm = TRUE
is passed to max
so
that blocks containing missing values produce a non-missing result.
Also returned are the values in x
that contribute to each set
of block maxima.
A list containing
ys |
a numeric vector containing one set of sliding block maxima. |
xs |
a numeric vector containing the values that
contribute to |
yd |
if |
xd |
if |
spm
for semiparametric estimation of the
extremal index based on block maxima.
x <- 1:11 all_max_rcpp(x, 3) all_max_rcpp(x, 3, which_dj = "first") all_max_rcpp(x, 3, which_dj = "last")
x <- 1:11 all_max_rcpp(x, 3) all_max_rcpp(x, 3, which_dj = "first") all_max_rcpp(x, 3, which_dj = "last")
The matrix cheeseboro
contains hourly maximum wind gusts (in miles
per hour) recorded at the Cheeseboro weather station near Thousand Oaks,
Southern California, USA during the month of January over the period
2000-2009. These data are analysed in Reich and Shaby (2016).
cheeseboro
cheeseboro
A 744 by 10 numeric matrix. Column i
contains the hourly
maximum wind gusts (in miles per hour) from Cheeseboro in the year
2000 + i
- 1. The columns are named 2000, 2001, ..., 2009 and the
rows are named dayj
hourk
, where j
is the day of the
month and k
the hour of the day.
There are 42 missing values, located in 6 of the 10 years, namely 2000-2003 and 2005-2006.
The Remote Automated Weather Stations USA Climate Archive at https://raws.dri.edu/, more specifically the Daily Summaries of the Cheeseboro page.
Reich, B. J. and Shaby, B. A. (2016). 'Time series of Extremes', in Dey, D. K. and Yan, J. (eds.) Extreme Value Modeling and Risk Analysis. New York: Chapman and Hall/CRC, pp. 131-151.
Creates data for a plot to aid the choice of the block length b
to
supply to spm
. The general idea is to select the smallest
value of b
above which estimates of the extremal index
appear to be constant with respect to
b
, taking into account sampling
variability. plot.choose_b
creates the plot.
choose_b( data, b, bias_adjust = c("BB3", "BB1", "N", "none"), constrain = TRUE, varN = TRUE, level = 0.95, interval_type = c("norm", "lik"), conf_scale = c("theta", "log"), type = c("vertical", "cholesky", "spectral", "none") )
choose_b( data, b, bias_adjust = c("BB3", "BB1", "N", "none"), constrain = TRUE, varN = TRUE, level = 0.95, interval_type = c("norm", "lik"), conf_scale = c("theta", "log"), type = c("vertical", "cholesky", "spectral", "none") )
data |
A numeric vector of raw data. No missing values are allowed. |
b |
A numeric scalar. The block size. |
bias_adjust |
A character scalar. Is bias-adjustment of the
raw estimate of |
constrain |
A logical scalar. If |
varN |
A logical scalar. If |
level |
A numeric scalar in (0, 1). The confidence level required. |
interval_type |
A character scalar: |
conf_scale |
A character scalar. If If Any bias-adjustment requested in the original call to |
type |
A character scalar. The argument |
For each block size in b
the extremal index
is estimated using
spm
. The estimates of
approximate
conf
% confidence intervals for are
stored for plotting (by
plot.choose_b
)
to produce a simple graphical diagnostic to inform the choice of
block size. This plot is used to choose a block size above which the
underlying value of may be approximately constant.
This is akin to a threshold stability plot: see Chapter 4 of Coles (2001),
for example.
The nature of the calculation of the sampling variances of the estimates
of (see
spm
for details) means that
choose_b
may be a little slow to run if b
contains many
values, particularly if some of them are small.
For very small block sizes it may not be possible to estimate the
confidence intervals. See Details in spm
.
For any such block sizes the intervals will be missing from the plot.
An object of class c("choose_b", "exdex")
containing
theta_sl , theta_dj
|
numeric |
lower_sl , lower_dj
|
Similarly for the lower limits of the confidence intervals. |
upper_sl , upper_dj
|
Similarly for the upper limits of the confidence intervals. |
b |
the input |
call |
the call to |
Coles, S. G. (2001) An Introduction to Statistical Modeling of Extreme Values, Springer-Verlag, London. doi:10.1007/978-1-4471-3675-0_3
Northrop, P. J. (2015) An efficient semiparametric maxima estimator of the extremal index. Extremes 18(4), 585-603. doi:10.1007/s10687-015-0221-5
Berghaus, B., Bucher, A. (2018) Weak convergence of a pseudo maximum likelihood estimator for the extremal index. Ann. Statist. 46(5), 2307-2335. doi:10.1214/17-AOS1621
plot.choose_b
to produce the block length diagnostic
plot.
# Newlyn seas surges # Plot like the top left of Northrop (2015) # Remove the last 14 values because 2880 has lots of factors b_vals <- c(2,3,4,5,6,8,9,10,12,15,16,18,20,24,30,32,36,40,45,48,54,60) res <- choose_b(newlyn[1:2880], b_vals) # Some b are too small for the sampling variance of the sliding blocks # estimator to be estimated plot(res) plot(res, estimator = "BB2018") plot(res, maxima = "disjoint") # S&P 500 index: similar to Berghaus and Bucher (2018), Fig 4 top left b_vals <- c(10, seq(from = 25, to = 350, by = 25), 357) res500 <- choose_b(sp500, b_vals) plot(res500, ylim = c(0, 1)) plot(res500, estimator = "BB2018", ylim = c(0, 1))
# Newlyn seas surges # Plot like the top left of Northrop (2015) # Remove the last 14 values because 2880 has lots of factors b_vals <- c(2,3,4,5,6,8,9,10,12,15,16,18,20,24,30,32,36,40,45,48,54,60) res <- choose_b(newlyn[1:2880], b_vals) # Some b are too small for the sampling variance of the sliding blocks # estimator to be estimated plot(res) plot(res, estimator = "BB2018") plot(res, maxima = "disjoint") # S&P 500 index: similar to Berghaus and Bucher (2018), Fig 4 top left b_vals <- c(10, seq(from = 25, to = 350, by = 25), 357) res500 <- choose_b(sp500, b_vals) plot(res500, ylim = c(0, 1)) plot(res500, estimator = "BB2018", ylim = c(0, 1))
and runs parameter
diagnostic for the
-gaps
estimatorCreates data for a plot to aid the choice of the threshold and
run parameter for the
-gaps estimator (see
dgaps
). plot.choose_ud
creates the plot.
choose_ud(data, u, D = 1, inc_cens = TRUE)
choose_ud(data, u, D = 1, inc_cens = TRUE)
data |
A numeric vector or numeric matrix of raw data. If If |
u , D
|
Numeric vectors. Any values in |
inc_cens |
A logical scalar indicating whether or not to include
contributions from right-censored inter-exceedance times, relating to the
first and last observations. It is known that these times are greater
than or equal to the time observed.
If |
For each combination of threshold in u
and
in
D
the functions dgaps
and dgaps_imt
are called in order to estimate and to perform the
information matrix test of Holesovsky and Fusek (2020).
An object (a list) of class c("choose_ud", "exdex")
containing
imt |
an object of class |
theta |
a |
Holesovsky, J. and Fusek, M. Estimation of the extremal index using censored distributions. Extremes 23, 197-213 (2020). doi:10.1007/s10687-020-00374-3
dgaps
for maximum likelihood estimation of the
extremal index using the
-gaps model.
dgaps_imt
for the information matrix test under the
-gaps model
plot.choose_ud
to produce the diagnostic plot.
### S&P 500 index # Multiple thresholds and left-censoring parameters u <- quantile(sp500, probs = seq(0.2, 0.9, by = 0.1)) imt_theta <- choose_ud(sp500, u = u, D = 1:5) plot(imt_theta) plot(imt_theta, uprob = TRUE) plot(imt_theta, y = "theta") # One left-censoring parameter D, many thresholds u u <- quantile(sp500, probs = seq(0.2, 0.9, by = 0.1)) imt_theta <- choose_ud(sp500, u = u, D = 1) plot(imt_theta) plot(imt_theta, y = "theta") # One threshold u, many left-censoring parameters D u <- quantile(sp500, probs = 0.9) imt_theta <- choose_ud(sp500, u = u, D = 1:5) plot(imt_theta) plot(imt_theta, y = "theta") ### Newlyn sea surges u <- quantile(newlyn, probs = seq(0.1, 0.9, by = 0.1)) imt_theta <- choose_ud(newlyn, u = u, D = 1:5) plot(imt_theta, uprob = TRUE) ### Cheeseboro wind gusts (a matrix containing some NAs) probs <- c(seq(0.5, 0.95, by = 0.05), 0.99) u <- quantile(cheeseboro, probs = probs, na.rm = TRUE) imt_theta <- choose_ud(cheeseboro, u, D = 1:6) plot(imt_theta, uprob = FALSE, lwd = 2) ### Uccle July temperatures probs <- c(seq(0.7, 0.95, by = 0.05), 0.99) u <- quantile(uccle720m, probs = probs, na.rm = TRUE) imt_theta <- choose_ud(uccle720m, u, D = 1:5) plot(imt_theta, uprob = TRUE, lwd = 2)
### S&P 500 index # Multiple thresholds and left-censoring parameters u <- quantile(sp500, probs = seq(0.2, 0.9, by = 0.1)) imt_theta <- choose_ud(sp500, u = u, D = 1:5) plot(imt_theta) plot(imt_theta, uprob = TRUE) plot(imt_theta, y = "theta") # One left-censoring parameter D, many thresholds u u <- quantile(sp500, probs = seq(0.2, 0.9, by = 0.1)) imt_theta <- choose_ud(sp500, u = u, D = 1) plot(imt_theta) plot(imt_theta, y = "theta") # One threshold u, many left-censoring parameters D u <- quantile(sp500, probs = 0.9) imt_theta <- choose_ud(sp500, u = u, D = 1:5) plot(imt_theta) plot(imt_theta, y = "theta") ### Newlyn sea surges u <- quantile(newlyn, probs = seq(0.1, 0.9, by = 0.1)) imt_theta <- choose_ud(newlyn, u = u, D = 1:5) plot(imt_theta, uprob = TRUE) ### Cheeseboro wind gusts (a matrix containing some NAs) probs <- c(seq(0.5, 0.95, by = 0.05), 0.99) u <- quantile(cheeseboro, probs = probs, na.rm = TRUE) imt_theta <- choose_ud(cheeseboro, u, D = 1:6) plot(imt_theta, uprob = FALSE, lwd = 2) ### Uccle July temperatures probs <- c(seq(0.7, 0.95, by = 0.05), 0.99) u <- quantile(uccle720m, probs = probs, na.rm = TRUE) imt_theta <- choose_ud(uccle720m, u, D = 1:5) plot(imt_theta, uprob = TRUE, lwd = 2)
and runs parameter
diagnostic for the
-gaps
estimatorCreates data for a plot to aid the choice of the threshold and
run parameter for the
-gaps estimator (see
kgaps
). plot.choose_uk
creates the plot.
choose_uk(data, u, k = 1, inc_cens = TRUE)
choose_uk(data, u, k = 1, inc_cens = TRUE)
data |
A numeric vector or numeric matrix of raw data. If If |
u , k
|
Numeric vectors. Any values in |
inc_cens |
A logical scalar indicating whether or not to include contributions from censored inter-exceedance times, relating to the first and last observations. See Attalides (2015) for details. |
For each combination of threshold in u
and
in
k
the functions kgaps
and kgaps_imt
are called in order to estimate and to perform the
information matrix test of Suveges and Davison (2010).
An object (a list) of class c("choose_uk", "exdex")
containing
imt |
an object of class |
theta |
a |
Suveges, M. and Davison, A. C. (2010) Model misspecification in peaks over threshold analysis, Annals of Applied Statistics, 4(1), 203-221. doi:10.1214/09-AOAS292
Attalides, N. (2015) Threshold-based extreme value modelling, PhD thesis, University College London. https://discovery.ucl.ac.uk/1471121/1/Nicolas_Attalides_Thesis.pdf
kgaps
for maximum likelihood estimation of the
extremal index using the
-gaps model.
kgaps_imt
for the information matrix test under the
-gaps model
plot.choose_uk
to produce the diagnostic plot.
### S&P 500 index # Multiple thresholds and run parameters u <- quantile(sp500, probs = seq(0.1, 0.9, by = 0.1)) imt_theta <- choose_uk(sp500, u = u, k = 1:5) plot(imt_theta) plot(imt_theta, uprob = TRUE) plot(imt_theta, y = "theta") # One run parameter K, many thresholds u u <- quantile(sp500, probs = seq(0.1, 0.9, by = 0.1)) imt_theta <- choose_uk(sp500, u = u, k = 1) plot(imt_theta) plot(imt_theta, y = "theta") # One threshold u, many run parameters K u <- quantile(sp500, probs = 0.9) imt_theta <- choose_uk(sp500, u = u, k = 1:5) plot(imt_theta) plot(imt_theta, y = "theta") ### Newlyn sea surges u <- quantile(newlyn, probs = seq(0.1, 0.9, by = 0.1)) imt_theta <- choose_uk(newlyn, u = u, k = 1:5) plot(imt_theta, uprob = TRUE) ### Cheeseboro wind gusts (a matrix containing some NAs) probs <- c(seq(0.5, 0.95, by = 0.05), 0.99) u <- quantile(cheeseboro, probs = probs, na.rm = TRUE) imt_theta <- choose_uk(cheeseboro, u, k = 1:6) plot(imt_theta, uprob = FALSE, lwd = 2) ### Uccle July temperatures probs <- c(seq(0.7, 0.95, by = 0.05), 0.99) u <- quantile(uccle720m, probs = probs, na.rm = TRUE) imt_theta <- choose_uk(uccle720m, u, k = 1:5) plot(imt_theta, uprob = TRUE, lwd = 2)
### S&P 500 index # Multiple thresholds and run parameters u <- quantile(sp500, probs = seq(0.1, 0.9, by = 0.1)) imt_theta <- choose_uk(sp500, u = u, k = 1:5) plot(imt_theta) plot(imt_theta, uprob = TRUE) plot(imt_theta, y = "theta") # One run parameter K, many thresholds u u <- quantile(sp500, probs = seq(0.1, 0.9, by = 0.1)) imt_theta <- choose_uk(sp500, u = u, k = 1) plot(imt_theta) plot(imt_theta, y = "theta") # One threshold u, many run parameters K u <- quantile(sp500, probs = 0.9) imt_theta <- choose_uk(sp500, u = u, k = 1:5) plot(imt_theta) plot(imt_theta, y = "theta") ### Newlyn sea surges u <- quantile(newlyn, probs = seq(0.1, 0.9, by = 0.1)) imt_theta <- choose_uk(newlyn, u = u, k = 1:5) plot(imt_theta, uprob = TRUE) ### Cheeseboro wind gusts (a matrix containing some NAs) probs <- c(seq(0.5, 0.95, by = 0.05), 0.99) u <- quantile(cheeseboro, probs = probs, na.rm = TRUE) imt_theta <- choose_uk(cheeseboro, u, k = 1:6) plot(imt_theta, uprob = FALSE, lwd = 2) ### Uccle July temperatures probs <- c(seq(0.7, 0.95, by = 0.05), 0.99) u <- quantile(uccle720m, probs = probs, na.rm = TRUE) imt_theta <- choose_uk(uccle720m, u, k = 1:5) plot(imt_theta, uprob = TRUE, lwd = 2)
Calculates maximum likelihood estimates of the extremal index
based on a model for threshold inter-exceedances times of
Holesovsky and Fusek (2020). We refer to this as the
-gaps model,
because it uses a tuning parameter
, whereas the related
-gaps
model of Suveges and Davison (2010) has a tuning parameter
.
dgaps(data, u, D = 1, inc_cens = TRUE)
dgaps(data, u, D = 1, inc_cens = TRUE)
data |
A numeric vector or numeric matrix of raw data. If If |
u |
A numeric scalar. Extreme value threshold applied to data. |
D |
A numeric scalar. The censoring parameter |
inc_cens |
A logical scalar indicating whether or not to include
contributions from right-censored inter-exceedance times, relating to the
first and last observations. It is known that these times are greater
than or equal to the time observed.
If |
If inc_cens = FALSE
then the maximum likelihood estimate of
the extremal index under the
-gaps model of
Holesovsky and Fusek (2020) is calculated. Under this model
inter-exceedance times that are less than or equal to
are
left-censored, as a strategy to mitigate model mis-specification resulting
from the fact that inter-exceedance times that are equal to 0 are expected
under the model but only positive inter-exceedance times can be observed
in practice.
If inc_cens = TRUE
then information from the right-censored
first and last inter-exceedance times are also included in the likelihood
to be maximized.
For an explanation of the idea see Attalides (2015). The form of the
log-likelihood is given in the Details section of
dgaps_stat
.
It is possible that the estimate of is equal to 1, and also
possible that it is equal to 0.
dgaps_stat
explains the
respective properties of the data that cause these events to occur.
An object (a list) of class c("dgaps", "exdex")
containing
theta |
The maximum likelihood estimate (MLE) of
|
se |
The estimated standard error of the MLE, calculated
using an algebraic expression for the observed information. If the
estimate of |
se_exp |
The estimated standard error of the MLE,
calculated using an algebraic expression for the expected information.
If the estimate of |
ss |
The list of summary statistics returned from
|
D , u , inc_cens
|
The input values of |
max_loglik |
The value of the log-likelihood at the MLE. |
call |
The call to |
Holesovsky, J. and Fusek, M. Estimation of the extremal index using censored distributions. Extremes 23, 197-213 (2020). doi:10.1007/s10687-020-00374-3
Suveges, M. and Davison, A. C. (2010) Model misspecification in peaks over threshold analysis, Annals of Applied Statistics, 4(1), 203-221. doi:10.1214/09-AOAS292
dgaps_confint
to estimate confidence intervals
for .
dgaps_methods
for S3 methods for "dgaps"
objects.
dgaps_imt
for the information matrix test, which
may be used to inform the choice of the pair (u, D
).
choose_ud
for a diagnostic plot based on
dgaps_imt
.
dgaps_stat
for the calculation of sufficient
statistics for the -gaps model.
### S&P 500 index u <- quantile(sp500, probs = 0.60) theta <- dgaps(sp500, u = u, D = 1) theta summary(theta) coef(theta) nobs(theta) vcov(theta) logLik(theta) ### Newlyn sea surges u <- quantile(newlyn, probs = 0.60) theta <- dgaps(newlyn, u = u, D = 2) theta summary(theta) ### Uccle July temperatures # Using vector input, which merges data from different years u <- quantile(uccle720$temp, probs = 0.9, na.rm = TRUE) theta <- dgaps(uccle720$temp, u = u, D = 2) theta # Using matrix input to separate data from different years u <- quantile(uccle720m, probs = 0.9, na.rm = TRUE) theta <- dgaps(uccle720m, u = u, D = 2) theta
### S&P 500 index u <- quantile(sp500, probs = 0.60) theta <- dgaps(sp500, u = u, D = 1) theta summary(theta) coef(theta) nobs(theta) vcov(theta) logLik(theta) ### Newlyn sea surges u <- quantile(newlyn, probs = 0.60) theta <- dgaps(newlyn, u = u, D = 2) theta summary(theta) ### Uccle July temperatures # Using vector input, which merges data from different years u <- quantile(uccle720$temp, probs = 0.9, na.rm = TRUE) theta <- dgaps(uccle720$temp, u = u, D = 2) theta # Using matrix input to separate data from different years u <- quantile(uccle720m, probs = 0.9, na.rm = TRUE) theta <- dgaps(uccle720m, u = u, D = 2) theta
for "dgaps"
objectsconfint
method for objects of class c("dgaps", "exdex")
.
Computes confidence intervals for based on an object returned
from
dgaps
. Two types of interval may be returned:
(a) intervals based on approximate large-sample normality of the estimator
of , which are symmetric about the point estimate,
and (b) likelihood-based intervals. The
plot
method plots the
log-likelihood for , with the required confidence interval
indicated on the plot.
## S3 method for class 'dgaps' confint( object, parm = "theta", level = 0.95, interval_type = c("both", "norm", "lik"), conf_scale = c("theta", "log"), constrain = TRUE, se_type = c("observed", "expected"), ... ) ## S3 method for class 'confint_dgaps' plot(x, ...) ## S3 method for class 'confint_dgaps' print(x, ...)
## S3 method for class 'dgaps' confint( object, parm = "theta", level = 0.95, interval_type = c("both", "norm", "lik"), conf_scale = c("theta", "log"), constrain = TRUE, se_type = c("observed", "expected"), ... ) ## S3 method for class 'confint_dgaps' plot(x, ...) ## S3 method for class 'confint_dgaps' print(x, ...)
object |
An object of class |
parm |
Specifies which parameter is to be given a confidence interval.
Here there is only one option: the extremal index |
level |
The confidence level required. A numeric scalar in (0, 1). |
interval_type |
A character scalar: |
conf_scale |
A character scalar. If If |
constrain |
A logical scalar. If |
se_type |
A character scalar. Should the confidence intervals for the
|
... |
|
x |
an object of class |
Two type of interval are calculated: (a) an interval based on the
approximate large sample normality of the estimator of
(if
conf_scale = "theta"
) or of
(if
conf_scale = "log"
) and (b) a likelihood-based interval,
based on the approximate large sample chi-squared, with 1 degree of
freedom, distribution of the log-likelihood ratio statistic.
print.confint_dgaps
prints the matrix of confidence
intervals for .
A list of class c("confint_dgaps", "exdex") containing the following components.
cis |
A matrix with columns giving the lower and upper confidence
limits. These are labelled as (1 - level)/2 and 1 - (1 - level)/2 in
% (by default 2.5% and 97.5%).
The row names indicate the type of interval:
|
call |
The call to |
object |
The input object |
level |
The input |
plot.confint_dgaps
: nothing is returned.
print.confint_dgaps
: the argument x
, invisibly.
Holesovsky, J. and Fusek, M. Estimation of the extremal index using censored distributions. Extremes 23, 197-213 (2020). doi:10.1007/s10687-020-00374-3
dgaps
for estimation of the extremal index
using a semiparametric maxima method.
u <- quantile(newlyn, probs = 0.90) theta <- dgaps(newlyn, u) cis <- confint(theta) cis plot(cis)
u <- quantile(newlyn, probs = 0.90) theta <- dgaps(newlyn, u) cis <- confint(theta) cis plot(cis)
-gaps modelPerforms an information matrix test (IMT) to diagnose misspecification of
the -gaps model of Holesovsky and Fusek (2020).
dgaps_imt(data, u, D = 1, inc_cens = TRUE)
dgaps_imt(data, u, D = 1, inc_cens = TRUE)
data |
A numeric vector or numeric matrix of raw data. If If |
u , D
|
Numeric vectors. Any values in |
inc_cens |
A logical scalar indicating whether or not to include
contributions from right-censored inter-exceedance times, relating to the
first and last observations. See |
The general approach follows Suveges and Davison (2010).
The -gaps IMT is performed a over grid of all
combinations of threshold and
in the vectors
u
and D
. If the estimate of is 0 then the
IMT statistic, and its associated
-value is
NA
.
An object (a list) of class c("dgaps_imt", "exdex")
containing
imt |
A |
p |
A |
theta |
A |
u , D
|
The input |
Holesovsky, J. and Fusek, M. Estimation of the extremal index using censored distributions. Extremes 23, 197-213 (2020). doi:10.1007/s10687-020-00374-3
Suveges, M. and Davison, A. C. (2010) Model misspecification in peaks over threshold analysis, Annals of Applied Statistics, 4(1), 203-221. doi:10.1214/09-AOAS292
dgaps
for maximum likelihood estimation of the
extremal index using the
-gaps model.
### Newlyn sea surges u <- quantile(newlyn, probs = seq(0.1, 0.9, by = 0.1)) imt <- dgaps_imt(newlyn, u = u, D = 1:5) ### S&P 500 index u <- quantile(sp500, probs = seq(0.1, 0.9, by = 0.1)) imt <- dgaps_imt(sp500, u = u, D = 1:5) ### Cheeseboro wind gusts (a matrix containing some NAs) probs <- c(seq(0.5, 0.98, by = 0.025), 0.99) u <- quantile(cheeseboro, probs = probs, na.rm = TRUE) imt <- dgaps_imt(cheeseboro, u = u, D = 1:5) ### Uccle July temperatures probs <- c(seq(0.7, 0.98, by = 0.025), 0.99) u <- quantile(uccle720m, probs = probs, na.rm = TRUE) imt <- dgaps_imt(uccle720m, u = u, D = 1:5)
### Newlyn sea surges u <- quantile(newlyn, probs = seq(0.1, 0.9, by = 0.1)) imt <- dgaps_imt(newlyn, u = u, D = 1:5) ### S&P 500 index u <- quantile(sp500, probs = seq(0.1, 0.9, by = 0.1)) imt <- dgaps_imt(sp500, u = u, D = 1:5) ### Cheeseboro wind gusts (a matrix containing some NAs) probs <- c(seq(0.5, 0.98, by = 0.025), 0.99) u <- quantile(cheeseboro, probs = probs, na.rm = TRUE) imt <- dgaps_imt(cheeseboro, u = u, D = 1:5) ### Uccle July temperatures probs <- c(seq(0.7, 0.98, by = 0.025), 0.99) u <- quantile(uccle720m, probs = probs, na.rm = TRUE) imt <- dgaps_imt(uccle720m, u = u, D = 1:5)
-gaps information matrix testCalculates the components required to calculate the value of the information
matrix test under the -gaps model, using vector data input.
Called by
dgaps_imt
.
dgaps_imt_stat(data, theta, u, D = 1, inc_cens = TRUE)
dgaps_imt_stat(data, theta, u, D = 1, inc_cens = TRUE)
data |
A numeric vector of raw data. Missing values are allowed, but
they should not appear between non-missing values, that is, they only be
located at the start and end of the vector. Missing values are omitted
using |
theta |
A numeric scalar. An estimate of the extremal index
|
u |
A numeric scalar. Extreme value threshold applied to data. |
D |
A numeric scalar. The censoring parameter |
inc_cens |
A logical scalar indicating whether or not to include
contributions from right-censored inter-exceedance times, relating to the
first and last observations. See |
A list
relating the quantities given on pages 18-19 of
Suveges and Davison (2010). All but the last component are vectors giving
the contribution to the quantity from each -gap, evaluated at the
input value
theta
of .
ldj |
the derivative of the log-likelihood with respect to
|
Ij |
the observed information |
Jj |
the square of the score |
dj |
|
Ddj |
the derivative of |
n_dgaps |
the number of |
Holesovsky, J. and Fusek, M. Estimation of the extremal index using censored distributions. Extremes 23, 197-213 (2020). doi:10.1007/s10687-020-00374-3
"dgaps"
Methods for objects of class c("dgaps", "exdex")
returned from
dgaps
.
## S3 method for class 'dgaps' coef(object, ...) ## S3 method for class 'dgaps' vcov(object, type = c("observed", "expected"), ...) ## S3 method for class 'dgaps' nobs(object, ...) ## S3 method for class 'dgaps' logLik(object, ...) ## S3 method for class 'dgaps' print(x, digits = max(3L, getOption("digits") - 3L), ...) ## S3 method for class 'dgaps' summary( object, se_type = c("observed", "expected"), digits = max(3, getOption("digits") - 3L), ... ) ## S3 method for class 'summary.dgaps' print(x, ...)
## S3 method for class 'dgaps' coef(object, ...) ## S3 method for class 'dgaps' vcov(object, type = c("observed", "expected"), ...) ## S3 method for class 'dgaps' nobs(object, ...) ## S3 method for class 'dgaps' logLik(object, ...) ## S3 method for class 'dgaps' print(x, digits = max(3L, getOption("digits") - 3L), ...) ## S3 method for class 'dgaps' summary( object, se_type = c("observed", "expected"), digits = max(3, getOption("digits") - 3L), ... ) ## S3 method for class 'summary.dgaps' print(x, ...)
object |
and object of class |
... |
For |
type |
A character scalar. Should the estimate of the variance be based on the observed information or the expected information? |
x |
|
digits |
|
se_type |
A character scalar. Should the estimate of the standard error be based on the observed information or the expected information? |
coef.dgaps
. A numeric scalar: the estimate of the extremal index
.
vcov.dgaps
. A numeric matrix containing the
estimated variance of the estimator.
nobs.dgaps
. A numeric scalar: the number of inter-exceedance times
used in the fit. If x$inc_cens = TRUE
then this includes up to 2
censored observations.
logLik.dgaps
. An object of class "logLik"
: a numeric scalar
with value equal to the maximised log-likelihood. The returned object
also has attributes nobs
, the numbers of -gaps that
contribute to the log-likelihood and
"df"
, which is equal to the
number of total number of parameters estimated (1).
print.dgaps
. The argument x
, invisibly.
summary.dgaps
. Returns a list containing the list element
object$call
and a numeric matrix summary
giving the estimate
of the extremal index and the estimated standard error
(Std. Error).
print.summary.dgaps
. The argument x
, invisibly.
See the examples in dgaps
.
dgaps
for maximum likelihood estimation of the
extremal index using the
-gaps model.
confint.dgaps
for confidence intervals for
.
Calculates sufficient statistics for the the left-censored inter-exceedances
time -gaps model for the extremal index
.
dgaps_stat(data, u, q_u, D = 1, inc_cens = TRUE)
dgaps_stat(data, u, q_u, D = 1, inc_cens = TRUE)
data |
A numeric vector of raw data. No missing values are allowed. |
u |
A numeric scalar. Extreme value threshold applied to data. |
q_u |
A numeric scalar. An estimate of the probability with which
the threshold |
D |
A numeric scalar. Run parameter |
inc_cens |
A logical scalar indicating whether or not to include contributions from right-censored inter-exceedance times relating to the first and last observation. It is known that these times are greater than or equal to the time observed. See Attalides (2015) for details. |
The sample inter-exceedance times are
,
where
are uncensored and
and
are right-censored. Under the assumption that the
inter-exceedance times are independent, the log-likelihood of the
-gaps model is given by
where
is the threshold exceedance probability, estimated by
the proportion of threshold exceedances,
,
if
and
otherwise,
is the number of sample inter-exceedance times that
are left-censored, that is, are less than or equal to
,
(apart from an adjustment for the contributions of and
)
is the number of inter-exceedance times that
are uncensored, that is, are greater than
,
specifically, if inc_cens = TRUE
then is equal
to the number of
that are
uncensored plus
.
The differing treatment of uncensored and censored -gaps reflects
differing contributions to the likelihood. Right-censored
inter-exceedance times whose observed values are less than or equal to
add no information to the likelihood because we do not know to
which part of the likelihood they should contribute.
If then we are in the degenerate case where there is one
cluster (all inter-exceedance times are left-censored) and the likelihood
is maximized at
.
If then all exceedances occur singly (no inter-exceedance
times are left-censored) and the likelihood is maximized at
.
A list containing the sufficient statistics, with components
N0 |
the number of left-censored inter-exceedance times. |
N1 |
contribution from inter-exceedance times that are not left-censored (see Details). |
sum_qtd |
the sum of the (scaled) inter-exceedance times
that are not left-censored, that is,
|
n_dgaps |
the number of inter-exceedances that contribute to the log-likelihood. |
q_u |
the sample proportion of values that exceed the threshold. |
D |
the input value of |
Holesovsky, J. and Fusek, M. Estimation of the extremal index using censored distributions. Extremes 23, 197-213 (2020). doi:10.1007/s10687-020-00374-3
Attalides, N. (2015) Threshold-based extreme value modelling, PhD thesis, University College London. https://discovery.ucl.ac.uk/1471121/1/Nicolas_Attalides_Thesis.pdf
dgaps
for maximum likelihood estimation of the
extremal index using the
-gaps model.
u <- quantile(newlyn, probs = 0.90) dgaps_stat(newlyn, u = u, D = 1)
u <- quantile(newlyn, probs = 0.90) dgaps_stat(newlyn, u = u, D = 1)
Estimates the extremal index using the iterated weighted least
squares method of Suveges (2007). At the moment no estimates of
uncertainty are provided.
iwls(data, u, maxit = 100)
iwls(data, u, maxit = 100)
data |
A numeric vector of raw data. No missing values are allowed. |
u |
A numeric scalar. Extreme value threshold applied to data. |
maxit |
A numeric scalar. The maximum number of iterations. |
The iterated weighted least squares algorithm on page 46 of
Suveges (2007) is used to estimate the value of the extremal index.
This approach uses the time gaps between successive exceedances
in the data data
of the threshold u
. The th
gap is defined as
, where
is the difference in
the occurrence time of exceedance
and exceedance
.
Therefore, threshold exceedances at adjacent time points produce a gap
of zero.
The model underlying this approach is an exponential-point mas mixture
for scaled gaps, that is, gaps multiplied by the proportion of
values in data
that exceed u
. Under this model
scaled gaps are zero (‘within-cluster’ inter-exceedance times) with
probability and otherwise (‘between-cluster’
inter-exceedance times) follow an exponential distribution with mean
.
The estimation method is based on fitting the ‘broken stick’ model of
Ferro (2003) to an exponential quantile-quantile plot of all of the
scaled gaps. Specifically, the broken stick is a horizontal line
and a line with gradient
which intersect at
. The algorithm on page 46 of
Suveges (2007) uses a weighted least squares minimization applied to
the exponential
part of this model to seek a compromise between the role of
as the proportion of inter-exceedance times that are between-cluster
and the reciprocal of the mean of an exponential distribution for these
inter-exceedance times. The weights (see Ferro (2003)) are based on the
variances of order statistics of a standard exponential sample: larger
order statistics have larger sampling variabilities and therefore
receive smaller weight than smaller order statistics.
Note that in step (1) of the algorithm on page 46 of Suveges there is a
typo: should be
, where
is the number of
threshold exceedances. Also, the gaps are scaled as detailed above,
not by their mean.
An object (a list) of class "iwls", "exdex"
containing
theta |
The estimate of |
conv |
A convergence indicator: 0 indicates successful
convergence; 1 indicates that |
niter |
The number of iterations performed. |
n_gaps |
The number of time gaps between successive exceedances. |
call |
The call to |
Suveges, M. (2007) Likelihood estimation of the extremal index. Extremes, 10, 41-55. doi:10.1007/s10687-007-0034-2
Ferro, C.A.T. (2003) Statistical methods for clusters of extreme values. Ph.D. thesis, Lancaster University.
iwls_methods
for S3 methods for "iwls"
objects.
### S&P 500 index u <- quantile(sp500, probs = 0.60) theta <- iwls(sp500, u) theta coef(theta) nobs(theta) ### Newlyn sea surges u <- quantile(newlyn, probs = 0.90) theta <- iwls(newlyn, u) theta
### S&P 500 index u <- quantile(sp500, probs = 0.60) theta <- iwls(sp500, u) theta coef(theta) nobs(theta) ### Newlyn sea surges u <- quantile(newlyn, probs = 0.90) theta <- iwls(newlyn, u) theta
"iwls"
Methods for objects of class c("iwls", "exdex")
returned from
iwls
.
## S3 method for class 'iwls' coef(object, ...) ## S3 method for class 'iwls' nobs(object, ...) ## S3 method for class 'iwls' print(x, digits = max(3L, getOption("digits") - 3L), ...)
## S3 method for class 'iwls' coef(object, ...) ## S3 method for class 'iwls' nobs(object, ...) ## S3 method for class 'iwls' print(x, digits = max(3L, getOption("digits") - 3L), ...)
object |
and object of class |
... |
Further arguments. None are used. |
x |
an object of class |
digits |
The argument |
print.iwls
prints the original call to iwls
and the estimate of the extremal index .
coef.iwls
. A numeric scalar: the estimate of the extremal index
.
nobs.iwls
. A numeric scalar: the number of inter-exceedance times
used in the fit.
print.iwls
. The argument x
, invisibly.
See the examples in iwls
.
iwls
for maximum likelihood estimation of the
extremal index using the
-gaps model.
-gaps modelCalculates maximum likelihood estimates of the extremal index
based on the
-gaps model for threshold inter-exceedances times of
Suveges and Davison (2010).
kgaps(data, u, k = 1, inc_cens = TRUE)
kgaps(data, u, k = 1, inc_cens = TRUE)
data |
A numeric vector or numeric matrix of raw data. If If |
u |
A numeric scalar. Extreme value threshold applied to data. |
k |
A non-negative numeric scalar. Run parameter |
inc_cens |
A logical scalar indicating whether or not to include
contributions from right-censored inter-exceedance times, relating to the
first and last observations. It is known that these times are greater
than or equal to the time observed. See Attalides (2015) for details.
If |
If inc_cens = FALSE
then the maximum likelihood estimate of
the extremal index under the
-gaps model of
Suveges and Davison (2010) is calculated.
If inc_cens = TRUE
then information from right-censored
first and last inter-exceedance times is also included in the likelihood
to be maximized, following Attalides (2015). The form of the
log-likelihood is given in the Details section of
kgaps_stat
.
It is possible that the estimate of is equal to 1, and also
possible that it is equal to 0.
kgaps_stat
explains the
respective properties of the data that cause these events to occur.
An object (a list) of class c("kgaps", "exdex")
containing
theta |
The maximum likelihood estimate (MLE) of
|
se |
The estimated standard error of the MLE, calculated
using an algebraic expression for the observed information.
If |
se_exp |
The estimated standard error of the MLE,
calculated using an algebraic expression for the expected information.
If the estimate of |
ss |
The list of summary statistics returned from
|
k , u , inc_cens
|
The input values of |
max_loglik |
The value of the log-likelihood at the MLE. |
call |
The call to |
Suveges, M. and Davison, A. C. (2010) Model misspecification in peaks over threshold analysis, Annals of Applied Statistics, 4(1), 203-221. doi:10.1214/09-AOAS292
Attalides, N. (2015) Threshold-based extreme value modelling, PhD thesis, University College London. https://discovery.ucl.ac.uk/1471121/1/Nicolas_Attalides_Thesis.pdf
kgaps_confint
to estimate confidence intervals
for .
kgaps_methods
for S3 methods for "kgaps"
objects.
kgaps_imt
for the information matrix test, which
may be used to inform the choice of the pair (u, k
).
choose_uk
for a diagnostic plot based on
kgaps_imt
.
kgaps_stat
for the calculation of sufficient
statistics for the -gaps model.
kgaps_post
in the
revdbayes
package for Bayesian inference
about using the
-gaps model.
### S&P 500 index u <- quantile(sp500, probs = 0.60) theta <- kgaps(sp500, u) theta summary(theta) coef(theta) nobs(theta) vcov(theta) logLik(theta) ### Newlyn sea surges u <- quantile(newlyn, probs = 0.60) theta <- kgaps(newlyn, u, k = 2) theta summary(theta) ### Cheeseboro wind gusts theta <- kgaps(cheeseboro, 45, k = 3) theta summary(theta)
### S&P 500 index u <- quantile(sp500, probs = 0.60) theta <- kgaps(sp500, u) theta summary(theta) coef(theta) nobs(theta) vcov(theta) logLik(theta) ### Newlyn sea surges u <- quantile(newlyn, probs = 0.60) theta <- kgaps(newlyn, u, k = 2) theta summary(theta) ### Cheeseboro wind gusts theta <- kgaps(cheeseboro, 45, k = 3) theta summary(theta)
for "kgaps"
objectsconfint
method for objects of class c("kgaps", "exdex")
.
Computes confidence intervals for based on an object returned
from
kgaps
. Two types of interval may be returned:
(a) intervals based on approximate large-sample normality of the estimator
of , which are symmetric about the point estimate,
and (b) likelihood-based intervals. The
plot
method plots the
log-likelihood for , with the required confidence interval
indicated on the plot.
## S3 method for class 'kgaps' confint( object, parm = "theta", level = 0.95, interval_type = c("both", "norm", "lik"), conf_scale = c("theta", "log"), constrain = TRUE, se_type = c("observed", "expected"), ... ) ## S3 method for class 'confint_kgaps' plot(x, ...) ## S3 method for class 'confint_kgaps' print(x, ...)
## S3 method for class 'kgaps' confint( object, parm = "theta", level = 0.95, interval_type = c("both", "norm", "lik"), conf_scale = c("theta", "log"), constrain = TRUE, se_type = c("observed", "expected"), ... ) ## S3 method for class 'confint_kgaps' plot(x, ...) ## S3 method for class 'confint_kgaps' print(x, ...)
object |
An object of class |
parm |
Specifies which parameter is to be given a confidence interval.
Here there is only one option: the extremal index |
level |
The confidence level required. A numeric scalar in (0, 1). |
interval_type |
A character scalar: |
conf_scale |
A character scalar. If If |
constrain |
A logical scalar. If |
se_type |
A character scalar. Should the confidence intervals for the
|
... |
|
x |
an object of class |
Two type of interval are calculated: (a) an interval based on the
approximate large sample normality of the estimator of
(if
conf_scale = "theta"
) or of
(if
conf_scale = "log"
) and (b) a likelihood-based interval,
based on the approximate large sample chi-squared, with 1 degree of
freedom, distribution of the log-likelihood ratio statistic.
print.confint_kgaps
prints the matrix of confidence
intervals for .
A list of class c("confint_kgaps", "exdex") containing the following components.
cis |
A matrix with columns giving the lower and upper confidence
limits. These are labelled as (1 - level)/2 and 1 - (1 - level)/2 in
% (by default 2.5% and 97.5%).
The row names indicate the type of interval:
|
call |
The call to |
object |
The input object |
level |
The input |
plot.confint_kgaps
: nothing is returned. If
x$object$k = 0
then no plot is produced.
print.confint_kgaps
: the argument x
, invisibly.
Suveges, M. and Davison, A. C. (2010) Model misspecification in peaks over threshold analysis, Annals of Applied Statistics, 4(1), 203-221. doi:10.1214/09-AOAS292
kgaps
for estimation of the extremal index
using a semiparametric maxima method.
u <- quantile(newlyn, probs = 0.90) theta <- kgaps(newlyn, u) cis <- confint(theta) cis plot(cis)
u <- quantile(newlyn, probs = 0.90) theta <- kgaps(newlyn, u) cis <- confint(theta) cis plot(cis)
-gaps modelPerforms the information matrix test (IMT) of Suveges and Davison (2010) to
diagnose misspecification of the -gaps model.
kgaps_imt(data, u, k = 1, inc_cens = TRUE)
kgaps_imt(data, u, k = 1, inc_cens = TRUE)
data |
A numeric vector or numeric matrix of raw data. If If |
u , k
|
Numeric vectors. Any values in |
inc_cens |
A logical scalar indicating whether or not to include contributions from censored inter-exceedance times, relating to the first and last observations. See Attalides (2015) for details. |
The -gaps IMT is performed a over grid of all
combinations of threshold and
in the vectors
u
and k
. If the estimate of is 0 then the
IMT statistic, and its associated
-value is
NA
.
For details of the IMT see Suveges and Davison
(2010). There are some typing errors on pages 18-19 that have been
corrected in producing the code: the penultimate term inside {...}
in the middle equation on page 18 should be , as should
the penultimate term in the first equation on page 19; the
{...}
bracket should be squared in the 4th equation on page 19; the factor
should be
in the final equation on page 19.
An object (a list) of class c("kgaps_imt", "exdex")
containing
imt |
A |
p |
A |
theta |
A |
u , k
|
The input |
Suveges, M. and Davison, A. C. (2010) Model misspecification in peaks over threshold analysis, Annals of Applied Statistics, 4(1), 203-221. doi:10.1214/09-AOAS292
Attalides, N. (2015) Threshold-based extreme value modelling, PhD thesis, University College London. https://discovery.ucl.ac.uk/1471121/1/Nicolas_Attalides_Thesis.pdf
kgaps
for maximum likelihood estimation of the
extremal index using the
-gaps model.
choose_uk
for graphical diagnostic to aid the choice
of the threshold and the run parameter
.
### Newlyn sea surges u <- quantile(newlyn, probs = seq(0.1, 0.9, by = 0.1)) imt <- kgaps_imt(newlyn, u = u, k = 1:5) ### S&P 500 index u <- quantile(sp500, probs = seq(0.1, 0.9, by = 0.1)) imt <- kgaps_imt(sp500, u = u, k = 1:5) ### Cheeseboro wind gusts (a matrix containing some NAs) probs <- c(seq(0.5, 0.98, by = 0.025), 0.99) u <- quantile(cheeseboro, probs = probs, na.rm = TRUE) imt <- kgaps_imt(cheeseboro, u = u, k = 1:5)
### Newlyn sea surges u <- quantile(newlyn, probs = seq(0.1, 0.9, by = 0.1)) imt <- kgaps_imt(newlyn, u = u, k = 1:5) ### S&P 500 index u <- quantile(sp500, probs = seq(0.1, 0.9, by = 0.1)) imt <- kgaps_imt(sp500, u = u, k = 1:5) ### Cheeseboro wind gusts (a matrix containing some NAs) probs <- c(seq(0.5, 0.98, by = 0.025), 0.99) u <- quantile(cheeseboro, probs = probs, na.rm = TRUE) imt <- kgaps_imt(cheeseboro, u = u, k = 1:5)
Calculates the components required to calculate the value of the information
matrix test under the -gaps model, using vector data input.
Called by
kgaps_imt
.
kgaps_imt_stat(data, theta, u, k = 1, inc_cens = TRUE)
kgaps_imt_stat(data, theta, u, k = 1, inc_cens = TRUE)
data |
A numeric vector of raw data. Missing values are allowed, but
they should not appear between non-missing values, that is, they only be
located at the start and end of the vector. Missing values are omitted
using |
theta |
A numeric scalar. An estimate of the extremal index
|
u |
A numeric scalar. Extreme value threshold applied to data. |
k |
A numeric scalar. Run parameter |
inc_cens |
A logical scalar indicating whether or not to include contributions from censored inter-exceedance times relating to the first and last observation. See Attalides (2015) for details. |
A list relating the quantities given on pages 18-19 of
Suveges and Davison (2010). All but the last component are vectors giving
the contribution to the quantity from each -gap, evaluated at the
input value
theta
of .
ldj |
the derivative of the log-likelihood with respect to
|
Ij |
the observed information |
Jj |
the square of the score |
dj |
|
Ddj |
the derivative of |
n_kgaps |
the number of |
Suveges, M. and Davison, A. C. (2010) Model misspecification in peaks over threshold analysis, Annals of Applied Statistics, 4(1), 203-221. doi:10.1214/09-AOAS292
Attalides, N. (2015) Threshold-based extreme value modelling, PhD thesis, University College London. https://discovery.ucl.ac.uk/1471121/1/Nicolas_Attalides_Thesis.pdf
"kgaps"
Methods for objects of class c("kgaps", "exdex")
returned from
kgaps
.
## S3 method for class 'kgaps' coef(object, ...) ## S3 method for class 'kgaps' vcov(object, type = c("observed", "expected"), ...) ## S3 method for class 'kgaps' nobs(object, ...) ## S3 method for class 'kgaps' logLik(object, ...) ## S3 method for class 'kgaps' print(x, digits = max(3L, getOption("digits") - 3L), ...) ## S3 method for class 'kgaps' summary( object, se_type = c("observed", "expected"), digits = max(3, getOption("digits") - 3L), ... ) ## S3 method for class 'summary.kgaps' print(x, ...)
## S3 method for class 'kgaps' coef(object, ...) ## S3 method for class 'kgaps' vcov(object, type = c("observed", "expected"), ...) ## S3 method for class 'kgaps' nobs(object, ...) ## S3 method for class 'kgaps' logLik(object, ...) ## S3 method for class 'kgaps' print(x, digits = max(3L, getOption("digits") - 3L), ...) ## S3 method for class 'kgaps' summary( object, se_type = c("observed", "expected"), digits = max(3, getOption("digits") - 3L), ... ) ## S3 method for class 'summary.kgaps' print(x, ...)
object |
and object of class |
... |
For |
type |
A character scalar. Should the estimate of the variance be based on the observed information or the expected information? |
x |
|
digits |
|
se_type |
A character scalar. Should the estimate of the standard error be based on the observed information or the expected information? |
coef.kgaps
. A numeric scalar: the estimate of the extremal index
.
vcov.kgaps
. A numeric matrix containing the
estimated variance of the estimator.
nobs.kgaps
. A numeric scalar: the number of inter-exceedance times
used in the fit. If x$inc_cens = TRUE
then this includes up to 2
censored observations.
logLik.kgaps
. An object of class "logLik"
: a numeric scalar
with value equal to the maximised log-likelihood. The returned object
also has attributes nobs
, the numbers of -gaps that
contribute to the log-likelihood and
"df"
, which is equal to the
number of total number of parameters estimated (1).
print.kgaps
. The argument x
, invisibly.
summary.kgaps
. Returns a list containing the list element
object$call
and a numeric matrix summary
giving the estimate
of the extremal index and the estimated standard error
(Std. Error).
print.summary.kgaps
. The argument x
, invisibly.
See the examples in kgaps
.
kgaps
for maximum likelihood estimation of the
extremal index using the
-gaps model.
confint.kgaps
for confidence intervals for
.
-gaps modelCalculates sufficient statistics for the -gaps model for the extremal
index
. Called by
kgaps
.
kgaps_stat(data, u, q_u, k = 1, inc_cens = TRUE)
kgaps_stat(data, u, q_u, k = 1, inc_cens = TRUE)
data |
A numeric vector of raw data. |
u |
A numeric scalar. Extreme value threshold applied to data. |
q_u |
A numeric scalar. An estimate of the probability with which
the threshold |
k |
A numeric scalar. Run parameter |
inc_cens |
A logical scalar indicating whether or not to include contributions from right-censored inter-exceedance times relating to the first and last observation. It is known that these times are greater than or equal to the time observed. See Attalides (2015) for details. |
The sample -gaps are
,
where
are uncensored and
and
are right-censored. Under the assumption that the
-gaps are independent, the log-likelihood of the
-gaps
model is given by
where
is the threshold exceedance probability, estimated by
the proportion of threshold exceedances,
is the number of uncensored sample
-gaps that
are equal to zero,
(apart from an adjustment for the contributions of
and
)
is the number of positive sample
-gaps,
specifically, if inc_cens = TRUE
then is equal
to the number of
that are positive plus
, where
if
is greater than zero and
otherwise, and
similarly for
.
The differing treatment of uncensored and right-censored -gaps
reflects differing contributions to the likelihood. Right-censored
-gaps that are equal to zero add no information to the likelihood.
For full details see Suveges and Davison (2010) and Attalides (2015).
If then we are in the degenerate case where there is one
cluster (all
-gaps are zero) and the likelihood is maximized at
.
If then all exceedances occur singly (all
-gaps are
positive) and the likelihood is maximized at
.
A list containing the sufficient statistics, with components
N0 |
the number of zero |
N1 |
contribution from non-zero |
sum_qs |
the sum of the (scaled) |
n_kgaps |
the number of |
Suveges, M. and Davison, A. C. (2010) Model misspecification in peaks over threshold analysis, Annals of Applied Statistics, 4(1), 203-221. doi:10.1214/09-AOAS292
Attalides, N. (2015) Threshold-based extreme value modelling, PhD thesis, University College London. https://discovery.ucl.ac.uk/1471121/1/Nicolas_Attalides_Thesis.pdf
kgaps
for maximum likelihood estimation of the
extremal index using the
-gaps model.
u <- quantile(newlyn, probs = 0.90) kgaps_stat(newlyn, u)
u <- quantile(newlyn, probs = 0.90) kgaps_stat(newlyn, u)
The vector newlyn
contains 2894 maximum sea-surges measured at
Newlyn, Cornwall, UK over the period 1971-1976. The observations are
the maximum hourly sea-surge heights over contiguous 15-hour time
periods.
newlyn
newlyn
A vector of length 2894.
Coles, S.G. (1991) Modelling extreme multivariate events. PhD thesis, University of Sheffield, U.K.
Fawcett, L. and Walshaw, D. (2012) Estimating return levels from serially dependent extremes. Environmetrics, 23(3), 272-283. doi:10.1002/env.2133
Northrop, P. J. (2015) An efficient semiparametric maxima estimator of the extremal index. Extremes, 18, 585-603. doi:10.1007/s10687-015-0221-5
plot
method for objects inheriting from class "choose_b"
,
returned from choose_b
## S3 method for class 'choose_b' plot( x, y, ..., estimator = c("N2015", "BB2018"), maxima = c("sliding", "disjoint") )
## S3 method for class 'choose_b' plot( x, y, ..., estimator = c("N2015", "BB2018"), maxima = c("sliding", "disjoint") )
x |
an object of class |
y |
Not used. |
... |
|
estimator |
Choice of estimator: |
maxima |
Should the estimator be based on sliding or disjoint maxima? |
Produces a simple diagnostic plot to aid the choice of block
length based on the object returned from
choose_b
.
Estimates of and approximate
conf
% confidence intervals
are plotted against the value of used to produce each estimate.
The type of confidence interval is determined by the arguments
interval_type
, conf_scale
and type
provided in the
call to choose_b
.
Nothing is returned.
See the examples in choose_b
.
Northrop, P. J. (2015) An efficient semiparametric maxima estimator of the extremal index. Extremes 18(4), 585-603. doi:10.1007/s10687-015-0221-5
Berghaus, B., Bucher, A. (2018) Weak convergence of a pseudo maximum likelihood estimator for the extremal index. Ann. Statist. 46(5), 2307-2335. doi:10.1214/17-AOS1621
and runs parameter
diagnostic for the
-gaps estimatorplot
method for objects inheriting from class "choose_ud"
,
returned from choose_ud
## S3 method for class 'choose_ud' plot( x, y = c("imts", "theta"), level = 0.95, interval_type = c("norm", "lik"), conf_scale = c("theta", "log"), alpha = 0.05, constrain = TRUE, for_abline = list(lty = 2, lwd = 1, col = 1), digits = 3, uprob = FALSE, leg_pos = if (y == "imts") "topright" else "topleft", ... )
## S3 method for class 'choose_ud' plot( x, y = c("imts", "theta"), level = 0.95, interval_type = c("norm", "lik"), conf_scale = c("theta", "log"), alpha = 0.05, constrain = TRUE, for_abline = list(lty = 2, lwd = 1, col = 1), digits = 3, uprob = FALSE, leg_pos = if (y == "imts") "topright" else "topleft", ... )
x |
an object of class |
y |
A character scalar indicating what should be plotted on the
vertical axes of the plot: information matrix test statistics (IMTS)
if |
level |
A numeric scalar in (0, 1). The confidence level used in
calculating confidence intervals for |
interval_type |
A character scalar. The type of confidence interval
to be plotted, if |
conf_scale |
A character scalar. If |
alpha |
A numeric vector with entries in (0, 1). The size of the test to be performed. |
constrain |
A logical scalar. The argument |
for_abline |
Only relevant when |
digits |
An integer. Used for formatting the value of the threshold
with |
uprob |
A logical scalar. Should we plot |
leg_pos |
A character scalar. The position of any legend added to
a plot. Only relevant when both the arguments |
... |
Additional arguments passed to |
The type of plot produced depends mainly on y
.
If y = "imts"
then the values of IMTS are plotted against the
thresholds in x$u
(or their corresponding approximate sample
quantile levels if uprob = TRUE
) for each value of
in
x$D
. Horizontal lines are added to indicate the critical
values of the IMT for the significance levels in alpha
.
We would not reject at the 100alpha
% level combinations of
threshold and corresponding to values of the IMTS that fall
below the line.
If y = "theta"
then estimates of are plotted on the
vertical axis. If both
x$u
and x$D$
have length greater
than one then only these estimates are plotted. If either x$u
or x$D
have length one then approximate 100level
%
confidence intervals are added to the plot and the variable,
x$u
or x$D
that has length greater than one is plotted on
the horizontal axis.
Nothing is returned.
See the examples in choose_ud
.
and runs parameter
diagnostic for the
-gaps estimatorplot
method for objects inheriting from class "choose_uk"
,
returned from choose_uk
## S3 method for class 'choose_uk' plot( x, y = c("imts", "theta"), level = 0.95, interval_type = c("norm", "lik"), conf_scale = c("theta", "log"), alpha = 0.05, constrain = TRUE, for_abline = list(lty = 2, lwd = 1, col = 1), digits = 3, uprob = FALSE, leg_pos = if (y == "imts") "topright" else "topleft", ... )
## S3 method for class 'choose_uk' plot( x, y = c("imts", "theta"), level = 0.95, interval_type = c("norm", "lik"), conf_scale = c("theta", "log"), alpha = 0.05, constrain = TRUE, for_abline = list(lty = 2, lwd = 1, col = 1), digits = 3, uprob = FALSE, leg_pos = if (y == "imts") "topright" else "topleft", ... )
x |
an object of class |
y |
A character scalar indicating what should be plotted on the
vertical axes of the plot: information matrix test statistics (IMTS)
if |
level |
A numeric scalar in (0, 1). The confidence level used in
calculating confidence intervals for |
interval_type |
A character scalar. The type of confidence interval
to be plotted, if |
conf_scale |
A character scalar. If |
alpha |
A numeric vector with entries in (0, 1). The size of the test to be performed. |
constrain |
A logical scalar. The argument |
for_abline |
Only relevant when |
digits |
An integer. Used for formatting the value of the threshold
with |
uprob |
A logical scalar. Should we plot |
leg_pos |
A character scalar. The position of any legend added to
a plot. Only relevant when both the arguments |
... |
Additional arguments passed to |
The type of plot produced depends mainly on y
.
If y = "imts"
then the values of IMTS are plotted against the
thresholds in x$u
(or their corresponding approximate sample
quantile levels if uprob = TRUE
) for each value of
in
x$k
. Horizontal lines are added to indicate the critical
values of the IMT for the significance levels in alpha
.
We would not reject at the 100alpha
% level combinations of
threshold and corresponding to values of the IMTS that fall
below the line.
If y = "theta"
then estimates of are plotted on the
vertical axis. If both
x$u
and x$k$
have length greater
than one then only these estimates are plotted. If either x$u
or x$k
have length one then approximate 100level
%
confidence intervals are added to the plot and the variable,
x$u
or x$k
that has length greater than one is plotted on
the horizontal axis.
Nothing is returned.
See the examples in choose_uk
.
Daily log returns of the S&P 500 index, that is, the log of the ratio of successive daily closing prices, from 3rd January 1990 to 9th October 2018.
sp500
sp500
A vector of length 7250, created using zoo
with an "index" attribute giving the date of the corresponding negated
log return.
Yahoo finance: https://finance.yahoo.com/quote/^SPX/history/
Splits the values in a numeric matrix column-wise into sequences of non-missing values.
split_by_NAs(x)
split_by_NAs(x)
x |
A vector or matrix. |
For each column in x
, split_by_NAs
finds runs of
values that contain no missing values and assigns them to a column in the
matrix that is returned. Different columns are treated separately.
If there are no missing values in a column then that column appears
unmodified in the output matrix. Please see the Examples
for illustrations.
A matrix containing a column for each run of non-missing values in
x
. The number of rows is equal to the longest run of non-missing
values in x
and will therefore be at most nrow{x}
. The
matrix is padded with NA
values at the end of each column, where
necessary.
The returned object has an attribute called split_by_NAs_done
whose value is TRUE
, so that in programming one can avoid calling
split_by_NAs
more than once.
# Create a simple numeric matrix and insert some NAs x <- matrix(1:50, 10, 5) x[c(3, 8), 1] <- NA x[c(1:2, 5, 10), 3] <- NA x[1:3, 4] <- NA x[7:10, 5] <- NA x res <- split_by_NAs(x) res # An example of a character matrix x <- matrix(c(letters, letters[1:18]), 11, 4) x[c(1:2, 5:11), 2] <- NA x[c(2:4, 6:11), 3] <- NA x[1:10, 4] <- NA res <- split_by_NAs(x) res
# Create a simple numeric matrix and insert some NAs x <- matrix(1:50, 10, 5) x[c(3, 8), 1] <- NA x[c(1:2, 5, 10), 3] <- NA x[1:3, 4] <- NA x[7:10, 5] <- NA x res <- split_by_NAs(x) res # An example of a character matrix x <- matrix(c(letters, letters[1:18]), 11, 4) x[c(1:2, 5:11), 2] <- NA x[c(2:4, 6:11), 3] <- NA x[1:10, 4] <- NA res <- split_by_NAs(x) res
Estimates the extremal index using a semiparametric block
maxima estimator of Northrop (2015) (
"N2015"
) and a variant of this
estimator studied by Berghaus and Bucher (2018) ("BB2018"
), using
both sliding (overlapping) block maxima and disjoint (non-overlapping) block
maxima. A simple modification (subtraction of , where
is
the block size) of the Berghaus and Bucher (2018) estimator
(
"BB2018b"
) is also calculated. Estimates of uncertainty are
calculated using the asymptotic theory developed by Berghaus and Bucher
(2018).
spm( data, b, bias_adjust = c("BB3", "BB1", "N", "none"), constrain = TRUE, varN = TRUE, which_dj = c("last", "first") )
spm( data, b, bias_adjust = c("BB3", "BB1", "N", "none"), constrain = TRUE, varN = TRUE, which_dj = c("last", "first") )
data |
A numeric vector of raw data. No missing values are allowed. |
b |
A numeric scalar. The block size. |
bias_adjust |
A character scalar. Is bias-adjustment of the
raw estimate of |
constrain |
A logical scalar. If |
varN |
A logical scalar. If |
which_dj |
A character scalar. Determines which set of disjoint
maxima are used to calculate an estimate of |
The extremal index is estimated using the
semiparametric maxima estimator of Northrop (2015) and variant
of this studied by Berghaus and Bucher (2018). In each case a sample
of 'data' is derived from the input data
data
, based on the
empirical distribution function of these data evaluated at the
maximum values of of blocks of b
contiguous values in data
.
The estimators are based on an assumption that these 'data' are sampled
approximately from an exponential distribution with mean .
For details see page 2309 of Berghaus and Bucher (2018), where the
'data' for the Northrop (2015) estimator are denoted
and
those for the Berghaus and Bucher (2018) are denoted
.
For convenience, we will refer to these values as the
-data and the
-data.
The approximate nature of the model for the -data arises from the
estimation of the distribution function
. A further
approximation motivates the use of the
-data. If
is known
then the variable
has a beta(1,
) distribution,
so that that is,
has mean
.
Therefore, an exponential distribution with mean
may provide a better approximate model, which provides the motivation for
subtracting
from the Berghaus and Bucher (2018) estimator.
Indeed, the resulting estimates are typically close to those
of the Northrop (2015) estimator.
If sliding = TRUE
then the function uses sliding block maxima,
that is, the largest value observed in all
(length(data) - b + 1
) blocks of b
observations.
If sliding = FALSE
then disjoint block maxima, that is, the
largest values in (floor(n / b)
) disjoint blocks of b
observations, are used.
Estimation of the sampling variances of the estimators is based
on Proposition 4.1 on page 2319 of Berghaus and Bucher (2018).
For the Northrop (2015) variant the user has the choice either to
use the sampling variance based on the Berghaus and Bucher (2018)
estimator, i.e. the -data (
varN = FALSE
) or an analogous
version tailored to the Northrop (2015) estimator that uses the
-data (
varN = TRUE
).
The estimates of the sampling variances of the sliding blocks estimators
are inferred from those of the disjoint blocks estimators (see page 2319
of Berhaus and Bucher (2018)). The calculation of the latter uses a set
of disjoint block maxima. If length(data)
is not an integer
multiple of b
then there will be more than one set of these, and
all are equally valid. In this event we perform the calculation for all
such sets and use the mean of the resulting estimates. This reduces the
sampling variability of the estimates at the expense of slowing down
the calculation somewhat, particularly if b
is small. This may
become apparent when calling spm
repeatedly in
choose_b
.
This estimator of the sampling variance of the sliding blocks estimator is
not constrained to be positive: a negative estimate may result if the
block size is small. In this event
no warning will be given until the returned object is printed and,
for the affected estimator ("N2015"
or "BB2018/BB2018b"
),
the corresponding estimated standard errors using sliding
blocks will be missing in se_sl
in the returned object,
if bias_adjust == "BB3"
then bias-adjustment
based on bias_adjust == "BB1"
will instead be performed
when using sliding blocks, because the former relies on the
estimated variances of the estimators.
Similarly, bias adjustment under adjust = "BB3"
and/or subtraction
of in the
"BB2018b"
case may, rare cases, produce a
negative estimate of . In these instances an estimate of
zero is returned, but the values returned in
bias_dj
and
bias_sl
are not changed.
A list of class c("spm", "exdex")
containing the
components listed below. The components that are vectors are
labelled to indicate the estimator to which the constituent values
relate: "N2015"
for Northrop (2015), "BB2018"
for
Berghaus and Bucher (2018) and "BB2018b"
for the modified version.
theta_sl , theta_dj
|
Vectors containing the estimates of
|
se_sl , se_dj
|
The estimated standard errors associated
with the estimates in |
bias_sl , bias_dj
|
The respective values of the
bias-adjustment applied to the raw estimates, that is, the values
subtracted from the raw estimates. For estimator
|
raw_theta_sl , raw_theta_dj
|
Vectors containing the raw estimates
of |
uncon_theta_sl , uncon_theta_dj
|
The (bias_adjusted) estimates of
|
data_sl , data_dj
|
Matrices containing the |
sigma2dj , sigma2dj_for_sl
|
Estimates of the variance
|
sigma2sl |
Estimates of the variance
|
b |
The input value of |
bias_adjust |
The input value of |
call |
The call to |
Northrop, P. J. (2015) An efficient semiparametric maxima estimator of the extremal index. Extremes 18(4), 585-603. doi:10.1007/s10687-015-0221-5
Berghaus, B., Bucher, A. (2018) Weak convergence of a pseudo maximum likelihood estimator for the extremal index. Ann. Statist. 46(5), 2307-2335. doi:10.1214/17-AOS1621
spm_confint
to estimate confidence intervals
for .
spm_methods
for S3 methods for "spm"
objects.
### Newlyn sea surges theta <- spm(newlyn, 20) theta summary(theta) coef(theta) nobs(theta) vcov(theta) ### S&P 500 index theta <- spm(sp500, 100) theta summary(theta)
### Newlyn sea surges theta <- spm(newlyn, 20) theta summary(theta) coef(theta) nobs(theta) vcov(theta) ### S&P 500 index theta <- spm(sp500, 100) theta summary(theta)
for "spm"
objectsconfint
method for objects of class c("spm", "exdex")
.
Computes confidence intervals for based on an object returned
from
spm
. Two types of interval may be returned:
(a) intervals that are based on approximate large-sample normality of the
estimators of (or of
if
conf_scale = "log"
), and which are symmetric about the respective
point estimates, and (b) likelihood-based intervals based on an adjustment
of a naive (pseudo-) loglikelihood, using the
adjust_loglik
function in the
chandwich
package. The plot
method plots the
log-likelihood for , with the required confidence interval(s)
indicated on the plot.
## S3 method for class 'spm' confint( object, parm = "theta", level = 0.95, maxima = c("sliding", "disjoint"), interval_type = c("norm", "lik", "both"), conf_scale = c("theta", "log"), constrain = TRUE, bias_adjust = TRUE, type = c("vertical", "cholesky", "spectral", "none"), ... ) ## S3 method for class 'confint_spm' plot(x, estimator = "all", ndec = 2, ...) ## S3 method for class 'confint_spm' print(x, ...)
## S3 method for class 'spm' confint( object, parm = "theta", level = 0.95, maxima = c("sliding", "disjoint"), interval_type = c("norm", "lik", "both"), conf_scale = c("theta", "log"), constrain = TRUE, bias_adjust = TRUE, type = c("vertical", "cholesky", "spectral", "none"), ... ) ## S3 method for class 'confint_spm' plot(x, estimator = "all", ndec = 2, ...) ## S3 method for class 'confint_spm' print(x, ...)
object |
An object of class |
parm |
Specifies which parameter is to be given a confidence interval.
Here there is only one option: the extremal index |
level |
A numeric scalar in (0, 1). The confidence level required. |
maxima |
A character scalar specifying whether to estimate confidence intervals based on sliding maxima or disjoint maxima. |
interval_type |
A character scalar: |
conf_scale |
A character scalar. If If Any bias-adjustment requested in the original call to |
constrain |
A logical scalar. If |
bias_adjust |
A logical scalar. If If |
type |
A character scalar. The argument |
... |
Additional optional arguments to be passed to
|
x |
an object of class |
estimator |
A character vector specifying which of the three variants
of the semiparametric maxima estimator to include in the plot:
|
ndec |
An integer scalar. The legend (if included on the plot)
contains the confidence limits rounded to |
The likelihood-based intervals are estimated using the
adjust_loglik
function in the
chandwich
package, followed by a call to
conf_intervals
.
This adjusts the naive (pseudo-)loglikelihood so that the curvature of
the adjust loglikelihood agrees with the estimated standard errors of
the estimators. The option type = "none"
should not be used in
practice because the resulting confidence intervals will be wrong.
In particular, in the intervals based on sliding maxima will provide
vast underestimates of uncertainty.
If object$se
contains NA
s, because the block size b
was too small or too large in the call to spm
then
confidence intervals cannot be estimated. A matrix of NA
s
will be returned. See the Details section of the
spm
documentation for more information.
print.confint_spm
prints the matrix of confidence
intervals for .
A list of class c("confint_spm", "exdex") containing the following components.
cis |
A matrix with columns giving the lower and upper confidence
limits. These are labelled as (1 - level)/2 and 1 - (1 - level)/2 in
% (by default 2.5% and 97.5%).
The row names are a concatenation of the variant of the estimator
( |
ciN |
The object returned from
|
ciBB |
The object returned from
|
ciBBb |
The object returned from
|
call |
The call to |
object |
The input |
maxima |
The input |
theta |
The relevant estimates of |
level |
The input |
plot.confint_spm
: nothing is returned.
print.confint_spm
: the argument x
, invisibly.
Northrop, P. J. (2015) An efficient semiparametric maxima estimator of the extremal index. Extremes 18(4), 585-603. doi:10.1007/s10687-015-0221-5
Berghaus, B., Bucher, A. (2018) Weak convergence of a pseudo maximum likelihood estimator for the extremal index. Ann. Statist. 46(5), 2307-2335. doi:10.1214/17-AOS1621
spm
for estimation of the extremal index
using a semiparametric maxima method.
# Newlyn sea surges theta <- spm(newlyn, 20) confint(theta) cis <- confint(theta, interval_type = "lik") cis plot(cis) # S&P 500 index theta <- spm(sp500, 100) confint(theta) cis <- confint(theta, interval_type = "lik") cis plot(cis)
# Newlyn sea surges theta <- spm(newlyn, 20) confint(theta) cis <- confint(theta, interval_type = "lik") cis plot(cis) # S&P 500 index theta <- spm(sp500, 100) confint(theta) cis <- confint(theta, interval_type = "lik") cis plot(cis)
"spm"
Methods for objects of class c("spm", "exdex")
returned from
spm
.
## S3 method for class 'spm' coef( object, maxima = c("sliding", "disjoint"), estimator = "all", constrain = FALSE, ... ) ## S3 method for class 'spm' vcov(object, maxima = c("sliding", "disjoint"), estimator = "all", ...) ## S3 method for class 'spm' nobs(object, maxima = c("sliding", "disjoint"), ...) ## S3 method for class 'spm' print(x, digits = max(3L, getOption("digits") - 3L), ...) ## S3 method for class 'spm' summary(object, digits = max(3, getOption("digits") - 3L), ...) ## S3 method for class 'summary.spm' print(x, ...)
## S3 method for class 'spm' coef( object, maxima = c("sliding", "disjoint"), estimator = "all", constrain = FALSE, ... ) ## S3 method for class 'spm' vcov(object, maxima = c("sliding", "disjoint"), estimator = "all", ...) ## S3 method for class 'spm' nobs(object, maxima = c("sliding", "disjoint"), ...) ## S3 method for class 'spm' print(x, digits = max(3L, getOption("digits") - 3L), ...) ## S3 method for class 'spm' summary(object, digits = max(3, getOption("digits") - 3L), ...) ## S3 method for class 'summary.spm' print(x, ...)
object |
an object of class |
maxima |
A character scalar specifying whether to return the number of observed sliding maxima or disjoint maxima. |
estimator |
A character vector specifying which of the three variants
of the semiparametric maxima estimator to use: |
constrain |
A logical scalar. If |
... |
For |
x |
|
digits |
An integer. Used for number formatting with
|
print.spm
prints the original call to spm
and the estimates of the extremal index , based on all three
variants of the semiparametric maxima estimator and both sliding
and disjoint blocks.
coef.spm
. A numeric scalar (or a vector of length 3 if
estimator = "all"
): the required estimate(s) of the extremal index
.
vcov.spm
. A numeric matrix if
estimator = "N2015"
or "BB2018"
and a vector of length 3 if
estimator = "all"
, containing the estimated variance(s) of the
estimator(s).
nobs.spm
. A numeric scalar: the number of observations used in the
fit.
print.spm
. The argument x
, invisibly.
summary.spm
. Returns an object (a list) of class
"summary.spm"
containing the list element object$call
and a
numeric matrix matrix
giving, for all three variants of the
semiparametric estimator and both sliding and disjoint blocks,
the (bias-adjusted) Estimate of the extremal index ,
the estimated standard error (Std. Error),
and the bias adjustment (Bias adj.) applied to obtain the estimate, i.e.
the value subtracted from the raw estimate. If any of the
(bias-adjusted) estimates are greater than 1 then a column
containing the unconstrained estimates (Uncon. estimate) is added.
print.summary.spm
. The argument x
, invisibly.
See the examples in spm
.
spm
for semiparametric estimation of the
extremal index based on block maxima.
The dataframe uccle
contains daily maximum temperatures in degrees C
recorded at the Uccle, Belgium from 1/1/1833 to 23/1/2011. The Station
identifier in the source file is 17 and the Source identifier is 117882.
uccle
uccle
A data frame with 65036 observations on the following and 5 variables.
temp:
daily maximum temperature in degrees C.
year:
the year.
month:
the month of the year.
day:
day of the month.
date:
date with the Date
class,
in the format YYYY-MM-DD.
There are 5336 missing values.
Klein Tank, A.M.G. and Coauthors, 2002. Daily dataset of 20th-century surface air temperature and precipitation series for the European Climate Assessment. Int. J. of Climatol., 22, 1441-1453 doi:10.1002/joc.773. Data and metadata available at https://www.ecad.eu. The data were downloaded on 27/3/2022 using a Custom query (ASCII), selecting "non-blend" for type of series.
The dataframe uccle720
contains daily maximum temperatures in degrees C
recorded at the Uccle, Belgium during July for the years 1901 to 1999.
The Station identifier in the source file is 17 and the Source identifier is
117882. These data are analysed in Holesovsky and Fusek (2020).
uccle720
uccle720
A data frame with 3100 observations on the following and 5 variables.
temp:
daily maximum temperature in degrees C.
year:
the year.
month:
the month of the year.
day:
day of the month.
date:
date with the Date
class,
in the format YYYY-MM-DD.
There are 6 missing values, one located in each of the years 1925, 1926, 1956, 1963, 1969 and 1976.
Klein Tank, A.M.G. and Coauthors, 2002. Daily dataset of 20th-century surface air temperature and precipitation series for the European Climate Assessment. Int. J. of Climatol., 22, 1441-1453 doi:10.1002/joc.773. Data and metadata available at https://www.ecad.eu. The data were downloaded on 27/3/2022 using a Custom query (ASCII), selecting "non-blend" for type of series.
Holesovsky, J. and Fusek, M. Estimation of the extremal index using censored distributions. Extremes, 23, 197-213 (2020). doi:10.1007/s10687-020-00374-3
uccle720_ts <- ts(uccle720$temp, start = c(1901, 1), frequency = 31) plot(uccle720_ts, ylab = "daily maximum temperature in July / degrees C", xlab = "year")
uccle720_ts <- ts(uccle720$temp, start = c(1901, 1), frequency = 31) plot(uccle720_ts, ylab = "daily maximum temperature in July / degrees C", xlab = "year")
The matrix uccle720m
contains daily maximum temperatures in degrees C
recorded at the Uccle, Belgium during July for the years 1901 to 1999.
The Station identifier in the source file is 17 and the Source identifier is
117882. These data are analysed in Holesovsky and Fusek (2020).
uccle720m
uccle720m
A 31 by 100 numeric matrix. Column i
contains the maximum
daily temperature in degrees C at Uccle in the year 1900 + i
- 1.
The columns are named 1900, 1901, ..., 1999 and the rows are named
after the day of the month: 1, 2, .., 31.
There are 6 missing values, one located in each of the years 1925, 1926, 1956, 1963, 1969 and 1976.
Klein Tank, A.M.G. and Coauthors, 2002. Daily dataset of 20th-century surface air temperature and precipitation series for the European Climate Assessment. Int. J. of Climatol., 22, 1441-1453 doi:10.1002/joc.773. Data and metadata available at https://www.ecad.eu. The data were downloaded on 27/3/2022 using a Custom query (ASCII), selecting "non-blend" for type of series.
Holesovsky, J. and Fusek, M. Estimation of the extremal index using censored distributions. Extremes, 23, 197-213 (2020). doi:10.1007/s10687-020-00374-3
uccle720_ts <- ts(uccle720$temp, start = c(1901, 1), frequency = 31) plot(uccle720_ts, ylab = "daily maximum temperature in July / degrees C", xlab = "year")
uccle720_ts <- ts(uccle720$temp, start = c(1901, 1), frequency = 31) plot(uccle720_ts, ylab = "daily maximum temperature in July / degrees C", xlab = "year")