Statistical Inference with Persistence Landscapes

Compute test statistics and conduct null hypothesis tests for persistence data using persistence landscapes. See Section 3 of Bubenik (2015).

pl_z_test(
  x,
  y,
  alternative = c("two.sided", "less", "greater"),
  conf.level = 0.95,
  supports = NULL,
  r = 0,
  p = 1
)

pd_z_test(
  x,
  y,
  degree = NULL,
  exact = FALSE,
  xmin = NULL,
  xmax = NULL,
  xby = NULL,
  alternative = c("two.sided", "less", "greater"),
  conf.level = 0.95,
  supports = NULL,
  r = 0,
  p = 1
)

pl_perm_test(x, y, p = 1, complete = FALSE, max_iter = 1000L)

Arguments

x, y: Lists of persistence landscapes.
alternative: A character string specifying the alternative hypothesis; must be one of "two.sided" (default), "greater", or "less". You can specify just the initial letter.
conf.level: Confidence level of the interval.
supports: List of support intervals for landscape levels.
r: Non-negative number; the power of the coefficient \(1/k\) in the indicator linear form.
p: Positive integer or infinity; the power used to compute an integral.
degree: Non-negative integer; if input is a persistence diagram object, then the dimension for which to compute a landscape. (For degree \(d\), the \((d+1)\)th matrix in the list will be selected.)
exact: Set to TRUE for exact representation, FALSE (default) for discrete.
xmin, xmax: Domain thresholds for discrete PL; if not specified, then taken to be the support of the PL constructed from the data or the internal values of the 'Rcpp_PersistenceLandscape' object.
xby: Domain grid diameter for discrete PL; if not specified, then set to the power of 10 that yields between 100 and 1000 intervals.
complete: Logical; whether to compute averages between all combinations from the two lists of landscapes.
max_iter: Positive integer; the maximum number of combinations used to estimate the null distance between mean landscapes.
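
To make the roles of p and max_iter more concrete, here is a minimal base-R sketch of a permutation test on the distance between mean landscapes. It is not the package's implementation: perm_test_sketch() and its step argument are hypothetical, landscapes are assumed to be already discretized to numeric vectors on a common grid, and only random relabeling (as one would expect with complete = FALSE) is covered. The exact p-value convention used by pl_perm_test() may differ.

```r
# Hypothetical helper, not part of the package: permutation test on
# landscapes that are assumed to be discretized on a shared grid.
perm_test_sketch <- function(x, y, p = 1, max_iter = 1000L, step = 1) {
  # p-norm distance between two discretized functions (Riemann sum);
  # p = Inf gives the supremum distance
  dist_p <- function(a, b) {
    d <- abs(a - b)
    if (is.infinite(p)) max(d) else (sum(d^p) * step)^(1 / p)
  }
  # pointwise mean of a list of equal-length numeric vectors
  mean_vec <- function(l) Reduce(`+`, l) / length(l)

  pooled <- c(x, y)
  n <- length(x)
  observed <- dist_p(mean_vec(x), mean_vec(y))

  # randomly relabel the pooled sample and recompute the distance
  null_dists <- replicate(max_iter, {
    idx <- sample(length(pooled), n)
    dist_p(mean_vec(pooled[idx]), mean_vec(pooled[-idx]))
  })

  # proportion of relabelings at least as extreme as the observed distance
  list(estimate = observed, p.value = mean(null_dists >= observed))
}

# toy data: noisy tent functions of two different heights
set.seed(1)
grid <- seq(0, 1, by = 0.01)
tent <- pmax(0, 0.5 - abs(grid - 0.5))
x <- replicate(6, tent + rnorm(length(grid), sd = 0.01), simplify = FALSE)
y <- replicate(6, 0.2 * tent + rnorm(length(grid), sd = 0.01), simplify = FALSE)
res <- perm_test_sketch(x, y, p = 1, max_iter = 1000L, step = 0.01)
res$p.value
```

With six landscapes per group there are only choose(12, 6) = 924 distinct relabelings, so in a setting like this, enumerating every combination is feasible and random sampling is mainly useful for larger samples.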

Value

A list with class "htest" containing the following components:

statistic: (z-test only) the value of the test statistic.
parameter: (z-test only) the degrees of freedom of the test, \(\lvert x \rvert + \lvert y \rvert - 2\).
p.value: the p-value for the test.
estimate: the estimated difference in means.
null.value: the difference in means under the null hypothesis, always \(0\).
alternative: a character string describing the alternative hypothesis.
method: a character string indicating the test performed.
conf.int: (z-test only) a confidence interval for the estimated difference in means. Depends on the choice of conf.level.
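
The following base-R sketch shows how the z-test components listed above could relate to one another, assuming each landscape has already been reduced to a single number (for example, an integral). The helper z_test_sketch() is hypothetical, and the standard-error formula follows the usual two-sample form, which may differ in detail from what pl_z_test() computes. It handles only the two-sided alternative.

```r
# Hypothetical helper, not part of the package: two-sample z-test on
# scalar summaries (e.g. integrals) of two samples of landscapes.
z_test_sketch <- function(x, y, conf.level = 0.95) {
  n <- length(x)
  m <- length(y)
  est <- mean(x) - mean(y)               # estimated difference in means
  se <- sqrt(var(x) / n + var(y) / m)    # assumed standard error
  z <- est / se                          # test statistic
  p <- 2 * pnorm(-abs(z))                # two-sided p-value, normal reference
  half <- qnorm(1 - (1 - conf.level) / 2) * se
  list(
    statistic = z,
    parameter = n + m - 2,               # reported degrees of freedom
    p.value = p,
    estimate = est,
    null.value = 0,
    conf.int = c(est - half, est + half)
  )
}

# made-up scalar summaries for two groups of six landscapes
x <- c(0.10, 0.18, 0.15, 0.12, 0.20, 0.14)
y <- c(0.02, 0.03, 0.01, 0.04, 0.02, 0.03)
res <- z_test_sketch(x, y)
res$statistic
res$conf.int
```

The printed pl_z_test() result in the Examples appears consistent with a normal reference distribution: the reported p-value matches 2 * pnorm(-2.114) to the digits shown.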

Examples

# two sets of landscapes from similar but distinct point clouds
set.seed(711018L)
circlescapes <- replicate(
  6,
  tdaunif::sample_circle(n = rpois(n = 1, lambda = 24)) |>
    # `max_dim` replaces the deprecated `dim` argument
    ripserr::vietoris_rips(max_dim = 2L, threshold = 2) |>
    pl_new(degree = 1, exact = TRUE)
)
toruscapes <- replicate(
  6,
  tdaunif::sample_torus_tube(n = rpois(n = 1, lambda = 24)) |>
    # `max_dim` replaces the deprecated `dim` argument
    ripserr::vietoris_rips(max_dim = 2L, threshold = 2) |>
    pl_new(degree = 1, exact = TRUE)
)

# null hypothesis tests
pl_z_test(circlescapes, toruscapes)
#> 
#> 	z-test
#> 
#> data:  
#> z = 2.114, df = 10, p-value = 0.03451
#> alternative hypothesis: true difference in means is not equal to 0
#> 95 percent confidence interval:
#>  0.00895417 0.23680564
#> sample estimates:
#> mean integral of x mean integral of y 
#>         0.14781005         0.02493015 
#> 
pl_perm_test(circlescapes, toruscapes)
#> 
#> 	permutation test
#> 
#> data:  
#> p-value = 0.004
#> alternative hypothesis: true distance between mean landscapes is greater than 0
#> sample estimates:
#> distance between mean landscapes 
#>                        0.1639316 
#> 

if (FALSE) { # \dontrun{
# benchmark one- and two-step computation of indicator-based linear form
bench::mark(
  pl$indicator_form(f, 0, p = 1),
  pl$indicator(f, 0)$integral(p = 1)
)
bench::mark(
  pl$indicator_form(f, r = 1, p = 1),
  pl$indicator(f, r = 1)$integral(p = 1)
)
} # }
