arviz_stats.loo_influence
- arviz_stats.loo_influence(data, var_names=None, group='posterior_predictive', sample_dims=None, log_likelihood_var_name=None, kind='mean', standardize=True, probs=None, log_weights=None, pareto_k=None)
Identify influential observations based on leave-one-out (LOO) expectations.
Observation influence is measured as the change in a posterior or posterior predictive summary when each observation is left out in turn. The summary is selected with kind and can be the mean, median, standard deviation, variance, or one or more quantiles.
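The idea can be sketched in plain Python for the mean of a scalar quantity. This is only an illustration under stated assumptions, not the library's implementation: `loo_mean_shift` is a hypothetical helper, it uses raw importance weights proportional to 1/p(y_i | theta) instead of the Pareto-smoothed (PSIS) weights that loo_influence works with, and scaling the absolute difference by the standard deviation of the draws is an assumed reading of standardize=True.

```python
import math


def loo_mean_shift(draws, log_lik_i, standardize=True):
    """Hypothetical helper: change in the mean of `draws` when observation i is left out.

    draws     : list of S draws of a scalar quantity
    log_lik_i : list of S pointwise log-likelihood values for observation i
    """
    n = len(draws)
    # Raw LOO importance weights are proportional to 1 / p(y_i | theta_s).
    # (Assumption: no Pareto smoothing is applied in this sketch.)
    log_w = [-ll for ll in log_lik_i]

    # Self-normalize the weights, subtracting the max for numerical stability.
    m = max(log_w)
    w = [math.exp(lw - m) for lw in log_w]
    total = sum(w)
    w = [wi / total for wi in w]

    full_mean = sum(draws) / n
    loo_mean = sum(wi * d for wi, d in zip(w, draws))
    shift = abs(loo_mean - full_mean)

    if standardize:
        # Assumed scaling: sample standard deviation of the unweighted draws.
        sd = math.sqrt(sum((d - full_mean) ** 2 for d in draws) / (n - 1))
        shift /= sd
    return shift


draws = [0.0, 1.0, 2.0, 3.0]
# Equal log-likelihood for every draw: leaving the observation out changes nothing.
flat = loo_mean_shift(draws, [-1.0, -1.0, -1.0, -1.0])
# The last draw explains the observation poorly, so removing it upweights that draw.
skewed = loo_mean_shift(draws, [-1.0, -1.0, -1.0, -5.0])
print(flat, skewed)
```

An observation whose log-likelihood varies strongly across draws reweights the sample more and therefore produces a larger shift, which is what marks it as influential.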
- Parameters:
- data: DataTree or InferenceData
It should contain the selected group and log_likelihood.
- var_names: str or list of str, optional
The name(s) of the variable(s) for which to compute the influence.
- group: str
Group from which to compute weighted expectations. Defaults to posterior_predictive.
- sample_dims: str or sequence of hashable, optional
Defaults to rcParams["data.sample_dims"].
- log_likelihood_var_name: str, optional
The name of the variable in the log_likelihood group to use for loo computation. When log_likelihood contains more than one variable and group is posterior, this must be provided.
- kind: str, optional
The kind of expectation to compute. Available options are:
‘mean’. Default.
‘median’.
‘sd’.
‘var’.
‘quantile’.
‘octiles’.
- standardize: bool
Whether to standardize the computed metric. It uses the standard deviation when kind=mean and MAD when kind=median. Ignored for the other values of kind.
- probs: float or list of float, optional
The quantile(s) to compute when kind is ‘quantile’.
- log_weights: xarray.DataArray, optional
Pre-computed smoothed log weights from PSIS. Must be provided together with pareto_k. If not provided, PSIS will be computed internally.
- pareto_k: xarray.DataArray, optional
Pre-computed Pareto k-hat diagnostic values. Must be provided together with log_weights.
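To make the kind, standardize and probs options more concrete, here is a minimal plain-Python sketch of a weighted median shift standardized by the MAD. Everything in it is an assumption for illustration: `weighted_quantile` and `mad` are hypothetical helpers, the quantile uses a simple inverse-CDF rule without interpolation, the MAD is unscaled, and octiles are taken to be the seven quantiles at 1/8, 2/8, ..., 7/8. The library may differ on each of these points.

```python
import statistics


def weighted_quantile(draws, weights, prob):
    """Hypothetical helper: smallest draw whose cumulative normalized weight reaches `prob`."""
    pairs = sorted(zip(draws, weights))
    total = sum(weights)
    cum = 0.0
    for value, w in pairs:
        cum += w / total
        if cum >= prob:
            return value
    return pairs[-1][0]


def mad(draws):
    """Hypothetical helper: unscaled median absolute deviation around the median."""
    med = statistics.median(draws)
    return statistics.median(abs(d - med) for d in draws)


draws = [1.0, 2.0, 3.0, 4.0, 5.0]
equal_w = [1.0] * 5                  # no reweighting: ordinary median
loo_w = [1.0, 1.0, 1.0, 1.0, 6.0]    # made-up LOO weights favouring the largest draw

full_median = weighted_quantile(draws, equal_w, 0.5)
loo_median = weighted_quantile(draws, loo_w, 0.5)

# Analogue of kind="median" with standardize=True: divide the shift by the MAD.
median_shift = abs(loo_median - full_median) / mad(draws)

# Probabilities assumed to correspond to kind="octiles".
octile_probs = [k / 8 for k in range(1, 8)]
loo_octiles = [weighted_quantile(draws, loo_w, p) for p in octile_probs]
print(median_shift, loo_octiles)
```

The MAD plays the same role for the median that the standard deviation plays for the mean: it puts the shift on a scale that does not depend on the units of the variable.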
- Returns:
- shift: xarray.DataArray or xarray.Dataset
Influence metric for each observation.
- khat: xarray.DataArray or xarray.Dataset
Function-specific Pareto k-hat diagnostics for each observation.
Examples
Calculate influential observations based on the posterior median for the parameter mu:

In [1]: from arviz_stats import loo_influence
   ...: from arviz_base import load_arviz_data
   ...: dt = load_arviz_data("centered_eight")
   ...: shift, _ = loo_influence(dt, kind="median", var_names="mu", group="posterior")
   ...: shift
   ...:
Out[1]:
<xarray.Dataset> Size: 576B
Dimensions:  (school: 8)
Coordinates:
  * school   (school) <U16 512B 'Choate' 'Deerfield' ... 'Mt. Hermon'
Data variables:
    mu       (school) float64 64B 0.3999 0.1501 0.1494 ... 0.06267 0.5269 0.1137

Calculate influential observations based on 3 quantiles of the posterior predictive:

In [2]: shift, khat = loo_influence(dt, kind="quantile", probs=[0.25, 0.5, 0.75])
   ...: shift
   ...:
Out[2]:
<xarray.DataArray 'obs' (school: 8)> Size: 64B
array([3.26803155, 0.84859889, 0.75626084, 0.43330545, 1.76310658,
       0.34259657, 3.46995122, 0.65380875])
Coordinates:
  * school   (school) <U16 512B 'Choate' 'Deerfield' ... 'Mt. Hermon'
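
A common next step is to rank the observations by their shift. The sketch below does this in plain Python with the eight values copied from Out[2], identifying each school by its position along the school coordinate because the printed coordinate is abbreviated. The variable names are illustrative only.

```python
# Shift values copied from Out[2], in the order of the `school` coordinate.
shift_values = [3.26803155, 0.84859889, 0.75626084, 0.43330545,
                1.76310658, 0.34259657, 3.46995122, 0.65380875]

# Pair each value with its position along `school` and sort, largest shift first.
ranked = sorted(enumerate(shift_values), key=lambda pair: pair[1], reverse=True)
top_three = [index for index, _ in ranked[:3]]

for index, value in ranked[:3]:
    print(f"school index {index}: shift = {value:.2f}")
```

With the DataArray itself, `shift.sortby(shift, ascending=False)` should give the same ordering while keeping the school labels attached.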