via(): Switching to a different mode
Explanation
Source: vignettes/pointers/recalibration-method.Rmd
Rationale
A single test or model, in the form of a <STAT_FN>
built with the STAT_CONSTRUCTOR() functions, can offer several
estimation methods for statistical inference. Switching between them
is simple: one line of code, calling via() on a lazy
object.
For instance, a classical t-test, a bootstrap t-test, and a
permutation t-test answer the same question: “Is there a difference
between these two groups?”, but they get there through different
estimation machinery. Without via(), switching between them
would normally mean reaching for a different function entirely, or
threading a string flag through ... and hoping the
implementation underneath understands it. statim instead
treats the estimation method as a mode you switch within the
lazy pipeline, while the model definition,
the data, and (if present) the hypothesis claim stay untouched.
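To make the "same question, different machinery" point concrete, here is a minimal base R sketch of the three approaches on the sleep data. It does not use statim and is not how statim implements its variants; the percentile bootstrap, the two-sided permutation scheme, and the 2000 replicates are assumptions chosen for illustration.

```r
# Base R only: three ways of asking "is there a difference between the groups?"
set.seed(1)
x <- sleep$extra[sleep$group == "1"]
y <- sleep$extra[sleep$group == "2"]

# 1. Classical (Welch) t-test: relies on the t distribution.
classical <- t.test(x, y)

# 2. Bootstrap: resample each group with replacement and take a
#    percentile interval of the difference in means.
boot_diffs <- replicate(
  2000,
  mean(sample(x, replace = TRUE)) - mean(sample(y, replace = TRUE))
)
boot_ci <- quantile(boot_diffs, c(0.025, 0.975))

# 3. Permutation: shuffle group labels and compare the observed
#    difference against the shuffled differences.
observed <- mean(x) - mean(y)
pooled <- c(x, y)
perm_diffs <- replicate(2000, {
  idx <- sample(length(pooled), length(x))
  mean(pooled[idx]) - mean(pooled[-idx])
})
perm_p <- mean(abs(perm_diffs) >= abs(observed))
```

Each approach needs its own function calls and its own bookkeeping here, which is the friction via() is meant to remove.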
How via() fits in the pipeline
via() only operates on lazy objects — a
test_lazy produced by prepare_test(), or a
model_lazy produced by prepare_model(). It
cannot be used after conclude(), since by then the pipeline
has already executed and the result is no longer lazy.
sleep |>
  define_model(extra %by% group) |>
  prepare_test(TTEST) |>
  via("boot", n = 2000) |>
  conclude()
#>
#> == Model =======================================================================
#>
#> Variable Mapper : x_by
#> Args : extra | group
#> x_vars : 1
#> by_vars : 1
#>
#> == T-Test · boot ===============================================================
#>
#> ============================== Bootstrapped T-test =============================
#>
#>
#> -- Summary ---------------------------------------------------------------------
#> Warning in system("tput cols", intern = TRUE): running command 'tput cols' had
#> status 2
#> ------------------------------
#> CI : [-3.21, 0.0702]
#> n_reps : 2000
#> ------------------------------
via() itself does not run anything. It records two
things on the lazy object’s recalibrate_spec: the method
name (.method) and any extra named arguments
(...). The actual dispatch (i.e. finding the matching
implementation and running it) only happens once conclude()
is called. This is the same “defused until executed” behavior the rest
of the grammar follows: define_model(),
prepare_test() / prepare_model(),
state_null(), and via() all just accumulate
specification, and conclude() is the single point where
everything is resolved together.
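The record-now, run-later behavior can be sketched in a few lines of base R. Everything below (new_lazy(), toy_via(), toy_conclude(), and the "toy_lazy" class) is a hypothetical stand-in written for this illustration, not statim's internals.

```r
# Hypothetical stand-ins; none of these are statim functions.
new_lazy <- function(data) {
  structure(
    list(data = data, method = "default", args = list()),
    class = "toy_lazy"
  )
}

# Records the method name and extra arguments; computes nothing.
toy_via <- function(lazy, .method, ...) {
  stopifnot(inherits(lazy, "toy_lazy"))
  lazy$method <- .method
  lazy$args <- list(...)
  lazy
}

# The single point where the recorded specification is resolved and run.
toy_conclude <- function(lazy) {
  impls <- list(
    default = function(data) mean(data),
    boot = function(data, n = 1000) {
      reps <- replicate(n, mean(sample(data, replace = TRUE)))
      quantile(reps, c(0.025, 0.975))
    }
  )
  do.call(impls[[lazy$method]], c(list(data = lazy$data), lazy$args))
}

set.seed(42)
spec <- toy_via(new_lazy(sleep$extra), "boot", n = 500)  # still lazy
result <- toy_conclude(spec)                             # executed here
```

In this sketch, calling toy_via() on result fails, because result is no longer a "toy_lazy" object; that mirrors why via() cannot follow conclude().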
Where variant names come from
A variant name like "boot" or "permute"
from TTEST() in x_by mode is not arbitrary. It
has to match a name registered for that specific model type, inside the
agendas() object passed to stat_define() when
the test or model was built. Two sources are checked:
- The variants declared directly inside the matched stat_define()’s agendas(), alongside its baseline(). This is how TTEST ships with "boot", "permute", "contrast", and "multi" for x_by() pipelines, for example.
- Any variants registered afterwards via add_variant(). This is a developer-facing function, paired with the %<-% operator, and is meant for extending an existing <STAT_FN> (built with HTEST_FN() or MODEL_FN()) without touching its original definition:
  add_variant(<STAT_FN>, <var_id>, "<new_mode>") %<-% variant(
    fn = function(.proc, arg1 = , arg2 = , ...) {
      # ...
    }
  )
  Variants added this way carry an origin: "user" (the default) is scoped to the current session and can be removed with remove_variant(), while "package" is meant to be registered from a package’s .onLoad() and lasts as long as that package stays loaded. Either way, the name "default" is frozen and cannot be added or removed through add_variant() / remove_variant().
Because matching is scoped to model type, the same variant name can
mean different things (or simply not exist) depending on which
<var_id> you used in define_model(). A
variant registered for x_by() pipelines is not
automatically available to a <formula>-based pipeline
of the same test, even if both eventually call TTEST.
If you pass a method name that is not registered for the model type
in play, via() fails immediately, before
conclude() is even reached:
sleep |>
  define_model(extra %by% group) |>
  prepare_test(TTEST) |>
  via("not_a_real_method")
#> Error in `method(via, list(statim::test_lazy, class_character))`:
#> ! No variant "not_a_real_method" registered for model type "x_by".
#> ℹ Available variants: "contrast", "multi", "boot", and "permute".
The error message lists every variant available for that exact model type, so you can fix the call without digging through documentation.
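A rough way to picture lookup scoped by model type is a registry keyed by model type, checked as soon as a method name is supplied. The registry list and check_variant() below are hypothetical, and the "formula" entry is invented purely to show that the same name can be missing for another model type; only the x_by names come from the error message above.

```r
# Hypothetical registry: one set of variant names per model type.
registry <- list(
  x_by    = c("contrast", "multi", "boot", "permute"),
  formula = c("contrast")  # invented entry, for illustration only
)

# Fails early, listing what is available for that model type.
check_variant <- function(model_type, method) {
  available <- registry[[model_type]]
  if (!method %in% available) {
    stop(sprintf(
      "No variant \"%s\" registered for model type \"%s\". Available variants: %s.",
      method, model_type,
      paste0("\"", available, "\"", collapse = ", ")
    ), call. = FALSE)
  }
  invisible(method)
}

check_variant("x_by", "boot")  # passes silently

# The same name is rejected for a different model type.
msg <- tryCatch(
  check_variant("formula", "boot"),
  error = function(e) conditionMessage(e)
)
```

Keying the check on model type first is what lets a name like "boot" be valid in one pipeline and rejected in another.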
Argument merging
Any named arguments supplied to via() are not a
replacement for the arguments already set in prepare_test()
/ prepare_model() – they are merged on top of them.
Concretely, conclude() combines the two argument lists with
utils::modifyList(), where the names from
via() win on overlap:
sleep |>
  define_model(extra %by% group) |>
  prepare_test(TTEST) |>
  update(.ci = 0.9) |>
  via("boot", n = 2000) |>
  conclude()
#>
#> == Model =======================================================================
#>
#> Variable Mapper : x_by
#> Args : extra | group
#> x_vars : 1
#> by_vars : 1
#>
#> == T-Test · boot ===============================================================
#>
#> ============================== Bootstrapped T-test =============================
#>
#>
#> -- Summary ---------------------------------------------------------------------
#> Warning in system("tput cols", intern = TRUE): running command 'tput cols' had
#> status 2
#> -----------------------------
#> CI : [-2.94, -0.29]
#> n_reps : 2000
#> -----------------------------
Here, .ci = 0.9 came from update() on the
baseline specification, and n = 2000 came from
via(). Both reach the "boot" implementation’s
fn, because neither name collides with the other. If
via() had also supplied .ci, that value would
have taken precedence over the one set earlier in the pipeline.
This merging behavior is also why via() does not require
you to repeat arguments the variant shares with the baseline. Only
supply what differs for that particular variant; anything else falls
through from what was already specified, or from the variant’s own
declared defaults in its fn signature.
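The merge rule can be checked directly with utils::modifyList(), which is a real base R function. The argument lists and toy_boot() below are made-up stand-ins for what a pipeline would carry, not statim objects.

```r
# Stand-ins for the two argument lists combined at conclude().
baseline_args <- list(.ci = 0.9)   # e.g. set earlier with update()
via_args      <- list(n = 2000)    # supplied to via()

# No overlap: both names survive the merge.
merged <- utils::modifyList(baseline_args, via_args)

# Overlap: the second list (the via() side) wins.
overridden <- utils::modifyList(baseline_args, list(.ci = 0.99, n = 2000))

# Anything not supplied falls through to the function's own defaults.
toy_boot <- function(.ci = 0.95, n = 1000, seed = 1) {
  list(.ci = .ci, n = n, seed = seed)
}
resolved <- do.call(toy_boot, merged)
```

In resolved, .ci and n come from the merged list while seed keeps the default from toy_boot()'s signature, which is the fall-through behavior described above.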
Recalibration and hypothesis claims
If a hypothesis was attached with state_null(),
recalibrating the method does not invalidate it, as long as the variant
you switch to actually declares a claim_parser of its own.
claim_parser is an argument to baseline() and
to each variant() individually — a map_claim()
object that knows how to turn a stated claim into the arguments that one
fn expects, since different estimation methods can require
the same population-parameter claim expressed differently.
TTEST’s x_by() implementation, for instance,
gives a claim_parser to its base and to
"contrast", but not to the resampling-based
"boot" or "permute" variants, since those do
not take a hypothesized value as an argument in the first place.
sleep |>
  define_model(extra %by% group) |>
  prepare_test(TTEST) |>
  state_null(
    2 * MU(extra, group == "1") - MU(extra, group == "2") <= 0
  ) |>
  via("contrast") |>
  conclude()
#>
#> == Model =======================================================================
#>
#> Variable Mapper : x_by
#> Args : extra | group
#> x_vars : 1
#> by_vars : 1
#>
#> == T-Test · contrast ===========================================================
#>
#> -- Summary ---------------------------------------------------------------------
#>
#> ──────────────────────────────────────────
#> group estimate t_stat df p_val
#> ──────────────────────────────────────────
#> group -0.830 -0.640 14.130 0.734
#> ──────────────────────────────────────────
#>
#>
#> -- Confidence Interval ---------------------------------------------------------
#>
#> ─────────────────────────────
#> group lower_95 upper_95
#> ─────────────────────────────
#> group -3.112 Inf
#> ─────────────────────────────
At conclude(), the "contrast" variant’s own
claim_parser runs against the stated claim, and its output
(in this case, .mu, .op, and .w)
is merged into the argument list the same way the rest of
via()’s arguments are. If the variant you switched to has
no claim_parser, conclude() reports that
explicitly rather than silently ignoring the claim:
sleep |>
  define_model(extra %by% group) |>
  prepare_test(TTEST) |>
  state_null(MU(extra, group == "1") == MU(extra, group == "2")) |>
  via("permute", n = 999L) |>
  conclude()
#> Error in `method(conclude, statim::test_lazy)`:
#> ! No claim parser defined for variant "permute".
#> ℹ Remove `state_null()` or use a supported variant.
Because claim_parser lives on the implementation itself
rather than in a separate lookup keyed by variant name, there’s nothing
else to update when you add a new variant via add_variant()
— give it a claim_parser if it should support
state_null(), or leave the argument out if it shouldn’t.
Either way, conclude() checks whichever implementation it
actually resolved, regardless of whether that came from the original
agendas() or from the runtime registry.
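One way to see why keeping the parser on the implementation needs no extra bookkeeping is the toy below. The impls list, its fields, and toy_conclude() are hypothetical; statim's actual variant() and map_claim() objects are not reproduced here, and the claim is reduced to a single made-up value field.

```r
# Hypothetical implementations: each carries its own claim_parser (or none).
impls <- list(
  contrast = list(
    fn = function(data, .mu = 0) mean(data) - .mu,
    claim_parser = function(claim) list(.mu = claim$value)
  ),
  permute = list(
    fn = function(data, n = 999L) n,
    claim_parser = NULL  # this variant does not accept a claim
  )
)

toy_conclude <- function(method, data, claim = NULL, args = list()) {
  impl <- impls[[method]]
  if (!is.null(claim)) {
    # Check the implementation that was actually resolved.
    if (is.null(impl$claim_parser)) {
      stop(sprintf("No claim parser defined for variant \"%s\".", method),
           call. = FALSE)
    }
    # Parsed claim arguments are merged like any other arguments.
    args <- utils::modifyList(args, impl$claim_parser(claim))
  }
  do.call(impl$fn, c(list(data = data), args))
}

claim <- list(value = 0.5)
ok <- toy_conclude("contrast", sleep$extra, claim)
msg <- tryCatch(
  toy_conclude("permute", sleep$extra, claim),
  error = function(e) conditionMessage(e)
)
```

Adding a third entry to impls, with or without a claim_parser, requires no change to toy_conclude(), which is the property the paragraph above describes.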
TL;DR
- via() only works on lazy objects, before conclude().
- It records a method name and extra arguments; nothing executes until conclude().
- Variant names are scoped per model type, sourced from the matched stat_define()’s agendas() and any variants registered afterwards via add_variant().
- Arguments from via() are merged with, not substituted for, the arguments already declared earlier in the pipeline.
- A stated null hypothesis survives recalibration only if the new variant declares its own claim_parser; not every variant needs one.