Context-Based Refactoring of Element-Wise Analysis Functions#131
Context-Based Refactoring of Element-Wise Analysis Functions#131mattcieslak merged 10 commits intomainfrom
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #131 +/- ##
==========================================
+ Coverage 89.33% 90.48% +1.14%
==========================================
Files 6 6
Lines 1998 1870 -128
==========================================
- Hits 1785 1692 -93
+ Misses 213 178 -35 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
@mattcieslak I'm ok if this waits until a later release but took a shot at reducing some of the redundancy |
|
There's one major change in behavior that this introduces, which is that the pre-filtered wdyt about keeping all of the other grouping/checking in the pre-fitting context, but keeping the filtering in the per-element runs? I think it would still be a major performance gain. |
Summary
Refactors the element-wise analysis pipeline to hoist loop-invariant computations (formula parsing, collision checks, cross-scalar source alignment) out of the per-element functions and into precomputed context objects built once before the loop. This eliminates redundant string operations,
match()calls, and validation checks in large-scale runs.Changes
New:
analyse-helpers.R— Context builders and shared data assembly.build_base_context()— Precomputes scalar name lookups, column-name collision checks, andmatch()-based source-alignment reorder indices. Supportsscalar_subset = NULL(wrap: all scalars) and explicit subset (lm/gam: formula-referenced scalars only)..build_lm_context()— Extends base context withall.vars()formula parsing, LHS/RHS variable detection, and scalar predictor identification..build_gam_context()— Extends lm context with cachedmgcv::interpret.gam()breakdown..build_wrap_context()— Calls base context in "attach all scalars" mode..assemble_element_data()— Shared per-element helper that reads scalar rows using precomputed reorder indices, builds intersection validity masks, applies subject threshold, and returns the filtered data.frame. Replaces duplicated logic across all threeanalyseOneElement.*functions.Modified:
ModelArray_Constructor.R— Per-element functionsanalyseOneElement.lm— Accepts optionalctxparameter. When provided, delegates data assembly to.assemble_element_data()and skips all invariant work. Falls back to inline computation whenctx = NULL(backward compatible for direct calls/debugging).analyseOneElement.gam— Samectx/fallback pattern. Preserves(Intercept)→Interceptrename ands(age)→s_agesmooth term normalization in the statistics extraction section.analyseOneElement.wrap— Samectx/fallback pattern. Coercion section now properly handles: multi-row data.frame errors, unnamed atomic/list auto-naming (v1,v2, ...), and unsupported return types with informative error messages.Modified:
analyse.R— Top-level dispatchersModelArray.lm()— Buildsctxonce via.build_lm_context(), passes through.find_initiator_element()and.parallel_dispatch().ModelArray.gam()— Buildsctxonce via.build_gam_context(), uses cachedgam_formula_breakdownforchecker_gam_formula(). Reduced model contexts forchanged.rsqare built separately.ModelArray.wrap()— Buildsctxonce via.build_wrap_context(). Restored missingwrite_scalar_column_nameslength validation.Minor fixes
stats::setNames()qualified to avoid R CMD check NOTE.utils::globalVariables(".")added to suppress magrittr.NOTE.subset(select = -c(...))replaced with bracket-based column removal to eliminate bare-name NOTEs inModelArray.gam().Tests:
test-analyse_context.R(new)Organized in five sections following the context hierarchy:
.assemble_element_data()— validity masking, threshold, cross-scalar reorder, intersection masksctxvs. legacy path equivalence (lm, gam, wrap) — verifies identical output.find_initiator_element()— success and all-elements-fail casesTests:
test-ModelArray_low_hanging_coverage.R(updated)expect_errorregex patterns to match refactored error messages for unsupported return types and multi-row data.frame validation.Not Changed
mergeModelArrays, or S4 class/accessor definitions.analyseOneElement.*function signatures remain backward compatible — all existing direct calls work without modification via thectx = NULLfallback.helper-expected_values.Ror the expected value generation logic.Closes #127