SQL: accept a trailing options map as alternative to positional optional args in functions #3879

@lvca

Motivation

SQL functions in ArcadeDB today accept positional arguments. When a function grows multiple optional parameters of different types, every new capability either:

  • extends the positional list, which shifts arguments and breaks existing callers in future versions, or
  • overloads a single slot with type dispatch (reinterpret arg 4 based on its Java type), which is fragile and hard to document.

The immediate case is vector.neighbors. The current signature is vector.neighbors(index, vector, k [, efSearch]). We also want to expose the existing allowedRIDs whitelist (already supported at the Java level by LSMVectorIndex.findNeighborsFromVector, but never wired up to SQL), and a follow-up feature is planned for predicate-based filtered vector search. Three optional arguments of different types are unworkable positionally.

Proposal

Allow any SQL function to accept an options map as its trailing argument, as an alternative to positional optional arguments. Keys in the map correspond to named options. Unknown keys are rejected with a clear error to catch typos.

Example

Today:

SELECT `vector.neighbors`('Doc[embedding]', :qv, 10, 200)

After:

SELECT `vector.neighbors`('Doc[embedding]', :qv, 10, { efSearch: 200, filter: [#1:0, #1:1, #1:2] })

Both forms stay supported. The short positional form is kept for the common single-optional case. The map form is picked up when the trailing arg is a map literal.
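
One way a function body could tell the two call forms apart is sketched below in Java. Everything in it is hypothetical: the class name TrailingOptionsSketch, the resolveEfSearch helper, and the -1 sentinel meaning "use the index default" are assumptions for illustration, not ArcadeDB API. It also assumes that a SQL map literal reaches the function as a java.util.Map.

```java
import java.util.Map;

// Hypothetical sketch, not ArcadeDB code: picks efSearch from either call form.
final class TrailingOptionsSketch {
  // Assumption: -1 stands for "let the index use its default beam width".
  static final int USE_INDEX_DEFAULT = -1;

  // args mirrors (index, vector, k [, efSearch | options map]).
  static int resolveEfSearch(final Object[] args) {
    if (args.length < 3 || args.length > 4)
      throw new IllegalArgumentException("Expected 3 or 4 arguments, got " + args.length);
    if (args.length == 3)
      return USE_INDEX_DEFAULT;

    final Object trailing = args[3];
    if (trailing instanceof Map) {
      // Map form: { efSearch: 200, ... }
      final Object value = ((Map<?, ?>) trailing).get("efSearch");
      if (value == null)
        return USE_INDEX_DEFAULT;
      if (value instanceof Number)
        return ((Number) value).intValue();
      throw new IllegalArgumentException("Option 'efSearch' must be a number");
    }
    if (trailing instanceof Number)
      // Positional form: the fourth argument is efSearch itself.
      return ((Number) trailing).intValue();

    throw new IllegalArgumentException("Fourth argument must be a number or an options map");
  }
}
```

Because the check is made on the runtime type of the trailing argument only, existing positional callers keep working unchanged, and a function that never opts in is unaffected.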

Scope

  1. A small utility class com.arcadedb.function.sql.FunctionOptions wrapping a Map<String, Object> with:
    • typed getters (int/long/double/boolean/string/list) with default values,
    • unknown-key rejection with a descriptive error listing the accepted keys,
    • shared across adopting functions so the UX stays consistent.
  2. vector.neighbors adopts the pattern. The options map initially accepts:
    • efSearch: int, search beam width (existing functionality, exposed through the new surface).
    • filter: list of RIDs restricting the search space. Already supported at the Java API level by LSMVectorIndex.findNeighborsFromVector(..., Set<RID> allowedRIDs), never exposed in SQL.
  3. Documentation and tests.
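
The utility described in item 1 could look roughly like the following Java sketch. It is deliberately named FunctionOptionsSketch to make clear that it is not the proposed com.arcadedb.function.sql.FunctionOptions class. The constructor shape, method names, and error wording are all assumptions, and only a subset of the typed getters is shown.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of an options wrapper: typed getters plus unknown-key rejection.
final class FunctionOptionsSketch {
  private final String functionName;
  private final Map<String, ?> values;

  FunctionOptionsSketch(final String functionName, final Map<String, ?> values, final Set<String> acceptedKeys) {
    // Reject unknown keys up front so typos fail loudly instead of being ignored.
    for (final String key : values.keySet())
      if (key == null || !acceptedKeys.contains(key))
        throw new IllegalArgumentException("Unknown option '" + key + "' for function " + functionName
            + ". Accepted keys: " + new TreeSet<>(acceptedKeys));
    this.functionName = functionName;
    this.values = values;
  }

  int getInt(final String key, final int defaultValue) {
    final Object value = values.get(key);
    if (value == null)
      return defaultValue;
    if (value instanceof Number)
      return ((Number) value).intValue();
    throw typeError(key, "a number", value);
  }

  boolean getBoolean(final String key, final boolean defaultValue) {
    final Object value = values.get(key);
    if (value == null)
      return defaultValue;
    if (value instanceof Boolean)
      return (Boolean) value;
    throw typeError(key, "a boolean", value);
  }

  List<?> getList(final String key, final List<?> defaultValue) {
    final Object value = values.get(key);
    if (value == null)
      return defaultValue;
    if (value instanceof List)
      return (List<?>) value;
    throw typeError(key, "a list", value);
  }

  private IllegalArgumentException typeError(final String key, final String expected, final Object actual) {
    return new IllegalArgumentException("Option '" + key + "' of function " + functionName + " must be "
        + expected + ", got " + actual.getClass().getSimpleName());
  }
}
```

Under these assumptions, vector.neighbors would build the wrapper with the accepted keys efSearch and filter, then read each option through a typed getter with its default. A misspelled key such as efSerch would fail at construction with the list of accepted keys in the message.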

Non-goals

  • No grammar changes. Map literal is an existing production in the SQL grammar; no parser work is required.
  • No engine-wide convention. Functions opt in individually. A blanket rule ("last map arg is options") would collide with functions that legitimately take a map as data (document builders, etc.).
  • No named-argument syntax (efSearch := 200). That is a separate, larger change and can be stacked on top later.

Future work

  • Apply the same pattern to other multi-optional functions (e.g. fulltext.search, selected graph.* functions) as they grow new options.
  • Extend vector.neighbors's filter option beyond a RID whitelist to a predicate lambda, for filtered vector search that avoids materializing millions of RIDs. Tracked separately.
