
Improve display of schema error #1369

@abey79

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

Consider this snippet:

import datafusion as dfn
import pyarrow as pa

ctx = dfn.SessionContext()
data = {"id": [1, 2, 3], "name": ["Alice", "Bob", "Charlie"], "age": [25, 30, 35]}
table = pa.Table.from_pydict(data)
ctx.register_record_batches("users", [table.to_batches()])

df = ctx.table("users")
df.select("does_not_exist")

I'm getting the following error:

Exception: DataFusion error: SchemaError(FieldNotFound { field: Column { relation: None, name: "does_not_exist" }, valid_fields: [Column { relation: Some(Bare { table: "users" }), name: "id" }, Column { relation: Some(Bare { table: "users" }), name: "name" }, Column { relation: Some(Bare { table: "users" }), name: "age" }] }, Some(""))

Describe the solution you'd like

While technically correct, the above error is not very user-friendly. Something like field "does_not_exist" was not found, valid fields: "id", "name", "age" would be preferable.

Describe alternatives you've considered

n/a

Additional context

I had a cursory look at the code, and it seems that:

  • DataFusionError implements Display
  • Debug is used instead in the bindings

I haven't tested this, but it may be just a matter of switching the bindings from Debug to Display formatting when the error is converted.
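To illustrate the difference between the two formatting traits, here is a minimal, self-contained Rust sketch. The `SchemaError` enum, the `example_error` helper, and the message wording are all hypothetical stand-ins made up for this illustration; they are not DataFusion's actual types or its actual Display output, which I haven't checked.

```rust
use std::fmt;

// Hypothetical, simplified stand-in for a schema error.
// This is NOT DataFusion's real type; it only mimics the shape of the issue.
#[derive(Debug)]
enum SchemaError {
    FieldNotFound {
        field: String,
        valid_fields: Vec<String>,
    },
}

// Display is the human-readable representation; Debug (derived above)
// dumps the structure of the value instead.
impl fmt::Display for SchemaError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SchemaError::FieldNotFound { field, valid_fields } => {
                // Quote each valid field name and join them with commas.
                let valid = valid_fields
                    .iter()
                    .map(|name| format!("\"{}\"", name))
                    .collect::<Vec<_>>()
                    .join(", ");
                write!(f, "field \"{}\" was not found, valid fields: {}", field, valid)
            }
        }
    }
}

// Hypothetical helper that builds an error matching the snippet above.
fn example_error() -> SchemaError {
    SchemaError::FieldNotFound {
        field: "does_not_exist".to_string(),
        valid_fields: vec!["id".to_string(), "name".to_string(), "age".to_string()],
    }
}

fn main() {
    let err = example_error();
    // Debug formatting: prints the variant and its fields structurally.
    println!("{:?}", err);
    // Display formatting: prints the message written in the impl above.
    println!("{}", err);
}
```

If the bindings format the error with `{:?}` today, swapping to `{}` (or `to_string()`) would pick up whatever message the Display implementation provides, assuming it is friendlier than the Debug dump.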

Labels: enhancement