Core: Cache PartitionData template in PartitionsTable to avoid rebuilding Avro schema per partition#16208

Open
Wenjun7J wants to merge 2 commits into apache:main from Wenjun7J:partitions-table-schema-cache

Wenjun7J commented May 4, 2026

What changed

This change avoids rebuilding the same PartitionData Avro schema for every partition row when scanning the partitions metadata table.

Instead of creating a fresh PartitionData(partitionType) for each partition value, PartitionsTable now creates one PartitionData template per scan and reuses it through copyFor(key).

A regression test is also added to verify that partition rows produced within the same scan reuse the same underlying Avro schema instance.
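
The template-and-copy idea, and the identity check the regression test relies on, can be sketched roughly as follows. This is a simplified model and not the code in this PR: `RowSketch` and `TemplateReuseSketch` are hypothetical stand-ins for `PartitionData` and the scan logic in `PartitionsTable`, and a plain `List<String>` stands in for the Avro schema.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a partition row; NOT Iceberg's PartitionData.
final class RowSketch {
  // Stand-in for the converted Avro schema; building it is the costly step.
  final List<String> schema;
  final Object[] values;

  // "Expensive" path: builds a fresh schema object from the field names.
  RowSketch(String[] fieldNames) {
    this.schema = new ArrayList<>(Arrays.asList(fieldNames));
    this.values = new Object[fieldNames.length];
  }

  // Cheap path used by copyFor: reuses an existing schema instance.
  private RowSketch(List<String> schema, Object[] values) {
    this.schema = schema;
    this.values = values;
  }

  // Mirrors the idea of copyFor(key): new values, same schema instance.
  RowSketch copyFor(Object[] key) {
    return new RowSketch(schema, key.clone());
  }
}

final class TemplateReuseSketch {
  // Builds one template per scan and derives every row from it.
  static List<RowSketch> scan(List<Object[]> keys, String[] fieldNames) {
    RowSketch template = new RowSketch(fieldNames);
    List<RowSketch> rows = new ArrayList<>();
    for (Object[] key : keys) {
      rows.add(template.copyFor(key));
    }
    return rows;
  }

  // Identity (==) check: every row must point at the same schema object.
  static boolean allShareSchema(List<RowSketch> rows) {
    for (RowSketch row : rows) {
      if (row.schema != rows.get(0).schema) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    List<Object[]> keys = new ArrayList<>();
    keys.add(new Object[] {"2026-05-01", 1});
    keys.add(new Object[] {"2026-05-02", 2});
    List<RowSketch> rows = scan(keys, new String[] {"day", "bucket"});
    System.out.println("rows share one schema instance: " + allShareSchema(rows));
  }
}
```

The check compares references rather than using `equals`, because two separately built schemas would still be equal in content; only instance identity shows that the conversion was skipped.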

Why

PartitionsTable currently constructs partition rows like this:

  • create PartitionData(partitionType)
  • convert partition type to Avro schema
  • copy the partition key into the new object

When a table has many partition values, the same schema conversion is repeated once per partition row, which creates heavy allocation pressure in:

  • PartitionData.partitionDataSchema
  • AvroSchemaUtil.convert
  • TypeToSchema$WithTypeToName.struct

This is especially visible for wide partition specs and large metadata table scans.
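
To make the cost difference concrete, here is a minimal counting sketch. It is an illustration under assumptions, not Iceberg code: `convert` is a hypothetical stand-in for the type-to-Avro-schema conversion, and the counter only records how often it runs.

```java
// Hypothetical model of the two strategies; not Iceberg's actual classes.
final class ConversionCountSketch {
  // Number of calls to the stand-in conversion since the last reset.
  static long conversions = 0;

  // Stand-in for converting a partition type to an Avro schema.
  static String convert(String[] partitionFields) {
    conversions++;
    return "record<" + String.join(",", partitionFields) + ">";
  }

  // Old behavior: one conversion for every partition row.
  static long perRow(int partitionCount, String[] partitionFields) {
    conversions = 0;
    for (int i = 0; i < partitionCount; i++) {
      convert(partitionFields);
    }
    return conversions;
  }

  // New behavior: one conversion per scan, reused by every row.
  static long perScan(int partitionCount, String[] partitionFields) {
    conversions = 0;
    String shared = convert(partitionFields);
    int rowsBuilt = 0;
    for (int i = 0; i < partitionCount; i++) {
      if (shared != null) {
        rowsBuilt++; // each row would reference the shared schema
      }
    }
    return conversions;
  }

  public static void main(String[] args) {
    String[] fields = {"p1", "p2", "p3", "p4"};
    System.out.println("per-row conversions:  " + perRow(1000, fields));
    System.out.println("per-scan conversions: " + perScan(1000, fields));
  }
}
```

The number of conversions grows linearly with the partition count in the first strategy and stays at one in the second, which is why the effect shows up most on large scans.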

External reproduction

I used a standalone repro app that repeatedly scans the Iceberg partitions metadata table for a table with:

  • 20,000 partition values
  • 4 partition columns
  • repeated full partitionsTable scans

    try (CloseableIterable<FileScanTask> tasks = partitionsTable.newScan().planFiles()) {
      for (FileScanTask task : tasks) {
        try (CloseableIterable<StructLike> rows = task.asDataTask().rows()) {
          for (StructLike row : rows) {
            StructProjection partitionData = row.get(0, StructProjection.class);
            if (partitionData == null) {
              throw new IllegalStateException("Partition row returned null partition data");
            }
            partitionRows++;
          }
        }
      }
    }

Before fix (origin/main)

  • Average wall clock time: 12.71s
  • Average max RSS: 5,938,604 KB (~5.66 GiB)

After fix

  • Average wall clock time: 5.24s
  • Average max RSS: 1,483,155 KB (~1.41 GiB)

Improvement

  • Wall clock time reduced by 58.8%
  • Max RSS reduced by 75.0%
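
For anyone who wants to re-derive these percentages, the arithmetic is sketched below. `ImprovementCheck` is a hypothetical helper written for this description, and the GiB conversion assumes the reported "KB" values are KiB (1 GiB = 1024 × 1024 KiB), which is consistent with the GiB figures quoted above.

```java
// Hypothetical helper that recomputes the reported improvements.
final class ImprovementCheck {
  // Relative reduction from `before` to `after`, in percent.
  static double reductionPercent(double before, double after) {
    return (before - after) / before * 100.0;
  }

  // Converts KiB to GiB (assumes the reported "KB" means KiB).
  static double kibToGib(double kib) {
    return kib / (1024.0 * 1024.0);
  }

  public static void main(String[] args) {
    System.out.printf("wall clock reduction: %.1f%%%n", reductionPercent(12.71, 5.24));
    System.out.printf("max RSS reduction:    %.1f%%%n", reductionPercent(5_938_604, 1_483_155));
    System.out.printf("max RSS before:       %.2f GiB%n", kibToGib(5_938_604));
    System.out.printf("max RSS after:        %.2f GiB%n", kibToGib(1_483_155));
  }
}
```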

Signed-off-by: SevenJ <wenjun7j@gmail.com>
github-actions bot added the core label May 4, 2026
Wenjun7J commented May 5, 2026

@RussellSpitzer @pvary could you please take a look?
