Commits


David Schlosnagle authored and GitHub committed 7bfe02db04e
GH-41573: [Java] VectorSchemaRoot uses inefficient stream to copy fieldVectors (#41574)

### Rationale for this change

While reviewing allocation profiling of an Arrow-intensive application, I noticed significant allocations due to `ArrayList#grow()` originating from `org.apache.arrow.vector.VectorSchemaRoot#getFieldVectors()`. This method uses an inefficient `fieldVectors.stream().collect(Collectors.toList())` to create a list copy, leading to reallocations as the target list is collected. This could be replaced with a more efficient `new ArrayList<>(fieldVectors)` to make a pre-sized list copy, or even better, an unmodifiable view via `Collections.unmodifiableList(fieldVectors)`.

### What changes are included in this PR?

* Use `Collections.unmodifiableList(List)` to return an unmodifiable list view of `fieldVectors` from `getFieldVectors()`
* Pre-size the `fieldVectors` `ArrayList` in the static factory `VectorSchemaRoot#create(Schema, BufferAllocator)`
* `VectorSchemaRoot#setRowCount(int)` iterates over the instance `fieldVectors` instead of a copied list (similar to the existing `allocateNew()`, `clear()`, `contentToTSVString()`)

### Are these changes tested?

These changes are covered by existing unit and integration tests.

### Are there any user-facing changes?

No

* GitHub Issue: #41573

Authored-by: David Schlosnagle <davids@palantir.com>
Signed-off-by: David Li <li.davidm96@gmail.com>
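
The three list-copy strategies named in the rationale above can be compared in a minimal sketch. This is not the Arrow source: the class `FieldVectorCopyDemo` and its method names are hypothetical, and plain strings stand in for `FieldVector` instances so the example runs without Arrow on the classpath.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in for the copy logic discussed in the commit message.
class FieldVectorCopyDemo {

    // Original approach: the collector appends element by element, so the
    // target list (an ArrayList in current OpenJDK builds) may have to grow
    // several times while it is filled.
    static List<String> streamCopy(List<String> fieldVectors) {
        return fieldVectors.stream().collect(Collectors.toList());
    }

    // Pre-sized copy: the ArrayList copy constructor sizes the backing array
    // from the source collection in one step.
    static List<String> presizedCopy(List<String> fieldVectors) {
        return new ArrayList<>(fieldVectors);
    }

    // Unmodifiable view: no element copy at all, only a read-only wrapper
    // around the same underlying list.
    static List<String> unmodifiableView(List<String> fieldVectors) {
        return Collections.unmodifiableList(fieldVectors);
    }

    public static void main(String[] args) {
        List<String> fieldVectors = new ArrayList<>(Arrays.asList("id", "name", "price"));

        List<String> copy = presizedCopy(fieldVectors);
        List<String> view = unmodifiableView(fieldVectors);

        fieldVectors.add("qty");

        System.out.println(copy.size()); // 3: a copy is a snapshot
        System.out.println(view.size()); // 4: a view tracks the backing list

        try {
            view.add("extra");
        } catch (UnsupportedOperationException e) {
            System.out.println("view is read-only");
        }
    }
}
```

The view avoids allocation proportional to the list size, at the cost of different semantics from a copy: it reflects later changes to the backing list and rejects mutation by the caller.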