Commits


mwish authored and GitHub committed e837f73b476
GH-14923: [C++][Parquet] Fix DELTA_BINARY_PACKED problem on reading the last block with malformed bit-width (#15241)

The problem is described here: https://github.com/apache/arrow/issues/14923

This patch fixes that issue. The code is somewhat complex, so the changes are listed below.

- Rename variables, since the original names were confusing to me.
  - `block_initialized_` -> `first_block_initialized_`, because it indicates whether the first block in the page has been initialized.
  - `total_value_count_` -> `total_values_remaining_`, because it is not the total number of values within a page; it is the number of values remaining to be decoded within a page.
  - `values_count_current_mini_block_` -> `values_remaining_current_mini_block_`, ditto.
- Add variables
  - `total_value_count_`: the total number of values within a page.
- Change syntax
  - Split `InitBlock()` into `InitBlock()` and `InitMiniBlock`.
- Implementation: most of the logic is in `InitBlock()` and `InitMiniBlock`.
- Testing: thanks @rok. I use a page with 65 values and bit widths `32 32 165 165`.

Personally, I also use the following change for testing:

```c++
   for (uint32_t i = num_miniblocks; i < mini_blocks_per_block_; i++) {
-    bit_width_data[i] = 0;
+    // bit_width_data[i] = 0;
+    bit_width_data[i] = static_cast<uint8_t>(random());
   }
```

The code works well in both debug and release mode.

* Closes: #14923

Lead-authored-by: mwish <maplewish117@gmail.com>
Co-authored-by: Antoine Pitrou <antoine@python.org>
Signed-off-by: Antoine Pitrou <antoine@python.org>
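
For readers unfamiliar with the failure mode, the idea behind the fix can be sketched roughly as follows. This is not the code from the patch: `MiniBlocksInUse` and `BitWidthsValid` are hypothetical helpers, and the layout used in the usage note (4 mini-blocks of 32 values each, 32-bit values) is an assumption chosen to match the 65-value test page. The point it illustrates is that in the last block only the mini-blocks that still hold values should have their bit widths checked, while trailing bit-width bytes may contain arbitrary data.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not Arrow's API): how many mini-blocks of the current
// block actually hold values, given the values still to be decoded.
inline uint32_t MiniBlocksInUse(uint32_t values_remaining,
                                uint32_t values_per_mini_block,
                                uint32_t mini_blocks_per_block) {
  assert(values_per_mini_block > 0);
  // Ceiling division, done in 64 bits so the addition cannot overflow.
  const uint64_t needed =
      (static_cast<uint64_t>(values_remaining) + values_per_mini_block - 1) /
      values_per_mini_block;
  return needed < mini_blocks_per_block ? static_cast<uint32_t>(needed)
                                        : mini_blocks_per_block;
}

// Hypothetical helper: validate only the bit widths of mini-blocks in use.
// Bit widths of unused trailing mini-blocks are deliberately ignored, since
// they may hold arbitrary bytes.
inline bool BitWidthsValid(const std::vector<uint8_t>& bit_widths,
                           uint32_t mini_blocks_in_use,
                           uint32_t max_bit_width) {
  assert(mini_blocks_in_use <= bit_widths.size());
  for (uint32_t i = 0; i < mini_blocks_in_use; ++i) {
    if (bit_widths[i] > max_bit_width) return false;
  }
  return true;
}
```

Under the assumed layout, a 65-value page stores its first value in the header and leaves 64 values for the block, so `MiniBlocksInUse(64, 32, 4)` yields 2. The widths `32 32 165 165` then pass validation because the two `165` entries belong to unused mini-blocks, whereas checking all four would reject the page.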