Commits


William Butler authored and Micah Kornfield committed 423ca163a26
PARQUET-2163: Handle decimal schemas with large fixed_len_byte_arrays

The precision calculation overflowed to infinity when the length of the fixed_len_byte_array was greater than 128 bytes, which triggered an error when that infinity was then converted to an int32. The logic can be simplified by noting that log_b(a^x) = x * log_b(a), which avoids the intermediate infinity. We also added a check for extremely large value sizes that imply a max precision too large to fit in an int32. Even a 129-byte decimal seems extreme.

The formula Parquet C++ was using is technically incorrect with respect to the Parquet specification. The specification says that the max precision is floor(log_10(2^(B*8 - 1) - 1)), whereas the C++ implementation omitted the outer -1. This is okay, however, because it is easy to prove that the two values are always the same (ignoring the realities of FP arithmetic), and in practice all three formulas agree through 128 bytes when using FP.

Bug found through fuzzing.

Closes #13456 from tachyonwill/float_overflow

Authored-by: William Butler <wab@google.com>
Signed-off-by: Micah Kornfield <emkornfield@gmail.com>
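
The overflow and the fix described above can be illustrated with a minimal sketch. This is not the code from the patch: the function names `NaiveMaxPrecision` and `MaxPrecision` are hypothetical, the bool-plus-out-parameter signature is an assumption made for illustration, and `num_bytes` stands for the fixed_len_byte_array length B.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helpers for illustration only; not the Parquet C++ API.

// Naive form: floor(log10(2^(8*B - 1))).
// std::pow(2, 8*B - 1) exceeds the range of a double once B > 128,
// so the result becomes +infinity.
inline double NaiveMaxPrecision(int32_t num_bytes) {
  return std::floor(std::log10(std::pow(2.0, 8.0 * num_bytes - 1.0)));
}

// Rewritten form: floor((8*B - 1) * log10(2)), using log_b(a^x) = x * log_b(a).
// No intermediate power is computed, so nothing overflows to infinity.
// Returns false when the length is non-positive or the implied precision
// cannot be represented as an int32.
inline bool MaxPrecision(int32_t num_bytes, int32_t* out) {
  if (num_bytes <= 0) {
    return false;
  }
  const double value = std::floor((8.0 * num_bytes - 1.0) * std::log10(2.0));
  if (value > static_cast<double>(std::numeric_limits<int32_t>::max())) {
    return false;  // implied precision does not fit in int32
  }
  *out = static_cast<int32_t>(value);
  return true;
}
```

With these definitions, a 16-byte value yields a max precision of 38 from both forms, while a 129-byte value yields infinity from the naive form and a finite result from the rewritten one.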