Commits


Aaron Gorenstein authored and GitHub committed 29b23cdeabb
GH-34492: [Go] Fix missing boolean plain encoder state update (#34493)

### Rationale for this change

This is a bugfix in the Go `PlainBooleanEncoder`. Statistics of boolean columns, and in rare cases the data itself, can be malformed. This is a data-correctness bugfix.

### What changes are included in this PR?

Before this change, `FlushValue` did not reset the intermediary BitMapWriter position. As a result, previously encoded values might "leak" into subsequent `FlushValue` calls. Page emission and statistics emission, for example, manifest this issue.

### Are these changes tested?

Yes. The most direct test is the new `TestBooleanPlainDecoderAfterFlushing` in `internal\encoding\encoding_test.go`, but I also added `TestBooleanStatisticsEncoding` as a "consumer" of that encoder.

Question: should I add more testing? I experimentally harnessed a fuzz test targeted at the Boolean encoder, and it manifested the issue, but it is a fuzz test and covers Bools alone. I also provisionally implemented a (hacky) extension to the roundtrip tests that adds an intermediate flush, but that did not seem to be enough to manifest the (somewhat value-count-sensitive) issue.

### Are there any user-facing changes?

This is a correctness fix, so other than that, no.

**This PR contains a "Critical Fix".**

* Closes: #34492

Authored-by: Aaron Gorenstein <aaron.gorenstein@mongodb.com>
Signed-off-by: Matt Topol <zotthewizard@gmail.com>
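
The failure mode described in the commit message can be illustrated with a toy model. The sketch below is not the Arrow code: `toyBoolEncoder`, its fields, `Put`, `Flush`, and `encodeTwice` are all hypothetical names, and the bit layout (least-significant bit first, one scratch byte) is a simplifying assumption. It only shows how failing to reset an intermediate bit buffer on flush lets earlier values leak into the next flush.

```go
package main

import "fmt"

// toyBoolEncoder is a hypothetical, simplified stand-in for a plain boolean
// encoder. It packs bools LSB-first into one scratch byte and spills full
// bytes into sink. It is NOT the Arrow implementation.
type toyBoolEncoder struct {
	scratch      byte   // intermediate bit buffer
	pos          int    // next bit position inside scratch
	sink         []byte // completed bytes awaiting flush
	resetOnFlush bool   // true models the fixed behaviour (assumption)
}

// Put appends values to the intermediate buffer.
func (e *toyBoolEncoder) Put(vals []bool) {
	for _, v := range vals {
		if v {
			e.scratch |= 1 << uint(e.pos)
		}
		e.pos++
		if e.pos == 8 {
			e.sink = append(e.sink, e.scratch)
			e.scratch, e.pos = 0, 0
		}
	}
}

// Flush emits everything buffered so far, including a partial byte.
func (e *toyBoolEncoder) Flush() []byte {
	if e.pos > 0 {
		e.sink = append(e.sink, e.scratch)
	}
	out := e.sink
	e.sink = nil
	if e.resetOnFlush {
		// The step the toy "buggy" mode skips: clear the intermediate state.
		e.scratch, e.pos = 0, 0
	}
	return out
}

// encodeTwice encodes [true, true, true], flushes, then encodes [false]
// and flushes again, returning both flushed buffers.
func encodeTwice(resetOnFlush bool) (first, second []byte) {
	e := &toyBoolEncoder{resetOnFlush: resetOnFlush}
	e.Put([]bool{true, true, true})
	first = e.Flush()
	e.Put([]bool{false})
	second = e.Flush()
	return first, second
}

func main() {
	_, buggy := encodeTwice(false)
	_, fixed := encodeTwice(true)
	fmt.Printf("without reset: %08b\n", buggy[0]) // prints 00000111
	fmt.Printf("with reset:    %08b\n", fixed[0]) // prints 00000000
}
```

In the toy's no-reset mode, the second flush should contain a single `false`, yet its first bit decodes as `true` because the bits from the first batch are still in the scratch byte. A consumer that flushes between batches, such as page or statistics emission in the commit message, would observe this kind of stale data.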