Commit 2ccf3b66 authored by Josh Rosen

Fix PySpark hash partitioning bug.

A Java array's hashCode is based on its object
identity, not its elements, so this was causing
serialized keys to be hashed incorrectly.

This commit adds a PySpark-specific workaround
and adds more tests.
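
To illustrate the problem, here is a minimal Python sketch of content-based
hashing for serialized keys. It is not the code from this commit: the names
java_arrays_hash_code, partition_for, and IdentityHashedKey are hypothetical,
and the hash simply mirrors the documented behavior of
java.util.Arrays.hashCode(byte[]) as one possible content-based choice.

```python
def java_arrays_hash_code(data: bytes) -> int:
    """Content-based hash mirroring java.util.Arrays.hashCode(byte[])."""
    h = 1
    for b in data:
        signed = b - 256 if b > 127 else b   # Java bytes are signed
        h = (31 * h + signed) & 0xFFFFFFFF   # wrap like a 32-bit Java int
    # Reinterpret the unsigned 32-bit value as a signed int.
    return h - 0x100000000 if h >= 0x80000000 else h


def partition_for(data: bytes, num_partitions: int) -> int:
    """Pick a partition from the key's bytes (hypothetical helper)."""
    # Python's % is non-negative for a positive modulus, so no extra fix-up.
    return java_arrays_hash_code(data) % num_partitions


class IdentityHashedKey:
    """Stand-in for a Java array: it inherits object's identity-based hash."""

    def __init__(self, data: bytes):
        self.data = data


# Two wrappers around equal bytes are distinct objects, so their
# identity-based hashes are unrelated to the bytes they hold.
a = IdentityHashedKey(b"key")
b = IdentityHashedKey(b"key")
same_partition_by_content = partition_for(a.data, 4) == partition_for(b.data, 4)
```

Hashing the bytes rather than the container sends equal serialized keys to
the same partition, which is the property an identity-based hash cannot
guarantee.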
parent 7859879a