Commit 165e06a7 authored by Sean Owen, committed by Patrick Wendell

SPARK-1057 (alternative) Remove fastutil

(This is for discussion at this point -- I'm not suggesting this should be committed.)

This is what removing fastutil looks like. Much of it is straightforward, like using `java.io` buffered stream classes, and Guava for murmurhash3.

Replacing uses of `FastByteArrayOutputStream` was a little trickier. Still, I think the switch to `java.io` entails an extra array copy in only one case.
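
To illustrate where such a copy can come from: `java.io.ByteArrayOutputStream.toByteArray()` returns a fresh copy of the written bytes, whereas fastutil's stream exposes its backing array directly. The sketch below is not code from this commit. `ExposedByteArrayOutputStream` and `CopyDemo` are hypothetical names, and the sketch assumes a subclass that reads the protected `buf` and `count` fields is an acceptable way to avoid the copy.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

class CopyDemo {
    public static void main(String[] args) {
        ExposedByteArrayOutputStream out = new ExposedByteArrayOutputStream(16);
        out.write(new byte[] {1, 2, 3}, 0, 3);

        // toByteArray() allocates a new array and copies the valid bytes into it.
        byte[] copied = out.toByteArray();

        // The view shares the stream's internal buffer; nothing is copied.
        ByteBuffer view = out.toByteBuffer();

        System.out.println("copied length: " + copied.length);
        System.out.println("view remaining: " + view.remaining());
    }
}

/** Hypothetical helper (an assumption, not part of the commit). */
class ExposedByteArrayOutputStream extends ByteArrayOutputStream {
    ExposedByteArrayOutputStream(int initialCapacity) {
        super(initialCapacity);
    }

    /**
     * Wraps the live internal buffer. Only bytes [0, count) are valid, and
     * later writes to the stream may change or replace the underlying array.
     */
    ByteBuffer toByteBuffer() {
        return ByteBuffer.wrap(buf, 0, count);
    }
}
```

The trade-off is that the view is only safe while nothing else writes to the stream, which is why a plain `toByteArray()` copy can be the simpler choice when the extra allocation is not on a hot path.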

The rest is using `OpenHashMap` and `OpenHashSet`. These are now written in terms of more Scala-like operations.

`OpenHashMap` is where I made three non-trivial changes to make it work, and they need review:

- It is no longer private
- The key type must have a `ClassTag`
- Unless a lot of other code changes, the key type can't enforce being a supertype of `Null`

It all works and tests pass, and I think there is reason to believe it's OK from a speed perspective.

But what about those last changes?

Author: Sean Owen <sowen@cloudera.com>

Closes #266 from srowen/SPARK-1057-alternate and squashes the following commits:

2601129 [Sean Owen] Fix Map return type error not previously caught
ec65502 [Sean Owen] Updates from matei's review
00bc81e [Sean Owen] Remove use of fastutil and replace with use of java.io, spark.util and Guava classes
parent aa8bb117
72 additions and 107 deletions