Commit 8ead999f authored by Davies Liu, committed by Xiangrui Meng

[SPARK-5223] [MLlib] [PySpark] fix MapConverter and ListConverter in MLlib

MapConverter and ListConverter cause problems when a dict, list, or tuple contains objects that py4j does not support, such as Vector.
Also, pickle may perform better for larger objects (fewer RPCs).

In cases where the objects in a dict/list cannot be pickled (such as JavaObject), we should still use MapConverter/ListConverter.

This PR should be backported to branch-1.2.

Author: Davies Liu <davies@databricks.com>

Closes #4023 from davies/listconvert and squashes the following commits:

55d4ab2 [Davies Liu] fix MapConverter and ListConverter in MLlib
parent 39e333ec
@@ -18,7 +18,7 @@
 import py4j.protocol
 from py4j.protocol import Py4JJavaError
 from py4j.java_gateway import JavaObject
-from py4j.java_collections import MapConverter, ListConverter, JavaArray, JavaList
+from py4j.java_collections import ListConverter, JavaArray, JavaList
 from pyspark import RDD, SparkContext
 from pyspark.serializers import PickleSerializer, AutoBatchedSerializer
@@ -70,9 +70,7 @@ def _py2java(sc, obj):
         obj = _to_java_object_rdd(obj)
     elif isinstance(obj, SparkContext):
         obj = obj._jsc
-    elif isinstance(obj, dict):
-        obj = MapConverter().convert(obj, sc._gateway._gateway_client)
-    elif isinstance(obj, (list, tuple)):
+    elif isinstance(obj, list) and (obj or isinstance(obj[0], JavaObject)):
         obj = ListConverter().convert(obj, sc._gateway._gateway_client)
     elif isinstance(obj, JavaObject):
         pass
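
Review note: the list branch of `_py2java` is guarded by `isinstance(obj, list) and (obj or isinstance(obj[0], JavaObject))`, and that condition can be exercised without a JVM. The sketch below is not part of the commit. `FakeJavaObject`, `guard_as_written`, and `guard_with_and` are hypothetical names introduced here for illustration, and the `and` variant is only one reading of the commit message (use ListConverter when the list holds Java objects), not a confirmed fix.

```python
class FakeJavaObject:
    """Hypothetical stand-in for py4j.java_gateway.JavaObject (no JVM needed)."""


def guard_as_written(obj):
    # Same boolean structure as the condition in the diff,
    # with JavaObject swapped for the stand-in class.
    return bool(isinstance(obj, list) and (obj or isinstance(obj[0], FakeJavaObject)))


def guard_with_and(obj):
    # Assumed alternative: only non-empty lists whose first element
    # is a Java object take the ListConverter path.
    return bool(isinstance(obj, list) and obj and isinstance(obj[0], FakeJavaObject))


if __name__ == "__main__":
    # Any non-empty list satisfies the guard as written, whatever it contains.
    print(guard_as_written([1.0, 2.0]))

    # An empty list is falsy, so the right-hand side of `or` runs and indexes obj[0].
    try:
        guard_as_written([])
    except IndexError as exc:
        print("empty list:", type(exc).__name__)

    # The `and` variant short-circuits on the empty list and inspects the first element.
    print(guard_with_and([]), guard_with_and([1.0]), guard_with_and([FakeJavaObject()]))
```

Because `or` short-circuits on a truthy left operand, the `isinstance(obj[0], ...)` check is evaluated only when the list is empty, which is the one case where `obj[0]` does not exist.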