Merge pull request #283 from tmyklebu/master
Python bindings for mllib

This pull request contains Python bindings for the regression, clustering, classification, and recommendation tools in mllib.

For each 'train' frontend exposed, there is a Scala stub in PythonMLLibAPI.scala and a Python stub in mllib.py. The Python stub serialises the input RDD and any vector/matrix arguments into a mutually understood format and calls the Scala stub. The Scala stub deserialises the RDD and the vector/matrix arguments, calls the appropriate 'train' function, serialises the resulting model, and returns the serialised model.

ALSModel is slightly different, since a MatrixFactorizationModel has RDDs inside. The Scala stub returns a handle to a Scala MatrixFactorizationModel; prediction is done by calling the Scala predict method.

I have tested these bindings on an x86_64 machine running Linux. There is a risk that these bindings may fail on some choose-your-own-endian platform if Python's byte order differs from java.nio.ByteBuffer's idea of the native byte order.
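To make the byte-order risk concrete, here is a minimal sketch of a length-prefixed vector format with the byte order pinned explicitly. The layout (an 8-byte length followed by 8-byte doubles) and the function names `serialize_double_vector` and `deserialize_double_vector` are assumptions made for illustration; they are not taken from this pull request, and the format actually used in `_common.py` may differ.

```python
import struct

# Hypothetical wire format, for illustration only (not the PR's format):
#   one 8-byte signed integer holding the element count,
#   followed by that many 8-byte IEEE-754 doubles.
# The byte order is passed explicitly ("<" little-endian, ">" big-endian)
# instead of relying on whatever the host considers native.


def serialize_double_vector(values, byte_order="<"):
    """Pack a sequence of floats as a length header plus doubles."""
    values = [float(v) for v in values]
    fmt = f"{byte_order}q{len(values)}d"
    return struct.pack(fmt, len(values), *values)


def deserialize_double_vector(payload, byte_order="<"):
    """Unpack a payload produced by serialize_double_vector."""
    (length,) = struct.unpack_from(f"{byte_order}q", payload, 0)
    # A reader using the wrong byte order sees a nonsensical length,
    # so checking the size catches the mismatch early.
    if length < 0 or len(payload) != 8 + 8 * length:
        raise ValueError("length header does not match payload size")
    return list(struct.unpack_from(f"{byte_order}{length}d", payload, 8))


if __name__ == "__main__":
    payload = serialize_double_vector([1.0, 2.5, -3.0])
    print(deserialize_double_vector(payload))
```

Decoding the same payload with `byte_order=">"` raises `ValueError` in this sketch, which is the kind of failure the description warns about when the two sides disagree. Fixing one byte order on both the Python and the JVM side would avoid depending on the platform at all.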
- mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala: 232 additions, 0 deletions
- python/pyspark/java_gateway.py: 1 addition, 0 deletions
- python/pyspark/mllib/__init__.py: 20 additions, 0 deletions
- python/pyspark/mllib/_common.py: 227 additions, 0 deletions
- python/pyspark/mllib/classification.py: 86 additions, 0 deletions
- python/pyspark/mllib/clustering.py: 79 additions, 0 deletions
- python/pyspark/mllib/recommendation.py: 74 additions, 0 deletions
- python/pyspark/mllib/regression.py: 110 additions, 0 deletions
- python/pyspark/serializers.py: 1 addition, 1 deletion
python/pyspark/mllib/__init__.py (new file, mode 0 → 100644)
python/pyspark/mllib/_common.py (new file, mode 0 → 100644)
python/pyspark/mllib/classification.py (new file, mode 0 → 100644)
python/pyspark/mllib/clustering.py (new file, mode 0 → 100644)
python/pyspark/mllib/recommendation.py (new file, mode 0 → 100644)
python/pyspark/mllib/regression.py (new file, mode 0 → 100644)