- Jan 01, 2013
-
-
Josh Rosen authored
Expand the PySpark programming guide.
-
Josh Rosen authored
-
Josh Rosen authored
-
- Dec 31, 2012
-
-
Josh Rosen authored
-
- Dec 29, 2012
-
-
Josh Rosen authored
This version of the example crashes after the first iteration with "OverflowError: math range error" because Python's math.exp() behaves differently than Scala's; see SPARK-646.
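For illustration, the difference can be reproduced outside Spark: Python's `math.exp()` raises `OverflowError` when the result does not fit in a double, whereas Scala's `math.exp` returns `Infinity`. The sketch below shows one possible guard; `safe_exp` and `logistic` are hypothetical helper names, not code from this commit or from SPARK-646.

```python
import math


def safe_exp(x):
    """Return e**x, or float('inf') when the result overflows a double.

    Hypothetical helper: mimics the JVM behaviour of returning Infinity
    instead of raising OverflowError.
    """
    try:
        return math.exp(x)
    except OverflowError:
        return float("inf")


def logistic(x):
    """Logistic function 1 / (1 + e**(-x)) that does not raise for large |x|."""
    return 1.0 / (1.0 + safe_exp(-x))
```

With this guard, `logistic(-1000)` evaluates to `0.0` rather than raising, because `1.0 / (1.0 + inf)` is `0.0` in floating-point arithmetic.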
-
Josh Rosen authored
-
Josh Rosen authored
-
Josh Rosen authored
Conflicts: docs/quick-start.md
-
Josh Rosen authored
-
Matei Zaharia authored
Fix deletion of files in current working directory by clearFiles()
-
Josh Rosen authored
-
Josh Rosen authored
-
Josh Rosen authored
-
- Dec 28, 2012
-
-
Josh Rosen authored
-
Josh Rosen authored
-
Josh Rosen authored
-
Josh Rosen authored
This fixes an issue where Spark could delete original files in the current working directory that were added to the job using addFile(). There was also the potential for addFile() to overwrite local files, which is addressed by changing Utils.fetchFile() to log a warning instead of overwriting a file with new contents. This is a short-term fix; a better long-term solution would be to remove the dependence on storing files in the current working directory, since we can't change the cwd from Java.
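The "warn instead of overwrite" behaviour described above can be sketched roughly as follows. This is a Python illustration under assumed names (`fetch_file`, the `fetch_sketch` logger), not the actual `Utils.fetchFile()` implementation, which lives in Spark's Scala code.

```python
import filecmp
import logging
import os
import shutil

log = logging.getLogger("fetch_sketch")  # hypothetical logger name


def fetch_file(src_path, target_dir):
    """Copy src_path into target_dir without clobbering an existing file.

    Returns True if the file was copied, False if a file with the same
    name already existed and was left untouched.
    """
    target = os.path.join(target_dir, os.path.basename(src_path))
    if os.path.exists(target):
        # Compare contents byte by byte; warn only when they differ.
        if not filecmp.cmp(src_path, target, shallow=False):
            log.warning(
                "File %s exists with different contents; not overwriting",
                target,
            )
        return False
    shutil.copyfile(src_path, target)
    return True
```

The existing file always wins in this sketch, so a local file in the target directory is never replaced by a fetched copy.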
-
Josh Rosen authored
-
Josh Rosen authored
Bundle Py4J binaries, since Py4J is hard to install. Use Spark's `run` script to launch the Py4J gateway, inheriting the settings in spark-env.sh. With these changes, (hopefully) nothing more than running `sbt/sbt package` will be necessary to run PySpark.
-
- Dec 27, 2012
-
-
Josh Rosen authored
Add options to pyspark.SparkContext constructor.
-
Josh Rosen authored
-
Josh Rosen authored
Suggested by / based on code from @MLnick
-
- Dec 26, 2012
-
-
Josh Rosen authored
-
Josh Rosen authored
-
- Dec 24, 2012
-
-
Josh Rosen authored
Passing large volumes of data through Py4J seems to be slow. It appears to be faster to write the data to the local filesystem and read it back from Python.
-
Matei Zaharia authored
lookup() needn't fail when there is no partitioner
-
Josh Rosen authored
-
Mark Hamstra authored
-
Matei Zaharia authored
Allow distinct() to be called without parentheses
-
Mark Hamstra authored
-
- Dec 21, 2012
-
-
Reynold Xin authored
-
Reynold Xin authored
-
Reynold Xin authored
-
Matei Zaharia authored
Kryo2 update against Spark master
-
- Dec 20, 2012
-
-
Matei Zaharia authored
Added the ability in block manager to remove blocks.
-
Reynold Xin authored
-
- Dec 19, 2012
-
-
Matei Zaharia authored
SPARK-616: Logging dead workers in Web UI.
-
Reynold Xin authored
excessive debug messages.
-
Matei Zaharia authored
Tweaked debian packaging to be a bit more in line with debian standards
-
Thomas Dudziak authored
-