- Jan 03, 2014
-
-
Prashant Sharma authored
-
Prashant Sharma authored
-
- Jan 02, 2014
-
-
Prashant Sharma authored
-
Prashant Sharma authored
-
Prashant Sharma authored
-
Prashant Sharma authored
-
Prashant Sharma authored
Merge branch 'scripts-reorg' of github.com:shane-huang/incubator-spark into spark-915-segregate-scripts Conflicts: bin/spark-shell core/pom.xml core/src/main/scala/org/apache/spark/SparkContext.scala core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/CoarseMesosSchedulerBackend.scala core/src/main/scala/org/apache/spark/ui/UIWorkloadGenerator.scala core/src/test/scala/org/apache/spark/DriverSuite.scala python/run-tests sbin/compute-classpath.sh sbin/spark-class sbin/stop-slaves.sh
-
- Jan 01, 2014
-
-
Patrick Wendell authored
SPARK-544. Migrate configuration to a SparkConf class This is still a work in progress based on Prashant and Evan's code. So far I've done the following: - Got rid of global SparkContext.globalConf - Passed SparkConf to serializers and compression codecs - Made SparkConf public instead of private[spark] - Improved API of SparkContext and SparkConf - Switched executor environment vars to be passed through SparkConf - Fixed some places that were still using system properties - Fixed some tests, though others are still failing This still fails several tests in core, repl and streaming, likely due to properties not being set or cleared correctly (some of the tests run fine in isolation). But the API at least is hopefully ready for review. Unfortunately there was a lot of global stuff before due to a "SparkContext.globalConf" method that let you set a "default" configuration of sorts, which meant I had to make some pretty big changes.
-
Matei Zaharia authored
-
Matei Zaharia authored
-
Matei Zaharia authored
Also replaced SparkConf.getOrElse with just a "get" that takes a default value, and added getInt, getLong, etc to make code that uses this simpler later on.
-
Matei Zaharia authored
Conflicts: core/src/main/scala/org/apache/spark/SparkContext.scala core/src/main/scala/org/apache/spark/metrics/MetricsSystem.scala core/src/main/scala/org/apache/spark/storage/BlockManagerMasterActor.scala
-
Patrick Wendell authored
SPARK-1008: Logging improvments 1. Adds a default log4j file that gets loaded if users haven't specified a log4j file. 2. Isolates use of the tools assembly jar. I found this produced SLF4J warnings after building with SBT (and I've seen similar warnings on the mailing list).
-
Patrick Wendell authored
Conflicts: streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala
-
Matei Zaharia authored
Conflicts: project/SparkBuild.scala
-
- Dec 31, 2013
-
-
Reynold Xin authored
restore core/pom.xml file modification
-
liguoqiang authored
-
Reynold Xin authored
Approximate distinct count Added countApproxDistinct() to RDD and countApproxDistinctByKey() to PairRDDFunctions to approximately count distinct number of elements and distinct number of values per key, respectively. Both functions use HyperLogLog from stream-lib for counting. Both functions take a parameter that controls the trade-off between accuracy and memory consumption. Also added Scala docs and test suites for both methods.
-
Patrick Wendell authored
-
Hossein Falaki authored
-
Hossein Falaki authored
-
Matei Zaharia authored
-
Matei Zaharia authored
Conflicts: core/src/main/scala/org/apache/spark/rdd/CheckpointRDD.scala streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala
-
Patrick Wendell authored
upgrade Netty from 4.0.0.Beta2 to 4.0.13.Final the changes are listed at https://github.com/netty/netty/wiki/New-and-noteworthy
-
Patrick Wendell authored
Bug fixes for file input stream and checkpointing - Fixed bugs in the file input stream that led the stream to fail due to transient HDFS errors (listing files when a background thread it deleting fails caused errors, etc.) - Updated Spark's CheckpointRDD and Streaming's CheckpointWriter to use SparkContext.hadoopConfiguration, to allow checkpoints to be written to any HDFS compatible store requiring special configuration. - Changed the API of SparkContext.setCheckpointDir() - eliminated the unnecessary 'useExisting' parameter. Now SparkContext will always create a unique subdirectory within the user specified checkpoint directory. This is to ensure that previous checkpoint files are not accidentally overwritten. - Fixed bug where setting checkpoint directory as a relative local path caused the checkpointing to fail.
-
Tathagata Das authored
-
Patrick Wendell authored
-
Patrick Wendell authored
-
Patrick Wendell authored
-
Patrick Wendell authored
-
- Dec 30, 2013
-
-
Hossein Falaki authored
-
Hossein Falaki authored
-
Hossein Falaki authored
-
Hossein Falaki authored
-
Hossein Falaki authored
-
Matei Zaharia authored
-
Hossein Falaki authored
-
Patrick Wendell authored
-
Patrick Wendell authored
Changed naming of StageCompleted event to be consistent The rest of the SparkListener events are named with "SparkListener" as the prefix of the name; this commit renames the StageCompleted event to SparkListenerStageCompleted for consistency.
-
Patrick Wendell authored
1. Adds a default log4j file that gets loaded if users haven't specified a log4j file. 2. Isolates use of the tools assembly jar. I found this produced SLF4J warnings after building with SBT (and I've seen similar warnings on the mailing list).
-