  1. Dec 15, 2013
    • Merge pull request #256 from MLnick/master · d2ced6d5
      Josh Rosen authored
      Fix 'IPYTHON=1 ./pyspark' throwing ValueError
      
      This fixes an annoying issue where running ```IPYTHON=1 ./pyspark``` resulted in:
      
      ```
      Welcome to
            ____              __
           / __/__  ___ _____/ /__
          _\ \/ _ \/ _ `/ __/  '_/
         /__ / .__/\_,_/_/ /_/\_\   version 0.8.0
            /_/
      
      Using Python version 2.7.5 (default, Jun 20 2013 11:06:30)
      Spark context avaiable as sc.
      ---------------------------------------------------------------------------
      ValueError                                Traceback (most recent call last)
      /usr/local/lib/python2.7/site-packages/IPython/utils/py3compat.pyc in execfile(fname, *where)
          202             else:
          203                 filename = fname
      --> 204             __builtin__.execfile(filename, *where)
      
      /Users/Nick/workspace/scala/spark-0.8.0-incubating-bin-hadoop1/python/pyspark/shell.py in <module>()
           30 add_files = os.environ.get("ADD_FILES").split(',') if os.environ.get("ADD_FILES") != None else None
           31
      ---> 32 sc = SparkContext(os.environ.get("MASTER", "local"), "PySparkShell", pyFiles=add_files)
           33
           34 print """Welcome to
      
      /Users/Nick/workspace/scala/spark-0.8.0-incubating-bin-hadoop1/python/pyspark/context.pyc in __init__(self, master, jobName, sparkHome, pyFiles, environment, batchSize)
           70         with SparkContext._lock:
           71             if SparkContext._active_spark_context:
      ---> 72                 raise ValueError("Cannot run multiple SparkContexts at once")
           73             else:
           74                 SparkContext._active_spark_context = self
      
      ValueError: Cannot run multiple SparkContexts at once
      ```
      
      The issue arises because older IPython releases did not appear to respect ```$PYTHONSTARTUP```, but since at least 1.0.0 IPython does honour it, so the PySpark startup script ends up being executed twice and tries to create a second SparkContext. Technically this might break on older versions of IPython, but most users should be able to upgrade to at least 1.0.0 (and should be encouraged to do so :).
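      
      For illustration, a minimal sketch of a guard that would sidestep the double execution. This is not the change made in this PR (which instead adjusts how the pyspark launcher invokes IPython); it only reuses the ```SparkContext._active_spark_context``` field visible in the traceback above:
      
      ```
      # Hypothetical guard for a PySpark startup script: reuse an already-active
      # SparkContext instead of creating a second one and hitting the ValueError.
      import os
      from pyspark import SparkContext
      
      if SparkContext._active_spark_context is not None:
          sc = SparkContext._active_spark_context  # first execution already created it
      else:
          add_files = os.environ.get("ADD_FILES")
          add_files = add_files.split(',') if add_files else None
          sc = SparkContext(os.environ.get("MASTER", "local"), "PySparkShell", pyFiles=add_files)
      ```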
      
      New behaviour:
      ```
      Nicks-MacBook-Pro:incubator-spark-mlnick Nick$ IPYTHON=1 ./pyspark
      Python 2.7.5 (default, Jun 20 2013, 11:06:30)
      Type "copyright", "credits" or "license" for more information.
      
      IPython 1.1.0 -- An enhanced Interactive Python.
      ?         -> Introduction and overview of IPython's features.
      %quickref -> Quick reference.
      help      -> Python's own help system.
      object?   -> Details about 'object', use 'object??' for extra details.
      SLF4J: Class path contains multiple SLF4J bindings.
      SLF4J: Found binding in [jar:file:/Users/Nick/workspace/scala/incubator-spark-mlnick/tools/target/scala-2.9.3/spark-tools-assembly-0.9.0-incubating-SNAPSHOT.jar!/org/slf4j/impl/StaticLoggerBinder.class]
      SLF4J: Found binding in [jar:file:/Users/Nick/workspace/scala/incubator-spark-mlnick/assembly/target/scala-2.9.3/spark-assembly-0.9.0-incubating-SNAPSHOT-hadoop1.0.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
      SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
      SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
      13/12/12 13:08:15 WARN Utils: Your hostname, Nicks-MacBook-Pro.local resolves to a loopback address: 127.0.0.1; using 10.0.0.4 instead (on interface en0)
      13/12/12 13:08:15 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
      13/12/12 13:08:15 INFO Slf4jEventHandler: Slf4jEventHandler started
      13/12/12 13:08:15 INFO SparkEnv: Registering BlockManagerMaster
      13/12/12 13:08:15 INFO DiskBlockManager: Created local directory at /var/folders/_l/06wxljt13wqgm7r08jlc44_r0000gn/T/spark-local-20131212130815-0e76
      13/12/12 13:08:15 INFO MemoryStore: MemoryStore started with capacity 326.7 MB.
      13/12/12 13:08:15 INFO ConnectionManager: Bound socket to port 53732 with id = ConnectionManagerId(10.0.0.4,53732)
      13/12/12 13:08:15 INFO BlockManagerMaster: Trying to register BlockManager
      13/12/12 13:08:15 INFO BlockManagerMasterActor$BlockManagerInfo: Registering block manager 10.0.0.4:53732 with 326.7 MB RAM
      13/12/12 13:08:15 INFO BlockManagerMaster: Registered BlockManager
      13/12/12 13:08:15 INFO HttpBroadcast: Broadcast server started at http://10.0.0.4:53733
      13/12/12 13:08:15 INFO SparkEnv: Registering MapOutputTracker
      13/12/12 13:08:15 INFO HttpFileServer: HTTP File server directory is /var/folders/_l/06wxljt13wqgm7r08jlc44_r0000gn/T/spark-8f40e897-8211-4628-a7a8-755562d5244c
      13/12/12 13:08:16 INFO SparkUI: Started Spark Web UI at http://10.0.0.4:4040
      2013-12-12 13:08:16.337 java[56801:4003] Unable to load realm info from SCDynamicStore
      Welcome to
            ____              __
           / __/__  ___ _____/ /__
          _\ \/ _ \/ _ `/ __/  '_/
         /__ / .__/\_,_/_/ /_/\_\   version 0.9.0-SNAPSHOT
            /_/
      
      Using Python version 2.7.5 (default, Jun 20 2013 11:06:30)
      Spark context avaiable as sc.
      ```
    • Merge pull request #257 from tgravescs/sparkYarnFixName · c55e6985
      Reynold Xin authored
      Fix the --name option for Spark on Yarn
      
      It looks like the --name option accidentally got broken in one of the merges; right now the YARN Client hangs if --name is used.
    • Merge pull request #264 from shivaram/spark-class-fix · ab85f88f
      Reynold Xin authored
      Use CoarseGrainedExecutorBackend in spark-class
    • Making IPython PySpark compatible across versions <1.0.0. Also cleaned up '-i'... · bb5277b1
      Nick Pentreath authored
      Making IPython PySpark compatible across versions <1.0.0. Also cleaned up '-i' option and made IPYTHON_OPTS work
    • Nick Pentreath authored · d36ee3b1
  2. Dec 14, 2013
    • Merge pull request #251 from pwendell/master · 7db91659
      Reynold Xin authored
      Fix list rendering in YARN markdown docs.
      
      This is some minor clean-up which makes the list render correctly.
    • Merge pull request #249 from ngbinh/partitionInJavaSortByKey · 2fd781d3
      Josh Rosen authored
      Expose numPartitions parameter in JavaPairRDD.sortByKey()
      
      This change makes the Java and Scala APIs for sortByKey() consistent.
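      
      For illustration only (the PR itself adds the overload to JavaPairRDD, not to PySpark), the analogous call with an explicit partition count looks like this in PySpark:
      
      ```
      # Hypothetical PySpark analogue of sortByKey(ascending, numPartitions).
      from pyspark import SparkContext
      
      sc = SparkContext("local", "SortByKeyExample")
      pairs = sc.parallelize([("b", 2), ("a", 1), ("c", 3)])
      
      # Sort by key and control the number of output partitions explicitly.
      sorted_pairs = pairs.sortByKey(ascending=True, numPartitions=2)
      print(sorted_pairs.collect())  # [('a', 1), ('b', 2), ('c', 3)]
      ```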
    • Merge pull request #259 from pwendell/scala-2.10 · 97ac0601
      Patrick Wendell authored
      Migration to Scala 2.10
      
      == Below description was written by Prashant Sharma ==
      
      This PR migrates Spark to Scala 2.10.
      
      Summary of changes apart from the Scala 2.10 migration:
      (These have no implications for users.)
      1. Migrated Akka to 2.2.3.
      
      Remote death watch is not used because it has a bug where it keeps sending messages to a dead node indefinitely.
      
      An indestructible ActorSystem, which tolerates errors, is used on executors only.
      
      (Might be useful for users.)
      4. New configuration settings introduced:
      
      System.getProperty("spark.akka.heartbeat.pauses", "600")
      System.getProperty("spark.akka.failure-detector.threshold", "300.0")
      System.getProperty("spark.akka.heartbeat.interval", "1000")
      
      The defaults for these are fairly large so that, by default, they only disable the failure detector that comes with Akka. The reason is that we already have our own failure-detector-like mechanism in place, so Akka's is just extra overhead on top of that and leads to a lot of false positives. With these properties it is still possible to enable it. A good use case for enabling it is when someone wants Spark to be sensitive (in a controllable manner, of course) to GC pauses or network lags and to quickly evict executors that experience them. More information is included in configuration.md.
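      
      As an illustration only (not part of this PR), these settings could be overridden from application code before the context is created. A rough sketch, assuming a PySpark release that supports SparkConf, with hypothetical values (the spark.akka.* keys belong to this era of Spark and were removed in later versions):
      
      ```
      # Hypothetical example: make the Akka failure detector more sensitive so that
      # executors hit by long GC pauses or network lags are evicted quickly.
      from pyspark import SparkConf, SparkContext
      
      conf = (SparkConf()
              .setMaster("local")
              .setAppName("AkkaTuningExample")
              .set("spark.akka.heartbeat.pauses", "60")
              .set("spark.akka.failure-detector.threshold", "30.0")
              .set("spark.akka.heartbeat.interval", "100"))
      
      sc = SparkContext(conf=conf)
      ```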
      
      Once SPARK-544 is merged, I would like to deprecate at least these Akka properties, and maybe others too.
      
      This PR is a duplicate of #221 (where all the discussion happened); that one pointed to master, while this one points to the scala-2.10 branch.
    • Merge pull request #262 from pwendell/mvn-fix · 7ac944fc
      Patrick Wendell authored
      Fix maven build issues in 2.10 branch
      
      Found some issues when testing the Maven build locally.
    • Fix maven build issues in 2.10 branch · 6e8a96c7
      Patrick Wendell authored
  3. Dec 13, 2013
  4. Dec 12, 2013
    • Merge pull request #255 from ScrapCodes/scala-2.10 · 0aeb182b
      Patrick Wendell authored
      Disabled YARN 2.2 in the sbt and Maven builds and added a message to the sbt build.
    • Fix the --name option for Spark on Yarn · 842eb55f
      Thomas Graves authored
    • Merge pull request #254 from ScrapCodes/scala-2.10 · 2e89398e
      Patrick Wendell authored
      Scala 2.10 migration
      
      This PR migrates Spark to Scala 2.10.
      
      Summary of changes apart from the Scala 2.10 migration:
      (These have no implications for users.)
      1. Migrated Akka to 2.2.3.
      
      Remote death watch is not used because it has a bug where it keeps sending messages to a dead node indefinitely.
      
      An indestructible ActorSystem, which tolerates errors, is used on executors only.
      
      (Might be useful for users.)
      4. New configuration settings introduced:
      
      System.getProperty("spark.akka.heartbeat.pauses", "600")
      System.getProperty("spark.akka.failure-detector.threshold", "300.0")
      System.getProperty("spark.akka.heartbeat.interval", "1000")
      
      The defaults for these are fairly large so that, by default, they only disable the failure detector that comes with Akka. The reason is that we already have our own failure-detector-like mechanism in place, so Akka's is just extra overhead on top of that and leads to a lot of false positives. With these properties it is still possible to enable it. A good use case for enabling it is when someone wants Spark to be sensitive (in a controllable manner, of course) to GC pauses or network lags and to quickly evict executors that experience them. More information is included in configuration.md.
      
      Once SPARK-544 is merged, I would like to deprecate at least these Akka properties, and maybe others too.
      
      This PR is a duplicate of #221 (where all the discussion happened); that one pointed to master, while this one points to the scala-2.10 branch.
  5. Dec 11, 2013
  6. Dec 10, 2013
  7. Dec 09, 2013
  8. Dec 08, 2013