  1. May 28, 2014
    • Spark 1916 · 0b769b73
      David Lemieux authored
      The changes could be ported back to 0.9 as well.
      Changing in.read to in.readFully to read the whole input stream rather than the first 1020 bytes.
      This should be OK, considering that Flume caps the body size at 32K by default.
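
      The difference between the two calls can be sketched in Java as follows. This is an illustration only, not the SparkFlumeEvent code: the `chunked` helper is a hypothetical stream that returns at most a fixed number of bytes per call, and the 1020-byte chunk size simply echoes the figure quoted above.

      ```java
      import java.io.ByteArrayInputStream;
      import java.io.DataInputStream;
      import java.io.IOException;
      import java.io.InputStream;
      import java.util.Arrays;

      public class ReadFullyDemo {
          /** Hypothetical stream that hands out at most `chunk` bytes per read call. */
          public static InputStream chunked(byte[] data, int chunk) {
              return new ByteArrayInputStream(data) {
                  @Override
                  public synchronized int read(byte[] b, int off, int len) {
                      return super.read(b, off, Math.min(len, chunk));
                  }
              };
          }

          /** A single read(): allowed to return before the buffer is full. */
          public static int readOnce(InputStream in, byte[] buf) throws IOException {
              return new DataInputStream(in).read(buf);
          }

          /** readFully(): keeps reading until the buffer is full, or throws EOFException. */
          public static void readAll(InputStream in, byte[] buf) throws IOException {
              new DataInputStream(in).readFully(buf);
          }

          public static void main(String[] args) throws IOException {
              byte[] body = new byte[4096];
              Arrays.fill(body, (byte) 7);

              byte[] partial = new byte[body.length];
              int n = readOnce(chunked(body, 1020), partial);
              System.out.println("read() returned " + n + " of " + body.length + " bytes");

              byte[] whole = new byte[body.length];
              readAll(chunked(body, 1020), whole);
              System.out.println("readFully() filled the buffer: " + Arrays.equals(body, whole));
          }
      }
      ```

      Because `readFully` needs the buffer length up front, it fits cases like this one where the body size is known and bounded.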
      
      Author: David Lemieux <david.lemieux@radialpoint.com>
      
      Closes #865 from lemieud/SPARK-1916 and squashes the following commits:
      
      a265673 [David Lemieux] Updated SparkFlumeEvent to read the whole stream rather than the first X bytes.
    • Organize configuration docs · 032493e1
      Patrick Wendell authored
      This PR improves and organizes the config option page
      and makes a few other changes to config docs. See a preview here:
      http://people.apache.org/~pwendell/config-improvements/configuration.html

      The biggest changes are:
      1. The configs for the standalone master/workers were moved to the
      standalone page and out of the general config doc.
      2. SPARK_LOCAL_DIRS was missing from the standalone docs.
      3. Expanded discussion of injecting configs with spark-submit, including an
      example.
      4. Config options were organized into the following categories:
      - Runtime Environment
      - Shuffle Behavior
      - Spark UI
      - Compression and Serialization
      - Execution Behavior
      - Networking
      - Scheduling
      - Security
      - Spark Streaming
      
      Author: Patrick Wendell <pwendell@gmail.com>
      
      Closes #880 from pwendell/config-cleanup and squashes the following commits:
      
      93f56c3 [Patrick Wendell] Feedback from Matei
      6f66efc [Patrick Wendell] More feedback
      16ae776 [Patrick Wendell] Adding back header section
      d9c264f [Patrick Wendell] Small fix
      e0c1728 [Patrick Wendell] Response to Matei's review
      27d57db [Patrick Wendell] Reverting changes to index.html (covered in #896)
      e230ef9 [Patrick Wendell] Merge remote-tracking branch 'apache/master' into config-cleanup
      a374369 [Patrick Wendell] Line wrapping fixes
      fdff7fc [Patrick Wendell] Merge remote-tracking branch 'apache/master' into config-cleanup
      3289ea4 [Patrick Wendell] Pulling in changes from #856
      106ee31 [Patrick Wendell] Small link fix
      f7e79bc [Patrick Wendell] Re-organizing config options.
      54b184d [Patrick Wendell] Adding standalone configs to the standalone page
      592e94a [Patrick Wendell] Stash
      29b5446 [Patrick Wendell] Better discussion of spark-submit in configuration docs
      2d719ef [Patrick Wendell] Small fix
      4af9e07 [Patrick Wendell] Adding SPARK_LOCAL_DIRS docs
      204b248 [Patrick Wendell] Small fixes
      (cherry picked from commit 7801d44f)
      
      Signed-off-by: Patrick Wendell <pwendell@gmail.com>
    • Fix doc about NetworkWordCount/JavaNetworkWordCount usage of spark streaming · 3669bb8e
      jmu authored
      Usage: NetworkWordCount <master> <hostname> <port>
      -->
      Usage: NetworkWordCount <hostname> <port>
      
      Usage: JavaNetworkWordCount <master> <hostname> <port>
      -->
      Usage: JavaNetworkWordCount <hostname> <port>
      
      Author: jmu <jmujmu@gmail.com>
      
      Closes #826 from jmu/master and squashes the following commits:
      
      9fb7980 [jmu] Merge branch 'master' of https://github.com/jmu/spark
      b9a6b02 [jmu] Fix doc for NetworkWordCount/JavaNetworkWordCount Usage: NetworkWordCount <master> <hostname> <port> --> Usage: NetworkWordCount <hostname> <port>
      (cherry picked from commit 82eadc3b)
      
      Signed-off-by: Patrick Wendell <pwendell@gmail.com>
    • [SPARK-1938] [SQL] ApproxCountDistinctMergeFunction should return Int value. · 24a1cac4
      Takuya UESHIN authored
      
      `ApproxCountDistinctMergeFunction` should return `Int` value because the `dataType` of `ApproxCountDistinct` is `IntegerType`.
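
      One way such a mismatch can surface on the JVM is sketched below in Java. The class and method names are hypothetical and are not Catalyst APIs; the sketch only shows that a consumer trusting a declared integer type fails when it is handed a boxed `Long`.

      ```java
      public class DeclaredTypeDemo {
          /** Hypothetical evaluation that returns the estimate as a boxed Long. */
          public static Object evalAsLong(long estimate) {
              return estimate; // autoboxed to java.lang.Long
          }

          /** Hypothetical evaluation that matches a declared integer type. */
          public static Object evalAsInt(long estimate) {
              return (int) estimate; // autoboxed to java.lang.Integer
          }

          /** A consumer that trusts the declared integer type and casts accordingly. */
          public static int consume(Object value) {
              return (Integer) value; // ClassCastException if value is a Long
          }

          public static void main(String[] args) {
              System.out.println("int value consumed: " + consume(evalAsInt(42L)));
              try {
                  consume(evalAsLong(42L));
              } catch (ClassCastException e) {
                  System.out.println("long value rejected: declared and actual types differ");
              }
          }
      }
      ```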
      
      Author: Takuya UESHIN <ueshin@happy-camper.st>
      
      Closes #893 from ueshin/issues/SPARK-1938 and squashes the following commits:
      
      3970e88 [Takuya UESHIN] Remove a superfluous line.
      5ad7ec1 [Takuya UESHIN] Make dataType for each of CountDistinct, ApproxCountDistinctMerge and ApproxCountDistinct LongType.
      cbe7c71 [Takuya UESHIN] Revert a change.
      fc3ac0f [Takuya UESHIN] Fix evaluated value type of ApproxCountDistinctMergeFunction to Int.
      
      (cherry picked from commit 9df86835)
      Signed-off-by: Reynold Xin <rxin@apache.org>
  2. May 27, 2014
  3. May 26, 2014
  4. May 25, 2014
  5. May 24, 2014
    • [SPARK-1886] check executor id existence when executor exit · b5e96869
      Zhen Peng authored
      
      Author: Zhen Peng <zhenpeng01@baidu.com>
      
      Closes #827 from zhpengg/bugfix-executor-id-not-found and squashes the following commits:
      
      cd8bb65 [Zhen Peng] bugfix: check executor id existence when executor exit
      
      (cherry picked from commit 4e4831b8)
      Signed-off-by: Aaron Davidson <aaron@databricks.com>
    • Revert "[maven-release-plugin] prepare release v1.0.0-rc10" · 9ff42249
      Tathagata Das authored
      This reverts commit d8070234.
    • f856b8ca
    • Updated CHANGES.txt · 84060927
      Tathagata Das authored
    • SPARK-1911: Emphasize that Spark jars should be built with Java 6. · 217bd562
      Patrick Wendell authored
      
      This commit requires the user to manually say "yes" when building Spark
      without Java 6. The prompt can be bypassed with a flag (e.g. if the user
      is scripting around make-distribution).
      
      Author: Patrick Wendell <pwendell@gmail.com>
      
      Closes #859 from pwendell/java6 and squashes the following commits:
      
      4921133 [Patrick Wendell] Adding Pyspark Notice
      fee8c9e [Patrick Wendell] SPARK-1911: Emphasize that Spark jars should be built with Java 6.
      
      (cherry picked from commit 75a03277)
      Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>
    • [SPARK-1900 / 1918] PySpark on YARN is broken · 12f5ecc8
      Andrew Or authored
      
      If I run the following on a YARN cluster
      ```
      bin/spark-submit sheep.py --master yarn-client
      ```
      it fails because of a mismatch in paths: `spark-submit` thinks that `sheep.py` resides on HDFS, and balks when it can't find the file there. A natural workaround is to add the `file:` prefix to the file:
      ```
      bin/spark-submit file:/path/to/sheep.py --master yarn-client
      ```
      However, this also fails, this time because Python does not understand URI schemes.
      
      This PR fixes this by automatically resolving all paths passed as command-line arguments to `spark-submit`. This has the added benefit of keeping file and jar paths consistent across different cluster modes. For Python, we strip the URI scheme before we actually try to run the file.
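
      A minimal Java sketch of the idea, assuming a Unix-like filesystem. The `resolve` and `stripScheme` helpers are hypothetical names for illustration, not the functions added by this PR, and the Windows handling mentioned in the commits below is left out.

      ```java
      import java.io.File;
      import java.net.URI;
      import java.net.URISyntaxException;

      public class SubmitPaths {
          /** Hypothetical helper: give scheme-less paths an explicit file: scheme. */
          public static URI resolve(String path) {
              try {
                  URI uri = new URI(path);
                  if (uri.getScheme() != null) {
                      return uri; // already qualified, e.g. hdfs:/... or file:/...
                  }
              } catch (URISyntaxException e) {
                  // Not a valid URI; fall through and treat it as a local path.
              }
              return new File(path).getAbsoluteFile().toURI();
          }

          /** Hypothetical helper: turn a file: URI back into a plain path for Python. */
          public static String stripScheme(URI uri) {
              return "file".equals(uri.getScheme()) ? uri.getPath() : uri.toString();
          }

          public static void main(String[] args) {
              URI local = resolve("sheep.py");
              System.out.println(local);               // a file: URI under the working directory
              System.out.println(stripScheme(local));  // the same location as a plain path
              System.out.println(resolve("hdfs:/user/me/sheep.py")); // left unchanged
          }
      }
      ```

      Resolving every argument to a full URI first means later code no longer has to guess which filesystem a bare path refers to.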
      
      Much of the code was originally written by @mengxr. Tested on a YARN cluster. More tests pending.
      
      Author: Andrew Or <andrewor14@gmail.com>
      
      Closes #853 from andrewor14/submit-paths and squashes the following commits:
      
      0bb097a [Andrew Or] Format path correctly before adding it to PYTHONPATH
      323b45c [Andrew Or] Include --py-files on PYTHONPATH for pyspark shell
      3c36587 [Andrew Or] Improve error messages (minor)
      854aa6a [Andrew Or] Guard against NPE if user gives pathological paths
      6638a6b [Andrew Or] Fix spark-shell jar paths after #849 went in
      3bb0359 [Andrew Or] Update more comments (minor)
      2a1f8a0 [Andrew Or] Update comments (minor)
      6af2c77 [Andrew Or] Merge branch 'master' of github.com:apache/spark into submit-paths
      a68c4d1 [Andrew Or] Handle Windows python file path correctly
      427a250 [Andrew Or] Resolve paths properly for Windows
      a591a4a [Andrew Or] Update tests for resolving URIs
      6c8621c [Andrew Or] Move resolveURIs to Utils
      db8255e [Andrew Or] Merge branch 'master' of github.com:apache/spark into submit-paths
      f542dce [Andrew Or] Fix outdated tests
      691c4ce [Andrew Or] Ignore special primary resource names
      5342ac7 [Andrew Or] Add missing space in error message
      02f77f3 [Andrew Or] Resolve command line arguments to spark-submit properly
      
      (cherry picked from commit 5081a0a9)
      Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>
  6. May 23, 2014
  7. May 22, 2014
    • Updated scripts for auditing releases · 6541ca24
      Tathagata Das authored
      
      - Added script to automatically generate change list CHANGES.txt
      - Added test for verifying linking against maven distributions of `spark-sql` and `spark-hive`
      - Added SBT projects for testing functionality of `spark-sql` and `spark-hive`
      - Fixed issues in existing tests that might have come up because of changes in Spark 1.0
      
      Author: Tathagata Das <tathagata.das1565@gmail.com>
      
      Closes #844 from tdas/update-dev-scripts and squashes the following commits:
      
      25090ba [Tathagata Das] Added missing license
      e2e20b3 [Tathagata Das] Updated tests for auditing releases.
      
      (cherry picked from commit b2bdd0e5)
      Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>