[SPARK-6869] [PYSPARK] Add pyspark archives path to PYTHONPATH
    Lianhui Wang authored
Based on https://github.com/apache/spark/pull/5478, which provides a PYSPARK_ARCHIVES_PATH env variable. With this PR, we only need to export PYSPARK_ARCHIVES_PATH=/user/spark/pyspark.zip,/user/spark/python/lib/py4j-0.8.2.1-src.zip in conf/spark-env.sh when PySpark is not installed on each node of the YARN cluster. I ran a Python application successfully in both yarn-client and yarn-cluster mode with this PR.
    andrewor14 sryza Sephiroth-Lin Can you take a look at this? Thanks.
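
    The configuration described above can be sketched as the following fragment for conf/spark-env.sh (the archive paths shown are the example paths from this commit message; adjust them to wherever the zips live on your cluster filesystem):

    ```shell
    # conf/spark-env.sh
    # Point YARN executors at the PySpark archives instead of a per-node install.
    # The value is a comma-separated list of archives added to PYTHONPATH.
    export PYSPARK_ARCHIVES_PATH=/user/spark/pyspark.zip,/user/spark/python/lib/py4j-0.8.2.1-src.zip
    ```

    With this set, spark-submit in yarn-client or yarn-cluster mode ships the archives to each container rather than requiring PySpark to be installed on every node.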
    
    Author: Lianhui Wang <lianhuiwang09@gmail.com>
    
    Closes #5580 from lianhuiwang/SPARK-6869 and squashes the following commits:
    
    66ffa43 [Lianhui Wang] Update Client.scala
    c2ad0f9 [Lianhui Wang] Update Client.scala
    1c8f664 [Lianhui Wang] Merge remote-tracking branch 'remotes/apache/master' into SPARK-6869
    008850a [Lianhui Wang] Merge remote-tracking branch 'remotes/apache/master' into SPARK-6869
    f0b4ed8 [Lianhui Wang] Merge remote-tracking branch 'remotes/apache/master' into SPARK-6869
    150907b [Lianhui Wang] Merge remote-tracking branch 'remotes/apache/master' into SPARK-6869
    20402cd [Lianhui Wang] use ZipEntry
    9d87c3f [Lianhui Wang] update scala style
    e7bd971 [Lianhui Wang] address vanzin's comments
    4b8a3ed [Lianhui Wang] use pyArchivesEnvOpt
    e6b573b [Lianhui Wang] address vanzin's comments
    f11f84a [Lianhui Wang] zip pyspark archives
    5192cca [Lianhui Wang] update import path
    3b1e4c8 [Lianhui Wang] address tgravescs's comments
    9396346 [Lianhui Wang] put zip to make-distribution.sh
    0d2baf7 [Lianhui Wang] update import paths
    e0179be [Lianhui Wang] add zip pyspark archives in build or sparksubmit
    31e8e06 [Lianhui Wang] update code style
    9f31dac [Lianhui Wang] update code and add comments
    f72987c [Lianhui Wang] add archives path to PYTHONPATH