  1. Mar 02, 2015
    • Yin Huai's avatar
      [SPARK-6073][SQL] Need to refresh metastore cache after append data in... · 39a54b40
      Yin Huai authored
      [SPARK-6073][SQL] Need to refresh metastore cache after append data in CreateMetastoreDataSourceAsSelect
      
      JIRA: https://issues.apache.org/jira/browse/SPARK-6073
      
      liancheng
      
      Author: Yin Huai <yhuai@databricks.com>
      
      Closes #4824 from yhuai/refreshCache and squashes the following commits:
      
      b9542ef [Yin Huai] Refresh metadata cache in the Catalog in CreateMetastoreDataSourceAsSelect.
      39a54b40
    • Lianhui Wang's avatar
      [SPARK-6103][Graphx]remove unused class to import in EdgeRDDImpl · 49c7a8f6
      Lianhui Wang authored
      Class TaskContext is unused in EdgeRDDImpl, so we need to remove it from import list.
      
      Author: Lianhui Wang <lianhuiwang09@gmail.com>
      
      Closes #4846 from lianhuiwang/SPARK-6103 and squashes the following commits:
      
      31aed64 [Lianhui Wang] remove unused class to import in EdgeRDDImpl
      49c7a8f6
    • Sean Owen's avatar
      SPARK-3357 [CORE] Internal log messages should be set at DEBUG level instead of INFO · 948c2390
      Sean Owen authored
      Demote some 'noisy' log messages to debug level. I added a few more, to include everything that gets logged in stanzas like this:
      
      ```
      15/03/01 00:03:54 INFO BlockManager: Removing broadcast 0
      15/03/01 00:03:54 INFO BlockManager: Removing block broadcast_0_piece0
      15/03/01 00:03:54 INFO MemoryStore: Block broadcast_0_piece0 of size 839 dropped from memory (free 277976091)
      15/03/01 00:03:54 INFO BlockManagerInfo: Removed broadcast_0_piece0 on localhost:49524 in memory (size: 839.0 B, free: 265.1 MB)
      15/03/01 00:03:54 INFO BlockManagerMaster: Updated info of block broadcast_0_piece0
      15/03/01 00:03:54 INFO BlockManager: Removing block broadcast_0
      15/03/01 00:03:54 INFO MemoryStore: Block broadcast_0 of size 1088 dropped from memory (free 277977179)
      15/03/01 00:03:54 INFO ContextCleaner: Cleaned broadcast 0
      ```
      
      as well as regular messages like
      
      ```
      15/03/01 00:02:33 INFO MemoryStore: ensureFreeSpace(2640) called with curMem=47322, maxMem=278019440
      ```
      
      WDYT? good or should some be left alone?
      
      CC mengxr who suggested some of this.
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #4838 from srowen/SPARK-3357 and squashes the following commits:
      
      dce75c1 [Sean Owen] Back out some debug level changes
      d9b784d [Sean Owen] Demote some 'noisy' log messages to debug level
      948c2390
    • Saisai Shao's avatar
      [Streaming][Minor]Fix some error docs in streaming examples · d8fb40ed
      Saisai Shao authored
      Small changes, please help to review, thanks a lot.
      
      Author: Saisai Shao <saisai.shao@intel.com>
      
      Closes #4837 from jerryshao/doc-fix and squashes the following commits:
      
      545291a [Saisai Shao] Fix some error docs in streaming examples
      d8fb40ed
  2. Mar 01, 2015
    • MechCoder's avatar
      [SPARK-6083] [MLLib] [DOC] Make Python API example consistent in NaiveBayes · 3f00bb3e
      MechCoder authored
      Author: MechCoder <manojkumarsivaraj334@gmail.com>
      
      Closes #4834 from MechCoder/spark-6083 and squashes the following commits:
      
      1cdd7b5 [MechCoder] Add parse function
      65bbbe9 [MechCoder] [SPARK-6083] Make Python API example consistent in NaiveBayes
      3f00bb3e
    • Xiangrui Meng's avatar
      [SPARK-6053][MLLIB] support save/load in PySpark's ALS · aedbbaa3
      Xiangrui Meng authored
      A simple wrapper to save/load `MatrixFactorizationModel` in Python. jkbradley
      
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #4811 from mengxr/SPARK-5991 and squashes the following commits:
      
      f135dac [Xiangrui Meng] update save doc
      57e5200 [Xiangrui Meng] address comments
      06140a4 [Xiangrui Meng] Merge remote-tracking branch 'apache/master' into SPARK-5991
      282ec8d [Xiangrui Meng] support save/load in PySpark's ALS
      aedbbaa3
    • Marcelo Vanzin's avatar
      [SPARK-6074] [sql] Package pyspark sql bindings. · fd8d283e
      Marcelo Vanzin authored
      This is needed for the SQL bindings to work on Yarn.
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #4822 from vanzin/SPARK-6074 and squashes the following commits:
      
      fb52001 [Marcelo Vanzin] [SPARK-6074] [sql] Package pyspark sql bindings.
      fd8d283e
    • Josh Rosen's avatar
      [SPARK-6075] Fix bug that caused lost accumulator updates: do not store... · 2df5f1f0
      Josh Rosen authored
      [SPARK-6075] Fix bug that caused lost accumulator updates: do not store WeakReferences in localAccums map
      
      This fixes a non-deterministic bug introduced in #4021 that could cause tasks' accumulator updates to be lost.  The problem is that `localAccums` should not hold weak references: after the task finishes running there won't be any strong references to these local accumulators, so they can get garbage-collected before the executor reads the `localAccums` map.  We don't need weak references here anyways, since this map is cleared at the end of each task.
      
      Author: Josh Rosen <joshrosen@databricks.com>
      
      Closes #4835 from JoshRosen/SPARK-6075 and squashes the following commits:
      
      4f4b5b2 [Josh Rosen] Remove defensive assertions that caused test failures in code unrelated to this change
      120c7b0 [Josh Rosen] [SPARK-6075] Do not store WeakReferences in localAccums map
      2df5f1f0
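
The failure mode described in SPARK-6075 can be reproduced in miniature. The sketch below is in Python rather than Scala and uses hypothetical names (`Accumulator`, `run_task`); it is not Spark's code and only illustrates why a map of weak references can lose entries once the task's own strong references are gone.

```python
import gc
import weakref


class Accumulator:
    """Hypothetical stand-in for a task-local accumulator."""

    def __init__(self, value=0):
        self.value = value


def run_task(registry):
    # The only strong reference to the accumulator lives in this frame.
    acc = Accumulator()
    acc.value += 5
    registry[1] = acc
    # When the function returns, `acc` goes out of scope.


# Weak map: the entry can vanish before anyone reads the update.
weak_registry = weakref.WeakValueDictionary()
run_task(weak_registry)
gc.collect()  # make collection explicit; CPython usually frees earlier
weak_lost = 1 not in weak_registry

# Strong map: the update survives until the map is cleared explicitly.
strong_registry = {}
run_task(strong_registry)
gc.collect()
strong_value = strong_registry[1].value
strong_registry.clear()  # cleared at the end of each task, as described above
```

Because the strong map is cleared after every task, holding strong references does not leak memory in this sketch, which mirrors the reasoning in the commit message.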
  3. Feb 28, 2015
    • Evan Yu's avatar
      SPARK-5984: Fix TimSort bug causes ArrayOutOfBoundsException · 643300a6
      Evan Yu authored
      Fix TimSort bug which causes an ArrayOutOfBoundsException.
      
      Using the proposed fix here
      http://envisage-project.eu/proving-android-java-and-python-sorting-algorithm-is-broken-and-how-to-fix-it/
      
      Author: Evan Yu <ehotou@gmail.com>
      
      Closes #4804 from hotou/SPARK-5984 and squashes the following commits:
      
      3421b6c [Evan Yu] SPARK-5984: Add info to LICENSE
      e61c6b8 [Evan Yu] SPARK-5984: Fix license and document
      6ccc280 [Evan Yu] SPARK-5984: Add License header to file
      e06c0d2 [Evan Yu] SPARK-5984: Add License header to file
      4d95f75 [Evan Yu] SPARK-5984: Fix TimSort bug causes ArrayOutOfBoundsException
      479a106 [Evan Yu] SPARK-5984: Fix TimSort bug causes ArrayOutOfBoundsException
      643300a6
    • Sean Owen's avatar
      SPARK-1965 [WEBUI] Spark UI throws NPE on trying to load the app page for non-existent app · 86fcdaef
      Sean Owen authored
      Don't throw NPE if appId is unknown. kayousterhout, is this a decent enough band-aid for avoiding a full-blown NPE? It should just render empty content instead.
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #4777 from srowen/SPARK-1965 and squashes the following commits:
      
      7e16590 [Sean Owen] Update app not found message
      cb878d6 [Sean Owen] Return basic "not found" page for unknown appId
      d8270da [Sean Owen] Don't throw NPE if appId is unknown
      86fcdaef
    • Sean Owen's avatar
      SPARK-5983 [WEBUI] Don't respond to HTTP TRACE in HTTP-based UIs · f91298e2
      Sean Owen authored
      Disallow TRACE HTTP method in servlets
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #4765 from srowen/SPARK-5983 and squashes the following commits:
      
      421b25b [Sean Owen] Disallow TRACE HTTP method in servlets
      f91298e2
    • Michael Griffiths's avatar
      SPARK-6063 MLlib doesn't pass mvn scalastyle check due to UTF chars in LDAModel.scala · b36b1bc2
      Michael Griffiths authored
      Remove unicode characters from MLlib file.
      
      Author: Michael Griffiths <msjgriffiths@gmail.com>
      Author: Griffiths, Michael (NYC-RPM) <michael.griffiths@reprisemedia.com>
      
      Closes #4815 from msjgriffiths/SPARK-6063 and squashes the following commits:
      
      bcd7de1 [Griffiths, Michael (NYC-RPM)] Change \u201D quote marks around 'theta' to standard single apostrophe (\x27)
      38eb535 [Michael Griffiths] Merge pull request #2 from apache/master
      b08e865 [Michael Griffiths] Merge pull request #1 from apache/master
      b36b1bc2
    • Cheng Lian's avatar
      [SPARK-5775] [SQL] BugFix: GenericRow cannot be cast to SpecificMutableRow... · e6003f0a
      Cheng Lian authored
      [SPARK-5775] [SQL] BugFix: GenericRow cannot be cast to SpecificMutableRow when nested data and partitioned table
      
      This PR adapts anselmevignon's #4697 to master and branch-1.3. Please refer to PR description of #4697 for details.
      
      Author: Cheng Lian <lian@databricks.com>
      Author: Cheng Lian <liancheng@users.noreply.github.com>
      Author: Yin Huai <yhuai@databricks.com>
      
      Closes #4792 from liancheng/spark-5775 and squashes the following commits:
      
      538f506 [Cheng Lian] Addresses comments
      cee55cf [Cheng Lian] Merge pull request #4 from yhuai/spark-5775-yin
      b0b74fb [Yin Huai] Remove runtime pattern matching.
      ca6e038 [Cheng Lian] Fixes SPARK-5775
      e6003f0a
    • Patrick Wendell's avatar
      MAINTENANCE: Automated closing of pull requests. · 91682598
      Patrick Wendell authored
      This commit exists to close the following pull requests on Github:
      
      Closes #1128 (close requested by 'srowen')
      Closes #3425 (close requested by 'srowen')
      Closes #4770 (close requested by 'srowen')
      Closes #2813 (close requested by 'srowen')
      91682598
    • Burak Yavuz's avatar
      [SPARK-5979][SPARK-6032] Smaller safer --packages fix · 6d8e5fbc
      Burak Yavuz authored
      pwendell tdas
      These are the safer parts of PR #4754:
       - SPARK-5979: All dependencies with the groupId `org.apache.spark` passed through `--packages` were being excluded from the dependency tree on the assumption that they would be in the assembly jar. This is not the case, so the exclusion rules had to be defined more explicitly.
       - SPARK-6032: Ivy prints a lot of log output while retrieving dependencies. This output was printed to `System.out`. Moved the logging to `System.err`.
      
      Author: Burak Yavuz <brkyvz@gmail.com>
      
      Closes #4802 from brkyvz/simple-streaming-fix and squashes the following commits:
      
      e0f38cb [Burak Yavuz] Merge branch 'master' of github.com:apache/spark into simple-streaming-fix
      bad921c [Burak Yavuz] [SPARK-5979][SPARK-6032] Smaller safer fix
      6d8e5fbc
    • Marcelo Vanzin's avatar
      [SPARK-6070] [yarn] Remove unneeded classes from shuffle service jar. · dba08d1f
      Marcelo Vanzin authored
      These may conflict with the classes already in the NM. We shouldn't
      be repackaging them.
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #4820 from vanzin/SPARK-6070 and squashes the following commits:
      
      871b566 [Marcelo Vanzin] The "d'oh how didn't I think of it before" solution.
      3cba946 [Marcelo Vanzin] Use profile instead, so that dependencies don't need to be explicitly listed.
      7a18a1b [Marcelo Vanzin] [SPARK-6070] [yarn] Remove unneeded classes from shuffle service jar.
      dba08d1f
  4. Feb 27, 2015
    • Davies Liu's avatar
      [SPARK-6055] [PySpark] fix incorrect __eq__ of DataType · e0e64ba4
      Davies Liu authored
      The `__eq__` of DataType is not correct, so the class cache is not used correctly (a created class cannot be found by its dataType). As a result, many classes are created (saved in _cached_cls) and never released.

      Also, all DataType objects of the same type have the same hash code, so a dict ends up holding many objects with one hash code. This behaves like a hash collision attack and makes access to the dict very slow (depending on the CPython implementation).

      This PR also improves the performance of inferSchema (by avoiding unnecessary conversion of objects).
      
      cc pwendell  JoshRosen
      
      Author: Davies Liu <davies@databricks.com>
      
      Closes #4808 from davies/leak and squashes the following commits:
      
      6a322a4 [Davies Liu] tests refactor
      3da44fc [Davies Liu] fix __eq__ of Singleton
      534ac90 [Davies Liu] add more checks
      46999dc [Davies Liu] fix tests
      d9ae973 [Davies Liu] fix memory leak in sql
      e0e64ba4
    • Cheng Lian's avatar
      [SPARK-5751] [SQL] Sets SPARK_HOME as SPARK_PID_DIR when running Thrift server test suites · 8c468a66
      Cheng Lian authored
      This is a follow-up to #4720. By default, `spark-daemon.sh` writes PID files under `/tmp`, which makes it impossible to start multiple server instances simultaneously. This PR sets `SPARK_PID_DIR` to the Spark home directory to work around this problem.
      
      Many thanks to chenghao-intel for pointing out this issue!
      
      Author: Cheng Lian <lian@databricks.com>
      
      Closes #4758 from liancheng/thriftserver-pid-dir and squashes the following commits:
      
      252fa0f [Cheng Lian] Uses temporary directory as Thrift server PID directory
      1b3d1e3 [Cheng Lian] Sets SPARK_HOME as SPARK_PID_DIR when running Thrift server test suites
      8c468a66
    • Saisai Shao's avatar
      [Streaming][Minor] Remove useless type signature of Java Kafka direct stream API · 5f7f3b93
      Saisai Shao authored
      cc tdas .
      
      Author: Saisai Shao <saisai.shao@intel.com>
      
      Closes #4817 from jerryshao/signature-minor-fix and squashes the following commits:
      
      eebfaac [Saisai Shao] Remove useless type parameter
      5f7f3b93
    • Joseph K. Bradley's avatar
      [SPARK-4587] [mllib] [docs] Fixed save,load calls in ML guide examples · d17cb2ba
      Joseph K. Bradley authored
      Should pass spark context to save/load
      
      CC: mengxr
      
      Author: Joseph K. Bradley <joseph@databricks.com>
      
      Closes #4816 from jkbradley/ml-io-doc-fix and squashes the following commits:
      
      83d369d [Joseph K. Bradley] added comment to save,load parts of ML guide examples
      2841170 [Joseph K. Bradley] Fixed save,load calls in ML guide examples
      d17cb2ba
    • zsxwing's avatar
      [SPARK-6059][Yarn] Add volatile to ApplicationMaster's reporterThread and allocator · 57566d0a
      zsxwing authored
      `ApplicationMaster.reporterThread` and `ApplicationMaster.allocator` are accessed in multiple threads, so they should be marked as `volatile`.
      
      Author: zsxwing <zsxwing@gmail.com>
      
      Closes #4814 from zsxwing/SPARK-6059 and squashes the following commits:
      
      17d9386 [zsxwing] Add volatile to ApplicationMaster's reporterThread and allocator
      57566d0a
    • zsxwing's avatar
      [SPARK-6058][Yarn] Log the user class exception in ApplicationMaster · e747e984
      zsxwing authored
      Because ApplicationMaster doesn't set SparkUncaughtExceptionHandler, the exception in the user class won't be logged. This PR added a `logError` for it.
      
      Author: zsxwing <zsxwing@gmail.com>
      
      Closes #4813 from zsxwing/SPARK-6058 and squashes the following commits:
      
      806c932 [zsxwing] Log the user class exception
      e747e984
    • Zhang, Liye's avatar
      [SPARK-6036][CORE] avoid race condition between eventlogListener and akka actor system · 8cd1692c
      Zhang, Liye authored
      For a detailed description, please refer to [SPARK-6036](https://issues.apache.org/jira/browse/SPARK-6036).
      
      Author: Zhang, Liye <liye.zhang@intel.com>
      
      Closes #4785 from liyezhang556520/EventLogInProcess and squashes the following commits:
      
      8b0b0a6 [Zhang, Liye] stop listener after DAGScheduler
      79b15b3 [Zhang, Liye] SPARK-6036 avoid race condition between eventlogListener and akka actor system
      8cd1692c
    • 许鹏's avatar
      fix spark-6033, clarify the spark.worker.cleanup behavior in standalone mode · 0375a413
      许鹏 authored
      jira case spark-6033 https://issues.apache.org/jira/browse/SPARK-6033
      
      In standalone deploy mode, the cleanup will only remove the stopped application's directories.
      
      The original description about the cleanup behavior is incorrect.
      
      Author: 许鹏 <peng.xu@fraudmetrix.cn>
      
      Closes #4803 from hseagle/spark-6033 and squashes the following commits:
      
      927a6a0 [许鹏] fix the incorrect description about the spark.worker.cleanup in standalone mode
      0375a413
    • Andrew Or's avatar
      [SPARK-6046] Privatize SparkConf.translateConfKey · 7c99a014
      Andrew Or authored
      The warning about deprecated configs is actually issued when the configs are set, not when they are read. As a result, we don't need to explicitly call `translateConfKey` outside of `SparkConf` just to print the warning again in vain.
      
      Author: Andrew Or <andrew@databricks.com>
      
      Closes #4797 from andrewor14/warn-deprecated-config and squashes the following commits:
      
      8fb43e6 [Andrew Or] Privatize SparkConf.translateConfKey
      7c99a014
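
The "warn on set, not on get" behavior described in SPARK-6046 can be sketched like this. The `Conf` class and the `_DEPRECATED` mapping are hypothetical and are not Spark's `SparkConf`; the point is only that readers of a value never trigger the warning a second time.

```python
import warnings

# Hypothetical mapping from deprecated keys to their replacements.
_DEPRECATED = {"old.key": "new.key"}


class Conf:
    """Hypothetical config holder; not Spark's SparkConf."""

    def __init__(self):
        self._settings = {}

    def set(self, key, value):
        # Warn at the point where the deprecated key enters the config.
        if key in _DEPRECATED:
            warnings.warn(
                f"{key} is deprecated; use {_DEPRECATED[key]} instead",
                DeprecationWarning,
            )
        self._settings[key] = value
        return self

    def get(self, key, default=None):
        # Reads stay silent: the warning was already issued by set().
        return self._settings.get(key, default)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    conf = Conf().set("old.key", "1")
    conf.get("old.key")
    conf.get("old.key")
    warning_count = len(caught)
```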
    • Lukasz Jastrzebski's avatar
      SPARK-2168 [Spark core] Use relative URIs for the app links in the History Server. · 4a8a0a8e
      Lukasz Jastrzebski authored
      As agreed in PR #1160, this adds a test to verify that the history server generates relative links to applications.
      
      Author: Lukasz Jastrzebski <lukasz.jastrzebski@gmail.com>
      
      Closes #4778 from elyast/master and squashes the following commits:
      
      0c07fab [Lukasz Jastrzebski] Incorporating comments for SPARK-2168
      6d7866d [Lukasz Jastrzebski] Adjusting test for  SPARK-2168 for master branch
      d6f4fbe [Lukasz Jastrzebski] Added test for  SPARK-2168
      4a8a0a8e
    • jerryshao's avatar
      [SPARK-5495][UI] Add app and driver kill function in master web UI · 67595eb8
      jerryshao authored
      Add application kill function in master web UI for standalone mode. Details can be seen in [SPARK-5495](https://issues.apache.org/jira/browse/SPARK-5495).
      
      A screenshot of the UI is shown below:
      ![snapshot](https://dl.dropboxusercontent.com/u/19230832/master_ui.png)
      
      Please help to review, thanks a lot.
      
      Author: jerryshao <saisai.shao@intel.com>
      
      Closes #4288 from jerryshao/SPARK-5495 and squashes the following commits:
      
      fa3e486 [jerryshao] Add some conditions
      9a7be93 [jerryshao] Add kill Driver function
      a239776 [jerryshao] Change the code format
      ff5195d [jerryshao] Add app kill function in master web UI
      67595eb8
    • jerryshao's avatar
      [SPARK-5771][UI][hotfix] Change Requested Cores into * if default cores is not set · 12135e90
      jerryshao authored
      cc andrewor14, srowen.
      
      Author: jerryshao <saisai.shao@intel.com>
      
      Closes #4800 from jerryshao/SPARK-5771 and squashes the following commits:
      
      a2483c2 [jerryshao] Change the UI of Requested Cores into * if default cores is not set
      12135e90
  5. Feb 26, 2015
    • Yin Huai's avatar
      [SPARK-6024][SQL] When a data source table has too many columns, its schema... · 5e5ad655
      Yin Huai authored
      [SPARK-6024][SQL] When a data source table has too many columns, its schema cannot be stored in metastore.
      
      JIRA: https://issues.apache.org/jira/browse/SPARK-6024
      
      Author: Yin Huai <yhuai@databricks.com>
      
      Closes #4795 from yhuai/wideSchema and squashes the following commits:
      
      4882e6f [Yin Huai] Address comments.
      73e71b4 [Yin Huai] Address comments.
      143927a [Yin Huai] Simplify code.
      cc1d472 [Yin Huai] Make the schema wider.
      12bacae [Yin Huai] If the JSON string of a schema is too large, split it before storing it in metastore.
      e9b4f70 [Yin Huai] Failed test.
      5e5ad655
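
The approach named in the squashed commits above (split the schema's JSON string when it is too large, then store the pieces) can be sketched as follows. The property names `schema.numParts` and `schema.part.N`, the helper names, and the 1000-character threshold are assumptions chosen for illustration, not values taken from the patch.

```python
import json


def split_schema(schema_json, max_len):
    """Split a schema string into numbered properties of at most max_len chars."""
    parts = [schema_json[i:i + max_len] for i in range(0, len(schema_json), max_len)]
    props = {"schema.numParts": str(len(parts))}
    for index, part in enumerate(parts):
        props[f"schema.part.{index}"] = part
    return props


def join_schema(props):
    """Reassemble the original schema string from the numbered properties."""
    count = int(props["schema.numParts"])
    return "".join(props[f"schema.part.{i}"] for i in range(count))


# A wide table: 200 string columns serialized as JSON.
wide_schema = json.dumps(
    {"fields": [{"name": f"col{i}", "type": "string"} for i in range(200)]}
)
props = split_schema(wide_schema, max_len=1000)
restored = join_schema(props)
```

Storing the part count alongside the parts lets the reader reassemble them in order without relying on how the property store iterates its keys.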
    • Liang-Chi Hsieh's avatar
      [SPARK-6037][SQL] Avoiding duplicate Parquet schema merging · 4ad5153f
      Liang-Chi Hsieh authored
      `FilteringParquetRowInputFormat` manually merges Parquet schemas before computing splits. However, this is redundant because the schemas are already merged in `ParquetRelation2`. We don't need to re-merge them in the `InputFormat`.
      
      Author: Liang-Chi Hsieh <viirya@gmail.com>
      
      Closes #4786 from viirya/dup_parquet_schemas_merge and squashes the following commits:
      
      ef78a5a [Liang-Chi Hsieh] Avoiding duplicate Parquet schema merging.
      4ad5153f
    • Hong Shen's avatar
      [SPARK-5529][CORE]Add expireDeadHosts in HeartbeatReceiver · 18f20984
      Hong Shen authored
      If a BlockManager has not sent a heartbeat for more than 120s, BlockManagerMasterActor will remove it. However, CoarseGrainedSchedulerBackend can only remove an executor after a DisassociatedEvent. We should therefore run expireDeadHosts in HeartbeatReceiver.
      
      Author: Hong Shen <hongshen@tencent.com>
      
      Closes #4363 from shenh062326/my_change3 and squashes the following commits:
      
      2c9a46a [Hong Shen] Change some code style.
      1a042ff [Hong Shen] Change some code style.
      2dc456e [Hong Shen] Change some code style.
      d221493 [Hong Shen] Fix test failed
      7448ac6 [Hong Shen] A minor change in sparkContext and heartbeatReceiver
      b904aed [Hong Shen] Fix failed test
      52725af [Hong Shen] Remove assert in SparkContext.killExecutors
      5bedcb8 [Hong Shen] Remove assert in SparkContext.killExecutors
      a858fb5 [Hong Shen] A minor change in HeartbeatReceiver
      3e221d9 [Hong Shen] A minor change in HeartbeatReceiver
      6bab7aa [Hong Shen] Change a code style.
      07952f3 [Hong Shen] Change configs name and code style.
      ce9257e [Hong Shen] Fix test failed
      bccd515 [Hong Shen] Fix test failed
      8e77408 [Hong Shen] Fix test failed
      c1dfda1 [Hong Shen] Fix test failed
      e197e20 [Hong Shen] Fix test failed
      fb5df97 [Hong Shen] Remove ExpireDeadHosts in BlockManagerMessages
      b5c0441 [Hong Shen] Remove expireDeadHosts in BlockManagerMasterActor
      c922cb0 [Hong Shen] Add expireDeadHosts in HeartbeatReceiver
      18f20984
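
The idea behind expiring dead hosts can be sketched as a small timestamp tracker. The class and method names below are hypothetical and do not reflect Spark's `HeartbeatReceiver` implementation; the 120-second timeout is the figure quoted in the commit message, and time is passed in explicitly so the example is deterministic.

```python
class HeartbeatTracker:
    """Hypothetical tracker; not Spark's HeartbeatReceiver."""

    def __init__(self, timeout_seconds=120.0):
        self.timeout_seconds = timeout_seconds
        self._last_seen = {}

    def heartbeat(self, executor_id, now):
        # Record the most recent time this executor reported in.
        self._last_seen[executor_id] = now

    def expire_dead_hosts(self, now):
        """Remove and return executors silent for longer than the timeout."""
        dead = [
            executor_id
            for executor_id, seen in self._last_seen.items()
            if now - seen > self.timeout_seconds
        ]
        for executor_id in dead:
            del self._last_seen[executor_id]
        return dead


tracker = HeartbeatTracker(timeout_seconds=120.0)
tracker.heartbeat("exec-1", now=0.0)
tracker.heartbeat("exec-2", now=100.0)
expired = tracker.expire_dead_hosts(now=130.0)
```

In a real system the expiry check would run periodically and the returned executors would be handed to whatever component removes them from scheduling.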
    • Sean Owen's avatar
      SPARK-4579 [WEBUI] Scheduling Delay appears negative · fbc46947
      Sean Owen authored
      Ensure scheduler delay handles unfinished task case, and ensure delay is never negative even due to rounding
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #4796 from srowen/SPARK-4579 and squashes the following commits:
      
      ad6713c [Sean Owen] Ensure scheduler delay handles unfinished task case, and ensure delay is never negative even due to rounding
      fbc46947
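
The two guarantees stated in SPARK-4579 (handle unfinished tasks, never report a negative delay) can be sketched as one small function. The parameter names and the formula are assumptions for illustration; the actual computation in the Spark UI may include other components.

```python
def scheduler_delay(launch_time, finish_time, run_time, overhead_time):
    """Return the scheduling delay in milliseconds, or 0 for unfinished tasks."""
    if finish_time is None:
        # The task has not finished, so no meaningful delay exists yet.
        return 0
    total = finish_time - launch_time
    # Clamp at zero: independently rounded components can sum past the total.
    return max(0, total - run_time - overhead_time)


normal = scheduler_delay(1000, 1500, 400, 50)
rounded = scheduler_delay(1000, 1500, 480, 25)
unfinished = scheduler_delay(1000, None, 0, 0)
```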
    • tedyu's avatar
      SPARK-6045 RecordWriter should be checked against null in PairRDDFunctio... · e60ad2f4
      tedyu authored
      ...ns#saveAsNewAPIHadoopDataset
      
      Author: tedyu <yuzhihong@gmail.com>
      
      Closes #4794 from tedyu/master and squashes the following commits:
      
      2632a57 [tedyu] SPARK-6045 RecordWriter should be checked against null in PairRDDFunctions#saveAsNewAPIHadoopDataset
      2d8d4b1 [tedyu] SPARK-6045 RecordWriter should be checked against null in PairRDDFunctions#saveAsNewAPIHadoopDataset
      e60ad2f4
    • mohit.goyal's avatar
      [SPARK-5951][YARN] Remove unreachable driver memory properties in yarn client mode · b38dec2f
      mohit.goyal authored
      Remove unreachable driver memory properties in yarn client mode
      
      Author: mohit.goyal <mohit.goyal@guavus.com>
      
      Closes #4730 from zuxqoj/master and squashes the following commits:
      
      977dc96 [mohit.goyal] remove not rechable deprecated variables in yarn client mode
      b38dec2f
    • moussa taifi's avatar
      Add a note for context termination for History server on Yarn · c871e2da
      moussa taifi authored
      The history server on Yarn only shows completed jobs. This adds a note about the explicit context termination needed at the end of a Spark job, which is a best practice anyway.
      Related to SPARK-2972 and SPARK-3458
      
      Author: moussa taifi <moutai10@gmail.com>
      
      Closes #4721 from moutai/add-history-server-note-for-closing-the-spark-context and squashes the following commits:
      
      9f5b6c3 [moussa taifi] Fix upper case typo for YARN
      3ad3db4 [moussa taifi] Add context termination for History server on Yarn
      c871e2da
    • Sean Owen's avatar
      SPARK-4300 [CORE] Race condition during SparkWorker shutdown · 3fb53c02
      Sean Owen authored
      Close appender saving stdout/stderr before destroying process to avoid exception on reading closed input stream.
      (This also removes a redundant `waitFor()` although it was harmless)
      
      CC tdas since I think you wrote this method.
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #4787 from srowen/SPARK-4300 and squashes the following commits:
      
      e0cdabf [Sean Owen] Close appender saving stdout/stderr before destroying process to avoid exception on reading closed input stream
      3fb53c02
    • Cheolsoo Park's avatar
      [SPARK-6018] [YARN] NoSuchMethodError in Spark app is swallowed by YARN AM · 5f3238b3
      Cheolsoo Park authored
      Author: Cheolsoo Park <cheolsoop@netflix.com>
      
      Closes #4773 from piaozhexiu/SPARK-6018 and squashes the following commits:
      
      2a919d5 [Cheolsoo Park] Rename e with cause to avoid duplicate names
      1e71d2d [Cheolsoo Park] Replace placeholder with throwable
      eb5750d [Cheolsoo Park] NoSuchMethodError in Spark app is swallowed by YARN AM
      5f3238b3
    • Tathagata Das's avatar
      [SPARK-6027][SPARK-5546] Fixed --jar and --packages not working for KafkaUtils... · aa63f633
      Tathagata Das authored
      [SPARK-6027][SPARK-5546] Fixed --jar and --packages not working for KafkaUtils and improved error message
      
      In short, the problem in SPARK-6027 is that JARs such as kafka-assembly.jar do not work in Python because the added JAR is not visible to the class loader used by Py4J. Py4J uses Class.forName(), which does not use the system class loader, whereas the JARs are visible only in the thread's context class loader. This change therefore uses the context class loader to create the KafkaUtils DStream object. It works whether the Kafka libraries are added with --jars spark-streaming-kafka-assembly.jar or with --packages spark-streaming-kafka.
      
      Also improves the error message.
      
      davies
      
      Author: Tathagata Das <tathagata.das1565@gmail.com>
      
      Closes #4779 from tdas/kafka-python-fix and squashes the following commits:
      
      fb16b04 [Tathagata Das] Removed import
      c1fdf35 [Tathagata Das] Fixed long line and improved documentation
      7b88be8 [Tathagata Das] Fixed --jar not working for KafkaUtils and improved error message
      aa63f633
    • xukun 00228947's avatar
      [SPARK-3562]Periodic cleanup event logs · 8942b522
      xukun 00228947 authored
      Author: xukun 00228947 <xukun.xu@huawei.com>
      
      Closes #4214 from viper-kun/cleaneventlog and squashes the following commits:
      
      7a5b9c5 [xukun 00228947] fix issue
      31674ee [xukun 00228947] fix issue
      6e3d06b [xukun 00228947] fix issue
      373f3b9 [xukun 00228947] fix issue
      71782b5 [xukun 00228947] fix issue
      5b45035 [xukun 00228947] fix issue
      70c28d6 [xukun 00228947] fix issues
      adcfe86 [xukun 00228947] Periodic cleanup event logs
      8942b522
    • Li Zhihui's avatar
      Modify default value description for spark.scheduler.minRegisteredResourcesRatio on docs. · 10094a52
      Li Zhihui authored
      This configuration is currently not supported in Mesos mode.
      See https://github.com/apache/spark/pull/1462
      
      Author: Li Zhihui <zhihui.li@intel.com>
      
      Closes #4781 from li-zhihui/fixdocconf and squashes the following commits:
      
      63e7a44 [Li Zhihui] Modify default value description for spark.scheduler.minRegisteredResourcesRatio on docs.
      10094a52