  1. Jun 01, 2017
    • [SPARK-20244][CORE] Handle incorrect bytesRead metrics when using PySpark · 5854f77c
      jerryshao authored
      ## What changes were proposed in this pull request?
      
      Hadoop FileSystem's statistics are based on thread-local variables, which is fine as long as the whole RDD computation chain runs in a single thread. But if a child RDD creates another thread to consume the iterator obtained from a Hadoop RDD, the bytesRead computation goes wrong, because the iterator's `next()` and `close()` may now run in different threads. This can happen when using PySpark with PythonRDD.
      
      So here we build a map to track `bytesRead` per thread and sum the values. This method could be used in three RDDs, `HadoopRDD`, `NewHadoopRDD` and `FileScanRDD`. I assume `FileScanRDD` cannot be called directly, so only `HadoopRDD` and `NewHadoopRDD` are fixed.
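
      A minimal sketch of the idea (names are hypothetical, not the actual Spark internals): each reading thread accumulates its own counter in a concurrent map, and the total is the sum over all threads, so it no longer matters which thread ends up calling `next()` or `close()`.

      ```scala
      import java.util.concurrent.ConcurrentHashMap
      import java.util.concurrent.atomic.AtomicLong
      import scala.collection.JavaConverters._

      // Hypothetical helper, for illustration only.
      object ThreadAwareBytesRead {
        private val bytesReadPerThread = new ConcurrentHashMap[java.lang.Long, AtomicLong]()

        // Called from whichever thread actually reads records.
        def record(bytes: Long): Unit = {
          val tid = java.lang.Long.valueOf(Thread.currentThread().getId)
          var counter = bytesReadPerThread.get(tid)
          if (counter == null) {
            bytesReadPerThread.putIfAbsent(tid, new AtomicLong(0L))
            counter = bytesReadPerThread.get(tid)
          }
          counter.addAndGet(bytes)
        }

        // Safe to call from any thread, e.g. the one that runs close().
        def totalBytesRead: Long = bytesReadPerThread.values().asScala.map(_.get()).sum
      }
      ```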
      
      ## How was this patch tested?
      
      Unit test and local cluster verification.
      
      Author: jerryshao <sshao@hortonworks.com>
      
      Closes #17617 from jerryshao/SPARK-20244.
      5854f77c
  2. May 31, 2017
  3. May 30, 2017
    • [SPARK-20275][UI] Do not display "Completed" column for in-progress applications · 52ed9b28
      jerryshao authored
      ## What changes were proposed in this pull request?
      
      The current HistoryServer displays the completed date of an in-progress application as `1969-12-31 23:59:59`, which is not meaningful. Instead of showing this incorrect completion date, this PR proposes to hide the column for in-progress applications.
      
      The reason for only hiding the column rather than deleting the field is that this data is fetched through the REST API, whose format is shown below, where `endTime` matches `endTimeEpoch`. So instead of changing the REST API and breaking backward compatibility, the simpler solution is to only hide the column.
      
      ```
      [ {
        "id" : "local-1491805439678",
        "name" : "Spark shell",
        "attempts" : [ {
          "startTime" : "2017-04-10T06:23:57.574GMT",
          "endTime" : "1969-12-31T23:59:59.999GMT",
          "lastUpdated" : "2017-04-10T06:23:57.574GMT",
          "duration" : 0,
          "sparkUser" : "",
          "completed" : false,
          "startTimeEpoch" : 1491805437574,
          "endTimeEpoch" : -1,
          "lastUpdatedEpoch" : 1491805437574
        } ]
      } ]
      ```
      
      Here is UI before changed:
      
      <img width="1317" alt="screen shot 2017-04-10 at 3 45 57 pm" src="https://cloud.githubusercontent.com/assets/850797/24851938/17d46cc0-1e08-11e7-84c7-90120e171b41.png">
      
      And after:
      
      <img width="1281" alt="screen shot 2017-04-10 at 4 02 35 pm" src="https://cloud.githubusercontent.com/assets/850797/24851945/1fe9da58-1e08-11e7-8d0d-9262324f9074.png">
      
      ## How was this patch tested?
      
      Manual verification.
      
      Author: jerryshao <sshao@hortonworks.com>
      
      Closes #17588 from jerryshao/SPARK-20275.
      52ed9b28
    • [SPARK-20333] HashPartitioner should be compatible with num of child RDD's partitions. · de953c21
      jinxing authored
      ## What changes were proposed in this pull request?
      
      Fix the tests
      "don't submit stage until its dependencies map outputs are registered (SPARK-5259)",
      "run trivial shuffle with out-of-band executor failure and retry", and
      "reduce tasks should be placed locally with map output"
      in DAGSchedulerSuite.
      
      Author: jinxing <jinxing6042@126.com>
      
      Closes #17634 from jinxing64/SPARK-20333.
      de953c21
  4. May 27, 2017
  5. May 26, 2017
    • [SPARK-19659][CORE][FOLLOW-UP] Fetch big blocks to disk when shuffle-read · 1d62f8ac
      Wenchen Fan authored
      ## What changes were proposed in this pull request?
      
      This PR includes some minor improvements to the comments and tests in https://github.com/apache/spark/pull/16989
      
      ## How was this patch tested?
      
      N/A
      
      Author: Wenchen Fan <wenchen@databricks.com>
      
      Closes #18117 from cloud-fan/follow.
      1d62f8ac
    • [SPARK-10643][CORE] Make spark-submit download remote files to local in client mode · 4af37812
      Yu Peng authored
      ## What changes were proposed in this pull request?
      
      This PR makes the spark-submit script download remote files to the local file system for local/standalone client mode.
      
      ## How was this patch tested?
      
      - Unit tests
      - Manual tests by adding s3a jar and testing against file on s3.
      
      Author: Yu Peng <loneknightpy@gmail.com>
      
      Closes #18078 from loneknightpy/download-jar-in-spark-submit.
      4af37812
    • [SPARK-20014] Optimize mergeSpillsWithFileStream method · 473d7552
      Sital Kedia authored
      ## What changes were proposed in this pull request?
      
      When the individual partition sizes in a spill are small, the mergeSpillsWithTransferTo method performs many small disk I/Os, which is very inefficient. One way to improve performance is to use the mergeSpillsWithFileStream method instead, turning off transferTo and using buffered file reads/writes to improve I/O throughput.
      However, the current implementation of mergeSpillsWithFileStream does not buffer its reads and writes, and in addition it unnecessarily flushes the output file for each partition.
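
      To illustrate the buffering idea (a simplified sketch, not the actual UnsafeShuffleWriter code): wrapping the raw file streams in buffered streams turns many tiny reads and writes into fewer large I/O operations, and the output is flushed once at the end instead of once per partition.

      ```scala
      import java.io.{BufferedInputStream, BufferedOutputStream, File, FileInputStream, FileOutputStream}

      // Sketch: concatenate spill files into one output through buffered streams
      // so small partition chunks become large, sequential I/O; close (and thus
      // flush) the output only once at the very end.
      def mergeSpills(spills: Seq[File], output: File, bufferSize: Int = 1024 * 1024): Unit = {
        val out = new BufferedOutputStream(new FileOutputStream(output), bufferSize)
        try {
          val buf = new Array[Byte](bufferSize)
          for (spill <- spills) {
            val in = new BufferedInputStream(new FileInputStream(spill), bufferSize)
            try {
              var n = in.read(buf)
              while (n != -1) {
                out.write(buf, 0, n) // buffered: no syscall per small chunk
                n = in.read(buf)
              }
            } finally {
              in.close()
            }
          }
        } finally {
          out.close() // single flush/close instead of per-partition flushes
        }
      }
      ```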
      
      ## How was this patch tested?
      
      Tested this change by running a job on the cluster and the map stage run time was reduced by around 20%.
      
      Author: Sital Kedia <skedia@fb.com>
      
      Closes #17343 from sitalkedia/upstream_mergeSpillsWithFileStream.
      473d7552
    • [SPARK-20835][CORE] It should exit directly when the --total-executor-cores... · 0fd84b05
      10129659 authored
      [SPARK-20835][CORE] It should exit directly when the --total-executor-cores parameter is set to less than 0 when submitting an application
      
      ## What changes were proposed in this pull request?
      In my test, the submitted app ran without an error when --total-executor-cores was set to less than 0,
      and only gave this warning:
      "2017-05-22 17:19:36,319 WARN org.apache.spark.scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources"
      
      It should exit directly when the --total-executor-cores parameter is set to less than 0 when submitting an application.
      
      ## How was this patch tested?
      Ran the unit tests.
      
      Author: 10129659 <chen.yanshan@zte.com.cn>
      
      Closes #18060 from eatoncys/totalcores.
      0fd84b05
    • [SPARK-20887][CORE] support alternative keys in ConfigBuilder · 629f38e1
      Wenchen Fan authored
      ## What changes were proposed in this pull request?
      
      `ConfigBuilder` builds a `ConfigEntry` that can only read its value from one key, so if we want to rename a config while still honoring the old name, it is hard to do.
      
      This PR introduces `ConfigBuilder.withAlternative` to support reading a config value from alternative keys. It also renames `spark.scheduler.listenerbus.eventqueue.size` to `spark.scheduler.listenerbus.eventqueue.capacity` using this feature, per https://github.com/apache/spark/pull/14269#discussion_r118432313
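
      Declaring an entry with an alternative (old) key would look roughly like the following. This is a sketch based on Spark's internal config DSL; the default value shown is illustrative.

      ```scala
      // ConfigBuilder is private[spark], so this only compiles inside the Spark code base.
      import org.apache.spark.internal.config.ConfigBuilder

      val LISTENER_BUS_EVENT_QUEUE_CAPACITY =
        ConfigBuilder("spark.scheduler.listenerbus.eventqueue.capacity")
          .withAlternative("spark.scheduler.listenerbus.eventqueue.size") // old key still read
          .intConf
          .createWithDefault(10000) // illustrative default
      ```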
      
      ## How was this patch tested?
      
      a new test
      
      Author: Wenchen Fan <wenchen@databricks.com>
      
      Closes #18110 from cloud-fan/config.
      629f38e1
    • [SPARK-20868][CORE] UnsafeShuffleWriter should verify the position after FileChannel.transferTo · d9ad7890
      Wenchen Fan authored
      ## What changes were proposed in this pull request?
      
      A long time ago we fixed a [bug](https://issues.apache.org/jira/browse/SPARK-3948) in the shuffle writer related to `FileChannel.transferTo`. We were not very confident about that fix, so we added a position check after the write to try to discover the bug earlier.
      
      However, this check is missing in the newer `UnsafeShuffleWriter`; this PR adds it.
      
      https://issues.apache.org/jira/browse/SPARK-18105 may be related to that `FileChannel.transferTo` bug; hopefully we can find the root cause after adding this position check.
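
      The shape of the check is roughly the following — a sketch under assumed method names, not the exact UnsafeShuffleWriter code:

      ```scala
      import java.nio.channels.FileChannel

      // Record the destination channel's position before the copy and verify
      // it advanced by exactly `length` afterwards; fail fast otherwise.
      def copyWithPositionCheck(src: FileChannel, dst: FileChannel, srcPos: Long, length: Long): Unit = {
        val initialDstPos = dst.position()
        var transferred = 0L
        while (transferred < length) {
          transferred += src.transferTo(srcPos + transferred, length - transferred, dst)
        }
        val finalDstPos = dst.position()
        if (finalDstPos - initialDstPos != length) {
          throw new IllegalStateException(
            s"Current position $finalDstPos does not equal expected position " +
            s"${initialDstPos + length} after transferTo.")
        }
      }
      ```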
      
      ## How was this patch tested?
      
      N/A
      
      Author: Wenchen Fan <wenchen@databricks.com>
      
      Closes #18091 from cloud-fan/shuffle.
      d9ad7890
  6. May 25, 2017
    • [SPARK-19707][SPARK-18922][TESTS][SQL][CORE] Fix test failures/the invalid... · e9f983df
      hyukjinkwon authored
      [SPARK-19707][SPARK-18922][TESTS][SQL][CORE] Fix test failures/the invalid path check for sc.addJar on Windows
      
      ## What changes were proposed in this pull request?
      
      This PR proposes two things:
      
      - A follow-up for SPARK-19707 (improving the invalid path check for sc.addJar on Windows as well).
      
      ```
      org.apache.spark.SparkContextSuite:
       - add jar with invalid path *** FAILED *** (32 milliseconds)
         2 was not equal to 1 (SparkContextSuite.scala:309)
         ...
      ```
      
      - Fix path vs URI related test failures on Windows.
      
      ```
      org.apache.spark.storage.LocalDirsSuite:
       - SPARK_LOCAL_DIRS override also affects driver *** FAILED *** (0 milliseconds)
         new java.io.File("/NONEXISTENT_PATH").exists() was true (LocalDirsSuite.scala:50)
         ...
      
       - Utils.getLocalDir() throws an exception if any temporary directory cannot be retrieved *** FAILED *** (15 milliseconds)
         Expected exception java.io.IOException to be thrown, but no exception was thrown. (LocalDirsSuite.scala:64)
         ...
      ```
      
      ```
      org.apache.spark.sql.hive.HiveSchemaInferenceSuite:
       - orc: schema should be inferred and saved when INFER_AND_SAVE is specified *** FAILED *** (203 milliseconds)
         java.net.URISyntaxException: Illegal character in opaque part at index 2: C:\projects\spark\target\tmp\spark-dae61ab3-a851-4dd3-bf4e-be97c501f254
         ...
      
       - parquet: schema should be inferred and saved when INFER_AND_SAVE is specified *** FAILED *** (203 milliseconds)
         java.net.URISyntaxException: Illegal character in opaque part at index 2: C:\projects\spark\target\tmp\spark-fa3aff89-a66e-4376-9a37-2a9b87596939
         ...
      
       - orc: schema should be inferred but not stored when INFER_ONLY is specified *** FAILED *** (141 milliseconds)
         java.net.URISyntaxException: Illegal character in opaque part at index 2: C:\projects\spark\target\tmp\spark-fb464e59-b049-481b-9c75-f53295c9fc2c
         ...
      
       - parquet: schema should be inferred but not stored when INFER_ONLY is specified *** FAILED *** (125 milliseconds)
         java.net.URISyntaxException: Illegal character in opaque part at index 2: C:\projects\spark\target\tmp\spark-9487568e-80a4-42b3-b0a5-d95314c4ccbc
         ...
      
       - orc: schema should not be inferred when NEVER_INFER is specified *** FAILED *** (156 milliseconds)
         java.net.URISyntaxException: Illegal character in opaque part at index 2: C:\projects\spark\target\tmp\spark-0d2dfa45-1b0f-4958-a8be-1074ed0135a
         ...
      
       - parquet: schema should not be inferred when NEVER_INFER is specified *** FAILED *** (547 milliseconds)
         java.net.URISyntaxException: Illegal character in opaque part at index 2: C:\projects\spark\target\tmp\spark-6d95d64e-613e-4a59-a0f6-d198c5aa51ee
         ...
      ```
      
      ```
      org.apache.spark.sql.execution.command.DDLSuite:
       - create temporary view using *** FAILED *** (15 milliseconds)
         org.apache.spark.sql.AnalysisException: Path does not exist: file:/C:projectsspark	arget	mpspark-3881d9ca-561b-488d-90b9-97587472b853	mp;
         ...
      
       - insert data to a data source table which has a non-existing location should succeed *** FAILED *** (109 milliseconds)
         file:/C:projectsspark%09arget%09mpspark-4cad3d19-6085-4b75-b407-fe5e9d21df54 did not equal file:///C:/projects/spark/target/tmp/spark-4cad3d19-6085-4b75-b407-fe5e9d21df54 (DDLSuite.scala:1869)
         ...
      
       - insert into a data source table with a non-existing partition location should succeed *** FAILED *** (94 milliseconds)
         file:/C:projectsspark%09arget%09mpspark-4b52e7de-e3aa-42fd-95d4-6d4d58d1d95d did not equal file:///C:/projects/spark/target/tmp/spark-4b52e7de-e3aa-42fd-95d4-6d4d58d1d95d (DDLSuite.scala:1910)
         ...
      
       - read data from a data source table which has a non-existing location should succeed *** FAILED *** (93 milliseconds)
         file:/C:projectsspark%09arget%09mpspark-f8c281e2-08c2-4f73-abbf-f3865b702c34 did not equal file:///C:/projects/spark/target/tmp/spark-f8c281e2-08c2-4f73-abbf-f3865b702c34 (DDLSuite.scala:1937)
         ...
      
       - read data from a data source table with non-existing partition location should succeed *** FAILED *** (110 milliseconds)
         java.lang.IllegalArgumentException: Can not create a Path from an empty string
         ...
      
       - create datasource table with a non-existing location *** FAILED *** (94 milliseconds)
         file:/C:projectsspark%09arget%09mpspark-387316ae-070c-4e78-9b78-19ebf7b29ec8 did not equal file:///C:/projects/spark/target/tmp/spark-387316ae-070c-4e78-9b78-19ebf7b29ec8 (DDLSuite.scala:1982)
         ...
      
       - CTAS for external data source table with a non-existing location *** FAILED *** (16 milliseconds)
         java.lang.IllegalArgumentException: Can not create a Path from an empty string
         ...
      
       - CTAS for external data source table with a existed location *** FAILED *** (15 milliseconds)
         java.lang.IllegalArgumentException: Can not create a Path from an empty string
         ...
      
       - data source table:partition column name containing a b *** FAILED *** (125 milliseconds)
         java.lang.IllegalArgumentException: Can not create a Path from an empty string
         ...
      
       - data source table:partition column name containing a:b *** FAILED *** (143 milliseconds)
         java.lang.IllegalArgumentException: Can not create a Path from an empty string
         ...
      
       - data source table:partition column name containing a%b *** FAILED *** (109 milliseconds)
         java.lang.IllegalArgumentException: Can not create a Path from an empty string
         ...
      
       - data source table:partition column name containing a,b *** FAILED *** (109 milliseconds)
         java.lang.IllegalArgumentException: Can not create a Path from an empty string
         ...
      
       - location uri contains a b for datasource table *** FAILED *** (94 milliseconds)
         file:/C:projectsspark%09arget%09mpspark-5739cda9-b702-4e14-932c-42e8c4174480a%20b did not equal file:///C:/projects/spark/target/tmp/spark-5739cda9-b702-4e14-932c-42e8c4174480/a%20b (DDLSuite.scala:2084)
         ...
      
       - location uri contains a:b for datasource table *** FAILED *** (78 milliseconds)
         file:/C:projectsspark%09arget%09mpspark-9bdd227c-840f-4f08-b7c5-4036638f098da:b did not equal file:///C:/projects/spark/target/tmp/spark-9bdd227c-840f-4f08-b7c5-4036638f098d/a:b (DDLSuite.scala:2084)
         ...
      
       - location uri contains a%b for datasource table *** FAILED *** (78 milliseconds)
         file:/C:projectsspark%09arget%09mpspark-62bb5f1d-fa20-460a-b534-cb2e172a3640a%25b did not equal file:///C:/projects/spark/target/tmp/spark-62bb5f1d-fa20-460a-b534-cb2e172a3640/a%25b (DDLSuite.scala:2084)
         ...
      
       - location uri contains a b for database *** FAILED *** (16 milliseconds)
         org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);
         ...
      
       - location uri contains a:b for database *** FAILED *** (15 milliseconds)
         org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);
         ...
      
       - location uri contains a%b for database *** FAILED *** (0 milliseconds)
         org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);
         ...
      ```
      
      ```
      org.apache.spark.sql.hive.execution.HiveDDLSuite:
       - create hive table with a non-existing location *** FAILED *** (16 milliseconds)
         org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);
         ...
      
       - CTAS for external hive table with a non-existing location *** FAILED *** (16 milliseconds)
         org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);
         ...
      
       - CTAS for external hive table with a existed location *** FAILED *** (16 milliseconds)
         org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);
         ...
      
       - partition column name of parquet table containing a b *** FAILED *** (156 milliseconds)
         java.lang.IllegalArgumentException: Can not create a Path from an empty string
         ...
      
       - partition column name of parquet table containing a:b *** FAILED *** (94 milliseconds)
         java.lang.IllegalArgumentException: Can not create a Path from an empty string
         ...
      
       - partition column name of parquet table containing a%b *** FAILED *** (125 milliseconds)
         java.lang.IllegalArgumentException: Can not create a Path from an empty string
         ...
      
       - partition column name of parquet table containing a,b *** FAILED *** (110 milliseconds)
         java.lang.IllegalArgumentException: Can not create a Path from an empty string
         ...
      
       - partition column name of hive table containing a b *** FAILED *** (15 milliseconds)
         org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);
         ...
      
       - partition column name of hive table containing a:b *** FAILED *** (16 milliseconds)
         org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);
         ...
      
       - partition column name of hive table containing a%b *** FAILED *** (16 milliseconds)
         org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);
         ...
      
       - partition column name of hive table containing a,b *** FAILED *** (0 milliseconds)
         org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);
         ...
      
       - hive table: location uri contains a b *** FAILED *** (0 milliseconds)
         org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);
         ...
      
       - hive table: location uri contains a:b *** FAILED *** (0 milliseconds)
         org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);
         ...
      
       - hive table: location uri contains a%b *** FAILED *** (0 milliseconds)
         org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);
         ...
      ```
      
      ```
      org.apache.spark.sql.sources.PathOptionSuite:
       - path option also exist for write path *** FAILED *** (94 milliseconds)
         file:/C:projectsspark%09arget%09mpspark-2870b281-7ac0-43d6-b6b6-134e01ab6fdc did not equal file:///C:/projects/spark/target/tmp/spark-2870b281-7ac0-43d6-b6b6-134e01ab6fdc (PathOptionSuite.scala:98)
         ...
      ```
      
      ```
      org.apache.spark.sql.CachedTableSuite:
       - SPARK-19765: UNCACHE TABLE should un-cache all cached plans that refer to this table *** FAILED *** (110 milliseconds)
         java.lang.IllegalArgumentException: Can not create a Path from an empty string
         ...
      ```
      
      ```
      org.apache.spark.sql.execution.DataSourceScanExecRedactionSuite:
       - treeString is redacted *** FAILED *** (250 milliseconds)
         "file:/C:/projects/spark/target/tmp/spark-3ecc1fa4-3e76-489c-95f4-f0b0500eae28" did not contain "C:\projects\spark\target\tmp\spark-3ecc1fa4-3e76-489c-95f4-f0b0500eae28" (DataSourceScanExecRedactionSuite.scala:46)
         ...
      ```
      
      ## How was this patch tested?
      
      Tested each fix via AppVeyor and checked that each passed at least once. These should be retested via AppVeyor in this PR.
      
      Author: hyukjinkwon <gurwls223@gmail.com>
      
      Closes #17987 from HyukjinKwon/windows-20170515.
      e9f983df
    • [SPARK-19659] Fetch big blocks to disk when shuffle-read. · 3f94e64a
      jinxing authored
      ## What changes were proposed in this pull request?
      
      Currently the whole block is fetched into memory (off-heap by default) during shuffle read. A block is defined by (shuffleId, mapId, reduceId), so it can be large in skewed situations. If an OOM happens during shuffle read, the job is killed and users are told to "Consider boosting spark.yarn.executor.memoryOverhead". Adjusting the parameter and allocating more memory can resolve the OOM, but that approach is not well suited to production environments, especially data warehouses.
      When using Spark SQL as the data engine in a warehouse, users want a unified parameter (e.g. memory) with less wasted resource (resource that is allocated but not used). This is especially true when migrating the data engine to Spark from another one (e.g. Hive); tuning the parameter for thousands of SQL queries one by one is very time consuming.
      It is not always easy to predict skew; when it happens, it makes sense to fetch remote blocks to disk for shuffle read rather than kill the job because of OOM.
      
      In this PR, I propose to fetch big blocks to disk (which was also mentioned in SPARK-3019); a simplified sketch follows the list below:
      
      1. Track the average size and also the outliers (which are larger than 2 * avgSize) in MapStatus;
      2. Request memory from `MemoryManager` before fetching blocks, and release it back to `MemoryManager` when the `ManagedBuffer` is released;
      3. Fetch remote blocks to disk when acquiring memory from `MemoryManager` fails; otherwise fetch to memory.
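
      A simplified sketch of the proposal; the trait and method names below are stand-ins, not the real MemoryManager or ShuffleBlockFetcherIterator APIs.

      ```scala
      import java.io.File

      object ShuffleFetchSketch {
        trait SimpleMemoryManager {
          def tryAcquire(bytes: Long): Boolean
          def release(bytes: Long): Unit
        }

        // Item 1: keep accurate sizes only for outliers larger than 2 * average.
        def summarize(sizes: Array[Long]): (Long, Map[Int, Long]) = {
          val avg = if (sizes.nonEmpty) sizes.sum / sizes.length else 0L
          val outliers = sizes.zipWithIndex.collect {
            case (size, reduceId) if size > 2 * avg => reduceId -> size
          }.toMap
          (avg, outliers)
        }

        // Items 2 and 3: reserve memory first; if that fails, stream to disk.
        def fetchBlock(blockSize: Long, mm: SimpleMemoryManager, tmpDir: File): Unit = {
          if (mm.tryAcquire(blockSize)) {
            println(s"fetching $blockSize bytes into memory") // released when the buffer is disposed
          } else {
            println(s"fetching $blockSize bytes to a temp file under $tmpDir")
          }
        }
      }
      ```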
      
      This is an improvement for memory control during shuffle and helps avoid OOM in scenarios like the following:
      1. A single huge block;
      2. The sizes of many blocks are underestimated in `MapStatus`, so the actual footprint of the blocks is much larger than estimated.
      
      ## How was this patch tested?
      Added unit test in `MapStatusSuite` and `ShuffleBlockFetcherIteratorSuite`.
      
      Author: jinxing <jinxing6042@126.com>
      
      Closes #16989 from jinxing64/SPARK-19659.
      3f94e64a
    • [SPARK-20250][CORE] Improper OOM error when a task been killed while spilling data · 731462a0
      Xianyang Liu authored
      ## What changes were proposed in this pull request?
      
      Currently, when a task is calling spill() but receives a kill request from the driver (e.g., a speculative task), the `TaskMemoryManager` throws an `OOM` exception. And we don't catch a `Fatal` exception when an error is caused by `Thread.interrupt`. So for `ClosedByInterruptException`, we should throw a `RuntimeException` instead of an `OutOfMemoryError`.
      
      https://issues.apache.org/jira/browse/SPARK-20250?jql=project%20%3D%20SPARK
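
      A small sketch of the intended handling (the real change is inside `TaskMemoryManager` and its consumers; the helper name here is illustrative):

      ```scala
      import java.io.IOException
      import java.nio.channels.ClosedByInterruptException

      // A spill that fails because the task was interrupted (killed) should
      // surface as a task failure, not as an executor-level OOM.
      def spillSafely(doSpill: () => Long): Long = {
        try {
          doSpill()
        } catch {
          case e: ClosedByInterruptException =>
            throw new RuntimeException("Task was interrupted while spilling", e)
          case e: IOException =>
            throw new RuntimeException("Error while calling spill()", e)
        }
      }
      ```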
      
      ## How was this patch tested?
      
      Existing unit tests.
      
      Author: Xianyang Liu <xianyang.liu@intel.com>
      
      Closes #18090 from ConeyLiu/SPARK-20250.
      731462a0
  7. May 24, 2017
    • [SPARK-20205][CORE] Make sure StageInfo is updated before sending event. · 95aef660
      Marcelo Vanzin authored
      The DAGScheduler was sending a "stage submitted" event before it properly
      updated the event's information. This meant that a listener (e.g. the
      event logging listener) could record wrong information about the event.
      
      This change sets the stage's submission time before the event is submitted,
      when there are tasks to be executed in the stage.
      
      Tested with existing unit tests.
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #17925 from vanzin/SPARK-20205.
      95aef660
    • [SPARK-18406][CORE] Race between end-of-task and completion iterator read lock release · d76633e3
      Xingbo Jiang authored
      ## What changes were proposed in this pull request?
      
      When a TaskContext is not propagated properly to all child threads of a task, as in the cases reported in this issue, we fail to get the TID from TaskContext, which makes us unable to release the lock and causes assertion failures. To resolve this, we have to explicitly pass the TID value to the `unlock` method.
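
      The shape of the fix, as a sketch with illustrative names: the task attempt id is captured while still on the task's own thread and passed explicitly to `unlock`, so a child thread with no TaskContext can still release the lock correctly.

      ```scala
      class LockManagerSketch {
        def unlock(blockId: String, taskAttemptId: Option[Long]): Unit = {
          val tid = taskAttemptId.getOrElse(sys.error(s"no task attempt id for $blockId"))
          println(s"task $tid releases read lock on $blockId")
        }
      }

      val locks = new LockManagerSketch
      val tidCapturedOnTaskThread = 42L // captured while TaskContext is still available
      // The completion callback may later run on a different thread, but it still
      // carries the explicit TID instead of relying on TaskContext.get().
      val onTaskCompletion: () => Unit = () => locks.unlock("rdd_0_0", Some(tidCapturedOnTaskThread))
      onTaskCompletion()
      ```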
      
      ## How was this patch tested?
      
      Add new failing regression test case in `RDDSuite`.
      
      Author: Xingbo Jiang <xingbo.jiang@databricks.com>
      
      Closes #18076 from jiangxb1987/completion-iterator.
      d76633e3
  8. May 22, 2017
    • [SPARK-20815][SPARKR] NullPointerException in RPackageUtils#checkManifestForR · 4dbb63f0
      James Shuster authored
      ## What changes were proposed in this pull request?
      
      - Add a null check to RPackageUtils#checkManifestForR so that jars w/o manifests don't NPE.
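
      The guard is conceptually simple — a sketch, not the real RPackageUtils code, and the manifest attribute name is illustrative:

      ```scala
      import java.util.jar.JarFile

      def jarDeclaresRPackage(jar: JarFile): Boolean = {
        val manifest = jar.getManifest // null when the jar has no META-INF/MANIFEST.MF
        if (manifest == null) {
          false
        } else {
          "true".equalsIgnoreCase(manifest.getMainAttributes.getValue("Spark-HasRPackage"))
        }
      }
      ```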
      
      ## How was this patch tested?
      
      - Unit tests and manual tests.
      
      Author: James Shuster <jshuster@palantir.com>
      
      Closes #18040 from jrshust/feature/r-package-utils.
      4dbb63f0
    • [SPARK-20801] Record accurate size of blocks in MapStatus when it's above threshold. · 2597674b
      jinxing authored
      ## What changes were proposed in this pull request?
      
      Currently, when the number of reducers is above 2000, HighlyCompressedMapStatus is used to store block sizes. In HighlyCompressedMapStatus, only the average size is stored for non-empty blocks, which is not good for memory control when we shuffle blocks. It makes sense to store the accurate size of a block when it is above a threshold.
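
      A sketch of the idea (not the real HighlyCompressedMapStatus): blocks at or above a threshold keep their accurate size, while the rest are represented by the average of the remaining non-empty blocks. The threshold value shown is illustrative.

      ```scala
      def summarizeBlockSizes(
          sizes: Array[Long],
          accurateThreshold: Long = 100L * 1024 * 1024): (Long, Map[Int, Long]) = {
        val huge = sizes.zipWithIndex.collect {
          case (size, reduceId) if size >= accurateThreshold => reduceId -> size
        }.toMap
        val normal = sizes.filter(s => s > 0 && s < accurateThreshold)
        val avgSize = if (normal.nonEmpty) normal.sum / normal.length else 0L
        (avgSize, huge) // readers look up a reduceId in `huge` first, else use avgSize
      }
      ```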
      
      ## How was this patch tested?
      
      Added test in MapStatusSuite.
      
      Author: jinxing <jinxing6042@126.com>
      
      Closes #18031 from jinxing64/SPARK-20801.
      2597674b
    • [SPARK-20813][WEB UI] Fixed Web UI executor page tab search by status not working · aea73be1
      John Lee authored
      ## What changes were proposed in this pull request?
      On the status column of the table, I removed the condition that forced the display value to take on only the values Active, Blacklisted and Dead.
      
      Before the removal, the values used for sorting and filtering that column were True and False.
      ## How was this patch tested?
      
      Tested with Active, Blacklisted and Dead present as current status.
      
      Author: John Lee <jlee2@yahoo-inc.com>
      
      Closes #18036 from yoonlee95/SPARK-20813.
      aea73be1
    • [SPARK-20609][CORE] Run the SortShuffleSuite unit tests have residual spark_* system directory · f1ffc6e7
      caoxuewen authored
      ## What changes were proposed in this pull request?
      This PR fixes the residual spark_* system directories left behind after running the SortShuffleSuite unit tests.
      For example:
      OS: Windows 7
      After running the SortShuffleSuite unit tests,
      the system TMP directory still contains '..\AppData\Local\Temp\spark-f64121f9-11b4-4ffd-a4f0-cfca66643503', which was not deleted.
      
      ## How was this patch tested?
      Run SortShuffleSuite unit test.
      
      Author: caoxuewen <cao.xuewen@zte.com.cn>
      
      Closes #17869 from heary-cao/SortShuffleSuite.
      f1ffc6e7
    • [SPARK-20591][WEB UI] Succeeded tasks num not equal in all jobs page and job... · 190d8b0b
      fjh100456 authored
      [SPARK-20591][WEB UI] Succeeded tasks num not equal in all jobs page and job detail page on spark web ui when speculative task(s) exist.
      
      ## What changes were proposed in this pull request?
      
      Modified the succeeded count in the job detail page from "completed = stageData.completedIndices.size" to "completed = stageData.numCompleteTasks", which makes the succeeded task counts in the all-jobs page and the job detail page consistent, and makes it easier to find which stages the speculative task(s) were in.
      
      ## How was this patch tested?
      
      manual tests
      
      Author: fjh100456 <fu.jinhua6@zte.com.cn>
      
      Closes #17923 from fjh100456/master.
      190d8b0b
  9. May 19, 2017
    • [SPARK-20607][CORE] Add new unit tests to ShuffleSuite · f398640d
      caoxuewen authored
      ## What changes were proposed in this pull request?
      
      This PR makes two updates:
      1. Adds new unit tests verifying that when there is no shuffle stage,
         shuffle does not generate the data file or the index files.
      2. Modifies the '[SPARK-4085] rerun map stage if reduce stage cannot find its local shuffle file' unit test:
         use a parallelism of 1 instead of 2, and check and delete the index file.
      
      ## How was this patch tested?
      The new unit test.
      
      Author: caoxuewen <cao.xuewen@zte.com.cn>
      
      Closes #17868 from heary-cao/ShuffleSuite.
      f398640d
  10. May 17, 2017
  11. May 16, 2017
    • [SPARK-20529][CORE] Allow worker and master work with a proxy server · 9150bca4
      Shixiong Zhu authored
      ## What changes were proposed in this pull request?
      
      In the current code, when a worker connects to the master, the master sends its address to the worker. The worker saves this address and uses it to reconnect in case of failure. However, sometimes this address is not correct: if there is a proxy between the master and the worker, the address the master sent is not the address of the proxy.
      
      In this PR, the master address used by the worker is sent to the master, and the master simply echoes this address back; the worker then uses this address to reconnect in case of failure. In other words, the worker uses the master address configured on the worker side if possible, rather than the master address set on the master side.
      
      There is still one potential issue, though: when a master is restarted or takes over leadership, the worker will use the address sent from the master to connect. If there is still a proxy between the master and the worker, that address may be wrong. However, there is no way to figure this out from the worker alone.
      
      ## How was this patch tested?
      
      The new added unit test.
      
      Author: Shixiong Zhu <shixiong@databricks.com>
      
      Closes #17821 from zsxwing/SPARK-20529.
      9150bca4
  12. May 15, 2017
  13. May 12, 2017
    • [SPARK-20702][CORE] TaskContextImpl.markTaskCompleted should not hide the original error · 7d6ff391
      Shixiong Zhu authored
      ## What changes were proposed in this pull request?
      
      This PR adds an `error` parameter to `TaskContextImpl.markTaskCompleted` to propagate the original error.
      
      It also fixes an issue where `TaskCompletionListenerException.getMessage` does not include `previousError`.
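
      A simplified sketch of the propagation (not the real TaskContextImpl or exception classes): the completion path carries the original task error, and a listener failure message reports that previous error instead of hiding it.

      ```scala
      class ListenerFailureException(msgs: Seq[String], previousError: Option[Throwable])
        extends RuntimeException {
        override def getMessage: String = previousError match {
          case Some(e) => msgs.mkString("\n") + s"\n\nPrevious exception in task: ${e.getMessage}"
          case None    => msgs.mkString("\n")
        }
      }

      def markTaskCompleted(error: Option[Throwable], listeners: Seq[() => Unit]): Unit = {
        val failures = listeners.flatMap { listener =>
          try { listener(); None } catch { case e: Exception => Some(e.getMessage) }
        }
        if (failures.nonEmpty) {
          throw new ListenerFailureException(failures, error)
        }
      }
      ```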
      
      ## How was this patch tested?
      
      New unit tests.
      
      Author: Shixiong Zhu <shixiong@databricks.com>
      
      Closes #17942 from zsxwing/SPARK-20702.
      7d6ff391
    • [SPARK-20554][BUILD] Remove usage of scala.language.reflectiveCalls · fc8a2b6e
      Sean Owen authored
      ## What changes were proposed in this pull request?
      
      Remove uses of scala.language.reflectiveCalls that are either unnecessary or probably result in more complex code. This turned out to be less significant than I thought, but it is still worth a touch-up.
      
      ## How was this patch tested?
      
      Existing tests.
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #17949 from srowen/SPARK-20554.
      fc8a2b6e
  14. May 10, 2017
    • [MINOR][BUILD] Fix lint-java breaks. · fcb88f92
      Xianyang Liu authored
      ## What changes were proposed in this pull request?
      
      This PR proposes to fix the lint-breaks as below:
      ```
      [ERROR] src/main/java/org/apache/spark/unsafe/Platform.java:[51] (regexp) RegexpSingleline: No trailing whitespace allowed.
      [ERROR] src/main/scala/org/apache/spark/sql/streaming/Trigger.java:[45,25] (naming) MethodName: Method name 'ProcessingTime' must match pattern '^[a-z][a-z0-9][a-zA-Z0-9_]*$'.
      [ERROR] src/main/scala/org/apache/spark/sql/streaming/Trigger.java:[62,25] (naming) MethodName: Method name 'ProcessingTime' must match pattern '^[a-z][a-z0-9][a-zA-Z0-9_]*$'.
      [ERROR] src/main/scala/org/apache/spark/sql/streaming/Trigger.java:[78,25] (naming) MethodName: Method name 'ProcessingTime' must match pattern '^[a-z][a-z0-9][a-zA-Z0-9_]*$'.
      [ERROR] src/main/scala/org/apache/spark/sql/streaming/Trigger.java:[92,25] (naming) MethodName: Method name 'ProcessingTime' must match pattern '^[a-z][a-z0-9][a-zA-Z0-9_]*$'.
      [ERROR] src/main/scala/org/apache/spark/sql/streaming/Trigger.java:[102,25] (naming) MethodName: Method name 'Once' must match pattern '^[a-z][a-z0-9][a-zA-Z0-9_]*$'.
      [ERROR] src/test/java/org/apache/spark/streaming/kinesis/JavaKinesisInputDStreamBuilderSuite.java:[28,8] (imports) UnusedImports: Unused import - org.apache.spark.streaming.api.java.JavaDStream.
      ```
      
      after:
      ```
      dev/lint-java
      Checkstyle checks passed.
      ```
      [Test Result](https://travis-ci.org/ConeyLiu/spark/jobs/229666169)
      
      ## How was this patch tested?
      
      Travis CI
      
      Author: Xianyang Liu <xianyang.liu@intel.com>
      
      Closes #17890 from ConeyLiu/codestyle.
      fcb88f92
    • [SPARK-20393][WEBU UI] Strengthen Spark to prevent XSS vulnerabilities · b512233a
      NICHOLAS T. MARION authored
      ## What changes were proposed in this pull request?
      
      Add stripXSS and stripXSSMap to Spark Core's UIUtils, and call these functions wherever getParameter is called on an HttpServletRequest.
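
      A minimal sketch of what such sanitization might look like; the character set stripped here is illustrative and not necessarily what the UIUtils implementation does.

      ```scala
      def stripXSS(requestParameter: String): String = {
        if (requestParameter == null) null
        else requestParameter.replaceAll("[<>\"'%;()&+]", "")
      }

      // Hypothetical call site, wherever a request parameter is echoed into HTML:
      // val appId = stripXSS(request.getParameter("appId"))
      ```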
      
      ## How was this patch tested?
      
      Unit tests, IBM Security AppScan Standard no longer showing vulnerabilities, manual verification of WebUI pages.
      
      Author: NICHOLAS T. MARION <nmarion@us.ibm.com>
      
      Closes #17686 from n-marion/xss-fix.
      b512233a
    • [SPARK-20637][CORE] Remove mention of old RDD classes from comments · a4cbf26b
      Michael Mior authored
      ## What changes were proposed in this pull request?
      
      A few comments around the code mention RDD classes that do not exist anymore. I'm not sure of the best way to replace these, so I've just removed them here.
      
      ## How was this patch tested?
      
      Only changes code comments, no testing required
      
      Author: Michael Mior <mmior@uwaterloo.ca>
      
      Closes #17900 from michaelmior/remove-old-rdds.
      a4cbf26b
    • [SPARK-20630][WEB UI] Fixed column visibility in Executor Tab · ca4625e0
      Alex Bozarth authored
      ## What changes were proposed in this pull request?
      
      #14617 added new columns to the executor table causing the visibility checks for the logs and threadDump columns to toggle the wrong columns since they used hard-coded column numbers.
      
      I've updated the checks to use column names instead of numbers so future updates don't accidentally break this again.
      
      Note: This will also need to be backported into 2.2 since #14617 was merged there.
      
      ## How was this patch tested?
      
      Manually tested
      
      Author: Alex Bozarth <ajbozart@us.ibm.com>
      
      Closes #17904 from ajbozarth/spark20630.
      ca4625e0
  15. May 09, 2017
  16. May 08, 2017
    • [SPARK-20605][CORE][YARN][MESOS] Deprecate not used AM and executor port configuration · 829cd7b8
      jerryshao authored
      ## What changes were proposed in this pull request?
      
      After SPARK-10997, the client-mode Netty RpcEnv no longer needs to start a server, so these port configurations are not used any more; this PR proposes to remove the two configurations "spark.executor.port" and "spark.am.port".
      
      ## How was this patch tested?
      
      Existing UTs.
      
      Author: jerryshao <sshao@hortonworks.com>
      
      Closes #17866 from jerryshao/SPARK-20605.
      829cd7b8
    • [SPARK-19956][CORE] Optimize a location order of blocks with topology information · 15526653
      Xianyang Liu authored
      ## What changes were proposed in this pull request?
      
      When calling the getLocations method of BlockManager, we only compare the data block's host. Non-local data blocks are selected at random, which may cause the selected data block to be in a different rack. So this patch additionally sorts locations by rack.
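
      A sketch of the ordering idea (names illustrative, not the BlockManager API): prefer replicas on the local host, then replicas in the local rack, then everything else.

      ```scala
      case class Replica(host: String, rack: Option[String])

      def sortByLocality(
          locations: Seq[Replica],
          localHost: String,
          localRack: Option[String]): Seq[Replica] = {
        val (sameHost, others) = locations.partition(_.host == localHost)
        val (sameRack, offRack) = others.partition(r => r.rack.isDefined && r.rack == localRack)
        sameHost ++ sameRack ++ offRack
      }
      ```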
      
      ## How was this patch tested?
      
      New test case.
      
      Author: Xianyang Liu <xianyang.liu@intel.com>
      
      Closes #17300 from ConeyLiu/blockmanager.
      15526653