  1. Jul 18, 2017
    • xuanyuanking's avatar
      [SPARK-21435][SQL] Empty files should be skipped while write to file · 81c99a5b
      xuanyuanking authored
      ## What changes were proposed in this pull request?
      
      Add `EmptyDirectoryWriteTask` for tasks that have no rows to write. Fix the empty-result case for the Parquet format by leaving the first partition responsible for writing the metadata.
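
      A minimal sketch of the idea (class and method names here are illustrative, not Spark's actual internals): a dedicated task is selected when the input iterator is empty, so no output file is created for that task.

      ```scala
      // Hypothetical sketch: a write task chosen when the input iterator is empty,
      // so no file is created for that task.
      trait WriteTaskSketch[T] {
        def execute(rows: Iterator[T]): Set[String] // returns the partition paths actually written
      }

      class EmptyDirectoryWriteTaskSketch[T] extends WriteTaskSketch[T] {
        override def execute(rows: Iterator[T]): Set[String] = {
          assert(rows.isEmpty, "selected only when the task has nothing to write")
          Set.empty // no file written, no partitions reported
        }
      }
      ```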
      
      ## How was this patch tested?
      
      Added a new test in `FileFormatWriterSuite`.
      
      Author: xuanyuanking <xyliyuanjian@gmail.com>
      
      Closes #18654 from xuanyuanking/SPARK-21435.
      81c99a5b
    • Tathagata Das's avatar
      [SPARK-21462][SS] Added batchId to StreamingQueryProgress.json · 84f1b25f
      Tathagata Das authored
      ## What changes were proposed in this pull request?
      
      - Added batchId to StreamingQueryProgress.json as that was missing from the generated json.
      - Also removed the recently added numPartitions from StatefulOperatorProgress, as this value does not change during the query run and there are other ways to find it.
      
      ## How was this patch tested?
      Updated unit tests
      
      Author: Tathagata Das <tathagata.das1565@gmail.com>
      
      Closes #18675 from tdas/SPARK-21462.
      84f1b25f
    • Wenchen Fan's avatar
      [SPARK-21457][SQL] ExternalCatalog.listPartitions should correctly handle partition values with dot · f18b905f
      Wenchen Fan authored
      ## What changes were proposed in this pull request?
      
      When we list partitions from the Hive metastore with a partial partition spec, we expect exact matching on the partition values. However, Hive treats a dot specially and matches any single character for it. We should apply an extra filter to drop the unexpected partitions.
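
      A hedged sketch of the extra filter (illustrative, not the exact Spark code): only partitions whose values literally equal the partial spec are kept, dropping matches that Hive returned because `.` acted like a single-character wildcard.

      ```scala
      // Keep only partitions whose values exactly match the requested partial spec.
      def filterExactMatches(
          partialSpec: Map[String, String],
          partitions: Seq[Map[String, String]]): Seq[Map[String, String]] = {
        partitions.filter { partValues =>
          partialSpec.forall { case (key, value) => partValues.get(key).contains(value) }
        }
      }

      // e.g. a spec of Map("p" -> "a.b") should no longer match a partition with p = "acb".
      ```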
      
      ## How was this patch tested?
      
      New regression test.
      
      Author: Wenchen Fan <wenchen@databricks.com>
      
      Closes #18671 from cloud-fan/hive.
      f18b905f
    • Marcelo Vanzin's avatar
      [SPARK-21408][CORE] Better default number of RPC dispatch threads. · 264b0f36
      Marcelo Vanzin authored
      Instead of using the host's cpu count, use the number of cores allocated
      for the Spark process when sizing the RPC dispatch thread pool. This avoids
      creating large thread pools on large machines when the number of allocated
      cores is small.
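
      A hedged sketch of the sizing rule (the floor of 2 is an illustrative assumption, not necessarily the actual default):

      ```scala
      // Prefer the cores allocated to this Spark process over the host's CPU count.
      def numDispatchThreads(allocatedCores: Option[Int]): Int = {
        val hostCores = Runtime.getRuntime.availableProcessors()
        math.max(2, allocatedCores.getOrElse(hostCores)) // small floor keeps RPC responsive
      }

      // e.g. on a 64-core host with 4 allocated cores, this yields 4 threads instead of 64.
      ```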
      
      Tested by verifying number of threads with spark.executor.cores set
      to 1 and 4; same thing for YARN AM.
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #18639 from vanzin/SPARK-21408.
      264b0f36
    • jerryshao's avatar
      [SPARK-21411][YARN] Lazily create FS within kerberized UGI to avoid token acquiring failure · cde64add
      jerryshao authored
      ## What changes were proposed in this pull request?
      
      In the current `YARNHadoopDelegationTokenManager`, the `FileSystem` instances from which tokens are obtained are created outside the KDC-logged-in UGI, and using these `FileSystem` instances to get new tokens leads to an exception. The root cause is that Spark tries to get new tokens from an FS created with a token-authenticated UGI, but Hadoop only grants new tokens under a kerberized UGI. To fix this issue, we should lazily create these `FileSystem` instances within the KDC-logged-in UGI.
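
      A hedged sketch of the lazy-creation pattern (simplified; the Hadoop API calls are real, the surrounding method is illustrative):

      ```scala
      import java.security.PrivilegedExceptionAction

      import org.apache.hadoop.conf.Configuration
      import org.apache.hadoop.fs.Path
      import org.apache.hadoop.security.{Credentials, UserGroupInformation}

      // Create the FileSystem objects inside the KDC-logged-in UGI, so token acquisition
      // runs with Kerberos credentials rather than with already-obtained tokens.
      def obtainTokensAs(kerberosUgi: UserGroupInformation,
                         paths: Seq[Path],
                         hadoopConf: Configuration,
                         creds: Credentials): Unit = {
        kerberosUgi.doAs(new PrivilegedExceptionAction[Unit] {
          override def run(): Unit = {
            paths.map(_.getFileSystem(hadoopConf)) // FileSystems created lazily, inside doAs
              .foreach(_.addDelegationTokens("yarn", creds))
          }
        })
      }
      ```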
      
      ## How was this patch tested?
      
      Manual verification in a secure cluster.
      
      CC vanzin mgummelt please help to review, thanks!
      
      Author: jerryshao <sshao@hortonworks.com>
      
      Closes #18633 from jerryshao/SPARK-21411.
      cde64add
    • Sean Owen's avatar
      [SPARK-15526][ML][FOLLOWUP] Make JPMML provided scope to avoid including... · d3f4a211
      Sean Owen authored
      [SPARK-15526][ML][FOLLOWUP] Make JPMML provided scope to avoid including unshaded JARs, and repromote to compile in MLlib
      
      Following the comment at https://issues.apache.org/jira/browse/SPARK-15526?focusedCommentId=16086106&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16086106 -- this change actually needed a little more work to be complete.
      
      This also marks JPMML as `provided` to make sure its JARs aren't included in the `jars` output, but then scopes it back to `compile` in `mllib`. This is how Guava is handled.
      
      Checked result in `assembly/target/scala-2.11/jars` to verify there are no JPMML jars. Maven and SBT builds still work.
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #18637 from srowen/SPARK-15526.2.
      d3f4a211
    • Sean Owen's avatar
      [SPARK-21415] Triage scapegoat warnings, part 1 · e26dac5f
      Sean Owen authored
      ## What changes were proposed in this pull request?
      
      Address scapegoat warnings for:
      - BigDecimal double constructor
      - Catching NPE
      - Finalizer without super
      - List.size is O(n)
      - Prefer Seq.empty
      - Prefer Set.empty
      - reverse.map instead of reverseMap
      - Type shadowing
      - Unnecessary if condition.
      - Use .log1p
      - Var could be val
      
      In some instances, like Seq.empty, I avoided making the change in test code even where it would be valid, to keep the scope of the change smaller. Those issues concern performance, which doesn't matter for tests.
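
      A couple of the categories above, illustrated with hedged before/after snippets (not actual Spark diffs):

      ```scala
      // BigDecimal double constructor: the double-based constructor can silently lose precision.
      val imprecise = new java.math.BigDecimal(0.1)   // 0.1000000000000000055511151231257827...
      val exact     = new java.math.BigDecimal("0.1") // exactly 0.1

      // Prefer Seq.empty / Set.empty over the varargs apply, which allocates.
      val before: Seq[Int] = Seq[Int]()
      val after: Seq[Int]  = Seq.empty[Int]

      // Use .log1p: more accurate than log(1 + x) for small x.
      val y = math.log1p(1e-18) // math.log(1 + 1e-18) would round to 0.0
      ```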
      
      ## How was this patch tested?
      
      Existing tests
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #18635 from srowen/Scapegoat1.
      e26dac5f
  2. Jul 17, 2017
    • Burak Yavuz's avatar
      [SPARK-21445] Make IntWrapper and LongWrapper in UTF8String Serializable · 26cd2ca0
      Burak Yavuz authored
      ## What changes were proposed in this pull request?
      
      Making those two classes Serializable avoids serialization issues like the one below:
      ```
      Caused by: java.io.NotSerializableException: org.apache.spark.unsafe.types.UTF8String$IntWrapper
      Serialization stack:
          - object not serializable (class: org.apache.spark.unsafe.types.UTF8String$IntWrapper, value: org.apache.spark.unsafe.types.UTF8String$IntWrapper@326450e)
          - field (class: org.apache.spark.sql.catalyst.expressions.Cast$$anonfun$castToInt$1, name: result$2, type: class org.apache.spark.unsafe.types.UTF8String$IntWrapper)
          - object (class org.apache.spark.sql.catalyst.expressions.Cast$$anonfun$castToInt$1, <function1>)
      ```
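
      A minimal illustration of the shape of the fix (hypothetical class; the actual change marks `UTF8String.IntWrapper` and `UTF8String.LongWrapper` as `Serializable`):

      ```scala
      // A mutable result holder can only be captured by closures shipped to executors
      // if it is Serializable; this mirrors the pattern, not the actual Java class.
      class IntWrapperSketch extends Serializable {
        var value: Int = 0
      }
      ```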
      
      ## How was this patch tested?
      
      - [x] Manual testing
      - [ ] Unit test
      
      Author: Burak Yavuz <brkyvz@gmail.com>
      
      Closes #18660 from brkyvz/serializableutf8.
      26cd2ca0
    • aokolnychyi's avatar
      [SPARK-21332][SQL] Incorrect result type inferred for some decimal expressions · 0be5fb41
      aokolnychyi authored
      ## What changes were proposed in this pull request?
      
      This PR changes the direction of expression transformation in the DecimalPrecision rule. Previously, the expressions were transformed down, which led to incorrect result types when decimal expressions had other decimal expressions as their operands. The root cause of this issue was in visiting outer nodes before their children. Consider the example below:
      
      ```
          val inputSchema = StructType(StructField("col", DecimalType(26, 6)) :: Nil)
          val sc = spark.sparkContext
          val rdd = sc.parallelize(1 to 2).map(_ => Row(BigDecimal(12)))
          val df = spark.createDataFrame(rdd, inputSchema)
      
          // Works correctly since no nested decimal expression is involved
          // Expected result type: (26, 6) * (26, 6) = (38, 12)
          df.select($"col" * $"col").explain(true)
          df.select($"col" * $"col").printSchema()
      
          // Gives a wrong result since there is a nested decimal expression that should be visited first
          // Expected result type: ((26, 6) * (26, 6)) * (26, 6) = (38, 12) * (26, 6) = (38, 18)
          df.select($"col" * $"col" * $"col").explain(true)
          df.select($"col" * $"col" * $"col").printSchema()
      ```
      
      The example above gives the following output:
      
      ```
      // Correct result without sub-expressions
      == Parsed Logical Plan ==
      'Project [('col * 'col) AS (col * col)#4]
      +- LogicalRDD [col#1]
      
      == Analyzed Logical Plan ==
      (col * col): decimal(38,12)
      Project [CheckOverflow((promote_precision(cast(col#1 as decimal(26,6))) * promote_precision(cast(col#1 as decimal(26,6)))), DecimalType(38,12)) AS (col * col)#4]
      +- LogicalRDD [col#1]
      
      == Optimized Logical Plan ==
      Project [CheckOverflow((col#1 * col#1), DecimalType(38,12)) AS (col * col)#4]
      +- LogicalRDD [col#1]
      
      == Physical Plan ==
      *Project [CheckOverflow((col#1 * col#1), DecimalType(38,12)) AS (col * col)#4]
      +- Scan ExistingRDD[col#1]
      
      // Schema
      root
       |-- (col * col): decimal(38,12) (nullable = true)
      
      // Incorrect result with sub-expressions
      == Parsed Logical Plan ==
      'Project [(('col * 'col) * 'col) AS ((col * col) * col)#11]
      +- LogicalRDD [col#1]
      
      == Analyzed Logical Plan ==
      ((col * col) * col): decimal(38,12)
      Project [CheckOverflow((promote_precision(cast(CheckOverflow((promote_precision(cast(col#1 as decimal(26,6))) * promote_precision(cast(col#1 as decimal(26,6)))), DecimalType(38,12)) as decimal(26,6))) * promote_precision(cast(col#1 as decimal(26,6)))), DecimalType(38,12)) AS ((col * col) * col)#11]
      +- LogicalRDD [col#1]
      
      == Optimized Logical Plan ==
      Project [CheckOverflow((cast(CheckOverflow((col#1 * col#1), DecimalType(38,12)) as decimal(26,6)) * col#1), DecimalType(38,12)) AS ((col * col) * col)#11]
      +- LogicalRDD [col#1]
      
      == Physical Plan ==
      *Project [CheckOverflow((cast(CheckOverflow((col#1 * col#1), DecimalType(38,12)) as decimal(26,6)) * col#1), DecimalType(38,12)) AS ((col * col) * col)#11]
      +- Scan ExistingRDD[col#1]
      
      // Schema
      root
       |-- ((col * col) * col): decimal(38,12) (nullable = true)
      ```
      
      ## How was this patch tested?
      
      This PR was tested with available unit tests. Moreover, there are tests to cover previously failing scenarios.
      
      Author: aokolnychyi <anton.okolnychyi@sap.com>
      
      Closes #18583 from aokolnychyi/spark-21332.
      0be5fb41
    • Josh Rosen's avatar
      [SPARK-21444] Be more defensive when removing broadcasts in MapOutputTracker · 5952ad2b
      Josh Rosen authored
      ## What changes were proposed in this pull request?
      
      In SPARK-21444, sitalkedia reported an issue where the `Broadcast.destroy()` call in `MapOutputTracker`'s `ShuffleStatus.invalidateSerializedMapOutputStatusCache()` was failing with an `IOException`, causing the DAGScheduler to crash and bring down the entire driver.
      
      This is a bug introduced by #17955. In the old code, we removed a broadcast variable by calling `BroadcastManager.unbroadcast` with `blocking=false`, but the new code simply calls `Broadcast.destroy()` which is capable of failing with an IOException in case certain blocking RPCs time out.
      
      The fix implemented here is to replace this with a call to `destroy(blocking = false)` and to wrap the entire operation in `Utils.tryLogNonFatalError`.
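
      A self-contained sketch of that defensive shape (Spark's real `Utils.tryLogNonFatalError` and `Broadcast.destroy(blocking)` are internal APIs, so stand-ins are used here):

      ```scala
      import scala.util.control.NonFatal

      // Stand-in for Utils.tryLogNonFatalError: run the block and log, rather than rethrow, failures.
      def tryLogNonFatalError(block: => Unit): Unit =
        try block catch { case NonFatal(e) => println(s"Non-fatal error ignored: $e") }

      // The cached broadcast is destroyed non-blockingly and any RPC failure is swallowed,
      // so the DAGScheduler event loop cannot be brought down by it.
      def invalidateSerializedCache(destroyBroadcast: () => Unit): Unit =
        tryLogNonFatalError {
          destroyBroadcast() // e.g. bc.destroy(blocking = false) inside Spark core
        }
      ```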
      
      ## How was this patch tested?
      
      I haven't written regression tests for this because it's really hard to inject mocks to simulate RPC failures here. Instead, this class of issue is probably best uncovered with more generalized error injection / network unreliability / fuzz testing tools.
      
      Author: Josh Rosen <joshrosen@databricks.com>
      
      Closes #18662 from JoshRosen/SPARK-21444.
      5952ad2b
    • Tathagata Das's avatar
      [SPARK-21409][SS] Follow up PR to allow different types of custom metrics to be exposed · e9faae13
      Tathagata Das authored
      ## What changes were proposed in this pull request?
      
      Implementations may expose both timing and size metrics. This PR enables that.
      
      Author: Tathagata Das <tathagata.das1565@gmail.com>
      
      Closes #18661 from tdas/SPARK-21409-2.
      e9faae13
    • Zhang A Peng's avatar
      [SPARK-21410][CORE] Create less partitions for RangePartitioner if RDD.count()... · 7aac755b
      Zhang A Peng authored
      [SPARK-21410][CORE] Create less partitions for RangePartitioner if RDD.count() is less than `partitions`
      
      ## What changes were proposed in this pull request?
      
      Fix a bug in RangePartitioner:
      In `RangePartitioner(partitions: Int, rdd: RDD[_])`, `RangePartitioner.numPartitions` is wrong if the number of elements in the RDD (`rdd.count()`) is less than the requested number of partitions (the `partitions` constructor argument).
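
      A hedged usage example (assumes a live `SparkContext` named `sc`; the exact resulting count depends on sampling):

      ```scala
      import org.apache.spark.RangePartitioner

      val pairs = sc.parallelize(Seq((1, "a"), (2, "b"), (3, "c")))
      val partitioner = new RangePartitioner(10, pairs)
      // Before the fix, numPartitions could be the requested 10 even though only 3 records exist;
      // after the fix it is bounded by the data (a small value, not 10).
      println(partitioner.numPartitions)
      ```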
      
      ## How was this patch tested?
      
      Tested as described in [SPARK-21410](https://issues.apache.org/jira/browse/SPARK-21410).
      
      
      Author: Zhang A Peng <zhangap@cn.ibm.com>
      
      Closes #18631 from apapi/fixRangePartitioner.numPartitions.
      7aac755b
    • gatorsmile's avatar
      [MINOR] Improve SQLConf messages · a8c6d0f6
      gatorsmile authored
      ### What changes were proposed in this pull request?
      The current SQLConf messages of `spark.sql.hive.convertMetastoreParquet` and `spark.sql.hive.convertMetastoreOrc` are not very clear to end users. This PR is to improve them.
      
      ### How was this patch tested?
      N/A
      
      Author: gatorsmile <gatorsmile@gmail.com>
      
      Closes #18657 from gatorsmile/msgUpdates.
      a8c6d0f6
    • Tathagata Das's avatar
      [SPARK-21409][SS] Expose state store memory usage in SQL metrics and progress updates · 9d8c8317
      Tathagata Das authored
      ## What changes were proposed in this pull request?
      
      Currently, there is no tracking of memory usage of state stores. This JIRA is to expose that through SQL metrics and StreamingQueryProgress.
      
      Additionally, added the ability to expose implementation-specific metrics through the StateStore APIs to the SQLMetrics.
      
      ## How was this patch tested?
      Added unit tests.
      
      Author: Tathagata Das <tathagata.das1565@gmail.com>
      
      Closes #18629 from tdas/SPARK-21409.
      9d8c8317
    • jerryshao's avatar
      [SPARK-21377][YARN] Make jars specify with --jars/--packages load-able in AM's credential renwer · 53465075
      jerryshao authored
      ## What changes were proposed in this pull request?
      
      In this issue we have a long-running Spark application with secure HBase, which requires `HBaseCredentialProvider` to get tokens periodically. We specify the HBase-related jars with `--packages`, but these dependencies are not added to the AM classpath, so when `HBaseCredentialProvider` tries to initialize HBase connections to get tokens, it fails.

      Currently, because jars specified with `--jars` or `--packages` are not added to the AM classpath, the only way to extend it is `spark.driver.extraClassPath`, which is supposed to be used in yarn cluster mode.

      In this fix, we propose to use/reuse a classloader for `AMCredentialRenewer` to acquire new tokens, as sketched below.

      Also in this patch, we fixed the issue where the AM cannot get tokens from HDFS: the FileSystem was obtained before the Kerberos login, so using that FS to get tokens throws an exception.
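
      A hedged sketch of the classloader idea mentioned above (illustrative, not the exact implementation):

      ```scala
      import java.net.{URL, URLClassLoader}

      // Make a classloader over the user-supplied jars (--jars/--packages) current while the
      // credential providers are instantiated, so provider classes such as HBase's can be resolved.
      def withUserJarsClassLoader[T](userJars: Seq[URL])(body: => T): T = {
        val original = Thread.currentThread().getContextClassLoader
        val loader = new URLClassLoader(userJars.toArray, original)
        Thread.currentThread().setContextClassLoader(loader)
        try body finally Thread.currentThread().setContextClassLoader(original)
      }
      ```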
      
      ## How was this patch tested?
      
      Manual verification.
      
      Author: jerryshao <sshao@hortonworks.com>
      
      Closes #18616 from jerryshao/SPARK-21377.
      53465075
    • John Lee's avatar
      [SPARK-21321][SPARK CORE] Spark very verbose on shutdown · 0e07a29c
      John Lee authored
      ## What changes were proposed in this pull request?
      
      The current code is very verbose on shutdown.

      The change I propose is to lower the log level of the messages logged while the driver is shutting down and the RPC connections are being closed (`RpcEnvStoppedException`), as sketched below.
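
      A hedged sketch of the shape of the change (stand-in exception class; the real one is `org.apache.spark.rpc.RpcEnvStoppedException`):

      ```scala
      import org.slf4j.LoggerFactory

      // Stand-in for Spark's RpcEnvStoppedException, thrown when a message arrives after shutdown.
      class RpcEnvStoppedException extends IllegalStateException("RpcEnv already stopped")

      object DispatchSketch {
        private val log = LoggerFactory.getLogger(getClass)

        def postMessage(send: () => Unit): Unit =
          try send() catch {
            // Expected during shutdown, so log quietly instead of flooding the logs with errors.
            case e: RpcEnvStoppedException => log.debug("Message dropped because RpcEnv is stopped", e)
            case scala.util.control.NonFatal(e) => log.error("Failed to post message", e)
          }
      }
      ```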
      
      ## How was this patch tested?
      
      Tested with word count (deploy-mode = cluster, master = yarn, num-executors = 4) on 300 GB of data.
      
      Author: John Lee <jlee2@yahoo-inc.com>
      
      Closes #18547 from yoonlee95/SPARK-21321.
      0e07a29c
    • Ajay Saini's avatar
      [SPARK-21221][ML] CrossValidator and TrainValidationSplit Persist Nested... · 7047f49f
      Ajay Saini authored
      [SPARK-21221][ML] CrossValidator and TrainValidationSplit Persist Nested Estimators such as OneVsRest
      
      ## What changes were proposed in this pull request?
      Added functionality for CrossValidator and TrainValidationSplit to persist nested estimators such as OneVsRest. Also added CrossValidator and TrainValidationSplit persistence to PySpark.
      
      ## How was this patch tested?
      Performed both cross validation and train validation split with a one vs. rest estimator and tested read/write functionality of the estimator parameter maps required by these meta-algorithms.
      
      Author: Ajay Saini <ajays725@gmail.com>
      
      Closes #18428 from ajaysaini725/MetaAlgorithmPersistNestedEstimators.
      7047f49f
    • hyukjinkwon's avatar
      [SPARK-21394][SPARK-21432][PYTHON] Reviving callable object/partial function... · 4ce735ee
      hyukjinkwon authored
      [SPARK-21394][SPARK-21432][PYTHON] Reviving callable object/partial function support in UDF in PySpark
      
      ## What changes were proposed in this pull request?
      
      This PR proposes to drop `__name__` from the tuple of attributes copied directly from the wrapped function to the wrapper function, and to use `self._name` (`func.__name__` or `obj.__class__.__name__`) instead.
      
      After SPARK-19161, we happened to break callable objects as UDFs in Python as below:
      
      ```python
      from pyspark.sql import functions
      
      class F(object):
          def __call__(self, x):
              return x
      
      foo = F()
      udf = functions.udf(foo)
      ```
      
      ```
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
        File ".../spark/python/pyspark/sql/functions.py", line 2142, in udf
          return _udf(f=f, returnType=returnType)
        File ".../spark/python/pyspark/sql/functions.py", line 2133, in _udf
          return udf_obj._wrapped()
        File ".../spark/python/pyspark/sql/functions.py", line 2090, in _wrapped
          functools.wraps(self.func)
        File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/functools.py", line 33, in update_wrapper
          setattr(wrapper, attr, getattr(wrapped, attr))
      AttributeError: F instance has no attribute '__name__'
      ```
      
      This worked in Spark 2.1:
      
      ```python
      from pyspark.sql import functions
      
      class F(object):
          def __call__(self, x):
              return x
      
      foo = F()
      udf = functions.udf(foo)
      spark.range(1).select(udf("id")).show()
      ```
      
      ```
      +-----+
      |F(id)|
      +-----+
      |    0|
      +-----+
      ```
      
      **After**
      
      ```python
      from pyspark.sql import functions
      
      class F(object):
          def __call__(self, x):
              return x
      
      foo = F()
      udf = functions.udf(foo)
      spark.range(1).select(udf("id")).show()
      ```
      
      ```
      +-----+
      |F(id)|
      +-----+
      |    0|
      +-----+
      ```
      
      _In addition, we also happened to break partial functions as below_:
      
      ```python
      from pyspark.sql import functions
      from functools import partial
      
      partial_func = partial(lambda x: x, x=1)
      udf = functions.udf(partial_func)
      ```
      
      ```
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
        File ".../spark/python/pyspark/sql/functions.py", line 2154, in udf
          return _udf(f=f, returnType=returnType)
        File ".../spark/python/pyspark/sql/functions.py", line 2145, in _udf
          return udf_obj._wrapped()
        File ".../spark/python/pyspark/sql/functions.py", line 2099, in _wrapped
          functools.wraps(self.func, assigned=assignments)
        File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/functools.py", line 33, in update_wrapper
          setattr(wrapper, attr, getattr(wrapped, attr))
      AttributeError: 'functools.partial' object has no attribute '__module__'
      ```
      
      This worked in Spark 2.1:
      
      ```python
      from pyspark.sql import functions
      from functools import partial
      
      partial_func = partial(lambda x: x, x=1)
      udf = functions.udf(partial_func)
      spark.range(1).select(udf()).show()
      ```
      
      ```
      +---------+
      |partial()|
      +---------+
      |        1|
      +---------+
      ```
      
      **After**
      
      ```python
      from pyspark.sql import functions
      from functools import partial
      
      partial_func = partial(lambda x: x, x=1)
      udf = functions.udf(partial_func)
      spark.range(1).select(udf()).show()
      ```
      
      ```
      +---------+
      |partial()|
      +---------+
      |        1|
      +---------+
      ```
      
      ## How was this patch tested?
      
      Unit tests in `python/pyspark/sql/tests.py` and manual tests.
      
      Author: hyukjinkwon <gurwls223@gmail.com>
      
      Closes #18615 from HyukjinKwon/callable-object.
      4ce735ee
    • gatorsmile's avatar
      [SPARK-21354][SQL] INPUT FILE related functions do not support more than one sources · e398c281
      gatorsmile authored
      ### What changes were proposed in this pull request?
      The built-in functions `input_file_name`, `input_file_block_start`, and `input_file_block_length` do not support more than one source, just as in Hive. Currently, Spark does not block such queries, and the output is ambiguous/non-deterministic: it could come from either side.
      
      ```
      hive> select *, INPUT__FILE__NAME FROM t1, t2;
      FAILED: SemanticException Column INPUT__FILE__NAME Found in more than One Tables/Subqueries
      ```
      
      This PR blocks it and issues an error.
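
      On the Spark side, a hedged example of a query that this change now rejects (the error behaviour is paraphrased):

      ```scala
      import org.apache.spark.sql.functions.input_file_name

      // Two file-based relations in one query: which file name should be reported?
      val joined = spark.table("t1").crossJoin(spark.table("t2"))
      joined.select(input_file_name()).show()
      // Before: silently returned a file name from an arbitrary side.
      // After: analysis fails with an error because the function is ambiguous here.
      ```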
      
      ### How was this patch tested?
      Added a test case
      
      Author: gatorsmile <gatorsmile@gmail.com>
      
      Closes #18580 from gatorsmile/inputFileName.
      e398c281
  3. Jul 16, 2017
  4. Jul 15, 2017
  5. Jul 14, 2017
    • Kazuaki Ishizaki's avatar
      [SPARK-21344][SQL] BinaryType comparison does signed byte array comparison · ac5d5d79
      Kazuaki Ishizaki authored
      ## What changes were proposed in this pull request?
      
      This PR fixes an incorrect comparison for `BinaryType`. It enables unsigned comparison and unsigned prefix generation for byte arrays of `BinaryType`; the previous implementation used signed operations.
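
      The core of unsigned byte-array comparison, as a hedged sketch (the actual change also covers sort-prefix generation):

      ```scala
      // Compare two byte arrays lexicographically, treating each byte as unsigned (0..255).
      def compareUnsigned(a: Array[Byte], b: Array[Byte]): Int = {
        val len = math.min(a.length, b.length)
        var i = 0
        while (i < len) {
          val cmp = (a(i) & 0xff) - (b(i) & 0xff) // & 0xff maps the signed byte onto 0..255
          if (cmp != 0) return cmp
          i += 1
        }
        a.length - b.length
      }

      // e.g. Array(0xFF.toByte) now sorts after Array(0x01.toByte), as unsigned semantics require.
      ```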
      
      ## How was this patch tested?
      
      Added a test suite in `OrderingSuite`.
      
      Author: Kazuaki Ishizaki <ishizaki@jp.ibm.com>
      
      Closes #18571 from kiszk/SPARK-21344.
      ac5d5d79
    • Shixiong Zhu's avatar
      [SPARK-21421][SS] Add the query id as a local property to allow source and sink using it · 2d968a07
      Shixiong Zhu authored
      ## What changes were proposed in this pull request?
      
      Add the query id as a local property so that sources and sinks can use it.
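
      A hedged example of how a source or sink could pick it up on the driver (the property key shown is an assumption about the internal name):

      ```scala
      // Read the streaming query id from the SparkContext local properties, if present.
      val queryId = Option(spark.sparkContext.getLocalProperty("sql.streaming.queryId"))
      queryId.foreach(id => println(s"Running inside streaming query $id"))
      ```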
      
      ## How was this patch tested?
      
      A new unit test.
      
      Author: Shixiong Zhu <shixiong@databricks.com>
      
      Closes #18638 from zsxwing/SPARK-21421.
      2d968a07
    • Marcelo Vanzin's avatar
      [SPARK-9825][YARN] Do not overwrite final Hadoop config entries. · 601a237b
      Marcelo Vanzin authored
      When localizing the gateway config files in a YARN application, avoid
      overwriting final configs by distributing the gateway files to a separate
      directory, and explicitly loading them into the Hadoop config, instead
      of placing those files before the cluster's files in the classpath.
      
      This is done by saving the gateway's config to a separate XML file
      distributed with the rest of the Spark app's config, and loading that
      file when creating a new config through `YarnSparkHadoopUtil`.
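
      A hedged sketch of loading the gateway file into a fresh config (the file name here is an assumption; Hadoop's `Configuration` keeps `final` values from resources loaded earlier, so the cluster's final settings win):

      ```scala
      import java.io.File

      import org.apache.hadoop.conf.Configuration
      import org.apache.hadoop.fs.Path

      def newHadoopConf(gatewayConfDir: File): Configuration = {
        val conf = new Configuration() // already contains the cluster's *-site.xml files
        val gatewayFile = new File(gatewayConfDir, "__spark_conf__.xml") // file name is an assumption
        if (gatewayFile.isFile) {
          // Hadoop keeps <final>true</final> values from earlier resources, so the cluster's
          // final settings are not overwritten by the gateway's values.
          conf.addResource(new Path(gatewayFile.toURI))
        }
        conf
      }
      ```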
      
      Tested with existing unit tests, and by verifying the behavior in a YARN
      cluster (final values are not overridden, non-final values are).
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #18370 from vanzin/SPARK-9825.
      601a237b
  6. Jul 13, 2017
    • jerryshao's avatar
      [SPARK-21376][YARN] Fix yarn client token expire issue when cleaning the... · cb8d5cc9
      jerryshao authored
      [SPARK-21376][YARN] Fix yarn client token expire issue when cleaning the staging files in long running scenario
      
      ## What changes were proposed in this pull request?
      
      This issue happens in long-running applications in yarn cluster mode. Because yarn#client doesn't sync tokens with the AM, it always keeps the initial token, which may expire in the long-running scenario; when yarn#client then tries to clean up the staging directory after the application finishes, it uses this expired token and hits a token-expiry error.
      
      ## How was this patch tested?
      
      Manual verification in a secure cluster.
      
      Author: jerryshao <sshao@hortonworks.com>
      
      Closes #18617 from jerryshao/SPARK-21376.
      cb8d5cc9
    • Sean Owen's avatar
      [SPARK-15526][MLLIB] Shade JPMML · 5c8edfc4
      Sean Owen authored
      ## What changes were proposed in this pull request?
      
      Shade JPMML classes (`org.jpmml.**`) and related PMML model classes (`org.dmg.pmml.**`). This insulates downstream users from the version of JPMML in Spark, allows us to upgrade more freely, and allows downstream users to use a different version. JPMML minor releases are not generally forwards/backwards compatible.
      
      ## How was this patch tested?
      
      Existing tests
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #18584 from srowen/SPARK-15526.
      5c8edfc4
    • Stavros Kontopoulos's avatar
      [SPARK-21403][MESOS] fix --packages for mesos · d8257b99
      Stavros Kontopoulos authored
      ## What changes were proposed in this pull request?
      Fixes the --packages flag for Mesos in cluster mode. I will probably handle standalone and YARN in another commit; I need to investigate those cases as they are different.
      
      ## How was this patch tested?
      Tested with a community 1.9 DC/OS cluster. Packages were successfully resolved in cluster mode within a container.
      
      andrewor14  susanxhuynh ArtRand srowen  pls review.
      
      Author: Stavros Kontopoulos <st.kontopoulos@gmail.com>
      
      Closes #18587 from skonto/fix_packages_mesos_cluster.
      d8257b99
    • Kazuaki Ishizaki's avatar
      [SPARK-21373][CORE] Update Jetty to 9.3.20.v20170531 · af80e01b
      Kazuaki Ishizaki authored
      ## What changes were proposed in this pull request?
      
      This PR upgrades Jetty to the latest version, 9.3.20.v20170531, which includes the fix for CVE-2017-9735.
      
      Here are links to descriptions for CVE-2017-9735.
      * https://nvd.nist.gov/vuln/detail/CVE-2017-9735
      * https://github.com/eclipse/jetty.project/issues/1556
      
      Here is [a release note](https://github.com/eclipse/jetty.project/blob/jetty-9.3.x/VERSION.txt) for the latest Jetty.
      
      ## How was this patch tested?
      
      Tested by existing test suites.
      
      Author: Kazuaki Ishizaki <ishizaki@jp.ibm.com>
      
      Closes #18601 from kiszk/SPARK-21373.
      af80e01b
    • Sean Owen's avatar
      [SPARK-19810][BUILD][CORE] Remove support for Scala 2.10 · 425c4ada
      Sean Owen authored
      ## What changes were proposed in this pull request?
      
      - Remove Scala 2.10 build profiles and support
      - Replace some 2.10 support in scripts with commented placeholders for 2.12 later
      - Remove deprecated API calls from 2.10 support
      - Remove usages of deprecated context bounds where possible
      - Remove Scala 2.10 workarounds like ScalaReflectionLock
      - Other minor Scala warning fixes
      
      ## How was this patch tested?
      
      Existing tests
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #17150 from srowen/SPARK-19810.
      425c4ada
  7. Jul 12, 2017