  1. Dec 03, 2014
    • Mark Hamstra's avatar
      [SPARK-4498][core] Don't transition ExecutorInfo to RUNNING until Driver adds Executor · 96b27855
      Mark Hamstra authored
      The ExecutorInfo only reaches the RUNNING state if the Driver is alive to send the ExecutorStateChanged message to the master. Otherwise, appInfo.resetRetryCount() is never called, and failing Executors will eventually exceed ApplicationState.MAX_NUM_RETRY, resulting in the application being removed from the master's accounting.
      
      JoshRosen
      
      Author: Mark Hamstra <markhamstra@gmail.com>
      
      Closes #3550 from markhamstra/SPARK-4498 and squashes the following commits:
      
      8f543b1 [Mark Hamstra] Don't transition ExecutorInfo to RUNNING until Executor is added by Driver
      96b27855
    • Michael Armbrust's avatar
      [SPARK-4552][SQL] Avoid exception when reading empty parquet data through Hive · 513ef82e
      Michael Armbrust authored
      This is a very small fix that catches one specific exception and returns an empty table.  #3441 will address this in a more principled way.
      
      Author: Michael Armbrust <michael@databricks.com>
      
      Closes #3586 from marmbrus/fixEmptyParquet and squashes the following commits:
      
      2781d9f [Michael Armbrust] Handle empty lists for newParquet
      04dd376 [Michael Armbrust] Avoid exception when reading empty parquet data through Hive
      513ef82e
    • Andrew Or's avatar
      [HOT FIX] [YARN] Check whether `/lib` exists before listing its files · 90ec643e
      Andrew Or authored
      This is caused by a975dc32
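
The guard named in the title can be sketched as follows. This is a minimal illustration using plain `java.io.File`, not the actual YARN client code; the object and helper names are hypothetical.

```scala
import java.io.File

object LibDirListing {
  // Hypothetical helper: returns the files in `dir`, or an empty sequence
  // when the directory is missing. File.listFiles returns null for a path
  // that is not a directory, so the check avoids a NullPointerException.
  def listLibFiles(dir: File): Seq[File] =
    if (dir.isDirectory) Option(dir.listFiles()).map(_.toSeq).getOrElse(Seq.empty)
    else Seq.empty
}
```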
      
      Author: Andrew Or <andrew@databricks.com>
      
      Closes #3589 from andrewor14/yarn-hot-fix and squashes the following commits:
      
      a4fad5f [Andrew Or] Check whether lib directory exists before listing its files
      90ec643e
    • Masayoshi TSUZUKI's avatar
      [SPARK-4642] Add description about spark.yarn.queue to running-on-YARN document. · 692f4937
      Masayoshi TSUZUKI authored
      Added descriptions about these parameters.
      - spark.yarn.queue
      
      Modified the description of the default value of this parameter.
      - spark.yarn.submit.file.replication
      
      Author: Masayoshi TSUZUKI <tsudukim@oss.nttdata.co.jp>
      
      Closes #3500 from tsudukim/feature/SPARK-4642 and squashes the following commits:
      
      ce99655 [Masayoshi TSUZUKI] better gramatically.
      21cf624 [Masayoshi TSUZUKI] Removed intentionally undocumented properties.
      88cac9b [Masayoshi TSUZUKI] [SPARK-4642] Documents about running-on-YARN needs update
      692f4937
    • zsxwing's avatar
      [SPARK-4715][Core] Make sure tryToAcquire won't return a negative value · edd3cd47
      zsxwing authored
      ShuffleMemoryManager.tryToAcquire may return a negative value. The unit test demonstrates this bug. It will output `0 did not equal -200 granted is negative`.
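
One way to keep such a grant non-negative is to clamp it at zero. The sketch below is a simplified, hypothetical model of a memory grant (the names `grant`, `cap`, and `held` are assumptions), not the real `ShuffleMemoryManager` logic.

```scala
object GrantSketch {
  // Hypothetical simplification: how many bytes to grant a thread, given the
  // bytes requested, the thread's cap, and what it already holds. Without the
  // outer max, `cap - held` is negative whenever the thread already holds
  // more than its cap, and that negative number would be returned.
  def grant(requested: Long, cap: Long, held: Long): Long =
    math.max(0L, math.min(requested, cap - held))
}
```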
      
      Author: zsxwing <zsxwing@gmail.com>
      
      Closes #3575 from zsxwing/SPARK-4715 and squashes the following commits:
      
      a193ae6 [zsxwing] Make sure tryToAcquire won't return a negative value
      edd3cd47
    • Masayoshi TSUZUKI's avatar
      [SPARK-4701] Typo in sbt/sbt · 96786e3e
      Masayoshi TSUZUKI authored
      Modified typo.
      
      Author: Masayoshi TSUZUKI <tsudukim@oss.nttdata.co.jp>
      
      Closes #3560 from tsudukim/feature/SPARK-4701 and squashes the following commits:
      
      ed2a3f1 [Masayoshi TSUZUKI] Another whitespace position error.
      1af3a35 [Masayoshi TSUZUKI] [SPARK-4701] Typo in sbt/sbt
      96786e3e
    • Jim Lim's avatar
      SPARK-2624 add datanucleus jars to the container in yarn-cluster · a975dc32
      Jim Lim authored
      If `spark-submit` finds the datanucleus jars, it adds them to the driver's classpath, but does not add them to the container.
      
      This patch modifies the yarn deployment class to copy all `datanucleus-*` jars found in `[spark-home]/libs` to the container.
      
      Author: Jim Lim <jim@quixey.com>
      
      Closes #3238 from jimjh/SPARK-2624 and squashes the following commits:
      
      3633071 [Jim Lim] SPARK-2624 update documentation and comments
      fe95125 [Jim Lim] SPARK-2624 keep java imports together
      6c31fe0 [Jim Lim] SPARK-2624 update documentation
      6690fbf [Jim Lim] SPARK-2624 add tests
      d28d8e9 [Jim Lim] SPARK-2624 add spark.yarn.datanucleus.dir option
      84e6cba [Jim Lim] SPARK-2624 add datanucleus jars to the container in yarn-cluster
      a975dc32
    • DB Tsai's avatar
      [SPARK-4717][MLlib] Optimize BLAS library to avoid de-reference multiple times in loop · d0054298
      DB Tsai authored
      Keep local references to the `values` and `indices` arrays of the `Vector` object
      so the JVM can locate each value with a single operation. See `SPARK-4581`
      for a similar optimization and the bytecode analysis.
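
The pattern can be sketched as below. `SparseVec` is a hypothetical stand-in for a sparse vector, not MLlib's class, and the dot product is only an example of a loop that benefits from hoisting the array references.

```scala
object LocalRefDot {
  // Hypothetical stand-in for a sparse vector; not MLlib's SparseVector.
  final class SparseVec(val indices: Array[Int], val values: Array[Double])

  // Dot product of a sparse vector with a dense array. The arrays are read
  // into local vals once, so the loop body does not call the accessor
  // methods on every iteration.
  def dot(x: SparseVec, y: Array[Double]): Double = {
    val xIndices = x.indices
    val xValues = x.values
    val nnz = xIndices.length
    var sum = 0.0
    var k = 0
    while (k < nnz) {
      sum += xValues(k) * y(xIndices(k))
      k += 1
    }
    sum
  }
}
```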
      
      Author: DB Tsai <dbtsai@alpinenow.com>
      
      Closes #3577 from dbtsai/blasopt and squashes the following commits:
      
      62d38c4 [DB Tsai] formating
      0316cef [DB Tsai] first commit
      d0054298
    • DB Tsai's avatar
      [SPARK-4708][MLLib] Make k-mean runs two/three times faster with dense/sparse sample · 7fc49ed9
      DB Tsai authored
      Note that the usage of `breezeSquaredDistance` in
      `org.apache.spark.mllib.util.MLUtils.fastSquaredDistance`
      is in the critical path, and `breezeSquaredDistance` is slow.
      We should replace it with our own implementation.
      
      Here is the benchmark against mnist8m dataset.
      
      Before
      DenseVector: 70.04secs
      SparseVector: 59.05secs
      
      With this PR
      DenseVector: 30.58secs
      SparseVector: 21.14secs
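
For illustration, a hand-written squared distance over dense arrays might look like the sketch below. It is an assumption about the general approach (a plain `while` loop with no intermediate allocations), not the code from this PR.

```scala
object SqDistSketch {
  // Squared Euclidean distance between two dense vectors of equal length,
  // computed in a single pass without allocating intermediate vectors.
  def sqdist(a: Array[Double], b: Array[Double]): Double = {
    require(a.length == b.length, "vectors must have the same size")
    var sum = 0.0
    var i = 0
    while (i < a.length) {
      val d = a(i) - b(i)
      sum += d * d
      i += 1
    }
    sum
  }
}
```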
      
      Author: DB Tsai <dbtsai@alpinenow.com>
      
      Closes #3565 from dbtsai/kmean and squashes the following commits:
      
      08bc068 [DB Tsai] restyle
      de24662 [DB Tsai] address feedback
      b185a77 [DB Tsai] cleanup
      4554ddd [DB Tsai] first commit
      7fc49ed9
    • Joseph K. Bradley's avatar
      [SPARK-4710] [mllib] Eliminate MLlib compilation warnings · 4ac21511
      Joseph K. Bradley authored
      Renamed StreamingKMeans to StreamingKMeansExample to avoid warning about name conflict with StreamingKMeans class.
      
      Added import to DecisionTreeRunner to eliminate warning.
      
      CC: mengxr
      
      Author: Joseph K. Bradley <joseph@databricks.com>
      
      Closes #3568 from jkbradley/ml-compilation-warnings and squashes the following commits:
      
      64d6bc4 [Joseph K. Bradley] Updated DecisionTreeRunner.scala and StreamingKMeans.scala to eliminate compilation warnings, including renaming StreamingKMeans to StreamingKMeansExample.
      4ac21511
    • zsxwing's avatar
      [SPARK-4397][Core] Change the 'since' value of '@deprecated' to '1.3.0' · 8af551f7
      zsxwing authored
      As #3262 wasn't merged to branch 1.2, the `since` value of `deprecated` should be '1.3.0'.
      
      Author: zsxwing <zsxwing@gmail.com>
      
      Closes #3573 from zsxwing/SPARK-4397-version and squashes the following commits:
      
      1daa03c [zsxwing] Change the 'since' value to '1.3.0'
      8af551f7
    • JerryLead's avatar
      [SPARK-4672][Core]Checkpoint() should clear f to shorten the serialization chain · 77be8b98
      JerryLead authored
      The related JIRA is https://issues.apache.org/jira/browse/SPARK-4672
      
      The f closure of `PartitionsRDD(ZippedPartitionsRDD2)` contains a `$outer` that references EdgeRDD/VertexRDD, which makes the task's serialization chain very long in iterative GraphX applications. As a result, a StackOverflow error occurs. If we set "f = null" in `clearDependencies()`, checkpoint() can cut off the long serialization chain. More details and explanation can be found in the JIRA.
      
      Author: JerryLead <JerryLead@163.com>
      Author: Lijie Xu <csxulijie@gmail.com>
      
      Closes #3545 from JerryLead/my_core and squashes the following commits:
      
      f7faea5 [JerryLead] checkpoint() should clear the f to avoid StackOverflow error
      c0169da [JerryLead] Merge branch 'master' of https://github.com/apache/spark
      52799e3 [Lijie Xu] Merge pull request #1 from apache/master
      77be8b98
  2. Dec 02, 2014
    • JerryLead's avatar
      [SPARK-4672][GraphX]Non-transient PartitionsRDDs will lead to StackOverflow error · 17c162f6
      JerryLead authored
      The related JIRA is https://issues.apache.org/jira/browse/SPARK-4672
      
      In a nutshell, if `val partitionsRDD` in EdgeRDDImpl and VertexRDDImpl is non-transient, the serialization chain can become very long in iterative algorithms and eventually lead to a StackOverflow error. More details and explanation can be found in the JIRA.
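
The effect of `@transient` on a chain of references can be sketched with plain Java serialization. `Node` and `parent` below are hypothetical stand-ins for a wrapper RDD and its `partitionsRDD`; nothing here is GraphX code.

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}

object TransientSketch {
  // Hypothetical class: each node points at its predecessor. Because the
  // field is @transient, Java serialization skips it and does not walk
  // the chain.
  class Node(@transient val parent: Node, val id: Int) extends Serializable

  // Number of bytes produced when serializing a single node.
  def serializedSize(n: Node): Int = {
    val buf = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buf)
    try out.writeObject(n) finally out.close()
    buf.size()
  }

  // Builds a chain of `length` nodes and returns the last one.
  def chain(length: Int): Node =
    (1 to length).foldLeft[Node](null)((parent, _) => new Node(parent, 0))
}
```

With the annotation in place, the serialized size of the last node does not depend on how long the chain behind it is.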
      
      Author: JerryLead <JerryLead@163.com>
      Author: Lijie Xu <csxulijie@gmail.com>
      
      Closes #3544 from JerryLead/my_graphX and squashes the following commits:
      
      628f33c [JerryLead] set PartitionsRDD to be transient in EdgeRDDImpl and VertexRDDImpl
      c0169da [JerryLead] Merge branch 'master' of https://github.com/apache/spark
      52799e3 [Lijie Xu] Merge pull request #1 from apache/master
      17c162f6
    • JerryLead's avatar
      [SPARK-4672][GraphX]Perform checkpoint() on PartitionsRDD to shorten the lineage · fc0a1475
      JerryLead authored
      The related JIRA is https://issues.apache.org/jira/browse/SPARK-4672
      
      Iterative GraphX applications always have a long lineage, and checkpoint() on EdgeRDD and VertexRDD themselves cannot shorten it. In contrast, if we perform checkpoint() on their PartitionsRDD, the long lineage can be cut off. Moreover, existing operations in this code such as cache() are performed on the PartitionsRDD, so checkpoint() should work the same way. More details and explanation can be found in the JIRA.
      
      Author: JerryLead <JerryLead@163.com>
      Author: Lijie Xu <csxulijie@gmail.com>
      
      Closes #3549 from JerryLead/my_graphX_checkpoint and squashes the following commits:
      
      d1aa8d8 [JerryLead] Perform checkpoint() on PartitionsRDD not VertexRDD and EdgeRDD themselves
      ff08ed4 [JerryLead] Merge branch 'master' of https://github.com/apache/spark
      c0169da [JerryLead] Merge branch 'master' of https://github.com/apache/spark
      52799e3 [Lijie Xu] Merge pull request #1 from apache/master
      fc0a1475
    • Andrew Or's avatar
      5da21f07
    • Reynold Xin's avatar
      Minor nit style cleanup in GraphX. · 2d4f6e70
      Reynold Xin authored
      2d4f6e70
    • wangfei's avatar
      [SPARK-4695][SQL] Get result using executeCollect · 3ae0cda8
      wangfei authored
      Use `executeCollect` to collect the result, because `executeCollect` is a custom implementation of collect in Spark SQL that performs better than the RDD's collect.
      
      Author: wangfei <wangfei1@huawei.com>
      
      Closes #3547 from scwf/executeCollect and squashes the following commits:
      
      a5ab68e [wangfei] Revert "adding debug info"
      a60d680 [wangfei] fix test failure
      0db7ce8 [wangfei] adding debug info
      184c594 [wangfei] using executeCollect instead collect
      3ae0cda8
    • Daoyuan Wang's avatar
      [SPARK-4670] [SQL] wrong symbol for bitwise not · 1f5ddf17
      Daoyuan Wang authored
      We should use `~` instead of `-` for bitwise NOT.
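
The difference between the two symbols is easy to see in Scala itself. The snippet below is only an illustration of the operators, with hypothetical names, and is unrelated to the Spark SQL code generator.

```scala
object BitwiseNotSketch {
  val x: Int = 5
  val negated: Int = -x        // arithmetic negation
  val complemented: Int = ~x   // bitwise NOT flips every bit

  // In Scala, ~ on a Byte widens to Int, so the result must be narrowed
  // explicitly if the original type is to be kept.
  val b: Byte = 5
  val complementedByte: Byte = (~b).toByte
}
```

For a two's-complement integer, `~x` equals `-x - 1`, so the two operators never agree.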
      
      Author: Daoyuan Wang <daoyuan.wang@intel.com>
      
      Closes #3528 from adrian-wang/symbol and squashes the following commits:
      
      affd4ad [Daoyuan Wang] fix code gen test case
      56efb79 [Daoyuan Wang] ensure bitwise NOT over byte and short persist data type
      f55fbae [Daoyuan Wang] wrong symbol for bitwise not
      1f5ddf17
    • Daoyuan Wang's avatar
      [SPARK-4593][SQL] Return null when denominator is 0 · f6df609d
      Daoyuan Wang authored
      `SELECT max(1/0) FROM src`
      would return a very large number, which is obviously not right.
      For hive-0.12, Hive would return `Infinity` for 1/0, while for hive-0.13.1, it is `NULL` for 1/0.
      I think it is better to keep our behavior consistent with the newer Hive version.
      This PR ensures that when the divisor is 0, the result of the expression is NULL, the same as in hive-0.13.1.
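
The intended semantics can be sketched with `Option`, where `None` plays the role of SQL NULL. This is a hypothetical illustration, not the Catalyst `Divide` expression.

```scala
object SafeDivide {
  // Returns None (standing in for SQL NULL) when the divisor is zero,
  // instead of the Infinity that plain Double division would produce.
  def divide(numerator: Double, divisor: Double): Option[Double] =
    if (divisor == 0.0) None else Some(numerator / divisor)
}
```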
      
      Author: Daoyuan Wang <daoyuan.wang@intel.com>
      
      Closes #3443 from adrian-wang/div and squashes the following commits:
      
      2e98677 [Daoyuan Wang] fix code gen for divide 0
      85c28ba [Daoyuan Wang] temp
      36236a5 [Daoyuan Wang] add test cases
      6f5716f [Daoyuan Wang] fix comments
      cee92bd [Daoyuan Wang] avoid evaluation 2 times
      22ecd9a [Daoyuan Wang] fix style
      cf28c58 [Daoyuan Wang] divide fix
      2dfe50f [Daoyuan Wang] return null when divider is 0 of Double type
      f6df609d
    • YanTangZhai's avatar
      [SPARK-4676][SQL] JavaSchemaRDD.schema may throw NullType MatchError if sql has null · 10664276
      YanTangZhai authored
      val jsc = new org.apache.spark.api.java.JavaSparkContext(sc)
      val jhc = new org.apache.spark.sql.hive.api.java.JavaHiveContext(jsc)
      val nrdd = jhc.hql("select null from spark_test.for_test")
      println(nrdd.schema)
      Then the error is thrown as follows:
      scala.MatchError: NullType (of class org.apache.spark.sql.catalyst.types.NullType$)
      at org.apache.spark.sql.types.util.DataTypeConversions$.asJavaDataType(DataTypeConversions.scala:43)
      
      Author: YanTangZhai <hakeemzhai@tencent.com>
      Author: yantangzhai <tyz0303@163.com>
      Author: Michael Armbrust <michael@databricks.com>
      
      Closes #3538 from YanTangZhai/MatchNullType and squashes the following commits:
      
      e052dff [yantangzhai] [SPARK-4676] [SQL] JavaSchemaRDD.schema may throw NullType MatchError if sql has null
      4b4bb34 [yantangzhai] [SPARK-4676] [SQL] JavaSchemaRDD.schema may throw NullType MatchError if sql has null
      896c7b7 [yantangzhai] fix NullType MatchError in JavaSchemaRDD when sql has null
      6e643f8 [YanTangZhai] Merge pull request #11 from apache/master
      e249846 [YanTangZhai] Merge pull request #10 from apache/master
      d26d982 [YanTangZhai] Merge pull request #9 from apache/master
      76d4027 [YanTangZhai] Merge pull request #8 from apache/master
      03b62b0 [YanTangZhai] Merge pull request #7 from apache/master
      8a00106 [YanTangZhai] Merge pull request #6 from apache/master
      cbcba66 [YanTangZhai] Merge pull request #3 from apache/master
      cdef539 [YanTangZhai] Merge pull request #1 from apache/master
      10664276
    • baishuo's avatar
      [SPARK-4663][sql]add finally to avoid resource leak · 69b6fed2
      baishuo authored
      Author: baishuo <vc_java@hotmail.com>
      
      Closes #3526 from baishuo/master-trycatch and squashes the following commits:
      
      d446e14 [baishuo] correct the code style
      b36bf96 [baishuo] correct the code style
      ae0e447 [baishuo] add finally to avoid resource leak
      69b6fed2
    • Kousuke Saruta's avatar
      [SPARK-4536][SQL] Add sqrt and abs to Spark SQL DSL · e75e04f9
      Kousuke Saruta authored
      Spark SQL has built-in sqrt and abs, but the DSL doesn't support those functions.
      
      Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>
      
      Closes #3401 from sarutak/dsl-missing-operator and squashes the following commits:
      
      07700cf [Kousuke Saruta] Modified Literal(null, NullType) to Literal(null) in DslQuerySuite
      8f366f8 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into dsl-missing-operator
      1b88e2e [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into dsl-missing-operator
      0396f89 [Kousuke Saruta] Added sqrt and abs to Spark SQL DSL
      e75e04f9
    • Reynold Xin's avatar
      Indent license header properly for interfaces.scala. · b1f8fe31
      Reynold Xin authored
      A very small nit update.
      
      Author: Reynold Xin <rxin@databricks.com>
      
      Closes #3552 from rxin/license-header and squashes the following commits:
      
      df8d1a4 [Reynold Xin] Indent license header properly for interfaces.scala.
      b1f8fe31
    • Kay Ousterhout's avatar
      [SPARK-4686] Link to allowed master URLs is broken · d9a148ba
      Kay Ousterhout authored
      The link points to the old scala programming guide; it should point to the submitting applications page.
      
      This should be backported to 1.1.2 (it's been broken as of 1.0).
      
      Author: Kay Ousterhout <kayousterhout@gmail.com>
      
      Closes #3542 from kayousterhout/SPARK-4686 and squashes the following commits:
      
      a8fc43b [Kay Ousterhout] [SPARK-4686] Link to allowed master URLs is broken
      d9a148ba
    • zsxwing's avatar
      [SPARK-4397][Core] Cleanup 'import SparkContext._' in core · 6dfe38a0
      zsxwing authored
      This PR cleans up `import SparkContext._` in core for SPARK-4397(#3262) to prove it really works well.
      
      Author: zsxwing <zsxwing@gmail.com>
      
      Closes #3530 from zsxwing/SPARK-4397-cleanup and squashes the following commits:
      
      04e2273 [zsxwing] Cleanup 'import SparkContext._' in core
      6dfe38a0
  3. Dec 01, 2014
    • DB Tsai's avatar
      [SPARK-4611][MLlib] Implement the efficient vector norm · 64f3175b
      DB Tsai authored
      The vector norm in breeze is implemented with `activeIterator`, which is known to be very slow.
      This PR implements an efficient vector norm, and with this API, `Normalizer` and
      `k-means` show a big performance improvement.
      
      Here is the benchmark against mnist8m dataset.
      
      a) `Normalizer`
      Before
      DenseVector: 68.25secs
      SparseVector: 17.01secs
      
      With this PR
      DenseVector: 12.71secs
      SparseVector: 2.73secs
      
      b) `k-means`
      Before
      DenseVector: 83.46secs
      SparseVector: 61.60secs
      
      With this PR
      DenseVector: 70.04secs
      SparseVector: 59.05secs
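
A norm computed directly over the stored values, without an iterator, could look like the sketch below. It is an assumption about the general technique rather than the PR's code. For a sparse vector, passing only the stored non-zero values gives the same result, since zeros contribute nothing.

```scala
object NormSketch {
  // p-norm over an array of values, with fast paths for p = 1, 2 and infinity.
  def norm(values: Array[Double], p: Double): Double = {
    require(p >= 1.0, s"p must be at least 1.0, got $p")
    val n = values.length
    var i = 0
    if (p == 1.0) {
      var sum = 0.0
      while (i < n) { sum += math.abs(values(i)); i += 1 }
      sum
    } else if (p == 2.0) {
      var sum = 0.0
      while (i < n) { sum += values(i) * values(i); i += 1 }
      math.sqrt(sum)
    } else if (p == Double.PositiveInfinity) {
      var max = 0.0
      while (i < n) { max = math.max(max, math.abs(values(i))); i += 1 }
      max
    } else {
      var sum = 0.0
      while (i < n) { sum += math.pow(math.abs(values(i)), p); i += 1 }
      math.pow(sum, 1.0 / p)
    }
  }
}
```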
      
      Author: DB Tsai <dbtsai@alpinenow.com>
      
      Closes #3462 from dbtsai/norm and squashes the following commits:
      
      63c7165 [DB Tsai] typo
      0c3637f [DB Tsai] add import org.apache.spark.SparkContext._ back
      6fa616c [DB Tsai] address feedback
      9b7cb56 [DB Tsai] move norm to static method
      0b632e6 [DB Tsai] kmeans
      dbed124 [DB Tsai] style
      c1a877c [DB Tsai] first commit
      64f3175b
    • Patrick Wendell's avatar
      MAINTENANCE: Automated closing of pull requests. · b0a46d89
      Patrick Wendell authored
      This commit exists to close the following pull requests on Github:
      
      Closes #1612 (close requested by 'marmbrus')
      Closes #2723 (close requested by 'marmbrus')
      Closes #1737 (close requested by 'marmbrus')
      Closes #2252 (close requested by 'marmbrus')
      Closes #2029 (close requested by 'marmbrus')
      Closes #2386 (close requested by 'marmbrus')
      Closes #2997 (close requested by 'marmbrus')
      b0a46d89
    • zsxwing's avatar
      [SPARK-4268][SQL] Use #::: to get benefit from Stream in SqlLexical.allCaseVersions · d3e02ddd
      zsxwing authored
      In addition, using `s.isEmpty` to eliminate the string comparison.
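
A sketch of the idea, assuming the classic `scala.collection.immutable.Stream` (deprecated in favor of `LazyList` since Scala 2.13): `#:::` takes its right-hand operand lazily, so the upper-case branch is only expanded when a consumer reaches it. The enclosing object name is hypothetical.

```scala
object CaseVersionsSketch {
  // Lazily enumerates every upper/lower-case variant of `s`.
  // `s.isEmpty` replaces a comparison against the empty string.
  def allCaseVersions(s: String, prefix: String = ""): Stream[String] =
    if (s.isEmpty) {
      Stream(prefix)
    } else {
      allCaseVersions(s.tail, prefix + s.head.toLower) #:::
        allCaseVersions(s.tail, prefix + s.head.toUpper)
    }
}
```

A keyword of length n has 2^n variants, so laziness matters when a match is found early.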
      
      Author: zsxwing <zsxwing@gmail.com>
      
      Closes #3132 from zsxwing/SPARK-4268 and squashes the following commits:
      
      358e235 [zsxwing] Improvement of allCaseVersions
      d3e02ddd
    • Daoyuan Wang's avatar
      [SPARK-4529] [SQL] support view with column alias · 4df60a8c
      Daoyuan Wang authored
      Support view definition like
      
      CREATE VIEW view3(valoo)
      TBLPROPERTIES ("fear" = "factor")
      AS SELECT upper(value) FROM src WHERE key=86;
      
      [valoo as the alias of upper(value)]. This is the missing part of SPARK-4239, needed for full view support.
      
      Author: Daoyuan Wang <daoyuan.wang@intel.com>
      
      Closes #3396 from adrian-wang/viewcolumn and squashes the following commits:
      
      4d001d0 [Daoyuan Wang] support view with column alias
      4df60a8c
    • Daoyuan Wang's avatar
      [SQL][DOC] Date type in SQL programming guide · 5edbcbfb
      Daoyuan Wang authored
      Author: Daoyuan Wang <daoyuan.wang@intel.com>
      
      Closes #3535 from adrian-wang/datedoc and squashes the following commits:
      
      18ff1ed [Daoyuan Wang] [DOC] Date type
      5edbcbfb
    • wangfei's avatar
      [SQL] Minor fix for doc and comment · 7b799578
      wangfei authored
      Author: wangfei <wangfei1@huawei.com>
      
      Closes #3533 from scwf/sql-doc1 and squashes the following commits:
      
      962910b [wangfei] doc and comment fix
      7b799578
    • ravipesala's avatar
      [SPARK-4658][SQL] Code documentation issue in DDL of datasource API · bc353819
      ravipesala authored
      Author: ravipesala <ravindra.pesala@huawei.com>
      
      Closes #3516 from ravipesala/ddl_doc and squashes the following commits:
      
      d101fdf [ravipesala] Style issues fixed
      d2238cd [ravipesala] Corrected documentation
      bc353819
    • ravipesala's avatar
      [SPARK-4650][SQL] Supporting multi column support in countDistinct function like count(distinct c1,c2..) in Spark SQL · 6a9ff19d
      ravipesala authored
      Supporting multi column support in countDistinct function like count(distinct c1,c2..) in Spark SQL
      
      Author: ravipesala <ravindra.pesala@huawei.com>
      Author: Michael Armbrust <michael@databricks.com>
      
      Closes #3511 from ravipesala/countdistinct and squashes the following commits:
      
      cc4dbb1 [ravipesala] style
      070e12a [ravipesala] Supporting multi column support in count(distinct c1,c2..) in Spark SQL
      6a9ff19d
    • Liang-Chi Hsieh's avatar
      [SPARK-4358][SQL] Let BigDecimal do checking type compatibility · b57365a1
      Liang-Chi Hsieh authored
      Remove the hardcoded max and min values for types and let BigDecimal check type compatibility.
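
Scala's `BigDecimal` already exposes range checks such as `isValidInt` and `isValidLong`, so a literal can be classified without hardcoded bounds. The helper below is a hypothetical sketch of that idea, not the Spark SQL parser code.

```scala
object NumericLiteralSketch {
  // Hypothetical helper: picks the narrowest of Int, Long, or Decimal that
  // can hold the literal, delegating the range checks to BigDecimal.
  def narrowestType(literal: String): String = {
    val v = BigDecimal(literal)
    if (v.isValidInt) "Int"
    else if (v.isValidLong) "Long"
    else "Decimal"
  }
}
```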
      
      Author: Liang-Chi Hsieh <viirya@gmail.com>
      
      Closes #3208 from viirya/more_numericLit and squashes the following commits:
      
      e9834b4 [Liang-Chi Hsieh] Remove byte and short types for number literal.
      1bd1825 [Liang-Chi Hsieh] Fix Indentation and make the modification clearer.
      cf1a997 [Liang-Chi Hsieh] Modified for comment to add a rule of analysis that adds a cast.
      91fe489 [Liang-Chi Hsieh] add Byte and Short.
      1bdc69d [Liang-Chi Hsieh] Let BigDecimal do checking type compatibility.
      b57365a1
    • Jacky Li's avatar
      [SQL] add @group tab in limit() and count() · bafee67e
      Jacky Li authored
      The @group tag is missing from the scaladoc.
      
      Author: Jacky Li <jacky.likun@gmail.com>
      
      Closes #3458 from jackylk/patch-7 and squashes the following commits:
      
      0121a70 [Jacky Li] add @group tab in limit() and count()
      bafee67e
    • Cheng Lian's avatar
      [SPARK-4258][SQL][DOC] Documents spark.sql.parquet.filterPushdown · 5db8dcaf
      Cheng Lian authored
      Documents `spark.sql.parquet.filterPushdown`, explains why it's turned off by default and when it's safe to be turned on.
      
      Author: Cheng Lian <lian@databricks.com>
      
      Closes #3440 from liancheng/parquet-filter-pushdown-doc and squashes the following commits:
      
      2104311 [Cheng Lian] Documents spark.sql.parquet.filterPushdown
      5db8dcaf
    • Madhu Siddalingaiah's avatar
      Documentation: add description for repartitionAndSortWithinPartitions · 2b233f5f
      Madhu Siddalingaiah authored
      Author: Madhu Siddalingaiah <madhu@madhu.com>
      
      Closes #3390 from msiddalingaiah/master and squashes the following commits:
      
      cbccbfe [Madhu Siddalingaiah] Documentation: replace <b> with <code> (again)
      332f7a2 [Madhu Siddalingaiah] Documentation: replace <b> with <code>
      cd2b05a [Madhu Siddalingaiah] Merge remote-tracking branch 'upstream/master'
      0fc12d7 [Madhu Siddalingaiah] Documentation: add description for repartitionAndSortWithinPartitions
      2b233f5f
    • zsxwing's avatar
      [SPARK-4661][Core] Minor code and docs cleanup · 30a86acd
      zsxwing authored
      Author: zsxwing <zsxwing@gmail.com>
      
      Closes #3521 from zsxwing/SPARK-4661 and squashes the following commits:
      
      03cbe3f [zsxwing] Minor code and docs cleanup
      30a86acd
    • zsxwing's avatar
      [SPARK-4664][Core] Throw an exception when spark.akka.frameSize > 2047 · 1d238f22
      zsxwing authored
      If `spark.akka.frameSize` > 2047, the size in bytes overflows and becomes negative. `maxFrameSizeBytes` should assert on this to warn people.
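
The overflow comes from converting megabytes to bytes in 32-bit arithmetic: 2048 * 1024 * 1024 is 2^31, one more than `Int.MaxValue`. A minimal sketch of such a check follows; the function mirrors the name in the message, but the body is an assumption, not Spark's implementation.

```scala
object FrameSizeSketch {
  // Converts a frame size in MB to bytes, rejecting values whose byte count
  // would not fit in an Int.
  def maxFrameSizeBytes(frameSizeInMB: Int): Int = {
    require(frameSizeInMB <= 2047,
      s"frame size should not be greater than 2047 MB, got $frameSizeInMB")
    frameSizeInMB * 1024 * 1024
  }
}
```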
      
      Author: zsxwing <zsxwing@gmail.com>
      
      Closes #3527 from zsxwing/SPARK-4664 and squashes the following commits:
      
      0089c7a [zsxwing] Throw an exception when spark.akka.frameSize > 2047
      1d238f22
    • Sean Owen's avatar
      SPARK-2192 [BUILD] Examples Data Not in Binary Distribution · 6384f42a
      Sean Owen authored
      Simply, add data/ to distributions. This adds about 291KB (compressed) to the tarball, FYI.
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #3480 from srowen/SPARK-2192 and squashes the following commits:
      
      47688f1 [Sean Owen] Add data/ to distributions
      6384f42a