  1. Mar 24, 2015
    • Reynold Xin's avatar
      [SPARK-6428][SQL] Added explicit types for all public methods in catalyst · 73348012
      Reynold Xin authored
      I think after this PR, we can finally turn the rule on. There are still some smaller ones that need to be fixed, but those are easier.
      
      Author: Reynold Xin <rxin@databricks.com>
      
      Closes #5162 from rxin/catalyst-explicit-types and squashes the following commits:
      
      e7eac03 [Reynold Xin] [SPARK-6428][SQL] Added explicit types for all public methods in catalyst.
      73348012
    • Josh Rosen's avatar
      [SPARK-6209] Clean up connections in ExecutorClassLoader after failing to load... · 7215aa74
      Josh Rosen authored
      [SPARK-6209] Clean up connections in ExecutorClassLoader after failing to load classes (master branch PR)
      
      ExecutorClassLoader does not ensure proper cleanup of network connections that it opens. If it fails to load a class, it may leak partially-consumed InputStreams that are connected to the REPL's HTTP class server, causing that server to exhaust its thread pool, which can cause the entire job to hang.  See [SPARK-6209](https://issues.apache.org/jira/browse/SPARK-6209) for more details, including a bug reproduction.
      
      This patch fixes this issue by ensuring proper cleanup of these resources.  It also adds logging for unexpected error cases.
      
      This PR is an extended version of #4935 and adds a regression test.
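      The fix follows the standard close-on-failure pattern; a minimal, self-contained sketch (names here are illustrative, not the actual ExecutorClassLoader code):

      ```scala
      import java.io.{ByteArrayInputStream, InputStream}

      // Illustrative sketch: whether reading the class bytes succeeds or
      // throws, the stream is closed, so the connection is returned to the
      // class server instead of being leaked half-consumed.
      def readAndClose(open: () => InputStream)(read: InputStream => Array[Byte]): Array[Byte] = {
        val in = open()
        try {
          read(in)
        } finally {
          in.close() // runs even when `read` throws
        }
      }
      ```

      Per the squashed commits, the actual patch additionally logs unexpected errors and closes the error stream so HTTP keep-alive keeps working.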
      
      Author: Josh Rosen <joshrosen@databricks.com>
      
      Closes #4944 from JoshRosen/executorclassloader-leak-master-branch and squashes the following commits:
      
      e0e3c25 [Josh Rosen] Wrap try block around getReponseCode; re-enable keep-alive by closing error stream
      961c284 [Josh Rosen] Roll back changes that were added to get the regression test to fail
      7ee2261 [Josh Rosen] Add a failing regression test
      e2d70a3 [Josh Rosen] Properly clean up after errors in ExecutorClassLoader
      7215aa74
    • Michael Armbrust's avatar
      [SPARK-6458][SQL] Better error messages for invalid data sources · a8f51b82
      Michael Armbrust authored
      Avoid unclear match errors and use `AnalysisException`.
      
      Author: Michael Armbrust <michael@databricks.com>
      
      Closes #5158 from marmbrus/dataSourceError and squashes the following commits:
      
      af9f82a [Michael Armbrust] Yins comment
      90c6ba4 [Michael Armbrust] Better error messages for invalid data sources
      a8f51b82
    • Michael Armbrust's avatar
      [SPARK-6376][SQL] Avoid eliminating subqueries until optimization · cbeaf9eb
      Michael Armbrust authored
      Previously it was okay to throw away subqueries after analysis, as we would never try to use that tree for resolution again.  However, with eager analysis in `DataFrame`s this can cause errors for queries such as:
      
      ```scala
      val df = Seq(1,2,3).map(i => (i, i.toString)).toDF("int", "str")
      df.as('x).join(df.as('y), $"x.str" === $"y.str").groupBy("x.str").count()
      ```
      
      As a result, in this PR we defer the elimination of subqueries until the optimization phase.
      
      Author: Michael Armbrust <michael@databricks.com>
      
      Closes #5160 from marmbrus/subqueriesInDfs and squashes the following commits:
      
      a9bb262 [Michael Armbrust] Update Optimizer.scala
      27d25bf [Michael Armbrust] fix hive tests
      9137e03 [Michael Armbrust] add type
      81cd597 [Michael Armbrust] Avoid eliminating subqueries until optimization
      cbeaf9eb
    • Michael Armbrust's avatar
      [SPARK-6375][SQL] Fix formatting of error messages. · 046c1e2a
      Michael Armbrust authored
      Author: Michael Armbrust <michael@databricks.com>
      
      Closes #5155 from marmbrus/errorMessages and squashes the following commits:
      
      b898188 [Michael Armbrust] Fix formatting of error messages.
      046c1e2a
    • Michael Armbrust's avatar
      [SPARK-6054][SQL] Fix transformations of TreeNodes that hold StructTypes · 3fa3d121
      Michael Armbrust authored
      Due to a recent change that made `StructType` a `Seq` we started inadvertently turning `StructType`s into generic `Traversable` when attempting nested tree transformations.  In this PR we explicitly avoid descending into `DataType`s to avoid this bug.
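      A self-contained illustration of the trap with toy types (not Catalyst's actual classes): a domain type that extends `Seq` is swallowed by a generic collection clause during transformation and comes back as a plain `Seq`, losing its original type.

      ```scala
      // A domain type that happens to extend Seq, like StructType does.
      class MyStruct(val fields: List[Int]) extends Seq[Int] {
        def apply(i: Int): Int = fields(i)
        def length: Int = fields.length
        def iterator: Iterator[Int] = fields.iterator
      }

      // Buggy shape: the generic collection clause catches MyStruct and
      // rebuilds it as a plain Seq.
      def transformArgBuggy(arg: Any): Any = arg match {
        case s: Seq[_] => s.map(identity) // MyStruct falls in here
        case other     => other
      }

      // Fixed shape: match the specific type first and treat it as a leaf.
      def transformArgFixed(arg: Any): Any = arg match {
        case s: MyStruct => s               // do not descend
        case s: Seq[_]   => s.map(identity)
        case other       => other
      }
      ```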
      
      Author: Michael Armbrust <michael@databricks.com>
      
      Closes #5157 from marmbrus/udfFix and squashes the following commits:
      
      26f7087 [Michael Armbrust] Fix transformations of TreeNodes that hold StructTypes
      3fa3d121
    • Michael Armbrust's avatar
      [SPARK-6437][SQL] Use completion iterator to close external sorter · 26c6ce3d
      Michael Armbrust authored
      Otherwise we will leak files when spilling occurs.
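      The completion-iterator pattern can be sketched in a few lines (a self-contained toy; Spark's real utility lives in its util package): run a cleanup callback exactly once when the wrapped iterator is exhausted, so spill files can be deleted as soon as the sorted output has been fully consumed.

      ```scala
      // Wraps an iterator and invokes `completion` exactly once on exhaustion.
      class CompletionIterator[A](sub: Iterator[A], completion: () => Unit) extends Iterator[A] {
        private var completed = false
        def hasNext: Boolean = {
          val r = sub.hasNext
          if (!r && !completed) {
            completed = true
            completion() // e.g. stopping the sorter, deleting spill files
          }
          r
        }
        def next(): A = sub.next()
      }
      ```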
      
      Author: Michael Armbrust <michael@databricks.com>
      
      Closes #5161 from marmbrus/cleanupAfterSort and squashes the following commits:
      
      cb13d3c [Michael Armbrust] hint to inferencer
      cdebdf5 [Michael Armbrust] Use completion iterator to close external sorter
      26c6ce3d
    • Michael Armbrust's avatar
      [SPARK-6459][SQL] Warn when constructing trivially true equals predicate · 32efadd0
      Michael Armbrust authored
      For example, one might expect the following code to work, but it does not.  Now you will at least get a warning with a suggestion to use aliases.
      
      ```scala
      val df = sqlContext.load(path, "parquet")
      val txns = df.groupBy("cust_id").agg($"cust_id", countDistinct($"day_num").as("txns"))
      val spend = df.groupBy("cust_id").agg($"cust_id", sum($"extended_price").as("spend"))
      val rmJoin = txns.join(spend, txns("cust_id") === spend("cust_id"), "inner")
      ```
      
      Author: Michael Armbrust <michael@databricks.com>
      
      Closes #5163 from marmbrus/selfJoinError and squashes the following commits:
      
      16c1f0b [Michael Armbrust] fix visibility
      1b57e8d [Michael Armbrust] Warn when constructing trivially true equals predicate
      32efadd0
    • Xiangrui Meng's avatar
      [SPARK-6361][SQL] support adding a column with metadata in DF · 6bdddb6f
      Xiangrui Meng authored
      This is used by ML pipelines to embed ML attributes in columns created by ML transformers/estimators. marmbrus
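      A self-contained toy of the design (the PR itself adds an `as(alias, metadata)` variant on `Column`; the types below are made up): metadata rides along with the column so downstream ML stages can read attributes back.

      ```scala
      // Toy column carrying an arbitrary metadata map.
      final case class Col(name: String, metadata: Map[String, String] = Map.empty) {
        def as(alias: String): Col = copy(name = alias)          // keeps metadata
        def as(alias: String, meta: Map[String, String]): Col =  // attaches metadata
          Col(alias, meta)
      }
      ```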
      
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #5151 from mengxr/SPARK-6361 and squashes the following commits:
      
      bb30de3 [Xiangrui Meng] support adding a column with metadata in DF
      6bdddb6f
    • Xiangrui Meng's avatar
      [SPARK-6475][SQL] recognize array types when infer data types from JavaBeans · a1d1529d
      Xiangrui Meng authored
      Right now, if there is an array field in a JavaBean, the user would see an exception in `createDataFrame`. liancheng
      
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #5146 from mengxr/SPARK-6475 and squashes the following commits:
      
      51e87e5 [Xiangrui Meng] validate schemas
      4f2df5e [Xiangrui Meng] recognize array types when infer data types from JavaBeans
      a1d1529d
    • Peter Rudenko's avatar
      [ML][docs][minor] Define LabeledDocument/Document classes in CV example · 08d45280
      Peter Rudenko authored
      To make it easier to copy/paste the Cross-Validation example code snippet, we need to define the LabeledDocument/Document classes in it, since they are defined in a previous example.
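      For reference, a self-contained sketch of the two classes the snippet needs (fields follow the pipeline example of this era):

      ```scala
      // Unlabeled documents, scored by the fitted pipeline.
      case class Document(id: Long, text: String)
      // Labeled documents, used for training.
      case class LabeledDocument(id: Long, text: String, label: Double)
      ```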
      
      Author: Peter Rudenko <petro.rudenko@gmail.com>
      
      Closes #5135 from petro-rudenko/patch-3 and squashes the following commits:
      
      5190c75 [Peter Rudenko] Fix primitive types for java examples.
      1d35383 [Peter Rudenko] [SQL][docs][minor] Define LabeledDocument/Document classes in CV example
      08d45280
    • Kousuke Saruta's avatar
      [SPARK-5559] [Streaming] [Test] Remove an opportunity for flakiness when running FlumeStreamSuite · 85cf0636
      Kousuke Saruta authored
      When we run FlumeStreamSuite on Jenkins, we sometimes get an error like the following:
      
          sbt.ForkMain$ForkError: The code passed to eventually never returned normally. Attempted 52 times over 10.094849836 seconds. Last failure message: Error connecting to localhost/127.0.0.1:23456.
              at org.scalatest.concurrent.Eventually$class.tryTryAgain$1(Eventually.scala:420)
              at org.scalatest.concurrent.Eventually$class.eventually(Eventually.scala:438)
              at org.scalatest.concurrent.Eventually$.eventually(Eventually.scala:478)
              at org.scalatest.concurrent.Eventually$class.eventually(Eventually.scala:307)
              at org.scalatest.concurrent.Eventually$.eventually(Eventually.scala:478)
              at org.apache.spark.streaming.flume.FlumeStreamSuite.writeAndVerify(FlumeStreamSuite.scala:116)
              at org.apache.spark.streaming.flume.FlumeStreamSuite.org$apache$spark$streaming$flume$FlumeStreamSuite$$testFlumeStream(FlumeStreamSuite.scala:74)
              at org.apache.spark.streaming.flume.FlumeStreamSuite$$anonfun$3.apply$mcV$sp(FlumeStreamSuite.scala:66)
              at org.apache.spark.streaming.flume.FlumeStreamSuite$$anonfun$3.apply(FlumeStreamSuite.scala:66)
              at org.apache.spark.streaming.flume.FlumeStreamSuite$$anonfun$3.apply(FlumeStreamSuite.scala:66)
              at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
              at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
              at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
              at org.scalatest.Transformer.apply(Transformer.scala:22)
              at org.scalatest.Transformer.apply(Transformer.scala:20)
              at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166)
              at org.scalatest.Suite$class.withFixture(Suite.scala:1122)
              at org.scalatest.FunSuite.withFixture(FunSuite.scala:1555)
              at org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163)
              at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
              at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
              at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
              at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175)
      
      This error is caused by the check-then-act logic used to find a free port:

      ```scala
      /** Find a free port */
      private def findFreePort(): Int = {
        Utils.startServiceOnPort(23456, (trialPort: Int) => {
          val socket = new ServerSocket(trialPort)
          socket.close()
          (null, trialPort)
        }, conf)._2
      }
      ```

      Removing the check-then-act is not easy, but we can reduce the chance of hitting the error by choosing a random value for the initial port instead of the fixed 23456.
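      The mitigation can be sketched as follows (self-contained; the range bounds here are illustrative):

      ```scala
      import scala.util.Random

      // Start the port search from a random value rather than the fixed
      // 23456, so concurrent test runs rarely race for the same port.
      def randomInitialPort(): Int = {
        val minPort = 10000
        val maxPort = 65535
        minPort + Random.nextInt(maxPort - minPort)
      }
      ```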
      
      Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>
      
      Closes #4337 from sarutak/SPARK-5559 and squashes the following commits:
      
      16f109f [Kousuke Saruta] Added `require` to Utils#startServiceOnPort
      c39d8b6 [Kousuke Saruta] Merge branch 'SPARK-5559' of github.com:sarutak/spark into SPARK-5559
      1610ba2 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-5559
      33357e3 [Kousuke Saruta] Changed "findFreePort" method in MQTTStreamSuite and FlumeStreamSuite so that it can choose valid random port
      a9029fe [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-5559
      9489ef9 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-5559
      8212e42 [Kousuke Saruta] Modified default port used in FlumeStreamSuite from 23456 to random value
      85cf0636
    • Marcelo Vanzin's avatar
      [SPARK-6473] [core] Do not try to figure out Scala version if not needed... · b293afc4
      Marcelo Vanzin authored
      ....
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #5143 from vanzin/SPARK-6473 and squashes the following commits:
      
      a2e5e2d [Marcelo Vanzin] [SPARK-6473] [core] Do not try to figure out Scala version if not needed.
      b293afc4
    • Cong Yue's avatar
      Update the command to use IPython notebook · c12312f8
      Cong Yue authored
      Since `notebook --pylab inline` is not supported any more, update the related documentation.
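      For context, the documented replacement command looks like this (variable names follow the pyspark launcher of this era; treat as a sketch):

      ```shell
      # Launch PySpark under the IPython notebook instead of the removed
      # `--pylab inline` flag:
      IPYTHON=1 IPYTHON_OPTS="notebook" ./bin/pyspark
      ```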
      
      Author: Cong Yue <yuecong1104@gmail.com>
      
      Closes #5111 from yuecong/patch-1 and squashes the following commits:
      
      872df76 [Cong Yue] Update the command to use IPython notebook
      c12312f8
    • Brennon York's avatar
      [SPARK-6477][Build]: Run MIMA tests before the Spark test suite · 37fac1dc
      Brennon York authored
      This moves the MIMA checks to before the full Spark test suite so that, if a new PR fails the MIMA check, it returns much faster, having not run the entire test suite. This is preferable to the current scenario, where a user has to wait until the entire test suite completes before learning that it failed on a MIMA check; once the MIMA issues are fixed, the user then has to resubmit and rerun the full test suite again.
      
      Author: Brennon York <brennon.york@capitalone.com>
      
      Closes #5145 from brennonyork/SPARK-6477 and squashes the following commits:
      
      12b0aee [Brennon York] updated to put the mima checks before the spark test suite
      37fac1dc
    • Cheng Lian's avatar
      [SPARK-6452] [SQL] Checks for missing attributes and unresolved operator for all types of operator · 1afcf773
      Cheng Lian authored
      In `CheckAnalysis`, `Filter` and `Aggregate` are checked in separate case clauses, and thus never hit the generic clauses that check for unresolved operators and missing input attributes.

      This PR also removes the `prettyString` call when generating the error message for missing input attributes, because the result of `prettyString` doesn't contain expression IDs and may give confusing messages like

      > resolved attributes a missing from a
      
      cc rxin
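      A self-contained illustration of the match-ordering bug shape with toy types: operators fully handled by their own case clause never fall through to the generic checks.

      ```scala
      sealed trait Op { def resolved: Boolean }
      final case class Filter(resolved: Boolean) extends Op
      final case class Other(resolved: Boolean) extends Op

      // Buggy shape: the Filter clause returns before any resolution check,
      // so an unresolved Filter slips past.
      def checkBuggy(op: Op): Option[String] = op match {
        case Filter(_)        => None
        case o if !o.resolved => Some("unresolved operator")
        case _                => None
      }

      // Fixed shape: run the specific checks, then the generic checks, for
      // every operator.
      def checkFixed(op: Op): Option[String] = {
        val specific = op match {
          case Filter(_) => None // Filter-specific checks would go here
          case _         => None
        }
        specific.orElse(if (!op.resolved) Some("unresolved operator") else None)
      }
      ```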
      
      
      Author: Cheng Lian <lian@databricks.com>
      
      Closes #5129 from liancheng/spark-6452 and squashes the following commits:
      
      52cdc69 [Cheng Lian] Addresses comments
      029f9bd [Cheng Lian] Checks for missing attributes and unresolved operator for all types of operator
      1afcf773
    • Reynold Xin's avatar
      [SPARK-6428] Added explicit types for all public methods in core. · 4ce2782a
      Reynold Xin authored
      Author: Reynold Xin <rxin@databricks.com>
      
      Closes #5125 from rxin/core-explicit-type and squashes the following commits:
      
      f471415 [Reynold Xin] Revert style checker changes.
      81b66e4 [Reynold Xin] Code review feedback.
      a7533e3 [Reynold Xin] Mima excludes.
      1d795f5 [Reynold Xin] [SPARK-6428] Added explicit types for all public methods in core.
      4ce2782a
  2. Mar 23, 2015
  3. Mar 22, 2015
    • Cheng Lian's avatar
      Revert "[SPARK-6397][SQL] Check the missingInput simply" · bf044def
      Cheng Lian authored
      This reverts commit e566fe59.
      bf044def
    • q00251598's avatar
      [SPARK-6397][SQL] Check the missingInput simply · e566fe59
      q00251598 authored
      Author: q00251598 <qiyadong@huawei.com>
      
      Closes #5082 from watermen/sql-missingInput and squashes the following commits:
      
      25766b9 [q00251598] Check the missingInput simply
      e566fe59
    • Daoyuan Wang's avatar
      [SPARK-4985] [SQL] parquet support for date type · 4659468f
      Daoyuan Wang authored
      This PR might have some issues with #3732, and it would have merge conflicts with #3820, so review can be delayed until those two are merged.
      
      Author: Daoyuan Wang <daoyuan.wang@intel.com>
      
      Closes #3822 from adrian-wang/parquetdate and squashes the following commits:
      
      2c5d54d [Daoyuan Wang] add a test case
      faef887 [Daoyuan Wang] parquet support for primitive date
      97e9080 [Daoyuan Wang] parquet support for date type
      4659468f
    • vinodkc's avatar
      [SPARK-6337][Documentation, SQL]Spark 1.3 doc fixes · 2bf40c58
      vinodkc authored
      Author: vinodkc <vinod.kc.in@gmail.com>
      
      Closes #5112 from vinodkc/spark_1.3_doc_fixes and squashes the following commits:
      
      2c6aee6 [vinodkc] Spark 1.3 doc fixes
      2bf40c58
    • Calvin Jia's avatar
      [SPARK-6122][Core] Upgrade Tachyon client version to 0.6.1. · a41b9c60
      Calvin Jia authored
      Changes the Tachyon client version from 0.5 to 0.6 in Spark core and the distribution script.
      
      New dependencies in Tachyon 0.6.0 include
      
      commons-codec:commons-codec:jar:1.5:compile
      io.netty:netty-all:jar:4.0.23.Final:compile
      
      These are already in Spark core.
      
      Author: Calvin Jia <jia.calvin@gmail.com>
      
      Closes #4867 from calvinjia/upgrade_tachyon_0.6.0 and squashes the following commits:
      
      eed9230 [Calvin Jia] Update tachyon version to 0.6.1.
      11907b3 [Calvin Jia] Use TachyonURI for tachyon paths instead of strings.
      71bf441 [Calvin Jia] Upgrade Tachyon client version to 0.6.0.
      a41b9c60
    • Kamil Smuga's avatar
      SPARK-6454 [DOCS] Fix links to pyspark api · 6ef48632
      Kamil Smuga authored
      Author: Kamil Smuga <smugakamil@gmail.com>
      Author: stderr <smugakamil@gmail.com>
      
      Closes #5120 from kamilsmuga/master and squashes the following commits:
      
      fee3281 [Kamil Smuga] more python api links fixed for docs
      13240cb [Kamil Smuga] resolved merge conflicts with upstream/master
      6649b3b [Kamil Smuga] fix broken docs links to Python API
      92f03d7 [stderr] Fix links to pyspark api
      6ef48632
    • Jongyoul Lee's avatar
      [SPARK-6453][Mesos] Some Mesos*Suite have a different package with their classes · adb2ff75
      Jongyoul Lee authored
      - Moved Suites from o.a.s.s.mesos to o.a.s.s.cluster.mesos
      
      Author: Jongyoul Lee <jongyoul@gmail.com>
      
      Closes #5126 from jongyoul/SPARK-6453 and squashes the following commits:
      
      4f24a3e [Jongyoul Lee] [SPARK-6453][Mesos] Some Mesos*Suite have a different package with their classes - Fixed imports orders
      8ab149d [Jongyoul Lee] [SPARK-6453][Mesos] Some Mesos*Suite have a different package with their classes - Moved Suites from o.a.s.s.mesos to o.a.s.s.cluster.mesos
      adb2ff75
    • Hangchen Yu's avatar
      [SPARK-6455] [docs] Correct some mistakes and typos · ab4f516f
      Hangchen Yu authored
      Correct some typos, and fix a mistake in lib/PageRank.scala: the first PageRank implementation uses the standalone Graph interface, but the second uses the Pregel interface. The original comment could mislead readers of the code.
      
      Author: Hangchen Yu <yuhc@gitcafe.com>
      
      Closes #5128 from yuhc/master and squashes the following commits:
      
      53e5432 [Hangchen Yu] Merge branch 'master' of https://github.com/yuhc/spark
      67b77b5 [Hangchen Yu] [SPARK-6455] [docs] Correct some mistakes and typos
      206f2dc [Hangchen Yu] Correct some mistakes and typos.
      ab4f516f
    • Ryan Williams's avatar
      [SPARK-6448] Make history server log parse exceptions · b9fe504b
      Ryan Williams authored
      This helped me to debug a parse error that was due to the event log format changing recently.
      
      Author: Ryan Williams <ryan.blake.williams@gmail.com>
      
      Closes #5122 from ryan-williams/histerror and squashes the following commits:
      
      5831656 [Ryan Williams] line length
      c3742ae [Ryan Williams] Make history server log parse exceptions
      b9fe504b
    • ypcat's avatar
      [SPARK-6408] [SQL] Fix JDBCRDD filtering string literals · 9b1e1f20
      ypcat authored
      Author: ypcat <ypcat6@gmail.com>
      Author: Pei-Lun Lee <pllee@appier.com>
      
      Closes #5087 from ypcat/spark-6408 and squashes the following commits:
      
      1becc16 [ypcat] [SPARK-6408] [SQL] styling
      1bc4455 [ypcat] [SPARK-6408] [SQL] move nested function outside
      e57fa4a [ypcat] [SPARK-6408] [SQL] fix test case
      245ab6f [ypcat] [SPARK-6408] [SQL] add test cases for filtering quoted strings
      8962534 [Pei-Lun Lee] [SPARK-6408] [SQL] Fix filtering string literals
      9b1e1f20
  4. Mar 21, 2015
    • Reynold Xin's avatar
      [SPARK-6428][SQL] Added explicit type for all public methods for Hive module · b6090f90
      Reynold Xin authored
      Author: Reynold Xin <rxin@databricks.com>
      
      Closes #5108 from rxin/hive-public-type and squashes the following commits:
      
      a320328 [Reynold Xin] [SPARK-6428][SQL] Added explicit type for all public methods for Hive module.
      b6090f90
    • Yin Huai's avatar
      [SPARK-6250][SPARK-6146][SPARK-5911][SQL] Types are now reserved words in DDL parser. · 94a102ac
      Yin Huai authored
      This PR creates a trait `DataTypeParser` used to parse data types. This trait aims to be the single place providing the functionality of parsing a data type's string representation. It is currently mixed into `DDLParser` and `SqlParser`. It is also used to parse the data type for `DataFrame.cast` and to convert a Hive metastore data type string back to a `DataType`.
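      The idea can be sketched with a toy grammar (self-contained; Spark's real parser covers the full DDL type syntax): one parsing function turns a type string back into structured form, shared by every caller.

      ```scala
      sealed trait SimpleType
      case object IntType extends SimpleType
      case object StringType extends SimpleType
      final case class ArrayType(elementType: SimpleType) extends SimpleType

      // Deliberately tiny recursive-descent parser for type strings.
      def parseDataType(s: String): SimpleType = s.trim match {
        case "int"    => IntType
        case "string" => StringType
        case a if a.startsWith("array<") && a.endsWith(">") =>
          ArrayType(parseDataType(a.stripPrefix("array<").stripSuffix(">")))
        case other => sys.error(s"Unsupported data type string: $other")
      }
      ```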
      
      JIRA: https://issues.apache.org/jira/browse/SPARK-6250
      
      Author: Yin Huai <yhuai@databricks.com>
      
      Closes #5078 from yhuai/ddlKeywords and squashes the following commits:
      
      0e66097 [Yin Huai] Special handle struct<>.
      fea6012 [Yin Huai] Style.
      c9733fb [Yin Huai] Create a trait to parse data types.
      94a102ac
    • Venkata Ramana Gollamudi's avatar
      [SPARK-5680][SQL] Sum function on all null values, should return zero · ee569a0c
      Venkata Ramana Gollamudi authored
      ```sql
      SELECT sum('a'), avg('a'), variance('a'), std('a') FROM src;
      ```

      should give the output

      ```
      0.0	NULL	NULL	NULL
      ```

      This fixes hive udaf_number_format.q
      
      Author: Venkata Ramana G <ramana.gollamudi@huawei.com>
      
      Author: Venkata Ramana Gollamudi <ramana.gollamudi@huawei.com>
      
      Closes #4466 from gvramana/sum_fix and squashes the following commits:
      
      42e14d1 [Venkata Ramana Gollamudi] Added comments
      39415c0 [Venkata Ramana Gollamudi] Handled the partitioned Sum expression scenario
      df66515 [Venkata Ramana Gollamudi] code style fix
      4be2606 [Venkata Ramana Gollamudi] Add udaf_number_format to whitelist and golden answer
      330fd64 [Venkata Ramana Gollamudi] fix sum function for all null data
      ee569a0c
    • x1-'s avatar
      [SPARK-5320][SQL]Add statistics method at NoRelation (override super). · 52dd4b2b
      x1- authored
      `NoRelation` had no `statistics` override, despite the superclass's note that a 'LeafNode must override' it. This fixes the issue.
      
      [SPARK-5320: Joins on simple table created using select gives error](https://issues.apache.org/jira/browse/SPARK-5320)
      
      Author: x1- <viva008@gmail.com>
      
      Closes #5105 from x1-/SPARK-5320 and squashes the following commits:
      
      e561aac [x1-] Add statistics method at NoRelation (override super).
      52dd4b2b
  5. Mar 20, 2015
    • Yanbo Liang's avatar
      [SPARK-5821] [SQL] JSON CTAS command should throw error message when delete path failure · e5d2c37c
      Yanbo Liang authored
      When using "CREATE TEMPORARY TABLE AS SELECT" to create a JSON table, we first delete the path file or directory and then generate a new directory with the same name. But if only read permission was granted, the delete fails.
      Here we just throw an error message to let users know what happened.
      ParquetRelation2 may also hit this problem. I think restricting JSONRelation and ParquetRelation2 to directories is more reasonable for access control. Maybe I can do that in follow-up work.
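      The check can be sketched with plain `java.io` (the real code goes through the Hadoop FileSystem API): fail with a clear message when the old output cannot be deleted, instead of silently writing anyway.

      ```scala
      import java.io.File

      // Delete the existing output, or fail loudly if we cannot.
      def clearOutputPath(path: File): Unit = {
        if (path.exists() && !path.delete()) {
          sys.error(s"Unable to clear output path ${path.getPath} prior to writing; check permissions")
        }
      }
      ```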
      
      Author: Yanbo Liang <ybliang8@gmail.com>
      Author: Yanbo Liang <yanbohappy@gmail.com>
      
      Closes #4610 from yanboliang/jsonInsertImprovements and squashes the following commits:
      
      c387fce [Yanbo Liang] fix typos
      42d7fb6 [Yanbo Liang] add unittest & fix output format
      46f0d9d [Yanbo Liang] Update JSONRelation.scala
      e2df8d5 [Yanbo Liang] check path exisit when write
      79f7040 [Yanbo Liang] Update JSONRelation.scala
      e4bc229 [Yanbo Liang] Update JSONRelation.scala
      5a42d83 [Yanbo Liang] JSONRelation CTAS should check if delete is successful
      e5d2c37c
    • Cheng Lian's avatar
      [SPARK-6315] [SQL] Also tries the case class string parser while reading Parquet schema · 937c1e55
      Cheng Lian authored
      When writing Parquet files, Spark 1.1.x persists the schema string into Parquet metadata with the result of `StructType.toString`, which was then deprecated in Spark 1.2 by a schema string in JSON format. But we still need to take the old schema format into account while reading Parquet files.
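      The fallback strategy itself is simple; a self-contained sketch (the two parser arguments stand in for Spark's JSON schema parser and the legacy `StructType.toString` parser): try the current format first, then the old one.

      ```scala
      import scala.util.Try

      // Attempt `primary`; if it throws, fall back to `legacy`.
      def parseWithFallback[A](s: String, primary: String => A, legacy: String => A): A =
        Try(primary(s)).getOrElse(legacy(s))
      ```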
      
      
      Author: Cheng Lian <lian@databricks.com>
      
      Closes #5034 from liancheng/spark-6315 and squashes the following commits:
      
      a182f58 [Cheng Lian] Adds a regression test
      b9c6dbe [Cheng Lian] Also tries the case class string parser while reading Parquet schema
      937c1e55
    • Yanbo Liang's avatar
      [SPARK-5821] [SQL] ParquetRelation2 CTAS should check if delete is successful · bc37c974
      Yanbo Liang authored
      Do the same check as #4610 for ParquetRelation2.
      
      Author: Yanbo Liang <ybliang8@gmail.com>
      
      Closes #5107 from yanboliang/spark-5821-parquet and squashes the following commits:
      
      7092c8d [Yanbo Liang] ParquetRelation2 CTAS should check if delete is successful
      bc37c974
    • MechCoder's avatar
      [SPARK-6025] [MLlib] Add helper method evaluateEachIteration to extract learning curve · 25e271d9
      MechCoder authored
      Added evaluateEachIteration to allow the user to manually extract the error for each iteration of GradientBoosting. The internal optimisation can be dealt with later.
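      A self-contained toy of what per-iteration evaluation produces (weak predictors are plain functions here; the real method works on the boosted-trees model): the error of the partial ensemble after each boosting step.

      ```scala
      def evaluateEachIteration(
          predictors: Seq[Double => Double],  // one weak predictor per iteration
          data: Seq[(Double, Double)]         // (feature, label) pairs
      ): Seq[Double] =
        (1 to predictors.length).map { k =>
          val partial = predictors.take(k)
          // mean absolute error of the summed partial ensemble
          data.map { case (x, y) => math.abs(partial.map(_(x)).sum - y) }.sum / data.size
        }
      ```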
      
      Author: MechCoder <manojkumarsivaraj334@gmail.com>
      
      Closes #4906 from MechCoder/spark-6025 and squashes the following commits:
      
      67146ab [MechCoder] Minor
      352001f [MechCoder] Minor
      6e8aa10 [MechCoder] Made the following changes Used mapPartition instead of map Refactored computeError and unpersisted broadcast variables
      bc99ac6 [MechCoder] Refactor the method and stuff
      dbda033 [MechCoder] [SPARK-6025] Add helper method evaluateEachIteration to extract learning curve
      25e271d9