  1. Jun 11, 2015
    • [SPARK-8286] Rewrite UTF8String in Java and move it into unsafe package. · 7d669a56
      Reynold Xin authored
      Unit test is still in Scala.
      
      Author: Reynold Xin <rxin@databricks.com>
      
      Closes #6738 from rxin/utf8string-java and squashes the following commits:
      
      562dc6e [Reynold Xin] Flag...
      98e600b [Reynold Xin] Another try with encoding setting ..
      cfa6bdf [Reynold Xin] Merge branch 'master' into utf8string-java
      a3b124d [Reynold Xin] Try different UTF-8 encoded characters.
      1ff7c82 [Reynold Xin] Enable UTF-8 encoding.
      82d58cc [Reynold Xin] Reset run-tests.
      2cb3c69 [Reynold Xin] Use utf-8 encoding in set bytes.
      53f8ef4 [Reynold Xin] Hack Jenkins to run one test.
      9a48e8d [Reynold Xin] Fixed runtime compilation error.
      911c450 [Reynold Xin] Moved unit test also to Java.
      4eff7bd [Reynold Xin] Improved unit test coverage.
      8e89a3c [Reynold Xin] Fixed tests.
      77c64bd [Reynold Xin] Fixed string type codegen.
      ffedb62 [Reynold Xin] Code review feedback.
      0967ce6 [Reynold Xin] Fixed import ordering.
      45a123d [Reynold Xin] [SPARK-8286] Rewrite UTF8String in Java and move it into unsafe package.
      7d669a56
    • [SPARK-6511] [docs] Fix example command in hadoop-provided docs. · 9cbdf31e
      Marcelo Vanzin authored
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #6766 from vanzin/SPARK-6511 and squashes the following commits:
      
      49f0f67 [Marcelo Vanzin] [SPARK-6511] [docs] Fix example command in hadoop-provided docs.
      9cbdf31e
    • [SPARK-7444] [TESTS] Eliminate noisy css warn/error logs for UISeleniumSuite · 95690a17
      zsxwing authored
      Eliminate the following noisy logs for `UISeleniumSuite`:
      ```
      15/05/07 10:09:50.196 pool-1-thread-1-ScalaTest-running-UISeleniumSuite WARN DefaultCssErrorHandler: CSS error: 'http://192.168.0.170:4040/static/bootstrap.min.css' [793:167] Error in style rule. (Invalid token "*". Was expecting one of: <EOF>, <S>, <IDENT>, "}", ";".)
      15/05/07 10:09:50.196 pool-1-thread-1-ScalaTest-running-UISeleniumSuite WARN DefaultCssErrorHandler: CSS warning: 'http://192.168.0.170:4040/static/bootstrap.min.css' [793:167] Ignoring the following declarations in this rule.
      15/05/07 10:09:50.197 pool-1-thread-1-ScalaTest-running-UISeleniumSuite WARN DefaultCssErrorHandler: CSS error: 'http://192.168.0.170:4040/static/bootstrap.min.css' [799:325] Error in style rule. (Invalid token "*". Was expecting one of: <EOF>, <S>, <IDENT>, "}", ";".)
      15/05/07 10:09:50.197 pool-1-thread-1-ScalaTest-running-UISeleniumSuite WARN DefaultCssErrorHandler: CSS warning: 'http://192.168.0.170:4040/static/bootstrap.min.css' [799:325] Ignoring the following declarations in this rule.
      15/05/07 10:09:50.198 pool-1-thread-1-ScalaTest-running-UISeleniumSuite WARN DefaultCssErrorHandler: CSS error: 'http://192.168.0.170:4040/static/bootstrap.min.css' [805:18] Error in style rule. (Invalid token "*". Was expecting one of: <EOF>, <S>, <IDENT>, "}", ";".)
      15/05/07 10:09:50.198 pool-1-thread-1-ScalaTest-running-UISeleniumSuite WARN DefaultCssErrorHandler: CSS warning: 'http://192.168.0.170:4040/static/bootstrap.min.css' [805:18] Ignoring the following declarations in this rule.
      ```
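
      The fix (squashed commit 4202728 below) installs a custom CSS error handler. A hedged sketch of what such a handler can look like, assuming HtmlUnit's `DefaultCssErrorHandler` as the base class (the exact code in the suite may differ):

      ```
      import com.gargoylesoftware.htmlunit.DefaultCssErrorHandler
      import org.w3c.css.sac.CSSParseException

      // Swallow CSS errors/warnings originating from the bundled
      // bootstrap.min.css, which HtmlUnit's CSS parser cannot fully parse;
      // everything else still reaches the default handler.
      class SparkUICssErrorHandler extends DefaultCssErrorHandler {
        private val cssWhiteList = List("bootstrap.min.css")

        private def isWhitelisted(uri: String): Boolean = cssWhiteList.exists(uri.endsWith)

        override def warning(e: CSSParseException): Unit =
          if (!isWhitelisted(e.getURI)) super.warning(e)

        override def error(e: CSSParseException): Unit =
          if (!isWhitelisted(e.getURI)) super.error(e)
      }
      ```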
      
      Author: zsxwing <zsxwing@gmail.com>
      
      Closes #5983 from zsxwing/SPARK-7444 and squashes the following commits:
      
      4202728 [zsxwing] Add SparkUICssErrorHandler for all tests
      d1398ad [zsxwing] Merge remote-tracking branch 'origin/master' into SPARK-7444
      7bb7f11 [zsxwing] Merge branch 'master' into SPARK-7444
      a59f40e [zsxwing] Eliminate noisy css warn/error logs for UISeleniumSuite
      95690a17
    • [SPARK-7915] [SQL] Support specifying the column list for target table in CTAS · 040f223c
      Cheng Hao authored
      ```
      create table t1 (a int, b string) as select key, value from src;
      
      desc t1;
      key	int	NULL
      value	string	NULL
      ```
      
      As the example shows, Hive doesn't support specifying a column list for the target table in CTAS. We should either throw an exception explicitly or support the feature; this patch picks the latter, which seems useful and straightforward.
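
      A hedged sketch of the behavior this enables, assuming a `HiveContext` named `hiveContext` (the expected output is inferred from the description above):

      ```
      hiveContext.sql("CREATE TABLE t1 (a INT, b STRING) AS SELECT key, value FROM src")
      hiveContext.sql("DESC t1").show()
      // With the patch, the columns should be reported as
      //   a    int
      //   b    string
      // i.e. the names come from the specified column list rather than the query.
      ```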
      
      Author: Cheng Hao <hao.cheng@intel.com>
      
      Closes #6458 from chenghao-intel/ctas_column and squashes the following commits:
      
      d1fa9b6 [Cheng Hao] bug in unittest
      4e701aa [Cheng Hao] update as feedback
      f305ec1 [Cheng Hao] support specifying the column list for target table in CTAS
      040f223c
    • [SPARK-8310] [EC2] Updates the master branch EC2 versions · c8d551d5
      Shivaram Venkataraman authored
      Will send another PR for `branch-1.4`
      
      Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu>
      
      Closes #6764 from shivaram/SPARK-8310 and squashes the following commits:
      
      d8cd3b3 [Shivaram Venkataraman] This updates the master branch EC2 versions
      c8d551d5
    • [SPARK-8305] [SPARK-8190] [SQL] improve codegen · 1191c3ef
      Davies Liu authored
      This PR fixes a few small issues in codegen:

      1. cast decimal to boolean (see the sketch after this list)
      2. do not inline null literals
      3. improve SpecificRow.equals()
      4. test expressions with their optimized versions
      5. fix compare with BinaryType
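
      For item 1, a hedged, self-contained sketch of the intended cast semantics (the generated Java code differs in form): a decimal casts to true iff it is nonzero.

      ```
      import java.math.BigDecimal

      def decimalToBoolean(d: BigDecimal): Boolean =
        d.compareTo(BigDecimal.ZERO) != 0  // nonzero => true, zero => false
      ```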
      
      cc rxin chenghao-intel
      
      Author: Davies Liu <davies@databricks.com>
      
      Closes #6755 from davies/fix_codegen and squashes the following commits:
      
      ef27343 [Davies Liu] address comments
      6617ea6 [Davies Liu] fix scala tyle
      70b7dda [Davies Liu] improve codegen
      1191c3ef
    • [SPARK-6411] [SQL] [PySpark] support date/datetime with timezone in Python · 424b0075
      Davies Liu authored
      Spark SQL does not support time zones, and Pyrolite does not handle them well either. This patch converts datetimes into POSIX timestamps (avoiding timezone confusion), which is what SQL uses. If a datetime object has no timezone, it is treated as local time.

      The timezone of a datetime in an RDD is lost after one round trip; all datetimes coming back from SQL are local time.

      Because of Pyrolite, datetimes from SQL only have 1-millisecond precision.

      This PR also drops the timezone from dates, converting them to a number of days since the epoch (as used in SQL).
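
      The patch itself is Python, but the conversion rule can be illustrated with a small Scala sketch using `java.time` (names here are illustrative, not from the patch):

      ```
      import java.time.{LocalDateTime, ZoneId}

      // An aware datetime converts through its own zone; a naive one is
      // treated as local time. The result is a POSIX timestamp in seconds,
      // of which only millisecond precision survives (as with Pyrolite).
      def toPosixSeconds(dt: LocalDateTime, zone: Option[ZoneId]): Double = {
        val resolved = zone.getOrElse(ZoneId.systemDefault())
        dt.atZone(resolved).toInstant.toEpochMilli / 1000.0
      }
      ```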
      
      Author: Davies Liu <davies@databricks.com>
      
      Closes #6250 from davies/tzone and squashes the following commits:
      
      44d8497 [Davies Liu] add timezone support for DateType
      99d9d9c [Davies Liu] use int for timestamp
      10aa7ca [Davies Liu] Merge branch 'master' of github.com:apache/spark into tzone
      6a29aa4 [Davies Liu] support datetime with timezone
      424b0075
    • [SPARK-8289] Specify stack size for consistency with Java tests - resolves test failures · 6b68366d
      Adam Roberts authored
      This is a simple change that specifies a stack size of 4096k for Java tests instead of the vendor default (defaults vary between Java vendors). It remedies test failures observed in JavaALSSuite with IBM and Oracle Java, whose default stack sizes are lower than OpenJDK's. 4096k is a suitable default with which the tests pass on every Java vendor tested. The alternative would be to reduce the number of iterations in the test (no failures were observed with 5 iterations instead of 15).
      
      -Xss works with Oracle's HotSpot VM, IBM's J9 VM and OpenJDK (IcedTea).
      
      I have ensured this does not have any negative implications for other tests.
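
      On the SBT side, the change presumably amounts to a test JVM option along these lines (a sketch assuming forked tests; the exact place in Spark's build may differ):

      ```
      // Forked test JVMs pick up javaOptions; -Xss pins the stack size
      // regardless of the Java vendor's default.
      fork in Test := true
      javaOptions in Test += "-Xss4096k"
      ```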
      
      Author: Adam Roberts <aroberts@uk.ibm.com>
      Author: a-roberts <aroberts@uk.ibm.com>
      
      Closes #6727 from a-roberts/IncJavaStackSize and squashes the following commits:
      
      ab40aea [Adam Roberts] Specify stack size for SBT builds
      5032d8d [a-roberts] Update pom.xml
      6b68366d
    • [HOTFIX] Fixing errors in name mappings · e84545fa
      Patrick Wendell authored
      e84545fa
  2. Jun 10, 2015
    • a777eb04
    • [SPARK-8217] [SQL] math function log2 · 2758ff0a
      Daoyuan Wang authored
      Author: Daoyuan Wang <daoyuan.wang@intel.com>
      
      This patch had conflicts when merged, resolved by
      Committer: Reynold Xin <rxin@databricks.com>
      
      Closes #6718 from adrian-wang/udflog2 and squashes the following commits:
      
      3909f48 [Daoyuan Wang] math function: log2
      2758ff0a
    • [SPARK-8248][SQL] string function: length · 9fe3adcc
      Cheng Hao authored
      Author: Cheng Hao <hao.cheng@intel.com>
      
      Closes #6724 from chenghao-intel/length and squashes the following commits:
      
      aaa3c31 [Cheng Hao] revert the additional change
      97148a9 [Cheng Hao] remove the codegen testing temporally
      ae08003 [Cheng Hao] update the comments
      1eb1fd1 [Cheng Hao] simplify the code as commented
      3e92d32 [Cheng Hao] use the selectExpr in unit test intead of SQLQuery
      3c729aa [Cheng Hao] fix bug for constant null value in codegen
      3641f06 [Cheng Hao] keep the length() method for registered function
      8e30171 [Cheng Hao] update the code as comment
      db604ae [Cheng Hao] Add code gen support
      548d2ef [Cheng Hao] register the length()
      09a0738 [Cheng Hao] add length support
      9fe3adcc
    • [SPARK-8164] transformExpressions should support nested expression sequence · 4e42842e
      Wenchen Fan authored
      Currently we only support `Seq[Expression]`, we should handle cases like `Seq[Seq[Expression]]` so that we can remove the unnecessary `GroupExpression`.
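
      A hedged, self-contained sketch of the recursion this requires (toy types; the real code walks a TreeNode's constructor arguments):

      ```
      sealed trait Expression
      case class Lit(v: Int) extends Expression

      def transform(e: Expression): Expression = e  // stand-in for the real rule

      // Recurse into sequences so that both Seq[Expression] and
      // Seq[Seq[Expression]] have their expressions transformed.
      def recurse(arg: Any): Any = arg match {
        case e: Expression => transform(e)
        case seq: Seq[_]   => seq.map(recurse)
        case other         => other
      }
      ```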
      
      Author: Wenchen Fan <cloud0fan@outlook.com>
      
      Closes #6706 from cloud-fan/clean and squashes the following commits:
      
      60a1193 [Wenchen Fan] support nested expression sequence and remove GroupExpression
      4e42842e
    • [SPARK-8285] [SQL] CombineSum should be calculated as unlimited decimal first · 6a47114b
      navis.ryu authored
      ```
      case cs @ CombineSum(expr) =>
        val calcType = expr.dataType
          expr.dataType match {
            case DecimalType.Fixed(_, _) =>
              DecimalType.Unlimited
            case _ =>
              expr.dataType
          }
      ```
      calcType is always expr.dataType: the match expression that follows is a separate statement whose result is discarded. Credit belongs entirely to IntelliJ.
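
      The fix is presumably to let the match feed the val:

      ```
      case cs @ CombineSum(expr) =>
        val calcType = expr.dataType match {
          case DecimalType.Fixed(_, _) =>
            DecimalType.Unlimited
          case _ =>
            expr.dataType
        }
      ```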
      
      Author: navis.ryu <navis@apache.org>
      
      Closes #6736 from navis/SPARK-8285 and squashes the following commits:
      
      20382c1 [navis.ryu] [SPARK-8285] [SQL] CombineSum should be calculated as unlimited decimal first
      6a47114b
    • [SPARK-8189] [SQL] use Long for TimestampType in SQL · 37719e0c
      Davies Liu authored
      This PR changes the internal type of TimestampType to Long for efficiency, which means precision below 100ns will be lost.
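
      A hedged sketch of the 100ns-tick encoding this implies (helper names are assumptions, not the patch's API):

      ```
      import java.sql.Timestamp

      // Whole seconds become 10,000,000 ticks each; sub-second nanos are
      // truncated to 100ns resolution, which is the precision loss noted above.
      def toTicks(t: Timestamp): Long =
        t.getTime / 1000 * 10000000L + t.getNanos / 100

      def fromTicks(ticks: Long): Timestamp = {
        val ts = new Timestamp(ticks / 10000000L * 1000)
        ts.setNanos((ticks % 10000000L).toInt * 100)
        ts
      }
      ```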
      
      Author: Davies Liu <davies@databricks.com>
      
      Closes #6733 from davies/timestamp and squashes the following commits:
      
      d9565fa [Davies Liu] remove print
      65cf2f1 [Davies Liu] fix Timestamp in SparkR
      86fecfb [Davies Liu] disable two timestamp tests
      8f77ee0 [Davies Liu] fix scala style
      246ee74 [Davies Liu] address comments
      309d2e1 [Davies Liu] use Long for TimestampType in SQL
      37719e0c
    • [SPARK-8200] [MLLIB] Check for empty RDDs in StreamingLinearAlgorithm · b928f543
      Paavo authored
      Test cases for both StreamingLinearRegression and StreamingLogisticRegression, and code fix.
      
      Edit:
      This contribution is my original work and I license the work to the project under the project's open source license.
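
      The essence of the fix is a guard in the training loop; a hedged fragment (the surrounding names follow StreamingLinearAlgorithm's shape, not the exact patch):

      ```
      data.foreachRDD { (rdd, time) =>
        if (!rdd.isEmpty()) {  // skip empty micro-batches instead of failing
          model = Some(algorithm.run(rdd, model.get.weights))
        }
      }
      ```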
      
      Author: Paavo <pparkkin@gmail.com>
      
      Closes #6713 from pparkkin/streamingmodel-empty-rdd and squashes the following commits:
      
      ff5cd78 [Paavo] Update strings to use interpolation.
      db234cf [Paavo] Use !rdd.isEmpty.
      54ad89e [Paavo] Test case for empty stream.
      393e36f [Paavo] Ignore empty RDDs.
      0bfc365 [Paavo] Test case for empty stream.
      b928f543
    • [SPARK-2774] Set preferred locations for reduce tasks · 96a7c888
      Shivaram Venkataraman authored
      Set preferred locations for reduce tasks.
      The basic design is that we maintain a map from reducerId to a list of (sizes, locations) for each
      shuffle. We then set the preferred locations to be any machines that have 20% or more of the output
      that needs to be read by the reduce task. This results in at most 5 preferred locations for
      each reduce task.

      Selecting the preferred locations involves O(# map tasks * # reduce tasks) computation, so we
      restrict this feature to cases where we have fewer than 1000 map tasks and 1000 reduce tasks.
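
      A self-contained sketch of the selection rule just described (names are illustrative): a host qualifies when it holds at least 20% of the bytes the reducer must read, so at most five hosts can qualify.

      ```
      def preferredLocations(
          bytesByHost: Map[String, Long],
          threshold: Double = 0.2): Seq[String] = {
        val total = bytesByHost.values.sum.toDouble
        if (total == 0) Seq.empty
        else bytesByHost.collect {
          case (host, bytes) if bytes / total >= threshold => host
        }.toSeq
      }
      ```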
      
      Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu>
      
      Closes #6652 from shivaram/reduce-locations and squashes the following commits:
      
      492e25e [Shivaram Venkataraman] Remove unused import
      2ef2d39 [Shivaram Venkataraman] Address code review comments
      897a914 [Shivaram Venkataraman] Remove unused hash map
      f5be578 [Shivaram Venkataraman] Use fraction of map outputs to determine locations Also removes caching of preferred locations to make the API cleaner
      68bc29e [Shivaram Venkataraman] Fix line length
      1090b58 [Shivaram Venkataraman] Change flag name
      77ce7d8 [Shivaram Venkataraman] Merge branch 'master' of https://github.com/apache/spark into reduce-locations
      e5d56bd [Shivaram Venkataraman] Add flag to turn off locality for shuffle deps
      6cfae98 [Shivaram Venkataraman] Filter out zero blocks, rename variables
      9d5831a [Shivaram Venkataraman] Address some more comments
      8e31266 [Shivaram Venkataraman] Fix style
      0df3180 [Shivaram Venkataraman] Address code review comments
      e7d5449 [Shivaram Venkataraman] Fix merge issues
      ad7cb53 [Shivaram Venkataraman] Merge branch 'master' of https://github.com/apache/spark into reduce-locations
      df14cee [Shivaram Venkataraman] Merge branch 'master' of https://github.com/apache/spark into reduce-locations
      5093aea [Shivaram Venkataraman] Merge branch 'master' of https://github.com/apache/spark into reduce-locations
      0171d3c [Shivaram Venkataraman] Merge branch 'master' of https://github.com/apache/spark into reduce-locations
      bc4dfd6 [Shivaram Venkataraman] Merge branch 'master' of https://github.com/apache/spark into reduce-locations
      774751b [Shivaram Venkataraman] Fix bug introduced by line length adjustment
      34d0283 [Shivaram Venkataraman] Fix style issues
      3b464b7 [Shivaram Venkataraman] Set preferred locations for reduce tasks This is another attempt at #1697 addressing some of the earlier concerns. This adds a couple of thresholds based on number map and reduce tasks beyond which we don't use preferred locations for reduce tasks.
      96a7c888
    • [SPARK-8273] Driver hangs up when yarn shutdown in client mode · 5014d0ed
      WangTaoTheTonic authored
      In client mode, if YARN is shut down while a Spark application is running, the application hangs after several retries (default: 30) because the exception thrown by YarnClientImpl cannot be caught at the upper level. We should exit in that case so that the user is aware of it.

      The exception we want to catch is [here](https://github.com/apache/hadoop/blob/branch-2.7.0/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/retry/RetryInvocationHandler.java#L122), and I fix it referring to [MR](https://github.com/apache/hadoop/blob/branch-2.7.0/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/ClientServiceDelegate.java#L320).
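
      A heavily hedged sketch of the idea only (the exception type and exit path in the actual patch may differ): catch the failure that bubbles out of the YARN client and exit instead of hanging.

      ```
      // yarnClient: org.apache.hadoop.yarn.client.api.YarnClient
      try {
        yarnClient.getApplicationReport(appId)
      } catch {
        case e: java.io.IOException =>
          // logError comes from Spark's Logging trait.
          logError("Failed to contact YARN; stopping the application.", e)
          System.exit(1)
      }
      ```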
      
      Author: WangTaoTheTonic <wangtao111@huawei.com>
      
      Closes #6717 from WangTaoTheTonic/SPARK-8273 and squashes the following commits:
      
      28752d6 [WangTaoTheTonic] catch the throwed exception
      5014d0ed
    • [SPARK-8290] spark class command builder need read SPARK_JAVA_OPTS and SPARK_DRIVER_MEMORY properly · cb871c44
      WangTaoTheTonic authored
      SPARK_JAVA_OPTS was missed when the launcher was reworked; we should add it back so that processes launched by spark-class can read it properly. The same goes for `SPARK_DRIVER_MEMORY`.
      
      The missing part is [here](https://github.com/apache/spark/blob/1c30afdf94b27e1ad65df0735575306e65d148a1/bin/spark-class#L97).
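
      A hedged Scala illustration of the intended lookup order (the real launcher code is Java, and its precedence may differ):

      ```
      val props = Map("spark.driver.memory" -> "2g")         // stand-in for loaded Spark properties
      val driverMemory = sys.env.get("SPARK_DRIVER_MEMORY")  // env var restored by this patch
        .orElse(props.get("spark.driver.memory"))
        .getOrElse("1g")
      val javaOpts = sys.env.get("SPARK_JAVA_OPTS").toSeq    // also restored
      ```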
      
      Author: WangTaoTheTonic <wangtao111@huawei.com>
      Author: Tao Wang <wangtao111@huawei.com>
      
      Closes #6741 from WangTaoTheTonic/SPARK-8290 and squashes the following commits:
      
      bd89f0f [Tao Wang] make sure the memory setting is right too
      e313520 [WangTaoTheTonic] spark class command builder need read SPARK_JAVA_OPTS
      cb871c44
    • [SPARK-7261] [CORE] Change default log level to WARN in the REPL · 80043e9e
      zsxwing authored
      1. Add `log4j-defaults-repl.properties`, which sets the log level to WARN.
      2. When logging is initialized, check whether we are inside the REPL; if so, use `log4j-defaults-repl.properties` (see the sketch after the sample output below).
      3. Print the following information when `log4j-defaults-repl.properties` is used:
      ```
      Using Spark's repl log4j profile: org/apache/spark/log4j-defaults-repl.properties
      To adjust logging level use sc.setLogLevel("INFO")
      ```
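
      A hedged sketch of the check in step 2 (the REPL-detection heuristic and the helper shape are assumptions, not the patch's exact code):

      ```
      import org.apache.log4j.PropertyConfigurator

      // Detection heuristic (assumption): look for the REPL's classes on the stack.
      val insideRepl = Thread.currentThread().getStackTrace
        .exists(_.getClassName.startsWith("org.apache.spark.repl"))

      val defaultLogProps =
        if (insideRepl) "org/apache/spark/log4j-defaults-repl.properties"
        else "org/apache/spark/log4j-defaults.properties"

      Option(getClass.getClassLoader.getResource(defaultLogProps))
        .foreach(url => PropertyConfigurator.configure(url))
      ```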
      
      Author: zsxwing <zsxwing@gmail.com>
      
      Closes #6734 from zsxwing/log4j-repl and squashes the following commits:
      
      3835eff [zsxwing] Change default log level to WARN in the REPL
      80043e9e
    • [SPARK-7527] [CORE] Fix createNullValue to return the correct null values and REPL mode detection · e90c9d92
      zsxwing authored
      The root cause of SPARK-7527 is that `createNullValue` returns an incompatible value `Byte(0)` for `char` and `boolean`.

      This PR fixes that, corrects the class name of the main class used for REPL mode detection, and adds a unit test to demonstrate it.
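
      A hedged sketch of the corrected method (the real one lives in ClosureCleaner; exact cases may differ): each primitive class gets a boxed zero value of the matching type, instead of `Byte(0)` for everything.

      ```
      def createNullValue(cls: Class[_]): AnyRef = cls match {
        case java.lang.Boolean.TYPE   => java.lang.Boolean.FALSE
        case java.lang.Character.TYPE => java.lang.Character.valueOf(0.toChar)
        case java.lang.Byte.TYPE      => java.lang.Byte.valueOf(0.toByte)
        case java.lang.Short.TYPE     => java.lang.Short.valueOf(0.toShort)
        case java.lang.Integer.TYPE   => java.lang.Integer.valueOf(0)
        case java.lang.Long.TYPE      => java.lang.Long.valueOf(0L)
        case java.lang.Float.TYPE     => java.lang.Float.valueOf(0f)
        case java.lang.Double.TYPE    => java.lang.Double.valueOf(0d)
        case _                        => null  // reference types: null is already compatible
      }
      ```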
      
      Author: zsxwing <zsxwing@gmail.com>
      
      Closes #6735 from zsxwing/SPARK-7527 and squashes the following commits:
      
      bbdb271 [zsxwing] Use pattern match in createNullValue
      b0a0e7e [zsxwing] Remove the noisy in the test output
      903e269 [zsxwing] Remove the code for Utils.isInInterpreter == false
      5f92dc1 [zsxwing] Fix createNullValue to return the correct null values and REPL mode detection
      e90c9d92
    • [SPARK-7756] CORE RDDOperationScope fix for IBM Java · 19e30b48
      Adam Roberts authored
      IBM Java has an extra method in the output of getStackTrace(): "getStackTraceImpl", a native method. This causes two tests in "DStreamScopeSuite" to fail when running with IBM Java, because "getStackTrace" is returned as the method name found instead of "map" or "filter". This commit addresses the issue by using dropWhile: given that our current method is withScope, we look for the next method that isn't ours, ignoring however many methods come before us in the stack trace (e.g. getStackTrace).
      
      IBM:
      ```
      java.lang.Thread.getStackTraceImpl(Native Method)
      java.lang.Thread.getStackTrace(Thread.java:1117)
      org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:104)
      ```

      Oracle:
      ```
      PRINTING STACKTRACE!!!
      java.lang.Thread.getStackTrace(Thread.java:1552)
      org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:106)
      ```
      
      I've tested this with Oracle and IBM Java, no side effects for other tests introduced.
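
      A self-contained sketch of the dropWhile approach described above (matching the squashed commits; variable names are illustrative):

      ```
      // Skip frames until we reach our own method, then take the first frame
      // that is no longer ours, regardless of how many JVM-internal frames
      // (getStackTrace, getStackTraceImpl, ...) precede it.
      val ourMethodName = "withScope"
      val callerMethodName = Thread.currentThread().getStackTrace
        .dropWhile(_.getMethodName != ourMethodName)
        .find(_.getMethodName != ourMethodName)
        .map(_.getMethodName)
      ```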
      
      Author: Adam Roberts <aroberts@uk.ibm.com>
      Author: a-roberts <aroberts@uk.ibm.com>
      
      Closes #6740 from a-roberts/RDDScopeStackCrawlFix and squashes the following commits:
      
      13ce390 [Adam Roberts] Ensure consistency with String equality checking
      a4fc0e0 [a-roberts] Update RDDOperationScope.scala
      19e30b48
    • [SPARK-8282] [SPARKR] Make number of threads used in RBackend configurable · 30ebf1a2
      Hossein authored
      Read number of threads for RBackend from configuration.
      
      [SPARK-8282] #comment Linking with JIRA
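
      A hedged sketch of the change (the config key and default value are assumptions):

      ```
      import io.netty.channel.nio.NioEventLoopGroup
      import org.apache.spark.SparkConf

      val conf = new SparkConf()
      val numRBackendThreads = conf.getInt("spark.r.numRBackendThreads", 2)
      val bossGroup = new NioEventLoopGroup(numRBackendThreads)  // RBackend's Netty event loop
      ```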
      
      Author: Hossein <hossein@databricks.com>
      
      Closes #6730 from falaki/SPARK-8282 and squashes the following commits:
      
      33b3d98 [Hossein] Documented new config parameter
      70f2a9c [Hossein] Fixing import
      ec44225 [Hossein] Read number of threads for RBackend from configuration
      30ebf1a2
    • [SPARK-5479] [YARN] Handle --py-files correctly in YARN. · 38112905
      Marcelo Vanzin authored
      The bug description is a little misleading: the actual issue is that
      .py files are not handled correctly when distributed by YARN. They're
      added to "spark.submit.pyFiles", which, when processed by context.py,
      explicitly whitelists certain extensions (see PACKAGE_EXTENSIONS),
      and that does not include .py files.
      
      On top of that, archives were not handled at all! They made it to the
      driver's python path, but never made it to executors, since the mechanism
      used to propagate their location (spark.submit.pyFiles) only works on
      the driver side.
      
      So, instead, ignore "spark.submit.pyFiles" and just build PYTHONPATH
      correctly for both driver and executors. Individual .py files are
      placed in a subdirectory of the container's local dir in the cluster,
      which is then added to the python path. Archives are added directly.
      
      The change, as a side effect, ends up solving the symptom described
      in the bug. The issue was not that the files were not being distributed,
      but that they were never made visible to the python application
      running under Spark.
      
      Also included is a proper unit test for running python on YARN, which
      broke in several different ways with the previous code.
      
      A short walkthrough of the changes:
      - SparkSubmit does not try to be smart about how YARN handles python
        files anymore. It just passes down the configs to the YARN client
        code.
      - The YARN client distributes python files and archives differently,
        placing the files in a subdirectory.
      - The YARN client now sets PYTHONPATH for the processes it launches (see
        the sketch after this list); to properly handle different locations, it
        uses YARN's support for embedding env variables, so to avoid YARN
        expanding those at the wrong time, SparkConf is now propagated to the AM
        using a conf file instead of command line options.
      - Because the Client initialization code is a maze of implicit
        dependencies, some code needed to be moved around to make sure
        all needed state was available when the code ran.
      - The pyspark tests in YarnClusterSuite now actually distribute and try
        to use both a python file and an archive containing a different python
        module. Also added a yarn-client tests for completeness.
      - I cleaned up some of the code around distributing files to YARN, to
        avoid adding more copied & pasted code to handle the new files being
        distributed.
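
      A hedged sketch of the resulting PYTHONPATH construction (the subdirectory name and the `{{PWD}}` expansion token are assumptions); the entries are left unexpanded so YARN substitutes the container directory at launch time:

      ```
      val env = scala.collection.mutable.Map[String, String]()
      val pythonPath = Seq(
        "{{PWD}}/__pyfiles__",  // localized subdirectory holding the individual .py files
        "{{PWD}}/deps.zip"      // an archive, added to the path directly
      ).mkString(":")
      env("PYTHONPATH") = pythonPath  // expanded by YARN when the container launches
      ```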
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #6360 from vanzin/SPARK-5479 and squashes the following commits:
      
      bcaf7e6 [Marcelo Vanzin] Feedback.
      c47501f [Marcelo Vanzin] Fix yarn-client mode.
      46b1d0c [Marcelo Vanzin] Merge branch 'master' into SPARK-5479
      c743778 [Marcelo Vanzin] Only pyspark cares about python archives.
      c8e5a82 [Marcelo Vanzin] Actually run pyspark in client mode.
      705571d [Marcelo Vanzin] Move some code to the YARN module.
      1dd4d0c [Marcelo Vanzin] Review feedback.
      71ee736 [Marcelo Vanzin] Merge branch 'master' into SPARK-5479
      220358b [Marcelo Vanzin] Scalastyle.
      cdbb990 [Marcelo Vanzin] Merge branch 'master' into SPARK-5479
      7fe3cd4 [Marcelo Vanzin] No need to distribute primary file to executors.
      09045f1 [Marcelo Vanzin] Style.
      943cbf4 [Marcelo Vanzin] [SPARK-5479] [yarn] Handle --py-files correctly in YARN.
      38112905
    • [SQL] [MINOR] Fixes a minor Java example error in SQL programming guide · 8f7308f9
      Cheng Lian authored
      Author: Cheng Lian <lian@databricks.com>
      
      Closes #6749 from liancheng/java-sample-fix and squashes the following commits:
      
      5b44585 [Cheng Lian] Fixes a minor Java example error in SQL programming guide
      8f7308f9
    • [SPARK-7996] Deprecate the developer api SparkEnv.actorSystem · 2b550a52
      Ilya Ganelin authored
      Changed `SparkEnv.actorSystem` to a function so that we can use the deprecated flag with it, and added a deprecation message.
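
      A hedged fragment of the final shape (per the squashed commits it was ultimately restored to a val carrying the annotation; the message and version string are assumptions):

      ```
      import akka.actor.ActorSystem

      // Inside SparkEnv; _actorSystem is the private backing field.
      @deprecated("Actor system is no longer supported as of 1.4.0", "1.4.0")
      val actorSystem: ActorSystem = _actorSystem
      ```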
      
      Author: Ilya Ganelin <ilya.ganelin@capitalone.com>
      
      Closes #6731 from ilganeli/SPARK-7996 and squashes the following commits:
      
      be43817 [Ilya Ganelin] Restored to val
      9ed89e7 [Ilya Ganelin] Added a version info for deprecation
      9610b08 [Ilya Ganelin] Converted actorSystem to function and added deprecated flag
      2b550a52
    • [SPARK-8215] [SPARK-8212] [SQL] add leaf math expression for e and pi · c6ba7cca
      Daoyuan Wang authored
      Author: Daoyuan Wang <daoyuan.wang@intel.com>
      
      Closes #6716 from adrian-wang/epi and squashes the following commits:
      
      e2e8dbd [Daoyuan Wang] move tests
      11b351c [Daoyuan Wang] add tests and remove pu
      db331c9 [Daoyuan Wang] py style
      599ddd8 [Daoyuan Wang] add py
      e6783ef [Daoyuan Wang] register function
      82d426e [Daoyuan Wang] add function entry
      dbf3ab5 [Daoyuan Wang] add PI and E
      c6ba7cca
    • [SPARK-7886] Added unit test for HAVING aggregate pushdown. · e90035e6
      Reynold Xin authored
      This is a followup to #6712.
      
      Author: Reynold Xin <rxin@databricks.com>
      
      Closes #6739 from rxin/6712-followup and squashes the following commits:
      
      fd9acfb [Reynold Xin] [SPARK-7886] Added unit test for HAVING aggregate pushdown.
      e90035e6
    • [SPARK-7886] Use FunctionRegistry for built-in expressions in HiveContext. · 57c60c5b
      Reynold Xin authored
      This builds on #6710 and also uses FunctionRegistry for function lookup in HiveContext.
      
      Author: Reynold Xin <rxin@databricks.com>
      
      Closes #6712 from rxin/udf-registry-hive and squashes the following commits:
      
      f4c2df0 [Reynold Xin] Fixed style violation.
      0bd4127 [Reynold Xin] Fixed Python UDFs.
      f9a0378 [Reynold Xin] Disable one more test.
      5609494 [Reynold Xin] Disable some failing tests.
      4efea20 [Reynold Xin] Don't check children resolved for UDF resolution.
      2ebe549 [Reynold Xin] Removed more hardcoded functions.
      aadce78 [Reynold Xin] [SPARK-7886] Use FunctionRegistry for built-in expressions in HiveContext.
      57c60c5b