  1. Apr 06, 2015
  2. Apr 05, 2015
    • zsxwing's avatar
      [SPARK-6602][Core] Update MapOutputTrackerMasterActor to MapOutputTrackerMasterEndpoint · 0b5d028a
      zsxwing authored
      This is the second PR for [SPARK-6602]. It updates MapOutputTrackerMasterActor and its unit tests.
      
      cc rxin
      
      Author: zsxwing <zsxwing@gmail.com>
      
      Closes #5371 from zsxwing/rpc-rewrite-part2 and squashes the following commits:
      
      fcf3816 [zsxwing] Fix the code style
      4013a22 [zsxwing] Add doc for uncaught exceptions in RpcEnv
      93c6c20 [zsxwing] Add an example of UnserializableException and add ErrorMonitor to monitor errors from Akka
      134fe7b [zsxwing] Update MapOutputTrackerMasterActor to MapOutputTrackerMasterEndpoint
      0b5d028a
    • lewuathe's avatar
      [SPARK-6262][MLLIB]Implement missing methods for MultivariateStatisticalSummary · acffc434
      lewuathe authored
      Add the methods below to PySpark's MultivariateStatisticalSummary:
      - normL1
      - normL2
      
      Author: lewuathe <lewuathe@me.com>
      
      Closes #5359 from Lewuathe/SPARK-6262 and squashes the following commits:
      
      cbe439e [lewuathe] Implement missing methods for MultivariateStatisticalSummary
      acffc434
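For reference, the two methods compute column-wise norms of the summarized vectors. A minimal plain-Python sketch of the arithmetic (assuming normL1 is the per-column sum of absolute values and normL2 the per-column Euclidean norm, as in MLlib's summarizer):

```python
import math

def norm_l1(rows):
    # Column-wise L1 norm: sum of absolute values in each column.
    cols = zip(*rows)
    return [sum(abs(x) for x in col) for col in cols]

def norm_l2(rows):
    # Column-wise L2 norm: Euclidean norm of each column.
    cols = zip(*rows)
    return [math.sqrt(sum(x * x for x in col)) for col in cols]
```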
  3. Apr 04, 2015
    • zsxwing's avatar
      [SPARK-6602][Core] Replace direct use of Akka with Spark RPC interface - part 1 · f15806a8
      zsxwing authored
      This PR replaces the following `Actor`s with `RpcEndpoint`s:
      
      1. HeartbeatReceiver
      2. ExecutorActor
      3. BlockManagerMasterActor
      4. BlockManagerSlaveActor
      5. CoarseGrainedExecutorBackend and subclasses
      6. CoarseGrainedSchedulerBackend.DriverActor
      
      This is the first PR. I will split the work of SPARK-6602 to several PRs for code review.
      
      Author: zsxwing <zsxwing@gmail.com>
      
      Closes #5268 from zsxwing/rpc-rewrite and squashes the following commits:
      
      287e9f8 [zsxwing] Fix the code style
      26c56b7 [zsxwing] Merge branch 'master' into rpc-rewrite
      9cc825a [zsxwing] Rmove setupThreadSafeEndpoint and add ThreadSafeRpcEndpoint
      30a9036 [zsxwing] Make self return null after stopping RpcEndpointRef; fix docs and error messages
      705245d [zsxwing] Fix some bugs after rebasing the changes on the master
      003cf80 [zsxwing] Update CoarseGrainedExecutorBackend and CoarseGrainedSchedulerBackend to use RpcEndpoint
      7d0e6dc [zsxwing] Update BlockManagerSlaveActor to use RpcEndpoint
      f5d6543 [zsxwing] Update BlockManagerMaster to use RpcEndpoint
      30e3f9f [zsxwing] Update ExecutorActor to use RpcEndpoint
      478b443 [zsxwing] Update HeartbeatReceiver to use RpcEndpoint
      f15806a8
    • Liang-Chi Hsieh's avatar
      [SPARK-6607][SQL] Check invalid characters for Parquet schema and show error messages · 7bca62f7
      Liang-Chi Hsieh authored
      '(' and ')' are special characters used in Parquet schemas for type annotation. When we run an aggregation query, we obtain attribute names such as "MAX(a)".
      
      If we directly store the generated DataFrame as a Parquet file, reading and parsing the stored schema string fails.
      
      Several approaches could solve this. This PR takes the simplest one: handling attribute names before generating the Parquet schema based on these attributes.
      
      Another possible approach would be to rename all aggregation expression names from "func(column)" to "func[column]".
      
      Author: Liang-Chi Hsieh <viirya@gmail.com>
      
      Closes #5263 from viirya/parquet_aggregation_name and squashes the following commits:
      
      2d70542 [Liang-Chi Hsieh] Address comment.
      463dff4 [Liang-Chi Hsieh] Instead of replacing special chars, showing error message to user to suggest using Alias.
      1de001d [Liang-Chi Hsieh] Replace special characters '(' and ')' of Parquet schema.
      7bca62f7
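A hypothetical sketch of the check this PR adds (the exact character set and message wording are assumptions, not Spark's actual code):

```python
import re

# Reject attribute names containing characters that are special in a
# Parquet schema string, and suggest using an alias instead (the
# approach the final commit settled on).
_INVALID_CHARS = re.compile(r"[ ,;{}()\n\t=]")

def check_field_name(name: str) -> None:
    if _INVALID_CHARS.search(name):
        raise ValueError(
            f"Attribute name {name!r} contains invalid character(s); "
            "please use an alias to rename it")
```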
    • Yin Huai's avatar
      [SQL] Use path.makeQualified in newParquet. · da25c86d
      Yin Huai authored
      Author: Yin Huai <yhuai@databricks.com>
      
      Closes #5353 from yhuai/wrongFS and squashes the following commits:
      
      849603b [Yin Huai] Not use deprecated method.
      6d6ae34 [Yin Huai] Use path.makeQualified.
      da25c86d
  4. Apr 03, 2015
    • Davies Liu's avatar
      [SPARK-6700] disable flaky test · 9b40c17a
      Davies Liu authored
      Author: Davies Liu <davies@databricks.com>
      
      Closes #5356 from davies/flaky and squashes the following commits:
      
      08955f4 [Davies Liu] disable flaky test
      9b40c17a
    • Liang-Chi Hsieh's avatar
      [SPARK-6647][SQL] Make trait StringComparison as BinaryPredicate and fix unit... · 26b415e1
      Liang-Chi Hsieh authored
      [SPARK-6647][SQL] Make trait StringComparison as BinaryPredicate and fix unit tests of string data source Filter
      
      Trait `StringComparison` is currently a `BinaryExpression`. In fact, it should be a `BinaryPredicate`.
      
      By making `StringComparison` a `BinaryPredicate`, we can throw an error when an `expressions.Predicate` can't be translated to a data source `Filter` in the function `selectFilters`.
      
      Without this change, because we wrap a `Filter` outside the scanned results in `pruneFilterProjectRaw`, we can't detect that something went wrong when translating predicates to filters in `selectFilters`.
      
      The unit test of #5285 demonstrates this problem: in that PR, even though `expressions.Contains` is not properly translated to `sources.StringContains`, the filtering is still performed by the `Filter`, so the test passes.
      
      Of course, with this change, every `expressions.Predicate` class needs a corresponding data source `Filter`.
      
      There is also a small bug in `FilteredScanSuite` when applying a `StringEndsWith` filter; this PR fixes it as well.
      
      Author: Liang-Chi Hsieh <viirya@gmail.com>
      
      Closes #5309 from viirya/translate_predicate and squashes the following commits:
      
      b176385 [Liang-Chi Hsieh] Address comment.
      275a493 [Liang-Chi Hsieh] More properly test for StringStartsWith, StringEndsWith and StringContains.
      caf2347 [Liang-Chi Hsieh] Make trait StringComparison as BinaryPredicate and throw error when Predicate can't translate to data source Filter.
      26b415e1
    • Marcelo Vanzin's avatar
      [SPARK-6688] [core] Always use resolved URIs in EventLoggingListener. · 14632b79
      Marcelo Vanzin authored
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #5340 from vanzin/SPARK-6688 and squashes the following commits:
      
      ccfddd9 [Marcelo Vanzin] Resolve at the source.
      20d2a34 [Marcelo Vanzin] [SPARK-6688] [core] Always use resolved URIs in EventLoggingListener.
      14632b79
    • Reynold Xin's avatar
      Closes #3158 · ffe8cc9a
      Reynold Xin authored
      ffe8cc9a
    • zsxwing's avatar
      [SPARK-6640][Core] Fix the race condition of creating HeartbeatReceiver and... · 88504b75
      zsxwing authored
      [SPARK-6640][Core] Fix the race condition of creating HeartbeatReceiver and retrieving HeartbeatReceiver
      
      This PR moved the code of creating `HeartbeatReceiver` above the code of creating `schedulerBackend` to resolve the race condition.
      
      Author: zsxwing <zsxwing@gmail.com>
      
      Closes #5306 from zsxwing/SPARK-6640 and squashes the following commits:
      
      840399d [zsxwing] Don't send TaskScheduler through Akka
      a90616a [zsxwing] Fix docs
      dd202c7 [zsxwing] Fix typo
      d7c250d [zsxwing] Fix the race condition of creating HeartbeatReceiver and retrieving HeartbeatReceiver
      88504b75
    • Ilya Ganelin's avatar
      [SPARK-6492][CORE] SparkContext.stop() can deadlock when DAGSchedulerEventProcessLoop dies · 2c43ea38
      Ilya Ganelin authored
      I've added a timeout and retry loop around the SparkContext shutdown code that should fix this deadlock. If a SparkContext shutdown is in progress when another thread comes knocking, it will wait for 10 seconds for the lock, then fall through where the outer loop will re-submit the request.
      
      Author: Ilya Ganelin <ilya.ganelin@capitalone.com>
      
      Closes #5277 from ilganeli/SPARK-6492 and squashes the following commits:
      
      8617a7e [Ilya Ganelin] Resolved merge conflict
      2fbab66 [Ilya Ganelin] Added MIMA Exclude
      a0e2c70 [Ilya Ganelin] Deleted stale imports
      fa28ce7 [Ilya Ganelin] reverted to just having a single stopped
      76fc825 [Ilya Ganelin] Updated to use atomic booleans instead of the synchronized vars
      6e8a7f7 [Ilya Ganelin] Removing unecessary null check for now since i'm not fixing stop ordering yet
      cdf7073 [Ilya Ganelin] [SPARK-6492] Moved stopped=true back to the start of the shutdown sequence so this can be addressed in a seperate PR
      7fb795b [Ilya Ganelin] Spacing
      b7a0c5c [Ilya Ganelin] Import ordering
      df8224f [Ilya Ganelin] Added comment for added lock
      343cb94 [Ilya Ganelin] [SPARK-6492] Added timeout/retry logic to fix a deadlock in SparkContext shutdown
      2c43ea38
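The timeout-and-retry idea above can be sketched in plain Python (a hypothetical analogue using threading.Lock, not the actual SparkContext code):

```python
import threading

def stop_with_retry(lock, do_stop, lock_timeout=10.0, max_attempts=3):
    # Instead of blocking forever on the shutdown lock (and deadlocking
    # if the holder never releases it), wait up to lock_timeout seconds,
    # then fall through and let the outer loop re-submit the request.
    for _ in range(max_attempts):
        if lock.acquire(timeout=lock_timeout):
            try:
                return do_stop()
            finally:
                lock.release()
    raise TimeoutError("could not acquire shutdown lock")
```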
    • guowei2's avatar
      [SPARK-5203][SQL] fix union with different decimal type · c23ba81b
      guowei2 authored
      When we union non-decimal types with decimals, the following rules apply:
         - First apply `intTypeToFixed`; then a union of fixed decimals with precision/scale p1/s1 and p2/s2 is promoted to
         DecimalType(max(p1, p2), max(s1, s2))
         - FLOAT and DOUBLE cause fixed-length decimals to turn into DOUBLE (this is the same as Hive,
         but note that unlimited decimals are considered bigger than doubles in WidenTypes)
      
      Author: guowei2 <guowei2@asiainfo.com>
      
      Closes #4004 from guowei2/SPARK-5203 and squashes the following commits:
      
      ff50f5f [guowei2] fix code style
      11df1bf [guowei2] fix decimal union with double, double->Decimal(15,15)
      0f345f9 [guowei2] fix structType merge with decimal
      101ed4d [guowei2] fix build error after rebase
      0b196e4 [guowei2] code style
      fe2c2ca [guowei2] handle union decimal precision in 'DecimalPrecision'
      421d840 [guowei2] fix union types for decimal precision
      ef2c661 [guowei2] fix union with different decimal type
      c23ba81b
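The first promotion rule can be illustrated with a tiny helper (plain-Python arithmetic only, not Spark's implementation):

```python
def widen_decimal(p1, s1, p2, s2):
    # Promotion rule quoted in the commit: a union of decimals with
    # precision/scale p1/s1 and p2/s2 becomes
    # DecimalType(max(p1, p2), max(s1, s2)).
    return max(p1, p2), max(s1, s2)
```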
    • Liang-Chi Hsieh's avatar
      [Minor][SQL] Fix typo · dc6dff24
      Liang-Chi Hsieh authored
      Just fix a typo.
      
      Author: Liang-Chi Hsieh <viirya@gmail.com>
      
      Closes #5352 from viirya/fix_a_typo and squashes the following commits:
      
      303b2d2 [Liang-Chi Hsieh] Fix typo.
      dc6dff24
    • lewuathe's avatar
      [SPARK-6615][MLLIB] Python API for Word2Vec · 512a2f19
      lewuathe authored
      This is a sub-task of SPARK-6254.
      It wraps the missing methods for `Word2Vec` and `Word2VecModel`.
      
      Author: lewuathe <lewuathe@me.com>
      
      Closes #5296 from Lewuathe/SPARK-6615 and squashes the following commits:
      
      f14c304 [lewuathe] Reorder tests
      1d326b9 [lewuathe] Merge master
      e2bedfb [lewuathe] Modify test cases
      afb866d [lewuathe] [SPARK-6615] Python API for Word2Vec
      512a2f19
    • Omede Firouz's avatar
      [MLLIB] Remove println in LogisticRegression.scala · b52c7f9f
      Omede Firouz authored
      There's no corresponding printing in linear regression. Here was my previous PR (something weird happened and I can't reopen it) https://github.com/apache/spark/pull/5272
      
      Author: Omede Firouz <ofirouz@palantir.com>
      
      Closes #5338 from oefirouz/println and squashes the following commits:
      
      3f3dbf4 [Omede Firouz] [MLLIB] Remove println
      b52c7f9f
    • Stephen Haberman's avatar
      [SPARK-6560][CORE] Do not suppress exceptions from writer.write. · b0d884f0
      Stephen Haberman authored
      If there is a failure in the Hadoop backend while calling
      writer.write, we should remember this original exception,
      and try to call writer.close(), but if that fails as well,
      still report the original exception.
      
      Note that if writer.write fails, the writer was likely left in an
      invalid state, making it more likely that writer.close will also
      fail and thus increasing the chances of writer.write's exception
      being suppressed.
      
      This patch introduces an admittedly potentially too cute
      Utils.tryWithSafeFinally method to handle the try/finally
      gyrations.
      
      Author: Stephen Haberman <stephen@exigencecorp.com>
      
      Closes #5223 from stephenh/do_not_suppress_writer_exception and squashes the following commits:
      
      c7ad53f [Stephen Haberman] [SPARK-6560][CORE] Do not suppress exceptions from writer.write.
      b0d884f0
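A Python analogue of the tryWithSafeFinally idea (hypothetical; Spark's actual helper is in Scala's Utils object):

```python
def try_with_safe_finally(block, finally_block):
    # Always run finally_block, but if both block and finally_block
    # raise, let the original exception from block propagate and
    # swallow the one from finally_block, instead of letting the
    # close-time failure suppress the original write failure.
    original = None
    try:
        return block()
    except BaseException as exc:
        original = exc
        raise
    finally:
        try:
            finally_block()
        except BaseException:
            if original is None:
                raise
            # both failed: keep the original exception from block
```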
    • Reynold Xin's avatar
      [SPARK-6428] Turn on explicit type checking for public methods. · 82701ee2
      Reynold Xin authored
      This builds on my earlier pull requests and turns on the explicit type checking in scalastyle.
      
      Author: Reynold Xin <rxin@databricks.com>
      
      Closes #5342 from rxin/SPARK-6428 and squashes the following commits:
      
      7b531ab [Reynold Xin] import ordering
      2d9a8a5 [Reynold Xin] jl
      e668b1c [Reynold Xin] override
      9b9e119 [Reynold Xin] Parenthesis.
      82e0cf5 [Reynold Xin] [SPARK-6428] Turn on explicit type checking for public methods.
      82701ee2
    • Yin Huai's avatar
      [SPARK-6575][SQL] Converted Parquet Metastore tables no longer cache metadata · c42c3fc7
      Yin Huai authored
      https://issues.apache.org/jira/browse/SPARK-6575
      
      Author: Yin Huai <yhuai@databricks.com>
      
      This patch had conflicts when merged, resolved by
      Committer: Cheng Lian <lian@databricks.com>
      
      Closes #5339 from yhuai/parquetRelationCache and squashes the following commits:
      
      b0e1a42 [Yin Huai] Address comments.
      83d9846 [Yin Huai] Remove unnecessary change.
      c0dc7a4 [Yin Huai] Cache converted parquet relations.
      c42c3fc7
    • zsxwing's avatar
      [SPARK-6621][Core] Fix the bug that calling EventLoop.stop in... · 440ea31b
      zsxwing authored
      [SPARK-6621][Core] Fix the bug that calling EventLoop.stop in EventLoop.onReceive/onError/onStart doesn't call onStop
      
      Author: zsxwing <zsxwing@gmail.com>
      
      Closes #5280 from zsxwing/SPARK-6621 and squashes the following commits:
      
      521125e [zsxwing] Fix the bug that calling EventLoop.stop in EventLoop.onReceive and EventLoop.onError doesn't call onStop
      440ea31b
  5. Apr 02, 2015
    • freeman's avatar
      [SPARK-6345][STREAMING][MLLIB] Fix for training with prediction · 6e1c1ec6
      freeman authored
      This patch fixes a reported bug causing model updates to not properly propagate to model predictions during streaming regression. These minor changes in model declaration fix the problem, and I expanded the tests to include the scenario in which the bug was arising. The two new tests failed prior to the patch and now pass.
      
      cc mengxr
      
      Author: freeman <the.freeman.lab@gmail.com>
      
      Closes #5037 from freeman-lab/train-predict-fix and squashes the following commits:
      
      3af953e [freeman] Expand test coverage to include combined training and prediction
      8f84fc8 [freeman] Move model declaration
      6e1c1ec6
    • KaiXinXiaoLei's avatar
[CORE] The description of the jobHistory config should be spark.history.fs.logDirectory · 8a0aa81c
      KaiXinXiaoLei authored
      The config option is spark.history.fs.logDirectory, not spark.fs.history.logDirectory, so the description should be changed. Thanks.
      
      Author: KaiXinXiaoLei <huleilei1@huawei.com>
      
      Closes #5332 from KaiXinXiaoLei/historyConfig and squashes the following commits:
      
      5ffbfb5 [KaiXinXiaoLei] the describe of jobHistory config is error
      8a0aa81c
    • Yin Huai's avatar
      [SPARK-6575][SQL] Converted Parquet Metastore tables no longer cache metadata · 4b82bd73
      Yin Huai authored
      https://issues.apache.org/jira/browse/SPARK-6575
      
      Author: Yin Huai <yhuai@databricks.com>
      
      Closes #5339 from yhuai/parquetRelationCache and squashes the following commits:
      
      83d9846 [Yin Huai] Remove unnecessary change.
      c0dc7a4 [Yin Huai] Cache converted parquet relations.
      4b82bd73
    • Marcelo Vanzin's avatar
      [SPARK-6650] [core] Stop ExecutorAllocationManager when context stops. · 45134ec9
      Marcelo Vanzin authored
      This fixes the thread leak. I also changed the unit test to keep track
      of allocated contexts and make sure they're closed after tests are
      run; this is needed since some tests use this pattern:
      
          val sc = createContext()
          doSomethingThatMayThrow()
          sc.stop()
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #5311 from vanzin/SPARK-6650 and squashes the following commits:
      
      652c73b [Marcelo Vanzin] Nits.
      5711512 [Marcelo Vanzin] More exception safety.
      cc5a744 [Marcelo Vanzin] Stop alloc manager before scheduler.
      9886f69 [Marcelo Vanzin] [SPARK-6650] [core] Stop ExecutorAllocationManager when context stops.
      45134ec9
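The cleanup pattern the commit describes can be sketched as a context manager (a hypothetical test helper, not Spark code):

```python
import contextlib

@contextlib.contextmanager
def managed_context(create):
    # Make sure sc.stop() runs even when the body throws, so no
    # contexts (and their background threads) leak from tests that
    # follow the create / do-something / stop pattern.
    sc = create()
    try:
        yield sc
    finally:
        sc.stop()
```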
    • Michael Armbrust's avatar
      [SPARK-6686][SQL] Use resolved output instead of names for toDF rename · 052dee07
      Michael Armbrust authored
      This is a workaround for a problem reported on the user list.  This doesn't fix the core problem, but in general is a more robust way to do renames.
      
      Author: Michael Armbrust <michael@databricks.com>
      
      Closes #5337 from marmbrus/toDFrename and squashes the following commits:
      
      6a3159d [Michael Armbrust] [SPARK-6686][SQL] Use resolved output instead of names for toDF rename
      052dee07
    • DoingDone9's avatar
[SPARK-6243][SQL] The Operation of match did not consider the scenarios that... · 947802cb
      DoingDone9 authored
      [SPARK-6243][SQL] The Operation of match did not consider the scenarios that order.dataType does not match NativeType
      
      It did not consider that order.dataType may not match NativeType, so I added `case other => ...` for those scenarios.
      
      Author: DoingDone9 <799203320@qq.com>
      
      Closes #4959 from DoingDone9/case_ and squashes the following commits:
      
      6278846 [DoingDone9] Update rows.scala
      cb1852d [DoingDone9] Merge pull request #2 from apache/master
      c3f046f [DoingDone9] Merge pull request #1 from apache/master
      947802cb
    • Cheng Hao's avatar
      [SQL][Minor] Use analyzed logical instead of unresolved in HiveComparisonTest · dfd2982b
      Cheng Hao authored
      Some internal unit tests failed because of the logical plan node used in pattern matching in `HiveComparisonTest`, e.g.
      https://github.com/apache/spark/blob/master/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveComparisonTest.scala#L137
      
      which may call the `output` function on an unresolved logical plan.
      
      Author: Cheng Hao <hao.cheng@intel.com>
      
      Closes #4946 from chenghao-intel/logical and squashes the following commits:
      
      432ecb3 [Cheng Hao] Use analyzed instead of logical in HiveComparisonTest
      dfd2982b
    • Yin Huai's avatar
      [SPARK-6618][SPARK-6669][SQL] Lock Hive metastore client correctly. · 5db89127
      Yin Huai authored
      Author: Yin Huai <yhuai@databricks.com>
      Author: Michael Armbrust <michael@databricks.com>
      
      Closes #5333 from yhuai/lookupRelationLock and squashes the following commits:
      
      59c884f [Michael Armbrust] [SQL] Lock metastore client in analyzeTable
      7667030 [Yin Huai] Merge pull request #2 from marmbrus/pr/5333
      e4a9b0b [Michael Armbrust] Correctly lock on MetastoreCatalog
      d6fc32f [Yin Huai] Missing `)`.
      1e241af [Yin Huai] Protect InsertIntoHive.
      fee7e9c [Yin Huai] A test?
      5416b0f [Yin Huai] Just protect client.
      5db89127
    • Cheng Lian's avatar
      [Minor] [SQL] Follow-up of PR #5210 · d3944b6f
      Cheng Lian authored
      This PR addresses rxin's comments in PR #5210.
      
      
      Author: Cheng Lian <lian@databricks.com>
      
      Closes #5219 from liancheng/spark-6554-followup and squashes the following commits:
      
      41f3a09 [Cheng Lian] Addresses comments in #5210
      d3944b6f
    • Yin Huai's avatar
      [SPARK-6655][SQL] We need to read the schema of a data source table stored in... · 251698fb
      Yin Huai authored
      [SPARK-6655][SQL] We need to read the schema of a data source table stored in spark.sql.sources.schema property
      
      https://issues.apache.org/jira/browse/SPARK-6655
      
      Author: Yin Huai <yhuai@databricks.com>
      
      Closes #5313 from yhuai/SPARK-6655 and squashes the following commits:
      
      1e00c03 [Yin Huai] Unnecessary change.
      f131bd9 [Yin Huai] Fix.
      f1218c1 [Yin Huai] Failed test.
      251698fb
    • Michael Armbrust's avatar
      [SQL] Throw UnsupportedOperationException instead of NotImplementedError · 4214e50f
      Michael Armbrust authored
      NotImplementedError in scala 2.10 is a fatal exception, which is not very nice to throw when not actually fatal.
      
      Author: Michael Armbrust <michael@databricks.com>
      
      Closes #5315 from marmbrus/throwUnsupported and squashes the following commits:
      
      c29e03b [Michael Armbrust] [SQL] Throw UnsupportedOperationException instead of NotImplementedError
      052e05b [Michael Armbrust] [SQL] Throw UnsupportedOperationException instead of NotImplementedError
      4214e50f
    • Hung Lin's avatar
      SPARK-6414: Spark driver failed with NPE on job cancelation · e3202aa2
      Hung Lin authored
      Use Option for ActiveJob.properties to avoid NPE bug
      
      Author: Hung Lin <hung.lin@gmail.com>
      
      Closes #5124 from hunglin/SPARK-6414 and squashes the following commits:
      
      2290b6b [Hung Lin] [SPARK-6414][core] Fix NPE in SparkContext.cancelJobGroup()
      e3202aa2
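The Option-based fix can be illustrated in plain Python, treating a job's properties as possibly absent (field names here are illustrative, not Spark's):

```python
def cancel_job_group(active_jobs, group_id):
    # Treat each job's properties as optional (None) instead of
    # dereferencing it unconditionally, which is what caused the NPE
    # on job cancellation.
    return [job for job in active_jobs
            if job.get("properties") is not None
            and job["properties"].get("spark.jobGroup.id") == group_id]
```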
    • Davies Liu's avatar
      [SPARK-6667] [PySpark] remove setReuseAddress · 0cce5451
      Davies Liu authored
      The reused address on the server side prevented the server from acknowledging connected connections, so remove it.
      
      This PR also retries once after a timeout and adds a timeout on the client side.
      
      Author: Davies Liu <davies@databricks.com>
      
      Closes #5324 from davies/collect_hang and squashes the following commits:
      
      e5a51a2 [Davies Liu] remove setReuseAddress
      7977c2f [Davies Liu] do retry on client side
      b838f35 [Davies Liu] retry after timeout
      0cce5451
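The client-side behavior described, a connection timeout plus one retry after timeout, can be sketched as follows (a hypothetical helper, not PySpark's actual code):

```python
import socket

def connect_with_retry(host, port, timeout=3.0, retries=1):
    # Set a timeout on the connection and retry once after a timeout
    # instead of hanging indefinitely on a stuck server.
    last_err = None
    for _ in range(retries + 1):
        try:
            sock = socket.create_connection((host, port), timeout=timeout)
            sock.settimeout(timeout)
            return sock
        except socket.timeout as err:
            last_err = err
    raise last_err
```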
    • Xiangrui Meng's avatar
      [SPARK-6672][SQL] convert row to catalyst in createDataFrame(RDD[Row], ...) · 424e987d
      Xiangrui Meng authored
      We assume that `RDD[Row]` contains Scala types, so we need to convert them into Catalyst types in createDataFrame. liancheng
      
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #5329 from mengxr/SPARK-6672 and squashes the following commits:
      
      2d52644 [Xiangrui Meng] set needsConversion = false in jsonRDD
      06896e4 [Xiangrui Meng] add createDataFrame without conversion
      4a3767b [Xiangrui Meng] convert Row to catalyst
      424e987d
    • Patrick Wendell's avatar
      [SPARK-6627] Some clean-up in shuffle code. · 6562787b
      Patrick Wendell authored
      Before diving into review #4450 I looked through the existing shuffle
      code to learn how it works. Unfortunately, there are some very
      confusing things in this code. This patch makes a few small changes
      to simplify things. It is not easy to describe the changes concisely
      because of how convoluted the issues were, but they are fairly small
      logically:
      
      1. There is a trait named `ShuffleBlockManager` that only deals with
         one logical function which is retrieving shuffle block data given shuffle
         block coordinates. This trait has two implementors FileShuffleBlockManager
         and IndexShuffleBlockManager. Confusingly the vast majority of those
         implementations have nothing to do with this particular functionality.
         So I've renamed the trait to ShuffleBlockResolver and documented it.
      2. The aforementioned trait had two almost identical methods, for no good
         reason. I removed one method (getBytes) and modified callers to use the
         other one. I think the behavior is preserved in all cases.
      3. The sort shuffle code uses an identifier "0" in the reduce slot of a
         BlockID as a placeholder. I made it into a constant since it needs to
         be consistent across multiple places.
      
      I think for (3) there is actually a better solution that would avoid the
      need to do this type of workaround/hack in the first place, but it's more
      complex so I'm punting it for now.
      
      Author: Patrick Wendell <patrick@databricks.com>
      
      Closes #5286 from pwendell/cleanup and squashes the following commits:
      
      c71fbc7 [Patrick Wendell] Open interface back up for testing
      f36edd5 [Patrick Wendell] Code review feedback
      d1c0494 [Patrick Wendell] Style fix
      a406079 [Patrick Wendell] [HOTFIX] Some clean-up in shuffle code.
      6562787b
    • Davies Liu's avatar
[SPARK-6663] [SQL] use Literal.create instead of constructor · 40df5d49
      Davies Liu authored
      In order to do inbound checking and type conversion, we should use Literal.create() instead of the constructor.
      
      Author: Davies Liu <davies@databricks.com>
      
      Closes #5320 from davies/literal and squashes the following commits:
      
      1667604 [Davies Liu] fix style and add comment
      5f8c0fd [Davies Liu] use Literal.create instread of constructor
      40df5d49
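The create-vs-constructor distinction can be illustrated with a minimal factory (a hypothetical sketch, not Spark's Literal):

```python
class Literal:
    # The factory performs the bounds checking / type conversion that a
    # bare constructor skips, which is why call sites should prefer
    # Literal.create over calling the constructor directly.
    def __init__(self, value, dtype):
        self.value = value
        self.dtype = dtype

    @classmethod
    def create(cls, value, dtype):
        if dtype == "int":
            value = int(value)  # coerce, e.g. 3.0 -> 3
            if not (-2**31 <= value < 2**31):
                raise ValueError("int literal out of range")
        return cls(value, dtype)
```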
  6. Apr 01, 2015