  1. Apr 07, 2015
    • Sasaki Toru's avatar
      [SPARK-6736][GraphX][Doc]Example of Graph#aggregateMessages has error · ae980eb4
      Sasaki Toru authored
      The example for Graph#aggregateMessages has an error: since aggregateMessages is a method of Graph, it should be written as "rawGraph.aggregateMessages".
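      For illustration, a minimal sketch of the corrected call shape, assuming a pre-built `rawGraph: Graph[Double, Int]` (the variable name comes from the fixed example; the aggregation itself is illustrative):

          import org.apache.spark.graphx._

          // aggregateMessages is invoked on the Graph instance itself:
          val counts: VertexRDD[(Int, Double)] = rawGraph.aggregateMessages[(Int, Double)](
            ctx => if (ctx.srcAttr > ctx.dstAttr) ctx.sendToDst((1, ctx.srcAttr)), // sendMsg
            (a, b) => (a._1 + b._1, a._2 + b._2)                                   // mergeMsg
          )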
      
      Author: Sasaki Toru <sasakitoa@nttdata.co.jp>
      
      Closes #5388 from sasakitoa/aggregateMessagesExample and squashes the following commits:
      
      b1d631b [Sasaki Toru] Example of Graph#aggregateMessages has error
      ae980eb4
    • Matt Aasted's avatar
      [SPARK-6636] Use public DNS hostname everywhere in spark_ec2.py · 6f0d55d7
      Matt Aasted authored
      The spark_ec2.py script uses public_dns_name everywhere except when testing ssh availability, which is done using the instances' public IP addresses. This breaks the script for users who deploy the cluster with a private-network-only security group. The fix is to use public_dns_name in the remaining place.
      
      Author: Matt Aasted <aasted@twitch.tv>
      
      Closes #5302 from aasted/master and squashes the following commits:
      
      60cf6ee [Matt Aasted] [SPARK-6636] Use public DNS hostname everywhere in spark_ec2.py
      6f0d55d7
    • Josh Rosen's avatar
      [SPARK-6716] Change SparkContext.DRIVER_IDENTIFIER from <driver> to driver · a0846c4b
      Josh Rosen authored
      Currently, the driver's executorId is set to `<driver>`. This choice of ID was present in older Spark versions, but it has started to cause problems now that executorIds are used in more contexts, such as Ganglia metric names or driver thread-dump links in the web UI. The angle brackets must be escaped when embedding this ID in XML or as part of URLs, and this has led to multiple problems:
      
      - https://issues.apache.org/jira/browse/SPARK-6484
      - https://issues.apache.org/jira/browse/SPARK-4313
      
      The simplest solution seems to be to change this id to something that does not contain any special characters, such as `driver`.
      
      I'm not sure whether we can perform this change in a patch release, since this ID may be considered a stable API by metrics users, but it's probably okay to do this in a major release as long as we document it in the release notes.
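      A hedged sketch of the compatibility check this change implies (per the "backwards-compatibility in BlockManagerId.isDriver" commit below; the real method may differ in detail):

          // Accept both the new plain ID and the legacy bracketed one,
          // so existing consumers of the old value keep working.
          def isDriver(executorId: String): Boolean =
            executorId == "driver" || executorId == "<driver>"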
      
      Author: Josh Rosen <joshrosen@databricks.com>
      
      Closes #5372 from JoshRosen/driver-id-fix and squashes the following commits:
      
      42d3c10 [Josh Rosen] Clarify comment
      0c5d04b [Josh Rosen] Add backwards-compatibility in BlockManagerId.isDriver
      7ff12e0 [Josh Rosen] Change SparkContext.DRIVER_IDENTIFIER from <driver> to driver
      a0846c4b
  2. Apr 06, 2015
    • Volodymyr Lyubinets's avatar
      [Minor] [SQL] [SPARK-6729] Minor fix for DriverQuirks get · e40ea874
      Volodymyr Lyubinets authored
      The function uses .substring(0, X), which throws StringIndexOutOfBoundsException if the string's length is less than X. A better way to do this is to use startsWith, which won't error out in this case.
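      A minimal illustration of the difference (the `url` value is hypothetical):

          val url = "jdbc:my"                       // shorter than the prefix being tested
          // url.substring(0, 10) == "jdbc:mysql"   // throws StringIndexOutOfBoundsException
          url.startsWith("jdbc:mysql")              // returns false, no exception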
      
      Author: Volodymyr Lyubinets <vlyubin@gmail.com>
      
      Closes #5378 from vlyubin/quirks and squashes the following commits:
      
      504e8e0 [Volodymyr Lyubinets] Minor fix for DriverQuirks get
      e40ea874
    • Reza Zadeh's avatar
      [MLlib] [SPARK-6713] Iterators in columnSimilarities for mapPartitionsWithIndex · 30363ede
      Reza Zadeh authored
      Use Iterators in columnSimilarities so that mapPartitionsWithIndex can spill to disk. This matters for a dense, large column: Spark can then spill the pairs onto disk instead of materializing all of them in memory first.
      
      Another PR coming to update documentation.
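      A hedged sketch of the pattern (simplified; `rdd` and `pairsFor` are hypothetical stand-ins, not the actual RowMatrix code):

          rdd.mapPartitionsWithIndex { (idx, iter) =>
            // Return a lazy Iterator instead of materializing all pairs in a
            // buffer, so the shuffle machinery can spill to disk as it consumes them.
            iter.flatMap(row => pairsFor(idx, row))
          }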
      
      Author: Reza Zadeh <reza@databricks.com>
      
      Closes #5364 from rezazadeh/optmemsim and squashes the following commits:
      
      47c90ba [Reza Zadeh] Iterators in columnSimilarities for flatMap
      30363ede
    • Sean Owen's avatar
      SPARK-6569 [STREAMING] Down-grade same-offset message in Kafka streaming to INFO · 9fe41252
      Sean Owen authored
      Reduce "is the same as ending offset" message to INFO level per JIRA discussion
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #5366 from srowen/SPARK-6569 and squashes the following commits:
      
      8a5b992 [Sean Owen] Reduce "is the same as ending offset" message to INFO level per JIRA discussion
      9fe41252
    • Masayoshi TSUZUKI's avatar
      [SPARK-6673] spark-shell.cmd can't start in Windows even when spark was built · 49f38824
      Masayoshi TSUZUKI authored
      Added an equivalent of the load-spark-env.sh script for Windows.
      
      Author: Masayoshi TSUZUKI <tsudukim@oss.nttdata.co.jp>
      
      Closes #5328 from tsudukim/feature/SPARK-6673 and squashes the following commits:
      
      aaefb19 [Masayoshi TSUZUKI] removed dust.
      be3405e [Masayoshi TSUZUKI] [SPARK-6673] spark-shell.cmd can't start in Windows even when spark was built
      49f38824
  3. Apr 05, 2015
    • zsxwing's avatar
      [SPARK-6602][Core] Update MapOutputTrackerMasterActor to MapOutputTrackerMasterEndpoint · 0b5d028a
      zsxwing authored
      This is the second PR for [SPARK-6602]. It updated MapOutputTrackerMasterActor and its unit tests.
      
      cc rxin
      
      Author: zsxwing <zsxwing@gmail.com>
      
      Closes #5371 from zsxwing/rpc-rewrite-part2 and squashes the following commits:
      
      fcf3816 [zsxwing] Fix the code style
      4013a22 [zsxwing] Add doc for uncaught exceptions in RpcEnv
      93c6c20 [zsxwing] Add an example of UnserializableException and add ErrorMonitor to monitor errors from Akka
      134fe7b [zsxwing] Update MapOutputTrackerMasterActor to MapOutputTrackerMasterEndpoint
      0b5d028a
    • lewuathe's avatar
      [SPARK-6262][MLLIB]Implement missing methods for MultivariateStatisticalSummary · acffc434
      lewuathe authored
      Add the following methods to PySpark's MultivariateStatisticalSummary (a usage sketch of the Scala counterparts appears after the list):
      - normL1
      - normL2
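      A usage sketch of the Scala counterparts these bindings mirror (assumes an existing SparkContext `sc`):

          import org.apache.spark.mllib.linalg.Vectors
          import org.apache.spark.mllib.stat.Statistics

          val observations = sc.parallelize(Seq(
            Vectors.dense(1.0, 2.0),
            Vectors.dense(3.0, 4.0)))
          val summary = Statistics.colStats(observations)
          summary.normL1  // column-wise L1 norms, here [4.0, 6.0]
          summary.normL2  // column-wise L2 (Euclidean) norms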
      
      Author: lewuathe <lewuathe@me.com>
      
      Closes #5359 from Lewuathe/SPARK-6262 and squashes the following commits:
      
      cbe439e [lewuathe] Implement missing methods for MultivariateStatisticalSummary
      acffc434
  4. Apr 04, 2015
    • zsxwing's avatar
      [SPARK-6602][Core] Replace direct use of Akka with Spark RPC interface - part 1 · f15806a8
      zsxwing authored
      This PR replaced the following `Actor`s with `RpcEndpoint`s (a simplified sketch of the endpoint interface follows the list):
      
      1. HeartbeatReceiver
      2. ExecutorActor
      3. BlockManagerMasterActor
      4. BlockManagerSlaveActor
      5. CoarseGrainedExecutorBackend and subclasses
      6. CoarseGrainedSchedulerBackend.DriverActor
      
      This is the first PR. I will split the work of SPARK-6602 to several PRs for code review.
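      A simplified sketch of the endpoint interface these Actors migrate to (signatures may differ from the actual org.apache.spark.rpc code):

          private[spark] trait RpcEndpoint {
            val rpcEnv: RpcEnv
            // Handles one-way messages (the analogue of Actor.receive for send).
            def receive: PartialFunction[Any, Unit]
            // Handles messages that expect a reply (the analogue of ask).
            def receiveAndReply(context: RpcCallContext): PartialFunction[Any, Unit]
            def onStart(): Unit
            def onStop(): Unit
          }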
      
      Author: zsxwing <zsxwing@gmail.com>
      
      Closes #5268 from zsxwing/rpc-rewrite and squashes the following commits:
      
      287e9f8 [zsxwing] Fix the code style
      26c56b7 [zsxwing] Merge branch 'master' into rpc-rewrite
      9cc825a [zsxwing] Rmove setupThreadSafeEndpoint and add ThreadSafeRpcEndpoint
      30a9036 [zsxwing] Make self return null after stopping RpcEndpointRef; fix docs and error messages
      705245d [zsxwing] Fix some bugs after rebasing the changes on the master
      003cf80 [zsxwing] Update CoarseGrainedExecutorBackend and CoarseGrainedSchedulerBackend to use RpcEndpoint
      7d0e6dc [zsxwing] Update BlockManagerSlaveActor to use RpcEndpoint
      f5d6543 [zsxwing] Update BlockManagerMaster to use RpcEndpoint
      30e3f9f [zsxwing] Update ExecutorActor to use RpcEndpoint
      478b443 [zsxwing] Update HeartbeatReceiver to use RpcEndpoint
      f15806a8
    • Liang-Chi Hsieh's avatar
      [SPARK-6607][SQL] Check invalid characters for Parquet schema and show error messages · 7bca62f7
      Liang-Chi Hsieh authored
      '(' and ')' are special characters used in Parquet schemas for type annotation. When we run an aggregation query, we obtain attribute names such as "MAX(a)".
      
      If we directly store the generated DataFrame as a Parquet file, reading and parsing the stored schema string later fails.
      
      Several methods could solve this. This PR uses the simplest one: just replace the attribute names before generating the Parquet schema from these attributes.
      
      Another possible method would be to modify all aggregation expression names from "func(column)" to "func[column]".
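      A hedged example of the user-facing workaround the final change recommends (per the commits below, the patch ended up reporting an error that suggests an alias; the table and path here are hypothetical):

          // "MAX(a)" contains '(' and ')', which Parquet reserves for type
          // annotations, so give the aggregate a Parquet-safe alias first.
          val df = sqlContext.sql("SELECT MAX(a) AS max_a FROM t")
          df.saveAsParquetFile("/tmp/out")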
      
      Author: Liang-Chi Hsieh <viirya@gmail.com>
      
      Closes #5263 from viirya/parquet_aggregation_name and squashes the following commits:
      
      2d70542 [Liang-Chi Hsieh] Address comment.
      463dff4 [Liang-Chi Hsieh] Instead of replacing special chars, showing error message to user to suggest using Alias.
      1de001d [Liang-Chi Hsieh] Replace special characters '(' and ')' of Parquet schema.
      7bca62f7
    • Yin Huai's avatar
      [SQL] Use path.makeQualified in newParquet. · da25c86d
      Yin Huai authored
      Author: Yin Huai <yhuai@databricks.com>
      
      Closes #5353 from yhuai/wrongFS and squashes the following commits:
      
      849603b [Yin Huai] Not use deprecated method.
      6d6ae34 [Yin Huai] Use path.makeQualified.
      da25c86d
  5. Apr 03, 2015
    • Davies Liu's avatar
      [SPARK-6700] disable flaky test · 9b40c17a
      Davies Liu authored
      Author: Davies Liu <davies@databricks.com>
      
      Closes #5356 from davies/flaky and squashes the following commits:
      
      08955f4 [Davies Liu] disable flaky test
      9b40c17a
    • Liang-Chi Hsieh's avatar
      [SPARK-6647][SQL] Make trait StringComparison a BinaryPredicate and fix unit... · 26b415e1
      Liang-Chi Hsieh authored
      [SPARK-6647][SQL] Make trait StringComparison a BinaryPredicate and fix unit tests of string data source Filter
      
      Trait `StringComparison` is currently a `BinaryExpression`; in fact, it should be a `BinaryPredicate`.
      
      By making `StringComparison` a `BinaryPredicate`, we can throw an error when an `expressions.Predicate` can't be translated to a data source `Filter` in `selectFilters`.
      
      Without this modification, because we wrap a `Filter` outside the scanned results in `pruneFilterProjectRaw`, we can't detect that something went wrong when translating predicates to filters in `selectFilters`.
      
      The unit test of #5285 demonstrates this problem: in that PR, even though `expressions.Contains` is not properly translated to `sources.StringContains`, the filtering is still performed by the `Filter`, so the test passes.
      
      Of course, with this modification, every `expressions.Predicate` class needs a corresponding data source `Filter`.
      
      There is also a small bug in `FilteredScanSuite` when testing the `StringEndsWith` filter; this PR fixes it as well.
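      A hedged sketch of the translate-or-fail idea in `selectFilters` (the actual pattern match covers many more cases):

          predicates.map {
            case expressions.StartsWith(a: Attribute, Literal(v, StringType)) =>
              sources.StringStartsWith(a.name, v.toString)
            case expressions.Contains(a: Attribute, Literal(v, StringType)) =>
              sources.StringContains(a.name, v.toString)
            case other =>
              // With StringComparison as a predicate, an untranslatable Predicate
              // now surfaces as an error instead of being silently re-filtered.
              sys.error(s"Cannot translate predicate to data source Filter: $other")
          }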
      
      Author: Liang-Chi Hsieh <viirya@gmail.com>
      
      Closes #5309 from viirya/translate_predicate and squashes the following commits:
      
      b176385 [Liang-Chi Hsieh] Address comment.
      275a493 [Liang-Chi Hsieh] More properly test for StringStartsWith, StringEndsWith and StringContains.
      caf2347 [Liang-Chi Hsieh] Make trait StringComparison as BinaryPredicate and throw error when Predicate can't translate to data source Filter.
      26b415e1
    • Marcelo Vanzin's avatar
      [SPARK-6688] [core] Always use resolved URIs in EventLoggingListener. · 14632b79
      Marcelo Vanzin authored
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #5340 from vanzin/SPARK-6688 and squashes the following commits:
      
      ccfddd9 [Marcelo Vanzin] Resolve at the source.
      20d2a34 [Marcelo Vanzin] [SPARK-6688] [core] Always use resolved URIs in EventLoggingListener.
      14632b79
    • Reynold Xin's avatar
      Closes #3158 · ffe8cc9a
      Reynold Xin authored
      ffe8cc9a
    • zsxwing's avatar
      [SPARK-6640][Core] Fix the race condition of creating HeartbeatReceiver and... · 88504b75
      zsxwing authored
      [SPARK-6640][Core] Fix the race condition of creating HeartbeatReceiver and retrieving HeartbeatReceiver
      
      This PR moved the code of creating `HeartbeatReceiver` above the code of creating `schedulerBackend` to resolve the race condition.
      
      Author: zsxwing <zsxwing@gmail.com>
      
      Closes #5306 from zsxwing/SPARK-6640 and squashes the following commits:
      
      840399d [zsxwing] Don't send TaskScheduler through Akka
      a90616a [zsxwing] Fix docs
      dd202c7 [zsxwing] Fix typo
      d7c250d [zsxwing] Fix the race condition of creating HeartbeatReceiver and retrieving HeartbeatReceiver
      88504b75
    • Ilya Ganelin's avatar
      [SPARK-6492][CORE] SparkContext.stop() can deadlock when DAGSchedulerEventProcessLoop dies · 2c43ea38
      Ilya Ganelin authored
      I've added a timeout and retry loop around the SparkContext shutdown code that should fix this deadlock. If a SparkContext shutdown is in progress when another thread comes knocking, it waits up to 10 seconds for the lock, then falls through, and the outer loop re-submits the request.
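      A hedged sketch of that timeout/retry shape (`stopped`, `stopLock`, and `doStop` are hypothetical stand-ins for the PR's actual fields):

          import java.util.concurrent.TimeUnit

          while (!stopped.get()) {
            if (stopLock.tryLock(10, TimeUnit.SECONDS)) {  // wait up to 10s for the lock
              try {
                if (stopped.compareAndSet(false, true)) {
                  doStop()  // the actual shutdown work
                }
              } finally {
                stopLock.unlock()
              }
            }
            // On timeout, fall through; the outer loop re-submits the request.
          }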
      
      Author: Ilya Ganelin <ilya.ganelin@capitalone.com>
      
      Closes #5277 from ilganeli/SPARK-6492 and squashes the following commits:
      
      8617a7e [Ilya Ganelin] Resolved merge conflict
      2fbab66 [Ilya Ganelin] Added MIMA Exclude
      a0e2c70 [Ilya Ganelin] Deleted stale imports
      fa28ce7 [Ilya Ganelin] reverted to just having a single stopped
      76fc825 [Ilya Ganelin] Updated to use atomic booleans instead of the synchronized vars
      6e8a7f7 [Ilya Ganelin] Removing unecessary null check for now since i'm not fixing stop ordering yet
      cdf7073 [Ilya Ganelin] [SPARK-6492] Moved stopped=true back to the start of the shutdown sequence so this can be addressed in a seperate PR
      7fb795b [Ilya Ganelin] Spacing
      b7a0c5c [Ilya Ganelin] Import ordering
      df8224f [Ilya Ganelin] Added comment for added lock
      343cb94 [Ilya Ganelin] [SPARK-6492] Added timeout/retry logic to fix a deadlock in SparkContext shutdown
      2c43ea38
    • guowei2's avatar
      [SPARK-5203][SQL] fix union with different decimal type · c23ba81b
      guowei2 authored
      When unioning non-decimal types with decimals, we use the following rules:
      - FIRST apply `intTypeToFixed`; then a union of fixed decimals with precision/scale p1/s1 and p2/s2 is promoted to DecimalType(max(p1, p2), max(s1, s2)) (sketched below)
      - FLOAT and DOUBLE cause fixed-length decimals to turn into DOUBLE (this is the same as Hive, but note that unlimited decimals are considered bigger than doubles in WidenTypes)
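      A hedged sketch of the fixed-precision promotion in the first rule (plain arithmetic, not the actual type-coercion code):

          // union of DecimalType(p1, s1) with DecimalType(p2, s2)
          //   => DecimalType(max(p1, p2), max(s1, s2))
          def widenedPrecisionAndScale(p1: Int, s1: Int, p2: Int, s2: Int): (Int, Int) =
            (math.max(p1, p2), math.max(s1, s2))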
      
      Author: guowei2 <guowei2@asiainfo.com>
      
      Closes #4004 from guowei2/SPARK-5203 and squashes the following commits:
      
      ff50f5f [guowei2] fix code style
      11df1bf [guowei2] fix decimal union with double, double->Decimal(15,15)
      0f345f9 [guowei2] fix structType merge with decimal
      101ed4d [guowei2] fix build error after rebase
      0b196e4 [guowei2] code style
      fe2c2ca [guowei2] handle union decimal precision in 'DecimalPrecision'
      421d840 [guowei2] fix union types for decimal precision
      ef2c661 [guowei2] fix union with different decimal type
      c23ba81b
    • Liang-Chi Hsieh's avatar
      [Minor][SQL] Fix typo · dc6dff24
      Liang-Chi Hsieh authored
      Just fix a typo.
      
      Author: Liang-Chi Hsieh <viirya@gmail.com>
      
      Closes #5352 from viirya/fix_a_typo and squashes the following commits:
      
      303b2d2 [Liang-Chi Hsieh] Fix typo.
      dc6dff24
    • lewuathe's avatar
      [SPARK-6615][MLLIB] Python API for Word2Vec · 512a2f19
      lewuathe authored
      This is the sub-task of SPARK-6254.
      Wrap missing method for `Word2Vec` and `Word2VecModel`.
      
      Author: lewuathe <lewuathe@me.com>
      
      Closes #5296 from Lewuathe/SPARK-6615 and squashes the following commits:
      
      f14c304 [lewuathe] Reorder tests
      1d326b9 [lewuathe] Merge master
      e2bedfb [lewuathe] Modify test cases
      afb866d [lewuathe] [SPARK-6615] Python API for Word2Vec
      512a2f19
    • Omede Firouz's avatar
      [MLLIB] Remove println in LogisticRegression.scala · b52c7f9f
      Omede Firouz authored
      There's no corresponding printing in linear regression. Here is my previous PR (something weird happened and I can't reopen it): https://github.com/apache/spark/pull/5272
      
      Author: Omede Firouz <ofirouz@palantir.com>
      
      Closes #5338 from oefirouz/println and squashes the following commits:
      
      3f3dbf4 [Omede Firouz] [MLLIB] Remove println
      b52c7f9f
    • Stephen Haberman's avatar
      [SPARK-6560][CORE] Do not suppress exceptions from writer.write. · b0d884f0
      Stephen Haberman authored
      If there is a failure in the Hadoop backend while calling
      writer.write, we should remember this original exception,
      and try to call writer.close(), but if that fails as well,
      still report the original exception.
      
      Note that, if writer.write fails, it is likely that writer
      was left in an invalid state, and so actually makes it more
      likely that writer.close will also fail. Which just increases
      the chances for writer.write's exception to be suppressed.
      
      This patch introduces an admittedly potentially too cute
      Utils.tryWithSafeFinally method to handle the try/finally
      gyrations.
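      A hedged sketch of what such a helper can look like, modeled on the description above (not necessarily identical to the PR's version):

          def tryWithSafeFinally[T](block: => T)(finallyBlock: => Unit): T = {
            var originalThrowable: Throwable = null
            try {
              block
            } catch {
              case t: Throwable =>
                originalThrowable = t
                throw t
            } finally {
              try {
                finallyBlock
              } catch {
                case t: Throwable if originalThrowable != null =>
                  // Keep writer.write's exception primary; record close's failure.
                  originalThrowable.addSuppressed(t)
              }
            }
          }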
      
      Author: Stephen Haberman <stephen@exigencecorp.com>
      
      Closes #5223 from stephenh/do_not_suppress_writer_exception and squashes the following commits:
      
      c7ad53f [Stephen Haberman] [SPARK-6560][CORE] Do not suppress exceptions from writer.write.
      b0d884f0
    • Reynold Xin's avatar
      [SPARK-6428] Turn on explicit type checking for public methods. · 82701ee2
      Reynold Xin authored
      This builds on my earlier pull requests and turns on the explicit type checking in scalastyle.
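      A hedged illustration of what the style check flags (hypothetical methods):

          def columnsBad  = Seq("a", "b")               // public member, inferred type: flagged
          def columnsGood: Seq[String] = Seq("a", "b")  // explicit result type: passes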
      
      Author: Reynold Xin <rxin@databricks.com>
      
      Closes #5342 from rxin/SPARK-6428 and squashes the following commits:
      
      7b531ab [Reynold Xin] import ordering
      2d9a8a5 [Reynold Xin] jl
      e668b1c [Reynold Xin] override
      9b9e119 [Reynold Xin] Parenthesis.
      82e0cf5 [Reynold Xin] [SPARK-6428] Turn on explicit type checking for public methods.
      82701ee2
    • Yin Huai's avatar
      [SPARK-6575][SQL] Converted Parquet Metastore tables no longer cache metadata · c42c3fc7
      Yin Huai authored
      https://issues.apache.org/jira/browse/SPARK-6575
      
      Author: Yin Huai <yhuai@databricks.com>
      
      This patch had conflicts when merged, resolved by
      Committer: Cheng Lian <lian@databricks.com>
      
      Closes #5339 from yhuai/parquetRelationCache and squashes the following commits:
      
      b0e1a42 [Yin Huai] Address comments.
      83d9846 [Yin Huai] Remove unnecessary change.
      c0dc7a4 [Yin Huai] Cache converted parquet relations.
      c42c3fc7
    • zsxwing's avatar
      [SPARK-6621][Core] Fix the bug that calling EventLoop.stop in... · 440ea31b
      zsxwing authored
      [SPARK-6621][Core] Fix the bug that calling EventLoop.stop in EventLoop.onReceive/onError/onStart doesn't call onStop
      
      Author: zsxwing <zsxwing@gmail.com>
      
      Closes #5280 from zsxwing/SPARK-6621 and squashes the following commits:
      
      521125e [zsxwing] Fix the bug that calling EventLoop.stop in EventLoop.onReceive and EventLoop.onError doesn't call onStop
      440ea31b
  6. Apr 02, 2015
    • freeman's avatar
      [SPARK-6345][STREAMING][MLLIB] Fix for training with prediction · 6e1c1ec6
      freeman authored
      This patch fixes a reported bug causing model updates to not properly propagate to model predictions during streaming regression. These minor changes in model declaration fix the problem, and I expanded the tests to include the scenario in which the bug was arising. The two new tests failed prior to the patch and now pass.
      
      cc mengxr
      
      Author: freeman <the.freeman.lab@gmail.com>
      
      Closes #5037 from freeman-lab/train-predict-fix and squashes the following commits:
      
      3af953e [freeman] Expand test coverage to include combined training and prediction
      8f84fc8 [freeman] Move model declaration
      6e1c1ec6
    • KaiXinXiaoLei's avatar
      [CORE] The description of the jobHistory config should be spark.history.fs.logDirectory · 8a0aa81c
      KaiXinXiaoLei authored
      The config option is spark.history.fs.logDirectory, not spark.fs.history.logDirectory, so the description should be changed. Thanks.
      
      Author: KaiXinXiaoLei <huleilei1@huawei.com>
      
      Closes #5332 from KaiXinXiaoLei/historyConfig and squashes the following commits:
      
      5ffbfb5 [KaiXinXiaoLei] the describe of jobHistory config is error
      8a0aa81c
    • Yin Huai's avatar
      [SPARK-6575][SQL] Converted Parquet Metastore tables no longer cache metadata · 4b82bd73
      Yin Huai authored
      https://issues.apache.org/jira/browse/SPARK-6575
      
      Author: Yin Huai <yhuai@databricks.com>
      
      Closes #5339 from yhuai/parquetRelationCache and squashes the following commits:
      
      83d9846 [Yin Huai] Remove unnecessary change.
      c0dc7a4 [Yin Huai] Cache converted parquet relations.
      4b82bd73
    • Marcelo Vanzin's avatar
      [SPARK-6650] [core] Stop ExecutorAllocationManager when context stops. · 45134ec9
      Marcelo Vanzin authored
      This fixes the thread leak. I also changed the unit test to keep track
      of allocated contexts and make sure they're closed after tests are
      run; this is needed since some tests use this pattern:
      
          val sc = createContext()
          doSomethingThatMayThrow()
          sc.stop()
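      A hedged sketch of the tracking pattern the test change implies (names hypothetical):

          val allocated = scala.collection.mutable.Buffer.empty[SparkContext]

          def createContext(): SparkContext = {
            val sc = new SparkContext("local", "test")
            allocated += sc  // remembered even if the test body later throws
            sc
          }

          // In the suite's cleanup hook:
          allocated.foreach(_.stop())
          allocated.clear()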
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #5311 from vanzin/SPARK-6650 and squashes the following commits:
      
      652c73b [Marcelo Vanzin] Nits.
      5711512 [Marcelo Vanzin] More exception safety.
      cc5a744 [Marcelo Vanzin] Stop alloc manager before scheduler.
      9886f69 [Marcelo Vanzin] [SPARK-6650] [core] Stop ExecutorAllocationManager when context stops.
      45134ec9
    • Michael Armbrust's avatar
      [SPARK-6686][SQL] Use resolved output instead of names for toDF rename · 052dee07
      Michael Armbrust authored
      This is a workaround for a problem reported on the user list. It doesn't fix the core problem, but it is in general a more robust way to do renames (see the sketch below).
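      A hedged illustration of the behavior being hardened (column names hypothetical):

          // toDF(...) renames positionally against the resolved output attributes,
          // rather than re-resolving the old names, which can be ambiguous.
          val renamed = df.toDF("id", "label")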
      
      Author: Michael Armbrust <michael@databricks.com>
      
      Closes #5337 from marmbrus/toDFrename and squashes the following commits:
      
      6a3159d [Michael Armbrust] [SPARK-6686][SQL] Use resolved output instead of names for toDF rename
      052dee07
    • DoingDone9's avatar
      [SPARK-6243][SQL] The Operation of match did not consider the scenarios that... · 947802cb
      DoingDone9 authored
      [SPARK-6243][SQL] The Operation of match did not consider the scenarios that order.dataType does not match NativeType
      
      It did not consider the case where order.dataType does not match NativeType, so I added a "case other => ..." for the other scenarios.
      
      Author: DoingDone9 <799203320@qq.com>
      
      Closes #4959 from DoingDone9/case_ and squashes the following commits:
      
      6278846 [DoingDone9] Update rows.scala
      cb1852d [DoingDone9] Merge pull request #2 from apache/master
      c3f046f [DoingDone9] Merge pull request #1 from apache/master
      947802cb
    • Cheng Hao's avatar
      [SQL][Minor] Use analyzed logical instead of unresolved in HiveComparisonTest · dfd2982b
      Cheng Hao authored
      Some internal unit tests failed due to pattern matching on the logical plan node in `HiveComparisonTest`, e.g.
      https://github.com/apache/spark/blob/master/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveComparisonTest.scala#L137
      
      which may call the `output` function on an unresolved logical plan.
      
      Author: Cheng Hao <hao.cheng@intel.com>
      
      Closes #4946 from chenghao-intel/logical and squashes the following commits:
      
      432ecb3 [Cheng Hao] Use analyzed instead of logical in HiveComparisonTest
      dfd2982b
    • Yin Huai's avatar
      [SPARK-6618][SPARK-6669][SQL] Lock Hive metastore client correctly. · 5db89127
      Yin Huai authored
      Author: Yin Huai <yhuai@databricks.com>
      Author: Michael Armbrust <michael@databricks.com>
      
      Closes #5333 from yhuai/lookupRelationLock and squashes the following commits:
      
      59c884f [Michael Armbrust] [SQL] Lock metastore client in analyzeTable
      7667030 [Yin Huai] Merge pull request #2 from marmbrus/pr/5333
      e4a9b0b [Michael Armbrust] Correctly lock on MetastoreCatalog
      d6fc32f [Yin Huai] Missing `)`.
      1e241af [Yin Huai] Protect InsertIntoHive.
      fee7e9c [Yin Huai] A test?
      5416b0f [Yin Huai] Just protect client.
      5db89127
    • Cheng Lian's avatar
      [Minor] [SQL] Follow-up of PR #5210 · d3944b6f
      Cheng Lian authored
      This PR addresses rxin's comments in PR #5210.
      
      
      Author: Cheng Lian <lian@databricks.com>
      
      Closes #5219 from liancheng/spark-6554-followup and squashes the following commits:
      
      41f3a09 [Cheng Lian] Addresses comments in #5210
      d3944b6f
    • Yin Huai's avatar
      [SPARK-6655][SQL] We need to read the schema of a data source table stored in... · 251698fb
      Yin Huai authored
      [SPARK-6655][SQL] We need to read the schema of a data source table stored in spark.sql.sources.schema property
      
      https://issues.apache.org/jira/browse/SPARK-6655
      
      Author: Yin Huai <yhuai@databricks.com>
      
      Closes #5313 from yhuai/SPARK-6655 and squashes the following commits:
      
      1e00c03 [Yin Huai] Unnecessary change.
      f131bd9 [Yin Huai] Fix.
      f1218c1 [Yin Huai] Failed test.
      251698fb
    • Michael Armbrust's avatar
      [SQL] Throw UnsupportedOperationException instead of NotImplementedError · 4214e50f
      Michael Armbrust authored
      NotImplementedError in Scala 2.10 is treated as a fatal exception, which is not very nice to throw when the situation is not actually fatal.
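      A hedged illustration (hypothetical method):

          def unsupported(): Nothing =
            // Catchable via scala.util.control.NonFatal, unlike the
            // NotImplementedError thrown by `???` under Scala 2.10.
            throw new UnsupportedOperationException("operation not supported")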
      
      Author: Michael Armbrust <michael@databricks.com>
      
      Closes #5315 from marmbrus/throwUnsupported and squashes the following commits:
      
      c29e03b [Michael Armbrust] [SQL] Throw UnsupportedOperationException instead of NotImplementedError
      052e05b [Michael Armbrust] [SQL] Throw UnsupportedOperationException instead of NotImplementedError
      4214e50f
    • Hung Lin's avatar
      SPARK-6414: Spark driver failed with NPE on job cancelation · e3202aa2
      Hung Lin authored
      Use Option for ActiveJob.properties to avoid NPE bug
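      A hedged sketch of the Option-based guard (field access simplified):

          // Wrapping the possibly-null properties in Option avoids the NPE:
          val groupId: Option[String] =
            Option(activeJob.properties).map(_.getProperty("spark.jobGroup.id"))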
      
      Author: Hung Lin <hung.lin@gmail.com>
      
      Closes #5124 from hunglin/SPARK-6414 and squashes the following commits:
      
      2290b6b [Hung Lin] [SPARK-6414][core] Fix NPE in SparkContext.cancelJobGroup()
      e3202aa2
    • Davies Liu's avatar
      [SPARK-6667] [PySpark] remove setReuseAddress · 0cce5451
      Davies Liu authored
      The reused address on the server side caused the server to fail to acknowledge connected connections, so remove it.
      
      This PR also retries once after a timeout and adds a timeout on the client side.
      
      Author: Davies Liu <davies@databricks.com>
      
      Closes #5324 from davies/collect_hang and squashes the following commits:
      
      e5a51a2 [Davies Liu] remove setReuseAddress
      7977c2f [Davies Liu] do retry on client side
      b838f35 [Davies Liu] retry after timeout
      0cce5451
    • Xiangrui Meng's avatar
      [SPARK-6672][SQL] convert row to catalyst in createDataFrame(RDD[Row], ...) · 424e987d
      Xiangrui Meng authored
      We assume that `RDD[Row]` contains Scala types, so we need to convert them into Catalyst types in createDataFrame. liancheng
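      A usage sketch under stated assumptions (an existing `sc` and `sqlContext`; the schema is illustrative):

          import org.apache.spark.sql.Row
          import org.apache.spark.sql.types._

          val rows = sc.parallelize(Seq(Row("Alice", 30), Row("Bob", 25)))
          val schema = StructType(Seq(
            StructField("name", StringType, nullable = true),
            StructField("age", IntegerType, nullable = true)))
          // createDataFrame now converts these Scala values to Catalyst's
          // internal representation instead of assuming they already are.
          val df = sqlContext.createDataFrame(rows, schema)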
      
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #5329 from mengxr/SPARK-6672 and squashes the following commits:
      
      2d52644 [Xiangrui Meng] set needsConversion = false in jsonRDD
      06896e4 [Xiangrui Meng] add createDataFrame without conversion
      4a3767b [Xiangrui Meng] convert Row to catalyst
      424e987d