- Sep 21, 2016
-
-
Peng, Meng authored
## What changes were proposed in this pull request? Univariate feature selection works by selecting the best features based on univariate statistical tests. False Positive Rate (FPR) is a popular univariate statistical test for feature selection. We add a chiSquare Selector based on False Positive Rate (FPR) test in this PR, like it is implemented in scikit-learn. http://scikit-learn.org/stable/modules/feature_selection.html#univariate-feature-selection ## How was this patch tested? Add Scala ut Author: Peng, Meng <peng.meng@intel.com> Closes #14597 from mpjlu/fprChiSquare.
-
- Sep 19, 2016
-
-
sethah authored
## What changes were proposed in this pull request? Merge `MultinomialLogisticRegression` into `LogisticRegression` and remove `MultinomialLogisticRegression`. Marked as WIP because we should discuss the coefficients API in the model. See discussion below. JIRA: [SPARK-17163](https://issues.apache.org/jira/browse/SPARK-17163) ## How was this patch tested? Merged test suites and added some new unit tests. ## Design ### Switching between binomial and multinomial We default to automatically detecting whether we should run binomial or multinomial lor. We expose a new parameter called `family` which defaults to auto. When "auto" is used, we run normal binomial lor with pivoting if there are 1 or 2 label classes. Otherwise, we run multinomial. If the user explicitly sets the family, then we abide by that setting. In the case where "binomial" is set but multiclass lor is detected, we throw an error. ### coefficients/intercept model API (TODO) This is the biggest design point remaining, IMO. We need to decide how to store the coefficients and intercepts in the model, and in turn how to expose them via the API. Two important points: * We must maintain compatibility with the old API, i.e. we must expose `def coefficients: Vector` and `def intercept: Double` * There are two separate cases: binomial lr where we have a single set of coefficients and a single intercept and multinomial lr where we have `numClasses` sets of coefficients and `numClasses` intercepts. Some options: 1. **Store the binomial coefficients as a `2 x numFeatures` matrix.** This means that we would center the model coefficients before storing them in the model. The BLOR algorithm gives `1 * numFeatures` coefficients, but we would convert them to `2 x numFeatures` coefficients before storing them, effectively doubling the storage in the model. This has the advantage that we can make the code cleaner (i.e. less `if (isMultinomial) ... else ...`) and we don't have to reason about the different cases as much. It has the disadvantage that we double the storage space and we could see small regressions at prediction time since there are 2x the number of operations in the prediction algorithms. Additionally, we still have to produce the uncentered coefficients/intercept via the API, so we will have to either ALSO store the uncentered version, or compute it in `def coefficients: Vector` every time. 2. **Store the binomial coefficients as a `1 x numFeatures` matrix.** We still store the coefficients as a matrix and the intercepts as a vector. When users call `coefficients` we return them a `Vector` that is backed by the same underlying array as the `coefficientMatrix`, so we don't duplicate any data. At prediction time, we use the old prediction methods that are specialized for binary LOR. The benefits here are that we don't store extra data, and we won't see any regressions in performance. The cost of this is that we have separate implementations for predict methods in the binary vs multiclass case. The duplicated code is really not very high, but it's still a bit messy. If we do decide to store the 2x coefficients, we would likely want to see some performance tests to understand the potential regressions. **Update:** We have chosen option 2 ### Threshold/thresholds (TODO) Currently, when `threshold` is set we clear whatever value is in `thresholds` and when `thresholds` is set we clear whatever value is in `threshold`. [SPARK-11543](https://issues.apache.org/jira/browse/SPARK-11543) was created to prefer thresholds over threshold. We should decide if we should implement this behavior now or if we want to do it in a separate JIRA. **Update:** Let's leave it for a follow up PR ## Follow up * Summary model for multiclass logistic regression [SPARK-17139](https://issues.apache.org/jira/browse/SPARK-17139) * Thresholds vs threshold [SPARK-11543](https://issues.apache.org/jira/browse/SPARK-11543) Author: sethah <seth.hendrickson16@gmail.com> Closes #14834 from sethah/SPARK-17163.
-
- Sep 15, 2016
-
-
Sean Owen authored
## What changes were proposed in this pull request? Following https://github.com/apache/spark/pull/14969 for some reason the MiMa excludes weren't complete, but still passed the PR builder. This adds 3 more excludes from https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-2.2/1749/consoleFull It also moves the excludes to their own Seq in the build, as they probably should have been. Even though this is merged to 2.1.x only / master, I left the exclude in for 2.0.x in case we back port. It's a private API so is always a false positive. ## How was this patch tested? Jenkins build Author: Sean Owen <sowen@cloudera.com> Closes #15110 from srowen/SPARK-17406.2.
-
cenyuhai authored
## What changes were proposed in this pull request? The job page will be too slow to open when there are thousands of executor events(added or removed). I found that in ExecutorsTab file, executorIdToData will not remove elements, it will increase all the time.Before this pr, it looks like [timeline1.png](https://issues.apache.org/jira/secure/attachment/12827112/timeline1.png). After this pr, it looks like [timeline2.png](https://issues.apache.org/jira/secure/attachment/12827113/timeline2.png)(we can set how many executor events will be displayed) Author: cenyuhai <cenyuhai@didichuxing.com> Closes #14969 from cenyuhai/SPARK-17406.
-
- Sep 12, 2016
-
-
Josh Rosen authored
This patch makes a handful of post-Spark-2.0 MiMa exclusion and build updates. It should be merged to master and a subset of it should be picked into branch-2.0 in order to test Spark 2.0.1-SNAPSHOT. - Remove the ` sketch`, `mllibLocal`, and `streamingKafka010` from the list of excluded subprojects so that MiMa checks them. - Remove now-unnecessary special-case handling of the Kafka 0.8 artifact in `mimaSettings`. - Move the exclusion added in SPARK-14743 from `v20excludes` to `v21excludes`, since that patch was only merged into master and not branch-2.0. - Add exclusions for an API change introduced by SPARK-17096 / #14675. - Add missing exclusions for the `o.a.spark.internal` and `o.a.spark.sql.internal` packages. Author: Josh Rosen <joshrosen@databricks.com> Closes #15061 from JoshRosen/post-2.0-mima-changes.
-
- Sep 04, 2016
-
-
Shivansh authored
[SPARK-17308] Improved the spark core code by replacing all pattern match on boolean value by if/else block. ## What changes were proposed in this pull request? Improved the code quality of spark by replacing all pattern match on boolean value by if/else block. ## How was this patch tested? By running the tests Author: Shivansh <shiv4nsh@gmail.com> Closes #14873 from shiv4nsh/SPARK-17308.
-
- Aug 26, 2016
-
-
Michael Gummelt authored
## What changes were proposed in this pull request? Move Mesos code into a mvn module ## How was this patch tested? unit tests manually submitting a client mode and cluster mode job spark/mesos integration test suite Author: Michael Gummelt <mgummelt@mesosphere.io> Closes #14637 from mgummelt/mesos-module.
-
- Aug 10, 2016
-
-
jerryshao authored
## What changes were proposed in this pull request? Add a configurable token manager for Spark on running on yarn. ### Current Problems ### 1. Supported token provider is hard-coded, currently only hdfs, hbase and hive are supported and it is impossible for user to add new token provider without code changes. 2. Also this problem exits in timely token renewer and updater. ### Changes In This Proposal ### In this proposal, to address the problems mentioned above and make the current code more cleaner and easier to understand, mainly has 3 changes: 1. Abstract a `ServiceTokenProvider` as well as `ServiceTokenRenewable` interface for token provider. Each service wants to communicate with Spark through token way needs to implement this interface. 2. Provide a `ConfigurableTokenManager` to manage all the register token providers, also token renewer and updater. Also this class offers the API for other modules to obtain tokens, get renewal interval and so on. 3. Implement 3 built-in token providers `HDFSTokenProvider`, `HiveTokenProvider` and `HBaseTokenProvider` to keep the same semantics as supported today. Whether to load in these built-in token providers is controlled by configuration "spark.yarn.security.tokens.${service}.enabled", by default for all the built-in token providers are loaded. ### Behavior Changes ### For the end user there's no behavior change, we still use the same configuration `spark.yarn.security.tokens.${service}.enabled` to decide which token provider is enabled (hbase or hive). For user implemented token provider (assume the name of token provider is "test") needs to add into this class should have two configurations: 1. `spark.yarn.security.tokens.test.enabled` to true 2. `spark.yarn.security.tokens.test.class` to the full qualified class name. So we still keep the same semantics as current code while add one new configuration. ### Current Status ### - [x] token provider interface and management framework. - [x] implement built-in token providers (hdfs, hbase, hive). - [x] Coverage of unit test. - [x] Integrated test with security cluster. ## How was this patch tested? Unit test and integrated test. Please suggest and review, any comment is greatly appreciated. Author: jerryshao <sshao@hortonworks.com> Closes #14065 from jerryshao/SPARK-16342.
-
- Aug 04, 2016
-
-
Sean Zhong authored
## What changes were proposed in this pull request? For DataSet typed select: ``` def select[U1: Encoder](c1: TypedColumn[T, U1]): Dataset[U1] ``` If type T is a case class or a tuple class that is not atomic, the resulting logical plan's schema will mismatch with `Dataset[T]` encoder's schema, which will cause encoder error and throw AnalysisException. ### Before change: ``` scala> case class A(a: Int, b: Int) scala> Seq((0, A(1,2))).toDS.select($"_2".as[A]) org.apache.spark.sql.AnalysisException: cannot resolve '`a`' given input columns: [_2]; .. ``` ### After change: ``` scala> case class A(a: Int, b: Int) scala> Seq((0, A(1,2))).toDS.select($"_2".as[A]).show +---+---+ | a| b| +---+---+ | 1| 2| +---+---+ ``` ## How was this patch tested? Unit test. Author: Sean Zhong <seanzhong@databricks.com> Closes #14474 from clockfly/SPARK-16853.
-
- Jul 12, 2016
-
-
petermaxlee authored
## What changes were proposed in this pull request? It would be useful to support listing the columns that are referenced by a filter. This can help simplify data source planning, because with this we would be able to implement unhandledFilters method in HadoopFsRelation. This is based on rxin's patch (#13901) and adds unit tests. ## How was this patch tested? Added a new suite FiltersSuite. Author: petermaxlee <petermaxlee@gmail.com> Author: Reynold Xin <rxin@databricks.com> Closes #14120 from petermaxlee/SPARK-16199.
-
- Jul 11, 2016
-
-
Reynold Xin authored
## What changes were proposed in this pull request? It is currently fairly difficult to have proper mima excludes when we cut a version branch. I'm proposing a small change to take the exclude list out of the exclude function, and put it in a variable so we can easily union excludes. After this change, we can bump pom.xml version to 2.1.0-SNAPSHOT, without bumping the diff base version. Note that I also deleted all the exclude rules for version 1.x, to cut down the size of the file. ## How was this patch tested? N/A - this is a build infra change. Author: Reynold Xin <rxin@databricks.com> Closes #14128 from rxin/SPARK-16476.
-
- Jul 06, 2016
-
-
Eric Liang authored
## What changes were proposed in this pull request? This patches `MemoryAllocator` to fill clean and freed memory with known byte values, similar to https://github.com/jemalloc/jemalloc/wiki/Use-Case:-Find-a-memory-corruption-bug . Memory filling is flag-enabled in test only by default. ## How was this patch tested? Unit test that it's on in test. cc sameeragarwal Author: Eric Liang <ekl@databricks.com> Closes #13983 from ericl/spark-16021.
-
- Jul 05, 2016
-
-
cody koeninger authored
## What changes were proposed in this pull request? during sbt unidoc task, skip the streamingKafka010 subproject and filter kafka 0.10 classes from the classpath, so that at least existing kafka 0.8 doc can be included in unidoc without error ## How was this patch tested? sbt spark/scalaunidoc:doc | grep -i error Author: cody koeninger <cody@koeninger.org> Closes #14041 from koeninger/SPARK-16359.
-
- Jul 04, 2016
-
-
Michael Allman authored
Link to Jira issue: https://issues.apache.org/jira/browse/SPARK-16353 ## What changes were proposed in this pull request? The javadoc options for the java unidoc generation are ignored when generating the java unidoc. For example, the generated `index.html` has the wrong HTML page title. This can be seen at http://spark.apache.org/docs/latest/api/java/index.html. I changed the relevant setting scope from `doc` to `(JavaUnidoc, unidoc)`. ## How was this patch tested? I ran `docs/jekyll build` and verified that the java unidoc `index.html` has the correct HTML page title. Author: Michael Allman <michael@videoamp.com> Closes #14031 from mallman/spark-16353.
-
- Jun 30, 2016
-
-
cody koeninger authored
## What changes were proposed in this pull request? New Kafka consumer api for the released 0.10 version of Kafka ## How was this patch tested? Unit tests, manual tests Author: cody koeninger <cody@koeninger.org> Closes #11863 from koeninger/kafka-0.9.
-
- Jun 27, 2016
-
-
Dongjoon Hyun authored
## What changes were proposed in this pull request? Currently, Spark Scala/Java API documents shows **org.apache.hadoop.hive.ql.io.orc** package at the top. http://spark.apache.org/docs/2.0.0-preview/api/scala/index.html#org.apache.spark.package http://spark.apache.org/docs/2.0.0-preview/api/java/index.html This PR hides `SparkOrcNewRecordReader` from API docs. ## How was this patch tested? Manual. (`build/sbt unidoc`). The following is the screenshot after this PR. **Scala API doc**  **Java API doc**  Author: Dongjoon Hyun <dongjoon@apache.org> Closes #13914 from dongjoon-hyun/SPARK-16111.
-
- Jun 22, 2016
-
-
Xiangrui Meng authored
## What changes were proposed in this pull request? In 1.4 and earlier releases, we have package grouping in the generated Java API docs. See http://spark.apache.org/docs/1.4.0/api/java/index.html. However, this disappeared in 1.5.0: http://spark.apache.org/docs/1.5.0/api/java/index.html. Rather than fixing it, I'd suggest removing grouping. Because it might take some time to fix and it is a manual process to update the grouping in `SparkBuild.scala`. I didn't find anyone complaining about missing groups since 1.5.0 on Google. Manually checked the generated Java API docs and confirmed that they are the same as in master. Author: Xiangrui Meng <meng@databricks.com> Closes #13856 from mengxr/SPARK-16155.
-
- Jun 15, 2016
-
-
Reynold Xin authored
## What changes were proposed in this pull request? The way bash script `build/spark-build-info` is called from core/pom.xml prevents Spark building on Windows. Instead of calling the script directly we call bash and pass the script as an argument. This enables running it on Windows with bash installed which typically comes with Git. This brings https://github.com/apache/spark/pull/13612 up-to-date and also addresses comments from the code review. Closes #13612 ## How was this patch tested? I built manually (on a Mac) to verify it didn't break Mac compilation. Author: Reynold Xin <rxin@databricks.com> Author: avulanov <nashb@yandex.ru> Closes #13691 from rxin/SPARK-15851.
-
- Jun 14, 2016
-
-
Sean Zhong authored
## What changes were proposed in this pull request? Revert partial changes in SPARK-12600, and add some deprecated method back to SQLContext for backward source code compatibility. ## How was this patch tested? Manual test. Author: Sean Zhong <seanzhong@databricks.com> Closes #13637 from clockfly/SPARK-15914.
-
- Jun 11, 2016
-
-
Eric Liang authored
## What changes were proposed in this pull request? These were not updated after performance improvements. To make updating them easier, I also moved the results from inline comments out into a file, which is auto-generated when the benchmark is re-run. Author: Eric Liang <ekl@databricks.com> Closes #13607 from ericl/sc-3538.
-
- Jun 09, 2016
-
-
Josh Rosen authored
Spark's SBT build currently uses a fork of the sbt-pom-reader plugin but depends on that fork via a SBT subproject which is cloned from https://github.com/scrapcodes/sbt-pom-reader/tree/ignore_artifact_id. This unnecessarily slows down the initial build on fresh machines and is also risky because it risks a build breakage in case that GitHub repository ever changes or is deleted. In order to address these issues, I have published a pre-built binary of our forked sbt-pom-reader plugin to Maven Central under the `org.spark-project` namespace and have updated Spark's build to use that artifact. This published artifact was built from https://github.com/JoshRosen/sbt-pom-reader/tree/v1.0.0-spark, which contains the contents of ScrapCodes's branch plus an additional patch to configure the build for artifact publication. /cc srowen ScrapCodes for review. Author: Josh Rosen <joshrosen@databricks.com> Closes #13564 from JoshRosen/use-published-fork-of-pom-reader.
-
- Jun 06, 2016
-
-
Dhruve Ashar authored
## What changes were proposed in this pull request? Change the way spark picks up version information. Also embed the build information to better identify the spark version running. More context can be found here : https://github.com/apache/spark/pull/12152 ## How was this patch tested? Ran the mvn and sbt builds to verify the version information was being displayed correctly on executing <code>spark-submit --version </code>  Author: Dhruve Ashar <dhruveashar@gmail.com> Closes #13061 from dhruve/impr/SPARK-14279.
-
- May 31, 2016
-
-
Marcelo Vanzin authored
This helps with preventing jdk8-specific calls being checked in, because PR builders are running the compiler with the wrong settings. If the JAVA_7_HOME env variable is set, assume it points at a jdk7 and use its rt.jar when invoking javac. For zinc, just run it with jdk7, and disable it when building jdk8-specific code. A big note for sbt usage: adding the bootstrap options forces sbt to fork the compiler, and that disables incremental compilation. That means that it's really not convenient to use for normal development, but should be ok for automated builds. Tested with JAVA_HOME=jdk8 and JAVA_7_HOME=jdk7: - mvn + zinc - mvn sans zinc - sbt Verified that in all cases, jdk8-specific library calls fail to compile. Author: Marcelo Vanzin <vanzin@cloudera.com> Closes #13272 from vanzin/SPARK-15451.
-
- May 27, 2016
-
-
DB Tsai authored
## What changes were proposed in this pull request? We're using `asML` to convert the mllib vector/matrix to ml vector/matrix now. Using `as` is more correct given that this conversion actually shares the same underline data structure. As a result, in this PR, `toBreeze` will be changed to `asBreeze`. This is a private API, as a result, it will not affect any user's application. ## How was this patch tested? unit tests Author: DB Tsai <dbt@netflix.com> Closes #13198 from dbtsai/minor.
-
- May 26, 2016
-
-
Yin Huai authored
[SPARK-15532][SQL] SQLContext/HiveContext's public constructors should use SparkSession.build.getOrCreate ## What changes were proposed in this pull request? This PR changes SQLContext/HiveContext's public constructor to use SparkSession.build.getOrCreate and removes isRootContext from SQLContext. ## How was this patch tested? Existing tests. Author: Yin Huai <yhuai@databricks.com> Closes #13310 from yhuai/SPARK-15532.
-
Reynold Xin authored
## What changes were proposed in this pull request? This patch renames various DefaultSources to make their names more self-describing. The choice of "DefaultSource" was from the days when we did not have a good way to specify short names. They are now named: - LibSVMFileFormat - CSVFileFormat - JdbcRelationProvider - JsonFileFormat - ParquetFileFormat - TextFileFormat Backward compatibility is maintained through aliasing. ## How was this patch tested? Updated relevant test cases too. Author: Reynold Xin <rxin@databricks.com> Closes #13311 from rxin/SPARK-15543.
-
- May 25, 2016
-
-
Herman van Hovell authored
## What changes were proposed in this pull request? The ANTLR4 SBT plugin has been moved from its own repo to one on bintray. The version was also changed from `0.7.10` to `0.7.11`. The latter actually broke our build (ihji has fixed this by also adding `0.7.10` and others to the bin-tray repo). This PR upgrades the SBT-ANTLR4 plugin and ANTLR4 to their most recent versions (`0.7.11`/`4.5.3`). I have also removed a few obsolete build configurations. ## How was this patch tested? Manually running SBT/Maven builds. Author: Herman van Hovell <hvanhovell@questtec.nl> Closes #13299 from hvanhovell/SPARK-15525.
-
- May 21, 2016
-
-
Reynold Xin authored
## What changes were proposed in this pull request? I initially asked to create a hivecontext-compatibility module to put the HiveContext there. But we are so close to Spark 2.0 release and there is only a single class in it. It seems overkill to have an entire package, which makes it more inconvenient, for a single class. ## How was this patch tested? Tests were moved. Author: Reynold Xin <rxin@databricks.com> Closes #13207 from rxin/SPARK-15424.
-
- May 18, 2016
-
-
Davies Liu authored
## What changes were proposed in this pull request? Since we support forced spilling for Spillable, which only works in OnHeap mode, different from other SQL operators (could be OnHeap or OffHeap), we should considering the mode of consumer before calling trigger forced spilling. ## How was this patch tested? Add new test. Author: Davies Liu <davies@databricks.com> Closes #13151 from davies/fix_mode.
-
- May 17, 2016
-
-
DB Tsai authored
## What changes were proposed in this pull request? Once SPARK-14487 and SPARK-14549 are merged, we will migrate to use the new vector and matrix type in the new ml pipeline based apis. ## How was this patch tested? Unit tests Author: DB Tsai <dbt@netflix.com> Author: Liang-Chi Hsieh <simonh@tw.ibm.com> Author: Xiangrui Meng <meng@databricks.com> Closes #12627 from dbtsai/SPARK-14615-NewML.
-
Sean Owen authored
## What changes were proposed in this pull request? (See https://github.com/apache/spark/pull/12416 where most of this was already reviewed and committed; this is just the module structure and move part. This change does not move the annotations into test scope, which was the apparently problem last time.) Rename `spark-test-tags` -> `spark-tags`; move common annotations like `Since` to `spark-tags` ## How was this patch tested? Jenkins tests. Author: Sean Owen <sowen@cloudera.com> Closes #13074 from srowen/SPARK-15290.
-
- May 11, 2016
-
-
cody koeninger authored
## What changes were proposed in this pull request? Renaming the streaming-kafka artifact to include kafka version, in anticipation of needing a different artifact for later kafka versions ## How was this patch tested? Unit tests Author: cody koeninger <cody@koeninger.org> Closes #12946 from koeninger/SPARK-15085.
-
hyukjinkwon authored
## What changes were proposed in this pull request? This PR removes the old `json(path: String)` API which is covered by the new `json(paths: String*)`. ## How was this patch tested? Jenkins tests (existing tests should cover this) Author: hyukjinkwon <gurwls223@gmail.com> Author: Hyukjin Kwon <gurwls223@gmail.com> Closes #13040 from HyukjinKwon/SPARK-15250.
-
- May 10, 2016
-
-
Sital Kedia authored
## What changes were proposed in this pull request? Currently PipedRDD internally uses PrintWriter to write data to the stdin of the piped process, which by default uses a BufferedWriter of buffer size 8k. In our experiment, we have seen that 8k buffer size is too small and the job spends significant amount of CPU time in system calls to copy the data. We should have a way to configure the buffer size for the writer. ## How was this patch tested? Ran PipedRDDSuite tests. Author: Sital Kedia <skedia@fb.com> Closes #12309 from sitalkedia/bufferedPipedRDD.
-
- May 09, 2016
-
-
Alex Bozarth authored
## What changes were proposed in this pull request? Removed blockTransferService and sparkFilesDir from SparkEnv since they're rarely used and don't need to be in stored in the env. Edited their few usages to accommodate the change. ## How was this patch tested? ran dev/run-tests locally Author: Alex Bozarth <ajbozart@us.ibm.com> Closes #12970 from ajbozarth/spark10653.
-
- May 06, 2016
-
-
Luciano Resende authored
## What changes were proposed in this pull request? Create a maven profile for executing the docker integration tests using maven Remove docker integration tests from main sbt build Update documentation on how to run docker integration tests from sbt ## How was this patch tested? Manual test of the docker integration tests as in : mvn -Pdocker-integration-tests -pl :spark-docker-integration-tests_2.11 compile test ## Other comments Note that the the DB2 Docker Tests are still disabled as there is a kernel version issue on the AMPLab Jenkins slaves and we would need to get them on the right level before enabling those tests. They do run ok locally with the updates from PR #12348 Author: Luciano Resende <lresende@apache.org> Closes #12508 from lresende/docker.
-
- May 05, 2016
-
-
Luciano Resende authored
## What changes were proposed in this pull request? Enhance the DB2 JDBC Dialect docker tests as they seemed to have had some issues on previous merge causing some tests to fail. ## How was this patch tested? By running the integration tests locally. Author: Luciano Resende <lresende@apache.org> Closes #12348 from lresende/SPARK-14589.
-
- Apr 30, 2016
-
-
Herman van Hovell authored
#### What changes were proposed in this pull request? This PR removes three methods the were deprecated in 1.6.0: - `PortableDataStream.close()` - `LinearRegression.weights` - `LogisticRegression.weights` The rationale for doing this is that the impact is small and that Spark 2.0 is a major release. #### How was this patch tested? Compilation succeded. Author: Herman van Hovell <hvanhovell@questtec.nl> Closes #12732 from hvanhovell/SPARK-14952.
-
- Apr 29, 2016
-
-
Jakob Odersky authored
## What changes were proposed in this pull request? In the past, genjavadoc had issues with package private members which led the spark project to use a forked version. This issue has been fixed upstream (typesafehub/genjavadoc#70) and a release is available for scala versions 2.10, 2.11 **and 2.12**, hence a forked version for spark is no longer necessary. This pull request updates the build configuration to use the newest upstream genjavadoc. ## How was this patch tested? The build was run `sbt unidoc`. During the process javadoc emits some errors on the generated java stubs, however these errors were also present before the upgrade. Furthermore, the produced html is fine. Author: Jakob Odersky <jakob@odersky.com> Closes #12707 from jodersky/SPARK-14511-genjavadoc.
-
- Apr 28, 2016
-