Skip to content
Snippets Groups Projects
  1. Apr 25, 2017
  2. Apr 14, 2017
  3. Apr 10, 2017
    • Shixiong Zhu's avatar
      [SPARK-17564][TESTS] Fix flaky RequestTimeoutIntegrationSuite.furtherRequestsDelay · 8eb71b81
      Shixiong Zhu authored
      ## What changes were proposed in this pull request?
      
      This PR  fixs the following failure:
      ```
      sbt.ForkMain$ForkError: java.lang.AssertionError: null
      	at org.junit.Assert.fail(Assert.java:86)
      	at org.junit.Assert.assertTrue(Assert.java:41)
      	at org.junit.Assert.assertTrue(Assert.java:52)
      	at org.apache.spark.network.RequestTimeoutIntegrationSuite.furtherRequestsDelay(RequestTimeoutIntegrationSuite.java:230)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:497)
      	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
      	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
      	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
      	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
      	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
      	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
      	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
      	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
      	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
      	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
      	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
      	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
      	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
      	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
      	at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
      	at org.junit.runners.Suite.runChild(Suite.java:128)
      	at org.junit.runners.Suite.runChild(Suite.java:27)
      	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
      	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
      	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
      	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
      	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
      	at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
      	at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
      	at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
      	at com.novocode.junit.JUnitRunner$1.execute(JUnitRunner.java:132)
      	at sbt.ForkMain$Run$2.call(ForkMain.java:296)
      	at sbt.ForkMain$Run$2.call(ForkMain.java:286)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      	at java.lang.Thread.run(Thread.java:745)
      ```
      
      It happens several times per month on [Jenkins](http://spark-tests.appspot.com/test-details?suite_name=org.apache.spark.network.RequestTimeoutIntegrationSuite&test_name=furtherRequestsDelay). The failure is because `callback1` may not be called before `assertTrue(callback1.failure instanceof IOException);`. It's pretty easy to reproduce this error by adding a sleep before this line: https://github.com/apache/spark/blob/379b0b0bbdbba2278ce3bcf471bd75f6ffd9cf0d/common/network-common/src/test/java/org/apache/spark/network/RequestTimeoutIntegrationSuite.java#L267
      
      
      
      The fix is straightforward: just use the latch to wait until `callback1` is called.
      
      ## How was this patch tested?
      
      Jenkins
      
      Author: Shixiong Zhu <shixiong@databricks.com>
      
      Closes #17599 from zsxwing/SPARK-17564.
      
      (cherry picked from commit 734dfbfc)
      Signed-off-by: default avatarReynold Xin <rxin@databricks.com>
      8eb71b81
  4. Apr 02, 2017
  5. Mar 28, 2017
  6. Mar 21, 2017
  7. Feb 13, 2017
    • Josh Rosen's avatar
      [SPARK-19529] TransportClientFactory.createClient() shouldn't call awaitUninterruptibly() · 5db23473
      Josh Rosen authored
      
      This patch replaces a single `awaitUninterruptibly()` call with a plain `await()` call in Spark's `network-common` library in order to fix a bug which may cause tasks to be uncancellable.
      
      In Spark's Netty RPC layer, `TransportClientFactory.createClient()` calls `awaitUninterruptibly()` on a Netty future while waiting for a connection to be established. This creates problem when a Spark task is interrupted while blocking in this call (which can happen in the event of a slow connection which will eventually time out). This has bad impacts on task cancellation when `interruptOnCancel = true`.
      
      As an example of the impact of this problem, I experienced significant numbers of uncancellable "zombie tasks" on a production cluster where several tasks were blocked trying to connect to a dead shuffle server and then continued running as zombies after I cancelled the associated Spark stage. The zombie tasks ran for several minutes with the following stack:
      
      ```
      java.lang.Object.wait(Native Method)
      java.lang.Object.wait(Object.java:460)
      io.netty.util.concurrent.DefaultPromise.await0(DefaultPromise.java:607)
      io.netty.util.concurrent.DefaultPromise.awaitUninterruptibly(DefaultPromise.java:301)
      org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:224)
      org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:179) => holding Monitor(java.lang.Object1849476028})
      org.apache.spark.network.shuffle.ExternalShuffleClient$1.createAndStart(ExternalShuffleClient.java:105)
      org.apache.spark.network.shuffle.RetryingBlockFetcher.fetchAllOutstanding(RetryingBlockFetcher.java:140)
      org.apache.spark.network.shuffle.RetryingBlockFetcher.start(RetryingBlockFetcher.java:120)
      org.apache.spark.network.shuffle.ExternalShuffleClient.fetchBlocks(ExternalShuffleClient.java:114)
      org.apache.spark.storage.ShuffleBlockFetcherIterator.sendRequest(ShuffleBlockFetcherIterator.scala:169)
      org.apache.spark.storage.ShuffleBlockFetcherIterator.fetchUpToMaxBytes(ShuffleBlockFetcherIterator.scala:
      350)
      org.apache.spark.storage.ShuffleBlockFetcherIterator.initialize(ShuffleBlockFetcherIterator.scala:286)
      org.apache.spark.storage.ShuffleBlockFetcherIterator.<init>(ShuffleBlockFetcherIterator.scala:120)
      org.apache.spark.shuffle.BlockStoreShuffleReader.read(BlockStoreShuffleReader.scala:45)
      org.apache.spark.sql.execution.ShuffledRowRDD.compute(ShuffledRowRDD.scala:169)
      org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
      org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
      [...]
      ```
      
      As far as I can tell, `awaitUninterruptibly()` might have been used in order to avoid having to declare that methods throw `InterruptedException` (this code is written in Java, hence the need to use checked exceptions). This patch simply replaces this with a regular, interruptible `await()` call,.
      
      This required several interface changes to declare a new checked exception (these are internal interfaces, though, and this change doesn't significantly impact binary compatibility).
      
      An alternative approach would be to wrap `InterruptedException` into `IOException` in order to avoid having to change interfaces. The problem with this approach is that the `network-shuffle` project's `RetryingBlockFetcher` code treats `IOExceptions` as transitive failures when deciding whether to retry fetches, so throwing a wrapped `IOException` might cause an interrupted shuffle fetch to be retried, further prolonging the lifetime of a cancelled zombie task.
      
      Note that there are three other `awaitUninterruptibly()` in the codebase, but those calls have a hard 10 second timeout and are waiting on a `close()` operation which is expected to complete near instantaneously, so the impact of uninterruptibility there is much smaller.
      
      Manually.
      
      Author: Josh Rosen <joshrosen@databricks.com>
      
      Closes #16866 from JoshRosen/SPARK-19529.
      
      (cherry picked from commit 1c4d10b1)
      Signed-off-by: default avatarCheng Lian <lian@databricks.com>
      5db23473
    • Shixiong Zhu's avatar
      [SPARK-17714][CORE][TEST-MAVEN][TEST-HADOOP2.6] Avoid using... · 328b2298
      Shixiong Zhu authored
      [SPARK-17714][CORE][TEST-MAVEN][TEST-HADOOP2.6] Avoid using ExecutorClassLoader to load Netty generated classes
      
      ## What changes were proposed in this pull request?
      
      Netty's `MessageToMessageEncoder` uses [Javassist](https://github.com/netty/netty/blob/91a0bdc17a8298437d6de08a8958d753799bd4a6/common/src/main/java/io/netty/util/internal/JavassistTypeParameterMatcherGenerator.java#L62
      
      ) to generate a matcher class and the implementation calls `Class.forName` to check if this class is already generated. If `MessageEncoder` or `MessageDecoder` is created in `ExecutorClassLoader.findClass`, it will cause `ClassCircularityError`. This is because loading this Netty generated class will call `ExecutorClassLoader.findClass` to search this class, and `ExecutorClassLoader` will try to use RPC to load it and cause to load the non-exist matcher class again. JVM will report `ClassCircularityError` to prevent such infinite recursion.
      
      ##### Why it only happens in Maven builds
      
      It's because Maven and SBT have different class loader tree. The Maven build will set a URLClassLoader as the current context class loader to run the tests and expose this issue. The class loader tree is as following:
      
      ```
      bootstrap class loader ------ ... ----- REPL class loader ---- ExecutorClassLoader
      |
      |
      URLClasssLoader
      ```
      
      The SBT build uses the bootstrap class loader directly and `ReplSuite.test("propagation of local properties")` is the first test in ReplSuite, which happens to load `io/netty/util/internal/__matchers__/org/apache/spark/network/protocol/MessageMatcher` into the bootstrap class loader (Note: in maven build, it's loaded into URLClasssLoader so it cannot be found in ExecutorClassLoader). This issue can be reproduced in SBT as well. Here are the produce steps:
      - Enable `hadoop.caller.context.enabled`.
      - Replace `Class.forName` with `Utils.classForName` in `object CallerContext`.
      - Ignore `ReplSuite.test("propagation of local properties")`.
      - Run `ReplSuite` using SBT.
      
      This PR just creates a singleton MessageEncoder and MessageDecoder and makes sure they are created before switching to ExecutorClassLoader. TransportContext will be created when creating RpcEnv and that happens before creating ExecutorClassLoader.
      
      ## How was this patch tested?
      
      Jenkins
      
      Author: Shixiong Zhu <shixiong@databricks.com>
      
      Closes #16859 from zsxwing/SPARK-17714.
      
      (cherry picked from commit 905fdf0c)
      Signed-off-by: default avatarShixiong Zhu <shixiong@databricks.com>
      328b2298
  8. Jan 13, 2017
    • Wenchen Fan's avatar
      [SPARK-19178][SQL] convert string of large numbers to int should return null · 2c2ca894
      Wenchen Fan authored
      
      ## What changes were proposed in this pull request?
      
      When we convert a string to integral, we will convert that string to `decimal(20, 0)` first, so that we can turn a string with decimal format to truncated integral, e.g. `CAST('1.2' AS int)` will return `1`.
      
      However, this brings problems when we convert a string with large numbers to integral, e.g. `CAST('1234567890123' AS int)` will return `1912276171`, while Hive returns null as we expected.
      
      This is a long standing bug(seems it was there the first day Spark SQL was created), this PR fixes this bug by adding the native support to convert `UTF8String` to integral.
      
      ## How was this patch tested?
      
      new regression tests
      
      Author: Wenchen Fan <wenchen@databricks.com>
      
      Closes #16550 from cloud-fan/string-to-int.
      
      (cherry picked from commit 6b34e745)
      Signed-off-by: default avatarWenchen Fan <wenchen@databricks.com>
      2c2ca894
  9. Dec 28, 2016
  10. Dec 22, 2016
    • Shixiong Zhu's avatar
      [SPARK-18972][CORE] Fix the netty thread names for RPC · 1857acc7
      Shixiong Zhu authored
      
      ## What changes were proposed in this pull request?
      
      Right now the name of threads created by Netty for Spark RPC are `shuffle-client-**` and `shuffle-server-**`. It's pretty confusing.
      
      This PR just uses the module name in TransportConf to set the thread name. In addition, it also includes the following minor fixes:
      
      - TransportChannelHandler.channelActive and channelInactive should call the corresponding super methods.
      - Make ShuffleBlockFetcherIterator throw NoSuchElementException if it has no more elements. Otherwise,  if the caller calls `next` without `hasNext`, it will just hang.
      
      ## How was this patch tested?
      
      Jenkins
      
      Author: Shixiong Zhu <shixiong@databricks.com>
      
      Closes #16380 from zsxwing/SPARK-18972.
      
      (cherry picked from commit f252cb5d)
      Signed-off-by: default avatarShixiong Zhu <shixiong@databricks.com>
      1857acc7
    • Ryan Williams's avatar
      [SPARK-17807][CORE] split test-tags into test-JAR · 132f2297
      Ryan Williams authored
      
      Remove spark-tag's compile-scope dependency (and, indirectly, spark-core's compile-scope transitive-dependency) on scalatest by splitting test-oriented tags into spark-tags' test JAR.
      
      Alternative to #16303.
      
      Author: Ryan Williams <ryan.blake.williams@gmail.com>
      
      Closes #16311 from ryan-williams/tt.
      
      (cherry picked from commit afd9bc1d)
      Signed-off-by: default avatarMarcelo Vanzin <vanzin@cloudera.com>
      132f2297
  11. Dec 15, 2016
  12. Dec 08, 2016
  13. Nov 28, 2016
  14. Nov 14, 2016
    • Michael Armbrust's avatar
      [SPARK-18124] Observed delay based Event Time Watermarks · 27999b36
      Michael Armbrust authored
      
      This PR adds a new method `withWatermark` to the `Dataset` API, which can be used specify an _event time watermark_.  An event time watermark allows the streaming engine to reason about the point in time after which we no longer expect to see late data.  This PR also has augmented `StreamExecution` to use this watermark for several purposes:
        - To know when a given time window aggregation is finalized and thus results can be emitted when using output modes that do not allow updates (e.g. `Append` mode).
        - To minimize the amount of state that we need to keep for on-going aggregations, by evicting state for groups that are no longer expected to change.  Although, we do still maintain all state if the query requires (i.e. if the event time is not present in the `groupBy` or when running in `Complete` mode).
      
      An example that emits windowed counts of records, waiting up to 5 minutes for late data to arrive.
      ```scala
      df.withWatermark("eventTime", "5 minutes")
        .groupBy(window($"eventTime", "1 minute") as 'window)
        .count()
        .writeStream
        .format("console")
        .mode("append") // In append mode, we only output finalized aggregations.
        .start()
      ```
      
      ### Calculating the watermark.
      The current event time is computed by looking at the `MAX(eventTime)` seen this epoch across all of the partitions in the query minus some user defined _delayThreshold_.  An additional constraint is that the watermark must increase monotonically.
      
      Note that since we must coordinate this value across partitions occasionally, the actual watermark used is only guaranteed to be at least `delay` behind the actual event time.  In some cases we may still process records that arrive more than delay late.
      
      This mechanism was chosen for the initial implementation over processing time for two reasons:
        - it is robust to downtime that could affect processing delay
        - it does not require syncing of time or timezones between the producer and the processing engine.
      
      ### Other notable implementation details
       - A new trigger metric `eventTimeWatermark` outputs the current value of the watermark.
       - We mark the event time column in the `Attribute` metadata using the key `spark.watermarkDelay`.  This allows downstream operations to know which column holds the event time.  Operations like `window` propagate this metadata.
       - `explain()` marks the watermark with a suffix of `-T${delayMs}` to ease debugging of how this information is propagated.
       - Currently, we don't filter out late records, but instead rely on the state store to avoid emitting records that are both added and filtered in the same epoch.
      
      ### Remaining in this PR
       - [ ] The test for recovery is currently failing as we don't record the watermark used in the offset log.  We will need to do so to ensure determinism, but this is deferred until #15626 is merged.
      
      ### Other follow-ups
      There are some natural additional features that we should consider for future work:
       - Ability to write records that arrive too late to some external store in case any out-of-band remediation is required.
       - `Update` mode so you can get partial results before a group is evicted.
       - Other mechanisms for calculating the watermark.  In particular a watermark based on quantiles would be more robust to outliers.
      
      Author: Michael Armbrust <michael@databricks.com>
      
      Closes #15702 from marmbrus/watermarks.
      
      (cherry picked from commit c0718782)
      Signed-off-by: default avatarTathagata Das <tathagata.das1565@gmail.com>
      27999b36
  15. Oct 07, 2016
    • Reynold Xin's avatar
      [SPARK-17800] Introduce InterfaceStability annotation · dd16b52c
      Reynold Xin authored
      ## What changes were proposed in this pull request?
      This patch introduces three new annotations under InterfaceStability:
      - Stable
      - Evolving
      - Unstable
      
      This is inspired by Hadoop's InterfaceStability, and the first step towards switching over to a new API stability annotation framework.
      
      ## How was this patch tested?
      N/A
      
      Author: Reynold Xin <rxin@databricks.com>
      
      Closes #15374 from rxin/SPARK-17800.
      dd16b52c
  16. Oct 04, 2016
  17. Sep 27, 2016
    • Kazuaki Ishizaki's avatar
      [SPARK-15962][SQL] Introduce implementation with a dense format for UnsafeArrayData · 85b0a157
      Kazuaki Ishizaki authored
      ## What changes were proposed in this pull request?
      
      This PR introduces more compact representation for ```UnsafeArrayData```.
      
      ```UnsafeArrayData``` needs to accept ```null``` value in each entry of an array. In the current version, it has three parts
      ```
      [numElements] [offsets] [values]
      ```
      `Offsets` has the number of `numElements`, and represents `null` if its value is negative. It may increase memory footprint, and introduces an indirection for accessing each of `values`.
      
      This PR uses bitvectors to represent nullability for each element like `UnsafeRow`, and eliminates an indirection for accessing each element. The new ```UnsafeArrayData``` has four parts.
      ```
      [numElements][null bits][values or offset&length][variable length portion]
      ```
      In the `null bits` region, we store 1 bit per element, represents whether an element is null. Its total size is ceil(numElements / 8) bytes, and it is aligned to 8-byte boundaries.
      In the `values or offset&length` region, we store the content of elements. For fields that hold fixed-length primitive types, such as long, double, or int, we store the value directly in the field. For fields with non-primitive or variable-length values, we store a relative offset (w.r.t. the base address of the array) that points to the beginning of the variable-length field and length (they are combined into a long). Each is word-aligned. For `variable length portion`, each is aligned to 8-byte boundaries.
      
      The new format can reduce memory footprint and improve performance of accessing each element. An example of memory foot comparison:
      1024x1024 elements integer array
      Size of ```baseObject``` for ```UnsafeArrayData```: 8 + 1024x1024 + 1024x1024 = 2M bytes
      Size of ```baseObject``` for ```UnsafeArrayData```: 8 + 1024x1024/8 + 1024x1024 = 1.25M bytes
      
      In summary, we got 1.0-2.6x performance improvements over the code before applying this PR.
      Here are performance results of [benchmark programs](https://github.com/kiszk/spark/blob/04d2e4b6dbdc4eff43ce18b3c9b776e0129257c7/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/UnsafeArrayDataBenchmark.scala):
      
      **Read UnsafeArrayData**: 1.7x and 1.6x performance improvements over the code before applying this PR
      ````
      OpenJDK 64-Bit Server VM 1.8.0_91-b14 on Linux 4.4.11-200.fc22.x86_64
      Intel Xeon E3-12xx v2 (Ivy Bridge)
      
      Without SPARK-15962
      Read UnsafeArrayData:                    Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
      ------------------------------------------------------------------------------------------------
      Int                                            430 /  436        390.0           2.6       1.0X
      Double                                         456 /  485        367.8           2.7       0.9X
      
      With SPARK-15962
      Read UnsafeArrayData:                    Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
      ------------------------------------------------------------------------------------------------
      Int                                            252 /  260        666.1           1.5       1.0X
      Double                                         281 /  292        597.7           1.7       0.9X
      ````
      **Write UnsafeArrayData**: 1.0x and 1.1x performance improvements over the code before applying this PR
      ````
      OpenJDK 64-Bit Server VM 1.8.0_91-b14 on Linux 4.0.4-301.fc22.x86_64
      Intel Xeon E3-12xx v2 (Ivy Bridge)
      
      Without SPARK-15962
      Write UnsafeArrayData:                   Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
      ------------------------------------------------------------------------------------------------
      Int                                            203 /  273        103.4           9.7       1.0X
      Double                                         239 /  356         87.9          11.4       0.8X
      
      With SPARK-15962
      Write UnsafeArrayData:                   Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
      ------------------------------------------------------------------------------------------------
      Int                                            196 /  249        107.0           9.3       1.0X
      Double                                         227 /  367         92.3          10.8       0.9X
      ````
      
      **Get primitive array from UnsafeArrayData**: 2.6x and 1.6x performance improvements over the code before applying this PR
      ````
      OpenJDK 64-Bit Server VM 1.8.0_91-b14 on Linux 4.0.4-301.fc22.x86_64
      Intel Xeon E3-12xx v2 (Ivy Bridge)
      
      Without SPARK-15962
      Get primitive array from UnsafeArrayData: Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
      ------------------------------------------------------------------------------------------------
      Int                                            207 /  217        304.2           3.3       1.0X
      Double                                         257 /  363        245.2           4.1       0.8X
      
      With SPARK-15962
      Get primitive array from UnsafeArrayData: Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
      ------------------------------------------------------------------------------------------------
      Int                                            151 /  198        415.8           2.4       1.0X
      Double                                         214 /  394        293.6           3.4       0.7X
      ````
      
      **Create UnsafeArrayData from primitive array**: 1.7x and 2.1x performance improvements over the code before applying this PR
      ````
      OpenJDK 64-Bit Server VM 1.8.0_91-b14 on Linux 4.0.4-301.fc22.x86_64
      Intel Xeon E3-12xx v2 (Ivy Bridge)
      
      Without SPARK-15962
      Create UnsafeArrayData from primitive array: Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
      ------------------------------------------------------------------------------------------------
      Int                                            340 /  385        185.1           5.4       1.0X
      Double                                         479 /  705        131.3           7.6       0.7X
      
      With SPARK-15962
      Create UnsafeArrayData from primitive array: Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
      ------------------------------------------------------------------------------------------------
      Int                                            206 /  211        306.0           3.3       1.0X
      Double                                         232 /  406        271.6           3.7       0.9X
      ````
      
      1.7x and 1.4x performance improvements in [```UDTSerializationBenchmark```](https://github.com/apache/spark/blob/master/mllib/src/test/scala/org/apache/spark/mllib/linalg/UDTSerializationBenchmark.scala)  over the code before applying this PR
      ````
      OpenJDK 64-Bit Server VM 1.8.0_91-b14 on Linux 4.4.11-200.fc22.x86_64
      Intel Xeon E3-12xx v2 (Ivy Bridge)
      
      Without SPARK-15962
      VectorUDT de/serialization:              Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
      ------------------------------------------------------------------------------------------------
      serialize                                      442 /  533          0.0      441927.1       1.0X
      deserialize                                    217 /  274          0.0      217087.6       2.0X
      
      With SPARK-15962
      VectorUDT de/serialization:              Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
      ------------------------------------------------------------------------------------------------
      serialize                                      265 /  318          0.0      265138.5       1.0X
      deserialize                                    155 /  197          0.0      154611.4       1.7X
      ````
      
      ## How was this patch tested?
      
      Added unit tests into ```UnsafeArraySuite```
      
      Author: Kazuaki Ishizaki <ishizaki@jp.ibm.com>
      
      Closes #13680 from kiszk/SPARK-15962.
      85b0a157
  18. Sep 20, 2016
    • Weiqing Yang's avatar
      [MINOR][BUILD] Fix CheckStyle Error · 1ea49916
      Weiqing Yang authored
      ## What changes were proposed in this pull request?
      This PR is to fix the code style errors before 2.0.1 release.
      
      ## How was this patch tested?
      Manual.
      
      Before:
      ```
      ./dev/lint-java
      Using `mvn` from path: /usr/local/bin/mvn
      Checkstyle checks failed at following occurrences:
      [ERROR] src/main/java/org/apache/spark/network/client/TransportClient.java:[153] (sizes) LineLength: Line is longer than 100 characters (found 107).
      [ERROR] src/main/java/org/apache/spark/network/client/TransportClient.java:[196] (sizes) LineLength: Line is longer than 100 characters (found 108).
      [ERROR] src/main/java/org/apache/spark/network/client/TransportClient.java:[239] (sizes) LineLength: Line is longer than 100 characters (found 115).
      [ERROR] src/main/java/org/apache/spark/network/server/TransportRequestHandler.java:[119] (sizes) LineLength: Line is longer than 100 characters (found 107).
      [ERROR] src/main/java/org/apache/spark/network/server/TransportRequestHandler.java:[129] (sizes) LineLength: Line is longer than 100 characters (found 104).
      [ERROR] src/main/java/org/apache/spark/network/util/LevelDBProvider.java:[124,11] (modifier) ModifierOrder: 'static' modifier out of order with the JLS suggestions.
      [ERROR] src/main/java/org/apache/spark/network/util/TransportConf.java:[26] (regexp) RegexpSingleline: No trailing whitespace allowed.
      [ERROR] src/main/java/org/apache/spark/util/collection/unsafe/sort/PrefixComparators.java:[33] (sizes) LineLength: Line is longer than 100 characters (found 110).
      [ERROR] src/main/java/org/apache/spark/util/collection/unsafe/sort/PrefixComparators.java:[38] (sizes) LineLength: Line is longer than 100 characters (found 110).
      [ERROR] src/main/java/org/apache/spark/util/collection/unsafe/sort/PrefixComparators.java:[43] (sizes) LineLength: Line is longer than 100 characters (found 106).
      [ERROR] src/main/java/org/apache/spark/util/collection/unsafe/sort/PrefixComparators.java:[48] (sizes) LineLength: Line is longer than 100 characters (found 110).
      [ERROR] src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeInMemorySorter.java:[0] (misc) NewlineAtEndOfFile: File does not end with a newline.
      [ERROR] src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeSorterSpillReader.java:[67] (sizes) LineLength: Line is longer than 100 characters (found 106).
      [ERROR] src/main/java/org/apache/spark/network/yarn/YarnShuffleService.java:[200] (regexp) RegexpSingleline: No trailing whitespace allowed.
      [ERROR] src/main/java/org/apache/spark/network/yarn/YarnShuffleService.java:[309] (regexp) RegexpSingleline: No trailing whitespace allowed.
      [ERROR] src/main/java/org/apache/spark/network/yarn/YarnShuffleService.java:[332] (regexp) RegexpSingleline: No trailing whitespace allowed.
      [ERROR] src/main/java/org/apache/spark/network/yarn/YarnShuffleService.java:[348] (regexp) RegexpSingleline: No trailing whitespace allowed.
       ```
      After:
      ```
      ./dev/lint-java
      Using `mvn` from path: /usr/local/bin/mvn
      Checkstyle checks passed.
      ```
      
      Author: Weiqing Yang <yangweiqing001@gmail.com>
      
      Closes #15170 from Sherry302/fixjavastyle.
      1ea49916
    • Marcelo Vanzin's avatar
      [SPARK-17611][YARN][TEST] Make shuffle service test really test auth. · 7e418e99
      Marcelo Vanzin authored
      Currently, the code is just swallowing exceptions, and not really checking
      whether the auth information was being recorded properly. Fix both problems,
      and also avoid tests inadvertently affecting other tests by modifying the
      shared config variable (by making it not shared).
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #15161 from vanzin/SPARK-17611.
      7e418e99
  19. Sep 16, 2016
    • Jagadeesan's avatar
      [SPARK-17543] Missing log4j config file for tests in common/network-… · b2e27262
      Jagadeesan authored
      ## What changes were proposed in this pull request?
      
      The Maven module `common/network-shuffle` does not have a log4j configuration file for its test cases. So, added `log4j.properties` in the directory `src/test/resources`.
      
      …shuffle]
      
      Author: Jagadeesan <as2@us.ibm.com>
      
      Closes #15108 from jagadeesanas2/SPARK-17543.
      b2e27262
  20. Sep 15, 2016
    • Adam Roberts's avatar
      [SPARK-17379][BUILD] Upgrade netty-all to 4.0.41 final for bug fixes · 0ad8eeb4
      Adam Roberts authored
      ## What changes were proposed in this pull request?
      Upgrade netty-all to latest in the 4.0.x line which is 4.0.41, mentions several bug fixes and performance improvements we may find useful, see netty.io/news/2016/08/29/4-0-41-Final-4-1-5-Final.html. Initially tried to use 4.1.5 but noticed it's not backwards compatible.
      
      ## How was this patch tested?
      Existing unit tests against branch-1.6 and branch-2.0 using IBM Java 8 on Intel, Power and Z architectures
      
      Author: Adam Roberts <aroberts@uk.ibm.com>
      
      Closes #14961 from a-roberts/netty.
      0ad8eeb4
  21. Sep 09, 2016
    • Thomas Graves's avatar
      [SPARK-17433] YarnShuffleService doesn't handle moving credentials levelDb · a3981c28
      Thomas Graves authored
      The secrets leveldb isn't being moved if you run spark shuffle services without yarn nm recovery on and then turn it on.  This fixes that.  I unfortunately missed this when I ported the patch from our internal branch 2 to master branch due to the changes for the recovery path.  Note this only applies to master since it is the only place the yarn nm recovery dir is used.
      
      Unit tests ran and tested on 8 node cluster.  Fresh startup with NM recovery, fresh startup no nm recovery, switching between no nm recovery and recovery.  Also tested running applications to make sure wasn't affected by rolling upgrade.
      
      Author: Thomas Graves <tgraves@prevailsail.corp.gq1.yahoo.com>
      Author: Tom Graves <tgraves@apache.org>
      
      Closes #14999 from tgravescs/SPARK-17433.
      a3981c28
  22. Sep 06, 2016
  23. Sep 02, 2016
    • Thomas Graves's avatar
      [SPARK-16711] YarnShuffleService doesn't re-init properly on YARN rolling upgrade · e79962f2
      Thomas Graves authored
      The Spark Yarn Shuffle Service doesn't re-initialize the application credentials early enough which causes any other spark executors trying to fetch from that node during a rolling upgrade to fail with "java.lang.NullPointerException: Password cannot be null if SASL is enabled".  Right now the spark shuffle service relies on the Yarn nodemanager to re-register the applications, unfortunately this is after we open the port for other executors to connect. If other executors connected before the re-register they get a null pointer exception which isn't a re-tryable exception and cause them to fail pretty quickly. To solve this I added another leveldb file so that it can save and re-initialize all the applications before opening the port for other executors to connect to it.  Adding another leveldb was simpler from the code structure point of view.
      
      Most of the code changes are moving things to common util class.
      
      Patch was tested manually on a Yarn cluster with rolling upgrade was happing while spark job was running. Without the patch I consistently get the NullPointerException, with the patch the job gets a few Connection refused exceptions but the retries kick in and the it succeeds.
      
      Author: Thomas Graves <tgraves@staydecay.corp.gq1.yahoo.com>
      
      Closes #14718 from tgravescs/SPARK-16711.
      e79962f2
  24. Sep 01, 2016
    • Sean Owen's avatar
      [SPARK-17331][CORE][MLLIB] Avoid allocating 0-length arrays · 3893e8c5
      Sean Owen authored
      ## What changes were proposed in this pull request?
      
      Avoid allocating some 0-length arrays, esp. in UTF8String, and by using Array.empty in Scala over Array[T]()
      
      ## How was this patch tested?
      
      Jenkins
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #14895 from srowen/SPARK-17331.
      3893e8c5
  25. Aug 31, 2016
    • Sean Owen's avatar
      [SPARK-17332][CORE] Make Java Loggers static members · 5d84c7fd
      Sean Owen authored
      ## What changes were proposed in this pull request?
      
      Make all Java Loggers static members
      
      ## How was this patch tested?
      
      Jenkins
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #14896 from srowen/SPARK-17332.
      5d84c7fd
Loading