  1. Jul 02, 2015
    • [SPARK-3071] Increase default driver memory · 3697232b
      Ilya Ganelin authored
      I've updated default values in comments, documentation, and in the command line builder to be 1g based on comments in the JIRA. I've also updated most usages to point at a single variable defined in the Utils.scala and JavaUtils.java files. This wasn't possible in all cases (R, shell scripts etc.) but usage in most code is now pointing at the same place.
      
      Please let me know if I've missed anything.
      
      Will the spark-shell use the value within the command line builder during instantiation?
      
      Author: Ilya Ganelin <ilya.ganelin@capitalone.com>
      
      Closes #7132 from ilganeli/SPARK-3071 and squashes the following commits:
      
      4074164 [Ilya Ganelin] String fix
      271610b [Ilya Ganelin] Merge branch 'SPARK-3071' of github.com:ilganeli/spark into SPARK-3071
      273b6e9 [Ilya Ganelin] Test fix
      fd67721 [Ilya Ganelin] Update JavaUtils.java
      26cc177 [Ilya Ganelin] test fix
      e5db35d [Ilya Ganelin] Fixed test failure
      39732a1 [Ilya Ganelin] merge fix
      a6f7deb [Ilya Ganelin] Created default value for DRIVER MEM in Utils that's now used in almost all locations instead of setting manually in each
      09ad698 [Ilya Ganelin] Update SubmitRestProtocolSuite.scala
      19b6f25 [Ilya Ganelin] Missed one doc update
      2698a3d [Ilya Ganelin] Updated default value for driver memory
  2. Jun 28, 2015
    • [SPARK-8683] [BUILD] Depend on mockito-core instead of mockito-all · f5100451
      Josh Rosen authored
      Spark's tests currently depend on `mockito-all`, which bundles Hamcrest and Objenesis classes. Instead, it should depend on `mockito-core`, which declares those libraries as Maven dependencies. This is necessary in order to fix a dependency conflict that leads to a NoSuchMethodError when using certain Hamcrest matchers.
      
      See https://github.com/mockito/mockito/wiki/Declaring-mockito-dependency for more details.
      
      Author: Josh Rosen <joshrosen@databricks.com>
      
      Closes #7061 from JoshRosen/mockito-core-instead-of-all and squashes the following commits:
      
      70eccbe [Josh Rosen] Depend on mockito-core instead of mockito-all.
  3. Jun 19, 2015
  4. Jun 03, 2015
    • [SPARK-7801] [BUILD] Updating versions to SPARK 1.5.0 · 2c4d550e
      Patrick Wendell authored
      Author: Patrick Wendell <patrick@databricks.com>
      
      Closes #6328 from pwendell/spark-1.5-update and squashes the following commits:
      
      2f42d02 [Patrick Wendell] A few more excludes
      4bebcf0 [Patrick Wendell] Update to RC4
      61aaf46 [Patrick Wendell] Using new release candidate
      55f1610 [Patrick Wendell] Another exclude
      04b4f04 [Patrick Wendell] More issues with transient 1.4 changes
      36f549b [Patrick Wendell] [SPARK-7801] [BUILD] Updating versions to SPARK 1.5.0
  5. May 19, 2015
    • [SPARK-7726] Fix Scaladoc false errors · 3c4c1f96
      Iulian Dragos authored
      Visibility rules for static members are different in Scala and Java, and this case requires an explicit static import. Even though these are Java files, they are run through scaladoc, which enforces Scala rules.
      
      Also reverted the commit that reverts the upgrade to 2.11.6
      
      Author: Iulian Dragos <jaguarul@gmail.com>
      
      Closes #6260 from dragos/issue/scaladoc-false-error and squashes the following commits:
      
      f2e998e [Iulian Dragos] Revert "[HOTFIX] Revert "[SPARK-7092] Update spark scala version to 2.11.6""
      0bad052 [Iulian Dragos] Fix scaladoc faux-error.
  6. May 08, 2015
    • [SPARK-6955] Perform port retries at NettyBlockTransferService level · ffdc40ce
      Aaron Davidson authored
      Currently we're doing port retries at the TransportServer level, but this is not specified by the TransportContext API, and it has further-reaching impacts, such as causing undesirable behavior for the YARN and Standalone shuffle services.
      
      Author: Aaron Davidson <aaron@databricks.com>
      
      Closes #5575 from aarondav/port-bind and squashes the following commits:
      
      3c2d6ed [Aaron Davidson] Oops, never do it.
      a5d9432 [Aaron Davidson] Remove shouldHostShuffleServiceIfEnabled
      e901eb2 [Aaron Davidson] fix local-cluster mode for ExternalShuffleServiceSuite
      59e5e38 [Aaron Davidson] [SPARK-6955] Perform port retries at NettyBlockTransferService level
    • [SPARK-6627] Finished rename to ShuffleBlockResolver · 4b3bb0e4
      Kay Ousterhout authored
      The previous cleanup-commit for SPARK-6627 renamed ShuffleBlockManager
      to ShuffleBlockResolver, but didn't rename the associated subclasses and
      variables; this commit does that.
      
      I'm unsure whether it's ok to rename ExternalShuffleBlockManager, since that's technically a public class?
      
      cc pwendell
      
      Author: Kay Ousterhout <kayousterhout@gmail.com>
      
      Closes #5764 from kayousterhout/SPARK-6627 and squashes the following commits:
      
      43add1e [Kay Ousterhout] Spacing fix
      96080bf [Kay Ousterhout] Test fixes
      d8a5d36 [Kay Ousterhout] [SPARK-6627] Finished rename to ShuffleBlockResolver
  7. May 01, 2015
    • [SPARK-6229] Add SASL encryption to network library. · 38d4e9e4
      Marcelo Vanzin authored
      There are two main parts of this change:
      
      - Extending the bootstrap mechanism in the network library to add a server-side
        bootstrap (which works a little bit differently than the client-side bootstrap), and
        to allow the bootstraps to modify the underlying channel.
      
      - Use SASL to encrypt data going through the RPC channel.
      
      The second item requires some non-optimal code to be able to work around the
      fact that the outbound path in netty is not thread-safe, and ordering is very important
      when encryption is in the picture.
      
      A lot of the changes outside the network/common library are just to adjust to the
      changed API for initializing the RPC server.
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #5377 from vanzin/SPARK-6229 and squashes the following commits:
      
      ff01966 [Marcelo Vanzin] Use fancy new size config style.
      be53f32 [Marcelo Vanzin] Merge branch 'master' into SPARK-6229
      47d4aff [Marcelo Vanzin] Merge branch 'master' into SPARK-6229
      7a2a805 [Marcelo Vanzin] Clean up some unneeded changes.
      2f92237 [Marcelo Vanzin] Add comment.
      67bb0c6 [Marcelo Vanzin] Revert "Avoid exposing ByteArrayWritableChannel outside of test code."
      065f684 [Marcelo Vanzin] Add test to verify chunking.
      3d1695d [Marcelo Vanzin] Minor cleanups.
      73cff0e [Marcelo Vanzin] Skip bytes in decode path too.
      318ad23 [Marcelo Vanzin] Avoid exposing ByteArrayWritableChannel outside of test code.
      346f829 [Marcelo Vanzin] Avoid trip through channel selector by not reporting 0 bytes written.
      a4a5938 [Marcelo Vanzin] Review feedback.
      4797519 [Marcelo Vanzin] Remove unused import.
      9908ada [Marcelo Vanzin] Fix test, SASL backend disposal.
      7fe1489 [Marcelo Vanzin] Add a test that makes sure encryption is actually enabled.
      adb6f9d [Marcelo Vanzin] Review feedback.
      cf2a605 [Marcelo Vanzin] Clean up some code.
      8584323 [Marcelo Vanzin] Fix a comment.
      e98bc55 [Marcelo Vanzin] Add option to only allow encrypted connections to the server.
      dad42fc [Marcelo Vanzin] Make encryption thread-safe, less memory-intensive.
      b00999a [Marcelo Vanzin] Consolidate ByteArrayWritableChannel, fix SASL code to match master changes.
      b923cae [Marcelo Vanzin] Make SASL encryption handler thread-safe, handle FileRegion messages.
      39539a7 [Marcelo Vanzin] Add config option to enable SASL encryption.
      351a86f [Marcelo Vanzin] Add SASL encryption to network library.
      fbe6ccb [Marcelo Vanzin] Add TransportServerBootstrap, make SASL code use it.
    • [SPARK-7183] [NETWORK] Fix memory leak of TransportRequestHandler.streamIds · 16860327
      Liang-Chi Hsieh authored
      JIRA: https://issues.apache.org/jira/browse/SPARK-7183
      
      Author: Liang-Chi Hsieh <viirya@gmail.com>
      
      Closes #5743 from viirya/fix_requesthandler_memory_leak and squashes the following commits:
      
      cf2c086 [Liang-Chi Hsieh] For comments.
      97e205c [Liang-Chi Hsieh] Remove unused import.
      d35f19a [Liang-Chi Hsieh] For comments.
      f9a0c37 [Liang-Chi Hsieh] Merge remote-tracking branch 'upstream/master' into fix_requesthandler_memory_leak
      45908b7 [Liang-Chi Hsieh] for style.
      17f020f [Liang-Chi Hsieh] Remove unused import.
      37a4b6c [Liang-Chi Hsieh] Remove streamIds from TransportRequestHandler.
      3b3f38a [Liang-Chi Hsieh] Fix memory leak of TransportRequestHandler.streamIds.
  8. Apr 28, 2015
    • [SPARK-5932] [CORE] Use consistent naming for size properties · 2d222fb3
      Ilya Ganelin authored
      I've added an interface to JavaUtils to do byte conversion and added hooks within Utils.scala to handle conversion within Spark code (like for time strings). I've added matching tests for size conversion, and then updated all deprecated configs and documentation as per SPARK-5933.
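
The size-string conversion described above can be illustrated with a minimal Python sketch. It assumes binary (kibi, mebi, and so on) multipliers and rejects fractional values, as the squashed commits below mention. The function name `byte_string_as`, the suffix table, and the error handling are hypothetical and are not the actual JavaUtils or Utils.scala code.

```python
import re

# Binary multipliers (kibi, mebi, ...), expressed in bytes.
# The accepted suffix set is an assumption chosen for illustration.
_BYTE_SUFFIXES = {
    "b": 1,
    "k": 1024, "kb": 1024,
    "m": 1024**2, "mb": 1024**2,
    "g": 1024**3, "gb": 1024**3,
    "t": 1024**4, "tb": 1024**4,
    "p": 1024**5, "pb": 1024**5,
}

# Whole numbers only: a fractional value such as "1.5g" does not match.
_SIZE_PATTERN = re.compile(r"^([0-9]+)([a-z]+)?$")


def byte_string_as(size: str, unit: str = "b", default_suffix: str = "b") -> int:
    """Convert a size string such as '64k' or '1g' into whole `unit`s.

    Uses integer arithmetic only; a value without a suffix is
    interpreted using `default_suffix`.
    """
    match = _SIZE_PATTERN.match(size.strip().lower())
    if match is None:
        raise ValueError(f"Invalid size string: {size!r}")
    number, suffix = match.groups()
    suffix = suffix or default_suffix
    if suffix not in _BYTE_SUFFIXES or unit not in _BYTE_SUFFIXES:
        raise ValueError(f"Unknown size suffix in: {size!r}")
    total_bytes = int(number) * _BYTE_SUFFIXES[suffix]
    return total_bytes // _BYTE_SUFFIXES[unit]
```

For example, `byte_string_as("1g", "m")` converts one gibibyte to mebibytes, and `byte_string_as("32", "k", default_suffix="k")` treats a bare number as kibibytes, which is how a legacy unit-suffixed property could be read.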
      
      Author: Ilya Ganelin <ilya.ganelin@capitalone.com>
      
      Closes #5574 from ilganeli/SPARK-5932 and squashes the following commits:
      
      11f6999 [Ilya Ganelin] Nit fixes
      49a8720 [Ilya Ganelin] Whitespace fix
      2ab886b [Ilya Ganelin] Scala style
      fc85733 [Ilya Ganelin] Got rid of floating point math
      852a407 [Ilya Ganelin] [SPARK-5932] Added much improved overflow handling. Can now handle sizes up to Long.MAX_VALUE Petabytes instead of being capped at Long.MAX_VALUE Bytes
      9ee779c [Ilya Ganelin] Simplified fraction matches
      22413b1 [Ilya Ganelin] Made MAX private
      3dfae96 [Ilya Ganelin] Fixed some nits. Added automatic conversion of old parameter for kryoserializer.mb to new values.
      e428049 [Ilya Ganelin] resolving merge conflict
      8b43748 [Ilya Ganelin] Fixed error in pattern matching for doubles
      84a2581 [Ilya Ganelin] Added smoother handling of fractional values for size parameters. This now throws an exception and added a warning for old spark.kryoserializer.buffer
      d3d09b6 [Ilya Ganelin] [SPARK-5932] Fixing error in KryoSerializer
      fe286b4 [Ilya Ganelin] Resolved merge conflict
      c7803cd [Ilya Ganelin] Empty lines
      54b78b4 [Ilya Ganelin] Simplified byteUnit class
      69e2f20 [Ilya Ganelin] Updates to code
      f32bc01 [Ilya Ganelin] [SPARK-5932] Fixed error in API in SparkConf.scala where Kb conversion wasn't being done properly (was Mb). Added test cases for both timeUnit and ByteUnit conversion
      f15f209 [Ilya Ganelin] Fixed conversion of kryo buffer size
      0f4443e [Ilya Ganelin]     Merge remote-tracking branch 'upstream/master' into SPARK-5932
      35a7fa7 [Ilya Ganelin] Minor formatting
      928469e [Ilya Ganelin] [SPARK-5932] Converted some longs to ints
      5d29f90 [Ilya Ganelin] [SPARK-5932] Finished documentation updates
      7a6c847 [Ilya Ganelin] [SPARK-5932] Updated spark.shuffle.file.buffer
      afc9a38 [Ilya Ganelin] [SPARK-5932] Updated spark.broadcast.blockSize and spark.storage.memoryMapThreshold
      ae7e9f6 [Ilya Ganelin] [SPARK-5932] Updated spark.io.compression.snappy.block.size
      2d15681 [Ilya Ganelin] [SPARK-5932] Updated spark.executor.logs.rolling.size.maxBytes
      1fbd435 [Ilya Ganelin] [SPARK-5932] Updated spark.broadcast.blockSize
      eba4de6 [Ilya Ganelin] [SPARK-5932] Updated spark.shuffle.file.buffer.kb
      b809a78 [Ilya Ganelin] [SPARK-5932] Updated spark.kryoserializer.buffer.max
      0cdff35 [Ilya Ganelin] [SPARK-5932] Updated to use bibibytes in method names. Updated spark.kryoserializer.buffer.mb and spark.reducer.maxMbInFlight
      475370a [Ilya Ganelin] [SPARK-5932] Simplified ByteUnit code, switched to using longs. Updated docs to clarify that we use kibi, mebi etc instead of kilo, mega
      851d691 [Ilya Ganelin] [SPARK-5932] Updated memoryStringToMb to use new interfaces
      a9f4fcf [Ilya Ganelin] [SPARK-5932] Added unit tests for unit conversion
      747393a [Ilya Ganelin] [SPARK-5932] Added unit tests for ByteString conversion
      09ea450 [Ilya Ganelin] [SPARK-5932] Added byte string conversion to Java utils
      5390fd9 [Ilya Ganelin] Merge remote-tracking branch 'upstream/master' into SPARK-5932
      db9a963 [Ilya Ganelin] Closing second spark context
      1dc0444 [Ilya Ganelin] Added ref equality check
      8c884fa [Ilya Ganelin] Made getOrCreate synchronized
      cb0c6b7 [Ilya Ganelin] Doc updates and code cleanup
      270cfe3 [Ilya Ganelin] [SPARK-6703] Documentation fixes
      15e8dea [Ilya Ganelin] Updated comments and added MiMa Exclude
      0e1567c [Ilya Ganelin] Got rid of unnecessary option for AtomicReference
      dfec4da [Ilya Ganelin] Changed activeContext to AtomicReference
      733ec9f [Ilya Ganelin] Fixed some bugs in test code
      8be2f83 [Ilya Ganelin] Replaced match with if
      e92caf7 [Ilya Ganelin] [SPARK-6703] Added test to ensure that getOrCreate both allows creation, retrieval, and a second context if desired
      a99032f [Ilya Ganelin] Spacing fix
      d7a06b8 [Ilya Ganelin] Updated SparkConf class to add getOrCreate method. Started test suite implementation
    • [SPARK-7168] [BUILD] Update plugin versions in Maven build and centralize versions · 7f3b3b7e
      Sean Owen authored
      Update Maven build plugin versions and centralize plugin version management
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #5720 from srowen/SPARK-7168 and squashes the following commits:
      
      98a8947 [Sean Owen] Make install, deploy plugin versions explicit
      4ecf3b2 [Sean Owen] Update Maven build plugin versions and centralize plugin version management
  9. Apr 20, 2015
    • [SPARK-7003] Improve reliability of connection failure detection between Netty... · 968ad972
      Aaron Davidson authored
      [SPARK-7003] Improve reliability of connection failure detection between Netty block transfer service endpoints
      
      Currently we rely on the assumption that an exception will be raised and the channel closed if two endpoints cannot communicate over a Netty TCP channel. However, this guarantee does not hold in all network environments, and [SPARK-6962](https://issues.apache.org/jira/browse/SPARK-6962) seems to point to a case where only the server side of the connection detected a fault.
      
      This patch improves robustness of fetch/rpc requests by having an explicit timeout in the transport layer which closes the connection if there is a period of inactivity while there are outstanding requests.
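
The rule described above (close the connection only when it has been silent for too long while requests are still outstanding) can be sketched in Python as follows. The `ConnectionWatchdog` class and its method names are hypothetical and only illustrate the decision logic; they are not the handler added by this patch.

```python
from dataclasses import dataclass


@dataclass
class ConnectionWatchdog:
    """Tracks activity on one connection (illustrative helper only)."""

    timeout_s: float
    last_activity_s: float = 0.0
    outstanding_requests: int = 0

    def on_request_sent(self, now_s: float) -> None:
        # Sending a request counts as activity and adds a pending response.
        self.outstanding_requests += 1
        self.last_activity_s = now_s

    def on_response_received(self, now_s: float) -> None:
        # Receiving a response counts as activity and settles one request.
        self.outstanding_requests = max(0, self.outstanding_requests - 1)
        self.last_activity_s = now_s

    def should_close(self, now_s: float) -> bool:
        # An idle connection with nothing pending is healthy; only a silent
        # connection that still owes responses is treated as failed.
        idle_for = now_s - self.last_activity_s
        return self.outstanding_requests > 0 and idle_for > self.timeout_s
```

Passing the current time in explicitly keeps the check deterministic; a real transport would call `should_close` from an idle-event callback or a periodic timer.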
      
      NB: This patch is actually only around 50 lines added if you exclude the testing-related code.
      
      Author: Aaron Davidson <aaron@databricks.com>
      
      Closes #5584 from aarondav/timeout and squashes the following commits:
      
      8699680 [Aaron Davidson] Address Reynold's comments
      37ce656 [Aaron Davidson] [SPARK-7003] Improve reliability of connection failure detection between Netty block transfer service endpoints
  10. Apr 13, 2015
    • [SPARK-5931][CORE] Use consistent naming for time properties · c4ab255e
      Ilya Ganelin authored
      I've added new utility methods that convert times specified as, e.g., 120s, 240ms, or 360us to a consistent internal representation. I've updated usage of these constants throughout the code to be consistent.
      
      I believe I've captured all usages of time-based properties throughout the code. I've also updated variable names in a number of places to reflect their units for clarity and updated documentation where appropriate.
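
A minimal Python sketch of this kind of time-string conversion is shown here, assuming suffixes like those in the examples above plus minutes, hours, and days. The function name `time_string_as`, the suffix table, and the default suffix are hypothetical and do not reproduce the Utils or JavaUtils helpers.

```python
import re

# Multipliers to microseconds. The suffix set mirrors the examples in the
# description plus minutes/hours/days, and is an assumption.
_TIME_SUFFIXES_US = {
    "us": 1,
    "ms": 1_000,
    "s": 1_000_000,
    "m": 60 * 1_000_000,
    "min": 60 * 1_000_000,
    "h": 3_600 * 1_000_000,
    "d": 86_400 * 1_000_000,
}

# Non-negative whole numbers followed by an optional unit suffix.
_TIME_PATTERN = re.compile(r"^([0-9]+)([a-z]+)?$")


def time_string_as(value: str, unit: str, default_suffix: str = "ms") -> int:
    """Convert a time string such as '120s' or '240ms' into whole `unit`s.

    A value without a suffix is interpreted using `default_suffix`, which
    lets a legacy property keep its historical unit.
    """
    match = _TIME_PATTERN.match(value.strip().lower())
    if match is None:
        raise ValueError(f"Invalid time string: {value!r}")
    number, suffix = match.groups()
    suffix = suffix or default_suffix
    if suffix not in _TIME_SUFFIXES_US or unit not in _TIME_SUFFIXES_US:
        raise ValueError(f"Unknown time suffix in: {value!r}")
    micros = int(number) * _TIME_SUFFIXES_US[suffix]
    return micros // _TIME_SUFFIXES_US[unit]
```

Converting through a single base unit (microseconds here) keeps the table small: every suffix needs one multiplier rather than one conversion per pair of units.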
      
      Author: Ilya Ganelin <ilya.ganelin@capitalone.com>
      Author: Ilya Ganelin <ilganeli@gmail.com>
      
      Closes #5236 from ilganeli/SPARK-5931 and squashes the following commits:
      
      4526c81 [Ilya Ganelin] Update configuration.md
      de3bff9 [Ilya Ganelin] Fixing style errors
      f5fafcd [Ilya Ganelin] Doc updates
      951ca2d [Ilya Ganelin] Made the most recent round of changes
      bc04e05 [Ilya Ganelin] Minor fixes and doc updates
      25d3f52 [Ilya Ganelin] Minor nit fixes
      642a06d [Ilya Ganelin] Fixed logic for invalid suffixes and added matching test
      8927e66 [Ilya Ganelin] Fixed handling of -1
      69fedcc [Ilya Ganelin] Added test for zero
      dc7bd08 [Ilya Ganelin] Fixed error in exception handling
      7d19cdd [Ilya Ganelin] Added fix for possible NPE
      6f651a8 [Ilya Ganelin] Now using regexes to simplify code in parseTimeString. Introduces getTimeAsSec and getTimeAsMs methods in SparkConf. Updated documentation
      cbd2ca6 [Ilya Ganelin] Formatting error
      1a1122c [Ilya Ganelin] Formatting fixes and added m for use as minute formatter
      4e48679 [Ilya Ganelin] Fixed priority order and mixed up conversions in a couple spots
      d4efd26 [Ilya Ganelin] Added time conversion for yarn.scheduler.heartbeat.interval-ms
      cbf41db [Ilya Ganelin] Got rid of thrown exceptions
      1465390 [Ilya Ganelin] Nit
      28187bf [Ilya Ganelin] Convert straight to seconds
      ff40bfe [Ilya Ganelin] Updated tests to fix small bugs
      19c31af [Ilya Ganelin] Added cleaner computation of time conversions in tests
      6387772 [Ilya Ganelin] Updated suffix handling to handle overlap of units more gracefully
      5193d5f [Ilya Ganelin] Resolved merge conflicts
      76cfa27 [Ilya Ganelin] [SPARK-5931] Minor nit fixes'
      bf779b0 [Ilya Ganelin] Special handling of overlapping suffixes for java
      dd0a680 [Ilya Ganelin] Updated scala code to call into java
      b2fc965 [Ilya Ganelin] replaced get or default since it's not present in this version of java
      39164f9 [Ilya Ganelin] [SPARK-5931] Updated Java conversion to be similar to scala conversion. Updated conversions to clean up code a little using TimeUnit.convert. Added Unit tests
      3b126e1 [Ilya Ganelin] Fixed conversion to US from seconds
      1858197 [Ilya Ganelin] Fixed bug where all time was being converted to us instead of the appropriate units
      bac9edf [Ilya Ganelin] More whitespace
      8613631 [Ilya Ganelin] Whitespace
      1c0c07c [Ilya Ganelin] Updated Java code to add day, minutes, and hours
      647b5ac [Ilya Ganelin] Updated time conversion to use map iterator instead of if fall through
      70ac213 [Ilya Ganelin] Fixed remaining usages to be consistent. Updated Java-side time conversion
      68f4e93 [Ilya Ganelin] Updated more files to clean up usage of default time strings
      3a12dd8 [Ilya Ganelin] Updated host receiver
      5232a36 [Ilya Ganelin] [SPARK-5931] Changed default behavior of time string conversion.
      499bdf0 [Ilya Ganelin] Merge branch 'SPARK-5931' of github.com:ilganeli/spark into SPARK-5931
      9e2547c [Ilya Ganelin] Reverting doc changes
      8f741e1 [Ilya Ganelin] Update JavaUtils.java
      34f87c2 [Ilya Ganelin] Update Utils.scala
      9a29d8d [Ilya Ganelin] Fixed misuse of time in streaming context test
      42477aa [Ilya Ganelin] Updated configuration doc with note on specifying time properties
      cde9bff [Ilya Ganelin] Updated spark.streaming.blockInterval
      c6a0095 [Ilya Ganelin] Updated spark.core.connection.auth.wait.timeout
      5181597 [Ilya Ganelin] Updated spark.dynamicAllocation.schedulerBacklogTimeout
      2fcc91c [Ilya Ganelin] Updated spark.dynamicAllocation.executorIdleTimeout
      6d1518e [Ilya Ganelin] Updated spark.speculation.interval
      3f1cfc8 [Ilya Ganelin] Updated spark.scheduler.revive.interval
      3352d34 [Ilya Ganelin] Updated spark.scheduler.maxRegisteredResourcesWaitingTime
      272c215 [Ilya Ganelin] Updated spark.locality.wait
      7320c87 [Ilya Ganelin] updated spark.akka.heartbeat.interval
      064ebd6 [Ilya Ganelin] Updated usage of spark.cleaner.ttl
      21ef3dd [Ilya Ganelin] updated spark.shuffle.sasl.timeout
      c9f5cad [Ilya Ganelin] Updated spark.shuffle.io.retryWait
      4933fda [Ilya Ganelin] Updated usage of spark.storage.blockManagerSlaveTimeout
      7db6d2a [Ilya Ganelin] Updated usage of spark.akka.timeout
      404f8c3 [Ilya Ganelin] Updated usage of spark.core.connection.ack.wait.timeout
      59bf9e1 [Ilya Ganelin] [SPARK-5931] Updated Utils and JavaUtils classes to add helper methods to handle time strings. Updated time strings in a few places to properly parse time
  11. Apr 01, 2015
    • [SPARK-6578] Small rewrite to make the logic more clear in MessageWithHeader.transferTo. · 899ebcb1
      Reynold Xin authored
      Author: Reynold Xin <rxin@databricks.com>
      
      Closes #5319 from rxin/SPARK-6578 and squashes the following commits:
      
      7c62a64 [Reynold Xin] Small rewrite to make the logic more clear in transferTo.
    • [SPARK-6578] [core] Fix thread-safety issue in outbound path of network library. · f084c5de
      Marcelo Vanzin authored
      While the inbound path of a netty pipeline is thread-safe, the outbound
      path is not. That means that multiple threads can compete to write messages
      to the next stage of the pipeline.
      
      The network library sometimes breaks a single RPC message into multiple
      buffers internally to avoid copying data (see MessageEncoder). This can
      result in the following scenario (where "FxBy" means "frame x, buffer y"):
      
                     T1         F1B1            F1B2
                                  \               \
                                   \               \
                     socket        F1B1   F2B1    F1B2  F2B2
                                           /             /
                                          /             /
                     T2                  F2B1         F2B2
      
      And the frames now cannot be rebuilt on the receiving side because the
      different messages have been mixed up on the wire.
      
      The fix wraps these multi-buffer messages into a `FileRegion` object
      so that these messages are written "atomically" to the next pipeline handler.
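
The interleaving scenario and the effect of the fix can be reproduced with a small deterministic Python simulation. It models the wire as a list and uses round-robin scheduling as a stand-in for two racing threads. All names here are hypothetical, and the sketch does not use Netty or the actual `FileRegion` implementation.

```python
from typing import List, Optional, Set, Tuple

Buffer = Tuple[str, int]  # (frame id, buffer index); ("F1", 1) is "F1B1"


def write_buffers_separately(frames: List[List[Buffer]]) -> List[Buffer]:
    """Simulate writers that hand each buffer to the socket on its own.

    Round-robin scheduling stands in for threads racing on the outbound
    path, so buffers from different frames interleave on the wire.
    """
    wire: List[Buffer] = []
    for step in range(max(len(frame) for frame in frames)):
        for frame in frames:
            if step < len(frame):
                wire.append(frame[step])
    return wire


def write_frames_atomically(frames: List[List[Buffer]]) -> List[Buffer]:
    """Simulate wrapping each frame's buffers in one composite message."""
    wire: List[Buffer] = []
    for frame in frames:
        wire.extend(frame)  # all buffers of one frame go out back to back
    return wire


def frames_are_contiguous(wire: List[Buffer]) -> bool:
    """Return True if every frame's buffers are adjacent and in order."""
    closed_frames: Set[str] = set()
    current: Optional[str] = None
    expected_index = 1
    for frame_id, index in wire:
        if frame_id != current:
            if frame_id in closed_frames:
                return False  # a frame resumed after another one started
            if current is not None:
                closed_frames.add(current)
            current, expected_index = frame_id, 1
        if index != expected_index:
            return False
        expected_index += 1
    return True
```

With two frames of two buffers each, `write_buffers_separately` yields the same F1B1, F2B1, F1B2, F2B2 order as the diagram, which `frames_are_contiguous` rejects, while `write_frames_atomically` keeps each frame intact.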
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #5234 from vanzin/SPARK-6578 and squashes the following commits:
      
      16b2d70 [Marcelo Vanzin] Forgot to update a type.
      c9c2e4e [Marcelo Vanzin] Review comments: simplify some code.
      9c888ac [Marcelo Vanzin] Small style nits.
      8474bab [Marcelo Vanzin] Fix multiple calls to MessageWithHeader.transferTo().
      e26509f [Marcelo Vanzin] Merge branch 'master' into SPARK-6578
      c503f6c [Marcelo Vanzin] Implement a custom FileRegion instead of using locks.
      84aa7ce [Marcelo Vanzin] Rename handler to the correct name.
      432f3bd [Marcelo Vanzin] Remove unneeded method.
      8d70e60 [Marcelo Vanzin] Fix thread-safety issue in outbound path of network library.
  12. Mar 20, 2015
    • [SPARK-6371] [build] Update version to 1.4.0-SNAPSHOT. · a7456459
      Marcelo Vanzin authored
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #5056 from vanzin/SPARK-6371 and squashes the following commits:
      
      63220df [Marcelo Vanzin] Merge branch 'master' into SPARK-6371
      6506f75 [Marcelo Vanzin] Use more fine-grained exclusion.
      178ba71 [Marcelo Vanzin] Oops.
      75b2375 [Marcelo Vanzin] Exclude VertexRDD in MiMA.
      a45a62c [Marcelo Vanzin] Work around MIMA warning.
      1d8a670 [Marcelo Vanzin] Re-group jetty exclusion.
      0e8e909 [Marcelo Vanzin] Ignore ml, don't ignore graphx.
      cef4603 [Marcelo Vanzin] Indentation.
      296cf82 [Marcelo Vanzin] [SPARK-6371] [build] Update version to 1.4.0-SNAPSHOT.
  13. Mar 11, 2015
  14. Mar 06, 2015
    • [SPARK-6178][Shuffle] Removed unused imports · dba0b2ea
      Vinod K C authored
      Author: Vinod K C <vinod.kc@huawei.com>
      
      Closes #4900 from vinodkc/unused_imports and squashes the following commits:
      
      5373456 [Vinod K C] Removed empty lines
      9da7438 [Vinod K C] Changed order of import
      594d471 [Vinod K C] Removed unused imports
  15. Mar 05, 2015
  16. Feb 28, 2015
    • [SPARK-6070] [yarn] Remove unneeded classes from shuffle service jar. · dba08d1f
      Marcelo Vanzin authored
      These may conflict with the classes already in the NM. We shouldn't
      be repackaging them.
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #4820 from vanzin/SPARK-6070 and squashes the following commits:
      
      871b566 [Marcelo Vanzin] The "d'oh how didn't I think of it before" solution.
      3cba946 [Marcelo Vanzin] Use profile instead, so that dependencies don't need to be explicitly listed.
      7a18a1b [Marcelo Vanzin] [SPARK-6070] [yarn] Remove unneeded classes from shuffle service jar.
  17. Feb 06, 2015
    • [SPARK-4994][network]Cleanup removed executors' ShuffleInfo in yarn shuffle service · 61073f83
      lianhuiwang authored
      When an application completes, YARN's NodeManager can remove the application's local dirs, but the metadata for all executors of the completed application is not removed. As a result, the YARN ShuffleService needs much more memory to store executors' ShuffleInfo, so this metadata needs to be removed.
      
      Author: lianhuiwang <lianhuiwang09@gmail.com>
      
      Closes #3828 from lianhuiwang/SPARK-4994 and squashes the following commits:
      
      f3ba1d2 [lianhuiwang] Cleanup removed executors' ShuffleInfo
    • [SPARK-5444][Network]Add a retry to deal with the conflict port in netty server. · 2bda1c1d
      huangzhaowei authored
      If the port set in `spark.blockManager.port` conflicts with a port that is already in use, Spark throws an exception and exits.
      This change adds a retry to avoid that situation.
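
A retry of this kind can be sketched in Python as follows. The "try the next port up" policy, the retry count, and the names `start_with_port_retries` and `bind_tcp` are assumptions made for illustration; they are not taken from the Spark code.

```python
import socket
from typing import Callable, Optional, Tuple, TypeVar

T = TypeVar("T")


def start_with_port_retries(
    start_port: int,
    try_bind: Callable[[int], T],
    max_retries: int = 16,
) -> Tuple[T, int]:
    """Call try_bind on start_port, start_port + 1, ... until one succeeds.

    Returns the started service together with the port it ended up on.
    """
    last_error: Optional[OSError] = None
    for offset in range(max_retries + 1):
        port = start_port + offset
        if port > 65535:
            break  # ran out of valid TCP ports
        try:
            return try_bind(port), port
        except OSError as err:  # e.g. "address already in use"
            last_error = err
    raise OSError(
        f"Could not bind to any port starting at {start_port}"
    ) from last_error


def bind_tcp(port: int, host: str = "127.0.0.1") -> socket.socket:
    """Bind and listen on a TCP port, closing the socket if that fails."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        server.bind((host, port))
        server.listen()
    except OSError:
        server.close()
        raise
    return server
```

Calling `start_with_port_retries(7077, bind_tcp)` would attempt port 7077 first and move upward on conflict. Taking the bind step as a callable keeps the retry policy separate from the transport, so the retry can live at whichever layer owns the port setting.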
      
      Author: huangzhaowei <carlmartinmax@gmail.com>
      
      Closes #4240 from SaintBacchus/NettyPortConflict and squashes the following commits:
      
      cc926d2 [huangzhaowei] Add a retry to deal with the conflict port in netty server.
  18. Feb 01, 2015
    • [SPARK-3996]: Shade Jetty in Spark deliverables · a15f6e31
      Patrick Wendell authored
      (v2 of this patch with a fix that was only relevant for the maven build).
      
      This patch piggybacks on vanzin's work to simplify the Guava shading,
      and adds Jetty as a shaded library in Spark. Other than adding Jetty,
      it consolidates the <artifactSet>s into the root pom. I found it was
      a bit easier to follow that way, since you don't need to look into
      child poms to find out which artifact sets are included in shading.
      
      Author: Patrick Wendell <patrick@databricks.com>
      
      Closes #4285 from pwendell/jetty and squashes the following commits:
      
      d3e7f4e [Patrick Wendell] Fix for shaded deps causing compile errors
      19f0710 [Patrick Wendell] More code review feedback
      961452d [Patrick Wendell] Responding to feedback from Marcello
      6df25ca [Patrick Wendell] [WIP] [SPARK-3996]: Shade Jetty in Spark deliverables
  19. Jan 29, 2015
    • Revert "[WIP] [SPARK-3996]: Shade Jetty in Spark deliverables" · d2071e8f
      Patrick Wendell authored
      This reverts commit f240fe39.
    • [WIP] [SPARK-3996]: Shade Jetty in Spark deliverables · f240fe39
      Patrick Wendell authored
      This patch piggybacks on vanzin's work to simplify the Guava shading,
      and adds Jetty as a shaded library in Spark. Other than adding Jetty,
      it consolidates the <artifactSet>s into the root pom. I found it was
      a bit easier to follow that way, since you don't need to look into
      child poms to find out which artifact sets are included in shading.
      
      Author: Patrick Wendell <patrick@databricks.com>
      
      Closes #4252 from pwendell/jetty and squashes the following commits:
      
      19f0710 [Patrick Wendell] More code review feedback
      961452d [Patrick Wendell] Responding to feedback from Marcello
      6df25ca [Patrick Wendell] [WIP] [SPARK-3996]: Shade Jetty in Spark deliverables
  20. Jan 28, 2015
    • [SPARK-4809] Rework Guava library shading. · 37a5e272
      Marcelo Vanzin authored
      The current way of shading Guava is a little problematic. Code that
      depends on "spark-core" does not see the transitive dependency, yet
      classes in "spark-core" actually depend on Guava. So it's a little
      tricky to run unit tests that use spark-core classes, since you need
      a compatible version of Guava in your dependencies when running the
      tests. This can become a little tricky, and is kind of a bad user
      experience.
      
      This change modifies the way Guava is shaded so that it's applied
      uniformly across the Spark build. This means Guava is shaded inside
      spark-core itself, so that the dependency issues above are solved.
      Aside from that, all Spark sub-modules have their Guava references
      relocated, so that they refer to the relocated classes now packaged
      inside spark-core. Before, this was only done by the time the assembly
      was built, so projects that did not end up inside the assembly (such
      as streaming backends) could still reference the original location
      of Guava classes.
      
      The Guava classes are added to the "first" artifact Spark generates
      (network-common), so that all downstream modules have the needed
      classes available. Since "network-common" is a dependency of spark-core,
      all Spark apps should get the relocated classes automatically.
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #3658 from vanzin/SPARK-4809 and squashes the following commits:
      
      3c93e42 [Marcelo Vanzin] Shade Guava in the network-common artifact.
      5d69ec9 [Marcelo Vanzin] Merge branch 'master' into SPARK-4809
      b3104fc [Marcelo Vanzin] Add comment.
      941848f [Marcelo Vanzin] Merge branch 'master' into SPARK-4809
      f78c48a [Marcelo Vanzin] Merge branch 'master' into SPARK-4809
      8053dd4 [Marcelo Vanzin] Merge branch 'master' into SPARK-4809
      107d7da [Marcelo Vanzin] Add fix for SPARK-5052 (PR #3874).
      40b8723 [Marcelo Vanzin] Merge branch 'master' into SPARK-4809
      4a4ed42 [Marcelo Vanzin] [SPARK-4809] Rework Guava library shading.
  21. Jan 09, 2015
  22. Jan 06, 2015
    • SPARK-4159 [CORE] Maven build doesn't run JUnit test suites · 4cba6eb4
      Sean Owen authored
      This PR:
      
      - Reenables `surefire`, and copies config from `scalatest` (which is itself an old fork of `surefire`, so similar)
      - Tells `surefire` to test only Java tests
      - Enables `surefire` and `scalatest` for all children, and in turn eliminates some duplication.
      
      In my testing, this causes the Scala and Java tests to each run once, as desired. It applies only to the Maven build and does not affect the SBT build. I still need to verify that all of the Scala and Java tests are being run.
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #3651 from srowen/SPARK-4159 and squashes the following commits:
      
      2e8a0af [Sean Owen] Remove specialized SPARK_HOME setting for REPL, YARN tests as it appears to be obsolete
      12e4558 [Sean Owen] Append to unit-test.log instead of overwriting, so that both surefire and scalatest output is preserved. Also standardize/correct comments a bit.
      e6f8601 [Sean Owen] Reenable Java tests by reenabling surefire with config cloned from scalatest; centralize test config in the parent
      4cba6eb4
  23. Jan 05, 2015
    • Reynold Xin's avatar
      [SPARK-5093] Set spark.network.timeout to 120s consistently. · bbcba3a9
      Reynold Xin authored
      Author: Reynold Xin <rxin@databricks.com>
      
      Closes #3903 from rxin/timeout-120 and squashes the following commits:
      
      7c2138e [Reynold Xin] [SPARK-5093] Set spark.network.timeout to 120s consistently.
      bbcba3a9
    • Varun Saxena's avatar
      [SPARK-4688] Have a single shared network timeout in Spark · d3f07fd2
      Varun Saxena authored
      
      Author: Varun Saxena <vsaxena.varun@gmail.com>
      Author: varunsaxena <vsaxena.varun@gmail.com>
      
      Closes #3562 from varunsaxena/SPARK-4688 and squashes the following commits:
      
      6e97f72 [Varun Saxena] [SPARK-4688] Single shared network timeout
      cd783a2 [Varun Saxena] SPARK-4688
      d6f8c29 [Varun Saxena] SCALA-4688
      9562b15 [Varun Saxena] SPARK-4688
      a75f014 [varunsaxena] SPARK-4688
      594226c [varunsaxena] SPARK-4688
      d3f07fd2
  24. Dec 22, 2014
  25. Dec 09, 2014
    • Reynold Xin's avatar
      Config updates for the new shuffle transport. · 9bd9334f
      Reynold Xin authored
      Author: Reynold Xin <rxin@databricks.com>
      
      Closes #3657 from rxin/conf-update and squashes the following commits:
      
      7370eab [Reynold Xin] Config updates for the new shuffle transport.
      9bd9334f
    • Reynold Xin's avatar
      [SPARK-4740] Create multiple concurrent connections between two peer nodes in Netty. · 2b9b7268
      Reynold Xin authored
      It's been reported that when the number of disks is large and the number of nodes is small, Netty network throughput is low compared with NIO. We suspect the problem is that, because of connection reuse, only a small number of disks are used to serve shuffle files at any given point. This patch adds a new config parameter that specifies the number of concurrent connections between two peer nodes; it defaults to 2.
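
      A minimal sketch of the idea described above, not the code from this patch: keep a fixed number of connection slots per peer and pick one per request, so that concurrent fetches from the same peer are not all funneled through a single connection. `PeerConnectionPool` and `Connection` are hypothetical names, and choosing the slot at random is an assumption about how requests might be spread.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a real network client; it only records its target.
final class Connection {
    final String peer;
    final int slot;

    Connection(String peer, int slot) {
        this.peer = peer;
        this.slot = slot;
    }
}

// Hypothetical pool that keeps up to `connectionsPerPeer` connections per peer.
final class PeerConnectionPool {
    // Per-peer state: one lazily created connection and one lock per slot.
    private static final class Slots {
        final Connection[] connections;
        final Object[] locks;

        Slots(int n) {
            connections = new Connection[n];
            locks = new Object[n];
            for (int i = 0; i < n; i++) {
                locks[i] = new Object();
            }
        }
    }

    private final int connectionsPerPeer;
    private final ConcurrentHashMap<String, Slots> pools = new ConcurrentHashMap<>();
    // Counts how many connections were actually opened (for illustration only).
    final AtomicInteger created = new AtomicInteger();

    PeerConnectionPool(int connectionsPerPeer) {
        if (connectionsPerPeer < 1) {
            throw new IllegalArgumentException("connectionsPerPeer must be >= 1");
        }
        this.connectionsPerPeer = connectionsPerPeer;
    }

    // Picks a random slot so repeated requests spread across the connections.
    Connection get(String peer) {
        return get(peer, ThreadLocalRandom.current().nextInt(connectionsPerPeer));
    }

    // Returns the connection in the given slot, creating it on first use.
    Connection get(String peer, int slot) {
        Slots slots = pools.computeIfAbsent(peer, p -> new Slots(connectionsPerPeer));
        // Locking per slot means creating one connection does not block the others.
        synchronized (slots.locks[slot]) {
            if (slots.connections[slot] == null) {
                slots.connections[slot] = new Connection(peer, slot);
                created.incrementAndGet();
            }
            return slots.connections[slot];
        }
    }
}
```

      With two slots per peer, at most two connections are ever opened to a given peer, no matter how many requests are issued.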
      
      Author: Reynold Xin <rxin@databricks.com>
      
      Closes #3625 from rxin/SPARK-4740 and squashes the following commits:
      
      ad4241a [Reynold Xin] Updated javadoc.
      f33c72b [Reynold Xin] Code review feedback.
      0fefabb [Reynold Xin] Use double check in synchronization.
      41dfcb2 [Reynold Xin] Added test case.
      9076b4a [Reynold Xin] Fixed two NPEs.
      3e1306c [Reynold Xin] Minor style fix.
      4f21673 [Reynold Xin] [SPARK-4740] Create multiple concurrent connections between two peer nodes in Netty.
      2b9b7268
    • Sean Owen's avatar
      SPARK-4805 [CORE] BlockTransferMessage.toByteArray() trips assertion · d8f84f26
      Sean Owen authored
      Allocate enough room for the type byte as well as the message, to avoid tripping the assertion about the capacity of the buffer.
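
      A minimal sketch of the sizing issue, assuming a message that is written as one type byte followed by its encoded body. It uses `java.nio.ByteBuffer` to stay self-contained rather than the buffer type the actual code uses, and `TypedMessage` is a hypothetical class, not the one touched by this commit.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical message: one type byte, then a length-prefixed UTF-8 payload.
final class TypedMessage {
    final byte type;
    final String payload;

    TypedMessage(byte type, String payload) {
        this.type = type;
        this.payload = payload;
    }

    // Size of the body only (4-byte length prefix plus payload bytes);
    // the leading type byte is deliberately not included here.
    int encodedLength() {
        return 4 + payload.getBytes(StandardCharsets.UTF_8).length;
    }

    byte[] toByteArray() {
        // Reserve one extra byte for the type tag. A buffer sized to
        // encodedLength() alone would be one byte short for the final write.
        ByteBuffer buf = ByteBuffer.allocate(1 + encodedLength());
        buf.put(type);
        byte[] bytes = payload.getBytes(StandardCharsets.UTF_8);
        buf.putInt(bytes.length);
        buf.put(bytes);
        // Sanity check: the buffer should be filled exactly.
        if (buf.hasRemaining()) {
            throw new IllegalStateException("buffer not filled exactly");
        }
        return buf.array();
    }
}
```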
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #3650 from srowen/SPARK-4805 and squashes the following commits:
      
      9e1d502 [Sean Owen] Allocate enough room for type byte as well as message, to avoid tripping assertion about capacity of the buffer
      d8f84f26
  26. Nov 28, 2014
    • Takuya UESHIN's avatar
      [SPARK-4193][BUILD] Disable doclint in Java 8 to prevent from build error. · e464f0ac
      Takuya UESHIN authored
      Author: Takuya UESHIN <ueshin@happy-camper.st>
      
      Closes #3058 from ueshin/issues/SPARK-4193 and squashes the following commits:
      
      e096bb1 [Takuya UESHIN] Add a plugin declaration to pluginManagement.
      6762ec2 [Takuya UESHIN] Fix usage of -Xdoclint javadoc option.
      fdb280a [Takuya UESHIN] Fix Javadoc errors.
      4745f3c [Takuya UESHIN] Merge branch 'master' into issues/SPARK-4193
      923e2f0 [Takuya UESHIN] Use doclint option `-missing` instead of `none`.
      30d6718 [Takuya UESHIN] Fix Javadoc errors.
      b548017 [Takuya UESHIN] Disable doclint in Java 8 to prevent from build error.
      e464f0ac
  27. Nov 25, 2014
    • Aaron Davidson's avatar
      [SPARK-4516] Avoid allocating Netty PooledByteBufAllocators unnecessarily · 346bc17a
      Aaron Davidson authored
      It turns out we allocate an allocator pool for every TransportClient (so the number of pools grows with the number of nodes in the cluster), when we should simply reuse one pool for all clients.
      
      This patch, as expected, greatly decreases off-heap memory allocation, and appears to make allocation only proportional to the number of cores.
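
      A minimal sketch of the change in shape, not the actual patch: the factory owns one pool and hands it to every client it creates, instead of each client building its own. `BufferPool`, `Client`, and `ClientFactory` are hypothetical stand-ins; the instance counter exists only to make the effect visible.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for an expensive pooled allocator.
final class BufferPool {
    // Counts constructions so the sharing can be observed.
    static final AtomicInteger instances = new AtomicInteger();

    BufferPool() {
        instances.incrementAndGet();
    }
}

// Hypothetical client that borrows a pool instead of creating one.
final class Client {
    final String peer;
    final BufferPool pool;

    Client(String peer, BufferPool pool) {
        this.peer = peer;
        this.pool = pool;
    }
}

// Hypothetical factory: one pool for its lifetime, shared by all clients.
final class ClientFactory {
    private final BufferPool sharedPool = new BufferPool();

    Client create(String peer) {
        // Every client gets the same pool, so the pool count stays constant
        // however many peers the factory connects to.
        return new Client(peer, sharedPool);
    }
}
```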
      
      Author: Aaron Davidson <aaron@databricks.com>
      
      Closes #3465 from aarondav/fewer-pools and squashes the following commits:
      
      36c49da [Aaron Davidson] [SPARK-4516] Avoid allocating unnecessarily Netty PooledByteBufAllocators
      346bc17a
  28. Nov 18, 2014
    • Marcelo Vanzin's avatar
      Bumping version to 1.3.0-SNAPSHOT. · 397d3aae
      Marcelo Vanzin authored
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #3277 from vanzin/version-1.3 and squashes the following commits:
      
      7c3c396 [Marcelo Vanzin] Added temp repo to sbt build.
      5f404ff [Marcelo Vanzin] Add another exclusion.
      19457e7 [Marcelo Vanzin] Update old version to 1.2, add temporary 1.2 repo.
      3c8d705 [Marcelo Vanzin] Workaround for MIMA checks.
      e940810 [Marcelo Vanzin] Bumping version to 1.3.0-SNAPSHOT.
      397d3aae
  29. Nov 13, 2014
    • Xiangrui Meng's avatar
      [SPARK-4326] fix unidoc · 4b0c1edf
      Xiangrui Meng authored
      There are two issues:
      
      1. Specifying Guava 11.0.2 causes a "hashInt not found" error in unidoc (is there any reason to force the version here?)
      2. unidoc doesn't recognize a static class defined in a base class
      
      aarondav srowen vanzin
      
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #3253 from mengxr/SPARK-4326 and squashes the following commits:
      
      53967bf [Xiangrui Meng] fix unidoc
      4b0c1edf
  30. Nov 12, 2014