  1. Oct 09, 2014
  2. Oct 07, 2014
    • [SPARK-3398] [EC2] Have spark-ec2 intelligently wait for specific cluster states · 5912ca67
      Nicholas Chammas authored
      Instead of waiting arbitrary amounts of time for the cluster to reach a specific state, this patch lets `spark-ec2` explicitly wait for a cluster to reach a desired state.
      
      This is useful in a couple of situations:
      * The cluster is launching and you want to wait until SSH is available before installing stuff.
      * The cluster is being terminated and you want to wait until all the instances are terminated before trying to delete security groups.
      
      This patch removes the need for the `--wait` option and removes some of the time-based retry logic that was being used.
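
      The state-based waiting described above, together with the "linear backoff" named in the squashed commits, can be sketched roughly as follows. This is a minimal illustration, not the patch's actual code: `wait_for_state`, `is_ready`, and the parameter names are hypothetical, and `is_ready` stands in for whatever check applies (SSH reachable on every instance, or every instance reported as terminated).

```python
import time


def wait_for_state(is_ready, max_attempts=10, base_delay=1.0, sleep=time.sleep):
    """Poll is_ready() until it returns True, sleeping longer after each miss.

    is_ready is a hypothetical zero-argument callable supplied by the caller.
    Returns the number of attempts used; raises RuntimeError on timeout.
    """
    for attempt in range(1, max_attempts + 1):
        if is_ready():
            return attempt
        # Linear backoff: wait base_delay, then 2 * base_delay, 3 * base_delay, ...
        sleep(base_delay * attempt)
    raise RuntimeError("cluster did not reach the desired state")
```

      Passing `sleep` as a parameter is only a convenience for exercising the loop without real delays.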
      
      Author: Nicholas Chammas <nicholas.chammas@gmail.com>
      
      Closes #2339 from nchammas/spark-ec2-wait-properly and squashes the following commits:
      
      43a69f0 [Nicholas Chammas] short-circuit SSH check; linear backoff
      9a9e035 [Nicholas Chammas] remove extraneous comment
      26c5ed0 [Nicholas Chammas] replace print with write()
      bb67c06 [Nicholas Chammas] deprecate wait option; remove dead code
      7969265 [Nicholas Chammas] fix long line (PEP 8)
      126e4cf [Nicholas Chammas] wait for specific cluster states
  3. Sep 29, 2014
    • [EC2] Sort long, manually-inputted dictionaries · aedd251c
      Nicholas Chammas authored
      Similar to the work done in #2571, this PR just sorts the remaining manually-inputted dicts in the EC2 script so they are easier to maintain.
      
      Author: Nicholas Chammas <nicholas.chammas@gmail.com>
      
      Closes #2578 from nchammas/ec2-dict-sort and squashes the following commits:
      
      f55c692 [Nicholas Chammas] sort long dictionaries
  4. Sep 28, 2014
    • [EC2] Cleanup Python parens and disk dict · 1651cc11
      Nicholas Chammas authored
      Minor fixes:
      * Remove unnecessary parens (Python style)
      * Sort `disks_by_instance` dict and remove duplicate `t1.micro` key
      
      Author: Nicholas Chammas <nicholas.chammas@gmail.com>
      
      Closes #2571 from nchammas/ec2-polish and squashes the following commits:
      
      9d203d5 [Nicholas Chammas] paren and dict cleanup
  5. Sep 24, 2014
  6. Sep 20, 2014
  7. Sep 16, 2014
    • [SPARK-787] Add S3 configuration parameters to the EC2 deploy scripts · b2017126
      Dan Osipov authored
      When deploying to AWS, additional configuration is required to read S3 files. EMR creates it automatically; there is no reason that the Spark EC2 script shouldn't.
      
      This PR requires a corresponding PR to mesos/spark-ec2 to be merged, because that repository is cloned in the process of setting up machines: https://github.com/mesos/spark-ec2/pull/58
      
      Author: Dan Osipov <daniil.osipov@shazam.com>
      
      Closes #1120 from danosipov/s3_credentials and squashes the following commits:
      
      758da8b [Dan Osipov] Modify documentation to include the new parameter
      71fab14 [Dan Osipov] Use a parameter --copy-aws-credentials to enable S3 credential deployment
      7e0da26 [Dan Osipov] Get AWS credentials out of boto connection instance
      39bdf30 [Dan Osipov] Add S3 configuration parameters to the EC2 deploy scripts
  8. Sep 15, 2014
  9. Sep 06, 2014
    • [EC2] don't duplicate default values · 0c681dd6
      Nicholas Chammas authored
      This PR makes two minor changes to the `spark-ec2` script:
      
      1. The script's input parameter default values are duplicated into the help text. This is unnecessary. This PR replaces the duplicated info with the appropriate `optparse` placeholder.
      2. The default Spark version currently needs to be updated by hand during each release, which is known to be an error-prone process. This PR places that default value in an easy-to-spot place.
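
      As an illustration of both points, `optparse` expands the `%default` placeholder in help strings to the option's default value, so the value only has to be written once. The sketch below is not taken from the script: the constant name `DEFAULT_SPARK_VERSION`, its value, and the option set are assumptions made for the example.

```python
from optparse import OptionParser

# Hypothetical constant: a single, easy-to-spot place for the default version.
DEFAULT_SPARK_VERSION = "1.1.0"


def build_parser():
    """Build a small parser whose help text derives defaults via %default."""
    parser = OptionParser(usage="spark-ec2 [options] <action> <cluster_name>")
    parser.add_option(
        "-v", "--spark-version", default=DEFAULT_SPARK_VERSION,
        help="Version of Spark to use (default: %default)")
    parser.add_option(
        "-s", "--slaves", type="int", default=1,
        help="Number of slaves to launch (default: %default)")
    return parser
```

      Calling `build_parser().print_help()` shows the defaults in the help text without repeating them in the help strings.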
      
      Author: Nicholas Chammas <nicholas.chammas@gmail.com>
      
      Closes #2290 from nchammas/spark-ec2-default-version and squashes the following commits:
      
      0c6d3bb [Nicholas Chammas] don't duplicate default values
    • [SPARK-3361] Expand PEP 8 checks to include EC2 script and Python examples · 9422c4ee
      Nicholas Chammas authored
      This PR resolves [SPARK-3361](https://issues.apache.org/jira/browse/SPARK-3361) by expanding the PEP 8 checks to cover the remaining Python code base:
      * The EC2 script
      * All Python / PySpark examples
      
      Author: Nicholas Chammas <nicholas.chammas@gmail.com>
      
      Closes #2297 from nchammas/pep8-rulez and squashes the following commits:
      
      1e5ac9a [Nicholas Chammas] PEP 8 fixes to Python examples
      c3dbeff [Nicholas Chammas] PEP 8 fixes to EC2 script
      65ef6e8 [Nicholas Chammas] expand PEP 8 checks
  10. Sep 05, 2014
    • [SPARK-3391][EC2] Support attaching up to 8 EBS volumes. · 1725a1a5
      Reynold Xin authored
      Please merge this at the same time as https://github.com/mesos/spark-ec2/pull/66
      
      Author: Reynold Xin <rxin@apache.org>
      
      Closes #2260 from rxin/ec2-ebs-vol and squashes the following commits:
      
      b9527d9 [Reynold Xin] Removed io1 ebs type.
      bf9c403 [Reynold Xin] Made EBS volume type configurable.
      c8e25ea [Reynold Xin] Support up to 8 EBS volumes.
      adf4f2e [Reynold Xin] Revert git repo change.
      020c542 [Reynold Xin] [SPARK-3391] Support attaching more than 1 EBS volumes.
  11. Sep 02, 2014
    • SPARK-3358: [EC2] Switch back to HVM instances for m3.X. · c64cc435
      Patrick Wendell authored
      During regression tests of Spark 1.1 we discovered perf issues with
      PVM instances when running PySpark. This reverts a change added in #1156
      which changed the default type for m3 instances to PVM.
      
      Author: Patrick Wendell <pwendell@gmail.com>
      
      Closes #2244 from pwendell/ec2-hvm and squashes the following commits:
      
      1342d7e [Patrick Wendell] SPARK-3358: [EC2] Switch back to HVM instances for m3.X.
    • [SPARK-3342] Add SSDs to block device mapping · 44d3a6a7
      Daniel Darabos authored
      On `m3.2xlarge` instances the 2x80GB SSDs are inaccessible if not added to the block device mapping when the instance is created. They work when added with this patch. I have not tested this with other instance types, and I do not know much about this script and EC2 deployment in general. Maybe this code needs to depend on the instance type.
      
      The requirement for this mapping is described in the AWS docs at:
      http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/InstanceStorage.html#InstanceStore_UsageScenarios
      
      "For M3 instances, you must specify instance store volumes in the block
      device mapping for the instance. When you launch an M3 instance, we
      ignore any instance store volumes specified in the block device mapping
      for the AMI."
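
      Going by the squashed commit titles (a mapping built from the disk count, applied only to M3, with `%d` interpolation), the idea can be sketched with plain dictionaries. This is an assumption-laden illustration rather than the patch itself: `ephemeral_block_map` is a hypothetical helper, the disk count is passed in instead of being looked up, and the `/dev/sdb` starting device is assumed.

```python
import string


def ephemeral_block_map(instance_type, num_disks):
    """Return {device_name: ephemeral_name} for M3 instance types, else {}.

    num_disks is assumed to come from a lookup such as get_num_disks().
    """
    block_map = {}
    if instance_type.startswith("m3."):
        for i in range(num_disks):
            # Assumption: the first instance store volume attaches at /dev/sdb.
            device = "/dev/sd" + string.ascii_lowercase[i + 1]
            block_map[device] = "ephemeral%d" % i
    return block_map
```

      In a real launch request, each entry would be translated into the block device mapping structure of the EC2 client library in use.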
      
      Author: Daniel Darabos <darabos.daniel@gmail.com>
      
      Closes #2081 from darabos/patch-1 and squashes the following commits:
      
      1ceb2c8 [Daniel Darabos] Use %d string interpolation instead of {}.
      a1854d7 [Daniel Darabos] Only specify ephemeral device mapping for M3.
      e0d9e37 [Daniel Darabos] Create ephemeral device mapping based on get_num_disks().
      6b116a6 [Daniel Darabos] Add SSDs to block device mapping
  12. Aug 27, 2014
  13. Aug 25, 2014
    • SPARK-3180 - Better control of security groups · cc40a709
      Allan Douglas R. de Oliveira authored
      Adds the --authorized-address and --additional-security-group options as explained in the issue.
      
      Author: Allan Douglas R. de Oliveira <allan@chaordicsystems.com>
      
      Closes #2088 from douglaz/configurable_sg and squashes the following commits:
      
      e3e48ca [Allan Douglas R. de Oliveira] Adds the option to specify the address authorized to access the SG and another option to provide an additional existing SG
  14. Aug 19, 2014
    • SPARK-2333 - spark_ec2 script should allow option for existing security group · 94053a7b
      Vida Ha authored
          - Uses the name tag to identify machines in a cluster.
          - Allows overriding the security group name so it doesn't need to coincide with the cluster name.
          - Outputs the request IDs of up to 10 pending spot instance requests.
      
      Author: Vida Ha <vida@databricks.com>
      
      Closes #1899 from vidaha/vida/ec2-reuse-security-group and squashes the following commits:
      
      c80d5c3 [Vida Ha] wrap retries in a try catch block
      b2989d5 [Vida Ha] SPARK-2333: spark_ec2 script should allow option for existing security group
  15. Aug 03, 2014
  16. Jul 18, 2014
    • Added t2 instance types · 7f87ab98
      Basit Mustafa authored
      The new t2 instance types require HVM AMIs; the fallback assumption
      of PVM causes failures when using t2 instance types.
      
      Author: Basit Mustafa <basitmustafa@computes-things-for-basit.local>
      
      Closes #1446 from 24601/master and squashes the following commits:
      
      01fe128 [Basit Mustafa] Makin' it pretty
      392a95e [Basit Mustafa] Added t2 instance types
  17. Jul 10, 2014
    • name ec2 instances and security groups consistently · 369aa84e
      Nicholas Chammas authored
      Security groups created by `spark-ec2` do not prepend “spark-” to the
      name.
      
      Since naming the instances themselves is new to `spark-ec2`, it’s better
      to change that pattern to match the existing naming pattern for the
      security groups, rather than the other way around.
      
      Author: Nicholas Chammas <nicholas.chammas@gmail.com>
      Author: nchammas <nicholas.chammas@gmail.com>
      
      Closes #1344 from nchammas/master and squashes the following commits:
      
      f7e4581 [Nicholas Chammas] unrelated pep8 fix
      a36eed0 [Nicholas Chammas] name ec2 instances and security groups consistently
      de7292a [nchammas] Merge pull request #4 from apache/master
      2e4fe00 [nchammas] Merge pull request #3 from apache/master
      89fde08 [nchammas] Merge pull request #2 from apache/master
      69f6e22 [Nicholas Chammas] PEP8 fixes
      2627247 [Nicholas Chammas] broke up lines before they hit 100 chars
      6544b7e [Nicholas Chammas] [SPARK-2065] give launched instances names
      69da6cf [nchammas] Merge pull request #1 from apache/master
  18. Jul 08, 2014
    • [EC2] Add default history server port to ec2 script · 56e009d4
      Andrew Or authored
      Right now I have to open it manually
      
      Author: Andrew Or <andrewor14@gmail.com>
      
      Closes #1296 from andrewor14/hist-serv-port and squashes the following commits:
      
      8895a1f [Andrew Or] Add default history server port to ec2 script
  19. Jun 26, 2014
    • Fixing AWS instance type information based upon current EC2 data · 62d4a0fa
      Zichuan Ye authored
      Fixed a problem in the previous version of the file, in which some information about AWS instance types was wrong. The information has been updated based on current AWS EC2 data.
      
      Author: Zichuan Ye <jerry@tangentds.com>
      
      Closes #1156 from jerry86/master and squashes the following commits:
      
      ff36e95 [Zichuan Ye] Fixing AWS instance type information based upon current EC2 data
  20. Jun 22, 2014
    • SPARK-2166 - Listing of instances to be terminated before the prompt · 9cb64b2c
      Jean-Martin Archer authored
      Will list the EC2 instances before destroying the cluster.
      This was added because it can be scary to destroy EC2
      instances without knowing which ones will be impacted.
      
      Author: Jean-Martin Archer <jeanmartin.archer@pulseenergy.com>
      
      This patch had conflicts when merged, resolved by
      Committer: Patrick Wendell <pwendell@gmail.com>
      
      Closes #270 from j-martin/master and squashes the following commits:
      
      826455f [Jean-Martin Archer] [SPARK-2611] Implementing recommendations
      27b0a36 [Jean-Martin Archer] Listing of instances to be terminated before the prompt Will list the EC2 instances before destroying the cluster. This was added because it can be scary to destroy EC2 instances without knowing which one will be impacted.
    • SPARK-2241: quote command line args in ec2 script · 9fc373e3
      Ori Kremer authored
      Preserves quoted command-line args (in case options have spaces in them).
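
      A minimal sketch of the idea, assuming the arguments are re-joined into a single shell command string: each argument is quoted individually so that embedded spaces survive. `quote_args` is a hypothetical helper; `shlex.quote` is the Python 3 standard-library function, and the script itself may use a different one.

```python
import shlex


def quote_args(args):
    """Join args into one shell-safe string, quoting each argument separately."""
    return " ".join(shlex.quote(arg) for arg in args)
```

      Splitting the result again with `shlex.split` gives back the original argument list, which is the property the fix is after.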
      
      Author: Ori Kremer <ori.kremer@gmail.com>
      
      Closes #1169 from orikremer/quote_cmd_line_args and squashes the following commits:
      
      67e2aa1 [Ori Kremer] quote command line args
  21. Jun 17, 2014
    • HOTFIX: bug caused by #941 · b2ebf429
      Patrick Wendell authored
      The patch in #941 should have qualified the use of PIPE. This fix needs to be backported into 0.9 and 1.0.
      
      Author: Patrick Wendell <pwendell@gmail.com>
      
      Closes #1108 from pwendell/hotfix and squashes the following commits:
      
      711c58d [Patrick Wendell] HOTFIX: bug caused by #941
    • SPARK-1990: added compatibility for python 2.6 for ssh_read command · 8cd04c3e
      Anant authored
      https://issues.apache.org/jira/browse/SPARK-1990
      
      There were some posts on the lists that spark-ec2 does not work with Python 2.6. In addition, we should check the Python version at the top of the script and exit if it's too old.
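
      Both ideas can be sketched as follows. `subprocess.check_output` was added in Python 2.7, so a Python 2.6-compatible script needs its own equivalent built on `subprocess.Popen`. The names `check_output_compat` and `require_python` are hypothetical, and the code is an illustration rather than the patch's implementation.

```python
import subprocess
import sys


def check_output_compat(*popenargs, **kwargs):
    """Minimal stand-in for subprocess.check_output (added in Python 2.7)."""
    if "stdout" in kwargs:
        raise ValueError("stdout argument not allowed, it will be overridden")
    # PIPE is referenced through the subprocess module, not as a bare name.
    process = subprocess.Popen(*popenargs, stdout=subprocess.PIPE, **kwargs)
    output, _ = process.communicate()
    if process.returncode != 0:
        raise subprocess.CalledProcessError(process.returncode, popenargs[0])
    return output


def require_python(minimum=(2, 6)):
    """Exit with an error message if the interpreter is older than minimum."""
    if sys.version_info[:2] < minimum:
        sys.exit("Python %d.%d or newer is required" % minimum)
```

      Calling `require_python()` at the top of a script fails fast on an unsupported interpreter instead of failing later with an obscure error.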
      
      Author: Anant <anant.asty@gmail.com>
      
      Closes #941 from anantasty/SPARK-1990 and squashes the following commits:
      
      4ca441d [Anant] Implemented check_output within the module to work with python 2.6
      c6ed85c [Anant] added compatibility for python 2.6 for ssh_read command
  22. Jun 10, 2014
    • [SPARK-2065] give launched instances names · a2052a44
      Nicholas Chammas authored
      This update resolves [SPARK-2065](https://issues.apache.org/jira/browse/SPARK-2065). It gives launched EC2 instances descriptive names by using instance tags. Launched instances now show up in the EC2 console with these names.
      
      I used `format()` with named parameters, which I believe is the recommended practice for string formatting in Python, but which doesn’t seem to be used elsewhere in the script.
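
      For readers unfamiliar with the style, `format()` with named parameters looks like the sketch below. The name pattern and the helper `instance_name` are assumptions for illustration; the exact pattern used by the script is not stated here.

```python
def instance_name(cluster_name, role, instance_id):
    """Build a descriptive instance name from named format() fields."""
    # Hypothetical pattern: <cluster>-<role>-<instance id>.
    return "{cn}-{role}-{iid}".format(cn=cluster_name, role=role, iid=instance_id)
```

      Named fields keep the template readable and make the argument order irrelevant, unlike positional `%s` interpolation.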
      
      Author: Nicholas Chammas <nicholas.chammas@gmail.com>
      Author: nchammas <nicholas.chammas@gmail.com>
      
      Closes #1043 from nchammas/master and squashes the following commits:
      
      69f6e22 [Nicholas Chammas] PEP8 fixes
      2627247 [Nicholas Chammas] broke up lines before they hit 100 chars
      6544b7e [Nicholas Chammas] [SPARK-2065] give launched instances names
      69da6cf [nchammas] Merge pull request #1 from apache/master
  23. Jun 04, 2014
  24. Jun 01, 2014
    • Made spark_ec2.py PEP8 compliant. · eea3aab4
      Reynold Xin authored
      The change set is actually pretty small -- mostly whitespace changes. Admittedly this is a scary change due to the lack of tests to cover the ec2 scripts, and also because indentation actually impacts control flow in Python ...
      
      Look at changes without whitespace diff here: https://github.com/apache/spark/pull/891/files?w=1
      
      Author: Reynold Xin <rxin@apache.org>
      
      Closes #891 from rxin/spark-ec2-pep8 and squashes the following commits:
      
      ac1bf11 [Reynold Xin] Made spark_ec2.py PEP8 compliant.
  25. May 16, 2014
    • Version bump of spark-ec2 scripts · c0ab85d7
      Patrick Wendell authored
      This will allow us to change things in spark-ec2 related to the 1.0 release.
      
      Author: Patrick Wendell <pwendell@gmail.com>
      
      Closes #809 from pwendell/spark-ec2 and squashes the following commits:
      
      59117fb [Patrick Wendell] Version bump of spark-ec2 scripts
  26. May 04, 2014
    • Address SPARK-1717 · bb2bb0cf
      msiddalingaiah authored
      I tested the change locally with Spark 0.9.1, but I can't test with 1.0.0 because there was no AMI for it at the time. It's a trivial fix, so it shouldn't cause any problems.
      
      Author: msiddalingaiah <madhu@madhu.com>
      
      Closes #641 from msiddalingaiah/master and squashes the following commits:
      
      a4f7404 [msiddalingaiah] Address SPARK-1717
    • EC2 script should exit with non-zero code on UsageError · bcb9b7fd
      Allan Douglas R. de Oliveira authored
      This is especially important because some ssh errors are raised as UsageError, which prevents automated callers of the script from detecting the failure.
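
      A minimal sketch of the behavior, assuming a script-level `UsageError` exception and a wrapper around the real entry point. `real_main` and its body are hypothetical stand-ins; only the exit-code handling is the point.

```python
import sys


class UsageError(Exception):
    """Raised for bad invocations and, per the text above, some ssh failures."""


def real_main(argv):
    # Hypothetical stand-in for the script's actual work.
    if not argv:
        raise UsageError("no action given")


def main(argv=None):
    """Run real_main and turn a UsageError into a non-zero exit status."""
    try:
        real_main(sys.argv[1:] if argv is None else argv)
    except UsageError as e:
        sys.stderr.write("\nError:\n%s\n" % e)
        sys.exit(1)
```

      With this wrapper, a shell caller can test `$?` after running the script and detect that it failed.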
      
      Author: Allan Douglas R. de Oliveira <allan@chaordicsystems.com>
      
      Closes #638 from douglaz/ec2_exit_code_fix and squashes the following commits:
      
      5915e6d [Allan Douglas R. de Oliveira] EC2 script should exit with non-zero code on UsageError
  27. May 03, 2014
    • EC2 configurable workers · 4669a84a
      Allan Douglas R. de Oliveira authored
      Added option to configure number of worker instances and to set SPARK_MASTER_OPTS
      
      Depends on: https://github.com/mesos/spark-ec2/pull/46
      
      Author: Allan Douglas R. de Oliveira <allan@chaordicsystems.com>
      
      Closes #612 from douglaz/ec2_configurable_workers and squashes the following commits:
      
      d6c5d65 [Allan Douglas R. de Oliveira] Added master opts parameter
      6c34671 [Allan Douglas R. de Oliveira] Use number of worker instances as string on template
      ba528b9 [Allan Douglas R. de Oliveira] Added SPARK_WORKER_INSTANCES parameter
  28. Apr 10, 2014
  29. Mar 05, 2014
    • SPARK-1156: allow user to login into a cluster without slaves · 3eb009f3
      CodingCat authored
      Reported in https://spark-project.atlassian.net/browse/SPARK-1156
      
      The current spark-ec2 script doesn't allow a user to log in to a cluster without slaves. One of the issues caused by this behaviour is that when all the workers have died, the user cannot even log in to the cluster for debugging, etc.
      
      Author: CodingCat <zhunansjtu@gmail.com>
      
      Closes #58 from CodingCat/SPARK-1156 and squashes the following commits:
      
      104af07 [CodingCat] output ERROR to stderr
      9a71769 [CodingCat] do not allow user to start 0-slave cluster
      24a7c79 [CodingCat] allow user to login into a cluster without slaves
  30. Mar 02, 2014
    • Remove remaining references to incubation · 1fd2bfd3
      Patrick Wendell authored
      This removes some loose ends not caught by the other (incubating -> tlp) patches. @markhamstra this updates the version as you mentioned earlier.
      
      Author: Patrick Wendell <pwendell@gmail.com>
      
      Closes #51 from pwendell/tlp and squashes the following commits:
      
      d553b1b [Patrick Wendell] Remove remaining references to incubation
  31. Feb 18, 2014
  32. Feb 13, 2014