    [SPARK-5473] [EC2] Expose SSH failures after status checks pass · 4dfe180f
    Nicholas Chammas authored
    If there is some fatal problem with launching a cluster, `spark-ec2` just hangs without giving the user useful feedback on what the problem is.
    
    This PR exposes the output of the SSH calls to the user if the SSH test fails during cluster launch even though the instance status checks are all green. It also replaces the growing trail of dots shown while waiting with a fixed three dots.
    
    For example:
    
    ```
    $ ./ec2/spark-ec2 -k key -i /incorrect/path/identity.pem --instance-type m3.medium --slaves 1 --zone us-east-1c launch "spark-test"
    Setting up security groups...
    Searching for existing cluster spark-test...
    Spark AMI: ami-35b1885c
    Launching instances...
    Launched 1 slaves in us-east-1c, regid = r-7dadd096
    Launched master in us-east-1c, regid = r-fcadd017
    Waiting for cluster to enter 'ssh-ready' state...
    Warning: SSH connection error. (This could be temporary.)
    Host: 127.0.0.1
    SSH return code: 255
    SSH output: Warning: Identity file /incorrect/path/identity.pem not accessible: No such file or directory.
    Warning: Permanently added '127.0.0.1' (RSA) to the list of known hosts.
    Permission denied (publickey).
    ```
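
The behaviour in the output above can be sketched roughly as follows. This is a minimal illustration, not the actual `spark-ec2` code: the function names, the `print_ssh_output` parameter, and the stand-in command `FAKE_FAILING_SSH` are all assumptions made for the example. The idea is to capture the combined stdout and stderr of the SSH test and print it with the host and return code whenever the test fails.

```python
import subprocess
import sys


def format_ssh_failure(host, return_code, output):
    """Build the warning text shown when an SSH test fails (hypothetical helper)."""
    return "\n".join([
        "Warning: SSH connection error. (This could be temporary.)",
        "Host: {0}".format(host),
        "SSH return code: {0}".format(return_code),
        "SSH output: {0}".format(output.rstrip()),
    ])


def is_ssh_available(host, command, print_ssh_output=True):
    """Run `command` as the SSH test for `host`; return True if it exits with 0.

    `command` is passed in explicitly so the sketch can be tried without a
    real cluster; a real tool would build an `ssh ... true` command here.
    """
    proc = subprocess.Popen(command, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    raw_output, _ = proc.communicate()
    output = raw_output.decode("utf-8", errors="replace")
    if proc.returncode != 0 and print_ssh_output:
        # Surface the failure instead of silently retrying.
        print(format_ssh_failure(host, proc.returncode, output))
    return proc.returncode == 0


# Stand-in for a failing SSH call (assumption: no real SSH is involved here).
FAKE_FAILING_SSH = [
    sys.executable, "-c",
    "import sys; print('Permission denied (publickey).'); sys.exit(255)",
]
```

Calling `is_ssh_available("127.0.0.1", FAKE_FAILING_SSH)` prints the warning block and returns `False`, so a caller can keep polling while the user sees why the connection is failing.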
    
    This should give users enough information to recognize an unrecoverable error during launch and abort. It should help avoid situations like the ones reported [here on Stack Overflow](http://stackoverflow.com/q/28002443/) and [here on the user list](http://mail-archives.apache.org/mod_mbox/spark-user/201501.mbox/%3C1422323829398-21381.postn3.nabble.com%3E), where users couldn't tell what the problem was because `spark-ec2` was hiding it.
    
    This is a usability improvement that should be backported to 1.2.
    
    Resolves [SPARK-5473](https://issues.apache.org/jira/browse/SPARK-5473).
    
    Author: Nicholas Chammas <nicholas.chammas@gmail.com>
    
    Closes #4262 from nchammas/expose-ssh-failure and squashes the following commits:
    
    8bda6ed [Nicholas Chammas] default to print SSH output
    2b92534 [Nicholas Chammas] show SSH output after status check pass