[SPARK-17437] Add uiWebUrl to JavaSparkContext and pyspark.SparkContext (commit 4a426ff8)
Adrian Petrescu authored
    ## What changes were proposed in this pull request?
    
The Scala version of `SparkContext` has a handy field called `uiWebUrl` that tells you the URL at which the Spark UI spawned by that instance is serving. This is often very useful because the value of `spark.ui.port` in the config is only a suggestion: if that port is already taken by another Spark instance on the same machine, Spark keeps incrementing the port number until it finds a free one. So, on a machine running many PySpark instances, you often have to try the ports one by one until you find the UI for your application.
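As a minimal illustration of the problem (the app name is hypothetical, and the port collision assumes another Spark application is already running on the same machine):

```python
from pyspark import SparkConf, SparkContext

# Ask for the default UI port explicitly; this is only a starting point.
conf = SparkConf().setAppName("my-app").set("spark.ui.port", "4040")
sc = SparkContext(conf=conf)

# If another Spark application on this machine already holds 4040,
# the UI actually comes up on 4041, 4042, ... and nothing in the
# configuration alone tells you which port you got.
```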
    
Scala users have a way around this with `uiWebUrl`, but Java and Python users do not. This pull request fixes that in the most straightforward way possible, by propagating the field through `JavaSparkContext` and into pyspark via the Java gateway.
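A rough sketch of the Python side of the change (not the exact patch; it assumes the JVM-side `uiWebUrl` is exposed as a Scala `Option[String]` and that `self._jsc` is the Py4J handle to the underlying `JavaSparkContext`):

```python
class SparkContext(object):
    # ... existing attributes, including self._jsc, the Py4J handle
    # to the underlying JavaSparkContext ...

    @property
    def uiWebUrl(self):
        """Return the URL of the SparkUI started by this SparkContext."""
        # Delegate to the JVM SparkContext; uiWebUrl is an Option[String]
        # on the Scala side, so unwrap it with get().
        return self._jsc.sc().uiWebUrl().get()
```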
    
    Please let me know if any additional documentation/testing is needed.
    
    ## How was this patch tested?
    
Existing tests were run to make sure there were no regressions, and a binary distribution was built and tested manually to confirm that `sc.uiWebUrl` reports the correct value in a variety of circumstances.
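For example, after this change a PySpark user can read the actual UI location directly (the app name and printed URL below are illustrative only):

```python
from pyspark import SparkContext

sc = SparkContext(appName="ui-url-demo")
print(sc.uiWebUrl)   # e.g. 'http://10.0.0.5:4041' if 4040 was already taken
sc.stop()
```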
    
    Author: Adrian Petrescu <apetresc@gmail.com>
    
    Closes #15000 from apetresc/pyspark-uiweburl.
Changed file: context.py (40.66 KiB)