    [SPARK-20756][YARN] yarn-shuffle jar references unshaded guava and contains scala classes · 36309110
    
    ## What changes were proposed in this pull request?
    This change ensures that all references to guava from within the yarn shuffle jar point to the shaded guava classes already provided in the jar.
    
    Also, it explicitly excludes scala classes from being added to the jar.
    
    ## How was this patch tested?
    Ran unit tests on the module and they passed.
    javap now returns the expected result: a reference to the shaded guava under `org/spark_project` (previously this referred to `com.google...`).
    ```
    javap -cp common/network-yarn/target/scala-2.11/spark-2.3.0-SNAPSHOT-yarn-shuffle.jar -c org/apache/spark/network/yarn/YarnShuffleService | grep Lists
          57: invokestatic  #138                // Method org/spark_project/guava/collect/Lists.newArrayList:()Ljava/util/ArrayList;
    ```
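The javap check above inspects one class. As a rough complement, the sketch below scans every class in the jar for leftover unshaded references. It assumes an unshaded Guava reference shows up as the byte string `com/google/common/` in a class file's constant pool (Guava's package is `com.google.common`). The function name `find_unshaded_guava_refs` is hypothetical and not part of Spark, and a raw byte search is only a heuristic, not a bytecode parse.

```python
import zipfile

# Assumption: unshaded Guava references appear as this internal-name prefix
# inside the constant pool of a compiled class.
UNSHADED_GUAVA = b"com/google/common/"


def find_unshaded_guava_refs(jar_path):
    """Return sorted names of .class entries whose bytes mention unshaded Guava."""
    hits = []
    with zipfile.ZipFile(jar_path) as jar:
        for name in jar.namelist():
            # Only compiled classes are checked; metadata files are skipped.
            if name.endswith(".class") and UNSHADED_GUAVA in jar.read(name):
                hits.append(name)
    return sorted(hits)
```

Calling `find_unshaded_guava_refs(...)` with the path to the yarn-shuffle jar should return an empty list if all references were relocated. Any names it returns are candidates to confirm with javap.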
    
    Guava is still shaded in the jar:
    ```
    jar -tf common/network-yarn/target/scala-2.11/spark-2.3.0-SNAPSHOT-yarn-shuffle.jar | grep guava | head
    META-INF/maven/com.google.guava/
    META-INF/maven/com.google.guava/guava/
    META-INF/maven/com.google.guava/guava/pom.properties
    META-INF/maven/com.google.guava/guava/pom.xml
    org/spark_project/guava/
    org/spark_project/guava/annotations/
    org/spark_project/guava/annotations/Beta.class
    org/spark_project/guava/annotations/GwtCompatible.class
    org/spark_project/guava/annotations/GwtIncompatible.class
    org/spark_project/guava/annotations/VisibleForTesting.class
    ```
    (not sure if the above META-INF/* is a problem or not)
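The listing above can also be summarized programmatically, which makes it easy to confirm that no scala classes are packaged. This is a minimal sketch, assuming Scala library classes live under `scala/` and unshaded Guava classes under `com/google/common/`. The shaded prefix `org/spark_project/guava/` comes from the listing above. The function name `summarize_jar` is hypothetical. Only `.class` entries are counted, so the `META-INF/maven/...` metadata files are ignored.

```python
import zipfile


def summarize_jar(jar_path):
    """Count .class entries under the shaded Guava, unshaded Guava, and Scala roots."""
    prefixes = {
        "shaded_guava": "org/spark_project/guava/",  # prefix seen in the jar listing
        "unshaded_guava": "com/google/common/",      # assumption: Guava's original package
        "scala": "scala/",                           # assumption: Scala library package root
    }
    counts = dict.fromkeys(prefixes, 0)
    with zipfile.ZipFile(jar_path) as jar:
        for name in jar.namelist():
            if not name.endswith(".class"):
                continue  # skip directories and metadata such as pom.xml
            for label, prefix in prefixes.items():
                if name.startswith(prefix):
                    counts[label] += 1
    return counts
```

For a correctly built jar, one would expect a non-zero `shaded_guava` count and zero for both `unshaded_guava` and `scala`.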
    
    I took this jar, deployed it on a yarn cluster with shuffle service enabled, and made sure the YARN node managers came up. An application with a shuffle was run and it succeeded.
    
    Author: Mark Grover <mark@apache.org>
    
    Closes #17990 from markgrover/spark-20756.