[SPARK-6943] [SPARK-6944] DAG visualization on SparkUI
This patch adds the functionality to display the RDD DAG on the SparkUI. This DAG describes the relationships between

- an RDD and its dependencies,
- an RDD and its operation scopes, and
- an RDD's operation scopes and the stage / job hierarchy.

An operation scope here refers to the existing public APIs that created the RDDs (e.g. `textFile`, `treeAggregate`). In the future, we can expand this to include higher level operations like SQL queries.

*Note: This blatantly stole a few lines of HTML and JavaScript from #5547 (thanks shroffpradyumn!)*

Here's what the job page looks like:

<img src="https://issues.apache.org/jira/secure/attachment/12730286/job-page.png" width="700px"/>

and the stage page:

<img src="https://issues.apache.org/jira/secure/attachment/12730287/stage-page.png" width="300px"/>

Author: Andrew Or <andrew@databricks.com>

Closes #5729 from andrewor14/viz2 and squashes the following commits:

666c03b [Andrew Or] Round corners of RDD boxes on stage page (minor)
01ba336 [Andrew Or] Change RDD cache color to red (minor)
6f9574a [Andrew Or] Add tests for RDDOperationScope
1c310e4 [Andrew Or] Wrap a few more RDD functions in an operation scope
3ffe566 [Andrew Or] Restore "null" as default for RDD name
5fdd89d [Andrew Or] children -> child (minor)
0d07a84 [Andrew Or] Fix python style
afb98e2 [Andrew Or] Merge branch 'master' of github.com:apache/spark into viz2
0d7aa32 [Andrew Or] Fix python tests
3459ab2 [Andrew Or] Fix tests
832443c [Andrew Or] Merge branch 'master' of github.com:apache/spark into viz2
429e9e1 [Andrew Or] Display cached RDDs on the viz
b1f0fd1 [Andrew Or] Rename OperatorScope -> RDDOperationScope
31aae06 [Andrew Or] Extract visualization logic from listener
83f9c58 [Andrew Or] Implement a programmatic representation of operator scopes
5a7faf4 [Andrew Or] Rename references to viz scopes to viz clusters
ee33d52 [Andrew Or] Separate HTML generating code from listener
f9830a2 [Andrew Or] Refactor + clean up + document JS visualization code
b80cc52 [Andrew Or] Merge branch 'master' of github.com:apache/spark into viz2
0706992 [Andrew Or] Add link from jobs to stages
deb48a0 [Andrew Or] Translate stage boxes taking into account the width
5c7ce16 [Andrew Or] Connect RDDs across stages + update style
ab91416 [Andrew Or] Introduce visualization to the Job Page
5f07e9c [Andrew Or] Remove more return statements from scopes
5e388ea [Andrew Or] Fix line too long
43de96e [Andrew Or] Add parent IDs to StageInfo
6e2cfea [Andrew Or] Remove all return statements in `withScope`
d19c4da [Andrew Or] Merge branch 'master' of github.com:apache/spark into viz2
7ef957c [Andrew Or] Fix scala style
4310271 [Andrew Or] Merge branch 'master' of github.com:apache/spark into viz2
aa868a9 [Andrew Or] Ensure that HadoopRDD is actually serializable
c3bfcae [Andrew Or] Re-implement scopes using closures instead of annotations
52187fc [Andrew Or] Rat excludes
09d361e [Andrew Or] Add ID to node label (minor)
71281fa [Andrew Or] Embed the viz in the UI in a toggleable manner
8dd5af2 [Andrew Or] Fill in documentation + miscellaneous minor changes
fe7816f [Andrew Or] Merge branch 'master' of github.com:apache/spark into viz
205f838 [Andrew Or] Reimplement rendering with dagre-d3 instead of viz.js
5e22946 [Andrew Or] Merge branch 'master' of github.com:apache/spark into viz
6a7cdca [Andrew Or] Move RDD scope util methods and logic to its own file
494d5c2 [Andrew Or] Revert a few unintended style changes
9fac6f3 [Andrew Or] Re-implement scopes through annotations instead
f22f337 [Andrew Or] First working implementation of visualization with vis.js
2184348 [Andrew Or] Translate RDD information to dot file
5143523 [Andrew Or] Expose the necessary information in RDDInfo
a9ed4f9 [Andrew Or] Add a few missing scopes to certain RDD methods
6b3403b [Andrew Or] Scope all RDD methods
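
As an illustration of the operation-scope idea in the commit message, here is a minimal Python sketch. It is not the code from this patch: `OperationScope`, `Node`, `with_scope`, `new_node`, `text_file`, `map_op`, and `to_dot` are hypothetical names, and letting nested scopes record a parent is an assumption about behavior that the patch may handle differently. The sketch shows two things the message mentions: a closure-based scope wrapper that tags every node created inside a public API call, and a translation of the resulting lineage into DOT text with one cluster per scope.

```python
from dataclasses import dataclass
from itertools import count
from typing import Callable, Dict, List, Optional, Sequence, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class OperationScope:
    """Hypothetical scope: one invocation of a public API such as textFile."""
    id: int
    name: str
    parent: Optional["OperationScope"] = None


@dataclass
class Node:
    """Hypothetical stand-in for an RDD: ID, label, dependencies, scope."""
    id: int
    label: str
    deps: List["Node"]
    scope: Optional[OperationScope]


_scope_ids = count()
_node_ids = count()
_current_scope: Optional[OperationScope] = None


def with_scope(name: str, body: Callable[[], T]) -> T:
    """Run `body` so that every node it creates is tagged with a new scope."""
    global _current_scope
    previous = _current_scope
    _current_scope = OperationScope(next(_scope_ids), name, parent=previous)
    try:
        return body()
    finally:
        # Restore the enclosing scope even if `body` raises.
        _current_scope = previous


def new_node(label: str, deps: Sequence[Node] = ()) -> Node:
    """Create a node that remembers whichever scope is active right now."""
    return Node(next(_node_ids), label, list(deps), _current_scope)


def text_file(path: str) -> Node:
    """Toy public API: builds two nodes, both inside one 'textFile' scope."""
    def body() -> Node:
        # The path goes into the label unescaped; fine for a sketch only.
        hadoop = new_node(f"HadoopRDD {path}")
        return new_node("MapPartitionsRDD", [hadoop])
    return with_scope("textFile", body)


def map_op(parent: Node) -> Node:
    """Toy public API: builds one node inside a 'map' scope."""
    return with_scope("map", lambda: new_node("MapPartitionsRDD", [parent]))


def to_dot(root: Node) -> str:
    """Render the lineage of `root` as DOT, one cluster per operation scope."""
    seen: Dict[int, Node] = {}
    stack = [root]
    while stack:
        node = stack.pop()
        if node.id not in seen:
            seen[node.id] = node
            stack.extend(node.deps)
    nodes = sorted(seen.values(), key=lambda n: n.id)

    # Group nodes by the scope invocation that created them.
    clusters: Dict[Optional[int], List[Node]] = {}
    for node in nodes:
        key = node.scope.id if node.scope else None
        clusters.setdefault(key, []).append(node)

    lines = ["digraph G {"]
    for key, members in clusters.items():
        if key is not None:
            lines.append(f"  subgraph cluster_{key} {{")
            lines.append(f'    label="{members[0].scope.name}";')
        for node in members:
            lines.append(f'    {node.id} [label="{node.label} [{node.id}]"];')
        if key is not None:
            lines.append("  }")
    for node in nodes:
        for dep in node.deps:
            lines.append(f"  {dep.id}->{node.id};")
    lines.append("}")
    return "\n".join(lines)
```

Calling `to_dot(map_op(text_file("README.md")))` yields a graph with two clusters, one for `textFile` containing both of its nodes and one for `map`. Passing the body as a closure, rather than marking methods with annotations, keeps the scope bookkeeping in one wrapper and lets it be restored in a `finally` block.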
Changed files:
- .rat-excludes: 3 additions, 0 deletions
- core/src/main/resources/org/apache/spark/ui/static/d3.min.js: 5 additions, 0 deletions
- core/src/main/resources/org/apache/spark/ui/static/dagre-d3.min.js: 29 additions, 0 deletions
- core/src/main/resources/org/apache/spark/ui/static/graphlib-dot.min.js: 4 additions, 0 deletions
- core/src/main/resources/org/apache/spark/ui/static/spark-dag-viz.js: 392 additions, 0 deletions
- core/src/main/resources/org/apache/spark/ui/static/webui.css: 1 addition, 1 deletion
- core/src/main/scala/org/apache/spark/SparkContext.scala: 58 additions, 39 deletions
- core/src/main/scala/org/apache/spark/rdd/AsyncRDDActions.scala: 5 additions, 5 deletions
- core/src/main/scala/org/apache/spark/rdd/DoubleRDDFunctions.scala: 27 additions, 11 deletions
- core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala: 5 additions, 1 deletion
- core/src/main/scala/org/apache/spark/rdd/OrderedRDDFunctions.scala: 3 additions, 3 deletions
- core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala: 99 additions, 68 deletions
- core/src/main/scala/org/apache/spark/rdd/RDD.scala: 205 additions, 136 deletions
- core/src/main/scala/org/apache/spark/rdd/RDDOperationScope.scala: 137 additions, 0 deletions
- core/src/main/scala/org/apache/spark/rdd/SequenceFileRDDFunctions.scala: 3 additions, 1 deletion
- core/src/main/scala/org/apache/spark/scheduler/StageInfo.scala: 2 additions, 0 deletions
- core/src/main/scala/org/apache/spark/storage/RDDInfo.scala: 7 additions, 4 deletions
- core/src/main/scala/org/apache/spark/ui/SparkUI.scala: 9 additions, 1 deletion
- core/src/main/scala/org/apache/spark/ui/UIUtils.scala: 54 additions, 1 deletion
- core/src/main/scala/org/apache/spark/ui/jobs/AllJobsPage.scala: 1 addition, 1 deletion