  1. Nov 17, 2016
    • [SPARK-18480][DOCS] Fix wrong links for ML guide docs · 536a2159
      Zheng RuiFeng authored
      
      ## What changes were proposed in this pull request?
      1, There are two `[Graph.partitionBy]` in `graphx-programming-guide.md`; the first one had no effect.
      2, `DataFrame`, `Transformer`, `Pipeline` and `Parameter` in `ml-pipeline.md` were linked to `ml-guide.html` by mistake.
      3, `PythonMLLibAPI` in `mllib-linear-methods.md` was not accessible, because class `PythonMLLibAPI` is private.
      4, Other link updates.
      ## How was this patch tested?
      Manual tests.
      
      Author: Zheng RuiFeng <ruifengz@foxmail.com>
      
      Closes #15912 from zhengruifeng/md_fix.
      
      (cherry picked from commit cdaf4ce9)
      Signed-off-by: Sean Owen <sowen@cloudera.com>
      536a2159
  2. Nov 14, 2016
    • [SPARK-18428][DOC] Update docs for GraphX · 649c15fa
      Zheng RuiFeng authored
      
      ## What changes were proposed in this pull request?
      1, Add links to `VertexRDD` and `EdgeRDD`
      2, Note in `Vertex and Edge RDDs` that not all methods are listed
      3, `VertexID` -> `VertexId`
      
      ## How was this patch tested?
      No tests; only docs are modified.
      
      Author: Zheng RuiFeng <ruifengz@foxmail.com>
      
      Closes #15875 from zhengruifeng/update_graphop_doc.
      
      (cherry picked from commit c31def1d)
      Signed-off-by: Reynold Xin <rxin@databricks.com>
      649c15fa
  3. Aug 16, 2016
  4. Aug 12, 2016
  5. Aug 07, 2016
    • [SPARK-16911] Fix the links in the programming guide · 6c1ecb19
      Shivansh authored
      ## What changes were proposed in this pull request?
      
      Fix the broken links in the programming guide for GraphX migration and understanding closures.
      
      ## How was this patch tested?
      
      By running the test cases and checking the links.
      
      Author: Shivansh <shiv4nsh@gmail.com>
      
      Closes #14503 from shiv4nsh/SPARK-16911.
      6c1ecb19
  6. Jul 02, 2016
    • [SPARK-16345][DOCUMENTATION][EXAMPLES][GRAPHX] Extract graphx programming... · 0bd7cd18
      WeichenXu authored
      [SPARK-16345][DOCUMENTATION][EXAMPLES][GRAPHX] Extract graphx programming guide example snippets from source files instead of hard-coding them
      
      ## What changes were proposed in this pull request?
      
      I extracted 6 example programs from the GraphX programming guide and replaced them with
      `include_example` labels.
      
      The 6 example programs are:
      - AggregateMessagesExample.scala
      - SSSPExample.scala
      - TriangleCountingExample.scala
      - ConnectedComponentsExample.scala
      - ComprehensiveExample.scala
      - PageRankExample.scala
      
      All the example code can be run using
      `bin/run-example graphx.EXAMPLE_NAME`
      
      ## How was this patch tested?
      
      Manual.
      
      Author: WeichenXu <WeichenXu123@outlook.com>
      
      Closes #14015 from WeichenXu123/graphx_example_plugin.
      0bd7cd18
    • [GRAPHX][EXAMPLES] move graphx test data directory and update graphx document · 192d1f9c
      WeichenXu authored
      ## What changes were proposed in this pull request?
      
      There are two test data files used for the GraphX examples in the "graphx/data" directory.
      I moved them into the "data/" directory because the "graphx" directory is meant for code files, and the other test data files (such as the MLlib and streaming test data) are all in "data/".
      
      I also updated the GraphX documentation where it references the data files I moved.
      
      ## How was this patch tested?
      
      N/A
      
      Author: WeichenXu <WeichenXu123@outlook.com>
      
      Closes #14010 from WeichenXu123/move_graphx_data_dir.
      192d1f9c
  7. Jun 07, 2016
    • [MINOR] fix typo in documents · 1e2c9311
      WeichenXu authored
      ## What changes were proposed in this pull request?
      
      I used spell-check tools to find typos in the Spark documents and fixed them.
      
      ## How was this patch tested?
      
      N/A
      
      Author: WeichenXu <WeichenXu123@outlook.com>
      
      Closes #13538 from WeichenXu123/fix_doc_typo.
      1e2c9311
  8. Dec 09, 2015
  9. Oct 18, 2015
  10. Aug 19, 2015
  11. Jul 27, 2015
    • Pregel example type fix · 90006f3c
      Alexander Ulanov authored
      The Pregel example expressing single-source shortest path from https://spark.apache.org/docs/latest/graphx-programming-guide.html#pregel-api does not work due to an incorrect type. The reason is that `GraphGenerators.logNormalGraph` returns a graph whose vertex attributes are `Long`. The fix changes `val graph: Graph[Int, Double]` to `val graph: Graph[Long, Double]`.
      
      Author: Alexander Ulanov <nashb@yandex.ru>
      
      Closes #7695 from avulanov/SPARK-9380-pregel-doc and squashes the following commits:
      
      c269429 [Alexander Ulanov] Pregel example type fix
      90006f3c
  12. Mar 26, 2015
    • [SPARK-6510][GraphX]: Add Graph#minus method to act as Set#difference · 39fb5796
      Brennon York authored
      Adds a `Graph#minus` method which will return only the unique `VertexId`s from the calling `VertexRDD`.
      
      To demonstrate a basic example with pseudocode:
      
      ```
      Set((0L,0),(1L,1)).minus(Set((1L,1),(2L,2)))
      > Set((0L,0))
      ```
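
The pseudocode above can be made concrete with a small sketch in plain Python. This is not the GraphX API: the `minus` function below is a hypothetical stand-in operating on ordinary sets of `(vertex_id, attribute)` pairs, and it assumes the difference is taken on vertex ids rather than on whole pairs.

```python
def minus(this_vertices, other_vertices):
    """Keep entries of this_vertices whose vertex id does not appear in other_vertices.

    Both arguments are sets of (vertex_id, attribute) pairs. Matching on the
    id alone is an assumption made for this sketch.
    """
    other_ids = {vertex_id for vertex_id, _ in other_vertices}
    return {(vertex_id, attr) for vertex_id, attr in this_vertices
            if vertex_id not in other_ids}


left = {(0, 0), (1, 1)}
right = {(1, 1), (2, 2)}
result = minus(left, right)
print(result)  # {(0, 0)}
```

Under that assumption, a vertex is removed even when its attribute differs between the two sides, for example `minus({(1, "a")}, {(1, "b")})` yields an empty set.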
      
      Author: Brennon York <brennon.york@capitalone.com>
      
      Closes #5175 from brennonyork/SPARK-6510 and squashes the following commits:
      
      248d5c8 [Brennon York] added minus(VertexRDD[VD]) method to avoid createUsingIndex and updated the mask operations to simplify with andNot call
      3fb7cce [Brennon York] updated graphx doc to reflect the addition of minus method
      6575d92 [Brennon York] updated mima exclude
      aaa030b [Brennon York] completed graph#minus functionality
      7227c0f [Brennon York] beginning work on minus functionality
      39fb5796
  13. Mar 02, 2015
    • aggregateMessages example in graphX doc · e7d8ae44
      DEBORAH SIEGEL authored
      The examples illustrating the difference between legacy mapReduceTriplets usage and aggregateMessages usage have type issues in the reduce step for both operators.
      
      Since this is just an example, I changed it to reduce the message String by concatenation, although this is not optimal for performance.
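
As a rough illustration of reducing String messages by concatenation, here is a plain-Python sketch of the send-then-merge pattern. It does not use GraphX: `aggregate_messages`, the `(src_id, dst_id, attr)` edge layout, and the sample edges are all hypothetical names and data chosen for this sketch.

```python
def aggregate_messages(edges, send_msg, merge_msg):
    """Collect messages per target vertex and combine them with merge_msg.

    edges: iterable of (src_id, dst_id, attr) tuples (hypothetical layout).
    send_msg: maps an edge to a list of (target_id, message) pairs.
    merge_msg: combines two messages addressed to the same vertex.
    """
    inbox = {}
    for edge in edges:
        for target_id, msg in send_msg(edge):
            if target_id in inbox:
                inbox[target_id] = merge_msg(inbox[target_id], msg)
            else:
                inbox[target_id] = msg
    return inbox


edges = [(1, 3, "follows"), (2, 3, "likes"), (3, 4, "follows")]
result = aggregate_messages(
    edges,
    send_msg=lambda e: [(e[1], e[2])],    # send the edge attribute to the destination
    merge_msg=lambda a, b: a + " " + b,   # reduce String messages by concatenation
)
print(result)  # {3: 'follows likes', 4: 'follows'}
```

Because string concatenation depends on the order of its operands, the merged text here reflects the order of the edge list; a distributed run would not necessarily merge in that order.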
      
      Author: DEBORAH SIEGEL <deborahsiegel@DEBORAHs-MacBook-Pro.local>
      
      Closes #4853 from d3borah/master and squashes the following commits:
      
      db54173 [DEBORAH SIEGEL] fixed aggregateMessages example in graphX doc
      e7d8ae44
  14. Feb 25, 2015
    • [GraphX] fixing 3 typos in the graphx programming guide · 5b8480e0
      Benedikt Linse authored
      Corrected 3 typos in the GraphX programming guide. I hope this is the correct way to contribute.
      
      Author: Benedikt Linse <benedikt.linse@gmail.com>
      
      Closes #4766 from 1123/master and squashes the following commits:
      
      8a63812 [Benedikt Linse] fixing 3 typos in the graphx programming guide
      5b8480e0
  15. Feb 05, 2015
    • [SPARK-5608] Improve SEO of Spark documentation pages · 4d74f060
      Matei Zaharia authored
      - Add meta description tags on some of the most important doc pages
      - Shorten the titles of some pages to have more relevant keywords; for
        example there's no reason to have "Spark SQL Programming Guide - Spark
        1.2.0 documentation", we can just say "Spark SQL - Spark 1.2.0
        documentation".
      
      Author: Matei Zaharia <matei@databricks.com>
      
      Closes #4381 from mateiz/docs-seo and squashes the following commits:
      
      4940563 [Matei Zaharia] [SPARK-5608] Improve SEO of Spark documentation pages
      4d74f060
  16. Nov 21, 2014
  17. Nov 19, 2014
    • Updating GraphX programming guide and documentation · 377b0682
      Joseph E. Gonzalez authored
      This pull request revises the programming guide to reflect changes in the GraphX API as well as the deprecated mapReduceTriplets operator.
      
      Author: Joseph E. Gonzalez <joseph.e.gonzalez@gmail.com>
      
      Closes #3359 from jegonzal/GraphXProgrammingGuide and squashes the following commits:
      
      4421964 [Joseph E. Gonzalez] updating documentation for graphx
      377b0682
  18. May 30, 2014
    • [SPARK-1566] consolidate programming guide, and general doc updates · c8bf4131
      Matei Zaharia authored
      This is a fairly large PR to clean up and update the docs for 1.0. The major changes are:
      
      * A unified programming guide for all languages replaces language-specific ones and shows language-specific info in tabs
      * New programming guide sections on key-value pairs, unit testing, input formats beyond text, migrating from 0.9, and passing functions to Spark
      * Spark-submit guide moved to a separate page and expanded slightly
      * Various cleanups of the menu system, security docs, and others
      * Updated look of title bar to differentiate the docs from previous Spark versions
      
      You can find the updated docs at http://people.apache.org/~matei/1.0-docs/_site/ and in particular http://people.apache.org/~matei/1.0-docs/_site/programming-guide.html.
      
      Author: Matei Zaharia <matei@databricks.com>
      
      Closes #896 from mateiz/1.0-docs and squashes the following commits:
      
      03e6853 [Matei Zaharia] Some tweaks to configuration and YARN docs
      0779508 [Matei Zaharia] tweak
      ef671d4 [Matei Zaharia] Keep frames in JavaDoc links, and other small tweaks
      1bf4112 [Matei Zaharia] Review comments
      4414f88 [Matei Zaharia] tweaks
      d04e979 [Matei Zaharia] Fix some old links to Java guide
      a34ed33 [Matei Zaharia] tweak
      541bb3b [Matei Zaharia] miscellaneous changes
      fcefdec [Matei Zaharia] Moved submitting apps to separate doc
      61d72b4 [Matei Zaharia] stuff
      181f217 [Matei Zaharia] migration guide, remove old language guides
      e11a0da [Matei Zaharia] Add more API functions
      6a030a9 [Matei Zaharia] tweaks
      8db0ae3 [Matei Zaharia] Added key-value pairs section
      318d2c9 [Matei Zaharia] tweaks
      1c81477 [Matei Zaharia] New section on basics and function syntax
      e38f559 [Matei Zaharia] Actually added programming guide to Git
      a33d6fe [Matei Zaharia] First pass at updating programming guide to support all languages, plus other tweaks throughout
      3b6a876 [Matei Zaharia] More CSS tweaks
      01ec8bf [Matei Zaharia] More CSS tweaks
      e6d252e [Matei Zaharia] Change color of doc title bar to differentiate from 0.9.0
      c8bf4131
  19. May 10, 2014
    • Unify GraphImpl RDDs + other graph load optimizations · 905173df
      Ankur Dave authored
      This PR makes the following changes, primarily in e4fbd329aef85fe2c38b0167255d2a712893d683:
      
      1. *Unify RDDs to avoid zipPartitions.* A graph used to be four RDDs: vertices, edges, routing table, and triplet view. This commit merges them down to two: vertices (with routing table), and edges (with replicated vertices).
      
      2. *Avoid duplicate shuffle in graph building.* We used to do two shuffles when building a graph: one to extract routing information from the edges and move it to the vertices, and another to find nonexistent vertices referred to by edges. With this commit, the latter is done as a side effect of the former.
      
      3. *Avoid no-op shuffle when joins are fully eliminated.* This is a side effect of unifying the edges and the triplet view.
      
      4. *Join elimination for mapTriplets.*
      
      5. *Ship only the needed vertex attributes when upgrading the triplet view.* If the triplet view already contains source attributes, and we now need both attributes, only ship destination attributes rather than re-shipping both. This is done in `ReplicatedVertexView#upgrade`.
      
      Author: Ankur Dave <ankurdave@gmail.com>
      
      Closes #497 from ankurdave/unify-rdds and squashes the following commits:
      
      332ab43 [Ankur Dave] Merge remote-tracking branch 'apache-spark/master' into unify-rdds
      4933e2e [Ankur Dave] Exclude RoutingTable from binary compatibility check
      5ba8789 [Ankur Dave] Add GraphX upgrade guide from Spark 0.9.1
      13ac845 [Ankur Dave] Merge remote-tracking branch 'apache-spark/master' into unify-rdds
      a04765c [Ankur Dave] Remove unnecessary toOps call
      57202e8 [Ankur Dave] Replace case with pair parameter
      75af062 [Ankur Dave] Add explicit return types
      04d3ae5 [Ankur Dave] Convert implicit parameter to context bound
      c88b269 [Ankur Dave] Revert upgradeIterator to if-in-a-loop
      0d3584c [Ankur Dave] EdgePartition.size should be val
      2a928b2 [Ankur Dave] Set locality wait
      10b3596 [Ankur Dave] Clean up public API
      ae36110 [Ankur Dave] Fix style errors
      e4fbd32 [Ankur Dave] Unify GraphImpl RDDs + other graph load optimizations
      d6d60e2 [Ankur Dave] In GraphLoader, coalesce to minEdgePartitions
      62c7b78 [Ankur Dave] In Analytics, take PageRank numIter
      d64e8d4 [Ankur Dave] Log current Pregel iteration
      905173df
  20. Apr 21, 2014
    • [SPARK-1439, SPARK-1440] Generate unified Scaladoc across projects and Javadocs · fc783847
      Matei Zaharia authored
      I used the sbt-unidoc plugin (https://github.com/sbt/sbt-unidoc) to create a unified Scaladoc of our public packages, and generate Javadocs as well. One limitation is that I haven't found an easy way to exclude packages in the Javadoc; there is an SBT task that identifies Java sources to run javadoc on, but it's been very difficult to modify it from outside to change what is set in the unidoc package. Some SBT-savvy people should help with this. The Javadoc site also lacks package-level descriptions and things like that, so we may want to look into that. We may decide not to post these right now if it's too limited compared to the Scala one.
      
      Example of the built doc site: http://people.csail.mit.edu/matei/spark-unified-docs/
      
      Author: Matei Zaharia <matei@databricks.com>
      
      This patch had conflicts when merged, resolved by
      Committer: Patrick Wendell <pwendell@gmail.com>
      
      Closes #457 from mateiz/better-docs and squashes the following commits:
      
      a63d4a3 [Matei Zaharia] Skip Java/Scala API docs for Python package
      5ea1f43 [Matei Zaharia] Fix links to Java classes in Java guide, fix some JS for scrolling to anchors on page load
      f05abc0 [Matei Zaharia] Don't include java.lang package names
      995e992 [Matei Zaharia] Skip internal packages and class names with $ in JavaDoc
      a14a93c [Matei Zaharia] typo
      76ce64d [Matei Zaharia] Add groups to Javadoc index page, and a first package-info.java
      ed6f994 [Matei Zaharia] Generate JavaDoc as well, add titles, update doc site to use unified docs
      acb993d [Matei Zaharia] Add Unidoc plugin for the projects we want Unidoced
      fc783847
  21. Mar 13, 2014
    • SPARK-1183. Don't use "worker" to mean executor · 69837321
      Sandy Ryza authored
      Author: Sandy Ryza <sandy@cloudera.com>
      
      Closes #120 from sryza/sandy-spark-1183 and squashes the following commits:
      
      5066a4a [Sandy Ryza] Remove "worker" in a couple comments
      0bd1e46 [Sandy Ryza] Remove --am-class from usage
      bfc8fe0 [Sandy Ryza] Remove am-class from doc and fix yarn-alpha
      607539f [Sandy Ryza] Address review comments
      74d087a [Sandy Ryza] SPARK-1183. Don't use "worker" to mean executor
      69837321
  22. Jan 15, 2014
  23. Jan 14, 2014
  24. Jan 13, 2014