Commit 6bd9a78e authored by Ankur Dave

Add back Bagel links to docs, but mark them superseded

parent cfc10c74
@@ -67,6 +67,7 @@
<li class="divider"></li>
<li><a href="streaming-programming-guide.html">Spark Streaming</a></li>
<li><a href="mllib-guide.html">MLlib (Machine Learning)</a></li>
<li><a href="bagel-programming-guide.html">Bagel (Pregel on Spark, superseded by GraphX)</a></li>
<li><a href="graphx-programming-guide.html">GraphX (Graph-Parallel Spark)</a></li>
</ul>
</li>
@@ -79,7 +80,8 @@
<li class="divider"></li>
<li><a href="api/streaming/index.html#org.apache.spark.streaming.package">Spark Streaming</a></li>
<li><a href="api/mllib/index.html#org.apache.spark.mllib.package">MLlib (Machine Learning)</a></li>
<li><a href="api/graphx/index.html#org.apache.spark.graphx.package">GraphX (Graph-Paralle Spark)</a></li>
<li><a href="api/bagel/index.html#org.apache.spark.bagel.package">Bagel (Pregel on Spark, superseded by GraphX)</a></li>
<li><a href="api/graphx/index.html#org.apache.spark.graphx.package">GraphX (Graph-Parallel Spark)</a></li>
</ul>
</li>
@@ -8,5 +8,6 @@ Here you can find links to the Scaladoc generated for the Spark sbt subprojects.
- [Spark](api/core/index.html)
- [Spark Examples](api/examples/index.html)
- [Spark Streaming](api/streaming/index.html)
- [Bagel](api/bagel/index.html)
- [Bagel](api/bagel/index.html) *(superseded by GraphX)*
- [GraphX](api/graphx/index.html)
- [PySpark](api/pyspark/index.html)
@@ -3,6 +3,8 @@ layout: global
title: Bagel Programming Guide
---
**Bagel has been superseded by [GraphX](graphx-programming-guide.html) for graph processing. New users should use GraphX instead.**
Bagel is a Spark implementation of Google's [Pregel](http://portal.acm.org/citation.cfm?id=1807184) graph processing framework. Bagel currently supports basic graph computation, combiners, and aggregators.
In the Pregel programming model, jobs run as a sequence of iterations called _supersteps_. In each superstep, each vertex in the graph runs a user-specified function that can update state associated with the vertex and send messages to other vertices for use in the *next* iteration.
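The superstep model described in the guide text above can be illustrated without Spark. The following is a minimal local sketch, not Bagel's implementation: `SuperstepSketch`, `V`, `Msg`, `Compute`, `run`, and `maxValue` are all hypothetical names chosen for illustration. It assumes a vertex runs in a superstep if it is still active or has incoming messages, and that the job ends when no vertex is active and no messages are pending.

```scala
// Local, Spark-free sketch of a Pregel-style superstep loop.
// All names here are hypothetical and are NOT part of the Bagel API.
object SuperstepSketch {
  // Vertex state: a value, outgoing edge targets, and an "active" flag.
  final case class V(value: Int, outEdges: List[String], active: Boolean)
  // A message addressed to a target vertex ID.
  final case class Msg(targetId: String, value: Int)

  // (vertex, messages from the previous superstep, superstep number)
  //   => (new vertex state, outgoing messages for the next superstep)
  type Compute = (V, List[Msg], Int) => (V, List[Msg])

  def run(vertices: Map[String, V], compute: Compute, maxSupersteps: Int): Map[String, V] = {
    var verts = vertices
    var inbox = Map.empty[String, List[Msg]]
    var superstep = 0
    var done = false
    while (!done && superstep < maxSupersteps) {
      val results = verts.map { case (id, v) =>
        val msgs = inbox.getOrElse(id, Nil)
        // Assumption: a vertex runs if it is active or has received messages.
        if (v.active || msgs.nonEmpty) (id, compute(v, msgs, superstep))
        else (id, (v, List.empty[Msg]))
      }
      verts = results.map { case (id, (v, _)) => (id, v) }
      // Messages sent now are delivered in the *next* superstep.
      inbox = results.values.flatMap(_._2).toList.groupBy(_.targetId)
      done = inbox.isEmpty && verts.values.forall(!_.active)
      superstep += 1
    }
    verts
  }

  // Example compute function: propagate the maximum value through the graph.
  val maxValue: Compute = (v, msgs, superstep) => {
    val best = (v.value :: msgs.map(_.value)).max
    if (superstep == 0 || best > v.value)
      (v.copy(value = best, active = false), v.outEdges.map(Msg(_, best)))
    else
      (v.copy(active = false), Nil)
  }
}
```

On a three-vertex cycle, calling `SuperstepSketch.run(graph, SuperstepSketch.maxValue, 10)` spreads the largest starting value to every vertex and then halts once no messages remain. Bagel distributes the same loop over an RDD of (K, V) pairs; this sketch only shows the control flow.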
@@ -21,7 +23,7 @@ To use Bagel in your program, add the following SBT or Maven dependency:
Bagel operates on a graph represented as a [distributed dataset](scala-programming-guide.html) of (K, V) pairs, where keys are vertex IDs and values are vertices plus their associated state. In each superstep, Bagel runs a user-specified compute function on each vertex that takes as input the current vertex state and a list of messages sent to that vertex during the previous superstep, and returns the new vertex state and a list of outgoing messages.
For example, we can use Bagel to implement PageRank. Here, vertices represent pages, edges represent links between pages, and messages represent shares of PageRank sent to the pages that a particular page links to.
For example, we can use Bagel to implement PageRank. Here, vertices represent pages, edges represent links between pages, and messages represent shares of PageRank sent to the pages that a particular page links to.
We first extend the default `Vertex` class to store a `Double`
representing the current PageRank of the vertex, and similarly extend
@@ -38,7 +40,7 @@ import org.apache.spark.bagel.Bagel._
val active: Boolean) extends Vertex
@serializable class PRMessage(
val targetId: String, val rankShare: Double) extends Message
val targetId: String, val rankShare: Double) extends Message
{% endhighlight %}
Next, we load a sample graph from a text file as a distributed dataset and package it into `PRVertex` objects. We also cache the distributed dataset because Bagel will use it multiple times and we'd like to avoid recomputing it.
@@ -114,7 +116,7 @@ Here are the actions and types in the Bagel API. See [Bagel.scala](https://githu
/*** Full form ***/
Bagel.run(sc, vertices, messages, combiner, aggregator, partitioner, numSplits)(compute)
// where compute takes (vertex: V, combinedMessages: Option[C], aggregated: Option[A], superstep: Int)
// where compute takes (vertex: V, combinedMessages: Option[C], aggregated: Option[A], superstep: Int)
// and returns (newVertex: V, outMessages: Array[M])
/*** Abbreviated forms ***/
@@ -124,7 +126,7 @@ Bagel.run(sc, vertices, messages, combiner, partitioner, numSplits)(compute)
// and returns (newVertex: V, outMessages: Array[M])
Bagel.run(sc, vertices, messages, combiner, numSplits)(compute)
// where compute takes (vertex: V, combinedMessages: Option[C], superstep: Int)
// where compute takes (vertex: V, combinedMessages: Option[C], superstep: Int)
// and returns (newVertex: V, outMessages: Array[M])
Bagel.run(sc, vertices, messages, numSplits)(compute)
@@ -16,7 +16,7 @@ title: GraphX Programming Guide
# Overview
GraphX is the new (alpha) Spark API for graphs and graph-parallel
computation. At a high-level GraphX, extends the Spark
computation. At a high-level, GraphX extends the Spark
[RDD](api/core/index.html#org.apache.spark.rdd.RDD) by
introducing the [Resilient Distributed property Graph (RDG)](#property_graph):
a directed graph with properties attached to each vertex and edge.
@@ -77,12 +77,13 @@ graph-parallel systems while easily expressing the entire analytics pipelines.
## GraphX Replaces the Spark Bagel API
Prior to the release of GraphX, graph computation in Spark was expressed using
Bagel, an implementation of the Pregel API. GraphX improves upon Bagel by exposing
a richer property graph API, a more streamlined version of the Pregel abstraction,
and system optimizations to improve performance and reduce memory
Bagel, an implementation of the Pregel API. GraphX improves upon Bagel by
exposing a richer property graph API, a more streamlined version of the Pregel
abstraction, and system optimizations to improve performance and reduce memory
overhead. While we plan to eventually deprecate the Bagel, we will continue to
support the API and [Bagel programming guide](bagel-programming-guide.html). However,
we encourage Bagel to explore the new GraphX API and comment on issues that may
support the [Bagel API](api/bagel/index.html#org.apache.spark.bagel.package) and
[Bagel programming guide](bagel-programming-guide.html). However, we encourage
Bagel users to explore the new GraphX API and comment on issues that may
complicate the transition from Bagel.
# The Property Graph
@@ -168,4 +169,3 @@ val userInfoWithPageRank = subgraph.outerJoinVertices(pagerankGraph.vertices){
println(userInfoWithPageRank.top(5))
{% endhighlight %}
@@ -77,7 +77,8 @@ For this version of Spark (0.8.1) Hadoop 2.2.x (or newer) users will have to bui
* [Python Programming Guide](python-programming-guide.html): using Spark from Python
* [Spark Streaming](streaming-programming-guide.html): using the alpha release of Spark Streaming
* [MLlib (Machine Learning)](mllib-guide.html): Spark's built-in machine learning library
* [GraphX (Graphs on Spark)](graphx-programming-guide.html): simple graph processing model
* [Bagel (Pregel on Spark)](bagel-programming-guide.html): simple graph processing model *(superseded by GraphX)*
* [GraphX (Graphs on Spark)](graphx-programming-guide.html): Spark's new API for graphs
**API Docs:**
@@ -85,6 +86,7 @@ For this version of Spark (0.8.1) Hadoop 2.2.x (or newer) users will have to bui
* [Spark for Python (Epydoc)](api/pyspark/index.html)
* [Spark Streaming for Java/Scala (Scaladoc)](api/streaming/index.html)
* [MLlib (Machine Learning) for Java/Scala (Scaladoc)](api/mllib/index.html)
* [Bagel (Pregel on Spark) for Scala (Scaladoc)](api/bagel/index.html) *(superseded by GraphX)*
* [GraphX (Graphs on Spark) for Scala (Scaladoc)](api/graphx/index.html)