Commits · ad06727fe985ca243ebdaaba55cd7d35a4749d0a · cs525-sp18-g07 / spark

May 30, 2015

Update documentation for the new DataFrame reader/writer interface. · 00a71379

Reynold Xin authored 10 years ago

Author: Reynold Xin <rxin@databricks.com>

Closes #6522 from rxin/sql-doc-1.4 and squashes the following commits:

c227be7 [Reynold Xin] Updated link.
040b6d7 [Reynold Xin] Update documentation for the new DataFrame reader/writer interface.

Updated SQL programming guide's Hive connectivity section. · 7716a5a1
Reynold Xin authored 10 years ago

[SPARK-7849] [SQL] [Docs] Updates SQL programming guide for 1.4 · 6e3f0c78

Cheng Lian authored 10 years ago

Author: Cheng Lian <lian@databricks.com>

Closes #6520 from liancheng/spark-7849 and squashes the following commits:

705264b [Cheng Lian] Updates SQL programming guide for 1.4

May 29, 2015

[SPARK-6806] [SPARKR] [DOCS] Add a new SparkR programming guide · 5f48e5c3

Shivaram Venkataraman authored 10 years ago

This PR adds a new top-level SparkR programming guide. It will be useful for R users because our APIs don't directly match the Scala/Python APIs and because we need to explain SparkR without using RDDs as examples.

cc rxin davies pwendell

cc cafreeman -- It would be great if you could also take a look at this!

Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu>

Closes #6490 from shivaram/sparkr-guide and squashes the following commits:

d5ff360 [Shivaram Venkataraman] Add a section on HiveContext, HQL queries
408dce5 [Shivaram Venkataraman] Fix link
dbb86e3 [Shivaram Venkataraman] Fix minor typo
9aff5e0 [Shivaram Venkataraman] Address comments, use dplyr-like syntax in example
d09703c [Shivaram Venkataraman] Fix default argument in read.df
ea816a1 [Shivaram Venkataraman] Add a new SparkR programming guide Also update write.df, read.df to handle defaults better

May 28, 2015

[DOCS] Fix typo in documentation for Java UDF registration · 35410614

Matt Wise authored 10 years ago

This contribution is my original work and I license the work to the project under the project's open source license

Author: Matt Wise <mwise@quixey.com>

Closes #6447 from wisematthew/fix-typo-in-java-udf-registration-doc and squashes the following commits:

e7ef5f7 [Matt Wise] Fix typo in documentation for Java UDF registration

May 23, 2015

[SPARK-6806] [SPARKR] [DOCS] Fill in SparkR examples in programming guide · 7af3818c

Davies Liu authored 10 years ago

sqlCtx -> sqlContext

You can check the docs by:

```
$ cd docs
$ SKIP_SCALADOC=1 jekyll serve
```
cc shivaram

Author: Davies Liu <davies@databricks.com>

Closes #5442 from davies/r_docs and squashes the following commits:

7a12ec6 [Davies Liu] remove rdd in R docs
8496b26 [Davies Liu] remove the docs related to RDD
e23b9d6 [Davies Liu] delete R docs for RDD API
222e4ff [Davies Liu] Merge branch 'master' into r_docs
89684ce [Davies Liu] Merge branch 'r_docs' of github.com:davies/spark into r_docs
f0a10e1 [Davies Liu] address comments from @shivaram
f61de71 [Davies Liu] Update pairRDD.R
3ef7cf3 [Davies Liu] use + instead of function(a,b) a+b
2f10a77 [Davies Liu] address comments from @cafreeman
9c2a062 [Davies Liu] mention R api together with Python API
23f751a [Davies Liu] Fill in SparkR examples in programming guide

May 12, 2015

[SPARK-6994][SQL] Update docs for fetching Row fields by name · 640f63b9

vidmantas zemleris authored 10 years ago

add docs for https://issues.apache.org/jira/browse/SPARK-6994

Author: vidmantas zemleris <vidmantas@vinted.com>

Closes #6030 from vidma/docs/row-with-named-fields and squashes the following commits:

241b401 [vidmantas zemleris] [SPARK-6994][SQL] Update docs for fetching Row fields by name

May 11, 2015

[SPARK-7462][SQL] Update documentation for retaining grouping columns in DataFrames. · 3a9b6997

Reynold Xin authored 10 years ago

Author: Reynold Xin <rxin@databricks.com>

Closes #6062 from rxin/agg-retain-doc and squashes the following commits:

43e511e [Reynold Xin] [SPARK-7462][SQL] Update documentation for retaining grouping columns in DataFrames.

[SPARK-7516] [Minor] [DOC] Replace deprecated inferSchema() with createDataFrame() · 8e674331

gchen authored 10 years ago

JIRA: https://issues.apache.org/jira/browse/SPARK-7516

In sql-programming-guide, the deprecated Python DataFrame API inferSchema() should be replaced by createDataFrame():

schemaPeople = sqlContext.inferSchema(people) ->
schemaPeople = sqlContext.createDataFrame(people)

Author: gchen <chenguancheng@gmail.com>

Closes #6041 from gchen/python-docs and squashes the following commits:

c27eb7c [gchen] replace inferSchema() with createDataFrame()

May 07, 2015

[SPARK-7035] Encourage __getitem__ over __getattr__ on column access in the Python DataFrame API · fae4e2d6

ksonj authored 10 years ago

Author: ksonj <kson@siberie.de>

Closes #5971 from ksonj/doc and squashes the following commits:

dadfebb [ksonj] __getitem__ is cleaner than __getattr__
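
The preference for `__getitem__` can be illustrated without Spark. The sketch below uses a toy class (`ToyFrame` is a hypothetical stand-in, not the PySpark `DataFrame`) to show one likely reason: attribute access falls back to `__getattr__` only when normal lookup fails, so a column that shares its name with a method is shadowed, while item access always resolves to the column.

```python
class ToyFrame:
    """Toy stand-in for a DataFrame; NOT the PySpark implementation."""

    def __init__(self, columns):
        # Map of column name -> list of values.
        self._columns = dict(columns)

    def count(self):
        # An ordinary method whose name may collide with a column name.
        return len(next(iter(self._columns.values()), []))

    def __getitem__(self, name):
        # Item access always resolves to a column.
        return self._columns[name]

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, so methods win.
        columns = self.__dict__.get("_columns", {})
        if name in columns:
            return columns[name]
        raise AttributeError(name)


df = ToyFrame({"name": ["a", "b", "c"], "count": [10, 20, 30]})

by_item = df["count"]  # the column values
by_attr = df.count     # the bound method, not the column
```

In this toy, `df.name` still works because no method is called `name`; only colliding names such as `count` behave differently.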

Apr 24, 2015

[SPARK-7136][Docs] Spark SQL and DataFrame Guide fix example file and paths · 59b7cfc4

Deborah Siegel authored 10 years ago

Changes the example file for Generic Load/Save Functions to users.parquet rather than people.parquet, which doesn't exist unless a later example has already been executed. Also adds file paths.

Author: Deborah Siegel <deborah.siegel@gmail.com>
Author: DEBORAH SIEGEL <deborahsiegel@d-140-142-0-49.dhcp4.washington.edu>
Author: DEBORAH SIEGEL <deborahsiegel@DEBORAHs-MacBook-Pro.local>
Author: DEBORAH SIEGEL <deborahsiegel@d-69-91-154-197.dhcp4.washington.edu>

Closes #5693 from d3borah/master and squashes the following commits:

4d5e43b [Deborah Siegel] sparkSQL doc change
b15a497 [Deborah Siegel] Revert "sparkSQL doc change"
5a2863c [DEBORAH SIEGEL] Merge remote-tracking branch 'upstream/master'
91972fc [DEBORAH SIEGEL] sparkSQL doc change
f000e59 [DEBORAH SIEGEL] Merge remote-tracking branch 'upstream/master'
db54173 [DEBORAH SIEGEL] fixed aggregateMessages example in graphX doc

Apr 23, 2015

Update sql-programming-guide.md · 67bccbda

Ken Geis authored 10 years ago

fix typo

Author: Ken Geis <geis.ken@gmail.com>

Closes #5674 from kgeis/patch-1 and squashes the following commits:

5ae67de [Ken Geis] Update sql-programming-guide.md

Apr 18, 2015

SPARK-6992 : Fix documentation example for Spark SQL on StructType · 5f095d56

Olivier Girardot authored 10 years ago


This patch fixes the Java examples for Spark SQL that programmatically
define a schema and map Rows.

Author: Olivier Girardot <o.girardot@lateral-thoughts.com>

Closes #5569 from ogirardot/branch-1.3 and squashes the following commits:

c29e58d [Olivier Girardot] SPARK-6992 : Fix documentation example for Spark SQL on StructType

(cherry picked from commit c9b1ba4b16a7afe93d45bf75b128cc0dd287ded0)
Signed-off-by: Reynold Xin <rxin@databricks.com>

Apr 17, 2015

SPARK-6988 : Fix documentation regarding DataFrames using the Java API · d305e686

Olivier Girardot authored 10 years ago


This patch includes:
 * a description of how to use map after a SQL query using javaRDD
 * fixes for the first few Java examples, which were written in Scala

Thank you for your time,

Olivier.

Author: Olivier Girardot <o.girardot@lateral-thoughts.com>

Closes #5564 from ogirardot/branch-1.3 and squashes the following commits:

9f8d60e [Olivier Girardot] SPARK-6988 : Fix documentation regarding DataFrames using the Java API

(cherry picked from commit 6b528dc139da594ef2e651d84bd91fe0f738a39d)
Signed-off-by: Reynold Xin <rxin@databricks.com>

Apr 15, 2015

[SPARK-6800][SQL] Update doc for JDBCRelation's columnPartition · e3e4e9a3

Liang-Chi Hsieh authored 10 years ago

JIRA https://issues.apache.org/jira/browse/SPARK-6800

Author: Liang-Chi Hsieh <viirya@gmail.com>

Closes #5488 from viirya/fix_jdbc_where and squashes the following commits:

51386c8 [Liang-Chi Hsieh] Update code comment.
1dcc929 [Liang-Chi Hsieh] Update document.
3eb74d6 [Liang-Chi Hsieh] Revert and modify doc.
df11783 [Liang-Chi Hsieh] Merge remote-tracking branch 'upstream/master' into fix_jdbc_where
3e7db15 [Liang-Chi Hsieh] Fix wrong logic to generate WHERE clause for JDBC.
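
For readers unfamiliar with column partitioning, here is a minimal sketch of how per-partition WHERE clauses can be derived from a column, a lower bound, an upper bound, and a partition count. It is an illustration only, not the Spark source: the function name and parameters are hypothetical, and it assumes the first and last ranges are left open-ended so that the bounds only set the stride and do not filter rows.

```python
def column_partition_clauses(column, lower_bound, upper_bound, num_partitions):
    """Return one WHERE clause (or None) per partition.

    Hypothetical helper for illustration; not the Spark implementation.
    """
    if num_partitions <= 1:
        return [None]  # a single partition reads the whole table

    # The bounds are used only to compute the width of each range.
    stride = upper_bound // num_partitions - lower_bound // num_partitions

    clauses = []
    current = lower_bound
    for i in range(num_partitions):
        # The first partition has no lower limit, the last has no upper limit.
        lower = f"{column} >= {current}" if i != 0 else None
        current += stride
        upper = f"{column} < {current}" if i != num_partitions - 1 else None
        if lower and upper:
            clauses.append(f"{lower} AND {upper}")
        else:
            clauses.append(lower or upper)
    return clauses


clauses = column_partition_clauses("id", 0, 100, 4)
for clause in clauses:
    print(clause)
```

With these inputs the middle partitions cover `[25, 50)` and `[50, 75)`, while rows below 0 or above 100 still land in the first or last partition.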

Apr 11, 2015

[SPARK-6863] Fix formatting on SQL programming guide. · 6437e7cc

Santiago M. Mola authored 10 years ago

https://issues.apache.org/jira/browse/SPARK-6863

Author: Santiago M. Mola <santiago.mola@sap.com>

Closes #5472 from smola/fix/sql-docs and squashes the following commits:

42503d4 [Santiago M. Mola] [SPARK-6863] Fix formatting on SQL programming guide.

Apr 08, 2015

[SPARK-6781] [SQL] use sqlContext in python shell · 6ada4f6f

Davies Liu authored 10 years ago

Use `sqlContext` in the PySpark shell to make it consistent with the SQL programming guide. `sqlCtx` is also kept for compatibility.

Author: Davies Liu <davies@databricks.com>

Closes #5425 from davies/sqlCtx and squashes the following commits:

af67340 [Davies Liu] sqlCtx -> sqlContext
15a278f [Davies Liu] use sqlContext in python shell

Mar 26, 2015

[DOCS][SQL] Fix JDBC example · aad00322

Michael Armbrust authored 10 years ago

Author: Michael Armbrust <michael@databricks.com>

Closes #5192 from marmbrus/fixJDBCDocs and squashes the following commits:

b48a33d [Michael Armbrust] [DOCS][SQL] Fix JDBC example

Mar 25, 2015

[DOCUMENTATION]Fixed Missing Type Import in Documentation · c5cc4146

Bill Chambers authored 10 years ago

Needed to import the types specifically, not the more general pyspark.sql

Author: Bill Chambers <wchambers@ischool.berkeley.edu>
Author: anabranch <wac.chambers@gmail.com>

Closes #5179 from anabranch/master and squashes the following commits:

8fa67bf [anabranch] Corrected SqlContext Import
603b080 [Bill Chambers] [DOCUMENTATION]Fixed Missing Type Import in Documentation

Mar 22, 2015

[SPARK-6337][Documentation, SQL]Spark 1.3 doc fixes · 2bf40c58

vinodkc authored 10 years ago

Author: vinodkc <vinod.kc.in@gmail.com>

Closes #5112 from vinodkc/spark_1.3_doc_fixes and squashes the following commits:

2c6aee6 [vinodkc] Spark 1.3 doc fixes

SPARK-6454 [DOCS] Fix links to pyspark api · 6ef48632

Kamil Smuga authored 10 years ago

Author: Kamil Smuga <smugakamil@gmail.com>
Author: stderr <smugakamil@gmail.com>

Closes #5120 from kamilsmuga/master and squashes the following commits:

fee3281 [Kamil Smuga] more python api links fixed for docs
13240cb [Kamil Smuga] resolved merge conflicts with upstream/master
6649b3b [Kamil Smuga] fix broken docs links to Python API
92f03d7 [stderr] Fix links to pyspark api

Mar 17, 2015

[SPARK-6383][SQL]Fixed compiler and errors in Dataframe examples · a012e086

Tijo Thomas authored 10 years ago

Author: Tijo Thomas <tijoparacka@gmail.com>

Closes #5068 from tijoparacka/fix_sql_dataframe_example and squashes the following commits:

6953ac1 [Tijo Thomas] Handled Java and Python example sections
0751a74 [Tijo Thomas] Fixed compiler and errors in Dataframe examples

Mar 13, 2015

[SPARK-5310] [SQL] [DOC] Parquet section for the SQL programming guide · 69ff8e8c

Cheng Lian authored 10 years ago

Also fixed a bunch of minor styling issues.

Author: Cheng Lian <lian@databricks.com>

Closes #5001 from liancheng/parquet-doc and squashes the following commits:

89ad3db [Cheng Lian] Addresses @rxin's comments
7eb6955 [Cheng Lian] Docs for the new Parquet data source
415eefb [Cheng Lian] Some minor formatting improvements

Mar 12, 2015

[SPARK-6275][Documentation]Miss toDF() function in docs/sql-programming-guide.md · 304366c4

zzcclp authored 10 years ago

The `toDF()` function is missing in docs/sql-programming-guide.md

Author: zzcclp <xm_zzc@sina.com>

Closes #4977 from zzcclp/SPARK-6275 and squashes the following commits:

9a96c7b [zzcclp] Miss toDF()

Mar 10, 2015

[SPARK-5183][SQL] Update SQL Docs with JDBC and Migration Guide · 26723741

Michael Armbrust authored 10 years ago

Author: Michael Armbrust <michael@databricks.com>

Closes #4958 from marmbrus/sqlDocs and squashes the following commits:

9351dbc [Michael Armbrust] fix parquet example
6877e13 [Michael Armbrust] add sql examples
d81b7e7 [Michael Armbrust] rxins comments
e393528 [Michael Armbrust] fix order
19c2735 [Michael Armbrust] more on data source load/store
00d5914 [Michael Armbrust] Update SQL Docs with JDBC and Migration Guide

Mar 09, 2015

[SPARK-5310][Doc] Update SQL Programming Guide to include DataFrames. · 3cac1991

Reynold Xin authored 10 years ago

Author: Reynold Xin <rxin@databricks.com>

Closes #4954 from rxin/df-docs and squashes the following commits:

c592c70 [Reynold Xin] [SPARK-5310][Doc] Update SQL Programming Guide to include DataFrames.

Feb 17, 2015

[Minor] fix typo in SQL document · 31efb39c

CodingCat authored 10 years ago

Author: CodingCat <zhunansjtu@gmail.com>

Closes #4656 from CodingCat/fix_typo and squashes the following commits:

b41d15c [CodingCat] recover
689fe46 [CodingCat] fix typo

Feb 12, 2015

[SQL][DOCS] Update sql documentation · 6a1be026

Antonio Navarro Perez authored 10 years ago

Updated examples using the new api and added DataFrame concept

Author: Antonio Navarro Perez <ajnavarro@users.noreply.github.com>

Closes #4560 from ajnavarro/ajnavarro-doc-sql-update and squashes the following commits:

82ebcf3 [Antonio Navarro Perez] Changed a missing JavaSQLContext to SQLContext.
8d5376a [Antonio Navarro Perez] fixed typo
8196b6b [Antonio Navarro Perez] [SQL][DOCS] Update sql documentation

Feb 10, 2015

[SPARK-5704] [SQL] [PySpark] createDataFrame from RDD with columns · ea602840

Davies Liu authored 10 years ago

Deprecate inferSchema() and applySchema() in favor of createDataFrame(), which can take an optional `schema` to create a DataFrame from an RDD. The `schema` can be a StructType or a list of column names.

Author: Davies Liu <davies@databricks.com>

Closes #4498 from davies/create and squashes the following commits:

08469c1 [Davies Liu] remove Scala/Java API for now
c80a7a9 [Davies Liu] fix hive test
d1bd8f2 [Davies Liu] cleanup applySchema
9526e97 [Davies Liu] createDataFrame from RDD with columns

Feb 05, 2015

[Branch-1.3] [DOC] doc fix for date · 6fa4ac1b

Daoyuan Wang authored 10 years ago

Trivial fix.

Author: Daoyuan Wang <daoyuan.wang@intel.com>

Closes #4400 from adrian-wang/docdate and squashes the following commits:

31bbe40 [Daoyuan Wang] doc fix for date

[SPARK-5608] Improve SEO of Spark documentation pages · 4d74f060

Matei Zaharia authored 10 years ago

- Add meta description tags on some of the most important doc pages
- Shorten the titles of some pages to have more relevant keywords; for
  example there's no reason to have "Spark SQL Programming Guide - Spark
  1.2.0 documentation", we can just say "Spark SQL - Spark 1.2.0
  documentation".

Author: Matei Zaharia <matei@databricks.com>

Closes #4381 from mateiz/docs-seo and squashes the following commits:

4940563 [Matei Zaharia] [SPARK-5608] Improve SEO of Spark documentation pages

Feb 03, 2015

[SPARK-4987] [SQL] parquet timestamp type support · 0c20ce69

Daoyuan Wang authored 10 years ago

Author: Daoyuan Wang <daoyuan.wang@intel.com>

Closes #3820 from adrian-wang/parquettimestamp and squashes the following commits:

b1e2a0d [Daoyuan Wang] fix for nanos
4dadef1 [Daoyuan Wang] fix wrong read
93f438d [Daoyuan Wang] parquet timestamp support

Jan 18, 2015

[SQL][Minor] Update sql doc according to data type APIs changes · 1a200a3e

scwf authored 10 years ago

Follow up of #3925
/cc rxin

Author: scwf <wangfei1@huawei.com>

Closes #4095 from scwf/sql-doc and squashes the following commits:

97e311b [scwf] update sql doc since now expose only one version of the data type APIs

Dec 30, 2014

[SPARK-4930][SQL][DOCS]Update SQL programming guide, CACHE TABLE is eager · 2deac748

luogankun authored 10 years ago

`CACHE TABLE tbl` is now __eager__ by default, not __lazy__.

Author: luogankun <luogankun@gmail.com>

Closes #3773 from luogankun/SPARK-4930 and squashes the following commits:

cc17b7d [luogankun] [SPARK-4930][SQL][DOCS]Update SQL programming guide, add CACHE [LAZY] TABLE [AS SELECT] ...
bffe0e8 [luogankun] [SPARK-4930][SQL][DOCS]Update SQL programming guide, CACHE TABLE tbl is eager
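
The eager/lazy distinction can be sketched with a toy cache in plain Python. Everything here is hypothetical and only mimics the behavior described above (`ToyCache`, `cache_table`, and `load_rows` are illustrative names, not Spark APIs): an eager cache does the scan when the table is registered, while a lazy one defers it until the first read.

```python
class ToyCache:
    """Toy table cache; names and behavior are illustrative, not Spark's."""

    def __init__(self):
        self._loaders = {}
        self._data = {}

    def cache_table(self, name, loader, lazy=False):
        # Eager (the default here) materializes right away;
        # lazy defers the work until the first read.
        self._loaders[name] = loader
        if not lazy:
            self._data[name] = loader()

    def read(self, name):
        # Materialize on first access if that has not happened yet.
        if name not in self._data:
            self._data[name] = self._loaders[name]()
        return self._data[name]


calls = []


def load_rows():
    # Record every scan so the timing of the work is visible.
    calls.append("scan")
    return [1, 2, 3]


cache = ToyCache()
cache.cache_table("eager_tbl", load_rows)            # scans immediately
scans_after_eager = len(calls)
cache.cache_table("lazy_tbl", load_rows, lazy=True)  # no scan yet
scans_after_lazy = len(calls)
rows = cache.read("lazy_tbl")                        # first read triggers the scan
scans_after_read = len(calls)
```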

[SPARK-4916][SQL][DOCS]Update SQL programming guide about cache section · f7a41a0e

luogankun authored 10 years ago

`SchemaRDD.cache()` now uses in-memory columnar storage.

Author: luogankun <luogankun@gmail.com>

Closes #3759 from luogankun/SPARK-4916 and squashes the following commits:

7b39864 [luogankun] [SPARK-4916]Update SQL programming guide
6018122 [luogankun] Merge branch 'master' of https://github.com/apache/spark into SPARK-4916
0b93785 [luogankun] [SPARK-4916]Update SQL programming guide
99b2336 [luogankun] [SPARK-4916]Update SQL programming guide

Dec 16, 2014

[DOCS][SQL] Add a Note on jsonFile having separate JSON objects per line · 1a9e35e5

Peter Vandenabeele authored 10 years ago

* This commit aims to avoid the confusion I faced when trying
  to submit a regular, valid multi-line JSON file; see also

  http://apache-spark-user-list.1001560.n3.nabble.com/Loading-JSON-Dataset-fails-with-com-fasterxml-jackson-databind-JsonMappingException-td20041.html

Author: Peter Vandenabeele <peter@vandenabeele.com>

Closes #3517 from petervandenabeele/pv-docs-note-on-jsonFile-format/01 and squashes the following commits:

1f98e52 [Peter Vandenabeele] Revert to people.json and simple Note text
6b6e062 [Peter Vandenabeele] Change the "JSON" connotation to "txt"
fca7dfb [Peter Vandenabeele] Add a Note on jsonFile having separate JSON objects per line
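
The difference between the two layouts can be shown with Python's standard `json` module. This is only a sketch of per-line parsing, not Spark's `jsonFile` reader; the sample records and the `parse_per_line` helper are made up for illustration.

```python
import json

# One complete JSON object per line, the layout the note describes.
line_delimited = '{"name": "Michael"}\n{"name": "Andy", "age": 30}\n'

# A single valid JSON document that is pretty-printed over several lines.
multi_line = '{\n  "name": "Andy",\n  "age": 30\n}\n'


def parse_per_line(text):
    """Parse each non-empty line as its own JSON object."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


records = parse_per_line(line_delimited)

try:
    parse_per_line(multi_line)
    multi_line_ok = True
except json.JSONDecodeError:
    # The first line is just "{", which is not a complete JSON value.
    multi_line_ok = False
```

The multi-line text is valid JSON as a whole (`json.loads(multi_line)` succeeds), yet it fails as soon as each line is treated as a separate record.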

[SQL] SPARK-4700: Add HTTP protocol spark thrift server · 17688d14

Judy Nash authored 10 years ago

Add HTTP protocol support and test cases to the Spark Thrift server, so users can deploy the Thrift server in both TCP and HTTP mode.

Author: Judy Nash <judynash@microsoft.com>
Author: judynash <judynash@microsoft.com>

Closes #3672 from judynash/master and squashes the following commits:

526315d [Judy Nash] correct spacing on startThriftServer method
31a6520 [Judy Nash] fix code style issues and update sql programming guide format issue
47bf87e [Judy Nash] modify withJdbcStatement method definition to meet less than 100 line length
2e9c11c [Judy Nash] add thrift server in http mode documentation on sql programming guide
1cbd305 [Judy Nash] Merge remote-tracking branch 'upstream/master'
2b1d312 [Judy Nash] updated http thrift server support based on feedback
377532c [judynash] add HTTP protocol spark thrift server

Dec 04, 2014

Fix typo in Spark SQL docs. · 15cf3b01

Andy Konwinski authored 10 years ago

Author: Andy Konwinski <andykonwinski@gmail.com>

Closes #3611 from andyk/patch-3 and squashes the following commits:

7bab333 [Andy Konwinski] Fix typo in Spark SQL docs.

Dec 01, 2014

[SQL][DOC] Date type in SQL programming guide · 5edbcbfb

Daoyuan Wang authored 10 years ago

Author: Daoyuan Wang <daoyuan.wang@intel.com>

Closes #3535 from adrian-wang/datedoc and squashes the following commits:

18ff1ed [Daoyuan Wang] [DOC] Date type

[SQL] Minor fix for doc and comment · 7b799578

wangfei authored 10 years ago

Author: wangfei <wangfei1@huawei.com>

Closes #3533 from scwf/sql-doc1 and squashes the following commits:

962910b [wangfei] doc and comment fix
