Commit 2182e432 authored by Nicholas Chammas, committed by Reynold Xin

[SPARK-16772][PYTHON][DOCS] Restore "datatype string" to Python API docstrings

## What changes were proposed in this pull request?

This PR corrects [an error made in an earlier PR](https://github.com/apache/spark/pull/14393/files#r72843069).

## How was this patch tested?

```sh
$ ./dev/lint-python
PEP8 checks passed.
rm -rf _build/*
pydoc checks passed.
```

I also built the docs and confirmed that they looked good in my browser.

Author: Nicholas Chammas <nicholas.chammas@gmail.com>

Closes #14408 from nchammas/SPARK-16772.
parent 2c15323a
```diff
--- a/python/pyspark/sql/context.py
+++ b/python/pyspark/sql/context.py
@@ -226,9 +226,8 @@ class SQLContext(object):
         from ``data``, which should be an RDD of :class:`Row`,
         or :class:`namedtuple`, or :class:`dict`.
 
-        When ``schema`` is :class:`pyspark.sql.types.DataType` or
-        :class:`pyspark.sql.types.StringType`, it must match the
-        real data, or an exception will be thrown at runtime. If the given schema is not
+        When ``schema`` is :class:`pyspark.sql.types.DataType` or a datatype string it must match
+        the real data, or an exception will be thrown at runtime. If the given schema is not
         :class:`pyspark.sql.types.StructType`, it will be wrapped into a
         :class:`pyspark.sql.types.StructType` as its only field, and the field name will be "value",
         each record will also be wrapped into a tuple, which can be converted to row later.
@@ -239,8 +238,7 @@ class SQLContext(object):
         :param data: an RDD of any kind of SQL data representation(e.g. :class:`Row`,
             :class:`tuple`, ``int``, ``boolean``, etc.), or :class:`list`, or
             :class:`pandas.DataFrame`.
-        :param schema: a :class:`pyspark.sql.types.DataType` or a
-            :class:`pyspark.sql.types.StringType` or a list of
+        :param schema: a :class:`pyspark.sql.types.DataType` or a datatype string or a list of
             column names, default is None. The data type string format equals to
             :class:`pyspark.sql.types.DataType.simpleString`, except that top level struct type can
             omit the ``struct<>`` and atomic types use ``typeName()`` as their format, e.g. use
@@ -251,7 +249,7 @@ class SQLContext(object):
 
         .. versionchanged:: 2.0
            The ``schema`` parameter can be a :class:`pyspark.sql.types.DataType` or a
-           :class:`pyspark.sql.types.StringType` after 2.0.
+           datatype string after 2.0.
            If it's not a :class:`pyspark.sql.types.StructType`, it will be wrapped into a
            :class:`pyspark.sql.types.StructType` and each record will also be wrapped into a tuple.
 
```
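For readers unfamiliar with the wording being restored here, the following is a minimal, hypothetical sketch (not part of this patch) of what a "datatype string" schema looks like when calling `createDataFrame`: it uses the `DataType.simpleString()` format with the top-level `struct<>` omitted, and it must match the real data.

```python
# Hypothetical usage sketch (not part of this patch): passing a datatype
# string as the ``schema`` argument of createDataFrame().
from pyspark import SparkContext
from pyspark.sql import SQLContext

sc = SparkContext.getOrCreate()
sqlContext = SQLContext(sc)

# "name: string, age: int" follows the DataType.simpleString() format with
# the top-level struct<> omitted; it must match the real data, otherwise an
# exception is raised at runtime.
df = sqlContext.createDataFrame([("Alice", 1)], "name: string, age: int")
print(df.collect())  # [Row(name=u'Alice', age=1)]
```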
```diff
--- a/python/pyspark/sql/session.py
+++ b/python/pyspark/sql/session.py
@@ -414,9 +414,8 @@ class SparkSession(object):
         from ``data``, which should be an RDD of :class:`Row`,
         or :class:`namedtuple`, or :class:`dict`.
 
-        When ``schema`` is :class:`pyspark.sql.types.DataType` or
-        :class:`pyspark.sql.types.StringType`, it must match the
-        real data, or an exception will be thrown at runtime. If the given schema is not
+        When ``schema`` is :class:`pyspark.sql.types.DataType` or a datatype string, it must match
+        the real data, or an exception will be thrown at runtime. If the given schema is not
         :class:`pyspark.sql.types.StructType`, it will be wrapped into a
         :class:`pyspark.sql.types.StructType` as its only field, and the field name will be "value",
         each record will also be wrapped into a tuple, which can be converted to row later.
@@ -426,8 +425,7 @@ class SparkSession(object):
 
         :param data: an RDD of any kind of SQL data representation(e.g. row, tuple, int, boolean,
             etc.), or :class:`list`, or :class:`pandas.DataFrame`.
-        :param schema: a :class:`pyspark.sql.types.DataType` or a
-            :class:`pyspark.sql.types.StringType` or a list of
+        :param schema: a :class:`pyspark.sql.types.DataType` or a datatype string or a list of
             column names, default is ``None``. The data type string format equals to
             :class:`pyspark.sql.types.DataType.simpleString`, except that top level struct type can
             omit the ``struct<>`` and atomic types use ``typeName()`` as their format, e.g. use
@@ -438,7 +436,7 @@ class SparkSession(object):
 
         .. versionchanged:: 2.0
            The ``schema`` parameter can be a :class:`pyspark.sql.types.DataType` or a
-           :class:`pyspark.sql.types.StringType` after 2.0. If it's not a
+           datatype string after 2.0. If it's not a
            :class:`pyspark.sql.types.StructType`, it will be wrapped into a
            :class:`pyspark.sql.types.StructType` and each record will also be wrapped into a tuple.
 
```
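Both docstrings also describe what happens when the schema is not a `StructType`: it is wrapped into a `StructType` with a single field named "value", and each record is wrapped into a tuple. A small hypothetical sketch of that behavior (again, not part of this patch):

```python
# Hypothetical usage sketch (not part of this patch): a non-struct schema
# such as the datatype string "int" is wrapped into a StructType with a
# single field named "value".
from pyspark.sql import SparkSession
from pyspark.sql.types import IntegerType

spark = SparkSession.builder.master("local[1]").appName("schema-demo").getOrCreate()

df = spark.createDataFrame([1, 2, 3], "int")
print(df.collect())  # [Row(value=1), Row(value=2), Row(value=3)]

# The equivalent pyspark.sql.types.DataType object behaves the same way.
df2 = spark.createDataFrame([1, 2, 3], IntegerType())
print(df2.schema.names)  # ['value']
```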