Commit 2182e432 authored by Nicholas Chammas, committed by Reynold Xin

[SPARK-16772][PYTHON][DOCS] Restore "datatype string" to Python API docstrings

## What changes were proposed in this pull request?

This PR corrects [an error made in an earlier PR](https://github.com/apache/spark/pull/14393/files#r72843069).

## How was this patch tested?

```sh
$ ./dev/lint-python
PEP8 checks passed.
rm -rf _build/*
pydoc checks passed.
```

I also built the docs and confirmed that they looked good in my browser.

Author: Nicholas Chammas <nicholas.chammas@gmail.com>

Closes #14408 from nchammas/SPARK-16772.
parent 2c15323a
```diff
@@ -226,9 +226,8 @@ class SQLContext(object):
         from ``data``, which should be an RDD of :class:`Row`,
         or :class:`namedtuple`, or :class:`dict`.
-        When ``schema`` is :class:`pyspark.sql.types.DataType` or
-        :class:`pyspark.sql.types.StringType`, it must match the
-        real data, or an exception will be thrown at runtime. If the given schema is not
+        When ``schema`` is :class:`pyspark.sql.types.DataType` or a datatype string it must match
+        the real data, or an exception will be thrown at runtime. If the given schema is not
         :class:`pyspark.sql.types.StructType`, it will be wrapped into a
         :class:`pyspark.sql.types.StructType` as its only field, and the field name will be "value",
         each record will also be wrapped into a tuple, which can be converted to row later.
@@ -239,8 +238,7 @@ class SQLContext(object):
         :param data: an RDD of any kind of SQL data representation(e.g. :class:`Row`,
             :class:`tuple`, ``int``, ``boolean``, etc.), or :class:`list`, or
             :class:`pandas.DataFrame`.
-        :param schema: a :class:`pyspark.sql.types.DataType` or a
-            :class:`pyspark.sql.types.StringType` or a list of
+        :param schema: a :class:`pyspark.sql.types.DataType` or a datatype string or a list of
             column names, default is None. The data type string format equals to
             :class:`pyspark.sql.types.DataType.simpleString`, except that top level struct type can
             omit the ``struct<>`` and atomic types use ``typeName()`` as their format, e.g. use
@@ -251,7 +249,7 @@ class SQLContext(object):
         .. versionchanged:: 2.0
            The ``schema`` parameter can be a :class:`pyspark.sql.types.DataType` or a
-           :class:`pyspark.sql.types.StringType` after 2.0.
+           datatype string after 2.0.
            If it's not a :class:`pyspark.sql.types.StructType`, it will be wrapped into a
            :class:`pyspark.sql.types.StructType` and each record will also be wrapped into a tuple.
```
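For context (not part of the commit): a minimal sketch of the "datatype string" form these docstrings describe, using the ``simpleString`` format with the top-level ``struct<>`` omitted. The local session setup is illustrative only.

```python
from pyspark.sql import SparkSession

# Illustrative local session; any existing SparkSession works the same way.
spark = SparkSession.builder.master("local[1]").appName("datatype-string-demo").getOrCreate()

# The schema here is a datatype string in simpleString format: the top-level
# struct<> is omitted, so "a: int, b: string" stands for struct<a:int,b:string>.
df = spark.createDataFrame([(1, "x"), (2, "y")], "a: int, b: string")
df.printSchema()
# root
#  |-- a: integer (nullable = true)
#  |-- b: string (nullable = true)
```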
```diff
@@ -414,9 +414,8 @@ class SparkSession(object):
         from ``data``, which should be an RDD of :class:`Row`,
         or :class:`namedtuple`, or :class:`dict`.
-        When ``schema`` is :class:`pyspark.sql.types.DataType` or
-        :class:`pyspark.sql.types.StringType`, it must match the
-        real data, or an exception will be thrown at runtime. If the given schema is not
+        When ``schema`` is :class:`pyspark.sql.types.DataType` or a datatype string, it must match
+        the real data, or an exception will be thrown at runtime. If the given schema is not
         :class:`pyspark.sql.types.StructType`, it will be wrapped into a
         :class:`pyspark.sql.types.StructType` as its only field, and the field name will be "value",
         each record will also be wrapped into a tuple, which can be converted to row later.
@@ -426,8 +425,7 @@ class SparkSession(object):
         :param data: an RDD of any kind of SQL data representation(e.g. row, tuple, int, boolean,
             etc.), or :class:`list`, or :class:`pandas.DataFrame`.
-        :param schema: a :class:`pyspark.sql.types.DataType` or a
-            :class:`pyspark.sql.types.StringType` or a list of
+        :param schema: a :class:`pyspark.sql.types.DataType` or a datatype string or a list of
             column names, default is ``None``. The data type string format equals to
             :class:`pyspark.sql.types.DataType.simpleString`, except that top level struct type can
             omit the ``struct<>`` and atomic types use ``typeName()`` as their format, e.g. use
@@ -438,7 +436,7 @@ class SparkSession(object):
         .. versionchanged:: 2.0
            The ``schema`` parameter can be a :class:`pyspark.sql.types.DataType` or a
-           :class:`pyspark.sql.types.StringType` after 2.0. If it's not a
+           datatype string after 2.0. If it's not a
            :class:`pyspark.sql.types.StructType`, it will be wrapped into a
            :class:`pyspark.sql.types.StructType` and each record will also be wrapped into a tuple.
```
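And a companion sketch of the wrapping behavior the docstrings mention, reusing the illustrative ``spark`` session from the previous snippet: when ``schema`` is an atomic :class:`pyspark.sql.types.DataType` rather than a :class:`StructType`, each record is wrapped into a single-field row named "value".

```python
from pyspark.sql.types import IntegerType

# An atomic DataType (equivalently the datatype string "int") is wrapped into
# a StructType with one field named "value"; each record becomes a one-tuple.
df = spark.createDataFrame([1, 2, 3], IntegerType())
df.printSchema()
# root
#  |-- value: integer (nullable = true)
```

If the records do not match the declared type (say, strings passed with ``IntegerType()``), ``createDataFrame`` raises an error at runtime, which is the "must match the real data" caveat in the docstring.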