
Welcome to Spark Python API Docs!

Contents:

Core classes:

:class:`pyspark.SparkContext`

Main entry point for Spark functionality.

:class:`pyspark.RDD`

A Resilient Distributed Dataset (RDD), the basic abstraction in Spark.

:class:`pyspark.sql.SQLContext`

Main entry point for DataFrame and SQL functionality.

:class:`pyspark.sql.DataFrame`

A distributed collection of data grouped into named columns.

Indices and tables

* :ref:`search`