    [SPARK-21485][SQL][DOCS] Spark SQL documentation generation for built-in functions · 60472dbf
    hyukjinkwon authored
    ## What changes were proposed in this pull request?
    
    This generates documentation for Spark SQL built-in functions.
    
    One drawback is that this requires a proper build to generate the built-in function list.
    Once Spark is built, running `sql/create-docs.sh` takes only a few seconds.
    
    Please see https://spark-test.github.io/sparksqldoc/ that I hosted to show the output documentation.
    
    A few more things remain to be done to make the documentation pretty, for example, separating `Arguments:` and `Examples:`, but I think this should be done within `ExpressionDescription` and `ExpressionInfo` rather than by manually parsing the text. I will fix these in a follow-up.
    
    This requires `pip install mkdocs` to generate HTML from the markdown files.
    
    ## How was this patch tested?
    
    Manually tested:
    
    ```
    cd docs
    jekyll build
    ```
    ,
    
    ```
    cd docs
    jekyll serve
    ```
    
    and
    
    ```
    cd sql
    ./create-docs.sh
    ```
    
    Author: hyukjinkwon <gurwls223@gmail.com>
    
    Closes #18702 from HyukjinKwon/SPARK-21485.
README.md 1.13 KiB

Spark SQL

This module provides support for executing relational queries expressed in either SQL or the DataFrame/Dataset API.

Spark SQL is broken up into four subprojects:

  • Catalyst (sql/catalyst) - An implementation-agnostic framework for manipulating trees of relational operators and expressions.
  • Execution (sql/core) - A query planner / execution engine for translating Catalyst's logical query plans into Spark RDDs. This component also includes a new public interface, SQLContext, that allows users to execute SQL or LINQ statements against existing RDDs and Parquet files.
  • Hive Support (sql/hive) - Includes an extension of SQLContext called HiveContext that allows users to write queries using a subset of HiveQL and access data from a Hive Metastore using Hive SerDes. There are also wrappers that allow users to run queries that include Hive UDFs, UDAFs, and UDTFs.
  • HiveServer and CLI support (sql/hive-thriftserver) - Includes support for the SQL CLI (bin/spark-sql) and a HiveServer2 (for JDBC/ODBC) compatible server.

Running sql/create-docs.sh generates SQL documentation for built-in functions under sql/site.
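
As a rough illustration of what such a generation step involves, the sketch below turns a list of function descriptions into one markdown page that a tool like mkdocs could then convert to HTML. It is not the script that sql/create-docs.sh runs: the `FunctionInfo` record, the `render_markdown` helper, and the sample entries are hypothetical, and a real generator would read the function list from a built Spark rather than from hard-coded data. The only Spark convention assumed here is the `_FUNC_` placeholder used in expression usage strings.

```python
from collections import namedtuple

# Hypothetical record for illustration only; a real generator would obtain
# this information from a built Spark, not from hard-coded values.
FunctionInfo = namedtuple("FunctionInfo", ["name", "usage", "extended"])


def render_markdown(infos):
    """Render one '###' section per function, sorted by function name."""
    sections = []
    for info in sorted(infos, key=lambda i: i.name):
        # Usage strings use _FUNC_ as a placeholder for the function name.
        usage = info.usage.replace("_FUNC_", info.name).strip()
        parts = ["### " + info.name, "", usage]
        extended = info.extended.replace("_FUNC_", info.name).strip()
        if extended:
            # Extended text (arguments, examples) is appended as-is.
            parts += ["", extended]
        sections.append("\n".join(parts))
    return "\n\n".join(sections) + "\n"


if __name__ == "__main__":
    # Made-up sample entries, written in the style of Spark usage strings.
    sample = [
        FunctionInfo("upper", "_FUNC_(str) - Returns `str` in uppercase.", ""),
        FunctionInfo("abs", "_FUNC_(expr) - Returns the absolute value.",
                     "Examples:\n  > SELECT _FUNC_(-1);\n   1"),
    ]
    print(render_markdown(sample))
```

Keeping the rendering in a pure function that returns a string makes it easy to check the output before writing any files under a site directory.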