    [SPARK-5752][SQL] Don't implicitly convert RDDs directly to DataFrames · e98dfe62
    Reynold Xin authored
    - The old implicit converted RDDs directly to DataFrames, which added too many methods to RDDs.
    - toDataFrame -> toDF
    - Dsl -> functions
    - implicits moved into SQLContext.implicits
    - addColumn -> withColumn
    - renameColumn -> withColumnRenamed
    
    Python changes:
    - toDataFrame -> toDF
    - Dsl -> functions package
    - addColumn -> withColumn
    - renameColumn -> withColumnRenamed
    - add toDF functions to RDD on SQLContext init
    - add flatMap to DataFrame
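
The Python item "add toDF functions to RDD on SQLContext init" can be pictured with the minimal sketch below. It is not the PySpark source: `RDD`, `DataFrame`, and `SQLContext` here are toy stand-in classes written only for illustration, and the method bodies are assumptions. The point is the pattern, in which the context's constructor attaches a `toDF` method to the RDD class, so conversion is an explicit call that only exists once a context has been created.

```python
class RDD:
    """Toy stand-in for an RDD (hypothetical): just wraps a list of rows."""

    def __init__(self, rows):
        self.rows = list(rows)


class DataFrame:
    """Toy stand-in for a DataFrame (hypothetical): rows plus column names."""

    def __init__(self, rows, columns):
        self.rows = list(rows)
        self.columns = list(columns)

    def withColumnRenamed(self, existing, new):
        # Return a new DataFrame; the original is left unchanged.
        cols = [new if c == existing else c for c in self.columns]
        return DataFrame(self.rows, cols)


class SQLContext:
    """Toy context whose constructor attaches toDF to the RDD class."""

    def __init__(self):
        ctx = self

        def toDF(rdd, columns):
            # Explicit conversion: delegate to the context that installed it.
            return ctx.createDataFrame(rdd, columns)

        # Patch the class, so every RDD instance gains toDF from now on.
        RDD.toDF = toDF

    def createDataFrame(self, rdd, columns):
        return DataFrame(rdd.rows, columns)


rdd = RDD([(1, "a"), (2, "b")])
assert not hasattr(RDD, "toDF")   # no conversion method before a context exists

sqlContext = SQLContext()         # constructing the context installs RDD.toDF
df = rdd.toDF(["id", "label"])    # explicit call, nothing happens implicitly
renamed = df.withColumnRenamed("label", "name")
```

In this sketch the RDD class gains exactly one extra method, which mirrors the motivation stated at the top of the message: the conversion is opt-in rather than exposing the whole DataFrame API on every RDD.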
    
    Author: Reynold Xin <rxin@databricks.com>
    Author: Davies Liu <davies@databricks.com>
    
    Closes #4556 from rxin/SPARK-5752 and squashes the following commits:
    
    5ef9910 [Reynold Xin] More fix
    61d3fca [Reynold Xin] Merge branch 'df5' of github.com:davies/spark into SPARK-5752
    ff5832c [Reynold Xin] Fix python
    749c675 [Reynold Xin] count(*) fixes.
    5806df0 [Reynold Xin] Fix build break again.
    d941f3d [Reynold Xin] Fixed explode compilation break.
    fe1267a [Davies Liu] flatMap
    c4afb8e [Reynold Xin] style
    d9de47f [Davies Liu] add comment
    b783994 [Davies Liu] add comment for toDF
    e2154e5 [Davies Liu] schema() -> schema
    3a1004f [Davies Liu] Dsl -> functions, toDF()
    fb256af [Reynold Xin] - toDataFrame -> toDF - Dsl -> functions - implicits moved into SQLContext.implicits - addColumn -> withColumn - renameColumn -> withColumnRenamed
    0dd74eb [Reynold Xin] [SPARK-5752][SQL] Don't implicitly convert RDDs directly to DataFrames
    97dd47c [Davies Liu] fix mistake
    6168f74 [Davies Liu] fix test
    1fc0199 [Davies Liu] fix test
    a075cd5 [Davies Liu] clean up, toPandas
    663d314 [Davies Liu] add test for agg('*')
    9e214d5 [Reynold Xin] count(*) fixes.
    1ed7136 [Reynold Xin] Fix build break again.
    921b2e3 [Reynold Xin] Fixed explode compilation break.
    14698d4 [Davies Liu] flatMap
    ba3e12d [Reynold Xin] style
    d08c92d [Davies Liu] add comment
    5c8b524 [Davies Liu] add comment for toDF
    a4e5e66 [Davies Liu] schema() -> schema
    d377fc9 [Davies Liu] Dsl -> functions, toDF()
    6b3086c [Reynold Xin] - toDataFrame -> toDF - Dsl -> functions - implicits moved into SQLContext.implicits - addColumn -> withColumn - renameColumn -> withColumnRenamed
    807e8b1 [Reynold Xin] [SPARK-5752][SQL] Don't implicitly convert RDDs directly to DataFrames