    [SPARK-6119][SQL] DataFrame support for missing data handling · b8ff2bc6
    Reynold Xin authored
    This pull request adds variants of DataFrame.na.drop and DataFrame.na.fill to the Scala/Java API, and DataFrame.fillna and DataFrame.dropna to the Python API.
    
    Author: Reynold Xin <rxin@databricks.com>
    
    Closes #5274 from rxin/df-missing-value and squashes the following commits:
    
    4ee1b98 [Reynold Xin] Improve error reporting in Python.
    33a330c [Reynold Xin] Remove replace for now.
    bc4fdbb [Reynold Xin] Added documentation for replace.
    d56f5a5 [Reynold Xin] Added replace for Scala/Java.
    2385d00 [Reynold Xin] Feedback from Xiangrui on "how".
    914a374 [Reynold Xin] fill with map.
    185c67e [Reynold Xin] Allow specifying column subsets in fill.
    749eb47 [Reynold Xin] fillna
    249b94e [Reynold Xin] Removing undefined functions.
    6a73c68 [Reynold Xin] Missing file.
    67d7003 [Reynold Xin] [SPARK-6119][SQL] DataFrame.na.drop (Scala/Java) and DataFrame.dropna (Python)
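
The drop and fill behavior named in this commit message can be illustrated with a small plain-Python model. This is a sketch of the semantics only, not Spark code: rows are dicts, `None` stands in for null, and the helper names simply mirror the Python API names `dropna` and `fillna`. The parameters `how`, `thresh`, and `subset` follow the PySpark signatures `DataFrame.dropna(how, thresh, subset)` and `DataFrame.fillna(value, subset)`. Spark's type-based rules (for example, skipping columns whose type does not match the fill value) are not modeled here.

```python
# Plain-Python model of the semantics only; this is not Spark code.
# Rows are dicts, and None stands in for a null value.

def dropna(rows, how="any", thresh=None, subset=None):
    """Keep rows that have enough non-null values in the considered columns."""
    kept = []
    for row in rows:
        cols = list(row) if subset is None else subset
        non_null = sum(1 for c in cols if row.get(c) is not None)
        if thresh is not None:
            keep = non_null >= thresh          # thresh overrides `how`
        elif how == "any":
            keep = non_null == len(cols)       # drop if any value is null
        elif how == "all":
            keep = non_null > 0                # drop only if every value is null
        else:
            raise ValueError("how must be 'any' or 'all'")
        if keep:
            kept.append(row)
    return kept


def fillna(rows, value, subset=None):
    """Replace nulls with a single value, or per column when `value` is a dict."""
    filled = []
    for row in rows:
        if isinstance(value, dict):
            replacements = value               # column name -> fill value
        else:
            cols = row.keys() if subset is None else subset
            replacements = {c: value for c in cols}
        filled.append(
            {c: (replacements.get(c, v) if v is None else v) for c, v in row.items()}
        )
    return filled


rows = [
    {"name": "Alice", "age": 30},
    {"name": None, "age": 25},
    {"name": None, "age": None},
]

only_complete = dropna(rows)                   # how="any" is the default
not_all_null = dropna(rows, how="all")
age_present = dropna(rows, subset=["age"])
per_column = fillna(rows, {"name": "unknown", "age": 0})
age_only = fillna(rows, 0, subset=["age"])
```

In PySpark the equivalent calls would be along the lines of `df.dropna(how="all")` and `df.fillna({"name": "unknown", "age": 0})`. The Scala/Java variants are reached through `df.na.drop(...)` and `df.na.fill(...)`.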