    23f966f4
    [SPARK-3930] [SPARK-3933] Support fixed-precision decimal in SQL, and some optimizations · 23f966f4
    Matei Zaharia authored
    - Adds optional precision and scale to Spark SQL's decimal type, which behave similarly to those in Hive 13 (https://cwiki.apache.org/confluence/download/attachments/27362075/Hive_Decimal_Precision_Scale_Support.pdf)
    - Replaces our internal representation of decimals with a Decimal class that can store small values in a mutable Long, saving memory in this situation and letting some operations happen directly on Longs
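The idea of holding small decimals in a long can be illustrated with a minimal Python sketch. It is not Spark's Decimal class: the name `CompactDecimal`, its fields, the 18-digit threshold, and the same-scale restriction on `add` are assumptions made for illustration. A value is stored as an unscaled integer plus a scale, and the integer-only path is used whenever the declared precision is small enough to fit a signed 64-bit long.

```python
from decimal import Decimal as PyDecimal, localcontext

# Assumption: every integer of up to 18 decimal digits fits in a signed 64-bit long.
MAX_LONG_DIGITS = 18


def _to_py_decimal(unscaled, scale):
    """Build an exact decimal.Decimal equal to unscaled * 10**(-scale)."""
    sign = 1 if unscaled < 0 else 0
    digits = tuple(int(d) for d in str(abs(unscaled)))
    return PyDecimal((sign, digits, -scale))


class CompactDecimal:
    """Hypothetical fixed-precision decimal with a compact integer form."""

    def __init__(self, unscaled, precision, scale):
        if len(str(abs(unscaled))) > precision:
            raise ValueError("value does not fit in the declared precision")
        self.precision = precision
        self.scale = scale
        if precision <= MAX_LONG_DIGITS:
            # Compact form: only the unscaled integer is kept.
            self.long_val = unscaled
            self.big_val = None
        else:
            # Fallback form: a full arbitrary-precision decimal object.
            self.long_val = None
            self.big_val = _to_py_decimal(unscaled, scale)

    def to_decimal(self):
        if self.long_val is not None:
            return _to_py_decimal(self.long_val, self.scale)
        return self.big_val

    def add(self, other):
        # Simplification for this sketch: both operands must share a scale.
        if self.scale != other.scale:
            raise ValueError("this sketch only adds values of equal scale")
        # One extra digit leaves room for a carry.
        precision = max(self.precision, other.precision) + 1
        if self.long_val is not None and other.long_val is not None:
            # Fast path: plain integer addition, no decimal objects created.
            return CompactDecimal(self.long_val + other.long_val,
                                  precision, self.scale)
        # Slow path: add as arbitrary-precision decimals, then re-wrap.
        with localcontext() as ctx:
            ctx.prec = precision
            total = self.to_decimal() + other.to_decimal()
            unscaled = int(total.scaleb(self.scale))
        return CompactDecimal(unscaled, precision, self.scale)
```

For example, `CompactDecimal(12345, 10, 2)` represents 123.45 and is held as the single integer 12345. Adding `CompactDecimal(55, 10, 2)` to it goes through the integer-only path.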
    
    This is still marked WIP because there are a few TODOs, but I'll remove that tag when done.
    
    Author: Matei Zaharia <matei@databricks.com>
    
    Closes #2983 from mateiz/decimal-1 and squashes the following commits:
    
    35e6b02 [Matei Zaharia] Fix issues after merge
    227f24a [Matei Zaharia] Review comments
    31f915e [Matei Zaharia] Implement Davies's suggestions in Python
    eb84820 [Matei Zaharia] Support reading/writing decimals as fixed-length binary in Parquet
    4dc6bae [Matei Zaharia] Fix decimal support in PySpark
    d1d9d68 [Matei Zaharia] Fix compile error and test issues after rebase
    b28933d [Matei Zaharia] Support decimal precision/scale in Hive metastore
    2118c0d [Matei Zaharia] Some test and bug fixes
    81db9cb [Matei Zaharia] Added mutable Decimal that will be more efficient for small precisions
    7af0c3b [Matei Zaharia] Add optional precision and scale to DecimalType, but use Unlimited for now
    ec0a947 [Matei Zaharia] Make the result of AVG on Decimals be Decimal, not Double
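As a rough illustration of the fixed-length binary idea mentioned for Parquet in the commit list above, the sketch below encodes a decimal's unscaled integer as big-endian two's-complement bytes, using the smallest byte width that can hold any value of a given precision. The function names are hypothetical, and whether this commit sizes the field in exactly this way is an assumption.

```python
def min_bytes_for_precision(precision):
    """Smallest byte count whose signed range covers every value of up to
    `precision` decimal digits."""
    max_unscaled = 10 ** precision - 1
    num_bytes = 1
    while 2 ** (8 * num_bytes - 1) - 1 < max_unscaled:
        num_bytes += 1
    return num_bytes


def encode_unscaled(unscaled, precision):
    """Encode an unscaled decimal value as fixed-length big-endian bytes."""
    width = min_bytes_for_precision(precision)
    return unscaled.to_bytes(width, byteorder="big", signed=True)


def decode_unscaled(data):
    """Inverse of encode_unscaled."""
    return int.from_bytes(data, byteorder="big", signed=True)
```

Because the width depends only on the declared precision, every value in a column with the same precision occupies the same number of bytes. The scale is not stored per value in this sketch; it would live in the column's schema.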