Skip to content
Snippets Groups Projects
Commit fbebaedf authored by Andre Schumacher's avatar Andre Schumacher Committed by Reynold Xin
Browse files

Spark parquet improvements

A few improvements to the Parquet support for SQL queries:
- Instead of files a ParquetRelation is now backed by a directory, which simplifies importing data from other
  sources
- InsertIntoParquetTable operation now supports switching between overwriting or appending (at least in
  HiveQL)
- tests now use the new API
- Parquet logging can be set to WARNING level (Default)
- Default compression for Parquet files (GZIP, as in parquet-mr)

Author: Andre Schumacher <andre.schumacher@iki.fi>

Closes #195 from AndreSchumacher/spark_parquet_improvements and squashes the following commits:

54df314 [Andre Schumacher] SPARK-1383 [SQL] Improvements to ParquetRelation
parent 92a86b28
No related branches found
No related tags found
No related merge requests found
Showing
with 460 additions and 212 deletions
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment