[SPARK-18658][SQL] Write text records directly to a FileOutputStream
## What changes were proposed in this pull request?

This replaces uses of `TextOutputFormat` with an `OutputStream`, which will either write directly to the filesystem or indirectly via a compressor (if so configured). This avoids intermediate buffering.

The inverse of this (reading directly from a stream) is necessary for streaming large JSON records (when `wholeFile` is enabled), so I wanted to keep the read and write paths symmetric.

## How was this patch tested?

Existing unit tests.

Author: Nathan Howell <nhowell@godaddy.com>

Closes #16089 from NathanHowell/SPARK-18658.
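To illustrate the idea of writing records straight to an `OutputStream` that is optionally wrapped by a compressor, here is a minimal, self-contained sketch. It is not the code from this patch: the class `DirectTextWriter` and its methods are hypothetical names, `GZIPOutputStream` from the JDK stands in for whichever compression codec is configured, and an in-memory `ByteArrayOutputStream` stands in for the filesystem stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration only; not the CodecStreams class added by this patch.
final class DirectTextWriter {

    // Return the sink itself, or the sink wrapped in a compressor if requested.
    // Assumption: gzip stands in for the configured compression codec.
    static OutputStream open(OutputStream sink, boolean compress) throws IOException {
        return compress ? new GZIPOutputStream(sink) : sink;
    }

    // Write each record as UTF-8 bytes followed by a newline, directly to the stream.
    static byte[] encode(List<String> records, boolean compress) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = open(sink, compress)) {
            for (String record : records) {
                out.write(record.getBytes(StandardCharsets.UTF_8));
                out.write('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    // The symmetric read path: decompress by reading directly from a stream.
    static String gunzip(byte[] data) {
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList("{\"a\":1}", "{\"a\":2}");
        String plain = new String(encode(records, false), StandardCharsets.UTF_8);
        String roundTripped = gunzip(encode(records, true));
        // Both paths should yield the same newline-delimited text.
        System.out.println(plain.equals(roundTripped));
    }
}
```

The caller only ever sees an `OutputStream`, so the record-writing loop is the same whether or not compression is enabled; only the `open` step differs.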
Showing
- common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java (19 additions, 0 deletions)
- common/unsafe/src/test/java/org/apache/spark/unsafe/types/UTF8StringSuite.java (109 additions, 0 deletions)
- mllib/src/main/scala/org/apache/spark/ml/source/libsvm/LibSVMRelation.scala (8 additions, 20 deletions)
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonGenerator.scala (4 additions, 0 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/CodecStreams.scala (74 additions, 0 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVParser.scala (8 additions, 11 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVRelation.scala (8 additions, 35 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JsonFileFormat.scala (7 additions, 24 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/text/TextFileFormat.scala (9 additions, 33 deletions)
- sql/hive/src/test/scala/org/apache/spark/sql/sources/SimpleTextRelation.scala (6 additions, 21 deletions)