[SPARK-12935][SQL] DataFrame API for Count-Min Sketch
This PR integrates Count-Min Sketch from spark-sketch into DataFrame. This version resorts to `RDD.aggregate` for building the sketch. A more performant UDAF version can be built in future follow-up PRs.

Author: Cheng Lian <lian@databricks.com>

Closes #10911 from liancheng/cms-df-api.
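
To illustrate the aggregate-style construction the description mentions, here is a minimal, dependency-free sketch in plain Java. It is not the code from this PR: `TinyCountMinSketch`, its hashing scheme, and the `aggregate` helper are hypothetical stand-ins, and the nested lists only simulate partitions. The idea it shows is the general one behind an `aggregate` call: each partition folds its items into a fresh zero-value sketch (the sequence operation), and the partial sketches are then merged cell by cell (the combine operation).

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified Count-Min Sketch; NOT the class in org.apache.spark.util.sketch.
final class TinyCountMinSketch {
    private final int depth;      // number of hash rows
    private final int width;      // number of counters per row
    private final long[][] table; // depth x width counter matrix
    private long totalCount;      // number of items added so far

    TinyCountMinSketch(int depth, int width) {
        this.depth = depth;
        this.width = width;
        this.table = new long[depth][width];
    }

    // Illustrative hashing only: derive one bucket per row from hashCode().
    private int bucket(Object item, int row) {
        int h = item.hashCode() * 31 + row * 0x9E3779B9;
        h ^= (h >>> 16);
        return Math.floorMod(h, width);
    }

    // Sequence operation: fold a single item into this sketch.
    TinyCountMinSketch add(Object item) {
        for (int row = 0; row < depth; row++) {
            table[row][bucket(item, row)]++;
        }
        totalCount++;
        return this;
    }

    // Combine operation: merge a sketch of the same shape by adding counters.
    TinyCountMinSketch mergeInPlace(TinyCountMinSketch other) {
        if (other.depth != depth || other.width != width) {
            throw new IllegalArgumentException("sketch shapes differ");
        }
        for (int row = 0; row < depth; row++) {
            for (int col = 0; col < width; col++) {
                table[row][col] += other.table[row][col];
            }
        }
        totalCount += other.totalCount;
        return this;
    }

    // The estimate is the minimum counter across rows; it never undercounts.
    long estimateCount(Object item) {
        long min = Long.MAX_VALUE;
        for (int row = 0; row < depth; row++) {
            min = Math.min(min, table[row][bucket(item, row)]);
        }
        return min;
    }

    long totalCount() {
        return totalCount;
    }

    // Simulates an aggregate over partitions: one partial sketch per
    // partition, then a merge of all partial sketches into the result.
    static TinyCountMinSketch aggregate(List<List<String>> partitions, int depth, int width) {
        TinyCountMinSketch result = new TinyCountMinSketch(depth, width);
        for (List<String> partition : partitions) {
            TinyCountMinSketch partial = new TinyCountMinSketch(depth, width);
            for (String item : partition) {
                partial.add(item);
            }
            result.mergeInPlace(partial);
        }
        return result;
    }
}
```

Because merging only adds counters, the partitioned build produces the same table as adding every item to a single sketch, which is what makes this structure a natural fit for an aggregate. For example, `TinyCountMinSketch.aggregate(Arrays.asList(Arrays.asList("a", "b", "a"), Arrays.asList("a", "c")), 4, 64)` yields a sketch whose estimate for `"a"` is at least 3.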
Showing 7 changed files:
- common/sketch/src/main/java/org/apache/spark/util/sketch/BloomFilter.java (6 additions, 4 deletions)
- common/sketch/src/main/java/org/apache/spark/util/sketch/CountMinSketch.java (16 additions, 10 deletions)
- common/sketch/src/main/java/org/apache/spark/util/sketch/CountMinSketchImpl.java (34 additions, 22 deletions)
- sql/core/pom.xml (5 additions, 0 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/DataFrameStatFunctions.scala (81 additions, 0 deletions)
- sql/core/src/test/java/test/org/apache/spark/sql/JavaDataFrameSuite.java (27 additions, 1 deletion)
- sql/core/src/test/scala/org/apache/spark/sql/DataFrameStatSuite.scala (36 additions, 0 deletions)