  1. Aug 26, 2014
    • [SPARK-3073] [PySpark] use external sort in sortBy() and sortByKey() · f1e71d4c
      Davies Liu authored
      Use external sort to support sorting large datasets in the reduce stage.
      
      Author: Davies Liu <davies.liu@gmail.com>
      
      Closes #1978 from davies/sort and squashes the following commits:
      
      bbcd9ba [Davies Liu] check spilled bytes in tests
      b125d2f [Davies Liu] add test for external sort in rdd
      eae0176 [Davies Liu] choose different disks from different processes and instances
      1f075ed [Davies Liu] Merge branch 'master' into sort
      eb53ca6 [Davies Liu] Merge branch 'master' into sort
      644abaf [Davies Liu] add license in LICENSE
      19f7873 [Davies Liu] improve tests
      55602ee [Davies Liu] use external sort in sortBy() and sortByKey()
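
For readers unfamiliar with the technique named in the commit message, the following is a minimal sketch of an external merge sort. It is not the code from this commit: the function names `external_sort` and `_read_run` and the `chunk_size` parameter are hypothetical, and a fixed item count stands in for whatever memory-based spill threshold a real implementation would use.

```python
import heapq
import itertools
import os
import pickle
import shutil
import tempfile


def _read_run(path):
    """Yield the pickled items of one spilled, already-sorted run."""
    with open(path, "rb") as f:
        while True:
            try:
                yield pickle.load(f)
            except EOFError:
                return


def external_sort(iterator, key=None, reverse=False, chunk_size=10000):
    """Sort an iterator while holding at most chunk_size items in memory.

    Hypothetical helper for illustration only.
    """
    iterator = iter(iterator)
    tmp_dir = tempfile.mkdtemp()
    runs = []
    try:
        while True:
            # Pull a bounded chunk, sort it in memory, and spill it to disk.
            chunk = list(itertools.islice(iterator, chunk_size))
            if not chunk:
                break
            chunk.sort(key=key, reverse=reverse)
            path = os.path.join(tmp_dir, "run-%d" % len(runs))
            with open(path, "wb") as f:
                for item in chunk:
                    pickle.dump(item, f)
            runs.append(path)
        # Lazily merge the sorted runs; only one item per run is in memory.
        merged = heapq.merge(*[_read_run(p) for p in runs],
                             key=key, reverse=reverse)
        for item in merged:
            yield item
    finally:
        # Remove the spill files once the merge finishes or is abandoned.
        shutil.rmtree(tmp_dir, ignore_errors=True)
```

Each chunk is sorted independently and written out as a run, so peak memory depends on `chunk_size` rather than on the size of the dataset; the final merge streams the runs back in order.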