    [SPARK-3073] [PySpark] use external sort in sortBy() and sortByKey() · f1e71d4c
    Davies Liu authored
    Use external sort so that large datasets can be sorted in the reduce stage.
    
    Author: Davies Liu <davies.liu@gmail.com>
    
    Closes #1978 from davies/sort and squashes the following commits:
    
    bbcd9ba [Davies Liu] check spilled bytes in tests
    b125d2f [Davies Liu] add test for external sort in rdd
    eae0176 [Davies Liu] choose different disks from different processes and instances
    1f075ed [Davies Liu] Merge branch 'master' into sort
    eb53ca6 [Davies Liu] Merge branch 'master' into sort
    644abaf [Davies Liu] add license in LICENSE
    19f7873 [Davies Liu] improve tests
    55602ee [Davies Liu] use external sort in sortBy() and sortByKey()
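
The commit message names the technique but does not show it. As a rough illustration only, here is a minimal external merge sort in plain Python: sort fixed-size chunks in memory, spill each sorted run to a temporary file, then merge the runs lazily. The function names `external_sort` and `_read_run`, the `chunk_size` parameter, and the use of `pickle` for spill files are all assumptions made for this sketch. It is not the code from this commit, which may decide when to spill differently (for example, by memory usage rather than by item count).

```python
import heapq
import itertools
import os
import pickle
import shutil
import tempfile


def _read_run(path):
    """Yield the pickled items stored in one sorted run file (hypothetical helper)."""
    with open(path, "rb") as f:
        while True:
            try:
                yield pickle.load(f)
            except EOFError:
                return


def external_sort(iterable, key=None, chunk_size=10000):
    """Yield items in sorted order, holding at most chunk_size items in memory
    during the run-building phase. Illustrative sketch, not PySpark's code."""
    iterator = iter(iterable)
    tmp_dir = tempfile.mkdtemp(prefix="extsort-")
    paths = []
    try:
        # Phase 1: cut the input into chunks, sort each one, spill it to disk.
        while True:
            chunk = list(itertools.islice(iterator, chunk_size))
            if not chunk:
                break
            chunk.sort(key=key)
            path = os.path.join(tmp_dir, "run-%d" % len(paths))
            with open(path, "wb") as f:
                for item in chunk:
                    pickle.dump(item, f, pickle.HIGHEST_PROTOCOL)
            paths.append(path)
        # Phase 2: k-way merge of the sorted runs; only one item per run
        # needs to be in memory at a time.
        runs = [_read_run(p) for p in paths]
        for item in heapq.merge(*runs, key=key):
            yield item
    finally:
        # Remove the spill files once the generator is exhausted or closed.
        shutil.rmtree(tmp_dir, ignore_errors=True)
```

Because the result is a generator, a caller can stream the sorted output, for example `for row in external_sort(rows, key=lambda r: r[0]): ...`, without materializing the whole dataset.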
tox.ini 838 B
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

[pep8]
max-line-length=100
exclude=cloudpickle.py,heapq3.py