    Add custom serializer support to PySpark. · cbb7f04a
    Josh Rosen authored
    For now, this only adds MarshalSerializer, but it lays the groundwork
    for supporting other custom serializers.  Many of these mechanisms
    can also be used to support deserialization of different data formats
    sent by Java, such as data encoded by MsgPack.
    
    This also fixes a bug in SparkContext.union().
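
    As a rough illustration of what this enables, a PySpark job could opt into the
    new serializer along the lines of the minimal sketch below (an assumption-laden
    example: it presumes SparkContext accepts a serializer argument as this commit
    introduces, that MarshalSerializer lives in pyspark.serializers, and the
    application name "serializer-demo" is purely illustrative):

        from pyspark.context import SparkContext
        from pyspark.serializers import MarshalSerializer

        # Serialize RDD elements with marshal instead of the default pickle.
        # marshal is faster but supports a narrower range of Python types.
        sc = SparkContext("local", "serializer-demo", serializer=MarshalSerializer())
        print(sc.parallelize(range(10)).map(lambda x: 2 * x).collect())
        sc.stop()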
run-tests
#!/usr/bin/env bash

#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#


# Figure out where the Spark framework is installed
FWDIR="$(cd "$(dirname "$0")/.."; pwd)"

# CD into the python directory to find things on the right path
cd "$FWDIR/python"

FAILED=0

rm -f unit-tests.log

function run_test() {
    # $1 is intentionally left unquoted so multi-word arguments such as
    # "-m doctest pyspark/broadcast.py" are split into separate words.
    "$FWDIR"/pyspark $1 2>&1 | tee -a unit-tests.log
    # PIPESTATUS[0] is the exit code of pyspark (not tee); remember any failure.
    FAILED=$((PIPESTATUS[0] || FAILED))
}

run_test "pyspark/rdd.py"
run_test "pyspark/context.py"
run_test "-m doctest pyspark/broadcast.py"
run_test "-m doctest pyspark/accumulators.py"
run_test "-m doctest pyspark/serializers.py"
run_test "pyspark/tests.py"

if [[ $FAILED != 0 ]]; then
    echo -en "\033[31m"  # Red
    echo "Had test failures; see logs."
    echo -en "\033[0m"  # No color
    exit 1
else
    echo -en "\033[32m"  # Green
    echo "Tests passed."
    echo -en "\033[0m"  # No color
fi

# TODO: in the long-run, it would be nice to use a test runner like `nose`.
# The doctest fixtures are the current barrier to doing this.