Commit 15d57f9c authored by Wenchen Fan, committed by Davies Liu

[SPARK-13647] [SQL] also check if numeric value is within allowed range in _verify_type

## What changes were proposed in this pull request?

This PR makes `_verify_type` in `types.py` stricter: in addition to checking the type of the object, it now checks that a numeric value is within the allowed range for `ByteType`, `ShortType`, and `IntegerType`.
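
The range check can be sketched outside of Spark as follows. This is a minimal illustration, not the patch itself: `INTEGRAL_RANGES` and `check_integral_range` are hypothetical names, and type names are plain strings here rather than `pyspark.sql.types` classes. The bounds are those of Java's signed `byte`, `short`, and `int`.

```python
# Hypothetical standalone sketch; the actual patch performs these checks
# inside _verify_type using isinstance() on pyspark.sql.types classes.
INTEGRAL_RANGES = {
    "ByteType": (-128, 127),                   # 8-bit signed
    "ShortType": (-32768, 32767),              # 16-bit signed
    "IntegerType": (-2147483648, 2147483647),  # 32-bit signed
}


def check_integral_range(obj, type_name):
    """Raise ValueError if obj does not fit in the named integral type."""
    bounds = INTEGRAL_RANGES.get(type_name)
    if bounds is None:
        return  # this sketch applies no range check to other types
    low, high = bounds
    if obj < low or obj > high:
        raise ValueError("object of %s out of range, got: %s" % (type_name, obj))


check_integral_range(12, "ByteType")  # within range, passes silently
try:
    check_integral_range(1234, "ByteType")
except ValueError as e:
    print(e)  # object of ByteType out of range, got: 1234
```

A lookup table keeps the bounds in one place; the patch itself uses an `if`/`elif` chain, which fits the existing structure of `_verify_type`.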

## How was this patch tested?

Newly added doctest.
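
For readers unfamiliar with the pattern, here is a self-contained sketch of a doctest that expects an exception. `check_byte` is a hypothetical helper written only for this illustration; it is not a Spark function. With `IGNORE_EXCEPTION_DETAIL`, doctest compares only the exception type, so the message after `ValueError:` does not have to match.

```python
import doctest


def check_byte(value):
    """
    Hypothetical helper used only to illustrate the doctest pattern.

    >>> check_byte(12)
    >>> check_byte(1234)  # doctest: +IGNORE_EXCEPTION_DETAIL
    Traceback (most recent call last):
        ...
    ValueError:...
    """
    if value < -128 or value > 127:
        raise ValueError("object of ByteType out of range, got: %s" % value)


if __name__ == "__main__":
    # Runs every doctest found in this module and prints a summary.
    print(doctest.testmod())
```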

Author: Wenchen Fan <wenchen@databricks.com>

Closes #11492 from cloud-fan/py-verify.
parent d062587d
@@ -1093,8 +1093,11 @@ _acceptable_types = {

 def _verify_type(obj, dataType):
     """
-    Verify the type of obj against dataType, raise an exception if
-    they do not match.
+    Verify the type of obj against dataType, raise a TypeError if they do not match.
+
+    Also verify the value of obj against datatype, raise a ValueError if it's not within the allowed
+    range, e.g. using 128 as ByteType will overflow. Note that, Python float is not checked, so it
+    will become infinity when cast to Java float if it overflows.

     >>> _verify_type(None, StructType([]))
     >>> _verify_type("", StringType())
@@ -1111,6 +1114,12 @@ def _verify_type(obj, dataType):
     Traceback (most recent call last):
         ...
     ValueError:...
+    >>> # Check if numeric values are within the allowed range.
+    >>> _verify_type(12, ByteType())
+    >>> _verify_type(1234, ByteType()) # doctest: +IGNORE_EXCEPTION_DETAIL
+    Traceback (most recent call last):
+        ...
+    ValueError:...
     """
     # all objects are nullable
     if obj is None:
@@ -1137,7 +1146,19 @@ def _verify_type(obj, dataType):
         if type(obj) not in _acceptable_types[_type]:
             raise TypeError("%s can not accept object %r in type %s" % (dataType, obj, type(obj)))

-    if isinstance(dataType, ArrayType):
+    if isinstance(dataType, ByteType):
+        if obj < -128 or obj > 127:
+            raise ValueError("object of ByteType out of range, got: %s" % obj)
+
+    elif isinstance(dataType, ShortType):
+        if obj < -32768 or obj > 32767:
+            raise ValueError("object of ShortType out of range, got: %s" % obj)
+
+    elif isinstance(dataType, IntegerType):
+        if obj < -2147483648 or obj > 2147483647:
+            raise ValueError("object of IntegerType out of range, got: %s" % obj)
+
+    elif isinstance(dataType, ArrayType):
         for i in obj:
             _verify_type(i, dataType.elementType)
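
The docstring notes that Python `float` values are not range-checked. As a rough illustration only (it is not part of this patch), single-precision overflow can be detected in pure Python with the standard `struct` module, which raises `OverflowError` when a finite value is too large for the 4-byte IEEE 754 `f` format. `fits_in_float32` is a hypothetical helper name.

```python
import struct


def fits_in_float32(value):
    """Return True if a finite Python float can be packed as a 4-byte float.

    Hypothetical helper: _verify_type in this patch performs no such check.
    """
    try:
        struct.pack("<f", value)  # "<f" is the standard 4-byte IEEE 754 format
    except OverflowError:
        return False
    return True


print(fits_in_float32(1.5))    # True
print(fits_in_float32(1e40))   # False: too large for single precision
```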