Commit 846cf462 authored by zhichao.li, committed by Davies Liu

[SPARK-9238] [SQL] Remove two extra useless entries for bytesOfCodePointInUTF8

This is a tentative change and I am not sure I understand it correctly, but I believe `bytesOfCodePointInUTF8` needs only 2 entries for the 6-byte code point case (lead byte `1111110x`), since that pattern leaves a single free bit.
Details can be found in the "Description" section of https://en.wikipedia.org/wiki/UTF-8.
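
As a rough check of the reasoning above, the sketch below counts how many byte values start with exactly n one-bits followed by a zero, which is the lead-byte pattern for an n-byte sequence in the original UTF-8 scheme. The class and method names (`LeadByteCounts`, `leadingOnes`, `countLeadBytes`) are hypothetical and are not part of `UTF8String`; this is only an illustration, not the code touched by this commit.

```java
public class LeadByteCounts {
    // Number of consecutive 1 bits at the top of an 8-bit value.
    static int leadingOnes(int b) {
        int n = 0;
        for (int mask = 0x80; mask != 0 && (b & mask) != 0; mask >>= 1) {
            n++;
        }
        return n;
    }

    // How many byte values in 0x00..0xFF have exactly `length` leading 1 bits,
    // i.e. could be the lead byte of a `length`-byte sequence.
    static int countLeadBytes(int length) {
        int count = 0;
        for (int b = 0; b <= 0xFF; b++) {
            if (leadingOnes(b) == length) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Print the count for each sequence length covered by the table.
        for (int len = 2; len <= 6; len++) {
            System.out.println(len + "-byte lead bytes: " + countLeadBytes(len));
        }
    }
}
```

Under this counting, the 6-byte pattern `1111110x` matches only 0xFC and 0xFD, which is consistent with keeping two `6` entries.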

Author: zhichao.li <zhichao.li@intel.com>

Closes #7582 from zhichao-li/utf8 and squashes the following commits:

8bddd01 [zhichao.li] two extra entries
parent dfb18be0
@@ -48,7 +48,7 @@ public final class UTF8String implements Comparable<UTF8String>, Serializable {
     3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
     4, 4, 4, 4, 4, 4, 4, 4,
     5, 5, 5, 5,
-    6, 6, 6, 6};
+    6, 6};
   public static final UTF8String EMPTY_UTF8 = UTF8String.fromString("");