    d642b273
    [SPARK-15397][SQL] fix string udf locate as hive · d642b273
    Daoyuan Wang authored
    ## What changes were proposed in this pull request?
    
    In Hive, `locate("aa", "aaa", 0)` yields 0, `locate("aa", "aaa", 1)` yields 1, and `locate("aa", "aaa", 2)` yields 2. In Spark, the same calls yield 1, 2, and 0 respectively. The discrepancy comes from a different interpretation of the third parameter of the `locate` UDF. That parameter is the starting position and is 1-based, so passing 0 should always return 0. This patch makes Spark follow the Hive behavior.
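
The intended semantics can be sketched with a small reference model. This is an illustrative Python function, not Spark's or Hive's actual implementation. The name `locate` and its signature mirror the UDF, but the default `pos=1` and the lack of null handling are assumptions made for brevity.

```python
def locate(substr: str, s: str, pos: int = 1) -> int:
    """Reference model of Hive-style locate (illustrative only).

    Returns the 1-based position of the first occurrence of `substr`
    in `s` at or after the 1-based position `pos`, or 0 if there is
    no such occurrence. Because positions start at 1, any `pos` below
    1 is out of range and returns 0.
    """
    if pos < 1:
        return 0
    # str.find is 0-based and returns -1 on failure, so shifting the
    # result by one gives a 1-based position, or 0 when not found.
    return s.find(substr, pos - 1) + 1


if __name__ == "__main__":
    for start in (0, 1, 2, 3):
        print(start, locate("aa", "aaa", start))
```

For the three calls quoted above, the model returns 0, 1, and 2, which matches the Hive results described in this commit message.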
    
    ## How was this patch tested?
    
    Tested with the modified `StringExpressionsSuite` and `StringFunctionsSuite`.
    
    Author: Daoyuan Wang <daoyuan.wang@intel.com>
    
    Closes #13186 from adrian-wang/locate.