    [SPARK-8064] [SQL] Build against Hive 1.2.1 · a2409d1c
    Steve Loughran authored
    Cherry-picked the parts of the initial SPARK-8064 WIP branch needed to get sql/hive to compile against Hive 1.2.1. That's the ASF release packaged under org.apache.hive, not any fork.
    
    Tests not run yet: that's what the machines are for
    
    Author: Steve Loughran <stevel@hortonworks.com>
    Author: Cheng Lian <lian@databricks.com>
    Author: Michael Armbrust <michael@databricks.com>
    Author: Patrick Wendell <patrick@databricks.com>
    
    Closes #7191 from steveloughran/stevel/feature/SPARK-8064-hive-1.2-002 and squashes the following commits:
    
    7556d85 [Cheng Lian] Updates .q files and corresponding golden files
    ef4af62 [Steve Loughran] Merge commit '6a92bb09f46a04d6cd8c41bdba3ecb727ebb9030' into stevel/feature/SPARK-8064-hive-1.2-002
    6a92bb0 [Cheng Lian] Overrides HiveConf time vars
    dcbb391 [Cheng Lian] Adds com.twitter:parquet-hadoop-bundle:1.6.0 for Hive Parquet SerDe
    0bbe475 [Steve Loughran] SPARK-8064 scalastyle rejects the standard Hadoop ASF license header...
    fdf759b [Steve Loughran] SPARK-8064 classpath dependency suite to be in sync with shading in final (?) hive-exec spark
    7a6c727 [Steve Loughran] SPARK-8064 switch to second staging repo of the spark-hive artifacts. This one has the protobuf-shaded hive-exec jar
    376c003 [Steve Loughran] SPARK-8064 purge duplicate protobuf declaration
    2c74697 [Steve Loughran] SPARK-8064 switch to the protobuf shaded hive-exec jar with tests to chase it down
    cc44020 [Steve Loughran] SPARK-8064 remove hadoop.version from runtest.py, as profile will fix that automatically.
    6901fa9 [Steve Loughran] SPARK-8064 explicit protobuf import
    da310dc [Michael Armbrust] Fixes for Hive tests.
    a775a75 [Steve Loughran] SPARK-8064 cherry-pick-incomplete
    7404f34 [Patrick Wendell] Add spark-hive staging repo
    832c164 [Steve Loughran] SPARK-8064 try to suppress compiler warnings on Complex.java pasted-thrift-code
    312c0d4 [Steve Loughran] SPARK-8064 maven/ivy dependency purge; calcite declaration needed
    fa5ae7b [Steve Loughran] HIVE-8064 fix up hive-thriftserver dependencies and cut back on evicted references in the hive- packages; this keeps mvn and ivy resolution compatible, as the reconciliation policy is "by hand"
    c188048 [Steve Loughran] SPARK-8064 manage the Hive dependencies so that: things that aren't needed are excluded; sql/hive built with ivy is in sync with the maven reconciliation policy, rather than latest-first
    4c8be8d [Cheng Lian] WIP: Partial fix for Thrift server and CLI tests
    314eb3c [Steve Loughran] SPARK-8064 deprecation warning noise in one of the tests
    17b0341 [Steve Loughran] SPARK-8064 IDE-hinted cleanups of Complex.java to reduce compiler warnings. It's all autogenerated code, so still ugly.
    d029b92 [Steve Loughran] SPARK-8064 rely on unescaping to have already taken place, so go straight to map of serde options
    23eca7e [Steve Loughran] HIVE-8064 handle raw and escaped property tokens
    54d9b06 [Steve Loughran] SPARK-8064 fix compilation regression surfacing from rebase
    0b12d5f [Steve Loughran] HIVE-8064 use subset of hive complex type whose types deserialize
    fce73b6 [Steve Loughran] SPARK-8064 poms rely implicitly on the version of kryo chill provides
    fd3aa5d [Steve Loughran] SPARK-8064 version of hive to d/l from ivy is 1.2.1
    dc73ece [Steve Loughran] SPARK-8064 revert to master's deterministic pushdown strategy
    d3c1e4a [Steve Loughran] SPARK-8064 purge UnionType
    051cc21 [Steve Loughran] SPARK-8064 switch to an unshaded version of hive-exec-core, which must have been built with Kryo 2.21. This currently looks for a (locally built) version 1.2.1.spark
    6684c60 [Steve Loughran] SPARK-8064 ignore RTE raised in blocking process.exitValue() call
    e6121e5 [Steve Loughran] SPARK-8064 address review comments
    aa43dc6 [Steve Loughran] SPARK-8064 more robust teardown on JavaMetastoreDatasourcesSuite
    f2bff01 [Steve Loughran] SPARK-8064 better takeup of asynchronously caught error text
    8b1ef38 [Steve Loughran] SPARK-8064: on failures executing spark-submit in HiveSparkSubmitSuite, print command line and all logged output.
    5a9ce6b [Steve Loughran] SPARK-8064 add explicit reason for kv split failure, rather than array OOB. *does not address the issue*
    642b63a [Steve Loughran] SPARK-8064 reinstate something cut briefly during rebasing
    97194dc [Steve Loughran] SPARK-8064 add extra logging to the YarnClusterSuite classpath test. There should be no reason why this is failing on jenkins, but as it is (and presumably it's CP-related), improve the logging including any exception raised.
    335357f [Steve Loughran] SPARK-8064 fail fast on thrift process spawning tests on exit codes and/or error string patterns seen in log.
    3ed872f [Steve Loughran] SPARK-8064 rename field double to dbl
    bca55e5 [Steve Loughran] SPARK-8064 missed one of the `date` escapes
    41d6479 [Steve Loughran] SPARK-8064 wrap tests with withTable() calls to avoid table-exists exceptions
    2bc29a4 [Steve Loughran] SPARK-8064 ParquetSuites to escape `date` field name
    1ab9bc4 [Steve Loughran] SPARK-8064 TestHive to use serde2.thrift.test.Complex
    bf3a249 [Steve Loughran] SPARK-8064: more resubmit than fix; tighten startup timeout to 60s. Still no obvious reason why jersey server code in spark-assembly isn't being picked up; it hasn't been shaded
    c829b8f [Steve Loughran] SPARK-8064: reinstate yarn-rm-server dependencies to hive-exec to ensure that jersey server is on classpath on hadoop versions < 2.6
    0b0f738 [Steve Loughran] SPARK-8064: thrift server startup to fail fast on any exception in the main thread
    13abaf1 [Steve Loughran] SPARK-8064 Hive compatibility tests in sync with explain/show output from Hive 1.2.1
    d14d5ea [Steve Loughran] SPARK-8064: DATE is now a predicate; you can't use it as a field in select ops
    26eef1c [Steve Loughran] SPARK-8064: HIVE-9039 renamed TOK_UNION => TOK_UNIONALL while adding TOK_UNIONDISTINCT
    3d64523 [Steve Loughran] SPARK-8064 improve diagnostics on unknown token; fix scalastyle failure
    d0360f6 [Steve Loughran] SPARK-8064: delicate merge in of the branch vanzin/hive-1.1
    1126e5a [Steve Loughran] SPARK-8064: name of unrecognized file format wasn't appearing in error text
    8cb09c4 [Steve Loughran] SPARK-8064: test resilience/assertion improvements. Independent of the rest of the work; can be backported to earlier versions
    dec12cb [Steve Loughran] SPARK-8064: when a CLI suite test fails include the full output text in the raised exception; this ensures that the stdout/stderr is included in jenkins reports, so it becomes possible to diagnose the cause.
    463a670 [Steve Loughran] SPARK-8064 run-tests.py adds a hadoop-2.6 profile, and changes info messages to say "w/Hive 1.2.1" in console output
    2531099 [Steve Loughran] SPARK-8064 successful attempt to get rid of pentaho as a transitive dependency of hive-exec
    1d59100 [Steve Loughran] SPARK-8064 (unsuccessful) attempt to get rid of pentaho as a transitive dependency of hive-exec
    75733fc [Steve Loughran] SPARK-8064 change thrift binary startup message to "Starting ThriftBinaryCLIService on port"
    3ebc279 [Steve Loughran] SPARK-8064 move strings used to check for http/bin thrift services up into constants
    c80979d [Steve Loughran] SPARK-8064: SparkSQLCLIDriver drops remote mode support. CLISuite Tests pass instead of timing out: undetected regression?
    27e8370 [Steve Loughran] SPARK-8064 fix some style & IDE warnings
    00e50d6 [Steve Loughran] SPARK-8064 stop excluding hive shims from dependency (commented out, for now)
    cb4f142 [Steve Loughran] SPARK-8054 cut pentaho dependency from calcite
    f7aa9cb [Steve Loughran] SPARK-8064 everything compiles with some commenting and moving of classes into a hive package
    6c310b4 [Steve Loughran] SPARK-8064 subclass Hive ServerOptionsProcessor to make it public again
    f61a675 [Steve Loughran] SPARK-8064 thrift server switched to Hive 1.2.1, though it doesn't compile everywhere
    4890b9d [Steve Loughran] SPARK-8064, build against Hive 1.2.1