Hive-HCatalog action flaky on HA #205

@DariuszAniszewski

I noticed that the hive-hcatalog init action is flaky on a High Availability cluster.

The cluster was created using the following command:

gcloud dataproc clusters create 'hive-ha' \
  --initialization-actions 'gs://dataproc-initialization-actions/hive-hcatalog/hive-hcatalog.sh' \
  --num-workers 2 \
  --num-masters 3 \
  --worker-machine-type n1-standard-4 \
  --master-machine-type n1-standard-4

The cluster seems to be created properly and the init action is executed (I manually checked for artifacts of the action), but Hive jobs only succeed when they run on the main master (m-0). I simply wanted to run a SHOW TABLES; query against the cluster using:

gcloud dataproc jobs submit hive --cluster 'hive-ha' -e 'SHOW TABLES;'

The result depends on which master node takes the job: it succeeds on m-0 and fails with "Connection refused" on m-1 and m-2.

The job below was executed on m-0:
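To see how often each master is picked, the same query can be submitted several times and each run reduced to a host and an outcome. This is only a rough sketch: the function names are hypothetical, and it assumes the job output contains a `Connecting to jdbc:hive2://<host>:10000` line plus either `finished successfully` or `Connection refused`, as in the logs in this report. Newer gcloud releases may also need a `--region` flag, which is left out here to match the command above.

```shell
# Hypothetical helpers for reproducing the flakiness; not part of the init action.

# Print the HiveServer2 host named in the "Connecting to jdbc:hive2://..." line
# of the job output read from stdin.
hs2_host_from_log() {
  sed -n 's|^Connecting to jdbc:hive2://\([^:/]*\):.*$|\1|p' | head -n 1
}

# Print "ok" if the output reports success, "refused" if HiveServer2 refused
# the connection, and "other" for anything else.
job_outcome_from_log() {
  log="$(cat)"
  case "$log" in
    *'finished successfully'*) echo ok ;;
    *'Connection refused'*) echo refused ;;
    *) echo other ;;
  esac
}

# Submit SHOW TABLES; repeatedly and print "<host> <outcome>" for every run.
# Usage: repeat_show_tables CLUSTER [RUNS]
repeat_show_tables() {
  cluster="$1"
  runs="${2:-6}"
  i=0
  while [ "$i" -lt "$runs" ]; do
    # Capture stdout and stderr so the driver output is seen either way.
    out="$(gcloud dataproc jobs submit hive --cluster "$cluster" -e 'SHOW TABLES;' 2>&1)"
    host="$(printf '%s\n' "$out" | hs2_host_from_log)"
    outcome="$(printf '%s\n' "$out" | job_outcome_from_log)"
    printf '%s %s\n' "$host" "$outcome"
    i=$((i + 1))
  done
}
```

Running `repeat_show_tables hive-ha 10` and piping the result through `sort | uniq -c` would give a per-master count of successes and refusals.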

$ gcloud dataproc jobs submit hive --cluster 'hive-ha' -e 'SHOW TABLES;'
Job [381c440c-146e-4794-8752-de1420f050ca] submitted.
Waiting for job output...
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/lib/hive/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Connecting to jdbc:hive2://hive-ha-m-0:10000
Connected to: Apache Hive (version 2.1.1)
Driver: Hive JDBC (version 2.1.1)
18/03/05 13:29:55 [main]: WARN jdbc.HiveConnection: Request to set autoCommit to false; Hive does not support autoCommit=false.
Transaction isolation: TRANSACTION_REPEATABLE_READ
+-----------+--+
| tab_name  |
+-----------+--+
+-----------+--+
No rows selected (0.112 seconds)
Beeline version 2.1.1 by Apache Hive
Closing: 0: jdbc:hive2://hive-ha-m-0:10000
Job [381c440c-146e-4794-8752-de1420f050ca] finished successfully.

The job below was executed on m-1:

$ gcloud dataproc jobs submit hive --cluster 'hive-ha' -e 'SHOW TABLES;'
Job [e46b6906-aeea-4b3a-aefd-4c8b517cd360] submitted.
Waiting for job output...
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/lib/hive/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Connecting to jdbc:hive2://hive-ha-m-1:10000
18/03/05 13:32:10 [main]: WARN jdbc.HiveConnection: Failed to connect to hive-ha-m-1:10000
Could not open connection to the HS2 server. Please check the server URI and if the URI is correct, then ask the administrator to check the server status.
Error: Could not open client transport with JDBC Uri: jdbc:hive2://hive-ha-m-1:10000: java.net.ConnectException: Connection refused (Connection refused) (state=08S01,code=0)
No current connection
ERROR: (gcloud.dataproc.jobs.submit.hive) Job [e46b6906-aeea-4b3a-aefd-4c8b517cd360] entered state [ERROR] while waiting for [DONE].

The job below was executed on m-2:

$ gcloud dataproc jobs submit hive --cluster 'hive-ha' -e 'SHOW TABLES;'
Job [1cbc6ced-6e50-42c2-a29f-f8d500043d7d] submitted.
Waiting for job output...
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/lib/hive/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Connecting to jdbc:hive2://hive-ha-m-2:10000
18/03/05 13:29:07 [main]: WARN jdbc.HiveConnection: Failed to connect to hive-ha-m-2:10000
Could not open connection to the HS2 server. Please check the server URI and if the URI is correct, then ask the administrator to check the server status.
Error: Could not open client transport with JDBC Uri: jdbc:hive2://hive-ha-m-2:10000: java.net.ConnectException: Connection refused (Connection refused) (state=08S01,code=0)
No current connection
ERROR: (gcloud.dataproc.jobs.submit.hive) Job [1cbc6ced-6e50-42c2-a29f-f8d500043d7d] entered state [ERROR] while waiting for [DONE].
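
The "Connection refused" errors suggest that nothing is listening on port 10000 on m-1 and m-2. One way to check this directly is to probe the port on every master. The sketch below is an assumption-laden illustration, not a confirmed diagnosis: the function names are hypothetical, the `<cluster>-m-<index>` host naming is inferred from the logs above, and it assumes SSH access through `gcloud compute ssh` (which may need a `--zone` flag) and that `ss` is available on the nodes.

```shell
# Hypothetical helpers for checking HiveServer2 on each master.

# Print the master hostnames of an HA cluster, one per line.
# Assumes the <cluster>-m-<index> naming seen in the job logs.
# Usage: master_hosts CLUSTER [COUNT]
master_hosts() {
  cluster="$1"
  count="${2:-3}"
  i=0
  while [ "$i" -lt "$count" ]; do
    printf '%s-m-%s\n' "$cluster" "$i"
    i=$((i + 1))
  done
}

# Report whether anything listens on the HiveServer2 port (10000) on each master.
# Usage: check_hs2_ports CLUSTER [COUNT]
check_hs2_ports() {
  for host in $(master_hosts "$1" "${2:-3}"); do
    # stdin is redirected so ssh does not swallow the caller's input.
    if gcloud compute ssh "$host" --command 'ss -ltn | grep -q ":10000 "' </dev/null >/dev/null 2>&1; then
      echo "$host: listening on 10000"
    else
      echo "$host: nothing listening on 10000 (or ssh failed)"
    fi
  done
}
```

Calling `check_hs2_ports hive-ha` would print one line per master. If only m-0 reports a listener, that would be consistent with the job results above.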
