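Introduction
I run a 3-node cluster on mini PCs, and at some point I could no longer log in to Prism Central. These are my notes on how I dealt with it.
How I dealt with it
I asked on the Community, but nobody had an answer.
Prism Central appears to resolve names internally, and that resolution seems to be failing. When I checked the latest release notes with the aim of reinstalling MSP, it turned out to be a bug in MSP 2.5.0.0: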
Resolved an issue where, during upgrades to Prism Central version pc.2024.1, the worker VM in a migrated cluster was unable to resolve the DNS name iam-proxy.ntnx-base.
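The release note quoted above is about a failed lookup of the name iam-proxy.ntnx-base. A minimal sketch of that kind of lookup check in Python follows. It is only an illustration: the function name can_resolve is made up for this note, it uses whatever resolver the machine running it is configured with, and because the name is internal to the MSP cluster, the result only means something on a machine that is supposed to resolve it. Whether Python is available on that machine is an assumption.

```python
import socket


def can_resolve(hostname: str) -> bool:
    """Return True if the local resolver returns at least one address for hostname."""
    try:
        # getaddrinfo raises socket.gaierror when the lookup fails
        return len(socket.getaddrinfo(hostname, None)) > 0
    except socket.gaierror:
        return False
```

Calling can_resolve("iam-proxy.ntnx-base") would then return False on a machine hit by the resolution failure described in the release note.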
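That said, I still had to do something about it, so I looked up the documentation on unregistering Prism Central. Reading through it, I learned the following.
- From pc.2024.1 onward, unregistering Prism Central from Prism Element is not supported.
- If you unregister anyway, the cluster is put on a blacklist and cannot be re-registered.
- If you want to re-register, escalate to support.
- If you really must unregister, run cluster destroy on Prism Central (operated from the Prism UI).
- Once the Prism Central cluster is rebuilt, it can be connected again.
In my case I cannot log in to Prism in the first place, so I cannot follow the documented procedure.
So I figured that running cluster destroy from the CLI and then rebuilding the cluster might work, and gave it a try.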
nutanix@PCVM:~$ cluster destroy
2024-11-22 19:12:23,363Z INFO MainThread zookeeper_session.py:136 Using multithreaded Zookeeper client library: 1
2024-11-22 19:12:23,367Z INFO MainThread zookeeper_session.py:248 Parsed cluster id: 1106392137852969902, cluster incarnation id: 1215556561207248468
2024-11-22 19:12:23,367Z INFO MainThread zookeeper_session.py:270 cluster is attempting to connect to Zookeeper, host port list zk1:9876
2024-11-22 19:12:23,374Z INFO Dummy-1 zookeeper_session.py:840 ZK session establishment complete, sessionId=0x193553ebf380022, negotiated timeout=20 secs
2024-11-22 19:12:23,378Z INFO Dummy-2 zookeeper_session.py:941 Calling zookeeper_close and invalidating zhandle
2024-11-22 19:12:23,383Z INFO MainThread cluster:3296 Executing action destroy on SVMs 192.168.0.110
2024-11-22 19:12:23,384Z WARNING MainThread genesis_utils.py:345 Deprecated: use util.cluster.info.get_node_uuid() instead
2024-11-22 19:12:23,402Z INFO MainThread cluster:3343
***** CLUSTER NAME *****
Unnamed
This operation will completely erase all data and all metadata, and each node will no longer belong to a cluster. Do you want to proceed? (Y/[N]): Y
2024-11-22 19:12:35,852Z INFO MainThread zookeeper_session.py:136 Using multithreaded Zookeeper client library: 1
2024-11-22 19:12:35,853Z WARNING MainThread zookeeper_session.py:220 Going to replace the passed host_port_list: zk1:9876,zk2:9876,zk3:9876 with the ZOOKEEPER_HOST_PORT_LIST environment variable: zk1:9876 because the passed host_port_list appears to have been copied from FLAGS.zookeeper_host_port_list
2024-11-22 19:12:35,854Z INFO MainThread zookeeper_session.py:248 Parsed cluster id: 1106392137852969902, cluster incarnation id: 1215556561207248468
2024-11-22 19:12:35,854Z INFO MainThread zookeeper_session.py:270 cluster is attempting to connect to Zookeeper, host port list zk1:9876
2024-11-22 19:12:35,862Z INFO Dummy-3 zookeeper_session.py:840 ZK session establishment complete, sessionId=0x193553ebf380023, negotiated timeout=20 secs
2024-11-22 19:12:35,863Z INFO MainThread cluster:1945 Cluster destroy initiated by ssh client IP: 192.168.0.134
2024-11-22 19:12:43,431Z INFO MainThread cluster:1824 Unconfigured VIP Monitor service on svm_ips: ['192.168.0.110']
2024-11-22 19:12:44,087Z INFO MainThread cluster:1828 Unconfigured zk mappings on nodes : ['192.168.0.110']
2024-11-22 19:13:20,685Z INFO MainThread cluster:443 Restarted Genesis on 192.168.0.110.
2024-11-22 19:13:20,685Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:13:23,030Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:13:25,444Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:13:27,850Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:13:30,242Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:13:32,521Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:13:34,820Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:13:37,238Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:13:39,581Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:13:41,968Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:13:44,252Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:13:46,667Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:13:49,081Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:13:51,443Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:13:53,727Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:13:56,014Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:13:58,523Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:14:00,835Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:14:03,214Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:14:05,512Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:14:08,035Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:14:10,362Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:14:12,669Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:14:15,014Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:14:17,299Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:14:19,573Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:14:21,842Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:14:24,136Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:14:26,396Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:14:28,869Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:14:31,274Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:14:33,547Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:14:35,818Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:14:38,086Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:14:40,394Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:14:42,677Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:14:45,055Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:14:47,422Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:14:49,687Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:14:52,132Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:14:54,497Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:14:56,799Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:14:59,215Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:15:01,539Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:15:03,922Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:15:06,208Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:15:08,515Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:15:10,799Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:15:13,093Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:15:15,354Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:15:17,634Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:15:19,933Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:15:22,192Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:15:24,527Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:15:26,895Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:15:29,177Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:15:31,569Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:15:33,884Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:15:36,239Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:15:38,634Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:15:40,968Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:15:43,320Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:15:45,653Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:15:48,024Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:15:50,473Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:15:52,879Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:15:55,178Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:15:57,524Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:15:59,817Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:16:02,250Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:16:04,551Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:16:06,836Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:16:09,051Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:16:11,258Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:16:13,453Z INFO MainThread cluster:314 Checking for /home/nutanix/.node_unconfigure to disappear on ips ['192.168.0.110']
2024-11-22 19:16:13,720Z INFO Dummy-4 zookeeper_session.py:941 Calling zookeeper_close and invalidating zhandle
2024-11-22 19:16:13,720Z INFO MainThread cluster:3459 Success!
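Most of the destroy log above is the same poll on /home/nutanix/.node_unconfigure repeated every couple of seconds. To see at a glance how long the run took and how many polls there were, a small Python sketch like the following could be used on a saved copy of the output. The function name summarize_destroy_log and the idea of saving the log to a string are my own assumptions; the timestamp format and message text are taken from the log above.

```python
import re
from datetime import datetime

# Timestamp format seen in the log above, e.g. "2024-11-22 19:12:35,863Z"
TIMESTAMP = re.compile(r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3})Z")
# One match per poll message; \s+ tolerates a message wrapped across lines
POLL = re.compile(r"Checking for\s+/home/nutanix/\.node_unconfigure")


def summarize_destroy_log(log_text: str) -> dict:
    """Summarize a saved `cluster destroy` log: duration, poll count, success flag."""
    stamps = [
        datetime.strptime(f"{date_time}.{millis}", "%Y-%m-%d %H:%M:%S.%f")
        for date_time, millis in TIMESTAMP.findall(log_text)
    ]
    if not stamps:
        raise ValueError("no timestamps found in log text")
    return {
        # Time between the earliest and the latest timestamp in the text
        "elapsed_seconds": (max(stamps) - min(stamps)).total_seconds(),
        "poll_count": len(POLL.findall(log_text)),
        "succeeded": "Success!" in log_text,
    }
```

Because it searches the whole text rather than going line by line, it also works on output that was pasted as one long wrapped line.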
After that, I went on to rebuild the cluster.
nutanix@PCVM:~$ cluster --cluster_function_list="multicluster" -s 192.168.0.110 start
2024-11-22 19:32:46,882Z CRITICAL MainThread cluster:3235 Cluster is currently unconfigured. Please create the cluster.
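The cluster could not be rebuilt.
Or so I thought, until I noticed I had typed start. I changed it to create (cluster --cluster_function_list="multicluster" -s 192.168.0.110 create) and ran it again.
This time the cluster was created successfully.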
The state of the cluster: start
Lockdown mode: Disabled
CVM: 192.168.0.110 Up, ZeusLeader
IkatProxy UP [45036, 45292, 45293, 45294]
Zeus UP [43084, 43143, 43144, 43145, 43155, 43172]
Scavenger UP [45248, 45572, 45573, 45575]
SysStatCollector UP [47097, 47608, 47609, 47610]
IkatControlPlane UP [47121, 47376, 47377, 47378]
Medusa UP [47261, 47578, 47579, 47694, 48495]
DynamicRingChanger UP [52854, 53146, 53147, 53290]
InsightsDB UP [52952, 53467, 53469, 53617]
Athena UP [53012, 53372, 53373, 53376]
Mercury UP [53056, 53603, 53604, 53641]
Mantle UP [53091, 53571, 53572, 53611]
MessageBus UP [55096, 55283, 55284]
InsightsDataTransfer UP [56208, 56515, 56516, 56532, 56534, 56535, 56536, 56537, 56538, 56539, 56540, 56541, 56542, 56543]
GoErgon UP [56231, 56716, 56717, 56834]
Prism UP [56255, 56820, 56821, 57059, 57856, 57916]
Adonis UP [56304, 56946, 56948, 56950, 56997]
AsyncProcessor UP [56399, 56998, 56999, 57000, 57002]
AlertManager UP [56545, 57575, 57576, 58050]
Catalog UP [56598, 57295, 57296, 57298, 57350]
Atlas UP [56796, 57398, 57399, 57400]
Uhura UP [56876, 57494, 57495, 57496]
ClusterConfig UP [57042, 57593, 57594, 57595]
APLOSEngine UP [57187, 57810, 57811, 57812]
APLOS UP [58158, 58538, 58539, 58541]
StatsGateway UP [58215, 58608, 58609, 58668]
DomainManager UP [58349, 58780, 58781]
PlacementSolver UP [58611, 59079, 59080, 59081, 59150]
GoLazan UP [58661, 59153, 59154, 59155]
Lazan UP [58798, 59259, 59260, 59261]
Kanon UP [59087, 59423, 59424, 59425]
Polaris UP [59307, 59674, 59676, 59898]
Metropolis UP [59375, 59811, 59812, 59904]
Flow UP [59485, 59978, 59979, 59980, 60004]
Magneto UP [59583, 60098, 60099, 60100]
Search UP [59743, 60265, 60267, 60269, 60293, 60294, 60295, 60497]
XPlay UP [59870, 60281, 60282, 60391]
XTrim UP [59972, 60462, 60463, 60464]
DataProviderManager UP [60060, 60668, 60669, 60732, 60885]
Pollux UP [60147, 60907, 60908, 60989, 60990]
ClusterHealth UP [60285, 61047, 61048]
Neuron UP [60534, 61206, 61207, 61763, 61814, 61815, 61829, 61830]
Categories UP [60752, 61509, 61510, 61545, 61632]
ClusterManagement UP [61125, 61593, 61594, 61595]
2024-11-22 19:48:06,371Z INFO MainThread cluster:3459 Success!
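With this many services in the status output, it is easy to miss one that did not come up. A rough Python sketch for checking a saved copy of the output follows. The function names are made up for this note, and the pattern is based only on the "Name UP [pids]" lines shown above; that a stopped service would be printed as "Name DOWN []" is an assumption on my part.

```python
import re

# Matches e.g. "Zeus UP [43084, 43143]"; the "DOWN []" form is an assumption
SERVICE = re.compile(r"(\w+)\s+(UP|DOWN)\s+\[([\d,\s]*)\]")


def service_states(status_text: str) -> dict:
    """Map each service name found in the status text to "UP" or "DOWN"."""
    return {name: state for name, state, _pids in SERVICE.findall(status_text)}


def services_not_up(status_text: str) -> list:
    """Return the names of services whose state is not UP, in order of appearance."""
    return [name for name, state in service_states(status_text).items() if state != "UP"]
```

For the output above, services_not_up would return an empty list, since every listed service is UP.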
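Unfortunately, though, the status still shows Disconnect.
Still, I can now log in to Prism Central, so I wanted to try destroying the cluster from there.
Once I logged in, I found that no cluster was registered (well, of course).
If so, unregistering from Prism Element might not trigger the blacklist after all, so I went ahead with the unregistration.
Run the following command from any CVM.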
ncli multicluster remove-from-multicluster external-ip-address-or-svm-ips=pc-name-or-ip username=pc-username password=pc-password
Unregistering the PE from PC is not a supported workflow. Are you sure you want to proceed(Y/N) ?(Refer to KB-15679 for details): Y
Processing request for cluster unregistration. This operation may take a while.
Cluster data cleanup will not be done as part of the current operation.
To ensure proper completion of the operation, execute below command.
multicluster remove-from-multicluster cluster-id=<Cluster UUID> username=<Username> password=<Password> external-ip-address-or-svm-ips=<Cluster IP> cleanup_only=true
Error: {"correctiveMessages":["This cluster is not added to the remote PC. Execute unregistration on this machine using local_only"]}
It looks like the unregistration fails because the Prism Central cluster UUID is unknown.
I'll try again with local_only=true added.
Unregistering the PE from PC is not a supported workflow. Are you sure you want to proceed(Y/N) ?(Refer to KB-15679 for details): Y
Processing request for cluster unregistration. This operation may take a while.
Cluster data cleanup will not be done as part of the current operation.
To ensure proper completion of the operation, execute below command.
multicluster remove-from-multicluster cluster-id=<Cluster UUID> username=<Username> password=<Password> external-ip-address-or-svm-ips=<Cluster IP> cleanup_only=true
Error: {"correctiveMessages":["Prism Element has 1 protected entity(ies). Refer to KB 12749 for resolution."]}
That did not work either.
I followed the section of the KB mentioned in the output, shown below.
Scenario 2: Unregister PC from PE (PC and PE connectivity is either not normal or PC is deleted without being unregistered)
nutanix@CVM:192.168.0.112:~$ ncli multicluster get-cluster-state
Registered Cluster Count: 1
Cluster Id : 10de8640-8b27-4654-8f5a-b1c99702e3ae
Cluster Name : Unnamed
Is Multicluster : true
Controller VM IP Addre... : [192.168.0.110]
External or Masqueradi... :
Cluster FQDN :
Controller VM NAT IP A... :
Marked for Removal : false
Remote Connection Exists : false
nutanix@CVM:192.168.0.112:~$ ncli multicluster delete-cluster-state cluster-id=10de8640-8b27-4654-8f5a-b1c99702e3ae
Cluster data cleanup will not be done as part of the current operation.
To ensure proper completion of the operation, execute below command.
multicluster remove-from-multicluster cluster-id=<Cluster UUID> username=<Username> password=<Password> external-ip-address-or-svm-ips=<Cluster IP> cleanup_only=true
Cluster state deleted successfully
nutanix@CVM:192.168.0.112:~$ allssh genesis stop prism && cluster start
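In the steps above, the cluster ID passed to delete-cluster-state was copied by hand from the get-cluster-state output. A small Python sketch of pulling it out of saved output follows, as a way to avoid copy mistakes. The function names are hypothetical, the field labels are taken from the output above, and the sketch only builds the command string for review; it does not run anything.

```python
import re

# Standard 8-4-4-4-12 hexadecimal UUID layout, as in the Cluster Id above
UUID = r"[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}"


def parse_cluster_state(output: str) -> dict:
    """Extract the cluster ID and remote-connection flag from get-cluster-state output."""
    cluster_id = re.search(rf"Cluster Id\s*:\s*({UUID})", output)
    remote = re.search(r"Remote Connection Exists\s*:\s*(true|false)", output)
    if cluster_id is None:
        raise ValueError("no Cluster Id found in output")
    return {
        "cluster_id": cluster_id.group(1),
        "remote_connection_exists": remote is not None and remote.group(1) == "true",
    }


def delete_cluster_state_command(output: str) -> str:
    """Build (but do not run) the delete-cluster-state command used above."""
    state = parse_cluster_state(output)
    return f"ncli multicluster delete-cluster-state cluster-id={state['cluster_id']}"
```

Printing the result of delete_cluster_state_command and checking it by eye before pasting it into the CVM shell keeps the destructive step manual.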
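And then, finally...
Prism Central showed the cluster under its original registered name.
I tried clicking on it, but the information does not appear to be synchronized.
Prism Element was back to the state it was in before registration.
When I registered it again, everything went back to how it was.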
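Afterwards
The connection to Prism Central is indeed restored, but a few problems(?) have come up.
- No NCM license is assigned and the marketplace cannot be accessed → caused by MSP failing to be enabled.
- LCM cannot be run → caused by the DNS server settings having been reset.
- The NTP and DNS server settings were reset → just configure them again.
- Prism Element shows the PCVM's vCPU and memory as they were before unregistration, but internally the PCVM seems to recognize only its initial resources. So run Update from the VM dashboard. You do not need to change the resource values when doing so.
As for MSP, uploading the latest version also fails. There is nothing more I can do here, so I plan to rebuild.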