Upgrade to 3.5.2 and 7.5.2
1. Prerequisites
Before proceeding with this upgrade, verify that the following prerequisites are met.
- RDAF Deployment CLI version: 1.3.0
- Infra Services tag: 1.0.3 / 1.0.3.2
- Platform Services and RDA Worker tag: 3.5
- OIA Application Services tag: 7.5
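A minimal sketch for checking the CLI prerequisite mechanically: compare the installed version against the required minimum with `sort -V`. The `installed` value below is a placeholder; substitute the version actually reported by your deployment CLI.

```shell
# Sketch (not part of the official procedure): generic semantic-version check.
required="1.3.0"
installed="1.3.0"   # placeholder; replace with the version your CLI reports
# sort -V orders versions numerically; if the required version sorts first
# (or equal), the installed version meets the minimum.
lowest=$(printf '%s\n' "$required" "$installed" | sort -V | head -n1)
if [ "$lowest" = "$required" ]; then
  echo "CLI version OK"
else
  echo "CLI version too old"
fi
```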
2. Download the New Docker Images
Download the new Docker image tags for the RDAF Platform and OIA (AIOps) Application services and wait until all of the images have been downloaded.
To fetch the images from the registry, use the below command.
Run the below command to verify that the above-mentioned tags have been downloaded for all of the RDAF Platform and OIA (AIOps) Application services.
Please make sure the 3.5.2 image tag is downloaded for the below RDAF Platform (AIOps) services:
- rda_api_server
- rda-worker-all
Please make sure the 7.5.2 image tag is downloaded for the below RDAF OIA (AIOps) Application services:
- rda-alert-ingester
- rda-alert-processor
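As a quick sanity check, the downloaded image list can be filtered for the expected tags. A sketch, assuming a Docker-based node where the list would come from `docker images --format '{{.Repository}}:{{.Tag}}'`; the sample list and registry prefix below are illustrative only:

```shell
# Illustrative only: count images carrying the expected upgrade tags.
# "example-registry" is a made-up prefix; the repository names mirror the
# services listed above.
images='example-registry/rda-api-server:3.5.2
example-registry/rda-worker-all:3.5.2
example-registry/rda-alert-ingester:7.5.2
example-registry/rda-alert-processor:7.5.2'
# Each expected service should contribute one matching line.
printf '%s\n' "$images" | grep -cE ':(3\.5\.2|7\.5\.2)$'
```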
3. Upgrade Steps
3.1 Upgrade Platform Services
Step-1: Run the below command to initiate upgrading the RDAF Platform API Server.
Note
After upgrading the above-mentioned service, use the following commands to verify that the service is up and running.
As the upgrade procedure is non-disruptive, it puts the currently running PODs into Terminating state and the newer version PODs into Pending state.
Step-2: Run the below command to check the status of the existing and newer PODs and make sure at least one instance of each Platform service is in Terminating state.
Step-3: Run the below command to put all Terminating RDAF platform service PODs into maintenance mode. It lists the POD IDs of the platform services along with the rdac maintenance command that is required to put them into maintenance mode.
Step-4: Copy and paste the rdac maintenance command as shown below.
Step-5: Run the below command to verify the maintenance mode status of the RDAF platform services.
Step-6: Run the below command to delete the Terminating RDAF platform service PODs.
for i in $(kubectl get pods -n rda-fabric -l app_category=rdaf-platform | grep 'Terminating' | awk '{print $1}'); do kubectl delete pod "$i" -n rda-fabric --force; done
Please wait till the new platform service is in Up state, then run the below command to verify its status and make sure the below service is running with version 3.5.2:
- rda-api-server
+--------------------+----------------+-----------------+--------------+-------+
| Name | Host | Status | Container Id | Tag |
+--------------------+----------------+-----------------+--------------+-------+
| rda-api-server | 192.168.131.44 | Up 1 Days ago | hcb83e98940n | 3.5.2 |
| rda-api-server | 192.168.131.45 | Up 1 Days ago | g9a445a35879 | 3.5.2 |
| rda-registry | 192.168.131.44 | Up 3 Weeks ago | 8596787a1fc7 | 3.5 |
| rda-registry | 192.168.131.45 | Up 3 Weeks ago | 720f65e42b33 | 3.5 |
| rda-identity | 192.168.131.45 | Up 3 Weeks ago | 80d255d651a4 | 3.5 |
| rda-identity | 192.168.131.47 | Up 3 Weeks ago | t49dc2dd1607 | 3.5 |
+--------------------+----------------+-----------------+--------------+-------+
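The tag check can also be done mechanically. A small sketch, assuming the status table has been captured into a variable (the sample rows are copied from the output above): it prints the Name column of any row whose Tag column is still on the old 3.5 tag.

```shell
# Sketch: find rows whose Tag column (field 6 when split on '|') is not yet
# 3.5.2. Sample rows copied from the status table above.
status='| rda-api-server | 192.168.131.44 | Up 1 Days ago  | hcb83e98940n | 3.5.2 |
| rda-registry   | 192.168.131.44 | Up 3 Weeks ago | 8596787a1fc7 | 3.5   |'
# gsub strips padding from the Tag field before comparing.
printf '%s\n' "$status" | awk -F'|' '{gsub(/ /,"",$6)} $6 != "3.5.2" {print $2}'
```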
Run the below command to check that all services have an ok status and do not report any failure messages.
+-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-------------------------------------------------------------+
| Cat | Pod-Type | Host | ID | Site | Health Parameter | Status | Message |
|-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-------------------------------------------------------------|
| rda_infra | api-server | rda-api-serv | 3c36575d | | service-status | ok | |
| rda_infra | api-server | rda-api-serv | 3c36575d | | minio-connectivity | ok | |
| rda_infra | api-server | rda-api-serv | 1fd1778b | | service-status | ok | |
| rda_infra | api-server | rda-api-serv | 1fd1778b | | minio-connectivity | ok | |
| rda_infra | asm | rda-asm-f6b8 | 39a53ac4 | | service-status | ok | |
| rda_infra | asm | rda-asm-f6b8 | 39a53ac4 | | minio-connectivity | ok | |
| rda_infra | asm | rda-asm-f6b8 | 199a31d2 | | service-status | ok | |
| rda_infra | asm | rda-asm-f6b8 | 199a31d2 | | minio-connectivity | ok | |
| rda_infra | scheduler | rda-schedule | ee7565aa | | service-status | ok | |
| rda_infra | scheduler | rda-schedule | ee7565aa | | minio-connectivity | ok | |
| rda_infra | scheduler | rda-schedule | ee7565aa | | DB-connectivity | ok | |
| rda_infra | scheduler | rda-schedule | ee7565aa | | scheduler-webserver-connectivity | ok | |
| rda_infra | scheduler | rda-schedule | 779a624d | | service-status | ok | |
| rda_infra | scheduler | rda-schedule | 779a624d | | minio-connectivity | ok | |
| rda_infra | scheduler | rda-schedule | 779a624d | | DB-connectivity | ok | |
+-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-------------------------------------------------------------+
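For large healthcheck tables, scanning by eye is error-prone. A sketch, assuming the table has been captured into a variable (sample rows are copied from the output above): it prints any row whose Status column is not "ok", or a summary line when everything is healthy.

```shell
# Sketch: flag healthcheck rows whose Status column (field 8 when split on
# '|') is anything other than "ok". Sample rows copied from above.
health='| rda_infra | api-server | rda-api-serv | 3c36575d | | service-status  | ok | |
| rda_infra | scheduler  | rda-schedule | ee7565aa | | DB-connectivity | ok | |'
bad=$(printf '%s\n' "$health" | awk -F'|' '{gsub(/ /,"",$8)} $8 != "ok" {print}')
if [ -z "$bad" ]; then echo "all health checks ok"; else printf '%s\n' "$bad"; fi
```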
Warning
For a non-Kubernetes deployment, upgrading the RDAF Platform and AIOps application services is a disruptive operation when the rolling-upgrade option is not used. Please schedule a maintenance window before upgrading the RDAF Platform and AIOps services to a newer version.
Run the below command to initiate upgrading the RDAF Platform API Server services with zero downtime.
Note
The timeout <10> in the above command is specified in seconds.
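The exact platform command depends on your environment. As a sketch only: mirroring the OIA rolling-upgrade syntax documented later in this guide (`rdaf app upgrade OIA --tag 7.5.2 --rolling-upgrade ... --timeout 10`), the platform equivalent would plausibly take the same `--tag`, `--rolling-upgrade`, and `--timeout` flags. The snippet below only assembles and prints the candidate command for review; it does not execute it.

```shell
# Assumption: the platform upgrade accepts the same flags as the documented
# OIA rolling upgrade. Assemble and print the command instead of running it.
TAG="3.5.2"
TIMEOUT="10"   # seconds to wait between per-VM upgrade steps
cmd="rdaf platform upgrade --tag ${TAG} --rolling-upgrade --timeout ${TIMEOUT}"
echo "$cmd"   # review before executing on the deployment host
```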
Note
The rolling-upgrade option upgrades the Platform services running in high-availability mode on one VM at a time in sequence. It completes the upgrade of Platform services running on VM-1 before upgrading them on VM-2, followed by VM-3, and so on.
During this upgrade sequence, RDAF platform continues to function without any impact to the application traffic.
After completing the Platform services upgrade on all VMs, it asks for user confirmation to delete the older version Platform service PODs. The user has to enter yes to delete the old pods.
2024-09-19 07:19:09,615 [rdaf.component.platform] INFO - Gathering platform container details.
2024-09-19 07:19:09,684 [rdaf.component.platform] INFO - Gathering rdac pod details.
+----------+------------+---------+------------------+--------------+-------------+------------+
| Pod ID | Pod Type | Version | Age | Hostname | Maintenance | Pod Status |
+----------+------------+---------+------------------+--------------+-------------+------------+
| 70809963 | api-server | 3.5 | 29 days, 5:12:27 | 4de526af74d9 | None | True |
+----------+------------+---------+------------------+--------------+-------------+------------+
Continue moving above pods to maintenance mode? [yes/no]: yes
2024-09-19 07:22:21,696 [rdaf.component.platform] INFO - Initiating Maintenance Mode...
2024-09-19 07:22:28,763 [rdaf.component.platform] INFO - Waiting for services to be moved to maintenance.
2024-09-19 07:22:52,287 [rdaf.component.platform] INFO - Following container are in maintenance mode
+----------+------------+---------+------------------+--------------+-------------+------------+
| Pod ID | Pod Type | Version | Age | Hostname | Maintenance | Pod Status |
+----------+------------+---------+------------------+--------------+-------------+------------+
| 70809963 | api-server | 3.5 | 29 days, 5:15:54 | 4de526af74d9 | maintenance | False |
+----------+------------+---------+------------------+--------------+-------------+------------+
2024-09-19 07:22:52,289 [rdaf.component.platform] INFO - Waiting for timeout of 10 seconds...
2024-09-19 07:23:02,298 [rdaf.component.platform] INFO - Upgrading service: rda_api_server on host 192.168.107.61
[+] Running 1/1
⠿ Container platform-rda_api_server-1 Started 11.7s
2024-09-19 07:23:14,300 [rdaf.component.platform] INFO - Waiting for upgraded containers to join pods
2024-09-19 07:23:14,301 [rdaf.component.platform] INFO - Checking if the upgraded components '['rda_api_server']' has joined the rdac pods...
2024-09-19 07:23:28,294 [rdaf.component.platform] INFO - Waiting for platform components to be up and running... retry 1
+----------+------------+---------+------------------+--------------+-------------+------------+
| Pod ID | Pod Type | Version | Age | Hostname | Maintenance | Pod Status |
+----------+------------+---------+------------------+--------------+-------------+------------+
| 7366effa | api-server | 3.5 | 29 days, 5:12:17 | 1da78e0f77af | None | True |
+----------+------------+---------+------------------+--------------+-------------+------------+
Continue moving above pods to maintenance mode? [yes/no]: yes
2024-09-19 07:25:27,357 [rdaf.component.platform] INFO - Initiating Maintenance Mode...
2024-09-19 07:25:34,116 [rdaf.component.platform] INFO - Waiting for services to be moved to maintenance.
2024-09-19 07:25:57,912 [rdaf.component.platform] INFO - Following container are in maintenance mode
+----------+------------+---------+------------------+--------------+-------------+------------+
| Pod ID | Pod Type | Version | Age | Hostname | Maintenance | Pod Status |
+----------+------------+---------+------------------+--------------+-------------+------------+
| 7366effa | api-server | 3.5 | 29 days, 5:18:58 | 1da78e0f77af | maintenance | False |
+----------+------------+---------+------------------+--------------+-------------+------------+
2024-09-19 07:25:57,913 [rdaf.component.platform] INFO - Waiting for timeout of 10 seconds...
2024-09-19 07:26:07,921 [rdaf.component.platform] INFO - Upgrading service: rda_api_server on host 192.168.107.62
[+] Running 1/1
⠿ Container platform-rda_api_server-1 Started 14.4s
2024-09-19 07:26:26,056 [rdaf.component.platform] INFO - Waiting for upgraded containers to join pods
2024-09-19 07:26:26,056 [rdaf.component.platform] INFO - Checking if the upgraded components '['rda_api_server']' has joined the rdac pods...
2024-09-19 07:26:38,909 [rdaf.component.platform] INFO - Waiting for platform components to be up and running... retry 1
+--------------------------+----------------+--------------+--------------+-----------+
| Name | Host | Status | Container Id | Tag |
+--------------------------+----------------+--------------+--------------+-----------+
| rda_api_server | 192.168.107.61 | Up 5 minutes | 31881b61be72 | 3.5.2 |
| | | | | |
| rda_api_server | 192.168.107.62 | Up 2 minutes | 428fb3b16e29 | 3.5.2 |
| | | | | |
| rda_registry | 192.168.107.61 | Up 4 weeks | acb72b6eb387 | 3.5 |
| | | | | |
| rda_registry | 192.168.107.62 | Up 4 weeks | b7d19acf34de | 3.5 |
| | | | | |
| rda_scheduler | 192.168.107.61 | Up 4 days | 5153af9efed0 | 3.5 |
| | | | | |
| rda_scheduler | 192.168.107.62 | Up 4 days | 357a47ef8854 | 3.5 |
+--------------------------+----------------+--------------+--------------+-----------+
Run the below command to initiate upgrading the RDAF Platform API Server services without zero downtime (i.e., without the rolling-upgrade option).
Note
After upgrading the above-mentioned services, use the following commands to verify that the services are up and running.
Please wait till the new platform service is in Up state, then run the below command to verify its status and make sure the below service is running with version 3.5.2:
- rda_api_server
+--------------------------+----------------+------------+--------------+-------+
| Name | Host | Status | Container Id | Tag |
+--------------------------+----------------+------------+--------------+-------+
| rda_api_server | 192.168.107.61 | Up 2 hours | 3b0542719619 | 3.5.2 |
| rda_api_server | 192.168.107.62 | Up 2 hours | t4404cffdc7d | 3.5.2 |
| rda_registry | 192.168.107.61 | Up 2 hours | hdb3e3b7e297 | 3.5 |
| rda_registry | 192.168.107.62 | Up 2 hours | fadfc2db2733 | 3.5 |
| rda_scheduler | 192.168.107.61 | Up 2 hours | 6fbdaf30ad04 | 3.5 |
| rda_scheduler | 192.168.107.62 | Up 2 hours | hf3280d11a4y | 3.5 |
| rda_collector | 192.168.107.61 | Up 2 hours | b0e5d30e3abf | 3.5 |
| rda_collector | 192.168.107.62 | Up 2 hours | 6d6b8d14add8 | 3.5 |
+--------------------------+----------------+------------+--------------+-------+
Run the below command to check that all services have an ok status and do not report any failure messages.
+-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-------------------------------------------------------------+
| Cat | Pod-Type | Host | ID | Site | Health Parameter | Status | Message |
|-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-------------------------------------------------------------|
| rda_app | alert-ingester | 7f75047e9e44 | daa8c414 | | service-status | ok | |
| rda_app | alert-ingester | 7f75047e9e44 | daa8c414 | | minio-connectivity | ok | |
| rda_app | alert-ingester | 7f75047e9e44 | daa8c414 | | service-dependency:configuration-service | ok | 2 pod(s) found for configuration-service |
| rda_app | alert-ingester | 7f75047e9e44 | daa8c414 | | service-initialization-status | ok | |
| rda_app | alert-ingester | 7f75047e9e44 | daa8c414 | | kafka-connectivity | ok | Cluster=NTc1NWU1MTQxYmY3MTFlZg, Broker=1, Brokers=[1, 2, 3] |
| rda_app | alert-ingester | f9ec55862be0 | f9b9231c | | service-status | ok | |
| rda_app | alert-ingester | f9ec55862be0 | f9b9231c | | minio-connectivity | ok | |
| rda_app | alert-ingester | f9ec55862be0 | f9b9231c | | service-dependency:configuration-service | ok | 2 pod(s) found for configuration-service |
| rda_app | alert-ingester | f9ec55862be0 | f9b9231c | | service-initialization-status | ok | |
| rda_app | alert-ingester | f9ec55862be0 | f9b9231c | | kafka-connectivity | ok | Cluster=NTc1NWU1MTQxYmY3MTFlZg, Broker=3, Brokers=[1, 2, 3] |
| rda_app | alert-processor | c6cc7b04ab33 | b4ebfb06 | | service-status | ok | |
| rda_app | alert-processor | c6cc7b04ab33 | b4ebfb06 | | minio-connectivity | ok | |
+-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-------------------------------------------------------------+
3.2 Upgrade RDA Worker Services
Step-1: Please run the below command to initiate upgrading the RDA Worker service PODs.
Step-2: Run the below command to check the status of the existing and newer PODs and make sure at least one instance of each RDA Worker service POD is in Terminating state.
NAME READY STATUS RESTARTS AGE
rda-worker-5b5cfcf8f7-bhqsl 1/1 Running 0 11h
rda-worker-5b5cfcf8f7-tlns2 1/1 Running 0 11h
Step-3: Run the below command to put all Terminating RDAF worker service PODs into maintenance mode. It lists the POD IDs of the RDA worker services along with the rdac maintenance command that is required to put them into maintenance mode.
Step-4: Copy and paste the rdac maintenance command as shown below.
Step-5: Run the below command to verify the maintenance mode status of the RDAF worker services.
Step-6: Run the below command to delete the Terminating RDAF worker service PODs.
for i in $(kubectl get pods -n rda-fabric -l app_component=rda-worker | grep 'Terminating' | awk '{print $1}'); do kubectl delete pod "$i" -n rda-fabric --force; done
Note
Wait for 120 seconds between each RDAF worker service upgrade, repeating the above steps from Step-2 to Step-6 for the rest of the RDAF worker service PODs.
Step-7: Please wait for 120 seconds to let the newer version RDA Worker service PODs join the RDA Fabric appropriately. Run the below commands to verify the status of the newer RDA Worker service PODs.
+------------+----------------+-----------------+--------------+-------+
| Name | Host | Status | Container Id | Tag |
+------------+----------------+-----------------+--------------+-------+
| rda-worker | 192.168.108.18 | Up 12 Hours ago | 41457849dv5n | 3.5.2 |
| rda-worker | 192.168.108.19 | Up 12 Hours ago | 5g78d6b80689 | 3.5.2 |
+------------+----------------+-----------------+--------------+-------+
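The fixed 120-second settling period recommended in the steps above can be wrapped in a small helper so it is applied consistently between upgrades. The function name is ours, not part of the RDAF tooling:

```shell
# Helper (ours, not part of RDAF): pause for a settling period so newly
# started worker PODs can rejoin the RDA Fabric before the next check.
wait_for_rejoin() {
  secs="${1:-120}"   # default matches the 120 seconds recommended above
  echo "waiting ${secs}s for workers to rejoin the fabric"
  sleep "$secs"
  echo "settling period complete"
}
```

Call `wait_for_rejoin 120` between the Step-6 pod deletion and the Step-7 verification.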
Please run the below command to initiate upgrading the RDA Worker service with zero downtime.
Please run the below command to initiate upgrading the RDA Worker service without zero downtime.
Please wait for 120 seconds to let the newer version of RDA Worker service containers join the RDA Fabric appropriately. Run the below commands to verify the status of the newer RDA Worker service containers.
+-------+----------+-----------+--------------+----------+-------------+----------+------+------------+-------------+------------+
| Cat   | Pod-Type | Pod-Ready | Host         | ID       | Site        | Age      | CPUs | Memory(GB) | Active Jobs | Total Jobs |
|-------+----------+-----------+--------------+----------+-------------+----------+------+------------+-------------+------------|
| Infra | worker   | True      | 6eff605e72c4 | a318f394 | rda-site-01 | 13:45:13 | 4    | 31.21      | 0           | 0          |
| Infra | worker   | True      | ae7244d0d10a | 554c2cd8 | rda-site-01 | 13:40:40 | 4    | 31.21      | 0           | 0          |
+-------+----------+-----------+--------------+----------+-------------+----------+------+------------+-------------+------------+
+------------+----------------+------------+--------------+-------+
| Name | Host | Status | Container Id | Tag |
+------------+----------------+------------+--------------+-------+
| rda_worker | 192.168.133.96 | Up 2 hours | 12572ty7jydf | 3.5.2 |
| rda_worker | 192.168.133.92 | Up 2 hours | vbn67b947yg7 | 3.5.2 |
+------------+----------------+------------+--------------+-------+
+-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-----------------------------------------------------------------------------------------------------------------------------+
| Cat | Pod-Type | Host | ID | Site | Health Parameter | Status | Message |
|-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-----------------------------------------------------------------------------------------------------------------------------|
| rda_infra | api-server | 1b0542719618 | 1845ae67 | | service-status | ok | |
| rda_infra | api-server | 1b0542719618 | 1845ae67 | | minio-connectivity | ok | |
| rda_infra | api-server | d4404cffdc7a | a4cfdc6d | | service-status | ok | |
| rda_infra | api-server | d4404cffdc7a | a4cfdc6d | | minio-connectivity | ok | |
| rda_infra | asm | 8d3d52a7a475 | 418c9dc1 | | service-status | ok | |
| rda_infra | asm | 8d3d52a7a475 | 418c9dc1 | | minio-connectivity | ok | |
| rda_infra | asm | ab172a9b8229 | 2ac1d67a | | service-status | ok | |
| rda_infra | asm | ab172a9b8229 | 2ac1d67a | | minio-connectivity | ok | |
| rda_app | asset-dependency | 6ac69ca1085c | c2e9dcb9 | | service-status | ok | |
| rda_app | asset-dependency | 6ac69ca1085c | c2e9dcb9 | | minio-connectivity | ok | |
| rda_app | asset-dependency | 58a5f4f460d3 | 0b91caac | | service-status | ok | |
| rda_app | asset-dependency | 58a5f4f460d3 | 0b91caac | | minio-connectivity | ok | |
| rda_app | authenticator | 9011c2aef498 | 9f7efdc3 | | service-status | ok | |
| rda_app | authenticator | 9011c2aef498 | 9f7efdc3 | | minio-connectivity | ok | |
| rda_app | authenticator | 9011c2aef498 | 9f7efdc3 | | DB-connectivity | ok | |
| rda_app | authenticator | 148621ed8c82 | dbf16b82 | | service-status | ok | |
| rda_app | authenticator | 148621ed8c82 | dbf16b82 | | minio-connectivity | ok | |
| rda_app | authenticator | 148621ed8c82 | dbf16b82 | | DB-connectivity | ok | |
| rda_app | cfx-app-controller | 75ec0f30cfa3 | 1198fdee | | service-status | ok | |
| rda_app | cfx-app-controller | 75ec0f30cfa3 | 1198fdee | | minio-connectivity | ok | |
| rda_app | cfx-app-controller | 75ec0f30cfa3 | 1198fdee | | service-initialization-status | ok | |
| rda_app | cfx-app-controller | 75ec0f30cfa3 | 1198fdee | | DB-connectivity | ok | |
+-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-----------------------------------------------------------------------------------------------------------------------------+
3.3 Upgrade OIA Application Services
Step-1: Please run the below command to initiate upgrading the Alert Ingester and Alert Processor services.
rdafk8s app upgrade OIA --tag 7.5.2 --service cfx-rda-alert-ingester --service cfx-rda-alert-processor
Step-2: Run the below command to check the status of the newly upgraded PODs.
Step-3: Run the below command to put all Terminating OIA application service PODs into maintenance mode. It lists the POD IDs of the OIA application services along with the rdac maintenance command that is required to put them into maintenance mode.
Step-4: Copy and paste the rdac maintenance command as shown below.
Step-5: Run the below command to verify the maintenance mode status of the OIA application services.
Step-6: Run the below command to delete the Terminating OIA application service PODs.
for i in $(kubectl get pods -n rda-fabric -l app_name=oia | grep 'Terminating' | awk '{print $1}'); do kubectl delete pod "$i" -n rda-fabric --force; done
Note
Wait for 120 seconds, then repeat the above steps from Step-2 to Step-6 for the rest of the OIA application service PODs.
Please wait till all of the new OIA application service PODs are in Running state, then run the below command to verify their status and make sure they are running with version 7.5.2:
- rda-alert-ingester
- rda-alert-processor
+--------------------+----------------+----------------+--------------+---------+
| Name | Host | Status | Container Id | Tag |
+--------------------+----------------+----------------+--------------+---------+
| rda-alert-ingester | 192.168.131.47 | Up 1 Days ago | g9f6a37d0383 | 7.5.2 |
| rda-alert-ingester | 192.168.131.49 | Up 1 Days ago | 7bc4ce56511k | 7.5.2 |
| rda-alert- | 192.168.131.49 | Up 1 Days ago | h1a75d54aec7 | 7.5.2 |
| processor | | | | |
| rda-alert- | 192.168.131.50 | Up 1 Days ago | 71d2d24ad7dy | 7.5.2 |
| processor | | | | |
| rda-alert- | 192.168.131.47 | Up 3 Weeks ago | 667892946d97 | 7.5 |
| processor- | | | | |
| companion | | | | |
| rda-alert- | 192.168.131.49 | Up 3 Weeks ago | hc11c26bd99g | 7.5 |
| processor- | | | | |
| companion | | | | |
+--------------------+----------------+----------------+--------------+---------+
+-------+----------------------------------------+-------------+----------------+----------+-------------+-------------------+--------+--------------+---------------+--------------+
| Cat | Pod-Type | Pod-Ready | Host | ID | Site | Age | CPUs | Memory(GB) | Active Jobs | Total Jobs |
|-------+----------------------------------------+-------------+----------------+----------+-------------+-------------------+--------+--------------+---------------+--------------|
| Infra | collector | True | rda-collector- | 874761cd | | 5 days, 9:04:23 | 8 | 31.33 | | |
| Infra | registry | True | rda-registry-6 | 069f3456 | | 23 days, 20:03:11 | 8 | 31.33 | | |
| Infra | registry | True | rda-registry-6 | 0ae8af6f | | 23 days, 20:03:06 | 8 | 31.33 | | |
| Infra | scheduler | True | rda-scheduler- | ee7565aa | *leader* | 3 days, 0:43:20 | 8 | 31.33 | | |
| Infra | scheduler | True | rda-scheduler- | 779a624d | | 3 days, 0:43:02 | 8 | 31.33 | | |
| Infra | worker | True | rda-worker-7cf | 7563615a | rda-site-01 | 3 days, 0:36:29 | 8 | 31.33 | 0 | 3281 |
| Infra | worker | True | rda-worker-7cf | 0cbdeb0d | rda-site-01 | 3 days, 0:35:31 | 8 | 31.33 | 2 | 3252 |
+-------+----------------------------------------+-------------+----------------+----------+-------------+-------------------+--------+--------------+---------------+--------------+
Run the below command to check that all services have an ok status and do not report any failure messages.
+-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-------------------------------------------------------------+
| Cat | Pod-Type | Host | ID | Site | Health Parameter | Status | Message |
|-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-------------------------------------------------------------|
| rda_app | alert-ingester | rda-alert-in | f9314916 | | service-status | ok | |
| rda_app | alert-ingester | rda-alert-in | f9314916 | | minio-connectivity | ok | |
| rda_app | alert-ingester | rda-alert-in | f9314916 | | service-dependency:configuration-service | ok | 2 pod(s) found for configuration-service |
| rda_app | alert-ingester | rda-alert-in | f9314916 | | service-initialization-status | ok | |
| rda_app | alert-ingester | rda-alert-in | f9314916 | | kafka-connectivity | ok | Cluster=IrA5ccri7mBeUvhzvrimEg, Broker=0, Brokers=[0, 1, 2] |
| rda_app | alert-ingester | rda-alert-in | 8fc5bbcb | | service-status | ok | |
| rda_app | alert-ingester | rda-alert-in | 8fc5bbcb | | minio-connectivity | ok | |
| rda_app | alert-ingester | rda-alert-in | 8fc5bbcb | | service-dependency:configuration-service | ok | 2 pod(s) found for configuration-service |
| rda_app | alert-ingester | rda-alert-in | 8fc5bbcb | | service-initialization-status | ok | |
| rda_app | alert-ingester | rda-alert-in | 8fc5bbcb | | kafka-connectivity | ok | Cluster=IrA5ccri7mBeUvhzvrimEg, Broker=1, Brokers=[0, 1, 2] |
| rda_app | alert-processor | rda-alert-pr | e7e1e389 | | service-status | ok | |
| rda_app | alert-processor | rda-alert-pr | e7e1e389 | | minio-connectivity | ok | |
| rda_app | alert-processor | rda-alert-pr | e7e1e389 | | service-dependency:cfx-app-controller | ok | 2 pod(s) found for cfx-app-controller |
+-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-------------------------------------------------------------+
Please run the below command to initiate upgrading the OIA Application services Alert Ingester and Alert Processor with zero downtime.
rdaf app upgrade OIA --tag 7.5.2 --rolling-upgrade --service cfx-rda-alert-ingester --service cfx-rda-alert-processor --timeout 10
Note
The timeout <10> in the above command is specified in seconds.
Note
The rolling-upgrade option upgrades the OIA application services running in high-availability mode on one VM at a time in sequence. It completes the upgrade of OIA application services running on VM-1 before upgrading them on VM-2, followed by VM-3, and so on.
After completing the OIA application services upgrade on all VMs, it will ask for user confirmation to delete the older version OIA application service PODs.
2024-09-19 07:40:19,623 [rdaf.component.oia] INFO - Gathering OIA app container details.
2024-09-19 07:40:19,756 [rdaf.component.oia] INFO - Gathering rdac pod details.
+----------+-----------------+-----------+----------------+--------------+-------------+------------+
| Pod ID | Pod Type | Version | Age | Hostname | Maintenance | Pod Status |
+----------+-----------------+-----------+----------------+--------------+-------------+------------+
| e8c7a10e | alert-ingester | 7.5 | 1 day, 0:24:48 | 3ac142155704 | None | True |
| 4c055387 | alert-processor | 7.5 | 1 day, 0:24:18 | f525b48255bf | None | True |
+----------+-----------------+-----------+----------------+--------------+-------------+------------+
Continue moving above pods to maintenance mode? [yes/no]: yes
2024-09-19 07:40:25,275 [rdaf.component.oia] INFO - Initiating Maintenance Mode...
2024-09-19 07:40:30,790 [rdaf.component.oia] INFO - Waiting for services to be moved to maintenance.
2024-09-19 07:40:53,350 [rdaf.component.oia] INFO - Following container are in maintenance mode
+----------+-----------------+-----------+----------------+--------------+-------------+------------+
| Pod ID | Pod Type | Version | Age | Hostname | Maintenance | Pod Status |
+----------+-----------------+-----------+----------------+--------------+-------------+------------+
| e8c7a10e | alert-ingester | 7.5 | 1 day, 0:25:14 | 3ac142155704 | maintenance | False |
| 4c055387 | alert-processor | 7.5 | 1 day, 0:24:44 | f525b48255bf | maintenance | False |
+----------+-----------------+-----------+----------------+--------------+-------------+------------+
2024-09-19 07:40:53,351 [rdaf.component.oia] INFO - Waiting for timeout of 10 seconds...
2024-09-19 07:41:03,361 [rdaf.component.oia] INFO - Upgrading cfx-rda-alert-ingester on host 10.95.107.66
[+] Running 1/1
⠿ Container oia-cfx-rda-alert-ingester-1 Started 10.9s
2024-09-19 07:41:17,728 [rdaf.component.oia] INFO - Upgrading cfx-rda-alert-processor on host 10.95.107.66
[+] Running 1/1
⠿ Container oia-cfx-rda-alert-processor-1 Started 10.9s
2024-09-19 07:41:32,044 [rdaf.component.oia] INFO - Waiting for upgraded containers to join rdac pods
2024-09-19 07:41:32,045 [rdaf.component.oia] INFO - Checking if the upgraded components '['cfx-rda-alert-ingester', 'cfx-rda-alert-processor']' has joined the rdac pods...
2024-09-19 07:41:44,945 [rdaf.component.oia] INFO - Waiting for oia components to be up and running... retry 1
2024-09-19 07:41:57,719 [rdaf.component.oia] INFO - Waiting for oia components to be up and running... retry 2
+----------+-----------------+-----------+----------------+--------------+-------------+------------+
| Pod ID | Pod Type | Version | Age | Hostname | Maintenance | Pod Status |
+----------+-----------------+-----------+----------------+--------------+-------------+------------+
| 9d36613d | alert-ingester | 7.5 | 1 day, 0:24:33 | f2d6bf171612 | None | True |
| b80e614c | alert-processor | 7.5 | 1 day, 0:24:04 | 9fbce417d2c9 | None | True |
+----------+-----------------+-----------+----------------+--------------+-------------+------------+
Continue moving above pods to maintenance mode? [yes/no]: yes
2024-09-19 07:44:42,394 [rdaf.component.oia] INFO - Initiating Maintenance Mode...
2024-09-19 07:44:47,437 [rdaf.component.oia] INFO - Waiting for services to be moved to maintenance.
2024-09-19 07:45:09,916 [rdaf.component.oia] INFO - Following container are in maintenance mode
+----------+-----------------+-----------+----------------+--------------+-------------+------------+
| Pod ID | Pod Type | Version | Age | Hostname | Maintenance | Pod Status |
+----------+-----------------+-----------+----------------+--------------+-------------+------------+
| 9d36613d | alert-ingester | 7.5 | 1 day, 0:29:20 | f2d6bf171612 | maintenance | False |
| b80e614c | alert-processor | 7.5 | 1 day, 0:28:50 | 9fbce417d2c9 | maintenance | False |
+----------+-----------------+-----------+----------------+--------------+-------------+------------+
2024-09-19 07:45:09,918 [rdaf.component.oia] INFO - Waiting for timeout of 10 seconds...
2024-09-19 07:45:19,925 [rdaf.component.oia] INFO - Upgrading cfx-rda-alert-ingester on host 10.95.107.67
[+] Running 1/1
⠿ Container oia-cfx-rda-alert-ingester-1 Started 10.9s
2024-09-19 07:45:33,974 [rdaf.component.oia] INFO - Upgrading cfx-rda-alert-processor on host 10.95.107.67
[+] Running 1/1
⠿ Container oia-cfx-rda-alert-processor-1 Started 10.9s
2024-09-19 07:45:48,083 [rdaf.component.oia] INFO - Waiting for upgraded containers to join rdac pods
2024-09-19 07:45:48,084 [rdaf.component.oia] INFO - Checking if the upgraded components '['cfx-rda-alert-ingester', 'cfx-rda-alert-processor']' has joined the rdac pods...
Please run the below command to initiate upgrading the OIA Application services Alert Ingester and Alert Processor without zero downtime.
Please wait till all of the new OIA application service containers are in Up state, then run the below command to verify their status and make sure they are running with version 7.5.2:
- rda-alert-ingester
- rda-alert-processor
+--------------------+----------------+------------+--------------+-------+
| Name               | Host           | Status     | Container Id | Tag   |
+--------------------+----------------+------------+--------------+-------+
| cfx-rda-app- | 192.168.133.96 | Up 4 hours | f139e2b3cca3 | 7.5 |
| controller | | | | |
| cfx-rda-app- | 192.168.133.92 | Up 3 hours | 6d68b737715a | 7.5 |
| controller | | | | |
| cfx-rda-reports- | 192.168.133.96 | Up 4 hours | 0a6bac884dff | 7.5 |
| registry | | | | |
| cfx-rda-reports- | 192.168.133.92 | Up 3 hours | 3477e7f751ec | 7.5 |
| registry | | | | |
| cfx-rda- | 192.168.133.96 | Up 4 hours | 96dd2337f779 | 7.5 |
| notification- | | | | |
| service | | | | |
| cfx-rda- | 192.168.133.92 | Up 3 hours | 3a1743239a99 | 7.5 |
| notification- | | | | |
| service | | | | |
| cfx-rda-file- | 192.168.133.96 | Up 3 hours | bd41100a456c | 7.5 |
| browser | | | | |
| cfx-rda-file- | 192.168.133.92 | Up 3 hours | 2cc517b8a640 | 7.5 |
| browser | | | | |
| cfx-rda- | 192.168.133.96 | Up 3 hours | 9f1e53602999 | 7.5 |
| configuration- | | | | |
| service | | | | |
| cfx-rda- | 192.168.133.92 | Up 3 hours | 8e50e464bcd5 | 7.5 |
| configuration- | | | | |
| service | | | | |
| cfx-rda-alert- | 192.168.133.96 | Up 3 hours | 7f75047e9e44 | 7.5.2 |
| ingester | | | | |
| cfx-rda-alert- | 192.168.133.92 | Up 3 hours | f9ec55862be0 | 7.5.2 |
| ingester | | | | |
| cfx-rda-alert- | 192.168.133.92 | Up 3 hours | r5tr67543r27 | 7.5.2 |
| processor | | | | |
| cfx-rda-alert- | 192.168.133.92 | Up 3 hours | 66yc40565bg8 | 7.5.2 |
| processor | | | | |
+--------------------+----------------+------------+--------------+-------+
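For larger deployments, the Tag column can also be cross-checked mechanically rather than by eye. The helper below is an illustrative sketch, not part of the RDAF CLI: it assumes you feed it plain "container-name tag" pairs (for example, extracted from the status table above) and flags any alert service that is not yet on 7.5.2.

```shell
#!/bin/sh
# Sketch: given "container-name tag" lines on stdin, print OK/STALE for
# each alert-ingester/alert-processor entry depending on whether its
# tag is 7.5.2. Names mirror the status table above; the function name
# check_alert_tags is illustrative.
check_alert_tags() {
  awk '/alert-(ingester|processor)/ {
         print ($2 == "7.5.2" ? "OK" : "STALE"), $1
       }'
}

# Example (values taken from the table above; non-alert services are
# ignored, since only the alert services move to 7.5.2 in this step):
printf '%s\n' \
  'cfx-rda-alert-ingester 7.5.2' \
  'cfx-rda-alert-processor 7.5.2' \
  'cfx-rda-app-controller 7.5' | check_alert_tags
```

Any line reported as STALE means the corresponding container is still running the older image and the upgrade of that service should be re-checked.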
Run the below command to verify all OIA application services are up and running.
+-------+----------------------------------------+-------------+----------------+----------+-------------+----------+--------+--------------+---------------+--------------+
| Cat | Pod-Type | Pod-Ready | Host | ID | Site | Age | CPUs | Memory(GB) | Active Jobs | Total Jobs |
|-------+----------------------------------------+-------------+----------------+----------+-------------+----------+--------+--------------+---------------+--------------|
| App | alert-ingester | True | rda-alert-inge | 6a6e464d | | 19:22:36 | 8 | 31.33 | | |
| App | alert-ingester | True | rda-alert-inge | 7f6b42a0 | | 19:22:53 | 8 | 31.33 | | |
| App | alert-processor | True | rda-alert-proc | a880e491 | | 19:23:21 | 8 | 31.33 | | |
| App | alert-processor | True | rda-alert-proc | b684609e | | 19:23:18 | 8 | 31.33 | | |
| App | alert-processor-companion | True | rda-alert-proc | 874f3b33 | | 19:22:24 | 8 | 31.33 | | |
| App | alert-processor-companion | True | rda-alert-proc | 70cadaa7 | | 19:22:05 | 8 | 31.33 | | |
| App | asset-dependency | True | rda-asset-depe | bde06c15 | | 19:47:50 | 8 | 31.33 | | |
| App | asset-dependency | True | rda-asset-depe | 47b9eb02 | | 19:47:38 | 8 | 31.33 | | |
| App | authenticator | True | rda-identity-d | faa33e1b | | 19:47:52 | 8 | 31.33 | | |
| App | authenticator | True | rda-identity-d | 36083c36 | | 19:47:46 | 8 | 31.33 | | |
| App | cfx-app-controller | True | rda-app-contro | 5fd3c3f4 | | 19:23:09 | 8 | 31.33 | | |
| App | cfx-app-controller | True | rda-app-contro | d66e5ce8 | | 19:22:56 | 8 | 31.33 | | |
| App | cfxdimensions-app-access-manager | True | rda-access-man | ecbb535c | | 19:47:46 | 8 | 31.33 | | |
| App | cfxdimensions-app-access-manager | True | rda-access-man | 9a05db5a | | 19:47:36 | 8 | 31.33 | | |
| App | cfxdimensions-app-collaboration | True | rda-collaborat | 61b3c53b | | 19:22:18 | 8 | 31.33 | | |
| App | cfxdimensions-app-collaboration | True | rda-collaborat | 09b9474e | | 19:21:57 | 8 | 31.33 | | |
| App | cfxdimensions-app-file-browser | True | rda-file-brows | 00495640 | | 19:22:45 | 8 | 31.33 | | |
| App | cfxdimensions-app-file-browser | True | rda-file-brows | 640f0653 | | 19:22:29 | 8 | 31.33 | | |
| App | cfxdimensions-app-irm_service | True | rda-irm-servic | 27e345c5 | | 19:21:43 | 8 | 31.33 | | |
| App | cfxdimensions-app-irm_service | True | rda-irm-servic | 23c7e082 | | 19:21:56 | 8 | 31.33 | | |
| App | cfxdimensions-app-notification-service | True | rda-notificati | bbb5b08b | | 19:23:20 | 8 | 31.33 | | |
| App | cfxdimensions-app-notification-service | True | rda-notificati | 9841bcb5 | | 19:23:02 | 8 | 31.33 | | |
+-------+----------------------------------------+-------------+----------------+----------+-------------+----------+--------+--------------+---------------+--------------+
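The Pod-Ready column in the output above can likewise be checked programmatically. A minimal sketch (assuming the pods table is piped in as plain text, with "|" as the column separator and Pod-Ready as the fourth field, per the layout above):

```shell
#!/bin/sh
# Sketch: count pods whose Pod-Ready column reads "False" in a pods-style
# table piped on stdin. With "|" as the field separator, Pod-Ready is the
# 4th field in the layout shown above. The function name is illustrative.
not_ready_count() {
  awk -F'|' '{ gsub(/ /, "", $4) } $4 == "False" { bad++ } END { print bad + 0 }'
}

# Example: two rows, one not ready -> prints 1
printf '%s\n' \
  '| App | alert-ingester  | True  | rda-alert-inge | 6a6e464d |' \
  '| App | alert-processor | False | rda-alert-proc | a880e491 |' \
  | not_ready_count
```

A result of 0 indicates every listed pod reports Pod-Ready as True; any non-zero count means some pods have not yet rejoined after the upgrade.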
Run the below command to check that all services have an ok status and do not report any failure messages.
+-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-------------------------------------------------------------+
| Cat | Pod-Type | Host | ID | Site | Health Parameter | Status | Message |
|-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-------------------------------------------------------------|
| rda_app | alert-ingester | 7f75047e9e44 | daa8c414 | | service-status | ok | |
| rda_app | alert-ingester | 7f75047e9e44 | daa8c414 | | minio-connectivity | ok | |
| rda_app | alert-ingester | 7f75047e9e44 | daa8c414 | | service-dependency:configuration-service | ok | 2 pod(s) found for configuration-service |
| rda_app | alert-ingester | 7f75047e9e44 | daa8c414 | | service-initialization-status | ok | |
| rda_app | alert-ingester | 7f75047e9e44 | daa8c414 | | kafka-connectivity | ok | Cluster=NTc1NWU1MTQxYmY3MTFlZg, Broker=1, Brokers=[1, 2, 3] |
| rda_app | alert-ingester | f9ec55862be0 | f9b9231c | | service-status | ok | |
| rda_app | alert-ingester | f9ec55862be0 | f9b9231c | | minio-connectivity | ok | |
| rda_app | alert-ingester | f9ec55862be0 | f9b9231c | | service-dependency:configuration-service | ok | 2 pod(s) found for configuration-service |
| rda_app | alert-ingester | f9ec55862be0 | f9b9231c | | service-initialization-status | ok | |
| rda_app | alert-ingester | f9ec55862be0 | f9b9231c | | kafka-connectivity | ok | Cluster=NTc1NWU1MTQxYmY3MTFlZg, Broker=2, Brokers=[1, 2, 3] |
| rda_app | alert-processor | c6cc7b04ab33 | b4ebfb06 | | service-status | ok | |
| rda_app | alert-processor | c6cc7b04ab33 | b4ebfb06 | | minio-connectivity | ok | |
+-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-------------------------------------------------------------+
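In a long healthcheck listing, rows that are not ok are easy to miss. The sketch below (again illustrative, not part of the product CLI) filters a healthcheck-style table piped on stdin and prints only the failing checks, using the column layout shown above.

```shell
#!/bin/sh
# Sketch: list the Pod-Type and Health Parameter of any healthcheck row
# whose Status column (8th "|"-separated field in the layout above) is
# not "ok". Header, border, and separator rows are filtered out. The
# function name failed_checks is illustrative.
failed_checks() {
  awk -F'|' '{
      gsub(/^ +| +$/, "", $3); gsub(/^ +| +$/, "", $7); gsub(/^ +| +$/, "", $8)
    }
    $8 != "" && $8 != "ok" && $8 != "Status" && $8 !~ /^-+$/ {
      print $3 " -> " $7
    }'
}
```

An empty result means every health parameter reported ok; any printed line identifies a service and the specific check (e.g. kafka-connectivity) that needs investigation before proceeding.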