1. Upgrade From 7.8.1.x to 7.8.1.4
AIOps (OIA) Application: From 7.8.1.x to 7.8.1.4 (Selected Services)
1.1 Prerequisites
Before proceeding with this upgrade, verify that the below prerequisites are met.
Currently deployed CLI and RDAF services are running the below versions:
- RDAF Deployment CLI version: 1.3.3
- Infra Services tag: 1.0.3 / 1.0.3.3 (haproxy)
- GraphDB tag: 1.0.3
- Platform Services and RDA Worker tag: 3.8.0
- OIA Application Services tag: 7.8.0 / 7.8.1.x
- CloudFabrix recommends taking VMware VM snapshots of the VMs where RDA Fabric infra/platform/application services are deployed
Note
- Check the disk space on all Platform and Service VMs using the below command; usage on each filesystem should be less than 80%.
rdauser@oia-125-216:~$ df -h
Filesystem Size Used Avail Use% Mounted on
udev 32G 0 32G 0% /dev
tmpfs 6.3G 357M 6.0G 6% /run
/dev/mapper/ubuntu--vg-ubuntu--lv 48G 12G 34G 26% /
tmpfs 32G 0 32G 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 32G 0 32G 0% /sys/fs/cgroup
/dev/loop0 64M 64M 0 100% /snap/core20/2318
/dev/loop2 92M 92M 0 100% /snap/lxd/24061
/dev/sda2 1.5G 309M 1.1G 23% /boot
/dev/sdf 50G 3.8G 47G 8% /var/mysql
/dev/loop3 39M 39M 0 100% /snap/snapd/21759
/dev/sdg 50G 541M 50G 2% /minio-data
/dev/loop4 92M 92M 0 100% /snap/lxd/29619
/dev/loop5 39M 39M 0 100% /snap/snapd/21465
/dev/sde 15G 140M 15G 1% /zookeeper
/dev/sdd 30G 884M 30G 3% /kafka-logs
/dev/sdc 50G 3.3G 47G 7% /opt
/dev/sdb 50G 29G 22G 57% /var/lib/docker
/dev/sdi 25G 294M 25G 2% /graphdb
/dev/sdh 50G 34G 17G 68% /opensearch
/dev/loop6 64M 64M 0 100% /snap/core20/2379
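The per-filesystem check against the 80% threshold can also be scripted rather than read by eye; the helper below is a sketch (the `flag_full_filesystems` name is illustrative) that reads `df -h` output and prints any filesystem above a given usage percentage.

```shell
# flag_full_filesystems: reads `df -h` output on stdin and prints
# "mountpoint usage%" for filesystems above the threshold (default 80%).
flag_full_filesystems() {
  threshold="${1:-80}"
  awk -v t="$threshold" 'NR > 1 {
      use = $5; gsub(/%/, "", use)          # strip the % sign from the Use% column
      if (use + 0 > t) print $6, $5
  }'
}

# Check the live system:
df -h | flag_full_filesystems 80
```

Note that read-only snap loop devices legitimately report 100% and can be ignored.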
- Before starting the upgrade, check that all MariaDB nodes are in sync on an HA setup using the below commands.
Tip
Please run the below commands on the VM host where the RDAF deployment CLI was installed and the rdafk8s setup command was run. The MariaDB configuration is read from the /opt/rdaf/rdaf.cfg file.
MARIADB_HOST=`cat /opt/rdaf/rdaf.cfg | grep -A3 mariadb | grep datadir | awk '{print $3}' | cut -f1 -d'/'`
MARIADB_USER=`cat /opt/rdaf/rdaf.cfg | grep -A3 mariadb | grep user | awk '{print $3}' | base64 -d`
MARIADB_PASSWORD=`cat /opt/rdaf/rdaf.cfg | grep -A3 mariadb | grep password | awk '{print $3}' | base64 -d`
mysql -u$MARIADB_USER -p$MARIADB_PASSWORD -h $MARIADB_HOST -P3307 -e "show status like 'wsrep_local_state_comment';"
Please verify that the MariaDB cluster state is Synced.
+---------------------------+--------+
| Variable_name | Value |
+---------------------------+--------+
| wsrep_local_state_comment | Synced |
+---------------------------+--------+
Please run the below command and verify that the mariadb cluster size is 3.
mysql -u$MARIADB_USER -p$MARIADB_PASSWORD -h $MARIADB_HOST -P3307 -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size'";
+--------------------+-------+
| Variable_name | Value |
+--------------------+-------+
| wsrep_cluster_size | 3 |
+--------------------+-------+
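The two checks above can be combined into a single go/no-go gate before starting the upgrade; a minimal sketch, where `galera_ready` is a hypothetical helper and the commented usage reuses the MARIADB_* variables set above:

```shell
# galera_ready STATE SIZE: succeeds only when the cluster reports Synced
# and the expected three members (values come from the two queries above).
galera_ready() {
  [ "$1" = "Synced" ] && [ "$2" -eq 3 ]
}

# Usage, assuming MARIADB_HOST/USER/PASSWORD are set as shown above
# (-N -B suppresses column headers so awk sees only the value row):
# state=$(mysql -u"$MARIADB_USER" -p"$MARIADB_PASSWORD" -h "$MARIADB_HOST" -P3307 -N -B \
#           -e "SHOW STATUS LIKE 'wsrep_local_state_comment';" | awk '{print $2}')
# size=$(mysql -u"$MARIADB_USER" -p"$MARIADB_PASSWORD" -h "$MARIADB_HOST" -P3307 -N -B \
#           -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size';" | awk '{print $2}')
# galera_ready "$state" "$size" && echo "MariaDB cluster ready for upgrade"
```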
Warning
Make sure all of the above pre-requisites are met before proceeding with the upgrade process.
Warning
Kubernetes: Although Kubernetes-based RDA Fabric deployments support zero-downtime upgrades, it is recommended to schedule a maintenance window for upgrading RDAF Platform and AIOps services to the newer version.
Important
Please make sure a full backup of the RDAF platform system is completed before performing the upgrade.
Kubernetes: Please run the below backup command to back up the application data.
Run the below command on the RDAF Management system and make sure the Kubernetes PODs are NOT in a restarting state (applicable only to Kubernetes environments).
- Verify that the RDAF deployment CLI (rdafcli) version is 1.3.3 on the VM where the CLI was installed for the docker on-prem registry managing Kubernetes or non-Kubernetes deployments.
- Verify that the on-premise docker registry service version is 1.0.3.
ff6b1de8515f cfxregistry.CloudFabrix.io:443/docker-registry:1.0.3 "/entrypoint.sh /bin…" 7 days ago Up 7 days deployment-scripts-docker-registry-1
- Verify that the RDAF Infrastructure services version is 1.0.3, except for the below services:
    - rda-minio: version RELEASE.2023-09-30T07-02-29Z
    - haproxy: version 1.0.3.3
Run the below command to get the rdafk8s Infra service details.
- Verify that the RDAF Platform services version is 3.8.0 / 3.8.1.
Run the below command to get the RDAF Platform services details.
- Verify that the RDAF OIA Application services version is 7.8.0 / 7.8.1.x.
Run the below command to get the RDAF App services details.
1.2 Download the new Docker Images
Download the new docker image tags for RDAF Platform and OIA (AIOps) Application services and wait until all of the images are downloaded.
Note
If the download of the images fails, please re-execute the above command.
Run the below command to verify that the above-mentioned tags are downloaded for all of the RDAF Platform and OIA (AIOps) Application services.
Please make sure 7.8.1.4 image tag is downloaded for the below RDAF OIA (AIOps) Application services.
- rda-alert-processor
- rda-event-consumer
- rda-alert-ingester
- rda-collaboration
- rda-irm-service
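The presence of the new tag can also be confirmed against the on-prem registry's v2 API; a hedged sketch, where `has_tag` is an illustrative helper and the registry hostname in the commented usage is an example:

```shell
# has_tag TAG: reads a Docker registry v2 /tags/list JSON document on stdin
# and succeeds if the given tag string is present.
has_tag() {
  grep -qF "\"$1\""
}

# Usage, assuming the on-prem registry v2 API is reachable (hostname is an example):
# for svc in rda-alert-processor rda-event-consumer rda-alert-ingester \
#            rda-collaboration rda-irm-service; do
#   curl -fsSL "https://cfxregistry.cloudfabrix.io:443/v2/${svc}/tags/list" \
#     | has_tag 7.8.1.4 && echo "${svc}: 7.8.1.4 present" || echo "${svc}: 7.8.1.4 MISSING"
# done
```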
Downloaded Docker images are stored under the below path.
/opt/rdaf-registry/data/docker/registry/v2/ or /opt/rdaf/data/docker/registry/v2/
Run the below command to check the filesystem's disk usage on the offline registry VM where the docker images are pulled.
If necessary, older image tags that are no longer in use can be deleted to free up disk space using the command below.
Note
Run the below command if usage of the /opt filesystem exceeds 80% or if its free capacity is less than 25 GB.
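The threshold check from this note can be expressed as a small helper; a sketch, where `needs_cleanup` is illustrative and the commented df invocations assume GNU coreutils:

```shell
# needs_cleanup USAGE_PCT FREE_GB: true when either threshold from the note
# above is breached (usage over 80% or less than 25 GB free).
needs_cleanup() {
  [ "$1" -gt 80 ] || [ "$2" -lt 25 ]
}

# Usage on the offline-registry VM (GNU coreutils df):
# usage=$(df --output=pcent /opt | awk 'NR==2 { gsub(/%/,""); print $1 }')
# free=$(df -BG --output=avail /opt | awk 'NR==2 { gsub(/G/,""); print $1 }')
# needs_cleanup "$usage" "$free" && echo "/opt: delete unused image tags"
```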
1.3 Upgrade Steps
1.3.1 Upgrade OIA Application Services
Note
- Before upgrading the services, execute the following script on the CLI VM to increase the Kafka topic partitions from 15 to 30.
- Download the script from the following URL:
wget https://macaw-amer.s3.us-east-1.amazonaws.com/releases/RDA/7.8.1.4/kafka_topic_configure_k8s.py
- Run the script using the command given below.
Step-1: Run the below commands to initiate upgrading RDAF OIA Application services
rdafk8s app upgrade OIA --tag 7.8.1.4 --service rda-alert-ingester --service rda-event-consumer --service rda-alert-processor --service rda-collaboration --service rda-irm-service
Step-2: Run the below command to check the status of the newly upgraded PODs.
Step-3: Run the below command to put all Terminating OIA application service PODs into maintenance mode. It lists the POD IDs of the OIA application services that need to be put into maintenance mode, along with the corresponding rdac maintenance commands.
Step-4: Copy and paste the rdac maintenance command as shown below.
Step-5: Run the below command to verify the maintenance mode status of the OIA application services.
Step-6: Run the below command to delete the Terminating OIA application service PODs
for i in `kubectl get pods -n rda-fabric -l app_name=oia | grep 'Terminating' | awk '{print $1}'`; do kubectl delete pod $i -n rda-fabric --force; done
Note
Wait for 120 seconds, then repeat Step-2 through Step-6 for the rest of the OIA application service PODs.
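The wait between batches can be automated; a sketch where `terminating_count` is an illustrative helper reading `kubectl get pods --no-headers` output (STATUS is the third column):

```shell
# terminating_count: counts pods whose STATUS column reads "Terminating".
terminating_count() {
  awk '$3 == "Terminating" { n++ } END { print n + 0 }'
}

# Usage, assuming kubectl access to the rda-fabric namespace:
# while [ "$(kubectl get pods -n rda-fabric -l app_name=oia --no-headers \
#             | terminating_count)" -gt 0 ]; do
#   echo "waiting for Terminating pods to clear..."; sleep 10
# done
```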
Please wait until all of the new OIA application service PODs are in Running state, then run the below command to verify their status and confirm they are running version 7.8.1.4.
+---------------+----------------+---------------+--------------+--------+
| Name | Host | Status | Container Id | Tag |
+---------------+----------------+---------------+--------------+--------+
| rda-alert- | 192.168.108.31 | Up 9 Hours | b229e14994cd | 7.8.1.4|
| ingester | | ago | | |
| rda-alert- | 192.168.108.32 | Up 8 Hours | 5ffee59cac9e | 7.8.1.4|
| processor | | ago | | |
| rda-alert- | 192.168.108.32 | Up 8 Hours | 4e0d8b8a9076 | 7.8.1 |
| processor- | | ago | | |
| companion | | | | |
| rda-app- | 192.168.108.31 | Up 9 Hours | 5b230e6677c6 | 7.8.0 |
| controller | | ago | | |
| rda- | 192.168.108.31 | Up 9 Hours | 55e858d0cca0 | 7.8.1.4|
| collaboration | | ago | | |
| rda-configura | 192.168.108.31 | Up 9 Hours | a78ff5743025 | 7.8.0 |
| tion-service | | ago | | |
| rda-event- | 192.168.108.31 | Up 9 Hours | ae6e1aaac682 | 7.8.1.4|
| consumer | | ago | | |
| rda-file- | 192.168.108.31 | Up 9 Hours | 987891ea364e | 7.8.0 |
| browser | | ago | | |
| rda- | 192.168.108.31 | Up 9 Hours | d2151559a4bf | 7.8.0 |
| ingestion- | | ago | | |
| tracker | | | | |
| rda-irm- | 192.168.108.31 | Up 9 Hours | ba070de557ec | 7.8.1.4|
| service | | ago | | |
| rda-ml-config | 192.168.108.31 | Up 9 Hours | 5c0685496bd5 | 7.8.0 |
| | | ago | | |
| rda- | 192.168.108.31 | Up 9 Hours | 1ef37a720759 | 7.8.0 |
| notification- | | ago | | |
| service | | | | |
| rda-reports- | 192.168.108.31 | Up 9 Hours | 5daa7555f309 | 7.8.0 |
| registry | | ago | | |
| rda-smtp- | 192.168.108.31 | Up 9 Hours | b81ccaf5883a | 7.8.0 |
| server | | ago | | |
| rda-webhook- | 192.168.108.32 | Up 5 Hours | c5e6a72674d7 | 7.8.0 |
| server | | ago | | |
+---------------+----------------+---------------+--------------+--------+
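Rather than scanning the table by eye, rows whose Tag column differs from the expected version can be filtered out; a sketch, where `flag_old_tags` is an illustrative helper over the ASCII table above (only the five upgraded services are expected to show 7.8.1.4; wrapped continuation rows with an empty Tag cell are skipped):

```shell
# flag_old_tags EXPECTED: reads the pipe-delimited status table on stdin and
# prints "name tag" for rows whose Tag column is set but differs from EXPECTED.
flag_old_tags() {
  expected="${1:-7.8.1.4}"
  awk -F'|' -v e="$expected" 'NF >= 6 {
      name = $2; tag = $6
      gsub(/^[ ]+|[ ]+$/, "", name); gsub(/[ ]+/, "", tag)
      # skip border rows, the header row, and wrapped continuation rows
      if (tag != "" && tag != "Tag" && tag != e) print name, tag
  }'
}

# Usage: pipe the output of the status command above into flag_old_tags 7.8.1.4
```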
Step-7: Run the below command to verify all OIA application services are up and running.
+-------+----------------------------------------+-------------+----------------+----------+-------------+----------+--------+--------------+---------------+--------------+
| Cat | Pod-Type | Pod-Ready | Host | ID | Site | Age | CPUs | Memory(GB) | Active Jobs | Total Jobs |
|-------+----------------------------------------+-------------+----------------+----------+-------------+----------+--------+--------------+---------------+--------------|
| App | alert-ingester | True | rda-alert-inge | 6a6e464d | | 19:19:06 | 8 | 31.33 | | |
| App | alert-ingester | True | rda-alert-inge | 7f6b42a0 | | 19:19:23 | 8 | 31.33 | | |
| App | alert-processor | True | rda-alert-proc | a880e491 | | 19:19:51 | 8 | 31.33 | | |
| App | alert-processor | True | rda-alert-proc | b684609e | | 19:19:48 | 8 | 31.33 | | |
| App | alert-processor-companion | True | rda-alert-proc | 874f3b33 | | 19:18:54 | 8 | 31.33 | | |
| App | alert-processor-companion | True | rda-alert-proc | 70cadaa7 | | 19:18:35 | 8 | 31.33 | | |
| App | asset-dependency | True | rda-asset-depe | bde06c15 | | 19:44:20 | 8 | 31.33 | | |
| App | asset-dependency | True | rda-asset-depe | 47b9eb02 | | 19:44:08 | 8 | 31.33 | | |
| App | authenticator | True | rda-identity-d | faa33e1b | | 19:44:22 | 8 | 31.33 | | |
| App | authenticator | True | rda-identity-d | 36083c36 | | 19:44:16 | 8 | 31.33 | | |
| App | cfx-app-controller | True | rda-app-contro | 5fd3c3f4 | | 19:19:39 | 8 | 31.33 | | |
| App | cfx-app-controller | True | rda-app-contro | d66e5ce8 | | 19:19:26 | 8 | 31.33 | | |
| App | cfxdimensions-app-access-manager | True | rda-access-man | ecbb535c | | 19:44:16 | 8 | 31.33 | | |
| App | cfxdimensions-app-access-manager | True | rda-access-man | 9a05db5a | | 19:44:06 | 8 | 31.33 | | |
| App | cfxdimensions-app-collaboration | True | rda-collaborat | 61b3c53b | | 19:18:48 | 8 | 31.33 | | |
| App | cfxdimensions-app-collaboration | True | rda-collaborat | 09b9474e | | 19:18:27 | 8 | 31.33 | | |
+-------+----------------------------------------+-------------+----------------+----------+-------------+----------+--------+--------------+---------------+--------------+
Run the below command to check that all services have an ok status and do not throw any failure messages.
+-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-----------------------------------------------------------------------------------------------------------------------------+
| Cat | Pod-Type | Host | ID | Site | Health Parameter | Status | Message |
|-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-----------------------------------------------------------------------------------------------------------------------------|
| rda_app | alert-ingester | rda-alert-in | 6a6e464d | | service-status | ok | |
| rda_app | alert-ingester | rda-alert-in | 6a6e464d | | minio-connectivity | ok | |
| rda_app | alert-ingester | rda-alert-in | 6a6e464d | | service-dependency:configuration-service | ok | 2 pod(s) found for configuration-service |
| rda_app | alert-ingester | rda-alert-in | 6a6e464d | | service-initialization-status | ok | |
| rda_app | alert-ingester | rda-alert-in | 6a6e464d | | kafka-connectivity | ok | Cluster=dKnnkaYSPELK8DBUk0rPig, Broker=0, Brokers=[0, 1, 2] |
| rda_app | alert-ingester | rda-alert-in | 6a6e464d | | kafka-consumer | ok | Health: [{'387c0cb507b84878b9d0b15222cb4226.inbound-events': 0, '387c0cb507b84878b9d0b15222cb4226.mapped-events': 0}, {}] |
| rda_app | alert-ingester | rda-alert-in | 7f6b42a0 | | service-status | ok | |
| rda_app | alert-ingester | rda-alert-in | 7f6b42a0 | | minio-connectivity | ok | |
| rda_app | alert-ingester | rda-alert-in | 7f6b42a0 | | service-dependency:configuration-service | ok | 2 pod(s) found for configuration-service |
| rda_app | alert-ingester | rda-alert-in | 7f6b42a0 | | service-initialization-status | ok | |
| rda_app | alert-ingester | rda-alert-in | 7f6b42a0 | | kafka-consumer | ok | Health: [{'387c0cb507b84878b9d0b15222cb4226.inbound-events': 0, '387c0cb507b84878b9d0b15222cb4226.mapped-events': 0}, {}] |
| rda_app | alert-ingester | rda-alert-in | 7f6b42a0 | | kafka-connectivity | ok | Cluster=dKnnkaYSPELK8DBUk0rPig, Broker=1, Brokers=[0, 1, 2] |
| rda_app | alert-processor | rda-alert-pr | a880e491 | | service-status | ok | |
| rda_app | alert-processor | rda-alert-pr | a880e491 | | minio-connectivity | ok | |
| rda_app | alert-processor | rda-alert-pr | a880e491 | | service-dependency:cfx-app-controller | ok | 2 pod(s) found for cfx-app-controller |
| rda_app | alert-processor | rda-alert-pr | a880e491 | | service-dependency:configuration-service | ok | 2 pod(s) found for configuration-service |
| rda_app | alert-processor | rda-alert-pr | a880e491 | | service-initialization-status | ok | |
| rda_app | alert-processor | rda-alert-pr | a880e491 | | kafka-connectivity | ok | Cluster=dKnnkaYSPELK8DBUk0rPig, Broker=1, Brokers=[0, 1, 2] |
| rda_app | alert-processor | rda-alert-pr | a880e491 | | DB-connectivity | ok | |
+-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-----------------------------------------------------------------------------------------------------------------------------+
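Any non-ok rows can be surfaced the same way; a sketch, where `flag_unhealthy` is an illustrative helper over the healthcheck table above (Status is the eighth pipe-delimited column):

```shell
# flag_unhealthy: reads the healthcheck ASCII table on stdin and prints
# any row whose Status column is set but is not "ok".
flag_unhealthy() {
  awk -F'|' 'NF >= 9 {
      st = $8; gsub(/[ ]+/, "", st)
      if (st != "" && st != "Status" && st != "ok") print $0
  }'
}

# Usage: pipe the output of the healthcheck command from this step into flag_unhealthy
```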