
Upgrade from 8.1.0.2 & 8.1.0.2.1 to 8.1.0.3

1. Upgrade From 8.1.0.2 / 8.1.0.2.1 to 8.1.0.3 (Selected Services)

RDAF Platform: From 8.1.0.2 / 8.1.0.2.1 to 8.1.0.3 (Selected Services)

AIOps (OIA) Application: From 8.1.0.2 / 8.1.0.2.1 to 8.1.0.3 (Selected Services)

1.1. Prerequisites

Before proceeding with this upgrade, please verify that the below prerequisites are met.

Currently deployed CLI and RDAF services are running the below versions.

  • RDAF Deployment CLI version: 1.4.1

  • Infra Services tag: 1.0.4

  • Platform Services and RDA Worker tag: 8.1.0.2 / 8.1.0.2.1

  • OIA Application Services tag: 8.1.0.2 / 8.1.0.2.1

  • CloudFabrix recommends taking VMware VM snapshots where RDA Fabric infra/platform/applications are deployed

Note

  • Check the disk space on all Platform and Service VMs using the below mentioned command; the usage (Use% column) of each filesystem should be less than 80%
    df -kh
    
rdauser@oia-125-216:~/collab-3.7-upgrade$ df -kh
Filesystem                         Size  Used Avail Use% Mounted on
udev                                32G     0   32G   0% /dev
tmpfs                              6.3G  357M  6.0G   6% /run
/dev/mapper/ubuntu--vg-ubuntu--lv   48G   12G   34G  26% /
tmpfs                               32G     0   32G   0% /dev/shm
tmpfs                              5.0M     0  5.0M   0% /run/lock
tmpfs                               32G     0   32G   0% /sys/fs/cgroup
/dev/loop0                          64M   64M     0 100% /snap/core20/2318
/dev/loop2                          92M   92M     0 100% /snap/lxd/24061
/dev/sda2                          1.5G  309M  1.1G  23% /boot
/dev/sdf                            50G  3.8G   47G   8% /var/mysql
/dev/loop3                          39M   39M     0 100% /snap/snapd/21759
/dev/sdg                            50G  541M   50G   2% /192.168-data
/dev/loop4                          92M   92M     0 100% /snap/lxd/29619
/dev/loop5                          39M   39M     0 100% /snap/snapd/21465
/dev/sde                            15G  140M   15G   1% /zookeeper
/dev/sdd                            30G  884M   30G   3% /kafka-logs
/dev/sdc                            50G  3.3G   47G   7% /opt
/dev/sdb                            50G   29G   22G  57% /var/lib/docker
/dev/sdi                            25G  294M   25G   2% /graphdb
/dev/sdh                            50G   34G   17G  68% /opensearch
/dev/loop6                          64M   64M     0 100% /snap/core20/2379
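
To quickly flag any filesystem crossing the 80% threshold, the below one-liner is a minimal sketch (assumes GNU df, as on Ubuntu; read-only squashfs /snap mounts always show 100% and can be ignored):

# Print a warning for every filesystem at or above 80% usage
df --output=pcent,target | awk 'NR>1 {gsub(/%/,"",$1); if ($1+0 >= 80) print "WARNING: "$2" is at "$1"%"}'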
  • Check that all MariaDB nodes are in sync on an HA setup using the below commands before starting the upgrade

Tip

Please run the below commands on the VM host where the RDAF deployment CLI was installed and the rdafk8s setup command was run. The MariaDB configuration is read from the /opt/rdaf/rdaf.cfg file.

MARIADB_HOST=`cat /opt/rdaf/rdaf.cfg | grep -A3 haproxy| grep advertised_external_host | awk '{print $3}'`
MARIADB_USER=`cat /opt/rdaf/rdaf.cfg | grep -A3 mariadb | grep user | awk '{print $3}' | base64 -d`
MARIADB_PASSWORD=`cat /opt/rdaf/rdaf.cfg | grep -A3 mariadb | grep password | awk '{print $3}' | base64 -d`

mysql -u$MARIADB_USER -p$MARIADB_PASSWORD -h $MARIADB_HOST -P3307 -e "show status like 'wsrep_local_state_comment';"

Please verify that the MariaDB cluster state is Synced.

+---------------------------+--------+
| Variable_name             | Value  |
+---------------------------+--------+
| wsrep_local_state_comment | Synced |
+---------------------------+--------+

Please run the below command and verify that the MariaDB cluster size is 3.

mysql -u$MARIADB_USER -p$MARIADB_PASSWORD -h $MARIADB_HOST -P3307 -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size';"
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 3     |
+--------------------+-------+
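
Optionally, both checks can be combined into a single pre-flight guard; the below is a minimal sketch reusing the variables set above:

# Abort early if the Galera cluster is not fully synced with all 3 nodes
STATE=$(mysql -u$MARIADB_USER -p$MARIADB_PASSWORD -h $MARIADB_HOST -P3307 -N -e "SHOW STATUS LIKE 'wsrep_local_state_comment';" | awk '{print $2}')
SIZE=$(mysql -u$MARIADB_USER -p$MARIADB_PASSWORD -h $MARIADB_HOST -P3307 -N -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size';" | awk '{print $2}')
if [ "$STATE" != "Synced" ] || [ "$SIZE" != "3" ]; then
    echo "MariaDB cluster is not ready (state=$STATE, size=$SIZE); do not proceed with the upgrade."
fi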

Warning

Make sure all of the above prerequisites are met before proceeding with the upgrade process.

Warning

Kubernetes: Although Kubernetes-based RDA Fabric deployments support zero-downtime upgrades, it is recommended to schedule a maintenance window for upgrading the RDAF Platform and AIOps services to the newer version.

Important

Please make sure a full backup of the RDAF platform system is completed before performing the upgrade.

Kubernetes: Please run the below backup command to take a backup of the application data.

rdafk8s backup --dest-dir <backup-dir>
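
For example, with an illustrative timestamped destination directory (adjust the path to your environment):

rdafk8s backup --dest-dir /opt/rdaf-backup/$(date +%Y-%m-%d)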

Run the below commands on the RDAF Management system and make sure the Kubernetes PODs are NOT restarting (applicable only to Kubernetes environments)

kubectl get pods -n rda-fabric -l app_category=rdaf-infra
kubectl get pods -n rda-fabric -l app_category=rdaf-platform
kubectl get pods -n rda-fabric -l app_component=rda-worker 
kubectl get pods -n rda-fabric -l app_name=oia 
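
To quickly surface any pod that is not Running or has restarted, the below is a minimal sketch:

# List rda-fabric pods that are not Running or show a non-zero restart count
kubectl get pods -n rda-fabric | awk 'NR>1 && ($3 != "Running" || $4+0 > 0)'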

  • Verify that the RDAF deployment CLI version is 1.4.1 on the VM where the CLI was installed to manage the on-premises Docker registry and the Kubernetes or non-Kubernetes deployments.
rdafk8s --version
RDAF CLI version: 1.4.1
  • On-premises Docker registry service version is 1.0.4
docker ps | grep docker-registry
0889e08f0871   docker1.cloudfabrix.io:443/external/docker-registry:1.0.4   "/entrypoint.sh /bin…"   7 days ago   Up 7 days             deployment-scripts-docker-registry-1
  • RDAF Infrastructure services version is 1.0.4, except for the below service.

  • rda-minio: version is RELEASE.2024-12-18T13-15-44Z

Run the below command to get rdafk8s Infra service details

rdafk8s infra status
+--------------------------+-----------------+-------------------+--------------+--------------------------------+
| Name                     | Host            | Status            | Container Id | Tag                            |
+--------------------------+-----------------+-------------------+--------------+--------------------------------+
| rda-nats                 | 192.168.108.114 | Up 19 Minutes ago | bbb50d2dacc5 | 1.0.4                          |
| rda-minio                | 192.168.108.114 | Up 19 Minutes ago | d26148d4bf44 | RELEASE.2024-12-18T13-15-44Z   |
| rda-mariadb              | 192.168.108.114 | Up 19 Minutes ago | 02975e0eec89 | 1.0.4                          |
| rda-opensearch           | 192.168.108.114 | Up 18 Minutes ago | 1494be76f694 | 1.0.4                          |
+--------------------------+-----------------+-------------------+--------------+--------------------------------+
  • RDAF Platform services version is 8.1.0.2 / 8.1.0.2.1

Run the below command to get RDAF Platform services details

rdafk8s platform status
+----------------+-----------------+-----------------+--------------+-----------+
| Name           | Host            | Status          | Container Id | Tag       |
+----------------+-----------------+-----------------+--------------+-----------+
| rda-api-server | 192.168.108.119 | Up 14 Hours ago | 2ca4370a175a | 8.1.0.2.1 |
| rda-api-server | 192.168.108.120 | Up 14 Hours ago | cce0d6bcba36 | 8.1.0.2.1 |
| rda-registry   | 192.168.108.120 | Up 14 Hours ago | e029a9ff96fe | 8.1.0.1   |
| rda-registry   | 192.168.108.119 | Up 14 Hours ago | eacbc82ae8c9 | 8.1.0.1   |
| rda-identity   | 192.168.108.120 | Up 14 Hours ago | 45409c977c7c | 8.1.0.1   |
| rda-identity   | 192.168.108.119 | Up 14 Hours ago | 584458932e2c | 8.1.0.1   |
+----------------+-----------------+-----------------+--------------+-----------+
  • RDAF Worker version is 8.1.0.2 / 8.1.0.2.1

Run the below command to get RDAF Worker details

rdafk8s worker status
+------------+----------------+------------+--------------+-----------+
| Name       | Host           | Status     | Container Id | Tag       |
+------------+----------------+------------+--------------+-----------+
| rda_worker | 192.168.125.63 | Up 7 weeks | cfe1fe65c692 | 8.1.0.2.1 |
+------------+----------------+------------+--------------+-----------+
  • RDAF OIA Application services version is 8.1.0.2 / 8.1.0.2.1

Run the below command to get RDAF App services details

rdafk8s app status
+-------------------------------+-----------------+-----------------+--------------+-----------+
| Name                          | Host            | Status          | Container Id | Tag       |
+-------------------------------+-----------------+-----------------+--------------+-----------+
| rda-alert-correlator          | 192.168.108.118 | Up 14 Hours ago | afdbbe6453e4 | 8.1.0.2.1 |
| rda-alert-correlator          | 192.168.108.117 | Up 14 Hours ago | 631b7978dcb0 | 8.1.0.2.1 |
| rda-alert-ingester            | 192.168.108.117 | Up 14 Hours ago | 33322e0b9cb9 | 8.1.0.2.1 |
| rda-alert-ingester            | 192.168.108.118 | Up 14 Hours ago | 8178c043bd04 | 8.1.0.2.1 |
| rda-alert-processor           | 192.168.108.117 | Up 14 Hours ago | b342b582ea1d | 8.1.0.2.1 |
| rda-alert-processor           | 192.168.108.118 | Up 14 Hours ago | b6f85413c2df | 8.1.0.2.1 |
+-------------------------------+-----------------+-----------------+--------------+-----------+

1.2. Upgrade Steps

1.2.1 Download the new Docker Images

Log in to the VM where the RDAF deployment CLI was installed to manage the on-premises Docker registry and the Kubernetes or non-Kubernetes deployment.

Download the new docker image tags for RDAF Platform and OIA (AIOps) Application services and wait until all of the images are downloaded.

To fetch the new image tags into the registry, use the below command

rdaf registry fetch --tag 8.1.0.3

Note

If the download of the images fails, please re-execute the above command

Run the below command to verify that the above-mentioned tags are downloaded for all of the RDAF Platform and OIA (AIOps) Application services.

rdaf registry list-tags 
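
To filter the output for the new tag specifically, a quick sketch:

rdaf registry list-tags | grep 8.1.0.3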

Please make sure the 8.1.0.3 image tag is downloaded for the below RDAF Platform services.

  • rda_asm
  • rda_collector
  • rda_api_server
  • rda_scheduler
  • rda_worker
  • rda_event_gateway
  • ubuntu-cfxdx-nb-nginx-all

Please make sure the 8.1.0.3 image tag is downloaded for the below RDAF OIA (AIOps) Application services.

  • rda-alert-ingester
  • rda-alert-processor
  • rda-alert-processor-companion
  • rda-event-consumer
  • rda-collaboration
  • rda-config-service
  • rda-irm-service

Downloaded Docker images are stored under the below path.

/opt/rdaf-registry/data/docker/registry/v2/ or /opt/rdaf/data/docker/registry/v2/

Run the below command to check the filesystem's disk usage on offline registry VM where docker images are pulled.

df -h /opt

If necessary, older image tags that are no longer in use can be deleted to free up disk space using the command below.

Note

Run the command below if /opt usage exceeds 80% or if the free capacity of /opt is less than 25GB.

rdaf registry delete-images --tag <tag1,tag2>
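
A minimal sketch of that threshold check (assumes GNU df):

# Warn when /opt is over 80% used or has less than 25 GB free
USED_PCT=$(df --output=pcent /opt | tail -1 | tr -d ' %')
AVAIL_GB=$(df -BG --output=avail /opt | tail -1 | tr -d ' G')
if [ "$USED_PCT" -gt 80 ] || [ "$AVAIL_GB" -lt 25 ]; then
    echo "/opt is low on space (${USED_PCT}% used, ${AVAIL_GB}G free); consider deleting unused image tags."
fi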

1.2.2 Upgrade RDAF Platform Services

Step-1: Run the below command to initiate upgrading the following RDAF Platform services.

rdafk8s platform upgrade --tag 8.1.0.3 --service rda-collector --service rda-asm --service rda-api-server --service rda-scheduler

Because the upgrade procedure is non-disruptive, it puts the currently running PODs into Terminating state and brings up the newer-version PODs in Pending state.

Step-2: Run the below command to check the status of the existing and newer PODs and make sure at least one instance of each Platform service is in Terminating state.

kubectl get pods -n rda-fabric -l app_category=rdaf-platform

Step-3: Run the below command to put all Terminating RDAF platform service PODs into maintenance mode. It lists the POD IDs of the platform services along with the rdac maintenance command required to put them into maintenance mode.

python maint_command.py

Note

If maint_command.py script doesn't exist on RDAF deployment CLI VM, it can be downloaded using the below command.

wget https://macaw-amer.s3.amazonaws.com/releases/rdaf-platform/1.1.6/maint_command.py

Step-4: Copy and paste the rdac maintenance command as shown below.

rdac maintenance start --ids <comma-separated-list-of-platform-pod-ids>
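
For example, with illustrative pod IDs (use the actual IDs printed by maint_command.py):

rdac maintenance start --ids 9c0484af,196558ed,bcbdaae5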

Step-5: Run the below command to verify the maintenance mode status of the RDAF platform services.

rdac pods --show_maintenance | grep False

Step-6: Run the below command to delete the Terminating RDAF platform service PODs

for i in `kubectl get pods -n rda-fabric -l app_category=rdaf-platform | grep 'Terminating' | awk '{print $1}'`; do kubectl delete pod $i -n rda-fabric --force; done

Note

Wait for 120 seconds and repeat the above steps, Step-2 through Step-6, for the rest of the RDAF Platform service PODs.
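
Between iterations, the below sketch can poll until no platform pods remain in Terminating state:

# Poll until no rdaf-platform pods are left in Terminating state
while kubectl get pods -n rda-fabric -l app_category=rdaf-platform | grep -q 'Terminating'; do
    sleep 10
done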

Please wait until the new platform service PODs are in Running state, then run the below command to verify their status and make sure they are running the 8.1.0.3 version.

rdafk8s platform status
+----------------------+-----------------+----------------+--------------+----------+
| Name                 | Host            | Status         | Container Id | Tag      |
+----------------------+-----------------+----------------+--------------+----------+
| rda-api-server       | 192.168.108.117 | Up 5 Hours ago | fda327eb1c5b | 8.1.0.3  |
| rda-api-server       | 192.168.108.118 | Up 5 Hours ago | 4ea8906a4fbc | 8.1.0.3  |
| rda-registry         | 192.168.108.117 | Up 1 Days ago  | 2b40e501b898 | 8.1.0.1  |
| rda-registry         | 192.168.108.118 | Up 1 Days ago  | 2ab897175da3 | 8.1.0.1  |
| rda-identity         | 192.168.108.117 | Up 1 Days ago  | 772db9d128e4 | 8.1.0.1  |
| rda-identity         | 192.168.108.118 | Up 1 Days ago  | f5f947988c7a | 8.1.0.1  |
| rda-fsm              | 192.168.108.118 | Up 1 Days ago  | 36cb8d7c2fd2 | 8.1.0.1  |
| rda-fsm              | 192.168.108.117 | Up 1 Days ago  | f0bdaad2af77 | 8.1.0.1  |
| rda-asm              | 192.168.108.117 | Up 5 Hours ago | ac3226389b66 | 8.1.0.3  |
| rda-asm              | 192.168.108.118 | Up 5 Hours ago | f93533c4805a | 8.1.0.3  |
| rda-chat-helper      | 192.168.108.118 | Up 1 Days ago  | 275bdcf1b39a | 8.1.0.1  |
| rda-chat-helper      | 192.168.108.117 | Up 1 Days ago  | 3f82a2bb8c77 | 8.1.0.1  |
| rda-access-manager   | 192.168.108.117 | Up 1 Days ago  | 4570536616a4 | 8.1.0.1  |
| rda-access-manager   | 192.168.108.118 | Up 1 Days ago  | 5f8d95194a0e | 8.1.0.1  |
| rda-resource-manager | 192.168.108.118 | Up 1 Days ago  | 01b77acafb0f | 8.1.0.1  |
| rda-resource-manager | 192.168.108.117 | Up 1 Days ago  | db544835c22a | 8.1.0.1  |
| rda-scheduler        | 192.168.108.118 | Up 1 Days ago  | 2103b3a7f586 | 8.1.0.3  |
| rda-scheduler        | 192.168.108.117 | Up 1 Days ago  | 81de432b6ab3 | 8.1.0.3  |
| rda-collector        | 192.168.108.118 | Up 5 Hours ago | b0527f543a3f | 8.1.0.3  |
| rda-collector        | 192.168.108.117 | Up 5 Hours ago | bc0c48539795 | 8.1.0.3  |
+----------------------+-----------------+----------------+--------------+----------+

Run the below command and check that one rda-scheduler service instance is elected as leader (shown under the Site column).

rdac pods
+-------+----------------------------------------+-------------+----------------+----------+-------------+----------+--------+--------------+---------------+--------------+
| Cat   | Pod-Type                               | Pod-Ready   | Host           | ID       | Site        | Age      |   CPUs |   Memory(GB) | Active Jobs   | Total Jobs   |
|-------+----------------------------------------+-------------+----------------+----------+-------------+----------+--------+--------------+---------------+--------------|
| Infra | api-server                             | True        | rda-api-server | 9c0484af |             | 11:41:50 |      8 |        31.33 |               |              |
| Infra | api-server                             | True        | rda-api-server | 196558ed |             | 11:40:23 |      8 |        31.33 |               |              |
| Infra | asm                                    | True        | rda-asm-5b8fb9 | bcbdaae5 |             | 11:42:26 |      8 |        31.33 |               |              |
| Infra | asm                                    | True        | rda-asm-5b8fb9 | 232a58af |             | 11:42:40 |      8 |        31.33 |               |              |
| Infra | collector                              | True        | rda-collector- | d06fb56c |             | 11:42:03 |      8 |        31.33 |               |              |
| Infra | collector                              | True        | rda-collector- | a4c79e4c |             | 11:41:59 |      8 |        31.33 |               |              |
| Infra | registry                               | True        | rda-registry-6 | 2fd69950 |             | 11:42:03 |      8 |        31.33 |               |              |
| Infra | registry                               | True        | rda-registry-6 | fac544d6 |             | 11:41:59 |      8 |        31.33 |               |              |
| Infra | scheduler                              | True        | rda-scheduler- | b98afe88 | *leader*    | 11:42:01 |      8 |        31.33 |               |              |
| Infra | scheduler                              | True        | rda-scheduler- | e25a0841 |             | 11:41:56 |      8 |        31.33 |               |              |
| Infra | worker                                 | True        | rda-worker-5b5 | 99bd054e | rda-site-01 | 11:33:40 |      8 |        31.33 | 0             | 0            |
| Infra | worker                                 | True        | rda-worker-5b5 | 0bfdcd98 | rda-site-01 | 11:33:34 |      8 |        31.33 | 0             | 0            |
+-------+----------------------------------------+-------------+----------------+----------+-------------+----------+--------+--------------+---------------+--------------+

Run the below command to check that all services report an ok status and do not throw any failure messages.

rdac healthcheck
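
To surface only problem rows, the healthcheck output can be filtered; a simple sketch:

# Hide rows whose Status column is ok (headers and borders remain visible)
rdac healthcheck | grep -v '| ok '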

1.2.3 Upgrade RDA Worker Services

Note

If the worker was deployed in an HTTP proxy environment, please make sure the required HTTP proxy environment variables are added to the /opt/rdaf/deployment-scripts/values.yaml file under the rda_worker configuration section, as shown below, before upgrading the RDA Worker services.

rda_worker:
  terminationGracePeriodSeconds: 300
  replicas: 6
  sizeLimit: 1024Mi
  privileged: true
  resources:
    requests:
      memory: 100Mi
    limits:
      memory: 24Gi
  env:
    WORKER_GROUP: rda-prod-01
    CAPACITY_FILTER: cpu_load1 <= 7.0 and mem_percent < 95
    MAX_PROCESSES: '1000'
    RDA_ENABLE_TRACES: 'no'
    WORKER_PUBLIC_ACCESS: 'true'
    DISABLE_REMOTE_LOGGING_CONTROL: 'no'
    RDA_SELF_HEALTH_RESTART_AFTER_FAILURES: 3
  extraEnvs:
  - name: http_proxy
    value: http://test:[email protected]:3128
  - name: https_proxy
    value: http://test:[email protected]:3128
  - name: HTTP_PROXY
    value: http://test:[email protected]:3128
  - name: HTTPS_PROXY
    value: http://test:[email protected]:3128
  ....
  ....
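
After the upgrade, the proxy settings can be spot-checked inside a running worker pod; a sketch:

# Verify the proxy variables are visible inside one of the worker pods
POD=$(kubectl get pods -n rda-fabric -l app_component=rda-worker -o jsonpath='{.items[0].metadata.name}')
kubectl exec -n rda-fabric $POD -- env | grep -i proxy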

Step-1: Please run the below command to initiate upgrading the RDA Worker service PODs.

rdafk8s worker upgrade --tag 8.1.0.3

Step-2: Run the below command to check the status of the existing and newer PODs and make sure at least one instance of each RDA Worker service POD is in Terminating state.

kubectl get pods -n rda-fabric -l app_component=rda-worker
NAME                          READY   STATUS    RESTARTS   AGE
rda-worker-7c5b47bfc8-fdwb4   1/1     Running   0          6m23s
rda-worker-7c5b47bfc8-pcghs   1/1     Running   0          5m8s

Step-3: Run the below command to put all Terminating RDAF worker service PODs into maintenance mode. It lists the POD IDs of the RDA worker services along with the rdac maintenance command required to put them into maintenance mode.

python maint_command.py

Step-4: Copy and paste the rdac maintenance command as shown below.

rdac maintenance start --ids <comma-separated-list-of-worker-pod-ids>

Step-5: Run the below command to verify the maintenance mode status of the RDAF worker services.

rdac pods --show_maintenance | grep False

Step-6: Run the below command to delete the Terminating RDAF worker service PODs

for i in `kubectl get pods -n rda-fabric -l app_component=rda-worker | grep 'Terminating' | awk '{print $1}'`; do kubectl delete pod $i -n rda-fabric --force; done

Note

Wait for 120 seconds between iterations and repeat the above steps, Step-2 through Step-6, for the rest of the RDAF worker service PODs.

Step-7: Please wait for 120 seconds to let the newer version of RDA Worker service PODs join the RDA Fabric appropriately. Run the below commands to verify the status of the newer RDA Worker service PODs.

rdac pods | grep rda-worker
rdafk8s worker status
+------------+----------------+-------------------+--------------+---------+
| Name       | Host           | Status            | Container Id | Tag     |
+------------+----------------+-------------------+--------------+---------+
| rda-worker | 192.168.108.17 |  Up 6 Minutes ago | cfcca2c11c9a | 8.1.0.3 |
| rda-worker | 192.168.108.18 |  Up 5 Minutes ago | 209598b9d921 | 8.1.0.3 |
+------------+----------------+-------------------+--------------+---------+

Step-8: Run the below command to check that all RDA Worker services report an ok status and do not throw any failure messages.

rdac healthcheck

1.2.4 Update Environment Variables in values.yaml

Alert Ingester - Add Environment Variables

  • Before upgrading the Alert Ingester service, ensure the following environment variable is added under the cfx-rda-alert-ingester section of the values.yaml file (file path: /opt/rdaf/deployment-scripts/values.yaml).

  • Environment variable to add: INBOUND_PARTITION_WORKERS_MAX

cfx-rda-alert-ingester:
  mem_limit: 6G
  memswap_limit: 6G
  privileged: true
  environment:
    DISABLE_REMOTE_LOGGING_CONTROL: 'no'
    RDA_ENABLE_TRACES: 'yes'
    RDA_SELF_HEALTH_RESTART_AFTER_FAILURES: 3
    INBOUND_PARTITION_WORKERS_MAX: 3
  hosts:
  - 192.168.109.53
  - 192.168.109.54
  cap_add:
  - SYS_PTRACE
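
To confirm the variable is in place before upgrading, a quick sketch:

# Check that INBOUND_PARTITION_WORKERS_MAX appears under cfx-rda-alert-ingester
grep -A12 'cfx-rda-alert-ingester' /opt/rdaf/deployment-scripts/values.yaml | grep INBOUND_PARTITION_WORKERS_MAX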

1.2.5 Upgrade OIA Application Services

Step-1: Run the below commands to initiate upgrading the following RDAF OIA Application services

rdafk8s app upgrade OIA --tag 8.1.0.3 --service rda-configuration-service --service rda-alert-ingester --service rda-alert-processor --service rda-alert-correlator --service rda-alert-processor-companion --service rda-event-consumer --service rda-collaboration --service rda-irm-service

Step-2: Run the below command to check the status of the newly upgraded PODs.

kubectl get pods -n rda-fabric -l app_name=oia

Step-3: Run the below command to put all Terminating OIA application service PODs into maintenance mode. It lists the POD IDs of the OIA application services along with the rdac maintenance command required to put them into maintenance mode.

python maint_command.py

Step-4: Copy and paste the rdac maintenance command as shown below.

rdac maintenance start --ids <comma-separated-list-of-oia-app-pod-ids>

Step-5: Run the below command to verify the maintenance mode status of the OIA application services.

rdac pods --show_maintenance | grep False

Step-6: Run the below command to delete the Terminating OIA application service PODs

for i in `kubectl get pods -n rda-fabric -l app_name=oia | grep 'Terminating' | awk '{print $1}'`; do kubectl delete pod $i -n rda-fabric --force; done
kubectl get pods -n rda-fabric -l app_name=oia

Note

Wait for 120 seconds and repeat the above steps, Step-2 through Step-6, for the rest of the OIA application service PODs.

Please wait until all of the new OIA application service PODs are in Running state, then run the below command to verify their status and make sure they are running the 8.1.0.3 version.

rdafk8s app status
+-------------------------------+-----------------+-----------------+--------------+-----------+
| Name                          | Host            | Status          | Container Id | Tag       |
+-------------------------------+-----------------+-----------------+--------------+-----------+
| rda-alert-correlator          | 192.168.108.120 | Up 5 Hours ago  | de58c823d265 | 8.1.0.3   |
| rda-alert-correlator          | 192.168.108.119 | Up 5 Hours ago  | 7ccfb9832d63 | 8.1.0.3   |
| rda-alert-ingester            | 192.168.108.120 | Up 5 Hours ago  | d9722596015a | 8.1.0.3   |
| rda-alert-ingester            | 192.168.108.119 | Up 5 Hours ago  | 2d73cfed8226 | 8.1.0.3   |
| rda-alert-processor           | 192.168.108.120 | Up 5 Hours ago  | 3349c4455841 | 8.1.0.3   |
| rda-alert-processor           | 192.168.108.119 | Up 5 Hours ago  | 3f17dde3eed2 | 8.1.0.3   |
| rda-alert-processor-companion | 192.168.108.119 | Up 5 Hours ago  | ec87f1383f2a | 8.1.0.3   |
| rda-alert-processor-companion | 192.168.108.120 | Up 5 Hours ago  | eda5b39c3da1 | 8.1.0.3   |
| rda-app-controller            | 192.168.108.119 | Up 23 Hours ago | cb51cf3875ad | 8.1.0.1   |
| rda-app-controller            | 192.168.108.120 | Up 23 Hours ago | 83b2d405f6ee | 8.1.0.1   |
| rda-collaboration             | 192.168.108.119 | Up 5 Hours ago  | a16102be5b3f | 8.1.0.3   |
| rda-collaboration             | 192.168.108.120 | Up 5 Hours ago  | b9779202b517 | 8.1.0.3   |
| rda-configuration-service     | 192.168.108.119 | Up 23 Hours ago | 2666a70fd84b | 8.1.0.3   |
| rda-configuration-service     | 192.168.108.120 | Up 23 Hours ago | fa90a76ec426 | 8.1.0.3   |
| rda-event-consumer            | 192.168.108.120 | Up 5 Hours ago  | 339cb5f787a7 | 8.1.0.3   |
| rda-event-consumer            | 192.168.108.119 | Up 5 Hours ago  | 85a539443123 | 8.1.0.3   |
+-------------------------------+-----------------+-----------------+--------------+-----------+

Step-7: Run the below command to verify all OIA application services are up and running.

rdac pods
+-------+----------------------------------------+-------------+----------------+----------+-------------+----------+--------+--------------+---------------+--------------+
| Cat   | Pod-Type                               | Pod-Ready   | Host           | ID       | Site        | Age      |   CPUs |   Memory(GB) | Active Jobs   | Total Jobs   |
|-------+----------------------------------------+-------------+----------------+----------+-------------+----------+--------+--------------+---------------+--------------|
| App   | alert-ingester                         | True        | rda-alert-inge | 6a6e464d |             | 19:19:06 |      8 |        31.33 |               |              |
| App   | alert-ingester                         | True        | rda-alert-inge | 7f6b42a0 |             | 19:19:23 |      8 |        31.33 |               |              |
| App   | alert-processor                        | True        | rda-alert-proc | a880e491 |             | 19:19:51 |      8 |        31.33 |               |              |
| App   | alert-processor                        | True        | rda-alert-proc | b684609e |             | 19:19:48 |      8 |        31.33 |               |              |
| App   | alert-processor-companion              | True        | rda-alert-proc | 874f3b33 |             | 19:18:54 |      8 |        31.33 |               |              |
| App   | alert-processor-companion              | True        | rda-alert-proc | 70cadaa7 |             | 19:18:35 |      8 |        31.33 |               |              |
| App   | asset-dependency                       | True        | rda-asset-depe | bde06c15 |             | 19:44:20 |      8 |        31.33 |               |              |
| App   | asset-dependency                       | True        | rda-asset-depe | 47b9eb02 |             | 19:44:08 |      8 |        31.33 |               |              |
| App   | authenticator                          | True        | rda-identity-d | faa33e1b |             | 19:44:22 |      8 |        31.33 |               |              |
| App   | authenticator                          | True        | rda-identity-d | 36083c36 |             | 19:44:16 |      8 |        31.33 |               |              |
| App   | cfx-app-controller                     | True        | rda-app-contro | 5fd3c3f4 |             | 19:19:39 |      8 |        31.33 |               |              |
| App   | cfx-app-controller                     | True        | rda-app-contro | d66e5ce8 |             | 19:19:26 |      8 |        31.33 |               |              |
| App   | cfxdimensions-app-access-manager       | True        | rda-access-man | ecbb535c |             | 19:44:16 |      8 |        31.33 |               |              |
| App   | cfxdimensions-app-access-manager       | True        | rda-access-man | 9a05db5a |             | 19:44:06 |      8 |        31.33 |               |              |
| App   | cfxdimensions-app-collaboration        | True        | rda-collaborat | 61b3c53b |             | 19:18:48 |      8 |        31.33 |               |              |
| App   | cfxdimensions-app-collaboration        | True        | rda-collaborat | 09b9474e |             | 19:18:27 |      8 |        31.33 |               |              |
+-------+----------------------------------------+-------------+----------------+----------+-------------+----------+--------+--------------+---------------+--------------+

Run the below command to check that all services report an ok status and do not throw any failure messages.

rdac healthcheck
+-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-----------------------------------------------------------------------------------------------------------------------------+
| Cat       | Pod-Type                               | Host         | ID       | Site        | Health Parameter                                    | Status   | Message                                                                                                                     |
|-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-----------------------------------------------------------------------------------------------------------------------------|
| rda_app   | alert-ingester                         | rda-alert-in | 6a6e464d |             | service-status                                      | ok       |                                                                                                                             |
| rda_app   | alert-ingester                         | rda-alert-in | 6a6e464d |             | nats-connectivity                                   | ok       |                                                                                                                             |
| rda_app   | alert-ingester                         | rda-alert-in | 6a6e464d |             | service-dependency:configuration-service            | ok       | 2 pod(s) found for configuration-service                                                                                    |
| rda_app   | alert-ingester                         | rda-alert-in | 6a6e464d |             | service-initialization-status                       | ok       |                                                                                                                             |
| rda_app   | alert-ingester                         | rda-alert-in | 6a6e464d |             | kafka-connectivity                                  | ok       | Cluster=dKnnkaYSPELK8DBUk0rPig, Broker=0, Brokers=[0, 1, 2]                                                                 |
| rda_app   | alert-ingester                         | rda-alert-in | 6a6e464d |             | kafka-consumer                                      | ok       | Health: [{'387c0cb507b84878b9d0b15222cb4226.inbound-events': 0, '387c0cb507b84878b9d0b15222cb4226.mapped-events': 0}, {}]   |
| rda_app   | alert-ingester                         | rda-alert-in | 7f6b42a0 |             | service-status                                      | ok       |                                                                                                                             |
| rda_app   | alert-ingester                         | rda-alert-in | 7f6b42a0 |             | nats-connectivity                                   | ok       |                                                                                                                             |
| rda_app   | alert-ingester                         | rda-alert-in | 7f6b42a0 |             | service-dependency:configuration-service            | ok       | 2 pod(s) found for configuration-service                                                                                    |
| rda_app   | alert-ingester                         | rda-alert-in | 7f6b42a0 |             | service-initialization-status                       | ok       |                                                                                                                             |
| rda_app   | alert-ingester                         | rda-alert-in | 7f6b42a0 |             | kafka-consumer                                      | ok       | Health: [{'387c0cb507b84878b9d0b15222cb4226.inbound-events': 0, '387c0cb507b84878b9d0b15222cb4226.mapped-events': 0}, {}]   |
| rda_app   | alert-ingester                         | rda-alert-in | 7f6b42a0 |             | kafka-connectivity                                  | ok       | Cluster=dKnnkaYSPELK8DBUk0rPig, Broker=1, Brokers=[0, 1, 2]                                                                 |
| rda_app   | alert-processor                        | rda-alert-pr | a880e491 |             | service-status                                      | ok       |                                                                                                                             |
| rda_app   | alert-processor                        | rda-alert-pr | a880e491 |             | nats-connectivity                                   | ok       |                                                                                                                             |
| rda_app   | alert-processor                        | rda-alert-pr | a880e491 |             | service-dependency:cfx-app-controller               | ok       | 2 pod(s) found for cfx-app-controller                                                                                       |
| rda_app   | alert-processor                        | rda-alert-pr | a880e491 |             | service-dependency:configuration-service            | ok       | 2 pod(s) found for configuration-service                                                                                    |
| rda_app   | alert-processor                        | rda-alert-pr | a880e491 |             | service-initialization-status                       | ok       |                                                                                                                             |
| rda_app   | alert-processor                        | rda-alert-pr | a880e491 |             | kafka-connectivity                                  | ok       | Cluster=dKnnkaYSPELK8DBUk0rPig, Broker=1, Brokers=[0, 1, 2]                                                                 |
| rda_app   | alert-processor                        | rda-alert-pr | a880e491 |             | DB-connectivity                                     | ok       |                                                                                                                             |
+-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-----------------------------------------------------------------------------------------------------------------------------+

1.2.6 RDA Studio Upgrade

Open the rda-studio.yml file, update the existing image tag to 8.1.0.3 so that it matches the format shown in the example below, and then save the file

services:
  cfxdx:
    image: docker1.cloudfabrix.io:443/external/ubuntu-cfxdx-nb-nginx-all:8.1.0.3
    restart: unless-stopped
    volumes:
    - /opt/rdaf/cfxdx/home/:/root
    - /opt/rdaf/cfxdx/config/:/tmp/config/
    - /opt/rdaf/cfxdx/output:/tmp/output/
    - /opt/rdaf/config/network_config/:/network_config
    ports:
    - "9998:9998"
    environment:
      #JUPYTER_TOKEN: cfxdxdemo
      NLTK_DATA: "/root/nltk_data"
      CFXDX_CONFIG_FILE: /tmp/config/conf.yml
      RDA_NETWORK_CONFIG: /network_config/config.json
      RDA_USER: xxxxxxx
      RDA_PASSWORD: xxxxxxxxxx

After updating the rda-studio.yml file to set the tag version to 8.1.0.3, execute the following commands to pull the new image and restart the service

docker-compose -f rda-studio.yml pull
docker-compose -f rda-studio.yml up -d
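
To confirm the Studio container came back up on the new tag, a quick sketch:

# Show the image and status of the running Studio (cfxdx) container
docker ps --format '{{.Image}}\t{{.Status}}' | grep cfxdx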

1.2.7 Upgrade Event Gateway Services

Note

This upgrade applies only to non-Kubernetes deployments

Step 1. Prerequisites

  • An Event Gateway with the 8.1.0.2 / 8.1.0.2.1 tag should already be installed

Note

If the Event Gateway was deployed using the RDAF CLI, follow Step 2 and skip Step 3. If it was not deployed using the RDAF CLI, go to Step 3.

Step 2. Upgrade Event Gateway Using RDAF CLI

  • To upgrade the Event Gateway, log in to the RDAF CLI VM and execute the following command.
rdaf event_gateway upgrade --tag 8.1.0.3

Step 3. Upgrade Event Gateway Using Docker Compose File

  • Login to the Event Gateway installed VM

  • Navigate to the location where Event Gateway was previously installed, using the following command

    cd /opt/rdaf/event_gateway
    
  • Edit the Docker Compose file for the Event Gateway using a local editor (e.g., vi), update the tag, and save it

    vi event-gateway-docker-compose.yml
    
    version: '3.1'
    services:
      rda_event_gateway:
        image: docker1.cloudfabrix.io:443/external/ubuntu-rda-event-gateway:8.1.0.3
        restart: always
        network_mode: host
        mem_limit: 6G
        memswap_limit: 6G
        volumes:
        - /opt/rdaf/network_config:/network_config
        - /opt/rdaf/event_gateway/config:/event_gw_config
        - /opt/rdaf/event_gateway/certs:/certs
        - /opt/rdaf/event_gateway/logs:/logs
        - /opt/rdaf/event_gateway/log_archive:/tmp/log_archive
        logging:
          driver: "json-file"
          options:
            max-size: "25m"
            max-file: "5"
        environment:
          RDA_NETWORK_CONFIG: /network_config/rda_network_config.json
          EVENT_GW_MAIN_CONFIG: /event_gw_config/main/main.yml
          EVENT_GW_SNMP_TRAP_CONFIG: /event_gw_config/snmptrap/trap_template.json
          EVENT_GW_SNMP_TRAP_ALERT_CONFIG: /event_gw_config/snmptrap/trap_to_alert_go.yaml
          AGENT_GROUP: event_gateway_site01
          EVENT_GATEWAY_CONFIG_DIR: /event_gw_config
          LOGGER_CONFIG_FILE: /event_gw_config/main/logging.yml
    
  • Please run the following commands

    docker-compose -f event-gateway-docker-compose.yml down
    docker-compose -f event-gateway-docker-compose.yml pull
    docker-compose -f event-gateway-docker-compose.yml up -d
    
  • Use the command as shown below to ensure that the RDA docker instances are up and running.

    docker ps -a | grep event
    
  • Use the below mentioned command to check the docker logs for any errors

    docker logs -f --tail 200 <event-gateway-container-id>
    
Run the below command to verify that the Event Gateway services are running with the 8.1.0.3 tag.

rdaf event_gateway status
+-------------------+-----------------+-------------+--------------+---------+
| Name              | Host            | Status      | Container Id | Tag     |
+-------------------+-----------------+-------------+--------------+---------+
| rda_event_gateway | 192.168.108.127 | Up 42 hours | c22b1cf6900e | 8.1.0.3 |
| rda_event_gateway | 192.168.108.128 | Up 42 hours | 36b86a7bdff3 | 8.1.0.3 |
+-------------------+-----------------+-------------+--------------+---------+

1.2.8 Prune Images

After upgrading the services, run the below command on the RDAF CLI VM to clean up unused Docker images. This helps free disk space.

rdafk8s prune_images
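
To see how much disk space Docker images occupy before and after pruning, a quick sketch:

# Summarize Docker disk usage (images, containers, volumes)
docker system df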

1.3. Post Upgrade Steps

Note

To get the latest OIA Alerts and Incidents dashboard changes, please activate the Fabrix AIOps Fault Management Base pack, Version 9.0.15

  • Option 1: Upload the pack from the catalog (Go to Main Menu --> Configuration --> RDA Administration --> Packs --> Click Upload Packs from Catalog) and activate it for the latest dashboard changes for OIA Alerts and Incidents. Ensure you select the correct version:

  • Fabrix AIOps Fault Management Base Version 9.0.15

  • Option 2: Download the Fabrix AIOps Fault Management Base Version 9.0.15 pack from the following Link, then upload it (Go to Main Menu --> Configuration --> RDA Administration --> Packs --> Click Upload Pack) and activate it. Ensure you select the correct version.