1. Upgrade From 8.0.0 to 8.1.1
- RDAF Infra Upgrade: 1.0.3 / 1.0.3.3 (haproxy) to 1.0.4
- RDAF Platform: From 8.0.0 to 8.1.1
- OIA (AIOps) Application: From 8.0.0 to 8.1.1
- RDAF Deployment rdaf CLI: From 1.4.0 to 1.4.2
- RDAF Client rdac CLI: From 8.0.0 to 8.1.1
1.1. Prerequisites
Before proceeding with this upgrade, please verify that the below prerequisites are met.
- RDAF Deployment CLI version: 1.4.0
- Infra Services tag: 1.0.3 / 1.0.3.3 (haproxy)
- Platform Services and RDA Worker tag: 8.0.0
- OIA Application Services tag: 8.0.0
- Each OpenSearch node requires an additional 100 GB of disk space to support both the ingestion of new alert payloads and the migration of alert history data to the pstream.
- CloudFabrix recommends taking VMware VM snapshots where RDA Fabric infra/platform/applications are deployed.
Note
- Check the disk space of all the Platform and Service VMs using the below mentioned command; the highlighted disk usage should be less than 80%
rdauser@oia-125-216:~/collab-3.7-upgrade$ df -kh
Filesystem Size Used Avail Use% Mounted on
udev 32G 0 32G 0% /dev
tmpfs 6.3G 357M 6.0G 6% /run
/dev/mapper/ubuntu--vg-ubuntu--lv 48G 12G 34G 26% /
tmpfs 32G 0 32G 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 32G 0 32G 0% /sys/fs/cgroup
/dev/loop0 64M 64M 0 100% /snap/core20/2318
/dev/loop2 92M 92M 0 100% /snap/lxd/24061
/dev/sda2 1.5G 309M 1.1G 23% /boot
/dev/sdf 50G 3.8G 47G 8% /var/mysql
/dev/loop3 39M 39M 0 100% /snap/snapd/21759
/dev/sdg 50G 541M 50G 2% /minio-data
/dev/loop4 92M 92M 0 100% /snap/lxd/29619
/dev/loop5 39M 39M 0 100% /snap/snapd/21465
/dev/sde 15G 140M 15G 1% /zookeeper
/dev/sdd 30G 884M 30G 3% /kafka-logs
/dev/sdc 50G 3.3G 47G 7% /opt
/dev/sdb 50G 29G 22G 57% /var/lib/docker
/dev/sdi 25G 294M 25G 2% /graphdb
/dev/sdh 50G 34G 17G 68% /opensearch
/dev/loop6 64M 64M 0 100% /snap/core20/2379
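As a quick way to apply the 80% rule above, any filesystem over the threshold can be flagged with standard tools. This is a generic df/awk sketch, not an RDAF-specific command; the `--output` option assumes GNU coreutils df.

```shell
# Print a warning for any mounted filesystem above 80% usage.
# Uses GNU df's --output option; adjust if your df lacks it.
df -k --output=pcent,target | awk 'NR > 1 { gsub("%", "", $1); if ($1 + 0 > 80) print "WARNING:", $2, "at", $1"%" }'
```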
Warning
Make sure all of the above prerequisites are met before proceeding with the upgrade process.
Warning
Non-Kubernetes: Upgrading RDAF Platform and AIOps application services is a disruptive operation. Schedule a maintenance window before upgrading RDAF Platform and AIOps services to the newer version.
Important
Please make sure a full backup of the RDAF platform system is completed before performing the upgrade.
Non-Kubernetes: Please run the below backup command to take a backup of the application data.
Note: Please make sure this backup-dir is mounted across all infra and CLI VMs.
- Verify that the RDAF deployment rdafcli version is 1.4.0 on the VM where the CLI was installed for the docker on-prem registry managing Non-Kubernetes deployments.
- On-premise docker registry service version is 1.0.3
ff6b1de8515f cfxregistry.CloudFabrix.io:443/docker-registry:1.0.3 "/entrypoint.sh /bin…" 7 days ago Up 7 days deployment-scripts-docker-registry-1
- RDAF Infrastructure services version is 1.0.3, except for the below services:
  - rda-minio: RELEASE.2023-09-30T07-02-29Z
  - haproxy: 1.0.3.3
Run the below command to get RDAF Infra service details
+---------------+----------------+-------------+--------------+------------------------------+
| Name | Host | Status | Container Id | Tag |
+---------------+----------------+-------------+--------------+------------------------------+
| nats | 192.168.125.63 | Up 2 months | aff2eb1f37c9 | 1.0.3 |
| minio         | 192.168.125.63 | Up 2 months | ed6bb3ea036a | RELEASE.2023-09-30T07-02-29Z |
| mariadb | 192.168.125.63 | Up 2 months | 616a98d6471c | 1.0.3 |
| opensearch | 192.168.125.63 | Up 2 months | 7edeede52a9b | 1.0.3 |
| kafka | 192.168.125.63 | Up 2 months | d1426429da4c | 1.0.3 |
+---------------+----------------+-------------+--------------+------------------------------+
- RDAF Platform services version is 8.0.0
Run the below command to get RDAF Platform services details
+----------------+----------------+------------+--------------+-------+
| Name | Host | Status | Container Id | Tag |
+----------------+----------------+------------+--------------+-------+
| rda_api_server | 192.168.125.63 | Up 7 weeks | c6500e23738f | 8.0.0 |
| rda_registry | 192.168.125.63 | Up 7 weeks | 34f008691fd4 | 8.0.0 |
| rda_scheduler | 192.168.125.63 | Up 7 weeks | 8b358f65a7d3 | 8.0.0 |
| rda_collector | 192.168.125.63 | Up 7 weeks | 1888441693c0 | 8.0.0 |
| rda_identity | 192.168.125.63 | Up 7 weeks | 10e43ae93430 | 8.0.0 |
| rda_asm | 192.168.125.63 | Up 7 weeks | f98c2c79539a | 8.0.0 |
| rda_fsm | 192.168.125.63 | Up 7 weeks | a3a8634a2f2d | 8.0.0 |
+----------------+----------------+------------+--------------+-------+
- RDAF worker services version is 8.0.0
Run the below command to get RDAF worker services details
+------------+--------------+------------+--------------+---------+
| Name | Host | Status | Container Id | Tag |
+------------+--------------+------------+--------------+---------+
| rda_worker | 10.95.125.63 | Up 7 weeks | dr65357r76t3 | 8.0.0 |
+------------+--------------+------------+--------------+---------+
- RDAF OIA Application services version is 8.0.0
Run the below command to get RDAF App services details
+---------------+----------------+------------+--------------+-------+
| Name | Host | Status | Container Id | Tag |
+---------------+----------------+------------+--------------+-------+
| cfx-rda-app- | 192.168.125.63 | Up 7 weeks | 1bae5abb4e9c | 8.0.0 |
| controller | | | | |
| cfx-rda- | 192.168.125.63 | Up 7 weeks | 925a97ecb0a3 | 8.0.0 |
| reports- | | | | |
| registry | | | | |
| cfx-rda- | 192.168.125.63 | Up 7 weeks | 1628da0a7a30 | 8.0.0 |
| notification- | | | | |
| service | | | | |
| cfx-rda-file- | 192.168.125.63 | Up 7 weeks | 237c85c6cb9f | 8.0.0 |
| browser | | | | |
| cfx-rda-confi | 192.168.125.63 | Up 7 weeks | 0fe8f3ee7596 | 8.0.0 |
| guration- | | | | |
| service | | | | |
+---------------+----------------+------------+--------------+-------+
RDAF Deployment CLI Upgrade:
Please follow the below given steps.
Note
Upgrade RDAF Deployment CLI on both on-premise docker registry VM and RDAF Platform's management VM if provisioned separately.
Log in to the VM where the rdaf deployment CLI was installed for the docker on-premise registry and for managing the Non-Kubernetes deployment.
- Download the RDAF Deployment CLI's newer version 1.4.2 bundle
wget https://macaw-amer.s3.us-east-1.amazonaws.com/releases/rdaf-platform/1.4.2/rdafcli-1.4.2.tar.gz
- Upgrade the rdaf CLI to version 1.4.2
- Verify the installed rdaf CLI version is upgraded to 1.4.2
- Download the RDAF Deployment CLI's newer version 1.4.2 bundle and copy it to the RDAF management VM on which the rdaf deployment CLI was installed.
1.2. Upgrade Steps
1.2.1 Upgrade On-Prem Registry
- Please update the registry by using the below command
Please download the below python script (rdaf_upgrade_140_142.py)
wget https://macaw-amer.s3.us-east-1.amazonaws.com/releases/rdaf-platform/1.4.2/rdaf_upgrade_140_142.py
The below step will generate values.yaml.latest files for all RDAF Infrastructure, Platform and Application services in the /opt/rdaf/deployment-scripts directory.
Please run the downloaded python upgrade script rdaf_upgrade_140_142.py as shown below
Note
The above command will show the available options for the upgrade script
usage: rdaf_upgrade_140_142.py [-h] {upgrade,haproxy_upgrade,cleanup_haproxy} ...

positional arguments:
  {upgrade,haproxy_upgrade,cleanup_haproxy}
                        Available options
    upgrade             upgrade the setup
    haproxy_upgrade     Upgrade HAProxy with VIP
    cleanup_haproxy     cleaning up haproxy and keepalived

options:
  -h, --help            show this help message and exit
Please run the downloaded python upgrade script rdaf_upgrade_140_142.py as shown below
rdauser-infra13360:~$ python rdaf_upgrade_140_142.py upgrade
cleaning up expiring certificates...
Cleanup complete!
cleaning up expiring certificates...
Cleanup complete!
Updating policy json configuration.
Creating backup policy.json
Encrypting policy user credentials.
Updating the policy.json in platform and service hosts.
Copying policy.json to hosts: 192.168.133.63
Copying policy.json to hosts: 192.168.133.66
Copying policy.json to hosts: 192.168.133.65
Copying policy.json to hosts: 192.168.133.64
Updating the opensearch tenant user permissions...
{"status":"OK","message":"'role-74f772b55ef14890929b7857d20766be-dataplane-policy' updated."}
{"status":"OK","message":"'role-74f772b55ef14890929b7857d20766be' updated."}
Creating backup of existing haproxy.cfg on host 192.168.133.60
Updating haproxy configs on host 192.168.133.60..
Creating backup of existing haproxy.cfg on host 192.168.133.61
Updating haproxy configs on host 192.168.133.61..
Copied /opt/rdaf/deployment-scripts/worker.yaml to /opt/rdaf/deployment-scripts/192.168.133.65
Copied /opt/rdaf/deployment-scripts/worker.yaml to /opt/rdaf/deployment-scripts/192.168.133.66
Copying /opt/rdaf/rdaf.cfg to host 192.168.133.61
Creating directory /opt/rdaf/config/runtime and setting ownership to user 1000 and group to group 1000 on host 192.168.133.61
Copying /opt/rdaf/rdaf.cfg to host 192.168.133.63
Creating directory /opt/rdaf/config/runtime and setting ownership to user 1000 and group to group 1000 on host 192.168.133.63
Copying /opt/rdaf/rdaf.cfg to host 192.168.133.65
Creating directory /opt/rdaf/config/runtime and setting ownership to user 1000 and group to group 1000 on host 192.168.133.65
Copying /opt/rdaf/rdaf.cfg to host 192.168.133.62
Creating directory /opt/rdaf/config/runtime and setting ownership to user 1000 and group to group 1000 on host 192.168.133.62
Copying /opt/rdaf/rdaf.cfg to host 192.168.133.66
Creating directory /opt/rdaf/config/runtime and setting ownership to user 1000 and group to group 1000 on host 192.168.133.66
Copying /opt/rdaf/rdaf.cfg to host 192.168.133.64
Creating directory /opt/rdaf/config/runtime and setting ownership to user 1000 and group to group 1000 on host 192.168.133.64
backing up existing values.yaml..
Removing rda_asset_dependency and AIA entries from the values.yaml file
[+] Stopping 1/1
✔ Container platform-rda_asset_dependency-1 Stopped 10.5s
Going to remove platform-rda_asset_dependency-1
[+] Removing 1/0
✔ Container platform-rda_asset_dependency-1 Removed 0.0s
Removing rda_asset_dependency entries from the platform_yaml
[+] Stopping 1/1
✔ Container platform-rda_asset_dependency-1 Stopped 10.6s
Going to remove platform-rda_asset_dependency-1
[+] Removing 1/0
✔ Container platform-rda_asset_dependency-1 Removed 0.0s
Removing rda_asset_dependency entries from the platform_yaml
backing up existing nats.conf on host 192.168.133.60
JetStream section removed successfully.
backing up existing nats.conf on host 192.168.133.61
JetStream section removed successfully.
The upgrade script makes the following changes:
- OpenSearch Certificate Cleanup
  - Cleans up expired OpenSearch certificates.
  - Connects to all VMs via SSH to perform the cleanup.
- Policy File Update
  - Copies policy.json to /opt/rdaf/config/policy.json on platform and service hosts.
  - Takes a backup of the existing policy.json.
  - Updates policy user credentials within the file.
- IP Address Directory Creation
  - Creates a directory for each platform and worker host at /opt/rdaf/deployment-scripts/192.168.xx.xx.
  - Moves the corresponding YAML files into their respective IP address directories.
- Runtime Folder Creation
  - Creates an empty runtime folder at /opt/rdaf/config.
- AIA Dependency Removal
  - Removes the AIA dependency configuration from values.yaml.
- Asset Dependency Service Removal
  - Removes the asset-dependency service entry from platform.yaml.
- NATS JetStream Removal
  - Removes the JetStream configuration section from /opt/rdaf/config/nats.conf.
- HAProxy Configuration Update
  - Creates a backup of the existing haproxy.cfg file.
  - Updates /opt/rdaf/config/haproxy/haproxy.cfg with the following configuration under backend webhook:

backend webhook
    mode http
    balance roundrobin
    stick-table type ip size 10k expire 10m
    stick on src
    option httpchk GET /healthcheck
    http-check expect rstatus (2|3)[0-9][0-9]
    http-check disable-on-404
    http-response set-header Cache-Control no-store
    http-response set-header Pragma no-cache
    default-server inter 10s downinter 5s fall 3 rise 2
    cookie SERVERID insert indirect nocache maxidle 30m maxlife 24h httponly secure
    server rdaf-webhook-1 192.168.108.51:8888 check cookie rdaf-webhook-1
    server rdaf-webhook-2 192.168.108.52:8888 check cookie rdaf-webhook-2

- Portal Backend Update in values.yaml
  - File path: /opt/rdaf/deployment-scripts/values.yaml
  - Updates the portal-backend environment variables section to be dynamically injected via the CLI instead of hardcoded.
Note
The parameters highlighted in yellow for the FSM Service, Collaboration Service, and Ingestion-Tracker Services must be updated manually.
- FSM Environment Updates in values.yaml
  - File path: /opt/rdaf/deployment-scripts/values.yaml
  - Under the rda_fsm service, the value PURGE_STALE_INSTANCES_DAYS is updated from 120 to 90.
  - Adds a new environment variable FSM_INSTANCE_CACHE_SIZE with value 2000.

rda_fsm:
  mem_limit: 4G
  memswap_limit: 4G
  privileged: true
  cap_add:
    - SYS_PTRACE
  environment:
    RDA_ENABLE_TRACES: 'yes'
    DISABLE_REMOTE_LOGGING_CONTROL: 'no'
    RDA_SELF_HEALTH_RESTART_AFTER_FAILURES: 3
    PURGE_COMPLETED_INSTANCES_DAYS: 1
    PURGE_STALE_INSTANCES_DAYS: 90
    FSM_INSTANCE_CACHE_SIZE: 2000
    KAFKA_CONSUMER_BATCH_MAX_SIZE: 100
    KAFKA_CONSUMER_BATCH_MAX_TIME_SECONDS: 1
  deployment: true
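After the upgrade script runs, the rda_fsm settings above can be sanity-checked programmatically once values.yaml is parsed (for example with PyYAML). This is a minimal sketch operating on the parsed dictionary, assuming the values.yaml layout shown in this section:

```python
def check_fsm_env(values: dict) -> list:
    """Return a list of problems with the rda_fsm environment, empty if OK."""
    env = values.get("rda_fsm", {}).get("environment", {})
    problems = []
    if env.get("PURGE_STALE_INSTANCES_DAYS") != 90:
        problems.append("PURGE_STALE_INSTANCES_DAYS should be 90 after the upgrade")
    if env.get("FSM_INSTANCE_CACHE_SIZE") != 2000:
        problems.append("FSM_INSTANCE_CACHE_SIZE: 2000 should be present")
    return problems

# Example: the expected post-upgrade state from the snippet above.
parsed = {"rda_fsm": {"environment": {"PURGE_STALE_INSTANCES_DAYS": 90,
                                      "FSM_INSTANCE_CACHE_SIZE": 2000}}}
print(check_fsm_env(parsed))  # []
```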
- Collaboration Updates in values.yaml
- Ingestion-tracker Updates in values.yaml
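As a side note on the HAProxy change above, the `http-check expect rstatus (2|3)[0-9][0-9]` line accepts any 2xx or 3xx response from /healthcheck as healthy. The same pattern can be sketched in Python to see which status codes pass:

```python
import re

# Mirrors HAProxy's `http-check expect rstatus (2|3)[0-9][0-9]`:
# any 2xx or 3xx status code counts as a healthy backend response.
HEALTHY = re.compile(r"(2|3)[0-9][0-9]")

def is_healthy(status_code: int) -> bool:
    return HEALTHY.fullmatch(str(status_code)) is not None

print([c for c in (200, 302, 404, 503) if is_healthy(c)])  # [200, 302]
```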
1.2.2 Download the new Docker Images
Download the new docker image tags for RDAF Platform and OIA (AIOps) Application services and wait until all of the images are downloaded.
Note
If the download of the images fails, please re-execute the above command.
Run the below command to verify the above-mentioned tags are downloaded for all of the RDAF Platform and OIA (AIOps) Application services.
Please make sure the 1.0.4 image tag (1.0.4 and 1.0.4.1 for opensearch) is downloaded for the following services:
- nats - 1.0.4
- minio - RELEASE.2024-12-18T13-15-44Z
- mariadb - 1.0.4
- opensearch - 1.0.4,1.0.4.1
- kafka - 1.0.4
- graphdb - 1.0.4
- haproxy - 1.0.4
- telegraph - 1.0.4
Please make sure 8.1.1 image tag is downloaded for the below RDAF Platform services.
- rda-client-api-server
- rda-registry
- rda-scheduler
- rda-collector
- rda-identity
- rda-fsm
- rda-asm
- rda-access-manager
- rda-resource-manager
- rda-user-preferences
- onprem-portal
- onprem-portal-nginx
- rda-worker-all
- onprem-portal-dbinit
- cfxdx-nb-nginx-all
- rda-event-gateway
- rda-chat-helper
- rdac
- bulk_stats
Please make sure 8.1.1 image tag is downloaded for the below RDAF OIA (AIOps) Application services.
- cfx-rda-app-controller
- cfx-rda-alert-processor
- cfx-rda-file-browser
- cfx-rda-smtp-server
- cfx-rda-ingestion-tracker
- cfx-rda-reports-registry
- cfx-rda-ml-config
- cfx-rda-event-consumer
- cfx-rda-webhook-server
- cfx-rda-irm-service
- cfx-rda-alert-ingester
- cfx-rda-collaboration
- cfx-rda-notification-service
- cfx-rda-configuration-service
- cfx-rda-alert-processor-companion
Downloaded Docker images are stored under the below path.
/opt/rdaf-registry/data/docker/registry/v2/ or /opt/rdaf/data/docker/registry/v2/
Run the below command to check the filesystem's disk usage on offline registry VM where docker images are pulled.
If necessary, older image tags that are no longer in use can be deleted to free up disk space using the command below.
Note
Run the command below if /opt occupies more than 80% of the disk space or if the free capacity of /opt is less than 25 GB.
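The two cleanup triggers in the note (over 80% used, or under 25 GB free) can be evaluated with the Python standard library. A small sketch, with the path hedged for systems that do not have /opt:

```python
import os
import shutil

GB = 1024 ** 3

def needs_cleanup(total: int, free: int) -> bool:
    """True if the filesystem is over 80% used or has under 25 GB free."""
    pct_used = (total - free) / total * 100
    return pct_used > 80 or free < 25 * GB

# Check /opt where the registry data lives (fall back to / if absent).
path = "/opt" if os.path.isdir("/opt") else "/"
usage = shutil.disk_usage(path)
print(path, "cleanup recommended:", needs_cleanup(usage.total, usage.free))
```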
1.2.3 Upgrade RDAF Infra Services
- Upgrade the infra services using the below command.
- Please use the below mentioned command to verify the infra services are up and in a Running state.
+------------------------+------------------+---------------+--------------+------------------------------+
| Name | Host | Status | Container Id | Tag |
+------------------------+------------------+---------------+--------------+------------------------------+
| nats | 192.168.108.56 | Up 56 minutes | 41f35b3e8a03 | 1.0.4 |
| minio | 192.168.108.50 | Up 56 minutes | f12a7f8f6f85 | RELEASE.2024-12-18T13-15-44Z |
| minio | 192.168.108.56 | Up 56 minutes | 43ae0b473698 | RELEASE.2024-12-18T13-15-44Z |
| minio | 192.168.108.58 | Up 56 minutes | 48829343c2f6 | RELEASE.2024-12-18T13-15-44Z |
| minio | 192.168.108.51 | Up 56 minutes | 2424ed057dee | RELEASE.2024-12-18T13-15-44Z |
| mariadb | 192.168.108.50 | Up 55 minutes | c435c7f38ba3 | 1.0.4 |
| mariadb | 192.168.108.56 | Up 55 minutes | b7f7416c5e3f | 1.0.4 |
| mariadb | 192.168.108.58 | Up 55 minutes | dc78e416f180 | 1.0.4 |
| opensearch | 192.168.108.50 | Up 55 minutes | 85a0df23e3f7 | 1.0.4 |
| opensearch | 192.168.108.56 | Up 55 minutes | 6f76f281aca8 | 1.0.4 |
| opensearch | 192.168.108.58 | Up 55 minutes | b2f36099113e | 1.0.4 |
| kafka | 192.168.108.50 | Up 54 minutes | 4fcdb0d6c942 | 1.0.4 |
| kafka | 192.168.108.56 | Up 54 minutes | 6810698a8b30 | 1.0.4 |
| kafka | 192.168.108.58 | Up 54 minutes | 21f1c70953f0 | 1.0.4 |
| graphdb[operator] | 192.168.108.50 | Up 54 minutes | bb0686761330 | 1.0.4 |
| graphdb[agent] | 192.168.108.50 | Up 54 minutes | 8ace86d77247 | 1.0.4 |
| graphdb[server] | 192.168.108.56 | Up 54 minutes | bb9754e230f0 | 1.0.4 |
| graphdb[coordinator] | 192.168.108.50 | Up 54 minutes | 11217b9360ea | 1.0.4 |
| graphdb[operator] | 192.168.108.56 | Up 54 minutes | 828b36784ff3 | 1.0.4 |
| graphdb[agent] | 192.168.108.56 | Up 54 minutes | 546a17d4fede | 1.0.4 |
+------------------------+------------------+---------------+--------------+------------------------------+
Run the below RDAF command to check infra healthcheck status
+------------+-----------------+--------+--------+----------------+--------------+
| Name | Check | Status | Reason | Host | Container Id |
+------------+-----------------+--------+--------+----------------+--------------+
| nats | Port Connection | OK | N/A | 192.168.108.50 | 178176a0cc79 |
| nats | Service Status | OK | N/A | 192.168.108.50 | 178176a0cc79 |
| nats | Firewall Port | OK | N/A | 192.168.108.50 | 178176a0cc79 |
| nats | Port Connection | OK | N/A | 192.168.108.56 | 41f35b3e8a03 |
| nats | Service Status | OK | N/A | 192.168.108.56 | 41f35b3e8a03 |
| nats | Firewall Port | OK | N/A | 192.168.108.56 | 41f35b3e8a03 |
| minio | Port Connection | OK | N/A | 192.168.108.50 | f12a7f8f6f85 |
| minio | Service Status | OK | N/A | 192.168.108.50 | f12a7f8f6f85 |
| minio | Firewall Port | OK | N/A | 192.168.108.50 | f12a7f8f6f85 |
| minio | Port Connection | OK | N/A | 192.168.108.56 | 43ae0b473698 |
| minio | Service Status | OK | N/A | 192.168.108.56 | 43ae0b473698 |
| minio | Firewall Port | OK | N/A | 192.168.108.56 | 43ae0b473698 |
| minio | Port Connection | OK | N/A | 192.168.108.58 | 48829343c2f6 |
| minio | Service Status | OK | N/A | 192.168.108.58 | 48829343c2f6 |
| minio | Firewall Port | OK | N/A | 192.168.108.58 | 48829343c2f6 |
| minio | Port Connection | OK | N/A | 192.168.108.51 | 2424ed057dee |
| minio | Service Status | OK | N/A | 192.168.108.51 | 2424ed057dee |
| minio | Firewall Port | OK | N/A | 192.168.108.51 | 2424ed057dee |
| mariadb | Port Connection | OK | N/A | 192.168.108.50 | c435c7f38ba3 |
| mariadb | Service Status | OK | N/A | 192.168.108.50 | c435c7f38ba3 |
| mariadb | Firewall Port | OK | N/A | 192.168.108.50 | c435c7f38ba3 |
| mariadb | Port Connection | OK | N/A | 192.168.108.56 | bf7f416c5e3f |
| mariadb | Service Status | OK | N/A | 192.168.108.56 | bf7f416c5e3f |
| mariadb | Firewall Port | OK | N/A | 192.168.108.56 | bf7f416c5e3f |
| mariadb | Port Connection | OK | N/A | 192.168.108.58 | dc78e416f180 |
| mariadb | Service Status | OK | N/A | 192.168.108.58 | dc78e416f180 |
| mariadb | Firewall Port | OK | N/A | 192.168.108.58 | dc78e416f180 |
| opensearch | Port Connection | OK | N/A | 192.168.108.50 | 85a0df23e3f7 |
| opensearch | Service Status | OK | N/A | 192.168.108.50 | 85a0df23e3f7 |
| opensearch | Firewall Port | OK | N/A | 192.168.108.50 | 85a0df23e3f7 |
| opensearch | Port Connection | OK | N/A | 192.168.108.56 | 6f76f281aca8 |
| opensearch | Service Status | OK | N/A | 192.168.108.56 | 6f76f281aca8 |
| opensearch | Firewall Port | OK | N/A | 192.168.108.56 | 6f76f281aca8 |
| opensearch | Port Connection | OK | N/A | 192.168.108.58 | b2f36099113e |
| opensearch | Service Status | OK | N/A | 192.168.108.58 | b2f36099113e |
| opensearch | Firewall Port | OK | N/A | 192.168.108.58 | b2f36099113e |
+------------+-----------------+--------+--------+----------------+--------------+
- After upgrading MariaDB, ensure that all nodes are in sync by executing the following commands.
mysql -urdafadmin -pabcd1234 -h192.168.108.59 -P3307 -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size'";
+-------------------+-------+
| Variable_name | Value |
+-------------------+-------+
| wsrep_cluster_size| 3 |
+-------------------+-------+
mysql -urdafadmin -pabcd1234 -h192.168.108.59 -P3307 -e "show status like 'wsrep_local_state_comment';"
+---------------------------+-----------+
| Variable_name | Value |
+---------------------------+-----------+
| wsrep_local_state_comment | synced |
+---------------------------+-----------+
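The two checks above can be combined into a single guard that succeeds only when the cluster is fully formed and synced. This is a sketch reusing the example credentials and host from this section; it assumes the mysql client is on the PATH, and it only defines a helper function:

```shell
# Combined Galera sync check using the example connection details above.
# -N suppresses column headers so awk sees "variable<TAB>value".
check_galera() {  # usage: check_galera <host> <user> <password>
  local size state
  size=$(mysql -u"$2" -p"$3" -h"$1" -P3307 -N \
    -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size'" | awk '{print $2}')
  state=$(mysql -u"$2" -p"$3" -h"$1" -P3307 -N \
    -e "SHOW STATUS LIKE 'wsrep_local_state_comment'" | awk '{print tolower($2)}')
  [ "$size" = "3" ] && [ "$state" = "synced" ]
}

# Example (same host/credentials as the commands above):
# check_galera 192.168.108.59 rdafadmin abcd1234 && echo "cluster OK"
```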
1.2.3.1 Installing Nginx for External URL Access
- To enable loading the UI with an external URL, install Nginx using the following command. Replace <external URL IP> with the actual external IP address you want to configure.
Note
Internet access from the API server is required for users to download packs directly from the public GitHub repository at fabrix.ai.
If users do not have internet access from the API server container and prefer not to configure a proxy on the API server, they can manually download the packs from the public GitHub repository to their local desktop and then use the Upload option instead of "Upload from Catalog."
In case of a proxy environment, please add the below proxy settings to the api-server section in /opt/rdaf/deployment-scripts/values.yaml.
The IP addresses that are part of the RDAF deployment need to be added to both no_proxy and NO_PROXY.
rda_api_server:
mem_limit: 4G
memswap_limit: 4G
privileged: true
environment:
RDA_STUDIO_URL: '""'
RDA_ENABLE_TRACES: 'no'
DISABLE_REMOTE_LOGGING_CONTROL: 'no'
RDA_SELF_HEALTH_RESTART_AFTER_FAILURES: 3
no_proxy: localhost,127.0.0.1,192.168.133.60,192.168.133.61,192.168.133.62,192.168.133.63,192.168.133.64,192.168.133.65,192.168.133.66
NO_PROXY: localhost,127.0.0.1,192.168.133.60,192.168.133.61,192.168.133.62,192.168.133.63,192.168.133.64,192.168.133.65,192.168.133.66
http_proxy: "http://test:[email protected]:3128"
https_proxy: "http://test:[email protected]:3128"
HTTP_PROXY: "http://test:[email protected]:3128"
HTTPS_PROXY: "http://test:[email protected]:3128"
deployment: true
cap_add:
- SYS_PTRACE
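Since a host missing from no_proxy would send internal RDAF traffic through the proxy, it is worth confirming that every RDAF VM IP is covered. A small illustrative helper; the IPs below are the example addresses from the snippet above:

```python
def missing_from_no_proxy(no_proxy: str, hosts: list) -> list:
    """Return the RDAF host IPs absent from a comma-separated no_proxy value."""
    entries = {entry.strip() for entry in no_proxy.split(",")}
    return [host for host in hosts if host not in entries]

# Example values taken from the values.yaml snippet above.
rdaf_hosts = ["192.168.133.60", "192.168.133.61", "192.168.133.62",
              "192.168.133.63", "192.168.133.64", "192.168.133.65",
              "192.168.133.66"]
no_proxy = ("localhost,127.0.0.1,192.168.133.60,192.168.133.61,192.168.133.62,"
            "192.168.133.63,192.168.133.64,192.168.133.65,192.168.133.66")
print(missing_from_no_proxy(no_proxy, rdaf_hosts))  # []
```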
Note
For the document on SAML Configuration Update ("strict": false) for Portals Behind a URL Prefix with SAML SSO, Please Click Here
Note
Starting from version 8.1, the portal backend uses two additional ports — 8081 and 8082. If necessary, ensure these ports are allowed through the internal firewall.
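A quick way to confirm the new portal-backend ports are reachable through the internal firewall is a plain TCP connect test. This is a generic sketch; the host below is one of the example IPs from this section, so replace it with your portal-backend VM:

```python
import socket

def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example host IP from this section; replace with your portal-backend VM.
for port in (8081, 8082):
    print(port, "reachable" if port_open("192.168.108.51", port) else "blocked or filtered")
```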
1.2.4 Upgrade External OpenSearch
Note
If an external OpenSearch is configured, ensure it is upgraded to version 1.0.4.1 by running the following command.
- To upgrade the external OpenSearch, please use the following command.
- To check the status of External OpenSearch, use the following command.
+---------------------+-----------------+------------+--------------+---------+
| Name | Host | Status | Container Id | Tag |
+---------------------+-----------------+------------+--------------+---------+
| opensearch_external | 192.168.107.187 | Up 34 hours | 6fb1babd1e05 | 1.0.4.1 |
| opensearch_external | 192.168.107.188 | Up 34 hours | 95a8a7b61135 | 1.0.4.1 |
| opensearch_external | 192.168.107.189 | Up 34 hours | dc776fc0adb6 | 1.0.4.1 |
+---------------------+-----------------+------------+--------------+---------+
1.2.5 Upgrade RDAF Platform Services
Warning
For Non-Kubernetes deployment, upgrading RDAF Platform and AIOps application services is a disruptive operation when the rolling-upgrade option is not used. Please schedule a maintenance window before upgrading RDAF Platform and AIOps services to the newer version.
Run the below command to initiate upgrading RDAF Platform services with zero downtime
Note
The timeout <10> mentioned in the above command is specified in seconds.
Note
The rolling-upgrade option upgrades the Platform services running in high-availability mode on one VM at a time in sequence. It completes the upgrade of Platform services running on VM-1 before upgrading them on VM-2, followed by VM-3, and so on.
During this upgrade sequence, RDAF platform continues to function without any impact to the application traffic.
After completing the Platform services upgrade on all VMs, it will ask for user confirmation to delete the older-version Platform service PODs. The user has to provide yes to delete the old docker containers (in non-K8s).
2025-09-09 05:06:33,450 [rdaf.component.platform] INFO - Checking if the upgraded components '['rda_api_server', 'rda_registry', 'rda_scheduler', 'rda_collector', 'rda_identity', 'rda_asm', 'rda_fsm', 'rda_chat_helper', 'cfx-rda-access-manager', 'cfx-rda-resource-manager', 'cfx-rda-user-preferences', 'portal-backend', 'portal-frontend']' has joined the rdac pods...
+----------+-----------------------+---------+----------+--------------+-------------+------------+
| Pod ID | Pod Type | Version | Age | Hostname | Maintenance | Pod Status |
+----------+-----------------------+---------+----------+--------------+-------------+------------+
| 7b8fb4c3 | api-server | 8.0.0 | 19:49:15 | c0f66f5a6f7d | None | True |
| 1b85e698 | registry | 8.0.0 | 19:48:49 | 7ec9180c93a9 | None | True |
| 96acc485 | scheduler | 8.0.0 | 19:48:18 | 9dc6bcc4411e | None | True |
| 075bc0d3 | collector | 8.0.0 | 19:47:53 | f7d1e7fe7abc | None | True |
| b33510ed | authenticator | 8.0.0 | 19:47:24 | eed73b76b2b8 | None | True |
| 9cd29c86 | asm | 8.0.0 | 19:47:00 | 0ba88473ecaf | None | True |
| edd075be | fsm | 8.0.0 | 19:46:35 | 085b70d83cda | None | True |
| fbabb4a0 | chat-helper | 8.0.0 | 19:46:06 | b0ad9515d410 | None | True |
| 0f61cceb | cfxdimensions-app- | 8.0.0 | 19:45:43 | d5e1507b9e1c | None | True |
| | access-manager | | | | | |
| d6361f4c | cfxdimensions-app- | 8.0.0 | 19:45:17 | 0fbe8c80a5bd | None | True |
| | resource-manager | | | | | |
| a67e7e15 | user-preferences | 8.0.0 | 19:44:50 | ac3d513b9d25 | None | True |
+----------+-----------------------+---------+----------+--------------+-------------+------------+
Continue moving above pods to maintenance mode? [yes/no]: yes
2025-09-09 05:10:51,257 [rdaf.component.platform] INFO - Initiating Maintenance Mode...
2025-09-09 05:11:13,851 [rdaf.component.platform] INFO - Following container are in maintenance mode
+----------+-----------------------+---------+----------+--------------+-------------+------------+
| Pod ID | Pod Type | Version | Age | Hostname | Maintenance | Pod Status |
+----------+-----------------------+---------+----------+--------------+-------------+------------+
| 7b8fb4c3 | api-server | 8.0.0 | 20:00:26 | c0f66f5a6f7d | maintenance | False |
| 9cd29c86 | asm | 8.0.0 | 19:58:12 | 0ba88473ecaf | maintenance | False |
| b33510ed | authenticator | 8.0.0 | 19:58:35 | eed73b76b2b8 | maintenance | False |
| 0f61cceb | cfxdimensions-app- | 8.0.0 | 19:56:55 | d5e1507b9e1c | maintenance | False |
| | access-manager | | | | | |
| d6361f4c | cfxdimensions-app- | 8.0.0 | 19:56:29 | 0fbe8c80a5bd | maintenance | False |
| | resource-manager | | | | | |
| fbabb4a0 | chat-helper | 8.0.0 | 19:57:18 | b0ad9515d410 | maintenance | False |
| 075bc0d3 | collector | 8.0.0 | 19:59:05 | f7d1e7fe7abc | maintenance | False |
| edd075be | fsm | 8.0.0 | 19:57:46 | 085b70d83cda | maintenance | False |
| 1b85e698 | registry | 8.0.0 | 20:00:00 | 7ec9180c93a9 | maintenance | False |
| 96acc485 | scheduler | 8.0.0 | 19:59:29 | 9dc6bcc4411e | maintenance | False |
| a67e7e15 | user-preferences | 8.0.0 | 19:56:02 | ac3d513b9d25 | maintenance | False |
+----------+-----------------------+---------+----------+--------------+-------------+------------+
Run the below command to initiate upgrading RDAF Platform services without the zero-downtime (rolling-upgrade) option.
Please wait until all of the new platform services are in the Up state, then run the below command to verify their status and make sure all of them are running version 8.1.1.
+--------------------------+----------------+-------------------------------+--------------------------+---------+
| Name | Host | Status | Container Id | Tag |
+--------------------------+----------------+-------------------------------+--------------------------+---------+
| rda_api_server | 192.168.108.51 | Up 4 hours | 9a2b3c4d5e6f | 8.1.1 |
| rda_api_server | 192.168.108.52 | Up 4 hours | 1b2c3d4e5f6a | 8.1.1 |
| rda_registry | 192.168.108.51 | Up 4 hours | c7d8e9f01a2b | 8.1.1 |
| rda_registry | 192.168.108.52 | Up 4 hours | 3f4a5b6c7d8e | 8.1.1 |
| rda_scheduler | 192.168.108.51 | Up 4 hours | e5f6a7b8c9d0 | 8.1.1 |
| rda_scheduler | 192.168.108.52 | Up 4 hours | 1a2b3c4d5e6f | 8.1.1 |
| rda_collector | 192.168.108.51 | Up 4 hours | 7b8c9d0e1f2a | 8.1.1 |
| rda_collector | 192.168.108.52 | Up 4 hours | 3c4d5e6f7a8b | 8.1.1 |
| rda_identity | 192.168.108.51 | Up 4 hours | 9d0e1f2a3b4c | 8.1.1 |
| rda_identity | 192.168.108.52 | Up 4 hours | 5e6f7a8b9c0d | 8.1.1 |
| rda_asm | 192.168.108.51 | Up 4 hours | 1f2a3b4c5d6e | 8.1.1 |
| rda_asm | 192.168.108.52 | Up 4 hours | 7a8b9c0d1e2f | 8.1.1 |
| rda_fsm | 192.168.108.51 | Up 4 hours | 3b4c5d6e7f8a | 8.1.1 |
| rda_fsm | 192.168.108.52 | Up 4 hours | 9c0d1e2f3a4b | 8.1.1 |
+--------------------------+----------------+-------------------------------+--------------------------+---------+
Run the below command to check that the rda-scheduler service is elected as leader under the Site column.
Run the below command to check that all services have an ok status and do not throw any failure messages.
+-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-------------------------------------------------------------+
| Cat | Pod-Type | Host | ID | Site | Health Parameter | Status | Message |
|-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-------------------------------------------------------------|
| rda_app | alert-ingester | 7f75047e9e44 | daa8c414 | | service-status | ok | |
| rda_app | alert-ingester | 7f75047e9e44 | daa8c414 | | minio-connectivity | ok | |
| rda_app | alert-ingester | 7f75047e9e44 | daa8c414 | | service-dependency:configuration-service | ok | 2 pod(s) found for configuration-service |
| rda_app | alert-ingester | 7f75047e9e44 | daa8c414 | | service-initialization-status | ok | |
| rda_app | alert-ingester | 7f75047e9e44 | daa8c414 | | kafka-connectivity | ok | Cluster=NTc1NWU1MTQxYmY3MTFlZg, Broker=1, Brokers=[1, 2, 3] |
| rda_app | alert-ingester | f9ec55862be0 | f9b9231c | | service-status | ok | |
| rda_app | alert-ingester | f9ec55862be0 | f9b9231c | | minio-connectivity | ok | |
| rda_app | alert-ingester | f9ec55862be0 | f9b9231c | | service-dependency:configuration-service | ok | 2 pod(s) found for configuration-service |
| rda_app | alert-ingester | f9ec55862be0 | f9b9231c | | service-initialization-status | ok | |
| rda_app | alert-ingester | f9ec55862be0 | f9b9231c | | kafka-connectivity | ok | Cluster=NTc1NWU1MTQxYmY3MTFlZg, Broker=3, Brokers=[1, 2, 3] |
| rda_app | alert-processor | c6cc7b04ab33 | b4ebfb06 | | service-status | ok | |
| rda_app | alert-processor | c6cc7b04ab33 | b4ebfb06 | | minio-connectivity | ok | |
+-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-------------------------------------------------------------+
1.2.6 Upgrade rdac CLI
1.2.7 Upgrade RDA Worker Services
Note
If the worker was deployed in an HTTP proxy environment, please make sure the required HTTP proxy environment variables are added in the /opt/rdaf/deployment-scripts/values.yaml file under the rda_worker configuration section as shown below before upgrading the RDA Worker services.
rda_worker:
  mem_limit: 8G
  memswap_limit: 8G
  privileged: false
  environment:
    RDA_ENABLE_TRACES: 'no'
    RDA_SELF_HEALTH_RESTART_AFTER_FAILURES: 3
    http_proxy: "http://test:[email protected]:3128"
    https_proxy: "http://test:[email protected]:3128"
    HTTP_PROXY: "http://test:[email protected]:3128"
    HTTPS_PROXY: "http://test:[email protected]:3128"
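As a quick sanity check before upgrading, you can confirm the proxy variables are actually present in the file. The snippet below is a sketch that assumes the default values.yaml path shown above:

```shell
# Look for the proxy variables under the rda_worker section of values.yaml.
# Prints the matching lines, or a warning if none are found.
grep -A 12 '^rda_worker:' /opt/rdaf/deployment-scripts/values.yaml 2>/dev/null \
  | grep -Ei 'https?_proxy' \
  || echo "WARNING: no proxy variables found under rda_worker"
```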
- Upgrade RDA Worker Services
Please run the below command to initiate the RDA Worker services upgrade with zero downtime.
Note
The timeout value <10> mentioned in the above command is specified in seconds.
Note
The rolling-upgrade option upgrades the Worker services running in high-availability mode on one VM at a time in sequence. It completes the upgrade of Worker services running on VM-1 before upgrading them on VM-2, followed by VM-3, and so on.
After completing the Worker services upgrade on all VMs, it asks for user confirmation; enter YES to delete the older-version Worker service PODs.
2024-08-12 02:56:11,573 [rdaf.component.worker] INFO - Collecting worker details for rolling upgrade
2024-08-12 02:56:14,301 [rdaf.component.worker] INFO - Rolling upgrade worker on 192.168.133.96
+----------+----------+---------------+---------+--------------+-------------+------------+
| Pod ID | Pod Type | Version | Age | Hostname | Maintenance | Pod Status |
+----------+----------+---------------+---------+--------------+-------------+------------+
| c8a37db9 | worker   | 8.1.1         | 3:32:31 | fffe44b43708 | None        | True       |
+----------+----------+---------------+---------+--------------+-------------+------------+
Continue moving above pod to maintenance mode? [yes/no]: yes
2024-08-12 02:57:17,346 [rdaf.component.worker] INFO - Initiating maintenance mode for pod c8a37db9
2024-08-12 02:57:22,401 [rdaf.component.worker] INFO - Waiting for worker to be moved to maintenance.
2024-08-12 02:57:35,001 [rdaf.component.worker] INFO - Following worker container is in maintenance mode
+----------+----------+---------------+---------+--------------+-------------+------------+
| Pod ID | Pod Type | Version | Age | Hostname | Maintenance | Pod Status |
+----------+----------+---------------+---------+--------------+-------------+------------+
| c8a37db9 | worker | 8.1.1 | 3:33:52 | fffe44b43708 | maintenance | False |
+----------+----------+---------------+---------+--------------+-------------+------------+
2024-08-12 02:57:35,002 [rdaf.component.worker] INFO - Waiting for timeout of 3 seconds.
To upgrade the RDA Worker service without zero downtime, run the below command instead.
Please wait for 120 seconds to let the newer version of the RDA Worker service containers join the RDA Fabric. Then run the below commands to verify the status of the newer RDA Worker service containers.
| Infra | worker | True | 6eff605e72c4 | a318f394 | rda-site-01 | 13:45:13 | 4 | 31.21 | 0 | 0 |
| Infra | worker | True | ae7244d0d10a | 554c2cd8 | rda-site-01 | 13:40:40 | 4 | 31.21 | 0 | 0 |
+------------+----------------+------------+--------------+---------+
| Name | Host | Status | Container Id | Tag |
+------------+----------------+------------+--------------+---------+
| rda_worker | 192.168.108.53 | Up 4 hours | 2b3c4d5e6f7a | 8.1.1 |
| rda_worker | 192.168.108.54 | Up 4 hours | 8d9e0f1a2b3c | 8.1.1 |
+------------+----------------+------------+--------------+---------+
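The Tag column can also be checked mechanically. The filter below is a small sketch: paste the rda_worker rows from the status output into the here-document (the two rows shown are from the sample output above), and it flags any worker that is not yet on the 8.1.1 tag:

```shell
# Field 6 (split on '|') is the Tag column; flag any row that is not 8.1.1.
awk -F'|' '/rda_worker/ { gsub(/ /, "", $6); if ($6 != "8.1.1") print "stale tag:", $0 }' <<'EOF'
| rda_worker | 192.168.108.53 | Up 4 hours | 2b3c4d5e6f7a | 8.1.1 |
| rda_worker | 192.168.108.54 | Up 4 hours | 8d9e0f1a2b3c | 8.1.1 |
EOF
```

Both sample rows are on 8.1.1, so the filter prints nothing; any row still on an older tag is printed with a "stale tag:" prefix.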
Run the below command to check that all RDA Worker services have an ok status and do not report any failure messages.
+-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-----------------------------------------------------------------------------------------------------------------------------+
| Cat | Pod-Type | Host | ID | Site | Health Parameter | Status | Message |
|-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-----------------------------------------------------------------------------------------------------------------------------|
| rda_infra | api-server | 1b0542719618 | 1845ae67 | | service-status | ok | |
| rda_infra | api-server | 1b0542719618 | 1845ae67 | | minio-connectivity | ok | |
| rda_infra | api-server | d4404cffdc7a | a4cfdc6d | | service-status | ok | |
| rda_infra | api-server | d4404cffdc7a | a4cfdc6d | | minio-connectivity | ok | |
| rda_infra | asm | 8d3d52a7a475 | 418c9dc1 | | service-status | ok | |
| rda_infra | asm | 8d3d52a7a475 | 418c9dc1 | | minio-connectivity | ok | |
| rda_infra | asm | ab172a9b8229 | 2ac1d67a | | service-status | ok | |
| rda_infra | asm | ab172a9b8229 | 2ac1d67a | | minio-connectivity | ok | |
| rda_app | asset-dependency | 6ac69ca1085c | c2e9dcb9 | | service-status | ok | |
| rda_app | asset-dependency | 6ac69ca1085c | c2e9dcb9 | | minio-connectivity | ok | |
| rda_app | asset-dependency | 58a5f4f460d3 | 0b91caac | | service-status | ok | |
| rda_app | asset-dependency | 58a5f4f460d3 | 0b91caac | | minio-connectivity | ok | |
| rda_app | authenticator | 9011c2aef498 | 9f7efdc3 | | service-status | ok | |
| rda_app | authenticator | 9011c2aef498 | 9f7efdc3 | | minio-connectivity | ok | |
| rda_app | authenticator | 9011c2aef498 | 9f7efdc3 | | DB-connectivity | ok | |
| rda_app | authenticator | 148621ed8c82 | dbf16b82 | | service-status | ok | |
| rda_app | authenticator | 148621ed8c82 | dbf16b82 | | minio-connectivity | ok | |
| rda_app | authenticator | 148621ed8c82 | dbf16b82 | | DB-connectivity | ok | |
| rda_app | cfx-app-controller | 75ec0f30cfa3 | 1198fdee | | service-status | ok | |
| rda_app | cfx-app-controller | 75ec0f30cfa3 | 1198fdee | | minio-connectivity | ok | |
| rda_app | cfx-app-controller | 75ec0f30cfa3 | 1198fdee | | service-initialization-status | ok | |
| rda_app   | cfx-app-controller                     | 75ec0f30cfa3 | 1198fdee |             | DB-connectivity                                     | ok       |                                                                                                                             |
+-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-----------------------------------------------------------------------------------------------------------------------------+
1.2.8 Update Environment Variables in values.yaml
Note
Please ensure the highlighted environment variables are updated manually in the values.yaml file.
Step 1: Alert Ingester - Add Environment Variables
-
Before upgrading the Alert Ingester service, ensure the following environment variables are added under the cfx-rda-alert-ingester section in the values.yaml file (file path: /opt/rdaf/deployment-scripts/values.yaml).
-
Environment Variables to Add
cfx-rda-alert-ingester:
  mem_limit: 6G
  memswap_limit: 6G
  privileged: true
  environment:
    DISABLE_REMOTE_LOGGING_CONTROL: 'no'
    RDA_ENABLE_TRACES: 'yes'
    RDA_SELF_HEALTH_RESTART_AFTER_FAILURES: 3
    INBOUND_PARTITION_WORKERS_MAX: 1
    OUTBOUND_TOPIC_WORKERS_MAX: 1
  hosts:
    - 192.168.109.53
    - 192.168.109.54
  cap_add:
    - SYS_PTRACE
Step 2: Event Consumer - Add Environment Variable
-
Before upgrading the Event Consumer service, add the following environment variable under the cfx-rda-event-consumer section of values.yaml.
-
Environment Variable to Add
cfx-rda-event-consumer:
  mem_limit: 6G
  memswap_limit: 6G
  privileged: true
  environment:
    DISABLE_REMOTE_LOGGING_CONTROL: 'no'
    RDA_ENABLE_TRACES: 'yes'
    RDA_SELF_HEALTH_RESTART_AFTER_FAILURES: 3
    OUTBOUND_WORKERS_MAX: 3
  hosts:
    - 192.168.109.53
    - 192.168.109.54
  cap_add:
    - SYS_PTRACE
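Before restarting the services, a quick check (a sketch, assuming the default values.yaml path) can confirm that the three newly required variables were actually added:

```shell
# Report each newly required variable as present or missing in values.yaml.
for v in INBOUND_PARTITION_WORKERS_MAX OUTBOUND_TOPIC_WORKERS_MAX OUTBOUND_WORKERS_MAX; do
  grep -q "$v" /opt/rdaf/deployment-scripts/values.yaml 2>/dev/null \
    && echo "ok: $v" || echo "MISSING: $v"
done
```

Any line reported as MISSING means the corresponding section edit was not saved.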
Note
The above environment variable configuration needs to be updated as per the production deployment sizing. Please refer to this document for details.
1.2.9 Upgrade OIA Application Services
Run the below commands to initiate upgrading the RDA Fabric OIA Application services with zero downtime
Note
The timeout value <10> mentioned in the above command is specified in seconds.
Note
The rolling-upgrade option upgrades the OIA application services running in high-availability mode on one VM at a time in sequence. It completes the upgrade of OIA application services running on VM-1 before upgrading them on VM-2, followed by VM-3, and so on.
After completing the OIA application services upgrade on all VMs, it will ask for user confirmation to delete the older version OIA application service PODs.
2024-08-12 03:18:08,705 [rdaf.component.oia] INFO - Gathering OIA app container details.
2024-08-12 03:18:10,719 [rdaf.component.oia] INFO - Gathering rdac pod details.
+----------+----------------------+---------+---------+--------------+-------------+------------+
| Pod ID | Pod Type | Version | Age | Hostname | Maintenance | Pod Status |
+----------+----------------------+---------+---------+--------------+-------------+------------+
| 2992fe69 | cfx-app-controller | 8.0.0 | 3:44:53 | 0500f773a8ff | None | True |
| 336138c8 | reports-registry | 8.0.0 | 3:44:12 | 92a5e0daa942 | None | True |
| ccc5f3ce | cfxdimensions-app- | 8.0.0 | 3:43:34 | 99192de47ea4 | None | True |
| | notification-service | | | | | |
| 03614007 | cfxdimensions-app- | 8.0.0 | 3:42:54 | fbdf4e5c16c3 | None | True |
| | file-browser | | | | | |
| a4949804 | configuration- | 8.0.0 | 3:42:15 | 4ea08c8cbf2e | None | True |
| | service | | | | | |
| 8f37c520 | alert-ingester | 8.0.0 | 3:41:35 | e9e3a3e69cac | None | True |
| 249b7104 | webhook-server | 8.0.0 | 3:12:04 | 1df43cebc888 | None | True |
| 76c64336 | smtp-server | 8.0.0 | 3:08:57 | 03725b0cb91f | None | True |
| ad85cb4c | event-consumer | 8.0.0 | 3:09:58 | 8a7d349da513 | None | True |
| 1a788ef3 | alert-processor | 8.0.0 | 3:11:01 | a7c5294cba3d | None | True |
| 970b90b1 | cfxdimensions-app- | 8.0.0 | 3:38:14 | 01d4245bb90e | None | True |
| | irm_service | | | | | |
| 153aa6ac | ml-config | 8.0.0 | 3:37:33 | 10d5d6766354 | None | True |
| 5aa927a4 | cfxdimensions-app- | 8.0.0 | 3:36:53 | dcfda7175cb5 | None | True |
| | collaboration | | | | | |
| 6833aa86 | ingestion-tracker | 8.0.0 | 3:36:13 | ef0e78252e48 | None | True |
| afe77cb9 | alert-processor- | 8.0.0 | 3:35:33 | 6f03c7fdba51 | None | True |
| | companion | | | | | |
+----------+----------------------+---------+---------+--------------+-------------+------------+
Continue moving above pods to maintenance mode? [yes/no]: yes
2024-08-12 03:18:27,159 [rdaf.component.oia] INFO - Initiating Maintenance Mode...
2024-08-12 03:18:32,978 [rdaf.component.oia] INFO - Waiting for services to be moved to maintenance.
2024-08-12 03:18:55,771 [rdaf.component.oia] INFO - Following container are in maintenance mode
+----------+----------------------+---------+---------+--------------+-------------+------------+
To upgrade the RDA Fabric OIA Application services without zero downtime, run the below command instead.
Please wait until all of the new OIA application service containers are in the Up state, then run the below command to verify their status and confirm they are running version 8.1.1.
+-----------------------------------+----------------+------------+--------------------------+---------+
| Name | Host | Status | Container Id | Tag |
+-----------------------------------+----------------+------------+--------------------------+---------+
| cfx-rda-app-controller | 192.168.108.51 | Up 3 hours | a1b2c3d4e5f6 | 8.1.1 |
| cfx-rda-app-controller | 192.168.108.52 | Up 3 hours | 7a8b9c0d1e2f | 8.1.1 |
| cfx-rda-reports-registry | 192.168.108.51 | Up 4 hours | 3b4c5d6e7f8a | 8.1.1 |
| cfx-rda-reports-registry | 192.168.108.52 | Up 4 hours | 9c0d1e2f3a4b | 8.1.1 |
| cfx-rda-notification-service | 192.168.108.51 | Up 4 hours | 5e6f7a8b9c0d | 8.1.1 |
| cfx-rda-notification-service | 192.168.108.52 | Up 4 hours | 1f2a3b4c5d6e | 8.1.1 |
| cfx-rda-file-browser | 192.168.108.51 | Up 4 hours | 7a8b9c0d1e2f | 8.1.1 |
| cfx-rda-file-browser | 192.168.108.52 | Up 4 hours | 3b4c5d6e7f8a | 8.1.1 |
| cfx-rda-configuration-service | 192.168.108.51 | Up 4 hours | 9c0d1e2f3a4b | 8.1.1 |
| cfx-rda-configuration-service | 192.168.108.52 | Up 4 hours | 5e6f7a8b9c0d | 8.1.1 |
| cfx-rda-alert-ingester | 192.168.108.51 | Up 4 hours | 1f2a3b4c5d6e | 8.1.1 |
| cfx-rda-alert-ingester | 192.168.108.52 | Up 4 hours | a1b2c3d4e5f6 | 8.1.1 |
| cfx-rda-webhook-server | 192.168.108.51 | Up 4 hours | 7a8b9c0d1e2f | 8.1.1 |
| cfx-rda-webhook-server | 192.168.108.52 | Up 4 hours | 3b4c5d6e7f8a | 8.1.1 |
| cfx-rda-smtp-server | 192.168.108.51 | Up 4 hours | 9c0d1e2f3a4b | 8.1.1 |
| cfx-rda-smtp-server | 192.168.108.52 | Up 4 hours | 5e6f7a8b9c0d | 8.1.1 |
+-----------------------------------+----------------+------------+--------------------------+---------+
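In larger deployments it is easy to miss one row in the table. The filter below is a small sketch: paste the status rows into the here-document (the second sample row is fabricated to show a failure case), and it prints any cfx-rda service whose Status column does not start with "Up":

```shell
# Trim fields 2 (Name) and 4 (Status); print any cfx-rda service not "Up".
awk -F'|' 'NF > 5 && $2 ~ /cfx-rda/ {
  n = $2; gsub(/^ +| +$/, "", n)
  s = $4; gsub(/^ +| +$/, "", s)
  if (s !~ /^Up/) print n, "->", s
}' <<'EOF'
| cfx-rda-app-controller | 192.168.108.51 | Up 3 hours | a1b2c3d4e5f6 | 8.1.1 |
| cfx-rda-smtp-server | 192.168.108.52 | Restarting | 5e6f7a8b9c0d | 8.1.1 |
EOF
```

With the sample input above, only the Restarting row is printed; after a healthy upgrade the filter prints nothing.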
Run the below command to verify all OIA application services are up and running.
+-------+----------------------------------------+-------------+----------------+----------+-------------+----------+--------+--------------+---------------+--------------+
| Cat | Pod-Type | Pod-Ready | Host | ID | Site | Age | CPUs | Memory(GB) | Active Jobs | Total Jobs |
|-------+----------------------------------------+-------------+----------------+----------+-------------+----------+--------+--------------+---------------+--------------|
| App | alert-ingester | True | rda-alert-inge | 6a6e464d | | 19:22:36 | 8 | 31.33 | | |
| App | alert-ingester | True | rda-alert-inge | 7f6b42a0 | | 19:22:53 | 8 | 31.33 | | |
| App | alert-processor | True | rda-alert-proc | a880e491 | | 19:23:21 | 8 | 31.33 | | |
| App | alert-processor | True | rda-alert-proc | b684609e | | 19:23:18 | 8 | 31.33 | | |
| App | alert-processor-companion | True | rda-alert-proc | 874f3b33 | | 19:22:24 | 8 | 31.33 | | |
| App | alert-processor-companion | True | rda-alert-proc | 70cadaa7 | | 19:22:05 | 8 | 31.33 | | |
| App | asset-dependency | True | rda-asset-depe | bde06c15 | | 19:47:50 | 8 | 31.33 | | |
| App | asset-dependency | True | rda-asset-depe | 47b9eb02 | | 19:47:38 | 8 | 31.33 | | |
| App | authenticator | True | rda-identity-d | faa33e1b | | 19:47:52 | 8 | 31.33 | | |
| App | authenticator | True | rda-identity-d | 36083c36 | | 19:47:46 | 8 | 31.33 | | |
| App | cfx-app-controller | True | rda-app-contro | 5fd3c3f4 | | 19:23:09 | 8 | 31.33 | | |
| App | cfx-app-controller | True | rda-app-contro | d66e5ce8 | | 19:22:56 | 8 | 31.33 | | |
| App | cfxdimensions-app-access-manager | True | rda-access-man | ecbb535c | | 19:47:46 | 8 | 31.33 | | |
| App | cfxdimensions-app-access-manager | True | rda-access-man | 9a05db5a | | 19:47:36 | 8 | 31.33 | | |
| App | cfxdimensions-app-collaboration | True | rda-collaborat | 61b3c53b | | 19:22:18 | 8 | 31.33 | | |
| App | cfxdimensions-app-collaboration | True | rda-collaborat | 09b9474e | | 19:21:57 | 8 | 31.33 | | |
| App | cfxdimensions-app-file-browser | True | rda-file-brows | 00495640 | | 19:22:45 | 8 | 31.33 | | |
| App | cfxdimensions-app-file-browser | True | rda-file-brows | 640f0653 | | 19:22:29 | 8 | 31.33 | | |
| App | cfxdimensions-app-irm_service | True | rda-irm-servic | 27e345c5 | | 19:21:43 | 8 | 31.33 | | |
| App | cfxdimensions-app-irm_service | True | rda-irm-servic | 23c7e082 | | 19:21:56 | 8 | 31.33 | | |
| App | cfxdimensions-app-notification-service | True | rda-notificati | bbb5b08b | | 19:23:20 | 8 | 31.33 | | |
| App | cfxdimensions-app-notification-service | True | rda-notificati | 9841bcb5 | | 19:23:02 | 8 | 31.33 | | |
+-------+----------------------------------------+-------------+----------------+----------+-------------+----------+--------+--------------+---------------+--------------+
Run the below command to check that all services have an ok status and do not report any failure messages.
+-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-------------------------------------------------------------+
| Cat | Pod-Type | Host | ID | Site | Health Parameter | Status | Message |
|-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-------------------------------------------------------------|
| rda_app | alert-ingester | 7f75047e9e44 | daa8c414 | | service-status | ok | |
| rda_app | alert-ingester | 7f75047e9e44 | daa8c414 | | minio-connectivity | ok | |
| rda_app | alert-ingester | 7f75047e9e44 | daa8c414 | | service-dependency:configuration-service | ok | 2 pod(s) found for configuration-service |
| rda_app | alert-ingester | 7f75047e9e44 | daa8c414 | | service-initialization-status | ok | |
| rda_app | alert-ingester | 7f75047e9e44 | daa8c414 | | kafka-connectivity | ok | Cluster=NTc1NWU1MTQxYmY3MTFlZg, Broker=1, Brokers=[1, 2, 3] |
| rda_app | alert-ingester | f9ec55862be0 | f9b9231c | | service-status | ok | |
| rda_app | alert-ingester | f9ec55862be0 | f9b9231c | | minio-connectivity | ok | |
| rda_app | alert-ingester | f9ec55862be0 | f9b9231c | | service-dependency:configuration-service | ok | 2 pod(s) found for configuration-service |
| rda_app | alert-ingester | f9ec55862be0 | f9b9231c | | service-initialization-status | ok | |
| rda_app | alert-ingester | f9ec55862be0 | f9b9231c | | kafka-connectivity | ok | Cluster=NTc1NWU1MTQxYmY3MTFlZg, Broker=2, Brokers=[1, 2, 3] |
| rda_app | alert-processor | c6cc7b04ab33 | b4ebfb06 | | service-status | ok | |
| rda_app | alert-processor | c6cc7b04ab33 | b4ebfb06 | | minio-connectivity | ok | |
+-----------+----------------------------------------+--------------+----------+-------------+-----------------------------------------------------+----------+-------------------------------------------------------------+
1.2.10 Upgrade Event Gateway Services
Important
This upgrade applies to Non-K8s deployments only.
Step 1. Prerequisites
- The Event Gateway with the 8.0.0 tag should already be installed.
Note
If the Event Gateway was deployed using the RDAF CLI, follow Step 2 and skip Step 3. If it was not deployed using the RDAF CLI, go directly to Step 3.
Step 2. Upgrade Event Gateway Using RDAF CLI
-
Log in to the RDAF CLI VM.
-
Run the following command to check the status of the Event Gateway.
+-------------------+-----------------+------------------+--------------------------+---------+
| Name | Host | Status | Container Id | Tag |
+-------------------+-----------------+------------------+--------------------------+---------+
| rda_event_gateway | 192.168.107.188 | Up 2 hours       | a1b2c3d4e5f6             | 8.0.0   |
| rda_event_gateway | 192.168.107.189 | Up 16 hours      | 7a8b9c0d1e2f             | 8.0.0   |
+-------------------+-----------------+------------------+--------------------------+---------+
Note
- If the command returns a valid status output, proceed with the upgrade using RDAF CLI.
- If no status output is shown, skip to Step 3 (Upgrade using Docker Compose).
- To upgrade the Event Gateway, log in to the RDAF CLI VM and execute the following command.
- Verify that the upgrade completed successfully by rechecking the Event Gateway status.
+-------------------+-----------------+------------------+--------------------------+---------+
| Name | Host | Status | Container Id | Tag |
+-------------------+-----------------+------------------+--------------------------+---------+
| rda_event_gateway | 192.168.107.188 | Up About an hour | a1b2c3d4e5f6 | 8.1.1 |
| rda_event_gateway | 192.168.107.189 | Up 21 hours | 7a8b9c0d1e2f | 8.1.1 |
+-------------------+-----------------+------------------+--------------------------+---------+
Step 3. Upgrade Event Gateway Using Docker Compose File
-
Log in to the VM where the Event Gateway is installed.
-
Navigate to the location where the Event Gateway was previously installed, using the following command.
-
Edit the docker-compose file for the Event Gateway using a local editor (e.g. vi), update the tag, and save it.
version: '3.1'
services:
  rda_event_gateway:
    image: docker1.cloudfabrix.io:443/external/ubuntu-rda-event-gateway:8.1.1
    restart: always
    network_mode: host
    mem_limit: 6G
    memswap_limit: 6G
    volumes:
      - /opt/rdaf/network_config:/network_config
      - /opt/rdaf/event_gateway/config:/event_gw_config
      - /opt/rdaf/event_gateway/certs:/certs
      - /opt/rdaf/event_gateway/logs:/logs
      - /opt/rdaf/event_gateway/log_archive:/tmp/log_archive
    logging:
      driver: "json-file"
      options:
        max-size: "25m"
        max-file: "5"
    environment:
      RDA_NETWORK_CONFIG: /network_config/rda_network_config.json
      EVENT_GW_MAIN_CONFIG: /event_gw_config/main/main.yml
      EVENT_GW_SNMP_TRAP_CONFIG: /event_gw_config/snmptrap/trap_template.json
      EVENT_GW_SNMP_TRAP_ALERT_CONFIG: /event_gw_config/snmptrap/trap_to_alert_go.yaml
      AGENT_GROUP: event_gateway_site01
      EVENT_GATEWAY_CONFIG_DIR: /event_gw_config
      LOGGER_CONFIG_FILE: /event_gw_config/main/logging.yml
-
Please run the following commands.
-
Use the command shown below to ensure that the RDA docker instances are up and running.
-
Use the below mentioned command to check the docker logs for any errors.
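For example, error-level lines can be surfaced from the container logs like this (a sketch: the container name rda_event_gateway follows the compose file above, and the 30-minute window is an arbitrary choice):

```shell
# Scan the last 30 minutes of Event Gateway logs for error-level lines.
docker logs --since 30m rda_event_gateway 2>&1 \
  | grep -iE 'error|exception|traceback' \
  || echo "no recent errors found"
```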
1.2.11 RDA Studio Upgrade
Navigate to the rda-studio.yml file and modify the existing tag version to 8.1.1 so that it matches the format shown in the example below, then save the file.
services:
  cfxdx:
    image: docker1.cloudfabrix.io:443/external/ubuntu-cfxdx-nb-nginx-all:8.1.1
    restart: unless-stopped
    volumes:
      - /opt/rdaf/cfxdx/home/:/root
      - /opt/rdaf/cfxdx/config/:/tmp/config/
      - /opt/rdaf/cfxdx/output:/tmp/output/
      - /opt/rdaf/config/network_config/:/network_config
    ports:
      - "9998:9998"
    environment:
      #JUPYTER_TOKEN: cfxdxdemo
      NLTK_DATA: "/root/nltk_data"
      CFXDX_CONFIG_FILE: /tmp/config/conf.yml
      RDA_NETWORK_CONFIG: /network_config/config.json
      RDA_USER: xxxxxxx
      RDA_PASSWORD: xxxxxxxxxxxx
After updating the rda-studio.yml file to set the tag version to 8.1.1, execute the following commands to pull the latest images and start the services
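The usual docker-compose refresh cycle looks like the sketch below (run it from the directory containing rda-studio.yml; adjust the path if your deployment keeps the file elsewhere):

```shell
# Pull the 8.1.1 image referenced by the updated file, then recreate the service.
docker-compose -f rda-studio.yml pull \
  && docker-compose -f rda-studio.yml up -d \
  || echo "docker-compose failed; check that rda-studio.yml is in the current directory"
```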
1.2.12 Upgrade RDAF Bulkstats Services
Note
The RDAF Bulkstats service is optional and only necessary if the Bulkstats data ingestion feature is required. Otherwise, you may skip the steps below and go to the next section.
Run the below command to upgrade bulk_stats services
Run the below command to get the bulk_stats status
+----------------+----------------+------------+--------------------------+---------+
| Name | Host | Status | Container Id | Tag |
+----------------+----------------+------------+--------------------------+---------+
| rda_bulk_stats | 192.168.108.51 | Up 4 hours | a1b2c3d4e5f6 | 8.1.1 |
| rda_bulk_stats | 192.168.108.52 | Up 4 hours | 7a8b9c0d1e2f | 8.1.1 |
+----------------+----------------+------------+--------------------------+---------+
1.2.12.1 Upgrade RDAF File Object Services
Note
This service is applicable for Non-K8s deployments only. The RDAF File Object service is optional and only necessary if the Bulkstats data ingestion feature is required. Otherwise, you may skip the steps below and go to the next section.
Run the below command to upgrade File Object services.
Run the below command to get the file_object status
+-----------------+----------------+---------------+--------------------------+---------+
| Name | Host | Status | Container Id | Tag |
+-----------------+----------------+---------------+--------------------------+---------+
| rda_file_object | 192.168.108.51 | Up 54 seconds | a1b2c3d4e5f6 | 8.1.1 |
| rda_file_object | 192.168.108.52 | Up 52 seconds | 7a8b9c0d1e2f | 8.1.1 |
+-----------------+----------------+---------------+--------------------------+---------+
1.2.13 Nginx Load Balancer for Event Gateway
Note
Update the Nginx configuration file to enable log rotation for the Event Gateway, only when a Load Balancer is deployed.
Add the below mentioned configuration file with the specified content and restart the Nginx container
Paste the below content and save it.
1.3. Post Upgrade Steps
Step 1: Manually restart both instances of the "app-controller" service using the docker command.
Step 2: Update the Cleared Alerts Data Retention in Database property to 8760 hours (1 year) of retention. Path to update the property: Main Menu --> Administration --> Configurations --> click the row-level action for Cleared Alerts Data Retention in Database.
Step 3: Purge Resolved/Closed incidents data from the IRM/AP/Collab DB and pstreams. This change is needed so that incidents, alerts, and collaboration data can be maintained consistently across the system.
Currently, the purging mechanism relies on the retention_days setting defined per pstream. As a result, related data (e.g., alerts or collab messages) may be retained for different durations, leading to inconsistencies in how incident-related information is managed throughout the system.
Note
This change is needed so that the collector will not purge data from the pstreams.
Go to Main Menu --> Configuration --> RDA Administration --> Persistent Streams. Update the pstream definition for oia-alerts-payload by removing the retention_days and retention_purge_extra_filter attributes, if the pstream has these properties defined.
(The old and new pstream configurations were shown here side by side; the new configuration is identical except that the retention_days and retention_purge_extra_filter attributes are removed.)
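For illustration only (the attribute values below are hypothetical placeholders, and a real oia-alerts-payload definition carries additional attributes), the edit amounts to deleting these two keys from the pstream JSON definition:

```json
{
  "name": "oia-alerts-payload",
  "retention_days": 45,
  "retention_purge_extra_filter": "status is 'Cleared'"
}
```

After removing retention_days and retention_purge_extra_filter, the remaining attributes stay unchanged, and data in this pstream is no longer purged based on per-stream retention.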
Step 4: In 8.0.0, system bundles were converted to packs at api-server startup and uploaded so that they would be available on the Packs page by default.
In 8.1.1, not all system bundles are needed, and the relevant ones have been converted to packs that can be uploaded on demand. As a result of this change, environments upgraded from 8.0.0 to 8.1.1 need to run a script to remove the "system bundles" that were added to the Packs page.
The upgrade script deletes the "bundle packs" from the Packs page if they are not activated. Steps:
Download the Script delete_bundle_packs_8.0_to_8.1_upgrade.py
wget https://macaw-amer.s3.us-east-1.amazonaws.com/releases/rdaf-platform/1.4.1/delete_bundle_packs_8.0_to_8.1_upgrade.py
Copy the downloaded file into the api-server container.
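A typical way to do this is sketched below; the container name cfx-rda-api-server is an assumption, so substitute the actual api-server container name from the docker ps output:

```shell
# Copy the script into the api-server container's /tmp, then open a shell there.
docker cp delete_bundle_packs_8.0_to_8.1_upgrade.py cfx-rda-api-server:/tmp/ \
  || echo "adjust the container name; list running containers with: docker ps"
docker exec -it cfx-rda-api-server /bin/bash \
  || echo "adjust the container name; list running containers with: docker ps"
```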
Run the script as follows
Run in test mode to see what would be deleted
(cfx_venv) root@2602fff46f91:/tmp# python delete_bundle_packs_8.0_to_8.1_upgrade.py --test
/cfx_venv/lib/python3.12/site-packages/google/protobuf/runtime_version.py:98: UserWarning: Protobuf gencode version 5.29.0 is exactly one major version older than the runtime version 6.30.2 at nats.proto. Please update the gencode to avoid compatibility violations in the next runtime release.
warnings.warn(
/cfx_venv/lib/python3.12/site-packages/cfxql/__init__.py:8: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
import pkg_resources
Running in TEST MODE - no actual deletions will be performed
Starting 8.0 to 8.1 upgrade pack cleanup process...
Processing 33 packs for 8.0 to 8.1 upgrade cleanup...
2025-07-07 14:23:10,890 [PID=2651:TID=Thread-2 (subscibe_over_grpc_internal):cfx.rda_messaging.nats.nats_subscriber:subscibe_over_grpc_internal:274] INFO - Subscribing to tenants.98e005500460423c886d8e30d8a9acf6.streams.rda-systc65fa61756c5f235299dc80f85c48356 using grpc
2025-07-07 14:23:10,891 [PID=2651:TID=Thread-2 (subscibe_over_grpc_internal):cfx.rda_nats.nats_grpc_util:subscribe:101] INFO - Subscribing on subject tenants.98e005500460423c886d8e30d8a9acf6.streams.rda-systc65fa61756c5f235299dc80f85c48356/1a0deaf4. ID: e4a04f90-e4e9-47ce-8873-eedbdb73c9ea
2025-07-07 14:23:10,899 [PID=2651:TID=NATS_SUB_299dc80f85c48356:cfx.rda_nats.nats_grpc_util:_subscription_worker:142] INFO - Received response on subscription on subject tenants.98e005500460423c886d8e30d8a9acf6.streams.rda-systc65fa61756c5f235299dc80f85c48356/1a0deaf4. ID: e4a04f90-e4e9-47ce-8873-eedbdb73c9ea. Time taken 8 msec
2025-07-07 14:23:10,955 [PID=2651:TID=MainThread:cfx.rda_messaging.dataplane_policy:__load_from_file:66] INFO - Loading DataPlanePolicy from file: /network_config/policy.json
2025-07-07 14:23:10,957 [PID=2651:TID=MainThread:cfx.rda_messaging.dataplane_policy:__load_from_file:106] INFO - Loaded Dataplane policy with 1 configs, 3 pstream-mappings
2025-07-07 14:23:10,957 [PID=2651:TID=MainThread:cfx.rda_messaging.dataplane_policy:__load_from_file:110] INFO - Dataplane custom routing enabled
2025-07-07 14:23:10,963 [PID=2651:TID=Thread-3 (subscibe_over_grpc_internal):cfx.rda_messaging.nats.nats_subscriber:subscibe_over_grpc_internal:274] INFO - Subscribing to tenants.98e005500460423c886d8e30d8a9acf6.streams.rda-systc65fa61756c5f235299dc80f85c48356 using grpc
2025-07-07 14:23:10,963 [PID=2651:TID=Thread-3 (subscibe_over_grpc_internal):cfx.rda_nats.nats_grpc_util:subscribe:101] INFO - Subscribing on subject tenants.98e005500460423c886d8e30d8a9acf6.streams.rda-systc65fa61756c5f235299dc80f85c48356/19cab3a1. ID: 730545b5-afe4-4033-8c9b-7944ef747413
2025-07-07 14:23:10,966 [PID=2651:TID=NATS_SUB_299dc80f85c48356:cfx.rda_nats.nats_grpc_util:_subscription_worker:142] INFO - Received response on subscription on subject tenants.98e005500460423c886d8e30d8a9acf6.streams.rda-systc65fa61756c5f235299dc80f85c48356/19cab3a1. ID: 730545b5-afe4-4033-8c9b-7944ef747413. Time taken 2 msec
2025-07-07 14:23:11,142 [PID=2651:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:does_pack_exists:2037] INFO - Pack Cisco BPA version 1.0.0 exists
[TEST MODE] Pack 'Cisco BPA' version '1.0.0' would be deleted (not activated)
2025-07-07 14:23:11,213 [PID=2651:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:does_pack_exists:2037] INFO - Pack Topology Path Visualisation version 1.0.0 exists
[TEST MODE] Pack 'Topology Path Visualisation' version '1.0.0' would be deleted (not activated)
2025-07-07 14:23:11,245 [PID=2651:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:does_pack_exists:2037] INFO - Pack Topology version 1.0.0 exists
[TEST MODE] Pack 'Topology' version '1.0.0' would be deleted (not activated)
Pack 'System' version '1.0.0' does not exist, skipping
Pack 'Synthetic Metrics' version '1.0.0' does not exist, skipping
8.0 to 8.1 upgrade cleanup complete. 29 packs processed for deletion.
2025-07-07 14:23:12,213 [PID=2651:TID=MainThread:cfx.rda_messaging.nats.nats_subscriber:handle_exit:285] INFO - Initiating clean shutdown for subscription tenants.98e005500460423c886d8e30d8a9acf6.streams.rda-systc65fa61756c5f235299dc80f85c48356/19cab3a1
Run the actual deletion
(cfx_venv) root@2602fff46f91:/tmp# python delete_bundle_packs_8.0_to_8.1_upgrade.py
/cfx_venv/lib/python3.12/site-packages/google/protobuf/runtime_version.py:98: UserWarning: Protobuf gencode version 5.29.0 is exactly one major version older than the runtime version 6.30.2 at nats.proto. Please update the gencode to avoid compatibility violations in the next runtime release.
warnings.warn(
/cfx_venv/lib/python3.12/site-packages/cfxql/__init__.py:8: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
import pkg_resources
Starting 8.0 to 8.1 upgrade pack cleanup process...
Processing 33 packs for 8.0 to 8.1 upgrade cleanup...
2025-07-07 14:27:00,757 [PID=2677:TID=Thread-2 (subscibe_over_grpc_internal):cfx.rda_messaging.nats.nats_subscriber:subscibe_over_grpc_internal:274] INFO - Subscribing to tenants.98e005500460423c886d8e30d8a9acf6.streams.rda-systc65fa61756c5f235299dc80f85c48356 using grpc
2025-07-07 14:27:00,757 [PID=2677:TID=Thread-2 (subscibe_over_grpc_internal):cfx.rda_nats.nats_grpc_util:subscribe:101] INFO - Subscribing on subject tenants.98e005500460423c886d8e30d8a9acf6.streams.rda-systc65fa61756c5f235299dc80f85c48356/02ad5eca. ID: 4c0091c8-249f-4cd5-9f8a-937381dea8f8
2025-07-07 14:27:00,760 [PID=2677:TID=NATS_SUB_299dc80f85c48356:cfx.rda_nats.nats_grpc_util:_subscription_worker:142] INFO - Received response on subscription on subject tenants.98e005500460423c886d8e30d8a9acf6.streams.rda-systc65fa61756c5f235299dc80f85c48356/02ad5eca. ID: 4c0091c8-249f-4cd5-9f8a-937381dea8f8. Time taken 3 msec
2025-07-07 14:27:00,806 [PID=2677:TID=MainThread:cfx.rda_messaging.dataplane_policy:__load_from_file:66] INFO - Loading DataPlanePolicy from file: /network_config/policy.json
2025-07-07 14:27:00,806 [PID=2677:TID=MainThread:cfx.rda_messaging.dataplane_policy:__load_from_file:106] INFO - Loaded Dataplane policy with 1 configs, 3 pstream-mappings
2025-07-07 14:27:00,807 [PID=2677:TID=MainThread:cfx.rda_messaging.dataplane_policy:__load_from_file:110] INFO - Dataplane custom routing enabled
2025-07-07 14:27:00,810 [PID=2677:TID=Thread-3 (subscibe_over_grpc_internal):cfx.rda_messaging.nats.nats_subscriber:subscibe_over_grpc_internal:274] INFO - Subscribing to tenants.98e005500460423c886d8e30d8a9acf6.streams.rda-systc65fa61756c5f235299dc80f85c48356 using grpc
2025-07-07 14:27:00,811 [PID=2677:TID=Thread-3 (subscibe_over_grpc_internal):cfx.rda_nats.nats_grpc_util:subscribe:101] INFO - Subscribing on subject tenants.98e005500460423c886d8e30d8a9acf6.streams.rda-systc65fa61756c5f235299dc80f85c48356/79c09d31. ID: 3d7d6d4a-7d10-4cc1-b6a6-25e5ed21d0ab
2025-07-07 14:27:00,813 [PID=2677:TID=NATS_SUB_299dc80f85c48356:cfx.rda_nats.nats_grpc_util:_subscription_worker:142] INFO - Received response on subscription on subject tenants.98e005500460423c886d8e30d8a9acf6.streams.rda-systc65fa61756c5f235299dc80f85c48356/79c09d31. ID: 3d7d6d4a-7d10-4cc1-b6a6-25e5ed21d0ab. Time taken 1 msec
2025-07-07 14:27:00,946 [PID=2677:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:does_pack_exists:2037] INFO - Pack Cisco BPA version 1.0.0 exists
Deleting pack 'Cisco BPA' version '1.0.0'
2025-07-07 14:27:00,993 [PID=2677:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:remove_pack:612] INFO - Checking if pack Cisco BPA 1.0.0 exists before removing
2025-07-07 14:27:01,170 [PID=2677:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:is_any_customer_enabled:1130] INFO - Unable to find any enabled customers package with scope query: id is 'Cisco BPA' and version is '1.0.0' and status is 'ACTIVATED'
2025-07-07 14:27:01,187 [PID=2677:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:is_enabled_for_single_tenant:1185] INFO - Unable to find any enabled customers package with scope query: id is 'Cisco BPA' and version is '1.0.0' and single_tenant_status is 'ACTIVATED'
2025-07-07 14:27:01,202 [PID=2677:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:remove_pack:623] INFO - Deleting decription MD file for pack Cisco BPA version 1_0_0
Done deleting objects
2025-07-07 14:27:01,393 [PID=2677:TID=MainThread:cfx.rda_messaging.sync_artifacts:delete_artifact:1607] INFO - Deleting rda_objects rda-objects/data/Cisco BPA/1_0_0/77be143c-description_Cisco BPA_1_0_0.data from pstream
2025-07-07 14:27:12,751 [PID=2677:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:remove_pack:628] INFO - Deleting pack HP Network Automation version 1_0_0
2025-07-07 14:27:12,768 [PID=2677:TID=MainThread:cfx.rda_messaging.sync_artifacts:delete_artifact:1607] INFO - Deleting rda_packs rda_packs/HP Network Automation from pstream
2025-07-07 14:27:12,800 [PID=2677:TID=MainThread:cfx.rda_messaging.sync_artifacts:delete_artifact:1644] INFO - Response from deleting: {'status': 'ok', 'reason': '', 'data': {'took': 11, 'timed_out': False, 'total': 1, 'deleted': 1, 'batches': 1, 'version_conflicts': 0, 'noops': 0, 'retries': {'bulk': 0, 'search': 0}, 'throttled_millis': 0, 'requests_per_second': -1.0, 'throttled_until_millis': 0}, 'now': '2025-07-07T14:27:12.798267'}
2025-07-07 14:27:12,800 [PID=2677:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:delete_from_customer_stream:178] INFO - Deleting pack from customer-rda-packs: id='HP Network Automation' and version='1.0.0'
Successfully deleted pack 'HP Network Automation' version '1.0.0'
Processing summary:
- Packs processed: 33
- Packs marked for deletion: 29
- Packs successfully deleted: 29
- Packs skipped (activated): 0
- Packs skipped (not found): 4
8.0 to 8.1 upgrade cleanup complete. 29 packs processed for deletion.
Run the following command to get statistics on the packs listed in the Packs page (total number of packs, activated packs, and the number that will be deleted):
(cfx_venv) root@2602fff46f91:/tmp# python delete_bundle_packs_8.0_to_8.1_upgrade.py --stats
/cfx_venv/lib/python3.12/site-packages/google/protobuf/runtime_version.py:98: UserWarning: Protobuf gencode version 5.29.0 is exactly one major version older than the runtime version 6.30.2 at nats.proto. Please update the gencode to avoid compatibility violations in the next runtime release.
warnings.warn(
/cfx_venv/lib/python3.12/site-packages/cfxql/__init__.py:8: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
import pkg_resources
Gathering pack statistics...
2025-07-07 16:10:15,072 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:707] INFO - Getting minio path: rda_packs/
2025-07-07 16:10:15,116 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Cisco Meraki/9_0_1/Cisco Meraki.tar.gz
2025-07-07 16:10:15,116 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Cisco Meraki/9_0_1/manifest.yaml
2025-07-07 16:10:15,116 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:713] INFO - Reading manifest file: rda_packs/Cisco Meraki/9_0_1/manifest.yaml
2025-07-07 16:10:15,121 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Cisco vManage/9_0_1/Cisco vManage.tar.gz
2025-07-07 16:10:15,121 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Cisco vManage/9_0_1/manifest.yaml
2025-07-07 16:10:15,121 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:713] INFO - Reading manifest file: rda_packs/Cisco vManage/9_0_1/manifest.yaml
2025-07-07 16:10:15,128 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Fabrix AIOps Asset Correlation Regression/9_0_0/Fabrix AIOps Asset Correlation Regression.tar.gz
2025-07-07 16:10:15,128 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Fabrix AIOps Asset Correlation Regression/9_0_0/manifest.yaml
2025-07-07 16:10:15,128 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:713] INFO - Reading manifest file: rda_packs/Fabrix AIOps Asset Correlation Regression/9_0_0/manifest.yaml
2025-07-07 16:10:15,214 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Fabrix AIOps Fault Management Base/8_1_1/Fabrix AIOps Fault Management Base.tar.gz
2025-07-07 16:10:15,214 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Fabrix AIOps Fault Management Base/8_1_1/manifest.yaml
2025-07-07 16:10:15,214 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:713] INFO - Reading manifest file: rda_packs/Fabrix AIOps Fault Management Base/8_1_1/manifest.yaml
2025-07-07 16:10:15,220 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Fabrix AIOps Fault Management Base/9_0_1/Fabrix AIOps Fault Management Base.tar.gz
2025-07-07 16:10:15,221 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Fabrix AIOps Fault Management Base/9_0_1/manifest.yaml
2025-07-07 16:10:15,221 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:713] INFO - Reading manifest file: rda_packs/Fabrix AIOps Fault Management Base/9_0_1/manifest.yaml
2025-07-07 16:10:15,227 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Fabrix AIOps Fault Management Base/9_0_10/Fabrix AIOps Fault Management Base.tar.gz
2025-07-07 16:10:15,227 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Fabrix AIOps Fault Management Base/9_0_10/manifest.yaml
2025-07-07 16:10:15,227 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:713] INFO - Reading manifest file: rda_packs/Fabrix AIOps Fault Management Base/9_0_10/manifest.yaml
2025-07-07 16:10:15,234 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Fabrix AIOps Fault Management Base/9_0_3/Fabrix AIOps Fault Management Base.tar.gz
2025-07-07 16:10:15,234 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Fabrix AIOps Fault Management Base/9_0_3/manifest.yaml
2025-07-07 16:10:15,234 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:713] INFO - Reading manifest file: rda_packs/Fabrix AIOps Fault Management Base/9_0_3/manifest.yaml
2025-07-07 16:10:15,241 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Fabrix AIOps Fault Management Base/9_0_8/Fabrix AIOps Fault Management Base.tar.gz
2025-07-07 16:10:15,241 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Fabrix AIOps Fault Management Base/9_0_8/manifest.yaml
2025-07-07 16:10:15,241 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:713] INFO - Reading manifest file: rda_packs/Fabrix AIOps Fault Management Base/9_0_8/manifest.yaml
2025-07-07 16:10:15,247 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Fabrix AIOps Fault Management Base/9_0_9/Fabrix AIOps Fault Management Base.tar.gz
2025-07-07 16:10:15,248 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Fabrix AIOps Fault Management Base/9_0_9/manifest.yaml
2025-07-07 16:10:15,248 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:713] INFO - Reading manifest file: rda_packs/Fabrix AIOps Fault Management Base/9_0_9/manifest.yaml
2025-07-07 16:10:15,255 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Fabrix AIOps ML Metrics Regression/9_0_0/Fabrix AIOps ML Metrics Regression.tar.gz
2025-07-07 16:10:15,255 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Fabrix AIOps ML Metrics Regression/9_0_0/manifest.yaml
2025-07-07 16:10:15,255 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:713] INFO - Reading manifest file: rda_packs/Fabrix AIOps ML Metrics Regression/9_0_0/manifest.yaml
2025-07-07 16:10:15,259 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:list_packs_from_minio:711] INFO - Getting manifest file for rda_packs/Fabrix AIOps ML/9_0_0/Fabrix AIOps ML.tar.gz
Pack Statistics:
2025-07-07 16:10:15,926 [PID=2897:TID=MainThread:cfx.rda_packs.rda_pack_mgmt:is_pack_activated:413] INFO - Pack VMWare vCenter version 9.0.2 is already in ACTIVATED state.
{
"total_packs": 16,
"activated_packs": 13,
"non_activated_packs": 3,
"pack_details": [
{
"pack_name": "Cisco Meraki",
"version": "9.0.1",
"activated": false
},
{
"pack_name": "Cisco vManage",
"version": "9.0.1",
"activated": true
},
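If the statistics output is saved to a file, a small shell helper can surface the non-activated packs from the pretty-printed JSON shown above. This is an illustrative sketch only: the function name and the file name are hypothetical, and it relies on the `pack_name`/`version` lines directly preceding each `"activated"` flag, as in the output above.

```shell
# list_non_activated FILE — print the "pack_name" lines for packs whose
# "activated" flag is false. Relies on the pretty-printed layout shown
# above, where the name and version lines sit just before the flag.
list_non_activated() {
    grep -B2 '"activated": false' "$1" | grep '"pack_name"'
}

# Usage (stats.json is a hypothetical file holding the saved JSON output):
# list_non_activated stats.json
```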
Step 5: Upload the following RDA Packs (go to Main Menu --> Configuration --> RDA Administration --> Packs --> click Upload Packs from Catalog), and activate them in the order given below to get the latest dashboard changes for OIA Alerts and Incidents.
Upload the following packs (ensure you select the correct versions):
- Fabrix AIOps Fault Management Base, version 9.0.14
- Fabrix Inventory Collection Base Pack, version 7.2.0
- VMWare vCenter, version 9.0.2
- Network Device Discovery, version 9.0.0
- Fabrix AIOps ML, version 9.0.0
- Fabrix AIOps Asset Correlation Regression, version 9.0.1
Step 6: Activate the packs in the following order to get the latest changes.
1. Activate the base packs first:
- Fabrix Inventory Collection Base Pack, version 7.2.0
- Fabrix AIOps Fault Management Base, version 9.0.14
2. Activate the following RDA Packs if they are deployed; packs that are not deployed can be skipped:
- VMWare vCenter, version 9.0.2
- Network Device Discovery, version 9.0.0
- Fabrix AIOps ML, version 9.0.0
- Fabrix AIOps Asset Correlation Regression, version 9.0.1
Step 7: After the upgrade, check the Platform, Worker, OIA Services, Event-gateway, and Bulkstats YAML files on the CLI VM, located at /opt/rdaf/deployment-scripts/values.yaml
Verify that SYS_PTRACE is present under the capabilities (cap_add) section for each service, as illustrated in the following example.
rda_api_server:
mem_limit: 4G
memswap_limit: 4G
privileged: true
environment:
RDA_STUDIO_URL: '""'
RDA_ENABLE_TRACES: 'no'
DISABLE_REMOTE_LOGGING_CONTROL: 'no'
RDA_SELF_HEALTH_RESTART_AFTER_FAILURES: 3
deployment: true
hosts:
- 192.168.109.50
- 192.168.109.51
cap_add:
- SYS_PTRACE
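As a quick sanity check, the following shell sketch flags the values file if it does not yet contain the SYS_PTRACE capability, so it can be edited before services are restarted. The path is the one referenced in Step 7; the helper function name is illustrative, and a simple text match is used rather than full YAML parsing.

```shell
# check_sys_ptrace FILE — print the file name when it lacks the
# SYS_PTRACE capability, so it can be fixed before restarting services.
check_sys_ptrace() {
    if ! grep -q 'SYS_PTRACE' "$1"; then
        echo "Missing SYS_PTRACE: $1"
    fi
}

# Scan the deployment values file referenced in Step 7, if present.
for f in /opt/rdaf/deployment-scripts/values.yaml; do
    if [ -f "$f" ]; then
        check_sys_ptrace "$f"
    fi
done
```

Note that this only confirms the string appears somewhere in the file; each service's cap_add section should still be reviewed per the example above.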
Step 8: Purge Alert History Data from the Database
This task purges CLEARED alert data from the alerthistory table and migrates the CLEARED alert payloads from the alerthistory table to the oia-alert-payload PStream.
-
Execute the purge script as detailed in the Manual Purge History Alerts document.
-
Upon successful execution of the purge script, update the Cleared Alerts Data Retention in Database setting to 1 hour. To update this property, go to Main Menu --> Administration --> Configurations --> click the row-level action for Cleared Alerts Data Retention in Database.
Step 9: Copy Policies From DB to PStream
Suppression and correlation policies should be copied from the database to the pstream to simplify the authoring of rda_packs, dashboards, snapshots, and other artifacts.
Execute the CopyPoliciesFromDBToPStream script as detailed in the provided Copy Policies From DB to PStream document.