Introduction

Refer to Viewing Signals for details on Signals.

There are two types of Lead Signals:

  • Early Warning
  • Problem

HEAL sends notifications in the following cases:

  • When a signal is created. (Status is Open)
  • When a new service is added to the timeline. (Status is Open)
  • When the signal is upgraded. (Status is Upgraded)
  • When the signal is closed. (Status is Closed)
  • When a new severe event is added to the timeline. (Status is Open)
  • When the signal severity changes. (Status is Open)
  • When the signal stays open for long or for too long; regular reminders are sent. (Status is Open)

You receive a notification only for those services which are part of the applications assigned to you.

Signal Template Variables

Configuration variables are as follows:

  • Signal ID
  • Signal Description (impact summary)
  • Signal Type (Problem or Early Warning)
  • Severity (Critical or Non-Critical)
  • Status (open, upgraded, or closed)
  • Application Names (an entry-level service may belong to multiple applications)
  • Related Signals IDs
  • Started On (along with timezone)
  • Ended On (along with timezone)
  • User Name
  • Organization Name
  • Impacted Entry Point Service (NA in case of EW)
  • Root Cause Services (can be multiple)
  • Affected Services (newest affected service(s) added to the Signal timeline, which led to the notification being generated)
  • Affected Applications (application(s) that have the affected services tagged, excluding the application on which the Signal is raised)
  • Total Events Count (total number of events during the course of a signal)
  • Latest Event Count (on the current update)
  • Latest Event Time (along with the timezone) (on the current update)
  • Latest Event detected (for the current notification or update) : Service Name, Instance name, Request Name, Host address, KPI name, KPI attribute, Value, Unit, operation, Lower threshold, Upper Threshold

You can use all these variables across the templates.
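The following is a minimal sketch of how `{Variable}` placeholders of the kind used in the templates could be filled in. It is an illustration only: the `render` function and the regular expression are assumptions made for this example, not HEAL's actual template engine. The placeholder names are copied from the templates in this document, and the sample values come from the Lead Problem Open sample.

```python
import re

# Matches placeholders written as {Variable_Name}, the form used in the templates.
PLACEHOLDER = re.compile(r"\{([A-Za-z_][A-Za-z0-9_]*)\}")


def render(template: str, values: dict) -> str:
    """Replace each {Name} with values[Name]; unknown placeholders are left as-is."""
    def substitute(match):
        name = match.group(1)
        return str(values[name]) if name in values else match.group(0)

    return PLACEHOLDER.sub(substitute, template)


# A shortened template; the placeholder names are taken from this document.
template = (
    "{Signal_Type} is {Signal_Status} on application(s) {App_Names}.\n"
    "Severity: {Severity}\n"
    "Started On: {StartTime}({TimzoneShortName})"
)

# Illustrative values taken from the Lead Problem Open sample email.
values = {
    "Signal_Type": "Problem",
    "Signal_Status": "OPEN",
    "App_Names": "NetBanking-DR",
    "Severity": "Severe",
    "StartTime": "2020-12-29 17:57:00",
    "TimzoneShortName": "GMT +10:00",
}

rendered = render(template, values)
print(rendered)
```

Leaving unknown placeholders untouched, rather than raising an error, is a design choice of this sketch: it makes a misspelled variable name visible in the rendered text.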

Scenario

  • Jack is assigned to Application A1 (Travels), and Services S1 (Travel Web), S2 (Hotels), S3 (Hotel Inventory), S4 (Bookings), and S5 (Booking DB) are tagged to it.
  • Joe is assigned to Application A2 (Flights), and S4 (Bookings) is tagged to it.
  • Signal Open:

a) Some transaction failures are observed on S1. Five events are observed, on S1 only.

b) Latest event count is one.

c) Signal status is open.

d) Jack immediately receives a notification stating that the signal is open.

  • Signal Update 1:

a) Three events are observed on S2 and two events are observed on S3. Both S2 and S3 are added to the timeline. Two more events are observed on S1.

b) Latest event count is three.

c) Signal status is open.

d) Jack immediately receives a notification stating that the signal is open.

  • Signal Update 2:

a) Four events are observed on S4. S4 is added to the timeline. Two more events are observed on both S1 and S2.

b) Latest event count is four.

c) S4 is part of both Application A1 (Travels) and Application A2 (Flights).

d) Signal status is open.

e) Jack and Joe both immediately receive a notification stating that the signal is open.

  • Signal Closed:

a) Signal is closed.

b) Jack and Joe both immediately receive a notification stating that the signal is closed.
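The recipient rule in this scenario can be sketched in a few lines of Python. This is an illustrative model only: the dictionaries `user_apps` and `app_services` and the function `recipients` are hypothetical names introduced here, and the sketch assumes that a user is notified when any service on the signal timeline belongs to an application assigned to that user.

```python
# Hypothetical data model built from the scenario above.
user_apps = {"Jack": {"A1"}, "Joe": {"A2"}}
app_services = {
    "A1": {"S1", "S2", "S3", "S4", "S5"},
    "A2": {"S4"},
}


def recipients(timeline_services):
    """Return users who have an assigned application containing a timeline service."""
    notified = []
    for user, apps in user_apps.items():
        if any(app_services[app] & timeline_services for app in apps):
            notified.append(user)
    return sorted(notified)


# Replay the scenario: each step adds services to the signal timeline.
steps = [
    ("Signal Open", {"S1"}),
    ("Signal Update 1", {"S2", "S3"}),
    ("Signal Update 2", {"S4"}),
    ("Signal Closed", set()),
]

timeline = set()
for name, added in steps:
    timeline |= added
    print(name, recipients(timeline))
# Signal Open ['Jack']
# Signal Update 1 ['Jack']
# Signal Update 2 ['Jack', 'Joe']
# Signal Closed ['Jack', 'Joe']
```

Joe appears only from Signal Update 2 onward, because S4 is the first service on the timeline that is tagged to Application A2.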

Email Templates

Template for Lead Problem Open

Dear User,

{Events_Changes_Descriptions}

{Signal_Type} is {Signal_Status} on application(s) {App_Names}.

Affected service(s): {Affected_ServiceNames}.

For the detailed overview, Please select here.

Impacted Entry Point Service: {EntryPoint_ServiceName}

Suggested Root Cause at {RootCause_ServiceNames}

Affected Application(s): {Affected_ApplicationNames}

Severity: {Severity}

Started On: {StartTime}({TimzoneShortName})

{Signal_Summary}

{Signal_Request_Workload}

{Signal_Instance_Behaviour}

Total {Total_Events} event(s) detected so far on this {Signal_Type}

{Latest_Events}

Template for Lead Problem Closed

Dear User,

{Signal_Type} is {Signal_Status} on application(s) {App_Names}.

For the detailed overview, Please select here.

Impacted Entry Point Service: {EntryPoint_ServiceName}

Suggested Root Cause at {RootCause_ServiceNames}

Severity: {Severity}

Started On: {StartTime}({TimzoneShortName})

Ended On: {EndTime}({TimzoneShortName})

{Signal_Request_Workload}

{Signal_Instance_Behaviour}

{Signal_Summary}

Total {Total_Events} event(s) were detected on this {Signal_Type}

{Latest_Events}

Appreciate your efforts !

Early Warning Open

Dear User,

{Events_Changes_Descriptions}

{Signal_Type} is {Signal_Status} on application(s) {App_Names}.

Affected service(s): {Affected_ServiceNames}.

For the detailed overview, Please select here.

Impacted Entry Point Service: NA

Suggested Root Cause at {RootCause_ServiceNames}

Affected Application(s): {Affected_ApplicationNames}

Severity: {Severity}

Started On: {StartTime}({TimzoneShortName})

{Signal_Summary}

{Signal_Instance_Behaviour}

Total {Total_Events} event(s) detected so far on this {Signal_Type}

{Latest_Events}

Early Warning Closed

Dear User,

{Signal_Type} is {Signal_Status} on application(s) {App_Names}.

For the detailed overview, Please select here.

Impacted Entry Point Service: NA

Suggested Root Cause at {RootCause_ServiceNames}

Severity: {Severity}

Started On: {StartTime}({TimzoneShortName})

Ended On: {EndTime}({TimzoneShortName})

{Signal_Instance_Behaviour}

{Signal_Summary}

Total {Total_Events} event(s) were detected on this {Signal_Type}

{Latest_Events}

Appreciate your efforts !

Early Warning Upgraded

Dear User,

{Signal_Type} is {Signal_Status} on application(s) {App_Names}.

For the detailed overview, Please select here.

Impacted Entry Point Service: NA

Suggested Root Cause at {RootCause_ServiceNames}

Severity: {Severity}

Started On: {StartTime}({TimzoneShortName})

Ended On: {EndTime}({TimzoneShortName})

{Signal_Instance_Behaviour}

{Signal_Summary}

Total {Total_Events} event(s) were detected on this {Signal_Type}

{Latest_Events}

Appreciate your efforts !

Info Signal Email Template

Subject: <Signal Type> [<Signal ID>: <Description>]

Dear User,

<Signal Type> signal is detected on metric category <KPI category> in service <affected service> on application(s) <Application name> at <detected time>. For the detailed overview, Please select here.

Total <Total Events Count> event(s) detected on this <Signal Type> signal

{Signal_Summary}

{Latest_Events}

Batch Problem Open Email Template

Subject: Batch Problem[{Signal_ID}:{Batch_Job_Details}, Current Status: {batch_job_status}] {Signal_Status}

Dear User,

{Signal_Type} is {Signal_Status} on application {App_Names}.

For the detailed overview, Please select here.

Severity: {Severity}

Signal Started On: {StartTime}({TimzoneShortName})

Total {Total_Events} event(s) detected so far on this {Signal_Type}

Latest Event detected on this update on {Latest_Event_Time}:

Batch Group name:{Batch_Job_Group},Batch job id: {Batch_Job},KPI name: {KPI_Name},Actual Duration:{Actual_Duration},Unit: {KPI_Unit}, Expected Duration: {Expected_Duration}

{Latest_Events}

Batch Problem Closed Email Template

Subject: Batch Problem[{Signal_ID}:{Batch_Job_Details}, Current Status: {batch_job_status}] {Signal_Status}

Dear User,

{Signal_Type} is {Signal_Status} on application {App_Names}.

For the detailed overview, Please select here.

Severity: {Severity}

Signal Started On: {StartTime}({TimzoneShortName})

Signal Ended On: {EndTime}({TimzoneShortName})

Total {Total_Events} event(s) detected so far on this {Signal_Type}

Latest Event detected on this update on {Latest_Event_Time}:

Batch Group name:{Batch_Job_Group},Batch job id: {Batch_Job},KPI name: {KPI_Name},Actual Duration:{Actual_Duration},Unit: {KPI_Unit}, Expected Duration: {Expected_Duration}

{Latest_Events}

Forensic Email Template

Dear User,

Forensic is captured on metric KPI {KPIName} at {Event_Detected_Time}, category {CategoryName}, instance {InstanceName} in service {Affected_ServiceNames} on application(s) {App_Names}.

Threshold Details are below,

Severity: {Severity}
KPI Value:{KPIValue}
Operation:{Operation}
Threshold Value:Lower: {Lower} Upper: {Upper}

For more details kindly look into attachment.

Email Template Samples

Lead Problem Closed

Dear User,

Problem is CLOSED on application(s) LOS-DR,NetBanking-DR.

For the detailed overview, Please select here.

Impacted Entry Point Service: LOS-App-Service-DR

Suggested Root Cause at NB-App-Service-DR

Severity: Severe

Started On: 2020-12-30 22:15:00(GMT +09:00)

Ended On: 2020-12-30 22:25:20(GMT +09:00)

Top 1 request events details:-

Service Name | Affected Request Names | Affected Kpis | Events Count | Latest Event Detected Time
LOS-App-Service-DR | GET#/txn/branchserver1.aspx|srv=LOS-App-Service-DR|acc=2 | Fail (Default),Slow Percentage (Default),Slow (Default),Response Time (Default),Volume (Default) | 88 | 2020-12-30 22:07:00 (GMT +09:00)

Top 1 instance events details:-

Service Name | Affected Instance Names | Affected Kpis | Events Count | Latest Event Detected Time
NB-App-Service-DR | RHEL_NB_App_Host_146_Inst_1-DR | Total Process Count (Default),Total Transactions Committed (Default),Process CPU Util (Default),Established Status (Default),CPU Util (Severe),CPU Util (Default),Listen Status (Default),Ping Status (Default),Process Running (Default) | 158 | 2020-12-30 22:15:00 (GMT +09:00)

Signal Summary So Far:-

Service Name | Severe Events Count | Default Events Count | Affected Instance count | Affected Request count | Affected Kpi Categories | Latest Event Detected Time
NB-App-Service-DR | 27 | 131 | 4 | 0 | AppserverTransaction,DBAvailability,CPU,Process,Network Utilization | 2020-12-30 22:15:00 (GMT +09:00)
LOS-App-Service-DR | 0 | 88 | 0 | 10 | Errors,Volume,Slow | 2020-12-30 22:07:00 (GMT +09:00)

Latest Events Details:-

  • Event Id: AE2-52-1-C-ALL-26822241,Event Detected Time: 2020-12-30 22:20:00 (GMT +09:00),Service Name: NB-App-Service-DR,Instance name: RHEL_NB_App_Host_146_Inst_1-DR,Host address:192.168.13.146,KPI name: CPU Util,KPI attribute: NA,Value: 8.84,Unit: Percentage,Operation: greater than,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-52-254-C-192.168.13.146,8080-26822241,Event Detected Time: 2020-12-30 22:20:00 (GMT +09:00),Service Name: NB-App-Service-DR,Instance name: RHEL_NB_App_Host_146_Inst_1-DR,Host address:192.168.13.146,KPI name: Established Status,KPI attribute: 192.168.13.146,8080,Value: 0.0,Unit: Count,Operation: not equals,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-52-254-C-192.168.13.146,80-26822241,Event Detected Time: 2020-12-30 22:20:00 (GMT +09:00),Service Name: NB-App-Service-DR,Instance name: RHEL_NB_App_Host_146_Inst_1-DR,Host address:192.168.13.146,KPI name: Established Status,KPI attribute: 192.168.13.146,80,Value: 0.0,Unit: Count,Operation: not equals,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-52-255-C-192.168.13.146,8080-26822241,Event Detected Time: 2020-12-30 22:20:00 (GMT +09:00),Service Name: NB-App-Service-DR,Instance name: RHEL_NB_App_Host_146_Inst_1-DR,Host address:192.168.13.146,KPI name: Listen Status,KPI attribute: 192.168.13.146,8080,Value: 0.0,Unit: Count,Operation: not equals,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-52-255-C-192.168.13.146,80-26822241,Event Detected Time: 2020-12-30 22:20:00 (GMT +09:00),Service Name: NB-App-Service-DR,Instance name: RHEL_NB_App_Host_146_Inst_1-DR,Host address:192.168.13.146,KPI name: Listen Status,KPI attribute: 192.168.13.146,80,Value: 0.0,Unit: Count,Operation: not equals,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-52-250-C-httpd2-26822241,Event Detected Time: 2020-12-30 22:20:00 (GMT +09:00),Service Name: NB-App-Service-DR,Instance name: RHEL_NB_App_Host_146_Inst_1-DR,Host address:192.168.13.146,KPI name: Process Running,KPI attribute: httpd2,Value: 0.0,Unit: Count,Operation: not equals,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-52-250-C-httpd-26822241,Event Detected Time: 2020-12-30 22:20:00 (GMT +09:00),Service Name: NB-App-Service-DR,Instance name: RHEL_NB_App_Host_146_Inst_1-DR,Host address:192.168.13.146,KPI name: Process Running,KPI attribute: httpd,Value: 0.0,Unit: Count,Operation: not equals,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-52-251-C-192.168.14.146-26822241,Event Detected Time: 2020-12-30 22:20:00 (GMT +09:00),Service Name: NB-App-Service-DR,Instance name: RHEL_NB_App_Host_146_Inst_1-DR,Host address:192.168.13.146,KPI name: Ping Status,KPI attribute: 192.168.14.146,Value: 0.0,Unit: Count,Operation: not equals,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-2-1-C-ALL-26822242,Event Detected Time: 2020-12-30 22:20:00 (GMT +09:00),Service Name: NB-App-Service-DR,Instance name: RHEL-APP-HOST-Cluster-DR,Host address:NA,KPI name: CPU Util,KPI attribute: NA,Value: 8.84,Unit: Percentage,Operation: greater than,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-52-1-C-ALL-26822237,Event Detected Time: 2020-12-30 22:16:00 (GMT +09:00),Service Name: NB-App-Service-DR,Instance name: RHEL_NB_App_Host_146_Inst_1-DR,Host address:192.168.13.146,KPI name: CPU Util,KPI attribute: NA,Value: 10.17,Unit: Percentage,Operation: greater than,Lower threshold: 0.0,Upper Threshold: 0.0

Appreciate your efforts !

Lead Problem Open

Dear User,

  • Signal is open from past 0 Minutes

Problem is OPEN on application(s) NetBanking-DR.

Affected service(s): NB-App-Service-DR.

For the detailed overview, Please select here.

Impacted Entry Point Service: NB-App-Service-DR

Suggested Root Cause at NB-App-Service-DR

Affected Application(s): NetBanking-DR

Severity: Severe

Started On: 2020-12-29 17:57:00(GMT +10:00)

Signal Summary So Far:-

Service Name | Severe Events Count | Default Events Count | Affected Instance count | Affected Request count | Affected Kpi Categories | Latest Event Detected Time
NB-App-Service-DR | 109 | 379 | 2 | 1 | DBAvailability,Volume,CPU | 2020-12-29 17:57:00 (GMT +10:00)

Top 1 request events details:-

Service Name | Affected Request Names | Affected Kpis | Events Count | Latest Event Detected Time
NB-App-Service-DR | GET#/netbank/Access/nbApp4005|srv=NB-App-Service-DR|acc=2 | Volume (Default) | 1 | 2020-12-29 17:57:00 (GMT +10:00)

Top 1 instance events details:-

Service Name | Affected Instance Names | Affected Kpis | Events Count | Latest Event Detected Time
NB-App-Service-DR | RHEL_NB_App_Host_146_Inst_1-DR | Established Status (Default),CPU Util (Severe),Listen Status (Default),Ping Status (Default),Process Running (Default) | 487 | 2020-12-29 16:43:00 (GMT +10:00)

Total 488 event(s) detected so far on this Problem

Latest Events Details:-

  • Event Id: AE-2-30-398-C-26820496,Event Detected Time: 2020-12-29 17:57:00 (GMT +10:00),Service Name: NB-App-Service-DR,Request Name: GET#/netbank/Access/nbApp4005|srv=NB-App-Service-DR|acc=2,Host address:NA,KPI name: Volume,KPI attribute: NA,Value: 25.0,Unit: Count,Operation: not between,Lower threshold: 9.0,Upper Threshold: 20.0
  • Event Id: AE2-52-1-C-ALL-26820405,Event Detected Time: 2020-12-29 16:43:00 (GMT +10:00),Service Name: NB-App-Service-DR,Instance name: RHEL_NB_App_Host_146_Inst_1-DR,Host address:192.168.13.146,KPI name: CPU Util,KPI attribute: NA,Value: 10.0,Unit: Percentage,Operation: greater than,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-52-254-C-192.168.13.146,80-26820405,Event Detected Time: 2020-12-29 16:43:00 (GMT +10:00),Service Name: NB-App-Service-DR,Instance name: RHEL_NB_App_Host_146_Inst_1-DR,Host address:192.168.13.146,KPI name: Established Status,KPI attribute: 192.168.13.146,80,Value: 0.0,Unit: Count,Operation: not equals,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-2-1-C-ALL-26820406,Event Detected Time: 2020-12-29 16:43:00 (GMT +10:00),Service Name: NB-App-Service-DR,Instance name: RHEL-APP-HOST-Cluster-DR,Host address:NA,KPI name: CPU Util,KPI attribute: NA,Value: 10.0,Unit: Percentage,Operation: greater than,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-52-250-C-httpd2-26820405,Event Detected Time: 2020-12-29 16:43:00 (GMT +10:00),Service Name: NB-App-Service-DR,Instance name: RHEL_NB_App_Host_146_Inst_1-DR,Host address:192.168.13.146,KPI name: Process Running,KPI attribute: httpd2,Value: 0.0,Unit: Count,Operation: not equals,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-52-255-C-192.168.13.146,80-26820405,Event Detected Time: 2020-12-29 16:43:00 (GMT +10:00),Service Name: NB-App-Service-DR,Instance name: RHEL_NB_App_Host_146_Inst_1-DR,Host address:192.168.13.146,KPI name: Listen Status,KPI attribute: 192.168.13.146,80,Value: 0.0,Unit: Count,Operation: not equals,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-52-251-C-192.168.14.146-26820405,Event Detected Time: 2020-12-29 16:43:00 (GMT +10:00),Service Name: NB-App-Service-DR,Instance name: RHEL_NB_App_Host_146_Inst_1-DR,Host address:192.168.13.146,KPI name: Ping Status,KPI attribute: 192.168.14.146,Value: 0.0,Unit: Count,Operation: not equals,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-52-255-C-192.168.13.146,8080-26820405,Event Detected Time: 2020-12-29 16:43:00 (GMT +10:00),Service Name: NB-App-Service-DR,Instance name: RHEL_NB_App_Host_146_Inst_1-DR,Host address:192.168.13.146,KPI name: Listen Status,KPI attribute: 192.168.13.146,8080,Value: 0.0,Unit: Count,Operation: not equals,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-52-254-C-192.168.13.146,8080-26820405,Event Detected Time: 2020-12-29 16:43:00 (GMT +10:00),Service Name: NB-App-Service-DR,Instance name: RHEL_NB_App_Host_146_Inst_1-DR,Host address:192.168.13.146,KPI name: Established Status,KPI attribute: 192.168.13.146,8080,Value: 0.0,Unit: Count,Operation: not equals,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-52-250-C-httpd-26820405,Event Detected Time: 2020-12-29 16:43:00 (GMT +10:00),Service Name: NB-App-Service-DR,Instance name: RHEL_NB_App_Host_146_Inst_1-DR,Host address:192.168.13.146,KPI name: Process Running,KPI attribute: httpd,Value: 0.0,Unit: Count,Operation: not equals,Lower threshold: 0.0,Upper Threshold: 0.0

Early Warning Open

Dear User,

  • Signal is open from past 0 Minutes

Early Warning is OPEN on application(s) LOS-DR.

Affected service(s): LOS-Web-Service-DR.

For the detailed overview, please select here.

Impacted Entry Point Service: NA

Suggested Root Cause at LOS-Web-Service-DR

Affected Application(s): LOS-DR

Severity: Severe

Started On: 2020-12-28 17:35:00(GMT +08:00)

Signal Summary So Far:-

Service Name | Severe Events Count | Default Events Count | Affected Instance count | Affected Request count | Affected Kpi Categories | Latest Event Detected Time
LOS-Web-Service-DR | 1 | 0 | 1 | 0 | Network Utilization | 2020-12-28 17:35:00 (GMT +08:00)

Top 1 instance events details:-

Service Name | Affected Instance Names | Affected Kpis | Events Count | Latest Event Detected Time
LOS-Web-Service-DR | OHS_LOS_Web_110_Inst_1-DR | Time Wait (Severe) | 1 | 2020-12-28 17:35:00 (GMT +08:00)

Total 1 event(s) detected so far on this Early Warning

Latest Events Details:-

  • Event Id: AE2-84-46-C-443-26819137,Event Detected Time: 2020-12-28 17:35:00 (GMT +08:00),Service Name: LOS-Web-Service-DR,Instance name: OHS_LOS_Web_110_Inst_1-DR,Host address:192.168.13.112,KPI name: Time Wait,KPI attribute: 443,Value: 3.0,Unit: Count,Operation: greater than,Lower threshold: 0.0,Upper Threshold: 0.0

Early Warning Closed

Dear User,

Early Warning is CLOSED on application(s) LOS-DR.

For the detailed overview, Please select here.

Impacted Entry Point Service: NA

Suggested Root Cause at LOS-Web-Service-DR

Severity: Severe

Started On: 2020-12-28 20:58:00(GMT +08:00)

Ended On: 2020-12-28 22:58:00(GMT +08:00)

Top 1 instance events details:-

Service Name | Affected Instance Names | Affected Kpis | Events Count | Latest Event Detected Time
LOS-Web-Service-DR | OHS_LOS_Web_110_Inst_1-DR | Time Wait (Severe) | 22 | 2020-12-28 20:58:00 (GMT +08:00)

Signal Summary So Far:-

Service Name | Severe Events Count | Default Events Count | Affected Instance count | Affected Request count | Affected Kpi Categories | Latest Event Detected Time
LOS-Web-Service-DR | 22 | 0 | 1 | 0 | Network Utilization | 2020-12-28 20:58:00 (GMT +08:00)

Total 22 event(s) were detected on this Early Warning

Latest Events Details:-

  • Event Id: AE2-84-46-C-443-26819340,Event Detected Time: 2020-12-28 20:58:00 (GMT +08:00),Service Name: LOS-Web-Service-DR,Instance name: OHS_LOS_Web_110_Inst_1-DR,Host address:192.168.13.112,KPI name: Time Wait,KPI attribute: 443,Value: 1.0,Unit: Count,Operation: greater than,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-84-46-C-443-26819334,Event Detected Time: 2020-12-28 20:52:00 (GMT +08:00),Service Name: LOS-Web-Service-DR,Instance name: OHS_LOS_Web_110_Inst_1-DR,Host address:192.168.13.112,KPI name: Time Wait,KPI attribute: 443,Value: 4.0,Unit: Count,Operation: greater than,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-84-46-C-443-26819329,Event Detected Time: 2020-12-28 20:47:00 (GMT +08:00),Service Name: LOS-Web-Service-DR,Instance name: OHS_LOS_Web_110_Inst_1-DR,Host address:192.168.13.112,KPI name: Time Wait,KPI attribute: 443,Value: 5.0,Unit: Count,Operation: greater than,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-84-46-C-443-26819324,Event Detected Time: 2020-12-28 20:42:00 (GMT +08:00),Service Name: LOS-Web-Service-DR,Instance name: OHS_LOS_Web_110_Inst_1-DR,Host address:192.168.13.112,KPI name: Time Wait,KPI attribute: 443,Value: 2.0,Unit: Count,Operation: greater than,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-84-46-C-443-26819313,Event Detected Time: 2020-12-28 20:31:00 (GMT +08:00),Service Name: LOS-Web-Service-DR,Instance name: OHS_LOS_Web_110_Inst_1-DR,Host address:192.168.13.112,KPI name: Time Wait,KPI attribute: 443,Value: 5.0,Unit: Count,Operation: greater than,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-84-46-C-443-26819307,Event Detected Time: 2020-12-28 20:25:00 (GMT +08:00),Service Name: LOS-Web-Service-DR,Instance name: OHS_LOS_Web_110_Inst_1-DR,Host address:192.168.13.112,KPI name: Time Wait,KPI attribute: 443,Value: 4.0,Unit: Count,Operation: greater than,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-84-46-C-443-26819296,Event Detected Time: 2020-12-28 20:14:00 (GMT +08:00),Service Name: LOS-Web-Service-DR,Instance name: OHS_LOS_Web_110_Inst_1-DR,Host address:192.168.13.112,KPI name: Time Wait,KPI attribute: 443,Value: 5.0,Unit: Count,Operation: greater than,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-84-46-C-443-26819279,Event Detected Time: 2020-12-28 19:57:00 (GMT +08:00),Service Name: LOS-Web-Service-DR,Instance name: OHS_LOS_Web_110_Inst_1-DR,Host address:192.168.13.112,KPI name: Time Wait,KPI attribute: 443,Value: 4.0,Unit: Count,Operation: greater than,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-84-46-C-443-26819247,Event Detected Time: 2020-12-28 19:25:00 (GMT +08:00),Service Name: LOS-Web-Service-DR,Instance name: OHS_LOS_Web_110_Inst_1-DR,Host address:192.168.13.112,KPI name: Time Wait,KPI attribute: 443,Value: 5.0,Unit: Count,Operation: greater than,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-84-46-C-443-26819241,Event Detected Time: 2020-12-28 19:19:00 (GMT +08:00),Service Name: LOS-Web-Service-DR,Instance name: OHS_LOS_Web_110_Inst_1-DR,Host address:192.168.13.112,KPI name: Time Wait,KPI attribute: 443,Value: 3.0,Unit: Count,Operation: greater than,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-84-46-C-443-26819230,Event Detected Time: 2020-12-28 19:08:00 (GMT +08:00),Service Name: LOS-Web-Service-DR,Instance name: OHS_LOS_Web_110_Inst_1-DR,Host address:192.168.13.112,KPI name: Time Wait,KPI attribute: 443,Value: 5.0,Unit: Count,Operation: greater than,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-84-46-C-443-26819219,Event Detected Time: 2020-12-28 18:57:00 (GMT +08:00),Service Name: LOS-Web-Service-DR,Instance name: OHS_LOS_Web_110_Inst_1-DR,Host address:192.168.13.112,KPI name: Time Wait,KPI attribute: 443,Value: 5.0,Unit: Count,Operation: greater than,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-84-46-C-443-26819213,Event Detected Time: 2020-12-28 18:51:00 (GMT +08:00),Service Name: LOS-Web-Service-DR,Instance name: OHS_LOS_Web_110_Inst_1-DR,Host address:192.168.13.112,KPI name: Time Wait,KPI attribute: 443,Value: 3.0,Unit: Count,Operation: greater than,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-84-46-C-443-26819203,Event Detected Time: 2020-12-28 18:41:00 (GMT +08:00),Service Name: LOS-Web-Service-DR,Instance name: OHS_LOS_Web_110_Inst_1-DR,Host address:192.168.13.112,KPI name: Time Wait,KPI attribute: 443,Value: 4.0,Unit: Count,Operation: greater than,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-84-46-C-443-26819192,Event Detected Time: 2020-12-28 18:30:00 (GMT +08:00),Service Name: LOS-Web-Service-DR,Instance name: OHS_LOS_Web_110_Inst_1-DR,Host address:192.168.13.112,KPI name: Time Wait,KPI attribute: 443,Value: 5.0,Unit: Count,Operation: greater than,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-84-46-C-443-26819186,Event Detected Time: 2020-12-28 18:24:00 (GMT +08:00),Service Name: LOS-Web-Service-DR,Instance name: OHS_LOS_Web_110_Inst_1-DR,Host address:192.168.13.112,KPI name: Time Wait,KPI attribute: 443,Value: 3.0,Unit: Count,Operation: greater than,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-84-46-C-443-26819175,Event Detected Time: 2020-12-28 18:13:00 (GMT +08:00),Service Name: LOS-Web-Service-DR,Instance name: OHS_LOS_Web_110_Inst_1-DR,Host address:192.168.13.112,KPI name: Time Wait,KPI attribute: 443,Value: 5.0,Unit: Count,Operation: greater than,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-84-46-C-443-26819164,Event Detected Time: 2020-12-28 18:02:00 (GMT +08:00),Service Name: LOS-Web-Service-DR,Instance name: OHS_LOS_Web_110_Inst_1-DR,Host address:192.168.13.112,KPI name: Time Wait,KPI attribute: 443,Value: 5.0,Unit: Count,Operation: greater than,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-84-46-C-443-26819158,Event Detected Time: 2020-12-28 17:56:00 (GMT +08:00),Service Name: LOS-Web-Service-DR,Instance name: OHS_LOS_Web_110_Inst_1-DR,Host address:192.168.13.112,KPI name: Time Wait,KPI attribute: 443,Value: 2.0,Unit: Count,Operation: greater than,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-84-46-C-443-26819153,Event Detected Time: 2020-12-28 17:51:00 (GMT +08:00),Service Name: LOS-Web-Service-DR,Instance name: OHS_LOS_Web_110_Inst_1-DR,Host address:192.168.13.112,KPI name: Time Wait,KPI attribute: 443,Value: 1.0,Unit: Count,Operation: greater than,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-84-46-C-443-26819148,Event Detected Time: 2020-12-28 17:46:00 (GMT +08:00),Service Name: LOS-Web-Service-DR,Instance name: OHS_LOS_Web_110_Inst_1-DR,Host address:192.168.13.112,KPI name: Time Wait,KPI attribute: 443,Value: 4.0,Unit: Count,Operation: greater than,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-84-46-C-443-26819137,Event Detected Time: 2020-12-28 17:35:00 (GMT +08:00),Service Name: LOS-Web-Service-DR,Instance name: OHS_LOS_Web_110_Inst_1-DR,Host address:192.168.13.112,KPI name: Time Wait,KPI attribute: 443,Value: 3.0,Unit: Count,Operation: greater than,Lower threshold: 0.0,Upper Threshold: 0.0

Appreciate your efforts !

Early Warning Upgraded

Dear User,

Early Warning is UPGRADED on application(s) LOS-DR,NetBanking-DR.

For the detailed overview, Please select here.

Impacted Entry Point Service: NA

Suggested Root Cause at NB-App-Service-DR

Severity: Severe

Started On: 2020-12-30 21:10:00(GMT +10:00)

Ended On: 2020-12-30 21:10:00(GMT +10:00)

Top 1 instance events details:-

Service Name | Affected Instance Names | Affected Kpis | Events Count | Latest Event Detected Time
NB-App-Service-DR | RHEL_NB_App_Host_146_Inst_1-DR | Total Process Count (Default),Total Transactions Committed (Default),Process CPU Util (Default),Established Status (Default),CPU Util (Severe),Listen Status (Default),CPU Util (Default),Process Running (Default),Ping Status (Default) | 58 | 2020-12-30 21:26:00 (GMT +10:00)

Signal Summary So Far:-

Service Name | Severe Events Count | Default Events Count | Affected Instance count | Affected Request count | Affected Kpi Categories
NB-App-Service-DR | 10 | 48 | 4 | 0 | AppserverTransaction,DBAvailability,CPU,Process,Network Utilization
LOS-App-Service-DR | 0 | 1 | 0 | 1 | Slow

Total 59 event(s) were detected on this Early Warning

Latest Events Details:-

  • Event Id: AE2-52-1-C-ALL-26822127,Event Detected Time: 2020-12-30 21:26:00 (GMT +10:00),Service Name: NB-App-Service-DR,Instance name: RHEL_NB_App_Host_146_Inst_1-DR,Host address:192.168.13.146,KPI name: CPU Util,KPI attribute: NA,Value: 9.48,Unit: Percentage,Operation: greater than,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-52-254-C-192.168.13.146,8080-26822127,Event Detected Time: 2020-12-30 21:26:00 (GMT +10:00),Service Name: NB-App-Service-DR,Instance name: RHEL_NB_App_Host_146_Inst_1-DR,Host address:192.168.13.146,KPI name: Established Status,KPI attribute: 192.168.13.146,8080,Value: 0.0,Unit: Count,Operation: not equals,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-52-254-C-192.168.13.146,80-26822127,Event Detected Time: 2020-12-30 21:26:00 (GMT +10:00),Service Name: NB-App-Service-DR,Instance name: RHEL_NB_App_Host_146_Inst_1-DR,Host address:192.168.13.146,KPI name: Established Status,KPI attribute: 192.168.13.146,80,Value: 0.0,Unit: Count,Operation: not equals,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-52-255-C-192.168.13.146,8080-26822127,Event Detected Time: 2020-12-30 21:26:00 (GMT +10:00),Service Name: NB-App-Service-DR,Instance name: RHEL_NB_App_Host_146_Inst_1-DR,Host address:192.168.13.146,KPI name: Listen Status,KPI attribute: 192.168.13.146,8080,Value: 0.0,Unit: Count,Operation: not equals,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-52-255-C-192.168.13.146,80-26822127,Event Detected Time: 2020-12-30 21:26:00 (GMT +10:00),Service Name: NB-App-Service-DR,Instance name: RHEL_NB_App_Host_146_Inst_1-DR,Host address:192.168.13.146,KPI name: Listen Status,KPI attribute: 192.168.13.146,80,Value: 0.0,Unit: Count,Operation: not equals,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-52-251-C-192.168.14.146-26822127,Event Detected Time: 2020-12-30 21:26:00 (GMT +10:00),Service Name: NB-App-Service-DR,Instance name: RHEL_NB_App_Host_146_Inst_1-DR,Host address:192.168.13.146,KPI name: Ping Status,KPI attribute: 192.168.14.146,Value: 0.0,Unit: Count,Operation: not equals,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-52-250-C-httpd2-26822127,Event Detected Time: 2020-12-30 21:26:00 (GMT +10:00),Service Name: NB-App-Service-DR,Instance name: RHEL_NB_App_Host_146_Inst_1-DR,Host address:192.168.13.146,KPI name: Process Running,KPI attribute: httpd2,Value: 0.0,Unit: Count,Operation: not equals,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-52-250-C-httpd-26822127,Event Detected Time: 2020-12-30 21:26:00 (GMT +10:00),Service Name: NB-App-Service-DR,Instance name: RHEL_NB_App_Host_146_Inst_1-DR,Host address:192.168.13.146,KPI name: Process Running,KPI attribute: httpd,Value: 0.0,Unit: Count,Operation: not equals,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-2-1-C-ALL-26822128,Event Detected Time: 2020-12-30 21:26:00 (GMT +10:00),Service Name: NB-App-Service-DR,Instance name: RHEL-APP-HOST-Cluster-DR,Host address:NA,KPI name: CPU Util,KPI attribute: NA,Value: 9.48,Unit: Percentage,Operation: greater than,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-52-1-C-ALL-26822123,Event Detected Time: 2020-12-30 21:22:00 (GMT +10:00),Service Name: NB-App-Service-DR,Instance name: RHEL_NB_App_Host_146_Inst_1-DR,Host address:192.168.13.146,KPI name: CPU Util,KPI attribute: NA,Value: 10.84,Unit: Percentage,Operation: greater than,Lower threshold: 0.0,Upper Threshold: 0.0

Appreciate your efforts !

Reminder Email Notification

Dear User,

Signal is open from past 30 Minutes

Early Warning is OPEN on application(s) LOS-DR.

Affected service(s): LOS-Web-Service-DR.

For the detailed overview, Please select here.

Impacted Entry Point Service: NA

Suggested Root Cause at LOS-Web-Service-DR

Affected Application(s): LOS-DR

Severity: Severe

Started On: 2020-12-28 18:02:00(GMT +08:00)

Signal Summary So Far:-

Service Name | Severe Events Count | Default Events Count | Affected Instance count | Affected Request count | Affected Kpi Categories | Latest Event Detected Time
LOS-Web-Service-DR | 5 | 0 | 1 | 0 | Network Utilization | 2020-12-28 18:02:00 (GMT +08:00)

Top 1 instance events details:-

Service Name | Affected Instance Names | Affected Kpis | Events Count | Latest Event Detected Time
LOS-Web-Service-DR | OHS_LOS_Web_110_Inst_1-DR | Time Wait (Severe) | 5 | 2020-12-28 18:02:00 (GMT +08:00)

Total 5 event(s) detected so far on this Early Warning

Latest Events Details:-

  • Event Id: AE2-84-46-C-443-26819164,Event Detected Time: 2020-12-28 18:02:00 (GMT +08:00),Service Name: LOS-Web-Service-DR,Instance name: OHS_LOS_Web_110_Inst_1-DR,Host address:192.168.13.112,KPI name: Time Wait,KPI attribute: 443,Value: 5.0,Unit: Count,Operation: greater than,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-84-46-C-443-26819158,Event Detected Time: 2020-12-28 17:56:00 (GMT +08:00),Service Name: LOS-Web-Service-DR,Instance name: OHS_LOS_Web_110_Inst_1-DR,Host address:192.168.13.112,KPI name: Time Wait,KPI attribute: 443,Value: 2.0,Unit: Count,Operation: greater than,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-84-46-C-443-26819153,Event Detected Time: 2020-12-28 17:51:00 (GMT +08:00),Service Name: LOS-Web-Service-DR,Instance name: OHS_LOS_Web_110_Inst_1-DR,Host address:192.168.13.112,KPI name: Time Wait,KPI attribute: 443,Value: 1.0,Unit: Count,Operation: greater than,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-84-46-C-443-26819148,Event Detected Time: 2020-12-28 17:46:00 (GMT +08:00),Service Name: LOS-Web-Service-DR,Instance name: OHS_LOS_Web_110_Inst_1-DR,Host address:192.168.13.112,KPI name: Time Wait,KPI attribute: 443,Value: 4.0,Unit: Count,Operation: greater than,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE2-84-46-C-443-26819137,Event Detected Time: 2020-12-28 17:35:00 (GMT +08:00),Service Name: LOS-Web-Service-DR,Instance name: OHS_LOS_Web_110_Inst_1-DR,Host address:192.168.13.112,KPI name: Time Wait,KPI attribute: 443,Value: 3.0,Unit: Count,Operation: greater than,Lower threshold: 0.0,Upper Threshold: 0.0
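Each line under "Latest Events Details" is a comma-separated run of "label: value" pairs, and some values (for example a KPI attribute such as 192.168.13.146,80) contain commas themselves. The following is a minimal Python sketch of one way to split such a line into fields. It is not part of HEAL: the function name `parse_event_line` and the label list are assumptions taken from the samples on this page.

```python
import re

# Labels copied from the sample event lines on this page (an assumption:
# extend the list if your notifications carry other fields, e.g. "File name").
FIELD_LABELS = [
    "Event Id", "Event Detected Time", "Service Name", "Instance name",
    "Host address", "KPI name", "KPI attribute", "Value", "Unit",
    "Operation", "Lower threshold", "Upper Threshold",
]

# A label only counts at the start of the line or directly after a comma,
# so commas inside a value do not start a new field.
_LABEL_RE = re.compile(
    r"(?:^|,)\s*(" + "|".join(re.escape(label) for label in FIELD_LABELS) + r")\s*:\s*"
)


def parse_event_line(line: str) -> dict:
    """Return a label -> value dict for one event line (hypothetical helper)."""
    line = line.strip().lstrip("•").strip()  # drop the leading bullet, if any
    matches = list(_LABEL_RE.finditer(line))
    fields = {}
    for index, match in enumerate(matches):
        # A value runs up to the comma that introduces the next known label.
        end = matches[index + 1].start() if index + 1 < len(matches) else len(line)
        fields[match.group(1)] = line[match.end():end].strip()
    return fields


sample = (
    "  • Event Id: AE2-84-46-C-443-26819164,"
    "Event Detected Time: 2020-12-28 18:02:00 (GMT +08:00),"
    "Service Name: LOS-Web-Service-DR,Instance name: OHS_LOS_Web_110_Inst_1-DR,"
    "Host address:192.168.13.112,KPI name: Time Wait,KPI attribute: 443,"
    "Value: 5.0,Unit: Count,Operation: greater than,"
    "Lower threshold: 0.0,Upper Threshold: 0.0"
)
event = parse_event_line(sample)
print(event["KPI name"], event["Value"], event["Unit"])
```

Matching on known labels rather than on every comma keeps values such as "192.168.13.146,80" in one piece; a label that is not in the list is simply absorbed into the preceding value.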

Info Signal Detected

Subject: Info [I-2-9-2-26872957: Events detected in category Uptime for service NB-Web-Service-DR]

Dear User,

Info signal is detected on metric category Uptime in service NB-Web-Service-DR on application(s) NetBanking-DR at 2021-02-04 00:08:00(GMT +05:30).

For the detailed overview, Please select here.

Total 1 event(s) detected on this Info signal

Signal Summary So Far:-

Service Name | Severe Events Count | Default Events Count | Affected Instance count | Affected Request count | Affected Kpi Categories | Latest Event Detected Time
NB-Web-Service-DR | 1 | 0 | 1 | 0 | Uptime | 2021-02-04 00:08:00 (GMT +05:30)

Latest Events Details:-

  • Event Id: AE2-49-48-C-ALL-26872957,Event Detected Time: 2021-02-04 00:08:00 (GMT +05:30),Service Name: NB-Web-Service-DR,Instance name: RHEL_NB_Web_Host_154_Inst_1-DR,Host address:192.168.14.234,KPI name: Uptime Days,KPI attribute: NA,Value: 15.0,Unit: Count,Operation: greater than,Lower threshold: 1.0,Upper Threshold: 0.0

Config Watch Info Signal for Version Change of Database Component

Subject: Info [I-9-56-99-27235265: Events detected in category Config for service Postgres-DB-Service]

Dear User,

Info signal is detected on metric category Config in service Postgres-DB-Service on application(s) Postgres at 2021-10-13 14:29:00(GMT +05:30).

Severity: Default

For the detailed overview, Please select here.

Total 1 event(s) detected on this Info signal

Signal Summary So Far:-

Service Name | Severe Events Count | Default Events Count | Affected Instance count | Affected Request count | Affected Kpi Categories | Latest Event Detected Time
Postgres-DB-Service | 0 | 1 | 1 | 0 | Config | 2021-10-13 14:29:00 (GMT +05:30)

Latest Events Details:-

  • Event Id: AE9-274-1012-T-DBVersion-27235265,Event Detected Time: 2021-10-13 14:29:00 (GMT +05:30),Service Name: Postgres-DB-Service,Instance name: Postgres-232-DB,KPI name: Component Properties,Unit: Text,Operation: Modified,Host address:192.168.14.232,File name:NA,Parameter Name:DBVersion,New Value: PostgreSQL 13.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit,Old Value: PostgreSQL 10.18 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit

Batch Problem Open

Subject: Batch Problem[B-2-test000410-1615808749:test00:test000410, Current Status: Running] OPEN

Dear User,

Batch Job Problem is OPEN on application NetBanking-DR.

For the detailed overview, Please select here.

Severity: Severe

Signal Started On: 2021-03-15 17:00:00(GMT +05:30)

Total 1 event(s) detected so far on this Batch Job

Latest Event detected on this update on 2021-03-15 17:00:00:

Batch Group name:test00,Batch job id: test000410,KPI name: Duration,Actual Duration:889717.28,Unit: Seconds, Expected Duration: 90.0

Latest Events Details:-

  • Event Id: AE-B-test000410-1615808748,Event Detected Time: 2021-03-15 17:00:00 (GMT +05:30),Application Name: netbanking_1_DR,Batch job id: test000410,KPI name: Duration,KPI attribute: NA,Value: 889717.28,Unit: Seconds,Operation: not equals,Batch Group name:test00,Actual Duration: 889717.28,Expected Duration: 90.0

Batch Problem Closed

Subject: Batch Problem[B-2-test000410-1615808749:test00:test000410, Current Status: Running] CLOSED

Dear User,

Batch Job Problem is CLOSED on application NetBanking-DR.

For the detailed overview, Please select here.

Severity: Severe

Signal Started On: 2021-03-15 17:00:00(GMT +05:30)

Signal Ended On: 2021-03-15 17:16:36(GMT +05:30)

Total 1 event(s) detected so far on this Batch Job

Latest Event detected on this update on 2021-03-15 17:00:00:

Batch Group name:test00,Batch job id: test000410,KPI name: Duration,Actual Duration:889717.28,Unit: Seconds, Expected Duration: 90.0

Latest Events Details:-

  • Event Id: AE-B-test000410-1615808748,Event Detected Time: 2021-03-15 17:00:00 (GMT +05:30),Application Name: netbanking_1_DR,Batch job id: test000410,KPI name: Duration,KPI attribute: NA,Value: 889717.28,Unit: Seconds,Operation: not equals,Batch Group name:test00,Actual Duration: 889717.28,Expected Duration: 90.0

Forensic Email Notification

Subject: AE2-70-16-T-vdc1-27197564 : fetch_disio_forensics: forensics detected in category Disk IO for service [ENET-DB-Service-DC, ENET-App-Service-DC, ENET-Web-Service-DC]

Dear User,

Forensic is captured on metric KPI Device Busy at 1631853780000, category Disk IO, instance SOLARIS_ENET_HOST_112_Inst_1-DC in service [ENET-DB-Service-DC, ENET-App-Service-DC, ENET-Web-Service-DC] on application(s) [enet_3_DC].

Threshold Details are below,

Severity: Default

KPI Value:0.0

Operation:not between

Threshold Value:Lower: 4.0 Upper: 6.0

For more details kindly look into attachment.
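The capture time in the sample above (1631853780000) is a raw number, not a formatted date. Assuming it is a Unix epoch timestamp in milliseconds (an assumption; the sample does not state the unit or timezone), it can be converted to a readable UTC time as sketched below. The helper name `epoch_millis_to_utc` is illustrative only.

```python
from datetime import datetime, timezone


def epoch_millis_to_utc(millis: int) -> datetime:
    """Interpret `millis` as milliseconds since 1970-01-01 UTC (assumed unit)."""
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)


captured_at = epoch_millis_to_utc(1631853780000)
print(captured_at.isoformat())  # 2021-09-17T04:43:00+00:00
```

Under that assumption the value falls a few minutes before the Start Time shown in the attachment, which is consistent with the forensic action running shortly after the event.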

Attachment Contents

Name: sar#@#@#APPSONE_NEWLINE#@#@#Description: Disk Utilization and performance#@#@#APPSONE_NEWLINE#@#@#Start Time: 09-17-2021, 04:46:10#@#@#APPSONE_NEWLINE#@#@#End Time: 09-17-2021, 04:46:11#@#@#APPSONE_NEWLINE#@#@#Execute Status: Failed#@#@#APPSONE_NEWLINE#@#@#ReturnCode: E5001#@#@#APPSONE_NEWLINE#@#@#KeyValueOutput: #@#@#APPSONE_NEWLINE#@#@#Output:Error:

Invalid number of arguments received
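In the attachment sample, the fields are joined by the token #@#@#APPSONE_NEWLINE#@#@# in place of line breaks. A minimal Python sketch of splitting such content back into fields follows; the function name `parse_forensic_attachment` is hypothetical, and the assumption is that every field has the form "key: value", as in the sample.

```python
# Separator token exactly as it appears in the sample attachment.
DELIMITER = "#@#@#APPSONE_NEWLINE#@#@#"


def parse_forensic_attachment(raw: str) -> dict:
    """Split attachment text on the separator token into a key -> value dict."""
    fields = {}
    for part in raw.split(DELIMITER):
        # Split on the first colon only, so times such as 04:46:10 stay whole.
        key, separator, value = part.partition(":")
        if separator:
            fields[key.strip()] = value.strip()
    return fields


raw_attachment = (
    "Name: sar" + DELIMITER
    + "Description: Disk Utilization and performance" + DELIMITER
    + "Start Time: 09-17-2021, 04:46:10" + DELIMITER
    + "End Time: 09-17-2021, 04:46:11" + DELIMITER
    + "Execute Status: Failed" + DELIMITER
    + "ReturnCode: E5001" + DELIMITER
    + "KeyValueOutput: " + DELIMITER
    + "Output:Error:\n\nInvalid number of arguments received"
)
attachment = parse_forensic_attachment(raw_attachment)
for key, value in attachment.items():
    print(f"{key}: {value}")
```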

Sample Email Notification With Recipient List

From: redwoodautomationresults@gmail.com <redwoodautomationresults@gmail.com>

Sent: Tuesday, November 2, 2021 7:10 PM

To: Anup Kumar B R

Cc: Raghavendra Mirajkar G; Chipurupalli Raviteja

Subject: Early Warning [E-11-974-103-27263811: Event(s) in Postgres-DB-Service root cause service(s) may impact transaction performance.] OPEN

Dear User,

Signal is open from past 8 Hours 59 Minutes

New Severe event(s) got added to the timeline

Early Warning is OPEN on application(s) Postgres.

Affected service(s): Postgres-DB-Service

Postgres-232-DB, Locks Awaited, 2021-11-02 19:06:00, NA

true.

For the detailed overview, Please select here.

Impacted Entry Point Service: NA

Suggested Root Cause at Postgres-DB-Service

Affected Application(s): Postgres

Severity: Severe

Started On: 2021-11-02 10:10:00(GMT +05:30)

Signal Summary So Far:-

Service Name | Severe Events Count | Default Events Count | Affected Instance count | Affected Request count | Affected Kpi Categories | Latest Event Detected Time
Postgres-DB-Service | 2012 | 1300 | 6 | 0 | DBAvailability,Host availability,Stats,CPU,Lock | 2021-11-02 19:06:00 (GMT +05:30)

Top 6 instance events details:-

Service Name | Affected Instance Names | Affected Kpis | Events Count | Latest Event Detected Time
Postgres-DB-Service | Postgres-232-DB | Locks Held (Severe),Locks Awaited (Severe) | 453 | 2021-11-02 19:06:00 (GMT +05:30)
Postgres-DB-Service | Postgres-232-Host | Heal Host Availability (Default),Established Status (Default),CPU Util (Severe),Ping Status (Severe) | 760 | 2021-11-02 19:05:00 (GMT +05:30)
Postgres-DB-Service | Postgres-63-DB | Locks Held (Severe),Locks Awaited (Default),Buffers Backend (Severe) | 583 | 2021-11-02 19:06:00 (GMT +05:30)
Postgres-DB-Service | Postgres-63-Host | Heal Host Availability (Default),Established Status (Default),CPU Util (Severe),Ping Status (Severe) | 758 | 2021-11-02 19:05:00 (GMT +05:30)
Postgres-DB-Service | Postgres-DB-Cluster | Locks Held (Severe),Locks Awaited (Default),Buffers Backend (Severe) | 583 | 2021-11-02 19:04:00 (GMT +05:30)
Postgres-DB-Service | Postgres-178-Host | Heal Host Availability (Default) | 175 | 2021-11-02 19:07:00 (GMT +05:30)

Total 3312 event(s) detected so far on this Early Warning

Latest Events Details:-

  • Event Id: AE11-241-747-T-ALL-27264336,Event Detected Time: 2021-11-02 19:07:00 (GMT +05:30),Service Name: Postgres-DB-Service,Instance name: Postgres-63-Host,KPI name: Heal Host Availability,Unit: NA,Operation: NA,Host address:192.168.14.63,KPI attribute: NA,Value: Not available,Lower threshold: NA,Upper Threshold: NA
  • Event Id: AE11-242-747-T-ALL-27264336,Event Detected Time: 2021-11-02 19:07:00 (GMT +05:30),Service Name: Postgres-DB-Service,Instance name: Postgres-178-Host,KPI name: Heal Host Availability,Unit: NA,Operation: NA,Host address:192.168.14.178,KPI attribute: NA,Value: Not available,Lower threshold: NA,Upper Threshold: NA
  • Event Id: AE11-243-747-T-ALL-27264336,Event Detected Time: 2021-11-02 19:07:00 (GMT +05:30),Service Name: Postgres-DB-Service,Instance name: Postgres-232-Host,KPI name: Heal Host Availability,Unit: NA,Operation: NA,Host address:192.168.14.232,KPI attribute: NA,Value: Not available,Lower threshold: NA,Upper Threshold: NA
  • Event Id: AE11-246-974-T-ALL-27264338,Event Detected Time: 2021-11-02 19:06:00 (GMT +05:30),Service Name: Postgres-DB-Service,Instance name: Postgres-232-DB,KPI name: Locks Awaited,Unit: Count,Operation: lesser than,Host address:192.168.14.232,KPI attribute: NA,Value: 0.0,Lower threshold: 1.0,Upper Threshold: 0.0
  • Event Id: AE11-241-747-T-ALL-27264334,Event Detected Time: 2021-11-02 19:05:00 (GMT +05:30),Service Name: Postgres-DB-Service,Instance name: Postgres-63-Host,KPI name: Heal Host Availability,Unit: NA,Operation: NA,Host address:192.168.14.63,KPI attribute: NA,Value: Not available,Lower threshold: NA,Upper Threshold: NA
  • Event Id: AE11-242-747-T-ALL-27264334,Event Detected Time: 2021-11-02 19:05:00 (GMT +05:30),Service Name: Postgres-DB-Service,Instance name: Postgres-178-Host,KPI name: Heal Host Availability,Unit: NA,Operation: NA,Host address:192.168.14.178,KPI attribute: NA,Value: Not available,Lower threshold: NA,Upper Threshold: NA
  • Event Id: AE11-243-747-T-ALL-27264334,Event Detected Time: 2021-11-02 19:05:00 (GMT +05:30),Service Name: Postgres-DB-Service,Instance name: Postgres-232-Host,KPI name: Heal Host Availability,Unit: NA,Operation: NA,Host address:192.168.14.232,KPI attribute: NA,Value: Not available,Lower threshold: NA,Upper Threshold: NA
  • Event Id: AE11-243-1-T-ALL-27264337,Event Detected Time: 2021-11-02 19:05:00 (GMT +05:30),Service Name: Postgres-DB-Service,Instance name: Postgres-232-Host,KPI name: CPU Util,Unit: Percentage,Operation: greater than,Host address:192.168.14.232,KPI attribute: NA,Value: 0.26,Lower threshold: 0.1,Upper Threshold: 0.0
  • Event Id: AE11-246-973-T-ALL-27264337,Event Detected Time: 2021-11-02 19:05:00 (GMT +05:30),Service Name: Postgres-DB-Service,Instance name: Postgres-232-DB,KPI name: Locks Held,Unit: Count,Operation: greater than,Host address:192.168.14.232,KPI attribute: NA,Value: 1.0,Lower threshold: 0.0,Upper Threshold: 0.0
  • Event Id: AE11-243-251-T-192.168.13.10-27264337,Event Detected Time: 2021-11-02 19:05:00 (GMT +05:30),Service Name: Postgres-DB-Service,Instance name: Postgres-232-Host,KPI name: Ping Status,Unit: NA,Operation: NA,Host address:192.168.14.232,KPI attribute: 192.168.13.10,Value: Not available,Lower threshold: NA,Upper Threshold: NA
  • Event Id: AE11-246-974-T-ALL-27264338,Event Detected Time: 2021-11-02 19:06:00 (GMT +05:30),Service Name: Postgres-DB-Service,Instance name: Postgres-232-DB,KPI name: Locks Awaited,Unit: Count,Operation: lesser than,Host address:192.168.14.232,KPI attribute: NA,Value: 0.0,Lower threshold: 1.0,Upper Threshold: 0.0

SMS Templates

Note:  All the variables defined for Email Template are applicable for SMS as well.

Template for Lead Signal Open

<Signal Type> [<Signal ID>: <Description>] <Status> on application(s) <Application Names>. Affected Service(s): <Affected Service>, Affected Application(s): <Affected Applications>. For the detailed overview, please login to HEAL and navigate to Signal list to open the report.
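To illustrate how the angle-bracket variables in this template are filled in, here is a minimal Python sketch. It is not HEAL's implementation: the `render` helper is hypothetical, the values are taken from the SMS samples later on this page, and the affected-applications slot uses the name `<Affected Applications>` from the variable list.

```python
import re

# First part of the Lead Signal Open template (trailing sentence omitted).
TEMPLATE = (
    "<Signal Type> [<Signal ID>: <Description>] <Status> on application(s) "
    "<Application Names>. Affected Service(s): <Affected Service>, "
    "Affected Application(s): <Affected Applications>."
)


def render(template: str, values: dict) -> str:
    """Replace each <Name> with values['Name']; unknown names are left as-is."""
    def substitute(match: re.Match) -> str:
        return str(values.get(match.group(1), match.group(0)))

    return re.sub(r"<([^<>]+)>", substitute, template)


message = render(TEMPLATE, {
    "Signal Type": "Problem",
    "Signal ID": "112558",
    "Description": "Travel Web transactions failing",
    "Status": "update",
    "Application Names": "Travels",
    "Affected Service": "Bookings",
    "Affected Applications": "Flights",
})
print(message)
```

Leaving unknown placeholders untouched makes a misspelled variable name visible in the rendered message instead of silently producing an empty field.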

Template for Lead Signal Upgraded or Closed

<Signal Type> [<Signal ID>: <Description>] <Status> on application(s) <Application Names>. For the detailed overview, please login to Heal and navigate to Signal list to open report. Appreciate your efforts !

Info Signal SMS Template

<Signal Type> [<Signal ID>: <Description>] signal detected on metric category <Kpi category> in service <Affected Service> on Application(s): <Affected application>. For the detailed overview, please login to Heal and navigate to info Signal list to view details.

Info Signal SMS Template for Config Watch KPIs

{Signal_Type} [{Signal_ID}: Events detected in category {KPI_Category} for service {Affected_ServiceNames}] on Application(s): {Affected_ApplicationNames}

Status:{Signal_Status}

{Latest_Events_SMS}. For the detailed overview, please login to Heal and navigate to info Signal list to view details.
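The Config Watch template uses curly-brace variable names, which happen to match Python's format-string syntax. The sketch below shows how such a template could be filled in; it is an illustration only, not how HEAL renders messages. The `render_sms` helper and the `KeepMissing` class are hypothetical, and joining the three template lines with "\r\n" is an assumption based on the Config Watch samples on this page.

```python
TEMPLATE = (
    "{Signal_Type} [{Signal_ID}: Events detected in category {KPI_Category} "
    "for service {Affected_ServiceNames}] on Application(s): "
    "{Affected_ApplicationNames}\r\n"
    "Status:{Signal_Status}\r\n"
    "{Latest_Events_SMS}. For the detailed overview, please login to Heal "
    "and navigate to info Signal list to view details."
)


class KeepMissing(dict):
    """dict that leaves an unknown {Variable} in place instead of raising."""

    def __missing__(self, key: str) -> str:
        return "{" + key + "}"


def render_sms(template: str, values: dict) -> str:
    return template.format_map(KeepMissing(values))


sms = render_sms(TEMPLATE, {
    "Signal_Type": "Info",
    "Signal_ID": "I-2-56-4-26990982",
    "KPI_Category": "Config",
    "Affected_ServiceNames": "NB-DB-Service-DR",
    "Affected_ApplicationNames": "NetBanking-DR",
    "Signal_Status": "OPEN",
    "Latest_Events_SMS": "Kpi name:File Watch\r\nOperation:Modified",
})
print(sms)
```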

Template for Batch Problem Open

Batch Problem [{Batch_Job_Group}: Batch Job {Batch_Job}, Event detected: {batch_job_status}, Current Status: {batch_job_status}] open on application {App_Names}. Actual Duration: {Actual_Duration}, unit: {KPI_Unit}, Expected Duration: {Expected_Duration}.For the detailed overview, please login to Heal and navigate to Signal list to view details.

Template for Batch Problem Closed

Batch Problem [{Batch_Job_Group}: Batch Job {Batch_Job}, Event detected: {batch_job_status}, Current Status: {batch_job_status}] closed on application {App_Names}. Actual Duration: {Actual_Duration}, unit: {KPI_Unit}, Expected Duration: {Expected_Duration}.For the detailed overview, please login to Heal and navigate to Signal list to view details.

SMS Template Samples

Lead Problem Open, Update 1 and Update 2

Problem [112558: Travel Web transactions failing] open on application(s) Travels. Affected Service(s): Travel Web, Affected Applications: —. For the detailed overview, please login to HEAL and navigate to Signal list to open report.

Problem [112558: Travel Web transactions failing] update on application(s) Travels. Affected Service(s): Hotels, Hotel Inventory, Affected Applications: —. For the detailed overview, please login to HEAL and navigate to Signal list to open report.

Problem [112558: Travel Web transactions failing] update on application(s) Travels. Affected Service(s): Bookings, Affected Applications: Flights. For the detailed overview, please login to HEAL and navigate to Signal list to open report.

Early Warning Open, Update 1 and Update 2

  • Early Warning [112556: Metric breaches in services can potentially impact Travels application(s)] open on application(s) Travels. Affected Service(s): Travel Web, Affected Applications: —. For the detailed overview, please login to HEAL and navigate to Signal list to open report.
  • Early Warning [112556: Metric breaches in services can potentially impact Travels application(s)] update on application(s) Travels. Affected Service(s): Hotels, Hotel Inventory , Affected Applications: —. For the detailed overview, please login to HEAL and navigate to Signal list to open report.
  • Early Warning [112556: Metric breaches in services can potentially impact Travels application(s)] update on application(s) Travels. Affected Service(s): Bookings , Affected Applications: Flights. For the detailed overview, please login to HEAL and navigate to Signal list to open report.

Lead Problem Closed

Problem [112558: Travel Web transactions failing] closed on application(s) Travels. For the detailed overview, please login to HEAL and navigate to Signal list to open report. Appreciate your efforts !

Early Warning Closed

Early Warning [112556: Metric breaches in services can potentially impact Travels application(s)] closed on application(s) Travels. For the detailed overview, please login to HEAL and navigate to Signal list to open report. Appreciate your efforts !

Info Signal Detected

Info [I-2-94-2-26874081: Info: Transaction performance may get affected due to issues in services] signal detected on metric category Uptime_info in service NB-Web-Service-DR on Application(s): NetBanking-DR. For the detailed overview, please login to Heal and navigate to info Signal list to view details.

Info Signal for Config Watch KPIs

Property Watch Sample

“Info [I-2-56-4-26990982: Events detected in category Config for service NB-DB-Service-DR] on Application(s): NetBanking-DR\r\nStatus:OPEN\r\nInstance Name:RHEL_NB_DB_Host_176_Inst_1-DR\r\nKpi name:File Watch\r\nFile name:/home/raghav/ConfigDataDR/agent_config.properties\r\nOperation:Modified\r\nNew Value:db2dc77382d03161175918c9b771d8b4\r\nOld Value:4e90699ed82c62fddb5c507d4941c306\r\nTime:2021-04-26 23:14:00. For the detailed overview, please login to Heal and navigate to info Signal list to view details.”

File Watch Sample

“Info [I-8-56-99-26995358: Events detected in category Config for service NB-Finacle-Service] on Application(s): NetBanking\r\nStatus:OPEN\r\nInstance Name:RHEL_NB_Finacle_Host_204_Inst_1\r\nKpi name:File Watch\r\nFile name:/opt/appnomic/ConfigData/alert.properties\r\nOperation:Added\r\nNew Value:d41d8cd98f00b204e9800998ecf8427e\r\nOld Value:NA\r\nTime:2021-04-30 03:36:00. For the detailed overview, please login to Heal and navigate to info Signal list to view details.”
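The Config Watch samples above separate their fields with \r\n. A minimal Python sketch for turning such a message into a dictionary follows. The function name `parse_config_watch_sms` is hypothetical, and the code accepts both the literal characters \r\n (as printed in the samples) and a real carriage return plus line feed, since this page does not say which form is delivered.

```python
import re

# Text that introduces the closing sentence of the message.
FOOTER_MARK = ". For the detailed"


def parse_config_watch_sms(body: str) -> dict:
    """Split a Config Watch SMS into a header plus key -> value fields."""
    body = body.strip().strip("“”\"")
    main, _, _footer = body.partition(FOOTER_MARK)
    # Accept either the two-character escape sequence or an actual CRLF.
    parts = re.split(r"\\r\\n|\r\n", main)
    fields = {"Header": parts[0]}
    for part in parts[1:]:
        # Split on the first colon only, so the time value stays intact.
        key, separator, value = part.partition(":")
        if separator:
            fields[key.strip()] = value.strip()
    return fields


sample_sms = (
    r"Info [I-2-56-4-26990982: Events detected in category Config for service "
    r"NB-DB-Service-DR] on Application(s): NetBanking-DR\r\nStatus:OPEN"
    r"\r\nInstance Name:RHEL_NB_DB_Host_176_Inst_1-DR\r\nKpi name:File Watch"
    r"\r\nFile name:/home/raghav/ConfigDataDR/agent_config.properties"
    r"\r\nOperation:Modified\r\nNew Value:db2dc77382d03161175918c9b771d8b4"
    r"\r\nOld Value:4e90699ed82c62fddb5c507d4941c306\r\nTime:2021-04-26 23:14:00"
    r". For the detailed overview, please login to Heal and navigate to info "
    r"Signal list to view details."
)
parsed = parse_config_watch_sms(sample_sms)
print(parsed["Operation"], parsed["File name"], parsed["Time"])
```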

Batch Problem Open

Batch Problem [test00: Batch Job test000410, Event detected: OPEN, Current Status: OPEN] open on application NetBanking-DR. Actual Duration: , unit : Milliseconds, Expected Duration: .For the detailed overview, please login to Heal and navigate to Signal list to view details.

Batch Problem Closed

Batch Problem [test00: Batch Job test000410, Event detected: CLOSED, Current Status: CLOSED] closed on application NetBanking-DR. Actual Duration: , unit: Milliseconds, Expected Duration: .For the detailed overview, please login to Heal and navigate to Signal list to view details.
