Dream Factory by TruBridge Ideas Portal
Categories TruBridge Analytics
Created by Guest
Created on Feb 7, 2025

System and Performance Monitoring Dashboard (update the current System Management Dashboard)

  1. Uptime Monitoring Module – Tracks the operational status and uptime of the system. I want to know when any part of the system goes down or slows to the point of being unusable. This should cover individual applications such as Notes, Patient Connect, My Care Corner, IMS, and any other system tied to TruBridge. I would also want to see an overarching score for the EHR in its entirety.

    1. I want to be able to set my tolerance level.

      1. 99.9% uptime: 8.76 hours/year of downtime.

      2. 99.99% uptime: 52.56 minutes/year of downtime.

      3. 99.999% uptime: 5.26 minutes/year of downtime.

    2. I want a score based on my tolerance level (e.g., if we have been down 1 hour in January and my tolerance is 8.76 hours, then 11.4% of the year's tolerance has been used).
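
As a rough sketch of the scoring arithmetic described above: the downtime budget is the share of the year not covered by the uptime target, and the score is the downtime so far divided by that budget. The function names are hypothetical and not part of any TruBridge product, and a 365-day year is assumed, which is what the figures above imply.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years ignored (assumption)


def allowed_downtime_hours(uptime_target_pct: float) -> float:
    """Yearly downtime budget, in hours, implied by an uptime target."""
    return HOURS_PER_YEAR * (1 - uptime_target_pct / 100)


def budget_used_pct(downtime_hours: float, uptime_target_pct: float) -> float:
    """Share of the yearly downtime budget already consumed, as a percentage."""
    return 100 * downtime_hours / allowed_downtime_hours(uptime_target_pct)


if __name__ == "__main__":
    # Print the budget for each tolerance level listed above.
    for target in (99.9, 99.99, 99.999):
        hours = allowed_downtime_hours(target)
        print(f"{target}% uptime allows {hours:.2f} hours ({hours * 60:.2f} minutes) per year")
    # The January example: 1 hour of downtime against a 99.9% target.
    print(f"{budget_used_pct(1.0, 99.9):.1f}% of the yearly tolerance used")
```

The same calculation could be run per application and for the EHR as a whole, which would give both the individual and the overarching scores.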

  2. Performance Analytics Module – Monitors and analyzes the system’s performance, such as response times, load times, and user activity metrics. This should be reported per application and for the EHR in its entirety.

  3. Health Monitoring – An umbrella term that would include both uptime and performance monitoring, potentially with additional focus on system health checks and alerting.
    I want the following metrics and their trends, broken down by date/time and by the number of users in the system.

    1. CPU Usage: High CPU usage may signal inefficient processes or high demand on the server.

    2. Memory Usage: Too much memory usage can degrade system performance or lead to crashes.

    3. Disk I/O: Slow or high disk input/output can affect the system’s ability to retrieve and save data, affecting performance.

    4. Response Time: Measure how long it takes for the system to respond to requests (e.g., fetching a patient’s record).

    5. Database Performance: Ensure that queries to the database are fast and efficient, as slow database performance can lead to delays in accessing patient data.

  4. Real-Time Alerting

    1. Threshold-Based Alerts

      1. CPU Usage: If CPU usage exceeds 90% for more than a defined period (e.g., 5 minutes), an alert is triggered to notify the IT team.

      2. Memory Usage: When memory usage exceeds a certain threshold (e.g., 85% for more than 10 minutes), an alert will be raised.

      3. Disk Space: If disk space usage reaches 90% or more, an alert is triggered.

      4. Response Time: If average response time exceeds a set limit (e.g., 3 seconds), an alert is triggered.
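
One possible way to express "above a threshold for more than a defined period" is to remember when the current breach started and reset that clock whenever the metric drops back to or below the limit. The class and field names below are hypothetical, and timestamps are assumed to be seconds that never go backwards.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SustainedThreshold:
    """Signals once a metric has stayed above `limit` for more than `hold_seconds`."""

    limit: float          # e.g. 90.0 for 90% CPU
    hold_seconds: float   # e.g. 300 for 5 minutes
    breach_started: Optional[float] = None  # when the current breach began

    def observe(self, timestamp: float, value: float) -> bool:
        """Record one sample; return True if the alert condition is met."""
        if value <= self.limit:
            # Any sample at or below the limit ends the breach and resets the clock.
            self.breach_started = None
            return False
        if self.breach_started is None:
            self.breach_started = timestamp
        # "More than" the period, so the comparison is strict.
        return timestamp - self.breach_started > self.hold_seconds
```

For example, `SustainedThreshold(limit=90.0, hold_seconds=300)` would model the CPU rule above, and `SustainedThreshold(limit=85.0, hold_seconds=600)` the memory rule.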

    2. Availability and Downtime Alerts

      1. EHR Application Availability: If the EHR application is unreachable for longer than a set time (e.g., 1 minute), an alert is triggered.

      2. Database Service Failure: If the database service is not responding, an alert is triggered.

      3. Network Connectivity Issues: If network connectivity fails or degrades, alerts are triggered for the IT team.

    3. Error Alerts

      1. API Failures: If APIs used by the EHR system fail (e.g., to retrieve data or integrate with external services), an alert is triggered.

      2. Authentication Errors: Too many failed login attempts in a short time (e.g., 5 failed attempts in 1 minute) should trigger an alert, since they could indicate a brute-force attack or a misconfigured user account.

    4. Security Alerts

      1. Unauthorized Access: Alerts when an unauthorized login attempt or suspicious login location is detected.

      2. Data Integrity Issues: If a discrepancy in data integrity is detected (e.g., a mismatch between backup data and live data), an alert is triggered.

      3. Encryption Failures: If encryption fails or is disabled, an immediate alert is sent so that sensitive data can be protected.

    5. Backup and Recovery Alerts

      1. Backup Failures: If a scheduled backup fails or completes with errors, an alert should be triggered.

      2. Recovery Time Violation: If recovery from a backup exceeds the expected time, it triggers an alert.

    6. User Experience Monitoring Alerts

      1. Slow Page Load: If the system's front-end pages are taking too long to load, an alert could notify administrators.

      2. High Error Rates for Users: If users encounter frequent application errors (e.g., 50+ errors in 1 minute), it will trigger an alert.
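
The rate-based alerts above (e.g., 5 failed logins in 1 minute, or 50+ user errors in 1 minute) could be sketched as a sliding-window counter that keeps only the events from the trailing window and signals once the count reaches the limit. This is a minimal illustration, not a description of how TruBridge implements alerting; the class name is hypothetical, and timestamps are assumed to be seconds arriving in non-decreasing order.

```python
from collections import deque


class SlidingWindowCounter:
    """Counts events in the trailing window and flags when the limit is reached."""

    def __init__(self, max_events: int, window_seconds: float) -> None:
        self.max_events = max_events
        self.window_seconds = window_seconds
        self._events: deque = deque()  # timestamps of events still inside the window

    def record(self, timestamp: float) -> bool:
        """Register one event; return True if the alert condition is met."""
        self._events.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Drop events that have aged out of the window.
        while self._events and self._events[0] <= cutoff:
            self._events.popleft()
        return len(self._events) >= self.max_events
```

One counter per user (for failed logins) or per application (for error rates) would keep the alerts specific, e.g. `SlidingWindowCounter(max_events=5, window_seconds=60)` for the authentication rule.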

  • Linda Pfeifle
    Feb 11, 2025

    That is an impressive write-up, and all of that is greatly needed! Thank you for submitting this!

  • Janna Sartin
    Feb 7, 2025

    And do not charge extra for this!