Modern workload monitoring

Observe all your workloads, including containerized and generative AI applications

Benefits

Optimize AWS resource performance and availability through proactive monitoring, issue resolution, and data-driven insights, enabling smooth, efficient, and secure cloud operations.

Reduce mean time to resolution (MTTR) by surfacing data to quickly diagnose the root cause of issues.

Unify end-to-end observability and analytics across containers and serverless services, eliminating tedious tagging and event correlation across services.

Monitor and troubleshoot containers and serverless workloads for enhanced resilience and efficiency. For example, you can use AI- and ML-powered capabilities in CloudWatch to query logs and metrics using natural language, analyze patterns and detect anomalies, and automatically mask sensitive data in your CloudWatch logs.

Use cases

Monitor and optimize the performance of your generative AI resources with Amazon Bedrock, Amazon SageMaker, and Amazon CloudWatch. You can use CloudWatch Container Insights to automatically discover and monitor key health metrics for NVIDIA GPUs, Trainium and Inferentia accelerators, EFA network adapters, and SageMaker HyperPods running in your Amazon EKS clusters, providing visibility into resource utilization, availability, and latency.

You can gain deep insights into the performance of your serverless applications by monitoring key operational metrics such as execution duration, errors, and throttles using CloudWatch Application Signals. Using CloudWatch Lambda Insights, you can monitor key health metrics such as CPU, memory, and network usage in curated, out-of-the-box dashboards, and use CloudWatch Logs Insights and distributed tracing to analyze log data and identify potential bottlenecks. Together, these CloudWatch features allow you to optimize your serverless architectures for cost and efficiency.
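
As a rough illustration of pulling the operational metrics named above (duration, errors, and throttles) programmatically, the following Python sketch builds a CloudWatch GetMetricData request against the standard AWS/Lambda namespace. It is not Application Signals or Lambda Insights itself; the helper names, the function name passed in, the five-minute period, and the choice of statistics are all assumptions made for this example.

```python
from datetime import datetime, timedelta, timezone

# Standard AWS/Lambda metric names; the statistic chosen for each is an
# assumption of this sketch, not a recommendation from the page.
LAMBDA_METRICS = {"Duration": "Average", "Errors": "Sum", "Throttles": "Sum"}


def build_lambda_queries(function_name, period_seconds=300):
    """Build the MetricDataQueries list for one Lambda function (hypothetical helper)."""
    queries = []
    for metric_name, stat in LAMBDA_METRICS.items():
        queries.append({
            "Id": metric_name.lower(),  # query IDs must start with a lowercase letter
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/Lambda",
                    "MetricName": metric_name,
                    "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
                },
                "Period": period_seconds,
                "Stat": stat,
            },
            "ReturnData": True,
        })
    return queries


def fetch_lambda_metrics(function_name, hours=1):
    """Call GetMetricData; needs boto3 and AWS credentials (hypothetical helper)."""
    import boto3  # imported lazily so the query builder works without it

    end = datetime.now(timezone.utc)
    start = end - timedelta(hours=hours)
    client = boto3.client("cloudwatch")
    response = client.get_metric_data(
        MetricDataQueries=build_lambda_queries(function_name),
        StartTime=start,
        EndTime=end,
    )
    # Map each query ID to its list of datapoint values.
    return {r["Id"]: r["Values"] for r in response["MetricDataResults"]}
```

Calling `fetch_lambda_metrics("my-function")`, where `my-function` is a placeholder name, would return one list of values each for `duration`, `errors`, and `throttles` over the last hour. The query builder is kept separate from the network call so it can be inspected or tested without AWS access.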

With CloudWatch Application Signals, you can monitor key application metrics and gain insight into the performance of your applications running on containers. You can translate your business goals into service level objectives (SLOs) to track performance against key performance indicators (KPIs). CloudWatch Application Signals works with CloudWatch Container Insights to deliver health and performance metrics for Amazon EKS and Amazon ECS resources, enabling end-to-end observability for your applications.
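
To make the idea of translating a business goal into an SLO more concrete, here is a minimal Python sketch of the arithmetic behind a request-based objective: attainment is the share of good requests, and the error budget is the number of failures the target permits. The function name, its inputs, and the return shape are hypothetical and only illustrate the general calculation; they do not describe how Application Signals computes SLOs internally.

```python
def slo_status(good_requests, total_requests, target):
    """Return attainment and remaining error budget for a request-based SLO.

    target is a fraction such as 0.999 for "99.9% of requests succeed".
    """
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    if not 0 < target < 1:
        raise ValueError("target must be strictly between 0 and 1")

    attainment = good_requests / total_requests
    allowed_bad = (1 - target) * total_requests   # failures the SLO tolerates
    actual_bad = total_requests - good_requests   # failures observed
    budget_remaining = 1 - actual_bad / allowed_bad  # negative once the budget is overspent

    return {
        "attainment": attainment,
        "budget_remaining": budget_remaining,
        "met": attainment >= target,
    }
```

For instance, with a 99.9% target over 10,000 requests, roughly 10 failures are allowed, so 5 observed failures leave about half of the error budget, while 20 failures overspend it and the objective is missed.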

Application developers and database administrators (DBAs) can access a comprehensive database telemetry dashboard in CloudWatch Database Insights to correlate a slowdown in their database cluster, such as Aurora MySQL or Aurora PostgreSQL, with issues impacting their application performance. This helps expedite database troubleshooting and ultimately deliver a better end-user experience.