Containers

Amazon EKS now supports Amazon Application Recovery Controller

Amazon Elastic Kubernetes Service (Amazon EKS) now supports Amazon Application Recovery Controller (ARC). ARC is an AWS service that allows you to prepare for and recover from AWS Region or Availability Zone (AZ) impairments. ARC provides two sets of capabilities: multi-AZ recovery, which includes zonal shift and zonal autoshift, and multi-Region recovery, which includes routing […]

Improving deployment visibility for Amazon ECS services

When deploying software, it’s critical to have visibility into all stages of the deployment process. Knowing the status of ongoing deployments, troubleshooting issues when things go wrong, and having an audit trail of past deployments are essential for ensuring a safe and reliable release process. Amazon Elastic Container Service (Amazon ECS) now provides enhanced observability […]

How Infinitium reduced fraud detection time by 95% with Amazon ECS and AWS Fargate on AWS Graviton

This post was created in collaboration with the Infinitium engineering team. Infinitium (a Euronet Company) is a leading digital payments company in Southeast Asia, specializing in secure online payment solutions and risk management services. With a strong presence across the Asia Pacific region, Infinitium offers cutting-edge technologies such as 3D Secure (3DS) authentication, fraud detection […]

Serverless containers at AWS re:Invent 2024

AWS re:Invent is the premier learning conference hosted by AWS for the global cloud computing community. This year the Amazon Elastic Container Service (Amazon ECS) and AWS Fargate teams will share the latest trends, innovations, best practices, and tips to help you increase productivity, optimize costs, and enhance business agility. Join us in Las Vegas […]

Amazon EKS optimized Amazon Linux 2023 accelerated AMIs now available

Earlier this year we announced support for Amazon EKS optimized AL2023 AMIs, which provide many security and performance enhancements. Amazon Linux 2023 (AL2023) is the next generation of Amazon Linux from Amazon Web Services (AWS) and is designed to provide a secure, stable, and high-performance environment to develop and run your […]

Scaling a Large Language Model with NVIDIA NIM on Amazon EKS with Karpenter

Many organizations are building artificial intelligence (AI) applications using Large Language Models (LLMs) to deliver new experiences to their customers, from content creation to customer service and data analysis. However, the substantial size and intensive computational requirements of these models can make it challenging to configure, deploy, and scale them effectively on graphics processing units (GPUs). […]

Inside Pinterest’s Custom Spark Job logging and monitoring on Amazon EKS: Using AWS for Fluent Bit, Amazon S3, and ADOT

In Part 1, we explored Moka’s high-level design and logging infrastructure, showing how AWS for Fluent Bit, Amazon S3, and a robust logging framework ensure operational visibility and facilitate issue resolution. For more details, read Part 1 here. As we transition to the second part of our series, our focus shifts to […]

Inside Pinterest’s Custom Spark Job logging and monitoring on Amazon EKS: Using AWS for Fluent Bit, Amazon S3, and ADOT

This is Part 1 of the blog post. Pinterest is a visual search and curation platform focused on inspiring users to create a life they love. Critical to the service are data insights, recommendations and machine learning (ML) models that are produced by synthesizing insights provided by the over 500 million monthly active users […]

Automating custom Amazon EKS worker node builds using EC2 Image Builder

Customers who are building their “Golden Image” Amazon Machine Images (AMIs) using EC2 Image Builder may wish to extend their Image Builder pipelines to build out their Amazon Elastic Kubernetes Service (Amazon EKS) worker nodes as well. In this blog, we will show you how to do this and provide you with AWS CloudFormation templates […]

Powering the Next Generation of AI Workloads on Amazon EKS with Anyscale

Ray is an open-source framework that manages, executes, and optimizes compute needs for AI workloads. It is designed to make it easy to write parallel and distributed Python applications by providing a simple and intuitive API for distributed computing. Ray unifies infrastructure by leveraging any compute instance and accelerator on AWS via a single, flexible […]