Secure Bottlerocket deployments on Amazon EKS with KubeArmor

Introduction

Bottlerocket is a security-focused operating system (OS) image that provides out-of-the-box security options to protect host or worker nodes. Although Bottlerocket hardens the host, the security of the pods and containers is still the responsibility of the application developer or provider. KubeArmor, a Cloud Native Computing Foundation (CNCF) sandbox project, is a runtime security engine that leverages extended Berkeley Packet Filter (eBPF) and Berkeley Packet Filter-Linux Security Module (BPF-LSM) to protect pods and containers.

KubeArmor uses Linux security modules (LSMs) for policy enforcement and can work on top of any Linux platform (e.g., Alpine, Ubuntu, and container-optimized OSs like Bottlerocket). LSMs are a decades-old kernel technology traditionally used for host hardening. They are extremely difficult to configure, and security administrators often disable them because of that complexity. One key aim of KubeArmor is to simplify the use of LSMs so that users can enforce the required policies without having to work with the LSMs directly (i.e., KubeArmor abstracts away the complexity of LSMs while keeping their advantages).

Solution overview

KubeArmor uses Linux kernel primitives to enforce user-specified policies and for workload observability.

Figure 1. KubeArmor architecture

Kernel support for BPF-LSM for policy enforcement

With version 0.5, KubeArmor integrates with BPF-LSM for pod and container-based policy enforcement. BPF-LSM is a new Linux security module (LSM) introduced in newer kernels (version 5.7 and later). BPF-LSM allows KubeArmor to attach BPF bytecode that contains user-specified policy controls at LSM hooks. This is a significant change because the BPF bytecode has access to much richer information and kernel context, and it doesn’t have to work within the constraints of the SELinux and AppArmor policy languages. For example, SELinux uses a Common Intermediate Language (CIL) that defines the policy definition language, and the security rules have to strictly follow the semantics of this policy language. Similarly, AppArmor specifies explicit policy language constructs, and user-specified rules need to follow these constructs.

Figure 2. BPF-LSM and KubeArmor

What platforms support BPF-LSM?

Bottlerocket
Latest images of Amazon Linux 2. Note: The default Amazon Linux 2 kernel is still at version 5.4, so BPF-LSM cannot be used with it. Amazon Linux 2 can be used with BPF-LSM if upgraded to kernel version 5.10. Please follow this guide on upgrading the default kernel to 5.10.
For the detailed list, check this.
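To confirm that a particular node qualifies, you can compare its kernel release against the BPF-LSM minimum and check whether bpf appears in the kernel's list of active LSMs. The following is a minimal sketch, not an official KubeArmor check: the function names are hypothetical, it assumes 5.7 as the first kernel release with BPF-LSM, and it assumes you have a shell on the node where /sys/kernel/security/lsm is readable.

```shell
#!/usr/bin/env bash
# Sketch: decide whether a node can use BPF-LSM from its kernel release
# and its list of active LSMs. Function names are illustrative only.

# Returns 0 if the kernel release (e.g. "5.10.130-118.517.amzn2.x86_64")
# is 5.7 or newer (assumed minimum for BPF-LSM).
kernel_supports_bpf_lsm() {
  local release="$1" major minor
  major="${release%%.*}"        # text before the first dot
  minor="${release#*.}"         # text after the first dot
  minor="${minor%%[!0-9]*}"     # keep only the leading digits
  [ "$major" -gt 5 ] || { [ "$major" -eq 5 ] && [ "$minor" -ge 7 ]; }
}

# Returns 0 if "bpf" appears in a comma-separated LSM list such as
# "capability,selinux,bpf".
lsm_list_has_bpf() {
  case ",$1," in
    *,bpf,*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage on a node (reads live values):
#   kernel_supports_bpf_lsm "$(uname -r)" &&
#     lsm_list_has_bpf "$(cat /sys/kernel/security/lsm)" &&
#     echo "BPF-LSM is active"
```

A recent kernel alone is not sufficient, because BPF-LSM also has to be enabled in the kernel build and boot configuration. That is why the sketch checks the active LSM list as well as the version.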

How KubeArmor improves on Bottlerocket security

Bottlerocket uses SELinux to lock down the host and provides some limited inter-container isolation.

KubeArmor provides enhanced security by using BPF-LSM to protect k8s pods hosted on Bottlerocket, limiting system behavior with respect to processes, files, use of network primitives, etc. For example, a k8s service account token that’s mounted within the pod is accessible by default across all the containers within that pod. KubeArmor can restrict access to such tokens to only certain processes. Similarly, KubeArmor can be used to protect other sensitive information (e.g., k8s secrets, x509 certificates) within the container. You can specify policy rules in KubeArmor such that any attempt to update the root certificates in any of the certificate folders (i.e., /etc/ssl/, /etc/pki/, or /usr/local/share/ca-certificates/) is blocked. Moreover, KubeArmor can restrict the execution of certain binaries within the containers.
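As an illustration of the certificate-folder rule described above, the following sketch writes a policy file that marks the three directories as read-only. The policy name, output file name, and the app: alpine selector are assumptions made for this example. The readOnly field is used here on the understanding that, combined with the Block action, it permits reads while other operations such as writes are denied. Check this against the KubeArmor policy specification for your version before relying on it.

```shell
#!/usr/bin/env bash
# Sketch: generate a KubeArmor policy that keeps certificate folders read-only.
# Policy name, file name, and selector label are illustrative assumptions.
policy_file="${TMPDIR:-/tmp}/protect-trusted-certs.yaml"

cat > "$policy_file" <<'EOF'
apiVersion: security.kubearmor.com/v1
kind: KubeArmorPolicy
metadata:
  name: protect-trusted-certs
spec:
  selector:
    matchLabels:
      app: alpine
  file:
    matchDirectories:
    - dir: /etc/ssl/
      recursive: true
      readOnly: true
    - dir: /etc/pki/
      recursive: true
      readOnly: true
    - dir: /usr/local/share/ca-certificates/
      recursive: true
      readOnly: true
  action:
    Block
EOF

echo "Wrote $policy_file (apply with: kubectl apply -f $policy_file)"
```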

AWS Bottlerocket Node, Pod and example of sensitive assets such as service tokens, volume mounts.

Figure 3. KubeArmor improves Bottlerocket security

Why is it important to protect pods?

Typically, a k8s pod is the entity that’s reachable from the external world, possibly through an ingress controller. Thus, if the workload or application within the pod is vulnerable, it has a higher chance of being compromised. The host may subsequently be compromised if the attacker can leverage a container escape. However, the pod itself also has attack vectors that an attacker can leverage to move laterally or to exfiltrate data.

To quote the NSA-CISA K8s Hardening Guide:

“Applications running inside the cluster are common targets. They are frequently accessible outside of the cluster, making them reachable by remote cyber actors. An actor can then pivot from an already compromised Pod or escalate privileges within the cluster using an exposed application’s internally accessible resources.”

In the AWS Shared Responsibility Model, pod and application security falls within the scope of the customer who installs the workloads. KubeArmor provides a way to harden pods just the way hosts and nodes have been hardened for decades.

KubeArmor policies are enforced at runtime, after the workloads are deployed and executing. KubeArmor policy management should be handled in the same way as network policies, which are also enforced at runtime.

Real-world use case

The alpine image, used as a base image in many containerized workloads, ships with the /sbin/apk binary, which is a package management tool. Such tools can be used to install new binaries, which increases the attack surface within the pods. In a production environment, it’s best to disable execution of these tools. KubeArmor reduces the attack surface by blocking execution of such binaries.

The following example policy shows you how to do this:

apiVersion: security.kubearmor.com/v1
kind: KubeArmorPolicy
metadata:
  name: alpine-pol
spec:
  selector:
    matchLabels:
      app: alpine
  process:
    matchPaths:
    - path: /sbin/apk
  action:
    Block

Another example involves service account tokens. When you deploy a pod in your k8s cluster, a service account token is mounted by default at the path /var/run/secrets/kubernetes.io/serviceaccount, even if none of the applications within the pod require access to it. Attackers can leverage these tokens to access the k8s API server and move laterally within the cluster.

To quote the NSA-CISA K8s Hardening Guide in this context:

“By default, Kubernetes automatically provisions a service account when creating a Pod and mounts the account’s secret token within the Pod at runtime. Many containerized applications do not require direct access to the service account as Kubernetes orchestration occurs transparently in the background. If an application is compromised, account tokens in Pods can be gleaned by cyber actors and used to further compromise the cluster.”

KubeArmor provides a way to block access to the service account token path within the pod, as the following policy shows:

apiVersion: security.kubearmor.com/v1
kind: KubeArmorPolicy
metadata:
  name: block-token-access
spec:
  selector:
    matchLabels:
      app: alpine
  file:
    matchDirectories:
    - dir: /var/run/secrets/kubernetes.io/serviceaccount/
      recursive: true
  action:
    Block
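Once a policy like this is applied, a quick way to check its effect is to try reading the token from inside a matching pod. The helper below is a hypothetical sketch (the function name and messages are not part of KubeArmor or kubectl). It relies only on kubectl exec returning a non-zero status when the command in the container fails, so it cannot distinguish a denied read from other exec failures such as a missing pod.

```shell
#!/usr/bin/env bash
# Sketch: report whether the service account token is readable inside a pod.
# Usage: check_token_access <namespace> <pod-name>
check_token_access() {
  local namespace="$1" pod="$2"
  local token_path=/var/run/secrets/kubernetes.io/serviceaccount/token
  # A blocked read makes cat fail, and kubectl exec passes that failure on.
  if kubectl -n "$namespace" exec "$pod" -- cat "$token_path" >/dev/null 2>&1; then
    echo "token readable"
  else
    echo "token access denied or exec failed"
  fi
}

# Example (pod name is a placeholder):
#   check_token_access default <alpine-pod-name>
```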

Walkthrough

Deployment steps with Amazon EKS using Bottlerocket AMI

Please follow the quickstart guide on implementing an Amazon Elastic Kubernetes Service (EKS) cluster with Bottlerocket. Once the cluster is established, please follow the steps below for KubeArmor installation.

Step 0. Install prerequisites

Install eksctl, kubectl, curl, and jq. This guide assumes that the host from which the installation is driven is a Linux machine.

Step 1. Download and install the karmor CLI tool

curl -sfL http://get.kubearmor.io/ | sh -s -- -b /usr/local/bin

Step 2. Install KubeArmor

karmor install

It’s assumed that the k8s cluster is already present/reachable with the required prerequisites and the user has rights to create service-accounts and cluster-role-bindings.
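After the install completes, it can be useful to confirm that the KubeArmor pods are running before deploying any policies. The sketch below counts Running pods whose names start with kubearmor in the output of kubectl get pods. The function name is hypothetical, and the kube-system namespace is an assumption based on the port-forward command used later in this walkthrough. Adjust it if your KubeArmor version installs elsewhere.

```shell
#!/usr/bin/env bash
# Sketch: count KubeArmor pods that are in the Running state.
# Reads `kubectl get pods` output on stdin (columns: NAME READY STATUS ...).
count_running_kubearmor_pods() {
  awk '$1 ~ /^kubearmor/ && $3 == "Running" { n++ } END { print n + 0 }'
}

# Example (namespace is an assumption):
#   kubectl get pods -n kube-system | count_running_kubearmor_pods
```

Because KubeArmor runs as a daemonset, you would expect at least one Running pod per worker node, plus any supporting pods such as the relay.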

Step 3. Deploying sample app and policies

a. Deploy sample multiubuntu app

kubectl apply -f https://raw.githubusercontent.com/kubearmor/KubeArmor/main/examples/multiubuntu/multiubuntu-deployment.yaml

b. Deploy sample policies

kubectl apply -f https://raw.githubusercontent.com/kubearmor/KubeArmor/main/examples/multiubuntu/security-policies/ksp-group-1-proc-path-block.yaml

This sample policy blocks execution of the sleep command in ubuntu-1 pods.

apiVersion: security.kubearmor.com/v1
kind: KubeArmorPolicy
metadata:
  name: ksp-group-1-proc-path-block
  namespace: multiubuntu
spec:
  severity: 5
  message: "block /bin/sleep"
  selector:
    matchLabels:
      group: group-1
  process:
    matchPaths:
    - path: /bin/sleep
  action:
    Block

The above policy uses the process construct, which specifies a set of matchPaths listing the binaries (in this case only /bin/sleep), and the action is Block. Thus, when /bin/sleep is executed, the policy comes into effect and denies execution of the command.

c. Simulate policy violation

$ POD_NAME=$(kubectl get pods -n multiubuntu -l "group=group-1,container=ubuntu-1" -o jsonpath='{.items[0].metadata.name}') && kubectl -n multiubuntu exec -it $POD_NAME -- bash
# sleep 1
(Permission Denied)

Step 4. Getting alerts and telemetry from KubeArmor

a. Enable port-forwarding for KubeArmor relay

$ kubectl port-forward -n kube-system svc/kubearmor 32767:32767

b. Observing logs using karmor cli

$ karmor log --json | jq .
{
  "Timestamp": 1661286176,
  "UpdatedTime": "2022-08-23T20:22:56.286246Z",
  "ClusterName": "default",
  "HostName": "ip-192-168-18-137.ec2.internal",
  "NamespaceName": "multiubuntu",
  "PodName": "ubuntu-3-deployment-6d8587dc77-x5lxd",
  "ContainerID": "fb531f48f12a29623bf8629f63b5a21abe9ac7007b83aecff7c29c38ca52c37a",
  "ContainerName": "fb531f48f12a",
  "HostPPID": 926656,
  "HostPID": 926756,
  "PPID": 14,
  "PID": 75,
  "ParentProcessName": "/bin/dash",
  "ProcessName": "/bin/sleep",
  "PolicyName": "ksp-group-1-proc-path-block",
  "Severity": "5",
  "Message": "block /bin/sleep",
  "Type": "MatchedPolicy",
  "Source": "/bin/dash",
  "Operation": "Process",
  "Resource": "/bin/sleep 1",
  "Data": "syscall=SYS_EXECVE",
  "Enforcer": "BPFLSM",
  "Action": "Block",
  "Result": "Permission denied"
}

The alert event shows that the execution of /bin/sleep was blocked, and it contains contextual information (i.e., the associated container, pod, namespace, and node). It also shows that the Enforcer of the action was “BPFLSM”.
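When many events are streaming, it helps to narrow the output to blocked actions and a few key fields. The following sketch uses jq, which is already a prerequisite for this walkthrough, to filter on the Action field shown in the sample event. The function name is hypothetical, and the field selection is only one possible choice.

```shell
#!/usr/bin/env bash
# Sketch: keep only blocked events and print a compact summary of each.
# Reads KubeArmor JSON events on stdin, one object per event.
blocked_alerts() {
  jq -c 'select(.Action == "Block")
         | {PodName, ProcessName, PolicyName, Result}'
}

# Example:
#   karmor log --json | blocked_alerts
```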

Step 5. Integrating with external logging tools

KubeArmor operates as a k8s daemonset, which means the policy enforcement pod is installed on a per-worker-node basis. A kubearmor-relay-service, installed as part of the KubeArmor deployment, connects to all the KubeArmor pods and provides a single point of interface for consuming logs and events. You can consume these logs and events from KubeArmor using the following methods:

Exporting events to Prometheus: KubeArmor provides a Prometheus exporter adapter that you can use to export events to Prometheus.
Exporting events to ELK using elk-adapter.
KubeArmor Relay service also dumps the events to stdout. These events could be consumed by third-party generic k8s logging backends such as fluentd.

Step 6. Cleaning up

To uninstall KubeArmor from the k8s cluster, you can use:

karmor uninstall

Please remember to delete the cluster to avoid unnecessary costs. Refer to the Cleanup section on deleting the Amazon EKS resources.

Conclusion

In this post, we showed you how to use KubeArmor, a cloud native solution that operates on top of Bottlerocket to secure pods and containers using BPF-LSM. In k8s, pods are the execution units and are usually exposed to external entities. Thus, it’s imperative to have a layer of defense within the pods so that an attacker is limited in their ability to use system primitives to exploit a vulnerability. KubeArmor is a k8s-native solution that uses Linux kernel primitives on Bottlerocket to harden the pods.

Rahul Jadhav, Cofounder, Accuknox

Rahul Jadhav is a systems engineer working on solutions involving security and performance optimizations of cloud-native technologies. He has contributed to several open source projects, including the Linux kernel, and is associated with IETF standards groups and Linux Foundation groups. He has taken several projects from conception to market and is an active maintainer of the CNCF sandbox project KubeArmor. He cofounded Accuknox, which enables runtime, zero-trust security by leveraging Linux LSM and eBPF-based security for k8s and containerized or cloud-native workloads.
