01 Kubestronaut KCNA & KCSA

Author included in StudyNote

2024-11-29, 6229 words, 30 minutes

KCNA

Container and Orchestration

The matrix from hell

Web server, database, messaging, orchestration
OS, hardware infrastructure
libraries, dependencies

Module 2 Quiz (failed ones only)

How can you specify a specific runtime endpoint when using the CRI control tool (crictl)?

Setting the CONTAINER_RUNTIME_ENDPOINT environment variable, or using the --runtime-endpoint command line option
Why do Docker images continue to work seamlessly in Kubernetes even after the removal of Docker support?

Because Docker images adhere to the Open Container Initiative (OCI) standard
What is the benefit of using the nerdctl tool over the ctr command line tool?

nerdctl provides access to the newest features implemented in containerd.
Which component in Kubernetes is responsible for distributing containers across multiple nodes?

Scheduler
What is containerization?

Running microservices in their own container
Which container technology does Docker primarily utilize for its containers?

LXC
How do containers differ from virtual machines (VMs) with respect to their operating system (OS) kernel?

Containers share the same OS kernel as the host system, but VMs have their own separate OS kernel.
Which container orchestration technology is known for being a bit difficult to set up and get started with, but provides several features to customize the deployment?

Kubernetes
Which command is used to view information about the Kubernetes cluster?

kubectl cluster-info
Who maintains and develops the crictl command line utility?

Kubernetes community

KCSA

Infrastructure Security

summary

Isolate critical applications on separate servers for better security
Restrict Docker port access with firewall rules and policies
Apply least privilege to containers and secure Kubernetes dashboard
Store sensitive data securely using Kubernetes secrets and RBAC
Encrypt etcd data and use TLS authentication for protection

Kubernetes Isolation Techniques

Multi-tenancy with namespaces: different teams or projects can use the same cluster without interfering with each other. Each team or project has its own namespace.

Network Policies allow us to define rules regarding which components can talk to each other.

RBAC ensures, for example, that a developer has only read access to the production namespace and full access to the development namespace. Team A only has access to namespace A resources and team B only has access to namespace B resources.

Resource Quotas and Limits ensure that each component gets a fair share of resources.
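
As a minimal sketch of the RBAC point above, the following Role and RoleBinding give one developer read-only access to pods in a production namespace. The namespace `production`, the role name `pod-reader` and the user `dev-user` are illustrative assumptions, not names taken from these notes.

```yaml
# Hypothetical names: namespace "production", user "dev-user"
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: production
  name: pod-reader
rules:
- apiGroups: [""]                    # "" is the core API group
  resources: ["pods"]
  verbs: ["get", "list", "watch"]    # read-only verbs
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: production
  name: read-pods
subjects:
- kind: User
  name: dev-user
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```

Because both objects are namespaced, the binding grants nothing outside `production`; full access to a development namespace would need a second Role and RoleBinding there.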

Workload and Application Code Security

Container - Let’s say a team uses containerization for the benefits of portability, scalability, consistency, isolation and security. Popular artifact repositories such as JFrog, Nexus Repository and GitHub Packages support image signing, enhanced access control and integrated vulnerability scanning.

Code - To guard against attacks such as SQL injection, use SonarQube to detect problematic code and the OWASP dependency checking tool to check dependencies. Gaining insight into containerized applications is also essential, and this is where tools like Sysdig come in. Sysdig provides deep visibility into containerized environments, which allows you to monitor resource usage, detect anomalies, and troubleshoot issues in real time. With Sysdig, you can quickly pinpoint which container or process is causing high CPU or memory usage, track down problematic workloads, and even inspect live system calls.

4 C’s	best practices
Code
Container	Restrict Images, Supply Chain, Sandboxing, Privileged
Cluster	Authentication Authorization, Admission, Network Policy
Cloud	Data Center, Network, Servers

API Server

Who can access the api-server?

Files - Username and passwords
Files - username and tokens
Certificates
External Authentication providers - LDAP
Service Accounts

What can they (api-server) do?

RBAC authorization
ABAC authorization
Node authorization
Webhook Mode

Securing Controller Manager and Scheduler

summary

Isolate controller manager and scheduler on separate dedicated nodes
Use RBAC to limit permissions of controller manager and scheduler
Encrypt communications between components using TLS for security
Enable audit logging to track and review all actions taken
Secure default settings and protect the configuration files
Run the latest version of Kubernetes
Regularly scan for vulnerabilities

Securing Kubelet

# kubelet.service
ExecStart=/usr/local/bin/kubelet \\
  --container-runtime=docker \\
  --image-pull-progress-deadline=2m \\
  --kubeconfig=/var/lib/kubelet/kubeconfig \\
  --network-plugin=cni \\
  --register-node=true \\
  --v=2 \\
  --config=/var/lib/kubelet/kubelet-config.yaml # add
#  --cluster-domain=cluster.local \\
#  --file-check-frequency=0s \\
#  --healthz-port=10248 \\
#  --cluster-dns=10.96.0.10 \\
#  --http-check-frequency=0s \\
#  --sync-frequency=0s

Above is how the kubelet service configuration file looked originally. With the release of version 1.10, most of these parameters were moved to another file, the kubelet config file, for ease of deployment and configuration management. The object defined in that file is of kind KubeletConfiguration.

In the kubelet service file, we pass the path to this file with the --config command line argument. Note that within the file, the parameters use camel case: the dashes that separated words in the flag names are removed, and the first letter of each word except the first is capitalized. See below:

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
clusterDomain: cluster.local
fileCheckFrequency: 0s
healthzPort: 10248
clusterDNS:
- 10.96.0.10
httpCheckFrequency: 0s
syncFrequency: 0s

If you specify an option both as a command line flag in the service file and in the kubelet configuration file, the command line flag overrides the value in the file.

Let’s say there are a large number of worker nodes. Instead of manually creating these kubelet config files on each worker node, the kubeadm tool can configure the kubelet files automatically on those nodes when you run the kubeadm join command.

View kubelet options

ps -aux | grep kubelet
cat /var/lib/kubelet/config.yaml

Kubelet - Security

The kubelet serves on two ports. Port 10250 serves the API that allows full access. Port 10255 serves an API that allows unauthenticated, unauthorized read-only access. By default, the kubelet allows anonymous access to its API.

port	description
10250	Serves API that allows full access
10255	Serves API that allows unauthenticated read-only access

Anyone who knows the IP address of these hosts can call these APIs to do whatever the kubelet API allows, such as viewing existing pods, creating new pods, exec-ing into existing pods, or viewing usage metrics, stats, logs and more.

How do we prevent these security issues and implement security on the kubelet? Any request that comes to the kubelet is first authenticated and then authorized. The authorization process decides which areas of the API the user can access and which operations they can perform.

Set `--anonymous-auth=false`

You can disable anonymous access by setting the --anonymous-auth flag to false in the kubelet service file. Alternatively, set authentication.anonymous.enabled to false in the KubeletConfiguration file.

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
authentication:
  anonymous:
    enabled: false
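
Disabling anonymous access covers only the authentication half of the "authenticated, then authorized" flow described above. A fuller sketch of a hardened KubeletConfiguration might look like the following; the CA path is a placeholder assumption, and the values should be checked against your own cluster before use.

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
authentication:
  anonymous:
    enabled: false                  # reject unauthenticated requests
  x509:
    clientCAFile: /path/to/ca.crt   # placeholder path: CA used to verify client certificates
  webhook:
    enabled: true                   # validate bearer tokens through the API server
authorization:
  mode: Webhook                     # ask the API server whether the caller is allowed
readOnlyPort: 0                     # 0 disables the read-only port (10255)
```

With `authorization.mode` set to Webhook, the kubelet defers each authorization decision to the API server instead of allowing every authenticated request.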

Summary

Regularly update and patch container runtimes to fix vulnerabilities
Run containers with least privileges to minimize security risks
Use read-only filesystems to prevent unauthorized filesystem modifications
Limit resource usage to prevent denial-of-service (DoS) attacks
Apply security profiles like SELinux and AppArmor for protection
Transition to supported runtimes like containerd or CRI-O
Implement monitoring and logging for runtime behavior detection

Securing kube proxy

# to locate the kube-proxy config file
ps -ef | grep kube-proxy
...... /usr/local/bin/kube-proxy --config=/var/lib/kube-proxy/config.conf --hostname-override=controlplane

cat /var/lib/kube-proxy/config.conf
apiVersion: kubeproxy.config.k8s.io/v1alpha1
bindAddress: 0.0.0.0
bindAddressHardFail: false
clientConnection:
  acceptContentTypes: ""
  burst: 0
  contentType: ""
  kubeconfig: /var/lib/kube-proxy/kubeconfig.conf
  qps: 0
clusterCIDR: 127.17.0.0/16
......

# check the permission - must be 644 or stricter, so that only the owner can edit the file
stat -c %a /var/lib/kube-proxy/kubeconfig.conf
644
# check the ownership (file owner and group)
stat -c %U:%G /var/lib/kube-proxy/kubeconfig.conf
root:root

Securing communication

Secure communication between kube-proxy and the API server is crucial.

The file path at clusters[*].cluster.certificate-authority is used to validate the API server’s TLS certificate. The kube-proxy itself uses a service-account token to authenticate to the kube-apiserver. We need to ensure that this communication is encrypted using TLS to avoid eavesdropping and tampering.

Another step is to make sure that audit logs are enabled. In K8s, audit logs record the actions performed by a service (in this case kube-proxy), which lets us track changes and identify unauthorized access. Reviewing these logs regularly helps in detecting and addressing suspicious activities.

Summary

Secure kube-proxy config file with strict permissions
Encrypt API server communication using TLS and service accounts
Run kube-proxy with least privileges necessary
Implement network policies for traffic control (If necessary)
Use logging and monitoring for detecting anomalies
Regularly update and patch kube-proxy for security
Enable audit logs to track kube-proxy actions

Pod Security

Pod Security Policy (PSP) was removed in Kubernetes 1.25 and replaced by Pod Security Admission (PSA) and Pod Security Standards (PSS). PSA was promoted to stable in version 1.25.

Basic idea of how PSP works:

When PSP is enabled, the Pod Security Policy Admission Controller observes all Pod creation requests and validates the configuration against the set of pre-configured rules. If it detects a violation of the rule, then that request to create the Pod is rejected and the user receives an error message.

To enable the PodSecurityPolicy admission controller, we add it to the --enable-admission-plugins flag of the kube-apiserver.

# at /etc/kubernetes/manifests/kube-apiserver.yaml
spec:
  containers:
  - command:
    - kube-apiserver
    - --authorization-mode=Node,RBAC
    - --advertise-address=172.17.0.107
    - --allow-privileged=true
    - --enable-bootstrap-token-auth=true
    - --enable-admission-plugins=PodSecurityPolicy

Define Pod Security Policy

apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: example-psp
spec:
  privileged: false
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  runAsUser:
    rule: 'MustRunAsNonRoot'
# fsGroup:
#   rule: RunAsAny
  requiredDropCapabilities:
  - 'CAP_SYS_BOOT'
  defaultAddCapabilities:
  - 'CAP_SYS_TIME'
  volumes:
  - 'persistentVolumeClaim'

[kubectl] ➡️ [Authentication] ➡️ [Authorization] ➡️ [Admission Controllers - PodSecurityPolicy] ➡️ Create POD

Challenges

PodSecurityPolicy is not enabled by default. If you were to enable Pod Security Policies in a cluster, you have to make sure that appropriate policies are created in advance. Also it’s complex as you need to bind PSP to roles and bindings.

Securing etcd

Enabling Data Encryption at Rest

kind: EncryptionConfiguration
apiVersion: apiserver.config.k8s.io/v1
resources:
- resources:
  - secrets
  providers:
  - aescbc:
      keys:
      - name: key1
        secret: <base64-encoded-encryption-key>
  - identity: {} # fallback: lets data that was stored unencrypted still be read
Specifies the type of k8s object: EncryptionConfiguration
Specifies the k8s API version: apiserver.config.k8s.io/v1
Lists the resources to be encrypted, like secrets
Defines encryption providers, using aescbc as the algorithm

Generate the encryption key

openssl rand -base64 32

The EncryptionConfiguration is not created with kubectl apply. Save it as a file on the control plane node, then update the kube-apiserver pod specification file so that the API server loads it through the --encryption-provider-config flag. The file must also be mounted into the kube-apiserver pod.

containers:
  - name: kube-apiserver
    command:
      - kube-apiserver
      - ......
      - --encryption-provider-config=/path/to/encryption-config.yaml

Use TLS for Secure Communication (data in transit)

--cert-file: Specifies the server’s certificate file for secure identity presentation

--key-file: Specifies the server’s key file used with the certificate

--client-cert-auth : Enables client certificate authentication to allow only trusted clients

--trusted-ca-file: Specifies the CA certificate for verifying client certificates

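--peer-cert-file: Specifies the certificate file for secure peer-to-peer communication between etcd members

--peer-key-file: Specifies the key file used with the peer certificate

--peer-client-cert-auth: Enables client certificate authentication for peer connections

--peer-trusted-ca-file: Specifies the CA certificate for verifying peer certificates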

Regular Backups

ETCDCTL_API=3 etcdctl snapshot save /path/to/backup.db \
  --endpoints=<etcd-endpoints> \
  --cacert=/path/to/ca.crt \
  --cert=/path/to/etcd-client.crt \
  --key=/path/to/etcd-client.key

Specify the etcdctl API version to use (version 3)
Command to take a snapshot of the etcd data
Path where the backup file will be saved
Specify the etcd endpoints to connect to
Path to the CA certificate used to verify the etcd server certificate
Path to the client certificate used to authenticate the client
Path to the client key used to authenticate the client

Summary

Enable data encryption at rest for etcd security
Use TLS to secure etcd communication
Regularly back up etcd data for recovery

Secure container networking

Implement networkPolicies to restrict pod to pod communication
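
A common starting point is a default-deny policy, sketched below under the assumption that the cluster's CNI plugin enforces NetworkPolicy. The namespace `payroll` is borrowed from the PSA example later in these notes and is only illustrative.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: payroll        # illustrative namespace
spec:
  podSelector: {}           # empty selector matches every pod in the namespace
  policyTypes:
  - Ingress                 # no ingress rules are listed, so all inbound traffic is denied
```

Once this is in place, each allowed flow is opened explicitly with an additional, narrower policy.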

Use service meshes

Istio, Linkerd to enforce mutual TLS, to ensure that all communication between services is encrypted and authenticated, which significantly reduces the risk of man-in-the-middle attacks

Encrypting Network Traffic

Use a CNI plugin like Calico to enable IPsec encryption for network traffic, which ensures that all data transferred between nodes is encrypted

Isolating sensitive workloads

Using namespaces or network policies to help contain potential security breaches

Summary

Implement network policies to control pod traffic flow
Use service meshes for encrypted, secure service communication
Encrypt network traffic between containers using IPsec or WireGuard
Isolate sensitive workloads with namespaces and network policies

Client Security - kubectl proxy port forward

user details and credentials stored in kubeconfig file (~/.kube/config)

Through port 6443 using curl using certificate files

curl https://<kube-api-server-ip>:6443 -k \
  --key admin.key \
  --cert admin.crt \
  --cacert ca.crt

Alternative option: start kubectl proxy client

kubectl proxy
Starting to serve on 127.0.0.1:8001

curl http://localhost:8001 -k

The kubectl proxy client launches a proxy service locally on port 8001 by default and uses the credentials and certificates from your kubeconfig file to access the cluster. Remember that the proxy only runs on your laptop and, by default, is only reachable from your laptop.

Kubectl port forward

kubectl port-forward service/nginx 28080:80
kubectl port-forward --help

Summary

Encrypt data at rest and in transit for protection
Implement RBAC to control access to storage resources
Use Storage Classes to enforce security and performance policies
Regularly back up data and have a disaster recovery plan
Monitor and audit storage access for compliance and security

Pod Security Standards and Pod Security Admissions

KEP 2579 PSP replacement

PodSecurityAdmissionController

kubectl exec -n kube-system kube-apiserver-controlplane -it -- kube-apiserver -h | grep enable-admission-plugins

Configure PSA

PSA is namespace scoped. This means you can label a namespace with PSA mode and security standard using below command syntax.

kubectl label ns payroll pod-security.kubernetes.io/<mode>=<security standard>

# examples of label key and value
[Key]
pod-security.kubernetes.io/enforce
pod-security.kubernetes.io/audit
pod-security.kubernetes.io/warn

[Value]
privileged
baseline
restricted

One of PSA goals is to simplify the whole implementation. Instead of requiring users to write their own profiles, a few built-in profiles are defined: privileged, baseline, restricted.

Three security policies defined by Pod Security Standards

Profile	Description
Privileged	Unrestricted policy. It allows the widest possible level of permissions, almost like there are no restrictions. Useful for system-wide programs like logging agents, CNIs, storage drivers
Baseline	Minimally restrictive policy.
Restricted	Heavily restricted policy. It follows the pod hardening best practices.

A mode defines what action the control plane takes if the policy is violated. If it’s set to enforce, then the pod creation request is rejected. If it’s set to audit, then the pod creation is allowed and an entry is added to the audit logs. And warn mode triggers a user-facing warning.

Three modes of Pod Security Admission

Mode	On Violation
enforce	Reject pod
audit	Record in the audit logs
warn	Trigger user-facing warning
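
The same labels can be set declaratively on the Namespace object. The sketch below combines the three modes on the `payroll` namespace used in the command above; the particular mix of profiles is an assumption chosen for illustration.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: payroll
  labels:
    pod-security.kubernetes.io/enforce: baseline    # reject pods that violate baseline
    pod-security.kubernetes.io/audit: restricted    # record restricted violations in the audit log
    pod-security.kubernetes.io/warn: restricted     # warn the user about restricted violations
```

Enforcing a looser profile while auditing and warning on a stricter one is a way to see what would break before tightening enforcement.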

Transitioning from PSP to a new pod security solution

PodSecurityPolicy was deprecated in Kubernetes 1.21 and removed in Kubernetes 1.25. As a replacement, there are two options available:

Policy as Code (PAC)
- notable examples: Kyverno, OPA Gatekeeper (Open Policy Agent), jsPolicy
Pod Security Standards with Pod Security Admission (PSS+PSA)

Pod Security Standards

Configure the built-in admission controller

Three types of exemptions: Usernames, RuntimeClassNames, Namespaces

Privileged

Example case is a container that needs to manage the host’s network stack.

    securityContext:
      privileged: true

Baseline

Example case is an API server that requires limited security permissions without escalation privileges.

    securityContext:
      allowPrivilegeEscalation: false

Restricted

Example case is a payment processing app that handles sensitive data, which is under restricted level to minimize the attack surface.

    securityContext:
      allowPrivilegeEscalation: false
      runAsNonRoot: true
      readOnlyRootFilesystem: true
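
For context, here is a sketch of a complete Pod that should pass the restricted profile. The pod name and image are placeholders. Beyond the fields shown above, the restricted profile also expects all capabilities to be dropped and a seccomp profile to be set; `readOnlyRootFilesystem` is extra hardening that the profile itself does not require.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: payment-app                    # placeholder name
spec:
  securityContext:
    runAsNonRoot: true
    seccompProfile:
      type: RuntimeDefault             # use the container runtime's default seccomp profile
  containers:
  - name: app
    image: registry.example.com/payment-app:1.0   # placeholder image
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true     # extra hardening, not required by the profile
      capabilities:
        drop: ["ALL"]
```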

Example of admission configuration file

apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
  - name: PodSecurity
    configuration:
      apiVersion: pod-security.admission.config.k8s.io/v1
      kind: PodSecurityConfiguration
      defaults:
        enforce: baseline
        enforce-version: latest
        audit: restricted
        audit-version: latest
        warn: restricted
        warn-version: latest
      exemptions:
        usernames: [] 
        runtimeClassNames: [] 
        namespaces: [my-namespace]  

Isolation and Segmentation - ResourceQuota & Limits

By default, Kubernetes does not set a CPU or memory request or limit on containers. This means that any pod can consume as many resources as it wants on a node and starve the other pods or processes running on that node.

LimitRange

apiVersion: v1
kind: LimitRange
metadata:
  name: cpu-resource-constraint
spec:
  limits:
  - default:
      cpu: 500m
    defaultRequest:
      cpu: 500m
    max:
      cpu: "1"
    min:
      cpu: 100m
    type: Container

We can ensure that every pod created has some defaults set by LimitRanges. This can help you define default values to be set for containers in pods that are created without a request or limit specified in the pod definition file.

ResourceQuota

apiVersion: v1
kind: ResourceQuota
metadata:
  name: my-resource-quota
spec:
  hard:
    requests.cpu: 4
    requests.memory: 4Gi
    limits.cpu: 10
    limits.memory: 10Gi

To restrict the total amount of resources that can be consumed by applications deployed in a Kubernetes cluster, we can create quotas at the namespace level. ResourceQuota is a namespace-level object that sets hard limits on the sum of requests and limits.
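
Once a quota on CPU or memory requests and limits is active in a namespace, pods created there must declare those values (or inherit them from a LimitRange), otherwise the API server rejects them. A minimal sketch of a pod that declares its own values, with a hypothetical pod name and namespace:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: quota-demo        # hypothetical name
  namespace: dev          # hypothetical namespace that holds the ResourceQuota
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:           # counted against requests.cpu / requests.memory
        cpu: 250m
        memory: 256Mi
      limits:             # counted against limits.cpu / limits.memory
        cpu: 500m
        memory: 512Mi
```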

Audit logs

kubectl logs -f falco-6t3dd

All requests made to the Kubernetes cluster have to go through the API server. When we send a request to the Kubernetes cluster, such as for creating a new Nginx pod, it’s the API server that receives this request. As soon as a request is made to the API server, it goes through what is known as the request received stage.

In the request received stage, events are generated, irrespective of whether the request is valid or not. Once the request is authenticated, validated and authorized, another event called response started is generated.

The response started event is applicable for requests that can take some time to complete, such as when using the --watch flag with the kubectl get command to continuously observe the state of objects. Once a request has been completed, a response body is sent back. This stage is known as the response complete stage.

Finally, in case of an invalid request or an error, the request goes through a panic stage. Each of the stages that we have seen here (request received, response started, response complete and panic) generates events that can be recorded by the API server.

[RequestReceived] –> [ResponseStarted] –> [ResponseComplete] –> [Panic]

1. Create a Policy Object

Only record when pods were deleted.

apiVersion: audit.k8s.io/v1
kind: Policy
omitStages: ["RequestReceived"]
rules:
  - namespaces: ["prod-namespaces"]
    verbs: ["delete"]
    resources:
    - group: ""
      resources: ["pods"]
      resourceNames: ["webapp-pod"]
    level: RequestResponse
  - level: Metadata
    resources:
    - group: ""
      resources: ["secrets"]

Level - 4 value:

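None: no events are logged for matching requests (for example, nothing is logged when the pod called webapp-pod is deleted in the prod namespace)
Metadata: only the request metadata (timestamp, username, resources, verbs) is logged; the least verbose level that still records events
Request: the metadata and the request body are logged
RequestResponse: the metadata, the request body and the response body are logged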

2. Enable auditing in kube apiserver

In version 1.20, two types of backends are supported: a log backend that writes audit events to a file on the master node, and a webhook backend that sends them to a remote webhook such as a Falco service.

# At /etc/kubernetes/manifests/kube-apiserver.yaml
spec:
  containers:
  - command:
    - kube-apiserver
    - ...
    - --audit-log-path=/var/log/k8-audit.log
    - --audit-policy-file=/etc/kubernetes/audit-policy.yaml
    - --audit-log-maxage=10
    - --audit-log-maxbackup=5
    - --audit-log-maxsize=100

Kubernetes Threat Model

Kubernetes Trust Boundaries and Data Flow

Threat modeling is a process that helps you find potential threats, understand their impact and put measures in place.

Trust Boundaries

Isolating parts of the system and enforcing specific measures on each helps us manage and reduce security risks, by ensuring that a breach in one part of the system doesn’t compromise the entire application. These isolated areas are called trust boundaries.

Cluster boundaries ➡️ Node boundary ➡️ Namespace boundary ➡️ Pod boundary ➡️ Container Boundary

Exploring Data Flow in a Multi-Tier Application

From user to frontend-pod/nginx: Secure with measures like HTTPS and authentication to protect user data right from the start
From nginx to authentication service: Secure communication channels and proper API authentication
From backend-svc to db-pod: Interaction with the database must be tightly controlled and encrypted to prevent unauthorized access and protect sensitive information
Backend communication: use networkPolicies to prevent unauthorized inter-pod communication
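
The backend-to-database flow above could be pinned down with a policy like the following sketch. The labels `app: db` and `app: backend` and the port 5432 are assumptions made for illustration; the notes do not say how the pods are labeled or which database is used.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: db-allow-backend
spec:
  podSelector:
    matchLabels:
      app: db               # assumed label on the database pod
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: backend      # assumed label on the backend pods
    ports:
    - protocol: TCP
      port: 5432            # assumed database port
```

Because the policy selects the database pod and lists a single source, any other pod in the namespace loses the ability to connect to it.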

Threat actors

Entities that pose threats to our system, such as external attackers, compromised containers and malicious users.

Persistence

Persistence: The ability of an attacker to maintain access to a compromised system even after reboots, updates or other interruptions

Mitigating Persistence Risks

Implementing role based access controls
Restrict access to Secrets
Hardening pod security
- including preventing the use of privileged containers and enforcing read-only root filesystems
Regular updates and patching
- up-to-date security patches prevent exploitation
Monitoring and auditing
- monitor for suspicious activities
- regular audit of Kubernetes events

Denial of Service

Summary

DoS attacks overwhelm system resources, causing unresponsiveness
Set resource quotas to prevent excessive resource usage
Restrict service account permissions to limit potential attacks
Use Network Policies and firewalls to control access
Monitor and alert on unusual activity for quick response

Malicious Code Execution

Summary

Attackers exploit vulnerabilities in containers to execute malicious code
Restrict API server access to authorized users and services only
Secure image repositories and use signed images for verification
Monitor and log activities to detect and respond to threats
Regularly update and patch applications to prevent security exploits

Compromised Applications in Containers

Please refer to compromised container attackTree, credit from Marco Lancini’s blog.

Attacker on the Network

The attacker could cause a loss of etcd quorum by overloading its ports 2380 and 2379, or by blocking it from syncing up with other components.

By taking down the scheduler and the controller manager, the attacker can halt new or restarted workloads from being scheduled in the cluster. Attacking the scheduler ports (10251 and 10259) prevents Kubernetes from assigning pods to nodes, stopping new tasks from starting.

Going after the controller manager (port 10252 and 10257) will disrupt critical control loops, which impacts scaling updates and replications, meaning Kubernetes won’t be able to manage the cluster’s state effectively.

Kube-proxy is the service that manages network rules on each node. If the attacker takes down kube-proxy or overloads its ports 10256 and 10249, the flow of traffic between services and pods stops, essentially freezing communications.

Targeting Kubernetes DNS ports 53 would break name resolution, meaning services can’t find each other by name, causing connectivity issues all over.

Attackers could also degrade the CNI (container network interface) by flooding it, which slows down or cuts off pod-to-pod communication, causing distributed services to fail.

By targeting network boot services like PXE server, the attackers could prevent new nodes from joining the cluster. This would keep the cluster from scaling up or replacing failed nodes, locking it into a degraded state.

To mitigate network based attacks

Configuring firewalls
Securing nodes
- apply latest security patches
- monitor vulnerabilities
Implementing network policies
Strong authentication and authorization
- strong password
- multi-factor authentication
- role-based access control
Monitoring and logging

Summary

Attackers can target K8s control plane and nodes for breaches
Configure firewalls to limit network access to trusted IP addresses
Keep node operating systems and components updated and patched
Implement network policies to control traffic and prevent lateral movement
Use strong authentication, multi-factor, and RBAC for secure access
Monitor and log activities to detect and respond to threats

Access to Sensitive Data

vi /etc/ssh/sshd_config
sudo apt install nginx
# the default configuration for sudo
cat /etc/sudoers
visudo

Platform Security

Supply Chain Security - Minimize base image footprint

An image built from scratch (FROM scratch) is called a base image. An image that another image builds on is its parent image. In practice the two terms are often used interchangeably.

Modular

Do not build images that combine multiple applications, such as a web server, a database and other services, all into one image. Instead, build images that are modular. These different images, when deployed as containers, can together form a single large application made up of different services, and each component can scale up or down as required without having to scale the other components.

Persist State

Another best practice is not to store data or state inside a container. Because containers are ephemeral in nature, we should be able to destroy them and bring them back online without losing data along with the container. Always store data in an external volume or in a caching service like Redis.

Choose a base image

FROM ??????

First, look for images that suit your technical needs, such as the httpd base image for your application, or the nginx base image for an nginx-based server. Secondly, look for images with proven authenticity (OFFICIAL IMAGE or verified publisher tag). Thirdly, images must also be up to date, since up-to-date images are less likely to contain vulnerabilities.

Slim/Minimal images

Create slim/minimal images
Find an official minimal image that exists
Only install necessary packages
- remove shells/packages managers/tools
Maintain different images for different environment
- development - debug tools
- Production - lean
Use multi-stage builds to create lean production ready images

Distroless Docker Images

Contains:

Application
Runtime dependencies

Does not contain:

package managers
shells
network tools
text editors
other unwanted programs

Please refer to gcr.io/distroless/xxxxxx images.

Vulnerability scanning

trivy image httpd
# found 124 
trivy image httpd:alpine
# found zero!

Supply Chain Security - Scan images for known vulnerabilities

Each CVE gets a severity score from 0 to 10 (low to high).

CVE Scanner

A solution is to reduce the attack surface by removing unnecessary packages.

Trivy by aqua security

easy to install, easy to run scan

# add repo to /etc/apt/sources.list.d
sudo apt-get install wget apt-transport-https gnupg lsb-release
wget -qO - https://aquasecurity.github.io/trivy-repo/deb/public.key | sudo apt-key add -
echo deb https://aquasecurity.github.io/trivy-repo/deb $(lsb_release -sc) main | sudo tee -a /etc/apt/sources.list.d/trivy.list
sudo apt-get update
sudo apt-get install trivy

# run image scan
trivy image nginx:1.18.0
# only list vulnerabilities of critical severity level
trivy image --severity CRITICAL nginx:1.18.0
trivy image --ignore-unfixed nginx:1.18.0

docker save nginx:1.18.0 > nginx.tar
trivy image --input nginx.tar

Best practices

Continuously rescan images
K8s admission controllers to scan images
Have your own repository with pre-scanned images ready to go
Integrate scanning tool with CI/CD pipelines

Image Repository Security

image: docker.io/library/nginx

registry: docker.io, gcr.io
user/account
image/repository

Private Repository

docker login private-registry.io
docker run private-registry.io/apps/internal-app

# create credentials
kubectl create secret docker-registry regcred \
  --docker-server=private-registry.io \
  --docker-username=registry-user \
  --docker-password=registry-password \
  --docker-email=<registry-email>

# pass the image pull secret for private registry
spec:
  imagePullSecrets:
  - name: regcred

Observability - Overview

[Securing Cluster] [Minimizing Microservices Vulnerability] [Sandboxing Techniques] [MTLS encryption] [Restricting Network Access]

Even when all of the above techniques are used to secure our infrastructure, there’s absolutely no guarantee that an attack will never happen in the future.

How do we identify breaches that have already occurred in our K8s cluster? We can make use of tools such as Falco from Sysdig.

SYSCALL name
`close`
`nanosleep`
`fcntl`
`fstatfs`
`getdents64`
`exit_group`
`epoll_ctl`
`openat`

We need to analyze syscalls and filter out those that are suspicious. Attackers often want to erase any trace that they were ever in the system, so they try to delete the parts of the logs that recorded how they got in. An administrator rarely has a reason to delete recent logs, so this activity can be considered anomalous and used as an early sign of intrusion. Falco can monitor for this event and then send alerts through various notification channels.
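The filtering idea can be sketched in a few lines. This is only an illustration of the logic and not how Falco is implemented or configured: the `SyscallEvent` shape, the watched directory and the syscall set are assumptions chosen for the example.

```python
from dataclasses import dataclass

@dataclass
class SyscallEvent:
    process: str   # name of the process that made the call
    syscall: str   # syscall name, e.g. "openat" or "unlink"
    path: str      # file path the call operated on

# Assumed for the example: where logs live and which calls remove or clobber files.
LOG_DIRS = ("/var/log/",)
DESTRUCTIVE_SYSCALLS = {"unlink", "unlinkat", "rename", "renameat", "truncate"}

def is_suspicious(event: SyscallEvent) -> bool:
    """Flag calls that delete, move or truncate files under a log directory."""
    return event.syscall in DESTRUCTIVE_SYSCALLS and event.path.startswith(LOG_DIRS)

def suspicious_events(events):
    """Keep only the events worth alerting on."""
    return [event for event in events if is_suspicious(event)]

events = [
    SyscallEvent("nginx", "openat", "/var/log/nginx/access.log"),
    SyscallEvent("bash", "unlink", "/var/log/auth.log"),
    SyscallEvent("bash", "unlink", "/tmp/scratch"),
]
```

Reading or appending to a log is normal, so only the second event is flagged.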

Observability - Falco Overview and Installation

Falco kernel module - intrusive; some managed K8s service providers do not allow it
Extended Berkeley Packet Filter (eBPF) - safer and less intrusive

Install Falco as a Package

With this method, Falco is isolated from Kubernetes, so even if the cluster is compromised it can still detect and alert on suspicious behavior.

# trust the Falco repository key and add the repo
curl -s https://falco.org/repo/falcosecurity-3672BA8F.asc | apt-key add -
echo "deb https://download.falco.org/packages/deb stable main" | tee -a /etc/apt/sources.list.d/falcosecurity.list
apt update -y
# kernel headers are needed to build the Falco kernel module
apt-get install -y linux-headers-$(uname -r)
apt install -y falco
# start the falco service
systemctl start falco

Install as a DaemonSet

helm repo add falcosecurity https://falcosecurity.github.io/charts
helm repo update
helm install falco falcosecurity/falco

Observability - Using Falco to Detect Threats

# assume falco is running as a package (systemd service)
systemctl status falco
# run an nginx pod, check which node it's on, and ssh into that node
# then follow the falco logs to watch events as they are generated
journalctl -fu falco

Service Mesh - Monolithics vs Microservices

Service Mesh

Service Mesh - Istio

Service Mesh - Security in Istio

Service Mesh - Istio Security Architecture

Lab

K8s PKI - certificate creation

K8s PKI - view certificate details

Lab

Connectivity - TLS intro

Connectivity - TLS basics

Connectivity - TLS in Kubernetes

Connectivity - Mutual TLS

Admission Controllers

[kubectl] ➡️ [Authentication] ➡️ [Authorization] ➡️ [Admission Controllers] ➡️ [Create Pod]

Authorization is achieved through role-based access control using Role and RoleBinding objects.

Admission controllers let us enforce how a cluster is used, beyond deciding who may call which API. For example:

Only permit images from a certain registry
Do not permit containers to run as the root user
Only permit certain capabilities
Require that every Pod has labels
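Checks like these inspect the content of the request rather than who sent it. A minimal sketch of that validation logic is below, written as a plain function over a Pod manifest. It is not a real admission webhook; `review_pod` and the allowed registry are assumptions for illustration.

```python
def review_pod(pod: dict, allowed_registries=("private-registry.io/",)) -> list[str]:
    """Return the list of policy violations; an empty list means 'admit'."""
    violations = []

    # Policy: every Pod must carry labels.
    if not pod.get("metadata", {}).get("labels"):
        violations.append("pod has no labels")

    for container in pod.get("spec", {}).get("containers", []):
        image = container.get("image", "")
        # Policy: images must come from an allowed registry.
        if not image.startswith(tuple(allowed_registries)):
            violations.append(f"image {image!r} is not from an allowed registry")
        # Policy: containers must not explicitly run as root (UID 0).
        if container.get("securityContext", {}).get("runAsUser") == 0:
            violations.append(f"container {container.get('name')!r} runs as root")

    return violations

good_pod = {
    "metadata": {"labels": {"app": "internal-app"}},
    "spec": {"containers": [{"name": "app", "image": "private-registry.io/apps/internal-app"}]},
}
bad_pod = {
    "metadata": {},
    "spec": {"containers": [{"name": "web", "image": "nginx", "securityContext": {"runAsUser": 0}}]},
}
```

`review_pod(good_pod)` returns an empty list, while `bad_pod` trips all three checks.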

Examples of Admission Controllers:

AlwaysPullImages
DefaultStorageClass
EventRateLimit
NamespaceExists: rejects a request (such as a kubectl run command) if the specified namespace does not exist.
NamespaceAutoProvision: creates the namespace automatically if it does not exist.
many more examples …

View Enabled Admission Controllers

kube-apiserver -h | grep enable-admission-plugins

Enable Admission Controllers

To add new admission controllers, update the --enable-admission-plugins flag in the kube-apiserver manifest (on kubeadm clusters this is typically /etc/kubernetes/manifests/kube-apiserver.yaml).

apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-apiserver
    - --authorization-mode=Node,RBAC
    - --advertise-address=172.17.0.107
    - --allow-privileged=true
    - --enable-bootstrap-token-auth=true
    # comma-separated list; NamespaceAutoProvision is added here as an example
    - --enable-admission-plugins=NodeRestriction,NamespaceAutoProvision
    - --disable-admission-plugins=DefaultStorageClass
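Because the flag takes a single comma-separated list, enabling one more plugin means editing that list rather than adding a second flag. A small sketch of that edit on the container's command list follows; `enable_plugin` is a hypothetical helper, and in practice you would edit the manifest file and let the kubelet restart the API server.

```python
FLAG = "--enable-admission-plugins="

def enable_plugin(command: list[str], plugin: str) -> list[str]:
    """Return a copy of the kube-apiserver command with `plugin` enabled."""
    updated, found = [], False
    for arg in command:
        if arg.startswith(FLAG):
            found = True
            # Split the comma-separated value, dropping empty entries.
            plugins = [p for p in arg[len(FLAG):].split(",") if p]
            if plugin not in plugins:
                plugins.append(plugin)
            arg = FLAG + ",".join(plugins)
        updated.append(arg)
    if not found:
        # The flag was absent, so add it with just this plugin.
        updated.append(FLAG + plugin)
    return updated

command = [
    "kube-apiserver",
    "--authorization-mode=Node,RBAC",
    "--enable-admission-plugins=NodeRestriction",
]
```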

Lab - admission controllers

Compliance and security framework

Compliance Frameworks

Defines what needs to be done to meet legal, regulatory, or industry standards

Security compliance frameworks provide the guidelines and standards that help us protect sensitive data, such as personal info, health records, payment details, to ensure system integrity and meet legal obligations.

To avoid data breaches that lead to significant fines and loss of trust, organizations need to follow structured guidelines known as compliance frameworks. Compliance frameworks are guidelines and best practices designed to help organizations meet regulatory requirements and ensure data security and privacy.

Examples include GDPR, HIPAA, PCI DSS, NIST and the CIS Benchmarks.

General Data Protection Regulation (GDPR)

GDPR is a comprehensive data protection law enacted by the European Union to safeguard individuals’ personal data and uphold their privacy rights. It is most relevant when handling the personal data of people in the EU.

Health Insurance Portability and Accountability Act (HIPAA)

HIPAA is a US regulation that protects sensitive patient health information. If our web application handles patient data, we must ensure that all data transfers between the front end, back end and database are encrypted using TLS. We also need to implement access controls that prevent unauthorized access to health data, and make sure our Kubernetes secrets are securely configured for application use. HIPAA protects health data, known as PHI (protected health information).

Payment Card Industry Data Security Standard (PCI DSS)

We must make sure cardholder data is encrypted both in transit and at rest. Strong access controls must be in place, and we should regularly monitor and audit access to payment data to comply with PCI DSS.

National Institute of Standards and Technology (NIST)

NIST provides a framework for improving the security and resilience of information systems. In practice this means conducting regular risk assessments to identify potential vulnerabilities, and implementing security controls such as firewalls, intrusion detection systems and regular security audits to mitigate those risks. Key component: cybersecurity standards and guidelines for protecting information systems.

Center for Internet Security (CIS)

CIS benchmarks offer best practices for securing IT systems and data.

securing the configuration of the API server, etcd, kubelet, controller manager and scheduler, plus authentication and authorization
enforcing RBAC and other access control mechanisms, logging and monitoring, network policies, pod security

Tools like kube-bench by Aqua Security can help check whether K8s is deployed securely by running the checks documented in the CIS K8s benchmark. Key component: CIS provides benchmarks for different systems, including K8s.

Recommended Tools

OneTrust, TrustArc - GDPR
Compliancy Group, POBOX - HIPAA
Prisma Cloud - PCI
NIST SRE Toolkit - NIST
Kube-bench - CIS

Threat Modelling Frameworks

Specifies how to achieve it by identifying specific threats and suggesting mitigations to secure the system

STRIDE
MITRE ATT&CK

STRIDE

STRIDE was developed by Microsoft. It identifies six categories of threats.

S - Spoofing
- an attacker tries to impersonate a legitimate user to access our server
- mitigated by implementing strong authentication mechanisms
T - Tampering
- an attacker attempts to alter data being processed by the backend server, whether in transit or at rest
- mitigated by ensuring data integrity with encryption and digital signatures
R - Repudiation
- users claim that they didn’t do something or were not responsible for doing something
- addressed by keeping comprehensive audit logs that record user actions and using non-repudiation techniques like digital signatures
I - Information disclosure
- someone obtains information that they are not authorized to access
- prevented by encrypting data in transit and at rest
D - Denial of service
- an attacker might try to overload our system, causing it to crash
- mitigated by setting up rate limiting and resource quotas in our K8s environment
E - Elevation of privilege
- an attacker might gain unauthorized access to higher privilege levels
- prevented by implementing strict RBAC policies
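The six categories and their mitigations can be kept as a small lookup table, which makes it easy to see which categories a system has not addressed yet. This is only a study aid; the `uncovered_threats` helper is made up for illustration.

```python
# Each STRIDE category mapped to the mitigation listed in the notes above.
STRIDE = {
    "Spoofing": "strong authentication mechanisms",
    "Tampering": "encryption and digital signatures",
    "Repudiation": "audit logs and digital signatures",
    "Information disclosure": "encryption in transit and at rest",
    "Denial of service": "rate limiting and resource quotas",
    "Elevation of privilege": "strict RBAC policies",
}

def uncovered_threats(mitigated: set[str]) -> list[str]:
    """List the STRIDE categories that have no mitigation in place yet."""
    return [threat for threat in STRIDE if threat not in mitigated]
```

For example, `uncovered_threats({"Spoofing", "Tampering"})` lists the four remaining categories in STRIDE order.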

MITRE ATT&CK

MITRE ATT&CK is a global knowledge base of real-world tactics and techniques that is used to build threat models. This framework focuses on what attackers aim to do (tactics) and how they do it (techniques).

Initial Access: how attackers enter the cluster, such as exploiting weak authentication
Execution: running unauthorized commands or code, like deploying malicious containers
Persistence: maintaining access, for example by creating new users or modifying roles
Privilege Escalation: gaining higher access, such as exploiting misconfigured RBAC
Defense Evasion: hiding activities, like disabling logs or concealing workloads

Summary

Threat modeling identifies, assesses and mitigates specific security threats
STRIDE stands for spoofing, tampering, repudiation, information disclosure, denial of service and elevation of privilege
STRIDE helps uncover vulnerabilities with tailored defenses for our environment
Use attack trees to visualize and analyze different attack scenarios
Implement security controls to mitigate prioritized threats identified by STRIDE
Integrate threat modeling into development to address issues early
Leverage the MITRE ATT&CK framework to understand adversary tactics and techniques and strengthen defenses against specific threats

Supply Chain Compliance

Key components of Supply Chain Security

Artifact: verify what you deploy

In K8s, artifacts are signed during the release process using keyless signing tools like Cosign. This ensures the artifacts haven’t been tampered with.

cosign sign $IMAGE # generates ephemeral keys, retrieves a signed certificate and asks you to confirm

You can also use the cosign utility to verify a signed artifact, such as a release binary

cosign verify-blob "$BINARY" \
  --signature "$BINARY".sig \
  --certificate "$BINARY".cert \
  --certificate-identity [email protected] \
  --certificate-oidc-issuer https://accounts.google.com

Metadata: understanding what’s inside
- Metadata describes what’s in your artifacts and where they come from
- A critical type of metadata is the SBOM (software bill of materials)
- An SBOM is commonly written in a standard format such as SPDX
- An SBOM details all the components, libraries and dependencies along with their versions and sources
- We can use Syft (a CLI tool and Go library) to generate an SBOM
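To make "understanding what’s inside" concrete, the sketch below lists the components recorded in an SPDX JSON document. The sample is heavily trimmed and invented for illustration (a real SPDX file carries more required fields), but `packages`, `name` and `versionInfo` are field names from the SPDX 2.x JSON format.

```python
import json

# Trimmed, made-up SBOM: real SPDX documents contain many more fields.
SAMPLE_SBOM = """
{
  "spdxVersion": "SPDX-2.3",
  "name": "example-image",
  "packages": [
    {"name": "zlib", "versionInfo": "1.2.13", "downloadLocation": "NOASSERTION"},
    {"name": "openssl", "versionInfo": "3.0.0", "downloadLocation": "NOASSERTION"},
    {"name": "internal-lib", "downloadLocation": "NOASSERTION"}
  ]
}
"""

def list_components(sbom: dict) -> list[tuple[str, str]]:
    """Return (name, version) pairs for every package in the SBOM, sorted by name."""
    return sorted(
        (pkg["name"], pkg.get("versionInfo", "unknown"))
        for pkg in sbom.get("packages", [])
    )

components = list_components(json.loads(SAMPLE_SBOM))
```

A list like this is what a vulnerability scanner or a policy check would then match against known CVEs.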

Attestations: building trust

Attestations are signed statements that verify metadata, such as SBOMs, provenance data or vulnerability reports, as coming from an authentic, trustworthy source
Kubernetes release team generates the SBOM and signs it using a private key, creating an attestation

cosign sign --key <PRIVATE_KEY> sbom.k8s.io/v1.27.4/release.spdx > sbom.attestation

You verify the attestation using their public certificates with the cosign command

cosign verify-attestation \
  --key <PUBLIC_KEY> \
  --certificate-identity [email protected] \
  --certificate-oidc-issuer https://accounts.google.com \
  sbom.k8s.io/v1.27.4/release.spdx

Think of the SHA checksum as checking that a sealed package hasn’t been damaged during shipping
The attestation is like verifying the package’s seal and the sender’s signature to confirm it’s from a trusted vendor and contains what they claim it has
Attestation goes further than a checksum: it is a framework for defining and verifying the entire supply chain process
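The checksum half of that analogy is easy to show. The sketch below only proves that the bytes match a published digest; it says nothing about who published them, which is the gap that signatures and attestations fill. The helper names are made up for illustration.

```python
import hashlib
import hmac

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()

def matches_checksum(data: bytes, expected_hex: str) -> bool:
    """Check an artifact against a published checksum (integrity only)."""
    # compare_digest avoids leaking where the two strings first differ.
    return hmac.compare_digest(sha256_hex(data), expected_hex.lower())

artifact = b"release-binary-contents"
published = sha256_hex(artifact)  # stands in for the checksum on the release page
```

Any change to the artifact, even a single byte, makes `matches_checksum` return False.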

Policies: automating compliance
- Goal: automate the deployment and verification of these components on K8s, so that only verified, trustworthy artifacts are deployed
- Policy - the rules that ensure compliance and security standards are enforced automatically
- Policies help prevent insecure or non-compliant components from being deployed
- Sigstore's policy-controller integrates with admission controllers

component	steps
Artifact	The binaries and container images are signed using Cosign
Metadata	The SBOM details all the components and their origins, helping you identify risks
Attestations	The SBOM and other metadata are signed to ensure trustworthiness
Policies	Finally Admission controllers verify these signatures and enforce compliance before deployment

Automation and Tooling

The Cloud Native Security Map builds on top of the Cloud Native Security Whitepaper by SIG-Security.

Lifecycle - 4 phases

Develop
- code (code, Dockerfile, K8s manifests, IaC) -> commit
  - OSS-Fuzz by Google (fuzz testing)
  - Snyk Code (VS Code plugin)
  - fabric8
  - KubeLinter

    kube-linter lint pod.yaml
Distribute
- Build pipelines -> container registry
Deploy
Runtime
