01 Kubestronaut KCNA & KCSA
KCNA
Container and Orchestration
The matrix from hell
- Web server, database, messaging, orchestration
- OS, hardware infrastructure
- libraries, dependencies
Module 2 Quiz (failed ones only)
- How can you specify a specific runtime endpoint when using the CRI control tool (crictl)?
  Set the CONTAINER_RUNTIME_ENDPOINT environment variable, or use the --runtime-endpoint command-line option.
- Why do Docker images continue to work seamlessly in Kubernetes even after the removal of Docker support?
  Because Docker images adhere to the Open Container Initiative (OCI) standard.
- What is the benefit of using the nerdctl tool over the ctr command-line tool?
  nerdctl provides access to the newest features implemented in containerd.
- Which component in Kubernetes is responsible for distributing containers across multiple nodes?
  Scheduler
- What is containerization?
  Running microservices in their own containers
- Which container technology does Docker primarily utilize for its containers?
  LXC (historically; current Docker releases use containerd and runc)
- How do containers differ from virtual machines (VMs) with respect to their operating system (OS) kernel?
  Containers share the same OS kernel as the host system, but VMs have their own separate OS kernel.
- Which container orchestration technology is known for being a bit difficult to set up and get started, but provides several features to customize the deployment?
  Kubernetes
- Which command is used to view information about the Kubernetes cluster?
  kubectl cluster-info
- Who maintains and develops the CRI control command line utility?
  Kubernetes community
KCSA
Infrastructure Security
summary
- Isolate critical applications on separate servers for better security
- Restrict Docker port access with firewall rules and policies
- Apply least privilege to containers and secure Kubernetes dashboard
- Store sensitive data securely using Kubernetes secrets and RBAC
- Encrypt etcd data and use TLS authentication for protection
Kubernetes Isolation Techniques
Multi-tenancy with namespaces: different teams or projects can use the same cluster without interfering with each other. Each team or project has its own namespace.
Network Policies allow us to define rules about which components can talk to each other.
RBAC restricts what each user can do. For example, a developer may have read-only access to the production namespace and full access to the development namespace; Team A only has access to resources in namespace A, and Team B only to resources in namespace B.
Resource Quotas and Limits ensure that each component gets a fair share of resources.
Workload and Application Code Security
Container - Teams use containerization for its portability, scalability, consistency, isolation and security. Popular artifact repositories such as JFrog, Nexus Repository and GitHub Packages support image signing, enhanced access control and integrated vulnerability scanning.
Code - To guard against attacks such as SQL injection, use SonarQube to detect problematic code and an OWASP dependency checking tool to find vulnerable dependencies. Gaining insight into containerized applications is also essential, and this is where tools like Sysdig come in. Sysdig provides deep visibility into containerized environments, which lets you monitor resource usage, detect anomalies and troubleshoot issues in real time. With Sysdig, you can quickly pinpoint which container or process is causing high CPU or memory usage, track down problematic workloads and even inspect live system calls.
4 C’s | best practices |
---|---|
Code | |
Container | Restrict Images, Supply Chain, Sandboxing, Privileged |
Cluster | Authentication Authorization, Admission, Network Policy |
Cloud | Data Center, Network, Servers |
API Server
Who can access the API server?
- Files - Username and passwords
- Files - username and tokens
- Certificates
- External Authentication providers - LDAP
- Service Accounts
What can they do once they have access?
- RBAC authorization
- ABAC authorization
- Node authorization
- Webhook Mode
Securing Controller Manager and Scheduler
summary
- Isolate controller manager and scheduler on separate dedicated nodes
- Use RBAC to limit permissions of controller manager and scheduler
- Encrypt communications between components using TLS for security
- Enable audit logging to track and review all actions taken
- Secure default settings and protect the configuration files
- Run the latest version of Kubernetes
- Regularly scan for vulnerabilities
Securing Kubelet
Originally, kubelet parameters were passed as command-line flags in the kubelet service configuration file. With the release of version 1.10, most of these parameters moved to a separate file, the kubelet config file, for ease of deployment and configuration management. The object defined in that file is of kind KubeletConfiguration.
In the kubelet service configuration file, the path to this file is passed with the --config command-line argument. Within the file, parameters use camel case: the dashes that separated words in the flag form are removed, the words are joined without spaces, and the first letter of each word except the first is capitalized. For example, --cluster-domain becomes clusterDomain.
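The naming rule described above can be sketched in a few lines of Python. This is only an illustration of the camel-case convention; the helper name `flag_to_config_key` is hypothetical, and not every kubelet flag maps one-to-one to a top-level field (for instance, --anonymous-auth corresponds to the nested authentication.anonymous.enabled setting).

```python
def flag_to_config_key(flag: str) -> str:
    """Convert a dashed flag name to its camelCase spelling (illustrative helper)."""
    words = flag.lstrip("-").split("-")
    # Keep the first word as-is and capitalize the first letter of every later word.
    return words[0] + "".join(w[:1].upper() + w[1:] for w in words[1:])


for flag in ("--cluster-domain", "--healthz-port", "--rotate-certificates"):
    print(flag, "->", flag_to_config_key(flag))
```

The conversion is purely textual, so it is useful for recognizing a field name, not for validating that the field exists.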
If a parameter is specified both as a command-line flag in the service configuration file and in the kubelet config file, the command-line flag overrides the value in the file.
When there are a large number of worker nodes, you do not have to create these kubelet config files by hand on each node: the kubeadm tool configures the kubelet files on a node automatically when you run the kubeadm join command.
View kubelet options
Kubelet - Security
The kubelet serves on two ports. Port 10250 serves the API that allows full access. Port 10255 serves an API that allows unauthenticated, unauthorized read-only access. By default, the kubelet allows anonymous access to its API.
port | description |
---|---|
10250 | Serves API that allows full access |
10255 | Serves API that allows unauthenticated read-only access |
Anyone who knows the IP address of these hosts can use these APIs to do anything they permit, such as viewing existing pods, creating new pods, exec-ing into existing pods, or viewing usage metrics, stats and logs.
How do we prevent this and secure the kubelet? Any request that comes to the kubelet is first authenticated and then authorized. Authorization decides which areas of the API the user can access and which operations they can perform.
Set --anonymous-auth=false
You can disable anonymous access by setting the --anonymous-auth flag to false in the kubelet service configuration file, or by setting authentication.anonymous.enabled to false in the KubeletConfiguration file.
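As a rough illustration of the settings discussed above, the following Python sketch inspects a KubeletConfiguration that has already been loaded into a dictionary. The function name `kubelet_findings` is hypothetical. The sketch deliberately treats a missing authentication or authorization setting as insecure so that it must be set explicitly, and treats a missing readOnlyPort as 0 (closed); both are assumptions of this example, not statements about kubelet defaults.

```python
def kubelet_findings(config: dict) -> list:
    """Return a list of risky settings found in a KubeletConfiguration dict."""
    findings = []
    anonymous = config.get("authentication", {}).get("anonymous", {})
    # Assumption of this sketch: an unset value counts as "enabled".
    if anonymous.get("enabled", True):
        findings.append("anonymous access is not explicitly disabled")
    # Assumption of this sketch: an unset mode counts as AlwaysAllow.
    if config.get("authorization", {}).get("mode", "AlwaysAllow") == "AlwaysAllow":
        findings.append("authorization mode is AlwaysAllow or unset")
    # A non-zero readOnlyPort (for example 10255) exposes the read-only API.
    if config.get("readOnlyPort", 0) != 0:
        findings.append("read-only port is open")
    return findings


hardened = {
    "authentication": {"anonymous": {"enabled": False}},
    "authorization": {"mode": "Webhook"},
    "readOnlyPort": 0,
}
print(kubelet_findings(hardened))
print(kubelet_findings({"readOnlyPort": 10255}))
```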
Summary
- Regularly update and patch container runtimes to fix vulnerabilities
- Run containers with least privileges to minimize security risks
- Use read-only filesystems to prevent unauthorized filesystem modifications
- Limit resource usage to prevent denial-of-service (DoS) attacks
- Apply security profiles like SELinux and AppArmor for protection
- Transition to supported runtimes like containerd or CRI-O
- Implement monitoring and logging for runtime behavior detection
Securing kube-proxy
Securing communication
Secure communication between kube-proxy and the API server is crucial.
In the kubeconfig that kube-proxy uses, the file path at clusters[*].cluster.certificate-authority is used to validate the API server’s TLS certificate. kube-proxy itself uses a service-account token to authenticate to the kube-apiserver. We need to ensure that this communication is encrypted using TLS to prevent eavesdropping and tampering.
Another step is to make sure audit logs are enabled. In K8s, audit logs record the actions performed by a service (here, kube-proxy) so that changes can be tracked and unauthorized access identified. Reviewing these logs regularly helps detect and address suspicious activity.
Summary
- Secure kube-proxy config file with strict permissions
- Encrypt API server communication using TLS and service accounts
- Run kube-proxy with least privileges necessary
- Implement network policies for traffic control (If necessary)
- Use logging and monitoring for detecting anomalies
- Regularly update and patch kube-proxy for security
- Enable audit logs to track kube-proxy actions
Pod Security
Pod Security Policy (PSP) was removed in Kubernetes 1.25 and replaced by Pod Security Admission (PSA) and Pod Security Standards (PSS); PSA was promoted to stable in version 1.25.
Basic idea of how PSP works:
When PSP is enabled, the Pod Security Policy Admission Controller observes all Pod creation requests and validates the configuration against the set of pre-configured rules. If it detects a violation of the rule, then that request to create the Pod is rejected and the user receives an error message.
To enable the PodSecurityPolicy admission controller, add PodSecurityPolicy to the --enable-admission-plugins flag of the kube-apiserver.
Define Pod Security Policy
[kubectl] ➡️ [Authentication] ➡️ [Authorization] ➡️ [Admission Controllers - PodSecurityPolicy] ➡️ Create POD
Challenges
PodSecurityPolicy is not enabled by default. If you enable it in a cluster, you must make sure that appropriate policies are created in advance; otherwise pod creation requests are rejected. It is also complex to operate, because each PSP has to be authorized for users or service accounts through Roles and RoleBindings.
Securing etcd
Enabling Data Encryption at Rest
- Specifies the type of k8s object: EncryptionConfiguration
- Specifies the k8s API version: apiserver.config.k8s.io/v1
- Lists the resources to be encrypted, like secrets
- Defines encryption providers, using aescbc as the algorithm
Generate the encryption key
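The configuration described in the list above, together with the key generation step, might be sketched as follows. This is a minimal illustration, not the author's original file: the key name `key1` and the function name are placeholders. It generates a random 32-byte key, base64-encodes it, and prints the EncryptionConfiguration as JSON, which can be saved as the configuration file because JSON is valid YAML.

```python
import base64
import json
import secrets


def build_encryption_config(key_name: str = "key1") -> dict:
    """Build an EncryptionConfiguration that encrypts Secrets with aescbc."""
    # 32 random bytes, base64-encoded, for the aescbc provider.
    key = base64.b64encode(secrets.token_bytes(32)).decode("ascii")
    return {
        "apiVersion": "apiserver.config.k8s.io/v1",
        "kind": "EncryptionConfiguration",
        "resources": [
            {
                "resources": ["secrets"],
                "providers": [
                    {"aescbc": {"keys": [{"name": key_name, "secret": key}]}},
                    # identity (no encryption) as a fallback so data written
                    # before encryption was enabled can still be read.
                    {"identity": {}},
                ],
            }
        ],
    }


config = build_encryption_config()
print(json.dumps(config, indent=2))
```

Because the file contains the raw encryption key, it should be readable only by the account that runs the API server.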
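The EncryptionConfiguration is not created with kubectl apply; it is a file read by the kube-apiserver. Save it on the control plane node, then update the kube-apiserver pod specification file to pass its path with the --encryption-provider-config flag and mount the file into the pod.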
Use TLS for Secure Communication (data in transit)
- --cert-file: Specifies the server’s certificate file for secure identity presentation
- --key-file: Specifies the server’s key file used with the certificate
- --client-cert-auth: Enables client certificate authentication to allow only trusted clients
- --trusted-ca-file: Specifies the CA certificate for verifying client certificates
- --peer-cert-file, --peer-key-file, --peer-client-cert-auth, --peer-trusted-ca-file: the equivalent settings for secure peer-to-peer communication between etcd members
Regular Backups
- Specify the etcdctl API version to use (version 3)
- Command to take a snapshot of the etcd data
- Path where the backup file will be saved
- Specify the etcd endpoints to connect to
- Path to the CA certificate used to verify the etcd server certificate
- Path to the client certificate used to authenticate the client
- Path to the client key used to authenticate the client
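The pieces listed above can be assembled into a single etcdctl invocation. The Python sketch below only builds and prints the command; it does not run it. The backup path, endpoint and certificate paths are assumptions (the certificate paths follow a typical kubeadm layout) and must be adapted to the actual cluster.

```python
import shlex


def etcd_snapshot_command(backup_path, endpoint, cacert, cert, key):
    """Return the environment and argv for an etcd snapshot backup."""
    env = {"ETCDCTL_API": "3"}  # use version 3 of the etcdctl API
    argv = [
        "etcdctl", "snapshot", "save", backup_path,  # snapshot command and output file
        f"--endpoints={endpoint}",  # etcd endpoint to connect to
        f"--cacert={cacert}",       # CA certificate that verifies the etcd server
        f"--cert={cert}",           # client certificate
        f"--key={key}",             # client key
    ]
    return env, argv


env, argv = etcd_snapshot_command(
    "/opt/backup/etcd-snapshot.db",          # hypothetical backup location
    "https://127.0.0.1:2379",
    "/etc/kubernetes/pki/etcd/ca.crt",       # assumed kubeadm-style paths
    "/etc/kubernetes/pki/etcd/server.crt",
    "/etc/kubernetes/pki/etcd/server.key",
)
print(" ".join(f"{k}={v}" for k, v in env.items()), shlex.join(argv))
```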
Summary
- Enable data encryption at rest for etcd security
- Use TLS to secure etcd communication
- Regularly back up etcd data for recovery
Secure container networking
Implement networkPolicies to restrict pod to pod communication
Use service meshes
- Istio, Linkerd to enforce mutual TLS, to ensure that all communication between services is encrypted and authenticated, which significantly reduces the risk of man-in-the-middle attacks
Encrypting Network Traffic
- Use a CNI plugin that supports traffic encryption, such as Calico (WireGuard) or Cilium (IPsec or WireGuard), so that all data transferred between nodes is encrypted
Isolating sensitive workloads
- Use namespaces and network policies to help contain potential security breaches
Summary
- Implement network policies to control pod traffic flow
- Use service meshes for encrypted, secure service communication
- Encrypt network traffic between containers using IPsec or WireGuard
- Isolate sensitive workloads with namespaces and network policies
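A common starting point for the network policies mentioned above is a default-deny rule per namespace, with specific allow rules added afterwards. The sketch below builds such a NetworkPolicy as a Python dictionary and prints it as JSON (which kubectl accepts, since JSON is valid YAML). The policy name and the `payments` namespace are illustrative assumptions.

```python
import json


def default_deny_ingress(namespace: str) -> dict:
    """Build a NetworkPolicy that blocks all ingress to pods in a namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-ingress", "namespace": namespace},
        "spec": {
            # An empty podSelector selects every pod in the namespace.
            "podSelector": {},
            # Ingress is listed as a policy type but no ingress rules are given,
            # so no inbound traffic is allowed to the selected pods.
            "policyTypes": ["Ingress"],
        },
    }


policy = default_deny_ingress("payments")
print(json.dumps(policy, indent=2))
```

Network policies only take effect when the cluster's CNI plugin enforces them.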
Client Security - kubectl proxy and port forward
User details and credentials are stored in the kubeconfig file (~/.kube/config).
One option is to call the API server directly on port 6443 with curl, passing the certificate files.
Alternative option: start the kubectl proxy client
kubectl proxy launches a proxy service locally, on port 8001 by default, and uses the credentials and certificates from your kubeconfig file to access the cluster. By default the proxy listens only on the loopback address, so it is reachable only from your own machine.
kubectl port-forward
Summary
- Encrypt data at rest and in transit for protection
- Implement RBAC to control access to storage resources
- Use Storage Classes to enforce security and performance policies
- Regularly back up data and have a disaster recovery plan
- Monitor and audit storage access for compliance and security
Pod Security Standards and Pod Security Admissions
KEP 2579 PSP replacement
PodSecurityAdmissionController
Configure PSA
PSA is namespace scoped: you apply a PSA mode and a security standard to a namespace by labeling it.
One of PSA’s goals is to simplify the whole implementation. Instead of requiring users to write their own profiles, a few built-in profiles are defined: privileged, baseline, restricted.
Three security policies defined by Pod Security Standards
Profile | Description |
---|---|
Privileged | Unrestricted policy. It allows the widest possible level of permissions, almost like there are no restrictions. Useful for system-wide programs like logging agents, CNIs, storage drivers |
Baseline | Minimally restrictive policy. |
Restricted | Heavily restricted policy. It follows the pod hardening best practices. |
A mode defines what action the control plane takes if the policy is violated. If it’s set to enforce, then the pod creation request is rejected. If it’s set to audit, then the pod creation is allowed and an entry is added to the audit logs. And warn mode triggers a user-facing warning.
Three modes of Pod Security Admission
Mode | On Violation |
---|---|
enforce | Reject pod |
audit | Record in the audit logs |
warn | Trigger user-facing warning |
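Combining the two tables above, a namespace label has the form pod-security.kubernetes.io/MODE=PROFILE. The Python sketch below validates a mode and profile and builds the matching kubectl label command; it only prints the command. The helper names and the `payments` namespace are hypothetical.

```python
MODES = ("enforce", "audit", "warn")
PROFILES = ("privileged", "baseline", "restricted")


def psa_label(mode: str, profile: str) -> str:
    """Return the namespace label that applies a PSS profile in a PSA mode."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    if profile not in PROFILES:
        raise ValueError(f"unknown profile: {profile}")
    return f"pod-security.kubernetes.io/{mode}={profile}"


def label_command(namespace: str, mode: str, profile: str) -> list:
    """Return the kubectl argv that labels a namespace accordingly."""
    return ["kubectl", "label", "namespace", namespace, psa_label(mode, profile)]


print(" ".join(label_command("payments", "enforce", "restricted")))
print(" ".join(label_command("payments", "warn", "restricted")))
```

Several modes can be set on the same namespace, for example enforcing baseline while warning on restricted, which is a gentle way to tighten a policy over time.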
Transitioning from PSP to a new pod security solution
PodSecurityPolicy was deprecated in Kubernetes 1.21 and removed in Kubernetes 1.25. Two alternatives are available:
- Policy as Code (PAC)
- notable examples: Kyverno, OPA/Gatekeeper, jsPolicy
- Pod Security Standards with Pod Security Admission (PSS+PSA)
Configure the built-in admission controller
Three types of exemptions: Usernames, RuntimeClasses, Namespaces
Privileged
Example case is a container that needs to manage the host’s network stack.
Baseline
Example case is an API server that requires limited security permissions without privilege escalation.
Restricted
Example case is a payment processing app that handles sensitive data, which is under restricted level to minimize the attack surface.
Example of admission configuration file
Isolation and Segmentation - ResourceQuota & Limits
By default, Kubernetes sets no CPU or memory requests or limits on pods. This means any pod can consume as many resources as it wants on any node and starve other pods or processes on that node of resources.
LimitRange
We can ensure that every pod created has some defaults set by LimitRanges. This can help you define default values to be set for containers in pods that are created without a request or limit specified in the pod definition file.
ResourceQuota
To restrict the total amount of resources that can be consumed by applications deployed in a Kubernetes cluster, we can create quotas at the namespace level. ResourceQuota is a namespace-level object that sets hard limits on total requests and limits.
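To make the two objects concrete, the sketch below builds a LimitRange (per-container defaults) and a ResourceQuota (namespace totals) as Python dictionaries and prints them as JSON. The object names, the `dev` namespace and all quantities are placeholders chosen for illustration, not recommended values.

```python
import json


def limit_range(namespace: str) -> dict:
    """Per-container defaults applied when a pod omits requests or limits."""
    return {
        "apiVersion": "v1",
        "kind": "LimitRange",
        "metadata": {"name": "container-defaults", "namespace": namespace},
        "spec": {
            "limits": [
                {
                    "type": "Container",
                    "defaultRequest": {"cpu": "250m", "memory": "256Mi"},  # default requests
                    "default": {"cpu": "500m", "memory": "512Mi"},         # default limits
                }
            ]
        },
    }


def resource_quota(namespace: str) -> dict:
    """Hard caps on the sum of requests and limits across the namespace."""
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": "compute-quota", "namespace": namespace},
        "spec": {
            "hard": {
                "requests.cpu": "4",
                "requests.memory": "4Gi",
                "limits.cpu": "10",
                "limits.memory": "10Gi",
            }
        },
    }


for manifest in (limit_range("dev"), resource_quota("dev")):
    print(json.dumps(manifest, indent=2))
```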
Audit logs
All requests made to the Kubernetes cluster have to go through the API server. When we send a request to the Kubernetes cluster, such as for creating a new Nginx pod, it’s the API server that receives this request. As soon as a request is made to the API server, it goes through what is known as the request received stage.
In the request received stage, events are generated, irrespective of whether the request is valid or not. Once the request is authenticated, validated and authorized, another event called response started is generated.
The response started event applies to requests that take some time to complete, such as when using the --watch flag with the kubectl get command to continuously observe the state of objects. Once a request has been completed and the response body has been sent back, the request reaches the response complete stage.
Finally, if the API server hits an internal error (a panic) while handling a request, the request goes through the panic stage. Each of the stages we have seen here (request received, response started, response complete and panic) generates events that can be recorded by the API server.
[RequestReceived] –> [ResponseStarted] –> [ResponseComplete] –> [Panic]
1. Create a Policy Object
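For example, only record when the pod called webapp is deleted in the prod namespace.
The level field takes one of four values:
- None: no events are logged for matching requests
- Metadata: only the metadata (timestamp, username, resource, verb) is logged; this is the least verbose logging level
- Request: the metadata and the request body are logged
- RequestResponse: the metadata, the request body and the response body are logged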
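A policy for the example above (deletion of the webapp pod in the prod namespace) could look roughly like the following. This is a sketch under assumptions rather than the original policy: the function name is hypothetical, the chosen level is only one of the four options, and omitting the RequestReceived stage is an optional choice made here to reduce noise. The output is JSON, which can be saved as the policy file because JSON is valid YAML.

```python
import json

LEVELS = ("None", "Metadata", "Request", "RequestResponse")


def pod_delete_audit_policy(namespace: str, pod_name: str, level: str = "RequestResponse") -> dict:
    """Build an audit Policy that records deletion of one named pod."""
    if level not in LEVELS:
        raise ValueError(f"unknown audit level: {level}")
    return {
        "apiVersion": "audit.k8s.io/v1",
        "kind": "Policy",
        # Skip events for the RequestReceived stage (optional, assumed here).
        "omitStages": ["RequestReceived"],
        "rules": [
            {
                "level": level,
                "namespaces": [namespace],
                "verbs": ["delete"],
                # The empty group "" is the core API group, where pods live.
                "resources": [
                    {"group": "", "resources": ["pods"], "resourceNames": [pod_name]}
                ],
            }
        ],
    }


policy = pod_delete_audit_policy("prod", "webapp")
print(json.dumps(policy, indent=2))
```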
2. Enable auditing in kube apiserver
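As of version 1.20, two types of backends are supported: a log backend that writes audit events to a file on the control plane node, and a webhook backend that sends them to a remote webhook such as a Falco service. The log backend is configured on the kube-apiserver with flags such as --audit-policy-file and --audit-log-path.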
Kubernetes Threat Model
Kubernetes Trust Boundaries and Data Flow
Threat modeling is a process that helps you find potential threats, understand their impact and put measures in place.
Trust Boundaries
Isolating parts of the system and enforcing specific measures on each part helps us manage and reduce security risks, because a breach in one part of the system does not compromise the entire application. These isolated areas are called trust boundaries.
Cluster boundaries ➡️ Node boundary ➡️ Namespace boundary ➡️ Pod boundary ➡️ Container Boundary
Exploring Data Flow in a Multi-Tier Application
- From user to frontend-pod/nginx: Secure with measures like HTTPS and authentication to protect user data right from the start
- From nginx to authentication service: Secure communication channels and proper API authentication
- From backend-svc to db-pod: Interaction with the database must be tightly controlled and encrypted to prevent unauthorized access and protect sensitive information
- Backend communication: use networkPolicies to prevent unauthorized inter-pod communication
Threat actors
Entities that pose threats to our system, such as external attackers, compromised containers and malicious users.
Persistence
Persistence: The ability of an attacker to maintain access to a compromised system even after reboots, updates or other interruptions
Mitigating Persistence Risks
- Implementing role based access controls
- Restrict access to Secrets
- Hardening pod security
- including preventing the use of privileged containers and enforcing read-only root filesystems
- Regular updates and patching
- up-to-date security patches prevent exploitation
- Monitoring and auditing
- monitor for suspicious activities
- regular audit of Kubernetes events
Denial of Service
Summary
- DoS attacks overwhelm system resources, causing unresponsiveness
- Set resource quotas to prevent excessive resource usage
- Restrict service account permissions to limit potential attacks
- Use Network Policies and firewalls to control access
- Monitor and alert on unusual activity for quick response
Malicious Code Execution
Summary
- Attackers exploit vulnerabilities in containers to execute malicious code
- Restrict API server access to authorized users and services only
- Secure image repositories and use signed images for verification
- Monitor and log activities to detect and respond to threats
- Regularly update and patch applications to prevent security exploits
Compromised Applications in Containers
Please refer to compromised container attackTree, credit from Marco Lancini’s blog.
Attacker on the Network
The attacker could cause a loss of etcd quorum by overloading its ports 2380 and 2379, or by blocking it from syncing up with other components.
By taking down the scheduler and the controller manager, the attacker can halt new or restarted workloads from being scheduled in the cluster. Attacking the scheduler ports (10251 and 10259) prevents Kubernetes from assigning pods to nodes, stopping new tasks from starting.
Going after the controller manager (ports 10252 and 10257) disrupts critical control loops, which impacts scaling, updates and replication, meaning Kubernetes won’t be able to manage the cluster’s state effectively.
Kube-proxy is the service that manages network rules on each node. If the attacker takes down kube-proxy or overloads its ports 10256 and 10249, the flow of traffic between services and pods stops, essentially freezing communications.
Targeting Kubernetes DNS on port 53 would break name resolution, meaning services can’t find each other by name, causing connectivity issues all over.
Attackers could also degrade the CNI (container network interface) by flooding it, which slows down or cuts off pod-to-pod communication, causing distributed services to fail.
By targeting network boot services like PXE server, the attackers could prevent new nodes from joining the cluster. This would keep the cluster from scaling up or replacing failed nodes, locking it into a degraded state.
To mitigate network based attacks
- Configuring firewalls
- Securing nodes
- apply latest security patches
- monitor vulnerabilities
- Implementing network policies
- Strong authentication and authorization
- strong password
- multi-factor authentication
- role-based access control
- Monitoring and logging
Summary
- Attackers can target K8s control plane and nodes for breaches
- Configure firewalls to limit network access to trusted IP addresses
- Keep node operating systems and components updated and patched
- Implement network policies to control traffic and prevent lateral movement
- Use strong authentication, multi-factor, and RBAC for secure access
- Monitor and log activities to detect and respond to threats
Access to Sensitive Data
Platform Security
Supply Chain Security - Minimize base image footprint
An image built from scratch (FROM scratch) is called a base image; the image named in another image’s FROM line is its parent image. In practice, the terms parent image and base image are often used interchangeably.
Modular
Do not build images that combine multiple applications, such as a web server, a database and other services, into one image. Instead, build images that are modular. These images, when deployed as containers, can together form a single large application made of different services, and each component can scale up or down as required without having to scale the others.
Persist State
Another best practice is not to store data or state inside a container. Because containers are ephemeral in nature, we should be able to destroy and recreate them without losing data along with the container. Always store data in an external volume or in a caching service like Redis.
Choose a base image
FROM ??????
First, look for images that suit your technical needs, such as the httpd base image for your application or the nginx base image for an nginx-based server. Second, look for images with proven authenticity (the OFFICIAL IMAGE or verified publisher tag). Third, prefer images that are up to date, since they are less likely to contain known vulnerabilities.
Slim/Minimal images
- Create slim/minimal images
- Find an official minimal image that exists
- Only install necessary packages
- remove shells/packages managers/tools
- Maintain different images for different environment
- development - debug tools
- Production - lean
- Use multi-stage builds to create lean production ready images
Distroless Docker Images
Contains:
- Application
- Runtime dependencies
Does not contain:
- package managers
- shells
- network tools
- text editors
- other unwanted programs
Please refer to the gcr.io/distroless/xxxxxx images.
Vulnerability scanning
Supply Chain Security - Scan images for known vulnerabilities
Each CVE is given a severity score from 0 to 10 (low to high level vulnerabilities)
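The 0 to 10 score can be mapped to the severity names that scanners report. The sketch below assumes the score is a CVSS v3.x base score and uses the qualitative rating bands from that specification; the function name is hypothetical.

```python
def severity(score: float) -> str:
    """Map a CVSS v3.x base score (0.0 to 10.0) to its qualitative rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError("score must be between 0.0 and 10.0")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"       # 0.1 to 3.9
    if score < 7.0:
        return "Medium"    # 4.0 to 6.9
    if score < 9.0:
        return "High"      # 7.0 to 8.9
    return "Critical"      # 9.0 to 10.0


for s in (0.0, 3.9, 5.5, 7.0, 9.8):
    print(s, severity(s))
```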
CVE Scanner
A solution is to reduce the attack surface by removing unnecessary packages.
- Trivy by Aqua Security
- easy to install, easy to run scans

    # add repo to /etc/apt/sources.list.d
    sudo apt-get install wget apt-transport-https gnupg lsb-release
    wget -qO - https://aquasecurity.github.io/trivy-repo/deb/public.key | sudo apt-key add -
    echo deb https://aquasecurity.github.io/trivy-repo/deb $(lsb_release -sc) main | sudo tee -a /etc/apt/sources.list.d/trivy.list
    sudo apt-get update
    sudo apt-get install trivy

    # run image scan
    trivy image nginx:1.18.0
    # only list vulnerabilities of critical severity level
    trivy image --severity CRITICAL nginx:1.18.0
    # skip vulnerabilities that have no fix available yet
    trivy image --ignore-unfixed nginx:1.18.0
    # scan an image that was saved as a tar archive
    docker save nginx:1.18.0 > nginx.tar
    trivy image --input nginx.tar
Best practices
- Continuously rescan images
- Use K8s admission controllers to scan images
- Have your own repository with pre-scanned images ready to go
- Integrate scanning tool with CI/CD pipelines
Image Repository Security
image: docker.io/library/nginx
- registry: docker.io, gcr.io
- user/account
- image/repository
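The three parts listed above can be recovered from an image name with a small parser. The Python sketch below is a simplified illustration: the function name is hypothetical, it applies the usual Docker defaults (registry docker.io, account library, tag latest), and it does not handle digests (name@sha256:...).

```python
def parse_image(ref: str) -> dict:
    """Split an image reference into registry, account, image and tag."""
    name, tag = ref, "latest"
    # A colon after the last slash separates the tag from the name.
    if ":" in ref.rsplit("/", 1)[-1]:
        name, tag = ref.rsplit(":", 1)
    parts = name.split("/")
    # The first component is a registry if it looks like a host name.
    if len(parts) > 1 and ("." in parts[0] or ":" in parts[0] or parts[0] == "localhost"):
        registry, rest = parts[0], parts[1:]
    else:
        registry, rest = "docker.io", parts
    # Official images on Docker Hub live under the "library" account.
    if registry == "docker.io" and len(rest) == 1:
        rest = ["library"] + rest
    return {
        "registry": registry,
        "account": "/".join(rest[:-1]),
        "image": rest[-1],
        "tag": tag,
    }


for ref in ("nginx", "docker.io/library/nginx", "gcr.io/distroless/static:nonroot"):
    print(ref, "->", parse_image(ref))
```

This shows why the short name nginx and the full name docker.io/library/nginx refer to the same repository.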
Private Repository
Observability - Overview
[Securing Cluster] [Minimizing Microservices Vulnerability] [Sandboxing Techniques] [MTLS encryption] [Restricting Network Access]
Even when all of the above techniques are used to secure our infrastructure, there is no guarantee that an attack will never happen.
How to identify breaches that have already occurred in our K8s cluster? We can make use of tools such as Falco from Sysdig.
SYSCALL name |
---|
close |
nanosleep |
fcntl |
fstatfs |
getdents64 |
exit_group |
epoll_ctl |
openat |
We need to analyze syscalls and filter out those that are suspicious. Attackers often want to erase any trace that they were ever in the system, so they try to delete the parts of the logs that recorded how they got in. An administrator rarely has a reason to delete recent logs, so this activity can be considered anomalous and used as an early sign of intrusion. Falco can monitor for this event and send alerts through various notification channels.
Observability - Falco Overview and Installation
- Falco Kernel Module - intrusive, some managed k8s service providers do not allow this
- Extended Berkeley Packet Filter (eBPF) - safer and less intrusive
Install Falco as a Package
With this method, Falco is isolated from Kubernetes, so it can keep detecting and alerting on suspicious behavior even if the cluster itself is compromised.
Install as a DaemonSet
Observability - Using Falco to Detect Threats
Service Mesh - Monolithics vs Microservices
Service Mesh
Service Mesh - Istio
Service Mesh - Security in Istio
Service Mesh - Istio Security Architecture
Lab
K8s PKI - certificate creation
K8s PKI - view certificate details
Lab
Connectivity - TLS intro
Connectivity - TLS basics
Connectivity - TLS in Kubernetes
Connectivity - Mutual TLS
Admission Controllers
[kubectl] ➡️ [Authentication] ➡️ [Authorization] ➡️ [Admission Controllers] ➡️ [Create Pod]
Authorization is achieved through role-based access control using Role and RoleBinding objects.
Admission controllers help us enforce how a cluster is used and so achieve better security. For example:
- Only permit images from a certain registry
- Do not permit running as the root user
- Only permit certain capabilities
- Require that a pod always has labels
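To illustrate the kind of checks listed above, here is a toy validation function in Python. It is not how admission controllers are implemented (they run inside the API server or as webhooks); it only mimics the decision logic on a pod manifest loaded as a dictionary. The function name, the registry `registry.example.com` and the `team` label are all hypothetical.

```python
def validate_pod(pod: dict, allowed_registries: tuple, required_labels: tuple) -> list:
    """Return the reasons a pod would be rejected; an empty list means admit."""
    errors = []
    labels = pod.get("metadata", {}).get("labels", {})
    for label in required_labels:
        if label not in labels:
            errors.append(f"missing required label: {label}")
    prefixes = tuple(registry + "/" for registry in allowed_registries)
    for container in pod.get("spec", {}).get("containers", []):
        image = container.get("image", "")
        if not image.startswith(prefixes):
            errors.append(f"image not from an allowed registry: {image}")
        # runAsUser 0 is the root user.
        if container.get("securityContext", {}).get("runAsUser") == 0:
            errors.append(f"container runs as root: {container.get('name')}")
    return errors


pod = {
    "metadata": {"labels": {}},
    "spec": {"containers": [
        {"name": "web", "image": "docker.io/library/nginx", "securityContext": {"runAsUser": 0}},
    ]},
}
for reason in validate_pod(pod, ("registry.example.com",), ("team",)):
    print("rejected:", reason)
```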
Examples of Admission Controllers:
- AlwaysPullImages
- DefaultStorageClass
- EventRateLimit
- NamespaceExists: rejects a request (such as a kubectl run command) if the specified namespace does not exist
- NamespaceAutoProvision
- many more examples …
View Enabled Admission Controllers
Enable Admission Controllers
To enable additional admission controllers, add them to the --enable-admission-plugins flag in the kube-apiserver YAML file.
Lab - admission controllers
Compliance and security framework
Compliance Frameworks
Defines what needs to be done to meet legal, regulatory, or industry standards
Security compliance frameworks provide the guidelines and standards that help us protect sensitive data, such as personal info, health records, payment details, to ensure system integrity and meet legal obligations.
To avoid data breach leading to significant fines and loss of trust, organizations need to follow structured guidelines known as compliance frameworks. Compliance frameworks are guidelines and best practices that are designed to help organizations meet regulatory requirements and ensure data security and privacy.
Examples include GDPR, HIPAA, PCI DSS, NIST and CIS Benchmarks.
General Data Protection Regulation (GDPR)
GDPR is a comprehensive data protection law enacted by the European Union to safeguard individuals’ personal data and uphold their privacy rights. It applies to the handling of personal data of people in the EU.
Health Insurance Portability and Accountability Act (HIPAA)
HIPAA is a US regulation that protects sensitive patient health information. If our web application handles patient data, we must ensure that all data transfers between front end, back end and database are encrypted using TLS. We also need to implement access controls that prevent unauthorized access to health data, and make sure our Kubernetes secrets are securely configured for application use. The data HIPAA protects is known as protected health information (PHI).
Payment Card Industry Data Security Standard (PCI DSS)
We must make sure cardholder data is encrypted both in transit and at rest. Strong access controls must be in place, and we should regularly monitor and audit access to payment data to comply with PCI DSS.
National Institute of Standards and Technology (NIST)
NIST provides a framework for improving the security and resilience of information systems. It calls for conducting regular risk assessments to identify potential vulnerabilities, and for implementing security controls such as firewalls, intrusion detection systems and regular security audits to mitigate those risks. Key component: security standards and guidelines for cybersecurity.
Center for Internet Security (CIS)
CIS benchmarks offer best practices for securing IT systems and data.
- securing the configuration of the API server, etcd, kubelet, controller manager and scheduler, as well as authentication and authorization
- enforcing RBAC and other access control mechanisms, logging and monitoring, network policies, pod security
Tools like kube-bench by Aqua Security can help check whether K8s is deployed securely by running the checks documented in the CIS K8s benchmark. Key component: CIS provides benchmarks for different systems, including K8s.
Recommended Tools
- OneTrust, TrustArc - GDPR
- Compliancy Group, POBOX - HIPAA
- Prisma Cloud - PCI
- NIST SRE Toolkit - NIST
- Kube-bench - CIS
Threat Modelling Frameworks
Specifies how to achieve it by identifying specific threats and suggesting mitigations to secure the system
- STRIDE
- MITRE ATT&CK
STRIDE
STRIDE is developed by Microsoft. STRIDE identifies six categories of threats.
- S - spoofing
- an attacker tries to impersonate a legitimate user to access our server
- mitigated by implementing strong authentication mechanisms
- T - tampering
- an attacker attempts to alter data being processed by the backend server, whether in transit or at rest
- mitigated by ensuring data integrity with encryption and digital signatures
- R - repudiation
- users claim that they didn’t do something or were not responsible for doing something
- addressed by keeping comprehensive audit logs that record user actions and by using non-repudiation techniques like digital signatures
- I - information disclosure
- someone obtains information that they are not authorized to access
- prevented by encrypting data in transit and at rest
- D - denial of service
- an attacker might try to overload our system, causing it to crash
- mitigated by setting up rate limiting and resource quotas in our K8s environment
- E - elevation of privilege
- attacker might gain unauthorized access to higher privilege levels
- prevented by implementing strict RBAC policies
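Two of the mitigations above map directly onto Kubernetes objects. The sketch below writes a ResourceQuota (denial of service) and a read-only Role (elevation of privilege) to a manifest file. The namespace `team-a`, the object names and the quota values are all hypothetical and would need to be adapted to a real cluster.

```shell
#!/usr/bin/env bash
# Write two illustrative STRIDE mitigations into one multi-document manifest.
manifest="${TMPDIR:-/tmp}/stride-mitigations.yaml"

cat > "$manifest" <<'EOF'
# Denial of service: cap what the (hypothetical) team-a namespace can consume.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a
spec:
  hard:
    pods: "20"
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
---
# Elevation of privilege: allow reading pods in team-a and nothing else.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: team-a
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
EOF

echo "wrote $manifest"
# To apply it to a cluster: kubectl apply -f "$manifest"
```

The Role only takes effect once a RoleBinding attaches it to a user, group or service account, so least privilege here means binding this narrow Role instead of a broad one.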
MITRE ATT&CK
MITRE ATT&CK is a global knowledge base of real-world tactics and techniques that are used to build threat models. This framework focuses on what attackers aim to do (tactics) and how they do it (techniques).
- Initial Access: How attackers enter the cluster, such as exploiting weak authentication
- Execution: Running unauthorized commands or code, like deploying malicious containers
- Persistence: Maintaining access, for example by creating new users or modifying roles
- Privilege Escalation: Gaining higher access, such as by exploiting misconfigured RBAC
- Defense Evasion: Hiding activities, like disabling logs or concealing workloads
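Several of these tactics (modifying roles, exploiting RBAC) leave traces in the API server audit log, if auditing is configured. As a hedged sketch, the snippet below writes an audit policy that records full request and response bodies for RBAC changes and metadata for everything else. The file path and the choice of levels are assumptions for illustration, not a recommended baseline.

```shell
#!/usr/bin/env bash
# Write an illustrative kube-apiserver audit policy to a file.
audit_policy="${TMPDIR:-/tmp}/audit-policy.yaml"

cat > "$audit_policy" <<'EOF'
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
# Persistence / Privilege Escalation: log full bodies of RBAC changes.
- level: RequestResponse
  verbs: ["create", "update", "patch", "delete"]
  resources:
  - group: "rbac.authorization.k8s.io"
    resources: ["roles", "rolebindings", "clusterroles", "clusterrolebindings"]
# Everything else: record who did what, without request bodies.
- level: Metadata
EOF

echo "wrote $audit_policy"
# The API server reads this file via its --audit-policy-file flag,
# and writes events to the file named by --audit-log-path.
```

Shipping the resulting log off the node makes it harder for an attacker to hide activity by tampering with local logs.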
Summary
- Threat modeling identifies, assesses and mitigates specific security threats
- STRIDE stands for spoofing, tampering, repudiation, information disclosure, denial of service and elevation of privilege
- STRIDE helps uncover vulnerabilities with tailored defenses for our environment
- Use attack trees to visualize and analyze different attack scenarios
- Implement security controls to mitigate prioritized threats identified by STRIDE
- Integrate threat modeling into development to address issues early
- Leverage the MITRE ATT&CK framework to understand adversary tactics and techniques and strengthen defenses against specific threats
Supply Chain Compliance
Key components of Supply Chain Security
- Artifact: verify what you deploy
  - In K8s, artifacts are signed during the release process using keyless signing tools like Cosign. This ensures the artifacts haven’t been tampered with.
    cosign sign "$IMAGE"  # generates ephemeral keys, retrieves a signed certificate, and asks you to confirm
  - You can also use the cosign utility to verify a signed artifact, such as a release binary
    cosign verify-blob "$BINARY" \
      --signature "$BINARY".sig \
      --certificate "$BINARY".cert \
      --certificate-identity [email protected] \
      --certificate-oidc-issuer https://accounts.google.com
- Metadata: understanding what’s inside
  - Metadata describes what’s in your artifacts and where they come from
  - A critical type of metadata is the SBOM (software bill of materials)
  - An SBOM is typically written in a standard format such as SPDX
  - An SBOM details all the components, libraries and dependencies along with their versions and sources
  - We use Syft (a CLI tool and Go library) to generate an SBOM
- Attestations: building trust
  - Attestations are signed statements that verify metadata, such as SBOMs, provenance data or vulnerability reports, as coming from an authentic, trustworthy source
  - The Kubernetes release team generates the SBOM and signs it using a private key, creating an attestation
    cosign sign --key <PRIVATE_KEY> sbom.k8s.io/v1.27.4/release.spdx > sbom.attestation
  - You verify the attestation using their public certificates with the cosign command
    cosign verify-attestation \
      --key <PUBLIC_KEY> \
      --certificate-identity [email protected] \
      --certificate-oidc-issuer https://accounts.google.com \
      sbom.k8s.io/v1.27.4/release.spdx
  - Think of the SHA checksum as checking that a sealed package hasn’t been damaged during shipping
  - The attestation is like verifying the package’s seal and the sender’s signature to confirm it’s from a trusted vendor and contains what they claim
  - Attestation goes further than a single check: it is a framework for defining and verifying the entire supply chain process
- Policies: automating compliance
  - Policies automate the deployment and verification of these components on K8s and make sure we only deploy verified, trustworthy artifacts
  - Policy: the rules that ensure compliance and security standards are enforced automatically
  - Policies help prevent insecure or non-compliant components from being deployed
  - Sigstore’s policy-controller integrates with admission controllers to enforce these policies
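To make the policy step concrete, here is a minimal sketch of a Sigstore policy-controller rule that only admits images carrying a keyless signature from one identity. The field names follow my understanding of the `policy.sigstore.dev/v1beta1` API and should be checked against the installed policy-controller version. The policy name, registry glob and signer identity are hypothetical.

```shell
#!/usr/bin/env bash
# Write an illustrative ClusterImagePolicy to a file.
image_policy="${TMPDIR:-/tmp}/cluster-image-policy.yaml"

cat > "$image_policy" <<'EOF'
apiVersion: policy.sigstore.dev/v1beta1
kind: ClusterImagePolicy
metadata:
  name: require-keyless-signature        # hypothetical policy name
spec:
  images:
  - glob: "registry.example.com/**"      # hypothetical registry to enforce on
  authorities:
  - keyless:
      url: https://fulcio.sigstore.dev
      identities:
      - issuer: https://accounts.google.com
        subject: release-signer@example.com   # hypothetical signer identity
EOF

echo "wrote $image_policy"
# To enforce it (assuming policy-controller is installed in the cluster):
#   kubectl apply -f "$image_policy"
#   kubectl label namespace team-a policy.sigstore.dev/include=true
```

The namespace label is the opt-in: policy-controller only evaluates pods in namespaces that carry it, so the rule can be rolled out gradually.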
component | steps |
---|---|
Artifact | The binaries and container images are signed using Cosign |
Metadata | The SBOM details all the components and their origins, helping you identify risks |
Attestations | The SBOM and other metadata are signed to ensure trustworthiness |
Policies | Finally, admission controllers verify these signatures and enforce compliance before deployment |
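The checksum analogy and the SBOM step can be sketched with two small helpers. `verify_sha256` is the "package not damaged in shipping" check, and `list_sbom_packages` prints the components recorded in an SPDX JSON SBOM. Both function names are hypothetical; the second assumes `jq` is installed and relies on the SPDX JSON field names `packages`, `name` and `versionInfo`.

```shell
#!/usr/bin/env bash
# Integrity check: succeed only if the file's SHA-256 digest equals the expected one.
verify_sha256() {
  local file=$1 expected=$2 actual
  actual=$(sha256sum "$file" | awk '{print $1}')
  [ "$actual" = "$expected" ]
}

# Metadata check: print "name version" for every package in an SPDX JSON SBOM.
# Requires jq.
list_sbom_packages() {
  jq -r '.packages[] | "\(.name) \(.versionInfo)"' "$1"
}

# Hypothetical usage, assuming Syft is installed and IMAGE is set by the caller:
#   syft "$IMAGE" -o spdx-json > sbom.spdx.json
#   list_sbom_packages sbom.spdx.json
#   verify_sha256 kubectl "$(cat kubectl.sha256)" && echo "checksum OK"
```

A matching checksum only shows the file arrived intact. It says nothing about who produced it, which is the gap that signatures and attestations close.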
Automation and Tooling
The Cloud Native Security Map builds on top of the Cloud Native Security Whitepaper by SIG-Security.
Lifecycle - 4 phases
- Develop
  - Code (source code, Dockerfile, K8s manifests, IaC) -> commit
  - oss-fuzz by Google (fuzz testing)
  - Snyk Code as a VS Code plugin
  - fabric8
  - KubeLinter
    kube-linter lint pod.yaml
- Distribute
  - Build pipelines -> container registry
- Deploy
- Runtime