# Challenge Structure
Understanding the anatomy of a Kubeasy challenge and how all components work together.
Every Kubeasy challenge follows a consistent structure. This page explains each component and how they work together.
## Directory Structure
Each challenge is a folder in the challenges repository:
```
challenges/
└── pod-evicted/
    ├── challenge.yaml       # Metadata, description, AND objectives
    ├── manifests/           # Initial broken state
    │   ├── deployment.yaml
    │   ├── service.yaml
    │   └── ...
    ├── policies/            # Kyverno policies (prevent bypasses)
    │   └── protect.yaml
    └── image/               # Optional: custom Docker images
        └── Dockerfile
```

## Components
### 1. challenge.yaml
The challenge.yaml file contains everything about the challenge: metadata, description, and validation objectives.
```yaml
title: Pod Evicted
description: |
  A data processing pod keeps crashing and getting evicted.
  It was working fine yesterday, but now Kubernetes keeps killing it.
theme: resources-scaling
difficulty: easy
type: fix
estimatedTime: 15
initialSituation: |
  A data processing application is deployed as a single pod.
  The pod starts successfully but after a few seconds gets killed.
  It enters a CrashLoopBackOff state and keeps restarting.
objective: |
  Fix the pod so it can run without being evicted.
  Understand why Kubernetes is killing the application.
objectives:
  - key: pod-running
    title: "Pod Ready"
    description: "The pod must be running and healthy"
    order: 1
    type: condition
    spec:
      target:
        kind: Pod
        labelSelector:
          app: data-processor
      checks:
        - type: Ready
          status: "True"
```

### Metadata Fields
| Field | Required | Description |
|---|---|---|
| `title` | Yes | Challenge name (shown in the UI) |
| `description` | Yes | Brief description of the symptoms (NOT the cause!) |
| `theme` | Yes | Category for grouping |
| `difficulty` | Yes | `easy`, `medium`, or `hard` |
| `type` | Yes | Challenge type (e.g., `fix`) |
| `estimatedTime` | Yes | Estimated minutes to complete |
| `initialSituation` | Yes | What the user will find |
| `objective` | Yes | What needs to be achieved |
| `objectives` | Yes | Success criteria (validation definitions) |
### Writing Good Descriptions
**Describe symptoms, not causes:**

```yaml
# BAD - Reveals the problem
description: |
  The ConfigMap has invalid JSON syntax.
  Fix the JSON formatting error.

# GOOD - Maintains mystery
description: |
  A microservice keeps crashing shortly after deployment.
  The team swears the code hasn't changed.
```

**State goals, not methods:**
```yaml
# BAD - Tells user what to do
objective: |
  Increase the memory limit to 256Mi.

# GOOD - States the outcome
objective: |
  Make the pod run stably without being evicted.
```

### 2. manifests/ Directory
Contains the initial broken state: the Kubernetes resources that learners need to fix.
```yaml
# manifests/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: data-processor
spec:
  replicas: 1
  selector:
    matchLabels:
      app: data-processor
  template:
    metadata:
      labels:
        app: data-processor
    spec:
      containers:
        - name: processor
          image: kubeasy/data-processor:v1
          resources:
            limits:
              memory: "32Mi"  # BUG: Too low!
              cpu: "100m"
```

**Design principles:**
- **Keep it minimal** - only include what's needed
- **Make it realistic** - mirror production problems
- **One problem at a time** - don't combine unrelated issues
- **Clear naming** - be descriptive
### 3. policies/ Directory
Kyverno policies that prevent bypasses, stopping users from working around the challenge instead of solving it properly.
```yaml
# policies/protect.yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: protect-data-processor
spec:
  validationFailureAction: Enforce
  rules:
    - name: preserve-image
      match:
        resources:
          kinds: ["Deployment"]
          names: ["data-processor"]
          namespaces: ["challenge-*"]
      validate:
        message: "Cannot change the application image"
        pattern:
          spec:
            template:
              spec:
                containers:
                  - name: processor
                    image: "kubeasy/data-processor:v1"
```

**What to protect:**
- Container images (prevent replacing the app)
- Critical volume mounts
- Essential labels (ensure validations can find resources)
**What NOT to protect:**
- Resource limits/requests (users should be able to change these)
- Environment variables
- Probe configurations
- Non-essential labels and annotations
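As an illustration of protecting essential labels, a rule like the following could pin the `app` label that the objectives' `labelSelector` depends on. This is a sketch: the policy and rule names are made up here, but the pattern mirrors the `protect.yaml` example above.

```yaml
# Hypothetical policy (names are illustrative) that keeps the app label intact
# so validations can still find the pod by labelSelector.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: preserve-app-label
spec:
  validationFailureAction: Enforce
  rules:
    - name: require-app-label
      match:
        resources:
          kinds: ["Deployment"]
          names: ["data-processor"]
          namespaces: ["challenge-*"]
      validate:
        message: "The app label must stay in place so validations can find the pod"
        pattern:
          spec:
            template:
              metadata:
                labels:
                  app: data-processor
```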
### 4. Objectives
Defined in challenge.yaml under the objectives key, these determine when the challenge is solved.
```yaml
objectives:
  - key: unique-identifier
    title: "User-Friendly Title"
    description: "What this checks"
    order: 1
    type: condition|status|log|event|connectivity
    spec:
      # Type-specific configuration
```

**Available types:**
| Type | Purpose |
|---|---|
| `condition` | Check resource conditions (Ready, Available) |
| `status` | Check status fields with operators (e.g., restart count < 3) |
| `log` | Find strings in container logs |
| `event` | Detect forbidden Kubernetes events (OOMKilled, Evicted) |
| `connectivity` | HTTP connectivity tests |
See Validation Rules for detailed examples and Validation Reference for complete specs.
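To give a feel for a non-`condition` objective, here is a sketch of a `status` check on restart count. The `spec` field names (`field`, `operator`, `value`) are assumptions for illustration, not confirmed schema; check the Validation Reference for the real shape.

```yaml
# Hypothetical sketch - the spec field names below are assumed, not official.
objectives:
  - key: few-restarts
    title: "Stable Pod"
    description: "The pod should not restart repeatedly"
    order: 2
    type: status
    spec:
      target:
        kind: Pod
        labelSelector:
          app: data-processor
      field: status.containerStatuses[0].restartCount
      operator: LessThan
      value: 3
```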
### 5. image/ Directory (Optional)
If your challenge needs a custom application, you can create your own Docker image. Simply add an image/ directory with a Dockerfile in your challenge folder:
```
my-challenge/
├── challenge.yaml
├── manifests/
├── policies/
└── image/
    ├── Dockerfile
    └── app.py               # Your application files
```

Example Dockerfile:
```dockerfile
# image/Dockerfile
FROM python:3.11-slim
COPY app.py /app/
CMD ["python", "/app/app.py"]
```

### Automatic Build and Publish
When you merge your challenge to `main`, the CI automatically:
- Detects challenges with an `image/` directory
- Builds the Docker image (multi-arch: `linux/amd64` and `linux/arm64`)
- Pushes it to GitHub Container Registry: `ghcr.io/kubeasy-dev/<challenge-name>:latest`
Using your custom image in manifests:

```yaml
# manifests/deployment.yaml
spec:
  containers:
    - name: app
      image: ghcr.io/kubeasy-dev/my-challenge:latest
```

### When to Use Custom Images
- Your challenge needs specific application behavior (memory leaks, slow responses, etc.)
- You need to simulate a realistic application workload
- Standard images (nginx, python, busybox) don't fit your scenario
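For instance, a resource-limits challenge might ship a tiny app that leaks memory on purpose so the pod eventually hits its limit. A minimal sketch (the file name and behavior are illustrative, not an existing Kubeasy image):

```python
# app.py - hypothetical leaky worker for a resource-limits challenge.
import time

leaked = []  # module-level list keeps references alive, so memory only grows


def allocate_chunk(size_mb):
    """Allocate size_mb MiB and hold on to it so it is never freed."""
    leaked.append(bytearray(size_mb * 1024 * 1024))


def run(ticks=0, chunk_mb=8, interval=1.0):
    """Leak chunk_mb MiB every interval seconds; ticks=0 means run forever.

    In the container you would call run() with no arguments, so memory
    grows steadily until the kernel OOM-kills the pod.
    """
    done = 0
    while ticks == 0 or done < ticks:
        allocate_chunk(chunk_mb)
        done += 1
        print(f"allocated ~{done * chunk_mb} MiB", flush=True)
        time.sleep(interval)
```

Bounded parameters (`ticks`, `interval`) make the behavior reproducible in local tests, which matches the "predictable and reproducible" best practice below.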
### Best Practices
- Keep images small (use slim/alpine base images)
- Don't include secrets or sensitive data
- Make the application behavior predictable and reproducible
- Add a `.dockerignore` file to exclude unnecessary files
## Themes
Challenges are grouped by theme:
| Theme | Description |
|---|---|
| `rbac-security` | Permissions, roles, security contexts |
| `networking` | Services, ingress, network policies |
| `volumes-secrets` | Storage, ConfigMaps, Secrets |
| `resources-scaling` | Limits, requests, HPA, scaling |
| `monitoring-debugging` | Probes, logging, events |
## How It All Works Together
1. **User starts the challenge** (`kubeasy challenge start pod-evicted`)
   - The CLI creates a namespace and pulls the OCI artifact
   - Manifests and Kyverno policies are applied to the namespace
2. **User investigates and fixes**
   - Uses `kubectl` to explore the problem
   - Modifies resources to fix the issue
   - Kyverno validates changes (prevents bypasses)
3. **User submits the solution** (`kubeasy challenge submit pod-evicted`)
   - The CLI loads objectives from `challenge.yaml`
   - Executes each validation against the cluster
   - Sends results to the backend
   - XP is awarded if all objectives pass
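For the `pod-evicted` example, the whole flow looks like this at the terminal. Only the `start` and `submit` commands are Kubeasy-specific; the `kubectl` investigation steps in between are illustrative, and `<pod-name>` is a placeholder:

```shell
kubeasy challenge start pod-evicted      # 1. namespace created, manifests + policies applied

kubectl get pods                         # 2. investigate: pod is in CrashLoopBackOff
kubectl describe pod <pod-name>          #    check events for OOMKilled / Evicted
kubectl edit deployment data-processor   #    fix the limits (Kyverno validates the change)

kubeasy challenge submit pod-evicted     # 3. objectives run against the cluster
```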
## Best Practices

### Challenge Design

- **One concept per challenge** - don't mix RBAC + networking + storage
- **Realistic scenarios** - use problems that occur in production
- **Clear error messages** - when users check logs, they should see helpful errors
- **No red herrings** - don't add confusing complexity
### Validation Design

- **Check outcomes, not implementations** - "Pod is healthy", not "Memory is 256Mi"
- **Don't reveal solutions** - validation titles should be generic
- **Accept multiple solutions** - if there are valid alternatives, allow them
### Documentation

- **Describe symptoms** - not the root cause
- **State goals** - not the method to achieve them
- **Never include solutions** - let users figure it out
## Example: Complete Challenge
```
pod-evicted/
├── challenge.yaml
├── manifests/
│   └── deployment.yaml
└── policies/
    └── protect.yaml
```

`challenge.yaml`:
```yaml
title: Pod Evicted
description: |
  A data processing pod keeps crashing and getting evicted.
  It was working fine yesterday, but now Kubernetes keeps killing it.
theme: resources-scaling
difficulty: easy
type: fix
estimatedTime: 15
initialSituation: |
  A data processing application is deployed as a single pod.
  The pod starts successfully but after a few seconds gets killed.
  It enters a CrashLoopBackOff state and keeps restarting.
objective: |
  Fix the pod so it can run without being evicted.
objectives:
  - key: pod-ready
    title: "Pod Ready"
    description: "The pod must be running"
    order: 1
    type: condition
    spec:
      target:
        kind: Pod
        labelSelector:
          app: data-processor
      checks:
        - type: Ready
          status: "True"
  - key: no-oom
    title: "No Crash Events"
    description: "No eviction or crash events"
    order: 2
    type: event
    spec:
      target:
        kind: Pod
        labelSelector:
          app: data-processor
      forbiddenReasons:
        - "OOMKilled"
        - "Evicted"
      sinceSeconds: 300
```

## Next Steps
- Creating Your First Challenge - Step-by-step guide
- Validation Rules - Detailed validation examples
- Validation Reference - Complete type specifications
- Testing Challenges - How to test locally