
Validation Rules

How to define success criteria for challenges using the CLI-based validation system.

Last updated: March 31, 2026

Validation rules define when a challenge is considered "solved". Kubeasy uses a CLI-based validation system where objectives are defined in challenge.yaml and executed directly against the Kubernetes cluster.

Philosophy: Validate Outcomes, Not Implementations

The key principle of Kubeasy validations is to check that the problem is fixed, not how it was fixed.

Bad validation (reveals the solution):

- key: memory-limit-fixed
  title: "Memory Limit Set to 256Mi"
  description: "Container memory limit must be 256Mi"

Good validation (checks outcome):

- key: stable-operation
  title: "Stable Operation"
  description: "Pod must run without crashing"

Checking outcomes leaves room for multiple valid solutions and doesn't spoil the learning experience.

Validation Types

| Type | Purpose | Example Use Case |
|------|---------|------------------|
| condition | Check resource conditions | Pod Ready, Deployment Available |
| status | Check status fields with operators | Restart count < 3, replicas >= 2 |
| log | Find strings in container logs | "Connected to database", "Server started" |
| event | Detect forbidden K8s events | No OOMKilled, no Evicted |
| connectivity | HTTP connectivity tests | Service responds with 200 |

Defining Objectives

All objectives are defined in the objectives section of challenge.yaml:

title: Pod Evicted
description: |
  A pod keeps crashing...
# ... other metadata

objectives:
  - key: unique-identifier
    title: "User-Friendly Title"
    description: "What this checks (not how to fix it)"
    order: 1
    type: condition
    spec:
      # Type-specific configuration

Common Fields

| Field | Required | Description |
|-------|----------|-------------|
| key | Yes | Unique identifier for this objective |
| title | Yes | Short title shown in the UI |
| description | Yes | What this objective checks |
| order | No | Display order (lower = first) |
| type | Yes | One of: condition, status, log, event, connectivity |
| spec | Yes | Type-specific configuration |

Target Specification

The condition, status, log, and event types use a target to specify which resources to check (connectivity uses sourcePod and targets instead):

spec:
  target:
    kind: Pod              # Required: Resource kind
    name: my-pod           # Optional: Specific resource by name
    labelSelector:         # Optional: Match by labels
      app: my-app

Validation Examples

Condition Validation

Check whether a resource reports specific Kubernetes conditions (Ready, Available, etc.).

- key: pod-ready-check
  title: "Pod Ready"
  description: "The application pod must be running and healthy"
  order: 1
  type: condition
  spec:
    target:
      kind: Pod
      labelSelector:
        app: my-app
    checks:
      - type: Ready
        status: "True"

For Deployments:

- key: deployment-available
  title: "Deployment Available"
  description: "The Deployment must have its minimum number of replicas available"
  order: 1
  type: condition
  spec:
    target:
      kind: Deployment
      name: my-deployment
    checks:
      - type: Available
        status: "True"

Common conditions:

  • Pod: Ready, ContainersReady, PodScheduled
  • Deployment: Available, Progressing
  • Job: Complete, Failed
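
The Job conditions listed above could be checked the same way. A sketch, assuming Job targets are handled like Pod and Deployment targets; the key, title, and Job name are hypothetical.

```yaml
# Sketch: condition check on a Job (key, title, and name are hypothetical).
- key: batch-job-complete
  title: "Job Finished"
  description: "The batch job must run to completion"
  type: condition
  spec:
    target:
      kind: Job
      name: my-job
    checks:
      - type: Complete
        status: "True"
```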

Status Validation

Check resource status fields using operators. Use this for numeric or string comparisons on any status field.

- key: low-restarts
  title: "Low Restart Count"
  description: "Pod must be stable without excessive restarts"
  order: 2
  type: status
  spec:
    target:
      kind: Pod
      labelSelector:
        app: my-app
    checks:
      - field: "containerStatuses[0].restartCount"
        operator: "<"
        value: 3

Available operators: ==, !=, >, <, >=, <=

Field path syntax:

  • Simple: phase, readyReplicas
  • Array index: containerStatuses[0].restartCount
  • Array filter: conditions[type=Ready].status
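
As an illustration of the simple and array-filter path forms, the sketch below compares string values with ==. It assumes that several entries under checks must all pass, which the text above does not state; the key and title are hypothetical.

```yaml
# Sketch: status checks using the simple and array-filter path forms.
- key: pod-phase-running
  title: "Pod Running"
  description: "The pod must reach a running, ready state"
  type: status
  spec:
    target:
      kind: Pod
      labelSelector:
        app: my-app
    checks:
      - field: "phase"                          # simple path
        operator: "=="
        value: "Running"
      - field: "conditions[type=Ready].status"  # array filter
        operator: "=="
        value: "True"                           # condition statuses are strings
```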

Checking replica counts:

- key: replicas-check
  title: "Correct Replicas"
  description: "Deployment must have the expected number of replicas"
  type: status
  spec:
    target:
      kind: Deployment
      name: my-deployment
    checks:
      - field: "readyReplicas"
        operator: ">="
        value: 2

Log Validation

Find expected strings in container logs.

- key: database-connection
  title: "Database Connected"
  description: "The application must connect to the database"
  order: 2
  type: log
  spec:
    target:
      kind: Pod
      labelSelector:
        app: api-service
    expectedStrings:
      - "Connected to database successfully"
    sinceSeconds: 120

With specific container:

- key: sidecar-logs
  title: "Sidecar Running"
  description: "The sidecar container must be logging"
  type: log
  spec:
    target:
      kind: Pod
      labelSelector:
        app: my-app
    container: sidecar
    expectedStrings:
      - "Sidecar initialized"
    sinceSeconds: 60

Default sinceSeconds is 300 (5 minutes) if not specified.

Event Validation

Detect forbidden Kubernetes events (useful for checking stability).

- key: no-crashes
  title: "No Crash Events"
  description: "The pod should not experience crashes or evictions"
  order: 3
  type: event
  spec:
    target:
      kind: Pod
      labelSelector:
        app: data-processor
    forbiddenReasons:
      - "OOMKilled"
      - "Evicted"
      - "BackOff"
      - "FailedScheduling"
    sinceSeconds: 300

Default sinceSeconds is 300 (5 minutes) if not specified.

Common forbidden reasons:

  • OOMKilled -- Out of memory
  • CrashLoopBackOff -- Container keeps crashing
  • Evicted -- Pod evicted from node
  • FailedScheduling -- Cannot schedule pod
  • BackOff -- Back-off restarting
  • FailedMount -- Volume mount failed

Connectivity Validation

Test HTTP connectivity from a source pod to a target URL.

- key: service-reachable
  title: "Service Connectivity"
  description: "The backend must be reachable"
  order: 4
  type: connectivity
  spec:
    sourcePod:
      labelSelector:
        app: client
    targets:
      - url: "http://backend-service:8080/health"
        expectedStatusCode: 200
        timeoutSeconds: 5

With custom headers (useful for Ingress testing):

- key: ingress-routing
  title: "Ingress Routing"
  description: "Traffic must route through ingress"
  type: connectivity
  spec:
    sourcePod:
      labelSelector:
        app: client
    targets:
      - url: "http://ingress-nginx-controller.ingress-nginx.svc.cluster.local"
        headers:
          Host: "api.local"
        expectedStatusCode: 200
        timeoutSeconds: 5

Default timeoutSeconds is 5 if not specified.

Use expectedStatusCode: 0 to test that a connection is blocked (e.g., by a NetworkPolicy).
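
A blocked-connection objective might look like the following sketch, which reuses the fields from the examples above; the key, title, and service URL are placeholders.

```yaml
# Sketch: expect the request to be blocked (key, title, and URL are placeholders).
- key: network-isolated
  title: "Traffic Restricted"
  description: "The client must not be able to reach the backend"
  type: connectivity
  spec:
    sourcePod:
      labelSelector:
        app: client
    targets:
      - url: "http://backend-service:8080/health"
        expectedStatusCode: 0   # 0: the connection is expected to be blocked
        timeoutSeconds: 5
```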

Kyverno Policies

Kyverno policies prevent users from bypassing the challenge (e.g., replacing the broken app with a working one).

What to Protect

  • Container images -- Prevent replacing the application
  • Critical volume mounts -- Prevent removing problematic configs
  • Essential labels -- Ensure validations can find resources

Example Policy

# policies/protect.yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: protect-challenge-image
spec:
  validationFailureAction: Enforce
  rules:
    - name: preserve-image
      match:
        resources:
          kinds: ["Deployment"]
          names: ["my-app"]
          namespaces: ["challenge-*"]
      validate:
        message: "Cannot change the application image"
        pattern:
          spec:
            template:
              spec:
                containers:
                  - name: app
                    image: "kubeasy/broken-app:v1"
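
Essential labels can be guarded with the same pattern-based approach. This is a minimal sketch, assuming the challenge's validations select Pods by app: my-app; the policy and rule names are hypothetical, and the match block mirrors the example above.

```yaml
# policies/protect-labels.yaml (hypothetical file name)
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: protect-challenge-labels
spec:
  validationFailureAction: Enforce
  rules:
    - name: preserve-app-label
      match:
        resources:
          kinds: ["Deployment"]
          names: ["my-app"]
          namespaces: ["challenge-*"]
      validate:
        message: "Cannot remove or change the app label"
        pattern:
          spec:
            template:
              metadata:
                labels:
                  # Pods created from this template keep the label
                  # that the objectives' labelSelector relies on.
                  app: my-app
```

Because the pattern only names the app label, users can still add or change any other label on the Pod template.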

What NOT to Protect

Users should be free to:

  • Modify resource limits/requests
  • Add environment variables
  • Change probe configurations
  • Add/modify labels and annotations
  • Scale deployments

Complete Example

Here's a complete challenge.yaml with multiple objectives:

title: Pod Evicted
description: |
  A data processing pod keeps crashing and getting evicted.
  It was working fine yesterday, but now Kubernetes keeps killing it.
theme: resources-scaling
difficulty: easy
type: fix
estimatedTime: 15
initialSituation: |
  A data processing application is deployed as a single pod.
  The pod starts successfully but after a few seconds gets killed.
  It enters a CrashLoopBackOff state and keeps restarting.
objective: |
  Fix the pod so it can run without being evicted.
  Understand why Kubernetes is killing the application.

objectives:
  - key: pod-running
    title: "Pod Ready"
    description: "The data-processor pod must be running and healthy"
    order: 1
    type: condition
    spec:
      target:
        kind: Pod
        labelSelector:
          app: data-processor
      checks:
        - type: Ready
          status: "True"

  - key: no-eviction
    title: "No Crash Events"
    description: "The pod should run stably without being killed"
    order: 2
    type: event
    spec:
      target:
        kind: Pod
        labelSelector:
          app: data-processor
      forbiddenReasons:
        - "Evicted"
        - "OOMKilled"
      sinceSeconds: 300

  - key: low-restarts
    title: "Stable Operation"
    description: "The pod must not restart excessively"
    order: 3
    type: status
    spec:
      target:
        kind: Pod
        labelSelector:
          app: data-processor
      checks:
        - field: "containerStatuses[0].restartCount"
          operator: "<"
          value: 3

Anti-Patterns

Don't reveal the solution in validation titles

# BAD
- key: memory-limit
  title: "Memory Limit Increased to 256Mi"

# GOOD
- key: stable-operation
  title: "Stable Operation"

Don't be too specific about implementation

# BAD
- key: probe-check
  title: "Liveness Probe Uses /healthz Endpoint"

# GOOD
- key: health-checks
  title: "Health Checks Pass"
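
Filled in with a type and spec, the GOOD variant could be sketched as a condition check. A Pod's Ready condition only turns True once its containers are running and passing any readiness probes, so the objective holds no matter which probe settings the user chose. The label value is a placeholder.

```yaml
# Sketch: outcome-based health check that does not mention probe settings.
- key: health-checks
  title: "Health Checks Pass"
  description: "The application must report itself healthy"
  type: condition
  spec:
    target:
      kind: Pod
      labelSelector:
        app: my-app
    checks:
      - type: Ready
        status: "True"
```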

Don't check implementation details

# BAD - Forces specific solution
- key: secret-volume
  title: "Secret Mounted at /etc/credentials"
  type: status
  spec:
    # Checks for specific volume mount path

# GOOD - Checks the app works
- key: authentication
  title: "Application Authenticated"
  type: log
  spec:
    target:
      kind: Pod
      labelSelector:
        app: my-app
    expectedStrings:
      - "Authentication successful"

How Validation Works

  1. User starts challenge -- CLI deploys manifests via an OCI artifact
  2. User works on the fix -- Modifies resources with kubectl
  3. User submits -- CLI loads objectives from challenge.yaml
  4. CLI executes validations -- Runs each check against the cluster
  5. Results sent to backend -- Backend verifies that all objectives are present
  6. XP awarded -- If all validations pass
