
Kubernetes

The Kubernetes check queries Kubernetes resources, such as Pods, and evaluates the results against health, readiness, or custom test expressions.

kubernetes.yaml
apiVersion: canaries.flanksource.com/v1
kind: Canary
metadata:
  name: kube-system-checks
spec:
  schedule: "@every 5m"
  kubernetes:
    - name: kube-system
      kind: Pod
      healthy: true
      # resource:
      #   search: labels.app=test
      #   OR
      #   labelSelector: k8s-app=kube-dns
      namespaceSelector:
        name: kube-*,!*lease
        # name: "*"
      display:
        expr: |
          dyn(results).
          map(i, i.Object).
          filter(i, !k8s.isHealthy(i)).
          map(i, "%s/%s -> %s".format([i.metadata.namespace, i.metadata.name, k8s.getHealth(i).message])).join('\n')
      test:
        expr: dyn(results).all(x, k8s.isHealthy(x))
| Field | Description | Scheme |
| --- | --- | --- |
| kind* | Kubernetes object kind | string |
| name* | Name of the check, must be unique within the canary | string |
| cnrm | CNRM connection details | CNRM |
| connection | The connection url to use, mutually exclusive with kubeconfig | Connection |
| eks | EKS connection details | EKS |
| gke | GKE connection details | GKE |
| healthy | Fail the check if any resources are unhealthy | boolean |
| ignore | Ignore the specified resources from the fetched resources. Can be a glob pattern. | []glob |
| kubeconfig | Source for kubeconfig | EnvVar |
| namespace | Failing checks are placed in this namespace, useful if you have shared namespaces. NOTE: this does not change the namespace of the resources being queried | |
| namespaceSelector | Filters namespaces by name or labels | Resource Selector |
| ready | Fail the check if any resources are not ready | boolean |
| resource | Filters resources by name, namespace, or labels | Resource Selector |
| description | Description for the check | string |
| display | Expression to change the formatting of the display | Expression |
| icon | Icon for overwriting default icon on the dashboard | Icon |
| labels | Labels for check | [map[string]string] |
| markFailOnEmpty | If a transformation or datasource returns empty results, the check should fail | boolean |
| metrics | Metrics to export from | []Metrics |
| test | Evaluate whether a check is healthy | Expression |
| transform | Transform data from a check into multiple individual checks | Expression |
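
As a rough illustration of how several of the fields above might combine, the sketch below fails when any matching Pod is not ready, skips some resources through ignore, and fails when nothing matches at all. The canary name, check name, label values, and glob pattern are hypothetical, and the idea that ignore globs are matched against resource names is an assumption that the table does not confirm.

```yaml
apiVersion: canaries.flanksource.com/v1
kind: Canary
metadata:
  name: ingress-pods-ready # hypothetical name
spec:
  schedule: "@every 5m"
  kubernetes:
    - name: ingress-pods
      description: Ingress controller pods must be ready
      kind: Pod
      ready: true
      markFailOnEmpty: true # fail if the selectors return nothing
      ignore:
        - "*-debug-*" # assumed to be matched against resource names
      namespaceSelector:
        name: ingress-*
      resource:
        labelSelector: app=ingress-nginx
      labels:
        team: platform
```

Setting markFailOnEmpty alongside ready guards against a selector that silently matches nothing, which would otherwise make an all-resources-ready test pass trivially.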

Resource Selector

Resource Selectors are used throughout Mission Control for:

  • Creating relationships between configs and configs/components
  • Filtering resources in playbook triggers and actions
  • Selecting targets for health checks
  • Building dynamic views and dashboards
| Field | Description | Scheme | Required |
| --- | --- | --- | --- |
| id | Select resource by ID. Supports comma-separated values and wildcards (id=abc*,def*) | string | |
| name | Select resource by name. Supports comma-separated values and wildcards (name=*-prod,*-staging) | string | |
| namespace | Select resources in this namespace only. If empty, selects from all namespaces | string | |
| types | Select resources matching any of the specified types (e.g., Kubernetes::Pod, AWS::EC2::Instance) | []string | |
| statuses | Select resources matching any of the specified statuses | []string | |
| health | Select resources matching the specified health status. Supports multiple values separated by comma (healthy,warning) and negation (!unhealthy) | string | |
| scope | Limit selection to resources belonging to a specific parent. For configs this is the scraper id, for checks it's the canary, and for components it's the topology. Can be a UUID or namespace/name | string | |
| labelSelector | Kubernetes-style label selector. Supports =, ==, != operators and set-based selectors (key in (v1,v2), key notin (v1,v2), key, !key) | LabelSelector | |
| fieldSelector | Select resources by property fields using Kubernetes field selector syntax. Supports fields like owner, topology_id, parent_id for components | FieldSelector | |
| tagSelector | Select resources by tags using the same syntax as labelSelector. Tags are key-value pairs assigned during scraping | string | |
| agent | Select resources created on a specific agent. Accepts agent UUID, agent name, or special values: local (resources without an agent), self (alias for local), all (resources from any agent). Defaults to local | string | |
| cache | Cache settings for selector results. Useful for expensive or frequently-used selectors. Values: no-cache (bypass but allow caching), no-store (bypass and don't cache), max-age=<duration> (cache for duration) | string | |
| limit | Maximum number of resources to return | int | |
| includeDeleted | Include soft-deleted resources in results. Defaults to false | bool | |
| search | Full-text search across resource name, tags, and labels using parsing expression grammar. See Search | string | |
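
A selector can set several of these fields at once. The fragment below is a sketch with hypothetical values; in particular, the 10m duration format for max-age is an assumption, since the table only specifies max-age=<duration>.

```yaml
# Hypothetical selector: up to 50 unhealthy or warning pods from any agent
types:
  - Kubernetes::Pod
health: unhealthy,warning
agent: all # the default is local
limit: 50
includeDeleted: false # the default, shown for clarity
cache: max-age=10m # duration format assumed
```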

Wildcards and Negation

The name, id, types, statuses, and health fields support:

  • Prefix matching: name=prod-* matches names starting with prod-
  • Suffix matching: name=*-backend matches names ending with -backend
  • Negation: health=!unhealthy excludes unhealthy resources
  • Multiple values: types=Kubernetes::Pod,Kubernetes::Deployment matches either type

Search

The search field provides a query language for filtering resources.

Syntax

field1=value1 field2>value2 field3=value3* field4=*value4

Multiple conditions are combined with AND logic. Use | to combine conditions with OR logic.

type=Kubernetes::Pod | type=Kubernetes::Deployment

Parentheses can be used for grouping:

(type=Kubernetes::Pod | type=Kubernetes::Deployment) health=unhealthy

A bare word (without a field name) is treated as a name prefix search:

nginx

is equivalent to name=nginx*.
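
The rules above (implicit AND, | for OR, parentheses, and bare words) can be combined in one query. The fragments below are a sketch with illustrative values; each search line is a separate alternative, not one document with repeated keys.

```yaml
# (Pod OR Deployment) AND unhealthy AND name starting with "api-"
search: (type=Kubernetes::Pod | type=Kubernetes::Deployment) health=unhealthy name=api-*

# Bare word combined with a field condition:
# name starting with "nginx" AND health is healthy
search: nginx health=healthy
```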

Operators

| Operator | Example | Description | Types |
| --- | --- | --- | --- |
| = | status=healthy | Equals (exact match or wildcard) | string int json |
| != | health!=unhealthy | Not equals | string int json |
| =* | name=*-prod or name=api-* | Prefix or suffix match | string int |
| > | created_at>now-24h | Greater than | datetime int |
| < | updated_at<2025-01-01 | Less than | datetime int |
| >= | cost_total_30d>=100 | Greater than or equal | datetime int |
| <= | cost_total_30d<=1000 | Less than or equal | datetime int |

Existence Checks

Check whether a label, tag, or property key exists (or does not exist):

labels.app          # matches resources that have the label "app"
!labels.app         # matches resources that do NOT have the label "app"
tags.environment    # matches resources that have the tag "environment"
!tags.environment   # matches resources that do NOT have the tag "environment"
properties.cpu      # matches resources with a "cpu" property
!properties.cpu     # matches resources without a "cpu" property

Date Queries

  • Absolute dates: 2025-01-15, 2025-01-15T10:30:00Z
  • Relative dates: now-24h, now-7d, now+1w
  • Supported units: s (seconds), m (minutes), h (hours), d (days), w (weeks), y (years)
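
Relative and absolute dates can be mixed with the comparison operators from the table above. The queries below are a sketch with illustrative values; each search line is a separate alternative.

```yaml
# Pods created within the last 7 days that are not healthy
search: type=Kubernetes::Pod created_at>now-7d health!=healthy

# Deployments updated during January 2025 (absolute dates)
search: type=Kubernetes::Deployment updated_at>=2025-01-01 updated_at<2025-02-01
```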

JSON Field Access

Access nested fields in labels, tags, and config using dot notation:

labels.app=nginx
tags.env=production
config.spec.replicas>3

Type Prefix Matching

The type field supports partial matching — a search for a short type name will match against all known types by suffix. For example:

type=pod          # matches Kubernetes::Pod
type=Deployment   # matches Kubernetes::Deployment
type=EC2          # matches AWS::EC2::Instance

Common Fields

These fields are available in any search query regardless of resource type:

| Field | Description |
| --- | --- |
| limit | Limit the number of results returned |
| sort | Sort results by a column. Prefix with - for descending order |
| offset | Skip the first N results |
| @order | Alias for sort |
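
These fields can be used for simple paging. The query below is a sketch: the key=value form for sort, limit, and offset is assumed by analogy with the @order=-name example in the Examples section, and the numbers are illustrative.

```yaml
# The 10 most recently created pods, skipping the first 20 results
search: type=Kubernetes::Pod sort=-created_at limit=10 offset=20
```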

Searchable Fields

Catalog Items (Configs)

| Field | Type | Description |
| --- | --- | --- |
| name | string | Resource name |
| namespace | string | Kubernetes namespace (alias for tags.namespace) |
| type | string | Resource type (e.g., Kubernetes::Pod). Aliases: config_type |
| config_class | string | Config class (e.g., Deployment, Node) |
| status | string | Current status |
| health | string | Health status |
| source | string | Source identifier |
| external_id | string | External identifier |
| agent | string | Agent name or ID (alias for agent_id) |
| labels.<key> | json | Kubernetes-style labels (dot notation) |
| tags.<key> | json | Scraper-assigned tags (dot notation) |
| config.<path> | json | Full configuration data (JSON path) |
| properties.<key> | json | Resource properties |
| cost_per_minute | float | Cost per minute |
| cost_total_1d | float | Total cost over 1 day |
| cost_total_7d | float | Total cost over 7 days |
| cost_total_30d | float | Total cost over 30 days |
| created_at | datetime | Creation timestamp. Aliases: created |
| updated_at | datetime | Last update timestamp. Aliases: updated |
| deleted_at | datetime | Soft deletion timestamp. Aliases: deleted |
| related | special | Find configs related to a given config ID. See Related Configs |

Config Changes

| Field | Type | Description |
| --- | --- | --- |
| id | string | Change ID |
| config_id | string | Parent config ID |
| name | string | Change name |
| type | string | Config type |
| change_type | string | Type of change (e.g., diff, event). Aliases: changeType |
| severity | string | Change severity |
| summary | string | Change summary |
| count | int | Occurrence count |
| agent_id | string | Agent ID. Aliases: agent |
| tags.<key> | json | Change tags (dot notation) |
| details.<path> | json | Additional details (JSON path) |
| created_at | datetime | Change timestamp. Aliases: created |
| first_observed | datetime | First observation time |

Components

| Field | Type | Description |
| --- | --- | --- |
| name | string | Component name |
| namespace | string | Component namespace |
| topology_id | string | Parent topology ID |
| type | string | Component type. Aliases: component_type |
| status | string | Current status |
| health | string | Health status |
| agent_id | string | Agent ID. Aliases: agent |
| labels.<key> | json | Component labels (dot notation) |
| summary.<key> | json | Component summary (JSON path) |
| properties.<key> | json | Component properties |
| created_at | datetime | Creation timestamp. Aliases: created |
| updated_at | datetime | Last update timestamp. Aliases: updated |
| deleted_at | datetime | Deletion timestamp. Aliases: deleted |
| component_config_traverse | special | Find components related to a config via config relationships. Value: <componentID>[,<direction>] |

Canaries

| Field | Type | Description |
| --- | --- | --- |
| id | string | Canary ID |
| name | string | Canary name |
| namespace | string | Canary namespace |
| agent_id | string | Agent ID. Aliases: agent |
| labels.<key> | json | Canary labels (dot notation) |
| spec.<path> | json | Canary spec (JSON path) |
| created_at | datetime | Creation timestamp. Aliases: created |
| updated_at | datetime | Last update timestamp. Aliases: updated |
| deleted_at | datetime | Deletion timestamp. Aliases: deleted |

Playbooks

| Field | Type | Description |
| --- | --- | --- |
| id | string | Playbook ID |
| name | string | Playbook name |
| namespace | string | Playbook namespace |
| created_at | datetime | Creation timestamp. Aliases: created |
| updated_at | datetime | Last update timestamp. Aliases: updated |
| deleted_at | datetime | Deletion timestamp. Aliases: deleted |

Connections

| Field | Type | Description |
| --- | --- | --- |
| id | string | Connection ID |
| name | string | Connection name |
| namespace | string | Connection namespace |
| type | string | Connection type |

Related Configs

The related field on catalog items finds configs related to a given config through relationship traversal.

Syntax: related=<config_id>[,direction=<direction>][,depth=<depth>][,type=<type>]

| Parameter | Values | Default | Description |
| --- | --- | --- | --- |
| direction | incoming, outgoing, all | all | Relationship direction |
| depth | integer | 5 | Maximum traversal depth |
| type | hard, soft, both | both | Relationship type |

# Find all configs related to a specific config
search: related=3b1a2c4d-5e6f-7a8b-9c0d-1e2f3a4b5c6d

# Find outgoing hard relationships only, max 3 hops
search: related=3b1a2c4d-5e6f-7a8b-9c0d-1e2f3a4b5c6d,direction=outgoing,depth=3,type=hard

Examples

Basic Selection

# Select by exact name
name: my-deployment

# Select by ID
id: 3b1a2c4d-5e6f-7a8b-9c0d-1e2f3a4b5c6d

# Select all pods in a namespace
types:
- Kubernetes::Pod
namespace: production

Using Wildcards

# Select all resources with names starting with "prod-"
name: prod-*

# Select all AWS resources
types:
- AWS::*

# Select resources ending with "-backend"
name: "*-backend"

Label and Tag Selectors

# Select by labels (Kubernetes-style)
labelSelector: app=nginx,env in (prod,staging)

# Select by tags
tagSelector: team=platform,cost-center!=shared

# Combine both
labelSelector: app=api
tagSelector: environment=production

Health and Status Filtering

# Select only healthy resources
health: healthy

# Exclude unhealthy resources
health: "!unhealthy"

# Select resources with specific statuses
statuses:
- Running
- Pending

Search Queries

# Find all Kubernetes namespaces starting with "kube"
search: type=Kubernetes::Namespace name=kube*

# Find unhealthy AWS EC2 instances
search: type=AWS::EC2::Instance health=unhealthy

# Find configs created in the last 24 hours
search: created_at>now-24h

# Find nginx pods with specific tags
search: type=Kubernetes::Pod labels.app=nginx tags.cluster=prod

# Complex query with date range
search: updated_at>2025-01-01 updated_at<2025-01-31 type=Kubernetes::Deployment

# Find high-cost resources
search: cost_total_30d>1000

# Find resources by partial type (matches Kubernetes::Pod)
search: type=pod

# Use OR to match multiple types
search: type=Kubernetes::Pod | type=Kubernetes::Deployment

# Check for existence of a label
search: labels.app

# Sort results by name descending
search: type=Kubernetes::Pod @order=-name

# Find configs related to another config
search: related=abc-123,direction=outgoing,type=hard

Multi-Agent Selection

# Select from a specific agent
agent: production-cluster

# Select from all agents
agent: all

# Select only local (agentless) resources
agent: local

Scoped Selection

# Select configs from a specific scraper
scope: namespace/my-scraper

# Select checks from a specific canary
scope: canary-uuid-here
catalog-pod-check.yaml
apiVersion: canaries.flanksource.com/v1
kind: Canary
metadata:
  name: k8s-checks
spec:
  schedule: '@every 30s'
  kubernetes:
    - name: notification-pod-health-check
      selector:
        - labelSelector: 'kubernetes.io/app=notification-listener'
          types:
            - Kubernetes::Pod
      test:
        expr: dyn(results).all(x, k8s.isHealthy(x))

Healthy

Using healthy: true is functionally equivalent to:

test:
  expr: dyn(results).all(x, k8s.isHealthy(x))
kubernetes-healthy.yaml
apiVersion: canaries.flanksource.com/v1
kind: Canary
metadata:
  name: kube-system-checks
spec:
  interval: 30
  kubernetes:
    - namespace: kube-system
      name: kube-system
      kind: Pod
      healthy: true
      resource:
        labelSelector: k8s-app=kube-dns
      namespaceSelector:
        name: kube-system

See the CEL function k8s.isHealthy for more details.

Ready

Similar to the healthy flag, there is also a ready flag, which is functionally equivalent to the following test expression:

dyn(results).all(x, k8s.isReady(x))

Checking for certificate readiness
cert-manager.yaml
apiVersion: canaries.flanksource.com/v1
kind: Canary
metadata:
  name: cert-manager
spec:
  schedule: "@every 15m"
  kubernetes:
    - name: cert-manager-check
      kind: Certificate
      test:
        expr: |
          dyn(results).
          map(i, i.Object).
          filter(i, i.status.conditions[0].status != "True").size() == 0
      display:
        expr: |
          dyn(results).
          map(i, i.Object).
          filter(i, i.status.conditions[0].status != "True").
          map(i, "%s/%s -> %s".format([i.metadata.namespace, i.metadata.name, i.status.conditions[0].message])).join('\n')
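
The expressions above inspect only the first entry in status.conditions. As a possible variant, the test below selects the condition by type instead of by position, and treats a Certificate with no conditions yet as failing. It is an untested sketch that assumes the Certificate reports a condition whose type is "Ready"; it would replace the test block of the check above.

```yaml
test:
  expr: |
    dyn(results).
      map(i, i.Object).
      all(i, has(i.status) && has(i.status.conditions) &&
        i.status.conditions.exists(c, c.type == "Ready" && c.status == "True"))
```

Selecting by type avoids depending on the order in which a controller writes its conditions.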

Remote clusters

A single canary-checker instance can connect to any number of remote clusters via custom kubeconfig. Either the kubeconfig itself or the path to the kubeconfig can be provided.

Kubeconfig from Kubernetes secret

remote-cluster.yaml
---
apiVersion: canaries.flanksource.com/v1
kind: Canary
metadata:
  name: pod-access-check
spec:
  schedule: "@every 5m"
  kubernetes:
    - name: pod access on aws cluster
      namespace: default
      description: "deploy httpbin"
      kubeconfig:
        valueFrom:
          secretKeyRef:
            name: aws-kubeconfig
            key: kubeconfig
      kind: Pod
      ready: true
      namespaceSelector:
        name: default

Kubeconfig inline

remote-cluster.yaml
apiVersion: canaries.flanksource.com/v1
kind: Canary
metadata:
  name: pod-access-check
spec:
  schedule: "@every 5m"
  kubernetes:
    - name: pod access on aws cluster
      namespace: default
      kubeconfig:
        value: |
          apiVersion: v1
          clusters:
            - cluster:
                certificate-authority-data: xxxxx
                server: https://xxxxx.sk1.eu-west-1.eks.amazonaws.com
              name: arn:aws:eks:eu-west-1:765618022540:cluster/aws-cluster
          contexts:
            - context:
                cluster: arn:aws:eks:eu-west-1:765618022540:cluster/aws-cluster
                namespace: mission-control
                user: arn:aws:eks:eu-west-1:765618022540:cluster/aws-cluster
              name: arn:aws:eks:eu-west-1:765618022540:cluster/aws-cluster
          current-context: arn:aws:eks:eu-west-1:765618022540:cluster/aws-cluster
          kind: Config
          preferences: {}
          users:
            - name: arn:aws:eks:eu-west-1:765618022540:cluster/aws-cluster
              user:
                exec:
                  ....
      kind: Pod
      ready: true
      namespaceSelector:
        name: default

Kubeconfig from local filesystem

remote-cluster.yaml
---
apiVersion: canaries.flanksource.com/v1
kind: Canary
metadata:
  name: pod-access-check
spec:
  schedule: "@every 5m"
  kubernetes:
    - name: pod access on aws cluster
      namespace: default
      kubeconfig:
        value: /root/.kube/aws-kubeconfig
      kind: Pod
      ready: true
      namespaceSelector:
        name: default