Runner Disconnected

Docker

1
Find the runner container
docker ps -a
Filter to just the runner:
docker ps -a | grep blink-runner
You should see output like:
CONTAINER ID   IMAGE                            COMMAND           CREATED       STATUS       NAMES
{container_id} blinkops/blink-runner:{version}  "./blink-runner"  X hours ago   Up X hours   blink-runner-{version}
2
Check the logs
docker logs {container_id}
3
Diagnose the error
Network communication to Blink is down or limited — likely due to a firewall. Refer to the outbound traffic documentation for the required firewall rules.
Verify Blink is up and running: Check the #p1-production-alert Slack channel for any active production alerts.
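To confirm the firewall theory quickly, you can test outbound HTTPS reachability from the host running the container. A minimal sketch; `app.blinkops.com` is an assumed endpoint, so substitute the URL(s) listed in the outbound traffic documentation:

```shell
# Assumed endpoint -- replace with your tenant's Blink URL.
BLINK_URL="${BLINK_URL:-https://app.blinkops.com}"

check_outbound() {
  # curl exits non-zero on DNS, connect, TLS, or timeout failures,
  # which is what a blocking firewall typically produces.
  if curl -sf --max-time 10 -o /dev/null "$1"; then
    echo "outbound OK: $1"
  else
    echo "outbound BLOCKED or unreachable: $1"
  fi
}

check_outbound "$BLINK_URL"
```

If this reports the endpoint as blocked while general internet access works, the firewall rules are the likely culprit.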

Kubernetes

1
Check pod status
kubectl get pods
2
Get runner logs
kubectl logs -f deployment/blink-runner
Note: This is valid only if you have a single blink-runner instance. If HPA (Horizontal Pod Autoscaling) is enabled, retrieve each pod’s logs individually:
kubectl get pods
kubectl logs -f {pod_name}
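The two per-pod commands above can be rolled into a single loop. A sketch, assuming the runner pods carry an `app=blink-runner` label (verify your actual labels with `kubectl get pods --show-labels`):

```shell
# Dump recent logs from every runner pod -- needed when HPA runs several replicas.
# The label selector app=blink-runner is an assumption; adjust to your deployment.
dump_runner_logs() {
  kubectl get pods -l app=blink-runner -o name 2>/dev/null |
  while read -r pod; do
    echo "=== ${pod} ==="
    kubectl logs "${pod}" --tail=200
  done
}

dump_runner_logs
```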
3
Diagnose the error
Network communication to Blink is down or limited — likely due to a firewall. Refer to the outbound traffic documentation.
Verify Blink is up: Check #p1-production-alert for active production alerts.

Runner Upgrade Failure

Docker

Note: Docker handles runner upgrades differently and does not use the blink-operator.
1
Get the runner logs
docker logs -f {container_id}
2
Look for upgrade indicators
Search for one of the following messages:
docker logs {container_id} | grep 'Creating new container with name'
docker logs {container_id} | grep 'Found existing container with name'
Expected messages:
  • Creating new container with name: blink-runner-{version} and configuration from container with id: {some-container-id}
  • Found existing container with name: blink-runner-{version} already running with id: {some-container-id}
3
Debug the failed runner
Using the {some-container-id} from the log output, locate and inspect the existing runner containers. Their logs contain the details of the failure; review them to determine whether you can resolve the issue or need to escalate to R&D.
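This inspection can be scripted. A sketch, where `REPLACE_WITH_CONTAINER_ID` stands in for the {some-container-id} captured from the step 2 log lines, and `debug_failed_runner` is a hypothetical helper name:

```shell
# REPLACE_WITH_CONTAINER_ID stands in for the {some-container-id} from step 2.
FAILED_ID="${FAILED_ID:-REPLACE_WITH_CONTAINER_ID}"

debug_failed_runner() {
  # Confirm the container exists and see its current state.
  docker ps -a --filter "id=$1"
  # The failure details live in the logs; the last lines usually name the cause.
  docker logs "$1" 2>&1 | tail -n 100
  # Exit code and OOM kills show up in the low-level inspect output.
  docker inspect --format \
    '{{.State.Status}} (exit {{.State.ExitCode}}, OOMKilled={{.State.OOMKilled}})' "$1"
}

# Guarded so the sketch is safe to run where the Docker CLI is absent.
if command -v docker >/dev/null 2>&1; then
  debug_failed_runner "$FAILED_ID" || echo "container ${FAILED_ID} not found"
fi
```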
4
Recovery steps
Try restarting the container:
docker container restart {container_id}
If that doesn’t work, perform a manual upgrade through the UI, then repeat this section from step 1 (Get the runner logs).

Kubernetes

1
Check pod status
kubectl get pods
2
Get runner logs
kubectl logs -f deployment/blink-runner
If HPA is enabled, check each pod individually:
kubectl get pods
kubectl logs -f {pod_name}
3
Check for the blink-operator
kubectl get pods | grep blink-operator
The operator will automatically roll back after 5 minutes if the upgrade fails. If you don’t see it running, check the runner logs for:
Operator has been deployed with version: {version}, Current Runner will shutdown once new runner is up and ready.
If you see this message and the runner is still running but the operator is no longer alive, the upgrade has failed.
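The failed-upgrade condition can be checked in one pass: search the runner logs for the hand-off message while confirming whether an operator pod is still alive. A sketch assuming the default blink-runner / blink-operator resource names; `check_upgrade_state` is a hypothetical helper:

```shell
check_upgrade_state() {
  # Did the runner log the operator hand-off message?
  kubectl logs deployment/blink-runner --tail=500 2>/dev/null |
    grep -q "Operator has been deployed with version" && handoff=yes || handoff=no

  # Is an operator pod still alive?
  kubectl get pods 2>/dev/null | grep -q blink-operator && operator=alive || operator=gone

  echo "handoff=${handoff} operator=${operator}"
  # handoff=yes with operator=gone while the runner still runs => upgrade failed.
}

check_upgrade_state
```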
4

Validate operator deployment

Debug before escalating to R&D
kubectl get deployment | grep blink-operator
  • Operator is running → Do nothing, wait for it to complete.
  • Operator exists but is not running → Wait one minute, then check again. If still not running, delete it:
kubectl delete deployment blink-operator-{version}
5

Restart the runner

kubectl rollout restart deployment blink-runner
6

Trigger a manual upgrade via the UI

In the Blink UI, navigate to the runner group, open the three-dot menu, and select Update.
7

Re-run the full process

Go through the entire Kubernetes upgrade process again and monitor the blink-operator logs closely. Based on those logs, determine whether you can resolve the issue with the customer or whether R&D escalation is needed.