
# Blink Runner Troubleshooting

> This guide covers the two common runner notifications, **Runner Disconnected** and **Runner Upgrade Failure**, and how to resolve each.

## Runner Disconnected

### Docker

<Steps>
  <Step>
    **Find the runner container**

    ```bash theme={"dark"}
    docker ps -a
    ```

    **Filter to just the runner:**

    ```bash theme={"dark"}
    docker ps -a | grep blink-runner
    ```

    **You should see output like:**

    ```bash theme={"dark"}
    CONTAINER ID   IMAGE                            COMMAND           CREATED        STATUS        NAMES
    {container_id} blinkops/blink-runner:{version}  "./blink-runner"  ~ hours ago    Up ~ hours    blink-runner~~~~~
    ```
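
    If the host runs many containers, the filter and format flags of `docker ps` can narrow the output to the fields used in the next steps. A minimal sketch, assuming the container name contains `blink-runner` as in the sample output above; `find_runner` is a hypothetical helper name:

    ```bash theme={"dark"}
    # Hypothetical helper: list ID, status, and name of every container
    # (running or stopped) whose name contains "blink-runner".
    find_runner() {
      docker ps -a --filter "name=blink-runner" --format '{{.ID}} {{.Status}} {{.Names}}'
    }

    # Usage:
    #   find_runner
    ```

    A status that starts with `Exited` or `Restarting` instead of `Up` is a first hint that the runner process itself is failing.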
  </Step>

  <Step>
    **Check the logs**

    ```bash theme={"dark"}
    docker logs {container_id}
    ```
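
    On a long-running container the full log can be very large. A minimal sketch that limits the output to a recent window; the one-hour window and 500-line cap are arbitrary assumptions, and `recent_runner_logs` is a hypothetical helper name:

    ```bash theme={"dark"}
    # Hypothetical helper: print timestamped log lines from the last hour,
    # capped at 500 lines. $1 = container ID from the previous step.
    recent_runner_logs() {
      # docker logs replays the container's stderr on stderr, so merge both streams.
      docker logs --since 1h --tail 500 --timestamps "$1" 2>&1
    }

    # Usage:
    #   recent_runner_logs {container_id}
    ```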
  </Step>

  <Step>
    **Diagnose the error**

    <Tabs>
      <Tab title="Timeout">
        Network communication to Blink is down or limited — likely due to a firewall.
        Refer to the [outbound traffic documentation](https://www.docs.blinkops.com/docs/blink-platform/runners#outbound-traffic) for the required firewall rules.

        > Verify Blink is up and running:
        >
        > * [https://app.blinkops.com/api/stats](https://app.blinkops.com/api/stats)
        > * [https://status.blinkops.com](https://status.blinkops.com)
        >
        > Check the `#p1-production-alert` Slack channel for any active production alerts.
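
        To tell a local firewall problem apart from a Blink outage, you can request the stats URL above from the host that runs the runner. A minimal sketch, assuming `curl` is installed on that host; `check_blink_reachable` is a hypothetical helper name:

        ```bash theme={"dark"}
        # Hypothetical helper: print the HTTP status code returned by a URL.
        # curl prints 000 when no response arrives (blocked, DNS failure, or
        # the 10-second timeout expires).
        check_blink_reachable() {
          curl -sS -o /dev/null -w '%{http_code}' --max-time 10 "$1"
        }

        # Usage, run from the runner host:
        #   check_blink_reachable https://app.blinkops.com/api/stats
        ```

        If the same request succeeds from a machine outside the customer network but fails on the runner host, the firewall rules are the likely cause.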
      </Tab>

      <Tab title="Runner Unauthorized">
        **Is the runner deleted in Blink?**

        * **Yes**: That's the issue. Re-register the runner.
        * **No**: Restart the runner:

          ```bash theme={"dark"}
          docker restart {container_id}
          ```

        If the runner is still not connected, escalate to R\&D for further investigation.
      </Tab>
    </Tabs>
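
    To decide which tab applies, you can search the logs for the two error types. A minimal sketch; the search patterns are assumptions about the log wording rather than confirmed runner messages, and `classify_runner_error` is a hypothetical helper name:

    ```bash theme={"dark"}
    # Hypothetical helper: read runner logs on stdin and name the matching tab.
    # The patterns are assumptions; adjust them to the messages you actually see.
    classify_runner_error() {
      logs="$(cat)"
      if printf '%s\n' "$logs" | grep -qi 'unauthorized'; then
        echo "Runner Unauthorized"
      elif printf '%s\n' "$logs" | grep -qi 'timeout'; then
        echo "Timeout"
      else
        echo "unknown"
      fi
    }

    # Usage:
    #   docker logs {container_id} 2>&1 | classify_runner_error
    ```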
  </Step>
</Steps>

***

### Kubernetes

<Steps>
  <Step>
    **Check pod status**

    ```bash theme={"dark"}
    kubectl get pods
    ```
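
    In a busy namespace it can help to list only the runner pods that are not healthy. A minimal sketch, assuming the pod names contain `blink-runner`; `unhealthy_runner_pods` is a hypothetical helper name:

    ```bash theme={"dark"}
    # Hypothetical helper: read `kubectl get pods --no-headers` output on stdin
    # and print the name and status of runner pods whose STATUS is not "Running".
    unhealthy_runner_pods() {
      awk '$1 ~ /blink-runner/ && $3 != "Running" { print $1, $3 }'
    }

    # Usage:
    #   kubectl get pods --no-headers | unhealthy_runner_pods
    ```

    Empty output means every runner pod reports `Running`. A pod can still be unready in that state, so also check the READY column.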
  </Step>

  <Step>
    **Get runner logs**

    ```bash theme={"dark"}
    kubectl logs -f deployment/blink-runner
    ```

    <Note>
      **Note:** This works only if you have a single `blink-runner` pod. `kubectl logs` on a deployment reads from one pod only, so if a Horizontal Pod Autoscaler (HPA) has scaled the deployment, retrieve each pod's logs individually:

      ```bash theme={"dark"}
      kubectl get pods
      kubectl logs -f {pod_name}
      ```
    </Note>
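
    With several runner pods, the per-pod commands can be wrapped in a loop. A minimal sketch, assuming the pod names contain `blink-runner`; `all_runner_logs` is a hypothetical helper name and the 100-line tail is an arbitrary choice:

    ```bash theme={"dark"}
    # Hypothetical helper: print the last 100 log lines of every pod whose
    # name contains "blink-runner", each preceded by a header line.
    all_runner_logs() {
      kubectl get pods -o name | grep blink-runner | while read -r pod; do
        echo "== $pod =="
        kubectl logs --tail=100 "$pod"
      done
    }

    # Usage:
    #   all_runner_logs
    ```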
  </Step>

  <Step>
    **Diagnose the error**

    <Tabs>
      <Tab title="Timeout">
        Network communication to Blink is down or limited — likely due to a firewall.
        Refer to the [outbound traffic documentation](https://www.docs.blinkops.com/docs/blink-platform/runners#outbound-traffic).

        > Verify Blink is up:
        >
        > * [https://app.blinkops.com/api/stats](https://app.blinkops.com/api/stats)
        > * [https://status.blinkops.com](https://status.blinkops.com)
        >
        > Check `#p1-production-alert` for active production alerts.
      </Tab>

      <Tab title="Runner Unauthorized">
        **Is the runner deleted in Blink?**

        * **Yes**: That's the issue. Re-register the runner.
        * **No**: Restart the runner:

          ```bash theme={"dark"}
          kubectl rollout restart deployment blink-runner
          ```
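
        A rollout restart returns immediately, so it is worth confirming that the new pods actually came up. A minimal sketch using `kubectl rollout status`; the two-minute timeout is an arbitrary assumption and `wait_for_runner_rollout` is a hypothetical helper name:

        ```bash theme={"dark"}
        # Hypothetical helper: wait up to two minutes for the restarted
        # deployment to finish rolling out, and report the result.
        wait_for_runner_rollout() {
          if kubectl rollout status deployment/blink-runner --timeout=120s; then
            echo "rollout complete"
          else
            echo "rollout did not complete"
            return 1
          fi
        }

        # Usage:
        #   wait_for_runner_rollout
        ```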

        If the runner is still not connected, escalate to R\&D for further investigation.
      </Tab>
    </Tabs>
  </Step>
</Steps>

***

## Runner Upgrade Failure

### Docker

<Note> **Note:** Docker handles runner upgrades differently and does not use the `blink-operator`. </Note>

<Steps>
  <Step>
    **Get the runner logs**

    ```bash theme={"dark"}
    docker logs -f {container_id}
    ```
  </Step>

  <Step>
    **Look for upgrade indicators**

    Search for one of the following messages:

    ```bash theme={"dark"}
    # docker logs replays the container's stderr on stderr, so merge it into the pipe
    docker logs {container_id} 2>&1 | grep 'Creating new container with name'
    docker logs {container_id} 2>&1 | grep 'Found existing container with name'
    ```

    Expected messages:

    * `Creating new container with name: blink-runner-{version} and configuration from container with id: {some-container-id}`
    * `Found existing container with name: blink-runner-{version} already running with id: {some-container-id}`
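
    Both messages end with the container ID needed for the next step. A minimal sketch that pulls the ID out of either message, assuming the log lines match the formats listed above; `upgrade_container_ids` is a hypothetical helper name:

    ```bash theme={"dark"}
    # Hypothetical helper: read runner logs on stdin and print the container ID
    # that follows the last "with id: " in either upgrade message.
    upgrade_container_ids() {
      grep -E '(Creating new|Found existing) container with name' |
        sed -E 's/.* with id: ([^ ]+).*/\1/'
    }

    # Usage:
    #   docker logs {container_id} 2>&1 | upgrade_container_ids
    ```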
  </Step>

  <Step>
    **Debug the failed runner**

    Using the `{some-container-id}` from the log output, locate and inspect the existing runner containers. Their logs will contain the details of the failure — review them to determine whether you can resolve the issue or need to escalate to R\&D.
  </Step>

  <Step>
    **Recovery steps**

    Try restarting the container:

    ```bash theme={"dark"}
    docker container restart {container_id}
    ```
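
    To see whether the restart worked, you can ask Docker for the container state. A minimal sketch; `runner_state` is a hypothetical helper name:

    ```bash theme={"dark"}
    # Hypothetical helper: print the container state reported by Docker,
    # for example "running", "restarting", or "exited".
    # $1 = container ID.
    runner_state() {
      docker inspect --format '{{.State.Status}}' "$1"
    }

    # Usage:
    #   runner_state {container_id}
    ```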

    If that doesn't work, perform a **manual upgrade through the UI**, then repeat these steps starting from **Get the runner logs**.
  </Step>
</Steps>

***

### Kubernetes

<Steps>
  <Step>
    **Check pod status**

    ```bash theme={"dark"}
    kubectl get pods
    ```
  </Step>

  <Step>
    **Get runner logs**

    ```bash theme={"dark"}
    kubectl logs -f deployment/blink-runner
    ```

    > If HPA is enabled, check each pod individually:
    >
    > ```bash theme={"dark"}
    > kubectl get pods
    > kubectl logs -f {pod_name}
    > ```
  </Step>

  <Step>
    **Check for the blink-operator**

    ```bash theme={"dark"}
    kubectl get pods | grep blink-operator
    ```

    > The operator will automatically roll back after **5 minutes** if the upgrade fails. If you don't see it running, check the runner logs for:
    >
    > ```
    > Operator has been deployed with version: {version}, Current Runner will shutdown once new runner is up and ready.
    > ```
    >
    > If you see this message and the runner is still running but the operator is no longer alive, the upgrade has failed.
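
    The message also names the operator version, which the later `kubectl delete deployment blink-operator-{version}` command needs. A minimal sketch that extracts it, assuming the log line matches the format shown above; `operator_version_from_logs` is a hypothetical helper name:

    ```bash theme={"dark"}
    # Hypothetical helper: read runner logs on stdin and print the operator
    # version named in the "Operator has been deployed" message, if present.
    operator_version_from_logs() {
      sed -n -E 's/.*Operator has been deployed with version: ([^,]+),.*/\1/p'
    }

    # Usage:
    #   kubectl logs deployment/blink-runner | operator_version_from_logs
    ```

    No output means the message was not found in the logs that were read.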
  </Step>

  <Step title="Validate operator deployment">
    **Debug before escalating to R\&D**

    ```bash theme={"dark"}
    kubectl get deployment | grep blink-operator
    ```
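
    The READY column of that output shows whether the operator is running. A minimal sketch that reads it, assuming the deployment name starts with `blink-operator`; `operator_state` is a hypothetical helper name:

    ```bash theme={"dark"}
    # Hypothetical helper: read `kubectl get deployment --no-headers` on stdin
    # and report the blink-operator deployment's state from the READY column.
    operator_state() {
      awk '
        $1 ~ /^blink-operator/ {
          found = 1
          split($2, r, "/")   # READY looks like "1/1" (ready/desired)
          if (r[1] == r[2] && r[2] > 0) print $1, "running"
          else print $1, "not running"
        }
        END { if (!found) print "no operator deployment" }
      '
    }

    # Usage:
    #   kubectl get deployment --no-headers | operator_state
    ```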

    * **Operator is running** → Do nothing, wait for it to complete.
    * **Operator exists but is not running** → Wait one minute, then check again. If it is still not running, delete it:

      ```bash theme={"dark"}
      kubectl delete deployment blink-operator-{version}
      ```
  </Step>

  <Step title="Restart the runner">
    ```bash theme={"dark"}
    kubectl rollout restart deployment blink-runner
    ```
  </Step>

  <Step title="Trigger a manual upgrade via the UI">
    In the Blink UI, navigate to the runner group, click the **3-dot menu**, and select **Update**.
  </Step>

  <Step title="Re-run the full process">
    Go through the entire Kubernetes upgrade process again and monitor the `blink-operator` logs closely.
    Based on those logs, determine whether you can resolve the issue with the customer or whether R\&D escalation is needed.
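
    One way to monitor the operator is to stream its logs while the upgrade runs. A minimal sketch, assuming the operator deployment follows the `blink-operator-{version}` naming used in the delete command earlier; `follow_operator_logs` is a hypothetical helper name:

    ```bash theme={"dark"}
    # Hypothetical helper: stream the operator's logs until interrupted.
    # $1 = operator version, as it appears in the deployment name.
    follow_operator_logs() {
      kubectl logs -f "deployment/blink-operator-$1"
    }

    # Usage:
    #   follow_operator_logs {version}
    ```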
  </Step>
</Steps>
