GitHub 3.14.1 compatibility issue - unable to register runners #3767

Closed
4 tasks done
kb-med opened this issue Oct 8, 2024 · 6 comments
Labels: bug, gha-runner-scale-set

Comments

@kb-med

kb-med commented Oct 8, 2024

Checks

Controller Version

0.9.3

Deployment Method

Helm

Checks

  • This isn't a question or user support case (For Q&A and community support, go to Discussions).
  • I've read the Changelog before submitting this issue and I'm sure it's not due to any recently-introduced backward-incompatible changes

To Reproduce

1. Create a GitHub Enterprise instance with version 3.14.1
2. Install gha-runner-scale-set-controller-0.9.3
3. Generate a token on GitHub for authentication
4. Configure a secret with the token
5. Install gha-runner-scale-set-0.9.3
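
For readers who want to replay steps 2, 4 and 5, they roughly correspond to the commands below. This is a sketch, not the exact commands used in this report: the release names `arc` and `def-runner-set`, the namespaces passed as arguments, and the `github_token` secret key are assumptions taken from the public ARC quickstart and from the names visible in the logs. The chart location is the public OCI registry, whereas the report uses an internal image mirror.

```shell
#!/bin/sh
# Reproduction steps wrapped in functions; nothing runs until one is called.
CHART_BASE="oci://ghcr.io/actions/actions-runner-controller-charts"
ARC_VERSION="0.9.3"

# Step 2: install the controller chart.
# $1 = controller namespace (assumed, e.g. arc-systems)
install_controller() {
  helm install arc \
    --namespace "$1" --create-namespace \
    --version "$ARC_VERSION" \
    "$CHART_BASE/gha-runner-scale-set-controller"
}

# Step 4: store the token in the secret named by githubConfigSecret.
# $1 = runner namespace, $2 = GitHub token
create_token_secret() {
  kubectl create secret generic gha-token \
    --namespace "$1" \
    --from-literal=github_token="$2"
}

# Step 5: install the runner scale set with a values file.
# $1 = runner namespace, $2 = path to the runner set values file
install_scale_set() {
  helm install def-runner-set \
    --namespace "$1" --create-namespace \
    --version "$ARC_VERSION" \
    -f "$2" \
    "$CHART_BASE/gha-runner-scale-set"
}
```

Example call order: `install_controller arc-systems`, then `create_token_secret eks-arc-dev "$TOKEN"`, then `install_scale_set eks-arc-dev runner-values.yaml`.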

Describe the bug

The ARC controller cannot register runners with the GitHub Enterprise instance and fails with the errors below.

On the existing instance we saw these errors:

2024-10-07T08:07:29Z INFO listener-app app initialized
2024-10-07T08:07:29Z INFO listener-app Starting listener
2024-10-07T08:07:29Z INFO listener-app refreshing token {"githubConfigUrl": "https://github.enterprise.com/team"}
2024-10-07T08:07:29Z INFO listener-app getting runner registration token {"registrationTokenURL": "https://github.enterprise.com/api/v3/orgs/team/actions/runners/registration-token"}
2024-10-07T08:07:30Z INFO listener-app getting Actions tenant URL and JWT {"registrationURL": "https://github.enterprise.com/api/v3/actions/runner-registration"}
2024/10/07 08:07:30 Application returned an error: createSession failed: failed to create session: actions error: StatusCode 404, AcivityId "5be3369d-4ae3-479b-8a85-489c7e9c2986": GitHub.Actions.Runtime.WebApi.RunnerScaleSetNotFoundException, GitHub.Actions.Runtime.WebApi: No runner scale set found with identifier 12.

After reinstalling, we now see this:

2024-10-08T09:22:25Z INFO EphemeralRunner Creating new ephemeral runner registration and updating status with runner config {"version": "0.9.3", "ephemeralrunner": {"name":"def-runner-set-6pqwm-runner-vbm68","namespace":"eks-arc-dev"}}
2024-10-08T09:22:25Z INFO EphemeralRunner Creating ephemeral runner JIT config {"version": "0.9.3", "ephemeralrunner": {"name":"def-runner-set-6pqwm-runner-vbm68","namespace":"eks-arc-dev"}}
2024-10-08T09:22:25Z INFO actions-clients retrieve actions client {"githubConfigURL": "https://github.enterprise.com/team", "namespace": "eks-arc-dev"}
2024-10-08T09:22:25Z INFO actions-clients using cache client {"githubConfigURL": "https://github.enterprise.com/team", "namespace": "eks-arc-dev"}
2024-10-08T09:22:25Z ERROR Reconciler error {"controller": "ephemeralrunner", "controllerGroup": "actions.github.com", "controllerKind": "EphemeralRunner", "EphemeralRunner": {"name":"def-runner-set-6pqwm-runner-vbm68","namespace":"eks-arc-dev"}, "namespace": "eks-arc-dev", "name": "def-runner-set-6pqwm-runner-vbm68", "reconcileID": "2a8ac671-e305-4c32-9395-0a8305e6b86d", "error": "failed to generate JIT config with Actions service error: actions error: StatusCode 404, AcivityId \"5b380ee9-4ae3-479b-8a85-489c7e9c2986\": GitHub.Actions.Runtime.WebApi.RunnerScaleSetNotFoundException, GitHub.Actions.Runtime.WebApi: No runner scale set found with identifier 12."}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:329
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
	sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:227

Describe the expected behavior

Successful registration of dynamic ARC Runners

Additional Context

#Controller Values
#Controller pod only requires changes to image hosting and image pull secret
image:
  repository: registry.internal.com/team/tools/gha-runner-scale-set-controller
imagePullSecrets: 
  - name: img-secret

--------------------------
#Runners set values
#Setup of github repo/space
githubConfigUrl: "https://github.enterprise.com/team"
githubConfigSecret: gha-token

#Minimum and maximum number of runners.
#The minimum can be 0, but that increases the wait time per run because a pod has to be spun up for every pipeline run
#The maximum limits how many runners can run at the same time
minRunners: 2
maxRunners: 5

#Worker pod reconfiguration to use internally hosted image and our image pull secret
template:
  spec:
    containers:
      - name: runner
        image: registry.internal.com/team/tools/actions-runner:1.1
        command: ["/home/runner/run.sh"]
    imagePullSecrets:
    - name: img-secret

Controller Logs

`2024-10-07T08:07:29Z	INFO	listener-app	app initialized
2024-10-07T08:07:29Z	INFO	listener-app	Starting listener
2024-10-07T08:07:29Z	INFO	listener-app	refreshing token	{"githubConfigUrl": "https://github.enterprise.com/team"}
2024-10-07T08:07:29Z	INFO	listener-app	getting runner registration token	{"registrationTokenURL": "https://github.enterprise.com/api/v3/orgs/team/actions/runners/registration-token"}
2024-10-07T08:07:30Z	INFO	listener-app	getting Actions tenant URL and JWT	{"registrationURL": "https://github.enterprise.com/api/v3/actions/runner-registration"}
2024/10/07 08:07:30 Application returned an error: createSession failed: failed to create session: actions error: StatusCode 404, AcivityId "5be3369d-4ae3-479b-8a85-489c7e9c2986": GitHub.Actions.Runtime.WebApi.RunnerScaleSetNotFoundException, GitHub.Actions.Runtime.WebApi: No runner scale set found with identifier 12.`

-------------------------------------------

`2024-10-08T09:22:25Z	INFO	EphemeralRunner	Creating new ephemeral runner registration and updating status with runner config	{"version": "0.9.3", "ephemeralrunner": {"name":"def-runner-set-6pqwm-runner-vbm68","namespace":"eks-arc-dev"}}
2024-10-08T09:22:25Z	INFO	EphemeralRunner	Creating ephemeral runner JIT config	{"version": "0.9.3", "ephemeralrunner": {"name":"def-runner-set-6pqwm-runner-vbm68","namespace":"eks-arc-dev"}}
2024-10-08T09:22:25Z	INFO	actions-clients	retrieve actions client	{"githubConfigURL": "https://github.enterprise.com/team", "namespace": "eks-arc-dev"}
2024-10-08T09:22:25Z	INFO	actions-clients	using cache client	{"githubConfigURL": "https://github.enterprise.com/team", "namespace": "eks-arc-dev"}
2024-10-08T09:22:25Z	ERROR	Reconciler error	{"controller": "ephemeralrunner", "controllerGroup": "actions.github.com", "controllerKind": "EphemeralRunner", "EphemeralRunner": {"name":"def-runner-set-6pqwm-runner-vbm68","namespace":"eks-arc-dev"}, "namespace": "eks-arc-dev", "name": "def-runner-set-6pqwm-runner-vbm68", "reconcileID": "2a8ac671-e305-4c32-9395-0a8305e6b86d", "error": "failed to generate JIT config with Actions service error: actions error: StatusCode 404, AcivityId \"5b380ee9-4ae3-479b-8a85-489c7e9c2986\": GitHub.Actions.Runtime.WebApi.RunnerScaleSetNotFoundException, GitHub.Actions.Runtime.WebApi: No runner scale set found with identifier 12."}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:329
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
	sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:227`

Runner Pod Logs

N/A, the runner pod is not spawning.

kb-med added the bug, gha-runner-scale-set, and needs triage labels Oct 8, 2024
@kb-med
Author

kb-med commented Nov 14, 2024

Update: the issue is caused by registration details from the previous installation that were left hanging and never removed.
Helm uninstall does not clean up those hanging objects either.
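
One way to see such leftovers is to list the ARC custom resources together with their deletion timestamp and finalizers. The sketch below assumes the four resource kinds that the scale-set mode installs under the `actions.github.com` API group; the function name and column layout are made up for illustration. As far as I know the listener resource lives in the controller's namespace, so it may be worth checking both namespaces.

```shell
#!/bin/sh
# Custom resource kinds used by the gha-runner-scale-set mode (assumed list).
ARC_KINDS="autoscalingrunnersets.actions.github.com,autoscalinglisteners.actions.github.com,ephemeralrunnersets.actions.github.com,ephemeralrunners.actions.github.com"

# Print every ARC custom resource in a namespace. A resource with a value in
# the DELETING column and a non-empty FINALIZERS column is stuck in deletion.
# $1 = namespace to inspect
list_arc_leftovers() {
  kubectl get "$ARC_KINDS" --namespace "$1" --ignore-not-found \
    -o custom-columns='KIND:.kind,NAME:.metadata.name,DELETING:.metadata.deletionTimestamp,FINALIZERS:.metadata.finalizers'
}
```

For example, `list_arc_leftovers eks-arc-dev` after a `helm uninstall`. If the CRDs themselves were already removed, kubectl reports an unknown resource type instead of an empty list.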

@meiswjn

meiswjn commented Nov 25, 2024

Yup, this also happens when updating the Kubernetes Secret that should be used for authenticating. It appears to be a general problem with Custom Resources not being cleaned up.

@gleruzh

gleruzh commented Dec 4, 2024

Is there a good way to fix this?
Something less drastic than "completely remove charts, secrets, custom resources and CRDs" and reinstall, which works of course.

@meiswjn

meiswjn commented Dec 4, 2024

Removing the finalizers actually worked for us.

It's a bit tedious because you have to do it for every custom resource kind (and also service accounts, apparently - everything that has stuck finalizers). This is how you do it with a command:
kubectl get AutoscalingListeners -n <namespace> -o name | xargs -I {} kubectl patch {} -n <namespace> -p '{"metadata":{"finalizers":null}}' --type=merge
This command clears the finalizers (in a merge patch, null removes the field). I do not know if this causes any technical issues. What it doesn't do is check whether the resource is still needed... to check this, some resources support "--field-selector=status.phase==Succeeded", which is the state in which it should actually be removed.
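
The same patch can be looped over all ARC kinds. The sketch below is one possible variant, not a tested procedure: the list of kinds is assumed from the `actions.github.com` API group, the function name is made up, and the `deletionTimestamp` check is an addition so that only resources already marked for deletion are touched.

```shell
#!/bin/sh
# Remove finalizers from ARC custom resources that are stuck in deletion.
# $1 = namespace
strip_arc_finalizers() {
  ns="$1"
  for kind in autoscalingrunnersets autoscalinglisteners ephemeralrunnersets ephemeralrunners; do
    kubectl get "$kind.actions.github.com" --namespace "$ns" -o name |
      while IFS= read -r resource; do
        [ -n "$resource" ] || continue
        # Skip resources that are not being deleted; they may still be in use.
        ts=$(kubectl get "$resource" --namespace "$ns" \
          -o jsonpath='{.metadata.deletionTimestamp}' < /dev/null)
        [ -n "$ts" ] || continue
        # Same merge patch as the one-liner above: null removes the field.
        kubectl patch "$resource" --namespace "$ns" --type=merge \
          -p '{"metadata":{"finalizers":null}}' < /dev/null
      done
  done
}
```

Called as `strip_arc_finalizers <namespace>`. Removing finalizers bypasses the controller's own cleanup, so anything the controller would have deregistered on the GitHub side stays registered.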

A simple way to check whether you caught all of them: it seems you cannot delete the namespace (even with force) until all finalizers are removed.
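
A less destructive check than deleting the namespace is to list whatever is still in it. The sketch below walks every listable namespaced resource type and prints what remains; the function name is made up, and the loop can take a while on clusters with many API types.

```shell
#!/bin/sh
# Print every object still present in a namespace, one "kind/name" per line.
# $1 = namespace
list_namespace_leftovers() {
  kubectl api-resources --verbs=list --namespaced -o name |
    while IFS= read -r kind; do
      kubectl get "$kind" --namespace "$1" --ignore-not-found -o name < /dev/null
    done
}
```

An empty result from `list_namespace_leftovers <namespace>` means nothing with a stuck finalizer is left behind in that namespace.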

@gleruzh

gleruzh commented Dec 4, 2024

Yes, I went the same way - patching the custom resources with empty finalizers.

@nikola-jokic
Collaborator

Hey everyone,

It does seem like the cleanup process did not complete before re-installation, causing this issue. @meiswjn correctly pointed out that in that case, patching to remove finalizers is a solution.

We should invest in making the deletion process quicker, but the issue is still that you can't start the installation before everything is completely removed from the cluster. We documented that you need to wait until all resources are completely removed here.
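
As an illustration of waiting for complete removal, a small polling loop can block a reinstall until no ARC custom resources are left. This is only a sketch under assumptions: the kind list is taken from the `actions.github.com` API group, and the function name, retry count and delay are arbitrary choices, not something the project documents.

```shell
#!/bin/sh
# Custom resource kinds used by the gha-runner-scale-set mode (assumed list).
ARC_KINDS="autoscalingrunnersets.actions.github.com,autoscalinglisteners.actions.github.com,ephemeralrunnersets.actions.github.com,ephemeralrunners.actions.github.com"

# Poll until no ARC custom resources remain in the namespace.
# $1 = namespace, $2 = max attempts (default 60), $3 = seconds between attempts (default 5)
# Returns 0 once everything is gone, 1 if resources are still present at the end.
wait_until_gone() {
  ns="$1"
  tries="${2:-60}"
  delay="${3:-5}"
  while [ "$tries" -gt 0 ]; do
    remaining=$(kubectl get "$ARC_KINDS" --namespace "$ns" --ignore-not-found -o name)
    if [ -z "$remaining" ]; then
      return 0
    fi
    tries=$((tries - 1))
    sleep "$delay"
  done
  echo "still present in $ns: $remaining" >&2
  return 1
}
```

A possible sequence is `helm uninstall <release> -n <namespace>`, then `wait_until_gone <namespace>`, and only then the new `helm install`.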

I will close this issue since it is working as expected, but I must acknowledge that we should improve the experience for upgrades.

nikola-jokic removed the needs triage label Jan 21, 2025