Istio returns "RBAC: access denied" even though the ServiceRoleBinding appears to allow access
Answering my own question since I've made some progress on it.
I cannot update the ServiceRoleBinding even after I deleted the validating webhook
That's because the ServiceRoleBinding is generated, monitored, and managed by the profile controller in the kubeflow namespace, not by the validating webhook.
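The revert-on-edit behaviour is what one would expect from a controller that reconciles a resource it owns. The following is only a toy model of that pattern, not the profile controller's actual code; the `reconcile` function and the placeholder address `user@example.com` are assumptions made for illustration.

```python
import copy


def reconcile(desired: dict, actual: dict) -> dict:
    """Toy reconcile step: if the live spec drifted from the desired spec,
    return a copy of the live object with the desired spec written back."""
    if actual.get("spec") != desired.get("spec"):
        repaired = copy.deepcopy(actual)
        repaired["spec"] = copy.deepcopy(desired["spec"])
        return repaired
    return actual


# Desired state as the controller computes it (hypothetical values).
desired = {
    "spec": {"subjects": [{"properties": {"request.headers[]": "user@example.com"}}]}
}

# A manual edit made with `kubectl edit` (hypothetical values).
edited = {
    "spec": {
        "subjects": [
            {"properties": {"request.headers[kubeflow-userid]": "user@example.com"}}
        ]
    }
}

# The next reconcile pass overwrites the manual edit.
after = reconcile(desired, edited)
```

Under this model, a manual edit only sticks if the controller's input (here, its configuration) is changed so that the desired state itself is different.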
I was hitting this RBAC issue because, based on the params.yaml in the profiles manifest folder, the rule was generated as
request.headers[]: [email protected]
instead of
request.headers[kubeflow-userid]: [email protected]
This happened because I had misconfigured the value as blank instead of userid-header=kubeflow-userid in params.yaml.
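To illustrate how a blank setting leads to the `request.headers[]` key, here is a minimal sketch. It assumes the property key is built by placing the configured header name between the brackets; the function names `subject_property_key` and `checked_property_key` are hypothetical and are not taken from the Kubeflow source.

```python
def subject_property_key(userid_header: str) -> str:
    """Build the ServiceRoleBinding property key for a given header name
    (assumed format: request.headers[<header name>])."""
    return f"request.headers[{userid_header}]"


def checked_property_key(userid_header: str) -> str:
    """Same as above, but refuse a blank header name instead of silently
    producing the unusable key 'request.headers[]'."""
    if not userid_header.strip():
        raise ValueError("userid-header is blank; expected e.g. 'kubeflow-userid'")
    return subject_property_key(userid_header)


# With the intended setting the key names the header set by the authservice.
good_key = subject_property_key("kubeflow-userid")

# With a blank setting the key names no header at all.
bad_key = subject_property_key("")
```

A guard like `checked_property_key` is one way such a misconfiguration could be surfaced early rather than showing up later as "RBAC: access denied".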
Roger Ray almost 2 years
I've been struggling with Istio, so here I am seeking help from the experts!
Background
I'm trying to deploy my Kubeflow application for multi-tenancy with Dex, following the official Kubeflow documentation and using the manifest file from GitHub.
Here is a list of component/version information:
- I'm running Kubernetes 1.15 on GKE
- Istio 1.1.6 is used by Kubeflow as the service mesh
- I'm trying to deploy Kubeflow 1.0 for ML
- I deployed Dex 1.0 for authn
With the manifest file I successfully deployed Kubeflow on my cluster. Here's what I've done:
- Deploy the Kubeflow application on the cluster
- Deploy Dex with an OIDC service to enable authn via Google OAuth 2.0
- Enable RBAC
- Create an Envoy filter to append the header "kubeflow-userid" with the logged-in user
Here is the verification of steps 3 and 4: checking that RBAC is enabled and that the EnvoyFilter for kubeflow-userid has been added.
[root@gke-client-tf leilichao]# k get clusterrbacconfigs -o yaml
apiVersion: v1
items:
- apiVersion: rbac.istio.io/v1alpha1
  kind: ClusterRbacConfig
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"rbac.istio.io/v1alpha1","kind":"ClusterRbacConfig","metadata":{"annotations":{},"name":"default"},"spec":{"mode":"ON"}}
    creationTimestamp: "2020-07-04T01:28:52Z"
    generation: 2
    name: default
    resourceVersion: "5986075"
    selfLink: /apis/rbac.istio.io/v1alpha1/clusterrbacconfigs/default
    uid: db70920e-f364-40ec-a93b-a3364f88650f
  spec:
    mode: "ON"
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
[root@gke-client-tf leilichao]# k get envoyfilter -n istio-system -o yaml
apiVersion: v1
items:
- apiVersion: networking.istio.io/v1alpha3
  kind: EnvoyFilter
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"networking.istio.io/v1alpha3","kind":"EnvoyFilter","metadata":{"annotations":{},"labels":{"app.kubernetes.io/component":"oidc-authservice","app.kubernetes.io/instance":"oidc-authservice-v1.0.0","app.kubernetes.io/managed-by":"kfctl","app.kubernetes.io/name":"oidc-authservice","app.kubernetes.io/part-of":"kubeflow","app.kubernetes.io/version":"v1.0.0"},"name":"authn-filter","namespace":"istio-system"},"spec":{"filters":[{"filterConfig":{"httpService":{"authorizationRequest":{"allowedHeaders":{"patterns":[{"exact":"cookie"},{"exact":"X-Auth-Token"}]}},"authorizationResponse":{"allowedUpstreamHeaders":{"patterns":[{"exact":"kubeflow-userid"}]}},"serverUri":{"cluster":"outbound|8080||authservice.istio-system.svc.cluster.local","failureModeAllow":false,"timeout":"10s","uri":"http://authservice.istio-system.svc.cluster.local"}},"statusOnError":{"code":"GatewayTimeout"}},"filterName":"envoy.ext_authz","filterType":"HTTP","insertPosition":{"index":"FIRST"},"listenerMatch":{"listenerType":"GATEWAY"}}],"workloadLabels":{"istio":"ingressgateway"}}}
    creationTimestamp: "2020-07-04T01:40:43Z"
    generation: 1
    labels:
      app.kubernetes.io/component: oidc-authservice
      app.kubernetes.io/instance: oidc-authservice-v1.0.0
      app.kubernetes.io/managed-by: kfctl
      app.kubernetes.io/name: oidc-authservice
      app.kubernetes.io/part-of: kubeflow
      app.kubernetes.io/version: v1.0.0
    name: authn-filter
    namespace: istio-system
    resourceVersion: "4715289"
    selfLink: /apis/networking.istio.io/v1alpha3/namespaces/istio-system/envoyfilters/authn-filter
    uid: e599ba82-315a-4fc1-9a5d-e8e35d93ca26
  spec:
    filters:
    - filterConfig:
        httpService:
          authorizationRequest:
            allowedHeaders:
              patterns:
              - exact: cookie
              - exact: X-Auth-Token
          authorizationResponse:
            allowedUpstreamHeaders:
              patterns:
              - exact: kubeflow-userid
          serverUri:
            cluster: outbound|8080||authservice.istio-system.svc.cluster.local
            failureModeAllow: false
            timeout: 10s
            uri: http://authservice.istio-system.svc.cluster.local
        statusOnError:
          code: GatewayTimeout
      filterName: envoy.ext_authz
      filterType: HTTP
      insertPosition:
        index: FIRST
      listenerMatch:
        listenerType: GATEWAY
    workloadLabels:
      istio: ingressgateway
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
RBAC Issue problem analysis
After finishing the deployment, I performed the following functional tests:
- I can log in with my Google account via Google OAuth
- I was able to create my own profile/namespace
- I was able to create a notebook server
- However, I can NOT connect to the notebook server
RBAC Issue investigation
I get an "RBAC: access denied" error after successfully creating the notebook server on Kubeflow and trying to connect to it. I managed to raise the Envoy log level and got the log below.
[2020-08-06 13:32:43.290][26][debug][rbac] [external/envoy/source/extensions/filters/http/rbac/rbac_filter.cc:64] checking request: remoteAddress: 10.1.1.2:58012, localAddress: 10.1.2.66:8888, ssl: none, headers:
':authority', 'compliance-kf-system.ml'
':path', '/notebook/roger-l-c-lei/aug06/'
':method', 'GET'
'user-agent', 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'
'accept', 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9'
'accept-encoding', 'gzip, deflate'
'accept-language', 'en,zh-CN;q=0.9,zh;q=0.8'
'cookie', 'authservice_session=MTU5NjY5Njk0MXxOd3dBTkZvMldsVllVMUZPU0VaR01sSk5RVlJJV2xkRFVrRTFTVUl5V0RKV1EwdEhTMU5QVjFCVlUwTkpSVFpYUlVoT1RGVlBUa0U9fN3lPBXDDSZMT9MTJRbG8jv7AtblKTE3r84ayeCYuKOk; _xsrf=2|1e6639f2|10d3ea0a904e0ae505fd6425888453f8|1596697030'
'referer', 'http://compliance-kf-system.ml/jupyter/'
'upgrade-insecure-requests', '1'
'x-forwarded-for', '10.10.10.230'
'x-forwarded-proto', 'http'
'x-request-id', 'babbf884-4cec-93fd-aea6-2fc60d3abb83'
'kubeflow-userid', '[email protected]'
'x-istio-attributes', 'CjAKHWRlc3RpbmF0aW9uLnNlcnZpY2UubmFtZXNwYWNlEg8SDXJvZ2VyLWwtYy1sZWkKIwoYZGVzdGluYXRpb24uc2VydmljZS5uYW1lEgcSBWF1ZzA2Ck4KCnNvdXJjZS51aWQSQBI+a3ViZXJuZXRlczovL2lzdGlvLWluZ3Jlc3NnYXRld2F5LTg5Y2Q0YmQ0Yy1kdnF3dC5pc3Rpby1zeXN0ZW0KQQoXZGVzdGluYXRpb24uc2VydmljZS51aWQSJhIkaXN0aW86Ly9yb2dlci1sLWMtbGVpL3NlcnZpY2VzL2F1ZzA2CkMKGGRlc3RpbmF0aW9uLnNlcnZpY2UuaG9zdBInEiVhdWcwNi5yb2dlci1sLWMtbGVpLnN2Yy5jbHVzdGVyLmxvY2Fs'
'x-envoy-expected-rq-timeout-ms', '300000'
'x-b3-traceid', '3bf35cca1f7b75e7a42a046b1c124b1f'
'x-b3-spanid', 'a42a046b1c124b1f'
'x-b3-sampled', '1'
'x-envoy-original-path', '/notebook/roger-l-c-lei/aug06/'
'content-length', '0'
'x-envoy-internal', 'true'
, dynamicMetadata: filter_metadata { key: "istio_authn" value { } }
[2020-08-06 13:32:43.290][26][debug][rbac] [external/envoy/source/extensions/filters/http/rbac/rbac_filter.cc:108] enforced denied
From the source code, it looks like the allowed function is returning false, which produces the "RBAC: access denied" response.
if (engine.has_value()) {
  if (engine->allowed(*callbacks_->connection(), headers,
                      callbacks_->streamInfo().dynamicMetadata(), nullptr)) {
    ENVOY_LOG(debug, "enforced allowed");
    config_->stats().allowed_.inc();
    return Http::FilterHeadersStatus::Continue;
  } else {
    ENVOY_LOG(debug, "enforced denied");
    callbacks_->sendLocalReply(Http::Code::Forbidden, "RBAC: access denied", nullptr,
                               absl::nullopt);
    config_->stats().denied_.inc();
    return Http::FilterHeadersStatus::StopIteration;
  }
}
I searched the dumped Envoy config, and it looks like the rule should allow any request carrying a header whose value is my email address. From the log above I can confirm that this header is present in my request.
{
  "name": "envoy.filters.http.rbac",
  "config": {
    "rules": {
      "policies": {
        "ns-access-istio": {
          "permissions": [
            {
              "and_rules": {
                "rules": [
                  { "any": true }
                ]
              }
            }
          ],
          "principals": [
            {
              "and_ids": {
                "ids": [
                  {
                    "header": {
                      "exact_match": "[email protected]"
                    }
                  }
                ]
              }
            }
          ]
        }
      }
    }
  }
}
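Note that the header principal in this dump carries an exact_match value but no header name. The sketch below is a simplified model of a header matcher, not Envoy's implementation; it assumes that a missing name is treated as an empty header name, and it uses a placeholder address `user@example.com`. The function name `header_principal_matches` is hypothetical.

```python
def header_principal_matches(request_headers: dict, matcher: dict) -> bool:
    """Simplified model: look up the header named by the matcher and
    compare its value with exact_match. Header names are lowercase."""
    name = matcher.get("name", "")  # assumption: missing name means ""
    value = request_headers.get(name.lower())
    return value is not None and value == matcher.get("exact_match")


# Headers as seen in the debug log (placeholder address).
request_headers = {
    ":method": "GET",
    "kubeflow-userid": "user@example.com",
}

# Matcher shaped like the dumped config: a value but no header name.
dumped_matcher = {"exact_match": "user@example.com"}

# Matcher that names the header explicitly.
named_matcher = {"name": "kubeflow-userid", "exact_match": "user@example.com"}

denied = not header_principal_matches(request_headers, dumped_matcher)
allowed = header_principal_matches(request_headers, named_matcher)
```

Under these assumptions, the request is denied even though the expected value is present, because the matcher never looks at the kubeflow-userid header.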
My understanding is that this is the Envoy config used to validate RBAC authz, and that it is distributed to the sidecar by Mixer. The log and code lead me to the rbac.istio.io ServiceRoleBinding config.
[root@gke-client-tf leilichao]# k get servicerolebinding -n roger-l-c-lei -o yaml
apiVersion: v1
items:
- apiVersion: rbac.istio.io/v1alpha1
  kind: ServiceRoleBinding
  metadata:
    annotations:
      role: admin
      user: [email protected]
    creationTimestamp: "2020-07-04T01:35:30Z"
    generation: 5
    name: owner-binding-istio
    namespace: roger-l-c-lei
    ownerReferences:
    - apiVersion: kubeflow.org/v1
      blockOwnerDeletion: true
      controller: true
      kind: Profile
      name: roger-l-c-lei
      uid: 689c9f04-08a6-4c51-a1dc-944db1a66114
    resourceVersion: "23201026"
    selfLink: /apis/rbac.istio.io/v1alpha1/namespaces/roger-l-c-lei/servicerolebindings/owner-binding-istio
    uid: bbbffc28-689c-4099-837a-87a2feb5948f
  spec:
    roleRef:
      kind: ServiceRole
      name: ns-access-istio
    subjects:
    - properties:
        request.headers[]: [email protected]
  status: {}
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
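The ownerReferences entry in this output is a hint about which component manages the binding. As a minimal sketch, assuming the resource has been loaded into a plain dictionary, the controlling owner can be read like this; the helper name `controlling_owner` is hypothetical.

```python
from typing import Optional, Tuple


def controlling_owner(resource: dict) -> Optional[Tuple[str, str]]:
    """Return (kind, name) of the owner reference marked controller: true,
    or None if the resource has no controlling owner."""
    refs = resource.get("metadata", {}).get("ownerReferences", [])
    for ref in refs:
        if ref.get("controller"):
            return ref["kind"], ref["name"]
    return None


# Relevant fields copied from the ServiceRoleBinding output above.
binding = {
    "kind": "ServiceRoleBinding",
    "metadata": {
        "name": "owner-binding-istio",
        "namespace": "roger-l-c-lei",
        "ownerReferences": [
            {
                "apiVersion": "kubeflow.org/v1",
                "blockOwnerDeletion": True,
                "controller": True,
                "kind": "Profile",
                "name": "roger-l-c-lei",
            }
        ],
    },
}

owner = controlling_owner(binding)
```

Here the controlling owner is a Kubeflow Profile, which suggests looking at whatever reconciles Profile objects rather than at Istio's own components.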
I wanted to try updating this ServiceRoleBinding to validate some assumptions, since I can't debug the Envoy source code and there isn't enough logging to show exactly why the "allowed" method is returning false.
However, I found that I cannot update the ServiceRoleBinding. It reverts to its original version right after I finish editing it.
I found that there is an istio-galley ValidatingWebhookConfiguration (shown below) that monitors these Istio RBAC resources.
[root@gke-client-tf leilichao]# k get validatingwebhookconfigurations istio-galley -oyaml
apiVersion: admissionregistration.k8s.io/v1beta1
kind: ValidatingWebhookConfiguration
metadata:
  creationTimestamp: "2020-08-04T15:00:59Z"
  generation: 1
  labels:
    app: galley
    chart: galley
    heritage: Tiller
    istio: galley
    release: istio
  name: istio-galley
  ownerReferences:
  - apiVersion: extensions/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: Deployment
    name: istio-galley
    uid: 11fef012-4145-49ac-a43c-2e1d0a460ea4
  resourceVersion: "22484680"
  selfLink: /apis/admissionregistration.k8s.io/v1beta1/validatingwebhookconfigurations/istio-galley
  uid: 6f485e28-3b5a-4a3b-b31f-a5c477c82619
webhooks:
- admissionReviewVersions:
  - v1beta1
  clientConfig:
    caBundle: . . .
    service:
      name: istio-galley
      namespace: istio-system
      path: /admitpilot
      port: 443
  failurePolicy: Fail
  matchPolicy: Exact
  name: pilot.validation.istio.io
  namespaceSelector: {}
  objectSelector: {}
  rules:
  - apiGroups:
    - config.istio.io
    apiVersions:
    - v1alpha2
    operations:
    - CREATE
    - UPDATE
    resources:
    - httpapispecs
    - httpapispecbindings
    - quotaspecs
    - quotaspecbindings
    scope: '*'
  - apiGroups:
    - rbac.istio.io
    apiVersions:
    - '*'
    operations:
    - CREATE
    - UPDATE
    resources:
    - '*'
    scope: '*'
  - apiGroups:
    - authentication.istio.io
    apiVersions:
    - '*'
    operations:
    - CREATE
    - UPDATE
    resources:
    - '*'
    scope: '*'
  - apiGroups:
    - networking.istio.io
    apiVersions:
    - '*'
    operations:
    - CREATE
    - UPDATE
    resources:
    - destinationrules
    - envoyfilters
    - gateways
    - serviceentries
    - sidecars
    - virtualservices
    scope: '*'
  sideEffects: Unknown
  timeoutSeconds: 30
- admissionReviewVersions:
  - v1beta1
  clientConfig:
    caBundle: . . .
    service:
      name: istio-galley
      namespace: istio-system
      path: /admitmixer
      port: 443
  failurePolicy: Fail
  matchPolicy: Exact
  name: mixer.validation.istio.io
  namespaceSelector: {}
  objectSelector: {}
  rules:
  - apiGroups:
    - config.istio.io
    apiVersions:
    - v1alpha2
    operations:
    - CREATE
    - UPDATE
    resources:
    - rules
    - attributemanifests
    - circonuses
    - deniers
    - fluentds
    - kubernetesenvs
    - listcheckers
    - memquotas
    - noops
    - opas
    - prometheuses
    - rbacs
    - solarwindses
    - stackdrivers
    - cloudwatches
    - dogstatsds
    - statsds
    - stdios
    - apikeys
    - authorizations
    - checknothings
    - listentries
    - logentries
    - metrics
    - quotas
    - reportnothings
    - tracespans
    scope: '*'
  sideEffects: Unknown
  timeoutSeconds: 30
Long story short
I've been banging my head against this Istio issue for more than 2 weeks. I'm sure plenty of people feel the same when trying to troubleshoot Istio on k8s. Any suggestions are welcome! Here's how I understand the problem; please correct me if I'm wrong:
- The log evidence shows that the RBAC rules are not allowing my access to the resource
- I need to update the RBAC rules
- The rules are distributed by Mixer to the Envoy container according to the ServiceRoleBinding
- So I need to update the ServiceRoleBinding instead
- I cannot update the ServiceRoleBinding because either the validating admission webhook or Istio Mixer is preventing me from doing so
I've run into the problems below:
I cannot update the ServiceRoleBinding even after I deleted the validating webhook
I tried deleting this validating webhook so that I could update the ServiceRoleBinding, but the resource reverts right after I save the edit. The validating webhook is itself generated automatically from a ConfigMap, so I had to update that ConfigMap to change the webhook.
Is there some kind of cache in Galley that Mixer uses to distribute the config?
I can't find any relevant log indicating that the rbac.istio.io resource is protected or validated by any service in the istio-system namespace.
How can I get the Mixer logs?
I need to understand exactly which component controls the policy. I managed to update the log level but failed to find anything useful.
Most importantly: how do I debug an Envoy container?
I need to debug the Envoy app to understand why it's returning false for the allowed function. If it cannot be debugged easily, is there a document that shows how to update the code to add more logging and build a new image to GCR, so I can run it again and see from the log what's going on behind the scenes?