Istio returns "RBAC: access denied" even though the ServiceRoleBinding appears to allow the request

Answering my own question since I've made some progress on it.

I cannot update the ServiceRoleBinding even after I deleted the validating webhook

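That's because the ServiceRoleBinding is actually generated, monitored and managed by the profile controller in the kubeflow namespace, not by the validating webhook. The controller owns the resource, so it overwrites manual edits regardless of whether the webhook exists.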
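
The revert behaviour matches what a reconciling controller generally does: it compares the live object with the state it wants and overwrites any drift. Below is a minimal Python sketch of that pattern. It is not the profile controller's code; `reconcile` and the dictionaries are hypothetical, and `user@example.com` is a placeholder address.

```python
import copy


def reconcile(live: dict, desired: dict) -> dict:
    """Return what ends up stored after one reconcile pass.

    Any difference between the live object and the desired state is
    treated as drift and replaced by a copy of the desired state.
    """
    if live == desired:
        return live
    return copy.deepcopy(desired)


# What the controller wants, derived from its own (misconfigured) input.
desired = {"subjects": [{"properties": {"request.headers[]": "user@example.com"}}]}

# A manual fix applied directly to the live object.
edited = {
    "subjects": [
        {"properties": {"request.headers[kubeflow-userid]": "user@example.com"}}
    ]
}

stored = reconcile(edited, desired)
print(stored)
```

Under this model the only durable fix is to change the input the controller derives its desired state from, not the generated object.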

I hit the RBAC issue because, based on the params.yaml in the profiles manifest folder, the rule was generated as

request.headers[]: [email protected]

instead of

request.headers[kubeflow-userid]: [email protected]

The cause was a misconfiguration on my side: in params.yaml I had left the value blank instead of setting userid-header=kubeflow-userid.
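
To illustrate how a blank parameter can end up as `request.headers[]`, here is a small Python sketch. It assumes the property key is built by placing the configured header name between the brackets; the function names are hypothetical, the real controller is not written this way, and `user@example.com` is a placeholder.

```python
def subject_property(userid_header: str, email: str) -> dict:
    """Build the ServiceRoleBinding subject property from the header name."""
    return {f"request.headers[{userid_header}]": email}


def checked_subject_property(userid_header: str, email: str) -> dict:
    """Same as above, but refuse a blank header name up front."""
    if not userid_header.strip():
        raise ValueError("userid-header is blank; expected e.g. kubeflow-userid")
    return subject_property(userid_header, email)


# Blank value, as in the misconfigured params.yaml
print(subject_property("", "user@example.com"))

# Intended value
print(checked_subject_property("kubeflow-userid", "user@example.com"))
```

A guard like `checked_subject_property` would surface the mistake at configuration time instead of as a denied request later.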

Author: Roger Ray

Java Developer DevOps Cloud Engineer

Updated on June 04, 2022

Comments

  • Roger Ray, almost 2 years

    I've been struggling with Istio, so here I am seeking help from the experts!

    Background

    I'm trying to deploy my Kubeflow application for multi-tenancy with Dex, following the official Kubeflow documentation and using the manifest files from GitHub.

    Here are the components and versions involved:

    • I'm running Kubernetes 1.15 on GKE
    • Istio 1.1.6 is used by Kubeflow as the service mesh
    • I'm trying to deploy Kubeflow 1.0 for ML
    • I deployed Dex 1.0 for authn

    With the manifest files I successfully deployed Kubeflow on my cluster. Here's what I've done:

    • Deployed the Kubeflow application on the cluster
    • Deployed Dex with the OIDC service to enable authn against Google OAuth 2.0
    • Enabled RBAC
    • Created an EnvoyFilter to append the header "kubeflow-userid" carrying the logged-in user

    Here is the verification of steps 3 and 4: RBAC is enabled, and the EnvoyFilter for kubeflow-userid has been added.

    [root@gke-client-tf leilichao]# k get clusterrbacconfigs -o yaml
    apiVersion: v1
    items:
    - apiVersion: rbac.istio.io/v1alpha1
      kind: ClusterRbacConfig
      metadata:
        annotations:
          kubectl.kubernetes.io/last-applied-configuration: |
            {"apiVersion":"rbac.istio.io/v1alpha1","kind":"ClusterRbacConfig","metadata":{"annotations":{},"name":"default"},"spec":{"mode":"ON"}}
        creationTimestamp: "2020-07-04T01:28:52Z"
        generation: 2
        name: default
        resourceVersion: "5986075"
        selfLink: /apis/rbac.istio.io/v1alpha1/clusterrbacconfigs/default
        uid: db70920e-f364-40ec-a93b-a3364f88650f
      spec:
        mode: "ON"
    kind: List
    metadata:
      resourceVersion: ""
      selfLink: ""
    [root@gke-client-tf leilichao]# k get envoyfilter -n istio-system -o yaml
    apiVersion: v1
    items:
    - apiVersion: networking.istio.io/v1alpha3
      kind: EnvoyFilter
      metadata:
        annotations:
          kubectl.kubernetes.io/last-applied-configuration: |
            {"apiVersion":"networking.istio.io/v1alpha3","kind":"EnvoyFilter","metadata":{"annotations":{},"labels":{"app.kubernetes.io/component":"oidc-authservice","app.kubernetes.io/instance":"oidc-authservice-v1.0.0","app.kubernetes.io/managed-by":"kfctl","app.kubernetes.io/name":"oidc-authservice","app.kubernetes.io/part-of":"kubeflow","app.kubernetes.io/version":"v1.0.0"},"name":"authn-filter","namespace":"istio-system"},"spec":{"filters":[{"filterConfig":{"httpService":{"authorizationRequest":{"allowedHeaders":{"patterns":[{"exact":"cookie"},{"exact":"X-Auth-Token"}]}},"authorizationResponse":{"allowedUpstreamHeaders":{"patterns":[{"exact":"kubeflow-userid"}]}},"serverUri":{"cluster":"outbound|8080||authservice.istio-system.svc.cluster.local","failureModeAllow":false,"timeout":"10s","uri":"http://authservice.istio-system.svc.cluster.local"}},"statusOnError":{"code":"GatewayTimeout"}},"filterName":"envoy.ext_authz","filterType":"HTTP","insertPosition":{"index":"FIRST"},"listenerMatch":{"listenerType":"GATEWAY"}}],"workloadLabels":{"istio":"ingressgateway"}}}
        creationTimestamp: "2020-07-04T01:40:43Z"
        generation: 1
        labels:
          app.kubernetes.io/component: oidc-authservice
          app.kubernetes.io/instance: oidc-authservice-v1.0.0
          app.kubernetes.io/managed-by: kfctl
          app.kubernetes.io/name: oidc-authservice
          app.kubernetes.io/part-of: kubeflow
          app.kubernetes.io/version: v1.0.0
        name: authn-filter
        namespace: istio-system
        resourceVersion: "4715289"
        selfLink: /apis/networking.istio.io/v1alpha3/namespaces/istio-system/envoyfilters/authn-filter
        uid: e599ba82-315a-4fc1-9a5d-e8e35d93ca26
      spec:
        filters:
        - filterConfig:
            httpService:
              authorizationRequest:
                allowedHeaders:
                  patterns:
                  - exact: cookie
                  - exact: X-Auth-Token
              authorizationResponse:
                allowedUpstreamHeaders:
                  patterns:
                  - exact: kubeflow-userid
              serverUri:
                cluster: outbound|8080||authservice.istio-system.svc.cluster.local
                failureModeAllow: false
                timeout: 10s
                uri: http://authservice.istio-system.svc.cluster.local
            statusOnError:
              code: GatewayTimeout
          filterName: envoy.ext_authz
          filterType: HTTP
          insertPosition:
            index: FIRST
          listenerMatch:
            listenerType: GATEWAY
        workloadLabels:
          istio: ingressgateway
    kind: List
    metadata:
      resourceVersion: ""
      selfLink: ""
    
    

    RBAC Issue problem analysis

    After finishing the deployment, I ran the following functional tests:

    • I can log in with my Google account through Google OAuth
    • I was able to create my own profile/namespace
    • I was able to create a notebook server
    • However, I can NOT connect to the notebook server

    RBAC Issue investigation

    I get an "RBAC: access denied" error when I try to connect to a notebook server that I successfully created in Kubeflow. I managed to raise the Envoy log level and got the log below.

    [2020-08-06 13:32:43.290][26][debug][rbac] [external/envoy/source/extensions/filters/http/rbac/rbac_filter.cc:64] checking request: remoteAddress: 10.1.1.2:58012, localAddress: 10.1.2.66:8888, ssl: none, headers: ':authority', 'compliance-kf-system.ml'
    ':path', '/notebook/roger-l-c-lei/aug06/'
    ':method', 'GET'
    'user-agent', 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'
    'accept', 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9'
    'accept-encoding', 'gzip, deflate'
    'accept-language', 'en,zh-CN;q=0.9,zh;q=0.8'
    'cookie', 'authservice_session=MTU5NjY5Njk0MXxOd3dBTkZvMldsVllVMUZPU0VaR01sSk5RVlJJV2xkRFVrRTFTVUl5V0RKV1EwdEhTMU5QVjFCVlUwTkpSVFpYUlVoT1RGVlBUa0U9fN3lPBXDDSZMT9MTJRbG8jv7AtblKTE3r84ayeCYuKOk; _xsrf=2|1e6639f2|10d3ea0a904e0ae505fd6425888453f8|1596697030'
    'referer', 'http://compliance-kf-system.ml/jupyter/'
    'upgrade-insecure-requests', '1'
    'x-forwarded-for', '10.10.10.230'
    'x-forwarded-proto', 'http'
    'x-request-id', 'babbf884-4cec-93fd-aea6-2fc60d3abb83'
    'kubeflow-userid', '[email protected]'
    'x-istio-attributes', 'CjAKHWRlc3RpbmF0aW9uLnNlcnZpY2UubmFtZXNwYWNlEg8SDXJvZ2VyLWwtYy1sZWkKIwoYZGVzdGluYXRpb24uc2VydmljZS5uYW1lEgcSBWF1ZzA2Ck4KCnNvdXJjZS51aWQSQBI+a3ViZXJuZXRlczovL2lzdGlvLWluZ3Jlc3NnYXRld2F5LTg5Y2Q0YmQ0Yy1kdnF3dC5pc3Rpby1zeXN0ZW0KQQoXZGVzdGluYXRpb24uc2VydmljZS51aWQSJhIkaXN0aW86Ly9yb2dlci1sLWMtbGVpL3NlcnZpY2VzL2F1ZzA2CkMKGGRlc3RpbmF0aW9uLnNlcnZpY2UuaG9zdBInEiVhdWcwNi5yb2dlci1sLWMtbGVpLnN2Yy5jbHVzdGVyLmxvY2Fs'
    'x-envoy-expected-rq-timeout-ms', '300000'
    'x-b3-traceid', '3bf35cca1f7b75e7a42a046b1c124b1f'
    'x-b3-spanid', 'a42a046b1c124b1f'
    'x-b3-sampled', '1'
    'x-envoy-original-path', '/notebook/roger-l-c-lei/aug06/'
    'content-length', '0'
    'x-envoy-internal', 'true'
    , dynamicMetadata: filter_metadata {
      key: "istio_authn"
      value {
      }
    }
    
    [2020-08-06 13:32:43.290][26][debug][rbac] [external/envoy/source/extensions/filters/http/rbac/rbac_filter.cc:108] enforced denied
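
When the header dump is this long, it can help to turn it into a dictionary and check for the expected header programmatically. The sketch below assumes the `'name', 'value'` line layout seen in the log above; `parse_logged_headers` is a hypothetical helper, and the sample is shortened with a placeholder address.

```python
import re

# One header per line, printed as: 'name', 'value'
HEADER_LINE = re.compile(r"'([^']*)', '([^']*)'")


def parse_logged_headers(log_text: str) -> dict:
    """Extract header name/value pairs from an Envoy RBAC debug log entry."""
    headers = {}
    for line in log_text.splitlines():
        # The first header shares its line with the "headers: " prefix.
        if "headers: " in line:
            line = line.split("headers: ", 1)[1]
        match = HEADER_LINE.fullmatch(line.strip())
        if match:
            headers[match.group(1)] = match.group(2)
    return headers


sample = """checking request: remoteAddress: 10.1.1.2:58012, ssl: none, headers: ':authority', 'compliance-kf-system.ml'
    ':path', '/notebook/roger-l-c-lei/aug06/'
    ':method', 'GET'
    'kubeflow-userid', 'user@example.com'
    , dynamicMetadata: filter_metadata {"""

headers = parse_logged_headers(sample)
print("kubeflow-userid" in headers)
```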
    

    From the source code, it looks like the allowed function is returning false, which produces the "RBAC: access denied" response.

      if (engine.has_value()) {
        if (engine->allowed(*callbacks_->connection(), headers,
                            callbacks_->streamInfo().dynamicMetadata(), nullptr)) {
          ENVOY_LOG(debug, "enforced allowed");
          config_->stats().allowed_.inc();
          return Http::FilterHeadersStatus::Continue;
        } else {
          ENVOY_LOG(debug, "enforced denied");
          callbacks_->sendLocalReply(Http::Code::Forbidden, "RBAC: access denied", nullptr,
                                     absl::nullopt);
          config_->stats().denied_.inc();
          return Http::FilterHeadersStatus::StopIteration;
        }
      }
    

    I searched the dumped Envoy config. It looks like the rule should allow any request carrying a header whose value is my mail address, and I can confirm from the log above that my request has that header.

    {
     "name": "envoy.filters.http.rbac",
     "config": {
      "rules": {
       "policies": {
        "ns-access-istio": {
         "permissions": [
          {
           "and_rules": {
            "rules": [
             {
              "any": true
             }
            ]
           }
          }
         ],
         "principals": [
          {
           "and_ids": {
            "ids": [
             {
              "header": {
               "exact_match": "[email protected]"
              }
             }
            ]
           }
          }
         ]
        }
       }
      }
     }
    }
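
Note that the header matcher in the dump has an `exact_match` but no header name. The Python sketch below is a simplified model of header-based principal matching, not Envoy's implementation; treating a missing name as a lookup of an empty header name is an assumption, and the address is a placeholder.

```python
def header_principal_matches(matcher: dict, request_headers: dict) -> bool:
    """Look up the named header and compare its value exactly.

    Simplified model: a matcher without a "name" looks up the empty
    header name, which no request carries, so it never matches.
    """
    name = matcher.get("name", "")
    value = request_headers.get(name.lower())
    return value is not None and value == matcher["exact_match"]


request_headers = {":method": "GET", "kubeflow-userid": "user@example.com"}

# Shape seen in the dump: value present, header name missing.
dumped = {"exact_match": "user@example.com"}

# Shape one would expect when the header name is configured.
expected = {"name": "kubeflow-userid", "exact_match": "user@example.com"}

print(header_principal_matches(dumped, request_headers))
print(header_principal_matches(expected, request_headers))
```

In this model the request is denied even though the right value is present, because the policy never names the header that holds it.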
    

    My understanding is that this is the Envoy config used to validate RBAC authz, and that it is distributed to the sidecar by Mixer. The log and the code lead me to the rbac.istio.io ServiceRoleBinding config.

    [root@gke-client-tf leilichao]# k get servicerolebinding -n roger-l-c-lei -o yaml
    apiVersion: v1
    items:
    - apiVersion: rbac.istio.io/v1alpha1
      kind: ServiceRoleBinding
      metadata:
        annotations:
          role: admin
          user: [email protected]
        creationTimestamp: "2020-07-04T01:35:30Z"
        generation: 5
        name: owner-binding-istio
        namespace: roger-l-c-lei
        ownerReferences:
        - apiVersion: kubeflow.org/v1
          blockOwnerDeletion: true
          controller: true
          kind: Profile
          name: roger-l-c-lei
          uid: 689c9f04-08a6-4c51-a1dc-944db1a66114
        resourceVersion: "23201026"
        selfLink: /apis/rbac.istio.io/v1alpha1/namespaces/roger-l-c-lei/servicerolebindings/owner-binding-istio
        uid: bbbffc28-689c-4099-837a-87a2feb5948f
      spec:
        roleRef:
          kind: ServiceRole
          name: ns-access-istio
        subjects:
        - properties:
            request.headers[]: [email protected]
      status: {}
    kind: List
    metadata:
      resourceVersion: ""
      selfLink: ""
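
A quick way to spot a binding with an empty header key is to scan the JSON form of the same listing (for example from `kubectl get servicerolebinding -n <namespace> -o json`). The helper below is a hypothetical sketch that only looks at the `request.headers[...]` keys shown above; the sample data uses a placeholder address.

```python
import json
import re

# Matches a header property whose bracketed header name is empty.
EMPTY_HEADER_KEY = re.compile(r"^request\.headers\[\s*\]$")


def find_blank_header_subjects(binding_list: dict) -> list:
    """Return (namespace, name, key) for every subject property with no header name."""
    problems = []
    for item in binding_list.get("items", []):
        meta = item.get("metadata", {})
        for subject in item.get("spec", {}).get("subjects", []):
            for key in subject.get("properties", {}):
                if EMPTY_HEADER_KEY.match(key):
                    problems.append((meta.get("namespace"), meta.get("name"), key))
    return problems


sample_json = """
{"items": [{"metadata": {"name": "owner-binding-istio", "namespace": "roger-l-c-lei"},
            "spec": {"roleRef": {"kind": "ServiceRole", "name": "ns-access-istio"},
                     "subjects": [{"properties": {"request.headers[]": "user@example.com"}}]}}]}
"""

problems = find_blank_header_subjects(json.loads(sample_json))
print(problems)
```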
    

    I wanted to try updating this ServiceRoleBinding to validate some assumptions, since I can't debug the Envoy source code and there isn't enough logging to show exactly why the "allowed" method is returning false.

    However, I found that I cannot update the ServiceRoleBinding. It reverts to its original version every time, right after I finish editing it.

    I found that there is an istio-galley ValidatingWebhookConfiguration (shown below) that monitors these Istio RBAC resources.

    [root@gke-client-tf leilichao]# k get validatingwebhookconfigurations istio-galley -oyaml
    apiVersion: admissionregistration.k8s.io/v1beta1
    kind: ValidatingWebhookConfiguration
    metadata:
      creationTimestamp: "2020-08-04T15:00:59Z"
      generation: 1
      labels:
        app: galley
        chart: galley
        heritage: Tiller
        istio: galley
        release: istio
      name: istio-galley
      ownerReferences:
      - apiVersion: extensions/v1beta1
        blockOwnerDeletion: true
        controller: true
        kind: Deployment
        name: istio-galley
        uid: 11fef012-4145-49ac-a43c-2e1d0a460ea4
      resourceVersion: "22484680"
      selfLink: /apis/admissionregistration.k8s.io/v1beta1/validatingwebhookconfigurations/istio-galley
      uid: 6f485e28-3b5a-4a3b-b31f-a5c477c82619
    webhooks:
    - admissionReviewVersions:
      - v1beta1
      clientConfig:
        caBundle: 
        .
        .
        .
        service:
          name: istio-galley
          namespace: istio-system
          path: /admitpilot
          port: 443
      failurePolicy: Fail
      matchPolicy: Exact
      name: pilot.validation.istio.io
      namespaceSelector: {}
      objectSelector: {}
      rules:
      - apiGroups:
        - config.istio.io
        apiVersions:
        - v1alpha2
        operations:
        - CREATE
        - UPDATE
        resources:
        - httpapispecs
        - httpapispecbindings
        - quotaspecs
        - quotaspecbindings
        scope: '*'
      - apiGroups:
        - rbac.istio.io
        apiVersions:
        - '*'
        operations:
        - CREATE
        - UPDATE
        resources:
        - '*'
        scope: '*'
      - apiGroups:
        - authentication.istio.io
        apiVersions:
        - '*'
        operations:
        - CREATE
        - UPDATE
        resources:
        - '*'
        scope: '*'
      - apiGroups:
        - networking.istio.io
        apiVersions:
        - '*'
        operations:
        - CREATE
        - UPDATE
        resources:
        - destinationrules
        - envoyfilters
        - gateways
        - serviceentries
        - sidecars
        - virtualservices
        scope: '*'
      sideEffects: Unknown
      timeoutSeconds: 30
    - admissionReviewVersions:
      - v1beta1
      clientConfig:
        caBundle: 
        .
        .
        .
        service:
          name: istio-galley
          namespace: istio-system
          path: /admitmixer
          port: 443
      failurePolicy: Fail
      matchPolicy: Exact
      name: mixer.validation.istio.io
      namespaceSelector: {}
      objectSelector: {}
      rules:
      - apiGroups:
        - config.istio.io
        apiVersions:
        - v1alpha2
        operations:
        - CREATE
        - UPDATE
        resources:
        - rules
        - attributemanifests
        - circonuses
        - deniers
        - fluentds
        - kubernetesenvs
        - listcheckers
        - memquotas
        - noops
        - opas
        - prometheuses
        - rbacs
        - solarwindses
        - stackdrivers
        - cloudwatches
        - dogstatsds
        - statsds
        - stdios
        - apikeys
        - authorizations
        - checknothings
        - listentries
        - logentries
        - metrics
        - quotas
        - reportnothings
        - tracespans
        scope: '*'
      sideEffects: Unknown
      timeoutSeconds: 30
    
    

    Long story short

    I've been banging my head against this Istio issue for more than two weeks. I'm sure plenty of people feel the same when troubleshooting Istio on k8s. Any suggestions are welcome! Here's how I understand the problem; please correct me if I'm wrong:

    • The log evidence shows that the RBAC rules are not allowing my access to the resource
    • I need to update the RBAC rules
    • The rules are distributed by Mixer to the Envoy container according to the ServiceRoleBinding
    • So I need to update the ServiceRoleBinding instead
    • I cannot update the ServiceRoleBinding because either the validating admission webhook or Istio Mixer is preventing me from doing so

    I've run into the following problems:

    I cannot update the ServiceRoleBinding even after I deleted the validating webhook

    I tried deleting this validating webhook so that I could update the ServiceRoleBinding, but the resource reverts right after I save the edit. The validating webhook itself is generated automatically from a ConfigMap, so I had to update that in order to change the webhook.

    Is there some kind of cache in Galley that Mixer uses to distribute the config?

    I can't find any relevant log indicating that the rbac.istio.io resource is protected or validated by any service in the istio-system namespace.

    How can I get the Mixer logs?

    I need to understand exactly which component controls the policy. I managed to change the log level but failed to find anything useful.

    Most importantly: how do I debug an Envoy container?

    I need to debug the Envoy app to understand why the allowed function is returning false. If it can't be debugged easily, is there a document explaining how to modify the code to add more logging and build a new image to push to GCR, so that I can run it again and see from the logs what is going on behind the scenes?