Federating System and User metrics to S3 in Red Hat OpenShift for AWS

Last edited: May 15, 2024
Published: June 7, 2021
Authors: Paul Czarkowski

Tags: AWS, ROSA

This content is authored by Red Hat experts, but has not yet been tested on every supported configuration.

This guide walks through federating Prometheus system and user workload metrics from a ROSA cluster to S3 storage using Thanos.

ToDo - Add Authorization in front of Thanos APIs

Prerequisites

A ROSA cluster deployed with STS
aws CLI
oc CLI
jq
helm CLI

Set up environment

Create environment variables

export CLUSTER_NAME=my-cluster
export S3_BUCKET=my-thanos-bucket
export REGION=us-east-2
export NAMESPACE=federated-metrics
export SA=aws-prometheus-proxy
export SCRATCH_DIR=/tmp/scratch
export OIDC_PROVIDER=$(oc get authentication.config.openshift.io cluster -o json | jq -r .spec.serviceAccountIssuer| sed -e "s/^https:\/\///")
export AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
export AWS_PAGER=""
rm -rf $SCRATCH_DIR
mkdir -p $SCRATCH_DIR
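
Before continuing, it can help to confirm that the CLIs used later in this guide are on the PATH and that the issuer URL was trimmed as expected. The following is only a minimal sketch: `missing_tools` and `strip_scheme` are hypothetical helper names introduced here for illustration, not part of any CLI.

```shell
# Hypothetical helper: print each named command that is NOT found on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Hypothetical helper: remove a leading "https://", the same
# transformation the sed call applies to the service account issuer.
strip_scheme() {
  printf '%s\n' "${1#https://}"
}
```

For example, `missing_tools oc aws jq helm` prints nothing when all four tools are installed, and `echo "$OIDC_PROVIDER"` should show the issuer host and path with no `https://` prefix.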

Create namespace
```
oc new-project $NAMESPACE
```

AWS Preparation

Create an S3 bucket

aws s3 mb --region $REGION s3://$S3_BUCKET

Create a Policy for access to S3

cat <<EOF > $SCRATCH_DIR/s3-policy.json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Statement",
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket",
                "s3:GetObject",
                "s3:DeleteObject",
                "s3:PutObject",
                "s3:PutObjectAcl"
            ],
            "Resource": [
                "arn:aws:s3:::$S3_BUCKET/*",
                "arn:aws:s3:::$S3_BUCKET"
            ]
        }
    ]
}
EOF
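
Because the heredoc expands `$S3_BUCKET` at the moment the file is written, an unset variable would silently produce ARNs such as `arn:aws:s3:::/*`. One way to guard against that is sketched below, assuming a hypothetical helper named `policy_has_bucket` that checks the rendered file for both the bucket-level and object-level ARNs.

```shell
# Hypothetical helper: succeed only if the rendered policy file names the
# expected bucket in both the bucket ARN and the object ("/*") ARN.
policy_has_bucket() {
  local file=$1 bucket=$2
  # An empty bucket name can never be valid.
  [ -n "$bucket" ] || return 1
  grep -qF "\"arn:aws:s3:::${bucket}\"" "$file" &&
    grep -qF "\"arn:aws:s3:::${bucket}/*\"" "$file"
}
```

It could be called as `policy_has_bucket "$SCRATCH_DIR/s3-policy.json" "$S3_BUCKET" || echo "policy is missing the bucket name"` before creating the IAM policy.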

Apply the Policy

S3_POLICY=$(aws iam create-policy --policy-name $CLUSTER_NAME-thanos \
  --policy-document file://$SCRATCH_DIR/s3-policy.json \
  --query 'Policy.Arn' --output text)
echo $S3_POLICY

Create a Trust Policy

cat <<EOF > $SCRATCH_DIR/TrustPolicy.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${AWS_ACCOUNT_ID}:oidc-provider/${OIDC_PROVIDER}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${OIDC_PROVIDER}:sub": [
            "system:serviceaccount:${NAMESPACE}:${SA}"
          ]
        }
      }
    }
  ]
}
EOF
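
The `sub` condition in the trust policy limits the role to a single Kubernetes service account, whose token subject has the form `system:serviceaccount:<namespace>:<name>`. The sketch below spells out how the two interpolated values are composed. `oidc_provider_arn` and `sa_subject` are hypothetical helper names, and any account ID or issuer passed to them in examples is a placeholder.

```shell
# Hypothetical helper: build the federated principal ARN.
# Args: AWS account ID, OIDC issuer host and path (without "https://").
oidc_provider_arn() {
  printf 'arn:aws:iam::%s:oidc-provider/%s\n' "$1" "$2"
}

# Hypothetical helper: build the service account subject claim.
# Args: namespace, service account name.
sa_subject() {
  printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}
```

With the values exported earlier, `sa_subject "$NAMESPACE" "$SA"` should match the string in the rendered TrustPolicy.json. If the namespace or service account name changes later, the role can no longer be assumed until the trust policy is updated.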

Create a Role for Thanos to access S3

S3_ROLE=$(aws iam create-role \
  --role-name "$CLUSTER_NAME-thanos-s3" \
  --assume-role-policy-document file://$SCRATCH_DIR/TrustPolicy.json \
  --query "Role.Arn" --output text)
echo $S3_ROLE

Attach the Policy to the Role

aws iam attach-role-policy \
  --role-name "$CLUSTER_NAME-thanos-s3" \
  --policy-arn $S3_POLICY
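
To confirm the attachment took effect, the role's attached policies can be listed and compared against the policy ARN. This is a minimal sketch under the assumption that the `aws` CLI is authenticated against the same account; `role_has_policy` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the given policy ARN is attached to the role.
# Args: role name, policy ARN.
role_has_policy() {
  # With --output text the ARNs come back tab-separated on one line,
  # so split them onto separate lines and look for an exact match.
  aws iam list-attached-role-policies --role-name "$1" \
    --query 'AttachedPolicies[].PolicyArn' --output text |
    tr '\t' '\n' | grep -qxF "$2"
}
```

For example: `role_has_policy "$CLUSTER_NAME-thanos-s3" "$S3_POLICY" && echo attached`.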

Deploy Operators

Add the MOBB chart repository to your Helm

helm repo add mobb https://rh-mobb.github.io/helm-charts/

Update your repositories
```
helm repo update
```

Use the mobb/operatorhub chart to deploy the needed operators

helm upgrade -n $NAMESPACE custom-metrics-operators \
  mobb/operatorhub --install \
  --values https://raw.githubusercontent.com/rh-mobb/helm-charts/main/charts/rosa-thanos-s3/files/operatorhub.yaml

Deploy Thanos Store Gateway

Grafana Alloy scrapes the Prometheus metrics and ships them to Thanos, which then stores them in S3. Grafana Alloy currently needs to run as a specific user, so its service account must be granted the anyuid SecurityContextConstraint.
```
oc adm policy add-scc-to-user anyuid -z rosa-thanos-s3-alloy
```

Deploy ROSA Thanos S3 Helm Chart

helm upgrade -n $NAMESPACE rosa-thanos-s3 --install mobb/rosa-thanos-s3 \
  --set "aws.roleArn=$S3_ROLE" \
  --set "rosa.clusterName=$CLUSTER_NAME" \
  --set "aws.region=$REGION" \
  --set "aws.bucket=$S3_BUCKET"

Append remoteWrite settings to the user-workload-monitoring config to forward user workload metrics to Thanos.

Check if the User Workload Config Map exists:

oc -n openshift-user-workload-monitoring get \
  configmaps user-workload-monitoring-config
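
The choice between creating and editing the ConfigMap can also be scripted. The sketch below assumes a hypothetical helper named `uwm_config_action`; note that it treats any failure of `oc get` (including not being logged in) as "the ConfigMap does not exist", so it is only a convenience, not a robust check.

```shell
# Hypothetical helper: print "edit" if the user workload monitoring
# ConfigMap already exists, otherwise print "create".
uwm_config_action() {
  if oc -n openshift-user-workload-monitoring get \
       configmap user-workload-monitoring-config >/dev/null 2>&1; then
    echo edit
  else
    echo create
  fi
}
```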

If the config doesn’t exist run:

cat << EOF | oc apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: user-workload-monitoring-config
  namespace: openshift-user-workload-monitoring
data:
  config.yaml: |
    prometheus:
      remoteWrite:
        - url: "http://thanos-receive.${NAMESPACE}.svc.cluster.local:9091/api/v1/receive"
EOF

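Otherwise, edit the existing ConfigMap and add the remoteWrite entry under prometheus. Shell variables are not expanded inside the editor, so write the namespace out in full (federated-metrics here; substitute your own value of NAMESPACE if it differs):

oc -n openshift-user-workload-monitoring edit \
  configmaps user-workload-monitoring-config

  data:
    config.yaml: |
      ...
      prometheus:
      ...
        remoteWrite:
          - url: "http://thanos-receive.federated-metrics.svc.cluster.local:9091/api/v1/receive"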
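
The remoteWrite URL is the in-cluster DNS name of the thanos-receive Service, which is expected to live in the namespace where the rosa-thanos-s3 chart was installed. A small sketch of how the URL is composed follows, so that it can be printed and pasted into the editor. `remote_write_url` is a hypothetical helper name; the port and path are copied from the configuration in this guide.

```shell
# Hypothetical helper: print the Thanos Receive remote-write URL.
# Arg: namespace where the rosa-thanos-s3 chart was installed.
remote_write_url() {
  printf 'http://thanos-receive.%s.svc.cluster.local:9091/api/v1/receive\n' "$1"
}
```

Running `remote_write_url "$NAMESPACE"` prints the fully expanded URL for the current environment.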

Check metrics are flowing by logging into Grafana

Get the Route URL for Grafana (remember it's https) and log in using the username root and the password you updated to (or the default of secret).
```
oc -n $NAMESPACE get route rosa-thanos-s3-grafana-cr-route
```
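
To print a ready-to-open URL rather than the full route listing, the route's host can be extracted with a JSONPath query. This is a minimal sketch: `grafana_url` is a hypothetical helper name, and it assumes `NAMESPACE` is still exported and the route name matches the one shown above.

```shell
# Hypothetical helper: print the https URL of the Grafana route.
grafana_url() {
  local host
  host=$(oc -n "$NAMESPACE" get route rosa-thanos-s3-grafana-cr-route \
    -o jsonpath='{.spec.host}') || return 1
  printf 'https://%s\n' "$host"
}
```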
Once logged in, go to Dashboards -> Manage and expand the federated-metrics group, where you should see the cluster metrics dashboards. Click on the USE Method / Cluster dashboard and you should see metrics. \o/

Cleanup

Delete the Helm Charts

helm delete -n $NAMESPACE rosa-thanos-s3
helm delete -n $NAMESPACE custom-metrics-operators

Delete the namespace
```
oc delete project $NAMESPACE
```
Delete the S3 bucket
```
aws s3 rb --force s3://$S3_BUCKET
```

Delete the AWS IAM Role and Policy

aws iam detach-role-policy \
  --role-name "$CLUSTER_NAME-thanos-s3" \
  --policy-arn $S3_POLICY
aws iam delete-role --role-name "$CLUSTER_NAME-thanos-s3"
aws iam delete-policy --policy-arn $S3_POLICY
