How To Backup Elasticsearch On Kubernetes With Amazon S3 and Kibana

Raphael De Lio
5 min read · Jul 11, 2020

Everyone is susceptible to mistakes, but those who can foresee them can prevent them long before they occur.

There are many ways you can lose all your data, and you probably don’t want that to happen. But if it eventually does, not having a backup might be your worst nightmare.

Today we will go through how to back up the data of our Elasticsearch cluster (running on Kubernetes) to Amazon S3 and protect it from unexpected data loss.

What will we be seeing?

  • Create Amazon Web Services S3 Bucket
  • Create a Kubernetes Secret to Hold Your Access Key
  • Configure Elasticsearch To Connect To S3
  • Take a snapshot through Elasticsearch
  • Take a snapshot through Kibana

Create An Amazon Web Services S3 Bucket

Let’s get started by creating an Amazon Web Services S3 bucket, which you can do from the S3 console. Make sure the bucket is in the same region as your cluster.
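If you prefer the command line over the console, the same bucket can be created with the AWS CLI. This is just a sketch: elasticsearch-backup and eu-west-1 below are example values, so use your own bucket name and the region your cluster runs in.

# example values: replace the bucket name and region with your own
aws s3api create-bucket \
  --bucket elasticsearch-backup \
  --region eu-west-1 \
  --create-bucket-configuration LocationConstraint=eu-west-1
# note: for us-east-1, omit the --create-bucket-configuration flag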

Create a Kubernetes Secret To Hold Your Access Key

Next, still in your AWS account, create an IAM user, copy its access key ID and secret, and attach the following user policy. This is important to make sure the access keys, which you will need to provide to your cluster, can only access the intended bucket.

{
  "Statement": [
    {
      "Action": [
        "s3:*"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::bucket-name",
        "arn:aws:s3:::bucket-name/*"
      ]
    }
  ]
}
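For reference, if you would rather script the IAM setup than click through the console, the policy above can be attached as an inline user policy with the AWS CLI. The user and policy names below are just example values, and s3-backup-policy.json is assumed to contain the JSON above:

# example names: pick whatever fits your conventions
aws iam create-user --user-name elasticsearch-backup-user
aws iam put-user-policy \
  --user-name elasticsearch-backup-user \
  --policy-name elasticsearch-s3-backup \
  --policy-document file://s3-backup-policy.json
# this prints the access key ID and secret you will need in the next step
aws iam create-access-key --user-name elasticsearch-backup-user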

Now that you have your access key, you can create a Kubernetes Secret to hold it. We will call it aws-s3-keys; create it by replacing the placeholders in the snippet below with your access key ID and secret, then running it:

kubectl create secret generic aws-s3-keys --from-literal=access-key-id='<YOUR-ACCESS-KEY-ID>' --from-literal=access-secret-key='<YOUR-ACCESS-KEY-SECRET>'
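To double-check that the secret exists and holds both entries (the values themselves are not printed), you can describe it:

kubectl describe secret aws-s3-keys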

Configure Elasticsearch To Connect To AWS S3

In order to do it, we will need to:

  1. Install the AWS S3 repository plugin (repository-s3)
  2. Add the access key to the Elasticsearch keystore

Both actions must be done before the cluster actually starts, and to accomplish that we will use init containers.

Init containers allow us to run commands before the actual entrypoint runs and the Elasticsearch cluster starts. If you followed my Deploy the Elastic Stack with the Elastic Cloud On Kubernetes (ECK) story, you already have an init container configuring vm.max_map_count, which you can see as an example:

initContainers:
- name: sysctl
  securityContext:
    privileged: true
  command: ['sh', '-c', 'sysctl -w vm.max_map_count=262144']

We will need to add two more initContainers to this spec, the first one to install the plugin:

- name: install-plugins
  command:
  - sh
  - -c
  - |
    bin/elasticsearch-plugin install --batch repository-s3

And the second one to add the Access Key to the Elasticsearch Keystore:

- name: add-aws-keys
  env:
  - name: AWS_ACCESS_KEY_ID
    valueFrom:
      secretKeyRef:
        name: aws-s3-keys
        key: access-key-id
  - name: AWS_SECRET_ACCESS_KEY
    valueFrom:
      secretKeyRef:
        name: aws-s3-keys
        key: access-secret-key
  command:
  - sh
  - -c
  - |
    echo $AWS_ACCESS_KEY_ID | bin/elasticsearch-keystore add --stdin --force s3.client.default.access_key
    echo $AWS_SECRET_ACCESS_KEY | bin/elasticsearch-keystore add --stdin --force s3.client.default.secret_key

As you can see, we expose the access key ID and secret as environment variables and then store them in the Elasticsearch keystore.

Finally, you will have something like:
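Here is a minimal sketch of the whole manifest, assuming an ECK-managed single-node cluster; the metadata name, version, and node count are placeholders you should adjust to match your own deployment from the ECK story:

# placeholders: cluster name, version and count should match your own deployment
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elasticsearch
spec:
  version: 7.8.0
  nodeSets:
  - name: default
    count: 1
    podTemplate:
      spec:
        initContainers:
        - name: sysctl
          securityContext:
            privileged: true
          command: ['sh', '-c', 'sysctl -w vm.max_map_count=262144']
        - name: install-plugins
          command:
          - sh
          - -c
          - |
            bin/elasticsearch-plugin install --batch repository-s3
        - name: add-aws-keys
          env:
          - name: AWS_ACCESS_KEY_ID
            valueFrom:
              secretKeyRef:
                name: aws-s3-keys
                key: access-key-id
          - name: AWS_SECRET_ACCESS_KEY
            valueFrom:
              secretKeyRef:
                name: aws-s3-keys
                key: access-secret-key
          command:
          - sh
          - -c
          - |
            echo $AWS_ACCESS_KEY_ID | bin/elasticsearch-keystore add --stdin --force s3.client.default.access_key
            echo $AWS_SECRET_ACCESS_KEY | bin/elasticsearch-keystore add --stdin --force s3.client.default.secret_key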

Now you can start your pods, and we should be able to take our first snapshot!

First we will learn how to do it using Elasticsearch directly, and then we will see how to do it through Kibana.

Take a snapshot through Elasticsearch

Before we create our first backup, we need to set up a repository, which is where your backups will live. To do so, send the following request:

PUT _snapshot/my_backup
{
  "type": "s3",
  "settings": {
    "bucket": "name-of-bucket",
    "region": "region-of-bucket-same-as-cluster",
    "base_path": "backups"
  }
}

This will create a repository named my_backup pointing at the bucket we created at the beginning of this tutorial, under the path “backups” inside that bucket. The base_path setting is optional.
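If you want to confirm that Elasticsearch can actually reach the bucket with the credentials we stored earlier, you can verify the repository before taking any snapshots:

POST /_snapshot/my_backup/_verify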

We are now ready to take our first snapshot by sending the following request:

PUT /_snapshot/my_backup/snapshot_1
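By default this request returns immediately while the snapshot runs in the background. If you would rather have the call block until the snapshot finishes, for example when scripting backups, you can append the wait_for_completion flag:

PUT /_snapshot/my_backup/snapshot_1?wait_for_completion=true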

Depending on the size of your indices, your backup might take a long time to complete. To check its progress, you can send a request such as:

GET /_snapshot/my_backup/snapshot_1

This will return the snapshot’s state (for example IN_PROGRESS, SUCCESS, PARTIAL, or FAILED) along with details about the indices and shards it covers.

Now let’s see how we can achieve the same thing in Kibana.

Take a snapshot through Kibana

In Kibana it is easier to set up a repository and take a snapshot. Let’s see how we can do it.

First of all, open Kibana and go to Management > Snapshot and Restore.

  1. On the Repositories tab, click Register a repository.
  2. Provide a name for your repository and select type AWS S3.
  3. Provide the following settings:
    - Client: default
    - Bucket: your S3 bucket name (elasticsearch-backup in our case)
    - Any other settings you wish to configure.
  4. Click Register.
  5. Click Verify to confirm that your settings are correct and the deployment can connect to your repository.

Your snapshot repository is now set up using S3! Now let’s automate our snapshots!

  1. Open the Policies view.
  2. Click Create a policy.
  3. As you walk through the wizard, enter a policy name, a snapshot name pattern, the repository you registered above, and a daily schedule, plus any retention settings you want (an equivalent API call is sketched after this list).
  4. Review your input, and then click Create policy.
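For reference, a policy like this can also be created directly through Elasticsearch’s snapshot lifecycle management API. The values below are only an illustrative sketch (a daily run at 01:30, the my_backup repository from earlier, and a 30-day retention), not necessarily what you entered in the wizard:

PUT _slm/policy/daily-snapshots
{
  "schedule": "0 30 1 * * ?",
  "name": "<daily-snap-{now/d}>",
  "repository": "my_backup",
  "config": {
    "indices": ["*"],
    "include_global_state": true
  },
  "retention": {
    "expire_after": "30d",
    "min_count": 5,
    "max_count": 50
  }
}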

Congratulations, your new policy is created and you are now taking daily snapshots of your Elasticsearch cluster! 🎉

Medium is changing (Did you know?)

The Medium Partner Program (the monetization program) changed significantly on the 1st of August, 2023. Before that date, Medium would pay writers based on the time readers spent on their stories.

Unfortunately, this has changed. Medium now pays writers based on the interaction with their stories: the amount they earn depends on claps, follows, highlights, and replies. I really hate to ask, but if you enjoyed this story and other stories on Medium, don’t forget to interact with them. This is an easy way to keep supporting the authors you like and keep them on the platform.

Contribute

Writing takes time and effort. I love writing and sharing knowledge, but I also have bills to pay. If you like my work, please, consider donating through Buy Me a Coffee: https://www.buymeacoffee.com/RaphaelDeLio

Or by sending me BitCoin: 1HjG7pmghg3Z8RATH4aiUWr156BGafJ6Zw

Follow Me on Social Media

Stay connected and dive deeper into the world of Elasticsearch with me! Follow my journey across all major social platforms for exclusive content, tips, and discussions.

Twitter | LinkedIn | YouTube | Instagram

