Quick use case: Split a big index into daily indexes with the _reindex API

Suppose you have a large time series index that holds multiple days, months or years of data, and you want to split it into daily, weekly or monthly indexes.

Let's take the following example of time series metrics ingested into a single index, timeseries-data:

PUT timeseries-data
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  },
  "mappings": {
    "properties": {
      "@timestamp": {
        "type": "date"
      },
      "metric": {
        "type": "float"
      }
    }
  }
}

Let's index some documents with different timestamps so we can see the reindexing strategy at work:

PUT timeseries-data/_doc/1?refresh
{"@timestamp": "2020-12-10T10:10:10", "metric": 0.9081}
PUT timeseries-data/_doc/2?refresh
{"@timestamp": "2020-12-11T12:10:10", "metric": 0.9082}
PUT timeseries-data/_doc/3?refresh
{"@timestamp": "2020-12-13T13:10:10", "metric": 0.9083}

Check the index:

GET _cat/indices?index=time*&v

health status index           uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   timeseries-data SeZfKCERTdys0n9Zk4Rs1A   1   0          3            0       10kb           10kb

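Now let's run the _reindex API with a Painless script that sets the destination index name dynamically, based on @timestamp. The script overwrites ctx._index for every document, so the dest index in the request body only serves as a placeholder: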

POST _reindex?wait_for_completion=false
{
  "source": {
    "index": "timeseries-data"
  },
  "dest": {
    "index": "timeseries-data-*"
  },
  "script": {
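    "source": """
      // Parse the document's @timestamp (format: yyyy-MM-dd'T'HH:mm:ss)
      def inputFormat = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss");
      def myDate = inputFormat.parse(ctx._source['@timestamp']);

      // Keep only the day part, e.g. 2020-12-10
      def outputFormat = new SimpleDateFormat("yyyy-MM-dd");
      def outputDay = outputFormat.format(myDate);

      // Route the document to its daily index
      ctx._index = "timeseries-data-" + outputDay;
    """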
  }
}
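
Because the request uses wait_for_completion=false, it returns right away with a task ID instead of waiting for the reindex to finish. A minimal sketch of how the task could be polled with the task management API follows; <task_id> is a placeholder for the value of the "task" field in the _reindex response, not a real ID:

```
# Replace <task_id> with the "task" value returned by the _reindex call
GET _tasks/<task_id>
```

The reindex is done when the response shows "completed": true. On a large source index this can take a while, so the daily indexes may only appear gradually.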

Once the reindex task has completed, check the result:

GET _cat/indices?index=time*&v

health status index                      uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   timeseries-data-2020-12-10 QOQWyBNzQOueYZ-jPSbOHA   1   1          1            0      3.3kb          3.3kb
yellow open   timeseries-data-2020-12-11 3zrPIjsjQMyr_blRBTAgmg   1   1          1            0      3.3kb          3.3kb
green  open   timeseries-data            SeZfKCERTdys0n9Zk4Rs1A   1   0          3            0       10kb           10kb
yellow open   timeseries-data-2020-12-13 o67kPtxYQx-EwJmMidS0Ow   1   1          1            0      3.3kb          3.3kb
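
Note that the daily indexes show up as yellow with one replica, while the source index is green with zero replicas. This is most likely because they were created automatically during the reindex, with default settings and dynamically inferred mappings, rather than with the settings of timeseries-data. One way to carry the source settings and mappings over is to create an index template before running the reindex. The sketch below assumes Elasticsearch 7.8 or later (composable index templates), and the template name timeseries-data-daily is only an illustrative choice:

```
# Hypothetical template name; applies to every index matching timeseries-data-*
PUT _index_template/timeseries-data-daily
{
  "index_patterns": ["timeseries-data-*"],
  "template": {
    "settings": {
      "number_of_shards": 1,
      "number_of_replicas": 0
    },
    "mappings": {
      "properties": {
        "@timestamp": {
          "type": "date"
        },
        "metric": {
          "type": "float"
        }
      }
    }
  }
}
```

The pattern timeseries-data-* does not match the source index timeseries-data itself, so only the newly created daily indexes pick up the template.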

Happy reindexing :)