Managing MongoDB TTL Indexes

Learn how to manage the TTL (Time to Live) index on raw MongoDB collections, such as METRIC_RAW_DATA, to configure when data is deleted.

Valid in Version: 2020.2.0 and later

Table of Contents

How the Akana Platform uses TTL indexes
Extending the TTL
Use case: Extending the TTL
System impact of increased TTL
Operational metric rollup configuration

How the Akana Platform uses TTL indexes

The Akana Platform uses MongoDB, as an alternative to the traditional RDBMS, to help scale certain features such as OAuth, analytics, auditing, and Policy Manager alerts.

The platform uses MongoDB for managing analytics information in the Community Manager developer portal (Community Manager) as well as for the Envision product. In the Akana Platform, analytics data is marked for deletion, via TTL-indexed fields, after it has been aggregated into rollups.

In these scenarios, MongoDB's built-in TTL indexes let you control how long data is kept. When a document's TTL expires, MongoDB deletes the document.

By default, data added to the METRIC_RAW_DATA collection is scheduled for deletion after one hour via a MongoDB TTL index.
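Before modifying anything, you can confirm the current TTL in the mongo shell with db.METRIC_RAW_DATA.getIndexes(); TTL indexes are the entries that carry an expireAfterSeconds field. The helper below is a minimal sketch of that check, operating on a getIndexes()-style result; the index key value._deleteOn and the index name are illustrative assumptions.

```javascript
// Sketch: pick out TTL indexes from a getIndexes()-style array.
// A TTL index is identified by the presence of expireAfterSeconds.
function findTtlIndexes(indexes) {
  return indexes.filter((ix) => typeof ix.expireAfterSeconds === "number");
}

// Example input shaped like db.METRIC_RAW_DATA.getIndexes() output;
// the "value._deleteOn" key and "deleteOn_ttl" name are assumptions.
const indexes = [
  { v: 2, key: { _id: 1 }, name: "_id_" },
  { v: 2, key: { "value._deleteOn": 1 }, name: "deleteOn_ttl", expireAfterSeconds: 3600 },
];

const ttl = findTtlIndexes(indexes);
console.log(ttl[0].name, ttl[0].expireAfterSeconds); // deleteOn_ttl 3600
```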

In some cases, the MongoDB Administrator might want to modify the TTL indexes; for example, if the data is not being processed fast enough (see Use case: Extending the TTL below).

For information about modifying the MongoDB TTL indexes, refer to the MongoDB documentation: https://docs.mongodb.com/manual/core/index-ttl/.

Extending the TTL

You might find that an external script that is pulling data from the Akana Platform database, such as an ETL script, cannot pull data fast enough, resulting in loss of data. In this scenario, you might want to extend the life of the data to accommodate the slower retrieval rate so that data is not lost. When external systems are interfacing with the raw data directly, extending the TTL, and therefore the life of the data, can give these processes time to read the data.

Extending the TTL is covered in the MongoDB documentation: https://docs.mongodb.com/manual/core/index-ttl/.

Use case: Extending the TTL

Let's say that external ETL scripts that are in use are not pulling the analytics data from Community Manager fast enough, resulting in data loss. As a solution, you might want to extend the life of the data by increasing the TTL.

Below is a generalized example of the command to modify the TTL, run from the mongo shell. There are three placeholder values: <collection>, <index_spec>, and <seconds>.

db.runCommand({
  "collMod": <collection>,
  "index": {
    "keyPattern": <index_spec>,
    "expireAfterSeconds": <seconds>
  }
})
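Filled in with concrete values, the command document might look like the sketch below. The collection name METRIC_RAW_DATA and the index key value._deleteOn match the examples in this article; the value 7200 is an illustrative choice that extends the default one-hour TTL to two hours.

```javascript
// Sketch: build the collMod command document for extending a TTL index.
// In the mongo shell, the resulting object would be passed to db.runCommand().
function buildCollMod(collection, indexKey, seconds) {
  return {
    collMod: collection,
    index: {
      keyPattern: { [indexKey]: 1 },
      expireAfterSeconds: seconds,
    },
  };
}

// Extend the default 1-hour TTL (3600 s) to 2 hours (7200 s).
const cmd = buildCollMod("METRIC_RAW_DATA", "value._deleteOn", 7200);
console.log(JSON.stringify(cmd));
```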

Below is an example of Java code that performs the same TTL update, via the MongoDB Java driver, on a collection named TestDS1.

// This snippet assumes a MongoClientWrapper helper class that holds a
// MongoDB Java driver MongoClient ("mongoClient"), a logger, and default
// database/collection settings.
public static void main(String[] args) {
  MongoClientWrapper wrapper = new MongoClientWrapper("mongodb://localhost:27017");
  wrapper.setDefaultDB("METRIC_ROLLUP_DATA");
  wrapper.setDefaultCollection("TestDS1");
  wrapper.updateTTL(1); // new expiry, in seconds
}

// Updates the TTL on the default TTL index key, "value._deleteOn".
public Document updateTTL(long newExpiry) {
  return updateTTL(getDefaultDB(), getDefaultCollection(), "value._deleteOn", newExpiry);
}

// Updates the TTL on the given index key in the default database and collection.
public Document updateTTL(String deleteOnKey, long newExpiry) {
  return updateTTL(getDefaultDB(), getDefaultCollection(), deleteOnKey, newExpiry);
}

// Runs the collMod command to change expireAfterSeconds on the TTL index.
public Document updateTTL(String dbName, String collectionName, String deleteOnKey, long newExpiry) {
  Document cmd = new Document("collMod", collectionName)
      .append("index", new Document("keyPattern", new Document(deleteOnKey, 1))
          .append("expireAfterSeconds", newExpiry));
  Document r = mongoClient.getDatabase(dbName).runCommand(cmd);
  logger.debug(r.toJson());
  return r;
}

System impact of increased TTL

Be aware that increasing the TTL increases the hardware and memory requirements for data capture, management, and storage.

For example, if you extend the TTL from the default of 1 hour to 2 hours, twice as much data must be held at any given time. This affects disk space requirements, since the data is stored on disk, as well as RAM requirements, since the indexes are held in memory.
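The relationship is linear: retained raw-data volume scales directly with the TTL. The sketch below illustrates this; the ingest rate and per-document size are made-up numbers for illustration only.

```javascript
// Rough sizing sketch: retained data grows linearly with the TTL.
// The ingest rate (docs/s) and document size below are illustrative assumptions.
function retainedBytes(docsPerSecond, avgDocBytes, ttlSeconds) {
  return docsPerSecond * avgDocBytes * ttlSeconds;
}

const oneHour = retainedBytes(100, 512, 3600);  // default 1-hour TTL
const twoHours = retainedBytes(100, 512, 7200); // TTL extended to 2 hours
console.log(twoHours / oneHour); // 2 — double the TTL, double the data held
```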

For details about the purge intervals, and an example of a query to purge operational metric rollup data, see MongoDB: metric rollup data cleanup.

Operational metric rollup configuration

In some cases, purge intervals may not be defined for OPERATIONAL_METRIC rollup configuration. In this scenario, rollup data is not purged by default. Data could be purged by creating a TTL index based on the timestamp property and expiring after one year. In this scenario, the same property would apply to all intervals: minutes, days, weeks, and so on.
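Under that approach, a TTL index would be created on the timestamp property with a one-year expiry. The sketch below shows the key pattern and options involved; the collection name in the comment is a placeholder.

```javascript
// Sketch: key pattern and options for a one-year TTL index on "timestamp".
const ONE_YEAR_SECONDS = 365 * 24 * 60 * 60; // 31,536,000 seconds
const keys = { timestamp: 1 };
const options = { expireAfterSeconds: ONE_YEAR_SECONDS };

// In the mongo shell this would be applied as, for example:
//   db.<rollup_collection>.createIndex({ timestamp: 1 }, { expireAfterSeconds: 31536000 })
console.log(ONE_YEAR_SECONDS);
```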

A more efficient approach is to configure a separate purge interval for each OPERATIONAL_METRIC rollup configuration. For more information, and an example, see MongoDB: metric rollup data cleanup.