Externalized Checkpoints

You can configure periodic checkpoints to be persisted externally. Externalized checkpoints write their metadata to persistent storage and are not automatically cleaned up when the job fails. This way, a checkpoint remains available to resume from if your job fails.

CheckpointConfig config = env.getCheckpointConfig();
config.enableExternalizedCheckpoints(ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);

The ExternalizedCheckpointCleanup mode configures what happens with externalized checkpoints when you cancel the job:

  • ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION: Retain the externalized checkpoint when the job is cancelled. Note that you have to manually clean up the checkpoint state after cancellation in this case.

  • ExternalizedCheckpointCleanup.DELETE_ON_CANCELLATION: Delete the externalized checkpoint when the job is cancelled. The checkpoint state will only be available if the job fails.
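The two modes above can be summarized as a small decision table. The following is a minimal, self-contained sketch and not Flink code: the class CleanupModel, the enum JobEnd, and the method checkpointRetained are hypothetical names introduced here for illustration only. It models just the two outcomes described above (job failure and job cancellation).

```java
public class CleanupModel {

    // Mirrors the two cleanup modes named in the text (a local stand-in, not the Flink enum).
    enum Cleanup { RETAIN_ON_CANCELLATION, DELETE_ON_CANCELLATION }

    // Hypothetical enum: the two ways a job can end that the text discusses.
    enum JobEnd { FAILED, CANCELLED }

    // Returns true if the externalized checkpoint is still present after the job ends.
    static boolean checkpointRetained(Cleanup mode, JobEnd end) {
        if (end == JobEnd.FAILED) {
            // Externalized checkpoints are not cleaned up automatically on failure,
            // regardless of the configured mode.
            return true;
        }
        // On cancellation, only RETAIN_ON_CANCELLATION keeps the checkpoint.
        return mode == Cleanup.RETAIN_ON_CANCELLATION;
    }

    public static void main(String[] args) {
        // Print every combination of mode and job outcome.
        for (Cleanup mode : Cleanup.values()) {
            for (JobEnd end : JobEnd.values()) {
                System.out.println(mode + " + " + end
                        + " -> retained=" + checkpointRetained(mode, end));
            }
        }
    }
}
```

In this model, the only combination that removes the checkpoint is DELETE_ON_CANCELLATION with a cancelled job. With RETAIN_ON_CANCELLATION the checkpoint always survives, which is why manual cleanup is needed in that mode.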

The target directory for the checkpoint is determined from the default checkpoint directory configuration. This is configured via the configuration key state.checkpoints.dir, which should point to the desired target directory:

state.checkpoints.dir: hdfs:///checkpoints/

This directory will then contain the checkpoint metadata required to restore the checkpoint. The actual checkpoint files remain in their configured directory. Currently, the target directory can only be set via the configuration files.

Follow the savepoint guide when you want to resume from a specific checkpoint.