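With Flink's Savepoint feature, an upgraded program can resume computation from the exact point where the previous version stopped, so no data is lost across the upgrade.
In Flink, Checkpoints save state, are triggered automatically, and expire; a Savepoint is a pointer to a Checkpoint that must be triggered manually and does not expire.
According to the Flink roadmap, Savepoints and Checkpoints will later be merged into a single mechanism, instead of being split into two as they are now, one automatic and one manual.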
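1. Configure the Savepoint storage location in flink-conf.yaml
This setting is optional, but once it is set you no longer need to specify the Savepoint location when manually running the command that creates a Savepoint for a Job.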
state.savepoints.dir: hdfs://t-sha1-flk-01:9000/flink-savepoints
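Before relying on the default directory, it can help to confirm that the key is actually set. The following is a minimal sketch, not part of Flink itself: the function name savepoint_dir_from_conf is hypothetical, and it assumes the key sits on a single uncommented line in the simple "key: value" form shown above.

```shell
# Print the value of state.savepoints.dir from the given flink-conf.yaml.
# Prints nothing if the key is absent or commented out.
# If the key appears more than once, the last occurrence is printed.
savepoint_dir_from_conf() {
  sed -n 's/^[[:space:]]*state\.savepoints\.dir:[[:space:]]*//p' "$1" | tail -n 1
}

# Usage (the config path is an assumption; adjust it to your installation):
#   savepoint_dir_from_conf "$FLINK_HOME/conf/flink-conf.yaml"
```

If the function prints nothing, the Savepoint directory would have to be passed explicitly on the command line when the Savepoint is created.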
2. List the current Jobs
[teld@T-SHA1-FLK-01 log]$ flink list
------------------ Running/Restarting Jobs -------------------
aaaaaaaaaaaa : 8eaee3ed045c14337568c1cf3a272a45 : MonitorEngine_V1.0_SH.A1_Minute (RUNNING)
bbbbbbbbbbbb : ca1f3ac0081711ee6a0767fe1fd5b31c : MonitorEngine_V1.0_SH.A1_Second (RUNNING)
--------------------------------------------------------------
No scheduled jobs.
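When the upgrade is scripted, the Job ID can be looked up by Job name instead of being copied by hand. This is a rough sketch under one assumption: that each job line has the "start time : job ID : job name (STATE)" layout seen in the output above. The function name job_id_by_name is hypothetical.

```shell
# Read `flink list` output on stdin and print the ID of the job named $1.
# Prints nothing if no job with that name is listed.
job_id_by_name() {
  awk -F' : ' -v name="$1" '
    NF >= 3 {
      job = $3
      sub(/ \([A-Z_]+\)[ \t]*$/, "", job)   # drop the trailing state, e.g. (RUNNING)
      if (job == name) { print $2; exit }
    }'
}

# Usage:
#   flink list | job_id_by_name "MonitorEngine_V1.0_SH.A1_Second"
```

Header, separator, and "No scheduled jobs." lines contain no " : " separator, so they are skipped.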
3. Stop the Job and write its state to a Savepoint
[teld@T-SHA1-FLK-01 log]$ flink cancel -s ca1f3ac0081711ee6a0767fe1fd5b31c
Cancelling job ca1f3ac0081711ee6a0767fe1fd5b31c with savepoint to default savepoint directory.
Cancelled job ca1f3ac0081711ee6a0767fe1fd5b31c. Savepoint stored in hdfs://t-sha1-flk-01:9000/flink-savepoints/savepoint-ca1f3a-9f86a020ee76.
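The Savepoint path printed by the cancel command is needed again when the Job is restarted, so a script may want to capture it. The helper below is only a sketch: the name savepoint_path is hypothetical, and it assumes the path is reported as a URI with a scheme (hdfs://, file://, and so on) followed by a trailing period, as in the output above.

```shell
# Read `flink cancel -s` output on stdin and print the Savepoint URI.
# Takes the last URI-looking token and strips the sentence-ending period.
savepoint_path() {
  grep -oE '[a-zA-Z][a-zA-Z0-9+.-]*://[^[:space:]]+' | tail -n 1 | sed 's/\.$//'
}

# Usage:
#   sp=$(flink cancel -s ca1f3ac0081711ee6a0767fe1fd5b31c | savepoint_path)
```

A path without a scheme (for example a bare local directory) would not be matched by this pattern, so check that the result is non-empty before passing it on.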
4. Start the Job from the specified Savepoint
[teld@T-SHA1-FLK-01 log]$ flink run -s hdfs://t-sha1-flk-01:9000/flink-savepoints/savepoint-ca1f3a-9f86a020ee76 \
    -p 6 -c cn.teld.monitor.MonitorEngine monitorengine_flink_sec-1.0-jar-with-dependencies.jar
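5. Set a uid and a name for every operator in the Flink program
Flink uses the uid to match the state stored in a Savepoint to the operators of the restarted program. Without an explicit uid, the ID is generated from the job structure and may change when the program is modified, in which case the saved state can no longer be mapped back to its operator. The name only affects how the operator is displayed, for example in the web UI and logs.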
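6. Submit the upgraded package from the web UI
The steps above perform the upgrade from the command line. You can also submit the new package directly through the web UI; just specify the Savepoint path when submitting.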