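Shot segmentation (shot boundary detection) is commonly used in scenarios such as intelligent video editing and key-frame extraction.
This article presents one approach to the shot segmentation problem. It consists of two steps: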
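1. Mark the split points in the video using a shot segmentation algorithm
The core is the shot segmentation algorithm. Here is a brief description of one approach: ratio = difference(current_frame_histogram, previous_frame_histogram) / average_difference(previous_frame_histogram). A suitable ratio threshold is found through extensive experiments; if ratio exceeds the threshold, the video is split at the current frame. For copyright reasons, this article omits the concrete algorithm and its implementation. The code that uses cv2's calcHist to compute the histograms of a frame's three color channels is as follows (OpenCV stores frames in BGR order, so channel indices 0, 1, 2 are B, G, R):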
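for id in range(3):
    # Use channel index id (not a fixed 0); the upper bound of the range is exclusive, so 256 keeps value 255.
    self.current_hist_rgb[id] = cv2.calcHist([self.frame], [id], None, [256], [0, 256])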
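Since the concrete algorithm is omitted, the following is only a minimal sketch of the ratio idea described above, not the author's implementation. It assumes the difference is the sum of absolute bin-wise differences and that the average is taken over the differences seen so far in the current shot. The names histogram_difference and RatioCutDetector and the threshold of 5.0 are hypothetical. Histograms are plain sequences of numbers here, so a flattened calcHist result could be passed in.

```python
from typing import List, Optional, Sequence


def histogram_difference(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute bin-wise differences between two histograms."""
    return float(sum(abs(x - y) for x, y in zip(a, b)))


class RatioCutDetector:
    """Flags a cut when the current difference is much larger than the
    average of the differences seen so far in the current shot."""

    def __init__(self, threshold: float = 5.0) -> None:
        self.threshold = threshold      # hypothetical value, tune on real data
        self.previous: Optional[Sequence[float]] = None
        self.diffs: List[float] = []    # differences within the current shot

    def update(self, histogram: Sequence[float]) -> bool:
        """Feed the next frame's histogram; return True if a cut is detected."""
        if self.previous is None:
            self.previous = histogram
            return False
        diff = histogram_difference(histogram, self.previous)
        self.previous = histogram
        if not self.diffs:
            # No history yet in this shot, so there is no ratio to compute.
            self.diffs.append(diff)
            return False
        average = sum(self.diffs) / len(self.diffs)
        ratio = diff / max(average, 1e-9)   # guard against division by zero
        if ratio > self.threshold:
            self.diffs = []                 # start a new shot
            return True
        self.diffs.append(diff)
        return False


# Toy 2-bin histograms: the content changes abruptly at index 3.
frames = [[10, 0], [9, 1], [10, 0], [0, 10], [1, 9]]
detector = RatioCutDetector(threshold=5.0)
cuts = [i for i, h in enumerate(frames) if detector.update(h)]
print(cuts)  # [3]
```

Resetting the history after each cut keeps the large difference at the boundary from inflating the average for the next shot. With perfectly static footage the average is zero and any change produces a very large ratio, so a real implementation would probably need a minimum-difference floor as well.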
2. Split the video according to the marks
This article uses ffmpeg to split the video (ffmpeg must be installed). The command is:
ffmpeg -ss starttime -i input.mp4 -t duration -codec copy output.mp4 -y
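The order of the parameters in this command cannot be changed arbitrarily: -ss must be the first parameter, otherwise the split video may show a black screen, and -t must come after -i, otherwise the split video may have an incorrect duration. In practice, the split point is not exactly at the time given by -ss but at the nearest preceding key frame, because -codec copy copies the streams without re-encoding and therefore can only start a clip on a key frame.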
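The ordering constraint can be made explicit by building the argument list in code. This is a small sketch under the assumptions of the command above; build_cut_command is a hypothetical helper name, and the three-decimal formatting of the times is an arbitrary choice.

```python
import shlex
from typing import List


def build_cut_command(src: str, start: float, duration: float, dst: str) -> List[str]:
    """Build the ffmpeg argument list with -ss before -i and -t after -i."""
    return [
        "ffmpeg",
        "-ss", f"{start:.3f}",     # seek position, placed before -i
        "-i", src,
        "-t", f"{duration:.3f}",   # clip duration, placed after -i
        "-codec", "copy",          # stream copy, so the cut snaps to a key frame
        dst,
        "-y",                      # overwrite the output file without asking
    ]


cmd = build_cut_command("input.mp4", 12.5, 4.0, "output.mp4")
print(shlex.join(cmd))
# To execute it: subprocess.run(cmd, check=True)
```

Passing the list to subprocess.run instead of formatting a shell string avoids quoting problems when a file path contains spaces.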
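Finally, this article uses ffmpeg-python (install it with pip) to compute the video's pts, which in the code below is the number of frames divided by the duration, i.e. the average frame rate. See the calcPTS method of VideoCutEngine for the implementation.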
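To illustrate the calculation without needing a video file, the sketch below works on a hand-written dictionary shaped like the result of ffmpeg.probe(), using the streams, codec_type, nb_frames and duration fields that calcPTS reads. The helper names average_fps and frame_to_seconds are hypothetical, and returning None when a field is missing is an assumption about how one might handle files that do not report nb_frames.

```python
from typing import Any, Dict, Optional


def average_fps(probe: Dict[str, Any]) -> Optional[float]:
    """Average frame rate of the first video stream, or None if unavailable."""
    for stream in probe.get("streams", []):
        if stream.get("codec_type") != "video":
            continue
        try:
            frames = int(stream["nb_frames"])
            seconds = float(stream["duration"])
        except (KeyError, ValueError):
            return None   # the fields may be missing or non-numeric
        return frames / seconds if seconds > 0 else None
    return None


def frame_to_seconds(frame_index: int, fps: float) -> float:
    """Convert a frame index into a timestamp usable for -ss."""
    return frame_index / fps


# Hand-written stand-in for the dictionary returned by ffmpeg.probe().
sample_probe = {"streams": [
    {"codec_type": "audio", "duration": "10.0"},
    {"codec_type": "video", "nb_frames": "240", "duration": "10.0"},
]}
fps = average_fps(sample_probe)
print(fps, frame_to_seconds(60, fps))  # 24.0 2.5
```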
Implementation:
import cv2
import ffmpeg
import numpy as np
import subprocess
import sys


class VideoCutEngine():
    def __init__(self, input):
        self.input = input

    def calcPTS(self):
        # Returns (ok, frames per second), computed as nb_frames / duration.
        try:
            probe = ffmpeg.probe(self.input)
        except ffmpeg.Error as e:
            print(e.stderr, file=sys.stderr)
            return False, 0
        video_stream = next((stream for stream in probe['streams'] if stream['codec_type'] == 'video'), None)
        if video_stream is None:
            return False, 1
        num_frames = int(video_stream['nb_frames'])
        duration = float(video_stream['duration'])
        return True, num_frames * 1.0 / duration

    def doCut(self, start, duration, output):
        # -ss must come before -i and -t after -i; an argument list avoids shell quoting issues.
        cmd = ['ffmpeg', '-ss', str(start), '-i', self.input, '-t', str(duration), '-codec', 'copy', output, '-y']
        return subprocess.run(cmd).returncode


class SceneSplitEngine():
    def __init__(self):
        self.frame = None
        self.current_hist_rgb = [0, 0, 0]
        self.last_hist_rgb = [0, 0, 0]
        self.frame_count = 0
        self.current_shot_count = 0
        self.hist_diff = []

    def setFrame(self, frame):
        self.frame = frame
        self.frame_count += 1
        self.current_shot_count += 1

    def doSplit(self):
        # One 256-bin histogram per channel (OpenCV frames are in BGR order).
        for id in range(3):
            self.current_hist_rgb[id] = cv2.calcHist([self.frame], [id], None, [256], [0, 256])
        # The concrete algorithm is omitted. The main loop expects
        # (is_cut, start_frame, end_frame); this placeholder never reports a cut.
        return False, 0, 0


input_path = '/data/test.mp4'

if __name__ == '__main__':
    sceneSpliter = SceneSplitEngine()
    videoCutter = VideoCutEngine(input_path)
    videoCapturer = cv2.VideoCapture(input_path)
    ok, fps = videoCutter.calcPTS()
    if not ok:
        fps = 24.0  # fall back to 24 frames per second
    while True:
        ret1, frame = videoCapturer.read()
        if not ret1:
            break
        sceneSpliter.setFrame(frame)
        ret2, start, end = sceneSpliter.doSplit()
        if ret2:
            # Convert frame indices to seconds; keep every clip at least 1 second long.
            duration = max((end - start) / fps, 1)
            print(ret2, start / fps, duration)
            output = '/data/output{}.mp4'.format(start / fps)
            videoCutter.doCut(start / fps, duration, output)
    videoCapturer.release()