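for b in range(29):
    a = open('D:\\a\\' + str(b+1) + '.srt', 'r')  # open the subtitle file (backslashes must be escaped)
    lines = len(a.readlines())  # count the lines in the file
    a.seek(0, 0)  # rewind to the start after counting
    c = open('c.txt', 'a')  # open the output file in append mode
    if b > 0:
        ce = '\n\nThis is lesson ' + str(b+1) + '\n'  # title line, separated from the previous lesson
    if b == 0:
        ce = 'This is lesson ' + str(b+1) + '\n'  # title line
    c.write(ce)
    g = 0
    while g < lines:
        e = a.readline()  # cue number
        g += 1
        e = a.readline()  # timestamp
        g += 1
        e = a.readline()  # first line of text
        g += 1
        while e != '\n' and e != '':  # stop at a blank line or at end of file
            if '>>' in e:
                # drop the speaker-change marker; 8 characters fits the escaped
                # form '&gt;&gt;', a literal '>> ' would need e[3:] instead
                e = e[8:]
            c.write(e)
            e = a.readline()
            g += 1
    a.close()
    c.close()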
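Goal: extract the text portion of subtitle files.
Pitfalls and key points:
1) Counting the lines in a file: readlines() leaves the file position at the end, so rewind with seek(0,0) before reading again.
lines = len(a.readlines())
a.seek(0,0)
2) Open the output file in append mode 'a', so each lesson is added to c.txt instead of overwriting it.
3) readline() returns '\n' for a blank line and '' once the end of the file has been reached.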
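The three points above can be checked in isolation. The following is only a minimal sketch: the helper names (count_lines_then_rewind, read_until_eof, append_twice) and the temporary sample file are made up for illustration and are not part of the original script.

```python
import os
import tempfile


def count_lines_then_rewind(path):
    """Count lines with readlines(), then rewind so the file can be read again."""
    with open(path, 'r') as f:
        total = len(f.readlines())  # the file position is now at the end
        f.seek(0, 0)                # rewind; otherwise readline() would return ''
        first = f.readline()
    return total, first


def read_until_eof(path):
    """Collect every value readline() returns, including the final ''."""
    seen = []
    with open(path, 'r') as f:
        while True:
            line = f.readline()
            seen.append(line)
            if line == '':          # '' means end of file, '\n' means blank line
                break
    return seen


def append_twice(path):
    """Write to the same file twice in mode 'a' and return its full content."""
    for text in ('first\n', 'second\n'):
        with open(path, 'a') as f:  # 'a' appends instead of truncating
            f.write(text)
    with open(path, 'r') as f:
        return f.read()


# Hypothetical sample file, created only for this demonstration.
workdir = tempfile.mkdtemp()
sample_path = os.path.join(workdir, 'sample.srt')
with open(sample_path, 'w') as f:
    f.write('1\n00:00:00,730 --> 00:00:02,050\nHi.\n\n2\n')

print(count_lines_then_rewind(sample_path))
print(read_until_eof(sample_path))
print(append_twice(os.path.join(workdir, 'out.txt')))
```

Opening the files with a with statement also closes them automatically, which the original script does by hand with close().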
Example: 1.srt
1
00:00:00,730 --> 00:00:02,050
Hi. Welcome to Version Control.

2
00:00:02,050 --> 00:00:04,300
I'm Carolyn, one of your instructors for the course.

3
00:00:04,300 --> 00:00:05,350
>> And I'm Sarah, your other instructor.

4
00:00:05,350 --> 00:00:08,340
In this course, we'll cover the concept of Version Control and

5
00:00:08,340 --> 00:00:09,960
why you might want to use it.

6
00:00:09,960 --> 00:00:12,610
>> Yeah. So, Version Control is sort of like having a giant undo button for

7
00:00:12,610 --> 00:00:13,530
your project.

8
00:00:13,530 --> 00:00:15,720
>> You mean like this one?
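For a file shaped like 1.srt above, the same extraction can also be sketched by splitting the content on blank lines rather than counting lines. This is an alternative illustration, not the script's own method: the function name extract_text and its marker parameter are hypothetical, and the sketch assumes that cues are separated by blank lines and that the speaker-change marker appears literally as '>>'.

```python
def extract_text(srt_text, marker='>>'):
    """Return the subtitle text lines of SRT content, without numbers or timestamps."""
    out = []
    # Normalise Windows line endings, then split into cues at blank lines.
    for block in srt_text.replace('\r\n', '\n').split('\n\n'):
        rows = [r for r in block.split('\n') if r.strip()]
        # rows[0] is the cue number, rows[1] the timestamp, the rest is text.
        for row in rows[2:]:
            if row.startswith(marker):
                row = row[len(marker):].lstrip()  # drop the speaker-change marker
            out.append(row)
    return out


sample = (
    "1\n00:00:00,730 --> 00:00:02,050\nHi. Welcome to Version Control.\n\n"
    "2\n00:00:02,050 --> 00:00:04,300\n"
    "I'm Carolyn, one of your instructors for the course.\n\n"
    "3\n00:00:04,300 --> 00:00:05,350\n>> And I'm Sarah, your other instructor.\n"
)

for line in extract_text(sample):
    print(line)
# Hi. Welcome to Version Control.
# I'm Carolyn, one of your instructors for the course.
# And I'm Sarah, your other instructor.
```

Because everything after the timestamp row is treated as text, a cue whose text spans two lines needs no special handling.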
Example: 3.srt
1
00:00:00,310 --> 00:00:01,630
Hey, I'm here with Larry, and

2
00:00:01,630 --> 00:00:04,019
he's been having a problem with
a webpage he's been working on.

3
00:00:04,019 --> 00:00:06,810
I think this is a great opportunity to
both help Larry out with his problem and

4
00:00:06,810 --> 00:00:09,230
to show you what version
control can be useful for.

5
00:00:09,230 --> 00:00:10,700
>> Yeah, so,
I've been working on a website, and

6
00:00:10,700 --> 00:00:12,540
it looks a lot different
than it did before.

7
00:00:12,540 --> 00:00:13,610
There used to be a banner at the top.