  • Scraping the Videos on a Douyin User's Profile Page in 60 Lines of Code

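    Batch-Scraping Douyin Videos in 60 Lines of Code

    I won't go into how the scraper works here. I'm just posting the code, mainly so I can grab it easily myself. Anyone who needs it is welcome to take it, and a like would be appreciated.

    How to use it: open Douyin, go to a user's profile page, tap the three dots in the upper-right corner, tap Share, then tap Copy Link. Run the program, paste the link when prompted (remember to delete the "抖音,记录美好生活" slogan text that is copied along with the link), and wait for the program to finish. It downloads every video that user has uploaded.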
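
    The manual cleanup step above (deleting the slogan text around the copied link) could also be done in code. A minimal sketch, assuming the copied share text contains at least one http(s) link; the helper name `extract_share_url` is hypothetical and not part of the original script:

```python
import re


def extract_share_url(share_text: str) -> str:
    """Return the first http(s) URL found in the copied share text."""
    # re.ASCII keeps \w from matching Chinese characters, so the match
    # stops where the surrounding slogan text begins.
    match = re.search(r'https?://[\w\-./?=&%]+', share_text, re.ASCII)
    if match is None:
        raise ValueError('no link found in the share text')
    return match.group(0)
```

    With a helper like this, the prompt could accept the raw clipboard text, for example `root_url = extract_share_url(input(...))`.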

    #!/usr/bin/env python3
    # -*- coding:utf-8 -*-
    # @Time : 2021-03-15
    # @Author : wind_leaf
    import requests
    import json
    import re
    import sys
    
    headers = {
        'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
        'accept-language': 'zh-CN,zh;q=0.9,en;q=0.8',
        'pragma': 'no-cache',
        'cache-control': 'no-cache',
        'upgrade-insecure-requests': '1',
        'User-Agent': 'Mozilla/5.0 (iPhone; CPU iPhone OS 13_2_3 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.0.3 Mobile/15E148 Safari/604.1',
    }
    '''Example request to the post-list API, wrapped across lines for readability:
    https://www.iesdouyin.com/web/api/v2/aweme/post/?
    sec_uid=MS4wLjABAAAAeAIH1d_98INk5rNXF9Q4zrbGK9d1Eumyydy7qKL1WPk&
    count=21&
    max_cursor=0&
    aid=1128&
    _signature=j0NkqgAA7x12uIyl2MgN6I9DZL&dytk=
    '''
    '''Example redirect response for a share link; the target URL carries sec_uid:
    <a href="https://www.iesdouyin.com/share/user/72673737181?u_code=17fc9cg0a&amp;did=69773896663&amp;iid=1302254358919767&amp;sec_uid=MS4wLjABAAAAeAIH1d_98INk5rNXF9Q4zrbGK9d1Eumyydy7qKL1WPk&amp;timestamp=1615603669&amp;utm_source=copy&amp;utm_campaign=client_share&amp;utm_medium=android&amp;share_app_name=douyin">Found</a>.
    
    '''
    '''eg: https://v.douyin.com/eRENmGV/    # 一条小团团OvO'''
    
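    root_url = input('Enter the share link of the user whose videos you want to download: ').strip()
    max_cursor = 0      # pagination cursor; 0 requests the first page
    has_more = True     # whether there is another page
    page = 1        # page counter; each request asks for count=21 videos
    # Do not follow the redirect: sec_uid is read from the Location header of the response.
    response = requests.get(url=root_url, headers=headers, allow_redirects=False)
    sec_uid = re.findall(r'sec_uid=([^&]+)', response.headers['location'])[0]     # unique user id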
    
    while has_more:
        video_lis = []
        print(f'Fetching video addresses for page {page}---')
        response = requests.get(url=f'https://www.iesdouyin.com/web/api/v2/aweme/post/?sec_uid={sec_uid}&count=21&max_cursor={max_cursor}&aid=1128&_signature=dpcuDQAAFtyPbMYCi7BbQ3aXLh&dytk=', headers=headers)
        print(response.text)
        result = json.loads(response.text)
        # An empty aweme_list leaves max_cursor and has_more unchanged,
        # so the same page is requested again on the next iteration.
        if result['aweme_list']:
            max_cursor = result['max_cursor']
            has_more = result['has_more']
            for video_data in result['aweme_list']:
                dic = {'desc': video_data['desc']}
                dic['url'] = video_data['video']['play_addr']['url_list'][2]
                video_lis.append(dic)
        print('Starting download---')
        for i, video in enumerate(video_lis):
            print(f"Page {page}, video {i+1}: {video['desc']}")
            size = 0
            # stream=True lets iter_content write the file chunk by chunk
            # instead of loading the whole video into memory first.
            response = requests.get(url=video['url'], headers=headers, stream=True)
            content_size = int(response.headers.get('content-length', 0))
            sys.stdout.write('----[File size]: %0.2f MB\n' % (content_size / 1024 / 1024))

            with open(video['desc'] + '.mp4', 'wb') as f:
                for data in response.iter_content(chunk_size=1024):
                    f.write(data)
                    size += len(data)
                f.flush()
        page += 1
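
    The script finds sec_uid by running a regular expression over the redirect's Location header. As an alternative sketch (not the author's method), the standard library's URL parser can read the same query parameter. The helper name `sec_uid_from_location` is hypothetical, and the sample URL in the usage note is the redirect target quoted in the script's comments, shortened and with `&amp;` decoded to `&`:

```python
from urllib.parse import parse_qs, urlparse


def sec_uid_from_location(location: str) -> str:
    """Read the sec_uid query parameter from a redirect target URL."""
    query = parse_qs(urlparse(location).query)
    values = query.get('sec_uid')
    if not values:
        raise ValueError('sec_uid not present in the redirect URL')
    return values[0]
```

    For example, `sec_uid_from_location('https://www.iesdouyin.com/share/user/72673737181?u_code=17fc9cg0a&sec_uid=MS4wLjABAAAAeAIH1d_98INk5rNXF9Q4zrbGK9d1Eumyydy7qKL1WPk&timestamp=1615603669')` picks out the sec_uid value whether or not other parameters follow it.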
    
    
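    The script uses each video's `desc` text directly as the file name, which can fail when the description contains characters such as `/` or a line break. A minimal sketch of one way to guard against that; the name `safe_filename`, the `_` replacement character, the 100-character cap, and the `video` fallback are all assumptions made for illustration:

```python
import re


def safe_filename(desc: str, max_len: int = 100) -> str:
    """Turn a video description into a string that is safe to use as a file name."""
    # Replace path separators, control whitespace, and characters that
    # Windows rejects in file names with an underscore.
    cleaned = re.sub(r'[\\/:*?"<>|\r\n\t]', '_', desc).strip()
    # Cap the length and fall back to a fixed name for empty descriptions.
    return cleaned[:max_len] or 'video'
```

    In the download loop this could be used as `open(safe_filename(video['desc']) + '.mp4', 'wb')`. Two videos with the same description would still overwrite each other, so adding the page and index to the name may be worth considering.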
    
  • Original post: https://www.cnblogs.com/leaf-wind/p/14541428.html