
[Unresolved] Speeding up file downloads with Python requests

Python crifan
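While working on:
[Record] Batch-running Android game tests: Three Kingdoms games from Vivo's Game store
the existing code was already downloading the Android APKs: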
[201126 14:10:07][DownloadApps.py 111] download snsgz/少年三国志 68.56%
[201126 14:10:14][DownloadApps.py 111] download snsgz/少年三国志 69.6%
[201126 14:10:20][DownloadApps.py 111] download snsgz/少年三国志 70.64%
[201126 14:10:27][DownloadApps.py 111] download snsgz/少年三国志 71.68%
[201126 14:10:34][DownloadApps.py 111] download snsgz/少年三国志 72.72%
[201126 14:10:40][DownloadApps.py 111] download snsgz/少年三国志 73.76%
[201126 14:10:46][DownloadApps.py 111] download snsgz/少年三国志 74.8%
But the download does not feel fast enough.
Postscript:
[201126 14:13:10][DownloadApps.py 126] download snsgz/少年三国志 end, size=962.6MB, time=00:09:34, speed=1.7MB/s
Download finished; the average speed was 1.7MB/s.
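As a sanity check, the reported average follows directly from the size and elapsed time in the log line above. A minimal sketch; the helper names `hms_to_seconds` and `average_speed_mb_s` are made up for illustration and are not part of DownloadApps.py:

```python
def hms_to_seconds(hms: str) -> int:
    """Convert an "HH:MM:SS" string into a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def average_speed_mb_s(size_mb: float, elapsed_hms: str) -> float:
    """Average speed in MB/s for a download of size_mb megabytes."""
    return size_mb / hms_to_seconds(elapsed_hms)


# Values taken from the log line: size=962.6MB, time=00:09:34 (574 seconds)
print(round(average_speed_mb_s(962.6, "00:09:34"), 1))  # prints 1.7
```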
The code that performs the download is:
src/common/DownloadApps.py
    def download(self,task):

...
        try:
            # Probe request: with stream=True only the headers are fetched
            with requests.get(url, stream=True) as r1:
                total_size = int(r1.headers['Content-Length']) # 447681304
        except (requests.RequestException, KeyError, ValueError):
            # Request failed, or Content-Length is missing or not a number
            total_size = None
        # Resume from the bytes already on disk, if any
        temp_size = os.path.getsize(filepath) if os.path.exists(filepath) else 0
        headers = {
            'Range': 'bytes=%d-' % temp_size,
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.81 Safari/537.36",
        }
        r = requests.get(url, stream=True, headers=headers)
        with open(filepath, "ab") as f:
            for chunk in r.iter_content(chunk_size=1024*1024*10): # 10MB per chunk
                if chunk:
                    temp_size += len(chunk)
                    f.write(chunk)
                    f.flush()
                    # Progress in percent, or "unknown" when the total size is not known
                    ratio = round(100 * temp_size / total_size,2) if total_size is not None else "unknown"
                    logging.info("download {0} {1}%".format(appname,ratio))
...
Note that this is single-threaded.
I want to speed up the download.
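One caveat with the resume logic in the code above: appending to the file is only safe when the server actually honors the `Range` header, which it signals with status 206 (Partial Content). A server that ignores `Range` replies 200 with the whole file, and appending that to a partial file would corrupt it. A minimal sketch of the check; `choose_open_mode` is a hypothetical helper, not something DownloadApps.py contains:

```python
def choose_open_mode(status_code: int, existing_size: int) -> str:
    """Pick the file mode for writing the response body.

    206 means the server sent only the requested byte range, so the body
    can be appended to the partial file. Any other status means the full
    file is coming, so the partial file has to be overwritten.
    """
    if status_code == 206 and existing_size > 0:
        return "ab"
    return "wb"


print(choose_open_mode(206, 1024))  # prints ab
print(choose_open_mode(200, 1024))  # prints wb
```

It would be used as `open(filepath, choose_open_mode(r.status_code, temp_size))`, resetting `temp_size` to 0 whenever the mode is `"wb"`. A file that is already complete is a separate case: the requested range then starts past the end, and servers typically answer 416 (Range Not Satisfiable).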
python requests download speed up
python – Increase download speed of requests – Stack Overflow
python – How to measure download speed and progress using requests? – Stack Overflow
performance – Ridiculously low download speed with Python requests module – Stack Overflow
Python – Requests – Increase download speed : learnpython
faster-than-requests · PyPI
Nim Programming Language
juancarlospaco (Juan Carlos)
juancarlospaco/faster-than-requests: Faster requests on Python 3
Among these I noticed:
download2
https://github.com/juancarlospaco/faster-than-requests#download2
* threads Passing threads = True uses Multi-Threading, threads = False will Not use Multi-Threading, omitting it will Not use Multi-Threading.
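It supports multi-threading, so maybe it can improve download speed?
Will try it when I have time.
Also, it would be even better if it could output progress information.
Searched:
progress
Found:
debugs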
https://github.com/juancarlospaco/faster-than-requests#debugs
“Set Debug Mode by changing the environment variable REQUESTS_DEBUG, bool type, can be empty string, Debug Mode prints progress in real time each second on the terminal as JSON string, Debug Mode is slow. This is 100% Optional, this is provided as Extra feature.
$ export REQUESTS_DEBUG = "true"
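$ # This is the Bash command line terminal!.”
But this is set through an environment variable on the command line, not through a parameter in code. (Note that in Bash the assignment only works without spaces around `=`, i.e. export REQUESTS_DEBUG="true".)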
Also, this download:
率土之滨
27275482
https://gameapktxdl.vivo.com.cn/appstore/developer/soft/20201125/2020112509424466mb1.apk
seems to take rather long:
[201126 15:07:20][DownloadApps.py 111] download stzb/率土之滨 31.17%
[201126 15:07:25][DownloadApps.py 111] download stzb/率土之滨 31.77%
[201126 15:07:31][DownloadApps.py 111] download stzb/率土之滨 32.36%
[201126 15:07:37][DownloadApps.py 111] download stzb/率土之滨 32.96%
When I have time, I can use this file to test whether the download speed improves.
python requests multi thread speed
Speeding up Python code using multithreading – Yasoob Khalid
Python requests with multithreading – Stack Overflow
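These all talk about multiple URLs downloaded with multiple threads,
not the case here: a single URL that I want to download with multiple threads to increase speed.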
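For reference, the single-URL case can be sketched with HTTP range requests: split the file into byte ranges, fetch the ranges concurrently, and write each one at its own offset. This is only a sketch under assumptions: the server must support `Range` (status 206), the total size must be known up front (for example from `Content-Length`), and the names `split_ranges`, `fetch_range` and `download_parallel` are made up for illustration. Whether it helps at all depends on where the bottleneck is; it cannot beat a cap on the total bandwidth of the link.

```python
from concurrent.futures import ThreadPoolExecutor


def split_ranges(total_size: int, parts: int) -> list:
    """Split [0, total_size) into at most `parts` inclusive (start, end) ranges."""
    if total_size <= 0 or parts <= 0:
        raise ValueError("total_size and parts must be positive")
    step = -(-total_size // parts)  # ceiling division
    return [(start, min(start + step, total_size) - 1)
            for start in range(0, total_size, step)]


def fetch_range(url: str, start: int, end: int) -> bytes:
    """Download one byte range; the server must answer 206 Partial Content."""
    import requests  # third-party; imported lazily so split_ranges works without it

    resp = requests.get(url, headers={"Range": "bytes=%d-%d" % (start, end)},
                        timeout=60)
    if resp.status_code != 206:
        raise RuntimeError("server did not honor Range: %d" % resp.status_code)
    return resp.content


def download_parallel(url: str, filepath: str, total_size: int, parts: int = 4,
                      fetch=fetch_range) -> None:
    """Fetch all ranges concurrently and write each at its own file offset."""
    ranges = split_ranges(total_size, parts)
    with ThreadPoolExecutor(max_workers=parts) as pool, open(filepath, "wb") as f:
        # pool.map yields the bodies in the same order as `ranges`
        bodies = pool.map(lambda r: fetch(url, r[0], r[1]), ranges)
        for (start, _end), body in zip(ranges, bodies):
            f.seek(start)
            f.write(body)
```

Each range is held in memory until it is written, so for APKs of several hundred MB a real version would want smaller ranges or would stream each range to disk.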
What is the fastest way to send 100,000 HTTP requests in Python? – Stack Overflow
Multithread python requests – Stack Overflow
Speed Up Your Python Program With Concurrency – Real Python
Only later did I notice, at
https://github.com/juancarlospaco/faster-than-requests#download2
“Description: Takes a list of URLs, makes 1 HTTP GET Download for each URL of the list.”
So the multi-threading there applies across the multiple input URLs, not within a single download.
python – Increase download speed of requests – Stack Overflow
It looks like download speed might be related to the chunk size?
python requests iter_content speed
requests.get really slow when stream=True · Issue #2015 · psf/requests
network programming – Get the speed of downloading a file using python – Stack Overflow
Advanced Usage — Requests 2.25.0 documentation
Body Content Workflow
I did not fully understand this; it seems the poster's usage was correct.
iter_content slow with large chunk size on HTTPS connection · Issue #3729 · psf/requests
Looks like there may be a bug on macOS?
So when I have time, try different chunk sizes and see whether the download speed changes?
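Such a comparison could be sketched as below. The function names are made up for illustration, the clock is injectable only so the measurement can be checked without a network, and MB is taken as 1024*1024 bytes to match how the chunk size is written in the code above (an assumption about how the log's MB is defined).

```python
import time


def measure_throughput(chunks, clock=time.monotonic):
    """Consume an iterable of byte chunks; return (total_bytes, MB_per_second)."""
    start = clock()
    total = 0
    for chunk in chunks:
        total += len(chunk)
    elapsed = clock() - start
    speed = total / (1024 * 1024) / elapsed if elapsed > 0 else float("inf")
    return total, speed


def compare_chunk_sizes(url, chunk_sizes):
    """Download `url` once per chunk size and report the speed of each run."""
    import requests  # third-party; lazy import keeps measure_throughput standalone

    results = {}
    for size in chunk_sizes:
        with requests.get(url, stream=True, timeout=60) as resp:
            resp.raise_for_status()
            _total, speed = measure_throughput(resp.iter_content(chunk_size=size))
        results[size] = speed  # MB/s for this chunk size
    return results
```

For example, `compare_chunk_sizes(url, [1024*1024, 10*1024*1024, 20*1024*1024])` downloads the file three times and discards the data, so it is best run against a modestly sized file.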
python requests download speed
python requests increase download speed
python – requests library: how to speed it up? – Stack Overflow
Maybe there are other libraries that are faster than requests?
[Unresolved] Other Python libraries that download files faster
Later, from
Downloading files from web using Python – GeeksforGeeks
this passage:
“The HTTP response content (r.content) is nothing but a string which is storing the file data. So, it won’t be possible to save all the data in a single string in case of large files. To overcome this problem, we do some changes to our program:
* Since all file data can’t be stored by a single string, we use r.iter_content method to load data in chunks, specifying the chunk size.
r = requests.get(URL, stream = True)
Setting stream parameter to True will cause the download of response headers only and the connection remains open. This avoids reading the content all at once into memory for large responses. A fixed chunk will be loaded each time while r.iter_content is iterated.”
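I finally understood what stream = True means:
  • Ordinary small file
    • stream = False (the default)
      • Read all the data (the file's binary content) directly from r.content
  • Large file = cannot be fetched in one go = probably would not fit in local memory even if it were
    • stream = True
      • Meaning: a streaming download, like flowing water, fetched a piece at a time
      • More precisely
        • The initial request only downloads the headers
          • The actual data content is not downloaded yet
        • After that, each time
          • you iterate r.iter_content, you read one chunk of data
            • The chunk size is determined by the chunk_size parameter
              • Typical code
                • for chunk in r.iter_content(chunk_size=1024)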
Complete example code:
(1) Small file with stream=False
# import the requests library
import requests

# URL of the image to be downloaded
image_url = "https://www.python.org/static/community_logos/python-logo-master-v3-TM.png"

# send an HTTP request to the server and save
# the HTTP response in a response object called r
r = requests.get(image_url)

# write the contents of the response (r.content)
# to a new file in binary mode, saving it as a png file
with open("python_logo.png", 'wb') as f:
    f.write(r.content)
(2) Large file with stream=True
import requests

file_url = "http://codex.cs.yale.edu/avi/db-book/db4/slide-dir/ch1-2.pdf"

# stream=True fetches only the headers here; the body is read while iterating
r = requests.get(file_url, stream=True)
with open("python.pdf", "wb") as pdf:
    for chunk in r.iter_content(chunk_size=1024):
        # writing one chunk at a time to pdf file
        if chunk:
            pdf.write(chunk)
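That is all it takes.
[Postscript]
[Unresolved] Adding real-time and average download speed statistics for requests to the Android game automation tests
[Postscript 2]
With the current chunk size of 10MB, the average speed is 1.7MB/s: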
[201127 14:51:30][DownloadApps.py 139] download fknsg3/放开那三国3 speed: cur=1.7MB/s, time: total=00:03:47, size: 360.0MB 21.35%
Changed it to 20MB:
        # ChunkSize = 1024*1024*10 # 10MB
        ChunkSize = 1024*1024*20 # 20MB
        with open(filepath, "ab") as f:
            startTime = time.time()
            prevTime = startTime
            for chunkBytes in r.iter_content(chunk_size=ChunkSize):
The resulting average speed:
[201127 16:38:24][DownloadApps.py 86 ] start to download msg/梦三国,url https://gameapktxdl.vivo.com.cn/appstore/developer/soft/20201021/20201021162842302dx.apk
[201127 16:38:24][DownloadApps.py 98 ] app total size: 1.6GB
[201127 16:38:36][DownloadApps.py 139] download msg/梦三国 speed: cur=1.7MB/s, time: total=00:00:11, size: 20.0MB 1.22%
[201127 16:38:47][DownloadApps.py 139] download msg/梦三国 speed: cur=1.8MB/s, time: total=00:00:22, size: 40.0MB 2.45%
[201127 16:38:59][DownloadApps.py 139] download msg/梦三国 speed: cur=1.7MB/s, time: total=00:00:34, size: 60.0MB 3.67%
[201127 16:39:10][DownloadApps.py 139] download msg/梦三国 speed: cur=1.8MB/s, time: total=00:00:45, size: 80.0MB 4.9%
[201127 16:39:21][DownloadApps.py 139] download msg/梦三国 speed: cur=1.8MB/s, time: total=00:00:56, size: 100.0MB 6.12%
[201127 16:39:32][DownloadApps.py 139] download msg/梦三国 speed: cur=1.8MB/s, time: total=00:01:08, size: 120.0MB 7.35%
[201127 16:39:44][DownloadApps.py 139] download msg/梦三国 speed: cur=1.7MB/s, time: total=00:01:19, size: 140.0MB 8.57%
Still around 1.7MB/s.
The chunk size has no effect on the overall speed.
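The logs are consistent with the link, not the chunk size, being the limit: a larger chunk simply takes proportionally longer to arrive, so only the interval between log lines changes. A small worked calculation (the helper name `seconds_per_chunk` is just for illustration):

```python
def seconds_per_chunk(chunk_mb: float, speed_mb_s: float) -> float:
    """Time needed to receive one chunk at a given link speed."""
    return chunk_mb / speed_mb_s


for chunk_mb in (10, 20):
    print(chunk_mb, round(seconds_per_chunk(chunk_mb, 1.7), 1))
# prints:
# 10 5.9
# 20 11.8
```

This matches the timestamps: the log lines are 6 to 7 seconds apart in the 10MB run and 11 to 12 seconds apart in the 20MB run.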
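[Postscript 20201204]
Later I found out the cause was the company WiFi being rate-limited.
After asking the network admin to raise the limit to 5MB/s, downloads reached about 5MB/s.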
