http://joysofprogramming.com/download-webpage-curl-c/
Download a webpage using CURL in C
curl transfers data to and from a server, and its C library, libcurl, can be used to download a webpage from within a program. The code below downloads a webpage of your choice to a location of your choice.
To use it, set the following two constants:
- WEBPAGE_URL: The URL of the webpage you want to download.
- DESTINATION_FILE: The file in which the downloaded webpage is stored. Give the absolute path of the file.
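#include <curl/curl.h>
#include <stdio.h>
#include <stdlib.h>

#define WEBPAGE_URL "http://google.com"
#define DESTINATION_FILE "/home/user/data.txt"

/* Called by libcurl for each chunk of received data; writes it to the file. */
size_t write_data(void *ptr, size_t size, size_t nmemb, void *stream)
{
    /* Report how much was written so libcurl can detect a short write. */
    return fwrite(ptr, size, nmemb, stream);
}

int main(void)
{
    FILE *file = fopen(DESTINATION_FILE, "w+");
    if (!file) {
        perror("File Open");
        exit(EXIT_FAILURE);
    }

    CURL *handle = curl_easy_init();
    if (!handle) {
        fclose(file);
        exit(EXIT_FAILURE);
    }

    curl_easy_setopt(handle, CURLOPT_URL, WEBPAGE_URL); /* Using the http protocol */
    curl_easy_setopt(handle, CURLOPT_WRITEFUNCTION, write_data);
    curl_easy_setopt(handle, CURLOPT_WRITEDATA, file);

    /* Run the transfer and report any failure. */
    CURLcode res = curl_easy_perform(handle);
    if (res != CURLE_OK)
        fprintf(stderr, "Download failed: %s\n", curl_easy_strerror(res));

    curl_easy_cleanup(handle);
    fclose(file);
    return res == CURLE_OK ? EXIT_SUCCESS : EXIT_FAILURE;
}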
To compile this program, you must link it against libcurl:
$ gcc webpage.c -lcurl
Note the -lcurl option, which links the program against the libcurl library. Now let's look at the function curl_easy_setopt. It sets options on the handle returned by curl_easy_init. The CURLOPT_URL option specifies the location of the webpage to download. Similarly, CURLOPT_WRITEFUNCTION sets the function that receives the downloaded data, and CURLOPT_WRITEDATA sets the pointer (here, the destination file) that is passed to that function. In short, curl_easy_setopt sets the parameters of a transfer before curl_easy_perform carries out the download or upload.
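The write function does not have to write to a file. As a minimal sketch, the callback below has the same signature as write_data above but appends each chunk to a growable memory buffer. The names mem_buf and collect_body are hypothetical and not part of libcurl, and the block deliberately avoids the curl headers so it compiles on its own.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer; not part of libcurl. */
struct mem_buf {
    char  *data;  /* NUL-terminated contents, or NULL while empty */
    size_t len;   /* number of bytes stored, excluding the NUL */
};

/* Same signature as write_data. Returns the number of bytes consumed;
 * libcurl treats any other value as an error and aborts the transfer. */
size_t collect_body(void *ptr, size_t size, size_t nmemb, void *userdata)
{
    struct mem_buf *buf = userdata;
    size_t chunk = size * nmemb;

    /* Grow the buffer by the chunk size plus room for the terminator. */
    char *grown = realloc(buf->data, buf->len + chunk + 1);
    if (grown == NULL)
        return 0;  /* out of memory: report that nothing was consumed */

    memcpy(grown + buf->len, ptr, chunk);
    buf->data = grown;
    buf->len += chunk;
    buf->data[buf->len] = '\0';
    return chunk;
}
```

To use it with the program above, one would start from a zeroed struct mem_buf, pass collect_body as CURLOPT_WRITEFUNCTION and the address of the struct as CURLOPT_WRITEDATA, and free the data pointer after the transfer.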
http://www.rainsts.net/article.asp?id=989
CURL - Usage
"-o" saves the output under a file name you choose; "-O" saves it under the remote file name.
curl -o a.html http://www.abc.net
curl -O http://www.abc.net/a.html
curl -O http://www.abc.net/pic/image[1-10].jpg
"-C" resumes an interrupted download; "-r" downloads a byte range; "--compressed" requests a compressed response.
curl -C - -O http://www.abc.net/download/a.zip
curl -r 0-1023 -o part1.zip http://www.abc.net/download/a.zip
curl -r 1024- -o part2.zip http://www.abc.net/download/a.zip
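Range examples:
- 0-499: the first 500 bytes.
- 500-999: the next 500 bytes.
- -500: the last 500 bytes.
- 9500-: from byte 9500 to the end.
- 0-0,-1: the first and the last byte.
- 500-700,600-799: 300 bytes starting at offset 500 (overlapping ranges).
- 100-199,500-599: two separate 100-byte blocks.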
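The single-range forms above can be sketched in C as follows. This is an illustration of how such a specification maps to inclusive byte offsets for a resource of known size, not curl's own code. The function name resolve_range is hypothetical, and comma-separated lists are not handled.

```c
#include <stdlib.h>
#include <string.h>

/* Resolve one range ("A-B", "A-" or "-N") against a resource of `size`
 * bytes. On success stores the inclusive offsets and returns 0;
 * returns -1 for malformed or unsatisfiable specifications. */
int resolve_range(const char *spec, long size, long *first, long *last)
{
    const char *dash = strchr(spec, '-');
    char *end;

    if (dash == NULL || size <= 0)
        return -1;

    if (dash == spec) {                 /* "-N": the last N bytes */
        long n = strtol(dash + 1, &end, 10);
        if (end == dash + 1 || *end != '\0' || n <= 0)
            return -1;
        if (n > size)
            n = size;
        *first = size - n;
        *last = size - 1;
        return 0;
    }

    long a = strtol(spec, &end, 10);    /* start offset before the dash */
    if (end != dash || a < 0 || a >= size)
        return -1;

    if (dash[1] == '\0') {              /* "A-": from A to the end */
        *first = a;
        *last = size - 1;
        return 0;
    }

    long b = strtol(dash + 1, &end, 10);
    if (end == dash + 1 || *end != '\0' || b < a)
        return -1;
    if (b >= size)                      /* clamp to the last byte */
        b = size - 1;
    *first = a;
    *last = b;
    return 0;
}
```

For a 10000-byte file, "-500" and "9500-" resolve to the same offsets, 9500 through 9999.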
"-D" saves the response headers, including cookies, to a file; "-b" sends the cookies from that file with the request.
curl -D cookie.txt http://www.abc.net
curl -b cookie.txt http://www.abc.net
3. Agent
Some websites do not support browsers other than IE. "-A" sets the User-Agent string.
curl -A "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)" http://www.abc.net
4. Referer
Some websites block hotlinking. "-e" sets the Referer.
curl -e "http://www.abc.net" http://www.abc.net/a.html
5. Get & Post
"-d" sends the request as a POST.
curl "http://www.abc.net/a.asp?a=1&b=2"
curl -d "a=1&b=2" http://www.abc.net/a.asp
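"-d" sends the string exactly as given, so a value containing characters such as a space, "&" or "=" has to be percent-encoded before it goes into the body. The function below is a minimal sketch of that encoding in plain C. The name percent_encode is hypothetical, and the function leaves only letters, digits, "-", "_", "." and "~" unescaped.

```c
#include <ctype.h>
#include <stddef.h>

/* Percent-encode `in` into `out` (capacity `out_size`, including the
 * terminating NUL). Returns 0 on success, -1 if `out` is too small. */
int percent_encode(const char *in, char *out, size_t out_size)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t used = 0;

    for (; *in != '\0'; in++) {
        unsigned char c = (unsigned char)*in;
        int unreserved = isalnum(c) || c == '-' || c == '_' ||
                         c == '.' || c == '~';
        size_t need = unreserved ? 1 : 3;

        /* Keep one byte in reserve for the terminating NUL. */
        if (used + need + 1 > out_size)
            return -1;

        if (unreserved) {
            out[used++] = (char)c;
        } else {
            out[used++] = '%';
            out[used++] = hex[c >> 4];
            out[used++] = hex[c & 0x0F];
        }
    }

    if (used + 1 > out_size)
        return -1;
    out[used] = '\0';
    return 0;
}
```

Each field value would be encoded separately and then joined as name=value pairs with "&". libcurl also provides curl_easy_escape for the same purpose.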
6. FTP
"-T" uploads a file; "-u" supplies the user name and password.
curl -T a.txt -u name:password ftp://192.168.1.1:201/home/abc