  
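Last edited by 523066680 on 2016-11-29 11:31
Reply to 6# tmplinshi
I have added some supplementary notes and corrections.
As for where the _headers and content-disposition keys come from: the Perl documentation does not describe them in detail, but you can print the whole data structure with Data::Dump: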
use strict;
use warnings;
use LWP::Simple;
use Data::Dump qw(dump);
# In scalar context, head() returns the HTTP::Response object on success.
my $h = head("https://www.nyaa.se/?page=download&tid=613616");
print dump $h;
do {
  my $a = bless({
    _content => "",
    _headers => bless({
      "cf-ray" => "3092ded3aec707eb-LAX",
      "client-date" => "Tue, 29 Nov 2016 03:11:05 GMT",
      "client-peer" => "104.20.74.106:443",
      "client-response-num" => 1,
      "client-ssl-cert-issuer" => "/C=GB/ST=Greater Manchester/L=Salford/O=COMODO CA Limited/CN=COMODO ECC Domain Validation Secure Server CA 2",
      "client-ssl-cert-subject" => "/OU=Domain Control Validated/OU=PositiveSSL Multi-Domain/CN=ssl366349.cloudflaressl.com",
      "client-ssl-cipher" => "ECDHE-ECDSA-AES128-GCM-SHA256",
      "client-ssl-socket-class" => "IO::Socket::SSL",
      "connection" => "close",
      "content-disposition" => "inline; filename=\"\xE6\xB5\xB7\xE8\xB4\xBC\xE7\x8E\x8B765.rar.torrent\"",
      "content-type" => "application/x-bittorrent",
      "date" => "Tue, 29 Nov 2016 03:11:07 GMT",
      "last-modified" => "Thu, 23 Oct 2014 12:11:17 GMT",
      "server" => "cloudflare-nginx",
      "set-cookie" => "__cfduid=d41adfbdcefc8d9c55b9a6c24451c6fb61480389066; expires=Wed, 29-Nov-17 03:11:06 GMT; path=/; domain=.nyaa.se; HttpOnly",
      "vary" => "Accept-Encoding",
    }, "HTTP::Headers"),
    _msg => "OK",
    _protocol => "HTTP/1.1",
    _rc => 200,
    _request => bless({
      _content => "",
      _headers => bless({ "user-agent" => "LWP::Simple/6.00 libwww-perl/6.04" }, "HTTP::Headers"),
      _method => "HEAD",
      _uri => bless(do{\(my $o = "https://www.nyaa.se/?page=download&tid=613616")}, "URI::https"),
      _uri_canonical => 'fix',
    }, "HTTP::Request"),
  }, "HTTP::Response");
  $a->{_request}{_uri_canonical} = \${$a->{_request}{_uri}};
  $a;
}
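The escaped bytes in the content-disposition value above are the filename in UTF-8. Here is a minimal sketch of pulling that name out and decoding it, using only core modules. The helper name filename_from_disposition is made up for illustration, and the regex only handles the simple quoted form this server sent, not every variant of the header. On a live response you would probably get the string from $h->header("content-disposition") rather than reaching into _headers directly; here it is pasted in as a literal so the snippet runs without a network connection.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Header value copied from the dump above. On a live response it would
# come from $h->header("content-disposition") instead of this literal.
my $cd = "inline; filename=\"\xE6\xB5\xB7\xE8\xB4\xBC\xE7\x8E\x8B765.rar.torrent\"";

# Hypothetical helper: extract the quoted filename parameter and decode it.
# Assumes the server sent raw UTF-8 bytes inside double quotes, as here.
sub filename_from_disposition {
    my ($value) = @_;
    my ($raw) = $value =~ /filename="([^"]*)"/ or return;
    return decode('UTF-8', $raw);    # bytes -> Perl character string
}

my $name = filename_from_disposition($cd);

# Encode back to UTF-8 on output so the terminal shows the characters.
binmode STDOUT, ':encoding(UTF-8)';
print "$name\n";
```

On a UTF-8 terminal this should print 海贼王765.rar.torrent. Servers that use the RFC 2047 or filename*= forms would need extra handling that this sketch leaves out.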
I think three languages are a good fit for this kind of task (web crawling): Ruby, Python, and Perl.
My recommendation is Ruby.