批处理之家's Archiver

小渣飞 posted on 2019-11-8 23:56

Python TypeError: OpenerDirector object is not callable

What's going wrong here, experts?[code]# Import the required modules

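from urllib import request # urllib's request module, for fetching URLs
import re # regular expressions (used to filter information?)
import os # needed for operating-system operations
import random # random module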

# Set up several spare browser User-Agent strings, all for desktop (PC)

agent_one='Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/14.0.835.163 Safari/535.1'#Chrome Win7
agent_two='Mozilla/5.0 (Windows NT 6.1; WOW64; rv:6.0) Gecko/20100101 Firefox/6.0'#Firefox Win7
agent_three='Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/534.50 (KHTML, like Gecko) Version/5.1 Safari/534.50'#Safari Win7
agent_four='Opera/9.80 (Windows NT 6.1; U; zh-cn) Presto/2.9.168 Version/11.50'#Opera Win7
agent_five='Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Win64; x64; Trident/5.0; .NET CLR 2.0.50727; SLCC2; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; InfoPath.3; .NET4.0C; Tablet PC 2.0; .NET4.0E)'#IE Win7+ie9
agent_six='Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; InfoPath.3)'#Win7+ie8
agent_seven='Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB7.0)'#WinXP+ie8
agent_eight='Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)'#WinXP+ie7
agent_night='Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)'#WinXP+ie6
agent_ten='Mozilla/5.0 (Windows; U; Windows NT 6.1; ) AppleWebKit/534.12 (KHTML, like Gecko) Maxthon/3.0 Safari/534.12'#Maxthon 3.0 on Win7
agent_eleven='Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; WOW64; Trident/5.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; InfoPath.3; .NET4.0C; .NET4.0E)'#Maxthon 3.1.7 on Win7+ie9, high-speed mode
agent_twelve='Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; WOW64; Trident/5.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; InfoPath.3; .NET4.0C; .NET4.0E; SE 2.X MetaSr 1.0)'#Maxthon 3.1.7 on Win7+ie9, IE-kernel compatibility mode
agent_thirteen='Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.3 (KHTML, like Gecko) Chrome/6.0.472.33 Safari/534.3 SE 2.X MetaSr 1.0'#Sogou
agent_fourteen='Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; InfoPath.3; .NET4.0C; .NET4.0E)'#Sogou 3.0 on Win7+ie9, high-speed mode
agent_fifteen='Mozilla/5.0 (Windows NT 6.1) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.41 Safari/535.1 QQBrowser/6.9.11079.201'#QQ Browser 6.9
agent_seventeen='Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; WOW64; Trident/5.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; InfoPath.3; .NET4.0C; .NET4.0E) QQBrowser/6.9.11079.201'#QQ Browser
agent_eighteen='Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)'#QQ Browser 6.9 (11079) on Win7+ie9, IE-kernel compatibility mode
agent_nineteen='Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.122 UBrowser/4.0.3214.0 Safari/537.36'#UC Browser
agent_twenty='Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Maxthon/4.4.3.4000 Chrome/30.0.1599.101 Safari/537.36'#

# Collect the User-Agent strings in a list
agent_list=[agent_one,agent_two,agent_three,
            agent_four,agent_five,agent_seven,
            agent_eight,agent_night,agent_ten,
            agent_eleven,agent_twelve,agent_thirteen,
            agent_fourteen,agent_eighteen,agent_nineteen,agent_twenty
            ]

Agent=random.choice(agent_list)

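# Build the request headers
header={'User-Agent':Agent}

# Target URL
url=r'http://www.baidu.com/'

# Build a handler object (an object dedicated to handling HTTP/HTTPS requests)
http_hander=request.HTTPHandler()

# Create a custom opener
opener=request.build_opener(http_hander)

# Create a custom Request object
req=request.Request(url,headers=header)

# Send the request through the opener and read the response
reponse=opener(req).read().decode()
# urlopen sends the request and reads the response, creating the request object automatically. Its drawback is that it cannot be customized: you cannot add your own elements to it
#reponse=request.urlopen(req).read().decode()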


pat=r'<title>(.*?)</title>'
data=re.findall(pat,reponse)

#print(type(data)) shows that the result is a list
print(data[0])[/code]Output:
Traceback (most recent call last):
  File "C:/Users/28158/.PyCharm2019.1/config/scratches/scratch.py", line 58, in <module>
    reponse=opener(req).read().decode()
TypeError: 'OpenerDirector' object is not callable

codegay posted on 2019-11-9 06:02

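Don't chain every call on a single line;
think about exception handling and what could go wrong.
If opener(req) returned None, everything after it would fail.

Also, you should use the third-party requests library instead of urllib.

And Baidu has moved the whole site to HTTPS.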
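
For reference, the TypeError in the traceback arises because an OpenerDirector cannot be called like a function; requests go through its open() method. Below is a minimal sketch of the same flow with the calls split up and basic exception handling added. The helper names build_request, extract_title and fetch_title, the 10-second timeout, and the default User-Agent value are assumptions made for illustration, not part of the original script.

```python
import re
from urllib import request, error


def build_request(url, user_agent):
    """Return a Request object carrying a custom User-Agent header."""
    return request.Request(url, headers={'User-Agent': user_agent})


def extract_title(html):
    """Return the text of the first <title> element, or None if absent."""
    match = re.search(r'<title>(.*?)</title>', html, re.S)
    return match.group(1) if match else None


def fetch_title(url, user_agent='Mozilla/5.0'):
    """Fetch a page through a custom opener and return its title, or None."""
    opener = request.build_opener(request.HTTPHandler())
    req = build_request(url, user_agent)
    try:
        # opener(req) raises TypeError; opener.open(req) sends the request.
        with opener.open(req, timeout=10) as resp:
            html = resp.read().decode('utf-8', errors='replace')
    except error.URLError as exc:
        # HTTPError is a subclass of URLError, so both are handled here.
        print('request failed:', exc)
        return None
    return extract_title(html)
```

Calling fetch_title('https://www.baidu.com/') would then return the page title, or None when the request fails or the page has no title element.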

小渣飞 posted on 2019-11-9 10:52

[i=s] Last edited by 小渣飞 on 2019-11-9 11:02 [/i]

[b]Reply to [url=http://www.bathome.net/redirect.php?goto=findpost&pid=224767&ptid=54199]2#[/url] [i]codegay[/i] [/b]


    Right, I've only been learning Python for a week. I have one more question:
var='百度知道'
var_1='r''

pat=r+var

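But this raises an error. What should I do? The result I want is pat=r'百度知道'
Here is the source:
url=r'http://www.baidu.com/s?'
req=(input('Enter the content to crawl: '))
pat=(input('Enter the keyword to search for: '))
wd={'wd':req}
# URL-encode the query parameters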
wdd=urllib.parse.urlencode(wd)
url=url+wdd
req=request.Request(url)
reponse=request.urlopen(req).read().decode()
pat=r'+''pat''

data=re.findall(pat,reponse)
print('Content searched:',req,'Keyword searched:',pat)
print(data)

codegay posted on 2019-11-9 13:21

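Quotes and brackets must come in matched pairs.
An r prefix on a string literal marks it as a raw string: backslash escapes inside it are not processed.

Rewriting your r-prefixed strings in triple-quoted form would make them a bit easier to understand and read.[code]"""This is a triple-quoted string"""
'''A triple-single-quoted string'''
[/code]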
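
To connect this to the question above: the r prefix is part of how a literal is written in source code, not a value that can be concatenated, so r+var cannot work. A string that is already held in a variable (for example one returned by input()) can be passed to re.findall directly. The sketch below is one way to do that; the helper name find_keyword and the sample HTML are hypothetical, and the use of re.escape assumes the keyword should be matched literally rather than as a regular expression.

```python
import re


def find_keyword(keyword, text):
    """Return every literal occurrence of keyword in text."""
    # A string stored in a variable needs no r prefix; the prefix only
    # changes how backslashes in a source-code literal are read.
    # re.escape() makes regex metacharacters in the keyword match literally.
    pattern = re.escape(keyword)
    return re.findall(pattern, text)


# Hypothetical sample text standing in for a downloaded page.
sample = '<title>百度知道</title><p>百度知道 search</p>'

print(find_keyword('百度知道', sample))   # ['百度知道', '百度知道']
print(find_keyword('a.b', 'a.b axb'))     # ['a.b']
```

Without re.escape, the keyword 'a.b' would also match 'axb', because an unescaped dot matches any character.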

小渣飞 posted on 2019-11-9 15:25

[b]Reply to [url=http://www.bathome.net/redirect.php?goto=findpost&pid=224779&ptid=54199]4#[/url] [i]codegay[/i] [/b]


    Got it, thank you.
