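My Python crawler fails while fetching pages with "A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond". How do I fix this?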


craw 59 : http://www.sz.gov.cn/cn/xxgk/zfxxgj/tzgg/201701/t20170106_5866245.htm
craw 60 : http://www.sz.gov.cn/cn/xxgk/zfxxgj/tzgg/201701/t20170106_5866315.htm
Traceback (most recent call last):
  File "C:\Users\向晓宇\AppData\Local\Programs\Python\Python35\lib\site-packages\requests\packages\urllib3\connection.py", line 138, in _new_conn
    (self.host, self.port), self.timeout, **extra_kw)
  File "C:\Users\向晓宇\AppData\Local\Programs\Python\Python35\lib\site-packages\requests\packages\urllib3\util\connection.py", line 98, in create_connection
    raise err
  File "C:\Users\向晓宇\AppData\Local\Programs\Python\Python35\lib\site-packages\requests\packages\urllib3\util\connection.py", line 88, in create_connection
    sock.connect(sa)
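TimeoutError: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond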

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\向晓宇\AppData\Local\Programs\Python\Python35\lib\site-packages\requests\packages\urllib3\connectionpool.py", line 594, in urlopen
    chunked=chunked)
  File "C:\Users\向晓宇\AppData\Local\Programs\Python\Python35\lib\site-packages\requests\packages\urllib3\connectionpool.py", line 361, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "C:\Users\向晓宇\AppData\Local\Programs\Python\Python35\lib\http\client.py", line 1106, in request
    self._send_request(method, url, body, headers)
  File "C:\Users\向晓宇\AppData\Local\Programs\Python\Python35\lib\http\client.py", line 1151, in _send_request
    self.endheaders(body)
  File "C:\Users\向晓宇\AppData\Local\Programs\Python\Python35\lib\http\client.py", line 1102, in endheaders
    self._send_output(message_body)
  File "C:\Users\向晓宇\AppData\Local\Programs\Python\Python35\lib\http\client.py", line 934, in _send_output
    self.send(msg)
  File "C:\Users\向晓宇\AppData\Local\Programs\Python\Python35\lib\http\client.py", line 877, in send
    self.connect()
  File "C:\Users\向晓宇\AppData\Local\Programs\Python\Python35\lib\site-packages\requests\packages\urllib3\connection.py", line 163, in connect
    conn = self._new_conn()
  File "C:\Users\向晓宇\AppData\Local\Programs\Python\Python35\lib\site-packages\requests\packages\urllib3\connection.py", line 147, in _new_conn
    self, "Failed to establish a new connection: %s" % e)
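requests.packages.urllib3.exceptions.NewConnectionError: <requests.packages.urllib3.connection.HTTPConnection object at 0x000001E7FA96F550>: Failed to establish a new connection: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond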

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\向晓宇\AppData\Local\Programs\Python\Python35\lib\site-packages\requests\adapters.py", line 423, in send
    timeout=timeout
  File "C:\Users\向晓宇\AppData\Local\Programs\Python\Python35\lib\site-packages\requests\packages\urllib3\connectionpool.py", line 643, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "C:\Users\向晓宇\AppData\Local\Programs\Python\Python35\lib\site-packages\requests\packages\urllib3\util\retry.py", line 363, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
requests.packages.urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='www.sz.gov.cn', port=80): Max retries exceeded with url: /cn/xxgk/zfxxgj/tzgg/201701/t20170106_5869715.htm (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x000001E7FA96F550>: Failed to establish a new connection: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond',))

requests.get takes a timeout parameter (in seconds):

 import requests
 a = requests.get("http://www.baidu.com", timeout=500)
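A timeout alone only limits how long the call waits. The request can still raise, and an unhandled exception like the one in the traceback stops the crawl at the first unreachable page. Here is a minimal sketch of catching it so the loop can move on. The function name `fetch`, the `(5, 30)` connect/read timeouts and the injectable `get` argument are assumptions for illustration, not part of the original code.

```python
import requests


def fetch(url, timeout=(5, 30), get=requests.get):
    """Return the page text, or None if the request fails.

    `timeout` is a (connect, read) pair in seconds; the values are illustrative.
    `get` defaults to requests.get and exists only so a stub can be swapped in.
    """
    try:
        resp = get(url, timeout=timeout)
        resp.raise_for_status()  # treat HTTP 4xx/5xx responses as failures too
        return resp.text
    except requests.exceptions.RequestException as exc:
        # ConnectionError, ConnectTimeout and ReadTimeout all derive from this class.
        print("failed:", url, exc)
        return None
```

In the crawl loop you would call `fetch(url)` and skip the page when it returns None instead of letting the exception end the run.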

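Alternatively, switch to the Scrapy framework for the crawler. It can retry a failed fetch 3 times, and if the page still fails you can write it to a log of your own and deal with it later.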
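The same retry-then-log idea can be sketched with plain requests rather than Scrapy (in Scrapy itself, the retry part is handled by the built-in retry middleware and its RETRY_TIMES setting). Everything named here is an assumption for illustration: `fetch_with_retry`, `crawl`, the file name `failed_urls.txt`, the 2-second pause and the injectable `get` argument.

```python
import logging
import time

import requests

log = logging.getLogger("crawler")


def fetch_with_retry(url, attempts=3, delay=2.0, timeout=10, get=requests.get):
    """Try a URL up to `attempts` times; return its text, or None if all fail."""
    for attempt in range(1, attempts + 1):
        try:
            resp = get(url, timeout=timeout)
            resp.raise_for_status()
            return resp.text
        except requests.exceptions.RequestException as exc:
            log.warning("attempt %d/%d failed for %s: %s",
                        attempt, attempts, url, exc)
            if attempt < attempts:
                time.sleep(delay)  # brief pause before the next attempt
    return None


def crawl(urls, failed_path="failed_urls.txt", **kwargs):
    """Fetch every URL; append those that never succeeded to `failed_path`."""
    pages = {}
    for url in urls:
        text = fetch_with_retry(url, **kwargs)
        if text is None:
            # Record the URL so it can be retried or inspected later.
            with open(failed_path, "a", encoding="utf-8") as fh:
                fh.write(url + "\n")
        else:
            pages[url] = text
    return pages
```

After a run, `failed_urls.txt` holds one URL per line for every page that failed all three attempts, so they can be fed back into `crawl` once the site is reachable again.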


