【Problem】
While running Python code that uses urllib2 to access the network, the following error occasionally appears:
urllib2.URLError: <urlopen error [Errno 10060] >
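For reference, this is roughly the kind of call that raises the error; a minimal sketch in the same Python 2 / urllib2 style (the URL here is just a placeholder, not the one from the original script):

import urllib2

demoUrl = "http://www.example.com";  # placeholder URL, for illustration only
try:
    resp = urllib2.urlopen(demoUrl, timeout=10);
    respHtml = resp.read();
except urllib2.URLError, reason:
    # on an unstable network this can be: <urlopen error [Errno 10060] >
    print "access %s failed: %s" % (demoUrl, reason);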
【Solution Process】
1. It turned out that the program itself was fine; the error was caused by occasional network instability.
2. So the idea was: when this kind of network error occurs, simply retry a few times, which should be enough to work around it.
So the original call:
#itemRespHtml = crifanLib.getUrlRespHtml(itemLink);
was changed to:
itemRespHtml = crifanLib.getUrlRespHtml_multiTry(itemLink);
The corresponding code in crifanLib is shown below (note that this is only an excerpt: the full module also imports urllib, urllib2, zlib, and logging, and defines the gConst / gVal globals used here):
#------------------------------------------------------------------------------
def getUrlResponse(url, postDict={}, headerDict={}, timeout=0, useGzip=False, postDataDelimiter="&") :
    """Get response from url, support optional postDict, headerDict, timeout, useGzip

    Note:
    1. if postDict is not null, the url request automatically becomes a POST instead of the default GET
    2. if you want cookies handled automatically, call initAutoHandleCookies() before using this function,
       then the following urllib2.Request will handle cookies automatically
    """
    # make sure url is str, not unicode, otherwise urllib2.urlopen will error
    url = str(url);

    if (postDict) :
        if(postDataDelimiter == "&"):
            postData = urllib.urlencode(postDict);
        else:
            postData = "";
            for eachKey in postDict.keys() :
                postData += str(eachKey) + "=" + str(postDict[eachKey]) + postDataDelimiter;
            postData = postData.strip();
        logging.info("postData=%s", postData);
        req = urllib2.Request(url, postData);
        logging.info("req=%s", req);
        req.add_header('Content-Type', "application/x-www-form-urlencoded");
    else :
        req = urllib2.Request(url);

    defHeaderDict = {
        'User-Agent'    : gConst['UserAgent'],
        'Cache-Control' : 'no-cache',
        'Accept'        : '*/*',
        'Connection'    : 'Keep-Alive',
    };

    # add default headers first
    for eachDefHd in defHeaderDict.keys() :
        #print "add default header: %s=%s"%(eachDefHd,defHeaderDict[eachDefHd]);
        req.add_header(eachDefHd, defHeaderDict[eachDefHd]);

    if(useGzip) :
        #print "use gzip for",url;
        req.add_header('Accept-Encoding', 'gzip, deflate');

    # add customized headers later -> allows overwriting the default headers
    if(headerDict) :
        #print "added header:",headerDict;
        for key in headerDict.keys() :
            req.add_header(key, headerDict[key]);

    if(timeout > 0) :
        # set timeout value if necessary
        resp = urllib2.urlopen(req, timeout=timeout);
    else :
        resp = urllib2.urlopen(req);

    # update cookies into local file
    if(gVal['cookieUseFile']):
        gVal['cj'].save();
        logging.info("gVal['cj']=%s", gVal['cj']);

    return resp;
#------------------------------------------------------------------------------
# get response html (== body) from url
#def getUrlRespHtml(url, postDict={}, headerDict={}, timeout=0, useGzip=False) :
def getUrlRespHtml(url, postDict={}, headerDict={}, timeout=0, useGzip=True, postDataDelimiter="&") :
    resp = getUrlResponse(url, postDict, headerDict, timeout, useGzip, postDataDelimiter);
    respHtml = resp.read();

    if(useGzip) :
        #print "---before unzip, len(respHtml)=",len(respHtml);
        respInfo = resp.info();

        # Server: nginx/1.0.8
        # Date: Sun, 08 Apr 2012 12:30:35 GMT
        # Content-Type: text/html
        # Transfer-Encoding: chunked
        # Connection: close
        # Vary: Accept-Encoding
        # ...
        # Content-Encoding: gzip

        # sometimes the request asks for gzip,deflate, but what is actually returned is un-gzipped html
        # -> the response info then does not include the above "Content-Encoding: gzip"
        # eg: http://blog.sina.com.cn/s/comment_730793bf010144j7_3.html
        # -> so only decompress here when the data is indeed gzipped
        if( ("Content-Encoding" in respInfo) and (respInfo['Content-Encoding'] == "gzip") ) :
            respHtml = zlib.decompress(respHtml, 16+zlib.MAX_WBITS);
            #print "+++ after unzip, len(respHtml)=",len(respHtml);

    return respHtml;
def getUrlRespHtml_multiTry(url, postDict={}, headerDict={}, timeout=0, useGzip=True, postDataDelimiter="&", maxTryNum=5):
    """
    get url response html, multiple-try version:
        if it fails, retry
    """
    respHtml = "";

    # access url
    # multiple retries, for occasional (mostly network) errors
    for tries in range(maxTryNum) :
        try :
            respHtml = getUrlRespHtml(url, postDict, headerDict, timeout, useGzip, postDataDelimiter);
            #logging.debug("Successfully accessed url %s", url);
            break; # succeeded, so break now
        except :
            if tries < (maxTryNum - 1) :
                #logging.warning("Access url %s failed, doing retry %d", url, (tries + 1));
                continue;
            else : # last try also failed, so give up
                logging.error("Has tried %d times to access url %s, all failed!", maxTryNum, url);
                break;

    return respHtml;
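With the crifanLib functions above, the call site in the scraping script then looks like the following (itemLink is the variable from the original script; the explicit maxTryNum value is just an example):

# retry up to 10 times before giving up on this item
itemRespHtml = crifanLib.getUrlRespHtml_multiTry(itemLink, maxTryNum=10);
if not itemRespHtml:
    logging.error("still failed to get html for %s", itemLink);

Since getUrlRespHtml_multiTry returns an empty string when all retries fail, the caller can check for that and decide whether to skip the item or abort.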
【Summary】
When network access fails with:
urllib2.URLError: <urlopen error [Errno 10060] >
just retry a few times and the request will normally go through.
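If you do not want to depend on crifanLib, the same retry idea can be written as a small standalone helper; the following is only a minimal sketch (the function name, retry count, and URL are mine, not from the original post):

import logging
import urllib2

def openUrlWithRetry(url, maxTryNum=5, timeout=10):
    """Open url with urllib2, retrying on URLError (e.g. [Errno 10060])."""
    for tries in range(maxTryNum):
        try:
            return urllib2.urlopen(url, timeout=timeout).read();
        except urllib2.URLError:
            if tries < (maxTryNum - 1):
                continue;  # transient network error -> retry
            logging.error("tried %d times to access %s, all failed", maxTryNum, url);
    return "";

#html = openUrlWithRetry("http://www.example.com");  # placeholder URL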