
[Solved] Python error when encoding a unicode string: UnicodeEncodeError: 'gbk' codec can't encode character u'\xe6' in position 0: illegal multibyte sequence

Python crifan

While working on:

[Solved] Decoding user info with Python after WeChat authorized login

I used this code:

app.logger.debug('type(province)=%s', type(province))
provinceEncodedUtf8 = province.encode('utf-8')
provinceEncodedGbk = province.encode('gbk')
app.logger.debug('provinceEncodedUtf8=%s, provinceEncodedGbk=%s', provinceEncodedUtf8, provinceEncodedGbk)

to encode the unicode string, but it failed with an error:

DEBUG in sipevents [/usr/share/nginx/html/SIPEvents/sipevents.py:102]:
type(province)=<type 'unicode'>

[2016-08-21 15:57:34 +0000] [25083] [ERROR] Error handling request /?code=021mvzWb1xwSEp0VUdZb18JyWb1mvzWa&state=
Traceback (most recent call last):
  File "/root/Envs/SIPEvents/lib/python2.7/site-packages/gunicorn/workers/sync.py", line 135, in handle
    self.handle_request(listener, req, client, addr)
  File "/root/Envs/SIPEvents/lib/python2.7/site-packages/gunicorn/workers/sync.py", line 176, in handle_request
    respiter = self.wsgi(environ, resp.start_response)
  File "/root/Envs/SIPEvents/lib/python2.7/site-packages/flask/app.py", line 2000, in __call__
    return self.wsgi_app(environ, start_response)
  File "/root/Envs/SIPEvents/lib/python2.7/site-packages/flask/app.py", line 1991, in wsgi_app
    response = self.make_response(self.handle_exception(e))
  File "/root/Envs/SIPEvents/lib/python2.7/site-packages/flask/app.py", line 1567, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/root/Envs/SIPEvents/lib/python2.7/site-packages/flask/app.py", line 1988, in wsgi_app
    response = self.full_dispatch_request()
  File "/root/Envs/SIPEvents/lib/python2.7/site-packages/flask/app.py", line 1641, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/root/Envs/SIPEvents/lib/python2.7/site-packages/flask/app.py", line 1544, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/root/Envs/SIPEvents/lib/python2.7/site-packages/flask/app.py", line 1639, in full_dispatch_request
    rv = self.dispatch_request()
  File "/root/Envs/SIPEvents/lib/python2.7/site-packages/flask/app.py", line 1625, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/usr/share/nginx/html/SIPEvents/sipevents.py", line 104, in index
    provinceEncodedGbk = province.encode('gbk')
UnicodeEncodeError: 'gbk' codec can't encode character u'\xe6' in position 0: illegal multibyte sequence
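
A small sketch of what seems to be going on here. It assumes the unicode string is really UTF-8 bytes that were decoded as ISO-8859-1 somewhere along the way, so its first character is u'\xe6' ("æ"), which GBK has no code for. It is written in Python 3 syntax (the server above runs Python 2.7), and the sample text "江苏" is only an illustration:

```python
# Simulate the mojibake: UTF-8 bytes of "江苏" wrongly decoded as ISO-8859-1.
garbled = "江苏".encode("utf-8").decode("iso-8859-1")

# The first byte of "江" in UTF-8 is 0xE6, so the first character is U+00E6.
first_char = garbled[0]

# GBK cannot represent U+00E6, so encoding fails at position 0.
try:
    garbled.encode("gbk")
    gbk_error = None
except UnicodeEncodeError as exc:
    gbk_error = exc

# UTF-8 can encode every code point, so this call does not raise.
# The result is still the wrong text, just encoded without an error.
utf8_bytes = garbled.encode("utf-8")

print(repr(first_char), gbk_error)
```

So the gbk error is only a symptom: the string was already wrong before `encode('gbk')` was called.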

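Which character encoding supports IPA phonetic symbols? How do I print unicode phonetic symbols to the screen? – python bar – Baidu Tieba

WeChat authorization get user info garbled text

python – Garbled Chinese after getting basic user info via WeChat web page authorization – SegmentFault

-> Set the encoding to UTF-8 before calling the API?

WeChat Official Account Platform (3): getting basic user info via web page authorization – 雪飘七月 – 51CTO Tech Blog

Nickname is garbled when getting user info in WeChat development, how to fix it? – Zhihu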

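Let me try this:

The approach is to first encode the string returned by the API into an ISO-8859-1 byte stream, then decode those bytes as UTF-8 to get the correct string.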
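
A minimal sketch of that round trip, in Python 3 syntax (in Python 2.7 the input would be a `unicode` object). The helper name `fix_mojibake` is my own, not something from wechat-sdk:

```python
def fix_mojibake(text):
    """Undo a wrong ISO-8859-1 decode of UTF-8 bytes.

    Returns the input unchanged when the round trip does not apply,
    for example when the text is already correct Chinese.
    """
    try:
        # ISO-8859-1 maps code points 0-255 back to the original bytes 1:1.
        raw_bytes = text.encode("iso-8859-1")
        # Those bytes were UTF-8 all along, so decode them properly.
        return raw_bytes.decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


# Simulated garbled value, built the same way the bug produces it.
garbled_city = "苏州".encode("utf-8").decode("iso-8859-1")
print(fix_mojibake(garbled_city))
```

The try/except is there so that a string which is already correct is passed through instead of raising.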

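Java: fixing the garbled WeChat nickname when getting user info after WeChat authorization – qq727013465's column – Blog channel – CSDN.NET

Garbled text when getting WeChat user info – 布布扣 – bubuko.com

Garbled text problem with [WeChat login]! – CSDN Forum – CSDN.NET – China's largest IT tech community

I tried it, and it does work:

Code: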

province = respUserInfoDict['province']
city = respUserInfoDict['city']
country = respUserInfoDict['country']
nickname = respUserInfoDict['nickname']
app.logger.debug('province=%s, city=%s, country=%s, nickname=%s', province, city, country, nickname)
app.logger.debug('type(province)=%s', type(province))
# provinceEncodedUtf8 = province.encode('utf-8')
# app.logger.debug('provinceEncodedUtf8=%s', provinceEncodedUtf8)
# provinceEncodedGbk = province.encode('gbk')
# app.logger.debug('provinceEncodedGbk=%s', provinceEncodedGbk)
encodedProvinceIso8859_1 = province.encode('ISO-8859-1')
app.logger.debug('encodedProvinceIso8859_1=%s', encodedProvinceIso8859_1)
decodedProvinceUnicode = encodedProvinceIso8859_1.decode('utf-8')
app.logger.debug('decodedProvinceUnicode=%s', decodedProvinceUnicode)

Output:

DEBUG in sipevents [/usr/share/nginx/html/SIPEvents/sipevents.py:101]:
province=æ±è, city=èå·, country=中å½, nickname=礼è²

DEBUG in sipevents [/usr/share/nginx/html/SIPEvents/sipevents.py:102]:
type(province)=<type 'unicode'>

DEBUG in sipevents [/usr/share/nginx/html/SIPEvents/sipevents.py:109]:
encodedProvinceIso8859_1=江苏

DEBUG in sipevents [/usr/share/nginx/html/SIPEvents/sipevents.py:111]:
decodedProvinceUnicode=江苏

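The next step is:

Find a way to convert the whole JSON dict, respUserInfoDict, in one pass:

first encode each string as ISO-8859-1,

then decode the bytes as UTF-8 to get the correct unicode strings.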
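
One possible way to do that in one pass is a small recursive helper. This is only a sketch in Python 3 syntax (under Python 2.7 the `str` check would be `unicode`); the function names and the sample dict are made up for illustration, and the `sex` field is just a stand-in for a non-string value:

```python
def fix_mojibake_deep(value):
    """Recursively repair wrongly decoded strings inside dicts and lists."""
    if isinstance(value, str):
        try:
            # Encode as ISO-8859-1 to recover the bytes, then decode as UTF-8.
            return value.encode("iso-8859-1").decode("utf-8")
        except (UnicodeEncodeError, UnicodeDecodeError):
            # Not mojibake of this kind, keep the string as it is.
            return value
    if isinstance(value, dict):
        return {fix_mojibake_deep(k): fix_mojibake_deep(v) for k, v in value.items()}
    if isinstance(value, list):
        return [fix_mojibake_deep(item) for item in value]
    # Numbers, booleans and None are returned untouched.
    return value


def garble(text):
    """Simulate the bug: UTF-8 bytes decoded as ISO-8859-1."""
    return text.encode("utf-8").decode("iso-8859-1")


# Hypothetical stand-in for respUserInfoDict.
sample_user_info = {
    "province": garble("江苏"),
    "city": garble("苏州"),
    "country": garble("中国"),
    "sex": 1,
}
fixed_user_info = fix_mojibake_deep(sample_user_info)
print(fixed_user_info)
```

One caveat: a genuine Latin-1 string that happens to be valid UTF-8 after the round trip would also be rewritten, so this is a workaround rather than a real fix.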

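Actually, a better approach would be:

See whether the default HTTP GET request here can be told to use UTF-8 as its character encoding,

so that UTF-8 is used by default for both encoding and decoding.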

Internally, the wechat-sdk sends its HTTP GET request through this request method:

def request(self, method, url, body=None, headers={}):
    """Send a complete request to the server."""
    self._send_request(method, url, body, headers)

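So see:

[Solved] How to set the character encoding to UTF-8 in the request library used for HTTP GET requests in Python

[Further thoughts]

In fact, the response headers of WeChat's user info API do not include a charset=utf-8 marker.

-> That is understandable, but it is still sloppy and not a rigorous way to do things.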
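
For what it's worth, here is a sketch of how a client could cope with the missing charset: read the charset from the Content-Type header when there is one, and fall back to UTF-8 instead of ISO-8859-1 when there is none. This uses only the Python 3 standard library and is not how wechat-sdk actually works; the function names and the "text/plain" header value are assumptions for the example:

```python
import json
from email.message import Message


def charset_of(content_type, default="utf-8"):
    """Return the charset named in a Content-Type value, or the default."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset returns the fallback when no charset parameter exists.
    return msg.get_content_charset(default)


def parse_json_body(raw_body, content_type):
    """Decode the raw response bytes with the right charset, then parse JSON."""
    return json.loads(raw_body.decode(charset_of(content_type)))


# Simulated response body: UTF-8 encoded JSON with no charset in the header.
raw_body = json.dumps({"province": "江苏"}, ensure_ascii=False).encode("utf-8")
user_info = parse_json_body(raw_body, "text/plain")
print(user_info["province"])
```

Decoding the raw bytes yourself like this avoids the garbled strings in the first place, so no ISO-8859-1 round trip is needed afterwards.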
