Solutions for 'latin-1' codec can't encode character - huaius - ChinaUnix Blog

Solutions for 'latin-1' codec can't encode character

Category: Python/Ruby

2013-11-09 19:02:55

While parsing a string and writing the result to the database, I hit the following error:
'latin-1' codec can't encode character u'\u017e' in position 11: ordinal not in range(256)

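After some investigation, I found the cause: the database encoding and the encoding of the source data do not match, and the data contains characters that the target encoding (latin-1 here) cannot represent.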
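
To see which characters trigger the error before touching the database, a small check like the one below may help. It is a minimal sketch in Python 3 (the transcript in this post is Python 2); the helper name `unencodable_chars` and the sample string are made up for illustration.

```python
def unencodable_chars(text, encoding="latin-1"):
    """Return (position, character) pairs that `encoding` cannot represent."""
    problems = []
    for position, char in enumerate(text):
        try:
            char.encode(encoding)
        except UnicodeEncodeError:
            # This character has no representation in the target encoding.
            problems.append((position, char))
    return problems


# Hypothetical input: U+017E sits at index 11, U+2013 at index 17.
sample = "data from: \u017e and \u2013"
for position, char in unencodable_chars(sample):
    print(position, "U+%04X" % ord(char))
```

Any character reported here would make a plain `encode('latin-1')` fail, so it has to be either replaced beforehand or stored through a connection that uses a wider encoding.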

There are two ways to fix this: preprocess the string before writing it, or change the database connection settings.

1. Process the string

>>> u = u'hello\u2013world'
>>> u.encode('latin-1', 'replace') # replace it with a question mark
'hello?world'
>>> u.encode('latin-1', 'ignore') # ignore it
'helloworld'
Or substitute the character with whatever suits your needs:
>>> u.replace(u'\u2013', '-').encode('latin-1')
'hello-world'
If you aren't required to output Latin-1, then UTF-8 is a common and preferred choice. It is recommended by the W3C and nicely encodes all Unicode code points:
>>> u.encode('utf-8')
'hello\xe2\x80\x93world'
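
The session above is Python 2. A rough Python 3 equivalent of the "substitute, then encode" approach might look like the following sketch. The `ASCII_FALLBACKS` table and the `to_latin1` helper are hypothetical names, and the substitutions are only examples; note that in Python 3 `encode` returns `bytes`.

```python
# Hypothetical substitution table; extend it with whatever suits your data.
ASCII_FALLBACKS = {
    "\u2013": "-",   # en dash
    "\u2014": "-",   # em dash
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark
}


def to_latin1(text, fallbacks=ASCII_FALLBACKS):
    """Apply the substitutions, then encode to Latin-1.

    Characters that still cannot be encoded become '?' ('replace' handler).
    """
    table = str.maketrans(fallbacks)
    return text.translate(table).encode("latin-1", "replace")


print(to_latin1("hello\u2013world"))   # the en dash becomes a plain hyphen
print(to_latin1("\u017e"))             # no fallback defined, so it becomes '?'
```

Substituting first keeps readable punctuation where a sensible ASCII stand-in exists, and the `'replace'` handler only catches whatever is left over.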

2. Configure the database connection

db.set_character_set('utf8')
dbc.execute('SET NAMES utf8;')
dbc.execute('SET CHARACTER SET utf8;')
dbc.execute('SET character_set_connection=utf8;')
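
The four calls above can be wrapped in one function so that every new connection gets the same treatment. This is only a sketch: it assumes `db` is an open MySQLdb connection (the post does not show how `db` and `dbc` were created), and the helper name `use_utf8` is made up.

```python
UTF8_STATEMENTS = (
    "SET NAMES utf8;",
    "SET CHARACTER SET utf8;",
    "SET character_set_connection=utf8;",
)


def use_utf8(db):
    """Switch an open MySQLdb connection to UTF-8 (hypothetical helper)."""
    # Tell the client library which encoding to use for this connection.
    db.set_character_set("utf8")
    cursor = db.cursor()
    try:
        # Tell the server how to interpret what the client sends.
        for statement in UTF8_STATEMENTS:
            cursor.execute(statement)
    finally:
        cursor.close()
```

Call `use_utf8(db)` right after connecting and before the first insert or update. `MySQLdb.connect` also accepts a `charset` argument, which may make some of these statements unnecessary. The target columns still need a character set that can store the characters in question.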
