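Category: System Operations
2017-02-09 18:44:17
Using Python's csv module, including reader, writer, DictReader, DictWriter, and register_dialect
I have always liked Python's csv module: it is simple and easy to use, and I use it often in my projects. Here are a few examples.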
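Parameters of csv.reader:
csvfile
Must be an object that supports iteration (an iterator) whose next method returns a string on each call. Ordinary file objects and list objects both qualify. A file object must be opened with the "b" flag.
dialect
The formatting style. The default is excel, which separates fields with commas (,). The csv module also ships an excel-tab style, which separates fields with tabs. Any other style has to be defined yourself and registered with register_dialect; list_dialects returns the names of all registered styles.
fmtparam
Formatting parameters that override individual settings of the chosen dialect.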
Example:
import csv
reader = csv.reader(open('your.csv', 'rb'))
for line in reader:
    print line
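To make the three parameters more concrete, here is a minimal sketch. It is written for Python 3 (the examples in this post are Python 2), the sample lines are made up, and the dialect name "pipes" is just a name I picked for illustration.

```python
import csv

# Any iterable that yields strings works as "csvfile", including a list.
lines = ["name\tage", "Ann\t30"]

# list_dialects() reports every registered dialect name.
builtin_names = csv.list_dialects()

# Read with the built-in excel-tab dialect.
tab_rows = list(csv.reader(lines, dialect="excel-tab"))

# Register a custom dialect ("pipes" is a made-up name for this sketch).
csv.register_dialect("pipes", delimiter="|")
pipe_rows = list(csv.reader(["a|b|c"], dialect="pipes"))

# A fmtparam keyword overrides one setting of the chosen dialect:
# here excel-tab's tab delimiter is replaced by a semicolon.
override_rows = list(csv.reader(["x;y"], dialect="excel-tab", delimiter=";"))

print(tab_rows, pipe_rows, override_rows)
```

The override only applies to that one reader; the excel-tab dialect itself stays unchanged.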
Parameters of csv.writer (omitted: same as reader, see above)
Example:
import csv
writer = csv.writer(open('your.csv', 'wb'))
writer.writerow(['Column1', 'Column2', 'Column3'])
lines = [range(3) for i in range(5)]
for line in lines:
    writer.writerow(line)
1. Writing and generating a CSV file
Code:
# coding: utf-8
import csv
csvfile = open('csv_test.csv', 'wb')
writer = csv.writer(csvfile)
# header row: name, age, phone
writer.writerow(['姓名', '年龄', '电话'])
data = [
    ('小河', '25', '1234567'),
    ('小芳', '18', '789456')
]
writer.writerows(data)
csvfile.close()
In 'wb', w means write mode and b means binary mode.
Use writerow to write a single row.
Use writerows to write several rows at once.
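The 'wb' mode above is the Python 2 way. As a rough sketch of the same write in Python 3, the file is opened in text mode with newline="" and an explicit encoding instead. The temporary directory is only an assumption so that the sketch runs anywhere.

```python
import csv
import os
import tempfile

# Hypothetical output location for this sketch.
path = os.path.join(tempfile.mkdtemp(), "csv_test.csv")

header = ["姓名", "年龄", "电话"]  # name, age, phone
data = [("小河", "25", "1234567"), ("小芳", "18", "789456")]

# Python 3: text mode with newline="" instead of Python 2's "wb".
with open(path, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(header)   # one row
    writer.writerows(data)    # several rows

# Read the file back to check what was written.
with open(path, newline="", encoding="utf-8") as f:
    rows = list(csv.reader(f))

print(rows)
```

Using with also closes the file automatically, so the explicit close() call is not needed.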
2. Reading a CSV file
Code:
# coding: utf-8
import csv
csvfile = open('csv_test.csv', 'rb')
reader = csv.reader(csvfile)
for line in reader:
    print line
csvfile.close()
Output:
root@he-desktop:~/python/example# python read_csv.py
['\xe5\xa7\x93\xe5\x90\x8d', '\xe5\xb9\xb4\xe9\xbe\x84', '\xe7\x94\xb5\xe8\xaf\x9d']
['\xe5\xb0\x8f\xe6\xb2\xb3', '25', '1234567']
['\xe5\xb0\x8f\xe8\x8a\xb3', '18', '789456']
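The escaped bytes in this output are just the UTF-8 encoding of the Chinese text, shown that way because Python 2 prints the repr of each byte string in the list. A small Python 3 sketch, using the first output row copied from above, shows that decoding recovers the header:

```python
# The first row exactly as printed in the output above, as bytes.
raw_row = [
    b"\xe5\xa7\x93\xe5\x90\x8d",
    b"\xe5\xb9\xb4\xe9\xbe\x84",
    b"\xe7\x94\xb5\xe8\xaf\x9d",
]

# Decoding each field as UTF-8 recovers the original header text.
decoded_row = [field.decode("utf-8") for field in raw_row]
print(decoded_row)
```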
Reading and writing CSV files in Python
csv module functions
csv.reader
import csv
with open('temp.csv','rb') as f:
    reader = csv.reader(f)
    for row in reader:
        print row
csv.writer
import csv
with open('temp.csv','wb') as f:
    writer = csv.writer(f)
    writer.writerow(['a','b','c'])
    writer.writerow(['d','e','f'])
csv module classes
csv.DictReader
import csv
with open('temp.csv') as f:
    reader = csv.DictReader(f)
    for row in reader:
        print(row['first_name'], row['last_name'])
csv.DictWriter
import csv
with open('temp.csv','w') as f:
    fieldnames = ['first_name','last_name']
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerow({'first_name':'ryan', 'last_name':'xu'})
    writer.writerow({'first_name':'koko', 'last_name':'xu'})
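The two classes pair up naturally: what DictWriter writes, DictReader can read back using the header row as keys. Here is a minimal round trip in Python 3, with io.StringIO standing in for temp.csv (an assumption so that nothing touches the disk):

```python
import csv
import io

fieldnames = ["first_name", "last_name"]
buffer = io.StringIO()  # in-memory stand-in for temp.csv

writer = csv.DictWriter(buffer, fieldnames=fieldnames)
writer.writeheader()
writer.writerow({"first_name": "ryan", "last_name": "xu"})
writer.writerow({"first_name": "koko", "last_name": "xu"})

buffer.seek(0)
reader = csv.DictReader(buffer)   # field names come from the header row
names = [(row["first_name"], row["last_name"]) for row in reader]
print(names)
```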
csv module exceptions
csv.Error
import csv, sys
filename = 'some.csv'
with open(filename, 'rb') as f:
    reader = csv.reader(f)
    try:
        for row in reader:
            print row
    except csv.Error as e:
        sys.exit('file %s, line %d: %s' % (filename, reader.line_num, e))
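To actually see csv.Error fire, the input has to be malformed in a way the parser rejects. The sketch below (Python 3) assumes strict=True, which makes the reader refuse a stray character after a closing quote. The helper name read_rows and the sample lines are made up.

```python
import csv

def read_rows(lines):
    """Return (rows, error_message); error_message is None on success."""
    reader = csv.reader(lines, strict=True)  # strict=True rejects malformed quoting
    rows = []
    try:
        for row in reader:
            rows.append(row)
    except csv.Error as e:
        return rows, "line %d: %s" % (reader.line_num, e)
    return rows, None

good_rows, good_error = read_rows(["a,b", "c,d"])
# A stray character after a closing quote is bad CSV under strict mode.
bad_rows, bad_error = read_rows(["a,b", 'c,"d"x'])
print(good_error, bad_error)
```

Rows parsed before the bad line are still returned, and line_num points at the line that failed.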
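Public methods and attributes of reader objects (DictReader instances and objects returned by reader())
csvreader.next()
csvreader.line_num
csvreader.fieldnames (DictReader only)
Public methods of writer objects (DictWriter instances and objects returned by writer())
csvwriter.writerow(row)
csvwriter.writerows(rows)
csvwriter.writeheader() (DictWriter only)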
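A short sketch of the reader-side members in Python 3, where csvreader.next() is spelled next(reader). The sample lines are made up.

```python
import csv

lines = ["first_name,last_name", "ryan,xu", "koko,xu"]

# Plain reader: next() pulls one row and line_num counts lines read so far.
reader = csv.reader(lines)
header = next(reader)          # Python 3 spelling of csvreader.next()
lines_after_header = reader.line_num

# DictReader: fieldnames is taken from the first row.
dict_reader = csv.DictReader(lines)
fields = dict_reader.fieldnames
first = next(dict_reader)
print(header, lines_after_header, fields, first["first_name"])
```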
Reading Excel data into MySQL with Python
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import sys
import time
import datetime
import MySQLdb
import xlrd
# Python 2 only: make utf-8 the default string encoding
reload(sys)
sys.setdefaultencoding("utf-8")
# date helpers (computed here but not used further down)
today = datetime.date.today()
oneday = datetime.timedelta(days=1)
to_yes = today - oneday
yesterday = to_yes.strftime('%Y%m%d')
currentDate = time.strftime('%Y%m%d', time.localtime())
# note: xlrd 2.0 and later only reads .xls; an .xlsx file needs an older xlrd
xlsfile = r'C:\Users\cherry\Desktop\defriend_0\aaaaa.xlsx'
book = xlrd.open_workbook(xlsfile)
# number of sheets in the workbook
count = len(book.sheets())
print count
conn = MySQLdb.connect(host='192.168.10.70', user='dlan', passwd='root123', db='yy_access', charset="utf8")
conn.set_character_set('utf8')
cursor = conn.cursor()
cursor.execute('SET NAMES utf8;')
cursor.execute('SET CHARACTER SET utf8;')
cursor.execute('SET character_set_connection=utf8;')
starttime = datetime.datetime.now()
print 'Start time: %s' % (starttime)
query = """insert into yy_access.ca_user_phone_score(phone_number,score,notic)values(%s,%s,%s)"""
# loop over every sheet
for i in range(0, count):
    print i
    sheet = book.sheet_by_index(i)
    print sheet
    # loop over every row, skipping the header row
    for r in range(1, sheet.nrows):
        phone_number = sheet.cell(r, 0).value
        score = sheet.cell(r, 1).value
        notic = sheet.cell(r, 2).value
        values = (phone_number, score, notic)
        print values, query
        cursor.execute(query, values)
cursor.close()
conn.commit()
conn.close()
endtime = datetime.datetime.now()
print 'End time: %s' % (endtime)
print 'Elapsed: %s seconds' % (endtime - starttime).total_seconds()
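The script above sends one INSERT per row. A possible variation is to collect the rows and hand them to executemany in one call. The sketch below is not the script itself: it uses Python 3 and the standard sqlite3 module as a stand-in so it runs without a MySQL server, and the sample rows are invented. Both drivers follow the DB-API, but sqlite3 uses ? placeholders where MySQLdb uses %s.

```python
import sqlite3

# Hypothetical rows as they might come out of a sheet (header already skipped).
rows = [("13800000000", 90, "ok"), ("13900000000", 75, "check")]

conn = sqlite3.connect(":memory:")  # stand-in for the MySQL connection
conn.execute(
    "create table ca_user_phone_score (phone_number text, score real, notic text)"
)

# sqlite3 uses "?" placeholders; MySQLdb uses "%s" as in the script above.
query = "insert into ca_user_phone_score (phone_number, score, notic) values (?, ?, ?)"
with conn:                       # commits on success, rolls back on error
    conn.executemany(query, rows)

inserted = conn.execute("select count(*) from ca_user_phone_score").fetchone()[0]
print(inserted)
conn.close()
```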
The most common way to export and import CSV data:
Export:
SET NAMES "utf8";
select * from ricci_var into outfile '/tmp/var.csv' fields terminated by ',' optionally enclosed by '"' lines terminated by '\n';
Import:
SET NAMES "utf8";
load data infile "/tmp/var.csv" into table ricci_var fields terminated by ',' optionally enclosed by '"' lines terminated by '\n';
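If the file to import is produced by Python rather than by MySQL, the csv module can be set up to mirror these clauses. This is only a sketch in Python 3 with made-up rows, and it does not cover how embedded quotes or backslashes should be escaped for MySQL.

```python
import csv
import io

# Hypothetical rows; strings get quoted, numbers do not.
rows = [(1, "plain"), (2, "has,comma")]

buffer = io.StringIO()
writer = csv.writer(
    buffer,
    delimiter=",",                 # fields terminated by ','
    quotechar='"',                 # enclosed by '"'
    quoting=csv.QUOTE_NONNUMERIC,  # quote only non-numeric fields
    lineterminator="\n",           # lines terminated by '\n'
)
writer.writerows(rows)
mysql_style_csv = buffer.getvalue()
print(mysql_style_csv)
```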
In some special cases this is not possible, for example on a restrictive RDS instance, and you need to do the following instead:
Export:
/usr/local/mysql/bin/mysql -h192.168.1.10 -udlan -proot123 test -e"
SELECT * FROM manufactor_user_info where date(create_time)<='2017-05-02'" -N -s |sed 's/\t/","/g;s/^/"/;s/$/"/;s/\n//g'> /tmp/test.csv
Import:
SET NAMES "utf8";
load data infile '/tmp/test.csv' into table manufacturer_log fields terminated by ',' optionally enclosed by '"' lines terminated by '\n';
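The sed step turns tab-separated output into quoted CSV, but it does nothing about quote characters inside the data. As an alternative sketch (Python 3, not the original pipeline), the csv module can do the same conversion and double any embedded quotes. The function name tsv_to_csv and the sample text are assumptions.

```python
import csv
import io

def tsv_to_csv(tsv_text):
    """Convert tab-separated text to CSV with every field quoted."""
    # QUOTE_NONE on input: quote characters in the data are ordinary text.
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    out = io.StringIO()
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()

# Hypothetical sample in the shape that "mysql -N -s" prints.
sample = '1\tplain\n2\tsays "hi"\n'
converted = tsv_to_csv(sample)
print(converted)
```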
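Adjust the export conditions to your own needs. The exported data usually needs some simple cleaning first; otherwise you may hit an error saying that the data or the column definition in some row is wrong, for example: Wrong data or column definition. Row: 69697, field: 43.
This error mainly means the data has problems and needs cleaning. It shows up when importing from MySQL into Infobright. You can set SET @BH_REJECT_FILE_PATH = '/tmp/reject_file';
SET @BH_ABORT_ON_COUNT = 10; (the number of errors to tolerate, chosen by you)
This lets you see where the problems in the data are.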
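Exports can also run into permission problems, such as:
ERROR 1290 (HY000): The MySQL server is running with the --secure-file-priv option so it cannot execute this statement
Solution:
1. Set the secure directory: vi /etc/my.cnf
secure-file-priv=/home/your_directory/
2. Make sure you have write permission on /home/your_directory/ (this needs to be set on 5.7)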
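Another approach:
Use mysqldump to export to a SQL file
A last approach (the -X flag makes mysql write XML, despite the .csv file name):
Export: mysql -udlan -proot123 --database=test --execute='SELECT a, b FROM aaa LIMIT 0, 10000 ' -X > file.csv
Import:
SET NAMES "utf8";
load xml infile '/tmp/file.csv' into table user_info1;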
Print the release year and title, processing the file line by line:
for line in open("samples/sample.csv"):
    title, year, director = line.split(",")
    print year, title
Using the csv module instead:
import csv
reader = csv.reader(open("samples/sample.csv"))
for title, year, director in reader:
    print year, title
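The difference between the two versions shows up as soon as a field contains a comma. A quick Python 3 sketch, assuming a line in the same shape as samples/sample.csv:

```python
import csv

# One line in the shape of samples/sample.csv; the director field contains a comma.
line = 'Monty Python And The Holy Grail,1975,"Terry Gilliam, Terry Jones"'

naive_fields = line.split(",")            # splits inside the quoted field too
csv_fields = next(csv.reader([line]))     # respects the quotes
print(len(naive_fields), len(csv_fields))
```

With split the three-way unpacking in the line-by-line version would fail on this row, while the csv reader returns exactly three fields.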
Changing the delimiter
Create a subclass of csv.excel and change its delimiter to ";":
# File: csv-example-2.py
import csv
class SKV(csv.excel):
    # like excel, but uses semicolons
    delimiter = ";"
csv.register_dialect("SKV", SKV)
reader = csv.reader(open("samples/sample.skv"), "SKV")
for title, year, director in reader:
    print year, title
If you only need to change one or two parameters, you can set them directly in the reader arguments:
# File: csv-example-3.py
import csv
reader = csv.reader(open("samples/sample.skv"), delimiter=";")
for title, year, director in reader:
    print year, title
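Both approaches should parse the same input identically. A small Python 3 check, using one made-up line in the semicolon-separated layout rather than the real samples/sample.skv:

```python
import csv

class SKV(csv.excel):
    # like excel, but uses semicolons
    delimiter = ";"

csv.register_dialect("SKV", SKV)

# Hypothetical line in the semicolon-separated layout.
lines = ["Monty Python's Life Of Brian;1979;Terry Jones"]

via_dialect = list(csv.reader(lines, "SKV"))
via_keyword = list(csv.reader(lines, delimiter=";"))
print(via_dialect == via_keyword)
```

A registered dialect pays off when the same format is read in many places; the keyword form is shorter for a one-off.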
Saving data in CSV format
Use csv.writer to generate a CSV file.
# File: csv-example-4.py
import csv
import sys
data = [
    ("And Now For Something Completely Different", 1971, "Ian MacNaughton"),
    ("Monty Python And The Holy Grail", 1975, "Terry Gilliam, Terry Jones"),
    ("Monty Python's Life Of Brian", 1979, "Terry Jones"),
    ("Monty Python Live At The Hollywood Bowl", 1982, "Terry Hughes"),
    ("Monty Python's The Meaning Of Life", 1983, "Terry Jones")
]
writer = csv.writer(sys.stdout)
for item in data:
    writer.writerow(item)
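To see what the writer does with awkward fields, the sketch below (Python 3) writes two of the rows into an io.StringIO buffer instead of sys.stdout, which is my substitution so the result can be inspected.

```python
import csv
import io

data = [
    ("Monty Python And The Holy Grail", 1975, "Terry Gilliam, Terry Jones"),
    ("Monty Python's Life Of Brian", 1979, "Terry Jones"),
]

buffer = io.StringIO()   # in-memory stand-in for sys.stdout
writer = csv.writer(buffer)
for item in data:
    writer.writerow(item)

output_lines = buffer.getvalue().splitlines()
print(output_lines)
```

Only the field containing a comma gets wrapped in double quotes; the apostrophe in the second title needs no quoting.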