The fileinput module provides tools for processing one or more text files: a for..in loop over fileinput.input() reads the lines of every file in turn.
The examples below use two files.
1.txt:
1a
2a
3a
4a
2.txt:
1b
2b
DESCRIPTION
Typical use is:
    import fileinput
    for line in fileinput.input():
        process(line)
This iterates over the lines of all files listed in sys.argv[1:],
defaulting to sys.stdin if the list is empty. If a filename is '-' it
is also replaced by sys.stdin. To specify an alternative list of
filenames, pass it as the argument to input(). A single file name is
also allowed.
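[Summary] This loops over the lines of every file named in sys.argv[1:]. If that list is empty, it reads standard input instead, and a filename of '-' also means standard input. To read a different set of files, pass a list of filenames to input(); a single filename is accepted too.
[Examples]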
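(1)
#!/usr/bin/env python
import fileinput, sys
# Read every file named on the command line.
for line in fileinput.input(sys.argv[1:]):
    pass
# Cumulative number of lines read across all files.
print fileinput.lineno(),
On the command line, run: python test.py 1.txt 2.txt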
(2)
#!/usr/bin/env python
import fileinput
# Pass an explicit list of filenames instead of sys.argv[1:].
for line in fileinput.input(['1.txt', '2.txt']):
    pass
print fileinput.lineno(),
(3)
#!/usr/bin/env python
import fileinput
# A single filename is also accepted.
for line in fileinput.input("1.txt"):
    pass
print fileinput.lineno(),
Functions filename(), lineno() return the filename and cumulative line
number of the line that has just been read; filelineno() returns its
line number in the current file; isfirstline() returns true iff the
line just read is the first line of its file; isstdin() returns true
iff the line was read from sys.stdin. Function nextfile() closes the
current file so that the next iteration will read the first line from
the next file (if any); lines not read from the file will not count
towards the cumulative line count; the filename is not changed until
after the first line of the next file has been read. Function close()
closes the sequence.
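[Summary] filename() and lineno() return the name of the file being read and the cumulative number of lines read so far; filelineno() returns the line number within the current file. isfirstline() returns true if the line just read is the first line of its file, and isstdin() returns true if it came from standard input. nextfile() closes the current file so that the next iteration starts at the first line of the next file; lines skipped this way are not added to the cumulative count, and filename() keeps returning the old name until the first line of the next file has been read. close() closes the whole sequence.
[Examples]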
(1)
#!/usr/bin/env python
import fileinput
for line in fileinput.input(['1.txt']):
    pass
# Name of the last file read and the total number of lines read.
print fileinput.filename(), fileinput.lineno()
[root@newpatch3 /home/python]#python test.py
1.txt 4
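(2)
#!/usr/bin/env python
import fileinput
for line in fileinput.input(['1.txt', '2.txt']):
    pass
# filelineno() is the line number within the current (last) file.
print fileinput.filename(), ":", fileinput.filelineno()
# lineno() is the cumulative line number across all files.
print "1.txt and 2.txt total line:", fileinput.lineno()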
[root@newpatch3 /home/python]#python test.py
2.txt : 2
1.txt and 2.txt total line: 6
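Note the difference: filelineno() counts lines within the current file (2 for 2.txt), while lineno() counts lines across all files read so far (6).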
(3)
#!/usr/bin/env python
import fileinput, sys
for line in fileinput.input(['1.txt']):
    # Print only the first line of the file, then stop.
    if fileinput.isfirstline():
        print line,
        sys.exit(0)
[root@newpatch3 /home/python]#python test.py
1a
1.txt contains four lines (1a, 2a, 3a, 4a), but the isfirstline() check prints only the first one and then exits, which demonstrates isfirstline().
(4)
#!/usr/bin/env python
import fileinput
# With no arguments and no filenames on the command line, input() reads sys.stdin.
for line in fileinput.input():
    if fileinput.isstdin():
        print "isstdin"
[root@newpatch3 /home/python]#python test.py
This is stdin
isstdin
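None of the examples above exercises nextfile(). The following is a minimal sketch of it, written in Python 3 syntax (the examples above use the Python 2 print statement). It recreates the two sample files in a temporary directory, which is an assumption made only so that the sketch is self-contained, and skips the rest of each file after its first line.

```python
import fileinput
import os
import tempfile

# Recreate the two sample files in a temporary directory (assumed layout).
tmpdir = tempfile.mkdtemp()
path1 = os.path.join(tmpdir, "1.txt")
path2 = os.path.join(tmpdir, "2.txt")
with open(path1, "w") as f:
    f.write("1a\n2a\n3a\n4a\n")
with open(path2, "w") as f:
    f.write("1b\n2b\n")

seen = []
for line in fileinput.input([path1, path2]):
    seen.append((os.path.basename(fileinput.filename()),
                 fileinput.lineno(),      # cumulative line number
                 fileinput.filelineno(),  # line number within the current file
                 line.strip()))
    if fileinput.isfirstline():
        # Close the current file; the next iteration starts with the next file.
        fileinput.nextfile()
fileinput.close()

for record in seen:
    print(record)
```

Only the first line of each file is recorded, and the skipped lines of 1.txt do not count towards lineno(), so the first line of 2.txt is reported as cumulative line 2.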
Before any lines have been read, filename() returns None and both line
numbers are zero; nextfile() has no effect. After all lines have been
read, filename() and the line number functions return the values
pertaining to the last line read; nextfile() has no effect.
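[Summary] Before any line has been read, filename() returns None, both line numbers are 0, and nextfile() does nothing. Once all lines have been read, filename() and the line-number functions keep returning the values for the last line read, and nextfile() again does nothing.
[Example] Left for the reader to try.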
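One way to try this out is sketched below in Python 3 syntax. The temporary copy of 1.txt is an assumption made so that the sketch runs on its own; the values are captured before the first line is read and again after the last one.

```python
import fileinput
import os
import tempfile

# Temporary copy of the sample file 1.txt (assumed contents).
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "1.txt")
with open(path, "w") as f:
    f.write("1a\n2a\n3a\n4a\n")

reader = fileinput.input([path])

# Nothing has been read yet.
before = (fileinput.filename(), fileinput.lineno(), fileinput.filelineno())

for line in reader:
    pass

# Everything has been read; the values still describe the last line.
after = (os.path.basename(fileinput.filename()),
         fileinput.lineno(), fileinput.filelineno())
fileinput.close()

print(before)
print(after)
```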
All files are opened in text mode. If an I/O error occurs during
opening or reading a file, the IOError exception is raised.
[Summary] All files are opened in text mode. If an I/O error occurs while opening or reading a file, an IOError exception is raised.
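A small sketch of handling that error, in Python 3 syntax, where IOError is an alias of OSError. The missing filename is hypothetical and is placed in a fresh temporary directory so that it is guaranteed not to exist.

```python
import fileinput
import os
import tempfile

# Hypothetical path that does not exist (the directory is freshly created).
missing = os.path.join(tempfile.mkdtemp(), "missing.txt")

try:
    for line in fileinput.input([missing]):
        pass
    outcome = "read without error"
except IOError as exc:
    # The error surfaces when the loop tries to open the file.
    outcome = "I/O error: %s" % exc.strerror
finally:
    fileinput.close()

print(outcome)
```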
If sys.stdin is used more than once, the second and further use will
return no lines, except perhaps for interactive use, or if it has been
explicitly reset (e.g. using sys.stdin.seek(0)).
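[Summary] If standard input is used more than once, the second and later uses return no lines, except possibly in interactive use or after the stream has been explicitly reset (for example with sys.stdin.seek(0)).
[Example]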
#!/usr/bin/env python
import fileinput, sys
# First pass over standard input.
for line in fileinput.input():
    print "line1:", line,
fileinput.close()
# Reset standard input so that it can be read a second time.
sys.stdin.seek(0)
for line in fileinput.input():
    print "line2:", line,
fileinput.close()
[root@newpatch3 /home/python]#python test.py
a
b
c
line1: a
line1: b
line1: c
e
f
g
line2: e
line2: f
line2: g
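The transcript above is interactive, so it does not show the "second use returns no lines" case. The sketch below, in Python 3 syntax, makes that case reproducible by temporarily replacing sys.stdin with an in-memory io.StringIO. That replacement and the helper read_stdin_lines() are assumptions for illustration, not something the module requires.

```python
import fileinput
import io
import sys

def read_stdin_lines():
    """Return every line fileinput can currently read from standard input."""
    lines = list(fileinput.input(["-"]))  # '-' means standard input
    fileinput.close()
    return lines

real_stdin = sys.stdin
sys.stdin = io.StringIO("a\nb\nc\n")  # stand-in for real standard input
try:
    first = read_stdin_lines()   # reads a, b, c
    second = read_stdin_lines()  # stream is exhausted: no lines
    sys.stdin.seek(0)            # explicit reset
    third = read_stdin_lines()   # reads a, b, c again
finally:
    sys.stdin = real_stdin

print(first, second, third)
```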
Empty files are opened and immediately closed; the only time their
presence in the list of filenames is noticeable at all is when the
last file opened is empty.
It is possible that the last line of a file doesn't end in a newline
character; otherwise lines are returned including the trailing
newline.
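Both points can be checked with a short sketch in Python 3 syntax. The three temporary files a.txt, empty.txt and b.txt are hypothetical names chosen for this illustration.

```python
import fileinput
import os
import tempfile

tmpdir = tempfile.mkdtemp()

def make(name, text):
    """Create a file with the given text and return its path."""
    path = os.path.join(tmpdir, name)
    with open(path, "w") as f:
        f.write(text)
    return path

a = make("a.txt", "x\ny")      # last line has no trailing newline
empty = make("empty.txt", "")  # empty file
b = make("b.txt", "z\n")

# An empty file in the middle of the list is skipped silently.
records = [(os.path.basename(fileinput.filename()), line)
           for line in fileinput.input([a, empty, b])]
fileinput.close()

# An empty file at the end is noticeable only through filename().
for line in fileinput.input([a, empty]):
    pass
last_name = os.path.basename(fileinput.filename())
last_lineno = fileinput.lineno()
fileinput.close()

print(records)
print(last_name, last_lineno)
```

The line "y" is returned without a newline because the file does not end in one, and after the second loop filename() names the empty file even though it contributed no lines.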
Class FileInput is the implementation; its methods filename(),
lineno(), filelineno(), isfirstline(), isstdin(), nextfile() and close()
correspond to the functions in the module. In addition it has a
readline() method which returns the next input line, and a
__getitem__() method which implements the sequence behavior. The
sequence must be accessed in strictly sequential order; sequence
access and readline() cannot be mixed.
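[Summary] The FileInput class is the implementation behind the module. Its methods filename(), lineno(), filelineno(), isfirstline(), isstdin(), nextfile() and close() correspond to the module-level functions. It also has a readline() method, which returns the next input line, and a __getitem__() method, which implements sequence access. The sequence must be read strictly in order, and sequence access must not be mixed with readline().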
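A minimal sketch of using the class directly, in Python 3 syntax. It reads only through readline() and does not mix in sequence access; the temporary copies of the sample files are an assumption made so that the sketch is self-contained.

```python
import fileinput
import os
import tempfile

# Temporary copies of the sample files (assumed contents).
tmpdir = tempfile.mkdtemp()
path1 = os.path.join(tmpdir, "1.txt")
path2 = os.path.join(tmpdir, "2.txt")
with open(path1, "w") as f:
    f.write("1a\n2a\n3a\n4a\n")
with open(path2, "w") as f:
    f.write("1b\n2b\n")

reader = fileinput.FileInput(files=[path1, path2])

first = reader.readline()
second = reader.readline()
# The instance methods mirror the module-level functions.
position = (os.path.basename(reader.filename()),
            reader.lineno(), reader.filelineno())

# readline() returns an empty string once every file is exhausted.
rest = []
while True:
    line = reader.readline()
    if not line:
        break
    rest.append(line)
reader.close()

print(first, second, position, rest)
```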
Optional in-place filtering: if the keyword argument inplace=1 is
passed to input() or to the FileInput constructor, the file is moved
to a backup file and standard output is directed to the input file.
This makes it possible to write a filter that rewrites its input file
in place. If the keyword argument backup=".<some extension>" is also
given, it specifies the extension for the backup file, and the backup
file remains around; by default, the extension is ".bak" and it is
deleted when the output file is closed. In-place filtering is
disabled when standard input is read. XXX The current implementation
does not work for MS-DOS 8+3 filesystems.
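[Summary] With inplace=1, whatever the program writes to standard output while processing a file is written into that input file, whose original contents are first moved to a backup file. If backup is also given, it sets the backup file's extension and the backup is kept; otherwise the backup (".bak" by default) is deleted when the output file is closed.
[Examples]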
(1A)
#!/usr/bin/env python
import fileinput, sys
# inplace=0: output goes to the terminal and 1.txt is left unchanged.
for line in fileinput.input("1.txt", inplace=0):
    print line,
[root@newpatch3 /home/python]#python test.py
1a
2a
3a
4a
[root@newpatch3 /home/python]#more 1.txt
1a
2a
3a
4a
(1B)
#!/usr/bin/env python
import fileinput, sys
# inplace=1: everything printed inside the loop is written into 1.txt.
for line in fileinput.input("1.txt", inplace=1):
    print "test",
[root@newpatch3 /home/python]#python test.py
[root@newpatch3 /home/python]#more 1.txt
test test test test
Comparing 1A and 1B shows that with inplace=1 and no backup argument, the output is written straight into 1.txt and no backup copy is kept.
(2)
#!/usr/bin/env python
import fileinput, sys
# backup=".bak" keeps the original contents in 1.txt.bak.
for line in fileinput.input("1.txt", inplace=1, backup=".bak"):
    print "test\n",
[root@newpatch3 /home/python]#ls
1.txt 1.txt.bak 2.txt test.py
[root@newpatch3 /home/python]#more 1.txt
test
test
test
test
[root@newpatch3 /home/python]#more 1.txt.bak
1a
2a
3a
4a
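The examples above overwrite every line with a constant. A filter that transforms each line is closer to the use the docstring describes, and the following sketch (Python 3 syntax) upper-cases the file in place while keeping a backup. The temporary copy of 1.txt and the choice of upper-casing are assumptions for illustration.

```python
import fileinput
import os
import tempfile

# Temporary copy of the sample file 1.txt (assumed contents).
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "1.txt")
with open(path, "w") as f:
    f.write("1a\n2a\n3a\n4a\n")

for line in fileinput.input([path], inplace=True, backup=".bak"):
    # While in-place mode is active, print() writes into 1.txt, not the terminal.
    print(line.rstrip("\n").upper())
fileinput.close()

with open(path) as f:
    new_text = f.read()
with open(path + ".bak") as f:
    old_text = f.read()

print(repr(new_text))
print(repr(old_text))
```

Standard output is restored once the loop finishes, so the two final print() calls go to the terminal again.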
Performance: this module is unfortunately one of the slower ways of
processing large numbers of input lines. Nevertheless, a significant
speed-up has been obtained by using readlines(bufsize) instead of
readline(). A new keyword argument, bufsize=N, is present on the
input() function and the FileInput() class to override the default
buffer size.
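[Summary] This module is one of the slower ways to process a large number of input lines. It was sped up significantly by reading with readlines(bufsize) instead of readline(), and the bufsize=N keyword argument of input() and FileInput() overrides the default buffer size. In short, tuning bufsize may help when reading large inputs.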
This article comes from the "坏男孩" blog; please keep this attribution: http://5ydycm.blog.51cto.com/115934/305488