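Cures kidney deficiency, sugar-free; twenty years focused on kernel performance optimization. https://github.com/KnightKu
Category: Python/Ruby
2013-03-11 16:23:46
Original post: "The string you cannot avoid talking about in Python" (python不得不说的string)  Author: kinfinger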
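The previous article explained that everything in Python is an object. In day-to-day development the type used most often is the string, so what does a string look like from that point of view?
C has two distinct types, the character and the string. Let's look at the difference between them first: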
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[])
{
    /* definition of variables */
    int i;
    char ch;
    const char *chp;                          /* points at a string literal */
    char cha[] = {'X', 'X', 'X', 'X', '\0'};  /* explicit terminator */
    char cha2[] = "XXXX";                     /* terminator added by the compiler */
    char *chp3 = (char *)malloc(sizeof(char) * 5);
    char *chp2;

    /* assignment of variables and actions */
    ch = 'a';
    chp = "XXXX";
    chp2 = strdup(chp);                       /* writable heap copy of the literal */
    for (i = 0; i < 4; i++)
        chp3[i] = i + 65;                     /* 'A', 'B', 'C', 'D' */
    chp3[4] = '\0';
    printf("************************before change****************************\n");
    printf("ch\t%c\ncha\t%s\ncha2\t%s\nchp\t%s\nchp2\t%s\nchp3\t%s\n", ch, cha, cha2, chp, chp2, chp3);
    printf("************************after change****************************\n");
    ch = 'y';
    cha[3] = 'y';
    cha2[3] = 'y';
    /* chp[3] = '3'; */                       /* writing to a string literal is undefined behavior */
    chp2[3] = 'y';
    chp3[3] = 'y';
    printf("ch\t%c\ncha\t%s\ncha2\t%s\nchp\t%s\nchp2\t%s\nchp3\t%s\n", ch, cha, cha2, chp, chp2, chp3);
    /*********add ********************/
    /* strlen() and sizeof both yield size_t, so print them with %zu */
    printf("notice of the difference between cha and cha2 : %zu,%zu\n", strlen(cha), strlen(cha2));
    printf("notice of the difference between cha and cha2 : %zu,%zu\n", sizeof(cha), sizeof(cha2));
    printf("****************************************************\n");
    free(chp2);
    free(chp3);
    return 0;
}
So what is the situation in Python? Let's try similar definitions:
ch = 'a'
string = 'XXXX'
cha = ['x', 'x', 'x', 'x']
print ch, string, cha, type(ch), type(string), type(cha),
print hex(id(cha)), hex(id(string)), hex(id(cha[3]))
# dive in string
# modify
ch = 'b'
# string[3] = 'b'    # not allowed: a string does not support item assignment
cha[3] = 'b'
print ch, string, cha, type(ch), type(string), type(cha),
print hex(id(cha)), hex(id(string)), hex(id(cha[3]))
# string manipulation
# iteration of string
for letter in string:
    print "".join(letter),
for letter in string:
    print ord(letter),
# string index access
print
# string slice (named sub so that the built-in str is not shadowed)
sub = string[:2]
print sub, string[2], len(sub), type(sub)
Program output:
a XXXX ['x', 'x', 'x', 'x']

def myfun():
    """ this is document"""
    # global fun='this is global area'
    return 'mymodule'

x = myfun
print x.__doc__
print x.__name__
# x.__module__ = 'module name '
print '%s' % x.func_doc
print '%s' % x.__module__
print x.func_code
print x.func_globals
print x.func_closure
print x.func_dict
print x()
y = ord
print y.__module__

Program output:
{'__builtins__': , '__file__': 'h:\\python\\myfun.py', 'myfun': , 'x': , '__name__': '__main__', '__doc__': None}
None
{}
mymodule
__builtin__
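As the output shows, a user-defined function and a built-in function report different module names, and some function attributes are read-only while others are writable.
Two function attributes are particularly important. One is func_code, which represents the compiled function body.
The other is func_globals, a reference to the dictionary that holds the function's global variables (the global namespace of the module in which the function was defined). The attribute returns a dict object: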
y=x.func_globals
print type(y)
Result:
<type 'dict'>
The discussion above touched on generator functions. The documentation has this to say:
yield_stmt ::= yield_expression
The yield statement is only used when defining a generator
function, and is only used in the body of the generator function. Using a yield statement in a function definition is sufficient
to cause that definition to create a generator function instead of a normal
function.
I'll stop here for now. I wrote things down as they came to mind, so the train of thought is a bit muddled.
P.S.
Sequences
These represent finite ordered sets indexed by
non-negative numbers. The built-in function len() returns the number of items of a sequence. When the length of a sequence is n, the index set contains the numbers 0, 1, ..., n-1. Item i of sequence a is selected by a[i].
Sequences also support slicing: a[i:j] selects all items
with index k such that i <= k < j. When used as an expression, a slice is a
sequence of the same type. This implies that the index set is renumbered so that
it starts at 0.
Some sequences also support “extended slicing” with a third
“step” parameter: a[i:j:k] selects all items of a with index x where x = i + n*k, n >= 0 and i <= x < j.
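As a small illustration of the slicing rules quoted above, here is a sketch using a made-up list named letters. It only uses syntax that behaves the same on Python 2 and Python 3.

```python
# Illustrative data; the name `letters` is made up for this sketch.
letters = ['a', 'b', 'c', 'd', 'e', 'f']

# Plain slice a[i:j]: items with index k such that 1 <= k < 4.
middle = letters[1:4]

# The slice is a new sequence of the same type, renumbered from 0.
first_of_middle = middle[0]

# Extended slice a[i:j:k]: indices x = 0 + n*2 with 0 <= x < 6.
every_other = letters[0:6:2]

# The same rules apply to strings.
text_slice = 'abcdef'[1:4]

print(middle)
print(every_other)
print(text_slice)
```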
Sequences are distinguished according to their mutability:
Immutable sequences
An object of an immutable sequence type cannot change
once it is created. (If the object contains references to other objects, these
other objects may be mutable and may be changed; however, the collection of
objects directly referenced by an immutable object cannot change.)
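The parenthetical remark above is easy to misread, so here is a minimal sketch of it. The names inner and record are invented for the example: the tuple cannot be made to reference a different object, but the list it references can still change.

```python
inner = [1, 2]              # a mutable list
record = (inner, 'label')   # an immutable tuple that references it

# The collection of objects referenced by the tuple cannot change.
try:
    record[0] = [9, 9]
    rebind_failed = False
except TypeError:
    rebind_failed = True

# The referenced list itself is still mutable.
inner.append(3)

print(rebind_failed)
print(record)
```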
The following types are immutable sequences:
Strings
The items of a string are characters. There is no
separate character type; a character is represented by a string of one item.
Characters represent (at least) 8-bit bytes. The built-in functions chr() and ord() convert between characters and nonnegative integers representing the byte
values. Bytes with the values 0-127 usually represent the corresponding ASCII
values, but the interpretation of values is up to the program. The string data
type is also used to represent arrays of bytes, e.g., to hold data read from a
file.
(On systems whose native character set is not ASCII,
strings may use EBCDIC in their internal representation, provided the functions chr() and ord() implement a mapping between ASCII and EBCDIC, and string comparison preserves
the ASCII order. Or perhaps someone can propose a better rule?)
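To tie this back to the 'XXXX' example from the article, the sketch below shows that a single character is just a string of length one, that ord() and chr() convert between characters and integers, and that item assignment on a string fails. It assumes ASCII values only, where Python 2 and Python 3 agree; the variable names are made up for the example.

```python
word = 'XXXX'

# There is no separate character type: one item of a string is a string.
first = word[0]
is_same_type = type(first) is type(word)

# ord() and chr() convert between characters and integers.
code = ord('A')
letter = chr(66)

# Strings are immutable, so item assignment raises TypeError.
try:
    word[3] = 'y'
    assignment_failed = False
except TypeError:
    assignment_failed = True

# The usual workaround is to build a new string.
changed = word[:3] + 'y'

print(code)
print(letter)
print(changed)
```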
Unicode
The items of a Unicode object are Unicode code
units. A Unicode code unit is represented by a Unicode object of one item and
can hold either a 16-bit or 32-bit value representing a Unicode ordinal (the
maximum value for the ordinal is given in sys.maxunicode, and depends on how Python is configured at
compile time). Surrogate pairs may be present in the Unicode object, and will be
reported as two separate items. The built-in functions unichr() and ord() convert between code units and nonnegative integers representing the Unicode
ordinals as defined in the Unicode Standard 3.0. Conversion from and to other
encodings are possible through the Unicode method encode() and the built-in function unicode().
Tuples
The items of a tuple are arbitrary Python
objects. Tuples of two or more items are formed by comma-separated lists of
expressions. A tuple of one item (a ‘singleton’) can be formed by affixing a
comma to an expression (an expression by itself does not create a tuple, since
parentheses must be usable for grouping of expressions). An empty tuple can be
formed by an empty pair of parentheses.
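The rules for forming tuples can be checked with a few assignments. This is only a sketch; the variable names are made up.

```python
pair = 1, 2          # two or more items: commas are enough
singleton = (1,)     # one item: the trailing comma makes the tuple
not_a_tuple = (1)    # parentheses alone only group an expression
empty = ()           # empty pair of parentheses

print(type(pair).__name__)
print(type(singleton).__name__)
print(type(not_a_tuple).__name__)
print(len(empty))
```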
Mutable sequences
Mutable sequences can be changed after they are
created. The subscription and slicing notations can be used as the target of
assignment and del (delete) statements.
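A short sketch of subscription and slicing used as targets of assignment and del, using a made-up list named items. The comments track the list contents after each step.

```python
items = ['a', 'b', 'c', 'd', 'e']

items[0] = 'A'        # subscription as assignment target
# items is now ['A', 'b', 'c', 'd', 'e']

items[1:3] = ['X']    # slice assignment may change the length
# items is now ['A', 'X', 'd', 'e']

del items[-1]         # subscription as del target
# items is now ['A', 'X', 'd']

del items[0:1]        # slice as del target
# items is now ['X', 'd']

print(items)
```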
There are currently two intrinsic mutable sequence types:
Lists
The items of a list are arbitrary Python
objects. Lists are formed by placing a comma-separated list of expressions in
square brackets. (Note that there are no special cases needed to form lists of
length 0 or 1.)
Byte Arrays
A bytearray object is a mutable array. They
are created by the built-in bytearray() constructor. Aside from being mutable (and
hence unhashable), byte arrays otherwise provide the same interface and
functionality as immutable bytes objects.
The extension module array provides an additional example of a mutable sequence type.
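For the bytearray type described above, a minimal sketch (assuming Python 2.6 or later, where bytearray and the b'' literal exist) mirrors the C example from the article: the buffer can be modified in place, which a plain string cannot.

```python
buf = bytearray(b'XXXX')

# Items of a bytearray are integers, so assign the byte value of 'y'.
buf[3] = ord('y')

# Being mutable, a bytearray is unhashable.
try:
    hash(buf)
    hashable = True
except TypeError:
    hashable = False

# An immutable copy can be taken when needed.
frozen = bytes(buf)

print(len(buf))
print(hashable)
```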
Callable types
These are the types to which the function call
operation (see section Calls) can be applied:
User-defined functions
A user-defined function object is created by a
function definition (see section Function definitions). It
should be called with an argument list containing the same number of items as
the function’s formal parameter list.
Special attributes:
Attribute | Meaning | Access
func_doc | The function’s documentation string, or None if unavailable | Writable
__doc__ | Another way of spelling func_doc | Writable
func_name | The function’s name | Writable
__name__ | Another way of spelling func_name | Writable
__module__ | The name of the module the function was defined in, or None if unavailable. | Writable
func_defaults | A tuple containing default argument values for those arguments that have defaults, or None if no arguments have a default value | Writable
func_code | The code object representing the compiled function body. | Writable
func_globals | A reference to the dictionary that holds the function’s global variables (the global namespace of the module in which the function was defined). | Read-only
func_dict | The namespace supporting arbitrary function attributes. | Writable
func_closure | None or a tuple of cells that contain bindings for the function’s free variables. | Read-only
Most of the attributes labelled “Writable” check the type of the assigned
value.
Changed in version 2.4: func_name is now
writable.
Function objects also support getting and setting arbitrary attributes, which
can be used, for example, to attach metadata to functions. Regular attribute
dot-notation is used to get and set such attributes. Note that the current
implementation only supports function attributes on user-defined functions.
Function attributes on built-in functions may be supported in the
future.
Additional information about a function’s definition can be
retrieved from its code object; see the description of internal types below.
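The following sketch revisits myfun from the article and exercises a few of these attributes. It deliberately uses only the double-underscore spellings (__doc__, __name__, __dict__), since the func_* names exist only in Python 2. The attribute name author is hypothetical, chosen just to show an arbitrary attribute.

```python
def myfun():
    """this is document"""
    return 'mymodule'

# Special attributes that are spelled the same on Python 2 and 3.
doc = myfun.__doc__
name = myfun.__name__

# __doc__ is writable.
myfun.__doc__ = 'updated document'

# Arbitrary attributes (here a hypothetical `author`) use dot notation
# and are stored in the function's namespace dictionary.
myfun.author = 'kinfinger'
extra = myfun.__dict__

# Built-in functions do not accept arbitrary attributes.
try:
    len.author = 'nobody'
    builtin_accepts = True
except (AttributeError, TypeError):
    builtin_accepts = False

print(doc)
print(name)
print(extra)
print(builtin_accepts)
```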
User-defined methods
A user-defined method object combines a class, a
class instance (or None) and any callable object (normally a user-defined
function).
Special read-only attributes: im_self is
the class instance object, im_func is the function object; im_class is
the class of im_self for bound methods or the class that asked for the
method for unbound methods; __doc__ is the method’s documentation (same as im_func.__doc__); __name__ is
the method name (same as im_func.__name__); __module__ is the name of the module the method was defined in, or None if
unavailable.
Changed in version 2.2: im_self used to refer to the class that defined the
method.
Changed in version 2.6: For 3.0 forward-compatibility, im_func is
also available as __func__, and im_self as __self__.
Methods also support accessing (but not setting) the arbitrary
function attributes on the underlying function object.
User-defined method objects may be created when getting an attribute of a
class (perhaps via an instance of that class), if that attribute is a
user-defined function object, an unbound user-defined method object, or a class
method object. When the attribute is a user-defined method object, a new method
object is only created if the class from which it is being retrieved is the same
as, or a derived class of, the class stored in the original method object;
otherwise, the original method object is used as it is.
When a user-defined method object is created by retrieving a
user-defined function object from a class, its im_self attribute is None and the method object is said to be unbound. When one is created by retrieving a
user-defined function object from a class via one of its instances, its im_self attribute is the instance, and the method object is said to be bound. In either
case, the new method’s im_class attribute is the class from which the retrieval
takes place, and its im_func attribute is the original function object.
When a user-defined method object is created by retrieving
another method object from a class or instance, the behaviour is the same as for
a function object, except that the im_func attribute of the new instance is not the original method object but its im_func attribute.
When a user-defined method object is created by retrieving a
class method object from a class or instance, its im_self attribute is the class itself (the same as the im_class attribute), and its im_func attribute is the function object underlying the
class method.
When an unbound user-defined method object is called, the underlying function
(im_func) is called, with the restriction that the first
argument must be an instance of the proper class (im_class)
or of a derived class thereof.
When a bound user-defined method object is called, the underlying function
(im_func) is called, inserting the class instance (im_self) in
front of the argument list. For instance, when C is a
class which contains a definition for a function f(), and x is an instance of C, calling x.f(1) is equivalent to
calling C.f(x, 1).
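The equivalence between x.f(1) and C.f(x, 1) can be seen in a tiny sketch. The class C and the method f follow the names used in the text; the method body is made up. The __self__ and __func__ spellings are used because they work on Python 2.6+ and Python 3.

```python
class C(object):
    def f(self, n):
        # Return the arguments so the two call styles can be compared.
        return (self, n)

x = C()

via_instance = x.f(1)     # instance inserted in front of the arguments
via_class = C.f(x, 1)     # instance passed explicitly

bound = x.f
print(via_instance == via_class)
print(bound.__self__ is x)
```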
When a user-defined method object is derived from a class method object, the
“class instance” stored in im_self will actually be the class itself, so that calling
either x.f(1) or C.f(1) is equivalent to
calling f(C,1) where f is the underlying
function.
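The same kind of check works for class methods. In this sketch (class and method names follow the text, the body is invented) the object stored in __self__ is the class itself, so calling through the instance or through the class gives the same result as calling the underlying function with C as the first argument.

```python
class C(object):
    @classmethod
    def f(cls, n):
        return (cls, n)

x = C()

via_instance = x.f(1)
via_class = C.f(1)

# The function object underlying the class method.
underlying = C.f.__func__
direct = underlying(C, 1)

print(via_instance == via_class == direct)
print(C.f.__self__ is C)
```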
Note that the transformation from function object to (unbound or
bound) method object happens each time the attribute is retrieved from the class
or instance. In some cases, a fruitful optimization is to assign the attribute
to a local variable and call that local variable. Also notice that this
transformation only happens for user-defined functions; other callable objects
(and all non-callable objects) are retrieved without transformation. It is also
important to note that user-defined functions which are attributes of a class
instance are not converted to bound methods; this only happens when the
function is an attribute of the class.
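Two points from the note above can be sketched briefly: a function stored on an instance is returned untouched, and a method looked up through the class is wrapped anew on every retrieval, which is why caching it in a local variable can help. The names C, f and g are made up, and the identity check reflects how CPython behaves.

```python
class C(object):
    def f(self):
        return 'method'

def g():
    return 'plain function'

x = C()
x.g = g    # stored on the instance, not on the class

# No transformation: the attribute is the very same function object,
# and it is called without an implicit instance argument.
same_object = x.g is g
plain_result = x.g()

# Each retrieval of x.f builds a new bound method object.
fresh_each_time = x.f is not x.f

# Caching the bound method in a local variable avoids repeated lookups.
cached = x.f
results = [cached() for _ in range(3)]

print(same_object)
print(fresh_each_time)
```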
Generator functions
A function or method which uses the yield statement (see section The yield
statement) is called a generator function. Such a
function, when called, always returns an iterator object which can be used to
execute the body of the function: calling the iterator’s next() method will cause the function to execute until it provides a value using the yield statement. When the function executes a return statement or falls off the end, a StopIteration exception is raised and the iterator
will have reached the end of the set of values to be returned.
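Here is a minimal generator sketch. The function countdown is invented for the example. It uses the built-in next() (available since Python 2.6), which calls the iterator's next() method on Python 2 and __next__() on Python 3.

```python
def countdown(n):
    # The presence of yield makes this a generator function.
    while n > 0:
        yield n
        n -= 1

it = countdown(2)    # calling it returns an iterator; no body code runs yet

first = next(it)     # runs until the first yield
second = next(it)    # resumes and runs until the next yield

# Falling off the end of the function raises StopIteration.
try:
    next(it)
    exhausted = False
except StopIteration:
    exhausted = True

print(first)
print(second)
print(exhausted)
```

A for loop or list() handles StopIteration automatically, so list(countdown(3)) simply collects the yielded values.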
Built-in functions
A built-in function object is a wrapper around
a C function. Examples of built-in functions are len() and math.sin() (math is
a standard built-in module). The number and type of the arguments are determined
by the C function. Special read-only attributes: __doc__ is
the function’s documentation string, or None if unavailable; __name__ is
the function’s name; __self__ is set to None (but see the next item); __module__ is the name of the module the function was defined in or None if
unavailable.
Built-in methods
This is really a different disguise of a
built-in function, this time containing an object passed to the C function as an
implicit extra argument. An example of a built-in method is alist.append(), assuming alist is a list object. In this case, the special read-only attribute __self__ is set to the object denoted by alist.
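The alist.append() example from the text can be checked directly. This sketch only inspects __self__ on the built-in method, because its value on plain built-in functions differs between Python versions.

```python
alist = [1, 2]

# Retrieving the attribute gives a built-in method object
# that remembers the list it came from.
method = alist.append
owner_is_alist = method.__self__ is alist

# Calling it passes alist to the C function implicitly.
method(3)

builtin_name = len.__name__

print(owner_is_alist)
print(alist)
print(builtin_name)
```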
Class Types
Class types, or “new-style classes,” are callable. These objects normally
act as factories for new instances of themselves, but variations are possible
for class types that override __new__(). The arguments of the call are passed to __new__() and, in the typical case, to __init__() to initialize the new instance.
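To see the call arguments reach both __new__() and __init__(), here is a sketch with a made-up class Point and a made-up list calls that records the order in which the two hooks run.

```python
calls = []   # records which hooks ran, in order

class Point(object):
    def __new__(cls, x, y):
        # Receives the call arguments and creates the instance.
        calls.append('__new__')
        return super(Point, cls).__new__(cls)

    def __init__(self, x, y):
        # Receives the same arguments and initializes the instance.
        calls.append('__init__')
        self.x = x
        self.y = y

p = Point(1, 2)

print(calls)
print(p.x)
print(p.y)
```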
Classic Classes
Class objects are described below. When a
class object is called, a new class instance (also described below) is created
and returned. This implies a call to the class’s __init__() method if it has one. Any arguments are
passed on to the __init__() method. If there is no __init__() method, the class must be called without
arguments.
Class instances
Class instances are described below. Class instances are callable only when
the class has a __call__() method; x(arguments) is a shorthand for x.__call__(arguments).
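A last sketch shows an instance made callable through __call__(). The class Adder and its attribute base are hypothetical names chosen for the example.

```python
class Adder(object):
    def __init__(self, base):
        self.base = base

    def __call__(self, n):
        # Invoked when the instance itself is called.
        return self.base + n

add5 = Adder(5)

shorthand = add5(1)             # x(arguments)
explicit = add5.__call__(1)     # x.__call__(arguments)

print(shorthand)
print(explicit)
```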