Category: Python/Ruby

2010-10-06 00:57:21

.. _tut-fp-issues:

****************************************************************************************
Floating Point Arithmetic:  Issues and Limitations
****************************************************************************************

.. sectionauthor:: Tim Peters


Floating-point numbers are represented in computer hardware as base 2 (binary)
fractions.  For example, the decimal fraction::

   0.125

has value 1/10 + 2/100 + 5/1000, and in the same way the binary fraction::

   0.001

has value 0/2 + 0/4 + 1/8.  These two fractions have identical values, the only
real difference being that the first is written in base 10 fractional notation,
and the second in base 2.
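
The equality of the two notations can be checked with exact rational
arithmetic.  The following is a minimal sketch, assuming Python 3 and the
standard :mod:`fractions` module; the variable names are illustrative only::

   from fractions import Fraction

   # 0.125 read digit by digit in base 10.
   decimal_value = Fraction(1, 10) + Fraction(2, 100) + Fraction(5, 1000)
   # 0.001 read digit by digit in base 2.
   binary_value = Fraction(0, 2) + Fraction(0, 4) + Fraction(1, 8)

   print(decimal_value == binary_value)     # True
   print(Fraction(0.125) == binary_value)   # True: the float 0.125 is exact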

Unfortunately, most decimal fractions cannot be represented exactly as binary
fractions.  A consequence is that, in general, the decimal floating-point
numbers you enter are only approximated by the binary floating-point numbers
actually stored in the machine.

The problem is easier to understand at first in base 10.  Consider the fraction
1/3.  You can approximate that as a base 10 fraction::

   0.3

or, better::

   0.33

or, better::

   0.333

and so on.  No matter how many digits you're willing to write down, the result
will never be exactly 1/3, but will be an increasingly better approximation of
1/3.

In the same way, no matter how many base 2 digits you're willing to use, the
decimal value 0.1 cannot be represented exactly as a base 2 fraction.  In base
2, 1/10 is the infinitely repeating fraction::

   0.0001100110011001100110011001100110011001100110011...
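
The repeating digits can be reproduced by repeated doubling.  This is only a
sketch, assuming Python 3; ``binary_expansion`` is a hypothetical helper written
for this illustration, not a library function::

   from fractions import Fraction

   def binary_expansion(value, bits):
       """Return the first `bits` base 2 digits of a fraction in [0, 1)."""
       digits = []
       for _ in range(bits):
           value *= 2              # shift one binary place to the left
           if value >= 1:
               digits.append("1")
               value -= 1          # drop the digit just emitted
           else:
               digits.append("0")
       return "0." + "".join(digits)

   print(binary_expansion(Fraction(1, 10), 20))   # 0.00011001100110011001

Because the remainder returns to 1/5 every four steps, the pattern ``0011``
never terminates.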

Stop at any finite number of bits, and you get an approximation.  This is why
you see things like::

   >>> 0.1
   0.10000000000000001

On most machines today, that is what you'll see if you enter 0.1 at a Python
prompt.  You may not, though, because the number of bits used by the hardware to
store floating-point values can vary across machines, and Python only prints a
decimal approximation to the true decimal value of the binary approximation
stored by the machine.  On most machines, if Python were to print the true
decimal value of the binary approximation stored for 0.1, it would have to
display::

   >>> 0.1
   0.1000000000000000055511151231257827021181583404541015625

instead!  The Python prompt uses the built-in :func:`repr` function to obtain a
string version of everything it displays.  For floats, ``repr(float)`` rounds
the true decimal value to 17 significant digits, giving::

   0.10000000000000001
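
Both displays can be reproduced explicitly.  The sketch below assumes Python 3
on a platform with IEEE-754 doubles; it uses :class:`decimal.Decimal` for the
exact stored value and the ``.17g`` format specifier for the 17-digit rounding,
so it does not depend on what the interactive prompt prints by default::

   from decimal import Decimal

   exact = Decimal(0.1)                  # exact value of the stored double
   seventeen = format(0.1, ".17g")       # rounded to 17 significant digits

   print(exact)
   print(seventeen)                      # 0.10000000000000001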

``repr(float)`` produces 17 significant digits because it turns out that's
enough (on most machines) so that ``eval(repr(x)) == x`` exactly for all finite
floats *x*, but rounding to 16 digits is not enough to make that true.
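
A small experiment illustrates the difference between 16 and 17 digits.  This
is a sketch assuming Python 3 and IEEE-754 doubles; the digit counts are fixed
with :func:`format` because newer Python versions may print fewer digits at the
prompt by default::

   x = 0.1 + 0.2

   sixteen = format(x, ".16g")      # '0.3'
   seventeen = format(x, ".17g")    # '0.30000000000000004'

   print(float(sixteen) == x)       # False: 16 digits lose information
   print(float(seventeen) == x)     # True: 17 digits round-trip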

Note that this is in the very nature of binary floating-point: this is not a bug
in Python, and it is not a bug in your code either.  You'll see the same kind of
thing in all languages that support your hardware's floating-point arithmetic
(although some languages may not *display* the difference by default, or in all
output modes).

Python's built-in :func:`str` function produces only 12 significant digits, and
you may wish to use that instead.  It's unusual for ``eval(str(x))`` to
reproduce *x*, but the output may be more pleasant to look at::

   >>> print str(0.1)
   0.1

It's important to realize that this is, in a real sense, an illusion: the value
in the machine is not exactly 1/10; you're simply rounding the *display* of the
true machine value.

Other surprises follow from this one.  For example, after seeing::

   >>> 0.1
   0.10000000000000001

you may be tempted to use the :func:`round` function to chop it back to the
single digit you expect.  But that makes no difference::

   >>> round(0.1, 1)
   0.10000000000000001

The problem is that the binary floating-point value stored for "0.1" was already
the best possible binary approximation to 1/10, so trying to round it again
can't make it better:  it was already as good as it gets.
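
That rounding changes nothing can be confirmed by comparing the stored values
rather than their displays.  A minimal sketch, assuming Python 3::

   from decimal import Decimal

   rounded = round(0.1, 1)

   print(rounded == 0.1)                      # True: the same double
   print(Decimal(rounded) == Decimal(0.1))    # True: same exact stored value
   print(Decimal(rounded) == Decimal("0.1"))  # False: still not exactly 1/10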

Another consequence is that since 0.1 is not exactly 1/10, summing ten values of
0.1 may not yield exactly 1.0, either::

   >>> sum = 0.0
   >>> for i in range(10):
   ...     sum += 0.1
   ...
   >>> sum
   0.99999999999999989
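
The same loop can be written as a script that compares the result with 1.0
directly.  This is a sketch assuming Python 3 and IEEE-754 doubles; the name
``total`` is used here only to avoid shadowing the built-in :func:`sum`::

   total = 0.0
   for _ in range(10):
       total += 0.1             # each addition may round again

   print(total == 1.0)          # False
   print(format(total, ".17g")) # 0.99999999999999989
   print(1.0 - total)           # the accumulated error, a tiny positive number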

Binary floating-point arithmetic holds many surprises like this.  The problem
with "0.1" is explained in precise detail below, in the "Representation Error"
section.  See "The Perils of Floating Point" for a more complete account of
other common surprises.

As that says near the end, "there are no easy answers."  Still, don't be unduly
wary of floating-point!  The errors in Python float operations are inherited
from the floating-point hardware, and on most machines are on the order of no
more than 1 part in 2\*\*53 per operation.  That's more than adequate for most
tasks, but you do need to keep in mind that it's not decimal arithmetic, and
that every float operation can suffer a new rounding error.
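
The "1 part in 2\*\*53" bound can be checked for the conversion of 0.1 itself.
The sketch below assumes Python 3 and IEEE-754 doubles, and uses exact rational
arithmetic so that the check introduces no rounding of its own::

   from fractions import Fraction

   stored = Fraction(0.1)            # exact value of the double nearest 1/10
   wanted = Fraction(1, 10)

   relative_error = abs(stored - wanted) / wanted
   print(relative_error <= Fraction(1, 2**53))   # True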

While pathological cases do exist, for most casual use of floating-point
arithmetic you'll see the result you expect in the end if you simply round the
display of your final results to the number of decimal digits you expect.
:func:`str` usually suffices, and for finer control see the :meth:`str.format`
method's format specifiers in :ref:`formatstrings`.
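
As an illustration of rounding only the display, the following sketch (assuming
Python 3) formats an inexact sum with two common format specifiers; the chosen
precisions are arbitrary examples::

   total = 0.0
   for _ in range(10):
       total += 0.1

   print(format(total, ".12g"))      # 1
   print("{:.2f}".format(total))     # 1.00
   print(total == 1.0)               # False: the stored value is unchanged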

.. _tut-fp-error:

Representation Error
==========================================================

This section explains the "0.1" example in detail, and shows how you can perform
an exact analysis of cases like this yourself.  Basic familiarity with binary
floating-point representation is assumed.

:dfn:`Representation error` refers to the fact that some (most, actually)
decimal fractions cannot be represented exactly as binary (base 2) fractions.
This is the chief reason why Python (or Perl, C, C++, Java, Fortran, and many
others) often won't display the exact decimal number you expect::

   >>> 0.1
   0.10000000000000001

Why is that?  1/10 is not exactly representable as a binary fraction. Almost all
machines today (November 2000) use IEEE-754 floating point arithmetic, and
almost all platforms map Python floats to IEEE-754 "double precision".  754
doubles contain 53 bits of precision, so on input the computer strives to
convert 0.1 to the closest fraction it can of the form *J*/2**\ *N* where *J* is
an integer containing exactly 53 bits.  Rewriting::

   1 / 10 ~= J / (2**N)

as::

   J ~= 2**N / 10

and recalling that *J* has exactly 53 bits (is ``>= 2**52`` but ``< 2**53``),
the best value for *N* is 56::

   >>> 2**52
   4503599627370496L
   >>> 2**53
   9007199254740992L
   >>> 2**56/10
   7205759403792793L

That is, 56 is the only value for *N* that leaves *J* with exactly 53 bits.  The
best possible value for *J* is then that quotient rounded::

   >>> q, r = divmod(2**56, 10)
   >>> r
   6L

Since the remainder is more than half of 10, the best approximation is obtained
by rounding up::

   >>> q+1
   7205759403792794L
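
The search for *N* and the rounding of *J* can be gathered into one short
script.  This is a sketch assuming Python 3, where integers have no ``L``
suffix and ``//`` is integer division; the search range of 100 exponents is an
arbitrary choice for the illustration::

   # Keep only the exponents N for which 2**N // 10 has exactly 53 bits.
   candidates = [n for n in range(100) if 2**52 <= 2**n // 10 < 2**53]
   print(candidates)                 # [56]

   q, r = divmod(2**56, 10)
   # Round to nearest: go up when the remainder exceeds half the divisor.
   j = q + 1 if 2 * r > 10 else q
   print(j)                          # 7205759403792794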

Therefore the best possible approximation to 1/10 in 754 double precision is
that over 2\*\*56, or::

   7205759403792794 / 72057594037927936

Note that since we rounded up, this is actually a little bit larger than 1/10;
if we had not rounded up, the quotient would have been a little bit smaller than
1/10.  But in no case can it be *exactly* 1/10!
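
The derived fraction can be compared with what the machine actually stores.
A minimal sketch, assuming Python 3 and IEEE-754 doubles; it relies on
:meth:`float.as_integer_ratio`, which returns the stored value as a fraction in
lowest terms::

   from fractions import Fraction

   derived = Fraction(7205759403792794, 2**56)

   print((0.1).as_integer_ratio())      # the same fraction, reduced by 2
   print(derived == Fraction(0.1))      # True: this is the stored value
   print(derived > Fraction(1, 10))     # True: slightly larger than 1/10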

So the computer never "sees" 1/10:  what it sees is the exact fraction given
above, the best 754 double approximation it can get::

   >>> .1 * 2**56
   7205759403792794.0

If we multiply that fraction by 10\*\*30, we can see the (truncated) value of
its 30 most significant decimal digits::

   >>> 7205759403792794 * 10**30 / 2**56
   100000000000000005551115123125L

meaning that the exact number stored in the computer is approximately equal to
the decimal value 0.100000000000000005551115123125.  Rounding that to 17
significant digits gives the 0.10000000000000001 that Python displays (well,
will display on any 754-conforming platform that does best-possible input and
output conversions in its C library; yours may not!).
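
The 30-digit calculation can be repeated and cross-checked against the exact
decimal expansion.  This sketch assumes Python 3 and IEEE-754 doubles; ``//`` is
used because ``/`` on integers no longer truncates in Python 3::

   from decimal import Decimal

   digits = 7205759403792794 * 10**30 // 2**56
   print(digits)                         # 100000000000000005551115123125

   # The exact stored value begins with the same 30 digits.
   exact = str(Decimal(0.1))
   print(exact.startswith("0." + str(digits)))    # True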
