2008-06-08 15:07:18

A (hopefully) painless introduction to recursion

How recursion works, and why it's more powerful than regular looping
Contents:
  - looping with recursion
  - tail-recursive calls (tail recursion)
  - why recursion?
  - recursing through tree structures

looping with recursion

Although recursion isn't technically required for simple looping, it's surprisingly versatile and ultimately more expressive. We'll discuss recursion in two phases: first a simple problem where you don't really need recursion but it's easy to understand, then a harder problem where recursion works wonders.

For the simple problem to get us started, let's look at the ubiquitous factorial function:

fac n = if n == 1 then 1
        else n * fac (n - 1)
What this says is: if n equals 1, fac returns 1. For larger values of n, the function calls itself with a smaller value of n and multiplies the result of that call by the current value of n.

The last call made to fac during runtime is with n equal to 1. This is typical in two ways: The end test is handled at the beginning of the source code, and it returns a simple value, which in this case is 1. In general, the end value is often the identity for whatever operation is used to combine values in the function.
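For instance, the same pattern with addition bottoms out at 0, which is the identity for (+). A quick sketch (sumTo is an illustrative name, not part of the factorial example):

```haskell
-- Sketch of the same recursion pattern using (+): the end test comes
-- first, and the end value 0 is the identity for addition.
sumTo :: Int -> Int
sumTo n = if n == 0 then 0
          else n + sumTo (n - 1)
```

So sumTo 4 computes 4 + 3 + 2 + 1 + 0, just as fac 4 computes 4 * 3 * 2 * 1.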

As each call to fac makes another call with a smaller value of n, a chain of calculations is built up on the call stack, all of them waiting for the last call to return its value. Each call is waiting to multiply its own value of n by the value it receives from the next call. When the last function call is made, with n==1, that call returns 1 instead of recursing, thus ending the recursion chain. The stack then begins to unwind, multiplying later return values of the function by earlier (larger) values of n. The final result is built up as the calculations are done in the reverse of the order in which the function calls were made. This is what the call stack looks like as the calculation proceeds in time, one line for each operation:

fac 3                 -- original call
3 * (fac 2)           -- recursive call
3 * (2 * (fac 1))     -- last recursive call
3 * (2 * 1)           -- result of last call
3 * 2                 -- result of previous call
6                     -- result of original call
The version of fac listed above is a simple, non-tail-call recursion. Although it's simple and easy to write, it uses extra space on the stack, and it's slow (from all those function calls). This can be avoided by using tail recursion.

tail-recursive calls (tail recursion)

A tail call is any function call where nothing else needs to be calculated in the calling function when the call is made. It can occur in the middle of the function's source code, but the value of the recursive call has to be returned by the calling function immediately. If the calling function needs to do more operations on that value before it can return, the compiler has to push a new frame onto the stack for the next call. Here's a tail-recursive version of fac:

fac n = fac' n 1
  where fac' m a = if m == 1 then a
                   else fac' (m - 1) (m * a)
In this case, fac calls a local helper function named fac', defined inside fac, and fac' does all the work. fac' takes an extra argument a, called an "accumulator" because it accumulates the end result of all the calculations. If m equals 1, fac' returns the value of its accumulator. For larger values of m, fac' first multiplies its accumulator by m, then passes that value to the next call, and returns the value of that call without changing it.

So the tail-recursive fac' does its calculations in the same order as the calls are made, rather than in reverse. In principle, a tail recursion could build up its calls on the stack just like a non-tail recursion does, but it doesn't need to. The last call (with m==1) returns the accumulator, which is the final result, and all the previous calls just pass that result back to the entry point (the original call). Since that's trivially easy, any compiler worth the name will convert the tail call into a jump instruction, reuse the same memory locations for m and a in every call, and return the result immediately when the last call returns.

The last line of this example is a tail call: the value of each call fac' (m-1) (m*a) is returned unchanged by the enclosing call fac' m a. The compiler doesn't have to push anything onto the call stack in order to make the next call, because the two calls return the same value. This means the compiler can convert the recursive calls into jumps, eliminating the need for extra stack space. The final result is built up as the calls are made, by computing the new values of a. Conceptually, the final result is then passed unchanged from the call with m==1 back through each enclosing call to the original fac' n 1. In reality, the whole process is effectively compiled into a loop.
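Traced in time, in the same style as the earlier trace, the tail-recursive version looks like this:

fac 3                 -- original call
fac' 3 1              -- fac hands off to the helper
fac' 2 3              -- a = 3*1
fac' 1 6              -- a = 2*3
6                     -- the accumulator is returned directly

Notice that the multiplications happen on the way down, so there is nothing left to do as the calls return.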

The stack growth can also be avoided by using a simple loop instead, but some languages (e.g. Scheme and Haskell) don't provide that feature.

To the best of my knowledge, all functional-language compilers optimize all tail calls. The only exception I've heard of is a Common Lisp interpreter, and Common Lisp is officially a multiparadigm language. As far as I know, all of the popular Common Lisp compilers optimize tail calls.

why recursion?

So why don't we just use loops? They're simpler and faster for simple iteration, and tail recursion gets converted to loops anyway.

You'd think every language would have at least a limited looping facility to make simple loops easier. As things stand in early 2007, however, Common Lisp is one of the few widely used languages that provide good support for both functional programming and looping. At least two functional languages (Scheme and Haskell) don't support looping at all. Yes, that seems dumb. But Scheme and Haskell are both academically oriented languages, so loops were probably excluded intentionally for research purposes.

But given a choice between the two, recursion is vastly more powerful. Many programming problems involve tree-shaped data structures, which you pretty much need recursion for. For instance, a company's organizational chart might show departments, subdepartments, subsubdepartments, and finally employees. Management might want to know how many direct subordinates each manager has on average. Or a program might implement a dictionary as a tree, and you might want to know the average depth of search to find a word. Or you might want to act on an entire directory tree (one call per subdirectory), or explore a maze (one call each for go-straight, turn-left, and turn-right).

recursing through tree structures

A tree structure is made up of data structures commonly called "nodes", each of which can have zero or more subnodes, or "children". The child nodes can also have children, so a tree typically has nodes branching out to more nodes, like this:
        ____ n0 ____
       |     |      |
   __ n1 __  n2     n3
  |        |        |
  n4       n5       n6
n0 is called the "root" node; n1, n2, and n3 are n0's children; n4 and n5 are n1's children; n6 is the only child of n3; n4, n5, n2, and n6 are called "leaf nodes" because they don't have any children [1].

This is a very simple tree, compared to trees that real-world programs are typically applied to. For instance, in a program that plays a game like chess or Go, the set of all legal moves forms a tree structure, one layer of nodes per move. Each node typically has dozens of children, and the tree may be infinitely deep. A program that plays well might explore half a dozen or more layers of the tree, leading to billions of leaf nodes [2].

So how do you loop through all the nodes of the tree to do calculations on them? You'd need a separate loop for each layer of nodes. But how can you write all those loops into your program if you don't know how deep the tree is? It can't be done with simple looping; a new loop has to be created at runtime for each layer.

That's what recursion does. One loop is built into the recursive function, and the function calls itself again for each branch node in the tree. That's what makes recursion so cool: each call to the recursive function dynamically creates a new loop at runtime.

Note that these are not tail calls. The program has to build up a stack of recursive calls as it descends the tree, because the algorithm has to backtrack up the tree each time it finishes exploring a subtree, so it can start working on the next one. This is a simplified schematic of a recursive function:

doNode node =
    if isaLeaf node then leafValue node
    else <call doNode for each child of node,
          combine the values of those calls in
          some appropriate way, and return the
          result>
There are many ways the results of calling doNode child for each child of node can be combined; it all depends on what you're trying to calculate. You might take a sum, an average, the min or max, concatenate all the results, or something else. The common pattern is that the function first tests the node and returns a value if it's a leaf, then digs deeper into the tree if it's a branch.

For example, suppose we want to count up how many leaves a tree has. Each leaf node counts as one, so it might as well return 1. Then each branch node adds up the counts from its child nodes to calculate its total. If sum calls itself on a node's children sequentially, the process acting on the tree shown above looks something like this:

        ____ sum ____
       |     |      |        -- First, calls to sum are built up
  __ sum __  n2     n3       -- with the first child of each node.
  |        |        |
 sum       n5       n6

        ____ sum ____
       |     |      |
  _ 1+sum _  n2     n3
  |        |        |        -- sum n4 (a leaf node) returns 1
  1       sum       n6       -- and sum is called on n5

        ____ sum ____
       |     |      |
   __ 2 __   n2     n3
  |        |        |        -- sum n5 also returns 1,
  1        1        n6       -- so sum n1 returns the subtotal 2

       __ 2+sum ____
       |     |      |        -- the stack unwinds partway
   __ 2 __  sum     n3       -- and descends into the n2 branch
  |        |        |
  1        1        n6

       __ 3+sum ____
       |     |      |
   __ 2 __   1     sum
  |        |        |        -- finally the n3 branch is explored
  1        1       sum

        ____ 4 ____
       /     |     \
   __ 2 __   1      1
  /        \        |        -- and the total (4) is returned
  1         1       1
However, the calls to sum are not always sequential. If the function is applied to the node's children with a mapping function, the calling order may not be guaranteed.
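As a concrete sketch of the leaf count in Haskell, assuming a minimal tree type (Tree, Leaf, Node, countLeaves, and exampleTree are names introduced here for illustration, not from the schematic above):

```haskell
-- A minimal tree type for illustration: a node is either a leaf or a
-- branch with a list of children.
data Tree = Leaf | Node [Tree]

-- Count the leaves: a leaf counts as 1; a branch node combines its
-- children's counts by summing them.
countLeaves :: Tree -> Int
countLeaves Leaf        = 1
countLeaves (Node kids) = sum (map countLeaves kids)

-- The tree pictured earlier: n0 has children n1, n2, and n3; n1 has
-- the leaves n4 and n5; n3 has the single leaf n6.
exampleTree :: Tree
exampleTree = Node [Node [Leaf, Leaf], Leaf, Node [Leaf]]
```

Here countLeaves exampleTree works out to 4, matching the total in the trace above; swapping sum for maximum, minimum, or concatenation gives the other combining patterns mentioned earlier.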
footnotes

[1] Yes, it is mixing metaphors to talk about "children" and "branches" together. But it's common usage, along with the fact that the tree is pictured upside down.

[2] In traversing a very large or infinite tree, an algorithm will have to stop at some point, even though the current node has more branches. Any node where the algorithm computes a value without recursing is often called a "leaf" node, even though it has child nodes that technically could be explored.