2008-04-14 17:36:56

Why is the line terminator CR+LF?

This protocol dates back to the days of teletypewriters. CR stands for "carriage return" - the CR control character returned the print head ("carriage") to column 0 without advancing the paper. LF stands for "linefeed" - the LF control character advanced the paper one line without moving the print head. So if you wanted to return the print head to column 0 (ready to print the next line) and advance the paper (so it prints on fresh paper), you needed both CR and LF.

If you go to the various internet protocol documents, you'll see that they all specify CR+LF as the line termination sequence. So the real question is not "Why do CP/M, MS-DOS, and Win32 use CR+LF as the line terminator?" but rather "Why did other people choose to differ from these standards documents and use some other line terminator?"
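
To make the protocol point concrete, here is a minimal sketch that uses HTTP/1.1 as one example of a protocol whose header lines end in CR+LF. The function name build_request and the host example.com are placeholders made up for illustration; nothing is sent over the network.

```python
CRLF = "\r\n"  # 0x0D 0x0A, the line terminator used on the wire


def build_request(host: str, path: str = "/") -> bytes:
    """Assemble a bare-bones HTTP/1.1 GET request as raw bytes."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
    ]
    # Every line ends with CR+LF; one extra CR+LF (an empty line)
    # marks the end of the header block.
    return (CRLF.join(lines) + CRLF + CRLF).encode("ascii")


request = build_request("example.com")
```

Note that the terminator is spelled out as "\r\n" instead of relying on whatever the local platform considers a newline, so the bytes come out the same on every operating system.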

Unix adopted plain LF as the line termination sequence. If you look at the terminal settings, you'll see that the onlcr option specifies whether a LF should be changed into CR+LF on output. If you get this setting wrong, you get stairstep text, where

each
    line
        begins
            where the previous line left off.

So even unix, when left in raw mode, requires CR+LF to terminate lines. The implicit CR before LF is a unix invention, probably as an economy, since it saves one byte per line.
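
The stairstep effect is easy to reproduce with a toy model. The sketch below is a made-up simulation, not real terminal code: it assumes a device where LF only moves down a row and CR only moves back to column 0, and the onlcr parameter merely mimics what the real option does by expanding each LF into CR+LF before output.

```python
def render(data: str, onlcr: bool = False) -> list:
    """Render text on a simulated device where CR and LF are separate motions."""
    if onlcr:
        # Mimic the onlcr output option: turn every LF into CR+LF.
        data = data.replace("\n", "\r\n")
    rows = [[]]          # each row is a list of characters
    row = col = 0
    for ch in data:
        if ch == "\r":
            col = 0      # carriage return: back to column 0, same row
        elif ch == "\n":
            row += 1     # line feed: down one row, same column
            if row == len(rows):
                rows.append([])
        else:
            line = rows[row]
            line.extend(" " * (col - len(line)))  # pad out to the cursor
            if col < len(line):
                line[col] = ch                    # overprint existing cell
            else:
                line.append(ch)
            col += 1
    return ["".join(r) for r in rows]


stairstep = render("each\nline\nbegins")
fixed = render("each\nline\nbegins", onlcr=True)
```

Without the translation, each row starts in the column where the previous one ended; with it, every row starts at column 0.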

The unix ancestry of the C language carried this convention into the C language standard, which requires only "\n" (which encodes LF) to terminate lines, putting the burden on the runtime libraries to convert raw file data into logical lines.
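
The same division of labor can be seen outside C. As an analogy only (this is Python's io layer, not the C runtime), the sketch below writes logical lines ending in "\n" through a text wrapper configured to store CR+LF, then reads the raw bytes back through a wrapper that converts them to "\n" again. An in-memory buffer stands in for a file.

```python
import io

raw = io.BytesIO()

# newline="\r\n": every "\n" the program writes is stored as CR+LF,
# much like a text-mode stream on a CR+LF platform.
writer = io.TextIOWrapper(raw, encoding="ascii", newline="\r\n")
writer.write("first\nsecond\n")
writer.flush()
stored = raw.getvalue()          # the raw file data

# newline=None: on input, CR+LF, lone CR, and lone LF all become "\n",
# so the program only ever sees logical lines.
reader = io.TextIOWrapper(io.BytesIO(stored), encoding="ascii", newline=None)
logical = reader.read()
```

The program works with "\n" on both sides; only the library layer in between knows what the stored terminator looks like.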

The C language also introduced the term "newline" to express the concept of "generic line terminator". I'm told that the ASCII committee changed the name of character 0x0A to "newline" around 1996, so the confusion level has been raised even higher.

Here's another discussion of the subject, from a unix perspective.

Published Thursday, March 18, 2004 6:59 AM by oldnewthing
-----------------------------------------------------------
Such a trivial problem - I simply want to add a CR/LF pair to the end (or middle) of a string.

(Uh, CR/LF means Carriage Return/Linefeed.)

The codes for this are

       Hex   Decimal  Visual Basic       Delphi  C++ Builder  Java       Paradox
CR     0D    13       vbCR               #13     \r
LF     0A    10       vbLF               #10
CR/LF  0D0A  13,10    vbCRLF, vbNewLine  #13#10  \n           newLine()  \n
Tab    09    9        vbTab              #9      \t
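
For comparison, here is how the same codes might be spelled in Python, which is not one of the languages in the table. This is only a sketch; the helper name add_crlf is invented for illustration.

```python
CR = chr(13)     # 0x0D, same as "\r"
LF = chr(10)     # 0x0A, same as "\n"
TAB = chr(9)     # 0x09, same as "\t"
CRLF = CR + LF   # 0x0D 0x0A, same as "\r\n"


def add_crlf(text: str) -> str:
    """Append an explicit CR+LF pair, whatever the host platform uses."""
    return text + CRLF


# CR/LF at the end of a string, or in the middle of one.
line = add_crlf("first line") + "second line"
```

Building the pair from the two character codes keeps the result fixed at 0D 0A, which matters when the string is headed for a file format or protocol that requires exactly that sequence.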

Well, there is more - the line termination character sequence is operating system dependent. The C \n (when written to a stream opened in text mode), the Visual Basic vbNewLine, and the Java newLine() will be interpreted according to the following table.

Operating Environment  Sequence
MS DOS/MS Windows      CRLF
Unix                   LF
Macintosh              CR

As a result, moving "straight text" files between systems can cause pretty severe problems. (In Windows, DOS Edit and WinWord convert unix line terminators to DOS line terminators, but Notepad just shows the unix terminators as boxes.)
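
One way around the problem is to normalize the terminators when a file changes systems. The function below is a minimal sketch, and its name, convert_line_endings, is made up for illustration. It works on raw bytes so that no text-mode translation gets in the way, and it assumes the data is plain text in an ASCII-compatible encoding.

```python
def convert_line_endings(data: bytes, target: bytes = b"\n") -> bytes:
    """Rewrite CR+LF, lone CR, and lone LF terminators as `target`."""
    # Collapse CR+LF first so the pair is not counted as two terminators,
    # then fold any remaining lone CR into LF.
    unified = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    if target == b"\n":
        return unified
    return unified.replace(b"\n", target)


mixed = b"dos\r\nunix\nmac\rend"
as_unix = convert_line_endings(mixed)             # LF everywhere
as_dos = convert_line_endings(mixed, b"\r\n")     # CR+LF everywhere
```

The order of the replacements matters: handling the lone CR before the CR+LF pair would turn each DOS terminator into two line breaks.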
