Category: Java
2010-07-08 17:42:37
Hist:
100708, draft
The Java language is a general-purpose, concurrent, class-based, object-oriented language.
The Java language is strongly typed. It is a relatively high-level language and is normally compiled to the bytecoded instruction set and binary format defined in the Java Virtual Machine Specification.
The JLS does not describe reflection. Readers should be aware of this.
Context-Free Grammars: production, nonterminal, terminal, left-hand, and right-hand. Language and sentence.
The Lexical Grammar: token, identifier, keyword, literal, separator, operator
Syntactic Grammar.
Unicode
Java programs are written using the Unicode character set. The range of legal code points is now U+0000 to U+10FFFF. Characters whose code points are greater than U+FFFF are called supplementary characters. The Unicode standard defines an encoding called UTF-16, which uses pairs of 16-bit code units to represent supplementary characters. For characters in the range U+0000 to U+FFFF, the values of code points and UTF-16 code units are the same. Java represents text as sequences of 16-bit code units, using the UTF-16 encoding.
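The difference between code points and UTF-16 code units can be seen with a small sketch. The class and method names below are made up for illustration; only standard java.lang calls are used.

```java
final class CodePointDemo {
    // Number of Unicode code points in the text.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    // Number of UTF-16 code units; this is what String.length() reports.
    static int codeUnits(String s) {
        return s.length();
    }

    public static void main(String[] args) {
        // U+1D11E (musical symbol G clef) is a supplementary character.
        String clef = new String(Character.toChars(0x1D11E));
        System.out.println(codePoints(clef)); // 1
        System.out.println(codeUnits(clef));  // 2
        // The two code units form a surrogate pair.
        System.out.println(Character.isHighSurrogate(clef.charAt(0))); // true
        System.out.println(Character.isLowSurrogate(clef.charAt(1)));  // true
    }
}
```

For a character in the range U+0000 to U+FFFF, such as "A", both counts are 1.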
Lexical Translations
3 steps: 1) Translate the Unicode escapes '\uxxxx' in the raw stream of Unicode characters into the corresponding Unicode characters. 2) Translate the resulting stream into a stream of input characters and line terminators. 3) Translate that stream into a sequence of input elements; after white space and comments are discarded, the remaining tokens are the terminal symbols of the syntactic grammar.
Unicode Escapes
A Unicode escape is \u followed by four hex digits; it stands for the character with the indicated hex value. The character produced by a Unicode escape does not participate in further Unicode escapes.
The Java language specifies a standard way of transforming a program written in Unicode into ASCII, so that the program can be processed by ASCII-based tools.
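Because Unicode escapes are translated before the source is split into tokens, they may appear anywhere in a program, not only inside character and string literals. A minimal sketch (the class and member names are invented for illustration):

```java
final class UnicodeEscapeDemo {
    // The escape below stands for the letter a, so this declares a field named "a".
    static int \u0061 = 42;

    // Reads the field through its plain ASCII spelling.
    static int readA() {
        return a;
    }

    // U+0041 is the capital letter A.
    static char capitalA() {
        return '\u0041';
    }

    public static void main(String[] args) {
        System.out.println(readA());    // 42
        System.out.println(capitalA()); // A
    }
}
```

One consequence worth remembering: an escape is translated even inside a comment, so a malformed escape in a comment is still a compile-time error.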
Line Terminators
LF, CR, or CR LF
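As a rough sketch of this rule, the helper below counts line terminators in a piece of text, treating CR immediately followed by LF as a single terminator. The class name and the regular expression are my own; this is not how a compiler implements it.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LineTerminatorDemo {
    // CR LF is listed first so that the pair is matched as one terminator.
    private static final Pattern TERMINATOR = Pattern.compile("\r\n|\r|\n");

    static int countLineTerminators(String text) {
        Matcher m = TERMINATOR.matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countLineTerminators("a\r\nb\rc\nd")); // 3
        System.out.println(countLineTerminators("\n\r"));         // 2 (LF, then CR)
    }
}
```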
Input Elements and Tokens
White Space
SP, HT, FF, or LineTerminator
Comments
/* traditional comment */
// end-of-line comment
Comments do not nest.
/* and */ have no special meaning in comments that begin with //. // has no special meaning in comments that begin with /* or /**.
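A small sketch of what "comments do not nest" means in practice (names are illustrative only): the second /* inside a traditional comment is plain comment text, and the first */ ends the comment.

```java
final class CommentDemo {
    static int value() {
        // The comment below ends at the first closing marker, so "+ 1" is real code.
        return 1 /* outer /* not a nested comment */ + 1;
    }

    static String text() {
        // Inside a string literal the comment markers are ordinary characters.
        return "/* not a comment */ // also not a comment";
    }

    public static void main(String[] args) {
        System.out.println(value()); // 2
        System.out.println(text());
    }
}
```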
Identifiers
Keywords
One of:
abstract continue for new switch
assert default if package synchronized
boolean do goto private this
break double implements protected throw
byte else import public throws
case enum instanceof return transient
catch extends int short try
char final interface static void
class finally long strictfp volatile
const float native super while
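Note: true, false, and null are not in this list. They are not keywords but the boolean literals and the null literal; they still cannot be used as identifiers.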
Literals
A literal is the source code representation of a value of a primitive type, the String type, or the null type.
Integer literals: decimal, hexadecimal, or octal.
Decimal: 0, 123, 1234566666666666L
Hexadecimal: 0x12ab, 0x7fffffffffffffffL
Octal: 017777, 077777777777777L
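The three notations denote ordinary integer values; only the spelling differs. A short sketch with invented constant names:

```java
final class IntegerLiteralDemo {
    static final int DECIMAL = 123;
    static final int HEX = 0x12ab;    // 1*4096 + 2*256 + 10*16 + 11 = 4779
    static final int OCTAL = 017777;  // a leading 0 means octal: 8191
    // The L suffix makes the literal a long; this is the largest long value.
    static final long MAX_LONG_HEX = 0x7fffffffffffffffL;

    public static void main(String[] args) {
        System.out.println(HEX);                            // 4779
        System.out.println(OCTAL);                          // 8191
        System.out.println(MAX_LONG_HEX == Long.MAX_VALUE); // true
    }
}
```

The leading zero is easy to miss: 017777 is not seventeen thousand and something, it is 8191.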
Floating-point literals
Boolean literals
true, or false
Character literals
A character or an escape sequence enclosed in single quotes, for example 'a', '\t', or '\u0041'.
String literals
Zero or more characters enclosed in double quotes. Each string literal is a reference to an instance of class String. String objects have a constant value.
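Since every string literal refers to a String instance, literals with the same contents end up referring to the same instance, while a String created with new is a distinct object. The sketch below follows the well-known example in the JLS section on string literals; the class and method names are mine.

```java
final class StringLiteralDemo {
    static final String HELLO = "Hello";

    // Two literals with the same contents refer to the same instance.
    static boolean literalIsShared() {
        return HELLO == "Hello";
    }

    // A compile-time constant expression is treated like a literal.
    static boolean constantExpressionIsShared() {
        return HELLO == ("Hel" + "lo");
    }

    // An explicitly created String is a different object with equal contents.
    static boolean newStringIsDistinct() {
        String copy = new String("Hello");
        return HELLO != copy && HELLO.equals(copy);
    }

    // intern() returns the shared instance again.
    static boolean internRestoresSharing() {
        return HELLO == new String("Hello").intern();
    }
}
```

All four methods return true. In ordinary code, compare strings with equals rather than ==.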
Escape Sequences for characters and String literals
\b BS, \u0008
\t HT, \u0009
\n LF, \u000a
\f FF, \u000c
\r CR, \u000d
\" double quote, \u0022
\' single quote, \u0027
\\ backslash, \u005c
OctalEscape, such as \377, covering \u0000 to \u00ff
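The table can be checked with a short sketch. The values are compared as numbers rather than as Unicode escapes on purpose: a Unicode escape for LF or CR inside a character literal would be turned into a real line terminator before tokenization and would not compile. Constant names are illustrative.

```java
final class EscapeSequenceDemo {
    // Numeric value of the character produced by each escape sequence.
    static final int BS = '\b';           // U+0008
    static final int HT = '\t';           // U+0009
    static final int LF = '\n';           // U+000A
    static final int FF = '\f';           // U+000C
    static final int CR = '\r';           // U+000D
    static final int DOUBLE_QUOTE = '\"'; // U+0022
    static final int SINGLE_QUOTE = '\''; // U+0027
    static final int BACKSLASH = '\\';    // U+005C
    static final int OCTAL_MAX = '\377';  // U+00FF, the largest octal escape

    public static void main(String[] args) {
        System.out.println(LF);                             // 10
        System.out.println(Integer.toHexString(OCTAL_MAX)); // ff
    }
}
```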
The Null literal
null
Separators
( ) { } [ ] ; , .
Operators
= > < ! ~ ? :
== <= >= != && || ++ --
+ - * / & | ^ % << >> >>>
+= -= *= /= &= |= ^= %= <<= >>=
>>>=
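Most of these tokens are familiar from C; the less familiar ones are >>> and >>>=. As a side note going slightly beyond the lexical chapter, the sketch below contrasts the signed shift >> with the unsigned shift >>> (class and method names are invented).

```java
final class ShiftDemo {
    // Signed right shift: the sign bit is copied into the vacated positions.
    static int signedShift(int value, int distance) {
        return value >> distance;
    }

    // Unsigned right shift: zeros are shifted in from the left.
    static int unsignedShift(int value, int distance) {
        return value >>> distance;
    }

    public static void main(String[] args) {
        System.out.println(signedShift(-8, 1));   // -4
        System.out.println(unsignedShift(-8, 1)); // 2147483644
        int x = -1;
        x >>>= 28;                                // compound form of >>>
        System.out.println(x);                    // 15
    }
}
```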
1. The Java Language Specification (JLS), 3rd edition