Lucene Field - linxh - ChinaUnix Blog

Field(String name, byte[] value, Field.Store store)
Create a stored field with binary value.

Field(String name, Reader reader)
Create a tokenized and indexed field that is not stored.

Field(String name, Reader reader, Field.TermVector termVector)
Create a tokenized and indexed field that is not stored, optionally with storing term vectors.

Field(String name, String value, Field.Store store, Field.Index index)
Create a field by specifying its name, value and how it will be saved in the index.

Field(String name, String value, Field.Store store, Field.Index index, Field.TermVector termVector)
Create a field by specifying its name, value and how it will be saved in the index. Term vectors are stored as specified by termVector.

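Where:

Field.Store means "whether to store": whether the content of this Field is kept verbatim in the index.

Field.Index means "whether to index": whether the data in this Field should be searchable later. A Field that is not indexed usually only serves to store auxiliary information.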

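Field.TermVector means "whether to store term vectors": whether the terms of this Field, and optionally their positions and offsets, are recorded per document. Whether the data is tokenized is controlled by Field.Index (TOKENIZED or UN_TOKENIZED), not by Field.TermVector.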

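A Reader argument generally means the data comes from a text stream, and the amount of data is usually large. Values such as URLs, file system paths, dates and times, personal names, national ID numbers, and phone numbers are usually indexed and stored in full in the index, but generally do not need to be tokenized. For these, use the fourth constructor above, with Field.Store.YES and Field.Index.UN_TOKENIZED as the third and fourth arguments. Long text can usually use the third constructor, which takes a Reader. Quoted from [http://blog.csdn.net/colasnail/archive/2007/03/21/1536417.aspx]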
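The advice above can be sketched roughly as follows. This assumes the Lucene 2.x API with lucene-core on the classpath, where Field.Index.UN_TOKENIZED is the constant for "index without tokenizing". The class name FieldSketch and the field names "url" and "body" are made up for illustration.

```java
import java.io.Reader;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

// Hypothetical helper, not part of Lucene: builds one Document with two fields.
final class FieldSketch {
    static Document build(String url, Reader body) {
        Document doc = new Document();
        // Short identifier-like value: stored verbatim, indexed as a single term.
        doc.add(new Field("url", url, Field.Store.YES, Field.Index.UN_TOKENIZED));
        // Long text from a stream: tokenized and indexed, never stored.
        doc.add(new Field("body", body, Field.TermVector.NO));
        return doc;
    }
}
```

With a document built this way, doc.get("url") should return the original string, while the body text can be searched but not read back from the index.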

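1. Versions before 2.0

Keyword: The Field value is stored in the index and indexed, without tokenization. Corresponds to Field.Store.YES, Field.Index.UN_TOKENIZED
UnIndexed: The Field value is stored in the index but not indexed, so documents cannot be found by searching this Field. Corresponds to Field.Store.YES, Field.Index.NO
UnStored: The Field value is not stored in the index; it is tokenized and then indexed. Corresponds to Field.Store.NO, Field.Index.TOKENIZED
Text: The Field value is tokenized and then indexed. A String value is stored; a Reader value is not. For a String value this corresponds to Field.Store.YES, Field.Index.TOKENIZED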
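The mapping above can be written down as a small lookup table. The sketch below is a standalone model, not Lucene code: the enums only mirror the names of Field.Store and Field.Index, and LegacyFieldMapping, Options and lookup are hypothetical names chosen for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Standalone model of the pre-2.0 to 2.0 mapping; does not depend on Lucene.
final class LegacyFieldMapping {
    enum Store { YES, NO }
    enum Index { NO, TOKENIZED, UN_TOKENIZED }

    // A (Store, Index) pair as listed for each pre-2.0 field type.
    static final class Options {
        final Store store;
        final Index index;
        Options(Store store, Index index) { this.store = store; this.index = index; }
        @Override public String toString() {
            return "Field.Store." + store + ", Field.Index." + index;
        }
    }

    private static final Map<String, Options> TABLE = new LinkedHashMap<>();
    static {
        TABLE.put("Keyword",   new Options(Store.YES, Index.UN_TOKENIZED));
        TABLE.put("UnIndexed", new Options(Store.YES, Index.NO));
        TABLE.put("UnStored",  new Options(Store.NO,  Index.TOKENIZED));
        // Text with a String value; a Reader value would not be stored.
        TABLE.put("Text",      new Options(Store.YES, Index.TOKENIZED));
    }

    // Returns the option pair for a pre-2.0 type name, or throws if unknown.
    static Options lookup(String legacyName) {
        Options o = TABLE.get(legacyName);
        if (o == null) {
            throw new IllegalArgumentException("unknown legacy type: " + legacyName);
        }
        return o;
    }
}
```

For example, LegacyFieldMapping.lookup("UnIndexed") yields the pair that prints as "Field.Store.YES, Field.Index.NO".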

2. Version 2.0

Combinations of several inner classes are used to specify the exact type of a Field.

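Store

- COMPRESS: store in compressed form. Used for long text or binary data
- YES: store
- NO: do not store

Index

- NO: do not index
- TOKENIZED: tokenize, then index
- UN_TOKENIZED: index without tokenizing
- NO_NORMS: index without tokenizing, and do not store norms for the Field. Norms normally take one byte per indexed Field per document, so this saves space, at the cost of disabling index-time boosting and length normalization for the Field

TermVector

- NO: do not store term vectors
- YES: store term vectors
- WITH_POSITIONS: store term vectors, including token position information
- WITH_OFFSETS: store term vectors, including token offset information
- WITH_POSITIONS_OFFSETS: store term vectors, including both token position and token offset information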
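
To make the difference between positions and offsets concrete, here is a minimal standalone sketch. It does not use Lucene: it splits on whitespace only, whereas in Lucene the configured Analyzer decides what the tokens are. TermVectorSketch, Occurrence and build are hypothetical names. The number of occurrences per term is the frequency kept by a plain term vector, position is what WITH_POSITIONS adds, and startOffset/endOffset are what WITH_OFFSETS adds.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Standalone illustration of term frequency, positions and offsets.
final class TermVectorSketch {
    // One occurrence of a term in the text.
    static final class Occurrence {
        final int position;    // 0-based index of the token in the token sequence
        final int startOffset; // index of the token's first character
        final int endOffset;   // index one past the token's last character
        Occurrence(int position, int startOffset, int endOffset) {
            this.position = position;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
        }
    }

    // Splits text on whitespace and groups the occurrences by term.
    static Map<String, List<Occurrence>> build(String text) {
        Map<String, List<Occurrence>> vector = new LinkedHashMap<>();
        int position = 0;
        int i = 0;
        while (i < text.length()) {
            if (Character.isWhitespace(text.charAt(i))) {
                i++;
                continue;
            }
            int start = i;
            while (i < text.length() && !Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            String term = text.substring(start, i);
            List<Occurrence> occurrences = vector.get(term);
            if (occurrences == null) {
                occurrences = new ArrayList<>();
                vector.put(term, occurrences);
            }
            occurrences.add(new Occurrence(position++, start, i));
        }
        return vector;
    }
}
```

For the text "to be or not to be", the term "to" occurs twice: once at position 0 with offsets 0 to 2, and once at position 4 with offsets 13 to 15.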