Neo4j::Core Lucene

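Dictionary definition: Lucene is an excellent open-source full-text search engine on top of which all kinds of full-text search applications can be built. Lucene is well known internationally and is now a top-level Apache project.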

Neo4j ships with the Lucene document database built in. A common use case is to find a node with Lucene and then traverse from it or run a Cypher query from it.

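Defining an Index

Lucene indexing is wired in through a beforeCommit hook.
The Neo4j::Node.trigger_on method tells neo4j.rb which nodes or relationships to index.
The Neo4j::Node.index method declares which properties to index and which type of index to use.

Example: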

Neo4j::Node.trigger_on(:typex => 'MyTypeX')
Neo4j::Node.index :name
Neo4j::Node.index :description, :type => :fulltext
Neo4j::Node.index :age, :field_type => Fixnum

Note: Only properties that have been set are added to the index. A property with no value set and no default value will therefore NOT be matched by a wildcard query (*)!

:trigger_on

The declaration above tells neo4j.rb to index only nodes that have a typex property with the value MyTypeX. When the before-commit hook finds a node with that property, it indexes the node's properties as declared with Neo4j::Node.index. You can declare several different trigger properties, each with several different values.
This is used, for example, when the same property (_classname) can be triggered by different values (the name of each subclass); see the use of trigger_on in the Neo4j::NodeMixin implementation.
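
The trigger rule described above can be modelled in a few lines of plain Ruby. This is only an illustrative sketch, not neo4j.rb's implementation: the TRIGGERS table and the triggered? helper are hypothetical names, the Person and Employee class names are made up, and nodes are represented by plain hashes.

```ruby
# Illustrative model only: TRIGGERS and triggered? are hypothetical names,
# not part of neo4j.rb. Each property maps to the values that trigger indexing.
TRIGGERS = {
  typex: ['MyTypeX'],
  _classname: ['Person', 'Employee'] # one property, several triggering values
}.freeze

# Returns true when at least one property of the node matches a trigger value.
def triggered?(props)
  TRIGGERS.any? { |key, values| values.include?(props[key]) }
end

puts triggered?(typex: 'MyTypeX', name: 'andreas') # true
puts triggered?(_classname: 'Employee')            # true
puts triggered?(name: 'andreas')                   # false
```

A node that carries none of the trigger properties is simply ignored by the hook in this model, which matches the behaviour the section describes.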

:type

The default index type is exact. The example above declares different index types on the name and description properties.

:field_type

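By default, indexed values are stored as Lucene strings. If you need to index a value as a number, set :field_type to Fixnum or Float. That makes Lucene range searches possible.

Note: If you use Neo4j::NodeMixin or Neo4j::Rails::Model, you can declare an index with property.
Note: You cannot search a non-String field_type with a query string. You must use a hash query instead, e.g. Neo4j::Node.find(:age => 4).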
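
Why a numeric :field_type matters for range search can be seen without Lucene at all: values stored as strings compare character by character, not by magnitude. The snippet below is plain Ruby and only illustrates that ordering problem; string_range and numeric_range are hypothetical helpers and do not touch neo4j.rb.

```ruby
AGES = [4, 10, 25, 9].freeze

# Range filter over values stored as strings (lexicographic comparison).
def string_range(values, low, high)
  values.map(&:to_s).select { |s| s >= low.to_s && s <= high.to_s }.sort
end

# Range filter over values stored as numbers.
def numeric_range(values, low, high)
  values.select { |n| (low..high).cover?(n) }.sort
end

puts string_range(AGES, 2, 5).inspect  # ["25", "4"]
puts numeric_range(AGES, 2, 5).inspect # [4]
```

The string version wrongly includes "25" because it sorts between "2" and "5", which is the kind of surprise a numeric field type avoids.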

Custom Index

Besides declaring indexes directly with Neo4j::Node.index or Neo4j::Relationship.index, you can define a custom index class. This gives you separate Lucene index files and separate configuration.

Example:

class MyIndex
  extend Neo4j::Core::Index::ClassMethods
  include Neo4j::Core::Index
  self.node_indexer do
    index_names :exact => 'myindex_exact', :fulltext => 'myindex_fulltext'
    trigger_on :myindex => true # trigger on all nodes having property myindex == true
  end
  index :name
end

This class can then be used for searching, e.g. MyIndex.find(:name => 'andreas').first

To index relationships instead, use the rel_indexer method in place of node_indexer.

Reference: neo4j-creating-custom-index blog

How to Search

Lucene Query Language

You can use the Lucene query language.

Suppose you have declared an index on the Neo4j::Node class as follows:

Neo4j::Node.trigger_on(:typex => 'MyTypeX')
Neo4j::Node.index(:name)

Then you create a node:

a = Neo4j::Node.new(:name => 'andreas', :typex => 'MyTypeX')

You can now search with the Lucene query syntax:

Neo4j::Node.find('name: andreas') {|result| puts result.first[:name] }
# or, equivalently, if you want to close the lucene connection yourself
result = Neo4j::Node.find('name: andreas')
puts result.first[:name]
result.close

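Note: Always close the Lucene connection. Either pass a block (close is then called automatically) or call close on the query result.
Note: Neo4j::Rails::Model.find and Neo4j::Rails::Relationship.find close the Lucene connection automatically by using Rack.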
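
The two ways of releasing the connection follow a common Ruby idiom: yield the resource to a block and close it in an ensure clause, or return it and leave closing to the caller. The sketch below shows that idiom with a stand-in class. FakeHits and find_hits are hypothetical and do not reproduce neo4j.rb internals; in particular, returning the hits from the block form is done here only so the example can inspect the closed state.

```ruby
# Stand-in for a Lucene result set. FakeHits and find_hits are hypothetical.
class FakeHits
  include Enumerable

  def initialize(items)
    @items = items
    @closed = false
  end

  def each(&block)
    @items.each(&block)
  end

  def close
    @closed = true
  end

  def closed?
    @closed
  end
end

# With a block: yield the hits and always close them afterwards.
# Without a block: return the hits; the caller must call close.
def find_hits(items)
  hits = FakeHits.new(items)
  return hits unless block_given?

  begin
    yield hits
  ensure
    hits.close # runs even if the block raises
  end
  hits
end

auto = find_hits(['andreas']) { |result| puts result.first }
puts auto.closed?   # true

manual = find_hits(['andreas'])
puts manual.closed? # false
manual.close
puts manual.closed? # true
```

The block form is usually the safer choice because the ensure clause closes the result even when the block raises an exception.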

Search in Property Arrays

A Neo4j property value can be an array, and Lucene supports searching array values as well.

Example:

# don't forget to declare an index on things: Neo4j::Node.index :things
Neo4j::Transaction.new do
  Neo4j::Node.new(:things => ['aaa', 'bbb', 'ccc'])
end
Neo4j::Node.find("things: bbb") { |r| puts r.first }

You can also pass an array as the search condition. Here we search for nodes whose :things contains bbb or ccc.

Neo4j::Node.find(:things => ["bbb", "ccc"]){|r| puts r.first}

This query makes Lucene combine the values with OR.

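Fulltext and Exact

The default index type is :exact, which is also the most commonly used. If you want a search to match individual words in the indexed text, use the fulltext index type. Its analyzer splits the text into words on whitespace. To declare it, add :type => :fulltext when you create the index (the default is :exact).

Example: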

MyIndex.index :name, :type => :fulltext
MyIndex.find('name: andreas', :type => :fulltext).first #=> andreas

Note: If the index is not an :exact index, you must pass :type to the find method.
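
The difference between the two index types can be illustrated with a toy inverted index in plain Ruby. The sketch assumes only the behaviour described above (exact keeps the whole value as one term, fulltext splits on whitespace); ToyIndex is a hypothetical class and says nothing about how Lucene analyzers are really configured.

```ruby
# Toy inverted index. ToyIndex is a hypothetical name used for illustration.
class ToyIndex
  def initialize(type)
    @type = type
    @terms = Hash.new { |hash, term| hash[term] = [] }
  end

  # :exact stores the whole value as a single term;
  # :fulltext stores one term per whitespace-separated word.
  def add(id, value)
    terms_for(value).each { |term| @terms[term] << id }
  end

  # Returns the ids stored under the given term, or an empty array.
  def find(term)
    @terms.fetch(term, [])
  end

  private

  def terms_for(value)
    @type == :fulltext ? value.split : [value]
  end
end

exact = ToyIndex.new(:exact)
fulltext = ToyIndex.new(:fulltext)
[exact, fulltext].each { |index| index.add(1, 'Kalle Kula') }

puts exact.find('Kalle').inspect      # []
puts exact.find('Kalle Kula').inspect # [1]
puts fulltext.find('Kalle').inspect   # [1]
```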

Hash Queries

When you use a hash as the search condition, you can pass several property values at once; Lucene then combines them with AND.

MyIndex.find(:name => 'asd', :wheels => 8)
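
How a hash condition might translate into a Lucene query string can be sketched as follows. The to_lucene helper is hypothetical and only mimics the rules stated in this post (AND between keys, OR between the elements of an array value); it is not the query builder neo4j.rb actually uses, and it does no escaping of special characters.

```ruby
# Hypothetical helper: builds a Lucene query string from a hash.
def to_lucene(conditions)
  clauses = conditions.map do |field, value|
    if value.is_a?(Array)
      "#{field}:(#{value.join(' OR ')})" # array elements are OR-ed
    else
      "#{field}:#{value}"
    end
  end
  clauses.join(' AND ') # separate keys are AND-ed
end

puts to_lucene(name: 'asd', wheels: 8) # name:asd AND wheels:8
puts to_lucene(things: %w[bbb ccc])    # things:(bbb OR ccc)
```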

Compound Queries

You can build a compound query with the and, or and not methods.

MyIndex.find(:name => 'asd').or(:wheels => 8).first.should == thing3

Range Search

For a numeric range search you must first declare the index's field_type as Fixnum or Float.
For Neo4j::Rails::Model and Neo4j::NodeMixin, you can set :type on the property declaration.

There are two ways to run a range search: the between method or a Ruby Range.

Example using between:

MyIndex.find(:age).between(2, 5)

Example using a Ruby Range:

MyIndex.find(:name => 'thing').and(:wheels => (9..15)).should be_empty

Note: A range query on a non-String field (e.g. :field_type => Fixnum) is not possible with a Lucene query string; use a hash query instead, as shown above.

Sorting

MyIndex.find('name: *@gmail.com').asc(:name).desc(:age)

Note: The fields you sort on must be indexed.
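
Chaining asc(:name).desc(:age) presumably amounts to a multi-key sort: ascending by name first, then descending by age among equal names. A plain-Ruby sketch over hashes shows that ordering; the sample records and the sort_people helper are invented for illustration and are not part of neo4j.rb.

```ruby
# Sample records are invented for illustration.
PEOPLE = [
  { name: 'bob@gmail.com', age: 30 },
  { name: 'amy@gmail.com', age: 25 },
  { name: 'amy@gmail.com', age: 41 }
].freeze

# Ascending by name, then descending by age (negating the number flips order).
def sort_people(people)
  people.sort_by { |person| [person[:name], -person[:age]] }
end

sort_people(PEOPLE).each { |person| puts "#{person[:name]} #{person[:age]}" }
# amy@gmail.com 41
# amy@gmail.com 25
# bob@gmail.com 30
```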

Manual Indexing

Instead of waiting for the transaction to finish, you can index a node or relationship manually.

Example (from RSpec):

new_node = Neo4j::Node.new
new_node[:name] = 'Kalle Kula'
new_node.add_index(:name)
new_node.rm_index(:name)
new_node[:name] = 'lala'
new_node.add_index(:name)
Neo4j::Node.find('name: lala').first.should == new_node
Neo4j::Node.find('name: "Kalle Kula"').first.should_not == new_node

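Optimization

If you are searching across a large number of nodes, you can get the best performance by skipping the Ruby wrappers and working with the Java nodes directly.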

MyIndex.find('name: andreas', :type => :fulltext, :wrapped => false)

With :wrapped => false, the find method returns an org.neo4j.graphdb.index.IndexHits instance.
(It behaves much like a Ruby Enumerable: you can call each, collect and so on.)

Lucene Configuration

You can create your own Lucene configuration.

For an example, see the configuration for fulltext and exact indexing in Neo4j::Config[:lucene].

You can add your own lucene indexing configuration in the Neo4j::Config and use it with the index keyword.

Neo4j::Config[:lucene][:my_index_type] = ...
class Person
  index :name, :type => :my_index_type
end

I have not tested this :-)

Gotchas

Nil values are never indexed!
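
A toy model makes the consequence concrete: a node whose property is nil or unset has no entry in the index for that field, so even a match-everything query on the field cannot return it. NilSkippingIndex is a hypothetical class written for this illustration and is not how neo4j.rb stores its indexes.

```ruby
# Hypothetical in-memory index that, like the behaviour described above,
# skips properties whose value is nil.
class NilSkippingIndex
  def initialize(field)
    @field = field
    @entries = {} # node id => indexed value
  end

  def add(id, props)
    value = props[@field]
    @entries[id] = value unless value.nil?
  end

  # Rough equivalent of a wildcard query such as "name: *".
  def all_ids
    @entries.keys
  end
end

index = NilSkippingIndex.new(:name)
index.add(1, name: 'andreas')
index.add(2, name: nil)
index.add(3, age: 4) # :name was never set

puts index.all_ids.inspect # [1]
```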
