Add the following file to the main directory of the code:
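<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
    <comment>IK Analyzer extension configuration</comment>
    <entry key="ext_dict">extwords.dic</entry>
    <entry key="ext_stopwords">stopword.dic</entry>
</properties>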
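This tells IK to load stopword.dic from the current directory as the extension stop-word list, and extwords.dic from the current directory as the extension dictionary.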
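For reference, here is a minimal sketch of producing the two dictionary files from Java. It assumes the format the IK Analyzer documentation describes for custom dictionaries: plain text, one entry per line, UTF-8 without a BOM. The class name DictFileWriter and the sample words are made up for illustration, and the demo writes to a temp directory rather than the real project directory.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of IK Analyzer.
class DictFileWriter {

    // Writes one entry per line, encoded as UTF-8; Files.write does not emit a BOM.
    static void writeDict(Path file, List<String> words) throws IOException {
        Files.write(file, words, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // A temp directory keeps the demo from touching a real project.
        Path dir = Files.createTempDirectory("ik-dict-demo");
        writeDict(dir.resolve("extwords.dic"), Arrays.asList("云计算", "分词器")); // sample extension words
        writeDict(dir.resolve("stopword.dic"), Arrays.asList("的", "了"));         // sample stop words
        System.out.println("wrote dictionaries to " + dir);
    }
}
```

In a real project the two files would go wherever the configuration above points, so that IK can find them when it starts.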
The IK code is as follows; the extension dictionaries are initialized automatically when new IKSegmenter is executed:
// Requires java.io.StringReader plus IKSegmenter and Lexeme
// (org.wltea.analyzer.core in IK Analyzer 2012).

// str is the text to segment; true selects smart segmentation
StringReader reader = new StringReader(str);
IKSegmenter ik = new IKSegmenter(reader, true);
Lexeme lexeme = null;
int pos = 0;
String wordName = null;

//System.out.println("into makeTagReal:" + str);

inital();  // the author's own helper, not shown in this post

String splitLine = "";
try {
    // next() returns null once the input is exhausted
    while ((lexeme = ik.next()) != null) {
        wordName = lexeme.getLexemeText();
        pos = lexeme.getBeginPosition();
        //System.out.println("wordName: " + wordName + " pos: " + pos);
        splitLine += wordName;
        splitLine += " ";
    }
} catch (Exception e) {
    e.printStackTrace();
}
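The loop above builds splitLine by repeated string concatenation and leaves a trailing space. Here is a rough sketch of the same read-until-null pattern with a StringBuilder. To keep it runnable without the IK jar, TokenSource is a hypothetical stand-in for IKSegmenter (its next() returns null at the end, like ik.next()), and TokenJoiner is an illustrative name, not part of IK.

```java
import java.util.Arrays;
import java.util.Iterator;

class TokenJoiner {

    // Hypothetical stand-in for IKSegmenter: next() returns null when the input is exhausted.
    interface TokenSource {
        String next();
    }

    // Joins tokens with single spaces and, unlike the loop above, adds no trailing space.
    static String join(TokenSource source) {
        StringBuilder line = new StringBuilder();
        String token;
        while ((token = source.next()) != null) {
            if (line.length() > 0) {
                line.append(' ');
            }
            line.append(token);
        }
        return line.toString();
    }

    public static void main(String[] args) {
        Iterator<String> it = Arrays.asList("IK", "Analyzer", "demo").iterator();
        TokenSource fake = () -> it.hasNext() ? it.next() : null;
        System.out.println(join(fake)); // prints "IK Analyzer demo"
    }
}
```

With the real segmenter, the lambda would wrap ik.next() and return lexeme.getLexemeText() for each non-null Lexeme.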