Installing the IK Analyzer for Elasticsearch

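There are roughly two ways to install it:

1. Build it yourself.
2. Use the elasticsearch-rtf distribution and install from files that someone else has already compiled.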

Environment:

The ES version I downloaded is elasticsearch-1.7.4.tar.gz. The IK version I use is the elasticsearch-analysis-ik-1.2.6.jar file obtained by unpacking elasticsearch-rtf-1.0.0.zip.

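Method 1: Build it yourself

The steps for building it yourself are as follows:

1. Download the elasticsearch-analysis-ik-x.x.x.zip archive from https://github.com/medcl/elasticsearch-analysis-ik.
2. Unpack elasticsearch-analysis-ik-x.x.x.zip, then enter the elasticsearch-analysis-ik-x.x.x directory.
3. Package it with Maven to produce elasticsearch-analysis-ik-x.x.x.jar (I do not know how to package with Maven, so I did not use this method).
4. Enter the elasticsearch-1.7.4/plugins directory, create a directory named analysis-ik, and put the elasticsearch-analysis-ik-x.x.x.jar you built into it.
5. Copy the ik directory from the config directory of the unpacked elasticsearch-analysis-ik-x.x.x.zip into elasticsearch-1.7.4/config.
6. Edit elasticsearch.yml in elasticsearch-1.7.4/config and append the following to the end of the file: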

index:
  analysis:
    analyzer:
      ik:
        alias: [ik_analyzer]
        type: org.elasticsearch.index.analysis.IkAnalyzerProvider
      ik_max_word:
        type: ik
        use_smart: false
      ik_smart:
        type: ik
        use_smart: true

Or use the simpler configuration:

index.analysis.analyzer.ik.type : "ik"

7. Restart ES.
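The nested YAML block and the one-line dotted form above are two spellings of the same kind of setting key: each level of nesting becomes one dot-separated segment. The following is a minimal Python sketch of that mapping. It uses a plain dict to stand in for the parsed YAML, and the `flatten` helper is hypothetical, written only for illustration; it is not part of Elasticsearch.

```python
def flatten(settings, prefix=""):
    """Turn nested settings into dotted keys, one per leaf value."""
    flat = {}
    for key, value in settings.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


# A trimmed copy of the nested configuration, written as a Python dict.
nested = {
    "index": {
        "analysis": {
            "analyzer": {
                "ik_max_word": {"type": "ik", "use_smart": False},
                "ik_smart": {"type": "ik", "use_smart": True},
            }
        }
    }
}

flat = flatten(nested)
for key in sorted(flat):
    print(key, "=", flat[key])
```

Each printed key has the same shape as `index.analysis.analyzer.ik.type`, which suggests the one-line form is just a single leaf of the nested tree written out in full.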

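Method 2: Use the elasticsearch-rtf distribution and install from files someone else has compiled

This is the method I used. I have not tested the first method, so I cannot vouch for it yet; the second method is described in more detail. The steps are:

1. Download the RTF distribution of ES from https://github.com/medcl/elasticsearch-rtf/releases. I downloaded elasticsearch-rtf-1.0.0.zip.
2. Unpack elasticsearch-rtf-1.0.0.zip.
3. Copy the elasticsearch-rtf-1.0.0/plugins/analysis-ik directory into elasticsearch-1.7.4/plugins, giving elasticsearch-1.7.4/plugins/analysis-ik.
4. Copy elasticsearch-rtf-1.0.0/config/ik into elasticsearch-1.7.4/config/, giving elasticsearch-1.7.4/config/ik.
5. Edit elasticsearch-1.7.4/config/elasticsearch.yml and append the following to the end of the file: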

index:
  analysis:
    analyzer:
      ik:
        alias: [ik_analyzer]
        type: org.elasticsearch.index.analysis.IkAnalyzerProvider
      ik_max_word:
        type: ik
        use_smart: false
      ik_smart:
        type: ik
        use_smart: true

Or use the simpler configuration:

index.analysis.analyzer.ik.type : "ik"

6. Restart ES.
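The two copy steps above can be scripted. This is a rough Python sketch, assuming both archives have already been unpacked; the `install_ik` function name is made up for this example, and the directory names passed to it must be adjusted to wherever the archives were actually extracted.

```python
import shutil
from pathlib import Path


def install_ik(rtf_home, es_home):
    """Copy the IK plugin and its dictionaries from an unpacked RTF tree into ES."""
    rtf_home, es_home = Path(rtf_home), Path(es_home)
    copied = []
    for relative in ("plugins/analysis-ik", "config/ik"):
        source = rtf_home / relative
        target = es_home / relative
        if not source.is_dir():
            raise FileNotFoundError(f"missing directory: {source}")
        # dirs_exist_ok allows re-running over an existing install (Python 3.8+).
        shutil.copytree(source, target, dirs_exist_ok=True)
        copied.append(target)
    return copied
```

With the directory names from this article, the call would look like `install_ik("elasticsearch-rtf-1.0.0", "elasticsearch-1.7.4")`. The configuration edit and the restart still have to be done by hand.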

Testing (I ran these tests with the second installation method):

Create the index:

curl -XPUT http://localhost:9200/index

Create the mapping:

curl -XPOST http://localhost:9200/index/fulltext/_mapping -d'
{
    "fulltext": {
        "_all": {
            "analyzer": "ik_max_word",
            "search_analyzer": "ik_max_word",
            "term_vector": "no",
            "store": "false"
        },
        "properties": {
            "content": {
                "type": "string",
                "store": "no",
                "term_vector": "with_positions_offsets",
                "analyzer": "ik_max_word",
                "search_analyzer": "ik_max_word",
                "include_in_all": "true",
                "boost": 8
            }
        }
    }
}'

Add some documents to the index:

curl -XPOST http://localhost:9200/index/fulltext/1 -d'
{"content":"美国留给伊拉克的是个烂摊子吗"}
'
curl -XPOST http://localhost:9200/index/fulltext/2 -d'
{"content":"公安部：各地校车将享最高路权"}
'
curl -XPOST http://localhost:9200/index/fulltext/3 -d'
{"content":"中韩渔警冲突调查：韩警平均每天扣1艘中国渔船"}
'
curl -XPOST http://localhost:9200/index/fulltext/4 -d'
{"content":"中国驻洛杉矶领事馆遭亚裔男子枪击 嫌犯已自首"}
'

Run a query with highlighting:

curl -XPOST http://localhost:9200/index/fulltext/_search  -d'
{
    "query" : { "term" : { "content" : "中国" }},
    "highlight" : {
        "pre_tags" : ["<tag1>", "<tag2>"],
        "post_tags" : ["</tag1>", "</tag2>"],
        "fields" : {
            "content" : {}
        }
    }
}
'

Query result:

{"took":31,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":2,"max_score":0.61370564,"hits":[{"_index":"index","_type":"fulltext","_id":"4","_score":0.61370564,"_source":
{"content":"中国驻洛杉矶领事馆遭亚裔男子枪击 嫌犯已自首"}
,"highlight":{"content":["<tag1>中国</tag1>驻洛杉矶领事馆遭亚裔男子枪击 嫌犯已自首"]}},{"_index":"index","_type":"fulltext","_id":"3","_score":0.61370564,"_source":
{"content":"中韩渔警冲突调查：韩警平均每天扣1艘中国渔船"}
,"highlight":{"content":["中韩渔警冲突调查：韩警平均每天扣1艘<tag1>中国</tag1>渔船"]}}]}}
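The raw response is hard to read on one line. As an illustration, here is a small Python sketch that pulls the document id and the highlighted fragments out of a response of this shape. The `highlights` helper is hypothetical, and the sample is a shortened copy of the response above with a single hit kept.

```python
import json


def highlights(response_text, field="content"):
    """Return (document id, highlighted fragments) pairs from a search response."""
    response = json.loads(response_text)
    pairs = []
    for hit in response["hits"]["hits"]:
        # A hit without a highlight for this field yields an empty list.
        fragments = hit.get("highlight", {}).get(field, [])
        pairs.append((hit["_id"], fragments))
    return pairs


# Shortened version of the response shown above (one hit only).
sample = json.dumps({
    "hits": {
        "total": 1,
        "hits": [{
            "_index": "index",
            "_type": "fulltext",
            "_id": "4",
            "_source": {"content": "中国驻洛杉矶领事馆遭亚裔男子枪击 嫌犯已自首"},
            "highlight": {"content": ["<tag1>中国</tag1>驻洛杉矶领事馆遭亚裔男子枪击 嫌犯已自首"]},
        }],
    }
})

for doc_id, fragments in highlights(sample):
    print(doc_id, fragments)
```

In the response above, the matched term 中国 is wrapped in the first tag pair from `pre_tags` and `post_tags`.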

Alternatively, you can test directly from the browser address bar: http://localhost:9200/index/_analyze?analyzer=ik&pretty=true&text=%E6%88%91%E6%98%AF%E4%B8%AD%E5%9B%BD%E4%BA%BA
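The `text` parameter in that URL is the percent-encoded UTF-8 form of 我是中国人 ("I am Chinese"). To try other sentences, a URL of the same shape can be built with Python's standard library. This is only a sketch: the `analyze_url` helper is made up for this example, and the host and index name are the ones assumed throughout this article.

```python
from urllib.parse import quote, urlencode


def analyze_url(text, analyzer="ik", index="index", host="http://localhost:9200"):
    """Build an _analyze URL of the form shown above for an arbitrary piece of text."""
    query = urlencode(
        {"analyzer": analyzer, "pretty": "true", "text": text},
        quote_via=quote,  # percent-encode as UTF-8, with spaces as %20
    )
    return f"{host}/{index}/_analyze?{query}"


print(analyze_url("我是中国人"))
```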

Note: if your ES and IK versions do not match, you may see an error like this:

{"error":"IndexCreationException[[index] failed to create index]; nested: ElasticsearchIllegalArgumentException[failed to find analyzer type [ik] or tokenizer for [ik_max_word]]; nested: NoClassSettingsException[Failed to load class setting [type] with value [ik]]; nested: ClassNotFoundException[org.elasticsearch.index.analysis.ik.IkAnalyzerProvider]; ","status":400}
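The useful part of that message is the innermost `ClassNotFoundException`, which names the class ES could not load from the plugin jar. Below is a minimal Python sketch for extracting it, assuming the error body has the shape shown above, where `error` is a plain string. The `missing_analysis_class` helper is hypothetical, and the sample is a shortened copy of the error above.

```python
import json


def missing_analysis_class(error_body):
    """Return the class named in a ClassNotFoundException, or None if there is none."""
    message = json.loads(error_body).get("error", "")
    if not isinstance(message, str):
        return None  # assumption: only the string-style error shown above is handled
    marker = "ClassNotFoundException["
    start = message.find(marker)
    if start == -1:
        return None
    start += len(marker)
    end = message.find("]", start)
    return message[start:end] if end != -1 else message[start:]


# Shortened copy of the error shown above.
sample = (
    '{"error":"IndexCreationException[[index] failed to create index]; '
    'nested: ClassNotFoundException[org.elasticsearch.index.analysis.ik.IkAnalyzerProvider]; ",'
    '"status":400}'
)

print(missing_analysis_class(sample))
```

If the printed class name differs from the `type` written in elasticsearch.yml, or is not present in the jar under plugins/analysis-ik, that points to the version mismatch described above.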

References:

http://samchu.logdown.com/posts/277928-elasticsearch-chinese-word-segmentation
https://github.com/medcl/elasticsearch-analysis-ik
