First, install Java and Elasticsearch; see the earlier articles for those steps.
1. Install Maven
# wget -c http://mirror.cc.columbia.edu/pub/software/apache/maven/maven-3/3.3.3/binaries/apache-maven-3.3.3-bin.tar.gz
# tar zxvf apache-maven-3.3.3-bin.tar.gz -C /usr/local/
# cd /usr/local/
# ln -s apache-maven-3.3.3 maven
# vi /etc/profile
export M2_HOME=/usr/local/maven
export PATH=${M2_HOME}/bin:${PATH}
# . /etc/profile
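A quick way to check the Maven setup is to test whether Maven's bin directory actually ended up on PATH. The following is only a sketch: it assumes Maven is linked at /usr/local/maven (the symlink created by the `ln -s` step) unless M2_HOME is already set, and the helper name `path_contains` is made up for illustration.

```shell
# path_contains DIR PATHSTRING: succeed if DIR is one of the
# colon-separated entries of PATHSTRING.
path_contains() {
  case ":$2:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Assumption: fall back to the symlink location if M2_HOME is unset.
M2_HOME=${M2_HOME:-/usr/local/maven}

if path_contains "$M2_HOME/bin" "$PATH"; then
  echo "Maven bin directory is on PATH"
else
  echo "Maven bin directory is NOT on PATH; try re-running: . /etc/profile"
fi
```

If the check passes, `mvn -v` should print the Maven version.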
2. Install elasticsearch-analysis-ik
# wget -c https://github.com/medcl/elasticsearch-analysis-ik/archive/master.zip
# unzip master.zip
# cd elasticsearch-analysis-ik-master/
# cp -Rf config/ik /usr/local/elasticsearch/config/
# mvn package
# /usr/local/elasticsearch/bin/plugin --install analysis-ik --url file:///alidata/download/elasticsearch-analysis-ik-master/target/releases/elasticsearch-analysis-ik-1.4.1.zip
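The zip file name produced by `mvn package` depends on the plugin version that was checked out, so the hard-coded 1.4.1 may not match your build. As a hedged sketch, the helper below (the name `find_release_zip` is hypothetical) looks up whatever release zip the build actually produced.

```shell
# find_release_zip DIR: print the first elasticsearch-analysis-ik-*.zip
# found directly under DIR; fail if there is none.
find_release_zip() {
  for f in "$1"/elasticsearch-analysis-ik-*.zip; do
    if [ -f "$f" ]; then
      printf '%s\n' "$f"
      return 0
    fi
  done
  return 1
}

# Example, run from the elasticsearch-analysis-ik-master directory
# after `mvn package`:
#   zip=$(find_release_zip "$PWD/target/releases") && echo "file://$zip"
```

The printed `file://` URL can then be passed to the plugin command's `--url` option.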
3. Edit elasticsearch.yml
Append the following to the end of the file:
index:
  analysis:
    analyzer:
      ik:
        alias: [ik_analyzer]
        type: org.elasticsearch.index.analysis.IkAnalyzerProvider
      ik_max_word:
        type: ik
        use_smart: false
      ik_smart:
        type: ik
        use_smart: true
index.analysis.analyzer.default.type: ik
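Appending the same settings twice is an easy mistake to make, so it can help to check the file first. This is a minimal sketch, assuming the configuration file lives at /usr/local/elasticsearch/config/elasticsearch.yml (inferred from the install path used earlier); the helper name `has_ik_default` is made up for illustration.

```shell
# has_ik_default FILE: succeed if FILE already sets the default
# analyzer type to ik using the dotted-key form.
has_ik_default() {
  grep -Eq '^[[:space:]]*index\.analysis\.analyzer\.default\.type:[[:space:]]*ik[[:space:]]*$' "$1"
}

# Example:
#   has_ik_default /usr/local/elasticsearch/config/elasticsearch.yml \
#     && echo "ik is already the default analyzer"
```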
4. Restart Elasticsearch
/usr/local/elasticsearch/bin/elasticsearch -d
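Because `-d` starts Elasticsearch in the background, the HTTP port may not answer immediately. One possible way to wait for it is a small retry loop; the `retry` helper below is a generic sketch and not part of Elasticsearch.

```shell
# retry TRIES DELAY CMD...: run CMD until it succeeds, at most TRIES
# times (at least once), sleeping DELAY seconds between attempts.
retry() {
  tries=$1
  delay=$2
  shift 2
  i=1
  while :; do
    "$@" && return 0
    [ "$i" -ge "$tries" ] && return 1
    i=$((i + 1))
    sleep "$delay"
  done
}

# Example: wait up to about 30 seconds for the HTTP port to answer.
#   retry 30 1 curl -s -o /dev/null http://localhost:9200/ \
#     && echo "Elasticsearch is up"
```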
5. Test IK tokenization
Create an index named index:
# curl -XPUT http://localhost:9200/index
Create a mapping for the index index:
# curl -XPOST http://localhost:9200/index/fulltext/_mapping -d'
{
  "fulltext": {
    "_all": {
      "analyzer": "ik"
    },
    "properties": {
      "content": {
        "type" : "string",
        "boost" : 8.0,
        "term_vector" : "with_positions_offsets",
        "analyzer" : "ik",
        "include_in_all" : true
      }
    }
  }
}'
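Pasting a multi-line JSON body into the shell is prone to quoting mistakes. One alternative, sketched here, is to write the same mapping to a file and let curl read the body from it with `--data-binary @FILE`. The helper names `write_mapping` and `put_mapping` and the /tmp path in the example are made up for illustration.

```shell
# write_mapping FILE: write the mapping body for type "fulltext" to FILE.
write_mapping() {
  cat > "$1" <<'EOF'
{
  "fulltext": {
    "_all": {
      "analyzer": "ik"
    },
    "properties": {
      "content": {
        "type": "string",
        "boost": 8.0,
        "term_vector": "with_positions_offsets",
        "analyzer": "ik",
        "include_in_all": true
      }
    }
  }
}
EOF
}

# put_mapping FILE: send FILE as the request body.
# Requires a running Elasticsearch on localhost:9200.
put_mapping() {
  curl -XPOST http://localhost:9200/index/fulltext/_mapping --data-binary @"$1"
}

# Example:
#   write_mapping /tmp/ik-mapping.json && put_mapping /tmp/ik-mapping.json
```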
Test:
# curl 'http://localhost:9200/index/_analyze?analyzer=ik&pretty=true' -d '
{
"text":"www.ttlsa.com 教程太好了受益匪浅,望ttlsa越来越好"
}'
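In the result for this request, the first token is text at offsets 4 to 8, which suggests that this Elasticsearch version analyzes the whole request body, JSON key included, as plain text. If that is the case in your setup, sending only the sentence avoids the stray token. The helpers below are a sketch with made-up names (`analyze_url`, `analyze_text`); the host, index, and analyzer values are taken from the command above.

```shell
# analyze_url HOST INDEX ANALYZER: build the _analyze URL.
analyze_url() {
  printf 'http://%s/%s/_analyze?analyzer=%s&pretty=true\n' "$1" "$2" "$3"
}

# analyze_text TEXT: send TEXT itself as the request body.
# Requires a running Elasticsearch on localhost:9200.
analyze_text() {
  curl "$(analyze_url localhost:9200 index ik)" -d "$1"
}

# Example:
#   analyze_text 'www.ttlsa.com 教程太好了受益匪浅,望ttlsa越来越好'
```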
The result is as follows:
{
  "tokens" : [ {
    "token" : "text",
    "start_offset" : 4,
    "end_offset" : 8,
    "type" : "ENGLISH",
    "position" : 1
  }, {
    "token" : "www.ttlsa.com",
    "start_offset" : 11,
    "end_offset" : 24,
    "type" : "LETTER",
    "position" : 2
  }, {
    "token" : "www",
    "start_offset" : 11,
    "end_offset" : 14,
    "type" : "ENGLISH",
    "position" : 3
  }, {
    "token" : "ttlsa",
    "start_offset" : 15,
    "end_offset" : 20,
    "type" : "ENGLISH",
    "position" : 4
  }, {
    "token" : "com",
    "start_offset" : 21,
    "end_offset" : 24,
    "type" : "ENGLISH",
    "position" : 5
  }, {
    "token" : "教程",
    "start_offset" : 25,
    "end_offset" : 27,
    "type" : "CN_WORD",
    "position" : 6
  }, {
    "token" : "太好了",
    "start_offset" : 27,
    "end_offset" : 30,
    "type" : "CN_WORD",
    "position" : 7
  }, {
    "token" : "太好",
    "start_offset" : 27,
    "end_offset" : 29,
    "type" : "CN_WORD",
    "position" : 8
  }, {
    "token" : "好了",
    "start_offset" : 28,
    "end_offset" : 30,
    "type" : "CN_WORD",
    "position" : 9
  }, {
    "token" : "受益匪浅",
    "start_offset" : 30,
    "end_offset" : 34,
    "type" : "CN_WORD",
    "position" : 10
  }, {
    "token" : "受益",
    "start_offset" : 30,
    "end_offset" : 32,
    "type" : "CN_WORD",
    "position" : 11
  }, {
    "token" : "匪",
    "start_offset" : 32,
    "end_offset" : 33,
    "type" : "CN_CHAR",
    "position" : 12
  }, {
    "token" : "浅",
    "start_offset" : 33,
    "end_offset" : 34,
    "type" : "CN_CHAR",
    "position" : 13
  }, {
    "token" : "望",
    "start_offset" : 35,
    "end_offset" : 36,
    "type" : "CN_CHAR",
    "position" : 14
  }, {
    "token" : "ttlsa",
    "start_offset" : 36,
    "end_offset" : 41,
    "type" : "ENGLISH",
    "position" : 15
  }, {
    "token" : "越来越好",
    "start_offset" : 41,
    "end_offset" : 45,
    "type" : "CN_WORD",
    "position" : 16
  }, {
    "token" : "越来越",
    "start_offset" : 41,
    "end_offset" : 44,
    "type" : "CN_WORD",
    "position" : 17
  }, {
    "token" : "越来",
    "start_offset" : 41,
    "end_offset" : 43,
    "type" : "CN_WORD",
    "position" : 18
  }, {
    "token" : "越好",
    "start_offset" : 43,
    "end_offset" : 45,
    "type" : "CN_WORD",
    "position" : 19
  } ]
}
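To see only how the sentence was split, the token values can be pulled out of the pretty-printed response. This is a rough sketch that assumes each "token" field sits on its own line, as in the `pretty=true` output returned by curl; the helper name `extract_tokens` is made up, and a real JSON parser would be more robust.

```shell
# extract_tokens: read a pretty-printed _analyze response on stdin and
# print one token value per line.
extract_tokens() {
  sed -n 's/.*"token"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Example (requires a running Elasticsearch on localhost:9200):
#   curl -s 'http://localhost:9200/index/_analyze?analyzer=ik&pretty=true' \
#     -d 'www.ttlsa.com' | extract_tokens
```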
Reposted from: ttlsa.com