小皮博客 | Xiaopi's Blog

69-elastic-search-api

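Study notes that summarize and organize the Elasticsearch APIs.

Categories

  • Document API: create, read, update, and delete documents.
  • Search API: run searches and retrieve results.
  • Index API: operate on indices.
  • cat API: return data in a compact, human-readable form.
  • Cluster API: operate on and inspect the cluster.

Usage notes

  • Appending the v parameter to any _cat command turns on verbose output, which adds a header row.

curl -XGET 'localhost:9200/_cat/master?v&pretty'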

shengl-pro:tmp shengl$ curl -XGET 'localhost:9200/_cat/master?v&pretty'
id                     host      ip        node
H-thNd-UT6SoHcIN8_c-_g 127.0.0.1 127.0.0.1 H-thNd-  
shengl-pro:tmp shengl$ curl -XGET 'localhost:9200/_cat/master?pretty'
H-thNd-UT6SoHcIN8_c-_g 127.0.0.1 127.0.0.1 H-thNd-  
  • The pretty parameter can be appended to any command to format the output (handy when reading JSON).

TODO

  • Appending help to any _cat command lists its columns and their meanings.

curl -XGET 'localhost:9200/_cat/health?help&pretty'

shengl-pro:tmp shengl$ curl -XGET 'localhost:9200/_cat/health?help&pretty'
epoch                 | t,time                                   | seconds since 1970-01-01 00:00:00  
timestamp             | ts,hms,hhmmss                            | time in HH:MM:SS                   
cluster               | cl                                       | cluster name                       
status                | st                                       | health status                      
node.total            | nt,nodeTotal                             | total number of nodes              
node.data             | nd,nodeData                              | number of nodes that can store data
shards                | t,sh,shards.total,shardsTotal            | total number of shards             
pri                   | p,shards.primary,shardsPrimary           | number of primary shards           
relo                  | r,shards.relocating,shardsRelocating     | number of relocating nodes         
init                  | i,shards.initializing,shardsInitializing | number of initializing nodes       
unassign              | u,shards.unassigned,shardsUnassigned     | number of unassigned shards        
pending_tasks         | pt,pendingTasks                          | number of pending tasks            
max_task_wait_time    | mtwt,maxTaskWaitTime                     | wait time of longest task pending  
active_shards_percent | asp,activeShardsPercent                  | active number of shards in percent 
  • Select the output columns with the h parameter.

curl -XGET 'localhost:9200/_cat/nodes?h=ip,heapPercent&pretty'

shengl-pro:tmp shengl$ curl -XGET 'localhost:9200/_cat/nodes?h=ip,heapPercent&pretty'
127.0.0.1 8
  • Change the units of the output (size, bytes, ...).

curl 'localhost:9200/_cat/indices?bytes=b'

TODO

  • Change the response format with the format parameter (text, json, smile, yaml, or cbor) or with the HTTP Accept header. Elasticsearch presumably relies on a set of adapters to convert data between these formats, a design worth borrowing for other open-source projects.

curl 'localhost:9200/_cat/indices?pretty&format=cbor'

curl 'localhost:9200/_cat/indices?pretty' -H "Accept: application/yaml"

shengl-pro:tmp shengl$ curl 'localhost:9200/_cat/indices?pretty' -H "Accept: application/smile"
:)
??shengl-pro:tmp shengl$ 
shengl-pro:tmp shengl$ 
shengl-pro:tmp shengl$ curl 'localhost:9200/_cat/indices?pretty' -H "Accept: application/yaml"
--- []
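
The parameters above (v, help, pretty, h, bytes, format) are all plain query-string options, so the URLs can be assembled mechanically. Below is a minimal Python sketch under that assumption; cat_url is a hypothetical helper name, not part of any Elasticsearch client, and it only builds the URL string without contacting a node.

```python
from urllib.parse import urlencode


def cat_url(endpoint, host="localhost:9200", flags=(), **params):
    """Build a _cat URL; flags are valueless parameters such as v, help, pretty."""
    # Key=value parameters, e.g. h=ip,heapPercent or format=json.
    # safe="," keeps comma-separated column lists readable.
    query = [urlencode(params, safe=",")] if params else []
    # Valueless flags are appended as bare names.
    query.extend(flags)
    url = f"http://{host}/_cat/{endpoint}"
    return url + ("?" + "&".join(query) if query else "")


print(cat_url("master", flags=("v", "pretty")))
print(cat_url("nodes", h="ip,heapPercent"))
print(cat_url("indices", bytes="b", format="json"))
```

The resulting strings can be passed to curl exactly like the hand-written commands in this section.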

Document API

TODO

Search API

TODO

Index API: CRUD operations and bulk processing

Basic index operations

Create an index

curl -XPUT 'localhost:9200/[indexName]?pretty'

shengl-pro:tmp shengl$ curl -XPUT 'localhost:9200/address?pretty'
{
  "acknowledged" : true,
  "shards_acknowledged" : true
}

List all indices

curl 'localhost:9200/_cat/indices?v&pretty'

shengl-pro:tmp shengl$ curl 'localhost:9200/_cat/indices?v&pretty'
health status index   uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   map     ixyXdNSqTO6I30lsT40_eg   5   1          0            0       810b           810b
yellow open   address EvIdMEAdRd2fSNPK2GXOfw   5   1          0            0       810b           810b
The address index created earlier has five primary shards, one replica, and 0 documents.

Index settings

Mappings and Settings

PUT /[indexName]
{
    "settings": {...},
    "mappings": {
        "type_one": { .. },
        "type_two": { ... },
        ...
    }
}
  • To disable automatic index creation, add this setting to config/elasticsearch.yml on every node:
    action.auto_create_index: false # disable automatic index creation
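
With automatic creation disabled, every index has to be created explicitly, so it can help to generate the PUT /[indexName] body rather than type it by hand. The sketch below is only an illustration: create_index_body is a hypothetical helper, the keyword field type is my choice for the address fields, and the layout with a mapping type under "mappings" assumes the 5.x/6.x style used throughout these notes.

```python
import json


def create_index_body(doc_type, keyword_fields, replicas=1):
    """Return the JSON body for PUT /[indexName] with settings and one mapping type."""
    properties = {name: {"type": "keyword"} for name in keyword_fields}
    body = {
        "settings": {"index": {"number_of_replicas": replicas}},
        # Elasticsearch 5.x/6.x nest field definitions under a mapping type.
        "mappings": {doc_type: {"properties": properties}},
    }
    return json.dumps(body, indent=2)


print(create_index_body("normal", ["country", "city", "region"], replicas=0))
```

The printed JSON can be saved to a file and sent with curl -XPUT and --data-binary.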

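Static and dynamic settings:

  • Static settings can only be specified when the index is created or while it is closed.
  • Dynamic settings can be changed on a live index through the update settings API.

  • Updating a dynamic index setting: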

    PUT /[indexName]/_settings
    {
      "index" : {
          "number_of_replicas": 0
      }
    }
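
The same settings update can be prepared from a script. This is a rough sketch using only the Python standard library; settings_request is a hypothetical helper, and the request is built but deliberately not sent, so it runs without a cluster.

```python
import json
from urllib.request import Request


def settings_request(index, host="localhost:9200", **index_settings):
    """Build (but do not send) a PUT /[indexName]/_settings request."""
    body = json.dumps({"index": index_settings}).encode("utf-8")
    return Request(
        f"http://{host}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


req = settings_request("address", number_of_replicas=0)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# Sending it would be: urllib.request.urlopen(req)
```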
    

Insert data

curl -XPUT '[ip]:[port]/[indexName]/[indexType]/[docId]' -d '
{
[document content]
}'

shengl-pro:tmp shengl$ curl -XPUT 'localhost:9200/address/normal/1?pretty' -d ' 
{
     "country": "china",
     "city":"beijing",
     "region":"haidian"
}'

{
  "_index" : "address",
  "_type" : "normal",
  "_id" : "1",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "created" : true # indicates the insert succeeded
}

Retrieve a document

curl -XGET '[ip]:[port]/[indexName]/[indexType]/[docId]'

shengl-pro:tmp shengl$ curl -XGET 'localhost:9200/address/normal/1?pretty'
{
  "_index" : "address",
  "_type" : "normal",
  "_id" : "1",
  "_version" : 1,
  "found" : true,
  "_source" : {
    "country" : "china",
    "city" : "beijing",
    "region" : "haidian"
  }
}

Delete a document

curl -XDELETE '[ip]:[port]/[indexName]/[indexType]/[docId]'

shengl-pro:tmp shengl$ curl -XDELETE 'localhost:9200/address/normal/1?pretty'
{
  "found" : true,
  "_index" : "address",
  "_type" : "normal",
  "_id" : "1",
  "_version" : 2,
  "result" : "deleted",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  }
}
* Querying again confirms that the document has been deleted.
shengl-pro:tmp shengl$ curl -XGET 'localhost:9200/address/normal/1?pretty'

{
  "_index" : "address",
  "_type" : "normal",
  "_id" : "1",
  "found" : false
}
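
The two GET responses above differ in the found flag and in whether _source is present, which is what a client has to branch on. A small sketch, assuming the response text has already been fetched; source_or_none is a hypothetical helper and the two strings are condensed copies of the responses shown in this section.

```python
import json


def source_or_none(response_text):
    """Return the document's _source, or None when found is false."""
    response = json.loads(response_text)
    return response["_source"] if response.get("found") else None


hit = (
    '{"_index":"address","_type":"normal","_id":"1","_version":1,"found":true,'
    '"_source":{"country":"china","city":"beijing","region":"haidian"}}'
)
miss = '{"_index":"address","_type":"normal","_id":"1","found":false}'

print(source_or_none(hit))
print(source_or_none(miss))
```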

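Delete an index

curl -XDELETE '[ip]:[port]/[indexName]'
Anyone with access to the API can delete an index outright. This is one reason Elasticsearch is a poor fit as the system of record; restrictions can be added, but it is still not safe.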

shengl-pro:tmp shengl$ curl -XDELETE 'localhost:9200/map'
{"acknowledged":true}

shengl-pro:tmp shengl$ curl 'localhost:9200/_cat/indices?v'
health status index   uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   address EvIdMEAdRd2fSNPK2GXOfw   5   1          0            0       839b           839b

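Update data

curl -XPUT '[ip]:[port]/[indexName]/[indexType]/[docId]' -d '[document content]'
No explicit update command is used here. The insert operation is repeatable: writing to an existing ID replaces the document and increments its version.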

shengl-pro:tmp shengl$ curl -XPUT 'localhost:9200/address/normal/1?pretty' -d ' 
{
     "country": "china",
     "city":"beijing",
     "region":"chaoyang"
}'  
{
  "_index" : "address",
  "_type" : "normal",
  "_id" : "1",
  "_version" : 2,
  "result" : "updated",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "created" : false
}

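curl -XPOST '[ip]:[port]/[indexName]/[indexType]/[docId]/_update' -d '[document content]'
An explicit update.

  • Updating an ID that does not exist returns an error. Inserting the same document again is allowed, but the reverse, updating a document that is missing, is not. The asymmetry says something about the API's design philosophy.
shengl-pro:tmp shengl$ curl -XPOST 'localhost:9200/address/normal/2/_update?pretty' -d '{  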
   "doc":
   {
      "base": {
         "country": "china"
      },
      "detail":{
         "city":"beijing",
         "region":"chaoyang"
      }
   }
}'
{
  "error" : {
    "root_cause" : [
      {
        "type" : "document_missing_exception",
        "reason" : "[normal][2]: document missing",
        "index_uuid" : "EvIdMEAdRd2fSNPK2GXOfw",
        "shard" : "2",
        "index" : "address"
      }
    ],
    "type" : "document_missing_exception",
    "reason" : "[normal][2]: document missing",
    "index_uuid" : "EvIdMEAdRd2fSNPK2GXOfw",
    "shard" : "2",
    "index" : "address"
  },
  "status" : 404
}
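
One way around the document_missing_exception is the update API's doc_as_upsert option, which tells Elasticsearch to insert the doc when the ID does not exist yet. The sketch below only builds the request body; update_body is a hypothetical helper name and the payload reuses the example above.

```python
import json


def update_body(doc, upsert_if_missing=False):
    """Build the body for POST /[indexName]/[indexType]/[docId]/_update."""
    body = {"doc": doc}
    if upsert_if_missing:
        # doc_as_upsert asks Elasticsearch to insert doc when the ID is missing.
        body["doc_as_upsert"] = True
    return json.dumps(body)


print(update_body({"base": {"country": "china"}}, upsert_if_missing=True))
```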

Modify the document content directly with the doc parameter

curl -XPOST 'localhost:9200/address/normal/1/_update?pretty' -d '{  
   "doc":
   {
      "base": {
         "country": "china"
      },
      "detail":{
         "city":"beijing",
         "region":"chaoyang"
      },
      "rank": 20
   }
}'

{
  "_index" : "address",
  "_type" : "normal",
  "_id" : "1",
  "_version" : 3,
  "result" : "updated",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  }
}

Operate on the document with the script parameter.

curl -XPOST 'localhost:9200/address/normal/1/_update?pretty' -d '{
    "script" : "ctx._source.rank += 1.5"
}'
{
  "_index" : "address",
  "_type" : "normal",
  "_id" : "1",
  "_version" : 5,
  "result" : "updated",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  }
}
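
Hard-coding 1.5 in the script text means a new script for every increment. Scripts can instead read values from params. The sketch below builds such a body; script_update_body is a hypothetical helper, and the key that holds the script text is an assumption to check against your version: to my knowledge it is inline in 5.x and source in later releases, so it is left configurable.

```python
import json


def script_update_body(code, params, code_key="inline"):
    """Build an _update body whose script reads its values from params."""
    # code_key is version-dependent (assumed: "inline" on 5.x, "source" later).
    script = {"lang": "painless", code_key: code, "params": params}
    return json.dumps({"script": script})


print(script_update_body("ctx._source.rank += params.delta", {"delta": 1.5}))
```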
  • Pattern: the basic shape of the API

curl -X[REST Verb] '[ip]:[port]/[indexName]/[indexType]/[docId]' -d '[request body]'
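
Since every document operation follows this verb/index/type/id shape, one small function can cover insert, get, and delete. This is a sketch with the standard library only; doc_request is a hypothetical helper, and the requests are constructed but not sent.

```python
import json
from urllib.request import Request


def doc_request(verb, index, doc_type, doc_id, body=None, host="localhost:9200"):
    """Build (but do not send) a request following the verb/index/type/id pattern."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    headers = {"Content-Type": "application/json"} if data else {}
    url = f"http://{host}/{index}/{doc_type}/{doc_id}"
    return Request(url, data=data, headers=headers, method=verb)


requests = [
    doc_request("PUT", "address", "normal", 1, {"country": "china", "city": "beijing"}),
    doc_request("GET", "address", "normal", 1),
    doc_request("DELETE", "address", "normal", 1),
]
for req in requests:
    print(req.get_method(), req.full_url)
```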

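Bulk processing

  • Bulk index operations

curl -XPOST '[ip]:[port]/[indexName]/[indexType]/_bulk' -d '
{[action and metadata 1]}
{[document content 1]}
{[action and metadata 2]}
{[document content 2]}…
'

  • Use the index action for bulk creation. An _index or _type set in the action JSON overrides the value given in the URL.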
shengl-pro:tmp shengl$ curl -XPOST 'localhost:9200/address/tiny/_bulk?pretty' -d '
         {"index":{"_id":"23"}}
         {"county":"china", "city": "beijing" }
         {"index":{"_id":"24"}}
         {"county":"china", "city": "changsha" }
         '
 The response reports the result of each item.
{
  "took" : 25,
  "errors" : false,
  "items" : [
    {
      "index" : {
        "_index" : "address",
        "_type" : "tiny",
        "_id" : "23",
        "_version" : 3,
        "result" : "updated",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "created" : false,
        "status" : 200
      }
    },
    {
      "index" : {
        "_index" : "address",
        "_type" : "tiny",
        "_id" : "24",
        "_version" : 2,
        "result" : "updated",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "created" : false,
        "status" : 200
      }
    }
  ]
}
  • A single bulk request can also mix update, create, and other actions. The response again reports the result of each action.

curl -XPOST '[ip]:[port]/[indexName]/[indexType]/_bulk' -d '
{"[action]":{"_id":"[docId]"}}
{"doc":{[doc content]}}…'

shengl-pro:tmp shengl$ curl -XPOST 'localhost:9200/address/tiny/_bulk?pretty' -d '
          {"update":{"_id":"23"}}
          {"doc":{"county":"china", "city": "shenzhen" }}
          {"delete":{"_id":"24"}}
          '
 # Results are again returned per item
{
  "took" : 26,
  "errors" : false,
  "items" : [
    {
      "update" : {
        "_index" : "address",
        "_type" : "tiny",
        "_id" : "23",
        "_version" : 6,
        "result" : "updated",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "status" : 200
      }
    },
    {
      "delete" : {
        "found" : true,
        "_index" : "address",
        "_type" : "tiny",
        "_id" : "24",
        "_version" : 4,
        "result" : "deleted",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "status" : 200
      }
    }
  ]
}
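
The bulk body is newline-delimited JSON: an action line, then (except for delete) a payload line, with a newline after every line including the last. A minimal sketch of generating it; bulk_body is a hypothetical helper, and the sample operations mirror the update/delete request above.

```python
import json


def bulk_body(operations):
    """Serialize (action, metadata, payload) tuples into a newline-delimited bulk body."""
    lines = []
    for action, metadata, payload in operations:
        lines.append(json.dumps({action: metadata}))
        # delete has no payload line; index, create, and update do.
        if payload is not None:
            lines.append(json.dumps(payload))
    # Every line, including the last, must end with a newline.
    return "\n".join(lines) + "\n"


body = bulk_body([
    ("update", {"_id": "23"}, {"doc": {"county": "china", "city": "shenzhen"}}),
    ("delete", {"_id": "24"}, None),
])
print(body, end="")
```

Generating the body this way avoids the two mistakes that are easiest to make by hand: putting commas between lines and forgetting the final newline.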

Import a data set

  • vim /tmp/address-1.json
{"index":{"_id":"10001"}}
{"county":"china","city":"beijing"}
{"index":{"_id":"10002"}}
{"county":"china","city":"changsha"}

curl -XPOST '[ip]:[port]/[indexName]/[indexType]/_bulk?pretty' --data-binary "@filepath"

curl -XPOST 'localhost:9200/addressfile/normal/_bulk?pretty' --data-binary "@/tmp/address-1.json"

 # Response
{
  "took" : 115,
  "errors" : false,
  "items" : [
    {
      "index" : {
        "_index" : "addressfile",
        "_type" : "normal",
        "_id" : "10001",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "created" : true,
        "status" : 201
      }
    },
    {
      "index" : {
        "_index" : "addressfile",
        "_type" : "normal",
        "_id" : "10002",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "created" : true,
        "status" : 201
      }
    }
  ]
}
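
For a large import it is impractical to read the whole response, so a script can check the top-level errors flag and then list only the items that failed. A sketch under the assumption that the response has already been parsed into a dict; failed_items is a hypothetical helper, and the sample is a trimmed copy of the response above.

```python
def failed_items(bulk_response):
    """Return (action, _id, status) for every bulk item whose status is 400 or above."""
    failures = []
    for item in bulk_response["items"]:
        # Each item is a one-key dict: {"index": {...}}, {"update": {...}}, ...
        for action, result in item.items():
            if result["status"] >= 400:
                failures.append((action, result["_id"], result["status"]))
    return failures


response = {
    "took": 115,
    "errors": False,
    "items": [
        {"index": {"_id": "10001", "result": "created", "status": 201}},
        {"index": {"_id": "10002", "result": "created", "status": 201}},
    ],
}
print(response["errors"], failed_items(response))
```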

Scroll

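  • Basic scroll usage

    curl -XGET '[ip]:[port]/[indexName]/[indexType]/_search?scroll=1m&search_type=scan' -d '
    {"query": { [query param] }}'
    # Note: search_type=scan (unsorted scrolling) was deprecated in Elasticsearch 2.1 and removed in 5.0, so the example below omits it.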

shengl-pro:tmp shengl$ curl -XGET 'localhost:9200/address/normal/_search?scroll=1m&pretty' -d '
> {"size":2,"query":{"match_all":{}}}'
{
  "_scroll_id" : "DnF1ZXJ5VGhlbkZldGNoBQAAAAAAAAALFkgtdGhOZC1VVDZTb0hjSU44X2MtX2cAAAAAAAAADBZILXRoTmQtVVQ2U29IY0lOOF9jLV9nAAAAAAAAAA0WSC10aE5kLVVUNlNvSGNJTjhfYy1fZwAAAAAAAAAOFkgtdGhOZC1VVDZTb0hjSU44X2MtX2cAAAAAAAAADxZILXRoTmQtVVQ2U29IY0lOOF9jLV9n",
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "address",
        "_type" : "normal",
        "_id" : "12",
        "_score" : 1.0,
        "_source" : {
          "city" : "changsha"
        }
      },
      {
        "_index" : "address",
        "_type" : "normal",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "country" : "china",
          "city" : "beijing",
          "region" : "chaoyang",
          "detail" : {
            "city" : "beijing",
            "region" : "chaoyang"
          },
          "base" : {
            "country" : "china"
          },
          "rank" : 21
        }
      }
    ]
  }
}  

 # Use the scroll_id to fetch the next batch
curl -XGET 'localhost:9200/_search/scroll?scroll=1m&pretty&scroll_id=DnF1ZXJ5VGhlbkZldGNoBQAAAAAAAAALFkgtdGhOZC1VVDZTb0hjSU44X2MtX2cAAAAAAAAADBZILXRoTmQtVVQ2U29IY0lOOF9jLV9nAAAAAAAAAA0WSC10aE5kLVVUNlNvSGNJTjhfYy1fZwAAAAAAAAAOFkgtdGhOZC1VVDZTb0hjSU44X2MtX2cAAAAAAAAADxZILXRoTmQtVVQ2U29IY0lOOF9jLV9n'

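  • Performance characteristics of a scanning search (search_type=scan)

    • No sorting or scoring: documents come back in the order they were indexed.
    • Aggregations are not supported.
    • The hits list of the initial response contains no results.
    • size applies per shard: with size=3 and 5 shards, each batch returns up to 15 results.
  • Clearing a scroll explicitly: the scroll context expires after its timeout, but it can also be cleared by hand.

curl -XDELETE '[ip]:[port]/_search/scroll' -d '[scrollId]'
curl -XDELETE '[ip]:[port]/_search/scroll/_all'

  • Reading the source code shows that the scroll ID is built from several parts:
    • Type: a String naming the search type, e.g. ParsedScrollId.QUERY_THEN_FETCH_TYPE = queryThenFetch
    • searchPhaseResults: the result information
    • attributes: the query parameters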
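
Putting the scroll steps together (initial search, repeated fetches with the latest scroll_id, explicit clear), the client-side loop looks roughly like this. It is a sketch, not a client implementation: search, next_page, and clear are hypothetical callables the caller supplies to perform the HTTP calls, and the demo uses in-memory stand-ins so it runs without a cluster.

```python
def scroll_all(search, next_page, clear):
    """Yield every hit from a scroll, clearing the scroll context at the end.

    search, next_page, and clear are caller-supplied callables (hypothetical
    names) that perform the HTTP calls and return the parsed JSON response.
    """
    page = search()
    scroll_id = page["_scroll_id"]
    try:
        while page["hits"]["hits"]:
            yield from page["hits"]["hits"]
            page = next_page(scroll_id)
            # The ID may change between pages, so always use the latest one.
            scroll_id = page["_scroll_id"]
    finally:
        clear(scroll_id)


# In-memory stand-ins for the three HTTP calls, for illustration only.
pages = [
    {"_scroll_id": "s1", "hits": {"hits": [{"_id": "12"}, {"_id": "1"}]}},
    {"_scroll_id": "s1", "hits": {"hits": [{"_id": "7"}]}},
    {"_scroll_id": "s1", "hits": {"hits": []}},
]
cleared = []
remaining = iter(pages[1:])
hits = list(scroll_all(lambda: pages[0], lambda sid: next(remaining), cleared.append))
print([h["_id"] for h in hits], cleared)
```

The loop stops on the first empty batch, and the finally clause releases the scroll context even if the consumer stops early or an error occurs.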

cat API

Common cluster and operations APIs

TODO.

Check cluster health

curl 'localhost:9200/_cat/health?v'

epoch      timestamp cluster       status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1539434816 20:46:56  elasticsearch green           1         1      0   0    0    0        0             0                  -                100.0%
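
For monitoring scripts, the verbose _cat output can be turned into dictionaries keyed by the header row. This is a rough sketch; parse_cat is a hypothetical helper that splits on whitespace, which only works when no cell is empty, so it raises instead of guessing when the counts differ. Requesting format=json is the more robust route.

```python
def parse_cat(text):
    """Turn verbose (?v) _cat output into a list of dicts keyed by column name."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        values = line.split()
        if len(values) != len(header):
            # An empty cell shifts the columns; refuse to guess.
            raise ValueError(f"expected {len(header)} columns, got {len(values)}")
        rows.append(dict(zip(header, values)))
    return rows


sample = """\
epoch      timestamp cluster       status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1539434816 20:46:56  elasticsearch green           1         1      0   0    0    0        0             0                  -                100.0%
"""
health = parse_cat(sample)[0]
print(health["status"], health["active_shards_percent"])
```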

List all nodes in the cluster

curl 'localhost:9200/_cat/nodes?v&pretty'

shengl-pro:tmp shengl$ curl 'localhost:9200/_cat/nodes?v&pretty'
ip        heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
127.0.0.1            8          98  99   12.93                  mdi       *      H-thNd-

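Copyright notice

Title: 69-elastic-search-api

Author: 盛领

Published: 2018-10-13 19:31:33

Original link: http://blog.xiaoyuyu.net/post/3e9b01b2.html

License: Attribution-NonCommercial-NoDerivatives 4.0 International. Please keep the original link and the author's name when reposting.

For commercial cooperation or licensing inquiries, please leave me a message: sunsetxiao@126.com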
