Elasticsearch Highlighting in Detail

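1. Highlighting the name field in Elasticsearch

In Elasticsearch, highlighting marks the search keywords inside the search results so that the matching text stands out. A quick way to get a feel for it is to highlight the name field.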

curl -X GET "localhost:9200/my_index/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": {
    "match": {
      "name": "elasticsearch"
    }
  },
  "highlight": {
    "fields": {
      "name": {
        "number_of_fragments": 3
      }
    }
  }
}
'

In the response, each hit whose name field contains the search keyword carries a highlight section in which the matching text is marked up.

2. Starting Elasticsearch

Before using highlighting, Elasticsearch has to be running.

cd elasticsearch-7.14.1/bin
./elasticsearch

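The commands above start Elasticsearch from an extracted archive. If the Elasticsearch bin directory is on your PATH, you can run the elasticsearch command directly in a terminal. If you installed Elasticsearch from a package or in some other way, the start command will differ; check the documentation for your installation method.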

3. Highlight queries

A highlight query finds the documents whose specified field contains the keyword and marks the matching parts of that field so they stand out.

GET /my_index/_search
{
    "query": {
        "match_phrase" : {
            "text" : "elasticsearch"
        }
    },
    "highlight": {
        "fields" : {
            "text" : {}
        }
    }
}

The request above finds documents whose text field contains the keyword "elasticsearch" and marks the matching parts.

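4. Backfilling highlighted content

Highlighted content can be filled back into the original data. Elasticsearch returns the highlighted fragments in a separate highlight section of each hit and leaves _source unchanged, so the backfill is done by the client: it replaces the original field value with the highlighted version before rendering.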

GET /my_index/_search
{
    "query": {
        "match_phrase" : {
            "text" : "elasticsearch"
        }
    },
    "highlight": {
        "fields" : {
            "text" : {}
        },
        "pre_tags" : [""],
        "post_tags" : [""]
    }
}

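The request above searches for the keyword "elasticsearch" and marks the matching parts; the client can then fill the marked-up text back into the original document. The pre_tags and post_tags values appear as empty strings in this listing, and with empty strings the matches would not be visibly marked. Set them to the tags you want; when they are omitted, Elasticsearch uses <em> and </em>.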
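Because the highlighted fragments arrive next to the source document rather than inside it, the backfill step can be sketched on the client side. The following is a minimal sketch, assuming the hit has already been parsed into plain maps; the class name HighlightBackfill, the method backfill, and the choice to join several fragments with " ... " are illustrative assumptions, not part of any Elasticsearch API.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: overlays highlighted fragments onto a copy of the source document. */
final class HighlightBackfill {

    private HighlightBackfill() {}

    /**
     * Returns a copy of {@code source} in which every field that has highlight
     * fragments is replaced by those fragments. The input maps are not modified.
     */
    static Map<String, Object> backfill(Map<String, Object> source,
                                        Map<String, List<String>> highlight) {
        Map<String, Object> merged = new LinkedHashMap<>(source);
        for (Map.Entry<String, List<String>> entry : highlight.entrySet()) {
            List<String> fragments = entry.getValue();
            // Only overwrite fields that exist in the source and have at least one fragment.
            if (merged.containsKey(entry.getKey()) && fragments != null && !fragments.isEmpty()) {
                // Assumption: multiple fragments are joined with an ellipsis separator.
                merged.put(entry.getKey(), String.join(" ... ", fragments));
            }
        }
        return merged;
    }
}
```

When number_of_fragments is 0 the highlighter returns the whole field, so the replacement loses no text; with fragments enabled, the merged value is an excerpt rather than the full field.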

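5. Highlight settings

Highlighting can be tuned: the tags wrapped around a match (and, through them, the marker color or style), the fragment length, the number of fragments, and so on. A common configuration looks like this: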

GET /my_index/_search
{
    "query": {
        "match_phrase" : {
            "text" : "elasticsearch"
        }
    },
    "highlight": {
        "fields" : {
            "text" : {"number_of_fragments" : 2, "fragment_size" : 150}
        },
        "pre_tags" : [""],
        "post_tags" : [""]
    }
}

The request above searches for the keyword "elasticsearch" and marks the matching parts, returning at most 2 fragments for the text field, each roughly 150 characters long.
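When these settings change from one search to the next, it can help to assemble the highlight section in code. The sketch below builds the same JSON structure as the request above from parameters; HighlightSettings, toJson, and quote are hypothetical names, and the quoting is deliberately minimal (it escapes only backslashes and double quotes), so a real JSON library would be the safer choice in production.

```java
/** Hypothetical helper: builds the "highlight" section of a search request as a JSON string. */
final class HighlightSettings {

    private HighlightSettings() {}

    static String toJson(String field, int numberOfFragments, int fragmentSize,
                         String preTag, String postTag) {
        return "{\"fields\":{" + quote(field) + ":{"
                + "\"number_of_fragments\":" + numberOfFragments + ","
                + "\"fragment_size\":" + fragmentSize + "}},"
                + "\"pre_tags\":[" + quote(preTag) + "],"
                + "\"post_tags\":[" + quote(postTag) + "]}";
    }

    // Minimal JSON string quoting: escapes backslashes and double quotes only.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }
}
```

For example, HighlightSettings.toJson("text", 2, 150, "<span style=\"color:red\">", "</span>") yields a highlight section whose matches would render in red, which is one way to control the marker color through the tags.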

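6. Jumping to a highlight

Clicking a highlighted part in the search results can jump to the corresponding position in the original document. Elasticsearch only returns the marked-up text; the jump itself is implemented by the page that renders the results, using the start and end tags as anchors.

Where HTML tags are involved, escaping is omitted in the listings here and below.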

curl -X GET "localhost:9200/my_index/_search?pretty" -H 'Content-Type: application/json' -d'
{
  ...
  "highlight": {
      "pre_tags": [""],
      "post_tags": [""],
      "fields": {
          "title": {},
          "content": {},
          "comments.comment": {}
      }
  }
}
'

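The request above marks the matching parts in the title, content, and comments.comment fields and sets the start and end tag for each match, which the page can then use to jump to a specific match.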
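One way to make each match a jump target is to give every opening tag a unique id after the response arrives. The sketch below assumes the fragments use the default <em> opening tag; HighlightAnchors, addAnchors, and the id prefix are hypothetical names chosen for illustration.

```java
/** Hypothetical helper: numbers each highlight so a page link such as #hit-2 can jump to it. */
final class HighlightAnchors {

    private HighlightAnchors() {}

    static String addAnchors(String highlighted, String idPrefix) {
        final String openTag = "<em>"; // assumption: default highlight tag
        StringBuilder out = new StringBuilder();
        int counter = 0;
        int from = 0;
        int at;
        while ((at = highlighted.indexOf(openTag, from)) >= 0) {
            counter++;
            // Copy the text before the tag, then emit the tag with a numbered id.
            out.append(highlighted, from, at)
               .append("<em id=\"").append(idPrefix).append(counter).append("\">");
            from = at + openTag.length();
        }
        out.append(highlighted, from, highlighted.length());
        return out.toString();
    }
}
```

A result list can then link to #hit-1, #hit-2, and so on, and the browser scrolls to the matching element.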

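7. Escaping HTML in highlighted content

When documents contain HTML or other characters that need escaping, the highlighted fragments may not render correctly in every interface. The encoder setting handles this: with "encoder": "html", the fragment text is HTML-escaped first and the highlight tags are inserted afterwards, so only the highlight tags are interpreted as markup.

curl -X GET "localhost:9200/my_index/_search?pretty" -H 'Content-Type: application/json' -d'
{
  ...
  "highlight": {
      "encoder": "html",
      "fields": {
          "title": {}
      }
  }
}
'

The request above marks the matching parts of the title field and sets encoder to html so that any HTML in the field content is escaped.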
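The order of operations matters here: the text is escaped first and the highlight tags are added afterwards. The sketch below illustrates that idea on the client side for a single literal keyword; it is not how Elasticsearch implements highlighting, and HtmlSafeHighlight, escapeHtml, and highlight are hypothetical names.

```java
/** Hypothetical illustration of "escape first, then insert highlight tags". */
final class HtmlSafeHighlight {

    private HtmlSafeHighlight() {}

    static String escapeHtml(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&#39;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    /** Wraps each case-sensitive occurrence of keyword in <em> tags; all other text is escaped. */
    static String highlight(String text, String keyword) {
        if (keyword.isEmpty()) {
            return escapeHtml(text);
        }
        StringBuilder out = new StringBuilder();
        int from = 0;
        int at;
        while ((at = text.indexOf(keyword, from)) >= 0) {
            out.append(escapeHtml(text.substring(from, at)))
               .append("<em>").append(escapeHtml(keyword)).append("</em>");
            from = at + keyword.length();
        }
        out.append(escapeHtml(text.substring(from)));
        return out.toString();
    }
}
```

Escaping after tagging would turn the <em> tags themselves into visible text, which is why the escape step comes first.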

8. Client libraries for Elasticsearch highlighting

Highlighting can be requested through client libraries in several languages, such as Java and Python. Below is a Java example:

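import java.net.InetAddress;
import java.util.Map;

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.text.Text;
import org.elasticsearch.common.transport.TransportAddress;
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightField;
import org.elasticsearch.transport.client.PreBuiltTransportClient;

public class SearchDemo {
    private TransportClient client;
    private static final String ES_CLUSTER_NAME = "my_cluster";
    // host:port of the transport interface; no trailing separator, so the port parses cleanly
    private static final String ES_SERVER_IP = "127.0.0.1:9300";

    public void initialize() throws Exception {
        // Configure the client with the cluster name
        Settings settings = Settings.builder().put("cluster.name", ES_CLUSTER_NAME).build();
        String[] hostAndPort = ES_SERVER_IP.split(":");
        this.client = new PreBuiltTransportClient(settings).addTransportAddress(
                new TransportAddress(InetAddress.getByName(hostAndPort[0]),
                        Integer.parseInt(hostAndPort[1])));
    }

    public void search(String index, String type, String name) {
        QueryBuilder query = QueryBuilders.matchQuery("name", name);
        HighlightBuilder hiBuilder = new HighlightBuilder();
        // The tag values are empty in this listing; set them to the markup you need
        hiBuilder.preTags("");
        hiBuilder.postTags("");
        hiBuilder.field("name");
        SearchResponse response = client.prepareSearch(index).setTypes(type).setQuery(query)
                .highlighter(hiBuilder).get();
        SearchHits hits = response.getHits();
        for (SearchHit hit : hits) {
            Map<String, HighlightField> fields = hit.getHighlightFields();
            HighlightField text = fields.get("name");
            // A hit may carry no highlight for this field; skip it instead of failing
            if (text == null || text.fragments().length == 0) {
                continue;
            }
            Text[] fragments = text.fragments();
            System.out.println(fragments[0].string());
        }
    }
}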

The code above connects to Elasticsearch using the cluster name setting, runs a keyword query with highlighting on the name field, and prints the first highlighted fragment of each hit.
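The transport client used above was deprecated in Elasticsearch 7.x and removed in 8.0, so the same search may need to go over the REST interface instead. The sketch below uses only the JDK's java.net.http package (Java 11 or later) and sends the request as POST, which the _search endpoint accepts alongside GET. RestHighlightSearch, buildRequest, and execute are hypothetical names, and the base URL and index are whatever your cluster uses.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

/** Hypothetical helper: sends a search request with a highlight section over HTTP. */
final class RestHighlightSearch {

    private RestHighlightSearch() {}

    /** Builds a POST request to {baseUrl}/{index}/_search carrying the given JSON body. */
    static HttpRequest buildRequest(String baseUrl, String index, String jsonBody) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }

    /** Sends the request and returns the raw JSON response body. */
    static String execute(HttpRequest request) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```

The response body is plain JSON; the highlighted fragments sit under hits.hits[].highlight and can be read with any JSON parser.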

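9. Highlighting a single character

When the part to highlight is only a single character, the highlight can fail to appear. This can be related to the way the highlighter cuts a field into length-limited fragments. The following request works around the problem: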

GET /my_index/_search
{
    "query": {
        "match": {
            "text": "elasticsearch"
        }
    },
    "highlight": {
        "fields": {
            "text": {
                "number_of_fragments": 0,
                "require_field_match": false,
                "pre_tags": [""],
                "post_tags": [""]
            }
        }
    }
}

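With number_of_fragments set to 0, the field is not cut into fragments: the whole field content is returned with every match marked. Setting require_field_match to false additionally lets the highlighter mark matches in a field even when the query did not target that field.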

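10. Cutting highlighted snippets from Chinese text

With Chinese text, segment lengths are irregular, so some characters that should match can end up outside the highlighted snippet. One way around this is to analyze the Chinese text with a third-party helper (SubstringScore here) and cut the snippet around the matched part yourself. Below is sample code for the string analysis:

import java.io.IOException;
import java.io.StringReader;
import java.util.Locale;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

public class SnippetCutter {
    // Reconstructed sketch: the comparison operators of this listing were lost, so the
    // window logic is an assumption. fragSize is the total context added around the match.
    public static String cutSnippet(String content, String keyword, int fragSize) throws IOException {
        String haystack = content.toLowerCase(Locale.ROOT);  // the analyzer lowercases terms
        int start = content.length();
        int end = 0;
        try (Analyzer analyzer = new StandardAnalyzer();     // standard analyzer
             TokenStream tokenStream = analyzer.tokenStream("", new StringReader(keyword))) {
            CharTermAttribute termAttr = tokenStream.addAttribute(CharTermAttribute.class);
            tokenStream.reset();
            while (tokenStream.incrementToken()) {
                String pattern = termAttr.toString().replace("*", "");
                int loc = pattern.isEmpty() ? -1 : haystack.indexOf(pattern);
                if (loc < 0) {
                    continue;                                // term does not occur in the content
                }
                start = Math.min(start, loc);                // earliest match start
                end = Math.max(end, loc + pattern.length()); // latest match end (exclusive)
            }
            tokenStream.end();
        }
        if (start >= end) {
            return "";                                       // nothing matched
        }
        // Widen the window by half the fragment size on each side, clamped to the content
        start = Math.max(0, start - fragSize / 2);
        end = Math.min(content.length(), end + fragSize / 2);
        return content.substring(start, end);                // stands in for getHighlightSnippet
    }
}

The code above runs the keyword through a Lucene analyzer (a TokenStream), locates the resulting terms in the content, and cuts a snippet around them, so that the matched Chinese characters stay inside the snippet that gets highlighted.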

Original article by 小藍. If you repost it, please credit the source: https://www.506064.com/zh-tw/n/279351.html