Kouhei Sutou
null+****@clear*****
Thu Jul 20 15:41:19 JST 2017
Kouhei Sutou 2017-07-20 15:41:19 +0900 (Thu, 20 Jul 2017)

  New Revision: 4c2e0e40f0b20fb1b27d5c0d9ceaf4e9832b9531
  https://github.com/pgroonga/pgroonga.github.io/commit/4c2e0e40f0b20fb1b27d5c0d9ceaf4e9832b9531

  Message:
    Add note about Japanese similar search

  Modified files:
    _po/ja/reference/operators/similar-search-v2.po
    ja/reference/operators/similar-search-v2.md
    reference/operators/similar-search-v2.md

  Modified: _po/ja/reference/operators/similar-search-v2.po (+32 -1)
===================================================================
--- _po/ja/reference/operators/similar-search-v2.po 2017-07-20 15:32:42 +0900 (e26474f)
+++ _po/ja/reference/operators/similar-search-v2.po 2017-07-20 15:41:19 +0900 (ceb9543)
@@ -1,7 +1,7 @@
 msgid ""
 msgstr ""
 "Project-Id-Version: PACKAGE VERSION\n"
-"PO-Revision-Date: 2017-06-10 13:29+0900\n"
+"PO-Revision-Date: 2017-07-20 15:41+0900\n"
 "Language: ja\n"
 "MIME-Version: 1.0\n"
 "Content-Type: text/plain; charset=UTF-8\n"
@@ -160,3 +160,34 @@ msgstr ""
 "SELECT * FROM memos WHERE content &~? 'MroongaはGroongaを使うMySQLの拡張機能です。';\n"
 "-- ERROR: pgroonga: operator &~? is available only in index scan\n"
 "```"
+
+msgid "## For Japanese"
+msgstr "## 日本語向け"
+
+msgid ""
+"You should use `TokenMecab` tokenizer instead of the default `TokenBigram` for"
+" similar search against Japanese documents:"
+msgstr "日本語の文書を類似文書検索する場合はデフォルトの`TokenBigram`ではなく`TokenMecab`を使う方がよいです。"
+
+msgid ""
+"```sql\n"
+"CREATE INDEX pgroonga_content_index ON memos\n"
+" USING pgroonga (content pgroonga.text_full_text_search_ops_v2)\n"
+" WITH (tokenizer='TokenMecab');\n"
+"```"
+msgstr ""
+
+msgid ""
+"`TokenMecab` will tokenize target documents to words. It improves similar sear"
+"ch precision."
+msgstr "`TokenMecab`は対象の文書を(ほぼ)単語にトークナイズします。これにより類似文書検索の精度が上がります。"
+
+msgid ""
+"See also [`CREATE INDEX USING pgroonga`][create-index-using-pgroonga] how to s"
+"pecify `TokenMecab` tokenizer."
+msgstr ""
+"`TokenMecab`トークナイザーの指定方法については[`CREATE INDEX USING pgroonga`][create-index-usin"
+"g-pgroonga]も参照してください。"
+
+msgid "[create-index-using-pgroonga]:../create-index-using-pgroonga.html"
+msgstr ""

  Modified: ja/reference/operators/similar-search-v2.md (+16 -0)
===================================================================
--- ja/reference/operators/similar-search-v2.md 2017-07-20 15:32:42 +0900 (cfbf5df)
+++ ja/reference/operators/similar-search-v2.md 2017-07-20 15:41:19 +0900 (3cb675c)
@@ -72,3 +72,19 @@ SELECT * FROM memos WHERE content &~? 'MroongaはGroongaを使うMySQLの拡張
 SELECT * FROM memos WHERE content &~? 'MroongaはGroongaを使うMySQLの拡張機能です。';
 -- ERROR: pgroonga: operator &~? is available only in index scan
 ```
+
+## 日本語向け
+
+日本語の文書を類似文書検索する場合はデフォルトの`TokenBigram`ではなく`TokenMecab`を使う方がよいです。
+
+```sql
+CREATE INDEX pgroonga_content_index ON memos
+ USING pgroonga (content pgroonga.text_full_text_search_ops_v2)
+ WITH (tokenizer='TokenMecab');
+```
+
+`TokenMecab`は対象の文書を(ほぼ)単語にトークナイズします。これにより類似文書検索の精度が上がります。
+
+`TokenMecab`トークナイザーの指定方法については[`CREATE INDEX USING pgroonga`][create-index-using-pgroonga]も参照してください。
+
+[create-index-using-pgroonga]:../create-index-using-pgroonga.html

  Modified: reference/operators/similar-search-v2.md (+16 -0)
===================================================================
--- reference/operators/similar-search-v2.md 2017-07-20 15:32:42 +0900 (e3d5105)
+++ reference/operators/similar-search-v2.md 2017-07-20 15:41:19 +0900 (b369d6e)
@@ -72,3 +72,19 @@ You can't use similar search with sequential scan. If you use similar search wit
 SELECT * FROM memos WHERE content &~? 'Mroonga is a MySQL extension taht uses Groonga';
 -- ERROR: pgroonga: operator &~? is available only in index scan
 ```
+
+## For Japanese
+
+You should use `TokenMecab` tokenizer instead of the default `TokenBigram` for similar search against Japanese documents:
+
+```sql
+CREATE INDEX pgroonga_content_index ON memos
+ USING pgroonga (content pgroonga.text_full_text_search_ops_v2)
+ WITH (tokenizer='TokenMecab');
+```
+
+`TokenMecab` will tokenize target documents to words. It improves similar search precision.
+
+See also [`CREATE INDEX USING pgroonga`][create-index-using-pgroonga] how to specify `TokenMecab` tokenizer.
+
+[create-index-using-pgroonga]:../create-index-using-pgroonga.html