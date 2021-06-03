openbase logo
kuromojin

by azu
3.0.0

Provides a high-level wrapper for kuromoji.js with a cache layer and a Promise-based API.

Popularity
Downloads/wk: 23.1K
GitHub Stars: 61

Maintenance
Last Commit: 8 months ago
Contributors: 3

Package
Dependencies: 2
License: MIT
Type Definitions: Built-In
Tree-Shakeable: No?

kuromojin

Provides a high-level wrapper for kuromoji.js.

Features

  • Promise-based API
  • Cache layer
    • Fetches the dictionary only once
    • Returns the same tokens for the same text
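
The two caching behaviours listed above can be illustrated with a small sketch. This is not kuromojin's actual implementation; `once`, `memoizeByText`, `loadDictionary`, and `cachedTokenize` are hypothetical names, and the "dictionary" and "tokenizer" here are stand-ins that only demonstrate the pattern of caching Promises.

```javascript
// Hypothetical helper: wraps an async loader so that it runs at most once.
// Every later call receives the same pending (or settled) Promise.
function once(load) {
  let pending = null;
  return function () {
    if (pending === null) {
      pending = load();
    }
    return pending;
  };
}

// Hypothetical helper: memoizes an async function by its string argument,
// so the same text always maps to the same Promise of tokens.
function memoizeByText(fn) {
  const cache = new Map();
  return function (text) {
    if (!cache.has(text)) {
      cache.set(text, fn(text));
    }
    return cache.get(text);
  };
}

// Stand-in for an expensive dictionary load; counts how often it runs.
let loadCount = 0;
const loadDictionary = once(() => {
  loadCount += 1;
  return Promise.resolve({ name: "fake-dictionary" });
});

// Stand-in tokenizer (assumption): splits text into characters
// once the fake dictionary is available.
const cachedTokenize = memoizeByText((text) =>
  loadDictionary().then(() => Array.from(text))
);
```

Caching the Promise itself, rather than the resolved value, means that concurrent callers share a single in-flight load instead of each starting their own.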

Installation

npm install kuromojin

Usage

kuromojin exports two functions.

  • getTokenizer() returns a Promise that resolves with kuromoji.js's tokenizer instance.
  • tokenize() returns a Promise that resolves with the analyzed tokens.

import {tokenize, getTokenizer} from "kuromojin";

getTokenizer().then(tokenizer => {
    // kuromoji.js's `tokenizer` instance
});

const text = "黒文字"; // example input (assumed to match the output below)
tokenize(text).then(tokens => {
    console.log(tokens);
    /*
    [ {
        word_id: 509800,          // word ID in the dictionary
        word_type: 'KNOWN',       // KNOWN if the word is in the dictionary, UNKNOWN otherwise
        word_position: 1,         // start position of the word
        surface_form: '黒文字',    // surface form
        pos: '名詞',               // part of speech
        pos_detail_1: '一般',      // part-of-speech subcategory 1
        pos_detail_2: '*',        // part-of-speech subcategory 2
        pos_detail_3: '*',        // part-of-speech subcategory 3
        conjugated_type: '*',     // conjugation type
        conjugated_form: '*',     // conjugation form
        basic_form: '黒文字',      // base form
        reading: 'クロモジ',       // reading
        pronunciation: 'クロモジ'  // pronunciation
      } ]
    */
});
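
As a sketch of what can be done with the resolved tokens, the helper below keeps only nouns and pairs each surface form with its reading. `nounsWithReadings` is a hypothetical name, and `sampleTokens` is hand-written sample data that only mimics the token shape shown above (the second entry is an assumption, not real kuromojin output).

```javascript
// Hypothetical helper: keeps noun tokens (pos === "名詞") and returns
// their surface form together with their reading.
function nounsWithReadings(tokens) {
  return tokens
    .filter((token) => token.pos === "名詞")
    .map((token) => ({
      surface: token.surface_form,
      reading: token.reading
    }));
}

// Hand-written sample data using the field names from the output above.
const sampleTokens = [
  { surface_form: "黒文字", pos: "名詞", reading: "クロモジ" },
  { surface_form: "を", pos: "助詞", reading: "ヲ" } // particle, filtered out
];

console.log(nounsWithReadings(sampleTokens));
```

With kuromojin installed, the same helper could presumably be chained onto the Promise, for example `tokenize(text).then(nounsWithReadings)`.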

For browser/global options

If window.kuromojin.dicPath is defined, kuromojin uses it as the default dictionary path.

import {getTokenizer} from "kuromojin";
// Affects every module that uses kuromojin.
window.kuromojin = {
    dicPath: "https://cdn.jsdelivr.net/npm/kuromoji@0.1.2/dict"
};
// This `getTokenizer` call uses "https://cdn.jsdelivr.net/npm/kuromoji@0.1.2/dict"
getTokenizer();
// which is equivalent to:
getTokenizer({dicPath: "https://cdn.jsdelivr.net/npm/kuromoji@0.1.2/dict"});
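
The lookup described above can be sketched as a small resolution function. This is an illustration only, not kuromojin's source: `resolveDicPath` and its `fallbackDicPath` parameter are hypothetical, and the precedence order (explicit option first, then the global setting, then a fallback) is an assumption.

```javascript
// Hypothetical sketch of how a default dictionary path might be resolved.
// Assumed order: explicit option > global `kuromojin.dicPath` > fallback.
function resolveDicPath(options, globalObject, fallbackDicPath) {
  if (options && typeof options.dicPath === "string") {
    return options.dicPath;
  }
  const globalConfig = globalObject && globalObject.kuromojin;
  if (globalConfig && typeof globalConfig.dicPath === "string") {
    return globalConfig.dicPath;
  }
  return fallbackDicPath;
}

// A plain object stands in for `window` so the sketch also runs outside a browser.
const fakeWindow = {
  kuromojin: { dicPath: "https://cdn.jsdelivr.net/npm/kuromoji@0.1.2/dict" }
};

console.log(resolveDicPath(undefined, fakeWindow, "./dict"));
```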

📝 Test dictionary URL

Note: backward compatibility for <= 1.1.0

kuromojin v1.1.0 and earlier export tokenize as the default function.

kuromojin v2.0.0 removes the default export.

import kuromojin from "kuromojin";
// kuromojin === tokenize

Recommended: use import {tokenize} from "kuromojin" instead.

import {tokenize} from "kuromojin";

Note: kuromoji version is pinned

kuromojin pins the version of kuromoji.

The aim is to dedupe kuromoji's dictionary. The dictionary is large, so pinning the version avoids installing duplicate copies of it.

Tests

npm test

Contributing

  1. Fork it!
  2. Create your feature branch: git checkout -b my-new-feature
  3. Commit your changes: git commit -am 'Add some feature'
  4. Push to the branch: git push origin my-new-feature
  5. Submit a pull request :D

License

MIT
