A better English POS tagger written in JavaScript

Date: 2017-04-09

Installation and usage

Install via NPM:

npm i --save en-pos

How to use

const Tag = require("en-pos").Tag;
const tags = new Tag(["this", "is", "my", "sentence"])
  .initial() // initial tagging pass
  .smooth()  // optional smoothing pass (see "Accuracy and performance")
  .tags;
console.log(tags);
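To read the result more easily, each token can be paired with its tag. The sketch below assumes that `.tags` is an array holding one tag per input token, in the same order; the README does not state this explicitly. `pairTokensWithTags` is a hypothetical helper, not part of en-pos.

```javascript
// Hypothetical helper (not part of en-pos): pairs each token with its tag.
// Assumption: `tags` has exactly one entry per token, in the same order.
function pairTokensWithTags(tokens, tags) {
  if (tokens.length !== tags.length) {
    throw new Error("tokens and tags must have the same length");
  }
  return tokens.map((token, i) => ({ token: token, tag: tags[i] }));
}

// Possible usage with en-pos (requires the package to be installed):
//   const Tag = require("en-pos").Tag;
//   const tokens = ["this", "is", "my", "sentence"];
//   const tags = new Tag(tokens).initial().smooth().tags;
//   console.log(pairTokensWithTags(tokens, tags));
```

Throwing on a length mismatch is a deliberate choice here: a silent misalignment between tokens and tags would be hard to spot later.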

Annotation Specification

Annotation  Name                         Example
NN          Noun                         dog man
NNS         Plural noun                  dogs men
NNP         Proper noun                  London Alex
NNPS        Plural proper noun           Smiths
VB          Base form verb               be
VBP         Present form verb            throw
VBZ         Present form (3rd person)    throws
VBG         Gerund form verb             throwing
VBD         Past tense verb              threw
VBN         Past participle verb         thrown
MD          Modal verb                   can shall will may must ought
JJ          Adjective                    big fast
JJR         Comparative adjective        bigger
JJS         Superlative adjective        biggest
RB          Adverb                       not quickly closely
RBR         Comparative adverb           less-closely faster
RBS         Superlative adverb           fastest
DT          Determiner                   the a some both
PDT         Predeterminer                all quite
PRP         Personal pronoun             I you he she
PRP$        Possessive pronoun           my your his her
POS         Possessive ending            's
IN          Preposition                  of by in
PR          Particle                     up off
TO          to                           to
WDT         Wh-determiner                which that whatever whichever
WP          Wh-pronoun                   who whoever whom what
WP$         Wh-possessive                whose
WRB         Wh-adverb                    how where
EX          Expletive there              there
CC          Coordinating conjunction     & and nor or
CD          Cardinal numbers             1 7 77 one
LS          List item marker             1 B C One
UH          Interjection                 ah oh oops
FW          Foreign words                viva mon toujours
,           Comma                        ,
:           Mid-sentence punctuation     : ; ...
.           Sentence-final punctuation   . ! ?
(           Left parenthesis             ( { [
)           Right parenthesis            ) } ]
#           Pound sign                   #
$           Currency symbols             $ € £ ¥
SYM         Other symbols                + * / < >
EM          Emojis & emoticons           :) ❤
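When displaying results, it can help to turn an annotation into its readable name. The following is a minimal sketch that copies a few rows from the table above into a lookup object; `TAG_NAMES` and `describeTag` are hypothetical names, not something en-pos exports, and the remaining rows would need to be added by hand.

```javascript
// A few rows copied from the annotation table; extend with the rest as needed.
// TAG_NAMES and describeTag are illustrative names, not part of en-pos.
const TAG_NAMES = {
  NN: "Noun",
  NNS: "Plural noun",
  NNP: "Proper noun",
  MD: "Modal verb",
  JJ: "Adjective",
  DT: "Determiner",
};

// Returns the readable name for a tag, or "Unknown tag" if it is not listed.
function describeTag(tag) {
  return Object.prototype.hasOwnProperty.call(TAG_NAMES, tag)
    ? TAG_NAMES[tag]
    : "Unknown tag";
}

console.log(describeTag("NN"));
console.log(describeTag("XYZ"));
```

The `hasOwnProperty` check avoids treating inherited object keys such as `constructor` as valid tags.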

Accuracy and performance

When smoothing is enabled: 96.43% accuracy (processing 132K tokens in 38 seconds)

When smoothing is disabled: 94.4% accuracy (processing 132K tokens in 3 seconds)
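The two timings above can be turned into throughput figures with simple arithmetic. This sketch takes "132K tokens" to mean 132,000 tokens, which is an assumption; `tokensPerSecond` is just an illustrative helper.

```javascript
// Illustrative arithmetic only; 132K is assumed to mean 132,000 tokens.
function tokensPerSecond(tokens, seconds) {
  return tokens / seconds;
}

const TOKENS = 132000;

// With smoothing: 132,000 tokens in 38 seconds.
console.log(Math.round(tokensPerSecond(TOKENS, 38))); // 3474

// Without smoothing: 132,000 tokens in 3 seconds.
console.log(tokensPerSecond(TOKENS, 3)); // 44000

// Ratio of the two running times.
console.log((38 / 3).toFixed(1)); // "12.7"
```

On these numbers, disabling smoothing is roughly 12.7 times faster, at a cost of about two percentage points of accuracy.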

As of 25 Jan 2017, this library scored 96.43% on the Penn Treebank test (0.3% short of a state-of-the-art tagger).
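The test script itself is not shown here, so the following is only a sketch of how a token-level accuracy figure of this kind is commonly computed: the share of tokens whose predicted tag matches the gold-standard tag. `tokenAccuracy` is a hypothetical helper, and the sample tags are made up for illustration.

```javascript
// Hypothetical helper: percentage of positions where predicted and gold tags match.
function tokenAccuracy(predicted, gold) {
  if (predicted.length !== gold.length) {
    throw new Error("predicted and gold must have the same length");
  }
  if (gold.length === 0) {
    return 0; // nothing to score
  }
  let correct = 0;
  for (let i = 0; i < gold.length; i++) {
    if (predicted[i] === gold[i]) {
      correct++;
    }
  }
  return (correct / gold.length) * 100;
}

// Made-up example: 3 of 4 tags match.
console.log(tokenAccuracy(["DT", "VBZ", "PRP$", "NN"], ["DT", "VBZ", "PRP$", "NNS"])); // 75
```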

I think it's safe to say that this is the most accurate POS tagger written in JavaScript: the only other JS library I know of is pos-js, which scored 87.8% when I tested it on the same treebank, though it was faster than my implementation with smoothing enabled.

However, if performance is what you're after rather than accuracy, you have the option to disable smoothing in this library. Going by the figures above (38 seconds down to 3 seconds for 132K tokens), this increases speed considerably, making this library even faster than pos-js but with far better accuracy (94.4%).
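One way to make smoothing switchable is a small wrapper around the chained calls. This sketch assumes that smoothing is disabled simply by leaving out the `.smooth()` call and that `.tags` can be read directly after `.initial()`; neither is confirmed by the text above. `tagTokens` is a hypothetical helper, and the tagger class is passed in as an argument so the sketch does not depend on en-pos being installed.

```javascript
// Hypothetical wrapper: runs the initial pass and, optionally, the smoothing pass.
// TagClass is expected to behave like require("en-pos").Tag.
// Assumption: skipping .smooth() is how smoothing is disabled.
function tagTokens(TagClass, tokens, options = {}) {
  const useSmoothing = options.smooth !== false; // smoothing stays on by default
  const tagger = new TagClass(tokens).initial();
  return useSmoothing ? tagger.smooth().tags : tagger.tags;
}

// Possible usage with en-pos (requires the package to be installed):
//   const Tag = require("en-pos").Tag;
//   const fast = tagTokens(Tag, ["this", "is", "my", "sentence"], { smooth: false });
//   const accurate = tagTokens(Tag, ["this", "is", "my", "sentence"]);
```

Keeping smoothing on by default mirrors the usage example, where both passes are chained.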

Building from source and testing

Build: tsc (requires typescript)

Test: node test/test.ts

Credits