Try any Node.js package right in your browser

This is a playground for testing code. It runs a full Node.js environment with all of npm’s 1,000,000+ packages pre-installed, including node-normalizer. Try it out:

var nodeNormalizer = require("node-normalizer")

This service is provided by RunKit and is not affiliated with npm, Inc or the package authors.

node-normalizer v0.2.0

Normalize, clean and fix text

npm install node-normalizer

The simple app processes input and tries to make it consumable for a bot.

The order in which the processing happens is important.

  1. spelling corrections for common spelling errors
  2. idiom conversions
  3. junk word removal from the sentence
  4. special sentence effects (question, exclamation, revert question)
  5. abbreviation expansion and canonization

Substitution files include the following conventions:

  • The format is: left phrase separated by _ yields right phrase separated by +
  • We use + because we don't want the resulting phrase to be recognized by the idiom processor, which would cause the processor to delete the phrase
  • A missing right phrase means delete the left phrase
  • <xxx means sentence start, then xxx
  • xxx> means xxx, then sentence end
  • If the right side is %value, set that bit on the sentence (%EXCLAMATIONMARK, %QUESTIONMARK)
  • If the right side is a ~word, it's an interjection
  • If you want the result NOT tokenized, put it in quotes
  • For abbreviations, do not use _ before the .
  • An apostrophized left side must follow tokenizing conventions
  • An apostrophized right side means the word is not spell-checked; the apostrophe will disappear
  • Only proper names should have capital letters
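
The rule format above can be illustrated with a rough sketch. The helpers `parseRule` and `applyRule` and the three sample rules are hypothetical and are not part of node-normalizer's API or data files. The sketch also assumes that a rule line is "left right" separated by whitespace, that a %value rule removes the matched phrase, and that the + separators become spaces in the output. Quoted results and the apostrophe conventions are ignored.

```javascript
// Hypothetical helpers for illustration; they are NOT node-normalizer's API.

// Escape regex metacharacters so the left phrase is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Parse one rule line of the assumed form "left_phrase right+phrase".
function parseRule(line) {
  const [rawLeft, right = ""] = line.trim().split(/\s+/);
  const atStart = rawLeft.startsWith("<"); // <xxx: sentence start, then xxx
  const atEnd = rawLeft.endsWith(">");     // xxx>: xxx, then sentence end
  const left = rawLeft.slice(atStart ? 1 : 0, atEnd ? -1 : undefined);
  const rule = { phrase: left.split("_").join(" "), atStart, atEnd };

  if (right === "") return { ...rule, kind: "delete" }; // missing right side
  if (right.startsWith("%")) return { ...rule, kind: "flag", value: right.slice(1) };
  if (right.startsWith("~")) return { ...rule, kind: "interjection", value: right };
  return { ...rule, kind: "replace", value: right };
}

// Apply one parsed rule to a sentence; flag names are collected in `flags`.
function applyRule(sentence, rule, flags = new Set()) {
  const pattern = new RegExp(
    (rule.atStart ? "^" : "\\b") + escapeRegExp(rule.phrase) + (rule.atEnd ? "$" : "\\b"),
    "i"
  );
  if (!pattern.test(sentence)) return sentence;

  let replacement = "";
  if (rule.kind === "flag") flags.add(rule.value);
  if (rule.kind === "interjection") replacement = rule.value;
  // Assumption: the + separators are turned into spaces in the final text.
  if (rule.kind === "replace") replacement = rule.value.split("+").join(" ");

  return sentence.replace(pattern, () => replacement).replace(/\s+/g, " ").trim();
}

// Usage with made-up rules, applied in order.
const flags = new Set();
let text = "well i dont know huh";
for (const line of ["<well", "i_dont_know i+do+not+know", "huh> %QUESTIONMARK"]) {
  text = applyRule(text, parseRule(line), flags);
}
// text is now "i do not know" and flags contains "QUESTIONMARK"
console.log(text, [...flags]);
```

Because each rule rewrites the sentence that the next rule sees, changing the order of the rules can change the result, which is why the processing order matters.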
