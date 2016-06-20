The node
icu-bidi package binds to the ICU (55.1) library in order to
provide an implementation of the Unicode BiDi algorithm.
The JavaScript API follows the API of icu4c fairly closely.
var ubidi = require('icu-bidi');
var e = 'English';
var h = 'עִבְרִית';
var input = e + ' ' + h;
console.log( input );
var p = ubidi.Paragraph(input, {
  // this hash is optional; these are the default values:
  paraLevel: ubidi.DEFAULT_LTR,
  reorderingMode: ubidi.ReorderingMode.DEFAULT,
  reorderingOptions: 0, // uses ubidi.ReorderingOptions.*
  orderParagraphsLTR: false,
  inverse: false,
  prologue: '',
  epilogue: '',
  embeddingLevels: null /* Unimplemented */
});
console.log( 'number of paragraphs', p.countParagraphs() );
console.log( 'paragraph level', p.getParaLevel() );
// direction is 'ltr', 'rtl', or 'mixed'
console.log( 'direction', p.getDirection() );
var i, levels = [];
for (i = 0; i < p.getProcessedLength(); i++) {
  levels.push( p.getLevelAt(i) );
}
console.log( levels.join(' ') );
for (i = 0; i < p.countRuns(); i++) {
  var run = p.getVisualRun(i);
  console.log( 'run', run.dir, 'from', run.logicalStart, 'len', run.length );
}
console.log( p.writeReordered(ubidi.Reordered.KEEP_BASE_COMBINING) );
This example prints the following when run:
English עִבְרִית
number of paragraphs 1
paragraph level 0
direction mixed
0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1
run ltr from 0 len 8
run rtl from 8 len 8
English תירִבְעִ
ubidi.Paragraph(text, options) returns a new Paragraph object with the
results of the bidi algorithm.
text: UTF-16 encoded text as a standard JavaScript string; you know
the drill.
options (optional): a hash containing various settings which can
affect the bidi algorithm. All are optional.
paraLevel:
Specifies the default level for the text; it is typically 0 (LTR)
or 1 (RTL). To have the paragraph level determined from the text
instead, set paraLevel to either ubidi.DEFAULT_LTR or
ubidi.DEFAULT_RTL.
reorderingMode:
Modifies the operation of the Bidi algorithm so that it implements
a variant of the basic Bidi algorithm or approximates an
"inverse Bidi" algorithm, depending on the value of the
reordering mode.
ubidi.ReorderingMode.DEFAULT:
The standard Bidi Logical to Visual algorithm is applied.
ubidi.ReorderingMode.REORDER_NUMBERS_SPECIAL:
Approximate the algorithm used in Microsoft Windows XP rather
than strictly conform to the Unicode Bidi algorithm.
ubidi.ReorderingMode.GROUP_NUMBERS_WITH_R:
Numbers located between LTR text and RTL text are
associated with the RTL text. This makes the algorithm
reversible and makes it useful when round trip must be
achieved without adding LRM characters. However, this is a
variation from the standard Unicode Bidi algorithm.
ubidi.ReorderingMode.RUNS_ONLY:
Logical-to-Logical transformation. If the default text level of
the source text (paraLevel) is even, the source text will be
handled as LTR logical text and will be transformed to the RTL
logical text which has the same LTR visual display. If the default
level of the source text is odd, the source text will be handled
as RTL logical text and will be transformed to the LTR logical
text which has the same LTR visual display.
ubidi.ReorderingMode.INVERSE_NUMBERS_AS_L:
An "inverse Bidi" algorithm is applied. This mode is
equivalent to setting the inverse option to true.
ubidi.ReorderingMode.INVERSE_LIKE_DIRECT:
The "direct" Logical to Visual Bidi algorithm is used as
an approximation of an "inverse Bidi" algorithm.
ubidi.ReorderingMode.INVERSE_FOR_NUMBERS_SPECIAL:
The Logical to Visual Bidi algorithm used in Windows XP is
used as an approximation of an "inverse Bidi" algorithm.
reorderingOptions:
Specify which of the reordering options should be applied
during Bidi transformations. The value is a combination using
bitwise OR of zero or more of the following:
ubidi.ReorderingOptions.DEFAULT
Disable all the options which can be set with this function.
ubidi.ReorderingOptions.INSERT_MARKS
Insert Bidi marks (LRM or RLM) when needed to ensure a correct
result when reordering to logical order.
ubidi.ReorderingOptions.REMOVE_CONTROLS
Remove Bidi control characters.
ubidi.ReorderingOptions.STREAMING
Process the output as part of a stream to be continued.
orderParagraphsLTR: (boolean)
Specify whether block separators must be allocated level zero, so
that successive paragraphs will progress from left to right.
inverse: (boolean)
Modify the operation of the Bidi algorithm such that it approximates
an "inverse Bidi" algorithm.
prologue: (string)
A preceding context for the given text.
epilogue: (string)
A trailing context for the given text.
embeddingLevels: (array)
CURRENTLY UNIMPLEMENTED.
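As a rough sketch of how these settings fit together, the helper below
builds an options hash that requests an RTL default paragraph level and
combines two reordering options with bitwise OR. The name makeOptions is
hypothetical, and ubidi is passed in as a parameter (standing for the
result of require('icu-bidi')) only so that the snippet does not depend
on the native module being loaded. Whether this particular combination of
flags is useful depends on the reordering mode in use.

```javascript
// Hypothetical helper: build an options hash for ubidi.Paragraph().
// `ubidi` is assumed to be the object returned by require('icu-bidi').
function makeOptions(ubidi) {
  var RO = ubidi.ReorderingOptions;
  return {
    // Let the algorithm pick the level from the text, defaulting to RTL.
    paraLevel: ubidi.DEFAULT_RTL,
    // Flags are combined with bitwise OR, as described above.
    reorderingOptions: RO.INSERT_MARKS | RO.REMOVE_CONTROLS
  };
}

// Typical use (requires the icu-bidi package to be installed):
//   var ubidi = require('icu-bidi');
//   var p = ubidi.Paragraph(text, makeOptions(ubidi));
```

Options left out of the hash keep the default values shown in the example
at the top of this document.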
Paragraph#setLine(start, limit) returns a subsidiary Paragraph object
containing the reordering information, especially the resolved levels,
for all the characters in a line of text.
This line of text is specified by referring to a Paragraph object
representing this information for a piece of text containing one or
more paragraphs, and by specifying a range of indexes in this text.
In the returned line object, the indexes will range from 0 to
limit-start-1.
This is used after creating a Paragraph for a piece of text, and
after line-breaking on that text. It is not necessary if each
paragraph is treated as a single line.
After line-breaking, rules (L1) and (L2) for the treatment of trailing
WS and for reordering are performed on the returned Paragraph object
representing a line.
start:
The line's first index into the text.
limit:
The index just past the line's last index into the text (its last
index + 1). It must satisfy
0 <= start < limit <= containing paragraph limit.
If the specified line crosses a paragraph boundary, the function will
throw an Exception.
See the icu docs for more information.
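To illustrate the start/limit convention, here is a minimal sketch that
turns a list of line-break offsets into line objects. The helper name
splitIntoLines and the breaks array are hypothetical; the offsets are
assumed to come from whatever line-breaking step the application uses.

```javascript
// Hypothetical helper: turn a list of line-break offsets into line objects.
// `para` is assumed to be a Paragraph created for the whole text, and
// `breaks` an ascending array of limits (each one past a line's last index).
function splitIntoLines(para, breaks) {
  var lines = [];
  var start = 0;
  breaks.forEach(function (limit) {
    // setLine(start, limit) covers indexes start .. limit-1 of the text.
    lines.push(para.setLine(start, limit));
    start = limit;
  });
  return lines;
}
```

Because a line may not cross a paragraph boundary, the caller would need
to make sure that every paragraph limit also appears in breaks.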
Get the directionality of the text.
Returns 'ltr' or 'rtl' if the text is unidirectional; otherwise
returns 'mixed'.
See the icu docs for more information.
Returns the length of the text that the Paragraph object was created for.
See the icu docs for more information.
Get the length of the source text processed by the Paragraph constructor.
This length may be different from the length of the source text if option
ubidi.ReorderingOptions.STREAMING has been set.
See the icu docs for more information.
Get the length of the reordered text resulting from the Paragraph
constructor.
This length may be different from the length of the source text if
option ubidi.ReorderingOptions.INSERT_MARKS or option
ubidi.ReorderingOptions.REMOVE_CONTROLS has been set.
See the icu docs for more information.
Returns the paragraph level. If there are multiple paragraphs, their
level may vary if the required paraLevel is ubidi.DEFAULT_LTR or
ubidi.DEFAULT_RTL. In that case, the level of the first paragraph is
returned.
See the icu docs for more information.
Returns the number of paragraphs.
See the icu docs for more information.
Get information about a paragraph, given a position within the text which is contained by that paragraph.
Returns an object with the following properties:
index:
The index of the paragraph containing the specified position.
start:
The index of the first character of the paragraph in the text.
limit:
The limit of the paragraph.
level:
The level of the paragraph.
dir:
The directionality of the paragraph, either 'ltr' or 'rtl'. This is
derived from bit 0 of the level; see UBiDiLevel.
See the icu docs for more information.
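The relationship between level and dir can be sketched in a few lines:
even levels are left-to-right and odd levels are right-to-left. The name
dirFromLevel is hypothetical and used only for illustration; it is not
part of the package.

```javascript
// Hypothetical helper: derive a direction string from a resolved level.
// Bit 0 of the level is 0 for left-to-right and 1 for right-to-left.
function dirFromLevel(level) {
  return (level & 1) ? 'rtl' : 'ltr';
}

console.log(dirFromLevel(0)); // ltr
console.log(dirFromLevel(1)); // rtl
console.log(dirFromLevel(2)); // ltr (even levels are LTR embeddings)
```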
Get information about a paragraph, given the index of the paragraph.
Returns an object with the same properties as getParagraph, above.
See the icu docs for more information.
Returns the level for the character at charIndex, or 0 if charIndex is
not in the valid range.
See the icu docs for more information.
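Per-character levels, such as those collected with getLevelAt in the
example at the top of this document, can be grouped into maximal
stretches that share one level. The sketch below does that on a plain
array, so it runs without the native module; levelRuns is a hypothetical
name, not a function of the package.

```javascript
// Hypothetical helper: group an array of per-character levels into
// maximal stretches of equal level, in logical order.
function levelRuns(levels) {
  var runs = [];
  for (var i = 0; i < levels.length; i++) {
    var last = runs[runs.length - 1];
    if (last && last.level === levels[i]) {
      last.limit = i + 1; // extend the current stretch
    } else {
      runs.push({ start: i, limit: i + 1, level: levels[i] });
    }
  }
  return runs;
}

// Levels printed by the example at the top of this document.
var sample = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1];
console.log(levelRuns(sample));
```

For the sample levels this yields two stretches, 0 to 8 at level 0 and
8 to 16 at level 1, which matches the two runs reported in that example.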
Returns the number of runs in this text.
See the icu docs for more information.
Get one run's logical start, length, and directionality ('ltr' or 'rtl').
In an RTL run, the character at the logical start is visually on the
right of the displayed run. The length is the number of characters in
the run. The runIndex parameter is the number of the run in visual
order, in the range [0..Paragraph#countRuns()-1].
Returns an object with the following properties:
dir:
The directionality of the run, either 'ltr' or 'rtl'.
logicalStart:
The first logical character index in the text.
length:
The number of characters (at least one) in the run.
See the icu docs for more information.
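A minimal sketch of walking the runs in visual order and picking out the
matching slice of the source text for each one. The helper name
visualRunText is hypothetical; para is assumed to be a Paragraph (or line)
object and text the string it was created from.

```javascript
// Hypothetical helper: list the runs in visual order together with the
// slice of the source text that each run covers.
function visualRunText(para, text) {
  var out = [];
  for (var i = 0; i < para.countRuns(); i++) {
    var run = para.getVisualRun(i);
    out.push({
      dir: run.dir,
      // The slice is still in logical order, even for an 'rtl' run.
      text: text.slice(run.logicalStart, run.logicalStart + run.length)
    });
  }
  return out;
}
```

The slices are not reversed here; to obtain a display-ordered string,
writeReordered is the safer route, since it keeps multi-unit characters
intact.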
This function returns information about a run and is used to retrieve runs in logical order. This is especially useful for line-breaking on a paragraph.
Returns an object with the following properties:
logicalLimit:
The limit of the corresponding run.
level:
The level of the corresponding run.
dir:
The directionality of the run, either 'ltr' or 'rtl'. This is
derived from bit 0 of the level; see UBiDiLevel.
See the icu docs for more information.
Get the visual position from a logical text position.
The value returned may be ubidi.MAP_NOWHERE if there is no visual
position because the corresponding text character is a Bidi control
removed from output by the option
ubidi.ReorderingOptions.REMOVE_CONTROLS.
See the icu docs for more information.
Get the logical text position from a visual position.
The value returned may be ubidi.MAP_NOWHERE if there is no logical
position because the corresponding text character is a Bidi mark
inserted in the output by option ubidi.ReorderingOptions.INSERT_MARKS.
See the icu docs for more information.
Takes a Paragraph object containing the reordering information for a
piece of text (one or more paragraphs) set by new Paragraph, or for a
line of text set by Paragraph#setLine(), and returns a reordered string.
This function preserves the integrity of characters with multiple code units and (optionally) combining characters. Characters in RTL runs can be replaced by mirror-image characters in the returned string. There are also options to insert or remove Bidi control characters.
The options argument is optional. If it is present, it is a bit set
of options for the reordering that control how the reordered text is
written. The available options are:
ubidi.Reordered.KEEP_BASE_COMBINING
Keep combining characters after their base characters in RTL runs.
ubidi.Reordered.DO_MIRRORING
Replace characters with the "mirrored" property in RTL runs by their
mirror-image mappings.
ubidi.Reordered.INSERT_LRM_FOR_NUMERIC
Surround the run with LRMs if necessary; this is part of the approximate
"inverse Bidi" algorithm.
ubidi.Reordered.REMOVE_BIDI_CONTROLS
Remove Bidi control characters (this does not affect
ubidi.Reordered.INSERT_LRM_FOR_NUMERIC).
ubidi.Reordered.OUTPUT_REVERSE
Write the output in reverse order.
See the icu docs for more information.
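Since the options form a bit set, several of them can be passed at once
by combining them with bitwise OR. The sketch below asks for combining
characters to stay after their base characters and for mirrored
characters to be swapped. The helper name reorderForDisplay is
hypothetical, and ubidi is passed in as a parameter (standing for the
result of require('icu-bidi')) so the snippet does not load the native
module itself.

```javascript
// Hypothetical helper: produce a reordered string with two options set.
// `ubidi` is assumed to be the object returned by require('icu-bidi'),
// and `para` a Paragraph (or line) object created from it.
function reorderForDisplay(ubidi, para) {
  var R = ubidi.Reordered;
  return para.writeReordered(R.KEEP_BASE_COMBINING | R.DO_MIRRORING);
}

// Typical use (requires the icu-bidi package to be installed):
//   var ubidi = require('icu-bidi');
//   var p = ubidi.Paragraph('English עִבְרִית');
//   console.log(reorderForDisplay(ubidi, p));
```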
You can use npm to download and install:
The latest icu-bidi package:
npm install icu-bidi
GitHub's master branch:
npm install https://github.com/cscott/node-icu-bidi/tarball/master
In both cases the module is automatically built with npm's internal
version of node-gyp, and thus your system must meet node-gyp's
requirements.
It is also possible to make your own build of icu-bidi from its source
instead of its npm package (see below).
Unless building via npm install (which uses its own node-gyp) you will
need node-gyp installed globally:
npm install node-gyp -g
The icu-bidi module depends only on libicu. However, by default, an
internal/bundled copy of libicu will be built and statically linked, so
an externally installed libicu is not required.
If you wish to install against an external libicu then you need to
pass the --libicu argument to node-gyp, npm install or the configure
wrapper.
./configure --libicu=external
make
Or, using node-gyp directly:
node-gyp --libicu=external rebuild
Or, using npm:
npm install --libicu=external
If building against an external libicu, make sure to have the
development headers available. Mac OS X ships with these by
default. If you don't have them installed, install the -dev package
with your package manager, e.g. apt-get install libicu-dev for
Debian/Ubuntu. Make sure that you have libicu 52.1 or later.
mocha is required to run unit tests.
npm install mocha
npm test
Copyright (c) 2013-2014 C. Scott Ananian.
icu-bidi is licensed using the same ICU license as the libicu library
itself. It is an MIT/X-style license.