This simplifies the code quite a bit.
- GERMAN_UMLAUTS are now handled through a key-value attribute.
The dictionary generator does not need to know about it any more.
- FRENCH_LIGATURES are deprecated as we handle them with shortcuts now.
- CONTAINS_BIGRAMS is deprecated. Bigram processing is always applied
regardless of this flag.
Bug: 11281748
Change-Id: I55a11ba61d3589c1584a3fa6c941374b349b7b5c
The important bug is in findWordInTree. The problem, which is
not obvious, is that we were calling codePointAt() with the
code point index in the string, instead of the char index.
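
A minimal sketch of this class of bug, not the actual findWordInTree
code: the class and method names below are hypothetical. It assumes a
word containing a supplementary character (one code point, two chars)
and converts the code point index with String.offsetByCodePoints()
before calling codePointAt().

```java
final class CodePointIndexDemo {
    // Hypothetical helper: returns the code point found at the given
    // *code point* index by first converting it to a *char* index.
    static int codePointAtCodePointIndex(final String s, final int codePointIndex) {
        final int charIndex = s.offsetByCodePoints(0, codePointIndex);
        return s.codePointAt(charIndex);
    }

    // The buggy pattern: passes a code point index where a char index
    // is expected. Only correct while every code point is one char.
    static int buggyCodePointAt(final String s, final int codePointIndex) {
        return s.codePointAt(codePointIndex);
    }

    public static void main(final String[] args) {
        // U+1F600 is outside the BMP, so it occupies two chars.
        final String word =
                new StringBuilder().appendCodePoint(0x1F600).append("ab").toString();
        // The second code point is 'a', but the char at index 1 is the
        // low surrogate of U+1F600.
        System.out.println(codePointAtCodePointIndex(word, 1) == 'a');
        System.out.println(buggyCodePointAt(word, 1) == 'a');
    }
}
```

With ASCII-only words both methods agree, which is why the problem is
easy to miss.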
The other bug this change fixes was harmless in practice,
because it is in the iteration code, which is only used for debug
and pretty-printing purposes. It is very similar in that it would
subtract a length in code points from a length in chars and
truncate a StringBuilder at that length, so it would fail in a
quite similar manner. This changes the meaning of the "length"
attribute in Position, but it's clearer this way anyway.
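
A minimal sketch of the truncation problem, again with hypothetical
names rather than the actual iterator code. It assumes the caller
wants to drop a number of trailing code points from a StringBuilder,
and lets StringBuilder.offsetByCodePoints() compute the new length in
chars instead of mixing the two units.

```java
final class TruncateDemo {
    // Hypothetical helper: removes the last `codePointCount` code
    // points by computing the new length in chars.
    static void truncateByCodePoints(final StringBuilder sb, final int codePointCount) {
        final int newLength = sb.offsetByCodePoints(sb.length(), -codePointCount);
        sb.setLength(newLength);
    }

    // The buggy pattern: subtracts a code point count from a char
    // count, which leaves extra chars whenever a removed code point
    // is a surrogate pair.
    static void buggyTruncate(final StringBuilder sb, final int codePointCount) {
        sb.setLength(sb.length() - codePointCount);
    }
}
```

For a builder holding "ab" followed by two supplementary characters,
truncateByCodePoints(sb, 2) leaves "ab", whereas buggyTruncate(sb, 2)
leaves "ab" plus the first supplementary character.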
Bug: 8450145
Change-Id: If396f883a9e6449de39351553ba83f5be5bd30f0