editDistance() can access memory outside the bounds of mEditDistanceTable
when called with strings that contain MAX_WORD_LENGTH_INTERNAL characters.
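For illustration only, a minimal sketch of an edit distance routine whose
table has room for the extra "empty prefix" row and column, so that words of
exactly MAX_WORD_LENGTH_INTERNAL characters stay in bounds. The value 48, the
names editDistanceSketch and sEditDistanceTable, and the idea that the overrun
comes from a missing "+ 1" are assumptions, not taken from the actual code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical value for illustration; the real constant is defined by the project.
static const std::size_t MAX_WORD_LENGTH_INTERNAL = 48;

// One extra row and column for the empty prefix, so a word of exactly
// MAX_WORD_LENGTH_INTERNAL characters still fits.
static int sEditDistanceTable[(MAX_WORD_LENGTH_INTERNAL + 1) * (MAX_WORD_LENGTH_INTERNAL + 1)];

// Plain Levenshtein distance between a[0..lenA) and b[0..lenB).
static int editDistanceSketch(const unsigned short *a, std::size_t lenA,
        const unsigned short *b, std::size_t lenB) {
    // Reject inputs that would not fit in the table.
    assert(lenA <= MAX_WORD_LENGTH_INTERNAL && lenB <= MAX_WORD_LENGTH_INTERNAL);
    const std::size_t stride = lenB + 1;
    int *dp = sEditDistanceTable;
    for (std::size_t i = 0; i <= lenA; ++i) dp[i * stride] = static_cast<int>(i);
    for (std::size_t j = 0; j <= lenB; ++j) dp[j] = static_cast<int>(j);
    for (std::size_t i = 1; i <= lenA; ++i) {
        for (std::size_t j = 1; j <= lenB; ++j) {
            const int cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            const int deletion = dp[(i - 1) * stride + j] + 1;
            const int insertion = dp[i * stride + (j - 1)] + 1;
            const int substitution = dp[(i - 1) * stride + (j - 1)] + cost;
            dp[i * stride + j] = std::min(std::min(deletion, insertion), substitution);
        }
    }
    // Largest index touched is lenA * stride + lenB, which is inside the table.
    return dp[lenA * stride + lenB];
}
```

With both lengths equal to MAX_WORD_LENGTH_INTERNAL the last index written is
48 * 49 + 48 = 2400, one less than the 49 * 49 = 2401 entries allocated.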
Change-Id: I996e6cf21bd6acd6584beb4046c10491a044191e
Move a purely dictionary-format-related function that is needed
both by unigrams and bigrams to the binary format handling
file.
Also remove the empty UnigramDictionary::getBigrams placeholder
function, on grounds that it should be in the BigramDictionary
class.
Bug: 5046459
Change-Id: I8a67a25f72122e2fa0b19ae1d936db25eb0b20ba
Getting the frequency of a terminal is not very useful by itself, but
getting its position will be very useful for retrieving bigrams
later.
Moreover, once the position is known, it's easy to find out the frequency.
Bug: 5046459
Change-Id: Ica53472c2038c7e407dbd1399d336511c731087f
Take a function that does not need to be a member and make it
static inline.
Also replace the return value of -1 with a #define'd constant.
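A rough sketch of the pattern, for illustration only. The names
NOT_FOUND_SKETCH and findCharSketch are hypothetical; this message does not
name the function or the constant involved.

```cpp
// Hypothetical name; stands in for the #define'd constant that replaces a bare -1.
#define NOT_FOUND_SKETCH (-1)

// Uses only its arguments and no object state, so it can be a file-local
// static inline function instead of a class member.
static inline int findCharSketch(const unsigned short *word, int length, unsigned short c) {
    for (int i = 0; i < length; ++i) {
        if (word[i] == c) return i;
    }
    return NOT_FOUND_SKETCH;
}
```

Callers then compare against the named constant rather than a literal -1.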
Change-Id: I92e0deaa1df65998b76aba6329a4c8eb4d287485
This actually implements the new dictionary format, but does not
activate the implementation through #defines.
Bug: 4392433
Change-Id: I9b26b9bcb4b823a36e0984799b69730acfc6f7f3
Move functions that will be modified and enclose those that will
be replaced into #ifdefs.
This change does not modify any code; it only moves some code around.
Bug: 4392433
Change-Id: Ibefbda1eb8bdc8a0c72de47ad9c67a08d0aca960
Consolidate terminal cases, streamline the word adding process
and create the entry points for adding alternate spellings, with an
empty implementation.
Bug: 4392433
Change-Id: I781c93ec49945d71c7c20624c86596aa49add4c8
This prepares the way for spell checking, which will be done
without context and therefore without proximity info.
Bug: 4176026
Change-Id: I1b4bfaefe2611e1b484acdf3c33598cb80f81ff4
Bug: 4271049
According to the results of a recent user study, a word with a missing
character needs to be promoted a bit. This changes the formula from:
    freq * 70 * (n - 2) / (n - 1)
to:
    freq * 90 * (10n - 12) / (10n - 2)
Change-Id: Ibff72cbdb0f2d7b91460a06a0fd39a9f5749aa46
Words that matched user input with skipped characters used to be demoted
by a constant factor in BinaryDictionary and not at all in the dictionaries
implemented in Java code. Because a skipped character matters more the
shorter the word is, this change implements a demotion that gets larger as
the typed word gets shorter. The demotion rate is (n - 2) / (n - 1), where
n is the length of the typed word, for n >= 2.
This is implemented for both BinaryDictionary and the Java dictionaries.
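A minimal sketch of applying that rate with integer arithmetic. The name
demoteForSkippedChar is hypothetical, and returning the frequency unchanged
for n < 2 is an assumption, since the rate is only defined for n >= 2.

```cpp
// Applies the (n - 2) / (n - 1) demotion to an integer frequency.
static int demoteForSkippedChar(int freq, int typedLength) {
    // Assumption: leave the frequency untouched where the rate is undefined.
    if (typedLength < 2) return freq;
    // Multiply before dividing: in integer arithmetic (n - 2) / (n - 1)
    // on its own would always truncate to 0.
    return freq * (typedLength - 2) / (typedLength - 1);
}
```

A frequency of 100 becomes 0 for a 2-letter word, 50 for 3 letters, 75 for
5 letters and 90 for 11 letters, so short words are penalized the most.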
Bug: 3340731
Change-Id: I3a18be80a9708981d56a950dc25fe08f018b5b89
For German: treat "ae", "oe" and "ue" as alternate forms of the
umlaut-bearing versions of "a", "o" and "u".
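As a rough illustration of the mapping, the sketch below rewrites a whole
word from its digraph spelling to its umlaut spelling in UTF-8. The name
digraphsToUmlauts is hypothetical, and only lower-case digraphs are handled;
how the dictionary lookup actually considers the alternate forms is not
described here.

```cpp
#include <cstddef>
#include <string>

// Replaces every lower-case "ae", "oe", "ue" with the UTF-8 encoding of
// the corresponding umlaut. Input is assumed to be plain ASCII.
static std::string digraphsToUmlauts(const std::string &word) {
    std::string out;
    for (std::size_t i = 0; i < word.size(); ++i) {
        if (i + 1 < word.size() && word[i + 1] == 'e') {
            const char *umlaut = nullptr;
            switch (word[i]) {
                case 'a': umlaut = "\xC3\xA4"; break;  // U+00E4
                case 'o': umlaut = "\xC3\xB6"; break;  // U+00F6
                case 'u': umlaut = "\xC3\xBC"; break;  // U+00FC
                default: break;
            }
            if (umlaut != nullptr) {
                out += umlaut;
                ++i;  // skip the 'e' that was consumed by the digraph
                continue;
            }
        }
        out += word[i];
    }
    return out;
}
```

A blind rewrite like this also changes words where the digraph is genuine
(for example "neue"), which is why the two spellings are best treated as
alternates to try rather than as a replacement.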
Issue: 3275926
Change-Id: I056c707cdacc464ceab63be56c016c7f8439196c
The `snr' variable has a very obscure name. Rename it to `matchWeight'.
Also, the name of the `toLowerCase' function is error-prone, since it
actually returns a lower case version of the BASE char, that is, without
diacritics. Hence, rename it to `toBaseLowerCase' and update variables with
similar names.
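To show what "base lower case" means, here is a sketch covering ASCII and
the accented vowels of Latin-1 only. The names inRange and
toBaseLowerCaseSketch are hypothetical, and the real function presumably
covers far more characters than this.

```cpp
static bool inRange(unsigned short c, unsigned short first, unsigned short last) {
    return c >= first && c <= last;
}

// Returns the lower-case base letter of c: case is folded and diacritics
// on Latin-1 vowels are dropped. Other code points are returned unchanged.
static unsigned short toBaseLowerCaseSketch(unsigned short c) {
    if (inRange(c, 'A', 'Z')) return static_cast<unsigned short>(c + ('a' - 'A'));
    // Latin-1 upper-case letters sit 0x20 below their lower-case forms;
    // U+00D7 is the multiplication sign, not a letter.
    if (inRange(c, 0x00C0, 0x00DE) && c != 0x00D7) {
        c = static_cast<unsigned short>(c + 0x20);
    }
    if (inRange(c, 0x00E0, 0x00E5)) return 'a';
    if (inRange(c, 0x00E8, 0x00EB)) return 'e';
    if (inRange(c, 0x00EC, 0x00EF)) return 'i';
    if (inRange(c, 0x00F2, 0x00F6)) return 'o';
    if (inRange(c, 0x00F9, 0x00FC)) return 'u';
    return c;
}
```

Both U+00C9 and U+00E9 map to 'e', which a caller expecting a plain
lower-casing function would not anticipate; hence the more explicit name.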
Change-Id: Ibdbe73018a33ee864db59a51d664c3b104d5fb3f
When entering a word without accents the user expects the system to
add accents automatically if there is no other matching word. This
patch ensures the accented version is promoted accordingly and
autocorrection really takes place.
Issue: 3400015
Change-Id: I8cd3db5bf131ec6844b26abecc1ecbd1d6269df4
Stop considering accented characters as different from their base
character for proximity scoring.
Also give a huge boost (basically overriding frequency) to a word
fully matched with only differing accents.
Bug: 2550587
Change-Id: I2da7a71229fb3868d9e4a53703ccf8caeb6fcf10
Bug: 3374359
Bug: 3278422
"zbe" will be auto corrected to "be" by fixing s-line
"teh" will be auto corrected to "the" by promotion of full matched words
Change-Id: I314c632820e4e0b1501edeca60ada205d291451f
ApplicationInfo.sourceDir may or may not be an apk file name; it can be a directory as well.
The spec just says it's the "Full path to the location of this package".
Also add error handling in loadDictionary().
Change-Id: I5e64d0aba4b1ec7634f4b3ac5537e7a774433ece
This change is a preparation for upcoming optimizations on dictionary file loading.
* We can consolidate dictionary files because we are no longer relying on Asset Manager.
* Stop compressing dictionary files, since we plan to use mmap() on the region in the apk file.
* We probably won't rely on Asset Manager. Instead we'll probably use the offset and size obtained from AssetFileDescriptor.
Change-Id: Id57dce512fd3d2397a58628f8264bd824194da76
IA builds will break (due to the bionic _dso_handle bug) if stale libraries are
used. For now, just guard the definitions against IA builds.
Change-Id: Ic9df6e0de78a0e221b95370ba6f01ce07714edde
Signed-off-by: Bruce Beare <bruce.j.beare@intel.com>