Without personalization:
Total words: 1134774, Success Num: 899230, Success Percentage: 79.243%
Bad Failures, with auto-correction (typed word == expected word, output word != expected word): 1871, Bad Failure Percentage: 0.165%
Failures, with auto-correction (F-C): 29084, F-C Percentage: 2.563%
Max Keystrokes: 6072959, Min Keystrokes: 4436090, Keystroke Saving Percentage:26.953%
Before:
Total words: 1134646, Success Num: 925194, Success Percentage: 81.540%
Bad Failures, with auto-correction (typed word == expected word, output word != expected word): 1316, Bad Failure Percentage: 0.116%
Failures, with auto-correction (F-C): 28288, F-C Percentage: 2.493%
Max Keystrokes: 6072831, Min Keystrokes: 3946188, Keystroke Saving Percentage:35.019%
After:
Total words: 1134659, Success Num: 944746, Success Percentage: 83.263%
Bad Failures, with auto-correction (typed word == expected word, output word != expected word): 1258, Bad Failure Percentage: 0.111%
Failures, with auto-correction (F-C): 28016, F-C Percentage: 2.469%
Max Keystrokes: 6072844, Min Keystrokes: 3387333, Keystroke Saving Percentage:44.222%
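The percentages in these reports appear to follow directly from the raw counts. A minimal sketch, assuming that the success percentage is Success Num divided by Total words and that the keystroke saving is (Max - Min) / Max. The function names are illustrative and are not part of the regression test tool:

```python
def percentage(part, whole):
    """Return part / whole expressed as a percentage."""
    return 100.0 * part / whole


def keystroke_saving(max_keystrokes, min_keystrokes):
    """Share of keystrokes saved relative to the maximum (assumed formula)."""
    return percentage(max_keystrokes - min_keystrokes, max_keystrokes)


# Figures taken from the "Without personalization" run.
success = percentage(899230, 1134774)
saving = keystroke_saving(6072959, 4436090)
print(f"Success Percentage: {success:.3f}%")
print(f"Keystroke Saving Percentage: {saving:.3f}%")
```

Under these assumptions the sketch reproduces the reported 79.243% and 26.953% for that run.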
Change-Id: I3af42ec37a11847c0429c28616e726f6a339247f
Previously, when both legitimate 0-frequency words (such as
distracters) and offensive words were encoded in the same
way, distracters would never show up when the user blocked
offensive words (the default setting, as well as the setting
for regression tests).
When b/11031090 was fixed and a separate encoding was used
for offensive words, 0-frequency words would no longer be
blocked when they were an "exact match" (where case
mismatches and accent mismatches would be considered an
"exact match"). The exact match boosting functionality meant
that, for example, when the user typed "mt" they would be
suggested the word "Mt", although they most probably meant
to type "my".
For this reason, we introduced this change, which does the
following:
* Defines the "perfect match" as a really exact match, with
no room for case or accent mismatches
* When the target word has probability zero (as "Mt" does,
because it is a distracter), ONLY boost its score if it is a
perfect match.
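The two rules above can be sketched as follows. This is only an illustration under stated assumptions, not the native implementation: the function names are hypothetical, and "exact match" is approximated here as equality after lower-casing and stripping combining accents.

```python
import unicodedata


def _fold(word):
    """Strip combining accents and lower-case (assumed notion of 'exact match')."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).lower()


def is_exact_match(typed, candidate):
    """Match that tolerates case and accent mismatches."""
    return _fold(typed) == _fold(candidate)


def is_perfect_match(typed, candidate):
    """Really exact match: no room for case or accent mismatches."""
    return typed == candidate


def should_boost(typed, candidate, probability):
    """Zero-probability words are boosted only on a perfect match."""
    if probability == 0:
        return is_perfect_match(typed, candidate)
    return is_exact_match(typed, candidate)


print(should_boost("mt", "Mt", 0))  # typed "mt": "Mt" is not boosted
print(should_boost("Mt", "Mt", 0))  # typed "Mt": "Mt" is boosted
```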
By doing this, when the user types "mt", the word "Mt" will
NOT be boosted, and they will get "my". However, if the user
makes an explicit effort to type "Mt", we do boost the word
"Mt" so that the user's input is not autocorrected to "My".
Bug: 11031090
Change-Id: I92ee1b4e742645d52e2f7f8c4390920481e8fff0
If the user has chosen to block offensive words and types
"aaaxbb", where "aaa" is an offensive word and "bb" is not,
we should not suggest "aaa bb".
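A minimal sketch of the intended behaviour, assuming multi-word suggestions are space-separated strings and that the set of offensive words is available. The names `filter_suggestions` and `offensive_words` are hypothetical and do not come from the LatinIME code:

```python
def filter_suggestions(suggestions, offensive_words, block_offensive=True):
    """Drop any suggestion in which at least one constituent word is offensive."""
    if not block_offensive:
        return list(suggestions)
    return [
        suggestion
        for suggestion in suggestions
        if not any(word in offensive_words for word in suggestion.split())
    ]


# "aaa" stands in for an offensive word, as in the description above.
print(filter_suggestions(["aaa bb", "bb"], {"aaa"}))
```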
Bug: 11031090
Change-Id: Ie23b8dd5d347bc26b1c046c3f5e8dfbc259bf528
Due to this bug, once a word has been deleted it never gets into the
personalized dictionaries again.
Change-Id: Ife4e3fe1ba0615b4135e6291d2151b0db7d3f940
This CL enables Address Sanitizer for the native host tests. Note that
production builds are not affected by this change. ASan is enabled
only in the static lib for test executables.
Change-Id: I2c8e99b8c55e611e86f74579f24a63ac949bb02d
It turned out that building native code for the host environment
is not supported in the NDK build. Hence this CL makes the host
native tests available only as part of the platform build, to
avoid accidental build breakage in unbundled builds.
Bug: 18095678
Change-Id: If608da166d5a478358e6890b8db526b4c2c0ab41
This CL enables Address Sanitizer for the native host tests. Note that
production builds are not affected by this change. ASan is enabled
only in the static lib for test executables.
Change-Id: Idbe1f2e4502dfce9b6fb0253d7ebda8d37fbf84e
No behaviour changes.
Unified the overloaded FusionDictionary::add method to always take an
isPossiblyOffensive argument.
Bug: 11031090
Change-Id: I5741a023ca1ce842d2cf10d4f6c926b0efabaa78
Without personalization:
Total words: 1079345, Success Num: 819749, Success Percentage: 75.949%
Bad Failures, with auto-correction (typed word == expected word, output word != expected word): 1754, Bad Failure Percentage: 0.163%
Failures, with auto-correction (F-C): 28463, F-C Percentage: 2.637%
Max Keystrokes: 6074285, Min Keystrokes: 4649326, Keystroke Saving Percentage:23.459%
With current probability computing logic:
Total words: 1079382, Success Num: 838329, Success Percentage: 77.667%
Bad Failures, with auto-correction (typed word == expected word, output word != expected word): 1332, Bad Failure Percentage: 0.123%
Failures, with auto-correction (F-C): 28558, F-C Percentage: 2.646%
Max Keystrokes: 6074503, Min Keystrokes: 4474102, Keystroke Saving Percentage:26.346%
Remove isof files.
With new probability computing logic:
Total words: 1079356, Success Num: 844954, Success Percentage: 78.283%
Bad Failures, with auto-correction (typed word == expected word, output word != expected word): 1306, Bad Failure Percentage: 0.121%
Failures, with auto-correction (F-C): 27214, F-C Percentage: 2.521%
Max Keystrokes: 6074477, Min Keystrokes: 4243021, Keystroke Saving Percentage:30.150%
Remove isof files.
Bug: 16547409
Change-Id: I3d2a49c7aaa2c0f6835c52ef72d22466ee225789
The locale is used to determine additional proximity characters. This
is dependent on the dictionary language, but was passed as a function
of the layout, which is wrong and would have given bad suggestions in
multi-lingual mode.
Ideally, additional proximity characters should be inserted in the
dictionary header, but for now it's a rather simple change to get
it from the dictionary's locale instead of the proximity info locale.
Also, that allows us to completely remove the locale parameter from
proximity info, which is a much-needed change.
This change has zero effect on unit tests and on regression tests.
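The idea of keying the lookup on the dictionary's locale can be sketched as follows. This is an illustration only: the table contents and the names `ADDITIONAL_PROXIMITY_CHARS` and `additional_proximity_chars` are made up for the example and do not reflect the real data or API.

```python
# Hypothetical per-language table; the entries are invented for illustration.
ADDITIONAL_PROXIMITY_CHARS = {
    "fr": {"e": ["é", "è"]},
    "de": {"u": ["ü"]},
}


def additional_proximity_chars(dictionary_locale, key):
    """Look up extra proximity characters from the dictionary's locale."""
    language = dictionary_locale.split("_")[0]
    return ADDITIONAL_PROXIMITY_CHARS.get(language, {}).get(key, [])


# In multi-lingual mode the layout locale may differ from the dictionary's,
# so the lookup is driven by the dictionary being queried.
print(additional_proximity_chars("fr_FR", "e"))
```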
Bug: 11230254
Change-Id: If95157155db7dccd1f00b8ba55ccb3600283f9e4
The remaining work is to change bigram to ngram in order to support
ngram entry counting, dumping, and migration.
Bug: 14425059
Change-Id: Ifba288a1166996d62a5e57698f63537ea0a2a8ee