Commit graph

1401 commits

Author SHA1 Message Date
Keisuke Kuroyanagi
88bc312ad3 Move dictionary code to top level dictionary dir.
Bug: 18725954
Change-Id: Ia442ba4b5d84311057d83edf6e7aeb151d6a820b
2014-12-17 16:02:09 +09:00
Keisuke Kuroyanagi
0bb038a19d Remove needless include.
Bug: 18725954
Change-Id: I3c823fda1b7daf41d82b118d9495f5f2356a1a5e
2014-12-15 18:54:42 +09:00
Keisuke Kuroyanagi
5e1b225082 Add missing error check in TrieMap.
Bug: 18725954
Change-Id: I8fcb0d15dda3f823a4575fe707bcdec57ff5e89b
2014-12-15 18:53:22 +09:00
Keisuke Kuroyanagi
ad546afbaa Remove dependency on jni.h from WordProperty.
Bug: 18725954
Change-Id: Ic97d3a56b036ff042322c9f794504064046fd7d7
2014-12-15 18:09:45 +09:00
Keisuke Kuroyanagi
52e92f812b Merge "Quit auto-correct explicit accented letters to base letters." 2014-12-09 10:24:56 +00:00
Keisuke Kuroyanagi
515c508135 Quit auto-correct explicit accented letters to base letters.
Bug: 7677193
Change-Id: I66eddbf27a9db8682c0347a1be19922792a3bea7
2014-12-09 19:23:27 +09:00
Keisuke Kuroyanagi
b0063751fc Merge "Enable Quadgram for personalized dicts." 2014-12-03 03:43:50 +00:00
Keisuke Kuroyanagi
20da4f07be Merge "Use enum to specify ngram type." 2014-11-25 10:34:15 +00:00
Keisuke Kuroyanagi
60021bbdc2 Enable Quadgram for personalized dicts.
Before:
Total words: 1134659, Success Num: 944709, Success Percentage: 83.259%
Bad Failures, with auto-correction (typed word == expected word, output word != expected word): 1258, Bad Failure Percentage: 0.111%
Failures, with auto-correction (F-C): 28013, F-C Percentage: 2.469%
Max Keystrokes: 6072844, Min Keystrokes: 3347332, Keystroke Saving Percentage:44.880%

After:
Total words: 1134665, Success Num: 945026, Success Percentage: 83.287%
Bad Failures, with auto-correction (typed word == expected word, output word != expected word): 1271, Bad Failure Percentage: 0.112%
Failures, with auto-correction (F-C): 27756, F-C Percentage: 2.446%
Max Keystrokes: 6072850, Min Keystrokes: 3290996, Keystroke Saving Percentage:45.808%

Change-Id: I16af52a3e9c371b95fd6f0741f45ee6b2443bea6
2014-11-25 19:07:13 +09:00
Keisuke Kuroyanagi
78212a6d3d Use enum to specify ngram type.
Change-Id: Ie28768ceadcd7a2d940c57eb30be7d4c364e509f
2014-11-25 19:07:10 +09:00
Keisuke Kuroyanagi
62ac89e149 Merge "Implement ArgumentsParser::parseArguments and add tests." 2014-11-25 01:35:30 +00:00
Keisuke Kuroyanagi
1f8d4f47e4 Implement ArgumentsParser::parseArguments and add tests.
Bug: 10059681
Change-Id: I6511a46c879d7a52d0bb4fcab445a66bc40db98c
2014-11-21 12:25:30 +09:00
Jean Chalard
2a3ed8c988 Fix the base character of D with stroke
Bug: 18436480
Change-Id: Ic92cae7c85c07c8f62a5b6e69d2b71e204aff50d
2014-11-19 17:26:00 +09:00
Keisuke Kuroyanagi
fdf92789c1 Merge "Add unit tests for ArgumentsParser.validateSpecs()." 2014-11-17 23:02:52 +00:00
Keisuke Kuroyanagi
681dbc295b Add unit tests for ArgumentsParser.validateSpecs().
Bug: 10059681
Change-Id: I3ba5d856ad679e32dd3360863335c436ad6e7301
2014-11-18 07:52:01 +09:00
Ken Wakasa
fc81196741 Revert "Follow up to https://android-review.googlesource.com/114561"
This reverts commit 64d3f78ee5 per https://android-review.googlesource.com/114664

Change-Id: I2acab828d41e79847db72f8d2677c12173a323b6
2014-11-17 22:18:03 +00:00
Ken Wakasa
64d3f78ee5 Follow up to https://android-review.googlesource.com/114561
Looks like unbundled builds need to use --hash-style=sysv for the
sake of compatibility

Change-Id: Ia7b3a1cc3b2c91a8628551888a74925926dff855
2014-11-17 18:29:13 +09:00
Keisuke Kuroyanagi
79273b0477 Define arguments for commands in dicttoolkit.
Bug: 10059681
Change-Id: I1ceaeeaa9e2055c357fe969818498de9d6288862
2014-11-15 09:58:19 +09:00
Keisuke Kuroyanagi
52582a22d1 Merge "Add OffdeviceIntermediateDictHeader." 2014-11-13 01:59:10 +00:00
Keisuke Kuroyanagi
99754e2d3e Add OffdeviceIntermediateDictHeader.
It is used to hold header information in OffdeviceIntermediateDict.

Bug: 10059681

Change-Id: I966c26e514ddd229cf5597d3b96941234c530863
2014-11-13 01:57:42 +00:00
Keisuke Kuroyanagi
bae0fff04a Merge "Utf8Utils for dicttoolkit." 2014-11-13 01:56:56 +00:00
Keisuke Kuroyanagi
f0c303dd02 Utf8Utils for dicttoolkit.
Bug: 10059681
Change-Id: Ie484ba8096823792f0ac663524d1c02d1be070e9
2014-11-13 10:47:37 +09:00
Keisuke Kuroyanagi
da99cfc29d Merge "Introduce OffdeviceIntermediateDict for dicttoolkit." 2014-11-11 21:21:04 +00:00
Keisuke Kuroyanagi
cd10540973 Introduce OffdeviceIntermediateDict for dicttoolkit.
Bug: 10059681
Change-Id: Ib6e9019502b59dd959c04c8f4996ca932c2b1ba8
2014-11-12 04:08:25 +09:00
Keisuke Kuroyanagi
580420d21b Implement IntArrayView::split for dicttoolkit.
Bug: 10059681
Change-Id: Ic29e79d049bb532727cf5cb1e529fec5d35156ed
2014-11-11 15:06:48 +09:00
Keisuke Kuroyanagi
0c1822df5b Merge "Implement help command for dicttoolkit." 2014-11-10 18:53:19 +00:00
Keisuke Kuroyanagi
b23f03488f Merge "Use reference instead of pointer for WordProperty()." 2014-11-10 18:32:24 +00:00
Keisuke Kuroyanagi
7d5420aa5e Make profiler use getTimeInMicroSec().
Bug: 17797064
Change-Id: Ie992c9454edfc3bf93d5ea367c3a4427b513a205
2014-11-11 01:38:49 +09:00
Keisuke Kuroyanagi
395f6e7020 Implement help command for dicttoolkit.
Bug: 10059681
Change-Id: I0cadf1f80103136cdac5c00b6fca4d81b4bf7384
2014-11-11 00:18:25 +09:00
Keisuke Kuroyanagi
bbf0d4141b Use reference instead of pointer for WordProperty().
Change-Id: Idf03e97661d64186c752e35964d641a5528be5b1
2014-11-10 09:15:11 +09:00
Keisuke Kuroyanagi
bd48963bdf Add CommandExecutor for dicttoolkit.
Bug: 10059681
Change-Id: I90334caaf37c84ce7d1b93d12efbfb5f244a9420
2014-11-09 06:22:28 +09:00
Keisuke Kuroyanagi
4bfa3b273e Introduce CommandUtils for dicttoolkit
Bug: 10059681
Change-Id: Ic6947e76d77dc87bf88dc3a2b749e41fae7553b7
2014-11-08 09:58:26 +09:00
Keisuke Kuroyanagi
2cf5550749 Fix: BoS prediction after inputting just once.
Change-Id: Ib69569ab6b6edfcc8c1d2c621b95de4127789ab6
2014-11-01 17:58:22 +09:00
Keisuke Kuroyanagi
b3bae2e89b Merge "Update v4 format version from 402 to 403." 2014-10-31 14:19:44 +00:00
Keisuke Kuroyanagi
ef931546a0 Merge "Add hacks for better handling count value during migration." 2014-10-31 13:53:57 +00:00
Keisuke Kuroyanagi
a88c9682fc Merge "Change v403 historical info format." 2014-10-31 13:38:38 +00:00
Keisuke Kuroyanagi
3cde19ded1 Merge "Initial commit for native dicttoolkit." 2014-10-31 11:29:20 +00:00
Keisuke Kuroyanagi
e101a53ffc Initial commit for native dicttoolkit.
Bug: 10059681

Change-Id: Ib730af8ebc944e08aaada869c0626724a499747c
2014-10-31 20:27:06 +09:00
Keisuke Kuroyanagi
ea468cc9de Update v4 format version from 402 to 403.
Without personalization:
Total words: 1134774, Success Num: 899230, Success Percentage: 79.243%
Bad Failures, with auto-correction (typed word == expected word, output word != expected word): 1871, Bad Failure Percentage: 0.165%
Failures, with auto-correction (F-C): 29084, F-C Percentage: 2.563%
Max Keystrokes: 6072959, Min Keystrokes: 4436090, Keystroke Saving Percentage:26.953%

Before:
Total words: 1134646, Success Num: 925194, Success Percentage: 81.540%
Bad Failures, with auto-correction (typed word == expected word, output word != expected word): 1316, Bad Failure Percentage: 0.116%
Failures, with auto-correction (F-C): 28288, F-C Percentage: 2.493%
Max Keystrokes: 6072831, Min Keystrokes: 3946188, Keystroke Saving Percentage:35.019%

After:
Total words: 1134659, Success Num: 944746, Success Percentage: 83.263%
Bad Failures, with auto-correction (typed word == expected word, output word != expected word): 1258, Bad Failure Percentage: 0.111%
Failures, with auto-correction (F-C): 28016, F-C Percentage: 2.469%
Max Keystrokes: 6072844, Min Keystrokes: 3387333, Keystroke Saving Percentage:44.222%

Change-Id: I3af42ec37a11847c0429c28616e726f6a339247f
2014-10-31 17:23:39 +09:00
Keisuke Kuroyanagi
c611989929 Add hacks for better handling count value during migration.
Bug: 14425059
Change-Id: Ib050574aa7c4babd4285322a11c3af9be9fbab1e
2014-10-31 17:22:13 +09:00
Keisuke Kuroyanagi
2383575d2d Change v403 historical info format.
count -> 2B, level -> 0B.

Change-Id: I3b241126f56eb33cdf09cb1ebfed04f534e4ec48
2014-10-31 17:22:13 +09:00
Adrian Velicu
009e02ce4a Further fixes to treat 0-frequency words
Previously, when both legitimate 0-frequency words (such as
distracters) and offensive words were encoded in the same
way, distracters would never show up when the user blocked
offensive words (the default setting, as well as the setting
for regression tests).

When b/11031090 was fixed and a separate encoding was used
for offensive words, 0-frequency words would no longer be
blocked when they were an "exact match" (where case
mismatches and accent mismatches would be considered an
"exact match"). The exact match boosting functionality meant
that, for example, when the user typed "mt" they would be
suggested the word "Mt", although they most probably meant
to type "my".

For this reason, we introduced this change, which does the
following:
* Defines the "perfect match" as a really exact match, with
  no room for case or accent mismatches.
* When the target word has probability zero (as "Mt" does,
  because it is a distracter), ONLY boost its score if it is a
  perfect match.

By doing this, when the user types "mt", the word "Mt" will
NOT be boosted, and they will get "my". However, if the user
makes an explicit effort to type "Mt", we do boost the word
"Mt" so that the user's input is not autocorrected to "My".

Bug: 11031090
Change-Id: I92ee1b4e742645d52e2f7f8c4390920481e8fff0
2014-10-31 15:58:50 +09:00
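
The boosting rule described in the commit message above can be sketched as follows. This is an illustration only, not the LatinIME implementation: the `MatchKind` enum, `classifyAscii`, and `shouldBoost` are hypothetical names, the classifier handles ASCII case differences only (accent folding is left out), and boosting every exact match with non-zero probability is an assumption based on the "exact match boosting" behaviour the message mentions.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical types and names for illustration; not the LatinIME API.
enum class MatchKind {
    NONE,     // the words differ beyond case
    EXACT,    // equal once case differences are ignored
    PERFECT,  // identical character for character
};

// Simplified classifier: ASCII case folding only, accents are not handled.
MatchKind classifyAscii(const std::string &typed, const std::string &candidate) {
    if (typed == candidate) return MatchKind::PERFECT;
    if (typed.size() != candidate.size()) return MatchKind::NONE;
    for (std::size_t i = 0; i < typed.size(); ++i) {
        const int a = std::tolower(static_cast<unsigned char>(typed[i]));
        const int b = std::tolower(static_cast<unsigned char>(candidate[i]));
        if (a != b) return MatchKind::NONE;
    }
    return MatchKind::EXACT;
}

// Returns true if the candidate should receive the match score boost.
bool shouldBoost(MatchKind kind, int candidateProbability) {
    if (kind == MatchKind::NONE) return false;
    if (candidateProbability == 0) {
        // Zero-probability words (such as the distracter "Mt") are boosted
        // only when the user typed them exactly.
        return kind == MatchKind::PERFECT;
    }
    // Assumption: words with non-zero probability keep the existing boost.
    return true;
}
```

With this sketch, typing "mt" against the zero-probability candidate "Mt" is classified as an exact match and is not boosted, while typing "Mt" is a perfect match and is boosted.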
Adrian Velicu
10416241f7 Block offensive words in multi-word suggestions
If the user has chosen to block offensive words and types
"aaaxbb", where "aaa" is an offensive word and "bb" is not,
we should not suggest "aaa bb".

Bug: 11031090
Change-Id: Ie23b8dd5d347bc26b1c046c3f5e8dfbc259bf528
2014-10-31 15:58:50 +09:00
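
The multi-word rule above amounts to rejecting a split suggestion when any of its component words is offensive and blocking is enabled. A minimal sketch, assuming the suggestion has already been split into words and that offensive words are available as a set; the function name and parameters are hypothetical and do not reflect the actual LatinIME code.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper for illustration; the names and the word set are
// placeholders, not part of the LatinIME API.
bool isMultiWordSuggestionAllowed(const std::vector<std::string> &words,
        const std::unordered_set<std::string> &offensiveWords,
        bool blockOffensiveWords) {
    if (!blockOffensiveWords) return true;
    for (const std::string &word : words) {
        // One offensive component is enough to reject the whole suggestion.
        if (offensiveWords.count(word) > 0) return false;
    }
    return true;
}
```

For the example in the commit message, the candidate split {"aaa", "bb"} would be rejected when blocking is on, because "aaa" is in the offensive set.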
Adrian Velicu
aa20342d7e Merge "Using "blacklist" flag as "possibly offensive"" 2014-10-31 06:49:29 +00:00
Adrian Velicu
7c87859d4c Using "blacklist" flag as "possibly offensive"
Bug: 11031090
Change-Id: I5cc0d006ab003656498eb82b0875eb9c051d331e
2014-10-31 14:33:05 +09:00
Keisuke Kuroyanagi
0cd1f222fd Fix: native unit test build.
Change-Id: Id2bd4b60d6a4023815a630ebb3059a435b72c193
2014-10-31 12:50:45 +09:00
Keisuke Kuroyanagi
bcb52d73e2 Enable count based dynamic ngram language model for v403.
Bug: 14425059

Change-Id: Icc15e14cfd77d37cd75f75318fd0fa36f9ca7a5b
2014-10-30 23:38:19 +09:00
Keisuke Kuroyanagi
660b00477c Add DynamicLanguageModelProbabilityUtils.
Bug: 14425059
Change-Id: Ia58ab3f0ead02798046d182a9464dcbd95f086bc
2014-10-30 21:33:57 +09:00
Keisuke Kuroyanagi
0a9c3f30b6 Add method to encode probability.
Bug: 14425059
Change-Id: I3e5d359ba5fa38f1669f0e98dfae792ff53efbf8
2014-10-30 12:42:35 +09:00
Keisuke Kuroyanagi
c2ba0ce411 Fix: TRT and ime-simulator build.
Change-Id: I1697a907562d1ed6aff2b001763d1594263ba0d3
2014-10-30 01:01:40 +09:00