Keisuke Kuroyanagi
b23f03488f
Merge "Use reference instead of pointer for WordProperty()."
2014-11-10 18:32:24 +00:00
Keisuke Kuroyanagi
7d5420aa5e
Make profiler use getTimeInMicroSec().
...
Bug: 17797064
Change-Id: Ie992c9454edfc3bf93d5ea367c3a4427b513a205
2014-11-11 01:38:49 +09:00
Keisuke Kuroyanagi
bbf0d4141b
Use reference instead of pointer for WordProperty().
...
Change-Id: Idf03e97661d64186c752e35964d641a5528be5b1
2014-11-10 09:15:11 +09:00
Keisuke Kuroyanagi
a88c9682fc
Merge "Change v403 historical info format."
2014-10-31 13:38:38 +00:00
Keisuke Kuroyanagi
2383575d2d
Change v403 historical info format.
...
count -> 2B, level -> 0B.
Change-Id: I3b241126f56eb33cdf09cb1ebfed04f534e4ec48
2014-10-31 17:22:13 +09:00
Adrian Velicu
009e02ce4a
Further fixes to treat 0-frequency words
...
Previously, when both legitimate 0-frequency words (such as
distracters) and offensive words were encoded in the same
way, distracters would never show up when the user blocked
offensive words (the default setting, as well as the setting
for regression tests).
When b/11031090 was fixed and a separate encoding was used
for offensive words, 0-frequency words would no longer be
blocked when they were an "exact match" (where case
mismatches and accent mismatches would be considered an
"exact match"). The exact match boosting functionality meant
that, for example, when the user typed "mt", the word "Mt"
would be suggested, although they most probably meant to
type "my".
For this reason, we introduced this change, which does the
following:
* Defines the "perfect match" as a really exact match, with
no room for case or accent mismatches
* When the target word has probability zero (as "Mt" does,
because it is a distracter), ONLY boost its score if it is a
perfect match.
By doing this, when the user types "mt", the word "Mt" will
NOT be boosted, and they will get "my". However, if the user
makes an explicit effort to type "Mt", we do boost the word
"Mt" so that the user's input is not autocorrected to "My".
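The rule described above can be sketched as follows. This is a minimal illustration, not the actual LatinIME code: the names MatchType, classifyMatch and shouldBoost are hypothetical, and the case folding handles ASCII only (accent folding is left out).

```cpp
#include <cassert>
#include <string>

// Hypothetical classification of how closely the typed input matches a word.
enum class MatchType { NONE, EXACT, PERFECT };

// ASCII-only lowercasing; a real implementation would also fold accents.
static std::string toLowerAscii(const std::string &s) {
    std::string out(s);
    for (char &c : out) {
        if (c >= 'A' && c <= 'Z') {
            c = static_cast<char>(c - 'A' + 'a');
        }
    }
    return out;
}

// PERFECT: identical strings. EXACT: identical once case is ignored.
static MatchType classifyMatch(const std::string &typed, const std::string &candidate) {
    if (typed == candidate) {
        return MatchType::PERFECT;
    }
    if (toLowerAscii(typed) == toLowerAscii(candidate)) {
        return MatchType::EXACT;
    }
    return MatchType::NONE;
}

// Zero-probability words (e.g. distracters) are boosted only on a perfect
// match; all other words are boosted on any exact match.
static bool shouldBoost(const std::string &typed, const std::string &candidate,
        const int probability) {
    assert(probability >= 0);
    const MatchType type = classifyMatch(typed, candidate);
    if (probability == 0) {
        return type == MatchType::PERFECT;
    }
    return type != MatchType::NONE;
}
```

Under these assumptions, typing "mt" does not boost the zero-probability word "Mt", while typing "Mt" does.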
Bug: 11031090
Change-Id: I92ee1b4e742645d52e2f7f8c4390920481e8fff0
2014-10-31 15:58:50 +09:00
Adrian Velicu
10416241f7
Block offensive words in multi-word suggestions
...
If the user has chosen to block offensive words and types
"aaaxbb", where "aaa" is an offensive word and "bb" is not,
we should not suggest "aaa bb".
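A rough sketch of the check, assuming a multi-word suggestion is available as a list of words that each carry a "possibly offensive" flag. The Word struct and shouldSuggest function are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical view of one word inside a multi-word suggestion.
struct Word {
    std::string text;
    bool isPossiblyOffensive;
};

// Returns false if the user blocks offensive words and any component word
// of the suggestion is flagged as possibly offensive.
static bool shouldSuggest(const std::vector<Word> &words, const bool blockOffensiveWords) {
    assert(!words.empty());
    if (!blockOffensiveWords) {
        return true;
    }
    for (const Word &word : words) {
        if (word.isPossiblyOffensive) {
            return false;
        }
    }
    return true;
}
```

With blocking enabled, a candidate such as "aaa bb" is rejected as soon as "aaa" is found to be flagged, regardless of the other words.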
Bug: 11031090
Change-Id: Ie23b8dd5d347bc26b1c046c3f5e8dfbc259bf528
2014-10-31 15:58:50 +09:00
Keisuke Kuroyanagi
bcb52d73e2
Enable count based dynamic ngram language model for v403.
...
Bug: 14425059
Change-Id: Icc15e14cfd77d37cd75f75318fd0fa36f9ca7a5b
2014-10-30 23:38:19 +09:00
Keisuke Kuroyanagi
8a809f3433
Improve space substitution error correction.
...
Bug: 17432052
[Category diff]
+1 262
-1 93
+2 2
-2 18
+3 18
-3 2
+4 111
-4 148
+5 295
-5 217
+6 51
-6 276
+7 139
-7 124
[Weighted category diff]
+1 276
-1 100
+2 4
-2 20
+3 20
-3 4
+4 118
-4 160
+5 309
-5 225
+6 52
-6 298
+7 163
-7 135
show diff for ./en_user_log_phones_2011_08.csv
+1 173
-1 28
+2 2
-2 17
+3 17
-3 2
+4 63
-4 82
+5 120
-5 51
+6 24
-6 220
+7 88
-7 87
Change-Id: I9d673acb0ff632828ae2e0ead56e76e3a20411c6
2014-10-28 17:11:14 +09:00
Keisuke Kuroyanagi
090c3819d7
Fix: Personalized dicts suggest invalid words with v403.
...
Bug: 14425059
Change-Id: I45ae00069dd3b7c461dd9a1f3558b96af0a1c975
2014-10-23 19:26:01 +09:00
Keisuke Kuroyanagi
b5ef884fbb
Support dumping ngram entries.
...
Bug: 14425059
Change-Id: Ib03a0c3d166ed6f1e60c67127b28006d55143b6b
2014-10-22 18:15:53 +09:00
Keisuke Kuroyanagi
dfc82fa366
Merge changes I210acb81,Ie9508788
...
* changes:
Make NgramProperty have NgramContext.
Create .cpp file for NgramContext.
2014-10-21 10:28:25 +00:00
Keisuke Kuroyanagi
88bb28c132
Make NgramProperty have NgramContext.
...
Bug: 14425059
Change-Id: I210acb816b122857dbbe1ee4dd6a35c5335bf2bf
2014-10-21 17:12:32 +09:00
Keisuke Kuroyanagi
f87bb77a91
Create .cpp file for NgramContext.
...
Bug: 14425059
Change-Id: Ie950878817b9c80cc9c970e1a84880c9b9ab228a
2014-10-21 17:04:56 +09:00
Adrian Velicu
05172bf1a5
Renaming "blacklist" flag to "possibly offensive"
...
No behaviour changes.
Unified the overloaded FusionDictionary::add method to always take an
isPossiblyOffensive argument.
Bug: 11031090
Change-Id: I5741a023ca1ce842d2cf10d4f6c926b0efabaa78
2014-10-21 11:51:47 +09:00
Keisuke Kuroyanagi
d8ccb9093b
Quit using weightChildNode for ADDITIONAL_PROXIMITY and SUBSTITUTION.
...
[Category diff]
+1 0
-1 1
+2 0
-2 0
+3 0
-3 0
+4 1
-4 1
+5 8
-5 7
+6 0
-6 1
+7 1
-7 0
[Weighted category diff]
+1 0
-1 1
+2 0
-2 0
+3 0
-3 0
+4 1
-4 1
+5 8
-5 7
+6 0
-6 1
+7 1
-7 0
Bug: 13756409
Change-Id: I6ac3567545676bbefbee3e87dda54bc083c15fb6
2014-10-14 20:20:55 +09:00
Jean Chalard
7d5e1cb265
[ML23] Introduce a different accuracy/performance tradeoff
...
Bug: 11230254
Change-Id: Ic09518c818ae7b68942b1c63160dd462e5922cb5
2014-10-10 18:02:52 +09:00
Keisuke Kuroyanagi
620ebde704
Make members of classes that are used with std::vector const
...
Change-Id: Id93fb87f5630230fc3f9cd339e12f3b0e2006ea9
2014-10-09 21:28:40 +09:00
Keisuke Kuroyanagi
45783013bf
Rename prev_words_info.h to ngram_context.h
...
Bug: 14425059
Change-Id: I0e906631ecad2361a8198b3f9e3394bb22c5bf83
2014-10-09 21:28:19 +09:00
Keisuke Kuroyanagi
72e2383d11
Rename PrevWordsInfo to NgramContext.
...
Bug: 14425059
Change-Id: I30703fc80e9450d4e2dbfec965e7f9f4468f6a11
2014-10-09 17:34:32 +09:00
Keisuke Kuroyanagi
ab4437f468
Rename updateCounter to updateEntriesForWordWithNgramContext.
...
Bug: 14425059
Change-Id: Id9b0dd7e32c711ed4292981517c3febd5fe9e897
2014-10-09 17:34:29 +09:00
Keisuke Kuroyanagi
36c4eaadfb
Show prediction results in debug build.
...
Bug: 16547409
Change-Id: If85418583998cd639c794bf5d5cfbbb972c34f72
2014-10-06 18:36:54 +09:00
Jean Chalard
4ef27c0358
[ML13] Fix the locale passing in ProximityInfo
...
The locale is used to determine additional proximity characters. This
is dependent on the dictionary language, but was passed as a function
of the layout, which is wrong and would have given bad suggestions in
multi-lingual mode.
Ideally, additional proximity characters should be inserted in the
dictionary header, but for now it's a rather simple change to get
it from the dictionary's locale instead of the proximity info locale.
Also, that allows us to remove completely the locale parameter from
proximity info, which is a much needed change.
This change has zero effect on unit tests and on regression tests.
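The idea can be sketched as a lookup keyed by the dictionary's locale rather than by a locale stored in the proximity info. Everything here is an assumption made for illustration: the Dictionary struct, the AdditionalProximityTable type and getAdditionalProximityChars are hypothetical and do not reflect the real data layout.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical table: locale -> (typed character -> additional proximity characters).
using AdditionalProximityTable = std::map<std::string, std::map<char, std::vector<char>>>;

// Hypothetical dictionary handle; only the locale matters for this sketch.
struct Dictionary {
    std::string locale;
};

// Looks up additional proximity characters using the dictionary's locale,
// so each dictionary in multi-lingual mode gets its own set.
static std::vector<char> getAdditionalProximityChars(const AdditionalProximityTable &table,
        const Dictionary &dictionary, const char typedChar) {
    const auto localeIt = table.find(dictionary.locale);
    if (localeIt == table.end()) {
        return {};
    }
    const auto charIt = localeIt->second.find(typedChar);
    if (charIt == localeIt->second.end()) {
        return {};
    }
    return charIt->second;
}
```

Because the locale comes from the dictionary argument, two dictionaries sharing one keyboard layout can return different character sets for the same key.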
Bug: 11230254
Change-Id: If95157155db7dccd1f00b8ba55ccb3600283f9e4
2014-10-03 18:16:34 +09:00
Keisuke Kuroyanagi
29777e3a8a
Implement updateCounter() by using existing entry adding methods.
...
Bug: 14425059
Change-Id: I0b6cb80e1fb8f738e9c7d9e80fbc0c479546b879
2014-10-01 19:59:39 +09:00
Keisuke Kuroyanagi
287e155e44
Move HistoricalInfo to property and use it in *Property.
...
Bug: 14425059
Change-Id: Icccccabad98fb543c6a6be2844cfc0086d80b739
2014-10-01 11:39:33 +09:00
Keisuke Kuroyanagi
79bb37d499
Rename BigramProperty to NgramProperty.
...
Remaining work is changing bigram to ngram for supporting
ngram entry counting, dumping, and migration.
Bug: 14425059
Change-Id: Ifba288a1166996d62a5e57698f63537ea0a2a8ee
2014-09-29 19:10:39 +09:00
Keisuke Kuroyanagi
2842e50c4b
Use std::move for dictionary properties.
...
Change-Id: I15056b36b7493f4bac1dbcbb46a0b44343ede153
2014-09-25 11:36:52 +09:00
Keisuke Kuroyanagi
80d139a694
Use CodePointArrayView in WordProperty.
...
Change-Id: I45a9755c413003831788d190beb499fee8ce63aa
2014-09-24 14:15:36 +09:00
Keisuke Kuroyanagi
65a7ccfa00
Refactoring method to get code points and code point count.
...
Bug: 14425059
Change-Id: I4731bd6076d34556e46e6714180fed324fb6aba3
2014-09-24 14:15:36 +09:00
Keisuke Kuroyanagi
7d911d6f91
Move word flags to language model dict content.
...
Bug: 14425059
Change-Id: I64712e5c83d0bc241e6f0f16117ab47b5d75bd4b
2014-09-24 14:15:34 +09:00
Jean Chalard
6da9b21191
[ML8] Add a language weight
...
...and rename an improperly named normalization value
Bug: 11230254
Change-Id: I0f5633148a9f66dbfd7d28540b8a8985131c4549
2014-09-19 13:44:42 +09:00
Keisuke Kuroyanagi
fc7d0540fe
Use CodePointArrayView in DictionaryUtils.
...
Change-Id: I9ae308e60124ea5acb4ee09847c4fdd58ff168e2
2014-09-17 20:13:36 +09:00
Keisuke Kuroyanagi
3e75c59133
Use CodePointArrayView in Dictionary.
...
Change-Id: I63fa0a8348f6de6ec7a424a8033e936b4af72beb
2014-09-17 20:13:36 +09:00
Keisuke Kuroyanagi
d2230525bc
Have mPrevWordCount in DicNodeProperties.
...
Bug: 14425059
Change-Id: I5ce22bace4ec08d0da4e5c167288a742c4426c33
2014-09-16 12:46:16 +09:00
Keisuke Kuroyanagi
c43b6664fa
Use passed previous word count in PrevWordsInfo.
...
Bug: 14425059
Change-Id: I04007bdacf0176a05be7a27ef1c20c5b851d8bed
2014-09-14 17:29:38 +09:00
Keisuke Kuroyanagi
537f6eea8a
Use WordIdArrayView for prevWordIds.
...
Bug: 14425059
Change-Id: Ia84fb997d89564e60111b46ca83bbfa3b187f316
2014-09-11 19:36:22 +09:00
Keisuke Kuroyanagi
d53aea5af9
Remove unigram probability from dicNode.
...
Bug: 14425059
Change-Id: Ie848e8568bb4dbb1d8358e823a881d9157a1aad3
2014-09-10 21:21:25 +09:00
Keisuke Kuroyanagi
c32356c229
Quit using dicNode.getUnigramProbability().
...
Bug: 14425059
Change-Id: I192070cc11e5d46c8413ebc19982d6a8c93577fc
2014-09-10 21:21:25 +09:00
Keisuke Kuroyanagi
521e2382da
Use CodePointArrayView to create children DicNodes.
...
Change-Id: Ie940b6595f3f3f804fbb8dd03c710ea062b75af3
2014-09-10 21:21:23 +09:00
Keisuke Kuroyanagi
87a5c76906
Use WordAttributes for checking flags.
...
Bug: 14425059
Change-Id: Idee84478a482a0e7b5cc53e5dbd4e2484584ba79
2014-09-10 19:51:57 +09:00
Keisuke Kuroyanagi
2111e3abc9
Introduce WordAttributes to get word probability and flags.
...
Bug: 14425059
Change-Id: Iee11d038e0893d7ddd6c52447907f8c55fecb6a5
2014-09-10 19:51:48 +09:00
Keisuke Kuroyanagi
11a48f92a5
Use getProbabilityOfWordInContext for prediction.
...
Bug: 14425059
Change-Id: I9d5c905a0adda3503c593bfbf0bb9af8d1686f5d
2014-09-10 19:51:14 +09:00
Keisuke Kuroyanagi
9f8da0f833
Use MultiBigramMap in structure policy.
...
Bug: 14425059
Change-Id: I4d78da4839ef177e0223e6e5bcf0ebd7315c3099
2014-09-09 17:53:44 +09:00
Keisuke Kuroyanagi
9c42ad47d4
Rename probability to unigramProbability.
...
Bug: 14425059
Change-Id: I6a204c3b8fb257d037ad95a1a455ae6fb89068fd
2014-09-09 14:09:01 +09:00
Keisuke Kuroyanagi
d028294890
Remove mHasChildrenPtNodes from DicNodeProperties.
...
Bug: 14425059
Change-Id: I3a9511e7f7c3a722f9942f525530f04def5965da
2014-09-09 14:08:41 +09:00
Keisuke Kuroyanagi
9ff6fee838
Remove DicNode.getPtNodePos().
...
Bug: 14425059
Change-Id: If6e291d23e68342792febb85f8a576ce785b3845
2014-09-05 17:27:01 +09:00
Keisuke Kuroyanagi
94e4cd25a8
Use word id to get code points of the word.
...
Bug: 14425059
Change-Id: I81accffcdf5abe447c33ffc3a8e8315f9a4cde7f
2014-09-03 18:55:31 +09:00
Keisuke Kuroyanagi
ac983b13a9
Use word id to get shortcut iterator.
...
Bug: 14425059
Change-Id: I1b35a139bd29f70b328cbc82648783b99f633d72
2014-09-03 18:33:10 +09:00
Keisuke Kuroyanagi
847a026cd8
Make dictionary structure policy return shortcut iterator.
...
Bug: 14425059
Change-Id: I0da22c41f818673430c285103af340397aaba9fb
2014-09-03 18:20:14 +09:00
Keisuke Kuroyanagi
89a003b12b
Use word id for methods related to n-grams.
...
Bug: 14425059
Change-Id: I81e5d3793527776d3c9faa5594005ddbd4a71354
2014-09-03 16:32:43 +09:00