Commit Graph

978 Commits (450c8e8935c35b93ab7c8183931313cb9acfb3a6)

Author SHA1 Message Date
Keisuke Kuroyanagi 3844f74aff Fix: deleted PtNode handling in v403.
If a word is once deleted, the word never gets into the
personalized dictionaries due to this bug.

Change-Id: Ife4e3fe1ba0615b4135e6291d2151b0db7d3f940
2014-10-27 15:32:05 +09:00
Keisuke Kuroyanagi 090c3819d7 Fix: Personalized dicts suggest invalid words with v403.
Bug: 14425059
Change-Id: I45ae00069dd3b7c461dd9a1f3558b96af0a1c975
2014-10-23 19:26:01 +09:00
Keisuke Kuroyanagi 16cc3992d7 Use trigrams for personalization dict.
Bug: 14425059
Change-Id: I73cf6904e569d60996a3b079f16ea6df0cb90f02
2014-10-23 14:32:45 +09:00
Keisuke Kuroyanagi b5ef884fbb Support dumping ngram entries.
Bug: 14425059
Change-Id: Ib03a0c3d166ed6f1e60c67127b28006d55143b6b
2014-10-22 18:15:53 +09:00
Keisuke Kuroyanagi c9865785f4 Support ngram entry migration.
Bug: 14425059
Change-Id: I98cb9fa303af2d93a0a3512e8732231c564e3c5d
2014-10-22 11:31:16 +09:00
Keisuke Kuroyanagi 0b8bb0c21b Fix debug build.
Change-Id: Id94636714d04a8828718b87741c0ee62a14cb3b4
2014-10-21 20:20:11 +09:00
Keisuke Kuroyanagi dfc82fa366 Merge changes I210acb81,Ie9508788
* changes:
  Make NgramProperty have NgramContext.
  Create .cpp file for NgramContext.
2014-10-21 10:28:25 +00:00
Keisuke Kuroyanagi 88bb28c132 Make NgramProperty have NgramContext.
Bug: 14425059
Change-Id: I210acb816b122857dbbe1ee4dd6a35c5335bf2bf
2014-10-21 17:12:32 +09:00
Keisuke Kuroyanagi f87bb77a91 Create .cpp file for NgramContext.
Bug: 14425059

Change-Id: Ie950878817b9c80cc9c970e1a84880c9b9ab228a
2014-10-21 17:04:56 +09:00
Keisuke Kuroyanagi fa1e65cb3a Merge "Use EntryCounters during GC." 2014-10-21 07:55:04 +00:00
Adrian Velicu c51b9b5b3f Merge "Renaming "blacklist" flag to "possibly offensive"" 2014-10-21 07:39:18 +00:00
Keisuke Kuroyanagi 47fc656cd7 Use EntryCounters during GC.
Bug: 14425059
Change-Id: I61eb798686dc753fb6c0fe99a0719c1732198f30
2014-10-21 16:36:03 +09:00
Keisuke Kuroyanagi e8750d970e Introduce EntryCounters to count entries in a dictionary.
Bug: 14425059

Change-Id: Ic13ba827d96fa4a147485ba92fdb37e23e04e8e8
2014-10-21 15:46:14 +09:00
Adrian Velicu 05172bf1a5 Renaming "blacklist" flag to "possibly offensive"
No behaviour changes.
Unified the overloaded FusionDictionary::add method to always take an
isPossiblyOffensive argument.

Bug: 11031090
Change-Id: I5741a023ca1ce842d2cf10d4f6c926b0efabaa78
2014-10-21 11:51:47 +09:00
Keisuke Kuroyanagi 1085fef8d0 Change entry count limit.
Unigram 10K, Bigram 30K, Trigram 30K.

Change-Id: Ibd19c6a2b618499df1c70000bad7b47498187f0a
2014-10-20 15:01:49 +09:00
Keisuke Kuroyanagi f4928ad4dd Merge "Update useless n-gram entry detection logic during GC." 2014-10-15 21:44:45 +00:00
Keisuke Kuroyanagi 3601c214f8 Update useless n-gram entry detection logic during GC.
Bug: 14425059
Change-Id: Ib939deae5b60167751dee07965bb1ef1a43c4625
2014-10-15 20:43:27 +09:00
Keisuke Kuroyanagi 183e21c36c Merge "Use better conditional probability for ngram entries." 2014-10-15 09:27:21 +00:00
Keisuke Kuroyanagi 72d17d9209 Use better conditional probability for ngram entries.
Old:
P(W | W_prev) = f(W, W_prev) + C
New:
P(W | W_prev) = f(W, W_prev) / f(W_prev)

Bug: 14425059
Bug: 16547409

Change-Id: I4d13be6de2c6bad6bad7fb22320a23ba4ecd361c
2014-10-15 18:23:00 +09:00
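
The "New" formula in the commit above is a plain relative-frequency estimate. The following is a minimal sketch of that formula only, assuming f denotes raw occurrence counts; the function name, the Counter-based storage, and the toy data are hypothetical and are not taken from the LatinIME sources.

```python
from collections import Counter


def conditional_probability(bigram_counts, unigram_counts, prev_word, word):
    """Return f(word, prev_word) / f(prev_word), or 0.0 if prev_word is unseen."""
    prev_count = unigram_counts[prev_word]
    if prev_count == 0:
        # No evidence for the context word, so no estimate can be made.
        return 0.0
    return bigram_counts[(prev_word, word)] / prev_count


# Hypothetical toy history: "good" seen 4 times, followed by "morning" 3 times.
unigram_counts = Counter({"good": 4, "morning": 3, "night": 1})
bigram_counts = Counter({("good", "morning"): 3, ("good", "night"): 1})

print(conditional_probability(bigram_counts, unigram_counts, "good", "morning"))  # 0.75
```

Dividing by f(W_prev) means the estimates for all words seen after a given context sum to at most 1, which the additive "Old" form does not guarantee.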
Keisuke Kuroyanagi c2429c54ac Merge "Move entry updating method to language model dict content." 2014-10-15 04:51:04 +00:00
Keisuke Kuroyanagi 5400701908 Move entry updating method to language model dict content.
Bug: 14425059
Change-Id: I710055490d141539458cbf968adf5a7ccffd9552
2014-10-15 12:29:31 +09:00
Keisuke Kuroyanagi d8ccb9093b Quit using weightChildNode for ADDITIONAL_PROXIMITY and SUBSTITUTION.
[Category diff]
+1       0
-1       1
+2       0
-2       0
+3       0
-3       0
+4       1
-4       1
+5       8
-5       7
+6       0
-6       1
+7       1
-7       0

[Weighted category diff]
+1       0
-1       1
+2       0
-2       0
+3       0
-3       0
+4       1
-4       1
+5       8
-5       7
+6       0
-6       1
+7       1
-7       0

Bug: 13756409
Change-Id: I6ac3567545676bbefbee3e87dda54bc083c15fb6
2014-10-14 20:20:55 +09:00
Keisuke Kuroyanagi d70b8ff291 Fix: BoS bigram from user history dictionary is too strong.
They can be always stronger than BoS predictions from the
contextual dictionary.

Bug: 17961731
Change-Id: I70297d82436c10c790bdfad6f3dfefdb4bb2f852
2014-10-13 08:52:08 +00:00
Jean Chalard 7d5e1cb265 [ML23] Introduce a different accuracy/performance tradeoff
Bug: 11230254
Change-Id: Ic09518c818ae7b68942b1c63160dd462e5922cb5
2014-10-10 18:02:52 +09:00
Keisuke Kuroyanagi 229f354fdc Merge "Make members of classes that are used with std::vector const" 2014-10-10 05:39:57 +00:00
Keisuke Kuroyanagi b559c65e7e Merge "Rename prev_words_info.h to ngram_context.h" 2014-10-09 12:47:24 +00:00
Keisuke Kuroyanagi 10fa30e380 Merge "Rename PrevWordsInfo to NgramContext." 2014-10-09 12:47:17 +00:00
Keisuke Kuroyanagi 620ebde704 Make members of classes that are used with std::vector const
Change-Id: Id93fb87f5630230fc3f9cd339e12f3b0e2006ea9
2014-10-09 21:28:40 +09:00
Keisuke Kuroyanagi 45783013bf Rename prev_words_info.h to ngram_context.h
Bug: 14425059
Change-Id: I0e906631ecad2361a8198b3f9e3394bb22c5bf83
2014-10-09 21:28:19 +09:00
Adrian Velicu 44efbe64b1 Fixing misspelled word
Change-Id: I51d77e271143d40256b39e5c60a3065d9fdf63fb
2014-10-09 19:26:54 +09:00
Keisuke Kuroyanagi 72e2383d11 Rename PrevWordsInfo to NgramContext.
Bug: 14425059
Change-Id: I30703fc80e9450d4e2dbfec965e7f9f4468f6a11
2014-10-09 17:34:32 +09:00
Keisuke Kuroyanagi ab4437f468 Rename updateCounter to updateEntriesForWordWithNgramContext.
Bug: 14425059
Change-Id: Id9b0dd7e32c711ed4292981517c3febd5fe9e897
2014-10-09 17:34:29 +09:00
Keisuke Kuroyanagi 948ef10d03 Merge "Improve bigram probability computation for decaying dicts." 2014-10-06 13:06:29 +00:00
Keisuke Kuroyanagi aae1a062eb Improve bigram probability computation for decaying dicts.
Without personalization:
Total words: 1079345, Success Num: 819749, Success Percentage: 75.949%
Bad Failures, with auto-correction (typed word == expected word, output word != expected word): 1754, Bad Failure Percentage: 0.163%
Failures, with auto-correction (F-C): 28463, F-C Percentage: 2.637%
Max Keystrokes: 6074285, Min Keystrokes: 4649326, Keystroke Saving Percentage:23.459%

With current probability computing logic:
Total words: 1079382, Success Num: 838329, Success Percentage: 77.667%
Bad Failures, with auto-correction (typed word == expected word, output word != expected word): 1332, Bad Failure Percentage: 0.123%
Failures, with auto-correction (F-C): 28558, F-C Percentage: 2.646%
Max Keystrokes: 6074503, Min Keystrokes: 4474102, Keystroke Saving Percentage:26.346%
Remove isof files.

With new probability computing logic:
Total words: 1079356, Success Num: 844954, Success Percentage: 78.283%
Bad Failures, with auto-correction (typed word == expected word, output word != expected word): 1306, Bad Failure Percentage: 0.121%
Failures, with auto-correction (F-C): 27214, F-C Percentage: 2.521%
Max Keystrokes: 6074477, Min Keystrokes: 4243021, Keystroke Saving Percentage:30.150%
Remove isof files.

Bug: 16547409
Change-Id: I3d2a49c7aaa2c0f6835c52ef72d22466ee225789
2014-10-06 22:03:11 +09:00
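
The keystroke-saving percentages reported in the commit above are consistent with (Max Keystrokes - Min Keystrokes) / Max Keystrokes. The sketch below recomputes them from the quoted figures; the formula is inferred from the numbers, not from the evaluation tool itself, and the function and variable names are hypothetical.

```python
def keystroke_saving_percentage(max_keystrokes, min_keystrokes):
    """Share of keystrokes saved relative to typing every character, in percent."""
    return 100.0 * (max_keystrokes - min_keystrokes) / max_keystrokes


# (Max Keystrokes, Min Keystrokes) as quoted in the commit message.
runs = {
    "without personalization": (6074285, 4649326),
    "current logic": (6074503, 4474102),
    "new logic": (6074477, 4243021),
}

for name, (max_k, min_k) in runs.items():
    print(f"{name}: {keystroke_saving_percentage(max_k, min_k):.3f}%")
# without personalization: 23.459%
# current logic: 26.346%
# new logic: 30.150%
```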
Keisuke Kuroyanagi 552470c882 Merge "Make sure to suppress BoS prediction until input twice." 2014-10-06 10:46:31 +00:00
Keisuke Kuroyanagi c7d199e770 Merge "Fix: BoS prediction is shown after inputting just once." 2014-10-06 10:38:18 +00:00
Keisuke Kuroyanagi 1c2f00f6b5 Make sure to suppress BoS prediction until input twice.
Change-Id: I98d91f264d5d1d3f5bcda1fd9ec885779ba2f746
2014-10-06 19:36:04 +09:00
Keisuke Kuroyanagi ca17ed7d9d Fix: BoS prediction is shown after inputting just once.
Change-Id: Ibba209f47cb5b1a4b08281689d607711b8dcfad4
2014-10-06 19:28:39 +09:00
Keisuke Kuroyanagi 36c4eaadfb Show prediction results in debug build.
Bug: 16547409
Change-Id: If85418583998cd639c794bf5d5cfbbb972c34f72
2014-10-06 18:36:54 +09:00
Keisuke Kuroyanagi 16e1615301 Fix: ProbabilityDictContent can be wrongly large.
It can be twice as large as it should be (80KB larger).

Change-Id: If94f748f8c48a442b3c95ac989099aaed2aa2f86
2014-10-06 11:55:07 +09:00
Jean Chalard 4ef27c0358 [ML13] Fix the locale passing in ProximityInfo
The locale is used to determine additional proximity characters. This
is dependent on the dictionary language, but was passed as a function
of the layout, which is wrong and would have given bad suggestions in
multi-lingual mode.

Ideally, additional proximity characters should be inserted in the
dictionary header, but for now it's a rather simple change to get
it from the dictionary's locale instead of the proximity info locale.

Also, that allows us to remove completely the locale parameter from
proximity info, which is a much needed change.

This change has zero effect on unit tests and on regression tests.

Bug: 11230254
Change-Id: If95157155db7dccd1f00b8ba55ccb3600283f9e4
2014-10-03 18:16:34 +09:00
Keisuke Kuroyanagi 29777e3a8a Implement updateCounter() by using existing entry adding methods.
Bug: 14425059
Change-Id: I0b6cb80e1fb8f738e9c7d9e80fbc0c479546b879
2014-10-01 19:59:39 +09:00
Keisuke Kuroyanagi 287e155e44 Move HistoricalInfo to property and use it in *Property.
Bug: 14425059
Change-Id: Icccccabad98fb543c6a6be2844cfc0086d80b739
2014-10-01 11:39:33 +09:00
Keisuke Kuroyanagi 79bb37d499 Rename BigramProperty to NgramProperty.
Remaining work is changing bigram to ngram for supporting
ngram entry counting, dumping, and migration.

Bug: 14425059
Change-Id: Ifba288a1166996d62a5e57698f63537ea0a2a8ee
2014-09-29 19:10:39 +09:00
Keisuke Kuroyanagi cb4f544198 Quit reading unigram probability in Ver4PatriciaTrieNodeReader.
Bug: 14425059
Change-Id: I4fc7b0e236151a2c64e7131772264024c6597633
2014-09-25 11:41:50 +09:00
Keisuke Kuroyanagi 2842e50c4b Use std::move for dictionary properties.
Change-Id: I15056b36b7493f4bac1dbcbb46a0b44343ede153
2014-09-25 11:36:52 +09:00
Keisuke Kuroyanagi 80d139a694 Use CodePointArrayView in WordProperty.
Change-Id: I45a9755c413003831788d190beb499fee8ce63aa
2014-09-24 14:15:36 +09:00
Keisuke Kuroyanagi 65a7ccfa00 Refactoring method to get code points and code point count.
Bug: 14425059
Change-Id: I4731bd6076d34556e46e6714180fed324fb6aba3
2014-09-24 14:15:36 +09:00
Keisuke Kuroyanagi 7d911d6f91 Move word flags to language model dict content.
Bug: 14425059
Change-Id: I64712e5c83d0bc241e6f0f16117ab47b5d75bd4b
2014-09-24 14:15:34 +09:00
Keisuke Kuroyanagi ddfaeff544 Prepare supporting n-gram for user history dictionary.
Bug: 17097992
Change-Id: Ic8bfde3d4cc0e720bf7681e08e16fb2ad94d5670
2014-09-22 18:18:50 +09:00