Keisuke Kuroyanagi
78212a6d3d
Use enum to specify ngram type.
...
Change-Id: Ie28768ceadcd7a2d940c57eb30be7d4c364e509f
2014-11-25 19:07:10 +09:00
Keisuke Kuroyanagi
2383575d2d
Change v403 historical info format.
...
count -> 2B, level -> 0B.
Change-Id: I3b241126f56eb33cdf09cb1ebfed04f534e4ec48
2014-10-31 17:22:13 +09:00
Keisuke Kuroyanagi
bcb52d73e2
Enable count based dynamic ngram language model for v403.
...
Bug: 14425059
Change-Id: Icc15e14cfd77d37cd75f75318fd0fa36f9ca7a5b
2014-10-30 23:38:19 +09:00
Keisuke Kuroyanagi
660b00477c
Add DynamicLanguageModelProbabilityUtils.
...
Bug: 14425059
Change-Id: Ia58ab3f0ead02798046d182a9464dcbd95f086bc
2014-10-30 21:33:57 +09:00
Keisuke Kuroyanagi
c2ba0ce411
Fix: TRT and ime-simulator build.
...
Change-Id: I1697a907562d1ed6aff2b001763d1594263ba0d3
2014-10-30 01:01:40 +09:00
Keisuke Kuroyanagi
6b0561f9d2
Add a class to have global counters for LanguageModelDictContent.
...
Bug: 14425059
Change-Id: I08ec19903432356b6028853fd73b4eefce20218e
2014-10-29 21:05:41 +09:00
Keisuke Kuroyanagi
b5ef884fbb
Support dumping ngram entries.
...
Bug: 14425059
Change-Id: Ib03a0c3d166ed6f1e60c67127b28006d55143b6b
2014-10-22 18:15:53 +09:00
Keisuke Kuroyanagi
c9865785f4
Support ngram entry migration.
...
Bug: 14425059
Change-Id: I98cb9fa303af2d93a0a3512e8732231c564e3c5d
2014-10-22 11:31:16 +09:00
Keisuke Kuroyanagi
47fc656cd7
Use EntryCounters during GC.
...
Bug: 14425059
Change-Id: I61eb798686dc753fb6c0fe99a0719c1732198f30
2014-10-21 16:36:03 +09:00
Keisuke Kuroyanagi
e8750d970e
Introduce EntryCounters to count entries in a dictionary.
...
Bug: 14425059
Change-Id: Ic13ba827d96fa4a147485ba92fdb37e23e04e8e8
2014-10-21 15:46:14 +09:00
Keisuke Kuroyanagi
3601c214f8
Update useless n-gram entry detection logic during GC.
...
Bug: 14425059
Change-Id: Ib939deae5b60167751dee07965bb1ef1a43c4625
2014-10-15 20:43:27 +09:00
Keisuke Kuroyanagi
72d17d9209
Use better conditional probability for ngram entries.
...
Old:
P(W | W_prev) = f(W, W_prev) + C
New:
P(W | W_prev) = f(W, W_prev) / f(W_prev)
Bug: 14425059
Bug: 16547409
Change-Id: I4d13be6de2c6bad6bad7fb22320a23ba4ecd361c
2014-10-15 18:23:00 +09:00
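Note on 72d17d9209: the new estimate P(W | W_prev) = f(W, W_prev) / f(W_prev) is a relative frequency, so it lies in [0, 1] whenever the pair count does not exceed the context count. Below is a minimal sketch of that ratio, not the code from the commit. The class `CountBasedBigramModel` and its methods are hypothetical names introduced only for illustration. It also assumes that f(W_prev) means the number of times W_prev was observed as a context, and that an unseen context yields probability 0; the commit message states neither.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical illustration only; this is not the LanguageModelDictContent API.
class CountBasedBigramModel {
 public:
    // Records one observation of `word` immediately following `prevWord`.
    void addObservation(const std::string &prevWord, const std::string &word) {
        ++mContextCounts[prevWord];     // f(W_prev), assumed to be the context count
        ++mPairCounts[prevWord][word];  // f(W, W_prev)
    }

    // Returns f(W, W_prev) / f(W_prev).
    // Assumption: an unseen context or an unseen pair yields 0.
    double getConditionalProbability(const std::string &prevWord,
            const std::string &word) const {
        const auto contextIt = mContextCounts.find(prevWord);
        if (contextIt == mContextCounts.end() || contextIt->second == 0) {
            return 0.0;
        }
        const auto pairsIt = mPairCounts.find(prevWord);
        if (pairsIt == mPairCounts.end()) {
            return 0.0;
        }
        const auto pairIt = pairsIt->second.find(word);
        if (pairIt == pairsIt->second.end()) {
            return 0.0;
        }
        return static_cast<double>(pairIt->second)
                / static_cast<double>(contextIt->second);
    }

 private:
    std::unordered_map<std::string, std::uint32_t> mContextCounts;
    std::unordered_map<std::string,
            std::unordered_map<std::string, std::uint32_t>> mPairCounts;
};
```

With this sketch, the probabilities of all words recorded after one context sum to 1, which the old additive form f(W, W_prev) + C does not guarantee.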
Keisuke Kuroyanagi
5400701908
Move entry updating method to language model dict content.
...
Bug: 14425059
Change-Id: I710055490d141539458cbf968adf5a7ccffd9552
2014-10-15 12:29:31 +09:00
Keisuke Kuroyanagi
287e155e44
Move HistoricalInfo to property and use it in *Property.
...
Bug: 14425059
Change-Id: Icccccabad98fb543c6a6be2844cfc0086d80b739
2014-10-01 11:39:33 +09:00
Keisuke Kuroyanagi
79bb37d499
Rename BigramProperty to NgramProperty.
...
Remaining work is changing bigram to ngram for supporting
ngram entry counting, dumping, and migration.
Bug: 14425059
Change-Id: Ifba288a1166996d62a5e57698f63537ea0a2a8ee
2014-09-29 19:10:39 +09:00
Keisuke Kuroyanagi
cb4f544198
Quit reading unigram probability in Ver4PatriciaTrieNodeReader.
...
Bug: 14425059
Change-Id: I4fc7b0e236151a2c64e7131772264024c6597633
2014-09-25 11:41:50 +09:00
Keisuke Kuroyanagi
7d911d6f91
Move word flags to language model dict content.
...
Bug: 14425059
Change-Id: I64712e5c83d0bc241e6f0f16117ab47b5d75bd4b
2014-09-24 14:15:34 +09:00
Keisuke Kuroyanagi
09c154925f
Add firstOrDefault and lastOrDefault to IntArrayView.
...
Change-Id: I854c02eff3fa0b53c72a5f1cabce001f4854ada0
2014-09-17 21:16:31 +09:00
Keisuke Kuroyanagi
4926b90ec5
Support n-gram for look-up.
...
Bug: 14425059
Change-Id: I19523c29fb802cd65158c7540d1608e7f55c4ca7
2014-09-17 16:20:00 +09:00
Keisuke Kuroyanagi
36ba139ca6
Support decaying dict in getWordProbability().
...
Bug: 14425059
Change-Id: I24db3f9131c2999fc388035dc365c7faaef3bdb1
2014-09-14 17:29:50 +09:00
Keisuke Kuroyanagi
395fe8e98d
Implement LanguageModelDictContent.getWordProbability().
...
Bug: 14425059
Change-Id: I290a05cee6f341caa25fb222892505529cef1eb7
2014-09-10 19:51:12 +09:00
Keisuke Kuroyanagi
93e3b5a16f
Add TerminalPositionLookupTableTest.
...
Change-Id: I4a3ab4c94a7759d7f24c7edc9c167fe6bbdd3eb7
2014-08-29 14:16:15 +09:00
Keisuke Kuroyanagi
fe395232d6
Remove bigram dict content.
...
Bug: 14425059
Change-Id: I75918c6761a50832da511088eb83becd56b23662
2014-08-27 20:05:59 +09:00
Keisuke Kuroyanagi
758d093644
Get entry count after truncation using LanguageModelDictContent.
...
Bug: 14425059
Change-Id: I41b237c1c22c21740946d52e3be9d6f963c9cd54
2014-08-27 20:04:39 +09:00
Keisuke Kuroyanagi
07b3b41c25
Add a method to iterate entries in LanguageModelDictContent.
...
Bug: 14425059
Change-Id: I4e9c3a97891c020f762fa709f806d333c067f496
2014-08-26 12:01:08 +09:00
Keisuke Kuroyanagi
063f86d40f
Truncate entries in language model dict content.
...
Bug: 14425059
Change-Id: I023c1d5109a2c43fcea3bb11a0fd7198c82891ba
2014-08-22 20:13:04 +09:00
Keisuke Kuroyanagi
9aa6699107
Update probabilities in language model dict content for GC.
...
Bug: 14425059
Change-Id: I354408afd8e5c1955ff0acea3d0243d628fe3843
2014-08-22 20:07:54 +09:00
Keisuke Kuroyanagi
ace03d7919
Merge "Add BoS flag in probability entry."
2014-08-16 04:15:21 +00:00
Keisuke Kuroyanagi
623067a183
Add BoS flag in probability entry.
...
Bug: 14425059
Change-Id: I50439630034ada0280c44cbbb308aa0b95b72048
2014-08-19 11:49:05 +09:00
Keisuke Kuroyanagi
bfcd5efd50
Merge "Use byte array view in ver4 dict contents."
2014-08-16 04:15:21 +00:00
Keisuke Kuroyanagi
1f6e52ef02
Use byte array view in ver4 dict contents.
...
Change-Id: Icf79a51a200f7ccd775264d1a83dd61e7dcfbab2
2014-08-18 22:46:10 +09:00
Keisuke Kuroyanagi
d3097c67ca
Remove entry from language model dict content.
...
Bug: 14425059
Change-Id: Iea51c0ae908d499da19839de06222a1c4d19088e
2014-08-18 12:34:50 +09:00
Keisuke Kuroyanagi
b4531d861e
Add method to remove entry from language model dict content.
...
Bug: 14425059
Change-Id: Id21af0110e770caa3e95cb5d7ba8b3d1af8e0b12
2014-08-18 12:34:48 +09:00
Keisuke Kuroyanagi
9a23f0fba2
Add bigrams to language model content.
...
Bug: 14425059
Change-Id: Id81e3775ea0104750a23e3dca62c00681ed8dc2e
2014-08-12 20:32:42 +09:00
Keisuke Kuroyanagi
03dc44f543
Add/Get n-gram probability entry in LanguageModelDictContent.
...
Bug: 14425059
Change-Id: I7926c3812f89b9a71fe1873a5bc32f793f91b640
2014-08-06 00:42:56 +00:00
Keisuke Kuroyanagi
851e0458fe
Remove ProbabilityDictContent and use LanguageModelDictContent
...
Bug: 14425059
Change-Id: I1bb9e78ecb24139b87c99be6722e37eec0a2285d
2014-08-05 14:13:07 +09:00
Keisuke Kuroyanagi
0889484266
Add methods for unigrams to LanguageModelDictContent.
...
Bug: 14425059
Change-Id: I0a6b480a3d4735787ffac68c47b4ffefc3f1b8a5
2014-08-05 12:38:55 +09:00
Keisuke Kuroyanagi
c4696b2eb6
Save language model in the body buffer.
...
Bug: 14425059
Change-Id: Iaec277f7bed03d6c6780c6ce90fbe5fe799e175e
2014-08-01 20:19:16 +09:00
Keisuke Kuroyanagi
c0c674cdc0
Make MmappedBuffer use byte array view.
...
Bug: 16691311
Change-Id: I2122c01ee27c33e11dec52643925c069927bea2b
2014-08-01 19:26:01 +09:00
Keisuke Kuroyanagi
dc3856d758
Add LanguageModelDictContent.
...
This class will replace BigramDictContent and
ProbabilityDictContent.
Bug: 14425059
Change-Id: I3d15c833957e27b2f5999386db042188272bbb4b
2014-08-01 12:45:00 +09:00
Keisuke Kuroyanagi
90b7c1729f
Remove DictContent.
...
Bug: 14425059
Change-Id: I74fa4b6ba4605447c1c87427371e4be5eb8e7ae6
2014-08-01 12:06:21 +09:00
Keisuke Kuroyanagi
b22f95ec8a
Remove isUpdatable from constructors of dict contents.
...
Change-Id: I2d54f477d9b341e944e265786a734f23d152bb81
2014-07-11 15:23:55 +09:00
Keisuke Kuroyanagi
2ac934296c
Concatenate dict buffers other than header to a single file.
...
Bug: 13664080
Change-Id: I34c9d8046b339c9b855be378a5fad907382d1359
2014-07-11 15:15:47 +09:00
Keisuke Kuroyanagi
804f7450fc
Use linked list for bigram list.
...
BinaryDictionaryTests for VERSION4_DEV:
Before
Time: 36.461
After
Time: 33.031
Bug: 14425059
Change-Id: I9ca2714f450f61f713df6ebd34c953dece991cdb
2014-07-07 21:09:25 +09:00
Keisuke Kuroyanagi
f9ce867d80
Add boundary check for v4 bigram reading.
...
Bug: 14496386
Change-Id: Iedd3445c3222a777a2476beed7d9eb53773f406c
2014-05-27 19:29:35 +09:00
Keisuke Kuroyanagi
64341927d2
Quit using bigram probability diff for ver4 dict.
...
Change-Id: I2cfcfbcf351877d1dff466a24974dbb05908f14e
2014-05-15 16:02:58 +09:00
Keisuke Kuroyanagi
ad518d9a5b
Avoid copying bigram list if possible.
...
Constructing en_US main dict using dicttool:
Before:
real 1m8.699s
user 1m10.600s
sys 0m2.390s
After:
real 0m17.204s
user 0m20.560s
sys 0m0.720s
Bug: 13406708
Change-Id: I3b0476be57e5cb93c6497025b3ffa7064ac326c6
2014-05-08 14:19:33 +09:00
Keisuke Kuroyanagi
8d8fb396a0
Add new bigram entry at the tail of existing list.
...
Bug: 13406708
Change-Id: If3162e65fc9aa2c47f046aee528276cb51fad9f4
2014-05-01 19:29:43 +09:00
Ken Wakasa
8ca9be17db
s/hash_map_compat/unordered_map/
...
Change-Id: Icce5f9a12b04bdd7540c52750d303a585d71f28a
2014-04-11 18:07:59 +09:00
Keisuke Kuroyanagi
4ce480d5ce
Use unique_ptr.
...
Change-Id: Id92a5b07da4f7f95e2cd293ce8dc1a5f979b7853
2014-03-07 14:31:54 +09:00