114 commits

Author SHA1 Message Date
Jean Chalard
46a1eec4d8 Add a variable-length header region to the binary format.
Also bump up the format version to 2.

Bug: 5686638
Change-Id: I3aafdd7e42c422202122998ec093280051aa8e07
2012-03-06 17:37:28 +09:00
satok
a85f4929cd Support multi words suggestion
result: I4d097612db2f2a93522

Change-Id: Iedbb24f431dac43e52b6dcce8cb610a75e0ca46e
2012-02-08 13:00:31 +09:00
satok
1f6b52e76c Implement multi words suggestions step1
Change-Id: I96e8e1b0d9ccc0ed13d53c40300d8c19bcb7af5b
2012-01-30 18:01:25 +09:00
satok
9955716d0b Merge missing space and mistyped space correction algorithm
Change-Id: Idd64d38d3d29be24748f9c0359667883698a5756
2012-01-27 16:54:15 +09:00
satok
3c09bb18d9 Merge multiple words suggestions algorithm
Change-Id: I70d85b90ddaa28a41e9679f445bc14ef9ff50f16
2012-01-26 18:59:51 +09:00
satok
7409d151a1 Refactor words priority queue
Change-Id: I14b7ef39263ad2b1d5ec087bc80b7b8d7c30abe7
2012-01-26 16:13:25 +09:00
satok
1c03306994 Cleanup unused code
Change-Id: I6c840f9ed170919e48d1c576cd0a48777ad44030
2012-01-26 14:56:52 +09:00
satok
8330b488e9 Do other error correction for the second word of two word correction
result: I4e0b68a12190933f9

Change-Id: I98afce6fe4d5bde97392146d204370ba31a72566
2012-01-25 22:30:37 +09:00
satok
bd6ccdd5f0 Clean up two word correction
Change-Id: I5cd2697d7f61b81aff0c249df01479d86ad0fba5
2012-01-23 15:35:03 +09:00
satok
54af64ae92 Two words error correction with other error correction for the first word
+1      26
-1       5
+2       0
-2       0
+3       0
-3       0
+4       9
-4      25
+5      20
-5      21
+6      13
-6       6
+7      15
-7      26

Change-Id: Iad682d417a6bb42b11ca6e60157698ca66fef3ff
2012-01-19 19:17:29 +09:00
satok
29dc80614b Prepare for advanced two words error correction
Change-Id: I4c8a21f0f6e349ddafd9b402583321a60855cfe8
2012-01-17 16:00:55 +09:00
Jean Chalard
82ddd16889 Stop avoiding adding what the user typed to candidates
There does not seem to be any reason other than a historical
one to avoid doing this, but it takes processing power and
makes things more complicated.

This has a very limited impact on regression tests:
5 -> 3 [He,           the]
5 -> 3 [An,           an]
5 -> 3 [Where,        where]
5 -> 3 [This,         this]
7 -> 1 [wAtch,        watch]
6 -> 4 [oveNs,        oceans]
5 -> 1 [Ahere,        Where]
7 -> 1 [Hast,         Hast]
7 -> 5 [bjp,          bill]
5 -> 1 [What,         What]
5 -> 3 [Sound,        So und]
7 -> 3 [causalities,  casualties]
7 -> 3 [discontentment, discontent]
7 -> 3 [irregardless, regardless]

5 -> 1 : 2
5 -> 3 : 5
6 -> 4 : 1
7 -> 1 : 2
7 -> 3 : 3
7 -> 5 : 1

+1       4
-1       0
+2       0
-2       0
+3       8
-3       0
+4       1
-4       0
+5       1
-5       7
+6       0
-6       1
+7       0
-7       6

Change-Id: I6407cf922f27bbd3992df11d63690e71fc61111b
2012-01-16 18:58:10 +09:00
satok
67e13976b7 Merge "Store suggestions for each input length for missing space algorithm etc."
2012-01-16 00:18:37 -08:00
satok
6ad15fcd15 Store suggestions for each input length for missing space algorithm etc.
Change-Id: Ief8f6ddd29e043744863e5b9be3a51a70987291c
2012-01-16 17:11:17 +09:00
Jean Chalard
4c0eca6e41 Read multi-byte char group counts
Change-Id: Idc62382f1c814e9bd1466c9f7dda1fcc8ba4137d
2012-01-16 15:59:33 +09:00
Jean Chalard
6d4198107b Remove a bunch of obsolete methods.
Change-Id: I218007bf411489d1d648fd9b8b408c5d27c41811
2012-01-16 15:19:47 +09:00
Jean Chalard
512c669fee Fix a native crash with shortcuts
Creation of the TerminalAttributes object failed to take into
account that there may be children on this node.

Change-Id: I8224a1a51532d1a40a8555f46425e3744388326b
2012-01-13 20:50:43 +09:00
satok
9fb6f47a6a New LOG lib
Change-Id: I977e7e10fa58c0a64ca0c3c7b5cb2272446e3efe
2012-01-13 18:04:15 +09:00
Jean Chalard
b0c49b7684 Actually add shortcut targets to the suggestions (A4)
Change-Id: Ia6f551d36b2897863e7faf5143bc319522b0668e
2012-01-06 15:21:11 +09:00
Jean Chalard
cf9dbbdd1a Add methods to read shortcuts from the binary dict (A2)
This contains stubs only, it does not work yet, however it
doesn't break anything.

Change-Id: If912ae84ff3ccd7a2d6588ffd6fbb9974f87ef3d
2012-01-06 12:24:30 +09:00
satok
1a6da631ab Prepare for proximity + two word correction No2
Change-Id: Idfa1413e853299f1db459ef07da3efa932047981
2011-12-19 17:12:20 +09:00
satok
744dab691e Prepare for proximity + two words suggestion
Change-Id: I3637f9bec1f4a3c5953498c4562e1f17a7bf593c
2011-12-16 17:32:53 +09:00
satok
a7e5a5a6b9 Add words priority queue pool
Change-Id: I152df7b876a1756b69ded2ca4fb3ee26b38c971f
2011-12-15 19:20:28 +09:00
satok
4d355989bd Add a functionality to limit the max correction errors
Before
==== test finished, terminate logcat =====
(0)  121.97 (0.28%)
(1)  42032.07 (95.46%)
(2)  11.03 (0.03%)
(3)  12.19 (0.03%)
(4)  10.02 (0.02%)
(5)  1417.41 (3.22%)
(6)  258.43 (0.59%)
(20) 50.20 (0.11%)
Total 44033.07 (sum of others 43913.32)

After
==== test finished, terminate logcat =====
(0)  110.81 (0.29%)
(1)  36416.11 (94.47%)
(2)  10.06 (0.03%)
(3)  9.45 (0.02%)
(4)  9.83 (0.03%)
(5)  1535.52 (3.98%)
(6)  290.25 (0.75%)
(20) 40.57 (0.11%)
Total 38546.83 (sum of others 38422.60)

Change-Id: Iffd24ce0b2dc422c8c6085d5be5f6bfdaf59ca7d
2011-12-15 16:00:08 +09:00
satok
d03317c4be Prune traversing a bit aggressively and add a flag not to do auto completion
+1       1
-1       2
+2       0
-2       0
+3       0
-3       0
+4       6
-4       1
+5       4
-5       3
+6       3
-6      10
+7       7
-7       5

Before:
Total 42936.28 (sum of others 42814.63)

After:
Total 40860.56 (sum of others 40733.92)

Change-Id: I6a3d52f31ec181970083358280c3ebaca0a1f63e
2011-12-15 12:09:25 +09:00
satok
1147c7bac9 Unbundle members in unigram_dictionary
Change-Id: Id737d943d20e3de3db568162caf40d3e956c7fae
2011-12-14 19:45:51 +09:00
satok
16379df633 Use priority queue for native string buffer
+1 2
-6 2

Performance

before

==== test finished, terminate logcat =====
(0)  100.34 (0.26%)
(1)  37149.26 (95.30%)
(2)  8.43 (0.02%)
(3)  11.18 (0.03%)
(4)  9.92 (0.03%)
(5)  1330.60 (3.41%)
(6)  250.46 (0.64%)
(20) 12.41 (0.03%)
Total 38982.50 (sum of others 38872.59)

after

==== test finished, terminate logcat =====
(0)  97.65 (0.26%)
(1)  35427.43 (95.32%)
(2)  10.30 (0.03%)
(3)  8.95 (0.02%)
(4)  11.01 (0.03%)
(5)  1224.67 (3.30%)
(6)  243.76 (0.66%)
(20) 40.91 (0.11%)
Total 37167.04 (sum of others 37064.68)

Change-Id: Id4d3b88a9cdef765affc52973aeac951ecc6a8ca
2011-12-13 16:32:52 +09:00
Tadashi G. Takaoka
6e3cb27cff Reorganize char_utils.h and basechars.h
* make BASE_CHARS[] const
* add several inline methods for ASCII character handling

Change-Id: I49664f219af88faf0aef43ac350cfc216570b185
2011-11-11 19:44:08 +09:00
Tadashi G. Takaoka
d862b93578 Cleanup unused function
Change-Id: Ic0895e1973b3879b2a63f7e0b888e9a0480be6f3
2011-10-27 19:58:46 +09:00
satok
eb050fc2dc Demote words with a capitalized char
Bug: 5371514

+1       4
-1       2
+2       0
-2       0
+3       0
-3       0
+4       1
-4       3
+5       0
-5      12
+6       3
-6       3
+7      12
-7       0

Change-Id: I6b46e43f9059f1e8a1cc02a626ea6eb8f1f9924f
2011-10-03 20:11:06 +09:00
Yusuke Nojima
da9f556a15 Merge "Classify touches into three types."
2011-09-30 01:26:15 -07:00
Yusuke Nojima
258bfe66e0 Classify touches into three types.
Change-Id: I7c1d42835e0c15d596a1b66d421b0aa514ec0890
2011-09-30 17:22:22 +09:00
satok
40a5f6fa4d Add a flag to demote completed suggestions
Bug: 5390063
Change-Id: I0ef4fbcc705539624269fd2f8c4e782679fc44b3
2011-09-29 19:48:18 +09:00
Yusuke Nojima
032cfeef5a Delete unused function and add TODO comment for a potential bug.
Change-Id: I7b16de1bd6b278c51d56eb1904e186c3db3b7f3d
2011-09-14 16:09:24 +09:00
satok
10266c09ec Combine the skipped and transposed correction
bug: 4170136

Change-Id: I7b50b40478abf27f51ec5e001815ff4882f3e5e5
2011-08-23 23:40:29 +09:00
satok
9db2097f7b Do the transposed correction and the excessive correction by one loop
Change-Id: Idc7a3451a65f7b980e5c499e9083f67646b3a199
2011-08-19 17:10:10 +09:00
satok
0cedd2bcc3 Combine normal correction and skip correction
Change-Id: Ide868d977c0f35900340c7be1b71d572c69a8806
2011-08-15 17:13:39 +09:00
satok
f3948c1eac Calculate the skip correction by one loop
Change-Id: Ie70829407cd58be2ffe75c7d649d86f62ee4df24
2011-08-11 17:18:23 +09:00
satok
208268d149 Add correction state.
Change-Id: I0a1419922e1ce7a15b566d1b6da3794f8e84c754
2011-08-10 19:10:26 +09:00
satok
cfca3c6317 Refactor CorrectionState to Correction
Change-Id: I5f1ce35413731f930b43b1c82014e65d9eaa240b
2011-08-10 14:40:25 +09:00
satok
8876b75ca1 Move scoring part to the correction state
Change-Id: I2dc4a0869636fce5526f48b3a6267b6bdf61dbfb
2011-08-05 17:24:56 +09:00
satok
f071e75b78 Change the prune condition
Change-Id: I92aef12e0e1d89cfe1b346ddc6ef4df158ffe0b3
2011-08-04 18:32:37 +09:00
satok
4e4e74e6b6 Move the input index and output index to correction state
Change-Id: Idebdb59143f3367929df6a0475cefe941eb16d01
2011-08-04 14:16:14 +09:00
satok
0f6c8e8aeb Move code related to ranking algorithm to correction_state.cpp
Change-Id: I52b34de45969fef82e46d9c10079c2d45e0b94eb
2011-08-03 20:34:19 +09:00
satok
612c6e49c0 Move code related to ranking algorithm to the correction state
Change-Id: I2d9e2db81cf6597ca4e88d7bc6737ab3b52b34b2
2011-08-02 15:44:59 +09:00
satok
db2c0919cf Remove old dictionary format code
Change-Id: Ic4b9e069c9bd5c088769519f44d0a9ea45acb833
2011-08-01 16:01:54 +09:00
satok
2df3060883 Add correction state
Change-Id: I0d281cede1590893bd1def005cf83c9431d12750
2011-08-01 15:42:09 +09:00
Jean Chalard
6a0e9642a8 Small native refactoring.
Move a purely dictionary-format-related function that is needed
both by unigrams and bigrams to the binary format handling
file.
Also remove the empty UnigramDictionary::getBigrams placeholder
function, on grounds that it should be in the BigramDictionary
class.

Bug: 5046459
Change-Id: I8a67a25f72122e2fa0b19ae1d936db25eb0b20ba
2011-07-26 16:13:53 +09:00
Jean Chalard
848b69a5f9 Some refactoring
Getting the frequency of a terminal is not very useful, however
getting its position will be very useful for retrieving bigrams
later.
Moreover, from the position it's easy to find out the frequency.

Bug: 5046459
Change-Id: Ica53472c2038c7e407dbd1399d336511c731087f
2011-07-26 15:44:51 +09:00
Jean Chalard
999ba61b34 Some native cleanup
Take a function that does not need to be a member and make it
static inline.
Also replace the return value of -1 by a #define'd constant.

Change-Id: I92e0deaa1df65998b76aba6329a4c8eb4d287485
2011-07-22 18:09:48 +09:00
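
Note on 999ba61b34 ("Some native cleanup"): the message describes two changes, turning a function that needs no member state into a static inline one, and replacing a bare -1 return value with a #define'd constant. The log does not show the code, so the following is only a minimal sketch of that kind of cleanup. The names NOT_FOUND_INDEX and findCharIndex are hypothetical and are not taken from the commit.

```cpp
#include <cassert>

// Hypothetical constant name: the commit only says that a return value of -1
// was replaced by a #define'd constant, not what the constant is called.
#define NOT_FOUND_INDEX (-1)

// Hypothetical helper. It reads no object state, so it does not need to be a
// member function; as a static inline free function it can live in a header
// and be inlined at each call site.
static inline int findCharIndex(const unsigned short *chars, const int length,
        const unsigned short target) {
    assert(chars != nullptr || length == 0);
    for (int i = 0; i < length; ++i) {
        if (chars[i] == target) return i;
    }
    // Named constant instead of a magic -1.
    return NOT_FOUND_INDEX;
}
```

A caller would then compare the result against NOT_FOUND_INDEX rather than against a literal -1, which keeps the "no match" convention in one place.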