67 commits

Author SHA1 Message Date
satok
29dc80614b Prepare for advanced two words error correction
Change-Id: I4c8a21f0f6e349ddafd9b402583321a60855cfe8
2012-01-17 16:00:55 +09:00
satok
6ad15fcd15 Store suggestions for each input length for missing space algorithm etc.
Change-Id: Ief8f6ddd29e043744863e5b9be3a51a70987291c
2012-01-16 17:11:17 +09:00
Ken Wakasa
ecbf3f2cbc Merge "Fix indentations." 2012-01-05 19:29:22 -08:00
Ken Wakasa
e12e9b5b69 Fix indentations.
Change-Id: I25c26e2fe50427d11d97b6204174a4f651963d24
2012-01-06 12:24:38 +09:00
Jean Chalard
cf9dbbdd1a Add methods to read shortcuts from the binary dict (A2)
This contains stubs only, it does not work yet, however it
doesn't break anything.

Change-Id: If912ae84ff3ccd7a2d6588ffd6fbb9974f87ef3d
2012-01-06 12:24:30 +09:00
Jean Chalard
c8c6585f21 Add a forgotten constant
This fixes the build.
A constant was used before it was declared in another file.

Change-Id: I72dfca2f76f0c3b7dd64072d062cd48c9bfcbd56
2011-12-27 12:58:49 +09:00
satok
1a6da631ab Prepare for proximity + two word correction No2
Change-Id: Idfa1413e853299f1db459ef07da3efa932047981
2011-12-19 17:12:20 +09:00
satok
744dab691e Prepare for proximity + two words suggestion
Change-Id: I3637f9bec1f4a3c5953498c4562e1f17a7bf593c
2011-12-16 17:32:53 +09:00
satok
a7e5a5a6b9 Add words priority queue pool
Change-Id: I152df7b876a1756b69ded2ca4fb3ee26b38c971f
2011-12-15 19:20:28 +09:00
satok
4d355989bd Add functionality to limit the max correction errors
Before
==== test finished, terminate logcat =====
(0)  121.97 (0.28%)
(1)  42032.07 (95.46%)
(2)  11.03 (0.03%)
(3)  12.19 (0.03%)
(4)  10.02 (0.02%)
(5)  1417.41 (3.22%)
(6)  258.43 (0.59%)
(20) 50.20 (0.11%)
Total 44033.07 (sum of others 43913.32)

After
==== test finished, terminate logcat =====
(0)  110.81 (0.29%)
(1)  36416.11 (94.47%)
(2)  10.06 (0.03%)
(3)  9.45 (0.02%)
(4)  9.83 (0.03%)
(5)  1535.52 (3.98%)
(6)  290.25 (0.75%)
(20) 40.57 (0.11%)
Total 38546.83 (sum of others 38422.60)

Change-Id: Iffd24ce0b2dc422c8c6085d5be5f6bfdaf59ca7d
2011-12-15 16:00:08 +09:00
satok
1147c7bac9 Unbundle members in unigram_dictionary
Change-Id: Id737d943d20e3de3db568162caf40d3e956c7fae
2011-12-14 19:45:51 +09:00
satok
16379df633 Use priority queue for native string buffer
+1 2
-6 2

Performance

before

==== test finished, terminate logcat =====
(0)  100.34 (0.26%)
(1)  37149.26 (95.30%)
(2)  8.43 (0.02%)
(3)  11.18 (0.03%)
(4)  9.92 (0.03%)
(5)  1330.60 (3.41%)
(6)  250.46 (0.64%)
(20) 12.41 (0.03%)
Total 38982.50 (sum of others 38872.59)

after

==== test finished, terminate logcat =====
(0)  97.65 (0.26%)
(1)  35427.43 (95.32%)
(2)  10.30 (0.03%)
(3)  8.95 (0.02%)
(4)  11.01 (0.03%)
(5)  1224.67 (3.30%)
(6)  243.76 (0.66%)
(20) 40.91 (0.11%)
Total 37167.04 (sum of others 37064.68)

Change-Id: Id4d3b88a9cdef765affc52973aeac951ecc6a8ca
2011-12-13 16:32:52 +09:00
Tadashi G. Takaoka
0e97148f6d Remove NULL from native/src
Change-Id: I5299af7773d28fd12faebbfe644829a401ae5644
2011-10-28 17:02:09 +09:00
satok
40a5f6fa4d Add a flag to demote completed suggestions
Bug: 5390063
Change-Id: I0ef4fbcc705539624269fd2f8c4e782679fc44b3
2011-09-29 19:48:18 +09:00
satok
10266c09ec Combine the skipped and transposed correction
bug: 4170136

Change-Id: I7b50b40478abf27f51ec5e001815ff4882f3e5e5
2011-08-23 23:40:29 +09:00
satok
208268d149 Add correction state.
Change-Id: I0a1419922e1ce7a15b566d1b6da3794f8e84c754
2011-08-10 19:10:26 +09:00
satok
cfca3c6317 Refactor CorrectionState to Correction
Change-Id: I5f1ce35413731f930b43b1c82014e65d9eaa240b
2011-08-10 14:40:25 +09:00
satok
8876b75ca1 Move scoring part to the correction state
Change-Id: I2dc4a0869636fce5526f48b3a6267b6bdf61dbfb
2011-08-05 17:24:56 +09:00
satok
4e4e74e6b6 Move the input index and output index to correction state
Change-Id: Idebdb59143f3367929df6a0475cefe941eb16d01
2011-08-04 14:16:14 +09:00
satok
0f6c8e8aeb Move code related to ranking algorithm to correction_state.cpp
Change-Id: I52b34de45969fef82e46d9c10079c2d45e0b94eb
2011-08-03 20:34:19 +09:00
satok
612c6e49c0 Move code related to ranking algorithm to the correction state
Change-Id: I2d9e2db81cf6597ca4e88d7bc6737ab3b52b34b2
2011-08-02 15:44:59 +09:00
satok
db2c0919cf Remove old dictionary format code
Change-Id: Ic4b9e069c9bd5c088769519f44d0a9ea45acb833
2011-08-01 16:01:54 +09:00
satok
2df3060883 Add correction state
Change-Id: I0d281cede1590893bd1def005cf83c9431d12750
2011-08-01 15:42:09 +09:00
Jean Chalard
6a0e9642a8 Small native refactoring.
Move a purely dictionary-format-related function that is needed
both by unigrams and bigrams to the binary format handling
file.
Also remove the empty UnigramDictionary::getBigrams placeholder
function, on grounds that it should be in the BigramDictionary
class.

Bug: 5046459
Change-Id: I8a67a25f72122e2fa0b19ae1d936db25eb0b20ba
2011-07-26 16:13:53 +09:00
Jean Chalard
999ba61b34 Some native cleanup
Take a function that does not need to be a member and make it
static inline.
Also replace the return value of -1 by a #define'd constant.

Change-Id: I92e0deaa1df65998b76aba6329a4c8eb4d287485
2011-07-22 18:09:48 +09:00
satok
d24df43eaf (Step 2) Move functions related to proximity to proximity_info.cpp
Change-Id: Iae0eb2a5cd758bda820fa42b4bc3eb3d2665bf96
2011-07-14 15:47:32 +09:00
satok
1d7eaf8462 (Step 1) Move proximity related parameters from unigram_dictionary to proximity_info
Change-Id: Ic630b35f4abffeb84c38bcf5935795b7ff07556a
2011-07-14 13:21:34 +09:00
Jean Chalard
1059f27364 New dict format, step 7
This actually implements the new dictionary format, but does not
activate the implementation through #defines.

Bug: 4392433
Change-Id: I9b26b9bcb4b823a36e0984799b69730acfc6f7f3
2011-07-13 14:33:48 +09:00
Jean Chalard
bb15e77511 Move a function to make next commit more readable
Change-Id: Ieaa935ff4d68ce88137dcc5c672a4149a4c9c64f
2011-06-30 20:14:38 +09:00
Jean Chalard
0584f02ee1 Rename parameters for future change
Change-Id: Id15a17340fb26f91c72687f30bef24b2d8b94940
2011-06-30 19:23:16 +09:00
Jean Chalard
432789ac93 Internal cleanup
Moving functions around, renaming parameters

Change-Id: I3ab480f483d7d9700b9328cb07b16b51005098e5
2011-06-30 17:50:48 +09:00
Jean Chalard
ffefdb6c1a Cleanup.
Function renaming, moving around for future patch readability

Change-Id: Id33b961cf2e899b5a3c9189951d2199aba801666
2011-06-30 17:22:19 +09:00
Jean Chalard
980d6b6fef Internal cleanup.
Function renaming, removal of useless functions, comment fixes

Change-Id: I148acbaf367cd556a85b89016676b46cc971af81
2011-06-30 17:02:23 +09:00
Jean Chalard
594a9a1963 Internal cleanup.
Removed unused function prototypes.

Change-Id: Ia56ea8e285deed17ce8377df855b045b7850d58d
2011-06-30 16:51:17 +09:00
Ken Wakasa
ce9e52a12a Clean up in LatinIME native code
Change-Id: I0062200a0181a491690115ac0fab8d11358e2f14
2011-06-18 23:52:09 +09:00
Jean Chalard
ca5ef2890e New dict format, step 4
Consolidate terminal cases, streamline the word adding process
and create the entrances for adding alternate spellings with an
empty implementation.

Bug: 4392433
Change-Id: I781c93ec49945d71c7c20624c86596aa49add4c8
2011-06-17 20:59:21 +09:00
Jean Chalard
581335c3fb Fix a bug where bigram search would never return
Bug: 4690487
Change-Id: Ie8f3f651508cc48bbb043a0b308f7e0d1524371c
2011-06-17 12:45:17 +09:00
Jean Chalard
17e44a72e8 New dict format, step 3
Some refactoring, and addition of a parameter that will be necessary.

Bug: 4392433
Change-Id: I17f001a7efd4f69f4c35f94ee1ca8e97391b81d5
2011-06-16 23:28:09 +09:00
Jean Chalard
8124e64dcc New dict format, step 2
Move some methods around and make some methods static

Bug: 4392433
Change-Id: I2bbe98aec118a416d21d1e293638e1d324505b9b
2011-06-16 22:33:41 +09:00
Jean Chalard
293ece0f34 New dict format, step 1
This renames some variables and removes dependencies on values that
will disappear

Bug: 4392433
Change-Id: I79a44462d6bf25248cc2de0d63d7918fc6925d68
2011-06-16 22:18:10 +09:00
satok
d8db9f86d0 Fix a bug in the frequency calculation for the mistyped space error correction
Bug: 4402942

Change-Id: I0b611e3d0e8c25ca528ef7408c3949200e5cad85
2011-05-18 18:36:54 +09:00
satok
3c4bb7747d A bug fix for the mistyped space algorithm
Bug: 3311719

-- also fixed compiler warnings

Change-Id: I6941c0d02f10d67af88bc943748dde8d8783fabb
2011-03-04 23:25:48 -08:00
Jean Chalard
eaecb56f94 Merge "Demote skipped characters matched words with respect to length." into honeycomb-mr1 2011-03-04 22:43:16 -08:00
satok
817e517e46 Add the suggestion algorithm of words with space proximity
Bug: 3311719

Change-Id: Ide12a4a6280103c092fa0f563dd5b9e3f7f5c89b
2011-03-04 20:37:18 -08:00
Jean Chalard
07a8406bc1 Demote skipped characters matched words with respect to length.
Words that matched user input with skipped characters used to be demoted
in BinaryDictionary by a constant factor and not at all in those dictionaries
implemented in java code. To represent the fact that the impact of a skipped
character gets larger as the word is shorter, this change will implement a
demotion that gets larger as the typed word is shorter. The demotion rate
is (n - 2) / (n - 1) where n is the length of the typed word for n >= 2.
It implements it for both BinaryDictionary and java dictionaries.

Bug: 3340731
Change-Id: I3a18be80a9708981d56a950dc25fe08f018b5b89
2011-03-05 13:20:19 +09:00
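
The demotion rate described in commit 07a8406bc1 can be sketched as follows. This is only an illustration of the (n - 2) / (n - 1) formula from the message, not the code from the change itself: the function name, the integer frequency type, and the choice to leave words shorter than 2 characters untouched are all assumptions.

```cpp
// Hypothetical helper illustrating the demotion rate (n - 2) / (n - 1),
// where n is the length of the typed word. Name and signature are
// assumptions made for this sketch, not taken from the source tree.
inline int demoteForSkippedChar(int freq, int typedLength) {
    // The commit message defines the rate only for n >= 2. Returning the
    // frequency unchanged for shorter input is an assumption of this sketch.
    if (typedLength < 2) {
        return freq;
    }
    // Multiply before dividing so integer division does not lose precision;
    // widen to long long so the product cannot overflow int.
    const long long scaled =
            static_cast<long long>(freq) * (typedLength - 2) / (typedLength - 1);
    return static_cast<int>(scaled);
}
```

With a frequency of 100, a 3-character typed word is demoted to 50 and a 5-character one to 75, so the penalty shrinks as the typed word gets longer, which is the behaviour the message describes.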
Jean Chalard
a787dba83b Fix a bug with umlaut processing.
Issue: 3275926
Change-Id: Ibcb00aaea3ff05ad59ad4e8e54dd3caab5ab9bca
2011-03-04 13:07:07 +09:00
Jean Chalard
c2bbc6a449 Use translation of fallback umlaut digraphs for German.
For German: handle "ae", "oe" and "ue" as alternate forms of the
umlaut-bearing versions of "a", "o" and "u".

Issue: 3275926

Change-Id: I056c707cdacc464ceab63be56c016c7f8439196c
2011-03-03 11:52:23 +09:00
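
The digraph handling described in commit c2bbc6a449 can be illustrated with a small sketch. It is deliberately simplified: it rewrites every "ae", "oe" and "ue" into the umlaut form, whereas a dictionary lookup would presumably have to consider both readings. The function names and the use of std::u32string are assumptions for illustration only.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the umlaut code point for the digraph
// (first, second), or 0 when the pair is not one of "ae", "oe", "ue".
inline char32_t umlautForDigraph(char32_t first, char32_t second) {
    if (second != U'e') {
        return 0;
    }
    switch (first) {
        case U'a': return 0x00E4;  // a with diaeresis
        case U'o': return 0x00F6;  // o with diaeresis
        case U'u': return 0x00FC;  // u with diaeresis
        default:   return 0;
    }
}

// Hypothetical helper: replaces every fallback digraph in `typed` with the
// corresponding umlaut character, scanning left to right.
inline std::u32string replaceDigraphs(const std::u32string &typed) {
    std::u32string out;
    for (std::size_t i = 0; i < typed.size(); ++i) {
        const char32_t umlaut =
                (i + 1 < typed.size()) ? umlautForDigraph(typed[i], typed[i + 1]) : 0;
        if (umlaut != 0) {
            out.push_back(umlaut);
            ++i;  // skip the 'e' that was consumed as part of the digraph
        } else {
            out.push_back(typed[i]);
        }
    }
    return out;
}
```

For example, the input "mueller" is rewritten to "müller", while "haus" is returned unchanged because it contains no digraph.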
satok
8fbd552292 Add proximity info to native
Bug: 3311719

Change-Id: Ie596304070e321ad23fb67a13bf05e2b6af1b54b
2011-02-23 23:04:00 +09:00
Jean Chalard
f5f834afcd Rename variables with obscure names.
The `snr' variable has a very obscure name. Rename it to `matchWeight'.
Also, the `toLowerCase' function is error-prone, since it actually returns
a lower case version of the BASE char, that is without diacritics. Hence,
rename it to `toBaseLowerCase' and update variables with similar names.

Change-Id: Ibdbe73018a33ee864db59a51d664c3b104d5fb3f
2011-02-22 16:43:19 +09:00
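
The distinction behind the rename in commit f5f834afcd (lower case of the BASE character, that is, without diacritics) can be shown with a minimal sketch. The function below is not the function from the source tree: its name is hypothetical, and it covers only a handful of Latin-1 vowels by range checks, chosen purely for illustration.

```cpp
// Hypothetical sketch of "base lower case": strip the diacritic first,
// then lower-case the result. Only a few Latin-1 vowels are handled; any
// other character is only ASCII-lower-cased.
inline char32_t toBaseLowerCaseSketch(char32_t c) {
    // Helper: true when c lies in [lo, hi] or in the matching lower-case
    // Latin-1 range, which sits 0x20 above the upper-case one.
    const auto inEitherCase = [c](char32_t lo, char32_t hi) {
        return (c >= lo && c <= hi) || (c >= lo + 0x20 && c <= hi + 0x20);
    };
    if (inEitherCase(0x00C0, 0x00C5)) return U'a';  // A with grave .. A with ring
    if (inEitherCase(0x00C8, 0x00CB)) return U'e';  // E with grave .. E with diaeresis
    if (inEitherCase(0x00D2, 0x00D6)) return U'o';  // O with grave .. O with diaeresis
    if (inEitherCase(0x00D9, 0x00DC)) return U'u';  // U with grave .. U with diaeresis
    if (c >= U'A' && c <= U'Z') return c + (U'a' - U'A');
    return c;
}
```

Under this sketch both "É" (U+00C9) and "é" (U+00E9) map to plain "e", which is why a name like `toLowerCase' would be misleading: a plain lower-casing function would be expected to return "é".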
Tadashi G. Takaoka
887f11ee43 Remove next letters frequency handling
Bug: 3428942
Change-Id: Id62f467ce4e50c60a56d59bf96770e799a4659e2
2011-02-17 13:59:41 +09:00