Jean Chalard
e308459531
Compute the correct frequency for bigram prediction
...
Change-Id: I3196f48a0ca2ed5e94f430254d58e65d341398c8
2012-05-29 16:22:46 +09:00
Jean Chalard
cb99376307
Fix a bug where the bigram freq would be underevaluated
...
The difference in score is not large, but it's still a bug
Change-Id: Ie22c2b6e1206e829c1c8af096469df05af14d47b
2012-05-29 16:04:07 +09:00
Jean Chalard
19ebd93646
Split a method to reconstruct freq from uni/bi freq
...
This has no impact at all on the logic.
Change-Id: I3788c8335cc193433ad9a7512b211a49bb2ffb02
2012-05-29 16:00:25 +09:00
Jean Chalard
402b057050
Fix two small possible bugs.
...
Neither of these had any real impact, but they were potential
liabilities for the future.
Change-Id: I2de581f8b638e423d47a6d99b1a3c96af4c6150d
2012-05-29 15:56:30 +09:00
Ken Wakasa
7d81f31871
am 7b1570e6: Merge "Cleanup Makefiles of LatinIME" into jb-dev
...
* commit '7b1570e60c2e04fe7d132df758476b34685eb709':
Cleanup Makefiles of LatinIME
2012-05-23 20:22:04 -07:00
Ken Wakasa
dd58065733
Cleanup Makefiles of LatinIME
...
Change-Id: Id4c6700bc045825eb64fb2b7ae57f23a6211441d
2012-05-24 12:08:59 +09:00
satok
074e8c9206
am a0ac31fc: Fix the issue on multiple words suggestion
...
* commit 'a0ac31fcaa01c21592a6e7af243c14dada65cf3e':
Fix the issue on multiple words suggestion
2012-05-23 05:06:08 -07:00
satok
a0ac31fcaa
Fix the issue on multiple words suggestion
...
Bug: 6509844
Change-Id: I823074a2b29befc3e60c63699ab4dc7719105c63
2012-05-23 20:40:59 +09:00
Jean Chalard
7557d3c6f3
am bc77adef: Merge "Return the bigram frequency if available." into jb-dev
...
* commit 'bc77adefbb0305c5ec0e41ab01e3a085c47c21eb':
Return the bigram frequency if available.
2012-05-17 03:31:15 -07:00
Jean Chalard
bc77adefbb
Merge "Return the bigram frequency if available." into jb-dev
2012-05-17 03:15:40 -07:00
Jean-Baptiste Queru
cd7c41352f
Fix build
...
Change-Id: I799811aa3afb59bba2e4086a063f5da03669bba3
2012-05-16 16:56:11 -07:00
Ken Wakasa
3b088a2f36
Add missing includes.
...
Change-Id: Ic7199045d0cffb208871f52cc167194013351d32
2012-05-16 23:05:32 +09:00
Jean Chalard
9416c81403
Return the bigram frequency if available.
...
This concludes the work on bug#6313806.
Don't submit it before the dictionaries are suitably amended.
Bug: 6313806
Change-Id: Icfea45bd52bb9d8cc68ba2266f80640e3942bb7f
2012-05-16 21:14:06 +09:00
satok
0028ed3627
Use "float" instead of "double"
...
Change-Id: I93ed4d88ede4058f081dd8d634b00dfff4e96d07
2012-05-16 20:45:05 +09:00
satok
f837b57bf5
Merge "Reorder suggestions result according to auto correction threshold" into jb-dev
2012-05-16 04:13:08 -07:00
satok
db1939dbaa
Reorder suggestions result according to auto correction threshold
...
Bug: 5413904
Change-Id: I3aa3a8109ba45d2129b58d8242866fd3dd3473cb
2012-05-16 19:58:48 +09:00
satok
6804b8e0fd
Fix a bug of handling single quote in the correction algorithm
...
Bug: 6096247
Change-Id: I5490bbdee4ce1e3e0729ec1510a2baab85eeaf05
2012-05-15 15:12:55 +09:00
Tom Ouyang
4d289d39ae
Contacts dictionary rebuilds only when contact names have changed.
...
Bug: 6396600
Change-Id: Iad693ec4bab6351793d624e5c5b0a9f5c12a60e3
2012-05-11 18:43:53 -07:00
Jean Chalard
49ba135fde
Perform the actual bigram frequency lookup.
...
This still returns the unigram frequency, because the values stored
for bigrams in the dictionary are not ready to be returned in-place
instead of unigram values. Aside from this, the code is complete.
Bug: 6313806
Change-Id: If7bb7b644730782277f0f6663334c170b7fe13fb
2012-05-10 20:01:44 +09:00
Jean Chalard
8950ce6c44
Replace the bigram list position with the map and filter
...
Passing the position will not allow us a reasonable lookup
time. Replace this with a map and bloom filter for very fast
lookup.
Bug: 6313806
Change-Id: I3a61c0001cbc987c1c3c7b8df635d4590a370144
2012-05-07 17:15:21 +09:00
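The "map and bloom filter" combination named in commit 8950ce6c44 can be pictured with a minimal Python sketch. This is not the LatinIME native code: the class name, the two hash functions, the filter size, and the fallback to the unigram frequency are all assumptions made for illustration. The point it shows is that a cheap bit test rejects most addresses that have no bigram before the map is consulted at all.

```python
class TinyBloomFilter:
    """Illustrative bloom filter over integer addresses (hypothetical design)."""

    def __init__(self, size_bits=1024):
        self.size_bits = size_bits
        self.bits = 0  # a Python int used as a bit set

    def _positions(self, key):
        # Two cheap hash positions; the constants are arbitrary choices.
        return (
            key % self.size_bits,
            ((key * 2654435761) >> 7) % self.size_bits,
        )

    def add(self, key):
        for position in self._positions(key):
            self.bits |= 1 << position

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> p) & 1 for p in self._positions(key))


def bigram_frequency(bloom, address_to_freq, address, unigram_freq):
    """Return the bigram frequency if one is stored, else the unigram one."""
    if not bloom.might_contain(address):
        return unigram_freq  # fast path: no map lookup needed
    # The filter can give false positives, so the map has the final say.
    return address_to_freq.get(address, unigram_freq)
```

A bloom filter never reports a stored key as absent, so the fast path cannot lose a real bigram; it can only occasionally fall through to the map for a key that is not there.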
Jean Chalard
f1634c872c
Fill in the bloom filter for bigram lookup.
...
Bug: 6313806
Change-Id: Ib79e14f6f8b241f053da6069c15f19c71084317e
2012-05-07 15:38:43 +09:00
Jean Chalard
1ff8dc47be
Fill up a map of bigram addresses for lookup.
...
We don't want to do a linear search on each terminal when there
may be 100+ bigrams for a given word because that would be
disastrous for performance. Also, we need to resolve each bigram
address anyway.
This change resolves the addresses at first and puts them in a
balanced tree so that lookup will be O(log(n)).
Bug: 6313806
Change-Id: Ibf088035870b9acb41e948f0ab7af4726f2cee24
2012-05-02 17:50:44 +09:00
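The reasoning in commit 1ff8dc47be (resolve each bigram address once, then look it up in O(log(n)) instead of scanning linearly) can be sketched as follows. The commit uses a balanced tree in native code; this Python sketch swaps in a sorted list with binary search, which gives the same logarithmic lookup. The function names and the (address, frequency) pair layout are hypothetical.

```python
from bisect import bisect_left


def build_bigram_index(bigrams):
    """Sort (address, frequency) pairs once so later lookups can bisect.

    The pair layout is a hypothetical stand-in for the resolved bigram list.
    """
    ordered = sorted(bigrams)
    addresses = [address for address, _ in ordered]
    frequencies = [frequency for _, frequency in ordered]
    return addresses, frequencies


def lookup_frequency(index, address):
    """Return the frequency stored for address, or None if it is absent."""
    addresses, frequencies = index
    i = bisect_left(addresses, address)  # O(log n) binary search
    if i < len(addresses) and addresses[i] == address:
        return frequencies[i]
    return None
```

With 100+ bigrams for a word, each terminal check costs about seven comparisons rather than a full scan of the list.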
Jean Chalard
351864b38a
Fetch and pass the bigram position on suggestions.
...
This is a cherry-pick of change I2d81742f
Bug: 6313806
Change-Id: Ic1190b7980d032bc11b57841bca040d980889b6b
2012-04-26 15:24:42 +09:00
Jean Chalard
4d9b202c40
Pass the bigram list position from the top level
...
The position itself is still a const int = 0 until we have the previous
word passed to the function. This basically does the plumbing.
Bug: 6313806
Change-Id: Ib58995f334fe93e3ff5704d7c79f332017f101ac
2012-04-24 16:47:09 +09:00
Jean Chalard
171d1809ff
Add methods to inverse compute the probability.
...
For now the probability is just returned with the same
value it had, but this is some ground work that needs to be
done anyway.
Bug: 6313806
Change-Id: I9bb8b96b294109771208ade558c9ad56932d2f8e
2012-04-24 09:40:44 +09:00
Jean Chalard
522a04ea5b
Pass words as int[] to the native code.
...
We need to get the bigrams during the call to getSuggestions for
bug#6313806. We already give an int[] to getSuggestions and we
wanted to get rid of char[]'s anyway because it doesn't work with
surrogate pairs, so here we go.
Bug: 6313806
Change-Id: I56ce99f1db6b3302cdf42f0527343bded837091e
2012-04-23 16:05:36 +09:00
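The surrogate-pair problem mentioned in commit 522a04ea5b can be seen in a small Python sketch. It is only an illustration, not the JNI code: the helper names are hypothetical. A character outside the Basic Multilingual Plane occupies two 16-bit units in a char[]-style UTF-16 array but a single entry in an int[] of code points.

```python
def utf16_code_units(text):
    """Return the 16-bit units a Java char[] would hold for this text."""
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def code_points(text):
    """Return one integer per character, as an int[] of code points would."""
    return [ord(ch) for ch in text]


word = "a\U0001F600"  # 'a' followed by a character outside the BMP
print([hex(u) for u in utf16_code_units(word)])
print([hex(c) for c in code_points(word)])
```

Code that indexes the 16-bit array character by character would treat each half of the pair as a separate letter, which is the failure the int[] representation avoids.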
Tadashi G. Takaoka
a58ebc73ae
Fix typo of some methods' name
...
Also changes some methods' argument type from Locale to String.
Change-Id: Ib68b528a450dc68a01546483403230f76500bee4
2012-04-18 16:40:50 +09:00
Jean Chalard
bde232dcaa
Merge "Pass the previous word down to native code in getSuggestions"
2012-04-17 18:01:05 -07:00
Jean Chalard
80111f08e2
Pass the previous word down to native code in getSuggestions
...
Change-Id: I477b631d81ef58461e44954f3ae5fd895928bb97
2012-04-17 20:07:10 +09:00
Jean Chalard
fec6837ae1
Fix debug compilation + small cleanup
...
Change-Id: Ia89d84f62ba38dee05d25fbc94698e889cf27d2c
2012-04-17 17:33:25 +09:00
Jean Chalard
ee396df162
Fix a native crash
...
This was introduced by Ieb2e306a which failed to keep the return
behavior in case the word doesn't have a bigram.
Change-Id: I6d2f0b79d41c4335e94696690c8331e314961133
2012-04-17 16:57:42 +09:00
Jean Chalard
9c2a96aa6c
Preparatory refactoring
...
Split out getting the pointer to the bigrams into a separate
function. This is a preparatory change for bug#6313806.
Change-Id: Ieb2e306a1151cd95dc1a16793c8dc2f7fed8b654
2012-04-17 11:46:20 +09:00
Ken Wakasa
db87fe4d5c
Just cosmetic changes in jni code
...
Change-Id: I8628131b5a7ccdee4c158e891002c8b86623b0cd
2012-04-16 19:16:05 +09:00
Jean Chalard
3f675f7060
Fix a large native memory leak.
...
This leak was about 500k and would happen whenever a new binary
dictionary was opened/closed.
Bug: 6299535
Change-Id: I4fad5b4d9c556ca889f5ef62d9d083a2eff6346a
2012-04-16 16:48:53 +09:00
Jean Chalard
338d3ec725
Replace the flags in getSuggestions with a boolean.
...
Change-Id: I0ec44df1979cb1dc21017ea290d2151a2af0e7cd
Conflicts:
java/src/com/android/inputmethod/latin/Suggest.java
2012-04-06 19:34:48 +09:00
Jean Chalard
aa8df59914
Enable using the flags read from the binary file.
...
Change-Id: Ib420c3e174ccc1a80c4b6fd066de3b7a2b6fb290
2012-04-06 18:54:20 +09:00
Jean Chalard
cd274b1469
Save the flags in a member in the unigram dictionary.
...
Change-Id: Ic8fad9110db6b97f98ace27af0f347b4e69de8c8
2012-04-06 18:34:59 +09:00
Jean Chalard
e81ac8baa0
Add a method to get the flags from a binary dictionary.
...
This method is not used yet
Change-Id: Ic15d3d423aff2c83c712bc0aa56571d30755e663
2012-04-06 18:34:22 +09:00
Jean Chalard
5b0761e6a9
Remove write-only stuff
...
Change-Id: I5ac8ab64c77a298502b3d063ea70db9b4da41716
2012-04-06 17:52:18 +09:00
Jean Chalard
9a933a742d
Read shortcuts as strings in the dictionary.
...
This has no impact on performance.
Before:
(0) 9.61 (0.01%)
(1) 57514.58 (56.70%)
(2) 10.55 (0.01%)
(3) 10.79 (0.01%)
(4) 133.20 (0.13%)
(5) 43553.87 (42.94%)
(6) 10.03 (0.01%)
(20) 47.20 (0.05%)
Total 101431.47 (sum of others 101289.84)
After:
(0) 10.52 (0.01%)
(1) 56311.16 (56.66%)
(2) 13.40 (0.01%)
(3) 10.98 (0.01%)
(4) 136.72 (0.14%)
(5) 42707.92 (42.97%)
(6) 9.79 (0.01%)
(20) 51.35 (0.05%)
Total 99390.76 (sum of others 99251.84)
The difference is not significant given the measurement imprecision.
Change-Id: I2e4f1ef7a5e99082e67dd27f56cf4fc432bb48fa
2012-04-06 16:22:08 +09:00
Ken Wakasa
0c1a3ec629
Make LatinIME's native Makefile NDK-friendly
...
Change-Id: I55d430756b3a8251c9ff49dfabfcecb047d979a4
2012-03-31 05:07:32 +09:00
Ying Wang
32f0e24b33
Remove ".." in the native LOCAL_SRC_FILES.
...
The build system does not work well with ".." in the path of native
source code.
".." causes the object files to spill out of the module's intermediate
directory.
Change-Id: Ib4a473426be296a738e7facbaa091e56f0b7c5b8
2012-03-30 10:50:26 -07:00
Ken Wakasa
3ef3e24a12
Move the "src" directory as a preparation for Ib4a47342 and I66f6c5b9
...
Change-Id: I3ab65059f6e356530484bfd0bba26a634a4cba65
2012-03-30 09:53:51 +09:00
satok
6ba8de2a60
Good bye the proximity logic in Java code
...
Bug: 4343280
Change-Id: I82f7d08703647a3492ce6e2d3b741146df58927e
2012-03-28 18:42:30 +09:00
Tom Ouyang
aeda8a7798
Change the first character check in bigram dictionary to be case insensitive.
...
Bug: 6188977
Change-Id: I121c1abf245c7f8734730810c07d3351b1ec581a
2012-03-24 15:31:27 +09:00
satok
acb6c5445f
Fix build breakage
...
Change-Id: Ic4d3cf6932dcd57c1040c7877ab7c7f48cd6c408
2012-03-23 20:58:18 +09:00
Jean Chalard
350ffc879a
Merge "Fix a bug with negative coordinates, step 3"
2012-03-23 04:04:48 -07:00
Jean Chalard
e2222b78d3
Merge "Fix a bug with negative coordinates, step 2"
2012-03-23 03:50:21 -07:00
Jean Chalard
7f18f44461
Merge "Fix a bug with negative coordinates, step 1"
2012-03-23 03:48:53 -07:00
Jean Chalard
52612a0d1b
Fix a bug with negative coordinates, step 3
...
This implements the actual change, now that indentation is okay
Change-Id: Idd897f988394125611516431711c1e575df871df
2012-03-23 19:38:23 +09:00
Jean Chalard
3094d12cdc
Fix a bug with negative coordinates, step 2
...
Indentation changes only
Change-Id: I95011e7d3f787ae6749b826af627f9acaed34e97
2012-03-23 19:37:13 +09:00
Jean Chalard
88ec125cfc
Fix a bug with negative coordinates, step 1
...
This breaks style guidelines but for some reason git diff gets
so lost on this re-indent that it's better to do it like this
Change-Id: Ie0a603eb0739704894a5adc25f9d527b37bdf151
2012-03-23 19:34:53 +09:00
satok
8980bd4a25
Merge "Cleanup jni 1"
2012-03-23 03:24:08 -07:00
satok
9df4a4527a
Cleanup jni 1
...
Change-Id: Ieb6af8385356e259720b50f1fe46a694a098b30f
2012-03-23 19:03:20 +09:00
Jean Chalard
2b5b6388d6
Merge "Fix a typo"
2012-03-23 02:48:15 -07:00
Jean Chalard
bbc25607f0
Fix a typo
...
Change-Id: If794344629e93b558d60b023ae70b703f9c039ab
2012-03-23 17:05:03 +09:00
Jean Chalard
cc78d03a62
Add processing for French ligatures.
...
Bug: 5140033
Change-Id: I1c2751fc617e662aad9f67506e28a622f81d0bc9
2012-03-23 16:50:59 +09:00
Jean Chalard
081616cd2f
Send correct coordinates for the spell checker
...
This results in the computation being done in native code
and the correct proximity being used.
Bug: 6181080
Change-Id: I08fa05c781d607e4feca2caeda353ec19c133a3d
2012-03-23 13:02:58 +09:00
Jean Chalard
d30433837d
Add a replacement character to digraphs system
...
The digraphs system used to allow only the replacement of
a pair (A, B) by (A). This change allows the replacement to
be any character.
Bug: 5140033
Change-Id: Icf5995f0ec553f7b7989af9902cbb2c4c6b51009
2012-03-22 11:37:26 +09:00
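The change described in commit d30433837d, from replacing a pair (A, B) with (A) only to replacing it with any character, can be sketched in Python. This is a simplified illustration under stated assumptions: the table entries and the function name are hypothetical, and the eager left-to-right rewrite here stands in for whatever the native correction code actually does with digraphs during lookup.

```python
# Old-style entry: the pair collapses to its first character.
FIRST_CHAR_ONLY = {("u", "e"): "u"}

# New-style entry: the replacement may be any character (illustrative).
ANY_REPLACEMENT = {("o", "e"): "œ"}


def collapse_digraphs(word, digraphs):
    """Replace each listed (A, B) pair in word with its replacement character."""
    out = []
    i = 0
    while i < len(word):
        pair = tuple(word[i:i + 2])
        if len(pair) == 2 and pair in digraphs:
            out.append(digraphs[pair])
            i += 2  # consume both characters of the digraph
        else:
            out.append(word[i])
            i += 1
    return "".join(out)
```

For example, `collapse_digraphs("ueber", FIRST_CHAR_ONLY)` yields `"uber"`, while `collapse_digraphs("coeur", ANY_REPLACEMENT)` yields `"cœur"`, which the first-character-only scheme could not express.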
Jean Chalard
9c4396abb3
Merge "Generalize the digraph code"
2012-03-21 18:47:10 -07:00
Jean Chalard
6c30061c70
Generalize the digraph code
...
Bug: 5140033
Change-Id: I19c8c89f79f7c1ce1fba58d50bc2697747052599
2012-03-21 17:50:55 +09:00
Jean Chalard
9715cc4ed5
Fix a bug where the returned number of bigrams was incorrect
...
Bug: 6191885
Change-Id: I1daa4d2eaeec5f5c1a4eef79221fd7de357763e9
2012-03-21 16:55:04 +09:00
satok
1caff47ecd
Calculate proximity characters in the native code
...
Bug: 4343280
Change-Id: I6adaf560f7a4f1f96dcb6ec2f61f20ee3001167e
2012-03-16 17:26:36 +09:00
satok
0cb2097a45
Fix additional proximity in the native code
...
Bug: 4343280
Change-Id: I4164bb916b2dbdfb6bdc151b99d46a6171d9c355
2012-03-14 11:17:59 +09:00
satok
5eec574cf0
Use additional proximity chars in the native code
...
Bug: 4343280
Change-Id: Ida690fe246cea80a82fcdb3ad0c28e2907b882ac
2012-03-13 19:00:16 +09:00
satok
552c3c27f0
Implement additional proximity characters in the native code
...
Bug: 4343280
Change-Id: I9bbc5cab2fef1ee80c1fe32017df811ef8af10bc
2012-03-13 17:38:50 +09:00
Ken Wakasa
951ab9d7eb
Fix typo.
...
Change-Id: Ia18cd090fd81022041854ce190e36eca49c6b04a
2012-03-09 19:18:59 +09:00
satok
f0d5a78388
Merge "Add functions to calculate proximity characters in the native code"
2012-03-07 23:31:17 -08:00
satok
219a514082
Fix a bug on German umlaut digraph correction
...
Bug: 6129372
Change-Id: I2d629735028c35bf12289f381ada2f4ffe8d7ad3
2012-03-08 13:55:34 +09:00
satok
a70ee6e3b3
Add functions to calculate proximity characters in the native code
...
Bug: 4343280
Change-Id: I17f8f6295b01900948b98680d0267753f33a46cf
2012-03-08 12:55:15 +09:00
Jean Chalard
46a1eec4d8
Add a variable-length header region to the binary format.
...
Also bump up the format version to 2.
Bug: 5686638
Change-Id: I3aafdd7e42c422202122998ec093280051aa8e07
2012-03-06 17:37:28 +09:00
Tadashi G. Takaoka
d1dbdb6b20
Make some debug aid functions inline
...
Change-Id: I973f9d4a3989f3d2b797ad26f9d006c0f2c613b5
2012-03-06 15:51:32 +09:00
satok
bb0bd66942
Fix correction utility
...
Bug: 6096247
Change-Id: Ie17c60dde9bd081790b79312ce8d96d292c5128c
2012-03-02 12:34:17 +09:00
Jean Chalard
ad290d6505
Activate bigram predictions from the binary dictionary
...
Change-Id: If1cc50539d7677b854b1cd3bea3423c8c0865de5
2012-02-15 19:51:24 -08:00
Jean-Baptiste Queru
26e315785d
resolved conflicts for merge of 3ad1145a to master
...
Change-Id: I13159b95f90c5095373951bf9e91b7dbf8b14558
2012-02-14 13:09:49 -08:00
satok
a85f4929cd
Support multi words suggestion
...
result: I4d097612db2f2a93522
Change-Id: Iedbb24f431dac43e52b6dcce8cb610a75e0ca46e
2012-02-08 13:00:31 +09:00
Ken Wakasa
4c5daa8a55
Fix indent
...
Change-Id: I77b4fb3a47faae7d4ad45d9903556e77a2fc7163
2012-02-06 21:51:31 +09:00
Tadashi G. Takaoka
a27cb62390
Merge "Use C++ template for min/max"
2012-02-05 17:38:30 -08:00
Tadashi G. Takaoka
09baa36f7d
Use C++ template for min/max
...
To be more friendly for off-device regression test.
Change-Id: I7edf4c9de73915aad9c1760ace7df3177ed3c4e9
2012-02-06 09:41:41 +09:00
satok
1b9fa942b4
Support correction conversion from skip to additional proximity
...
Result: I34bedff6149a6a4e01
Change-Id: I46d528f228a969a0a996299221622627f43c55ec
2012-02-03 20:00:15 +09:00
satok
04fd04d6ff
Separate the logic for touch calibration again
...
Change-Id: I59c6244674caa899af559402290160ad411d1bb5
2012-02-02 19:03:04 +09:00
satok
e05b3f4b3a
Support additional proximity characters
...
Change-Id: Ifbe0d7e4eafea1926bbce968eae4724dd5769689
2012-02-02 16:07:16 +09:00
Bhanu Chetlapalli
b093cc4824
[MIPS] Remove reference to NDK
...
Change-Id: I6137c4a93b29a8906abb5bd0f320dd3f37fdea08
Signed-Off-By: Bhanu Chetlapalli <bhanu@mips.com>
2012-01-31 12:08:27 -08:00
satok
1f6b52e76c
Implement multi words suggestions step1
...
Change-Id: I96e8e1b0d9ccc0ed13d53c40300d8c19bcb7af5b
2012-01-30 18:01:25 +09:00
satok
9955716d0b
Merge missing space and mistyped space correction algorithm
...
Change-Id: Idd64d38d3d29be24748f9c0359667883698a5756
2012-01-27 16:54:15 +09:00
satok
3c09bb18d9
Merge multiple words suggestions algorithm
...
Change-Id: I70d85b90ddaa28a41e9679f445bc14ef9ff50f16
2012-01-26 18:59:51 +09:00
satok
7409d151a1
Refactor words priority queue
...
Change-Id: I14b7ef39263ad2b1d5ec087bc80b7b8d7c30abe7
2012-01-26 16:13:25 +09:00
satok
f8ce19c29d
Merge "Cleanup unused code"
2012-01-25 22:12:52 -08:00
satok
1c03306994
Cleanup unused code
...
Change-Id: I6c840f9ed170919e48d1c576cd0a48777ad44030
2012-01-26 14:56:52 +09:00
satok
61b31a646e
Merge "Do other error correction for the second word of two word correction"
2012-01-25 05:48:15 -08:00
satok
8330b488e9
Do other error correction for the second word of two word correction
...
result: I4e0b68a12190933f9
Change-Id: I98afce6fe4d5bde97392146d204370ba31a72566
2012-01-25 22:30:37 +09:00
Jean Chalard
0bfe359ee4
Add a test for auto-correction.
...
Fix two related subtle bugs:
- Stop singling out fat-finger-only corrections for rejection
when touch coordinates are not available.
- Remove a racy check that would happen only in debug mode
Change-Id: Ic904f9b27c091ca6b369052c4e65a630bff81257
2012-01-25 19:29:40 +09:00
Jean-Baptiste Queru
11c41216f1
Merge 2577fca1
...
Change-Id: Ie2c9f6c2eafb59dff95db8954481ce49c87a6d44
2012-01-23 09:06:00 -08:00
satok
bd6ccdd5f0
Clean up two word correction
...
Change-Id: I5cd2697d7f61b81aff0c249df01479d86ad0fba5
2012-01-23 15:35:03 +09:00
satok
54af64ae92
Two words error correction with other error correction for the first word
...
+1 26
-1 5
+2 0
-2 0
+3 0
-3 0
+4 9
-4 25
+5 20
-5 21
+6 13
-6 6
+7 15
-7 26
Change-Id: Iad682d417a6bb42b11ca6e60157698ca66fef3ff
2012-01-19 19:17:29 +09:00
Robert CH Chou
bd1ed5b859
Make the JNI lib an optional module
...
Making it a user module forces it to be installed whether or not
the IME is actually required by the product. Replace user with
optional, and require libjni_latinime through
LOCAL_REQUIRED_MODULES.
Change-Id: Ibfc37cf2e2391021d45538c7cea342894b56fbf8
2012-01-19 10:02:25 +08:00
satok
29dc80614b
Prepare for advanced two words error correction
...
Change-Id: I4c8a21f0f6e349ddafd9b402583321a60855cfe8
2012-01-17 16:00:55 +09:00
satok
a161a4afd6
Use edit distance for transposing correction
...
+1 73
-1 4
+2 0
-2 0
+3 0
-3 0
+4 11
-4 19
+5 9
-5 3
+6 2
-6 63
+7 2
-7 8
Change-Id: I269cd2386f451f8932e4e0ae66223e794fdfa862
2012-01-17 13:14:35 +09:00
Jean Chalard
82ddd16889
Stop avoiding adding what the user typed to candidates
...
There does not seem to be any reason other than a historical
one to avoid doing this, but it takes processing power and
makes things more complicated.
This has a very limited impact on regression tests:
5 -> 3 [He, the]
5 -> 3 [An, an]
5 -> 3 [Where, where]
5 -> 3 [This, this]
7 -> 1 [wAtch, watch]
6 -> 4 [oveNs, oceans]
5 -> 1 [Ahere, Where]
7 -> 1 [Hast, Hast]
7 -> 5 [bjp, bill]
5 -> 1 [What, What]
5 -> 3 [Sound, So und]
7 -> 3 [causalities, casualties]
7 -> 3 [discontentment, discontent]
7 -> 3 [irregardless, regardless]
5 -> 1 : 2
5 -> 3 : 5
6 -> 4 : 1
7 -> 1 : 2
7 -> 3 : 3
7 -> 5 : 1
+1 4
-1 0
+2 0
-2 0
+3 8
-3 0
+4 1
-4 0
+5 1
-5 7
+6 0
-6 1
+7 0
-7 6
Change-Id: I6407cf922f27bbd3992df11d63690e71fc61111b
2012-01-16 18:58:10 +09:00
satok
67e13976b7
Merge "Store suggestions for each input length for missing space algorithm etc."
2012-01-16 00:18:37 -08:00