Jean Chalard
03f8c6aed3
Be careful about the dictionary size in detection methods
...
Bug: 8857618
Change-Id: I29345ec96d53da601571ba73197a6485643a10a7
2013-05-08 18:55:18 +09:00
Tom Ouyang
9559dd2e30
Improve bigram frequency lookup
...
Bug: 8592527
Change-Id: I1908bcb552279b9acb140fe4d8d26b10ed9eda72
2013-04-26 12:22:23 -07:00
Ken Wakasa
dad23dda9d
A small follow-up to I8d03bae3264974eff7b790e27c073b0a8758d17a
...
Change-Id: Id3727f075e74c0102edcb51eabcfdbef745b94b7
2013-04-27 00:24:34 +09:00
Jean Chalard
c9688ef267
Fix a small bug
...
Test results in Ibcd6c110f1d5582425f9592c42e31152131ef80c
Bug: 7226877
Change-Id: I8d03bae3264974eff7b790e27c073b0a8758d17a
2013-04-22 18:30:46 +09:00
Satoshi Kataoka
252412d7eb
Use additional multi-word cost per language (for Russian)
...
Bug: 7540133
Change-Id: I7eb7b8399746c15452ed2ed5069955e88fb546d3
2013-04-16 16:42:36 +09:00
Satoshi Kataoka
e0e6737373
Refactor parameters by naming convention
...
Change-Id: I8bda8075b33f656ecbec08320afcd864b620fe77
2013-03-18 15:42:15 +09:00
Ken Wakasa
5e21ac60b0
Small cleanups in binary_format.h
...
Change-Id: I6049a2f312b7d53a3ffa688ddca5731004784ebe
2013-01-30 23:56:50 +09:00
Ken Wakasa
cffb3126ac
Small cleanups
...
Change-Id: I3e5862a405b4c63616c7ea947cd53c52b5035862
2013-01-30 01:19:29 +09:00
Ken Wakasa
fc799ba03c
Clean up sign conversions in binary_format.h (done by -Wsign-conversion)
...
Change-Id: I9ca88c22ee5bbb66d50640e1d96021fbe71fc8ab
2012-12-10 20:23:18 +09:00
Ken Wakasa
02421af02a
Merge "remove invalid comparison"
2012-12-03 01:27:41 -08:00
Satoshi Kataoka
94885f572c
remove invalid comparison
...
Change-Id: I162c478debe5897be057998bd22924ed487d01af
2012-12-03 18:15:06 +09:00
Ken Wakasa
17f71ca6bc
Fix offdevice regression test build error
...
Change-Id: I97128108b3bd75c61069517c3f8ce68ecc7bf285
2012-11-30 19:32:45 +09:00
Jean Chalard
da439fa461
Merge "Add utilities to read header values."
2012-11-29 03:11:06 -08:00
Jean Chalard
22025c6a37
Add utilities to read header values.
...
Bug: 7540132
Change-Id: I19d85481135e79d8782f711da5cbb3a5a7bc06f8
2012-11-29 20:08:37 +09:00
Ken Wakasa
2a6f58d902
Prep for GCC 4.7
...
On Galaxy Nexus (./vendor/google/apps/LatinImeGoogle/tests/etc/run-profile.sh -g)
Before
==== test finished, terminate logcat =====
(0) 2506.11 (10.48%)
(1) 21289.22 (89.01%)
(2) 108.29 (0.45%)
(3) 0.00 (0.00%)
(4) 0.00 (0.00%)
(5) 0.00 (0.00%)
(6) 0.00 (0.00%)
(20) 0.00 (0.00%)
Total 23917.44 (sum of others 23903.62)
After
==== test finished, terminate logcat =====
(0) 2499.58 (10.98%)
(1) 20145.66 (88.51%)
(2) 103.17 (0.45%)
(3) 0.00 (0.00%)
(4) 0.00 (0.00%)
(5) 0.00 (0.00%)
(6) 0.00 (0.00%)
(20) 0.00 (0.00%)
Total 22761.98 (sum of others 22748.42)
Change-Id: I662cb361ff9205ef87d640c458b8473df7d54659
2012-11-27 20:11:29 +09:00
Ken Wakasa
5f2fa6b82c
Tidy up visibility of members of BinaryFormat.
...
Change-Id: I38a00076b82de8e1a19209c67954fe01585f7943
2012-11-05 20:16:52 +09:00
Ken Wakasa
6e66349ed1
Adjust compiler warning options with the offdevice Makefile
...
Make use of AK_FORCE_INLINE for -Winline and better performance
Change-Id: If0016e2ef61c1fe007c83bb1a5133a6b6bde568e
2012-11-05 14:26:53 +09:00
Ken Wakasa
1e61493c50
Use 32-bit code points for suggestions output
...
This is a multi-project commit with Ic43dd666
Bug: 6526418
Change-Id: I39c1acb4e91d04cd8a4ec5a943c8cf575da75ebc
2012-11-01 00:09:51 +09:00
Jean Chalard
18ebba3a66
Fix off-by-one bugs reported by Valgrind
...
Bug: 7108990
Change-Id: I40ba30f50a26b65bcac905fc005ad6bb9cb034cc
2012-09-06 20:37:55 +09:00
Ken Wakasa
f2789819bd
Cosmetic fixes and a bug fix in UnigramDictionary::testCharGroupForContinuedLikeness().
...
This change was extracted from the work-in-progress change I4fe423834b8131fb122251892c98228a6e08ba25
Change-Id: I52568fa09da2ea22be7f8bfe9676b7cd73c31fa4
2012-09-04 14:23:37 +09:00
Jean Chalard
72b1c93941
Reinstate the shortcut-only attribute
...
Also add the blacklist attribute
Bug: 7005742
Bug: 2704000
Change-Id: Icbe60bdf25bfb098d9e3f20870be30d6aef07c9d
2012-08-31 22:11:52 +09:00
Ken Wakasa
de8a9a8227
Small cleanups
...
Change-Id: Ib66507b8934bc8019a762d24d5311411e044ec84
2012-08-17 13:06:28 +09:00
Jean Chalard
b14fc88e48
Tag the whitelisted entries in native code.
...
Since this is already used in Java land, this actually does
activate the whitelist path, and the code is now fully
functional. We still have to remove the old whitelist resource
and to compile the dictionary that includes the whitelist.
Bug: 6906525
Change-Id: Iacde5313e303b9ed792940efaf6bcfa4ee1317bd
2012-08-13 16:35:59 +09:00
Ken Wakasa
77e8e81ad9
Header cleanup. Moved a couple of functions from .h to .cpp.
...
Change-Id: Ifd12a7632f75395bd0ef5e394d5c2abd6cbe28c6
2012-08-02 20:19:39 +09:00
Jean Chalard
195605084e
Move flags belonging to BinaryFormat to the right place.
...
These masks and flags are constants that are an integral part
of the format. They belong in BinaryFormat and have nothing to
do in UnigramDictionary.
This needs I6751dda4 to not break the build
Bug: 6429243
Change-Id: Ic1c842b3245f7fdc25aa8d1459c5bb07b262e265
2012-08-01 00:23:52 +09:00
Ken Wakasa
0bbb917d12
Cosmetic fixes and style fixes
...
Change-Id: I69c42ff945cdf0d5205c6ca61d6861a0479492dc
2012-07-25 18:56:51 +09:00
Jean Chalard
e9a86e2cdb
Search bigrams for the lower case version of the word (A46)
...
...if there aren't any for the exact case version.
Bug: 6752830
Change-Id: I2737148b01ba04a64febe009ceb2ef53c265d224
2012-07-04 20:12:58 +09:00
satok
1bc038c5e4
Move correction state to stack memory
...
*Before
(0) 13.18 (0.01%)
(1) 93025.41 (62.06%)
(2) 10.75 (0.01%)
(3) 10.50 (0.01%)
(4) 117.50 (0.08%)
(5) 55678.98 (37.14%)
(6) 9.09 (0.01%)
(20) 883.84 (0.59%)
Total 149898.24 (sum of others 149749.25)
*After
(0) 17.41 (0.01%)
(1) 92673.41 (61.95%)
(2) 10.62 (0.01%)
(3) 10.37 (0.01%)
(4) 120.96 (0.08%)
(5) 55741.18 (37.26%)
(6) 11.01 (0.01%)
(20) 862.72 (0.58%)
Total 149595.52 (sum of others 149447.68)
Change-Id: Ia5a25a544fc388e4dab1e08d8f78d5117b249cf3
2012-06-14 15:57:28 -07:00
Jean Chalard
e308459531
Compute the correct frequency for bigram prediction
...
Change-Id: I3196f48a0ca2ed5e94f430254d58e65d341398c8
2012-05-29 16:22:46 +09:00
Jean Chalard
cb99376307
Fix a bug where the bigram freq would be underevaluated
...
The difference in score is not large, but it's still a bug
Change-Id: Ie22c2b6e1206e829c1c8af096469df05af14d47b
2012-05-29 16:04:07 +09:00
Jean Chalard
19ebd93646
Split a method to reconstruct freq from uni/bi freq
...
This has no impact at all on the logic.
Change-Id: I3788c8335cc193433ad9a7512b211a49bb2ffb02
2012-05-29 16:00:25 +09:00
Jean Chalard
402b057050
Fix two small possible bugs.
...
Neither of these had any real impact, but they were potential
liabilities for the future
Change-Id: I2de581f8b638e423d47a6d99b1a3c96af4c6150d
2012-05-29 15:56:30 +09:00
Jean Chalard
9416c81403
Return the bigram frequency if available.
...
This concludes the work on bug#6313806.
Don't submit it before the dictionaries are suitably amended.
Bug: 6313806
Change-Id: Icfea45bd52bb9d8cc68ba2266f80640e3942bb7f
2012-05-16 21:14:06 +09:00
Jean Chalard
49ba135fde
Perform the actual bigram frequency lookup.
...
This still returns the unigram frequency, because the values stored
for bigrams in the dictionary are not ready to be returned in-place
instead of unigram values. Aside from this, the code is complete.
Bug: 6313806
Change-Id: If7bb7b644730782277f0f6663334c170b7fe13fb
2012-05-10 20:01:44 +09:00
Jean Chalard
8950ce6c44
Replace the bigram list position with the map and filter
...
Passing the position would not give us a reasonable lookup
time. Replace it with a map and a bloom filter for very fast
lookup.
Bug: 6313806
Change-Id: I3a61c0001cbc987c1c3c7b8df635d4590a370144
2012-05-07 17:15:21 +09:00
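The map-plus-bloom-filter lookup described in 8950ce6c44 can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name `BigramLookup`, the method names, the filter size, and the single modulo hash are all hypothetical and are not taken from the actual dictionary code.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical sketch: every name and size below is an assumption made for
// illustration, not an identifier from the real native code.
class BigramLookup {
 public:
    // Records that the word at `position` follows the previous word with the
    // given bigram frequency.
    void add(int32_t position, int frequency) {
        assert(frequency >= 0);
        mFilter.set(bucketOf(position));
        mFrequencies[position] = frequency;
    }

    // Returns the recorded bigram frequency for `position`, or `fallback`
    // (for example the unigram frequency) when no bigram is recorded.
    int frequencyOrDefault(int32_t position, int fallback) const {
        // A clear bit proves absence, so most misses never touch the map.
        if (!mFilter.test(bucketOf(position))) return fallback;
        // A set bit may be a false positive; the map gives the final answer.
        const auto it = mFrequencies.find(position);
        return it == mFrequencies.end() ? fallback : it->second;
    }

 private:
    static constexpr std::size_t FILTER_BITS = 1024;

    // Single-hash filter: one bit per bucket, bucket chosen by modulo.
    static std::size_t bucketOf(int32_t position) {
        return static_cast<uint32_t>(position) % FILTER_BITS;
    }

    std::bitset<FILTER_BITS> mFilter;
    std::unordered_map<int32_t, int> mFrequencies;
};
```

The filter is checked first because testing one bit is cheaper than a hash-map probe, and during suggestion most candidate words have no bigram with the previous word.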
Jean Chalard
171d1809ff
Add methods to inverse compute the probability.
...
For now the probability is just returned with the same
value it had, but this is some ground work that needs to be
done anyway.
Bug: 6313806
Change-Id: I9bb8b96b294109771208ade558c9ad56932d2f8e
2012-04-24 09:40:44 +09:00
Jean Chalard
522a04ea5b
Pass words as int[] to the native code.
...
We need to get the bigrams during the call to getSuggestions for
bug#6313806. We already give an int[] to getSuggestions, and we
wanted to get rid of char[]'s anyway because they don't work with
surrogate pairs, so here we go.
Bug: 6313806
Change-Id: I56ce99f1db6b3302cdf42f0527343bded837091e
2012-04-23 16:05:36 +09:00
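The surrogate-pair problem mentioned in 522a04ea5b comes from UTF-16: a character outside the Basic Multilingual Plane occupies two 16-bit units, so a 16-bit array cannot hold it in one element, while an int array of code points can. A minimal sketch of the conversion, assuming a hypothetical helper named `toCodePoints` (not a function from the actual code base):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper for illustration only: converts UTF-16 code units into
// 32-bit code points, combining each valid surrogate pair into one value.
std::vector<int32_t> toCodePoints(const std::vector<uint16_t> &units) {
    std::vector<int32_t> out;
    for (std::size_t i = 0; i < units.size(); ++i) {
        const uint16_t unit = units[i];
        const bool isHighSurrogate = unit >= 0xD800 && unit <= 0xDBFF;
        if (isHighSurrogate && i + 1 < units.size()) {
            const uint16_t next = units[i + 1];
            if (next >= 0xDC00 && next <= 0xDFFF) {
                // Standard UTF-16 decoding of a high/low surrogate pair.
                out.push_back(0x10000 + ((unit - 0xD800) << 10) + (next - 0xDC00));
                ++i;  // The low surrogate has been consumed.
                continue;
            }
        }
        // BMP character, or an unpaired surrogate passed through unchanged.
        out.push_back(unit);
    }
    return out;
}
```

With this representation each element is one whole character, so word length and per-character comparisons no longer depend on whether a character needs a surrogate pair.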
Jean Chalard
e81ac8baa0
Add a method to get the flags from a binary dictionary.
...
This method is not used yet
Change-Id: Ic15d3d423aff2c83c712bc0aa56571d30755e663
2012-04-06 18:34:22 +09:00
Jean Chalard
5b0761e6a9
Remove write-only stuff
...
Change-Id: I5ac8ab64c77a298502b3d063ea70db9b4da41716
2012-04-06 17:52:18 +09:00
Jean Chalard
9a933a742d
Read shortcuts as strings in the dictionary.
...
This has no impact on performance.
Before:
(0) 9.61 (0.01%)
(1) 57514.58 (56.70%)
(2) 10.55 (0.01%)
(3) 10.79 (0.01%)
(4) 133.20 (0.13%)
(5) 43553.87 (42.94%)
(6) 10.03 (0.01%)
(20) 47.20 (0.05%)
Total 101431.47 (sum of others 101289.84)
After:
(0) 10.52 (0.01%)
(1) 56311.16 (56.66%)
(2) 13.40 (0.01%)
(3) 10.98 (0.01%)
(4) 136.72 (0.14%)
(5) 42707.92 (42.97%)
(6) 9.79 (0.01%)
(20) 51.35 (0.05%)
Total 99390.76 (sum of others 99251.84)
The difference is not significant given the imprecision of the measurement
Change-Id: I2e4f1ef7a5e99082e67dd27f56cf4fc432bb48fa
2012-04-06 16:22:08 +09:00
Ken Wakasa
3ef3e24a12
Move the "src" directory as a preparation for Ib4a47342 and I66f6c5b9
...
Change-Id: I3ab65059f6e356530484bfd0bba26a634a4cba65
2012-03-30 09:53:51 +09:00