Commit Graph

146 Commits (c05a70a4ca769664f2315fff1f3bf3cb2ce3e300)

Author SHA1 Message Date
Jean Chalard 890b44e537 Correctly read the header of APK-embedded dicts
Bug: 13164518
Change-Id: I8768ad887af8b89ad9f29637f606c3c68629c7ca
2014-02-24 22:54:01 +09:00
Keisuke Kuroyanagi 95d16561e0 Remove unused code.
Bug: 12810574
Change-Id: I9c7fff60ae0e94d52f3bd19c3e88de5a53b917d7
2014-02-15 17:39:24 +09:00
Keisuke Kuroyanagi 0fc93fe445 Implement PatriciaTriePolicy::getNextWordAndNextToken().
Bug: 12810574
Change-Id: Id1d44f90de9455d9cbe7b6e0a161cae91d6d422c
2014-02-15 17:39:20 +09:00
Keisuke Kuroyanagi 85fe06e759 Merge "Remove unused argument from readDictionaryBinary." 2014-02-14 10:37:56 +00:00
Keisuke Kuroyanagi 8e3a1d0f89 Remove unused argument from readDictionaryBinary.
Bug: 12810574
Change-Id: Ice415ebd8d11162facca3fe8927ef8a616b11424
2014-02-14 19:02:15 +09:00
Keisuke Kuroyanagi 8fa7a09f1e Merge "Implement PatriciaTriePolicy::getWordProperty()." 2014-02-14 09:08:09 +00:00
Keisuke Kuroyanagi c63d183473 Implement PatriciaTriePolicy::getWordProperty().
Bug: 12810574
Change-Id: I7bcccfd3641ebbcf2b8d857d33bb4734c42af5eb
2014-02-14 17:56:45 +09:00
Tadashi G. Takaoka da973e75dc Make InputLogicTest more robust
Change-Id: I134f14971126cbeed05b472c08747f2b88ad30e6
2014-02-13 19:38:51 +09:00
Keisuke Kuroyanagi 8ffc631826 Make PtNode have ProbabilityInfo instead of raw value.
Bug: 11281877
Bug: 12810574
Change-Id: Id1cda0afc74c4e30633c735729143491b2274a7b
2014-02-10 15:05:08 +09:00
Keisuke Kuroyanagi ab6a93773b Use native logic to read Ver4 dict.
Bug: 11281877
Bug: 12810574
Change-Id: Ief371d3ef61818e4e031de4659aee3c9584c7379
2014-02-06 21:55:37 +09:00
Keisuke Kuroyanagi b986f78ba8 Separate header class from FormatSpec.
Bug: 12810574
Change-Id: Iacf1cd05a268bf690ab864b5e32a18a4b0ccc693
2014-02-04 21:36:04 +09:00
Keisuke Kuroyanagi 35ff07c70b Merge "Fix BinaryDictDecoderEncoderTests." 2014-01-31 10:58:26 +00:00
Keisuke Kuroyanagi 5cb7509314 Fix BinaryDictDecoderEncoderTests.
Bug: 12809791
Change-Id: I04313df78692b01e153a34c932a37f079a924105
2014-01-31 19:44:17 +09:00
Jean Chalard 79b2e4d86c [HD03] Straighten out attribute key names in Java.
Bug: 11281748
Change-Id: I1d813bdacd45bcfd9c4cc73ac1d67c5c89854e86
2014-01-31 14:46:07 +09:00
Keisuke Kuroyanagi 26bd46095a Reading dictionary containing timestamps in Java Side.
Just skipping historical information fields.

Bug: 11281877
Change-Id: I43d2adaa576b7da11ed3ca54990265dbb6f53b08
2014-01-29 20:19:24 +09:00
Keisuke Kuroyanagi c2fd53ee0e Remove ver4 dict updater.
Change-Id: I468994c98d091be621b9fb3fbe6405c67fc6a465
2013-12-17 18:17:51 +09:00
Keisuke Kuroyanagi 42334bb493 Quit checking bigram order in BinaryDictDecoderEncoderTests.
Change-Id: I1b8eb6ab2ea797d2590495b1f57f9ec9560ea817
2013-12-17 17:38:24 +09:00
Keisuke Kuroyanagi 4fdcefe504 Move DictUpdater to the tests directory.
Bug: 11245133
Change-Id: I0907a091ac3ae960eaf3b27da78dbb48a24b2ea1
2013-12-17 14:31:25 +09:00
Jean Chalard b868375763 Fix failing tests
- Version 3 is not supported
- Now passing the right string to open v4 dicts. Fix the tests for this.

Change-Id: I7829330c3568a715b96396ba4e4e69c6e17775ab
2013-12-16 14:32:19 +09:00
Jean Chalard 1dc7eca114 Remove references to the v3 dictionary.
Change-Id: I811c8c923ad67a6d9bfdd11bdef8991eb7135c27
2013-12-13 18:53:41 +09:00
Jean Chalard a245d15da5 Have dicttool use the native library to generate v4 dicts.
Yay !

Change-Id: Iea8ced9e81031b9ab7eff05ad9ef7215be248de9
2013-12-13 18:18:20 +09:00
Jean Chalard 7b55cd3e2b Remove flags from Java side.
This simplifies the code quite a bit.
- GERMAN_UMLAUTS are now handled through a key-value attribute.
  The dictionary generator does not need to know about it any more.
- FRENCH_LIGATURES are deprecated as we handle them with shortcuts now.
- CONTAINS_BIGRAMS is deprecated. Bigram processing is always applied
  regardless of this flag.

Bug: 11281748
Change-Id: If567e52e245a9342adc7f3104a0f7d8d782df8c1
2013-12-13 18:15:05 +09:00
Ken Wakasa 2fa3693c26 Reset to 9bd6dac470
The bulk merge from -bayo to klp-dev should not have been merged to master.

Change-Id: I527a03a76f5247e4939a672f27c314dc11cbb854
2013-12-13 17:13:32 +09:00
Yuichiro Hanada 9514ed5c2a Add the new format of bigram entries.
In new format, each bigram entry has flags (1 byte), a terminal id (3 byte),
a time-stamp (4 byte), a counter (1 byte) and a level (1 byte).

Bug: 10920255
Bug: 10920165
Change-Id: I0f7fc125a6178e6d25a07e8462afc41a7f57e3e1
2013-10-11 14:50:41 +09:00
Keisuke Kuroyanagi 61aae2b450 Merge "Add Ime language switching test." 2013-10-08 05:35:25 +00:00
Keisuke Kuroyanagi 72bc9ad37e Add Ime language switching test.
Change-Id: I6a96dc5fdd533899353d537382608c2759faff1c
2013-10-08 14:00:57 +09:00
Yuichiro Hanada e4e0add9fb Add Ver4DictUpdater.
Change-Id: I986ab26faf535fc4bc98443053f534eced9d048f
2013-10-04 17:33:29 +09:00
Yuichiro Hanada 75d60e821c Refactor BinaryDictIOUtilsTests.
Change-Id: I2208378b33038771b460abb33f9a690872e998e2
2013-10-04 14:19:13 +09:00
Yuichiro Hanada d6e307a4b7 Add DictUpdater.
Change-Id: Ic586e46e5a9f59de53d53e59886d635345940974
2013-10-03 20:16:34 +09:00
Yuichiro Hanada 3aa8977cb2 Remove some unused variables.
Change-Id: Iaf1556fec194d17cb4318f2bdcc837f8d79449ef
2013-10-02 18:26:03 +09:00
Yuichiro Hanada 4284e9aae7 Make SparseTable have multiple content tables.
Bug: 10920165
Change-Id: Ie9008452ee292fb0b1fec66e2ffed228c4af6c3e
2013-10-02 15:36:13 +09:00
Jean Chalard fa946d4a0f Fix a test and crash with a better error message when reading
When there are too many bigrams, we stop reading the file,
so the file pointer is in an inconsistent place. This means we
have no idea what's going to happen next. It's better to crash
right away.

Change-Id: Id3b7b78cbe4fda3493b3c9c46758763e1ab5f6a3
2013-10-02 11:48:47 +09:00
Yuichiro Hanada d188af7022 Add SparseTable.
Bug: 10920165
Change-Id: I749dd0269e788799e30b10beb2671813d40ce15f
2013-09-26 16:16:30 +09:00
Yuichiro Hanada 1625aeafd2 Fix runReadUnigramsAndBigramsTests.
Change-Id: Idd9176c9943dfacac5a06957f1a07187b642b207
2013-09-24 12:31:45 +09:00
Yuichiro Hanada 14087ba52c Add Ver4DictDecoder.
Bug: 9618601
Change-Id: I43c5840505c6a847aaf4893a400392ccd45903c0
2013-09-19 16:11:23 +09:00
Keisuke Kuroyanagi 78b55a31cb Fix handling multi-bytes characters and add a test.
Bug: 6669677

Change-Id: Id2154db47adea2929559a4187a726f9dfa83363e
2013-09-17 15:11:24 +09:00
Yuichiro Hanada a141d8ef7d Add Ver4DictEncoder.
Bug: 9618601
Change-Id: I161d2845906f07c1251deb8005fdffe49c5b7940
2013-09-13 17:33:51 +09:00
Yuichiro Hanada 0e40cd0c40 Add getDictDecoder.
Bug: 9618601
Change-Id: I173100ac704c03f7d5d0d53477e83cab5d1110d4
2013-09-12 20:14:09 +09:00
Yuichiro Hanada 95bc256f41 Add a flag to readDictioanryBinary in DictDecoder.
Change-Id: I356adb72047ebc43c924fbff1ff45e7460508a31
2013-09-11 18:20:56 +09:00
Yuichiro Hanada 2232a70806 Clean up unused imports.
Change-Id: I7147ca237b99399e79210852aa5bf5a01101d779
2013-08-26 08:29:16 +00:00
Yuichiro Hanada 752a33640c [Refactor] Add DictDecoder.readUnigramsAndBigramsBinary.
Change-Id: I259db91d837c67cbcb3b6dc504b21dca23a6a5be
2013-08-26 17:24:38 +09:00
Yuichiro Hanada bb5b84a826 [Refactor] Add DictDecoder.getTerminalPosition.
Change-Id: I9d04f64a58f5481cbb64cf1c09b5c485dd4176b4
2013-08-26 16:14:59 +09:00
Yuichiro Hanada 576f625ee1 Rename CharGroup to PtNode.
Bug: 10233675
Change-Id: I7b0eb07d195cd386cd0d9e97cd59bf48fcf24107
2013-08-26 15:58:30 +09:00
Yuichiro Hanada e9a10ff0f0 Add DictDecoder.readDictionaryBinary.
Bug: 10434720
Change-Id: I14690a6e0f922ed1bab3a4b6c9a457ae84d4c1a4
2013-08-23 20:29:25 +09:00
Yuichiro Hanada 373c492a02 Add an unit test for CharEncoding.
Change-Id: Ifb1cc01fa5bc2d6d69671f1acb9b9675a4081d32
2013-08-22 23:05:09 +09:00
Yuichiro Hanada aa4168ee09 Fix writePlacedNode.
Change-Id: I1d6b086f1d9f0dbd8d74f964e29ae62c533af978
2013-08-22 23:02:08 +09:00
Yuichiro Hanada e301085a70 Move findWordByBinaryDictReader to BinaryDictIOUtilsTests.
Change-Id: I443238fd816dea9650dcbbeb3ea757f9674fa52f
2013-08-22 14:49:49 +09:00
Yuichiro Hanada c922c8a504 Add DictEncoder.
Change-Id: I41049b9118b58838e5dedf8e5618d939ca70c5ef
2013-08-22 11:53:41 +09:00
Yuichiro Hanada 558e34c7bd Make readPtNode be called with the address from the beginning of the file.
Change-Id: I8939fdfb4f79e55bcd7393633784effb30df3f8f
2013-08-21 20:02:18 +09:00
Yuichiro Hanada a306e08753 Rename BinaryDictEncoder to BinaryDictEncoderUtils.
Change-Id: I4dabf17da7003b1d8204a83dbd10e5be6e8fd805
2013-08-21 18:54:34 +09:00
Yuichiro Hanada 107a5f6fb8 Add PtNodeReader.
Change-Id: Ic918822fc1b3a8a7c39ffbcf7defde2c5bf888db
2013-08-21 18:43:18 +09:00
Yuichiro Hanada 065aad9501 Add DictDecoder.
Change-Id: Ia1c32f21fe07081ce04d093660e18146b93275a4
2013-08-20 17:43:13 +09:00
Yuichiro Hanada 112257e40f Rename BinaryDictDecoder to Ver3DictDecoder.
Change-Id: Ibf9b95b658df6e2c2218bdb62e2380f326a03832
2013-08-20 17:11:51 +09:00
Yuichiro Hanada 66004ce2de Remove populateOptions.
Change-Id: I1a1830aaa8ea586b68fc34ff3a27ae52b810e8af
2013-08-20 16:06:52 +09:00
Yuichiro Hanada 77bce05e6f [Refactor] Rename BinaryDictReader and BinaryDictDecoder.
BinaryDictReader -> BinaryDictDecoder.
BinaryDictDecoder -> BianryDictDecoderUtils.

Change-Id: Iadf2153b379b760538ecda488dda4f17225e5f37
2013-08-19 19:36:31 +09:00
Ken Wakasa a83e25642f Merge "Add HeaderReaderInterface." 2013-08-19 02:34:23 +00:00
Yuichiro Hanada d794b42f98 Add HeaderReaderInterface.
Change-Id: I298f86b70d18cd08b240509b6f757c72e1a59ffe
2013-08-19 11:15:03 +09:00
Yuichiro Hanada 8aaae56cf6 Fix unit test.
Change-Id: Ib104d5de71c2ab1a07921b407c74c21b0409d9af
2013-08-19 11:10:28 +09:00
Yuichiro Hanada 3a73b37b30 Make BinaryDictIOUtils and DynamicBinaryIOUtils use BinaryDictReader.
Change-Id: I191dfe0e05ff3c2c5af99e8beebbb73b097748a3
2013-08-16 21:06:23 +09:00
Jean Chalard af30cbf0ee Rename Node to PtNodeArray
Bug: 10247660
Change-Id: I1a0ac19f58f96adb5efac5fd35c6404831618c99
2013-08-16 16:24:54 +09:00
Yuichiro Hanada 94460eba11 [Refactor] Divide BinaryDictInputOutput into BinaryDictEncoder and BinaryDictDecoder.
Change-Id: I7c3269d77e3e3b567e459dcaa1bc029903941744
2013-08-15 20:23:07 +09:00
Ken Wakasa 117f18e844 Revert "[Refactor] Divide BinaryDictInputOutput into BinaryDictInputUtils and BinaryDictOutputUtils."
This reverts commit 4c63d0614e.

Change-Id: I1fa277d720bab4d895259df7d6d82eebfa5eb6c5
2013-08-15 08:54:29 +00:00
Yuichiro Hanada 4c63d0614e [Refactor] Divide BinaryDictInputOutput into BinaryDictInputUtils and BinaryDictOutputUtils.
Change-Id: I0d476abe763c11ba9005152f928e8dccf15ac9de
2013-08-15 15:46:58 +09:00
Yuichiro Hanada 3edb62c69b Move some methods in BinaryDictIOUtils to DynamicBinaryDictIOUtils.
Change-Id: I9ba55582c533fef0eb3e60c46bf23c8b16ee1ff4
2013-08-14 19:33:36 +09:00
Yuichiro Hanada bbc8a930f7 Add FusionDictionaryBufferFromWritableByteBufferFactory.
Change-Id: I23de0a178e7f11f2cf301fd433cde60c6152055b
2013-08-14 17:07:44 +09:00
Yuichiro Hanada 3feacba1eb Add BinaryDictReader.
Bug: 9618601

Change-Id: Ief07fa0c3c4f7f5999a3fafcef4e47b6b6fd8143
2013-08-13 19:55:05 +09:00
Ken Wakasa b03447e1af Move a couple classes to the utils package
Change-Id: Ia14a2011d79bad7cd02697b9254705f6e2099442
2013-07-19 10:46:46 +09:00
Jean Chalard 1588252968 Small debug helper
So that I don't have to find out everything again each time the
test facility finds a case that does not work, and I want to dump
the output to a combined file.

Change-Id: I9f77f86055d1609c2e37747ac47421db1ba2498e
2013-07-16 15:57:09 +09:00
Jean Chalard 4a1c26aba7 Change how the length of the random words are chosen.
This is much more robust and much better for testing.

Change-Id: I43f900f9debc1d1ae4c3f3dd07dbe0ac85d31f52
2013-07-04 22:22:34 +09:00
Jean Chalard cea80fd955 Have random words stick to a restricted (random) charset
Change-Id: Ib4045ebc9659f1b60183f2356e60e449d62c5be9
2013-07-04 22:21:27 +09:00
Jean Chalard fe156213d7 Add a two-args constructor to BinaryDictIOTests
Change-Id: Ie26e22754bfa5d58135349164c57007c86bd97e8
2013-07-04 17:14:02 +09:00
Jean Chalard 4b7acd1df6 Add args to dicttool test.
Change-Id: I0667e0a5a6f6db3964cfcca5c8f083b9ceb41a2e
2013-07-01 15:28:32 +09:00
Ken Wakasa e28eba5074 Move util classes to the latin/utils directory
Change-Id: I1c5b27c8edf231680edb8d96f63b9d04cfc6a6fa
2013-06-24 17:04:40 +09:00
Jean Chalard 23d4eb55ba Add tests to dicttool test.
Bug: 8526576
Change-Id: Idd6f9cd076d5915361c68f5c29afbba67dd54eba
2013-06-20 17:29:37 +09:00
Jean Chalard 001884a1ee Clean up tests and increase speed
Conservatively reduce the number of unigrams to test from 1000
to 100.

Bug: 8583091
Change-Id: I48621ec44ff5f0590640d7c6b174ab5a6d267aaf
2013-04-15 14:25:32 +09:00
Jean Chalard c2653d0b5c Fix a typo
Change-Id: I27b925be030e9e6ee8ae49dc13f39accec996d7e
2013-04-15 12:57:27 +09:00
Jean Chalard c2e9c511cb Fix Binary dict tests
There are two problems here. The first one is the tests would send
an invalid unicode character. Although we could want dicttool to
handle this more gracefully, it's fine for now.

The second problem is much more serious. If a node has more than
128 children, then the java code will crash trying to read the
dictionary back because of a bug that this change fixes. In
theory, it's possible that happens when we try to load the user
history dictionary back from the disk - native code is not affected
so there is no other point that may cause a problem.
In the practice, that means you'd need to have 129 words with a
common prefix (including empty string) but all different after
this. It's almost impossible with Google Keyboard since there are
only so many keys on the keyboard that you can make a word out
of, and then again you'd have to do it repeatedly until it
actually enters the user history dictionary, wait for it to get
saved on the disk.
The bad news is, if you manage to get this far, the keyboard will
crash every time and won't be able to get up until you clear
data for the package.
The good news is, the dictionary itself is not corrupted and only
the reading code is wrong. So updating to a newer version would
actually even recover from this situation.

All in all, considering how almost-impossible this is to trigger,
I don't think even a single user actually did hit this bug.

Bug: 8583091
Change-Id: Iabb2a7f47cbd9ed3193d2a3487318d280753e071
2013-04-15 12:48:16 +09:00
Jean Chalard a411595b16 Fix two nasty bugs with surrogate pairs.
The important bug is in findWordInTree. The problem, which is
not obvious, is that we were calling codePointAt() with the
code point index in the string, instead of the char index.
The other bug this change fixes was harmless in the practice,
because it's in the iteration which is only used for debug and
pretty printing purposes. It's very similar in that it would
substract a length in code point to a length in chars and
truncate a StringBuilder at that length, so it would fail in a
quite similar manner. This changes the meaning of the "length"
attribute in Position, but it's clearer this way anyway.

Bug: 8450145
Change-Id: If396f883a9e6449de39351553ba83f5be5bd30f0
2013-04-01 17:06:19 +09:00
Jean Chalard 1c5b2a41ec Cleanups
Follow-up to Idc6f419a

Change-Id: I4aae8f4e19f27a0a309879dc19af6e40906d58c5
2013-02-11 21:14:56 -08:00
Jean Chalard 68ed7aa990 Fix a test
The test is wrong - it checks a struct that contains a string
instead of checking the string itself.

Bug: 8149360
Change-Id: Ifb93d61f25a64a64e1c1e689de792f27994487b6
2013-02-06 15:27:21 -08:00
Tadashi G. Takaoka b4598f7d05 Add unit tests tags
Bug: 8131968
Change-Id: Ibca5a0d63a492134b8af401a62ca3a5748e003cf
2013-02-04 17:21:30 -08:00
Jean Chalard 1ec8299389 Fix the build (again)
Change-Id: Idb7addede891a5c672d7fc09ddfe4d2585f8d647
2012-10-23 19:34:37 +09:00
Yuichiro Hanada 6def28d1da Make unit tests create temporary files in the cache directory.
Change-Id: I90791b364b441bc4e8e221d9e668372d15591719
2012-10-05 15:17:13 +09:00
Yuichiro Hanada d2579c4832 fix writeCharGroup.
Change-Id: Ib841afaba0a20c3b300eb7d3e9133243f9f3ae58
2012-10-05 14:54:17 +09:00
Yuichiro Hanada 3c6d9fe148 Add insertWord.
bug: 6669677

Change-Id: Ide55a4931071de9cd42c1cddae63ddd531d2feba
2012-10-04 17:19:47 +09:00
Yuichiro Hanada 82d9deaaf2 Combine mHasParentAddress with mHasLinkedListNode into mSupportsDynamicUpdate.
bug: 6669677

Change-Id: I82799af199358420f09ac34fc005091e202c5d3b
2012-09-24 13:17:44 +09:00
Yuichiro Hanada 66597f5e5f Add deleteWord.
bug: 6669677

Change-Id: I1a5b90ee05e5cffd74a5c140384a3e37c79e7e70
2012-09-21 12:40:07 +09:00
Yuichiro Hanada d36245fad2 Add getTerminalPosition.
Change-Id: If04d779db23b1aea2cc12e5e9b8cecfcb35a5737
2012-09-20 18:02:16 +09:00
Yuichiro Hanada 65feee12e5 Make BinaryDictIOUtils.
Change-Id: I45830235ee738233e8eb2bd91d659705b698f58c
2012-09-19 15:37:37 +09:00
Yuichiro Hanada bf45dc4860 Make writePlacedNode write the linked-list node.
Change-Id: I60feda815ea08cf73300fccca1ae12b97550f116
2012-09-18 21:20:07 +09:00
Yuichiro Hanada 7e35841053 Add tests for BinaryDictInputOutput.
Change-Id: I2ca66fd9a3568d5b6ece79d954095383d23a0a9f
2012-09-13 18:06:45 +09:00
Yuichiro Hanada 1a347723c5 Move FormatOptions and FileHeader to FormatSpec.
Change-Id: I232e35598635113bf2c81825669c744aadc79efe
2012-09-13 16:35:41 +09:00
Yuichiro Hanada be5db53a09 Add tests for readDictionaryBinary with byte array.
Change-Id: I2c2815e9d4867687fb3f5b0c661e6162b88c0a0c
2012-09-06 20:35:33 +09:00
Yuichiro Hanada 13b85c4167 Refactor BinaryDictIOTests.
Change-Id: I6eef88ab436f478a9255cc20ea59a24cd472807e
2012-09-06 18:06:14 +09:00
Yuichiro Hanada 6e422af881 Check shortcuts in checkDictionary.
Change-Id: I150913833e586bf7d3f0b9b2e796a61f89fa4f83
2012-09-06 16:32:54 +09:00
Yuichiro Hanada 2a2b5edc21 Change BinaryDictIOTests's package.
Change-Id: Ie9df2f7767cd925051c5e1fdcc325cc3359bca20
2012-09-05 18:23:53 +09:00