97 Commits (128f68acb28e704f3fbccb898233e10bba9871c4)

Author SHA1 Message Date
Tadashi G. Takaoka 6a1b37353d Fix dicttool build
This CL partially reverts
  - Id88b02b74bdfe4ca05b08181ceb6b34d5652fc0c
  - I05c7d8429e8d9a26139456763c77997340fea8c2
And followup (remove shortcut support)
  - I73b7dc008a5acaf75a31a36a2d332b5afabd82d0

Bug: 28255684
Test: make -j10 dicttool_aosp
Change-Id: I2e01ed86b9517a1141aee35ea6d8ef39258981d1
2018-10-29 15:59:05 +09:00
Tadashi G. Takaoka b0a0ce0b31 Remove unused import and local variable
Change-Id: I256d1c6bf96c07b10d2d063d7935e20e0ab8ea17
2014-11-28 11:53:12 +09:00
Adrian Velicu 9f46834839 dicttool header to read stream exhaustively
Change-Id: I50a286c115f5bd6e93763bd2f79031676d6fffd8
2014-11-11 18:10:26 -08:00
Adrian Velicu 1e72f9da12 Dicttool to handle unpackaging non-latest version dicts
Change-Id: I738735186213b3a40eff997ae2fd83069c6445f1
2014-11-11 16:35:04 -08:00
Adrian Velicu 0691f29d36 Merge "Making 'dicttool header' output format version"
2014-11-11 01:34:36 +00:00
Adrian Velicu 8e394ffcf4 Making 'dicttool header' output format version
Change-Id: I4198f6b463711feb4ab78020934cca4d23870fbb
2014-11-08 10:35:37 +09:00
Jean Chalard 5b91b551e5 Move util classes under common
Also why did we have two copies of LocaleUtils >.>

Bug: 18108776
Change-Id: I03b4403dfd51934e66b567f2f8b87da419cfb3ab
2014-11-07 18:00:03 +09:00
Jean Chalard 5b33d197ba Add a header command to dicttool.
This will allow us to greatly improve the performance of the
metadata-generating files, as they won't have to wait for
the info command to read the entire dictionary when the
header is all we need.

Also add tests, and while we're at it, use the seed as
intended to enable reproducible tests.

Change-Id: I0ba79ef62f0292b23e63aed57ff565bb102281a2
2014-11-06 18:50:59 +09:00
Jean Chalard f6b0e32df3 Add a *FAST* dictionary header reader.
It's still unused as of this change but the next change will use it

As a reference point, generating the metadata for Bayo takes
3'02" on my machine with the info command; it's down to 16" if
made to use this instead. The gains increase with the number
of dictionaries, obviously.

Change-Id: I0eeea2d8f81bb74b0d1570af658e91b56f7c2b79
2014-11-06 13:17:08 +09:00
Jean Chalard 5564317f83 Genericize getting a raw dictionary
This will allow for not copying the whole dictionary when only
the header is needed.

Change-Id: Ie4a649b507ccd4a430201824ed87b8b8bbf55e9f
2014-11-06 13:12:39 +09:00
Jean Chalard ae55db95a7 Large simplification in obtaining a raw dictionary
That is where the last refactorings were leading. This code is
simpler, but it's far more flexible. Importantly, it only makes
a single copy instead of making a full disk copy for every
intermediate step.
Next we're going to make the "copy" part modular for processes
that don't need to copy the whole file.

Change-Id: Ief32ac665d804b9b20c44f443a9c87452ceb367a
2014-11-05 12:27:35 +09:00
Keisuke Kuroyanagi 3cde19ded1 Merge "Initial commit for native dicttoolkit."
2014-10-31 11:29:20 +00:00
Keisuke Kuroyanagi e101a53ffc Initial commit for native dicttoolkit.
Bug: 10059681

Change-Id: Ib730af8ebc944e08aaada869c0626724a499747c
2014-10-31 20:27:06 +09:00
Adrian Velicu 7c87859d4c Using "blacklist" flag as "possibly offensive"
Bug: 11031090
Change-Id: I5cc0d006ab003656498eb82b0875eb9c051d331e
2014-10-31 14:33:05 +09:00
Tadashi G. Takaoka 067d8cdf56 Fix unit test breakage
Change-Id: I538288054a58eb2c81ce3cbe5c9bef900fb653a5
2014-10-24 16:48:46 +09:00
Jean Chalard 9e58ae4698 Merge "Some more simplification of DecoderSpec works"
2014-10-24 03:56:22 +00:00
Jean Chalard 40c11fdbff Merge "Simplify handling of steps in DecoderChainSpec"
2014-10-24 03:50:47 +00:00
Jean Chalard afdde63374 Some more simplification of DecoderSpec works
Change-Id: I23fa4e4ed96228406e70aa94d84fd7b8d3f69347
2014-10-23 16:57:14 +09:00
Jean Chalard 52e92b8a3f Simplify handling of steps in DecoderChainSpec
This is a preliminary refactoring change to improve performance
in dicttool diagnostic tools.

Change-Id: I9a59328af62e336809246be5bebbbf2e154366b3
2014-10-23 16:57:11 +09:00
Tadashi G. Takaoka 92d073c2fd Remove unused import and method
Bug: 18003991
Change-Id: Id6b67bf66b397301e5186826dba2b60df9cb4c65
2014-10-23 16:37:07 +09:00
Tadashi G. Takaoka d3a4c51324 Fix Javadoc and null analysis related warnings
This CL also adds @SuppressWarnings("unused") to java-overridable package.

Bug: 18003991
Change-Id: If70527e30654384705d7a814f5efd181d9f539e1
2014-10-23 09:58:42 +09:00
Jean Chalard 90aa229f01 Remove XML input/output from dicttool.
This hasn't been used for a while. It's deprecated. Let's kill it.

Change-Id: Ib1c491fa14b6406f6f77f2b0869f4db1810eb078
2014-10-22 17:28:33 +09:00
Tadashi G. Takaoka 5f00fe09e9 Fix some compiler warnings
This CL fixes the following compiler warnings.

- Indirect access to static member
- Access to a non-accessible member of an enclosing type
- Parameter assignment
- Method can be static
- Local variable declaration hides another field or variable
- Value of local variable is not used
- Unused import
- Unused private member
- Unnecessary 'else' statement
- Unnecessary declaration of throw exception
- Redundant type arguments
- Missing '@Override' annotation
- Unused '@SuppressWarning' annotations

Bug: 18003991
Change-Id: Icfebe753e53a2cc621848f769d6a3d7ce501ebc7
2014-10-21 19:28:37 +09:00
Adrian Velicu 05172bf1a5 Renaming "blacklist" flag to "possibly offensive"
No behaviour changes.
Unified the overloaded FusionDictionary::add method to always take an
isPossiblyOffensive argument.

Bug: 11031090
Change-Id: I5741a023ca1ce842d2cf10d4f6c926b0efabaa78
2014-10-21 11:51:47 +09:00
Akifumi Yoshimoto 7e5614520a Merge "Include a code point table in the binary dictionary."
2014-10-02 08:55:18 +00:00
Akifumi Yoshimoto 9168ab60cf Include a code point table in the binary dictionary.
Bug: 17097992
Change-Id: I677a5eb3a704e4386f6573360e44ca335d81d2df
2014-10-02 12:27:49 +09:00
Keisuke Kuroyanagi c6a6f6a990 Introduce NgramProperty in Java side.
Bug: 14425059
Change-Id: I8b3458ad22730b3dccbe0caea2c5930f5276dc82
2014-10-01 11:21:08 +09:00
Akifumi Yoshimoto f4329f7fff Read dicttool option for switching code point table
Bug: 17097992
Change-Id: I0b3f12c4450f784b9a33470d1dc4c306062de91e
2014-09-26 15:15:10 +09:00
Tadashi G. Takaoka fec4769e0b Refactor dicttool with try-with-resource
This CL must be checked in together with Idd7c744d0f.

Change-Id: Ia0ff09a054c1852b39cdce22a4377108afb254e2
2014-06-22 23:20:37 -07:00
Tadashi G. Takaoka a91561aa58 Use Java 7 diamond operator
Change-Id: If16ef50ae73147594615d0f49d6a22621eaf1aef
2014-05-24 01:05:42 +09:00
Jean Chalard 7086d88d3e Have dicttool test tidy up after itself.
Bug: 13776363
Change-Id: Icb1d3fc0efe71e0339b434928e8aed507f2fb590
2014-05-23 19:56:57 +09:00
Keisuke Kuroyanagi 93cda5bb39 Move code only used for dicttool and tests under tests.
Bug: 13035567
Change-Id: I13c6df013ef2b67c9bf67455d9c32d283bf9ea2e
2014-03-27 15:30:32 +09:00
Keisuke Kuroyanagi f14cf3e64c Fix: dicttool build.
Change-Id: I5c3bcbe9f3054bdd1a760398fe11344e0e05ac6a
2014-03-07 13:01:48 +00:00
Keisuke Kuroyanagi 3ad4af2354 Move DictionaryOptions from FusionDictionary to FormatSpec.
Bug: 8187060
Bug: 13035567

Change-Id: Id4f45e589521ae98c926a4c0607be10ce1a983f2
2014-03-06 18:53:09 +09:00
Keisuke Kuroyanagi 516f86815d Separate WeightedString from FusionDictionary.
Bug: 8187060

Change-Id: I40c1dafca3eb52244c64fdb4c1db30a56385d678
2014-03-06 18:53:06 +09:00
Keisuke Kuroyanagi 36305d4207 Fix: dicttool build.
Change-Id: I592b14eba895786d0981586a01ef545e003396c8
2014-02-28 19:04:49 +09:00
Jean Chalard 890b44e537 Correctly read the header of APK-embedded dicts
Bug: 13164518
Change-Id: I8768ad887af8b89ad9f29637f606c3c68629c7ca
2014-02-24 22:54:01 +09:00
Keisuke Kuroyanagi 8e3a1d0f89 Remove unused argument from readDictionaryBinary.
Bug: 12810574
Change-Id: Ice415ebd8d11162facca3fe8927ef8a616b11424
2014-02-14 19:02:15 +09:00
Keisuke Kuroyanagi 69ccac6e51 Remove unused code.
Bug: 12810574
Change-Id: If0ef02a984469a3b6e0c00b1c3c8d98d0d2b5466
2014-02-10 15:05:11 +09:00
Keisuke Kuroyanagi 8ffc631826 Make PtNode have ProbabilityInfo instead of raw value.
Bug: 11281877
Bug: 12810574
Change-Id: Id1cda0afc74c4e30633c735729143491b2274a7b
2014-02-10 15:05:08 +09:00
Keisuke Kuroyanagi b24de426fc Use CombinedFormatUtils to convert dict elements to strings.
Bug: 11281877
Bug: 12810574
Change-Id: Ib631f75eab73abc9877a7698171c45e8f2fc7600
2014-02-06 16:09:25 +09:00
Keisuke Kuroyanagi 5f5feeba13 Consolidate WordProperty and Word.
Bug: 11281877
Bug: 12810574
Change-Id: I9dc99188f80f25a8780c1860dab46e4aa80a23e5
2014-02-06 15:13:33 +09:00
Keisuke Kuroyanagi df1d3e733e Make WeightedString have ProbabilityInfo.
Bug: 11281877
Bug: 12810574
Change-Id: I265e3d8654c75766cd0e0d09d67ef62b4566298a
2014-02-05 21:44:55 +09:00
Keisuke Kuroyanagi c2fd53ee0e Remove ver4 dict updater.
Change-Id: I468994c98d091be621b9fb3fbe6405c67fc6a465
2013-12-17 18:17:51 +09:00
Jean Chalard b868375763 Fix failing tests
- Version 3 is not supported
- Now passing the right string to open v4 dicts. Fix the tests for this.

Change-Id: I7829330c3568a715b96396ba4e4e69c6e17775ab
2013-12-16 14:32:19 +09:00
Jean Chalard a245d15da5 Have dicttool use the native library to generate v4 dicts.
Yay !

Change-Id: Iea8ced9e81031b9ab7eff05ad9ef7215be248de9
2013-12-13 18:18:20 +09:00
Jean Chalard 7b55cd3e2b Remove flags from Java side.
This simplifies the code quite a bit.
- GERMAN_UMLAUTS are now handled through a key-value attribute.
  The dictionary generator does not need to know about it any more.
- FRENCH_LIGATURES are deprecated as we handle them with shortcuts now.
- CONTAINS_BIGRAMS is deprecated. Bigram processing is always applied
  regardless of this flag.

Bug: 11281748
Change-Id: If567e52e245a9342adc7f3104a0f7d8d782df8c1
2013-12-13 18:15:05 +09:00
Ken Wakasa 2fa3693c26 Reset to 9bd6dac470
The bulk merge from -bayo to klp-dev should not have been merged to master.

Change-Id: I527a03a76f5247e4939a672f27c314dc11cbb854
2013-12-13 17:13:32 +09:00
Yuichiro Hanada 73665510ca Show more messages when reading a compressed combined format file.
Change-Id: I51a1b9454fcfe656e0fcf762dcfd9ecbadde86c3
2013-10-08 17:05:39 +09:00
Yuichiro Hanada 48e01ec111 Make dicttool read the compressed combined format.
Change-Id: Ib39fa110402895a655f4e705caae53397ace9259
2013-09-30 14:59:19 +09:00