Merge "Switch the AOSP word lists to the combined format."
commit 63b66b6f39
51 changed files with 38 additions and 17 deletions
BIN
dictionaries/cs_wordlist.combined.gz
Normal file
Binary file not shown.
Binary file not shown.
BIN
dictionaries/da_wordlist.combined.gz
Normal file
Binary file not shown.
Binary file not shown.
BIN
dictionaries/de_wordlist.combined.gz
Normal file
Binary file not shown.
Binary file not shown.
BIN
dictionaries/el_wordlist.combined.gz
Normal file
Binary file not shown.
Binary file not shown.
BIN
dictionaries/en_GB_wordlist.combined.gz
Normal file
Binary file not shown.
BIN
dictionaries/en_US_wordlist.combined.gz
Normal file
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
BIN
dictionaries/en_wordlist.combined.gz
Normal file
Binary file not shown.
Binary file not shown.
BIN
dictionaries/es_wordlist.combined.gz
Normal file
Binary file not shown.
Binary file not shown.
BIN
dictionaries/fi_wordlist.combined.gz
Normal file
Binary file not shown.
Binary file not shown.
BIN
dictionaries/fr_wordlist.combined.gz
Normal file
Binary file not shown.
Binary file not shown.
BIN
dictionaries/hr_wordlist.combined.gz
Normal file
Binary file not shown.
Binary file not shown.
BIN
dictionaries/it_wordlist.combined.gz
Normal file
Binary file not shown.
Binary file not shown.
BIN
dictionaries/lt_wordlist.combined.gz
Normal file
Binary file not shown.
Binary file not shown.
BIN
dictionaries/lv_wordlist.combined.gz
Normal file
Binary file not shown.
Binary file not shown.
BIN
dictionaries/nb_wordlist.combined.gz
Normal file
Binary file not shown.
Binary file not shown.
BIN
dictionaries/nl_wordlist.combined.gz
Normal file
Binary file not shown.
Binary file not shown.
BIN
dictionaries/pl_wordlist.combined.gz
Normal file
Binary file not shown.
Binary file not shown.
BIN
dictionaries/pt_BR_wordlist.combined.gz
Normal file
Binary file not shown.
BIN
dictionaries/pt_PT_wordlist.combined.gz
Normal file
Binary file not shown.
Binary file not shown.
Binary file not shown.
BIN
dictionaries/ru_wordlist.combined.gz
Normal file
Binary file not shown.
Binary file not shown.
38
dictionaries/sample.combined
Normal file
@@ -0,0 +1,38 @@
+# This is a sample wordlist that can be converted to a binary dictionary
+# for use by the Latin IME.
+# The file is essentially a CSV file, with indent level denoting nesting.
+#
+# The file starts with a single CSV line with the header attributes. Whatever
+# the content, these are included as is in the binary file. The first attribute
+# of the file should be `dictionary'. Usual fields are `locale', `description',
+# `date', `version', `options'.
+#
+# Each word has a `word' entry and at least a `f' argument denoting its
+# probability, as an integer between 0 and 255 on a logarithmic scale, with
+# 255 meaning 1 and each decrement in 1 dividing probability by 1.15.
+# As a special case, a weight of 0 is taken to mean profanity - words that
+# should not be considered a typo, but that should never be suggested
+# explicitly. An entry may be made not a word by adding a `not_a_word'
+# field with a value of `true'. The main reason for putting such entries
+# into the dictionary is to add shortcut targets and maybe a whitelist
+# replacement.
+#
+# Each word may or may not have any number of shortcut target lines
+# starting with a `shortcut' entry and having at least a `f' frequency
+# value between 0 and 14, or the special value `whitelist' which becomes
+# 15, which is then taken to be the whitelist target of this word.
+#
+# Each word may also have any number of bigram lines starting with a
+# `bigram' entry containing the following word whose frequency should
+# override the unigram frequency when following the word this bigram is
+# for.
+#
+dictionary=main:en,locale=en,description=Sample wordlist,date=1351495318,version=1
+ word=sample,f=200
+  bigram=wordlist,f=243
+ word=wordlist,f=180
+ word=shortcut,f=176
+  shortcut=target,f=10
+ word=witelisted,f=10,not_a_word=true
+  shortcut=whitelisted,f=whitelist
+ word=profanity,f=0
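The combined format described in the header comment of sample.combined can be read with a few lines of code. What follows is a minimal Python sketch, not the Latin IME tooling: the names `Word`, `parse_combined`, and `probability` are hypothetical, the parser assumes no field value contains a comma, and it attaches each `bigram` or `shortcut` line to the most recent `word` line instead of checking the indent level. The `probability` helper applies the scale stated in the comment (255 means 1, each decrement divides by 1.15) and does not model the special meaning of `f=0`.

```python
from dataclasses import dataclass, field

WHITELIST_F = 15  # per the file comment, `whitelist` "becomes 15"


@dataclass
class Word:
    text: str
    f: int
    not_a_word: bool = False
    bigrams: dict = field(default_factory=dict)    # following word -> f
    shortcuts: dict = field(default_factory=dict)  # shortcut target -> f (0..15)


def parse_fields(line):
    """Split one line of comma-separated key=value pairs into ordered pairs.

    Assumption: values never contain a comma or need escaping.
    """
    return [tuple(part.split("=", 1)) for part in line.strip().split(",")]


def parse_combined(text):
    """Return (header dict, list of Word) for a combined-format word list."""
    header, words = None, []
    for raw in text.splitlines():
        if not raw.strip() or raw.lstrip().startswith("#"):
            continue  # skip blank lines and comments
        pairs = parse_fields(raw)
        kind, value = pairs[0]
        attrs = dict(pairs[1:])
        if kind == "dictionary":
            header = dict(pairs)
        elif kind == "word":
            words.append(Word(value, int(attrs["f"]),
                              attrs.get("not_a_word") == "true"))
        elif kind == "bigram":
            # Simplification: nesting is inferred from line order, not indent.
            words[-1].bigrams[value] = int(attrs["f"])
        elif kind == "shortcut":
            f = attrs["f"]
            words[-1].shortcuts[value] = WHITELIST_F if f == "whitelist" else int(f)
        else:
            raise ValueError(f"unexpected entry type: {kind!r}")
    return header, words


def probability(f):
    """Map f in 1..255 to a probability: 255 -> 1, each step down divides by 1.15."""
    return 1.15 ** (f - 255)


SAMPLE = """\
dictionary=main:en,locale=en,description=Sample wordlist,date=1351495318,version=1
 word=sample,f=200
  bigram=wordlist,f=243
 word=wordlist,f=180
 word=shortcut,f=176
  shortcut=target,f=10
 word=witelisted,f=10,not_a_word=true
  shortcut=whitelisted,f=whitelist
 word=profanity,f=0
"""

if __name__ == "__main__":
    header, words = parse_combined(SAMPLE)
    print(header["locale"], [w.text for w in words])
    print(probability(words[0].f))
```

Inferring nesting from line order keeps the sketch short; a stricter reader would compare indent levels and reject a `bigram` or `shortcut` line that is not nested under a word.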
@@ -1,17 +0,0 @@
-<!-- This is a sample wordlist that can be converted to a binary dictionary
-for use by the Latin IME.
-The format of the word list is a flat list of word entries.
-Each entry has a frequency between 255 and 0.
-Highest frequency words get more weight in the prediction algorithm. As a
-special case, a weight of 0 is taken to mean profanity - words that should
-not be considered a typo, but that should never be suggested explicitly.
-You can capitalize words that must always be capitalized, such as "January".
-You can have a capitalized and a non-capitalized word as separate entries,
-such as "robin" and "Robin".
--->
-<wordlist>
-<w f="255">this</w>
-<w f="255">is</w>
-<w f="128">sample</w>
-<w f="1">wordlist</w>
-</wordlist>
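To illustrate how the removed flat XML format relates to the combined format, here is a hedged Python sketch that rewrites `<w f="...">` entries as `word=...,f=...` lines. It is not the conversion used for this commit. The function name `xml_to_combined` and the header values passed to it are hypothetical, the one-space indent for word lines is an assumption, and so is carrying each `f` value over unchanged, since the source does not say the two scales are identical.

```python
import xml.etree.ElementTree as ET

# The entries from the removed XML sample, without its leading comment.
OLD_SAMPLE = """<wordlist>
<w f="255">this</w>
<w f="255">is</w>
<w f="128">sample</w>
<w f="1">wordlist</w>
</wordlist>"""


def xml_to_combined(xml_text, header):
    """Render an old-style XML word list as combined-format text.

    `header` is an ordered mapping whose first key should be `dictionary`,
    as the combined format's comment requires. Assumptions: f values are
    copied as is, and word lines are indented by one space.
    """
    root = ET.fromstring(xml_text)
    lines = [",".join(f"{key}={value}" for key, value in header.items())]
    for w in root.iter("w"):
        lines.append(f" word={w.text},f={int(w.get('f'))}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical header values, chosen only for the demonstration.
    print(xml_to_combined(OLD_SAMPLE,
                          {"dictionary": "main:en", "locale": "en", "version": "1"}))
```

The old format has no place for bigrams, shortcuts, or `not_a_word`, so a conversion along these lines can only produce plain `word` entries.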
BIN
dictionaries/sl_wordlist.combined.gz
Normal file
Binary file not shown.
Binary file not shown.
BIN
dictionaries/sr_wordlist.combined.gz
Normal file
Binary file not shown.
Binary file not shown.
BIN
dictionaries/sv_wordlist.combined.gz
Normal file
Binary file not shown.
Binary file not shown.
BIN
dictionaries/tr_wordlist.combined.gz
Normal file
Binary file not shown.
Binary file not shown.