15 Commits (3e91b6f793df6364c1dc31eb4bc313a2d5c1f1ab)

Author SHA1 Message Date
Jean Chalard bb0d93c4b0 Update dictionaries
>>> dictionaries/es_wordlist.combined.gz
Header :
  date : 1403847862 <=> 1404131686
  version : 48 <=> 49
Body :
Added: apurate 50
Added: bondi 50
Added: chamuyar 50
Added: conocela 50
Added: conocelo 50
Added: conoceme 50
Added: conocenos 50
Added: conocete 50
Added: copate 50
Added: creele 50
Added: creeme 50
Added: creenos 50
Added: creete 50
Added: creiste 50
Added: creés 50
Added: dale 50
Added: dame 50
Added: danos 50
Added: decile 50
Added: decime 50
Added: decinos 50
Added: estate 50
Added: hablale 50
Added: hablales 50
Added: hablame 50
Added: hablanos 50
Added: hablate 50
Added: hablá 50
Added: hacele 50
Added: haceme 50
Added: hacenos 50
Added: hacete 50
Added: hacés 50
Added: llegás 50
Added: llevale 50
Added: llevame 50
Added: llevanos 50
Added: llevate 50
Added: llevá 50
Added: llevás 50
Added: parecé 50
Added: parecés 50
Added: pasala 50
Added: pasale 50
Added: pasales 50
Added: pasalo 50
Added: pasame 50
Added: pasanos 50
Added: pasate 50
Added: pasás 50
Added: podés 50
Added: ponele 50
Added: poneme 50
Added: ponenos 50
Added: ponete 50
Added: quedá 50
Added: querela 50
Added: querelo 50
Added: quereme 50
Added: querenos 50
Added: querete 50
Added: querés 50
Added: rascate 50
Added: sabelo 50
Added: sabés 50
Added: tenele 50
Added: teneme 50
Added: tenenos 50
Added: tenete 50
Added: tenés 50

>>> java/res/raw/main_es.dict
Header :
  date : 1403847862 <=> 1404131686
  version : 48 <=> 49
Body :
Same changes

Bug: 8010862
Change-Id: I98fc8542e21e35a7c80b332148c461144425e61a
2014-07-01 18:19:30 +09:00
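
The `date` values in the headers above look like Unix timestamps (seconds since 1970-01-01 UTC); that reading is an assumption, but it is consistent with the commit dates shown in this log. A minimal Python sketch for making them readable follows. The helper name `header_date_to_utc` is hypothetical and is not part of any dictionary tooling.

```python
from datetime import datetime, timezone


def header_date_to_utc(seconds: int) -> str:
    """Render a header `date` value as an ISO 8601 UTC timestamp.

    Assumes the value is a count of seconds since the Unix epoch.
    """
    return datetime.fromtimestamp(seconds, tz=timezone.utc).isoformat()


# The two sides of "date : 1403847862 <=> 1404131686" from the commit above.
old_date, new_date = 1403847862, 1404131686
print(header_date_to_utc(old_date))  # 2014-06-27T05:44:22+00:00
print(header_date_to_utc(new_date))  # 2014-06-30T12:34:46+00:00
```

Under that assumption, the old value falls a few minutes before the previous commit (2014-06-27 14:56:29 +09:00), which suggests the header date records when the word list was generated rather than when it was committed.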
Jean Chalard a70b710c9d Update the Spanish dictionary
>>> dictionaries/es_wordlist.combined.gz
Header :
  date : 1403153360 <=> 1403847862
  version : 47 <=> 48
Body :
Added: bañate 30
Added: correte 30
Added: duchate 30
Added: mostrame 40
Added: muestrame 40
Added: prestame 40
Added: sos 100

>>> java/res/raw/main_es.dict
Header :
  date : 1403153360 <=> 1403847862
  version : 47 <=> 48
Body :
Added: bañate 30
Added: correte 30
Added: duchate 30
Added: mostrame 40
Added: muestrame 40
Added: prestame 40
Added: sos 100

Bug: 8010862
Change-Id: I0a478b5fd5edfadea420f306dc9b2d98876c246e
2014-06-27 14:56:29 +09:00
Jean Chalard 75bc45cb12 Update dictionaries
>>> dictionaries/es_wordlist.combined.gz
Header :
  date : 1401802362 <=> 1403153360
  version : 45 <=> 47
Body :
Added: grandísimo 30

>>> java/res/raw/main_es.dict
Header :
  date : 1401802362 <=> 1403153360
  version : 45 <=> 47
Body :
Added: grandísimo 30

Bug: 15719556
Change-Id: Ifaa97d40d52a278e41f4dd1292781494d4eb939b
2014-06-23 16:56:00 +09:00
Jean Chalard ff3e488e1e Enrich the Spanish dictionary.
Enrich the dictionary with many words generated from stems
extracted from the dictionary and rules written by hand.
This adds 45,619 words to the dictionary. Hopefully almost none
of them are incorrect, though many are not very common.

Bug: 8010862
Change-Id: I51c7ebd16ff859ec1e765b0604dd1cfca159ab08
2014-06-03 22:48:19 +09:00
Jean Chalard 004cec01a9 Update all dicts to version 44.
Bug: 13164302
Change-Id: I8dc1a839c7dcfaa08a53e26cb6600e9f871447ce
2014-02-24 21:27:25 +09:00
Jean Chalard a267ebed5a Update dictionaries
Add KitKat to all dictionaries.
Version
da, fi, pl : 29 → 40
cs, de, hr, it, lt, lv, nb, nl, sl, sr, sv, tr : 35 → 40
es : 36 → 40
en_gb, en_us, en, fr, pt_br, pt_pt : 39 → 40

Bug: 10958192
Change-Id: I14436616285ced5eb3b70b8c44b9243da94eed4f
2013-09-30 07:12:03 +00:00
Jean Chalard 665e4ecc62 Update dictionaries
>>> dictionaries/en_GB_wordlist.combined.gz
Header :
  date : 1374634548 <=> 1374721653
Body :
Added: Caltrain 30

>>> dictionaries/en_US_wordlist.combined.gz
Header :
  date : 1374634548 <=> 1374721654
Body :
Added: Caltrain 30

>>> dictionaries/en_wordlist.combined.gz
Header :
  date : 1374634568 <=> 1374721663
Body :
Added: Caltrain 30

>>> dictionaries/es_wordlist.combined.gz
Header :
  date : 1372393817 <=> 1374721654
  version : 35 <=> 36
Body :
Added: Caltrain 10

>>> java/res/raw/main_en.dict
Header :
  date : 1374634568 <=> 1374721663
Body :
Added: Caltrain 30

>>> java/res/raw/main_es.dict
Header :
  date : 1372393817 <=> 1374721654
  version : 35 <=> 36
Body :
Added: Caltrain 10

Bug: 9995706
Change-Id: Icf96bf01e45ef94d3ffd6d6a9d6431c52f0f5a86
2013-07-25 12:48:55 +09:00
Jean Chalard ffe7dbbe7a Update dictionaries
>>> dictionaries/cs_wordlist.combined.gz
Header :
  date : 1355802831 <=> 1372393817
  version : 29 <=> 35
Body :
Added: LTE 25

>>> dictionaries/de_wordlist.combined.gz
Header :
  date : 1355802835 <=> 1372393817
  version : 29 <=> 35
Body :
Added: LTE 25

>>> dictionaries/en_GB_wordlist.combined.gz
Header :
  date : 1366272052 <=> 1372393817
  version : 31 <=> 35
Body :
Deleted: Sea 126
Added: LTE 25

>>> dictionaries/en_US_wordlist.combined.gz
Header :
  date : 1366272093 <=> 1372393817
  version : 31 <=> 35
Body :
Added: LTE 25

>>> dictionaries/en_wordlist.combined.gz
Header :
  date : 1366272977 <=> 1372393837
  version : 31 <=> 35
Body :
Deleted: Sea 126
Added: LTE 25

>>> dictionaries/es_wordlist.combined.gz
Header :
  date : 1355802832 <=> 1372393817
  version : 29 <=> 35
Body :
Added: LTE 25

>>> dictionaries/fr_wordlist.combined.gz
Header :
  date : 1366272255 <=> 1372393818
  version : 31 <=> 35
Body :
Deleted: R'n'B 95
Deleted: count 60
Deleted: d'Inti 34
Added: beurk 25

>>> dictionaries/hr_wordlist.combined.gz
Header :
  date : 1355802836 <=> 1372393818
  version : 29 <=> 35
Body :
Added: LTE 25

>>> dictionaries/it_wordlist.combined.gz
Header :
  date : 1355802836 <=> 1372393818
  version : 29 <=> 35
Body :
Added: LTE 25

>>> dictionaries/lt_wordlist.combined.gz
Header :
  date : 1355802843 <=> 1372393818
  version : 29 <=> 35
Body :
Added: LTE 25

>>> dictionaries/lv_wordlist.combined.gz
Header :
  date : 1355802843 <=> 1372393818
  version : 29 <=> 35
Body :
Added: LTE 25

>>> dictionaries/nb_wordlist.combined.gz
Header :
  date : 1366003450 <=> 1372393818
  version : 31 <=> 35
Body :
Added: LTE 25

>>> dictionaries/nl_wordlist.combined.gz
Header :
  date : 1355802844 <=> 1372393818
  version : 29 <=> 35
Body :
Added: LTE 25

>>> dictionaries/ru_wordlist.combined.gz
Header :
  date : 1370244430 <=> 1372393835
  version : 34 <=> 35
Body :
Freq changed: связывание 93 -> 0

>>> dictionaries/sl_wordlist.combined.gz
Header :
  date : 1355802835 <=> 1372393835
  version : 29 <=> 35
Body :
Added: LTE 25

>>> dictionaries/sr_wordlist.combined.gz
Header :
  date : 1355802853 <=> 1372393835
  version : 29 <=> 35
Body :
Added: LTE 25

>>> dictionaries/sv_wordlist.combined.gz
Header :
  date : 1366003804 <=> 1372393836
  version : 31 <=> 35
Body :
Added: LTE 25

>>> dictionaries/tr_wordlist.combined.gz
Header :
  date : 1355802858 <=> 1372393837
  version : 29 <=> 35
Body :
Added: LTE 25

>>> java/res/raw/main_de.dict
Header :
  date : 1355802835 <=> 1372393817
  version : 29 <=> 35
Body :
Added: LTE 25

>>> java/res/raw/main_en.dict
Header :
  date : 1366272977 <=> 1372393837
  version : 31 <=> 35
Body :
Deleted: Sea 126
Added: LTE 25

>>> java/res/raw/main_es.dict
Header :
  date : 1355802832 <=> 1372393817
  version : 29 <=> 35
Body :
Added: LTE 25

>>> java/res/raw/main_fr.dict
Header :
  date : 1366272255 <=> 1372393818
  version : 31 <=> 35
Body :
Deleted: R'n'B 95
Deleted: count 60
Deleted: d'Inti 34
Added: beurk 25

>>> java/res/raw/main_it.dict
Header :
  date : 1355802836 <=> 1372393818
  version : 29 <=> 35
Body :
Added: LTE 25

>>> java/res/raw/main_ru.dict
Header :
  date : 1370244430 <=> 1372393835
  version : 34 <=> 35
Body :
Freq changed: связывание 93 -> 0

Bug: 9301610
Bug: 9607966
Change-Id: I1117ed85d97fbb0ee50f11bc31776f1970b56f12
2013-06-28 14:54:51 +09:00
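
The `Body` sections in this log list three kinds of change: `Added`, `Deleted`, and `Freq changed`. The Python sketch below shows one way such a summary could be derived from two word-to-frequency mappings. It only mirrors the line format seen above; it is not the tool that produced these messages, and the function name `diff_wordlists` and the dict-based input are assumptions made for illustration.

```python
def diff_wordlists(old: dict[str, int], new: dict[str, int]) -> list[str]:
    """Summarize the differences between two word -> frequency mappings.

    Output lines imitate the format used in the commit messages above.
    """
    lines = []
    # Words present only in the old list were deleted.
    for word in sorted(old.keys() - new.keys()):
        lines.append(f"Deleted: {word} {old[word]}")
    # Words present only in the new list were added.
    for word in sorted(new.keys() - old.keys()):
        lines.append(f"Added: {word} {new[word]}")
    # Words present in both lists may have a changed frequency.
    for word in sorted(old.keys() & new.keys()):
        if old[word] != new[word]:
            lines.append(f"Freq changed: {word} {old[word]} -> {new[word]}")
    return lines


# Illustrative input assembled from entries in the commit above; real
# word lists are per-language and far larger.
old = {"Sea": 126, "связывание": 93}
new = {"LTE": 25, "связывание": 0}
for line in diff_wordlists(old, new):
    print(line)
```

A rename such as `DoCoMo` to `Docomo` would show up as a `Deleted` line followed by an `Added` line, as in the 2012-12-10 commit further down.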
Jean Chalard 21dbe3701c Update dictionaries
cs, da, de, el, es, fi, fr, hr, it, lt, lv, nb, nl, pl,
pt_BR, pt_PT, sl, sr, sv, tr : rescale frequencies to match
the spec. In practice this has no large effect, except that the
dictionary becomes stronger relative to the spatial model
(especially for lower-count corpora such as lt, lv, sr).
en* : Small changes (essentially rounding going the other way)
ru : the above rescaling, and remove the following words:
Дре, ОСТа, Планше, легкими, легком, легкому, легкости,
легкую, нелегкие, нелегкий, нелегким, нелегкое, нелегкой,
нелегкую, полулегком and add нелёгкие, нелёгкое, нелёгкую;
other accented forms were already in the dictionary.

Change-Id: I40386c2ebd4d2be38874e822bde89db7cb512ae6
2012-12-18 13:06:48 +09:00
Jean Chalard d080986f93 Update dictionaries
>>> dictionaries/en_GB_wordlist.combined.gz
Header :
  date : 1354870724 <=> 1355112440
  version : 27 <=> 28
Body :
Deleted: DoCoMo 65
Added: Docomo 65
Added: KDDI 25
Added: Softbank 25

>>> dictionaries/en_US_wordlist.combined.gz
Header :
  date : 1354870736 <=> 1355112451
  version : 27 <=> 28
Body :
Deleted: DoCoMo 65
Added: Docomo 65
Added: KDDI 25
Added: Softbank 25

>>> dictionaries/en_wordlist.combined.gz
Header :
  date : 1354870744 <=> 1355112460
  version : 27 <=> 28
Body :
Deleted: DoCoMo 65
Added: Docomo 65
Added: KDDI 25
Added: Softbank 25

>>> dictionaries/es_wordlist.combined.gz
Header :
  date : 1351676002 <=> 1355117676
  version : 26 <=> 28
Body :
Deleted: DoCoMo 40
Added: Docomo 40
Added: KDDI 25
Added: Softbank 25

>>> dictionaries/fi_wordlist.combined.gz
Header :
  date : 1351676054 <=> 1355117691
  version : 26 <=> 28
Body :
Deleted: DoCoMo 28
Added: Docomo 28
Added: KDDI 25
Added: Softbank 25

>>> dictionaries/fr_wordlist.combined.gz
Header :
  date : 1354872988 <=> 1355117708
  version : 27 <=> 28
Body :
Deleted: DoCoMo 52
Added: Docomo 52
Added: KDDI 25
Added: Softbank 25

>>> dictionaries/pt_PT_wordlist.combined.gz
Header :
  date : 1351676510 <=> 1355117723
  version : 26 <=> 28
Body :
Deleted: DoCoMo 48
Added: Docomo 48
Added: Softbank 25

>>> java/res/raw/main_en.dict
Header :
  date : 1354870744 <=> 1355112460
  version : 27 <=> 28
Body :
Deleted: DoCoMo 65
Added: Docomo 65
Added: KDDI 25
Added: Softbank 25

>>> java/res/raw/main_es.dict
Header :
  date : 1353500806 <=> 1355117676
  version : 27 <=> 28
Body :
Deleted: DoCoMo 40
Added: Docomo 40
Added: KDDI 25
Added: Softbank 25

>>> java/res/raw/main_fr.dict
Header :
  date : 1354872988 <=> 1355117708
  version : 27 <=> 28
Body :
Deleted: DoCoMo 52
Added: Docomo 52
Added: KDDI 25
Added: Softbank 25

Change-Id: I3801cbe4535407f55ede8db327674d493a92d1ae
2012-12-10 14:52:43 +09:00
Jean Chalard d5f53710c5 Update dictionaries and fix mistakes
- Combined de dict :
  Remove digraph shortcuts that were in by mistake.
- Combined en dict :
  Set freq of "baton" "batons" "mace" "puff"
  "puffs" and "tasers" to zero. They are offensive
  in en_GB.
- Combined en_GB dict :
  Change freq of "il" to 0 and flag it "not a word". Still
  in the dict as a whitelist entry for "I'll"; for some
  reason it had freq 99.
  Add "milk:122" and "practice:143"
- Combined fr dict :
  Add missing words : "Nostradamus:40" "défendais:30"
  "gmail:50" "générale:140" "hm:0" "hmm:0" "y'en:130"
  "l'apocalypse:31" "m'épuise:30" "recontacter:80"
  "t'annonce:30"
  Set freq of non-word shortcuts for digraphs to 1 instead
  of 0, so that they can be entered by gesture.
- Combined ru dict :
  Remove a lot of two-character non-words.

- Binary de dict :
  Remove the obsolete "options" header, and add the "dictionary"
  header.
- Binary en dict :
  Flag "hoe" "hoes" "il" "shel" as non-words.
  Also drop freq of "il" and "shel" to 0
  Add the "locale" header that was missing.
- Binary es dict :
  Add the "dictionary" header.
- Binary fr dict :
  Add the same words as above. Non-word shortcuts were already
  set to 1.
- Binary it dict :
  Add a "dictionary" header. Also change freq of
  "Šarapova" from 50 to 37; not sure why it was 50.
- Binary pt_BR dict :
  Add a "dictionary" header.
- Binary ru dict :
  Add a "dictionary" header and remove the same words as above.

For all dictionaries : bump the version to 27.

Change-Id: I94fe7f8f42b31fdad223085c00a94115e14d2276
2012-11-21 22:03:24 +09:00
Jean Chalard d0cf96493c Use all Lexiteria sources and update existing directories.
New dictionaries :
- Danish
- Greek
- Finnish
- Lithuanian
- Latvian
- Dutch
- Polish
- Russian
- Slovene
- Serbian
- Swedish
- Turkish

Also, compress those files to reduce the footprint in the
repository.
Also, update and improve English and French dictionaries, and
add the ligatures shortcut into the French dictionary.
Finally, move the Russian binary dictionary here now that it
can at last be open sourced.

Bug: 5587752
Bug: 6775251
Bug: 6995793
Bug: 7149666
Change-Id: Iec9831d4dce425a2b5b0657571e4448436610525
2012-09-21 22:07:23 +09:00
Jean Chalard 80058c73cb Update AOSP dictionaries
Change-Id: Ia6bb1f9d6df4a9f859f132affc9cb030f14effd9
2012-05-22 16:12:50 +09:00
Jean Chalard 624150b11b Update dictionaries.
Bug: 6517432
Bug: 6525702
Change-Id: I47a8c4612bffb16971575b59e9e20fd0276a2f92
2012-05-22 11:29:33 +09:00
Jean Chalard 8fec807800 Add open-source-able word lists to AOSP.
Bug: 6458744
Change-Id: If28aeb7360ee7ec7408f55934ca2a684f032e338
2012-05-17 19:20:04 +09:00