LatinIME

Author	SHA1	Message	Date
Adrian Velicu	8dd31a28ae	Update dictionaries (possibly_offensive flag) Correctly encoding possibly offensive words with their correct frequency and the possibly_offensive flag set. Continuing to encode with zero frequency only distracters or words that should never come up. https://paste.googleplex.com/5167060875214848 Bug: 11031090 Change-Id: Ia394b1827f292ff8d4791cc2f3e6e50b5aff4cbe	2014-10-31 14:49:24 +09:00
Jean Chalard	004cec01a9	Update all dicts to version 44. Bug: 13164302 Change-Id: I8dc1a839c7dcfaa08a53e26cb6600e9f871447ce	2014-02-24 21:27:25 +09:00
Jean Chalard	a267ebed5a	Update dictionaries Add KitKat to all dictionaries. Version da, fi, pl : 29 → 40 cs, de, hr, it, lt, lv, nb, nl, sl, sr, sv, tr : 35 → 40 es : 36 → 40 en_gb, en_us, en, fr, pt_br, pt_pt : 39 → 40 Bug: 10958192 Change-Id: I14436616285ced5eb3b70b8c44b9243da94eed4f	2013-09-30 07:12:03 +00:00
Jean Chalard	50b36e2a4b	Update dictionaries >>> dictionaries/en_GB_wordlist.combined.gz Header : date : 1374721653 <=> 1380099152 version : 36 <=> 39 Body : Freq changed: gay 127 -> 10 Added: draft 138 >>> dictionaries/en_US_wordlist.combined.gz Header : date : 1374721654 <=> 1380099152 version : 36 <=> 39 Body : Freq changed: gay 127 -> 10 >>> dictionaries/en_wordlist.combined.gz Header : date : 1374721663 <=> 1380099172 version : 36 <=> 39 Body : Freq changed: gay 127 -> 10 >>> dictionaries/fr_wordlist.combined.gz Header : date : 1376888819 <=> 1380099153 version : 37 <=> 39 Body : Added: septembre 150 >>> dictionaries/pt_BR_wordlist.combined.gz Header : date : 1376884524 <=> 1380099168 version : 37 <=> 39 Body : Freq changed: atras 87 -> 0 Not a word: atras false -> true Shortcut added: atras atrás 15 Shortcut added: cade cadê 15 Shortcut added: cafe café 15 Shortcut added: ferias férias 15 Shortcut added: musica música 15 Shortcut added: musicas músicas 15 >>> dictionaries/pt_PT_wordlist.combined.gz Header : date : 1376884536 <=> 1380099168 version : 37 <=> 39 Body : Shortcut added: atras atrás 15 Shortcut added: cade cadê 15 Shortcut added: ferias férias 15 Shortcut added: musica música 15 Shortcut added: musicas músicas 15 Added: cafe 0 >>> java/res/raw/main_en.dict Header : date : 1374721663 <=> 1380099172 version : 36 <=> 39 Body : Freq changed: gay 127 -> 10 >>> java/res/raw/main_fr.dict Header : date : 1376888819 <=> 1380099153 version : 37 <=> 39 Body : Added: septembre 150 >>> java/res/raw/main_pt_br.dict Header : date : 1376884524 <=> 1380099168 version : 37 <=> 39 Body : Freq changed: atras 87 -> 0 Not a word: atras false -> true Shortcut added: atras atrás 15 Shortcut added: cade cadê 15 Shortcut added: cafe café 15 Shortcut added: ferias férias 15 Shortcut added: musica música 15 Shortcut added: musicas músicas 15 Bug: 10504313 Bug: 10507536 Bug: 10561100 Change-Id: I4267c76cf0de221a703523d5f2dd2befbaf020a0	2013-09-26 08:34:53 +00:00
Jean Chalard	5937c03f15	Update dictionaries Bug: 10354668 Bug: 10188528 >>> dictionaries/fr_wordlist.combined.gz Header : date : 1374634549 <=> 1376888819 version : 36 <=> 37 Body : Deleted: color 78 Deleted: men 85 Deleted: o 115 Added: nationaux 120 >>> dictionaries/iw_wordlist.combined.gz Added. New dictionary. >>> dictionaries/pt_BR_wordlist.combined.gz Header : date : 1374634563 <=> 1376884524 version : 36 <=> 37 Body : Deleted: la 152 >>> dictionaries/pt_PT_wordlist.combined.gz Header : date : 1357790930 <=> 1376884536 version : 30 <=> 37 Body : Deleted: la 152 >>> dictionaries/ru_wordlist.combined.gz Header : date : 1372393835 <=> 1376897704 version : 35 <=> 37 Body : Freq changed: говно 68 -> 0 >>> java/res/raw/main_fr.dict Header : date : 1374634549 <=> 1376888819 version : 36 <=> 37 Body : Deleted: color 78 Deleted: men 85 Deleted: o 115 Added: nationaux 120 >>> java/res/raw/main_pt_br.dict Header : date : 1374634563 <=> 1376884524 version : 36 <=> 37 Body : Deleted: la 152 >>> java/res/raw/main_ru.dict Header : date : 1372393835 <=> 1376897704 version : 35 <=> 37 Body : Freq changed: говно 68 -> 0 Change-Id: I87a85571c61068ff46a32d291aa43becbb75598a	2013-08-19 16:41:09 +09:00
Jean Chalard	f0046aea26	Update dictionaries en, en_GB, en_US: Add "id" -> "I'd" whitelist entry Reinstate "id" and "ID" in the respective dicts fr: Remove many words that are not French Change "google" to "Google" pt_BR: Delete "idéia" Change-Id: I942266ac7995345580926f60de45d202aa257ae7	2013-07-24 12:10:06 +09:00
Jean Chalard	84f932be73	Add words to Portuguese >>> dictionaries/pt_BR_wordlist.combined.gz Header : date : 1355802839 <=> 1357790917 version : 29 <=> 30 Body : Added: à 30 Added: é 30 Added: ò 30 Added: ô 30 >>> dictionaries/pt_PT_wordlist.combined.gz Header : date : 1355802856 <=> 1357790930 version : 29 <=> 30 Body : Added: à 30 Added: é 30 Added: ò 30 Added: ô 30 >>> java/res/raw/main_pt_br.dict Header : date : 1355802839 <=> 1357790917 version : 29 <=> 30 Body : Added: à 30 Added: é 30 Added: ò 30 Added: ô 30 Bug: 7966948 Change-Id: I71c0986cf616d67926d0a6a0e53099b04b0427d5	2013-01-10 14:14:17 +09:00
Jean Chalard	21dbe3701c	Update dictionaries cs, da, de, el, es, fi, fr, hr, it, lt, lv, nb, nl, pl, pt_BR, pt_PT, sl, sr, sv, tr : rescale frequencies to match spec. This has no large effect in practice, except that the dictionary will become stronger vs the spatial model (especially in lower count corpora, like lt, lv, sr) en* : Small changes (rounding going the other way essentially) ru : the above rescaling, and remove the following words: Дре, ОСТа, Планше, легкими, легком, легкому, легкости, легкую, нелегкие, нелегкий, нелегким, нелегкое, нелегкой, нелегкую, полулегком and add нелёгкие, нелёгкое, нелёгкую; other accented forms were already in the dictionary. Change-Id: I40386c2ebd4d2be38874e822bde89db7cb512ae6	2012-12-18 13:06:48 +09:00
Jean Chalard	a424ff06ec	Switch the AOSP word lists to the combined format. This will help with managing the word lists. Bug: 7388859 Change-Id: I89f049569b177d3027fe56d6c67eaca27d44dc7d	2012-10-31 18:52:00 +09:00

9 Commits (6d4054689ca9649455106841630c72397d939cae)