Jean Chalard
66c96e8813
Update dictionaries
...
en* : add common app and Google product names
en_GB : also add "filters"
ru : add some missing words
Bug: 11043181
Bug: 12276653
Bug: 12953122
Change-Id: I6b62e681a07b7f0149a10ba4e05954e60d6212d4
2014-02-24 15:30:47 +09:00
Jean Chalard
155cb77231
Update dictionaries
...
This change has no effect on TRT results.
>>> dictionaries/en_US_wordlist.combined.gz
Header :
date : 1381226409 <=> 1389654051
version : 42 <=> 43
Body :
Added: dialogue 120
Added: dialogues 94
>>> dictionaries/fr_wordlist.combined.gz
Header :
date : 1381226409 <=> 1389654052
version : 42 <=> 43
Body :
Deleted: d'Orange 114
Added: d'orange 114
>>> dictionaries/it_wordlist.combined.gz
Header :
date : 1380519383 <=> 1389654052
version : 40 <=> 43
Body :
Freq changed: ciao 85 -> 180
>>> java/res/raw/main_fr.dict
Header :
date : 1381226409 <=> 1389654052
version : 42 <=> 43
Body :
Deleted: d'Orange 114
Added: d'orange 114
>>> java/res/raw/main_it.dict
Header :
date : 1380519383 <=> 1389654052
version : 40 <=> 43
Body :
Freq changed: ciao 85 -> 180
Bug: 12487270
Bug: 12344108
Change-Id: I94768e223d05ad2551a5508e9e01222a028665c4
2014-01-14 10:37:15 +09:00
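Note: the body lines in the entry above follow a regular pattern (`Added: <word> <freq>`, `Deleted: <word> <freq>`, `Freq changed: <word> <old> -> <new>`). The following is a minimal sketch of how such lines could be read back into structured records. The `Change` type and `parse_change` function are hypothetical names introduced here for illustration; they are not part of the dictionary tooling, and other line kinds that appear in this log (such as `Shortcut added:` or `Not a word:`) are deliberately left unhandled.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Change:
    kind: str                 # "added", "deleted", or "freq_changed"
    word: str
    old_freq: Optional[int]   # None when the word did not exist before
    new_freq: Optional[int]   # None when the word was deleted


# Patterns inferred from the log lines above (an assumption, not a spec).
_ADDED = re.compile(r"^Added: (.+) (\d+)$")
_DELETED = re.compile(r"^Deleted: (.+) (\d+)$")
_FREQ = re.compile(r"^Freq changed: (.+) (\d+) -> (\d+)$")


def parse_change(line: str) -> Optional[Change]:
    """Parse one body line; return None for headers and unknown lines."""
    line = line.strip()
    m = _ADDED.match(line)
    if m:
        return Change("added", m.group(1), None, int(m.group(2)))
    m = _DELETED.match(line)
    if m:
        return Change("deleted", m.group(1), int(m.group(2)), None)
    m = _FREQ.match(line)
    if m:
        return Change("freq_changed", m.group(1), int(m.group(2)), int(m.group(3)))
    return None
```

For example, `parse_change("Freq changed: ciao 85 -> 180")` yields a `freq_changed` record for `ciao` with old frequency 85 and new frequency 180, while `parse_change("Header :")` returns `None`.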
Jean Chalard
b1eedc6ba0
Update dictionaries
...
>>> dictionaries/en_GB_wordlist.combined.gz
Header :
date : 1381130519 <=> 1381226409
version : 41 <=> 42
Body :
Added: haha 45
>>> dictionaries/en_US_wordlist.combined.gz
Header :
date : 1380293342 <=> 1381226409
version : 40 <=> 42
Body :
Added: haha 45
>>> dictionaries/en_wordlist.combined.gz
Header :
date : 1380293363 <=> 1381226429
version : 40 <=> 42
Body :
Added: haha 45
>>> dictionaries/fr_wordlist.combined.gz
Header :
date : 1380519383 <=> 1381226409
version : 40 <=> 42
Body :
Freq changed: haha 0 -> 30
>>> java/res/raw/main_en.dict
Header :
date : 1380293363 <=> 1381226429
version : 40 <=> 42
Body :
Added: haha 45
>>> java/res/raw/main_fr.dict
Header :
date : 1380519383 <=> 1381226409
version : 40 <=> 42
Body :
Freq changed: haha 0 -> 30
Bug: 11114205
Change-Id: I39d429d24d93ee07a70d8613ce0752432b26acc4
2013-10-08 10:34:56 +00:00
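Note: the `date` header values in these entries look like Unix timestamps (seconds since 1970-01-01 UTC). That reading is an assumption, but it is consistent with the commit dates: under it, 1381226409 corresponds to 2013-10-08 10:00:09 UTC, roughly half an hour before the entry above was committed. A small sketch of the conversion, where `header_date_to_utc` is a hypothetical helper name:

```python
from datetime import datetime, timezone


def header_date_to_utc(value: int) -> datetime:
    """Interpret a dictionary 'date' header as Unix seconds (assumed)."""
    return datetime.fromtimestamp(value, tz=timezone.utc)


# Old and new 'date' values of en_GB in the entry above.
old = header_date_to_utc(1381130519)
new = header_date_to_utc(1381226409)
print(new.isoformat())
print("time between the two dictionary builds:", new - old)
```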
Jean Chalard
0ce97695dc
Update en_GB dictionary
...
Header :
date : 1380293342 <=> 1381130519
version : 40 <=> 41
Body :
Added: filter 115
Bug: 11076171
Change-Id: I4e88b38b61b794c58b645f7b39e28524d979caba
2013-10-07 17:58:38 +09:00
Jean Chalard
a267ebed5a
Update dictionaries
...
Add KitKat to all dictionaries.
Version
da, fi, pl : 29 → 40
cs, de, hr, it, lt, lv, nb, nl, sl, sr, sv, tr : 35 → 40
es : 36 → 40
en_gb, en_us, en, fr, pt_br, pt_pt : 39 → 40
Bug: 10958192
Change-Id: I14436616285ced5eb3b70b8c44b9243da94eed4f
2013-09-30 07:12:03 +00:00
Jean Chalard
50b36e2a4b
Update dictionaries
...
>>> dictionaries/en_GB_wordlist.combined.gz
Header :
date : 1374721653 <=> 1380099152
version : 36 <=> 39
Body :
Freq changed: gay 127 -> 10
Added: draft 138
>>> dictionaries/en_US_wordlist.combined.gz
Header :
date : 1374721654 <=> 1380099152
version : 36 <=> 39
Body :
Freq changed: gay 127 -> 10
>>> dictionaries/en_wordlist.combined.gz
Header :
date : 1374721663 <=> 1380099172
version : 36 <=> 39
Body :
Freq changed: gay 127 -> 10
>>> dictionaries/fr_wordlist.combined.gz
Header :
date : 1376888819 <=> 1380099153
version : 37 <=> 39
Body :
Added: septembre 150
>>> dictionaries/pt_BR_wordlist.combined.gz
Header :
date : 1376884524 <=> 1380099168
version : 37 <=> 39
Body :
Freq changed: atras 87 -> 0
Not a word: atras false -> true
Shortcut added: atras atrás 15
Shortcut added: cade cadê 15
Shortcut added: cafe café 15
Shortcut added: ferias férias 15
Shortcut added: musica música 15
Shortcut added: musicas músicas 15
>>> dictionaries/pt_PT_wordlist.combined.gz
Header :
date : 1376884536 <=> 1380099168
version : 37 <=> 39
Body :
Shortcut added: atras atrás 15
Shortcut added: cade cadê 15
Shortcut added: ferias férias 15
Shortcut added: musica música 15
Shortcut added: musicas músicas 15
Added: cafe 0
>>> java/res/raw/main_en.dict
Header :
date : 1374721663 <=> 1380099172
version : 36 <=> 39
Body :
Freq changed: gay 127 -> 10
>>> java/res/raw/main_fr.dict
Header :
date : 1376888819 <=> 1380099153
version : 37 <=> 39
Body :
Added: septembre 150
>>> java/res/raw/main_pt_br.dict
Header :
date : 1376884524 <=> 1380099168
version : 37 <=> 39
Body :
Freq changed: atras 87 -> 0
Not a word: atras false -> true
Shortcut added: atras atrás 15
Shortcut added: cade cadê 15
Shortcut added: cafe café 15
Shortcut added: ferias férias 15
Shortcut added: musica música 15
Shortcut added: musicas músicas 15
Bug: 10504313
Bug: 10507536
Bug: 10561100
Change-Id: I4267c76cf0de221a703523d5f2dd2befbaf020a0
2013-09-26 08:34:53 +00:00
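Note: in the pt_BR change above, `atras` is set to frequency 0, flagged as "not a word", and given a shortcut to `atrás`. The sketch below illustrates one way to read that combination: the unaccented form stays in the dictionary only as a key that points to the accented target. The `Entry` type, the `suggest` function, and the lookup rule are assumptions made for illustration; they do not reproduce the keyboard's actual suggestion logic.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Entry:
    freq: int
    not_a_word: bool = False
    # (target, shortcut frequency) pairs
    shortcuts: List[Tuple[str, int]] = field(default_factory=list)


# Only the entry whose full state is given in the log above.
PT_BR: Dict[str, Entry] = {
    "atras": Entry(freq=0, not_a_word=True, shortcuts=[("atrás", 15)]),
}


def suggest(typed: str, dictionary: Dict[str, Entry]) -> List[str]:
    """Return candidates for a typed string (hypothetical rule)."""
    entry = dictionary.get(typed)
    if entry is None:
        return []
    # Highest-frequency shortcut targets first.
    ordered = sorted(entry.shortcuts, key=lambda s: s[1], reverse=True)
    targets = [target for target, _ in ordered]
    # A non-word is never offered itself, only its shortcut targets.
    return targets if entry.not_a_word else [typed] + targets
```

Under these assumptions, `suggest("atras", PT_BR)` returns only `atrás`, which matches the apparent intent of demoting the unaccented spelling while keeping it as a route to the correct one.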
Jean Chalard
5937c03f15
Update dictionaries
...
Bug: 10354668
Bug: 10188528
>>> dictionaries/fr_wordlist.combined.gz
Header :
date : 1374634549 <=> 1376888819
version : 36 <=> 37
Body :
Deleted: color 78
Deleted: men 85
Deleted: o 115
Added: nationaux 120
>>> dictionaries/iw_wordlist.combined.gz
Added. New dictionary.
>>> dictionaries/pt_BR_wordlist.combined.gz
Header :
date : 1374634563 <=> 1376884524
version : 36 <=> 37
Body :
Deleted: la 152
>>> dictionaries/pt_PT_wordlist.combined.gz
Header :
date : 1357790930 <=> 1376884536
version : 30 <=> 37
Body :
Deleted: la 152
>>> dictionaries/ru_wordlist.combined.gz
Header :
date : 1372393835 <=> 1376897704
version : 35 <=> 37
Body :
Freq changed: говно 68 -> 0
>>> java/res/raw/main_fr.dict
Header :
date : 1374634549 <=> 1376888819
version : 36 <=> 37
Body :
Deleted: color 78
Deleted: men 85
Deleted: o 115
Added: nationaux 120
>>> java/res/raw/main_pt_br.dict
Header :
date : 1374634563 <=> 1376884524
version : 36 <=> 37
Body :
Deleted: la 152
>>> java/res/raw/main_ru.dict
Header :
date : 1372393835 <=> 1376897704
version : 35 <=> 37
Body :
Freq changed: говно 68 -> 0
Change-Id: I87a85571c61068ff46a32d291aa43becbb75598a
2013-08-19 16:41:09 +09:00
Jean Chalard
665e4ecc62
Update dictionaries
...
>>> dictionaries/en_GB_wordlist.combined.gz
Header :
date : 1374634548 <=> 1374721653
Body :
Added: Caltrain 30
>>> dictionaries/en_US_wordlist.combined.gz
Header :
date : 1374634548 <=> 1374721654
Body :
Added: Caltrain 30
>>> dictionaries/en_wordlist.combined.gz
Header :
date : 1374634568 <=> 1374721663
Body :
Added: Caltrain 30
>>> dictionaries/es_wordlist.combined.gz
Header :
date : 1372393817 <=> 1374721654
version : 35 <=> 36
Body :
Added: Caltrain 10
>>> java/res/raw/main_en.dict
Header :
date : 1374634568 <=> 1374721663
Body :
Added: Caltrain 30
>>> java/res/raw/main_es.dict
Header :
date : 1372393817 <=> 1374721654
version : 35 <=> 36
Body :
Added: Caltrain 10
Bug: 9995706
Change-Id: Icf96bf01e45ef94d3ffd6d6a9d6431c52f0f5a86
2013-07-25 12:48:55 +09:00
Jean Chalard
f0046aea26
Update dictionaries
...
en, en_GB, en_US:
Add "id" -> "I'd" whitelist entry
Reinstate "id" and "ID" in the respective dicts
fr:
Remove many words that are not French
Change "google" to "Google"
pt_BR:
Delete "idéia"
Change-Id: I942266ac7995345580926f60de45d202aa257ae7
2013-07-24 12:10:06 +09:00
Jean Chalard
ffe7dbbe7a
Update dictionaries
...
>>> dictionaries/cs_wordlist.combined.gz
Header :
date : 1355802831 <=> 1372393817
version : 29 <=> 35
Body :
Added: LTE 25
>>> dictionaries/de_wordlist.combined.gz
Header :
date : 1355802835 <=> 1372393817
version : 29 <=> 35
Body :
Added: LTE 25
>>> dictionaries/en_GB_wordlist.combined.gz
Header :
date : 1366272052 <=> 1372393817
version : 31 <=> 35
Body :
Deleted: Sea 126
Added: LTE 25
>>> dictionaries/en_US_wordlist.combined.gz
Header :
date : 1366272093 <=> 1372393817
version : 31 <=> 35
Body :
Added: LTE 25
>>> dictionaries/en_wordlist.combined.gz
Header :
date : 1366272977 <=> 1372393837
version : 31 <=> 35
Body :
Deleted: Sea 126
Added: LTE 25
>>> dictionaries/es_wordlist.combined.gz
Header :
date : 1355802832 <=> 1372393817
version : 29 <=> 35
Body :
Added: LTE 25
>>> dictionaries/fr_wordlist.combined.gz
Header :
date : 1366272255 <=> 1372393818
version : 31 <=> 35
Body :
Deleted: R'n'B 95
Deleted: count 60
Deleted: d'Inti 34
Added: beurk 25
>>> dictionaries/hr_wordlist.combined.gz
Header :
date : 1355802836 <=> 1372393818
version : 29 <=> 35
Body :
Added: LTE 25
>>> dictionaries/it_wordlist.combined.gz
Header :
date : 1355802836 <=> 1372393818
version : 29 <=> 35
Body :
Added: LTE 25
>>> dictionaries/lt_wordlist.combined.gz
Header :
date : 1355802843 <=> 1372393818
version : 29 <=> 35
Body :
Added: LTE 25
>>> dictionaries/lv_wordlist.combined.gz
Header :
date : 1355802843 <=> 1372393818
version : 29 <=> 35
Body :
Added: LTE 25
>>> dictionaries/nb_wordlist.combined.gz
Header :
date : 1366003450 <=> 1372393818
version : 31 <=> 35
Body :
Added: LTE 25
>>> dictionaries/nl_wordlist.combined.gz
Header :
date : 1355802844 <=> 1372393818
version : 29 <=> 35
Body :
Added: LTE 25
>>> dictionaries/ru_wordlist.combined.gz
Header :
date : 1370244430 <=> 1372393835
version : 34 <=> 35
Body :
Freq changed: связывание 93 -> 0
>>> dictionaries/sl_wordlist.combined.gz
Header :
date : 1355802835 <=> 1372393835
version : 29 <=> 35
Body :
Added: LTE 25
>>> dictionaries/sr_wordlist.combined.gz
Header :
date : 1355802853 <=> 1372393835
version : 29 <=> 35
Body :
Added: LTE 25
>>> dictionaries/sv_wordlist.combined.gz
Header :
date : 1366003804 <=> 1372393836
version : 31 <=> 35
Body :
Added: LTE 25
>>> dictionaries/tr_wordlist.combined.gz
Header :
date : 1355802858 <=> 1372393837
version : 29 <=> 35
Body :
Added: LTE 25
>>> java/res/raw/main_de.dict
Header :
date : 1355802835 <=> 1372393817
version : 29 <=> 35
Body :
Added: LTE 25
>>> java/res/raw/main_en.dict
Header :
date : 1366272977 <=> 1372393837
version : 31 <=> 35
Body :
Deleted: Sea 126
Added: LTE 25
>>> java/res/raw/main_es.dict
Header :
date : 1355802832 <=> 1372393817
version : 29 <=> 35
Body :
Added: LTE 25
>>> java/res/raw/main_fr.dict
Header :
date : 1366272255 <=> 1372393818
version : 31 <=> 35
Body :
Deleted: R'n'B 95
Deleted: count 60
Deleted: d'Inti 34
Added: beurk 25
>>> java/res/raw/main_it.dict
Header :
date : 1355802836 <=> 1372393818
version : 29 <=> 35
Body :
Added: LTE 25
>>> java/res/raw/main_ru.dict
Header :
date : 1370244430 <=> 1372393835
version : 34 <=> 35
Body :
Freq changed: связывание 93 -> 0
Bug: 9301610
Bug: 9607966
Change-Id: I1117ed85d97fbb0ee50f11bc31776f1970b56f12
2013-06-28 14:54:51 +09:00
Jean Chalard
e73802f335
Update dictionaries
...
>>> dictionaries/ru_wordlist.combined.gz
Header :
date : 1366974711 <=> 1370244430
MULTIPLE_WORDS_DEMOTION_RATE : 0 <=> 50
version : 32 <=> 34
Body :
Deleted: МДА 2
Freq changed: а 0 -> 60
Freq changed: в 0 -> 60
Deleted: возбужденные 0
Freq changed: гей 92 -> 0
Freq changed: жид 80 -> 0
Freq changed: зареган 0 -> 50
Freq changed: и 0 -> 60
Freq changed: к 0 -> 60
Deleted: клевом 0
Freq changed: куи 29 -> 0
Freq changed: лох 69 -> 0
Freq changed: о 0 -> 60
Freq changed: ребут 0 -> 50
Freq changed: с 0 -> 60
Freq changed: у 0 -> 60
Freq changed: хуй 77 -> 0
Freq changed: хукера 38 -> 0
Freq changed: широко 0 -> 144
Deleted: щеткой 70
Freq changed: щёткой 69 -> 70
Freq changed: я 0 -> 60
Added: жены 134
Added: звони 100
Added: клёвом 50
Added: мда 0
>>> java/res/raw/main_ru.dict
Header :
date : 1366974711 <=> 1370244430
version : 32 <=> 34
MULTIPLE_WORDS_DEMOTION_RATE : 0 <=> 50
Body :
(same changes)
Change-Id: Ie10bdd1f33cac43c5be35e99faef7cfdfe877d2b
2013-06-03 16:41:12 +09:00
Jean Chalard
d57a7748c1
Update dictionaries
...
>>> dictionaries/ru_wordlist.combined.gz
Header :
date : 1366957492 <=> 1366974711
Body :
Added: ложись 100
Added: под 100
Added: посмотрю 100
Added: угу 100
Added: ух 100
>>> java/res/raw/main_ru.dict
Header :
date : 1366957492 <=> 1366974711
Body :
Added: ложись 100
Added: под 100
Added: посмотрю 100
Added: угу 100
Added: ух 100
Change-Id: Ida39ea2cf25cd291554f3b2f3ce31f57dca24113
2013-04-26 20:15:14 +09:00
Jean Chalard
7ec72b80ed
Update dictionaries
...
Full diff too long: truncated
Summary diff
>>> dictionaries/ru_wordlist.combined.gz
Header :
date : 1366277083 <=> 1366957492
version : 31 <=> 32
Contents :
- Reinstate 2- and 3- letter words that were demoted to avoid
bad space insertion (343 entries)
- Add missing words as per b/6341908 and b/5674314
(98 entries)
This has zero effect on the regression tests
Bug: 6341908
Bug: 5674314
Change-Id: Ifce268a7eab5edd264d963489187e975017f8b72
2013-04-26 15:56:54 +09:00
Jean Chalard
9cf468646f
Update dictionaries
...
>>> dictionaries/en_GB_wordlist.combined.gz
Header :
date : 1366021966 <=> 1366272052
Body :
Added: yt 0
>>> dictionaries/en_US_wordlist.combined.gz
Header :
date : 1366021978 <=> 1366272093
Body :
Added: yt 0
>>> dictionaries/en_wordlist.combined.gz
Header :
date : 1366021987 <=> 1366272977
Body :
Added: yt 0
>>> dictionaries/fr_wordlist.combined.gz
Header :
date : 1366003217 <=> 1366272255
Body :
Freq changed: cash 80 -> 20
>>> dictionaries/ru_wordlist.combined.gz
Header :
date : 1366003693 <=> 1366277083
Body :
Deleted: толщ 76
>>> java/res/raw/main_en.dict
Header :
date : 1366021987 <=> 1366272977
Body :
Added: yt 0
>>> java/res/raw/main_fr.dict
Header :
date : 1366003217 <=> 1366272255
Body :
Freq changed: cash 80 -> 20
>>> java/res/raw/main_ru.dict
Header :
date : 1366003693 <=> 1366277083
Body :
Deleted: толщ 76
Bug: 8635822
Change-Id: I44dc73bd010b125c994387894847a008276d69f7
2013-04-18 18:41:19 +09:00
Jean Chalard
e99daea083
Update dictionaries
...
>>> dictionaries/en_GB_wordlist.combined.gz
Header :
date : 1366003032 <=> 1366021966
Body :
Deleted: FTP 88
Deleted: HTTPS 66
Added: www 72
>>> dictionaries/en_US_wordlist.combined.gz
Header :
date : 1366003070 <=> 1366021978
Body :
Deleted: FTP 88
Deleted: HTTPS 66
Added: http 95
Added: www 71
>>> dictionaries/en_wordlist.combined.gz
Header :
date : 1366003861 <=> 1366021987
Body :
Deleted: FTP 88
Deleted: HTTPS 66
Freq changed: http 120 -> 95
Added: www 71
>>> java/res/raw/main_en.dict
Header :
date : 1366003861 <=> 1366021987
Body :
Deleted: FTP 88
Deleted: HTTPS 66
Freq changed: http 120 -> 95
Added: www 71
Bug: 8233807
Change-Id: Id55f6e0dcc9ddff26902c0857edcbb9b10d42328
2013-04-15 20:25:48 +09:00
Jean Chalard
da175bdcb1
Update dictionaries
...
>>> dictionaries/en_GB_wordlist.combined.gz
Header :
date : 1355802832 <=> 1366003032
version : 29 <=> 31
Body :
Deleted: HTTP 95
Deleted: WWW 72
Added: mm 135
>>> dictionaries/en_US_wordlist.combined.gz
Header :
date : 1355112451 <=> 1366003070
version : 28 <=> 31
Body :
Deleted: HTTP 95
Deleted: WWW 71
Added: mm 135
>>> dictionaries/en_wordlist.combined.gz
Header :
date : 1355802851 <=> 1366003861
version : 29 <=> 31
Body :
Deleted: HTTP 95
Deleted: WWW 71
Added: mm 135
>>> dictionaries/fr_wordlist.combined.gz
Header :
date : 1357617878 <=> 1366003217
version : 29 <=> 31
Body :
Not a word: re false -> true
Shortcut added: re le 15
>>> dictionaries/nb_wordlist.combined.gz
Header :
date : 1355802836 <=> 1366003450
version : 29 <=> 31
Body :
Freq changed: iPhone 91 -> 30
Added: app 30
>>> dictionaries/ru_wordlist.combined.gz
Header :
date : 1358763720 <=> 1366003693
version : 30 <=> 31
Body :
Freq changed: за 140 -> 181
Freq changed: не 140 -> 191
Freq changed: про 131 -> 151
Freq changed: эры 125 -> 140
>>> dictionaries/sv_wordlist.combined.gz
Header :
date : 1355802856 <=> 1366003804
version : 29 <=> 31
Body :
Added: vi 180
>>> java/res/raw/main_en.dict
Header :
date : 1355802851 <=> 1366003861
version : 29 <=> 31
Body :
Deleted: HTTP 95
Deleted: WWW 71
Added: mm 135
>>> java/res/raw/main_fr.dict
Header :
date : 1357617878 <=> 1366003217
version : 29 <=> 31
Body :
Not a word: re false -> true
Shortcut added: re le 15
>>> java/res/raw/main_ru.dict
Header :
date : 1358763720 <=> 1366003693
version : 30 <=> 31
Body :
Freq changed: за 140 -> 181
Freq changed: не 140 -> 191
Freq changed: про 131 -> 151
Freq changed: эры 125 -> 140
Bug: 8560415
Bug: 7556679
Change-Id: If1c628edcb1cc5efd67e1715acf94f19c0eb4643
2013-04-15 14:51:02 +09:00
Jean Chalard
be94d212e8
Update the Russian dictionary
...
The point is to get as close as possible to having the
golden Russian tests pass.
>>> dictionaries/ru_wordlist.combined.gz
Header :
date : 1355818916 <=> 1358763720
version : 29 <=> 30
Body :
Deleted: НКТ 14
Freq changed: без 0 -> 140
Freq changed: бонус 94 -> 130
Freq changed: за 0 -> 140
Freq changed: на 0 -> 180
Freq changed: не 0 -> 140
Freq changed: парка 133 -> 110
Freq changed: про 0 -> 131
Freq changed: ручьи 93 -> 80
Freq changed: ура 86 -> 100
Freq changed: юрты 86 -> 60
Added: вечерком 100
Added: задачки 100
Added: сорри 100
Added: узнай 100
Added: учти 100
>>> java/res/raw/main_ru.dict
Same changes as above
Change-Id: I8685c34d9ab1dcbf8ae8e23d2e26380059684c95
2013-01-21 19:30:17 +09:00
Jean Chalard
84f932be73
Add words to Portuguese
...
>>> dictionaries/pt_BR_wordlist.combined.gz
Header :
date : 1355802839 <=> 1357790917
version : 29 <=> 30
Body :
Added: à 30
Added: é 30
Added: ò 30
Added: ô 30
>>> dictionaries/pt_PT_wordlist.combined.gz
Header :
date : 1355802856 <=> 1357790930
version : 29 <=> 30
Body :
Added: à 30
Added: é 30
Added: ò 30
Added: ô 30
>>> java/res/raw/main_pt_br.dict
Header :
date : 1355802839 <=> 1357790917
version : 29 <=> 30
Body :
Added: à 30
Added: é 30
Added: ò 30
Added: ô 30
Bug: 7966948
Change-Id: I71c0986cf616d67926d0a6a0e53099b04b0427d5
2013-01-10 14:14:17 +09:00
Jean Chalard
420528ed97
Update dictionaries
...
>>> dictionaries/fr_wordlist.combined.gz
Header :
date : 1355802835 <=> 1357617878
Body :
Deleted: jai 50
>>> dictionaries/pl_wordlist.combined.gz
Header :
date : 1355802847 <=> 1357618222
Body :
Added: żebyście 69
Added: żebyśmy 69
>>> java/res/raw/main_fr.dict
Header :
date : 1355802835 <=> 1357617878
Body :
Deleted: jai 50
Change-Id: I8651a4689bea06d5fe2caead471ef52969c77089
2013-01-08 14:24:22 +09:00
Jean Chalard
cd89c5d6ed
Update dictionaries
...
>>> dictionaries/ru_wordlist.combined.gz
Header :
date : 1355802857 <=> 1355818916
Body :
Freq changed: БД 18 -> 0
Freq changed: ГБ 14 -> 0
Freq changed: ЕС 44 -> 0
Freq changed: ЖД 3 -> 0
Freq changed: ЖЖ 8 -> 0
Freq changed: ЖК 3 -> 0
Freq changed: ИИ 21 -> 0
Freq changed: КБ 37 -> 0
Freq changed: МБ 19 -> 0
Freq changed: МО 26 -> 0
Freq changed: ОС 40 -> 0
Freq changed: РФ 65 -> 0
Freq changed: СБ 21 -> 0
Freq changed: СК 23 -> 0
Freq changed: ТВ 37 -> 0
Freq changed: УК 36 -> 0
Freq changed: ЦБ 11 -> 0
Freq changed: ЦК 59 -> 0
Deleted: бэ 0
Freq changed: дБ 92 -> 0
Deleted: йо 0
Freq changed: мм 149 -> 0
Freq changed: рН 104 -> 0
Deleted: ша 0
>>> java/res/raw/main_ru.dict
Header :
date : 1355802857 <=> 1355818916
Body :
Freq changed: БД 18 -> 0
Freq changed: ГБ 14 -> 0
Freq changed: ЕС 44 -> 0
Freq changed: ЖД 3 -> 0
Freq changed: ЖЖ 8 -> 0
Freq changed: ЖК 3 -> 0
Freq changed: ИИ 21 -> 0
Freq changed: КБ 37 -> 0
Freq changed: МБ 19 -> 0
Freq changed: МО 26 -> 0
Freq changed: ОС 40 -> 0
Freq changed: РФ 65 -> 0
Freq changed: СБ 21 -> 0
Freq changed: СК 23 -> 0
Freq changed: ТВ 37 -> 0
Freq changed: УК 36 -> 0
Freq changed: ЦБ 11 -> 0
Freq changed: ЦК 59 -> 0
Deleted: бэ 0
Freq changed: дБ 92 -> 0
Deleted: йо 0
Freq changed: мм 149 -> 0
Freq changed: рН 104 -> 0
Deleted: ша 0
Change-Id: I03f0f4e8d03e0f77f5879e6dd5c424673466afca
2012-12-18 17:25:37 +09:00
Jean Chalard
21dbe3701c
Update dictionaries
...
cs, da, de, el, es, fi, fr, hr, it, lt, lv, nb, nl, pl,
pt_BR, pt_PT, sl, sr, sv, tr : rescale frequencies to match
spec. This has no large effect in practice, except that the
dictionary becomes stronger relative to the spatial model
(especially in lower-count corpora, like lt, lv, sr)
en* : Small changes (rounding going the other way essentially)
ru : the above rescaling, and remove the following words:
Дре, ОСТа, Планше, легкими, легком, легкому, легкости,
легкую, нелегкие, нелегкий, нелегким, нелегкое, нелегкой,
нелегкую, полулегком and add нелёгкие, нелёгкое, нелёгкую;
other accented forms were already in the dictionary.
Change-Id: I40386c2ebd4d2be38874e822bde89db7cb512ae6
2012-12-18 13:06:48 +09:00
Jean Chalard
d080986f93
Update dictionaries
...
>>> dictionaries/en_GB_wordlist.combined.gz
Header :
date : 1354870724 <=> 1355112440
version : 27 <=> 28
Body :
Deleted: DoCoMo 65
Added: Docomo 65
Added: KDDI 25
Added: Softbank 25
>>> dictionaries/en_US_wordlist.combined.gz
Header :
date : 1354870736 <=> 1355112451
version : 27 <=> 28
Body :
Deleted: DoCoMo 65
Added: Docomo 65
Added: KDDI 25
Added: Softbank 25
>>> dictionaries/en_wordlist.combined.gz
Header :
date : 1354870744 <=> 1355112460
version : 27 <=> 28
Body :
Deleted: DoCoMo 65
Added: Docomo 65
Added: KDDI 25
Added: Softbank 25
>>> dictionaries/es_wordlist.combined.gz
Header :
date : 1351676002 <=> 1355117676
version : 26 <=> 28
Body :
Deleted: DoCoMo 40
Added: Docomo 40
Added: KDDI 25
Added: Softbank 25
>>> dictionaries/fi_wordlist.combined.gz
Header :
date : 1351676054 <=> 1355117691
version : 26 <=> 28
Body :
Deleted: DoCoMo 28
Added: Docomo 28
Added: KDDI 25
Added: Softbank 25
>>> dictionaries/fr_wordlist.combined.gz
Header :
date : 1354872988 <=> 1355117708
version : 27 <=> 28
Body :
Deleted: DoCoMo 52
Added: Docomo 52
Added: KDDI 25
Added: Softbank 25
>>> dictionaries/pt_PT_wordlist.combined.gz
Header :
date : 1351676510 <=> 1355117723
version : 26 <=> 28
Body :
Deleted: DoCoMo 48
Added: Docomo 48
Added: Softbank 25
>>> java/res/raw/main_en.dict
Header :
date : 1354870744 <=> 1355112460
version : 27 <=> 28
Body :
Deleted: DoCoMo 65
Added: Docomo 65
Added: KDDI 25
Added: Softbank 25
>>> java/res/raw/main_es.dict
Header :
date : 1353500806 <=> 1355117676
version : 27 <=> 28
Body :
Deleted: DoCoMo 40
Added: Docomo 40
Added: KDDI 25
Added: Softbank 25
>>> java/res/raw/main_fr.dict
Header :
date : 1354872988 <=> 1355117708
version : 27 <=> 28
Body :
Deleted: DoCoMo 52
Added: Docomo 52
Added: KDDI 25
Added: Softbank 25
Change-Id: I3801cbe4535407f55ede8db327674d493a92d1ae
2012-12-10 14:52:43 +09:00
Jean Chalard
bd793ed50d
Update dictionaries
...
>>> dictionaries/en_GB_wordlist.combined.gz
Header :
date : 1353500789 <=> 1354870724
Body :
Added: Dad 75
Added: Daddy 60
Added: Grandma 60
Added: Grandpa 55
Added: Mama 59
Added: Mom 77
Added: Papa 55
>>> dictionaries/en_US_wordlist.combined.gz
Header :
date : 1351675958 <=> 1354870736
version : 26 <=> 27
Body :
Deleted: Rod's 46
Added: Dad 75
Added: Daddy 60
Added: Grandma 60
Added: Grandpa 55
Added: Mama 59
Added: Mom 77
Added: Papa 55
>>> dictionaries/en_wordlist.combined.gz
Header :
date : 1353500998 <=> 1354870744
Body :
Deleted: Rod's 46
Added: Dad 75
Added: Daddy 60
Added: Grandma 60
Added: Grandpa 55
Added: Mama 59
Added: Mom 77
Added: Papa 55
>>> dictionaries/fr_wordlist.combined.gz
Header :
date : 1353500832 <=> 1354872988
Body :
Deleted: noël 71
Deleted: po 73
Deleted: ti 73
Added: Noël 71
Added: lose 1
Added: y'a 130
>>> dictionaries/ru_wordlist.combined.gz
Header :
date : 1353567943 <=> 1354870130
Body :
Demote all CAPS words by 80
Freq changed: модно 51 -> 20
>>> java/res/raw/main_en.dict
Header :
date : 1353500998 <=> 1354870744
Body :
Deleted: Rod's 46
Added: Dad 75
Added: Daddy 60
Added: Grandma 60
Added: Grandpa 55
Added: Mama 59
Added: Mom 77
Added: Papa 55
>>> java/res/raw/main_fr.dict
Header :
date : 1353500832 <=> 1354872988
Body :
Deleted: noël 71
Deleted: po 73
Deleted: ti 73
Added: Noël 71
Added: lose 1
Added: y'a 130
>>> java/res/raw/main_ru.dict
Header :
date : 1353567943 <=> 1354870130
Body :
Demote all CAPS words by 80
Freq changed: модно 51 -> 20
Change-Id: I6f2d1c359d716535923b22c33d7fa4c3b0a330e4
2012-12-07 18:52:21 +09:00
Jean Chalard
b40a1ce50b
Update RU dictionary header.
...
>>> dictionaries/ru_wordlist.combined.gz
>>> java/res/raw/main_ru.dict
Header :
date : 1353500945 <=> 1353567943
MULTIPLE_WORDS_DEMOTION_RATE : null <=> 0
Body :
No differences
Bug: 7540132
Change-Id: I837831b1e214da64962cf1bb68c840a3d4e6bf76
2012-11-22 16:21:10 +09:00
Jean Chalard
d5f53710c5
Update dictionaries and fix mistakes
...
- Combined de dict :
Remove digraph shortcuts that were in by mistake.
- Combined en dict :
Set freq of "baton" "batons" "mace" "puff"
"puffs" and "tasers" to zero. They are offensive
in en_GB.
- Combined en_GB dict :
Change freq of "il" to 0 and flag it "not a word". Still
in the dict as a whitelist entry for "I'll"; for some
reason it had freq 99.
Add "milk:122" and "practice:143"
- Combined fr dict :
Add missing words : "Nostradamus:40" "défendais:30"
"gmail:50" "générale:140" "hm:0" "hmm:0" "y'en:130"
"l'apocalypse:31" "m'épuise:30" "recontacter:80"
"t'annonce:30"
Set freq of non-word shortcuts for digraphs to 1 instead
of 0, so that they can be entered by gesture.
- Combined ru dict :
Remove a lot of two-character non-words.
- Binary de dict :
Remove the obsolete "options" header, and add the "dictionary"
header.
- Binary en dict :
Flag "hoe" "hoes" "il" "shel" as non-words.
Also drop freq of "il" and "shel" to 0
Add the "locale" header that was missing.
- Binary es dict :
Add the "dictionary" header.
- Binary fr dict :
Add the same words as above. Non-word shortcuts were already
set to 1.
- Binary it dict :
Add a "dictionary" header. Also change freq of
"Šarapova" from 50 to 37; not sure why it was 50.
- Binary pt_BR dict :
Add a "dictionary" header.
- Binary ru dict :
Add a "dictionary" header and remove the same words as above.
For all dictionaries : bump the version to 27.
Change-Id: I94fe7f8f42b31fdad223085c00a94115e14d2276
2012-11-21 22:03:24 +09:00
Jean Chalard
f5adbb1e1b
Move the emoji dictionaries source under AOSP.
...
Change-Id: Ie870a90d483d9f27aed96fb4b44126315c43922f
2012-10-31 19:22:42 +09:00
Jean Chalard
a424ff06ec
Switch the AOSP word lists to the combined format.
...
This will help with managing the word lists.
Bug: 7388859
Change-Id: I89f049569b177d3027fe56d6c67eaca27d44dc7d
2012-10-31 18:52:00 +09:00
Jean Chalard
306e0a800f
Update AOSP dictionaries.
...
Changes :
- Add "emoji"
- Change the whitelist target of "foo" from "for" to "too"
- Fix non-word frequencies to 0
- Fix the freq of common en_US vs en_GB words
- Add "connection" to the en_GB dictionary
Bug: 7368441
Bug: 7370033
Bug: 7371955
Change-Id: Ib22a97e97b486b05012d5496619557f406c441b9
2012-10-24 16:12:28 +09:00
Jean Chalard
3d83a1648b
Update AOSP dictionaries.
...
Differences :
oh 90 -> 105
ooh 54 -> 54
hoy,kinkier,kinkiest,kinkiness,kinkily,kinky -> 0
trst -> remove
New whitelist entries (actually old ones that had not been applied)
"berm" -> "been"
"foe" -> "for"
"hid" -> "his"
"thong" -> "thing"
French :
Add "six" and remove some non-words
Bug: 7329149
Bug: 7356297
Change-Id: I55092f0538db8627148b0a314e50eff926c47275
2012-10-18 00:39:16 +09:00
Jean Chalard
b24cda3c0c
Fix the Danish dictionary
...
Human error: this contained "Nederlands" ("Dutch" in Dutch) as the
human-readable description in the header.
Bug: 7272686
Change-Id: I7a67e7bf1afca6928de7825fb63c5b213e8d7978
2012-10-04 15:52:50 +09:00
Jean Chalard
a44942810d
Update the AOSP dictionaries for the 0-freq review
...
Bug: 7227265
Change-Id: I384f7d76cef67b96b106ddac96e4baf1fa32afd4
2012-10-03 21:15:27 +09:00
Jean Chalard
d0cf96493c
Use all Lexiteria sources and update existing directories.
...
New dictionaries :
- Danish
- Greek
- Finnish
- Lithuanian
- Latvian
- Dutch
- Polish
- Russian
- Slovene
- Serbian
- Swedish
- Turkish
Also, compress those files to reduce the footprint in the
repository.
Also, update and improve English and French dictionaries, and
add the ligature shortcuts to the French dictionary.
Finally, move the Russian binary dictionary here now that it
can at last be open sourced.
Bug: 5587752
Bug: 6775251
Bug: 6995793
Bug: 7149666
Change-Id: Iec9831d4dce425a2b5b0657571e4448436610525
2012-09-21 22:07:23 +09:00
Jean Chalard
c278142745
Remove useless backslashes from the whitelist dictionary
...
For some reason, these are necessary for resources, but the XML
standard does not require them.
Change-Id: I7cdaecb6815aa4020e0d453e33be38ff2968df50
2012-08-13 15:53:07 +09:00
Jean Chalard
1d8103ea57
Add a shortcut-format version of the whitelist.
...
This will ultimately replace the whitelist resource, but
this change doesn't delete it to avoid removing the functionality
temporarily.
Bug: 6906525
Change-Id: I576edc42cd2a964b86b7597f1ede1cf6ec8e26c3
2012-08-10 15:51:18 +09:00
Jean Chalard
6f7b1ff468
Update dictionaries.
...
- English : some words caught through regression tests
- English : some words externally reported
- French : some words externally reported
- French : finished review of all accented words
Bug: 6726969
Bug: 6730031
Change-Id: I37d0dc310db2c79e03ac7ad452391e92d9b13357
2012-06-29 19:30:01 +09:00
Jean Chalard
401e70535e
Make sure whitelist targets are in the main dictionary
...
Bug: 6680976
Change-Id: Ieddb5eecb813da3a8a515930568e356bc3526386
2012-06-19 02:08:57 +09:00
Jean Chalard
79451e0a70
Update dictionaries.
...
- English dict scrubbed for distractors
- EN, FR, IT, DE include improvements from user feedback
Bug: 6394369
Change-Id: I9af5415d0b6a5edfea2956657b0fee7906ebb344
2012-06-16 04:25:43 +09:00
Ken Wakasa
46b6004a18
Remove unusable Persian dictionary source data
...
Change-Id: I4e1c3460d0a6f737355a46fd3e7bfecf00aca598
2012-06-15 13:46:38 +09:00
Jean Chalard
73e417b57b
Merge "Improvements to the English dicts" into jb-dev
2012-05-31 03:05:53 -07:00
Jean Chalard
51fb65569a
Improvements to the English dicts
...
Bug: 6394369
Change-Id: I7a4747386adef44e6d1a0c9fec52d09611f1ce10
2012-05-31 18:46:22 +09:00
Jean Chalard
164a47fe20
Add word lists for Czech, Norwegian Bokmål and EU Portuguese
...
Bug: 5779153
Bug: 5587752
Change-Id: I0b91c99097bdc644bb21eea3f3a1e3385281471f
2012-05-31 15:16:01 +09:00
Jean Chalard
90868195e4
Fix spelling of an unused flag
...
Change-Id: I28fc3004da0ead2d373f36541732ac4450d86318
2012-05-28 18:49:37 +09:00
Jean Chalard
3a6efa06e2
Small update to the English dictionaries
...
Demote 'HDTV'
Bug: 6563090
Change-Id: I39a1632397569cf79a8d67d93cdff5cf29f82f3a
2012-05-28 13:01:59 +09:00
Jean Chalard
b2acdba809
Remove non-words from the French dictionary.
...
Change-Id: I98c546818aa456a534e833495deb670e79df4104
2012-05-24 17:16:41 +09:00
Jean Chalard
80058c73cb
Update AOSP dictionaries
...
Change-Id: Ia6bb1f9d6df4a9f859f132affc9cb030f14effd9
2012-05-22 16:12:50 +09:00
Jean Chalard
1fc0c71fad
Update French/English dictionaries to the latest version
...
Change-Id: I9c98280f900914d1af22b47019ebc0ad5ab175de
2012-05-18 18:54:37 +09:00
Jean Chalard
8fec807800
Add open-source-able word lists to AOSP.
...
Bug: 6458744
Change-Id: If28aeb7360ee7ec7408f55934ca2a684f032e338
2012-05-17 19:20:04 +09:00
Jean Chalard
4fc97c2c01
Add a note of documentation to the sample word list
...
Change-Id: I95f09da03457933a14b549e04575d566de97dd49
2011-12-14 15:25:31 +09:00
The Android Open Source Project
923bf41f85
auto import from //branches/cupcake/...@138744
2009-03-13 15:11:42 -07:00