LatinIME

Commit Graph

Author	SHA1	Message	Date
Jean Chalard	d57a7748c1	Update dictionaries >>> dictionaries/ru_wordlist.combined.gz Header : date : 1366957492 <=> 1366974711 Body : Added: ложись 100 Added: под 100 Added: посмотрю 100 Added: угу 100 Added: ух 100 >>> java/res/raw/main_ru.dict Header : date : 1366957492 <=> 1366974711 Body : Added: ложись 100 Added: под 100 Added: посмотрю 100 Added: угу 100 Added: ух 100 Change-Id: Ida39ea2cf25cd291554f3b2f3ce31f57dca24113	2013-04-26 20:15:14 +09:00
Jean Chalard	7ec72b80ed	Update dictionaries Full diff too long: truncated Summary diff >>> dictionaries/ru_wordlist.combined.gz Header : date : 1366277083 <=> 1366957492 version : 31 <=> 32 Contents : - Reinstate 2- and 3- letter words that were demoted to avoid bad space insertion (343 entries) - Add missing words as per b/6341908 and b/5674314 (98 entries) This has zero effect on the regression tests Bug: 6341908 Bug: 5674314 Change-Id: Ifce268a7eab5edd264d963489187e975017f8b72	2013-04-26 15:56:54 +09:00
Jean Chalard	9cf468646f	Update dictionaries >>> dictionaries/en_GB_wordlist.combined.gz Header : date : 1366021966 <=> 1366272052 Body : Added: yt 0 >>> dictionaries/en_US_wordlist.combined.gz Header : date : 1366021978 <=> 1366272093 Body : Added: yt 0 >>> dictionaries/en_wordlist.combined.gz Header : date : 1366021987 <=> 1366272977 Body : Added: yt 0 >>> dictionaries/fr_wordlist.combined.gz Header : date : 1366003217 <=> 1366272255 Body : Freq changed: cash 80 -> 20 >>> dictionaries/ru_wordlist.combined.gz Header : date : 1366003693 <=> 1366277083 Body : Deleted: толщ 76 >>> java/res/raw/main_en.dict Header : date : 1366021987 <=> 1366272977 Body : Added: yt 0 >>> java/res/raw/main_fr.dict Header : date : 1366003217 <=> 1366272255 Body : Freq changed: cash 80 -> 20 >>> java/res/raw/main_ru.dict Header : date : 1366003693 <=> 1366277083 Body : Deleted: толщ 76 Bug: 8635822 Change-Id: I44dc73bd010b125c994387894847a008276d69f7	2013-04-18 18:41:19 +09:00
Jean Chalard	e99daea083	Update dictionaries >>> dictionaries/en_GB_wordlist.combined.gz Header : date : 1366003032 <=> 1366021966 Body : Deleted: FTP 88 Deleted: HTTPS 66 Added: www 72 >>> dictionaries/en_US_wordlist.combined.gz Header : date : 1366003070 <=> 1366021978 Body : Deleted: FTP 88 Deleted: HTTPS 66 Added: http 95 Added: www 71 >>> dictionaries/en_wordlist.combined.gz Header : date : 1366003861 <=> 1366021987 Body : Deleted: FTP 88 Deleted: HTTPS 66 Freq changed: http 120 -> 95 Added: www 71 >>> java/res/raw/main_en.dict Header : date : 1366003861 <=> 1366021987 Body : Deleted: FTP 88 Deleted: HTTPS 66 Freq changed: http 120 -> 95 Added: www 71 Bug: 8233807 Change-Id: Id55f6e0dcc9ddff26902c0857edcbb9b10d42328	2013-04-15 20:25:48 +09:00
Jean Chalard	da175bdcb1	Update dictionaries >>> dictionaries/en_GB_wordlist.combined.gz Header : date : 1355802832 <=> 1366003032 version : 29 <=> 31 Body : Deleted: HTTP 95 Deleted: WWW 72 Added: mm 135 >>> dictionaries/en_US_wordlist.combined.gz Header : date : 1355112451 <=> 1366003070 version : 28 <=> 31 Body : Deleted: HTTP 95 Deleted: WWW 71 Added: mm 135 >>> dictionaries/en_wordlist.combined.gz Header : date : 1355802851 <=> 1366003861 version : 29 <=> 31 Body : Deleted: HTTP 95 Deleted: WWW 71 Added: mm 135 >>> dictionaries/fr_wordlist.combined.gz Header : date : 1357617878 <=> 1366003217 version : 29 <=> 31 Body : Not a word: re false -> true Shortcut added: re le 15 >>> dictionaries/nb_wordlist.combined.gz Header : date : 1355802836 <=> 1366003450 version : 29 <=> 31 Body : Freq changed: iPhone 91 -> 30 Added: app 30 >>> dictionaries/ru_wordlist.combined.gz Header : date : 1358763720 <=> 1366003693 version : 30 <=> 31 Body : Freq changed: за 140 -> 181 Freq changed: не 140 -> 191 Freq changed: про 131 -> 151 Freq changed: эры 125 -> 140 >>> dictionaries/sv_wordlist.combined.gz Header : date : 1355802856 <=> 1366003804 version : 29 <=> 31 Body : Added: vi 180 >>> java/res/raw/main_en.dict Header : date : 1355802851 <=> 1366003861 version : 29 <=> 31 Body : Deleted: HTTP 95 Deleted: WWW 71 Added: mm 135 >>> java/res/raw/main_fr.dict Header : date : 1357617878 <=> 1366003217 version : 29 <=> 31 Body : Not a word: re false -> true Shortcut added: re le 15 >>> java/res/raw/main_ru.dict Header : date : 1358763720 <=> 1366003693 version : 30 <=> 31 Body : Freq changed: за 140 -> 181 Freq changed: не 140 -> 191 Freq changed: про 131 -> 151 Freq changed: эры 125 -> 140 Bug: 8560415 Bug: 7556679 Change-Id: If1c628edcb1cc5efd67e1715acf94f19c0eb4643	2013-04-15 14:51:02 +09:00
Jean Chalard	be94d212e8	Update the Russian dictionary The point is to get as close as possible to having the golden Russian tests pass. >>> dictionaries/ru_wordlist.combined.gz Header : date : 1355818916 <=> 1358763720 version : 29 <=> 30 Body : Deleted: НКТ 14 Freq changed: без 0 -> 140 Freq changed: бонус 94 -> 130 Freq changed: за 0 -> 140 Freq changed: на 0 -> 180 Freq changed: не 0 -> 140 Freq changed: парка 133 -> 110 Freq changed: про 0 -> 131 Freq changed: ручьи 93 -> 80 Freq changed: ура 86 -> 100 Freq changed: юрты 86 -> 60 Added: вечерком 100 Added: задачки 100 Added: сорри 100 Added: узнай 100 Added: учти 100 >>> java/res/raw/main_ru.dict All the same above changes Change-Id: I8685c34d9ab1dcbf8ae8e23d2e26380059684c95	2013-01-21 19:30:17 +09:00
Jean Chalard	84f932be73	Add words to Portuguese >>> dictionaries/pt_BR_wordlist.combined.gz Header : date : 1355802839 <=> 1357790917 version : 29 <=> 30 Body : Added: à 30 Added: é 30 Added: ò 30 Added: ô 30 >>> dictionaries/pt_PT_wordlist.combined.gz Header : date : 1355802856 <=> 1357790930 version : 29 <=> 30 Body : Added: à 30 Added: é 30 Added: ò 30 Added: ô 30 >>> java/res/raw/main_pt_br.dict Header : date : 1355802839 <=> 1357790917 version : 29 <=> 30 Body : Added: à 30 Added: é 30 Added: ò 30 Added: ô 30 Bug: 7966948 Change-Id: I71c0986cf616d67926d0a6a0e53099b04b0427d5	2013-01-10 14:14:17 +09:00
Jean Chalard	420528ed97	Update dictionaries >>> dictionaries/fr_wordlist.combined.gz Header : date : 1355802835 <=> 1357617878 Body : Deleted: jai 50 >>> dictionaries/pl_wordlist.combined.gz Header : date : 1355802847 <=> 1357618222 Body : Added: żebyście 69 Added: żebyśmy 69 >>> java/res/raw/main_fr.dict Header : date : 1355802835 <=> 1357617878 Body : Deleted: jai 50 Change-Id: I8651a4689bea06d5fe2caead471ef52969c77089	2013-01-08 14:24:22 +09:00
Jean Chalard	cd89c5d6ed	Update dictionaries >>> dictionaries/ru_wordlist.combined.gz Header : date : 1355802857 <=> 1355818916 Body : Freq changed: БД 18 -> 0 Freq changed: ГБ 14 -> 0 Freq changed: ЕС 44 -> 0 Freq changed: ЖД 3 -> 0 Freq changed: ЖЖ 8 -> 0 Freq changed: ЖК 3 -> 0 Freq changed: ИИ 21 -> 0 Freq changed: КБ 37 -> 0 Freq changed: МБ 19 -> 0 Freq changed: МО 26 -> 0 Freq changed: ОС 40 -> 0 Freq changed: РФ 65 -> 0 Freq changed: СБ 21 -> 0 Freq changed: СК 23 -> 0 Freq changed: ТВ 37 -> 0 Freq changed: УК 36 -> 0 Freq changed: ЦБ 11 -> 0 Freq changed: ЦК 59 -> 0 Deleted: бэ 0 Freq changed: дБ 92 -> 0 Deleted: йо 0 Freq changed: мм 149 -> 0 Freq changed: рН 104 -> 0 Deleted: ша 0 >>> java/res/raw/main_ru.dict Header : date : 1355802857 <=> 1355818916 Body : Freq changed: БД 18 -> 0 Freq changed: ГБ 14 -> 0 Freq changed: ЕС 44 -> 0 Freq changed: ЖД 3 -> 0 Freq changed: ЖЖ 8 -> 0 Freq changed: ЖК 3 -> 0 Freq changed: ИИ 21 -> 0 Freq changed: КБ 37 -> 0 Freq changed: МБ 19 -> 0 Freq changed: МО 26 -> 0 Freq changed: ОС 40 -> 0 Freq changed: РФ 65 -> 0 Freq changed: СБ 21 -> 0 Freq changed: СК 23 -> 0 Freq changed: ТВ 37 -> 0 Freq changed: УК 36 -> 0 Freq changed: ЦБ 11 -> 0 Freq changed: ЦК 59 -> 0 Deleted: бэ 0 Freq changed: дБ 92 -> 0 Deleted: йо 0 Freq changed: мм 149 -> 0 Freq changed: рН 104 -> 0 Deleted: ша 0 Change-Id: I03f0f4e8d03e0f77f5879e6dd5c424673466afca	2012-12-18 17:25:37 +09:00
Jean Chalard	21dbe3701c	Update dictionaries cs, da, de, el, es, fi, fr, hr, it, lt, lv, nb, nl, pl, pt_BR, pt_PT, sl, sr, sv, tr : rescale frequencies to match spec. This has no large effect in the practice except the dictionary will become stronger vs spatial model (especially in lower count corpora, like lt, lv, sr) en* : Small changes (rounding going the other way essentially) ru : the above rescaling, and remove the following words: Дре, ОСТа, Планше, легкими, легком, легкому, легкости, легкую, нелегкие, нелегкий, нелегким, нелегкое, нелегкой, нелегкую, полулегком and add нелёгкие, нелёгкое, нелёгкую; other accented forms were already in the dictionary. Change-Id: I40386c2ebd4d2be38874e822bde89db7cb512ae6	2012-12-18 13:06:48 +09:00
Jean Chalard	d080986f93	Update dictionaries >>> dictionaries/en_GB_wordlist.combined.gz Header : date : 1354870724 <=> 1355112440 version : 27 <=> 28 Body : Deleted: DoCoMo 65 Added: Docomo 65 Added: KDDI 25 Added: Softbank 25 >>> dictionaries/en_US_wordlist.combined.gz Header : date : 1354870736 <=> 1355112451 version : 27 <=> 28 Body : Deleted: DoCoMo 65 Added: Docomo 65 Added: KDDI 25 Added: Softbank 25 >>> dictionaries/en_wordlist.combined.gz Header : date : 1354870744 <=> 1355112460 version : 27 <=> 28 Body : Deleted: DoCoMo 65 Added: Docomo 65 Added: KDDI 25 Added: Softbank 25 >>> dictionaries/es_wordlist.combined.gz Header : date : 1351676002 <=> 1355117676 version : 26 <=> 28 Body : Deleted: DoCoMo 40 Added: Docomo 40 Added: KDDI 25 Added: Softbank 25 >>> dictionaries/fi_wordlist.combined.gz Header : date : 1351676054 <=> 1355117691 version : 26 <=> 28 Body : Deleted: DoCoMo 28 Added: Docomo 28 Added: KDDI 25 Added: Softbank 25 >>> dictionaries/fr_wordlist.combined.gz Header : date : 1354872988 <=> 1355117708 version : 27 <=> 28 Body : Deleted: DoCoMo 52 Added: Docomo 52 Added: KDDI 25 Added: Softbank 25 >>> dictionaries/pt_PT_wordlist.combined.gz Header : date : 1351676510 <=> 1355117723 version : 26 <=> 28 Body : Deleted: DoCoMo 48 Added: Docomo 48 Added: Softbank 25 >>> java/res/raw/main_en.dict Header : date : 1354870744 <=> 1355112460 version : 27 <=> 28 Body : Deleted: DoCoMo 65 Added: Docomo 65 Added: KDDI 25 Added: Softbank 25 >>> java/res/raw/main_es.dict Header : date : 1353500806 <=> 1355117676 version : 27 <=> 28 Body : Deleted: DoCoMo 40 Added: Docomo 40 Added: KDDI 25 Added: Softbank 25 >>> java/res/raw/main_fr.dict Header : date : 1354872988 <=> 1355117708 version : 27 <=> 28 Body : Deleted: DoCoMo 52 Added: Docomo 52 Added: KDDI 25 Added: Softbank 25 Change-Id: I3801cbe4535407f55ede8db327674d493a92d1ae	2012-12-10 14:52:43 +09:00
Jean Chalard	bd793ed50d	Update dictionaries >>> dictionaries/en_GB_wordlist.combined.gz Header : date : 1353500789 <=> 1354870724 Body : Added: Dad 75 Added: Daddy 60 Added: Grandma 60 Added: Grandpa 55 Added: Mama 59 Added: Mom 77 Added: Papa 55 >>> dictionaries/en_US_wordlist.combined.gz Header : date : 1351675958 <=> 1354870736 version : 26 <=> 27 Body : Deleted: Rod's 46 Added: Dad 75 Added: Daddy 60 Added: Grandma 60 Added: Grandpa 55 Added: Mama 59 Added: Mom 77 Added: Papa 55 >>> dictionaries/en_wordlist.combined.gz Header : date : 1353500998 <=> 1354870744 Body : Deleted: Rod's 46 Added: Dad 75 Added: Daddy 60 Added: Grandma 60 Added: Grandpa 55 Added: Mama 59 Added: Mom 77 Added: Papa 55 >>> dictionaries/fr_wordlist.combined.gz Header : date : 1353500832 <=> 1354872988 Body : Deleted: noël 71 Deleted: po 73 Deleted: ti 73 Added: Noël 71 Added: lose 1 Added: y'a 130 >>> dictionaries/ru_wordlist.combined.gz Header : date : 1353567943 <=> 1354870130 Body : Demote all CAPS words by 80 Freq changed: модно 51 -> 20 >>> java/res/raw/main_en.dict Header : date : 1353500998 <=> 1354870744 Body : Deleted: Rod's 46 Added: Dad 75 Added: Daddy 60 Added: Grandma 60 Added: Grandpa 55 Added: Mama 59 Added: Mom 77 Added: Papa 55 >>> java/res/raw/main_fr.dict Header : date : 1353500832 <=> 1354872988 Body : Deleted: noël 71 Deleted: po 73 Deleted: ti 73 Added: Noël 71 Added: lose 1 Added: y'a 130 >>> java/res/raw/main_ru.dict Header : date : 1353567943 <=> 1354870130 Body : Demote all CAPS words by 80 Freq changed: модно 51 -> 20 Change-Id: I6f2d1c359d716535923b22c33d7fa4c3b0a330e4	2012-12-07 18:52:21 +09:00
Jean Chalard	b40a1ce50b	Update RU dictionary header. >>> dictionaries/ru_wordlist.combined.gz >>> java/res/raw/main_ru.dict Header : date : 1353500945 <=> 1353567943 MULTIPLE_WORDS_DEMOTION_RATE : null <=> 0 Body : No differences Bug: 7540132 Change-Id: I837831b1e214da64962cf1bb68c840a3d4e6bf76	2012-11-22 16:21:10 +09:00
Jean Chalard	d5f53710c5	Update dictionaries and fix mistakes - Combined de dict : Remove digraph shortcuts that were in by mistake. - Combined en dict : Set freq of "baton" "batons" "mace" "puff" "puffs" and "tasers" to zero. They are offensive in en_GB. - Combined en_GB dict : Change freq of "il" to 0 and flag it "not a word". Still in the dict as a whitelist entry for "I'll"; for some reason it had freq 99. Add "milk:122" and "practice:143" - Combined fr dict : Add missing words : "Nostradamus:40" "défendais:30" "gmail:50" "générale:140" "hm:0" "hmm:0" "y'en:130" "l'apocalypse:31" "m'épuise:30" "recontacter:80" "t'annonce:30" Set freq of non-word shortcuts for digraphs to 1 instead of 0, allowing to gesture them. - Combined ru dict : Remove a lot of two-character non-words. - Binary de dict : Remove the obsolete "options" header, and add the "dictionary" header. - Binary en dict : Flag "hoe" "hoes" "il" "shel" as non-words. Also drop freq of "il" and "shel" to 0 Add the "locale" header that was missing. - Binary es dict : Add the "dictionary" header. - Binary fr dict : Add the same words as above. Non-word shortcuts were already set to 1. - Binary it dict : Add a "dictionary" header. Also change freq of "Šarapova" from 50 to 37; not sure why it was 50. - Binary pt_BR dict : Add a "dictionary" header. - Binary ru dict : Add a "dictionary" header and remove the same words as above. For all dictionaries : bump the version to 27. Change-Id: I94fe7f8f42b31fdad223085c00a94115e14d2276	2012-11-21 22:03:24 +09:00
Jean Chalard	f5adbb1e1b	Move the emoji dictionaries source under AOSP. Change-Id: Ie870a90d483d9f27aed96fb4b44126315c43922f	2012-10-31 19:22:42 +09:00
Jean Chalard	a424ff06ec	Switch the AOSP word lists to the combined format. This will help with managing the word lists. Bug: 7388859 Change-Id: I89f049569b177d3027fe56d6c67eaca27d44dc7d	2012-10-31 18:52:00 +09:00
Jean Chalard	306e0a800f	Update AOSP dictionaries. Changes : - Add "emoji" - Change the whitelist target of "foo" from "for" to "too" - Fix non-word frequencies to 0 - Fix the freq of common en_US vs en_GB words - Add "connection" to the en_GB dictionary Bug: 7368441 Bug: 7370033 Bug: 7371955 Change-Id: Ib22a97e97b486b05012d5496619557f406c441b9	2012-10-24 16:12:28 +09:00
Jean Chalard	3d83a1648b	Update AOSP dictionaries. Differences : oh 90 -> 105 ooh 54 -> 54 hoy,kinkier,kinkiest,kinkiness,kinkily,kinky -> 0 trst -> remove New whitelist entries (actually old that had not been applied) "berm" -> "been" "foe" -> "for" "hid" -> "his" "thong" -> "thing" French : Add "six" and remove some non-words Bug: 7329149 Bug: 7356297 Change-Id: I55092f0538db8627148b0a314e50eff926c47275	2012-10-18 00:39:16 +09:00
Jean Chalard	b24cda3c0c	Fix the Danish dictionary Human error: this contained "Nederlands" ("Dutch" in Dutch) as the human-readable description in the header. Bug: 7272686 Change-Id: I7a67e7bf1afca6928de7825fb63c5b213e8d7978	2012-10-04 15:52:50 +09:00
Jean Chalard	a44942810d	Update the AOSP dictionaries for the 0-freq review Bug: 7227265 Change-Id: I384f7d76cef67b96b106ddac96e4baf1fa32afd4	2012-10-03 21:15:27 +09:00
Jean Chalard	d0cf96493c	Use all Lexiteria sources and update existing directories. New dictionaries : - Danish - Greek - Finnish - Lithuanian - Latvian - Dutch - Polish - Russian - Slovene - Serbian - Swedish - Turkish Also, compress those files to reduce the footprint in the repository. Also, update and improve English and French dictionaries, and add the ligatures shortcut into the French dictionary. Finally, move the Russian binary dictionary here now that it can at last be open sourced. Bug: 5587752 Bug: 6775251 Bug: 6995793 Bug: 7149666 Change-Id: Iec9831d4dce425a2b5b0657571e4448436610525	2012-09-21 22:07:23 +09:00
Jean Chalard	c278142745	Remove useless backslashes from the whitelist dictionary For some reason, these are necessary for resources, but XML standard does not require them. Change-Id: I7cdaecb6815aa4020e0d453e33be38ff2968df50	2012-08-13 15:53:07 +09:00
Jean Chalard	1d8103ea57	Add a shortcut-format version of the whitelist. This will ultimately replace the whitelist resource, but this change doesn't delete it to avoid removing the functionality temporarily. Bug: 6906525 Change-Id: I576edc42cd2a964b86b7597f1ede1cf6ec8e26c3	2012-08-10 15:51:18 +09:00
Jean Chalard	6f7b1ff468	Update dictionaries. - English : some words caught through regression tests - English : some words externally reported - French : some words externally reported - French : finished review of all accented words Bug: 6726969 Bug: 6730031 Change-Id: I37d0dc310db2c79e03ac7ad452391e92d9b13357	2012-06-29 19:30:01 +09:00
Jean Chalard	401e70535e	Make sure whitelist targets are in the main dictionary Bug: 6680976 Change-Id: Ieddb5eecb813da3a8a515930568e356bc3526386	2012-06-19 02:08:57 +09:00
Jean Chalard	79451e0a70	Update dictionaries. - English dict scrubbed for distractors - EN, FR, IT, DE include improvements from user feedback Bug: 6394369 Change-Id: I9af5415d0b6a5edfea2956657b0fee7906ebb344	2012-06-16 04:25:43 +09:00
Ken Wakasa	46b6004a18	Remove unusable Persian dictionary source data Change-Id: I4e1c3460d0a6f737355a46fd3e7bfecf00aca598	2012-06-15 13:46:38 +09:00
Jean Chalard	73e417b57b	Merge "Improvements to the English dicts" into jb-dev	2012-05-31 03:05:53 -07:00
Jean Chalard	51fb65569a	Improvements to the English dicts Bug: 6394369 Change-Id: I7a4747386adef44e6d1a0c9fec52d09611f1ce10	2012-05-31 18:46:22 +09:00
Jean Chalard	164a47fe20	Add word lists for Czech, Norwegian Bokmål and EU Portuguese Bug: 5779153 Bug: 5587752 Change-Id: I0b91c99097bdc644bb21eea3f3a1e3385281471f	2012-05-31 15:16:01 +09:00
Jean Chalard	90868195e4	Fix spelling of a unused flag Change-Id: I28fc3004da0ead2d373f36541732ac4450d86318	2012-05-28 18:49:37 +09:00
Jean Chalard	3a6efa06e2	Small update to the English dictionaries Demote 'HDTV' Bug: 6563090 Change-Id: I39a1632397569cf79a8d67d93cdff5cf29f82f3a	2012-05-28 13:01:59 +09:00
Jean Chalard	b2acdba809	Remove non-words from the French dictionary. Change-Id: I98c546818aa456a534e833495deb670e79df4104	2012-05-24 17:16:41 +09:00
Jean Chalard	80058c73cb	Update AOSP dictionaries Change-Id: Ia6bb1f9d6df4a9f859f132affc9cb030f14effd9	2012-05-22 16:12:50 +09:00
Jean Chalard	1fc0c71fad	Update French/English dictionaries to the latest version Change-Id: I9c98280f900914d1af22b47019ebc0ad5ab175de	2012-05-18 18:54:37 +09:00
Jean Chalard	8fec807800	Add open-source-able word lists to AOSP. Bug: 6458744 Change-Id: If28aeb7360ee7ec7408f55934ca2a684f032e338	2012-05-17 19:20:04 +09:00
Jean Chalard	4fc97c2c01	Add a note of documentation to the sample word list Change-Id: I95f09da03457933a14b549e04575d566de97dd49	2011-12-14 15:25:31 +09:00
The Android Open Source Project	923bf41f85	auto import from //branches/cupcake/...@138744	2009-03-13 15:11:42 -07:00

38 Commits (ae577ac1dbe94e120715e7ce551c6a07213e8cc8)
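
Many of the messages above embed a per-file summary: a `>>> <path>` marker followed by entries such as `Added: word freq`, `Deleted: word freq`, and `Freq changed: word old -> new`. The sketch below shows one possible way to pull those entries out of a message string. The function name `parse_summary` and the regular expression are assumptions inferred only from the messages in this log, not from any documented tool; other entry kinds that appear here (for example `Shortcut added:` or `Not a word:`) are deliberately ignored.

```python
import re
from collections import defaultdict

# Hypothetical pattern: inferred from the commit messages in this log only.
# Matches "Added: w 10", "Deleted: w 10" and "Freq changed: w 10 -> 20".
ENTRY_RE = re.compile(
    r"(?P<kind>Added|Deleted|Freq changed): "
    r"(?P<word>\S+) (?P<old>\d+)(?: -> (?P<new>\d+))?"
)


def parse_summary(message):
    """Group word-level entries by the file path that follows each '>>>'."""
    changes = defaultdict(list)
    # Text before the first '>>> ' is the commit subject, so skip it.
    for section in message.split(">>> ")[1:]:
        path, _, rest = section.partition(" ")
        for m in ENTRY_RE.finditer(rest):
            # 'new' is only present for "Freq changed" entries.
            new = int(m["new"]) if m["new"] is not None else None
            changes[path].append((m["kind"], m["word"], int(m["old"]), new))
    return dict(changes)


if __name__ == "__main__":
    sample = (
        "Update dictionaries >>> dictionaries/fr_wordlist.combined.gz "
        "Header : date : 1366003217 <=> 1366272255 "
        "Body : Freq changed: cash 80 -> 20"
    )
    for path, entries in parse_summary(sample).items():
        print(path, entries)
```

Files listed without any matching entries (as in commit b40a1ce50b, where the body reads "No differences") simply do not appear in the result.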