
pin.exe: a tool for converting Simplified and Traditional Chinese characters to pinyin

Last edited by happy886rr on 2016-10-19 08:41

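  Many Chinese-to-pinyin tools ship with an oversized dictionary and offer only monotonous output styles. pin.exe instead compresses the dictionary data with pronunciation boundary points: for each syllable it stores only the GB2312 code at which that syllable's characters begin. Lookup is sped up by repeated halving with bit shifts. It supports text files or piped streams, formatted pinyin output, Simplified/Traditional conversion, and case conversion.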
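
To make the boundary-point idea concrete, here is a minimal sketch in C. It uses only the first six boundary values and syllables from the tables posted further down; the names `side_excerpt`, `code_excerpt` and `lookup_excerpt` are made up for this illustration and are not part of pin.exe.

```c
#include <assert.h>
#include <stddef.h>

/* Excerpt only: the first six boundary points and syllables from the posted tables. */
static const unsigned short side_excerpt[] = {0xB0A1, 0xB0A3, 0xB0B0, 0xB0B9, 0xB0BC, 0xB0C5};
static const char *const code_excerpt[]    = {"a", "ai", "an", "ang", "ao", "ba"};
#define EXCERPT_LEN (sizeof side_excerpt / sizeof side_excerpt[0])

/* Hypothetical helper: return the syllable whose boundary point is the largest
   one not exceeding gb_code, or NULL if gb_code lies below the first boundary.
   Codes above the last boundary of this excerpt all map to the last syllable. */
static const char *lookup_excerpt(unsigned short gb_code)
{
    size_t lo = 0, hi = EXCERPT_LEN;   /* the answer is always inside [lo, hi) */

    if (gb_code < side_excerpt[0]) return NULL;
    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        if (side_excerpt[mid] <= gb_code) lo = mid; else hi = mid;
    }
    assert(side_excerpt[lo] <= gb_code);   /* loop invariant */
    return code_excerpt[lo];
}
```

With this excerpt, `lookup_excerpt(0xB0A2)` yields "a" and `lookup_excerpt(0xB0B2)` yields "an": every code between two neighbouring boundary points shares one syllable, which is why one 16-bit value per syllable is enough.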

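[2016/10/19] version 1.3: fixed several bugs, added case conversion, and improved switch recognition. More features in a smaller binary, only 9 KB.
Download: save the image below as pin.7z and extract it.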

Usage
PIN.EXE
__________________________________________________________________________
Chinese characters to pinyin tool, version 1.3
COPYRIGHT@2016~2018 BY HAPPY
Usage:
pin [file|-p] [(-d num&delims)|-l|-u|-s|-t]
__________________________________________________________________________
Options:
-h    show help information
-p    read the input stream from a pipe
-d    formatted output: num is the pinyin style number, delims is the delimiter (string)
-l    convert uppercase to lowercase
-u    convert lowercase to uppercase
-s    convert the stream to Simplified Chinese
-t    convert the stream to Traditional Chinese
__________________________________________________________________________
num:
0    lowercase full pinyin
1    uppercase full pinyin
2    full pinyin with a capitalized first letter
3    first letter only, uppercase
4    first letter only, lowercase
__________________________________________________________________________
Examples:
         pin test.txt                   //convert the Chinese characters in test.txt to pinyin
         pin test.txt -d0"""            //convert to pinyin with " as the delimiter
         pin test.txt -u                //convert the letters in test.txt to uppercase
         dir|pin -p -t                  //convert the output of dir to Traditional Chinese
         echo 簡體轉換|pin -p -s        //force the piped stream to Simplified Chinese
         echo 簡體轉換|pin -p -d3" "    //capitalized first letters, with a space as the delimiter (string)
__________________________________________________________________________
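
The five `num` styles can be illustrated with a small C sketch. This is not the code inside pin.exe; `format_pinyin` is a hypothetical helper that assumes the syllable is plain ASCII, and it writes into a caller-supplied buffer instead of printing.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: write `syllable` into `out` using style 0..4 from the
   help text. Returns the number of characters written, or 0 if the style is
   unknown, the syllable is empty, or the buffer is too small. */
static size_t format_pinyin(const char *syllable, int style, char *out, size_t out_size)
{
    size_t len, n, i;

    assert(syllable != NULL && out != NULL);
    len = strlen(syllable);
    n = (style == 3 || style == 4) ? 1 : len;   /* styles 3 and 4 keep one letter */
    if (style < 0 || style > 4 || len == 0 || out_size < n + 1) return 0;

    for (i = 0; i < n; i++) {
        /* style 1 uppercases everything; styles 2 and 3 only the first letter */
        int upper = (style == 1) || ((style == 2 || style == 3) && i == 0);
        unsigned char c = (unsigned char)syllable[i];
        out[i] = (char)(upper ? toupper(c) : tolower(c));
    }
    out[n] = '\0';
    return n;
}
```

For the syllable "zhong", styles 0 to 4 give "zhong", "ZHONG", "Zhong", "Z" and "z" respectively.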
Core code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <windows.h>
/* Pinyin syllable table */
const char* Pinyin_Code[]={
"a","ai","an","ang","ao",
"ba","bai","ban","bang","bao","bei","ben","beng","bi","bian","biao","bie","bin","bing","bo","bu",
"ca","cai","can","cang","cao","ce","ceng","cha","chai","chan","chang","chao","che","chen","cheng","chi","chong","chou","chu","chuai","chuan","chuang","chui","chun","chuo","ci","cong","cou","cu","cuan","cui","cun","cuo",
"da","dai","dan","dang","dao","de","deng","di","dian","diao","die","ding","diu","dong","dou","du","duan","dui","dun","duo",
"e","en","er",
"fa","fan","fang","fei","fen","feng","fo","fou","fu",
"ga","gai","gan","gang","gao","ge","gei","gen","geng","gong","gou","gu","gua","guai","guan","guang","gui","gun","guo",
"ha","hai","han","hang","hao","he","hei","hen","heng","hong","hou","hu","hua","huai","huan","huang","hui","hun","huo",
"ji","jia","jian","jiang","jiao","jie","jin","jing","jiong","jiu","ju","juan","jue","jun",
"ka","kai","kan","kang","kao","ke","ken","keng","kong","kou","ku","kua","kuai","kuan","kuang","kui","kun","kuo",
"la","lai","lan","lang","lao","le","lei","leng","li","lia","lian","liang","liao","lie","lin","ling","liu","long","lou","lu","lv","luan","lue","lun","luo",
"ma","mai","man","mang","mao","me","mei","men","meng","mi","mian","miao","mie","min","ming","miu","mo","mou","mu",
"na","nai","nan","nang","nao","ne","nei","nen","neng","ni","nian","niang","niao","nie","nin","ning","niu","nong","nu","nv","nuan","nue","nuo",
"o","ou",
"pa","pai","pan","pang","pao","pei","pen","peng","pi","pian","piao","pie","pin","ping","po","pu",
"qi","qia","qian","qiang","qiao","qie","qin","qing","qiong","qiu","qu","quan","que","qun",
"ran","rang","rao","re","ren","reng","ri","rong","rou","ru","ruan","rui","run","ruo",
"sa","sai","san","sang","sao","se","sen","seng","sha","shai","shan","shang","shao","she","shen","sheng","shi","shou","shu","shua","shuai","shuan","shuang","shui","shun","shuo","si","song","sou","su","suan","sui","sun","suo",
"ta","tai","tan","tang","tao","te","teng","ti","tian","tiao","tie","ting","tong","tou","tu","tuan","tui","tun","tuo",
"wa","wai","wan","wang","wei","wen","weng","wo","wu",
"xi","xia","xian","xiang","xiao","xie","xin","xing","xiong","xiu","xu","xuan","xue","xun",
"ya","yan","yang","yao","ye","yi","yin","ying","yo","yong","you","yu","yuan","yue","yun",
"za","zai","zan","zang","zao","ze","zei","zen","zeng","zha","zhai","zhan","zhang","zhao","zhe","zhen","zheng","zhi","zhong","zhou","zhu","zhua","zhuai","zhuan","zhuang","zhui","zhun","zhuo","zi","zong","zou","zu","zuan","zui","zun","zuo"
};
/* Pronunciation boundary points: Pinyin_Side[k] is the first GB2312 code read as Pinyin_Code[k] */
const unsigned short Pinyin_Side[]={
0xB0A1,0xB0A3,0xB0B0,0xB0B9,0xB0BC,0xB0C5,0xB0D7,0xB0DF,0xB0EE,0xB0FA,
0xB1AD,0xB1BC,0xB1C0,0xB1C6,0xB1DE,0xB1EA,0xB1EE,0xB1F2,0xB1F8,
0xB2A3,0xB2B8,0xB2C1,0xB2C2,0xB2CD,0xB2D4,0xB2D9,0xB2DE,0xB2E3,0xB2E5,0xB2F0,0xB2F3,0xB2FD,
0xB3AC,0xB3B5,0xB3BB,0xB3C5,0xB3D4,0xB3E4,0xB3E9,0xB3F5,
0xB4A7,0xB4A8,0xB4AF,0xB4B5,0xB4BA,0xB4C1,0xB4C3,0xB4CF,0xB4D5,0xB4D6,0xB4DA,0xB4DD,0xB4E5,0xB4E8,0xB4EE,0xB4F4,
0xB5A2,0xB5B1,0xB5B6,0xB5C2,0xB5C5,0xB5CC,0xB5DF,0xB5EF,0xB5F8,
0xB6A1,0xB6AA,0xB6AB,0xB6B5,0xB6BC,0xB6CB,0xB6D1,0xB6D5,0xB6DE,0xB6EA,0xB6F7,0xB6F8,
0xB7A2,0xB7AA,0xB7BB,0xB7C6,0xB7D2,0xB7E1,0xB7F0,0xB7F1,0xB7F2,
0xB8C1,0xB8C3,0xB8C9,0xB8D4,0xB8DD,0xB8E7,0xB8F8,0xB8F9,0xB8FB,
0xB9A4,0xB9B3,0xB9BC,0xB9CE,0xB9D4,0xB9D7,0xB9E2,0xB9E5,0xB9F5,0xB9F8,0xB9FE,
0xBAA1,0xBAA8,0xBABB,0xBABE,0xBAC7,0xBAD9,0xBADB,0xBADF,0xBAE4,0xBAED,0xBAF4,
0xBBA8,0xBBB1,0xBBB6,0xBBC4,0xBBD2,0xBBE7,0xBBED,0xBBF7,
0xBCCE,0xBCDF,
0xBDA9,0xBDB6,0xBDD2,0xBDED,
0xBEA3,0xBEBC,0xBEBE,0xBECF,0xBEE8,0xBEEF,0xBEF9,
0xBFA6,0xBFAA,0xBFAF,0xBFB5,0xBFBC,0xBFC0,0xBFCF,0xBFD3,0xBFD5,0xBFD9,0xBFDD,0xBFE4,0xBFE9,0xBFED,0xBFEF,0xBFF7,
0xC0A4,0xC0A8,0xC0AC,0xC0B3,0xC0B6,0xC0C5,0xC0CC,0xC0D5,0xC0D7,0xC0E2,0xC0E5,
0xC1A9,0xC1AA,0xC1B8,0xC1C3,0xC1D0,0xC1D5,0xC1E1,0xC1EF,0xC1FA,
0xC2A5,0xC2AB,0xC2BF,0xC2CD,0xC2D3,0xC2D5,0xC2DC,0xC2E8,0xC2F1,0xC2F7,
0xC3A2,0xC3A8,0xC3B4,0xC3B5,0xC3C5,0xC3C8,0xC3D0,0xC3DE,0xC3E7,0xC3EF,0xC3F1,0xC3F7,0xC3FD,0xC3FE,
0xC4B1,0xC4B4,0xC4C3,0xC4CA,0xC4CF,0xC4D2,0xC4D3,0xC4D8,0xC4D9,0xC4DB,0xC4DC,0xC4DD,0xC4E8,0xC4EF,0xC4F1,0xC4F3,0xC4FA,0xC4FB,
0xC5A3,0xC5A7,0xC5AB,0xC5AE,0xC5AF,0xC5B0,0xC5B2,0xC5B6,0xC5B7,0xC5BE,0xC5C4,0xC5CA,0xC5D2,0xC5D7,0xC5DE,0xC5E7,0xC5E9,0xC5F7,
0xC6AA,0xC6AE,0xC6B2,0xC6B4,0xC6B9,0xC6C2,0xC6CB,0xC6DA,0xC6FE,
0xC7A3,0xC7B9,0xC7C1,0xC7D0,0xC7D5,0xC7E0,0xC7ED,0xC7EF,0xC7F7,
0xC8A6,0xC8B1,0xC8B9,0xC8BB,0xC8BF,0xC8C4,0xC8C7,0xC8C9,0xC8D3,0xC8D5,0xC8D6,0xC8E0,0xC8E3,0xC8ED,0xC8EF,0xC8F2,0xC8F4,0xC8F6,0xC8F9,0xC8FD,
0xC9A3,0xC9A6,0xC9AA,0xC9AD,0xC9AE,0xC9AF,0xC9B8,0xC9BA,0xC9CA,0xC9D2,0xC9DD,0xC9E9,0xC9F9,
0xCAA6,0xCAD5,0xCADF,
0xCBA2,0xCBA4,0xCBA8,0xCBAA,0xCBAD,0xCBB1,0xCBB5,0xCBB9,0xCBC9,0xCBD1,0xCBD4,0xCBE1,0xCBE4,0xCBEF,0xCBF2,0xCBFA,
0xCCA5,0xCCAE,0xCCC0,0xCCCD,0xCCD8,0xCCD9,0xCCDD,0xCCEC,0xCCF4,0xCCF9,0xCCFC,
0xCDA8,0xCDB5,0xCDB9,0xCDC4,0xCDC6,0xCDCC,0xCDCF,0xCDDA,0xCDE1,0xCDE3,0xCDF4,0xCDFE,
0xCEC1,0xCECB,0xCECE,0xCED7,0xCEF4,
0xCFB9,0xCFC6,0xCFE0,0xCFF4,
0xD0A8,0xD0BD,0xD0C7,0xD0D6,0xD0DD,0xD0E6,0xD0F9,
0xD1A5,0xD1AB,0xD1B9,0xD1C9,0xD1EA,0xD1FB,
0xD2AC,0xD2BB,0xD2F0,
0xD3A2,0xD3B4,0xD3B5,0xD3C4,0xD3D9,
0xD4A7,0xD4BB,0xD4C5,0xD4D1,0xD4D4,0xD4DB,0xD4DF,0xD4E2,0xD4F0,0xD4F4,0xD4F5,0xD4F6,0xD4FA,
0xD5AA,0xD5B0,0xD5C1,0xD5D0,0xD5DA,0xD5E4,0xD5F4,
0xD6A5,0xD6D0,0xD6DB,0xD6E9,
0xD7A5,0xD7A7,0xD7A8,0xD7AE,0xD7B5,0xD7BB,0xD7BD,0xD7C8,0xD7D7,0xD7DE,0xD7E2,0xD7EA,0xD7EC,0xD7F0,0xD7F2
};
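/* Buffer size */
#define BUFF_SIZE 4096
/* Global state: flag[0] is 1 to read from a file and 0 to read from a pipe;
   flag[1] selects the action (1 pinyin, 2 lowercase, 3 uppercase, 4 Simplified, 5 Traditional) */
int flag[2]={1,1};
/* delims[0] is the style digit, the rest is the delimiter string (default: style 0, one space) */
char* delims="0 ";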
/* Pinyin lookup: print the syllable for GB2312 code S in the style selected by delims[0] */
static inline void Pinyin(unsigned short S)
{
    unsigned short i, N, L=0, R=395;   /* R is the last index of Pinyin_Side */
    const char* tmp;
    /* Binary search for the last boundary point that does not exceed S */
    while(R-L>1){
        N=(R+L)>>1;
        if(Pinyin_Side[N]==S){R=N;break;}
        if(Pinyin_Side[N] <S){L=N;}else{R=N;}
    }
    tmp=Pinyin_Code[(Pinyin_Side[R]<=S)?R:L];
    switch(delims[0]){
    case '0':   /* lowercase full pinyin */
        fprintf(stdout,"%s",tmp);
        break;
    case '1':   /* uppercase full pinyin; the table is all lowercase ASCII, so -32 uppercases a letter */
        for(i=0; tmp[i]!='\0'; i++){
            fprintf(stdout,"%c",tmp[i]-32);
        }
        break;
    case '2':   /* full pinyin with a capitalized first letter */
        fprintf(stdout,"%c",tmp[0]-32);
        for(i=1; tmp[i]!='\0'; i++){
            fprintf(stdout,"%c",tmp[i]);
        }
        break;
    case '3':   /* first letter only, uppercase */
        fprintf(stdout,"%c",tmp[0]-32);
        break;
    case '4':   /* first letter only, lowercase */
        fprintf(stdout,"%c",tmp[0]);
        break;
    }
}
/* Convert a string to pinyin */
static inline void String2Pinyin(const unsigned char* Str)
{
    int i=0,L=(int)strlen((const char*)Str);
    while(i<L){
        if(Str[i]<0x80){
            /* ASCII byte: copy it through unchanged */
            fprintf(stdout,"%c",Str[i]);
        }else if((0xB0<=Str[i]) && (Str[i]<=0xD7) && (0xA1<=Str[i+1]) && (Str[i+1]<=0xFE)){
            /* GB2312 level-1 character: print a delimiter after ASCII text, then the pinyin and a delimiter */
            if(i>0 && Str[i-1]<0x80){fprintf(stdout,"%s",(delims+1));}
            Pinyin(Str[i]<<8|Str[i+1]);
            fprintf(stdout,"%s",(delims+1));
            i++;
        }else{
            /* Any other double-byte character: copy both bytes, but never print the terminating NUL */
            fputc(Str[i], stdout);
            if(i+1<L){fputc(Str[i+1], stdout);}
            i++;
        }
        i++;
    }
}
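/* Simplified/Traditional conversion; the caller owns the returned buffer and should free() it */
static inline char* TraTTSim(const char* Str, int FLAG)
{
    /* First call: query the required size, which includes the terminating NUL because the input length is -1 */
    int L=LCMapString(LOCALE_SYSTEM_DEFAULT, FLAG, Str, -1, NULL, 0);
    char* Out=(char *)calloc(L+1, sizeof(char));
    if(Out==NULL){return NULL;}
    /* Second call: perform the conversion into the new buffer */
    LCMapString(LOCALE_SYSTEM_DEFAULT, FLAG, Str, -1, Out, L);
    return Out;
}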
/* Help text */
void Help(FILE* stream, int Exit_Code)
{
    fprintf(stream,
        "--------------------------------------------------------------------------\n"
        "CHINESE CHARACTERS TO PINYIN, VERSION 1.3\n"
        "COPYRIGHT@2016~2018 BY HAPPY\n"
        "USAGE:\n"
        " pin [file|-p] [(-d num&delims)|-l|-u|-s|-t]\n"
        "--------------------------------------------------------------------------\n"
        "OPTIONS:\n"
        " -h    show pin's help information\n"
        " -p    read stream from pipe\n"
        " -d    pinyin format and delims\n"
        " -l    to Lowercase\n"
        " -u    to Uppercase\n"
        " -s    to Simplified\n"
        " -t    to Traditional\n"
        "--------------------------------------------------------------------------\n"
        "The value of 'num' is zero to four,the corresponding actions are:\n"
        " 0    lowercase full pinyin;\n"
        " 1    uppercase full pinyin;\n"
        " 2    initial capitalization of full pinyin;\n"
        " 3    pinyin initials capitalized;\n"
        " 4    pinyin initials lowercase.\n"
        "--------------------------------------------------------------------------\n"
        "                                                       10/18/2016\n"
    );
    exit(Exit_Code);
}
/* Process the stream line by line */
void DisplayLine(FILE* stream)
{
    char* Line=(char *)malloc(BUFF_SIZE*sizeof(char));
    char* Out;
    if(Line==NULL){return;}
    /* fgets returns NULL at end of file, so no separate feof() test is needed */
    while(fgets(Line, BUFF_SIZE, stream)!=NULL){
        switch(flag[1]){
        case 1:   /* default: convert to Simplified first, then to pinyin */
            Out=TraTTSim(Line,LCMAP_SIMPLIFIED_CHINESE);
            if(Out!=NULL){String2Pinyin((const unsigned char*)Out);free(Out);}
            break;
        case 2:   /* -l */
            fputs(strlwr(Line), stdout);
            break;
        case 3:   /* -u */
            fputs(strupr(Line), stdout);
            break;
        case 4:   /* -s */
            Out=TraTTSim(Line,LCMAP_SIMPLIFIED_CHINESE);
            if(Out!=NULL){fputs(Out, stdout);free(Out);}
            break;
        case 5:   /* -t */
            Out=TraTTSim(Line,LCMAP_TRADITIONAL_CHINESE);
            if(Out!=NULL){fputs(Out, stdout);free(Out);}
            break;
        }
    }
    free(Line);
}
/* Program entry point */
int main(int argc, char** argv)
{
    /* Without arguments argv[1] is NULL, so show the help instead of dereferencing it */
    if(argc<2){
        Help(stderr, 1);
        return 1;
    }
    if(argc<=3 && !strcasecmp(argv[1],"-p")){
        flag[0]=0;
    }
    if(argc==3 && argv[2][0]=='-'){
        switch(argv[2][1]){
        case 'H':
        case 'h':
            Help(stdout, 0);
            return 0;
        case 'P':
        case 'p':
            flag[0]=0;
            break;
        case 'D':
        case 'd':
            delims=(argv[2]+2);
            /* A bare -d has nothing after it: fall back to the defaults rather than overwrite the NUL */
            if(delims[0]=='\0'){delims="0 ";}
            else if((delims[0]<'0') || ('4'<delims[0])){delims[0]='0';}
            break;
        case 'L':
        case 'l':
            flag[1]=2;
            break;
        case 'U':
        case 'u':
            flag[1]=3;
            break;
        case 'S':
        case 's':
            flag[1]=4;
            break;
        case 'T':
        case 't':
            flag[1]=5;
            break;
        default:
            Help(stderr, 1);
            return 1;
        }
    }
    /* File mode: argv[1] is the input file */
    if(flag[0]==1){
        FILE* fp=fopen(argv[1], "rb");
        if(fp==NULL){
            Help(stderr, 1);
            return 1;
        }
        DisplayLine(fp);
        fclose(fp);
        return 0;
    }
    /* Pipe mode: read from standard input */
    if(flag[0]==0){
        DisplayLine(stdin);
    }
    return 0;
}

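You could consider managing the code with git and putting it on GitHub.
Then when the code is updated or modified you don't have to touch any attachments; just post the link.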

TOP

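Last edited by 523066680 on 2016-10-18 10:09

Hmm... how should I put it. I think user experience matters more, and the size wouldn't really get that big anyway. Also, some characters are read differently in different words, so I'd rather have a more complete dictionary.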

TOP

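Last edited by happy886rr on 2016-10-18 10:33

Reply to #3 523066680
The experience is the same: there are only some 400 pinyin syllables in total, and all of them are included. The only issue is that a character can be read differently in different words. I have found almost no implementation of that kind of smart disambiguation; it would need a phrase database.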

TOP

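Last edited by 523066680 on 2016-10-18 11:18

Personally, I would still like tone marks.
I found an offline Xinhua Dictionary app that has the pinyin for all kinds of idioms and words, but I'm not good at reverse engineering. I'll keep exploring when I have time.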

TOP

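Perl is well suited to this kind of search. In C you need a regex library to match the possible word combinations before you can output the standard contextual pinyin.
It needs sentence and sense segmentation. Take surnames: for 单 you have to decide whether it is used as a surname in the sentence (shan) or has its standard reading (dan). Even regexes struggle to match that; it needs AI lexical analysis. Of course, carrying a big dictionary and looking words up in a table also solves it, but efficiency is a concern. In short there are two approaches, and I hope someone capable builds a context-aware Chinese-to-pinyin converter.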
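
On the efficiency concern about table lookup, here is a minimal sketch in C, assuming the phrases are held in memory as UTF-8 strings. The four entries, the `PhraseEntry` type, and the function names are illustrative only. Sorting the table once lets each lookup use the standard `bsearch`, so the cost grows with the logarithm of the table size.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical entry: a UTF-8 phrase and its reading in that phrase. */
typedef struct { const char *phrase; const char *pinyin; } PhraseEntry;

/* Illustrative entries only; a real dictionary would be far larger. */
static PhraseEntry phrase_table[] = {
    {"参加", "can jia"},
    {"人参", "ren shen"},
    {"重要", "zhong yao"},
    {"重庆", "chong qing"},
};
#define PHRASE_COUNT (sizeof phrase_table / sizeof phrase_table[0])

/* Order entries by the raw bytes of the phrase. */
static int compare_entries(const void *a, const void *b)
{
    return strcmp(((const PhraseEntry *)a)->phrase, ((const PhraseEntry *)b)->phrase);
}

/* Sort once at startup so that every later lookup can be a binary search. */
static void phrase_table_init(void)
{
    qsort(phrase_table, PHRASE_COUNT, sizeof phrase_table[0], compare_entries);
}

/* Return the reading of an exact phrase, or NULL if it is not in the table. */
static const char *phrase_lookup(const char *phrase)
{
    PhraseEntry key;
    const PhraseEntry *hit;

    assert(phrase != NULL);
    key.phrase = phrase;
    key.pinyin = NULL;
    hit = bsearch(&key, phrase_table, PHRASE_COUNT, sizeof phrase_table[0], compare_entries);
    return hit ? hit->pinyin : NULL;
}
```

After `phrase_table_init()`, `phrase_lookup("人参")` returns "ren shen" and `phrase_lookup("参加")` returns "can jia". A table like this cannot settle the surname case (单 as shan), because that depends on the role of the character in the sentence rather than on a fixed phrase.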

TOP

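Last edited by 523066680 on 2016-10-18 12:08

Oh, I wouldn't dare think about semantics yet. What I described earlier was only accurate pinyin for specific words.
My focus is mainly on finding a ready-made dictionary with plenty of material, yeah.

Lists of polyphonic characters, probably not complete: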
http://www.chazidian.com/info/8/

http://www.fuhaoku.com/duoyinzi/

TOP

Reply to #7 523066680

人参 (ginseng) and 参加 (to take part): neither of your two links lists this polyphonic character.

TOP

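Last edited by aa77dd@163.com on 2016-10-18 12:58

Try segmenting this sentence:

有些人参加了。 (It reads as 有些人 / 参加了, "some people took part", yet it also contains the string 人参.)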
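
A small C sketch shows why this sentence is a trap for simple phrase matching. It assumes UTF-8 input and a two-entry phrase list, and all names are invented for the example. Scanning from left to right and always taking the longest known phrase finds 人参 first, so 参加 is never seen.

```c
#include <assert.h>
#include <string.h>

/* Illustrative phrase list with the two words discussed in this thread. */
static const char *const phrases[] = {"人参", "参加"};
#define PHRASE_COUNT (sizeof phrases / sizeof phrases[0])

/* Byte length of the UTF-8 character starting at s (assumes valid UTF-8). */
static size_t utf8_len(const char *s)
{
    unsigned char c = (unsigned char)*s;
    if (c < 0x80) return 1;
    if ((c & 0xE0) == 0xC0) return 2;
    if ((c & 0xF0) == 0xE0) return 3;
    return 4;
}

/* Greedy left-to-right segmentation: at each position take the longest known
   phrase that starts there, otherwise a single character. Tokens are joined
   with '/'. Output is truncated at a token boundary if `out` is too small. */
static void segment_greedy(const char *text, char *out, size_t out_size)
{
    size_t used = 0;

    assert(text != NULL && out != NULL && out_size > 0);
    out[0] = '\0';
    while (*text) {
        size_t remaining = strlen(text);
        size_t take = utf8_len(text), i;
        if (take > remaining) take = remaining;      /* guard against truncated input */
        for (i = 0; i < PHRASE_COUNT; i++) {
            size_t n = strlen(phrases[i]);
            if (n > take && strncmp(text, phrases[i], n) == 0) take = n;
        }
        if (used + take + 2 > out_size) break;       /* room for '/', the token and NUL */
        if (used > 0) out[used++] = '/';
        memcpy(out + used, text, take);
        used += take;
        out[used] = '\0';
        text += take;
    }
}
```

For 有些人参加了 this produces 有/些/人参/加/了, which would give 参 the reading shen instead of can. Picking the intended split 有些人 / 参加了 needs more than a longest-match scan.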

Official website for Putonghua pronunciation standards (普通话语音规范); the content is disappointing:
http://www.pthyygf.org/guifanbia ... /2011-11-26/20.html

Table of Putonghua words with variant pronunciations (普通话异读词审音表):

http://wenku.baidu.com/link?url= ... -d2Qr8e5b8efNBkW###

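The two above are the mainland standard documents, and surprisingly neither 参加 nor 人参 can be found in them.

The Taiwan standard document below is available as a real (non-scanned) PDF, and both 参加 and 人参 are in it, though it is in Traditional Chinese.

Taiwan:
国语一字多音审订表(初稿) (list of characters with multiple readings, draft), announced on December 12 of ROC year 101 (2012), from the Ministry of Education language achievements site (教育部语文成果网)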

http://language.moe.gov.tw/files ... 8%9D%E7%A8%BF_1.pdf

TOP

Added to the "KFC Deluxe Lunch" tool collection:
http://www.bathome.net/s/tool/?key=pin

But if the system's regional language setting is not Chinese [Pinyin], this approach to pinyin conversion won't work, will it?

TOP

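Reply to #10 CrLf
No, it will. I took that into account when writing it. pin.exe uses a variant algorithm and is not subject to any external restrictions.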

TOP

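Reply to #11 happy886rr

    I still don't get it. There doesn't seem to be any code-page handling, so how can a pure magnitude comparison stay compatible with different language environments?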

TOP

Reply to #12 CrLf
It is not a magnitude comparison.

TOP

Reply to #13 happy886rr

    I still can't see it. Isn't this a magnitude comparison:
...
if(Pinyin_Side[N] <S){L=N;}else{R=N;}
...
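
For reference, a minimal C sketch of what the value `S` in that comparison is made of, going only by the posted source. The two helper names are invented here. The range test and the byte combination mirror `String2Pinyin`, so the comparison operates on the numeric GB2312 code of the input bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Same range test as in the posted String2Pinyin: lead byte 0xB0..0xD7 and
   trail byte 0xA1..0xFE, i.e. the GB2312 level-1 character area. */
static int is_gb2312_level1(const unsigned char *bytes, size_t len)
{
    assert(bytes != NULL);
    return len >= 2
        && bytes[0] >= 0xB0 && bytes[0] <= 0xD7
        && bytes[1] >= 0xA1 && bytes[1] <= 0xFE;
}

/* Combine the two bytes into the 16-bit value that is compared with the
   boundary table (Str[i]<<8|Str[i+1] in the posted code). */
static unsigned short gb2312_key(const unsigned char *bytes)
{
    assert(bytes != NULL);
    return (unsigned short)((bytes[0] << 8) | bytes[1]);
}
```

The character 啊 is the bytes B0 A1 in GB2312, which pass the test and give the key 0xB0A1, the first boundary point. The same character in UTF-8 is E5 95 8A, whose lead byte falls outside the range, so it would be copied through unconverted. Judging from the source alone, then, the lookup depends on the input being GB2312/GBK encoded. Whether the `LCMapString` call with `LOCALE_SYSTEM_DEFAULT` adds a dependence on the system locale is not something this sketch settles.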

TOP
