This post was last edited by 523066680 on 2013-6-10 19:27
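One of my phones is about to die, and the contacts in its Google address book and on the SIM card do not quite match.
To be safe, I first exported everything to a .vcf file and copied it to my computer. When I opened it, the name fields were all UTF-8 bytes written in quoted-printable form (=E5=85=B0 and so on), which is unreadable.
So I tried writing a Python script that reads the contact entries and converts the UTF-8 parts back into text.
Below is an excerpt from google_contact.txt.vcf: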
- BEGIN:VCARD
- VERSION:2.1
- N;CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:=E5=85=B0=E6=A0=BC;=E8=82=96;;;
- FN;CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:=E5=85=B0=E6=A0=BC=20=E8=82=96
- TEL;VOICE;PREF:1-352-013-0000
- END:VCARD
- BEGIN:VCARD
- VERSION:2.1
- N;CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:;=E9=98=BF=E5=87=AF=32=30=30=39;;;
- FN;CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:=E9=98=BF=E5=87=AF=32=30=30=39
- TEL;VOICE;PREF:1-234-567-8900
- END:VCARD
- BEGIN:VCARD
- VERSION:2.1
- N;CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:;=E7=A7=91=E6=8A=80=E6=B3=A2;;;
- FN;CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:=E7=A7=91=E6=8A=80=E6=B3=A2
- TEL;VOICE;PREF:1-328-354-0000
- END:VCARD
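Each `=XX` group in the excerpt above is one byte of the UTF-8 name, written as a quoted-printable escape. As a quick sanity check, here is a minimal sketch in Python 3 (separate from the script in this post, which is Python 2) that decodes one value with the standard `quopri` module. The helper name `decode_qp_value` is made up for illustration, and the sketch assumes the value fits on one line with no soft line breaks.

```python
import quopri

def decode_qp_value(value, charset="utf-8"):
    """Decode the quoted-printable value part of a vCard 2.1 property."""
    # "=E7=A7=91" becomes the raw bytes b"\xe7\xa7\x91"
    raw = quopri.decodestring(value.encode("ascii"))
    # Interpret the raw bytes using the charset named in the property
    return raw.decode(charset)

line = "FN;CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:=E7=A7=91=E6=8A=80=E6=B3=A2"
# Everything before the first ":" is the property name and parameters
params, _, value = line.partition(":")
name = decode_qp_value(value)  # the third contact's name from the excerpt
```

The same call works on the `N` lines as well; the `;` separators between name components are plain ASCII and pass through unchanged.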
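Code:
- # -*- coding: cp936 -*-
- '''
- code by vicyang
- paktcmail@gmail.com
-
- 2013-03-01
- Store the contacts in dicts. Three cases to handle:
- a phone number with no name, a name with no phone number, and contacts that share the same name
- '''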
-
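- def utf2unichr(brr):
-     # brr: three 8-bit binary strings forming one 3-byte UTF-8 sequence
-     binstr = brr[0][4:]                  # drop the "1110" lead-byte prefix
-     for i in range(1, len(brr)):
-         binstr = binstr + brr[i][2:]     # drop the "10" continuation prefix
-     return unichr(int(binstr, 2))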
-
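- def analyse(arr):
-     # arr: list of two-digit hex strings, one per byte
-     for i in range(len(arr)):
-         arr[i] = bin(int(arr[i], 16))[2:]        # [2:] strips the "0b" prefix
-     dst = ""
-     for i in range(len(arr)):
-         if int(arr[i], 2) < 0x80:                # ASCII char
-             dst = dst + chr(int(arr[i], 2))
-         elif arr[i][:4] == '1110':               # lead byte of a 3-byte sequence
-             dst = dst + utf2unichr(arr[i:i+3])   # bytes i, i+1, i+2
-         elif arr[i][:2] == '10':                 # continuation byte, already consumed
-             pass
-         else:                                    # 2- and 4-byte sequences are not handled
-             print "something wrong,%s" % arr[i]
-     return dst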
-
-
- import re
- spec="FN;CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:="
- f=open("google_contact.txt",'r')
-
- cnt=-1
- array=[]
-
- for i in f:
-
-     if "BEGIN:" in i:                  # start of a new vCard
-         cnt += 1
-         array.append({})
-
-     if re.search("^FN", i):
-         s = i.strip("\n")
-         if "FN;" in i:                 # quoted-printable UTF-8 name
-             s = s.replace(spec, "")
-             s = s.split("=")
-             name = analyse(s)
-         elif "FN:" in i:               # plain ASCII name
-             name = s.replace("FN:", "")
-         else:
-             print "something wrong!%s" % i
-             continue                   # no usable name on this line
-
-         array[cnt]['name'] = name
-
-     if "TEL;" in i:
-         s = i.strip("\n")
-         s = re.sub("TEL;.*;PREF:", "", s)
-         PhoneCode = re.sub("-", "", s)   # remove the dashes from the number
-         array[cnt]["tel"] = PhoneCode
-
- for i in range(len(array)):
-     if "name" in array[i]:
-         print array[i]['name']
-     if "tel" in array[i]:
-         print array[i]['tel']
-
- raw_input()   # keep the console window open until Enter is pressed
- f.close()
Output:
兰格 肖
13520130000
阿凯2009
12345678900
科技波
13283540000
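The `analyse` and `utf2unichr` functions above work on binary strings and only recognize three-byte sequences, which is enough for these Chinese names. For comparison, here is a sketch of the same idea using integer bit masks and extended to 1 to 4 byte sequences. It is written in Python 3, the name `decode_utf8_hex` is hypothetical, and it does not reject overlong encodings or surrogates, so treat it as an illustration rather than a validating decoder.

```python
def decode_utf8_hex(hex_bytes):
    """Decode a list of two-digit hex strings (one per UTF-8 byte) into text."""
    data = [int(h, 16) for h in hex_bytes]
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:                  # 0xxxxxxx: single-byte ASCII
            code, extra = b, 0
        elif b >> 5 == 0b110:         # 110xxxxx: lead byte of a 2-byte sequence
            code, extra = b & 0x1F, 1
        elif b >> 4 == 0b1110:        # 1110xxxx: lead byte of a 3-byte sequence
            code, extra = b & 0x0F, 2
        elif b >> 3 == 0b11110:       # 11110xxx: lead byte of a 4-byte sequence
            code, extra = b & 0x07, 3
        else:                         # stray continuation byte or invalid lead byte
            raise ValueError("unexpected byte %02X at position %d" % (b, i))
        tail = data[i + 1:i + 1 + extra]
        # Every following byte must look like 10xxxxxx
        if len(tail) != extra or any(t >> 6 != 0b10 for t in tail):
            raise ValueError("truncated or malformed sequence at position %d" % i)
        for t in tail:
            code = (code << 6) | (t & 0x3F)   # append the 6 payload bits
        out.append(chr(code))
        i += 1 + extra
    return "".join(out)

# The FN value of the second contact, split the same way the script splits it
name = decode_utf8_hex("E9=98=BF=E5=87=AF=32=30=30=39".split("="))
```

Skipping ahead by `1 + extra` replaces the `elif arr[i][:2]=='10': pass` branch of the original, since continuation bytes are consumed together with their lead byte.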
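I had originally meant to turn this into a tool that compares two contact lists and reports the differences, but I shelved that idea.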
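For anyone who wants to pick that up, here is one possible shape for the comparison, as a rough sketch in Python 3. It assumes both address books have already been parsed into lists of `{'name': ..., 'tel': ...}` dicts like the `array` built by the script. The function name `diff_contacts` and the choice to key on the phone number are my assumptions; the post does not say how the comparison was meant to work.

```python
def diff_contacts(book_a, book_b):
    """Compare two contact lists keyed by phone number.

    Returns (only_in_a, only_in_b, renamed). Entries without a 'tel'
    key are ignored in this sketch.
    """
    a = {c["tel"]: c.get("name", "") for c in book_a if "tel" in c}
    b = {c["tel"]: c.get("name", "") for c in book_b if "tel" in c}
    # Numbers that appear in only one of the two books
    only_in_a = {tel: a[tel] for tel in a.keys() - b.keys()}
    only_in_b = {tel: b[tel] for tel in b.keys() - a.keys()}
    # Numbers in both books whose names differ: tel -> (name in a, name in b)
    renamed = {tel: (a[tel], b[tel])
               for tel in a.keys() & b.keys() if a[tel] != b[tel]}
    return only_in_a, only_in_b, renamed

# Made-up example data in the same shape as the script's `array`
google = [{"name": "Alice", "tel": "13520130000"},
          {"name": "Bob", "tel": "12345678900"}]
sim = [{"name": "Alice L", "tel": "13520130000"},
       {"name": "Carol", "tel": "13283540000"}]
only_google, only_sim, renamed = diff_contacts(google, sim)
```

Keying on the number sidesteps the same-name case mentioned in the script's docstring, at the cost of ignoring contacts that have a name but no number.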