I had saved a bunch of messages from my Nokia N73 and wanted a quick summary of their contents. So I wrote a Python script that reads every saved message file and writes the text to a single output file.
import os
import re

path = "msgs"
# Capture everything between "Date:" and "END:VBODY"; DOTALL lets "." span newlines.
searchPattern = re.compile(r'Date:(.*)END:VBODY', re.DOTALL | re.I)

with open("Summary.txt", 'w', encoding='utf-8') as compileOutFile:
    for fname in os.listdir(path):
        fullFname = os.path.join(path, fname)
        # The saved message files are UTF-16 encoded, so decode them explicitly.
        with open(fullFname, 'r', encoding='utf-16') as msgFile:
            fileText = msgFile.read()
        matchObj = searchPattern.search(fileText)
        if matchObj is None:
            continue  # skip files that have no Date ... END:VBODY block
        compileOutFile.write(matchObj.group(1))
        compileOutFile.write("\n")
Thanks to the author of this article: http://www.xiirus.net/articles/article-decode-or-convert-_vmg-files-to-_txt-using-c-980ut.aspx
Without it, it would have been difficult to figure out that the files are UTF-16 encoded. I was breaking my head for hours trying to figure out why my regex was not matching. The regex was fine; the file contents were being read with the wrong encoding, so the pattern never found anything.
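Here is a minimal sketch of what was going wrong. It uses a made-up sample string rather than one of the real message files, and the variable names are only for illustration. The same bytes are decoded twice: once byte-by-byte (which is roughly what happens when the encoding is ignored) and once as UTF-16.

```python
import re

# Hypothetical sample text, shaped like the part of a message the script extracts.
sample = "BEGIN:VBODY\nDate:01.01.2010 10:00:00\nHello there\nEND:VBODY\n"
raw = sample.encode("utf-16")  # the bytes as they would sit on disk, BOM included

pattern = re.compile(r"Date:(.*)END:VBODY", re.DOTALL | re.I)

# Wrong: one byte per character leaves a NUL next to every letter,
# so the literal "Date:" never appears in the text.
wrong_text = raw.decode("latin-1")
wrong_match = pattern.search(wrong_text)

# Right: decoding as UTF-16 restores the original text.
right_text = raw.decode("utf-16")
right_match = pattern.search(right_text)

print(wrong_match)           # no match object
print(right_match.group(1))  # the captured date and message text
```

With the wrong decoding, "Date:" is really "D", NUL, "a", NUL and so on, which is why a perfectly good pattern finds nothing.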
Hope this helps someone.