Converting documents to UTF-8
Posted: 09 Jun 2017, 11:02
I recently needed to convert files created in Clarion/DOS, which use the old IBM-852 encoding, into today's ubiquitous UTF-8. This script works with both Python 2 and Python 3. In Python 2, io.open is the same as the built-in open in Python 3. The first argument is the file name, the second argument is the encoding of the input file. Note that the script overwrites the input file in place.
Code:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
Convert a document to UTF-8 encoding, in place.
The first argument is a file name and the second argument
is the encoding of the input file, e.g. 852, cp1252 etc.
Usage: python convert_to_utf8.py my_table.csv ibm852
Author: Hrvoje T
Last edit: June 2017.
"""
import sys
import io
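try:
    # command line arguments
    input_file = sys.argv[1]
    my_encoding = sys.argv[2]
    # read the whole file using the source encoding
    with io.open(input_file, 'r', encoding=my_encoding) as f:
        data = f.read()
    # overwrite the same file with the text encoded as UTF-8
    with io.open(input_file, 'w', encoding='utf8') as f:
        f.write(data)
except (IndexError, IOError, LookupError, UnicodeError):
    # missing argument, unreadable file, unknown or wrong encoding
    print("Wrong arguments: file name (1) and/or encoding (2)")
    sys.exit(1)
print("OK, successfully converted to UTF-8!")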
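
If you would rather keep the original file untouched, the same read-then-write idea can be wrapped in a function that writes to a separate output file. This is only a rough sketch, not part of the script above: the function name convert_to_utf8 and the dst_path parameter are my own assumptions for illustration. It also passes newline='' to io.open, so the DOS line endings (\r\n) are not translated during the conversion.

```python
import io
import os
import tempfile


def convert_to_utf8(src_path, dst_path, src_encoding):
    """Re-encode src_path from src_encoding into UTF-8 at dst_path.

    Hypothetical helper: returns the number of characters converted.
    """
    # newline='' disables newline translation, so \r\n stays \r\n
    with io.open(src_path, 'r', encoding=src_encoding, newline='') as src:
        text = src.read()
    with io.open(dst_path, 'w', encoding='utf-8', newline='') as dst:
        dst.write(text)
    return len(text)


if __name__ == '__main__':
    # small self-contained demo using a temporary directory
    workdir = tempfile.mkdtemp()
    src = os.path.join(workdir, 'my_table.csv')
    dst = os.path.join(workdir, 'my_table_utf8.csv')
    sample = u'\u010dokolada;\u0161e\u0107er\r\n'
    with io.open(src, 'wb') as f:
        f.write(sample.encode('cp852'))
    convert_to_utf8(src, dst, 'ibm852')
    with io.open(dst, 'rb') as f:
        # compare the raw bytes with the expected UTF-8 encoding
        print(f.read() == sample.encode('utf-8'))
```

Because the source file is only read, a wrong encoding name or a failed write cannot damage the original data; you can check the result and delete or rename the files afterwards.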