ANSI, UTF & BOM ???


GPL is an old application and likes its configuration files to be in ANSI format. Most of us are aware that the modern way to save text files is UTF-8, which is a no-go for GPL.

Since I have seen some problems in the past with files saved as UTF-8, I decided to give it a go and check it 🙂

Whoooooa! A mistake? An adventure? 

First: there’s UTF-8 and UTF-8. Strange? Yep. Some programs save UTF-8 with special header bytes called a BOM (byte order mark, the three bytes EF BB BF) and some don’t. So if you find a file with those bytes at the front, it’s easy to say we have UTF-8.
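Here is a minimal Python sketch of that BOM check. It is only an illustration, not the routine I use for GPL files; the function name `has_utf8_bom` and the sample setting `some_setting = 1` are made up for the example.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the data starts with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


# Hypothetical sample line, not a real GPL parameter.
sample = "some_setting = 1\n"

with_bom = sample.encode("utf-8-sig")  # "utf-8-sig" writes the BOM first
without_bom = sample.encode("utf-8")   # plain UTF-8, no BOM

print(has_utf8_bom(with_bom))
print(has_utf8_bom(without_bom))
```

To check a real file, read it in binary mode (`open(path, "rb").read()`) and pass the bytes to the function.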

Easy, right? But now the tricky part. We have a text file and save it as UTF-8 without BOM. Now you won’t find the header bytes, so where is the difference?

If you are American and the text file contains only normal GPL parameters and comments (that is, only plain ASCII characters), chances are 99.999999999% that this text file will be  ……………………………………… ANSI.

What? ANSI? But I converted the text file to UTF-8 without BOM in Notepad++. Can’t be.

Yes, it can. The reason is that ANSI and UTF-8 store plain ASCII characters as exactly the same bytes. The difference only shows up in extended characters like öäü. Only then do ANSI and UTF-8 differ, and only then can you detect the difference. But this is not easy. I will test my routine later.
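A small Python sketch makes this visible. It assumes that "ANSI" means the Windows-1252 code page (`cp1252`), which is the usual case on Western European and US Windows but not everywhere. The function `guess_encoding` is a hypothetical heuristic for illustration, not my actual routine, and it can guess wrong on unusual input.

```python
import codecs

ANSI = "cp1252"  # assumption: "ANSI" = Windows-1252

# Plain ASCII text: both encodings give identical bytes.
plain = "; a comment\nsome_setting = 1\n"  # hypothetical content
same_bytes = plain.encode(ANSI) == plain.encode("utf-8")

# Extended characters: one byte each in ANSI, two bytes each in UTF-8.
umlauts = "öäü"
ansi_bytes = umlauts.encode(ANSI)
utf8_bytes = umlauts.encode("utf-8")


def guess_encoding(data: bytes) -> str:
    """Heuristic guess: BOM first, then pure ASCII, then a UTF-8 decode test."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.isascii():
        # Indistinguishable: valid as ANSI and as UTF-8 at the same time.
        return "ascii"
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Not valid UTF-8, so fall back to the assumed ANSI code page.
        return ANSI


print(same_bytes)
print(len(ansi_bytes), len(utf8_bytes))
print(guess_encoding(ansi_bytes), guess_encoding(utf8_bytes))
```

The decode test works because ANSI bytes for characters like öäü are almost never a valid UTF-8 sequence. It is still only a guess: a file without any extended characters gives no evidence either way.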

Here’s a great article: http://www.joelonsoftware.com/articles/Unicode.html

About gplps

Grand Prix Legends Fan
This entry was posted in General Information.
