AkelPad Forum Index AkelPad
Support forum
 

Why .txt file becomes ANSI after removing BOM?

 
akyahoo



Joined: 10 Nov 2009
Posts: 24
Location: Beijing, China

Posted: Wed Nov 11, 2009 11:35 am    Post subject: Why .txt file becomes ANSI after removing BOM?



I use AkelPad to save a .txt file as UTF-8. The BOM is removed.



Later, I use Windows Notepad to open it. It shows its encoding is ANSI, not UTF-8. Why?
FeyFre



Joined: 07 Aug 2007
Posts: 2034
Location: Vinnitsa, Ukraine

Posted: Wed Nov 11, 2009 1:02 pm

akyahoo
Because the ASCII subset of UTF-8 is byte-for-byte identical to the ASCII subset of the ANSI code page (Windows-1252): both encode those characters as single bytes in the range [0-127]. So a text that contains only such characters is binary equal whether it is saved as UTF-8 without a BOM or as ANSI 1252, and Notepad has no way to tell which one it was.
If you save a UTF-8 text that contains Cyrillic characters without a BOM, you'll notice that each of those characters is represented by two bytes. Any smart enough text editor (I don't know if Notepad is) will analyze those pairs and correctly decide that the text is in BOM-less UTF-8.
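
To make this concrete, here is a small Python sketch. It is only an illustration of the idea, not how Notepad or AkelPad actually detect encodings; the helper name `looks_like_utf8` and the sample strings are made up for this example.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Heuristic guess (an assumption for illustration, not any editor's
    real algorithm): report UTF-8 only when the data contains non-ASCII
    bytes AND decodes cleanly as strict UTF-8."""
    if all(b < 0x80 for b in data):
        # Pure ASCII: the bytes are the same in UTF-8 and in ANSI 1252,
        # so there is nothing to distinguish the two encodings.
        return False
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


ascii_text = "Hello, world"
cyrillic_text = "Привет"  # 6 Cyrillic letters

# ASCII-only text: UTF-8 without BOM and Windows-1252 give identical bytes.
ascii_same = ascii_text.encode("utf-8") == ascii_text.encode("cp1252")

# Cyrillic text: every letter takes two bytes in UTF-8.
cyrillic_utf8 = cyrillic_text.encode("utf-8")

print(ascii_same)                      # True
print(len(cyrillic_utf8))              # 12
print(looks_like_utf8(cyrillic_utf8))  # True
print(looks_like_utf8(ascii_text.encode("utf-8")))  # False
```

The same Cyrillic text saved in a single-byte ANSI code page (for example `cp1251`) fails the strict UTF-8 decode, which is what lets an editor tell the two apart once non-ASCII characters are present.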
akyahoo



Joined: 10 Nov 2009
Posts: 24
Location: Beijing, China

Posted: Thu Nov 19, 2009 3:09 am

Thank you.
All times are GMT