If "Codepage recognition:" is set to None, then AkelPad can detect (subject to sufficient buffer size) only Unicode files:
- with a BOM (byte order mark) present in the file
- UTF-16LE or UTF-16BE without a BOM present.
The "Buffer:" value is the number of characters tested by the recognition algorithm. It must be large enough for the internal algorithm to discern the codepage. The minimum buffer size needed to correctly determine the codepage or Unicode type also varies somewhat with the size of the file.
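To make the two cases above concrete, here is a minimal Python sketch of BOM sniffing plus a crude BOM-less UTF-16 check on a limited buffer. This is not AkelPad's internal algorithm; the function name `sniff_unicode`, the `buffer_size` parameter and the 40% threshold are all assumptions made for illustration.

```python
import codecs

# BOM signatures. The UTF-32 entries come first because the UTF-32LE BOM
# (FF FE 00 00) begins with the UTF-16LE BOM (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

def sniff_unicode(data: bytes, buffer_size: int = 1024):
    """Return a guessed Unicode encoding label, or None if undetermined."""
    sample = data[:buffer_size]  # only this many bytes are examined

    # Case 1: a BOM is present at the start of the file.
    for bom, name in BOMS:
        if sample.startswith(bom):
            return name

    # Case 2 (assumed heuristic): BOM-less UTF-16 text that is mostly
    # ASCII has a zero byte in every second position.
    if len(sample) < 2:
        return None
    pairs = len(sample) // 2
    even_zeros = sample[0::2].count(0)
    odd_zeros = sample[1::2].count(0)
    if odd_zeros >= 0.4 * pairs and even_zeros == 0:
        return "utf-16-le"
    if even_zeros >= 0.4 * pairs and odd_zeros == 0:
        return "utf-16-be"
    return None
```

With a very small buffer there is too little data to decide, which is one way to picture why the buffer must be "sufficient". Note also that this zero-byte check says nothing about GB2312, GBK or BIG5 files; those contain no BOM and no zero bytes, so the sketch returns None for them.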
==========================================================
I have checked this manual. The Codepage recognition option in my AkelPad is set to None.
Sometimes it cannot display Chinese characters correctly. Most Chinese text uses GB2312, GBK or BIG5 as the encoding, not UTF-8.
Is this the problem? What should I do if AkelPad cannot display the characters?
Why some .txt files cannot be displayed correctly?
akyahoo wrote: "Sometimes, it cannot display Chinese characters correctly."
I have not checked with Instructor on this topic, but perhaps you already know that Windows contains (depending on your installation) MANY codepages. Many codepages share most of their characters with several others and differ only in a few unique ones. It is possible that your text files do not use enough characters unique to the codepage, and so the AkelPad algorithm cannot make the determination.
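To illustrate why a short file may not pin down its codepage, here is a small Python sketch that tries several codecs on the same bytes and keeps every one that decodes without error. The helper name `plausible_codepages` and the candidate list are my own choices for the example, not anything taken from AkelPad.

```python
def plausible_codepages(data: bytes, candidates):
    """Return {codec_name: decoded_text} for each candidate that decodes cleanly."""
    results = {}
    for name in candidates:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            # These bytes are not valid in this codepage; rule it out.
            pass
    return results

# Two Chinese characters saved as GBK: only four bytes to go on.
sample = "中文".encode("gbk")
matches = plausible_codepages(sample, ["utf-8", "gbk", "big5", "cp1252"])
for name, text in matches.items():
    print(name, repr(text))
```

UTF-8 is ruled out because the bytes are not valid UTF-8, but more than one legacy codepage accepts them, and only one of the results is the text the author meant. A detector has to fall back on statistics about which characters are likely, and four bytes give it very little to work with.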
If you are not getting the correct codepage, perhaps it is better to load the file using the codepage that you know is correct. If you enable the option "Options/Settings.../Registry/Remember code page", AkelPad tries to keep track of the codepage used for each file, but only for the files on the "Recent files" list; this may help a little.
I'm not sure that this behavior is a failing in AkelPad - it probably just points out the complexities in language.
Also, the "Options/Settings.../General" page contains settings for a default codepage - if you always work in the same codepage, use that as a default.
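The same idea (pick the codepage yourself rather than rely on guessing) can be sketched outside the editor in Python. The function name `read_text` and the `DEFAULT_CODEPAGE` value are hypothetical; they only mirror the idea of a default codepage setting and are not AkelPad's implementation.

```python
from pathlib import Path

# Hypothetical default, playing the role of a "default codepage" setting.
DEFAULT_CODEPAGE = "gbk"

def read_text(path, codepage=None):
    """Decode a file with an explicitly chosen codepage instead of guessing."""
    raw = Path(path).read_bytes()
    # errors="replace" keeps the file readable: bytes that are invalid in
    # the chosen codepage show up as U+FFFD instead of raising an error.
    return raw.decode(codepage or DEFAULT_CODEPAGE, errors="replace")
```

For example, `read_text("notes.txt", "big5")` would decode a hypothetical Traditional Chinese file as BIG5, while `read_text("notes.txt")` falls back to the default.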
Re: Why some .txt files cannot be displayed correctly?
akyahoo wrote: "I have checked this manual. The Codepage recognition option in my AkelPad is set to None. Sometimes it cannot display Chinese characters correctly. Most Chinese text uses GB2312, GBK or BIG5 as the encoding, not UTF-8. Is this the problem? What should I do if AkelPad cannot display the characters?"
AkelPad doesn't automatically recognize those Chinese encodings.
Automatic recognition actually involves a very intricate algorithm. Considering how many encodings there are, it's too much to ask of a free program. That's one of the reasons they invented a thing called Unicode.
Actually, 100% accuracy is almost impossible. Many shareware programs aren't good in this regard either. It's rare to find programmers who are well versed in both programming and natural languages.