utf-8 recognition
-
Offline
- Posts: 15
- Joined: Wed Jan 14, 2009 7:45 am
Hi,
Thanks for the new feature in the latest version:
Added: Chinese recognition (UTF-8).
However, automatic recognition of Korean UTF-8 text is still not available.
To test, visit http://www.cineast.co.kr/ and click "View source":
the Korean UTF-8 characters are displayed broken.
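A minimal sketch of what "broken" means here, assuming the page is BOM-less UTF-8 that gets read with a legacy ANSI codepage. The sample string and the helper `looks_like_utf8` are made up for illustration; this is not how AkelPad itself decides.

```python
# Hypothetical Korean sample; any Hangul text would do.
sample = "한국어 텍스트"
raw = sample.encode("utf-8")  # UTF-8 bytes without a BOM


def looks_like_utf8(data: bytes) -> bool:
    """True if data is valid UTF-8 and contains at least one non-ASCII byte."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return any(b >= 0x80 for b in data)


# Reading the same bytes as Windows-1252 gives the garbled display.
broken = raw.decode("cp1252", errors="replace")

print(looks_like_utf8(raw))                        # UTF-8 bytes pass the check
print(looks_like_utf8("한국어".encode("cp949")))    # legacy Korean bytes do not
print(broken)
```

Pure ASCII is deliberately reported as "not UTF-8" here, since it gives no evidence either way.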
-
Offline
- Site Admin
- Posts: 6311
- Joined: Thu Jul 06, 2006 7:20 am
Test version for Japanese and Korean codepage recognition.
-
Offline
- Site Admin
- Posts: 6311
- Joined: Thu Jul 06, 2006 7:20 am
lupin1984 & harfman
1. Turn on "Options->Settings...->General->Codepage recognition->Chinese or Korean".
2. Turn off "Options->Settings...->Registry->Remember code page" (not necessary, but gives clean results).
3. Change the default codepage back to your native one (if you changed it). Don't use UTF-8 as your default ANSI codepage.
"Options->Settings...->General->Default codepage"
4. Open the file again.
Note:
The file must not be too small.
-
Offline
- Site Admin
- Posts: 6311
- Joined: Thu Jul 06, 2006 7:20 am
u_u86
Test version "Turkish (OEM, UTF-8)".
-
Offline
- Posts: 16
- Joined: Wed Jul 09, 2008 7:04 am
Instructor wrote:
u_u86
Test version "Turkish (OEM, UTF-8)".
Doesn't work for me. And what about ANSI 1254? Example: Turkish.rc, the AkelPad language resource file, is in cp1254. When I open it (default codepage set to cp1251 or cp1252), I want it to open automatically in cp1254.
The possible workaround, setting the default codepage to cp1254 and also recognizing cp1251, doesn't work either: the text is always recognized as cp1251.
File with Turkish text: http://www.box.net/shared/iio2nq3dum
The only difference between 1254 and 1252 is ~6 characters.
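The "~6 chars" remark can be checked with a short sketch using Python's built-in `cp1252` and `cp1254` codecs. The helper name `differing_bytes` is made up for this example, and bytes that are undefined in either codepage are skipped.

```python
def differing_bytes(cp_a: str, cp_b: str) -> dict:
    """Map each high byte to (char in cp_a, char in cp_b) where the two differ."""
    diffs = {}
    for value in range(0x80, 0x100):
        b = bytes([value])
        try:
            a_char = b.decode(cp_a)
            b_char = b.decode(cp_b)
        except UnicodeDecodeError:
            continue  # byte is undefined in at least one of the codepages
        if a_char != b_char:
            diffs[value] = (a_char, b_char)
    return diffs


diffs = differing_bytes("cp1252", "cp1254")
for value, (western, turkish) in diffs.items():
    print(f"0x{value:02X}: cp1252={western!r} cp1254={turkish!r}")
```

With so few distinguishing bytes, a file that happens not to use those letters is byte-for-byte identical in both codepages, which is one reason automatic separation of the two is hard.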
-
Offline
- Site Admin
- Posts: 6311
- Joined: Thu Jul 06, 2006 7:20 am
u_u86 wrote:
Don't work for me.
I hope you understand that you must turn on "Options->Settings...->General->Codepage recognition->Turkish (OEM, UTF-8)", as I wrote earlier in this thread.
u_u86 wrote:
... i want to automaticaly open it in cp1254
It will work as you want only if you set 1254 as your default codepage.
-
Offline
- Posts: 16
- Joined: Wed Jul 09, 2008 7:04 am
-
Offline
- Posts: 20
- Joined: Mon May 07, 2007 6:14 pm
You can test these two programs: they have no UTF-8 recognition problems and work perfectly.
AkelPad, however, is faster, lighter, and more efficient.
Both are open source software. Thanks.
Notepad++
http://notepad-plus.sourceforge.net/uk/site.htm
Notepad2
http://www.flos-freeware.ch/notepad2.html
-
Offline
- Site Admin
- Posts: 6311
- Joined: Thu Jul 06, 2006 7:20 am
lupin1984
Do you get my answers to your emails? Try to read...
Quote:
I tested on XP and Vista.
Vista is OK, but XP...
Thanks
This one is detected correctly. Make sure you perform all of these steps on XP:
1. Turn on "Options->Settings...->General->Codepage recognition->Chinese".
2. Turn off "Options->Settings...->Registry->Remember code page" (not necessary, but gives clean results).
3. Change the default codepage back to your native one (if you changed it). Don't use UTF-8 as your default ANSI codepage.
"Options->Settings...->General->Default codepage"
4. Open the file again.
Quote:
This text file can't be detected.
This one is too small (it does not contain many Chinese characters) to be detected as UTF-8. Try to copy the contents and it will be detected correctly:
Code:
测试文本thanks谢谢
18:42 2009/1/13
测试文本thanks谢谢
18:42 2009/1/13
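A rough sketch of why a short sample can fail while the same text repeated succeeds. This is not AkelPad's actual detection algorithm: the function names, the `min_evidence` threshold of 10, and the buffer sizes are all assumptions chosen for illustration. The idea is that a detector which looks only at the first `buffer_size` bytes and wants a minimum number of non-ASCII characters has too little to go on with a tiny file.

```python
import codecs


def utf8_evidence(data: bytes, buffer_size: int) -> int:
    """Count non-ASCII characters in the first buffer_size bytes, or -1 if invalid UTF-8."""
    head = data[:buffer_size]
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        # final=False tolerates a multi-byte sequence cut off at the buffer edge.
        text = decoder.decode(head, final=False)
    except UnicodeDecodeError:
        return -1
    return sum(1 for ch in text if ord(ch) > 0x7F)


def detect_utf8(data: bytes, buffer_size: int = 1024, min_evidence: int = 10) -> bool:
    """Hypothetical rule: accept only when enough non-ASCII characters were seen."""
    return utf8_evidence(data, buffer_size) >= min_evidence


sample = "测试文本thanks谢谢\n18:42 2009/1/13\n".encode("utf-8")
# Mostly-ASCII file whose non-ASCII text starts after the first 2000 bytes.
late = b"key=value\n" * 200 + ("谢谢" * 10).encode("utf-8")

print(detect_utf8(sample))                   # short sample: too little evidence
print(detect_utf8(sample * 2))               # same contents copied twice
print(detect_utf8(late, buffer_size=1024))   # non-ASCII text lies beyond the buffer
print(detect_utf8(late, buffer_size=8096))   # larger buffer reaches it
```

Under these assumptions, duplicating the contents or enlarging the inspected buffer are the two ways to give the detector more to work with.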
Quote:
It's Firefox's Simplified Chinese language pack.
All of its text files have no BOM; pageInfo.properties can't be auto-detected. You can test it.
Increase the recognition buffer, for example to 8096:
"Options->Settings...->General->Buffer"