AkelPad Support forum
 

utf-8 recognition

 
harfman



Joined: 14 Jan 2009
Posts: 14

Posted: Wed Jan 14, 2009 7:55 am    Post subject: utf-8 recognition

Hi,

Thanks for the new feature in the latest version:

Added: Chinese recognition (UTF-8).

However, automatic recognition of Korean UTF-8 characters is still unavailable.

If you want to test, visit http://www.cineast.co.kr/ and click "View source".

The Korean UTF-8 characters will then be displayed garbled.
Instructor
Site Admin


Joined: 06 Jul 2006
Posts: 5454

Posted: Wed Jan 14, 2009 9:36 am

Test version for Japanese and Korean codepage recognition.
lupin1984



Joined: 07 May 2007
Posts: 20

Posted: Wed Jan 14, 2009 10:53 am

It doesn't work :(

If the default codepage is UTF-8, codepage recognition doesn't work (whichever option you choose: none, Cyrillic, Latin, or Chinese).

UTF-8 text without a BOM can be auto-recognized :D

But I don't always use UTF-8 :)
harfman



Joined: 14 Jan 2009
Posts: 14

Posted: Wed Jan 14, 2009 11:04 am

Thanks for your fast reply.

Test version 4.14 still doesn't work for Korean UTF-8 characters.
Instructor
Site Admin


Joined: 06 Jul 2006
Posts: 5454

Posted: Wed Jan 14, 2009 11:19 am

lupin1984 & harfman
1. Turn on "Options->Settings...->General->Codepage recognition->Chinese or Korean".
2. Turn off "Options->Settings...->Registry->Remember code page" (not necessary, but it gives clean results).
3. Change the default codepage back to your native one (if you changed it). Don't use UTF-8 as your default ANSI codepage.
"Options->Settings...->General->Default codepage"
4. Open the file again.

Note:
The file must not be too small.
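
To illustrate why file size matters, here is a minimal Python sketch of a UTF-8 check for text without a BOM. It is not AkelPad's actual algorithm: the function names and the `min_multibyte` threshold of 8 are hypothetical, chosen only to show that a detector needs enough multi-byte characters before it can trust its guess.

```python
import codecs
from typing import Tuple


def utf8_evidence(data: bytes) -> Tuple[bool, int]:
    """Return (decodes as strict UTF-8, number of non-ASCII characters)."""
    # A BOM, if present, is skipped so that only the text itself is judged.
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    try:
        text = data.decode("utf-8")  # strict: any invalid sequence raises
    except UnicodeDecodeError:
        return False, 0
    return True, sum(1 for ch in text if ord(ch) > 0x7F)


def looks_like_utf8(data: bytes, min_multibyte: int = 8) -> bool:
    """Hypothetical rule: valid UTF-8 plus enough multi-byte characters."""
    valid, multibyte = utf8_evidence(data)
    return valid and multibyte >= min_multibyte


# A short sample with only 6 non-ASCII characters.
sample = "测试文本thanks谢谢\n18:42 2009/1/13\n".encode("utf-8")
print(utf8_evidence(sample))        # valid UTF-8, but little evidence
print(looks_like_utf8(sample))      # below the assumed threshold
print(looks_like_utf8(sample * 2))  # doubling the text doubles the evidence
```

With a strict decode, text in a legacy codepage such as EUC-KR usually fails outright, while a very short UTF-8 file passes the decode but offers few characters to count. That is one plausible reason a small file can be left undetected.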
harfman



Joined: 14 Jan 2009
Posts: 14

Posted: Wed Jan 14, 2009 12:46 pm

OK, it works well. Thanks for your efforts.
u_u86



Joined: 09 Jul 2008
Posts: 16

Posted: Fri Jan 16, 2009 3:32 pm

Since the work is going this way, is it possible to add recognition of the Turkish codepage (ANSI 1254)? If you need any information about it, feel free to ask.
Instructor
Site Admin


Joined: 06 Jul 2006
Posts: 5454

Posted: Fri Jan 16, 2009 4:48 pm

u_u86
Test version "Turkish (OEM, UTF-8)".
u_u86



Joined: 09 Jul 2008
Posts: 16

Posted: Fri Jan 16, 2009 5:27 pm

Instructor wrote:
u_u86
Test version "Turkish (OEM, UTF-8)".


It doesn't work for me. And what about ANSI 1254? Example: Turkish.rc, the AkelPad language resource file, is in cp1254. When I open it (with the default cp set to cp1251 or cp1252), I want it to open automatically in cp1254.

The possible workaround, setting the default cp to cp1254 and recognizing cp1251, doesn't work either: the text is always recognized as cp1251.

File with Turkish text: http://www.box.net/shared/iio2nq3dum
The only difference between 1254 and 1252 is ~6 chars.
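
The "~6 chars" figure can be checked with a short Python sketch that compares the two codepages byte by byte using Python's built-in `cp1252` and `cp1254` codecs. The helper name `differing_bytes` is made up for this illustration, and the comparison starts at 0xA0 on the assumption that the letters in question all sit in that upper range.

```python
from typing import List, Tuple


def differing_bytes(cp_a: str, cp_b: str,
                    start: int = 0x80, stop: int = 0x100) -> List[Tuple[int, str, str]]:
    """List byte values in [start, stop) that decode differently in two codepages."""
    diffs = []
    for value in range(start, stop):
        raw = bytes([value])
        # errors="replace" maps bytes undefined in a codepage to U+FFFD
        # instead of raising, so every byte value can be compared.
        a = raw.decode(cp_a, errors="replace")
        b = raw.decode(cp_b, errors="replace")
        if a != b:
            diffs.append((value, a, b))
    return diffs


for value, a, b in differing_bytes("cp1252", "cp1254", start=0xA0):
    print(f"0x{value:02X}: cp1252={a!r} cp1254={b!r}")
```

From 0xA0 upward this reports six positions (0xD0, 0xDD, 0xDE, 0xF0, 0xFD, 0xFE), where cp1254 has the Turkish letters Ğ, İ, Ş, ğ, ı, ş. Every other byte in that range means the same thing in both codepages, so a detector would have to decide on the strength of those few letters alone.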
Instructor
Site Admin


Joined: 06 Jul 2006
Posts: 5454

Posted: Fri Jan 16, 2009 6:43 pm

u_u86 wrote:
Doesn't work for me.
I hope you understand that you must turn on "Options->Settings...->General->Codepage recognition->Turkish (OEM, UTF-8)", as I wrote earlier in this thread.

u_u86 wrote:
... I want it to open automatically in cp1254
It will work as you want only if you set 1254 as your default codepage.
u_u86



Joined: 09 Jul 2008
Posts: 16

Posted: Sat Jan 17, 2009 3:23 am

Of course I turned it on.
If I set cp1254 as the default, it always opens all files in that cp, so what is the recognition for? Only to recognize UTF-8 and OEM?

Cyrillic recognition works well when the default cp is set to 1252 or 1254, and it recognizes 1251. Maybe it uses some algorithm?
Instructor
Site Admin


Joined: 06 Jul 2006
Posts: 5454

Posted: Sat Jan 17, 2009 5:26 am

u_u86 wrote:
Only to recognize UTF-8 and OEM?
Yes.
u_u86



Joined: 09 Jul 2008
Posts: 16

Posted: Sat Jan 17, 2009 6:12 am

OK, understood. Thanks for the implementation!
lupin1984



Joined: 07 May 2007
Posts: 20

Posted: Fri Jan 23, 2009 2:50 am

You can test these two programs: they have no UTF-8 recognition problems, perfect.

But AkelPad is faster, lighter, and more efficient :D

Both are open source software. Thanks.

Notepad++
http://notepad-plus.sourceforge.net/uk/site.htm

notepad2
http://www.flos-freeware.ch/notepad2.html
Instructor
Site Admin


Joined: 06 Jul 2006
Posts: 5454

Posted: Fri Jan 23, 2009 3:42 am

lupin1984
Did you get my answers to your emails? Try reading them...

Quote:
I tested on XP and Vista.

Vista is OK, but XP...

Thanks

This one is detected correctly. Make sure you follow all these steps on XP:
1. Turn on "Options->Settings...->General->Codepage recognition->Chinese".
2. Turn off "Options->Settings...->Registry->Remember code page" (not necessary, but it gives clean results).
3. Change the default codepage back to your native one (if you changed it). Don't use UTF-8 as your default ANSI codepage.
"Options->Settings...->General->Default codepage"
4. Open the file again.

Quote:
This text file can't be detected.

This one is too small (it does not have many Chinese characters) to be detected as UTF-8. Try duplicating the contents and it will be detected correctly:
Code:
测试文本thanks谢谢
18:42 2009/1/13
测试文本thanks谢谢
18:42 2009/1/13


Quote:
It's Firefox's Simplified Chinese language pack.

All the text files have no BOM, and pageInfo.properties can't be auto-detected. You can test it.

Increase the recognition buffer, for example to 8096:

"Options->Settings...->General->Buffer"
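
As an illustration of why a larger buffer can help, here is a hedged Python sketch of a detector that inspects only the first `buffer_size` bytes of a file. It is not AkelPad's code: the function name, the return labels, and the sample file layout (a long ASCII-only header followed by Chinese text) are assumptions made for the example.

```python
import codecs


def detect_in_buffer(data: bytes, buffer_size: int) -> str:
    """Classify only the first buffer_size bytes, as a size-limited detector would."""
    head = data[:buffer_size]
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        # final=False tolerates a multi-byte character cut off at the buffer edge.
        text = decoder.decode(head, final=False)
    except UnicodeDecodeError:
        return "not-utf8"
    if any(ord(ch) > 0x7F for ch in text):
        return "utf8"
    return "ascii-only"  # nothing in the buffer distinguishes UTF-8 from ANSI


# Assumed layout: 5000 bytes of ASCII comments, then the first non-ASCII text.
header = b"# comment\n" * 500
body = "标题=页面信息\n".encode("utf-8")
data = header + body

print(detect_in_buffer(data, 1024))  # the buffer ends inside the ASCII header
print(detect_in_buffer(data, 8096))  # the buffer now reaches the Chinese text
```

If the only non-ASCII characters sit beyond the buffer, the detector sees plain ASCII and has no reason to pick UTF-8 over the default codepage. Raising the buffer size lets it read far enough to find them.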
All times are GMT