Hans 
Unicode support is becoming more and more important
Posted on 7-May-2008 18:34:34
#1 ]
Elite Member
Joined: 27-Dec-2003
Posts: 5134
From: New Zealand

See here.
I don't know what state OS4's unicode support is in, but maybe it should get a priority bump.

What functions should be used by people wanting unicode support in their apps? IIRC, the standard c-library text functions are ASCII only.

Hans

Last edited by Hans on 07-May-2008 at 06:37 PM.

_________________
Join the Kea Campus - upgrade your skills; support my work; enjoy the Amiga corner.
https://keasigmadelta.com/ - see more of my work

ara 
Re: Unicode support is becoming more and more important
Posted on 8-May-2008 9:01:28
#2 ]
Regular Member
Joined: 11-Jan-2006
Posts: 138
From: Unknown

@Hans
According to the included graph, the number of Web sites using UTF-8 has increased dramatically. However, I bet most of those are just plain English 7-bit ASCII declared as UTF-8 encoded.

Nevertheless, I'm a big fan of Unicode, using UTF-16 most of the time.
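
The point about plain English pages can be checked directly: every 7-bit ASCII byte sequence is already valid UTF-8 with identical bytes, so relabelling such a page changes nothing on the wire. A minimal Python sketch (the helper name `is_plain_ascii` and the sample strings are made up for illustration):

```python
def is_plain_ascii(data: bytes) -> bool:
    """Return True if every byte is in the 7-bit ASCII range 0x00-0x7F."""
    return all(byte < 0x80 for byte in data)


english = "Unicode support is becoming more important"
german = "Grüße"  # contains characters outside ASCII

# Pure ASCII text produces the same bytes under both encodings.
same_bytes = english.encode("ascii") == english.encode("utf-8")

print(same_bytes)
print(is_plain_ascii(english.encode("utf-8")))
print(is_plain_ascii(german.encode("utf-8")))
```

A page only starts to depend on the UTF-8 label once it contains bytes of 0x80 or above.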

Jolo 
Re: Unicode support is becoming more and more important
Posted on 8-May-2008 18:55:11
#3 ]
Member
Joined: 7-Oct-2006
Posts: 30
From: Unknown

@Hans

Quote:

What functions should be used by people wanting unicode support in their apps? IIRC, the standard c-library text functions are ASCII only.


Yes, the standard clib string functions are ASCII only.

If you need strcmp(), tolower() and similar functions, I would suggest switching to the corresponding Locale library functions; at least they support ISO-8859-x encodings right now. In the future these Locale library functions may support the Unicode character set. In the meantime you use their ISO-8859-x functionality, and you will benefit once they support Unicode.

If you want Unicode support right now, you have to take the detour through the Codesets library and map entire strings to their ISO-8859-x counterparts. Unfortunately, toupper() and similar functions of the standard clib will fail if a character code lies beyond the 127 barrier (Basic Latin/ASCII). So in this special case you have to use the Locale library and its functions.
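
To illustrate the 127 barrier, here is a small sketch in Python rather than Amiga C. It is only an analogy: Python's `bytes.upper()` stands in for an ASCII-only toupper(), and a decode/encode round trip stands in for a charset-aware routine such as the ones the Locale library provides. The function names are hypothetical.

```python
def ascii_upper(data: bytes) -> bytes:
    """ASCII-only case mapping: bytes above 0x7F are left untouched."""
    return data.upper()


def latin1_upper(data: bytes) -> bytes:
    """Case mapping that knows the bytes are ISO-8859-1.

    Assumption: every uppercase result still fits into ISO-8859-1.
    Characters such as 0xFF (y with diaeresis) do not, and would raise.
    """
    return data.decode("iso-8859-1").upper().encode("iso-8859-1")


word = "grün".encode("iso-8859-1")  # b"gr\xfcn"

print(ascii_upper(word))   # the 0xFC byte stays lowercase
print(latin1_upper(word))  # 0xFC becomes 0xDC
```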

Another approach could be the Uni library, which supports the most essential Unicode Standard 5 character code attributes. Unfortunately, when using the Uni library you cannot use the ordinary text output function of Graphics, but have to fall back on a rendering engine that supports Unicode character codes.
Another disadvantage of the Uni library is that it was written to support programmers using UTF-8 strings, not UTF-16 or UTF-32, although single character codes are passed to the appropriate functions ( UniIsLower() and the like ) as UTF-32 character codes.
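
The split described here (strings stored as UTF-8, single characters handed over as 32-bit code points) can be sketched as follows. This is Python, not the Uni library's API: `utf8_code_points` and `is_lower` are hypothetical names, the decoder assumes well-formed UTF-8 and does no validation of continuation bytes, and `is_lower` merely plays the role of something like UniIsLower().

```python
import unicodedata


def utf8_code_points(data: bytes):
    """Yield each code point of a well-formed UTF-8 byte string as an int."""
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:                # 0xxxxxxx: one byte
            length, cp = 1, lead
        elif lead >> 5 == 0b110:       # 110xxxxx: two bytes
            length, cp = 2, lead & 0x1F
        elif lead >> 4 == 0b1110:      # 1110xxxx: three bytes
            length, cp = 3, lead & 0x0F
        elif lead >> 3 == 0b11110:     # 11110xxx: four bytes
            length, cp = 4, lead & 0x07
        else:
            raise ValueError("invalid UTF-8 lead byte")
        # Each continuation byte contributes its low six bits.
        for cont in data[i + 1:i + length]:
            cp = (cp << 6) | (cont & 0x3F)
        yield cp
        i += length


def is_lower(code_point: int) -> bool:
    """Classify a single code point (general category 'Ll')."""
    return unicodedata.category(chr(code_point)) == "Ll"


text = "aÄé中".encode("utf-8")
code_points = list(utf8_code_points(text))
flags = [is_lower(cp) for cp in code_points]
print([hex(cp) for cp in code_points], flags)
```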

NomadOfNorad 
Re: Unicode support is becoming more and more important
Posted on 9-May-2008 3:21:00
#4 ]
Cult Member
Joined: 2-Jun-2003
Posts: 750
From: Jacksonville, Florida, USA, Earth, Sol system, Milky Way galaxy

@thread

One thing I'm rather curious about, when it comes to AmigaOS presumably eventually supporting Unicode as a unified and standardized part of the overall OS... will there be a standard place to stick all the different Unicode character sets and their respective fonts? I'm hoping we don't wind up with a bunch of different, independently-arrived-at folders that each different program developer has arbitrarily stashed his particular group of Unicode sets in, and a different, nonstandard way of crossconverting them to work in their particular program...

That said, there are a bajillion different unicode language sets out there, including a whole vast section of Unicode-space set aside for unofficial language sets. Specifically, conlangs, constructed languages like Klingon and Elvish. Eventually, it'd be nice to view those, or even create some of those, natively in AmigaOS or MorphOS or AROS ...

edit: fixed a minor grammatical issue.

Last edited by NomadOfNorad on 09-May-2008 at 03:22 AM.

_________________
"I love peacenicks, they're so easy to conquer." --Ivan J Ironfist, the Dictator

CodeSmith 
Re: Unicode support is becoming more and more important
Posted on 9-May-2008 6:29:34
#5 ]
Elite Member
Joined: 8-Mar-2003
Posts: 3045
From: USA

@Hans

OS4 is going nowhere until 2010 at least. In the meantime, standards will continue to evolve, so there's no reason to waste time on something that will have been superseded by the time the lawsuit's decided. The active standard will probably be unicode 6.x or 7 by then, and utf8 may also have been superseded by something that doesn't bloat when used with Far Eastern languages (eg most Chinese characters can be represented using two bytes in UCS16, but they require three when using UTF8). As Asian influence grows and more data in their languages gets sent over the net, I believe that ISPs will vote with their wallets for something that consumes less bandwidth.

ara 
Re: Unicode support is becoming more and more important
Posted on 9-May-2008 10:49:55
#6 ]
Regular Member
Joined: 11-Jan-2006
Posts: 138
From: Unknown

@CodeSmith
Quote:

and utf8 may also have been superseded by something that doesn't bloat when used with Far Eastern languages

Considering the fact that text documents (including HTML) only account for 10% of the traffic volume, I assume that nobody will care about the factor 1.5, even if 50% of all documents were written in Chinese in 2010.
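
The factor 1.5 and its effect on total traffic can be worked through in a few lines of Python. The sample string is arbitrary, the 10% and 50% figures are the ones from this post, and treating them as shares of a 2-byte-per-character baseline is a simplifying assumption.

```python
sample = "统一码"  # three Chinese characters, all in the Basic Multilingual Plane

utf8_size = len(sample.encode("utf-8"))       # 3 bytes per character
utf16_size = len(sample.encode("utf-16-le"))  # 2 bytes per character, no BOM
factor = utf8_size / utf16_size

text_share = 0.10     # text as a share of all traffic
chinese_share = 0.50  # hypothetical share of text written in Chinese

# Only the Chinese part of the text part grows, and only by (factor - 1).
extra_traffic = text_share * chinese_share * (factor - 1)

print(utf8_size, utf16_size, factor)
print(f"extra traffic: {extra_traffic:.1%}")
```

Under these assumptions the overall increase comes to about 2.5% of total traffic.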

Jolo 
Re: Unicode support is becoming more and more important
Posted on 9-May-2008 18:18:13
#7 ]
Member
Joined: 7-Oct-2006
Posts: 30
From: Unknown

@NomadOfNorad

Quote:

One thing I'm rather curious about, when it comes to AmigaOS presumably eventually supporting Unicode as a unified and standardized part of the overall OS... will there be a standard place to stick all the different Unicode character sets and their respective fonts?


What do you mean by different Unicode character sets?
Unicode (Standard 5) itself is just one character set, but in contrast to ASCII it is not 7 bits wide but 21.
And fonts are just fonts: ordinary collections of glyphs. Whether the font data is bitmap or vector based doesn't make any difference as long as you use the proper rendering engine.
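
The 7 versus 21 bits follow from the highest code point of each set: 0x7F for ASCII and 0x10FFFF for Unicode. A quick check in Python (the constant and function names are made up for this sketch):

```python
MAX_ASCII = 0x7F        # highest ASCII character code
MAX_UNICODE = 0x10FFFF  # highest Unicode code point


def bits_needed(value: int) -> int:
    """Number of bits required to represent value as an unsigned integer."""
    return value.bit_length()


print(bits_needed(MAX_ASCII), bits_needed(MAX_UNICODE))
```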

Quote:

I'm hoping we don't wind up with a bunch of different, independently-arrived-at folders that each different program developer has arbitrarily stashed his particular group of Unicode sets in, and a different, nonstandard way of crossconverting them to work in their particular program...


Once the Locale library supports the Unicode character set you have your central point. It's already there, yet it only supports ISO-8859-x encodings AFAIK.
If Unicode one day finds its way into the OS, then every C/C++ compiler will ship with a Unicode-aware string library. In addition, if I'm not mistaken, clib2 already has some support for Unicode, based on the UTF-16 encoding.
And even when an application programmer comes up with his own solution for Unicode support, this is IMHO irrelevant as long as he keeps all object files in the same drawer and does not force the user to create certain drawers within the system partition. Moreover, Unicode support doesn't mean a bunch of data files; one single library/kernel module is enough, although dictionaries for certain languages would come in handy for special-casing purposes.

Quote:

That said, there are a bajillion different unicode language sets out there,


No, just 179. But you should avoid the term language. ASCII is a character set like Unicode, too, but do you use the term language in conjunction with ASCII?

Quote:

including a whole vast section of Unicode-space set aside for unofficial language sets.


Try to use the term code chart instead of language. "Basic Latin", "Latin-1 Supplement" and so on are, strictly speaking, not languages but code charts, and moreover each is used in several different languages.

Yes, I know, I'm nitpicking, but using the official terms avoids confusion, at least for me.

Jolo 
Re: Unicode support is becoming more and more important
Posted on 9-May-2008 18:25:53
#8 ]
Member
Joined: 7-Oct-2006
Posts: 30
From: Unknown

@CodeSmith

Quote:

and utf8 may also have been superseded by something that doesn't bloat when used with Far Eastern languages (eg most Chinese characters can be represented using two bytes in UCS16, but they require three when using UTF8).


If you only take storage space into account, you are right, but once you load a UTF-16 file from disk into memory and work with UTF-16, you waste CPU resources because you have to test for surrogate code points.
UTF-16 was the best choice as long as every code point fit into 16 bits. Today we are faced with 21 bits, and thus with surrogate code points whenever UTF-16 is used.
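
The surrogate test mentioned above looks roughly like this. It is a minimal Python sketch with hypothetical function names, it assumes well-formed input (no unpaired surrogates), and it models code units as plain integers rather than using any particular string library.

```python
def to_utf16_units(code_point: int) -> list:
    """Encode one code point as a list of 16-bit UTF-16 code units."""
    if code_point < 0x10000:
        return [code_point]  # fits into a single unit
    offset = code_point - 0x10000  # 20 bits, split into 10 + 10
    return [0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)]


def from_utf16_units(units: list) -> list:
    """Decode UTF-16 code units; every unit must be tested for a surrogate."""
    code_points = []
    i = 0
    while i < len(units):
        unit = units[i]
        if 0xD800 <= unit <= 0xDBFF and i + 1 < len(units):
            # High surrogate: combine it with the low surrogate that follows.
            low = units[i + 1]
            code_points.append(0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00))
            i += 2
        else:
            code_points.append(unit)
            i += 1
    return code_points


bmp = to_utf16_units(0x4E2D)     # a common Chinese character: one unit
beyond = to_utf16_units(0x20000)  # outside the 16-bit range: two units
print([hex(u) for u in bmp], [hex(u) for u in beyond])
```

With UTF-32 the decoding loop disappears entirely, at the price of four bytes per character.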

Quote:

As Asian influence grows and more data in their languages gets sent over the net, I believe that ISPs will vote with their wallets for something that consumes less bandwidth.


I agree with ara.
Do you really believe text files are the ones to blame for the traffic?
I can only speak for myself, but I listen to internet radio stations or watch their video clips, which adds up to 500 Mbytes per hour.

Hans 
Re: Unicode support is becoming more and more important
Posted on 9-May-2008 18:41:33
#9 ]
Elite Member
Joined: 27-Dec-2003
Posts: 5134
From: New Zealand

@CodeSmith

Quote:

CodeSmith wrote:
@Hans

OS4 is going nowhere until 2010 at least. In the meantime, standards will continue to evolve, so there's no reason to waste time on something that will have been superseded by the time the lawsuit's decided. The active standard will probably be unicode 6.x or 7 by then, and utf8 may also have been superseded by something that doesn't bloat when used with Far Eastern languages (eg most Chinese characters can be represented using two bytes in UCS16, but they require three when using UTF8). As Asian influence grows and more data in their languages gets sent over the net, I believe that ISPs will vote with their wallets for something that consumes less bandwidth.


I said unicode, not utf8. That means supporting all unicode variants from utf8 to utf32 (and whatever else exists). I don't understand why we would want to sit back and let things stagnate. If unicode is updated, then having a working code-base will mean a much smaller task to support the changes than having nothing.

Saying "why bother" because OS4 is going nowhere till 2010 and the standards will evolve is like suggesting that we shouldn't support CSS in browsers, because by the time it's done it could be superseded by a newer variant. Or how about the work being done to update MiniGL, is that pointless because it's going to be replaced by MESA over the next few years? IMHO, having a more complete OpenGL 1.x implementation now is worth it. Likewise, having full unicode support in the OS now is worth it, even if the standard changes in a year or two.

Hans

CodeSmith 
Re: Unicode support is becoming more and more important
Posted on 9-May-2008 20:36:40
#10 ]
Elite Member
Joined: 8-Mar-2003
Posts: 3045
From: USA

@Hans

Utf8 is just an example of the things that I expect to see changing between now and the time when OS4 has more than a few hundred users. Jolo's right, too - UCS16 is painful to use with GB18030 so it's probably going to be superseded soon as well. I don't think we'll have to wait too long for 32 bit unicode to become commonplace.

I do believe that keeping MiniGL up to date is a waste of time, simply because 3D hardware is changing so fast. I bet that by the time OS4 finally gets untangled, so much code would have had to change in MiniGL to keep up with new shaders etc that you might as well not have changed anything in the meantime. CSS on the other hand is a different story - web standards change so slowly that I expect standards that are commonplace now to also be commonplace in 2010 and even up to maybe 2015. So I see OWB and similar efforts as worthwhile.

In any case, full unicode support is going to be like 64 bit disk access - only useful to new applications that are written to take advantage of it. We've had a 64 bit DOS for as long as OS4 has been out, and even a filesystem to support it, and yet I'm only aware of one program that makes use of the 64 bit extensions, a CD writer that doesn't have any choice because a DVD ISO is > 4GB in size.

 Status: Offline
Profile     Report this post  
Hans 
Re: Unicode support is becoming more and more important
Posted on 9-May-2008 21:13:57
#11 ]
Elite Member
Joined: 27-Dec-2003
Posts: 5134
From: New Zealand

@CodeSmith

Quote:

CodeSmith wrote:
I do believe that keeping MiniGL up to date is a waste of time, simply because 3D hardware is changing so fast. I bet that by the time OS4 finally gets untangled, so much code would have had to change in MiniGL to keep up with new shaders etc that you might as well not have changed anything in the meantime. CSS on the other hand is a different story - web standards change so slowly that I expect standards that are commonplace now to also be commonplace in 2010 and even up to maybe 2015. So I see OWB and similar efforts as worthwhile.


You obviously have no idea what's going on with MiniGL. Here are the facts:
- MiniGL is not up-to-date at present
- MiniGL will never support shaders and so will never be up-to-date
- MiniGL is being updated to give us a more complete implementation in the interim, until a full MESA port is done. The MESA port will take some time.
- Once we have a MESA port, we will have an up-to-date OpenGL implementation
- MESA will be kept up-to-date by the MESA development team

So why update MiniGL at all? Well, it will enable the following:
- ports of existing OpenGL apps/games that don't require shaders (already happening thanks to new features that have been added)
- developers wishing to develop 3D apps/games are able to do so, albeit without the newer features (some people have been experimenting already)
- Expanding the range of software that is available will make Amiga OS4 worth more to the end user (and developers too)

Now let's say that we were to work on adding shaders. What advantage would there be in doing nothing until the legal wranglings are over? Absolutely none. We'd still have to make all those changes regardless.

Quote:

In any case, full unicode support is going to be like 64 bit disk access - only useful to new applications that are written to take advantage of it. We've had a 64 bit DOS for as long as OS4 has been out, and even a filesystem to support it, and yet I'm only aware of one program that makes use of the 64 bit extensions, a CD writer that doesn't have any choice because a DVD ISO is > 4GB in size.


Old apps not being able to use it is no justification for doing nothing. The dvd writer app that uses the 64-bit functions would never have been able to use >4GB isos if someone hadn't taken the time to add the 64-bit functions in the first place.

What would you suggest we do while we're waiting for the legal proceedings to finish instead of modernizing the OS?

I'd say that improving the OS now will make it more useful to users when the legal wranglings are over and the availability problem is solved.

Hans

Last edited by Hans on 09-May-2008 at 09:17 PM.
Last edited by Hans on 09-May-2008 at 09:16 PM.
Last edited by Hans on 09-May-2008 at 09:15 PM.

Copyright (C) 2000 - 2019 Amigaworld.net.
Amigaworld.net was originally founded by David Doyle