Poster | Thread |
Belxjander
| |
Perception-IME - Asian Languages support AOS4=current, +AROS?, +MOS? Interested? Posted on 20-Sep-2015 9:06:43
| | [ #1 ] |
|
|
|
Cult Member |
Joined: 4-Jan-2005 Posts: 557
From: Chiba prefecture Japan | | |
|
| In response to the suggestion about a new thread Here.
I'm actively asking if any MOS or AROS developers would be willing to point at what needs to be changed (beyond the obvious AOS4 Interface abuse) for getting Perception-IME functional on those systems as well...
Or is there no interest in expanding the supported languages on those systems?
EDIT: response to Broadblues suggestion, thank you. Last edited by Belxjander on 21-Sep-2015 at 03:44 AM. Last edited by Belxjander on 21-Sep-2015 at 03:43 AM.
|
|
Status: Offline |
|
|
broadblues
| |
Re: Perception-IME - Asian Languages support Posted on 20-Sep-2015 15:12:41
| | [ #2 ] |
|
|
|
Amiga Developer Team |
Joined: 20-Jul-2004 Posts: 4446
From: Portsmouth England | | |
|
| |
Status: Offline |
|
|
itix
| |
Re: Perception-IME - Asian Languages support Posted on 20-Sep-2015 15:26:13
| | [ #3 ] |
|
|
|
Elite Member |
Joined: 22-Dec-2004 Posts: 3398
From: Freedom world | | |
|
| @Belxjander
How do you render Japanese glyphs? _________________ Amiga Developer Amiga 500, Efika, Mac Mini and PowerBook |
|
Status: Offline |
|
|
BSzili
| |
Re: Perception-IME - Asian Languages support Posted on 20-Sep-2015 17:53:54
| | [ #4 ] |
|
|
|
Regular Member |
Joined: 16-Nov-2013 Posts: 447
From: Unknown | | |
|
| @itix
He doesn't... _________________ This is just like television, only you can see much further. |
|
Status: Offline |
|
|
jacadcaps
| |
Re: Perception-IME - Asian Languages support Posted on 20-Sep-2015 20:00:03
| | [ #5 ] |
|
|
|
Regular Member |
Joined: 20-Nov-2007 Posts: 203
From: Canada | | |
|
| @Belxjander
Well, I've had a quick look at the code on github and read up on the RFC 2237.
First question: how do you handle katakana input -> kanji conversion? I have a lot of experience with text input handling on Mac OS X and that is pretty complicated - yet allows all those languages with complex input systems. So, how did you do this?
Second question: do you only support ISO 2022? No Unicode input support? |
|
Status: Offline |
|
|
Belxjander
| |
Re: Perception-IME - Asian Languages support Posted on 21-Sep-2015 3:40:25
| | [ #6 ] |
|
|
|
Cult Member |
Joined: 4-Jan-2005 Posts: 557
From: Chiba prefecture Japan | | |
|
@itix: I let the OS render the glyphs; I don't actively patch anything unless required.
@BSzili: I won't modify the OS unless it is essential.
@All: Why is *rendering* the glyphs considered a priority before the engine itself works?
@jacadcaps: I currently accept English input characters and chord them into Hiragana for Japanese, then post-process the Hiragana as a lookup chord for Kanji (I'm not dealing with ISO-2022-JP only...)
Currently all text is represented internally as Unicode code points, and I accept ISO-Latin-1 from input.device as delivered in AmigaOS rawkey events.
I've only provided an ISO-2022-JP strings module for that codeset; japanese.language itself is UTF-8. The plan is to push all output as UTF-8, which keeps at least the 7-bit ASCII character codes (0 to 127) unchanged.
I'd welcome hearing about any mistakes or errors in the design and implementation of the Kanji handling.
Can we discuss the implementation specifics directly?
I definitely feel I need to discuss the concept I have so far with someone more experienced in this field.
I'm also working through automated script generation for the Kanji reading lookups using Hiragana CodePoint values as the Primary Search key in a tree of syllables arrangement.
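To make the "tree of syllables" idea concrete, here is a minimal sketch in C of a lookup tree keyed by Hiragana code points. It is not taken from the Perception-IME sources: the names `ReadingNode`, `reading_insert` and `reading_lookup` are hypothetical, and the candidate list is assumed to be a plain UTF-8 string of Kanji.

```c
#include <stdint.h>
#include <stdlib.h>

/* One node per Hiragana code point; alternatives at the same depth
 * are chained through `sibling`. All names here are illustrative. */
typedef struct ReadingNode {
    uint32_t codepoint;           /* Hiragana code point for this step     */
    const char *candidates;       /* UTF-8 Kanji list, NULL if none end here */
    struct ReadingNode *child;    /* first node for the next syllable      */
    struct ReadingNode *sibling;  /* next alternative at this depth        */
} ReadingNode;

/* Find the child of `parent` matching `cp`, optionally creating it. */
static ReadingNode *reading_step(ReadingNode *parent, uint32_t cp, int create)
{
    ReadingNode *n;
    for (n = parent->child; n != NULL; n = n->sibling)
        if (n->codepoint == cp)
            return n;
    if (!create)
        return NULL;
    n = calloc(1, sizeof *n);
    if (n == NULL)
        return NULL;
    n->codepoint = cp;
    n->sibling = parent->child;   /* push onto the front of the sibling list */
    parent->child = n;
    return n;
}

/* Register the candidates for one reading; 0 on success, -1 if out of memory. */
int reading_insert(ReadingNode *root, const uint32_t *reading, size_t len,
                   const char *candidates)
{
    ReadingNode *n = root;
    size_t i;
    for (i = 0; i < len; i++) {
        n = reading_step(n, reading[i], 1);
        if (n == NULL)
            return -1;
    }
    n->candidates = candidates;
    return 0;
}

/* Return the candidates stored for exactly this reading, or NULL. */
const char *reading_lookup(ReadingNode *root, const uint32_t *reading, size_t len)
{
    ReadingNode *n = root;
    size_t i;
    for (i = 0; i < len && n != NULL; i++)
        n = reading_step(n, reading[i], 0);
    return n ? n->candidates : NULL;
}
```

With a zero-initialised root, inserting the reading き (U+304D) with the candidates 木 and 気 and then looking up the same code point sequence returns that list, while a reading that was never registered returns NULL.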
@All : Chinese is delayed for the completion of the Kanji handling, Korean may be workable from a similar construct as for the Hiragana chording from romaji. Katakana mode only requires a single function to be properly connected in with modal support.
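The "single function" for Katakana mode can be sketched as follows. This relies on the layout of the Unicode Hiragana and Katakana blocks, where U+3041..U+3096 sit exactly 0x60 below their Katakana counterparts; the function name is hypothetical and not from the project sources.

```c
#include <stdint.h>

/* Map a Hiragana code point to the matching Katakana code point.
 * Anything outside U+3041..U+3096 is passed through unchanged. */
uint32_t hiragana_to_katakana(uint32_t cp)
{
    if (cp >= 0x3041 && cp <= 0x3096)
        return cp + 0x60;   /* e.g. U+304B (か) becomes U+30AB (カ) */
    return cp;
}
```

Because it works on code points rather than UTF-8 bytes, it would be applied before the text is encoded for output.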
Output just requires being hooked up to re-feed input.device, with the filter ignoring the re-fed events.
I've been testing the resulting code live on my own machine since IPrefs first loaded the initial "japanese.language" module bound directly to perception.library.
AROS and MOS may be extensible using the same dynamic linkage arrangement.
This also means that the language hooks for input processing are only registered when the main library is installed, which removes any need to modify the system startup and only requires a language selection.
There may still be an issue when multiple Perception-capable language libraries are installed... Any feedback or help in tracking down that bug would also be appreciated.
Last edited by Belxjander on 21-Sep-2015 at 03:52 AM. Last edited by Belxjander on 21-Sep-2015 at 03:41 AM.
|
|
Status: Offline |
|
|
BSzili
| |
Re: Perception-IME - Asian Languages support Posted on 21-Sep-2015 5:49:26
| | [ #7 ] |
|
|
|
Regular Member |
Joined: 16-Nov-2013 Posts: 447
From: Unknown | | |
|
| @Belxjander
Modify the OS? Now I see you really mean business! _________________ This is just like television, only you can see much further. |
|
Status: Offline |
|
|
olegil
| |
Re: Perception-IME - Asian Languages support Posted on 21-Sep-2015 6:00:28
| | [ #8 ] |
|
|
|
Elite Member |
Joined: 22-Aug-2003 Posts: 5895
From: Work | | |
|
| @itix
How does an _input_ method usually render glyphs? _________________ This weeks pet peeve: Using "voltage" instead of "potential", which leads to inventing new words like "amperage" instead of "current" (I, measured in A) or possible "charge" (amperehours, Ah or Coulomb, C). Sometimes I don't even know what people mean. |
|
Status: Offline |
|
|
Belxjander
| |
Re: Perception-IME - Asian Languages support Posted on 21-Sep-2015 6:15:07
| | [ #9 ] |
|
|
|
Cult Member |
Joined: 4-Jan-2005 Posts: 557
From: Chiba prefecture Japan | | |
|
| @BSzili
I'm making this open source so that ALL family branches can gain...
AND there is a public 3rd party example of working OS compliant sources usable as a general reference.
The core library contains a hidden application that is launched when the library is opened.
I did this to isolate operations from input.device (a filter runs within input.device, and the plugins run within my application).
This isolating application context means that an IME failure does NOT fatally kill input.device, and I can re-feed "cooked" UTF-8 encoded text back to the OS as a whole, so that Intuition, the console and Commodities all respond to the InputEvents as configured.
I've been serious from the beginning and gave several months of thought to locale handling and possible system patching. My decision at that point was to ask each OS to handle UTF-8 or another Unicode variant as its standard text rendering, and not to modify anything without essential need.
AmigaOS and its "children" (AOS4/AROS/MOS) are all equally capable with the common core.
I've still got a LOT of automation to build as well.
Last edited by Belxjander on 21-Sep-2015 at 06:16 AM.
|
|
Status: Offline |
|
|
BSzili
| |
Re: Perception-IME - Asian Languages support Posted on 21-Sep-2015 6:18:41
| | [ #10 ] |
|
|
|
Regular Member |
Joined: 16-Nov-2013 Posts: 447
From: Unknown | | |
|
| @Belxjander
Ugh, that sounds way too technical to me. What does it do now? _________________ This is just like television, only you can see much further. |
|
Status: Offline |
|
|
Belxjander
| |
Re: Perception-IME - Asian Languages support Posted on 21-Sep-2015 6:58:12
| | [ #11 ] |
|
|
|
Cult Member |
Joined: 4-Jan-2005 Posts: 557
From: Chiba prefecture Japan | | |
|
| @BSzili
The core "Perception.Library" is mostly done; it only needs to "re-feed" inputs back to the system. Current output does present some UTF-8 through Sashimi and the kernel debug printf mechanism.
"Chinese.Language" and "Korean.Language" skeletons are present, along with some codeset skeletons.
Japanese.Language actively converts QWERTY/Dvorak ASCII characters into the matching Hiragana syllables (first Japanese target),
ka=か:ki=き:ku=く:ke=け: ko=こ for the entire Hiragana syllabary.
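As an illustration of the chording described above, here is a small C sketch that maps the "ka" row of romaji to Hiragana code points and encodes the result as UTF-8. The table layout and the names `Chord`, `chord_lookup` and `utf8_encode` are assumptions made for this example, not the structures used in Japanese.Language.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical chord table entry: typed keys -> Unicode code point. */
typedef struct {
    const char *romaji;
    uint32_t codepoint;
} Chord;

static const Chord ka_row[] = {
    { "ka", 0x304B },  /* か */
    { "ki", 0x304D },  /* き */
    { "ku", 0x304F },  /* く */
    { "ke", 0x3051 },  /* け */
    { "ko", 0x3053 },  /* こ */
};

/* Return the code point for a complete chord, or 0 if the keys typed
 * so far do not form a known syllable. */
uint32_t chord_lookup(const char *keys)
{
    size_t i;
    for (i = 0; i < sizeof ka_row / sizeof ka_row[0]; i++)
        if (strcmp(keys, ka_row[i].romaji) == 0)
            return ka_row[i].codepoint;
    return 0;
}

/* Encode one code point as UTF-8. Returns the number of bytes written,
 * or 0 for surrogates and values above U+10FFFF. */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

Typing "ka" would look up U+304B, which encodes to the three bytes E3 81 8B. A real table would cover the whole syllabary and would also need to handle partial input such as a lone "k".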
ARexx processes the Unicode readings for Asian scripts (more automation is needed). It currently generates a directory tree based on the Japanese readings and a list of Japanese UTF-8 encoded Kanji (everything UTF-8 valid to specification version 7).
ToDo: Additional filter steps for Chinese and Korean as separate scripts. ToDo: confirm the Hiragana to Katakana transform function, complete Kanji selection by Hiragana "reading", and present user pop-up lists for multiple-Kanji selections.
I also need to add a standardized dictionary for an additional "suggestion" mechanism (possibly useful as a word selection shortcut in any language?), specifically to accelerate entry of accented characters and non-European language words for "whole word" text entry.
Possible extensions... Touch Input, Speech Recognition (definitely out of what I know...) and Gesture support? (Meridian commodity on AOS3...refer to Aminet) Last edited by Belxjander on 21-Sep-2015 at 07:05 AM. Last edited by Belxjander on 21-Sep-2015 at 07:02 AM. Last edited by Belxjander on 21-Sep-2015 at 06:59 AM.
|
|
Status: Offline |
|
|
itix
| |
Re: Perception-IME - Asian Languages support Posted on 21-Sep-2015 7:06:35
| | [ #12 ] |
|
|
|
Elite Member |
Joined: 22-Dec-2004 Posts: 3398
From: Freedom world | | |
|
| @Belxjander
I am just wondering if you can get programs to display Japanese text. I have seen screenshots of native Amiga programs with Japanese. Of course, that is not even a necessity and is out of scope, but I was wondering if you are using some existing patch to display text.
_________________ Amiga Developer Amiga 500, Efika, Mac Mini and PowerBook |
|
Status: Offline |
|
|
Belxjander
| |
Re: Perception-IME - Asian Languages support Posted on 21-Sep-2015 7:13:57
| | [ #13 ] |
|
|
|
Cult Member |
Joined: 4-Jan-2005 Posts: 557
From: Chiba prefecture Japan | | |
|
| @itix
No, I have no Japanese display system patches. I currently rely on logging to files and visually inspecting the results within TimberWolf (I have overridden the defaults for ISO-* encodings, since with Japanese.Language in use, system-wide UTF-8 encoding is the default).
TimberWolf appears to use Cairo and FreeType internally, which allows display of the extended Unicode range.
I have test texts in Japanese, Korean and Chinese along with autogenerated listings of the encoded characters with which I can validate most text.
I've also run into the RTL and LTR switch trigger characters, these actively affect TimberWolf at least. Last edited by Belxjander on 21-Sep-2015 at 07:21 AM.
|
|
Status: Offline |
|
|
BSzili
| |
Re: Perception-IME - Asian Languages support Posted on 21-Sep-2015 7:27:17
| | [ #14 ] |
|
|
|
Regular Member |
Joined: 16-Nov-2013 Posts: 447
From: Unknown | | |
|
| @Belxjander
Whoa! Will this work with existing programs? _________________ This is just like television, only you can see much further. |
|
Status: Offline |
|
|
olegil
| |
Re: Perception-IME - Asian Languages support Posted on 21-Sep-2015 7:32:38
| | [ #15 ] |
|
|
|
Elite Member |
Joined: 22-Aug-2003 Posts: 5895
From: Work | | |
|
| @Belxjander
The three languages you're mentioning here all use left-to-right writing on computers, though (I know Japanese handwriting is usually done in top-to-bottom columns starting from the upper right corner of the page, but already from the typewriter era left-to-right was accepted in general).
1: How do you select between all the different Kanji that map to the same (single or group of) hiragana?
2: Do you have a complete(ish) list of what is needed from the display side of things? _________________ This weeks pet peeve: Using "voltage" instead of "potential", which leads to inventing new words like "amperage" instead of "current" (I, measured in A) or possible "charge" (amperehours, Ah or Coulomb, C). Sometimes I don't even know what people mean. |
|
Status: Offline |
|
|
Belxjander
| |
Re: Perception-IME - Asian Languages support Posted on 21-Sep-2015 8:16:10
| | [ #16 ] |
|
|
|
Cult Member |
Joined: 4-Jan-2005 Posts: 557
From: Chiba prefecture Japan | | |
|
@BSzili: I'm modifying the sources and have no issues within CygnusEd for editing purposes... Other than Firefox using Cairo directly, with FreeType for text within that, any system program that can accept input of UTF-8 sequences may work unchanged.
The primary issue comes from unaware text editors, which will generally cut only the first octet of a sequence because the code knows nothing about multi-byte characters.
Any text editor, word processor or desktop publishing application will need some awareness updates, if it is updated at all. Otherwise the user needs to copy/paste whole sequences.
Deletion is easy: the first delete breaks the sequence, showing strange characters, and you can then delete the rest of the sequence. For cut/copy/paste, ARexx or another scripted override may work around the problem and enable pasting.
Mostly expanded scripting and configs initially.
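For editors that do get an awareness update, deleting a whole sequence in one step mostly comes down to finding where the previous character starts. The following C sketch shows one way to do that; the function name is hypothetical and the buffer is assumed to hold well-formed UTF-8.

```c
#include <stddef.h>

/* Given a cursor at byte offset `pos`, return the offset at which the
 * character before the cursor begins. UTF-8 continuation bytes have the
 * form 10xxxxxx, and a sequence has at most three of them. */
size_t utf8_prev_boundary(const unsigned char *buf, size_t pos)
{
    size_t start, steps = 0;

    if (pos == 0)
        return 0;
    start = pos - 1;
    while (start > 0 && steps < 3 && (buf[start] & 0xC0) == 0x80) {
        start--;
        steps++;
    }
    return start;
}
```

A backspace handler would then remove the bytes from the returned offset up to the cursor, instead of removing a single octet and leaving a broken sequence behind.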
@olegil
1: Each reading presents multiple "candidates" back to the core... this will trigger user selection... I need to work on this more and keep it as straightforward as possible. Only those Kanji matching a given reading (regardless of length) will be returned as candidates from the LanguageHook inside each extended Language library.
2: The display side only requires upgrading graphics.library's text rendering: loop over the incoming string as normal, and when a character code is 128 or higher, validate it as a UTF-8 sequence. If the sequence is valid UTF-8, render the Unicode character; otherwise render the bytes as ISO-Latin-1 characters.
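The decoding half of that rendering loop could look roughly like the C sketch below. It is not graphics.library code: `next_char` is a hypothetical helper that returns the next code point and how many bytes it consumed, falling back to a single ISO-Latin-1 byte whenever the bytes do not form a valid UTF-8 sequence.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode the next character from s[0..len). Stores the code point in *cp
 * and returns the number of bytes consumed (0 only when len is 0).
 * Invalid, truncated or overlong sequences are treated as one Latin-1 byte. */
size_t next_char(const unsigned char *s, size_t len, uint32_t *cp)
{
    size_t need, i;
    uint32_t value, min;

    if (len == 0)
        return 0;
    if (s[0] < 0x80) {                      /* plain ASCII */
        *cp = s[0];
        return 1;
    }
    if ((s[0] & 0xE0) == 0xC0) {
        need = 1; value = s[0] & 0x1F; min = 0x80;
    } else if ((s[0] & 0xF0) == 0xE0) {
        need = 2; value = s[0] & 0x0F; min = 0x800;
    } else if ((s[0] & 0xF8) == 0xF0) {
        need = 3; value = s[0] & 0x07; min = 0x10000;
    } else {                                /* stray continuation or bad lead */
        *cp = s[0];
        return 1;
    }
    if (len < need + 1) {                   /* sequence cut off by end of string */
        *cp = s[0];
        return 1;
    }
    for (i = 1; i <= need; i++) {
        if ((s[i] & 0xC0) != 0x80) {        /* not a continuation byte */
            *cp = s[0];
            return 1;
        }
        value = (value << 6) | (uint32_t)(s[i] & 0x3F);
    }
    /* Reject overlong forms, surrogates and values beyond U+10FFFF. */
    if (value < min || value > 0x10FFFF || (value >= 0xD800 && value <= 0xDFFF)) {
        *cp = s[0];
        return 1;
    }
    *cp = value;
    return need + 1;
}
```

One caveat with this heuristic: some legitimate ISO-Latin-1 byte pairs (for example 0xC3 followed by 0xA9) are also valid UTF-8, so such text would be rendered as a single Unicode character rather than two Latin-1 ones.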
outside that, I don't have or know of anything else required. Last edited by Belxjander on 21-Sep-2015 at 09:37 AM. Last edited by Belxjander on 21-Sep-2015 at 08:18 AM.
|
|
Status: Offline |
|
|
olegil
| |
Re: Perception-IME - Asian Languages support Posted on 21-Sep-2015 10:10:57
| | [ #17 ] |
|
|
|
Elite Member |
Joined: 22-Aug-2003 Posts: 5895
From: Work | | |
|
| @Belxjander
Since I don't have any experience with programming GUI (neither Amiga nor elsewhere, except text and html based ones), how does the looping work? Is this something that is normally done for a text input box in a GUI? Or only for larger things like a text editor where you create your own output mechanism?
Sorry for the stupid questions
Oh, and why aren't you testing with a simple http server and server side push? Last edited by olegil on 21-Sep-2015 at 10:11 AM.
_________________ This weeks pet peeve: Using "voltage" instead of "potential", which leads to inventing new words like "amperage" instead of "current" (I, measured in A) or possible "charge" (amperehours, Ah or Coulomb, C). Sometimes I don't even know what people mean. |
|
Status: Offline |
|
|
Belxjander
| |
Re: Perception-IME - Asian Languages support Posted on 21-Sep-2015 10:32:31
| | [ #18 ] |
|
|
|
Cult Member |
Joined: 4-Jan-2005 Posts: 557
From: Chiba prefecture Japan | | |
|
| @olegil
HTTP is a network protocol, and I am writing a system component stack.
C language primarily.
Within input.device there is no loop: a single pass either pushes onto an in-memory FIFO queue (everything is on the same machine) or ignores the event and passes it through.
The "Perception-IME" application process waits for signals that trigger IPC port checks, and responds to event messages of various types. Every input.device push to the FIFO also signals the Perception-IME application process. The application process then handles the UTF-8 conversion when a chord sequence is recognised or a hash-lookup list has a user selection.
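A rough C sketch of that producer/consumer arrangement, assuming a 64KB ring buffer of 8-byte entries. The names `KeyEntry`, `KeyFifo`, `fifo_push` and `fifo_pop` are hypothetical, the signalling step is only marked with a comment, and any locking or memory-ordering concerns between the two contexts are left out.

```c
#include <stddef.h>
#include <stdint.h>

#define FIFO_BYTES   (64u * 1024u)
#define ENTRY_BYTES  8u
#define FIFO_ENTRIES (FIFO_BYTES / ENTRY_BYTES)   /* 8192 slots */

/* One queued key event; the 8-byte layout is an assumption. */
typedef struct {
    uint8_t bytes[ENTRY_BYTES];
} KeyEntry;

typedef struct {
    KeyEntry slots[FIFO_ENTRIES];
    uint32_t head;   /* next slot the filter writes      */
    uint32_t tail;   /* next slot the application reads  */
} KeyFifo;

/* Filter side: a single pass with no loop. Returns 1 if the event was
 * queued, 0 if the queue is full and the event should pass through. */
int fifo_push(KeyFifo *f, const KeyEntry *e)
{
    uint32_t next = (f->head + 1u) % FIFO_ENTRIES;
    if (next == f->tail)
        return 0;
    f->slots[f->head] = *e;
    f->head = next;
    /* A real filter would signal the application process here. */
    return 1;
}

/* Application side: called after waking up on the signal.
 * Returns 1 if an entry was read, 0 if the queue is empty. */
int fifo_pop(KeyFifo *f, KeyEntry *out)
{
    if (f->tail == f->head)
        return 0;
    *out = f->slots[f->tail];
    f->tail = (f->tail + 1u) % FIFO_ENTRIES;
    return 1;
}
```

One slot is deliberately kept empty so that a full queue can be told apart from an empty one without a separate counter.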
Text input across the whole OS is potentially affected, with any text being ISO-Latin-1 extended ASCII characters.
UTF-8 sequences appear as ASCII symbols when broken (highest bit stripped or invalidly encoded).
I don't need to care about whole documents at this time. Making every single character a full HTTP request would change the 64KB FIFO buffer from 8 bytes per character to possibly closer to 512 bytes per keypress (rough estimate, grain of salt material...).
So any "simple ... server" at this level is equivalent to the whole current project.
HTML/CSS is a very different beast: there is no scripting support unless I write the port handling myself.
"Web programming" gets most of the system interfacing tools premade; system components are what those tools use... different layers in the cake.
Last edited by Belxjander on 21-Sep-2015 at 11:30 AM.
|
|
Status: Offline |
|
|
jacadcaps
| |
Re: Perception-IME - Asian Languages support Posted on 21-Sep-2015 12:26:29
| | [ #19 ] |
|
|
|
Regular Member |
Joined: 20-Nov-2007 Posts: 203
From: Canada | | |
|
| @Belxjander
Well... I'm not sure I'd like to go with the integration route you are proposing in MorphOS. I'll keep an eye on what you're doing though. Ideally, one day I'd prefer to phase the input.device away entirely and use something higher level. OSX does remarkably well in that regard - the application (or class) author has to implement several methods that will for example tell the system engine at which offset should the selection menu be displayed. That lets one change an 'a' into an 'ą' for example. Or display a menu to select a proper kanji from. All of this, though, requires a capable text rendering engine as a first step. |
|
Status: Offline |
|
|
Belxjander
| |
Re: Perception-IME - Asian Languages support Posted on 21-Sep-2015 13:29:10
| | [ #20 ] |
|
|
|
Cult Member |
Joined: 4-Jan-2005 Posts: 557
From: Chiba prefecture Japan | | |
|
| @jacadcaps
I've separated the language handling (mapping input to language symbols) from the input.device filter... I think it is possible to cut out the input.device usage and handle that part differently.
Each target's variant doesn't need to be 100% identical in how the "input filter"/"plugin processing" happens; the only requirement is that the common staging model is kept. And that is a documented abstraction you can implement however you want.
The staging model is to deal with keyboard input for applications as transparently as possible initially, expanding service functionality as required/requested for direct application access.
I'm just doing an initial setup one way and not worrying about how the details differ for each port.
If you want to implement classes ("BOOPSI"/"MUI"/other?) and use those for feeding the input FIFO, or directly make use of the language-specific hook registrations and call the hooks from the calling application's context, what specifically did you have in mind?
I see no problem in providing both methods if you wish to add some modifications to the sources.
The only issue I would have is with any submission that breaks other platform targets without prior discussion and unanimous acceptance.
I know the AOS4-specific Interface handling would definitely change for AROS and MOS. Beyond that... making it work on all three with the existing design would be the primary initial goal. Adding classes, or preparation for classes, would be acceptable where the primary goal is also accomplished.
Then in later versions we can add features as the project grows, if you wish to make submissions and help improve what is there. Even just discussing my current "brute force" approach and explaining your own ideas on how to make improvements would be a big help.
Does that work for you? Last edited by Belxjander on 21-Sep-2015 at 02:05 PM.
|
|
Status: Offline |
|
|