Amigaworld.net - The Amiga Computer Community Portal Website

Assembly startup codes for ECX compiler in VAsm?

Samurai_Crow

Re: Assembly startup codes for ECX compiler in VAsm?
Posted on 13-Aug-2021 16:42:11

[ #21 ]

Elite Member

Joined: 18-Jan-2003
Posts: 2320
From: Minnesota, USA

@Hypex

Quote:

Hypex wrote:

I also wonder how an OS supports C++. To me this looks a compiler feature with calling methods. But with an OS it must work with a specific ABI. A compiler can't define the ABI nor constrict a class how it wants to privately. Like how the OS4 methods work. It's customised to work on OS4. Somehow on modern systems they have some kind of transparency with calling methods.

It's called the "thiscall" calling convention. The interface pointer (IInterfacename) is passed to the function as the first parameter, and the same pointer is also the VTable structure used to look up the function for the call. Most operating systems hide both facts behind the calling convention: the interface is indexed as the VTable and passed as a hidden first parameter, even though they do exactly the same thing as OS 4 under the hood/bonnet.
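
As a rough illustration of the point above, the sketch below models an interface as a table of functions whose entries receive the interface itself as their first argument. The names `Interface`, `counter_add`, `make_counter_interface` and `call` are made up for this example and are not taken from the OS 4 SDK; Python is used only to keep the sketch short.

```python
class Interface:
    """Hypothetical stand-in for an OS 4 style interface: it holds the
    function table and is also what gets passed as the first argument."""
    def __init__(self, vtable, data):
        self.vtable = vtable   # name -> plain function
        self.data = data       # per-interface state


def counter_add(self_iface, amount):
    # The callee sees the interface as an ordinary first parameter.
    self_iface.data["total"] += amount
    return self_iface.data["total"]


def make_counter_interface():
    return Interface({"add": counter_add}, {"total": 0})


def call(iface, name, *args):
    # What a "this call" style convention hides: look the function up
    # through the interface, then pass the same interface as argument 1.
    return iface.vtable[name](iface, *args)


icounter = make_counter_interface()
explicit = icounter.vtable["add"](icounter, 5)   # OS 4 style, spelled out
hidden = call(icounter, "add", 2)                # C++ style, pointer hidden
```

Both calls end up in the same function with the same first argument; the only difference is whether the caller has to write the interface twice.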

Status: Offline

Hypex

Re: Assembly startup codes for ECX compiler in VAsm?
Posted on 14-Aug-2021 14:20:59

[ #22 ]

Elite Member

Joined: 6-May-2007
Posts: 11351
From: Greensborough, Australia

@matthey

Quote:
I'm guessing it was ELF which may have worked if you knew what variation was needed for AmigaOS 4 and how to specify it. Any AmigaOS variation so far needs unusual ELF handling for the relocatable scatter loader.

It's natural to think it would use some form of ELF. But in fact it needs HUNK. It's a means to an end. It's not OS specific. It's just PPC code assembled into an object. The OS4 variant of the code just changes the OS calls. The code is included in all platform versions of the compiler and it can compile E code for any platform, "any" being 68K, OS4 or MOS. If the PPC code isn't HUNK it reports a "scanados_t" error upon E module load. The target code compiler must use a common HUNK loader for the code generator regardless of CPU that is loaded in at run time.

Quote:
The a4 register is used as a pointer to global data when compiling with small data which has become less common as program sizes have grown (also used in libraries sometimes requiring a geta4() call so a4 points to library data). The a5 register is the frame pointer on the 68k Amiga but this is often turned off on the Amiga (vbcc defaults to no frame pointer) as it is more efficient to use the a7 stack pointer for local variables (a5 frame pointer is only for debugging). It is rare for programs to use a4 or a5 for args as they may be used but it is also rare to have that many pointer args. Too many register args can cause the function called to spill registers to gain working registers which defeats the purpose of register args.

The biggest I know is graphics Blt functions where some functions take on average 10 parameters. A lot of Amiga C code commonly uses A4 for globals. And using A5 was just as common, by being linked to the stack to create some local space. Suppose a C compiler could access it directly on the stack, but they tended to reserve it by doing a LINK A5 on it. The only exception I know would be hand crafted ASM or some other compiler that doesn't follow common conventions.

Quote:
I have to say the AmigaOS 4 library call setup looks better even with the double memory indirect. MOS using the 68k registers in memory isn't efficient.

There's probably no way around the indirect. The base pointer has to be loaded, then the function loaded from the offset. But at least it is a direct jump from there. MOS loads in the call function, then loads in base, stores it, then sets offset, then calls routine. The "Emul" in the naming suggests some form of 68K ABI emulation, though emulation is a touchy term for what should be a native call. The register array used is exactly how the OS4 one looks in a 68K EmuTrap which is used as a trampoline into native code.

Quote:
Most 68k fans are not interested in switching to x86-64. The CPUs are so high performance that there is no need for efficient code anyway.

OS4 and MOS for that matter got a lot of criticism for not using a cheaply available and more powerful CPU. Of course both OS4 and MOS are derivative of the PowerUP hardware they were developed on. No x86 accelerator was ever made for a real Amiga so saying OS4 should have been on x86 (like the first one that Amiga Inc. did announce) is rather pointless. It is only in recent times that ARM is making its way inside Amigas but not with the same integration. The only exception would be PC bridgeboard cards. Suppose someone could have made a "PowerPC" board with a custom kernel that offloaded sidecode to the x86 and said "Amiga goes Power PC". But porting AmigaOS so it all ran together is a whole different ballgame. However, it is an opportunity for a RISC reboot, or perhaps RISC remix, with ARM. They could follow the theme and make up a new slogan, "The PowerARM. Amiga extends its power with a strong ARM!"

Quote:
PC relative addressing has become more important in modern ISAs and with 64 bit addressing. The x86-64 RIP relative addressing was a big improvement for x86 which had poor PC relative support due to early segment use. AArch 64 has improved PC relative support.

Given the amount of PPC instructions that load off base offsets it would have made sense to add PC there as well. But, they reduced it too much, even the move is really doing an add or an or. It can be done if a routine loads a PC base into a GPR, and there is the -fPIC option, but it's not as dynamic.

Quote:
The 68k has good PC relative support but it could be better for 64 bit addressing including a (d32,pc) explicit mode (it has implicit support for branches) and PC relative writes. RIP relative addressing of x86-64 has shown the advantage of PC relative writes even though some purists don't think they should be allowed even though restricting writes adds little to security. Some OSs have separate sections for code (read only protected) and text areas (read only and no execute) but this reduces PC relative use and code density. PC relative accesses save a register while often providing more compact code. Absolute addressing is very inefficient with 64 bit addressing and (d16,Rn) addressing which was common for the 68k and PPC only accesses 64kiB of data which is limiting for large programs today.

ELF is suitable for "segmenting" the different code sections into text, read only, data and so forth. And this happens for OS4 code as well. Found this myself when a char * wasn't accepted by GCC and had to const char * it, which got annoying when char pointers used to just work. So the code and data couldn't be far away. Apparently OS4 code uses absolute addressing by default in GCC which is why examples of interrupt handlers accessing globals in interrupts don't crash. I always wondered why they didn't use the data pointer and simply assumed they could access a variable from anywhere. Even if useful I think it's a bad example. My own interrupt code always uses data as it never assumes it can always write to a global. ECX uses base offsets. So RIP is only recent to x86? I thought PC relative was a CISC standard.

Quote:
The trap method of system calls allows more separation of user and supervisor code but it is also more expensive. Switching to Supervisor mode and flushing the pipeline makes a trap relatively more expensive on modern processors.

It was common in the 80's. I've come across some ST code once. And Mac code at times.

Quote:
I would think disabling interrupts would be necessary and that a forbid would not be enough. Mutexs and semaphores are tricky to use.

An interrupt disable is done the same way as a forbid, increasing a counter. There may be some extra logic on PPC like on 68K where it disables DMA on hardware first. OS4 offers mutexes to the user. I use semaphores but the access routines do a forbid. That's out of my control and I think if they wanted people to avoid forbid the first place to start is within the system routines.
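
A simplified model of the nesting counter idea described above, assuming a counter that starts at -1 and treats any value of 0 or above as "locked". `NestCounter`, `lock` and `unlock` are hypothetical names for this sketch; the real Forbid()/Permit() and Disable()/Enable() do more work than this.

```python
class NestCounter:
    """Toy model of a nesting lock such as Forbid()/Permit()."""
    def __init__(self):
        self.count = -1          # -1 means "not locked"

    def lock(self):
        self.count += 1          # nested calls just count up

    def unlock(self):
        if self.count < 0:
            raise RuntimeError("unlock without matching lock")
        self.count -= 1

    @property
    def locked(self):
        return self.count >= 0


forbid = NestCounter()
forbid.lock()
forbid.lock()                    # nested
forbid.unlock()
still_locked = forbid.locked     # one lock is still outstanding
forbid.unlock()
finally_unlocked = not forbid.locked
```

The counter is what makes nesting safe: an inner unlock does not release the outer lock.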

Quote:
It's no worse than using a CAS instruction which can also fail.

Didn't see that in use much at all and not in OS routines.

Quote:
While PPC has 3 op and twice the number of 68k registers, it's not unusual for the 68k to use half the number of instructions and half the code size. This is what happens when a RISC processor touches memory but the problems include PPC deficiencies too. PPC may make up some ground when doing complex functions but how many instructions on average are executed without using memory?

Not many when I've looked at code. There's a lot of loading and storing. The optimisation has to be turned on otherwise there's too much stack framing and not enough registered variables.

Quote:
The 68020+ has BSR.L and Bcc.L so the PPC only beats the 68000. AArch64 has better PC relative addressing than PPC in many cases too as I posted above. In some ways, PPC is more outdated than the 68k.

Oh dear. The BSR.L takes up 6 bytes. But I missed something. There are 24 bits reserved for the branch offset, but the offset has a 32MB radius. I also found there is an absolute addressing mode, unheard of on PowerPC! So a branch can be 24 bit relative or absolute. Optimised to 32 bit alignment. 24 bit would seem limiting, even with 4GB RAM space, but VM would expand it virtually.

Status: Offline

NutsAboutAmiga

Re: Assembly startup codes for ECX compiler in VAsm?
Posted on 14-Aug-2021 19:50:09

[ #23 ]

Elite Member

Joined: 9-Jun-2004
Posts: 12993
From: Norway

@Hypex

Quote:
but VM would expand it virtually.

It would be problematic if code was pushed out to a virtual page and then moved to some other memory address without relocating it first. I also hate to think of the stack, regarding branch return addresses. It would need to be moved back to the exact same place, if it were to be done.

Anyhow I think it's possible, if it was only apps/programs, and not libraries and devices.

Last edited by NutsAboutAmiga on 14-Aug-2021 at 11:01 PM.
Last edited by NutsAboutAmiga on 14-Aug-2021 at 11:00 PM.
Last edited by NutsAboutAmiga on 14-Aug-2021 at 07:51 PM.

_________________
http://lifeofliveforit.blogspot.no/
Facebook::LiveForIt Software for AmigaOS

Status: Offline

matthey

Re: Assembly startup codes for ECX compiler in VAsm?
Posted on 15-Aug-2021 2:23:46

[ #24 ]

Elite Member

Joined: 14-Mar-2007
Posts: 2754
From: Kansas

Hypex Quote:

It's natural to think it would use some form of ELF. But in fact it needs HUNK. It's a means to an end. It's not OS specific. It's just PPC code assembled into an object. The OS4 variant of the code just changes the OS calls. The code is included in all platform versions of the compiler and it can compile E code for any platform. any being 68K, OS4 or MOS. If the PPC code isn't HUNK it reports a "scanados_t" error upon E module load. The target code compiler must use a common HUNK loader for the code generator regardless of CPU that is loaded in at run time.

The Amiga unfriendly official AmigaOS 4 GCC compiler only produces ELF so they at least need ELF2HUNK.

Hypex Quote:

The biggest I know is graphics Blt functions where some functions take on average 10 parameters. A lot of Amiga C code commonly uses A4 for globals. And using A5 was just as common, by being linked to the stack to create some local space. Suppose a C compiler could access it directly on the stack, but they tended to reserve it by doing a LINK A5 on it. The only exception I know would be hand crafted ASM or some other compiler that doesn't follow common conventions.

A lot of older and smaller 68k Amiga programs use small data (default a5 base in GCC & LLVM, default a4 base on Amiga). The data must fit in 64kiB which is less common with more modern code. Some programmers avoid using small data if they think their data may grow past 64kiB. Small data is barely faster on the 68040 and may be slower in some cases as the 68040 is not slowed as much as other 68k CPUs by large code. Leaving small data off simplifies compiling and makes debugging easier. The advantage of SD is reduced code size from converting xxx.l memory accesses to (d16,An) and relocatable code.
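
To make the 64 KiB limit concrete, here is a small sketch that checks whether a data offset can be reached with a signed 16 bit displacement from a base register. The helper name `fits_d16` and the idea of biasing the base pointer into the middle of the data section are assumptions of this example; actual compilers differ in where they place the base.

```python
D16_MIN = -(1 << 15)      # -32768
D16_MAX = (1 << 15) - 1   #  32767


def fits_d16(offset_from_section_start, base_bias=0):
    """True if (d16,An) can reach the byte at the given section offset
    when An points base_bias bytes past the start of the section."""
    displacement = offset_from_section_start - base_bias
    return D16_MIN <= displacement <= D16_MAX


# Base at the section start: only the first 32 KiB is reachable.
reach_unbiased = [fits_d16(0), fits_d16(32767), fits_d16(32768)]

# Base biased by 32768: the whole 64 KiB section is reachable.
reach_biased = [
    fits_d16(0, 32768),
    fits_d16(65535, 32768),
    fits_d16(65536, 32768),
]
```

Anything past the reachable window has to fall back to xxx.l accesses, which is where the code size advantage of small data disappears.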

The frame pointer (default a6 in GCC & LLVM, a5 on Amiga) is commonly turned off by 68k programs which results in higher performance code (-fomit-frame-pointer in GCC). The only reason to use it is for debugging and a good debugger works fine without it in my experience. There are programmers who forget to turn it off with GCC. Some Amiga users have suggested Stefan Franke (bebbo) turn it off by default in the new unofficial GCC. It is off by default in vbcc.

Hypex Quote:

OS4 and MOS for that matter got a lot of criticism for not using a cheaply available and more powerful CPU. Of course both OS4 and MOS are derivative of the PowerUP hardware they were developed on. No x86 accelerator was ever made for a real Amiga so saying OS4 should have been on x86 (like the first one that Amiga Inc. did announce) is rather pointless. It is only in recent times that that ARM is making itself inside Amigas but not with the same integration. The only exception would be PC bridgeboard cards. Suppose someone could have made a "PowerPC" board with a custom kernel that off loaded sidecode to the x86 and said "Amiga goes Power PC". But porting AmigaOS so it all ran together is whole different ballgame. However, it is an opportunity for a RSIC reboot, or perhaps RISC remix, with ARM. They could follow the theme and make up a new slogan, "The PowerARM. Amiga extends it's power with a strong ARM!"

At one time, PPC was not a bad choice for a 68k replacement being big endian. ARM was not powerful and x86 was little endian with less big endian support at that time. SuperH would have been an interesting 68k replacement but it really should have used a variable length encoding from the start instead of a fixed length 16 bit encoding. It is handicapped with so few immediate and displacement encoding bits leading to way too many (often dependent) instructions. The code density is good but it doesn't matter if it has to execute 50% more instructions. Then again, it does have more addressing modes than most other RISC processors and they are copied from the 68k. If they had used a variable length instruction set, maybe ARM Thumb2 would never have replaced SuperH.

Hypex Quote:

ELF is suitable for "segmenting" the different code sections into text, read only, data and so forth. And this happens for OS4 code as well. Found this myself when a char * wasn't accepted by GCC and had to const char * it, which got annoying when char pointers used to just work. So the code and data couldn't be far away. Apparently OS4 code uses absolute addressing by default in GCC which is why examples of interrupt handlers accessing globals in interrupts don't crash. I always wondered why they didn't use the data pointer and simply assumed they could access a variable from anywhere. Even if useful I think it's a bad example. My own interrupt code always uses data as it never assumes it can always write to a global. ECX uses base offsets. So RIP is only recent to x86? I thought PC relative was a CISC standard.

It should be possible to combine code/text and read only sections to allow PC relative addressing of everything but the most secure OSs which leave code/text (read only) and read only data (read only, no execute) in different pages with different MMU page privileges. Separate pages results in lower performance from increased code size and wasting memory from increased code size and page alignment. It would still be possible to use PC relative addressing across different page types if they are near but it is less efficient. It is still more efficient than absolute addressing with 64 bit addresses though. A 64 bit xxx.q is 8 bytes where (d32,pc) is 4 bytes but neither of these exist with a fixed length 32 bit encoding. PC relative addressing is usually associated with CISC because CISC more commonly uses a variable length encoding. RISC with a variable length encoding can have more efficient PC relative addressing and more immediate and displacement bits that scale into larger instruction sizes as necessary. However, ARM Thumb2 and RISC-V did *not* make good use of this but rather mostly chose to make short encodings of 32 bit instructions. Mitch Alsup was working on a RISC ISA which better takes advantage of this scaling like the 68k (he is one of the designers of the 68k). Let me demonstrate.

16 bit encodings using (d8,pc) - only implicit short branch instructions on the 68k
32 bit encodings using (d16,pc) - this is a regular EA on the 68k, implicit 16 bit branch
48 bit encodings using (d32,pc) - no EA on the 68k but the encoding is available, implicit 32 bit branch
64 bit encodings using (d32,pc) - the 68020 can use (bd,An,Xn*SF) EA for (d32,pc) and other variations
96 bit encoding using (d64,pc) - encoding is available for bd=64 above and 64 bit branches

It would also be nice to allow base relative addressing that scales in a similar manner although this requires a base register instead of using the PC.

32 bit encoding using (d16,Rn) - most RISC ISAs and 68k have this
48 bit encoding using (d32,Rn) - 68k does not have this and difficult to add, same length as xxx.l
64 bit encoding using (d32,Rn) - the 68020 has (bd,An,Xn*SF), xxx.l is cheaper for 32 bit addressing
96 bit encoding using (d64,Rn) - the 68k encoding for (bd,An,Xn*SF) with bd=64 is open

The 68k also has absolute addressing which avoids using a base register but scales funny as the accesses are more efficient in lower memory. Code is not position independent and may require relocation information (RELOCs).

32 bit encoding xxx.w
48 bit encoding xxx.l
80 bit encoding xxx.q - this could be added to the 68k for a 64 bit ISA but worth it?
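
The "more efficient in lower memory" remark follows from xxx.w being a 16 bit value that is sign extended to 32 bits. A sketch of picking the shortest absolute encoding for a 32 bit address, with hypothetical helper names `fits_abs_short` and `abs_extension_bytes`:

```python
def fits_abs_short(address):
    """True if a 32 bit address equals its low 16 bits sign extended,
    which is what the 68k xxx.w mode can express."""
    address &= 0xFFFFFFFF
    low = address & 0xFFFF
    sign_extended = (low | 0xFFFF0000) if (low & 0x8000) else low
    return sign_extended == address


def abs_extension_bytes(address):
    # Extension bytes after the opcode word: 2 for xxx.w, 4 for xxx.l.
    return 2 if fits_abs_short(address) else 4


sizes = {
    hex(a): abs_extension_bytes(a)
    for a in (0x00000004, 0x00007FFF, 0x00008000, 0x00DFF000, 0xFFFF8000)
}
```

Only the bottom 32 KiB and the top 32 KiB of the address space qualify for the short form, so everything else pays for the full 32 bit address.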

The x86-64 RIP relative addressing only supports (d32,pc) which with PC relative writes gives a big advantage. There are many x86-64 code models though and some use 64 bit absolute addressing which is very wasteful (8 bytes/EA). The x86-64 ISA lacks consistency and the many code models shows that the EA support is not optimum. The 68k ISA is nearly optimum for 32 bit addressing with its scaling across the whole 32 bit address space but 64 bit addressing support is more difficult. It shouldn't be difficult to do better than x86-64 though. Sadly, most RISC 64 bit ISAs still only support 16 bit base register relative, 16 bit immediates and 16 bit displacements which is inferior to x86-64. Compilers have to do more work when there is no encoding for what they need and this often results in multiple dependent instructions and larger code.

Hypex Quote:

It was common in the 80's. I've come across some ST code once. And Mac code at times.

The Amiga is the only 68k OS I'm aware of that does not trap into Supervisor mode for system calls. The Amiga was one of the few early microkernels where the philosophy is to stay out of supervisor mode as much as possible. Unfortunately, the Amiga allowed any program to enter Supervisor mode.

Hypex Quote:

An interrupt disable is done the same way as a forbid, increasing a counter. There may be some extra logic on PPC like on 68K where it disables DMA on hardware first. OS4 offers mutxes to the user. I use semaphores but the access routines do a forbid. That's out of my control and I think if they wanted people to avoid forbid the first place is within the system routines.

Disabling interrupts is more than disabling DMA and more than disabling multitasking.

Hypex Quote:

Didn't see that in use much at all and not in OS routines.

CAS has been the minimum data sharing instruction and allows building more advanced data sharing operations. Newer fancy RISC processors may have other support.
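
As a sketch of how a larger operation is built on CAS, and of what "can also fail" means in practice, the model below retries an add until the compare-and-swap succeeds. It is single threaded; `Cell`, `compare_and_swap`, `atomic_add` and the `interfere` hook are names invented for the example, with the hook standing in for another task touching the value.

```python
class Cell:
    """One shared memory word."""
    def __init__(self, value):
        self.value = value


def compare_and_swap(cell, expected, new):
    """Model of CAS: store new only if the cell still holds expected.
    Real hardware does this atomically; this model is single threaded."""
    if cell.value == expected:
        cell.value = new
        return True
    return False


def atomic_add(cell, amount, interfere=None):
    """Retry loop built on CAS. 'interfere' is a hypothetical hook that
    simulates another task changing the cell between read and CAS."""
    attempts = 0
    while True:
        attempts += 1
        seen = cell.value
        if interfere is not None:
            interfere(cell)
            interfere = None          # only disturb the first attempt
        if compare_and_swap(cell, seen, seen + amount):
            return attempts


counter = Cell(10)
tries = atomic_add(counter, 5, interfere=lambda c: setattr(c, "value", 100))
```

The first attempt fails because the value changed underneath it; the second attempt rereads the value and succeeds, so no update is lost.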

Hypex Quote:

Oh dear. The BSR.L takes up 6 bytes. But I missed something. There are 24-bits reserved for the branch offset, but the offset has a 32MB radius. I also found there is also an absolute addressing mode, unheard on PowerPC! So branch can be 24 bit relative or absolute. Optimised to 32 bit alignment. 24 bit would seem limiting, even with 4GB ram space, but VM would expand it virtually.

PPC doesn't encode the lower 2 bits since code is 4 byte aligned. This allows the same range as 26 bits (2^26) of encoding which is +-32MiB. The extra range is nice from removing the lower bits but this made a compressed encoding more difficult which hurt PPC. IBM's CodePack had to use a more complex dictionary based compression and Freescale's VLE is a replacement encoding for PPC.
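
The arithmetic can be checked with a short sketch: a 24 bit field that counts 4 byte words, sign extended, covers the same span as a 26 bit byte offset. The function names `encode_li` and `decode_li` are made up here; only the 24 bit field width and the two implied zero bits come from the discussion above.

```python
LI_BITS = 24          # width of the offset field in the branch


def encode_li(byte_offset):
    """Return the 24 bit field for a byte offset, or raise if the
    target is misaligned or out of range."""
    if byte_offset % 4:
        raise ValueError("branch targets are 4 byte aligned")
    words = byte_offset // 4
    if not -(1 << (LI_BITS - 1)) <= words < (1 << (LI_BITS - 1)):
        raise ValueError("offset out of range")
    return words & ((1 << LI_BITS) - 1)


def decode_li(li):
    """Sign extend the field and put the two implied zero bits back."""
    if li & (1 << (LI_BITS - 1)):
        li -= 1 << LI_BITS
    return li * 4


min_offset = decode_li(1 << (LI_BITS - 1))        # most negative field value
max_offset = decode_li((1 << (LI_BITS - 1)) - 1)  # most positive field value
round_trip = decode_li(encode_li(-4096))
```

The reach works out to -32 MiB up to 32 MiB minus 4 bytes, which is the "32MB radius" mentioned earlier in the thread.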

Last edited by matthey on 15-Aug-2021 at 03:24 PM.
Last edited by matthey on 15-Aug-2021 at 02:42 AM.
Last edited by matthey on 15-Aug-2021 at 02:33 AM.

Status: Offline

NutsAboutAmiga

Re: Assembly startup codes for ECX compiler in VAsm?
Posted on 15-Aug-2021 8:58:35

[ #25 ]

Elite Member

Joined: 9-Jun-2004
Posts: 12993
From: Norway

@matthey

Quote:

32 bit encoding using (d16,Rn) - most RISC ISAs and 68k have this

If they drop bits 0 and 1 they can do a 4 byte aligned version of it.
Addresses +1, +2 and +3 are misaligned; it does not make a lot of sense to have instructions that can cause misalignment anyway. The compiler should auto pad most structs to prevent this from happening.

So instead you gain two extra high bits (bits 17 and 16 of the displacement), extending the range to about +/- 128 KiB (-131072 to +131068). Not sure what the typical size of a struct is, but it is a lot smaller than arrays; for indexing arrays I think (Rn+Rn) indexed load/store is more common.

Last edited by NutsAboutAmiga on 15-Aug-2021 at 09:07 AM.

_________________
http://lifeofliveforit.blogspot.no/
Facebook::LiveForIt Software for AmigaOS

Status: Offline

matthey

Re: Assembly startup codes for ECX compiler in VAsm?
Posted on 15-Aug-2021 14:26:22

[ #26 ]

Elite Member

Joined: 14-Mar-2007
Posts: 2754
From: Kansas

NutsAboutAmiga Quote:

If they drop bit 0, 1 they can do 4byte aligned version of it.
address +1,+2,+3 is miss alignment, it does not make a lot of sense to have instructions that can cause misalignment, anyway. The compiler should auto pad most structs, to prohibit this from happenings.

For (d16,Rn), it is possible to scale the displacement by the data access size so accesses are always naturally aligned. SuperH does this for (d4,Rn), (d8,gbr) and (d8,pc). SuperH only has a fixed 16 bit encoding with limited encoding space so only (d4,Rn) is supported for all GP registers which is pretty limiting even for structure accesses. Global data is accessed through the GBR register with (d8,gbr) and nearby constants through the PC with (d8,pc) but this is still restrictive. The scaling by size idea is nice but SuperH is handicapped by the fixed 16 bit encoding making it a toy ISA only useful for small (embedded) systems IMO. A 68k64 ISA could scale the displacement by the size although this would require a separate 64 bit mode (which I favored but Gunnar rejected). A separate mode is required anyway to allow the 2 bit instruction size field to support byte, word, longword and quadword which is optimum.
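
A quick way to see what scaling by access size buys is to compute the reachable byte range for a given field width. The helper `scaled_reach` is hypothetical; the unsigned 4 bit case is meant to mirror SuperH (d4,Rn), and the signed 16 bit case is the imagined scaled (d16,Rn) from the discussion.

```python
def scaled_reach(disp_bits, access_size, signed):
    """Byte range reachable when a displacement field counts in units
    of the access size instead of bytes."""
    if signed:
        lo = -(1 << (disp_bits - 1))
        hi = (1 << (disp_bits - 1)) - 1
    else:
        lo, hi = 0, (1 << disp_bits) - 1
    return lo * access_size, hi * access_size


# Unsigned 4 bit field, for byte, word and longword accesses.
d4_reach = [scaled_reach(4, size, signed=False) for size in (1, 2, 4)]

# Hypothetical signed 16 bit field scaled for longword accesses.
d16_long_reach = scaled_reach(16, 4, signed=True)
```

Even with scaling, a 4 bit field only reaches 60 bytes for longwords, while a scaled 16 bit field reaches roughly 128 KiB either side of the base.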

NutsAboutAmiga Quote:

So instead you have Bit 33 and Bit 32, extending the range to +/- 131070. Not sure what typical size of struct is, but lot smaller than arrays, for index arrays I think (Rn+Rn), index load/store is more common.

(Rn,Rn) is the basic array addressing mode but index scaling is nice for arrays too (Rn,Rn*SF). This avoids the shift instruction for the data access size which is free in some core designs. Adding a fixed offset to an array access can be supported with (dn,Rn,Rn*SF) which is supported by the 68k, x86 and AArch64. PPC needs 3 instructions for a (dn,Rn,Rn*SF) data access. MIPS doesn't even have (Rn,Rn) so needs 4 instructions.
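
For reference, the address such a mode forms is just displacement plus base plus scaled index. The sketch below computes it in one step and then in the separate shift, add and displaced access steps that an ISA without the mode has to issue. The function name and the example layout (a struct at 0x1000 with an array 8 bytes in) are invented for illustration.

```python
def effective_address(disp, base, index, scale):
    """Address formed by a (dn,Rn,Rn*SF) style mode."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale factor must be 1, 2, 4 or 8")
    return disp + base + index * scale


# Hypothetical layout: a struct at 0x1000 whose array of 32 bit
# values starts 8 bytes in; form the address of element 5.
struct_base = 0x1000
array_offset = 8
element = 5
ea = effective_address(array_offset, struct_base, element, 4)

# The same address built step by step: shift the index, add the base,
# then apply the displacement in the memory access itself.
scaled_index = element << 2
base_plus_index = struct_base + scaled_index
ea_stepwise = base_plus_index + array_offset
```

The two extra steps are also dependent on each other, so they cannot be overlapped, which is part of the cost being described.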

Status: Offline

Hypex

Re: Assembly startup codes for ECX compiler in VAsm?
Posted on 16-Aug-2021 13:01:34

[ #27 ]

Elite Member

Joined: 6-May-2007
Posts: 11351
From: Greensborough, Australia

@Samurai_Crow

Quote:
It's called the "This Call" calling convention. The way that IInterfacename is passed to the function as the first parameter as well as being the VTable structure for the call. Most operating systems hide that the interface is both indexed as the VTable and the first parameter is hidden behind the calling convention even though they do the exact same thing as OS 4 under the hood/bonnet.

The other thing is I read that other systems aren't as pedantic as AmigaOS is with the need to open libraries. Such as available functionality in main(). To be able to use standard OS resources like DOS, graphics and user interfaces. And on OS4 it's more manual intervention required with interfaces.

The Self pointer is hidden in the form of the APICALL. But it's hidden on the caller side. On the method side it's the first parameter. Of course it needs to be there so the method knows what object it was called off. But it's confusing in compiler errors where the parameters are shifted to the wrong spot.

Of course, 68K had something similar, the library base. It was almost always needed to be in A6 and called off it. So in some ways it acted like the object or interface, as it just sat at the end of the parameter list and wasn't listed as a parameter. But, it was implicit, it needed to be there most of the time.

Status: Offline

Hypex

Re: Assembly startup codes for ECX compiler in VAsm?
Posted on 16-Aug-2021 14:42:28

[ #28 ]

Elite Member

Joined: 6-May-2007
Posts: 11351
From: Greensborough, Australia

@NutsAboutAmiga

Quote:
it be problematic if code was pushed to virtual page, then moved into some other memory address without relocating it first, also hate think of the stack, regarding branch return addresses. It needs to moved back exact same place, if where to be done.

Generally it wouldn't matter as it would be mapped in the OS. Though each process having its own page sounds expensive on top of context switching. In AmigaOS it would be a problem because each task needs to share space with others so can't have crossover in addresses.

Status: Offline

Hypex

Re: Assembly startup codes for ECX compiler in VAsm?
Posted on 17-Aug-2021 14:01:40

[ #29 ]

Elite Member

Joined: 6-May-2007
Posts: 11351
From: Greensborough, Australia

@matthey

Quote:
The Amiga unfriendly official AmigaOS 4 GCC compiler only produces ELF so they at least need ELF2HUNK.

I tried to assemble the source with gas or as which was a pain in the proverbial. It liked it as much as vasm did so I gave up. I managed to compile PAsm, which I found a source only archive of. PAsm assembled it straight away. In fact I thought something went wrong because it didn't output any messages. But it silently produced a file. The hunk format was just another option away.

Quote:
A lot of older and smaller 68k Amiga programs use small data (default a5 base in GCC & LLVM, default a4 base on Amiga). The data must fit in 64kiB which is less common with more modern code. Some programmers avoid using small data if they think their data may grow past 64kiB. Small data is barely faster on the 68040 and may be slower in some cases as the 68040 is not slowed as much as other 68k CPUs by large code. Leaving small data off simplifies compiling and makes debugging easier. The advantage of SD is reduced code size from converting xxx.l memory accesses to (d16,An) and relocatable code.

Generally the small data would be taken up by the global variables. 64KB seems small but even in a large program it would need many thousands of variables to break the limit. Using long ints over short ints would bloat it. But static structures would cause the most trouble.

Quote:
The frame pointer (default a6 in GCC & LLVM, a5 on Amiga) is commonly turned off by 68k programs which results in higher performance code (-fomit-frame-pointer in GCC). The only reason to use it is for debugging and a good debugger works fine without it in my experience. There are programmers which forget to turn it off with GCC. Some Amiga users have suggested Stefan Franke (bebbo) turn it off by default in the new unofficial GCC. It is off by default in vbcc.

It must be a modern convention. I'm used to the classic compilers like MANX or SAS. Lattice. Where classic 68k code featured A4 and A5 a lot.

Quote:
At one time, PPC was not a bad choice for a 68k replacement being big endian. ARM was not powerful and x86 was little endian with less big endian support at that time. SuperH would have been an interesting 68k replacement but it really should have used a variable length encoding from the start instead of a fixed length 16 bit encoding. It is handicapped with so few immediate and displacement encoding bits leading to way too many (often dependent) instructions. The code density is good but it doesn't matter if it has to execute 50% more instructions. Then again, it does have more addressing modes than most other RISC processors and they are copied from the 68k. If they had used a variable length instruction set, maybe ARM Thumb2 would never have replaced SuperH.

From things I've read online the ARM could compete quite well with the 68K. But they needed to go beyond the 100 MHz barrier. At the time, PPC was good. It gave the Amiga some extra life with modern 3D games. Extra power. Of course it was really sidelined to CPU intensive tasks. Used as a co-processor. Funny, the RISC code for PPC is just like the RISC code on the copper.

Another option, they could have tried to go the way Commodore was planning to go, with the HP PA/RISC. By the time that had happened, the Amiga may have had an Intel Inside. :o But it still wouldn't have turned it into a PC.

Quote:
It should be possible to combine code/text and read only sections to allow PC relative addressing of everything but the most secure OSs which leave code/text (read only) and read only data (read only, no execute) in different pages with different MMU page privileges. Separate pages results in lower performance from increased code size and wasting memory from increased code size and page alignment. It would still be possible to use PC relative addressing across different page types if they are near but it is less efficient. It is still more efficient than absolute addressing with 64 bit addresses though. A 64 bit xxx.q is 8 bytes where (d32,pc) is 4 bytes but neither of these exist with a fixed length 32 bit encoding. PC relative addressing is usually associated with CISC because CISC more commonly uses a variable length encoding. RISC with a variable length encoding can have more efficient PC relative addressing and more immediate and displacement bits that scale into larger instruction sizes as necessary. However, ARM Thumb2 and RISC-V did *not* make good use of this but rather mostly chose to make short encodings of 32 bit instructions. Mitch Alsup was working on a RISC ISA which better takes advantage of this scaling like the 68k (he is one of the designers of the 68k). Let me demonstrate.

The modern "ELF" movement goes against conventions like self modifying code and inserting data where you like. I was tripped up on this just last night. Debugging code that was trying to clear an arg string. Kept crashing. Gave it enough space and it kept doing it. Then I realised, since I was using AmigaE, that a string I had substituted as a blank was sitting in read only memory. I changed the pointer and then it worked.

I had to read up on this relativity and am surprised i386 can't do this directly. They need to cheat by doing a branch when entering a function to grab the PC and pull it off the stack to use as a base. Funny, that's exactly what PPC would do. Except the PPC code is better because it uses registers and not stack. But PPC64 lost out again to x64 without a RIP. x64 ripped it again. However, the SysV ABI is designed to deal with this on PPC, with the TOC and GOT.

Quote:
The x86-64 RIP relative addressing only supports (d32,pc) which with PC relative writes gives a big advantage. There are many x86-64 code models though and some use 64 bit absolute addressing which is very wasteful (8 bytes/EA). The x86-64 ISA lacks consistency and the many code models shows that the EA support is not optimum. The 68k ISA is nearly optimum for 32 bit addressing with its scaling across the whole 32 bit address space but 64 bit addressing support is more difficult. It shouldn't be difficult to do better than x86-64 though. Sadly, most RISC 64 bit ISAs still only support 16 bit base register relative, 16 bit immediates and 16 bit displacements which is inferior to x86-64. Compilers have to do more work when there is no encoding for what they need and this often results in multiple dependent instructions and larger code.

The 32 bit offset does allow for a relative 4GB range which is large but would be rather wasteful when only 16 bits is needed. OTOH, PPC has backed itself into a corner by only leaving 16 bits to play with. But it can only do RIP relative, I mean routine independent positioning. That is each routine has its own position independent of others, used as a base reference. It can use a register to expand the range, but of course the old problem comes into play. It needs to load the register.

Quote:
The Amiga is the only 68k OS I'm aware of that does not trap into Supervisor mode for system calls. The Amiga was one of the few early microkernels where the philosophy is to stay out of supervisor mode as much as possible. Unfortunately, the Amiga allowed any program to enter Supervisor mode.

The OS and machine let programs do it. The OS had a support function to enter it when needed. And a program could easily enough set up the illegal vector then do an illegal to enter into supervisor. Then the 010 and 020 came along and caught some games out, because they used MOVE from SR, which wasn't free any more, and crashed. Simple to patch, wrote one myself when my A500 Psygnosis games crashed on my new A1200. But at the time hardware banging was the way, not using OS resources nor producing versions for each CPU. Funny though, as a lot of OS friendly C written games used to do a MOVE SR. I didn't know what the compiler was thinking.

Quote:
Disabling interrupts is more than disabling DMA and more than disabling multitasking.

It's only to disable it in software within the Exec subsystem. A higher level than the hardware.

Quote:
CAS has been the minimum data sharing instruction and allows to build more advanced data sharing operations. Newer fancy RISC processors may have other support.

Would have been good to use in semaphore access but they use a forbid lock instead.

Quote:
PPC doesn't encode the lower 2 bits since code is 4 byte aligned. This allows the same range as 26 bits (2^26) of encoding which is +-32MiB. The extra range is nice from removing the lower bits but this made a compressed encoding more difficult which hurt PPC. IBM's CodePack had to use a more complex dictionary based compression and Freescale's VLE is a replacement encoding for PPC.

It sits in a logical place, and doesn't need shifting, just masking out. Would have been a good idea for other operations to increase range to 24 bits. Somewhat better than the 16 bit limit. The lower bits used as flags for specifying linkage and if absolute.
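To make the layout concrete, here is a minimal sketch of the PPC I-form branch word as described above: opcode 18 in the top 6 bits, the 24 bit field left in place (so masking is enough, no shifting), and the two low bits reused as the absolute (AA) and link (LK) flags. The function names are made up for illustration.

```python
OPCODE_B = 18  # primary opcode shared by b, ba, bl and bla

def encode_branch(offset, absolute=False, link=False):
    """Encode an I-form branch. offset is in bytes and must be 4 byte aligned."""
    if offset % 4 != 0:
        raise ValueError("branch targets are 4 byte aligned")
    if not -(1 << 25) <= offset < (1 << 25):
        raise ValueError("offset outside the +-32MiB range")
    word = OPCODE_B << 26
    word |= offset & 0x03FFFFFC   # 24 bit field, already in position
    word |= 2 if absolute else 0  # AA flag lives in bit 1
    word |= 1 if link else 0      # LK flag lives in bit 0
    return word

def decode_branch(word):
    """Return (offset, absolute, link) for an I-form branch word."""
    if word >> 26 != OPCODE_B:
        raise ValueError("not an I-form branch")
    offset = word & 0x03FFFFFC    # mask out opcode and flag bits
    if offset & 0x02000000:       # sign bit of the 26 bit displacement
        offset -= 1 << 26
    return offset, bool(word & 2), bool(word & 1)
```

The largest forward offset that passes the range check is (1 << 25) - 4 bytes, which is where the +-32MiB figure comes from.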

The VLE is interesting but it goes against the PPC design and just seems wrong, with an "extension" to reduce 32 bit encodings in half to 16 bit. Useful for low memory systems since only 8 bit values on average can be encoded. But to me it's like the Vampire extension to the copper, trying to turn it into a 68K, with a 32 bit write. That puts the codes out of alignment. The copper is like the PPC, it's designed so each instruction is 32 bits, with a 16 bit opcode and 16 bit operand. A 64 bit copper code makes more sense in my book.

I think an actual extension to PPC with 64 bit codes would be a better option. At least on more powerful systems. Then it could deal with code more practically by allowing up to 48 bits for address and data operations. A 64 bit load would only need 96 bits, instead of the 160 bits it needs now.
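For anyone wondering where the 160 bits comes from, this is a rough model of the usual five instruction sequence for building a 64 bit constant on PPC64 with 16 bit immediates (lis, ori, sldi, oris, ori). It only models the register value, not real encodings. The 96 bit figure is read here as one hypothetical 64 bit instruction carrying 48 bits plus one ordinary 32 bit instruction carrying the last 16, which is an assumption about what was meant.

```python
MASK64 = (1 << 64) - 1

def load_imm64(value):
    """Model the register value after each step of a 5 instruction constant load."""
    h3, h2, h1, h0 = [(value >> shift) & 0xFFFF for shift in (48, 32, 16, 0)]
    steps = []

    # lis: load h3 shifted left by 16, sign extended to 64 bits
    signed_h3 = h3 - 0x10000 if h3 & 0x8000 else h3
    reg = (signed_h3 << 16) & MASK64
    steps.append("lis")

    reg |= h2                      # ori: fill in the next 16 bits
    steps.append("ori")

    reg = (reg << 32) & MASK64     # sldi by 32: move both halves to the top
    steps.append("sldi")

    reg |= h1 << 16                # oris: upper half of the low word
    steps.append("oris")

    reg |= h0                      # ori: lowest 16 bits
    steps.append("ori")

    return reg, steps

value, steps = load_imm64(0xFEDCBA9876543210)
bits_now = len(steps) * 32         # five 32 bit instructions
bits_hypothetical = 64 + 32        # assumed: one 64 bit plus one 32 bit instruction
```

The sign extension from lis gets shifted out by the sldi, so the sequence works for any 64 bit value.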

I wasn't aware of CodePack. So PPC had hardware code compression? That sounds quite advanced. Perhaps they could have worked on the idea further and used it as a CISC to RISC translator for a hybrid CPU like x86 became. Perhaps more complicated than a compression algorithm but maybe they could have considered a 68K translator instead.


matthey

Re: Assembly startup codes for ECX compiler in VAsm?
Posted on 19-Aug-2021 2:57:58

[ #30 ]

Elite Member

Joined: 14-Mar-2007
Posts: 2754
From: Kansas

Hypex Quote:

Generally the small data would be taken up by the global variables. 64KB seems small but even in a large program it would need many thousands of variables to break the limit. Using long ints over short ints would bloat it. But static structures would cause the most trouble.

When the Amiga came out, 64kiB of space for data seemed adequate. Today, even 640kiB is not enough.

Hypex Quote:

It must be a modern convention. I'm used to the classic compilers like MANX or SAS. Lattice. Where classic 68k code featured A4 and A5 a lot.

Early Amiga programs were more optimized and modular, often using Amiga libraries, which reduced global data requirements in all but the largest Amiga programs. Then more modern and bloated ports of software came to the Amiga. At one time, nearly all 68k programs used small data (a4 base register) but today it is significantly less. Look at the number of 68k programs that are 100kiB or more today compared to the ancient Amiga history Manx compiler days.
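As a back of the envelope check on the small data limit, here is a small sketch that totals up global sizes against the 64kiB window reachable with a 16 bit displacement from the a4 base register. The variable names and sizes are invented for illustration only.

```python
SMALL_DATA_LIMIT = 64 * 1024  # bytes reachable with a 16 bit displacement

def small_data_report(sizes):
    """Return (total_bytes, fits) for a mapping of global name -> size in bytes."""
    total = sum(sizes.values())
    return total, total <= SMALL_DATA_LIMIT

# How many 4 byte longs fit if nothing else is in the section
max_longs = SMALL_DATA_LIMIT // 4

# Hypothetical program: thousands of scalars are fine...
scalars_only = {"counters": 4000 * 4, "flags": 2000 * 2}
# ...but one static table can blow the limit on its own.
with_table = dict(scalars_only, lookup_table=100 * 1024)
```

This matches the earlier point that plain variables rarely break the limit while static structures do.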

GCC may have made elimination of the frame pointer popular as it is an optimization made for -O1 and above on targets when possible and where there is a savings (search for -fomit-frame-pointer).

https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html

How many programmers release GCC compiled code without optimizations?

Hypex Quote:

From things I've read online the ARM could compete quite well with the 68K. But they needed to go beyond the 100 MHz barrier. At the time, PPC was good. It gave the Amiga some extra life with modern 3D games. Extra power. Of course it was really relegated to CPU intensive tasks. Used as a co-processor. Funny, the RISC code for PPC is just like the RISC code on the copper.

ARM was targeted more at the embedded market by the time StrongARM came out, where clocking up a core to compete is a disadvantage. The heat produced from clocking Alpha up resulted in DEC going bankrupt even on the desktop. At least StrongARM was more practical as low power instead of maximum performance was the goal, but higher clock speeds sabotage this advantage by wasting more power. The 68060 in 1994 offered better integer performance per MHz (1.47 DMips/MHz) compared to the StrongARM SA-110 in 1996 (1.05 DMips/MHz) even when StrongARM had a process advantage. Of course StrongARM was clocked up and surpassed the 68060 in performance while the 68k was abandoned, even though the 68060 8 stage pipeline and low power should have made it a better candidate to clock up than most other processors around that time. PPC core designs usually had significantly better performance/MHz than StrongARM cores even though low end PPC cores had trouble competing with the 68060 due to poor code density. Most PPC core designs were shallow pipeline which was appropriate for embedded systems but made them difficult to clock up. PPC shallow pipelines reduced area which made them cheaper to produce, while ARM was able to take advantage of its simpler ISA to clock it up, which is cheaper than improving performance/MHz. It was x86 which had integer performance/MHz almost as good as the equivalent 68k processor (68020 vs 286, 68030 vs 386, 68040 vs 486 and 68060 vs early Pentium) and the Pentium quickly added pipeline stages (like the 68060) for higher clock speeds.
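To put the per MHz figures in perspective, a quick sketch of the arithmetic. The DMips/MHz numbers are the ones quoted above; the 50 MHz and 200 MHz clock rates are only illustrative assumptions, not figures from this thread.

```python
M68060_DMIPS_PER_MHZ = 1.47   # figure quoted above
SA110_DMIPS_PER_MHZ = 1.05    # figure quoted above

def dmips(dmips_per_mhz, clock_mhz):
    """Total DMips at a given clock."""
    return dmips_per_mhz * clock_mhz

def breakeven_clock(clock_mhz, own_per_mhz, other_per_mhz):
    """Clock the other core needs to match this core's total DMips."""
    return clock_mhz * own_per_mhz / other_per_mhz

# Assumed clock rates, for illustration only
m68060_total = dmips(M68060_DMIPS_PER_MHZ, 50)
sa110_total = dmips(SA110_DMIPS_PER_MHZ, 200)
sa110_needed = breakeven_clock(50, M68060_DMIPS_PER_MHZ, SA110_DMIPS_PER_MHZ)
```

On those assumptions the StrongARM only needs 1.4 times the clock to break even, so anything clocked well beyond that wins on raw performance despite the lower per MHz number.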

The Amiga copper is RISC but difficult to compare to a CPU core. The instructions needed are few and simple. The code size is small, there are no registers in the core and there are no caches. This is all pretty much the opposite of PPC.

Hypex Quote:

Another option, they could have tried to go the way Commodore was planning to go, with the HP PA-RISC. By the time that had happened, the Amiga may have had an Intel Inside. :o But it still wouldn't have turned it into a PC.

PA-RISC had more addressing modes than most RISC architectures and is natively big endian which should have helped transitioning from the 68k. The code density is horrible and there are often reduced immediate/displacement bits though. I think the 88k would have been an easier transition than PPC or PA-RISC. Instructions sometimes resemble the 68k instructions, it has a decent amount of useful addressing modes and it had the extended precision FPU which somewhat resembled the 68k. Still, I think SuperH with a variable length instruction set would have made the easiest transition to RISC. AArch64 would likely be the easiest RISC transition today although it is natively little endian.

Hypex Quote:

The modern "ELF" movement goes against conventions like self modifying code and inserting data where you like. I was tripped up on this just last night. Debugging code that was trying to clear an arg string. Kept crashing. Gave it enough space and it kept doing it. Then I realised, since I was using AmigaE, that a string I had substituted as a blank was sitting in read only memory. I changed the pointer and then it worked.

Ideally, there would be only a read only privilege violation with a task held requestor instead of a crash.

Hypex Quote:

I had to read up on this relativity and am surprised i386 can't do this directly. They need to cheat by doing a branch when entering a function to grab the PC and pull it off the stack to use as a base. Funny, that's exactly what PPC would do. Except the PPC code is better because it uses registers and not stack. But PPC64 lost out again to x64 without a RIP. x64 ripped it again. However, the SysV ABI is designed to deal with this on PPC, with the TOC and GOT.

x86-64 is a mess even with improvements from x86. The 68k often puts x86(-64) and PPC to shame in simplicity of code. It's a shame as the 68k hardware isn't much more complex.

Hypex Quote:

The 32 bit offset does allow for a relative 4GB range which is large but would be rather wasteful when only 16 bits is needed. OTOH, PPC has backed itself into a corner by only leaving 16 bits to play with. But it can only do RIP relative, I mean routine independent positioning. That is each routine has its own position independent of others, used as a base reference. It can use a register to expand the range, but of course the old problem comes into play. It needs to load the register.

The x86-64 RIP relative mode MOV instruction is 6 bytes which is not bad for (d32,pc) although x86-64 often requires MOV (d32,RIP),Rn + OP Rn or the reverse (same as RISC load/store). PPC has (d16,pc) which uses 4 bytes so roughly equivalent code density but much less range. The 68k has (d16,pc) which is normally 4 bytes like PPC and (d32,pc) which is 8 bytes although the addressing mode supports operations (not just MOVE) but all PC relative addressing modes are read only. I would like 68k64 to support a shorter 6 byte encoding for (d32,pc), to allow PC relative writes and maybe even add (d64,pc) which would normally be 12 bytes. The 12 bytes is more expensive than absolute 64 bit addressing which is normally 10 bytes but most displacements would be shorter usually offering a savings while avoiding the multitude of x86-64 code models (small code, medium code, large code, small PIC, medium PIC, large PIC) and remaining position independent.
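The sizes above can be tabulated as a small sketch that picks the shortest PC relative encoding for a given displacement. The byte counts are the ones from this post; the 6 byte (d32,pc) and 12 byte (d64,pc) entries belong to the wished for 68k64, which is hypothetical, and the table and function names are made up.

```python
def fits_signed(value, bits):
    """True if value fits in a signed two's complement field of that width."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))

# (displacement bits, instruction bytes) in increasing size
SIZES_68K = [(16, 4), (32, 8)]
SIZES_68K64_PROPOSED = [(16, 4), (32, 6), (64, 12)]  # hypothetical encodings
ABSOLUTE_64_BYTES = 10  # absolute 64 bit addressing, as quoted above

def pc_relative_bytes(displacement, table):
    """Bytes used by the shortest encoding that reaches displacement, or None."""
    for bits, size in table:
        if fits_signed(displacement, bits):
            return size
    return None
```

With the proposed table, only displacements beyond +-2GiB pay the 12 bytes, and everything nearer is cheaper than the 10 byte absolute form while staying position independent.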

https://eli.thegreenplace.net/2012/01/03/understanding-the-x64-code-models

Supporting 6 code models can't be easy for compilers.

Hypex Quote:

The OS and machine let programs do it. The OS had a support function to enter it when needed. And a program could easily enough set up the illegal vector then do an illegal to enter into supervisor. Then the 010 and 020 came along and caught some games out, because they used MOVE from SR, which wasn't free any more, and crashed. Simple to patch, wrote one myself when my A500 Psygnosis games crashed on my new A1200. But at the time hardware banging was the way, not using OS resources nor producing versions for each CPU. Funny though, as a lot of OS friendly C written games used to do a MOVE SR. I didn't know what the compiler was thinking.

Yes, programs can trap to code in Supervisor mode. Emulators for 68k systems can be easier on the Amiga because these vectors are not used on the Amiga.

The 68000 doesn't have MOVE CCR,EA so it is difficult for a 68000 compiler to generate code to get the CCR without using MOVE SR,EA. The Amiga has exec/GetCC() but those early compilers and programs needed patching.

Hypex Quote:

The VLE is interesting but it goes against the PPC design and just seems wrong, with an "extension" to reduce 32 bit encodings in half to 16 bit. Useful for low memory systems since only 8 bit values on average can be encoded. But to me it's like the Vampire extension to the copper, trying to turn it into a 68K, with a 32 bit write. That puts the codes out of alignment. The copper is like the PPC, it's designed so each instruction is 32 bits, with a 16 bit opcode and 16 bit operand. A 64 bit copper code makes more sense in my book.

VLE has new 16 bit *and* 32 bit instruction encodings which may reduce some of the overlap for immediates and displacements. Gaining up to 30% better code density doesn't seem so appealing to me either when instruction paths can be up to 10% longer, floating point registers are unavailable and 16 bit instructions can only use 16 GP registers and CR0. The 68k still usually has better code density, usually has shorter instruction paths than normal PPC code due to CISC design and can use 16 GP integer registers and a FPU. Freescale threw away the 68k and then created something inferior.

The copper could have padding added to align the code. There isn't so much code that reduced code density is going to hurt much. Then again, most of the Amiga hardware is set up to minimize mis-alignment penalties. I don't know enough about the Vampire copper to say much other than it must suck. At least copper is more compatible in a Vampire than silver.

Hypex Quote:

I think an actual extension to PPC with 64 bit codes would be a better option. At least on more powerful systems. Then it could deal with code more practically by allowing up to 48 bits for address and data operations. A 64 bit load would only need 96 bits, instead of the 160 bits it needs now.

POWER added a variable length encoding and 64 bit instructions. IBM claims the 64 bit encoding improves code density and reduces the number of instructions because of increased displacement and immediate sizes. Longer instructions and a variable length encoding are more difficult for low end cores to handle but it is better than the alternative of trying to execute many short often dependent instructions, especially for a 64 bit CPU.

Hypex Quote:

I wasn't aware of CodePack. So PPC had hardware code compression? That sounds quite advanced. Perhaps they could have worked on the idea further and used it as a CISC to RISC translator for a hybrid CPU like x86 became. Perhaps more complicated than a compression algorithm but maybe they could have considered a 68K translator instead.

An efficient CISC encoding with internal RISC execution pipeline is simpler than CodePack. The x86-64 isn't even that efficient but it was still good enough to rule the desktop and server markets. The 68k had better code density, was more efficient to decode and had better performance/MHz but was thrown away.


Hypex

Re: Assembly startup codes for ECX compiler in VAsm?
Posted on 23-Aug-2021 7:53:48

[ #31 ]

Elite Member

Joined: 6-May-2007
Posts: 11351
From: Greensborough, Australia

@matthey

Quote:
When the Amiga came out, 64kiB of space for data seemed adequate. Today, even 640kiB is not enough.

That 640K never goes away.

Quote:
Early Amiga programs were more optimized and modular often using Amiga libraries which reduced global data requirements in all but the largest Amiga programs but then more modern and bloated ports of software came to the Amiga. At one time, nearly all 68k programs used small data (a4 base register) but today it is significantly less. Look at the number of 68k programs that are 100kiB or more today compared to the ancient Amiga history Manx compiler days.

The alternative would be PC relative. Of course it's limited being in code. Another way would be each routine using its own base reference if it only used a portion of global data. Though that is what i386 and PPC would do. Of course all the global data would be merged in one spot, though isolating each module to its own segment would make sense, since bottom end code shouldn't need to touch top level data.

Quote:
GCC may have made elimination of the frame pointer popular as it is an optimization made for -O1 and above on targets when possible and where there is a savings (search for -fomit-frame-pointer).

I'm familiar with it from PPC but not on 68K. On 68K it seems strange as I wouldn't associate a stack frame with 68K. The 68K stack frame I know is used in interrupts. I suppose I just don't think of A5 local space as being in a stack frame, just allocated from the stack.

Using registered parameters could be another optimisation. With Amiga ABI conventions 4 parameters could be passed in volatile registers and avoid the stack. Of course depending on how complex the routine is it may or may not need extra stack to run.

Quote:
How many programmers release GCC compiled code without optimizations?

I wouldn't know. Depends if they add it in before building the release binary. Aside from the classic StormC days I thought real Amiga C coders used VBCC.

But, I came across a discussion recently about GCC optimising code not exactly optimising, and changing base relative access to absolute pointers when optimising was turned on. Specifying small data didn't fix it and corrupted the code.

Quote:
PPC shallow pipelines reduced area which made them cheaper to produce while ARM was able to take advantage of its simpler ISA to clock it up which is cheaper than improving performance/MHz. It was x86 which had integer performance/MHz almost as good as the equivalent 68k processor (68020 vs 286, 68030 vs 386, 68040 vs 486 and 68060 vs early Pentium) and the Pentium quickly added pipeline stages (like the 68060) for higher clock speeds.

The days when PPC was cheaper to produce.

Quote:
The Amiga copper is RISC but difficult to compare to a CPU core. The instructions needed are few and simple. The code size is small, there are no registers in the core and there are no caches. This is all pretty much the opposite of PPC.

It's reduced as much as reduction could be. Very basic and dedicated to one task. But the instruction format is coded the same as the PPC 16 bit opcode / 16 bit operand format. More details here:
https://blitterwolf.blogspot.com/2019/05/what-has-powerpc-got-to-do-with-amiga.html

Quote:
PA-RISC had more addressing modes than most RISC architectures and is natively big endian which should have helped transitioning from the 68k. The code density is horrible and there are often reduced immediate/displacement bits though. I think the 88k would have been an easier transition than PPC or PA-RISC. Instructions sometimes resemble the 68k instructions, it has a decent amount of useful addressing modes and it had the extended precision FPU which somewhat resembled the 68k. Still, I think SuperH with a variable length instruction set would have made the easiest transition to RISC. AArch64 would likely be the easiest RISC transition today although it is natively little endian.

Here's an article about an experimental 88K Mac board before everyone including Motorola abandoned 88K for PPC:
https://computerhistory.org/blog/transplanting-the-macs-central-processor-gary-davidian-and-his-68000-emulator/

So if ARM was LE then the Acorn would have been an 80's LE computer as well. Common for 8 bitters. But for 16 bit BE was common with popular machines like ST and Amiga.

Another option is to port OS4 to x64, but using a very customised compiler idea I made up that only runs x86 code as big endian. What this means is that all single R/W accesses must go through a MOVBE instruction. This would lock out all other arithmetic operations except register to register. So it would restrict it to being load/store, like PPC. Though memory to memory would be fine as the end result is the same. Of course, it would be a bit of a hack, since the CPU and code is still little endian, so might as well go ARM.
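A toy model of the idea, assuming every data access goes through a byte swapping load/store (which is what MOVBE does on x86): memory holds big endian data as OS4 would expect, and a plain little endian access to the same bytes gives the wrong answer. The helper names are made up.

```python
import struct

def store_be32(mem, addr, value):
    """Store a 32 bit value in big endian order, like a MOVBE store would."""
    mem[addr:addr + 4] = struct.pack(">I", value & 0xFFFFFFFF)

def load_be32(mem, addr):
    """Load a 32 bit big endian value, like a MOVBE load would."""
    return struct.unpack_from(">I", mem, addr)[0]

def load_native_le32(mem, addr):
    """What an ordinary little endian load sees at the same address."""
    return struct.unpack_from("<I", mem, addr)[0]

memory = bytearray(8)
store_be32(memory, 0, 0x12345678)
```

Here load_be32 gets the original value back while load_native_le32 sees it byte swapped, which is why any instruction with a memory operand other than the swapping load/store would be locked out.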

Quote:
Ideally, there would be only a read only privilege violation with a task held requestor instead of a crash.

I think it treats the read as an illegal read like any bad address. But, lately they have been tightening the nuts to the extreme, beyond the GetCC() zero barrier. A perfectly legal set of programs I wrote calling on Alert() and DisplayAlert() with a RECOVERY_ALERT now crash with a Grim Reaper. Destroying the classic red and yellow alert with a Software Error that goes back to the 90s and replacing it with a crash window is a bit too far and confusing for the user.

Quote:
Supporting 6 code models can't be easy for compilers.

It seems to me it would also make sense for the compiler to reference large arrays as a pointer. Actually storing the array in small data just seems like a waste. What I think would be better is storing larger data before or after the small data space and using small data as a reference only.

Quote:
The 68000 doesn't have MOVE CCR,EA so it is difficult for a 68000 compiler to generate code to get the CCR without using MOVE SR,EA. The Amiga has exec/GetCC() but those early compilers and programs needed patching.

I don't know why code would need to touch the SR or CC in general operations. I used to write ASM a bit, though nothing big, and don't ever recall needing to hack the SR for my code to work.

In any case, according to the autodocs, GetCC() was available early on, before V33 and possibly from the start. So any compiler supporting OS1.2 and up should have used it.

https://d0.se/autodocs/exec.library

Quote:
VLE has new 16 bit *and* 32 bit instruction encodings which may reduce some of the overlap for immediates and displacements. Gaining up to 30% better code density while instruction paths can be up to 10% longer, floating point registers are unavailable and 16 bit instructions can only use 16 GP registers and CR0 doesn't seem so appealing to me either. The 68k still usually has better code density, usually has shorter instruction paths than normal PPC code due to CISC design and can use 16 GP integer registers and a FPU. Freescale threw away the 68k and then created something inferior.

That would mean it's incompatible with standard PPC32. They already did this with SPE. The incompatibilities across the board for the 68K are minor compared to the major wreckage they've done to PPC. When I first read about PPC one of the standard features was built in FPU. Unless I misread something. Since then they have thought of different ways to make the PPC line incompatible. It's like another ColdFire. The fire had gone out but they are still poking around in the embers.

Somehow Intel have modified the x86 design to keep it going and maintain good backward compatibility. Now they have the finances and manpower to do so even if the end result is a complicated mess. But Windows has never needed to be ported to another ISA or be replaced. A generic Windows CD always booted on a PC without needing a version for this or that.

When Mac was on PPC one OSX CD would only boot on a specific Mac. I think this goes beyond Apple restricting it. The different PPC chips were incompatible. We see evidence of this with the AmigaOne series. No generic Linux kernel can boot on an XE, Sam or Pegasos 2. It would crash. Except Pegasos is most compatible and could boot official Debian CDs with the included CHRP kernel.

Quote:
The copper could have padding added to align the code. There isn't so much code that reduced code density is going to hurt much. Then again, most of the Amiga hardware is set up to minimize mis-alignment penalties. I don't know enough about the Vampire copper to say much other than it must suck. At least copper is more compatible in a Vampire than silver.

The copper is like the rest of the chipset, 16 bit based. So it suits dividing up quantities into 16 bit. Including pointers. Which appear as hi and low 16 bit words. The 68K is similar and I wonder if the 16 bit theme is due to using the 68K? Since it matches the data bus and 16 bit word sizes used in the 68000. Had the 68020 been the base and they had the time perhaps the chip would have been a 32 bit design as well. But we had just left the 8 bit era.
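To show what the hi and low word split looks like in practice, here is a small sketch that builds the two copper MOVE instructions needed to set one 32 bit pointer. Each MOVE is a 16 bit register offset followed by a 16 bit data word. BPL1PTH/BPL1PTL are the standard bitplane 1 pointer register offsets; the helper names and the example address are made up.

```python
BPL1PTH = 0x0E0  # bitplane 1 pointer, high word
BPL1PTL = 0x0E2  # bitplane 1 pointer, low word
COPPER_END = (0xFFFF, 0xFFFE)  # conventional end of list (a wait that never triggers)

def copper_move(register_offset, data):
    """One copper MOVE: first word is the register offset, second is the data."""
    return (register_offset & 0x01FE, data & 0xFFFF)

def copper_set_pointer(reg_hi, reg_lo, address):
    """Two MOVEs that write a 32 bit address as hi and low 16 bit words."""
    return [
        copper_move(reg_hi, address >> 16),
        copper_move(reg_lo, address),
    ]

# Hypothetical chip RAM address for the bitplane
copper_list = copper_set_pointer(BPL1PTH, BPL1PTL, 0x00021000) + [COPPER_END]
```

So a single pointer costs two 32 bit copper instructions, which is the overhead a wider write would be trying to avoid.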

I can't find the exact page now, possibly on the forum, but a copper write has been extended to 32 bit. When I read the details it looked like they just copied the 68K VLE for 32 bit move taking up 6 words. But the copper design doesn't look like it suits VLE. Copper codes are 32 bit, so although it makes sense in a 16 bit alignment, it will put codes in odd places out of 32 bit alignment. Maybe Gunnar should have copied PowerPC!

I wonder if this is an early Vampire model?

https://en.wikipedia.org/wiki/Vampire_tap

Quote:
POWER added a variable length encoding and 64 bit instructions. IBM claims the 64 bit encoding improves code density and reduces the number of instructions because of increased displacement and immediate sizes. Longer instructions and a variable length encoding are more difficult for low end cores to handle but it is better than the alternative of trying to execute many short often dependent instructions, especially for a 64 bit CPU.

So they did take my idea on board for the top level architecture. It's fairly new and I can't find much info on it except it's in POWER10. The only info I found was on Wikipedia. A basic look at OpenPower failed. No wonder people think Power/PC is dead, where an internet search about POWER 10 64 bit prefix doesn't turn up any Power ISA results except Wikipedia. So it looks like they used some kind of prefix and copied Intel. Hmmm, that doesn't look like it would suit. They didn't do it right.

Quote:
An efficient CISC encoding with internal RISC execution pipeline is simpler than CodePack. The x86-64 isn't even that efficient but it was still good enough to rule the desktop and server markets. The 68k had better code density, was more efficient to decode and had better performance/MHz but was thrown away.

And thrown away by something that became less popular. Even the 88K is said to share ideas with 68K that were also thrown out. Both followed up by the ColdFire 68K abomination that was the talk of Amiga town before Phase5 stepped in.

I had an idea a while ago for a hybrid code decoder. On x86. I thought it would be useful for the OS4 market for a cheaper CPU source. If there was a hardware translator that read in PPC code from memory and translated it to x86/64 on the fly. By the looks of it this kind of idea is already in place with ideas like CodePack. Loading in custom code format and then unpacking/translating to main core. But, even with a translator, differing endian can still cause issues. And self modifying code or even self peeking code would be banned. Anything that reads one code in memory where the format is different would break.

But there was the PowerPC 615, featuring both a PPC and x86 core and furthermore even a PPC64 core of some kind. So perhaps a hybrid kind of CPU would work. Had the core been produced for the Mac x86 switch it might have had more purpose. But this was when PPC could compete with x86 power. Now what's left of PPC can hardly compete with a mobile ARM CPU. And silly sounding ideas like running desktop OS4 on a mini RPi make sense because of the more powerful CPU.



Amigaworld.net was originally founded by David Doyle