AmigaOS4 KVM Edition? virtual gpu driver Picasso96 coming soon.
Poll : AmigaOS4 KVM/Emulation
I would get AmigaOS4 Forever Edition/check out emulation
I already run OS4 in Emulation
Interesting, see where this goes...
AmigaOS4 Hardware only!
Not interested in Emulation
Not interested in OS4
Pancakes!
 
cdimauro 
Re: AmigaOS4 KVM Edition? virtual gpu driver Picasso96 coming soon.
Posted on 25-Oct-2023 21:52:10
#181 ]
Elite Member
Joined: 29-Oct-2012
Posts: 3650
From: Germany

@matthey

Quote:

matthey wrote:

The problem with 64 bit code is that it is larger most of the time

This depends on the specific processor architecture.
Quote:
while moving or calculating on 64 bits is only advantageous some of the time.

It also depends on how creatively developers use them.
Quote:
Most of the time, the same operations are done with twice as much data which takes twice as much space in memory. Some ISAs allow 32 bit and 64 bit sizes giving the option to use 32 bit to gain the smaller data advantages of 32 bit but it is important to avoid partial register writes of 32 bit data to 64 bit registers. The performance advantage of 64 bit over 32 bit is often overestimated. A paper called "Performance Characterization of SPEC CPU2006 Integer Benchmarks on x86-64 Architecture" found less than a 1% performance gain in the SPEC CPU2000 int benchmark and a 7% performance gain in the SPEC CPU2006 int benchmark with 64 bit vs 32 bit compiled code on a x86-64 64 bit CPU. It looks like the SPEC CPU2006 int benchmark was changed to better support 64 bit code.

Which makes sense. As the paper shows, the code wasn't properly written and it penalized 64 bit architectures.

Code should be "neutral".
Quote:
The x86-64 code has 16 GP registers vs 8 GP registers in x86 code but x86-64 code is larger by 21% in the case of the SPEC CPU2006 benchmark. A 7% performance gain at the expense of 21% code size isn't bad but another paper showed a 4.4% int performance gain on SPEC CPU2000 from 16 GP registers instead of 8 GP registers so perhaps less than 3% 64 bit performance gain at the expense of 21% larger code ("Performance Characterization of the 64-bit x86 Architecture from Compiler Optimizations’ Perspective").
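
As a rough check of the "less than 3%" estimate in the quote above, the sketch below removes the 4.4% attributed to the extra registers from the 7% total, once by subtraction and once by treating both as speedup factors. The function names are illustrative only, and the two papers used different benchmark suites, so this is arithmetic on the quoted figures rather than a measurement.

```c
#include <assert.h>
#include <stdio.h>

/* Residual gain in percentage points when the two gains simply add up. */
double residual_gain_additive(double total_pct, double part_pct)
{
    return total_pct - part_pct;
}

/* Residual gain when the two gains are speedup factors that multiply. */
double residual_gain_multiplicative(double total_pct, double part_pct)
{
    double total = 1.0 + total_pct / 100.0;
    double part = 1.0 + part_pct / 100.0;
    return (total / part - 1.0) * 100.0;
}

/* Print both estimates for the figures quoted in the thread. */
void print_residual_gain(void)
{
    double additive = residual_gain_additive(7.0, 4.4);
    double multiplicative = residual_gain_multiplicative(7.0, 4.4);

    /* Both readings stay below the "less than 3%" bound from the post. */
    assert(additive < 3.0 && multiplicative < 3.0);
    printf("additive:       %.2f%%\n", additive);
    printf("multiplicative: %.2f%%\n", multiplicative);
}
```

Either way the part not explained by the register count comes out at roughly 2.5 to 2.6 percent.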

When checking the results you also have to consider how much was saved, in terms of memory accesses & memory bandwidth, by using 16 registers instead of 8 or 12.

Whilst it doesn't matter so much on single-core systems, it's quite important on multicore ones, where the cores compete for access to memory.

That's why having more registers is A Good Thing To Have.
Quote:
A naive recompile of code to 64 bit can easily result in an overall loss in performance. There is code that more than doubles performance with 64 bit code and more modern high end hardware can handle 64 bit code better without as much slowdown but 64 bit is not some magic bullet for performance. It mainly gives more than 4GiB of addressing with an increased hardware cost.

That's normal / expected: there's no free lunch and 64 bits cannot always work miracles.

There are cases where there's a performance loss and others where there's a benefit.

Deaths_Head 
Re: AmigaOS4 KVM Edition? virtual gpu driver Picasso96 coming soon.
Posted on 25-Oct-2023 22:55:07
#182 ]
Member
Joined: 15-Apr-2005
Posts: 82
From: Unknown

@Karlos

PowerPC should have been left behind years ago. ARM just seems like the obvious choice for the next CPU, especially considering its current use in the community. The 68k JIT could also aid in moving AmigaOS over to ARM.

Perhaps A-EON could produce a more powerful custom ARM board for the higher end, whilst products such as PiStorm could provide an upgrade path for classic machines and the A600GS & A500 Mini serve the lower end. This could potentially expand the target platform for the next iteration of AmigaOS.

rzookol 
Re: AmigaOS4 KVM Edition? virtual gpu driver Picasso96 coming soon.
Posted on 25-Oct-2023 22:59:07
#183 ]
Regular Member
Joined: 4-Oct-2005
Posts: 318
From: Poland, Lublin

@Deaths_Head

Why should A-EON produce yet another ARM board?

Deaths_Head 
Re: AmigaOS4 KVM Edition? virtual gpu driver Picasso96 coming soon.
Posted on 25-Oct-2023 23:23:15
#184 ]
Member
Joined: 15-Apr-2005
Posts: 82
From: Unknown

@rzookol

Just throwing ideas out there & ways to expand the market. They seem to think there's a market for a higher-end AmigaOne, e.g. the X5000.

I was thinking they could produce a custom high-end ARM AmigaOne machine for people that want that, whilst there are also affordable mid-range to low-range boards such as PiStorm from anybody that wants to produce those.

Like back in the day we had the A4000 for the higher end & the A1200/600 for the lower end.

kolla 
Re: AmigaOS4 KVM Edition? virtual gpu driver Picasso96 coming soon.
Posted on 25-Oct-2023 23:55:49
#185 ]
Elite Member
Joined: 21-Aug-2003
Posts: 2917
From: Trondheim, Norway

Why spend so much effort to make the OS code portable and then not port it?

_________________
B5D6A1D019D5D45BCC56F4782AC220D8B3E2A6CC

Matt3k 
Re: AmigaOS4 KVM Edition? virtual gpu driver Picasso96 coming soon.
Posted on 26-Oct-2023 2:00:44
#186 ]
Regular Member
Joined: 28-Feb-2004
Posts: 223
From: NY

@kolla

Especially when the OS is pretty much dead...

matthey 
Re: AmigaOS4 KVM Edition? virtual gpu driver Picasso96 coming soon.
Posted on 26-Oct-2023 4:10:21
#187 ]
Elite Member
Joined: 14-Mar-2007
Posts: 2024
From: Kansas

cdimauro Quote:

Which makes sense. As the paper shows, the code wasn't properly written and it penalized 64 bit architectures.

Code should be "neutral".


It is easy to judge with hindsight and say "the code wasn't properly written". I expect the SPEC benchmark code was typical of pre-64 bit aware code.

cdimauro Quote:

When checking the results you also have to consider how much was saved, in terms of memory accesses & memory bandwidth, by using 16 registers instead of 8 or 12.

Whilst it doesn't matter so much on single-core systems, it's quite important on multicore ones, where the cores compete for access to memory.

That's why having more registers is A Good Thing To Have.


Load/store architectures usually have many GP registers but loading code uses memory bandwidth too. Load/store architectures have more memory traffic than many alternatives except for accumulator architectures. Reg-mem architectures typically have less data memory traffic and less code memory traffic than load/store architectures. The 68k is not a pure reg-mem architecture but uses mem-mem operations which reduces both data memory traffic and code memory traffic further.

Classifying Instruction Set Architectures (see page 9 for memory traffic comparison)
http://cs.uccs.edu/~cs520/S99ch2.PDF

Deaths_Head Quote:

PowerPC should have been left behind years ago. ARM just seems like the obvious choice for the next CPU, especially considering its current use in the community. The 68k JIT could also aid in moving AmigaOS over to ARM.


PPC was an obvious porting choice for the AmigaOS when it was chosen too. Despite ignoring, belittling and suppressing the 68k Amiga market by PPC AmigaNOne powers for over 20 years, the 68k Amiga remains much more popular. Add THEA500 Mini, FPGA Amiga hardware, ReAmiga/Ramixx500/AA3000+ Amiga recreations with new 68060 accelerators and PiMiga/PiStorm type emulation devices sold and it shows how pathetic PPC AmigaNOne sales are and where the Amiga market is which is 68k Amiga. Hyperion returned to the 68k Amiga market to save itself and AmigaKit is trying to tap in to the 68k Amiga market with the A600GS even though the 68k Amiga market belongs to Amiga Corporation who owns most of the Amiga IP and AmigaOS. Amiga mass market potential is not on ARM or x86-64 but on the 68k yet this Amiga surge in popularity goes nowhere on emulation, Amiga recreations and FPGA hardware because these do not offer competitive value and are not developer targets.

Deaths_Head Quote:

Perhaps A-EON could produce a more powerful custom ARM board for the higher end, whilst products such as PiStorm could provide an upgrade path for classic machines and the A600GS & A500 Mini serve the lower end. This could potentially expand the target platform for the next iteration of AmigaOS.


Can A-Eon produce a niche market ARM board for less than mass produced ARM boards like the RPi?

cdimauro 
Re: AmigaOS4 KVM Edition? virtual gpu driver Picasso96 coming soon.
Posted on 26-Oct-2023 6:00:43
#188 ]
Elite Member
Joined: 29-Oct-2012
Posts: 3650
From: Germany

@matthey

Quote:

matthey wrote:
cdimauro Quote:

Which makes sense. As the paper shows, the code wasn't properly written and it penalized 64 bit architectures.

Code should be "neutral".


It is easy to judge with hindsight and say "the code wasn't properly written". I expect the SPEC benchmark code was typical of pre-64 bit aware code.

Looking at the example, it is clear that a mistake was made in selecting the data type.

In general, it has long been well known that sizeof(int) is less than or equal to sizeof(long), and that there were/are machines with 32-bit ints and 64-bit longs.

As a coder, you should select the proper data type for the specific need. Choosing a long when you do NOT need to manipulate/represent more than 32 bits is clearly bad design. The same concept applies to any data type (char, short, int, long, single, double).
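
A minimal sketch of what the choice of type costs in memory: C only guarantees that long can represent at least the range of int, so a table declared with long may take twice the space on a system with 32-bit int and 64-bit long. The element count and function names below are hypothetical and not taken from the SPEC sources.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

#define SAMPLE_COUNT 1024 /* hypothetical number of counters in a table */

/* Bytes used when each counter is stored as an int. */
size_t footprint_as_int(void)
{
    return SAMPLE_COUNT * sizeof(int);
}

/* Bytes used when each counter is stored as a long. */
size_t footprint_as_long(void)
{
    return SAMPLE_COUNT * sizeof(long);
}

/* Report the width of both types and the resulting table sizes. */
void print_footprints(void)
{
    /* Guaranteed by the C standard: long covers at least the range of int. */
    assert(LONG_MAX >= INT_MAX);

    printf("int:  %lu bits, table of %d takes %lu bytes\n",
           (unsigned long)(sizeof(int) * CHAR_BIT), SAMPLE_COUNT,
           (unsigned long)footprint_as_int());
    printf("long: %lu bits, table of %d takes %lu bytes\n",
           (unsigned long)(sizeof(long) * CHAR_BIT), SAMPLE_COUNT,
           (unsigned long)footprint_as_long());
}
```

On a system where both types are 32 bits the two lines report the same size; where long is 64 bits the second table is twice as large.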
Quote:
cdimauro Quote:

When checking the results you also have to consider how much was saved, in terms of memory accesses & memory bandwidth, by using 16 registers instead of 8 or 12.

Whilst it doesn't matter so much on single-core systems, it's quite important on multicore ones, where the cores compete for access to memory.

That's why having more registers is A Good Thing To Have.


Load/store architectures usually have many GP registers but loading code uses memory bandwidth too. Load/store architectures have more memory traffic than many alternatives except for accumulator architectures. Reg-mem architectures typically have less data memory traffic and less code memory traffic than load/store architectures. The 68k is not a pure reg-mem architecture but uses mem-mem operations which reduces both data memory traffic and code memory traffic further.

Classifying Instruction Set Architectures (see page 9 for memory traffic comparison)
http://cs.uccs.edu/~cs520/S99ch2.PDF

Nice read, thanks.

Looking at it, I find it very strange that stack architectures were put on top for memory traffic. In fact, the continuous push & pop of data generates much more memory traffic compared to other architectures.
Quote:
Deaths_Head Quote:

PowerPC should have been left behind years ago. ARM just seems like the obvious choice for the next CPU, especially considering its current use in the community. The 68k JIT could also aid in moving AmigaOS over to ARM.


PPC was an obvious porting choice for the AmigaOS when it was chosen too. Despite ignoring, belittling and suppressing the 68k Amiga market by PPC AmigaNOne powers for over 20 years, the 68k Amiga remains much more popular. Add THEA500 Mini, FPGA Amiga hardware, ReAmiga/Ramixx500/AA3000+ Amiga recreations with new 68060 accelerators and PiMiga/PiStorm type emulation devices sold and it shows how pathetic PPC AmigaNOne sales are and where the Amiga market is which is 68k Amiga. Hyperion returned to the 68k Amiga market to save itself and AmigaKit is trying to tap in to the 68k Amiga market with the A600GS even though the 68k Amiga market belongs to Amiga Corporation who owns most of the Amiga IP and AmigaOS. Amiga mass market potential is not on ARM or x86-64 but on the 68k yet this Amiga surge in popularity goes nowhere on emulation, Amiga recreations and FPGA hardware because these do not offer competitive value and are not developer targets.

Exactly. Amiga is all about 68k and its chipset. That's so obvious looking at which hardware platforms are sold and even at emulation.

matthey 
Re: AmigaOS4 KVM Edition? virtual gpu driver Picasso96 coming soon.
Posted on 26-Oct-2023 18:33:21
#189 ]
Elite Member
Joined: 14-Mar-2007
Posts: 2024
From: Kansas

cdimauro Quote:

Looking at the example, it is clear that a mistake was made in selecting the data type.

In general, it has long been well known that sizeof(int) is less than or equal to sizeof(long), and that there were/are machines with 32-bit ints and 64-bit longs.

As a coder, you should select the proper data type for the specific need. Choosing a long when you do NOT need to manipulate/represent more than 32 bits is clearly bad design. The same concept applies to any data type (char, short, int, long, single, double).


There could be a logical reason why long was chosen over int. The LP32 memory model was popular, being used by Windows, Mac, Atari ST, etc., which used a 16 bit int, resulting in better performance on 16 bit CPUs and saving memory even on 32 bit CPUs. The Amiga Aztec compiler used a 16 bit int as default too, but Lattice used a 32 bit int and this won on the Amiga. This ILP32 memory model gave better Unix compatibility and better performance on 32 bit CPUs, which was a good choice as all future 68k CPUs would be 32 bit. Modern 32 bit CPUs using LP32 suffered from partial register writes, and results could not be forwarded with a 16 bit int, significantly reducing performance. The easy solution was to use a long datatype to get a 32 bit integer for better performance. What was the proper way to get a 32 bit integer datatype for better performance on a 32 bit CPU before C99?

cdimauro Quote:

Nice read, thanks.

Looking at it, I find it very strange that stack architectures were put on top for memory traffic. In fact, the continuous push & pop of data generates much more memory traffic compared to other architectures.


I expect the stack architecture example is best case and stack architectures are only used in specialized cases where data is generally handled in LIFO order. The example works out perfectly for the stack architecture and the code is tiny which is why the memory traffic is low. The examples are tiny as it is and miss many advantages. The M68020 example could have used a MOVEM.L with a significant code and memory traffic savings by using more registers. The example misses many of the M68020 memory traffic advantages too.

op reg,mem ; load, op, store with load/store
op #imm,mem ; load, op, store with load/store

move mem,mem ; load, store with load/store

Having more GP registers reduces data memory traffic but increases code memory traffic. For a load/store architecture, the gains are partially offset by needing a free register for a load and more than double memory traffic when out of registers (without a CISC like XCHG mem,reg instruction).

op var1,reg ; store var0, load var1, op var1,reg with load/store not counting reloading var0

It's more important to have GP registers with a load/store architecture to reduce data memory traffic but this increases code size and code memory traffic offsetting gains. There is no free lunch.
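
The three patterns listed above can be tallied with a toy model. The sketch below counts instructions fetched and data memory accesses for each pattern, under simplifying assumptions that are not from the post: every instruction counts as one fetch regardless of its encoded length, there are no caches, and reloading var0 is not counted. The struct, enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Toy tally: one fetch per instruction (encoded length ignored), no caches. */
struct traffic {
    int instructions;  /* instructions fetched */
    int data_accesses; /* data reads plus data writes */
};

/* The three patterns listed in the post. */
enum pattern {
    OP_REG_TO_MEM,         /* op reg,mem */
    MOVE_MEM_TO_MEM,       /* move mem,mem */
    OP_MEM_TO_REG_SPILLING /* op var1,reg with no free register */
};

/* 68k-style encoding: a single instruction carries the memory operand(s). */
struct traffic memory_operand_traffic(enum pattern p)
{
    struct traffic t = {1, 2}; /* one instruction, one read plus one write */
    if (p == OP_MEM_TO_REG_SPILLING)
        t.data_accesses = 1;   /* only var1 is read, nothing is spilled */
    return t;
}

/* Load/store encoding: memory is touched only by explicit loads and stores. */
struct traffic load_store_traffic(enum pattern p)
{
    struct traffic t = {0, 2};
    switch (p) {
    case OP_REG_TO_MEM:          t.instructions = 3; break; /* load, op, store */
    case MOVE_MEM_TO_MEM:        t.instructions = 2; break; /* load, store */
    case OP_MEM_TO_REG_SPILLING: t.instructions = 3; break; /* store var0, load var1, op */
    }
    return t;
}

/* Print the tally for all three patterns side by side. */
void print_traffic_comparison(void)
{
    static const char *names[] = {
        "op reg,mem", "move mem,mem", "op var1,reg (no free reg)"
    };
    int p;

    for (p = OP_REG_TO_MEM; p <= OP_MEM_TO_REG_SPILLING; p++) {
        struct traffic m = memory_operand_traffic((enum pattern)p);
        struct traffic l = load_store_traffic((enum pattern)p);

        /* In this model load/store never needs fewer instructions. */
        assert(l.instructions >= m.instructions);
        printf("%-26s memory operand: %d instr, %d data | load/store: %d instr, %d data\n",
               names[p], m.instructions, m.data_accesses,
               l.instructions, l.data_accesses);
    }
}
```

In this model the data traffic differs only in the spill case, while the instruction count differs in all three. A fuller comparison would weigh instruction bytes rather than instruction counts, since instructions with memory operands are longer.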

cdimauro Quote:

Exactly. Amiga is all about 68k and its chipset. That's so obvious looking at which hardware platforms are sold and even at emulation.


Compatibility is important and castration is not popular for a reason. I expect most Amiga users would prefer real 68k Amiga hardware but at least emulation is not castration, even if it is on poor ARM hardware.

Last edited by matthey on 26-Oct-2023 at 07:08 PM.
Last edited by matthey on 26-Oct-2023 at 06:37 PM.

Karlos 
Re: AmigaOS4 KVM Edition? virtual gpu driver Picasso96 coming soon.
Posted on 26-Oct-2023 23:02:42
#190 ]
Elite Member
Joined: 24-Aug-2003
Posts: 4405
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition!

@matthey

Quote:
Compatibility is important and castration is not popular for a reason. I expect most Amiga users would prefer real 68k Amiga hardware but at least emulation is not castration, even if it is on poor ARM hardware.


The physical internals of the 68060 were very different than the previous members of the 68K series, along with some incompatibilities. Nobody had a problem with it. How it works inside is irrelevant as long as it executes 68K code and talks to the hardware.

In my mind, the PiStorm is no different in concept, only different in implementation. As long as it executes 68K code and talks to the hardware it's as legitimate as any other 68K.

_________________
Doing stupid things for fun...

matthey 
Re: AmigaOS4 KVM Edition? virtual gpu driver Picasso96 coming soon.
Posted on 27-Oct-2023 0:24:24
#191 ]
Elite Member
Joined: 14-Mar-2007
Posts: 2024
From: Kansas

Karlos Quote:

The physical internals of the 68060 were very different than the previous members of the 68K series, along with some incompatibilities. Nobody had a problem with it. How it works inside is irrelevant as long as it executes 68K code and talks to the hardware.

In my mind, the PiStorm is no different in concept, only different in implementation. As long as it executes 68K code and talks to the hardware it's as legitimate as any other 68K.


I get it. Tolerance increases as cost decreases. The PiStorm is acceptable for many Amiga users and allows them to retain their 68k Amiga hardware with good compatibility. Would they rather have a 68060@100Mhz for the same cost? Most probably would even though the PiStorm is higher performance. Would they rather have a 68060@100MHz Amiga with SAGA for the same cost? Almost all would. Would they rather have a 68060@1GHz Amiga with SAGA for the same price? I expect everyone would choose the real Amiga and the whole Amiga community could move forward together. A 68060+SAGA uses fewer transistors than a RP2040 SoC that costs $1, a fraction of the transistors of a RPi 3 SoC and a tiny fraction of the transistors of a RPi 4. Some people think 68k emulation on a Cortex-A53 offers good value though. The problem is that we have this long list of Amiga IP squatters that keeps growing including Hyperion, A-Eon, AmigaKit, Amedia Computer and AAA Technologies that are impediments to Amiga progress. Michele Battilana gave us a taste of what is possible with professionals and THEA500 Mini. The Amiga IP squatters just dug in deeper instead of getting out of the way though. The vultures are worse than the vampires in Amiga Neverland.

Last edited by matthey on 27-Oct-2023 at 01:11 AM.

Karlos 
Re: AmigaOS4 KVM Edition? virtual gpu driver Picasso96 coming soon.
Posted on 27-Oct-2023 1:27:14
#192 ]
Elite Member
Joined: 24-Aug-2003
Posts: 4405
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition!

@matthey

You already have the Vampire if you want enhanced AGA. I think I prefer RTG and the potential for future video and 3D acceleration on the PiStorm, with legacy modes being handled by the actual hardware.

However, there's no way I'd say no to a hybrid. Something like a standalone PiStorm with an FPGA realisation of some backwards compatible but otherwise souped up native chipset emulation for geeking out on. That doesn't seem likely, so the PiStorm in a genuine Amiga is my preference.

The performance of the PiStorm on existing machines is making a whole load of things possible in reality that were merely hypothetical before. Things like running Quake on OCS/ECS in HAM mode or playing HD content (software decode), etc.

_________________
Doing stupid things for fun...

Hammer 
Re: AmigaOS4 KVM Edition? virtual gpu driver Picasso96 coming soon.
Posted on 27-Oct-2023 3:30:32
#193 ]
Elite Member
Joined: 9-Mar-2003
Posts: 5312
From: Australia

@matthey

Quote:

The 68060 has an 8 stage integer pipeline while the Alpha 21164 has a 7 stage integer pipeline. Both these CPUs were released in 1994, used a 500nm chip fab process and operated at 3.3V yet the Alpha 21164 had a 333MHz part and the Alpha 21164A with no changes to the pipeline or voltage had a 666MHz part using a 350nm chip fab process. Motorola originally planned to release parts with a higher clock rating than were ever released, not that the 68060 would clock as high as the Alpha but it didn't need to as it had significantly better integer performance/MHz like the Pentium CPUs which ended the Alpha and DEC. A deeper pipeline and smaller L1 caches allow higher clock ratings and Motorola didn't want the 68060 competing with the shallow pipeline PPC CPUs with big caches that couldn't be clocked as high. The 68060 already had better integer performance/MHz (DMIPS/MHz) than the PPC 601, PPC 603 and all Alpha CPUs. Motorola made a political decision to push the PPC with the AIM alliance which improved economies of scale but ignored existing technology.


The 68060 wasn't a fully pipelined superscalar design, since its FPU was not pipelined.

Compared with an Amiga 4000T with a 68060, RISC hardware running Windows NT with LightWave shows the raytracing rendering power of RISC CPUs.

The 68060 FPU is weaker than the classic Pentium FPU.

The 68060 keeps the 68040's 32-bit front-side bus, while the classic Pentium has a 64-bit front-side bus. The RISC competition also has a 64-bit front-side bus. Dual 32-bit integer pipelines or an FP64 FPU need a 64-bit memory bus.

The 68060's weak FPU has a problem similar to that of the PC's weak-FPU 586 clones.

On my TF1260, I overclocked a 68060 Rev 1 to 62.5 MHz and it was stable. 74 MHz wasn't stable.
In 1994, Intel released Pentium 75 to 100 MHz SKUs.

Your pro-68060 argument didn't show in Quake benchmarks, i.e. it wouldn't beat my 1996-era PC's Pentium 150.

Quote:

Chip cost is pretty simple. Area, the fab process and economies of scale are the primary factors in cost. ARM cores originally had an advantage in area as the ARM2 CPU used 30k transistors to the 68000 68k transistors. ARM CPUs originally tried clocking up the memory with the CPU but this led to expensive memory prices for the Acorn computers and caches were adopted in ARM3 but the 4kiB cache used more transistors than the whole ARM2 core (4kiB SRAM=196,608 transistors). Lack of code density significantly reduced the efficiency too. Modern CPUs use many more transistors for caches than the CPU cores. Even lower end ARM cores like the Cortex-A53 have large caches. The 68060 used about 2.5 million transistors while a Cortex-A53 core uses roughly 12.5 million transistors according to "Digital Design and Computer Architecture". They are both superscalar in-order 8 stage pipeline cores although the Cortex-A53 is 64 bit while the 68060 is 32 bit. The 68060 used in an Amiga doesn't need 64 bit or as many cores which saves transistors. The 68060 uses fewer transistors than the RP2040 ARM based SoC chip that only costs $1 using a 40nm chip fab process. THEA500 Mini is evidence that mass production may be possible. Cortex-A53 emulation of the 68k seems to be the choice though. Amiga purgatory seems to be expensive hardware or poor emulation with nothing in between.
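
The quoted figure of 196,608 transistors for a 4 KiB cache is consistent with a six-transistor (6T) SRAM cell. That cell type is an assumption here, since the post does not state it. A minimal sketch of the arithmetic, counting the data bits only (no tags or control logic), with hypothetical names:

```c
#include <assert.h>
#include <stdio.h>

#define TRANSISTORS_PER_6T_CELL 6UL /* assumption: six-transistor SRAM cell */

/* Transistors in the data array of an SRAM: bytes * 8 bits * 6 per bit cell. */
unsigned long sram_data_transistors(unsigned long bytes)
{
    return bytes * 8UL * TRANSISTORS_PER_6T_CELL;
}

/* Compare a 4 KiB cache data array with the ARM2 figure quoted in the post. */
void print_arm3_cache_estimate(void)
{
    unsigned long cache = sram_data_transistors(4096UL);
    unsigned long arm2_core = 30000UL; /* "30k transistors" from the post */

    assert(cache > arm2_core);
    printf("4 KiB cache data array: %lu transistors\n", cache);
    printf("ratio to ARM2 core:     %.1f\n", (double)cache / (double)arm2_core);
}
```

With these inputs the data array alone is about six and a half times the quoted ARM2 transistor count, which is the point the quoted paragraph makes about caches dominating.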

The RP2040 ARM-based SoC is more than a CPU.

The RP2040 SoC has the following:
Dual ARM Cortex-M0+ @ 133MHz
264kB on-chip SRAM in six independent banks
DMA controller
Fully-connected AHB crossbar
2 on-chip PLLs to generate USB and core clocks
30 GPIO pins, 4 of which can be used as analogue inputs
Peripherals
2 UARTs
2 SPI controllers
2 I2C controllers
16 PWM channels
USB 1.1 controller and PHY, with host and device support
8 PIO state machines

https://community.arm.com/support-forums/f/architectures-and-processors-forum/5176/arm-cortex-m0-details

Without going into details, some of the low cost Cortex-M0 microcontrollers on the market have less than 50K gates and that included bus system, peripherals, and possibly DMA support, etc (excluding memory area and analog components). The 12K gate number is based on minimum configuration at 180ULL process. However, you can get different gate count using different processes, some give better figures and some give larger areas. For the Cortex-M0 DesignStart, as it has got 16 interrupts and the SysTick timer, the area would be a bit larger than 12K.



----------

ARM's Cortex-A53 CPU core includes a fully pipelined superscalar FP64 FPU, 128-bit NEON packed-math integer/FP SIMD, and 64-bit integer units, all of which are missing on the 68060.

Last edited by Hammer on 27-Oct-2023 at 03:46 AM.
Last edited by Hammer on 27-Oct-2023 at 03:44 AM.

_________________
Ryzen 9 7900X, DDR5-6000 64 GB RAM, GeForce RTX 4080 16 GB
Amiga 1200 (Rev 1D1, KS 3.2, PiStorm32lite/RPi 4B 4GB/Emu68)
Amiga 500 (Rev 6A, KS 3.2, PiStorm/RPi 3a/Emu68)

Hammer 
Re: AmigaOS4 KVM Edition? virtual gpu driver Picasso96 coming soon.
Posted on 27-Oct-2023 4:10:47
#194 ]
Elite Member
Joined: 9-Mar-2003
Posts: 5312
From: Australia

@matthey

Quote:

matthey wrote:
Karlos Quote:

The physical internals of the 68060 were very different than the previous members of the 68K series, along with some incompatibilities. Nobody had a problem with it. How it works inside is irrelevant as long as it executes 68K code and talks to the hardware.

In my mind, the PiStorm is no different in concept, only different in implementation. As long as it executes 68K code and talks to the hardware it's as legitimate as any other 68K.


I get it. Tolerance increases as cost decreases. The PiStorm is acceptable for many Amiga users and allows them to retain their 68k Amiga hardware with good compatibility. Would they rather have a 68060@100Mhz for the same cost? Most probably would even though the PiStorm is higher performance. Would they rather have a 68060@100MHz Amiga with SAGA for the same cost? Almost all would. Would they rather have a 68060@1GHz Amiga with SAGA for the same price? I expect everyone would choose the real Amiga and the whole Amiga community could move forward together. A 68060+SAGA uses fewer transistors than a RP2040 SoC that costs $1, a fraction of the transistors of a RPi 3 SoC and a tiny fraction of the transistors of a RPi 4. Some people think 68k emulation on a Cortex-A53 offers good value though. The problem is that we have this long list of Amiga IP squatters that keeps growing including Hyperion, A-Eon, AmigaKit, Amedia Computer and AAA Technologies that are impediments to Amiga progress. Michele Battilana gave us a taste of what is possible with professionals and THEA500 Mini. The Amiga IP squatters just dug in deeper instead of getting out of the way though. The vultures are worse than the vampires in Amiga Neverland.

1. Besides the very limited-production Warp560, there's no other available 68060-based accelerator for the Amiga 500.

Commodore's AGA custom chips are not being produced for the new RE-Amiga 1200 PCB. For battery-damaged Amiga 1200 PCBs, Commodore's existing AGA chips are being recycled for new RE-Amiga 1200 PCB builds.

PiStorm-RPi3A-Emu68 crushed Warp560/Warp1260's 68060 Rev6 @ 100 MHz. RPi 4B can be used with the original PiStorm.

Vampire Standalone, A500Mini, and Minimig VER 1.97itx (Amiga 500Plus ECS clone including CPU/RTG accelerator feature) can expand the AmigaOS 68K install base.

2. AmigaKit's A600GS operating system solution is based on 68K AROS and System 54 components backported to 68K.

There was a split between Hyperion's AmigaOS 4.1 FE Update 2 and A-EON's System 54 "Enhancer". Hyperion is focusing on 68K-based AmigaOS 3.3.x development.

Cloanto has its "AmigaOS 3.X" distro.


Last edited by Hammer on 27-Oct-2023 at 04:31 AM.
Last edited by Hammer on 27-Oct-2023 at 04:11 AM.

_________________
Ryzen 9 7900X, DDR5-6000 64 GB RAM, GeForce RTX 4080 16 GB
Amiga 1200 (Rev 1D1, KS 3.2, PiStorm32lite/RPi 4B 4GB/Emu68)
Amiga 500 (Rev 6A, KS 3.2, PiStorm/RPi 3a/Emu68)

cdimauro 
Re: AmigaOS4 KVM Edition? virtual gpu driver Picasso96 coming soon.
Posted on 27-Oct-2023 5:36:33
#195 ]
Elite Member
Joined: 29-Oct-2012
Posts: 3650
From: Germany

@matthey

Quote:

matthey wrote:
cdimauro Quote:

Looking at the example, it is clear that a mistake was made in selecting the data type.

In general, it has long been well known that sizeof(int) is less than or equal to sizeof(long), and that there were/are machines with 32-bit ints and 64-bit longs.

As a coder, you should select the proper data type for the specific need. Choosing a long when you do NOT need to manipulate/represent more than 32 bits is clearly bad design. The same concept applies to any data type (char, short, int, long, single, double).


There could be a logical reason why long was chosen over int. The LP32 memory model was popular, being used by Windows, Mac, Atari ST, etc., which used a 16 bit int, resulting in better performance on 16 bit CPUs and saving memory even on 32 bit CPUs. The Amiga Aztec compiler used a 16 bit int as default too, but Lattice used a 32 bit int and this won on the Amiga. This ILP32 memory model gave better Unix compatibility and better performance on 32 bit CPUs, which was a good choice as all future 68k CPUs would be 32 bit. Modern 32 bit CPUs using LP32 suffered from partial register writes, and results could not be forwarded with a 16 bit int, significantly reducing performance. The easy solution was to use a long datatype to get a 32 bit integer for better performance. What was the proper way to get a 32 bit integer datatype for better performance on a 32 bit CPU before C99?

Before C99, the only way was something like this:

#ifdef AZTEC_C
#define INT16 int
#else
#define INT16 short
#endif


Here the problem is clearly the language / standard library, which wasn't able to precisely define the data types.
However, a solution like the one above was always possible if a goal of the project was (also) to be portable.
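
The same idea can key off <limits.h> instead of a compiler name, which is one way to get an integer of at least 32 bits in C89. This is only a sketch and not code from any of the compilers mentioned; the names sketch_int32 and the helper functions are made up for illustration.

```c
#include <assert.h>
#include <limits.h>

/* Use int when it already covers 32 bits (ILP32, LP64); fall back to long
   when int is 16 bits (LP32, e.g. the Aztec C default described above). */
#if INT_MAX >= 2147483647L
typedef int sketch_int32;
#else
typedef long sketch_int32;
#endif

/* C89-style compile-time check: the array size becomes -1, which is a
   compile error, if the chosen type is narrower than 32 bits. */
typedef char sketch_int32_is_wide_enough[
    (sizeof(sketch_int32) * CHAR_BIT >= 32) ? 1 : -1];

/* Width of the chosen type in bits, as stored in memory. */
int sketch_int32_bits(void)
{
    return (int)(sizeof(sketch_int32) * CHAR_BIT);
}

/* Run-time sanity check: the largest 32-bit signed value must fit unchanged. */
void sketch_int32_self_check(void)
{
    sketch_int32 big = 2147483647L;
    assert(big == 2147483647L);
}
```

Unlike the compiler-name test, this version picks the type from the limits the implementation reports, so it does not need a new branch for each compiler.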
Quote:

cdimauro Quote:

Nice read, thanks.

Looking at it, I find it very strange that stack architectures were put on top for memory traffic. In fact, the continuous push & pop of data generates much more memory traffic compared to other architectures.


I expect the stack architecture example is best case and stack architectures are only used in specialized cases where data is generally handled in LIFO order. The example works out perfectly for the stack architecture and the code is tiny which is why the memory traffic is low. The examples are tiny as it is and miss many advantages. The M68020 example could have used a MOVEM.L with a significant code and memory traffic savings by using more registers.

Right. With MOVEM.L you can move the 5 data sources to 5 data registers in one shot. Unbeatable.
It should give the best result in terms of code density, and be almost on par with the VAX in terms of executed instructions.
Quote:
The example misses many of the M68020 memory traffic advantages too.

op reg,mem ; load, op, store with load/store
op #imm,mem ; load, op, store with load/store

move mem,mem ; load, store with load/store

The example is too simple to exploit them. This study is just a school exercise and cannot be taken seriously for comparing the different types of architectures.
Quote:
Having more GP registers reduces data memory traffic but increases code memory traffic.

Code memory traffic? Do you refer to the multiple instructions needed to load / store values for L/S architectures, compared to the reg,mem / mem,reg / mem-mem ones?
Quote:
For a load/store architecture, the gains are partially offset by needing a free register for a load and more than double memory traffic when out of registers (without a CISC like XCHG mem,reg instruction).

op var1,reg ; store var0, load var1, op var1,reg with load/store not counting reloading var0

It's more important to have GP registers with a load/store architecture to reduce data memory traffic but this increases code size and code memory traffic offsetting gains. There is no free lunch.

Yes, but this applies only to L/S architectures, since they only have load/store instructions for accessing memory.
Quote:
cdimauro Quote:

Exactly. Amiga is all about 68k and its chipset. That's so obvious looking at which hardware platforms are sold and even at emulation.


Compatibility is important and castration is not popular for a reason. I expect most Amiga users would prefer real 68k Amiga hardware but at least emulation is not castration, even if it is on poor ARM hardware.

The main problem with the Amiga solutions, either hardware or software/emulation, is that compatibility is very hard to achieve, because the original schematics were lost.

However we already have very good approximations.

matthey 
Re: AmigaOS4 KVM Edition? virtual gpu driver Picasso96 coming soon.
Posted on 28-Oct-2023 5:12:41
#196 ]
Elite Member
Joined: 14-Mar-2007
Posts: 2024
From: Kansas

Karlos Quote:

You already have the Vampire if you want enhanced AGA. I think I prefer RTG and the potential for future video and 3D acceleration on the PiStorm, with legacy modes being handled by the actual hardware.

However, there's no way I'd say no to a hybrid. Something like a standalone PiStorm with an FPGA realisation of some backwards compatible but otherwise souped up native chipset emulation for geeking out on. That doesn't seem likely, so the PiStorm in a genuine Amiga is my preference.

The performance of the PiStorm on existing machines is making a whole load of things possible in reality that were merely hypothetical before. Things like running Quake on OCS/ECS in HAM mode or playing HD content (software decode), etc.


The Vampire has chunky RTG too. Why slow HAM instead of fast true color chunky RTG (HAM and HAM8 work on Vampire if wanting to retro geek out or save memory)?

Hammer Quote:

68060 wasn't fully superscalar pipelined when the FPU was not pipelined.

When compared to Amiga 4000T with 68060, RISC hardware Windows NT with Lightwave shows RISC CPU raytracing rendering power.

68060 FPU is weaker than the classic Pentium FPU.


The Pentium FPU is fully pipelined but not superscalar. The 68060 FPU is superscalar but not fully pipelined.

https://en.wikipedia.org/wiki/Motorola_68060#Architecture Quote:

Against the Pentium, the 68060 can perform better on mixed code; Pentium's decoder cannot issue an FP instruction every opportunity and hence the FPU is not superscalar as the ALUs were. If the 68060's non-pipelined FPU can accept an instruction, it can be issued one by the decoder. This means that optimizing for the 68060 is easier: no rules prevent FP instructions from being issued whenever was convenient for the programmer other than well understood instruction latencies. However, with properly optimized and scheduled code, the Pentium's FPU is capable of double the clock for clock throughput of the 68060's FPU.


The Pentium FPU has the edge in performance but the 68060 is not as far behind as theoretical performance suggests. Easier optimizing is very important, applies to the 68060 in general and is special for an in-order CPU. The vbcc support code I worked on allowed a vbcc compiled ByteMark floating point benchmark to nearly achieve the same per MHz performance. Since most floating point code is mixed code, an improved integer vbcc backend and/or instruction scheduler could allow the 68060 to outperform the Pentium. That won't happen with an emulated 68k Amiga platform of course. Vbcc's Dr. Volker Barthelmann and Frank Wille are Amiga fans but emulation indicates a dead platform and not a compiler target.

Hammer Quote:

RP2040 ARM-based SoC chip is more than a CPU.

RP2040 SoC has the following:
Dual ARM Cortex-M0+ @ 133MHz
264kB on-chip SRAM in six independent banks
DMA controller
Fully-connected AHB crossbar
2 on-chip PLLs to generate USB and core clocks
30 GPIO pins, 4 of which can be used as analogue inputs
Peripherals
2 UARTs
2 SPI controllers
2 I2C controllers
16 PWM channels
USB 1.1 controller and PHY, with host and device support
8 PIO state machines

Without going into details, some of the low cost Cortex-M0 microcontrollers on the market have fewer than 50K gates, and that includes the bus system, peripherals, and possibly DMA support, etc. (excluding the memory area and analog components). The 12K gate number is based on a minimum configuration on a 180ULL process. However, you can get different gate counts using different processes; some give better figures and some give larger areas. For the Cortex-M0 DesignStart, as it has 16 interrupts and the SysTick timer, the area would be a bit larger than 12K.


Most 2 input logic gates use 2 transistors but CMOS increases the requirement. Even 12k gates is several times that many transistors in CMOS. The Cortex-M0(+) cores are small but they may use extra transistors to reduce power. Both the Cortex-M0+ cores together are smaller than a 68060 core but the "264kB on-chip SRAM" uses 12,976,128 transistors (SRAM uses 6 transistors/bit). On-chip SRAM acts like a CPU cache but with no jitter. The 68060 was 2,530,000 transistors and the Amiga AA+ custom chips would have been 200,000 transistors, so the RP2040 SoC is significantly larger. Even a dual core 68060+AA+ SoC would only be 5,260,000 transistors, which is less than half the transistors of the $1 RP2040 SoC chip. That leaves plenty of room for more caches and other enhancements, not that sticking to a $1 SoC is necessary. A $2 SoC with an L2 cache may still be competitive enough, but the embedded competition and cheap Cortex-A53 SoCs with many features really start picking up here. Then a $3 SoC may allow an Imagination Technologies hybrid ray tracing GPU, which would be good for Amiga marketing and allow for a semi-modern budget gaming platform, but the AmigaOS is better on small footprint hardware where Linux looks fat and unresponsive, so it is important not to upscale too far.

cdimauro Quote:

Before C99 the only way was something like that:
#ifdef AZTEC_C
#define INT16 int
#else
#define INT16 short
#endif


Here the problem is clearly the language / standard library, which wasn't able to precisely define the data types.
However, a solution like the one above was always possible if the goal of a project was (also) to be portable.


C99 was a big improvement for integer datatypes and should have already been out by the time of the benchmark but the code was likely older. Even with C99, choosing optimal integer datatypes can be challenging. The performance of a larger datatype in memory may be better while in caches but then the larger size may cause it to be slower when the caches aren't large enough anymore and worse if MMU paging occurs. Standardized hardware makes the job of choosing and profiling easier.
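To make the trade-off concrete, here is a minimal C99 sketch, not taken from the benchmark in question: an exact-width type keeps the data small in memory and caches, while a "fast" type lets the compiler pick the register-friendly width for arithmetic. The names sample_t, accum_t and sum_samples are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Exact-width type for storage: arrays of it stay compact in memory. */
typedef int16_t sample_t;

/* "Fast" type for arithmetic: at least 32 bits, with the actual width
   chosen by the implementation for speed on the target. */
typedef int_fast32_t accum_t;

/* Hypothetical helper: sum 16-bit samples while accumulating in the
   fast type, so narrow storage does not force narrow arithmetic. */
accum_t sum_samples(const sample_t *samples, size_t count)
{
    accum_t total = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        total += samples[i];
    }
    return total;
}
```

Which widths the fast types map to is implementation-defined, so profiling on the actual hardware is still needed.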

cdimauro Quote:

Code memory traffic? Do you refer to the multiple instructions needed to load / store values for L/S architectures, compared to the reg,mem / mem,reg / mem-mem ones?


I was talking more in general but accessing memory is the major bottleneck of load/store architectures. Not reduced instruction set load/store architectures should be efficient when all data is in registers although larger immediates can result in multiple dependent instructions. Accessing memory not only causes increased code memory traffic but it likely causes increased data traffic as well. The 68k AmigaOS 3.1 had a default stack size of about 4kiB while some increased this to 8kiB for safety. It looks like 64kiB is the default stack size for PPC AmigaOS 4 and a Hyperion AmigaOS Core Developer broadblues recommended increasing this to 80kiB for safety.

https://forum.hyperion-entertainment.com/viewtopic.php?t=2934

Some of the difference is more efficient stack alignment but PPC often needs double the stack space to go with 50% more code size and both increase memory traffic. More GP registers is supposed to reduce memory traffic though?

cdimauro Quote:

The main problem with the Amiga solutions, either hardware or software/emulation, is that compatibility is very hard to achieve, because the original schematics were lost.

However we already have very good approximations.


Jeri Ellsworth has some of the schematics which she received from a C= engineer. Other ex-C= engineers may have schematics. Michele Battilana may have schematics and much more C= documentation.

https://github.com/nonarkitten/amiga_replacement_project/blob/master/agnus/alice_schematics.pdf

There are already at least 3 different AGA compatible FPGA cores and the logic in them has been tested and bug fixed. The chipset is not the problem as even a cheap FPGA will hold it. A semi-modern high performance CPU is severely limited in an affordable FPGA though. A 3D GPU can perform better than a CPU in an FPGA but more parallelism is better and in the case of Vamp hardware, the CPU used up most of the resources.

cdimauro 
Re: AmigaOS4 KVM Edition? virtual gpu driver Picasso96 coming soon.
Posted on 28-Oct-2023 6:21:29
#197 ]
Elite Member
Joined: 29-Oct-2012
Posts: 3650
From: Germany

@matthey

Quote:

matthey wrote:

Most 2 input logic gates use 2 transistors but CMOS increases the requirement. Even 12k gates is several times that many transistors in CMOS. The Cortex-M0(+) cores are small but they may use extra transistors to reduce power. Both the Cortex-M0+ cores together are smaller than a 68060 core but the "264kB on-chip SRAM" uses 12,976,128 transistors (SRAM uses 6 transistors/bit). On-chip SRAM acts like a CPU cache but no jitter. The 68060 was 2,530,000 transistors and the Amiga AA+ custom chips would have been 200,000 transistors so the RP2040 SoC is significantly larger. Even a dual core 68060+AA+ SoC would only be 5,260,000 transistors which is less than half the transistors of the $1 RP2040 SoC chip.

When you count the number of transistors, are you removing / reusing the 13M ones used by the RP2040 for the 264kB SRAM for implementing the two 68060 cores + AGA?
Quote:
[...]would be good for Amiga marketing and allow for a semi-modern budget gaming platform but the AmigaOS is better on small footprint hardware where Linux looks fat and unresponsive so it is important not to upscale too far.

It depends on the specific market: the Amiga o.s. is very lightweight and responsive, but it can crash at the slightest breath.

Even for a game machine, something which crashes so easily may not be acceptable to the wider audience of modern game players.

Last but not least, we (Amiga coders) were used to hitting the hardware directly to make better use of it. Taking the AGA chipset as it is in the above "new Amiga SoC" may require the same, which isn't good nowadays.
Quote:
cdimauro Quote:

Code memory traffic? Do you refer to the multiple instructions needed to load / store values for L/S architectures, compared to the reg,mem / mem,reg / mem-mem ones?


I was talking more in general but accessing memory is the major bottleneck of load/store architectures. Not reduced instruction set load/store architectures should be efficient when all data is in registers although larger immediates can result in multiple dependent instructions. Accessing memory not only causes increased code memory traffic but it likely causes increased data traffic as well.

OK, now I get it and I fully agree. Yes, on L/S architectures there is often an increase in both code cache traffic (more instructions) AND data cache traffic, whereas non-L/S architectures have a clear advantage here.
Quote:
The 68k AmigaOS 3.1 had a default stack size of about 4kiB while some increased this to 8kiB for safety. It looks like 64kiB is the default stack size for PPC AmigaOS 4 and a Hyperion AmigaOS Core Developer broadblues recommended increasing this to 80kiB for safety.

https://forum.hyperion-entertainment.com/viewtopic.php?t=2934

Some of the difference is more efficient stack alignment but PPC often needs double the stack space to go with 50% more code size and both increase memory traffic.

LOL: 64kB or even 80kB.

The stack had to be at least 16-bit aligned with the 68k, so even doubling the default 4kB size of the generic Amiga o.s. application does NOT justify bringing it to 64kB or even 80kB: you can expect 8kB as the corresponding size for PowerPCs.

Here I think the problem is more related to the ridiculous ABI which PPCs have for function calls. Nothing else can justify such a large increase in the required default stack space.
Quote:
More GP registers is supposed to reduce memory traffic though?

OK, NOT in this specific case.
Quote:
cdimauro Quote:

The main problem with the Amiga solutions, either hardware or software/emulation, is that compatibility is very hard to achieve, because the original schematics were lost.

However we already have very good approximations.


Jeri Ellsworth has some of the schematics which she received from a C= engineer. Other ex-C= engineers may have schematics. Michele Battilana may have schematics and much more C= documentation.

https://github.com/nonarkitten/amiga_replacement_project/blob/master/agnus/alice_schematics.pdf

That's only Agnus, unfortunately. Which is very very important, but not enough.

I hope that someone else has the schematics of the other chips.
Quote:
There are already at least 3 different AGA compatible FPGA cores and the logic in them has been tested and bug fixed. The chipset is not the problem as even a cheap FPGA will hold it.

Indeed.
Quote:
A semi-modern high performance CPU is severely limited in an affordable FPGA though. A 3D GPU can perform better than a CPU in an FPGA but more parallelism is better and in the case of Vamp hardware, the CPU used up most of the resources.

That's the biggest problem.

NutsAboutAmiga 
Re: AmigaOS4 KVM Edition? virtual gpu driver Picasso96 coming soon.
Posted on 28-Oct-2023 16:56:19
#198 ]
Elite Member
Joined: 9-Jun-2004
Posts: 12825
From: Norway

@matthey

Quote:
Some of the difference is more efficient stack alignment but PPC often needs double the stack space to go with 50% more code size and both increase memory traffic. More GP registers is supposed to reduce memory traffic though?


68K programs might use a little more stack than PowerPC programs under AmigaOS 4.1, because the exception handler calls a stub, the stub does the translation to the native APIs, and then calls the native functions.

The stub's job is to take D0-D7 and A0-A7 and remap them into arguments for the native function.

Also, several of the older functions call other functions, so code does not need to be duplicated.
What this means is that more stack is used. BlaBlaTags() will call BlaBlaTagList(), so you end up jumping through many layers of functions, and each level adds stack.

OldOpenLibrary will call OpenLibrary and so on..
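A rough, portable sketch of the layering described above, using the placeholder names BlaBlaTags() and BlaBlaTagList() from this post. It is not actual AmigaOS source: the TagItem layout is simplified, MAX_TAGS is an arbitrary limit, and copying the tags into a local array is just one assumed way a varargs front end could forward to the tag-list worker.

```c
#include <stdarg.h>

#define TAG_DONE 0UL
#define MAX_TAGS 8              /* arbitrary limit for this sketch */

struct TagItem {
    unsigned long ti_Tag;
    unsigned long ti_Data;
};

/* Hypothetical worker: the only place that holds the real logic.
   Here it simply counts the tags before TAG_DONE. */
int BlaBlaTagList(const struct TagItem *tags)
{
    int n = 0;

    while (tags[n].ti_Tag != TAG_DONE) {
        n++;
    }
    return n;
}

/* Hypothetical varargs front end: no duplicated logic, but it needs its
   own stack frame (including the local tag array) before the worker's
   frame is added on top. */
int BlaBlaTags(unsigned long firstTag, ...)
{
    struct TagItem tags[MAX_TAGS];
    unsigned long tag = firstTag;
    va_list ap;
    int n = 0;

    va_start(ap, firstTag);
    while (tag != TAG_DONE && n < MAX_TAGS - 1) {
        tags[n].ti_Tag = tag;
        tags[n].ti_Data = va_arg(ap, unsigned long);
        n++;
        tag = va_arg(ap, unsigned long);
    }
    va_end(ap);

    tags[n].ti_Tag = TAG_DONE;  /* always terminate the list */
    tags[n].ti_Data = 0;

    return BlaBlaTagList(tags);
}
```

A call such as BlaBlaTags(1UL, 100UL, 2UL, 200UL, TAG_DONE) passes every argument as unsigned long, which the va_arg reads in the wrapper depend on.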

When AmigaOS 1.x was written, most code was written in assembly. When AmigaOS 4.x was written, most code was written in C. Arguments could easily be assigned to registers, but I remember OS4 developers advocating against using registers, and also advising against writing in assembly.

Last edited by NutsAboutAmiga on 28-Oct-2023 at 04:59 PM.

_________________
http://lifeofliveforit.blogspot.no/
Facebook::LiveForIt Software for AmigaOS

cdimauro 
Re: AmigaOS4 KVM Edition? virtual gpu driver Picasso96 coming soon.
Posted on 28-Oct-2023 19:17:49
#199 ]
Elite Member
Joined: 29-Oct-2012
Posts: 3650
From: Germany

@NutsAboutAmiga: the stack discussion was NOT about 68k applications, rather about native / PowerPC applications...

matthey 
Re: AmigaOS4 KVM Edition? virtual gpu driver Picasso96 coming soon.
Posted on 29-Oct-2023 19:50:17
#200 ]
Elite Member
Joined: 14-Mar-2007
Posts: 2024
From: Kansas

cdimauro Quote:

When you count the number of transistors, are you removing / reusing the 13M ones used by the RP2040 for the 264kB SRAM for implementing the two 68060 cores + AGA?


The RP2040 SoC chip at 40nm uses over 13 million transistors and costs $1 USD (production cost is significantly lower as development costs, licensing costs, overhead costs and profit should be included). Most mass produced chips at 40nm which are similar could have a similar cost. I just pointed out that a dual core 68060 with AA+ chipset uses less than half the transistors of the RP2040 chip.

https://en.wikipedia.org/wiki/AA+_Chipset

The fact that the 264kiB of SRAM uses most of the transistor budget of the RP2040 is not important. The SRAM is simple and offers increased performance for a microcontroller like caches increase performance for a microprocessor. At least the L1 I+D caches of the 68060s should be increased as well which would use 2,359,296 transistors for each core or 4,718,592 for increasing both cores to 32kiB I+D L1 caches.

dual 68060 Amiga SoC
AA+ chipset uses 200,000 transistors
2x68060 with 32kiB I+D L1 cache uses 9,778,592 transistors
---
9,978,592 transistors total

L2 cache options
128kiB shared L2 cache uses 6,291,456 transistors
256kiB shared L2 cache uses 12,582,912 transistors
512kiB shared L2 cache uses 25,165,824 transistors
1MiB shared L2 cache uses 50,331,648 transistors
2MiB shared L2 cache uses 100,663,296 transistors

The SRAM for a L2 cache could also be configurable as memory for an embedded microcontroller or Amiga where no external memory is used. It would be cool to be able to run the 1MiB or 2MiB Amiga standard out of ultra fast SRAM and it could be useful for a retro Amiga built into a controller or a portable Amiga gaming device. A $1 SoC chip includes the chip package and other per chip costs so the transistors/$ are significantly cheaper when scaling up to a larger transistor budget. It's just amazing how cheap transistors have become.
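The figures above can be reproduced with a few lines of C. This is only a sketch of the arithmetic: it assumes 6 transistors per SRAM bit as stated earlier, counts data bits only (no tags or control logic), and takes the stock 68060 as having 8 KiB instruction plus 8 KiB data cache. The function and macro names are made up for the example.

```c
#define TRANSISTORS_PER_SRAM_BIT 6UL       /* 6T SRAM cell, as assumed above */
#define M68060_TRANSISTORS       2530000UL /* stock 68060 */
#define M68060_L1_TOTAL_KIB      16UL      /* 8 KiB I + 8 KiB D */
#define AA_PLUS_TRANSISTORS      200000UL  /* AA+ estimate from the post */

/* Transistors for a plain SRAM array of the given size in KiB. */
unsigned long sram_transistors(unsigned long kib)
{
    return kib * 1024UL * 8UL * TRANSISTORS_PER_SRAM_BIT;
}

/* One 68060 core whose combined I+D L1 is grown to l1_total_kib
   (must be >= 16); only the extra SRAM is added to the stock count. */
unsigned long core_transistors(unsigned long l1_total_kib)
{
    return M68060_TRANSISTORS
         + sram_transistors(l1_total_kib - M68060_L1_TOTAL_KIB);
}

/* Whole SoC: AA+ chipset, N cores, optional shared L2 (0 for none). */
unsigned long soc_transistors(unsigned long cores,
                              unsigned long l1_total_kib,
                              unsigned long l2_kib)
{
    return AA_PLUS_TRANSISTORS
         + cores * core_transistors(l1_total_kib)
         + sram_transistors(l2_kib);
}
```

With these assumptions, soc_transistors(2, 64, 0) gives the 9,978,592 total listed above (32kiB I + 32kiB D per core), and sram_transistors(264) gives the 12,976,128 quoted for the RP2040 SRAM.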

cdimauro Quote:

It depends on the specific market: the Amiga o.s. is very lightweight and responsive, but it can crash at the slightest breath.

Even for a game machine, something which crashes so easily may not be acceptable to the wider audience of modern game players.


It wasn't a problem for the Amiga back when it was popular and it is not a problem for budget and retro gaming today. It would be a different story if trying to make the Amiga competitive on the desktop.

cdimauro Quote:

Last but not least, we (Amiga coders) were used to hitting the hardware directly to make better use of it. Taking the AGA chipset as it is in the above "new Amiga SoC" may require the same, which isn't good nowadays.


More performance means developers don't need to bang the hardware and I would encourage them not to for new projects. At the same time, it is important to accept that some developers enjoy banging the hardware and that it is easier to enhance games that already do. Compatibility is important while AmigaOS 3 can be enhanced with more modern functionality including MMU use for improved stability. WHDLoad and patching works well enough for misbehaving software.

cdimauro Quote:

LOL: 64kB or even 80kB.

The stack had to be at least 16-bit aligned with the 68k, so even doubling the default 4kB size of the generic Amiga o.s. application does NOT justify bringing it to 64kB or even 80kB: you can expect 8kB as the corresponding size for PowerPCs.

Here I think the problem is more related to the ridiculous ABI which PPCs have for function calls. Nothing else can justify such a large increase in the required default stack space.


It's not even the whole PPC ABI as it is good about passing function args in registers which decreases memory traffic from push/pop of args on the stack. It's the PPC mandated stack frame handling and function prologues/epilogues without mandated multiple register load/store that is both annoying and inefficient for shallow functions. The 68k AmigaOS makes use of extensive code sharing that results in such a small footprint but this is inefficient on load/store cores with load-to-use penalties. The PPC ABI is optimized for deep functions with extensive function inlining and loop unrolling to minimize load-to-use penalties. These are different and practically opposite philosophies and we can see the result as modern compilers are designed for this too. Old C compilers generated WYSIWYG code that was similar to the translation of what was written and worked fine on the 68k while the code from a modern compiler is often bloated and unrecognizable, especially so in PPC assembler which is difficult to read. The 68k takes more penalties from unaligned memory accesses, extra function calls and a little decoder overhead but even the mostly modern 68060 handles it with grace while avoiding load-to-use penalties and sharing already dense code which more than offsets the penalties. The 68k processors should be called code sharing processors while load/store processors should be called code bloating processors. Most of the world chose code bloating as it is better for security but the Amiga was different.

It is not unusual for an Amiga to have many tasks/processes executing on startup even though most are waiting asleep. The following video shows someone testing SysMon on AmigaOS 4 which states, "There are 102 tasks/process in the system".

SysMon for Amiga OS4, by Guillaume Boesel
https://youtu.be/pbAKRtirckc?t=22

With an AmigaOS 4 stack default of 64kiB of stack for each task/process that would be ~6.4 MiB of memory and with 80kiB of stack that would be ~8.0MiB of memory. AmigaOS 3.1 with a 4000 byte stack per task/process would take less than 400kiB of memory and potentially still work on a 2MiB system where the stack alone in AmigaOS 4 would take 8MiB of memory as recommended by an official Hyperion AmigaOS 4 developer. That is a huge increase to the 68k Amiga footprint before even mentioning 50% larger PPC code.
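The estimate is simple enough to check in C. This sketch assumes, as the paragraph above does, that every one of the 102 tasks/processes gets the same default stack and nothing else is counted; the function names are invented for the example.

```c
/* Total stack memory if every task gets the same stack size. */
unsigned long stack_total_bytes(unsigned long tasks,
                                unsigned long stack_bytes_per_task)
{
    return tasks * stack_bytes_per_task;
}

/* Convert a byte count to MiB for easier reading. */
double bytes_to_mib(unsigned long bytes)
{
    return (double)bytes / (1024.0 * 1024.0);
}
```

For 102 tasks this gives 6,684,672 bytes (6.375 MiB) with 64kiB stacks, 8,355,840 bytes (about 7.97 MiB) with 80kiB stacks, and 408,000 bytes (just under 400kiB) with 4000 byte stacks.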

Last edited by matthey on 29-Oct-2023 at 07:58 PM.


Copyright (C) 2000 - 2019 Amigaworld.net.
Amigaworld.net was originally founded by David Doyle