Hammer
Re: The (Microprocessors) Code Density Hangout
Posted on 13-Feb-2024 3:56:31 [ #261 ]
Elite Member | Joined: 9-Mar-2003 | Posts: 6500 | From: Australia

@cdimauro
1. Your NEx64T ISA is theory only.
2. The link you cited, https://www.extremetech.com/extreme/188396-the-final-isa-showdown-is-arm-x86-or-mips-intrinsically-more-power-efficient, is nearly useless for the real-world gaming use case.
I'm aware of SPECint, which is meaningless for the FP-heavy Quake.
I cited a gaming benchmark (3DMark Ice Storm Physics); gaming is Amiga's core audience.
kolla
Re: The (Microprocessors) Code Density Hangout
Posted on 13-Feb-2024 4:43:26 [ #262 ]
Elite Member | Joined: 20-Aug-2003 | Posts: 3473 | From: Trondheim, Norway

@Hammer
Quote:
I cited a gaming benchmark (3DMark Ice Storm Physics); gaming is Amiga's core audience. |
Pffff, no it isn't.
cdimauro
Re: The (Microprocessors) Code Density Hangout
Posted on 13-Feb-2024 5:58:09 [ #263 ]
Elite Member | Joined: 29-Oct-2012 | Posts: 4431 | From: Germany

@Hammer
Another proof that you do NOT read what people write. Here's the main chart with the benchmark results:
[chart: SPEC INT and SPEC FP results from the ExtremeTech ISA study]
Do you see ONLY SPEC INT there, or SOMETHING ELSE as well?!?
As has already been proven several times, you're just a PARROT who repeats the same meaningless things and posts material found by googling around, without understanding its context and, what's worse, the CONTEXT of the discussion.
Hammer
Re: The (Microprocessors) Code Density Hangout
Posted on 14-Feb-2024 4:14:38 [ #264 ]
Elite Member | Joined: 9-Mar-2003 | Posts: 6500 | From: Australia

@cdimauro
Quote:
cdimauro wrote:
Do you see ONLY SPEC INT there, or SOMETHING ELSE as well?!?
As has already been proven several times, you're just a PARROT who repeats the same meaningless things and posts material found by googling around, without understanding its context and, what's worse, the CONTEXT of the discussion.
Quote:
I cited a gaming benchmark (3DMark Ice Storm Physics); gaming is Amiga's core audience. |
Fact: AMD Jaguar won the two game-console design wins when it competed against the ARM Cortex-A15.
Bobcat was not under consideration.
Your article is dated August 2014 and was already obsolete: Jaguar-based APUs were shipping in 2013. LOL
Hammer
Re: The (Microprocessors) Code Density Hangout
Posted on 14-Feb-2024 4:24:51 [ #265 ]
Elite Member | Joined: 9-Mar-2003 | Posts: 6500 | From: Australia

@kolla
Quote:
kolla wrote: @Hammer
Quote:
I cited a gaming benchmark (3DMark Ice Storm Physics); gaming is Amiga's core audience. |
Pffff, no it isn't. |
Pffff, it is.
https://benchmarks.ul.com/news/understanding-3dmark-results-from-the-apple-iphone-5s-and-ipad-air
What does the 3DMark Ice Storm Physics test measure? 3DMark is designed to benchmark real-world gaming performance. 3DMark tests are meticulously designed to mirror the content and techniques used by developers and artists to create games.
To that end, the 3DMark Ice Storm Physics test uses the Bullet Physics Library. Bullet is an open source physics engine that is used in Grand Theft Auto V, Trials HD and many other popular games on Playstation 3, Xbox 360, Nintendo Wii, PC, Android and iPhone.
The purpose of the Physics test is to measure the CPU's ability to calculate complex physics simulations. The Ice Storm Physics test has four simulated worlds. Each world has two soft bodies and two rigid bodies colliding with each other. This workload is similar to the demands placed on the CPU by many popular physics-based puzzle, platform and racing games.
Bullet Physics Library's main author, Erwin Coumans, worked for Sony Computer Entertainment US R&D from 2003 until 2010. He now works for AMD.
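For readers who want to see what the Physics test actually exercises, here is a minimal sketch of a Bullet rigid-body world in C++. This is a hand-written illustration against Bullet's public 2.x API, not code from the benchmark itself; link with -lBulletDynamics -lBulletCollision -lLinearMath.

    // Minimal Bullet world: one sphere dropped under gravity, stepped at 60 Hz,
    // the same kind of CPU work the Ice Storm Physics test scales up.
    #include <btBulletDynamicsCommon.h>
    #include <cstdio>

    int main() {
        btDefaultCollisionConfiguration config;
        btCollisionDispatcher dispatcher(&config);
        btDbvtBroadphase broadphase;
        btSequentialImpulseConstraintSolver solver;
        btDiscreteDynamicsWorld world(&dispatcher, &broadphase, &solver, &config);
        world.setGravity(btVector3(0, -9.8f, 0));

        btSphereShape sphere(0.5f);                  // 0.5 m radius
        btVector3 inertia(0, 0, 0);
        sphere.calculateLocalInertia(1.0f, inertia); // 1 kg dynamic body
        btDefaultMotionState state(
            btTransform(btQuaternion::getIdentity(), btVector3(0, 10, 0)));
        btRigidBody body(btRigidBody::btRigidBodyConstructionInfo(
            1.0f, &state, &sphere, inertia));
        world.addRigidBody(&body);

        for (int i = 0; i < 120; ++i)                // simulate 2 seconds
            world.stepSimulation(1.0f / 60.0f);

        btTransform t;
        body.getMotionState()->getWorldTransform(t);
        std::printf("sphere y after 2 s: %f\n", t.getOrigin().getY());
        world.removeRigidBody(&body);                // detach before teardown
        return 0;
    }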
FACT: the Xbox One / Xbox Series S/X and the PS4 / PS5 are the two lines of AMD-powered APU game consoles.
https://webharvest.gov/congress112th/20121215195436/http://en.wikipedia.org/wiki/Bullet_(software)
Commercial games
Games using Bullet created by professional game developers for video game consoles or other platforms include:
- Toy Story 3: The Video Game, published by Disney Interactive Studios.[3]
- Grand Theft Auto IV and Red Dead Redemption by Rockstar Games.[4]
- Trials HD by RedLynx.[5]
- Free Realms by Sony Online Entertainment.[6]
- HotWheels: Battle Force 5.[7]
- Gravitronix.[8]
- Madagascar Kartz, published by Activision.[9]
- Regnum Online by NGD Studios, an MMORPG whose physics engine was replaced by Bullet in its latest major update.
- 3DMark 2011 by Futuremark.[10]
- Blood Drive, published by Activision.[11]
- Hydro Thunder Hurricane.[12]

Movies
Several Hollywood movie studios use Bullet rigid-body simulation for special effects in commercial films. Movies using the Bullet engine include:
- 2012 by Sony Pictures Imageworks.[13][14]
- Hancock by Sony Pictures Imageworks.[15]
- Bolt by Walt Disney Animation Studios, which used Bullet in their Dynamica Maya plugin.[16]
- The A-Team by Weta Digital.[17]
- Sherlock Holmes by Framestore.[18]
- Megamind and Shrek 4 by PDI/DreamWorks.[19]

3D authoring tools
- Blender: a free 3D production suite that uses Bullet physics for animations and in its internal game engine, Game Blender.
- Carrara Pro: added Bullet physics in version 8 (only included in the Pro edition).[20]
- Cheetah3D: a 3D modeling, rendering and animation package for Apple Mac OS X; uses Bullet to simulate rigid-body and soft-body dynamics (as of version 6.0).
- Cinema 4D: version 11.5 uses Bullet as part of MoDynamics.[21]
- Houdini: native Bullet support in the dynamics context as of version 12; available as a community-supported open-source plugin for earlier versions.
- LightWave 3D CORE.[22]
- Modo Recoil: lets users simulate dynamic rigid-body interactions based on the popular open-source Bullet Physics Library.
- MikuMikuDance: a freeware 3D animation program; added the Bullet physics engine in version 5.
- Softimage: the Momentum plugin, developed by Helge Mathee and distributed by Exocortex.

Open source and other
- Panda3D integration.
- GameKit: a game engine with Bullet integration.
- OGRE: integration through the OgreBullet add-on.
- Irrlicht Engine: several Bullet integrations, including the Bullet Physics Wrapper, irrBP and GameKit.
- OpenSceneGraph: through the osgBullet plugin.[23]
- Crystal Space: a game engine supporting Bullet for physics and switching to it as the main physics plugin.
- Cafu Engine: a game engine with Bullet physics.
- Physics Abstraction Layer.
- C4 Engine: a proprietary game engine developed by Terathon Software, into which JamesH has integrated the Bullet physics engine.[24]
- jMonkeyEngine: a game engine written in Java.[25]
- Blitz3D: integration through the BlitzBullet wrapper.
- Maratis3D: a game engine with Bullet integration (www.maratis3d.org).
- David Piuva's Graphics Engine: a game engine written in C++ for Visual Basic, with a simplified built-in version of Bullet.
- PyBullet: Python bindings for Bullet.
cdimauro
Re: The (Microprocessors) Code Density Hangout
Posted on 14-Feb-2024 5:36:02 [ #266 ]
Elite Member | Joined: 29-Oct-2012 | Posts: 4431 | From: Germany

@Hammer
Quote:
Hammer wrote: @cdimauro
Quote:
cdimauro wrote:
Do you see ONLY SPEC INT there, or SOMETHING ELSE as well?!?
As has already been proven several times, you're just a PARROT who repeats the same meaningless things and posts material found by googling around, without understanding its context and, what's worse, the CONTEXT of the discussion. |
Fact: AMD Jaguar won the two game-console design wins when it competed against the ARM Cortex-A15.
Bobcat was not under consideration.
Your article is dated August 2014 and was already obsolete: Jaguar-based APUs were shipping in 2013. LOL |
FACT: the article was about a study, and Jaguar was NOT part of that study; only Bobcat was. So, once again, you're going OUTSIDE the topic/context with the sole purpose of defending your beloved AMD (BTW, have you bought stock in it?).
FACT: YOU stated that the benchmarks used were "nearly useless" (YOUR words) for gaming. I asked for proof of that, and it did NOT come, because it's plainly evident that yours was a pure load of b@lls that nobody with a grain of sense could sustain.
FACT: YOU stated that SPEC INT was useless for FP code, when the chart clearly showed the SPEC FP results AS WELL. Which proves that you do NOT read what people write, because you just mechanically carry on your PARROTING like a puppet.
FACT: YOU stated that Amiga's core audience is about gaming by citing a PC benchmark used SEVERAL DECADES after the platform... died. Here, really, no comment...
And those were only the latest pure bull$hit$ that you continue to write because you want to appear to be the tech expert that you obviously aren't. You're just a googler who searches for something that might match the context, but your lack of knowledge about the topic doesn't allow you to give proper answers. So you continue to post nonsensical walls of text. Even an LLM chatbot performs better than you nowadays...
Gunnar
Re: The (Microprocessors) Code Density Hangout
Posted on 14-Feb-2024 7:35:54 [ #267 ]
Cult Member | Joined: 25-Sep-2022 | Posts: 512 | From: Unknown

@cdimauro
Quote:
And those were only the latest pure bull$hit$ that you continue to write because you want to appear to be the tech expert that you obviously aren't. You're just a googler who searches for something that might match the context, but your lack of knowledge about the topic doesn't allow you to give proper answers. So you continue to post nonsensical walls of text. |
Cesare is right here.
@Hammer: I think everyone sees this exactly as Cesare said, but most people are more polite than Cesare and don't tell you so directly.
@Hammer: I don't understand why you act the way you do. But if your goal is to appear to be a "tech expert", then googling and posting Intel screenshots is not working.
matthey
Re: The (Microprocessors) Code Density Hangout
Posted on 19-Jun-2025 0:57:46 [ #268 ]
Elite Member | Joined: 14-Mar-2007 | Posts: 2744 | From: Kansas
This is the resurrection of an old thread.
I would like to talk about the relation between code density and memory footprint. As a review, I am re-posting the code density chart for 32-bit and 64-bit ISAs of a large compiled executable from "Code Density Compared Between Way Too Many Instruction Sets", found in post #5 of this thread.
[charts: code density comparison of 32-bit and 64-bit ISAs]
The 68k does not have a particularly good result here, which may be due to compiling for the 68040 (the worst 68020+ target for code density), to stack frames being enabled (many newer architectures have them off by default), to ancient 68k compiler support, etc. Some other code density benchmark results have the 68k competing with Thumb-2 in code density. These charts still give a rough idea of code density, which feeds into the memory footprint considered below.
The memory footprint of a system includes the memory used for code, data, stack and caches/buffers. There is the boot-up and idle footprint before additional programs are started, which determines how much memory is available for programs, and there is the footprint when executing programs, which is just as important. Surprisingly, I have not found many papers on this important topic. Most of the info I found comes from users posting data and results.
I will start by looking at the 68k AmigaOS footprint. I personally know the engineer Steve Shireman, who worked on Amiga embedded systems.
https://kgsvr.net/andrew/amiga/amiga.diffnt.html#efficient
Quote:
AmigaOS is Efficient
From: Shireman, Steve
I have run control software on the Amiga, booting off a battery-backed SRAM PCMCIA card without a hard drive or floppy, using only 4K of the PCMCIA card to boot. Think of the PCMCIA card as replacing the hard drive in a desktop system. The only RAM overhead was about 54K, and with this I have the full color model and mouse control, and fully preemptive multitasking. Of the 2 Meg of RAM that comes with the A1200, the Amiga OS has needed less than 1/10,000 of the RAM available. And I know it is using a few of the OO objects in the Kickstart, but not very many.
Of course, the same thing can be done on an A600, which is even cheaper, or custom boards.
It would be nice for OEM's to be able to license Kickstart (remove parts they don't want), and link application code, and plug a Flash chip into the same socket where Kickstart goes.
Envoy, the network software also has tiny requirements. I have booted from a floppy on an A500 with Envoy and served files to the network with it.
The benefit of the Soft Machine Architecture is that it gives an embedded designer the chance to use only the parts of the Amiga OS that they need. Exec has the OpenLibrary() function, which lets the user or application designer of Amiga systems decide exactly which libraries to open after that point. It is very nice to have that much control of the system without mucking with the source code of the microkernel.
I believe that the current design of the Amiga Exec is much better suited for Consumer Electronics than Windows CE. This also goes for HPCs or PDAs. (Personal Digital Amiga: wouldn't that be cool, with a video out? With AAA chips it could have video in as well, and not eat batteries, but now I am dreaming...)
I hope future 'improvements' if and when they occur do not ruin the resource-smallness of the Amiga design.
|
The "resource-smallness" Steve refers to is the footprint. It is funny that he talks about removing modules from the Kickstart for custom embedded use considering the "standard" 68k Amiga footprint compared to the footprint of many modern embedded systems. The PPC AmigaOS also did "ruin the resource-smallness of the Amiga design" as I will get into later.
The 68k AmigaOS 3 only used 54kiB of 2MiB of memory after boot which includes preemptive multitasking and a GUI. Most of the AmigaOS code is in the 512kiB Kickstart which I expect from experience is more than 90% code with some obviously read only data. The Kickstart code is executed directly from ROM without a MapROM copy to memory. The 54kiB of 2MiB consists of writable library data including jump tables, stacks for various programs and caches/buffers. The floppy drive by default uses 5 buffers of 512 bytes each for 2,560 bytes. The default stack size for AmigaOS 3 is 4kiB with some processes/tasks having a lower stack size set. At least this we can compare to some other systems like AmigaOS 4 to start.
https://www.amigans.net/modules/newbb/viewtopic.php?post_id=60027#forumpost60027
ssolie Quote:
You need to have at least 60k of stack or so for any program using a GUI. If you don't, one will be provided for you via a hidden stack swap which will slow your program down slightly. I recommend setting a stack cookie at about 80k for anything with a GUI to avoid the implicit stack swapping in Intuition.
Come to think of it, I should probably document this fact in my Modern Amiga Programming article.
It is best to never nitpick about stack size in AmigaOS. Give it plenty of room and always use a stack cookie in your programs.
|
Just the stack of one program running under AmigaOS 4 uses more memory than the whole 68k AmigaOS. The majority of the footprint is code, which is likely 40%-50% larger for PPC on average. Add the overhead for the MMU and there is a night-and-day difference in footprint. All things are relative; the following are default stack sizes for various OSs.
AmigaOS  4kiB
VxWorks  20kiB
QNX      512kiB
Windows  1024kiB
Linux    8192kiB
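As an aside, the "stack cookie" ssolie recommends above is just a magic string embedded in the executable that the system scans for when launching the program. A minimal sketch follows; the "$STACK:" convention follows AmigaOS practice, and the 80000 figure mirrors his ~80k recommendation, so treat the details as illustrative rather than definitive:

    // Hedged sketch of an AmigaOS stack cookie: the loader/shell looks for
    // a "$STACK:<bytes>" string in the binary and sizes the process stack
    // from it, so a GUI program can avoid Intuition's implicit stack swap.
    extern "C" const char stack_cookie[] = "$STACK:80000";

    int main() {
        // ... normal program body; no stack-handling code is needed here ...
        return 0;
    }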
The majority of the boot-up and idle OS footprint is likely code on a small-footprint system, but the stack memory is substantial for modern systems, and it likely increases with 64-bit due to alignment and pointers on the stack. Data structures also no doubt bloat up. Let us take a look at a 1GiB-memory system running 32-bit vs 64-bit Ubuntu.
[screenshot: 32-bit Ubuntu memory usage at idle]
vs
[screenshot: 64-bit Ubuntu memory usage at idle]
https://askubuntu.com/questions/7034/what-are-the-differences-between-32-bit-and-64-bit-and-which-should-i-choose
Most of an idle system is code, the caches/buffers are the same and the stack is likely the same, even though it could be less for 32-bit. OS structures are likely larger for the 64-bit system though. The 64-bit Ubuntu uses 103MiB more memory, or ~27% more, at idle: of the 1GiB of system memory, ~38% (~389MiB) is used with 32-bit Ubuntu and ~48% (~492MiB) with 64-bit Ubuntu. Much of the footprint difference at idle is likely due to the code density difference between x86 and x86-64, which is not true of the footprint when running applications, as we see from the same link.
[screenshots: 32-bit vs 64-bit Ubuntu memory usage with applications running]
The 64-bit Ubuntu uses more memory than the code density difference alone would predict, which makes sense as some programs, like Firefox, use a lot of data. The swap memory used by 64-bit Firefox is crazy compared to 32-bit Firefox, and the 64-bit system becomes slower than the 32-bit system due to swapping.
https://askubuntu.com/questions/7034/what-are-the-differences-between-32-bit-and-64-bit-and-which-should-i-choose
Quote:
Additionally, in my testing, a web-application written in Python used up to 60% more memory on a 64-bit machine which resulted in a test suite running in 380 secs on a 32-bit machine but taking 523 seconds on a 64-bit one (both with 1GiB of RAM). If the machines were not RAM-limited the results would likely be different (as phoronix tests show).
|
This is similar to the conclusion and recommendation for the RPi that a 32-bit OS should be used with less than 4GiB of memory.
https://smist08.wordpress.com/2022/02/04/raspberry-pi-os-goes-64-bit/
Quote:
How Does it Run?
First off, even though it will run on a Raspberry Pi 3 or even a Raspberry Pi Zero 2, I wouldn’t recommend it. A 64-bit operating system uses more memory than the corresponding 32-bit version and 1 Gig of RAM isn’t enough. To use this, you really need a Raspberry Pi 4 with at least 4 Gig of RAM. On my Raspberry Pi 4 with 8 Gig of RAM, it is noticeably peppier than the 32-bit version, especially when browsing with Chromium.
...
Why Move to 64-Bit?
The Raspberry Pi OS has been around for a while in 32-bits, the advantage is that it runs on all Raspberry Pi’s no matter how old and it runs in compact memory footprints of 512 MB or 1 Gig. It runs a vast library of open source software and has provided a great platform for millions of students and DIY’ers. However, most of the rest of the world including mainstream Linux, Windows, MacOS, Android and iOS have all been 64-bit for some time. So let’s look at some reasons to move to 64-bits.
1. Memory addressing simplifies, making life simpler for Linux.
2. In 64-bit mode, each ARM CPU has twice as many general purpose registers (32 vs 16), allowing programs to keep more variables in registers, saving loading and storing things to and from memory.
3. All new compiler optimizations are targeting the ARM 64-bit instruction set; not much work is being done on the 32-bit code generators.
4. The CPU registers are now 64-bits, so if you are doing lots of long integer arithmetic, it will now be done in one clock cycle rather than taking several under 32-bits.
All that being said, there are a couple of disadvantages, namely 64-bit tends to take more memory due to:
1. If integers are 64-bit rather than 32-bit, then it takes twice as much memory to hold each one.
2. Normal ARM instructions are 32-bits in size when running in either 32-bit or 64-bit mode. However, in 32-bit mode there is a special "thumb" mode where each instruction is only 16-bits in size. Using these can greatly reduce memory footprint, and the GCC compiler supports producing them as an option.
|
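To make the Thumb point concrete, here is a hedged sketch: the same trivial kernel built twice with a 32-bit ARM GCC, once as ARM and once as Thumb-2, then compared with the size tool. The cross-compiler name and the 25-30% figure are illustrative assumptions; exact numbers vary by compiler and code.

    // saxpy.cpp -- trivial kernel used only to compare encodings.
    // Build and compare code size:
    //   arm-linux-gnueabihf-g++ -O2 -marm   -c saxpy.cpp -o arm.o
    //   arm-linux-gnueabihf-g++ -O2 -mthumb -c saxpy.cpp -o thumb.o
    //   size arm.o thumb.o   // .text is typically ~25-30% smaller for Thumb-2
    void saxpy(float a, const float* x, float* y, int n) {
        for (int i = 0; i < n; ++i)
            y[i] = a * x[i] + y[i];
    }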
For both AArch64 and x86-64, the move to 64-bit was made not only for the address space but to increase the GP registers, reduce the number of instructions executed and reduce the memory traffic, for more performance than Thumb-2 and x86 respectively. The 32-bit 68k is already competitive in these important performance metrics, while it has the code density of Thumb-2 and room for further improvement. ARM abandoning 32-bit support for their Cortex-A cores is an opportunity for less expensive 4GiB-and-under systems that do not use the additional address space. The difference in footprint between the 68k AmigaOS and 64-bit Linux is huge. Memory is not so tight on a 68k Amiga system, because most of the memory is available after boot instead of losing nearly half a GiB, and good code density saves additional memory. So many people want to rush to 64-bit, but it practically requires 2GiB-4GiB of memory, while the 68k Amiga can do so much with 1MiB-2MiB that it is difficult to imagine what it could do with 1GiB-2GiB of memory.
The following is one last link to back up the memory footprint difference between 32-bit and 64-bit Firefox displaying the same data. To display the same tabs, 32-bit Firefox uses 287.4MiB of memory and 64-bit Firefox uses 974.4MiB, which is 239% more memory for 64-bit.
https://superuser.com/questions/1450419/firefox-64-vs-32-bit-memory-consumption
Was the additional cost of moving low-end hardware from a 32-bit footprint to a 64-bit footprint considered before abandoning 32-bit, or was the bloat already too much for 32-bit?
Hammer
Re: The (Microprocessors) Code Density Hangout
Posted on 19-Jun-2025 5:56:18 [ #269 ]
Elite Member | Joined: 9-Mar-2003 | Posts: 6500 | From: Australia

@matthey
Quote:
For both AArch64 and x86-64, the move to 64-bit was made not only for the address space but to increase the GP registers, reduce the number of instructions executed and reduce the memory traffic, for more performance than Thumb-2 and x86 respectively. The 32-bit 68k is already competitive in these important performance metrics, while it has the code density of Thumb-2 and room for further improvement. |
Current-generation mainstream game consoles, such as the Xbox Series X and PlayStation 5, have exceeded the 32-bit memory address space, with at least 16 GB of RAM.
The past-generation PS4 and Xbox One exceeded the 32-bit memory address space, with at least 8 GB of RAM.
The Nintendo Switch 2 has 12 GB of RAM.
-----------
AES extensions (e.g. AES-NI) are also important for modern content protection.
https://en.wikipedia.org/wiki/AES_instruction_set
In AES-NI Performance Analyzed, Patrick Schmid and Achim Roos found "impressive results from a handful of applications already optimized to take advantage of Intel's AES-NI capability". A performance analysis using the Crypto++ security library showed an increase in throughput from approximately 28.0 cycles per byte to 3.5 cycles per byte with AES/GCM versus a Pentium 4 with no acceleration.
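For illustration, this is what AES-NI exposes to software: one hardware instruction per AES round. A hedged sketch using the standard x86 intrinsics; the key schedule is omitted (real code derives the 11 round keys with _mm_aeskeygenassist_si128):

    // Encrypt one 16-byte block with AES-128 via AES-NI intrinsics.
    // Build: g++ -O2 -maes aesni.cpp
    #include <wmmintrin.h>   // AES-NI intrinsics

    __m128i aes128_encrypt_block(__m128i block, const __m128i rk[11]) {
        block = _mm_xor_si128(block, rk[0]);           // initial AddRoundKey
        for (int i = 1; i < 10; ++i)
            block = _mm_aesenc_si128(block, rk[i]);    // rounds 1-9 in hardware
        return _mm_aesenclast_si128(block, rk[10]);    // final round, no MixColumns
    }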
-----------
Since the PPC roadmap is still alive, embedded PPC gained Variable Length Encoding (VLE), i.e. 16-bit and 32-bit instruction lengths. This shows NXP is not returning to the 68k.
-----------
I frame my arguments to bring the Amiga platform up to mainstream game-console capability.
Hammer
Re: The (Microprocessors) Code Density Hangout
Posted on 19-Jun-2025 6:16:04 [ #270 ]
Elite Member | Joined: 9-Mar-2003 | Posts: 6500 | From: Australia

@cdimauro
Quote:
FACT: the article was about a study, and Jaguar was NOT part of that study; only Bobcat was. So, once again, you're going OUTSIDE the topic/context with the sole purpose of defending your beloved AMD (BTW, have you bought stock in it?). |
FACT: Jaguar is the direct successor to Bobcat. What you didn't get is that the Jaguar uarch is superior to the Bobcat uarch. Are you too stupid to realize that a direct successor uarch is superior to the older uarch?
FACT: Jaguar has the two game-console design wins, NOT Bobcat.
https://www.cpubenchmark.net/cpu.php?cpu=AMD+Athlon+5370+APU&id=2763
AMD Athlon 5370 4-core (Jaguar) 2.2 GHz APU, physics (Bullet Physics Library): 129 frames/sec
https://www.cpubenchmark.net/cpu.php?cpu=ARM+Cortex-A15+4+Core+2000+MHz&id=5261
ARM Cortex-A15 4-core 2 GHz, physics (Bullet Physics Library): 106 frames/sec
https://www.cpubenchmark.net/cpu_test_info.html
The Physics Test uses the Bullet Physics Engine (version 2.88 for x86, 3.07 for ARM).
This is for the 2011 context of the Xbox One's and PlayStation 4's development cycle.
-----------
https://www.tomshardware.com/video-games/console-gaming/amd-to-design-processor-for-xbox-next-team-red-extends-long-standing-microsoft-partnership
Date: June 2025. AMD to design the processor for Xbox Next; Team Red extends its long-standing Microsoft partnership.
https://videocardz.com/newz/amd-reportedly-won-contract-to-design-playstation-6-chip-outbidding-intel-and-broadcom
Date: Sep 2024. AMD reportedly won the contract to design the PlayStation 6 chip, outbidding Intel and Broadcom.
That's three game console generations. Let that sink in.
My arguments are framed within the Amiga 500's majority use case, i.e. games.
Quote:
FACT: YOU stated that the benchmarks used were "nearly useless" (YOUR words) for gaming. I asked for proof of that, and it did NOT come, because it's plainly evident that yours was a pure load of b@lls that nobody with a grain of sense could sustain. |
Are you so stupid? Even r/intel has banned UserBenchmark, e.g. https://www.reddit.com/r/intel/comments/g36a2a/userbenchmark_has_been_banned_from_rintel/
Hammer
Re: The (Microprocessors) Code Density Hangout
Posted on 19-Jun-2025 6:31:10 [ #271 ]
Elite Member | Joined: 9-Mar-2003 | Posts: 6500 | From: Australia

@cdimauro
Quote:
Another proof that you do NOT read what people write. Here's the main chart with the benchmark results:
Do you see ONLY SPEC INT there, or SOMETHING ELSE as well?!?
As has already been proven several times, you're just a PARROT who repeats the same meaningless things and posts material found by googling around, without understanding its context and, what's worse, the CONTEXT of the discussion. |
You FAILED to factor in the mixed integer/floating-point game use case.
SPEC INT and SPEC FP benchmarks are separate from each other, while the Quake benchmark is a mixed integer/floating-point game use case.
The SPEC INT and SPEC FP focus trainwrecked PPC's mixed integer/floating-point game use case, e.g. Doom 3 on PPC vs x86.
https://barefeats.com/doom3.html
From Glenda Adams, Director of Development at Aspyr Media:
PowerPC architectural differences, including a much higher penalty for float-to-int conversion on the PPC. This is a penalty on all games ported to the Mac, and can't be easily fixed. It requires re-engineering much of the game's math code to keep data in native formats more often. This isn't 'bad' coding on the PC; they don't have the performance penalty, and converting results to ints saves memory and can be faster in many algorithms on that platform. It would only be a few percentage points that could be gained on the Mac, so it's one of those optimizations that just isn't feasible to do for the speed increase.
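The penalty Adams describes is easy to picture in code. A hedged illustration follows; the instruction sequences in the comments are the classic ones, but exact costs vary by core:

    // Hedged illustration of the classic PPC float->int penalty.
    // On classic PowerPC there is no direct FPR-to-GPR move, so
    //     int q = (int)f;
    // compiles to roughly fctiwz f0,f1 / stfd f0,slot / lwz r3,slot+4:
    // a round trip through memory that stalls the pipeline.
    // On x86, cvttss2si converts register-to-register cheaply.
    inline int quantize(float f) {
        return static_cast<int>(f);   // cheap on x86, costly on classic PPC
    }

    // Game-style hot loop: one conversion per element pays the penalty
    // each time; keeping the data in float until the end avoids most of it.
    void to_fixed(const float* in, int* out, int n, float scale) {
        for (int i = 0; i < n; ++i)
            out[i] = quantize(in[i] * scale);
    }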
Have you run the Lightwave benchmark between the AC68080 and the MC68060 rev6?
Lightwave's performance difference is not like the Quake benchmark, which showcases the AC68080's advantage over the 68060 rev6.
https://eab.abime.net/showthread.php?t=113338 (Lightwave benchmark for users with PPC, 060, PiStorm and Vampire accelerators)
Apollo Ice Drake's Lightwave benchmark results weren't a major leap over the 68060 rev6: the 68080's quad instruction issue per cycle isn't matched with multiple floating-point pipelines.
Hammer
Re: The (Microprocessors) Code Density Hangout
Posted on 19-Jun-2025 6:54:47 [ #272 ]
Elite Member | Joined: 9-Mar-2003 | Posts: 6500 | From: Australia

@cdimauro
Quote:
OK, and? What's the point? |
https://www.phoronix.com/news/Intel-AVX10-Drops-256-Bit
Date: March 2025. Intel drops 256-bit-only AVX10 for E-cores; future Intel desktop CPUs will have 512-bit AVX10.2.
This is Intel's U-turn on AVX-512 support for the desktop: all future Intel platforms will support a 512-bit vector width.
Pat Gelsinger was fired in December 2024.
The GCC patches also spell it out clearly:
"In this new whitepaper, all the platforms will support 512 bit vector width (previously, E-core is up to 256 bit, leading to hybrid clients and Atom Server 256 bit only). Also, 256 bit rounding is not that useful because we currently have rounding feature directly on E-core now and no need to use 256-bit rounding as somehow a workaround. HW will remove that support.
Thus, there is no need to add avx10.x-256/512 into compiler options. A simple avx10.x supporting all vector length is all we need. The change also makes -mno-evex512 not that useful. It is introduced with avx10.1-256 for compiling 256 bit only binary on legacy platforms to have a partial trial for avx10.x-256. What we also need to do is to remove 256 bit rounding."
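For context, the vector width being argued over looks like this in code. A hedged sketch with standard AVX-512F intrinsics; whether a given core executes it at full width is exactly what Intel flip-flopped on:

    // One 512-bit FMA processes 16 floats per instruction.
    // Build: g++ -O2 -mavx512f fma16.cpp
    #include <immintrin.h>

    void fma16(const float* a, const float* b, float* c, int n) {
        for (int i = 0; i + 16 <= n; i += 16) {
            __m512 va = _mm512_loadu_ps(a + i);
            __m512 vb = _mm512_loadu_ps(b + i);
            __m512 vc = _mm512_loadu_ps(c + i);
            _mm512_storeu_ps(c + i, _mm512_fmadd_ps(va, vb, vc));  // c = a*b + c
        }
    }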
Stop defending Intel's (Pat Gelsinger's) absurd flip-flopping with AVX-512.
Hammer
Re: The (Microprocessors) Code Density Hangout
Posted on 19-Jun-2025 7:04:39 [ #273 ]
Elite Member | Joined: 9-Mar-2003 | Posts: 6500 | From: Australia

@kolla
Quote:
kolla wrote:
Pffff, no it isn't. |
The 3DMark Ice Storm Physics test uses the Bullet Physics Library.
https://benchmarks.ul.com/news/understanding-3dmark-results-from-the-apple-iphone-5s-and-ipad-air
the 3DMark Ice Storm Physics test uses the Bullet Physics Library. Bullet is an open source physics engine that is used in Grand Theft Auto V, Trials HD and many other popular games on Playstation 3, Xbox 360, Nintendo Wii, PC, Android and iPhone.
The "physics test" in PassMark CPU benchmarks also uses the Bullet Physics Engine.
matthey
Re: The (Microprocessors) Code Density Hangout
Posted on 19-Jun-2025 23:24:06 [ #274 ]
Elite Member | Joined: 14-Mar-2007 | Posts: 2744 | From: Kansas

Hammer Quote:
Current-generation mainstream game consoles, such as the Xbox Series X and PlayStation 5, have exceeded the 32-bit memory address space, with at least 16 GB of RAM.
The past-generation PS4 and Xbox One exceeded the 32-bit memory address space, with at least 8 GB of RAM.
The Nintendo Switch 2 has 12 GB of RAM. |
With 8GiB of memory or more, moving to 64-bit is worthwhile. With 4GiB or less, there is more cost for memory with minimal or no benefit from the larger 64-bit address space. With good 32-bit code density and performance, it is possible to save on the cost of memory.
32-bit 2GiB   ~= 64-bit 4GiB   (2GiB memory difference)
32-bit 1GiB   ~= 64-bit 2GiB   (1GiB memory difference)
32-bit 0.5GiB ~= 64-bit 1GiB   (0.5GiB memory difference)
RPi SBCs with 256MiB to 1GiB of memory, using ARM Thumb-2 with 68k-like code density, are what made the RPi successful. The low price was good for the hobby, embedded and educational markets. Eliminating 32-bit support and Thumb-2 means the minimum 64-bit Cortex-A SBC may have 2GiB of memory, with 4GiB likely the more popular option for better performance. ARM's decision to drop 32-bit support is forcing RPi SBCs to scale up and compete with x86-64 SBCs that have more software, better GPU support for games and more performance. The price of memory is an important factor when trying to keep the price of SBCs below $100 USD. The RPi 8GiB SBCs are at this important psychological threshold, and the 16GiB SBCs are above it and compete poorly with x86-64 hardware, while the original 32-bit SBCs started at $25 USD and cheaper RPi Zeros are available today.
If talking about a 32-bit console, it would be a microconsole. Competing in the full-sized console market, like the desktop market, requires billions of USD; for example, Microsoft lost $4 billion on the original Xbox hardware. Ideally, such a microconsole could be turned into a computer, like the CD32, to improve value, something the closed-hardware full-sized consoles refuse to allow.
Hammer Quote:
Since the PPC roadmap is still alive, embedded PPC gained Variable Length Encoding (VLE), i.e. 16-bit and 32-bit instruction lengths. This shows NXP is not returning to the 68k.
|
PPC is dead with no roadmap. PPC VLE was never popular and is not a natural extension of PPC; it is a replacement ISA with new 16-bit and 32-bit encodings which avoids a mode switch. If willing to recompile existing PPC programs, the performance potential should be better than many compressed RISC ISAs due to supporting ~16 GP registers in most instructions.
Variable-Length Encoding (VLE) Extension Programming Interface Manual
https://www.nxp.com/docs/en/supporting-information/VLEPIM.pdf
Quote:
The major objectives of the VLE extension are as follows:
• Maintain coexistence and consistency with the existing PowerPC Book E ISA and architecture
• Maintain a common programming model and instruction operation model in the VLE extension
• Reduce overall code size by 30 percent over existing PowerPC text segments
• Limit the increase in execution path length to under 10 percent for most important applications
• Limit the increase in hardware complexity for implementations containing the VLE extension
...
Book E floating-point registers are not accessible to VLE instructions. Book E GPRs and SPRs are used by VLE instructions with the following limitations:
• VLE instructions using the 16-bit formats are limited to addressing GPR0-GPR7 and GPR24-GPR31 in most instructions. Move instructions are provided to transfer register contents between these registers and GPR8-GPR23.
• VLE instructions using the 16-bit formats are limited to addressing CR0.
• VLE instructions using the 32-bit formats are limited to addressing CR0-CR3.
|
While reducing code size by 30% over PPC is not spectacular, limiting the increase in execution path length to under 10% is good. The number of instructions executed and the memory traffic should both be reduced, due to the ~16 GP registers, compared to most compressed RISC ISAs. Both ARM and x86-64 were incentivized to move to 64-bit with more GP registers than their 32-bit predecessors, thus improving performance.
arch    | ISA        | GP regs
x86     | reg-mem    | 8
Thumb   | load/store | 8
x86-64  | reg-mem    | 16
ARM     | load/store | 16
68k     | reg-mem    | 16
PPC     | load/store | 32
AArch64 | load/store | 32
Fewer register-encoding bits allow for smaller instructions and code. The x86 and Thumb ISAs have good code density but poor performance, from executing too many instructions and increased memory traffic.
Efficient Use of Invisible Registers in Thumb Code
https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.85.208&rep=rep1&type=pdf
Quote:
More than 98% of all microprocessors are used in embedded products, the most popular 32-bit processors among them being the ARM family of embedded processors. The ARM processor core is used both as a macrocell in building application specific system chips and standard processor chips. In the embedded domain, in addition to having good performance, applications must execute under constraints of limited memory. ARM supports dual width ISAs that are simple to implement and provide a tradeoff between code size and performance.
In prior work we studied the characteristics of ARM and Thumb code and showed that for some embedded applications the Thumb code size was 29.8% to 32.5% smaller than the corresponding ARM code size. However, it was also observed that there was an increase in instruction counts for Thumb code which was typically around 30%. We studied the instruction sets and then compared the Thumb and ARM code versions to identify the causes of performance loss. The reasons we identified fall into two categories:
• Global inefficiency - arises due to the fact that only half of the register file is visible to most instructions in Thumb code.
• Peephole inefficiency - arises because pairs of Thumb instructions are required to perform the same task that can be performed by individual ARM instructions.
|
Executing more instructions is bad, but there is also increased memory traffic when there are not enough registers, i.e. fewer than 16 GP registers.
No. of Regs | Program size | Load/Store | Move
27          | 100.00%      | 27.90%     | 22.58%
24          | 100.35%      | 28.21%     | 22.31%
22          | 100.51%      | 28.34%     | 22.27%
20          | 100.56%      | 28.38%     | 22.24%
18          | 100.97%      | 28.85%     | 21.93%
16          | 101.62%      | 30.22%     | 20.47%
14          | 103.49%      | 31.84%     | 19.28%
12          | 104.45%      | 34.31%     | 16.39%
10          | 109.41%      | 41.02%     | 10.96%
8           | 114.76%      | 44.45%     | 8.46%
High-Performance Extendable Instruction Set Computing
https://www.researchgate.net/publication/3888194_High-performance_extendable_instruction_set_computing
From the same article, Thumb with 8 GP registers had a 16.3% higher load/store percentage than the EISC ISA with 16 GP registers.
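A hedged illustration of why the register count drives the load/store percentages above (a hand-written kernel; the spill behavior described in the comments is typical compiler behavior, not mandated):

    // This loop keeps roughly ten values live at once: four accumulators,
    // two pointers, the bound, the index and a couple of temporaries.
    // With 16 GP registers everything fits; with only 8, the compiler must
    // spill some accumulators to the stack, adding exactly the kind of
    // extra loads and stores the table above shows growing as registers shrink.
    void dot4(const int* a, const int* b, int n, int out[4]) {
        int s0 = 0, s1 = 0, s2 = 0, s3 = 0;
        for (int i = 0; i + 4 <= n; i += 4) {
            s0 += a[i]     * b[i];
            s1 += a[i + 1] * b[i + 1];
            s2 += a[i + 2] * b[i + 2];
            s3 += a[i + 3] * b[i + 3];
        }
        out[0] = s0; out[1] = s1; out[2] = s2; out[3] = s3;
    }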
The x86 ISA was also handicapped by not having enough GP registers, resulting in too many instructions executed and too much memory traffic, both of which were reduced with x86-64 and 16 GP registers. However, x86-64 code density is not as good and the memory footprint increased.
Performance Characterization of SPEC CPU2006 Integer Benchmarks on x86-64 Architecture
https://ece.northeastern.edu/groups/nucar/publications/SWC06.pdf
Quote:
Fig. 3. 64-bit binary is larger than 32-bit binary by the amount of 21% on average for the CPU2006 integer benchmarks.
...
Fig. 4. 64-bit mode execution has a larger runtime memory footprint than 32-bit mode by 25.1% on average for the CPU2006 integer benchmarks.
...
Fig. 5. The number of instructions dynamically executed is 12% less in 64-bit mode than in 32-bit mode.
...
Fig. 6. The IPC observed in 64-bit mode decreases by 7.8% on average versus that in 32-bit mode across the CPU2006 integer benchmarks.
...
Fig. 8. The instruction cache request rate observed in 64-bit mode increases by 14% on average versus that in 32-bit mode across the CPU2006 integer benchmarks.
...
The data cache request rate observed in 64-bit mode decreases by 28% on average versus that in 32-bit mode across the CPU2006 integer benchmarks.
...
Fig. 11. The data cache miss rate observed in 64-bit mode increases significantly (nearly 40% on average) versus that in 32-bit mode. Note that the Y axis is the number of misses per 1,000 retired instructions.
...
Fig. 12. The memory controller utilization observed in 64-bit mode is nearly 20% higher than that in 32-bit mode. Note that the maximum available memory controller bandwidth on the experimental system was 6.4GB/s and the memory controller utilization was calculated by dividing the observed memory bandwidth by this maximum bandwidth.
|
The x86-64 64-bit mode provides a 7% performance increase over 32-bit mode on average for SPEC CPU2006 integer benchmarks, and a 0.46% increase for SPEC CPU2000 integer benchmarks. Most of the performance gain is due to the upgrade from 8 to 16 GP integer registers, reducing the number of instructions executed and the data memory traffic. Most other 64-bit performance characteristics were worse.
The 32-bit 68k is in good shape with 16 GP integer registers. It does not suffer from executing too many instructions or excessive memory traffic in Vince Weaver's code density comparison.
https://docs.google.com/spreadsheets/u/0/d/e/2PACX-1vTyfDPXIN6i4thorNXak5hlP0FQpqpZFk2sgauXgYZPdtJX7FvVgabfbpCtHTkp5Yo9ai6MhiQqhgyG/pubhtml?gid=909588979&single=true&pli=1
The 68k is #1 in code density and #2 in instruction count. Also, reg-mem (CISC) architectures do not need as many GP registers as load/store (RISC) architectures: 68k code using 16 GP registers likely executes fewer instructions than most code for load/store architectures with 32 GP registers. The 32-bit 68k has very good performance traits.
kolla
Re: The (Microprocessors) Code Density Hangout
Posted on 20-Jun-2025 16:09:10 [ #275 ]
Elite Member | Joined: 20-Aug-2003 | Posts: 3473 | From: Trondheim, Norway

@Hammer
Quote:
Hammer wrote: @kolla
Quote:
kolla wrote:
Pffff, no it isn't. |
The 3DMark Ice Storm Physics test uses the Bullet Physics Library.
https://benchmarks.ul.com/news/understanding-3dmark-results-from-the-apple-iphone-5s-and-ipad-air
the 3DMark Ice Storm Physics test uses the Bullet Physics Library. Bullet is an open source physics engine that is used in Grand Theft Auto V, Trials HD and many other popular games on Playstation 3, Xbox 360, Nintendo Wii, PC, Android and iPhone.
The "physics test" in PassMark CPU benchmarks also uses the Bullet Physics Engine. |
What I was "Pffff"-ing was your claim that...
Quote:
I cited a gaming benchmark (3DMark Ice Storm Physics); gaming is Amiga's core audience. |
Are you seriously arguing that 3D gaming is what the core audience is buying and using Amigas for? If so... how dumb are they?!
matthey
Re: The (Microprocessors) Code Density Hangout
Posted on 20-Jun-2025 23:57:10 [ #276 ]
Elite Member | Joined: 14-Mar-2007 | Posts: 2744 | From: Kansas

kolla Quote:
What I was "Pffff"-ing was your claim that ... Hammer Quote:
a gaming benchmark (3D Mark Ice Storm Physics) which is Amiga's core audience.
|
Are you seriously arguing that 3D gaming is what the core audience is buying and using Amiga for? If so... how dumb are they?!
|
That gets back to what an Amiga has become and should become.
Amiga (68k CPU + Amiga chipset + AmigaOS)
AmigaNOne (PPC CPU + commodity chipset + AmigaOS)
Amiga PiStorm (ARM CPU + emulator + Amiga chipset + AmigaOS)
AmiBench/A600GS (ARM CPU + emulator + embedded chipset + AROS)
Commodore Amiga (x86-64 CPU + commodity chipset + Commodore OS Vision)
AmigaNOne gaming is mostly 3D gaming. The improved performance from ARM CPUs is being used for 3D games, with a Warp3D driver being developed and Hyperion porting 3D games for ARMigas. My Retro Computer Ltd, the guys trying to buy the Commodore brand, would likely have replaced everything but the Amiga brand with x86-64 hardware capable of 3D, and this was not far from happening.
https://storage.courtlistener.com/recap/gov.uscourts.wawd.256770/gov.uscourts.wawd.256770.7.0.pdf
Quote:
81. Failure by the Amiga Parties to maintain rights in the Licensed Marks, leading to the uncertainty caused by Cloanto and Hyperion’s competing applications, has already invited another entity unknown to Plaintiff, namely My Retro Computer LTD, listed in USPTO records as a limited partnership existing under the laws of the United Kingdom with an address of Unit A1, Pegham Ind Park, Fareham, United Kingdom PO156SD, to file U.S. Application Ser. No. 87/752,895 on January 12, 2018, seeking to register AMIGA, in connection with:
|
The Amiga almost became an x86-64 PC capable of playing modern 3D games. The pioneer in graphics and multimedia that the Amiga was, was nearly fully assimilated back into the modern hardware it influenced. The 68k Amiga apparently has nothing to offer, even though the memory footprint of these systems has increased as more and more of the 68k Amiga was replaced.
https://www.amigans.net/modules/newbb/viewtopic.php?post_id=152300#forumpost152300
joerg Quote:
Compared to real PPC hardware like the X1000 and X5000, or even just a Sam4x0, running AmigaOS 4.x with QEMU on a Raspberry Pi is unusably slow. However, it's a replacement board for the A1200, and even a QEMU-emulated AmigaOne XE/PPC running AmigaOS 4.x on it should be faster than classic Amiga AmigaOS 4.x on real hardware (A1200 with BlizzardPPC), and with 4GB you have enough RAM for running AmigaOS 4.x software, compared to the BlizzardPPC with max. 256 MB, which is way too few. Additionally, the RAM access speed on the BlizzardPPC is extremely slow, which was the main reason AmigaOS 4.x was unusable on the A1200/BPPC.
The A600GS makes much more sense.
|
PPC AmigaNOne already has "enough RAM for running AmigaOS 4.x software" at 4GB, "compared to the BlizzardPPC with max. 256 MB which is way too few." The A600GS has 2GiB of memory to provide AmiBench with 512MiB, and the A600GS+ has 4GiB of memory to provide AmiBench with 1GiB. My Retro Computer Ltd offers 4GiB and 8GiB versions of the Commodore 64x using x86-64 hardware and, with the Amiga brand, would likely offer the same for the Amiga, considering it would be the same hardware and software in a different case. If assimilation would provide Amiga (and Commodore) unification on cheaper hardware, then maybe it would be worth considering, but it appears that divergence of Amiga compatibility and development on expensive, bloated hardware is where the Amiga is headed. I guess elegant, small-footprint hardware was destined to become extinct, and perhaps I am a dinosaur on the verge of extinction with it.
[image: Borg cube]
https://searchengineland.com/meet-the-new-borg-google-facebook-apple-54969
kolla
Re: The (Microprocessors) Code Density Hangout
Posted on 21-Jun-2025 3:54:44 [ #277 ]
Elite Member | Joined: 20-Aug-2003 | Posts: 3473 | From: Trondheim, Norway

@matthey
Quote:
AmiBench/A600GS (ARM CPU + emulator + embedded chipset + AROS)
|
What "embedded chipset"? It's Amiberry, a variant of UAE, emulating both 68k and the Amiga chipset, running AROS/m68k... nothing "3D" in there, so what?
Quote:
Commodore Amiga (x86-64 CPU + commodity chipset + Commodore OS Vision)
|
Commodore OS Vision being a common Linux distro with a certain "theme" for GNOME?
Quote:
AmigaNOne gaming is mostly 3D gaming.
|
Pfff... the main "game" on AmigaOne is the tinkering and shopping required to get all the components needed to even launch any of the handful of 3D games exists, and then show off screenshots with less impressive frames-per-second counts.
Quote:
The improved performance from ARM CPUs is being used for 3D games, with a Warp3D driver being developed and Hyperion porting 3D games for ARMigas. |
First... ARMiga is a very specific thing, you know? And secondly... no? Yes, there's work going on to make not a "Warp3D driver" but rather a Warp3D equivalent for Emu68 on the Raspberry Pi, but that has _nothing_ to do with "ARMiga", and certainly not much to do with Hyperion "porting 3D games", other than a certain developer being a Hyperion fanboy of sorts.
Even if that had happened, it would not imply that Amiga's core audience is in this for "3D gaming", because that is simply not true: the core audience is in it for the more than 2600 legacy games and close to 1000 demos one can run through WHDLoad. It is nice that some "2.5D" game classics like Doom and Quake have also become available with faster CPUs and faster RTG, but actual 3D games are extremely few on the Amiga, and it's all quite cumbersome and convoluted. Anyone who's really into what you could call "3D gaming" is certainly _not_ seeking out any sort of Amiga for it.
As for the A600GS/A1200NG: though the Orange Pi Zero 3 has a Mali GPU, it is not exposed from inside Amiberry. However, work has been ongoing to use OpenGL (which Mali supports) to render the emulation screen (via SDL) and add CRT-like effects such as fake scanlines, etc. However, as stated on the Amiberry build page... "Currently not fully implemented! Do not enable!" - https://github.com/BlitterStudio/amiberry/wiki/Compile-from-source
matthey
Re: The (Microprocessors) Code Density Hangout
Posted on 21-Jun-2025 13:44:32 [ #278 ]
Elite Member | Joined: 14-Mar-2007 | Posts: 2744 | From: Kansas

kolla Quote:
What "embedded chipset"? It's Amiberry, a variant of UAE, emulating both 68k and the Amiga chipset, running AROS/m68k... nothing "3D" in there, so what?
|
The A600GS ARM SoC has a non-standard embedded chipset. ARM has standardized the 64-bit AArch64 CPU ISA, but the chipset is not standard. Maybe they are trying to standardize on an ARM GPU too, but the rest of the SoC is still a la carte. An Orange Pi chipset is not compatible with an RPi chipset, which is not compatible with an Apple chipset, and the GPU is different on all of them for now. They are all ARM hardware, but less compatible than x86(-64) hardware.
kolla Quote:
Commodore OS Vision being a common Linux distro with a certain "theme" for GNOME? |
Commodore OS Vision is what Commodore would have created with their amazing "vision" of the future, according to Peri. It is an officially branded "Commodore" OS, which is all that should matter for full acceptance by every Commodore and Commodore Amiga customer and user.
kolla Quote:
Pfff... the main "game" on AmigaOne is the tinkering and shopping required to get all the components needed to even launch any of the handful of 3D games exists, and then show off screenshots with less impressive frames-per-second counts.
|
But AmigaOne is officially "AmigaOne" branded and uses officially branded "AmigaOS 4". 3D support and games are their strength. The 68k Amiga has retro games, PPC MorphOS has productivity software and PPC AmigaOS 4 has 3D. It is irrelevant that "tinkering and shopping" are required or that the total price is double that of competitors while the performance half. It is not the pain to setup or the pain in the pocket book but the "Amiga" brand that is important to customers. Until My Retro Computer Ltd obtains the Amiga brand and uses it for their x86-64 Linux systems, it is the best 3D the Amiga has.
kolla Quote:
First... ARMiga is a very specific thing, you know? And secondly... no? Yes, there's work going on to make not a "Warp3D driver" but rather a Warp3D equivalent for Emu68 on the Raspberry Pi, but that has _nothing_ to do with "ARMiga", and certainly not much to do with Hyperion "porting 3D games", other than a certain developer being a Hyperion fanboy of sorts. |
OK, a Warp3D replacement to go with the ARM replacement of the 68k, then. You are right: only the ARMiga has that official branding, and official Amiga branding it is not, as it is off by one letter.
kolla Quote:
Even if that had happened, it would not imply that Amiga's core audience is in this for "3D gaming", because that is simply not true: the core audience is in it for the more than 2600 legacy games and close to 1000 demos one can run through WHDLoad. It is nice that some "2.5D" game classics like Doom and Quake have also become available with faster CPUs and faster RTG, but actual 3D games are extremely few on the Amiga, and it's all quite cumbersome and convoluted. Anyone who's really into what you could call "3D gaming" is certainly _not_ seeking out any sort of Amiga for it.
As for the A600GS/A1200NG: though the Orange Pi Zero 3 has a Mali GPU, it is not exposed from inside Amiberry. However, work has been ongoing to use OpenGL (which Mali supports) to render the emulation screen (via SDL) and add CRT-like effects such as fake scanlines, etc. However, as stated on the Amiberry build page... "Currently not fully implemented! Do not enable!" - https://github.com/BlitterStudio/amiberry/wiki/Compile-from-source |
If the "more than 2600 legacy games and close to 1000 demos" are what is important for the Amiga then why is it the high performance Amiga hardware replacements that are selling instead of, for example, a $45 USD FleaFPGA Ohm? The extra performance is not needed for most of the Amiga retro games so is the performance desired for 3D, and if not, what else? If Amiga fans want 3D and are willing to replace everything Amiga but the retro games, is Commodore OS Vision on x86-64 hardware in an Amiga case the best solution?
cdimauro
Re: The (Microprocessors) Code Density Hangout
Posted on 22-Jun-2025 6:14:51 [ #279 ]
Elite Member | Joined: 29-Oct-2012 | Posts: 4431 | From: Germany

@matthey
Quote:
matthey wrote: This is the resurrection of an old thread. |
Which is very welcome.
Quote:
I would like to talk about the relation between code density and memory footprint. As a review, I am re-posting the code density chart for 32-bit and 64-bit ISAs of a large compiled executable from "Code Density Compared Between Way Too Many Instruction Sets", found in post #5 of this thread.
[charts: code density comparison of 32-bit and 64-bit ISAs]
The 68k does not have a particularly good result here, which may be due to compiling for the 68040 (the worst 68020+ target for code density), to stack frames being enabled (many newer architectures have them off by default), to ancient 68k compiler support, etc. Some other code density benchmark results have the 68k competing with Thumb-2 in code density. |
I agree. Those results are a big surprise (especially the ones for the ARC architecture) and they should be taken with a grain of salt.
What I don't like is that the guy hasn't shared all the compilation options he used, because some were tweaked, by his own admission. So, it's difficult to reproduce the results or make changes (like removing frames) for a fair comparison, or to try different compilers, like LLVM.
Quote:
These charts still give a rough idea of code density, which feeds into the memory footprint considered below.
The memory footprint of a system includes the memory used for code, data, stack and caches/buffers. |
In reality, the above information only takes into account code and data (plus a small part for the executable header, which we might roughly assume is the same): BSS, stack and caches/buffers are not considered.
Which is OK, because this is roughly what's needed when talking about code density (ideally it would count only the text and read-only data sections/segments).
Quote:
There is the boot-up and idle footprint before additional programs are started, which determines how much memory is available for programs, and there is the footprint when executing programs, which is just as important. Surprisingly, I have not found many papers on this important topic. Most of the info I found comes from users posting data and results.
I will start by looking at the 68k AmigaOS footprint. I personally know the engineer Steve Shireman, who worked on Amiga embedded systems.
https://kgsvr.net/andrew/amiga/amiga.diffnt.html#efficient
Quote:
AmigaOS is Efficient
From: Shireman, Steve
I have run control software on the Amiga, booting off a battery-backed SRAM PCMCIA card without a hard drive or floppy, using only 4K of the PCMCIA card to boot. Think of the PCMCIA card as replacing the hard drive in a desktop system. The only RAM overhead was about 54K, and with this I have the full color model and mouse control, and fully preemptive multitasking. Of the 2 Meg of RAM that comes with the A1200, the Amiga OS has needed less than 1/10,000 of the RAM available. And I know it is using a few of the OO objects in the Kickstart, but not very many.
Of course, the same thing can be done on an A600, which is even cheaper, or custom boards.
It would be nice for OEM's to be able to license Kickstart (remove parts they don't want), and link application code, and plug a Flash chip into the same socket where Kickstart goes.
Envoy, the network software also has tiny requirements. I have booted from a floppy on an A500 with Envoy and served files to the network with it.
The benefit of the Soft Machine Architecture is that it gives an embedded designer the chance to only use the parts of the Amiga OS that they need. Exec has the OpenLibrary() function, which lets the user or application designer for Amiga systems decide exactly what libraries to open after that point. It is very nice to have that much control of the system, without mucking with the source code of the microkernel.
I believe that the current design of the Amiga Exec is much better suited for Consumer Electronics than WindowsCE. This goes also for HPC's or PDA. (Personal Digital Amiga, wouldn't that be cool with a video out. With AAA chips it could have video in as well, and not eat batteries, but now I am dreaming...)
I hope future 'improvements' if and when they occur do not ruin the resource-smallness of the Amiga design.
|
The "resource-smallness" Steve refers to is the footprint. It is funny that he talks about removing modules from the Kickstart for custom embedded use considering the "standard" 68k Amiga footprint compared to the footprint of many modern embedded systems. The PPC AmigaOS also did "ruin the resource-smallness of the Amiga design" as I will get into later.
The 68k AmigaOS 3 only used 54kiB of 2MiB of memory after boot which includes preemptive multitasking and a GUI. Most of the AmigaOS code is in the 512kiB Kickstart which I expect from experience is more than 90% code with some obviously read only data. The Kickstart code is executed directly from ROM without a MapROM copy to memory. The 54kiB of 2MiB consists of writable library data including jump tables, stacks for various programs and caches/buffers. The floppy drive by default uses 5 buffers of 512 bytes each for 2,560 bytes. The default stack size for AmigaOS 3 is 4kiB with some processes/tasks having a lower stack size set. |
Indeed. The Amiga OS is very very efficient when talking about memory consumption, and we know it very well. Quote:
At least this we can compare to some other systems like AmigaOS 4 to start.
https://www.amigans.net/modules/newbb/viewtopic.php?post_id=60027#forumpost60027 ssolie Quote:
You need to have at least 60k of stack or so for any program using a GUI. If you don't, one will be provided for you via a hidden stack swap which will slow your program down slightly. I recommend setting a stack cookie at about 80k for anything with a GUI to avoid the implicit stack swapping in Intuition.
Come to think of it, I should probably document this fact in my Modern Amiga Programming article.
It is best to never nitpick about stack size in AmigaOS. Give it plenty of room and always use a stack cookie in your programs.
|
Just the stack of one program running under AmigaOS 4 uses more memory than the whole 68k AmigaOS. |
That's something which I don't get: with many more registers available on PowerPC, LESS stack storage should be required compared to equivalent 68k applications.
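A crude way to get a feel for the per-call stack cost of an ABI is the probe below (a sketch only: it subtracts pointers belonging to different frames, which is technically undefined behavior, so build it with -O0 and treat the numbers as indicative):

#include <stdio.h>
#include <stddef.h>

/* Prints the distance between a local variable in successive
   recursion frames, i.e. the approximate size of one stack frame. */
static void probe(int depth, char *prev)
{
    char local;
    if (prev)
        printf("frame %d: %td bytes\n", depth, (ptrdiff_t)(prev - &local));
    if (depth < 4)
        probe(depth + 1, &local);
}

int main(void)
{
    probe(0, NULL);
    return 0;
}

Comparing the output of a 68k build and a PowerPC build would show how much of the difference comes from the ABI's fixed per-frame overhead rather than from the application itself.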
Unless there's something in the ABI which requires additional storage (like some memory area of fixed/minimum size allocated on the stack). Quote:
The majority of the footprint is code which is likely 40%-50% larger for PPC on average. Then there is the overhead for the MMU and there is a night and day difference in footprint. All things are relative and the following are some default stack sizes for other OSs.
AmigaOS 4 kiB
VxWorks 20 kiB
QNX 512 kiB
Windows 1024 kiB
Linux 8192 kiB
The majority of the boot-up and idle OS footprint is likely code for a small footprint system, but the stack memory is substantial for modern systems and likely increases with 64-bit due to alignment and pointers on the stack. Data structures also no doubt bloat up. |
MMUs affect the overall memory footprint, but certainly not the stack usage.
If the default stack is 4KB on the Amiga OS and AmigaOS 4 sets a 4KB memory page granularity on PowerPC, then there's absolutely no difference in the memory required for the stack.
On 64-bit systems the stack likely needs to be raised to 8KB because of the doubled register/pointer sizes and/or to keep the stack 64-bit aligned, but here we're talking about 32-bit applications.
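The data side is easy to illustrate (a minimal sketch, assuming typical ILP32 vs LP64 ABIs):

#include <stdio.h>

/* Pointers and longs double from 4 to 8 bytes under LP64, and
   alignment padding grows with them. */
struct node {
    struct node *next;   /* 4 bytes on ILP32, 8 on LP64 */
    long length;         /* 4 bytes on ILP32, 8 on LP64 */
    int id;              /* 4 bytes on both, padded to 8 under LP64 */
};

int main(void)
{
    printf("sizeof(void *)      = %zu\n", sizeof(void *));
    printf("sizeof(struct node) = %zu\n", sizeof(struct node));
    /* typically 12 with gcc -m32 and 24 with gcc -m64 */
    return 0;
}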
BTW, the Windows and, especially, Linux requirements are so high. Very very strange... Quote:
Let us take a look at a 1GiB memory Ubuntu system with 32-bit vs 64-bit Ubuntu OS.
32-bit Ubuntu idle vs 64-bit Ubuntu idle: [screenshots of memory usage]
https://askubuntu.com/questions/7034/what-are-the-differences-between-32-bit-and-64-bit-and-which-should-i-choose
Most of an idle system is code, the caches/buffers are the same and the stack is likely the same even though it could be less for 32-bit. OS structures are likely larger for the 64-bit system though. The 64-bit Ubuntu uses 103MiB more memory, or ~27% more, at idle. Of the 1GiB of system memory, ~38% is used with 32-bit Ubuntu and ~48% with 64-bit Ubuntu. Much of the footprint difference at idle is likely due to the code density difference between x86 and x86-64 |
And the default stack size too, I would say, because the pure code size increase when going from x86 to x86-64 is around 20-25%. Quote:
which is not true for the footprint when using applications as we see from the same link.
[screenshots: 32-bit vs 64-bit Ubuntu memory usage with applications running]
The 64-bit Ubuntu uses more memory than the code density difference alone would explain, which makes sense as some programs use a lot of data, like Firefox. The swap memory used for 64-bit Firefox is crazy compared to the 32-bit Firefox. The 64-bit system becomes slower than the 32-bit system due to swap. |
Right. There's a consistent increase in memory usage on 64-bit systems. This is shown even more clearly by the results for the x32 system (the "32/64" entries in the chart), since that ABI uses 32-bit pointers (except for pushes & calls, if I recall correctly, where 64-bit values are used).
That's interesting, because it makes a stronger case for 64-bit architectures which support a 32-bit (or even smaller) pointer size.
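The x32 case is easy to try (assuming a gcc toolchain with the x32 multilib installed):

#include <stdio.h>

/* Same source, three ABIs:
     gcc -m64  p.c -> 8-byte pointers, 64-bit registers
     gcc -mx32 p.c -> 4-byte pointers, 64-bit registers (the x32 ABI)
     gcc -m32  p.c -> 4-byte pointers, classic i386 */
int main(void)
{
    printf("pointer size: %zu bytes\n", sizeof(void *));
    return 0;
}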
https://askubuntu.com/questions/7034/what-are-the-differences-between-32-bit-and-64-bit-and-which-should-i-choose Quote:
Additionally, in my testing, a web-application written in Python used up to 60% more memory on a 64-bit machine which resulted in a test suite running in 380 secs on a 32-bit machine but taking 523 seconds on a 64-bit one (both with 1GiB of RAM). If the machines were not RAM-limited the results would likely be different (as phoronix tests show).
|
|
Strange. Doubling the pointer size shouldn't lead to such results. There must be some other factor influencing them. Quote:
This is similar to the conclusion and recommendation for the RPi that a 32-bit OS should be used with less than 4GiB of memory.
https://smist08.wordpress.com/2022/02/04/raspberry-pi-os-goes-64-bit/ Quote:
How Does it Run?
First off, even though it will run on a Raspberry Pi 3 or even a Raspberry Pi Zero 2, I wouldn’t recommend it. A 64-bit operating system uses more memory than the corresponding 32-bit version and 1 Gig of RAM isn’t enough. To use this, you really need a Raspberry Pi 4 with at least 4 Gig of RAM. On my Raspberry Pi 4 with 8 Gig of RAM, it is noticeably peppier than the 32-bit version, especially when browsing with Chromium.
...
Why Move to 64-Bit?
The Raspberry Pi OS has been around for a while in 32-bits, the advantage is that it runs on all Raspberry Pi’s no matter how old and it runs in compact memory footprints of 512 MB or 1 Gig. It runs a vast library of open source software and has provided a great platform for millions of students and DIY’ers. However, most of the rest of the world including mainstream Linux, Windows, MacOS, Android and iOS have all been 64-bit for some time. So let’s look at some reasons to move to 64-bits.
1. Memory addressing simplifies, making life simpler for Linux.
2. In 64-bit mode, each ARM CPU has twice as many general purpose registers (32 vs 16) allowing programs to keep more variables in registers, saving loading and storing things to and from memory.
3. All new compiler optimizations are targeting the ARM 64-bit instruction set, not much work is being done on the 32-bit code generators.
4. The CPU registers are now 64-bits, so if you are doing lots of long integer arithmetic, it will now be done in one clock cycle rather than taking several under 32-bits.
All that being said, there are a couple of disadvantages, namely 64-bit tends to take more memory due to:
1. If integers are 64-bit, rather than 32-bit then it takes twice as much memory to hold each one.
2. Normal ARM instructions are 32-bits in size when running in either 32-bit or 64-bit mode. However in 32-bit mode there is a special “thumb” mode where each instruction is only 16-bits in size. Using these can greatly reduce memory footprint and the GCC compiler supports producing these as an option.
|
For both AArch64 and x86-64, the move to 64-bit was made not only for the address space but also to increase the number of GP registers, reduce the number of instructions executed and reduce memory traffic, for more performance than Thumb2 and x86 respectively. The 32-bit 68k is already competitive in these important performance metrics while it has the code density of Thumb-2 and room for further improvement. ARM abandoning 32-bit support for their Cortex-A cores is an opportunity for less expensive 4GiB-and-under systems that do not use the additional address space. The difference in footprint between the 68k AmigaOS and 64-bit Linux is huge. Memory is not so tight on a 68k Amiga system because most of the memory is available after boot instead of losing nearly half a GiB, and good code density saves additional memory. So many people want to rush to 64-bit, but it practically requires 2GiB-4GiB of memory, while the 68k Amiga can do so much with 1MiB-2MiB that it is difficult to imagine what it could do with 1GiB-2GiB of memory. |
Correct, but the Amiga OS is not using an MMU, and the granularity of the memory allocator is just 8 bytes (if I recall correctly), which makes a huge difference compared to any other OS (not only Linux) which is using an MMU.
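A minimal sketch using the documented exec.library calls (the 8-byte rounding is my recollection, as said):

#include <exec/types.h>
#include <exec/memory.h>
#include <proto/exec.h>

/* AllocMem() rounds the request up to the allocator granularity
   (8 bytes, if I recall correctly), so even a 1-byte allocation
   costs only one 8-byte chunk, far finer than the 4KiB page
   granularity of an MMU-based allocator. */
void example(void)
{
    APTR p = AllocMem(1, MEMF_ANY | MEMF_CLEAR);
    if (p)
        FreeMem(p, 1);   /* the size must match the original request */
}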
If you want to compare the Amiga OS and Linux, then probably some embedded version (which is not using the MMU) should be considered.
Besides that, here you're just talking about memory consumption and no other factor (e.g.: system robustness, security, resource tracking, multi-user, ... SMP), which basically restricts the discussion to the embedded market. Quote:
It's really too much even if we consider doubling all data usage due to the switch from 32 to 64-bit: there must be something else poisoning the results. Quote:
Was the additional cost of moving low end hardware from a 32-bit footprint to a 64-bit footprint considered before abandoning 32-bit or was the bloat already too much for 32-bit? |
I agree. To me it doesn't make sense to switch everything to 64-bit, and the RPi situation is a clear example of hasty decisions made without taking all factors into account.
If you have 8 (or even 16) GB, requiring all applications to be 64-bit just wastes memory for no reason.
Even considering browsers, which are resource hogs nowadays, a single process (a tab in the browser) doesn't need a 64-bit address space. I'm pretty sure that even the most complex web page/application can be fine using at most 4GB of space (I mean: max 4GB per each opened tab).
That's the reason why I advocate 64-bit systems which support a 32-bit ("medium") model for almost all applications, leaving the 64-bit ("large") model only for applications which really need to handle/access more than 4GB. |
| Status: Offline |
| | cdimauro
|  |
Re: The (Microprocessors) Code Density Hangout Posted on 22-Jun-2025 6:45:47
| | [ #280 ] |
| |
 |
Elite Member  |
Joined: 29-Oct-2012 Posts: 4431
From: Germany | | |
|
| @Hammer
Quote:
Hammer wrote: @matthey
Quote:
For both AArch64 and x86-64, they moved to 64-bit not only for the address space but to increase the GP registers, reduce the number of instructions executed and reduce the memory traffic for more performance than Thumb2 and x86 respectively. The 32-bit 68k is already competitive in these important performance metrics while it has the code density of Thumb-2 and room for further improvement.
|
Current generation mainstream embedded desktop game consoles, such as the Xbox Series X and PlayStation 5, have exceeded the 32-bit memory address space, e.g. with at least 16 GB RAM.
The past generation PS4 and Xbox One had already exceeded the 32-bit memory address space, e.g. with at least 8 GB RAM.
Nintendo Switch 2 has 12 GB of RAM. |
Which does NOT mean that a 64-bit architecture is necessarily required for them.
Contrary to common belief, games don't require a 64-bit architecture merely because their assets need more than 4GB of memory.
When talking about architecture here, I mean both the CPU and the GPU architectures.
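A minimal sketch of the idea (assuming a POSIX toolchain; assets.pak is a hypothetical file name): a 32-bit process can stream assets from a file far larger than 4GB, because file offsets are 64-bit even when pointers are 32-bit:

#define _FILE_OFFSET_BITS 64   /* 64-bit off_t even in a 32-bit build */
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    /* Only the working buffer must fit in the 32-bit address space;
       the asset file itself can be much larger than 4GB. */
    static unsigned char window[64 * 1024];
    FILE *f = fopen("assets.pak", "rb");
    uint64_t offset = 6ULL << 30;   /* read from 6GiB into the file */

    if (!f)
        return 1;
    if (fseeko(f, (off_t)offset, SEEK_SET) == 0)
        fread(window, 1, sizeof window, f);
    fclose(f);
    return 0;
}

The same windowing applies to textures and other assets: the CPU and GPU work on the slice that is currently loaded, not on the whole data set at once.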
I leave a rigorous proof as a school exercise, since it's trivial. Quote:
-----------
AES extensions (e.g. AES-NI) are also important for modern content protection. https://en.wikipedia.org/wiki/AES_instruction_set
In AES-NI Performance Analyzed, Patrick Schmid and Achim Roos found "impressive results from a handful of applications already optimized to take advantage of Intel's AES-NI capability". A performance analysis using the Crypto++ security library showed an increase in throughput from approximately 28.0 cycles per byte to 3.5 cycles per byte with AES/GCM versus a Pentium 4 with no acceleration.
----------- |
Totally irrelevant for this part of the discussion. Quote:
Since the PPC road map is still alive, |
Where? It's been dead for around fifteen years.
The only road map which is still alive is the POWER one, but it's NOT gaining any (new) traction.
In fact, those processors survive only because of some big contracts, which is exactly the same situation as the Z architecture.
Anyway, there's no future for them beyond the niches where they are operating. Quote:
embedded PPC gained Variable Length Encoding (VLE) i.e. 16-bit and 32-bit instruction lengths. |
First of all, which SoCs (if any) actually used it?
Second, are there any benchmarks available for it? I'd like to see how it compares to other architectures. Quote:
This shows NXP is not returning to 68K. |
Nor to PowerPC: they already moved to ARM around fifteen years ago, as I've said. And so have almost all processor/SoC producers.
In recent years RISC-V has gained traction and market share, but certainly NOT PowerPC (which is dead) nor POWER (see above). Quote:
----------- I frame my arguments to bring the Amiga platform into mainstream game console capability. |
Nonsense: there's no chance. |
| Status: Offline |
| |
|
|
|