Your support is needed and is appreciated as Amigaworld.net is primarily dependent upon the support of its users.
Poster | Thread | Hammer
 |  |
Re: Finally a great and useful PPC port! Posted on 14-Mar-2025 2:09:28
| | [ #21 ] |
| |
 |
Elite Member  |
Joined: 9-Mar-2003 Posts: 6320
From: Australia | | |
|
| @Lou
Quote:
A WiiU is a triple-core 1125 Mhz cpu with 2GB of RAM and Radeon 4670-ish gpu that runs 2 screens and has wi-fi and bluetooth. |
Wii U's GPU has 160 shader cores, so it is not Radeon HD 4670 class. Its core count is closer to the OEM-only Radeon HD 7450, which also has 160 cores.
Radeon HD 4670 has 320 shader cores.
My 2008 era Core 2 Sony laptop has a mobility Radeon HD 4650 with 320 shader cores.
From https://www.techpowerup.com/gpu-specs/wii-u-gpu.c1903 Wii U's GPU is a TeraScale 2 with 160 cores.
Wii U's Espresso CPU has 3 Broadway-based PPC cores at 1.24 GHz.
Wii U tried to be a mobile handheld and failed, hence it was replaced by the Switch's Tegra X1 SoC that contains four ARM Cortex A57 cores and NVIDIA's Maxwell 2.5 based 256 core GPU.
Switch's Tegra X1 SoC contains four ARM Cortex A57 cores (1 core reserved for the OS) and four disabled ARM Cortex-A53 cores on the silicon.
Lessons were learnt by the time AMD GPUs returned to handheld devices, from Samsung's Exynos SoC mobile phones to Steam Deck clones.
_________________ Amiga 1200 (rev 1D1, KS 3.2, PiStorm32/RPi CM4/Emu68) Amiga 500 (rev 6A, ECS, KS 3.2, PiStorm/RPi 4B/Emu68) Ryzen 9 7950X, DDR5-6000 64 GB RAM, GeForce RTX 4080 16 GB |
| Status: Offline |
| | agami
|  |
Re: Finally a great and useful PPC port! Posted on 14-Mar-2025 3:37:01
| | [ #22 ] |
| |
 |
Super Member  |
Joined: 30-Jun-2008 Posts: 1919
From: Melbourne, Australia | | |
|
| @Lou
I can see you are very fired up about this. Very passionate. Which is not a bad thing, unless misplaced.
Yes, the Wii U is an inexpensive piece of old PPC hardware when compared to a SAM460LE, and if there were AROS PPC developers who'd care to port it (or at all), that would get a bunch of us spending a few hundred dollars on getting a Wii U off of eBay to give it a whirl. Soon after that, we'd pack it away or it would sit there taking up space, because we'd need developers to port their AROS apps to PPC, and we all know that's a long wait for a train that ain't coming. We'd end up feeling just as shitty as SAM460 owners, only a little less so because we spent less money.
Windows NT (PPC) and Linux (PPC) have decent hardware abstraction, so a port is more straightforward. AmigaOS is still hardware version specific, and MorphOS is in maintenance mode for PPC while we wait to hear where they're going next. Neither of these is open source, so the benefits of a "developer community" do not apply.
The examples of the native Wii U power you shared do not say anything about the budgets and the sizes of the game development teams that worked on those titles. Which is to say, things are only as powerful as the power people aim to leverage.
If we could get titles like Deus Ex or Mass Effect on AROS PPC, the world would wake up confused as to why there are no Wii U consoles anywhere. But as it stands, you're getting worked up over what is at best yet another target for running an Amiga (68k) emulator, and we already have plenty of those, to suit all wallets.
_________________ All the way, with 68k |
| Status: Offline |
| | Hans
|  |
Re: Finally a great and useful PPC port! Posted on 14-Mar-2025 7:12:34
| | [ #23 ] |
| |
 |
Elite Member  |
Joined: 27-Dec-2003 Posts: 5120
From: New Zealand | | |
|
| @Lou
As agami pointed out, AmigaOS 4 doesn't have a full hardware abstraction layer that would allow anyone to write the bootloader and drivers needed to port to another platform.
For example, the kernel has code for both the A1-XE and Pegasos-II, even though those two platforms used the same G3 & G4 CPUs. The platform-specific code includes things such as interrupt controller drivers, setting up of PCI, reading configuration data from the firmware, etc. This is compounded by certain driver APIs (like graphics) being "top-secret" and requiring an NDA.**
So any such port would have to be written by existing AmigaOS 4 devs, who have access to the OS source-code.
If third-party developers were able to port AmigaOS to new machines without needing to be part of the OS4 dev team, then we'd probably be able to run OS4 on old Power Macs, and more.
Hans
** If it were up to me, all driver APIs would be freely available, and as many drivers would be open-source (or "source-available") as possible. _________________ Join the Kea Campus - upgrade your skills; support my work; enjoy the Amiga corner. https://keasigmadelta.com/ - see more of my work |
| Status: Offline |
| | fricopal!
|  |
Re: Finally a great and useful PPC port! Posted on 14-Mar-2025 7:20:51
| | [ #24 ] |
| |
 |
Cult Member  |
Joined: 12-Mar-2025 Posts: 799
From: Unknown | | |
|
| | Status: Offline |
| | kolla
|  |
Re: Finally a great and useful PPC port! Posted on 14-Mar-2025 11:13:16
| | [ #25 ] |
| |
 |
Elite Member  |
Joined: 20-Aug-2003 Posts: 3418
From: Trondheim, Norway | | |
|
| @fricopal!
For sure, you're a bot :) _________________ B5D6A1D019D5D45BCC56F4782AC220D8B3E2A6CC |
| Status: Offline |
| | fricopal!
|  |
Re: Finally a great and useful PPC port! Posted on 14-Mar-2025 12:16:02
| | [ #26 ] |
| |
 |
Cult Member  |
Joined: 12-Mar-2025 Posts: 799
From: Unknown | | |
|
| | Status: Offline |
| | Heimdall
|  |
Re: Finally a great and useful PPC port! Posted on 14-Mar-2025 12:43:43
| | [ #27 ] |
| |
 |
Regular Member  |
Joined: 20-Jan-2025 Posts: 103
From: North Dakota | | |
|
| @Lou
Quote:
A WiiU is a triple-core 1125 Mhz cpu with 2GB of RAM and Radeon 4670-ish gpu that runs 2 screens and has wi-fi and bluetooth.
I'm generally no fan of PPC...but a WiiU is better, cheaper and more useful than any other 'Amiga-compatible' PPC hardware. | A Nintendo console to run Amiga code is a very hard sell. A lot of people can't accept 68060 as an Amiga CPU, and here you're asking for a Nintendo HW (!!!).
That, right there, is just wrong. 20 years ago, when we didn't know any better, it would have been OK to support Nintendo. But it is the year 2025.
The world does not need Nintendo. This is the case, where it would actually be more beneficial for the gaming world to have one less competitor. Give me higher prices of next-gen consoles and less variety in gaming titles - all that is a very fair price for the world where Nintendo does not exist anymore.
Besides, my unplayed backlog has recently crossed 1,500 games (don't know why but I compulsively bought another ~30 games since January). Even if I lived another 50 years, I wouldn't have time to play it all anyway, so I don't care for smaller console competition anymore.
BTW, I am not trying to persuade you - I don't care for that (you liking BigN is your business) - just explaining that as a developer who likes PowerPC CPU, I wouldn't support this endeavor. I'm sure there are plenty of other devs who don't feel this way towards Nintendo, so you might find more luck there... |
| Status: Offline |
| | Heimdall
|  |
Re: Finally a great and useful PPC port! Posted on 14-Mar-2025 13:01:00
| | [ #28 ] |
| |
 |
Regular Member  |
Joined: 20-Jan-2025 Posts: 103
From: North Dakota | | |
|
| @agami
Quote:
The examples of the native Wii U power you shared, do not say anything about the budgets and the sizes of the game development teams that worked on those titles. Which is to say, things are only as powerful as the power people aim to leverage. | Exactly. The Amiga world can't get new ~1993-era games developed, which require 3-5 devs, let alone 300-500! That wouldn't even qualify as futile.
Besides, showcasing Unreal 3 engine games doesn't actually showcase the HW power. I worked with UE3. Its abstraction layer is ridiculous (for a good reason) - it's fast enough to enable quick prototyping/creation, but is extremely far from remotely utilizing the GPU efficiently (but that's a whole other debate I won't go into).
By which I mean that if somebody could be arsed with a proper tech demo written in PowerPC Assembler, removing all pipeline bubbles, then WiiU HW could drop jaws (as any HW, really) - but the amount of work required for that is, practically, insane.
Quote:
If we could get titles like Deus Ex or Mass Effect on AROS PPC, the world would wake up confused as to why there are no Wii U consoles anywhere. | I'm a huge fan of both games (and replay them each year), but why on Earth would I prefer to run them on a Nintendo console running some version of Amiga OS, if I can play them just fine on my PC & Xbox?
Also, it's just a question of a few years before used WiiUs go for $500-$1000 on eBay, as was the case with every other old console becoming retro enough. At which point the only advantage (price) will be long gone...
|
| Status: Offline |
| | Lou
|  |
Re: Finally a great and useful PPC port! Posted on 14-Mar-2025 14:31:22
| | [ #29 ] |
| |
 |
Elite Member  |
Joined: 2-Nov-2004 Posts: 4255
From: Rhode Island | | |
|
| @hans,
https://wiiubrew.org/wiki/IOS
Wii U's IOS is pretty well documented and that's why it has a huge homebrew community.
Who got sued over Moana? This community needs to be like NIKE and just do it!
The Wii U is 3x more powerful than a SAM460 if all 3 cores are used, and it uses 800 MHz DDR3 RAM (vs 533 MHz DDR2), which again is faster than the SAM460. It also has more CPU cache. Double, I believe...if not more overall.
https://en.wikipedia.org/wiki/Espresso_(processor)
At worst, it's a PPC 750 single core running as fast as a SAM460 (vs 440 core) with much faster RAM and much more cpu cache.
Deity_of_choice forbid that good cheap PPC hardware actually exists! |
| Status: Offline |
| | Kronos
|  |
Re: Finally a great and useful PPC port! Posted on 14-Mar-2025 14:56:15
| | [ #30 ] |
| |
 |
Elite Member  |
Joined: 8-Mar-2003 Posts: 2745
From: Unknown | | |
|
| @Lou
The WiiU has either 8 or 32GB of internal storage which is limiting if you plan to run a desktop OS on it. USB and SD cards do exist but that is slow and comes with SW overhead.
CPU performance in single core would be "meh" even among HW currently supported by OS4.
No drivers exist and legal trouble could be real.
So unless there is a blocking factor for a MacMini (and beyond) port that somehow does not apply to the WiiU, it is a path not worth venturing down.
In reality both "ideas" are so long overdue that they stopped making sense long ago and all post 2015 low level development should have gone to getting away from PPC. _________________ - We don't need good ideas, we haven't run out on bad ones yet - blame Canada |
| Status: Offline |
| | Lou
|  |
Re: Finally a great and useful PPC port! Posted on 14-Mar-2025 15:03:12
| | [ #31 ] |
| |
 |
Elite Member  |
Joined: 2-Nov-2004 Posts: 4255
From: Rhode Island | | |
|
| | Status: Offline |
| | Lou
|  |
Re: Finally a great and useful PPC port! Posted on 14-Mar-2025 15:09:26
| | [ #32 ] |
| |
 |
Elite Member  |
Joined: 2-Nov-2004 Posts: 4255
From: Rhode Island | | |
|
| @Kronos
Yeah, I mean 'Amiga' is really limited with only 32GB of storage... 
I mean additional USB 2.0 storage is like so slow...cuz that OS is just so heavy!
Here we go with the 'legal' arguments, LOL!
Drivers exist...you are just blind.
By the way - how much does that currently supported hardware cost? LMFAO!
Wii U's RAM is twice as fast as the A1222...and where is even that hardware? http://www.a1222plus.com/?page=about
All anti-arguments as lame as F@ck! Last edited by Lou on 14-Mar-2025 at 03:23 PM.
|
| Status: Offline |
| | Kronos
|  |
Re: Finally a great and useful PPC port! Posted on 14-Mar-2025 15:30:14
| | [ #33 ] |
| |
 |
Elite Member  |
Joined: 8-Mar-2003 Posts: 2745
From: Unknown | | |
|
| @Lou
Most WiiUs are 8GB, which is insufficient. The 32GB version would be o.k. for the OS and some basic apps but little more. USB2 is o.k. but far from perfect.
All while the other option can be easily upgraded with a 1TB mSATA+adapter
The idea that releasing a commercial OS for a jailbroken console would wake up Nintendo's lawyers isn't that far fetched.
Other option will just boot any OS out of the box.
Drivers don't exist and could only be written by a handful of developers, all under contract with companies that insist that OS4-HW must be slow, expensive, hard to get and cr#p (or at least 3 out of those 4).
Other option has a 50% ready port that was canceled for only hitting 1 out of 4.
Let's say MorphOS got ported. I would for sure ask for my 32GB WiiU to be whitelisted in the betas, but unless there is a way to get at least a 2K (better 4K) signal out of the HDMI and/or FullHD VGA out of the "AV Multi out", I really don't see myself using it in any real way. Last edited by Kronos on 14-Mar-2025 at 03:31 PM.
_________________ - We don't need good ideas, we haven't run out on bad ones yet - blame Canada |
| Status: Offline |
| | Lou
|  |
Re: Finally a great and useful PPC port! Posted on 14-Mar-2025 17:55:59
| | [ #34 ] |
| |
 |
Elite Member  |
Joined: 2-Nov-2004 Posts: 4255
From: Rhode Island | | |
|
| @Kronos
The SSD mod is 5x faster than the internal eMMC, which at this age is starting to fail. So who cares how much internal storage it comes with. Wii U supports 2TB drives.
Wii U has HDMI and should support 1920x1080p. 4K wasn't a thing in 2012.
Edit:
They are $160-250 on ebay. Since the Switch got all the best Wii U games ported to it, it's not as collectible. Last edited by Lou on 14-Mar-2025 at 06:33 PM.
|
| Status: Offline |
| | Kronos
|  |
Re: Finally a great and useful PPC port! Posted on 14-Mar-2025 19:05:30
| | [ #35 ] |
| |
 |
Elite Member  |
Joined: 8-Mar-2003 Posts: 2745
From: Unknown | | |
|
| @Lou
So I'd need my WiiU jail broken, break out the soldering iron to get some storage in only to have something that (if supported) does the same resolution as MacMini while being lower or the same on single core performance?
So yeah, I'll just buy a Mini if I ever wanted to back such levels of HW.
The point is not that a WiiU port is a bad idea; it is just that a MacMini port would yield the same results for the same or less effort. Even if you had to start from scratch, that is.
_________________ - We don't need good ideas, we haven't run out on bad ones yet - blame Canada |
| Status: Offline |
| | Hammer
 |  |
Re: Finally a great and useful PPC port! Posted on 14-Mar-2025 23:03:34
| | [ #36 ] |
| |
 |
Elite Member  |
Joined: 9-Mar-2003 Posts: 6320
From: Australia | | |
|
| @Lou
Quote:
You throw around a lot of mumbo jumbo but most of it doesn't make sense. |
My point, Wii U GPU is not Radeon HD 4670.
Wii U GPU has up to 176.0 GFLOPS of programmable FP32 throughput, which is about half of the Radeon HD 4670.
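For anyone who wants to check the arithmetic, here is a minimal Python sketch of where a figure like 176 GFLOPS comes from. It assumes peak FP32 is simply shader cores x clock x 2 (one multiply-add per core per clock). The 550 MHz Wii U GPU clock is inferred from the 176 GFLOPS and 160 core figures above, and the 750 MHz Radeon HD 4670 clock is an assumption that does not come from this thread.

```python
def peak_fp32_gflops(shader_cores: int, clock_mhz: float, flops_per_clock: int = 2) -> float:
    """Theoretical peak: cores * clock * FLOPs per core per clock (a multiply-add counts as 2)."""
    return shader_cores * clock_mhz * flops_per_clock / 1000.0

# Core counts are from the posts above; the clock speeds are assumptions.
wii_u = peak_fp32_gflops(160, 550)     # 550 MHz inferred from 176 GFLOPS / (160 * 2)
hd_4670 = peak_fp32_gflops(320, 750)   # 750 MHz assumed for the desktop card

print(f"Wii U GPU:      {wii_u:.1f} GFLOPS")
print(f"Radeon HD 4670: {hd_4670:.1f} GFLOPS")
print(f"Ratio:          {wii_u / hd_4670:.2f}")
```

The core count ratio is exactly half. Under the assumed clocks the throughput ratio comes out lower than that, because the desktop card would also run at a higher clock.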
TeraScale 1 includes multiple units capable of carrying out tessellation. Those are similar to the programmable units of the Xenos GPU which is used in the Xbox 360.
Wii U GPU (known as GPU7) has 32 MB eDRAM (MEM1) vs Xbox 360's 10 MB eDRAM. GPU7's advantage reduces the need to tile, hence fewer transactions.
GPU7 still inherits the designs of the Radeon R700 series, hence GPU7’s pipeline is somewhat aligned to Direct3D 10.1 and OpenGL 3.3 standards.
GPU7 features a dedicated Direct Memory Access (DMA) controller to manipulate data between MEM1 and MEM2 without the intervention of the CPU.
For legacy support, Wii U also includes Wii's GPU along with its 3MB 1T-SRAM.
One of the three PPC cores in the Wii U has 2 MB of L2 cache; the others have 512 KB each.
Wii U's walled garden needs to be jailbroken. Last edited by Hammer on 14-Mar-2025 at 11:12 PM.
_________________ Amiga 1200 (rev 1D1, KS 3.2, PiStorm32/RPi CM4/Emu68) Amiga 500 (rev 6A, ECS, KS 3.2, PiStorm/RPi 4B/Emu68) Ryzen 9 7950X, DDR5-6000 64 GB RAM, GeForce RTX 4080 16 GB |
| Status: Offline |
| | matthey
|  |
Re: Finally a great and useful PPC port! Posted on 15-Mar-2025 18:46:12
| | [ #37 ] |
| |
 |
Elite Member  |
Joined: 14-Mar-2007 Posts: 2596
From: Kansas | | |
|
| Heimdall Quote:
A Nintendo console to run Amiga code is a very hard sell. A lot of people can't accept 68060 as an Amiga CPU, and here you're asking for a Nintendo HW (!!!).
|
I believe most 68k Amiga users can accept a 68060 as an Amiga CPU. It is possible to have good compatibility with a 68060 if the 68060.library is placed in flash memory and available at startup which is already available on some 68060 accelerators.
https://forum.icomp.de/index.php?thread/3786-aca1240-1260-full-specifications/ Quote:
local flash rom
There's 8MBytes of flash memory on the ACA1240/1260. A large portion of that is used to keep all the different FPGA cores, because we need a separate core for each supported CPU speed (the memory controller needs to be tweaked for every CPU and frequency). However, one key thing that users of 68040 and 68060 accelerators had to do before they could even install the new card was to install a library. Not so with the ACA1240/1260! It comes with Thor's CPU and MMU libraries pre-installed in the flash, and there is no need to install libraries on your WB before you insert the accelerator. This is how we see the term "autoconfig" - true plug&play.
|
Better would be to enhance the 68060 so less 68k support is missing for compatibility. A 68000 cycle exact core with no caches can be provided for compatibility of early games.
https://forum.icomp.de/index.php?thread/3786-aca1240-1260-full-specifications/ Quote:
For more compatibility, the FPGA can also stop the 68040/68060 CPU and let the CPU of the host computer do the work - memory and peripherals of the ACA1240/1260 can still be made available to that CPU, so at least fastmem can be provided.
The one thing that I have kept secret all over the development time is that the FPGA is wired up in a way that it can halt the 68040/68060 CPU, but also keep the CPU of the host computer off the bus. In that state, the FPGA takes over the A1200 bus and memory, and act as the CPU. We plan to use this for a high-compatibility mode with a cycle-exact 7MHz 68000 processor. The FPGA may be strong enough to provide 68020 performance when connected to an ACA500plus, but don't expect wonders - there's only 10k logic elements in the FPGA, so there's a limit to what we can do. Our focus with this is on compatibility with original A500 titles, not outperforming 680x0 CPUs.
|
Variable clock frequencies are possible with full static core designs like the 68060, 68040V and newer 68000 cores which can further improve compatibility. Games can be adjusted for compatibility or performance which is also already available on some 68060 accelerators. It is likely that 68k Amiga fans with a 68030 Amiga would not give up their system but most likely would buy a 68060 Amiga if it was cheap enough and had compatibility options like I have talked about above.
Heimdall Quote:
That, right there, is just wrong. 20 years ago, when we didn't know any better, it would have been OK to support Nintendo. But it is the year 2025.
The world does not need Nintendo. This is the case, where it would actually be more beneficial for the gaming world to have one less competitor. Give me higher prices of next-gen consoles and less variety in gaming titles - all that is a very fair price for the world where Nintendo does not exist anymore.
|
Is Nintendo the largest danger to the gaming world? How about Microsoft buying up Activision/Blizzard, ZeniMax/Bethesda, Mojang and Rare? Does the monopoly lock in x86-64 Windows and Xbox exclusives forever?
Heimdall Quote:
By which I mean that if somebody could be arsed with a proper tech demo written in PowerPC Assembler, removing all pipeline bubbles, then WiiU HW could drop jaws (as any HW, really) - but the amount of work required for that is, practically, insane.
|
Nintendo PPC hardware is more desktop like than PS3 or Xbox 360 hardware, which is more powerful but very difficult to program because of common and large stalls. The Nintendo PPC CPU cores are 4-stage limited OoO desktop PPC G3 cores with an enhanced SIMD unit. The PPC shallow pipeline limited OoO design is effective at minimizing stalls but limits the clock speed, which was the downfall of PPC.


The 7-stage limited OoO G4 was already showing signs of losing efficiency and IBM ditched the limited OoO for the in-order 23-stage PS3 Cell and 21-stage in-order Xbox Xenon CPU cores. As a good example of just how badly compiled PPC code performs on the PS3 Cell CPU core, I created a chart of single core per MHz 7-Zip results (higher is better).
single core | design | compression/MHz | decompression/MHz
SiFive_U74 | in-order | 0.70 | 0.92
Cortex-A53 | in-order | 0.56 | 0.92
Cortex-A55 | in-order | 0.63 | 1.03
IBM_Cell_PPE | in-order | 0.23 | 0.33
IBM_PPC_G5 | OoO | 0.49 | 0.82
POWER9 | OoO | 1.08 | 0.83
https://www.7-cpu.com/
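To put the chart in perspective, a small Python sketch that divides every per-MHz score by the Cell PPE score. The numbers are copied from the chart above; the dictionary layout and variable names are only for illustration.

```python
# Per-MHz 7-Zip scores copied from the chart: (compression, decompression).
scores = {
    "SiFive_U74":   (0.70, 0.92),
    "Cortex-A53":   (0.56, 0.92),
    "Cortex-A55":   (0.63, 1.03),
    "IBM_Cell_PPE": (0.23, 0.33),
    "IBM_PPC_G5":   (0.49, 0.82),
    "POWER9":       (1.08, 0.83),
}

cell_comp, cell_decomp = scores["IBM_Cell_PPE"]

# How many times faster per MHz each core is than the Cell PPE.
relative = {
    name: (comp / cell_comp, decomp / cell_decomp)
    for name, (comp, decomp) in scores.items()
}

for name, (comp, decomp) in sorted(relative.items(), key=lambda kv: kv[1][0], reverse=True):
    print(f"{name:13s} {comp:4.1f}x compression  {decomp:4.1f}x decompression")
```

On these figures the in-order SiFive U74 is about 3x the Cell PPE per MHz in compression, and the G5 is roughly two to two and a half times the Cell PPE.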
Unfortunately, there are no G3 PPC 7-Zip results but I expect they would be much better than the Cell results and likely comparable to the in-order CPU cores. The aggressive OoO PPC G5 and POWER9 should have better results considering how much cheaper the in-order cores are but compilers are having trouble extracting the full performance despite more stall avoidance hardware. The SiFive U74 design resembles the 68060 design although the RISC-V ISA lacks the performance potential of the 68k ISA. Considering the in-order SoC prices are just a few dollars and use passive cooling, it shows just how noncompetitive the high end aggressive OoO PPC/POWER cores are for general purpose use.
The same software that ran on PPC G3 desktop hardware should work on Nintendo PPC hardware, provided drivers can be developed. The hardware is practical for general purpose use despite limited theoretical/max performance. The WiiU has more expandability than earlier consoles. Amiga Forever support may be coming soon for playing 68k Amiga games.
https://www.amigaforever.com/wii/
The WiiU is older and limited but not bad hardware. The problem is all the closed gaming hardware that is good enough for general purpose use. There is nothing like the open Amiga CD32 hardware where expandability and embedded use were encouraged. It is too bad as in-order CPU SoCs are only a few dollars and a 68060@100MHz CD32 means a 68060@1-2GHz CD32 is possible with compatibility. The 8-stage 68060 should allow a significantly higher clock speed than a 4-stage PPC G3 which was never leveraged as the 68k was sabotaged to promote PPC.
Last edited by matthey on 15-Mar-2025 at 07:04 PM.
|
| Status: Offline |
| | Hammer
 |  |
Re: Finally a great and useful PPC port! Posted on 15-Mar-2025 22:44:34
| | [ #38 ] |
| |
 |
Elite Member  |
Joined: 9-Mar-2003 Posts: 6320
From: Australia | | |
|
| @matthey
Quote:
The 7-stage limited OoO G4 was already showing signs of losing efficiency and IBM ditched the limited OoO for the in-order 23-stage PS3 Cell and 21-stage in-order XBox Xenon CPU cores. A good example of just how bad of performance compiled PPC code is on the PS3 Cell CPU core, I created a chart of single core per MHz 7-Zip results (higher is better). |
PowerPC G4's Altivec was Motorola/Apple's toy. CELL's claim to fame is vector floating point and integer power for the intended 3D games use case.
68060 wouldn't be able to properly patch PS3's RSX, an aging GeForce 7-based GPU. For example, CELL's SPEs are used to patch RSX's weak vertex shaders relative to Xbox 360's unified shader GpGPU.
For desktops, IBM recycled Power4 core and made PowerPC 970 with VMX (aka VMX32, Altivec). CELL has 128 register VMX (aka VMX128).
On the PC, NVIDIA released GeForce 8 with unified shaders a few weeks before PS3's launch. For Core 2, Intel added SSSE3 with the packed math instructions needed to feature-match CELL's packed math.
For example, https://whatcookie.github.io/posts/why-is-avx-512-useful-for-rpcs3/ SSSE3's pshufb is invaluable for emulating CELL's shufb instruction.
PSHUFB uses the low 4 bits of each 8-bit lane as an index into a 16-byte (128-bit) vector. A 68060 version would take at least 16 scalar instructions.
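As a rough illustration of what that one instruction does, here is a scalar Python model of the 128-bit SSSE3 PSHUFB. It is a sketch of the x86 semantics only (low 4 bits select a source byte, a set top bit zeroes the lane); CELL's shufb has its own control-byte rules, which this does not model. The function name is made up for the example.

```python
def pshufb(src: bytes, ctrl: bytes) -> bytes:
    """Scalar model of 128-bit PSHUFB: one table lookup per output byte."""
    if len(src) != 16 or len(ctrl) != 16:
        raise ValueError("both operands must be 16 bytes (128 bits)")
    out = bytearray(16)
    for lane, c in enumerate(ctrl):
        # Bit 7 set -> lane is zeroed; otherwise the low 4 bits index into src.
        out[lane] = 0 if c & 0x80 else src[c & 0x0F]
    return bytes(out)

data = bytes(range(16))
reverse_ctrl = bytes(range(15, -1, -1))   # control vector that reverses byte order
reversed_data = pshufb(data, reverse_ctrl)
zeroed = pshufb(data, bytes([0x80] * 16))  # every lane has bit 7 set
print(reversed_data.hex())
```

The loop body runs 16 times, which is the point being made: a CPU without a byte-shuffle unit has to do each lane as separate scalar work.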
68060 doesn't have the VMX/SPE/SSE SIMD capability of processing eight FP16 and four FP32 data elements at once. In competing for the PS4 contract, CELL's PPE was reused and evolved into PowerPC A2, which competed against AMD's Jaguar, a dual-issue out-of-order CPU with superscalar pipelines for 128-bit FADD SIMD and 128-bit FMUL SIMD operations. AMD K10 also has 128-bit FADD SIMD and 128-bit FMUL SIMD pipelines.
PPE's design ideology continues from light-weight in-order processing PPC 602 with FP32 FPU game console CPU for 3DO M2.
PPE has full FPU/SIMD pipelines, i.e. 1. 128-bit VMX (load/store/permute), 2. 128-bit VMX (ALU/logic), 3. FP64 FPU (ALU/logic), 4. FP64 FPU (load/store).
PPE's integer pipeline is just 68040 class with a very high clock speed: 1. Branch, 2. 64-bit ALU, 3. Load/Store. It has an up to 23-stage pipeline with a profitable yield rate at a 3.2 GHz clock speed on a 90 nm process node. The server PPE variant reached 4 GHz. PPE is biased towards floating point and vector workloads.
AMD Jaguar's pipeline layout: 1. 64-bit ALU, 2. 64-bit ALU, MUL, DIV, 3. AGU, Store, 4. AGU, Load, 5. 128-bit Vector ALU, 128-bit Vector MUL, FADD, 6. 128-bit Vector ALU, FMUL. It has a 13-stage pipeline with a profitable yield rate at a 1.75 GHz clock speed on 28 nm. The low cost desktop PC variant reached 2.2 GHz. It supports two FP64 data elements on 128-bit SIMD hardware via the SSE2 and AVX instruction paths. 256-bit AVX is divided into two 128-bit vector uops, which conserves the instruction issue slot. Jaguar is effectively a cost-reduced K8 with K10's 128-bit SIMD units and AVX capability.
Jaguar was 2X scaled into Zen with AVX2 capability.
Reborn 68060 wouldn't be competitive in modern 3D gaming.
PPE is designed for SMT, hence two threads need to be run per CPU core.
As a 7ZIP example with SMT CPU cores, the Intel Atom N270 has a single "Bonnell" CPU core with SMT at 1600 MHz:
1 thread = 700 compress, 900 decompress
2 threads = 1000 compress, 1500 decompress
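A quick Python sketch of what the Atom N270 figures above work out to, both per MHz and as an SMT gain. It assumes the ratings are in the same units as the per-MHz chart earlier in the thread, which the post does not state.

```python
CLOCK_MHZ = 1600  # Atom N270 clock from the post above

one_thread = {"compress": 700, "decompress": 900}
two_threads = {"compress": 1000, "decompress": 1500}

# Single-thread rating normalised by clock, comparable to a per-MHz chart
# only if the rating units match (an assumption).
per_mhz = {k: v / CLOCK_MHZ for k, v in one_thread.items()}

# Gain from running a second thread on the same physical core.
smt_gain = {k: two_threads[k] / one_thread[k] for k in one_thread}

for k in one_thread:
    print(f"{k:10s} {per_mhz[k]:.2f} per MHz, SMT gain {smt_gain[k]:.2f}x")
```

So the second thread adds roughly 43% in compression and 67% in decompression on this core, which is why a single-thread number undersells a core built around SMT.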
7ZIP doesn't test floating point nor vector math.
Y-cruncher tests floating point with vector math support, e.g. AVX-512. Y-cruncher also open source like 7ZIP. Last edited by Hammer on 16-Mar-2025 at 12:17 AM.
_________________ Amiga 1200 (rev 1D1, KS 3.2, PiStorm32/RPi CM4/Emu68) Amiga 500 (rev 6A, ECS, KS 3.2, PiStorm/RPi 4B/Emu68) Ryzen 9 7950X, DDR5-6000 64 GB RAM, GeForce RTX 4080 16 GB |
| Status: Offline |
| | matthey
|  |
Re: Finally a great and useful PPC port! Posted on 16-Mar-2025 1:59:09
| | [ #39 ] |
| |
 |
Elite Member  |
Joined: 14-Mar-2007 Posts: 2596
From: Kansas | | |
|
| Hammer Quote:
PowerPC G4's Altivec was Motorola/Apple's toy. CELL's claim to fame is vector floating point and integer power for the intended 3D games use case.
|
The PPC G4 VMX/Altivec is enough SIMD to be useful without driving up CPU prices into orbit. SIMD programming then did not work well with compilers and there was a shortage of PPC assembly programmers so why beef up a SIMD unit that is rarely used?
SIMD units did make higher clock speeds more important to boost SIMD throughput but that was not a good thing for PPC shallow pipeline limited OoO designs. A 7-stage limited OoO G4 is still practical for general purpose use while the 23-stage in-order Cell PPC CPU core is an expensive waste of specialized hardware for general purpose use. The Nintendo PPC G3 CPU cores have enhanced PPC SIMD support but retain desktop like general purpose usability. The PS3 design would have been better if the Cell PPC CPU had not been stripped down to weak in-order RISC performance. IBM assumed the specialized performance would outweigh the extra development time and that developers would learn to optimize the hardware but they were wrong.
https://www.anandtech.com/show/1719/4 Quote:
The bottom line is that Sony would not foolishly spend over 75% of their CPU die budget on SPEs to use them for nothing more than fancy DSPs. Architecting a game engine around Cell and optimizing for SPE acceleration will take more effort than developing for the Xbox 360 or PC, but it can be done. The question then becomes, will developers do it?
In Johan’s Quest for More Processing Power series he looked at the developmental limitations of multi-threading, especially as they applied to games. The end result is that multi-threaded game development takes between 2 and 3 times longer than conventional single-threaded game development, to add additional time in order to restructure elements of your engine to get better performance on the PS3 isn’t going to make the transition any easier on developers.
|
An example of how bad the Cell PPC CPU is to program is branches where branch prediction hints are expected to be used for a 23-stage pipeline.
http://ilab.usc.edu/packages/cell-processor/docs/IBM_redbook.pdf Quote:
Programmer-directed hints can also be used effectively to encourage compilers to insert optimally predicted branches. Even though there is some anecdotal evidence that programmers do not use them very often, and when they do use them, the result is wrong, this is likely not the case for SPU programmers. SPU programmers generally know a great deal about performance and will be highly motivated to generate optimal code.
|
Not only are the static branch prediction hints expected but the original PPC branch hint bit in branch instructions is ignored as it is not available early enough with such a deep pipeline.
http://ilab.usc.edu/packages/cell-processor/docs/IBM_redbook.pdf Quote:
A branching hint should be present soon enough in the code. A hint that precede the branch by at least eleven cycles plus four instruction pairs is minimal. Hints that are too close to the branch do not affect the speculation after the branch.
|
Every loop and branch is tedious and error prone monotony. The manual tuning requirement is too difficult for compilers too. Cell developers learned very quickly that they wanted desktop like x86-64 CPU cores instead despite the baggage, bloat and heat increasing the cost of consoles!
Hammer Quote:
PPE's design ideology continues from light-weight in-order processing PPC 602 with FP32 FPU game console CPU for 3DO M2.
|
More stripped down PPC crap. The Apple Pippin at least used the PPC 603, which is only slightly less stripped down PPC crap and was responsible for the worst Macs ever.
Hammer Quote:
PPE has full FPU/SIMD pipelines i.e. 1. 128bit VMX(load/store/premute), 2. 128bit VMX(ALU/logic), 3. FP64 FPU (ALU/logic), 4. FP64 FPU (load/store).
|
The Cell PPC CPU SIMD hardware has been stripped down too!
http://ilab.usc.edu/packages/cell-processor/docs/IBM_redbook.pdf Quote:
Floating-point operations:
o Single-precision instructions are performed in 4-way SIMD fashion and are fully pipelined. Since those instructions have good performance it is recommended for the programer to try to use them if the application allows to.
o Double-precision instructions are performed in 4-way SIMD fashion, are only partially pipelined, and will stall dual issue of other instructions. The performance of these instructions makes Cell BE less attractive for applications that have massive use of double-precision instructions.
o Data format follows the IEEE Standard 754 definition, but the single precision results are not fully compliant with this standard (different overflow and underflow behavior, support only for truncation rounding mode, different denormal results).
The programer should be aware that in come cases the computation results will not be identical to IEEE Standard 754
|
The lack of IEEE FP compliance means the same thing as it does for an ARM Neon SIMD unit.
https://en.wikipedia.org/wiki/ARM_architecture_family#Advanced_SIMD_(Neon) Quote:
A quirk of Neon in Armv7 devices is that it flushes all subnormal numbers to zero, and as a result the GCC compiler will not use it unless -funsafe-math-optimizations, which allows losing denormals, is turned on. "Enhanced" Neon defined since Armv8 does not have this quirk, but as of GCC 8.2 the same flag is still required to enable Neon instructions. On the other hand, GCC does consider Neon safe on AArch64 for Armv8.
|
The Cell FP handling is worse because it applies to the FPU as well, which is usually used by compilers, but they may use soft float libraries because of the lack of an IEEE compliant FPU. All that hardware and compilers can not use it by default! It is not general purpose! The architects are cutting corners to try to get an advantage but end up with a disadvantage!
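To show why compilers treat flush-to-zero hardware as unsafe by default, here is a minimal Python sketch. It emulates flushing subnormal results to zero on ordinary IEEE doubles; the flush_to_zero helper is hypothetical and only models the "subnormals become zero" quirk from the Neon quote, not the other Cell deviations (truncation rounding, overflow behaviour).

```python
import math
import sys

def flush_to_zero(x: float) -> float:
    """Emulate FTZ hardware: any subnormal result is replaced by a signed zero."""
    if x != 0.0 and abs(x) < sys.float_info.min:   # float_info.min is the smallest normal double
        return math.copysign(0.0, x)
    return x

a = 3.0e-308   # normal double, just above the subnormal range
b = 2.5e-308   # also normal

ieee_diff = a - b                 # subnormal under IEEE 754, but not zero
ftz_diff = flush_to_zero(a - b)   # what flush-to-zero hardware would return

print(a != b, ieee_diff != 0.0, ftz_diff != 0.0)
```

With IEEE subnormals, a != b guarantees a - b != 0, so code like 1 / (a - b) guarded by an a != b check is safe. With flushing, that guarantee is gone, which is the kind of behaviour change a compiler is not allowed to introduce silently.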
Hammer Quote:
PPE integer pipeline is just a 68040 class with very high clock speed. 1. Branch, 2. 64bit ALU, 3. Load/Store, Up to 23 stage pipeline. PPE is biased towards floating point and vector workloads.
|
The Cell CPU core is superscalar and multithreading which is more advanced than the superscalar 68060 other than the lack of units. The Cell core fetches 32B/cycle from the L2 cache for predecoding to the L1 and fetches 16B/cycle from L1 for execution yet the 4B/cycle instruction fetch 68060 at half the clock speed would destroy it for general purpose use of programs with a quick compile. It is not how big your hardware is but how you use it.
Hammer Quote:
Reborn 68060 will be smashed in modern 3D gaming.
|
Sure, but general purpose use is about integer and FPU performance which is easily accessible with a compile. The 68k Amiga virtual machines will forever be 32-bit only, single core only and without SIMD units. A reborn 68060 could add 64-bit, a SIMD/vector unit, SMP support and a GPU, which would be enough for affordable semi-modern 3D gaming that is competitive with RPi and RISC-V hardware. The 68k Amiga needs practical and easy to use hardware more like the WiiU hardware than the PS3 hardware.
Last edited by matthey on 16-Mar-2025 at 05:43 PM.
|
| Status: Offline |
| | OneTimer1
|  |
Re: Finally a great and useful PPC port! Posted on 16-Mar-2025 14:26:13
| | [ #40 ] |
| |
 |
Super Member  |
Joined: 3-Aug-2015 Posts: 1163
From: Germany | | |
|
| @Lou
Quote:
Lou wrote:
Well - finally a great and useful OS has been ported to inexpensive PPC hardware.
|
Neither useful nor great.
WindowsNT for PowerPC was not very useful at all, today I would prefer a PPC Linux over every dead vintage windows version that was never really supported in its zenith.
@Lou
Quote:
Lou wrote:
The toxic community backlash was "who will pay for it."
|
Maybe you didn't understand, it is not the 'community' that is asking for money, pay Hyperion and they will port every code they have to everything you pay for.
-------------
But don't get me wrong, I have respect for the coders who did the port. Last edited by OneTimer1 on 16-Mar-2025 at 03:01 PM.
|
| Status: Offline |
| |