/  Forum Index
   /  Amiga General Chat
      /  Integrating Warp3D into my 3D engine
Karlos 
Re: Integrating Warp3D into my 3D engine
Posted on 27-Feb-2025 11:04:33
#181
Elite Member
Joined: 24-Aug-2003
Posts: 4937
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition!

@Hammer

Lisa was an expensive moonshot and was in a completely different price bracket than the A1000 on release. In terms of user experience there's still no contest: a monochrome machine with workstation ambitions versus a consumer machine with multiple supported resolutions, colour depths, stereo digital sound, preemptive multitasking, etc.

_________________
Doing stupid things for fun...

matthey 
Re: Integrating Warp3D into my 3D engine
Posted on 27-Feb-2025 11:08:41
#182
Elite Member
Joined: 14-Mar-2007
Posts: 2602
From: Kansas

Hammer Quote:

Xbox comparison is only for the GPU.


lol

ARM11 was a performance improvement over earlier ARM offerings but still very weak. The first superscalar ARM core to have decent performance was the 2005 dual-issue in-order Cortex-A8, with a deep and unusual 13-stage pipeline.

https://en.wikipedia.org/wiki/ARM_Cortex-A8 Quote:

Compared to the ARM11, the Cortex-A8 is a dual-issue superscalar design, achieving roughly twice the instructions per cycle. The Cortex-A8 was the first Cortex design to be adopted on a large scale in consumer devices.


The 13-stage in-order Cortex-A8 had a claimed 2.0 DMIPS/MHz, but ARM returned to a more practical 8-stage in-order design with the Cortex-A7 at only a claimed 1.9 DMIPS/MHz, likely because deeper in-order RISC pipelines have more and longer stalls, which reduce general-purpose performance. Cesare Di Mauro writes the following comparing the Cortex-A8 and early OoO ARM cores against a 2-way in-order x86 Atom core.

https://www.appuntidigitali.it/21667/nex64t-8-comparison-with-other-architectures/ Quote:



As can be seen, an older Atom (x86/x64) that has a two-way in-order pipeline obliterates the A8 (ARM, also two-way in-order), disintegrates even the A9 (ARM, two-way out-of-order), and even manages to be competitive with the A15 (ARM, three-way out-of-order)!

The reasons are given in the conclusion of the article, from which I extract the relevant part (relative to a benchmark that was investigated. But similar situations are common with L/S ISAs):

The Cortex processors execute dramatically more instructions for the same results.


The "two-way in-order pipeline" Atom obliterated everything ARM had at the time, including the 3-way OoO Cortex-A15 (the Cortex-A72 is also 3-way OoO). The 68060 has a "two-way in-order pipeline" like the in-order Atom but is lower power and was never modernized. ARM cores were anemic, and the lower-power cores today are not much better. On small-footprint hardware like the RPi, which we are talking about, "function inlining and loop unrolling" is often avoided to improve code density.
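As a toy illustration of why a second issue slot only helps when adjacent instructions are independent, here is a minimal cycle-count sketch. The instruction format and the single-cycle latency are invented for the illustration; real cores also model multi-cycle latencies, issue restrictions, and memory stalls.

```python
# Toy cycle-count model for scalar vs. dual-issue in-order pipelines.
# Instructions are (dest, srcs) tuples; an instruction can issue only
# once its source registers are ready, and a core of the given width
# starts at most `width` instructions per cycle.

def cycles(insns, width):
    ready = {}            # register -> cycle its value becomes available
    cycle, slot = 0, 0
    for dest, srcs in insns:
        need = max((ready.get(s, 0) for s in srcs), default=0)
        if need > cycle or slot == width:
            cycle = max(cycle + 1, need)   # advance to the next free cycle
            slot = 0
        ready[dest] = cycle + 1            # single-cycle latency assumed
        slot += 1
    return cycle + 1

# Two independent chains followed by a join:
prog = [("a", ()), ("b", ()), ("c", ("a",)), ("d", ("b",)), ("e", ("c", "d"))]
```

For this mix, `cycles(prog, 1)` gives 5 cycles and `cycles(prog, 2)` gives 3: the dual-issue core only wins because the two chains can be interleaved.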

Profile Guided Selection of ARM and Thumb Instructions
https://www2.cs.arizona.edu/~arvind/papers/lctes02.pdf Quote:

Optimizing compiler

The compiler we used in this work is the gcc compiler which was built to create a version that supports generation of mixed ARM and Thumb code. Specifically we use the xscale-elf-gcc compiler version 2.9-xscale. Each module in the application can be compiled into either Thumb code or ARM code. The transitions between the modes at function boundaries are also taken care of by the compiler. From the above perspective, the libraries are treated as a single module, that is, either they are compiled into ARM code or completely into Thumb code. All programs were compiled at -O2 level of optimization. We did not use -O3 because at that level of optimization function inlining and loop unrolling is enabled. Clearly since the code size is an important concern for embedded systems, we did not want to enable function inlining and loop unrolling.


The emubench benchmark Karlos wrote shows that RISC/ARM cores have reduced performance not only for JIT 68k-to-ARM translated code but on small-footprint hardware generally, and this seems to be independent of which ARM ISA is used. ARM64/AArch64 has mediocre code density at best, and optimizing it for performance requires "function inlining and loop unrolling", both of which compound to increase code size; that is why newer RPis need more memory and larger caches. The small-footprint advantage that made the original RPi possible is gone with ARM64, and even the Thumb(-2) ISAs never delivered performance as good as the 68k allows!

Hammer Quote:

For integer performance, ARM11 is reliant on 32-bit integer SIMD. Your benchmark doesn't use SIMD.


Recall that more than 90% of instructions in Photoshop for the x86 and x86-64 are integer instructions and not FPU or SIMD instructions.

https://www.appuntidigitali.it/18362/statistiche-su-x86-x64-parte-8-istruzioni-mnemonici/

ARM auto vectorization was so bad before ARM64 that it was not worth using.

https://en.wikipedia.org/wiki/Automatic_vectorization

Even with auto-vectorization in ARM64, the C(++) code needs to be written carefully, and the resulting SIMD code may still need to be disabled due to lower performance than using the FPU. Do you really think software for small-footprint hardware is as heavily hand-optimized in assembly with SIMD instructions as x86-64 games on consoles like the Xbox?

Hammer Quote:

Refer to Amiga Hombre's PA-RISC's custom SIMD extensions for a similar direction.


I showed that the 68060 achieved nearly the same MPEG playback performance as a PA-RISC CPU with SIMD instructions. Load/store overhead and load-to-use stalls are performance handicaps that CISC cores like the 68060 do not have. RISC cores with SIMD instructions may make up this deficit in rare cases, but then the SIMD unit is load/store, whereas a CISC core can keep mem-reg loads when a SIMD unit is added. Why do you think RISC cores do not add SIMD units as powerful as those of x86-64 cores?

Hammer Quote:

Jazelle (Java acceleration) is important for Android and Blu-ray. https://en.wikipedia.org/wiki/BD-J


Python seems to be more popular on the RPi than Java. Judging by the image stats of RPi downloads, I see "Android by emteria" and "Bass ARM (Android)" at less than 1% combined. Did I miss any Android OSes?

https://rpi-imager-stats.raspberrypi.com/

At least SIMD instructions are general purpose and provide a worthwhile performance increase even though rarely used. In the case of the scalar ARM11, I would rather have a superscalar CPU core than a CPU core with a SIMD unit or Jazelle wastefulness.

Last edited by matthey on 27-Feb-2025 at 03:19 PM.

ZXDunny 
Re: Integrating Warp3D into my 3D engine
Posted on 28-Feb-2025 9:57:33
#183
New Member
Joined: 7-Feb-2025
Posts: 7
From: Unknown

@matthey

I dunno dude, I feel that - given that ARMs won't run 68k code - we kinda have to do the JIT thing, you know? There's no way around that.

And as there isn't anything in 060 land that clocks more than a hundred MHz or so, we don't really have a choice if you want more speed. I mean, we'd all love a 1GHz m68k but... there isn't one. So we have to do ARM and that means we have to do JIT.

Pretty much stuck there, ain't we.

Mind you, there are bonus features like RTG and uSD storage and Wifi to sweeten the deal, you ain't gonna get all that on no 68060 expansion card.

Hammer 
Re: Integrating Warp3D into my 3D engine
Posted on 28-Feb-2025 11:01:34
#184
Elite Member
Joined: 9-Mar-2003
Posts: 6320
From: Australia

@matthey

Quote:

ARM11 was a performance improvement over earlier ARM offerings but still very weak. The first superscalar ARM core to have decent performance was the 2005 dual-issue in-order Cortex-A8, with a deep and unusual 13-stage pipeline.


ARM11 proved a high clock speed advantage and its suitability for smartphone SoCs.

Clock speed x IPC = performance.

By 1996 the 68060 had already lost to the StrongARM SA-110; the first versions, operating at 100, 160, and 200 MHz, were announced on 5 February 1996.

https://www.cpushack.com/CIC/announce/1996/DECStrongARMSA-110.html

Sept. 12, 1996

The new versions of the StrongARM SA-110 microprocessor announced today
operate at 233MHz and 166MHz. The new SA-110 233MHz processor couples
the large, on-chip cache and fast bus interface of the original SA-110
with a higher-speed pipeline to deliver performance estimated at 270
Dhrystone 2.1 MIPS, five to ten times that of competitively priced
parts. The new SA-110 at 166MHz offers embedded designers access to 200
Dhrystone MIPS performance at lower price points.


Extended Range, New Applications

Both products feature the low-power ARM architecture and high-
performance enhancements of existing StrongARM SA-110 microprocessors,
while extending the family's range of price and performance. The
StrongARM SA-110 233MHz chip is volume-priced at less than $50, setting
a new price/performance record of 5.5 MIPS per dollar. It dissipates
just one watt of power.
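As a sanity check on the press-release figures quoted above, a line of arithmetic converts them to DMIPS/MHz for comparison with the Cortex-A8's claimed 2.0 DMIPS/MHz mentioned earlier. These are marketing estimates, so the results are only indicative.

```python
# Back-of-envelope DMIPS/MHz from the quoted StrongARM SA-110 figures.
sa110_233 = 270 / 233   # 270 Dhrystone 2.1 MIPS at 233 MHz
sa110_166 = 200 / 166   # 200 Dhrystone MIPS at 166 MHz

print(round(sa110_233, 2), round(sa110_166, 2))  # about 1.16 and 1.2
```

So the SA-110's per-clock throughput was well below the later Cortex-A8's claimed figure; its advantage was the clock speed itself.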



Quote:
https://www.appuntidigitali.it/21667/nex64t-8-comparison-with-other-architectures/ Quote:


"Core i7" is meaningless without a specific model number.

Before 2013, and against similar-era ARM Cortex-A15 parts, AMD's Jaguar won the desktop game console contracts for Xbox One and PS4.

I'm using game-related CPU benchmarks since Amiga's target audience is mostly interested in games. I'm going to hammer vector floating-point and branch performance because of the 3D-games use case.

From https://gpucuriosity.wordpress.com/2025/02/28/3d-marks-ice-cpu-physics-scores/

AMD A4-5000 (quad-core Jaguar @ 1.5 GHz): Physics Score 16,812

AMD A4-5200 (quad-core Jaguar @ 2.0 GHz): Physics Score 24,528

ASUS Zenfone 2 (quad-core Intel Atom Z3580 @ 2.3 GHz in the 4 GB RAM model, @ 1.8 GHz in the 2 GB RAM model): Physics Score 20,767

Huawei Mate 8 (HiSilicon Kirin 950, octa-core: 4× ARM A72 @ 2.3 GHz + 4× ARM A53 @ 1.8 GHz): Physics Score 14,566

Huawei P8 lite (octa-core Cortex-A53 @ 1.2 GHz): Physics Score 7,407

Samsung Galaxy S6 (octa-core: 4× Cortex-A57 @ 2.1 GHz + 4× Cortex-A53 @ 1.5 GHz): Physics Score 16,835

PS4 and Xbox One have 8 Jaguar CPU cores from AMD.
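A crude way to read the scores above is per core-GHz, which strips out core count and clock. This is illustrative arithmetic only, restricted to the homogeneous-cluster chips; the big.LITTLE parts (Kirin 950, Galaxy S6) are skipped because their clusters run at different clocks.

```python
# Physics Score normalized by (cores x GHz), from the figures quoted above.
scores = {
    "A4-5000 (4x Jaguar @ 1.5 GHz)": (16812, 4 * 1.5),
    "A4-5200 (4x Jaguar @ 2.0 GHz)": (24528, 4 * 2.0),
    "Atom Z3580 (4x @ 2.3 GHz)":     (20767, 4 * 2.3),
    "P8 lite (8x A53 @ 1.2 GHz)":    (7407,  8 * 1.2),
}
per_core_ghz = {name: round(score / cghz) for name, (score, cghz) in scores.items()}
```

On this crude metric Jaguar and Atom land in the same band, while the little Cortex-A53 cluster trails far behind both.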

Quote:

Recall that more than 90% of instructions in Photoshop for the x86 and x86-64 are integer instructions and not FPU or SIMD instructions.

Useless for games. Run Valve's Steam on a 1st-gen Core i7 "Nehalem" and watch multiple PS4/XBO-era games crash or halt due to missing AVX.

PS5/XSX-era games need AVX2 / x86-64 Level 3.

Bullet physics supports x86 SSEx and AVXx extensions.

https://youtu.be/dXbjmF--QVc?t=304
Heavy ray tracing in World of Tanks Encore RT. Intel Embree middleware is used.


Starfield has more than 50% AVX instructions in its code, so running it on anything without AVX is just not possible / very, very slow.

https://www.youtube.com/watch?v=6BIfqfC1i7U
GDC 2018, Extreme SIMD: Optimized Collision Detection in Titanfall.
In this 2018 GDC talk, Respawn Entertainment's Earl Hammon explains how the Titanfall team made already-optimized continuous collision detection code more than twice as fast.

Quote:

I showed that the 68060 achieved nearly the same MPEG playback performance as a PA-RISC CPU with SIMD instructions. Load/store overhead and load-to-use stalls are performance handicaps that CISC cores like the 68060 do not have. RISC cores with SIMD instructions may make up this deficit in rare cases, but then the SIMD unit is load/store, whereas a CISC core can keep mem-reg loads when a SIMD unit is added. Why do you think RISC cores do not add SIMD units as powerful as those of x86-64 cores?

For Amiga Hombre, Commodore selected a licensed PA-RISC clone for performance AND low price.

Again, present a BOM costing plan in the style of the Xbox's, built around a 1996-era 68060.

The 68060 is too expensive for the BOM cost range of Commodore's A500/A1200/CD32.

------------------------------------------

https://www.neogaf.com/threads/3do-mx-chipset-the-technology-nintendo-almost-used-in-an-n64-successor-for-1999.350196/#post-14521193
The initial BOM costings for the original Xbox


Brown said the goals were to make money, expand Microsoft's technology into the living room, and create the perception that Microsoft was leading the
charge in the new era of consumer appliances. The initial cost estimate was for a machine with a bill of materials (engineering talk for cost) of $303. That
machine would debut in the fall of 2000 and use a $20 microprocessor running at 350 megahertz from Advanced Micro Devices. The machine would also have
a $55 hard disk drive with two gigabytes of storage, a $27 DVD drive to play movies, a $35 graphics chip, $25 worth of memory chips, and a collection of
other standard parts like a motherboard, and power supply. Over time, these prices would decline.


$20 Intel-compatible microprocessor and a $30 graphics chip from Nvidia. The highest-priced item on the list of materials was $40 for memory chips. But the
rest of the bill of materials was complete, down to $2.14 for the cables and $4.85 for screws



Xbox's BOM costs are close to mainstream Amiga AGA.

https://forum.beyond3d.com/threads/og-xbox-was-planned-to-launch-with-an-amd-cpu-until-last-minute.62562/#post-2225089
The Xbox's CPU was changed to a K7 Duron before switching to the Intel Coppermine 128K. The chipset was based on the AMD 760, so the nForce/Xbox chipset is mostly an AMD design that NVidia bought the rights to modify.


For a year-2000 release, prove that Motorola could have delivered the CPU solution the Xbox needed.



Last edited by Hammer on 28-Feb-2025 at 01:14 PM.

_________________
Amiga 1200 (rev 1D1, KS 3.2, PiStorm32/RPi CM4/Emu68)
Amiga 500 (rev 6A, ECS, KS 3.2, PiStorm/RPi 4B/Emu68)
Ryzen 9 7950X, DDR5-6000 64 GB RAM, GeForce RTX 4080 16 GB

michalsc 
Re: Integrating Warp3D into my 3D engine
Posted on 28-Feb-2025 11:08:47
#185
AROS Core Developer
Joined: 14-Jun-2005
Posts: 433
From: Germany

@ZXDunny

But matthey is right. 1GHz 68060 is much better than JIT running on ARM using the same clock speed. Much better code density and for sure its L1, L2 and L3 caches are running at the same over 1GHz clock speed as the 68060 itself. That must be, otherwise we wouldn't have such low latencies for loads and stores.

I just wonder why we all have not yet switched to MC68060 running at 1 or 2GHz.

... oh, wait... ;)

pixie 
Re: Integrating Warp3D into my 3D engine
Posted on 28-Feb-2025 12:05:26
#186
Elite Member
Joined: 10-Mar-2003
Posts: 3449
From: Figueira da Foz - Portugal

@michalsc

But let's pretend we have AROS running natively on ARM, and let's pretend we have the most-used software running natively on it: would the speed difference be relevant? I mean, not in rendering tasks like Lightwave, which should see a meaningful speed-up, but perhaps only in the time taken to produce the rendered output.
What I am talking about is interaction with the software. Take "SQLite Manager", for example: would a user feel any difference between working with the ARM-native version and running it under the JIT?

Last edited by pixie on 28-Feb-2025 at 12:24 PM.

_________________
Indigo 3D Lounge, my second home.
The Illusion of Choice | Am*ga

Karlos 
Re: Integrating Warp3D into my 3D engine
Posted on 28-Feb-2025 14:35:31
#187
Elite Member
Joined: 24-Aug-2003
Posts: 4937
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition!

@michalsc

Quote:
I just wonder why we all have not yet switched to MC68060 running at 1 or 2GHz


*Raises hand enthusiastically*

Ooh! Ooh! I know! I know! Pick me! I know!

Is it becaaauuuuse..... They don't exist?

_________________
Doing stupid things for fun...

matthey 
Re: Integrating Warp3D into my 3D engine
Posted on 1-Mar-2025 20:21:31
#188
Elite Member
Joined: 14-Mar-2007
Posts: 2602
From: Kansas

There once was a small town called Amitown that had the best general store around. Through mismanagement the store closed, and with it many of the other stores in town. Most of the people left town, with basic necessities no longer to be found for long distances around.

Along came an engineer who built a railroad to a distant small town that still took hours to reach. The remaining populace loved that necessities were so close and convenient, and expected the people who had left to return.

Someone else came along and proposed that Amitown build its own stores, but the populace proclaimed how convenient it was to have the railroad and that that was all Amitown required. "We still have our much-loved and familiar Amitown, and it is better again, so the people will return", said the die-hard optimists of Amitown.

The majority of the people who left Amitown are living in larger towns with more and better necessities minutes away, though. Some previous residents may periodically visit Amitown and agree that it is still lovely before returning to the convenience of their new town.

Last edited by matthey on 01-Mar-2025 at 08:25 PM.

Kronos 
Re: Integrating Warp3D into my 3D engine
Posted on 1-Mar-2025 20:51:14
#189
Elite Member
Joined: 8-Mar-2003
Posts: 2746
From: Unknown

@pixie

Quote:

pixie wrote:

But let's pretend we have AROS running natively on ARM, let's pretend we have the most used software running natively on it, would the speed difference be relevant?


At this point you would be talking about a pure "NG" system and the real question is whether you'd be running the same old SW or some heavier stuff.

I'm pretty sure I'd notice a slight difference between a Peg1_G3 (aka an rPI running the 68k EMU) and a fast G4 as in a PowerBook, or a PowerMac with a CPU upgrade (aka the native single-core speed of an rPI).
Let alone a G5.

Not sure what this is about, but there is a lot of performance to be gained on the ARM side of a Pi; I'm just not sure there would be a point to it, and if there were, what the point of tying it to ancient HW would be.

_________________
- We don't need good ideas, we haven't run out on bad ones yet
- blame Canada

Hammer 
Re: Integrating Warp3D into my 3D engine
Posted on 1-Mar-2025 23:11:41
#190
Elite Member
Joined: 9-Mar-2003
Posts: 6320
From: Australia

@Karlos

Quote:

Lisa was an expensive moonshot and was in a completely different price bracket than the A1000 on release. In terms of user experience there's still no contest: a monochrome machine with workstation ambitions versus a consumer machine with multiple supported resolutions, colour depths, stereo digital sound, preemptive multitasking, etc.

The Apple Lisa had a custom MMU, memory protection, virtual memory, 32-bit preemptive multitasking, a stable business platform, high-resolution monochrome graphics, a stock 1 MB of ECC RAM, and a WIMP GUI. Two hardware revisions were released. Apple's OS team demonstrated advanced 32-bit OS skills. The Lisa had mono PCM sound. LisaOS lacked Xenix's C2-rated security.

Apple's second attempt was with Apple Unix, which included Mac app compatibility for Macs equipped with a 68020/68851 MMU or a 68030.

The Lisa was rebranded as the Macintosh XL with a switch to MacOS. MacOS had been cut down to fit the Mac 128K's reduced hardware. A task switcher, which MS Excel depended on, was added for the Mac 512K and bundled as an OS update in 1985.

The MMU-based LisaOS was a major factor in Bill Gates' argument for the 32-bit 386 with an MMU, hence Xenix 386, Windows 2.0/386 and, starting in 1988, the Windows NT project (then known as OS/2 3.0, with a C2 security target). Windows NT combines higher-level Xenix 386 concepts with Windows userland layers.

When Steve Jobs left Apple, key Apple OS engineers followed him to NeXT and avoided an expensive commercial Unix license. Both Bill Gates and Steve Jobs disliked expensive Unix licenses.

Both Bill Gates and Steve Jobs led projects with 32-bit preemptive multitasking, memory protection, multi-user support, symmetric multiprocessing, sandboxed legacy software, retargetable graphics, and at least C2-rated security features for their respective platforms.

Amiga OCS's multiple resolutions didn't include a stable high-resolution graphics mode for business GUIs. Amiga OCS remained in TV modes, and the A2024 was a debacle with a production run of about 5000 units.

Apple started R&D on the Mac II's 256-color, business-resolution display capability in 1985. This capability was ready for Apple's early-1990s best-selling Mac LC series.

Without Steve Jobs' NeXTSTEP team, Apple tried to replace MacOS with Copland and failed.
Steve Jobs' OS team had an OS GUI design strength which underpins today's Mac OS X and iOS.

Amiga OCS strength was turned into a weakness during the "next gen" transition phase.

----------------
From the original key Amiga engineers who learned from AmigaOS mistakes,


the 3DO OS is a fully-featured 32-bit multitasking and real-time operating system written specifically for the 3DO by NTG. Developers *must* use the OS for a variety of reasons, the main one being to maintain compatibility with all 3DO consoles and future next-generation 3DO consoles.

The OS is loaded from an application's CD when the system starts up, and is not in the consoles ROM. Contrary to rumors, there is no way to bypass the OS and "hack" directly on the hardware.

The 3DO OS consists of two parts:
1. A multitasking kernel with drivers for peripherals, a complete file system, and support for physical storage.

2. Several software "folios" that provide a link between application software and the 3DO hardware, and are designed to allow software compatibility as new versions of the hardware are developed. The following six subsystems make up the entire "Portfolio" of 3DO OS system calls:

The Decompression Folio: supports software and hardware decompression of audio and video data.
The Math Folio: performs many of the high level calculations.
The Graphics Folio: provides access to the 3DO's cel and display subsystems for doing graphics effects and animation. The effects include warping, transparency, lighting effects, anti-aliasing, and texture mapping.
The 3-D Folio: system code for creating 3-D effects and doing complex calculations.
The Audio Folio: supports the creation and manipulation of sound effects and music. This includes proprietary algorithms called "3D audio imaging" that create the illusion of sound coming not only from the left and right, but front and back (when wearing headphones). These algorithms can also produce Doppler effects and reverberations.
The File System Folio: manages the file system.



From https://users.polytech.unice.fr/~buffa/videogames/3do_faq2.4.html
3DO team created a game-centric low level API that must be used.

Key original Amiga engineers were pushed out of Commodore-Amiga Inc when it was taken over by Commodore's systems engineering group.

After intervention from Japan Inc deep state industrialist, 3DO M2's progress was halted by Panasonic after they paid $100 million for 3DO M2 (PPC 602) design.

Samsung's 3DO group later designed the 3DO MX (PPC 602), which was purchased by MS and merged with the Xbox team. The Xbox project had advanced low-level Direct3D and OS kernel development, which later transferred into the PC's DirectX 12, which has much in common with AMD's low-level Mantle and Vulkan APIs. Xbox 360 has a hardware microcode-engine version of Direct3D, which carried over to Xbox One and Xbox Series X/S.

AMD's CEO, Lisa Su was responsible for interfacing IBM Microelectronics and Sony PlayStation and MS Xbox (during Xbox 360) divisions.

Xbox 360 (2005) has major GCN HSA features such as CPU/GPU pointer exchange, async compute with 8 concurrent contexts (async compute is in the PC's DX12_0; Southern Islands GCN has two ACE units), rasterizer-ordered views (in the PC's DirectX 12_1), and so on.

Bill Gates' MS assimilated 3DO.

The Vulkan API is capable of abstracting AMD GCN's low-level PM4 packets with high efficiency, as shown by the Steam Deck (1.6 TFLOPS FP32 RDNA 2 GPU) running the ShadPS4 emulator versus the PS4's 1.84 TFLOPS FP32 GCN 2.0.


References
https://www.neogaf.com/threads/3do-mx-chipset-the-technology-nintendo-almost-used-in-an-n64-successor-for-1999.350196/#post-14521193
MS merged the former CagEnt/3DO Systems team with the Xbox team as the primary group to attack Sony PlayStation.

Gaming PC would later benefit from Xbox's low-level Direct3D R&D as PC's low-level DirectX12.








Last edited by Hammer on 01-Mar-2025 at 11:43 PM.

_________________
Amiga 1200 (rev 1D1, KS 3.2, PiStorm32/RPi CM4/Emu68)
Amiga 500 (rev 6A, ECS, KS 3.2, PiStorm/RPi 4B/Emu68)
Ryzen 9 7950X, DDR5-6000 64 GB RAM, GeForce RTX 4080 16 GB

michalsc 
Re: Integrating Warp3D into my 3D engine
Posted on 2-Mar-2025 8:53:54
#191
AROS Core Developer
Joined: 14-Jun-2005
Posts: 433
From: Germany

@matthey

Quote:
Someone else comes along and proposes that Amitown build their own stores


The people of Amitown agreed that this "Someone" was a well-known, old, grumpy Amitownian who came to the city only to tell others how great the new store would be - if it were ever built. Unfortunately for him, the people of Amitown lacked the ability to build their own stores because only they, in their tiny town, would benefit. The stores would grow to utterly massive sizes, leading to high costs and making necessities unbearably expensive.

So, the people of Amitown chose to kindly ignore the old grumpy Amitownian, knowing their own limitations.

And yet, the old grumpy man still stands at the crossing, telling people how great the new store would be.

Has he done anything to change the situation?

Nope - just talking.

And talking.

And talking.

Karlos 
Re: Integrating Warp3D into my 3D engine
Posted on 2-Mar-2025 10:22:36
#192
Elite Member
Joined: 24-Aug-2003
Posts: 4937
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition!

@michalsc

How feasible is it to perform additional optimisation passes on the translated ARM code looking for opportunities to reduce load to use penalties by instruction reordering?
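The kind of pass being asked about can be sketched in a few lines. This is a hypothetical toy, with an invented three-address instruction format: when an instruction consumes the result of the load immediately before it (a load-to-use stall on simple in-order ARM cores), it looks ahead for a safe independent instruction to hoist in between. A real pass would also have to respect memory ordering, condition flags, and register pressure.

```python
# Toy post-translation peephole pass: break load-to-use pairs by hoisting
# a later independent instruction between the load and its consumer.
# Instructions are (op, dest, srcs) tuples with srcs a tuple of registers.

def schedule(code):
    code = list(code)
    i = 1
    while i < len(code):
        prod, use = code[i - 1], code[i]
        if prod[0] == "load" and prod[1] in use[2]:      # stall pair found
            for j in range(i + 1, len(code)):
                op, dst, srcs = code[j]
                safe = (op not in ("load", "store")        # don't move memory ops
                        and prod[1] not in srcs            # doesn't use the load
                        and all(code[k][1] not in srcs     # reads no skipped result
                                and dst not in code[k][2]  # skipped insns don't read it
                                and dst != code[k][1]      # no write-after-write clash
                                for k in range(i, j)))
                if safe:
                    code.insert(i, code.pop(j))            # hoist it into the gap
                    break
        i += 1
    return code

before = [("load", "r0", ("r8",)),
          ("add",  "r1", ("r0", "r0")),   # stalls waiting on the load
          ("mov",  "r2", ("r9",))]        # independent: can fill the gap
after = schedule(before)
```

Here the `mov` is hoisted between the `load` and the `add`, hiding one cycle of load-to-use latency.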

_________________
Doing stupid things for fun...

michalsc 
Re: Integrating Warp3D into my 3D engine
Posted on 2-Mar-2025 12:25:26
#193
AROS Core Developer
Joined: 14-Jun-2005
Posts: 433
From: Germany

@Karlos

Some of the optimizations should definitely be possible; I will need to investigate the possibilities.

Karlos 
Re: Integrating Warp3D into my 3D engine
Posted on 2-Mar-2025 14:35:27
#194
Elite Member
Joined: 24-Aug-2003
Posts: 4937
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition!

@michalsc

I'm sure it's the best kind of nerdsnipe :D

_________________
Doing stupid things for fun...

cdimauro 
Re: Integrating Warp3D into my 3D engine
Posted on 7-Mar-2025 12:45:23
#195
Elite Member
Joined: 29-Oct-2012
Posts: 4295
From: Germany

A lot could be done, but a JIT is always a trade-off between how good/optimized the generated code is and its latency (i.e. how long it takes before the first instruction executes).

That's why I always prefer an AOT system: compile everything once and forget it. You can spend a lot of time carefully optimizing the generated code, because you can do the same things that compilers normally do.

People can object that a JIT can compile any code, whereas with AOT you need to know the structure (e.g. code boundaries), otherwise it cannot be applied.

True. But here we're talking about a retro system: the code is usually "static". It doesn't change. So it can be "mapped" by a hybrid system made up of an emulator and a "metadata collector": the more you execute the code, the better the chance of mapping all of it and getting a final picture that allows the AOT to be applied.
That's what I've called an "Amiga virtualizer" (for more than a decade now).
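The trade-off can be modeled in a few lines. This is purely illustrative: the block addresses, the `Jit` class, and the cost model are invented. A JIT pays translation cost at run time the first time each basic block is reached, while recording which blocks were ever executed produces exactly the metadata a later AOT pass needs to precompile that set offline.

```python
# Toy model: a JIT translation cache plus the "metadata collector"
# needed for a later AOT pass over the observed code.

class Jit:
    def __init__(self):
        self.cache = {}          # block address -> translated code
        self.translations = 0    # translation cost paid at run time

    def run_block(self, addr):
        if addr not in self.cache:
            self.translations += 1
            self.cache[addr] = f"arm_code@{addr:#x}"   # stand-in for real codegen
        return self.cache[addr]

# An execution trace revisiting the same blocks (invented addresses):
trace = [0x1000, 0x1040, 0x1000, 0x1080, 0x1000, 0x1040]
jit = Jit()
for addr in trace:
    jit.run_block(addr)

seen = set(trace)   # metadata for the AOT pass: blocks to precompile offline
```

Only the first visit to each block pays the translation cost; the set of seen blocks is the "final picture" the AOT pass would compile ahead of time.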

Regarding the last part of the discussion, maybe you haven't got it right.
I think Matt is well aware that solutions like PiStorm are real and already achieve far more than any 68k system, and the same goes for such ARM-based systems for running games.
Nothing to say here: they are concrete products, and kudos to the people who delivered. When I joined Intel I learned THE "keyword" there: "Delivery is the king!"

However, he's advocating for a 68k rebirth because he sees great potential. From this perspective, he's also right. I don't see a mutual exclusion here: there are cases where two positions can both be right / sustainable (i.e. not conflicting with each other).

He's right because this architecture family is simply great. Great on all the metrics that are important for a processor:
- top-notch code density;
- fewer executed instructions;
- fewer memory accesses;
- not so many instructions in the ISA (I mean the "core" / general-purpose instructions);
- not so many (nor so few) registers.

It's so damn difficult to get close (read: to a good balance of all those factors), believe me.

But this requires resources: people and money to revive the project, and then to extend and modernize it, because it needs changes.

That's the unfortunate point, and the reason why Matt is against other solutions (emulation included).

cdimauro 
Re: Integrating Warp3D into my 3D engine
Posted on 7-Mar-2025 13:06:05
#196
Elite Member
Joined: 29-Oct-2012
Posts: 4295
From: Germany

@Hammer

Quote:

Hammer wrote:
@Karlos

Quote:

Lisa was an expensive moonshot and was in a completely different price bracket than the A1000 on release. In terms of user experience there's still no contest: a monochrome machine with workstation ambitions versus a consumer machine with multiple supported resolutions, colour depths, stereo digital sound, preemptive multitasking, etc.

The Apple Lisa had a custom MMU, memory protection, virtual memory, 32-bit preemptive multitasking, a stable business platform, high-resolution monochrome graphics, a stock 1 MB of ECC RAM, and a WIMP GUI. Two hardware revisions were released. Apple's OS team demonstrated advanced 32-bit OS skills. The Lisa had mono PCM sound. LisaOS lacked Xenix's C2-rated security.

Apple's second attempt was with Apple Unix, which included Mac app compatibility for Macs equipped with a 68020 plus 68851 MMU, or a 68030.

The Lisa was turned into the Macintosh XL, with a name change and a switch to MacOS. MacOS was cut down to suit the Mac 128K's reduced hardware. A task switcher was added for the Mac 512K, which MS Excel depended on, and it was bundled with the OS's task-switcher update in 1985.

Right. Because a system with an MMU was EXPENSIVE (and NOT only for the hardware implementation).

That's the same reason why not only AmigaOS but also other systems / OSes did NOT use an MMU for a very long time.

Something which you still don't get because, like a bot, you don't understand and you're not able to contextualize.
Quote:
The MMU-based LisaOS was a major factor in Bill Gates' 32-bit 386-with-MMU argument, hence Xenix 386, Windows 2.0/386, and, starting in 1988, the Windows NT project (then known as OS/2 3.0, with a C2 security target). Windows NT combines higher-level Xenix 386 features with Windows userland layers.

When Steve Jobs left Apple, key Apple OS engineers followed Jobs to NeXTSTEP, which avoided an expensive commercial Unix license. Both Bill Gates and Steve Jobs disliked expensive Unix licenses.

Both Bill Gates and Steve Jobs led projects with 32-bit preemptive multitasking, memory protection, multi-user support, symmetric multiprocessing, sandboxed legacy environments, retargetable graphics, and at least C2-security-rated features for their respective platforms.

Amiga OCS's multiple resolutions didn't include a stable high-resolution graphics mode for business GUIs. Amiga OCS remained in TV modes, and the A2024 was a debacle at a 5000-unit production scale.

Apple started R&D on the Mac II's 256-color, business-resolution display capability in 1985. This capability was ready for Apple's early-1990s best-selling Mac LC series.

Without Steve Jobs' NeXTSTEP team, Apple tried to replace MacOS with Copland and failed.
Steve Jobs' OS team had an OS GUI design strength which underpins today's Mac OS X and iOS.

Amiga OCS's strength was turned into a weakness during the "next gen" transition phase.

I'll reveal a secret to you: MacOS X is VERY DIFFERENT from MacOS (which... did NOT... require an MMU! Now take some smelling salts to recover from the shock).

The same thing could have happened to the Amiga. There was simply not enough time to evolve the platform correctly. Sic et simpliciter.

Hint: the release date of Mac OS X speaks for itself.
Quote:
----------------
From the original key Amiga engineers who learned from AmigaOS mistakes,


the 3DO OS is a fully-featured 32-bit multitasking and real-time operating system written specifically for the 3DO by NTG. Developers *must* use the OS for a variety of reasons, the main one being to maintain compatibility with all 3DO consoles and future next-generation 3DO consoles.

The OS is loaded from an application's CD when the system starts up, and is not in the console's ROM. Contrary to rumors, there is no way to bypass the OS and "hack" directly on the hardware.

The 3DO OS consists of two parts:
1. A multitasking kernel with drivers for peripherals, a complete file system, and support for physical storage.

2. Several software "folios" that provide a link between application software and the 3DO hardware, and are designed to allow software compatibility as new versions of the hardware are developed. The following six subsystems make up the entire "Portfolio" of 3DO OS system calls:

The Decompression Folio: supports software and hardware decompression of audio and video data.
The Math Folio: performs many of the high level calculations.
The Graphics Folio: provides access to the 3DO's cel and display subsystems for doing graphics effects and animation. The effects include warping, transparency, lighting effects, anti-aliasing, and texture mapping.
The 3-D Folio: system code for creating 3-D effects and doing complex calculations.
The Audio Folio: supports the creation and manipulation of sound effects and music. This includes proprietary algorithms called "3D audio imaging" that create the illusion of sound coming not only from the left and right, but front and back (when wearing headphones). These algorithms can also produce Doppler effects and reverberations.
The File System Folio: manages the file system.



From https://users.polytech.unice.fr/~buffa/videogames/3do_faq2.4.html
The 3DO team created a game-centric low-level API that must be used.

Key original Amiga engineers were kicked out of Commodore-Amiga Inc when it was taken over by Commodore's system engineering group.

After intervention from a Japan Inc deep-state industrialist, the 3DO M2's progress was halted by Panasonic after they had paid $100 million for the 3DO M2 (PPC 602) design.

Samsung's 3DO group later designed the 3DO MX (PPC 602), which was purchased by MS and merged with the Xbox team. The Xbox project had advanced low-level Direct3D and OS kernel development, which later transferred to the PC as the low-level DirectX 12, which has much in common with AMD's low-level Mantle and Vulkan APIs. The Xbox 360 has a hardware microcode-engine version of Direct3D, which was carried over to the Xbox One and the Xbox Series X/S.

AMD's CEO, Lisa Su, was responsible for interfacing between IBM Microelectronics and the Sony PlayStation and MS Xbox (during the Xbox 360 era) divisions.

The Xbox 360 (2005) had major GCN HSA features such as CPU/GPU pointer exchange, async compute with 8 concurrent contexts (async compute is in the PC's DX12_0; Southern Islands GCN has two ACE units), rasterizer-ordered views (in the PC's DirectX 12_1), etc.

Bill Gates' MS assimilated 3DO.

The Vulkan API is capable of abstracting AMD GCN's low-level PM4 packets with high efficiency, as shown by the Steam Deck (1.6 TFLOPS FP32 RDNA 2 GPU) running the ShadPS4 emulator vs the PS4's 1.84 TFLOPS FP32 GCN 2.0.


References
https://www.neogaf.com/threads/3do-mx-chipset-the-technology-nintendo-almost-used-in-an-n64-successor-for-1999.350196/#post-14521193
MS merged the former CagEnt/3DO Systems team with the Xbox team as the primary group to attack Sony PlayStation.

Gaming PCs would later benefit from the Xbox's low-level Direct3D R&D as the PC's low-level DirectX 12.

Read: the 3DO was a NEW system written for the NEW standards and needs of its time.

In Italy, in such cases, we say something which roughly translates as: "you've discovered hot water".

Clap clap clap!

Karlos 
Re: Integrating Warp3D into my 3D engine
Posted on 7-Mar-2025 13:08:37
#197 ]
Elite Member
Joined: 24-Aug-2003
Posts: 4937
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition!

@cdimauro

Nobody here disagrees that the 68K is great. Well, apart from the hamster. Nobody is realistically saying a modern hardware refresh of the 68K line is not an exciting prospect. Of course it is.

It's just that it seems completely unrealistic that it will ever happen, so there's not much point lamenting it.

_________________
Doing stupid things for fun...

cdimauro 
Re: Integrating Warp3D into my 3D engine
Posted on 8-Mar-2025 7:14:16
#198 ]
Elite Member
Joined: 29-Oct-2012
Posts: 4295
From: Germany

@Karlos I don't agree with the last part: it's not completely unrealistic.

As I've said before, the primary problem is a lack of resources (money, primarily; experts can be found if the money is there upfront).

The 68k ISA is not perfect and carries some considerable legacy baggage. But IMO it can still be very competitive in some markets (albeit I have different opinions here, since I'm also biased by my own architectures).

But I would exclude the Amiga market from this. I mean: giving a new future to the Amiga platform, but evolving it.

Having more powerful CPUs is very good (as PiStorm has clearly shown), and so is enhancing some aspects (better / faster RTG, AHI, disks, etc.).

However, the OS cannot be enhanced, because it's fundamentally Broken-by-Design.
You can use it for some unimportant multimedia kiosk, for example, because it's tiny and super fast to load and to react; you can also add a watchdog timer to reset it when the system hangs for some reason.
But that's all you can do with it. There's no chance of using it for something more serious, like the OS of a modern TV, to give another example. It's too fragile and "insecure". No chance at all here, nor as a desktop / mainstream OS.
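The kiosk watchdog idea can be sketched in a few lines of C. This is a hypothetical, minimal illustration, not AmigaOS code; all names (`heartbeat`, `pet_watchdog`, `watchdog_check`) are invented:

```c
#include <stdbool.h>

/* The kiosk's main loop "pets" the watchdog on every iteration; a
 * supervisor (a timer interrupt on real hardware) checks that the
 * heartbeat is still advancing and requests a reset when it is not. */

static unsigned heartbeat;

/* Called by the main loop on each iteration. */
void pet_watchdog(void) { heartbeat++; }

/* Called periodically by the supervisor. Returns true when the main
 * loop has made no progress since the last check, i.e. the system
 * looks hung and should be reset. */
bool watchdog_check(unsigned *last_seen)
{
    if (heartbeat == *last_seen)
        return true;             /* no pets since last check: reset */
    *last_seen = heartbeat;
    return false;
}
```

On real hardware the reset would be forced by a hardware watchdog that reboots unless it is refreshed in time; the counter version above only models the detection logic.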

Karlos 
Re: Integrating Warp3D into my 3D engine
Posted on 8-Mar-2025 19:35:50
#199 ]
Elite Member
Joined: 24-Aug-2003
Posts: 4937
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition!

@cdimauro

It's unrealistic because, no matter how good the potential of the 68K ISA for embedded applications, there are already many competitors, including ARM and x86, not to mention all the microcontrollers for less demanding applications. The relative cost of ARM is now low enough that any hardware you get for such purposes is likely overkill already, and you probably care more about power efficiency than about overall performance for your embedded application.

I don't see a 68K hardware revival happening, other than as a vanity project from an eccentric with the resources to burn.

_________________
Doing stupid things for fun...

Hammer 
Re: Integrating Warp3D into my 3D engine
Posted on 8-Mar-2025 23:07:39
#200 ]
Elite Member
Joined: 9-Mar-2003
Posts: 6320
From: Australia

@cdimauro

Quote:
Right. Because a system with an MMU was EXPENSIVE (NOT only for the hardware implementation).

The same reason why not only the Amiga OS, but other systems / OSes have NOT used an MMU since very long time.

Something which you still don't get, because as a bot you don't understand and you're not able to contextualize.


Motorola made the MMU a premium part, and the 68851 was late, which caused Commodore to commit R&D resources to a 68000 MMU and a 68020 MMU. The 1st C= custom MMU worked, but it was slow and needed a TLB cache.

Commodore's memory-protected / multi-user OS development focused on aiding AT&T's expensively licensed Unix platform instead of AmigaOS. From 1988, MS focused on the in-house OS/2 3.0 project, aka Windows NT, to replace MS Xenix. From 1986, Steve Jobs focused on a free *nix kernel with value-added higher layers for NeXTSTEP, which later served as the foundation for Mac OS X.

Commodore's 2nd custom MMU was abandoned in favour of the 68851, which delayed the A2620's release. In 1988, Commodore's MMU engineer, Bob Welland, moved to Apple and focused on RISC ARM-based devices. The first true ARM MMU design was Bob Welland's, for Apple.

https://www.linkedin.com/in/bob-welland-1349193/details/experience/
At Apple, Bob Welland designed "ARM MMU Architecture".

Bob Welland argued for MMU for the masses and RISC during his time at Commodore.

Both 16-bit Windows / Win32s and AmigaOS have a shared memory address space.
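What a shared address space without memory protection implies can be shown in a minimal C sketch. The two "tasks" below are just fields in one flat memory region (all names are invented for illustration); when one task's buggy write overruns its own buffer, it silently corrupts the other task's state, and there is no MMU to trap it:

```c
#include <string.h>

/* One flat address space, as on AmigaOS or Win16: both "tasks" live
 * in the same unprotected memory. */
struct flat_memory {
    char task_a_buffer[16];  /* owned by task A */
    char task_b_state[16];   /* owned by task B */
};

/* Task A writes 32 bytes starting at its own 16-byte buffer; with no
 * MMU, the overrun silently lands in task B's state. Returns the
 * first byte of task B's state afterwards. */
char simulate_overrun(void)
{
    static struct flat_memory mem;
    strcpy(mem.task_b_state, "important");
    memset(&mem, 'X', 32);       /* wild write; nothing traps it */
    return mem.task_b_state[0];  /* 'X': task B's data is gone */
}
```

On an MMU-backed OS, the equivalent cross-process write would fault instead of corrupting the neighbour.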

Win16 was boxed into a virtual machine on the memory-protected Windows NT / Windows 95.
MS boxed their legacy software environments, e.g. Windows NT's NTVDM (NT Virtual DOS Machine).

In 2003, Microsoft purchased Virtual PC from Connectix, a virtualization specialist, which led to Hyper-V.

When the PC and Mac switched to C2-rated, memory-protected / multi-user 32-bit OSes, both platforms were ready.

Windows 3.1's enhanced mode was a 32-bit protected-mode virtual machine manager that ran Windows 3.1 standard mode, Win32s, and DOS boxes as virtual machines. Windows 3.1 and Win32s share one memory address space with no memory protection. Windows NT 3.1 boxed the Win16 environment. My point: MS mastered virtual machine software tech.

Go read the book Commodore: The Final Years, you ignoramus.

The PiStorm's Emu68 software is designed as a hypervisor and has high resistance against Amiga software crashes.

32-bit ARM's birth was mostly due to Commodore's (Jack Tramiel's) slow 8-bit 65xx R&D road maps.

ARM's MMUs were due to Commodore's (Henri Rubin's) stance against an MMU for the masses. Commodore management didn't plan for a next-generation C2/POSIX-rated AmigaOS.

Commodore didn't keep up with the competition, and ex-Commodore engineers aided the competition.


Quote:

I reveal you a secret: MacOS X is VERY DIFFERENT from MacOS (which... did NOT... require an MMU! Now take some salts to recover from the shock).

The same thing could have happened to the Amiga. There was simply (!) not enough time to correctly evolve the platform. Sic et simpliciter.

Don't assume I don't know that MacOS X is different from MacOS.

Both MS and Apple (Steve Jobs) boxed their legacy software environments.

PowerPC versions of Mac OS X up to and including Mac OS X 10.4 Tiger include a compatibility layer for running older Mac applications: the Classic Environment (aka the Blue Box).

Read https://en.wikipedia.org/wiki/Sandbox_(computer_security)
Sandboxes may be seen as a specific example of virtualization

Your assumption is wrong.

Last edited by Hammer on 09-Mar-2025 at 12:08 AM.

_________________
Amiga 1200 (rev 1D1, KS 3.2, PiStorm32/RPi CM4/Emu68)
Amiga 500 (rev 6A, ECS, KS 3.2, PiStorm/RPi 4B/Emu68)
Ryzen 9 7950X, DDR5-6000 64 GB RAM, GeForce RTX 4080 16 GB


Copyright (C) 2000 - 2019 Amigaworld.net.
Amigaworld.net was originally founded by David Doyle