Poster | Thread |
Hammer
 |  |
Re: Packed Versus Planar: FIGHT Posted on 22-Nov-2022 5:21:29
| | [ #721 ] |
|
|
 |
Elite Member  |
Joined: 9-Mar-2003 Posts: 4641
From: Australia | | |
|
| @pixie
Quote:
pixie wrote: @Karlos
From Quake I got:
AGA: 345fps from low res 320x256 256, 109fps from productivity mode 640x480 256, 146fps from HAM low res
RTG: 397fps from low res 320x256 256, 171fps from 640x480 256
|
For Quake, it's unlikely that Pistorm32-Emu68-RPI CM4 on the A1200's AGA would reach 60 fps at 320x200/240/245 resolution.
_________________ Ryzen 9 7900X, DDR5-6000 32 GB RAM, GeForce RTX 4080 Amiga 1200 (rev 1D1, KS 3.2, TF1260, 68060 @ 63 Mhz, 128 MB) Amiga 500 (rev 6A, KS 3.2, PiStorm/RPi3a/Emu68) |
|
Status: Offline |
|
|
pixie
 |  |
Re: Packed Versus Planar: FIGHT Posted on 22-Nov-2022 22:17:07
| | [ #722 ] |
|
|
 |
Elite Member  |
Joined: 10-Mar-2003 Posts: 2821
From: Figueira da Foz - Portugal | | |
|
| |
Status: Offline |
|
|
Hammer
 |  |
Re: Packed Versus Planar: FIGHT Posted on 22-Nov-2022 22:49:59
| | [ #723 ] |
|
|
 |
Elite Member  |
Joined: 9-Mar-2003 Posts: 4641
From: Australia | | |
|
| @pixie
I haven't tested Sam's Quake benchmark with demo1.

On a single thread, the gain from the RPI 4B's out-of-order ARM Cortex-A72 over the RPI 3B+'s in-order ARM Cortex-A53 is relatively minor, going by https://ibug.io/blog/2019/09/raspberry-pi-4-review-benchmark/
I usually follow PC's Quake "timedemo demo3" (320x200 resolution with full UI) benchmarks from https://thandor.net/benchmark/33
Last edited by Hammer on 22-Nov-2022 at 11:12 PM.
|
Status: Offline |
|
|
Hammer
 |  |
Re: Packed Versus Planar: FIGHT Posted on 1-Dec-2022 0:46:44
| | [ #724 ] |
|
|
 |
Elite Member  |
Joined: 9-Mar-2003 Posts: 4641
From: Australia | | |
|
| @Gunnar
Quote:
Gunnar wrote:
Today all good FPUs are fully pipelined. All FPU operations still need several clock cycles to finish, but you can start a new FPU instruction every clock! This means you have several instructions in flight in parallel. All modern FPUs work like this - they are all pipelined. On POWER, on INTEL, on ARM, on 68080 - all modern FPUs work like this.
Typically today's FPUs have about 6 or more operations in flight. The 68080 can have up to 22 FPU operations in flight in parallel!
|
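To put the "in flight" idea from the quote into numbers, here is a rough sketch. It assumes every operation is independent (no data dependencies), a fixed latency, and one issue per clock; the latency of 6 is only picked to echo the "about 6" in the quote, it is not a measured figure for any real FPU, and the function names are made up for illustration.

```python
def cycles_unpipelined(n_ops, latency):
    """Each operation must finish before the next one can start."""
    return n_ops * latency


def cycles_pipelined(n_ops, latency):
    """One new operation starts every cycle; the last one finishes
    `latency` cycles after it was issued."""
    if n_ops == 0:
        return 0
    return latency + n_ops - 1


def max_in_flight(n_ops, latency):
    """With one issue per cycle, at most `latency` operations overlap."""
    return min(n_ops, latency)


# Hypothetical example: 100 independent operations, 6-cycle latency.
N_OPS, LATENCY = 100, 6
print("unpipelined cycles:", cycles_unpipelined(N_OPS, LATENCY))
print("pipelined cycles:  ", cycles_pipelined(N_OPS, LATENCY))
print("max in flight:     ", max_in_flight(N_OPS, LATENCY))
```

The point of the sketch is that the in-flight count of a simple pipeline is tied to its latency, so a bigger number on its own says little about throughput.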
AMD K19.5 Zen 4's reorder buffer is 320 entries deep, which accommodates the tracking of a large number of instructions in flight.
AMD K8 Sledge-Hammer's reorder buffer allows the instruction control unit to track and monitor up to 72 in-flight macro-ops (whether integer or floating-point).
AMD refers to its simplified fixed-length operations as macro-ops (sometimes also Complex-Op or COPs). In AMD's context, a macro-op is a fixed-length operation that may be composed of a memory operation and an arithmetic operation. Fixed-length operations are one of RISC's design ideologies. Intel, on the other hand, uses macro-ops to mean the variable-length x86 instructions themselves.
Last edited by Hammer on 01-Dec-2022 at 12:58 AM.
|
Status: Offline |
|
|
bhabbott
|  |
Re: Packed Versus Planar: FIGHT Posted on 1-Dec-2022 7:03:29
| | [ #725 ] |
|
|
 |
Regular Member  |
Joined: 6-Jun-2018 Posts: 251
From: Aotearoa | | |
|
| Quote:
Hammer wrote:
AMD refers to the more simplified fixed-length operation as macro-ops... Intel refers to the variable-length x86 instructions as macro-ops. |
Why can't these people agree on the definition of the terms they use?
Anyway, good to hear that the 68080 uses techniques similar to modern CPUs. Those guys must really know their stuff!
|
|
Status: Offline |
|
|
michalsc
|  |
Re: Packed Versus Planar: FIGHT Posted on 1-Dec-2022 20:58:17
| | [ #726 ] |
|
|
 |
AROS Core Developer  |
Joined: 14-Jun-2005 Posts: 346
From: Germany | | |
|
| @Hammer
Quote:
Here you go:
320x200: 73.8 FPS
320x240: 66.3 FPS
320x256: 63.1 FPS
640x480: 29.5 FPS
Emu68 0.11, PiStorm600, CM4, Amiga600 (2MB CHIP, 1.8GB FAST) |
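For anyone comparing the resolutions, a small sketch that turns the FPS figures above into frame time and pixels drawn per second. The numbers are just the four results listed; the helper names are made up for illustration and nothing here is part of Quake or Emu68.

```python
# Resolution -> measured FPS, copied from the results above.
RESULTS = {
    (320, 200): 73.8,
    (320, 240): 66.3,
    (320, 256): 63.1,
    (640, 480): 29.5,
}


def frame_time_ms(fps):
    """Average time spent on one frame, in milliseconds."""
    return 1000.0 / fps


def pixels_per_second(width, height, fps):
    """Raw pixel throughput implied by a resolution and frame rate."""
    return width * height * fps


for (width, height), fps in RESULTS.items():
    print(f"{width}x{height}: {frame_time_ms(fps):.2f} ms/frame, "
          f"{pixels_per_second(width, height, fps) / 1e6:.2f} Mpixel/s")
```

Pixel throughput is not constant across the modes, which suggests a fixed per-frame cost on top of the per-pixel work, though the sketch alone can't say where that cost comes from.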
|
Status: Offline |
|
|
Hammer
 |  |
Re: Packed Versus Planar: FIGHT Posted on 2-Dec-2022 10:02:35
| | [ #727 ] |
|
|
 |
Elite Member  |
Joined: 9-Mar-2003 Posts: 4641
From: Australia | | |
|
| @michalsc
Quote:
michalsc wrote: @Hammer
Quote:
Here you go:
320x200: 73.8 FPS
320x240: 66.3 FPS
320x256: 63.1 FPS
640x480: 29.5 FPS
Emu68 0.11, PiStorm600, CM4, Amiga600 (2MB CHIP, 1.8GB FAST)
|
Have you tried Amiga's HAM mode with Samuel Devulder's Quake build?
For the A1200, I have pre-ordered a CM4.
Last edited by Hammer on 02-Dec-2022 at 10:26 AM.
|
Status: Offline |
|
|
Hammer
 |  |
Re: Packed Versus Planar: FIGHT Posted on 2-Dec-2022 10:16:44
| | [ #728 ] |
|
|
 |
Elite Member  |
Joined: 9-Mar-2003 Posts: 4641
From: Australia | | |
|
| @bhabbott
Quote:
bhabbott wrote:
Why can't these people agree on the definition of the terms they use?
Anyway, good to hear that the 68080 uses techniques similar to modern CPUs. Those guys must really know their stuff!
|
Different companies have their own terminologies and culture.
The 68060's FPU is not pipelined and the CPU doesn't do out-of-order processing, hence it has no reorder buffer hardware.
The function of the reorder buffer is to put the instructions back in the original program order after the instructions have finished execution possibly out of order. The reorder buffer maintains an ordered list of the instructions. Instructions are added at one end of the list when they are dispatched and they are removed from the other end of the list when they are completed. In this way, instructions will be completed in the same order as they were dispatched.
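As a rough illustration of that description, here is a toy reorder buffer. It is only a sketch of the general idea, not a model of any AMD, Intel or 68080 design; the class name, the capacity of 4 and the instruction names are all made up.

```python
from collections import deque


class ReorderBuffer:
    """Toy reorder buffer: entries enter and leave in program order."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = deque()  # oldest instruction sits on the left

    def dispatch(self, name):
        """Add an instruction at the tail; returns False if the buffer is full."""
        if len(self.entries) >= self.capacity:
            return False
        self.entries.append({"name": name, "done": False})
        return True

    def finish(self, name):
        """Mark an instruction as executed; this can happen in any order."""
        for entry in self.entries:
            if entry["name"] == name and not entry["done"]:
                entry["done"] = True
                return
        raise KeyError(name)

    def retire(self):
        """Remove finished instructions from the head, stopping at the
        first one that is still executing."""
        retired = []
        while self.entries and self.entries[0]["done"]:
            retired.append(self.entries.popleft()["name"])
        return retired


rob = ReorderBuffer(capacity=4)
for insn in ["fdiv", "fadd", "fmul"]:
    rob.dispatch(insn)

rob.finish("fadd")     # a younger instruction finishes first
early = rob.retire()   # nothing retires: fdiv at the head is still pending
rob.finish("fmul")
rob.finish("fdiv")
late = rob.retire()    # all three retire, in program order
print(early, late)
```

The capacity is what limits how many instructions can be tracked in flight at once, which is why the reorder buffer depth gets quoted in these comparisons.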
What annoyed me is that Gunnar played down the modern X86 CPU's instructions-in-flight capability while playing up the AC68080 FPU's instructions-in-flight capability.
Quote:
Gunnar wrote:
Typically today's FPUs have about 6 or more operations in flight. The 68080 can have up to 22 FPU operations in flight in parallel!
|
Last edited by Hammer on 02-Dec-2022 at 10:17 AM.
|
Status: Offline |
|
|