Poster | Thread |
Karlos
|  |
Re: Packed Versus Planar: FIGHT Posted on 9-Nov-2022 12:40:22
| | [ #701 ] |
|
|
 |
Elite Member  |
Joined: 24-Aug-2003 Posts: 4930
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition! | | |
|
| The Akiko could've been better. At the very least, if it had some bitplane pointers internally and the ability to write to chip ram by itself using some incremental addressing of these pointers it could've been quite useful. You'd just have to set up the pointers and then keep writing packed data to it from the CPU and let it write the converted data to chip ram.
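To make the conversion under discussion concrete, here is a minimal software sketch of packed-to-planar (C2P) conversion for one group of pixels. It is not Akiko's register interface; the group size of 32 eight-bit pixels, the bit ordering, and the function name `c2p_32` are assumptions chosen for illustration.

```python
def c2p_32(pixels):
    """Convert 32 packed 8-bit pixels into 8 planar 32-bit words.

    Returns a list of 8 integers; entry p holds bit p of every pixel.
    """
    if len(pixels) != 32:
        raise ValueError("expected exactly 32 pixels")
    planes = [0] * 8
    for x, value in enumerate(pixels):
        for plane in range(8):
            bit = (value >> plane) & 1
            # Assumed ordering: leftmost pixel lands in the most significant bit.
            planes[plane] |= bit << (31 - x)
    return planes


if __name__ == "__main__":
    for plane, word in enumerate(c2p_32(list(range(32)))):
        print(f"plane {plane}: {word:08X}")
```

Each of the 8 output words would then go to a different bitplane in chip RAM, which is the write step that the hypothetical internal bitplane pointers would have taken off the CPU.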
As it was, you had to write your packed pixels to it, read back the planar data and then write that off to chip RAM yourself. It was advertised as being as fast at C2P as a decent 040, but with all the data movement overhead, I don't know how realistic that claim was. _________________ Doing stupid things for fun... |
|
Status: Offline |
|
|
pixie
 |  |
Re: Packed Versus Planar: FIGHT Posted on 9-Nov-2022 20:03:56
| | [ #702 ] |
|
|
 |
Elite Member  |
Joined: 10-Mar-2003 Posts: 3447
From: Figueira da Foz - Portugal | | |
|
| @Karlos
SAGA is its own thing; I was talking about an implementation based on the original schematics, perhaps more akin to the cores being used on FPGAs. _________________ Indigo 3D Lounge, my second home. The Illusion of Choice | Am*ga |
|
Status: Offline |
|
|
Karlos
|  |
Re: Packed Versus Planar: FIGHT Posted on 9-Nov-2022 20:47:22
| | [ #703 ] |
|
|
 |
Elite Member  |
Joined: 24-Aug-2003 Posts: 4930
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition! | | |
|
| |
Status: Offline |
|
|
Hammer
 |  |
Re: Packed Versus Planar: FIGHT Posted on 10-Nov-2022 2:20:58
| | [ #704 ] |
|
|
 |
Elite Member  |
Joined: 9-Mar-2003 Posts: 6320
From: Australia | | |
|
| @pixie
Quote:
pixie wrote: @Hammer
Quote:
Ed Hepler left the AAA group and went on to start the Hombre project. |
How I would love to see a WhatIf™ FPGA machine that would allow to bring back these projects back to life, such as AA3000+ AGA DSP, and flexible enough we could run multiple such machines on it.
|
I'm aware of AA3000+ AGA DSP.
For 1993, with an A3000/A4000 motherboard and case design, Commodore wouldn't have been able to match Apple's $1000 USD entry-priced Quadra 605 (with 68LC040 @ 25 MHz), which competed directly against similarly priced 486SX-33 PC clones.
For 1993, there was a large price gap between the base A1200 and the A4000/030.
Competition in 1993 in the large US market:
https://vintageapple.org/pcworld/pdf/PC_World_9308_August_1993.pdf Gateway Party List, Page 62 of 324
4SX-33 with 486SX @ 33 MHz, 4MB RAM, 212MB HDD, Windows Video accelerator 1MB video DRAM, 14-inch monitor for $1,495.
4DX-33 with 486DX @ 33 MHz, 8MB RAM, 212MB HDD, Windows Video accelerator 1MB video DRAM, 14-inch monitor for $1,795.
or
Page 292 of 324, from Comtrade: VESA Local Bus WinMax with 32-Bit VL-Bus Video Accelerator 1MB, 486DX2 @ 66 MHz, 210MB HDD, 4MB RAM, Price: $1795
or
https://vintageapple.org/pcworld/pdf/PC_World_9310_October_1993.pdf October 1993, Page 13 of 354, ALR Inc, Model 1 has a Pentium 60-based PC clone for $2495.
VS
https://archive.org/details/amiga-world-1993-10/page/n7/mode/2up Amigaworld, October 1993
Page 66 of 104:
Amiga 4000/040 @ 25 MHz for $2299
Amiga 4000/030 @ 25 MHz for $1599
Page 82 of 104:
Microbotics M1230X 68030 @ 50 MHz for $349
C= 1942 monitor for $389
C= A1200 with 85MB HDD for $624
C= A1200 with 130MB HDD for $724
$624 + $349 + $389 = $1362.
Gateway's 486-based PC offers have beaten Commodore's A4000 offers.
A1200 with M1230X 68030 @ 50Mhz is not "cost vs performance" competitive when compared to Gateway's 486SX-based PC offers.
Microbotics doesn't have Commodore's economies of scale.
https://archive.org/details/amiga-world?and[]=year%3A%221993%22 Amiga World Magazine (November 1993), page 58 of 100:
A1200, price $379
A3000, 5MB, 105MB HDD, price $899
A3000T/030, 5MB, 200MB HDD, price $1199
A3000T/040, 5MB, 200MB HDD, price $1599
Taking the price difference between the A3000T/040 and the A3000T/030 as the cost estimate for Commodore's A3640 68040 card: $1599 - $1199 = about $400.
A1200's $379 + 040 card's $400 = $779.
Commodore could have offered a pre-configured A1200 SKU with a 68LC040 at 25 MHz for slightly above $779 (i.e. adding 4MB of Fast RAM and an HDD) and competed against Gateway's 486SX-33-based PC offerings.
Commodore's economies of scale delivered a $400 A3640 card with a full 68040 CPU. Commodore needed to partner with a third-party engineering company that could design a 68LC040 accelerator card for the A1200, backed by Commodore's purchasing strength. This would have minimized the bleed to the 486SX PC price segment.
An A1200 with a 68LC040 accelerator card would not have competed against the A4000's Video Toaster and FPU-dependent (68882, 68040's FPU) markets. The full 68040 competes against the 486DX.
The A1200 68LC040 accelerator card infrastructure, with a 3.3V modification, could have been reused for 68LC060, 68060, and 68040V SKUs.
In the 1990s, there was no 68K cloner to continue after the 68060 design, i.e. 68K was missing its "AMD second-source insurance". Apollo-Core didn't exist in the 1990s.
IBM's prudent requirement for a second X86 source was a good move that ensured X86's survival after Intel attempted to kill X86 with IA-64 Itanium.
Last edited by Hammer on 10-Nov-2022 at 02:27 AM.
_________________ Amiga 1200 (rev 1D1, KS 3.2, PiStorm32/RPi CM4/Emu68) Amiga 500 (rev 6A, ECS, KS 3.2, PiStorm/RPi 4B/Emu68) Ryzen 9 7950X, DDR5-6000 64 GB RAM, GeForce RTX 4080 16 GB |
|
Status: Offline |
|
|
Hammer
 |  |
Re: Packed Versus Planar: FIGHT Posted on 10-Nov-2022 2:42:12
| | [ #705 ] |
|
|
 |
Elite Member  |
Joined: 9-Mar-2003 Posts: 6320
From: Australia | | |
|
| @Karlos
Quote:
Karlos wrote: The Akiko could've been better. At the very least, if it had some bitplane pointers internally and the ability to write to chip ram by itself using some incremental addressing of these pointers it could've been quite useful. You'd just have to set up the pointers and then keep writing packed data to it from the CPU and let it write the converted data to chip ram.
As it was, you had to write your packed pixels to it, read back the planar data and then write that off to chip RAM yourself. It was advertised as being as fast at C2P as a decent 040, but with all the data movement overhead, I don't know how realistic that claim was.
|
Chip RAM gimped 68EC020's memory access.
From Cammy's Doom benchmarks via https://forum.amiga.org/index.php?topic=51616.0
A1200 030/50 MHz 32MB (Optimised 020 C2P) - 8481 realtics (8.8 fps)
Amiga CD32 68020/14 MHz 8MB (Optimised 020 C2P) - 18971 realtics (3.9 fps)
Amiga CD32 68020/14 MHz 8MB (Optimised Akiko C2P) - 12872 realtics (5.8 fps)
----------
With an A1200 with 030/50 MHz and Fast RAM, Doom in low-detail mode would have delivered sufficient gameplay frame rates, i.e. similar to a PC clone with a 386DX-40 and ET4000AX.
Cammy's CD32 has 8 MB of Fast RAM.
Without sufficient CPU power, a 68020 @ 14 MHz is too slow despite Akiko C2P's existence.
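The relative gaps in the benchmark figures above can be checked with a short calculation. This is only a sketch: it assumes Doom's timedemo relation fps = gametics * 35 / realtics (35 Hz being Doom's tick rate), and the dictionary labels and function names are made up for illustration. The realtics and fps values are the ones quoted in this post.

```python
TICRATE = 35  # Doom game ticks per second

# (realtics, reported fps) as quoted from the benchmark list
results = {
    "A1200 030/50, 020 C2P": (8481, 8.8),
    "CD32 020/14, 020 C2P": (18971, 3.9),
    "CD32 020/14, Akiko C2P": (12872, 5.8),
}


def implied_gametics(realtics, fps):
    """Demo length in game ticks implied by a realtics/fps pair."""
    return fps * realtics / TICRATE


def speedup(slow_realtics, fast_realtics):
    """How many times faster the run with fewer realtics was."""
    return slow_realtics / fast_realtics


if __name__ == "__main__":
    for name, (realtics, fps) in results.items():
        print(f"{name}: about {implied_gametics(realtics, fps):.0f} gametics")
    print(f"Akiko vs 020 C2P on CD32: {speedup(18971, 12872):.2f}x")
    print(f"030/50 vs CD32 with Akiko: {speedup(12872, 8481):.2f}x")
```

On these numbers Akiko buys the CD32 roughly a 1.5x improvement, while the 030/50 is still about 1.5x faster again, which is consistent with the CPU being the main limit.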
An A1200 with a 030/50 MHz accelerator card has a cost vs performance issue.
Last edited by Hammer on 10-Nov-2022 at 02:50 AM. Last edited by Hammer on 10-Nov-2022 at 02:45 AM.
|
Status: Offline |
|
|
cdimauro
|  |
Re: Packed Versus Planar: FIGHT Posted on 10-Nov-2022 5:25:38
| | [ #706 ] |
|
|
 |
Elite Member  |
Joined: 29-Oct-2012 Posts: 4274
From: Germany | | |
|
| @bhabbott
Quote:
bhabbott wrote: @cdimauro
Quote:
cdimauro wrote:
There are several primitives that were deeply analyzed in my article, but let's focus on the conclusion about the most important one for an Amiga: cookie-cut. Here it is:
Packed: it ranges from a minimum of 4 (memory; ndr) accesses to a maximum of 8.
Planar: This ranges from a minimum of 12 accesses (memory; ndr) to a maximum of 24.
Conclusions: As we can see, for this last analyzed primitive the planar format is even more penalized than the packed one, because the more bitplanes there are, the more accesses are needed to read the same mask over and over: an enormous waste, considering that the mask is always the same.
As you can see I've also used the same word: enormous. |
I'm skeptical though. You say that with planar there is a minimum of 12 '(memory; ndr) accesses' (whatever that means), but with packed the minimum is 4. However 1 bitplane is the same for packed and planar - they are identical - so how can there be a difference in the number of 'accesses'? |
What do you expect if you don't read the article? I've reported the conclusions, but the analysis before them gives the information that YOU are missing. Quote:
I have more enjoyable things to do than checking your math. |
Then at least don't complain if you don't read the article, because obviously you cannot understand it. Quote:
Provide some code we can run to quantify the 'ENORMOUS waste of bandwidth', or I will remain skeptical. |
Code is required for quiche eaters that don't read articles. |
|
Status: Offline |
|
|
pixie
 |  |
Re: Packed Versus Planar: FIGHT Posted on 10-Nov-2022 9:02:28
| | [ #707 ] |
|
|
 |
Elite Member  |
Joined: 10-Mar-2003 Posts: 3447
From: Figueira da Foz - Portugal | | |
|
| @Karlos
Quote:
Sure, but what I mean is it's something built in the spirit of "what if?". Imagine it. Build it. |
I'm no Steve Jobs to create products out of thin air, worse still I'm no Wozniak either, sadly!
I know my limits, which is fine, I guess! |
|
Status: Offline |
|
|
Karlos
|  |
Re: Packed Versus Planar: FIGHT Posted on 10-Nov-2022 10:10:32
| | [ #708 ] |
|
|
 |
Elite Member  |
Joined: 24-Aug-2003 Posts: 4930
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition! | | |
|
| @pixie
Quote:
I know my limits, which is fine, I guess! |
Likewise, which is why I decided to implement an entirely virtual system for MC64K. Currently working on a set of virtual modular synthesis components for the audio and loving it. |
|
Status: Offline |
|
|
Hammer
 |  |
Re: Packed Versus Planar: FIGHT Posted on 19-Nov-2022 4:06:15
| | [ #709 ] |
|
|
 |
Elite Member  |
Joined: 9-Mar-2003 Posts: 6320
From: Australia | | |
|
| @Karlos From PiStorm Discord channel
Pixie: Can anyone try Doom/Quake on their AGA machines? Just so one can see how bottlenecked AGA actually is
Mr Z: Tested that with @mschulz at A37 on an A1200 with PiStorm32 (Pi4). We got 38 fps if I remember correctly; that's half the fps you get on PiStorm with Pi3 + RTG |
|
Status: Offline |
|
|
Karlos
|  |
Re: Packed Versus Planar: FIGHT Posted on 19-Nov-2022 9:09:02
| | [ #710 ] |
|
|
 |
Elite Member  |
Joined: 24-Aug-2003 Posts: 4930
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition! | | |
|
| @Hammer
So it's not clear at all whether they tested Doom or Quake there. They just report 38fps but don't say for which. Quake spends way more time calculating what to draw than Doom does. |
|
Status: Offline |
|
|
pixie
 |  |
Re: Packed Versus Planar: FIGHT Posted on 19-Nov-2022 11:01:08
| | [ #711 ] |
|
|
 |
Elite Member  |
Joined: 10-Mar-2003 Posts: 3447
From: Figueira da Foz - Portugal | | |
|
| @Karlos
I get absurd numbers on WinUAE; I just don't know whether the speed is due to memory being way faster, the emulation benefiting AGA... So I thought it would be interesting to have PiStorm numbers, since PiStorm is actually the fastest CPU on an Amiga and doesn't introduce further variables as Vampire would. Any news on Buffee btw? |
|
Status: Offline |
|
|
Karlos
|  |
Re: Packed Versus Planar: FIGHT Posted on 19-Nov-2022 11:24:17
| | [ #712 ] |
|
|
 |
Elite Member  |
Joined: 24-Aug-2003 Posts: 4930
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition! | | |
|
| @pixie
It's all of it. In any case, AGA's memory bandwidth isn't the bottleneck for 320x200 provided it's not contended by anything else, and that's a hill I'll die on. You can do about 7MB/s write to chip memory using 32-bit transfers. The maths is pretty simple: a 320x200 frame at one byte per pixel is 64000 bytes, so 7*1024*1024/(320*200) ≈ 115fps
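The same arithmetic as a small sketch, so other resolutions can be plugged in. The 7MB/s figure is the one quoted above; one byte per pixel and the function name `max_fps` are assumptions made for illustration.

```python
def max_fps(bandwidth_mib_per_s, width, height, bytes_per_pixel=1):
    """Upper bound on full-frame updates per second for a given write bandwidth."""
    frame_bytes = width * height * bytes_per_pixel
    return bandwidth_mib_per_s * 1024 * 1024 / frame_bytes


if __name__ == "__main__":
    print(f"320x200: {max_fps(7, 320, 200):.1f} fps")
    print(f"320x256: {max_fps(7, 320, 256):.1f} fps")
```

This is a ceiling for uncontended writes only; it says nothing about the time spent rendering or converting the frame.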
However, this is not achievable in reality, since we aren't just doing 32-bit memory writes to chip RAM, and chip RAM will be accessed by other things too. The actual memory limit will be down to the C2P, which has to read fast memory, chew on it and write to chip RAM. Optimistically, this will only reach the Fast to Chip copy speed (which is basically "lightspeed" for this operation). |
|
Status: Offline |
|
|
pixie
 |  |
Re: Packed Versus Planar: FIGHT Posted on 19-Nov-2022 11:30:42
| | [ #713 ] |
|
|
 |
Elite Member  |
Joined: 10-Mar-2003 Posts: 3447
From: Figueira da Foz - Portugal | | |
|
| @Karlos
I am not on my computer right now, but I think that despite Quake being harder, it actually managed a higher frame rate; perhaps its C2P routine was better optimized. |
|
Status: Offline |
|
|
Karlos
|  |
Re: Packed Versus Planar: FIGHT Posted on 19-Nov-2022 11:48:54
| | [ #714 ] |
|
|
 |
Elite Member  |
Joined: 24-Aug-2003 Posts: 4930
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition! | | |
|
| @pixie
I'd be surprised if it was a C2P making the difference. One thing which Quake has up its sleeve that makes a big difference to the performance is mip mapping.
Consider how Doom draws walls, for example. Traditionally, it draws them column by column. The textures are rotated 90 degrees such that drawing down the column is sampling across the texture. Unless you are using a specialised buffer for the output, you are writing bytes a scanline apart for each point and that's not particularly cache friendly. For close up walls, you are sampling the texture at incremental addresses and that is cache friendly, but as you draw walls further away you're sampling points further and further apart, incurring misses.
Quake fills in polygons as spans (iirc) so most of the time you're writing pixels that are adjacent. Texture wise, sampling orientation may vary somewhat. However, by using mip maps, the further away the polygon is, the smaller the overall texture used, and you're always sampling texels reasonably close together in memory.
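A rough way to picture the difference is to count how many distinct cache lines each access pattern touches. The sketch below is a toy model, not either engine's code: the 320-byte pitch, the 16-byte cache line, the step of 8 texels for a distant unmipped surface, and all names are assumptions for illustration.

```python
def strided(start, step, count):
    """Byte addresses touched when stepping through memory at a fixed stride."""
    return [start + i * step for i in range(count)]


def cache_lines_touched(addresses, line_size=16):
    """Number of distinct cache lines covering the given byte addresses."""
    return len({addr // line_size for addr in addresses})


PITCH = 320  # bytes per scanline of an assumed 320-wide, 8-bit buffer

patterns = {
    "column writes (one scanline apart)": strided(10, PITCH, 64),
    "span writes (adjacent pixels)": strided(50 * PITCH, 1, 64),
    "distant texels, no mip map (step 8)": strided(0, 8, 64),
    "distant texels, smaller mip level (step 1)": strided(0, 1, 64),
}

if __name__ == "__main__":
    for name, addresses in patterns.items():
        print(f"{name}: {cache_lines_touched(addresses)} cache lines for 64 accesses")
```

In this model 64 column writes touch 64 separate lines while 64 span writes touch 4, and dropping to a smaller mip level brings the texel reads back to the same dense pattern.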
Overall then, the case can be made that Quake's renderer is significantly better for caches than Doom's. Try changing the mip mapping parameters in the console to see the impact; it's pretty severe on the original era hardware if you set it so that mip mapping isn't applied on distant surfaces. |
|
Status: Offline |
|
|
Hammer
 |  |
Re: Packed Versus Planar: FIGHT Posted on 19-Nov-2022 23:06:57
| | [ #715 ] |
|
|
 |
Elite Member  |
Joined: 9-Mar-2003 Posts: 6320
From: Australia | | |
|
| @Karlos
With a PiStorm-Emu68-RPi3A+ accelerator, demo3 in Sam's Quake build runs at 19+ fps in the Amiga 500's HAM mode (6 bitplanes). The same benchmark with the PiStorm-Emu68-RPi3A+'s RTG runs at 64 fps.
I have Pi CM4 + incoming for A1200 and I'm waiting for PiStorm32.
Last edited by Hammer on 19-Nov-2022 at 11:12 PM. Last edited by Hammer on 19-Nov-2022 at 11:09 PM.
|
Status: Offline |
|
|
pixie
 |  |
Re: Packed Versus Planar: FIGHT Posted on 21-Nov-2022 11:41:18
| | [ #716 ] |
|
|
 |
Elite Member  |
Joined: 10-Mar-2003 Posts: 3447
From: Figueira da Foz - Portugal | | |
|
| @Karlos
From Quake I got:
AGA
345fps in low res 320x256, 256 colours
109fps in productivity mode 640x480, 256 colours
146fps in HAM low res
RTG
397fps in low res 320x256, 256 colours
171fps in 640x480, 256 colours
|
Status: Offline |
|
|
Karlos
|  |
Re: Packed Versus Planar: FIGHT Posted on 21-Nov-2022 15:29:56
| | [ #717 ] |
|
|
 |
Elite Member  |
Joined: 24-Aug-2003 Posts: 4930
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition! | | |
|
| |
Status: Offline |
|
|
pixie
 |  |
Re: Packed Versus Planar: FIGHT Posted on 21-Nov-2022 16:45:24
| | [ #718 ] |
|
|
 |
Elite Member  |
Joined: 10-Mar-2003 Posts: 3447
From: Figueira da Foz - Portugal | | |
|
| @Karlos
AMD 3300X Quote:
From Quake I got:
AGA
345fps in low res 320x256, 256 colours
109fps in productivity mode 640x480, 256 colours
146fps in HAM low res
RTG
397fps in low res 320x256, 256 colours
171fps in 640x480, 256 colours |
AMD 5800X Quote:
From Quake I got:
AGA
483fps in low res 320x256, 256 colours
153fps in productivity mode 640x480, 256 colours
185fps in HAM low res
RTG
523fps in low res 320x256, 256 colours
219fps in 640x480, 256 colours |
Last edited by pixie on 21-Nov-2022 at 06:18 PM.
|
Status: Offline |
|
|
Karlos
|  |
Re: Packed Versus Planar: FIGHT Posted on 21-Nov-2022 20:22:33
| | [ #719 ] |
|
|
 |
Elite Member  |
Joined: 24-Aug-2003 Posts: 4930
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition! | | |
|
| @pixie
What is it you are trying to ascertain from these? The difference between RTG and native chipset under emulation will be the additional emulated CPU overhead of C2P per frame but it's unlikely to be affected by any notion of "bus speed". Unless you are running a cycle exact (including memory access time) simulation, the RTG framebuffer or the Chip RAM that your emulated Amiga sees will just be a block of generic system memory in the host machine with no difference in access performance. I don't believe the RTG framebuffer is a native video surface or even a texture buffer object.
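One way to read the WinUAE figures quoted earlier in the thread is as extra time per frame rather than as frame rates. The sketch below assumes, as the paragraph above suggests, that the whole AGA versus RTG gap is per-frame overhead on the emulated side; the fps values are the ones posted, and the function names are made up for illustration.

```python
def frame_ms(fps):
    """Time spent on one frame, in milliseconds."""
    return 1000.0 / fps


def extra_ms_per_frame(native_fps, rtg_fps):
    """Additional per-frame time of the native chipset path over RTG."""
    return frame_ms(native_fps) - frame_ms(rtg_fps)


# (AGA fps, RTG fps) pairs as posted for each host CPU and mode
posted = {
    "3300X 320x256": (345, 397),
    "3300X 640x480": (109, 171),
    "5800X 320x256": (483, 523),
    "5800X 640x480": (153, 219),
}

if __name__ == "__main__":
    for label, (aga, rtg) in posted.items():
        print(f"{label}: {extra_ms_per_frame(aga, rtg):.2f} ms extra per frame on AGA")
```

Seen this way, the gap is a fraction of a millisecond per frame at 320x256 on either host, which is the kind of cost that real chip RAM timing would dwarf.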
If your test shows anything it's probably the overhead of C2P code with all other limitations being removed. However, on a real Amiga, the performance limit of memory read and write hides this latency and a good quality C2P routine isn't much slower than just copying the same volume of data from fast ram to chip ram with no modification. |
|
Status: Offline |
|
|
pixie
 |  |
Re: Packed Versus Planar: FIGHT Posted on 21-Nov-2022 21:21:19
| | [ #720 ] |
|
|
 |
Elite Member  |
Joined: 10-Mar-2003 Posts: 3447
From: Figueira da Foz - Portugal | | |
|
| @Karlos
I guess everything is quite fast on the Amiga side: chipset, memory access, et al. And there's the fact that Doom runs slower.
Last edited by pixie on 22-Nov-2022 at 07:28 AM.
|
Status: Offline |
|
|