cdimauro 
Re: Packed Versus Planar: FIGHT
Posted on 3-Oct-2022 17:33:51
[ #361 ]
Elite Member
Joined: 29-Oct-2012
Posts: 3650
From: Germany

@Hypex

Quote:

Hypex wrote:
@cdimauro

Quote:
It's not the original one. From the readme:


No. I found an earlier one. But was more interested in the AGA version to see what they had fixed.

So you still don't have the original version.
Quote:
Quote:
which is clear, right?


No! Only more clear.

All Amigas have a 680x0 so that doesn't make sense. A 68000 is a 680x0. What they obviously mean is 68010+ or 68020+ with CPU32, so they should just put that.

I think it was a typo from Skid Row: it should have been "WITH" (not WITHOUT) 680x0.

Anyway, the important message was that they were upset with the very bad work done (Again! According to them) by those demo makers.

Hypex 
Re: Packed Versus Planar: FIGHT
Posted on 4-Oct-2022 1:19:08
[ #362 ]
Elite Member
Joined: 6-May-2007
Posts: 11215
From: Greensborough, Australia

@Karlos

Quote:
Regarding true colour, sprites and alpha blending, wouldn't the "Amiga way (tm)" have been to do this using new features of the display hardware? I was using Warp3D as a pure 2D renderer for this sort of thing way back. For example, simple screen-space textured quads rendered as indexed lists for sprites, shaded untextured triangle strips for copper style gradients and such.


Warp3D might be considered foreign compared to Amiga display hardware as it goes beyond it and needs RTG cards. It's also around the same time as PowerPC, which many think was a mistake or a mere side note now, but also independent of it. Still, it's part of what a powerful Amiga had in the last days, with Warp3D providing hardware 3D on the Amiga.

Quote:
AMMX sounds great as an extension for lots of otherwise laborious data transformation but if you're using it to draw on screen isn't that something of a retrograde step for the system?


That was my impression as well. I expected the blitter to be upgraded in speed and ability. Line drawing with expanded width and source bitmap texture could blit rotated shapes. Scaling and warped blits could provide texture mapping blits.

Using AMMX is like doing a CPU blit operation because it's faster. So it relies on the CPU. But the blitter did need some set up so simpler operations can be done using less CPU code.

But what I notice, is that like Warp3D targeting RTG, AMMX is really made for RTG. All the examples are for RTG soft sprites and usually in 24 bit true colour. I don't know if any use 8 bit chunky even. So it's not so useful for "real" Amiga games that use sprites and bitplanes.

Karlos 
Re: Packed Versus Planar: FIGHT
Posted on 4-Oct-2022 1:27:09
[ #363 ]
Elite Member
Joined: 24-Aug-2003
Posts: 4404
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition!

@Hypex

Maybe. The point was, it was using dedicated hardware to accelerate graphics operations rather than SIMD instructions on the CPU. I'm not opposed either way but the former seems more in the "spirit" of the Amiga way of doing things.

_________________
Doing stupid things for fun...

Hypex 
Re: Packed Versus Planar: FIGHT
Posted on 4-Oct-2022 4:10:17
[ #364 ]
Elite Member
Joined: 6-May-2007
Posts: 11215
From: Greensborough, Australia

@cdimauro

Quote:
The problem is that games didn't just jump to the beginning of routines which are on the jump tables. AFAIR they also directly jumped to parts of the Kickstart and this cannot be solved by keeping track of / storing the specific & needed jump tables.


Well, that's the job of any game patcher, to patch the crap out of it!

I can't imagine it affecting much more than some old games. It wouldn't have been a practical method after a 1.2 A500 was superseded by a 1.3 A500 and beyond. WHD is in the pop spot, but I also used JST back in the day, so should look into if it also required kick images.

Quote:
Unfortunately it can't be used as a regular application on any Amiga: it needs proper wiring to the CD players and LaserVideo players (I've checked, and it wasn't just one: three were controlled by the application).


There's always the CD32 jukebox then. If you have a CD32.

A friend bought a huge archive CD stacker. Well it's wide and tall. SCSI interface. Wanted to use it for his Aminet collection. But needs software to change discs. I looked into it and it changes discs with certain SCSI commands. I was going to write a commodity to do it but I never got around to it. Would have needed a GUI where you pick what disc you want, from a range of gadget boxes that also possibly scanned for labels, then issued the commands to swap discs.

Quote:
Do you have space for all this stuff? Lucky you!


No not really!

It sits in storage. Packed away in bags. I tend to use it if I take it to an Amiga club. Then it gets taken out. I really need to clear some junk as I'd like to actually have some machines set up. My old school desk sits in the back corner. Be good for my A4000 and real CRT monitor. Well, a now old silver ViewSonic I bought for my AmigaOne XE, that would work great with my A4000 and P96 scan doubler. But I can hardly reach the desk because of stuff on the floor including my old TV now, that is also silver with a flat CRT screen, so like my monitor not that old. The X1000 tower sits on another desk and I can use that.

Quote:
Let's give another try:


Okay I see what's happening. I tend to forget but I got stuck on this a while back when I posted a chunky to planar bit table that had symbols. I needed the pre tags and ended up using some converter for the text symbols.

So there's a few games breaking there. Some prefs are wrong? What prefs? Join Discord? I keep hearing about that and that suggestion just annoys me these days. How about someone just post on the forum with the answers if it's easy enough to fix. Instead of suggesting to use some other type of forum. It's like posting here, only to be told you should post to Amigans, so then you need to sign up for another account and then check another site...

Quote:
AMMX could be more efficient for some operations, but it definitely cannot be faster than Altivec. In fact, Altivec is implemented on processors running at hundreds or thousands of MHz, which usually are also out-of-order: they literally obliterate the 68080.


That's the crux of it. An FPGA against an ASIC. Those PPC cores may be older now but it's still a full ASIC against an FPGA embedded CPU.

By comparison, the PPC64 could be more efficient than x64 in some operations, and less complicated in other ways, but with years of development put into x64 and newer fabrication able to run more core threads at higher clock rates, x64 will end up being more powerful overall.

Quote:
This time I'm the one with the problem: I can't open it.


For reference my link fixed:
http://apollo-core.com/knowledge.php?b=4¬e=38817

A few posts up from the bottom.

Quote:
Maybe only on some synthetic benchmarks which are just using vector operations.

On real-world applications there's no chance that it could do better.


Yes, and some slight humour of mine, since AMMX wins by default in vector ops as Sam has none.

Quote:
Which will never happen. DRAM, for example, was invented by Intel: try to use a processor without DRAM(s).


Oh no. Another Intel Invention Inside. Triple I threat.

I've read that Intel were more well known for producing memory chips before entering the major CPU market. So that wouldn't be surprising.

Quote:
AMMX is only a marketing name, since they are very different. The only things in common are the use of FP registers for keeping vector data and that the SIMD extension is only limited to integer data types.


I would have expected a V0 to V7 to be added. Though I can see how that would work better for an older design. Since the floats are 64 bit and can save space on the FPGA die.

But I don't quite understand the ASM examples. Such as:


lea Spritedata,A0
lea Screen,A1

move.w #height-1,D1
Yloop:
move.w #width/8-1,D0
Xloop:
load (A0)+,D2
storem.b D2,(A1)+
dbra D0,Xloop
add.l #Modulo,A1
dbra D1,Yloop
rts


The load doesn't make sense, load one value in D2? A word?

Then the storem. Store one byte into (A1)+? Both have no data size specifier so where is the size coming from? In this case it somehow loads 8 bytes into D2, then stores 8 bytes but only non-zero bytes. This solves the packed sprite mask problem. But, there is no 8 for 8 bytes nor any clue in the mnemonic for skipping blank bytes. Did the official forum cut off official code? Plus the elephant in the room is D2 is only 32 bits wide.

From the same thread, a few posts down from the top.

Quote:
Of course: it was never released.


Quote:
Hum. 3 frames is too much. Depending on the game speed, the sound can change after one (50FPS) or two (25FPS) frames with my player.


Well, it's not so much a delay of 3 frames, it's how it's going about it. For one thing, since modules are timed by ticks as the minimum amount, with ticks per line, each tick should correspond to an actual sound event such as starting a sound or adjusting volume or pitch. But more important I think, is what MED does, to start a sound. It sets the sample, sets volume to 0, then pokes a weird value like $FF00 into the period register and starts DMA. On PAL that's about 54 Hz. Barely audible I'd say. A few ticks later, it fixes it, with the proper period! Weird! I only know this because I was porting the MED replayer used in Total Chaos to AHI. And found all this weird stuff sent to AHI. My routines cached the audio writes until the end. But I had to verify all the values before sending any play command to AHI. Because $FF00 obviously won't appear on any period table I think it's actually a bug that slipped under the radar. It may run in an interrupt but it can run as user code easily so I see no excuse for not spotting it in the development phase. Also, that was never released either. Unfortunately James Conwell, main developer of Total Chaos died, and the game died with him since he had the sources and I only had the music player source.
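
For reference, the "about 54 Hz" figure follows from the usual Paula relation between the period register and the sample fetch rate: rate = clock / period. Below is a minimal sketch, assuming the commonly quoted PAL and NTSC Paula clock constants. The function name is made up for illustration and is not taken from any replayer source.

```python
# Commonly quoted Paula clock constants (assumed here, in Hz).
PAL_PAULA_CLOCK_HZ = 3_546_895
NTSC_PAULA_CLOCK_HZ = 3_579_545


def paula_sample_rate(period, clock_hz=PAL_PAULA_CLOCK_HZ):
    """Return the sample fetch rate in Hz for a 16-bit period register value."""
    if not 1 <= period <= 0xFFFF:
        raise ValueError("period must fit in a non-zero 16-bit register value")
    return clock_hz / period


# The odd value poked in before the real period is written: about 54.3 Hz on PAL.
print(round(paula_sample_rate(0xFF00), 1))

# A typical tracker period (428) gives a rate in the usual kHz range.
print(round(paula_sample_rate(428), 2))
```

Note that this is the rate at which samples are fetched, not the pitch of the resulting tone, which also depends on the sample data.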

However, I plan to use that knowledge I learned about replayer quirks, and use it in a future CIAgent when I finally get Paula emulation working.

Quote:
Still, not practical on low-end machines.


No, but people shouldn't be trying to write or play multi tracker modules on a 1MB A500, that's old school man. I agree with upgrading. I moved from 1.3 to 3.0. Given what OS2+ provides I don't see it as fun needing to use old methods and functions with more code so someone can run a program on his old A500.

Quote:
Then the chipset would have required an additional read-cycle from memory, which wasn't possible, since the raster slots for the sprites were already quite limited.


It would have needed to be somewhere. Given the design in the sprite data would be convenient. But stored in registers would be better to keep it organised.

cdimauro 
Re: Packed Versus Planar: FIGHT
Posted on 4-Oct-2022 6:10:57
[ #365 ]
Elite Member
Joined: 29-Oct-2012
Posts: 3650
From: Germany

@Hypex

Quote:

Hypex wrote:
@cdimauro

Quote:
The problem is that games didn't just jump to the beginning of routines which are on the jump tables. AFAIR they also directly jumped to parts of the Kickstart and this cannot be solved by keeping track of / storing the specific & needed jump tables.


Well, that's the job of any game patcher, to patch the crap out of it!

I can't imagine it affecting much more than some old games. It wouldn't have been a practical method after a 1.2 A500 was superseded by a 1.3 A500 and beyond. WHD is in the pop spot, but I also used JST back in the day, so should look into if it also required kick images.

I have no statistics, but when I checked WHDLoad games, quite a few of them required a Kickstart image. It would be good to have some numbers.
Quote:
Quote:
Unfortunately it can't be used as a regular application on any Amiga: it needs proper wiring to the CD players and LaserVideo players (I've checked, and it wasn't just one: three were controlled by the application).


There's always the CD32 jukebox then. If you have a CD32.

Unfortunately not. And it was expensive for me, at the time.
Quote:
A friend bought a huge archive CD stacker. Well it's wide and tall. SCSI interface. Wanted to use it for his Aminet collection. But needs software to change discs. I looked into it and it changes discs with certain SCSI commands. I was going to write a commodity to do it but I never got around to it. Would have needed a GUI where you pick what disc you want, from a range of gadget boxes that also possibly scanned for labels, then issued the commands to swap discs.

Well, that's much more expensive!
Quote:
Quote:
Let's give another try:


Okay I see what's happening. I tend to forget but I got stuck on this a while back when I posted a chunky to planar bit table that had symbols. I needed the pre tags and ended up using some converter for the text symbols.

So there's a few games breaking there. Some prefs are wrong? What prefs? Join Discord? I keep hearing about that and that suggestion just annoys me these days. How about someone just post on the forum with the answers if it's easy enough to fix. Instead of suggesting to use some other type of forum. It's like posting here, only to be told you should post to Amigans, so then you need to sign up for another account and then check another site...

Same here with The Nut which tends to move people to another "protected" environment.
Quote:
Quote:
AMMX could be more efficient for some operations, but it definitely cannot be faster than Altivec. In fact, Altivec is implemented on processors running at hundreds or thousands of MHz, which usually are also out-of-order: they literally obliterate the 68080.


That's the crux of it. An FPGA against an ASIC. Those PPC cores may be older now but it's still a full ASIC against an FPGA embedded CPU.

Correct. That's why you can talk about efficiency and then AMMX is clearly more efficient for some operations.

But when you talk about performance and say that AMMX is faster than a PowerPC, then you're lying.

The difference might sound subtle but it's essential.
Quote:
By comparison, the PPC64 could be more efficient than x64 in some operations, and less complicated in other ways, but with years of development put into x64 and newer fabrication able to run more core threads at higher clock rates, x64 will end up being more powerful overall.

No, x64 also has a better microarchitecture. I've shared several links to benchmarks on the code density thread and you can see that there isn't a huge clock difference between the tested systems, but x64 platforms are much faster on average compared to PowerPCs.
Quote:
Quote:
This time I'm the one with the problem: I can't open it.


For reference my link fixed:
http://apollo-core.com/knowledge.php?b=4¬e=38817

A few posts up from the bottom.

Sorry, but it's still wrong. Either you remove the & and replace it with a space, like I did, or replace it with the HTML equivalent, like kolla suggested (which is much better).
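
The breakage is easy to reproduce. Here is a small Python sketch, assuming the intended query string was `b=4&note=38817` (inferred from the mangled link, not confirmed). An HTML parser reads the unescaped `&not` as the legacy entity for `¬`, while an ampersand written as `&amp;` survives the round trip.

```python
import html

# Assumed original link, reconstructed from the mangled one in the thread.
intended = "http://apollo-core.com/knowledge.php?b=4&note=38817"

# What a lenient HTML parser does with the raw ampersand: "&not" becomes "¬".
mangled = html.unescape(intended)

# Escaping first ("&" -> "&amp;") keeps the link intact once the page is rendered.
escaped = html.escape(intended)
restored = html.unescape(escaped)

print(mangled)
print(escaped)
print(restored == intended)
```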
Quote:
Quote:
Maybe only on some synthetic benchmarks which are just using vector operations.

On real-world applications there's no chance that it could do better.


Yes, and some slight humour of mine, since AMMX wins by default in vector ops as Sam has none.

Maybe on some synthetic benchmark. Because the Sams are running at 1 GHz, so they are much faster than an Apollo core.

It's important to test an entire application / game and not just single routines.
Quote:
Quote:
Which will never happen. DRAM, for example, was invented by Intel: try to use a processor without DRAM(s).


Oh no. Another Intel Invention Inside. Triple I threat.

I've read that Intel were more well known for producing memory chips before entering the major CPU market. So that wouldn't be surprising.

Indeed. In fact, Intel also invented the EPROM.
Quote:
Quote:
AMMX is only a marketing name, since they are very different. The only things in common are the use of FP registers for keeping vector data and that the SIMD extension is only limited to integer data types.


I would have expected a V0 to V7 to be added.

Hum. V is better reserved for variable-length vectors. Maybe S = SIMD is a better prefix for those registers.
Quote:
Though I can see how that would work better for an older design. Since the floats are 64 bit and can save space on the FPGA die.

floats could be 32-bit also.
Quote:
But I don't quite understand the ASM examples. Such as:

lea Spritedata,A0
lea Screen,A1

move.w #height-1,D1
Yloop:
move.w #width/8-1,D0
Xloop:
load (A0)+,D2
storem.b D2,(A1)+
dbra D0,Xloop
add.l #Modulo,A1
dbra D1,Yloop
rts


The load doesn't make sense, load one value in D2? A word?

No, in this case the instruction is an AMMX one, which always operates on 64-bit data = quad word.

It would have been better to use a size suffix, like .q, but we know that Motorola also used default sizes for some instructions, so this choice is in line with what Motorola did.

Much better would have been to use a different syntax for the SIMD registers, like S2 in this case instead of D2, because D2 is causing confusion, like for you.
Quote:
Then the storem. Store one byte into (A1)+?

No, it merges the bytes from D2 with the bytes at (A1). If a byte is zero, then the corresponding byte at (A1) isn't changed. Otherwise it's replaced with the one from D2.

It's a merge / blend operation. Maybe a better name should have been used.
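
The merge described above can be modelled in a few lines of Python. This is only a sketch of the behaviour as explained in this thread (8 bytes per operation, zero bytes treated as transparent), not a reference for the real instruction. The function name and the buffer layout are assumptions for illustration.

```python
def storem_b(src, dest, offset=0):
    """Merge 8 source bytes into dest at offset; zero source bytes leave dest untouched."""
    if len(src) != 8:
        raise ValueError("the modelled operation always works on 8 bytes")
    for i, value in enumerate(src):
        if value != 0:  # zero acts as the transparent colour
            dest[offset + i] = value


# Background filled with colour 0x11, sprite row with three opaque pixels.
screen = bytearray([0x11] * 8)
sprite_row = bytes([0, 5, 0, 0, 7, 0, 0, 9])
storem_b(sprite_row, screen)
print(screen.hex())
```

With 8-bit chunky pixels this gives a masked sprite row without a separate mask plane, which is the "packed sprite mask problem" mentioned in the question.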
Quote:
Both have no data size specifier so where is the size coming from? In this case it somehow loads 8 bytes into D2, then stores 8 bytes but only non-zero bytes. This solves the packed sprite mask problem. But, there is no 8 for 8 bytes nor any clue in the mnemonic for skipping blank bytes.

See above.
Quote:
Did the official forum cut off official code?

There are some examples and a Programmer's Guide on Discord. I've posted several links on the Code Density thread.
Quote:
Quote:
Hum. 3 frames is too much. Depending on the game speed, the sound can change after one (50FPS) or two (25FPS) frames with my player.


Well, it's not so much a delay of 3 frames, it's how it's going about it.

In fact. It's reasonable. And... not audible for common users.
Quote:
For one thing, since modules are timed by ticks as the minimum amount, with ticks per line, each tick should correspond to an actual sound event such as starting a sound or adjusting volume or pitch. But more important I think, is what MED does, to start a sound. It sets the sample, sets volume to 0, then pokes a weird value like $FF00 into the period register and starts DMA. On PAL that's about 54 Hz. Barely audible I'd say. A few ticks later, it fixes it, with the proper period! Weird! I only know this because I was porting the MED replayer used in Total Chaos to AHI. And found all this weird stuff sent to AHI. My routines cached the audio writes until the end. But I had to verify all the values before sending any play command to AHI. Because $FF00 obviously won't appear on any period table I think it's actually a bug that slipped under the radar. It may run in an interrupt but it can run as user code easily so I see no excuse for not spotting it in the development phase.

To me it looks complicated.

Maybe I'm biased because of my player.
Quote:
Also, that was never released either. Unfortunately James Conwell, main developer of Total Chaos died, and the game died with him since he had the sources and I only had the music player source.

Sorry for that.
Quote:
However, I plan to use that knowledge I learned about replayer quirks, and use it in a future CIAgent when I finally get Paula emulation working.

Not an easy task...
Quote:
Quote:
Then the chipset would have required an additional read-cycle from memory, which wasn't possible, since the raster slots for the sprites were already quite limited.


It would have needed to be somewhere. Given the design in the sprite data would be convenient. But stored in registers would be better to keep it organised.

Well, I didn't like the sprites on the Amiga.

As I've said before, I would have preferred 8 or, even better, 16 audio channels. Leaving only a few sprites for the mouse pointer or some extra, small stuff.

P.S. Sorry, no time to re-read: it's work time...

cdimauro 
Re: Packed Versus Planar: FIGHT
Posted on 4-Oct-2022 6:11:22
[ #366 ]
Elite Member
Joined: 29-Oct-2012
Posts: 3650
From: Germany

@Karlos

Quote:

Karlos wrote:
@Hypex

Maybe. The point was, it was using dedicated hardware to accelerate graphics operations rather than SIMD instructions on the CPU. I'm not opposed either way but the former seems more in the "spirit" of the Amiga way of doing things.

Exactly. That spirit is lost...

Hammer 
Re: Packed Versus Planar: FIGHT
Posted on 4-Oct-2022 6:57:35
[ #367 ]
Elite Member
Joined: 9-Mar-2003
Posts: 5286
From: Australia

@Hypex

Quote:

Hypex wrote:

Warp3D might be considered foreign compared to Amiga display hardware as it goes beyond it and needs RTG cards. It's also around the same time as PowerPC, which many think was a mistake or a mere side note now, but also independent of it. Still, it's part of what a powerful Amiga had in the last days, with Warp3D providing hardware 3D on the Amiga.

Reminder, Amiga Hombre chipset targeted OpenGL compatibility.

Wazp3D on Pistorm/Emu68/RPI 3a runs software QLQuake at 320x200 at playable frame rates.

https://www.youtube.com/watch?v=8UYvNTPM88Q
Emu68 has its non-SMP 68K multi-threading evolution path. This example shows four threads running path tracers on Emu68.

Vampire V2/V4 is not the only post-Commodore Amiga accelerator solution with an evolution fork.

Last edited by Hammer on 04-Oct-2022 at 06:59 AM.

_________________
Ryzen 9 7900X, DDR5-6000 64 GB RAM, GeForce RTX 4080 16 GB
Amiga 1200 (Rev 1D1, KS 3.2, PiStorm32lite/RPi 4B 4GB/Emu68)
Amiga 500 (Rev 6A, KS 3.2, PiStorm/RPi 3a/Emu68)

Hammer 
Re: Packed Versus Planar: FIGHT
Posted on 4-Oct-2022 7:05:07
[ #368 ]
Elite Member
Joined: 9-Mar-2003
Posts: 5286
From: Australia

@Hypex

Quote:
Yes, and some slight humour of mine, since AMMX wins by default in vector ops as Sam has none


https://www.youtube.com/watch?v=gJwdColUxUs
Quake demo3 benchmark with PiStorm/Emu68/RPI 3a

PC's Quake "timedemo demo3" benchmarks from https://thandor.net/benchmark/33

For Quake demo 3, PiStorm/Emu68/RPI 3a delivered Pentium II 266/300 level results.

Vampire V4 and SAM should have their own Quake demo3 benchmarks.

Gunnar 
Re: Packed Versus Planar: FIGHT
Posted on 4-Oct-2022 7:20:44
[ #369 ]
Regular Member
Joined: 25-Sep-2022
Posts: 477
From: Unknown

Cesare Di Mauro,

Quote:

Quote:

Correct. That's why you can talk about efficiency and then AMMX is clearly more efficient for some operations.

But when you talk about performance and say that AMMX is faster than a PowerPC, then you're lying.



Please help us understand what lying means for you.


Does a CPU with AMMX exist today? Yes!
Can you buy AMMX systems today? Yes!
Do people already have such systems today? Yes! Over 10,000 systems sold
Is AMMX used by several coders? Yes!
Do any programs use AMMX today? Yes!
Does the Operating System use AMMX today? Yes!
Can you play games using AMMX today? Yes!
Are general graphic routines shorter and simpler to code with AMMX than with ALTIVEC? Yes!
Are game routines shorter and simpler to code with AMMX than with ALTIVEC? Yes!
Do game routines using AMMX need fewer clock cycles than ALTIVEC? Yes!
Do game routines using AMMX need less memory bandwidth than ALTIVEC? Yes!


Clearly in all regards AMMX has many advantages!
+ Easier to code
+ Runs faster clock by clock
+ Makes more efficient use of available memory bandwidth


Cesare Di Mauro,
You claimed here several times that you created a much better architecture than Intel.

Does a CPU with your vapor architecture exist today? No!
Can you buy a system with your vapor architecture today? No!
Do any programmers code for your vapor architecture today? No!
Does any Operating system support your vapor architecture today? No!
Can you play games on your vapor architecture today? No!
Does a CPU with your vapor architecture reach any clock rate at all today? No!
Can you at least prove that your architecture would even work? No!

What is your definition of lying?

Last edited by Gunnar on 04-Oct-2022 at 09:01 AM.
Last edited by Gunnar on 04-Oct-2022 at 07:46 AM.
Last edited by Gunnar on 04-Oct-2022 at 07:23 AM.

Karlos 
Re: Packed Versus Planar: FIGHT
Posted on 4-Oct-2022 9:43:38
[ #370 ]
Elite Member
Joined: 24-Aug-2003
Posts: 4404
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition!

@Gunnar

The issue here is clearly one of relative versus absolute performance. I believe your claim is that, per MHz, AMMX is faster than AltiVec. I'm sure that may be true for some operations. However the claim that it's "faster than" Altivec, without further qualification, risks being misinterpreted as delivering higher throughput on any altivec implementation. I doubt you are claiming that the Apollo as it stands today can compete with the last generation 1-2 GHz rated G4 and G5 parts in raw vector performance.
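
The distinction between per-clock and absolute performance can be written down as simple arithmetic. This is a rough sketch: throughput is modelled as work per clock times clock rate, and the clock values used in the example are placeholders chosen for easy numbers, not measured or official figures for any of the CPUs discussed.

```python
def throughput(work_per_clock, clock_hz):
    """Absolute throughput (work units per second) = per-clock work * clock rate."""
    return work_per_clock * clock_hz


def break_even_advantage(fast_clock_hz, slow_clock_hz):
    """Per-clock advantage the slower-clocked CPU needs just to match the faster one."""
    return fast_clock_hz / slow_clock_hz


# Hypothetical example: a 1 GHz part against a 100 MHz part (placeholder clocks).
needed = break_even_advantage(1_000_000_000, 100_000_000)
print(needed)

# With exactly that per-clock advantage, both deliver the same absolute throughput.
print(throughput(needed, 100_000_000) == throughput(1.0, 1_000_000_000))
```

A claim of "faster per MHz" therefore only becomes a claim of "faster" once the per-clock advantage exceeds the clock ratio, and this simple model ignores memory bandwidth and latency entirely.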

Gunnar 
Re: Packed Versus Planar: FIGHT
Posted on 4-Oct-2022 10:16:50
[ #371 ]
Regular Member
Joined: 25-Sep-2022
Posts: 477
From: Unknown

@Karlos

Quote:

Karlos wrote:
@Gunnar

The issue here is clearly one of relative versus absolute performance. I believe your claim is that, per MHz, AMMX is faster than AltiVec. I'm sure that may be true for some operations. However the claim that it's "faster than" Altivec, without further qualification, risks being misinterpreted as delivering higher throughput on any altivec implementation. I doubt you are claiming that the Apollo as it stands today can compete with the last generation 1-2 GHz rated G4 and G5 parts in raw vector performance.


Yes, for the workloads we speak about = AMMX is more efficient than ALTIVEC.
Yes, for the workloads we speak about = AMMX has more performance than ALTIVEC.
If you put a PowerPC in the same technology, if you put a PowerPC in an FPGA, then AMMX offers higher performance than ALTIVEC.

So it's clear that if you produce them on the same level, like both FPGA or both ASIC, then an AMMX system can beat ALTIVEC.

But what if you do not compare at the same clock rate?
What if you compete with existing systems like AmigaOne or Pegasos?

If you compare the real world performance of existing systems.
If you compare what the V4 does deliver versus AmigaOne G4, and Pegasos 2 G4.
If you look at what real life, real memory, gaming blitting performance they deliver
= then the 68080 AMMX system does outperform 1GHz PowerPC systems
in the maximum real screen game/sprite blitting performance.

Karlos 
Re: Packed Versus Planar: FIGHT
Posted on 4-Oct-2022 11:06:20
[ #372 ]
Elite Member
Joined: 24-Aug-2003
Posts: 4404
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition!

@Gunnar

Those are bold claims. Your workloads must be very selective. Alpha blending is a good example though. Suppose I have two large (e.g. 1080p) arrays of 32-bit ARGB pixels and I want to alpha blend buffer B onto buffer A using B's alpha channel.

Are you claiming the 68080, at its normal clock rate, using AMMX will complete this in less time than a 1GHz PPC using altivec instructions to perform this task?
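
To pin down the workload, here is a scalar reference sketch of that blend in Python. It is neither AMMX nor AltiVec code. It assumes 8-bit channels packed as 0xAARRGGBB, the standard "source over" formula with rounding, and that the destination keeps its own alpha; all names are illustrative.

```python
WIDTH, HEIGHT = 1920, 1080
PIXELS = WIDTH * HEIGHT        # pixels per 1080p buffer
BYTES_PER_BUFFER = PIXELS * 4  # 32-bit ARGB


def blend_pixel(dst, src):
    """Blend one ARGB pixel of B (src) onto one pixel of A (dst) using src alpha."""
    alpha = (src >> 24) & 0xFF
    out = dst & 0xFF000000  # assumption: destination alpha is preserved
    for shift in (16, 8, 0):  # R, G, B channels
        s = (src >> shift) & 0xFF
        d = (dst >> shift) & 0xFF
        out |= ((s * alpha + d * (255 - alpha) + 127) // 255) << shift
    return out


def blend_buffers(dst, src):
    """Blend buffer B (src) onto buffer A (dst) in place, pixel by pixel."""
    for i in range(len(dst)):
        dst[i] = blend_pixel(dst[i], src[i])


a = [0xFF000000, 0xFF102030]
b = [0x80FFFFFF, 0x00FFFFFF]
blend_buffers(a, b)
print([hex(p) for p in a])
```

Each full-screen pass reads both buffers and writes one, so it touches about three times `BYTES_PER_BUFFER` of memory, which is why memory throughput matters as much as the vector unit for this test.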

Last edited by Karlos on 04-Oct-2022 at 11:06 AM.

Gunnar 
Re: Packed Versus Planar: FIGHT
Posted on 4-Oct-2022 12:45:18
[ #373 ]
Regular Member
Joined: 25-Sep-2022
Posts: 477
From: Unknown


Quote:

Karlos wrote:
@Gunnar

Those are bold claims.


I have both 68080 and PowerPC Systems here


Quote:

Your workloads must be very selective. Alpha blending is a good example though. Suppose I have two large (e.g. 1080p) arrays of 32-bit ARGB pixels and I want to alpha blend buffer B onto buffer A using B's alpha channel.

Are you claiming the 68080, at its normal clock rate, using AMMX will complete this in less time than a 1GHz PPC using altivec instructions to perform this task?


Yes correct.

Using optimized pixel operations like Sprite Blitting, or Sprite Blending
the V4 is able to do more work than both the AmigaOne G4-XE 800MHz and the Pegasos 2 G4-1000MHz systems.

Karlos 
Re: Packed Versus Planar: FIGHT
Posted on 4-Oct-2022 13:19:52
[ #374 ]
Elite Member
Joined: 24-Aug-2003
Posts: 4404
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition!

@Gunnar

Numbers, please...

Gunnar 
Re: Packed Versus Planar: FIGHT
Posted on 4-Oct-2022 13:22:33
[ #375 ]
Regular Member
Joined: 25-Sep-2022
Posts: 477
From: Unknown

@Karlos

Karlos do you have an AmigaOne XE?

Karlos 
Re: Packed Versus Planar: FIGHT
Posted on 4-Oct-2022 13:36:13
[ #376 ]
Elite Member
Joined: 24-Aug-2003
Posts: 4404
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition!

@Gunnar

There is no question AMMX is faster than my A1XE given that it's not worked for some years now. Didn't you just claim you had a PPC machine to compare with?

Last edited by Karlos on 04-Oct-2022 at 01:53 PM.

Gunnar 
Re: Packed Versus Planar: FIGHT
Posted on 4-Oct-2022 14:09:38
[ #377 ]
Regular Member
Joined: 25-Sep-2022
Posts: 477
From: Unknown

@Karlos

Quote:

There is no question AMMX is faster than my A1XE given that it's not worked for some years now.


Ha ha you are funny.


If you had an A1XE then you will know its performance and its limitations yourself.

We should not forget that performance cannot be understood from looking at a single number.
You need to understand which factors are important.

Factors like: can the system handle misaligned memory access?
What performance penalty do you get for misaligned access?
In real-world cases sprites often need to be blitted to addresses that are not 16-byte aligned.
So GFX blitting always has to solve misalignment. As you will know, ALTIVEC cannot do this well.
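
To make the misalignment point concrete, here is a minimal C sketch that works out how far a sprite row starts from a 16-byte boundary and how many aligned 16-byte blocks it touches. The function names and the 32-bit-per-pixel layout are assumptions for illustration, not taken from any AMMX or Altivec code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { VEC_BYTES = 16 }; /* width of one aligned vector block, in bytes */

/* Byte distance of an address from the previous 16-byte boundary (0 = aligned). */
size_t misalignment(const void *p)
{
    return (size_t)((uintptr_t)p % VEC_BYTES);
}

/* Byte offset of pixel (x, y) in a 32-bit-per-pixel buffer (hypothetical layout). */
size_t pixel_offset(size_t x, size_t y, size_t pitch_bytes)
{
    return y * pitch_bytes + x * 4;
}

/* Number of aligned 16-byte blocks touched by `len` bytes starting at `offset`. */
size_t aligned_blocks_touched(size_t offset, size_t len)
{
    assert(len > 0);
    size_t first = offset / VEC_BYTES;
    size_t last = (offset + len - 1) / VEC_BYTES;
    return last - first + 1;
}
```

For example, a 16-pixel (64-byte) sprite row drawn at x = 5 starts 4 bytes past a boundary and touches 5 aligned blocks rather than 4, so code restricted to aligned vector accesses has to handle partial blocks at both ends of the row.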

What is your memory performance in MB/sec?
AmigaOne XE is very bad here.

What is the latency of your memory?
Again AmigaOne XE is very bad here.
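
A rough way to put a number on the MB/sec question is to time a large block copy. The sketch below is only an assumption about how one might do it in portable C: `measure_copy_mb_per_sec` is a hypothetical helper, `clock()` reports CPU time at whatever resolution the C library provides, and a real test would use buffers much larger than the caches.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Convert bytes moved in `seconds` to MB/sec (1 MB = 1,000,000 bytes here). */
double mb_per_sec(size_t bytes, double seconds)
{
    assert(seconds > 0.0);
    return (double)bytes / 1e6 / seconds;
}

/* Copy `bytes` from src to dst `passes` times and return the throughput.
 * Returns 0.0 when the run was too short for the timer to resolve. */
double measure_copy_mb_per_sec(void *dst, const void *src, size_t bytes, int passes)
{
    assert(passes > 0);
    clock_t start = clock();
    for (int i = 0; i < passes; ++i)
        memcpy(dst, src, bytes);
    clock_t end = clock();

    double seconds = (double)(end - start) / CLOCKS_PER_SEC;
    if (seconds <= 0.0)
        return 0.0;
    return mb_per_sec(bytes * (size_t)passes, seconds);
}
```

As a scale check, copying one 1080p 32-bit buffer (8,294,400 bytes) in 10 ms corresponds to about 829 MB/sec. This measures copy throughput only, not latency.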

Can your CPU prefetch and stream?
While real POWER cores prefetch on their own, your PPC G4 will not do this automatically.

Can your system do masked writes, or do you need to do a read/write to achieve the same?
Your system can not do this.
This means you need 2 memory accesses instead of 1.
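
The read/write pattern described here can be sketched in plain C as follows. This is an illustrative emulation only; `masked_store_rmw` is a hypothetical name, and the per-pixel 32-bit mask format is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Emulated masked store: where a mask bit is 1 the source bit is written,
 * where it is 0 the destination bit is kept. Without a hardware masked
 * write, each pixel needs a read of dst followed by a write. */
void masked_store_rmw(uint32_t *dst, const uint32_t *src,
                      const uint32_t *mask, size_t count)
{
    assert(dst != NULL && src != NULL && mask != NULL);
    for (size_t i = 0; i < count; ++i) {
        uint32_t old = dst[i];                          /* access 1: read  */
        dst[i] = (src[i] & mask[i]) | (old & ~mask[i]); /* access 2: write */
    }
}
```

Hardware that supports a masked store could skip the read of `dst`, which is where the two-accesses-versus-one difference comes from.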

The moral of the story is that to understand performance you need to know the algorithms used,
you need to have coding experience, you really need to code test cases, and you need to measure the memory controller performance.

The performance discussions here by armchair CPU experts with no real-world coding experience are just a waste of time.


The AmigaOne XE's weak memory performance is killing it in real-world comparisons with the 68080.
You could have a 6 GHz G4 in it, and it would still lose, simply because of the poor memory controller.

Karlos 
Re: Packed Versus Planar: FIGHT
Posted on 4-Oct-2022 15:06:53
#378 ]
Elite Member
Joined: 24-Aug-2003
Posts: 4404
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition!

@Gunnar

I'm not remotely interested in the hypothetical reasons why AMMX may be faster than Altivec. I've offered a perfectly reasonable way for you to prove it, based on a claim you made earlier in the thread, using the very class of operation you claimed it is particularly well suited for, i.e. alpha blending.

So, to reiterate, I would like to know the total elapsed time, in milliseconds, for blending one 1080p 32-bit ARGB buffer 'B' onto another 1080p 32-bit ARGB buffer 'A', using B's alpha channel for a standard alpha, 1-alpha type blend. I'm sure you will be able to find an optimal example for Altivec on Stack Overflow / GitHub somewhere, as it's a very common problem people were solving for PPC Macs years ago.

For the purposes of this, you can have everything optimally aligned and in whatever passes for Fast RAM on each system to make sure you aren't at the mercy of reading from slower video memory.
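
For reference, the test being asked for can be written as a plain scalar C baseline. This is a sketch under stated assumptions, not an optimized AMMX or Altivec routine: pixels are assumed to be packed as 0xAARRGGBB in a `uint32_t`, the destination alpha is left unchanged, and `time_blend_ms` is a hypothetical helper that uses `clock()`, which reports CPU time rather than wall-clock time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* 1080p at 32 bits per pixel: 1920 * 1080 * 4 = 8,294,400 bytes per buffer. */
enum { WIDTH = 1920, HEIGHT = 1080 };

/* One channel of a standard blend: (src*a + dst*(255-a)) / 255, rounded. */
static uint8_t blend_channel(uint8_t src, uint8_t dst, uint8_t a)
{
    uint32_t t = (uint32_t)src * a + (uint32_t)dst * (255u - a) + 127u;
    return (uint8_t)(t / 255u);
}

/* Blend one ARGB pixel of B (src) onto one pixel of A (dst) using B's alpha. */
static uint32_t blend_pixel(uint32_t src, uint32_t dst)
{
    uint8_t a = (uint8_t)(src >> 24);
    uint32_t r = blend_channel((uint8_t)(src >> 16), (uint8_t)(dst >> 16), a);
    uint32_t g = blend_channel((uint8_t)(src >> 8), (uint8_t)(dst >> 8), a);
    uint32_t b = blend_channel((uint8_t)src, (uint8_t)dst, a);
    return (dst & 0xFF000000u) | (r << 16) | (g << 8) | b; /* keep A's alpha */
}

/* Blend buffer B onto buffer A in place, pixel by pixel. */
void blend_buffers(uint32_t *a_buf, const uint32_t *b_buf, size_t count)
{
    assert(a_buf != NULL && b_buf != NULL);
    for (size_t i = 0; i < count; ++i)
        a_buf[i] = blend_pixel(b_buf[i], a_buf[i]);
}

/* Time one full blend pass and return the result in milliseconds. */
double time_blend_ms(uint32_t *a_buf, const uint32_t *b_buf, size_t count)
{
    clock_t start = clock();
    blend_buffers(a_buf, b_buf, count);
    clock_t end = clock();
    return 1000.0 * (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling `time_blend_ms(a, b, (size_t)WIDTH * HEIGHT)` on two buffers allocated in Fast RAM gives a baseline figure that a vectorized version on either system can be compared against.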

You said above you have both 68080 and PowerPC to compare against, so I don't see any reason you can't prove the claim. Unless the claim is false, of course.

Last edited by Karlos on 04-Oct-2022 at 03:07 PM.

_________________
Doing stupid things for fun...

Gunnar 
Re: Packed Versus Planar: FIGHT
Posted on 4-Oct-2022 15:09:47
#379 ]
Regular Member
Joined: 25-Sep-2022
Posts: 477
From: Unknown

@Karlos

Quote:

For the purposes of this, you can have everything optimally aligned and in whatever passes for Fast RAM


You misunderstood what I said.
Having the GFX operation aligned is exactly what never happens in real life.

In the real world, sprite operations are never aligned, and this misalignment
kills ALTIVEC, as ALTIVEC is tuned for always working on aligned memory.

The 68080, on the other hand, has no problem with misalignment and supports it in HW for free.

Last edited by Gunnar on 04-Oct-2022 at 03:12 PM.

Bosanac 
Re: Packed Versus Planar: FIGHT
Posted on 4-Oct-2022 15:13:28
#380 ]
Regular Member
Joined: 10-May-2022
Posts: 255
From: Unknown

@Gunnar

Quote:
The moral of the story is that to understand performance you need to know the algorithms used,
you need to have coding experience, you really need to code test cases, and you need to measure the memory controller performance.

The performance discussions here by armchair CPU experts with no real-world coding experience are just a waste of time.


You hear that?

Just exactly what do you do all day you lazy oaf?

And you can take your name off the AmigaOS credits too! Bloody shyster!

Last edited by Bosanac on 04-Oct-2022 at 03:14 PM.


Copyright (C) 2000 - 2019 Amigaworld.net.
Amigaworld.net was originally founded by David Doyle