kolla 
Re: Packed Versus Planar: FIGHT
Posted on 7-Oct-2022 19:52:08
#461 ]
Elite Member
Joined: 21-Aug-2003
Posts: 2896
From: Trondheim, Norway

@Gunnar

So will SAGA ever be open sourced, as you "promised" 5 years ago?

_________________
B5D6A1D019D5D45BCC56F4782AC220D8B3E2A6CC

cdimauro 
Re: Packed Versus Planar: FIGHT
Posted on 7-Oct-2022 21:43:59
#462 ]
Elite Member
Joined: 29-Oct-2012
Posts: 3650
From: Germany

@Gunnar

Quote:

Gunnar wrote:
Dear Cesare Di Mauro,

Quote:

Those are SOME examples and ALL GAMES. What do you want to prove with your examples?


I explained with examples how these games create their game graphics.
And this showed why memory performance is very important for many games.
Many people don't know how games are coded, therefore it was useful to explain this.

And again: only games...
Quote:
I also explained why AMMX highly improves how much you can draw with the available memory bandwidth of your system.
The reason is that on PowerPC and other CPUs you always need 3 Memory operations for common drawing techniques. AMMX allows you to do the same with only 2 Memory operations.
This means AMMX gives you a 50% advantage here.

Not always. I'll clarify it below.
Quote:
Quote:
150MB/s for the AmigaOnes looks too low. Even ridiculous, since they can mount 133MHz SDRAMs, which means >1GB/s of available bandwidth.


Yes, the AmigaOne XE does significantly underperform in memory performance.
This is a well-known fact.

The PowerPC G3 and G4 CPUs cannot automatically do memory prefetches;
this is why they are generally never good in normal memory performance.
The G2 of the Efika and the 4xx CPU of the SAM also can not do memory prefetches.
They are also very poor in memory performance.

The G5 970 IBM PowerPC is significantly better than all others in memory performance.
The G5 does do automatic memory prefetches.
The APOLLO 68080 does also do automatic prefetches and has automatic stream detection.
This is the reason why the APOLLO 68080 is better in memory performance than G3/G4 and 440/460 systems.

That doesn't seem to be true, looking at the benchmarks posted by kolla.
Quote:
BTW, the AmigaOne XE also underperforms in bus performance. The AmigaOne XE has an AGP2 graphics port which in "theory" could reach good speed, but the real bus performance of this port on the AmigaOne is low.

But it could be good enough for transferring the frame buffers from the system memory to the video memory.

However you seem to "forget" that AGP means that those AmigaOnes are connected to... drum roll... a (discrete) graphics card.

Which might have "just a little bit" better performance on graphics-related tasks, including the ones for rendering 2D games.
Quote:
Quote:
You're using the SDRAM's Data Mask. Nice trick.

However, it works only on 8-bit / 256-color games.


You are wrong.

The 50% efficiency boost of AMMX works in any GFX format.
The Apollo 68080 CPU can use this in 8-Bit/256 color mode, in 15bit mode, in 16bit mode and also in the truecolor modes.

The games that I posted use all of these modes.
Some games use 8bit, some are 16bit, some are truecolor. Some games even mix modes.
All of the games highly benefit from the AMMX memory efficiency boost.

I'll talk about it below.
Quote:
Quote:

With 16 and 32-bit graphics this trick isn't possible, because STOREM is available only for bytes stores (no 16-bit and 32-bit versions of the instructions are available, AFAIR).

Your claim is wrong.

You have never coded AMMX,

And I don't need to.
Quote:
and you did not understand the documentation.

Maybe you need to write good documentation?
Quote:
Help us understand WHY you always make claims if you don't know anything about this topic?

Do you understand the meaning of the R in AFAIR?

I was R=recalling that STOREM was used for such byte masking operations but it didn't work for anything other than bytes. And I was right. In fact, from your documentation:
http://www.apollo-core.com/instructions.htm?b=4&i=STOREM
store a selection of bytes from a into destination
[...]
It is usual in SIMD that you can write only the native amount of bytes at once.
This instruction enables selective overwriting of memory, based on the contents
of the mask register (lower 8 Bit).


But this isn't suitable for 16-bit and 32-bit modes. In fact, the example provided at the bottom shows how to use it for 16-bit pixel size, but it requires additional instructions to "compose" the mask. So, it's NOT efficient for this task.

So, up to this point I was perfectly right.

Now I saw that there's another instruction that was added (I don't know when, since you keep adding stuff) and that it matches those 16-bit and 32-bit cases:
http://www.apollo-core.com/instructions.htm?b=4&i=STOREM3

Effectively it works on a 16-bit format, but not in all cases. Let's quote the documentation:

2) WORD / 16bit mode
In this mode each WORD of the source is stored to the destination which
has not the value of $F81F.
This is perfect for copying sprites where PINK color indicated transparent.

3) WORD / 15bit mode
In this mode each WORD of the source is stored to the destination which
has not Bit(15) set.
This is perfect for copying sprites in 32K color mode.

4) LONG
In this mode each LONG of the source is stored to the destination which
has not Bit(31) set.
This is perfect for copying sprites in truecolor mode

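The three modes quoted above can be modelled in a few lines. This is only a Python sketch of the documented selection rule, not of the real instruction: the function names and mode labels are made up for illustration, and it handles one element at a time rather than a 64-bit register.

```python
# Sketch of the per-element selection rule quoted from the STOREM3 documentation.
# masked_store, is_transparent and the mode names are hypothetical helpers;
# the real instruction works on 64-bit registers, not Python lists.

def is_transparent(value, mode):
    """Return True if the source element must NOT be stored."""
    if mode == "word16":   # 16-bit mode: the value $F81F marks transparency
        return value == 0xF81F
    if mode == "word15":   # 15-bit mode: bit 15 set marks transparency
        return (value & 0x8000) != 0
    if mode == "long":     # 32-bit mode: bit 31 set marks transparency
        return (value & 0x80000000) != 0
    raise ValueError("unknown mode: " + mode)

def masked_store(src, dst, mode):
    """Copy src over dst, skipping transparent elements; returns a new list."""
    return [d if is_transparent(s, mode) else s for s, d in zip(src, dst)]

sprite = [0xF81F, 0x1234, 0xF81F, 0x0000]
background = [0xAAAA, 0xBBBB, 0xCCCC, 0xDDDD]
print([hex(v) for v in masked_store(sprite, background, "word16")])
```

In `long` mode this rule treats `0x80000000` and `0xFF000000` alike, which is the objection raised below about the alpha channel (assuming alpha sits in the top byte).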

3) is OK, but ONLY for the big-endian 16-bit format. It lacks the little-endian one.

2) arbitrarily assumes that the color value to be used for the overlay should always be $F81F. So, it doesn't work with other overlay values, like $0000 for example. And, of course, it's related to a single endianness (but this is irrelevant in this context: what's important is to define the specific color to be used as the overlay signal).

4) this 32-bit mode suffers from the same endianness issues as 3), but it's even worse: it arbitrarily assumes that only the MSB is, and should be, enough to signal the overlay. Whereas, as we know, the alpha channel is 8 bits wide, and any value between 0 and 255 is valid. Here, it assumes that all values between 128 and 255 signal an overlay.

Summing it up, your previous statement:

The Apollo 68080 CPU can use this in 8-Bit/256 color mode, in 15bit mode, in 16bit mode and also in the truecolor modes.

is plainly false. Because it doesn't support all modes, but only some, and for some it's making totally arbitrary assumptions.

BTW, it doesn't support 24-bit modes (which usually are used for saving memory and bandwidth, at least for the graphic framebuffer).
Quote:
Quote:
Let's clarify (again) one thing: those are SOME, SPECIFIC, cases, whereas my statement was GENERAL.


The general problem with many of your statements is that you speak without having a clue.
And then you talk nonsense.

Denigrating your interlocutor is your typical arrogant behavior.

Despite your wall-of-text my sentence:

But it's when you talk about performances and saying that AMMX is faster than a PowerPC then you're lying.

still applies, because you've proved NOTHING against it.
Quote:
Above we have several examples of this:
You did not know the memory performance problems of AmigaOne

Which is OK: I never claimed the contrary.
Quote:
You did not know that G4 PowerPC can not do automatic memory prefetches and therefore has a general performance problem - also on the MAC and other systems.

Same as before.
Quote:
You did not know that AMMX can use the 50% memory performance advantage on any Graphic mode.
On the TINA project you did not know that the FPGA can not use a 128Bit memory bus
and you did not know that the FPGA can not reach 400MHz clockrate (but in fact struggles to reach 200)

"If you repeat a lie often enough, people will believe it, and you will even come to believe it yourself." - Joseph Goebbels

Same as before. Replied here: https://amigaworld.net/modules/newbb/viewtopic.php?topic_id=44169&start=200&post_id=855345&order=0&viewmode=flat&pid=0&forum=17#855345
Quote:
There is nothing wrong in not knowing stuff. No one can know everything.

In fact, I never claimed to know everything, right?
Quote:
What is not good is making false claims - if you not know the topic.

Which I haven't made: you continue to mystify, as usual.

Maybe this is the only thing that you can do in a discussion, as you clearly proved several times.
Quote:
No one expects that people here have experience and knowledge how games are typically coded.
And this is OK.
And to explain how games are often coded I gave some examples showing you this.

No, you gave examples of how SOME games are coded on SOME platforms.

But, still, nothing that can refute my statement that YOU quoted and that I report again for your pleasure:

But it's when you talk about performances and saying that AMMX is faster than a PowerPC then you're lying.

Even more important, you're miserably lying by comparing a whole system, the Apollo Core, which ALSO handles graphics, with others where it's supposed to do ALL the work.

In fact, and as I've already said in the middle of this post, AmigaOnes can (well, NEED to) connect a discrete graphics card.

And I'm pretty sure that games which properly make use of it (e.g.: transferring graphic data to its VRAM and using the card's graphic primitives to do the cookie-cut operations and, in general, what's needed) could render all games that you mentioned much faster than the best Vampire card.

cdimauro 
Re: Packed Versus Planar: FIGHT
Posted on 7-Oct-2022 21:52:26
#463 ]
Elite Member
Joined: 29-Oct-2012
Posts: 3650
From: Germany

@Gunnar

Quote:

Gunnar wrote:
Dear Cesare Di Mauro,

please be so kind and stop pretending you know stuff if you have no knowledge in this area.

Which isn't the case, dear liar.
Quote:
On the AmigaONE memory problem topic your post was:
"The low performance can't be correct, as there is 133 printed on the memory stick"

I think the Americans call this "talking out of your ass"?

No, it's a simple case of mystification, since those were NOT my words, but what YOU invented and are miserably trying to attribute to me.

Specifically, I wrote this:
150MB/s for the AmigaOnes looks too low. Even ridiculous, since they can mount 133MHz SDRAMs, which means >1GB/s of available bandwidth.
[...]
Yes, IF the AmigaOnes have such ridiculous memory bandwidth.


which are completely different (see the highlighted parts, which the average Joe should understand) from what you pretend to assign to me. Liar!
Quote:
Why did you not ask someone to run "BUSTEST" for you to give you more information?

Why should I? You already did it and nobody replied 'til the last message where you explicitly asked kolla.
Quote:
In the AMMX memory performance efficiency discussion your post was:
"STOREM only supports 8bit mode, this will not work for 16bit or truecolor!"
Your claim was wrong.
Why could you not ask "I see that StoreM does improve 8Bit, but does it also support 16bit and 32bit mode?"

Maybe because I've "asked"? Here's my statement:

With 16 and 32-bit graphics this trick isn't possible, because STOREM is available only for bytes stores (no 16-bit and 32-bit versions of the instructions are available, AFAIR).

Do you know the meaning of AFAIR?
Quote:
Cesare, some of your posts are clever and good.
But your habit of making false claims without knowing does not help anyone, really!
Could you maybe, please consider to change this habit?
What do you think ?

The usual thing: that you're a liar and mystifier. See above.

cdimauro 
Re: Packed Versus Planar: FIGHT
Posted on 7-Oct-2022 21:56:05
#464 ]
Elite Member
Joined: 29-Oct-2012
Posts: 3650
From: Germany

@Gunnar

Quote:

Gunnar wrote:
@kolla

Quote:

kolla wrote:
I have both G4 systems with altivec and actual Vampire systems.


Maybe you can help Cesare with some numbers then?

Maybe you can run on your AmigaOne XE some memory benchmark?
Like bustest, stream, minibench?

Please also state with the results what version your XE is, which CPU model it has and which clock rate it runs at.

Thanks



Here are the numbers:

@kolla

Quote:

kolla wrote:
@Gunnar

Quote:

Gunnar wrote:
@kolla

[quote]
kolla wrote:
I have both G4 systems with altivec and actual Vampire systems.


Maybe you can help Cesare with some numbers then?

Maybe you can run on your AmigaOne XE some memory benchmark?
Like bustest, stream, minibench?

Please also state with the results what version your XE is, which CPU model it has and which clock rate it runs at.

Thanks


Vampire:


1.Minne:> bustest SIZE 16m FAST
BusSpeedTest 0.19 (mlelstv) Buffer: 16777216 Bytes, Alignment: 32768
========================================================================
memtype  addr       op      cycle    calib   bandwidth
fast     $08EC8000  readw   13.2 ns  normal  151.0 * 10^6 byte/s
fast     $08EC8000  readl   16.3 ns  normal  244.8 * 10^6 byte/s
fast     $08EC8000  readm   19.7 ns  normal  202.9 * 10^6 byte/s
fast     $08EC8000  writew  27.2 ns  normal   73.6 * 10^6 byte/s
fast     $08EC8000  writel  13.7 ns  normal  292.0 * 10^6 byte/s
fast     $08EC8000  writem  26.8 ns  normal  149.4 * 10^6 byte/s
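The bandwidth column appears to follow from the cycle column: bytes moved per operation divided by the time per operation. The Python sketch below checks that reading against the figures above. The operand sizes (2 bytes for `readw`/`writew`, 4 bytes per measured cycle for the `l` and `m` operations) are an inference from the numbers, not something BusSpeedTest states here.

```python
# Assumption: bandwidth = bytes per operation / cycle time.
# Operand sizes are inferred from the figures, not taken from BusSpeedTest documentation.
ROWS = [
    # (op, cycle in ns, reported bandwidth in 10^6 byte/s)
    ("readw", 13.2, 151.0),
    ("readl", 16.3, 244.8),
    ("readm", 19.7, 202.9),
    ("writew", 27.2, 73.6),
    ("writel", 13.7, 292.0),
    ("writem", 26.8, 149.4),
]
BYTES_PER_OP = {"w": 2, "l": 4, "m": 4}  # keyed by the last letter of the op name

def bandwidth_mb_s(op, cycle_ns):
    """Bandwidth in 10^6 byte/s for one operation of the given cycle time."""
    return BYTES_PER_OP[op[-1]] / (cycle_ns * 1e-9) / 1e6

for op, cycle_ns, reported in ROWS:
    computed = bandwidth_mb_s(op, cycle_ns)
    print(f"{op:7s} computed {computed:6.1f}  reported {reported:6.1f}")
```

The computed values land within about half a percent of the reported ones, which is consistent with the cycle times being rounded to one decimal.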



PowerMac10,1 (287 - Mac mini), 7447A @ 1410 MHz:

----------------------------------------------------------------
STREAM Memory Benchmark v0.3
Gunnar von Boehn
----------------------------------------------------------------
The Test will run some minutes please be patient.
Total memory required = 32.0 MB.
Each test is run 3 times, but only the *best* time for each is used.
----------------------------------------------------------------


Memory throughput Working on Arrays of 16 MB.
----------------------------------------------------------------
Read test (summing up the array).
----------------------------------------------------------------
Function Rate (MB/s) Avg time Min time Max time
read 8 615.3605 1.3004 1.3001 1.3007
read 32 1846.3039 0.4334 0.4333 0.4335
read 64 2395.3678 0.3342 0.3340 0.3343
read 32x2 2017.0087 0.3967 0.3966 0.3968
read 32x4 2145.0935 0.3729 0.3729 0.3730
read 32 CP3 7854.4467 0.1019 0.1019 0.1020
read 32 CP4 9005.5561 0.0889 0.0888 0.0890
read 32 CP5 * 10557.5513 0.0758 0.0758 0.0759
read 32 CP6 10477.6398 0.0764 0.0764 0.0765
read 32x4 CP3 8709.1708 0.0919 0.0919 0.0920
read 32x4 CP4 10168.0097 0.0787 0.0787 0.0788
read 32x4 CP5 10900.5250 0.0734 0.0734 0.0735
read 32x4 CP6 10416.2640 0.0768 0.0768 0.0769
----------------------------------------------------------------
Write test (setting array A).
----------------------------------------------------------------
Function Rate (MB/s) Avg time Min time Max time
write 8 672.8457 1.1892 1.1890 1.1895
write 32 1859.6173 0.4710 0.4302 0.5117
write 64 2719.3868 0.2944 0.2942 0.2946
write 32x2 2126.3414 0.3763 0.3762 0.3763
write 32x4 2264.1714 0.3534 0.3533 0.3535
memset 750 * 7871.3796 0.1017 0.1016 0.1017
memset 750 0 7895.4763 0.1014 0.1013 0.1014
libmoto memset 7831.1660 0.1024 0.1022 0.1027
glibc memset 4566.6517 0.1752 0.1752 0.1752
glibc memset0 7926.2126 0.1010 0.1009 0.1010
----------------------------------------------------------------
ACompare test (comparing the source and destination arrays).
----------------------------------------------------------------
Function Rate (MB/s) Avg time Min time Max time
cmp 8 971.4780 1.6470 1.6470 1.6470
cmp 32 2839.0042 0.5640 0.5636 0.5645
cmp 64 3602.7040 0.4629 0.4441 0.4816
cmp 32x2 3090.6772 0.5181 0.5177 0.5185
cmp 32x4 3277.2513 0.4889 0.4882 0.4896
cmp 32 CP2 7448.5126 0.2152 0.2148 0.2155
cmp 32 CP3 * 10228.4817 0.1566 0.1564 0.1568
cmp 32 CP4 10071.1889 0.1591 0.1589 0.1592
cmp 32 CP5 10278.4092 0.1559 0.1557 0.1561
cmp 32 CP6 9889.4712 0.1620 0.1618 0.1621
cmp 32x4 CP2 10295.8656 0.1559 0.1554 0.1564
cmp 32x4 CP3 10797.1741 0.1482 0.1482 0.1483
cmp 32x4 CP4 10368.5308 0.1544 0.1543 0.1546
cmp 32x4 CP5 10650.0368 0.1504 0.1502 0.1506
cmp 32x4 CP6 10132.5458 0.1579 0.1579 0.1579
libmoto memcmp 419430400.0000 0.1959 0.0000 0.3918
glibc memcmp 12647.1357 0.1582 0.1265 0.1898
----------------------------------------------------------------
Copy test (copying array A -> B).
----------------------------------------------------------------
Function Rate (MB/s) Avg time Min time Max time
copy 8 1050.2341 1.5236 1.5235 1.5238
copy 32 2569.0632 0.6228 0.6228 0.6228
copy 64 3429.5173 0.4672 0.4665 0.4678
copy 32x2 2306.4525 0.6939 0.6937 0.6940
copy 32x4 2284.5656 0.7005 0.7004 0.7007
copy 32 CP2 6425.8055 0.2491 0.2490 0.2491
copy 32 CP3 7962.4195 0.2010 0.2009 0.2011
copy 32 CP4 * 8337.8099 0.1920 0.1919 0.1922
copy 32 CP5 8559.1163 0.1871 0.1869 0.1872
copy 32x4 CP2 7647.2719 0.2093 0.2092 0.2093
copy 32x4 CP3 7997.4812 0.2003 0.2001 0.2005
copy 32x4 CP4 8376.9640 0.1910 0.1910 0.1911
copy 32x4 CP5 8350.5504 0.1918 0.1916 0.1919
copy 64x4 CP4 8356.0507 0.1915 0.1915 0.1915
copy 64x4 CP4C 8723.2878 0.1834 0.1834 0.1835
glibc memcpy 5139.4565 0.3114 0.3113 0.3116
bmove512 5116.5921 0.3129 0.3127 0.3132
FC64 8247.9293 0.1941 0.1940 0.1943
libmoto memcpy 8660.4487 0.1848 0.1847 0.1849
memcpy 750 6418.2035 0.2495 0.2493 0.2496
----------------------------------------------------------------
[/quote]


But you don't like them:
@Gunnar

Quote:

Gunnar wrote:
@kolla

Quote:

Vampire:

1.Minne:> bustest SIZE 16m FAST
BusSpeedTest 0.19 (mlelstv) Buffer: 16777216 Bytes, Alignment: 32768
========================================================================
memtype  addr       op      cycle    calib   bandwidth
fast     $08EC8000  readw   13.2 ns  normal  151.0 * 10^6 byte/s
fast     $08EC8000  readl   16.3 ns  normal  244.8 * 10^6 byte/s
fast     $08EC8000  readm   19.7 ns  normal  202.9 * 10^6 byte/s
fast     $08EC8000  writew  27.2 ns  normal   73.6 * 10^6 byte/s
fast     $08EC8000  writel  13.7 ns  normal  292.0 * 10^6 byte/s
fast     $08EC8000  writem  26.8 ns  normal  149.4 * 10^6 byte/s



Your core seems to be outdated by some years, maybe you should update your core.

For comparison, what I do get here:
Quote:

fast $02CB8000 writel 7.2 ns normal 556.4 * 10^6 byte/s
fast $02CB8000 writem 7.2 ns normal 555.8 * 10^6 byte/s

Mind that both values are not only higher than yours but also the same speed.


Mind that the Display DMA will eat also bandwidth.
This looks like a disadvantage for the Vampire and an advantage for the AmigaONE XE, but in reality it is the opposite. In reality, the Super-AGA display from fast memory is a major advantage for these games.

Let me help you understand why:

These 2D Games typically compose their screen inside Fast memory.
And then the game will copy the Frame to the GFX card.
This copy does need to read the frame from Fastmem and then needs to copy the frame over the bus to the GFX card. This copy takes significant time.
This takes more time than the DMA eats when just reading from Fastmem.
Everyone will understand that Read+write, takes longer than just Read.

We so far spoke about the "compose" time, we did not mention that the AmigaONE XE will need extra time to copy the frame too - while the Vampire does not need to copy to display.

I assume you ran this on 1280x720 32bit mode? Could this be?
This means you calculated, on an old, slower core, the time available after the screen "display" was done.

Please mind that the AmigaONE XE does not even have the power to just copy the screen in this resolution to the GFX card fluently (50/60Hz).
The AmigaOne can never do such games in this resolution fluently.

The Vampire is also not designed to make 1280x720 32bit action games in 50Hz.
This is clear. The Vampire has more power than the AmigaOne but not enough to run these fast action games in this resolution.

If you want to make a reasonable comparison then set your display to a mode used by the games.
Then you will see what bandwidth for composing is available.
SONIC uses 320x200 16bit, DIABLO uses 640x480 8bit, for example.
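How much bandwidth a screen mode costs is plain arithmetic: width times height times bytes per pixel times refresh rate. A small Python sketch for the modes mentioned in this post follows; the 50 Hz default and the helper name are assumptions, and it counts a single pass over the frame only.

```python
# Hypothetical helper: bandwidth needed to touch every pixel once per frame.
def frame_bandwidth_mb_s(width, height, bits_per_pixel, hz=50):
    """Return the per-pass frame bandwidth in 10^6 byte/s."""
    return width * height * (bits_per_pixel // 8) * hz / 1e6

MODES = {
    "320x200 16bit (SONIC)": (320, 200, 16),
    "640x480 8bit (DIABLO)": (640, 480, 8),
    "1280x720 32bit": (1280, 720, 32),
}

for name, (w, h, bpp) in MODES.items():
    print(f"{name}: {frame_bandwidth_mb_s(w, h, bpp):.2f} MB/s at 50 Hz")
```

A copy to a graphics card has to read and then write each pixel, so the memory and bus traffic for the copy is on top of whatever composing the frame costs.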




The numbers of your MAC are obviously Cache values and not memory speed.
But you probably saw this yourself.

Please mind that STREAM will run the test twice and give out 2 results.
The first run is memory speed, the second run "forced" to run in Cache size, to measure the cache performance.

So please make sure to post the 1st result and the 2nd block!

But maybe this was just a wrong EXE with only cache test enabled.
The EXEs were done 16 years ago, when I created LINUX memcopy optimizations for POWER. But the official website of this does not even exist anymore. I just grabbed an old EXE from some old folder without being able to check it as I have no PPC running anymore.

Sorry if the EXE did only one test, then this was not the right version. Maybe you can run BUSTEST on a real AMIGAONE XE?
Or run minibench?
http://apollo-core.com/minibench/index.htm?page=downloads

Or download and compile STREAM from here:
https://www.cs.virginia.edu/stream/




And you don't like them even after kolla clarified:
@kolla

Quote:

kolla wrote:
@Gunnar

The core is the just released 2.16


Because it shows something different from your propaganda...

cdimauro 
Re: Packed Versus Planar: FIGHT
Posted on 7-Oct-2022 22:14:07
#465 ]
Elite Member
Joined: 29-Oct-2012
Posts: 3650
From: Germany

@Gunnar

Quote:

Gunnar wrote:

Alphablending is very easy for the 68080

.loop
load (a0),D0
mulalpha (a1)+,D0
store D0,(a0)+
dbra.l d1,.loop


The 68080 can do full alpha blending of 2 sources and writing 1 destination in 3 clocks per loop = processing 3x64bit per loop. This means three 64bit memory access in three clock cycles.

BTW, there's no trace of this instruction in YOUR documentation:

http://www.apollo-core.com/instructions.htm?b=4&z=6vmpGr

Your documentation sucks! A LOT. You should better spend time on it.

Hypex 
Re: Packed Versus Planar: FIGHT
Posted on 8-Oct-2022 5:14:59
#466 ]
Elite Member
Joined: 6-May-2007
Posts: 11220
From: Greensborough, Australia

@Hammer

Quote:
Reminder, Amiga Hombre chipset targeted OpenGL compatibility.


Yes, and it was a new departure. A wholly new design whose only relation to the Amiga hardware itself was in name only. Could Amiga gamers accept a new machine that didn't have that 2D chipset at $DFF000?

Quote:
Emu68 has its non-SMP 68K multi-threading evolution path. This example shows four threads running path tracers on Emu68.


I wonder how this is done? AmigaOS has no support for threads. The only way I see for this to work is if the code ran threads on the Emu68 host CPU directly.

Quote:
Vampire V2/V4 is not the only post-Commodore Amiga accelerator solution with an evolution fork.


No. But it builds on the original $DFF000 chipset. So Amiga gamers like it.

Gunnar 
Re: Packed Versus Planar: FIGHT
Posted on 8-Oct-2022 8:05:32
#467 ]
Regular Member
Joined: 25-Sep-2022
Posts: 477
From: Unknown


Dear Cesare Di Mauro,


I'm surprised that you continue to argue.
Do you really want to show all the world that you talk bullshit?

Quote:


The games that I posted use all of these modes.
Some games use 8bit, some are 16bit, some are truecolor. Some games even mix modes.
All of the games highly benefit from the AMMX memory efficiency boost.

I'll talk about it below.
Quote:
Quote:

With 16 and 32-bit graphics this trick isn't possible, because STOREM is available only for bytes stores (no 16-bit and 32-bit versions of the instructions are available, AFAIR).

Your claim is wrong.



Quote:

So, up to this point I was perfectly right.

There are 6 different types of STOREM.
Of course you need to use the BYTE version for 8bit, WORD version for 16bit, LONG version for 32bit mode. Just as on 68K you use MOVE.B for BYTE and MOVE.W for WORD



Quote:

Quote:

The Apollo 68080 CPU can use this in 8-Bit/256 color mode, in 15bit mode, in 16bit mode and also in the truecolor modes.

3) is OK, but ONLY for the big-endian 16-bit format. It lacks the little-endian one.

is plainly false. Because it doesn't support all modes, but only some, and for some it's making totally arbitrary assumptions.


68K uses Bigendian format. Amiga memory layout is also Bigendian.
Copper lists are Bigendian, Amiga screenmodes are Bigendian, audio played by PAULA, the Amiga audio chip, is Bigendian. Also our SAGA chunky modes are Bigendian.
PCs use LittleEndian.
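What the byte-order difference means for a 16-bit pixel can be shown with Python's standard `struct` module. The value `0xF81F` is the transparency key from the STOREM3 documentation quoted earlier in the thread; the variable names are only illustrative.

```python
import struct

PINK = 0xF81F  # 16-bit transparency key from the quoted STOREM3 documentation

big = struct.pack(">H", PINK)     # big-endian byte order (68K / Amiga)
little = struct.pack("<H", PINK)  # little-endian byte order (x86 PC)

print(big.hex(), little.hex())

# Reading little-endian pixel data as if it were big-endian swaps the two bytes,
# so a compare against $F81F would no longer match.
misread = struct.unpack(">H", little)[0]
print(hex(misread))
```

This is why a key-color compare only works when the pixel data is stored in the byte order the instruction expects.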

I'm surprised that you do not know this.


Quote:

BTW, it doesn't support 24-bit modes (which usually are used for saving memory and bandwidth, at least for the graphic framebuffer).

Yes, games normally use 32bit truecolor mode and not 24bit.
A game coder would know this.

That said, you can also use STOREM in 24bit mode with an extra instruction.


Quote:
Let's clarify (again) one thing: those are SOME, SPECIFIC, cases, whereas my statement was GENERAL.




Quote:

But it's when you talk about performances and saying that AMMX is faster than a PowerPC then you're lying.

Please stop making this up!
I never said that AMMX is faster than all PowerPC in all and any cases.

I said that AMMX is highly tuned for video and game operations.
I said AMMX offers for games very useful instructions that ALTIVEC does not.
I said that AMMX offers instruction which allow you to code some often used game operations in a much more efficient way than you can on PowerPC. In other words you save memory bandwidth and get more FPS out of your system.





Quote:

Even more important, you're miserably lying by comparing a whole system, the Apollo Core, which ALSO handles graphics, with others where it's supposed to do ALL the work.


I never said AMMX would be tuned for PC modes which SuperAGA does not support.
Super-AGA is an Amiga chipset and like everything else in Amiga also our Graphic Formats are BigEndian. All the games we spoke about are 100% BigEndian.
All the code is in BigEndian, all the GFX in them is in BigEndian and also all the Audio is in BigEndian.







Quote:

And I'm pretty sure that games which properly make use of it (e.g.: transferring graphic data to its VRAM and using the card's graphic primitives to do the cookie-cut operations and, in general, what's needed) could render all games that you mentioned much faster than the best Vampire card.


In theory yes.

But in practice many games that we have access to
are coded for the CPU to compose the graphics, as I explained.
ROBIN HOOD for example, a very successful PC game which I ported to MorphOS, is coded on the basis that the CPU does all the graphic composition.
DIABLO, also a very successful PC game that we ported, is designed the same.
Command and Conquer, also a very successful PC game that we ported, is designed the same.
Northland, also a very successful PC game, is designed the same.
Age of Empires, also a very successful PC game, is designed the same.
Desperados, also a very successful PC game, is designed the same.
Commandos, also a very successful PC game, is designed the same.
Star Craft, also a very successful PC game, is designed the same.
and thousands more games, are designed for the CPU to compose the graphics.


If you port such a game to Amiga, then you can keep the whole game engine intact as it was designed. And you can use AMMX in the sprite routines to get a good performance boost in the game. This is very easy. AMMX is very easy to code and very easy to use, an order of magnitude easier than ALTIVEC.

The goal behind AMMX is to make coding easier and to improve performance.
And this is exactly what it does.

You can take a Game source in C, add a few lines ASM in the sprite copy routine and you have a port running fast and good on Amiga.

Of course in theory one could also rewrite all these games to NOT use the CPU for composing but instead to use Graphic card operations and to stay in the limited amount of VRAM. But this would need that you rewrite the game completely. This would be 100 times more work.

Lets just look at DIABLO.
DIABLO a PC game, does compose the GFX with the CPU.
The Amiga port works the same.
The OS 4 PowerPC port works the same.
The 68K version uses AMMX (if available) to copy the sprites.
The 68K version runs the game much faster on the 85MHz 68080 than it runs on a 1000MHz PPC OS 4 machine.




Gunnar 
Re: Packed Versus Planar: FIGHT
Posted on 8-Oct-2022 8:12:23
#468 ]
Regular Member
Joined: 25-Sep-2022
Posts: 477
From: Unknown

@kolla

Quote:

kolla wrote:
@Gunnar

The core is the just released 2.16


Ah ... you have an old V2 card.
My bad, when you said you have an "actual card" I was thinking we speak about V4 generation cards.


Which Screenmode did you use when doing the measurement?

Did you use a Screenmode that the PowerPC AmigaOne XE systems are not able to copy to the GFX card in real time?

Last edited by Gunnar on 08-Oct-2022 at 08:31 AM.

Gunnar 
Re: Packed Versus Planar: FIGHT
Posted on 8-Oct-2022 8:30:59
#469 ]
Regular Member
Joined: 25-Sep-2022
Posts: 477
From: Unknown

Quote:

And you don't like them even after that kolla clarified:
Because it shows something different from your propaganda...


Cesare Di Mauro, maybe you did not understand what the numbers show?
Maybe I did not explain this good enough for you.
Let me help you.


Kolla's result showed exactly what I said.
Kolla has different numbers than me as I use a V4 and he a V2.
Kolla said he would have an "actual Vampire card",
therefore I assumed he had the new generation card.



Many thousands of 2D games work the same.
They compose the graphics with the CPU, and then they copy the GFX to the GFX card.
You need for the compose memory bandwidth and you need for the screencopy memory bandwidth.

The Amiga One XE has ~ 180 MB/sec memcopy maximum bandwidth.

I think that Kolla ran on the Vampire a screenmode which needs 360 MB/sec bandwidth for the frame copy. And even in this mode the Vampire had still 240-300MB Bandwidth free for composing.

The AmigaOne XE has a bandwidth of 180MB,
to reach the same game speed as Kolla measured on the Vampire
the AmigaOne XE would need 360+240 bandwidth.
And yes, if you use AMMX for sprite copies then you get even more out of the bandwidth.
This means what you get out of the 600MB would be as if you had 650 or 700 MB bandwidth.

To reach the same game speed, the AmigaOne would need a memory bandwidth of 650-700 MB, but it has only 180 MB.
And we have not even discussed that the AGP port is slow...

Kolla's result clearly shows the memory performance advantage of the Vampire over the AmigaOne.
As 650 MB/sec is more than 180 MB/sec
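The budget argued above can be written out as arithmetic. All figures in this Python sketch are the ones claimed in this post (estimates from the discussion, not measurements made here), and the variable names are only illustrative.

```python
# Figures as claimed in the post, in MB/s. None of them are measured here.
frame_copy = 360           # claimed cost of the frame copy in the assumed screen mode
compose_free = (240, 300)  # claimed bandwidth still free for composing on the Vampire
amigaone_xe = 180          # claimed maximum memcopy bandwidth of the AmigaOne XE

required = frame_copy + compose_free[0]  # lower bound of the claim: 360 + 240
shortfall = required / amigaone_xe       # how many times over the XE budget this is

print(f"required >= {required} MB/s, AmigaOne XE claimed at {amigaone_xe} MB/s "
      f"({shortfall:.1f}x over budget)")
```

The conclusion depends entirely on the 360 MB/s and 180 MB/s inputs, both of which are assumptions of the post rather than numbers reported by kolla.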




Cesare, did this help you understand it now?
Or shall I explain it again?

Last edited by Gunnar on 08-Oct-2022 at 08:45 AM.

kolla 
Re: Packed Versus Planar: FIGHT
Posted on 8-Oct-2022 9:00:00
#470 ]
Elite Member
Joined: 21-Aug-2003
Posts: 2896
From: Trondheim, Norway

@Gunnar

Quote:

Gunnar wrote:
@kolla

Quote:

kolla wrote:
@Gunnar

The core is the just released 2.16


Ah ... you have an old V2 card


As I wrote initially - I have both G4 systems and actual Vampire systems.

Quote:

My bad, I was thinking all the time about V4 generation cards.


Yes, like I wrote - I have actual Vampire systems. V4 systems are not.

Quote:

Which Screenmode did you use when doing the measurement?


If your test has special prerequisites, you should have stated so.

Quote:
Did you use a Screenmode that the PowerPC AmigaOne XE systems are not able to copy to the GFX card in real time?


Why this obsession with AmigaOne XE?

And why isn't SAGA (aka Super-AGA) open sourced yet?

Is Slamtilt a really difficult game?

_________________
B5D6A1D019D5D45BCC56F4782AC220D8B3E2A6CC

Gunnar 
Re: Packed Versus Planar: FIGHT
Posted on 8-Oct-2022 9:04:58
#471 ]
Regular Member
Joined: 25-Sep-2022
Posts: 477
From: Unknown


Quote:
Cesare, some of your posts are clever and good.
But your habit of making false claims without knowing does not help anyone, really!
Could you maybe please consider changing this habit?
What do you think?


The usual thing: that you're a liar and mystifier. See above.


Dear Cesare Di Mauro,

your compliments don't change the facts.

You try to present yourself here as an AMIGA expert.
The reality is that thousands of Amiga users have a lot more Amiga coding experience than you.
While other Amiga users code games and demos, you don't code; you prefer to spend your time posting in the forum.

You are a pretender.

You often talk about stuff that you have never done yourself,
and you make assumptions and claims based on information that you "assume" to be true.

And it regularly happens that you don't inform yourself properly
or you misunderstand the Internet source you base your "knowledge" on.
And then you "talk out of your ass".

But you are not alone.
Matthew Hey often does the same.
Matthew Hey also often talks total bullshit, because he has not informed himself properly.


Guys, why do you need to spend your time "pretending" in forums?
Why don't you code something for Amiga instead?
Write a good web browser for Amiga or code a new game!
Do something useful!

kolla 
Re: Packed Versus Planar: FIGHT
Posted on 8-Oct-2022 9:21:59
#472 ]
Elite Member
Joined: 21-Aug-2003
Posts: 2896
From: Trondheim, Norway

@Gunnar

Why are you here? Don’t you have your own forum?

When will SAGA be open sourced?
Can you showcase Slamtilt in interlaced mode on V4?

_________________
B5D6A1D019D5D45BCC56F4782AC220D8B3E2A6CC

Gunnar 
Re: Packed Versus Planar: FIGHT
Posted on 8-Oct-2022 9:29:55
#473 ]
Regular Member
Joined: 25-Sep-2022
Posts: 477
From: Unknown


Quote:

Quote:
Did you use a Screenmode that the PowerPC AmigaOne XE systems are not able to copy to the GFX card in real time?


Why this obsession with AmigaOne XE?



We talk about the AmigaOne because it was used as an example earlier in the discussion.


If you follow the discussion, it went somewhat like this.

Me: AMMX is very nice for speeding up games.
Me: AMMX has some game coding instructions which other SIMD extensions, for example ALTIVEC, do not have.

But ALTIVEC is faster

Me: This depends on the use case. There are use cases where ALTIVEC is great, and there are use cases like game coding where AMMX is much nicer to use and has more performance.

maybe at the same clock. But the PowerPC Amigas have 800 MHz or 1000 MHz
1000 MHz = THIS IS ONE GIGAHERTZ!!!! THIS IS FASTER you can never beat a gigahertz


Me: Yes, 1000 MHz is a higher clock than 85 MHz.
But this does not mean that all games will run faster.
The 1000 MHz is inside the CPU only; the G4 bus to the mainboard runs at only 133 MHz.
And 133 MHz is actually very similar to the 85 / 100 MHz of the 68080.
And in the real world there are games running faster on the 68080 than on the PowerPC Amigas with 800 MHz or 1000 MHz.


Give us numbers as proof. Or you are a liar!

This is how we ended up here.

NutsAboutAmiga 
Re: Packed Versus Planar: FIGHT
Posted on 8-Oct-2022 9:35:24
#474 ]
Elite Member
Joined: 9-Jun-2004
Posts: 12819
From: Norway

@Gunnar

Not sure why you insist on comparing your Vampire against a crippled computer, because that's what the XE really is.

The XE had a lot of potential. It had two memory banks, which could have given better performance by interleaving them. It had AGP 2x, which could have given better performance because the GPU can access system memory on its own (in theory). The G4 has lots of L2 cache; the Sam460 has similar problems and its L2 is always off (it can be enabled, and we know it can have a huge impact on speed), but I'm not sure what the case is on the XE. A lot of XE features were never used, due to the "Articia S" chip.

I believe one of the issues with the Articia S was that it did not report when the DMA was done, so you need to take that into account. (It should be pretty easy to calculate, however: do some timekeeping.) Anyhow, the VIA 686B was also known as a problem chip in the PC industry before the XE was released.

On paper the XE looks better than the Pegasos II, but in reality the Pegasos II with its Marvell chipset always outperformed the XE at similar clock frequencies.

Let's summarize:
Using two memory sticks never worked reliably, it had issues with interrupts (IRQ was often disabled in drivers, using PIO4 mode), AGP never performed to specification, the USB ports were f*cked, and the L2 cache was most likely deactivated.

Last edited by NutsAboutAmiga on 08-Oct-2022 at 09:37 AM.

_________________
http://lifeofliveforit.blogspot.no/
Facebook::LiveForIt Software for AmigaOS

NutsAboutAmiga 
Re: Packed Versus Planar: FIGHT
Posted on 8-Oct-2022 9:44:57
#475 ]
Elite Member
Joined: 9-Jun-2004
Posts: 12819
From: Norway

@Gunnar

I think the most significant difference between the Vampire and the XE, if you want to compare the two, is that you can plug a 3D graphics card into the XE, so you can play some nice 3D games. In raw CPU performance it also has the advantage, and that does help with emulators. The XE also has a working FPU that can do math really well.

Yeah, it is a theoretical 133 MHz bus, but when the CPU is not the only chip accessing the RAM, you can benefit from interleaved memory banks, doubling the theoretical maximum.

Last edited by NutsAboutAmiga on 08-Oct-2022 at 10:00 AM.
Last edited by NutsAboutAmiga on 08-Oct-2022 at 09:51 AM.
Last edited by NutsAboutAmiga on 08-Oct-2022 at 09:50 AM.
Last edited by NutsAboutAmiga on 08-Oct-2022 at 09:47 AM.

_________________
http://lifeofliveforit.blogspot.no/
Facebook::LiveForIt Software for AmigaOS

Gunnar 
Re: Packed Versus Planar: FIGHT
Posted on 8-Oct-2022 9:54:33
#476 ]
Regular Member
Joined: 25-Sep-2022
Posts: 477
From: Unknown

@NutsAboutAmiga

Quote:

Let's summarize:
Using two memory sticks never worked reliably, it had issues with interrupts (IRQ was often disabled in drivers, using PIO4 mode), AGP never performed to specification, the USB ports were f*cked, and the L2 cache was most likely deactivated.


WOW, that is a lot of problems.


Let's make this clear.
I would not have started this comparison.
I was repeatedly asked to prove that AMMX can beat a 1000 MHz PPC.


My initial point was that AMMX is very nice and useful for game coders.
And I was comparing coding games with AMMX versus ALTIVEC simply because I have decades of ALTIVEC coding experience and I know the pain of trying to code games with it.
As you might know, I worked for IBM and coded some of IBM's SIMD performance libraries.
IBM uses Altivec a lot, even for speeding up certain DB2 database operations.


Cesare is angry with me because I called him out as a "Forum-Pretender".
Now Cesare tries to twist my every word to prove that I'm a "liar".


Last edited by Gunnar on 08-Oct-2022 at 10:04 AM.

NutsAboutAmiga 
Re: Packed Versus Planar: FIGHT
Posted on 8-Oct-2022 10:07:49
#477 ]
Elite Member
Joined: 9-Jun-2004
Posts: 12819
From: Norway

@Gunnar

Yeah.. putting a 1 GHz G4 in an XE instead of a standard 800 MHz/933 MHz one is comparable to putting a Ferrari motor inside a bulldozer. It is just not going to move any faster.

Well, you will see some difference, but it's not going to be wow.
Changing from 100 MHz to 133 MHz RAM might have a more noticeable effect. Not all XEs have 133 MHz RAM. Using PC100 is like operating at about 75% (100/133).
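
The PC100 versus PC133 figure can be checked with a one-line calculation. This is only a sketch comparing the nominal memory clocks; the function name `relative_speed_percent` is made up for illustration, and real-world throughput also depends on timings and the chipset.

```python
def relative_speed_percent(slow_mhz, fast_mhz):
    """Clock of the slower part as a percentage of the faster part."""
    return slow_mhz / fast_mhz * 100


# Nominal clocks only: PC100 = 100 MHz, PC133 = 133 MHz.
print(f"PC100 relative to PC133: {relative_speed_percent(100, 133):.2f}%")
```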

Last edited by NutsAboutAmiga on 08-Oct-2022 at 10:15 AM.
Last edited by NutsAboutAmiga on 08-Oct-2022 at 10:11 AM.
Last edited by NutsAboutAmiga on 08-Oct-2022 at 10:08 AM.

_________________
http://lifeofliveforit.blogspot.no/
Facebook::LiveForIt Software for AmigaOS

cdimauro 
Re: Packed Versus Planar: FIGHT
Posted on 8-Oct-2022 10:10:55
#478 ]
Elite Member
Joined: 29-Oct-2012
Posts: 3650
From: Germany

@Hypex

Quote:

Hypex wrote:
@Hammer

Quote:
Reminder, Amiga Hombre chipset targeted OpenGL compatibility.


Yes, and it was a new departure: a wholly new design whose only relation to the Amiga hardware was the name. Could Amiga gamers accept a new machine that didn't have that 2D chipset at $DFF000?

Quote:
Emu68 has its non-SMP 68K multi-threading evolution path. This example shows four threads running path tracers on Emu68.


I wonder how this is done? AmigaOS has no support for threads. The only way I can see this working is if the code ran the threads on the Emu68 host CPU directly.

Quote:
Vampire V2/V4 is not the only post-Commodore Amiga accelerator solution with an evolution fork.


No. But it builds on the original $DFF000 chipset. So Amiga gamers like it.

Not Amiga gamers: Amiga games coders.

Gunnar 
Re: Packed Versus Planar: FIGHT
Posted on 8-Oct-2022 10:12:45
#479 ]
Regular Member
Joined: 25-Sep-2022
Posts: 477
From: Unknown

Hello NutsAboutAmiga,
how are you?

Quote:

NutsAboutAmiga wrote:
@Gunnar

I think the most significant difference between the Vampire and the XE, if you want to compare the two, is that you can plug a 3D graphics card into the XE, so you can play some nice 3D games. In raw CPU performance it also has the advantage, and that does help with emulators. The XE also has a working FPU that can do math really well.


The focus of the POWERPC design was always a strong FPU.
IBM's market was selling POWER machines as number crunchers for FPU code.

What makes the FPU good on the POWERPC is that
- the FPU is fully pipelined: you can launch another FPU operation each clock, though each operation of course has a latency
- the FPU has 32 registers; this number of FPU registers is important for working around the latency.
With proper code you can get really good FPU speed on POWER.


The 68080 FPU is actually designed the same way.
- the 68080 FPU is fully pipelined: you can launch another FPU operation each clock
- the 68080 FPU can use 32 registers; this number of FPU registers is important for working around the latency.
With proper code you can get really good FPU speed on the 68080.
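
A rough sketch of why the register count matters on a fully pipelined FPU. This is a toy model, not a description of either chip: it assumes one operation can issue per clock, and the latency of 4 cycles is a hypothetical value chosen only for illustration. Each independent dependency chain needs its own register to hold an intermediate result, so more registers allow more chains in flight.

```python
def cycles_to_finish(n_ops, latency, chains):
    """Toy model: cycles until n_ops FPU operations have all completed.

    One operation may issue per cycle. Operation i belongs to chain
    i % chains and must wait for the result of operation i - chains.
    """
    if n_ops < 1:
        return 0
    issue = []  # issue cycle of each operation
    for i in range(n_ops):
        earliest = issue[i - 1] + 1 if i > 0 else 0
        if i >= chains:
            # Wait until the previous operation of the same chain is done.
            earliest = max(earliest, issue[i - chains] + latency)
        issue.append(earliest)
    return issue[-1] + latency


LATENCY = 4  # hypothetical FPU latency in cycles, not a measured value
for chains in (1, 2, 4):
    print(chains, "chain(s):", cycles_to_finish(64, LATENCY, chains), "cycles")
```

In this model a single dependent chain pays the full latency on every operation, while using at least as many independent chains as the latency keeps the pipeline busy every clock.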

Gunnar 
Re: Packed Versus Planar: FIGHT
Posted on 8-Oct-2022 10:25:27
#480 ]
Regular Member
Joined: 25-Sep-2022
Posts: 477
From: Unknown

Hello NutsAboutAmiga

Quote:

Yeah.. putting a 1 GHz G4 in an XE instead of a standard 800 MHz/933 MHz one is comparable to putting a Ferrari motor inside a bulldozer. It is just not going to move any faster.


Yes, the memory performance of the AmigaOne XE is bad.
But frankly this is not only a problem of the AmigaOne XE systems.

The Efika memory performance is not stellar.
The Pegasos 1 memory performance is very bad.
The Pegasos 2 memory performance is better than the Pegasos 1 but also not great.
I saw numbers suggesting that the SAM systems also have weak performance.
Frankly, G3 and G4 Macs do not have stellar memory performance either.

The general problem of all these PowerPC systems is that they do not automatically prefetch memory.
The IBM 970 does this, and it has great memory performance because of it.

The 68080 also automatically prefetches memory.
Did I mention that our Apollo team worked on high-end POWER CPUs at IBM?
We know all the tricks that IBM uses in its great chips.
We also know the tricks that INTEL uses.
Of course we brought great ideas from good CPUs to the 68k.
