Amigaworld.net - The Amiga Computer Community Portal Website

3D benchmark and X1000+PCI Radeon9250

thellier

3D benchmark and X1000+PCI Radeon9250
Posted on 23-Apr-2013 11:55:00

[ #1 ]

Regular Member

Joined: 2-Nov-2009
Posts: 270
From: Paris

Hello

I have run some 3D tests on my Sam440, and Guillaume Boesel (= zzd10h) did the same on his X1000.

It seems that the X1000 + PCI Radeon 9250 is slower than the Sam440.

Can someone confirm this, or explain the problem?

The test is:
1) Use Aminet/Microbe3D
copy Microbe3D.library-ppc to LIBS:
run demo-view-ppc
load partygirl.obj
let the FPS stabilize...
note the average FPS that is displayed as "( On 50 frames xx)"

then select menu Light/MAT LIGHT FAST
and note the average FPS again

2) Use Aminet/Cow3D
run Cow3D-Amiga-ppc
let the FPS stabilize...
note the average FPS that is displayed

then hit key 'b' (this buffers the display, so the rotation math is no longer done)
and note the average FPS that is displayed

My result:

Thellier - Sam440 - On board 3D (Radeon M9?)
======================================
Cow3D v5 ppc : 41 FPS
with key 'b' : 45 FPS
Microbe3D : PartyGirl.obj : 14 FPS
menu/light/MAT LIGHT FAST : 15 FPS

So I get around 200 000 triangles/second

Status: Offline

tlosm

Re: 3D benchmark and X1000+PCI Radeon9250
Posted on 23-Apr-2013 12:23:57

[ #2 ]

Elite Member

Joined: 28-Jul-2012
Posts: 2759
From: Amiga land

@thellier

Probably because bus writes and reads on the X1000 are about 50% slower than on the Sam.
You can check here:
http://amiga.ikirsector.it/forum/viewtopic.php?f=37&t=17550

_________________
I love Amiga and new hope by AmigaNG
A 500 + ; CDTV; CD32;
PowerMac G5 Quad 8GB,SSD,SSHD,7800gtx,Radeon R5 230 2GB;
MacBook Pro Retina I7 2.3ghz;
#nomorea-eoninmyhome

Status: Offline

Rob

Re: 3D benchmark and X1000+PCI Radeon9250
Posted on 23-Apr-2013 13:56:56

[ #3 ]

Elite Member

Joined: 20-Mar-2003
Posts: 6417
From: S.Wales

@thellier

Quote:
It seems that x1000+PCI Radeon9250 is slower than Sam440

Can someone confirm ?? or explain the problem ?

66Mhz slot in Sam440 vs 33Mhz slot in X1000.
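
As a rough sketch of why the slot clock matters: the theoretical peak of a parallel PCI bus is its clock multiplied by its width. The helper name below is hypothetical, and the 32-bit bus width is an assumption that is not stated in this thread. Real throughput is lower than this peak, especially without DMA.

```c
#include <assert.h>

/* Theoretical peak of a parallel PCI bus in MB/s (1 MB = 10^6 bytes):
 * one transfer of bus_bits / 8 bytes per clock cycle.
 * Hypothetical helper; the bus width is an assumption supplied by the caller. */
double pci_peak_mb_per_s(double clock_mhz, unsigned bus_bits)
{
    assert(bus_bits % 8 == 0);
    return clock_mhz * (double)(bus_bits / 8);
}
```

With these assumptions, pci_peak_mb_per_s(33.0, 32) gives 132 and pci_peak_mb_per_s(66.0, 32) gives 264, so the 33 MHz slot has half the headroom before any driver overhead is counted.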

Status: Offline

mbrantley

Re: 3D benchmark and X1000+PCI Radeon9250
Posted on 23-Apr-2013 14:48:54

[ #4 ]

Cult Member

Joined: 10-Jun-2010
Posts: 561
From: Mobile, Alabama, United States

Yes, this is true. I played around a bit with a 9250 card as a second graphics card in my X1000 and determined the Blender interface was a bit slower this way than it is with the 9250 in my old Sam440ep-flex board. Now with the new Wazp3D update improving some things with my Blender experience, I just go for that for now (software only screen rendering). I am very much looking forward to the new Warp3D and am anticipating/hoping the Blender interface gets a big boost with that.

Note I am talking about shoving complex objects around the screen and such, not rendering of scenes. That kind of rendering already is quite fast.

_________________

Status: Offline

tlosm

Re: 3D benchmark and X1000+PCI Radeon9250
Posted on 23-Apr-2013 16:46:11

[ #5 ]

Elite Member

Joined: 28-Jul-2012
Posts: 2759
From: Amiga land

@mbrantley

If you are using Wazp3D, that means you are using the power of your CPU :)
This release of Wazp3D is better optimized, and I think that is why it is faster.

Now do this calculation:
x = your speed with Wazp3D
y = your CPU
z = Warp3D (depending on your system configuration) and Gallium

x = y/4

z = y*16

Wazp3D is a great tool, but I hope that in future we will have Warp3D :)

_________________
I love Amiga and new hope by AmigaNG
A 500 + ; CDTV; CD32;
PowerMac G5 Quad 8GB,SSD,SSHD,7800gtx,Radeon R5 230 2GB;
MacBook Pro Retina I7 2.3ghz;
#nomorea-eoninmyhome

Status: Offline

Spectre660

Re: 3D benchmark and X1000+PCI Radeon9250
Posted on 23-Apr-2013 17:29:46

[ #6 ]

Elite Member

Joined: 4-Jun-2005
Posts: 3918
From: Unknown

@thellier

For comparative info

Cow3D-Amiga-ppc

22 FPS
with b key 23 FPS

Sam440ep-flex 800MHZ and RadeonHD driver + Warp3d + Radeon HD6670

_________________
Sam460ex : Radeon Rx550 Single slot Video Card : SIL3112 SATA card

Status: Offline

zzd10h

Re: 3D benchmark and X1000+PCI Radeon9250
Posted on 23-Apr-2013 17:35:44

[ #7 ]

Amiga Developer Team

Joined: 21-May-2012
Posts: 1077
From: France

@Spectre660

Not really better than my X1000!

I reach 21 FPS in Cow3D with the Radeon 9250 and native Warp3D.

Strange, I thought the X1000's PCI was the culprit...

_________________
http://apps.amistore.net/zTools

Status: Offline

Spectre660

Re: 3D benchmark and X1000+PCI Radeon9250
Posted on 23-Apr-2013 18:10:45

[ #8 ]

Elite Member

Joined: 4-Jun-2005
Posts: 3918
From: Unknown

@zzd10h

If you have Wazp3D installed on a computer, then it can't use the normal Warp3D library.
So you would be using the slower 9250 without any video card 3D support.

_________________
Sam460ex : Radeon Rx550 Single slot Video Card : SIL3112 SATA card

Status: Offline

Spectre660

Re: 3D benchmark and X1000+PCI Radeon9250
Posted on 23-Apr-2013 22:55:11

[ #9 ]

Elite Member

Joined: 4-Jun-2005
Posts: 3918
From: Unknown

@zzd10h

Are you using the original OS4.1 warp3d.library ?
To do the test properly for a radeon 9250 you need to revert to this library .

_________________
Sam460ex : Radeon Rx550 Single slot Video Card : SIL3112 SATA card

Status: Offline

Rob

Re: 3D benchmark and X1000+PCI Radeon9250
Posted on 24-Apr-2013 0:34:22

[ #10 ]

Elite Member

Joined: 20-Mar-2003
Posts: 6417
From: S.Wales

What's video like on X1000 with Radeon 92x0 compared to with an HD card.

Status: Offline

Jupp3

Re: 3D benchmark and X1000+PCI Radeon9250
Posted on 24-Apr-2013 4:24:33

[ #11 ]

Super Member

Joined: 22-Feb-2007
Posts: 1225
From: Unknown

@Rob

Quote:
What's video like on X1000 with Radeon 92x0 compared to with an HD card.

I assume overlay is supported on all Radeon 92x0 cards, so the video playback should be significantly faster, especially if there's any sort of scaling involved.

Status: Offline

zzd10h

Re: 3D benchmark and X1000+PCI Radeon9250
Posted on 24-Apr-2013 6:29:25

[ #12 ]

Amiga Developer Team

Joined: 21-May-2012
Posts: 1077
From: France

@Spectre660

Yes, of course, I use the official Warp3D. I have never installed Wazp3D.
I installed a Radeon 9250 principally to watch video with overlay.

_________________
http://apps.amistore.net/zTools

Status: Offline

Spectre660

Re: 3D benchmark and X1000+PCI Radeon9250
Posted on 24-Apr-2013 10:22:22

[ #13 ]

Elite Member

Joined: 4-Jun-2005
Posts: 3918
From: Unknown

@Rob

I can't give you any X1000 info, but I did some tests with a Sam460ex when I had a working 9250 card.

http://amigaworld.net/modules/newbb/viewtopic.php?mode=viewtopic&topic_id=36355&forum=33&start=20&viewmode=flat&order=0#680357

Quote:

Rob wrote:
What's video like on X1000 with Radeon 92x0 compared to with an HD card.

_________________
Sam460ex : Radeon Rx550 Single slot Video Card : SIL3112 SATA card

Status: Offline

Spectre660

Re: 3D benchmark and X1000+PCI Radeon9250
Posted on 24-Apr-2013 10:30:45

[ #14 ]

Elite Member

Joined: 4-Jun-2005
Posts: 3918
From: Unknown

@zzd10h

OK, things are clear now.
The speed issue may be the lack of Picasso96 DMA support for the X1000 PCI slots.
The same issue exists with the Sam440ep ITX's single 33 MHz PCI slot, and with the Sam440ep-flex's second and third PCI slots, which are also 33 MHz.

The post linked below refers to the GfxBench2D benchmark MemCopy score with and without DMA.
Performance is only about 25% for a 33 MHz slot without DMA compared to a 66 MHz slot with DMA.

http://amigaworld.net/modules/newbb/viewtopic.php?topic_id=36053&start=20&post_id=687804&order=0&viewmode=flat&pid=0&forum=14#687804

The Sam460ex pci slot performs better.

This is the Sam460ex

MemCopy Score: 187.69
Operation            MiB/s
Copy to VRAM        104.38
Write Pixel Array   164.92
Copy from VRAM       36.19
Read Pixel Array     38.48

This is the X1000

MemCopy Score: 25.68
Operation            MiB/s
Copy to VRAM         34.22
Write Pixel Array    18.62
Copy from VRAM        4.69
Read Pixel Array      2.65

This is the faster Sam440ep ITX M9

MemCopy Score: 103.03
Operation            MiB/s
Copy to VRAM         39.94
Write Pixel Array    81.26
Copy from VRAM       33.10
Read Pixel Array     31.78

And your Sam440ep-flex 66 MHz PCI slot

MemCopy Score: 122.62
Operation            MiB/s
Copy to VRAM         56.88
Write Pixel Array    96.86
Copy from VRAM       35.68
Read Pixel Array     35.78
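
To put the "about 25%" figure next to the scores quoted above, here is a small sketch that expresses one MemCopy score as a percentage of another. The function name is hypothetical; the numbers in the usage note are the scores from this post.

```c
#include <assert.h>

/* Returns score as a percentage of reference.
 * Hypothetical helper for comparing the MemCopy scores quoted above. */
double percent_of(double score, double reference)
{
    assert(reference > 0.0);
    return 100.0 * score / reference;
}
```

For the X1000 score of 25.68, percent_of(25.68, 103.03) is roughly 24.9 (versus the Sam440ep ITX M9), percent_of(25.68, 122.62) is roughly 20.9 (versus the Sam440ep-flex 66 MHz slot), and percent_of(25.68, 187.69) is roughly 13.7 (versus the Sam460ex).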

Quote:

zzd10h wrote:
@Spectre660

Yes, of course, I use the official Warp3D. I have never installed Wazp3D.
I installed a Radeon 9250 principally to watch video with overlay.

Last edited by Spectre660 on 24-Apr-2013 at 11:42 AM.
Last edited by Spectre660 on 24-Apr-2013 at 11:40 AM.
Last edited by Spectre660 on 24-Apr-2013 at 10:45 AM.
Last edited by Spectre660 on 24-Apr-2013 at 10:42 AM.
Last edited by Spectre660 on 24-Apr-2013 at 10:36 AM.
Last edited by Spectre660 on 24-Apr-2013 at 10:32 AM.

_________________
Sam460ex : Radeon Rx550 Single slot Video Card : SIL3112 SATA card

Status: Offline

thellier

Re: 3D benchmark and X1000+PCI Radeon9250
Posted on 24-Apr-2013 13:22:07

[ #15 ]

Regular Member

Joined: 2-Nov-2009
Posts: 270
From: Paris

@all

So if it is a bandwidth problem, then the coordinate (vertex) format we use in Warp3D matters.

So I made a test, and yes: always using a point format like this one, which allows bump-mapping and specular color,
/*==================================================================*/
typedef struct _Point3D
{
    float x,y,z;
    float color[4];
    float color2[4];
    float u,v,w;
    float u2,v2,w2;
    float nx,ny,nz;
} Point3D;
/*==================================================================*/
is slower than this one
/*==================================================================*/
typedef struct _Point3D
{
    float x,y,z;
    float color[4];
    float u,v,w;
    float nx,ny,nz;
} Point3D;
/*==================================================================*/

I mean it is slower even without enabling bump-mapping and specular color.

Even replacing
float color[4];
by
UBYTE RGBA[4];
should help a little.

Conclusion: keeping the coordinate format as small as possible in Warp3D matters when you draw 200 000 triangles/second.
With the larger format, that is 200 000 * 3 points * 80 bytes per point = 48 MB/s
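
The arithmetic above can be checked with a short sketch. The struct names Point3DFull, Point3DSmall and Point3DPacked and the helper vertex_mb_per_s are hypothetical names chosen for illustration; the third struct is only my reading of the UBYTE RGBA[4] suggestion. The sizes assume a 4-byte float and non-indexed triangles (3 vertices sent per triangle).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The larger format from the post: 20 floats. */
typedef struct {
    float x, y, z;
    float color[4];
    float color2[4];
    float u, v, w;
    float u2, v2, w2;
    float nx, ny, nz;
} Point3DFull;

/* The smaller format from the post: 13 floats. */
typedef struct {
    float x, y, z;
    float color[4];
    float u, v, w;
    float nx, ny, nz;
} Point3DSmall;

/* Assumed variant: the smaller format with the float color
 * replaced by four bytes, as suggested in the post. */
typedef struct {
    float x, y, z;
    uint8_t rgba[4];
    float u, v, w;
    float nx, ny, nz;
} Point3DPacked;

/* Vertex traffic in MB/s (1 MB = 10^6 bytes), 3 vertices per triangle. */
double vertex_mb_per_s(double triangles_per_s, size_t bytes_per_vertex)
{
    return triangles_per_s * 3.0 * (double)bytes_per_vertex / 1e6;
}
```

Under those assumptions the three formats are 80, 52 and 40 bytes per vertex, which at 200 000 triangles/second works out to 48, 31.2 and 24 MB/s respectively.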

Alain Thellier

Status: Offline

Seiya

Re: 3D benchmark and X1000+PCI Radeon9250
Posted on 26-Apr-2013 19:31:58

[ #16 ]

Super Member

Joined: 19-Aug-2006
Posts: 1479
From: Italia

@thellier

These are only synthetic benchmarks. You have to use a real benchmark, or try the Sam440ep and X1000 with real software.

_________________

Status: Offline

Karlos

Re: 3D benchmark and X1000+PCI Radeon9250
Posted on 27-Apr-2013 15:06:54

[ #17 ]

Elite Member

Joined: 24-Aug-2003
Posts: 4958
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition!

@thellier
Quote:
Even replacing
float color[4];
by
UBYTE RGBA[4];
should help a few

Internally, the drivers already do this in most cases. For example, here's an inline function I wrote to handle the normalized float to uint8 conversion:

/* Fast conversion of 0.0-1.0 ... 0-255 */
static __inline uint32 normalizedF2U8(float32 x)
{
union { float32 f; uint32 i; } u;
u.f = 32768.0f + x * (255.0f / 256.0f);
return (uint8)u.i;
}

Error is +/-1 in the least significant digit, but is always in the range 0-255 for properly normalized input, which I consider acceptable for the use case.
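
For readers who want to see why this works and check the error bound, here is a minimal standalone sketch. It is not the driver code: the names normalized_f2u8 and max_abs_error are hypothetical, standard stdint types replace the float32/uint32 typedefs, and memcpy replaces the union. The idea is that 32768.0f is 2^15, where the spacing between adjacent floats is 2^-8 = 1/256, so after the addition the low 8 mantissa bits hold x * 255 rounded to an integer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Maps x in [0.0, 1.0] to 0..255 without a float-to-int conversion.
 * Inputs outside [0.0, 1.0] are not handled and give meaningless results. */
uint32_t normalized_f2u8(float x)
{
    float biased = 32768.0f + x * (255.0f / 256.0f);
    uint32_t bits;
    memcpy(&bits, &biased, sizeof bits);   /* reinterpret the float's bits */
    return (uint8_t)bits;                  /* keep the low 8 mantissa bits */
}

/* Largest absolute difference between the trick and the conventional
 * (int)(x * 255 + 0.5) over steps + 1 evenly spaced inputs in [0, 1]. */
int max_abs_error(int steps)
{
    int worst = 0;
    assert(steps > 0);
    for (int i = 0; i <= steps; ++i) {
        float x = (float)i / (float)steps;
        int fast = (int)normalized_f2u8(x);
        int ref = (int)(x * 255.0f + 0.5f);
        int err = fast > ref ? fast - ref : ref - fast;
        if (err > worst)
            worst = err;
    }
    return worst;
}
```

Calling max_abs_error with any positive step count should never return more than 1, which matches the +/-1 bound stated above.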

It produces significantly better code than the naive multiply by 255 version, even given the fact the union causes register stack spillage. For example, stacking up 4 conversions using a naive approach :

/* static inline */uint32 convertF2U(float32* clr) {
    extern float32 FAC_255; /* 255.0f */
    register float32 scale = FAC_255;
    return ( (uint32) (clr[0]*scale) )<<24 | /* A */
           ( (uint32) (clr[1]*scale) )<<16 | /* R */
           ( (uint32) (clr[2]*scale) )<<8  | /* G */
           ( (uint32) (clr[3]*scale) );      /* B */
}

when compiled at -O2:

convertF2U:
stwu %r1,-80(%r1)
lis %r8,.LC4@ha
lis %r10,.LC6@ha
lfs %f0,.LC4@l(%r8)
lfs %f12,.LC6@l(%r10)
lfs %f13,12(%r3)
fmul %f0,%f13,%f0
fcmpu %cr7,%f0,%f12
cror 30,29,30
beq- %cr7,.L4
fctiwz %f0,%f0
lfs %f13,0(%r3)
lfs %f12,.LC6@l(%r10)
stfd %f0,8(%r1)
lfs %f0,.LC4@l(%r8)
lwz %r11,12(%r1)
fmul %f0,%f13,%f0
fcmpu %cr7,%f0,%f12
cror 30,29,30
beq- %cr7,.L6
.L14:
fctiwz %f0,%f0
lfs %f13,4(%r3)
lfs %f12,.LC6@l(%r10)
stfd %f0,24(%r1)
lfs %f0,.LC4@l(%r8)
lwz %r9,28(%r1)
fmul %f0,%f13,%f0
slwi %r0,%r9,24
or %r11,%r11,%r0
fcmpu %cr7,%f0,%f12
cror 30,29,30
beq- %cr7,.L8
.L15:
fctiwz %f0,%f0
lfs %f13,8(%r3)
lfs %f12,.LC6@l(%r10)
stfd %f0,40(%r1)
lfs %f0,.LC4@l(%r8)
lwz %r9,44(%r1)
fmul %f0,%f13,%f0
slwi %r0,%r9,16
or %r0,%r11,%r0
fcmpu %cr7,%f0,%f12
cror 30,29,30
beq- %cr7,.L10
.L16:
fctiwz %f0,%f0
stfd %f0,56(%r1)
lwz %r3,60(%r1)
addi %r1,%r1,80
slwi %r3,%r3,8
or %r3,%r0,%r3
blr
.L4:
fsub %f0,%f0,%f12
lfs %f12,.LC6@l(%r10)
fctiwz %f13,%f0
lfs %f0,.LC4@l(%r8)
stfd %f13,16(%r1)
lfs %f13,0(%r3)
lwz %r11,20(%r1)
fmul %f0,%f13,%f0
addis %r11,%r11,0x8000
fcmpu %cr7,%f0,%f12
cror 30,29,30
bne+ %cr7,.L14
.L6:
fsub %f0,%f0,%f12
lfs %f12,.LC6@l(%r10)
fctiwz %f13,%f0
lfs %f0,.LC4@l(%r8)
stfd %f13,32(%r1)
lfs %f13,4(%r3)
lwz %r9,36(%r1)
fmul %f0,%f13,%f0
addis %r9,%r9,0x8000
slwi %r0,%r9,24
fcmpu %cr7,%f0,%f12
or %r11,%r11,%r0
cror 30,29,30
bne+ %cr7,.L15
.L8:
fsub %f0,%f0,%f12
lfs %f12,.LC6@l(%r10)
fctiwz %f13,%f0
lfs %f0,.LC4@l(%r8)
stfd %f13,48(%r1)
lfs %f13,8(%r3)
lwz %r9,52(%r1)
fmul %f0,%f13,%f0
addis %r9,%r9,0x8000
slwi %r0,%r9,16
fcmpu %cr7,%f0,%f12
or %r0,%r11,%r0
cror 30,29,30
bne+ %cr7,.L16
.L10:
fsub %f0,%f0,%f12
fctiwz %f13,%f0
stfd %f13,64(%r1)
lwz %r3,68(%r1)
addi %r1,%r1,80
addis %r3,%r3,0x8000
slwi %r3,%r3,8
or %r3,%r0,%r3
blr

Replacing the scale multiply by calls to the inlined normalizedF2U8 gives the following output at the same optimization level:

convertF2U:
stwu %r1,-16(%r1)
lis %r9,.LC0@ha
lis %r11,.LC1@ha
lfs %f13,.LC0@l(%r9)
mr %r10,%r3
lfs %f0,.LC1@l(%r11)
lfs %f12,0(%r3)
lfs %f11,4(%r3)
lfs %f10,12(%r3)
fmadds %f12,%f12,%f13,%f0
fmadds %f11,%f11,%f13,%f0
fmadds %f10,%f10,%f13,%f0
stfs %f12,8(%r1)
lwz %r3,8(%r1)
stfs %f11,8(%r1)
slwi %r3,%r3,24
lfs %f12,8(%r10)
lwz %r0,8(%r1)
stfs %f10,8(%r1)
fmadds %f12,%f12,%f13,%f0
rlwinm %r0,%r0,16,8,15
or %r3,%r3,%r0
lwz %r9,8(%r1)
stfs %f12,8(%r1)
rlwinm %r9,%r9,0,0xff
or %r3,%r3,%r9
lwz %r11,8(%r1)
addi %r1,%r1,16
rlwinm %r11,%r11,8,16,23
or %r3,%r3,%r11
blr
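
The post does not show the source of this second version. A minimal sketch of what it might look like follows, assuming the same byte order as the naive version (clr[0] in the top byte). The names normalized_f2u8 and convert_f2u_fast are hypothetical, and standard stdint types stand in for the float32/uint32 typedefs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Normalized float (0.0..1.0) to 0..255 via the 2^15 bias trick. */
static inline uint32_t normalized_f2u8(float x)
{
    float biased = 32768.0f + x * (255.0f / 256.0f);
    uint32_t bits;
    memcpy(&bits, &biased, sizeof bits);
    return (uint8_t)bits;
}

/* Packs four normalized floats into one 32-bit word,
 * clr[0] in bits 31..24 down to clr[3] in bits 7..0. */
uint32_t convert_f2u_fast(const float *clr)
{
    assert(clr != NULL);
    return normalized_f2u8(clr[0]) << 24 |
           normalized_f2u8(clr[1]) << 16 |
           normalized_f2u8(clr[2]) << 8  |
           normalized_f2u8(clr[3]);
}
```

Because each conversion is a multiply-add followed by a store and an integer load, there are no range checks or branches for the compiler to emit, which is consistent with the much shorter listing above.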

_________________
Doing stupid things for fun...

Status: Offline

Amigaworld.net was originally founded by David Doyle