thellier 
3D benchmark and X1000+PCI Radeon9250
Posted on 23-Apr-2013 12:55:00
[ #1 ]
Regular Member
Joined: 2-Nov-2009
Posts: 250
From: Paris

Hello

I ran some 3D tests on my Sam440, and Guillaume Boesel (zzd10h) did the same on his X1000.

It seems that x1000+PCI Radeon9250 is slower than Sam440

Can someone confirm ?? or explain the problem ?

The test is:
1) Use Aminet/Microbe3D
copy Microbe3D.library-ppc to LIBS:
run demo-view-ppc
load partygirl.obj
let the FPS stabilize
note the average FPS that is displayed as "( On 50 frames xx)"

then select menu Light/MAT LIGHT FAST
note the average FPS again

2) Use Aminet/Cow3D
run Cow3D-Amiga-ppc
let the FPS stabilize
note the average FPS that is displayed

then hit key 'b' (this buffers the display, so no more rotation math is done)
note the average FPS that is displayed

My result:


Thellier - Sam440 - On board 3D (Radeon M9?)
======================================
Cow3D v5 ppc : 41 FPS
with key 'b' : 45 FPS
Microbe3D : PartyGirl.obj : 14 FPS
menu/light/MAT LIGHT FAST : 15 FPS

So I get around 200,000 triangles/second.


tlosm 
Re: 3D benchmark and X1000+PCI Radeon9250
Posted on 23-Apr-2013 13:23:57
[ #2 ]
Elite Member
Joined: 28-Jul-2012
Posts: 2612
From: Amiga land

@thellier


Probably because bus writes and reads on the X1000 are about 50% slower than on the Sam.
You can check here:
http://amiga.ikirsector.it/forum/viewtopic.php?f=37&t=17550

_________________
I love Amiga and new hope by AmigaNG
A 500 + ; CDTV; CD32;
PowerMac G5 Quad 8GB,SSD,SSHD,7800gtx,Radeon R5 230 2GB;
MacBook Pro Retina I7 2.3ghz;
#nomorea-eoninmyhome

Rob 
Re: 3D benchmark and X1000+PCI Radeon9250
Posted on 23-Apr-2013 14:56:56
[ #3 ]
Elite Member
Joined: 20-Mar-2003
Posts: 5778
From: S.Wales

@thellier

Quote:
It seems that x1000+PCI Radeon9250 is slower than Sam440

Can someone confirm ?? or explain the problem ?


66 MHz slot in the Sam440 vs. 33 MHz slot in the X1000.
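
To put rough numbers on the slot clocks, here is a minimal sketch of the theoretical peak of a parallel PCI bus. It assumes 32-bit slots (the thread does not state the bus width) and the nominal 33.33/66.66 MHz PCI clocks; the function name `pci_peak_bytes_per_sec` is made up for illustration. Real transfers lose a good part of this to bus overhead.

```c
#include <assert.h>
#include <stdint.h>

/* Theoretical peak of a parallel PCI bus: one bus-width transfer per clock.
   Hypothetical helper for illustration; real throughput is lower. */
static uint64_t pci_peak_bytes_per_sec(uint64_t clock_hz, unsigned bus_width_bits)
{
    assert(bus_width_bits % 8 == 0);          /* whole bytes per transfer */
    return clock_hz * (bus_width_bits / 8);
}
```

With these assumptions, `pci_peak_bytes_per_sec(33333333, 32)` is about 133 MB/s and `pci_peak_bytes_per_sec(66666666, 32)` is about 267 MB/s, so the 66 MHz slot has twice the headroom before any driver or DMA effects come into play.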

mbrantley 
Re: 3D benchmark and X1000+PCI Radeon9250
Posted on 23-Apr-2013 15:48:54
[ #4 ]
Cult Member
Joined: 10-Jun-2010
Posts: 524
From: Mobile, Alabama, United States

Yes, this is true. I played around a bit with a 9250 card as a second graphics card in my X1000 and determined the Blender interface was a bit slower this way than it is with the 9250 in my old Sam440ep-flex board. Now with the new Wazp3D update improving some things with my Blender experience, I just go for that for now (software only screen rendering). I am very much looking forward to the new Warp3D and am anticipating/hoping the Blender interface gets a big boost with that.

Note I am talking about shoving complex objects around the screen and such, not rendering of scenes. That kind of rendering already is quite fast.

tlosm 
Re: 3D benchmark and X1000+PCI Radeon9250
Posted on 23-Apr-2013 17:46:11
[ #5 ]
Elite Member
Joined: 28-Jul-2012
Posts: 2612
From: Amiga land

@mbrantley

If you are using Wazp3D, it means you are using the power of your CPU :)
I think this release of Wazp3D is faster because it is better optimized.

Now do this calculation:
x = your speed with Wazp3D
y = your CPU
z = Warp3D (depends on your system config) and Gallium

x = y/4

z = y*16

Wazp3D is a great tool, but I hope that in the future we will have Warp3D :)



_________________
I love Amiga and new hope by AmigaNG
A 500 + ; CDTV; CD32;
PowerMac G5 Quad 8GB,SSD,SSHD,7800gtx,Radeon R5 230 2GB;
MacBook Pro Retina I7 2.3ghz;
#nomorea-eoninmyhome

Spectre660 
Re: 3D benchmark and X1000+PCI Radeon9250
Posted on 23-Apr-2013 18:29:46
[ #6 ]
Elite Member
Joined: 5-Jun-2005
Posts: 3738
From: Unknown

@thellier

For comparative info

Cow3D-Amiga-ppc

22 FPS
with b key 23 FPS

Sam440ep-flex 800MHZ and RadeonHD driver + Warp3d + Radeon HD6670

zzd10h 
Re: 3D benchmark and X1000+PCI Radeon9250
Posted on 23-Apr-2013 18:35:44
[ #7 ]
Amiga Developer Team
Joined: 21-May-2012
Posts: 1069
From: France

@Spectre660

Not really better than my X1000!

I reach 21 FPS with the Radeon 9250 and native Warp3D in Cow3D.

Strange, I thought the X1000's PCI was the culprit...

_________________
http://apps.amistore.net/zTools

Spectre660 
Re: 3D benchmark and X1000+PCI Radeon9250
Posted on 23-Apr-2013 19:10:45
[ #8 ]
Elite Member
Joined: 5-Jun-2005
Posts: 3738
From: Unknown

@zzd10h

If you have Wazp3D installed on a computer, then it can't use the normal Warp3D library.
So you would be using the slower 9250 without any video card 3D support.

Spectre660 
Re: 3D benchmark and X1000+PCI Radeon9250
Posted on 23-Apr-2013 23:55:11
[ #9 ]
Elite Member
Joined: 5-Jun-2005
Posts: 3738
From: Unknown

@zzd10h

Are you using the original OS4.1 warp3d.library ?
To do the test properly for a radeon 9250 you need to revert to this library .

Rob 
Re: 3D benchmark and X1000+PCI Radeon9250
Posted on 24-Apr-2013 1:34:22
[ #10 ]
Elite Member
Joined: 20-Mar-2003
Posts: 5778
From: S.Wales

What's video like on X1000 with Radeon 92x0 compared to with an HD card.

Jupp3 
Re: 3D benchmark and X1000+PCI Radeon9250
Posted on 24-Apr-2013 5:24:33
[ #11 ]
Super Member
Joined: 22-Feb-2007
Posts: 1220
From: Unknown

@Rob

Quote:
What's video like on X1000 with Radeon 92x0 compared to with an HD card.

I assume overlay is supported on all Radeon 92x0 cards, so the video playback should be significantly faster, especially if there's any sort of scaling involved.

zzd10h 
Re: 3D benchmark and X1000+PCI Radeon9250
Posted on 24-Apr-2013 7:29:25
[ #12 ]
Amiga Developer Team
Joined: 21-May-2012
Posts: 1069
From: France

@Spectre660

Yes, of course, I use official Warp3D. I have never installed wazp3D.
I have installed a Radeon9250 principaly to watch video with overlay.

_________________
http://apps.amistore.net/zTools

Spectre660 
Re: 3D benchmark and X1000+PCI Radeon9250
Posted on 24-Apr-2013 11:22:22
[ #13 ]
Elite Member
Joined: 5-Jun-2005
Posts: 3738
From: Unknown

@Rob

I can't give you any X1000 info, but I did some tests with a Sam460ex when I had a working 9250 card.

http://amigaworld.net/modules/newbb/viewtopic.php?mode=viewtopic&topic_id=36355&forum=33&start=20&viewmode=flat&order=0#680357

Quote:

Rob wrote:
What's video like on X1000 with Radeon 92x0 compared to with an HD card.

Spectre660 
Re: 3D benchmark and X1000+PCI Radeon9250
Posted on 24-Apr-2013 11:30:45
[ #14 ]
Elite Member
Joined: 5-Jun-2005
Posts: 3738
From: Unknown

@zzd10h

OK, things are clear now.
The speed issue may be the lack of Picasso96 DMA support for the X1000 PCI slots.
The same issue exists with the Sam440ep ITX's single 33 MHz PCI slot, and with the Sam440ep-flex's second and third PCI slots, which are also 33 MHz.

This post refers to the Gfxbench2D benchmark MemCopy score with and without DMA.
A 33 MHz slot without DMA only reaches about 25% of the performance of a 66 MHz slot with DMA.

http://amigaworld.net/modules/newbb/viewtopic.php?topic_id=36053&start=20&post_id=687804&order=0&viewmode=flat&pid=0&forum=14#687804

The Sam460ex pci slot performs better.

This is the Sam460ex

MemCopy Score: 187.69
Operation            MiB/s
Copy to VRAM        104.38
Write Pixel Array   164.92
Copy from VRAM       36.19
Read Pixel Array     38.48

This is the X1000

MemCopy Score: 25.68
Operation            MiB/s
Copy to VRAM         34.22
Write Pixel Array    18.62
Copy from VRAM        4.69
Read Pixel Array      2.65

This is the faster Sam440ep ITX M9

MemCopy Score: 103.03
Operation            MiB/s
Copy to VRAM         39.94
Write Pixel Array    81.26
Copy from VRAM       33.10
Read Pixel Array     31.78

And your Sam440ep-flex 66 MHz PCI slot

MemCopy Score: 122.62
Operation            MiB/s
Copy to VRAM         56.88
Write Pixel Array    96.86
Copy from VRAM       35.68
Read Pixel Array     35.78
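
To make the gap between these scores easier to read, here is a small sketch that expresses one MemCopy score as a percentage of another. The helper name `score_percent` is made up for illustration; the numbers fed to it are the ones posted above.

```c
#include <assert.h>

/* Express one benchmark score as a percentage of a reference score.
   Hypothetical helper for illustration only. */
static double score_percent(double score, double reference)
{
    assert(reference > 0.0);
    return 100.0 * score / reference;
}
```

With the scores above, `score_percent(25.68, 122.62)` puts the X1000 at roughly 21% of the Sam440ep-flex 66 MHz slot, and `score_percent(25.68, 187.69)` at roughly 14% of the Sam460ex.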


Quote:

zzd10h wrote:
@Spectre660

Yes, of course, I use official Warp3D. I have never installed wazp3D.
I have installed a Radeon9250 principaly to watch video with overlay.

Last edited by Spectre660 on 24-Apr-2013 at 12:42 PM.
Last edited by Spectre660 on 24-Apr-2013 at 12:40 PM.
Last edited by Spectre660 on 24-Apr-2013 at 11:45 AM.
Last edited by Spectre660 on 24-Apr-2013 at 11:42 AM.
Last edited by Spectre660 on 24-Apr-2013 at 11:36 AM.
Last edited by Spectre660 on 24-Apr-2013 at 11:32 AM.

thellier 
Re: 3D benchmark and X1000+PCI Radeon9250
Posted on 24-Apr-2013 14:22:07
[ #15 ]
Regular Member
Joined: 2-Nov-2009
Posts: 250
From: Paris

@all

So if it is a bandwidth problem, then the point format we use in Warp3D matters.

So I made a test, and yes: always using a point format like this one, which allows bump-mapping and specular color,
/*==================================================================*/
typedef struct _Point3D
{
    float x,y,z;
    float color[4];
    float color2[4];
    float u,v,w;
    float u2,v2,w2;
    float nx,ny,nz;
} Point3D;
/*==================================================================*/
is slower than this one
/*==================================================================*/
typedef struct _Point3D
{
    float x,y,z;
    float color[4];
    float u,v,w;
    float nx,ny,nz;
} Point3D;
/*==================================================================*/

I mean it is slower even without enabling bump-mapping and specular color.

Even replacing
float color[4];
by
UBYTE RGBA[4];
should help a few

Conclusion: keeping the point format as small as possible in Warp3D matters when you draw 200,000 triangles/second:
200,000 triangles/s * 3 points * 80 bytes per point = 48 MB/s
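
The arithmetic above can be checked for the three layouts discussed in this post with a minimal sketch like the following. It assumes 4-byte floats with no struct padding, and three unshared points per triangle as in the calculation above; the type names `PointFull`, `PointSmall`, `PointPacked` and the helper `vertex_mb_per_sec` are made up for illustration and are not part of Warp3D.

```c
#include <stddef.h>

/* The large format: two colors and two texture coordinate sets. */
typedef struct {
    float x, y, z;
    float color[4];
    float color2[4];
    float u, v, w;
    float u2, v2, w2;
    float nx, ny, nz;
} PointFull;

/* The smaller format: one color, one texture coordinate set. */
typedef struct {
    float x, y, z;
    float color[4];
    float u, v, w;
    float nx, ny, nz;
} PointSmall;

/* Same as PointSmall, but with the color packed into 4 bytes. */
typedef struct {
    float x, y, z;
    unsigned char rgba[4];
    float u, v, w;
    float nx, ny, nz;
} PointPacked;

/* Bytes of point data per second, in MB/s (1 MB = 1,000,000 bytes),
   assuming 3 separate points are sent for every triangle. */
static double vertex_mb_per_sec(double triangles_per_sec, size_t bytes_per_point)
{
    return triangles_per_sec * 3.0 * (double)bytes_per_point / 1.0e6;
}
```

Under those assumptions the three layouts are 80, 52 and 40 bytes per point, which at 200,000 triangles/second gives 48, 31.2 and 24 MB/s respectively.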

Alain Thellier





Seiya 
Re: 3D benchmark and X1000+PCI Radeon9250
Posted on 26-Apr-2013 20:31:58
[ #16 ]
Super Member
Joined: 19-Aug-2006
Posts: 1415
From: Italia

@thellier

These are only synthetic benchmarks. You have to use a real benchmark, or try the Sam440ep and X1000 with real software.

Karlos 
Re: 3D benchmark and X1000+PCI Radeon9250
Posted on 27-Apr-2013 16:06:54
[ #17 ]
Elite Member
Joined: 24-Aug-2003
Posts: 2019
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition!

@thellier
Quote:
Even replacing
float color[4];
by
UBYTE RGBA[4];
should help a few


Internally, the drivers already do this in most cases. For example, here's an inline function I wrote to handle the normalized float to uint8 conversion:

/* Fast conversion of 0.0-1.0 ... 0-255 */
static __inline uint32 normalizedF2U8(float32 x)
{
union { float32 f; uint32 i; } u;
u.f = 32768.0f + x * (255.0f / 256.0f);
return (uint8)u.i;
}

Error is +/-1 in the least significant digit, but is always in the range 0-255 for properly normalized input, which I consider acceptable for the use case.
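
For anyone who wants to try the trick outside the driver, here is a standalone sketch of the same idea. It is not the driver code: it uses standard `stdint.h` types and `memcpy` instead of the union, and it assumes IEEE-754 single precision with the default round-to-nearest mode. The names `normalized_f2u8` and `naive_f2u8` are made up for illustration. Near 32768.0f adjacent floats are 1/256 apart, so after the add the low 8 bits of the float's bit pattern hold x*255 rounded to the nearest integer.

```c
#include <stdint.h>
#include <string.h>

/* Sketch of the bias trick: add 2^15 so the float's spacing becomes 1/256,
   then read the low 8 bits of its bit pattern.
   Assumes IEEE-754 floats, round-to-nearest, and x in the range 0.0 to 1.0. */
static uint32_t normalized_f2u8(float x)
{
    float biased = 32768.0f + x * (255.0f / 256.0f);
    uint32_t bits;
    memcpy(&bits, &biased, sizeof bits);   /* reinterpret the float's bits */
    return bits & 0xFFu;                   /* low byte is round(x * 255) */
}

/* Reference version: multiply by 255 and truncate toward zero. */
static uint32_t naive_f2u8(float x)
{
    return (uint32_t)(x * 255.0f);
}
```

The two versions differ only in rounding: the bias trick rounds to nearest while the cast truncates, which is where the difference of at most 1 in the least significant digit comes from.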

It produces significantly better code than the naive multiply by 255 version, even given the fact the union causes register stack spillage. For example, stacking up 4 conversions using a naive approach :

/* static inline */ uint32 convertF2U(float32* clr) {
    extern float32 FAC_255; /* scale factor, assumed to hold 255.0f (a 1/255 factor would truncate every channel to 0) */
    register float32 scale = FAC_255;
    return ( (uint32) (clr[0]*scale) )<<24 | /* A */
           ( (uint32) (clr[1]*scale) )<<16 | /* R */
           ( (uint32) (clr[2]*scale) )<<8  | /* G */
           ( (uint32) (clr[3]*scale) );      /* B */
}

when compiled at -O2:

convertF2U:
stwu %r1,-80(%r1)
lis %r8,.LC4@ha
lis %r10,.LC6@ha
lfs %f0,.LC4@l(%r8)
lfs %f12,.LC6@l(%r10)
lfs %f13,12(%r3)
fmul %f0,%f13,%f0
fcmpu %cr7,%f0,%f12
cror 30,29,30
beq- %cr7,.L4
fctiwz %f0,%f0
lfs %f13,0(%r3)
lfs %f12,.LC6@l(%r10)
stfd %f0,8(%r1)
lfs %f0,.LC4@l(%r8)
lwz %r11,12(%r1)
fmul %f0,%f13,%f0
fcmpu %cr7,%f0,%f12
cror 30,29,30
beq- %cr7,.L6
.L14:
fctiwz %f0,%f0
lfs %f13,4(%r3)
lfs %f12,.LC6@l(%r10)
stfd %f0,24(%r1)
lfs %f0,.LC4@l(%r8)
lwz %r9,28(%r1)
fmul %f0,%f13,%f0
slwi %r0,%r9,24
or %r11,%r11,%r0
fcmpu %cr7,%f0,%f12
cror 30,29,30
beq- %cr7,.L8
.L15:
fctiwz %f0,%f0
lfs %f13,8(%r3)
lfs %f12,.LC6@l(%r10)
stfd %f0,40(%r1)
lfs %f0,.LC4@l(%r8)
lwz %r9,44(%r1)
fmul %f0,%f13,%f0
slwi %r0,%r9,16
or %r0,%r11,%r0
fcmpu %cr7,%f0,%f12
cror 30,29,30
beq- %cr7,.L10
.L16:
fctiwz %f0,%f0
stfd %f0,56(%r1)
lwz %r3,60(%r1)
addi %r1,%r1,80
slwi %r3,%r3,8
or %r3,%r0,%r3
blr
.L4:
fsub %f0,%f0,%f12
lfs %f12,.LC6@l(%r10)
fctiwz %f13,%f0
lfs %f0,.LC4@l(%r8)
stfd %f13,16(%r1)
lfs %f13,0(%r3)
lwz %r11,20(%r1)
fmul %f0,%f13,%f0
addis %r11,%r11,0x8000
fcmpu %cr7,%f0,%f12
cror 30,29,30
bne+ %cr7,.L14
.L6:
fsub %f0,%f0,%f12
lfs %f12,.LC6@l(%r10)
fctiwz %f13,%f0
lfs %f0,.LC4@l(%r8)
stfd %f13,32(%r1)
lfs %f13,4(%r3)
lwz %r9,36(%r1)
fmul %f0,%f13,%f0
addis %r9,%r9,0x8000
slwi %r0,%r9,24
fcmpu %cr7,%f0,%f12
or %r11,%r11,%r0
cror 30,29,30
bne+ %cr7,.L15
.L8:
fsub %f0,%f0,%f12
lfs %f12,.LC6@l(%r10)
fctiwz %f13,%f0
lfs %f0,.LC4@l(%r8)
stfd %f13,48(%r1)
lfs %f13,8(%r3)
lwz %r9,52(%r1)
fmul %f0,%f13,%f0
addis %r9,%r9,0x8000
slwi %r0,%r9,16
fcmpu %cr7,%f0,%f12
or %r0,%r11,%r0
cror 30,29,30
bne+ %cr7,.L16
.L10:
fsub %f0,%f0,%f12
fctiwz %f13,%f0
stfd %f13,64(%r1)
lwz %r3,68(%r1)
addi %r1,%r1,80
addis %r3,%r3,0x8000
slwi %r3,%r3,8
or %r3,%r0,%r3
blr

Replacing the scale multiply by calls to the inlined normalizedF2U8 gives the following output at the same optimization level:

convertF2U:
stwu %r1,-16(%r1)
lis %r9,.LC0@ha
lis %r11,.LC1@ha
lfs %f13,.LC0@l(%r9)
mr %r10,%r3
lfs %f0,.LC1@l(%r11)
lfs %f12,0(%r3)
lfs %f11,4(%r3)
lfs %f10,12(%r3)
fmadds %f12,%f12,%f13,%f0
fmadds %f11,%f11,%f13,%f0
fmadds %f10,%f10,%f13,%f0
stfs %f12,8(%r1)
lwz %r3,8(%r1)
stfs %f11,8(%r1)
slwi %r3,%r3,24
lfs %f12,8(%r10)
lwz %r0,8(%r1)
stfs %f10,8(%r1)
fmadds %f12,%f12,%f13,%f0
rlwinm %r0,%r0,16,8,15
or %r3,%r3,%r0
lwz %r9,8(%r1)
stfs %f12,8(%r1)
rlwinm %r9,%r9,0,0xff
or %r3,%r3,%r9
lwz %r11,8(%r1)
addi %r1,%r1,16
rlwinm %r11,%r11,8,16,23
or %r3,%r3,%r11
blr

_________________
IBrowse/AWeb user? amiga.org - classic browser edition

Copyright (C) 2000 - 2019 Amigaworld.net.
Amigaworld.net was originally founded by David Doyle