Poster | Thread | thellier
| |
3D benchmark and X1000+PCI Radeon9250 Posted on 23-Apr-2013 11:55:00
| | [ #1 ] |
| |
|
Regular Member |
Joined: 2-Nov-2009 Posts: 263
From: Paris | | |
|
| Hello
I have run some 3D tests on my Sam440, and Guillaume Boesel (= zzd10h) did the same with his X1000.
It seems that the X1000 + PCI Radeon 9250 is slower than the Sam440.
Can someone confirm this, or explain the problem?
The test is:
1) Use Aminet/Microbe3D
   - copy Microbe3D.library-ppc TO LIBS:
   - run demo-view-ppc
   - load partygirl.obj
   - let the FPS stabilize, then note the average FPS that is displayed: "( On 50 frames xx)"
   - then select menu Light/MAT LIGHT FAST and note the average FPS again
2) Use Aminet/Cow3D
   - run Cow3D-Amiga-ppc
   - let the FPS stabilize, then note the average FPS that is displayed
   - then hit key 'b' (this buffers the display, so no more rotation math is done) and note the average FPS that is displayed
My result:
Thellier - Sam440 - On board 3D (Radeon M9?)
======================================
Cow3D v5 ppc : 41 FPS
with key 'b' : 45 FPS
Microbe3D : PartyGirl.obj : 14 FPS
menu/light/MAT LIGHT FAST : 15 FPS
So I get around 200 000 triangles/second
|
| Status: Offline |
| | tlosm
| |
Re: 3D benchmark and X1000+PCI Radeon9250 Posted on 23-Apr-2013 12:23:57
| | [ #2 ] |
| |
|
Elite Member |
Joined: 28-Jul-2012 Posts: 2755
From: Amiga land | | |
|
| @thellier
Probably because the bus write and reload speed of the X1000 is about 50% slower than that of the Sam. You can check here: http://amiga.ikirsector.it/forum/viewtopic.php?f=37&t=17550 _________________ I love Amiga and new hope by AmigaNG A 500 + ; CDTV; CD32; PowerMac G5 Quad 8GB,SSD,SSHD,7800gtx,Radeon R5 230 2GB; MacBook Pro Retina I7 2.3ghz; #nomorea-eoninmyhome |
| Status: Offline |
| | Rob
| |
Re: 3D benchmark and X1000+PCI Radeon9250 Posted on 23-Apr-2013 13:56:56
| | [ #3 ] |
| |
|
Elite Member |
Joined: 20-Mar-2003 Posts: 6385
From: S.Wales | | |
|
| @thellier
Quote:
It seems that x1000+PCI Radeon9250 is slower than Sam440
Can someone confirm ?? or explain the problem ? |
66 MHz slot in the Sam440 vs. 33 MHz slot in the X1000. |
| Status: Offline |
| | mbrantley
| |
Re: 3D benchmark and X1000+PCI Radeon9250 Posted on 23-Apr-2013 14:48:54
| | [ #4 ] |
| |
|
Cult Member |
Joined: 10-Jun-2010 Posts: 560
From: Mobile, Alabama, United States | | |
|
| Yes, this is true. I played around a bit with a 9250 card as a second graphics card in my X1000 and determined the Blender interface was a bit slower this way than it is with the 9250 in my old Sam440ep-flex board. Now with the new Wazp3D update improving some things with my Blender experience, I just go for that for now (software only screen rendering). I am very much looking forward to the new Warp3D and am anticipating/hoping the Blender interface gets a big boost with that.
Note I am talking about shoving complex objects around the screen and such, not rendering of scenes. That kind of rendering already is quite fast. _________________
|
| Status: Offline |
| | tlosm
| |
Re: 3D benchmark and X1000+PCI Radeon9250 Posted on 23-Apr-2013 16:46:11
| | [ #5 ] |
| |
|
Elite Member |
Joined: 28-Jul-2012 Posts: 2755
From: Amiga land | | |
|
| @mbrantley
If you are using Wazp3D, that means you are using the power of your CPU :) I think this release of Wazp3D is faster because it is better optimized.
Now make this calculation: x = your speed with Wazp3D, y = your CPU, z = Warp3D (depending on your system configuration) and Gallium
x=y/4
z=y*16
Wazp3D is a great tool, but I hope that in the future we will have Warp3D :)
_________________ I love Amiga and new hope by AmigaNG A 500 + ; CDTV; CD32; PowerMac G5 Quad 8GB,SSD,SSHD,7800gtx,Radeon R5 230 2GB; MacBook Pro Retina I7 2.3ghz; #nomorea-eoninmyhome |
| Status: Offline |
| | Spectre660
| |
Re: 3D benchmark and X1000+PCI Radeon9250 Posted on 23-Apr-2013 17:29:46
| | [ #6 ] |
| |
|
Elite Member |
Joined: 4-Jun-2005 Posts: 3918
From: Unknown | | |
|
| @thellier
For comparative info
Cow3D-Amiga-ppc : 22 FPS
with key 'b' : 23 FPS
Sam440ep-flex 800MHZ and RadeonHD driver + Warp3d + Radeon HD6670 _________________ Sam460ex : Radeon Rx550 Single slot Video Card : SIL3112 SATA card |
| Status: Offline |
| | zzd10h
| |
Re: 3D benchmark and X1000+PCI Radeon9250 Posted on 23-Apr-2013 17:35:44
| | [ #7 ] |
| |
|
Amiga Developer Team |
Joined: 21-May-2012 Posts: 1077
From: France | | |
|
| @Spectre660
Not really better than my X1000!
I reach 21 FPS in Cow3D with the Radeon 9250 and native Warp3D.
Strange, I thought that the X1000 PCI slot was the culprit... _________________ http://apps.amistore.net/zTools |
| Status: Offline |
| | Spectre660
| |
Re: 3D benchmark and X1000+PCI Radeon9250 Posted on 23-Apr-2013 18:10:45
| | [ #8 ] |
| |
|
Elite Member |
Joined: 4-Jun-2005 Posts: 3918
From: Unknown | | |
|
| @zzd10h
If you have Wazp3D installed on a computer, then it can't use the normal Warp3D library. So you would be using the slower 9250 without any video card 3D support.
_________________ Sam460ex : Radeon Rx550 Single slot Video Card : SIL3112 SATA card |
| Status: Offline |
| | Spectre660
| |
Re: 3D benchmark and X1000+PCI Radeon9250 Posted on 23-Apr-2013 22:55:11
| | [ #9 ] |
| |
|
Elite Member |
Joined: 4-Jun-2005 Posts: 3918
From: Unknown | | |
|
| @zzd10h
Are you using the original OS4.1 warp3d.library? To do the test properly for a Radeon 9250, you need to revert to this library.
_________________ Sam460ex : Radeon Rx550 Single slot Video Card : SIL3112 SATA card |
| Status: Offline |
| | Rob
| |
Re: 3D benchmark and X1000+PCI Radeon9250 Posted on 24-Apr-2013 0:34:22
| | [ #10 ] |
| |
|
Elite Member |
Joined: 20-Mar-2003 Posts: 6385
From: S.Wales | | |
|
| What's video like on the X1000 with a Radeon 92x0 compared to an HD card?
|
| Status: Offline |
| | Jupp3
| |
Re: 3D benchmark and X1000+PCI Radeon9250 Posted on 24-Apr-2013 4:24:33
| | [ #11 ] |
| |
|
Super Member |
Joined: 22-Feb-2007 Posts: 1225
From: Unknown | | |
|
| @Rob
Quote:
What's video like on X1000 with Radeon 92x0 compared to with an HD card. |
I assume overlay is supported on all Radeon 92x0 cards, so the video playback should be significantly faster, especially if there's any sort of scaling involved. |
| Status: Offline |
| | zzd10h
| |
Re: 3D benchmark and X1000+PCI Radeon9250 Posted on 24-Apr-2013 6:29:25
| | [ #12 ] |
| |
|
Amiga Developer Team |
Joined: 21-May-2012 Posts: 1077
From: France | | |
|
| @Spectre660
Yes, of course, I use the official Warp3D. I have never installed Wazp3D. I installed a Radeon 9250 principally to watch video with overlay. _________________ http://apps.amistore.net/zTools |
| Status: Offline |
| | Spectre660
| |
Re: 3D benchmark and X1000+PCI Radeon9250 Posted on 24-Apr-2013 10:22:22
| | [ #13 ] |
| |
|
Elite Member |
Joined: 4-Jun-2005 Posts: 3918
From: Unknown | | |
|
| | Status: Offline |
| | Spectre660
| |
Re: 3D benchmark and X1000+PCI Radeon9250 Posted on 24-Apr-2013 10:30:45
| | [ #14 ] |
| |
|
Elite Member |
Joined: 4-Jun-2005 Posts: 3918
From: Unknown | | |
|
| @zzd10h
OK, things are clear now. The speed issue may be the lack of Picasso96 DMA support for the X1000 PCI slots. The same issue exists with the single 33 MHz PCI slot of the Sam440ep itx, and with the second and third PCI slots of the Sam440ep-flex, which are also 33 MHz.
This post refers to the Gfxbench2D benchmark MemCopy score with or without DMA. A 33 MHz slot without DMA reaches only about 25% of the performance of a 66 MHz slot with DMA.
http://amigaworld.net/modules/newbb/viewtopic.php?topic_id=36053&start=20&post_id=687804&order=0&viewmode=flat&pid=0&forum=14#687804
The Sam460ex PCI slot performs better.
This is the Sam460ex
MemCopy Score: 187.69
Operation            MiB/s
Copy to VRAM        104.38
Write Pixel Array   164.92
Copy from VRAM       36.19
Read Pixel Array     38.48
This is the X1000
MemCopy Score: 25.68
Operation            MiB/s
Copy to VRAM         34.22
Write Pixel Array    18.62
Copy from VRAM        4.69
Read Pixel Array      2.65
This is the faster Sam440ep itx M9
MemCopy Score: 103.03
Operation            MiB/s
Copy to VRAM         39.94
Write Pixel Array    81.26
Copy from VRAM       33.10
Read Pixel Array     31.78
And your Sam440ep-flex 66 MHz PCI slot
MemCopy Score: 122.62
Operation            MiB/s
Copy to VRAM         56.88
Write Pixel Array    96.86
Copy from VRAM       35.68
Read Pixel Array     35.78
Quote:
zzd10h wrote: @Spectre660
Yes, of course, I use official Warp3D. I have never installed wazp3D. I have installed a Radeon9250 principaly to watch video with overlay. |
Last edited by Spectre660 on 24-Apr-2013 at 11:42 AM. Last edited by Spectre660 on 24-Apr-2013 at 11:40 AM. Last edited by Spectre660 on 24-Apr-2013 at 10:45 AM. Last edited by Spectre660 on 24-Apr-2013 at 10:42 AM. Last edited by Spectre660 on 24-Apr-2013 at 10:36 AM. Last edited by Spectre660 on 24-Apr-2013 at 10:32 AM.
_________________ Sam460ex : Radeon Rx550 Single slot Video Card : SIL3112 SATA card |
| Status: Offline |
| | thellier
| |
Re: 3D benchmark and X1000+PCI Radeon9250 Posted on 24-Apr-2013 13:22:07
| | [ #15 ] |
| |
|
Regular Member |
Joined: 2-Nov-2009 Posts: 263
From: Paris | | |
|
| @all
So if it is a bandwidth problem, then the coordinates format we use in Warp3D matters.
So I made a test, and yes: always using a point format like this one, which may allow bump-mapping and specular color,
/*==================================================================*/
typedef struct _Point3D {
    float x,y,z;
    float color[4];
    float color2[4];
    float u,v,w;
    float u2,v2,w2;
    float nx,ny,nz;
} Point3D;
/*==================================================================*/
is slower than this one:
/*==================================================================*/
typedef struct _Point3D {
    float x,y,z;
    float color[4];
    float u,v,w;
    float nx,ny,nz;
} Point3D;
I mean it is slower even without enabling bump-mapping and specular color.
Even replacing float color[4]; with UBYTE RGBA[4]; should help a little.
Conclusion: using the smallest possible coordinates format in Warp3D matters when you draw 200 000 triangles/second. With the larger format, 200 000 * 3 points * 80 bytes per point = 48 MB/s
Alain Thellier
|
| Status: Offline |
| | Seiya
| |
Re: 3D benchmark and X1000+PCI Radeon9250 Posted on 26-Apr-2013 19:31:58
| | [ #16 ] |
| |
|
Super Member |
Joined: 19-Aug-2006 Posts: 1475
From: Italia | | |
|
| @thellier
These are only synthetic benchmarks. You have to use a real benchmark, or try the Sam440ep and the X1000 with real software. _________________
|
| Status: Offline |
| | Karlos
| |
Re: 3D benchmark and X1000+PCI Radeon9250 Posted on 27-Apr-2013 15:06:54
| | [ #17 ] |
| |
|
Elite Member |
Joined: 24-Aug-2003 Posts: 4619
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition! | | |
|
| @thellier Quote:
Even replacing float color[4]; by UBYTE RGBA[4]; should help a few
|
Internally, the drivers already do this in most cases. For example, here's an inline function I wrote to handle the normalized float to uint8 conversion:
/* Fast conversion of 0.0-1.0 ... 0-255 */
static __inline uint32 normalizedF2U8(float32 x)
{
  union {
    float32 f;
    uint32 i;
  } u;
  u.f = 32768.0f + x * (255.0f / 256.0f);
  return (uint8)u.i;
}
Error is +/-1 in the least significant digit, but is always in the range 0-255 for properly normalized input, which I consider acceptable for the use case.
It produces significantly better code than the naive multiply by 255 version, even given the fact the union causes register stack spillage. For example, stacking up 4 conversions using a naive approach :
/* static inline */ uint32 convertF2U(float32* clr)
{
  extern float32 FAC_255; /* scale factor: 255.0f, maps 0.0-1.0 to 0-255 */
  register float32 scale = FAC_255;
  return
    ( (uint32) (clr[0]*scale) )<<24 | /* A */
    ( (uint32) (clr[1]*scale) )<<16 | /* R */
    ( (uint32) (clr[2]*scale) )<<8  | /* G */
    ( (uint32) (clr[3]*scale) );      /* B */
}
when compiled at -O2:
convertF2U:
    stwu %r1,-80(%r1)
    lis %r8,.LC4@ha
    lis %r10,.LC6@ha
    lfs %f0,.LC4@l(%r8)
    lfs %f12,.LC6@l(%r10)
    lfs %f13,12(%r3)
    fmul %f0,%f13,%f0
    fcmpu %cr7,%f0,%f12
    cror 30,29,30
    beq- %cr7,.L4
    fctiwz %f0,%f0
    lfs %f13,0(%r3)
    lfs %f12,.LC6@l(%r10)
    stfd %f0,8(%r1)
    lfs %f0,.LC4@l(%r8)
    lwz %r11,12(%r1)
    fmul %f0,%f13,%f0
    fcmpu %cr7,%f0,%f12
    cror 30,29,30
    beq- %cr7,.L6
.L14:
    fctiwz %f0,%f0
    lfs %f13,4(%r3)
    lfs %f12,.LC6@l(%r10)
    stfd %f0,24(%r1)
    lfs %f0,.LC4@l(%r8)
    lwz %r9,28(%r1)
    fmul %f0,%f13,%f0
    slwi %r0,%r9,24
    or %r11,%r11,%r0
    fcmpu %cr7,%f0,%f12
    cror 30,29,30
    beq- %cr7,.L8
.L15:
    fctiwz %f0,%f0
    lfs %f13,8(%r3)
    lfs %f12,.LC6@l(%r10)
    stfd %f0,40(%r1)
    lfs %f0,.LC4@l(%r8)
    lwz %r9,44(%r1)
    fmul %f0,%f13,%f0
    slwi %r0,%r9,16
    or %r0,%r11,%r0
    fcmpu %cr7,%f0,%f12
    cror 30,29,30
    beq- %cr7,.L10
.L16:
    fctiwz %f0,%f0
    stfd %f0,56(%r1)
    lwz %r3,60(%r1)
    addi %r1,%r1,80
    slwi %r3,%r3,8
    or %r3,%r0,%r3
    blr
.L4:
    fsub %f0,%f0,%f12
    lfs %f12,.LC6@l(%r10)
    fctiwz %f13,%f0
    lfs %f0,.LC4@l(%r8)
    stfd %f13,16(%r1)
    lfs %f13,0(%r3)
    lwz %r11,20(%r1)
    fmul %f0,%f13,%f0
    addis %r11,%r11,0x8000
    fcmpu %cr7,%f0,%f12
    cror 30,29,30
    bne+ %cr7,.L14
.L6:
    fsub %f0,%f0,%f12
    lfs %f12,.LC6@l(%r10)
    fctiwz %f13,%f0
    lfs %f0,.LC4@l(%r8)
    stfd %f13,32(%r1)
    lfs %f13,4(%r3)
    lwz %r9,36(%r1)
    fmul %f0,%f13,%f0
    addis %r9,%r9,0x8000
    slwi %r0,%r9,24
    fcmpu %cr7,%f0,%f12
    or %r11,%r11,%r0
    cror 30,29,30
    bne+ %cr7,.L15
.L8:
    fsub %f0,%f0,%f12
    lfs %f12,.LC6@l(%r10)
    fctiwz %f13,%f0
    lfs %f0,.LC4@l(%r8)
    stfd %f13,48(%r1)
    lfs %f13,8(%r3)
    lwz %r9,52(%r1)
    fmul %f0,%f13,%f0
    addis %r9,%r9,0x8000
    slwi %r0,%r9,16
    fcmpu %cr7,%f0,%f12
    or %r0,%r11,%r0
    cror 30,29,30
    bne+ %cr7,.L16
.L10:
    fsub %f0,%f0,%f12
    fctiwz %f13,%f0
    stfd %f13,64(%r1)
    lwz %r3,68(%r1)
    addi %r1,%r1,80
    addis %r3,%r3,0x8000
    slwi %r3,%r3,8
    or %r3,%r0,%r3
    blr
Replacing the scale multiply by calls to the inlined normalizedF2U8 gives the following output at the same optimization level:
convertF2U:
    stwu %r1,-16(%r1)
    lis %r9,.LC0@ha
    lis %r11,.LC1@ha
    lfs %f13,.LC0@l(%r9)
    mr %r10,%r3
    lfs %f0,.LC1@l(%r11)
    lfs %f12,0(%r3)
    lfs %f11,4(%r3)
    lfs %f10,12(%r3)
    fmadds %f12,%f12,%f13,%f0
    fmadds %f11,%f11,%f13,%f0
    fmadds %f10,%f10,%f13,%f0
    stfs %f12,8(%r1)
    lwz %r3,8(%r1)
    stfs %f11,8(%r1)
    slwi %r3,%r3,24
    lfs %f12,8(%r10)
    lwz %r0,8(%r1)
    stfs %f10,8(%r1)
    fmadds %f12,%f12,%f13,%f0
    rlwinm %r0,%r0,16,8,15
    or %r3,%r3,%r0
    lwz %r9,8(%r1)
    stfs %f12,8(%r1)
    rlwinm %r9,%r9,0,0xff
    or %r3,%r3,%r9
    lwz %r11,8(%r1)
    addi %r1,%r1,16
    rlwinm %r11,%r11,8,16,23
    or %r3,%r3,%r11
    blr
_________________ Doing stupid things for fun... |
| Status: Offline |