Poster | Thread | cdimauro
| |
Re: PowerPC notebook - Status update Posted on 10-Jun-2015 21:11:50
| | [ #61 ] |
| |
|
Elite Member |
Joined: 29-Oct-2012 Posts: 3650
From: Germany | | |
|
| @pavlor
Quote:
pavlor wrote: @cdimauro
Quote:
it's derived from it, but it's different. |
Different in what? Bigger cache? |
3 vs. 4 instructions per cycle is enough of a difference, I think. However, I discuss that in my reply to olegil. Quote:
Quote:
You can see that a G4 is a very respectable chip, compared to the current PowerPC offerings. |
Quote:
it's better to use the same SPEC benchmark. However, it remains a synthetic benchmark which is very unlikely to exercise the AltiVec unit, for example. |
Both G4 and e6500 have similar AltiVec units. |
I don't think so: the G4 has 4 vector units, whereas the e6500 has only one, for example. Quote:
I don´t think G4 would score any better. |
See above. Let's see the benchmarks.
I think that you have seen some benchmarks which were posted some time ago, when the X1000 was introduced, and AFAIK G4 systems were, on average, superior to any other PowerPC microarchitecture, except the G5. Quote:
On the other hand, G4 is bottlenecked by slow memory interface (eg. G4 1666 MHz is faster in raw performance than PA6T and has much more powerful AltiVec, still it loses in video playback benchmarks). |
Sure, it's limited by the memory subsystem, but it doesn't mean that it'll be a poor performer, in general.
Regarding the video playback, it's really strange. Does it really depend on the processor? Because the issue might be related to the video card. |
| Status: Offline |
| | cdimauro
| |
Re: PowerPC notebook - Status update Posted on 10-Jun-2015 21:12:53
| | [ #62 ] |
| |
|
Elite Member |
Joined: 29-Oct-2012 Posts: 3650
From: Germany | | |
|
| @Hypex
Quote:
Hypex wrote: @cdimauro
For the past 10-15 years, producing a PowerPC machine has seemingly been hard. At first it offered similar if not equal power at a higher price; now it offers less power and is still expensive compared to cheaper boards. On top of that comes sourcing the chips.
But if this laptop can be produced with four cores at 64-bit it certainly is up there in the power stakes. And likely to be as expensive as a 17" PowerBook ever was. |
And performance has to be seen... |
| Status: Offline |
| | cdimauro
| |
Re: PowerPC notebook - Status update Posted on 10-Jun-2015 21:22:47
| | [ #63 ] |
| |
|
Elite Member |
Joined: 29-Oct-2012 Posts: 3650
From: Germany | | |
|
| @olegil
Quote:
olegil wrote: @cdimauro
e600/G4 is closer to a renaming than a derivation.
7448 uses e600, 7447 is G4. |
Then there's something wrong in the documentation that pavlor posted, since the G4 can execute up to 4 instructions per clock cycle, instead of the reported 3. Quote:
However, buried in your TL;DR posts is an important point. e5500 and e6500 are beefed up e500 cores, not beefed up e600 cores. Comparing MIPS numbers across families is hard, so we're gonna need some real-world data on this one. |
That's what I have been repeating for a long time. Quote:
It's hard for a G4 with 133/166 or even 200MHz FSB to compete with triple DDR3 1866 also. Essentially, the T4 has 28 times higher bandwidth to memory than even the 7448. |
Sure, but:
- the T4 is a "high-end" chip, and it remains to be seen whether it'll be used for any new post-Amiga machine;
- you don't need to use all that bandwidth, which is also theoretical (the practical/real figure is quite a bit lower, especially if the code cannot issue further requests in parallel with the one currently being served);
- DDR3 poses other limitations compared to DDR and DDR2 (granularity, full burst modes, latency).
Quote:
Also, you're ONLY concerned with single-thread performance, the rest of us are still hopeful there's gonna be some SMP benefit here. |
Naturally, since any post-Amiga OS is strictly single-thread/core, and there's no chance of getting SMP without losing backward compatibility with the existing applications (which is unlikely to happen). Except for AROS x86/x64, which might have some chance, but let me leave it out of the discussion for now.
I think that a form of AMP will be more realistic.
But, anyway, we can talk about this WHEN there'll be some multi-core support. "When it's done"... Quote:
Edit: Now I found the 8641D datasheet, it lists exactly the same core features as the 7447 you quote, no surprises there. |
OK, so the information previously reported by pavlor wasn't entirely accurate. |
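The "28 times" figure quoted above can be checked with back-of-the-envelope arithmetic. This is only a sketch of peak theoretical numbers: it assumes every bus is 64 bits wide, that the 7448 front-side bus moves one transfer per clock at 200 MHz, and that the T4 runs three DDR3-1866 channels. The function name and parameters are made up for illustration.

```python
# Peak theoretical bandwidth = transfers per second * bytes per transfer * channels.
# Assumption: every bus below is 64 bits (8 bytes) wide.
BUS_BYTES = 8


def peak_bandwidth_mb_s(mega_transfers_per_s, channels=1, bus_bytes=BUS_BYTES):
    """Return the peak theoretical bandwidth in MB/s."""
    return mega_transfers_per_s * bus_bytes * channels


# Assumption: 7448 front-side bus at 200 MHz, one transfer per clock.
g4_fsb = peak_bandwidth_mb_s(200)
# Assumption: three DDR3-1866 channels on the T4.
t4_ddr3 = peak_bandwidth_mb_s(1866, channels=3)

ratio = t4_ddr3 / g4_fsb
print(f"G4 FSB : {g4_fsb} MB/s")
print(f"T4 DDR3: {t4_ddr3} MB/s")
print(f"ratio  : {ratio:.1f}x")
```

Under those assumptions the ratio comes out very close to 28, which matches the quoted claim. As noted in the reply, sustained bandwidth on real code would be lower on both sides.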
| Status: Offline |
| | cdimauro
| |
Re: PowerPC notebook - Status update Posted on 10-Jun-2015 21:31:25
| | [ #64 ] |
| |
|
Elite Member |
Joined: 29-Oct-2012 Posts: 3650
From: Germany | | |
|
| @Hammer
Quote:
Hammer wrote: @cdimauro
Quote:
cdimauro wrote: As you can see, the last models have the latest G4 processors, which had good performance (and Altivec too).
|
While it's good to see alternatives, what's special about Altivec when Intel AVXv2 (with FMA + 256bit SIMD) and Intel Iris Pro smashes Altivec solution?
|
I already posted a link to a post some time ago, which talked about AVX. AVX2 is better, especially with FMA (which essentially doubles the floating point performance, when using the common MAC pattern).
However, SSE already had very good performance too.
Regarding the Iris Pro, it's a GPU, not a SIMD unit. You can use it, with things like OpenCL, and achieve very good performance.
More interesting stuff will come, to make it MUCH easier to parallelize/vectorize the code, without rewriting it as is needed with OpenCL, CUDA, DirectCompute. |
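To illustrate why FMA "essentially doubles" peak floating point throughput on the multiply-accumulate pattern, here is a minimal sketch. It assumes a single vector pipe that issues one instruction per cycle, counts a fused multiply-add as two floating point operations per lane, and uses a hypothetical helper name; real chips with separate multiply and add pipes would behave differently.

```python
def peak_flops_per_cycle(vector_bits, element_bits, fma, units=1):
    """Peak floating point operations per cycle for one core.

    Assumption: each unit issues one vector instruction per cycle,
    and an FMA counts as two operations (multiply + add) per lane.
    """
    lanes = vector_bits // element_bits
    flops_per_lane = 2 if fma else 1
    return lanes * flops_per_lane * units


def dot(a, b):
    """The common MAC pattern: acc = acc + x * y for every element pair."""
    acc = 0.0
    for x, y in zip(a, b):
        acc += x * y  # one multiply and one add, or a single FMA
    return acc


# Single precision (32-bit) elements in all three cases.
simd128 = peak_flops_per_cycle(128, 32, fma=False)
simd256 = peak_flops_per_cycle(256, 32, fma=False)
simd256_fma = peak_flops_per_cycle(256, 32, fma=True)
print(simd128, simd256, simd256_fma)
print(dot([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))
```

Doubling the vector width doubles the lanes, and FMA doubles the work per lane, so the two effects multiply. These are peak figures only; memory bandwidth and instruction mix decide what real code achieves.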
| Status: Offline |
| | pavlor
| |
Re: PowerPC notebook - Status update Posted on 10-Jun-2015 21:37:39
| | [ #65 ] |
| |
|
Elite Member |
Joined: 10-Jul-2005 Posts: 9584
From: Unknown | | |
|
| @cdimauro
Quote:
since the G4 can execute up to 4 instructions per clock cycle, instead of the reported 3. |
Look again at CPU specs - 4 integer units in both e600 and older G4 designs.
Quote:
OK, so the information previously reported by pavlor wasn't entirely accurate. |
I linked all the documents you need. I only copied and pasted the information important for our discussion.
Last edited by pavlor on 10-Jun-2015 at 09:40 PM.
|
| Status: Offline |
| | cdimauro
| |
Re: PowerPC notebook - Status update Posted on 10-Jun-2015 21:50:03
| | [ #66 ] |
| |
|
Elite Member |
Joined: 29-Oct-2012 Posts: 3650
From: Germany | | |
|
| | Status: Offline |
| | KimmoK
| |
Re: PowerPC notebook - Status update Posted on 11-Jun-2015 7:51:19
| | [ #67 ] |
| |
|
Elite Member |
Joined: 14-Mar-2003 Posts: 5211
From: Ylikiiminki, Finland | | |
|
| @cdimauro
>G4 has 4 vector units, whereas the e6500 has only one, for example.
To me it seems they look more similar regarding the number of internal units. e6500 AltiVec consists of VSFX, VCFX, VFPU, VPERM units, and e600 AltiVec consists of VPU, VIU1, VIU2, VFPU units.
(I believe in e6500 two threads share one Altivec unit, while a lot of other stuff is doubled to enable near 2x overall performance... e6500 dual thread core looks to me a little bit like AMD Bobcat dualcore ...)
SPAM of obsolete Freescale marketing material: https://www.power.org/wp-content/uploads/2012/10/8.-Freescale-Dac-Pham.pdf (has some cool CoreMark graphs, for us Power Architecture lovers)
& for reference, here one can see how cores match with Power ISA spec: http://en.wikipedia.org/wiki/Power_Architecture#Power_ISA_v.2.06
Power ISA v.2.03 - 440, 460 (A1 500 / SAM CPU)
Power ISA v.2.04 - PA6T (the A1 X1000 CPU)
Power ISA v.2.06 - e500-mc (A1 X3500 CPU), e5500 (A1 X5000 CPU)
Power ISA v.2.07 - e6500 (T208x and T4xxx SoCs)
e6500 internals: http://doi.ieeecomputersociety.org/cms/Computer.org/dl/mags/mi/2012/05/figures/mmi20120500261.gif UPDATE, e6500 pipeline pic: http://doi.ieeecomputersociety.org/cms/Computer.org/dl/mags/mi/2012/05/figures/mmi20120500262.gif
++++ Power ISA v2.06 introduces the new VSX vector-scalar instructions which extend SIMD processing for the Power ISA to support up to 64 registers, with support for regular floating point, decimal floating point and vector execution. So e6500 AltiVec should be better than the old (better than SSE2) AltiVec. But surely even the latest AltiVec is not on the level of Intel's latest SIMD offerings. +++++
The pipeline pic shows how Freescale has duplicated some components for the second thread.
++++ Performance gain in e6500 (vs 5500?) http://doi.ieeecomputersociety.org/cms/Computer.org/dl/mags/mi/2012/05/figures/mmi2012050026t1.gif
Last edited by KimmoK on 11-Jun-2015 at 09:50 AM.
_________________ - KimmoK // For freedom, for honor, for AMIGA // // Thing that I should find more time for: CC64 - 64bit Community Computer? |
| Status: Offline |
| | olegil
| |
Re: PowerPC notebook - Status update Posted on 11-Jun-2015 9:31:38
| | [ #68 ] |
| |
|
Elite Member |
Joined: 22-Aug-2003 Posts: 5895
From: Work | | |
|
| @cdimauro
Different phrasings for the same thing. 7540UM includes ALL G4 processors up to and including the e600 7448, with tables listing differences between them. It has this to say: Quote:
The MPC7450 also implements the AltiVec instruction set architectural extension. The MPC7450 is a superscalar processor that can dispatch and complete three instructions simultaneously. It incorporates the following execution units:
• 64-bit floating-point unit (FPU)
• Branch processing unit (BPU)
• Load/store unit (LSU)
• Four integer units (IUs):
  - Three shorter latency IUs (IU1a–IU1c): execute all integer instructions except multiply, divide, and move to/from special-purpose register (SPR) instructions.
  - Longer latency IU (IU2): executes miscellaneous instructions including condition register (CR) logical operations, integer multiplication and division instructions, and move to/from SPR instructions.
• Four vector units that support AltiVec instructions:
  - Vector permute unit (VPU)
  - Vector integer unit 1 (VIU1): performs shorter latency integer calculations
  - Vector integer unit 2 (VIU2): performs longer latency integer calculations
  - Vector floating-point unit (VFPU)
|
Quote:
The e600 core also implements the AltiVec instruction set architectural extension. The e600 core can dispatch and complete three instructions simultaneously. It incorporates the following execution units:
• 64-bit floating-point unit (FPU)
• Branch processing unit (BPU)
• Load/store unit (LSU)
• Four integer units (IUs):
  - Three shorter latency IUs (IU1a–IU1c): execute all integer instructions except multiply, divide, and move to/from special-purpose register (SPR) instructions.
  - Longer latency IU (IU2): executes miscellaneous instructions including condition register (CR) logical operations, integer multiplication and division instructions, and move to/from SPR instructions.
• Four vector units that support AltiVec instructions:
  - Vector permute unit (VPU)
  - Vector integer unit 1 (VIU1): performs shorter latency integer calculations
  - Vector integer unit 2 (VIU2): performs longer latency integer calculations
  - Vector floating-point unit (VFPU)
|
e600 === G4.
Edit: cleaned up some linebreaks
Edit2: forgot to write where I got the e600 part. It's from the MPC8641DRM.
Last edited by olegil on 11-Jun-2015 at 09:48 AM.
_________________ This weeks pet peeve: Using "voltage" instead of "potential", which leads to inventing new words like "amperage" instead of "current" (I, measured in A) or possible "charge" (amperehours, Ah or Coulomb, C). Sometimes I don't even know what people mean. |
| Status: Offline |
| | pavlor
| |
Re: PowerPC notebook - Status update Posted on 11-Jun-2015 16:43:47
| | [ #69 ] |
| |
|
Elite Member |
Joined: 10-Jul-2005 Posts: 9584
From: Unknown | | |
|
| @olegil
Thanks for side by side comparison. I hope it is now clear to all. |
| Status: Offline |
| | cdimauro
| |
Re: PowerPC notebook - Status update Posted on 11-Jun-2015 20:50:47
| | [ #70 ] |
| |
|
Elite Member |
Joined: 29-Oct-2012 Posts: 3650
From: Germany | | |
|
| @KimmoK
Quote:
KimmoK wrote: @cdimauro
>G4 has 4 vector units, whereas the e6500 has only one, for example.
To me it seems they look more similar regarding the number of internal units. e6500 AltiVec consists of VSFX, VCFX, VFPU, VPERM units, and e600 AltiVec consists of VPU, VIU1, VIU2, VFPU units. |
What counts is how many instructions you can issue to any unit. Quote:
(I believe in e6500 two threads share one Altivec unit, |
There's only one, shared by the two threads, as is shown by the images that you linked. Quote:
while a lot of other stuff is doubled to enable near 2x overall performance... |
No. Only the integer units were doubled, as well as the load/store and the branch units, but the number of decoded instructions per cycle only went from 2 to 3. The fetch, FPU and AltiVec units are all single copy / shared. So no, you cannot get 2x overall performance, and we are talking about theoretical numbers. Quote:
e6500 dual thread core looks to me a little bit like AMD Bobcat dualcore ...) |
Sorry, I don't remember how it looks. Quote:
Marketing. Quote:
Power ISA v2.06 introduces the new VSX vector-scalar instructions which extend SIMD processing for the Power ISA to support up to 64 registers, with support for regular floating point, decimal floating point and vector execution. |
Does it also support double precision? Does it have some FMAC? Quote:
So e6500 Altivec should be better than the old (better than SSE2) Altivec. |
That it's better than the old AltiVec is very likely, but why do you think that it's better than SSE2 (which is an old technology, BTW: we have had SSE4.2 for a long time, with some minor, but important, extensions)? Do you have any information? Quote:
But surely even the latest Altivec is not on the level of intel latest SIMD offerings. |
I already posted an article. Quote:
Single-thread/core performance can be lower, due to the halving of the L1 code cache. |
| Status: Offline |
| | cdimauro
| |
Re: PowerPC notebook - Status update Posted on 11-Jun-2015 20:52:10
| | [ #71 ] |
| |
|
Elite Member |
Joined: 29-Oct-2012 Posts: 3650
From: Germany | | |
|
| @olegil: they still report "three instructions", whereas the G4 can process up to 4 (one should be a branch).
So, they don't consider a branch as an instruction. |
| Status: Offline |
| | pavlor
| |
Re: PowerPC notebook - Status update Posted on 11-Jun-2015 21:14:39
| | [ #72 ] |
| |
|
Elite Member |
Joined: 10-Jul-2005 Posts: 9584
From: Unknown | | |
|
| @cdimauro
Did you read Olegil´s direct comparison of e600 vs "G4" core features? I know it is too late here in central Europe, try again tomorrow morning (and me too ). |
| Status: Offline |
| | resle
| |
Re: PowerPC notebook - Status update Posted on 12-Jun-2015 1:51:20
| | [ #73 ] |
| |
|
Cult Member |
Joined: 28-Nov-2005 Posts: 500
From: shanghai | | |
|
| @cdimauro @pavlor
Jesus F. Christ,
what's the point of this obnoxious banter on instruction cycles, picoseconds, vperm, vcpu, xrks, xwerlwlwlslfs and on and on and on,
when ultimately the thing will be an underpowered, overpriced black brick that looks like a Toshiba notebook from the mid 90s? |
| Status: Offline |
| | cdimauro
| |
Re: PowerPC notebook - Status update Posted on 12-Jun-2015 5:27:27
| | [ #74 ] |
| |
|
Elite Member |
Joined: 29-Oct-2012 Posts: 3650
From: Germany | | |
|
| @pavlor
Quote:
pavlor wrote: @cdimauro
Did you read Olegil´s direct comparison of e600 vs "G4" core features? |
Yes. And I only pointed out that they report 3 instructions for both, whereas in reality the G4 can execute 4, as is written in MPC7447AEC.pdf (from which I quoted in the TL;DR post).
To speak clearly: I've no problem accepting that e600 = G4, but the documentation is simply wrong. That's all. Quote:
I know it is too late here in central Europe, try again tomorrow morning (and me too ). |
No, usually I stay up longer, but last night I was so tired that I preferred to go straight to bed and rest.
@resle
Quote:
resle wrote: @cdimauro @pavlor
Jesus F. Christ,
what's the point of this obnoxious banter on instruction cycles, picoseconds, vperm, vcpu, xrks, xwerlwlwlslfs and on and on and on,
when ultimately the thing will be an underpowered, overpriced black brick that looks like a Toshiba notebook from the mid 90s? |
To claim that it's underpowered you have to discuss performance, which is what we were doing.
Anyway, you're too optimistic, because you're supposing that this project will be completed and... shipped. |
| Status: Offline |
| | olegil
| |
Re: PowerPC notebook - Status update Posted on 12-Jun-2015 6:43:10
| | [ #75 ] |
| |
|
Elite Member |
Joined: 22-Aug-2003 Posts: 5895
From: Work | | |
|
| @cdimauro
Quote:
cdimauro wrote: @olegil: they still report "three instructions", whereas the G4 can process up to 4 (one should be a branch).
So, they don't consider a branch as an instruction. |
Well, it isn't, if the branch predictor is implemented sanely.
Basically you either take the most likely path and have to clear out the pipeline if you're wrong (PPC 603e does this, always guesses loop), OR you take BOTH paths (x86 does this). One of these approaches costs a LOT of transistors, shouldn't be hard to guess which. As for G4, I don't remember what it does.
Edit: It seems to have "dynamic branch prediction", so I'm thinking it's NOT the 603e approach (as that is not very dynamic).
Edit2: AND reading further, it's all explained in the next chapter:
Branch processing unit (BPU) features static and dynamic branch prediction
- 128-entry (32-set, four-way set-associative) branch target instruction cache (BTIC), a cache of branch instructions that have been encountered in branch/loop code sequences. If a target instruction is in the BTIC, it is fetched into the instruction queue a cycle sooner than it can be made available from the instruction cache. Typically, a fetch that hits the BTIC provides the first 4 instructions in the target stream.
- 2048-entry branch history table (BHT) with 2 bits per entry for four levels of prediction: not-taken, strongly not-taken, taken, strongly taken
- Up to three outstanding speculative branches
- Branch instructions that do not update the count register (CTR) or link register (LR) are often removed from the instruction stream.
- Eight-entry link register stack to predict the target address of Branch Conditional to Link Register (bclr) instructions
So it keeps statistics of the branch (well, two bits, but anyway), it keeps a cache of targets, the branch doesn't take any cycles if it hits the cache. This gives you close to "three instructions and a branch" each cycle. The e6500 manual is phrased a little bit differently (actually not that much), but it seems to do more or less exactly the same thing. Basically all the good stuff from G4 has now been put in the e5500, this is why they named it e6500, as far as I can tell. Now I need to work.
Last edited by olegil on 12-Jun-2015 at 07:20 AM.
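The "two bits per entry" scheme described in the manual excerpt can be sketched as a table of saturating counters. This is a generic textbook model, not the actual G4 or e6500 logic: the index function, the starting counter value and the class name are assumptions made for illustration, and only the 2048-entry table size is taken from the excerpt.

```python
class TwoBitPredictor:
    """Branch history table of 2-bit saturating counters (values 0..3).

    Counters 0 and 1 predict not-taken, 2 and 3 predict taken.
    """

    def __init__(self, entries=2048):
        self.entries = entries
        # Assumption: every counter starts just below the taken threshold.
        self.table = [1] * entries

    def _index(self, pc):
        # Assumption: 4-byte instructions, so the two low address bits are dropped.
        return (pc >> 2) % self.entries

    def predict(self, pc):
        return self.table[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.table[i] = min(3, self.table[i] + 1)
        else:
            self.table[i] = max(0, self.table[i] - 1)


def run(predictor, pc, outcomes):
    """Feed a sequence of real branch outcomes and count correct predictions."""
    hits = 0
    for taken in outcomes:
        if predictor.predict(pc) == taken:
            hits += 1
        predictor.update(pc, taken)
    return hits


# A loop-closing branch: taken 9 times, then falls through; the loop runs 10 times.
outcomes = ([True] * 9 + [False]) * 10
hits = run(TwoBitPredictor(), 0x1000, outcomes)
print(f"{hits}/{len(outcomes)} predicted correctly")
```

The point of the second bit is that a single loop exit does not flip the prediction, so the next run of the same loop is predicted correctly from its first iteration. A one-bit scheme would mispredict twice per loop run instead of once.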
_________________ This weeks pet peeve: Using "voltage" instead of "potential", which leads to inventing new words like "amperage" instead of "current" (I, measured in A) or possible "charge" (amperehours, Ah or Coulomb, C). Sometimes I don't even know what people mean. |
| Status: Offline |
| | KimmoK
| |
Re: PowerPC notebook - Status update Posted on 12-Jun-2015 8:07:52
| | [ #76 ] |
| |
|
Elite Member |
Joined: 14-Mar-2003 Posts: 5211
From: Ylikiiminki, Finland | | |
|
| @cdimauro
>Better than old Altivec it's very likely, but why do you think that it's better than SSE2
I did not save any link, just read a study (non apple .pdf) where P4 systems were tested against G5 systems. In that document 2Ghz G5 was slower than much higher clocked P4 with compiler optimized code. (I think it took long time before compilers started to fully optimize for G5) And with hand optimized assembler (on that document/study) G5 became a lot faster in multimedia related operations . IIRC, 150% faster than the higher clocked P4.
But as we know, AltiVec has not had big updates since G4, while Intel has made at least 4 or 5 updates to its SIMD technology since 2001.
(and Intel has also been superior in developing compilers for their SIMD; in PowerPC there never was such power in compiler development, so compilers seem to be far behind what the latest cores could do)
((unless I'm mistaken, current compilers for AOS4 do not yet fully support PA6T, just as an example))
>(which is an old technology,
Sure, SSE2 is from 2001. Apple people had a lot of fun optimizing for G4 and G5 so that they started to beat the P4 with SSE2 (per MHz and more).
Last edited by KimmoK on 12-Jun-2015 at 08:11 AM.
_________________ - KimmoK // For freedom, for honor, for AMIGA // // Thing that I should find more time for: CC64 - 64bit Community Computer? |
| Status: Offline |
| | tlosm
| |
Re: PowerPC notebook - Status update Posted on 12-Jun-2015 9:57:02
| | [ #77 ] |
| |
|
Elite Member |
Joined: 28-Jul-2012 Posts: 2746
From: Amiga land | | |
|
| @KimmoK
altivec on quad g5 2.5 ghz lame 25x for core 32bit ( the quad) lame 50x i5 2.5 ghz 2012 64 bit
belive me g5 is much faster than pentium 4 Last edited by tlosm on 12-Jun-2015 at 10:00 AM.
_________________ I love Amiga and new hope by AmigaNG A 500 + ; CDTV; CD32; PowerMac G5 Quad 8GB,SSD,SSHD,7800gtx,Radeon R5 230 2GB; MacBook Pro Retina I7 2.3ghz; #nomorea-eoninmyhome |
| Status: Offline |
| | WolfToTheMoon
| |
Re: PowerPC notebook - Status update Posted on 12-Jun-2015 11:45:36
| | [ #78 ] |
| |
|
Super Member |
Joined: 2-Sep-2010 Posts: 1351
From: CRO | | |
|
| @tlosm
Quote:
belive me g5 is much faster than pentium 4 |
it isn't. At best it's a mixed bag between the two. Athlon 64 was far better, as a design, than either of the two.
From the mouth of the guy who worked parallel on P4 and G5 on OS X
Quote:
No. As a developer I worked on both early Intel Macs with Pentium 4s running OS X and G5s running OS X side by side. Both running OS X natively.
Same systems noted here, with other developers noting the same thing: http://www.appleinsider.com/ article...ple_intel_dev_systems_impress_developers. html
Pentium 4 was faster at running OS X. Most my time was spent getting my software working on both Intel and PowerPC so I spent a lot of time comparing performance. |
http://forums.macrumors.com/threads/power-mac-g5-vs-pentium-4.1349317/
|
| Status: Offline |
| | KimmoK
| |
Re: PowerPC notebook - Status update Posted on 12-Jun-2015 12:04:30
| | [ #80 ] |
| |
|
Elite Member |
Joined: 14-Mar-2003 Posts: 5211
From: Ylikiiminki, Finland | | |
|
| | Status: Offline |
| |
|
|
|