NutsAboutAmiga 
Re: Next Freescale high performance PPC chip.
Posted on 29-Oct-2013 16:04:44
#161
Elite Member
Joined: 9-Jun-2004
Posts: 12817
From: Norway

@damocles

Quote:
Does anyone really care about Altivec in 2013?


Yes, Altivec is great: you can execute about 4 instructions at the same time. But it has its problems. First of all, you need to compile the binary with Altivec optimization enabled.

GCC does not let you build inline assembler code both with and without it in the same binary, so you can't switch between the two while the program runs; you end up with two executables, or a common library, or something like that.

The 2nd problem is poor documentation on the internet; I have spent a lot of time looking for guides on how to write Altivec assembler code.

The 3rd problem: most AmigaONE/Sam users do not have a CPU that supports it.

So in the end, because of issues 2 and 3, not many people know how to use it, and if they did they might not have the hardware to do it on; besides, only a handful of people would be able to make use of it.

Altivec has its own registers, and each register holds a temporary value; as long as the instructions do not share registers, the operations can run in parallel. This is its advantage.

For example, if you have something like this:

Load normal register 0 into vector 0
Add 10 to vector 0
Store Vector 0 to normal register 0

This code is just as slow as normal code, but if you unroll loops and do:

Load normal register 0 into vector 0
Load normal register 1 into vector 1
Load normal register 2 into vector 2
Add 10 to vector 0
Add 10 to vector 1
Add 10 to vector 2
Store Vector 0 to normal register 0
Store Vector 1 to normal register 1
Store Vector 2 to normal register 2

Then the code is going to be executed many times faster than normal code.

Last edited by NutsAboutAmiga on 30-Oct-2013 at 10:57 AM.
Last edited by NutsAboutAmiga on 29-Oct-2013 at 04:06 PM.

_________________
http://lifeofliveforit.blogspot.no/
Facebook::LiveForIt Software for AmigaOS

damocles 
Re: Next Freescale high performance PPC chip.
Posted on 29-Oct-2013 18:32:07
#162
Super Member
Joined: 22-Dec-2007
Posts: 1719
From: Unknown

@NutsAboutAmiga

So basically, no body cares about Altivec in 2013.

_________________
Dammy

minator 
Re: Next Freescale high performance PPC chip.
Posted on 29-Oct-2013 20:28:37
#163
Cult Member
Joined: 23-Mar-2004
Posts: 989
From: Cambridge

@NutsAboutAmiga

Quote:

Then the code is going to be executed many times faster than normal code.


Actually it will be slower. Probably a lot slower.

I don't know if you've just explained it badly but it looks like you're trying to use the AltiVec unit to do normal (scalar) maths. This makes no sense whatsoever.

You're also moving things to and from the normal scalar registers. This adds a lot of overhead so should be avoided unless absolutely necessary.

Here's a better example:

You have an array of 32 bit numbers and you want to increment them by 10.

In a loop do this:
load vector0 from memory (this loads 4x32 bit numbers)
vector-add a vector of 10s to vector0 (this adds 4x32 bit numbers)
store vector0 to memory (this stores 4x32 bit numbers)

That will do 4 adds per add instruction but there's a load of overhead. Unrolling it will speed it up.

BTW You should also be using intrinsics instead of assembly. They're much easier to use and make the compiler do a load of work for you.
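
The loop described above can be sketched in C. This is only a rough sketch, not code from anyone in this thread: it uses the GCC/Clang vector_size extension as a portable stand-in for the AltiVec intrinsics, and the name add10 is made up for illustration. On a real AltiVec target the load, add and store would typically be vec_ld, vec_add and vec_st from <altivec.h>, which also expect 16-byte aligned data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Four 32-bit lanes in one 128-bit value (GCC/Clang vector extension,
   used here as a stand-in for an AltiVec register). */
typedef int32_t v4si __attribute__((vector_size(16)));

/* Hypothetical helper: add 10 to every element of data[0..n-1]. */
void add10(int32_t *data, size_t n)
{
    const v4si tens = {10, 10, 10, 10};
    size_t i = 0;

    assert(data != NULL || n == 0);

    /* Main loop: one load, one vector add, one store per 4 elements. */
    for (; i + 4 <= n; i += 4) {
        v4si v;
        memcpy(&v, &data[i], sizeof v);   /* load 4 x 32-bit numbers  */
        v += tens;                        /* add 4 x 32-bit numbers   */
        memcpy(&data[i], &v, sizeof v);   /* store 4 x 32-bit numbers */
    }

    /* Scalar tail for the last 0..3 elements. */
    for (; i < n; i++)
        data[i] += 10;
}
```

Unrolling would mean handling two or more such vectors per loop iteration, so the loop overhead is spread over more elements.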

_________________
Whyzzat?

NutsAboutAmiga 
Re: Next Freescale high performance PPC chip.
Posted on 29-Oct-2013 20:31:59
#164
Elite Member
Joined: 9-Jun-2004
Posts: 12817
From: Norway

@damocles

Well, I care, so there is someone who cares.

I don't think the problem is that people don't care; it's just a bit more hassle to get something out of it. But it might be worth it, if you have Altivec, that is.

The truth is that it might be the tiny bit extra needed to play HD video at an acceptable speed on the AmigaONE-X1000, for example, but then it's about having someone who knows what they are doing.

Even normal powerpc assembler optimized routines might make a big difference if someone took their time to do it.

Last edited by NutsAboutAmiga on 30-Oct-2013 at 10:58 AM.

_________________
http://lifeofliveforit.blogspot.no/
Facebook::LiveForIt Software for AmigaOS

tonyw 
Re: Next Freescale high performance PPC chip.
Posted on 29-Oct-2013 20:36:44
#165
Elite Member
Joined: 8-Mar-2003
Posts: 3240
From: Sydney (of course)

@NutsAboutAmiga

Quote:

Even normal powerpc assembler optimized routines might make a big difference if someone took their time to do it.


You can't write better assembler code than the compiler generates from C. It has a lot more insight than you have.

_________________
cheers
tony

Hyperion Support Forum: http://forum.hyperion-entertainment.biz/index.php

NutsAboutAmiga 
Re: Next Freescale high performance PPC chip.
Posted on 29-Oct-2013 20:40:46
#166
Elite Member
Joined: 9-Jun-2004
Posts: 12817
From: Norway

@minator

Well, poor example, I know, but basically what I'm trying to say is that the vector registers work independently, so what looks like a sequence of assembler code is split up and executed in parallel.

But it does require you to arrange the instructions in a way that gives you that effect.

Last edited by NutsAboutAmiga on 29-Oct-2013 at 09:13 PM.
Last edited by NutsAboutAmiga on 29-Oct-2013 at 08:49 PM.

_________________
http://lifeofliveforit.blogspot.no/
Facebook::LiveForIt Software for AmigaOS

NutsAboutAmiga 
Re: Next Freescale high performance PPC chip.
Posted on 29-Oct-2013 20:46:50
#167
Elite Member
Joined: 9-Jun-2004
Posts: 12817
From: Norway

@tonyw

Sorry, that's easy; the C compiler does a crap job at it, really.

Run objdump -S on your executable and see what it has done. The problem with C in general is that it pushes too much out to RAM too often, and there is often a bunch of code that can be removed.

But you don't write a full program in assembler; you only optimize the inner loops. This is where it makes the most sense, because this is where you have a lot of repetitive code being executed over and over again, and this is why it does make a difference.

Let's say you have a routine that is executed 10,000 to 100,000 times or more.

There are also cases where you have an IF condition in C that can be replaced by the ISEL assembler instruction, eliminating the branch.
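
To illustrate the idea of replacing an IF with a select, here is a small C sketch. The function names are made up for this example, and whether a given compiler actually turns either form into isel depends on the target CPU and the compiler flags, which I have not checked here. It only shows what "pick one of two values without jumping" looks like when written out by hand.

```c
#include <assert.h>
#include <stdint.h>

/* Ordinary version: the compiler may emit a compare and a branch. */
uint32_t clamp_branchy(uint32_t x, uint32_t limit)
{
    if (x > limit)
        x = limit;
    return x;
}

/* Branch-free version: build an all-ones or all-zeros mask from the
   comparison and use it to pick one of the two values. */
uint32_t clamp_branchless(uint32_t x, uint32_t limit)
{
    /* 0xFFFFFFFF if x > limit, otherwise 0 */
    uint32_t mask = (uint32_t)0 - (uint32_t)(x > limit);
    return (limit & mask) | (x & ~mask);
}

/* Usage check: both versions must agree. */
void check_clamp(void)
{
    assert(clamp_branchless(5, 10) == clamp_branchy(5, 10));
    assert(clamp_branchless(10, 10) == clamp_branchy(10, 10));
    assert(clamp_branchless(11, 10) == clamp_branchy(11, 10));
}
```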

If the programmer is not too stupid, he might be able to get a few extra cycles out of C too.

It does require an understanding of what the C compiler generates, and of the consequences of writing something this way instead of that way.

Last edited by NutsAboutAmiga on 30-Oct-2013 at 09:51 AM.
Last edited by NutsAboutAmiga on 29-Oct-2013 at 09:02 PM.
Last edited by NutsAboutAmiga on 29-Oct-2013 at 08:58 PM.
Last edited by NutsAboutAmiga on 29-Oct-2013 at 08:56 PM.
Last edited by NutsAboutAmiga on 29-Oct-2013 at 08:54 PM.
Last edited by NutsAboutAmiga on 29-Oct-2013 at 08:51 PM.
Last edited by NutsAboutAmiga on 29-Oct-2013 at 08:50 PM.

_________________
http://lifeofliveforit.blogspot.no/
Facebook::LiveForIt Software for AmigaOS

olegil 
Re: Next Freescale high performance PPC chip.
Posted on 30-Oct-2013 9:21:11
#168
Elite Member
Joined: 22-Aug-2003
Posts: 5895
From: Work

@tonyw

Well, you can, but it usually takes too much effort to be worth it, as the same effort could be invested in rewriting things that are not optimized enough from the programmer's side.

Example:
Load/store architecture (ARM in this case): I needed to write some memory-mapped registers in a bootloader. Using macros/defines for readability makes it possible to use assembly, but the compiler knew that the same value was ending up in two of the registers (the programmer didn't, as he filled in the values after writing the code). This meant that some load instructions were unnecessary; multiple stores from a single load saved time and space. Now, rewriting the assembly to be as efficient as what the C compiler ended up with wouldn't have been difficult: unwrap the macros and look at the values to be written. But then what happens if you need to change a value? Assembly: complete rewrite. C: single macro change.

In other instances it makes perfect sense, for instance with AVR-GCC, which insists on pushing register 1 (which it uses as a zero EVERYWHERE IN THE CODE) and register 0 (temp-reg) before copying the status reg to reg 0 and pushing AGAIN, even if you write the code in your interrupt so that it doesn't change the status register (simple move/store etc). I went from push/mov/push/ser/pop/mov/pop/rts to just ser/rts; it took me VERY little time to change, and it really helped with the performance. This was the chipselect line on an SPI slave implementation. Similar fixes were made to a few other interrupts (like setting aside registers for status instead of using the stack; I used 6 registers for 3 copies of status and my prime data reg). Without the fixes, the implementation needed a 14 USD FPGA; I managed it with a 1 USD MCU.

_________________
This weeks pet peeve:
Using "voltage" instead of "potential", which leads to inventing new words like "amperage" instead of "current" (I, measured in A) or possible "charge" (amperehours, Ah or Coulomb, C). Sometimes I don't even know what people mean.

KimmoK 
Re: Next Freescale high performance PPC chip.
Posted on 30-Oct-2013 10:11:17
#169
Elite Member
Joined: 14-Mar-2003
Posts: 5211
From: Ylikiiminki, Finland

@damocles

"So basically, no body cares about Altivec in 2013."

Almost every mainstream CPU has multimedia instructions, they only have different names.

MMX, SSE, 3DNow, VMX, Altivec, NEON etc...

So, almost everybody cares about Altivec (=multimedia instruction unit).
(and most devs let compiler do the vectorization/optimization)

Last edited by KimmoK on 30-Oct-2013 at 10:12 AM.

_________________
- KimmoK
// For freedom, for honor, for AMIGA
//
// Thing that I should find more time for: CC64 - 64bit Community Computer?

olegil 
Re: Next Freescale high performance PPC chip.
Posted on 30-Oct-2013 12:00:28
#170
Elite Member
Joined: 22-Aug-2003
Posts: 5895
From: Work

@minator

For what it's worth, I understood what he was saying: if you unroll loops you can vectorise scalar math. You just took it one step further WHILE saying he was completely wrong.

_________________
This weeks pet peeve:
Using "voltage" instead of "potential", which leads to inventing new words like "amperage" instead of "current" (I, measured in A) or possible "charge" (amperehours, Ah or Coulomb, C). Sometimes I don't even know what people mean.

olegil 
Re: Next Freescale high performance PPC chip.
Posted on 30-Oct-2013 12:04:26
#171
Elite Member
Joined: 22-Aug-2003
Posts: 5895
From: Work

@damocles

Maybe no body cares, but a lot of minds care.

SIMD is important in 2013, because it means you can process more data per clock and if everyone else uses it then it becomes essential.

For instance, small ARM processors completely suck at general processing but excel at video compression/decompression, all while consuming hardly any power from a tiny battery.

_________________
This weeks pet peeve:
Using "voltage" instead of "potential", which leads to inventing new words like "amperage" instead of "current" (I, measured in A) or possible "charge" (amperehours, Ah or Coulomb, C). Sometimes I don't even know what people mean.

damocles 
Re: Next Freescale high performance PPC chip.
Posted on 30-Oct-2013 12:28:39
#172
Super Member
Joined: 22-Dec-2007
Posts: 1719
From: Unknown

@KimmoK

Quote:
Almost every mainstream CPU has multimedia instructions, they only have different names. MMX, SSE, 3DNow, VMX, Altivec, NEON etc...


No, don't go there. I specifically said Altivec and nothing else. According to NutsAboutAmiga, most AmigaOne and SAM owners do not have CPUs with Altivec. Since it has to be compiled in, just how many Amiga OS4 binaries out there were compiled with Altivec at various Amiga OS4 file depots vs Amiga OS4 binaries compiled without Altivec?

If Trevor is producing A1X#?Ks that do not have Altivec, a tiny population is going to care about it; the vast majority will not.

_________________
Dammy

olegil 
Re: Next Freescale high performance PPC chip.
Posted on 30-Oct-2013 14:22:51
#173
Elite Member
Joined: 22-Aug-2003
Posts: 5895
From: Work

@damocles

Circular argument.

-"Feature X needs improving, as it's just too hard to use."
-"But no one uses that feature, since it's too hard to use."

We have machines with Altivec, therefore developers should be encouraged to utilize it rather than hardware manufacturers being encouraged to drop it.

_________________
This weeks pet peeve:
Using "voltage" instead of "potential", which leads to inventing new words like "amperage" instead of "current" (I, measured in A) or possible "charge" (amperehours, Ah or Coulomb, C). Sometimes I don't even know what people mean.

NutsAboutAmiga 
Re: Next Freescale high performance PPC chip.
Posted on 30-Oct-2013 14:23:11
#174
Elite Member
Joined: 9-Jun-2004
Posts: 12817
From: Norway

@damocles

We don't have any numbers on that, as we do not know the sales figures; it could be 50/50, 60/40 or 40/60, or any other split. What we do know is that there are more CPUs that do not support it than ones that do.

603, 604, G3, AMCC 440, P5040, P5020 and AMCC 460 do not have Altivec.
G4 and PA6T are Altivec compliant.

(G3 and G4 each cover many different CPU models.)
Well, it might be nice to find out; maybe we should start a poll.

But it looks like there are more CPUs, existing and upcoming, that do not support it than ones that do. Unless you are optimizing for your own machine, it's more practical to just do normal assembler optimization instead.

Last edited by NutsAboutAmiga on 30-Oct-2013 at 02:41 PM.
Last edited by NutsAboutAmiga on 30-Oct-2013 at 02:24 PM.

_________________
http://lifeofliveforit.blogspot.no/
Facebook::LiveForIt Software for AmigaOS

damocles 
Re: Next Freescale high performance PPC chip.
Posted on 30-Oct-2013 14:32:01
#175
Super Member
Joined: 22-Dec-2007
Posts: 1719
From: Unknown

@olegil

Quote:
We have machines with Altivec,


Which ones and how many?

_________________
Dammy

NutsAboutAmiga 
Re: Next Freescale high performance PPC chip.
Posted on 30-Oct-2013 14:35:54
#176
Elite Member
Joined: 9-Jun-2004
Posts: 12817
From: Norway

@damocles

There is definitely a huge amount of untapped computing power wasted in most programs.
It would be interesting to see what happened if someone actually spent some time optimizing video and audio codecs, mp3 encoders, display routines in video players, and emulators.

Last edited by NutsAboutAmiga on 30-Oct-2013 at 02:48 PM.
Last edited by NutsAboutAmiga on 30-Oct-2013 at 02:37 PM.

_________________
http://lifeofliveforit.blogspot.no/
Facebook::LiveForIt Software for AmigaOS

damocles 
Re: Next Freescale high performance PPC chip.
Posted on 30-Oct-2013 15:10:41
#177
Super Member
Joined: 22-Dec-2007
Posts: 1719
From: Unknown

@NutsAboutAmiga

Quote:
603, 604, G3, AMCC 440, P5040, P5020 and AMCC 460 do not have Altivec. G4 and PA6T are Altivec compliant. (G3 and G4 each cover many different CPU models.) Well, it might be nice to find out; maybe we should start a poll. But it looks like there are more CPUs, existing and upcoming, that do not support it than ones that do. Unless you are optimizing for your own machine, it's more practical to just do normal assembler optimization instead.


It looks to me that the lack of Altivec support in Trevor's upcoming line of computers will not hurt his future sales at all. Too few Amiga OS4 systems have Altivec, and those few who do, do not have a massive amount of applications/games that require Altivec in the first place.



_________________
Dammy

olegil 
Re: Next Freescale high performance PPC chip.
Posted on 30-Oct-2013 15:31:40
#178
Elite Member
Joined: 22-Aug-2003
Posts: 5895
From: Work

@damocles

It looks to me that lack of Altivec hurts sales of upcoming AmigaOnes day in and day out, judging from the snide remarks on online forums regarding said lack.

On the other hand, PA6T wasn't all that impressive anyway.

As a potential customer, I for one would very much welcome Altivec. But as evidenced by the post I link to in my sig, it's not a deal-breaker for me. Current lack of funds (or rather, abundance of things with higher importance to spend them on) is worse.

_________________
This weeks pet peeve:
Using "voltage" instead of "potential", which leads to inventing new words like "amperage" instead of "current" (I, measured in A) or possible "charge" (amperehours, Ah or Coulomb, C). Sometimes I don't even know what people mean.

KimmoK 
Re: Next Freescale high performance PPC chip.
Posted on 30-Oct-2013 16:58:51
#179
Elite Member
Joined: 14-Mar-2003
Posts: 5211
From: Ylikiiminki, Finland

This made me wonder: has anyone succeeded in accelerating any app with the DSP instructions of the PPC440/PPC460?


And an interesting question for AmigaSW newbies:
How do you compile a binary that internally knows to utilize Altivec when it is available?
(using gcc's Altivec optimization)

Is it possible to tell the compiler which code segments to optimize for Altivec and which not?




(googling)

btw. Web site dedicated to Altivec and NEON optimization.
"For example, did you know that you can do byte swapping with AltiVec 7 times faster than with scalar code? Or that it is possible to sort integers and floats 4 times faster with the help of AltiVec? Were you aware that it helps to do string searching faster? Memory hashing gets upto 7 times faster. The list could just go on and on..." etc....


And a SIMD study kind of doc.


web site I had forgotten (again): http://www.powerdeveloper.org/forums/viewforum.php?f=23
There was a link to an MPC8610 demo. (I wonder what that motherboard costs.)

About DIU (of T1020 chip) by Andreas Wolf: http://www.morphzone.info/modules/newbb_plus/viewtopic.php?forum=11&topic_id=8473&post_id=104864&viewmode=flat&sortorder=0&showonepost=1

UPDATE:
Beyond Bits is interesting to read. (Issue VI shows a roadmap with CPUs beyond the AMP/Txxxx series, but it will be interesting to see whether Freescale can afford that; it most likely depends on AMP series sales etc...)

Price hunting:
http://www.zauba.com/import-BRAND+FREESCALE-hs-code.html

UPDATE:
T1 and T2 products should start to appear in feb/2014:
http://www.sintecs.eu/content/modules
http://edality.by/node/92

Last edited by KimmoK on 10-Feb-2014 at 02:30 PM.
Last edited by KimmoK on 04-Nov-2013 at 11:45 AM.
Last edited by KimmoK on 04-Nov-2013 at 10:51 AM.
Last edited by KimmoK on 31-Oct-2013 at 11:37 AM.
Last edited by KimmoK on 30-Oct-2013 at 05:24 PM.
Last edited by KimmoK on 30-Oct-2013 at 05:22 PM.
Last edited by KimmoK on 30-Oct-2013 at 05:08 PM.
Last edited by KimmoK on 30-Oct-2013 at 05:00 PM.
Last edited by KimmoK on 30-Oct-2013 at 05:00 PM.

_________________
- KimmoK
// For freedom, for honor, for AMIGA
//
// Thing that I should find more time for: CC64 - 64bit Community Computer?

broadblues 
Re: Next Freescale high performance PPC chip.
Posted on 30-Oct-2013 18:07:21
#180
Amiga Developer Team
Joined: 20-Jul-2004
Posts: 4446
From: Portsmouth England

@KimmoK

Quote:


Is it possible to tell the compiler which code segments to optimize for Altivec and which not?



Only if the code is in separate object files. If you enable Altivec then the compiler will apply that optimisation to the general code as well, so you must separate out all your Altivec code from the general code.

Not difficult to do once you know, but decidedly confusing for a while if you don't.
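
As a rough illustration of the separate-object-file approach, here is a C sketch of picking an implementation at run time through a function pointer. Everything named here (add10_scalar, add10_altivec, cpu_has_altivec, select_add10, add10_dispatch) is made up for the example. In a real project the Altivec routine would sit in its own .c file built with the Altivec switch (with GCC on PowerPC that is normally -maltivec), and cpu_has_altivec would ask the operating system; in this sketch both are plain stand-ins so it compiles anywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Plain version; belongs in a file built WITHOUT Altivec flags. */
static void add10_scalar(int32_t *data, size_t n)
{
    for (size_t i = 0; i < n; i++)
        data[i] += 10;
}

/* Stand-in for the Altivec version. Assumption: the real one would live
   in a separate object file compiled with Altivec enabled. Here it does
   the same scalar work so the sketch runs on any machine. */
static void add10_altivec(int32_t *data, size_t n)
{
    for (size_t i = 0; i < n; i++)
        data[i] += 10;
}

/* Hypothetical CPU check; a real program would query the OS instead
   of returning a constant. */
static int cpu_has_altivec(void)
{
    return 0;
}

typedef void (*add10_fn)(int32_t *, size_t);

/* Pick the routine that matches the CPU. */
add10_fn select_add10(int has_altivec)
{
    return has_altivec ? add10_altivec : add10_scalar;
}

/* General code calls this and never touches Altivec directly. */
void add10_dispatch(int32_t *data, size_t n)
{
    static add10_fn impl = NULL;

    if (impl == NULL)
        impl = select_add10(cpu_has_altivec()); /* decided once, on first call */
    assert(impl != NULL);
    impl(data, n);
}
```

The point of the split is that only the one file holding the vector routine is ever built with Altivec enabled, so the rest of the binary still runs on CPUs without it.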


_________________
BroadBlues On Blues BroadBlues On Amiga Walker Broad

Copyright (C) 2000 - 2019 Amigaworld.net.
Amigaworld.net was originally founded by David Doyle