Gunnar 
Re: 32-bit PPC on FPGA
Posted on 11-Feb-2024 16:23:10
#61 ]
Regular Member
Joined: 25-Sep-2022
Posts: 477
From: Unknown

@Karlos

Quote:

Quote:
Maybe you know that I have worked at IBM in PowerPC development ...

What specifically did you do there? Just interested.


The core of the Apollo Team worked together at IBM.
We met and "found" each other during development of the CELL project
at the IBM hardware development center in Böblingen, Germany.

Funnily enough, there was a relatively high number of Amiga fans working in IBM hardware development.
Chris, Jens, Peter, Thomas and me - we found each other there ...
And at lunch and in the coffee breaks we spent a lot of time talking about how nice the Amiga was..
And how much nicer coding on the Amiga was ..
And how much easier we found coding on 68K compared to coding on Cell or on our PPC.

And during these coffee chats - our "evil world domination plan" to revive Amiga was born.

People from our team worked on or had their hands in several of IBM's PowerPC chips..
Including PowerPC 970, Power 4 / Power 7 / Power 8 / A2 / PowerOnEdge

In my opinion the Power architecture has, like any architecture, both strengths and weaknesses.
There are computers where the strengths of the POWER chips are very nice.
I personally don't think that the Amiga home computer is the optimal place for a POWER chip.


What I like very much about the Amiga is that it was a good combination
of a very easy to understand, very programmer-friendly 68k CPU
with the elegant, clever, DMA-based Amiga chipset.
This combination allowed the Amiga to do a lot of cool things..
It could do a lot more than others like Atari, Mac or PC could do at the time
with the same clock rate and amount of memory.
And the lightweight operating system and the Amiga philosophy
of making direct hardware coding easy ...

gave people the opportunity to learn the fundamentals of
how computer chips work and how to use them.
This allowed coders to understand how a CPU works internally
and to master this in their coding.

I think these attributes were the breeding ground for many demo and game coders.
They were very often young people who learned coding not at university
but in their "bedroom" using the Amiga - and these people created the Amiga scene.

Gunnar 
Re: 32-bit PPC on FPGA
Posted on 11-Feb-2024 16:34:05
#62 ]
Regular Member
Joined: 25-Sep-2022
Posts: 477
From: Unknown

@Karlos

Quote:
If you check the original question, you'll see that I was asking whether or not a purpose designed FPGA solution might be an affordable alternative to current PPC/NG offerings. If the end result is a scalar performance no better than and probably worse than the original PowerUp expansions, there's no point, at any price point.



Well, you posted this. ... But was this really your question?

Sorry to ask such a dumb question,
but I thought that today people have some common knowledge about FPGAs ..
and that everyone would know that you can put any CPU in an FPGA ...

And that the FPGA, being an FPGA, will make the CPU run at something like a 10 or 20 times lower clock
than the same CPU reaches in an ASIC.
And while the FPGA runs slower, it also costs a lot more per unit than an ASIC.

So the point of the FPGA is that it is a great development system....
Perfect for perfecting a CPU or chipset ...

I thought you knew this too ... and that your question had a deeper idea behind it.
Like becoming independent of others by creating your own CPU?

Last edited by Gunnar on 11-Feb-2024 at 04:36 PM.

Kronos 
Re: 32-bit PPC on FPGA
Posted on 11-Feb-2024 16:35:29
#63 ]
Elite Member
Joined: 8-Mar-2003
Posts: 2562
From: Unknown

@Gunnar

I'd say "Often a PPC needs 3 or 5 or even 6 or 7 instructions to do what a single 68K instruction does," is pretty much the definition of "forwarding the idea that it would be that much slower".

Sure you can move the goalposts or play pedantic word games, but if you insist that what you wrote shouldn't/couldn't be understood that way you clearly have a deeper problem.

A problem that is well established by your past usage of such obviously false/misleading comparisons.

_________________
- We don't need good ideas, we haven't run out on bad ones yet
- blame Canada

Gunnar 
Re: 32-bit PPC on FPGA
Posted on 11-Feb-2024 16:50:03
#64 ]
Regular Member
Joined: 25-Sep-2022
Posts: 477
From: Unknown

@Kronos

Quote:
I'd say "Often a PPC needs 3 or 5 or even 6 or 7 instructions to do what a single 68K instruction does," is pretty much the definition of "forwarding the idea that it would be that much slower".


Of course the PPC has a disadvantage.
Get over it.

If you do not believe this, make an appointment with the IBM POWERPC design people and talk to them!

That RISC chips have a design disadvantage for INTEGER code
is as much "shocking news" as that fish have a disadvantage in flying.
This is absolutely no news! Everyone knows this. Including IBM and Motorola.
No one at IBM would spend a second arguing against this.

Only zealot fanboys argue this.

But does this matter?
No, it does not matter.

It does not matter, just as a fish's goal is not to fly!
In the same way, the design goal of POWER is NOT to have the best possible integer performance.
It does not matter to IBM users.
They have different goals and seek different features.



POWERPC chips are very strong in FPU.
POWERPC chips are very reliable ... they have extremely good internal error checking,
making them more resistant against gamma rays and more reliable in spotting
miscomputations, which is very important e.g. for running a nuclear power plant or
running the accounts of a bank.
POWERPC chips have their strength in managing huge amounts of memory
and in running very reliable SMP systems.


But Power also has many weaknesses.
* The integer performance has a disadvantage
* Programming on POWER is a lot more difficult, and not only in assembler!
* Power chips have a weak memory model - this makes multithreaded coding more complex,
and many coders had and have serious problems with this.






Kronos 
Re: 32-bit PPC on FPGA
Posted on 11-Feb-2024 17:08:46
#65 ]
Elite Member
Joined: 8-Mar-2003
Posts: 2562
From: Unknown

@Gunnar

Quote:

Gunnar wrote:

Of course the PPC has a disadvantage.


Something I kinda pointed out early on:
https://amigaworld.net/modules/newbb/viewtopic.php?topic_id=45166&forum=33#867427

Were the 1st 601 Macs slower than an 68060?
Yeap.
Was it the bad design of the HW around it, 68k code in the OS or just the inferior PPC design?
I'd say a combination of all the above.
Was it anywhere as dramatic as you suggested?
Nope.

In reality both chips were in the same ballpark, and so would a fake FPGA PPC be compared to the 68080, !IF! the designer puts the same time & skill into it.
"Ballpark" in this case means "something o.k. for the 90s".
Early 90s for 68060 and 601, late 90s for fake versions done in today's FPGA.

And just to clarify, I pretty much stopped caring about specific HW 25 years ago.
Sure, the 68000 was nice for assembler, but once we got to the 040 and good compilers that was pointless in 99.999% of cases.
OCS was a stroke of genius when combined with a low clocked 68000 and 1MB (or less); HW hitting on anything better only created problems with no real benefits.

If AROS hadn't been such a s### show in 2000 I would have gone there; if Amithlon hadn't been killed I might still be on virtual 68k.
Neither did happen.
What did happen is that the project leaders for both OS4 and Natami were very strongly badmouthing MorphOS/PPC in the early 2000s, which I interpreted as them knowing how inferior/late their projects were.
Yeah, zealot indeed

Last edited by Kronos on 11-Feb-2024 at 05:09 PM.

_________________
- We don't need good ideas, we haven't run out on bad ones yet
- blame Canada

Gunnar 
Re: 32-bit PPC on FPGA
Posted on 11-Feb-2024 17:10:06
#66 ]
Regular Member
Joined: 25-Sep-2022
Posts: 477
From: Unknown

@Fl@sh

Quote:
I know most members of this old great forum are graduate in "MC68k architecture", but is quite evident when discussion goes on powerpc no one knows about what he is writing.


Do you think so?
I think you are too optimistic here.

I have no doubt that you find 100% Amiga fans here.
I would be very surprised if over 1% of the people here have serious hard-core coding expertise.

And I do not mean taking free source code, running the makefile and making an EXE.
I mean real development.
Writing and finishing a real game from scratch, or a similar amount of work.
Developing something serious like an mp3 decoder or a video player.


Gunnar 
Re: 32-bit PPC on FPGA
Posted on 11-Feb-2024 17:17:28
#67 ]
Regular Member
Joined: 25-Sep-2022
Posts: 477
From: Unknown

@Kronos

Quote:
Natami were very strongly badmouthing MorphOS/PPC


Maybe you don't know this, but the whole Natami team was made up of IBM people working in PowerPC chip and hardware design..

Do you love PowerPC ?

Do you love chips like the IBM G5 970 PPC?

Do you understand that the People of the Natami team did participate in doing these chips?

Gunnar 
Re: 32-bit PPC on FPGA
Posted on 11-Feb-2024 17:20:43
#68 ]
Regular Member
Joined: 25-Sep-2022
Posts: 477
From: Unknown

@Kronos

Quote:
I would suspect that doing CISC in FPGA might actually be easier than RISC as these are more suitable to do complicated function at one speed compared to doing simple functions faster.


You are totally wrong here.

Making a CISC chip is MANY times more work than making a RISC chip.
For reference, INTEL's CPU team working on 1 core release is more than twice as big as the IBM team working on a new core.

There is a good reason for this.



The key point of RISC is that it is an order of magnitude easier to make.

ppcamiga1 
Re: 32-bit PPC on FPGA
Posted on 11-Feb-2024 17:21:35
#69 ]
Cult Member
Joined: 23-Aug-2015
Posts: 767
From: Unknown

@Gunnar

I don't care if ppc assembler is nice or not.
The last time I wrote some code in assembler was 31 years ago.
I want ppc because I saw an amiga with ppc 27 years ago in 1997
and I have fond memories of it.
So my ideal amiga would be fpga based with a 68k and a ppc core,
with ocs for old games and better graphics for the rest.
no arm no x86
I want the amiga as it was in 1997.
It was a good year for me.

Kronos 
Re: 32-bit PPC on FPGA
Posted on 11-Feb-2024 17:24:51
#70 ]
Elite Member
Joined: 8-Mar-2003
Posts: 2562
From: Unknown

@Gunnar

Quote:

Gunnar wrote:

Do you love PowerPC ?


I don't care about the CPU, so no.

Quote:


Do you understand that the People of the Natami team did participate in doing these chips?


So what?

You twisted yourself into a knot trying to argue that the (not yet existing) Natami would outperform an EFIKA/Radeon combo.

Your definition of "outperforming" was HW hitting hand optimized asm version of a game vs one in a braindead&lazy SDL based SW renderer.
Numbers might have been true, still as dishonest as it can get.

_________________
- We don't need good ideas, we haven't run out on bad ones yet
- blame Canada

Karlos 
Re: 32-bit PPC on FPGA
Posted on 11-Feb-2024 17:27:12
#71 ]
Elite Member
Joined: 24-Aug-2003
Posts: 4403
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition!

@Gunnar

The question wasn't really whether or not it was possible to implement a 32-bit PPC in an FPGA because I naïvely assumed it ought to be simpler to do than a complex CISC design, like the 68080.

The question was whether or not it was practical to do so and by practical I mean possible to produce an implementation that would be capable of delivering usable performance at an acceptable price point. This clearly doesn't seem to be the case.

Having an independent design and options for future ASIC is an interesting prospect but I assumed that was out of reach from the get go.

If I were to design my own CPU for FPGA it would most likely be a realisation of the MC64K bytecode machine, just for the fun of it and in my case, learning how to wield hardware synthesis.

Last edited by Karlos on 11-Feb-2024 at 05:29 PM.

_________________
Doing stupid things for fun...

Kronos 
Re: 32-bit PPC on FPGA
Posted on 11-Feb-2024 17:29:56
#72 ]
Elite Member
Joined: 8-Mar-2003
Posts: 2562
From: Unknown

@Gunnar

Quote:

Gunnar wrote:
@Kronos

Quote:
I would suspect that doing CISC in FPGA might actually be easier than RISC as these are more suitable to do complicated function at one speed compared to doing simple functions faster.


You are totally wrong here.

Making a CISC chip is MANY times more work than making a RISC chip.
For reference, INTEL's CPU team working on 1 core release is more than twice as big as the IBM team working on a new core.

There is a good reason for this.



The key point of RISC is that it is an order of magnitude easier to make.


And again you completely miss the mark....
It is not about making real chips.
It is not about making an "it kinda works" FPGA version.
It is about making an FPGA CPU that gets the optimal performance.

And yes, everything I know and everything you stated in this thread suggests CISC is more suitable for FPGA, hence it should be easier to get the same performance out of it compared to RISC.

_________________
- We don't need good ideas, we haven't run out on bad ones yet
- blame Canada

Gunnar 
Re: 32-bit PPC on FPGA
Posted on 11-Feb-2024 17:34:46
#73 ]
Regular Member
Joined: 25-Sep-2022
Posts: 477
From: Unknown

@Kronos

Quote:
Were the 1st 601 Macs slower than an 68060? Yeap. Was it the bad design of the HW around it, 68k code in the OS or just the inferior PPC design? I'd say a combination of all the above. Was it anywhere as dramatic as you suggested?



Did you see the CPU fingerprints that I posted before?

Let's take a look at this one:


This is a "fingerprint" of running some tests on a Pentium 4 core.

What does it tell you?

It tells you that the CPU has both strengths and weaknesses.
Some operations it can do pretty well.
In some operations it's rather weak.


Now let's compare this to this CPU:



Can you see that the Opteron is more "well-rounded" - with fewer weaknesses?



Now if you look at "one weakness" of the Pentium 4 - the Opteron might be 10 times better in this field.
And for this single result, this is a true fact.

But it does not mean that the Opteron is always 10 times faster in all regards.

I posted the "fingerprints" with the intention of helping you guys see that a CPU is not 1 number.
I think once you see the picture, everyone will understand this and understand what I mean.

Yes, CPUs have advantages and disadvantages.

Yes, PowerPC has a disadvantage in code density.
Yes, PowerPC has a disadvantage in INTEGER performance.
Yes, PowerPC has a serious disadvantage, a challenge, with its weak memory model when coding multithreaded applications.

These are facts.
Nothing more, nothing less.



Karlos 
Re: 32-bit PPC on FPGA
Posted on 11-Feb-2024 17:41:28
#74 ]
Elite Member
Joined: 24-Aug-2003
Posts: 4403
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition!

@Gunnar

Integer performance is a bit of a vague term. Which specific aspects of integer is it bad at? Addressing? Branching? Indirection?

I assume it's not particularly bad at basic arithmetic and logic.

_________________
Doing stupid things for fun...

Kronos 
Re: 32-bit PPC on FPGA
Posted on 11-Feb-2024 17:43:01
#75 ]
Elite Member
Joined: 8-Mar-2003
Posts: 2562
From: Unknown

@Gunnar

The "fingerprints" are nice and show that unless you cherry pick real hard both are in the same ballpark

Not that it would matter if you were to redo them in FPGA.

You would do what you have done with the 68080 vs a real 680x0. You know: optimize the pipeline, add more registers or cache, speed up execution of critical opcodes, leave out obsolete ones not needed for the application and so on..

_________________
- We don't need good ideas, we haven't run out on bad ones yet
- blame Canada

Gunnar 
Re: 32-bit PPC on FPGA
Posted on 11-Feb-2024 17:47:33
#76 ]
Regular Member
Joined: 25-Sep-2022
Posts: 477
From: Unknown

@Kronos

Quote:
And again you completely miss the mark....
It is not about making real chips.
It is not about making a "it kinda works" FPGA version.
It is about making a FPGA CPU that gets the optimal performance.



Let me try to explain this to you.

* Making it "kinda work" is the real work.
And this takes many times longer for a CISC than for a RISC.


I can explain this to you.

On a 68k you have something like 14 addressing modes.
Let's say you have 100 different instructions.
This makes 1,400 test cases to verify that your design is working!
Let's say you support 3 sizes, B/W/L.
This makes 4,200 test cases to verify that your design is working!

Let's say you can schedule 2 instructions per cycle.
That makes 4,200 x 4,200 = roughly 17,000,000 = 17 million combinations.
So you now need 17 million test cases.

Let's say your pipeline has 6 stages, in which instructions can influence each other.

You now have 17 million times 17 million times 17 million times
17 million times 17 million times 17 million combinations.

Do you see the problem here?


Why is RISC simpler?
Each instruction supports only registers.
Each instruction supports only 1 size.

Let's say your RISC design has 60 instructions.
You only need 60 test cases now.

Let's say you can schedule 2 instructions per cycle.
That makes 60 x 60 = 3,600 combinations.

I hope it is crystal clear now how much less work making a RISC chip is..
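
The arithmetic above can be checked with a few lines of Python. This is only a sketch using the round numbers from the post (100 instructions, 14 addressing modes, 3 sizes, 60 RISC instructions); the function names are made up for illustration, and the figures are estimates, not measured verification effort.

```python
# Back-of-the-envelope count of instruction combinations, using the
# illustrative round numbers from the post above (not measured data).

def single_instruction_cases(instructions, addressing_modes=1, sizes=1):
    """Distinct instruction forms: opcode x addressing mode x operand size."""
    return instructions * addressing_modes * sizes


def paired_cases(single_cases, issue_width=2):
    """Ordered combinations when issue_width instructions issue together."""
    return single_cases ** issue_width


cisc_single = single_instruction_cases(100, addressing_modes=14, sizes=3)
risc_single = single_instruction_cases(60)

cisc_pairs = paired_cases(cisc_single)
risc_pairs = paired_cases(risc_single)

print(cisc_single, cisc_pairs)   # 4200 17640000
print(risc_single, risc_pairs)   # 60 3600
print(cisc_pairs // risc_pairs)  # 4900
```

With these assumed numbers the pairwise case count differs by a factor of 4,900 before pipeline interactions are even considered.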



If you need more examples ...
Think about how easy it is for a RISC chip to see where the 2nd instruction starts for decoding.
And how complicated this is for a CISC.


The APOLLO 68080 can decode up to 4 instructions per cycle.
The 68k instructions have variable length...
To know where the 4th instruction is in the stream you need to know
where the 2nd starts, and based on this you need to have decoded where the 3rd starts ..

Do you understand the problem?
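
A rough way to picture the decode dependency is the Python sketch below. It uses a toy encoding that is NOT the real 68k format: by assumption, the first word of each instruction simply holds that instruction's length in 16-bit words. The point is only that each start offset depends on having examined the previous instruction, while fixed-length starts can be computed independently.

```python
# Toy model of finding instruction boundaries. The encoding is hypothetical:
# the first word of every instruction stores its length in 16-bit words.

def variable_length_starts(stream, count):
    """Start offsets (in words) of the first `count` instructions.

    Each start is known only after the previous instruction's length
    has been examined, so the chain is inherently sequential.
    """
    starts = []
    pos = 0
    for _ in range(count):
        if pos >= len(stream):
            break
        starts.append(pos)
        pos += stream[pos]  # length comes from the instruction itself
    return starts


def fixed_length_starts(count, width=4):
    """Start offsets (in bytes) for a fixed-width encoding.

    Every offset is a simple multiple, computable without decoding anything.
    """
    return [i * width for i in range(count)]


# Four toy instructions of 1, 3, 2 and 5 words, followed by a fifth.
stream = [1, 3, 0, 0, 2, 0, 5, 0, 0, 0, 0, 1]

print(variable_length_starts(stream, 4))  # [0, 1, 4, 6]
print(fixed_length_starts(4))             # [0, 4, 8, 12]
```

Real hardware attacks this chain with parallel length pre-decoding rather than a loop, which is where the extra design and verification effort goes.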

matthey 
Re: 32-bit PPC on FPGA
Posted on 11-Feb-2024 18:10:43
#77 ]
Elite Member
Joined: 14-Mar-2007
Posts: 2007
From: Kansas

umisef Quote:

See, that's exactly why I was thinking of you when reading the thread you started.

That's a 10 byte instruction, which involves both a memory 4 byte read and a 4 byte memory write in addition to the 10 byte instruction read. That "1 cycle" is extremely generous towards any 68k implementation...


Gunnar is MEGA_RJ_MICAL (notice the MEGA_RJ_MICAL account is active recently and he sent me a PM telling me he was not Gunnar). According to the real Gunnar, I believe the AC68080 CPU core can execute "add.l #100000,myscore" in a single cycle (MEGA_RJ_MICAL is copying and pasting real Gunnar posts). The 68060 can't execute this instruction in a single cycle because it is limited to 6 bytes/cycle in each execution pipeline which is a rather low limit to save transistors and power back then. The 68040 handles large instructions and simple addressing modes well but many instructions are not single cycle. I believe both the 68040 and 68060 need 2 cycles to execute this instruction which is not bad but a higher performance 68k likely could execute this instruction in a single cycle with the data in L1 cache. The 68040 and 68060 can already execute "addq.l #1,myscore" in a single cycle.

umisef Quote:

...and at the same time, the "it will need 6 instructions" is assuming the absolute worst implementation:
1) load address of "myscore" into register
2) read current value of myscore into second register
3+4) perform a 32 bit immediate load to 3rd register
5) add 3rd register to 2nd
6) store new value into "myscore".

What actually happens is that either steps 3+4 are implemented as a single 32 bit PC-relative read (i.e. a single instruction), or, for this particular example, instructions 3/4/5 are implemented as two immediate adds (addis/addi), so two instructions only.


Loading 32 bit integer constants from PC relative code hunks is not worth it for most RISC cores. There is a load-to-use stall after each load including after instruction 2 in your diagram above. Many early PPC cores were able to keep this small, usually a single cycle, by using a shallow pipeline but this is not true of later PPC cores with deeper pipelines. OoO execution may be able to remove some load-to-use stall cycles but PPC OoO execution is usually limited and instruction scheduling highly recommended. Most 68k and x86 CISC cores avoid load-to-use stalls. It is true that instructions 3-5 may be able to be reduced from 3 to 2 instructions in some cases where the addi instruction immediate encoding bits are close to enough. This is a tedious process to determine for assembler coding as your lack of evaluation shows but compilers are good at this type of optimization. The PPC code is 5-6 dependent instructions with a load-to-use penalty. The 68040, 68060 and AC68080 can execute this instruction in 1-2 cycles where PPC requires 5-6 instructions and most PPC cores require 6+ cycles. This is a useful instruction and is not an extreme example.
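
For anyone wondering how a 32-bit constant such as 100000 ends up as an addis/addi pair: the low 16 bits are used as a signed immediate, so the high half has to be adjusted whenever the low half is "negative". The Python sketch below shows that split; the function name is hypothetical, and it models only the arithmetic (the same adjustment the @ha/@l operators of PowerPC assemblers express), not any particular compiler's output.

```python
# Sketch: split a 32-bit constant into the two 16-bit immediates of an
# addis/addi pair. The function name is made up for illustration.

def split_for_addis_addi(value):
    """Return (high, low) with ((high << 16) + low) mod 2**32 == value.

    `low` is a signed 16-bit immediate; `high` is the 16-bit field that is
    shifted left by 16, adjusted to compensate for sign extension of `low`.
    """
    value &= 0xFFFFFFFF
    low = value & 0xFFFF
    if low >= 0x8000:
        low -= 0x10000  # the low immediate is sign-extended by the CPU
    high = ((value - low) >> 16) & 0xFFFF
    return high, low


high, low = split_for_addis_addi(100000)
print(high, low)                        # 2 -31072
print(((high << 16) + low) & 0xFFFFFFFF)  # 100000
```

So 100000 (0x186A0) becomes "add 2 << 16, then add -31072", which is why the split is tedious to work out by hand but trivial for a compiler.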

umisef Quote:

Another notable difference between the two is that if you want to do it again, the 68k will need another 10-byte 68k instruction, with another 4 byte read and a 4 byte write. In contrast, the PPC merely needs to repeat instruction 5 and 6, because it has all the relevant state still in its large register file.


There are plenty of times when multiple accesses are not needed. In cases where they are, temp pointers and immediates can be stored in registers reducing the instruction size and still using significantly fewer instructions than PPC.

umisef Quote:

And that's before doing anything smart. If updating a score was an operation that was done often enough to matter (which it won't be, ever, on any game, on any processor), one would simply dedicate one of the PPC's many registers to hold that value throughout the program, immediately removing the need for instructions (1), (2) and (6). And while the PPC does need a few scratch registers in places the 68k doesn't, that "few" is a long way away from 16, which is how many more general purpose registers the PPC ISA offers compared to the 68k.


PPC has a small advantage when executing code on data in registers due to having more registers but the 68k has a large advantage when executing code accessing data in caches/memory.

umisef Quote:

So you are comparing hand-optimised 68k code (gcc 68k will emit two instructions for this, a LEA followed by an ADD.L #100000,(A0)) with an unrealistically convoluted implementation on the PPC (gcc will happily produce the 5 instruction version, without any hints or help), while making unrealistic claims about it taking "a single cycle".


A hand optimized single instruction? What version of GCC are you using for the 68k? It should not be producing the code with 2 instructions you suggest unless there are multiple memory accesses to the same location. The large data model is the default in 68k compilers and it uses (xxx).L addressing extensively including with large immediates. This is not a complex instruction using a complex addressing mode but a simple instruction and addressing mode that even cores in FPGAs can handle with ease. The immediate and addressing mode require no ALU calculations with only the ADD.L requiring an ALU calculation. Instructions with a fixed length encoding miss many opportunities to execute simple instructions.

umisef Quote:

Meanwhile, a similarly simple line myscore=10*myscore+myscore/3; ends up being 10 PPC instructions, but 13 68k instructions, when compiled.


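moveq #10,d0          ; d0 = 10
move.l myscore,d1     ; d1 = myscore
mulu.l d1,d0          ; d0 = 10*myscore
divu.l #3,d1          ; d1 = myscore/3
add.l d1,d0           ; d0 = 10*myscore + myscore/3
move.l d0,myscore     ; store the result back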

Modern GCC 68k compilers must really be on the decline but that is no surprise. Maybe vbcc is better than GCC for the 68k now.

Edit: I thought the /3 was an editor code the first time. It's still 6 instructions instead of 13. The GCC 68k code by default uses division magic number code to remove the division, which is why it is so long. Entering a compiler argument of -m68060 will do away with the magic number code since the 68060 has no hardware 32x32=64 multiply instruction. This shows how commonly this optimization is used and why it was so bad to remove 32x32=64 from the 68060. The POWER code also uses division magic code.
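
For readers who have not met the "magic number" trick: an unsigned 32-bit division by 3 can be replaced by a multiply with a precomputed constant followed by a shift. The Python sketch below illustrates the idea only; the constant names are mine, and real compiler output differs in detail. The full product is wider than 32 bits, which is why a 32x32=64 multiply is needed.

```python
# Sketch of unsigned 32-bit division by 3 via a "magic number" multiply.
# MAGIC is ceil(2**33 / 3); names are chosen for illustration.

MAGIC = 0xAAAAAAAB
SHIFT = 33


def div3_by_multiply(x):
    """Compute x // 3 for 0 <= x < 2**32 with one multiply and one shift."""
    if not 0 <= x <= 0xFFFFFFFF:
        raise ValueError("x must be an unsigned 32-bit value")
    # The product needs up to 64 bits; only its upper bits are kept.
    return (x * MAGIC) >> SHIFT


for x in (0, 1, 2, 3, 100000, 0x7FFFFFFF, 0xFFFFFFFF):
    assert div3_by_multiply(x) == x // 3

print(div3_by_multiply(100000))  # 33333
```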

Last edited by matthey on 11-Feb-2024 at 10:28 PM.

Gunnar 
Re: 32-bit PPC on FPGA
Posted on 11-Feb-2024 18:11:47
#78 ]
Regular Member
Joined: 25-Sep-2022
Posts: 477
From: Unknown

@Karlos

Quote:
Integer performance is a bit of a vague term.
Which specific aspects of integer is it bad at?
Addressing? Branching? Indirection?



Integer performance is the term for "normal" program execution.

Let me explain how I see this:

The general design idea behind the 68K architecture
is to give the programmer a CPU which has some key features:

* very readable for the coder = very easy to understand
* very powerful EA modes = making working with memory very easy
* nearly every instruction can work directly with memory
* very powerful instructions that can do complex stuff in a single instruction, for example BFFFO

This gives you as an Amiga coder a great CPU, but it has the disadvantage that developing and testing the CPU is hellishly complex = very difficult for Motorola.


The RISC goal is cheating.
The goal is to make testing the CPU 100 times easier.
This makes developing a CPU a lot easier for IBM...

Yes, the RISC instructions are weaker...
and you often need many RISC instructions to do what one 68K instruction can do.


How can RISC fix this?
One solution is doing more instructions in parallel..
So instead of 1 CISC instruction, the RISC does 2 instructions per cycle to compensate ..

This is trying to win by sheer numbers.
> Instead of one American Abrams tank ... you send four T-72s into the battle...
> Instead of 1 Samurai you send 4 bandits..

But here comes an interesting fact..
You cannot scale this infinitely.
If you look at IBM PowerPC chips then you see
that the 440 does 2 instructions per cycle
that the 750 does 2 instructions per cycle
that the 970 does 2 instructions per cycle

Yes, some high-end Power chips can do up to 4 per cycle ...
but this is the end of the scaling ...
Scaling to more instructions per cycle becomes very, very difficult.
A big problem is the internal CPU forwarding networks.
They already become a gargantuan effort with 4 instructions.

The 68K idea is to have stronger, more capable instructions..
More Abrams, more Samurai ..
Would you rather have a cart pulled by 4 hens, or by 2 horses?

And yes, developing a CPU that can do several powerful CISC instructions per cycle is very difficult. A lot more difficult than developing a CPU that can do several RISC instructions per cycle.

But such a CPU can be developed..
The Apollo 68080 can do several powerful CISC instructions per cycle.


As you saw in the movie, the 7 Samurai did beat many more bandits...

This is the same idea behind CISC.
Use stronger instructions...


Let's get this clear here ...
The POWERPC design goal always was to have stronger instructions than most other RISCs.
So IBM and MOTOROLA also understood this very well.

IBM and MOTO aimed at making a RISC which is less RISC..
The goal of POWERPC always was to be more CISC than the other RISCs.

But still the PPC often needs several instructions to do what a real CISC can do in 1 instruction.

addq.l #1,time -- one simple 68K instruction

CPUs have registers ... let's say you have 32 registers..
This means the CPU can keep 32 variables inside itself...
All the rest of the program is "outside" the CPU ... it is in memory...

Let's say your memory is 500 MB ...
While you have 32 variables inside ... there are still hundreds of millions of variables "outside" the CPU in memory...

For the 68k this is not a real problem, as basically any instruction can always work with memory...
A RISC has a drawback here: to work with the 99% of your program which is outside the CPU,
the RISC CPU always needs several instructions - typically 3.

Did this explanation help a little?

CPU architecture and design is a complex topic

Kronos 
Re: 32-bit PPC on FPGA
Posted on 11-Feb-2024 18:20:27
#79 ]
Elite Member
Joined: 8-Mar-2003
Posts: 2562
From: Unknown

@Gunnar

So what you are saying is that for CISC you can utilize the parallel nature of an FPGA better than for a RISC CPU.

Sure a lot of work and complicated too.

For RISC you might "finish" earlier but leaving a portion of the theoretical performance on the floor (that being compared to if you were doing an optimal design for FPGA with no compatibility concerns whatsoever).

But in reality both will be decades behind real CPUs, whether it is 25years for one or 30 for the other makes no difference in the great scheme of things.

What does make a difference (and the whole thread pointless from the start) is that PPC "ended" a decade later, a decade with massive improvements resulting in an FPGA CPU being the fastest "HW" 68k (omitting ColdFire) while not even being able to touch the bottom end of NG supported PPCs.

_________________
- We don't need good ideas, we haven't run out on bad ones yet
- blame Canada

Gunnar 
Re: 32-bit PPC on FPGA
Posted on 11-Feb-2024 18:25:40
#80 ]
Regular Member
Joined: 25-Sep-2022
Posts: 477
From: Unknown

We talked a lot now.


Let me give you my experience of what helps to make a good CPU

* compact code= helps to get more from your cache
* being able to do more per single instruction= being strong! = is very good
* many instruction per cycle = is good
* more operations per instruction = SIMD = is very good
* doing more in one = 3 operand support
* having good EA modes = is very useful
* having many or enough register = is very good
* Having powerful caches - that support several access per cycle = is very good
* Good branch prediction
* good internal forwarding ..
* Having either a design that performs good without OOO - or if not then having strong OOO
* Having smart cache pre-fetching


There are good and bad PowerPC implementations.
For example the PowerPC 405 is very bad compared to the PowerPC 970.
A good PowerPC CPU implementation will tick many of the above items.

Many 68K CPUs will tick a number of the above, but not all..
All old 68k lack
-SIMD
-3-operand support
-very many registers
-prefetching

This makes many old 68k "medium" in my experience.



I try to make an excellent CPU with the Apollo 68080.
The Apollo 68080 ticks EACH and EVERY one of the above attributes.

Copyright (C) 2000 - 2019 Amigaworld.net.
Amigaworld.net was originally founded by David Doyle