Poster | Thread |
Karlos
|  |
Re: 32-bit PPC on FPGA Posted on 11-Feb-2024 10:59:06
| | [ #41 ] |
|
|
 |
Elite Member  |
Joined: 24-Aug-2003 Posts: 4843
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition! | | |
|
| @Gunnar
Quote:
And if you get the pocket-money to make a real ASIC - you could get it 10 times higher clockrate |
The master plan for the 68100 revealed...?
Somewhere out there, Matthey just had a rapture moment._________________ Doing stupid things for fun... |
|
Status: Offline |
|
|
Karlos
|  |
Re: 32-bit PPC on FPGA Posted on 11-Feb-2024 11:06:49
| | [ #42 ] |
|
|
 |
Elite Member  |
Joined: 24-Aug-2003 Posts: 4843
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition! | | |
|
| @Fl@sh
Quote:
To add a 16-bit constant to a PowerPC register you need one instruction; for a 32-bit value you need two. Loading a 32-bit constant is not that frequent a case, and it's not a real problem with the PowerPC architecture |
To be fair to him, that wasn't the example Gunnar gave. He added a 32 bit constant to a direct memory location. Which is part of the point of CISC, but not a feature of typical load/store architectures. So if you include the need to load the target operand and write it back again, that's still four operations including the two you suggest are necessary._________________ Doing stupid things for fun... |
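The four-operation count can be sketched with a toy load/store machine. This is only an illustration: the op names ("load", "add_imm", "store"), the register name and the starting score of 500 are made up for the example and are not real PowerPC mnemonics or values from the thread. The two immediates are one valid way of writing 100000 as an upper part plus a signed 16-bit lower part (131072 - 31072).

```python
# Toy model of a load/store machine: arithmetic only touches registers,
# so updating a value in memory takes a load, the adds, and a store.
# Op names and register names are hypothetical, chosen for illustration.

def run(program, memory):
    """Execute the toy program and return how many operations it took."""
    regs = {}
    for op, *args in program:
        if op == "load":        # reg <- memory[name]
            reg, name = args
            regs[reg] = memory[name]
        elif op == "add_imm":   # reg <- reg + constant
            reg, value = args
            regs[reg] += value
        elif op == "store":     # memory[name] <- reg
            reg, name = args
            memory[name] = regs[reg]
        else:
            raise ValueError(f"unknown op: {op}")
    return len(program)

memory = {"myscore": 500}       # assumed starting score
program = [
    ("load", "r3", "myscore"),
    ("add_imm", "r3", 2 << 16),  # upper part of 100000
    ("add_imm", "r3", -31072),   # lower part, as a signed 16-bit value
    ("store", "r3", "myscore"),
]
executed = run(program, memory)
print(executed, memory["myscore"])
```

The model says nothing about cycles; it only counts operations, which is the point being made about load/store designs versus a CISC memory-operand add.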
|
Status: Offline |
|
|
OneTimer1
|  |
Re: 32-bit PPC on FPGA Posted on 11-Feb-2024 12:12:54
| | [ #43 ] |
|
|
 |
Super Member  |
Joined: 3-Aug-2015 Posts: 1141
From: Germany | | |
|
| @Gunnar
Any plans for selling 68k compatible CPUs for embedded industries?
You would just have to reach clocks above 300 MHz and have some integrated DDR2 (or better) memory interfaces, I2C, SPI, UART and CAN on board, plus an IEEE-compliant FPU. |
|
Status: Offline |
|
|
V8
|  |
Re: 32-bit PPC on FPGA Posted on 11-Feb-2024 12:26:36
| | [ #44 ] |
|
|
 |
Regular Member  |
Joined: 30-Mar-2022 Posts: 138
From: Unknown | | |
|
| @NutsAboutAmiga
Quote:
Well, they are going to sell SIN and Heretic II to all markets. |
Maybe steffen can forgo some of that money and use it to pay some of the developers that he unlube-dildoed when he was at Hyperion?
Or maybe steffen will just, fuck joerg and fuck the others, this is my money. |
|
Status: Offline |
|
|
umisef
|  |
Re: 32-bit PPC on FPGA Posted on 11-Feb-2024 13:16:35
| | [ #45 ] |
|
|
 |
Super Member  |
Joined: 19-Jun-2005 Posts: 1714
From: Melbourne, Australia | | |
|
| @Gunnar
Quote:
add.l #100000,myscore
The PowerPC will need more instructions to do the same work. In this case it will need 6 instructions!
This one instruction needs 1 cycle on 68080.
|
See, that's exactly why I was thinking of you when reading the thread you started.
That's a 10 byte instruction, which involves both a memory 4 byte read and a 4 byte memory write in addition to the 10 byte instruction read. That "1 cycle" is extremely generous towards any 68k implementation...
...and at the same time, the "it will need 6 instructions" is assuming the absolute worst implementation: 1) load address of "myscore" into register 2) read current value of myscore into second register 3+4) perform a 32 bit immediate load to 3rd register 5) add 3rd register to 2nd 6) store new value into "myscore".
What actually happens is that either steps 3+4 are implemented as a single 32 bit PC-relative read (i.e. a single instruction), or, for this particular example, instructions 3/4/5 are implemented as two immediate adds (addis/addi), so two instructions only.
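As a rough worked example of the addis/addi case: the constant has to be split into an upper part (shifted left by 16 by addis) and a lower part that addi treats as a signed 16-bit immediate. The helper name below is hypothetical, and the sketch only does the arithmetic; it assumes the upper part itself fits the instruction's immediate field and does not emit or check real assembly.

```python
def split_for_addis_addi(value):
    """Split a constant into (high, low) with (high << 16) + low == value,
    where low fits a signed 16-bit immediate."""
    low = value & 0xFFFF
    if low >= 0x8000:
        # The low immediate is sign-extended, so values with bit 15 set
        # are expressed as a negative number and the high part is bumped.
        low -= 0x10000
    high = (value - low) >> 16  # exact: value - low is a multiple of 65536
    return high, low

high, low = split_for_addis_addi(100000)
print(high, low)
```

For 100000 (0x186A0) the low half has bit 15 set, so the split is not simply "upper 16 bits, lower 16 bits": the high part becomes 2 and the low part becomes negative.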
Another notable difference between the two is that if you want to do it again, the 68k will need another 10-byte 68k instruction, with another 4 byte read and a 4 byte write. In contrast, the PPC merely needs to repeat instruction 5 and 6, because it has all the relevant state still in its large register file.
And that's before doing anything smart. If updating a score was an operation that was done often enough to matter (which it won't be, ever, on any game, on any processor), one would simply dedicate one of the PPC's many registers to hold that value throughout the program, immediately removing the need for instructions (1), (2) and (6). And while the PPC does need a few scratch registers in places the 68k doesn't, that "few" is a long way away from 16, which is how many more general purpose registers the PPC ISA offers compared to the 68k.
So you are comparing hand-optimised 68k code (gcc 68k will emit two instructions for this, a LEA followed by an ADD.L #100000,(A0)) with an unrealistically convoluted implementation on the PPC (gcc will happily produce the 5 instruction version, without any hints or help), while making unrealistic claims about it taking "a single cycle".
Meanwhile, a similarly simple line myscore=10*myscore+myscore/3; ends up being 10 PPC instructions, but 13 68k instructions, when compiled.
For anyone interested in playing with this, the compiler explorer is a lovely, and somewhat less subjective, resource.
Last edited by umisef on 11-Feb-2024 at 01:19 PM. Last edited by umisef on 11-Feb-2024 at 01:17 PM.
|
|
Status: Offline |
|
|
WolfToTheMoon
|  |
Re: 32-bit PPC on FPGA Posted on 11-Feb-2024 13:22:11
| | [ #46 ] |
|
|
 |
Super Member  |
Joined: 2-Sep-2010 Posts: 1410
From: CRO | | |
|
| There is an open-source PPC SoC in development
https://libre-soc.org/
They have a goal of producing a Pi-like SBC featuring a 22/28 nm SoC.
So maybe an FPGA will not be needed, but... considering the state of OS4, possibly not even new and truly cheap hardware will make much of a difference.
_________________
|
|
Status: Offline |
|
|
NutsAboutAmiga
|  |
Re: 32-bit PPC on FPGA Posted on 11-Feb-2024 13:23:45
| | [ #47 ] |
|
|
 |
Elite Member  |
Joined: 9-Jun-2004 Posts: 12960
From: Norway | | |
|
| @V8
Steffen is not responsible for Ben paying or not paying the other developers.
Sure, I can agree that it would have been better if he had done this work when Hyperion was struggling to find money. In any case, I think ODM-like agreements with individual contractors were a bad idea; Ben should never have agreed to contracts like that, but he did.
It's mostly resolved by it being sold from AmiStore instead, or being paid for by Trevor instead. Last edited by NutsAboutAmiga on 11-Feb-2024 at 01:30 PM.
_________________ http://lifeofliveforit.blogspot.no/ Facebook::LiveForIt Software for AmigaOS |
|
Status: Offline |
|
|
Karlos
|  |
Re: 32-bit PPC on FPGA Posted on 11-Feb-2024 13:27:13
| | [ #48 ] |
|
|
 |
Elite Member  |
Joined: 24-Aug-2003 Posts: 4843
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition! | | |
|
| @umisef
Does the instruction fetch count here? Certainly for the worst case sure, but in a typical case, you might reasonably assume the instruction and immediate operand are already resident in the instruction cache, assuming code is being loaded into it via line transfers*
*Also applies to PPC version. Last edited by Karlos on 11-Feb-2024 at 01:29 PM. Last edited by Karlos on 11-Feb-2024 at 01:28 PM. Last edited by Karlos on 11-Feb-2024 at 01:27 PM.
_________________ Doing stupid things for fun... |
|
Status: Offline |
|
|
NutsAboutAmiga
|  |
Re: 32-bit PPC on FPGA Posted on 11-Feb-2024 13:33:26
| | [ #49 ] |
|
|
 |
Elite Member  |
Joined: 9-Jun-2004 Posts: 12960
From: Norway | | |
|
| |
Status: Offline |
|
|
Gunnar
|  |
Re: 32-bit PPC on FPGA Posted on 11-Feb-2024 13:58:28
| | [ #50 ] |
|
|
 |
Cult Member  |
Joined: 25-Sep-2022 Posts: 512
From: Unknown | | |
|
| @umisef
Quote:
And while the PPC does need a few scratch registers in places the 68k doesn't, that "few" is a long way away from 16, which is how many more general purpose registers the PPC ISA offers compared to the 68k. |
Yes, more registers are always good. This is also why the Apollo 68080 CPU has more registers than the PPC.
The Apollo 68080 has 32 data registers and 16 address registers.
|
|
Status: Offline |
|
|
Gunnar
|  |
Re: 32-bit PPC on FPGA Posted on 11-Feb-2024 14:06:20
| | [ #51 ] |
|
|
 |
Cult Member  |
Joined: 25-Sep-2022 Posts: 512
From: Unknown | | |
|
| @umisef
Quote:
That "1 cycle" is extremely generous towards any 68k implementation... |
The example of add.l #100000,myscore executing as one instruction is a real-world result.
The Apollo 68080 can execute this instruction every clock cycle.
This means that, running at 100 MHz, the Apollo CPU can execute 100 million such instructions and, in addition, also do 100 million floating-point divides per second.
Let me explain how this works:
A modern CPU executes instructions in a so-called "pipeline". Using pipelining, a CPU "splits" the work of an instruction into several parts. Let's say you split the execution of the instruction into 8 steps.
Each cycle you can feed a new instruction into the pipeline, and each cycle an instruction finishes.
Pipelining is how every modern CPU is designed. Modern Intel chips are designed like this, modern IBM POWER chips are designed like this, and the 68080 is designed like this too.
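The throughput arithmetic behind this can be sketched with an idealised model. It assumes a simple in-order pipeline that accepts one instruction per cycle; the function names and the stall_cycles parameter are illustrative assumptions, not a description of the 68080 or of any PowerPC core. The first instruction needs as many cycles as there are stages, each further one completes a cycle later, and any stalls are simply added on top.

```python
def pipeline_cycles(instructions, stages, stall_cycles=0):
    """Cycles to finish `instructions` on an idealised in-order pipeline
    that issues one instruction per cycle; stall cycles are added on top."""
    if instructions == 0:
        return 0
    return stages + (instructions - 1) + stall_cycles

def instructions_per_second(clock_hz, instructions, stages, stall_cycles=0):
    """Average throughput over the whole run at the given clock rate."""
    cycles = pipeline_cycles(instructions, stages, stall_cycles)
    return clock_hz * instructions / cycles

# 8 stages at 100 MHz, one million instructions, no stalls:
ideal = instructions_per_second(100_000_000, 1_000_000, 8)
# Same run, but assume one extra stall cycle for every second instruction:
stalled = instructions_per_second(100_000_000, 1_000_000, 8, stall_cycles=500_000)
print(round(ideal), round(stalled))
```

With no stalls the model approaches one instruction per cycle regardless of pipeline depth, which is the "100 million per second at 100 MHz" figure; once stalls are counted, throughput drops in proportion to the extra cycles.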
|
|
Status: Offline |
|
|
Gunnar
|  |
Re: 32-bit PPC on FPGA Posted on 11-Feb-2024 14:25:27
| | [ #52 ] |
|
|
 |
Cult Member  |
Joined: 25-Sep-2022 Posts: 512
From: Unknown | | |
|
| @umisef
...and at the same time, the "it will need 6 instructions" is assuming the absolute worst implementation:
1) load address of "myscore" into register 2) read current value of myscore into second register 3+4) perform a 32 bit immediate load to 3rd register 5) add 3rd register to 2nd 6) store new value into "myscore".
What actually happens is that either steps 3+4 are implemented as a single 32 bit PC-relative read (i.e. a single instruction), or, for this particular example, instructions 3/4/5 are implemented as two immediate adds (addis/addi), so two instructions only.
In two instructions? Your post claims "fantasy features" which are not guaranteed by the PowerPC ISA. In fact, there are many PowerPC implementations that suffer from pipeline stalls running this code, and would take even LONGER than 6 cycles.
But this was not my point. My point was that the PPC has some design advantages and also some "planned" design disadvantages. And if you have a 100 MHz softcore of, e.g., a PowerPC 4xx CPU, your integer performance is at a disadvantage.
If you want to talk about PowerPC pipeline designs and their advantages or weaknesses, we can do this.
How about you write out exactly the PPC assembly that you think this needs? Then I can tell you how the pipeline of an IBM PowerPC 4xx or an IBM PowerPC 750 executes it and how many cycles it needs in reality.
And I can explain to you what problems these pipelines might run into.
Last edited by Gunnar on 11-Feb-2024 at 04:04 PM.
|
|
Status: Offline |
|
|
Kronos
|  |
Re: 32-bit PPC on FPGA Posted on 11-Feb-2024 14:28:01
| | [ #53 ] |
|
|
 |
Elite Member  |
Joined: 8-Mar-2003 Posts: 2713
From: Unknown | | |
|
| @Gunnar
I'm pretty sure umisef knows about pipelines, but your example is an ideal (unrealistic) best-case scenario which would yield similar throughput on any other CPU from the past 25 years.
How well or how badly the CPU prevents, or recovers from, that pipeline getting stalled is what determines real-world single-core performance. _________________ - We don't need good ideas, we haven't run out on bad ones yet - blame Canada |
|
Status: Offline |
|
|
Gunnar
|  |
Re: 32-bit PPC on FPGA Posted on 11-Feb-2024 14:39:12
| | [ #54 ] |
|
|
 |
Cult Member  |
Joined: 25-Sep-2022 Posts: 512
From: Unknown | | |
|
| @Kronos
Quote:
I'm pretty sure umisef knows about pipelines |
Maybe you know that I have worked at IBM in PowerPC development ... and I know how the PowerPC works internally. 
The fact is:
A PowerPC as a RISC has by design a severe performance disadvantage in regards to integer code.
This is by design.
A PPC zealot can argue about this as long as he wants... A PPC expert, for example a PPC CPU developer from IBM, will tell you what the truth is: the PPC has a severe design disadvantage.
But this is not the point.
You need to ask yourself what is the design goal of the PPC.
The design goal of the PPC is to be a lot easier and a lot quicker to design than a high end CISC.
This means making a new PPC core for FPGA or for ASIC is actually a lot less work than making a high end CISC core.
So Karlos's question was whether a PPC softcore could be done. Yes, it could be done, and it's relatively little work in comparison to making a high-end CISC CPU.
|
|
Status: Offline |
|
|
Karlos
|  |
Re: 32-bit PPC on FPGA Posted on 11-Feb-2024 14:53:07
| | [ #55 ] |
|
|
 |
Elite Member  |
Joined: 24-Aug-2003 Posts: 4843
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition! | | |
|
| @Gunnar
Quote:
Yes, it could be done, and it's relatively little work in comparison to making a high-end CISC CPU |
Nevertheless, the end result would not be performant enough._________________ Doing stupid things for fun... |
|
Status: Offline |
|
|
Gunnar
|  |
Re: 32-bit PPC on FPGA Posted on 11-Feb-2024 14:56:42
| | [ #56 ] |
|
|
 |
Cult Member  |
Joined: 25-Sep-2022 Posts: 512
From: Unknown | | |
|
| @Karlos
Quote:
Nevertheless, the end result would not be performant enough. |
Not performant enough in comparison to what?
What exactly is your goal?
If you develop a CPU in an FPGA, then you are in fact doing CPU design the same way as Intel, IBM, AMD and all the other experts do.
All these big players develop the CPU first in an FPGA, and then they go on to do an ASIC. This is the normal development procedure today.
If you want to own your own CPU, then going through this development cycle makes a lot of sense. And you can later make your own ASIC, if you have the budget. |
|
Status: Offline |
|
|
Karlos
|  |
Re: 32-bit PPC on FPGA Posted on 11-Feb-2024 15:03:14
| | [ #57 ] |
|
|
 |
Elite Member  |
Joined: 24-Aug-2003 Posts: 4843
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition! | | |
|
| @Gunnar
If you check the original question, you'll see that I was asking whether or not a purpose designed FPGA solution might be an affordable alternative to current PPC/NG offerings. If the end result is a scalar performance no better than and probably worse than the original PowerUp expansions, there's no point, at any price point. _________________ Doing stupid things for fun... |
|
Status: Offline |
|
|
Kronos
|  |
Re: 32-bit PPC on FPGA Posted on 11-Feb-2024 15:04:52
| | [ #58 ] |
|
|
 |
Elite Member  |
Joined: 8-Mar-2003 Posts: 2713
From: Unknown | | |
|
| @Gunnar
Quote:
Gunnar wrote: @Kronos
Quote:
I'm pretty sure umisef knows about pipelines |
Maybe you know that I have worked at IBM in PowerPC development ... and I know how the PowerPC works internally. 
|
And yet you tried to forward the idea that PPC would take 6x longer at the same clock....
Maybe, just maybe, you could for once not use a super-cherry-picked benchmark plus a fake, crippled implementation of the opposition. That's when we might take your past employment as some sort of proof of competence._________________ - We don't need good ideas, we haven't run out on bad ones yet - blame Canada |
|
Status: Offline |
|
|
Gunnar
|  |
Re: 32-bit PPC on FPGA Posted on 11-Feb-2024 15:25:10
| | [ #59 ] |
|
|
 |
Cult Member  |
Joined: 25-Sep-2022 Posts: 512
From: Unknown | | |
|
| |
Status: Offline |
|
|
Karlos
|  |
Re: 32-bit PPC on FPGA Posted on 11-Feb-2024 16:01:00
| | [ #60 ] |
|
|
 |
Elite Member  |
Joined: 24-Aug-2003 Posts: 4843
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition! | | |
|
| @Gunnar
Quote:
Maybe you know that I have worked at IBM in PowerPC development ... |
What specifically did you do there? Just interested._________________ Doing stupid things for fun... |
|
Status: Offline |
|
|