Karlos 
Re: 32-bit PPC on FPGA
Posted on 11-Feb-2024 10:59:06
[ #41 ]
Elite Member
Joined: 24-Aug-2003
Posts: 4405
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition!

@Gunnar

Quote:
And if you get the pocket-money to make a real ASIC - you could get it 10 times higher clockrate


The master plan for the 68100 revealed...?

Somewhere out there, Matthey just had a rapture moment.

_________________
Doing stupid things for fun...

Karlos 
Re: 32-bit PPC on FPGA
Posted on 11-Feb-2024 11:06:49
[ #42 ]
Elite Member
Joined: 24-Aug-2003
Posts: 4405
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition!

@Fl@sh

Quote:
To add a 16 bit const in a powerpc register you need one instruction, for a 32 bit value you need two.
Load a 32 bit constant is not a so frequent case and it's not a real problem about powerpc arch


To be fair to him, that wasn't the example Gunnar gave. He added a 32 bit constant to a direct memory location. Which is part of the point of CISC, but not a feature of typical load/store architectures. So if you include the need to load the target operand and write it back again, that's still four operations including the two you suggest are necessary.

_________________
Doing stupid things for fun...

OneTimer1 
Re: 32-bit PPC on FPGA
Posted on 11-Feb-2024 12:12:54
[ #43 ]
Cult Member
Joined: 3-Aug-2015
Posts: 983
From: Unknown

@Gunnar

Any plans for selling 68k compatible CPUs for embedded industries?

You would just have to reach clocks above 300MHz and have integrated DDR2 (or better) memory interfaces, I2C, SPI, UART and CAN on board, and an IEEE-compliant FPU.

V8 
Re: 32-bit PPC on FPGA
Posted on 11-Feb-2024 12:26:36
[ #44 ]
Regular Member
Joined: 30-Mar-2022
Posts: 133
From: Unknown

@NutsAboutAmiga

Quote:
Well, they are going to sell SIN and Heretic II to all markets.


Maybe steffen can forgo some of that money and use it to pay some of the developers that he unlube-dildoed when he was at Hyperion?

Or maybe steffen will just, fuck joerg and fuck the others, this is my money.

umisef 
Re: 32-bit PPC on FPGA
Posted on 11-Feb-2024 13:16:35
[ #45 ]
Super Member
Joined: 19-Jun-2005
Posts: 1714
From: Melbourne, Australia

@Gunnar

Quote:
add.l #100000,myscore

The PowerPC will need more instructions to do the same work.
In this case it will need 6 instructions!

This one instruction needs 1 cycle on 68080.


See, that's exactly why I was thinking of you when reading the thread you started.

That's a 10 byte instruction, which involves both a 4 byte memory read and a 4 byte memory write in addition to the 10 byte instruction read. That "1 cycle" is extremely generous towards any 68k implementation...

...and at the same time, the "it will need 6 instructions" is assuming the absolute worst implementation:
1) load address of "myscore" into register
2) read current value of myscore into second register
3+4) perform a 32 bit immediate load to 3rd register
5) add 3rd register to 2nd
6) store new value into "myscore".

What actually happens is that either steps 3+4 are implemented as a single 32 bit PC-relative read (i.e. a single instruction), or, for this particular example, instructions 3/4/5 are implemented as two immediate adds (addis/addi), so two instructions only.
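The addis/addi split can be sketched in a few lines of Python. This is only an illustration, not compiler output: the function names are made up for this example, and the model assumes the usual PowerPC behaviour that addi sign-extends its 16-bit immediate while addis adds the immediate shifted left by 16 bits.

```python
def split_for_addis_addi(value):
    """Split a 32-bit constant into (high, low) signed 16-bit halves.

    addi sign-extends its 16-bit immediate, so when bit 15 of the constant
    is set, the high half has to be bumped by one to compensate.
    """
    value &= 0xFFFFFFFF
    low = value & 0xFFFF
    if low >= 0x8000:           # addi would treat this half as negative
        low -= 0x10000
    high = ((value - low) >> 16) & 0xFFFF
    if high >= 0x8000:          # addis also takes a signed 16-bit immediate
        high -= 0x10000
    return high, low


def apply_addis_addi(reg, high, low):
    """Model 'addis reg,reg,high' then 'addi reg,reg,low' on a 32-bit register."""
    reg = (reg + (high << 16)) & 0xFFFFFFFF   # addis: add immediate shifted by 16
    reg = (reg + low) & 0xFFFFFFFF            # addi: add sign-extended immediate
    return reg


high, low = split_for_addis_addi(100000)
print(high, low)                        # 2 -31072
print(apply_addis_addi(0, high, low))   # 100000
```

So for this particular constant the pair would be along the lines of addis with 2 and addi with -31072, applied to the register that already holds the loaded score.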

Another notable difference between the two is that if you want to do it again, the 68k will need another 10-byte 68k instruction, with another 4 byte read and a 4 byte write. In contrast, the PPC merely needs to repeat instruction 5 and 6, because it has all the relevant state still in its large register file.

And that's before doing anything smart. If updating a score was an operation that was done often enough to matter (which it won't be, ever, on any game, on any processor), one would simply dedicate one of the PPC's many registers to hold that value throughout the program, immediately removing the need for instructions (1), (2) and (6). And while the PPC does need a few scratch registers in places the 68k doesn't, that "few" is a long way away from 16, which is how many more general purpose registers the PPC ISA offers compared to the 68k.


So you are comparing hand-optimised 68k code (gcc 68k will emit two instructions for this, a LEA followed by an ADD.L #100000,(A0)) with an unrealistically convoluted implementation on the PPC (gcc will happily produce the 5 instruction version, without any hints or help), while making unrealistic claims about it taking "a single cycle".

Meanwhile, a similarly simple line myscore=10*myscore+myscore/3; ends up being 10 PPC instructions, but 13 68k instructions, when compiled.

For anyone interested in playing with this, the compiler explorer is a lovely, and somewhat less subjective, resource.

Last edited by umisef on 11-Feb-2024 at 01:19 PM.
Last edited by umisef on 11-Feb-2024 at 01:17 PM.

WolfToTheMoon 
Re: 32-bit PPC on FPGA
Posted on 11-Feb-2024 13:22:11
[ #46 ]
Super Member
Joined: 2-Sep-2010
Posts: 1351
From: CRO

There is an open source PPC SoC in development

https://libre-soc.org/

They have a goal of producing a Pi-like SBC featuring a 22/28 nm SoC.

So maybe an FPGA will not be needed, but... considering the state of OS4, possibly not even new and truly cheap hardware will make much of a difference.

NutsAboutAmiga 
Re: 32-bit PPC on FPGA
Posted on 11-Feb-2024 13:23:45
[ #47 ]
Elite Member
Joined: 9-Jun-2004
Posts: 12820
From: Norway

@V8

Steffen is not responsible for Ben paying or not paying the other developers.

Sure, I can agree that it would have been better if he had done this work when Hyperion was struggling to find money. In any case, I think ODM-like agreements with individual contractors were a bad idea; Ben should never have agreed to contracts like that, but he did.

It's mostly resolved by it being sold through AmiStore instead, or being paid for by Trevor instead.

Last edited by NutsAboutAmiga on 11-Feb-2024 at 01:30 PM.

_________________
http://lifeofliveforit.blogspot.no/
Facebook::LiveForIt Software for AmigaOS

Karlos 
Re: 32-bit PPC on FPGA
Posted on 11-Feb-2024 13:27:13
[ #48 ]
Elite Member
Joined: 24-Aug-2003
Posts: 4405
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition!

@umisef

Does the instruction fetch count here? Certainly for the worst case sure, but in a typical case, you might reasonably assume the instruction and immediate operand are already resident in the instruction cache, assuming code is being loaded into it via line transfers*


*Also applies to PPC version.

Last edited by Karlos on 11-Feb-2024 at 01:29 PM.
Last edited by Karlos on 11-Feb-2024 at 01:28 PM.
Last edited by Karlos on 11-Feb-2024 at 01:27 PM.

_________________
Doing stupid things for fun...

NutsAboutAmiga 
Re: 32-bit PPC on FPGA
Posted on 11-Feb-2024 13:33:26
[ #49 ]
Elite Member
Joined: 9-Jun-2004
Posts: 12820
From: Norway

@Karlos

The cache is only a faster memory; you have not eliminated a load, you have only avoided a cache fill if it's already in cache.

_________________
http://lifeofliveforit.blogspot.no/
Facebook::LiveForIt Software for AmigaOS

Gunnar 
Re: 32-bit PPC on FPGA
Posted on 11-Feb-2024 13:58:28
[ #50 ]
Regular Member
Joined: 25-Sep-2022
Posts: 477
From: Unknown

@umisef

Quote:
And while the PPC does need a few scratch registers in places the 68k doesn't, that "few" is a long way away from 16, which is how many more general purpose registers the PPC ISA offers compared to the 68k.


Yes, more registers are always good.
This is also why the Apollo 68080 CPU has more registers than the PPC.

The Apollo 68080 has 32 DATA registers and 16 ADDRESS registers.

Gunnar 
Re: 32-bit PPC on FPGA
Posted on 11-Feb-2024 14:06:20
[ #51 ]
Regular Member
Joined: 25-Sep-2022
Posts: 477
From: Unknown

@umisef

Quote:
That "1 cycle" is extremely generous towards any 68k implementation...



The example with the instruction ADD.L #100000,myscore taking one cycle is a real-world result.

The Apollo 68080 can execute this instruction each clock cycle.

This means that running at 100 MHz, the Apollo CPU can execute
100 million such instructions, and in addition also do 100 million floating point divides, per second.

Let me explain to you how this works:

A modern CPU executes instructions in a so-called "pipeline".
Using pipelining, a CPU will "split" the work of an instruction into several parts.
Let's say you split the execution of the instruction into 8 steps.

Each cycle you can feed a new instruction into the pipeline, and each cycle an instruction is finished.

Pipelining is how every modern CPU is designed.
Modern Intel chips are designed like this,
modern IBM POWER chips are designed like this,
and the 68080 is also designed like this.
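The throughput argument can be written down as a small Python sketch. It assumes an idealised pipeline that never stalls and accepts one new instruction per cycle; the function names and the 8-stage default are only illustrative and do not describe any specific core.

```python
def pipeline_cycles(num_instructions, stages=8):
    """Cycles needed to finish a stall-free run through an ideal pipeline.

    The first instruction needs `stages` cycles to travel through the pipe;
    every following instruction completes one cycle after its predecessor.
    """
    if num_instructions == 0:
        return 0
    return stages + num_instructions - 1


def instructions_per_second(clock_hz, num_instructions, stages=8):
    """Average completion rate over the whole run at the given clock."""
    cycles = pipeline_cycles(num_instructions, stages)
    return num_instructions * clock_hz / cycles


print(pipeline_cycles(1))      # 8
print(pipeline_cycles(1000))   # 1007
```

For long runs the fill time of the pipeline becomes negligible, so at 100 MHz the average rate approaches 100 million instructions per second, which is the best case only as long as nothing stalls.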

Gunnar 
Re: 32-bit PPC on FPGA
Posted on 11-Feb-2024 14:25:27
[ #52 ]
Regular Member
Joined: 25-Sep-2022
Posts: 477
From: Unknown

@umisef

Quote:
...and at the same time, the "it will need 6 instructions" is assuming the absolute worst implementation:

1) load address of "myscore" into register
2) read current value of myscore into second register
3+4) perform a 32 bit immediate load to 3rd register
5) add 3rd register to 2nd
6) store new value into "myscore".

Quote:
What actually happens is that either steps 3+4 are implemented as a single 32 bit PC-relative read (i.e. a single instruction), or, for this particular example, instructions 3/4/5 are implemented as two immediate adds (addis/addi), so two instructions only.


In two instructions?
Your post claims "fantasy features" which are not guaranteed by the PowerPC ISA.
In fact, there are many PowerPC implementations that suffer from pipeline stalls running this code ...
and would take even LONGER than 6 cycles.

But this was not my point.
My point was that the PPC has some design advantages and also "planned" design disadvantages.
And if you have a 100MHz softcore, e.g. of a PowerPC 4xx CPU, your integer performance is at a disadvantage.


If you want to talk about PowerPC pipeline implementations and their advantages or weaknesses,
we can do this.

How about you write exactly the PPC assembly that you think this needs?
I can then tell you how the pipeline of a 4xx IBM PowerPC
or a 750 IBM PowerPC executes it and how many cycles they need in reality.

And I can explain to you what problems these pipelines might run into.

Last edited by Gunnar on 11-Feb-2024 at 04:04 PM.

Kronos 
Re: 32-bit PPC on FPGA
Posted on 11-Feb-2024 14:28:01
[ #53 ]
Elite Member
Joined: 8-Mar-2003
Posts: 2562
From: Unknown

@Gunnar

I'm pretty sure umisef knows about pipelines, but your example is an ideal (unrealistic) best-case scenario which would yield similar throughput on any other CPU of the past 25 years.

How well or how badly the CPU prevents or recovers from that pipeline getting stalled is what determines real-world single-core performance.
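To put rough numbers on the effect of stalls, here is a back-of-the-envelope Python sketch using the textbook cycles-per-instruction model. The stall fraction and penalty are invented inputs for illustration, not measurements of any 68k or PPC core, and the function name is hypothetical.

```python
def effective_throughput(clock_hz, stall_fraction, stall_penalty_cycles):
    """Instructions per second for a one-per-cycle pipeline that sometimes stalls.

    stall_fraction       -- share of instructions that cause a stall (0.0 to 1.0)
    stall_penalty_cycles -- extra cycles lost on each stall
    """
    cycles_per_instruction = 1.0 + stall_fraction * stall_penalty_cycles
    return clock_hz / cycles_per_instruction


# Ideal case: no stalls, one instruction per cycle.
print(round(effective_throughput(100_000_000, 0.0, 0)))   # 100000000

# Hypothetical case: 20% of instructions stall for 3 cycles each.
print(round(effective_throughput(100_000_000, 0.2, 3)))   # 62500000
```

Even a modest stall rate takes a large bite out of the ideal figure, which is why the best-case number alone says little about real code.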

_________________
- We don't need good ideas, we haven't run out on bad ones yet
- blame Canada

Gunnar 
Re: 32-bit PPC on FPGA
Posted on 11-Feb-2024 14:39:12
[ #54 ]
Regular Member
Joined: 25-Sep-2022
Posts: 477
From: Unknown

@Kronos

Quote:
I pretty sure umisef knows about pipelines


Maybe you know that I have worked at IBM in PowerPC development ...
and I know how the PowerPC works internally.


The fact is:

A PowerPC, as a RISC, has by design a severe performance
disadvantage with regard to integer code.

This is by design.

A PPC zealot can argue about this as long as he wants...
A PPC expert, for example a PPC CPU developer from IBM,
will tell you what the truth is: the PPC has a severe design disadvantage.

But this is not the point.

You need to ask yourself what the design goal of the PPC is.

The design goal of the PPC is to be a lot easier and a lot quicker to design than a high-end CISC.

This means making a new PPC core for FPGA or for ASIC
is actually a lot less work than making a high-end CISC core.


So Karlos' question was whether a PPC softcore could be done.
Yes, it could be done, and it's relatively little work in comparison to making a high-end CISC CPU.



Karlos 
Re: 32-bit PPC on FPGA
Posted on 11-Feb-2024 14:53:07
[ #55 ]
Elite Member
Joined: 24-Aug-2003
Posts: 4405
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition!

@Gunnar

Quote:
Yes it could be done and its relative little work in comparison to making a high end CISC CPU


Nevertheless, the end result would not be performant enough.

_________________
Doing stupid things for fun...

Gunnar 
Re: 32-bit PPC on FPGA
Posted on 11-Feb-2024 14:56:42
[ #56 ]
Regular Member
Joined: 25-Sep-2022
Posts: 477
From: Unknown

@Karlos

Quote:
Nevertheless, the end result would not be performant enough.


Not performant enough in comparison to what?

What exactly is your goal?


If you develop a CPU in an FPGA, then you are in fact doing CPU design the same way as Intel, IBM, AMD ... all the experts do.

All these big ones develop the CPU first in an FPGA ... and then they go on to do an ASIC.
This is the normal development procedure today.

If you want to own your own CPU, then going through this development cycle makes a lot of sense.
And you can later make your own ASIC, if you have the budget.

Karlos 
Re: 32-bit PPC on FPGA
Posted on 11-Feb-2024 15:03:14
[ #57 ]
Elite Member
Joined: 24-Aug-2003
Posts: 4405
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition!

@Gunnar

If you check the original question, you'll see that I was asking whether or not a purpose designed FPGA solution might be an affordable alternative to current PPC/NG offerings. If the end result is a scalar performance no better than and probably worse than the original PowerUp expansions, there's no point, at any price point.

_________________
Doing stupid things for fun...

Kronos 
Re: 32-bit PPC on FPGA
Posted on 11-Feb-2024 15:04:52
[ #58 ]
Elite Member
Joined: 8-Mar-2003
Posts: 2562
From: Unknown

@Gunnar

Quote:

Gunnar wrote:
@Kronos

Quote:
I pretty sure umisef knows about pipelines


Maybe you know that I have worked at IBM in PowerPC development ...
and I know how the PowerPC works internally.




And yet you tried to forward the idea that PPC would take 6x longer at the same clock....

Maybe, just maybe, you could at least for once not use a super cherry-picked benchmark plus a fake crippled implementation of the opposition, and yeah, that's when we might take your past employment as some sort of proof of competence.

_________________
- We don't need good ideas, we haven't run out on bad ones yet
- blame Canada

Gunnar 
Re: 32-bit PPC on FPGA
Posted on 11-Feb-2024 15:25:10
[ #59 ]
Regular Member
Joined: 25-Sep-2022
Posts: 477
From: Unknown

@Kronos

Quote:
And yet you tried to forward the idea that PPC would take 6x longer at the same clock....


I never said this.


What I said was:

Quote:
The RISC needs often more instructions for doing the same amount of work than a CISC.

This is a fact.

Quote:

Often a PPC needs 3 or 5 or even 6 or 7 instructions to do what a single 68K instruction does.
This means the "net performance" of a PPC running at 100MHz
would in reality often be a lot less than what the 68080 reaches today in the Vampire.

This is correct.


Claiming a PPC is 6 times slower is nonsense.
You cannot give a single performance number for a CPU anyway.

A CPU can do a lot of different operations,
and each CPU has both strengths and weaknesses.

Of course, if a CPU in some areas often needs more instructions to do the same work, then that is a disadvantage.


Let's look at some "fingerprints" of how a CPU implementation runs certain operations.

[Images not preserved]

You see here several PowerPC chips.
All of them have differences.
You also see one x86 for comparison.




What can we learn from this?

Each implementation of a PPC is different.
If you want to compare, then you need to be very specific.

For example, the 68k provides the (A0)+ address mode for free.
The "original" PowerPC, as it was designed a long time ago, also provided this for free.
And if you read IBM's very old PowerPC compiler writer's manual,
then you can see that IBM used to advertise this heavily, and
we recommended that compiler writers use these EA modes as often as possible.


In reality, many new PPC cores no longer support this for free...
And IBM today no longer recommends using the mode...

So you need to be very specific if you want to compare.




But a few things are very simple to understand.

What features help to build a very high performing CPU?

* Compact code = helps to get more from your cache
* Being able to do more per single instruction = being strong = is very good
* Many instructions per cycle = is good
* More operations per instruction = SIMD = is very good
* Having good EA modes = is very useful
* Having many or enough registers = is very good
* Having powerful caches that support several accesses per cycle = is very good


The more of these items you can tick, the more advantages you have.

And of course the list goes further:

* Good branch prediction
* Good internal forwarding
* Having either a design that performs well without OOO, or if not, then having strong OOO
* Having smart cache prefetching


The combination of all this makes a good CPU,
and many of these are NOT bound to an architecture.

This means you can design, for example, a 68K CPU having each and every one of the above features.

Last edited by Gunnar on 11-Feb-2024 at 03:38 PM.

Karlos 
Re: 32-bit PPC on FPGA
Posted on 11-Feb-2024 16:01:00
[ #60 ]
Elite Member
Joined: 24-Aug-2003
Posts: 4405
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition!

@Gunnar

Quote:
Maybe you know that I have worked at IBM in PowerPC development ...


What specifically did you do there? Just interested.

_________________
Doing stupid things for fun...

Copyright (C) 2000 - 2019 Amigaworld.net.
Amigaworld.net was originally founded by David Doyle