NutsAboutAmiga 
Re: some words on senseless attacks on ppc hardware
Posted on 28-Jan-2024 12:50:28
#781 ]
Elite Member
Joined: 9-Jun-2004
Posts: 12817
From: Norway

@Hypex

Well, the AmigaONE-XE G4 was not extremely expensive, but it was clocked at 800 MHz. At the time I had an Athlon at 1.8 GHz, and the speed difference was big. The PC had a newer VIA chipset and the AmigaONE had an older chipset. I ran Linux on the PC as it was too crashy to run Windows 98. I think they cost about the same. They were both pretty horrible computers. The AmigaONE-XE had more issues, however.

I have to say that Hyperion did fix the bugs in software eventually, but it took far too long, and I believe the Athlon PC also eventually ran Windows without crashing.

I remember using LILO on Linux, and it kept trashing the NT Loader in Windows, so it was a pretty painful setup.

Last edited by NutsAboutAmiga on 28-Jan-2024 at 07:56 PM.
Last edited by NutsAboutAmiga on 28-Jan-2024 at 12:55 PM.

_________________
http://lifeofliveforit.blogspot.no/
Facebook::LiveForIt Software for AmigaOS

agami 
Re: some words on senseless attacks on ppc hardware
Posted on 29-Jan-2024 1:30:33
#782 ]
Super Member
Joined: 30-Jun-2008
Posts: 1648
From: Melbourne, Australia

@Hypex

Quote:
Hypex wrote:

The X1000 was rather crippled in some respects.

Yeah, in respect to the AmigaOS 4 drivers (cue rim shot).

_________________
All the way, with 68k

Hammer 
Re: some words on senseless attacks on ppc hardware
Posted on 29-Jan-2024 2:09:41
#783 ]
Elite Member
Joined: 9-Mar-2003
Posts: 5273
From: Australia

@NutsAboutAmiga

AmigaOne XE was released around April 2003.

Athlon chipsets in 2003 included AMD's Iron Gate/VIA Southbridge, VIA KT400, SiS 730, and NVIDIA nForce 2.

https://www.anandtech.com/show/1005/4
nForce 2's memory controller is faster than VIA KT333's.

NVIDIA nForce2 128-bit DDR333 = 1336.5 MB/s
VIA KT333 64-bit DDR333 = 1028.8 MB/s


Memory Latency Comparison - Cachemem (lower is better)

NVIDIA nForce2 128-bit DDR333 = 244
VIA KT333 64-bit DDR333 = 349


Ethernet Controller Performance - NetIQ Chariot, CPU Utilization
Intel 10/100 PRO+ = 13%, 155 Mbps
nForce2 (NVIDIA MAC) = 12%, 153 Mbps
VIA VT613 = 27%, 125 Mbps

A full-duplex connection and thus the theoretical maximum is 200Mbps (100Mbps each way).

I have run Windows XP (NT 5.1) since 2001. "Linux had no games".

_________________
Ryzen 9 7900X, DDR5-6000 64 GB RAM, GeForce RTX 4080 16 GB
Amiga 1200 (Rev 1D1, KS 3.2, PiStorm32lite/RPi 4B 4GB/Emu68)
Amiga 500 (Rev 6A, KS 3.2, PiStorm/RPi 3a/Emu68)

Hammer 
Re: some words on senseless attacks on ppc hardware
Posted on 29-Jan-2024 4:02:12
#784 ]
Elite Member
Joined: 9-Mar-2003
Posts: 5273
From: Australia

@matthey

Quote:

The mass produced Amiga 1200 was at the low end of mid-performance in 1992, low end in 1993 and practically obsolete in 1994. Increasing the CPU performance was relatively easy while increasing the chipset performance was difficult, especially with the lack of vision and planning by upper management.

It's the software that sells hardware. The gaming PC had the time-exclusive 1990 Wing Commander (VGA) as its "Defender of the Crown" moment.

PC time-exclusive 1993 Doom wasn't the only standout game for the PC platform.

Quote:

The cost difference of a 68EC020 and 68EC030 was quickly closing and the cost of a 68EC030 became low enough to be compelling. There were likely to be better quantity discounts on the 68EC030 that was not at the high end of clock ratings. The 68030 is ~25% lower power than the 68020 offsetting much of the clock increase to power and heat, the data cache further reduces memory traffic and the 68EC030 has a full 32 bit address bus instead of the 24 bit address bus of the 68EC020 (larger memory expansions could be supported while PCMCIA slot and memory expansion conflicts could be avoided).

From real-world experience, 68020 and 68030 IPC performance is similar.

Quote:

It's a good thing the RPi Foundation doesn't market or sell "useless" embedded CPU/SoC chips for the desktop market


1. Without the open-source PiStorm gateway, the RPi 3A or 4B is nearly useless for the classic Amiga.

2. RPi Foundation's embedded products are usually low-cost. The Amiga market has a "Phase 5" tax.

RPi Foundation's 3rd party funding includes "BBC Micro" style desktop education programs.


The Pi was conceived in 2006 by a group (or should it be a circuit?) of computer scientists at the University of Cambridge, concerned that the growing sophistication of modern machines had made it harder for children to learn coding for themselves, as the generation weaned on BBC Micros and Sinclair Spectrums had.


There is no Jack Tramiel's Commodore to interfere with the UK's 21st-century BBC Micro "education" subsidy.

My low-cost RPi 3A and 4B are made in the UK via Sony UK TEC's manufacturing plants.

NXP's ARM Cortex A53 SBC solution is price competitive with RPi 3A+, but it's useless for classic Amiga.

I would thank the British taxpayer and ARM Ltd for subsidizing and investing in Raspberry Pi Foundation.

Quote:

. A-Eon sells the AmigaNOne with embedded CPU/SoC chips for the desktop market but they are obsolete even for the embedded market.

This is A-Eon's problem.

Quote:

Back in the Amiga days, C= bought mostly embedded CPUs.

In the 1980s, Motorola 68K had a sizable market share in the Unix workstation.

Hint: 1987 68030's MMU integration is mostly to address the Unix market and counter Intel's MMU integration with i286/i386 (for Xenix).

68EC030 is a 68030 recovered from a manufacturing defect (failed MMU). Motorola's BOM cost for 68EC030 and 68030 is the same.

A1200's 68EC020 had a 24-bit address bus like 68000's. Wing Commander 256 colors (AGA-CD32) works with 2MB RAM. Wing Commander VGA's system recommendation is 2 MB RAM.

Intel 386SX has a 24-bit address, a 16-bit external bus, and MMU.

Motorola 68EC020 has a 24-bit address and a 32-bit external bus. Integrated MMU is important for *nix software development.

On performance, 68EC020 has an edge over i386SX, but Commodore's A1200 gimped 68EC020 with 1985 era memory controller (Alice) i.e. "read my lips, no new chips" during A3000's R&D phase.

Alice wasn't designed for 68020's full 32-bit 14 Mhz memory controller and Commodore ran out of time.

Adding 14 Mhz 32-bit Fast RAM for A1200's stock 68EC020 gives a 2X performance boost and the AGA chipset is less memory bandwidth-constrained.

For sustained 256 color performance, PC VGA port Turrican AGA recommends Fast RAM equipped A1200.

CPU's motherboard platform is a factor.


Quote:

This allowed them to leverage economies of scale from the embedded market. It worked well enough until they needed higher performance CPUs and there wasn't adequate economies of scale to reduce CPU chip costs with Apple leaving the 68k market. Well, C= upper management was so oblivious that they didn't get that they needed to increase CPU performance to remain competitive or they would have at least upped the clock speed of the CPU with the very popular and high production embedded 68EC030. The 68060 would have been super for desktop, laptop and console use but the high end embedded market wasn't large enough at that time to drop the price, especially without Apple.

Without the additional Amiga clone distribution logistics, both GVP's A1230 Turbo and Phase 5's Blizzard 030/040/060 accelerator market are constrained by A1200's small market size and Commodore's A1200 production rates.

With Commodore canceling the A500, Commodore destroyed the half-a-billion-revenue GVP.
Unlike AMD and Intel, Commodore has very little regard for its value-added partners.

Commodore publicly attacked the EGS RTG initiative with FUD. Commodore could have worked with the EGS RTG initiative with a compatible solution.

P96 followed Phase 5's CGX API compatibility.

For Intel/AMD, the PC clone platform won the distribution logistics.

When NVIDIA demoed real-time BVH raytracing, the BVH raytracing characteristics were incorporated into Microsoft's DirectX12 Feature Level 12_2. Microsoft assimilates like Star Trek's Borg. The clone master AMD copies NVIDIA's BVH raytracing characteristics and Sony benefited from it.

Crytek's Octree software raytracing is the alternative.

Quote:

It would have been more accurate to say that the 68060 offered the per-clock performance of the Pentium and cost less than half the price. A 100MHz Pentium P54C was planned by the time the 68060 launched in 1994.

That's not accurate in the real world.

I'm still waiting for 68060's kick-ass Doom (integer) or Quake (FP/integer) benchmarks.

https://www.youtube.com/watch?v=sA3WbPksAbM
Warp1260 with 68060 Rev @ 100 Mhz and RTG playing Quake.

I think this one runs between 18FPS and 26FPS depending on where you are.

https://thandor.net/benchmark/33
Quake demo1 320x200p

Pentium 75 (430VX) = 20.00 fps average
Pentium 100 (430VX) = 26.70 fps average

Pentium 150 (430VX) = 33.90 fps average (easy overclock into 166 Mhz with 60 to 66 Mhz FSB jumper)

K6 166ALR 166Mhz (430VX) = 34.70 fps average
Pentium MMX 166 (430VX) = 39.80 fps average

In 1996, my choice was between a Phase 5 CyberStorm 060 @ 50 Mhz + CyberGraphics 64 for my A3000 and a new-build Pentium 150 + S3 Trio 64 UV + Intel 430VX PC clone.

Quote:

68060@50MHz
90 DMIPS (1.8 DMIPS/MHz)
2.5 million transistors, CMOS
$308 (1000s quantity)

Pentium P54C@100MHz
138 DMIPS (1.38 DMIPS/MHz)
3.3 million transistors, BiCMOS (large die and BiCMOS raises production cost)
$995 (1000s quantity)

PPC601@66MHz
93 DMIPS (1.41 DMIPS/MHz)
2.8 million transistors, CMOS
$370 (1000s quantity)

1. A useless argument for floating point-enabled geometry Quake.

Classic Pentium class competitors from AMD and Cyrix have strong fixed-point integer performance since the infamous PR rating is based on integer performance.

Pentium Pro (P6) was released in 1995 and the classic Pentium acted like a lower-cost "Celeron". During the Pentium II (P6 MMX) era, the Celeron brand was created as a lower-cost classic Pentium replacement.

Pentium Pro (P6) has 2.7 DMIPS/MHz.

During the Pentium II era, Intel expanded its product tiers into Celeron, Pentium II, and Xeon product lines.

2. 68060 has a 32-bit front side bus, hence dual 32-bit ALU capability is wasted.

Classic Pentium has a 64-bit front-side bus.

Dhry1 Opt VAX MIPS
AMD 80386DX 40Mhz, 17.5 (eat that 68030 @ 50Mhz)

Pentium 75, 112 (1.49 DMIPS/Mhz)
Pentium 100, 169 (1.69 DMIPS/MHz)
Pentium 200, 353 (1.765 DMIPS/MHz)

Cyrix PP166 133Mhz, 219 (1.64 DMIPS/MHz)
Cyrix PR233 188Mhz, 286 (1.52 DMIPS/MHz)
----

Opt VAX MIPS
Raspberry Pi 3, 32 Bit, 1200 Mhz, 2469 (2.06 DMIPS/MHz)
Raspberry Pi 3, 64 Bit, 1200 Mhz, 3536 (2.94 DMIPS/MHz)


http://www.roylongbottom.org.uk/dhrystone%20results.htm
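The per-clock figures in the Pentium and Cyrix rows above are simply the Dhrystone score divided by the clock speed. A minimal sketch of that arithmetic, using only the scores and clocks listed in the post (the helper name `dmips_per_mhz` is made up for illustration):

```python
# Dhrystone "Opt VAX MIPS" scores and clock speeds copied from the list above.
# dmips_per_mhz is an illustrative helper name, not something from the
# benchmark page.
results = [
    ("Pentium 75", 75, 112),
    ("Pentium 100", 100, 169),
    ("Pentium 200", 200, 353),
    ("Cyrix PP166 (133 MHz)", 133, 219),
    ("Cyrix PR233 (188 MHz)", 188, 286),
]


def dmips_per_mhz(score, mhz):
    """Per-clock Dhrystone throughput: VAX MIPS divided by the clock in MHz."""
    return score / mhz


for name, mhz, score in results:
    print(f"{name}: {dmips_per_mhz(score, mhz):.3f} DMIPS/MHz")
```

The post's two-decimal figures are these ratios rounded or truncated, so a last-digit difference is expected.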

3.
https://www.nytimes.com/1995/02/03/business/company-news-intel-cuts-prices-of-pentium-chips.html
Date: Feb. 3, 1995

Intel, based in Santa Clara, Calif., said the price cuts were effective this week. It cut the price for its Pentium 75 megahertz processors by about 40 percent, to $301 a chip from $495, for lots of 1,000.


https://www.nytimes.com/1995/08/03/business/intel-cuts-pentium-prices.html
Date: Aug. 3, 1995

on orders of 1,000 chips, a 133-megahertz Pentium chip now costs $694, down from $935,

Intel's 100-megahertz Pentium chips will sell for $398, down from $479, and the 75-megahertz chip will sell for $184, compared with $275 previously.

When Pentium 133 and Pentium Pro were released in 1995, they drove down older Pentium prices.

I purchased my Pentium 150 in 1996 and it's not a top-end SKU, hence no "flagship" price tax.
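For a rough sense of scale, the cuts quoted from the two articles can be turned into percentages. This is only a sketch using the 1,000-unit prices quoted above; `percent_cut` is a made-up helper name.

```python
# 1,000-unit prices in US dollars, as quoted from the two NYT articles above.
cuts = [
    ("Feb 1995", "Pentium 75", 495, 301),
    ("Aug 1995", "Pentium 133", 935, 694),
    ("Aug 1995", "Pentium 100", 479, 398),
    ("Aug 1995", "Pentium 75", 275, 184),
]


def percent_cut(old_price, new_price):
    """Price reduction as a percentage of the old price."""
    return (old_price - new_price) / old_price * 100.0


for date, chip, old, new in cuts:
    print(f"{date} {chip}: ${old} -> ${new} ({percent_cut(old, new):.1f}% cut)")
```

The February Pentium 75 cut comes out a little over 39%, which matches the article's "about 40 percent".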


Quote:

The embedded LC and EC variants of the 68060 seemed to be the only priority after the full 68060 release in 1994 despite the paper talking about other enhancements. A 68060@66MHz would have directly competed with the PPC601@66MHz with a competitive advantage in performance and cost (due to smaller chip die size). The 8 stage 68060 pipeline should have provided a significant advantage in clock speed over the PPC601 with 4 stage pipeline as well (the 8 stage 68060@50MHz pipeline was deeper than the contemporary Alpha21064@300MHz with 7 stage pipeline). The fateful political decision to switch to PPC for the desktop had already been made though. The Pentium killer 68060 was locked in the Motorola basement and designated to low clock speed embedded duty, with one of the deepest pipelines and best DMIPS/MHz performance of the time.

CPU max clock rating @ ~0.5um chip process with pipeline stages
ARM710@40MHz 3-stage
PPC601+@120MHz 4-stage
PPC603@160MHz 4-stage
Pentium P54C@120MHz 5-stage
PPC604@180MHz 6-stage
HP PA-8000@180MHz 7-stage
Alpha 21064@300MHz 7-stage
MIPS R4400@200MHz 8-stage
68060@50MHz 8-stage

FYI, Pentium Pro (P6) was released in 1995 and the classic Pentium acted like a lower-cost "Celeron" product line.

68060's 8-stage pipeline is useless when the FPU is not pipelined.

My FPU-less 68LC060 Rev 4 can reach 75 Mhz, but not 100 Mhz. TF1260 supports 100Mhz 68060 Rev 6.

In practice, FPU-less 68LC060 has an uncompetitive clock speed. Need ex-DEC Alpha engineers for high-clock speed designs.

Last edited by Hammer on 01-Feb-2024 at 05:50 AM.
Last edited by Hammer on 29-Jan-2024 at 07:09 AM.
Last edited by Hammer on 29-Jan-2024 at 06:42 AM.
Last edited by Hammer on 29-Jan-2024 at 06:34 AM.

_________________
Ryzen 9 7900X, DDR5-6000 64 GB RAM, GeForce RTX 4080 16 GB
Amiga 1200 (Rev 1D1, KS 3.2, PiStorm32lite/RPi 4B 4GB/Emu68)
Amiga 500 (Rev 6A, KS 3.2, PiStorm/RPi 3a/Emu68)

Karlos 
Re: some words on senseless attacks on ppc hardware
Posted on 29-Jan-2024 10:21:17
#785 ]
Elite Member
Joined: 24-Aug-2003
Posts: 4402
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition!

@Hypex


Quote:
The X1000 was a better design than the XE but


Damn, that made me laugh. Really cheered up my day.

"Designed" is a strong word when talking about the Teron. It was knackered from the get go. Don't get me wrong, I had one and while it worked it was fun. I am sure there are probably some that still work, but all the indications are they are mostly dead now, killed by any number of serious flaws in their implementation (I hesitate to use the word design).

_________________
Doing stupid things for fun...

Hammer 
Re: some words on senseless attacks on ppc hardware
Posted on 30-Jan-2024 3:26:59
#786 ]
Elite Member
Joined: 9-Mar-2003
Posts: 5273
From: Australia

@Karlos

Quote:

Karlos wrote:
@Hypex


Quote:
The X1000 was a better design than the XE but


Damn, that made me laugh. Really cheered up my day.

"Designed" is a strong word when talking about the Teron. It was knackered from the get go. Don't get me wrong, I had one and while it worked it was fun. I am sure there are probably some that still work, but all the indications are they are mostly dead now, killed by any number of serious flaws in their implementation (I hesitate to use the word design).


To be fair, NVIDIA nForce 1 (released in 2001) had a data corruption bug that was quickly patched in software, and it was replaced by nForce 2 (released in 2002).

Due to nForce 1's low market presence, it didn't harm NVIDIA's reputation.

VIA's 686B Southbridge has data corruption with VIA KT133 Northbridges, but it works on AMD's Iron Gate Northbridge.

The VIA bug was uncovered by German hardware site Au-Ja!; the data corruption affects large (100MB and up) file transfers between two hard drives connected to separate IDE channels exchanging the data by DMA. Having a Creative Labs Soundblaster Live card in place seems to exacerbate the problem.

The KT133 chipset corrupted disk subsystems; specifically, the 686B Southbridge had issues with Creative's SBLive! sound cards. A BIOS update was released by VIA to fix this issue.

KT266 contains a hardware bug that causes system instability when using the AGP slot at the specified max capacity of 4X.

VIA's repeated chipset fukups have caused reputation damage to VIA and the K7 platform (the K7 market moved towards NVIDIA's nForce 2).

AMD's K8's integrated Northbridge removes the VIA problem.

In modern times, ASMedia (sister company to ASUSTeK and Pegatron) has replaced VIA as AMD's go-to for chipset block IP licenses. ASMedia was late on the PCIe 4.0 release cycle, so AMD created its own X570 chipset. Excluding the X570 chipset, all of the AM4 chipsets for AMD's Zen micro-architecture were designed by ASMedia.

Taiwanese MAI Logic and VIA doomed themselves with tofu mistakes.

Cache and DMA management are basic functions for classic Pentium-era desktop computers.

Last edited by Hammer on 30-Jan-2024 at 03:29 AM.
Last edited by Hammer on 30-Jan-2024 at 03:28 AM.

_________________
Ryzen 9 7900X, DDR5-6000 64 GB RAM, GeForce RTX 4080 16 GB
Amiga 1200 (Rev 1D1, KS 3.2, PiStorm32lite/RPi 4B 4GB/Emu68)
Amiga 500 (Rev 6A, KS 3.2, PiStorm/RPi 3a/Emu68)

matthey 
Re: some words on senseless attacks on ppc hardware
Posted on 30-Jan-2024 7:08:29
#787 ]
Super Member
Joined: 14-Mar-2007
Posts: 1999
From: Kansas

Hammer Quote:

I'm still waiting for 68060's kick-ass Doom (integer) or Quake (FP/integer) benchmarks.


Quake and Doom had many man hours of optimization for large gaming markets. There may be more man hours of optimization for Quake than 68060 support for any compiler. The most common compilers still don't have a 68060 specific instruction scheduler for an in-order CPU. You will have to wait forever with emulation of the 68k Amiga!

Hammer Quote:

FYI, Pentium Pro (P6) was released in 1995 and the classic Pentium acted like a lower-cost "Celeron" product line.

68060's 8-stage is useless when the FPU is not pipelined.


Integer performance is much more important than FPU performance. The 68060 was targeted for embedded use where FPU use is less important than for the desktop. Few games used the FPU when the 68060 was released. Other FPUs were not fully pipelined for at least double precision (nearly all C code) like in the 486, 68040, AMD K5 (5k86), PowerPC 601, PowerPC 603, MIPS R4000, etc. Other FPU factors are just as important, especially with the stack based x86 FPU handicap. The original Pentium can't superscalar (multi-issue) FPU instructions with anything except FXCH which was necessary manual register renaming to avoid stalls from reusing the default top of the FPU stack which the previous still executing FPU instruction is using with consecutive pipelined FPU instructions. The pipelined x86 FPU has to execute extra FXCH instructions and needs complex scheduling to take advantage of the fully pipelined FPU. The 68060 can superscalar issue the most common FPU and integer instructions together and has better timings on some instruction execution latencies, especially ones using memory. This makes instruction scheduling much easier for compilers. How much difference does this make?

The 1996 AMD K5 (5k86) has a non-pipelined FPU but has significantly better integer performance than a similarly clocked Pentium. The AMD 5k86 FPU Bytemark32 (Nbench) performance is roughly half of the Pentium which can be seen in the following benchmarks (Quake benchmark has less difference but is more dependent on the GPU and display timing).

https://dependency-injection.com/the-perfect-pentium/

The Pentium and AMD 5k86 FPU both have the x86 stack based FPU handicap though. There just happens to be some 68k Amiga Bytemark benchmarks available.

https://amigaworld.net/modules/newbb/viewtopic.php?topic_id=44391&forum=25#847418

Pentium@75MHz
int: 0.82
FP: 0.91

5k86@75MHz
int: 1.16
FP: 0.47

68060@75MHz (my GCC 3.4 results)
int: 1.20
FP: 0.24

68060@75MHz (vbcc estimated from Frank Wille 50MHz results)
int: 1.02
FP: 0.78

The benchmarks mostly fit in the caches so it is possible to estimate the results at a higher clock speed as I did for Frank Wille's results. This is a conservative estimate as the CPU caches are clocked up with the CPU clock. This can be seen by adding 25% to the P5@75MHz results which are lower than the actual P5@100MHz results. If we take the GCC 3.4 int results with the vbcc FP estimated results, the 68060 looks good.

Pentium@75MHz
int: **************** 0.82
FP: ****************** 0.91

5k86@75MHz
int: *********************** 1.16
FP: ********* 0.47

68060@75MHz
int: ************************ 1.20
FP: **************** 0.78

The Bytemark benchmarks have the 68060 FP performance closer to the Pentium than the AMD 5k86 with room for further 68060 compiler improvement. The AMD 5k86 was later and used a newer 0.35um process as did many 1995+ Pentiums. Where are the Pentium killer results? That is in the 8 stage pipeline that should have allowed higher clock speeds than the Pentium 5 stage pipeline.

68060@100MHz
int: ****************************** 1.50
FP: ******************** 0.98

With a 25% clock speed advantage due to the deeper pipeline, the 68060 FPU may have been outperforming the Pentium with ~83% better integer performance and ~8% better FPU performance. Then there is the secret 68060+.
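The 100MHz estimate and the percentages above can be reproduced with a short sketch. It assumes, as the figures in this post imply, that the 75MHz scores are simply multiplied by 1.25; the names `scale` and `advantage_pct` are made up for illustration.

```python
# Bytemark index values at 75 MHz, copied from the post above.
m68060_75 = {"int": 1.20, "fp": 0.78}   # GCC 3.4 int, vbcc-estimated FP
pentium_75 = {"int": 0.82, "fp": 0.91}

# Assumption: the post's 100 MHz figures are the 75 MHz ones times 1.25.
SCALE = 1.25


def scale(scores, factor):
    """Scale every benchmark index by the same factor."""
    return {name: value * factor for name, value in scores.items()}


def advantage_pct(a, b):
    """How much larger a is than b, in percent."""
    return (a / b - 1.0) * 100.0


m68060_100 = scale(m68060_75, SCALE)
for name in ("int", "fp"):
    est = m68060_100[name]
    adv = advantage_pct(est, pentium_75[name])
    print(f"{name}: {est:.3f} estimated, {adv:.1f}% over Pentium@75MHz")
```

With the unrounded FP estimate (0.975) the FPU advantage comes out just over 7%; the ~8% in the post follows from using the rounded 0.98.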

https://websrv.cecs.uci.edu/~papers/mpr/MPR/ARTICLES/080502.pdf Quote:

o 68060+ -undisclosed architectural enhancements that increase performance 20–30% independent of clock frequency.


68060+@100MHz (25% performance increase over 68060)
int: ************************************** 1.88
FP: ************************* 1.23

What is the secret sauce in the 68060+? The 68060 die is small enough to double the caches which was impractical with the large die of the Pentium until late 1996 with the P55C and a die shrink. The 68060 could have double the caches at introduction if there was a market for high end 68k CPUs. The 68060+ with more than double the integer performance and better FPU performance than the Pentium while at a more affordable price could have been devastating (68060+ could use cheaper 0.5um CMOS and 32 bit memory where Pentium needed expensive 0.35um BiCMOS, 64 bit memory & perhaps OoO to remain competitive). The original 68060 included many low cost features instead, especially for embedded. The performance was still competitive but could have been much better if designed for performance. A Pentium killer would have been a PPC killer though too. PPC also suffered from shallow pipeline designs and too large of caches due to less efficient cache usage which limited clock speeds.

Hammer Quote:

My FPU less 68LC060 Rev 4 can reach 75 Mhz, but not 100 Mhz. TF1260 supports 100Mhz 68060 Rev 6.

In practice, FPU-less 68LC060 has an uncompetitive clock speed. Need ex-DEC Alpha engineers for high-clock speed designs.


DEC Alpha engineers were not needed to achieve high clock speeds. Deep pipelines were needed. Look at the list again with a few new entries added. Most professional CPU designers were around 200MHz @0.5um with 7+ pipeline stages.

CPU max clock rating @ ~0.5um chip process with pipeline stages
ARM710@40MHz 3-stage
PPC601+@120MHz 4-stage
PPC603@160MHz 4-stage
Pentium P54C@120MHz 5-stage
PPC604@180MHz 6-stage
HP PA-7300LC@180MHz 6-stage
HP PA-8000@180MHz 7-stage
Alpha 21064@300MHz 7-stage
MIPS R4400@200MHz 8-stage
68060@50MHz 8-stage
UltraSPARC@200MHz 9-stage

Motorola had the tech to produce a 180MHz 0.5um PPC chip with 6 stages. MIPS and HP were able to produce chips around 200MHz with 7-8 stage pipelines. The 68060 wasn't going to hit 300MHz at 0.5um but it should have been able to reach 150-200MHz. While a more complex instruction set may limit the clock speed, a simple in-order CPU design like the 68060 is easier to clock up than a huge and complex OoO CPU design like the HP PA-8000 (in-order 6 stage HP PA-7300LC max clock rating is the same as the 7 stage OoO PA-8000 for example). As outrageous as an 8 stage CPU that only reaches 50MHz is, the 68060 was still financially successful with the last die shrink in 1999 and still rated at only 50MHz while even Pentiums were reaching ~200MHz at a similar node.
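One crude way to see how far the 68060 sits from the rest of the list is to divide each maximum clock rating by the number of pipeline stages. This ratio is only an illustration added here, not a metric from the post, and it ignores instruction set complexity, design targets and process details. The data is the list above.

```python
# (name, max clock in MHz, pipeline stages) copied from the list above.
cpus = [
    ("ARM710", 40, 3),
    ("PPC601+", 120, 4),
    ("PPC603", 160, 4),
    ("Pentium P54C", 120, 5),
    ("PPC604", 180, 6),
    ("HP PA-7300LC", 180, 6),
    ("HP PA-8000", 180, 7),
    ("Alpha 21064", 300, 7),
    ("MIPS R4400", 200, 8),
    ("68060", 50, 8),
    ("UltraSPARC", 200, 9),
]


def mhz_per_stage(mhz, stages):
    """Illustrative ratio only: rated clock divided by pipeline depth."""
    return mhz / stages


# Sort from the lowest to the highest clock-per-stage ratio.
ranked = sorted(cpus, key=lambda cpu: mhz_per_stage(cpu[1], cpu[2]))
for name, mhz, stages in ranked:
    print(f"{name}: {mhz_per_stage(mhz, stages):.2f} MHz per stage")
```

On these numbers the 68060 has the lowest ratio in the list, with the 3-stage ARM710 next.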

Last edited by matthey on 30-Jan-2024 at 03:00 PM.
Last edited by matthey on 30-Jan-2024 at 02:57 PM.
Last edited by matthey on 30-Jan-2024 at 07:28 AM.
Last edited by matthey on 30-Jan-2024 at 07:26 AM.

Hypex 
Re: some words on senseless attacks on ppc hardware
Posted on 30-Jan-2024 11:43:25
#788 ]
Elite Member
Joined: 6-May-2007
Posts: 11200
From: Greensborough, Australia

@michalsc

One of the best examples of an AROS port I've seen is AROS hosted on PPC. I installed it a number of years ago and tried it on my XE, testing the demos. Then I took the HDD and connected it to the X1000. It loaded up fine and I ran even more demos!

Hypex 
Re: some words on senseless attacks on ppc hardware
Posted on 30-Jan-2024 12:15:08
#789 ]
Elite Member
Joined: 6-May-2007
Posts: 11200
From: Greensborough, Australia

@kolla

Quote:
Well, everything is more or less automated here, so even if my systems spend a bit of time every now and then, I am not - it's all automated. I just get a notification on my phone (alertzy) when something fails, and I have some gemini pages that show the overall status of my systems.


Most of my time is spent setting up and configuring a build. Native tends to work fine with everything needed. But a cross compile is more likely to break.

I do like automation. But automation needs setting up as well. As a side example, a while back I found what was breaking the PPC repos for Ubuntu PPC, which for a while had given errors about PPC not being supported. Well, technically the packages were still there, but the package lists removed ppc from the arch lists. So they were rejected on a package update. They were unlikely to fix such a simple edit, since they are lazy towards PPC, and in bug reports made excuses even though other archs weren't broken and the files were all deprecated.

So I wrote a script that checked and updated package lists if it found a list. I then went further and found I could hook into an apt update. After getting it all working I then created a package for it so it was automated as much as possible. But, because it was only a simple package, I created it by hand from some package files I made. However, even though it was simple, there was a lot involved I had to fix up to get a clean package. So, next time I'll use a dedicated tool for the job that can ensure the data is in the proper format without needing fixes.

BigD 
Re: some words on senseless attacks on ppc hardware
Posted on 30-Jan-2024 16:40:58
#790 ]
Elite Member
Joined: 11-Aug-2005
Posts: 7322
From: UK

@matthey

Quote:
A Pentium killer would have been a PPC killer though too. PPC also suffered from shallow pipeline designs and too large of caches due to less efficient cache usage which limited clock speeds.


Who cares? We've got PiStorm Emu68 now and Vampire 080 if you're a purist! Motorola were morons and PPC was the wrong course though Cell was cool!

_________________
"Art challenges technology. Technology inspires the art."
John Lasseter, Co-Founder of Pixar Animation Studios

Hammer 
Re: some words on senseless attacks on ppc hardware
Posted on 31-Jan-2024 3:45:25
#791 ]
Elite Member
Joined: 9-Mar-2003
Posts: 5273
From: Australia

@matthey

Quote:

Quake and Doom had many man hours of optimization for large gaming markets. There may be more man hours of optimization for Quake than 68060 support for any compiler. The most common compilers still don't have a 68060 specific instruction scheduler for an in-order CPU. You will have to wait forever with emulation of the 68k Amiga!

PC's Quake exploited Pentium FPU's limited out-of-order FDIV and zero-cycle FXCH instructions.


Quote:

Integer performance is much more important than FPU performance. The 68060 was targeted for embedded use where FPU use is less important than for the desktop.

Desktop computers had an optional FPU until it was integrated into the CPU.

Mandatory CPU instruction set requirements evolve when there's a "killer app".

For Pentium PC's Quake, it's advantageous to use concurrent FPU for geometry processing while integer units handle pixel/texture processing workloads when Sony's PlayStation has a CPU with integer 33 MIPS, geometry co-processor with integer 66 MIPS, and texture units.

PlayStation doesn't have hardware Z-buffer acceleration i.e. it's a software implementation.

Quote:

Few games used the FPU when the 68060 was released.

68060's April 1994 release was useless since 1995 was the year for Amiga's 68060 practical usage.

The Amiga platform's mainstream models weren't able to shift towards the 68040 socket platform. 68060 needs a 3.3V 68040 socket.

Intel and AMD have superior concurrent CPU, chipset, and motherboard management releases. The motherboard platform matters since a new CPU is useless without it.

As modern examples, AMD Zen 5 and Intel Arrow Lake engineering samples are floating around with motherboard partners in preparation for the H2 2024 launch i.e. XMas Q4 2024 is important.

Again, it's advantageous to use concurrent FPU for geometry processing while integer units handle pixel/texture processing workloads.


Quote:

Other FPUs were not fully pipelined for at least double precision (nearly all C code) like in the 486, 68040, AMD K5 (5k86), PowerPC 601, PowerPC 603, MIPS R4000, etc.

Why the lowball targets? Where are DEC Alpha and HP PA-RISC?

It was MIPS R4400 in 1995. LOL

AMD K5 is a failure that was quickly replaced by K6 i.e. AMD purchased NexGen when AMD's K5 chip failed to meet performance and sales expectations. Development of AMD's internal K5 successor was halted in favor of continuing from NexGen's Nx686 designs, eventually becoming K6. K5 reached a clock wall of 133 Mhz.

Am5x86 wasn't a Pentium class CPU and its 586 name and PR rating was controversial i.e. 160 Mhz Am5x86-P100 is roughly equivalent to Pentium 90.

1996 Quake almost singlehandedly killed so-called Pentium clones.

Quote:

Other FPU factors are just as important, especially with the stack based x86 FPU handicap.

FXCH is included with the Pentium's March 1993 release. Release timing is important.

Your argument is almost a nothing burger when Pentium's aggressive release schedule is factored in.

When FPU mattered, Pentium was offered for sale in greater numbers at mainstream price ranges.

Quote:

The original Pentium can't superscalar (multi-issue) FPU instructions with anything except FXCH which was necessary manual register renaming to avoid stalls from reusing the default top of the FPU stack which the previous still executing FPU instruction is using with consecutive pipelined FPU instructions.


The Pentium FPU's FDIV has limited out-of-order behavior, i.e. an FDIV instruction can be processed while non-dependent instructions are processed concurrently. You're forgetting this Pentium feature.

Besides FXCH, PC Quake's geometry workload specifically exploited this Pentium feature.

Pentium FPU has three pipelines i.e. FADD, FDIV, and FMUL with a common dispatcher. FDIV doesn't stall other non-dependent instructions.

For Quake on a Pentium PC, it's advantageous to run geometry processing on the FPU concurrently while the integer units handle pixel processing workloads. By comparison, Sony's PlayStation has a 33 MIPS integer CPU and a 66 MIPS integer geometry co-processor.

Your argument is not realistic for the Quake use case.

Quote:

The pipelined x86 FPU has to execute extra FXCH instructions and needs complex scheduling to take advantage of the fully pipelined FPU.

FXCH is included with the Pentium's March 1993 release. Release timing is important.


Quote:

The 68060 can superscalar issue the most common FPU and integer instructions together and has better timings on some instruction execution latencies, especially ones using memory. This makes instruction scheduling much easier for compilers. How much difference does this make?



The 68060's design does not support concurrent floating point execution; only one of these functional units is active at a time.

https://cdn.preterhuman.net/texts/underground/phreak/68060Info.txt

Pentium Pro (P6) was released on November 1, 1995. The Pentium Pro featured out-of-order execution, including speculative execution via register renaming.

Pentium Pro (P6)'s design made its way to P6-based Celeron, Pentium II, and Xeon product lines.

Intel's Pentium (P5) and Pentium Pro (P6) aggressive release schedules are important factors.

For the Amiga, the 68060 was missing in action during 1994 and was the inferior solution for 1995 Lightwave rendering use cases when compared with e.g. MIPS VR4400, Pentium Pro, and DEC Alpha running Lightwave on Windows NT. Only PowerPC had a chance to survive against the mentioned competition. The Amiga was losing its professional market niche in 1995.

In 1995, NEC released a 200 MHz 0.35 μm version of the VR4400, while Intel released the Pentium Pro at 200 MHz on 0.35 μm.

R4000 was selected to be the microprocessor of the Advanced Computing Environment (ACE). The R4400 was licensed by Integrated Device Technology (IDT, US), LSI Logic (US), NEC(Japan), Performance Semiconductor, Siemens AG (Germany) and Toshiba (Japan). IDT (US), NEC (Japan), Siemens, and Toshiba fabricated and marketed the microprocessor.

Toshiba's Tiger Shark i486-compatible chipset was used with its R4400 CPU. The MIPS camp tried to replace x86 with MIPS.

When FPU mattered for Lightwave and Quake use cases, 68060 was the inferior solution.
Prove me wrong.

Last edited by Hammer on 02-Feb-2024 at 01:49 AM.
Last edited by Hammer on 31-Jan-2024 at 04:04 AM.

_________________
Ryzen 9 7900X, DDR5-6000 64 GB RAM, GeForce RTX 4080 16 GB
Amiga 1200 (Rev 1D1, KS 3.2, PiStorm32lite/RPi 4B 4GB/Emu68)
Amiga 500 (Rev 6A, KS 3.2, PiStorm/RPi 3a/Emu68)

agami 
Re: some words on senseless attacks on ppc hardware
Posted on 31-Jan-2024 3:59:42
#792 ]
Super Member
Joined: 30-Jun-2008
Posts: 1648
From: Melbourne, Australia

@BigD

Quote:
BigD wrote:
... though Cell was cool!

Cell was/is cool, but there's nothing PPC about Cell BE that couldn't be equally cool if it were, say, implemented on MIPS.

There really isn't a case where one could claim "only PowerPC makes it possible", unless one is aiming to make severely outdated and overpriced hardware possible.

_________________
All the way, with 68k

Hammer 
Re: some words on senseless attacks on ppc hardware
Posted on 31-Jan-2024 5:01:47
#793 ]
Elite Member
Joined: 9-Mar-2003
Posts: 5273
From: Australia

@matthey

Quote:

68060+@100MHz (25% performance increase over 68060)
int: ************************************** 1.88
FP: ************************* 1.23

What is the secret sauce in the 68060+? The 68060 die is small enough to double the caches which was impractical with the large die of the Pentium until late 1996 with the P55C and a die shrink. The 68060 could have double the caches at introduction if there was a market for high end 68k CPUs. The 68060+ with more than double the integer performance and better FPU performance than the Pentium while at a more affordable price could have been devastating (68060+ could use cheaper 0.5um CMOS and 32 bit memory where Pentium needed expensive 0.35um BiCMOS, 64 bit memory & perhaps OoO to remain competitive). The original 68060 included many low cost features instead, especially for embedded. The performance was still competitive but could have been much better if designed for performance. A Pentium killer would have been a PPC killer though too. PPC also suffered from shallow pipeline designs and too large of caches due to less efficient cache usage which limited clock speeds.

There's no time scale for the enhanced 68060+, and it's largely vaporware. The gap from the 68040's 1990 release to the 68060's 1994 release shows a 4-year release cycle. A 68060+ released in 1998 would land in the Pentium II era.

A 68060 CPU is useless without a motherboard platform, and AmigaOS 3.x needs a certain motherboard design for its AutoConfig. Good luck with booting an Amiga Kickstart ROM on a piece of Nortel telecom or HP printer equipment.

Btw, Nortel was a major 68K embedded telecom customer and it went bankrupt.

The Pentium Pro 200 MHz was released in 1995, the Pentium MMX in Jan 1997, and the Pentium II in May 1997. The PC gamer favorite Celeron 300A was released in 1998.

NEC released the 200 MHz R4400 in 1995.

The problem with the 68060 Macintosh comment in https://websrv.cecs.uci.edu/~papers/mpr/MPR/ARTICLES/080502.pdf is that the 68060 is not a drop-in replacement in a 68040 Macintosh.

A Mac ROM hack that inserts the 68060SP (Software Package) is needed, since this CPU does not implement all of the 68040 ISA; specifically, it needs handlers for unimplemented-opcode exceptions.

With ShapeShifter, the AmigaOS side provides the 68060SP.

Apple focused on PowerPC.

Quote:

DEC Alpha engineers were not needed to achieve high clock speeds. Deep pipelines were needed. Look at the list again with a few new entries added. Most professional CPU designers were around 200MHz @0.5um with 7+ pipeline stages.

CPU max clock rating @ ~0.5um chip process with pipeline stages
ARM710@40MHz 3-stage
PPC601+@120MHz 4-stage
PPC603@160MHz 4-stage
Pentium P54C@120MHz 5-stage
PPC604@180MHz 6-stage
HP PA-7300LC@180MHz 6-stage
HP PA-8000@180MHz 7-stage
Alpha 21064@300MHz 7-stage
MIPS R4400@200MHz 8-stage
68060@50MHz 8-stage
UltraSPARC@200MHz 9-stage

Wrong. 68060's high clock speed argument is vaporware despite the so-called 8-stage pipeline.

I have a TF1260 with a 68060 Rev 1 and a 68LC060 Rev 4.

I overclocked the 68060 Rev 1 to 74 MHz and it locked up the machine. The same TF1260 card with a 68060 Rev 6 can reach 100 MHz.

Motorola has problems with high clock speed.

You omitted the Pentium Pro 150 MHz (B0, C0), which used the 0.50 μm process.

Both Zen 2 and Zen 3 are fabricated on TSMC's 7 nm and have similar pipeline depth, but Zen 3 can reach higher clock speeds.

Zen 4 and Zen 4C have the same pipeline depth, but Zen 4C has a lower clock speed when compared to Zen 4. Zen 4 is designed for higher clock speed while Zen 4C is designed for low-cost small chip area size with a tighter logic gate layout.

My point, CPU pipeline depth is a factor in reaching high clock speeds, but it's not the only factor.

Real-world facts trumped your dreamland.


Last edited by Hammer on 02-Feb-2024 at 01:52 AM.
Last edited by Hammer on 31-Jan-2024 at 05:22 AM.
Last edited by Hammer on 31-Jan-2024 at 05:08 AM.

Hammer 
Re: some words on senseless attacks on ppc hardware
Posted on 31-Jan-2024 5:25:36
#794 ]
Elite Member
Joined: 9-Mar-2003
Posts: 5273
From: Australia

@BigD

Quote:

BigD wrote:
@matthey

Quote:
A Pentium killer would have been a PPC killer though too. PPC also suffered from shallow pipeline designs and too large of caches due to less efficient cache usage which limited clock speeds.


Who cares? We've got PiStorm Emu68 now and Vampire 080 if you're a purist! Motorola were morons and PPC was the wrong course though Cell was cool!

CELL has crappy real-world IPC, i.e. it's worse than AMD's recycled K8 known as Jaguar.

Hammer 
Re: some words on senseless attacks on ppc hardware
Posted on 31-Jan-2024 5:26:58
#795 ]
Elite Member
Joined: 9-Mar-2003
Posts: 5273
From: Australia

@Hypex

Quote:

Hypex wrote:
@michalsc

One of the best examples of an AROS port I've seen is AROS hosted on PPC. I installed it a number of years ago and tried it on my XE, testing the demos. Then I took the HDD and connected it to the X1000. It loaded up fine and I ran even more demos!

Can AmigaOS 4.x apps run on AROS PowerPC?

Hypex 
Re: some words on senseless attacks on ppc hardware
Posted on 31-Jan-2024 14:09:15
#796 ]
Elite Member
Joined: 6-May-2007
Posts: 11200
From: Greensborough, Australia

@Hammer

In short, no. I was running AROS hosted on Linux, but there is no OS4 support. Interesting idea, though. MOS can run OS4 apps, so with a bit of work AROS could run OS4 apps with limited support.

matthey 
Re: some words on senseless attacks on ppc hardware
Posted on 31-Jan-2024 18:29:21
#797 ]
Super Member
Joined: 14-Mar-2007
Posts: 1999
From: Kansas

BigD Quote:

Who cares? We've got PiStorm Emu68 now and Vampire 080 if you're a purist! Motorola were morons and PPC was the wrong course though Cell was cool!


agami Quote:

There really isn't a case where one could claim "only PowerPC makes it possible", unless one is aiming to make severely outdated and overpriced hardware possible.


Emulation and FPGA cores are the path to Amiga extinction as performance is a fraction of modern processors. The IBM PPC Cell processor CPU cores also had a fraction of modern CPU performance and that is why PPC is no longer used in consoles. The Cell CPU performance really is that bad. The much older 68060 outperforms the Cell CPU per clock with 1/4 of the L1 caches.

CPU | DMIPS/MHz
Cell 0.6 (PS3)
Xenon 0.6 (XBox 360)
PPC603e 1.4
PPC601 1.4
68060 1.8
PPC7410 1.8 (early G4)
e300 1.9 (Efika)
PPC440/460EX 2.0 (Sam440/460)
PA6T 2.2 (X1000)
PPC750 2.3 (G3)
Gekko 2.3 (Nintendo Gamecube)
Broadway 2.3 (Nintendo Wii)
Espresso 2.3 (Nintendo Wii U)
QorIQ-P1022/e500v2 2.4 (A1222)
PPC970 2.9 (G5)
P5020 3.0 (X5000)
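
The per-clock figures above can be turned into rough absolute numbers by multiplying by clock speed. The following is only a minimal sketch of that arithmetic: the function names are made up for illustration, and the 3200 MHz Cell clock is an assumption that does not come from the table. Dhrystone only measures integer code, so this says nothing about SIMD throughput.

```python
# DMIPS/MHz values copied from a subset of the table above.
DMIPS_PER_MHZ = {
    "Cell": 0.6,
    "68060": 1.8,
    "PPC750": 2.3,
    "P5020": 3.0,
}


def dmips(cpu: str, clock_mhz: float) -> float:
    """Estimated Dhrystone MIPS = per-clock rating x clock speed."""
    return DMIPS_PER_MHZ[cpu] * clock_mhz


def per_clock_ratio(cpu: str, reference: str) -> float:
    """How many times faster `cpu` is than `reference` at the same clock."""
    return DMIPS_PER_MHZ[cpu] / DMIPS_PER_MHZ[reference]


def equivalent_clock_mhz(cpu: str, clock_mhz: float, reference: str) -> float:
    """Clock the `reference` core would need to match `cpu` at `clock_mhz`."""
    return dmips(cpu, clock_mhz) / DMIPS_PER_MHZ[reference]


if __name__ == "__main__":
    # ASSUMPTION: 3200 MHz for the PS3's Cell; this clock is not in the table.
    cell_clock = 3200
    print(f"68060 vs Cell per clock: {per_clock_ratio('68060', 'Cell'):.2f}x")
    print(f"Cell @ {cell_clock} MHz: {dmips('Cell', cell_clock):.0f} DMIPS")
    needed = equivalent_clock_mhz("Cell", cell_clock, "68060")
    print(f"1.8 DMIPS/MHz core clock needed to match: {needed:.0f} MHz")
```

Under these assumptions a 1.8 DMIPS/MHz core would need roughly a third of the Cell's clock to match its Dhrystone result.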

The Cell PPC CPU performs poorly in 7-zip benchmarks as well.

single core | compression/MHz | decompression/MHz
IBM_Cell_PPE 0.23 0.33 (PS3)
Cortex-A53 0.56 0.92 (RPi 3, A500 Mini, A600GS)
Cortex-A55 0.63 1.03
SiFive_U74 0.70 0.92 (RISC-V 2.64 DMIPS/MHz core)

IBM_PPC_G5 0.49 0.82
IBM POWER9 1.08 0.83

https://www.7-cpu.com/

The first 4 cores in the 7-zip benchmark above are in-order core designs while the last 2 are OoO cores. In-order cores are smaller and simpler, with the 3 newer in-order cores used in sub $100 USD hardware, while the PPC OoO cores were expensive to develop and produce. The newer in-order cores destroy Cell, while the SiFive U74 comes surprisingly close to the 2019 POWER9, which uses a 14nm chip process that is smaller than the one used for most of the in-order cores.

The SiFive U74 in-order core design is as close as RISC cores come to the 68060 design, but CISC instructions, which are the equivalent of two RISC instructions, can be executed out of each execution pipeline each cycle while avoiding more multi-cycle load-to-use stalls, which is the purpose of the CISC-like U74 design to begin with. A SiFive U74 core could likely reach 3 DMIPS/MHz executing CISC instructions like the 68k uses (U74 and 68060 cores can execute the equivalent of 5 RISC instructions/cycle using CISC instructions).

Some people may think the RPi 4 with OoO ARM Cortex-A72 and RPi 5 with OoO ARM Cortex-A76 have surpassed in-order performance and won the core wars, but these OoO cores are several times larger, use several times the power (leaving less for the GPU and requiring more expensive power supplies with cooling fans), and are much more complex and expensive to develop, with increased security risks. 3 DMIPS/MHz in-order cores have a large cost advantage which can be leveraged, and an SBC with a good GPU but moderately weaker CPU cores is likely to be more impressive. The VisionFive2, using SiFive U74 CPU cores which are higher performance than the RPi 3 and a better GPU than any of the RPi models, is already competitive at $89.99 USD for the 8GiB SBC, but RISC-V lacks the software to take advantage.

https://www.amazon.com/VisionFive-RISC-V-StarFive-JH7110-Quad-core/dp/B0BGM6STN8

The IBM Cell PPC CPU design strategy was in many ways opposite of the SiFive U74 strategy. The U74 (and 68060) design strategy is a practical and easy to program design that reduces or eliminates stalls making instruction scheduling simple. The IBM Cell CPU design strips out complexity and leaves it up to the programmer and compiler to manually find performance which is often too hard or impossible, certainly for compilers as benchmarks show. The Cell CPU was deliberately clocked up beyond what was practical to increase the SIMD throughput. Most integer instructions take at least 2 clock cycles, there is no barrel shifter (like 68000) and the pipeline stretched to 23 stages (deeper pipelines allow higher clock speeds but branch mispredict penalties increase) with long stalls common but the theoretical maximum performance was high using SIMD operations. Games recompiled for the Cell usually fell flat on their face as the recompiled result had the 3+ GHz CPU performing more like a 1 GHz CPU. With more programming work, the SPEs provided more SIMD performance but these are similar to GPU unified shaders but can't be used for most GPU work. It's nice having more parallel SIMD performance on the CPU side of the bus and CPU/GPU memory divide but the later PS4 used a Heterogeneous System Architecture (HSA) that shares CPU and GPU memory reducing the cost of using the unified shaders for parallel SIMD operations.

https://en.wikipedia.org/wiki/Unified_shader_model
https://en.wikipedia.org/wiki/Heterogeneous_System_Architecture

The PS3 was likely evaluated for AmigaOS 4 use. There was a partially working port that I expect was terminated at least partially due to poor CPU performance. The Nintendo PPC cores are more practical and general purpose although the hardware was not as high end as Sony PS hardware but very much as closed and protected. The closed hardware really takes away from modern consoles and their value despite reasonable cost and x86-64 hardware. Web browsers are restricted or non-existent, a mouse can't be used for games like Baldur's Gate 3 despite USB support and practically no productivity software can be used on them. The x86-64 hardware gave them single core performance and easier porting but they run hot and are more expensive than they should be. A more open budget console using cheaper and cooler in-order CPU cores with a HSA system and nice GPU could be an attractive alternative if priced around $100 USD. A system that had retro game software and appeal with a CISC CPU architecture that could maximize the performance of in-order CPU cores is compelling. It's too bad emulation of the 68k Amiga and 68k FPGA CPUs are seen as the way forward when they are acceptance of Amiga extinction.

Last edited by matthey on 31-Jan-2024 at 06:41 PM.

BigD 
Re: some words on senseless attacks on ppc hardware
Posted on 31-Jan-2024 20:31:09
#798 ]
Elite Member
Joined: 11-Aug-2005
Posts: 7322
From: UK

@matthey

Quote:
It's too bad emulation of the 68k Amiga and 68k FPGA CPUs are seen as the way forward when they are acceptance of Amiga extinction.


No one is going to buy a £1000+ PPC AmigaOne from outside the elitist "Classes not the Masses" AmigaOne fanboydom. The Cell had its use and brought us TLOU ahead of its time in 2013. That's good enough for me!

_________________
"Art challenges technology. Technology inspires the art."
John Lasseter, Co-Founder of Pixar Animation Studios

matthey 
Re: some words on senseless attacks on ppc hardware
Posted on 31-Jan-2024 23:11:56
#799 ]
Super Member
Joined: 14-Mar-2007
Posts: 1999
From: Kansas

Hammer Quote:

PC's Quake exploited Pentium FPU's limited out-of-order FDIV and zero-cycle FXCH instructions.


Sure, separate FPU units act like OoO which is nothing special (68k FPUs were similar). The "zero-cycle FXCH" is necessary for the stack based x86 FPU to take advantage of the FPU pipelining as most FPU instructions used the top of stack register/variable creating dependency chains using the result of the previous instruction which is commonly still executing for multi-cycle latency instructions. While the Pentium was issuing/executing FXCH instructions, the 68060 could execute integer instructions instead which improved performance for common mixed integer and FPU code. Quake likely used hand coded assembler FPU inlines to take advantage of the pipelined FPU but the advantage is partially offset by FPU advantages of the 68060 like better mixed code and memory handling, shorter instruction latencies in some cases and a cleaner FPU ISA. The Pentium FDIV instruction is not pipelined like most FPUs and has a longer latency than the 68060 so the 68060 has a small advantage with FDIV.

FPU instruction latency/throughput of at least double precision instructions
CPU | FADD | FMUL | FDIV | FPU Pipelining
68060 | 3/3 | 3/3 | 37/37 | no
Pentium | 3/1 | 3/1-2 | 39/39 | yes
80486 | 8-20/8-20 | 16/16 | 73/73 | no
R4000 | 4/3 | 8/4 | 36/36 | partial
Alpha | 4/1 | 4/1 | 61/61 | yes
PPC601 | 4/1 | 4/2 | 31/29 | partial

Intel Reveals Pentium Implementation Details (see table 3)
https://websrv.cecs.uci.edu/~papers/mpr/MPR/ARTICLES/070402.pdf

FADD and FMUL are far more common and important than FDIV. In some cases, there are separate and independent units that can execute FPU instructions in parallel like FADD and FDIV regardless of FPU pipelining.
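
To make the latency/throughput columns concrete, here is a rough sketch of how they translate into cycle counts. It assumes the second number is the issue interval in cycles (how often a new instruction of that type can start), and it ignores FXCH overhead, memory operands and everything else; the function names are hypothetical.

```python
def independent_ops_cycles(n: int, latency: int, issue_interval: int) -> int:
    """Cycles to finish n independent FPU ops of one type.

    The first result appears after `latency` cycles; each further op can
    start `issue_interval` cycles after the previous one.
    """
    if n <= 0:
        return 0
    return latency + (n - 1) * issue_interval


def dependent_chain_cycles(n: int, latency: int) -> int:
    """Cycles for n ops where each needs the previous result (no overlap)."""
    return max(n, 0) * latency


# FADD latency/issue interval taken from the table above.
FADD = {"68060": (3, 3), "Pentium": (3, 1)}

if __name__ == "__main__":
    n = 10  # hypothetical number of FADDs, for illustration only
    for cpu, (lat, interval) in FADD.items():
        indep = independent_ops_cycles(n, lat, interval)
        chain = dependent_chain_cycles(n, lat)
        print(f"{cpu}: {n} independent FADDs = {indep} cycles, "
              f"dependent chain = {chain} cycles")
```

In this simplified model the pipelined FPU only pulls ahead when consecutive instructions are independent; a dependent chain costs the same on both, which is why scheduling (and FXCH on x86) matters.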

Hammer Quote:

68060's April 1994 release was useless since 1995 was the year for Amiga's 68060 practical usage.


Yes, C= committed suicide before the 68060 could make a difference. Their upper management was unlikely to buy many if any 68060 CPUs when they were new and expensive. Their cheapness is why they didn't have a high end Amiga market since the 1980s. Even if C= had survived, it may not have been enough to keep the 68k at least somewhat competitive. The AIM alliance was formed in late 1991 while C= was alive. Most likely, Motorola would have tried to get C= to adopt PPC as well but savvy C= leadership would have licensed the 68k baby on the cheap that Motorola threw out with the bathwater. At least Intel was smart enough to keep and continue x86 development with their Itanic mistake but Motorola clipped the wings of their baby to make PPC look better but PPC couldn't fly.

Hammer Quote:

Again, it's advantageous to use concurrent FPU for geometry processing while integer units handle pixel/texture processing workloads.


There was more room for concurrent/ parallel operations on chip already by the late 1990s. T&L/TCL came to the desktop in late 1999 which reduced the FPU workload.

https://en.wikipedia.org/wiki/Transform,_clipping,_and_lighting

The 68060 just had a bare bones Spartan FPU which was a reasonable decision to reduce costs and undercut the Pentium in cost. The ISA was adequate and the performance was mediocre. It could easily be improved. FPU pipelining with FPU register renaming gives synergies and retains the 68k ease of programming so it made sense to wait until there is enough room for both on an affordable chip. Doubling the L1 caches would have been a better use of transistors and would have improved the performance of both integer and FPU instructions. I believe that is what they were going after with all the transistor frugalness for the 68060+.

Hammer Quote:

Why the lowball targets? Where are DEC Alpha and HP PA-RISC?


I thought we were talking about desktop FPUs and not workstation FPUs. There was a variety of pipelined, non-pipelined and partially pipelined FPUs on different CPUs at the time (as I indicated above). FPU performance was not as important for desktop use as for workstations.

Hammer Quote:

AMD K5 was a failure that was quickly replaced by K6, i.e. AMD purchased NexGen when AMD's K5 chip failed to meet performance and sales expectations. Development of AMD's internal K5 successor was halted in favor of continuing from NexGen's Nx686 design, which eventually became the K6. K5 reached a clock wall of 133 MHz.

The Am5x86 wasn't a Pentium-class CPU, and its 586 name and PR rating were controversial, i.e. the 160 MHz Am5x86-P100 is roughly equivalent to a Pentium 90.

In 1996, Quake almost single-handedly killed the so-called Pentium clones.


The AMD K5 was a decent budget CPU. Intel had a good reputation so it was easier and cheaper to attack the budget market and take market share than try to outclass the Pentium in performance. It still spanked the in-order Pentium in integer performance and had enough FPU performance to play Quake kind of like the 68060 but not as good because of the stack based x86 FPU ISA. The in-order Pentium topped out at a low 120MHz using a 0.5um chip process as I recall which led to an Intel die shrink and an OoO design. That is when Intel tried to give up the dynamic logic and return to a static design to lower cost but quickly gave it up again for performance.

Hammer Quote:

FXCH is included with the Pentium's March 1993 release. Release timing is important.

Your argument is almost a nothing burger when Pentium's aggressive release schedule is factored in.

When FPU mattered, Pentium was offered for sale in greater numbers at mainstream price ranges.


FXCH wouldn't exist with a better FPU ISA. Intel losing margin because of AMD's increased competitiveness and having to lower prices wasn't a good thing for Intel. Intel tried to shake competitors out of the market using their previously exorbitant margins and were partially successful. The 68060 never had Pentium margins but embedded margins are much lower. The high end embedded market was not large enough for the economies of scale to finance a 68060 competitor. A healthy C= and Apple using the 68060 may have been enough, but C= committed suicide and Apple jumped the fence to less green PPC pastures, almost dying in the process, but they were bailed out by Microsoft.

Hammer Quote:

Pentium FPU's FDIV has limited out-of-order i.e. FDIV instruction can be processed while non-dependent FP instructions are concurrently processed. You're forgetting the Pentium feature.

Besides FXCH, PC Quake's geometry workload specifically exploited this Pentium feature.

Pentium FPU has three pipelines i.e. FADD, FDIV, and FMUL with a common dispatcher. FDIV doesn't stall other non-dependent FP instructions.


Having separate pipelined units for FADD, FMUL and FDIV is a significant advantage. I believe the 68060 combines the FMUL and FADD unit and I don't know if the FDIV unit can execute an instruction in parallel. FXCH is a kludge to allow a benefit from pipelining with the stack based x86 FPU ISA. There is a reason it was replaced by the SIMD unit. The Bytemark benchmark shows the 68060 isn't far behind in FPU performance and would likely be on par or better in performance with a 25% clock speed advantage or double the caches (ala 68060+) and both could be done together for more of an advantage.

Hammer Quote:

When FPU mattered for Lightwave and Quake use cases, 68060 is the inferior solution.
Prove me wrong.


I improved the vbcc compiler backend support for the 68060. Frank Wille just happened to compile the Bytemark benchmark and record 68060 results not knowing the FPU results were nearly on par with a Pentium at the same clock speed. My changes were only 68060 support code changes with the largest gains likely from eliminating the use of trapped instructions. This does not include changes to the 68k backend for the FPU, FPU arguments are still passed on the stack and there is still no instruction scheduler so it is possible the 68060 FPU outperforms the in-order Pentium at the same clock speed. The Bytemark FPU benchmark index is actually a composite of several realistic FPU benchmarks but there are some algorithms where the 68060 would lack performance like heavy transcendental (trigonometry related) math. Hand coded FPU assembler for the pipelined Pentium FPU would give an advantage in some cases as well like with matrix math. Most FPU code is mixed integer/FPU code even for Quake where the 68060 can be surprisingly competitive with the Pentium even with compiler generated code from a far from major compiler. Any more proof won't happen with Amiga being an emulated EOL platform.

Hammer Quote:

Wrong. 68060's high clock speed argument is vaporware despite the so-called 8-stage pipeline.

I have a TF1260 with a 68060 Rev 1 and a 68LC060 Rev 4.

I overclocked the 68060 Rev 1 to 74 MHz and it locked up the machine. The same TF1260 card with a 68060 Rev 6 can reach 100 MHz.

Motorola has problems with high clock speed.


Motorola had problems with the 68040 hot running chip which is well known. The 68060 runs cool and they were testing 68060@66MHz versions before it was released. Motorola produced PPC CPUs well over 100 MHz using the same process. Most of Motorola's competitors were producing 150+ MHz parts with 7-9 stage core designs using the same process.

Hammer Quote:

You omitted the Pentium Pro 150 MHz (B0, C0), which used the 0.50 μm process.


The Pentium (P55C) MMX increased the pipeline by one stage from 5 to 6. The PPro had a pipeline length of 14 stages. Yes, with a 14 stage pipeline it should be possible to get 150MHz using a 0.5um process. Intel learned how to increase clock speeds with deeper pipelines. The 14 stages was almost too long for the time (like P4 and Cell). Pipeline refills are more expensive especially mispredicted branches requiring more branch prediction logic and more stages require more transistors as well. Intel x86 cores were already fat and this made them fatter where the transistors likely would have been better spent for caches. Super deep pipelining increases instruction level parallelism and superpipelining was a new design concept back then that was pushed past the limits of practicality. The 8 stage pipeline of the 68060 is more practical and is the most popular pipeline length as the most popular CPU core today, the ARM Cortex-A53, uses it.
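
The trade-off between pipeline depth and mispredicted branches can be sketched with the usual back-of-the-envelope CPI formula. Everything numeric below except the pipeline depths (8, 14 and 23 stages, as mentioned in this thread) is a made-up assumption for illustration, and charging a full pipeline refill per mispredict is a pessimistic simplification, since real cores usually resolve branches before the last stage.

```python
def effective_cpi(base_cpi: float,
                  branch_fraction: float,
                  mispredict_rate: float,
                  flush_penalty_cycles: float) -> float:
    """Average cycles per instruction including branch mispredict stalls."""
    return base_cpi + branch_fraction * mispredict_rate * flush_penalty_cycles


# Pipeline depths quoted in the thread; used here as the flush penalty.
PIPELINE_STAGES = {"68060": 8, "Pentium Pro": 14, "Cell": 23}

if __name__ == "__main__":
    # ASSUMPTIONS (hypothetical workload, identical for all cores):
    base_cpi = 1.0          # ideal CPI without mispredicts
    branch_fraction = 0.20  # 1 in 5 instructions is a branch
    mispredict_rate = 0.10  # 10% of branches are mispredicted
    for cpu, stages in PIPELINE_STAGES.items():
        cpi = effective_cpi(base_cpi, branch_fraction, mispredict_rate, stages)
        print(f"{cpu}: {stages} stages -> effective CPI {cpi:.2f}")
```

With identical prediction accuracy the deeper pipeline loses more cycles per instruction, so it needs either a higher clock or better branch prediction to come out ahead.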

Hammer Quote:

My point, CPU pipeline depth is a factor in reaching high clock speeds, but it's not the only factor.

Real-world facts trumped your dreamland.


There are other factors than pipeline length and chip process that determine max clock speeds. Each pipeline stage has to be balanced with the slowest logic of the slowest stages optimized, partially moved to another stage or more stages added. This takes time and effort which was likely reduced for the 68060. Motorola was better than a 50MHz 8 stage CPU @0.5um which would be embarrassing if they were trying and completely out of line with any other similar professionally developed CPU core.
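
A toy model of why stage balancing matters: the clock period is set by the slowest stage plus a fixed latch/clock overhead, so two 8-stage designs with the same total logic delay can end up with very different maximum clocks. All delay values below are hypothetical and are not measurements of any real CPU.

```python
def max_clock_mhz(stage_delays_ns, latch_overhead_ns=0.5):
    """Maximum clock in MHz: the slowest stage plus latch overhead sets the period."""
    cycle_ns = max(stage_delays_ns) + latch_overhead_ns
    return 1000.0 / cycle_ns


if __name__ == "__main__":
    # HYPOTHETICAL 8-stage pipelines with the same 20 ns of total logic.
    unbalanced = [2.0] * 7 + [6.0]  # one slow stage limits the whole core
    balanced = [2.5] * 8            # logic spread evenly across stages
    print(f"unbalanced: {max_clock_mhz(unbalanced):.0f} MHz")
    print(f"balanced:   {max_clock_mhz(balanced):.0f} MHz")
```

In this sketch the stage count is the same in both cases; only the effort spent on moving logic out of the slowest stage changes the result.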

Last edited by matthey on 31-Jan-2024 at 11:59 PM.

agami 
Re: some words on senseless attacks on ppc hardware
Posted on 31-Jan-2024 23:22:00
#800 ]
Super Member
Joined: 30-Jun-2008
Posts: 1648
From: Melbourne, Australia

@BigD

It's always about high performance with these guys.

We said the Cell CPU was cool in reference to its architecture, not because it had top performance.
In many ways it was a curious choice for Sony to put it in a game console, even if it was intended as more than just a game console.
By its design, it was more suited to cluster computing.

As it happens, around the turn of the millennium, when consumer-grade CPUs reached 1GHz, I spent some time working on a cellular computing architecture. Different to how IBM/Toshiba/Sony ended up doing it, but similar in some of the philosophies.

The main change in philosophy is what I would liken to the Navy SEAL's mantra of "Slow is Smooth, Smooth is Fast".

Copyright (C) 2000 - 2019 Amigaworld.net.
Amigaworld.net was originally founded by David Doyle