matthey 
Re: One major reason why Motorola and 68k failed...
Posted on 16-May-2024 2:38:15
#61 ]
Elite Member
Joined: 14-Mar-2007
Posts: 2270
From: Kansas

Lou Quote:

I was comparing to the 68000, not a 14MHz 68020 w/AGA. Still, the AGA version looks jittery.
Eye of the Beholder, 128-enhanced, uses 2 screens and has other features not in other ports. Eye of the Beholder 2 is a different game.


The C128 version of Eye of the Beholder is no doubt one of the best conversions. I still think the Amiga version is better due to significantly better graphics and performance. The C128 graphics look amazing for that machine and the added features are nice but playability suffers with slowdowns. Even with a 6502@20MHz and SRAM, the Amiga graphics and sound still have the advantage. I like the SID chip which sounds good but it is not as versatile as the Amiga for sound especially for realistic sounds.

Lou Quote:

My point in comparing the TG-16 to the Genesis/Mega Drive port vs port is that, other than differences between the ports, most of the time the 6502-based TG-16 was better. Many complaints about the TG-16 were simply about lack of parallax scrolling in the background, but that shows developer weakness, not system weakness, as Shadow Of The Beast and other games had it. The TG-16 version had more sprites, and flicker-free sprites at that, than the Genesis/Mega Drive version. Being able to handle more objects is a CPU function. Many games slow down in these situations.


Let's compare Shadow of the Beast between the Sega and TG-16.

Genesis vs. TurboGrafx-16! *33* Games Compared!
https://youtu.be/qOXTWCSAyDs?t=1855

Amiga wins! Each system is better in different areas.

TG-16
+ many sprites
+ colorful and vibrant colors
+ good music
- seems lower resolution
- sprites/objects tend to be small
- sound effects could be better

Sega Genesis
+ large sprites/objects
+ good scrolling
+ good resolution
+ good sound effects
- lacks colors and colors are flat
- poor music

Amiga
+ good scrolling
+ good resolution
+ versatility like large blitter objects and copper effects
+ good music
+ good sound
- slow chip memory (low bandwidth)
- few sprites and sprite features
- not enough channels for music and sound together

The differences have more to do with the hardware and not the CPU. I can't see any advantage in performance from a particular CPU. The biggest CPU difference may be that the 68k is easier to program but there were many experienced 6502 programmers. The 68k benefits the Amiga more with the large flat address space that keeps the Amiga competitive despite less hardware by using more memory.

OneTimer1 Quote:

The 68k
- is not very effective when it comes to Shift operation,


The 68000 does not have a barrel shifter meaning each bit shifted takes at least one cycle. The 68000 already used a large die and there just wasn't enough silicon space without removing the microcoding which would have taken much longer to develop. The lack of barrel shifter in the 68000 was one of the primary motivations for adding the blitter to the Amiga. The barrel shifter in the 68020 was a nice upgrade as it not only reduced shift cycles but it allowed addressing mode scaled index registers which significantly improved code density, reduced the number of instructions to execute and reduced register pressure. The 68k as a family has no problem with shifts. The 68060 design has a barrel shifter in each execution pipe (OEP) and can superscalar execute 2 shifts per cycle which is better than most RISC CPU designs. Each 68060 OEP uses a barrel shifter for both shift instructions and addressing modes where RISC designs usually have separate load/store units which handle addressing modes so barrel shifters in the execution pipes are only used for executing shift instructions and usually only one execution pipe has one. Also, having a barrel shifter in each OEP makes superscalar instruction scheduling easier on the 68060 than most in-order RISC CPUs. Even OoO RISC CPUs may only be able to execute one shift per cycle.
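To put a number on the missing barrel shifter, here is a minimal sketch of how the cost of a register shift on the 68000 grows with the shift count. It assumes the commonly cited user's manual timings of 6 + 2n cycles for byte/word operands and 8 + 2n cycles for long operands; the function name is made up for illustration.

```python
def shift_cycles_68000(count, long_operand=False):
    """Clock cycles for a register shift such as LSL #n,Dn on a 68000.

    Assumption: 6 + 2n cycles for byte/word operands and 8 + 2n cycles
    for long operands, where n is the shift count.
    """
    if count < 0:
        raise ValueError("shift count must be non-negative")
    base = 8 if long_operand else 6
    return base + 2 * count


if __name__ == "__main__":
    # The cost rises linearly with the shift count, unlike a barrel shifter.
    for n in (1, 4, 8):
        print(n, shift_cycles_68000(n), shift_cycles_68000(n, long_operand=True))
```

With a barrel shifter the count no longer affects the cycle cost, which is why the 68020 and later cores do not show this linear growth.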

OneTimer1 Quote:

- memory access is wasting a clock cycle when looking for the bus,


I'm not a hardware guy but more time between memory accesses should allow the use of slower and cheaper memory. Maybe there are better or other reasons though. I wouldn't assume it was a design oversight.

OneTimer1 Quote:

- address misalignment is an exception, which you can see as a design flaw; it never happens on x86.


There is nothing that mandates CISC ISAs to handle all address misalignment in hardware. The 68020 had to allow 32 bit misalignment to retain 68000 compatibility but the developers chose to go ahead and allow all misalignment perhaps influenced by competition from x86. In contrast, RISC research designs and most early RISC ISAs did not support any misalignment in hardware due to the RISC philosophy of hardware simplification. As I recall, MIPS and ARM originally did not support misalignment in hardware while PPC was one of the early RISC architectures to often support misalignment in hardware but I don't believe it was a requirement in the ISA.
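The difference for data accesses can be sketched as a simple rule. This is a simplified model, not a full description of either chip: it assumes the 68000 raises an address error for word or long accesses at odd addresses, and that the 68020 handles misaligned data accesses in hardware. The function name and CPU labels are illustrative only.

```python
def data_access_faults(address, size_bytes, cpu):
    """Return True if a data access would raise an address error.

    Simplified model (assumption): the 68000 faults on 16-bit or 32-bit
    accesses at odd addresses; the 68020 accepts any data alignment.
    """
    if size_bytes not in (1, 2, 4):
        raise ValueError("size must be 1, 2 or 4 bytes")
    if cpu == "68000":
        return size_bytes > 1 and address % 2 == 1
    if cpu == "68020":
        return False
    raise ValueError("unknown cpu label")


if __name__ == "__main__":
    print(data_access_faults(0x1001, 2, "68000"))  # word at odd address
    print(data_access_faults(0x1001, 2, "68020"))
```

On the Amiga the 68000 case is what shows up as a Guru Meditation with error 3.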

OneTimer1 Quote:

- 16 MHz versions were not available when the A1000 was released.


Correct. There were only 8MHz, 10MHz and 12MHz rated 68000 HMOS chips. The first HCMOS 68HC000s were produced in 1986 allowing 16MHz and eventually 20MHz rated chips. The CMOS 68EC000 was first produced in 1991 but there were no higher clock ratings. The HCMOS 68020 became available in 1984 and had speed ratings of 12MHz, 16MHz, 20MHz, 25MHz and 33MHz but a 68020 Amiga would have required significant hardware changes, likely delayed the Amiga and raised the price a lot.

OneTimer1 Quote:

I'm not sure if the current 68EC000 fixes all these problems, but it can run in 8 or 16 bit mode; the 68010 has enough cache for small loops, making it possible to compete with x86 memory copy loops.
Some soft cores have faster 68k compatible versions, delivering more speed with compatible address buses.


I believe the 68EC000 maintained the timings of the original 68000 for compatibility reasons. I'm not so sure the 68008 timings were maintained in 8 bit data bus mode though. The following manual has details on the different 68000 chips.

https://cdn.hackaday.io/files/1805367724052224/M68000UM_AD_M68000_Microprocessor_Users_Manual_Rev8_1993.pdf

Most 68k FPGA cores do not maintain 68000 timings. The fx68K core attempts to maintain cycle exact 68000 timing though.

https://github.com/ijor/fx68k

Last edited by matthey on 16-May-2024 at 03:11 AM.

Gunnar 
Re: One major reason why Motorola and 68k failed...
Posted on 16-May-2024 7:08:37
#62 ]
Cult Member
Joined: 25-Sep-2022
Posts: 512
From: Unknown

@OneTimer1


Quote:
The 68000 did come out in 1979.
Why compare with an 8 year old CPU?


Quote:
The 68k
- is not very effective when it comes to Shift operation,
- memory access is wasting a clock cycle when looking for the bus,
- address misalignment is an exception, which you can see as a design flaw; it never happens on x86.
- 16 MHz versions were not available when the A1000 was released.


Yes, the 68000 has some limitations.
The later models improved a lot.

For example:
- The 68020 is more efficient on the bus. It needs fewer cycles for memory access
- The 68020 has a 32bit memory bus (higher memory bandwidth)
- The 68020 has an instruction cache. This improves performance.
- The 68020 has more useful instructions like 64bit mul
- The 68020 can access ODD memory addresses with WORD or LONG (no GURU 3)
- The 68020 is available at much higher clock rates

The 68020 is much more powerful than the 68000.


If you compare chips then I would compare chips of the same time.
The ARM came out after the 68020 came out. Pretty close to the time the 68030 came out.
I would compare it with either of those.



matthey 
Re: One major reason why Motorola and 68k failed...
Posted on 16-May-2024 19:07:08
#63 ]
Elite Member
Joined: 14-Mar-2007
Posts: 2270
From: Kansas

@all
I have updated the DMIPS/MHz ratings using Motorola official numbers which makes the 68k CPUs closer to ARM2. The results make sense as the 68020 and 68030 have the same pipeline depth but caches which ARM2 does not.

year | in-order CPU (core) | DMIPS/MHz | pipeline
1979 | 68000 | 0.16 | 1-stage
1984 | 68020 | 0.30 | 3-stage
1986 | ARM2 | 0.35 | 3-stage (Be careful of claims like 0.50 which are likely VAX MIPS)
1987 | 68030 | 0.36 | 3-stage
1990 | 68040 | 1.10 | 5-stage
1994 | 68060 | 1.80 | 8-stage
2005 | Cortex-A8 | 2.0 | 13-stage
2011 | Cortex-A7 | 1.9 | 8-stage (went back to more practical 8-stage pipeline)
2012 | Cortex-A53 | 2.3 | 8-stage
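A per-MHz rating only turns into absolute throughput once a clock is chosen. As a rough sketch, the table can be scaled like this. The DMIPS/MHz values are copied from the table; the clock speeds in the example are assumptions picked for illustration, not figures from the sources, and the helper name is made up.

```python
# DMIPS/MHz ratings copied from the table above.
DMIPS_PER_MHZ = {
    "68000": 0.16,
    "68020": 0.30,
    "ARM2": 0.35,
    "68030": 0.36,
    "68040": 1.10,
    "68060": 1.80,
    "Cortex-A8": 2.0,
    "Cortex-A7": 1.9,
    "Cortex-A53": 2.3,
}


def dmips(core, mhz):
    """Absolute DMIPS = (DMIPS per MHz) x (clock in MHz)."""
    return DMIPS_PER_MHZ[core] * mhz


if __name__ == "__main__":
    # Example clocks are assumptions chosen only to show the scaling.
    for core, mhz in (("68000", 7.14), ("ARM2", 8), ("68060", 50)):
        print(f"{core} @ {mhz} MHz: {dmips(core, mhz):.2f} DMIPS")
```

The same scaling works in reverse: dividing a measured DMIPS score by the clock gives the per-MHz rating, which is what makes cores from different years comparable.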

I had previously used the numbers from the 68k chips faq which I believe are reliable but Motorola likely used a different compiler or compiler version.

http://www.faqs.org/faqs/motorola/68k-chips-faq/

The new numbers come from the Motorola internal product overview.

https://marc.retronik.fr/motorola/68K/68000/High-Performance_Internal_Product_Portfolio_Overview_with_Mask_Revision_[MOTOROLA_1995_112p].pdf

No DMIPS is given for the 68060 in the link above so I used the number from the following 1994 Microprocessor Report about the 68060 being released.

https://websrv.cecs.uci.edu/~papers/mpr/MPR/ARTICLES/080502.pdf

The ARM2 DMIPS/MHz score is from multiple sources which mostly agree including the Acorn Archimedes talk video which I already linked.

36C3 - The Ultimate Acorn Archimedes talk
https://www.youtube.com/watch?v=Hf67JYkUCHQ

The ARM2 performance is very impressive considering the size of the core but it isn't as mythical as the hype.

Last edited by matthey on 16-May-2024 at 07:10 PM.

Hammer 
Re: One major reason why Motorola and 68k failed...
Posted on 17-May-2024 4:25:27
#64 ]
Elite Member
Joined: 9-Mar-2003
Posts: 5859
From: Australia

@Lou

Quote:

Lou wrote:
If you all really want to fight...then:

A 1 MHz 6502 was just as fast as a 7.14 MHz 68000 because many instructions on the 68000 take 8 cycles. :P

Since a C128 in C64 mode can run at closer to 1.2 MHz, Sonic on the C64 on 128 hardware really outperforms all other base Amiga platformers until you get to the A1200.

A 6502 is just a RISC 6800. (6800 - not 68000).

The W65C02S was able to go to 14MHz in 1983.

The TurboGrafx-16/PC Engine used a variant that was running at 7.16MHz.
This is why the SNES using a 16ish-bit variant 3.57 MHz 65C816 ran circles around the Amiga.

A C128 with a REU with supported games like Sonic = Blast Processing! :)
https://www.youtube.com/watch?v=L4CGwp4N9xg


The 6502 and 65C816 didn't have a 32-bit programming model to host a 32-bit OS. Amiga's custom chips handled the multimedia workloads.

ARM2 was too late for the A1000's 1985 launch.

Commodore could have revised the 6502 with wider 16-bit and 32-bit ALU models since Commodore owned the 65xx intellectual property.

Last edited by Hammer on 17-May-2024 at 05:05 AM.

_________________
Amiga 1200 (rev 1D1, KS 3.2, PiStorm32/RPi CM4/Emu68)
Amiga 500 (rev 6A, ECS, KS 3.2, PiStorm/RPi 4B/Emu68)
Ryzen 9 7900X, DDR5-6000 64 GB RAM, GeForce RTX 4080 16 GB

Hammer 
Re: One major reason why Motorola and 68k failed...
Posted on 17-May-2024 5:00:41
#65 ]
Elite Member
Joined: 9-Mar-2003
Posts: 5859
From: Australia

@kolla

Quote:

kolla wrote:

Sun certainly made their own MMUs for 68k and even did tricks like running all code on two CPUs in parallel with one being slightly behind the other, ready to take over should the first CPU trip on some bad code.

Running two 68000s is for paged virtual memory.

For the system integrator, running two 68000s with 16-bit data buses would be less cost-effective than a single CPU with a 16-bit or 32-bit data bus and an MMU. This is addressed by the 68010's virtual memory capability.

Unix requirements evolved to need an MMU capable of memory protection.

Hammer 
Re: One major reason why Motorola and 68k failed...
Posted on 17-May-2024 5:17:32
#66 ]
Elite Member
Joined: 9-Mar-2003
Posts: 5859
From: Australia

@matthey

Quote:

TG-16
+ many sprites
+ colorful and vibrant colors
+ good music
- seems lower resolution
- sprites/objects tend to be small
- sound effects could be better

Sega Genesis
+ large sprites/objects
+ good scrolling
+ good resolution
+ good sound effects
- lacks colors and colors are flat
- poor music

Amiga
+ good scrolling
+ good resolution
+ versatility like large blitter objects and copper effects
+ good music
+ good sound
- slow chip memory (low bandwidth)
- few sprites and sprite features
- not enough channels for music and sound together

The Amiga has the Copper for multi-instancing the available 8 hardware sprite slots with minimal CPU load. This is part of the hardware feature set.

Amiga's Street Fighter 2 port had the shared Atari ST artwork problem.

Elf Mania's floor line parallax effect could have been implemented for Amiga's Street Fighter 2 port.

Elf Mania's development was not shared with the Atari ST platform.

Hammer 
Re: One major reason why Motorola and 68k failed...
Posted on 17-May-2024 7:14:03
#67 ]
Elite Member
Joined: 9-Mar-2003
Posts: 5859
From: Australia

@matthey

Quote:

matthey wrote:
@all
I have updated the DMIPS/MHz ratings using Motorola official numbers which makes the 68k CPUs closer to ARM2. The results make sense as the 68020 and 68030 have the same pipeline depth but caches which ARM2 does not.

year | in-order CPU (core) | DMIPS/MHz | pipeline
1979 | 68000 | 0.16 | 1-stage
1984 | 68020 | 0.30 | 3-stage
1986 | ARM2 | 0.35 | 3-stage (Be careful of claims like 0.50 which are likely VAX MIPS)
1987 | 68030 | 0.36 | 3-stage
1990 | 68040 | 1.10 | 5-stage
1994 | 68060 | 1.80 | 8-stage
2005 | Cortex-A8 | 2.0 | 13-stage
2011 | Cortex-A7 | 1.9 | 8-stage (went back to more practical 8-stage pipeline)
2012 | Cortex-A53 | 2.3 | 8-stage

I had previously used the numbers from the 68k chips faq which I believe are reliable but Motorola likely used a different compiler or compiler version.

http://www.faqs.org/faqs/motorola/68k-chips-faq/

The new numbers come from the Motorola internal product overview.

https://marc.retronik.fr/motorola/68K/68000/High-Performance_Internal_Product_Portfolio_Overview_with_Mask_Revision_[MOTOROLA_1995_112p].pdf



For https://github.com/shanshe/Z3660
Z3660 with ARM Cortex A9 is another 68K-to-ARM emulation for "big box" 32-bit Amiga 3000 and 4000.

Z3660 has ARM Cortex A9 via MYS-7Z020-V2's AMD/Xilinx ZYNQ 7020 SoC.

Z3660 can operate with a physical MC68060 like an A3660 card. Real 68K MMU has its use.

ARM Cortex A9 has 2.50 DMIPS/MHz/core, 8-stage pipelines, and OoO processing. The A9's instruction queue and dispatch stage feeds three pipes, i.e.
a pipe for ALU/MUL,
a pipe for ALU,
a pipe for FPU/NEON or Address.

ARM Cortex A9 was introduced in 2007.

ARM states a Cortex A9 microarchitecture on TSMC 40G can reach 2 GHz.

Last edited by Hammer on 17-May-2024 at 07:18 AM.
Last edited by Hammer on 17-May-2024 at 07:16 AM.

Lou 
Re: One major reason why Motorola and 68k failed...
Posted on 17-May-2024 15:31:28
#68 ]
Elite Member
Joined: 2-Nov-2004
Posts: 4227
From: Rhode Island

@Hammer

Quote:

Hammer wrote:
@Lou

Quote:

Lou wrote:
If you all really want to fight...then:

A 1 MHz 6502 was just as fast as a 7.14 MHz 68000 because many instructions on the 68000 take 8 cycles. :P

Since a C128 in C64 mode can run at closer to 1.2 MHz, Sonic on the C64 on 128 hardware really outperforms all other base Amiga platformers until you get to the A1200.

A 6502 is just a RISC 6800. (6800 - not 68000).

The W65C02S was able to go to 14MHz in 1983.

The TurboGrafx-16/PC Engine used a variant that was running at 7.16MHz.
This is why the SNES using a 16ish-bit variant 3.57 MHz 65C816 ran circles around the Amiga.

A C128 with a REU with supported games like Sonic = Blast Processing! :)
https://www.youtube.com/watch?v=L4CGwp4N9xg


6502 and 65C816 didn't have a 32-bit programming model to host a 32-bit OS. Amiga's custom chips handled the multimedia workloads.

ARM2 is late for 1985's A1000's launch.

Commodore could have revised 6502 with wider 16-bit ALU and 32-bit ALU models since Commodore owned 65xx intellectual property.

I did, in a subsequent post, link a spec for a 32bit 6502 as a successor to the 65C816...a 65C832 that was proposed but never developed.

Here it is again:
https://downloads.reactivemicro.com/Electronics/CPU/WDC%2065C832%20Datasheet.pdf

Also, the relocatable Zero Page and Stack on the C128's 8502+MMU did in fact make it a much better games machine than the C64, though lazy developers just made C64 games, just as the A500 was still the target platform after the AGA machines launched. Also, the dual-screen ability was neglected, as were the faster loading times of the 1571.
The VIC-IIe in the C128 did add a couple of extra features (it supported 320x400 interlaced and had a way to simulate more colors)...again unused.

The lowest common denominator always dominates... :/

Last edited by Lou on 17-May-2024 at 08:23 PM.
Last edited by Lou on 17-May-2024 at 07:57 PM.

Gunnar 
Re: One major reason why Motorola and 68k failed...
Posted on 17-May-2024 16:04:47
#69 ]
Cult Member
Joined: 25-Sep-2022
Posts: 512
From: Unknown

@Lou

Quote:
I did, in a subsequent post, link a spec for a 32bit 6502 as a successor to the 65C816...a 65C832 that was proposed but never developed.


I find it interesting that you guys compare the 6502 to a 68K family CPU.

Did any of you develop software for both of them?

Or are you looking at "specs" here without any real experience in using them?


In my experience coding on them both, they are different worlds in terms of what a programmer can do.

Lou 
Re: One major reason why Motorola and 68k failed...
Posted on 17-May-2024 20:16:33
#70 ]
Elite Member
Joined: 2-Nov-2004
Posts: 4227
From: Rhode Island

@Gunnar

Quote:

Gunnar wrote:
@Lou

Quote:
I did, in a subsequent post, link a spec for a 32bit 6502 as a successor to the 65C816...a 65C832 that was proposed but never developed.


I find it interesting that you guys compare 6502 to an 68K family CPU.

Did anyone of you develop software for both of them?

Or are looking at "specs" here with any real experience in using them?


In my experience coding on them both - they are different worlds in programmer possibilities.

I am a professional software developer. However, I'm not bound to worship an architecture. I let the compiler do its work.
If Motorola fixed their slow IPC over time, it didn't matter. It's too late. ARM won.
But yes, I did learn 8088 and 6502 assembly in the 80's and early 90's...and some MC6809 assembly because they were used in cars...

Compilers improved and cost+speed+scalability is what matters and ARM has that in spades.

Learn from the PiStorm.
Even the C64 has its own ARM coprocessor now...running DOOM!
https://www.youtube.com/watch?v=zAla_RtPECE

Last edited by Lou on 17-May-2024 at 08:24 PM.

Gunnar 
Re: One major reason why Motorola and 68k failed...
Posted on 18-May-2024 10:45:51
#71 ]
Cult Member
Joined: 25-Sep-2022
Posts: 512
From: Unknown

@Lou

Quote:
However I'm not bound to worship an architecture. I let the compiler do it's work.


Sure... I was just curious what experience some opinions are based on,
or if this is just nonsense talk...

It was pretty obvious that in this thread a number of strange opinions
were voiced, such as that the 6502 is a RISC 6800 or that ARM would be like a 6502...

Most programmers and probably all CPU architects would have a different opinion here.
But let's not argue about the technical correctness of opinions in an Amiga forum...







Last edited by Gunnar on 18-May-2024 at 10:50 AM.

Lou 
Re: One major reason why Motorola and 68k failed...
Posted on 18-May-2024 19:01:11
#72 ]
Elite Member
Joined: 2-Nov-2004
Posts: 4227
From: Rhode Island

@Gunnar

Quote:

Gunnar wrote:
@Lou

Quote:
However I'm not bound to worship an architecture. I let the compiler do it's work.


Sure... I was just curious on what experience some opinions are based on
or if this just nonsense talk ..

It was pretty obvious that in this thread a number of strange opinions
were voiced as 6502 is a risk 6800 or ARM would be like an 6502 ...

To which most programmers and probably all CPU architect would have a different opinion here..
But lets not argue about technical correctness of opinion in an Amiga forum..

Well, the topic of this thread is why Motorola failed...not the Amiga, though that is related.
So comparing Motorola's cpu division to other architectures is relevant.

The architecture had poor bang-for-buck.
True, it excelled when tasks required more than 64k, but as we've seen from the C128, which had an MMU (and with a better MMU can bank up to 1MB), its relocatable Zero Page and Stack, and an REU with a DMA controller that could memory copy 4x faster than a stock C64, there were ways around this.

In 1978 a 6502 was $25...in ~1984 $4 or less ...
In 1979 a 68000 was $487 ... not until 1984 did it get down to $15
Both architectures died in the early 90's though in embedded markets they lived on for a time.

Printers eventually switched to embedded PPC chips...so did automobile PCMs.
Now everything outside of PCs is ARM...and even that is changing. ARM won.

Amiga is best done in emulation. There is no market. I only hope AROS gains momentum as a multi-platform standard so we don't have to use computers behind a 'service' and 'paywall'... Linux is well packaged and starting to become mainstream. It's still ugly to me.

Porting Mono and even the Roslyn compiler to AROS would be very beneficial. Visual Studio is an amazing tool. Port Mono and Roslyn and suddenly anyone can develop for AROS.

bhabbott 
Re: One major reason why Motorola and 68k failed...
Posted on 18-May-2024 19:53:21
#73 ]
Regular Member
Joined: 6-Jun-2018
Posts: 422
From: Aotearoa

@Gunnar

Quote:

Gunnar wrote:
@Lou

Quote:
However I'm not bound to worship an architecture. I let the compiler do it's work.


Sure... I was just curious on what experience some opinions are based on
or if this just nonsense talk ..

It was pretty obvious that in this thread a number of strange opinions
were voiced as 6502 is a risk 6800 or ARM would be like an 6502 ...

There's this myth that the 6502's 'RISC' design made it much better than its competitors. But compared to the 68000 it's a joke.

Jay Miner was keen on using the 68000 in an Atari home computer back in 1979, when everyone was using 6502s or Z80s and would continue doing so for another 5 years or more. That's how the Amiga came to have one (their original games console was slated to have a 6502 and 64k RAM).

I learned to program the RCA 1802 in 1980, and 6800 in 1981. The 1802 was a RISC design. It has sixteen 16-bit registers, any of which can be an address register, the program counter, or two 8 bit data registers. All instructions are 1 or 2 bytes long. Only the 8 bit accumulator "D" can be loaded with immediate data. To load a 16 bit register with immediate data you need 4 instructions. It has no stack pointer or status register. To call a subroutine you load a register with its address and make it the PC with "SEP" (Set PC). To return you switch back to the original 'PC'. Any register can be set as the index ('X') register, using the rather risky 'SEX' instruction.

The problem with the 1802 is that it used RCA's 4000 series CMOS process, which was very slow (max 2.5 MHz at 5V), combined with 16 or 24 clock cycles per instruction depending on the instruction.
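The four-instruction sequence for loading a 16-bit register can be sketched with a tiny interpreter that models only the three opcodes involved: LDI (0xF8 plus an immediate byte), PHI Rn (0xBn) and PLO Rn (0xAn). This is an illustrative toy, not an 1802 emulator; in particular it keeps a separate program counter variable instead of using one of the sixteen registers, and the function name is made up.

```python
def run_1802_fragment(program):
    """Interpret a fragment using only LDI, PHI and PLO.

    Returns the sixteen 16-bit registers and the number of
    instructions executed.
    """
    regs = [0] * 16
    d = 0          # 8-bit accumulator "D"
    pc = 0         # simplification: not one of the 16 registers here
    executed = 0
    while pc < len(program):
        op = program[pc]
        if op == 0xF8:                    # LDI: load D with the next byte
            d = program[pc + 1]
            pc += 2
        elif op & 0xF0 == 0xB0:           # PHI Rn: D -> high byte of Rn
            n = op & 0x0F
            regs[n] = (regs[n] & 0x00FF) | (d << 8)
            pc += 1
        elif op & 0xF0 == 0xA0:           # PLO Rn: D -> low byte of Rn
            n = op & 0x0F
            regs[n] = (regs[n] & 0xFF00) | d
            pc += 1
        else:
            raise ValueError(f"opcode {op:#04x} not modelled")
        executed += 1
    return regs, executed


# Load R5 with 0x1234: LDI 12; PHI R5; LDI 34; PLO R5
PROGRAM = bytes([0xF8, 0x12, 0xB5, 0xF8, 0x34, 0xA5])

if __name__ == "__main__":
    regs, executed = run_1802_fragment(PROGRAM)
    print(hex(regs[5]), executed, len(PROGRAM))
```

That is four instructions and six bytes for something a 68000 does with a single MOVE.W of immediate data.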

The 6800 was much faster and had that lovely CISC stuff, but in some ways was more limiting. The single index register was a big one. You could use SP as a second index register for copying memory etc., but had to disable interrupts while doing so. The 6502 fixed this but was even more limiting because X and Y are only 8 bit, and the 8 bit stack pointer is also quite limiting. I was going to upgrade my homebrew 6800 computer to the 6809 (which has a much better ISA) but then I got a ZX-81 and discovered the wonderful world of Z80!

matthey 
Re: One major reason why Motorola and 68k failed...
Posted on 19-May-2024 3:00:28
#74 ]
Elite Member
Joined: 14-Mar-2007
Posts: 2270
From: Kansas

Hammer Quote:

6502 and 65C816 didn't have a 32-bit programming model to host a 32-bit OS. Amiga's custom chips handled the multimedia workloads.

ARM2 is late for 1985's A1000's launch.

Commodore could have revised 6502 with wider 16-bit ALU and 32-bit ALU models since Commodore owned 65xx intellectual property.


The 6502 is minimalist and supporting an OS is indeed difficult, let alone an advanced 32 bit OS. The 6502 family was more appealing for embedded use while Motorola improved the 6800 family with features especially for an OS and code sharing resulting in the 1978 6809 and then the 1979 68000.

https://en.wikipedia.org/wiki/Motorola_6809#6809 Quote:

Analysis of 6800 code demonstrated that loads and stores were the vast majority of all the time in CPU terms, accounting for 39% of all the operations in the code they examined. In contrast, mathematical operations were relatively rare, only 2.8% of the code. However, a careful examination of the loads and stores noted that many of these were being combined with adds and subtracts, revealing that a significant amount of those math operations were being performed on 16-bit values. This led to the decision to include basic 16-bit mathematics in the new design: load, store, add, and subtract. Similarly, increments and decrements accounted for only 6.1% of the code, but these almost always occurred within loops where each one was performed many times. This led to the addition of post-incrementing and pre-decrementing modes using the index registers. 

The main goal for the new design was to support position-independent code. Motorola's market was mostly embedded systems and similar single-purpose systems, which often ran programs that were very similar to those on other platforms. Development for these systems often took the form of collecting a series of pre-rolled subroutines and combining them together. However, as assembly language is generally written starting at a "base address", combining pre-written modules normally required a lengthy process of changing constants (or "equates") that pointed to key locations in the code. 

Motorola's idea was to eliminate this task and make the building-block concept much more practical. System integrators would simply combine off-the-shelf code in ROMs to handle common tasks. Libraries of common routines like floating point arithmetic, graphics primitives, Lempel-Ziv compression, and so forth would be available to license, combine together along with custom code, and burn to ROM. Other examples are matrix arithmetic, Huffman encoding/decoding, statistical functions, string searching (e.g. by the Boyer-Moore algorithm) and tree structure management. A larger example is found in Motorola's 6809 programming manual, which contains the full listing of assist09, a so-called monitor, a miniature operating system intended to be burned in ROM.

In previous processor designs, including the 6800, there was a mix of ways to refer to memory locations. Some of these were relative to the current location in memory or to a value in an index register, while others were absolute, a 16-bit value that referred to a physical location in memory. The former style allows code to be moved because the address it references will move along with the code. The absolute locations do not; code that uses this style of addressing will have to be recompiled if it moves. To address this, the 6809 filled out its instruction opcodes so that there were more instances of relative addressing where possible. 

As an example, the 6800 included a special "direct" addressing mode that was used to make code smaller and faster; instead of a memory address having 16-bits and thus requiring two bytes to store, direct addresses were only 8-bits long. The downside was that it could only refer to memory within a 256-byte window, the "direct page", which was normally at the bottom of memory - the 6502 referred to this as "zero page addressing". The 6809 added a new 8-bit DP register, for "direct page". Code that formerly had to be in the zero page could now be moved anywhere in memory as long as the DP was changed to point to its new location. 

Using DP solved the problem of referring to addresses within the code, but data is generally located some distance from the code, outside ROM. To solve the problem of easily referring to data while remaining position independent, the 6809 added a variety of new addressing modes. Among these was program-counter-relative addressing which allowed any memory location to be referred to by its location relative to the instruction. Additionally, the stack was more widely used, so that a program in ROM could set aside a block of memory in RAM, set the SP to be the base of the block, and then refer to data within it using relative values.

To aid this type of access, the 6809 renamed the SP to U for "user", and added a second stack pointer, S, for "system".  The idea was user programs would use U while the CPU itself would use S to store data during subroutine calls. This allowed system code to be easily called by changing S without affecting any other running program. For instance, a program calling a floating-point routine in ROM would place its data on the U stack and then call the routine, which could then perform the calculations using data on its own private stack pointed to by S, and then return, leaving the U stack untouched.

Another reason for the expanded stack access was to support reentrant code, code that can be called from various different programs concurrently without concern for coordination between them, or that can recursively call itself.  This makes the construction of operating systems much easier; the operating system had its own stack, and the processor could quickly switch between a user application and the operating system simply by changing which stack pointer it was using. This also makes servicing interrupts much easier for the same reason.  The 6809 adds a fast interrupt request (FIRQ) interrupt that saves only the program counter and condition code register before calling the interrupt code, whereas the IRQ interrupt saves all registers, taking additional cycles, then more to unwind the stack on exit. 

The 6809 includes one of the earliest dedicated hardware multipliers.  It takes 8-bit numbers in the A and B accumulators and produces a result in A:B, known collectively as D.
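
To make the A:B result concrete, here is a minimal sketch of an 8-bit by 8-bit multiply whose 16-bit product is split back across A (high byte) and B (low byte). It assumes unsigned operands and does not model condition flags; the function name `mul_6809` is hypothetical and only for illustration.

```python
def mul_6809(a: int, b: int) -> tuple[int, int, int]:
    """Illustrative model of an unsigned 8x8 multiply; returns (A, B, D)."""
    if not (0 <= a <= 0xFF and 0 <= b <= 0xFF):
        raise ValueError("A and B must be 8-bit unsigned values")
    d = a * b                      # 16-bit product, at most 0xFE01
    new_a = (d >> 8) & 0xFF        # high byte ends up in A
    new_b = d & 0xFF               # low byte ends up in B
    return new_a, new_b, d


if __name__ == "__main__":
    a, b, d = mul_6809(0xFF, 0xFF)
    print(f"A={a:02X} B={b:02X} D={d:04X}")
```

The largest possible product, 0xFF * 0xFF, still fits in 16 bits, which is why the two accumulators together are enough to hold the result.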


Similar features were incorporated in the 6809 and 68000 to support an OS. They came at the cost of significantly larger cores, but silicon area was getting cheaper, which made them practical. The features were not just convenient; they also improved performance.

Byte Sieve comparison
6502@1MHz 13.9s
Z80@4MHz 6.8s
6809@2MHz 5.1s
8086@8MHz 1.9s
68000@8MHz 0.49s
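
For anyone who has not seen it, the Byte Sieve is a Sieve of Eratosthenes over odd numbers only. Below is a rough Python sketch of the commonly published form of the benchmark (8191 flags, 10 iterations). The constants and the function name are my assumptions from memory, not taken from this thread, and Python timings are of course not comparable to the figures above.

```python
SIZE = 8190  # flags[i] stands for the odd number 2*i + 3


def byte_sieve(iterations: int = 10) -> int:
    """Run the sieve `iterations` times and return the prime count of the last pass."""
    count = 0
    for _ in range(iterations):
        count = 0
        flags = [True] * (SIZE + 1)
        for i in range(SIZE + 1):
            if flags[i]:
                prime = i + i + 3
                # Strike out the odd multiples of this prime, starting at 3*prime.
                for k in range(i + prime, SIZE + 1, prime):
                    flags[k] = False
                count += 1
    return count


if __name__ == "__main__":
    print(byte_sieve(), "primes")
```

The inner loop is almost all loads, stores, adds and compares on a small array, which is why the benchmark mostly measures basic integer and memory performance rather than anything OS or graphics related.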

The 6502 could be clocked up to compete in performance, but that was upgrading a stripped-down CPU. Newer CPUs like the 68000 and 8086 improved code density, reduced memory traffic and reduced the number of instructions to execute through significant redesigns and GP registers. Not only were OSs easier to support, but so were compilers, which brought minicomputer features and performance to the PC. The CBM PET used a 6502, but the SuperPET added a 6809 to take advantage of more modern features. The 68000 Amiga was a larger upgrade from the C64, which used the 6502 family. Unfortunately, compatibility was lost and CBM did not prioritize a good C64 emulator for the Amiga.

Hammer Quote:

For https://github.com/shanshe/Z3660
Z3660 with ARM Cortex A9 is another 68K-to-ARM emulation for "big box" 32-bit Amiga 3000 and 4000.

Z3660 has ARM Cortex A9 via MYS-7Z020-V2's AMD/Xilinx ZYNQ 7020 SoC.

Z3660 can operate with a physical MC68060 like an A3660 card. Real 68K MMU has its use.

ARM Cortex A9 has 2.50 DMIPS/MHz/core, an 8-stage pipeline, and OoO processing. The A9's instruction queue and dispatch stage can dispatch three instructions, i.e.
pipe for ALU/MUL,
pipe for ALU,
pipe for FPU/NEON or Address.

ARM Cortex A9 was introduced in 2007.

ARM states a Cortex A9 microarchitecture on TSMC 40G can reach 2 GHz.


The OoO ARM Cortex-A9 is only 2.5 DMIPS/MHz compared to the in-order Cortex-A53 at 2.3 DMIPS/MHz but the Cortex-A9 should have a significant performance advantage executing 68k code as OoO should reduce the load-to-use stalls from unscheduled code after 68k to ARM translation. Let's look at a typical workload.

load 26%
store 10%
ALU 49%
branch 15%

20 instruction typical workload with ARM Cortex-A53 emulating 68k code
load x5 with load-to-use stalls (5*4= 20 cycles)
store x2 (2*1= 2 cycles with store buffer)
ALU x10 (10*0.5 to 10*1= 5-10 cycles)
branch x3 (most are conditional predicted branches so 0 cycles)
---
total: 27-32 cycles

20 instruction typical workload with ARM Cortex-A9 emulating 68k code
load x5 without load-to-use stalls (5*1= 5 cycles)
store x2 (2*1= 2 cycles with store buffer)
ALU x10 (10*0.5 to 10*1= 5-10 cycles)
branch x3 (most are conditional predicted branches so 0 cycles)
---
total: 12-17 cycles
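
The two totals above are simple arithmetic and can be reproduced with a short sketch. The per-instruction costs are the ones assumed in this post (4 cycles per load with a load-to-use stall, 1 without, 0.5 to 1 per ALU op, 0 for predicted branches); the function name and parameters are hypothetical, and this is a back-of-envelope model rather than a pipeline simulation.

```python
def cycle_range(loads, stores, alus, branches, load_cost,
                store_cost=1.0, alu_cost=(0.5, 1.0), branch_cost=0.0):
    """Return (best, worst) cycle estimates for a simple instruction mix."""
    fixed = loads * load_cost + stores * store_cost + branches * branch_cost
    best = fixed + alus * alu_cost[0]    # every ALU op dual-issues
    worst = fixed + alus * alu_cost[1]   # no ALU op dual-issues
    return best, worst


# Costs assumed in the post: loads stall for 4 cycles on the in-order core.
a53 = cycle_range(loads=5, stores=2, alus=10, branches=3, load_cost=4)
a9 = cycle_range(loads=5, stores=2, alus=10, branches=3, load_cost=1)

if __name__ == "__main__":
    print("Cortex-A53:", a53)  # (27.0, 32.0)
    print("Cortex-A9: ", a9)   # (12.0, 17.0)
```

Changing only `load_cost` accounts for the whole gap between the two estimates, which is the point being made about load-to-use stalls.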

OoO is designed to minimize stalls but the increased performance comes at a steep cost in power and area. The 32 bit OoO Cortex-A9 is likely larger than the 64 bit in-order Cortex-A53 and likely uses several times the power.

Microarchitecture | Year | Architecture | Pipeline Depth | DMIPS/MHz | Representative Frequency (MHz) | L1 Cache | Relative Size
ARM1 | 1985 | v1 | 3 | 0.33 | 8 | N/A | 0.1
ARM6 | 1992 | v3 | 3 | 0.65 | 30 | 4KB unified | 0.6
ARM7 | 1994 | v4T | 3 | 0.9 | 100 | 0–8KB unified | 1
ARM9E | 1999 | v5TE | 5 | 1.1 | 300 | 0–16KB I+D | 3
ARM11 | 2002 | v6 | 8 | 1.25 | 700 | 4–64KB I+D | 30
Cortex-A9 | 2009 | v7 | 8 | 2.5 | 1000 | 16–64KB I+D | 100 (OoO)
Cortex-A7 | 2011 | v7 | 8 | 1.9 | 1500 | 8–64KB I+D | 40
Cortex-A15 | 2011 | v7 | 15 | 3.5 | 2000 | 32KB I+D | 240 (OoO)
Cortex-M0+ | 2012 | v7M | 2 | 0.93 | 60–250 | None | 0.3
Cortex-A53 | 2012 | v8 | 8 | 2.3 | 1500 | 8–64KB I+D | 50
Cortex-A57 | 2012 | v8 | 15 | 4.1 | 2000 | 48KB I + 32KB D | 300 (OoO)

Digital Design and Computer Architecture, 2016
https://www.sciencedirect.com/topics/computer-science/stage-pipeline

The Cortex-A9 is a low power and small OoO core compared to later OoO ARM cores. The article above gives the difference between the big.LITTLE performance core and companion low power core as, "The Cortex-A15 delivers approximately 2.5x the performance of the Cortex-A7, but at 6x the power." The Cortex-A9 likely has less aggressive OoO though. We can compare typical code to the in-order RISC-V SiFive U74 core and 68060 core with designs that practically eliminate load-to-use stalls.

20 instruction typical workload SiFive U74 core
load x5 without load-to-use stalls (5*1= 5 cycles)
store x2 (2*1= 2 cycles with store buffer)
ALU x10 (10*0.5 to 10*1= 5-10 cycles)
branch x3 (most are conditional predicted branches so 0 cycles)
---
total: 12-17 cycles

20 instruction typical workload 68060
load+ALU x5 (5*0.5 to 5*1= 2.5-5 cycles)
store x2 (2*1= 2 cycles with store buffer)
ALU x5 (5*0.5 to 5*1= 2.5-5 cycles)
branch x3 (most are conditional predicted branches so 0 cycles)
---
total: 7-12 cycles
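
As a sketch of where these numbers come from: the 20-instruction counts follow from rounding the mix percentages, and the 68060 estimate follows from folding each load into an ALU operation. The rounding step and the fusing logic are my reading of the figures above, not something taken from the 68060 documentation, and all names are hypothetical.

```python
MIX = {"load": 0.26, "store": 0.10, "alu": 0.49, "branch": 0.15}


def instruction_counts(total=20):
    """Round each share of the mix to a whole number of instructions."""
    return {kind: round(total * share) for kind, share in MIX.items()}


def m68060_cycles(counts):
    """(best, worst) estimate when each load is folded into one ALU instruction."""
    fused = min(counts["load"], counts["alu"])   # load+ALU instructions
    plain_alu = counts["alu"] - fused            # ALU ops left over
    unfused_loads = counts["load"] - fused       # loads with no ALU op to join
    fixed = counts["store"] * 1.0 + unfused_loads * 1.0  # branches assumed free
    issuable = fused + plain_alu                 # ops costing 0.5 to 1 cycle each
    return fixed + issuable * 0.5, fixed + issuable * 1.0


if __name__ == "__main__":
    counts = instruction_counts()
    print(counts)                 # {'load': 5, 'store': 2, 'alu': 10, 'branch': 3}
    print(m68060_cycles(counts))  # (7.0, 12.0)
```

The saving over the 12-17 cycle estimate comes entirely from the five loads no longer occupying issue slots of their own.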

The in-order SiFive U74 core design practically eliminates load-to-use stalls without OoO and its power and area cost. The Cortex-A9 is 2.5 DMIPS/MHz using a 40nm process while the SiFive U74 core is 2.64 DMIPS/MHz using a 28nm process. The SiFive U74 design is similar to the 68060 design, but the 68060 can execute load+ALU instructions as a single instruction, which is why there are only 5 ALU instructions left to execute instead of 10 and where the additional cycle savings come from.

In-order cores have their performance limitations, chiefly in hiding cache stalls, where OoO cores have a significant advantage. The in-order SiFive U74 core outperforms the OoO Cortex-A9 though, which suggests the Cortex-A9 is doing little more than removing load-to-use stalls. This is a waste considering the extra resources of OoO, but it has the advantage of better performance than the in-order Cortex-A53 without instruction scheduling, although the in-order SiFive U74 core and 68060 need only minimal instruction scheduling as well. The in-order 68060 has more potential performance than the SiFive U74 core and could likely reach at least 3 DMIPS/MHz with modern improvements while using less power and area than most OoO RISC cores. The highest-performance limited-OoO PPC cores only achieved about 3 DMIPS/MHz.

RISC ISAs break instructions down into simpler instructions so RISC OoO cores can put the code back together again. CISC ISAs have high-performance instructions, which somehow remains a well-kept secret of Intel and AMD, but maybe the ugly and complex x86(-64) ISA obscures it. The 68k failed because Motorola pencil pushers couldn't see it and took the commonly traveled easy path of paying ARM to develop for them.

Last edited by matthey on 19-May-2024 at 03:26 AM.

Gunnar 
Re: One major reason why Motorola and 68k failed...
Posted on 19-May-2024 9:36:13
#75 ]
Cult Member
Joined: 25-Sep-2022
Posts: 512
From: Unknown

@Lou

Quote:
Well, the topic of this thread is why Motorola failed...not the Amiga, though that is related. So comparing Motorola's cpu division to other architectures is relevant.



Yes, comparing might be insightful.

But how many of the people who posted and compared here know the CPUs from real personal experience?

Hammer 
Re: One major reason why Motorola and 68k failed...
Posted on 20-May-2024 2:34:48
#76 ]
Elite Member
Joined: 9-Mar-2003
Posts: 5859
From: Australia

@Lou
Quote:

Lou wrote:
I did, in a subsequent post, link a spec for a 32bit 6502 as a successor to the 65C816...a 65C832 that was proposed but never developed.

Here it is again:
https://downloads.reactivemicro.com/Electronics/CPU/WDC%2065C832%20Datasheet.pdf

Also, relocatable Zero Page and Stack on the C128's 8502+MMU did in fact make it a much better games machine than the C64, though lazy developers just made C64 games, just as the A500 was still the target platform after the AGA machines launched. Also, the dual-screen ability was neglected, as were the faster loading times of the 1571.
The VIC-IIe in the C128 did add a couple of extra features (supported 320x400 interlaced and had a way to simulate more colors)...again unused.

The lowest common denominator always dominates... :/


FYI, I also posted this W65CB32 document in this topic, in post #18 on page 1. LOL

https://amigaworld.net/modules/newbb/viewtopic.php?post_id=870201&topic_id=45221&forum=25#870113

Again, the W65CB32 wasn't ready for the Amiga Lorraine, Apple Lisa, Atari ST, Sega Mega Drive, etc.

Commodore's in-house 16-bit CPU selection was the Z8000 CPU for the Commodore 900.

Last edited by Hammer on 20-May-2024 at 02:35 AM.

_________________
Amiga 1200 (rev 1D1, KS 3.2, PiStorm32/RPi CM4/Emu68)
Amiga 500 (rev 6A, ECS, KS 3.2, PiStorm/RPi 4B/Emu68)
Ryzen 9 7900X, DDR5-6000 64 GB RAM, GeForce RTX 4080 16 GB

Hammer 
Re: One major reason why Motorola and 68k failed...
Posted on 20-May-2024 2:38:22
#77 ]
Elite Member
Joined: 9-Mar-2003
Posts: 5859
From: Australia

@matthey

Quote:
Byte Sieve

Useless benchmark. Try wolf 3D.

_________________
Amiga 1200 (rev 1D1, KS 3.2, PiStorm32/RPi CM4/Emu68)
Amiga 500 (rev 6A, ECS, KS 3.2, PiStorm/RPi 4B/Emu68)
Ryzen 9 7900X, DDR5-6000 64 GB RAM, GeForce RTX 4080 16 GB

Hammer 
Re: One major reason why Motorola and 68k failed...
Posted on 20-May-2024 3:13:33
#78 ]
Elite Member
Joined: 9-Mar-2003
Posts: 5859
From: Australia

@matthey

Quote:
The OoO ARM Cortex-A9 is only 2.5 DMIPS/MHz compared to the in-order Cortex-A53 at 2.3 DMIPS/MHz but the Cortex-A9 should have a significant performance advantage executing 68k code as OoO should reduce the load-to-use stalls from unscheduled code after 68k to ARM translation. Let's look at a typical workload.

load 26%
store 10%
ALU 49%
branch 15%

20 instruction typical workload with ARM Cortex-A53 emulating 68k code
load x5 with load-to-use stalls (5*4= 20 cycles)
store x2 (2*1= 2 cycles with store buffer)
ALU x10 (10*0.5 to 10*1= 5-10 cycles)
branch x3 (most are conditional predicted branches so 0 cycles)
---
total: 27-32 cycles

20 instruction typical workload with ARM Cortex-A9 emulating 68k code
load x5 without load-to-use stalls (5*1= 5 cycles)
store x2 (2*1= 2 cycles with store buffer)
ALU x10 (10*0.5 to 10*1= 5-10 cycles)
branch x3 (most are conditional predicted branches so 0 cycles)
---
total: 12-17 cycles


https://www.youtube.com/watch?v=MqiEQtzGk-Q
Benchmarking Z3660 with ARM Cortex A9 @ 766 MHz (via Z-turn Board V2 with an AMD Zynq 7020) emulating 68K. This ARM Cortex A9 implementation has a 32 KB L1 cache and a shared 512 KB L2 cache. https://docs.amd.com/v/u/en-US/ds190-Zynq-7000-Overview

On the A4000, 68k emulation on the AMD Zynq 7020's ARM Cortex A9 @ 766 MHz reached the following:
SysSpeed is 1550 MIPS and 139 MFLOPS.
SysInfo is 138 MIPS (there is a problem with SysInfo's FPU detection).


On the A500, PiStorm Emu68 68k emulation on the RPi 3A+'s ARM Cortex A53 @ 1.4 GHz reached the following:
SysSpeed is 499 MIPS and 269 MFLOPS.
SysInfo is 914 MIPS and 868 MFLOPS.


On the A1200, PiStorm32 Emu68 68k emulation on the RPi 4B+'s ARM Cortex A72 @ 1.8 GHz reached the following:
SysSpeed is 3446 MIPS and 3264 MFLOPS.
SysInfo is 1973 MIPS and 2014 MFLOPS.

Significant overclock works for RPi 3A+ and 4B without triggering warranty void status.

Each benchmark has biases in different parts of the CPU instruction set, CPU cache and memory bus.

In the Amiga's context, what matters is game benchmarks such as Quake. Z3660's RTG and FP emulation still need work.

SiFive U74 is useless for the Amiga. When Google dropped RISC-V support from Android, RISC-V's "application processor" chance also dropped.

I'm not interested in pure "embedded CPUs" since the Amiga is a desktop computer.

Last edited by Hammer on 20-May-2024 at 03:23 AM.
Last edited by Hammer on 20-May-2024 at 03:22 AM.
Last edited by Hammer on 20-May-2024 at 03:20 AM.

_________________
Amiga 1200 (rev 1D1, KS 3.2, PiStorm32/RPi CM4/Emu68)
Amiga 500 (rev 6A, ECS, KS 3.2, PiStorm/RPi 4B/Emu68)
Ryzen 9 7900X, DDR5-6000 64 GB RAM, GeForce RTX 4080 16 GB

agami 
Re: One major reason why Motorola and 68k failed...
Posted on 21-May-2024 2:10:30
#79 ]
Super Member
Joined: 30-Jun-2008
Posts: 1779
From: Melbourne, Australia

@thread

We’ve been over this before:
Most of the companies that thrived in the dynamics of the nascent personal computer market of the late ’70s and ’80s didn’t transition well (or at all) into the changed dynamic of the ’90s and early 2000s.

There are 3 theories, nested in each other like a babushka doll, that govern all of human behavior and by proxy, the behavior of companies:
- Information theory
- Graph theory, and
- Game theory.

Sufficiently wrap your head around these, and you can throw out most qualifications offered by today’s diploma mills.

Like many other companies, Commodore among them, Motorola were resting on their laurels. You snooze, you lose. And just as nature abhors a vacuum, so does the network effect that weaves together any market.

Motorola failed to diversify for a rapidly changing market in the ‘90s. Their “Hail Mary” was allying with Apple and IBM for PowerPC. But just as the Versailles Treaty didn’t address the problems of the failing 1800’s status quo, WWII and the eventual failure of Motorola were inevitable.

_________________
All the way, with 68k

kolla 
Re: One major reason why Motorola and 68k failed...
Posted on 21-May-2024 6:56:15
#80 ]
Elite Member
Joined: 21-Aug-2003
Posts: 3187
From: Trondheim, Norway

@agami

Quote:
We’ve been over this before


...

_________________
B5D6A1D019D5D45BCC56F4782AC220D8B3E2A6CC

Copyright (C) 2000 - 2019 Amigaworld.net.
Amigaworld.net was originally founded by David Doyle