Amigaworld.net - The Amiga Computer Community Portal Website


MeKa 2008 (Amiga Party) (SHOWN WAS NEW AMIGA HW!)


umisef

Re: MeKa 2008 (Amiga Party) (SHOWN WAS NEW AMIGA HW!)
Posted on 24-Jan-2008 1:30:23

[ #221 ]

Super Member

Joined: 19-Jun-2005
Posts: 1714
From: Melbourne, Australia

@BigGun

Quote:
Okay, last time I looked at UAE sources there where around 30-50 instructions
used per function that emulates one 68k instruction. I think the code in question was in "cpuemu.c"
Can you explain this?

Yes, I can. You look at the *interpreting* emulation, and then talk, most explicitly, about performance of the JIT emulation.
You know, the JIT compiler does not just create a whole lot of "call xxxxx" references to the stuff in cpuemu.c. It actually translates code. You'll want to have a look in compemu.c (you know, COMPiling EMUlation....).

Quote:
The UAE code that I looked at creates huge blocks per 68k instruction.

It would help if you looked at the code you claim to be talking about.

Quote:
But the actual design is not how the Natami works. There are more than one memory bank and one bank is CPU only SRAM, so there are cases the CPU has full access to its SRAM bank.

That's no longer "chipmem" then, now is it? It's "fastmem" in Amiga terms.

Quote:
Goal of the blitter is to do 100% random texture fetch in the SRAM in 10ns - in a much huger amount of memory than your AMD CPU has.

That's a lofty goal. It's also one that, if it is actually pursued, and turns out to be possible, will make the whole thing stupidly expensive. Fast SRAM is not particularly cheap; ridiculously fast SRAM is ridiculously not cheap.
NEC is currently the main provider of fast asynchronous SRAM. The largest chips they have which come in 8ns versions (the best they got) are 512k, and cost $9.23 a piece when purchased from semiconductorstore.com. And that leaves you with all of 2ns to deal with signal propagation, gating and so on. Good luck. If you want faster than that, things go synchronous, with all the resulting ugliness --- a large part of which has to do with packaging; you really don't want BGA stuff in a hobby project.

Quote:
But your AMD can NOT access his 2nd level cache every CPU clock.
You can probably tell me how many clocks delay it has?

My AMD can certainly access its L2 more than 50 million times a second (which is all that 060/50 of yours can hope for). In fact, if you allow the accesses to pipeline (i.e. the address of one access does not depend on the result of a recent previous one), you can make around 300 million random accesses per second. If you don't allow them to pipeline, we are still looking at 132 million successfully chased pointers. All from L2. L1, of course, is much faster.

Saying "The Natami is going to compare favourably to machine X, because the Natami's CPU can access memory on each processor clock, whereas machine X cannot" is a tad disingenuous (as well as blatantly wrong) if you choose to ignore the factor of 60 difference in clock speeds.

Quote:
How do you measure this?
Were you randomly writing only or where you randomly reading too?

I measure how many writes I can make into my video memory by making a certain number of writes into video memory and timing how long it takes. The obvious way, and the way that creates actual, verified results. Yes, there was a subtext of a reproach in that sentence.

No, I wasn't reading from video memory. Yes, we agree that reading from video memory is slow. No, your statement which I corrected was not talking about reading from video memory, either. In fact, it very explicitly referred to writing. No, "voxeling" does not involve reading from video memory.
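
The measurement described above (issue a fixed number of writes, time them) can be sketched roughly as follows. This is not the benchmark that was actually run: the function name `random_writes_per_second`, the pseudo-random index generator, and the use of an ordinary buffer in place of mapped video memory are all assumptions made for illustration. Mapping real video memory is platform-specific and is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Write `count` words to pseudo-random positions in a buffer of `words`
 * entries and return the writes per second of CPU time (0 if the run was
 * too short for clock() to resolve). No location is ever read back. */
double random_writes_per_second(volatile unsigned long *base,
                                size_t words, size_t count)
{
    assert(words > 0);
    unsigned long long state = 1;   /* linear congruential generator state */
    clock_t t0 = clock();
    for (size_t i = 0; i < count; i++) {
        state = state * 6364136223846793005ULL + 1442695040888963407ULL;
        base[(size_t)((state >> 33) % words)] = (unsigned long)i;
    }
    clock_t t1 = clock();
    double secs = (double)(t1 - t0) / CLOCKS_PER_SEC;
    return secs > 0.0 ? (double)count / secs : 0.0;
}
```

Called as `random_writes_per_second(buf, 1024, 100000)` on a 1024-word array it returns a writes-per-second estimate; pointing `base` at a mapped framebuffer instead would measure writes into video memory.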

Quote:
So how do you do this on 68k and PPC?

I have no idea. I don't care. Avoiding the overhead of write-induced reads on 68k or PPC does not put food on my table; avoiding that very same overhead on x86 does. Which is why I know all about it, and why I can call #### on some of the things you claim about cache performance on "common" CPUs.

Quote:
If you have idea to improve the design feel free to speak up.

I'd start by removing the over-enthusiastic promoter. Because:
Quote:
You can not compare a self developed piece of hardware with some GFX card or CPU which companies are putting millions into developing.

yet the over-enthusiastic promoter is making exactly those comparisons all over the place.

Re-building the original Amiga is an ambitious project to start with; extending it in the process while staying compatible is even better. And kudos to Thomas, not only for pulling it off, but also for keeping his mouth shut about it until he had actually pulled it off.

Quote:
But there will by caces were the Natami will be faster than other PC systems.
I think you agree with me in this?

For doing things natively on either system? And for "other PC systems" being current, reasonably specced PCs? And for "Natami" being an 060/50 based board? No, I don't agree with you. In fact, I'd be willing to bet you 10 large jars of Vegemite against 10 large packets of Lorenz Erdnusslocken that, once Natami comes out, you cannot within one month come up with a computational task which I cannot within a week implement faster on the AMD under my desk than you can on Natami.

Deal?

Last edited by umisef on 24-Jan-2008 at 01:55 AM.

Status: Offline

umisef

Re: MeKa 2008 (Amiga Party) (SHOWN WAS NEW AMIGA HW!)
Posted on 24-Jan-2008 1:44:59

[ #222 ]

Super Member

Joined: 19-Jun-2005
Posts: 1714
From: Melbourne, Australia

@TheDaddy

Quote:
Please let's stop the technical crap.

Perish the thought that one should talk technical stuff in a thread about someone building a computer....

Quote:
So let's go back to Natami and support Thomas instead of giving it a reason NOT to carry on.

I can obviously not speak for Thomas, but if this were me in his position, then the one thing that would really annoy me about this thread would be the people who masturbate (let's see whether that gets through :) over specs they don't understand.

I have not seen *anyone* criticise what Thomas is doing. I *have* seen people criticise some of the more fanciful ideas put forward by other participants (notably absent: Thomas himself). I have also seen one poster in particular make repeated comparisons of aspects of Natami against what he claims are properties of current PCs, only to be corrected about them by people who happen to know better.

Thomas is trying to build an Amiga. Great! It's not, however, going to be taking the world by storm, or to outperform the current crop of non-Amiga computers, or to be good bang-for-the-buck for anyone who doesn't have their heart set on an Amiga to start with. So let him do what he wants to do, and stop asking for things which make no sense. I mean, come on --- a ColdFire CPU card with a Cell attached? As a hobby project? Why not ask for cold fusion, peace in the middle east, and the squaring of the circle?

Quote:
I have been using a pc running emulated Workbench and I can assure you that although it does a decent job and it's faster than any Amiga it still doesn't feel like a real Amiga

There are good reasons for this. They have to do with lag between user input and on-screen reaction, and are caused by the way UAE handles things. They have nothing to do with speed, and if you had a chance to see an emulation which handled things differently (which, alas, is now 6 years in the past), you would not put the blame on the PC, but on UAE.

Last edited by umisef on 24-Jan-2008 at 02:28 AM.

Status: Offline

BigGun

Re: MeKa 2008 (Amiga Party) (SHOWN WAS NEW AMIGA HW!)
Posted on 24-Jan-2008 7:08:20

[ #223 ]

Regular Member

Joined: 9-Aug-2005
Posts: 438
From: Germany (Black Forest)

@umisef

Yes, I'm enthusiastic about the Natami.
And that is because I saw it running.
I said from the beginning that I'm very enthusiastic about the project.
I saw the Natami boot from power on to seeing the Workbench in 2 seconds.
You have to be enthusiastic if you see this as an AMIGA fan.

When I saw the Natami, it was obvious that it's 100% AMiGA compatible but much faster than the original. Amiga OS can work with the new AGA++ chipset just as normal.
But the NATAMI chipset is much faster than the GFX on any AMIGA was.
In many aspects the AGA++ is even faster than ANY GFX card available for the new AmigaONE or Pegasos.

The Natami has a CPU card.
The first CPUs will be 68060.
Other CPUs can be Coldfire or PowerPC/Cell.
So it's clear that the NATAMI can have huge CPU power.

The point that I was trying to make is that even with a low-clocked 68060,
the NATAMI will have no problem running Workbench or games a lot faster than our old A4000 did.

I think every Amiga fan has to like such a system.
I have Athlon 64 and DuoCore systems as well, but they don't boot as fast as the NATAMI (and they don't run AMIGA-OS).

What system are you using?
Are you still using AMiGA OS, or maybe AROS?

Quote:

umisef wrote:
@BigGun

Quote:
Okay, last time I looked at UAE sources there where around 30-50 instructions
used per function that emulates one 68k instruction. I think the code in question was in "cpuemu.c"
Can you explain this?

Yes, I can. You look at the *interpreting* emulation, and then talk, most explicitly, about performance of the JIT emulation.
You know, the JIT compiler does not just create a whole lot of "call xxxxx" references to the stuff in cpuemu.c. It actually translates code. You'll want to have a look in compemu.c (you know, COMPiling EMUlation....).


Fine. I see, I looked at the 68k emulation code and not the JIT code then.
I think it would simplify things if you could give some numbers:
- How many x86 instructions does UAE need to identify an 68k code?
- How many x86 instructions does the emulation need at maximum to emulate one 68k instruction?
- How many x86 instructions does the JIT need at maximum to emulate one 68k instruction?
I hope it's no hassle for you to give us some ballpark answers?

Quote:

Quote:
But the actual design is not how the Natami works. There are more than one memory bank and one bank is CPU only SRAM, so there are cases the CPU has full access to its SRAM bank.

That's no longer "chipmem" then, now is it? It's "fastmem" in Amiga terms.

I always said that it has both Chip and FAST mem.

You are mixing up my points:
I never said that both blitter and CPU can work at max speed at the same memory location.

The point that I was making is that it is common on the AMIGA to use both Blitter AND CPU at the same time.
It was normal for games to draw most of the GFX with the blitter but, in addition to this, use the CPU for some drawing too. Even on the Amiga Workbench it's common to see BLITTER and CPU drawing mixed.

The points that I made related to this are:
a) The NATAMI is optimized for this operation.
- You can mix Blitter and CPU operations without problems.
- You can read+write both with BLITTER and read+write with CPU in GFX mem.
- This is something you CAN NOT do on other architectures without getting MAJOR penalties.

b) The NATAMI gets speed by using very low latency memory.
With 10ns bus clock the CHIPMEM has a very very low latency.
When you texture map you have to read and write to memory.
With a normal texture map implementation these reads and writes will 99% of the time not be in a row.
If you read memory in an arbitrary line in memory (not in a row) you have a huge latency impact with normal memory. The NATAMI memory will not have this latency problem.

Quote:

Quote:
Goal of the blitter is to do 100% random texture fetch in the SRAM in 10ns - in a much huger amount of memory than your AMD CPU has.

You really don't want BGA stuff in a hobby project.

If you take a look at the CPU card of the NATAMI you will see that Thomas uses BGA chips already.
Maybe you just leave this decision to Thomas, as the HW designer?

Quote:

Quote:
But your AMD can NOT access his 2nd level cache every CPU clock.
You can probably tell me how many clocks delay it has?

My AMD can certainly access its L2 more than 50 million times a second (which is all that 060/50 of yours can hope for).


Why are you trying to show off with your AMD against a 50MHz 68060?
There are higher-clocked 68060 CPUs available, at 75 or 80 MHz.
I always said that the 68k can not compete in CPU power with an x86 CPU.
And if you compare it with the computing power of the CELL, then the AMD looks really poor.

But the point that we were trying to make is that in a normal CPU system
the CPU has latency even to its own cache, and the latency to memory is tremendous.
- How many cycles of latency does your AMD have to 2nd level cache?
- How many cycles of latency does your AMD have to memory?

Again, I said that hoping for a 68k that has more CPU power than a PPC or x86 is silly.
If you want more CPU power then the NATAMI has the CPU SLOT.
What I said is that in many algorithms memory latency is the limiting factor.
And as the Natami has very low latency, the overall performance is much higher than the clock rate would make you think.

You know yourself that it's easy to have a test case where the memory addresses that you work with are not good for your cache, and the working data will push your instructions out of the cache and vice versa.
- Let's say you read single bytes on every address+=4096 over a block of 8MB
How many bytes/sec can your AMD read then?
Is it faster than the 68060?

Quote:

Saying "The Natami is going to compare favourably to machine X, because the Natami's CPU can access memory on each processor clock, whereas machine X cannot" is a tad disingenuous (as well as blatantly wrong) if you chose to ignore the factor of 60 difference in clock speeds.

You misquote me!
If this was not on purpose then please read my post again.
What I said was that the POWERPC CPU will have huge CPU performance.
But that even a 68k CPU will perform extremely swiftly, as the memory latency on the NATAMI is that low.
And I said that with this setup you can even "construct" situations where the 68k CPU performs well compared to other CPUs.

You know yourself that the things that make today's computers slow are the situations where you wait for memory.
- There are CPUs today that have 2nd level cache latencies of 30 or more clocks,
while the 68k can execute every instruction.
The higher clocked chip might wait 30 instructions before being able to read its own cache.
- Many CPUs today have memory latencies of several hundred clocks!
The 68060 can read memory on every clock - but these CPUs need to wait hundreds of instructions when they access memory in an unpredictable way.

These are simple facts.

Quote:
How do you measure this?
Were you randomly writing only or where you randomly reading too?

I measure how many writes I can make into my video memory by making a certain number of writes into video memory and timing how long it takes. The obvious way, and the way that creates actual, verified results. Yes, there was a subtext of a reproach in that sentence.

No, I wasn't reading from video memory. Yes, we agree that reading from video memory is slow. No, your statement which I corrected was not talking about reading from video memory, either. In fact, it very explicitly referred to writing. No, "voxeling" does not involve reading from video memory.

Voxeling was only one example, as was texture mapping.
Texture mapping involves both reads and writes.
- Do we agree that the NATAMI chipram can compete nicely with the fastest GFX card available for old and new Amigas?
- Do we agree that if you mix Blitter and CPU operations, as is often done on the AMIGA when creating GFX,
you can do this nicely with the NATAMI but get huge performance impacts on other GFX cards?

- What number do you get if you involve reads?
Could you please post the code of your pixel benchmark?
I would like to see "how" random your accesses are.

Quote:
So how do you do this on 68k and PPC?

I have no idea. I don't care.

Aha. You don't care.
Do you use AMIGA-OS on your machine?

Gunnar

_________________
APOLLO the new 68K : www.apollo-core.com

Status: Offline

Donar

Re: MeKa 2008 (Amiga Party) (SHOWN WAS NEW AMIGA HW!)
Posted on 24-Jan-2008 7:15:57

[ #224 ]

Regular Member

Joined: 12-Nov-2006
Posts: 117
From: Germany

@umisef
Quote:
Why not ask for cold fusion, peace in the middle east, and the squaring of the circle?

OK, I'll be the first to step forward and ask:
umisef, could you please do that for me? I promise to provide 10 large packets of Erdnusslocken in exchange.

Deal?

_________________
<- Amiga 1260 / CD ->
Looking for:
A1200/CF CFV4/@200,256MB,eAGA,SATA,120GB,AROS

Status: Offline

Krischan76

Re: MeKa 2008 (Amiga Party) (SHOWN WAS NEW AMIGA HW!)
Posted on 24-Jan-2008 7:50:38

[ #225 ]

Member

Joined: 25-Dec-2007
Posts: 47
From: outside the looney bin

Quote:
Yes, I'm enthusiastic about the Natami.

So am I.

Status: Offline

TheDaddy

Re: MeKa 2008 (Amiga Party) (SHOWN WAS NEW AMIGA HW!)
Posted on 24-Jan-2008 10:30:47

[ #226 ]

Elite Member

Joined: 30-Sep-2005
Posts: 4499
From: Quattro Stelle

@umisef

>>Perish the thought that one should talk technical stuff in a thread about someone building a computer....

You can be sarcastic, that's fine, but all this bitching isn't helping anyone. Fact: there is a possibility of a new Amiga machine in YEARS and that's what we need to concentrate on. Until AROS is completed and has a few nice applications then Amiga OS can only run natively on Amigas and with a bit of luck the Natami.

>>I have not seen *anyone* criticise what Thomas is doing.

I have no problem with that, as long as it's good, constructive criticism: advice rather than slagging off.

>>Thomas is trying to build an Amiga. Great! It's not, however, going to be taking the world by storm, or to outperform the current crop of non-Amiga computers, So let him do what he wants to do, and stop asking for things which make no sense. I mean, come on --- a ColdFire CPU card with a Cell attached? As a hobby project? Why not ask for cold fusion, peace in the middle east, and the squaring of the circle?

I don't think it's Thomas's intention to take the world by storm and beat the crap out of Wintel machines, but only give us an updated Amiga.

I don't think there is such a thing (a Coldfire CPU with a Cell attached). I think he meant there is a CPU slot which will allow you to connect a Coldfire and maybe a Cell card.
I think that it's out of Thomas's plans to include the peace in the middle east but I bet he is working on circling the square.

Last edited by TheDaddy on 24-Jan-2008 at 11:06 AM.

_________________
www.loriano.pwp.blueyonder.co.uk

Status: Offline

Leo

Re: MeKa 2008 (Amiga Party) (SHOWN WAS NEW AMIGA HW!)
Posted on 24-Jan-2008 11:12:48

[ #227 ]

Super Member

Joined: 21-Aug-2003
Posts: 1597
From: Unknown

Quote:

Other CPUs can be Coldfire or PowerPC/Cell.
So its clear that the NATAMI can have a huge CPU power.

The point that I made was trying to make is that even with a low clocked 68060,
the NATAMI will have no problem to run Workbench or Games a lot faster than our old A4000 did.

I think every Amiga fan has to like such a system.
I have Athlon 64 and DuoCore system as well, but they don't boot as fast as the NATAMI ( and they don't run AMIGA-OS)

And what system do you use with your PowerPC/Cell ? Everyone knows AmigaOS runs on PowerPC as well as CELL ;) Or maybe you're talking about AmigaOS 5 ? :)

Come on...

_________________
http://www.warpdesign.fr/

Status: Offline

umisef

Re: MeKa 2008 (Amiga Party) (SHOWN WAS NEW AMIGA HW!)
Posted on 24-Jan-2008 11:53:25

[ #228 ]

Super Member

Joined: 19-Jun-2005
Posts: 1714
From: Melbourne, Australia

@BigGun

Quote:
I saw the Natami boot from power on to seeing the Workbench in 2 seconds.
You have to be enthusiastic if you see this as AMIGA fan

Sounds pretty similar to the boot speed from (emulated) power-on to Workbench that people were getting excited about in 2001. Personally, I still don't see the attraction in a fast boot if the thing you are booting to has little useful software.

Quote:
I many aspect the AGA++ is even faster then ANY GFX Card available for New AmigaONE or Pegasos.

What aspects would those be, apart from the rather useless "have the CPU read data from display memory"?

Quote:
The Natami has a CPU card.
The first CPUs will be 68060.
Other CPUs can be Coldfire or PowerPC/Cell.
So its clear that the NATAMI can have a huge CPU power.

How is a Coldfire or PowerPC going to provide "huge CPU power"? Especially when paired with what, compared to the standard fare of those CPUs, is ####-poor memory?
The G4 PowerPC has 32 byte cachelines. Judging by your earlier comments, that would result in no less than 8 read cycles from that amazing 10ns per cycle memory to fill a single cache line. Even ignoring overheads, you are still looking at no more than 12.5 million cache lines being read per second. That's about 1/3 to 1/4 of what a modern CPU can do with cheap DDR2 memory (which is $10-20 per gigabyte, as compared to four figures per gigabyte for static ram).

Quote:
The point that I made was trying to make is that even with a low clocked 68060,
the NATAMI will have no problem to run Workbench or Games a lot faster than our old A4000 did.

Now that is (hopefully) a good and valid point to make. This thing is going to be an upgrade over existing Amigas --- which is great, and I am sure lots of people will be excited about it. And if you had limited yourself to that point, I never would have posted.

Quote:
Fine. I see, I looked at the 68 emulation code and not the JIT code then.
I think it would simplify it if you could give some numbers:
- How many x86 instructions does UAE need to identify an 68k code?

Back in 2002 I did measure a bunch of these things, on a Duron 800.

Back then, interpretative emulation took about 70 clock cycles per 68k instruction. JIT-Translating took about 4000 clock cycles per 68k instruction. I'd expect these numbers to be still accurate within a factor of two --- the IPC is much better for current processors, but the penalty for stalls, in terms of clock cycles, has also gone up.
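
A rough way to read those two figures together: translation is a one-off cost that has to be earned back by faster execution. The sketch below assumes a translated instruction then costs the 1 to 2 cycles quoted further down in this post. The function name `jit_breakeven_runs` is made up for this example, and the model ignores cache effects and block-level overhead.

```c
#include <assert.h>

/* Smallest number of executions at which translating a 68k instruction
 * once and then running the translated code costs fewer host cycles in
 * total than interpreting it every time. All costs are in clock cycles. */
unsigned long jit_breakeven_runs(unsigned long translate_cycles,
                                 unsigned long interp_cycles,
                                 unsigned long jit_cycles)
{
    assert(interp_cycles > jit_cycles);  /* otherwise the JIT never pays off */
    unsigned long saved_per_run = interp_cycles - jit_cycles;
    /* smallest n with n * saved_per_run > translate_cycles */
    return translate_cycles / saved_per_run + 1;
}
```

With 4000 cycles to translate, 70 to interpret and 1 to run translated, the translation pays for itself on the 58th execution; at 2 cycles per translated instruction, on the 59th.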

Quote:

- How many x86 instructions does the emulation it need at maximum to emulate one 68k instruction?

Let's stick with averages while executing software, OK? So that would be 70 clock cycles (a more useful metric than "instructions", and the only one I got data for).

Quote:

- How many x86 instructions does the JIT need at maximum to emulate one 68k instruction?

Again, back then, JIT-compiled code ran at roughly one emulated 68k instruction per 1 to 2 clock cycles. If we assume 1 clock cycle is the case when running cache-only stuff, we can probably expect the Duron 800 to manage 1.2 to 1.6 billion instructions per second, so 1.5 to 2 x86 instructions per 68k instruction.

Most 68k instructions (and almost all of the instructions which 68k programs tend to spend the vast majority of execution time on) actually translate pretty much 1:1; the overhead that pushes the overall ratio up a bit mainly has to do with the things emulation needs to do on basic block boundaries. So if your basic blocks are long, the overhead is spread over many instructions. If they are short, over just a few.
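
The amortisation argument can be put into numbers. In this sketch the per-block overhead of 8 x86 instructions is a purely hypothetical figure, chosen so the results land in the 1.5 to 2 range mentioned above; it is not a measured property of UAE's JIT, and `x86_per_68k` is a name invented here.

```c
#include <assert.h>

/* Average x86 instructions per 68k instruction for a basic block of
 * block_len 68k instructions, each translating to per_insn x86
 * instructions, plus a fixed overhead paid once per block. */
double x86_per_68k(unsigned block_len, double per_insn, double block_overhead)
{
    assert(block_len > 0);
    return (block_len * per_insn + block_overhead) / block_len;
}
```

Under those assumptions a 16-instruction block averages 1.5 x86 instructions per 68k instruction and an 8-instruction block averages 2.0, so halving the block length doubles the share of overhead each instruction carries.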

Quote:
- You can read+write both with BLITTER and read+write with CPU in GFX mem.

However, if you *have* a blitter, why would you use the CPU to read gfx memory? It's nifty that you *can*, but it's about as useful as being able to balance a stack of 10 coke cans....

Quote:
When you texture map you have to read and write to memory.
With a normal texture map implementation these read and writes will 99% not both be in a row.
If you read memory in a arbitrary line in memory (not in a row) you have a huge latency impact with normal memory. The NATAMI memory will not have this latency problem.

Says who? If you do this in cachable memory, your fabled 10ns suddenly turns into 40ns (if using a 16 byte cacheline). Plus the CPU overhead, which is quite high on the 68060. If you use a PowerPC, then you have 32 byte cachelines, and 80ns cache reads --- which just happens to be pretty much the same latency even completely non-optimized code experiences on a cheap AMD64 with cheap DDR2 these days --- only that AMD latency already includes the CPU overhead, which even on a PPC still exists.
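
The 40ns and 80ns figures follow from simple arithmetic, sketched here under the assumption of a 4-byte-wide (32 bit) memory interface that moves one word per 10ns cycle. The function names are illustrative only, and all CPU-side overhead is ignored.

```c
#include <assert.h>

/* Time in nanoseconds to fill one cache line over a bus that moves
 * bus_bytes per memory cycle of cycle_ns nanoseconds. */
unsigned line_fill_ns(unsigned line_bytes, unsigned bus_bytes, unsigned cycle_ns)
{
    assert(bus_bytes > 0 && line_bytes % bus_bytes == 0);
    return (line_bytes / bus_bytes) * cycle_ns;
}

/* Upper bound on cache line fills per second, ignoring all other overhead. */
double line_fills_per_second(unsigned fill_ns)
{
    assert(fill_ns > 0);
    return 1e9 / fill_ns;
}
```

`line_fill_ns(16, 4, 10)` gives 40 and `line_fill_ns(32, 4, 10)` gives 80; `line_fills_per_second(80)` is the 12.5 million cache lines per second mentioned earlier in this post.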

Quote:
But the point the we were trying to make is that in a normal CPU system
the CPU has even a latency to its own cache, and the latency to memory is tremendous.
- How many cycles latency does your AMD to 2nd lever cache?
- How many cycles latency does your AMD to memory?

The latency to 2nd level cache is such that it can chase pointers at 132 million-per-second. Which is, despite all your handwaving and claims to the contrary, faster than any processor board can possibly hope to chase pointers in the Natami's 10ns memory.

The latency to main memory is such that pointers are chased at about 12.5 million per second, using fully cachable memory with a full butterfly pattern which ensures cache misses.
Now, let's say you put an 68060/50 on Natami, and let's further assume that you somehow, magically, can manage to achieve a 1:1:1:1 memory timing (i.e. one clock cycle per memory access). The fastest way to chase pointers would be with a sequence of "MOVE.L (A0),A0" instructions (hope I got the operand order right). You'll find on page 10-15 of the 68060 user manual that that's indeed a 1 clock cycle instruction. Woohoo!
But wait! Page 10-13 states that a data cache miss adds 3 cycles (2+w, with w being 1 for this super-duper-fast memory). So it's now a 4 cycle instruction (assuming you chase in non-cachable memory --- otherwise the trailing 3 memory accesses used to fill the cache line may block subsequent instructions further). But 4 cycles at 50MHz is 12.5 million per second, so at least you are not losing.
Except --- page 10-10, bottom paragraph points out that the 060 is running pipelined, and that it will take an extra cycle for the result of the memory read to be available for use in the next instruction. So you now need *5* clock cycles per pointer chase, in the ideal case. Never mind the looping overhead. So that 060/50 with hyper-fast (and painfully expensive) memory actually turns out to chase pointers in its main memory markedly slower than my box --- my box, which has "tremendous" latency to main memory (and my memory is the cheap kind --- buy even medium priced stuff, and it will chase pointers faster yet).
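
For reference, 5 cycles per chase at 50MHz works out to 10 million chases per second, against the 12.5 million quoted for the AMD box. A pointer-chasing measurement of that kind might look like the sketch below. It is an assumption-laden stand-in, not the test that produced the numbers above: it links the slots into one random cycle (Sattolo's algorithm) rather than the "full butterfly pattern" mentioned, it chases array indices rather than raw pointers, and all names are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Small linear congruential generator, so results do not depend on rand(). */
static unsigned long long lcg_state = 1;
static size_t lcg_next(size_t bound)
{
    lcg_state = lcg_state * 6364136223846793005ULL + 1442695040888963407ULL;
    return (size_t)((lcg_state >> 33) % bound);
}

/* Fill next[] so that following it visits all n slots in one cycle. */
void build_cycle(size_t *next, size_t n, unsigned long long seed)
{
    assert(n > 0);
    lcg_state = seed;
    for (size_t i = 0; i < n; i++)
        next[i] = i;
    for (size_t i = n - 1; i > 0; i--) {
        size_t j = lcg_next(i);          /* j in [0, i), never i itself */
        size_t tmp = next[i];
        next[i] = next[j];
        next[j] = tmp;
    }
}

/* Follow the chain: each load's address depends on the previous load. */
size_t chase(const size_t *next, size_t start, size_t steps)
{
    size_t p = start;
    while (steps--)
        p = next[p];
    return p;
}

/* Chases per second of CPU time as measured by clock(); 0 if too short. */
double chases_per_second(const size_t *next, size_t steps, size_t *end)
{
    clock_t t0 = clock();
    *end = chase(next, 0, steps);        /* storing the result keeps the loop alive */
    clock_t t1 = clock();
    double secs = (double)(t1 - t0) / CLOCKS_PER_SEC;
    return secs > 0.0 ? (double)steps / secs : 0.0;
}
```

To measure main memory rather than cache, the array passed to `build_cycle` would need to be much larger than the last-level cache; a small array measures cache latency instead.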

Quote:
Lets say you read single bytes on every address+=4096 over a blog of 8MB
How many bytes/sec can your AMD read then?
Is it faster then the 68060?

Not a good example (that's only 2048 distinct addresses, which with a 64 byte cacheline size means you are only ever touching 128kB, which will happily fit into the L2 of any recent CPU). So instead, I look at 32MB, with a stride of 256 bytes, OK? Well, that happens at a rate of around 93 million reads per second, which I am absolutely certain is considerably faster than the 68060 could ever hope for.
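
The footprint arithmetic above is easy to check. The sketch below assumes, as the post does, 64 byte cache lines and a stride at least as large as a line, so every read lands in its own line; the function names are illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes of cache touched when reading one byte every `stride` bytes across
 * a block, assuming stride >= line size so each read hits its own line. */
size_t touched_cache_bytes(size_t block_bytes, size_t stride, size_t line_bytes)
{
    assert(stride > 0 && stride >= line_bytes);
    return (block_bytes / stride) * line_bytes;
}

/* The access pattern itself: one byte per stride, summed so the reads are used. */
unsigned long strided_sum(const volatile unsigned char *buf,
                          size_t block_bytes, size_t stride)
{
    assert(stride > 0);
    unsigned long sum = 0;
    for (size_t off = 0; off < block_bytes; off += stride)
        sum += buf[off];
    return sum;
}
```

For 8MB at a stride of 4096 this gives 2048 reads touching 128kB of cache lines; for 32MB at a stride of 256 it gives 131072 reads touching 8MB, which no longer fits in a typical L2 of that era.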

Quote:
What I said was that the POWERPC CPU will have huge CPU performance.
But that even an 68k CPU will perform extremly swift as the memory latency on the NATAMI is that low.

Well, a PowerPC will actually have pretty mediocre performance at best in such a system. Not the least because the only PowerPC which has non-mediocre performance even in a perfect surrounding infrastructure is the G5, and *that one* actually has 128 byte cachelines, requiring 32 memory cycles if paired with a 32 bit memory interface (if that is even technically feasible in the first place). 32 times 10ns is half an eternity.

The 68k will likely be considerably swifter than the 68k performance on existing 68k Amigas, no doubt. However, you have put forward the opinion that at least in some non-nonsensical circumstances, it would be swifter than current non-Amiga hardware. It won't.

Quote:
- Many CPUs today have memory clocks of several hundred !
The 68060 can read memory on every clock - but these CPUs need to wait hundreds of instructions when they access memory in an unpredictable way.

Well, as I have shown you above, the 68060 takes at least 4 clock cycles to "access memory in an unpredictable way", and if you actually want to use the result as an address, it becomes 5 clock cycles. Not quite "every clock".
Which is still a lot less than "hundreds", in terms of numbers. It is, however, not less at all in terms of elapsed time --- which is why I said that comparing access times in units of clock cycles is disingenuous given the huge disparity between clock speeds. And that's latency.
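
To make the units point concrete, here is a small conversion sketch. The 3000MHz figure is an assumption derived from the "factor of 60" over a 50MHz part mentioned earlier in the thread, and 300 cycles is just a stand-in for "several hundred"; neither is a measured value.

```c
#include <assert.h>

/* Convert a latency in clock cycles to nanoseconds at a given clock rate. */
double cycles_to_ns(double cycles, double clock_mhz)
{
    assert(clock_mhz > 0.0);
    return cycles * 1000.0 / clock_mhz;
}
```

`cycles_to_ns(5, 50)` and `cycles_to_ns(300, 3000)` both come out at 100ns: very different cycle counts, identical elapsed time.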

As for throughput --- those "several hundred clock cycles" can be nicely pipelined. I can make 45 million random accesses per second on my cheap hardware by having several in flight at any one time --- a trick the 68060 is not capable of to a comparable degree.

You seem to be offended by things having to wait. Well, it's what happens when one part of your system is faster than another part; slowing down the faster part until the slower part can keep up is not what I'd call a solution.

Quote:
Voxeling was only one example as was texture mapping.
Texture mapping involved both read and writes.

Well, if you use your CPU to texture-map, then the textures have no business being in the gfx card. If you use the gfx chip to texture map, then the CPU never reads the gfx memory.

Say again, why would the CPU read gfx memory?

Quote:

- Do we agree that the NATAMI chipram can compete nicely with the fastest GFX card available for Old and new Amigas?

Compete nicely in what way? And do you want to burden down the graphics cards by making them use Amiga drivers, or do you want to talk about what they actually *can* do, with decent drivers?

Quote:

- Do we agree that if you mix Blitter and CPU operations as often done in AMIGA on GFX creating.
That you can do this nicely with NATAMI both get huge performance impacts on other GFX cards?

Why would this be any different from gfx cards --- they have blitters too, you know, and you can write to them quite nicely.
Will it come out better than existing Amiga gfx cards in existing Amiga systems? Quite possibly. Will it be superior to the box under my desk in that regard --- nope.

Quote:
Could you please post the code of your pixel benchmark.
I would like to see "how" random your access are.

unsigned long* y, *z;
z=(unsigned long*)MEMBASE+SIZE;
for (j=0;j

Last edited by umisef on 24-Jan-2008 at 12:06 PM.
Last edited by umisef on 24-Jan-2008 at 12:04 PM.
Last edited by umisef on 24-Jan-2008 at 12:01 PM.

Status: Offline

Hammer

Re: MeKa 2008 (Amiga Party) (SHOWN WAS NEW AMIGA HW!)
Posted on 24-Jan-2008 12:23:24

[ #229 ]

Elite Member

Joined: 9-Mar-2003
Posts: 5616
From: Australia

@ChaosLord

Quote:

ChaosLord wrote:
@Hammer

Intentionally misquoting people is lame.

How am I supposed to believe anything you say about complicated obtuse technical issues when you can't even understand plain simple english sentences written by BigGun?

Can't you follow the thread?

For the full context...
Quote:

BigGun: Then 800 means that this memory can transmit 1 word at 800 MHz in a burst.

The context was set for a typical PC’s DDR800 memory module.

Anyway, look up “data streaming” protocols in MPX and P4 bus.

"Both the MPC7455's and the P4's bus protocols support data streaming. This feature allows the G4's newer MPX bus to continually stream data over the bus at the rate of 8 bytes per beat until some other bus master interrupts the stream by requesting the bus. This makes much more efficient use of bus bandwidth, and allows the G4 to approach the bus's theoretical peak bandwidth". - Arstechnica

Quote:

Hammer:
That 800MHz is not a true 800MHz, i.e. it is 400MHz at double data rate. Mainstream architecture leans towards streaming computation.

To lower latency, some modern PC NBs and MCHs have pre-fetch techniques, e.g. nVidia's nForce 600i Series for Intel (with DASP 4.0) and Intel 840.

nVidia's nForce 600i's DASP has NB cache, NB pipelining and NB pre-fetch techniques. nForce 680i SLI with DDR2 800 (4,4,4,T1) has 27.2 ns @ 400MHz.

Intel 965, 975, P35 and P38 have Intel FMA (Fast Memory Access), i.e. out-of-order processing.
http://www.intel.com/products/chipsets/q965_q963/demo/demo.html

In real world application benchmarks, there's very little gain in nForce 600i's DASP vs Intel 965.

Still in the same PC context. Note that the chipsets mentioned were for Intel x86 (Core 2) based PCs.

I then typed up Intel 965's handling of memory access.

Quote:

I'm not sure what you are trying to prove.
So you are saying that the Natami with 10ns is about three times faster than nVidia's nForce?
You are probably right.
I think there is no question that the Natami is fast.

One thing is fact:
While CPU clock rates did increase a lot in recent years, the latency of DRAM did not improve at all.
If a program is memory latency bound then it runs at a fraction of the CPU clock rate.

For example a typical pointer chase algorithm on your 3GHz x86 runs at memory latency speed.
In other words such an algorithm will run at max 25 MHz !!!

If you have a low latency design like the Natami then you will have a much better performance for such cases. The Natami will easily run 3-4 times faster than your K8 in such a case.

The normal way of trying to hide the latency is, as you correctly pointed out, prefetching.
This is the reason why many CPUs have huge cache line sizes of 128 bytes or more.

I would disagree with "many CPUs have huge cache line sizes of 128 bytes or more" when
1. Intel’s Core 2 architecture is the dominant x86 CPU for the 2007 sales period.
2. Intel’s Core 2 architecture is equipped with 64-byte cache lines.
3. The context was Core 2 chipsets.

Lastly, I requested AOS 68K application benchmarks.

Last edited by Hammer on 24-Jan-2008 at 01:01 PM.
Last edited by Hammer on 24-Jan-2008 at 12:57 PM.
Last edited by Hammer on 24-Jan-2008 at 12:50 PM.
Last edited by Hammer on 24-Jan-2008 at 12:42 PM.

_________________
Amiga 1200 (rev 1D1, KS 3.2, PiStorm32/RPi CM4/Emu68)
Amiga 500 (rev 6A, ECS, KS 3.2, PiStorm/RPi 4B/Emu68)
Ryzen 9 7900X, DDR5-6000 64 GB RAM, GeForce RTX 4080 16 GB


olegil

Re: MeKa 2008 (Amiga Party) (SHOWN WAS NEW AMIGA HW!)
Posted on 24-Jan-2008 12:45:28

[ #230 ]

Elite Member

Joined: 22-Aug-2003
Posts: 5895
From: Work

@umisef

Why not BGA in a hobby project? I can solder BGA packages a lot easier than most people solder a TQFP...

_________________
This week's pet peeve:
Using "voltage" instead of "potential", which leads to inventing new words like "amperage" instead of "current" (I, measured in A) or possibly "charge" (ampere-hours, Ah, or coulombs, C). Sometimes I don't even know what people mean.


umisef

Re: MeKa 2008 (Amiga Party) (SHOWN WAS NEW AMIGA HW!)
Posted on 24-Jan-2008 13:02:34

[ #231 ]

Super Member

Joined: 19-Jun-2005
Posts: 1714
From: Melbourne, Australia

@olegil

Quote:
Why not BGA in a hobby project? I can solder BGA packages a lot easier than most people solder a TQFP...

Well, it *does* need rather specialized equipment, which is not typically in a hobbyist's garage.

But what would be the real killer for me --- you can't visually inspect it. When your prototype doesn't work, and some signals are looking decidedly odd, you can't look for shorts in the solder. So all you can really do is make another prototype, and look at the same signals. If they are different, then you *might* have had a solder short. If they are the same, it's likely to be a systemic problem.

Give me things to look at any day. I don't even want to be able to use the sharp knife to remove a short (with the small packages these days, that tends to remove one and create three others). But at least being able to spot the short, and to go "aha! A short!" is good :)


Hammer

Re: MeKa 2008 (Amiga Party) (SHOWN WAS NEW AMIGA HW!)
Posted on 24-Jan-2008 13:20:39

[ #232 ]

Elite Member

Joined: 9-Mar-2003
Posts: 5616
From: Australia

@BigGun
Quote:
And if you compare the computing power of the CELL then the AMD looks really poor

The computing power of the AMD processor stack** makes CELL look really poor.

**AMD RV670 (or RV680) + AMD64. RV670 supports IEEE754 and double precision floating point. RV680 has ~1 TFLOPS (theoretical for stream/shader processors) on a card, i.e. basically two RV670s with an unknown CrossFire management chip.

Last edited by Hammer on 24-Jan-2008 at 01:21 PM.


olegil

Re: MeKa 2008 (Amiga Party) (SHOWN WAS NEW AMIGA HW!)
Posted on 24-Jan-2008 14:10:45

[ #233 ]

Elite Member

Joined: 22-Aug-2003
Posts: 5895
From: Work

@umisef

Hehe. You are technically correct, but...

We use some QFN packages here, and after a dodgy prototype batch we decided to look at them.

One guy here knows a dentist, and HE had the equipment we needed to look under the buggers.

So if you know a dentist, hire his digital x-ray equipment for 2 minutes. Problem solved. Or at least identified, in our case.


BigGun

Re: MeKa 2008 (Amiga Party) (SHOWN WAS NEW AMIGA HW!)
Posted on 24-Jan-2008 17:09:31

[ #234 ]

Regular Member

Joined: 9-Aug-2005
Posts: 438
From: Germany (Black Forest)

@umisef

Quote:

Quote:
Let's say you read single bytes on every address+=4096 over a block of 8MB
How many bytes/sec can your AMD read then?
Is it faster than the 68060?

Not a good example (that's only 2048 distinct addresses, which with a 64 byte cacheline size means you are only ever touching 128kB, which will happily fit into the L2 of any recent CPU). So instead, I look at 32MB, with a stride of 256 bytes, OK? Well, that happens at a rate of around 93 million reads per second, which I am absolutely certain is considerably faster than the 68060 could ever hope for.

Use the example that I gave you!
It's a very good example.

Touching every 4096th address is a typical use case when someone draws a vertical line at a 1024x768 resolution in truecolor. It's a very realistic example.

Your answer is WRONG.
Do you actually have A CLUE how a CPU cache works?
Or are you trying to fool people here?

BTW: You are misquoting me a lot.
And you make claims based on so-called "facts" that you invented.

For example:
In your above post you say that the PPC CPU will be slow because it cannot burst to chipmem.
And you claim that an AMD would be fast as it can burst to DDR mem.

This is total NONSENSE because:
a) No CPU can ever burst to chipmem.
As chipmem is always uncacheable.
Because chipmem is uncacheable the speed is fully dependent on the latency!

b) Fastmem is cacheable.
A PPC with DDR fastmem can of course burst to it.
An x86 has no advantage here at all.

Are you trying to twist things around here?
Your claim that an AMD is faster because it could burst is plain wrong.
The AMD could not burst from uncacheable mem either.

I really don't understand your point.
The NATAMI GFX board is the fastest existing AMIGA GFX hardware.
If I wanted a NON AMIGA HW that is fast and cheap then I could use the Playstation 3.

I say that the NATAMI is fast because the chip memory has a MUCH lower latency than normal memory and many CPU 2nd level caches.
You agree that the latency is very, very low,
much lower than normal memory, yet you make a big fuss by saying that the cache of your XX gigahertz CPU has a latency of 8ns, which is even better than 10ns.
Come on, what silly way of AMIGA bashing is this?
Saying that the AMIGA is slow because your multi-gigahertz CPU has 20% faster cache than the AMIGA memory?

_________________
APOLLO the new 68K : www.apollo-core.com


BigGun

Re: MeKa 2008 (Amiga Party) (SHOWN WAS NEW AMIGA HW!)
Posted on 24-Jan-2008 17:18:15

[ #235 ]

Regular Member

Joined: 9-Aug-2005
Posts: 438
From: Germany (Black Forest)

@Hammer

Quote:

Hammer wrote:
@BigGun
Quote:
And if you compare the computing power of the CELL then the AMD looks really poor

The computing power of the AMD processor stack** makes CELL look really poor.

**AMD RV670 (or RV680) + AMD64. RV670 supports IEEE754 and double precision floating point. RV680 has ~1 TFLOPS (theoretical for stream/shader processors) on a card, i.e. basically two RV670s with an unknown CrossFire management chip.

WHAT DOES YOUR POST HAVE TO DO WITH AMIGA?

CAN YOU RUN AMIGA OS ON YOUR x86 CPU?

We have asked you this question many times now.
There are many fast computers out there,
ranging from a simple HP Superdome to IBM Blue Gene or the IBM Roadrunner project.

But can you run AMIGA OS on them?
What is the purpose of your constant off-topic posts?


umisef

Re: MeKa 2008 (Amiga Party) (SHOWN WAS NEW AMIGA HW!)
Posted on 24-Jan-2008 18:30:33

[ #236 ]

Super Member

Joined: 19-Jun-2005
Posts: 1714
From: Melbourne, Australia

@BigGun
Quote:
Let's say you read single bytes on every address+=4096 over a block of 8MB
[....]
Use the example that I gave you!
It's a very good example.

It's an awful example, because it runs a very serious risk of running in cache. But OK, if your heart is set on it...
My machine does that at just over 60 million reads per second. Happy now?
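
The risk of "running in cache" comes down to simple arithmetic. A small sketch (function names are made up for illustration; the 64-byte line size is the figure used earlier in this thread, and the block is assumed to start on a line boundary) that counts how much cache a strided walk actually occupies:

```c
#include <assert.h>
#include <stddef.h>

/* Number of addresses visited when stepping through block_bytes in stride_bytes steps. */
size_t touched_addresses(size_t block_bytes, size_t stride_bytes)
{
    assert(stride_bytes > 0);
    return (block_bytes + stride_bytes - 1) / stride_bytes;
}

/*
 * Bytes of cache the walk occupies: every distinct cache line touched is loaded whole.
 * With a stride smaller than a line, consecutive accesses share lines, so the
 * effective step between distinct lines is at least one line.
 */
size_t cache_footprint(size_t block_bytes, size_t stride_bytes, size_t line_bytes)
{
    assert(stride_bytes > 0 && line_bytes > 0);
    size_t step = stride_bytes > line_bytes ? stride_bytes : line_bytes;
    return touched_addresses(block_bytes, step) * line_bytes;
}
```

For the 8MB block with a 4096 byte stride this gives 2048 addresses and a 128kB footprint; a 32MB block with a 256 byte stride gives a far larger 8MB footprint.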

Quote:
Touching every 4096th address is a typical use case when someone draws a vertical line at a 1024x768 resolution in truecolor. It's a very realistic example.

Well, that would be (a) writing instead of reading, (b) 32 bit ints instead of bytes, and (c) to video memory rather than from main memory. Not to mention that "1024x768" and "typical" really don't belong in the same sentence in 2008.

Quote:
Your answer is WRONG.
Do you actually have A CLUE how a CPU cache works?
Or are you trying to fool people here?

Well, yeah, I make my living knowing how this stuff works. Although I reckon the 60 million per second for the above test (which you insisted on) is the hardware prefetch kicking in, rather than the CPU cache. Same outcome, though --- it gets damn fast.

Quote:

In your above post you say that the PPC CPU will be slow because it cannot burst to chipmem.
And you claim that an AMD would be fast as it can burst to DDR mem.

No, I am saying that a PPC which is using 10ns SRAM, rather than something properly pipelined, will suffer. And as someone said, quote,
Quote:
There is more than one memory bank and one bank is CPU only SRAM, so there are cases the CPU has full access to its SRAM bank.

that someone suggested that the "fast ram" was indeed SRAM. I mean, come on, all your "oh, just look at the latency" stuff was about 10ns RAM that the CPU had unshared access to. Are you now trying to say that oh no, things will magically be different and the fastmem is common DDR after all? Make up your mind!

Quote:
This is total NONSENSE because:
a) No CPU can ever burst to chipmem.
As chipmem is always uncacheable.
Because chipmem is uncacheable the speed is fully dependent on the latency!

OK, so we are back to talking about "chipmem". So, we are back at 65 million writes per second to my gfx memory. Yeah, *reading* from gfx memory is slow, but you have yet to provide a single example where one would have reason to read gfx memory with the CPU. While you talk a lot about how things are shared and work in tandem, all your examples are about writing to gfx memory, whereas all your arguments are about reading.

Quote:
b) Fastmem is cacheable.
A PPC with DDR fastmem can of course burst to it.
An x86 has no advantage here at all.

Of course a PPC with fast pipelined DDR can properly burst. However, we are talking Natami, and its amazing 10ns SRAM, right? Which is not DDR, not pipelined, and most of all, not fast.

Quote:
The NATAMI GFX board is the fastest existing AMIGA GFX hardware.

Actually, I very strongly doubt that. In fact, I wonder whether the Natami can even manage to drive an ordinary 20" widescreen monitor in full colour. Those things are 1680x1050, or 1.76 megapixels. Refresh is 60 times per second, for a total of just over 105 megapixels/s. Which is just over 420 megabytes per second, purely for refreshing the screen. No manipulation yet.
A 32 bit memory interface running on a 10ns cycle can only provide 400MB/s. And the 20" screen actually doesn't just demand pixels at a steady rate of 105 million per second --- nope, due to the overhead of black shoulders and syncs, the actual pixel clock is 147MHz.
So even if you blew up that memory interface to 64 bit at 10ns cycles (which is not going to be fun to lay out, nor cheap to buy memory for), image refresh would use 3/4 of that interface bandwidth for most of the time. That's *before* thinking about a blitter doing its thing, or God forbid the CPU trying to get an access in.
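
The arithmetic above is easy to check. A minimal sketch, assuming 4 bytes per pixel and one full-width memory access per cycle (the function names are illustrative and not taken from any Natami documentation):

```c
#include <assert.h>
#include <stdint.h>

/* Bytes per second needed to scan out the visible pixels only (no blanking overhead). */
uint64_t refresh_bytes_per_sec(uint32_t width, uint32_t height, uint32_t hz, uint32_t bytes_per_pixel)
{
    return (uint64_t)width * height * hz * bytes_per_pixel;
}

/* Peak bytes per second of a memory interface doing one bus_bits-wide access every cycle_ns.
 * Integer division: cycle times that do not divide 1e9 evenly are rounded down. */
uint64_t memory_bytes_per_sec(uint32_t bus_bits, uint32_t cycle_ns)
{
    assert(cycle_ns > 0 && bus_bits % 8 == 0);
    return (uint64_t)(bus_bits / 8) * (1000000000u / cycle_ns);
}

/* Percentage (rounded down) of the memory bandwidth consumed while the display
 * is actively fetching pixels at the given pixel clock. */
uint32_t scanout_share_percent(uint32_t pixel_clock_hz, uint32_t bytes_per_pixel, uint64_t mem_bytes_per_sec)
{
    assert(mem_bytes_per_sec > 0);
    return (uint32_t)(((uint64_t)pixel_clock_hz * bytes_per_pixel * 100) / mem_bytes_per_sec);
}
```

`refresh_bytes_per_sec(1680, 1050, 60, 4)` comes to 423,360,000 bytes/s against 400,000,000 bytes/s for `memory_bytes_per_sec(32, 10)`, and a 147MHz pixel clock takes 73% of a 64 bit, 10ns interface.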

Quote:
When I say that the NATAMI is fast as the chip memory has a MUCH lower latency than normal memory and many CPU 2nd level caches.

Then you are saying something silly, because it just doesn't when you actually try to use it.

Quote:
Much lower than normal memory you make a big fuss by saying that the cache of your XX gigahertz CPU has a latency of 8ns, which is even better than 10ns.

Well, again --- it certainly wasn't me who brought CPU cache speed into the discussion, now was it? That was one "biggun"...

Quote:
Come on, what silly way of AMIGA bashing is this.

Don't flatter yourself, I am not bashing the Amiga. I am exposing you for the naive fool you are. You are not the Amiga, you are just the guy who makes stupid claims and then gets all upset when someone points out that they are, indeed, stupid.

Quote:
Saying that the AMIGA is slow because your multi-gigahertz CPU has 20% faster cache than the AMIGA memory?

Uhm, when did we start talking about "the Amiga", rather than about the Natami?

And I guess you haven't been paying attention lately --- my machine chases pointers in MAIN MEMORY faster than any 68060 Natami can ever hope to do, even if one ignores all the issues about fastmem vs chipmem.
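
For anyone following along, "chasing pointers" means that the address of each load comes out of the previous load, so nothing can be overlapped and every step costs one full memory latency. A minimal sketch of such a test, not the benchmark actually used for the figures in this thread (names are hypothetical, and `rand()` is only good enough for small chains):

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Fill next[0..n-1] with a random permutation that forms one single cycle
 * (Sattolo's algorithm), so a chase visits every slot before repeating.
 */
void build_chain(size_t *next, size_t n)
{
    assert(n > 0);
    for (size_t i = 0; i < n; i++)
        next[i] = i;
    for (size_t i = n - 1; i > 0; i--) {
        size_t j = (size_t)rand() % i;   /* j < i, never i itself */
        size_t tmp = next[i];
        next[i] = next[j];
        next[j] = tmp;
    }
}

/* Follow the chain for `steps` hops; each load needs the result of the previous one. */
size_t chase(const size_t *next, size_t start, size_t steps)
{
    size_t p = start;
    while (steps--)
        p = next[p];
    return p;
}
```

Timing `chase()` over a chain much larger than the caches and dividing the elapsed time by `steps` approximates the per-access latency, which is the figure the "25 MHz" claim earlier in the thread is really about.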

Again --- the Natami is a nice project, and I applaud Thomas. This thread, on the other hand, and particularly your contributions, were awful enough to make me post refutations based on quite time-consuming benchmarks...
Nobody is saying "the Amiga is slow", or even "the Natami is slow". I am merely saying "biggun is wrong ascribing semi-magical capabilities to the Natami, and here is exactly why". Do you understand the difference? You are spouting nonsense. Thomas built a computer. Thomas has my deepest respect.


ferrels

Re: MeKa 2008 (Amiga Party) (SHOWN WAS NEW AMIGA HW!)
Posted on 24-Jan-2008 19:53:11

[ #237 ]

Cult Member

Joined: 20-Oct-2005
Posts: 922
From: Arizona

@umisef and @BigGun

Sheesh, I wish you guys would stop arguing.


TheDaddy

Re: MeKa 2008 (Amiga Party) (SHOWN WAS NEW AMIGA HW!)
Posted on 24-Jan-2008 20:16:32

[ #238 ]

Elite Member

Joined: 30-Sep-2005
Posts: 4499
From: Quattro Stelle

@ferrels

I know, I have asked them before, just give it up.

Some like PCs, some like Amigas, some like PS3, some like XBOX, some like Apple (yuk!), some like Linux, some like VHS, some like Betamax, some like Blu-ray and so on.

Let's just make sure that the Natami happens!

The more hardware to run Amiga OS on the better. Personally I would like Amiga OS on a wide range of well supported, cheap and fast PC motherboards, but also on PPC architecture. A Natami would also be excellent; the more the better!

_________________
www.loriano.pwp.blueyonder.co.uk


HenryCase

Re: MeKa 2008 (Amiga Party) (SHOWN WAS NEW AMIGA HW!)
Posted on 24-Jan-2008 21:04:21

[ #239 ]

Cult Member

Joined: 12-Nov-2007
Posts: 728
From: Unknown

@umisef and @BigGun

Quote:
Quote:
The NATAMI GFX board is the fastest existing AMIGA GFX hardware.

Actually, I very strongly doubt that. In fact, I wonder whether the Natami can even manage to drive an ordinary 20" widescreen monitor in full colour. Those things are 1680x1050, or 1.76 megapixels. Refresh is 60 times per second, for a total of just over 105 megapixels/s. Which is just over 420 megabytes per second, purely for refreshing the screen. No manipulation yet.
A 32 bit memory interface running on a 10ns cycle can only provide 400MB/s. And the 20" screen actually doesn't just demand pixels at a steady rate of 105 million per second --- nope, due to the overhead of black shoulders and syncs, the actual pixel clock is 147MHz.
So even if you blew up that memory interface to 64 bit at 10ns cycles (which is not going to be fun to lay out, nor cheap to buy memory for), image refresh would use 3/4 of that interface bandwidth for most of the time. That's *before* thinking about a blitter doing its thing, or God forbid the CPU trying to get an access in.

The exchange shown above kind of sums up your argument (at least in my eyes). One makes a blanket statement about the power of the Natami (but doesn't/can't back it up), and one replies with a bunch of figures which have little to do with the original point.

So let's break down whether Natami has the fastest/most powerful Amiga graphics hardware around:

1. What is the most powerful GFX card available for Amiga today?
2. What is the maximum screen resolution (and refresh rate at this screen resolution) on this card?
3. What is the maximum theoretical (device isn't finished) screen resolution/refresh rate on the Natami?
4. What is the most graphically intensive app that classic Amigas can run (preferably one that requires a GFX card)?
5. How likely is it that Natami could run an app of similar or better graphical quality?

Anyone who can answer the above questions (even if it is just questions 1 and 4) please do so.


umisef

Re: MeKa 2008 (Amiga Party) (SHOWN WAS NEW AMIGA HW!)
Posted on 25-Jan-2008 1:48:36

[ #240 ]

Super Member

Joined: 19-Jun-2005
Posts: 1714
From: Melbourne, Australia

@HenryCase

Quote:
The exchange shown above kind of sums up your argument (at least in my eyes). One makes a blanket statement about the power of the Natami (but doesn't/can't back it up), and one replies with a bunch of figures which have little to do with the original point.

Uhm, do you really think something can be "the fastest existing Amiga gfx hardware" if it cannot even drive a current monitor without first having its bandwidth doubled (from the non-existent version that Gunnar has been advocating, no less, not from the version actually shown at Meka)?

Quote:
1. What is the most powerful GFX card available for Amiga today?

Well, for an actual classic Amiga, it would have to be a tossup between a CVPPC, and a Voodoo4/5 in a PCI board.
On the PPC side, the R100/R200 series Radeon cards are supported, in addition to Voodoo cards.

Quote:
2. What is the maximum screen resolution (and refresh rate at this screen resolution) on this card?

What this comes down to is pixel clock. The CVPPC can pump out 24 bit pixels at 145MHz, so it can just about drive a 1680x1050@60 20" widescreen LCD. The Voodoo cards have a 350MHz RAMDAC, so they can pretty much drive anything with a VGA connector.
The Radeons are considerably ahead of the Voodoo stuff --- while AFAIK not supported under OS4, even their *secondary* RAMDAC is fast enough to drive 1600x1200 resolutions.

Quote:
3. What is the maximum theoretical (device isn't finished) screen resolution/refresh rate on the Natami?

Depends on what memory system our local advocate is talking about at any given time. A 10ns cycle on the memory means reading out one set of data 100 million times a second. How many pixels each set provides depends on the width of the memory (a hardware and cost decision), and the bits per pixel. Assuming 64 bit width for the memory, and full-colour pixels using 32 bits each (the latter is taken straight from Gunnar's line drawing example), pixels can be read from memory at a rate of 200 million per second. That's the rate required to drive a 1920x1200 24" widescreen.
Assuming 32 bit wide memory (which is what Gunnar previously suggested the Natami had), one would only get 100 million pixels per second, enough to drive a full-colour 1024x768 screen, but not enough for a 1280x1024 screen (which is what all those 17" and 19" LCDs are).
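
The same estimate can be written down as a quick check. This is only a sketch under the assumptions stated above (one full-width read per memory cycle, all bandwidth spent on scanout); the function names are invented for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Pixels per second a memory interface can deliver if it does nothing but scanout. */
uint64_t pixels_per_sec(uint32_t bus_bits, uint32_t cycle_ns, uint32_t bits_per_pixel)
{
    assert(cycle_ns > 0 && bits_per_pixel > 0);
    uint64_t accesses_per_sec = 1000000000u / cycle_ns;
    return accesses_per_sec * bus_bits / bits_per_pixel;
}

/* Non-zero if the memory can feed a display mode with the given pixel clock. */
int can_scan_out(uint32_t pixel_clock_hz, uint32_t bus_bits, uint32_t cycle_ns, uint32_t bits_per_pixel)
{
    return pixels_per_sec(bus_bits, cycle_ns, bits_per_pixel) >= pixel_clock_hz;
}
```

With the standard VESA pixel clocks of 65MHz for 1024x768@60 and 108MHz for 1280x1024@60, a 32 bit bus at 10ns passes the first and fails the second, while a 64 bit bus handles both.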

Quote:
4. What is the most graphically intensive app that classic Amigas can run (preferably one that requires a GFX card)?

For the purpose of the above, Workbench will do.
The actual gfx cards do *not* typically spend the large majority of the available memory bandwidth simply on refreshing the screen. *And* the gfx card chips have considerable amounts of write buffers built in, which means that as far as the rest of the system is concerned, writes complete fast even if in reality, they are buffered for a few clock cycles.
Simply drawing things on a screen can become tedious if the blitter and/or CPU have to fight for access slots to the memory. Not unlike the original Amiga, in fact, where some of the modes with 5 or 6 bitplanes made it really hard for the CPU to get to chipmem.

Quote:
5. How likely is it that Natami could run an app of similar or better graphical quality?

Which Natami is that? The one that actually has been prototyped, since mid 2006, or the one that Gunnar talks about? Because the two have very little in common... If you look at Thomas' site, you find that the current prototype actually uses EDO DRAM, rather than SRAM. And not particularly fast DRAM, either (cycle time around 100ns, rather than 10). And if you read the early postings in this thread, you find that it doesn't do more than 8 bitplanes, either.

So yes, if Thomas actually completely changes his design, and comes up with a design 10 times as fast as the current one (which is unlikely to ever work with the rat's nest of wires that the current system is made up from), and doubles the width of the memory bus, and increases the amount of chipmem by a factor of four, *AND* gets the CPU card directly connected to the "chipset", rather than through PCI, and implements true-colour modes, then the graphical output of the Natami may be comparable to that of a 10 year old gfx card on a 20" wide screen, as long as you don't try to actually update the screen too much.

Which puts the statement
Quote:
The NATAMI GFX board is the fastest existing AMIGA GFX hardware.

squarely in the "fantasy" box. At the moment, Natami is comparable to AGA at best. It may one day be comparable to a 1998 graphics card, assuming Thomas actually plans to go with 10ns cycle SRAM. It probably won't compare favourably, but at least in the future it may be able to be compared at all. Right now, it can't.

Last edited by umisef on 25-Jan-2008 at 01:50 AM.


Amigaworld.net was originally founded by David Doyle