saimo 
PED81C - pseudo-native, no C2P chunky screens for AGA
Posted on 5-Mar-2022 10:28:05

PED81C is a video system for AGA Amigas that provides pseudo-native chunky screens, i.e. screens where each byte in CHIP RAM corresponds to a dot on the display. In short, it offers chunky screens without chunky-to-planar conversion or any CPU/Blitter/Copper sweat.

Download: https://www.retream.com/PED81C

Some examples:
* https://www.youtube.com/watch?v=0xunQ6ldVKU
* https://www.youtube.com/watch?v=4eikEo45v1I
* https://www.youtube.com/watch?v=ebxwKm9K4Os
* https://www.youtube.com/watch?v=tLtLhJXInOY

Notes:
* due to the nature of the system, the videos must be watched in their original size (1920x1080);
* YouTube's video processing has slightly reduced the visual quality (i.e. the result is better on real machines).

Full details, straight from the documentation:

Quote:
--------------------------------------------------------------------------------
CORE IDEA

The core idea is using SHRES pixels ("pixels" from now on) to simulate dots in a
CRT/LCD-like fashion.

Each dot is made of 4 pixels as follows:

ABCD
ABCD
ABCD
ABCD

where

X
X
X
X

represents a pixel.

The eye cannot really distinguish the pixels and, instead, perceives them almost
as a single dot whose color is given by the mix of the colors of the pixels. The
pixels thus constitute the color elements ("elements" from now on) of the dot.
The effect is not perfect though, as the pixels can still more or less be seen.
The sharper the display / the bigger the pixels, the worse the visual mix. In
practice, though, the effect works acceptably well on CRT, LCD and LED displays
alike.

The pixels can be assigned any RGB values ("base colors" from now on).
For example, the most obvious choice is:

RGBW
RGBW
RGBW
RGBW

Starting from the left, the pixels are used for the red, green, blue and white
elements of dots. The pixels can be assigned any values in these ranges:

R: $rr0000, where $rr in [$00, $ff]
G: $00gg00, where $gg in [$00, $ff]
B: $0000bb, where $bb in [$00, $ff]
W: $wwwwww, where $ww in [$00, $ff]

As a consequence, there is an overall brightness loss of at least 50%. For
example, the white dot (the brightest one) is obtained by assigning the pixels
the maximum values in the ranges (i.e. R = $ff0000, G = $00ff00, B = $0000ff,
W = $ffffff), which add up to $ffffff*2, i.e. half of the absolute maximum value
of the 4 pixels, $ffffff*4.

Each set of base colors ("color model" from now on) produces the specific
palette that the dots are perceived in ("dots palette" from now on). To
understand how to calculate the dots palette, it is first necessary to look at
how the screens work.

The raster, i.e. the matrix of the bytes (stored as a linear buffer) that
represent the dots, must reside in CHIP RAM. It is used as bitplane 1 and also
as bitplane 2, shifted 4 pixels to the right.
This is how a byte %76543210 (where each digit represents a bit) in the raster
is displayed:

bitplane 2:     76543210
bitplane 1: 76543210
                ****

The marked bits are those that produce the dot that corresponds to the byte:

                ABCD
                ABCD
                ABCD
                ABCD

                ^^^^

bitplane 2:     76543210
bitplane 1: 76543210

The elements are thus indicated by the bit pairs in the byte:

%73 -> element A
%62 -> element B
%51 -> element C
%40 -> element D

Replacing the digits with letters gives a better representation:

%ABCDabcd

where:

X = most significant bit for element X
x = least significant bit for element X

Each element can have only 4 values corresponding to the bit pairs %00, %01,
%10 and %11. Such values are those stored in COLORxx. Therefore, the bit pairs
represent the COLORxx indexes:

%00 -> COLOR00
%01 -> COLOR01
%10 -> COLOR02
%11 -> COLOR03

However, there are 4 elements, so it is necessary to distinguish them; this is
achieved by adding two selector bitplanes filled with fixed patterns:

                ABCD
                ABCD
                ABCD
                ABCD

                ^^^^

bitplane 4: 001100110011
bitplane 3: 010101010101
bitplane 2:     ABCDabcd
bitplane 1: ABCDabcd

Therefore:

bitplane 4 and 3 = %00 -> element A -> COLOR00 thru COLOR03
bitplane 4 and 3 = %01 -> element B -> COLOR04 thru COLOR07
bitplane 4 and 3 = %10 -> element C -> COLOR08 thru COLOR11
bitplane 4 and 3 = %11 -> element D -> COLOR12 thru COLOR15

Given that there are 4 elements and that each element can have 4 different
values, the total number of combinations (i.e. of dots colors) is 4^4 = 256.
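
As a rough illustration of the mapping described so far (a sketch only, not
code from the PED81C archive; the function names are made up for this example),
a raster byte can be split into its four bit pairs and the resulting COLORxx
indexes like this:

```python
def element_pairs(byte):
    """Split a raster byte %ABCDabcd into the 2-bit pairs %Aa, %Bb, %Cc, %Dd."""
    pairs = []
    for element in range(4):                # 0 = A, 1 = B, 2 = C, 3 = D
        msb = (byte >> (7 - element)) & 1   # high nibble, shown through bitplane 2
        lsb = (byte >> (3 - element)) & 1   # low nibble, shown through bitplane 1
        pairs.append((msb << 1) | lsb)
    return pairs


def color_registers(byte):
    """COLORxx index used by each of the 4 pixels of the dot."""
    # Selector bitplanes 4 and 3 add element * 4 to the index.
    return [element * 4 + pair
            for element, pair in enumerate(element_pairs(byte))]
```

For instance, color_registers(0b10011010) yields the indexes 3, 4, 9 and 14,
i.e. COLOR03, COLOR04, COLOR09 and COLOR14.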

In the RGBW color model, COLORxx could be set up as follows (for simplicity, the
low-order 12 bits are left to the automatic copy performed by AGA):

      R       |       G       |       B       |       W
--------------+---------------+---------------+--------------
COLOR00: $000 | COLOR04: $000 | COLOR08: $000 | COLOR12: $000
COLOR01: $500 | COLOR05: $050 | COLOR09: $005 | COLOR13: $555
COLOR02: $a00 | COLOR06: $0a0 | COLOR10: $00a | COLOR14: $aaa
COLOR03: $f00 | COLOR07: $0f0 | COLOR11: $00f | COLOR15: $fff

Consequently, the bit pairs in the bytes yield these colors:

    |   %00   |   %01   |   %10   |   %11
----+---------+---------+---------+--------
%Aa | $000000 | $550000 | $aa0000 | $ff0000
%Bb | $000000 | $005500 | $00aa00 | $00ff00
%Cc | $000000 | $000055 | $0000aa | $0000ff
%Dd | $000000 | $555555 | $aaaaaa | $ffffff

For example, the byte %RGBWrgbw = %10011010 (%Rr = %11, %Gg = %00, %Bb = %01,
%Ww = %10) represents this dot:

                f00a
                f00a
                000a
                000a
                005a
                005a

                ^^^^

bitplane 2:     10011010
bitplane 1: 10011010

The dot RGB color is thus:

R: ($ff + $00 + $00 + $aa) / 4 = (255 + 170) / 4 = 106.25 = $6a \
G: ($00 + $00 + $00 + $aa) / 4 =        170  / 4 =  42.5  = $2b  > $6a2b40
B: ($00 + $00 + $55 + $aa) / 4 = ( 85 + 170) / 4 =  63.75 = $40 /
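
The same arithmetic can be sketched in a few lines of Python. This is only an
illustration: the names LEVELS, MASKS and dot_color are invented here, the 4-bit
COLORxx values are assumed to be expanded to 8 bits ($5 -> $55, $a -> $aa), and
the average is assumed to be rounded half up, which matches the figures above.

```python
# 8-bit levels assigned to the bit pairs %00, %01, %10, %11 (256-color RGBW setup).
LEVELS = (0x00, 0x55, 0xAA, 0xFF)

# Which of the R, G, B components each element drives: R, G, B, W.
MASKS = ((1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1))


def dot_color(byte):
    """Perceived RGB of the dot for a raster byte %RGBWrgbw (mean of its 4 pixels)."""
    total = [0, 0, 0]
    for element, mask in enumerate(MASKS):
        msb = (byte >> (7 - element)) & 1
        lsb = (byte >> (3 - element)) & 1
        level = LEVELS[(msb << 1) | lsb]
        for component in range(3):
            total[component] += level * mask[component]
    # Average over the 4 pixels, rounding half up.
    return tuple((value + 2) // 4 for value in total)


print("$%02x%02x%02x" % dot_color(0b10011010))
```

With the byte from the example, this prints $6a2b40. The all-ones byte gives
$808080, which is the 50% brightness ceiling mentioned earlier.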

A critical aspect of PED81C is that each dot is surrounded by spurious bits:

bitplane 2:     ABCDabcd
bitplane 1: ABCDabcd
            ****    ****

Without CPU and/or Blitter intervention, those bits cannot be eliminated - but
processing data is precisely what PED81C tries to avoid, so it is necessary to
find a way to deal with the spurious bits.
This is what happens with two consecutive bytes %ABCDabcd and %EFGHefgh:

                ABCD????EFGH
                ABCD????EFGH
                ABCD????EFGH
                ABCD????EFGH

                ^^^^^^^^^^^^

bitplane 2:     ABCDabcdEFGHefgh
bitplane 1: ABCDabcdEFGHefgh

Between the dots produced by the bytes as explained above ("desired dots" from
now on) is a dot that is made of bits coming from both the bytes ("middle dot"
from now on), i.e. %EFGH and %abcd. The simplest solution would be masking the
middle dot out with a no-DMA vertically repeating jailbar mask sprite, but that
would introduce a horrible vertical spacing between the columns of dots and
reduce further the brightness of the screen.
A smarter solution would be adding one more selector bitplane to distinguish
between desired dots and middle dots (for readability, from now on, 0 bits are
replaced with '·' where needed):

                ABCD????ABCD
                ABCD????ABCD
                ABCD????ABCD
                ABCD????ABCD

                ^^^^^^^^^^^^

bitplane 5: 1111····1111····1111
bitplane 4: ··11··11··11··11··11
bitplane 3: ·1·1·1·1·1·1·1·1·1·1
bitplane 2:     ABCDabcdEFGHefgh
bitplane 1: ABCDabcdEFGHefgh

COLOR16 thru COLOR31 could then be set up so that the middle dots are mixes of
the desired dots, keeping in mind that the middle dots have the most and least
significant bits swapped around (the least significant bits of the left dot end
up in the most significant bits of the middle dot and the most significant bits
of the right dot end up in the least significant bits of the middle dot). The
simplest settings reflect the settings of the desired dots, but with the RGB
values assigned to the %01 and %10 bit pairs swapped around.
For example, in the RGBW color model:

      R       |       G       |       B       |       W
--------------+---------------+---------------+--------------
COLOR16: $000 | COLOR20: $000 | COLOR24: $000 | COLOR28: $000
COLOR17: $a00 | COLOR21: $0a0 | COLOR25: $00a | COLOR29: $aaa
COLOR18: $500 | COLOR22: $050 | COLOR26: $005 | COLOR30: $555
COLOR19: $f00 | COLOR23: $0f0 | COLOR27: $00f | COLOR31: $fff

For example, two identical bytes %10001000 ($ff0000) would give this result
(which is correct):

                RGBWRGBWRGBW
                RGBWRGBWRGBW
                RGBWRGBWRGBW
                RGBWRGBWRGBW

                f···f···f···
                f···f···f···
                ············
                ············
                ············
                ············

                ^^^^^^^^^^^^

bitplane 5: 1111····1111····1111
bitplane 4: ··11··11··11··11··11
bitplane 3: ·1·1·1·1·1·1·1·1·1·1
bitplane 2:     1···1···1···1···
bitplane 1: 1···1···1···1···

  left dot: $ff0000
middle dot: $ff0000
 right dot: $ff0000

However, if the bytes were %00001000 ($550000) and %10000000 ($aa0000), the
result would be:

                RGBWRGBWRGBW
                RGBWRGBWRGBW
                RGBWRGBWRGBW
                RGBWRGBWRGBW

                5···f···a···
                5···f···a···
                ············
                ············
                ············
                ············

                ^^^^^^^^^^^^

bitplane 5: 1111····1111····1111
bitplane 4: ··11··11··11··11··11
bitplane 3: ·1·1·1·1·1·1·1·1·1·1
bitplane 2:     ····1···1·······
bitplane 1: ····1···1·······

  left dot: $550000
middle dot: $ff0000
 right dot: $aa0000

The middle dot would end up being a full red, stronger than the desired dots,
which is not visually correct nor logical, as the middle dots would be more
prominent than the desired dots. A solution could be dimming the RGB values of
middle dots.
For example, if they were halved, the result would be:

left dot: $550000
middle dot: $800000
right dot: $aa0000

The middle dot would be a good average of the desired dots. That works
conceptually, but in practice it causes the middle dots columns to look like
vertical scanlines - which is not desirable either.
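
The middle dots in the examples above follow directly from the way the two
raster bitplanes overlap. A minimal sketch (the function name is made up for
this illustration) that returns the bit pairs of the middle dot between two
consecutive bytes:

```python
def middle_pairs(left, right):
    """Bit pairs (elements A..D) of the middle dot between two raster bytes."""
    pairs = []
    for element in range(4):
        # Most significant bit: low nibble of the left byte (seen via bitplane 2).
        msb = (left >> (3 - element)) & 1
        # Least significant bit: high nibble of the right byte (seen via bitplane 1).
        lsb = (right >> (7 - element)) & 1
        pairs.append((msb << 1) | lsb)
    return pairs
```

For the bytes %00001000 and %10000000 the red element of the middle dot gets the
pair %11, i.e. the full red that is brighter than both desired dots.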

The case of different hues is even more complicated. For example, if the bytes
were %10001000 ($ff0000) and %01000100 ($00ff00), the result would be:

                RGBWRGBWRGBW
                RGBWRGBWRGBW
                RGBWRGBWRGBW
                RGBWRGBWRGBW

                f···5a···f··
                f···5a···f··
                ············
                ············
                ············
                ············

                ^^^^^^^^^^^^

bitplane 5: 1111····1111····1111
bitplane 4: ··11··11··11··11··11
bitplane 3: ·1·1·1·1·1·1·1·1·1·1
bitplane 2:     1···1····1···1··
bitplane 1: 1···1····1···1··

  left dot: $ff0000
middle dot: $55aa00
 right dot: $00ff00

The middle dot would be a kind of average of the actual dots, although not
really good (a good average would be $808000).
It is possible to experiment with the COLORxx values to achieve different
results, but the overall scanlines-like effect would still remain. Moreover, the
3rd selector bitplane would steal a lot of CHIP bus slots. An alternative is
required.

The proposed solution consists in eliminating the 3rd selector bitplane and
assigning the bit pairs %01 and %10 the same RGB values (which basically gives
the most and least significant bits the same weight). As a downside, this
reduces the amount of dots colors: given that each element can have only 3
different values, the total number of colors falls down to 3^4 = 81.

For example, in the RGBW color model:

      R       |       G       |       B       |       W
--------------+---------------+---------------+--------------
COLOR00: $000 | COLOR04: $000 | COLOR08: $000 | COLOR12: $000
COLOR01: $800 | COLOR05: $080 | COLOR09: $008 | COLOR13: $888
COLOR02: $800 | COLOR06: $080 | COLOR10: $008 | COLOR14: $888
COLOR03: $f00 | COLOR07: $0f0 | COLOR11: $00f | COLOR15: $fff

The case of two identical bytes %10001000 ($ff0000) would still give the same
(correct) result as before:

                RGBWRGBWRGBW
                RGBWRGBWRGBW
                RGBWRGBWRGBW
                RGBWRGBWRGBW

                f···f···f···
                f···f···f···
                ············
                ············
                ············
                ············

                ^^^^^^^^^^^^

bitplane 4: ··11··11··11··11··11
bitplane 3: ·1·1·1·1·1·1·1·1·1·1
bitplane 2:     1···1···1···1···
bitplane 1: 1···1···1···1···

  left dot: $ff0000
middle dot: $ff0000
 right dot: $ff0000

The case of the bytes %00001000 ($880000) and %10000000 ($880000) would give
this result:

                RGBWRGBWRGBW
                RGBWRGBWRGBW
                RGBWRGBWRGBW
                RGBWRGBWRGBW

                8···f···8···
                8···f···8···
                ············
                ············
                ············
                ············

                ^^^^^^^^^^^^

bitplane 4: ··11··11··11··11··11
bitplane 3: ·1·1·1·1·1·1·1·1·1·1
bitplane 2:     ····1···1·······
bitplane 1: ····1···1·······

  left dot: $880000
middle dot: $ff0000
 right dot: $880000

Again the middle dot would be brighter than the actual dots, but now this can
be easily solved by simply forbidding the %01 bit pair in bytes, given that it
can always be replaced by the %10 bit pair. So, the bytes would instead be both
%10000000 ($880000) and the result would be:

                RGBWRGBWRGBW
                RGBWRGBWRGBW
                RGBWRGBWRGBW
                RGBWRGBWRGBW

                8···8···8···
                8···8···8···
                ············
                ············
                ············
                ············

                ^^^^^^^^^^^^

bitplane 4: ··11··11··11··11··11
bitplane 3: ·1·1·1·1·1·1·1·1·1·1
bitplane 2:     1·······1·······
bitplane 1: 1·······1·······

  left dot: $880000
middle dot: $880000
 right dot: $880000

Also the case of different hues, %10001000 ($ff0000) and %01000100 ($00ff00),
gives a correct result (for complete correctness, in this example the low-order
bits of COLOR02 and COLOR05 are set to 0):

                RGBWRGBWRGBW
                RGBWRGBWRGBW
                RGBWRGBWRGBW
                RGBWRGBWRGBW

                f···88···f··
                f···00···f··
                ············
                ············
                ············
                ············

                ^^^^^^^^^^^^

bitplane 4: ··11··11··11··11··11
bitplane 3: ·1·1·1·1·1·1·1·1·1·1
bitplane 2:     1···1····1···1··
bitplane 1: 1···1····1···1··

  left dot: $ff0000
middle dot: $808000
 right dot: $00ff00
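
The rule of never using the %01 bit pair, and the resulting count of 81 usable
bytes, can be checked with a short sketch. The helper names is_legal and
legalize are hypothetical and exist only for this illustration:

```python
def is_legal(byte):
    """True if no element of %ABCDabcd uses the forbidden %01 bit pair."""
    high, low = byte >> 4, byte & 0x0F
    # A %01 pair has its low-nibble bit set and its high-nibble bit clear.
    return (low & ~high & 0x0F) == 0


def legalize(byte):
    """Replace every %01 bit pair with the equivalent %10 bit pair."""
    high, low = byte >> 4, byte & 0x0F
    only_low = low & ~high & 0x0F           # elements whose pair is %01
    return ((high | only_low) << 4) | (low & ~only_low & 0x0F)


legal_bytes = [byte for byte in range(256) if is_legal(byte)]
print(len(legal_bytes))                     # number of usable raster bytes
```

The count printed is 81 (3 possible pairs for each of the 4 elements), and
legalize(0b00001000) returns 0b10000000, as in the example above.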


--------------------------------------------------------------------------------
COLOR MODELS

The CORE IDEA section introduces the RGBW color model, but the number of
possible color models is huge (2^288). For best results, it is advisable to
define the color models that are most suitable to the graphics to be displayed.

The most obvious general-purpose color models are:
* CMYW: Cyan Magenta Yellow White
* G: Greyscale
* KC: Key Colors (red yellow green cyan blue magenta white)
* RGBW: Red Green Blue White

This table shows the COLORxx settings for the general-purpose color models.

        |         |   CMYW    |     G     |    KC     |   RGBW
ELEMENT | COLORxx | RGB hi/lo | RGB hi/lo | RGB hi/lo | RGB hi/lo
--------+---------+-----------+-----------+-----------+----------
A       | COLOR00 | $000/$000 | $000/$000 | $000/$000 | $000/$000
        | COLOR01 | $088/$000 | $222/$222 | $f00/$f00 | $800/$000
        | COLOR02 | $088/$000 | $222/$222 | $f00/$f00 | $800/$000
        | COLOR03 | $0ff/$0ff | $fff/$fff | $ff0/$ff0 | $f00/$f00
--------+---------+-----------+-----------+-----------+----------
B       | COLOR04 | $000/$000 | $000/$000 | $000/$000 | $000/$000
        | COLOR05 | $808/$000 | $555/$555 | $0f0/$0f0 | $080/$000
        | COLOR06 | $808/$000 | $555/$555 | $0f0/$0f0 | $080/$000
        | COLOR07 | $f0f/$f0f | $fff/$fff | $0ff/$0ff | $0f0/$0f0
--------+---------+-----------+-----------+-----------+----------
C       | COLOR08 | $000/$000 | $000/$000 | $000/$000 | $000/$000
        | COLOR09 | $880/$000 | $aaa/$aaa | $00f/$00f | $008/$000
        | COLOR10 | $880/$000 | $aaa/$aaa | $00f/$00f | $008/$000
        | COLOR11 | $ff0/$ff0 | $fff/$fff | $f0f/$f0f | $00f/$00f
--------+---------+-----------+-----------+-----------+----------
D       | COLOR12 | $000/$000 | $000/$000 | $000/$000 | $000/$000
        | COLOR13 | $888/$000 | $888/$000 | $888/$000 | $888/$000
        | COLOR14 | $888/$000 | $888/$000 | $888/$000 | $888/$000
        | COLOR15 | $fff/$fff | $fff/$fff | $fff/$fff | $fff/$fff

For the G color model, the arithmetically perfect assignment would be:
* COLOR01, COLOR02: $333333
* COLOR05, COLOR06: $666666
* COLOR09, COLOR10: $999999
* COLOR13, COLOR14: $cccccc
However, the resulting dots palette would contain only 26 unique colors.

Each color model has strengths and weaknesses. This table provides an evaluation
of the general-purpose color models (COLORS = number of unique colors in the
resulting dots palette).

COLOR MODEL | BRIGHTNESS | SATURATION | CONTRAST | COLORS | NOTES
------------+------------+------------+----------+--------+--------------------
CMYW        | **         | *          | *        |     73 | no red, green, blue
G           | ****       |            | ****     |     45 |
KC          | ***        | **         | **       |     46 | noisy middle dots
RGBW        | *          | ***        | ***      |     65 |


--------------------------------------------------------------------------------
CALCULATING/GENERATING DOTS PALETTES

Once the color model is defined, the corresponding dots palette can be
calculated by mixing the RGB values assigned to the bit pairs in the bytes from
0 to 255. The bytes which include a %01 bit pair should be treated as illegal
and thus be assigned one of the RGB values also assigned to a legal byte (the
easiest solution is to use the value of byte 0). The calculation of the RGB
value ($6a2b40) corresponding to the byte %10011010 in the RGBW color model,
done in the CORE IDEA section, makes for a practical example.

The PED81C archive includes GeneratePalette, a handy tool that generates a dots
palette according to the desired color model and then saves it to an ILBM file.
It normalizes to $ff the components of the calculated colors, so that the latter
are brighter and have a higher dynamic range than the actual dots palette
colors, allowing for better graphics conversion. Also, it assigns the value of
byte 0 to the illegal bytes.
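
The calculation described at the start of this section can be sketched as
follows. This is not how GeneratePalette is implemented (its source is not shown
here): the function name, the data layout and the half-up rounding are
assumptions, and the normalization to $ff is deliberately left out, so the
result corresponds to the actual (darker) dots palette.

```python
def dots_palette(model):
    """Return 256 RGB tuples for a color model.

    model[element] holds the RGB tuples for the %00, %10 and %11 bit pairs
    of elements A, B, C and D (in that order).
    """
    palette = []
    for byte in range(256):
        total = [0, 0, 0]
        legal = True
        for element in range(4):
            msb = (byte >> (7 - element)) & 1
            lsb = (byte >> (3 - element)) & 1
            if (msb, lsb) == (0, 1):        # %01 pairs are forbidden
                legal = False
                break
            rgb = model[element][msb + lsb]  # %00 -> 0, %10 -> 1, %11 -> 2
            for component in range(3):
                total[component] += rgb[component]
        # Mean of the 4 pixels, rounded half up; illegal bytes are filled below.
        palette.append(tuple((v + 2) // 4 for v in total) if legal else None)
    # Illegal bytes take the value of byte 0.
    return [color if color is not None else palette[0] for color in palette]


# RGBW model with the values from the COLOR MODELS table ($800/$000 -> $80).
RGBW = (
    ((0, 0, 0), (0x80, 0, 0), (0xFF, 0, 0)),
    ((0, 0, 0), (0, 0x80, 0), (0, 0xFF, 0)),
    ((0, 0, 0), (0, 0, 0x80), (0, 0, 0xFF)),
    ((0, 0, 0), (0x80, 0x80, 0x80), (0xFF, 0xFF, 0xFF)),
)

palette = dots_palette(RGBW)
```

Entry N of the returned list is the color perceived for raster byte N, so the
list can be used directly as a 256-entry palette when remapping graphics.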

The command line arguments are:

A0/A,A2/A,A3/A,B0/A,B2/A,B3/A,C0/A,C2/A,C3/A,D0/A,D2/A,D3/A,FFIS100/S,FILE/A

X0: 24-bit RGB value for the %00 pair of element X
X2: 24-bit RGB value for the %10 pair of element X
X3: 24-bit RGB value for the %11 pair of element X
FFIS100: $ff treated internally as $100 (for better rounding)
FILE: output file

The 24-bit RGB values must be in hexadecimal format without prefix.

The palettes are suitable for screens which use bitplanes 3 and 4 as selector
bitplanes.

The PED81C archive also includes:
* the palettes for the general-purpose color models, stored as ILBM pictures;
* GeneratePalettes, a script that generates a few palettes (it can be used also
as a reference for GeneratePalette usage).


--------------------------------------------------------------------------------
PRODUCING GRAPHICS

The palettes can be used to draw/convert graphics.
For example, to display a picture in an RGBW screen:
1. draw/remap the picture with the RGBW palette;
2. save the picture as raw chunky data;
3. copy the raw chunky data to the raster or use it directly as the raster.


--------------------------------------------------------------------------------
SETTING UP AND USING SCREENS

PED81C screens are obtained by opening SHRES screens with these peculiarities:
* the raster must be used as bitplane 1 and 2;
* bitplane 3 must be filled with %01010101 ($55);
* bitplane 4 must be filled with %00110011 ($33);
* bitplanes 2 and 4 must be shifted horizontally by 4 pixels;
* COLORxx must be set according to the chosen color model;
* the 4 pixels in the leftmost column are made of just the least significant
bits of the leftmost dots, so it is generally recommendable to hide them by
moving the left side of the window area by 1 LORES pixel to the right.

Notes:
* to obtain a screen which is W LORES pixels wide, the width of the raster must
be W*4 SHRES pixels = W/2 bytes (e.g. 320 LORES pixels -> 1280 SHRES pixels =
160 bytes = 160 dots);
* to obtain a scrollable screen, allocate a raster bigger than the visible area
and, in case of horizontal scrolling, set BPLxMOD to the amount of non-
fetched dots (e.g. for a raster which is 256 dots wide and is displayed in
a 320 LORES pixels area, BPLxMOD must be 256-320/2 = 96);
* HIRES/SHRES resolution scrolling is possible, but it alters the colors of the
leftmost dots;
* given the high CHIP bus load caused by the bitplanes fetch, it is best to
enable the 64-bit fetch mode (FMODE.BPLx = 3).
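
The size relations in the notes above boil down to two small formulas. A minimal
sketch, with function names invented for this illustration:

```python
def raster_width_bytes(lores_width):
    """Bytes (= dots) per raster line for a screen lores_width LORES pixels wide."""
    shres_pixels = lores_width * 4          # 1 LORES pixel = 4 SHRES pixels
    return shres_pixels // 8                # each raster byte covers 8 SHRES pixels


def bitplane_modulo(raster_width_dots, lores_width):
    """BPLxMOD for a raster wider than the visible area (non-fetched dots)."""
    return raster_width_dots - raster_width_bytes(lores_width)
```

For the cases mentioned in the notes, raster_width_bytes(320) gives 160 and
bitplane_modulo(256, 320) gives 96.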

In general, given a raster which is RASTERWIDTH dots wide and RASTERHEIGHT dots
tall, the values to write to the chipset registers in order to create a centered
screen can be calculated as follows:
* SCREENWIDTH = RASTERWIDTH * 8
* SCREENHEIGHT = RASTERHEIGHT
* DIWSTRTX = $81 + (160 - SCREENWIDTH / 8)
* DIWSTRTY = $2c + (128 - SCREENHEIGHT / 2)
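* DIWSTRT = ((DIWSTRTY & $ff) << 8) | ... (the rest of this formula and the
  DIWSTOP and DIWHIGH formulas are missing from this copy; the example table
  below gives sample values)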
* DDFSTRT = (DIWSTRTX - 17) / 2
* DDFSTOP = DDFSTRT+SCREENWIDTH / 8 - 8
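
As a quick check of the formulas, here is a sketch that evaluates the ones for
DIWSTRTX, DIWSTRTY, DDFSTRT and DDFSTOP. The function name is made up, integer
division is assumed, and the packing of the DIWSTRT, DIWSTOP and DIWHIGH
registers is intentionally not attempted.

```python
def window_values(raster_width, raster_height):
    """Evaluate the centering formulas for a raster given in dots."""
    screen_width = raster_width * 8         # SHRES pixels
    screen_height = raster_height
    diwstrt_x = 0x81 + (160 - screen_width // 8)
    diwstrt_y = 0x2C + (128 - screen_height // 2)
    ddfstrt = (diwstrt_x - 17) // 2
    ddfstop = ddfstrt + screen_width // 8 - 8
    return {
        "DIWSTRTX": diwstrt_x,
        "DIWSTRTY": diwstrt_y,
        "DDFSTRT": ddfstrt,
        "DDFSTOP": ddfstop,
    }


values = window_values(160, 256)
```

For a 160x256 dots raster this gives DDFSTRT = $38 and DDFSTOP = $D0, matching
the example register settings.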

Example registers settings for:
* screen equivalent to a 319x256 LORES screen
* 160 dots wide raster
* blanked border
* 64-bit sprites and bitplanes fetch mode
* sprites on top of bitplanes
* sprites colors assigned to COLOR16 thru COLOR31

REGISTER | VALUE | ENABLED BITS
---------+-------+----------------------------
BPLCON0  | $4241 | BPU2 COLOR SHRES ECSENA
BPLCON1  | $0010 | PF2H2
BPLCON2  | $0224 | KILLEHB PF2P2 PF1P2
BPLCON3  | $0020 | BRDRBLNK
BPLCON4  | $0011 | OSPRM4 ESPRM4
BPL1MOD  | $0000 |
BPL2MOD  | $0000 |
DDFSTRT  | $0038 |
DDFSTOP  | $00D0 |
DIWSTRT  | $2C82 |
DIWSTOP  | $2CC1 |
DIWHIGH  | $A100 |
FMODE    | $000F | SPRAGEM SPR32 BPLAGEM BPL32

Given a raster which is W dots wide and H dots tall, the byte of the dot at
<X, Y> is located at <raster address> + W*Y + X.
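
A minimal sketch of that addressing, assuming the raster is modelled as a Python
bytearray and that a dot is addressed by its column X and row Y (the helper name
and the sample coordinates are made up for this illustration):

```python
WIDTH, HEIGHT = 160, 256                    # raster size in dots


def dot_offset(width, x, y):
    """Offset of the byte for the dot at column x, row y."""
    return width * y + x


raster = bytearray(WIDTH * HEIGHT)          # 1 byte per dot, all dots black

# Plot a full-red dot (%10001000 in the RGBW color model) at column 3, row 2.
raster[dot_offset(WIDTH, 3, 2)] = 0b10001000
```

On real hardware the offset would be added to the address of the raster in CHIP
RAM.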


--------------------------------------------------------------------------------
TWEAKS/EXTENSIONS

#1

The selector bitplanes need a lot of RAM. To save RAM drastically it is enough
to store just 1 line for each of them and to reset BPLxPTx with the Copper
during the horizontal blanking period of every rasterline. As a downside, this
steals some CHIP bus slots and complicates Copperlists.

#2

If a selector bitplane is omitted, the elements become 2 couples of identical
elements; if both the selector bitplanes are omitted, all the elements become
equal. Omitting the selector bitplanes saves (a lot of) CHIP bus slots and can
be useful in particular cases. For example, the demo THE CURE does not use any
selector bitplanes and uses bytes of the kind %HHHHLLLL, where H = High bit,
L = Low bit; this, thanks to jailbar mask sprites, produces perfect LORES-looking
4-color pixels (which, together with bitplanes DMA toggling every other
rasterline, produces a dot-matrix display).

#3

If the visual output suffers from a heavy "jailbars" effect, it could be
improved by shifting every other rasterline by 1 to 3 pixels - for example:

****************************************
  ########################################
++++++++++++++++++++++++++++++++++++++++
  @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
...

#4

To lessen the dithering of tweak #3 and improve the color mix, the shifting can
also be inverted on an alternate frame basis - for example, the rasterlines
could be shown on the next frame as follows:

  ****************************************
########################################
  ++++++++++++++++++++++++++++++++++++++++
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
...

However, this causes flickering visuals (especially on displays with quick
response), so it is not really recommendable.

#5

Adding a horizontal scanlines effect by swapping the elements palette on an
alternate line basis (through BPLCON4) makes the visual output resemble that of
a CRT display.

#6

To reduce the amount of graphics to draw and the memory usage, the raster size
can be halved by repeating each rasterline once (which is easily obtained by
means of FMODE.BSCAN2 and BPLxMOD). This combines well with tweak #5.

#7

If needed, the bitplanes order can be reversed, i.e. the selector bitplanes
could be assigned bitplanes 1 and 2, and the raster bitplanes could be assigned
bitplanes 3 and 4:

bitplane 4:     76543210
bitplane 3: 76543210
bitplane 2: 001100110011
bitplane 1: 010101010101

In this case, COLORxx need to be set up differently:

bitplane 2 and 1 = %00 -> element A -> COLOR00 COLOR04 COLOR08 COLOR12
bitplane 2 and 1 = %01 -> element B -> COLOR01 COLOR05 COLOR09 COLOR13
bitplane 2 and 1 = %10 -> element C -> COLOR02 COLOR06 COLOR10 COLOR14
bitplane 2 and 1 = %11 -> element D -> COLOR03 COLOR07 COLOR11 COLOR15

Note: GeneratePalette does not support such arrangement.
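
The difference between the two arrangements comes down to which bitplanes supply
the high bits of the color index. A small sketch, with function names made up
for this illustration:

```python
def color_index_standard(element, pair):
    """Selector bitplanes 3 and 4, raster bitplanes 1 and 2."""
    return element * 4 + pair


def color_index_reversed(element, pair):
    """Selector bitplanes 1 and 2, raster bitplanes 3 and 4."""
    return pair * 4 + element


# COLORxx indexes used by element A (0) for the bit pairs %00, %01, %10, %11.
reversed_a = [color_index_reversed(0, pair) for pair in range(4)]
```

Here reversed_a is [0, 4, 8, 12], i.e. COLOR00, COLOR04, COLOR08 and COLOR12,
as in the list above.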

#8

With a careful setup of COLORxx, the unused 4 bitplanes can be used to overlay
other graphics or even up to two more chunky screens, optionally with colorkey
and translucency. That, however, would increase noticeably the CHIP bus load.


--------------------------------------------------------------------------------
NOTES

#1

The meaning of PED81C is "Pixel Elements Dots, 81 Colors".

#2

Although due to the middle dots the logical horizontal resolution is half of the
physical one, the averaging provided by the middle dots and SHRES quite fool the
eye.

#3

Visually, the best results are obtained with complex/dithered images, as plain
color areas and geometrical shapes reveal the pixels and the middle dots. In
particular, isolated dots look 3x-ish wide.

#4

81 is only the theoretical maximum number of dots colors. The actual number
depends on the chosen base colors.

#5

The core idea could be used also to display 24-bit pictures, but the coarseness
of the method wastes completely the subtlety of such high color resolution (also
verified experimentally).

#6

Usage of PED81C is of course welcome and encouraged. It would be nice if credit
were given. If used in a commercial production, I would appreciate it if
permission were asked first and if I could receive a little share of the profits.


--------------------------------------------------------------------------------
PERFORMANCE CONSIDERATIONS

PED81C is very CHIP bus intensive: the bitplanes data fetched are twice that of
an equivalent 256 colors LORES screen. If Lisa had been able to use the BPLxDAT
values of inactive bitplanes (like, for example, Denise does with bitplanes 5
and 6 when 4 bitplanes only and HAM are enabled) BPL3DAT and BPL4DAT could have
been loaded with the selector values thus halving the DMA fetches - but
unfortunately that is not the case.
Therefore, one might wonder whether PED81C is actually advantageous. A lot
depends on how graphics are rendered: for example, a favourable case is when the
CPU can keep on executing cached code after writing to CHIP RAM so that no/few
cycles are wasted between writes. A general and indirect evaluation can be done
by comparing PED81C to the traditional C2P methods as follows.

The measurements, for simplicity, are based on the amount of data to render,
convert (if needed) and fetch for output relatively to 1 line.

Reference regular screen:
* 320 pixels wide LORES
* 6 bits deep screen (for fairness, because PED81C can at most output 81 unique
colors and the actual number of colors, as shown above, might be even less
depending on the color model)

Assumptions:
* 1 chunky pixel = 1 byte
* CPU and Blitter operations in CHIP RAM involve 6 bitplanes

If only CHIP RAM is available, the figures are as follows.

CPU-only C2P:
* rendering: 320 bytes
* C2P reads: 320 bytes
* C2P writes: 240 bytes
* bitplanes fetch: 240 bytes
* total: 1120 bytes

CPU+Blitter C2P, 1 CPU pass and 1 Blitter pass:
* rendering: 320 bytes
* C2P reads by CPU: 320 bytes
* C2P writes by CPU: 240 bytes
* C2P reads by Blitter: 240 bytes
* C2P writes by Blitter: 240 bytes
* bitplanes fetch: 240 bytes
* total: 1600 bytes

PED81C:
* rendering: 160 bytes
* bitplanes fetch: 640 bytes
* total: 800 bytes

If FAST RAM is available, the figures of PED81C do not change (as the raster
always resides in CHIP RAM), while the figures of the other cases are as
follows.

CPU-only C2P:
* rendering in FAST RAM: 320 bytes
* C2P reads from FAST RAM: 320 bytes
* C2P writes to CHIP RAM: 240 bytes
* bitplanes fetch: 240 bytes
* total: 640 bytes FAST RAM, 480 bytes CHIP RAM

CPU+Blitter C2P, 1 CPU pass and 1 Blitter pass:
* rendering in FAST RAM: 320 bytes
* C2P reads by CPU from FAST RAM: 320 bytes
* C2P writes by CPU to CHIP RAM: 240 bytes
* C2P reads by Blitter from CHIP RAM: 240 bytes
* C2P writes by Blitter to CHIP RAM: 240 bytes
* bitplanes fetch: 240 bytes
* total: 640 bytes FAST RAM, 960 bytes CHIP RAM

Overall, PED81C has the edge performance-wise, especially considering that CPU
and Blitter are not busy with converting data. It must be pointed out, though,
that PED81C's logical horizontal resolution is halved (hence the 160 bytes per
line) and that the overall visual quality is inferior to that of a regular
screen mode.
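
For reference, the per-line figures above can be reproduced with a few lines of
arithmetic. This is only a sketch of the bookkeeping, using the same assumptions
as the text (320 pixels wide LORES reference screen, 6 bitplanes, 1 byte per
chunky pixel); the variable names are made up.

```python
LORES_WIDTH = 320                           # pixels per line of the reference screen
DEPTH = 6                                   # bitplanes of the reference screen

chunky = LORES_WIDTH                        # 1 chunky pixel = 1 byte -> 320 bytes
planar = LORES_WIDTH * DEPTH // 8           # 6 bitplanes of 40 bytes -> 240 bytes
ped_raster = LORES_WIDTH // 2               # 160 dots = 160 bytes
ped_fetch = 4 * ped_raster                  # 4 bitplanes of 160 bytes -> 640 bytes

# Only CHIP RAM available: total bytes moved per line.
chip_only = {
    "CPU-only C2P": chunky + chunky + planar + planar,
    "CPU+Blitter C2P": chunky + chunky + 4 * planar,
    "PED81C": ped_raster + ped_fetch,
}

# FAST RAM available: (bytes in FAST RAM, bytes in CHIP RAM) per line.
with_fast = {
    "CPU-only C2P": (chunky + chunky, planar + planar),
    "CPU+Blitter C2P": (chunky + chunky, 4 * planar),
    "PED81C": (0, ped_raster + ped_fetch),
}

for name, total in chip_only.items():
    print(f"{name}: {total} bytes")
```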


--------------------------------------------------------------------------------
BACKGROUND

#1

The idea of using SHRES pixels as elements is by Fabio Bizzetti, who used it for
his Virtual Karting and Virtual Karting II games.
In the late 90s, I was in touch with him and he told me that his idea was to
"fool the RF signal" (or something along these lines). This got me thinking and
I came up with the core idea. Before writing here (in 2022!) I had never
bothered checking what he actually had done, but now I deemed it appropriate to
do it in order to provide a brief description of his method, both as an
acknowledgement of his brilliant idea and to provide more food for thought.
After starting Virtual Karting II in UAE, having a look at the moving graphics,
grabbing a screenshot, checking the values of BPLCON0 and BPLCON1, and checking
the bitplanes memory, I found out that he used bitplanes 1-3 as selector
bitplanes and assigned the pixels these elements (from left to right): red-
orange-yellow-green-cyan-azure-blue-purple (so, there are no middle dots and
dots are really 2x-wide). To mitigate the columns-looking result, he applied the
crosshatch tweak, swapping the scroll offsets on an alternate frame basis.

#2

Between the end of the 90s and 2003 I had created a system (implemented as a
shared library) based on the same core idea, but using 3 selector bitplanes.
PED81C is actually a simplification of that system, born precisely from the
removal of the middle dots selector bitplane to improve the speed.

The old system was really rich feature-wise, as it provided:
* 256 colors screens
* HalfRes screens: screens like PED81C's
* FullRes screens: screens without middle dots - this was achieved by means of
a conversion performed by the CPU, optionally assisted by the Blitter (for
the record, the CPU-only conversion allowed 320x256 screens at about 50 fps
on an Amiga 1200 equipped with a Blizzard 1230-IV and 60 ns FAST RAM)
* chequer effect: crosshatch tweak for HalfRes screens
* double and triple buffering
* 5 embedded color models (RGBW, RGBM, RGBP, RGBPS, RGB332)
* color/palette handling functions (color setting, color remapping, 24-bit
fading and 24-bit cross-fading)
* Cross Playfield mode: 256 color screen overlay on top of another screen with
any degree of opacity between 0 and 256 (in practice, this produced 16-bit
graphics)
* Dual Cross Playfield mode: like Cross Playfield mode, but with a selectable
colorkey
* graphical contexts (clipping, drawing modes)
* pixmap functions (blitting, zooming, rotzooming)
* graphical primitives
* font functions
* ILBM functions

One might wonder why such a system is not public - the reasons are:
* the core would need to be re-designed;
* the implementation could be better;
* the accessory functions (like the graphical ones) should be in a separate
library;
* the documentation would need a major overhaul.

Basically, I do not consider the system suitable for public distribution. I
would rather redo it from scratch... but that is precisely why PED81C was born:
while thinking how to improve the system, I realized how to eliminate the 3rd
selector bitplane and decided to get rid of the FullRes screens, because the
point of these systems is obtaining chunky screens without data conversion
(otherwise, it is better to use one of the traditional C2P methods, which give
better visual results).

#3

Originally I had planned to use PED81C to make a new game. However, I could not
come up with a satisfactory idea; moreover, due to personal reasons, I had to
stop software development. Given that I could not predict when/if I would be able
to produce something with PED81C and given that the war in Ukraine put the world
in deep uncertainty, I decided that it was better to release PED81C so that it
would not go to waste, and also as a gift to the Amiga community.
I must admit I have been tempted to provide an implementation of PED81C in the
form of a library or of a collection of functions, but since setting up PED81C
screens is easy and since general-purpose routines would perform worse than
tailor-made ones, I decided to let programmers implement it in the way that best
fits their projects.

Last edited by saimo on 02-Apr-2024 at 09:09 PM.
Last edited by saimo on 02-Apr-2024 at 09:06 PM.
Last edited by saimo on 29-Nov-2023 at 12:00 PM.
Last edited by saimo on 29-Nov-2023 at 11:57 AM.
Last edited by saimo on 28-Nov-2023 at 10:52 PM.
Last edited by saimo on 26-Jun-2023 at 07:10 PM.
Last edited by saimo on 21-Jun-2023 at 04:17 PM.
Last edited by saimo on 21-Jun-2023 at 04:00 PM.
Last edited by saimo on 05-Mar-2022 at 11:29 AM.
Last edited by saimo on 05-Mar-2022 at 11:28 AM.

_________________
RETREAM - retro dreams for Amiga, Commodore 64 and PC

Hypex 
Re: PED81C - pseudo-native, no C2P chunky screens for AGA
Posted on 18-Mar-2022 4:31:07
#2 ]
Elite Member
Joined: 6-May-2007
Posts: 11232
From: Greensborough, Australia

@saimo

Thanks for sharing PED81C. Can you tell us any more about it? I'm interested in how you technically made it possible?

Yes I downloaded it and watched the videos but a brief description would be good before I unpack it.

An idea? How about Doom? Yes it's been done to death but Amiga people seem to love it.

3d tends to come to mind first when chunky is mentioned.

Of course there is possibility of interesting 2d using chunky methods.

saimo 
Re: PED81C - pseudo-native, no C2P chunky screens for AGA
Posted on 18-Mar-2022 12:11:16
#3 ]
Elite Member
Joined: 11-Mar-2003
Posts: 2453
From: Unknown

@Hypex

Quote:
Thanks for sharing PED81C. Can you tell us any more about it? I'm interested in how you technically made it possible?

Yes I downloaded it and watched the videos but a brief description would be good before I unpack it.

The documentation in the archive contains all the information you could possibly ever want

Quote:
An idea? How about Doom? Yes it's been done to death but Amiga people seem to love it.

I don't, though!

Quote:
3d tends to come to mind first when chunky is mentioned.

Of course there is possibility of interesting 2d using chunky methods.

Sure, plenty of possibilities...

_________________
RETREAM - retro dreams for Amiga, Commodore 64 and PC

saimo 
Re: PED81C - pseudo-native, no C2P chunky screens for AGA
Posted on 21-Jun-2023 16:02:12
#4 ]
Elite Member
Joined: 11-Mar-2003
Posts: 2453
From: Unknown

Uploaded an archive with updated documentation.
While at it, given that I was asked for a source code example, I whipped up an AMOS Professional program that shows how to set up a PED81C screen and to perform some basic operations on it - hopefully, this will be easy to understand and also open the door to AMOS programmers. The program source is included in the archive.

Quote:
'-----------------------------------------------------------------------------
'$VER: PED81C example 1.3 (28.11.2023) (c) 2023 RETREAM
'Legal terms: please refer to the accompanying documentation.
'www.retream.com/PED81C
'contact@retream.com
'-----------------------------------------------------------------------------

'-----------------------------------------------------------------------------
'DESCRIPTION
'This shows how to set up a PED81C screen and to perform some basic operations
'on it.
'Screen features:
' * equivalent to a 319x256 LORES screen
' * 160 dots wide raster
' * single buffer
' * blanked border
' * 64-bit bitplanes fetch mode
' * CMYW color model
'
'NOTES
'The code is written to be readable, not to be general-purpose/optimal.
'-----------------------------------------------------------------------------

'-----------------------------------------------------------------------------
'GLOBAL VARIABLES

Global RASTERADDRESS,RASTERWIDTH,RASTERHEIGHT,RASTERSIZE

RASTERWIDTH=160
RASTERHEIGHT=256
RASTERSIZE=RASTERWIDTH*RASTERHEIGHT

'-----------------------------------------------------------------------------
'MAIN

'Initialize everything.

_INITIALIZE_AMOS_ENVIRONMENT
_INITIALIZE_SCREEN

'If the initialization succeeded, load a picture into the raster and, in case
'of success, execute a simple effect on it.

If Param
_LOAD_PICTURE_INTO_RASTER["picture-160x256.raw"]
If Param
_TURN_DISPLAY_DMA_ON[0]
_RANDOMIZE_RASTER
_TURN_DISPLAY_DMA_OFF
End If
End If

'Deinitialize everything.

_DEINITIALIZE_SCREEN
_RESTORE_AMOS_ENVIRONMENT

'-----------------------------------------------------------------------------
'ROUTINES

Procedure _ALLOCATE_BITPLANE[BANKINDEX,SIZE]
'--------------------------------------------------------------------------
'DESCRIPTION
'Allocates a CHIP RAM buffer to be used as a bitplane.
'
'INPUT
'BANKINDEX = index of bank to use
'SIZE = size [bytes] of bitplane
'
'OUTPUT
'64-bit-aligned bitplane address (0 = error)
'
'WARNINGS
'The buffer must be freed with Erase BANKINDEX or Erase All.
'--------------------------------------------------------------------------

Trap Reserve As Chip Data BANKINDEX,SIZE+8
If Errtrap=0 Then A=(Start(BANKINDEX)+7) and $FFFFFFF8

End Proc[A]
Procedure _DEINITIALIZE_SCREEN
'--------------------------------------------------------------------------
'DESCRIPTION
'Deinitializes the screen.
'
'WARNINGS
'Can be called only if the display is off.
'--------------------------------------------------------------------------

Erase All
Doke $DFF1FC,0 : Rem FMODE

End Proc
Procedure _INITIALIZE_AMOS_ENVIRONMENT
'--------------------------------------------------------------------------
'DESCRIPTION
'Ensures the program cannot be interrupted or brought to back, and turns
'off the AMOS video system.
'--------------------------------------------------------------------------

Break Off
Amos Lock
Comp Test Off
Auto View Off
Update Off
Copper Off
_TURN_DISPLAY_DMA_OFF

End Proc
Procedure _INITIALIZE_SCREEN
'--------------------------------------------------------------------------
'DESCRIPTION
'Initializes the screen.
'
'OUTPUT
'-1/0 = OK/error
'
'WARNINGS
'_DEINITIALIZE_SCREEN[] must be called also in case of failure.
'
'NOTES
'Sets RASTERADDRESS.
'--------------------------------------------------------------------------

'Allocate the raster.

_ALLOCATE_BITPLANE[10,RASTERSIZE] : If Param=0 Then Pop Proc[0]
RASTERADDRESS=Param

'Allocate and fill the selector bitplanes.

_ALLOCATE_BITPLANE[11,RASTERSIZE] : If Param=0 Then Pop Proc[0]
B3A=Param
Fill B3A To B3A+RASTERSIZE,$55555555

_ALLOCATE_BITPLANE[12,RASTERSIZE] : If Param=0 Then Pop Proc[0]
B4A=Param
Fill B4A To B4A+RASTERSIZE,$33333333

'Set the chipset.

DIWSTRTX=$81+(160-RASTERWIDTH)
DIWSTRTY=$2C+(128-RASTERHEIGHT/2)
DIWSTRT=((DIWSTRTY and $FF)*256) or((DIWSTRTX+1) and $FF)
DIWSTOPX=DIWSTRTX+RASTERWIDTH*2
DIWSTOPY=DIWSTRTY+RASTERHEIGHT
DIWSTOP=((DIWSTOPY and $FF)*256) or(DIWSTOPX and $FF)
DIWHIGH=((DIWSTOPX and $100)*32) or(DIWSTOPY and $700) or((DIWSTRTX and $100)/8) or(DIWSTRTY/256)
DDFSTRT=(DIWSTRTX-17)/2
DDFSTOP=DDFSTRT+RASTERWIDTH-8

Doke $DFF092,DDFSTRT
Doke $DFF094,DDFSTOP
Doke $DFF08E,DIWSTRT
Doke $DFF090,DIWSTOP
Doke $DFF1E4,DIWHIGH

Doke $DFF100,$4241 : Rem BPLCON0
Doke $DFF102,$10 : Rem BPLCON1
Doke $DFF104,$224 : Rem BPLCON2
Doke $DFF108,0 : Rem BPLMOD1
Doke $DFF10A,0 : Rem BPLMOD2
Doke $DFF1FC,$3 : Rem FMODE

'Set COLORxx.

Doke $DFF106,$20 : Rem BPLCON3
Doke $DFF180,0
Doke $DFF182,$88
Doke $DFF184,$88
Doke $DFF186,$FF
Doke $DFF188,0
Doke $DFF18A,$808
Doke $DFF18C,$808
Doke $DFF18E,$F0F
Doke $DFF190,0
Doke $DFF192,$880
Doke $DFF194,$880
Doke $DFF196,$FF0
Doke $DFF198,0
Doke $DFF19A,$888
Doke $DFF19C,$888
Doke $DFF19E,$FFF
Doke $DFF106,$220 : Rem BPLCON3
Doke $DFF180,0
Doke $DFF182,0
Doke $DFF184,0
Doke $DFF188,0
Doke $DFF18A,0
Doke $DFF18C,0
Doke $DFF190,0
Doke $DFF192,0
Doke $DFF194,0
Doke $DFF198,0
Doke $DFF19A,0
Doke $DFF19C,0
Doke $DFF106,$20 : Rem BPLCON3

'Build a Copperlist that sets the bitplanes pointers.

Cop Movel $E0,RASTERADDRESS
Cop Movel $E4,RASTERADDRESS
Cop Movel $E8,B3A
Cop Movel $EC,B4A
Cop Swap

End Proc[-1]
Procedure _LOAD_PICTURE_INTO_RASTER[FILEPATH$]
'--------------------------------------------------------------------------
'DESCRIPTION
'Loads a raw 8-bit chunky picture into the raster, ensuring that its size
'is correct.
'
'INPUT
'FILEPATH$ = path of picture file
'
'OUTPUT
'-1/0 = OK/error
'--------------------------------------------------------------------------

Trap Open In 1,FILEPATH$ : If Errtrap Then Pop Proc[0]
L=Lof(1)
Close(1)
If L<>RASTERSIZE Then Pop Proc[0]
Trap Bload FILEPATH$,RASTERADDRESS

End Proc[Errtrap=0]
Procedure _RANDOMIZE_RASTER
'--------------------------------------------------------------------------
'DESCRIPTION
'Randomizes the raster by swapping 16 dots per frame, until a mouse button
'is pressed.
'--------------------------------------------------------------------------

XM=RASTERWIDTH-1
YM=RASTERHEIGHT-1
Repeat
C=16
While C
X0=Rnd(XM)
Y0=Rnd(YM)
X1=Rnd(XM)
Y1=Rnd(YM)
A0=Y0*RASTERWIDTH+X0+RASTERADDRESS
A1=Y1*RASTERWIDTH+X1+RASTERADDRESS
C0=Peek(A0)
Poke A0,Peek(A1)
Poke A1,C0
Dec C
Wend
_WAIT_SCREEN_BOTTOM
Until Mouse Click

End Proc
Procedure _RESTORE_AMOS_ENVIRONMENT
'--------------------------------------------------------------------------
'DESCRIPTION
'Restores the AMOS environment.
'--------------------------------------------------------------------------

Copper On
Update On
Auto View On
Amos Unlock
Break On
_TURN_DISPLAY_DMA_ON[$20]

End Proc
Procedure _TURN_DISPLAY_DMA_OFF
'--------------------------------------------------------------------------
'DESCRIPTION
'Disables the bitplanes, Copper and sprites DMA.
'--------------------------------------------------------------------------

_WAIT_SCREEN_BOTTOM
Doke $DFF096,$3A0 : Rem DMACON

End Proc
Procedure _TURN_DISPLAY_DMA_ON[SSPRITESFLAG]
'--------------------------------------------------------------------------
'DESCRIPTION
'Enables the bitplanes and Copper DMA.
'
'INPUT
'SSPRITESFLAG = $20/0 = turn / do not turn sprites on
'
'WARNINGS
'The chipset must have been set up properly.
'--------------------------------------------------------------------------

_WAIT_SCREEN_BOTTOM
Doke $DFF096,$8380 or SSPRITESFLAG : Rem DMACON

End Proc
Procedure _WAIT_SCREEN_BOTTOM
'--------------------------------------------------------------------------
'DESCRIPTION
'Waits for the bottom of the screen.
'--------------------------------------------------------------------------

While Deek($DFF004) and $3 : Wend
Repeat : Until(Leek($DFF004) and $3FF00)>$12C00

End Proc
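
For readers who do not use AMOS, here is a minimal Python sketch of the alignment arithmetic that _ALLOCATE_BITPLANE performs. The function names are made up for illustration and are not part of the PED81C archive. The point is only to show why reserving SIZE+8 bytes is enough to obtain a 64-bit-aligned bitplane of SIZE bytes.

```python
ALIGNMENT = 8  # bytes: the 64-bit fetch mode wants 64-bit-aligned bitplanes


def align_up(address: int, alignment: int = ALIGNMENT) -> int:
    """Round address up to the next multiple of alignment (a power of two)."""
    return (address + alignment - 1) & ~(alignment - 1)


def aligned_bitplane_address(bank_start: int) -> int:
    """Same arithmetic as (Start(BANKINDEX)+7) and $FFFFFFF8 in the listing."""
    return (bank_start + 7) & 0xFFFFFFF8


def fits_in_bank(bank_start: int, size: int) -> bool:
    """Check that the aligned bitplane stays inside a bank of size+8 bytes."""
    aligned = aligned_bitplane_address(bank_start)
    return aligned + size <= bank_start + size + 8
```

Since the aligned address is never more than 7 bytes past the start of the bank, the 8 extra bytes always cover the adjustment.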

Last edited by saimo on 29-Nov-2023 at 12:05 PM.
Last edited by saimo on 29-Nov-2023 at 12:01 PM.
Last edited by saimo on 28-Nov-2023 at 10:51 PM.

_________________
RETREAM - retro dreams for Amiga, Commodore 64 and PC

saimo 
Re: PED81C - pseudo-native, no C2P chunky screens for AGA
Posted on 28-Nov-2023 22:53:45
#5 ]
Elite Member
Joined: 11-Mar-2003
Posts: 2453
From: Unknown

I have just released a little update, accompanied by the PED81C Voxel Engine (PVE), i.e. a new demo. If you can't be bothered trying it yourself, you can see it in this video - but beware: YouTube's video compression degraded the visual quality (especially the colors saturation and brightness).

https://www.youtube.com/watch?v=0xunQ6ldVKU

Details about PVE straight from the manual:
Quote:
--------------------------------------------------------------------------------
OVERVIEW

PVE is an experiment to test the graphical quality and computational performance
of the PED81C system. It lets the user move freely around a typical voxel landscape.


--------------------------------------------------------------------------------
GETTING STARTED

PVE requires:
* Amiga computer
* AGA chipset
* 200 kB of CHIP RAM
* 4 MB of FAST RAM
* PAL SHRES support
* digital joystick/joypad and mouse
* 2.1 MB of storage space

If the monitor / graphics card / scan doubler do(es) not support SHRES, the
colors will look off or even not show at all.
For example:
* MNT's VA2000 graphics card displays only the even columns of pixels, so only
reds and blues show;
* Irix Labs' ScanPlus AGA displays only the odd columns of pixels (contrary to
how it was originally marketed), so only greens and grays show.

To install PVE, unpack the LhA archive to any directory of your choice.

To start PVE, open the program directory and double-click the program icon from
Workbench or execute the program from shell.


--------------------------------------------------------------------------------
CONTROLS

PVE is controlled by joystick/joypad (in the game port) and mouse (in the mouse
port).

JOYSTICK/JOYPAD | MOUSE   | SPLASH SCREEN       | ACTION SCREEN
----------------+---------+---------------------+----------------
[UP]            |         |                     | move forwards
[DOWN]          |         |                     | move backwards
[LEFT]          |         |                     | turn left
[RIGHT]         |         |                     | turn right
[FIRE1]         |         |                     | accelerate
                | [LEFT]  | go to action screen |
                | [RIGHT] | quit to AmigaOS     | quit to AmigaOS


--------------------------------------------------------------------------------
MISCELLANEOUS

* The map wraps around at its edges.
* The number shown in the top-left corner of the action screen indicates the
number of frames rendered in the last second.
* Upon returning to AmigaOS, PVE prints out:
* the total number of frames rendered;
* the total number of frames shown;
* the average number of frames rendered per second;
* the average time (expressed in frames) taken by the rendering of a frame.


--------------------------------------------------------------------------------
TECHNICAL NOTES

* The graphics are first rendered in a raster in FAST RAM and then copied to a
triple-buffered PED81C raster in CHIP RAM.
* The screen resolution is 1020x200 SHRES pixels, which correspond to 255x200
LORES-sized dots and to 128x200 logical dots.
* The screen resolution can be changed by simply redefining the width and height
constants in the code and reassembling it.
* Rendering is done by columns, from bottom to top and then left to right.
* The code applies a depth of 256 steps per column, so it evaluates 256*128 =
32768 dots per frame (and then renders only those which are actually visible).
* The code is 100% assembly.
* The code is optimized for 68030.
* The program supports only maps of 1024x1024 pixels, but it can be made to
support maps of other sizes by simply redefining the width and height
constants in the code and reassembling it.
* The handling of the user input and of the camera is decoupled from the
graphics rendering and runs every frame.
* The height of the camera adapts automatically to that of the point it is at,
but it can be made user-controllable and its maximum value can be increased
almost to the point that the landscape disappears at the bottom of the screen.
* On an Amiga 1200 equipped with a Blizzard 1230 IV mounting a 50 MHz 68030 and
60 ns RAM:
* the program runs at about 20.2 fps;
* the rendering of graphics alone runs at about 22.2 fps;
* the impact of PED81C is about 22.2-20.2 = 2 fps - in other words,
writing the graphics to the PED81C raster requires about 50/20.2-50/22.2 =
0.223 frames (when only the bitplanes DMA is active);
* rendering the graphics directly to the PED81C raster degrades the
performance by about 2 to 3 fps (tested only with an older and less
optimized version).
* On an Amiga 1200 equipped with a PiStorm32, the program runs at 50 fps
(unsurprisingly).
* The map size is 1024x1024 pixels.
* The map requires 2 MB of FAST RAM.
* The program takes over the system entirely and returns to AmigaOS cleanly.
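
The Blizzard 1230 IV figures above can be checked with a few lines of Python. This is just a worked version of the arithmetic in the notes, assuming the 50 Hz PAL refresh rate used throughout. The variable and function names are illustrative only.

```python
REFRESH_HZ = 50  # PAL display refresh rate, in frames per second


def frame_time(fps: float) -> float:
    """Display frames needed to produce one rendered frame at the given fps."""
    return REFRESH_HZ / fps


full_program_fps = 20.2  # rendering plus the copy to the PED81C raster
render_only_fps = 22.2   # rendering alone

# Cost of writing one frame to the PED81C raster, in display frames.
copy_cost = frame_time(full_program_fps) - frame_time(render_only_fps)

# Dots evaluated per frame: 256 depth steps for each of the 128 columns.
dots_per_frame = 256 * 128

print(f"copy cost: {copy_cost:.3f} frames")
print(f"dots evaluated per frame: {dots_per_frame}")
```

The cost is the frame time with the copy minus the frame time without it, which gives a positive value of roughly 0.223 frames.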


--------------------------------------------------------------------------------
BACKSTORY

After a hiatus from programming of several months (due to a computer-unrelated
project), I decided to finally create something for PED81C because I had made
nothing with it other than a few little examples, I wanted to test its
graphical quality and computational performance, and... I felt like having some
good fun.
After some inconclusive mental wandering, the idea of making a voxel engine came
to mind for unknown reasons (I had never dabbled with voxel before).
When the engine was mature enough I decided to distribute PVE publicly (which
initially was not planned).


About the update, I fixed some palette values in a table in the documentation, added the formulas for calculating DIWSTRT, DIWSTOP, DIWHIGH, DDFSTRT and DDFSTOP to the documentation and implemented them in the AMOS Professional source code example. This is the snippet relative to the register settings:
Quote:
In general, given a raster which is RASTERWIDTH dots wide and RASTERHEIGHT dots
tall, the values to write to the chipset registers in order to create a centered
screen can be calculated as follows:
* SCREENWIDTH = RASTERWIDTH * 8
* SCREENHEIGHT = RASTERHEIGHT
* DIWSTRTX = $81 + (160 - SCREENWIDTH / 8)
* DIWSTRTY = $2c + (128 - SCREENHEIGHT / 2)
* DIWSTRT = ((DIWSTRTY & $ff) << 8) | ((DIWSTRTX + 1) & $ff)
* DIWSTOPX = DIWSTRTX + SCREENWIDTH / 4
* DIWSTOPY = DIWSTRTY + SCREENHEIGHT
* DIWSTOP = ((DIWSTOPY & $ff) << 8) | (DIWSTOPX & $ff)
* DIWHIGH = ((DIWSTOPX & $100) << 5) | (DIWSTOPY & $700) |
((DIWSTRTX & $100)>> 3) | (DIWSTRTY >> 8)
* DDFSTRT = (DIWSTRTX - 17) / 2
* DDFSTOP = DDFSTRT+SCREENWIDTH / 8 - 8
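
As a cross-check, here is a minimal Python sketch of the formulas above. The function name and the dictionary keys are illustrative. It assumes integer (floor) division wherever the formulas divide, which is exact for the even raster sizes used in this thread.

```python
def ped81c_registers(raster_width: int, raster_height: int) -> dict:
    """Compute the display window / data fetch values for a centered screen."""
    screen_width = raster_width * 8
    screen_height = raster_height

    diwstrt_x = 0x81 + (160 - screen_width // 8)
    diwstrt_y = 0x2C + (128 - screen_height // 2)
    diwstop_x = diwstrt_x + screen_width // 4
    diwstop_y = diwstrt_y + screen_height

    ddfstrt = (diwstrt_x - 17) // 2
    return {
        "DIWSTRT": ((diwstrt_y & 0xFF) << 8) | ((diwstrt_x + 1) & 0xFF),
        "DIWSTOP": ((diwstop_y & 0xFF) << 8) | (diwstop_x & 0xFF),
        "DIWHIGH": ((diwstop_x & 0x100) << 5) | (diwstop_y & 0x700)
                   | ((diwstrt_x & 0x100) >> 3) | (diwstrt_y >> 8),
        "DDFSTRT": ddfstrt,
        "DDFSTOP": ddfstrt + screen_width // 8 - 8,
    }


if __name__ == "__main__":
    # The 160x256 raster used by the AMOS example.
    for name, value in ped81c_registers(160, 256).items():
        print(f"{name} = ${value:04X}")
```

Feeding it other raster sizes (for example 128x200) is a quick way to sanity-check hand-computed values before writing them to the chipset.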

Last edited by saimo on 29-Nov-2023 at 12:04 PM.

_________________
RETREAM - retro dreams for Amiga, Commodore 64 and PC

kolla 
Re: PED81C - pseudo-native, no C2P chunky screens for AGA
Posted on 28-Nov-2023 23:53:21
#6 ]
Elite Member
Joined: 21-Aug-2003
Posts: 2945
From: Trondheim, Norway

@saimo

Very nice, just lacks the blue sky with white clouds, and «WarpUP» ;)

_________________
B5D6A1D019D5D45BCC56F4782AC220D8B3E2A6CC

Lou 
Re: PED81C - pseudo-native, no C2P chunky screens for AGA
Posted on 29-Nov-2023 2:30:13
#7 ]
Elite Member
Joined: 2-Nov-2004
Posts: 4181
From: Rhode Island

Is this an evolution of your dot-matrix engine? Apologies in advance for mis-naming it...

ppcamiga1 
Re: PED81C - pseudo-native, no C2P chunky screens for AGA
Posted on 29-Nov-2023 7:13:34
#8 ]
Cult Member
Joined: 23-Aug-2015
Posts: 787
From: Unknown

@saimo

it is not worth time and work
buy graphics card for your amiga

pixie 
Re: PED81C - pseudo-native, no C2P chunky screens for AGA
Posted on 29-Nov-2023 7:17:48
#9 ]
Elite Member
Joined: 10-Mar-2003
Posts: 3161
From: Figueira da Foz - Portugal

@ppcamiga1

WTF? Are you for real?

_________________
Indigo 3D Lounge, my second home.
The Illusion of Choice | Am*ga

hotrod 
Re: PED81C - pseudo-native, no C2P chunky screens for AGA
Posted on 29-Nov-2023 7:44:07
#10 ]
Elite Member
Joined: 11-Mar-2003
Posts: 2994
From: Stockholm, Sweden

@pixie

That's a bitter narcissist for ya. Kill any joy there is, no passion and arrogant AF.

saimo 
Re: PED81C - pseudo-native, no C2P chunky screens for AGA
Posted on 29-Nov-2023 9:40:37
#11 ]
Elite Member
Joined: 11-Mar-2003
Posts: 2453
From: Unknown

@kolla

Quote:
Very nice, just lacks the blue sky with white clouds, and «WarpUP» ;)

I don't know what you're referring to, but I guess it must be some demo published back when WarpUp was being promoted, right? (Never been into PPC cards, so I barely know the names )

_________________
RETREAM - retro dreams for Amiga, Commodore 64 and PC

saimo 
Re: PED81C - pseudo-native, no C2P chunky screens for AGA
Posted on 29-Nov-2023 9:42:45
#12 ]
Elite Member
Joined: 11-Mar-2003
Posts: 2453
From: Unknown

@Lou

Quote:
Is this an evolution of your dot-matrix engine? Apologies in advance for mis-naming it...

No need to apologize!
It's the other way around: the dot-matrix engine was derived from PED81C

_________________
RETREAM - retro dreams for Amiga, Commodore 64 and PC

kolla 
Re: PED81C - pseudo-native, no C2P chunky screens for AGA
Posted on 29-Nov-2023 10:16:21
#13 ]
Elite Member
Joined: 21-Aug-2003
Posts: 2945
From: Trondheim, Norway

@saimo

Ah, sorry, but yes you are quite correct :)

https://youtu.be/AZL86gZr0LA?si=4graLolmT0TvNuPf

_________________
B5D6A1D019D5D45BCC56F4782AC220D8B3E2A6CC

V8 
Re: PED81C - pseudo-native, no C2P chunky screens for AGA
Posted on 29-Nov-2023 10:45:35
#14 ]
Regular Member
Joined: 30-Mar-2022
Posts: 134
From: Unknown

@ppcamiga1

Quote:
it is not worth time and work buy graphics card for your amiga


I think you are severely mentally ill and need to be locked up in an institution.
This is how you react to some truly awesome development for amiga by true amiga fans?
You are mentally ill, you are toxic and you are a detriment to any and all amiga fans and projects.

Karlos 
Re: PED81C - pseudo-native, no C2P chunky screens for AGA
Posted on 29-Nov-2023 23:30:57
#15 ]
Elite Member
Joined: 24-Aug-2003
Posts: 4415
From: As-sassin-aaate! As-sassin-aaate! Ooh! We forgot the ammunition!

@saimo

Very impressive! What sort of frame rate are you getting on the 68030 out of curiosity?

Also, how does the performance compare to C2P on the same hardware for a similar logical pixel count?

_________________
Doing stupid things for fun...

saimo 
Re: PED81C - pseudo-native, no C2P chunky screens for AGA
Posted on 30-Nov-2023 0:10:29
#16 ]
Elite Member
Joined: 11-Mar-2003
Posts: 2453
From: Unknown

@Karlos

Quote:
Very impressive! What sort of frame rate are you getting on the 68030 out of curiosity?

These are the fps figures I have gathered until now:

Amiga 1200 / PiStorm32 + Raspberry Pi 3 A+ / 50.0
Amiga 1200 / Blizzard 1230 IV / 20.2
Amiga 4000 / Cyberstorm MK III / 18.8
Amiga 1200 / Blizzard 1260 / 16-20
Amiga 1200 / TerribleFire TF1260 / 14.2

The code is fine-tuned for 68030 (unsurprisingly: it's the only CPU I have ), but the bottleneck must be the access to CHIP RAM. I'm waiting for some tests to know exactly.
In this thread on EAB you can find the gory details (I know you want them) being discussed by me and paraj

Quote:
Also, how does the performance compare to C2P on the same hardware for a similar logical pixel count?

No idea: I never used a C2P routine in my life! Maybe a copyspeed C2P wins* on 68060 because the time spent writing to CHIP RAM should be lower thanks to the less busy CHIP bus.
*Also in terms of visual quality.

_________________
RETREAM - retro dreams for Amiga, Commodore 64 and PC

saimo 
Re: PED81C - pseudo-native, no C2P chunky screens for AGA
Posted on 22-Dec-2023 10:10:04
#17 ]
Elite Member
Joined: 11-Mar-2003
Posts: 2453
From: Unknown

Just released a new version of PVE. Full changelog below. In short: it's faster and it's got a few little additions.

https://retream.itch.io/ped81c

v1.1 (22.12.2023)
* Reworked screen buffering, so that the raster data is more efficiently written to CHIP RAM when bitplanes DMA is inactive.
* Improved 68030 caches handling.
* Added 68040 and 68060 caches handling.
* Added MMU handling to avoid that the MMU affects the speed negatively.
* Optimized rendering core by making it write the dots sequentially.
* Made a little 68060-specific code optimization.
* Ensured 68060 superscalar dispatch is enabled.
* Added live-togglable staggered lines video filter, which helps see better colors on devices that do not support SHRES and reduces the jailbars effect on devices that support SHRES (to enable/disable: [F1]).
* Made fps indicator live-togglable (to enable/disable: [F2]).
* Made quitting from the voxel screen return to the splash screen.
* Replaced mouse controls with keyboard controls.
* Added benchmark function.
* Added command line switches to control the CPU caches.
* Fixed bug that caused a longword to be written to a random location when the fps indicator was on.
* Fixed an innocuous initialization bug.
* Made cleanup code more robust.
* Updated, extended and fixed documentation.

_________________
RETREAM - retro dreams for Amiga, Commodore 64 and PC

saimo 
Re: PED81C - pseudo-native, no C2P chunky screens for AGA
Posted on 27-Mar-2024 22:40:50
#18 ]
Elite Member
Joined: 11-Mar-2003
Posts: 2453
From: Unknown

For ages I had been meaning to dig up some 20+ year old code and use it to play with PED81C a little more. I finally got around to doing it and came up with a new test program called Zoomaniac.
Details in the video and in the manual excerpt below. Download available at https://retream.itch.io/ped81c.

https://www.youtube.com/watch?v=eehqapb20fE

Quote:
--------------------------------------------------------------------------------
OVERVIEW

Zoomaniac has been written to evaluate the performance on a stock Amiga 1200 of
a general-purpose texture scaling routine that writes directly to a PED81C
raster.


--------------------------------------------------------------------------------
PERFORMANCE

The following results are relative to the full screen effect that zooms the
cosmonaut in and out.

On a stock Amiga 1200, the execution speed is between 25 and 26 fps. If the
staggered lines are turned on, the performance drops by about 1 fps (which was
unexpected, since all that such option adds is a Copper WAIT and a Copper MOVE
for each rasterline).
Given that the DMA load caused by PED81C is "double" (see its documentation for
the details), a version that uses only half the number (2) of bitplanes has been
made to check the performance as if the Amiga had a native chunky video mode.
Surprisingly, the performance did not improve at all: relatively to the CHIP bus
access, the scaling code must interleave so nicely with the bitplane data
fetches that having more bus cycles available does not make any/much difference.

An Amiga 1200 equipped with a 68030 clocked at 50 MHz and 60 ns FAST RAM easily
performs at a steady 50 fps. To find out the maximum performance, new tests were
made with special versions of the program that had the video synchronization
code disabled.
The speed when running the program normally was between 77 and 78 fps. The
staggered lines option lowered the fps by about 2. The 2 bitplanes versions
performed better, reaching 80-81 fps or, with the staggered lines on, 79-80 fps.
Like on the stock Amiga 1200, the extended Copperlist that implements the
staggered lines causes a small and similar performance drop. Instead, the
halving of the bitplanes DMA load did produce a speed increase.

The following table sums up the results.

S = stock Amiga 1200
E = Amiga 1200 68030 @50 MHz / 60 ns FAST RAM (Blizzard 1230 IV)
2 = 2 bitplanes on
4 = 4 bitplanes on
L = staggered lines on

  | 4     | L4    | 2     | L2
--+-------+-------+-------+-------
S | 25-26 | 24-25 | 25-26 | 24-25
E | 77-78 | 75-76 | 80-81 | 79-80

Notes:
* when FAST RAM is detected, an alternative and more suitable scaling routine
is used (although writes still happen to CHIP RAM);
* on (some?) machines equipped with FAST RAM an even faster strategy would be
rendering to FAST RAM and then simply copying at the maximum speed the
rendered frame to the CHIP RAM raster.


--------------------------------------------------------------------------------
TECHNICAL NOTES

* The scaling routine fits any rectangle from a texture into a rectangle of any
size and ratio of another texture with nearest-neighbor matching.
* Logic and rendering are totally asynchronous: the logic runs always at 50 Hz
and the rendering never stops (unless it reaches the limit of 50 fps, imposed
by the display refresh rate), thus exploiting the machine's full potential.
* The screen buffering employs three buffers in CHIP RAM.
* The screen resolution is 1020x256 SHRES pixels, which correspond to 255x256
LORES-sized physical dots and to 128x256 logical dots.
* The code is 100% assembly.
* The program takes over the system entirely and returns to AmigaOS cleanly.
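
The resolution figures quoted in this thread (128 logical dots = 255 LORES-sized dots = 1020 SHRES pixels, and 160 logical dots = 319 LORES for the AMOS example) all fit a simple pattern, sketched below in Python. Note that the 2*N-1 relation is only inferred from those numbers; it is an assumption, not a formula taken from the PED81C documentation. The function names are illustrative.

```python
SHRES_PIXELS_PER_LORES_PIXEL = 4  # SHRES is four times finer than LORES


def lores_sized_dots(logical_dots: int) -> int:
    """Physical LORES-sized dots shown for a raster logical_dots wide.

    Assumption: 2*N-1, inferred from 128 -> 255 and 160 -> 319 in the thread.
    """
    return 2 * logical_dots - 1


def shres_pixels(logical_dots: int) -> int:
    """Width of the same screen expressed in SHRES pixels."""
    return lores_sized_dots(logical_dots) * SHRES_PIXELS_PER_LORES_PIXEL


for width in (128, 160):
    print(width, lores_sized_dots(width), shres_pixels(width))
```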


CHANGELOG

March 27, 2024
* Added the Zoomaniac demo.
* [PED81C Voxel Engine] Made a couple of minor changes.
* [PED81C Voxel Engine] Updated documentation.

January 1, 2024
* Rebuilt demos against latest custom framework.
* [PED81C Voxel Engine] Optimized slightly background rendering.
* [PED81C Voxel Engine] Corrected benchmark fps calculation (312 rasterlines were considered instead of 313).
* [PED81C Voxel Engine] Built against latest custom framework.
* [PED81C Voxel Engine] Updated, extended and fixed documentation.

Last edited by saimo on 27-Mar-2024 at 10:41 PM.

_________________
RETREAM - retro dreams for Amiga, Commodore 64 and PC

saimo 
Re: PED81C - pseudo-native, no C2P chunky screens for AGA
Posted on 29-Mar-2024 13:47:52
#19 ]
Elite Member
Joined: 11-Mar-2003
Posts: 2453
From: Unknown

In response to the feedback received, I have uploaded a new version of Zoomaniac that lets you enable/disable the fps limit with [F3].

Quote:
* The number shown in the top-left corner of the effects screen is the fps
indicator, which reports the number of frames rendered in the last second.
It is limited to 999.
* When the fps limit is on, the maximum number of frames rendered per second
is 50 also on the most powerful machines, as the display refresh rate is 50
Hz. When the fps limit is off, frames are rendered without pausing when the
previously rendered frame/frames has/have not been (completely) displayed yet. On
machines which cannot run the program at 50 fps or more, turning off the
limit has no effect whatsoever; on the other machines, the only visible effect
is that the fps indicator goes beyond 50, thus giving a measure of the maximum
speed that the machine can reach.


Also, this new version runs 1-2 fps faster on 68030 thanks to the data cache burst:

Quote:
* on 68030 tests proved that: it is advantageous to turn the data cache burst
on when scaling a 128 dots wide rectangle to a rectangle wider than 8 dots
(i.e. with an X scaling factor greater than 1/16); with a scaling factor of
1/16 or less the difference proved to be minimal when both the source and
destination rectangles were 256 dots tall; considering that turning the data
cache burst off would therefore be advantageous only with very narrow and
tall rectangles (which are uncommon and intrinsically rather inexpensive),
it is not worth implementing data cache burst management inside the
scaling routine;


CHANGELOG

v1.1 (28.3.2024)
* Turned the 68030 data cache burst on for slightly faster performance.
* Made a couple of minor optimizations.
* Added frames rendering limit toggle ([F3]).
* Worked on fps indicator: added hundreds digit; made digits smaller; made digits auto-clearing, so that they read correctly even when they are not cleared before drawing.
* Made staggered lines toggle as soon as [F1] is pressed (instead of when it is released).
* Updated splash screen.
* Redesigned the 'M' in the logo.
* Updated and extended manual.

Last edited by saimo on 29-Mar-2024 at 01:49 PM.

_________________
RETREAM - retro dreams for Amiga, Commodore 64 and PC

saimo 
Re: PED81C - pseudo-native, no C2P chunky screens for AGA
Posted on 2-Apr-2024 21:05:44
#20 ]
Elite Member
Joined: 11-Mar-2003
Posts: 2453
From: Unknown

To have a complete set of scaling routines (which hopefully I'll use for something someday), I added support for color-keying, zero-keying (color-keying with color 0), and horizontal and vertical flipping.
Moreover, since the focus was initially on the stock A1200, the performance on expanded machines was not optimal (the rendering was done directly in CHIP RAM), so I also added an alternative buffering method: when 2 rasters can be allocated in FAST RAM, the rendering is done in FAST RAM and the rendered raster is then copied to the raster in CHIP RAM as quickly as possible, starting when the beam reaches the bottom of the screen. Relative to the first effect in the test program (the only one whose performance has been measured so far), this produced a gain of 8-9 fps on my 68030-equipped Amiga 1200.

To demonstrate the new features, the updated test program (available at https://retream.itch.io/ped81c) stretches and shrinks a color/zero-keyed texture covering almost the entire screen over a full-screen zooming background, with all the possible flipping combinations. That is of course a bit taxing for a stock A1200, whose performance drops to between 12 and 16 fps in the busiest cases.

https://www.youtube.com/watch?v=ebxwKm9K4Os

(Side note: the video was recorded before finalizing the test program, so it shows an outdated splash screen and zooming jumps relative to the background when passing from/to the color/zero-keying effects.)

This snippet from the updated manual provides further details.

Quote:
--------------------------------------------------------------------------------
OVERVIEW

Zoomaniac has been written to evaluate the performance on stock and modestly-
accelerated Amiga 1200s of some general-purpose texture scaling routines in
conjunction with PED81C.


--------------------------------------------------------------------------------
GETTING STARTED

Zoomaniac requires:
* Amiga computer
* AGA chipset
* 170 kB of CHIP RAM
* 1.2 MB of any RAM
* PAL SHRES support
* keyboard
* 1 MB of storage space

To install Zoomaniac, unpack the LhA archive to any directory of your choice.

To start Zoomaniac, open the program directory and double-click the program icon
from Workbench or execute the program from shell.

If your monitor / graphics card / scan doubler do(es) not support SHRES, the
colors will look off or not show at all. In that case, try the staggered lines
option, which may improve the colors a bit.


--------------------------------------------------------------------------------
CONTROLS

KEY       | SPLASH SCREEN               | EFFECTS SCREEN
----------+-----------------------------+----------------------------
[SPACE]   | go to effects screen        |
[F1]      | turn staggered lines on/off | turn staggered lines on/off
[F2]      | turn fps indicator on/off   | turn fps indicator on/off
[F3]      | turn fps limit on/off       | turn fps limit on/off
[ESCAPE]  | quit to AmigaOS             | go to splash screen


--------------------------------------------------------------------------------
MISCELLANEOUS

* The staggered lines option shifts the odd lines by 1 SHRES pixel to the right.
On systems which handle SHRES correctly, that reduces the jailbars effect
(but gives the screen a kind of wavy look). On systems which handle SHRES as
HIRES (for example, MNT's VA2000 graphics card and Irix Labs' ScanPlus AGA -
contrary to how it was originally marketed - display only the even or odd
columns of pixels, so only reds and blues or greens and grays show), that
helps improve the colors a bit (giving the screen a kind of scanline
effect). On other systems, the results are unpredictable, but the option is
still worth a try.
* The number shown in the top-left corner of the effects screen is the fps
indicator, which reports the number of frames rendered in the last second.
It is limited to 999.
* When the fps limit is on, the maximum number of frames rendered per second
is 50 even on the most powerful machines, as the display refresh rate is 50
Hz. When the fps limit is off, frames are rendered without waiting for the
previously rendered frame(s) to be (completely) displayed. On machines which
cannot run the program at 50 fps or more, turning off the limit has no effect
whatsoever; on the other machines, the only visible effect is that the fps
indicator goes beyond 50, thus giving a measure of the maximum speed that the
machines can reach.


--------------------------------------------------------------------------------
PERFORMANCE

The following results refer to the full-screen effect that zooms the
cosmonaut in and out without flipping. The source textures are 256x512 dots and
the screen internally consists of 128x256 dots. Since a dot is represented by a
byte, 128x256 = 32768 bytes are fetched and written to render a frame.

On a stock Amiga 1200, the execution speed is between 25 and 26 fps. If the
staggered lines are turned on, the performance drops by about 1 fps (even though
all that the option adds is a Copper WAIT and a Copper MOVE for each rasterline).
Given that the DMA load caused by PED81C is "double" (see its documentation for
the details), a version that uses only half the number of bitplanes (2) was
made to check the performance as if the Amiga had a native chunky video mode.
Surprisingly, the performance did not improve at all: as far as CHIP bus access
is concerned, the scaling code must interleave so nicely with the bitplane data
fetches that having more bus cycles available makes little or no difference.

An Amiga 1200 equipped with a 68030 clocked at 50 MHz and 60 ns FAST RAM easily
runs at a steady 50 fps. To find out the maximum performance, tests were made
with the fps limit off.
The speed when running the program normally was between 84 and 86 fps. The
staggered lines option lowered the fps by about 1. The 2-bitplane version ran
at the same speed - in this case, that is because most of the CHIP RAM accesses
happen when no bitplane DMA is going on (see TECHNICAL DETAILS section).

The following table sums up the results.

staggered lines     | off   | on
--------------------+-------+-------
stock Amiga 1200    | 25-26 | 24-25
expanded Amiga 1200 | 84-86 | 84-85

expanded Amiga 1200: Blizzard 1230 IV, 68030 @50 MHz, 60 ns FAST RAM

Notes:
* given that a stock Amiga 1200 reaches about 25.5 fps, it manages to render
128*256*25.5 = 835584 dots per second; considering that the 68020 is clocked
at 14.187580 MHz, rendering 1 dot requires about 14187580/835584 = 17 CPU
cycles;
* on 68030 tests proved that: it is advantageous to turn the data cache burst
on when scaling a 128 dots wide rectangle to a rectangle wider than 8 dots
(i.e. with an X scaling factor greater than 1/16); with a scaling factor of
1/16 or less the difference proved to be minimal when both the source and
destination rectangles were 256 dots tall; considering that turning the data
cache burst off would therefore be advantageous only with very narrow and
tall rectangles (which are uncommon and intrinsically rather inexpensive),
it is not worth it to manage the data cache burst inside the scaling
routines.
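
The cycles-per-dot estimate in the first note can be reproduced with a few lines of Python. This is only a back-of-the-envelope check using the figures quoted above; the function name is made up for illustration and is not part of Zoomaniac or PED81C.

```python
def cycles_per_dot(width, height, fps, cpu_hz):
    """Return (dots rendered per second, CPU cycles spent per dot)."""
    dots_per_second = width * height * fps
    return dots_per_second, cpu_hz / dots_per_second


# Figures quoted in the notes: 128x256 logical dots, about 25.5 fps,
# 68020 clocked at 14.187580 MHz.
dots, cycles = cycles_per_dot(128, 256, 25.5, 14_187_580)
print(f"{dots:.0f} dots/s, about {cycles:.0f} CPU cycles per dot")
```

The result is an average over the whole frame, so it includes everything the program does besides the scaling itself.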


--------------------------------------------------------------------------------
SCALING ROUTINES

The scaling routines fit any rectangle from a texture into a rectangle of any
size and ratio of another texture with nearest-neighbor matching. Optionally,
they can flip the rectangles horizontally and/or vertically, and treat as
transparent the dots of a specific color (color-keying) or of color 0 (zero-
keying).
Color/zero-keying allows rendering graphics of arbitrary shapes without masks
(which saves RAM and CPU cycles). Thanks to the fact that PED81C graphics always
use at most 81 colors, there are 256-81 = 175 colors that can be used for color-
keying without causing any visual loss.
For performance reasons, there are 3 separate routines.

routine               | color-keying | zero-keying | speed rating
----------------------+--------------+-------------+--------------
v_ScaleRectangle()    |              |             | ***
v_ScaleRectangle_CK() |      *       |             | *
v_ScaleRectangle_ZK() |              |      *      | **
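
To make the description above more concrete, here is a minimal Python sketch of nearest-neighbor rectangle scaling with optional flipping and keying. It is not the actual assembly code: the function name, the parameters and the list-of-rows layout are assumptions made for illustration, and all three variants are folded into one function for brevity (the real routines are kept separate for performance reasons).

```python
def scale_rectangle(src, sx, sy, sw, sh, dst, dx, dy, dw, dh,
                    flip_x=False, flip_y=False, key=None):
    """Fit the sw x sh rectangle at (sx, sy) of src into the dw x dh
    rectangle at (dx, dy) of dst with nearest-neighbor matching.

    src and dst are lists of rows of byte values (one byte per dot).
    key is the color to treat as transparent (0 for zero-keying), or
    None to copy every dot.
    """
    for y in range(dh):
        v = y * sh // dh              # destination row -> source row
        if flip_y:
            v = sh - 1 - v
        src_row = src[sy + v]
        dst_row = dst[dy + y]
        for x in range(dw):
            u = x * sw // dw          # destination column -> source column
            if flip_x:
                u = sw - 1 - u
            dot = src_row[sx + u]
            if key is None or dot != key:
                dst_row[dx + x] = dot  # keyed dots leave dst untouched


# Stretch a 2x2 texture to 4x4, flipped horizontally, with zero-keying.
texture = [[1, 2],
           [3, 0]]
screen = [[9] * 4 for _ in range(4)]
scale_rectangle(texture, 0, 0, 2, 2, screen, 0, 0, 4, 4, flip_x=True, key=0)
for row in screen:
    print(row)
```

In the example, the dots of color 0 are skipped, so the background value 9 shows through in the bottom-left corner of the destination.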


--------------------------------------------------------------------------------
OTHER TECHNICAL NOTES

* Logic and rendering are totally asynchronous: the logic always runs at 50 Hz
and the rendering never stops (unless it reaches 50 fps and the fps limit is
on), thus exploiting the machine's full potential.
* The screen is triple-buffered.
* When 2 rasters can be allocated in FAST RAM:
1. the graphics are always rendered to the available raster in FAST RAM;
2. after the rendering has completed and as soon as the bottom rasterline
has been displayed, the rendered raster is copied as quickly as possible
to the raster in CHIP RAM (which is the one that gets displayed).
The copy successfully races the beam (on the expanded Amiga 1200 mentioned in
the PERFORMANCE section, it requires about 57 rasterlines during the vertical
blanking and 35 rasterlines during the fetching of the top rasterlines), so no
tearing occurs.
This method performs better than rendering directly to a raster in
CHIP RAM (especially when there is overdraw and/or data is also read from
the raster).
* The screen resolution is 1020x256 SHRES pixels, which correspond to 255x256
LORES-sized physical dots and to 128x256 logical dots.
* The code is 100% assembly.
* The program takes over the system entirely and returns to AmigaOS cleanly.
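
As a rough illustration of the FAST-to-CHIP copy figures in the notes above, the Python sketch below works out how much of a frame the copy takes. It assumes that the copied raster is 128x256 = 32768 bytes (one byte per logical dot, as in the PERFORMANCE section) and that a PAL frame has 313 rasterlines (the figure used in the Voxel Engine changelog); the variable names are made up for illustration.

```python
RASTER_BYTES = 128 * 256   # assumed raster size: one byte per logical dot
LINES_PER_FRAME = 313      # assumed PAL rasterlines per frame
COPY_LINES = 57 + 35       # vertical blanking + top rasterlines (from the notes)

bytes_per_line = RASTER_BYTES / COPY_LINES
frame_share = COPY_LINES / LINES_PER_FRAME

print(f"copy time: {COPY_LINES} rasterlines")
print(f"about {bytes_per_line:.0f} bytes copied per rasterline")
print(f"about {frame_share:.0%} of a frame")
```

Under these assumptions the copy takes a bit less than a third of a frame, which leaves the rest of the frame for rendering in FAST RAM.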

_________________
RETREAM - retro dreams for Amiga, Commodore 64 and PC

Copyright (C) 2000 - 2019 Amigaworld.net.
Amigaworld.net was originally founded by David Doyle