Amigaworld.net - The Amiga Computer Community Portal Website

Interesting memory allocation benchmark

Fab

Interesting memory allocation benchmark
Posted on 18-Sep-2009 14:49:16

[ #1 ]

Super Member

Joined: 17-Mar-2004
Posts: 1178
From: Unknown

Quote:

Quote:

@Fab Quote:
I think we already discussed it once. And what's sure is that absolutely nothing indicates SLAB would be more advanced

Possibly we have previously discussed it long ago, but I have since gained a better understanding of OS4.1's Slab implementation: Please go read-up on the "VMem" allocator in relation to Slabs.

You will see that VMem is O(1) & therefore modern Slab implementations like OS4's are also O(1). VMem is actually very similar to TLSF, but predates it by a few years & so is not quite so clever (or memory efficient).

I do NOT intend to debate the pros & cons of Slab allocators vs TLSF *here*. Please start another thread if you wish to do that.

So, I was bored, and wrote a small simple test. This test isn't meant to be clever at all: it just allocates memory areas and then frees them, in a loop.

Here's the link to binaries and sources:
http://fabportnawak.free.fr/sillybench/

Note that multitasking isn't switched off for the test, and of course be aware that more iterations and an idle system will produce more stable results.
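For readers who don't want to download the archive, here is a minimal sketch of the kind of loop described above. It is not the actual membench source: the size table is hypothetical, malloc()/free() stand in for the OS allocator calls, and clock() stands in for whatever timer the real test uses.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical size table; the real one lives in the linked membench source. */
static const size_t bench_sizes[] = { 16, 64, 256, 1024, 4096, 65536 };
#define BENCH_NUM_SIZES (sizeof(bench_sizes) / sizeof(bench_sizes[0]))

/* Allocate every size, then free everything; repeat `iterations` times.
   Returns the elapsed CPU time in microseconds as measured by clock(). */
static double run_bench(unsigned long iterations)
{
    void *ptr[BENCH_NUM_SIZES];
    clock_t start = clock();

    for (unsigned long i = 0; i < iterations; i++) {
        for (size_t j = 0; j < BENCH_NUM_SIZES; j++) {
            ptr[j] = malloc(bench_sizes[j]);
            assert(ptr[j] != NULL);
            /* Touch the block so an optimizing compiler is less likely
               to drop the malloc/free pair altogether. */
            *(volatile char *)ptr[j] = 0;
        }
        for (size_t j = 0; j < BENCH_NUM_SIZES; j++)
            free(ptr[j]);
    }

    return (double)(clock() - start) * 1e6 / CLOCKS_PER_SEC;
}
```

Calling run_bench(1000), run_bench(2000) and so on gives one figure per iteration count, in the same shape as the tables in this post. clock() is coarse on many systems, so short runs may read as 0.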

MorphOS2.3 (SafeMemList allocator (a bit like OS3.x, with more safety checks)) with membench_morphos:

iterations - result
1000 : ~160000 µs (0.16 s)
2000 : ~320000 µs (0.32 s)
10000 : ~1600000 µs (1.6 s)
50000 : ~8200000 µs (8.2 s)
100000 : ~16200000 µs (16.2 s)
1000000 : ~161000000 µs (161 s)

-> I think with more time, chunks and fragmentation, it would increase quite a lot, given the underlying algorithm. :)

MorphOS2.3 (TLSF allocator) with membench_morphos:

iterations - result
1000 : ~16000 µs (0.016 s)
2000 : ~32000 µs (0.032 s)
10000 : ~170000 µs (0.17 s)
50000 : ~870000 µs (0.87 s)
100000 : ~1740000 µs (1.7 s)
1000000 : ~17000000 µs (17 s)

-> No surprise, TLSF allocator is apparently 10 times faster than old MorphOS memory allocator in this simple scenario.

OS4.1 (advanced SLAB allocator) with membench_amigaos4:

iterations - result
1000 : ~5060000 µs (5 s)
2000 : ~10120000 µs (10 s)
10000 : ~50600000 µs (50 s)
50000 : ~256000000 µs (256 s)
100000 : ~514000000 µs (514 s)
1000000 : N/A... I hadn't enough time for that. :)

-> Surprising result for the advanced SLAB allocator. It's really much, much slower than MorphOS (hundreds of times slower!). So it seems there's a high constant allocation time, but I really can't explain it. Even the older MorphOS allocator is way faster.

Does anybody have a clue why OS4 shows such slow results? I expected TLSF to be faster, but not by this magnitude. There seems to be something really wrong with the OS4 memory allocation time, so maybe the test exploits something particularly unfavorable to the OS4 allocator, I don't know.

Last edited by Fab on 18-Sep-2009 at 03:18 PM.
Last edited by Fab on 18-Sep-2009 at 03:17 PM.
Last edited by Fab on 18-Sep-2009 at 03:17 PM.
Last edited by Fab on 18-Sep-2009 at 03:04 PM.
Last edited by Fab on 18-Sep-2009 at 03:03 PM.

Status: Offline

BaldGuy

Re: Interesting memory allocation benchmark
Posted on 18-Sep-2009 15:29:35

[ #2 ]

Member

Joined: 11-Aug-2009
Posts: 28
From: Belgium

@Fab

I didn't know that OS4 was this slow.

That is horrible.

I hope Amiga and Hyperion will fix it soon.

_________________
AMIGA 500/EXT.FLOPPY
AMIGA 1200/030/50MHz/FPU/SCSI
AMIGA 4000/060/50MHz/SCSI/CYBERVISION
AMIGA CD32
AMIGA CDTV
AMIGA T-Shirt
AMIGA Mousepad
Commodore Underwear

Status: Offline

zerohero

Re: Interesting memory allocation benchmark
Posted on 18-Sep-2009 16:12:24

[ #3 ]

Team Member

Joined: 4-May-2004
Posts: 2524
From: Uddevalla, Sweden

@Fab

It's even slower on my A1 XE with a G4 @ 800MHz, though I have set my FSB down to 100MHz... Interesting results though.

_________________
Common sense - So rare it's almost like a super power

Status: Offline

kas1e

Re: Interesting memory allocation benchmark
Posted on 18-Sep-2009 16:17:12

[ #4 ]

Elite Member

Joined: 11-Jan-2004
Posts: 3549
From: Russia

Interesting benchmarks. It will be cool to hear something from the OS4 devs about this :)

_________________
Join us to improve dopus5!
zerohero's mirror of os4/os3 crosscompiler suites

Status: Offline

jPV

Re: Interesting memory allocation benchmark
Posted on 18-Sep-2009 16:53:00

[ #5 ]

Cult Member

Joined: 11-Apr-2005
Posts: 812
From: .fi

With Pegasos1 and MorphOS2.3 (TLSF):

1000: 24248 µs (~0.024s)
2000: 62200 µs (~0.062s)
10000: 256290 µs (~0.26s)
50000: 1229004 µs (~1.2s)
100000: 2454614 µs (~2.5s)
1000000: 24336641 µs (~24s)

I didn't bother to boot into a fresh system, so this is with a couple of hours of uptime and several IRC clients, a browser, ssh etc. in the background :)

_________________
- The wiki based MorphOS Library - Your starting point for MorphOS
- Software made by jPV^RNO

Status: Offline

mike

Re: Interesting memory allocation benchmark
Posted on 18-Sep-2009 16:53:29

[ #6 ]

Regular Member

Joined: 31-Jul-2007
Posts: 406
From: Alpha Centauri

68060 tlsfmem
1000 Elapsed time: 239673 µs (0.23s)
2000 Elapsed time: 458838 µs (0.45s)
10000 Elapsed time: 2277872 µs (2.27s)
50000 Elapsed time: 11382361 µs (11.3s)
100000 Elapsed time: 23131983 µs (23.1s)
1000000 Elapsed time: 231297482 µs (231.2s)

68060 exec's finest.
1000 Elapsed time: 406642 µs (0.40s)
2000 Elapsed time: 813348 µs (0.81s)
10000 Elapsed time: 4038716 µs (4.0s)
50000 Elapsed time: 20193021 µs (20.1s)
100000 Elapsed time: 40599833 µs (40.5s)
1000000 Elapsed time: 403810916 µs (403.8s)

oi, recompiled with gcc -O3 -m68060 -noixemul membench.c -o membench: gcc340 came to 280013-273133 µs for 1000 allocs, gcc295 came to 237650-226703 µs at best

Last edited by mike on 18-Sep-2009 at 05:50 PM.
Last edited by mike on 18-Sep-2009 at 05:39 PM.
Last edited by mike on 18-Sep-2009 at 05:11 PM.
Last edited by mike on 18-Sep-2009 at 04:57 PM.

_________________
C= Amiga addict
,,,
(Oo)
⎛☮ໄ
ﮑὠՀ
Couldn't care less what other people think, seeing that there's concrete evidence they don't.

Status: Offline

bernd_afa

Re: Interesting memory allocation benchmark
Posted on 18-Sep-2009 18:27:48

[ #7 ]

Cult Member

Joined: 14-Apr-2006
Posts: 829
From: Unknown

@Fab
> Anybody have a clue why OS4 shows such slow results? I expected TLSF to be faster, but not with this magnitude.

Try numbers like these for the bench and see if OS4 gets faster:

int sizes[] = {2, 5, 11, 13, 28, 20, 44, 19, 3, 77, 33, 127, 251,
304, 111, 700, 43, 7011, 112, 1, 4000 }; /* Silly stuff, whatever :) */

That's closer to real-world use. The most frequent memory allocations are always in the range of 0 to 256 bytes. I wrote a small profiling tool that counts allocations < 256 bytes and allocations > 256 bytes.

Using it with real programs shows that allocations < 256 bytes happen about 1000 times more often than larger ones.

Larger allocations don't happen that often in reality, but when the bench does such large allocations, the MMU tables must be changed and rearranged often.
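The counting idea can be sketched roughly as follows. This is not the profiling tool itself (its source isn't posted here); the struct and function names are made up for illustration, and a real tool would hook the OS allocation calls instead of walking an array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counters, split at the 256-byte threshold mentioned above. */
struct alloc_stats {
    unsigned long small;  /* requests below 256 bytes */
    unsigned long large;  /* requests of 256 bytes or more */
};

/* Record one allocation request of `size` bytes. */
static void count_alloc(struct alloc_stats *st, size_t size)
{
    assert(st != NULL);
    if (size < 256)
        st->small++;
    else
        st->large++;
}

/* Tally a whole list of request sizes. */
static struct alloc_stats tally(const int *sizes, size_t n)
{
    struct alloc_stats st = { 0, 0 };
    for (size_t i = 0; i < n; i++)
        count_alloc(&st, (size_t)sizes[i]);
    return st;
}
```

Fed with the sizes[] array above, this counts 17 requests below 256 bytes and 4 at or above it.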

Status: Offline

Tomppeli

Re: Interesting memory allocation benchmark
Posted on 18-Sep-2009 18:34:00

[ #8 ]

Super Member

Joined: 18-Jun-2004
Posts: 1652
From: Home land of Santa, sauna, sisu and salmiakki

I noticed a long time ago that allocation is fast but deallocation is slow. So add reporting of the elapsed time between the allocation and deallocation loops, and rerun the test. (Also, for AmigaOS4 use AllocVecTags; it uses the MEMF_PRIVATE flag by default.)
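Something along these lines would do for the split timing. It is only a sketch: malloc()/free() and clock() are portable stand-ins for the OS calls and timer the real test uses, and the names are made up.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

struct phase_times {
    double alloc_us;  /* total time spent in the allocation loops */
    double free_us;   /* total time spent in the deallocation loops */
};

static double elapsed_us(clock_t from, clock_t to)
{
    return (double)(to - from) * 1e6 / CLOCKS_PER_SEC;
}

/* Time the allocation and deallocation loops separately.
   `n` must be at least 1. */
static struct phase_times time_phases(const size_t *sizes, size_t n,
                                      unsigned long iterations)
{
    struct phase_times t = { 0.0, 0.0 };
    void **ptr = malloc(n * sizeof *ptr);
    assert(ptr != NULL);

    for (unsigned long i = 0; i < iterations; i++) {
        clock_t t0 = clock();
        for (size_t j = 0; j < n; j++) {
            ptr[j] = malloc(sizes[j]);
            assert(ptr[j] != NULL);
        }
        clock_t t1 = clock();
        for (size_t j = 0; j < n; j++)
            free(ptr[j]);
        clock_t t2 = clock();

        t.alloc_us += elapsed_us(t0, t1);
        t.free_us  += elapsed_us(t1, t2);
    }

    free(ptr);
    return t;
}
```

clock() has coarse resolution on many systems, so a finer timer is a better choice if the per-iteration phases are very short.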

Edit: I found a bug in it:
Quote:
APTR ptr[sizeof(sizes)/sizes[0]];
for(j = 0; j < sizeof(sizes)/sizeof(sizes[0]); j++)

Last edited by Tomppeli on 18-Sep-2009 at 07:14 PM.
Last edited by Tomppeli on 18-Sep-2009 at 06:46 PM.

_________________
Rock lobster bit me. My Workbench has always preferences. X1000 + AmigaOS4.1 FE
"Anyone can build a fast CPU. The trick is to build a fast system." -Seymour Cray

Status: Offline

Fab

Re: Interesting memory allocation benchmark
Posted on 18-Sep-2009 19:00:07

[ #9 ]

Super Member

Joined: 17-Mar-2004
Posts: 1178
From: Unknown

@Tomppeli

There was indeed a copy/paste bug, but it didn't have any ill effect anyway, given the value of sizes[0].

But I changed it a bit. Not that it will change anything in these results, though. :)
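To illustrate why the bug was harmless here, a small sketch. The table below is hypothetical and is not the one from the benchmark; the only assumption is that its first entry is small.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table whose first entry is small. */
static const int demo_sizes[] = { 2, 4, 8, 16, 32, 64, 128, 256 };

/* Buggy form: total bytes divided by the VALUE of the first element. */
static size_t buggy_count(void)
{
    assert(demo_sizes[0] > 0);
    return sizeof(demo_sizes) / (size_t)demo_sizes[0];
}

/* Correct form: total bytes divided by the SIZE of one element. */
static size_t correct_count(void)
{
    return sizeof(demo_sizes) / sizeof(demo_sizes[0]);
}
```

As long as the first value is no larger than sizeof(int), the buggy expression yields at least as many slots as the loop needs, so the pointer array is merely oversized. With a larger first value the array would be too short and the loop would write past its end.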

Status: Offline

itix

Re: Interesting memory allocation benchmark
Posted on 18-Sep-2009 19:17:11

[ #10 ]

Elite Member

Joined: 22-Dec-2004
Posts: 3398
From: Freedom world

I expected the old memlist-based memory allocator to be slower than the SLAB allocator in OS4. Even 68k Amiga is faster...

_________________
Amiga Developer
Amiga 500, Efika, Mac Mini and PowerBook

Status: Offline

Cheese

Re: Interesting memory allocation benchmark
Posted on 18-Sep-2009 22:58:54

[ #11 ]

Regular Member

Joined: 23-Oct-2006
Posts: 314
From: Unknown

Seems SLAB rhymes with ....

Last edited by Cheese on 18-Sep-2009 at 11:18 PM.

_________________
x86/MorphOS 4.0

"Delving into the past can be a dangerous exercise." -hyperionmp

"I've been a supporter of "REACTION" GUI because is an Amiga OS thing." -Snuffy

"I personally prefer a vision of do'ers and makers rather than

Status: Offline

ssolie

Re: Interesting memory allocation benchmark
Posted on 18-Sep-2009 23:46:39

[ #12 ]

Elite Member

Joined: 10-Mar-2003
Posts: 2755
From: Alberta, Canada

@Fab
Quote:
Anybody have a clue why OS4 shows such slow results?

I would suggest you email Thomas Frieden directly and discuss it. Perhaps you can help find a root cause and a solution to fix it if there is indeed a problem.

_________________
ExecSG Team Lead

Status: Offline

pixie

Re: Interesting memory allocation benchmark
Posted on 18-Sep-2009 23:56:54

[ #13 ]

Elite Member

Joined: 10-Mar-2003
Posts: 3120
From: Figueira da Foz - Portugal

@ssolie

Quote:
Perhaps you can help find a root cause and a solution to fix it if there is indeed a problem.

And what exactly led you into thinking there is a problem?

_________________
Indigo 3D Lounge, my second home.
The Illusion of Choice | Am*ga

Status: Offline

Samwel

Re: Interesting memory allocation benchmark
Posted on 19-Sep-2009 8:44:51

[ #14 ]

Elite Member

Joined: 7-Apr-2004
Posts: 3404
From: Sweden

@pixie

Eh.. Maybe because the speed result is waaay slower than it should be?

_________________
/Harry

[SOLD] µA1-C - 750GX 800MHz - 512MB - Antec Aria case

Avatar by HNL_DK!

Status: Offline

corto

Re: Interesting memory allocation benchmark
Posted on 19-Sep-2009 9:10:06

[ #15 ]

Regular Member

Joined: 24-Apr-2004
Posts: 342
From: Grenoble (France)

@Fab

Are we sure that the memory functions do the same thing? I mean, at work we had a similar case between two Linux systems, and in the end one was using 'lazy allocation': the alloc function returned successfully, but the mapping part of the allocation was only done at the first memory access in the page.
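One way to rule that out is to write to every page of each block right after allocating it, so both systems pay the mapping cost inside the timed loop. A rough sketch, where the function name is made up and the page size is an assumption passed in by the caller (4096 is common, but not guaranteed):

```c
#include <assert.h>
#include <stdlib.h>

/* Write one byte at every page_size stride of the buffer, plus the last
   byte, so a lazily mapping system has to back the whole block for real.
   Returns the number of stride writes (the extra last-byte write is not
   counted). */
static size_t touch_pages(void *buf, size_t size, size_t page_size)
{
    volatile unsigned char *p = buf;
    size_t touched = 0;

    assert(buf != NULL && page_size > 0);
    for (size_t off = 0; off < size; off += page_size) {
        p[off] = 0;
        touched++;
    }
    /* The buffer may not be page aligned, so the final page could be
       skipped by the stride loop; touch the last byte as well. */
    if (size > 0)
        p[size - 1] = 0;

    return touched;
}
```

Whether any of the allocators tested here actually maps lazily is not something this thread establishes; the sketch only shows how to take that variable out of the comparison.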

It would be interesting to split the benchmark to know the elapsed time associated with each function.

With your raw results, it's true that something is wrong ...

As I work on tests and benchmarks every day, I usually draw (at least) 2 conclusions:
- be careful with benchmark results and early conclusions
- they are both useful to improve software

Thanks Fab, you certainly pointed out a problem.

Status: Offline

itix

Re: Interesting memory allocation benchmark
Posted on 19-Sep-2009 9:26:14

[ #16 ]

Elite Member

Joined: 22-Dec-2004
Posts: 3398
From: Freedom world

@Samwel

Quote:

Eh.. Maybe because the speed result is waaay slower than it should be?

It should be easy to find out by running the test on OS 4.0.

_________________
Amiga Developer
Amiga 500, Efika, Mac Mini and PowerBook

Status: Offline

ChrisH

Re: Interesting memory allocation benchmark
Posted on 19-Sep-2009 10:03:40

[ #17 ]

Elite Member

Joined: 30-Jan-2005
Posts: 6679
From: Unknown

@Fab
Interesting, albeit worrying, results. I'll hopefully have time to look at it more closely later, but it might explain why E programs still perform better on OS4 with a custom super-fast allocator (as provided by AmigaE & PortablE) than when directly using the OS.

It's a pity that your benchmark does not report time per allocation ( = total time / number of allocations), that would make comparisons easier, and microseconds a more sensible unit. I changed it to milliseconds for sanity, but currently left it at total elapsed time for easy comparison with your results.

BEWARE that the Debug Kernel added 50% on to my reported times.

Last edited by ChrisH on 19-Sep-2009 at 10:05 AM.

_________________
Author of the PortablE programming language.
It is pitch black. You are likely to be eaten by a grue...

Status: Offline

Cyborg

Re: Interesting memory allocation benchmark
Posted on 19-Sep-2009 10:26:34

[ #18 ]

Regular Member

Joined: 26-Nov-2003
Posts: 424
From: Germany

@Fab

bernd_afa is right. Your power-of-2 sizes are a) not relevant in practice and b).. well.. no algorithm performs equally well or badly in every situation. I'm sure someone could also find a situation where the results would be reversed.. Anyway, if you do what bernd_afa suggested, OS4 is many times faster. Still not faster than MOS, but there are enough pitfalls a "benchmark" can fall into to generate questionable results.

For the heck of it, here are the results with bernd_afa's changes on OS4:

1000: 113044 µs (0.113044 s)
2000: 225120 µs (0.225120 s)
10000: 1155693 µs (1.155693 s)
50000: 5807400 µs (5.807400 s)
100000: 11615699 µs (11.615699 s)
1000000: 116656246 µs (116.656250 s)

Quite a difference, huh? And that only because more realistic allocation sizes were used than the original silly ;) ones.

I only tested a 68k build on MOS, because I don't have the SDK (where could I get that from?). As said, it still was faster (the JIT doesn't really have any great workload with that little code) but a lot slower than with the original silly sizes.

Anyway.. this is just to show you that there is absolutely no big fat problem in the memory allocation algorithms in OS4 and that the original "benchmark" of this thread doesn't mean anything. (And even if MOS is faster... well.. so be it ;) )

_________________
Regards, Cyborg.
AmigaOS4 development team member

"In the beginning was CAOS.."
-- Andy Finkel, 1988 (ViewPort article, Oct. 1993)

Status: Offline

ChrisH

Re: Interesting memory allocation benchmark
Posted on 19-Sep-2009 11:54:28

[ #19 ]

Elite Member

Joined: 30-Jan-2005
Posts: 6679
From: Unknown

@Fab
I hate coding in C, so I rewrote yours in E (actually PortablE), and compiled it for various OSes:
http://cshandley.co.uk/temp/membench/

As an aside: The E source code is about half the size of the C source, has no OS-specific work-arounds, reports more meaningful information, and looks a hell of a lot nicer to boot :) . I also compiled an AROS version for the hell of it (since it is about zero extra effort).

EDIT: I have also uploaded a Bernd_AFA version of the test. It is a LOT faster, as reported elsewhere!

Last edited by ChrisH on 19-Sep-2009 at 12:19 PM.
Last edited by ChrisH on 19-Sep-2009 at 12:06 PM.
Last edited by ChrisH on 19-Sep-2009 at 11:56 AM.

_________________
Author of the PortablE programming language.
It is pitch black. You are likely to be eaten by a grue...

Status: Offline

ChrisH

Re: Interesting memory allocation benchmark
Posted on 19-Sep-2009 12:16:52

[ #20 ]

Elite Member

Joined: 30-Jan-2005
Posts: 6679
From: Unknown

I have now uploaded a TEST script, which makes it incredibly easy. Here are my results using the NON-debug kernel on my 667MHz Sam440ep:
Quote:
execute membench-TEST membench-bernd_afa_OS4

1000 iterations:
Elapsed time: 114872 µs = 114 ms
Average time: 5 µs (per allocation + deallocation)

2000 iterations:
Elapsed time: 228924 µs = 228 ms
Average time: 5 µs (per allocation + deallocation)

10000 iterations:
Elapsed time: 1133979 µs = 1133 ms
Average time: 5 µs (per allocation + deallocation)

50000 iterations:
Elapsed time: 5610080 µs = 5610 ms
Average time: 5 µs (per allocation + deallocation)

100000 iterations:
Elapsed time: 11195496 µs = 11195 ms
Average time: 5 µs (per allocation + deallocation)

1000000 iterations:
Elapsed time: 111841734 µs = 111841 ms
Average time: 5 µs (per allocation + deallocation)

FWIW, I got 9 µs when using the Debug kernel.
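For anyone checking the arithmetic, the average is just elapsed time divided by the total number of allocation + deallocation pairs. A sketch in C, with two assumptions: there are 21 pairs per iteration (the length of bernd_afa's sizes array), and the result is truncated to whole microseconds. The function name is made up.

```c
#include <assert.h>

/* Average time per allocation + deallocation pair, in whole microseconds.
   pairs_per_iteration = 21 is an assumption based on the length of
   bernd_afa's sizes[] array. */
static unsigned long average_us(unsigned long elapsed_us,
                                unsigned long iterations,
                                unsigned long pairs_per_iteration)
{
    assert(iterations > 0 && pairs_per_iteration > 0);
    return elapsed_us / (iterations * pairs_per_iteration);
}
```

With those assumptions, average_us(114872, 1000, 21) gives 5, and so does every other row quoted above.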

Last edited by ChrisH on 19-Sep-2009 at 12:17 PM.

_________________
Author of the PortablE programming language.
It is pitch black. You are likely to be eaten by a grue...

Status: Offline


Amigaworld.net was originally founded by David Doyle