Discussion:
Performance of list / array / dict compared
(too old to reply)
RodionGork
2024-08-18 07:41:37 UTC
Permalink
Hi Friends!

I've seen here curious thread on comparing TCL speed with Python - and
as a "sequel" to it here is more TCLish observation.

It happened that I was also measuring languages performance (for
practical purpose - to get understanding how much calculations would fit
in 1 second limited sandbox on my site). There are two tests - for plain
integer calculations - and for calculations involving array.

I initially used list as array and while performance is somewhat lower
than with other popular scripting languages, I thought it is quite
decent, regarding list implementation (I thought it is a kind of
space-separated string by then).

Then I browsed TCL tutorial and rewrote implementations using array and
finally, dict. They were
significantly worse, which is explainable as they are not necessarily
tuned to work as linear array - but I was somewhat surprised to see the
"dict" is the worst of all (perhaps it in improved in the versions above
8.5):


C (long long): 4.28
PHP: 11.99
Python3: 42.57
TCL (list): 63.30
TCL (array): 104.78
TCL (dict): 112.41
Lua: 33.75

Implementation is here and you are welcome to check whether I missed
some obvious way to improve results:
https://github.com/rodiongork/languages-benchmark

As a sidenote, PHP and Lua use single version of array for both "linear"
and "dict-like" storage.
--
to email me substitute github with gmail please
Gerald Lester
2024-08-18 13:28:09 UTC
Permalink
Post by RodionGork
Hi Friends!
I've seen here curious thread on comparing TCL speed with Python - and
as a "sequel" to it here is more TCLish observation.
It happened that I was also measuring languages performance (for
practical purpose - to get understanding how much calculations would fit
in 1 second limited sandbox on my site). There are two tests - for plain
integer calculations - and for calculations involving array.
I initially used list as array and while performance is somewhat lower
than with other popular scripting languages, I thought it is quite
decent, regarding list implementation (I thought it is a kind of
space-separated string by then).
Then I browsed TCL tutorial and rewrote implementations using array and
finally, dict. They were
significantly worse, which is explainable as they are not necessarily
tuned to work as linear array - but I was somewhat surprised to see the
"dict" is the worst of all (perhaps it in improved in the versions above
C (long long): 4.28
PHP: 11.99
Python3: 42.57
TCL (list): 63.30
TCL (array): 104.78
TCL (dict): 112.41
Lua: 33.75
Implementation is here and you are welcome to check whether I missed
https://github.com/rodiongork/languages-benchmark
As a sidenote, PHP and Lua use single version of array for both "linear"
and "dict-like" storage.
Python has the equivalent of Lists, Arrays, and Dictionaries -- which
did you use?

For calculation, C or Golang would likely be best.
RodionGork
2024-08-18 14:08:46 UTC
Permalink
Post by Gerald Lester
Python has the equivalent of Lists, Arrays, and Dictionaries -- which
did you use?
in Python the plain list is used (well, these details could be quickly
seen by the link to sources above) - as I mentioned in TCL I decided to
try other structures only because I thought list implementation could be
not very effective internally, e.g. due to historical reasons...
Post by Gerald Lester
For calculation, C or Golang would likely be best.
Not necessarily, any language compiled to native codes will do
similarly. Moreover, there is optimized version of Python - and there is
JIT-version for Lua, while Java uses JIT anyway, so they results are
very close:

Java: 1.60
Pypy3: 7.31
LuaJit: 2.18

Perhaps someone may one day try to use JIT in TCL also (perhaps even
borrowing it from Lua may work)
--
to email me substitute github with gmail please
Rich
2024-08-18 17:05:15 UTC
Permalink
Post by RodionGork
Post by Gerald Lester
Python has the equivalent of Lists, Arrays, and Dictionaries -- which
did you use?
in Python the plain list is used (well, these details could be quickly
seen by the link to sources above) - as I mentioned in TCL I decided to
try other structures only because I thought list implementation could be
not very effective internally, e.g. due to historical reasons...
Your 'history' is 25 years out of date. Tcl's lists have been O(1)
complexity indexed arrays for that long (so long as you don't shimmer
them to/from strings on every usage).
gustafn
2024-08-19 08:35:29 UTC
Permalink
Hi RodionGork,

I took a quick look at the "primes" examples in your comparison on the
GitHub page.

Using any data structures other than lists does not make sense for this
example.
One could get an improvement of about 5% by putting the outer loop into
a proc.

Most of the time in this example is spent in the "is_prime" proc.
One can get much bigger improvements by using critcl for the is_prime
function (see below):

baseline list 1766907.44 100.00
loop proc 1689220.00 95.60
is_prime_list_c 118298.50 6.70

This is in the spirit of thinking in "system languages" and "glue
languages"
by John Ousterhout, where one should find the right mix for the
applications,
when performance matters.

all the best
-g



===================================================================================
package require critcl

critcl::cproc is_prime_list_c {Tcl_Interp* interp list primes int x} int
{
int i;
for (i=0; i<primes.c; i++) {
int d;
if (Tcl_GetIntFromObj(interp, primes.v[i], &d) != TCL_OK) {
fprintf(stderr, "list element is not an integer: '%s'\n",
Tcl_GetString(primes.v[i]));
}
if (d*d > x) return 1;
if (x%d == 0) return 0;
}
return -1;
}

critcl::load
===================================================================================

===================================================================================
proc run_list_c {} {
set primes {2 3 5 7}
set n $::env(MAXN)

for {set i 9} {1} {incr i 2} {
if {[is_prime_list_c $primes $i]} {
lappend primes $i
if {[llength $primes] == $n} {
puts "primes\[$n\] = $i"
break
}
}
}
}
===================================================================================
Rich
2024-08-18 17:03:14 UTC
Permalink
Post by RodionGork
Hi Friends!
I've seen here curious thread on comparing TCL speed with Python - and
as a "sequel" to it here is more TCLish observation.
It happened that I was also measuring languages performance (for
practical purpose - to get understanding how much calculations would fit
in 1 second limited sandbox on my site). There are two tests - for plain
integer calculations - and for calculations involving array.
I initially used list as array and while performance is somewhat lower
than with other popular scripting languages, I thought it is quite
decent, regarding list implementation (I thought it is a kind of
space-separated string by then).
Tcl lists have not been "space-separated strings" since the advent of
Tcl 8.0. Which according to this page
(http://tcl.tk/software/tcltk/8.0.html) was released March 9, 1999. So 25
years since Tcl's lists were "space-separated strings" (reality is more
complex, they were really "specially formatted strings" with "space
separated" being a simple subset of "specially formatted".
Post by RodionGork
Then I browsed TCL tutorial and rewrote implementations using array and
finally, dict. They were
significantly worse, which is explainable as they are not necessarily
tuned to work as linear array - but I was somewhat surprised to see the
"dict" is the worst of all (perhaps it in improved in the versions above
C (long long): 4.28
PHP: 11.99
Python3: 42.57
TCL (list): 63.30
TCL (array): 104.78
TCL (dict): 112.41
Lua: 33.75
Implementation is here and you are welcome to check whether I missed
https://github.com/rodiongork/languages-benchmark
Which one? There are two.

For Collaz, if you are really on 8.5, then wrapping everything that is
at global level inside a proc (i.e., everthing from line 10 to line
17), and calling that proc as the single global command, will gain a
bit of speed, as for 8.5 the bytecode compiler is limited in what it
can compile outside of procs.

For Primes (beyond the same "in a proc" for 8.5 above), in the 'list
version' using global is costing you a bit of performance (the
"linking" performed by gobal takes some time). Modifing "is_prime" to
take both x and primes as parameters will gain a bit of speed for the
list version.

The array and dict versions will be slower than the list version
because they will always be adding in the overhead of the hashmap
computation for looking up elements (no hashmap lookup in the list
version).

For the dict version (besides all the above), you /might/ gain some
speed using the [dict values] subcommand to get a list of values from
the dict, then iterating that list. I.e.:

foreach d [dict values $primes] {
}

Which should avoid performing all the hash computations to lookup each
element individually. But, that will still be creating a new list each
time your outer loop modifies the dict.

You also don't need [dict append] in the outer loop. The way you are
using the dict, doing [dict append] means paying the cost of a hash
computation and a single element list creation for each new element
added. A [dict set] will produce the same final dict string, but avoid
wrapping each 'prime' in a single element list in each dict slot,
saving that (small) overhead.
Loading...