Discussion:
Tcl versus Python regarding performance
aotto1968
2024-08-13 20:40:04 UTC
Hi, some (unproven) statistics from my SW regarding the performance of Tcl versus Python

The --send ... send packages
The --parent/child ... measure startup time
The other ... build data structures

Except for --parent (startup), Tcl is slower than Python → I think the CORE problem is the OO implementation in Tcl.

→ The basic technology for Tcl and Python is an OO wrapper around the C library. This means the BASIC workload
for Tcl and Python is the same, and the TIME difference is just the Tcl/Python overhead.

TCL
===

setup=release
feature=tcl_pipe
.../release/inst/sbin/x86_64-suse-linux-gnu-perfserver-tcl.tcl
:PerfClientExec }: start ------------------------ : result [ count / sec ]
:statistics }: --send : 216004.5 [ 432206 / 2.000912 ]
:statistics }: --send-string : 224700.4 [ 449614 / 2.000949 ]
:statistics }: --send-and-callback : 121014.5 [ 242218 / 2.001561 ]
:statistics }: --send-and-wait : 58694.0 [ 117389 / 2.000015 ]
:statistics }: --send-persistent : 13358.7 [ 26718 / 2.000046 ]
:statistics }: --parent : 82.5 [ 165 / 2.000311 ]
:statistics }: --child : 21321.3 [ 42643 / 2.000022 ]
:statistics }: --bus : 40133.7 [ 80268 / 2.000017 ]
:statistics }: --bfl : 41333.4 [ 82667 / 2.000005 ]
:statistics }: --bin : 264818.7 [ 529687 / 2.000187 ]
:statistics }: --str : 265383.5 [ 530867 / 2.000377 ]
:PerfClientExec }: end: ----------------------------------------

PYTHON
======

setup=release
feature=py_pipe
.../release/inst/sbin/x86_64-suse-linux-gnu-perfserver-py.py
:PerfClientExec }: start ------------------------ : result [ count / sec ]
:statistics }: --send : 292415.7 [ 584872 / 2.000139 ]
:statistics }: --send-string : 294627.4 [ 589494 / 2.000812 ]
:statistics }: --send-and-callback : 154291.6 [ 308617 / 2.000219 ]
:statistics }: --send-and-wait : 73577.8 [ 147156 / 2.000005 ]
:statistics }: --send-persistent : 13796.6 [ 27594 / 2.000058 ]
:statistics }: --parent : 71.0 [ 142 / 2.000464 ]
:statistics }: --child : 21959.2 [ 43919 / 2.000024 ]
:statistics }: --bus : 67991.3 [ 135983 / 2.000005 ]
:statistics }: --bfl : 65537.4 [ 131075 / 2.000004 ]
:statistics }: --bin : 326737.6 [ 653632 / 2.000480 ]
:statistics }: --str : 327746.7 [ 655625 / 2.000402 ]
:PerfClientExec }: end: ----------------------------------------


This is the reference in C only
===============================

setup=release
feature=c_pipe
.../release/inst/sbin/x86_64-suse-linux-gnu-perfserver-c
:PerfClientExec }: start ------------------------ : result [ count / sec ]
:statistics }: --send : 372049.5 [ 744214 / 2.000309 ]
:statistics }: --send-string : 388248.2 [ 776648 / 2.000390 ]
:statistics }: --send-and-callback : 221224.1 [ 442541 / 2.000420 ]
:statistics }: --send-and-wait : 87320.3 [ 174641 / 2.000004 ]
:statistics }: --send-persistent : 15245.3 [ 30491 / 2.000024 ]
:statistics }: --parent : 552.4 [ 1105 / 2.000235 ]
:statistics }: --child : 35888.9 [ 71778 / 2.000004 ]
:statistics }: --bus : 80124.6 [ 160250 / 2.000011 ]
:statistics }: --bfl : 86655.9 [ 173312 / 2.000003 ]
:statistics }: --bin : 400718.2 [ 801590 / 2.000383 ]
:statistics }: --str : 396122.8 [ 792369 / 2.000312 ]
:PerfClientExec }: end: ----------------------------------------
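
One way to read the three tables above is to express each rate as a fraction of the C reference. The sketch below does that for three of the rows, using only figures copied from the tables; the dictionary layout and the helper name `relative_to_c` are illustrative choices, not part of the benchmark tool.

```python
# Rates (results per second) copied from the TCL, PYTHON and C tables above.
RATES = {
    "--send":   {"tcl": 216004.5, "py": 292415.7, "c": 372049.5},
    "--parent": {"tcl": 82.5,     "py": 71.0,     "c": 552.4},
    "--bus":    {"tcl": 40133.7,  "py": 67991.3,  "c": 80124.6},
}


def relative_to_c(rates):
    """Return each language's rate as a fraction of the C rate for the same test."""
    return {
        test: {lang: value / row["c"] for lang, value in row.items() if lang != "c"}
        for test, row in rates.items()
    }


if __name__ == "__main__":
    for test, row in relative_to_c(RATES).items():
        print(f"{test:10s} tcl={row['tcl']:.2f}  py={row['py']:.2f}  (C = 1.00)")
```

For --parent the Tcl fraction is larger than the Python one, which matches the remark that startup is the one case where Tcl is ahead.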
undroidwish
2024-08-13 20:51:19 UTC
Post by aotto1968
Hi, some (unproven) statistics from my SW regarding the performance TCL
          ^^^^^^^^

Exactly. I tend to go even further and add the attribute useless to
unproven as long as you publish some numbers with some subjective
analysis without presenting the implementation and the measurement
method.
aotto1968
2024-08-14 09:02:53 UTC
Post by aotto1968
Hi, some (unproven) statistics from my SW regarding the performance TCL
            ^^^^^^^^
Exactly. I tend to go even further and add the attribute useless to
unproven as long as you publish some numbers with some subjective
analysis without presenting the implementation and the measurement
method.
Not really. With "aggressive" optimization Tcl does better, but it is still not close to Python.

setup=release
feature=cc_pipe
.../release/inst/sbin/x86_64-suse-linux-gnu-perfserver-cc
: start ------------------------ : result [ count / sec ]
: --send : 387331.8 [ 774702 / 2.000099 ]
: end: ----------------------------------------
feature=c_pipe
.../release/inst/sbin/x86_64-suse-linux-gnu-perfserver-c
: start ------------------------ : result [ count / sec ]
: --send : 390295.8 [ 780623 / 2.000081 ]
: end: ----------------------------------------
feature=py_pipe
.../release/inst/sbin/x86_64-suse-linux-gnu-perfserver-py.py
: start ------------------------ : result [ count / sec ]
: --send : 284371.7 [ 568775 / 2.000111 ]
: end: ----------------------------------------
feature=tcl_pipe
.../release/inst/sbin/x86_64-suse-linux-gnu-perfserver-tcl.tcl
: start ------------------------ : result [ count / sec ]
: --send : 215990.2 [ 432027 / 2.000216 ]
: end: ----------------------------------------

setup=aggressive
feature=cc_pipe
.../aggressive/inst/sbin/x86_64-suse-linux-gnu-perfserver-cc
: start ------------------------ : result [ count / sec ]
: --send : 398688.2 [ 797433 / 2.000142 ]
: end: ----------------------------------------
feature=c_pipe
.../aggressive/inst/sbin/x86_64-suse-linux-gnu-perfserver-c
: start ------------------------ : result [ count / sec ]
: --send : 401113.0 [ 802377 / 2.000376 ]
: end: ----------------------------------------
feature=py_pipe
.../aggressive/inst/sbin/x86_64-suse-linux-gnu-perfserver-py.py
: start ------------------------ : result [ count / sec ]
: --send : 286609.9 [ 573378 / 2.000552 ]
: end: ----------------------------------------
feature=tcl_pipe
.../aggressive/inst/sbin/x86_64-suse-linux-gnu-perfserver-tcl.tcl
: start ------------------------ : result [ count / sec ]
: --send : 237457.9 [ 475001 / 2.000359 ]
: end: ----------------------------------------
aotto1968
2024-08-14 09:16:03 UTC
The first analysis is quite simple:

Right now Python does NOT support threads in NHI1 (this will change soon) and Tcl does.
This influences the "release" build, because there NHI1 is built without threads for Python and with
threads for Tcl.

→ The difference is that the thread-local storage is a STATIC REFERENCE in Python and a POINTER in Tcl.

→ The "aggressive" build does NOT use threads at all, so the difference between Python and Tcl is more comparable,
but it is still ~20%.
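
The "~20%" figure can be checked against the --send rates posted in the previous message. A small sketch, with the numbers copied from the release and aggressive runs (the function name `gap_percent` is just an illustrative helper):

```python
# --send rates (results per second) from the "release" and "aggressive" runs.
SEND = {
    "release":    {"py": 284371.7, "tcl": 215990.2},
    "aggressive": {"py": 286609.9, "tcl": 237457.9},
}


def gap_percent(py_rate, tcl_rate):
    """How far Tcl falls short of Python, as a percentage of the Python rate."""
    return (py_rate - tcl_rate) / py_rate * 100.0


if __name__ == "__main__":
    for setup, rates in SEND.items():
        print(f"{setup:10s} Tcl is {gap_percent(rates['py'], rates['tcl']):.1f}% below Python")
```

With these inputs the gap comes out at roughly 24% for release and 17% for aggressive, which is consistent with the "~20%" estimate.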
Gerald Lester
2024-08-14 12:04:41 UTC
Post by aotto1968
right now python does NOT support threads in NHI1 (will change soon) and tcl does…
this has an influence on the "release" build because this is NHI1
without threads in python and with
threads in tcl.
→ the difference is that the thread-local-storage is an STATIC REFERENCE
in python and a POINTER in tcl.
→ the "aggressive" build does NOT use threads at all and the change
between python and tcl is more compare-able
but is still ~20%
I think the point that androwish was making is that, without seeing the code, we
cannot tell if you did something in a way that takes more time than
doing it in a slightly different way.
aotto1968
2024-08-15 13:04:10 UTC
Post by aotto1968
right now python does NOT support threads in NHI1 (will change soon) and tcl does…
this has an influence on the "release" build because this is NHI1 without threads in python and with
threads in tcl.
→ the difference is that the thread-local-storage is an STATIC REFERENCE in python and a POINTER in tcl.
→ the "aggressive" build does NOT use threads at all and the change between python and tcl is more compare-able
but is still ~20%
I think the point that androwish was making, without seeing the code we can not tell if you did something in a way that takes
more time than doing it in a slightly different way.
I use kcachegrind to analyze the performance, but there are a lot of "small" points that add up to the ~20% loss against Python.

-> I cannot post a "picture" because the newsgroup does NOT accept pictures …
-> Most of the code is in the Tcl C API, for example:

Example, my "ServiceCall" function: at the end of a service call I use:

  if (ret == TCL_OK) {
    Tcl_ResetResult(interp);
    return MkErrorGetCode_0E();
  }

and this simple "Tcl_ResetResult" eats 0.8% of the total performance → this is 75% of my "ServiceCall" performance.

-> Not trivial; it seems that the Python people, with a lot of "manpower", have already MAXIMIZED the optimization of Python.

If I step into Tcl_ResetResult, the highlights are:

% of total performance -> function name
1.09% -> ResetObjectResult
0.54% -> FreeByteArrayInternalRep (this object is variable size, around ~1000 bytes)

Regards, ao
Alan Grunwald
2024-08-15 13:30:43 UTC
Post by aotto1968
Hi, some (unproven) statistics from my SW regarding the performance TCL
            ^^^^^^^^
Exactly. I tend to go even further and add the attribute useless to
unproven as long as you publish some numbers with some subjective
analysis without presenting the implementation and the measurement
method.
+1
Gerald Lester
2024-08-15 14:20:54 UTC
Post by aotto1968
Hi, some (unproven) statistics from my SW regarding the performance TCL
             ^^^^^^^^
Exactly. I tend to go even further and add the attribute useless to
unproven as long as you publish some numbers with some subjective
analysis without presenting the implementation and the measurement
method.
+1
+2
aotto1968
2024-08-15 18:27:50 UTC
To be more precise, I am adding an image to show the differences between Tcl and Python for an example wrapper function

-> ReadI8

This is from the debugging environment with tcl/py & extension compiled in debug mode.

https://i.postimg.cc/NjXccdRC/performance-check-tcl-versa-python.png
aotto1968
2024-08-15 18:43:28 UTC
Post by aotto1968
To be more precise I add an image to show the differences TCL versa PYTHON on an the example wrapper function
-> ReadI8
This is from the debugging environment with tcl/py & extension compiled in debug mode.
https://i.postimg.cc/NjXccdRC/performance-check-tcl-versa-python.png
better link → with callgraph
https://i.postimg.cc/TYbNKXrn/performance-check-tcl-versa-python.png
aotto1968
2024-08-15 19:09:40 UTC
Post by aotto1968
Post by aotto1968
To be more precise I add an image to show the differences TCL versa PYTHON on an the example wrapper function
-> ReadI8
This is from the debugging environment with tcl/py & extension compiled in debug mode.
https://i.postimg.cc/NjXccdRC/performance-check-tcl-versa-python.png
better link → with callgraph
https://i.postimg.cc/TYbNKXrn/performance-check-tcl-versa-python.png
even better resolution: https://i.postimg.cc/wvpJV4QC/performance-check-tcl-versa-python.png
aotto1968
2024-08-15 19:18:29 UTC
Post by aotto1968
Post by aotto1968
Post by aotto1968
To be more precise I add an image to show the differences TCL versa PYTHON on an the example wrapper function
-> ReadI8
This is from the debugging environment with tcl/py & extension compiled in debug mode.
https://i.postimg.cc/NjXccdRC/performance-check-tcl-versa-python.png
better link → with callgraph
https://i.postimg.cc/TYbNKXrn/performance-check-tcl-versa-python.png
even better resolution: https://i.postimg.cc/wvpJV4QC/performance-check-tcl-versa-python.png
Too bad that I cannot EDIT an old news message… the problem is that "postimage"
changes the resolution of the image → bad

I am switching to good old Facebook to post this screenshot and will wait for your comments.

-> https://www.facebook.com/share/p/wihmQPR4pBRacLLF/
saito
2024-08-15 23:39:56 UTC
Post by aotto1968
https://i.postimg.cc/wvpJV4QC/performance-check-tcl-versa-python.png
Very nice screenshots. Is this some sort of debugger?

Assuming that you wrote both tcl and python versions and that they both
wrap the same core library, wouldn't the call trees look the same or at
least bear resemblance?
aotto1968
2024-08-16 05:34:15 UTC
Post by saito
Post by aotto1968
even better resolution: https://i.postimg.cc/wvpJV4QC/performance-check-tcl-versa-python.png
Very nice screenshots. Is this some sort a debugger?
Assuming that you wrote both tcl and python versions and that they both wrap the same core library, wouldn't the call trees look
the same or at least bear resemblance?
Yes, both the TCL and PYTHON extensions are wrappers for the same library and the TOOL for writing both wrappers is the NHI1/ALC
(All-Language-Compiler), that is why both wrappers look similar.

The profiling tool has two parts:
1) valgrind --tool=callgrind --quiet ... your sw → creates callgrind.out.*
2) kcachegrind callgrind.out.* → creates the view
saito
2024-08-16 18:29:44 UTC
Post by aotto1968
Post by saito
Post by aotto1968
https://i.postimg.cc/wvpJV4QC/performance-check-tcl-versa-python.png
Very nice screenshots. Is this some sort a debugger?
Assuming that you wrote both tcl and python versions and that they
both wrap the same core library, wouldn't the call trees look the same
or at least bear resemblance?
Yes, both the TCL and PYTHON extensions are wrappers for the same
library and the TOOL for writing both wrappers is the NHI1/ALC
(All-Language-Compiler), that is why both wrappers look similar.
What I meant was that the two images look very different. I can't make
out what the boxes say, but nevertheless one is wide and shallow, the
other narrow and deep. So this may not be an apples-to-apples
comparison. As has been noted already, shimmering may play a role here
or extra levels of abstraction via extra proc calls may skew the results
in one language vs. the other.
aotto1968
2024-08-16 20:19:27 UTC
Post by aotto1968
Post by saito
Post by aotto1968
even better resolution: https://i.postimg.cc/wvpJV4QC/performance-check-tcl-versa-python.png
Very nice screenshots. Is this some sort a debugger?
Assuming that you wrote both tcl and python versions and that they both wrap the same core library, wouldn't the call trees
look the same or at least bear resemblance?
Yes, both the TCL and PYTHON extensions are wrappers for the same library and the TOOL for writing both wrappers is the
NHI1/ALC (All-Language-Compiler), that is why both wrappers look similar.
What I meant was that the two images look very different. I can't make out what the boxes say, but nevertheless one is wide and
shallow, the other narrow and deep. So this may not be an apples-to-apples comparison. As has been noted already, shimmering may
play a role here or extra levels of abstraction via extra proc calls may skew the results in one language vs. the other.
These two pictures are generated by the tool and not by me…
-> the Tcl picture is so "wide" because Tcl has a lot of "overhead"
-> the Python picture is so "narrow" because Python has much less "overhead".
undroidwish
2024-08-17 04:27:22 UTC
Post by aotto1968
...
these two pictures are generated by the tool and not by me…
-> the TCL picture is so "wide" because the TCL uses a lot of "overhead"
-> the python picture is so "narrow" because PYTHON uses much less "overhead".
Hmm, so here we are:

a) you complain about Tcl's bad performance
b) you seem to be unwilling to disclose enough information about the
Python and Tcl implementations in order to get the big picture and
see the cause of differences and to try to discuss improvements
with you
c) due to b) you continue to complain about Tcl's bad performance

Not quite a fruitful cycle.
aotto1968
2024-08-17 05:28:59 UTC
Post by undroidwish
Post by aotto1968
...
these two pictures are generated by the tool and not by me…
-> the TCL picture is so "wide" because the TCL uses a lot of "overhead"
-> the python picture is so "narrow" because PYTHON uses much less "overhead".
a) you complain about Tcl's bad performance
b) you seem to be unwilling to disclose enough information about the
   Python and Tcl implementations in order to get the big picture and
   see the cause of differences and to try to discuss improvements
   with you
c) due to b) you continue to complain about Tcl's bad performance
Not quite a fruitful cycle.
I am "not" complaining about Tcl's bad performance; I am just mentioning that Python has
done much more work on performance than Tcl.

Whether you get 300,000 transactions per second (Python) or 200,000 transactions
per second (Tcl) only matters to someone who needs this difference.
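
To put the two round numbers in perspective, they can be converted into time per transaction. A minimal sketch (the rates are the rounded figures from the post, not measurements, and the helper name is illustrative):

```python
def microseconds_per_transaction(rate_per_second):
    """Average time spent on one transaction, in microseconds."""
    return 1_000_000 / rate_per_second


python_us = microseconds_per_transaction(300_000)  # rounded Python rate
tcl_us = microseconds_per_transaction(200_000)     # rounded Tcl rate

if __name__ == "__main__":
    print(f"Python: {python_us:.2f} us, Tcl: {tcl_us:.2f} us, "
          f"difference: {tcl_us - python_us:.2f} us per transaction")
```

That is about 3.3 microseconds versus 5 microseconds per transaction, so the difference only matters for workloads that are sensitive to a microsecond or two per call.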
undroidwish
2024-08-17 07:59:41 UTC
Post by aotto1968
...
I'm "not" complain about TCL bad performance I just mention that PYTHON has
done much more work on performance than TCL.
Fine, then please elaborate on this claim. What exactly did Python do
better and more in terms of performance? Any pointers welcome.
Post by aotto1968
If you have 300.000 transaction per second (PYTHON) or 200.000 transaction
per second (TCL) is just an case for someone who need this difference.
Indeed, this could be a reason to ask whether there are better ways of using
the Tcl framework in order to get the Tcl implementation on par with
the Python one. As stated many times before, discussing this on c.l.t.
will require that you provide more implementation details.
aotto1968
2024-08-15 21:48:37 UTC
a short conclusion from Facebook …

"If you analyze the C lib wrapper for MqReadI8, the TCL code adds about 200% wrapper load and the PYTHON code adds about 10%
wrapper load." (ref: https://www.facebook.com/share/p/wihmQPR4pBRacLLF/)

→ I think Tcl has a "performance problem".
Christian Gollwitzer
2024-08-16 07:12:52 UTC
Post by aotto1968
a short conclusion from Facebook …
"If you analyze the C lib wrapper for MqReadI8, the TCL code adds about
200% wrapper load and the PYTHON code adds about 10% wrapper load."
(ref: https://www.facebook.com/share/p/wihmQPR4pBRacLLF/)
→ I think TCL has an "performance-problem".
I won't solve the problem, just to say: It's impossible to help you with
this, because you don't explain:
* who wrote this wrapper
* where to find the code
* what benchmark you are running

It could be, e.g. that your benchmark code introduces shimmering and
then there's lots of conversion going on. It might be something
completely different. Or it might be that Tcl is indeed slower than
Python (in most of my comparisons, it was the opposite - unless you
offload work to external libraries).

Regards,

Christian
undroidwish
2024-08-16 08:41:32 UTC
Post by Christian Gollwitzer
Post by aotto1968
...
→ I think TCL has an "performance-problem".
I won't solve the problem, just to say: It's impossible to help you with
* who wrote this wrapper
* where to find the code
* what benchmark are you running
...
+1

PS: Philosophically, the perpetual perception of performance problems
is inherent to human design (and possibly inextricable even).
aotto1968
2024-08-16 09:44:26 UTC
Post by Christian Gollwitzer
Post by aotto1968
a short conclusion from Facebook …
"If you analyze the C lib wrapper for MqReadI8, the TCL code adds about 200% wrapper load and the PYTHON code adds about 10%
wrapper load." (ref: https://www.facebook.com/share/p/wihmQPR4pBRacLLF/)
→ I think TCL has an "performance-problem".
* who wrote this wrapper
* where to find the code
* what benchmark are you running
It could be, e.g. that your benchmark code introduces shimmering and then there's lots of conversion going on. It might be
something completely different. Or it might be that Tcl is indeed slower than Python (in most of my comparisons, it was the
opposite - unless you offload work to external libraries).
Regards,
      Christian
1) Just the "stupid" Tcl_ObjectGetMetadata call to retrieve the pointer associated with an OO object costs 1/3 of the wrapper
performance → the whole header of a Tcl OO wrapper costs more than everything else in the wrapper.

-> if you look into the code, it is a hash-table lookup !!!
-> in Python it is a ZERO-time operation

2) Just to create an INT object from an integer, Tcl always creates an object from scratch, including malloc etc.
-> for small numbers (integers) Python uses a table of preallocated objects, a ZERO-time operation

3) The set/reset-result has to free all the (stupid) objects, which adds another 1/3 of the wrapper cost.


Analysis:

-> Tcl_ObjectGetMetadata is clearly a design error
-> the missing small-int-object preallocation is a lazy-programmer error


If someone can set up a screen-sharing session, then I can explain the problem in more detail
(I need to test the screen sharing first because I am not used to it).
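
Point 2 refers to the small-integer cache in CPython. The sketch below shows the effect from the Python side; it relies on a CPython implementation detail (integers from -5 to 256 are preallocated), so other Python implementations may behave differently, and it says nothing about how the NHI1 wrapper itself creates integers. The helper name `same_object` is made up for this illustration.

```python
def same_object(text):
    """Parse `text` twice at run time and report whether both results are one object."""
    first = int(text)
    second = int(text)
    return first is second


if __name__ == "__main__":
    # Small values come from CPython's preallocated table, so no new object is created.
    print("100    ->", same_object("100"))
    # Larger values are normally allocated on each call (implementation detail).
    print("100000 ->", same_object("100000"))
```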
aotto1968
2024-08-16 20:16:25 UTC
I spent some time on research and further optimization ... but ...

One thing seems clear: "lang-Python" with AGGRESSIVE optimization is NOT far from "lang-C" speed.
With aggressive optimization, Python creates a runtime optimization (--enable-optimizations) during compilation and that WITH
threads which, unlike TCL, CANNOT be disabled in Python. Also, the runtime library is FIRMLY integrated into Python.

TCL with aggressive optimization also uses the static runtime library BUT no threads.

→ for updates check the picture in the comment. https://www.facebook.com/share/p/WYmfnRWybY1Sh42f/

summary for aggressive …
.../perf-aggressive/inst/sbin/c/x86_64-suse-linux-gnu-perfserver
:PerfClientExec }: start ------------------------ : result [ count / sec ]
:statistics }: --send : 403779.6 [ 1615234 / 4.000286 ]
:PerfClientExec }: end: ----------------------------------------
.../perf-aggressive/inst/sbin/py/x86_64-suse-linux-gnu-perfserver.py
:PerfClientExec }: start ------------------------ : result [ count / sec ]
:statistics }: --send : 311506.7 [ 1246216 / 4.000608 ]
:PerfClientExec }: end: ----------------------------------------
.../perf-aggressive/inst/sbin/tcl/x86_64-suse-linux-gnu-perfserver.tcl
:PerfClientExec }: start ------------------------ : result [ count / sec ]
:statistics }: --send : 227151.4 [ 908663 / 4.000253 ]
:PerfClientExec }: end: ----------------------------------------
aotto1968
2024-08-18 20:42:21 UTC
I added some documentation regarding the performance testing:
-> http://thedev.nhi1.de/theLink/main/md_docs_2main_2README__PERFORMANCE.htm
et99
2024-08-18 22:29:08 UTC
Post by aotto1968
-> http://thedev.nhi1.de/theLink/main/md_docs_2main_2README__PERFORMANCE.htm
I recently wrote some C code using Visual Studio 2022 and they have a wonderful performance profiler. I was able to determine that 80% of the cost of the module I was developing was caused by calls to some library routines I was using. By writing my own versions that didn't need to be so generalized, I got that down to 10%.

One problem was that once I turned on the compiler optimization, the profiler became pretty much worthless to measure my own code's performance, so I couldn't get that 10% any lower.

But it would be kinda cool to try using those VS tools on the tcl source code, but I don't know of any way to build tcl inside VS where one could use those tools.
aotto1968
2024-08-19 12:26:39 UTC
Post by et99
Post by aotto1968
-> http://thedev.nhi1.de/theLink/main/md_docs_2main_2README__PERFORMANCE.htm
I recently wrote some C code using Visual Studio 2022 and they have a wonderful performance profiler. I was able to determine
that 80% of the cost of the module I was developing was caused by calls to some library routines I was using. By writing my own
versions that didn't need to be so generalized, I got that down to 10%.
One problem was that once I turned on the compiler optimization, the profiler became pretty much worthless to measure my own
code's performance, so I couldn't get that 10% any lower.
But it would be kinda cool to try using those VS tools on the tcl source code, but I don't know of any way to build tcl inside
VS where one could use those tools.
With the callgrind tool on Linux you can analyze any kind of executable, even executables without symbols compiled in.
aotto1968
2024-08-24 20:14:44 UTC
Enclosed you will find an update of the results of the performance test, with a focus on the description of the tools, tests and
the analysis of the results.

-> http://thedev.nhi1.de/theLink/main/md_docs_2main_2README__PERFORMANCE.htm#performance-server
aotto1968
2024-08-26 17:58:09 UTC
Now it's that time again and I've put a small C++ project in between to slowly but surely make better use of the "kernel" via
the C++ compiler (and some features like templates etc.).

-> http://thedev.nhi1.de/NHI1/main/index.htm

The performance code has been revised again and the unnecessary TCP tests are placed behind the UDS tests. With the C++ "agile"
kernel, C++ is now on a par with C, while using a much more user-friendly programming interface.

-> http://thedev.nhi1.de/theLink/main/md_docs_2main_2README__PERFORMANCE.htm#README_PERFORMANCE

Also important, the ALC (All-Language Compiler) compiler was written in TCL.
aotto1968
2024-09-06 08:32:50 UTC
short update

When adding the new option "__parser__(null-allowed)" I now get a nice (but still incorrect) error message. It is important to
note, however, that I can see the stack trace across TCL and C code. In fact, the Programming Language Micro Kernel (PLMK) does
not care which target language is involved.

https://www.facebook.com/permalink.php?story_fbid=pfbid0275qjybHm1Fk2iyvkwGhypQ3hgxnMH6V9HcpZxtRDXzxBW8TRxVVPG42ogDThxgGFl&id=100069563501101
aotto1968
2024-09-06 20:36:27 UTC
Below is the "C" code of the C function that tests whether an "object" is valid.

1. in "python"

bool MK(TestObject) (
  PyObject      * pyO,
  PyTypeObject  * typeO,
  MK_OBJ        * objP,
  MkTestClassE  * flagP
) {
  MkTestClassE flag = MkTestClassE_NONE_OBJECT;
  MK_OBJ obj = NULL;
  /* Python "None" is reported as NULL */
  if (pyO == Py_None) {
    flag=MkTestClassE_NULL; goto end;
  }
  /* the Python object must be an instance of the expected type */
  if (!PyObject_TypeCheck(pyO,typeO)) {
    flag=MkTestClassE_WRONG_CLASS; goto end;
  }
  /* C pointer stored in the Python object */
  MK_MNG objM = VAL2MNG(pyO);
  if (objM == NULL) { flag=MkTestClassE_NULL ; goto end; }
  obj = MkObj(objM);
  if (obj == NULL) { flag=MkTestClassE_INVALID_SIGNATURE ; goto end; }
  flag = MkTestClassE_OK;
end:
  if (flagP) *flagP = flag;
  if (objP) *objP = obj;
  switch(flag) {
    case MkTestClassE_NONE_OBJECT : return false;
    default : return true;
  }
}

2. same in "Tcl"

( the Tcl "C" API has "no" function to test whether an object has a "given" type etc. )

bool MK(TestObject) (
  OT_Prefix_ARGS
  Tcl_Obj       * tclO,
  MK_OBJ        * objP,
  MkTestClassE  * flagP
) {
  MkTestClassE flag = MkTestClassE_NONE_OBJECT;

  /* an empty string (or the "NULL" string) is reported as NULL */
  int len=0;
  MK_STRN str = Tcl_GetStringFromObj(tclO,&len);
  if (len == 0 || MkStringIsNULL(MkStringCreate(len,str))) {
    flag=MkTestClassE_NULL; goto end;
  }

  /* not a TclOO object: clear the error message left in the interpreter */
  Tcl_Object tclObj = Tcl_GetObjectFromObj (interp, tclO);
  if (tclObj == NULL) {
    Tcl_ResetResult(interp);
    flag=MkTestClassE_NONE_OBJECT; goto end;
  }

  /* metadata lookup for the C pointer attached to the TclOO object */
  objM = Tcl_ObjectGetMetadata(tclObj, &MK(AtomMeta));
  /* NULL or wrong class etc */
  if (objM == NULL) { flag=MkTestClassE_NULL ; goto end; }

  objM = MkObj(objM);
  if (objM == NULL) { flag=MkTestClassE_INVALID_SIGNATURE ; goto end; }

  flag = MkTestClassE_OK;
  if (objP) *objP = objM;

end:
  if (flagP) *flagP = flag;
  switch(flag) {
    case MkTestClassE_NONE_OBJECT : return false;
    default : return true;
  }
}
undroidwish
2024-09-08 13:04:56 UTC
Post by aotto1968
Down is the "C" code of the C-Function to test the an "object" to be valid.
...
Hmm, to judge the "efficiently" of the tcl "c" api is between difficult
and impossible due to your "cryptically" "c" code snippets. In other
words, more context would be of tremendous help. So where is the beef?
Or ham, or even spam?
Gerald Lester
2024-09-08 13:26:14 UTC
Post by undroidwish
Post by aotto1968
Down is the "C" code of the C-Function to test the an "object" to be valid.
...
Hmm, to judge the "efficiently" of the tcl "c" api is between difficult
and impossible due to your "cryptically" "c" code snippets. In other
words, more context would be of tremendous help. So where is the beef?
Or ham, or even spam?
Others, including myself, have asked him for the Tcl and Python code he
is using to get his measurements -- he has consistently refused to
supply the code.

Long and short, to me this is just spam.
Christian Gollwitzer
2024-09-08 17:10:43 UTC
Post by Gerald Lester
Post by undroidwish
Post by aotto1968
Down is the "C" code of the C-Function to test the an "object" to be valid.
...
Hmm, to judge the "efficiently" of the tcl "c" api is between difficult
and impossible due to your "cryptically" "c" code snippets. In other
words, more context would be of tremendous help. So where is the beef?
Or ham, or even spam?
Others, including myself, have asked him for the Tcl and Python code he
is using to get his measurements -- he has consistently refused to
supply the code.
Long and short, to me this is just spam.
Also my conclusion. As far as I understand it, he has written his own
wrapper generator - something like SWIG - and calls it "universal
compiler" or similar names. His Tcl wrappers are much slower than his
Python wrappers. It seems that he creates TclOO objects in his code. I
would suggest to simply use SWIG and then talk about the performance. My
guess is that SWIG wrappers will not show any difference between Tcl and
Python (because they are not based on TclOO in Tcl, which is not
necessary to get an OO interface).


Christian
aotto1968
2024-09-09 06:04:32 UTC
Also my conclusion. As far as I understand it, he has written his own wrapper generator - something like SWIG - and calls it
"universal compiler" or similar names. His Tcl wrappers are much slower than his Python wrappers. It seems that he creates TclOO
objects in his code. I would suggest to simply use SWIG and then talk about the performance. My guess is that SWIG wrappers will
not show any difference between Tcl and Python (because they are not based on TclOO in Tcl, which is not necessary to get an OO
interface).
SWIG is far less than ALC, but you are right: Tcl is much slower than Python in C integration. It is a "design" issue of the API,
because in Python all (basic) objects are instances and these instances are defined in C. TclOO is like an add-on to the Tcl
language, whereas in Python it IS the Python language.

I posted the code above just to show some simple facts:
1. Tcl has no "NULL" object → "NULL" is not even defined in Tcl
2. Tcl has no C API to get/compare the "TYPE" of an object like Py_TYPE.
3. Even getting a simple pointer from an object is just a C cast in Python and a
HASH table lookup in Tcl

Tcl has the advantage of a usable THREAD interface, BUT this thread interface "costs" ~30% performance
compared to non-thread.
undroidwish
2024-09-09 09:11:57 UTC
Post by aotto1968
...
1. Tcl has no "NULL" object → "NULL" is not even defined in Tcl
...
I'd call this a non-argument, for performance and otherwise, given that
even the inventor of NULL regretted his invention, see

https://en.wikipedia.org/wiki/Tony_Hoare

in this quote

"I call it my billion-dollar mistake. It was the invention of the null
reference in 1965. At that time, I was designing the first comprehensive
type system for references in an object oriented language (ALGOL W). My
goal was to ensure that all use of references should be absolutely safe,
with checking performed automatically by the compiler. But I couldn't
resist the temptation to put in a null reference, simply because it was
so easy to implement. This has led to innumerable errors,
vulnerabilities, and system crashes, which have probably caused a
billion dollars of pain and damage in the last forty years."
aotto1968
2024-09-09 09:59:45 UTC
Post by undroidwish
Post by aotto1968
...
1. Tcl has no "NULL" object → "NULL" is not even defined in Tcl
...
I'd call this a non-argument performance and otherwise given that
even the inventor of NULL regretted this his invention, see
https://en.wikipedia.org/wiki/Tony_Hoare
in this quote
"I call it my billion-dollar mistake. It was the invention of the null reference in 1965. At that time, I was designing the
first comprehensive type system for references in an object oriented language (ALGOL W). My goal was to ensure that all use of
references should be absolutely safe, with checking performed automatically by the compiler. But I couldn't resist the
temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors,
vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years."
→ you are comparing "apples" with "oranges"

a NULL in "C" is not the same as a "None" in Python or a "null" in Java
→ this guy complains about the "NULL-POINTER" mistake, which creates a dump on misuse etc.
→ the "NULL" I speak about introduces a "non-existing reference" as a piece of information, which is **not**
a NULL-pointer in "C".

example: if you look up "otto" in a database then you get a "non-existing NULL" back if "otto" does not exist.
TCL has **no** default "NULL" type to indicate this case; TCL can give an empty string back like "" **but**
if the "empty string" is the real value for "otto" in the database THEN you can NOT distinguish the two cases
1) "otto" does NOT exist
2) "otto" exists BUT has the value ""
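A minimal sketch of the two cases in Python, where None acts as the "non-existing" marker. The db dict and the lookup/describe helpers are hypothetical stand-ins for a real database, not code from the perfserver:

```python
# Hypothetical in-memory "database": "otto" exists with an empty value,
# "anna" has a normal value, "fritz" is not stored at all.
db = {"otto": "", "anna": "hello"}

def lookup(key):
    """Return the stored value, or None if the key does not exist."""
    return db.get(key)  # dict.get returns None for a missing key

def describe(key):
    """Report which of the two cases applies to the key."""
    value = lookup(key)
    if value is None:          # case 1: no such entry
        return f"{key}: does NOT exist"
    return f"{key}: exists with value {value!r}"  # case 2 includes ""

print(describe("otto"))   # otto: exists with value ''
print(describe("fritz"))  # fritz: does NOT exist
```

The check uses "is None" rather than truthiness, because an empty string is also false in Python and would otherwise be confused with the missing entry.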
undroidwish
2024-09-09 10:08:28 UTC
   example: if you look up "otto" in a database then you get a "non-existing NULL" back if "otto" does not exist.
            TCL has **no** default "NULL" type to indicate this case; TCL can give an empty string back like "" **but**
            if the "empty string" is the real value for "otto" in the database THEN you can NOT distinguish the two cases
            1) "otto" does NOT exist
            2) "otto" exists BUT has the value ""
Interesting, you're now talking about a result set, which could be
expressed as a Tcl list. An empty result set is an empty list, then.
In your lookup example a non-empty result has exactly one list element.
So where's your point?
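A rough sketch of that result-set idea, with a Python list standing in for a Tcl list. The rows table and the query helper are made up for illustration:

```python
# Hypothetical table: "otto" is stored with an empty string as its value.
rows = {"otto": ""}

def query(key):
    """Return a result set: [] if the key is absent, [value] if present."""
    return [rows[key]] if key in rows else []

# The length of the result set separates the two cases,
# even though the stored value itself is an empty string.
print(len(query("otto")))   # 1
print(len(query("fritz")))  # 0
```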
aotto1968
2024-09-09 12:30:07 UTC
Post by undroidwish
   example: if you look up "otto" in a database then you get a "non-existing NULL" back if "otto" does not exist.
            TCL has **no** default "NULL" type to indicate this case; TCL can give an empty string back like "" **but**
            if the "empty string" is the real value for "otto" in the database THEN you can NOT distinguish the two cases
            1) "otto" does NOT exist
            2) "otto" exists BUT has the value ""
Interesting, you're now talking about a result set, which could be
expressed as a Tcl list. An empty result set is an empty list, then.
In your lookup example a non-empty result has exactly one list element.
So where's your point?
→ ok, let's make it very, very simple.

nearly every programming language has some kind of "null" object/value/etc. defined,
→ do you think the whole programming world is "stupid"?
undroidwish
2024-09-09 13:06:32 UTC
Post by aotto1968
...
→ ok, let's make it very, very simple.
nearly every programming language has some kind of "null"
object/value/etc. defined,
→ do you think the whole programming world is "stupid"?
You gave an example of a database query, which can produce a
result set with one item or nothing. In my argument, that fits
perfectly into a representation as a Tcl list.

Now you seem to complain, that the Tcl paradigm does not have
a NULL thing. Which depending on context could well be an
empty list, as I've tried to explain, BTW.

And let me be clear: my thoughts about the whole programming
world were not part of the discussion.

So I interpret your last posting as pure trolling.

Congratulations, you have qualified for my ignore list. Good riddance!
aotto1968
2024-09-09 08:09:12 UTC
Post by undroidwish
Post by aotto1968
Below is the "C" code of the C function that tests whether an "object" is valid.
...
Hmm, to judge the "efficiently" of the tcl "c" api is between difficult
and impossible due to your "cryptically" "c" code snippets. In other
words, more context would be of tremendous help. So where is the beef?
Or ham, or even spam?
Others, including myself, have asked him for the Tcl and Python code he is using to get his measurements -- he has consistently
refused to supply the code.
Long and short, to me this is just spam.
I understand that the "missing-code" thing is just a kind of "self-protection" issue

→ If you make a performance check between TCL and JAVA you don't start with "analyze the JAVA kernel" etc.
→ If I post a result, this is the result *AFTER* all known optimizations were applied etc.
→ If I write that TCL is ~30% slower than PY then it is so.
→ Even if I sent you the PLMK-kernel-code you would probably never understand it, but this is no problem
because you also drive a car without "understanding" the internals of a car, and if I say that a PORSCHE
is faster than a "VAUXHALL" then nobody will say:
-> "I don't accept this until I have checked the VAUXHALL design specs".
aotto1968
2024-09-10 20:51:25 UTC
To give some substance to the amazement about the "performance test", I am in the process of building a "perfserver
distribution". The problem with something like this is that the files have to be extracted from the build environment and then
moved to a completely new location to "work".

https://www.facebook.com/permalink.php?story_fbid=pfbid02pHoDtBkJUrQKZqMa1JsxpC4pGLXj7J1Chx6p96idKRzXh3zQmAVFjWMbxYypYei8l&id=100069563501101
aotto1968
2024-09-11 19:40:32 UTC
The really impressive thing about the result of the new JAVA performance test is that PYTHON (a scripting language) is on the
SAME performance level as JAVA (a compiled language). PYTHON has obviously invested heavily in performance optimization.

x86_64-suse-linux-gnu     |     send     send     send     send    create   create     data     data
2024-09-11 21:27:41       |  NOTHING      END CALLBACK     WAIT    PARENT    CHILD      BUS      BFL
------------------------- | -------- -------- -------- -------- --------- -------- -------- --------

pipe:
R: C                      |   530275   400403   222971    90707      3859    37965    89818    81581
R: C++                    |   528852   396473   219499    89816      2470    36635    89619    89994
R: Python                 |   492501   304463   159169    73570       101    22109    68875    66767
R: Tcl                    |   306202   144504    78180    49443        81    18337    25233    25242
R: Java                   |   474683   313162   170157    79324        69    19772    72242    72031


x86_64-suse-linux-gnu     |     send     send     send     send    create   create     data     data
2024-09-11 21:27:41       |  NOTHING      END CALLBACK     WAIT    PARENT    CHILD      BUS      BFL
------------------------- | -------- -------- -------- -------- --------- -------- -------- --------

uds_fork:
R: C                      |   524971   396466   224320    89966     11814    38479    83146    90475
R: C++                    |   520668   390057   216610    89014      8712    36018    89073    89726
R: Python                 |   494415   314163   160014    75356       344    22197    69082    67205
R: Tcl                    |      na.      na.      na.      na.       na.      na.      na.      na.
R: Java                   |      na.      na.      na.      na.       na.      na.      na.      na.


x86_64-suse-linux-gnu     |     send     send     send     send    create   create     data     data
2024-09-11 21:27:41       |  NOTHING      END CALLBACK     WAIT    PARENT    CHILD      BUS      BFL
------------------------- | -------- -------- -------- -------- --------- -------- -------- --------

uds_thread:
R: C                      |   504425   375809   212943    88359     32173    37978    87952    88606
R: C++                    |   494135   365464   205933    88317     31582    35011    88070    73875
R: Python                 |      na.      na.      na.      na.       na.      na.      na.      na.
R: Tcl                    |   296177   139328    75300    47983        82    18166    24879    24783
R: Java                   |   463538   309542   161559    79059     19282    19779    72112    71591


x86_64-suse-linux-gnu     |     send     send     send     send    create   create     data     data
2024-09-11 21:27:41       |  NOTHING      END CALLBACK     WAIT    PARENT    CHILD      BUS      BFL
------------------------- | -------- -------- -------- -------- --------- -------- -------- --------

uds_spawn:
R: C                      |   530643   399273   228727    90660      3795    37906    89667    90450
R: C++                    |   522584   389941   218076    89381      2473    36351    88390    89105
R: Python                 |   494427   312230   137856    70085       101    22259    68911    66719
R: Tcl                    |   315789   147933    79493    49956        79    17944    23599    25445
R: Java                   |   475697   306405   161973    79067        68    18112    67439    66953
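
As a rough check of the "same performance level" statement, the per-column ratio of Python to Java can be computed from the two pipe rows above. This is only a sketch: the numbers are copied from the table, while the variable names are chosen here for illustration.

```python
# Column names and the Python/Java "pipe" rows from the table above.
COLUMNS = ["NOTHING", "END", "CALLBACK", "WAIT", "PARENT", "CHILD", "BUS", "BFL"]
PYTHON = [492501, 304463, 159169, 73570, 101, 22109, 68875, 66767]
JAVA = [474683, 313162, 170157, 79324, 69, 19772, 72242, 72031]

# A ratio above 1.0 means Python handled more operations than Java
# in that test; below 1.0 means fewer.
ratio = {c: p / j for c, p, j in zip(COLUMNS, PYTHON, JAVA)}

for c in COLUMNS:
    print(f"{c:<9} Python/Java = {ratio[c]:.2f}")
```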
aotto1968
2024-09-11 20:05:00 UTC
Quick update, I removed "--enable-symbols" from TCL in the release build and the performance has improved significantly, but
it is still ~25% behind PYTHON and JAVA. Apparently "--enable-symbols" has a significant impact on performance in TCL, which is
good to know because Tcl is often delivered with "--enable-symbols" in order to better analyze an error later during operation.

x86_64-suse-linux-gnu     |     send     send     send     send    create   create     data     data
2024-09-11 21:54:55       |  NOTHING      END CALLBACK     WAIT    PARENT    CHILD      BUS      BFL
------------------------- | -------- -------- -------- -------- --------- -------- -------- --------

pipe:
R: C                      |   530275   400403   222971    90707      3859    37965    89818    81581
R: C++                    |   528852   396473   219499    89816      2470    36635    89619    89994
R: Python                 |   492501   304463   159169    73570       101    22109    68875    66767
R: Tcl                    |   402439   236730   127712    59048       133    24002    44337    43762
R: Java                   |   474683   313162   170157    79324        69    19772    72242    72031


x86_64-suse-linux-gnu     |     send     send     send     send    create   create     data     data
2024-09-11 21:54:55       |  NOTHING      END CALLBACK     WAIT    PARENT    CHILD      BUS      BFL
------------------------- | -------- -------- -------- -------- --------- -------- -------- --------

uds_fork:
R: C                      |   524971   396466   224320    89966     11814    38479    83146    90475
R: C++                    |   520668   390057   216610    89014      8712    36018    89073    89726
R: Python                 |   494415   314163   160014    75356       344    22197    69082    67205
R: Tcl                    |      na.      na.      na.      na.       na.      na.      na.      na.
R: Java                   |      na.      na.      na.      na.       na.      na.      na.      na.


x86_64-suse-linux-gnu     |     send     send     send     send    create   create     data     data
2024-09-11 21:54:55       |  NOTHING      END CALLBACK     WAIT    PARENT    CHILD      BUS      BFL
------------------------- | -------- -------- -------- -------- --------- -------- -------- --------

uds_thread:
R: C                      |   504425   375809   212943    88359     32173    37978    87952    88606
R: C++                    |   494135   365464   205933    88317     31582    35011    88070    73875
R: Python                 |      na.      na.      na.      na.       na.      na.      na.      na.
R: Tcl                    |   390334   224660   123795    63402       139    24137    44490    38410
R: Java                   |   463538   309542   161559    79059     19282    19779    72112    71591


x86_64-suse-linux-gnu     |     send     send     send     send    create   create     data     data
2024-09-11 21:54:55       |  NOTHING      END CALLBACK     WAIT    PARENT    CHILD      BUS      BFL
------------------------- | -------- -------- -------- -------- --------- -------- -------- --------

uds_spawn:
R: C                      |   530643   399273   228727    90660      3795    37906    89667    90450
R: C++                    |   522584   389941   218076    89381      2473    36351    88390    89105
R: Python                 |   494427   312230   137856    70085       101    22259    68911    66719
R: Tcl                    |   402035   234600   126871    63312       134    22768    41101    39604
R: Java                   |   475697   306405   161973    79067        68    18112    67439    66953
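
To put numbers on the update, the sketch below compares the Tcl pipe row from the earlier run (built with symbols) against the Tcl pipe row from this run, and then against the Python pipe row. The figures are copied from the posted tables; the helper percent_change and the variable names are assumptions made for this illustration.

```python
# "pipe" rows copied from the tables in this thread.
COLUMNS = ["NOTHING", "END", "CALLBACK", "WAIT", "PARENT", "CHILD", "BUS", "BFL"]
TCL_WITH_SYMBOLS = [306202, 144504, 78180, 49443, 81, 18337, 25233, 25242]  # 21:27:41 run
TCL_NO_SYMBOLS = [402439, 236730, 127712, 59048, 133, 24002, 44337, 43762]  # 21:54:55 run
PYTHON = [492501, 304463, 159169, 73570, 101, 22109, 68875, 66767]

def percent_change(new, old):
    """Relative change of new versus old, in percent."""
    return (new - old) / old * 100.0

# Gain from dropping the symbols build, per column.
gain = {c: percent_change(n, o)
        for c, n, o in zip(COLUMNS, TCL_NO_SYMBOLS, TCL_WITH_SYMBOLS)}

# Remaining difference of Tcl relative to Python (negative = Tcl is slower).
gap = {c: percent_change(t, p)
       for c, t, p in zip(COLUMNS, TCL_NO_SYMBOLS, PYTHON)}

for c in COLUMNS:
    print(f"{c:<9} gain {gain[c]:+7.1f}%   Tcl vs Python {gap[c]:+7.1f}%")
```

The gap differs per column, so a single figure such as ~25% is best read as an approximate average over the tests rather than a value that holds for each one.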