Discussion:
2GB limitation
alexandru
2024-07-22 06:17:09 UTC
Hi,

Will there be a fix for the 2GB size limit that a string representation
has in Tcl?
Maybe already fixed in Tcl 9.0?

Thanks
Alexandru
Andreas Leitgeb
2024-07-22 08:31:35 UTC
Post by alexandru
Will there be a fix for the 2GB size limit that a string representation
have in Tcl?
Maybe already fixed in Tcl 9.0?
Yes, that's one of the reasons for switching to tcl9 as soon as
possible.
alexandru
2024-07-22 17:06:07 UTC
Wow, that's unexpected and cool!
Thanks
Rich
2024-07-22 17:17:21 UTC
Post by Andreas Leitgeb
Post by alexandru
Will there be a fix for the 2GB size limit that a string representation
have in Tcl?
Maybe already fixed in Tcl 9.0?
Yes, that's one of the reasons for switching to tcl9 as soon as
possible.
What is the new larger "limit" in Tcl9?
Emiliano
2024-07-23 00:58:03 UTC
On Mon, 22 Jul 2024 17:17:21 -0000 (UTC)
Post by Rich
Post by Andreas Leitgeb
Post by alexandru
Will there be a fix for the 2GB size limit that a string representation
have in Tcl?
Maybe already fixed in Tcl 9.0?
Yes, that's one of the reasons for switching to tcl9 as soon as
possible.
What is the new larger "limit" in Tcl9?
In 9.0 the type of the 'length' member of the Tcl_Obj struct (the number of
bytes at the '*bytes' member, not including the terminating null) has changed
from int to ptrdiff_t, so the limit remains (1<<31)-1 => 2147483647 bytes on
32-bit platforms (unsurprisingly) and becomes (1<<63)-1 => 9223372036854775807
bytes (about 9.22 exabytes) on 64-bit platforms.
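
To make the arithmetic above concrete, here is a minimal C sketch (not taken from the Tcl sources) of how the limit follows from the width of a signed length type. The helper names max_signed_for_bits and ptrdiff_limit are hypothetical, and the sketch assumes ptrdiff_t has no padding bits.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: largest value of a signed two's-complement
 * integer that is `bits` wide, i.e. (1 << (bits - 1)) - 1. */
static int64_t max_signed_for_bits(unsigned bits)
{
    assert(bits >= 2 && bits <= 64);
    /* Shift in unsigned arithmetic so that bits == 64 does not overflow. */
    uint64_t top = (uint64_t)1 << (bits - 1);
    return (int64_t)(top - 1);
}

/* Limit implied by this platform's ptrdiff_t, the type the post above
 * names for the 'length' member in Tcl 9 (assumption: no padding bits). */
static int64_t ptrdiff_limit(void)
{
    return max_signed_for_bits((unsigned)(sizeof(ptrdiff_t) * CHAR_BIT));
}
```

Calling max_signed_for_bits(32) gives 2147483647 and max_signed_for_bits(64) gives 9223372036854775807, matching the two figures quoted above; ptrdiff_limit() returns whichever of these applies to the machine the code is compiled for.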

IIUC that's also the (new) maximum number of elements for a Tcl list. In
practice the number will be less, since the length of the string
representation of such a list will hit the '*bytes' max length first.
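
As a rough worked example of that last point, the following C sketch estimates how many elements fit before the string representation reaches the byte limit. It assumes the most compact case, where every element is a single byte and elements are separated by one space, so N elements need 2N-1 bytes. The helper name max_elements_with_string_rep is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: largest element count N whose shortest possible
 * string form (N one-byte elements joined by N-1 spaces, 2N-1 bytes)
 * still fits in `max_bytes`. Solving 2N-1 <= max_bytes gives
 * N <= (max_bytes + 1) / 2, written so that max_bytes + 1 cannot overflow. */
static int64_t max_elements_with_string_rep(int64_t max_bytes)
{
    assert(max_bytes >= 0);
    return max_bytes / 2 + max_bytes % 2;
}
```

Under that assumption, a limit of 2147483647 bytes allows at most 1073741824 elements, and a limit of 9223372036854775807 bytes allows at most 4611686018427387904. Longer elements lower the count further.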
--
Emiliano
Andreas Leitgeb
2024-07-24 16:22:53 UTC
Post by Emiliano
In 9.0 the type of the 'length' member of the Tcl_Obj struct (the number of
bytes at '*bytes' member, not including the terminating null) has changed
from int to ptrdiff_t, so it will remain (1<<31)-1 => 2147483647 bytes on
32 bit platforms (unsurprisingly) and (1<<63)-1 => 9223372036854775807
(9,22 exabyte) on 64 bit platforms.
My hearsay was "generally 64 bit (minus the sign-bit)".
Are you sure that length-type is *always* ptrdiff_t, and
that this may be 32bit?

The "64bit'ness" of a platform is also a bit more complicated...
There are platforms, where pointers are 64bit, but ints are
still 32 (despite machine words being all 64bit) - in those
cases, I'd expect ptrdiff_t to be 64 bit, but on a real old
32bit machine, I don't really know for sure...
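
The combinations described here are usually called data models (ILP32, LP64, LLP64). A minimal C sketch, with hypothetical names throughout, that classifies a platform by the sizes of int, long and pointers, and checks whether ptrdiff_t is as wide as a pointer:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical labels for the common C data models. */
enum data_model { MODEL_ILP32, MODEL_LP64, MODEL_LLP64, MODEL_OTHER };

/* Classify by the sizes (in bytes) of int, long and pointers. */
static enum data_model classify_data_model(size_t int_sz, size_t long_sz,
                                           size_t ptr_sz)
{
    assert(int_sz > 0 && long_sz > 0 && ptr_sz > 0);
    if (int_sz == 4 && long_sz == 4 && ptr_sz == 4) return MODEL_ILP32;
    if (int_sz == 4 && long_sz == 8 && ptr_sz == 8) return MODEL_LP64;
    if (int_sz == 4 && long_sz == 4 && ptr_sz == 8) return MODEL_LLP64;
    return MODEL_OTHER;
}

/* Data model of the platform this code is compiled for. */
static enum data_model this_platform_model(void)
{
    return classify_data_model(sizeof(int), sizeof(long), sizeof(void *));
}

/* Returns 1 when ptrdiff_t is as wide as a pointer. That is the usual
 * case on flat-memory platforms, but the C standard does not require it. */
static int ptrdiff_matches_pointer(void)
{
    return sizeof(ptrdiff_t) == sizeof(void *);
}
```

In both LP64 (typical 64-bit Unix) and LLP64 (64-bit Windows), int stays at 32 bits while pointers are 64 bits wide, which is the situation described above.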
Post by Emiliano
IIUC that's also the (new) number of elements for a Tcl list.
In practice the number will be less, since the length of the
string representation of such list will hit the '*bytes' max
length first.
Not all lists are ever converted to a string rep. While they are
semantically "just strings", well-written programs can avoid
ever generating the string rep, at least for those
really long lists that may be relevant here.
Emiliano
2024-07-24 20:05:19 UTC
On Wed, 24 Jul 2024 16:22:53 -0000 (UTC)
Post by Andreas Leitgeb
Post by Emiliano
In 9.0 the type of the 'length' member of the Tcl_Obj struct (the number of
bytes at '*bytes' member, not including the terminating null) has changed
from int to ptrdiff_t, so it will remain (1<<31)-1 => 2147483647 bytes on
32 bit platforms (unsurprisingly) and (1<<63)-1 => 9223372036854775807
(9,22 exabyte) on 64 bit platforms.
My hearsay was "generally 64 bit (minus the sign-bit)".
Are you sure that length-type is *always* ptrdiff_t, and
that this may be 32bit?
In 9.X, it is ptrdiff_t. In 8.Y it is still int.

See https://core.tcl-lang.org/tcl/file?ci=trunk&name=generic/tcl.h&ln=325-333
and
https://core.tcl-lang.org/tcl/file?ci=trunk&name=generic/tcl.h&ln=740-752

ptrdiff_t can still be a 32-bit-wide value. See below.
Post by Andreas Leitgeb
The "64bit'ness" of a platform is also a bit more complicated...
There are platforms, where pointers are 64bit, but ints are
still 32 (despite machine words being all 64bit) - in those
cases, I'd expect ptrdiff_t to be 64 bit, but on a real old
32bit machine, I don't really know for sure...
This is what I mean when I say "on 32-bit platforms it is still 2GB",
since the i386-i686 platform has a 32-bit ptrdiff_t.

On my ancient i686 machine:

$ uname -m
i686
$ tclsh9.0
% expr {(1 << (8 * $tcl_platform(pointerSize))-1) - 1}
2147483647
% package provide Tcl
9.0b3
% set tcl_platform(pointerSize)
4
Post by Andreas Leitgeb
Post by Emiliano
IIUC that's also the (new) number of elements for a Tcl list.
In practice the number will be less, since the length of the
string representation of such list will hit the '*bytes' max
length first.
Not all lists are ever turned to string-rep. While they are
semantically "just strings", well written programs can avoid
the actual obtainment of the string rep, at least for those
really long lists that may be relevant here.
Yes, but that's an optimization. Tcl semantics are still defined
in terms of string operations. I prefer not to depend on internals.
--
Emiliano
Harald Oehlmann
2024-07-24 07:07:15 UTC
Post by Rich
Post by Andreas Leitgeb
Post by alexandru
Will there be a fix for the 2GB size limit that a string representation
have in Tcl?
Maybe already fixed in Tcl 9.0?
Yes, that's one of the reasons for switching to tcl9 as soon as
possible.
What is the new larger "limit" in Tcl9?
expr {2**63}