
25.2. Endianness and Number Width

Computers store integers and floating-point numbers in different orders (big-endian or little-endian) and different widths (32-bit and 64-bit being the most common today). Normally, you won't have to think about this. But if your program sends binary data across a network connection, or onto disk to be read by a different computer, you may need to take precautions.

Conflicting orders can make an utter mess out of numbers. If a little-endian host (such as an Intel CPU) stores 0x12345678 (305,419,896 in decimal), it writes the bytes 78 56 34 12, and a big-endian host (such as a Motorola CPU) reading those same bytes will see 0x78563412 (2,018,915,346 in decimal). To avoid this problem in network (socket) connections, use the pack and unpack formats n and N, which write unsigned 16-bit ("short") and 32-bit ("long") integers in big-endian order (also called "network" order) regardless of the platform.
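As a minimal sketch of the n and N formats, the following packs the value from the paragraph above in network order and then simulates the misreading. The V format (little-endian 32-bit) is used here only to stand in for a little-endian host, and the variable names and the port number 8080 are illustrative.

```perl
use strict;
use warnings;

my $value = 0x12345678;

# "N" packs an unsigned 32-bit integer in big-endian (network) order,
# so the resulting bytes are the same on every platform.
my $wire = pack("N", $value);
my $hex  = unpack("H*", $wire);

# Simulate the mismatch: write the bytes in little-endian order ("V"),
# then read them back as if they were big-endian ("N").
my $swapped = unpack("N", pack("V", $value));

# "n" does the same job as "N" for unsigned 16-bit integers.
my $port_bytes = pack("n", 8080);
my $port       = unpack("n", $port_bytes);

printf "wire bytes: %s\n", $hex;
printf "misread:    0x%08x (%d)\n", $swapped, $swapped;
printf "port:       %d\n", $port;
```

As long as both ends pack and unpack with the same n or N template, neither needs to know the other's native byte order.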

You can explore the endianness of your platform by unpacking a data structure packed in native format such as:

print unpack("h*", pack("s2", 1, 2)), "\n";
# '10002000' on e.g. Intel x86 or Alpha 21064 in little-endian mode
# '00100020' on e.g. Motorola 68040

To determine your endianness, you could use either of these statements:

$is_big_endian    = unpack("h*", pack("s", 1)) =~ /01/;
$is_little_endian = unpack("h*", pack("s", 1)) =~ /^1/;

Even if two systems have the same endianness, there can still be problems when transferring data between 32-bit and 64-bit platforms. There is no good solution other than to avoid transferring or storing raw binary numbers. Either transfer and store numbers as text instead of binary, or use modules like Data::Dumper or Storable to do this for you. In any event, you should prefer text-oriented protocols: they're more robust, more maintainable, and more extensible than binary protocols.
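Both suggestions can be sketched briefly. The comma-separated format and the sample numbers below are assumptions made for illustration. The Storable half uses nfreeze rather than freeze, on the understanding that the "n" variants write their data in network order.

```perl
use strict;
use warnings;
use Storable qw(nfreeze thaw);

my @numbers = (1, 305_419_896, -42);

# Option 1: plain text. Decimal strings carry no byte order or word size.
my $line     = join(",", @numbers);
my @from_txt = split /,/, $line;

# Option 2: let Storable serialize the structure. nfreeze is the
# network-order variant of freeze, intended for portable data.
my $frozen    = nfreeze(\@numbers);
my @from_stor = @{ thaw($frozen) };

print "text:     $line\n";
print "storable: @from_stor\n";
```

The text form has the added benefit that you can inspect it by eye or with ordinary tools when something goes wrong.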

Of course, with the advent of XML and Unicode, our definition of text is getting more flexible. For instance, between two systems running Perl 5.6.0 (or newer), you can transport a sequence of integers encoded as characters in utf8 (Perl's version of UTF-8). If both ends are running on an architecture with 64-bit integers, you can exchange 64-bit integers. Otherwise, you're limited to 32-bit integers. Use pack with a U* template to send, and unpack with a U* template to receive.
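Here is a minimal sketch of the U* round trip. The explicit utf8::encode and utf8::decode calls are an assumption about how the characters would be turned into bytes for transport on a more recent Perl, and are not something the text above prescribes. The sample values are kept within the range of valid Unicode code points.

```perl
use strict;
use warnings;

my @ints = (1, 255, 65_536, 1_000_000);

# Sender: each integer becomes one character with that code point.
my $chars = pack("U*", @ints);

# Turn the characters into UTF-8 bytes suitable for a socket or file.
my $bytes = $chars;
utf8::encode($bytes);

# Receiver: decode the bytes back into characters, then into integers.
my $received = $bytes;
utf8::decode($received);
my @decoded = unpack("U*", $received);

printf "%d characters, %d bytes on the wire\n",
    length($chars), length($bytes);
print "decoded: @decoded\n";
```

Because UTF-8 is a variable-length encoding, small integers take fewer bytes than large ones, and the byte sequence does not depend on either host's endianness.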




Copyright © 2001 O'Reilly & Associates. All rights reserved.