Exploring Java

Previous Chapter 7
Basic Utility Classes
Next
 

7.7 Internationalization

In order to deliver on the promise "Write once, run anywhere," the engineers at Java designed the famous Java Virtual Machine. True, your program will run anywhere there is a JVM, but what about users in other countries? Will they have to know English to use your application? Java 1.1 answers that question with a resounding "no," backed up by various classes that are designed to make it easy for you to write a "global" application. In this section we'll talk about the concepts of internationalization and the classes that support them.

java.util.Locale

Internationalization programming revolves around the Locale class. The class itself is very simple; it encapsulates a country code, a language code, and a rarely used variant code. Commonly used languages and countries are defined as constants in the Locale class. (It's ironic that these names are all in English.) You can retrieve the codes or readable names, as follows:

Locale l = Locale.ITALIAN;
System.out.println(l.getCountry()); // IT
System.out.println(l.getDisplayCountry());  // Italy
System.out.println(l.getLanguage());    // it
System.out.println(l.getDisplayLanguage()); // Italian

The country codes comply with ISO 639. A complete list of country codes is at http://www.ics.uci.edu/pub/ietf/http/related/iso639.txt. The language codes comply with ISO 3166. A complete list of language codes is at http://www.chemie.fu-berlin.de/diverse/doc/ISO_3166.html. There is no official set of variant codes; they are designated as vendor-specific or platform-specific.

Various classes throughout the Java API use a Locale to decide how to represent themselves. We have already seen how the DateFormat class uses Locales to determine how to format and parse strings.

Resource Bundles

If you're writing an internationalized program, you want all the text that is displayed by your application to be in the correct language. Given what you have just learned about Locale, you could print out different messages by testing the Locale. This gets cumbersome quickly, however, because the messages for all Locales are embedded in your source code. ResourceBundle and its subclasses offer a cleaner, more flexible solution.

A ResourceBundle is a collection of objects that your application can access by name, much like a Hashtable with String keys. The same ResourceBundle may be defined for many different Locales. To get a particular ResourceBundle, call the factory method ResourceBundle.getBundle(), which accepts the name of a ResourceBundle and a Locale. The following example gets a ResourceBundle for two Locales, retrieves a string message from it, and prints the message. We'll define the ResourceBundles later to make this example work.

import java.util.*;
public class Hello {
  public static void main(String[] args) {
    ResourceBundle bun;
    bun = ResourceBundle.getBundle("Message", Locale.ITALY);
    System.out.println(bun.getString("HelloMessage"));
    bun = ResourceBundle.getBundle("Message", Locale.US);
    System.out.println(bun.getString("HelloMessage"));
  }
}

The getBundle() method throws the runtime exception MissingResourceException if an appropriate ResourceBundle cannot be located.

Locales are defined in three ways. They can be stand-alone classes, in which case they will either be subclasses of ListResourceBundle or direct subclasses of ResourceBundle. They can also be defined by a property file, in which case they will be represented at run-time by a PropertyResourceBundle object. When you call ResourceBundle.getBundle(), either a matching class is returned or an instance of PropertyResourceBundle corresponding to a matching property file. The algorithm used by getBundle() is based on appending the country and language codes of the requested Locale to the name of the resource. Specifically, it searches for resources using this order:

name_language_country_variant
name_language_country
name_language
name
name_default-language_default-country_default-variant
name_default-language_default-country
name_default-language

In the example above, when we try to get the ResourceBundle named Message, specific to Locale.ITALY, it searches for the following names (note that there are no variant codes in the Locales we are using):

Message_it_IT
Message_it
Message
Message_en_US
Message_en

Let's define the Message_it_IT ResourceBundle now, using a subclass of ListResourceBundle.

import java.util.*;
public class Message_it_IT extends ListResourceBundle {
  public Object[][] getContents() {
    return contents;
  }
  
  static final Object[][] contents = {
    {"HelloMessage", "Buon giorno, world!"},
    {"OtherMessage", "Ciao."},
  };
}

ListResourceBundle makes it easy to define a ResourceBundle class; all we have to do is override the getContents() method.

Now let's define a ResourceBundle for Locale.US. This time, we'll make a property file. Save the following data in a file called Message_en_US.properties:

HelloMessage=Hello, world!
OtherMessage=Bye.

So what happens if somebody runs your program in Locale.FRANCE, and there is no ResourceBundle defined for that Locale? To avoid a run-time MissingResourceException, it's a good idea to define a default ResourceBundle. So in our example, you could change the name of the property file to Message.properties. That way, if a language-specific or country-specific ResourceBundle cannot be found, your application can still run.

java.text

The java.text package includes, among other things, a set of classes designed for generating and parsing string representations of objects. We have already seen one of these classes, DateFormat. In this section we'll talk about the other format classes, NumberFormat, ChoiceFormat, and MessageFormat.

The NumberFormat class can be used to format and parse currency, percents, or plain old numbers. Like DateFormat, NumberFormat is an abstract class. However, it has several useful factory methods. For example, to generate currency strings, use getCurrencyInstance():

double salary = 1234.56;
String here = 
    NumberFormat.getCurrencyInstance().format(salary); 
    // $1,234.56
String italy = 
    NumberFormat.getCurrencyInstance(Locale.ITALY).format(salary);
    // L 1.234,56 

The first statement generates an American salary, with a dollar sign, a comma to separate thousands, and a period as a decimal point. The second statement presents the same string in Italian, with a lire sign, a period to separate thousands, and a comma as a decimal point. Remember that the NumberFormat worries about format only; it doesn't attempt to do currency conversion. (Among other things, that would require access to a dynamically updated table and exchange rates: a good opportunity for a Java Bean, but too much to ask of a simple formatter.)

Likewise, getPercentInstance() returns a formatter you can use for generating and parsing percents. If you do not specify a Locale when calling a getInstance() method, the default Locale is used.

NumberFormat pf = NumberFormat.getPercentInstance();
System.out.println(pf.format(progress)); // 44%
try {
  System.out.println(pf.parse("77.2%")); // 0.772
}
catch (ParseException e) {}

And if you just want to generate and parse plain old numbers, use a NumberFormat returned by getInstance() or its equivalent, getNumberInstance().

NumberFormat guiseppe = NumberFormat.getInstance(Locale.ITALY);
NumberFormat joe = NumberFormat.getInstance();  // defaults to Locale.US
try {
  double theValue = guiseppe.parse("34.663,252").doubleValue();
  System.out.println(joe.format(theValue)); // 34,663.252
}
catch (ParseException e) {}

We use guiseppe to parse a number in Italian format (periods separate thousands, comma as the decimal point). The return type of parse() is Number, so we use the doubleValue() method to retrieve the value of the Number as a double. Then we use joe to format the number correctly for the default (US) locale.

Table 7.8 summarizes the factory methods for text formatters in the java.text package.

Table 7.9: Format Factory Methods
Factory Method
DateFormat.getDateInstance()
DateFormat.getDateInstance(int style)
DateFormat.getDateInstance(int style, Locale aLocale)
DateFormat.getDateTimeInstance()
DateFormat.getDateTimeInstance(int dateStyle, int timeStyle)
DateFormat.getDateTimeInstance(int dateStyle, int timeStyle, Locale aLocale)
DateFormat.getInstance()
DateFormat.getTimeInstance()
DateFormat.getTimeInstance(int style)
DateFormat.getTimeInstance(int style, Locale aLocale)
NumberFormat.getCurrencyInstance()
NumberFormat.getCurrencyInstance(Locale inLocale)
NumberFormat.getInstance()
NumberFormat.getInstance(Locale inLocale)
NumberFormat.getNumberInstance()
NumberFormat.getNumberInstance(Locale inLocale)
NumberFormat.getPercentInstance()
NumberFormat.getPercentInstance(Locale inLocale)

Thus far we've seen how to format dates and numbers as text. Now we'll take a look at a class, ChoiceFormat, that maps numerical ranges to text. ChoiceFormat is constructed by specifying the numerical ranges and the strings that correspond to them. One constructor accepts an array of doubles and an array of Strings, where each string corresponds to the range running from the matching number up through the next number:

double[] limits = {0, 20, 40};
String[] labels = {"young", "less young", "old"};
ChoiceFormat cf = new ChoiceFormat(limits, labels);
System.out.println(cf.format(12)); // young
System.out.println(cf.format(26)); // less young

You can specify both the limits and the labels using a special string in another ChoiceFormat constructor:

ChoiceFormat cf = new ChoiceFormat("0#young|20#less young|40#old");
System.out.println(cf.format(40)); // old
System.out.println(cf.format(50)); // old

The limit and value pairs are separated by pipe characters (|; also known as "vertical bar"), while the number sign serves to separate each limit from its corresponding value.

To complete our discussion of the formatting classes, we'll take a look at another class, MessageFormat, that helps you construct human-readable messages. To construct a MessageFormat, pass it a pattern string. A pattern string is a lot like the string you feed to printf() in C, although the syntax is different. Arguments are delineated by curly brackets, and may include information about how they should be formatted. Each argument consists of a number, an optional type, and an optional style. These are summarized in table Table 7.9.

Table 7.10: MessageFormat arguments
Type Styles
choice pattern
date short, medium, long, full, pattern
number integer, percent, currency, pattern
time short, medium, long, full, pattern

Let's use an example to clarify all of this.

MessageFormat mf = new MessageFormat("You have {0} messages.");
Object[] arguments = {"no"};
System.out.println(mf.format(arguments)); // You have no messages.

We start by constructing a MessageFormat object; the argument to the constructor is the pattern on which messages will be based. The special incantation {0} means "in this position, substitute element 0 from the array passed as an argument to the format() method." Thus, we construct a MessageFormat object with a pattern, which is a template on which messages are based. When we generate a message, by calling format(), we pass in values to fill the blank spaces in the template. In this case, we pass the array arguments[] to mf.format; this substitutes arguments[0], yielding the result "You have no messages."

Let's try this example again, except we'll show how to format a number and a date instead of a string argument.

MessageFormat mf = new MessageFormat(
    "You have {0, number, integer} messages on {1, date, long}.");
Object[] arguments = {new Integer(93), new Date()};
System.out.println(mf.format(arguments));
    // You have 93 messages on April 10, 1997.

In this example, we need to fill in two spaces in the template, and therefore need two elements in the arguments[] array. Element 0 must be a number, and is formatted as an integer. Element 1 must be a Date, and will be printed in the long format. When we call format(), the arguments[] array supplies these two values.

This is still sloppy. What if there is only one message? To make this grammatically correct, we can embed a ChoiceFormat-style pattern string in our MessageFormat pattern string.

MessageFormat mf = new MessageFormat(
    "You have {0, number, integer} message{0, choice, 0#s|1#|2#s}.");
Object[] arguments = {new Integer(1)};
System.out.println(mf.format(arguments)); // You have 1 message.

In this case, we use element 0 of arguments[] twice: once to supply the number of messages, and once to provide input to the ChoiceFormat pattern. The pattern says to add an "s" if argument 0 has the value zero, or is two or more.

Finally, a few words on how to be clever. If you want to write international programs, you can use resource bundles to supply the strings for your MessageFormat objects. This way, you can automatically format messages that are in the appropriate language with dates and other language-dependent fields handled appropriately.

In this context, it's helpful to realize that messages don't need to read elements from the array in order. In English, you would say "Disk C has 123 files"; in some other language, you might say "123 files are on Disk C." You could implement both messages with the same set of arguments:

MessageFormat m1 = new MessageFormat(
    "Disk {0} has {1, number, integer} files.");
MessageFormat m2 = new MessageFormat(
    "{1, number, integer} files are on disk {0}.");
Object[] arguments = {"C", new Integer(123)};

In real life, you'd only use a single MessageFormat object, initialized with a string taken from a resource bundle.


Previous Home Next
The Security Manager Book Index Input/Output Facilities

Java in a Nutshell Java Language Reference Java AWT Java Fundamental Classes Exploring Java