Saturday, December 25, 2010

String literal pool

How to create a String

There are two ways to create a String object in Java:

  • Using the new operator. For example,
    String str = new String("Hello");.
  • Using a string literal or constant expression). For example,
    String str="Hello"; (string literal) or
    String str="Hel" + "lo"; (string constant expression).

What is difference between these String's creations? In Java, the equals method can be considered to perform a deep comparison of the value of an object, whereas the == operator performs a shallow comparison. The equals method compares the content of two objects rather than two objects' references. The == operator with reference types (i.e., Objects) evaluates as true if the references are identical - point to the same object. With value types (i.e., primitives) it evaluates as true if the value is identical. The equals method is to return true if two objects have identical content - however, the equals method in the java.lang.Object class - the default equals method if a class does not override it - returns true only if both references point to the same object.

Let's use the following example to see what difference between these creations of string:

public class DemoStringCreation {

public static void main (String args[]) {
String str1 = "Hello";
String str2 = "Hello";
System.out.println("str1 and str2 are created by using string literal.");
System.out.println(" str1 == str2 is " + (str1 == str2));
System.out.println(" str1.equals(str2) is " + str1.equals(str2));


String str3 = new String("Hello");
String str4 = new String("Hello");
System.out.println("str3 and str4 are created by using new operator.");
System.out.println(" str3 == str4 is " + (str3 == str4));
System.out.println(" str3.equals(str4) is " + str3.equals(str4));

String str5 = "Hel"+ "lo";
String str6 = "He" + "llo";
System.out.println("str5 and str6 are created by using string
constant expression.");
System.out.println(" str5 == str6 is " + (str5 == str6));
System.out.println(" str5.equals(str6) is " + str5.equals(str6));

String s = "lo";
String str7 = "Hel"+ s;
String str8 = "He" + "llo";
System.out.println("str7 is computed at runtime.");
System.out.println("str8 is created by using string constant
expression.");
System.out.println(" str7 == str8 is " + (str7 == str8));
System.out.println(" str7.equals(str8) is " + str7.equals(str8));

}
}

The output result is:

str1 and str2 are created by using string literal.
str1 == str2 is true
str1.equals(str2) is true
str3 and str4 are created by using new operator.
str3 == str4 is false
str3.equals(str4) is true
str5 and str6 are created by using string constant expression.
str5 == str6 is true
str5.equals(str6) is true
str7 is computed at runtime.
str8 is created by using string constant expression.
str7 == str8 is false
str7.equals(str8) is true

The creation of two strings with the same sequence of letters without the use of the new keyword will create pointers to the same String in the Java String literal pool. The String literal pool is a way Java conserves resources.

 

String Literal Pool

String allocation, like all object allocation, proves costly in both time and memory. The JVM performs some trickery while instantiating string literals to increase performance and decrease memory overhead. To cut down the number of String objects created in the JVM, the String class keeps a pool of strings. Each time your code create a string literal, the JVM checks the string literal pool first. If the string already exists in the pool, a reference to the pooled instance returns. If the string does not exist in the pool, a new String object instantiates, then is placed in the pool. Java can make this optimization since strings are immutable and can be shared without fear of data corruption. For example

public class Program
{
public static void main(String[] args)
{
String str1 = "Hello";
String str2 = "Hello";
System.out.print(str1 == str2);
}
}

The result is

true

 

Unfortunately, when you use

String a=new String("Hello");

a String object is created out of the String literal pool, even if an equal string already exists in the pool. Considering all that, avoid new String unless you specifically know that you need it! For example

public class Program
{
public static void main(String[] args)
{
String str1 = "Hello";
String str2 = new String("Hello");
System.out.print(str1 == str2 + " ");
System.out.print(str1.equals(str2));
}
}

The result is

false true

 

A JVM has a string pool where it keeps at most one object of any String. String literals always refer to an object in the string pool. String objects created with the new operator do not refer to objects in the string pool but can be made to using String's intern() method. The java.lang.String.intern() returns an interned String, that is, one that has an entry in the global String pool. If the String is not already in the global String pool, then it will be added. For example

public class Program
{
public static void main(String[] args)
{
// Create three strings in three different ways.
String s1 = "Hello";
String s2 = new StringBuffer("He").append("llo").toString();
String s3 = s2.intern();

// Determine which strings are equivalent using the ==
// operator
System.out.println("s1 == s2? " + (s1 == s2));
System.out.println("s1 == s3? " + (s1 == s3));
}
}

The output is

s1 == s2? false
s1 == s3? true

There is a table always maintaining a single reference to each unique String object in the global string literal pool ever created by an instance of the runtime in order to optimize space. That means that they always have a reference to String objects in string literal pool, therefore, the string objects in the string literal pool not eligible for garbage collection.

 

String Literals in the Java Language Specification Third Edition

 

Each string literal is a reference to an instance of class String. String objects have a constant value. String literals-or, more generally, strings that are the values of constant expressions-are "interned" so as to share unique instances, using the method String.intern.

Thus, the test program consisting of the compilation unit:

package testPackage;
class Test {
public static void main(String[] args) {
String hello = "Hello", lo = "lo";
System.out.print((hello == "Hello") + " ");
System.out.print((Other.hello == hello) + " ");
System.out.print((other.Other.hello == hello) + " ");
System.out.print((hello == ("Hel"+"lo")) + " ");
System.out.print((hello == ("Hel"+lo)) + " ");
System.out.println(hello == ("Hel"+lo).intern());
}
}
class Other { static String hello = "Hello"; }

and the compilation unit:

package other;
public class Other { static String hello = "Hello"; }

produces the output:

true true true true false true

This example illustrates six points:

  • Literal strings within the same class in the same package represent references to the same String object.
  • Literal strings within different classes in the same package represent references to the same String object.
  • Literal strings within different classes in different packages likewise represent references to the same String object.
  • Strings computed by constant expressions are computed at compile time and then treated as if they were literals.
  • Strings computed by concatenation at run time are newly created and therefore distinct.

The result of explicitly interning a computed string is the same string as any pre-existing literal string with the same contents.


No comments:

Post a Comment