Monday 11 May 2015

javac : The Java Compiler

Overview

In Java, source code files have the extension .java. Java source code files are standard ASCII text files, much like the source code files for other popular programming languages such as C++. It is the job of the Java compiler to process Java source code files and create executable Java bytecode classes from them. Executable bytecode class files have the extension .class, and they represent a Java class in its usable form.
Java class files are generated on a one-to-one basis with the classes defined in the source code. In other words, the Java compiler generates exactly one .class file for each class you create. Because a single source file can define more than one class, the compiler can generate multiple class files from a single source file.
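As a small illustration of the one-class-file-per-class rule, the following sketch puts two classes in one source file. The class names Shapes and Circle and the file name Shapes.java are made up for this example. Compiling the file should leave both Shapes.class and Circle.class in the output directory.

```java
// File Shapes.java (hypothetical name): two class definitions in one source file.
// Running "javac Shapes.java" is expected to produce Shapes.class and Circle.class.
class Circle {
  private final double radius;

  Circle(double radius) {
    this.radius = radius;
  }

  double area() {
    return Math.PI * radius * radius;
  }
}

class Shapes {
  // Helper used by main; returns the area of a circle with the given radius.
  static double circleArea(double radius) {
    return new Circle(radius).area();
  }

  public static void main(String[] args) {
    System.out.println("Area: " + circleArea(2.0));
  }
}
```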
You may have heard something about just-in-time compilers in reference to Java. It's important not to confuse these compilers with the Java compiler and the role it plays. A just-in-time compiler works inside the runtime system and translates bytecodes into native code while a program runs; it does not replace javac. The Java compiler is responsible for turning Java source code into Java bytecodes that can be executed within the Java runtime system. The Java Virtual Machine, which is a component of the runtime system, is responsible for interpreting the bytecodes and making the appropriate system-level calls to the native platform. It is at this point that Java achieves platform independence; the bytecodes are in a generic form that is converted to a native form only when processed by the Virtual Machine.

Usage

The Java compiler is a command-line tool, meaning that it is invoked from a command prompt, such as the MS-DOS shell in Windows 95. The syntax for the Java compiler follows:
javac Options Filename
The Filename argument specifies the name of the source code file you want to compile. The compiler will generate bytecode classes for all classes defined in this file. The compiler will also generate bytecode classes for any dependent classes that haven't been compiled yet. In other words, if you are compiling class A, which is derived from class B, and class B has not yet been compiled, the compiler will go ahead and compile both classes.
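The class A and class B scenario described above can be sketched as follows. The method name() is invented for the example, and the two classes are shown together here although the text assumes each lives in its own source file.

```java
// File B.java (hypothetical): the base class.
class B {
  String name() {
    return "B";
  }
}

// File A.java (hypothetical): A is derived from B, so compiling A.java
// needs B; if B has not been compiled yet, the compiler compiles it as well.
class A extends B {
  @Override
  String name() {
    return "A derived from " + super.name();
  }
}
```

With this layout, running javac A.java alone should be enough to end up with both A.class and B.class.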

Options

The Options compiler argument specifies options related to how the compiler creates the executable Java classes. Following is a list of the compiler options:
-classpath Path
-d Dir
-g
-nowarn
-O
-verbose
The -classpath option tells the compiler to override the CLASSPATH environment variable with the path specified by Path. This causes the compiler to look for user-defined classes in the path specified by Path. On Windows, Path is a semicolon-delimited list of directory paths (UNIX systems use colons instead) taking the following form:
.;<your_path>
An example of a specific usage of -classpath follows:
javac -classpath .;\dev\animate\classes;\dev\render\classes A.java
In this case, the compiler is using a user-defined class path to access any classes it needs while compiling the source code file A.java. The -classpath option is sometimes useful when you want to try compiling something without taking the trouble to modify the CLASSPATH environment variable.
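To make the structure of a class path concrete, here is a minimal sketch that splits a class path string into its entries. The class ClassPathEntries and its split method are hypothetical helpers written for this example. The main method reads the java.class.path system property, which holds the class path of the running Java program rather than the one passed to the compiler, so treat it only as a convenient sample input.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class ClassPathEntries {
  // Splits a class path string into its entries, skipping empty ones.
  static List<String> split(String classPath, String separator) {
    List<String> entries = new ArrayList<>();
    for (String entry : classPath.split(Pattern.quote(separator))) {
      if (!entry.isEmpty()) {
        entries.add(entry);
      }
    }
    return entries;
  }

  public static void main(String[] args) {
    // File.pathSeparator is ";" on Windows and ":" on UNIX systems.
    String classPath = System.getProperty("java.class.path");
    for (String entry : split(classPath, File.pathSeparator)) {
      System.out.println(entry);
    }
  }
}
```

Applied to the class path from the example command with ";" as the separator, split returns three entries: the current directory and the two classes directories.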
The -d option determines the root directory where compiled classes are stored. This is important because many times classes are organized in a hierarchical directory structure. With the -d option, the directory structure will be created beneath the directory specified by Dir.
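The directory structure created beneath Dir follows the package name of each class. The sketch below computes where a class file would be expected to land; OutputLocation and classFilePath are hypothetical names, and the code only builds a string, it does not run the compiler.

```java
class OutputLocation {
  // Returns the expected class file location for a given -d directory.
  // Uses '/' as the separator for readability; Windows uses '\'.
  static String classFilePath(String dir, String qualifiedName) {
    return dir + "/" + qualifiedName.replace('.', '/') + ".class";
  }

  public static void main(String[] args) {
    // prints classes/stuff/Hashtable.class
    System.out.println(classFilePath("classes", "stuff.Hashtable"));
  }
}
```

So a class Hashtable in package stuff, compiled with -d classes, would be expected in the stuff subdirectory of classes.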
The -g compiler option causes the compiler to generate debugging tables for the Java classes. Debugging tables are used by the Java debugger, and they contain information such as local variables and line numbers. By default, the compiler generates only line numbers. If you are going to be using the Java debugger, you must use the -g option. Additionally, for debugging, make sure you don't use the -O option, which optimizes the code.
The -nowarn option turns off compiler warnings. Warnings are printed to standard output during compilation to inform you of potential problems with the source code. It is generally a good idea to keep warnings enabled, because they often signal problem areas in your code. However, you may run into a situation where warnings are getting in the way, in which case the -nowarn option might be useful.
The -O option causes the compiler to optimize the compiled code. In this case, optimization simply means that static, final, and private methods are compiled inline. When a method is compiled inline, the entire body of the method is included in place of each call to the method. This speeds up execution because it eliminates the method call overhead. Optimized classes are usually larger, to accommodate the duplicate code. The -O option also suppresses the default creation of line numbers by the compiler. Keep in mind that the -O option should not be used when you plan on debugging the compiled code using the Java debugger.
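The effect of inlining can be illustrated by hand. In the sketch below, sumOfSquares calls a small private static method, and sumOfSquaresInlined shows what the same code looks like once each call is replaced by the method body. Both methods are written manually for illustration (InlineDemo and its method names are invented); this is not actual compiler output.

```java
class InlineDemo {
  // A small private static method: the kind of method the text says -O inlines.
  private static int square(int x) {
    return x * x;
  }

  // As written in the source: two calls to square().
  static int sumOfSquares(int a, int b) {
    return square(a) + square(b);
  }

  // Hand-written equivalent after inlining: each call is replaced by the method body.
  static int sumOfSquaresInlined(int a, int b) {
    return (a * a) + (b * b);
  }

  public static void main(String[] args) {
    System.out.println(sumOfSquares(3, 4));        // prints 25
    System.out.println(sumOfSquaresInlined(3, 4)); // prints 25
  }
}
```

The inlined version avoids two method calls but repeats the multiplication code at each call site, which is why optimized classes tend to be larger.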
The -verbose option has roughly the opposite effect of the -nowarn option: it prints out extra information about the compilation process. You can use -verbose to see exactly what source files are being compiled and what class files are being loaded.

The Non-Optimizing Compiler

Some distributions of the Java Developer's Kit include an alternate Java compiler called javac_g. This version of the Java compiler generates code without some of the internal optimizations performed by the standard javac compiler. If this compiler is in your JDK distribution, be sure to use it when you are compiling code for debugging. Otherwise, stick with the javac compiler for all release code.

Bugs

As of this writing, the latest release of the Java Developer's Kit is 1.02, which contains some known bugs. More specifically, the following Java compiler bugs have been documented and acknowledged by the JavaSoft development team:
  • The compiler doesn't distinguish between the same class names in different packages.
  • The compiler doesn't distinguish between class names that differ only by case (Windows 95/NT version only).
  • The compiler will not compile a method with more than 63 local variables.
The first bug is only a problem if you are using different packages containing classes with the same name. Generally speaking, most programmers probably won't develop two packages with same-named classes in each. However, the problem can easily arise without your even realizing it; suppose you are using someone else's package that has a bunch of classes already defined, and a class name conflicts with one of your own. Or, for example, suppose you had your own package including the following source code:
package stuff;
import java.util.*;

public class Hashtable
{
  public Hashtable() {
    // initialize the hashtable
  }
}
A class called Hashtable already exists in the java.util package, so your Hashtable class would conflict with it upon compilation thanks to the compiler bug.
The second compiler bug is also related to class names, and it rears its head whenever you have two classes with names that differ only by case, as shown in the following code:
// File EncryptIt.java
class EncryptIt
{
  // encrypt something
}

// File encryptit.java
class encryptit
{
  // encrypt something else
}
Notice that the second class, which is defined in a different source code file, has the same name as the first class, with the exception of the case on two of the characters. The Java compiler will give an error while attempting to compile this code, although technically the class naming is legal. Keep in mind that this bug exists only on the Windows 95/NT platform.
Finally, the last bug deals with the number of local variables defined in a method. If a method defines more than 63 local variables, the Java compiler will not be able to compile the method. The Java language specification has yet to set a specific upper limit on the number of local variables allowed, so you can think of the number 63 as the working limit until a formal decision has been made.
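If a method does approach the limit, one possible way to reduce the count is to group related values into an array, which occupies a single local variable no matter how many elements it holds. The following is only a sketch of that idea; ManyValues and sumOfFirst are hypothetical names, and whether the restructuring suits a given method is a judgment call.

```java
class ManyValues {
  // Instead of declaring value1 ... value100 as 100 separate locals,
  // keep them in one array, which counts as a single local variable.
  static int sumOfFirst(int count) {
    int[] values = new int[count];
    for (int i = 0; i < count; i++) {
      values[i] = i + 1;
    }
    int total = 0;
    for (int value : values) {
      total += value;
    }
    return total;
  }

  public static void main(String[] args) {
    System.out.println(sumOfFirst(100)); // prints 5050
  }
}
```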

Admittedly, none of these bugs are all that likely to occur, simply because most programmers give their classes unique names and typically use fewer than 63 local variables in each method! However, just in case you ever find yourself pulling your hair out over a strange compiler problem, these bugs might be good to keep in mind.
