Case sensitivity of Java class names

Solution 1

  • Are there any guarantees about which classes are loadable by the bootstrap class loader in every JVM?

The core bits and pieces of the language, plus supporting implementation classes. Not guaranteed to include any class that you write. (A typical JVM loads your classes through a class loader separate from the bootstrap one, and the bootstrap loader itself normally loads its classes out of a JAR, as this makes for more efficient deployment than a big old directory structure full of classes.)

  • If there are any guarantees, does the behavior in the example above violate the guarantee (i.e. is the behavior a bug)?
  • Is there any way to make "standard" JVMs load a and A simultaneously? Would writing a custom class loader work?

Java loads classes by mapping the full name of the class into a filename that is then searched for on the classpath. Thus testcase.a goes to testcase/a.class and testcase.A goes to testcase/A.class. Some filesystems mix these things up, and may serve the other up when one is asked for. Others get it right (in particular, the variant of the ZIP format used in JAR files is fully case-sensitive and portable). There is nothing that Java can do about this. (An IDE could handle it for you by keeping the .class files away from the native FS, but I don't know if any actually do, and the JDK's javac most certainly isn't that smart.)
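As a rough illustration of that mapping, here is a minimal sketch. The class and method names (ClassFileNames, toResourcePath, collideIgnoringCase) are made up for this example, and equalsIgnoreCase is only an approximation of how a real case-insensitive filesystem folds names.

```java
final class ClassFileNames {
    // Maps a binary class name to the relative path that a
    // directory-based class loader would look for on the classpath.
    static String toResourcePath(String binaryName) {
        return binaryName.replace('.', '/') + ".class";
    }

    // True when two distinct class names map to paths that differ only
    // by case, i.e. they would clash on a case-insensitive filesystem.
    // Assumption: equalsIgnoreCase approximates the filesystem's folding.
    static boolean collideIgnoringCase(String first, String second) {
        String p1 = toResourcePath(first);
        String p2 = toResourcePath(second);
        return !p1.equals(p2) && p1.equalsIgnoreCase(p2);
    }

    public static void main(String[] args) {
        System.out.println(toResourcePath("testcase.a"));
        System.out.println(toResourcePath("testcase.A"));
        System.out.println(collideIgnoringCase("testcase.a", "testcase.A"));
    }
}
```

A compiler or build step could run a check like collideIgnoringCase over its generated class names to warn about clashes before anything is written to disk.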

However, that's not the only point to note here: class files know internally what class they are talking about. The absence of the expected class from the file just means that the load fails, leading to the NoClassDefFoundError you received. What you got was a problem (a mis-deployment in at least some sense) that was detected and dealt with robustly. Theoretically, you could build a classloader that handles such things by continuing to search, but why bother? Putting the class files inside a JAR will fix things far more robustly; those are handled correctly.
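To see why a JAR sidesteps the problem, the following sketch writes two entries whose names differ only by case into an in-memory ZIP archive and reads them back. It uses only java.util.zip; the class name CaseSensitiveArchive is invented for the example, and the entry bodies are placeholder bytes, not real class files.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class CaseSensitiveArchive {
    // Builds an in-memory ZIP with two entries that differ only by case.
    // The contents are placeholders standing in for compiled classes.
    static byte[] buildArchive() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            for (String name : new String[] {"testcase/a.class", "testcase/A.class"}) {
                zip.putNextEntry(new ZipEntry(name));
                zip.write(name.getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Lists the entry names stored in the archive, in archive order.
    static List<String> entryNames(byte[] archive) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(archive))) {
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                names.add(e.getName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(entryNames(buildArchive()));
    }
}
```

Both entries survive as separate items because the archive compares entry names exactly, so the host filesystem's case handling never comes into play.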

More generally, if you're running into this problem for real a lot, do your production builds on a Unix with a case-sensitive filesystem (a CI system like Jenkins is recommended). Then find out which developers are naming classes with just case differences and make them stop, as it is very confusing!

Solution 2

Donal's fine explanation leaves little to add, but let me briefly muse on this phrase:

... Java classes with the same case-insensitive name ...

Names and strings in general are never case-insensitive in themselves; it's only their interpretation that can be. And secondly, Java doesn't do such an interpretation.

So, a correct phrasing of what you had in mind would be:

... Java classes whose file representations in a case-insensitive file-system have identical names ...

Author: Josh Sunshine

Systems Scientist at Carnegie Mellon University. I study programming languages and software engineering.

Updated on June 05, 2022

Comments

  • Josh Sunshine
    Josh Sunshine almost 2 years

    If one writes two public Java classes with the same case-insensitive name in different directories, then the two classes cannot both be used at runtime. (I tested this on Windows, Mac and Linux with several versions of the HotSpot JVM. I would not be surprised if there are other JVMs where they are usable simultaneously.) For example, if I create a class named a and one named A like so:

    // lowercase/src/testcase/a.java
    package testcase;
    public class a {
        public static String myCase() {
            return "lower";
        }
    }
    
    // uppercase/src/testcase/A.java 
    package testcase;
    public class A {
        public static String myCase() {
            return "upper";
        }
    }
    

    Three Eclipse projects containing the code above are available from my website.

    If I try calling myCase on both classes like so:

    System.out.println(A.myCase());
    System.out.println(a.myCase());
    

    The typechecker succeeds, but when I run the class file generated by the code directly above I get:

    Exception in thread "main" java.lang.NoClassDefFoundError: testcase/A (wrong name: testcase/a)

    In Java, names are in general case-sensitive. Some file systems (e.g. the defaults on Windows) are case-insensitive, so I'm not surprised the above behavior happens, but it seems wrong. Unfortunately the Java specifications are oddly noncommittal about which classes are visible. The Java Language Specification (JLS), Java SE 7 Edition (Section 6.6.1, page 166) says:

    If a class or interface type is declared public, then it may be accessed by any code, provided that the compilation unit (§7.3) in which it is declared is observable.

    In Section 7.3, the JLS defines observability of a compilation unit in extremely vague terms:

    All the compilation units of the predefined package java and its subpackages lang and io are always observable. For all other packages, the host system determines which compilation units are observable.

    The Java Virtual Machine Specification is similarly vague (Section 5.3.1):

    The following steps are used to load and thereby create the nonarray class or interface C denoted by [binary name] N using the bootstrap class loader [...] Otherwise, the Java virtual machine passes the argument N to an invocation of a method on the bootstrap class loader to search for a purported representation of C in a platform-dependent manner.

    All of this leads to three questions in descending order of importance:

    1. Are there any guarantees about which classes are loadable by the default class loader(s) in every JVM? In other words, can I implement a valid, but degenerate JVM, that won't load any classes except those in java.lang and java.io?
    2. If there are any guarantees, does the behavior in the example above violate the guarantee (i.e. is the behavior a bug)?
    3. Is there any way to make HotSpot load a and A simultaneously? Would writing a custom class loader work?
    • Greg Hewgill
      Greg Hewgill almost 12 years
      So let me get this straight.. you've got two similarly named classes testcase.a and testcase.A, in two different directories on your classpath (because you can't have them in the same directory on a case insensitive filesystem) - and you're wondering why the JVM can't find the correct class file to load?
    • Josh Sunshine
      Josh Sunshine almost 12 years
      @GregHewgill You've described the scenario at the beginning of my question correctly, but my questions are more broad.
    • Greg Hewgill
      Greg Hewgill almost 12 years
      Well, the whole idea of a "classpath" is platform-dependent from the point of view of the JLS and JVMS. Neither of those documents will help you resolve this question.
    • Josh Sunshine
      Josh Sunshine almost 12 years
      Are there other documents that will? Can I implement a valid, but degenerate JVM, that won't load any classes except those in java.lang and java.io?
    • Hot Licks
      Hot Licks almost 12 years
      Java itself is fully case-sensitive. It cannot control whether the supporting file system is case-sensitive or not. "the host system determines which compilation units are observable" is simply saying that the JVM itself does not control the classpath -- has nothing to do with case sensitivity.
    • Hot Licks
      Hot Licks almost 12 years
      You can load both a and A simultaneously. Just place them in a JAR file and load from there.
    • Josh Sunshine
      Josh Sunshine almost 12 years
      @HotLicks You're right, putting the classes in a JAR file works. That seems inconsistent. Any idea why my JVM handles JAR files differently from regular directories?
    • Greg Hewgill
      Greg Hewgill almost 12 years
      @JoshSunshine: The ZIP file format (JAR files are really ZIP files) can be considered to be a case-sensitive filesystem.
    • Hot Licks
      Hot Licks almost 12 years
      @JoshSunshine -- The host file system doesn't get involved in accessing JAR files, other than to serve up the JAR itself. The "inconsistency" is in the OS, not the JVM.
    • Matthew Read
      Matthew Read almost 12 years
      @HotLicks Ignoring Mac and Windows, you think Linux incorrectly handles case sensitivity? Wouldn't that be an issue elsewhere and not just here?
    • Hot Licks
      Hot Licks almost 12 years
      @MatthewRead -- I'm not a Linux guru, but it was my impression that you can select case-sensitive or not when you configure a file system on Linux. (At least I know this is true of some Posix-compliant non-Linux systems.) Used to be the default was case-sensitive, but now many are not sensitive, to be "compatible" with Windoze.
    • bestsss
      bestsss almost 12 years
      @MatthewRead, Linux has no file system of its own; however, ext2..4 are most definitely case-sensitive.
  • Josh Sunshine
    Josh Sunshine almost 12 years
    Re: "make them stop." I can't! I'm writing a compiler for another language that compiles to Java bytecode. Identifiers in the source need to match identifiers in the target (bytecode), and several important examples in the other language have the same case insensitive name.
  • Donal Fellows
    Donal Fellows almost 12 years
    Well, the normal way of dealing with such things is to normalize the case in the translation.
  • Josh Sunshine
    Josh Sunshine almost 12 years
    That's what I've done, but it makes interoperability between Java and the other language much more awkward. Java code that refers to classes created by compiling code written in the other language has to use normalized names instead of the source names. That said, the details of my project are a little beside the point; I'm more interested in understanding the JVM spec's description of class loading and the particular behaviour of the HotSpot JVM.
  • Donal Fellows
    Donal Fellows almost 12 years
    @Josh You could try storing the generated class definitions directly in a JAR rather than in .class files on the native filesystem. JARs are case-sensitive internally, so it would avoid all these weird problems, but would require you to adjust the compiler so as to control how it writes the files.
  • Donal Fellows
    Donal Fellows almost 12 years
    A name can be case-insensitive (and I have encountered case insensitive strings, long long ago on a weird mainframe; brrr!) but Java's names are always case-sensitive, and everyone's strings have been case-sensitive for… well, certainly the whole of my career at least.
  • bestsss
    bestsss almost 12 years
    All java.* classes are loaded by the bootstrap one; the rest of the classloaders are strictly prohibited from doing so. It's part of the security model. Anything else that doesn't start w/ java. is fair game.
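The restriction on java.* described in the last comment can be observed from user code. The sketch below assumes the documented behavior of ClassLoader.defineClass, which throws SecurityException when the requested name begins with "java."; the names ProhibitedPackageDemo, OpenLoader, tryDefine and java.fake.Placeholder are invented for the example, and the empty byte array is deliberately not a valid class file.

```java
final class ProhibitedPackageDemo {
    // A user-defined loader that exposes the protected defineClass method.
    static final class OpenLoader extends ClassLoader {
        Class<?> define(String binaryName, byte[] bytes) {
            return defineClass(binaryName, bytes, 0, bytes.length);
        }
    }

    // Attempts to define a class under the given name and reports the
    // simple name of whatever the loader throws. The bytes are empty on
    // purpose: the point is which check rejects the request first.
    static String tryDefine(String binaryName) {
        try {
            new OpenLoader().define(binaryName, new byte[0]);
            return "defined";
        } catch (SecurityException | LinkageError e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // Rejected because of the package name alone.
        System.out.println(tryDefine("java.fake.Placeholder"));
        // Not rejected for its name; it fails later because the bytes are invalid.
        System.out.println(tryDefine("testcase.A"));
    }
}
```

The first call is refused on the package name before the bytes are even examined, while the second gets past the name check and fails only because the placeholder bytes are not a class file.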