Java Beyond Classes

The advocates of Java often stress that everything in Java is a class, and that such a uniformity of approach is a great advantage of Java over other programming languages. Like all advertising, this slogan lies at least thrice: first, there are other languages proclaiming a uniformity of treatment (for instance, Ruby is said to treat everything as an object); second, not all of Java is classes; and third, reducing everything to classes (or other preferred entities) is no advantage at all.

Any programming language is designed to provide access to a range of typical computer operations. It is a superstructure over the physical interactions of electric circuits inside the computer (or a number of computers in a distributed system), which result in a change of the logical state of the computer (or the network), that is, of the way we see the computer system. It is these logical states that interest the user, and not the way they are physically implemented or designated in a programming language.

A Java class is one of the possible models for a typical logical state, representing a standard way of operation. A class is of no use unless we intend to create the instances that turn its formal properties and members into a real tool of operation control. In other words, all classes are abstract: they are a mere possibility of definition, since a complete specification requires an object, a logical state of the computer system. This additional specification is delayed until an instance of the class is created at runtime, which triggers various initialization processes. Some object-oriented languages (like Python and Ruby) go further and allow adding properties and methods to an individual object rather than to a class (thus producing the so-called singletons).

On the other hand, static properties and methods spoil the purity of the class ideology, reducing a class to a mere object, the only instance of a class, which is equivalent to a singleton. Obviously, the static component of any class can always be made a separate class; there is no need to keep static definitions inside the class, except, possibly, for the minor convenience of implicit scope specification. Some classes in the java.lang package (Math and System, for example) are, in fact, fake classes: they are essentially static, and all they do is define a kind of namespace for names that are functionally equivalent to reserved words like if, for, or switch. Using reserved words, primitive types and "namespaced" static names together is an eclectic mix beyond classes.
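The equivalence between a purely static class and a single object can be sketched in Java as follows. The names Geometry and GeometryObject are hypothetical and serve only as an illustration; the sketch assumes nothing beyond the standard library.

```java
// A purely static class: it is never instantiated and acts as a namespace.
final class Geometry {
    private Geometry() {}                        // forbids instantiation

    static final double TAU = 2 * Math.PI;

    static double circumference(double radius) {
        return TAU * radius;
    }
}

// The same functionality as the only instance of an ordinary type.
enum GeometryObject {
    INSTANCE;

    final double tau = 2 * Math.PI;

    double circumference(double radius) {
        return tau * radius;
    }
}

class StaticVersusSingleton {
    public static void main(String[] args) {
        double viaStatic = Geometry.circumference(1.0);
        double viaObject = GeometryObject.INSTANCE.circumference(1.0);
        System.out.println(viaStatic == viaObject);   // prints true
    }
}
```

The only visible difference at the call site is the prefix, Geometry. versus GeometryObject.INSTANCE., which is the "implicit scope specification" mentioned above.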

In strongly and statically typed languages like Java, the ideological purity is violated anyway, since instances of a class are not necessarily classes themselves. This covers the overwhelming majority of practical cases: only instances of java.lang.Class (a final class, so it has no descendants) are treated as classes. But this branch of the java.lang.Object hierarchy is of little use, since the language itself offers almost no way to construct or modify classes dynamically, apart from narrow facilities such as dynamic proxies for interfaces. Of course, one can generate or modify the byte code directly, but that means working beneath the language rather than in it. That is, the "formal" types like java.lang.Class, java.lang.reflect.Method or java.lang.reflect.Member are little more than a weird kind of documentation, metaphorically introducing the basic Java constructs.
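A minimal sketch of what the reflection types actually provide: they describe existing classes and let one call existing methods by name. The class name ReflectionAsDescription and its helper methods are hypothetical; the reflection calls themselves (getClass, getMethod, invoke, getModifiers) are standard.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

class ReflectionAsDescription {
    // An ordinary instance is not a class; only instances of java.lang.Class are.
    static boolean isClassObject(Object candidate) {
        return candidate instanceof Class<?>;
    }

    // Looks up a public no-argument method by name and invokes it on the target.
    static Object call(Object target, String methodName) {
        try {
            Method method = target.getClass().getMethod(methodName);
            return method.invoke(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot call " + methodName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isClassObject("hello"));       // prints false
        System.out.println(isClassObject(String.class));  // prints true
        System.out.println(call("hello", "length"));      // prints 5
        // java.lang.Class is declared final, so it cannot be subclassed.
        System.out.println(Modifier.isFinal(Class.class.getModifiers()));  // prints true
    }
}
```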

Well, let us admit that there are both classes and instances of classes as two basic primitives. Does that stop further questions? In no way. There is a primary eclecticism that cannot be removed. Java programs first have to be compiled into byte code, and then executed by the Java Virtual Machine. The Java compiler and the JVM are examples of non-class entities that do not fit into the class hierarchy, even if we start from the topmost class, java.lang.Object. Different implementations of the Java compiler and the JVM will influence the actual performance and cause minor incompatibilities. These could be overcome either by selecting one of the existing implementations as an industry standard, or rather by accepting a protocol to specify the compiler and the target JVM in the byte code, so that some universal JVM could adapt its behavior to these data and stay backward compatible with all the previous versions.
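A small part of such a protocol is already present in the byte code: every class file starts with a fixed header holding a magic number and the class file format version, though not the identity of the compiler. The sketch below reads that header from the class file of java.lang.Object. The class name ClassFileHeader is hypothetical, and the sketch assumes that the runtime exposes its own class files as resources, which may not hold in every environment.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class ClassFileHeader {
    // Returns {magic, minorVersion, majorVersion} of the java.lang.Object class file.
    static int[] readObjectClassHeader() {
        try (InputStream raw = Object.class.getResourceAsStream("Object.class")) {
            if (raw == null) {
                // Assumption violated: this runtime does not expose the class file.
                throw new IllegalStateException("class file of java.lang.Object is not readable");
            }
            DataInputStream in = new DataInputStream(raw);
            int magic = in.readInt();              // 0xCAFEBABE for every class file
            int minor = in.readUnsignedShort();    // minor version comes first
            int major = in.readUnsignedShort();    // then the major version
            return new int[] {magic, minor, major};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] header = readObjectClassHeader();
        System.out.printf("magic=%08X, class file version %d.%d%n",
                header[0], header[2], header[1]);
    }
}
```

The version printed depends on the JDK in use, so no fixed output is given here.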

No doubt, this particular discrepancy can be removed by introducing special "formal" classes for the compiler and the JVM, and one can certainly write both in Java. But the experience of some other languages trying to follow that line shows that this is a dead end. Today, modular diversity seems to be the governing trend, and all modern systems incorporate integration features to accommodate modules written in different languages. In particular, this means the abandonment of the tree model as a standard of code structure. Instead of a rigid hierarchical structure, we adopt a flexible hierarchy, allowing for recursive inheritance and unresolvable circularity. In such hierarchies, the whole can be unfolded starting from any arbitrarily chosen element, producing a specific hierarchical structure; moreover, several such structures can be unfolded concurrently by different threads. A germ of this approach can be found in Java as the inherent circularity of objects and classes: objects are instances of a class, while classes are also objects.
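The circularity of objects and classes mentioned above can be observed directly. In this sketch the class name ObjectClassCircle and the helper classOf are hypothetical; getClass and getSuperclass are standard methods.

```java
class ObjectClassCircle {
    // Every object, including a class object, reports the class it is an instance of.
    static Class<?> classOf(Object anything) {
        return anything.getClass();
    }

    public static void main(String[] args) {
        Class<?> objectClass = Object.class;
        Class<?> classClass = Class.class;

        // An ordinary object is an instance of its class.
        System.out.println(classOf("text") == String.class);            // prints true
        // The class Object is itself an object, an instance of Class.
        System.out.println(classOf(objectClass) == classClass);         // prints true
        // The class Class is an instance of itself.
        System.out.println(classOf(classClass) == classClass);          // prints true
        // And Class, like any other class, descends from Object.
        System.out.println(classClass.getSuperclass() == objectClass);  // prints true
    }
}
```

Starting from Object one arrives at Class, and starting from Class one arrives back at Object, so neither can serve as the single root of the whole structure.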

The impossibility in principle of reducing the whole language to a single primitive, or a few independent primitives, is due to the inherent incompleteness of classical logic, as expressed by the well-known liar paradox. Any computer language needs an interpretation in terms of the computer states and their sequences; as soon as we introduce a formal representation of the interpreter, we come across the problem of interpreting that interpreter, and so on. Within classical logic (including all the formal extensions like modal, multi-valued, typed combinatory logic, etc.) this problem cannot be solved in a consistent way. The law of excluded middle lies at the core of the liar paradox.

Developing highly reflexive programming languages could certainly be a useful experience and a source of many interesting discoveries in fields other than applied computing. A programmer, however, needs a language that can express anything within a specific application area with minimum effort. Of two programming languages that ensure the same functionality, the one that is more compact and logically transparent will be preferred. In fact, object-oriented programming came in response to such a demand, promising to structure the code in an intuitively understandable way, thus eliminating undesirable side effects.

However, as usual, too much good is no good. Trying to construct classes for every purpose leads to cumbersome and inefficient programming, where any change requires the delicate work of adjusting the interaction of numerous classes, rather than a trivial replacement of a piece of code. The existence of numerous tools for managing classes and structural optimization does not help much in most practical cases.

Classes are formally equivalent to code libraries with certain access restrictions, which can be implemented in many ways, from mere convention to modular namespaces or built-in access control lists. That is, object-oriented programming does not need a special syntax; it is a programming style rather than a technology. The same holds for any other syntactic rules. We conclude that a programming language is not identical to its syntax (including the formal semantics), though any language needs some form in every particular application. Eventually, all programming languages represent the same universal language reflecting the hierarchical organization of both its object (the logical states of computer systems) and the user needs.
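As a rough illustration of object orientation as a style rather than a syntax, the sketch below builds a counter "object" without declaring a counter class: it is a table of named operations sharing a state that is hidden by scope alone, with no private keyword involved. The names CounterLibrary and newCounter are hypothetical; Java still demands an enclosing class, but here it serves only as a library container.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntSupplier;

class CounterLibrary {
    // Builds an "object" as a table of operations closed over a hidden state.
    static Map<String, IntSupplier> newCounter() {
        int[] count = {0};   // reachable only through the two closures below
        Map<String, IntSupplier> operations = new HashMap<>();
        operations.put("increment", () -> ++count[0]);
        operations.put("current", () -> count[0]);
        return operations;
    }

    public static void main(String[] args) {
        Map<String, IntSupplier> counter = newCounter();
        counter.get("increment").getAsInt();
        counter.get("increment").getAsInt();
        System.out.println(counter.get("current").getAsInt());   // prints 2
    }
}
```

Each call to newCounter yields an independent state, which is exactly what instantiating a class would give; the access restriction comes from the scoping rules of the library function.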

For any method abusively employed outside its original scope, its merits are most likely to become its faults. Exaggerated consistency means inadequacy. Obviously, one could use a code set of only two characters for the English language, but will that make it more readable? Most human readers would prefer the traditional Latin alphabet. Moreover, in writing, many people stick to their favorite shorthand signs instead of spelling frequent words in full; this effectively extends the alphabet. Such compound characters are much like Arabic ligatures, or sometimes like Chinese and Japanese characters, which are likewise composed of simpler components, though not necessarily phonemic ones. Obviously, the usage of shorthand signs or ideograms can become inefficient when there are too many of them, and the search for a proper character in the extended set takes more time than the traditional spelling. That is, for each application area, there is an optimal alphabet (or a class of equivalent coding systems) ensuring maximum efficiency. The length of this alphabet depends on the parameters of the relevant activities. Thus, for a fixed operation set, the alphabet tends to include special characters for individual operations (regardless of the Kolmogorov complexity); on the contrary, in a rapidly developing environment, a relatively compact universal character set will be preferable.

Similarly, in programming, it does no good to reduce the whole of life to a single platform or a coding standard. Clinging to a single preferred development environment is like a passionate love for machine code, deliberately avoiding any high-level languages. The ideology of universal hierarchical programming is gradually gaining strength in applied programming and computer science. Still, the "language wars" are not yet over, and the obsession with finding the only "true" solution keeps haunting the computer world (much like the innumerable attempts to design a perpetuum mobile, which continue to this day, although everybody knows that such devices are ruled out by the most general laws of thermodynamics).

