To Jump or Not To Jump?

The argument about the admissibility of one or another language construct in programming, like the debate over "good" and "bad" style, is much like an attempt to establish an official standard for a natural language. Excluding the "four-letter" words from dictionaries will not prevent people from using them. Similarly, the much-abused word goto still occupies an important place in higher-level programming, despite all the attempts to eliminate it.

Indeed, are there any serious objections to it, or any reasons to consider it "bad style"? Yes, one can write utterly incomprehensible code with labels scattered all over for no obvious reason. But it is just as possible to achieve utter incomprehensibility in modular code without any jumps at all, as soon as we go beyond the trivial cases. Critics blame the goto statement for supposedly circumventing the regular cleanup procedures of a block; however, this is an issue of poor compilation logic rather than a deficiency of the code. The adepts of structured programming say that a program with jumps cannot be made transparently structured, that it lacks a manifest separation of structural units; but such statements merely reflect their self-restriction to a narrow class of very primitive structures (thus making them a kind of programming vegans). There are also high-flown declarations that the use of the goto statement is incompatible with large-scale industrial programming; this is strange, to say the least, since the industry has long developed with conditional and unconditional jumps, and this has never been a serious obstacle. On the other hand, why should we always stick to the programming industry? The art of programming is just as important a component of human culture.

First of all, at the deepest level, jump commands are built into the hardware as an indispensable part of the instruction set; any processor would be almost useless without the possibility of switching to a different code branch. Theoretically, one could implement an architecture deliberately based on modules, blocks of isolated code and event-driven flow control; but this would be just a cumbersome version of the same simple logic. Indeed, event processing implies passing control to the event handler, and it does not matter whether this basic functionality is hardcoded in the processor or represented by a command sequence stored in memory. Branching is universal. Regardless of hardware, it reflects the core logic of sequential processing, as any processing at all develops in time and hence implies a sequential component. The attempts to eliminate time from programming are like eliminating food from eating. By contrast, structured programming is application-bound: it depends on the current state of the industry. Hardcoding such transitory logic in computers would be unreasonable, unless we want to throw away the outdated hardware every time the popular patterns change.
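To illustrate the claim that branching underlies any structured construct, here is a minimal C sketch of one loop written twice: once with a structured for statement and once with explicit conditional and unconditional jumps. The function names sum_structured and sum_jumps are hypothetical, introduced only for this example, and the jump version merely approximates the shape a compiler might produce; it is not a statement about any particular compiler or processor.

```c
/* Sum 1..n with a structured loop. */
unsigned sum_structured(unsigned n)
{
    unsigned total = 0;
    for (unsigned i = 1; i <= n; ++i)
        total += i;
    return total;
}

/* The same computation spelled out with explicit jumps
   (an illustrative approximation of how such a loop is lowered). */
unsigned sum_jumps(unsigned n)
{
    unsigned total = 0;
    unsigned i = 1;
loop_test:
    if (i > n) goto loop_done;   /* conditional jump out of the loop */
    total += i;
    ++i;
    goto loop_test;              /* unconditional jump back to the test */
loop_done:
    return total;
}
```

Both functions return the same value for the same argument; the only difference is whether the jumps are written by the programmer or generated behind the loop syntax.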

Of course, there are classes of essentially static problems that do not imply any sequencing. For instance, to solve a differential equation, one does not necessarily want to do it step by step, point by point; in some cases, we would prefer a parallel procedure, or an operational solution that immediately yields the function value for any given argument (as in analog computers). Similarly, adaptive computers do not need strictly algorithmic code and definite memory locations; they develop approximate structures on demand. However, when it comes to digital modeling and automation of workflow, sequential processing cannot be avoided.

In the structured approach, a program is a collection of blocks A, B, ..., each block being a closed unit which, once initiated by an event, is to be executed as a whole (as a single operation). That is, any flow control is effectively reduced to a trivial conditional statement:

if e1 A;
else if e2 B;
...

Of course, in real life, the same thing takes many forms. For instance, one could define special event handlers, register them in the system, and then enable listening for events e1, e2, ... In this way, the above sequential evaluation can be made parallel (which, for interdependent events, means a different computation flow). This does not change the structure in principle. In "poorly structured" code, one could find something like

on e1 goto branch_e1;
on e2 goto branch_e2;
...
goto Exit;

branch_e1:
  A;
  goto Exit;
branch_e2:
  B;
  goto Exit;
...

Exit:

However ugly it may seem to a rigorist, any structural logic can be exactly reproduced with goto (simply because it is indeed implemented that way at the hardware level); moreover, nothing prevents us from highlighting, if necessary, the global structure with certain formatting conventions (like indentation, or label names and placement). A good compiler would interpret the code in a parallel sense, where possible, automatically defining the handlers for the indicated events and the cleanup code for branching. With less intelligent compilers, the programmer will have to handle any problematic issues manually (which may sometimes be even better). In any case, programmatically controlled jumps are much more flexible in treating structured events. For instance, if event e1 implies event ek, but not the other way round, one could simply write:

on e1 goto branch_e1;
on e2 goto branch_e2;
...
on ek goto branch_ek;
...
goto Exit;

branch_e1:
  A;
  goto branch_ek;
branch_e2:
  B;
  goto Exit;
...
branch_ek:
  K;
  goto Exit;
...

Exit:

In the "structured" style, one would have to duplicate code, generate additional events, create a structured event queue, or do something equally heavy. One of today's most popular solutions would "wrap" the original blocks in functions and replace the straightforward A with {call fA;}, thus effectively producing a hierarchy of embedded blocks (or recursive functions). With this technology, indeed, different blocks can be easily combined without code duplication: {call fA; call fK;}; however, isn't it more like an alias for a combination of jumps? Compared to such ugly monsters, a simple goto looks much nicer and makes the logic of computation absolutely transparent, so that any change in the event structure can be easily implemented.
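The two styles compared above can be put side by side in a short C sketch. It rests on assumptions made only for illustration: there are just three events (E1, E2, EK), each of the blocks A, B and K is reduced to appending its letter to a trace string, and the names put, dispatch_goto, dispatch_calls, fA, fB and fK are hypothetical helpers, not part of any existing code.

```c
#include <stddef.h>

enum event { E1, E2, EK };

/* Hypothetical stand-in for "executing a block": append one letter to the trace. */
static void put(char *trace, size_t *len, char c)
{
    trace[(*len)++] = c;
    trace[*len] = '\0';
}

/* Jump style: event e1 implies event ek, so branch_e1 continues into branch_ek. */
void dispatch_goto(enum event e, char *trace)
{
    size_t len = 0;
    trace[0] = '\0';

    if (e == E1) goto branch_e1;
    if (e == E2) goto branch_e2;
    if (e == EK) goto branch_ek;
    goto done;

branch_e1:
    put(trace, &len, 'A');
    goto branch_ek;
branch_e2:
    put(trace, &len, 'B');
    goto done;
branch_ek:
    put(trace, &len, 'K');
    goto done;

done:
    return;
}

/* Function-wrapping style: the same blocks wrapped as fA, fB, fK and combined by calls. */
static void fA(char *t, size_t *n) { put(t, n, 'A'); }
static void fB(char *t, size_t *n) { put(t, n, 'B'); }
static void fK(char *t, size_t *n) { put(t, n, 'K'); }

void dispatch_calls(enum event e, char *trace)
{
    size_t len = 0;
    trace[0] = '\0';

    if (e == E1)      { fA(trace, &len); fK(trace, &len); }
    else if (e == E2) { fB(trace, &len); }
    else if (e == EK) { fK(trace, &len); }
}
```

With a caller-supplied buffer of at least three characters, both dispatchers leave the same trace for the same event: A followed by K for E1, B for E2, and K for EK. The sketch shows only that the two forms are interchangeable in this small case; which of them reads better is the question the paragraph above raises.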

To make code more readable, there are various shortcuts for the typical cases of branching; for instance, the two common varieties of the switch statement. The functional approach can be easily combined with jump-styled coding, provided the programmer (or compiler) is careful enough to initiate cleanup every time a jump crosses a block boundary, and to control the visibility of names and the accessibility of private variables. Explicit cleanup logic gives the programmer unmatched power and flexibility, which may significantly increase the efficiency of the code. By the way, the intricate "wrapping" technologies used in object-oriented and functional programming to create various stateful objects (functions, iterators etc.) are nothing but a cumbersome way of representing partial cleanup during a cross-boundary jump.
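One common way to write explicit cleanup in C is to let every early exit jump to a label that releases exactly what has been acquired so far. The sketch below assumes two heap buffers as the resources; acquire, release, process, live_count and the live_buffers counter are hypothetical names, and the fail flag exists only to simulate an allocation failure so that the cleanup path can be observed.

```c
#include <stdlib.h>

/* Hypothetical counter of live allocations; it only makes the cleanup observable. */
static int live_buffers = 0;

/* Allocate a buffer, or pretend the allocation failed when fail is nonzero. */
static void *acquire(size_t size, int fail)
{
    void *p = fail ? NULL : malloc(size);
    if (p) ++live_buffers;
    return p;
}

static void release(void *p)
{
    if (p) { free(p); --live_buffers; }
}

int live_count(void) { return live_buffers; }

/* Returns 0 on success, -1 on failure.
   fail_second simulates a failure of the second allocation. */
int process(int fail_second)
{
    int status = -1;
    char *first;
    char *second;

    first = acquire(64, 0);
    if (!first) goto out;            /* nothing acquired yet, nothing to undo */

    second = acquire(64, fail_second);
    if (!second) goto free_first;    /* partial cleanup: only first is live */

    first[0] = second[0] = 0;        /* placeholder for the real work */
    status = 0;

    release(second);
free_first:
    release(first);
out:
    return status;
}
```

Each label marks how much has been set up at the point of the jump, so a jump out of the working part undoes that much and no more. After either a successful or a failed call to process, live_count() is back to zero.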

The major problem with jumps is too much perfectionism. The attempts to optimize the code beyond a reasonable level (especially exaggerated reusability) may result in a real work of art, unique and impractical. Freedom is the other side of responsibility, and lack of responsibility annihilates freedom. Using a language that allows explicit branching, one can jump into a block, do something, and then jump out without a bit of housekeeping, at the risk of running into uninitialized objects, memory leaks etc. But similar problems haunt any other language, including those carefully cleared of any jumps; one could mention the frequent memory leaks in Java as a typical example. On the other hand, isn't it a strange kind of logic to deprive programmers of a powerful tool just because some of them cannot use it properly?

There may be different opinions on whether one should be able to enter a block at multiple entry points, or only at the beginning, and whether it is admissible to leave the block at any moment. For instance, in concurrent programming, each block can be considered as a thread (or even a separate process), and hence there is nothing strange in another thread communicating only with some of its components. This is how natural things interact, after all. On the other hand, considering a block as an instance of a class, we come to the necessity of instantiation and disposal; in this case, the compiler may interpret multiple entry points as a single entry point with parameters, with any premature exit interpreted as a branch to the termination handler.
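The last interpretation can be sketched by hand in C, under assumptions chosen only for illustration: the block has three steps, the entry point is passed as a parameter, instantiation and disposal are reduced to writing an opening and a closing bracket into a trace, and the names run_block, enum entry and stop_early are hypothetical.

```c
#include <stddef.h>

enum entry { ENTER_AT_START, ENTER_AT_MIDDLE, ENTER_AT_END };

/* One block with several entry points, rewritten as a single entry point
   with a parameter. stop_early simulates a premature exit after the middle
   step. The trace buffer must hold at least 6 characters. */
void run_block(enum entry where, int stop_early, char *trace)
{
    size_t len = 0;

    trace[len++] = '[';              /* "instantiation": runs on every entry */

    switch (where) {                 /* the parameter selects the entry point */
    case ENTER_AT_START:  goto start;
    case ENTER_AT_MIDDLE: goto middle;
    case ENTER_AT_END:    goto end;
    }
    goto terminate;                  /* unknown entry point: run no steps */

start:
    trace[len++] = '1';
middle:
    trace[len++] = '2';
    if (stop_early) goto terminate;  /* premature exit branches to the handler */
end:
    trace[len++] = '3';

terminate:
    trace[len++] = ']';              /* "disposal": runs on every exit */
    trace[len] = '\0';
}
```

Entering at the middle without an early stop leaves the trace [23], while entering at the start with stop_early set leaves [12]; in both cases the opening and closing brackets show that instantiation and disposal ran exactly once.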

