Universal scaling

Zone scales as a universal language of design

Pavel B. Ivanov
Troitsk Institute for Innovation and Fusion Research (TRINITI)

E-mail: unism@ya.ru

Written: 18 December 1995
Revised: 11 March 1997


A hierarchical approach to the description of structure formation processes in arts is presented, which describes the development of various zone structures in music, literature and the visual arts in a uniform way. A new look at the phonemic organization of language is suggested, and the role of vowel successions in speech is related to that of melody in music. This analogy is specifically applied to poetical intonation. Experimental tests of the theory are proposed, and the possible practical applications are discussed.

1. Introduction

Philosophizing about culture often refers to art as a specific channel of communication between people, a kind of language different from both the formal language of science and the "vulgar" everyday speech. Art criticism may be quite meticulous about the "message" of a work of art, and the problem of relating the formal aspects of art production to the author's ideas and intentions has been the pivot of theoretical aesthetics during the last two centuries. The problem has acquired another aspect since computers became widely employed in the analysis of arts, and then as an instrument of the artist. The rapid development of multimedia techniques gives a new impulse to the synthetic tendency in arts, which requires a deeper understanding of the links between different arts, and new criteria of the artistic in general. Would it be right to call the output of some computer program music, painting or verse? Is there any algorithm for translating music into painting and back? How can artistic performance be documented, to enable its adequate reproduction in another environment and by other people? What in the arts is culture-dependent, and what is their universal content? These questions require constructive answers, which are only possible within a general approach regarding the arts of all times and nations as different manifestations of the same thing, in different circumstances.

Recently, I suggested a general hierarchical approach that could provide the necessary framework for comparative analysis in arts [1,2]. In that theory, human perception (including aesthetic perception as a special case) was treated as a multilevel phenomenon, with possible shifts of consciousness from one level to another. At each level, the continuum of sensory images can be decomposed into elementary conceptions selected from a pre-defined set. Such conception sets are essentially discrete, though their elements are continuous zones, so that any sensation within a zone would be perceived as a variant of the same conception. These discrete structures define the quality of a work of art, that is, the combination of characteristic features which makes it recognizable in the various performance forms. Combining the hierarchical approach with some informational and quantum-mechanical concepts leads to a mathematical theory of musical scales describing the properties of the already existing scales and predicting a number of new ones, indicating their principal aesthetic properties and applications [3]. The same structures were found in visual form perception, so that a number of parallels in the developments of music and the visual arts became evident [2]. It was natural to assume that analogous hierarchies could also be found in the third fundamental branch of art, in literature. Since synthetic arts of today mainly combine, in different ways, the word with the sound and the visual form, the discovery of scaling phenomena in the belles-lettres resembling musical scales would mean the universal aesthetic significance of scale hierarchies, making the hierarchical approach a kind of universal language to speak of the arts.

This work suggests a few considerations in support of the idea of hierarchical scaling in speech, and especially in artistic speech, that is, literature. Naturally, I cannot pretend to cover the problem in all its complexity, and my article should rather be considered as a proposal for further research, part of which is already in progress.

The next two sections are dedicated to the description of hierarchical scaling in music and the visual arts (painting, architecture, dance, and so on). A discussion of the general conceptual background of the application of hierarchical approach to aesthetic phenomena follows. The succeeding sections formulate the principles of the hierarchical description of language and outline the hypothesis of phonemic scales in speech, together with some of its consequences for poetry. The concluding remarks are concerned with possible applications of the theory in the arts and related activities.

2. Musical scales

Any art is many-dimensional, that is, there are several parameters that may vary smoothly within some interval. Thus, music could be characterized by the distribution of sound frequencies, volume dynamics, tempo variations, spatial distribution of sound sources, and so forth. Painting could be described in terms of shapes and their arrangement, colors, special effects etc. It might seem that artists may use arbitrary combinations of these parameters, and that nothing prevents them from making any choice. Still, practice shows that artistic work is never completely stochastic, and there are strong limitations on the artist's creativity. For example, in music, there is a 12-tone scale which governs the pitch organization of most modern compositions. Many musical instruments are "well-tempered" and cannot produce sounds other than the twelve degrees of this scale; even the continuous tone shifts widely used in modern music are generally recognized as modifications of the same 12-tone scale rather than something self-defined, and smooth pitch variations are just a way of transition from one degree of the scale to another. Still, the 12-tone system is not unique in music. Everyone knows the seven notes forming the diatonic scale, which historically preceded the "well-tempered" scale. The difference in pitch between any two neighboring notes of the 12-tone scale is the same, which is not so in the diatonic scale. To reproduce the diatonic scale within the 12-tone system, one has to associate the diatonic notes with some degrees of the 12-tone scale. This embedding is not unique, and the different embeddings result in different diatonic modes, thus forming a two-level hierarchical structure. One more level appears when one comes to consider harmony. The basis of classical harmony, the triad, can be treated in an analogous way as an embedding of a three-tone scale into the same 12-tone system. Again, this embedding is not unique, and the well-known major and minor triads arise as the possible variants. Thus, one obtains an alternative language to speak of musical pitch, namely that of scale hierarchies and scale embeddings.
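The language of embeddings can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the mathematical model of [1,3]: a scale is represented by the conventional pattern of semitone steps between its notes, the embedding is the set of degrees of the 12-tone system that the notes occupy, and the function names (embed, modes) are hypothetical.

```python
from itertools import accumulate

# Conventional semitone step patterns (assumed here for illustration).
DIATONIC_STEPS = [2, 2, 1, 2, 2, 2, 1]   # seven notes, steps sum to 12
MAJOR_TRIAD_STEPS = [4, 3, 5]            # three notes, steps sum to 12
MINOR_TRIAD_STEPS = [3, 4, 5]

def embed(steps, size=12):
    """Degrees of the `size`-tone system occupied by a step pattern.

    The last step only closes the octave, so it adds no new degree.
    """
    return [0] + [d % size for d in accumulate(steps[:-1])]

def modes(steps):
    """All rotations of a step pattern: the different embeddings of one scale."""
    return [steps[i:] + steps[:i] for i in range(len(steps))]

if __name__ == "__main__":
    for pattern in modes(DIATONIC_STEPS):
        print(pattern, "->", embed(pattern))
    print("major triad:", embed(MAJOR_TRIAD_STEPS))
    print("minor triad:", embed(MINOR_TRIAD_STEPS))
```

In this representation the seven rotations of the diatonic pattern give seven different sets of degrees (the diatonic modes), while the two triad patterns give the degree sets 0, 4, 7 and 0, 3, 7.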

This reformulation of the classical theory of music in terms of interacting scales would be useless if it described the already known facts only. However, it brings forth some new ideas. First of all, one could ask which scales can be embedded into the 12-tone system, and whether there is any scale that cannot be embedded. Why should we confine ourselves to the three-level structure, not trying to build more complex constructions? And, in general, is the 12-tone scale the only universal scale permitting multiple embeddings? These questions have been asked by many musicians and theorists during the last centuries, and numerous misunderstandings persist to this day. The most typical mistake is to assume that the notes in a scale are exact points on the pitch axis, and that a degree of a scale is characterized by a definite sound frequency. This approach manifests itself in the attempts to associate the musical intervals with some simple frequency ratios, so that a smaller numerator and denominator would correspond to more consonant intervals, while the ratios of greater numbers would mean dissonance. The actual musical performance hence would be evaluated by the accuracy of reproducing the pre-defined intervals, and the 12-tone system of modern music is considered as a mere compromise in approximating the greatest possible number of common intervals with a tolerable accuracy.

This "numerological" approach has a long history, going back to Pythagoras, or even to the magic consciousness of the primitive peoples. It played an important role in discovering the non-linear character of pitch perception, resulting in the internal overtone (and sometimes undertone) expansions of any sound. Indeed, one can never hear a pure tone with only one frequency present; non-linear mental processes will enrich it with many other frequencies, and it is this harmonic structure of a musical sound that allows us to ascribe a definite pitch to it. Another valuable result of pitch numerology was the construction of many scales other than the 12-tone one, the systems of 19 and 31 tones being the most popular [4-7]. Still, these new scales were generally considered from the viewpoint of approximating the sets of pre-selected intervals, and their own aesthetic potential was utterly lost in this way.
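The approximation viewpoint described above can be made concrete with a small Python sketch. It illustrates the "numerological" way of comparing scales, not the hierarchical theory itself. It assumes that each scale divides the octave into equal parts, measures intervals in cents (1200 cents per octave), and takes the ratios 3/2 and 5/4 as the pre-selected "pure" intervals; the function names are hypothetical.

```python
import math

def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200.0 * math.log2(ratio)

def best_approximation(ratio, divisions):
    """Nearest number of equal steps to `ratio`, and the error in cents."""
    step = 1200.0 / divisions
    steps = round(cents(ratio) / step)
    return steps, steps * step - cents(ratio)

if __name__ == "__main__":
    intervals = (("3/2", 3 / 2), ("5/4", 5 / 4))
    for divisions in (12, 19, 31):
        for name, ratio in intervals:
            steps, error = best_approximation(ratio, divisions)
            print(f"{divisions} tones: {name} ~ {steps} steps, "
                  f"error {error:+.2f} cents")
```

Under these assumptions the 12-tone system renders 3/2 to within about two cents but misses 5/4 by more than thirteen, whereas the 31-tone system renders 5/4 to within one cent. This is exactly the kind of comparison that treats the new scales as mere approximations.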

One might say that the principal drawback of this approach is the absolutization of discreteness in human perception. The picture changes drastically when one accounts for the intrinsic uncertainty of pitch evaluation arising from the complex organization of human activity and brain processes [1,2]. Actually, nobody can determine pitch exactly, and nobody has any need to. Implicitly, this idea was already present in the numerological approach when the approximate intervals were introduced, which was a strong violation of logic: by the numerological criterion, approximating the "consonant" ratio 3/2 with something like 30001/20001 should result in the harshest dissonance because of the very large numerator and denominator in the latter fraction; however, such a replacement produces no effect on human perception, and the two intervals subjectively sound the same.
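The size of the discrepancy in this example is easy to estimate. The following Python sketch, given only as an illustration, expresses the difference between the two ratios in cents (hundredths of an equal-tempered semitone); the variable names are hypothetical.

```python
import math
from fractions import Fraction

def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200.0 * math.log2(float(ratio))

pure = Fraction(3, 2)
detuned = Fraction(30001, 20001)

# Difference between the two intervals, in cents.
difference = cents(detuned) - cents(pure)

if __name__ == "__main__":
    print(f"3/2         = {cents(pure):.3f} cents")
    print(f"30001/20001 = {cents(detuned):.3f} cents")
    print(f"difference  = {difference:.4f} cents")
```

The difference is a few hundredths of a cent, although the second fraction has a numerator and denominator ten thousand times larger than the first.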

So, it would be natural to admit that a note is not subjectively represented by a point on the frequency axis, but rather by a frequency zone, so that all tones within the zone are conceived as the same. This does not necessarily mean that the respective frequencies cannot be resolved in hearing; all I say is that the tones within a zone are functionally equivalent, that is, they correspond to the same musical conception. Moreover, the existence of perceptibly different tones within a zone is very important in music, since it makes the minute adjustment of intonation possible. In fact, the skill of a performer is mainly a matter of proper intonation, and the mechanical reproduction of the exact pitch sequences can hardly be considered as artistic performance.
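The distinction between a zone (what the tone is) and the position within the zone (how it is intoned) can be sketched in Python. This is a simplified illustration, not the model of [1,3]: it assumes twelve zones of equal width per octave and a reference pitch of 440 Hz, neither of which comes from the text, and the function name zone_of is hypothetical.

```python
import math

REFERENCE_HZ = 440.0    # assumed reference pitch, not from the text
ZONES_PER_OCTAVE = 12   # assumed: equal zones of the 12-tone scale

def zone_of(frequency_hz):
    """Return (zone index, offset from the zone centre in cents).

    Tones with the same zone index are treated as the same note;
    the offset is the room left for intonation within the zone.
    """
    position = ZONES_PER_OCTAVE * math.log2(frequency_hz / REFERENCE_HZ)
    zone = round(position)
    offset_cents = (position - zone) * 1200.0 / ZONES_PER_OCTAVE
    return zone, offset_cents

if __name__ == "__main__":
    for hz in (440.0, 445.0, 452.0, 466.16):
        zone, offset = zone_of(hz)
        print(f"{hz:7.2f} Hz -> zone {zone:+d}, offset {offset:+.1f} cents")
```

Here 440 Hz and 445 Hz fall into the same zone with different offsets: they are distinguishable tones that stand for the same note.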

The zone nature of pitch perception has been experimentally established by the outstanding Russian scientist N. Garbuzov [8]. He spent more than thirty years of his life in thorough experimenting, trying to prove that any side of musical perception implies the appropriate zone structure, including the perception of loudness, tempo, note duration, and musical timbre [9-11]. Related results were also obtained in many psychophysical experiments [12], and it was H. Helmholtz who suggested, despite his theoretical favoring of the "pure" intervals, the Gaussian distribution as a model of an isolated harmonic [13].

Once the idea of zone scales is adopted, the problem of determining the pitch zones arises. On the basis of the hierarchical approach, a mathematical model can be developed, predicting all the possible zone structures and describing the hierarchies of scale embeddings possible within each scale [1,3]. Though the model does not assume that a scale must contain an integer number of zones per octave, all the computer-found scales happen to be approximately periodic, with an almost integer number of zones per octave. Thus, there is a scale with 12.006 zones per octave (which obviously corresponds to the traditional 12-tone scale), as well as the scale with 6.943 zones per octave corresponding to the diatonic scales. It has been found that the scales with few zones per octave lack "temperament", that is, their zones are not equally spaced within the octave. This effect becomes negligible for the more "dense" scales, since their zones always include the points of the uniform division of the octave. In particular, the zones of the 7-tone scale are spaced in accordance with the interval structure of the classical diatonic sequences, while the zones of the 5-tone scale reproduce the pentatonic interval structure.
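The two figures quoted above can be put into more familiar units with a few lines of Python. The sketch only restates the quoted numbers: it reports the nearest integer number of zones, the deviation from it, and the mean zone width in cents. It says nothing about how the individual zones are actually spaced, which the model of [1,3] determines; the function name describe is hypothetical.

```python
def describe(zones_per_octave):
    """Nearest integer count, deviation from it, and mean zone width in cents."""
    nearest = round(zones_per_octave)
    return {
        "nearest_integer": nearest,
        "deviation": zones_per_octave - nearest,
        "mean_zone_width_cents": 1200.0 / zones_per_octave,
    }

if __name__ == "__main__":
    for value in (12.006, 6.943):   # figures quoted in the text
        info = describe(value)
        print(f"{value}: nearest {info['nearest_integer']}, "
              f"deviation {info['deviation']:+.3f}, "
              f"mean width {info['mean_zone_width_cents']:.2f} cents")
```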

Many properties of the existing scales can be described in this mathematical model, including the typical usage of intervals in each scale. Thus, some scales are modally labile, and their intervals generally reveal themselves in melody, while in harmonically labile scales intervals tend to appear in the harmonic vertical. The 12-tone scale is stable both modally and harmonically, and this explains its unique position in the modern musical culture. Still, the theory predicts new stable scales, including the already mentioned 19-tone and 31-tone systems. The internal organization of these scales is quite different from that of the 12-tone scale, and they cannot be considered mere extensions of the latter.

This is a key point of the hierarchical theory. The numerological approach defines the "consonant" intervals once and for all, so that all the musical scales are treated as more or less crude approximations to the "ideal" proportions; that is, music should be the same in all the scales, with minor technical differences. This viewpoint has been widely criticized ever since the beginning of the twentieth century, as the interest in oriental and African music grew among European musicians. The traditional music of Asia has been acknowledged as an independent branch in musical history, and the numerous oriental musical modes were recognized as qualitatively different from the European scales. The hierarchical approach makes it possible to distinguish different lines of scale development, one of which leads to the chromatic system of European music, while another results in various generalizations of the pentatonic. It is only in the 31-tone scale that the two lines merge together.

According to the hierarchical approach, music in any scale should sound different, depending on the internal organization of this scale. For example, the 12-tone music is not like the diatonic or pentatonic music, though the intonations of the latter two can be somehow modeled within the 12-tone scale. Analogous relations exist between the 12-tone system and the 19-tone scale. One cannot associate intonations of different scales in a straightforward way, or transfer the ways of performance from one scale to another without reservations.

Hierarchical description of scale formation introduces development into the theory of music, so that the historical scales become the necessary stages of this development, rather than the imperfect copies of the same "ideal" scale existing without any relation to human history. This change of view is analogous to L. Vygotsky's revision of traditional psychology, when the historical development of concepts was established, in close connection with the development of human thought [14].

3. Scales in the visual arts

The visual arts are different from music in much the same way the spatial coordinates in physics differ from the time coordinate. While, in music, the composition unfolds itself sequentially, moment by moment, one can grasp the whole visual image at once, and its parts are simultaneously present before the observer. In principle, the points of a picture can be arbitrarily ordered in viewing, while musical performance assumes an irreversible sequence.

However, the analogy with physics can be traced somewhat farther, and one can recall that physical space is closely linked to physical time, since they both manifest themselves only in a specific physical motion. Modern physics derives the topology and geometry of space-time from the interactions of material bodies (or fields), and the distinction between spatial and temporal coordinates becomes relative. It would be natural to expect the artistic space and time to be intertwined too, and the remarkable fact is that the idea of combining music with light became very popular in the beginning of this century, in parallel with the relativistic revolution in physics.

In the history of culture, there were many attempts to correlate music and painting [15]. The seven colors of rainbow seemed to be a natural analog of the seven notes of the diatonic scale. However, there has been little progress in this way until now. This failure could be explained from the viewpoint of the modern theory of color and the practice of computer color synthesis. The one-dimensional ordering of colors in the rainbow has proved to be merely incidental, since the actual color space is at least three-dimensional, with a non-trivial topology.

Why has the spatial organization of painting not attracted as much attention, though, logically, it should be the first in the search of analogies between music and painting? I suppose this was mostly because there are several space coordinates, in contrast with the single time dimension. Now that the color space has lost its apparent simplicity, the moment has come to revise the relations between the artistic space and time. This revision could be performed on the basis of the hierarchical theory of activity in general psychology [16]. Combined with the theory of scale hierarchies, it leads to the conclusion that there is a direct analog of a musical interval in the visual perception, namely the angle in the viewing plane [2]. Musical pitch can be associated with a direction in the plane, and the traces of this resemblance have been found in the history of painting, sculpture, and architecture. Since the mathematical theory of musical scales has revealed the characteristic features of different scales, one can easily observe that the level of scale development in the music of any culture corresponds to the level of the development of spatial forms in that culture's painting. Thus, pentatonic music would usually be accompanied by pentatonic painting, arranging forms around five distinct directions in the plane and avoiding definite geometrical shapes. The painting of Europe provides examples for the whole range of visual scales, from pentatonic and diatonic to the 12-direction system, the analog of the 12-tone musical scale.

The origin of this similarity between music and the visual arts lies in the similar organization of activity. From the psychophysical point of view, the evaluation of a plane angle involves the same internal operations as the evaluation of a musical interval, all the complexity of scale hierarchies growing from this elementary activity [2]. The situation is quite analogous to physics, where time intervals are often measured by the distance one could cover in that time moving with a constant velocity, while distances can be measured by the time it takes to walk them at a definite speed.

Now we can return to color. In 1911, Russian painter W. Kandinsky noticed that, in painting, color plays the same role as instrumental timbre in music: the border of two colors produces a line [17]. Kandinsky associated the principal colors of his paintings with the timbres of different musical instruments (or with the registers of one instrument). These observations hold for the major part of modern painting, from realistic to psychedelic. However, this does not mean that color cannot be used like musical pitch (and that musical pitch cannot be used for coloring). The evident example is the usage of color and pitch in impressionist art. More evidence comes from psychosemantics. One can notice that any language has its own collection of words and word combinations designating color. This collection is not closed, since there exist standard ways of generating new color terms. Nevertheless, all these terms form a discrete set, tending to cluster in a kind of zones. R. M. Frumkina experimentally investigated the relations of subjective proximity in the field of color designations [18]. She has shown that all the color terms in Russian can be semantically grouped into seven distinct clusters (and possibly several groups representing mixed colors). This means that color is psychologically defined as an element of a zone structure, just like a musical note is perceived as a zone in some pitch scale. However, the collection of principal colors obtained in these experiments did not coincide with the rainbow sequence and cannot be arranged in a line.

It would be interesting to compare the perception of the visible part of solar spectrum by people belonging to different cultures, with different musical traditions. One could expect that the dominance of pentatonic music would correlate with a five-color sequence in the rainbow, contrary to the seven colors distinguished by Europeans. I could also predict that, even for Europe, there may be differences in the number of colors distinguished in the spectral continuum, depending on the cultural level of the sampling group and the conditions of observation. This could be proved in a simple psychophysical experiment measuring the differential thresholds of color perception with varying width of the spectral band in view.

4. Fundamental duality

The description of scales in music and the visual arts can be substantially generalized. It is well known that people are not aware of everything they do, and the focus of awareness moves in the course of activity from one conscious goal to another [19]. This movement to a pre-selected goal is called action, and any activity unfolds itself in a sequence of actions, which can similarly be implemented as the sequences of operations. On the level of activity, the world is represented with a continuum of possibilities, while any operation is essentially discrete, since it cannot have subjective duration in both space and time [2]. The level of action is intermediate between activity and operation, and therefore awareness assumes a synthesis of continuity and discreteness. This synthesis is naturally reflected in the theory of zone scales.

I suggest that the way of unfolding activity into a sequence of actions is a fundamental characteristic of a specific culture, and it should reveal itself in any aspect of it. This implies that every particular situation would be evaluated by the people through a most general process called categorization: at any level, the situation is related to one of the discrete set of categories, each category assuming a continuous zone representing the possible variations of the same situation. Scale hierarchies in music and the visual arts are examples of categorization, and I expect that the model developed for the description of these hierarchies would be applicable to any other cases, provided the appropriate variables have been found.

The principal duality of continuity and discreteness can be a useful source of theoretical ideas, and a criterion of trustworthiness for any conceptual system. Thus, one should always remember that any opposition is only valid within a definite set of categories and may disappear on another level of hierarchy, in another scale, where the "opposite" things may belong to the same zone. In logic, it means that a result of formal deduction should be considered as a hypothesis, rather than a proof; because of the zone nature of the logical truth, the formal conclusion may be found off the zone, or in some other zone (like the round-off errors in real arithmetic). Virtually, the very category of truth is a conceptual frame at a definite level of consideration; in a restricted way, this fact has already been acknowledged by modern mathematics [20]. This relativity of appraisal is apparent in the arts, where it is a common phenomenon that an act inadmissible in one aesthetic environment becomes quite normal in another. The musical example is the hierarchical understanding of dissonance, which is defined as a violation of the zone structure at some level of the hierarchy of scale embedding; typically, there may be dissonances in harmony, modal dissonances (altered notes), and scale dissonances (widely used in modern music to produce a stratified texture).

Another type of theoretical fault is exaggerating the continuous side of phenomena. In aesthetics, this tendency is represented by intuitionism of any sort, overstatement of the spontaneous creativity and the neglect of composition. The duality principle says that no human activity can be completely spontaneous, and some processes of categorization would be always present. The artist may be unaware of the system of categories involved, and it is the matter of aesthetic education to enrich the syncretic creativity with a variety of conscious attitudes, which could enormously enhance the expressiveness of art.

5. Scales in language

Like everything in the world, art is hierarchically organized, and literature constitutes a specific level of this hierarchy. This level could be considered as intermediate between aesthetic and scientific creativity. Language is designed to reflect any kind of activity, including artistic work and speech production itself. Still, the way language is used in the arts is different from that of everyday life, as well as from scientific reasoning or philosophical discourse. The characteristic feature of the artistic usage of language is that its conceptual aspect is of secondary significance here, and it is its ability to produce forms that is important. This feature may not be obvious in literature, since people are rarely distracted from the semantic side of their reading. Still, such distraction does occur from time to time, and most people know this unusual feeling of hearing the modulations of speech with no regard to its meaning. This is the case when one can enjoy a verse, or a song, in a foreign language; however, people may have the same experience merely talking with somebody about the commonest things. Great poets and writers feel it much more strongly than ordinary people, and they follow that feeling as they write.

Numerous ways of using words to produce forms are known in modern art. In written speech, the word becomes visualized, and its visual image may undergo most complex transformations, according to the laws of the visual arts. In simple cases, it is only the graphic form that changes, as in the compositions by Richard Kostelanetz [21]. More complex compositions may involve computers, holography and fractal theory [22]. Analogous transformations of the sound of speech may be encountered in some varieties of modern abstract music.

Of course, the way of hearing or reading a word cannot leave the way of its aesthetic perception intact. Still, aesthetic experiments with the visual image of the word, or its musical intonation, should be rather referred to the sphere of synthetic art, being closer to music or painting than to the belles-lettres. However, hierarchical approach indicates that language must have its own means of form production, which should receive more attention than they were given before.

For further analysis, I will distinguish two levels of the artistic usage of language. One of them is concerned with the wording itself, with little relation to the content of the speech. The opposite direction is to categorize the meaning of the words, making them symbols rather than designations of real things. These two opportunities correspond to the distinction of outer and inner speech described by L. Vygotsky [14]. Since, according to Vygotsky, thought occurs in language, both directions of its development, from the internal scheme to speech behavior and from the outer speech to its folded reflections, can be aesthetically represented in human culture. Actually, both levels of the aesthetic categorization of language will co-exist in any artistic work, with varying relative significance.

I should stress here that my distinction of the two categorization levels applies to both spoken and written language. Naturally, these manifestations of the linguistic ability are quite different in their modes of expression; they differ in structure and obey different functional laws. Still, both kinds of speech realize the same process of thought becoming a linguistic form, and back, from a particular arrangement to its subjective sense. The apparent difference between spoken and written language is irrelevant to the essence of speech production, which might be related to their common origin, the gesture [23].

The hierarchical approach predicts that on both levels of categorization one might discover zone hierarchies analogous to musical scales. Theoretical aesthetics and literary criticism have done much to describe the variety of forms related to the contents of an artistic text, which I will, for simplicity, call the semantic level. Numerous laws of composition have been suggested for both poetry and prose, though a lack of system can still be felt in these studies.

Semantic scales most clearly manifest themselves in the classical narration. A story usually unfolds around a few main characters, who are interconnected by typical situation links. This structure admits some flexibility, and the characters of the story may look different depending on the situations they enter. This behavioral variability defines a kind of behavioral zone for each character, and the effect of dissonance is achieved when a character acts in a way incompatible with its zone; however, such acts are never arbitrary, they always belong to some lower-level scale. Such dissonances are quite common in modern literature, which is like the "emancipation of dissonance" observed in contemporary music. The structures in the literature of the former epochs were simpler, with fewer dissonances, and the farther into the past we descend, the narrower the collection of standard characters we find. The theory says that the primitive forms of narration should be based on the diatonic (seven zones), or even pentatonic (five zones) scale. Since these scales are modally labile [3], the basic type of structure they imply is a chain-like sequence of events, with no evident final point; however, such "tonicality" may occur locally (in a small fragment of the story), especially in the diatonic scale. This conclusion is supported by the results of the computer simulation of fairy tale production [24], though more experiments and analytical study are still waiting to be performed. For instance, it would be interesting to relate the gods of Ancient Greece to the tones of the Perfect and Immutable System that governed ancient Greek music [25].

There is yet another area of semantic-level categorization, which concerns the meaning of the words and phrases. The existence of psychosemantic zones has been demonstrated in the above-mentioned experiments by Frumkina [18]. Hierarchical approach predicts that analogous zones could be found for the word meanings in any particular field of human activity, which might be detected in similar experiments. Thus, one could discover psychosemantic classifications of emotions, motives, attitudes, or personal traits, which are implicitly used in literature or artistic speech.

6. Articulation scales

Compared to the semantic level of categorization, available data on speech production provides less evidence for hierarchical scaling. Two scientific traditions exist in this area, representing the two polar abstractions described above: one is to study speech as a psychologically continuous process, and the other is to consider speech as a result of formal structure transformations, assuming the existence of a priori defined sets of unchangeable elements. The former line inevitably comes to attributing the laws of speech production to something beyond the language, like social influence or the construction of the brain. The latter trend is closely adjacent to traditional linguistics, with its enumeration of grammatical structures, the parts of speech, morphological categories and so on. However, the understanding of the principal duality of the continuous and the discrete sides of speech production is gradually finding its way into linguistics as well, especially in the theories based on the notion of functional semantic fields [26]. The idea of categorization in phoneme recognition is becoming rather popular in psychological research too [27].

The hierarchical approach is immediately applicable to the study of language development, that is, to comparative linguistics [28]. It implies that the principal direction of any development is from primitive syncretism to ever more complex zone structures (scales). Consequently, primitive languages could not be highly structured, and their categories had to be much more diffuse than those of modern speech. This assertion does not agree with the traditional views.

For instance, let us consider comparative phonemics, which studies the elementary portions of speech, the phonemes. Schematically, the traditional approach is to compare many languages and to establish some formal rules transforming the phonemic system of one language into that of another. The existence of such rules for a number of languages is attributed to their common origin, and the phonemic system of their "common predecessor" is reconstructed to describe the widest range of actually observed transformations. Finally, one obtains a highly developed system of phonemes, containing 11 or 12 vowels, 8 nasals and liquids, and up to 28 consonants, with several substructure levels [28, p.166]. Is it likely that primitive people, living several thousand years before the first civilization on Earth, could develop such complex phonemics? One would rather expect that the primitive languages distinguished only a few phonemes, with very wide zones admitting a significant scatter in their pronunciation. Originally, the phonemes were not sufficiently separated from syllables, or even from whole words. This naturally explains why ideograms always came first in the history of written language, and why the syllabic stage preceded true phonetic writing in the history of hieroglyphs becoming letters. From this viewpoint, the development of phonemics should be understood as a gradual improvement of phonemic hearing, increasing the number of distinguishable phonemes and narrowing the range of their pronunciation variations. This process followed different routes in different places, and that is why there are many phonemic systems. Further, since the development of phonemic scales is governed by the same objective laws in any language, independently formed phonemic systems would reveal the same structural features, and there is no need to seek a common predecessor to explain this similarity. Hence, there is much more allowance for the independent genesis of speech in many far-apart geographical locations, and one does not have to imagine mass migrations at the early stages of human history.

These conclusions may be illustrated by examples from the history of music. Thus, different versions of the pentatonic scale can be found in the traditional music of any nation; however, no one would deduce from this fact that, say, the Chinese and the Greeks descend from the same people through divergent evolution. There are many particular variants of the diatonic scale, and even the elements of the 12-tone system appeared independently in several regions of the Earth.

Of course, I do not assert that there was no divergent evolution at all in the development of phonemic systems. One can observe such processes even now in the pronunciation of ethnic groups living in a foreign-language environment. Thus, Russian speakers in the USA pronounce words differently from those living in Russia; this difference could be explained by interference with American articulation habits and by the faster rate of speech typical of American English. However, the possibility of the independent formation of analogous structures in different languages makes comparative studies more difficult, since it becomes necessary to discriminate between the different modes of language development.

Phonemic zones exist in modern languages as well. For instance, in Portuguese the same vowel can be pronounced differently depending on its position in the word, or within a phrase [29]. However, in most cases, the zone nature of the phonemes is obscured by the traditional rules of spelling, which rarely reflect the actual pronunciation. The classification of phonemes in a particular language is a non-trivial task, and the description of the phonemic system may be influenced by theoretical attitudes. Hierarchical theory predicts that modern pronunciation (at least in Europe and North America) should be related to the pitch structure of contemporary music, as well as to the ways of constructing visual forms. It seems likely that a direct correspondence can be found between musical scales and the vowels of a language.

This point requires deeper consideration. Both speech and music are made of sound, so to say. Still, sound is used differently in the two cases. While music is essentially the art of intonation, speech grows from another side of sound, articulation. Therefore, structures in speech should be derived from timbre rather than pitch. The consonants do not seem to possess the timbre definiteness necessary to produce relatively stable patterns; their articulation is transitory, compared to the vocalization of the vowels. Three-dimensional spectra of phrases (which are now available on screen using programs like Wave for Windows and Voice Toolkit) clearly show that the major part of the sound energy is concentrated in the peaks constituting the formants of the vowels, while the consonants mainly define the form of the peaks. This is quite analogous to the role of instrumental noise and performance accents in music, where they modify the appearance of spectra, but do not affect pitch.

I conclude that the vowels of a language should form a kind of scale, and the development of language is accompanied by an increase in the number of distinguishable vowels, with more restrictions on their actual pronunciation. Originally, there could be a single center of vocalization, which would later split into a few distinct zones. This hypothesis is supported by the development of phonemes in children's vocalization at the early stages, before they begin to speak; the insufficient separation of the vowels may also be a source of many word deformations observed in babies' speech. Still, more work in this direction would be desirable.

The appearance of written language played an important role in the history of phonemic scales. It codified the inventory of vowels (scale zones) that already existed, thus making their usage much more conscious, so that the features of the corresponding phonemic system could be unfolded in full. On the other hand, fixed alphabets constrained further phonemic development, and the actual pronunciation of words gradually came to resemble their spelling less and less. The force of tradition is rather strong in language, and outdated spelling rules may even lead to an apparently retrograde evolution, in which the number of vowels decreases. However, this process does not mean the revival of some primitive phonological system; the zones of the scale remain rather narrow, unlike the wide zones of the older scale with the same number of zones. A typical example is modern Spanish, with its five very distinct vowels. This phenomenon can be related to scale embedding in music.

Since it is rather difficult to determine phonemic zones in modern languages, one could try to investigate the cases of speech behavior that could reveal phonemic scales more clearly. I suppose that the aesthetic potential of speech is based on the articulation scales, and literature as a kind of art implies a deliberate choice of sound sequences. Poetry seems to be the part of literature most dependent on articulation.

7. The art of speech

There is nothing new in the assertion that poets are rather careful in selecting the words and arranging them in the verse. The draft manuscripts of many eminent poets show the traces of a painstaking search for exact wording, with the feeling of the sound often dominating over semantic considerations in this search. The verse should sound right: this is the first formal requirement in poetry.

However, the existing theories of verse primarily concern the rhythmic side of versification, while the rules of poetical articulation recede into the background. Some attention paid to rhymes is not enough to explain the structure of the verse (especially if it contains no rhymes at all), and the descriptions of the various kinds of sound repetitions (assonances, alliterations, anaphoras and epiphoras, etc.), as well as phone confluence or avoidance, usually treat them as arbitrary technical tricks. The hypothesis of articulation scales provides a different view on the phonemic structure of the verse. I conjecture that the alternation of vowels in a verse is analogous to melodic movement in music, and the role of the consonants is similar to that of instrumental timbres: they modify the articulation of vowels and partially control speech dynamics. Thus, the languages where many consonants may occur together (like Russian and German) would be characterized by a relatively slow rate of speech; deliberate confluence of consonants is often used in poetry to slow down the recitation.

The similarity of a sequence of vowels to musical melody looks quite natural from a practical point of view. Thus, in cartoon making, the actual speech is often replaced with an imitation preserving the general articulation and changing the "noise" component of speech to make it incomprehensible. In many cases, this melodic side is evident in poetry. For example, let us read the beautiful poem by Robert Graves [30, p.103], discarding all the consonants, with only vowels left:

She tells her love while half asleep,
    In the dark hours,
      With half-words whispered low:
As Earth stirs in her winter sleep
    And puts out grass and flowers
      Despite the snow,
      Despite the falling snow.

The effect becomes even more impressive if one allows for partial vocalization of such half-consonants as [j], [w], [l] and [r], which are known to be syllable-producing in some languages.
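The reading exercise described above can be roughly imitated in a few lines of Python. This is only a sketch under a strong assumption: the letters a, e, i, o, u stand in for the vowel sounds, whereas the argument concerns actual pronunciation, which English spelling reflects poorly. The names `vowel_skeleton`, `VOWELS` and `HALF_CONSONANTS` are hypothetical and introduced for illustration only.

```python
# Orthographic approximation: written letters stand in for spoken vowels.
VOWELS = set("aeiou")
# Letters standing in for the half-consonants [j], [w], [l] and [r].
HALF_CONSONANTS = set("jwlr")

POEM = [
    "She tells her love while half asleep,",
    "In the dark hours,",
    "With half-words whispered low:",
    "As Earth stirs in her winter sleep",
    "And puts out grass and flowers",
    "Despite the snow,",
    "Despite the falling snow.",
]


def vowel_skeleton(line, keep_half_consonants=False):
    """Drop every character of each word except the chosen letters."""
    keep = VOWELS | HALF_CONSONANTS if keep_half_consonants else VOWELS
    words = []
    for word in line.lower().split():
        kept = "".join(ch for ch in word if ch in keep)
        if kept:  # skip words that contain none of the kept letters
            words.append(kept)
    return " ".join(words)


if __name__ == "__main__":
    for line in POEM:
        print(vowel_skeleton(line))
    print()
    for line in POEM:
        print(vowel_skeleton(line, keep_half_consonants=True))
```

A more faithful version would start from a phonemic transcription of the recitation rather than from the spelling; the sketch only shows how the "melody" of vowels can be separated from the consonant "noise".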

Of course, articulation melodies in modern poetry may be rather complicated, and I must admit that there should exist some analog of musical harmony. In the case of painting, I suggested the process of intonation folding as the main source of harmonies (chords), which were associated with the plane figures in graphics [2]. The same holds in speech, and one might regard a confluence of vowels pronounced as a single vowel (vowel fusion) as an analog of a musical chord. Evidently, diphthongs and triphthongs will be the first candidates for articulation chords. Also, there are speech dynamics effects producing vowel fusion in speech, and this is the most typical way of introducing harmony into verse, similar to arpeggio or hidden polyphony in music. True polyphony is rather difficult to achieve in speech, though, I suppose, some examples might be found in poetry.

Languages differ in their ways of arranging the succession of vowels in time. Thus, diphthongs are relatively rare in Russian, and the "chord" effect can only be achieved by dynamic means. There are also differences in the quality of phonemes which would influence the aesthetic perception. For example, the above poem by R. Graves was written in British English, and it would lose much of its expressiveness when recited in American English, though possibly acquiring expressiveness of a different kind. Translation from one language into another may completely alter the articulation design of the verse, especially when the translator tries to preserve the semantic aspect. For instance, there are numerous Russian interpretations of the songs by The Beatles, which are characterized by a rude discordance between the lyrics and music; the exact reproduction of the instrumental arrangement enhances this impression. A perfect translation, preserving both the contents of the poem and its articulation structure, is hardly ever possible. A talented translator would usually prefer to change the message of the verse to reproduce the general tonality. To continue the previous example, many hits of Euro-American pop music have been translated into Russian and became true hits in Russia, though the text of the One-Way Ticket never resembles that of the Blue Hoarfrost, and the Yellow River has nothing to do with the Carlsson. Similar examples can be found for other languages: thus, the same Yellow River sounds as Amerique in French (with a close reproduction of articulation), and, inversely, Comme d'habitude has become My Way in the English translation (with a quite different articulation scheme). In this connection, one could recall that transcribing a musical piece from one instrumental cast to another may require a significant change in the notes too.

If poetical intonation is closely related to the succession of vowels, the study of poetical texts from this angle would reveal many interesting facts. The development of poetry might be traced in connection with the development of musical scales and graphic intonations, though the reconstruction of the true pronunciation in the past is most difficult; the conception of zone hierarchies might give some clues to such reconstruction.

8. Concluding Remarks

One might say, "Well, you may invent any theory. And what's the use of it? Artists will create from their feelings rather than from a cold reasoning, and most judges of their art will trust the general impression rather than the detailed examining of the composition."

I agree that syncretic impression plays an important part in the arts; in hierarchical aesthetics, art itself is described as the syncretic level of creative reflection, as compared to the analytical level of science, and the synthetic philosophical reflection. However, formal methods are much more significant in design than has been commonly thought. The activity of an artist is always directed by the nature of the material used, and the more developed the techniques applied, the more creative freedom the artist attains. A talented self-taught person can use the tricks discovered or re-invented in one's personal experience, while an educated artist has access to the treasury of expressive means collected by all humanity in the course of many centuries. Education cannot replace talent, but an educated talent is a thousand times more powerful. So, the search for new modes of action is bound to lead to new discoveries in art.

The theory of hierarchical scaling can be used in constructing new instruments. For example, the keyboard of the 19-tone piano could be organized in a special way [1]. Combining different scales in computer music requires special software allowing for convenient scale selection. Thus, the existing software for MIDI programming is generally oriented to the 12-tone scale, and one has to manually adjust the pitch shifts to write 19-tone music. The analogy between musical scales and graphic intonations would enable the development of new tools in the existing graphic packages, and the zone nature of color perception suggests the possibility of a new type of palettes and color transformations. One more aspect is the optimal tuning of ordinary instruments. Thus, the traditional violin tuning by pure fifths (with the 3/2 ratio of frequencies for the neighboring strings) is optimal for performance in the Pythagorean mode, while chords and temperament require more effort [31]. For 19-tone performance, it would be natural to tune the violin by the 19-tone ("harmonized") fifths, which are somewhat smaller than the pure fifth.
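The arithmetic behind the pitch shifts and the fifths mentioned above can be sketched as follows. This is an illustration under a stated assumption, not a description of any existing MIDI package: the 19-tone scale is modeled here as an equal division of the octave into 19 steps, with the fifth spanning 11 of them. All function and constant names are hypothetical.

```python
import math


def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200.0 * math.log2(ratio)


def step_cents(divisions):
    """Size of one step when the octave is divided equally (an assumption)."""
    return 1200.0 / divisions


PURE_FIFTH = cents(3 / 2)         # the 3/2 ratio, about 701.96 cents
FIFTH_12 = 7 * step_cents(12)     # 12-tone fifth, 700 cents
FIFTH_19 = 11 * step_cents(19)    # 19-tone fifth, about 694.74 cents


def nearest_12_tone_offset(degree_19):
    """Map a 19-tone degree to the nearest 12-tone semitone.

    Returns (semitone index, shift in cents) where the shift is what
    would have to be added to the 12-tone pitch by hand.
    """
    target = degree_19 * step_cents(19)
    semitone = round(target / 100.0)
    return semitone, target - 100.0 * semitone


if __name__ == "__main__":
    print(f"pure fifth:    {PURE_FIFTH:.2f} cents")
    print(f"12-tone fifth: {FIFTH_12:.2f} cents")
    print(f"19-tone fifth: {FIFTH_19:.2f} cents")
    for degree in range(20):
        semitone, shift = nearest_12_tone_offset(degree)
        print(f"degree {degree:2d} -> semitone {semitone:2d}, shift {shift:+.2f} cents")
```

Under this assumption the 19-tone fifth comes out about 7 cents narrower than the pure fifth, which agrees with the remark that it is "somewhat smaller"; the table of shifts shows the kind of manual adjustment a 12-tone-oriented tool would require.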

Yet another possible application for the theory of scale hierarchies is the notation problem. Usually, notation is thought to serve merely for recording something with the best possible accuracy. Thus, the traditional musical notation is rather incomplete, and there are numerous coding formats in computer music that permit much more performance control. Graphic programming languages (like ColorTalk in the Fractal Design Painter) show the same tendency. I would argue that the role of notation is more than passive recording, since it logically orders the parts of the aesthetic whole. Notation reveals the idea in the art, so that this idea could then be expressed in different ways by many performers, or in the other arts. And the concept of hierarchical categorization might help in designing new notation systems, thus becoming a universal language of design.

I have already mentioned some psychological, linguistic and culturological experiments suggested by the hierarchical approach. Of course, there may be more experimenting along the hierarchical line, which might lead to a better understanding of scaling phenomena in the arts.

Scale hierarchies can be employed in art criticism and theoretical aesthetics. There may be educational applications, either stimulating creativity or enhancing aesthetic perception. These and other possibilities make me hope that the ideas of the hierarchical approach will be of some use for the artists and the people they work for.


1. P. B. Ivanov, "A hierarchical theory of aesthetic perception: Musical scales" Leonardo, 27, no. 5, 417-421 (1994)

2. P. B. Ivanov, "A hierarchical theory of aesthetic perception: Scales in the visual arts" Leonardo Music Journal, 5, 49-55 (1995)

3. L. V. Avdeev and P. B. Ivanov, "A Mathematical model of scale perception" Journal of Moscow Phys. Soc., 3, 331-353 (1993)

4. H. Sonneborn III, "The well-tempered computer" Popular Computing, 3, 218-224 (1983)

5. W. Stoney, "Theoretical possibilities for equally tempered musical systems" The Computer and Music (H.B.Lincoln, ed.). (Ithaca: Cornell Univ., 1970) pp.163-171

6. E. Gagliardo, "Enneadekaphonic music. A new system of harmonic tones" Atti dell'Accademia Ligure, 37, 536-546 (1980)

7. A. de Beer, "The development of 31-tone music" Sonorum Speculum, no.38 (1969); and private communication (1991)

8. N. A. Garbuzov, The zone nature of pitch hearing (Moscow: USSR Acad. of Sci., 1948)

9. N. A. Garbuzov, The zone nature of dynamics hearing (Moscow: Muzgiz, 1955)

10. N. A. Garbuzov, The zone nature of tempo and rhythm (Moscow: USSR Acad. of Sci., 1956)

11. N. A. Garbuzov, The zone nature of timbre hearing (Moscow: Muzgiz, 1956)

12. S. A. Gelfand, Hearing: An introduction to psychological and physiological acoustics (New York: Marcel Dekker, 1981)

13. H. L. F. Helmholtz, On the sensations of tone as a physiological basis for the theory of music (N.Y.: Dover, 1954)

14. L. Vygotsky, Thought and language (Alex Kozulin, ed.) (Cambridge, MA: MIT Press, 1986)

15. B. M. Galeyev, Man, art, technology: The problem of synaesthesia in art (Kazan: Kazan Univ. Press, 1987)

16. A. N. Leontiev, Activity, consciousness and personality (Englewood Cliffs, NJ: Prentice Hall, Inc., 1978)

17. W. Kandinsky, Concerning the spiritual in art (New York: Dover, 1977)

18. R. M. Frumkina, "Psycholinguistic methods for semantic research" Psycholinguistic Problems in Semantics (A. A. Leontiev and A. M. Shakhnarovitch, eds.) (Moscow: Nauka, 1983) pp.46-85

19. P. B. Ivanov, "Physics and psychology in the hierarchical world: Towards physical psychology" IPPE preprint (June 1996)

20. R. Goldblatt, Topoi: The categorial analysis of logic (Amsterdam: North-Holland, 1979)

21. R. Monaco, New American Poetry (N.Y.: McGraw-Hill, 1973)

22. E. Kac and O. Botelho, "Holopoetry and fractal holopoetry: Digital holography as an art medium" Leonardo, 22, 397-402 (1989)

23. E. Bates, "Intentions, conventions, and symbols" The Emergence of Symbols: Cognition and Communication in Infancy (E. Bates et al., eds.) (N.Y.: Academic, 1979) pp.33-68

24. M. G. Gaaze-Rapoport, D. A. Pospelov, and E. T. Semenova, The generation of structures in fairy tales (Moscow: Scientific Council on Cybernetics of the Acad. of Sci. of the USSR, 1980)

25. E. Herzman, "The ancient functional theory of the mode" Problems of Musical Science, Issue 5 (I. Prudnikova, ed.) (Moscow: Sovetsky Kompozitor, 1983) pp.202-223

26. A. V. Bondarko, Functional grammar (Leningrad: Nauka, 1984)

27. P. D. Eimas, "Infant speech perception" Scientific American (Russian edition), no.3, pp.12-19 (1985); translated from: Scientific American, v.252, no.1 (1985)

28. Oswald Szemerenyi, Introduction to comparative linguistics (Russian translation) (Moscow: Progress, 1980)

29. C. James, Contrastive analysis (London: Longman, 1980)

30. Robert Graves, Poems Selected by Himself (Harmondsworth: Penguin, 1976) p.103

31. N. Pereversev, The problems of musical intonation (Moscow: Muzika, 1966)
