Abstract of Final Report
To Army Research Lab from TelArt Inc.
Under contract number xyz
Paul Prueitt & Alexander Zenkin, Draft April 22, 2000
Copyright, Paul Prueitt. Available for Public Review
This Final Report has two parts. The first part addresses issues in the formalization of abstract knowledge. The second part addresses issues in the formalization of personal and public knowledge about complex natural systems. (This April 22 version contains only the proposed Introduction to the two parts.)
We consider how to create a proper statement about the processes specific to Alexander Zenkin’s CCG (Computer Cognitive Graphics) visualization methodology as applied by him in order to generate new knowledge in logic and classical number theory (see white paper titled: SUPERINDUCTION: NEW LOGICAL METHOD FOR MATHEMATICAL PROOFS WITH A COMPUTER, Zenkin & Zenkin 2000). This paper was delivered in a public presentation at George Washington University on April 25, 2000.
Our statement must address what we know about how human creativity is managed. To this purpose, we draw on our mutual experience from the fields of foundations of mathematics, logic, computer science and neuropsychology.
We suppose that human creativity is necessarily at the core of the emerging knowledge sciences. Moreover, new knowledge technologies are being developed and increasingly used in government and in the private sector. It is the claim of this Final Report that the CCG visualization methodology opens a new possibility for machine-aided creativity and knowledge generation. This report to ARL is the first comprehensive report (in English) on how CCG might be applied to the general problem of data mining.
The first applications of CCG were made (1975 - present) in an investigation of the foundations of mathematics and logic. Specifically, CCG methodology has been applied to a generalization of Waring’s Problem and to other problems in classical number theory, logic and proof theory.
Our present work for ARL has focused on generalizing existing CCG methodology. We suggest how the methodology might be applied to additional areas of specific interest to the Army Research Lab. These areas of interest include (1) automated processing of EEG data for neuropsychological research, (2) generation of knowledge about collaborative dialogue occurring in an Internet environment, and (3) generation of knowledge about complex processes such as micro-economic activities.
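For readers unfamiliar with it, the classical Waring's Problem asks, for a fixed exponent k, how many k-th powers of positive integers are needed to write every natural number as a sum. The following is a minimal illustrative sketch in Python. It is not the Zenkin software and does not reproduce the CCG method; the function name min_kth_powers and the upper bound of 100 are assumptions chosen only for illustration. It tabulates, for each n, the least number of k-th powers that sum to n, which is the kind of number-theoretic table that a visualization could then display.

```python
def min_kth_powers(n_max, k):
    """Return a list m where m[n] is the least number of positive
    k-th powers whose sum is n, for 0 <= n <= n_max."""
    # Collect every k-th power that does not exceed n_max.
    powers = []
    base = 1
    while base ** k <= n_max:
        powers.append(base ** k)
        base += 1

    # Dynamic programming: m[0] = 0, and each n reuses smaller results.
    m = [0] + [None] * n_max
    for n in range(1, n_max + 1):
        # 1 is always a k-th power, so the candidate set is never empty.
        m[n] = 1 + min(m[n - p] for p in powers if p <= n)
    return m


# Hypothetical usage: squares (k = 2) and cubes (k = 3) up to 100.
squares = min_kth_powers(100, 2)
cubes = min_kth_powers(100, 3)
print(squares[7], cubes[23])  # 4 9
```

In this range no number needs more than four squares (Lagrange's four-square theorem) and no number needs more than nine cubes; 7 and 23 are the smallest numbers that attain those bounds.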
In the Report’s Part 1, an AS-IS model is created. This AS-IS model is specifically about software that was developed by Anton and Alexander Zenkin and the methodology through which this software has been used by A. Zenkin in the development of new results in classical number theory. From this AS-IS model we see that CCG-aided new knowledge creation results from a transformation of data structure. How this is accomplished, in the specific case of examples from classical number theory, is reviewed in Section 1. In Section 2 we review a proof related to Cantor’s Theorem about the non-denumerability of the real number line. Section 3 addresses general concerns with the proper formulation of private and social knowledge. Cantor’s Theorem is taken as illustrative of a specific class of social knowledge development where visualization might help to see the truth about the methods used in formal inference.
Part 2 of this Final Report provides the results of our generalization of CCG from the domain of elementary number theory and logic to the domain of data mining. Classical data mining methodology organizes past data into correlation frequencies and thus reveals some set of statistical prototypes for components of real world events. CCG is envisioned as an enabler for the production of similar prototypes. However, unlike current data mining methodology, CCG methodology provides a non-numerical prototype in the form of CCG images (see Appendix A: list of CCG images published by A. Zenkin).
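To make the phrase "correlation frequencies" concrete, the following Python sketch shows one simple reading of it: counting how often pairs of features occur together across past event records. This reading, the function name cooccurrence_frequencies, and the toy records are all assumptions made for illustration; the sketch does not describe any particular data mining product or the CCG methodology.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_frequencies(records):
    """Map each unordered feature pair to the fraction of records
    in which both features appear together."""
    pair_counts = Counter()
    for record in records:
        # Sort so that each pair has one canonical ordering.
        for pair in combinations(sorted(set(record)), 2):
            pair_counts[pair] += 1
    total = len(records)
    return {pair: count / total for pair, count in pair_counts.items()}


# Hypothetical event records, each a set of observed feature labels.
records = [
    {"a", "b", "c"},
    {"a", "b"},
    {"b", "c"},
    {"a", "b", "d"},
]
freqs = cooccurrence_frequencies(records)
print(freqs[("a", "b")])  # 0.75
```

Pairs with high frequency are candidates for the numerical "statistical prototypes" mentioned above; the CCG approach is described as offering an image in place of such a number.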
The problem of measurement:
In CCG methodology the perceptual measurement of the structures in classical number theory is weakly controlled by the acuity of the human visual system. As in a historical situation related to modern meta-mathematics and Cantor’s diagonalization proof, perceptual measurement can be fooled (Prueitt, Sense-Making Environments, 2000). If the ‘instrumental’ measurement of the world is not proper, then knowledge artifacts are generated that can still appear sensible. These artifacts and the perception of humans can be validated through an iterative vetting process. Without such vetting, the generated relevant structure may be illusory. Thus computer-aided sense-making can lead to catastrophic consequences.
By instrumental measurement we mean the use of physical devices to measure and report occurrences of specific kinds, such as the specific states of a computer processor or microelectrode. Instrumental measurement is the simplest type of measurement and is done within a framework where an assumption of Newtonian-type interactions is plausible.
Our group makes an important and absolutely necessary distinction between simple (or instrumental) measurement and complex (or perceptual) measurement. This distinction can be grounded in scientific scholarship. However, the distinction is rather an issue of social philosophy and thus we will not attempt here to make what might be considered as proper reference to the scholarship.
It seems clear that the natural system must be measured in a very specific fashion to enable the development of relevant information. For example, EEG data may, or may not, allow proper measurement of human emotional states. The currently understood correlation between EEG events and emotional events is tenuous.
To address this ‘measurement’ problem we first develop an intermediate organization of the data based on some proper annotation of invariance. Instrumental measurement is vetted by a complex process that involves human perception and thought about the correlational status between the events in the natural system and the events organized out of the instrumentally derived data.
The specific outcome of an initial organization of instrumentally derived data can be considered to be somewhat arbitrary since an optimal organization of structure is not known to exist. However, specific vetting processes can be discovered and employed to convert informational artifacts into knowledge, which is eventually encoded in some way in the form of knowledge artifacts. Thus artifacts can extend the reliable knowledge owned by a human or shared within a community.
We look for a system of tokens that can be used to enumerate salient features selected by a domain expert. In Zenkin’s use of CCG for mathematical knowledge discovery, we conjecture that an informal (tacit) sign system has been generated during his personal experience with the CCG system. Encoding explicit knowledge from tacit knowledge is called ‘vetting’. CCG methodology is based on the assumption that mental images have a class of non-linguistic causes. The vetting process brings these non-linguistic causes into a more objective and shareable status in the form of tokenized artifacts. The correspondence between mental images and CCG images in a database of CCG images is to be used as a knowledge base for knowledge generation.
Vetting of tacit knowledge from expert(s) working with a CCG system is the subject of certain trends in scientific scholarship. We will not discuss the scholarship directly, except to reference the works of the great Russian semioticians Dmitri Pospelov and Victor Finn. These individuals, and colleagues, developed an extension of Peirce’s and Mill’s logic in order to create a ‘control language’ for the encoding of human knowledge about complex objects of investigation. Our group follows this approach, capturing some of the simplest elements of situational control (second-order cybernetics) and quasi-axiomatic reasoning.
Descriptive enumeration is used to delineate a set of axiom-like tokens from which weak logics can be generated. A logic is considered to be ‘weak’ if it contains both reliable logic and plausible logic. Our reasoning is that once a relatively complete set of tokens is available, standard CCG transformations of structure might be used to gain intuitions about what a specific structural invariance, in instrumentally derived measurement, means in the context of a specified object of investigation. There is thus a classical knowledge engineering aspect to the envisioned knowledge creation technology.
Tacit to explicit knowledge transformations
This section outlines our approach to the problem of vetting personal and scientific knowledge about human emotional states using human visual inspection of EEG structural invariance.
Specifically, there is a problem of how to transform raw EEG data into some ‘informational structures’. The set of structure ‘prototypes’ plays a role similar to the axioms of elementary number theory. The complete token system then corresponds loosely to something we wish to learn, or to evolve into public knowledge. Initially we will be mostly blind as to what patterns in the EEG data might be meaningful to our specific investigation.
Initially the EEG data is seen to have only a partial representation of the structures that are needed to create a sign system and therefore a control language, in the sense described by Pospelov. In ‘Situational Control’ (1985, published only in Russian) Pospelov developed a report on the use of open loop second-order cybernetic systems to control complex natural systems such as cities or social units. This cybernetic system creates a man-machine interface and a control language that allows collective knowledge to be vetted into a database of common artifacts. Pospelov’s description appears to have preceded all other knowledge engineering systems based on applied semiotics. It is our opinion that his description can be a proper basis for practical languages for use in new knowledge technologies. The generation of the tokens for axiomatization into a control language can be accomplished by generating a CCG database of images.
Our group, working under this ARL contract, has conjectured that after EEG informational structures are partially obtained, this incomplete "sign system of patterns" can be used to generate a number (perhaps between 50 and 150) of new CCG images, provided that the domain expert has a familiarity with how the sign system is itself generated (i.e., with the meaning of the informational structures in the context of the data considered) and has in mind some principled investigation of the causes of the patterns.
The CCG technology’s role is to mediate between (1) an evolving control language, (2) a database of semiotic (CCG) images, and (3) actual human knowledge about the Object of Investigation.
A second problem has to do with how quickly the incoming data’s representation, as information, ceases to be relevant to the awareness of the domain expert. There are two aspects of this problem. The first is the uncertainty we have about the proper control language and the incompleteness of the knowledge we are seeking. The second is the non-stationarity of the natural system we are investigating.
Even if the measurement of the world is perfectly relevant, in the best sense, no new knowledge is generated unless there is a process of human awareness. The process of human awareness has both right brain and left brain character, and the balance between the two is addressed in an analysis of CCG methodology and implementations. The analysis is reported in Part 2.
This relationship between sign semiotics and image semiotics is central to the problem of understanding the nature of human and social knowledge. We say that information can be derived from a proper measurement of the past. However, knowledge requires an interpretation of information and/or experience in order to exist for the first time. This knowledge may be illusory and therefore must be (1) properly vetted and (2) validated.
New knowledge is created in a mental imaging process. By understanding this imaging process more fully, we may facilitate the creation of new private knowledge. With proper knowledge engineering methodology we may enable this tacit knowledge to be vetted into public knowledge.
On the nature of public knowledge
General systems properties must be used to grasp how new knowledge creation occurs in real social systems. For example, political aspects of social interactions may result in actors changing the cause of actions. The experience of knowledge may change the nature of one’s personal knowledge about the present situation.
Knowledge Management (KM) is currently a buzzword in various communities. We hold that KM (thought of as the management of knowledge sharing) is primarily not a technology but rather a process of truth finding and community building (Prueitt, Sense-making Environments). Complexity science is identifying to the scholarly community certain facts about the nature of transformations of information between a subjective experience and an objective reality. These transformations are central to human communication within communities of shared interests. Karl Popper and Charles Sanders Peirce both developed the view that subjective and objective experience work together to produce both public and private knowledge experience.
CCG methodology, in conjunction with semiotic control languages and knowledge engineering, allows one to discuss knowledge technology properly. We point out that commercial KM, in the early 2000s, is not in the same class as the current generation of Information Technology (IT), since current IT systems only serve as an enabler of knowledge sharing. We claim that the marketing of KM by vendors has muddied the perception about what might be properly called knowledge science and what might be envisioned as proper knowledge technologies. Our group hopes that the following Final Report restores some clarity.