From Federated Data Management to Information Production

 

 

 

Paul S. Prueitt, PhD

Research Professor, George Washington University

 

 

 

 

 

 

October 26, 2003

 

Revised December 18, 2005

 

 

 

Overview

 

Topic Maps and the Virtual Model

 

Latency and Diversity of Viewpoint

 

Formative Ontology Technology is Identified on the Horizon

 

The Issue of FEA and its (non) support for Logic over Schemas

 

References

 

 

 

 

Note (December 18, 2005): A set of footnotes was added by the author in the revision of December 18, 2005.  The 2003 references to materials on the Internet are labeled “[n]”.

 

 

 

 

 

 

Available on the web at

 

http://www.bcngroup.org/admin/technologyReviews/InformationalLatency.htm


 

Overview

 

New Federal Enterprise Architecture (FEA) guidelines focus on federated management of structured data.  The FEA is oriented towards managing existing data resources that are already structured into databases.  But producing new information that is NOT first encoded into a relational database is technically possible.  The FEA is largely silent on this issue.  Related issues include reductionism in computer science [link 1] and the problems that develop from reductionism in business practices [link 2].

 

Technical issues related to a Human-centric Information Production technology are covered in the OntologyStream Inc. research notes on Formative and Differential Ontology [link 3].  Examples of data sources that are not relational databases include natural language text archives, raw data from medical research, and physically instrumented data sources such as direct measurements of complex physical phenomena from physical sensors [link 4].

 

The relevant issues, for Differential and Formative Ontology, are grounded in a differentiation between pure mathematics and computer algorithms [1].

 

Pure mathematics can assume a continuum by adopting notions found in real analysis and topology [2].  The problems in making the necessary assumption are similar to the problem of inducing a formalism, a problem that occurs even when the definition of the counting numbers is induced [3].

 

One can use Hilbert space mathematics to create a representation of latent structure in data sources that are not relational databases [4].

 

The representation of latent structure involves continuum mathematics, specifically Hilbert space mathematics.  This representation uses conceptual abstraction that is not reducible to discrete mathematics and thus is NOT completely reducible to computer states and transitions between computer states.  It is certainly NOT reducible to a lambda calculus [5].

 

Using new work on the foundations of mathematics, a virtual model is maintained as an abstraction about categories of information, both in a continuous mathematical formalism and in human minds.  The new work is placed among the foundational materials of mathematics because it is hoped that mathematics can evolve beyond the Hilbert programme; Hilbert's 13th problem shows precisely why a new foundation needs to be laid.

 

Our approach is more sophisticated than any of the traditional techniques within the current generation of Information Technology.  The use of continuum mathematics moves the model of information out of the computer, while allowing a discrete homology to exist in the computer.  The homology maintains a correspondence to the structure and nature of tacit human knowledge in real time.  The connection to the experience of an individual human is seen as critical in settling issues related to perceptual measurement and to the metabolic induction processes found to be key in maintaining human awareness.

 

Continuum mathematics must lie outside of von Neumann computer science, but it is still a formal construction that has the nature of abstraction.  Quantum and nano computing are not von Neumann, because they are sensitive to non-local effects.

 

A process is followed whereby the constructions from Differential and Formative Ontology are “made human-like”, or reified.  Our work uses elements of the Topic Map standard [13].  But beyond the Topic Map standard, the scientists of the Behavioral Computational Neuroscience Group (BCNGroup.org) have created a radically new type of Information Production [5] capability based on a model of human memory, awareness and anticipation [6].

 

Some scholars claim that standardizing the FEA on Information Technology is a mistake.  They believe that the FEA standardization reinforces the illusion that Information Technology can solve more problems than it actually can, even if funded at an infinite level for an infinite period of time.

 

A specific line of argument is presented that the current generation of IT is in fact not capable of the types of agility and shifts in viewpoint that are necessary to support advanced notions of participatory electronic government.  The same line of argument leads to conclusions about the inappropriateness of the current generation of IT as an intelligence technology for meeting the challenges of asymmetric threats [7].

 

Our argument addresses ideological fundamentalism [8].  The argument challenges some social as well as philosophical perspectives.  It is made while holding the position that fundamentalism has specific cultural values when not taken to an extreme.  But in the extreme, fundamentalism is observed to be intolerant of multi-cultural viewpoints and opinions.  In a participatory democracy, rule making can underlie authority.  The capabilities for rule making should be transparent.  This is not currently true of many of the IT standards processes.

 

Topic Maps and the Virtual Model

 

Model driven meta-data management promises gains due to the reduction of integration costs.  In many new commercial systems, a virtual model is represented independently from the physical model in databases.  The “virtual model” decouples applications from physical data sources.  A code system like UDEF (Universal Data Element Framework), developed by the Open Group, is an example of a virtual model without a database schema.  The code system is simply a set of codes that have references to categories of data elements.
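The decoupling described above can be illustrated with a minimal sketch.  It assumes a tiny code system whose codes refer to categories of data elements, plus per-source bindings from codes to column names.  The codes, category names, source names, and rows are all hypothetical; they are not real UDEF identifiers and are not taken from any product.

```python
# Hypothetical sketch of a code system acting as a virtual model.
# None of the codes or names below are real UDEF identifiers.

# Virtual model: codes that refer to categories of data elements (no schema).
CODE_SYSTEM = {
    "C1": "person family name",
    "C2": "organization name",
}

# Physical models: each source binds its own column names to the shared codes.
SOURCE_BINDINGS = {
    "hr_db": {"C1": "LAST_NM", "C2": "EMPLOYER"},
    "crm_db": {"C1": "surname", "C2": "company"},
}

# Stand-in rows for two physically different databases.
SOURCE_ROWS = {
    "hr_db": [{"LAST_NM": "Smith", "EMPLOYER": "Acme"}],
    "crm_db": [{"surname": "Jones", "company": "Globex"}],
}


def fetch(code):
    """Return the values for one code from every source that binds it."""
    if code not in CODE_SYSTEM:
        raise KeyError(f"unknown code: {code}")
    values = []
    for source, binding in SOURCE_BINDINGS.items():
        column = binding.get(code)
        if column is None:
            continue  # this source does not carry the data element
        values.extend(row[column] for row in SOURCE_ROWS[source])
    return values
```

An application that calls fetch("C1") never sees a column name, so a physical source can be renamed or replaced by editing only its binding.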

 

Financial gains are derived from an enhancement of our understanding of the virtual models as expressed in relational databases.  These virtual models reduce the latency in information production and provide a degree of interoperability.  They provide a physical layer independent from the platform in which the data is organized.  So the industry’s homework appears to be completed and a new and important capability delivered. 

 

The so-called semantic dimension to data is already partially captured in simply having a community adoption of a specific code for data elements. 

 

There are some additional assignments, however.  The resolution of naturally occurring and critically important terminological ambiguity [9] must be managed.  This management requirement develops because the same data element can be used differently at different times.  New data structures may also come into play.  In both cases, the modification of the data code is possible if there is a framework to work within, as there is in UDEF.  The work by Sandy Klausner on what he has called CoreTalk is an even nicer solution, but one that had not found community adoption as of 2005.
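One way to picture a data element that is used differently at different times is a code whose meaning is revised by date.  The sketch below is an assumption made for illustration only: the class name, the ISO date strings, and the example meanings are hypothetical, and it does not describe how UDEF or CoreTalk handle revision.

```python
from bisect import bisect_right


class VersionedCode:
    """A data code whose meaning may be revised over time (hypothetical)."""

    def __init__(self):
        self._dates = []     # ISO dates, kept sorted
        self._meanings = []  # meaning in force from the matching date onward

    def revise(self, date, meaning):
        """Record that the code carries this meaning from the given date."""
        i = bisect_right(self._dates, date)
        self._dates.insert(i, date)
        self._meanings.insert(i, meaning)

    def meaning_on(self, date):
        """Return the meaning that was in force on the given date."""
        i = bisect_right(self._dates, date)
        if i == 0:
            raise LookupError(f"no meaning recorded on or before {date}")
        return self._meanings[i - 1]


element = VersionedCode()
element.revise("2003-10-26", "customer: a party that has placed an order")
element.revise("2005-12-18", "customer: any party with an open account")
```

A query that carries its own date can then be interpreted against the meaning that applied when the data was written, rather than against the latest definition.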

 

Several metadata companies make a very good differentiation between “virtual models” of metadata and “physical models” of metadata.  Their orientation is towards modeling the data organization found in the existing generation of relational databases.  In this orientation, one can use a Topic Map distinction between the organization of data as XML and the organization of mental constructs experienced in the mind of humans.  This distinction facilitates the development of Human-centric Information Production.  In early 2005 the BCNGroup proposed this paradigm and developed a RoadMap for its adoption. [6]

 

With the Topic Map distinction, one can “step away from” computer representation and formal logic and make a simple interpretation using cognitive abilities.  The individual human easily makes a real time variation in the underlying organization of information. 

 

Virtual models, and the separation of these virtual models from the physical organization of data, are only a partial step if the virtual models merely reflect organizational principles achieved in the relational database schema.  When compared to natural language, the FEA standards are less than what is required.  Humans, with tacit knowledge, understand linguistic variation.  The computer cannot be assumed to have access to human tacit knowledge.

 

The BCNGroup Roadmap specifies how latent structure can be revealed and used to make highly relevant queries to ontology and data codes.

 

Latent semantic technology has flexibility that allows human control of interpretative nuance.  My work on differential ontology is used to inspect structural information about linguistic variation.  This is done in several ways, but each of these ways depends on the topological features of continuum mathematics.  Because scientists understood topological theory, we discovered a surprising method for implicit encoding of structural information.  The method is flexible and perhaps optimal [6].  Once the implicit encoding seems right, in the judgment of a human, the structural information is projected into discrete mathematics in the form of (class:object) pairs.
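The final projection step can be sketched in a few lines.  This is not the method referenced above, which is not specified in this paper; it substitutes a plain term-by-document vector model with cosine similarity for the Hilbert space encoding.  The toy documents, the class names, and the seed terms are hypothetical, and the human judgment is represented only by the choice of seed term for each class.

```python
import math

# Hypothetical toy corpus standing in for a natural language text archive.
DOCS = [
    "tax law court ruling",
    "court ruling appeal",
    "sensor signal noise",
    "signal noise filter",
]

# Human judgment (assumed): one seed term chosen per class.
SEEDS = {"legal": "court", "instrument": "signal"}


def term_vectors(docs):
    """Implicit encoding: one occurrence-count vector per term."""
    vocab = sorted({word for doc in docs for word in doc.split()})
    return {t: [doc.split().count(t) for doc in docs] for t in vocab}


def cosine(u, v):
    """Cosine of the angle between two vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def project(docs, seeds):
    """Project the implicit encoding into discrete (class, object) pairs."""
    vectors = term_vectors(docs)
    pairs = []
    for term, vec in vectors.items():
        best = max(seeds, key=lambda c: cosine(vec, vectors[seeds[c]]))
        pairs.append((best, term))
    return pairs


PAIRS = project(DOCS, SEEDS)
```

Changing a seed term changes the resulting pairs without touching the stored documents, which is the sense in which the interpretation stays under human control in this sketch.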

 

Using our new methods, the control and shaping of variation within an implicit encoding of data may be made via formal constructs.  These constructs include a class of transformations over the set of all possible organizational schemas [10]. 

 

Our intuitions, as scientists, suggested that these transformations require topological features that are not always present in discrete mathematics.  For example, in a continuum, no matter how close two points are, one can always insert a new point in between the two points.  If data is stored in adjacent computer registers, it is not possible to insert a register in between.  One can pick up the data and rewrite it so that the physical model reflects the updated data structure.  But this pick-up and rewrite operation requires time and expense.
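The contrast can be made concrete with a small sketch.  Exact fractions are used here as a stand-in for the continuum; they illustrate only the "always a point in between" property and are not themselves a continuum.  The function names are hypothetical, and the list is only a rough model of adjacent registers.

```python
from fractions import Fraction


def midpoint(a, b):
    """Between any two distinct points there is always another point."""
    return (a + b) / 2


def insert_with_shift(cells, index, value):
    """Insert into contiguous storage; return how many cells had to move."""
    cells.append(None)  # grow the block by one cell at the end
    moved = 0
    for i in range(len(cells) - 1, index, -1):
        cells[i] = cells[i - 1]  # pick up and rewrite one cell
        moved += 1
    cells[index] = value
    return moved


between = midpoint(Fraction(1, 3), Fraction(1, 2))

registers = [10, 20, 30, 40]
cells_moved = insert_with_shift(registers, 1, 15)
```

Finding the in-between point costs one arithmetic step regardless of how much data exists, while the insert into adjacent cells rewrites every cell after the insertion point.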

 

We then made an additional discovery.  In what are called Ontology Reference Bases (ORBs), the time and expense involved in performing Differential and Formative Ontology are minimized in a surprising fashion [11].

 


 

Latency and Diversity of Viewpoint

 

Federated data management will reduce informational latency only to a certain point.  Conciliation over a diversity of viewpoints from stakeholders must be achieved if latency of information is to be reduced to near zero.  The reason is that cultural resistance to information occurs if the points of view of stakeholders are marginalized or discounted by an all-powerful authority.  Federated data management has been imposed on communities in exactly this fashion. 

 

Because a diversity of viewpoints is natural to the human condition, information will flow in an organization only as fast as stakeholders feel comfortable about the fidelity of representation.  Conflict occurs when one point of view is made dominant over all others.  These conflicts may surface in the form of delays in information sharing.  Informational latency increases when stakeholders feel that their personal nuances are not represented in controlled vocabularies.

 

In addition to the cultural resistance that is observed to occur, scientists observe the phenomenon of false sense making [12].  Managing organizational structures may not provide access to needed information, because the specific information may not exist in the databases.  For example, information has ambiguity that reflects real properties of the natural world [9].  Information about ambiguity may not be present in the computer.  Sometimes information cannot be obtained because real world events have not yet occurred.  If the information is about how the markets are changing due to the introduction of a new tax law, then it is possible that the relevant interpretations by the courts have not yet been made.  In many everyday circumstances, the needed information cannot exist in organizational structures that have been federated, because federated structure must be crisp and precise.

 

There are two types of information.  The first type is the information that is known and federated into a structured database. The second type is the information that is not known and might not be compatibly encoded into pre-existing federated structure.  Of the two types, it can be the case that the second type is far more important. 

 

In most information technology, human viewpoints tend to be normalized and stakeholder nuance is lost.  This problem is technically addressed by separating the concept of mental constructs (experienced by humans) from the machine encoding of information.  The issue of latency of information is addressed by imposing a structure on the organization of localized information.  The imposition of structure is made in an agile and flexible fashion.  A dependency on human reification facilitates natural shifts in attention and in informational organization over the information stores.

 

The Topic Maps standard provides one standard means to instrument differences between human tacit knowledge (mental constructs) and computer data structures.  Differential ontology works with this standard in order to provide mathematics and the social/psychological science foundation to information production systems. 

 

Using ORBs, knowledge mediation occurs when differential ontology exposes the relationship between private experiences of knowledge and abstractions made about categories of information.  This categorical abstraction can have two forms, one in Hilbert space mathematics and one in discrete mathematics.  Reconciliation of how private viewpoints are expressed is then instrumented using both forms of categorical abstraction.

 


 

Formative Ontology Technology is Identified on the Horizon 

 

ORBs preserve the diversity of informational organization by instrumenting translations between contextual terminologies.  In some cases, situational context has unique elements that have not been explicitly planned for.  The core notion of differential ontology is that implicit and explicit representation can be separated, and the functional properties of ORBs made to match the nature of real world phenomena.
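The idea of translating between contextual terminologies, while leaving room for elements that were never planned for, can be sketched as follows.  This is a hypothetical illustration and not a description of how an ORB is implemented; the class name, contexts, terms, and concept label are all invented.

```python
class TerminologyBridge:
    """Translate terms between contexts through shared concepts (hypothetical)."""

    def __init__(self):
        self._to_concept = {}    # (context, term) -> concept
        self._from_concept = {}  # (context, concept) -> term

    def bind(self, context, term, concept):
        """Record that a community of practice uses this term for a concept."""
        self._to_concept[(context, term)] = concept
        self._from_concept[(context, concept)] = term

    def translate(self, term, source, target):
        """Return the target context's term, or None if no mapping exists."""
        concept = self._to_concept.get((source, term))
        if concept is None:
            return None  # unplanned element: left for human interpretation
        return self._from_concept.get((target, concept))


bridge = TerminologyBridge()
bridge.bind("clinic", "patient", "person-served")
bridge.bind("billing", "account holder", "person-served")
```

Each community keeps its own vocabulary; a result of None marks a term that the explicit structure does not yet cover, rather than forcing it into an existing category.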

 

The required explicit structure can come to exist as controlled vocabularies and various support constructs for machine learning and inference.   But the path from implicit representation depends on latent semantic technologies and a conversion process from these Hilbert space formalisms into (class:object) pairs.

 

 

Figure 1: Diagram used to discuss Differential Ontology [14]

 

Extensive scholarship, in the ecological psychology and cultural anthropology literatures, points out that terminological use within communities of practice binds the community together and expresses far more than can be reliably encoded into computer structured information [15].

 

Innovative new technology provides collaborative support for changes in a virtual model of any taxonomy, ontology, controlled vocabulary, or database schema, and for transitions between virtual models that have structural differences.  The support involves breakthroughs related to how information organization can be shifted at the most fundamental level, and design implementations that support cultural activities related to human reconciliation processes [16].

 

The National Project to Establish the Knowledge Sciences connects the dots [17].  Many groups recognize cultural limitations as being the essential class of limitations.  This class of limitation inhibits the expression of true democracy in a real time context.  But the limitation is not merely a consequence of fundamentalism in social behavior.  Mathematicians and logicians recognize the limitation as it is discussed in the thesis of the logician Kurt Gödel (1906 – 1978) on completeness and consistency in logical and formal systems.  Psychologists see this as an issue of action-perception cycles, where one-half of the process has to reside in the bits and bytes of a computer program state.


 

The Issue of FEA and its (non) support for Logic over Schemas

 

Many in the government are working to further enable the citizen’s role in information publishing, truth making and rule making.  We observe that, except in civil elections, these functions are now enacted by only a few.  Democracy is sometimes regarded as a process of periodically extending rule-making power to a few elected individuals and those they appoint.  These individuals have had a continuing influence on the information technology sector.  It is due to their good intentions that electronic government is discussed.

 

But is infrastructure support for agile rule making still a “hidden” signal in the noise?  Many believe that the answer is yes.  However, there is sufficient pressure from government working groups that the FEA is not completely agnostic about schema diversity and the transformation of existing informational organization by the citizen.

 

 


 

References

 

[1] http://www.bcngroup.org/area2/KSF/HIP.htm

[2] http://www.bcngroup.org/area2/KSF/Notation/researchNotes/note25.htm

[3] http://www.bcngroup.org/area2/KSF/KSFArchitecture.htm

[4] http://www.ontologystream.com/beads/frameworks/pondFramework.htm

[6] http://www.bcngroup.org/area2/KSF/Notation/notation.htm

[7] http://www.bcngroup.org/area3/pprueitt/kmbook/Chapter4.htm

[8] http://www.ontologystream.com/area1/MemeticOntology/mappingSocialSymbols.htm

[9] http://www.bcngroup.org/area2/KSF/Notation/notation.htm#_Section_4.1:_Description

[10] http://www.bcngroup.org/area2/KSF/Notation/notation.htm#_Section_3:_

[11] http://www.ontologystream.com/cA/index.htm

[12] http://www.bcngroup.org/area3/pprueitt/private/KM_files/frame.htm

[13] http://www.topicmaps.org

[14] http://www.bcngroup.org/procurementModel/to-be/dof.htm

[15] http://www.bcngroup.org/area2/KSF/KSFconference.htm

[16] http://www.ontologystream.com/area2/KSF/KnowledgeScience.htm

[17] http://www.bcngroup.org/area2/KSF/nationalProject.htm

 

 

 



[1] One half of my 1989 PhD thesis “Mathematical models of biological intelligence exhibiting learning” is on mathematical homology between discrete switching networks and systems of continuous differential equations. 

[2] Specifically the work by Dedekind on infinite sequences of real numbers converging to an irrational number. 

[3] See any text on Gödel’s work on completeness and consistency.  In particular see Chapters 2, 3, and 4 of Penrose, Roger (1989) “The Emperor’s New Mind”.

[4] The measurement of latent structure is not always called the same thing, and there are several different mathematical models, primarily (1) algebraic or (2) stochastic. 

[5] See http://en.wikipedia.org/wiki/Lambda_calculus for a description of lambda calculus.

[6] The RoadMap was serialized and published by datawarehouse.com (dataWarehouse.com, brought to you by DM Review).  It was revised and published at:  http://www.bcngroup.org/area1/2005beads/GIF/RoadMap.htm