Global Information Framework and

Knowledge Management

 

 

Revised slightly, with footnotes, April 16, 2006

 

             

 

 

A prototype

 

Public Document

Updated Friday, July 15, 2005, Version 9.8

 

Point of Contact: Dr. Paul S. Prueitt, psp@ontologystream.com

 

Behavioral Computational Neuroscience Group

Development Committee

 


Global Information Framework and

Knowledge Management

 

A prototype

 

Position Paper

 

Behavioral Computational Neuroscience Group

Development Committee

 

Table of Contents

 

 

Why a roadmap is needed for semantic technology adoption

Executive overview

Section 1: Proof of concept

Section 2: Context and objectives

Section 3: Ontology architecture

Section 4: The ontology encoding innovation

Section 5: Informational convolution

Section 6: The minimal deployment

Section 7: Regularity in report generation

Section 8: Predictive Analysis Methodology

Section 9: A future anticipatory technology

Section 10: The Second School of Semantic Science

Advisory Committee and Companies

Appendix A: Statement of Purpose

Appendix B: Project Outline

Appendix C: Semantic Science

Appendix D: Knowledge Sharing Foundation Core

Keywords

Copyright 2005  BCNGroup

Why a roadmap is needed for semantic technology adoption

 

In 2005 everyone knows what a horse and buggy is and what an automobile is.  Each person in our society knows the story of the emergence of the automobile manufacturing business sector and the American love affair with the automobile. 

We do not know to what extent meaning might be captured in a “semantic web[1]”. We have not experienced anything that informs us about what semantic technology might become. Relevant issues in linguistics, social theory and the nature of science are not well known in departments of computer science.  Most of computer science is shaped by engineering theory and scientific reductionism.

A roadmap is needed.  One is provided here by a group of natural scientists and mathematicians.

We propose a broad program to establish the intellectual and technological foundation for a science of knowledge systems, and to integrate and deliver to the marketplace information science based on the proper use of what is called “machine encoded ontology”. 

The first delivery of ontology-based technology can be accomplished within a few months, given our previous work and the existing technology components.


BCNGroup scientists recommend a demonstration program that has three parts organized in two phases.

Phase 1

1) Technology integration

 

Phase 2:

2) Advanced knowledge management certification

3) Ontology development.

 

Our proposal addresses these parts one at a time; the first part is proposed at a cost of $750,000 over a period of six months. 

The first part depends on our (already) having completed a principled selection of advanced knowledge management systems, semantic extraction systems and data persistence systems. 

In each case patents protect the underlying technology and allow a science advisory board to develop an in-depth description of each technology component.  We have developed curriculum that exposits the philosophical principles on which the software user interfaces depend.  This curriculum is being readied for delivery as knowledge management certification and as textbooks designed for university curricula. 

Beyond the first deployment, the concept of a knowledge sharing foundation [2] is proposed, and is being readied as a “Red Hat” type business model.  This is not, however, designed as a business.  Rather the knowledge sharing foundation is designed as a cultural institution directed to found the knowledge sciences and to develop curriculum that helps average Americans and individuals all over the world. 

 

 

 


Executive overview

 

A human-centric information production [3] capability is defined, and existing commercial software is identified.  A distributed information system is specified that will enable the real time representation and sharing of human knowledge about situations.  A global information framework is used as a human control interface over complex ontology. 

Example: Aircraft landing at a specific airport will express behavioral patterns.  An airport ontology and an aircraft landing ontology are used to provide an interpretation of the behavioral patterns expressed in each landing.  Over time, the observed behavioral patterns lead to early diagnosis of risks.  A human looks at the patterns and makes judgments based on personally held tacit knowledge.  New concepts about the patterns are encoded as metadata.  The patterns themselves are encoded as new behavioral ontology reflecting the history of observations about aircraft landing at a specific airport. 

Inputs come from any reporting software system.  An example is US Customs and Border Protection reports on search and targeting operations or administrative rulings on tariff codes.  Inputs can be developed from any event reporting mechanism, whether written reports or reports that involve the manual development or modification of ontology. 

Outputs include a computable and visualizable historical record about situations reported.  The record is expressed as ontology and human visualization of this record is provided. 

Visualization of graphical structure requires human perception to evoke an experience of knowledge about a specific situation, or an event space.  Graph labels suggest meaning in much the same way as sentences suggest meaning when humans compose sentences. 

Figure 2: Visualization of concept indicators in a collection of fables

Full language independence is not achieved, for reasons that have to do with differences between natural languages.  Each natural language has language-dependent characteristics.  In principle, our technology establishes a foundation for using ontological models having correspondences to sets of concept representations.  

Ontological models are envisioned as being a type of “interlingua”, not of words composed in grammar, but as systems of signs that are interpreted by humans in various natural language settings. 

Ontological models provide metadata that indicate where possible misunderstanding might occur; thus ontological models provide a complex formalism to help in the translation and transcription of meaning from one human language to another.  Like mathematical models, ontologies are useful as enablers of computations based on the structure of the defined sets of concepts.  In mathematics, the concepts are those related to field dynamics and to the conservation laws of physics.  The ontological models we are developing are about more complex subjects, such as the intentions of a determined enemy to bring elements of bioterrorism into the United States. 

Ontology extends Hilbert mathematics from deterministic systems to complex systems having uncertainty and under-determined constraints.  Representing real advances in objective science, ontological models provide a computational basis for the real time projection of human knowledge within communities of practice.  Medical informatics and bioinformatics are demonstrating value in two early utilizations of ontological modeling.  A construction over conceptual representations of topics in social discourse has been applied to medical literatures.  These constructions bring relevant information to medical research communities.  Bio-event defense architecture has been outlined and proposed by members of the BCNGroup.  Application in pharmaceutical research and development can be identified under non-disclosure agreements.  These applications involve conceptual modeling as well as direct modeling of the ontology involved in gene expression. 

Subject matter indicators are represented in computable ontology constructions, including the most simple, and yet most powerful, one based on a concept label and one to n (an undetermined integer) words, word stems or word phrases.  These concept labels are formally represented in several knowledge base technologies as an n-tuple. 

< a0, a1, …, an >
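
As a minimal illustration (the concept, terms, and function name below are invented for this sketch and are not drawn from any of the vendor products named in this paper), such an n-tuple can be held directly in Python:

# A subject matter indicator as an n-tuple: the first element is the
# concept label, the remaining elements are words, stems or phrases
# that indicate the concept. All values here are illustrative.
bioterrorism_indicator = (
    "bioterrorism",                       # a0: concept label
    "anthrax", "toxin", "biological agent", "pathogen release",
)

def matches(indicator, text):
    # Return True if any of the indicator's terms occurs in the text.
    label, *terms = indicator
    lowered = text.lower()
    return any(term in lowered for term in terms)

print(matches(bioterrorism_indicator, "Report notes a suspected pathogen release."))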

Our proposed deployment takes these technologies and integrates them.  We also add a data mining process in which fast convolution transforms have both a mathematical formulation and operational realizations as precise data retrieval methods.

Convolution operators are computed quickly over the elements of a set of concept representations.  The convolution operator results in the separation of context and the merging of ontology.  One pass over an in-memory data structure is sufficient. 

Figure 1: Co-occurrence patterns with “hash”
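
The following is a minimal single-pass sketch of the kind of co-occurrence accumulation described above; it is a simplified stand-in for the convolution operator, and the report strings are invented:

# One pass over an in-memory collection of reports, accumulating
# co-occurrence counts for unordered term pairs.
from collections import defaultdict
from itertools import combinations

def cooccurrence_pass(reports):
    counts = defaultdict(int)
    for report in reports:                      # a single pass over the data
        terms = sorted(set(report.lower().split()))
        for a, b in combinations(terms, 2):     # unordered term pairs
            counts[(a, b)] += 1
    return counts

reports = ["container flagged for search", "container search at port"]
for pair, n in cooccurrence_pass(reports).items():
    print(pair, n)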

From these technologies, sets of concept representations are developed and made accessible to search and analysis.  Several semantic extraction tools are integrated with a commercially available RDF repository [4].  The output of these semantic extraction tools is a set of subject matter indicators, represented as RDF statements.  The basic set operations are available as formalism, and thus the standardization of these constructions is a matter of public record. 
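
To show the shape of such output, the sketch below expresses one subject matter indicator as RDF statements, using the open source rdflib library as a generic stand-in for the repository; the namespace and predicate names are invented for illustration:

# Illustrative only: rdflib stands in for the RDF repository proposed in
# this paper; the example.org namespace and predicates are invented.
from rdflib import Graph, Literal, Namespace, RDF

EX = Namespace("http://example.org/indicator/")
g = Graph()
concept = EX["bioterrorism"]
g.add((concept, RDF.type, EX.SubjectMatterIndicator))
for term in ["anthrax", "toxin", "pathogen release"]:
    g.add((concept, EX.indicatedBy, Literal(term)))

print(g.serialize(format="turtle"))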

It is vital to recognize that the subject matter indicators are given final interpretations by knowledgeable humans. 

Our Phase 1 proposal is to integrate five stand-alone COTS [5] products (two semantic extraction systems, two knowledge management systems, and a taxonomy acquisition system) with an Open Source ontology editing tool (Protégé), an Open Source document repository system, and a simple RDF (Resource Description Framework) data repository.

Knowledge management systems address the need to enhance the quality of reporting as well as to make managed vocabularies available as an interface between ontology constructions and normal human use.  Semantic extraction systems convert free form text into structured metadata.  Existing taxonomy is drawn from across the world. 

These products are to be configured using J2EE web services and a server binding protocol based on a high level scripting language called Python. 

An advanced RDF repository is used to provide persistent storage for organized sets of concept representations. Concept representations and organized collections of these representations are convertible to standard ontology representation languages.  A formal theory about co-occurrence patterns is used to express a category of mathematical constructions called convolution operators. 


Section 1:  Proof of concept

 

The customer wants a global information framework that:

1.         Has a high degree of language independence.

2.         Compresses data regularity, primarily co-occurrence patterns, into structurally organized concept representations

3.         Converts uncertain, sketchy, sometimes incorrect instance information into clear, concise and complete reports about a situation. 

4.         Provides a means to develop global synthesis over a large event space. 

 

Ontological modeling also provides new types of information technology features that are not anticipated by the customer.  For example, the set of concept representations, and the ontological model encoding structures, allows access to past information instantaneously, without relational database indexing.  

The BCNGroup has specified a six month technology integration and a fully functional beta site deployment.  The beta deployment will serve as a prototype for additional deployments based on similar principles.  

Our technology delivers means for deriving language-independent situational and global event analysis based on ontological models.  The software systems integrate semantic extraction in English, Arabic and German.  Other languages are possible using the same techniques. 

General principles related to the differential ontology framework are laid out so that multiple languages are integrated into constructed elements of a single explicit ontology.  The integrated system will demonstrate features that are not available from any current semantic or knowledge management system. 

Our general principles are part of an emerging discipline related to the measurement of complex systems and the use of formal ontology as a means to abstract knowledge about situations that arise in a complex world.  Iterative modifications to visualized ontology lead to an adaptation of ontological models of these situations. 

The Global Information Framework depends critically on having a user interface that allows any subject matter expert to have visual access over situationally relevant concept representations.  Situational relevance requires a subsetting mechanism.  Control over concept subsetting mechanisms serves to focus the attention of the user on part of the elements of an ontological model.  Once elements are identified by our selective attention mechanisms, these elements are extended using ontological inferencing to produce a coherent view of what is known and encoded within the representational space.  User input in each phase of this process is not merely supported; user input is required to achieve relevance and fidelity.
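
A minimal sketch of the subsetting idea follows, assuming a toy ontology held as an adjacency map; the concepts and relationships are invented, and a real deployment would draw them from the RDF repository:

# Subsetting (selective attention): user-chosen seed concepts are expanded
# along stored relationships to a bounded depth, yielding a small scoped
# view of a much larger ontology. Content is invented for illustration.
ontology = {
    "container": ["manifest", "tariff code", "port of entry"],
    "manifest": ["shipper", "commodity"],
    "tariff code": ["administrative ruling"],
}

def scope(ontology, seeds, depth=1):
    selected = set(seeds)
    frontier = set(seeds)
    for _ in range(depth):
        frontier = {n for c in frontier for n in ontology.get(c, [])} - selected
        selected |= frontier
    return selected

print(scope(ontology, ["container"], depth=2))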

This human-centric, ontological model based, approach creates a distinct alternative to classical expert systems and artificial intelligence approaches.  Our alternative creates a higher dependency on human involvement and requires that some humans accept responsibility over decisions.  Clearly the cultural barriers we, the BCNGroup scientists, have experienced have something to do with the requirement that humans accept more responsibility and are subject to rational outcome metrics.  We have been forced to take the position that artificial intelligence funding is wrong-minded, both based on arguments from the natural sciences and because the effect of artificial intelligence expenditures is to allow the consulting IT industries not to take complete responsibility for past, current or future performance outcomes.

Relevant cognitive neuroscience tells us that attentional focus evokes cognitive responses.  This science also tells us a great deal about how this attentional focus is managed by the brain system [6].  As the BCNGroup moves forward with the Second School (see Section 10) we will develop user profiles that use the Human Markup Language standard [7] to bring elements of cognitive engineering into the interface design.  A simple control interface has been designed from existing text based and mouse based interfaces [8].  The beta deployment of this interface is within two months of funding ($750,000).

Results from cognitive neuroscience have been used to design user interface elements that change the visualization based on user commands and actions. These changes in visualized state produce shifts in figure/ground perception.  


Section 2:  Context and objectives

 

As in physical and engineering sciences, the results of collective intellectual work lead to advances in science, including economic and biological science. 

In the context of other types of collective work, such as in financial services, intelligence analysis, fiduciary reporting, compliance reporting, complex control, and biological science; the GIFT provides a means to produce a type of collective intelligence.  Subject matter experts create this collective intelligence using our software components.  

Global information frameworks provide features that are not available from any current semantic technology or knowledge management system.  It is in this sense that the technology is a gift to our society.

GIFT was designed specifically to address global analysis of US Customs and Border Protection selectivity of commodity shipments for targeted examination of containers.  However, GIFT technology is applicable to far more than the current critical problems in information technology modernization efforts at Treasury, State, the Department of Defense and the Department of Homeland Security.  GIFT provides a principled ground in which to extend formal models of natural event structure to objects of investigation that are by nature complex.  The gift has to be accepted, however, and so far the revolutionary nature of the approach on which these integrated technologies depend has been counterintuitive to mainstream artificial intelligence and to the IT procurement process.

Over the past decade a revolutionary ground has been prepared by scientists and technologists who felt that intelligence and military activity required a new information science paradigm.  We have faced an entrenched discipline and procurement process.  The individuals involved in maintaining this process have been, so far, unwilling to even accept the possibility of a paradigm shift. 

So natural scientists have developed the “Second School of Semantic Science”.  The Second School points out that the First School treats intelligence as if it were merely a mechanism that can be decomposed into a set of fixed semantic states and a first order logic defined on this set.  Natural science, and common sense, tells us that intelligence is not properly characterized in this way.

The following have been our long term design objectives for Treasury and DoD:

·         Improve the quality of analysis, and utility of complex intelligence products;

·         Provide specific and tailored intelligence to enhance our ability to visualize the battlespaces, including the terrorism engagement space, and ensure total operational awareness;

·         Improve the throughput and speed of delivery of National intelligence;

·         Reduce or eliminate unnecessary redundancy and duplication in intelligence products;

·         Strengthen information and production management and ensure policies, procedures, concept development, training, and technical-human engineering;

·         Establish and integrate standards (based on mandated Department of Defense (DoD) community standards/architectures) for commonality, interoperability, and modernization in coordination with appropriate elements and activities;

·         Explore and examine very advanced technology and concepts for future integration;

·         Provide a thematic analysis as the basis for information warfare, both defensive and offensive activities.

Our proposed beta deployment demonstrates the viability of a specific roadmap.  The roadmap starts where our industries are today and shows a specific path to the design, development and deployment of next generation tools.

We have interoperability with W3C standards, but our capabilities are forward looking.  Perhaps the most critical contribution is a data encoding mechanism that supports the development of collective intelligence and work products that can be re-used as models of complex phenomena.   

Our proof of concept involves the deployment of a prototype that is fully operational and is to be used in a critical context.  This prototype can be deployed at any site and requires that only part of the deployment team have high levels of security clearance.

 

Generality:  Nothing in this roadmap excludes the development of GIFT deployments in bio-chemical engineering, banking, manufacturing, publishing or any other complex human activity.  The technology is considered to be more advanced than any existing e-commerce system or any deployed knowledge management system.  Several of our teaming corporations are precisely those corporations who are regarded as having the leading edge deployed systems.  Their participation is offered at or below cost simply because the methods and capabilities of these systems are under-appreciated due to the break they make from classical artificial intelligence and expert system based IT deployments.


Section 3: Ontology architecture

 

We have available a package of patented innovations in data encoding technology. 

 

Figure 3: distributed Ontology Management Architecture (d-OMA)

 

In several layers of our existing software, data regularity in context is discovered using semantic extraction techniques.  The patterns made from regularity are made explicit in the form of a set of concept indicators.  For us, R&D does not mean “research and development”, because this term has been deemed of “no value” in the IT industry and in government IT procurement circles.  The political incorrectness of funding long term R&D stems from the failure of research and development using the classical approaches.

For us, “R&D” is research and discovery.  The research, in this context, is an individual investigation of some complex natural phenomenon, such as the purchasing interests of an on-line shopper.  The GIFT provides human-centric investigator tools.  Like microscopes and carpentry tools, the GIFT tools do nothing by themselves.  These tools become useful when they are used by skillful domain experts. 

What are observed by the tools are conceptual structures in social discourse.  What is constructed is a model of how these structures sit within the various thematic expressions.  The subject indicators have structural relationships to individual natural language terms and patterns of term occurrences. 

Subject matter indicators are identified using several types of patented semantic extraction processes.  These include two forms of patented conceptual aggregation based on letter, stem, word and word-phrase n-gram measurement of text [9], as well as newly patented probabilistic Latent Semantic Analysis (PLSA) [10].
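
The patented extraction processes are not reproduced here; the sketch below shows only a generic character n-gram measurement of text, the kind of raw signal from which conceptual aggregation could begin:

# Generic character n-gram measurement of text (not the patented
# processes named above); counts overlapping n-character substrings.
from collections import Counter

def char_ngrams(text, n=3):
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

print(char_ngrams("harmonized tariff schedule").most_common(5))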

Ontology constructions, no matter how they are developed, consist of representations of concept schemas and their relationships.  In GIFT, natural language terms, and patterns of co-occurrence, provide ontology definition as sets of concepts, with properties and attributes, organized with visual navigational aids. 

A subsetting mechanism brings into visual focus all and only that part of extensive ontology repositories persisted in RDF repositories and hash tables.  Human interfaces to shared ontology repositories are designed to mimic the perceptual figure/ground relationship observed by natural science to be the key mechanism involved in individual action-perception cycles.  These human interfaces allow local manipulation and editing of sets of concepts found to be relevant to individual analysis of specific events, such as US Customs and Border Protection selecting a container for a search procedure. 

A local analysis by an individual occurs. Using the new software, this analysis occurs with the greatest amount of flexibility.  At each of many sites, human analysts edit small details, modify underlying assumptions, and otherwise examine how concepts identified locally might be related to sets of concepts being maintained in global repositories.  The concepts themselves are equipped with metadata about how these concepts might be identified in text, and co-occurrence relationships that concepts have with other concepts in the repository. 

A collective intelligence can be expected.  Subject matter experts enable a type of global analysis due to local manipulation.  Local manipulation occurs based on direct experience, but this experience is conditioned by the recent global analysis.  Real time intelligence response is therefore very likely.

Collective global analysis occurs because individual human interfaces to concept repositories have selective attention mechanisms.  BCNGroup scientists understand the physical and mental activity involved in individual human action, cognition and perception cycles.  This understanding is part of several academic literatures; “cognitive engineering” and “evolutionary psychology”.  The focus of this science is on the behavioral patterns of people and systems of living systems.

Collective global analysis occurs within contexts that are implicitly (not necessarily explicitly) structured by relationships that are established when many individuals work with localized ontology. Individuals produce reports based on the subsetting mechanisms that “retrieve” that part of the globally stored RDF repository that is deemed relevant.  As local analysis produces reports, these reports themselves are subjected to linguistic and ontological methods where reconciliation of terminological and viewpoint differences become critical.

SchemaLogic’s SchemaServer product will be interfaced with web services that manage the specifications of a global terminology library.  Terminological reconciliation is a current capability provided by SchemaLogic Inc to a variety of commercial clients.  

Ontological science tells us that local manipulation of concepts by an individual within the contingencies of the moment involve human tacit knowledge and can rapidly lead to a deep understanding of a specific event in the context of larger issues and concerns.  However, human understanding is both highly situational and strongly shaped by opinion. Any specific understanding depends on individual(s) defining terms so that these fit within a coherent view of the events occurring.  In key situations, a single common viewpoint is not possible, nor is a single viewpoint always desirable.

The control of managed vocabulary is essential to uniform work on enterprise wide ontological models.  One key failure of the Tim Berners-Lee (W3C) standards bodies is the absence of standard methods for the reconciliation of terminological differences.  Our system has several layers of methodology that are tied by first principles to the way our data is encoded into computer memory.  The data encoding specifications are simple, non-proprietary, and available to review and use.

 

Figure 4: The production of scoped ontology with humans in the loop

 

A second knowledge management system is a product from Acappella Software Inc.  The regularity of responses to standard situations can be studied, resulting in patterns of expression that are captured in pre-existing textual snippets of expression.  This allows a patented process to assist in flexible report generation having the 3Cs: clarity, completeness and consistency.

Both knowledge management systems are tied together with standard knowledge representational data encoding based on RDF (Resource Description Framework) and Orbs (Ontological referential bases).  Ontology representations can be used within the difficult contexts of uncertain information, shifts in context, and changes in the underlying situation.  In most cases, a human analyst will easily alter interpretations and schema properties in real time to accommodate these practical limitations.

The two commercial knowledge management systems provide support for cultural transitions.


Section 4:  The ontology encoding innovation

 

Scoped ontology sits on an exceedingly simple data structure standard, developed and published by OntologyStream Inc.  Bypasses to the well known XML persistence and search limitations are found by using this encoding.

This data structure is a topic taxonomy organized in a specific fashion, disclosed as a matter of public information.  Differential ontology framework works in a specific fashion to create a global information framework where managed vocabulary and ontology is generated and used as a knowledge management capability.

Several small deployments have been completed.  For one of the state governments, a consultant/specialist created 216 concept representations and organized them into the upper two layers of a differential ontology framework.  A prototype for a large deployment in US Customs was developed but not deployed (as of May 2005).  We are seeking a contractual means to deploy based on the team agreements between ten leading, but small, innovative knowledge technology corporations.  Non-deployment of the prototype is deemed by our group to be one manifestation of profound incompetence by specific Lockheed Martin management.  A GAO investigation was initiated in May 2005. 

Situationally focused models of specific events were considered as targeting software to be used in a future modernized US Customs and Border Protection.  Work stopped on this deployment as of March 2005, due to contracting issues [11].  However the concept of scoped ontology has now been demonstrated in the state (DHS) deployment and in a commercial deployment (not disclosed).   These are small deployments which act as a proof of product. 

The upper layer of the differential ontology framework is a set of universal abstractions, such as abstractions about the flow of time. The middle layer contains domain specific concepts and utilities such as security policies, concepts about how containers are searched, or concepts about what is a commodity.

In our small state DHS project, several specific systemic risks were identified, leading to corrections in risk management policies.

Differential ontology is deployed within an action-oriented process model called AIPM (see Section 7, Figure 8).  Working from event reports, semantic extraction activities are developed and data instances are parsed to produce reporting triggers. 

Triggers launch processes that construct scoped ontology.  The development of small (5-20 topics) situationally specific scoped ontology is the usual outcome from automatic scoping processes.  These ontology representations can be used for rapid communication of structured information and for building histories.  We need the larger deployment to show how ontology streaming might aid in global analysis and responses.

In some domains, for example Customs’ Harmonized Tariff Schedule, there may be hundreds of thousands of concepts, but a small set of organizing principles that generate categories over these topics.  The categories are suggested by algorithms, and then reified by human analysts.

Event-specific categories are developed as a means to visualize elements of event space phenomena.  The event phenomena are then “understood” using the concepts in the upper abstract and domain specific ontology. 

Figure 5: The GIFT architecture as of 2002

The full GIFT architecture is being realized using a server glue language called Python.  The key is to bring the required products together in a work environment. 

Figure 5 (first seen in 2002) expresses our long term interest in Visual Text (the Text Analysis International Corporation Inc (TAI) ), semantic extraction and schema logic (SchemaServer). 

NLP++ is the language that TAI founder Amnon Meyers developed and used to build a now-patented “system for rapidly developing text analysis systems”.  Our scientists know that each targeted domain of text data elements has different structure-to-function relationships.  Textual semantics, or meaning, is critically dependent on the specific domain of text data elements.  The NLP++ language is used to instrument the focused measurement of function to structure relationships. 

Probabilistic latent semantic analysis (PLSA), patented in 2005 by Recommind Inc, is used to develop n-ary representations of subject matter indicators.  NdCore (Applied Technical Systems Inc), Readware (MITi Inc), and SLIP analysis (OntologyStream Inc) are used to get different looks at the same data.  As the set of subject matter indicators is developed, RDF encoded concept representations are developed and the NLP++ based software is then used to instrument the detection of these concepts in text.  The “two sides” of the differential ontology framework are established. 

The SchemaServer product from SchemaLogic provides the knowledge management features required to manage controlled vocabularies and thus to allow human use of natural language to control the development and use of sets of concepts (ontology). 

Acappella Software provides a product that helps to create clear, complete and concise (the 3Cs) written reports in the first place.

The development of our data layer has been in conjunction with our work on extending some intellectual property for Applied Technical Systems, and is discussed in a public document titled “Notational System for the Ontology Referential Base (Orb)” [12]. 

Our data layer is different, in that the ontology encoding innovation provides a primary user interface and many of the key features related to real time use of ontology for situational analysis.  Orb data encodings are agile and free of pre-imposed data schema.  Our architecture separates the data layer from the transaction and presentation layers.  The data layer has a simple, non-proprietary encoding into computer memory and a simple read and write to a word processing text file. 

A standard RDF repository persists the data structure as sets of concept-representations.  The Orb repository provides real time data mining and very advanced methods related to categorical abstraction and event chemistry rotationally defined transformation operators [13].  Interoperability between Orbs and RDF is simple and unencumbered.

Using the standard Orb encoding we define very fast parsing and convolution operators that act directly on mapped memory.  We can show that this encoding supports almost instantaneous transformation, search and retrieval.

Figure 6: The key-less hash table Orb encoding

The hash table has become a central tool in the development of very agile encoding of data in a “structure free” form.  Orb encoding is very similar to a classical hash table, and yet requires no hash function and no indexing.  Data is located by interpreting the ASCII string as if it were a base 64 number.  Thus location is context.  Orbs (Ontology referential bases) have a slightly different standard form when compared with RDF.  However, Orb structure can be placed into the Intellidimension Inc RDF repository using a mechanism defined in the notational paper.  Intellidimension provides one of the commercial off the shelf software systems that we use at very low costs. 
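
A toy rendering of the key-less location idea follows, assuming an illustrative 64-character alphabet rather than the published Orb specification:

# Sketch of "location is context": the ASCII key itself is read as a
# number in a 64-character base and used directly as the address, so no
# hash function and no index are required. The alphabet is illustrative.
ALPHABET = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 _"

def location(key):
    value = 0
    for ch in key:
        value = value * 64 + ALPHABET.index(ch)
    return value

memory = {}                                  # stands in for mapped memory
memory.setdefault(location("tariff code"), []).append("ruling text")
print(location("tariff code"), memory[location("tariff code")])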

We suggest that, in the near future, digital signal processing will perform the convolution of Orb structures.  Because of a notational illustration of how the Orb convolutions map to standard digital signal processing, we conjecture that scalability issues can be resolved by simple engineering using electromagnetic spectrum. 

A quick, mathematics-like “analytic proof” is used to demonstrate that our approach is not likely to run up against any scalability issue.  A time delay currently does exist in processing convolutions.  However, we are able to demonstrate what may be optimal processing times using an implementation of the key-less hash table in the current SLIP browsers [14].

The real deployment issues are in developing an understanding of how to use sets of concepts encoded as Orb constructions, and in having ambiguation/disambiguation and terminological reconciliation issues worked out.  These educational and cultural issues are the ones that we still hope to work out with US Customs and Border Protection, as we continue to argue that the approach outlined to them in January and February 2005 is both wise and productive.

 


Section 5:  Informational convolution

 

Informational convolution is defined to be a process that visits each element in a hash table and performs some logic using a small rule engine and the contents of the hashed element and associated bucket.  Our work is a generalization of patents, and partially owes its origin to a contract that Applied Technical Systems gave to one of our group in 2002. 
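
A toy rendering of such a pass follows, with invented buckets and an invented rule standing in for the small rule engine:

# Toy informational convolution: visit each element of a hash table and
# apply a small rule to the hashed element and its bucket, accumulating a
# merged result. The rule and the data are invented for illustration.
buckets = {
    "anthrax": ["report 12", "report 31"],
    "toxin": ["report 31"],
    "shipment": ["report 07", "report 12", "report 31"],
}

def convolve(buckets, rule):
    merged = {}
    for element, bucket in buckets.items():      # one visit per element
        merged[element] = rule(element, bucket)
    return merged

# Rule: keep the reports a bucket shares with the "anthrax" bucket.
rule = lambda element, bucket: sorted(set(bucket) & set(buckets["anthrax"]))
print(convolve(buckets, rule))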

The mechanisms involved in informational convolution have a deeper grounding in situational control and topological logics [15].  Our work on convolution mechanisms has been connected, notationally, to the internal parser and ontology within MITi’s Readware product and within Applied Technical Systems’ NdCore product.  Several other patents also inspire the work. 

Because of the modern object oriented hash table, one is able to perform localization of information in an efficient manner.  As is empirically verified, “categorical” localization of information may be derived from very large data sources and produce very small subject indicators.  This “categorical collapse” is due to “ontological regularity” in textual expressions about similar topics. 

These subject indicators can then be applied to entirely different data sources.  Commonalities are observed about how words are used in everyday activities.  The natural “standard” is “common terminological usage” in relevant communities.  If one ignores this, as most OWL standardizations of semantic web applications do, then one develops non-functional concept interoperability.  Our team has fielded commercial knowledge management systems that recognize the true nature of human use of natural language. 

One can allow more than one object to be placed into a variable length hash table bucket, giving the system a response degeneracy that is needed to model the passage of events through what is sometimes called a tipping point.  The same bit structure allows metadata to be gathered, stored, and used to help separate the contexts imposed quite naturally in everyday activity.  This use of metadata does not occur without reminders that humans have cognitive capability that computer programs simply do not, and likely will never, have. 

Figure 7: The semantic extraction process

The key to the Ontology Reference Base (Orb) is a simple mechanism that encodes informational bits.  The bits are required to have the form (class, object) where the class gets its definition from what is a stochastically defined neighborhood “around” all of the occurrences of the object, and the object gets its definition from a specific occurrence of a neighborhood. 
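
A toy approximation of this bit construction follows, assuming a fixed word window as the “neighborhood”; this is not the patented method, only an illustration of the (class, object) form:

# Toy (class, object) bits: the "class" of a word is approximated by the
# set of words seen in a fixed window around all of its occurrences.
# Window size and tokenization are arbitrary choices for illustration.
from collections import defaultdict

def class_object_bits(texts, window=2):
    neighborhoods = defaultdict(set)
    for text in texts:
        tokens = text.lower().split()
        for i, obj in enumerate(tokens):
            lo, hi = max(0, i - window), i + window + 1
            neighborhoods[obj].update(tokens[lo:i] + tokens[i + 1:hi])
    return {(frozenset(nbhd), obj) for obj, nbhd in neighborhoods.items()}

bits = class_object_bits(["container held at port", "container released at port"])
print(len(bits), "bits")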

The definition may be best achieved using the probabilistic Latent Semantic Analysis patents held by Recommind Inc.  However, several reasonable methods may be used to define and refine class definition.  Once class definitions begin to reflect the organization of concepts about the organization, then we have made explicit a control structure over information processes about that organization. 

This control structure is not expected to replace human judgment.  Our natural scientists point out that a fundamental error is made by academic and consultant groups in acting as if human judgment is reducible to algorithms.  This error continues unabated at government institutions like DARPA, NIST and NSF.  Because funding continues to be poured into this false paradigm, many small and large IT consulting companies develop and market software based on the AI mythology. 

The Roadmap starts by debunking this mythology and demonstrating that human centric ontology use creates high value to those markets that have been wasting time and money on a failed paradigm. 


Section 6: The minimal deployment

 

The first step develops a multi-user web based set of web servers for managing relevant processes and the ontology persistence repository. 

The second step involves the formalization of specific information vetting processes in line with accepted cultural practices.  This is to be accomplished using use cases, in which the step-by-step enumeration of information processes is written down.  The enumerated steps form a model of cognitive, communicative, and behavioral aspects related to analytic practice.

The third step involves the development of a knowledge-use map, indicating where in the enumerated processes one might deploy scoped ontology formalism to capture the structure of typical information flow.  The knowledge-use map is part of a consulting methodology developed by our team.  Once the use map is developed, additional analysis can be extended regarding how knowledge is used.  

The fourth step is to develop a medium-size enterprise ontology.  Enterprise-scoped ontology is developed using consulting methodology and results in a model of information flow within part of the organization.  The model specifies information paths as well as the details that should be considered in the perception of information through introspection.  

The fifth step is to develop a number of component ontologies that encode specific information structure. 


Section 7: Regularity in report generation

 

Any system within the Global Information Framework depends critically on knowledgeable users.  Humans provide situational awareness focused by cognitive clues.  These cognitive clues are presented via a computer screen in the form of small graph structure (subsets of community defined ontological models). The graphs show specific concepts and relationships between concepts. 

GIFT architecture assumes an "intelligent design" to the way that things in the real world work.  Intelligent design is reflected in the order that emerges from apparent chaos.  When phenomena are properly measured, the order structured by this intelligent design can be mapped to a pre-existing ontological model.  Perhaps this is the way that natural language works.  Traditional “semantic extraction” technology almost, but not quite, makes this kind of measurement from full text written by humans. 

Differential ontology framework opens up the extraction process to human judgment.  It does so by having an explicit set of concepts well enumerated, by the community, and made available to the community as a means to support human communication.  The notion of a framework is one that is used to organize these sets of concepts, and to map semantic extraction algorithms to specific concepts.  These frameworks provide a sufficient basis for an enumeration of concepts organized to provide the community with formal structure.  The formal structure also allows machine computations, and interoperability, that is understood within the context of the communities’ needs. 

Before semantic technology can be properly deployed, the potential role of ontology frameworks has to be understood.  The Roadmap addresses this need through our support for the development of knowledge management certification programs for those who will be first adopters.

Principles derived from cognitive science and other scholarly literatures provide this understanding.  The understanding is consistent with our everyday experience.

In each of the semantic extraction tools, a framework measures text for patterns indicating subject matter.  This measurement process is common to the methodology that we propose as the core technology for all future semantic technology.  What is measured is a set of subject matter indicators.  How the measurement occurs does vary depending on which of the semantic extraction software systems we use.  Readware, NdCore and latent semantic analysis each achieves a framework function. 

Human eyes and cognition make sense of these measurements and supply details that are not available in the explicit, pre-existing concept representation.  The small graph constructions are visualized (see Figure 1) and are modified by the user, resulting in both the use of concept representations and the modification of the small graphs. 

Ontological models are situational in nature and cannot be fixed in advance. The regularity of structure in context is complex and situationally exposed.  Situationally scoped ontology, the named graphs produced by differential ontology, has the role of representing knowledge.  These named graphs capture human knowledge in a way that is similar to a written report. 

In many cases, federal law mandates clear, complete and consistent reports when a government agent takes certain actions.  An example of a report covered by such mandate is an administrative ruling about the Harmonized Tariff Schedule code that is assigned to each commodity imported into the United States.  Administrative rulings, as one can see from our experimental site [16], are often written as precise reports using legal language.  These reports are clear, complete and consistent: the 3Cs.

The HTS administrative rulings are well written and have the quality of judicial rulings, and often carry the same effect as a ruling by a court.  On the other hand, many other reports generated by US Customs and Border Protection are not 3C compliant. 

In the global information framework, the production of reports is enhanced in four ways.

1)  Software is used to enhance the productivity of agents by presenting a survey type software interface leading the analyst quickly through a set of situationally dependent questions.

2)  Whether by automated means or by standard relational database interfaces, reports generated by analysis are parsed to produce semantic expansion of the concepts found by semantic extraction processes run on an individual report in real time.

3)  The concepts found to be associated with a report are iteratively refined by allowing the analysts to view situationally scoped ontology individuals.  These “ontology individuals” (OIs) are developed computationally from subsetting mechanisms that use upper abstract and middle domain ontologies. 

4)  Global predictive analysis methodology is developed from the global organization of OIs into event space models. 

Figure 8: Modification of intelligence community model

 

In 2002, one of the BCNGroup scientists was working on a project supporting the Total Information Awareness program at DARPA.  He was given a seven step actionable intelligence process model.  As is well known, the perception and measurement problems are often left out in classical information science.  The inclusion of two additional steps was deemed necessary to allow the process model to work as a complete paradigm, see Figure 8.

Section 8:  Predictive Analysis Methodology

 

Members of our team have developed a document/data repository based on the Open Source document repository system called Greenstone.  We have extended this system and are integrating it with components that we recommended deploying together.

Once assessment objects, like written reports and database elements, are in our repository, various existing concept recognition and trending technologies are used to produce a model of the evolution of various situations.  In the prototype systems, we have looked at cyber threats and cyber attack mechanisms, and thematic analysis over administrative reports.  Discussions have occurred about how to generalize the modeling capabilities to use ontological modeling as a control aid in human centered analysis of processes occurring in complex environments, such as an ideological war or the events occurring in an aquatic system.  Predictive analysis methodology has evolved from these preliminary projects and from scholarly literatures. 

Before all else, predictive analysis methodology depends on the quality of the model that is produced.  So how does one evolve quality ontological models?

Stephenson, Prueitt and Einwechter used our data/document repository, as well as cyber attack ontology and cyber risk analysis, in a project called Formal Analysis of Risk in Enterprise, or FARES.  The FARES system develops scoped ontology about possible cyber attacks, and about risks that are defined by the interaction between explicitly developed ontology, distributed data and specific data instances. 

User defined and community confirmed ontology and assessment elements from this repository are retrieved as required by users.  Iterative processes refine theories based on specified ontology.

 

 

Figure 9: Definition of new ontology in the Protégé editor

The ontology is used in a fashion similar to how mathematics is used in modeling physical phenomena.  In the FARES project we have also defined several complex behavioral models using a formalism called colored Petri nets.  The Petri net is the formalism used in a number of commercial simulation packages.  The principal concept is that states of the “system” are modeled as nodes of a graph, and transitions between states are modeled by the movement of a token placed on the nodes of the graph. 
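
A minimal token-movement sketch follows; it is far simpler than a colored Petri net and uses invented states and transitions, but it shows the node-and-token idea:

# States are nodes; firing a transition moves the token along an allowed
# edge. The states and transitions here are invented for illustration.
transitions = {
    ("screening", "assess"): "situational_assessment",
    ("situational_assessment", "report"): "report_issued",
}

def fire(state, action):
    return transitions.get((state, action), state)   # stay put if not enabled

token = "screening"
for action in ["assess", "report"]:
    token = fire(token, action)
print(token)   # -> report_issued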

Scoped ontology elements expose an anticipatory process model over Risks and Gains.  This more general methodology will use a (1998) extension by Prueitt and Kugler of J. S. Mill’s logic. 

Information from applied knowledge management systems can be reviewed by human introspection in order to enrich scoped ontology structure and related assessment objects.  Scoped ontology allows human users to work with complex problems modeled as ontology and yet not be overwhelmed by extensive ontology structure.

The scoped ontology serves two purposes.  First, ontology reminds users about specific details of a global model for information flow within the enterprise.  The scoped ontology serves to focus a human’s awareness on part of a larger representation about possibly relevant concepts.  Second, the scoped ontology provides consistency and uniformity for many intelligence products across a large distributed enterprise.  One behavioral consequence of this consistency is that the users tend to conform naturally to a global information flow model, thus providing uniform alignment to policy.  The model is agile and human communities are to be comfortable with it.

Informational transparency may also increase due to increased expectation that work products have specific character.  Informational transparency results as individual humans work with three types of resources:

·       Scoped ontology delivered through a subsetting mechanism

·       Assessment elements being used to develop theories about events and event structure

·       Explicit ontology defined as OWL (lite) and encoded into either the Intellidimension Inc RDF repository or an Orb repository element

The use of ontology breaks predictive analysis methodology into small steps. 

Styles of scoped ontology use and construction will reinforce a sense of community and allow the development of personalization and familiarization so essential to real time community.

Figure 10: Process model

 

Workflow can be instituted.  For example, scoped ontology components can be pulled from a repository in order to conduct incremental steps, A, B, C, D.  These steps start with a Screening Assessment and end up with a Situational Assessment. 

The development of new components by staff allows one domain of expertise to be a prototype for multiple extensions of policy and knowledge into new areas.  Once enterprise ontologies are in place one may extend scoped ontology usage as a reporting and communication medium.  The enterprise ontology organizes a universe of ideas into topics and questions.  The scoped ontology brings part, but not all, of this universe into a perceptual focus, represented sometimes by a small graph. 

One data flow model for implementing scoped ontology technology might consist of the following:

A)                  An analyst, or team, uses scoped ontology resources to develop structured assessments based on question answering within small-specialized components. These assessments are forwarded into an enterprise ontology subsetting system and routed using workflow.

B)                  New ontology elements are created, or modified from framework prototypes.

C)                  Component scoped ontology are generated from enterprise ontology.

D)                  A meta-object facility is used to link information content to data elements

E)                   Scoped ontology assessments are archived for future reference.

The relationship between scoped ontology and enterprise ontology can be complex and yet shaped to the features of emergent situations.  For example, a scoped ontology may capture part of a larger process conducting a review of a situation.  This part can become a view of a subordinate process and thus change context.

The scoped ontology can also serve to produce a specific order to questions, or components, that are then viewed within the enterprise scoped ontology by a larger community.

Figure 11:  Model of individual ontology use

A presentation of the core technologies is necessary to obtain first-hand experience. 

Advanced knowledge management certification programs are designed and will be offered by KM Institute and several universities (Phase 2).

Four aspects:  One may model the differential ontology framework as having the following aspects:

 

1)  the creation of the enterprise ontology and its resources

2)  the use of ontology by a specialist to conduct an assessment (for example, a screening review)

3)  the answering of questions by a respondent

4)  the generation of a report

 

Creation: Ontology provides an organization to thought about the flow of information in a large enterprise.  Many, but not all, of the details have been worked out in advance.  There is a natural and easy just-in-time selection of topics that allow the specialist to “navigate” through a universe of ideas.  Ideas can be organized, as they are in the real world, into worldviews and the concepts in different worldviews kept separate using controlled vocabularies, mapping between terminology use and subject matter indicators. 

Use: Scoped ontology provides an overall structure to the conceptual representation of informed mental universes. A navigation process causes an additional (and separate) structuring of elements of ontology.  The structuring of navigation through topics is adaptive to how the specialist has navigated the topics up to a certain point.  As the navigation occurs, the specialist will easily generate new scoped ontology.  The structure of navigational bias can also be imposed using portal technologies and special situational logics, such as are being developed in a number of labs, for inclusion in meta-object facilities. 

Answering: Parts of the enterprise ontology can be viewed in order to make reminders that information is needed.  A respondent or algorithm supplies this information using one or more question frameworks.  The respondent has choices defined within a knowledge management system where messages are sent back and forth. 

Report: A report can be generated based on the questions answered, or based on a specific scoped ontology.


 Section 9: A future anticipatory technology

 

The Roadmap delineates an approach towards ontological modeling of complex natural systems, like the information flow within the US Customs and Border Protection.  Ten commercial software systems are to be integrated at the patent level and then deployed based on a specific notational system [17]. This selection of technologies is based on a principled selection of components that fit together and provide something more than merely the sum of the parts. 

Differential ontology framework creates abilities at the individual level for a community of knowledgeable persons to interact with the knowledge of many others within their community in near real time.  A collective intelligence is made possible.  The notation acts as a super standardization constraint, serving the same role as Hilbert mathematics serves for engineering and physical sciences.

One of the technologies that has been layered on top of our proof of concept is called Anticipatory Technology.  It is derived from a complex systems point of view and from certain cybernetic schools developed in the former Soviet Union.  It has been grounded in empirical sciences related to human perception, memory and anticipatory responses.  The technology is a methodology supported by advanced knowledge management practices.  We disclose the foundational elements of this work as part of a knowledge management certification program and have published the notational framework on which this certification program is designed.

In everyday activity humans exhibit anticipation about what will happen next.  Humans, and all living systems, exhibit anticipatory behavior as part of their survival mechanisms.  Each of us understands, from our direct experience, the nature of anticipation.  Anticipation is not, for example, certain.  It is not a prediction of the future.  Anticipation merely gets us ready for certain types of outcomes.

Semantic science suggests that what happens next is not solely caused by what is happening now.  This suggestion is in direct contradiction to the practices of mathematicians and scientists from the “first school”. In the first school, every physical process is a Markovian process [18].  What happens next also depends on causes related to the environment and to the resolution of internal processes.  The behavior of the insurgency in Iraq, for example, is not deterministic but rather part of a complex expression where causes are not always observable.  Anticipatory technology allows one to get ready for outcomes that cannot be precisely defined beforehand.

The anticipatory nature of human behavior is only partially understood by science.  The foundations of classical science, e.g., Hilbert mathematics, seem too strong and too precise to model the real phenomena associated with human intentions.  Using our notational system, formal specifications of concepts in an ontological model can be equipped with disambiguation and reconciliation metadata.  This equipment is specified in our notational papers and in the encoding architecture (see Figure 6, Section 4).

Natural science has developed empirical evidence about processes involved in individual and collective anticipatory behavior.  Differential ontology framework captures a descriptive ontology from which this behavior can be placed into context.  The anticipatory outcome is then expressed as a scoped ontology.  We have detailed the technical architecture in other parts of the Roadmap. 

We briefly review the principles that we assume are involved in real anticipatory behavior.  Human anticipation appears to have three categories of causal entailment:

1)           Causes consequent to patterns encoded into memory,

2)           Structural entailment that can be predicted from engineering techniques, and

3)           Environmental affordances.

These three (causal) entailment categories map fairly directly to the physical processes involved in human memory, to the reaction dynamics involved in cognitive processing, and to the physical sensing of possible routes to behavioral responses.

Figure 12 illustrates an anticipatory architecture that assists humans in the production of actionable intelligence.  This is part of Prueitt’s original work on anticipatory technology.   The architecture develops two types of ontological components.  The first type is defined as a set of abstractions about the patterns that are observed by computer algorithms.  These patterns are thought to be elements that, when composed together, produce subject matter indicators. The patterns are, of course, originally expressed within the grammar of language. 

VisualText, one of our component technologies, provides a development environment that assists in the observation of grammar and co-occurrence and produces a classification of nouns, verbs and other elements of grammar.  Linguistic knowledge can be capitalized on.  But at their core, the ontology components in our set of abstractions about the patterns do not depend on grammar.  The patterns are represented as small graphs with word labels that evoke mental experiences.  These icons are linked together so that if the user wants to see the co-occurrence patterns for “disputes”, then the “center” of the icon shifts. 
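A toy illustration of the shifting center, under the assumption that co-occurrence is counted within sentences, is given below in Python.  It is not VisualText and uses no VisualText interfaces; the data and function names are invented.

from collections import defaultdict
from itertools import combinations

def cooccurrence_graph(sentences):
    """Count how often word pairs occur within the same sentence."""
    graph = defaultdict(lambda: defaultdict(int))
    for sentence in sentences:
        words = set(sentence.lower().split())
        for a, b in combinations(sorted(words), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph

def recenter(graph, term, top=5):
    """Shift the 'center' of the icon to `term`: show its strongest neighbors."""
    neighbors = graph.get(term, {})
    return sorted(neighbors.items(), key=lambda kv: -kv[1])[:top]

sentences = [
    "customs disputes delayed the cargo",
    "the cargo manifest listed disputed items",
    "disputes over tariffs delayed inspection",
]
g = cooccurrence_graph(sentences)
print(recenter(g, "disputes"))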

In differential ontology framework, text is harvested using instrumentation appropriate to understanding the structures in text.  The structure obtained from this measurement is encoded into a set of ordered triples

{ < a, r, b > }

where a and b are subjects making reference to elements of, or aspects of, the measured “structural” ground truth in raw data. 
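As a minimal sketch, assuming nothing about the deployed encoding beyond the triple form itself, the set of ordered triples can be held in memory as follows (Python; the relation labels and subjects are invented examples).

from typing import NamedTuple, Set

class Triple(NamedTuple):
    a: str   # subject referring to measured structure in the raw data
    r: str   # relation (e.g. co-occurrence, containment)
    b: str   # second subject

triples: Set[Triple] = {
    Triple("disputes", "co-occurs-with", "cargo"),
    Triple("cargo", "part-of", "manifest"),
    Triple("disputes", "co-occurs-with", "tariffs"),
}

def related(ts: Set[Triple], subject: str) -> Set[Triple]:
    """Simple retrieval: everything related to a given subject."""
    return {t for t in ts if subject in (t.a, t.b)}

print(related(triples, "disputes"))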

Figure 12:  Model of naturally occurring anticipatory mechanisms

The sets are encoded in computer memory as Ontology referential bases (Orbs), which have fractal scalability and interesting data mining features.  Temporal resolution is obtained using categorical abstractions encoded into Orb sets.  These Orb sets are presented as a simple graph construction.  A human then views the simple graph. 

The graph acts as a cognitive primer.  We call this human-centric information production (HIP) because the computer processes support a human becoming aware of how information fits together and, as a consequence, “connecting the dots” and understanding something new. 

The visual icons are minimalist, in that only a minimal exposure to information is made. The primary cognitive processing is then done by an individual human.  A key element is that the Orb encoding has to be simple and straightforward so that the user can easily manipulate the information (something that is not possible with OWL (W3C standard) ontology software such as Protégé).

 Our algorithms, data encoding standards and computer interface design methodology have been created to enhance natural anticipatory mechanisms available to humans.  The following tasks can be achieved:

 

Task 1: Create a conceptual indexer (signature) technology that does not need to preserve individual identities.  Like integers, the signatures are higher-level abstractions and are thus not merely about an individual.  (A minimal sketch appears after this task list.)

Task 2: Create a polling instrument using web harvesting of web logs.  The results are posted to a web site where real-time modeling of social discourse can be viewed using the Topic Map standard.

Task 3: Create agile real-time web harvesting processes that reveal the structure of social discourse within targeted communities.

Task 4: Do this in such a fashion that the results are about the social expression between people and predict tendencies toward actions or interests.

Task 5: Develop a common notation around the representational language so acquired.
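The following Python fragment sketches the intent of Task 1 under our own simplifying assumptions: the categories evoked by a text are hashed into a signature, so that equal category patterns give equal signatures while the text itself and its author are not retained.  The category list and the choice of hash are illustrative only.

import hashlib

def categorize(text, category_terms):
    """Map a text to the set of categories whose terms it mentions."""
    words = set(text.lower().split())
    return frozenset(c for c, terms in category_terms.items() if words & terms)

def signature(categories):
    """Like an integer, the signature names an abstraction, not an individual."""
    canonical = ",".join(sorted(categories))
    return hashlib.sha256(canonical.encode()).hexdigest()[:12]

category_terms = {
    "trade-dispute": {"dispute", "tariff", "tariffs"},
    "logistics": {"cargo", "manifest", "shipment"},
}

post = "New tariffs triggered a dispute over the delayed cargo shipment"
print(signature(categorize(post, category_terms)))   # same categories -> same signature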

 

Technical/Scientific Considerations:  A group of scholars shares the view that anticipation technology must involve a theory of substructure, a theory of how substructure forms into phenomena, and a theory of environmental affordances. 

The Roadmap has been developed through a more than decade-long process of scientific review of the emerging IT industry and of algorithms in machine learning, neural networks, data mining, informatics and related domains. The review is supported by a scientific advisory board.  Our advisory board has also reviewed certain literature in natural science about memory, awareness and anticipation.  We use natural science to design an extension of certain academic traditions related to Soviet-era cybernetics and to Western cognitive and quantum neuroscience.

In our architecture, raw computer data is parsed and sorted into categories.  Iterative parsing and categorization seeks the patterns of regularity within various contexts.  Patterns of categories are visualized as very simple graph constructions.  The parsing produces localized information that is then organized using mathematically well defined convolution operators to produce visual information about categories of co-occurrence. 

Because of the regularities in patterns, an entire inventory of patterns can be stored in extremely small data structures.  The structures are used to inventory co-occurrence patterns and to make structured correspondence to explicitly defined sets of concepts. 
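A small sketch of this inventory idea, with invented pattern and concept names, is the following (Python).  Each distinct co-occurrence pattern is stored once, with a count and a correspondence to an explicitly defined concept.

from collections import Counter

# Co-occurrence patterns observed during parsing (frozensets deduplicate).
observations = [
    frozenset({"disputes", "tariffs"}),
    frozenset({"cargo", "manifest"}),
    frozenset({"disputes", "tariffs"}),   # recurs -> no new storage needed
]

inventory = Counter(observations)

# Structured correspondence from patterns to explicitly defined concepts.
concept_map = {
    frozenset({"disputes", "tariffs"}): "TradeDispute",
    frozenset({"cargo", "manifest"}): "ShipmentRecord",
}

for pattern, count in inventory.items():
    print(concept_map.get(pattern, "Unassigned"), count)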

The concepts are developed by a community and contain metadata about reconciliation issues and issues related to ambiguation and disambiguation.  Reconciliation, ambiguation and disambiguation are then interpreted in a notational system as elements of the response degeneracy necessary to any formalism that models the complex phenomena that we address. 

This correspondence also supports the correlation of patterns to design flaws, as well as positive extensions based on category analysis and pattern detection. 

 

 


Section 10:  The Second School of Semantic Science

 

The stratified model uses class:object pairs, simple graphs, and a process for collapsing occurrences into categories to create persistent data structures and relationships that reflect how physical and/or informational components work together to express real-world complex behavior.  Hilbert mathematics plays a simpler role when one is dealing with engineering and certain categories of physical phenomena. 
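The collapsing of occurrences into categories by way of class:object pairs can be sketched as follows (Python).  This is an illustration of the idea only, not the stratified model's implementation; the class and object names are invented.

from collections import defaultdict

# Each observed occurrence is a class:object pair, e.g. ("vessel", "MV Aurora").
occurrences = [
    ("vessel", "MV Aurora"),
    ("vessel", "MV Aurora"),
    ("port", "Long Beach"),
    ("vessel", "Sea Falcon"),
]

# Collapse: the persistent structure keeps each class with its distinct objects
# and how often each was observed, rather than every raw occurrence.
categories = defaultdict(lambda: defaultdict(int))
for cls, obj in occurrences:
    categories[cls][obj] += 1

print(dict(categories["vessel"]))   # {'MV Aurora': 2, 'Sea Falcon': 1}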

During the time in which Hilbert mathematics was being developed and applied to physical science, professional scientists worked as a single community in the development of the wonderful physical science we now have.  This physical science was not extended into the life sciences.  Attempts at this extension have been exhaustive, but specific types of failure occur in those formalisms that have the same type of underlying logic as Hilbert mathematics.   These types of failure can be aligned with formal analysis of mathematics itself, using the principles of mathematics. 

The stratified model is derived from a specific scientific/mathematical/logical literature, represented in its simplest form with reference to graph theory, and specifically to Peircean logics that conform to what is called the “Unified Logical Vision”, to conceptual graphs (John Sowa, 1984), to the Ontology Web Language, and to Topic Maps. 

The stratified model has two forms:

(1)             conceptual and notational and

(2)             implementation as computer processes. 

Implementations as computer processes are specified in great and exact detail as patent disclosures or, in some cases, as public domain disclosures.  The question of “ownership” of the fundamentals of a functional science of complexity is raised because of the historical context that we have experienced since the early 1980s.  The BCNGroup, a not-for-profit organization founded in 1993 in Virginia, has developed a Charter that uses patent law to assist in the organization of patented computer processes, and to express new intellectual property within a roadmap for the adoption of semantic technology [19].  Over time, the objective of the BCNGroup is to make the complete set of relevant computer processes available within an economic model that is consistent with democratic principles and social needs.

An existing “experimental system” reduces to practice a number of specific concepts that are expressed in our notational system.  So we say, following our cultural practice, that certain of the concepts motivating the notational system are potentially patentable when a specific set of rules is given that describes exactly how one might build a computer program that runs on computer hardware and produces behavior corresponding to features discussed in the notation.  Our opinion is that these patents should be publicly explained so that people know how to use the techniques. 

Conventions and notation are used to provide a common intellectual environment for extending those implementations that have been already completed or prototyped. 

Many of the first generation of semantic extraction patents will expire in the near future.  Additional patents will be developed to allow a defensive stance against new patents that might disallow our use of the original patents.  These will be used within our systems to provide foundational capability, and to demonstrate what human-centric knowledge management can become. 

 

Our purpose is to provide educational processes that teach the concepts essential to these conventions and notational systems.  The internal processes involved in Human-centric Information Production (HIP) using Orbs can be readily understood in the context of how we experience thoughts, and so the technical detail follows our experiences. 

 


 

The advisory committee

 

 

The advisory committee:

 

·        Dr Kent Myers (Advisory Board)

·        Dr Ben Goertzel (Advisory Board)

·        Dr Peter Stephenson (Advisory Board)

·        Brianna Anderson (Advisory Board)

·        Dr Peter Kugler (Advisory Board)

·        Dr Alex Citkin (Advisory Board)

·        Dr Art Murray (Advisory Board)

·        Dr Paul Prueitt (Advisory Board)

·        Dr Karl Pribram (Advisory Board)

·        Dr John Sowa (Advisory Board)

·        Rex Brooks (Advisory Board)

·        Doug Weidner (Advisory Board)

·        David Bromberg (Advisory Board)

 

Companies

 

·                    SchemaLogic Inc  (SchemaServer, knowledge management technology)

·                    Acappella Software (Knowledge management technology)

·                    Recommind Inc  (probabilistic  Latent Semantic Analysis technology)

·                    Applied Technical Systems Inc (conceptual roll-up, semantic extraction)

·                    Intellisophic Inc  (taxonomy)

·                    Text Analysis International Corporation Inc (text analysis tools)

·                    MITi Inc (Ontology based semantic extraction)

·                    The Center for Digital Forensic Studies (Risk analysis using formal methodology and ontology)

·                    OntologyStream Inc (project management, technical architecture)

·                    Intellidimension Inc (two full time developers plus RDF repository servers)

 


Appendix A:  Purpose

 

Global Information Architecture, using

Composite Semantic Architecture

Prototype

 

Draft Version 20.0 April 28, 2005

 

Purpose:   Conceptual constructions aim to make the following visualizable and computable:

·                    The aggregation of event information into a knowledge domain expressed as a set of concepts,

·                    The fetching of information using conceptual organization,

·                    The focusing of human selective attention using ontology subsetting mechanisms,

·                    The extraction of subject matter indicators from human text,

·                    The elements minimally needed for objectively examining risk and gain to the enterprise.

The proposed architecture for Ontology Mediation of Information Flow is called Differential Ontology Framework (DOF).  DOF has the following elements:

1)                  A semantic extraction path that uses any of several COTS products to parse written human language and produce an n-gram (or generalized n-gram) over word stems, letter co-occurrences, or phrases, as well as rules defined over these n-grams (a minimal sketch of this step appears after this list). 

2)                  A concept identification cycle that associates one or more explicitly defined concepts with the results of n-gram based semantic extraction.

3)                  An ontology development path that uses human descriptive enumeration to produce a three-layer modular ontology, with the topmost layer highly abstract and common, the middle layer containing multiple domain and utility ontologies, and a lower layer containing small scoped ontologies. 

4)                  Technical requirements are imposed on these three layers so as to, in the presence of instance data, produce a minimal subset of concepts from the middle layer that provides a clear, complete and consistent (the 3Cs) understanding of data reported as an instance and relating to an event.

5)                  Global aggregation of event structure so as to support global analysis of distributed and temporally separated events. 
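A minimal sketch of element (1), assuming a deliberately crude stemmer in place of a COTS product, is the following (Python).

def stem(word):
    """Very rough stemmer used only for illustration."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def ngrams(text, n=2):
    """Produce n-grams over word stems."""
    stems = [stem(w) for w in text.lower().split()]
    return list(zip(*(stems[i:] for i in range(n))))

print(ngrams("disputed shipments delayed inspections", n=2))
# [('disput', 'shipment'), ('shipment', 'delay'), ('delay', 'inspection')]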

 


Appendix B: Project Outline

 

·                    Paul Prueitt and Amnon Myers will receive six-month full time contracts. 

·                    Two software developers from Intellidimension will receive one-and-one-half-month full-time contracts. 

·                    Art Murray, Nathan Einwechter, Alex Citkin, Peter Stephenson, Peter Kugler and Ben Goertzel will receive six-month 1/6-time contracts. 

·                    Acappella Software, Intellisophic, the Center for Digital Forensic Studies, MITi, SchemaLogic, Recommind and Applied Technical Systems will each supply one engineer for one month.  Evaluation copies of software will be provided at no cost. 

 

 

Total hours:  4,990

Total time costs:  $633,750

 

There will be three workshops.  The first workshop will be held within 30 days.

 

·                                11 participants, some of whom will attend by teleconference

·                                Final approval of Plan of Action and Management task list with milestone dates.

·                    Final approval of subcontracts with tasks and expenditures

·                    Program reports

 

Operating costs for workshops and travel are $30,000.

Total time and materials cost is $663,750. 

Prime contractor overhead is 15%, or $99,562.50. 

 

Total contract sought:  $763,312.50.

 

OntologyStream Inc will provide office space and computing resources.  Prime contractor will provide management. 

 

Contributing COTS vendors will be compensated for support and engineering time, and will provide software on a free, or very low cost, evaluation basis. 

Patents:  Negotiations for full compensation for all selected IP will be conducted based on fair-use and just-reward principles, as defined in common law.  Each participant stands to receive considerable compensation for the long-term use of proprietary property.  

Follow on:  A large follow-on project is anticipated. 

 

 


 

Appendix C: Semantic Science

 

A "Second School of Semantic Science" approach to ontology mediation of human communication has been developed.  This approach does NOT use description logics, but rather a descriptive enumeration of sets of concepts and a scripting language that uses ontology terminology as computable elements.

We take the position that some aspects of description logic are useful, and that certainly some applications of ontology "reasoning" can be found using description logics; but the use of ontology as a formalism for the study of complex systems requires other elements not found in classical logic.

We take the position that a controlled vocabulary, or what we call a "managed vocabulary", can be mapped to explicit ontology (defined, if one wants, as standard Ontology Web Language (OWL) with description logic) via the notion of a subject matter indicator (SMI). This mapping can be dynamic and based on human use and human control. Thus the precise, and we feel inappropriate, standardization of the meaning of terms is avoided. The mapping between words and patterns of word occurrences is maintained in a weak form so that the normal use of ambiguation can be reflected in how ontology is used in computing. 

This weak form preserves metadata about terminological variation of meaning until the moment in which a human perceives a part of a larger ontology structure within the experience of a moment.  We take the principled position that it is only in the present moment that human tacit knowledge is available to complete the process of forming a non-ambiguous mental image of meaning in the context of that moment. 
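A hedged sketch of this weak mapping, with invented vocabulary terms, SMI identifiers and concept names, is given below (Python).  The mapping is kept many-to-many so that ambiguity survives until a human resolves it in context.

managed_vocabulary = {
    # term -> candidate SMIs (deliberately not forced to a single meaning)
    "seizure": ["smi:contraband-seizure", "smi:medical-event"],
    "manifest": ["smi:cargo-manifest"],
}

smi_to_concepts = {
    "smi:contraband-seizure": ["Enforcement.Action"],
    "smi:medical-event": ["Health.Incident"],
    "smi:cargo-manifest": ["Shipping.Document"],
}

def candidate_concepts(term):
    """Return every concept a term might indicate; a human chooses in context."""
    return [c for smi in managed_vocabulary.get(term, [])
              for c in smi_to_concepts.get(smi, [])]

print(candidate_concepts("seizure"))   # ['Enforcement.Action', 'Health.Incident']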

The SMIs can be produced by any one of many semantic extraction processes (none of which use description logics).  In classical physics, Hilbert mathematics creates a description of all behavior of a "Newtonian" system.  We hold that Newtonian science and Hilbert mathematics are too precise to be used as a means to model complex behavior.  In the Second School, the set of SMIs takes the place of mathematics in a description of reality.  Various types of abstract mathematics act on this set to produce dynamic models of complex behaviors, including:

·        Double articulation (linguistics or gene/phenotype correlations)

·        Response degeneracy (Gerald Edelman's term)

·        Metastable state transitions within metabolic chains

·        Holonomic and non-holonomic causation (Karl Pribram’s term)

These complex behaviors may express a condition of local indeterminacy under global and underconstrained forces.  

 

 

 

 

 

 

 

Appendix D: The Knowledge Sharing Foundation

 

The Knowledge Sharing Foundation concept was first developed (2003) as a suggestion supporting the US intelligence agencies’ need to develop information about event structure.  Prior to this, a small group of scientists had talked about the need for a curriculum, K-12 and college, to support an advancement of cultural understanding of the complexity of natural science.  By natural science, we mean social and cognitive science in the context of human communication. 

 

The suggestion to support new intelligence technology deployments is predicated on the intelligence community’s responsible use and on the co-development of an open public understanding of the technologies employed. 

 

OntologyStream Inc has developed a (fairly complete) understanding of the types of text understanding technologies available within the intelligence community. 

 

Q-1: What is needed to support awareness of events in real time?

Q-2: What is needed to support community use of analytic tools?

Q-3: What are the benefits to industry?

Q-4: What are the foundation elements?

Q-5: What are examples of innovation?

Q-6: Why are educational processes important?

Q-7: How does the software compensation model work?

Q-8: How are test sets made available to the competitive communities?


 



[1] Term coined by Tim Berners-Lee

[5] Commercial Off The Shelf

[6] Levine, D. & Prueitt, P.S. (1989). Modeling Some Effects of Frontal Lobe Damage: Novelty and Perseveration. Neural Networks, 2, 103-116.

Levine, D., Parks, R., & Prueitt, P.S. (1993). Methodological and Theoretical Issues in Neural Network Models of Frontal Cognitive Functions. International Journal of Neuroscience, 72, 209-233.

[8] This work can be seen at Pacific Northwest National Laboratory, www.pnl.gov

[10] For a description of PLSA, see papers from www.recommind.com

 

[11] The issues here can be discussed under non-disclosure agreements. 

[18] Markov was an important mathematician whose work on stochastic transitions between system states embodies the explicit assumption that all cause comes from the moment before an event.