Saturday, February 11, 2006
Generative Methodology Glass Bead Games
On using RDF to model web services
Link into the discussion on the Rosen eforum -> [167]
Index to sections
-------- Closed World Assumption versus Open World Assumption
-------- Non-monotonicity vs Monotonicity
-------- Approach to the solution
Notes by Paul in italics
Hi Paul,
as usual, this reply is very verbose and pedantic, but it is the only way I am able to give an effective (I hope) explanation of what I think are the answers to your questions. I know some people do not like to be overwhelmed with words, so please, if I am saying something that you already know or that you find straightforward, then skip to the end and proceed backwards to find the motivations for my statements :-) [1]
OK, I am not sure it is not a matter of theory, as you say.
At least two theoretical issues are involved here: the first is monotonicity vs non-monotonicity, and the second is the so-called "Closed World vs Open World Assumption". I will first address them separately, but it will turn out that they actually are intertwined. [2]
Adopting the Closed World Assumption (CWA) means that, given a set of statements (an RDF document, say) S and a set of inference rules R (which you can think of as obtained from a set of semantic conditions), you are implicitly specifying a total truth-value assignment to the set of all possible statements P.
Let's call S' the set of statements that contains all of the statements in S plus all of the statements which can be proved to be true by applying inference rules in R to statements in S (formally, the closure of S w.r.t. R). Then, if the CWA is adopted, any document containing a set of statements defines a *total* function isTrue:P-->{T,F} such that every statement in S' is assigned the truth value "TRUE", and every statement in (P-S') is assigned the truth value "FALSE". There is no "uncertainty".
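A minimal sketch of this in plain Python (the triple and rule representations are my own, just for illustration, not an actual RDF encoding):

```python
# Statements are (subject, predicate, object) triples; a rule is a function
# mapping a set of statements to the statements it infers from them.

def closure(S, R):
    """S' = closure of S w.r.t. R: apply every rule until a fixpoint."""
    inferred = set(S)
    changed = True
    while changed:
        changed = False
        for rule in R:
            new = rule(inferred) - inferred
            if new:
                inferred |= new
                changed = True
    return inferred

def is_true_cwa(statement, S, R):
    """Under the CWA, isTrue is total: anything outside S' is FALSE."""
    return statement in closure(S, R)
```

The point is in the last function: membership in the closure is the whole story, so every statement in P gets a definite T or F.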
On the contrary, under the Open World Assumption (OWA), it is never the case that a statement is considered to be "FALSE" just because you cannot prove (or, as a particular case, just because you have not explicitly asserted) that it is "TRUE". [3]
In order to say that, you would have to prove that it is actually "FALSE" [4]. A straightforward example:
ontologystream:Paul mynamespace:knowsAbout rosentheory:complexity
computerscience:Andrea mynamespace:isFriendOf ontologystream:Paul
In the above, ontologystream, mynamespace, rosentheory, and computerscience are all namespaces, where what follows after the “:” is supposed to be defined within a context where the term is unique within that namespace.
What does this document tell? If you want to give an "a priori" answer, it tells the things it asserts. Not that great. Now, if we adopt the CWA, then we are able to conclude many more things than those it asserts: for example, that it is "FALSE" that "computerscience:Andrea mynamespace:knowsAbout rosentheory:complexity", and that it is also "FALSE" that "ontologystream:Paul mynamespace:isFriendOf computerscience:Andrea", because these statements are neither asserted nor inferred (we have no inference rule yet).
Now, let's keep our CWA assumption but add some inference rule. Let's suppose that friends of a person know about everything that person knows about [5] (because of communication, say), and that friendship is a symmetrical relationship [6]. Then, you can now assign a value of "TRUE" also to statements "computerscience:Andrea mynamespace:knowsAbout rosentheory:complexity" and "ontologystream:Paul mynamespace:isFriendOf computerscience:Andrea", and "FALSE" to all others. [7]
Instead, under the OWA, you can say nothing of what is neither asserted nor proved by inference rules: mathematically, the above isTrue function is not total; it is simply undefined on such statements.
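Continuing the example, here is a self-contained sketch (plain Python; the encoding of the two rules and of the triples is mine, for illustration only) of how CWA and OWA differ on the very same document:

```python
# The two asserted triples of the example document.
S = {
    ("Paul", "knowsAbout", "complexity"),
    ("Andrea", "isFriendOf", "Paul"),
}

# Rule 1: friendship is a symmetrical relationship.
def symmetric_friendship(stmts):
    return {(o, p, s) for (s, p, o) in stmts if p == "isFriendOf"}

# Rule 2: friends of a person know about everything that person knows about.
def friends_share_knowledge(stmts):
    friends = {(a, b) for (a, p, b) in stmts if p == "isFriendOf"}
    knows = {(x, t) for (x, p, t) in stmts if p == "knowsAbout"}
    return {(a, "knowsAbout", t) for (a, b) in friends
            for (x, t) in knows if x == b}

def closure(stmts, rules):
    """Apply the rules to a fixpoint."""
    inferred = set(stmts)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            new = rule(inferred) - inferred
            if new:
                inferred |= new
                changed = True
    return inferred

def truth_value(statement, stmts, rules, cwa):
    """CWA: total assignment, unproved means 'F'. OWA: unproved means 'U'."""
    if statement in closure(stmts, rules):
        return "T"
    return "F" if cwa else "U"
```

With both rules active, "Andrea knowsAbout complexity" and "Paul isFriendOf Andrea" become "T" either way; a statement like "Paul knowsAbout rosentheory" is "F" under CWA but "U" under OWA.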
Now, given an RDF document, it is important to recognize that while the choice of the inference rules is a "late binding", in the sense that semantic conditions are not a property of the document but rather of the answering engine (as I told you in a previous post), with regard to the OWA vs CWA issue there is an "early binding", in the sense that everything you write in RDF syntax *must* be seen under the OWA.
I am slowly approaching your questions, please be patient. At this point, I need to generalize. First, let's define function truthValueOf:P-->{T,F,U} (where U stands for "UNKNOWN") as:
truthValueOf(s) = "U" if isTrue is not defined on s,
truthValueOf(s) = isTrue(s) otherwise
And let's introduce the monotonicity issue. Suppose you have two sets of RDF statements (i.e. RDF documents) T1 and T2, with T1 a subset of T2, and denote the function truthValueOf for document T1 as tv1, and the function truthValueOf for document T2 as tv2. Then, you have a *monotonic* logic whenever
for all statements t, (not(tv1(t)=tv2(t)))->(tv1(t)=U),
that is, whenever adding statements *does not change* the truth value of any statement which was already known to be "TRUE" or "FALSE", but can change the truth value of statements that were "UNKNOWN" to either "TRUE" or "FALSE". In other words, adding information only reinforces your certainties.
Vice versa, in a non-monotonic environment this does not always occur: in such a setting, in fact, adding new statements to a document can change the truth assignments of statements that were previously known to be "TRUE" or "FALSE".
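The monotonicity condition above can be stated directly as a check over the two truth-value functions (a plain-Python sketch of my own, with tv1 and tv2 as dicts that default to "U" for statements they do not mention):

```python
def is_monotonic_step(tv1, tv2, statements):
    """True iff going from T1 to T2 never flips a known truth value:
    whenever tv1(t) != tv2(t), the old value tv1(t) must have been 'U'."""
    return all(
        tv1.get(t, "U") == tv2.get(t, "U") or tv1.get(t, "U") == "U"
        for t in statements
    )
```

A step that only turns "U" into "T" or "F" passes the check; a step that turns a "T" into an "F" fails it, which is exactly the non-monotonic case described next.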
To illustrate the difference, the following example is commonly used. Suppose one is in a monotonic setting and has the following set of statements (they need not be RDF; it's a general issue):
"Birds can fly. Penguins are birds. Jack is a penguin."
From them, you can infer that "Jack can fly". The statement "Jack can fly" assumes the value "TRUE". Now, suppose you want to model an *exception*, that is, birds *usually* can fly but penguins cannot, and add the statement "Penguins cannot fly": from this, you can infer that "Jack cannot fly". However, in a monotonic setting, statements that were previously "TRUE" remain "TRUE", so having both "Jack can fly" and "Jack cannot fly" assigned to "TRUE" yields an inconsistency. Full stop. Your document is nonsensical.
In a non-monotonic setting, instead, you are allowed to assert something like this:
"By default, birds can fly. Penguins are birds. Jack is a penguin."
And this is equivalent to telling your answering engine "well my friend, if you cannot otherwise prove that a bird cannot fly, then please assume that every bird can fly". Here, you are establishing an inference meta-rule. Now, suppose you add the sentence "Penguins cannot fly": no inconsistency arises, because your engine can now "retract" the previously inferred statement "Jack can fly" (more precisely, it can reconsider its assignment of the value "TRUE"), and you do not end up with two conflicting statements both evaluated as "TRUE". This process is called "default reasoning". [8]
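The penguin example can be sketched as a tiny default-reasoning step (plain Python; a deliberately naive encoding of mine for "assume birds fly unless proved otherwise", not a real default-logic engine):

```python
# A naive encoding of "by default, birds can fly" with a set of exceptions.
BIRDS = {"penguin", "sparrow"}
SPECIES = {"Jack": "penguin"}

def can_fly(animal, flightless_species):
    """Default reasoning: a bird flies unless its species is proved flightless."""
    sp = SPECIES.get(animal)
    if sp in flightless_species:
        return "F"   # exception proved: the default conclusion is retracted
    if sp in BIRDS:
        return "T"   # the default assumption for birds
    return "U"
```

Before "Penguins cannot fly" is asserted, can_fly("Jack", set()) is "T"; after adding the exception, can_fly("Jack", {"penguin"}) is "F". The truth value of an already-decided statement changed, with no inconsistency: exactly the non-monotonic behaviour described above.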
Ah, eventually, I come to the point: what does this stuff have to do with your questions?
In the case of "Action" and "Event", you say:
"I have an 'entity' called 'Action' with what the reference document terms 'a parent' of 'Event'", and "Some of the attributes of Action are not attributes of Event".
Assuming that 'parent' means rdfs:subClassOf, that is to say "all Events are also Actions", then what you have is a non-monotonic modeling. BY DEFAULT, Actions have this and this and this property, BUT some of them do not apply to Events (remember? BY DEFAULT, birds can fly, BUT that does not apply to penguins).
Non-monotonicity is a convenient facility, but unfortunately (for you; obviously there are motivations for having monotonicity in RDF) it is not trivial to rewrite a model written for a non-monotonic environment into an equivalent model written for a monotonic environment, where by *equivalent* I mean that their truthValueOf functions are identical (they assign the same truth value to all statements).
So, in order to correctly map your specification into an RDF document, you have to transform it so that it assigns every statement involving Events, Actions, their properties, and so on, the same truth value it is assigned in the original, non-monotonic environment. [9]
The good news is that this is a very general problem, and that design patterns exist. I am not a guru, though. So, I will just give you some examples in order to let you understand the nature of the problem, but please do not expect that I can give you the universal solution or algorithm, because I do not even know whether one exists.
The rules of thumb for converting an OO data schema to OWL would be useful if these can be enumerated.
I had come to feel the same as your note below expresses regarding there being "no" OWL-type subclass relationships in the OO paradigm. The concept in OO is that there is an object with private data. Messages are sent between objects. An object hierarchy is possible. But perhaps the object hierarchy has more to do with the model of how the computer is working (like the GUI) than with the "external world". A certain type of GUI window is a more basic window with some changes to the internal data and the behavior (methods). On the other hand, the OWL class hierarchy is designed specifically to support description logic.
Frames would seem to be more consistent with the OO model, where slot = a property that is a relationship. But again, it is my understanding, likely wrong though I do not know how to correct it, that the Protégé Frames notion of a frame is not the same as the notion of a frame (a context) that was developed by Roger Schank.
http://www.informatics.susx.ac.uk/books/computers-and-thought/chap3/node9.html
So, it might be that the rules for mapping between OO and OWL would be to flatten the set of "objects", in this case the SOA entities, into a set of classes having no subsumption relationships. Then two types of properties would be defined. One type would be restrictions such as functional and inverse functional restrictions on individuals? Many of the class restrictions (if I am saying this correctly), like disjointness or intersection, might not come up in the mapping rules.
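One way to read this flattening proposal (a sketch of my own in plain Python, not an established mapping rule; the schema and class names are hypothetical): copy every inherited attribute down into each class, then discard the hierarchy, so that no subsumption relationship remains.

```python
# Hypothetical OO schema: class name -> (parent class, its own attributes).
OO_SCHEMA = {
    "Event":  (None,    {"timestamp"}),
    "Action": ("Event", {"actor", "effect"}),
}

def flatten(schema):
    """Map each class to the full set of attributes it would inherit,
    producing flat classes with no class/sub-class relationships."""
    def attrs(cls):
        parent, own = schema[cls]
        return own | (attrs(parent) if parent else set())
    return {cls: attrs(cls) for cls in schema}
```

After flattening, "Action" simply carries all of its attributes explicitly, and nothing is left to be inherited by default, which sidesteps the exception problem discussed earlier at the cost of duplicating attribute declarations.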
One consequence of working through all of the issues regarding the SOA IM model is that we might be able to address the question "is there any implicit information that can be shown using an OWL reasoner, once the rules for mapping have been applied to an OO model?"
At this point I cannot even guess. In spite of the perhaps, in some respects, different intended uses for OO and OWL, both are used to organize data, persist data, and offer various means to use the data selectively.
A foundational and informed paper on this mapping task is needed.
More extended discussion at -> [4]
[1] The purpose of a colloquy is to expose the issues in a scholarly way. This requires the type of extended discussion that you are presenting, and we are appreciative of this extended discussion.
[2] Yes, it is of course true that the debate has been couched in this way. There is the appearance of completeness, since there is one side and then the opposing viewpoint. But a third viewpoint exists outside of this debate, and should not be shaped by the specifics of the debate between these two intertwined theoretical issues. The third viewpoint is informed by this debate, but not shaped by it, because the third viewpoint states a definition of natural complexity as being something that cannot be reduced “perfectly” to the formalism of classical logics, the so-called expressive logics, or Hilbert-type mathematics.
This is the position of the second school of semantic science. The two intertwined theoretical issues that you raise are those of the first school. The closed versus open world discussion does not allow the principled discussion of complexity as defined by Rosen, because the emphasis in this discussion is on the assertion that the closed world assumption can make sense as a complete theory of the natural world. Once this discussion stops, then the consequences of Rosen’s definition of natural complexity can be explored without getting derailed by the debate occurring within the first school. The second school must avoid stepping into that debate, and this avoidance is very difficult.
[3] Here is precisely where the polemic starts. Someone in the second school would never say that the open world assumption does not allow false statements. Certainly this is not the interpretation that someone in the second school would allow to start the “definition” of the position of the second school.
Openness means, to the second school, that an induction has to occur to measure the natural world, and that this induction has to occur in real time, based on the functioning of quantum neuro properties as discussed by Hameroff and Penrose and others. Formalism is a subject that comes in quite a lot later, in a discussion of the consequences of cognitive category formation and the resulting definition of formalisms like logic and mathematics.
[4] This attempt to prove something false is not how the second school approaches the question of reification, or the acceptance of a statement as having interest to (future) science. The second school might point out that the reduction of statements of truth, such as “I love my wife”, is deeply problematic. But this point draws the second school into the first school’s polemics. The point is that the true alternative to the first school does not approach the question of truth in a simple fashion. Precision and exactness are fine in cases where the dynamics (the entailment) is simple (as defined by Rosen). But in cases where the entailment is complex, one has to regard a “demand” for precision and exactness with reservations. The reservations are not merely hasty, but come from a mature viewpoint about the limitations of formalism and the nature of reality.
[5] This is, in practical terms, a hypothetical. In diplomatic conversation there is often an unwillingness to understand what the other person is saying.
[6] In fact, a formal property of symmetry regarding any two “friends” is a truncation of the reality that actually exists between those two individuals. In a strong way, one can say, as Wittgenstein did, that there is a language game, and that formal inference rules of this type are used only as a convenience. In the Blue and Brown Books (later Wittgenstein) he talks about the patterns that a key chain makes when thrown on a table. Like Hostettler later on, this part of the Blue and Brown Books talks about how similarity of patterns is the key to understanding how categories form that are then used to point to similarities and differences. This type of “logic” is then extended by quasi-axiomatic theory (Finn and Pospelov) to an analysis of the similarities of function given similarity of structure in biological expression, see:
[7] This is the key criticism of the first school by the second school, namely that this inferred assertion is taken as a computed truth. The problems only start with taking the result of computation as an asserted truth. But rarely can the first school get beyond this first problem.
[8] See the works by Alan Rector on the problems with the OWL definition of exceptions and defaults.
[9] This is exciting. In my mind one has to start out with all concepts of an ontology as having no subsumption relationships, i.e., no class/sub-class distinctions. Then the properties can be defined as the attributes are defined in the OASIS SOA-IM model. The properties can be aggregated into categories using “sameAs” and “differentFrom” properties on properties. The resulting properties now have almost the sense of slots within the frames concept of Schank. This frames concept of Schank is overwritten in Protégé-Frames to allow the class/sub-class distinction to be introduced. As a result, the original insights of Schank are overwritten by the Protégé paradigm.