
The Semantics of Identity

In this post I will try to demonstrate the semantics of identity in the context of “identity theft”, explaining why “identity theft” exists and why it is so absurd. The reason is not very complex, so understanding it is within anyone’s capacity. At least three perspectives are considered below.

There is of course the personal perspective. I shall refer to this as the victim’s perspective, for people are undoubtedly victimized by “identity theft”. There is another perspective, shared mainly by political, banking and lending institutions and by big business; I shall call it the institutional perspective. The third perspective is that of the skeptic and the critical thinker (not that all skeptics are critical thinkers, or vice versa).

To understand  the semantics of identity one must first understand that thinking is inseparable from doing.

In addition to the fact that there are at least two perspectives to any identity there are at least two ways of intersecting these active perspectives.   To understand the semantics, we can look at a static state of affairs –as we often must– and to do so, we freeze a situation for retrospective analysis.

There is also a way of seeing what is happening as a dynamic interrelated set of entities, objects and processes, because, as a matter of fact, everything is always moving and in motion.  One side is not independent of the other, each is an aspect of a single object or entity.   There is then, a static aspect to identity and a dynamic aspect to identity.

I am sure most readers would take all that as rather abstract so let’s try to make it more concrete at the risk of sounding a little absurd in the face of reality.  Recall now that the institutional establishments have invented the phrase “identity theft”, and that such establishments have a stake in controlling one’s identity.  Now theft of course, means that something, usually property, is stolen by somebody else, so we can dispense with further discussion of that part of the phrase.  The rest of the phrase implies that it is one’s own identity being stolen.

Let’s examine that; beginning with personal identity as this is the case we all know best.  For all our intents and purposes, the proper name of an identity is one such (relatively) static aspect of identity. A proper name has its own semantics.  A proper name is a label and sometimes a handle for an identity.

A label is meaningful for separating things, for calling people and for tracking; that is, a proper name is meaningful in the context of labeling things for distribution, segregation, control and similar purposes. Notice that we are no longer talking about identity. Here, now, we are talking about the utility of proper names. This fact, a seemingly innocuous switch of subject, seems very important to me, particularly as many people are victims of “identity theft” these days.

It is here that the skeptic might be thinking that all this is absurd and that it matters not what name is used for any purpose. Identity theft, for example, does not have to mean that one’s identity is being stolen, some skeptic may say. To that I would respond that words and names are often convenient labels, but that does not make them true or correct labels. For a label to be determined to be true, one must be able to examine the case and its implications in order to come to an agreement as to whether it is a true or correct application of the label.

If you are following me, you will resolve after due consideration that “identity theft” is not a true description of what is happening in such a state of affairs and within the surrounding cloud of circumstances. If you are a victim of so-called “identity theft”, what is being stolen is not your identity or even your proper name, because many people share the same proper name. What is being stolen is data or “information” linked to your occupational being here (and all about your participation in an economy).

Now, it matters only whether some property of the victim is being stolen or whether property of the establishment is being stolen or otherwise  misappropriated.

Your credit cards and wallet and bank account numbers and passwords in your possession are your property, but your credit history files are not your property.  The data in your credit history file is stored on computers that are not your own. The files and computers belong to the bureaucratic and institutional establishments. The data inside these files are fully intended and designed to provide information about you and to further segregate and classify you.

Now almost everyone has heard of the situation referred to by calling it “identity theft”. That actually means that someone has broken into your credit history and related files and is illegally using the information there to impersonate you.

By calling it “identity theft” instead of a high-tech crime of “data hacking” and theft of “information”, bankers and big business retailers and other lenders are able to make it your problem, as if your “identity” were being stolen, though it is not your identity being stolen. Why are their data and systems being hacked your problem and not their problem? Shouldn’t they be held accountable? Is it because “they have the power” and you do not? If that is what you think, that answers why it is your problem.

Understand that it is the actual digital files and information that they –the institutional establishment– created about you that is being misappropriated.  The files and information are not your property. Because they are not your property, why should you be required to ensure the safety and security of these files? Why aren’t they held accountable for their own property and for the damage they do to your reputation?  The answer is simple.  You do not realize your own power and you do not exercise your will to object and to push back against the establishment when it goes wrong-headed.

Therefore your elected officials with legal power over the institutional establishment allow them to do it. Electing new politicians cannot help you, if you are not thinking correctly.  The new politicians will allow it too. For one, they will say: It has always been that way.  The real or true reason though, is: It is allowed because the semantics are unclear.  The semantics are unclear because the labels are being used in a confusing way.  The consequence of confusion is more confusion.  A consequence is the effect of taking an action, so that is to say that wrong-headed actions are taken.  In the wake of confusion, lucrative deals come more readily than when things are more settled.

The semantics of identity is assignment.  Assignment is an elementary semantic process.  What does it take to make an assignment?  What are you doing when you make an assignment?  Creating theories and concepts about the way things appear and their implications is a part of critical thinking.

In the case that I tried to frame above, the issue is identity theft. So, we must question our assumptions about identity and what it is. Without semantics (that is, without basic symbolic knowledge of the original cause and the interrelationship of identity to the wholesome and unified awareness), one might accept that an identity is only a name, label, or some nomenclature that we use in language and in talking about people and other things.

Critical thinking requires one to be clear and precise about their concepts.

So my concept of identity converges to the assignment of a name or label in order to identify the object or process occurring to the awareness.  The semantics of identity is assignment.  This is to say, quite objectively, that an occurrence that appears to have an identity (such as the appearance of the computer before you) is such identity only because we have assigned it the name “computer”.  Identity is a symptom of the assignment we make.  What does it take to make an assignment?  What are you doing when you make an assignment?  You are using your authority and free will to do something.  Framing and answering such questions is a part of critical thinking.

If you are a parent, ask yourself who or what gave you the authority to assign your child a name, their identity. No kingdom or government regulates such a thing. If you are not a parent, ask yourself what gives you the authority to call something good or bad, to pass judgement by association, by naming or referring to something substantive of a thing. What power or right gives you, or me for that matter, the authority to call the computer a computer and not a calculator or some other name?

The semantics of identity is assignment.  An assignment confers rights to being.  When you make an assignment, what you are doing, is that you are conferring rights.  Thinking is inseparable from doing.

Whether it is done subconsciously, preconsciously or consciously, because it  is a matter of implication and consequence, and because it is held almost entirely within the realm of the mind and within the “processes of thinking”, until it becomes a matter of fact or consequence, most people tend not to even notice that they are exercising their own personal authority.  You exercise your authority to make an assignment while making any decision that is involved in creating or affirming any identity. This is done by conferring upon it, its right to identity.   Think of it… what power! What Grace?

Your right to identity along with your proper name – was conferred upon you by your parents. No one can take that away.  In fact, if you want to change your name, you need a judge to grant that right for good reason.  Why do you give into the ruse that your identity is being stolen –so easily?  Because the bankers and lenders say so?

Your parents give you your name, but your own self-identity arises from that perfect union and nurture from which your being here unfolds. Your right to life, liberty and the pursuit of happiness is a birthright. Does it make you happy that the institutional establishment has taken over control of your life and liberty, and that they have abrogated your own power and property of self-identity and used it for their own profit?

If you do not like that reasoning, try this: If you are a victim of so-called identity theft, then you have apparently nothing more to lose. Why not go to court and point out to the judge that your identity is quite intact, and that it was the lender’s data files that were hacked and stolen, and that this has created havoc in your life along with financial ruin? Now of course, you cannot do this if you are careless with your own property: your files, data and information. In this case, you must suffer the consequences of being careless, unthinking and out of touch with what is happening around you.

I am not a lawyer. Perhaps one will comment. I believe that if you are careful and the theft is not of your own property (i.e., your identity or your wallet or card or something else in your care and possession), then you have a case for judicial review. If your judge concurs that it is indeed a credit or banking system or file being hacked, causing in its wake irreparable harm that you can document, the judge may rule that the banker or creditor is at fault for not protecting the information in the first place.

Disclosing one’s private information, whether intentionally or not, is a crime that should carry hefty penalties as a multiple of the damage that is inflicted on the victim. Now don’t you agree after all, that we should all think more carefully and critically about the way words are used?

The Semantics of Critical Thinking

In an earlier post, I tried to answer the question: Why we  need semantics and why it is so important for computers.  I think that I touched on the problem of critical thinking and how hard it is for people to do.  That is a part of the semantics of critical thinking but it is not the whole story.

Critical thinking is thinking about your thinking –mainly examining your questions, assumptions, knowledge and beliefs before deciding an issue or a question. Many teachers or educators who teach critical thinking would emphasize the reflective and introspective activities and notions for which one should account in the process of critical thinking.  One might call such things the “elements” of critical thinking as in this wonderful model of critical thinking published by the Foundation for Critical Thinking:

http://www.criticalthinking.org/CTmodel/CTModel1.cfm

A Model of Critical Thinking

For example,  if you are a thinking human being, then you will experience the need or desire to really think hard.  When you think about something very important to you, then you should try to use critical thinking to figure out your best options.  The model above tries to explain that “much of our thinking, left to itself, is biased, distorted, partial, uninformed, or downright prejudiced. If we want to think well, we must understand at least the rudiments of thought, the most basic structures out of which all thinking is made. We must learn how to take thinking apart.”

I do not agree that these are “the most basic structures out of which thinking is made”, though as the “elements” of this model of critical thinking, they address the several aspects of critical thinking, such as getting information, evaluating assumptions and inferences, and using a theory or concepts, along with standards for critical thinking, such as clarity, precision and relevance. The link above leads to the interactive model, where you can get more information on how to apply critical thinking.

While you look over the model, what you will come to realize, if you did not already, is that critical thinking is hard to do. There is a lot of reflection and introspection, and there is some reasoning as well. This explains why we need semantics for computers. Semantics or semiotics is the foundation of critical thinking. The concepts, theories, principles and axioms called for in the model above are necessary. How many people have all the concepts, principles and theories? How many who say they do can actually justify them and qualify them from their own assumptions or beliefs? Such questions cannot be answered. It seems quite enough to say: not many.

That is why we need a semantic theory for computers.  The concepts, theories, principles and axioms of critical thinking are semiotic concepts, theories, principles and axioms –without them we have little solid ground for processing symbolic thought.  That is what makes Adi’s semantic theory such an important breakthrough for the computer industry and for everyday people.

For the first time in history, we have a sound theory that provides the concepts sufficient for dealing with four of the eight elements of critical thinking (of this model), that is: processing the implications and consequences, the assumptions, the interpretations and the inferences relevant to any of the other four elements: information, point of view, purpose and questioning a specific issue.

Further application of Adi’s semantic theory will lead to Computer-Assisted Critical Thinking (C-ACT), where we can rely on all the great functions of computers to apply acceptable standards to well-grounded critical thinking, namely: clarity, precision, accuracy, logic, completeness, validity and relevance, etc. We all need to do some critical thinking. With a little luck, a few resources and some support, we will soon be able to have our computers help us do critical thinking about the questions we have and the issues we want to resolve.

The Semantics of Disconnect

Check out this ebook about what matters now (it is in pdf format).  I have added my meme below.  If you are touched by what matters most to you, you should post the ebook at your blog and add your meme to it.

My meme for what matters now is the

D i s c o n n e c t

in meaning and life that many people experience.

The Choice

It is your Choice.

It has been some time since my last post and not much has changed with semantic technology except that business slowed for many. Tom Adi and I have been busy with pending publications, one of which was announced here. That article has all the details of the algorithms and information technology that were derived from Tom Adi’s original research.

For those that are interested in reading that peer-reviewed article, this post and the links provided here will provide you a framework for understanding the cognitive and semantic theory that is introduced and presented in detail there.  I think it might be hard for database programmers and systems engineers to follow because they are familiar with data reduced by some kind of independent and reductive determinism, such as with a statistical or intentional model.  This involves neither of those techniques.  This implies there is something to learn.

The original research and the unifying processes described here are used to characterize the determinate elements and operations of an otherwise indeterminate situation. Characterizing the determinate elements and operations of a situation is part of the scientific process of discovery. Discovering and characterizing such elements and operations has the character of a learning process and not that of an ideology (the body of ideas reflecting modern-day logical deduction and reductive determinism common to the Internet computing culture).

We were not trying to understand or create data or knowledge or information processing models, except as we conceptualized how to create experiments and to test and implement text analytic systems of understanding. Tom Adi is a computer scientist, though his original research and our decades of work together were squarely aimed at understanding how the human mind identifies and interprets the determinate elements of salience while reading expressions from messages and texts. This is certainly something a human mind does: read and interpret stuff about the world around it.

Understanding the human mind is important to researchers and scientists. New scientific findings suggest that cognitive skills are activated from outside of individuals, outside of their independent minds and away from their independent being. This understanding, gleaned from studies of apes and children in learning situations, sheds additional light on why computers are unable to intelligently learn from AI models, whether by reason of logical patterns, by statistics or by other models of intentional semantics.

This Nova program series about Ape Genius highlights studies of a variety of primates, including humans. The research focuses on the capability to learn, to respond to reward and gesture, and on experiments that measure cognitive and intelligent task mastery.

This research points out that a big difference between the other primates and humans is found in the capacity to teach. Humans differ from other apes in that human children are taught and may even anticipate being taught. Researchers found what they called a magic Triangle: the social situation where a student and a teacher are focused upon a substantive object. They see this as the key to why this world is the domain of human beings, as it is, rather than the Planet of the Apes.

A similar sort of thing happens when a reader focuses attention onto an instructive piece of writing as this post may be found to be, that is:  the same forces and influences come into play as those that exist between an independent teacher, a student and a substantive object (or subject) of attention: the Triangle –here between author, reader and the subject of the post.

This Triangle exists in logic and in the conventional functionality involved in all interpersonal communications including film media.  Consider the reference to the planet of the apes above.  I am invoking the Triangle to put the focus of attention on the semiotic of the film: that we could just as well be digging termites from mounds in the jungle, if we did not already realize that we have this capacity to recognize the conceptual interdependence between teaching, learning and the substance of existence– from its semiotic character. To bring this out of one’s subconscious mind means one must bring the substance of the concept to the attention and into the mind and struggle to hold and absorb it –that is the capacity to recognize existing interdependency.

In fact, while one can readily separate everything in their world into emotion, feelings, knowledge, beliefs, desires, demons, angels, tools, material goods and artifacts and what have you, there are two possibilities for the existence of such things. There is the stuff of the conception and the stuff of physical matter. Any sensuous impression or perception falls into the extension of the conception or into the extension of physical matter. We can write that any thing belongs either to the physical substance S of its existence or to a conception C of that existence, and that conscious being, as it is, is cognizant that there is a material equivalence between them (it has a prior knowledge that there is interdependency between the physical and conceptual substance of existence).

And I should further define it here as that unified being: a Triangle of interdependently conscious understanding Ui working on a system comprised of physical substance S and of the conception C or Ui[S,C] where the conception of existence C is materially equivalent to the physical substance of existence S in a logical sense (C<==>S). The implication (meaning) is that the conception C is true only when the substance S is true otherwise the conception C is false (in a logical sense).
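The relation C<==>S can be read as the ordinary logical biconditional. What follows is only a minimal Python sketch, assuming that C and S are treated as plain truth values; the function name `materially_equivalent` is my own illustration and is not part of Adi’s theory or its notation.

```python
from itertools import product


def materially_equivalent(c: bool, s: bool) -> bool:
    """Material equivalence (biconditional): holds when c and s share a truth value."""
    return c == s


# Enumerate every combination of truth values for the conception C
# and the substance S, and record whether C <==> S holds for each.
truth_table = [
    (c, s, materially_equivalent(c, s))
    for c, s in product([True, False], repeat=2)
]

for c, s, holds in truth_table:
    print(f"C={c!s:<5} S={s!s:<5} C<==>S holds: {holds}")
```

Under this reading, the equivalence holds only in the rows where C and S agree, which is the sense in which the conception is true when the substance is true and false otherwise.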

In a material sense, there is a transformation of substantive and salient impressions into a conception of that substance.  The transformation can be seen as a cognitive learning process. One can conceive of that transformation as part of one’s intelligence or cognitive processes, though because it is a biological system, its operations, elements and processes appear to have a relative structure very much like the metabolic elements, operations and processes where molecules are transformed into metabolites according to the combined structure of determinate elements and the needs of the organism.

In this way one can conceive of their cognitive system of conception [S,C] in the same way one conceives of their own Metabolic Repair system [M,R].  The purpose of the conception can then be seen as the innate capability to repair the substance of one’s own existence just as one’s own metabolism repairs their own body.  The implication is that we can repair this existence.

You have probably not heard of the human intelligence and the elements, operations and thematic relations of the cognitive processes being equated to the metabolism before.  This is because it is an original idea.  There are many implications of this projection of reality, which I trust the reader will bring to their mind as they consider what has been reported here. Most important is the significance and priority of interdependence over independence, though the realization of the unifying purpose of the conception is individually liberating.  It has the power we need to change the world in which we exist.  The prospect of unifying humanity against disorder and chaos is not as daunting when the natural interdependency of existence is considered.

Now it is important to everyone that their conception and their being here in this place we call the world, is unified and not schizophrenic– otherwise we will have chaos and misunderstanding. A schizophrenic existence is one where the elements of the existence are not only disjointed, they are disparate and even antagonistic. Because that is how people are today –disjointed, with disparate opinions and beliefs and antagonistic feelings– we have a serious situation that is in need of repair. Didn’t you know, people feel disconnected, even unaccountable.  There are reasons for this.

I will come back to the unifying processes of the unified conception, which is the conceptual part of meaning of conceptual interdependence, in the next post.  In this post I want to define the importance of the interdependence part and introduce the reader to the social influences that were invented to divert people from the power of their own autonomous reasoning, and instead keep them in line and under control as a whole– that will come down below.

Returning now to the interdependent structure: unlike intradependence, which expresses the inward functionality of the elements of wholeness, the unifying processes and their orientation, the functions of interdependence must reach outward, away from the self and towards others. Still, maybe a little surprisingly for some readers, these are unifying processes implementing a unifying process. The implication is that reunification will be achieved in the end. Considering how far barely unified nations of people have advanced the race, it raises the question of why we cannot achieve a unified world order in our lifetime.

There is a way, though before you can recognize it, you must first consider and acknowledge that there are extensions to functional and thematic relations as well as intensions to all social relations. The extensions of the thematic relations between the self and others are called social or interpersonal, and these situations and states of affairs are addressed by social interdependence theory. Nouns and verbs and other descriptive and lexical elements of language and its grammatical conventions fall into these extensions, as do knowledge, beliefs, opinions, etc.

The intensions of the thematic relations are comprised of the elements, the boundaries and the engagement conditions enveloping and existing between the self and others and from within which motives are activated. These are addressed by conceptual interdependence theory which states that there are conceptually interdependent boundaries and engagement conditions that are uniformly projected (according to precedence and by way of an extended projection principle) onto the unifying and determinate elements underlying every state of affairs.

In this respect, it does not matter if the substance of that existence has physical or conceptual properties or attributes, as such properties and attributes are neither distinct nor separate. What is of significant importance to the Triangle is the fulfillment of geometric points and angles in the construction of its structure: a teacher and a student, each focusing attention onto a substantive object, form a conceptual structure. This structure is the subject matter of Adi’s cognitive and semantic theory.

This conceptual structure can be understood from the ways a situational analysis is conducted according to social interdependence theory. In the abstract of their 2008 paper, Why We Need Interdependence Theory, social psychologists Caryl E. Rusbult and Paul A. M. Van Lange write on social influence:

Interdependence theory identifies the most important characteristics of interpersonal situations via a comprehensive analysis of situation structure and describes the implications of structure for understanding intrapersonal and interpersonal processes. Situation structure matters because it is the interpersonal reality within which motives are activated, toward which cognition is oriented and around which interaction unfolds.

In a very similar fashion, the thematic relations of my conceptual interdependence theory (comprised of my interpretation of Adi’s cognitive and semantic theories) identify the significant characteristics of the Triangle, that is, the interpersonal reality within which teaching and learning motives are activated, toward which cognition is oriented and around which interaction and representation (speech, reading, writing and arithmetic) unfold.

These themes, unlike their extensions, are not linguistic, but are pre-linguistic, in their origins.  The necessary thematic relations are not given by nouns and verbs or other parts of speech. You can easily recognize the polar coordinates of pairs of adjectives like good or bad, fast and slow, pretty and ugly, yet such extensions of concepts have little to do with the inherent boundaries of conceptual interdependence except to demonstrate such interdependence in the existential objects and subjects they denote in their extension or reference.

The philosophy of language does not adequately account for word structure, nor for the elements, operations or interdependency of the thematic relations indicated by internal structure. What does a noun have to do with activating and focusing the attention? What primacy of the gestalt is captured by the verb? If you go down that road, as many researchers have, all you end up with is shifting assumptions, nearly whimsical conventions and delusional though deductive relativism.

Information scientists depended on the faulty ideas of a few linguists, and this explains why AI models are unable to learn on their own: the thematic relations identified by linguists with their various natural language and grammatical models are not the thematic relations we need for capturing conceptual, social or any other kind of interdependence outside the syntax of the sentence.

Yet –at the foundation of the understanding there are these thematic relations on which all teaching, language, communication, logic and mathematics continually revolve and from which ideas and thought arise.  I will get into them a little deeper in my next post. There I will take up the unity of being, the unity of the conception and the system of understanding the world as an anticipatory system Ui[S,C] introduced briefly above.

Here below I want to offer the reader this comprehensive treatment of the subject of social influence in the form of a four-part (four-hour) BBC documentary series. After watching this series, I trust you will agree with me that in order to keep from being duped by all those who would control our deepest emotions and desires, we must know the elements and operations that are used for that control, so that we are able to recognize it and learn to avoid its effects when such control affects our own lives.

How we (the American culture) were drawn into this present-day reality, and how we are affected by powerful influences without even knowing it, is plainly portrayed in this BBC documentary. In light of present-day economic circumstances, it presents a chilling commentary on what got us here, and it may be a harbinger of what is yet to come.

Each part is about one hour and I realize how difficult it is for some people to pay attention for more than a few minutes. But if you are less than one hundred years old, you will find much of this relevant and quite interesting. If you are socially and politically conscious, it will be even more worth the time it takes to watch, I promise.

The Century of the Self

* Century of the Self, Part I, Happiness Machines
* Century of the Self, Part II, The Engineering of Consent
* Century of the Self, Part III, There is a Policeman Inside All Our Heads: He Must Be Destroyed
* Century of the Self, Part IV, Eight People Sipping Wine in Kettering

After watching the series, ask yourself:  Can the American Self realize its interdependence after centuries of hard won independence? Chances are, you will be able to judge in your lifetime.  Leave your opinion as a comment here below.  I’ll be back within a few weeks with the followup post on the unity of being and the unity of the conception.

I have heard it said that “a rose by any other name would smell as sweet”. The original line comes from Shakespeare’s famous play about Romeo and Juliet:

“What’s in a name? That which we call a rose
By any other name would smell as sweet.”

According to scholars, Juliet, prevented from marrying Romeo by the feud between their families, complains that Romeo’s name is all that keeps him from her. Juliet’s lines before the quotation most often remembered are:

“’Tis but thy name that is my enemy;
Thou art thyself, though not a Montague.
What’s Montague? it is nor hand, nor foot,
Nor arm, nor face, nor any other part
Belonging to a man. O, be some other name!”

She is certainly complaining that a name is no “part belonging to a man.” It is not part of the substance of being a man. Juliet is trying to make sense of her conflicted being, inflamed with her heart’s desire. Perhaps as part of her thinking about ways out of the conflict in which she is engaged, she reasons that there is no particular cause why her man could not be called by another name. She pleads for another judgment, supported by her famous argument that the “essence”, or “bare and particular substance”, of a rose is its sweet smell, which would remain if it were called by any other name. She continues in the following lines:

“So Romeo would, were he not Romeo call’d,
Retain that dear perfection which he owes
Without that title. Romeo, doff thy name,
And for that name which is no part of thee
Take all myself.”

She concludes that Romeo would still have that “dear perfection which he owes” were he not called Romeo. Finally, she asks that he forsake his name for her. This is not an easy thing she asks. Forsaking one’s name means forsaking one’s family, one’s ancestors and heritage, and perhaps one’s fortune and inheritance. In these lines, she does not offer to forsake her own name. That demonstrates a small, unavoidable and undeniable part of the subjective obfuscation of the “dear perfection” which is the bare and particular substance of everything in existence.

Human culture is unique in its capacity to name all the things that touch our existence and impinge on our experience, whether or not such things have or show tangible form. Such is the distinctive character and aroma of the rose, or the distinctive character and dear perfection of Romeo, for example. Other kinds of creatures that have brains, intentional states and awareness do not confer names on things and record them for their posterity.

This suggests that one owes all that adds to one’s own knowledge and character to that “dear perfection”: that Wisdom to which each of us may often appeal, and which exists, remains and endures beyond individual and subjective existence and experience, and certainly beyond the names of things. For what might become of that presence we call a rose if no one were ever around to experience its sweet aroma, to call it a rose, to cultivate and appreciate it; no one with any senses to thrill, existing or thinking?

The terms “thought,” “idea” and “belief” are just names for the “stuff” of that dear perfection that seems to flow into and out of each of us, just as the name of the rose is a handy moniker for the apparent salience of that unmistakable aroma and perfection sensed upon its appearance or recollection. These names are terminology we invent to order, “slice up” and talk about the presence of “stuff” going on or happening in our heads, in our hearts and all around us, only because that same stuff is particular to everything that is going on or happening and it cannot otherwise be distinguished.

Therefore it is a distinguishing process, this conceptual, reflective thinking in which each of us engages. Thinking is something that each of us happens to do, though the stuff or substance that we take in as input to such awareness and cognition is nebulous and it is regularly deemed unclassifiable as it is channeled, consumed, recognized and altered into a product observed or otherwise output.

It is for one’s own self to distinguish, for individual perspectives of that stuff can only be accorded a spontaneously occurring designation suited to the moment, to an aspect or to a function. It is wise to: Know thyself. Yet, if we are to know it, however sophisticated the designation that is born of that special ideal, we ought not to mistake the cognition or the designation of it for the substantive power of it: this power by which each of us is gently impelled and often rigidly compelled to register, to resolve and to reason.

Shakespeare wrote:  “We are such stuff / As dreams are made on”

Dreams are a force on their own. The mind is made on that same sort of stuff. Though we cannot quite touch this powerful substance, we cannot deny the forceful and influential effect the dreams of all humanity have had on our collective and individual awareness. In order that we may dream a dream and think a thought, indeed, that we may enjoy the inherent capacity to know a rose, we have this power to use, to do work: to try to extract or separate some of that essence we call the rose from its existence in that dear perfection in which it resides and from which all the world gets its share.

The Egyptians attributed such power and rudiments to Thoth; the early Greeks attributed the power to the Logos. Today, some call this Providence and many call this the power of knowledge. The rose is not a lotus nor any other kind except that it is. The existence of names and terms, and the pervasive use of language throughout the time of human civilization, support the fact that there is present, common and enduring value to that dear perfection empowering and diffusing every idea: the cumulative and dynamically incremental heritage of creaturely sharing in which each of us persistently partakes and delights.

That there are rudiments derivable from such distinctly human qualities, and that these are representative sign functions for the wisdom and thought also obtainable from the gamut of human languages, seems incredible… and but for Adi’s Semantic Theory we are indeed clueless, rudderless. Many scientists, those skeptics called relativists, even professional linguists, resist the idea that the stuff of dear Perfection, the Logos behind all speech and every human word, is indeed amenable, if not to definition, at least to utilitarian indication; and there is meaning enough in that for everyone’s pleasure.

A New Theory of Cognition

I am happy to announce that “A New Theory of Cognition and Software Implementations in Information Technology” will be published in the April-June issue of the Journal of Information Technology Research, Vol. 2, Issue 2, 2009.

Abstract

“The Scientific Method means that theories are developed to explain observed phenomena— similar to the task of text analysis—or to search for unobserved phenomena—similar to text retrieval. Theory development means testing a large number of proposed theories (hypotheses) until one is corroborated. To make theory development efficient, a method is needed to construct promising theories—ones more likely to succeed. Such a method is part of a new theory of cognition that is introduced. The theory is implemented in the software Readware. Readware uses theory development methods for text analysis and retrieval. Readware’s development, features and large-scale performance are reviewed. This includes a fast ontology-building system, the cross-lingual word-root theory base, a language to code theories, algorithms and ontology implementations, and software applications and servers that perform text analysis and retrieval using Readware API functions.”

Article copies are available for purchase from InfoSci-on-Demand.com.  You can also look for the publication in your local university library.

The Keys to Relevance

A key is a fundamental or central operative of harmony. The connexion of relevance is recognized concordantly.

A quick read of popular technology news and review sites gives one the impression that the trouble people have with search engines (those called semantic search engines and all other search engines too) is the relevance of the results. This, of course, is aside from any trouble people have with the actions of the companies fielding the search technology, e.g. corporate entities such as Google, Yahoo, Microsoft, Powerset, Hakia and Cognition, among about a hundred others.

The problem I want to address is the problem with the relevance of the results, because even with the new crop of semantic search engines using sophisticated Natural Language Processing (NLP) technologies, the problem remains. In fact, NLP technology hardly addresses ambiguity, let alone relevance. There is a good reason for this that I can now summarize, after dealing with it for more than twenty-five years. The problem actually stems from the abstract ideas about relevance.

By ideas, I just mean people’s thoughts and deliberations over exactly what is relevant. For corporations, the answer is very obvious: what is relevant is nothing but that substance that increases corporate equities. That substance may be abstract to some, but it is very real to corporate shareholders and the business managers they employ. It makes perfect sense that money and wealth, and any investment of the same in assets that generate that substance, are what is relevant to business. Any assets that do not perform, or have little or no uptake in that way, are dumped. This is why industry, and the corporations and businesses thereof, have overlooked the keys to relevance and have instead held steadfast to their own values of relevance to their own institutions.

However, a quick scan of current events shows that this substance they value so highly is not very real. It can evaporate and disappear before your very eyes. This is because money has a closer connection to fuel than it does to the fundamental keys of relevance. In any case, that shows that this substance (money, wealth, etc.) is not a fundamental key to relevance. The same thing is true of the objects of Natural Language Processing technologies: they are not the substance or keys to relevance; some of the objects they use are the signs of such substance: the semiotic keys.

A word is such a key: a semiotic key, not a harmonic key. A harmonic key is the substance of relevance and meaning of which the word is a semiotic key symbol. A symbol is a sign of something invisible or abstract. A word is a sign of abstract harmonic keys that are the substance and essence of relevance to judgment. As I pointed out in my previous post, these keys are marked by sounds, phonemes, letters.

Harmonic keys would behave just as they sound, and they would not leave anyone reeling in discord and conflicted in such a way as money markets are conflicted today. This is because the keys to relevance are concordant to the essence of relevance in one’s own mind. By that I mean, what is relevant to one’s own judgment. Just what is that?

Those are, of course, the premises required by the rational mind. For the premises of an argument for your judgment are often left unstated or hidden: left for you to figure out, left abstract.

Would anyone like to know more? For example, would you like to know the premise of fear? Why do people feel fearful about the economy today? The premises for this are factual. The signs appear in the outlook or horizon and in the word fear: the semiotic symbol itself. If you know this premise, you already know the reason for your fearfulness. It is like asking: what is the fundamental quantity of fearfulness? The fundamental quantity of a physical substance is mass, length or time. What is the fundamental quantity of a conceptual substance like fear? Wouldn’t you want to know this measure?

Leave a comment if you would.

The notions of meaning, semantic mapping and relevance are nebulous not because they are fanciful, though that could be argued in many cases, but because the conceptions or mental representations of these notions, formed in people’s minds, are neither entirely clear nor plainly understood.

Many, particularly those involved in computer programming and computational linguistics, would argue that one cannot know the concepts in people’s minds. They consider thoughts to be ephemeral, and what people may have in mind to be relevant but murky at best. Because meaning is the effect of interpreting a sign (something that refers to an object) on the interpreter’s mind, we must attempt to understand these signs. In this attempt, we must clarify not only our language but the interrelations of its terms to our thoughts, our emotions and other personal motivations.

In this post we will take flight, despite the forecast, through the storms of controversy and the clouds of confusion. I will take you to a place where the meaning is sound and discovery is a realization. It is a long and difficult exploration so get comfortable before you begin.

Because people think with words and communicate with language; because we learn by reading and writing, and because laws are written with words, it makes sense to understand what it is about words and language that is connected to thinking. We know it is meaning that connects it all together.

What every sense-maker wants is to abstract the significant. In the face of incompleteness we may settle for the explanatory power of the elements and dimensions of that meaning. This is so that a course may be predicted with greater confidence and so that we may reach a more calculated response. Because, as any sense-maker knows, impressions evoke more perceptive thought and such thoughts provoke action.

Because human behavior is the cause of severe problems in society, it would be a great assistance if computers were able to help clarify to people, to us, the elements and dimensions of thought processes and patterns underlying languages, logics, laws, computing, and all inventions of the mind. If only so we can be more certain we have the same things in mind.

It is like that in the case of the moniker semantic search engine.  What is a semantic search engine? It is obvious that we agree about the search engine part. Agreement about the word semantic is less certain; we do not have the same thing in mind at all.

Computational linguists, generally speaking, do not study meaning in the general sense outlined above. Instead, linguists, under the influence of leaders such as Bloomfield and Chomsky, left the study of meaning and mental representations to psychologists and to others. The study of meaning, mental representations or semantics was deemed unscientific. Chomsky called it “mental gymnastics” and “pseudosemantics”. Whoever has a linguistics degree now was trained in that tradition.

In this article I will write about semantic maps, and about the semantic map employed by Readware technology in particular, mainly because that is the one I know. I will also emphasize where Readware’s approach and semantic mapping depart from the traditional approaches that combine natural language processing (grammar-based NLP) with artificial intelligence and database techniques.

(Full Disclosure – I developed Readware technology along with Dr. Tom Adi.)

A semantic map is related to the schema used in relational database technologies with a data dictionary. It is used to model the data in the system. In semantic web products like Powerset, and some NLP-based products, a semantic map is used to model the data in the system according to the ontology being utilized. Those that don’t use an ontology specifically consider graphs of triples (e.g. (a, r, b), read as “a relates to b”) to be semantic mappings that map language or some other item into a logical assertion, also called a concept. Let me depart to define the term ontology.

In AI technology, and on the semantic web, an “ontology” is a set of classes, attributes and relationships (essentially a set of assertions) that are used to model a domain of knowledge in a language that is close to that of formal logic (according to Tom Gruber). In addition to use in mechanical inference, the main purpose of such ontologies is not to study natural phenomena but to integrate “heterogeneous databases, enabling interoperability among disparate systems,” Gruber says. This is different from the long-standing use of the term in philosophy, where it refers to the study of being (in the world) and where its purpose is the study of natural phenomena.
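To make the two notions above a little more concrete, here is a minimal sketch in Python of a tiny "ontology" held as a list of (a, r, b) triples. The class names, relation names and the helper index_by_subject are all hypothetical; they are invented for illustration and are not taken from Powerset, Cognition, the Dublin Core or any other product or standard.

```python
from collections import defaultdict

# Hypothetical toy assertions, each read as "a relates to b".
# None of these come from a real ontology or product.
triples = [
    ("rose", "is_a", "flower"),               # class membership
    ("rose", "has_attribute", "sweet smell"),  # attribute
    ("Romeo", "member_of", "Montague"),        # relationship
]


def index_by_subject(assertions):
    """Group (subject, relation, object) triples by their subject."""
    graph = defaultdict(list)
    for subject, relation, obj in assertions:
        graph[subject].append((relation, obj))
    return dict(graph)


graph = index_by_subject(triples)
print(graph["rose"])
```

Note that everything such a structure "knows" about a rose is the set of assertions someone typed in about the word; that is the point taken up in the paragraphs that follow.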

An ontology of information assets (e.g. the Dublin Core) can help me sort and identify media, but it cannot help me understand why some people of some nationalities want to kill everyone of another nationality. What is the meaning of that? Isn’t that phenomenon worth studying so we can avoid that sort of behavior in people where we live? Talk about a use case for disambiguation, but you won’t find any NLP technology up to that task in this case. They say it is not possible with current technology, but what they mean is: it is not possible with their technology.

I bring this up because some researchers, computer scientists, developers and even product managers who talk about a “semantic map” are not mapping words onto psychological or cultural meaning at all. It is not even about natural phenomena or “meaningful” experience. They are mapping words onto the artificial classes, attributes and/or relationships (assertions) that are used to model a domain of knowledge coded in their computational specification.

According to their announcement, Cognition is mapping words by word forms and word senses (attributes of words) and by synonyms and antonyms (relationships between words), among other mappings. They call that meaning and semantics. Besides the significant difference in our understanding of meaning, this is also where there is a significant departure between the traditional logic and techniques of artificial intelligence and those of Readware technology.

You see, the objects of NLP-driven search engines (and of AI programs and RDF files) are assertions. Millions of assertions, if you heard what the folks at Cognition said. Each vendor is trying to prove they have or make more assertions than the others. It is as if they reason that if you can catalog all the possible choices, you can sort out which ones apply at specific instances.

It is not so easy. Because information is generally incomplete and inconclusive, just like it is in the case of language and semantics, one must study the subjects to get to the truth. And, in cases like this, the truth is often hard to recognize and may be hard to pin down.

Every natural phenomenon we sense has to be studied to some extent. It is how we learn. The best way to study any natural phenomenon, such as language and meaning, is with the scientific method. For literate people brought up in the western traditions of the world, thinking is less rigorous and formal, but it is patterned on this method, and it is how we learn.

So let me say how I think and learn. I think with conjecture, guesses… theories… about the nature of the real things and principles I recognize. This leads me to choices and to assertions. I conclude with what is true or not and what assertions might be made; I do not start with them. I start, usually, with a conjecture and begin to refine a simple idea or mental representation of the reality of the situation. Descartes called these initial thoughts innate ideas and Kant referred to them as a priori judgments.

Let me ask you seriously:  Do we want computers to check our logic or do we want them to help us create more useful and concrete theories for solving the problems we face?  Do we want computers to be spectators of the human condition or do we want computers to be useful participants assisting us in thinking about how we can improve the human condition?

I think constriction to assertions and traditional logic limits the help computer programs could provide in clarifying theories, evaluating choices and making predictions. In the words of the eminent mathematician Dr. Vaughan Pratt:

Traditional logic, like classical mechanics, is a spectator sport: there is an apparatus and a separate observer. Information flows from the apparatus into and around the observer, whose measurements are assumed not to disturb the apparatus. The observer is therefore an information processing system, the essence of which is a graph with nodes A,B,… along whose edges f:A->B (measurement f with source A and result B) information flows. The apparatus itself does not see these edges (but constitutes the sources of some of them) and is not disturbed by the observer. The graph of an idealized observer is a Heyting or even Boolean algebra in the case of nonconstructive logic and a cartesian closed category in the case of constructive logic. Considerations of computational complexity and relevance may call for weaker observers, but not so weak that they disturb the apparatus.

The essence of traditional logic then is an intelligent graph reaching its edges into an unsuspecting structure and contemplating its behavior.

This is useful for static structures and well-known procedures, but language and the world itself are made up of dynamically changing structures and interrelated processes. Nothing is really static. So a key difference between Readware technology and the AI technology used by NLP approaches is the difference between being a spectator and being a participant. The nature of Readware logic is to order and interrelate the elements of a structure and thereby determine the essence of its controlling processes.

The Readware apparatus or semantic map is an information processing device (a regular sign system) and the observer (the mind) is a controller. The observer receives partially processed (ordered) information from the apparatus and responds with decisions (and even assertions). The abstract objects of Readware are not assertions, and Readware algorithms generate theories, not assertions. There is a big difference between these two notions and the consequences of their common use and deployment.

Notwithstanding the search relevance, and as for the rest of the results NLP products achieve, let me say they show an incredible amount of sophistication.  The parsing and recognition of word forms and word senses is world class in all the systems on the market.  The products that implement the complex parsing and indexing of documents into word forms and senses and entity classes and relations are world class products.  I would love to have a stack comprised of Readware and any one of these language processors.

What this crop of vendors does not do (even though their claims imply they do) is pattern experience well enough to induce meaning (in computational memory) from the word forms themselves, as I described above and as literate people do. Let’s look at an example.

Psychologists who study emotional traumas consider language to be some of the best evidence available. If someone says they hurt, they should also know that they are hurt, and therefore the self-report of being hurt is valid evidence along with other behavioral and physiological evidence, or the lack thereof. It is important for the care provider to recognize, or create a mental representation of, what it means for the patient to be hurt, particularly in the instance where there is a lack of physiological or other behavioral evidence.

In order to interpret self-reports (and testimony, text reports, etc.), one needs to study language. This is best done using natural language semantics. To study the semantics of language one needs universal elements or objects. Linguists, psychologists and others who do study semantics look for such semantic universals: concepts that are cross-lingual and cross-cultural. Being in psychological pain is not an English or even a Chinese language object; it is a meaningful impression of biological or psychological activity.

Such universal concepts are the major defining characteristic of a “semantic map” for a computer program whose vendor makes the incredible claim that:

We have taught the computer virtually all the meanings of words and phrases in the English language,

as Cognition chief executive Scott Jarus told AFP.

He could have claimed that Cognition had cross-referenced all word forms and senses known to the English language with the definitions of all the words and phrases in their lexicon. I believe they may have done that much. However, if the “meanings” of the words were really taught to the computer, then the computer ought to be able to look up a word and use its definition and any other entailments to perform a search on the subject and report the result.

The Cognition search engine clearly cannot do this. It can only search on the word or words a user provides. Pick any word and try it for yourself. Being able to look up any word and cross-reference all its details is a skill that goes a long way. But reading, interrelating and using those definitions to explain reality or experience is something else entirely; this is where a semantic map becomes necessary.
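To illustrate the difference between searching on the given word and searching on what its definition entails, here is a deliberately naive sketch in Python. The one-entry lexicon, its definition text, the stopword list and the two sample documents are all made up for the example, and the functions expand_query and search are hypothetical names. This is neither how Cognition works nor how Readware works; it only shows what "look up a word and use its definition to perform a search" could mean at its simplest.

```python
import re

# Hypothetical toy lexicon; the definition is illustrative only.
LEXICON = {
    "fear": "an unpleasant emotion caused by the threat of danger or harm",
}
STOPWORDS = {"a", "an", "the", "of", "or", "by", "to", "and"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def expand_query(word, lexicon=LEXICON):
    """Return the word plus the content words of its definition."""
    terms = {word.lower()}
    definition = lexicon.get(word.lower(), "")
    terms.update(t for t in tokenize(definition) if t not in STOPWORDS)
    return terms


def search(word, documents):
    """Return the documents sharing at least one term with the expanded query."""
    terms = expand_query(word)
    return [doc for doc in documents if terms & set(tokenize(doc))]


docs = [
    "Investors sense danger in the credit markets.",
    "The rose garden opens in June.",
]
print(search("fear", docs))
```

A plain keyword search for fear would return neither document, since the word does not occur in either. The expanded search finds the first one only because "danger" appears in the toy definition, which is still word matching, not understanding.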

In psychology, a semantic map is a pattern imposed on reality or experience to assist in explaining it, mediating perception, or guiding response. That is the conception of a semantic map I want the reader to have in mind as we continue.  By understanding a semantic map in this way, one has an intimate way of evaluating the efficacy of a proposed semantic map without buying into the computer products first. Because evaluation of semantic maps is the critical and necessary step before adopting them, let me be redundant.

  1. A semantic map is a pattern imposed on reality or experience.
  2. The purpose of a semantic map is to assist in explaining experience.
  3. The purpose of a semantic map is to mediate perception or guide response.

Those announcing semantic maps should meet these criteria and explain what part or how much of reality they have successfully mapped. My own participation in the research and development of a semantic map (we called it a semantic matrix in the original work) began in 1982.

Getting through the online noise and storm around the concepts of semantics and relevance, to the actual elements and dimensions of “meaning” and “human understanding”, is a long-term, often frustrating and sometimes harrowing experience. Now that I am thinking about it, it puts me in mind of spiraling down to the ground through thunderclouds in severe weather. The problem is that things can get out of hand quickly.

For those that maintain control and make it to the ground, there are ways to understand these concepts and all concepts of the mind. This is simply because, in order to be shared and to persist: a) any favorable concept must become less abstract and nebulous and form into a more concrete idea, scheme or plan; and b) there are abstract, specific and recognizable elements and dimensions to every well-formed plan, idea and understanding. By favorable, I mean a concept likely to survive for whatever purpose: good or bad.

Ultimately, language is intimately involved and plays a great role in all forms of human understanding. And so, many researchers accept that there is a mapping between the abstract elements and dimensions of meaning and the signs of language. Readware technology maps the signs of language onto abstract elements and dimensions of emotional and physical control, using a matrix of sound symbols as the semantic map, according to the Adi Theory of Semantics (ATS).

Some may ask: why choose the elements and dimensions of emotional and physical control? In truth, we did not choose them. They were derived from a semantic study. But in hindsight, one wonders why others did not already recognize them. Going back to the psychologist interpreting a patient’s report of psychological pain: caregivers want to know how to control that ‘pain’ so it can be mediated in the constitution of their patients. To get control of pain means we must wrest that control from someone or something else. Physicians have tactics and best practices for this case.

I cannot think of many things in the world that are not directly interrelated to or affected by some form of emotional, physical or environmental control.  Because emotional and physiological control is a large part of the human condition, a shared and interpersonal semantic space is readily patterned by its elements and dimensions.

The Readware semantic matrix has a small number of elements and dimensions for mapping a large number of interpretations.  This is why I read with some amazement the announcement by Cognition that they have the world’s largest semantic map. It motivated me to write this post.  One reader commented that those of us that disagree with Cognition’s claim should just hold our objections and let them tout their wares.

The problem is that the claim they make gives the reader the wrong impression.  Here is the impression Anthony C. shared with his own words:

The academics can discuss the Olde English and definitive dictionaries that have a set number of words, but I’d prefer an NLP system that understands all the meanings of those dictionary entries. That one sounds like it can build a business by licensing “the bit about them that’s unique.”

Anthony has the (wrong) impressions that the OED is a dictionary of a fixed number of words in the Olde English language and that the NLP system (Cognition) understands all the meanings of its dictionary entries. And he reaches a dubious conclusion because of those impressions.

It does not take much to prove that there is no NLP logic capable of interpreting the meaning of simple word forms like feel, fear, hope and love and using those interpretations for locating instantiations in natural language expressions (text). There is only keyword search. People speaking other languages also have words that refer to the same meanings indicated by the words feel, fear, hope and love, because these human emotions are experienced irrespective of the language spoken. Keywords don’t work cross-lingually on texts.

I expect most sense-makers will hold, as I do, that there is no possibility of achieving “a more accurate or relevant understanding” without understanding the universal elements and dimensions of the meaning of such signs.

An Introduction to Semantic Mapping with Readware technology.

Some researchers believe semantic universals can be found in simple terms like feel, fear, hope and love that are shared across cultures and languages. Many believe that more complex concepts are not shared by many languages. Along with my colleague Dr. Tom Adi, I believe that certain sounds are symbols: material objects used to represent something invisible. These sounds are shared by all languages. The symbols represent abstract semantic universals that are used by the mind for symbolic processing tasks such as thinking, reading and writing. The modern Roman alphabet represents these symbols thus:

a b c d e f g h i j k l m n o p q r s t u v w x y z

All the sounds of every word of the English language are mapped into the writing system using these symbols.  The Adi Theory of Semantics (ATS) maps these symbols onto 11 dimensions of emotional and physical process control. The Roman alphabet and phonetic symbols are arbitrary, of course, as are the conventions for combining the sounds of the English language and any natural language for that matter. Therefore the symbols of any language can be mapped to the universal elements and dimensions of these elementary, compound and interrelated processes without changing or disturbing them in any way.

Every symbol maps onto one of seven abstract processes (assignment, manifestation, containment, assignment of manifestation, assignment of containment, manifestation of containment, and assignment and manifestation of containment), each with one of four abstract polarities (closed-self, open-self, closed-others, and open-others). This is visualized in the table I have included below.

You may notice that some cells are empty and some have multiple symbols.  There are reasons for this though listing them here would take us away from the present discussion. You may also notice the abstract objects closed-self and open-self are opposites as are closed-others and open-others. These pairings of polarities can be taken to represent abstract interpersonal engagement conditions (self and others) with abstract interpersonal boundary conditions (closed and open).
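For readers who prefer to see the shape of the matrix spelled out, here is a small Python sketch that builds the empty grid from the seven processes and four polarities named above. The cells are left empty on purpose: which symbols belong in which cell is given by the table, not by this code. The names PROCESSES, POLARITIES, matrix and opposite are my own labels for the example, and reading the eleven dimensions as seven processes plus four polarities is an assumption.

```python
from itertools import product

# The seven abstract processes and four abstract polarities named in the text.
PROCESSES = (
    "assignment",
    "manifestation",
    "containment",
    "assignment of manifestation",
    "assignment of containment",
    "manifestation of containment",
    "assignment and manifestation of containment",
)
POLARITIES = ("closed-self", "open-self", "closed-others", "open-others")

# One cell per (process, polarity) pair. Each cell holds a list because a cell
# may carry no symbol or several; the lists are deliberately left empty here.
matrix = {cell: [] for cell in product(PROCESSES, POLARITIES)}


def opposite(polarity):
    """Flip the boundary condition (closed/open), keep the engagement (self/others)."""
    boundary, engagement = polarity.split("-")
    flipped = "open" if boundary == "closed" else "closed"
    return f"{flipped}-{engagement}"


print(len(PROCESSES) + len(POLARITIES))  # assumed reading of the eleven dimensions
print(len(matrix))                       # number of cells in the grid
print(opposite("closed-self"))
```

The split on the hyphen mirrors the observation that each polarity pairs a boundary condition (closed or open) with an engagement condition (self or others).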

So, in Readware technology, intuitions about the words of a language– such as feel, fear, hope and love — are obtained by mapping the phonemes of these words —considered by most linguists to be the smallest elements of meaning– onto these abstract semantic universals. This is done with a simple algorithm used for transforming a word into the abstract objects indicated by the structure of the (word’s-root) phonemes (an abstract word theory) representing the intuitive meanings of the word.

These abstract theories (sign functions) produce impressions that have many possible interpretations or realizations.  We would not want to put a number on this and rather believe that the number of realizations is open-ended.  Such realizations have explanatory power that can be studied outright; Readware technology quantifies them for use in computational algorithms.  Now let’s get into the practical implementation so we see how it works.

Most words are ambiguous because sounds produce ambiguity (multiple meanings) when combined in a word root. This is at least partly because each phoneme symbolizes compound abstract objects that convey different aspects and characteristics of the natural phenomenon referenced by the word.  Polysemy is a linguistic term that means that a word root may refer to different objects in different contexts.  All phonemes are polysemic.  All words appear to be polysemic too.

According to ATS, every word can be transformed into one of several forms of quantifiable functions defined over the discrete domains and ranges (of control) dimensioned by these abstract semantic universals. Thus, even emotional word roots that are ambiguous can be included in the evidence studied to understand their nature.

Adi’s theory of semantics has rules to convert any word root into an abstract mathematical mapping such as f: X->Y or f(X), etc.  Consonants that refer to processes with higher precedence play the role of the function f of the mapping f: X->Y and the remaining consonants represent the domain X or range Y of the mapping.  The mapping f(X) is a mapping with an unspecified range.  The words feel, fear, hope and love are mapped by this formula, implying that the context for these words can range across anything at all.

For some, the formulas and the abstract mappings of semantic universals may be too abstract to be of much practical use, yet each abstract mapping can be interpreted into more concrete terms suitable not only as a definition, but also as a knowledge representation with extraordinary explanatory power.  In other words these abstract impressions induce more concrete interpretations.  Such interpretations can be corroborated with personal experience.

Consider, for example, that the sound /fe/ symbolized by the letter “f” represents the abstract semantic universals open-self and manifestation. Remember that open-self is a polarity (think charge, inclination, valence) and that manifestation is a process.  Action and activity are effects of processes of manifestation, i.e., one manifests behavior and actions, and activity is manifested.  So this is how a phoneme from a language is mapped to a compound abstract object that evokes multiple interpretations –meanings– from the sound-symbol.
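One way to picture such a compound abstract object is as a pair made of one process and one polarity. The sketch below is only an illustration and not Readware's implementation: the names `Process`, `Polarity`, `AbstractObject`, `SYMBOL_MAP` and `describe` are hypothetical, and the lookup table holds just the single entry stated above for “f”. The mappings for other symbols are not given here, so they are left out rather than guessed.

```python
from enum import Enum
from typing import NamedTuple


class Process(Enum):
    # The seven abstract processes named in the text.
    ASSIGNMENT = "assignment"
    MANIFESTATION = "manifestation"
    CONTAINMENT = "containment"
    ASSIGNMENT_OF_MANIFESTATION = "assignment of manifestation"
    ASSIGNMENT_OF_CONTAINMENT = "assignment of containment"
    MANIFESTATION_OF_CONTAINMENT = "manifestation of containment"
    ASSIGNMENT_AND_MANIFESTATION_OF_CONTAINMENT = (
        "assignment and manifestation of containment"
    )


class Polarity(Enum):
    # The four abstract polarities named in the text.
    CLOSED_SELF = "closed-self"
    OPEN_SELF = "open-self"
    CLOSED_OTHERS = "closed-others"
    OPEN_OTHERS = "open-others"


class AbstractObject(NamedTuple):
    # A compound abstract object: one process paired with one polarity.
    process: Process
    polarity: Polarity


# Hypothetical lookup table. Only "f" is filled in, because it is the
# only symbol whose process and polarity are both stated in the text.
SYMBOL_MAP = {
    "f": AbstractObject(Process.MANIFESTATION, Polarity.OPEN_SELF),
}


def describe(symbol: str) -> str:
    """Return a readable description of a symbol's abstract object."""
    obj = SYMBOL_MAP.get(symbol.lower())
    if obj is None:
        return f"{symbol}: not in this illustrative table"
    return f"{symbol}: {obj.process.value} / {obj.polarity.value}"


# Seven processes times four polarities gives the cells of the grid.
print(len(Process) * len(Polarity))
print(describe("f"))
```

Keeping the process and the polarity as separate fields mirrors the way the two are discussed separately in the text: the polarity is the charge or inclination, and the process is what is being charged.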

Both words, feel and fear are used to refer to sorts of emotional activity. That is not a definition from a dictionary though it is defining.  In the word fear, the emotional activity applies to the domain of self –is assigned inward– (by the polarity) indicated with the consonant “r” in the formula.  In the word feel, the “l” has outward polarity and assigns the manifestation outward– the domain is open to the outside.  These abstractions give us some impressions of what it instinctively means to fear or feel–yet they are even more than that. They are theories. We can use them to generate more concrete theories about what the words fear and feel mean.

For the “f” in feel, these abstract universals can be realized simply as “opening oneself to outside manifestations”. For the sound “f” in fear, the universals induce the more concrete realization: vulnerable state. Vulnerable is a realization of the open-self. A state is a realization of a particular manifestation.  Of course, any realization is conjunct to the situation, circumstances or context of use.

An event is also an interpretation of manifestation and the open-self is the universal negative, so a concrete realization of fear is a negative event. The open-self polarity is also realized as unfamiliar emotion and both the words feel and fear are used to communicate a sense of unfamiliar emotional activity. The unfamiliar can induce fear and result in feelings of anxiety and agitation.  And a feeling is often initially unfamiliar enough to get our attention.  Do you feel me?

The explanatory power of the semantic universals of this mapping enables us to make predictions such as:

–the advent of uncertain or unfamiliar circumstances can evoke fear in the minds of people.

This prediction can be applied to events concurrent with the uncertainty of the political future in America.  Given a collection of American news articles covering this year (2008) from January until September, Readware algorithms can distinguish the instances that evoke fear and contribute to increasing uncertainty, anxiety and agitation of popular opinion from articles that do not.  This can be done with a query of the form: fear, because.  The entire process for this case would take less than a few hours, including installing software, parsing and indexing documents and achieving the results. It would cost pennies per article given that a few million articles exist in that range.

By finding instances, textual evidence from passages in press reports, Readware technology can be used to inform political strategists, for example– to locate and track relevant issues that produce fear in the populace and deserve focused attention. Such information can be used to great advantage, to damage an opponent, or not at all.

So, if you are really thinking about using semantic technology, you should know about the limitations of products based on traditional AI-style logic and mechanical inference.  Learn to recognize the difference between what they claim to do and what they actually do.  And be aware of the availability of alternative methods that exploit the explanatory power of text.

Search. I suppose there is no denying that the word “search” ascended to significance in the consciousness of more people since the birth of Information Science than perhaps at any other time in history. This supposition is supported by a recent Pew Foundation internet study stating that:

The percentage of internet users who use search engines on a typical day has been steadily rising from about one-third of all users in 2002, to a new high of just under one-half (49%).

While it may not be obvious, it becomes apparent on closer examination of the phenomena, that the spread and growth in the numbers of words and texts and more formal forms of knowledge, along with the modern development of search technology, had a lot to do with that.

Since people adopted the technology of writing systems, civilizations and societies have flourished. Human knowledge and culture, and technological achievement, have blossomed. No doubt.

Since computers, and ways of linking them over the internet, came along, the numbers of words and the numbers of writers have increased substantially. It was inevitable that search technology would be needed to search through all those words from all those writers. That is what Vannevar Bush was telling his contemporaries in 1945 when he said the perfection of new instruments “call for a new relationship between thinking man and the sum of our knowledge.”

But somewhere along the line things went wrong; some things went very, very wrong. Previous knowledge and the sum of human experience were swept aside. Search technology became superficial, and consequently, writing with words is not considered as any kind of technology at all. That superficiality violates the integrity of the meaning of search, and the classification of words merely as arbitrary strings is also wrong, in my view.

Some scientists I know would argue that the invention of writing is right up there at the top of human technological achievement. I guess we just take that for granted these days, and I am nearly certain that scientists that were embarking into the new field of information technology in the 1940’s and 1950’s were not thinking of writing with words as the world’s first interpersonal memory– the original technology of the human mind and its thoughts and interactions.

Most information scientists have not yet fully appreciated words as technical expressions of human experience but treat them as labels instead. By technical, I mean of or relating to the practical knowledge and techniques (of being an experienced human).

Very early in the development of search technology, information scientists and engineers worked out assumptions that continue to influence the outcome, that is, how search technology is produced and applied today. The first time I wrote about this was in 1991 in the proceedings of the Annual Meeting of the American Society of Information Science. There is a copy in most libraries if anyone is interested.

And here we are in 2008, in what some call a state of frenzy and others might call disinformed and confused– looking at the prospects of the Semantic Web. I will get to all that in this post. I will divide this piece into the topics of the passion for search technology, the state of confusion about search technology, and the semantics of search technology.

The term disinformed is my word for characterizing how people are under-served if not totally misled by search engines. A more encompassing view of this sentiment was expressed by Nick Carr in an article appearing in the latest issue of the Atlantic Monthly where he asks: Is Google making us stupid?

I am going to start off with the passion of search.

Writing about the on-line search experience in general, Carmen-Maria Hetrea of Britannica wrote:

… the computing power of statistical text analysis, pattern-matching, and stopwords has distracted many from focusing on (should I say remembering?) what actually makes the world tick. There are benefits and dangers in a world where the information that is served to the masses is reduced to simple character strings, pattern matches, co-location, word frequency, popularity based on interlinking, etc.

( … ) It has been sold to us as “the trend” or as “the way of the future” to be pursued without question or compromise

That sentiment pretty much echoes what I wrote in my last post. You see, computing power was substituted for explanatory power and the superficiality of computational search was given credibility because it was needed to scale to the size of the world wide web.

This is how “good enough” became state of the art. Because search has become such a lucrative business and “good enough” has become the status quo, it has also become nearly impossible for “better” search technology to be recognized, unless it is adopted and backed by one of the market leaders such as Google or Microsoft or Yahoo.

I have argued in dozens of forums and for more than twenty years that search technology has to address the broader logic of inquiry and the use of language in the pursuit of knowledge, learning and enhancing the human experience. It has to accommodate established indexing and search techniques and it has to have explanatory power to fulfill the search role.

Most that know me know that I am not showing up at this party empty-handed. I have software that does all that, and while my small corporate concern is no market or search engine giant, my passion for search technology is not unique.

In her Britannica Blog post about search and online findability, Carmen-Maria Hetrea summed up her passion for search:

Some of us dared to differ by returning to the pursuit of search as something absolutely basic to the foundations of our human existence: the simple word in all of its complexity — in its semantics and in its findability and its futuristic promise.

You have to ask yourself what you are really searching for before you can find that it is not for keywords or patterns at all. Out in the real world almost everyone is searching for happiness. Some are also searching for truth or relevance. And many search for knowledge and to learn. If your searching doesn’t involve such notions, maybe you don’t mind the tedium of thorough, e.g., exegetical, searching. Or maybe you are someone who doesn’t search at all, but depends on others for information.

How is the search for happiness addressed by online search technology? Should it be a requirement of search technology to find truth or relevance? Should a search be thorough or superficial? Is it about computing power or explanatory power? I am going to try and address each of these questions below as I wade through the causes of confusion, expose the roots of my passion and maybe shed some light on search technology and its applications.

Some people have said in the online world you have both the transactional search and the research search, which are not the same. They imply that these search objectives require different instruments or plumbing. I don’t think so. I think it is just a crutch vendors use to justify superficial search. Let’s look at an example transactional search, say, searching for a new car. There are so many places where you can carry out that transaction, being thorough and complete is not an issue. Here is a search vendor quiz:

Happiness is a ___________ search experience.

Besides searching for objects of information that we know but don’t have at hand, in cyberspace and on the web, we might search for a pizza place in a new destination. Many search for cheap air fares or computer or car parts, or deals on eBay, while others search for news, music, pictures and many other types of media and information. A few others search for knowledge and for explanation. Happiness in the universe of online search is definitely a satisfying search experience irrespective of what you are searching for.

Relevance is paramount to happiness and satisfaction whether searching for pizza in close proximity or doing research with online resources. Search vendors are delivering hit lists from their search engines, where users are expecting relevance and to be happy with the results. Satisfaction, in this sense, has turned out to be a tall order and nonetheless a necessary benefit of search technology that people still yearn for.

Let’s now turn to the state of confusion.

Carmen-Maria mentions that new search technology has to be backward compatible and she also complains that bad search technology is like the wheel that just keeps getting reinvented:

The wheel is being reinvented in a deplorable manner since search technology is deceptive in its manifestation. It appears simple from the outside, just a query and a hitlist, but that’s just the tip of the iceberg. In its execution, good search is quite complex, almost like rocket science.

… The wealth of knowledge gained by experts in various fields – from linguists to classifiers and catalogers, to indexers and information scientists – has been virtually swept off the radar screen in the algorithm-driven search frenzy.

The wheel is certainly being re-invented; that’s part of the business. I am uncertain what Carmen-Maria means by algorithm-driven search frenzy. Algorithms are the stuff of search technology. I believe that some of the problems with search stem from the use of algorithms that are made fast by being superficial, by cutting corners and by other artificial means. The cutting of corners begins with the statistical indexing algorithms or pre-coordination of text– so retrieval is consequently hobbled by weaknesses in the indexing algorithms. But algorithms are not the cause of the problem.

Old and incorrect assumptions are the real problem.

Modern state-of-the-art search technology (algorithms) invented in the 1960’s and 1970’s strips text of its dependence on human experience under something information science (IS) professionals justify as the independence assumption. Information retrieval (IR) professionals, those that design software methods for search engine algorithms, are driven by the independence assumption to treat each text as a bag of words without connection to other words in other texts or other words in the human consciousness.
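To make the bag-of-words treatment concrete, here is a minimal sketch, assuming plain whitespace tokenization. The helper `bag_of_words` is hypothetical and not taken from any particular engine. It shows that once a text is reduced to word counts, two sentences with opposite meanings become indistinguishable.

```python
from collections import Counter


def bag_of_words(text):
    """Reduce a text to unordered word counts (word order is discarded)."""
    return Counter(text.lower().split())


a = bag_of_words("dog bites man")
b = bag_of_words("man bites dog")

# The two bags are equal even though the sentences say different things.
print(a == b)
```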

I don’t think he was thinking about this assumption when Rich Skrenta wrote:

… – the idea that the current state-of-the-art in search is what we’ll all be using, essentially unchanged, in 5 or 10 years, is absurd to me.

Odds are that he intends to sweep a lot of knowledge out of the garage too, and I would place the same odds that any “new” algorithm Rich brings to the table will implicitly apply that old independence assumption too.

So this illustrates a kind of tug of war between modern experts in search technology and the knowledge of ages of experience. There is also a kind of frenzy or storm over so-called “new” technologies and just what constitutes “semantic” search technology. While some old natural language processing (NLP) technology has debuted on the online search scene, it has not brought any new search algorithms to light. They have only muddied the waters in my opinion. I have written about this in previous posts.

The underlying current is stirred up by imbalance existing in the (significant) history of search technology contrasted with the nascence of online search and other modern applications of search technology. Add to that disturbance the dichotomy exacerbated by good (satisfying) and bad (deceptive) search results, multiplied by the number of search engine vendors, monopolistic or otherwise, and you have the conditions where compounding frenzy, absurdity and confusion, rather than relevance, reign supreme.

I like to think my own view transcends this storm and sets an important development principle that I established when I produced the first concept search technology back in 1987. The subjects of the search may be different but the freedom to search for objects, for answers, or for theories or explanations of unknown phenomena is the right of inquiry.

This right of intellectual inquiry is as important and as basic as the freedom of speech. This is what ignites my passion for search technology. And I cannot stand to have my right of inquiry blocked, limited, biased, restricted, arrested or constrained, whether by others, or by unwarranted procedure (algorithm) or formality, or by mechanical devices.

I wear my passion on my sleeve and it frequently manifests as a rant against the “IT” leaders or so-called experts that Carmen-Maria wrote about:

Many consider themselves experts in this arena and think that information retrieval is this new thing that is being invented and that is being created from scratch. The debate often revolves around casual observations, remarks, and opinions come mostly from an “IT” perspective.

To be fair, not all those with “IT” perspectives are down with all this “new thing” in online search engines. Over at the Beyond Search blog, Stephen Arnold wrote about the problem with the thinking about search technology:

… fancy technology is neither new nor fancy. Google has some rocket science in its bakery. The flour and the yeast date from 1993. Most of the zippy “new” search systems are built on “algorithms”. Some of Autonomy reaches back to the 18th century. Other companies just recycle functions that appear in books of algorithms. What makes something “new” is putting pieces together in a delightful way. Fresh, yes. New, no.

I also think Stephen understands the history of search technology pretty well. He demonstrates this when he writes:

Software lags algorithms and hardware. With fast and cheap processors, some “old” algorithms can be used in the types of systems Ms. Hane identifies; for example, Hakia, Powerset, etc. Google is not inventing “new” things; Google is cleverly assembling bits and pieces that are often well known to college juniors taking a third year math class.

Like Carmen-Maria Hetrea, Stephen Arnold sounds biased against algorithms, “old” algorithms in particular, though I don’t think he intended any bias, as many of the best algorithms we have are “old”. There are really not many “new” algorithms. Augmented, yes. Modified, yes. New, no.

To be involved in IT and biased against algorithms is absurd as long as technology is the application of the scientific method and scientific search methods are understood as collections of investigative steps systematically combined into useful search procedures or algorithms. So there you have my definition of search technology.

The algorithms for most search technology are not rocket science and can be boiled down to simple procedures. At the very least there is an indexing algorithm and a search algorithm:

Pre-coordination per-text/document/record/field procedure:

  1. Computerize an original text by reading the entire text or chunks of it into computer memory.
  2. Parse the text into the smallest retrievable atomic components (usually patterns (trigrams, sentences, POS, noun-phrases, etc.) or keywords or a bag (alphabetical list) of infrequent words).
  3. Store the original text with a unique key and store the parsing results as alternate keys in an index.
  4. Repeat for each new text added to a database or collection.

Post-coordination per-query procedure:

  1. Read a string from input, parse the query into keys in the same way as a text.
  2. Search the index to the selected collection or database with the keys.
  3. Assemble (sort, rank) key hits into lists and display.
  4. Choose hit to effect retrieval of the original text.

These basic algorithms are fulfilled differently by different vendors but vendors do not generally bring new algorithms to the table. They bring their methods of fulfilling these algorithms; they may modify or augment regular methods employed in steps 2 and 3 of these procedures as Google does with link analysis.

In addition, vendors fold search technology into a search engine. Most online search engines, those integrated “software systems” or search appliances that process text, data and user-queries, are composed of the following components:

  1. A crawler for crawling URI’s or files on disk or both.
  2. An indexer that takes input from the crawler and recognizes key patterns or words.
  3. A database to store crawler results and key indexing (parsing) results.
  4. A query language (usually SQL, Keyword-Boolean) to use the index and access keys in the database.
  5. An internet server and/or graphical user interface (GUI) components for getting queries from, and presenting results to, users.

Most search engine wizards, as they are called, are working on one or more of these software components of online search engines. You can look at what a representative sample of these so-called wizards have to say about most of these components at the ArnoldIT blog here. If you read through the articles, you won’t find one of them (and I have not read them all) that is working on new indexing methods or new mapping algorithms for mapping the meaning of the query to the universe of text, for example.

Many of the “new search engines,” popping up everywhere, are not rightly called new search technology even though they frequently bear the moniker. They are more rightly named new applications of search technology. But even vendors are confused and confusing about this. Let’s see what Riza Berkin of Hakia is saying in his most recent article where he writes:

But let’s not blind ourselves by the narrowness of algorithmic advances. If we look closely, the last decade has produced specialist search engines in health, law, finance, travel etc. More than that, search engines in different countries started to take over (like Naver, Baidu, Yandex, ect.)…

He had been writing that Search 1.0 began with Alta Vista (circa 1996), Search 2.0 is Google-like and Search 3.0 is semantic search “where the search algorithms will understand the query and text”. I guess all those search engines from Fulcrum, Lexis-Nexis, OpenText, Thunderstone, Verity, Westlaw, and search products from AskSam to Readware ConSearch to ZyIndex, were Search 0.0 or at least P.B. …. You know like B.C. but Pre-Berkin.

And so this last paragraph (above) makes me think he is confusing search applications with search technology. His so-called specialist search engines are applications of search technology to the field or domain of law, to the field or domain of health, and so on.

Then he confuses me even more, when he writes about “conversational search”:

Make no mistake about it, a conversational search engine is not an avatar, although avatars represent the idea to some extent. Imagine virtual persons on the Web providing search assistance in chat rooms and on messengers in a humanly, conversational tone. Imagine more advanced forms of it combined with speech recognition systems, and finding yourself talking to a machine on the phone and actually enjoying the conversation! That is Search 2.0 to me.

Now I can sympathize with Riza because I used the phrase “conversational search” to describe the kind of conceptual search engine I was designing in 1986. I am not confused about that. I am confused that he calls that Search 2.0 when earlier– statistically augmenting the inverted index –was described as Search 2.0.

He doesn’t stop there. He continues describing Search 3.0 that “will be the ‘Thinking Search’ where search systems will start to solve problems by inferencing. ” Earlier he wrote that semantic search was Search 3.0. Semantics requires inferencing, so I began to reckon maybe thinking and semantics are equal in his mind, until he writes: “I do not fool myself with the idea that I will see that happening in my life time” — so now I am confused again. I think it is what vendors want; they want the public to remain confused about the semantics of search and what you get with it.

And that brings me to the semantics of search.

There are only two words that matter here: Thoroughness and Explanatory.

When I started tinkering with text processing, search and retrieval software in the early 1980’s, I was captivated by the promise of searching and reading texts on computers. The very first thing that I noticed about the semantics of search, before my imagination became involved in configuring computational search technology, was thoroughness. The word /search/ implies thoroughness if not completeness in its definition. Thoroughness is a part of the definition of search. Look at the definition of search for yourself.

You need only look at one or two hit lists from major search engines and you can see that is not what we get from commercial search engines, or from most search technology. Search is not a process that is completed by delivering some hints of where to look, but that is what it has been fashioned into by the technological leaders in the field. Millions of people have accepted it.

Yet, in our hearts we know that search must be complete and it must be explanatory to be satisfying; we must learn from it, and we expect to learn from conducting a search. Whether we are learning of the address to the nearest pizza place or we are learning how to install solar heating, it is not about computational power, it is about explanatory power. They forgot that words are part of the technique of communicating interpersonal meaning. Let’s hope search vendors don’t forget that words have explanatory power too.

Tell me what you think.

Peter Mika recently wrote an article about the semantic web and NLP-style semantic search. I should just ignore his claim that there are only two roads to semantic search because he is plainly mistaken on that count. As Peter works for Yahoo, he was mainly discussing data processing with RDF and Yahoo’s Search Monkey. He obviously knows that subject well.

He constructed an example of how to use representational data (such as an address) according to semantic web standards and how to integrate the RDF triples with search results. His claim is that one cannot do “semantics” without some data manipulation and for that the data must be encoded with metadata; essentially data about the data. In this case, the metadata necessary to pick out and show the data at the keyword: address.

At the end of his article, Peter talks about the way going forward and, in particular, about the need for fostering agreements around vocabularies. I suppose that he means to normalize the relationships between words by having publishers direct how words are to be used. He calls this a social process while calling on the community of publishers to play their role. Interesting.

About the time Peter was beginning his PhD candidacy, industry luminary John Sowa wrote in Ontology, Metadata and Semiotics that:

Ontologies contain categories, lexicons contain word senses, terminologies contain terms, directories contain addresses, catalogs contain part numbers, and databases contain numbers, character strings, and BLOBs (Binary Large OBjects). All these lists, hierarchies, and networks are tightly interconnected collections of signs. But the primary connections are not in the bits and bytes that encode the signs, but in the minds of the people who interpret them.

This is the case in the trivial example offered by Peter. The reason one is motivated to list an address in the search result of a search for Pizza is because it is relevant to people who are searching for a pizza place close to them. In his paper, John Sowa writes:

The goal of various metadata proposals is to make those mental connections explicit by tagging the data with more signs.

This is the essential nature of the use case and proposal offered by Yahoo with SearchMonkey. It seems a good idea, doesn’t it? Yahoo is giving developers the means to tag such data with more signs. Besides, it has people using Yahoo’s index, exposing Yahoo’s advertisers. Sowa cautions that:

The ultimate source of meaning is the physical world and the agents who use signs to represent entities in the world and their intentions concerning them.

Which resources do investigators or developers use to learn about agents and their intentions when using signs? The resource most developers turn to is language and they begin by defining the words of language in each context in which they appear.

Peter says it is common for IR systems to focus on words or grams and syntax. While some officials may object, though NLP systems such as Powerset, Hakia and Cognition use dictionaries and “knowledge bases” to obtain sense data, they each focus mainly on sentence syntax and (perhaps with the exception of Powerset) use keyword indexes for retrieval just like traditional IR systems.

Hakia gets keyword search results from Yahoo as a matter of fact. All of these folks treat words, and even sentences, as the smallest units of meaning of a text. Perhaps these are the most noticeable elements of a language that are capable of conveying a distinction in meaning though they certainly are not the only ones. There are other signs of meaning obtainable from textual discourse.

Believe it or not, the signs people use most regularly are known as phonemes. They are the least salient because we use them so often, and frequently they are also largely used subconsciously. Yet, we have found that these particular sounds are instantiations, or concrete signs, of the smallest elements of abstract thought– distinctive elements of meaning that are sewn and strung together to produce words and form sentences. When they take form in a written text they are also called morphemes.

Some folks may not remember that they learned to read words and texts by stringing phonemes together, sounding them out to evoke, apprehend and aggregate their abstract meanings. I mention this because if a more natural or organic semantic model were standardized, the text on the World Wide Web could become more tractable and internet use might become more efficient.

This would happen because we could rid ourselves of the clutter of so many levels of metalevel signs and the necessity of controlled vocabularies for parsing web pages, blogs and many kinds of unstructured texts. An unstructured text is any free-flowing textual discourse that cannot easily be organized in the field or record structure of a database. Nor is it advantageous to annotate the entirety of unstructured text with metalevel signs because, as John Sowa wrote:

Those metalevel signs themselves have further interconnections, which can be tagged with metametalevel signs. But meaningless data cannot acquire meaning by being tagged with meaningless metadata.

So now it raises the question of whether or not words and their definitions are just meaningless signs to begin with. The common view of words as signs is that they are arbitrarily assigned to objects. I am unsure whether linguists could reach consensus that the sounds of words evoke meaning, as it seems many believe that a horse could have been called an egg without any consequence to its meaning or use in a conversation.

Within the computer industry it becomes even more black and white: A word is used to reference objects by way of general agreement or convention, where the objects are things and entities existing in the world. Some linguists and most philosophers recognize abstract objects as existing in the world as well. This has not, however, changed the conventional view, which is a kind of de facto standard among search software vendors today.

This view implies that the meaning of a word or phrase (its interpretation) adheres only to conventional and explicit agreements on definitions. The trouble is that it overlooks or ignores the fact that meaning is independently processed and generated (implicitly) in each individual’s (agent’s) mind. This is generally very little trouble if the context is narrow and well-defined, as in most database and trivial semantic web applications on the scene now.

The problems begin to multiply exponentially when the computer application is purported to be a broker of information (like any search engine) where there is a verbal interchange of typically human ideas in query and text form. This is partly why there is confusion about meaning and about search engine relevance. Relevance is explicit inasmuch as you know it when you see it; otherwise, relevance is an implicit matter.

Implicit are the dynamic processes by which information is recognized, organized, acted on, used, changed, etc. The implicit processes in cognitive space are those required to recognize, store and recall information. Normally functioning, rational, though implicit and abstract, thought processes organize information so that we may begin to understand it.

It is obvious that there are several methods and techniques of organizing, storing and retrieving information in cyberspace as well. While there are IR processes running both in cyberspace and in cognitive space, the two are not the same abstract space and the processes are not at all the same. In cyberspace, and in particular in the semantic web, only certain forms of logical deduction have been implemented.

Cognitive processes for organizing information induce the harmonious and coherent integration of perceptions and knowledge with experience, desires, the physical self, and so on. Computational processes typically organize data by adding structure that arranges the information in desired patterns.

Neither the semantic web standards, nor microformats, nor NLP, seek the harmony or coherence of knowledge. Oh, yes, they talk about knowledge and about semantics, yet what they deliver is little more than directives, suitable only for data manipulation in well-understood and isolated contexts.

Neither NLP nor semantic web metadata or tools presently have sufficient faculty for abstracting the knowledge that dynamically integrates sense data or external information with the conditions of human experience. The so-called semantic search officials start with names and addresses because these data have conventionally assigned roles that are rather regular.

When it comes down to it, not many words have such regular and conventional interpretations. That would actually be quite all right if we were just talking about a simple database application, but proponents of the semantic web want to incorporate everything into one giant database and controlled vocabulary. Impossible!

While it appears not to be recognized, it should be apparent that adherence to convention is a necessary yet insufficient condition to hold relevant meaning. An interpretation must cohere with its representation and its existence (as an entity or agent in the world) in order to hold. Consider the case of Iraq and weapons of mass destruction. Adhere, cohere, what’s the difference –it’s just semantics– right? Nonetheless, neither harmony nor coherence can be achieved by directive.

A consequence of the conventional view is that such fully and clearly defined directives leave no room for interpretation, even though some strive for underspecification. The concepts and ideas being represented cannot be questioned because, being explicit directives, they go without question. This is why I believe the common view of words and meaning, held by many linguists and by computer and information experts like Peter, is mistaken.

If the conventional view were correct, the interpretation of words would neither generate meaning nor provide grounds for creating new concepts and ideas. If it were truly the case, as my friend Tom Adi said, natural language semantics would degenerate into taking an inventory of people’s choices regarding the use of vocabulary.

So, I do not subscribe to the common view. And these are the reasons that I debate semantic technologies, even though end-users probably could not care less about the techniques being deployed: if we are not careful, we will end up learning and acting by directive too. That is not the route I would take to semantic search. How about you?
