I am sending you a little example of what I've been working on today. It is a system for natural language understanding (implementing tasks like question answering, recognizing textual entailment, etc.). This complements the other tools I already have: for instance, for Question Answering we already have QUAC (OpenEphyra, Aranea), and for Recognizing Textual Entailment we already have the Stanford RTE system. But what I am working on today are tools that are not only open source, but whose internals we also have full access to.

I have a system called Capability::TextAnalysis. It takes the following text (I'm using a short one here just because the intercomputer communication is not finished, and one essential service (Enju) has only been successfully installed on my 32-bit machine):

"This is the first time I have tried this. I wonder how well it will work. Hopefully, well."

Now, here are the results of the Capability::TextAnalysis module:

--------------------------------------------------------------------------------
andrewdo@justin:/var/lib/myfrdcsa/codebases/internal/formalize/scripts$ ./test-capability-text-analysis.pl
$VAR1 = {
          'CoreferenceResolution' => 1,
          'SemanticAnnotation' => 1,
          'TermExtraction' => 1,
          'NounPhraseExtraction' => 1,
          'DateExtraction' => 1,
          'Tokenization' => 1
        };
Doing SemanticAnnotation
Initializing SemanticAnnotation
Retrieving result from cache
Doing NounPhraseExtraction
Initializing NounPhraseExtraction
Retrieving result from cache
Doing DateExtraction
Initializing DateExtraction
Retrieving result from cache
Doing CoreferenceResolution
Initializing CoreferenceResolution
Computing result and adding to cache
Doing TermExtraction
Initializing TermExtraction
Retrieving result from cache
Doing Tokenization
Initializing Tokenization
Retrieving result from cache
$VAR1 = {
          'CoreferenceResolution' => [
            {
              'Ids' => {
                'set_1' => { 'This' => 1, 'this' => 1 },
                'set_0' => { 'I' => 2 }
              },
              'String' => [ '<<>>', 'is', 'the', 'first', 'time', '<<>>', 'have', 'tried', '<<>>', '.', '<<>>', 'wonder', 'how', 'well', 'it', 'will', 'work', '.', 'Hopefully', ',', 'well', '.' ]
            }
          ],
          'SemanticAnnotation' => [
            {
              'CalaisSimpleOutputFormat' => {},
              'Description' => {
                'docDate' => '2009-10-20 21:22:53.593',
                'externalID' => 'testing',
                'externalMetadata' => {},
                'allowDistribution' => 'true',
                'allowSearch' => 'true',
                'docTitle' => {},
                'id' => 'http://id.opencalais.com/2qN2uHitGhWQOoxFCLakKg',
                'about' => 'http://d.opencalais.com/dochash-1/0f786371-90c2-3af6-b178-384a64f0abd0',
                'calaisRequestID' => '68d3dc26-2478-e7c1-1247-4e76a5b68072',
                'submitter' => 'FRDCSA'
              }
            }
          ],
          'TermExtraction' => [ [] ],
          'NounPhraseExtraction' => [ 'first time', 1, 'hopefully', 1, 'time', 1 ],
          'Tokenization' => [ 'This is the first time I have tried this . I wonder how well it will work . Hopefully , well . ' ],
          'DateExtraction' => [ ' This is the first time I have tried this . I wonder how well it will work . Hopefully , well . ' ]
        };
--------------------------------------------------------------------------------

Well, this textual analysis is not really sufficient by itself. What I am working on now is integrating all of those results into a central system. I have written something called FreeLogicForm, which converts text into something called Logic Forms. I've also implemented Nested Formulas in FreeKBS today (although I still need to finish Resolution Style Theorem Proving), and this allows me to take the results from the logic form and assert them into the FreeKBS Knowledge Base.

"This is the first time I have tried this. I wonder how well it will work. Hopefully, well."
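The log above alternates between "Retrieving result from cache" and "Computing result and adding to cache". As a rough illustration of that cache-or-compute pattern (a Python sketch, not the actual Capability::TextAnalysis internals; all names here are hypothetical):

```python
import hashlib

class Capability:
    """One analysis capability (e.g. Tokenization) with a per-input result cache."""

    def __init__(self, name, analyze):
        self.name = name
        self.analyze = analyze      # function: text -> result
        self.cache = {}             # keyed by a hash of the input text

    def run(self, text):
        key = hashlib.sha1(text.encode("utf-8")).hexdigest()
        if key in self.cache:
            print(f"Doing {self.name}\nRetrieving result from cache")
            return self.cache[key]
        print(f"Doing {self.name}\nComputing result and adding to cache")
        result = self.analyze(text)
        self.cache[key] = result
        return result

# A trivial stand-in analyzer: whitespace tokenization.
tokenization = Capability("Tokenization", lambda t: t.split())

text = "This is the first time I have tried this ."
first = tokenization.run(text)    # computed and added to the cache
second = tokenization.run(text)   # served from the cache
```

The point is only that each wrapper computes a result once per input and reuses it afterward, which is why five of the six capabilities above came back from the cache.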
("and" ("this" ?x1) ("be" ?e4 ?x1 ?e6) ("first" ?e6) ("time" ?e6) ("I" ?x2) ("have" ?e5 ?x2 ?e6) ("try" ?e6 ?x2 ?x3) ("this" ?x3))

Sending: KBS, MySQL2:freekbs2:default assert ("and" ("this" ?x1) ("be" ?e4 ?x1 ?e6) ("first" ?e6) ("time" ?e6) ("I" ?x2) ("have" ?e5 ?x2 ?e6) ("try" ?e6 ?x2 ?x3) ("this" ?x3)).

("and" ("I" ?x1) ("wonder" ?e3 ?x1 ?e4) ("how" ?e4) ("well" ?e4) ("it" ?x2) ("work" ?e4 ?x2))

Sending: KBS, MySQL2:freekbs2:default assert ("and" ("I" ?x1) ("wonder" ?e3 ?x1 ?e4) ("how" ?e4) ("well" ?e4) ("it" ?x2) ("work" ?e4 ?x2)).

I am also going to add the following capabilities. It will of course easily combine named entities into the same unit (and have them looked up in the central terminology knowledge management system, called Termios, which is still incomplete). I will also easily add the Calais semantic-annotation named-entity classes; e.g., if Andrew Dougherty were mentioned, you would get the following added to the formula:

Andrew_Dougherty (?x1)
NP_Person (?x1)

I will add coreference resolution, so that the "it" is resolved to the same entity as the "this"; in other words, ?x3 from the first formula and ?x2 from the second will be unified to the same variable, and "it" and "this" will become a reified object, i.e.:

("reification-135325" ?x2)

Lastly, word senses will be disambiguated, so ("try" ?e6 ?x2 ?x3) will become ("try_v_1" ?e6 ?x2 ?x3), which adds additional information to the system. As more accurate WSD, Coreference, Semantic Annotation, etc., systems are released, their results will simply be integrated by adding APIs for them to the standard Capability::WSD, Capability::CoreferenceResolution, etc., wrappers.

This system will form the basis of a system which can (eventually) answer questions like the following.

Given: The laptop was put in the bookbag. Erin checked his baggage on the plane, then flew to Tallahassee.
Question: Which state is the laptop in?
Answer: Florida.
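To make the coreference and WSD rewrites concrete, here is a minimal Python sketch of the two steps just described, treating a logic form as a list of (predicate, args...) tuples. The reification id and sense label come from the text above; the helper functions themselves are hypothetical, not part of FreeLogicForm:

```python
def rename_vars(form, mapping):
    """Replace variables in a logic form (a list of tuples) per mapping."""
    return [tuple(mapping.get(t, t) for t in clause) for clause in form]

def tag_senses(form, senses):
    """Rename predicates to their disambiguated senses, e.g. try -> try_v_1."""
    return [(senses.get(clause[0], clause[0]),) + clause[1:] for clause in form]

# Abbreviated fragments of the two formulas above.
sentence1 = [("this", "?x3"), ("try", "?e6", "?x2", "?x3")]
sentence2 = [("it", "?x2"), ("work", "?e4", "?x2")]

# Coreference: ?x3 (first formula) and ?x2 (second) denote the same entity,
# so both are unified to a single variable.
unified = rename_vars(sentence1, {"?x3": "?x1"}) + \
          rename_vars(sentence2, {"?x2": "?x1"})

# The pronoun predicates are replaced by the reified object.
unified = [("reification-135325", c[1]) if c[0] in ("this", "it") else c
           for c in unified]

# WSD: tag "try" with its disambiguated verb sense.
unified = tag_senses(unified, {"try": "try_v_1"})
```

After these passes, every mention of "this"/"it" carries the same variable and the same reified object, which is exactly what lets later reasoning treat them as one entity.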
The final manual assembly of the above results yields something like this:

("and"
 ("reification-135325" ?x1)
 ("be_v_1" ?e4 ?x1 ?e6)
 ("first_time" ?e6)
 ("reification-135324" ?x2)
 ("have_v_3" ?e5 ?x2 ?e6)
 ("try_v_1" ?e6 ?x2 ?x1)
 ("reification-135325" ?x1)
 ("wonder_v_2" ?e7 ?x2 ?e8)
 ("how" ?e8)
 ("well" ?e8)
 ("reification-135325" ?x1)
 ("work" ?e8 ?x1))

Adding techniques such as theorem proving, event extraction, lexical knowledge, and training over existing corpora of story/question/answer sets will yield such a system. This is of course state of the art. And the best part is that everything here is just one narrow application of everything that we have; there are thousands of additional components to the FRDCSA.

I am working on providing all of this as a web service which can be accessed over REST/SOAP/XMLRPC, etc. Additionally, I will be integrating the Vampire theorem prover, among others, to do the reasoning for FreeKBS, because our resolution-style theorem prover would be reinventing the wheel. At least for now.

There is a set of WordNet synset to Cyc concept mappings, which we can apply to the above Logic Form output to convert it to a more ontological approach. I am very close to importing Cyc into FreeKBS. Additionally, I will support other formalisms, such as RelEx, CAndC, APE, and CELT, and try to integrate all the results into some kind of voting system (where conflicts exist), with augmentation elsewhere. The bottom line is that we should be able to formalize large extents of text. Additionally, I will represent the sentences with classifications such as Speech Act classifications, etc. All of this will feed into a contextual mechanism. With theorem proving we will be able to answer sophisticated questions about text, as well as use Sayer and the HypergraphT (or whatever from OpenCog) for various reasoning tasks.

If this all seems a little useless, that is because I haven't yet written about the motivating cases.
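Assuming the synset-to-Cyc mappings are available as a simple predicate table, applying them to the sense-tagged output could look something like the sketch below. The Cyc concept names here are made up purely for illustration; the real mapping files pair actual WordNet synsets with actual Cyc constants:

```python
# Hypothetical mapping table: WordNet sense tags -> Cyc concepts.
# These concept names are illustrative placeholders, not real Cyc constants.
SYNSET_TO_CYC = {
    "try_v_1": "Attempting",
    "work_v_2": "BehavingAsExpected",
}

def to_cyc(form):
    """Rewrite sense-tagged predicates as Cyc concepts where a mapping exists,
    leaving unmapped predicates unchanged."""
    return [(SYNSET_TO_CYC.get(c[0], c[0]),) + c[1:] for c in form]

form = [("try_v_1", "?e6", "?x2", "?x1"), ("how", "?e8")]
cyc_form = to_cyc(form)
```

Because unmapped predicates pass through untouched, the conversion can be applied incrementally as mapping coverage grows.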
The truth is that this solves wide-ranging problems. For instance, the food ontology for Gourmet can be instantly created, just by processing natural-language books on the culinary arts, as well as recipes.