Wednesday, 28 November 2007

Introduction to Special Issue on Reasoning in Natural Language Information Processing

I've been reading ``Introduction to Special Issue on Reasoning in Natural Language Information Processing'' by Dawei Song and Jian-Yun Nie. This paper provides a brief overview of the trend towards integrating reasoning components into natural language processing systems and a brief discussion of five related papers.

This paper notes that,

``For any application related to Natural Language Processing (NLP), reasoning has been recognised as a necessary underlying aspect. Many of the existing work in NLP deals with specific NLP problems in a highly heuristic manner .... There have been developments on models that allow reasoning in NLP such as language models, logical models, [models based on Bayesian networks], and so on.''

There have been a number of research projects that have used NLP as a reasoning process. Logic has been used to model information and reasoning. Song et al. notes that ``[f]ollowing the logical uncertainty principle [van Rijsbergen 1986], a number of logic-based models for IR [Information Retrieval] were proposed in the late 1980s and 1990s to integrate reasoning into IR [Lalmas and Bruza 1998].''

It is important to note here three things. The first is the prediction by Dreyfus (1992) that there are major difficulties with using logic to model information and reasoning. The second is that ``... no operational system has been successfully evaluated against IR benchmark collections and applied to large scale IR tasks.'' The third is that ``this trend has not been followed up on since.''

Song et al. refers to a study by Wong et al. [2001] in which a functional analysis of logical IR models uncovered two major difficulties. The first difficulty was ``the lack of automatic means for constructing the background knowledge.'' The second difficulty was ``the computational overhead inherent in the symbolic logic when picking up and integrating different types of knowledge to facilitate effective reasoning.''

With regard to the first difficulty, Song et al. notes that ``... it was argued that when language processing tools have developed further, ... the problem can be largely alleviated.'' There are recently released tools such as CYC, a commonsense knowledge base, that seek to address the problem of background knowledge. This difficulty also indicates that the issue of background knowledge requires further research into areas such as knowledge bootstrapping and knowledge interpretation.

With regard to the second difficulty, Song et al. notes that the ``problem has to do with the fact that most logical frameworks exist in the reality of symbolic processing where reasoning is a sequential process proceeding from assumptions'' and that these ``logical frameworks often suffer from high computational complexity.'' Song et al. refers to Gärdenfors [2000], who noted that these problems are partially due to the frame problem. The micro-theory structure of CYC might help to alleviate the problem of computational complexity. It is likely that further research into areas such as working memory, short term memory, long term memory, context and assumptions is required to examine the frame problem.

Despite these setbacks, Song et al. notes that there is a renewed interest in logical IR. Song et al. refers to Song and Bruza [2003], which proposes using an ``information reference mechanism on high dimensional semantic spaces to underpin reasoning at the logical symbolic level for logic-based information retrieval.'' Song et al. also refers to Lau et al. [2004], which reported ``a recent successful attempt in this direction.''

Song et al. notes that statistical language modeling approaches in NLP and IR are capable of integrating statistical reasoning and ``that many approaches using language modeling can be easily described from a reasoning perspective.'' As an example Song et al. refers to classical language modeling approaches described in Ponte and Croft [1998] which Song et al. notes ``usually assume independence between indexing units, which are unigrams or bigrams. In reality a word may be related to other words. Such relationships should be properly integrated into language models.''
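To make the independence assumption concrete, here is a minimal sketch of unigram query-likelihood scoring. It is my own illustration, not the estimator from Ponte and Croft [1998]: the linear (Jelinek-Mercer) smoothing, the `lam` weight, the function name and the toy documents are all assumptions. Each query term contributes its own factor, so a document that only mentions ``automobile'' gets no document-level credit for the query term ``car''.

```python
import math
from collections import Counter


def unigram_log_likelihood(query, doc, collection, lam=0.5):
    """Return log P(query | doc) under a smoothed unigram model.

    Terms are treated as independent: the score is a sum of per-term
    log probabilities. `lam` (an assumed value) mixes the document
    model with the collection model.
    """
    doc_counts = Counter(doc)
    coll_counts = Counter(collection)
    score = 0.0
    for term in query:
        p_doc = doc_counts[term] / len(doc) if doc else 0.0
        p_coll = coll_counts[term] / len(collection)
        p = lam * p_doc + (1 - lam) * p_coll
        if p == 0.0:
            # Term unseen in the whole collection: the query is impossible.
            return float("-inf")
        score += math.log(p)
    return score


# Toy data, invented for illustration only.
docs = {
    "d1": "the car was parked near the garage".split(),
    "d2": "the automobile engine needs repair".split(),
}
collection = [term for doc in docs.values() for term in doc]

for name, doc in docs.items():
    print(name, unigram_log_likelihood(["car", "repair"], doc, collection))
```

Under this sketch, d2 scores ``car'' purely from the collection model, which is the gap that the term-relationship work Song et al. discusses tries to close.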

Song et al. also notes a number of other recent works indicating that language model frameworks are capable of integrating statistical reasoning into information retrieval systems. Among them are studies by Berger and Lafferty [1999], Lafferty and Zhai [2001], and Lavrenko and Croft [2001].

Song et al. goes on to note more recent work that incorporates term relationships into language models. Song et al. notes the following examples: grammatical links in Gao et al. [2004], co-occurrence and WordNet relations in Cao et al. [2005], a random walk Markov chain model in Collins-Thompson and Callan [2005], the exploitation of inferential term relationships in Bai et al. [2005], and information flow in Song and Bruza [2003].
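One common way to fold term relationships into a language model is a translation-style mixture, in which the probability of a query term is summed over the document terms that could ``explain'' it. The sketch below is my own illustration of that general idea, not the method of any of the papers listed above. The `RELATED` table, its weights and the function names are hypothetical; in a real system the weights would come from co-occurrence statistics, WordNet, or similar sources.

```python
from collections import Counter

# Hypothetical, hand-written relation weights: RELATED[w][q] stands in
# for P(q | w), the chance that document term w "means" query term q.
RELATED = {
    "automobile": {"automobile": 0.7, "car": 0.3},
    "car": {"car": 0.7, "automobile": 0.3},
}


def relation_prob(q, w):
    """P(q | w): use the relation table if w has an entry, else identity."""
    if w in RELATED:
        return RELATED[w].get(q, 0.0)
    return 1.0 if q == w else 0.0


def expanded_term_prob(q, doc):
    """P(q | doc) = sum over document terms w of P(q | w) * P(w | doc)."""
    counts = Counter(doc)
    return sum(relation_prob(q, w) * c / len(doc) for w, c in counts.items())


doc = "the automobile engine needs repair".split()
print(expanded_term_prob("car", doc))     # non-zero although "car" is absent
print(expanded_term_prob("repair", doc))  # plain relative frequency
```

The point of the design is that ``car'' now receives probability mass from ``automobile'', so the document is no longer penalised for using a related word.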

From information retrieval, Song et al. moves briefly to question answering. Here Song et al. argues that certain approaches to question answering would also benefit from being recast as reasoning processes. Song et al. refers to a study by Clifton and Teahan [2005] in which a logic-based framework, Knowing-aboutness, demonstrated positive results at TREC QA 2004.

The first of five papers introduced is ``Inferential Language Models for Information Retrieval'' by Nie, Cao and Bai. This paper describes a system where a number of methods are integrated such as inferential mechanisms for document expansion, term co-occurrences, WordNet, semantic smoothing, and inference depth via Markov chain model. This is a good paper to follow up on.
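To get a feel for what ``inference depth via Markov chain model'' could mean, here is a small sketch of a random walk over term-to-term links, where the number of steps plays the role of inference depth. This is my own guess at the general idea and is not taken from the paper by Nie, Cao and Bai. The `TRANSITIONS` table, its probabilities and the function names are all hypothetical.

```python
# Hypothetical term-to-term transition probabilities (each row sums to 1).
TRANSITIONS = {
    "car": {"car": 0.6, "vehicle": 0.4},
    "vehicle": {"vehicle": 0.6, "truck": 0.4},
    "truck": {"truck": 1.0},
}


def step(dist, transitions):
    """One random-walk step: push probability mass along term links."""
    nxt = {}
    for term, mass in dist.items():
        # Terms without an entry simply keep their mass.
        for neighbour, p in transitions.get(term, {term: 1.0}).items():
            nxt[neighbour] = nxt.get(neighbour, 0.0) + mass * p
    return nxt


def walk(start, transitions, depth):
    """Distribution over terms after `depth` steps starting from `start`."""
    dist = {start: 1.0}
    for _ in range(depth):
        dist = step(dist, transitions)
    return dist


print(walk("car", TRANSITIONS, 1))  # reaches "vehicle" only
print(walk("car", TRANSITIONS, 2))  # a second step also reaches "truck"
```

With one step, ``car'' only reaches directly related terms; with two steps it reaches ``truck'' through ``vehicle'', which is the sense in which deeper walks correspond to longer chains of inference.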

The second of five papers introduced is ``Statistical Query Translation Models for Cross-Language Information Retrieval'' by Gao, Nie and Zhou. The usual statistical techniques are present. What makes this paper special is the interplay between statistical techniques and linguistic techniques. One of the interesting findings of this paper is that ``... results have shown that the use of linguistic structures in statistical reasoning is more beneficial than the use of co-occurrences and dictionaries. Better results are generated by combining different types of knowledge.'' This is a good paper to follow up on.

The third of five papers introduced is ``A Statistical Framework for Query Translation Disambiguation'' by Liu, Jin and Chai. It proposes a statistical framework for dictionary-based cross-language information retrieval. This is one of those papers that has too much focus on statistics and linguistics and too little on semantics.

The fourth of five papers introduced is ``Topic Tracking with Time Granularity Reasoning'' by Li, Li and Lu. There are some useful ideas here that take into account implicit temporal relatedness.

The fifth of five papers introduced is ``Improving Discriminative Sequential Learning by Discovering Important Associations of Statistics'' by Phan, Nguyen, Ho and Horiguchi. This paper presents ideas on mining.

If you're still with me: in short, this paper, Song et al. [2006], is about how basic techniques in natural language processing are mature enough to approach broader, more general problems by integrating statistical reasoning. By doing so, these systems can come closer to the way humans make inferences.

Song et al. concludes with three points: that more integrated models are required, that user and context must be considered, and that new evaluation tasks are required.

I would argue that these papers are important contributions but they do not address the core issues and assumptions that are the challenges to developing the kind of artificial intelligence that the field initially envisioned. These issues are described in Dreyfus 1992 in more detail. I am really hoping to find more papers that take the issues in Dreyfus 1992 into more consideration because that is the direction I am heading towards, more general artificial intelligence.

One of the things I am concerned about in this paper is that I am not sure what Song et al. means by reasoning. I suspect that Song et al. is referring more to a statistical type of reasoning rather than a semantic type of reasoning. At the core, statistical and semantic reasoning might be quite similar. As I read more deeply I am convinced that Song et al. refers to what I have perceived as statistical reasoning but more and more I am convinced that my vision of semantic reasoning is very similar.

One of the things I've noted is that references often only go one level deep. For example, here in my somewhat long summary of Song et al. [2006] I am referring to references made by Song et al. to other papers, whereas the usual practice seems to be to refer only to Song et al.

Though this paper supports the argument that reasoning is an important element of natural language processing systems, it is more of a summary paper than a practical paper. Practical papers are what is really required at this point in time.

Hmm, I need to make my summaries shorter.
