WP2: General MI theory and tools

Development of general MI theory and tools is a major goal of MIPRCV. It is addressed in this WP through three research tasks on interaction, multimodality and adaptive learning. The basic, non-interactive technology reviewed in WP 1 serves as the foundation for this development. The resulting theory will then be applied in different application domains in WPs 3, 4 and 5.
This work presents the theoretical framework and experimental results for interactive machine translation systems.
This work proposes an interactive methodology for learning Bayesian networks from data. This is a challenging task, particularly when data are scarce and the problem domain contains a large number of random variables. We introduce a new methodology for the interactive integration of expert knowledge, based on Monte Carlo simulations, which avoids the costly elicitation of prior distributions. The great advantage of this integration approach is that it requests only a very limited amount of information from the expert; more precisely, only about those direct probabilistic relationships between variables which cannot be reliably discerned with the help of the data.
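The selective elicitation idea might be sketched as follows: run Monte Carlo simulations over candidate structures, estimate how often each edge appears, and refer to the expert only those edges whose presence the data cannot settle. All variable names, inclusion probabilities and thresholds below are illustrative assumptions, not taken from the work itself.

```python
import random

random.seed(1)

# Assumed per-edge inclusion probabilities, standing in for structures
# learned from resampled data; names and numbers are illustrative.
TRUE_INCLUSION = {("A", "B"): 0.95, ("B", "C"): 0.50, ("A", "C"): 0.05}

def simulate_structure():
    """One Monte Carlo draw: a set of directed edges."""
    return {e for e, p in TRUE_INCLUSION.items() if random.random() < p}

def edge_frequencies(n_sims=1000):
    """Estimate how often each edge appears across the simulations."""
    counts = {e: 0 for e in TRUE_INCLUSION}
    for _ in range(n_sims):
        for e in simulate_structure():
            counts[e] += 1
    return {e: c / n_sims for e, c in counts.items()}

def edges_for_expert(freqs, lo=0.2, hi=0.8):
    """Refer to the expert only those edges the data cannot reliably
    settle, keeping the amount of elicited information small."""
    return [e for e, f in freqs.items() if lo < f < hi]
```

Here the clearly present edge (A, B) and the clearly absent edge (A, C) are decided by the data alone, and only the ambiguous edge (B, C) is put to the expert.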
We propose a method which, given a document to be classified, automatically generates an ordered set of appropriate descriptors extracted from a thesaurus. The method creates a Bayesian network to model the thesaurus and uses probabilistic inference to select the set of descriptors having high posterior probability of being relevant given the available evidence (the document to be classified). Our model can be used without preclassified training documents, although its performance improves as more training data become available. It is a multimodal system in the sense that it integrates three different sources of information: in addition to the textual content of the documents, it also manages structural information (the thesaurus' hierarchy of concepts) and semantic information (descriptors and non-descriptors in the thesaurus).
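As a much-simplified stand-in for the Bayesian-network inference described above, the sketch below scores descriptors by lexical matches against their entries (including non-descriptor synonyms) and propagates a damped share of each score to the broader concept, so that structural information also contributes to the ranking. The toy thesaurus, the weight and the additive scoring rule are assumptions for illustration only.

```python
# Toy thesaurus: descriptor -> (broader descriptor or None, lexical
# entries including non-descriptor synonyms). Contents are illustrative.
THESAURUS = {
    "agriculture": (None, {"agriculture", "farming"}),
    "crops":       ("agriculture", {"crops", "harvest"}),
    "livestock":   ("agriculture", {"livestock", "cattle"}),
}

def score_descriptors(document_terms, up_weight=0.5):
    """Rank descriptors by lexical evidence, letting each descriptor
    pass a damped share of its score to its broader concept."""
    scores = {d: float(len(entries & document_terms))
              for d, (_, entries) in THESAURUS.items()}
    # structural information: evidence also supports broader concepts
    for d, (broader, _) in THESAURUS.items():
        if broader is not None:
            scores[broader] += up_weight * scores[d]
    return sorted(scores.items(), key=lambda kv: -kv[1])
```

For a document mentioning "cattle" and "farming", "agriculture" ranks first because it collects both its own match and propagated evidence from "livestock". A deeper hierarchy would require propagating in topological order, and the actual model replaces this heuristic with proper probabilistic inference in the network.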
Given a Pattern Recognition task, Computer Assisted Pattern Recognition can be viewed as a series of solution proposals made by a computer system, followed by corrections made by a user, until an acceptable solution is found. For this kind of system, the appropriate measure of performance is the expected number of corrections the user has to make. In the present work we study the special case in which the solution proposals have a sequential nature. Examples of this type of task include language translation, speech transcription and handwritten text transcription. In all these cases the output (the solution proposal) is a sequence of symbols. In this framework it is assumed that the user always corrects the first error found in the proposed solution. As a consequence, the prefix of the proposed solution up to the last error correction can be assumed error-free in the next iteration. To date, all the techniques in the literature rely on proposing, at each step, the most probable suffix given that a prefix of the ``correct'' output is already known. In the present work we show that this strategy is not optimal when we are interested in minimizing the number of human interactions. Moreover, we describe the optimal strategy, which is also simpler (and usually faster) to compute.
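A toy simulation can make the gap concrete. Under an assumed posterior over length-three outputs and a user who always corrects the first wrong symbol, the sketch below compares the classical most-probable-suffix strategy with a per-symbol strategy that, at each position, picks the symbol with the highest marginal probability given the validated prefix. The distribution and the tie-breaking are illustrative assumptions; the per-symbol rule is one way to realize a strategy of the kind the text alludes to.

```python
# Toy posterior over full output strings of length N (values are
# illustrative; a real system would obtain them from a model).
N = 3
P = {
    "aaa": 0.30, "aab": 0.05, "aba": 0.05, "abb": 0.05,
    "baa": 0.15, "bab": 0.15, "bba": 0.15, "bbb": 0.10,
}

def most_probable_suffix(prefix):
    """Classical strategy: complete the prefix with the suffix of the
    single most probable string consistent with it."""
    best = max((s for s in P if s.startswith(prefix)), key=lambda s: P[s])
    return best[len(prefix):]

def symbolwise_suffix(prefix):
    """Per-symbol strategy: extend the prefix one symbol at a time,
    always picking the symbol with highest marginal probability."""
    out = prefix
    while len(out) < N:
        marg = {}
        for s in P:
            if s.startswith(out):
                marg[s[len(out)]] = marg.get(s[len(out)], 0.0) + P[s]
        out += max(marg, key=marg.get)
    return out[len(prefix):]

def expected_corrections(strategy):
    """Average number of corrections made by a user who always fixes
    the first wrong symbol, validating the prefix up to that point."""
    total = 0.0
    for truth, p in P.items():
        prefix, corrections = "", 0
        while True:
            proposal = prefix + strategy(prefix)
            if proposal == truth:
                break
            first_err = next(i for i in range(N) if proposal[i] != truth[i])
            prefix = truth[: first_err + 1]
            corrections += 1
        total += p * corrections
    return total
```

On this toy distribution the most-probable-suffix strategy averages 1.25 corrections while the per-symbol strategy averages 1.15: maximizing the probability of the whole suffix is not the same objective as minimizing the expected number of interactions.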