A question-answer system (QA system; also question-answering system) is an information system that can receive questions and answer them in natural language. In other words, it is a system with a natural-language interface.
Contents
- 1 Classification
- 2 Architecture
- 3 Work scheme
- 4 Problems
- 5 Development of question-answer systems
- 6 Quality assessment of question-answer systems
- 7 See also
- 8 Notes
- 9 Literature
- 10 Links
Classification
Question-answer systems can be divided into:
- Highly specialized QA systems, which work in specific domains (for example, medicine or car maintenance).
- General QA systems, which work with information from all areas of knowledge, making it possible to search across related areas.
Architecture
The first QA systems [1] were developed in the 1960s and were natural-language shells for expert systems focused on specific domains. Modern systems are designed to find answers to questions in a supplied collection of documents using natural language processing (NLP) techniques.
Modern QA systems usually include a special module, the question classifier, which determines the type of the question and, accordingly, the type of the expected answer. After this analysis, the system applies progressively more sophisticated and fine-grained NLP methods to the supplied documents, discarding irrelevant information at each step. The crudest method, document search, uses an information retrieval system to select parts of the text that potentially contain an answer. A filter then picks out phrases that resemble the expected answer (for example, for a “Who ...” question the filter returns pieces of text containing names of people). Finally, the answer selection module finds the correct answer among these phrases.
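The stages described above can be sketched as a toy pipeline. This is only an illustration under simplifying assumptions, not the design of any particular system: the function names, the question-word table, the stopword list, and the regular expressions are all hypothetical stand-ins for real NLP components.

```python
import re
from collections import Counter

# Hypothetical table: first word of the question -> expected answer type.
QUESTION_TYPES = {"who": "PERSON", "when": "DATE", "where": "PLACE"}

# Hypothetical stopword list, ignored when matching the question to text.
STOPWORDS = {"who", "when", "where", "what", "was", "is",
             "the", "a", "an", "of", "in", "by"}

# Toy patterns standing in for a real named-entity recognizer.
CANDIDATE_PATTERNS = {
    "PERSON": re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)+\b"),
    "DATE": re.compile(r"\b\d{4}\b"),
}


def classify_question(question):
    """Question classifier: guess the expected answer type from the first word."""
    words = question.strip().split()
    first = words[0].lower() if words else ""
    return QUESTION_TYPES.get(first, "OTHER")


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve_passages(question, documents, top_n=3):
    """Crude document search: rank sentences by word overlap with the question."""
    query = set(tokenize(question)) - STOPWORDS
    scored = []
    for doc in documents:
        for sentence in re.split(r"(?<=[.!?])\s+", doc):
            overlap = len(query & set(tokenize(sentence)))
            if overlap:
                scored.append((overlap, sentence))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [sentence for _, sentence in scored[:top_n]]


def filter_candidates(passages, answer_type):
    """Filter: keep only phrases that look like the expected answer type."""
    pattern = CANDIDATE_PATTERNS.get(answer_type)
    if pattern is None:
        return []
    return [match for passage in passages for match in pattern.findall(passage)]


def select_answer(candidates):
    """Answer selection: pick the most frequent candidate, or None if there is none."""
    if not candidates:
        return None
    return Counter(candidates).most_common(1)[0][0]


def answer(question, documents):
    """Run the four stages in order and return the selected answer."""
    answer_type = classify_question(question)
    passages = retrieve_passages(question, documents)
    candidates = filter_candidates(passages, answer_type)
    return select_answer(candidates)
```

With a small made-up collection such as `["Alexander Bell invented the telephone in 1876.", "The telephone was invented by Alexander Bell."]`, calling `answer("Who invented the telephone?", docs)` passes through all four stages. Each stage narrows the text handed to the next one, which mirrors the gradual discarding of irrelevant information described above.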
Scheme of work
The performance of a question-answer system depends on the effectiveness of the text analysis methods it uses and on the quality of the text base: if the base does not contain answers to the questions, the QA system can find little. A larger base is better, but only if it contains the necessary information. Large repositories (such as the Internet) contain a great deal of redundant information [2]. This has the following consequences:
- Because the information is presented in different forms, the QA system is more likely to find the answer.
- Correct information is often repeated, so errors in the retrieved answers can be minimized.
- The accuracy of the answers depends significantly on the reliability of the information in the repositories, as well as on the effectiveness of the methods used to analyze the information and extract answers.
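The last two points can be illustrated with a small voting sketch: candidate answers extracted from different sources are counted, and each vote can optionally be weighted by how reliable its source is considered to be. The function names, the `reliability` mapping, and the sample data are hypothetical and serve only to show the idea.

```python
from collections import defaultdict


def normalize(answer):
    """Collapse case and whitespace so trivially different spellings count as one."""
    return " ".join(answer.lower().split())


def vote(candidates, reliability=None, default_weight=1.0):
    """Pick the answer with the largest total weight.

    candidates  -- iterable of (answer, source) pairs
    reliability -- optional dict mapping a source to its weight (an assumption:
                   real systems would have to estimate these weights somehow)
    Returns (best_answer, share_of_total_weight), or (None, 0.0) if empty.
    """
    reliability = reliability or {}
    scores = defaultdict(float)
    surface_form = {}
    for answer, source in candidates:
        key = normalize(answer)
        scores[key] += reliability.get(source, default_weight)
        surface_form.setdefault(key, answer)  # remember the first spelling seen
    if not scores:
        return None, 0.0
    best = max(scores, key=scores.get)
    return surface_form[best], scores[best] / sum(scores.values())


# Made-up candidates for "When was Ivan the Terrible born?"
candidates = [("1530", "site_a"), ("1530", "site_b"), ("1533", "site_c")]
best, share = vote(candidates)
```

With equal weights, the repeated answer wins because it appears in two of the three sources. If a single source is given a much higher weight, it can outvote the others, which is why the reliability of the repositories matters as much as the amount of repetition.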
Problems
In 2002, a group of researchers wrote a research plan for question-answer systems [3]. The plan proposed the following issues for consideration:
- Types of questions
- Different questions require different methods of finding answers. It is therefore necessary to compile or refine systematic lists of possible question types.
- Question processing
- The same information can be requested in different ways. Effective methods are needed for understanding and processing the semantics (meaning) of a sentence. It is important that the program recognizes equivalent questions regardless of the style, wording, syntactic relations, or idioms used. Ideally, the QA system would split complex questions into several simple ones and correctly interpret context-dependent phrases, possibly clarifying them with the user in a dialogue.
- Contextual questions
- Questions are asked in a specific context. The context can clarify the query, resolve ambiguity, or make it possible to follow the user's train of thought across a series of questions.
- Sources of knowledge for the QA-system
- Before answering a question, it is worth checking which text databases are available. Whatever text processing methods are used, the right answer cannot be found if it is not in the databases.
- Answer extraction
- Carrying out this procedure correctly depends on a large number of factors: the complexity of the question, its type, its context, the quality of the available texts, the search method, and so on. Text processing methods therefore need to be studied with great care, and this problem deserves special attention.
- Answer formulation
- The answer should be as natural as possible. In some cases, simply extracting it from the text is sufficient. For example, if the question asks for a name (of a person, a device, a disease), a quantity (an exchange rate, a length, a size), or a date (“When was Ivan the Terrible born?”), a direct answer is enough. Sometimes, however, the system has to deal with complex queries, which require special algorithms for merging answers from different documents.
- Real-time question answering
- The system should find answers in the repositories within a few seconds, regardless of the complexity and ambiguity of the question or the size and breadth of the document base.
- Multilingual queries
- Development of systems that work and search in other languages (including automatic translation).
- Interactivity
- The information that a QA system offers as an answer is often incomplete. The system may have identified the question type incorrectly or “misunderstood” the question. In that case, the user may want not only to reformulate the query but also to clarify it with the program through dialogue.
- Reasoning (inference) mechanism
- Some users would like an answer that goes beyond the available texts. This requires adding to the QA system knowledge that is common to most domains (see general ontologies in computer science), as well as tools for automatically inferring new knowledge.
- User profiles for QA systems
- Information about the user, such as areas of interest, manner of speech and reasoning, and facts assumed by default, could significantly improve system performance.
Directions of development of question-answer systems
Since the first prototypes of question-answer systems appeared, their field of application has expanded considerably [4]. For example, they are used to answer questions involving time and geolocation, questions asking for definitions of concepts, and bibliographic, multilingual, and multimedia questions (visual, audio, and video information). Related areas are also being explored, such as building interactive QA systems (which ask clarifying questions to refine the initial one), reusing answers and representing knowledge, using inference over available information to answer questions, predicting which questions will be asked, and sentiment analysis.
Evaluation of the quality of question-answer systems
Question-answer systems are discussed on an ongoing basis within the following projects: TREC [5], CLEF [6], NTCIR [7], and ROMIP [8].
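As an illustration of how such evaluations can score a system, the sketch below computes mean reciprocal rank (MRR), a measure used in early TREC question-answering evaluations: for each question the system returns a ranked list of answers, and the score is the reciprocal of the rank of the first correct one. The text above does not say which measures each project uses, so this is an assumed example; the function names and sample data are hypothetical.

```python
def reciprocal_rank(ranked_answers, correct_answers):
    """Return 1/rank of the first correct answer, or 0.0 if none is correct."""
    for rank, answer in enumerate(ranked_answers, start=1):
        if answer in correct_answers:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs):
    """Average the reciprocal rank over (ranked_answers, correct_answers) pairs."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(reciprocal_rank(ranked, correct) for ranked, correct in runs) / len(runs)


# Made-up results for three questions.
runs = [
    (["1530", "1533"], {"1530"}),   # correct answer at rank 1 -> 1.0
    (["Kazan", "Moscow"], {"Moscow"}),  # correct answer at rank 2 -> 0.5
    (["Volga"], {"Don"}),           # no correct answer -> 0.0
]
score = mean_reciprocal_rank(runs)  # (1.0 + 0.5 + 0.0) / 3 = 0.5
```

A measure of this kind rewards systems that place the correct answer near the top of the list and gives no credit when the correct answer is missing altogether.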
See also
- Virtual digital assistant: Siri (WolframAlpha)
- Nigma, an intelligent search engine
- IBM Watson
Links
- QA systems and demo versions
- START, one of the first question-answer systems made available on the Internet, on the MIT website.
- The AskNet Search question-answer system on the site asknet.ru (originally Stocona Search).
- Question-answer system BrainBoost on the website Answers.com (originally BrainBoost.com).
- QA system built into the Ask.com search engine.
- OpenEphyra, an open-source question-answer system.
- AnswerBus.
- Multilingual QA system askEd! M (English, Japanese, Chinese, Russian, and Swedish).
- Evi project from True Knowledge.
- Ephyra.
- Russian-language question-answer system Robochat.
- Specialized QA systems
- EAGLi: MEDLINE question answering engine (English).