
Evaluator

Category Evaluation

Description

Evaluator is a module that evaluates a given question answering (QA) system, assembled from the modules the user chooses.
Users select, through configuration options, which module to use for each step of the QA system.
Once Evaluator runs, it sends the natural language questions in the answer dataset to Controller one by one.
It compares each true answer in the dataset with the answer list returned by Controller to calculate the accuracy of the system.

Function

Evaluator evaluates the overall QA system using an answer set.
For instance, suppose the answer set contains only two questions, "Which rivers flow through Seoul?" and "Who is the author of 'Samguk Sagi'?"
Suppose further that Controller gives the correct answer for the former question but fails on the latter.
Then Evaluator returns an accuracy of 1/2 = 0.5 to the user.
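
The two-question example above can be sketched in a few lines of Python. This is only an illustration, not the Evaluator's actual code: the function names `evaluate` and `stub_controller`, the `(question, gold answers)` record layout, and the gold answers themselves are hypothetical. The matching rule (every gold answer must appear in the returned answer list) is also an assumption, since the page does not specify how answers are compared.

```python
from typing import Callable, Iterable

# Hypothetical record layout: (question, set of gold answers).
# The real NLQ-1 schema may differ.
QAPair = tuple[str, set[str]]


def evaluate(dataset: Iterable[QAPair],
             ask_controller: Callable[[str], list[str]]) -> float:
    """Return the fraction of questions answered correctly."""
    total = 0
    correct = 0
    for question, gold in dataset:
        returned = set(ask_controller(question))
        # Assumed matching rule: all gold answers must be in the returned list.
        if gold <= returned:
            correct += 1
        total += 1
    return correct / total if total else 0.0


def stub_controller(question: str) -> list[str]:
    """Stand-in for Controller: answers the first question, fails the second."""
    canned = {"Which rivers flow through Seoul?": ["Han River"]}
    return canned.get(question, [])


# Illustrative gold answers, not taken from NLQ-1.
dataset = [
    ("Which rivers flow through Seoul?", {"Han River"}),
    ("Who is the author of 'Samguk Sagi'?", {"Kim Bu-sik"}),
]

accuracy = evaluate(dataset, stub_controller)
print(accuracy)  # 0.5
```

Passing the Controller call in as a function keeps the accuracy computation separate from the network request, so the same loop can be tried against a stub before pointing it at a live service.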

Scope and limit

The only dataset usable with Evaluator is the NLQ-1 dataset, which at the moment (March 2016) contains question-answer pairs for Korean and English only, with 40 questions in the set.
Since Evaluator measures the overall accuracy of the given QA system, it is not suited to analyzing individual test cases.
For that purpose, you may use the website for Controller or the website for Evaluator (which provides a dump file for each question).

Issues and discussion

* Is it possible to make use of another answer set like QALD-4?
Not yet (January 13, 2016, OKBQA3.5).
However, support for multiple answer sets is expected to be added in OKBQA4 (http://4.okbqa.org).

* Can I test my own QA system?
Since Evaluator calls Controller to retrieve the answers to each question, only a QA system that follows the OKBQA system structure (TGM, DM, AGM) is applicable.
This structure is targeted at those who want to add their own modules to the OKBQA system structure and check their performance.


Maintainer: Jeong-uk Kim {prismriver@kaist.ac.kr}
Maintainer prismriver@kaist.ac.kr
Source-code URL https://github.com/Jeong-uk/okbqa_evaluator
Homepage URL http://ws.okbqa.org/~b15_2015/
Web service URL 121.254.173.77:31990/evaluationall
Sample cURL command curl -i -H "Content-Type: application/json" -X POST -d '_sample_input_' 121.254.173.77:31990/evaluationall
Sample input
Sample output
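
For callers who prefer Python over cURL, the same POST request can be built with the standard library. This is a sketch under assumptions: the `http://` scheme is added because cURL defaults to it when none is given, `_sample_input_` is the placeholder copied from the cURL command rather than a real payload, and `build_request` is a hypothetical helper name. The block only constructs the request; the sending step is left commented out because the service may not be reachable.

```python
import urllib.request

# Web service URL from this page, with the http:// scheme made explicit.
URL = "http://121.254.173.77:31990/evaluationall"

# Placeholder copied from the cURL command; replace with a real JSON input.
payload = "_sample_input_"


def build_request(body: str) -> urllib.request.Request:
    """Build the same POST request that the cURL command sends."""
    return urllib.request.Request(
        URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_request(payload)

# To actually send it:
# with urllib.request.urlopen(request, timeout=30) as response:
#     print(response.status, response.read().decode("utf-8"))
```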