Wednesday 28 November 2012

Using Wikipedia for extracting hierarchy and building geo-ontology

an article by Quoc-Hung Ngo (University of Information Technology, HoChiMinh City, Vietnam), Son Doan (University of California, San Diego, USA) and Werner Winiwarter (University of Vienna, Austria) published in International Journal of Web Information Systems, Volume 8 Issue 4 (2012)

Abstract

Purpose
This paper aims to serve two main purposes. First, it seeks to provide an overview of the location hierarchy, from the highest divisions (continents) to the lowest divisions (wards, villages), both in reality and as represented in Wikipedia pages. Second, it aims to introduce an approach to building a geographical ontology from Wikipedia.
Design/methodology/approach
The paper first reviews existing applications that extract information from Wikipedia and use it as a data resource for developing natural language processing tools. It also reviews the structure of the Wikipedia pages that present location information. Based on this analysis, the paper then proposes an approach to extract the location hierarchy, as well as geographical characteristics, for the geo-ontology. The approach also rebuilds the relations between locations in the ontology.
Findings
Existing location name systems are mainly based on probabilistic locations mined from data, and they lack the administrative relations between locations at all levels and for all countries and territories. The literature review of geographical hierarchies and of the use of Wikipedia for natural language processing tasks suggests an approach to building a geographical ontology from Wikipedia pages. The proposed approach is believed to be the first to provide a full geo-ontology for all countries.
Practical implications
The paper builds a geo-ontology with full levels for all countries and territories. The administrative relations between locations are needed for real-world applications.
Originality/value
The comprehensive overview of existing work on geo-ontology provides a valuable reference for researchers and system developers in related research communities. The proposed approach to building a geographical ontology from Wikipedia offers a promising alternative for building a knowledge system from a free, online, multi-language encyclopedia.
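
As a rough illustration of the kind of administrative hierarchy the abstract describes, the sketch below stores each location with a link to its parent division and walks those links upwards. It is not the authors' method or data model: the class names, the level labels and the sample entries are all assumptions made for this example, and nothing here is extracted from Wikipedia.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Location:
    """One node in the hierarchy (hypothetical structure, not from the paper)."""
    name: str
    level: str                    # e.g. "continent", "country", "city", "district", "ward"
    parent: Optional[str] = None  # name of the enclosing administrative division


class GeoHierarchy:
    """Minimal parent-link store for administrative relations between locations."""

    def __init__(self) -> None:
        self._locations: Dict[str, Location] = {}

    def add(self, name: str, level: str, parent: Optional[str] = None) -> None:
        # A parent must be registered before any of its children.
        if parent is not None and parent not in self._locations:
            raise KeyError(f"unknown parent: {parent}")
        self._locations[name] = Location(name, level, parent)

    def ancestors(self, name: str) -> List[str]:
        """Return the enclosing divisions, nearest first."""
        chain: List[str] = []
        current = self._locations[name].parent
        while current is not None:
            chain.append(current)
            current = self._locations[current].parent
        return chain

    def children(self, name: str) -> List[str]:
        """Return the divisions directly below the given location, sorted by name."""
        return sorted(
            loc.name for loc in self._locations.values() if loc.parent == name
        )


# Illustrative entries only, typed in by hand for the example.
geo = GeoHierarchy()
geo.add("Asia", "continent")
geo.add("Vietnam", "country", parent="Asia")
geo.add("Ho Chi Minh City", "city", parent="Vietnam")
geo.add("District 1", "district", parent="Ho Chi Minh City")
geo.add("Ben Nghe", "ward", parent="District 1")

print(geo.ancestors("Ben Nghe"))
```

With relations stored this way, a question such as "which country is this ward in?" becomes a walk up the parent links, which is the sort of administrative lookup the abstract says purely probabilistic location lists do not support.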

Hazel’s comment:
It sounds as though this would be a valuable tool in the hands of labour market researchers and also for guidance practitioners unsure of geography outside their own area.

