How to autocomplete a form using linked data

The objective of this tutorial is to build a simple autocomplete feature based on data extracted from Wikipedia (see the online demo).

http://autocomplete.linkeddata.center/

Background

Because it is not always easy to see the power of SPARQL and Semantic Web technologies in day-to-day programming, we provide a simple example that solves a very common problem: autocompleting an input field by selecting data from a large dataset.

Suppose that you want to write an autocomplete script that helps a user type the name of a river in a form.
Suppose that you also want it to be available in multiple languages.
You would face a key, time-consuming problem: populating and maintaining the big dataset needed to drive the script.

This is where the Semantic Web does its magic: you can use DBpedia to access the full "wisdom of the crowd" contained in Wikipedia as linked data and get a list of all rivers, in any language!

DBpedia is a great public service, but unfortunately it does not guarantee any SLA.
Sometimes the service is down for maintenance, and you can't predict when that will happen. That is not acceptable if you need to build a solid and reliable application.

A reasonable solution is to copy the data you need from DBpedia to your own knowledge base system, so you can safely use it in your application.

This is where the LinkedData.Center service plays its role.
LinkedData.Center lets you quickly create and host your own knowledge base, populated from any linked open data sources, private and/or public. To use the knowledge base, LinkedData.Center exposes a dedicated, password-protected SPARQL endpoint that is fully compliant with the latest W3C Semantic Web standards. You can create data mashups, apply rules and data inference, and use many other features. Last but not least, LinkedData.Center keeps your knowledge base aligned with its data sources by re-indexing them automatically, and only when needed.

The project

Autocomplete is an open source PHP project hosted on GitHub, and you are encouraged to clone and customize it. It consists of a JavaScript/HTML page, a server script and a knowledge base configuration data file.


The HTML page is a standard implementation of the jQuery UI remote autocomplete widget.
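
As an illustration only, the following sketch shows one way such a page might wire the widget to the server script. It is not taken from the project's index.html: the helper name makeSource, the input id #term and the relative path "api" are assumptions. The JSON fetcher is passed in as an argument so that the helper can be exercised outside a browser.

```javascript
// Hypothetical helper: builds a callback suitable for the jQuery UI
// autocomplete "source" option. getJSON is injected by the caller.
function makeSource(apiUrl, extraParams, getJSON) {
  return function (request, response) {
    // jQuery UI passes the text typed so far as request.term.
    var params = Object.assign({ term: request.term }, extraParams);
    getJSON(apiUrl, params, function (labels) {
      // The server script is expected to return a JSON array of strings.
      response(Array.isArray(labels) ? labels : []);
    });
  };
}

// Browser usage (assumes jQuery and jQuery UI are loaded on the page):
// $("#term").autocomplete({
//   minLength: 2,
//   source: makeSource("api", { lang: "en", class: "River" },
//     function (url, data, done) { return $.getJSON(url, data, done); })
// });
```

Setting minLength to 2 mirrors the rule that the search is enabled only from two characters on, so shorter inputs never reach the server.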

By default, the script connects to the https://hub1.linkeddata.center/demo/sparql endpoint. You can also use your own LinkedData.Center instance (free tiers are available) by changing the credentials in the api code.

The knowledge base is populated starting from a Knowledge Exchange Engine Schema (KEES) file. This file is the core of the project.

Test in a local environment using Vagrant (suggested)

These instructions allow you to install the project on your local workstation using some simple virtualization technologies:

  • install Vagrant and VirtualBox on your workstation.
  • clone the GitHub project into a directory on your workstation and change into it.
  • open a shell and type the command vagrant up. A new virtual machine with all the needed tools will be up and running in a few minutes.
  • point your browser to http://localhost:8080/demo .
  • to destroy your virtual machine just type vagrant destroy

Install on your PHP web server

  • Publish the project on a web server that supports PHP 5 (with the curl extension).

The provisioning script contained in the Vagrantfile gives an idea of a complete api installation on an Ubuntu 14.04 box.

The server side script 

The jQuery UI remote autocomplete widget requires a server-side script. The source of the script that searches labels extracted from Wikipedia is provided in the api/index.php file. Here is the script usage:

 http://your_endpoint_path/api?term=<string>[&list=<number>][&lang=<lang code>][&class=<Automobile|River|Mammal>]

Mandatory parameters:

  • term: filter for auto completion. Search is enabled if you provide at least two characters.

Optional parameters:

  • list: maximum number of items returned. Default 10, max 100, min 1.
  • lang: preferred language, given as a two-letter international language code. Default is en (English).
  • class: the name of the DBpedia class. This example supports Automobile (default), River and Mammal.
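
The parameter rules above can be sketched as follows. This is not the code in api/index.php: the function names are hypothetical, the prefix match on labels is an assumption about how the term filter works, and the query assumes that the classes live in the DBpedia ontology namespace with labels stored as rdfs:label. JavaScript is used here only for illustration.

```javascript
// Hypothetical sketch; not the code in api/index.php.
var CLASSES = ["Automobile", "River", "Mammal"];

// Applies the documented defaults and limits. Returns null when the
// term is shorter than two characters, so no search is performed.
function normalizeParams(q) {
  var term = String(q.term || "").trim();
  if (term.length < 2) return null;
  var list = parseInt(q.list, 10);
  if (isNaN(list)) list = 10;               // default 10
  list = Math.min(100, Math.max(1, list));  // clamp to 1..100
  var lang = /^[a-z]{2}$/.test(q.lang || "") ? q.lang : "en";
  var cls = CLASSES.indexOf(q.class) >= 0 ? q.class : "Automobile";
  return { term: term, list: list, lang: lang, class: cls };
}

// Escapes a value for use inside a double-quoted SPARQL string literal.
function sparqlString(s) {
  return '"' + s.replace(/\\/g, "\\\\").replace(/"/g, '\\"')
                .replace(/\n/g, "\\n").replace(/\r/g, "\\r") + '"';
}

// Builds a SPARQL query that selects labels starting with the term
// (assumed matching rule) for the requested class and language.
function buildQuery(p) {
  return [
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>",
    "PREFIX dbo: <http://dbpedia.org/ontology/>",
    "SELECT DISTINCT ?label WHERE {",
    "  ?item a dbo:" + p.class + " ; rdfs:label ?label .",
    "  FILTER(LANG(?label) = " + sparqlString(p.lang) + ")",
    "  FILTER(STRSTARTS(LCASE(STR(?label)), " +
        sparqlString(p.term.toLowerCase()) + "))",
    "}",
    "LIMIT " + p.list
  ].join("\n");
}
```

Restricting class to a fixed list and lang to two lowercase letters means that only the term needs escaping before it is placed in the query text.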

Example:

the resource:

http://localhost:8080/demo/api?term=ri&list=3&lang=en&class=River

will return something like:

[ "River Garavogue", "River Oykel", "River Afan" ]
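
For callers that build the request programmatically, the example URL above can be assembled as in this minimal sketch. The helper name buildApiUrl is hypothetical; only the parameter names come from the usage line.

```javascript
// Hypothetical helper: builds a request URL from the documented parameters.
function buildApiUrl(base, term, options) {
  var params = new URLSearchParams({ term: term });
  ["list", "lang", "class"].forEach(function (name) {
    // Optional parameters are added only when the caller supplies them.
    if (options && options[name] !== undefined) {
      params.set(name, String(options[name]));
    }
  });
  return base + "?" + params.toString();
}

var url = buildApiUrl("http://localhost:8080/demo/api", "ri",
                      { list: 3, lang: "en", class: "River" });
// url is "http://localhost:8080/demo/api?term=ri&list=3&lang=en&class=River"
```

URLSearchParams takes care of encoding, so terms containing spaces or accented characters are sent safely.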

Try it yourself on the demo site.

The client side html code

The HTML source, with all the required JavaScript, is contained in the index.html file.

Create your own autocompletion application

Please note that you can extend this approach to query any data in billions of linked data sources (public or private) in just four steps:

  1. activate a LinkedData.Center free tier;
  2. add the required dataset to the abox list in your LinkedData.Center endpoint;
  3. start a learn job;
  4. create your domain specific api to allow your application to access data.

To improve performance, you can add caching to the server-side script.
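
The caching idea can be sketched as a small in-memory store with a time-to-live. The project's server script is written in PHP; this sketch uses JavaScript only to illustrate the technique, and the names createCache and cached, as well as the choice of cache key, are assumptions.

```javascript
// Hypothetical in-memory cache with a time-to-live in milliseconds.
// "now" is injectable so that expiry can be tested without waiting.
function createCache(ttlMs, now) {
  now = now || Date.now;
  var store = new Map();
  return {
    get: function (key) {
      var hit = store.get(key);
      if (!hit) return undefined;
      if (now() - hit.time >= ttlMs) {
        store.delete(key);  // entry expired
        return undefined;
      }
      return hit.value;
    },
    set: function (key, value) {
      store.set(key, { time: now(), value: value });
    }
  };
}

// Wraps a lookup function so that repeated requests with the same
// parameters reuse the stored result instead of querying again.
function cached(cache, lookup) {
  return function (params) {
    var key = JSON.stringify([params.term.toLowerCase(), params.list,
                              params.lang, params.class]);
    var value = cache.get(key);
    if (value === undefined) {
      value = lookup(params);
      cache.set(key, value);
    }
    return value;
  };
}
```

Because users tend to type the same first letters, even a short time-to-live can save many round trips to the SPARQL endpoint.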

License

The Autocomplete project is licensed under the MIT License.