What is a "data question"?

A "data question" is a request for smart data whose answer is expected to be a table of data. In practice, a data question is described by a plain English sentence and a set of expected answer fields:

  • "Zip codes for all Italian administrative areas" (municipality, zip, areacode)
  • "People who were born in Berlin before 1900" (name, birth, death, person)
  • "Soccer players who were born in a country with more than 10 million inhabitants and who played as goalkeeper for a club that has a stadium with more than 30,000 seats, where the club's country is different from the birth country" (soccerplayer, countryOfBirth, teamcountryOfTeam, stadiumcapacity)
  • "Localities in the Italian province of Prato with number of foreigners of Asian origin" (Locality, Asian_Foreigns_Number)
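The pairing of a sentence with its answer fields can be pictured as a small data structure. The following is only an illustrative sketch: the class name DataQuestion and the method check_row are hypothetical and not part of any real API, and the sample row contains placeholder values rather than real data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataQuestion:
    """Hypothetical container: a plain English sentence plus its answer fields."""

    sentence: str
    answer_fields: tuple

    def check_row(self, row: dict) -> bool:
        """Return True if the row has exactly the expected answer fields as columns."""
        return set(row) == set(self.answer_fields)


# One of the example questions listed above.
berlin = DataQuestion(
    sentence="People who were born in Berlin before 1900",
    answer_fields=("name", "birth", "death", "person"),
)

# Placeholder row, used only to show the expected shape of one table line.
sample_row = {
    "name": "<name>",
    "birth": "<birth date>",
    "death": "<death date>",
    "person": "<person URI>",
}
```

Each row of the answer table is then expected to carry one value per answer field, which is what check_row verifies.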

Learn more

Formally, a data question is defined as "a natural language sentence that can be expressed as a parametric SPARQL query that uses the SELECT query form and well-known ontologies".
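As a rough illustration of this definition, the sketch below shows how the question "People who were born in Berlin before 1900" might be written as a parametric SELECT query. It is an assumption, not the translation actually used by the service: it borrows the DBpedia and FOAF vocabularies as examples of well-known ontologies, and the function name build_query and its two parameters (city, before_year) are hypothetical.

```python
import re
from string import Template

# Assumption: DBpedia-style vocabulary (dbo:, dbr:) and FOAF stand in for
# the "well-known ontologies"; the real service may use different ones.
QUERY_TEMPLATE = Template("""\
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?name ?birth ?death ?person
WHERE {
  ?person dbo:birthPlace dbr:${city} ;
          dbo:birthDate ?birth ;
          foaf:name ?name .
  OPTIONAL { ?person dbo:deathDate ?death }
  FILTER (?birth < "${before_year}-01-01"^^xsd:date)
}
""")


def build_query(city: str, before_year: int) -> str:
    """Fill the two parameters of the query template.

    The city is restricted to letters and underscores so that the
    substituted value cannot alter the structure of the query.
    """
    if not re.fullmatch(r"[A-Za-z_]+", city):
        raise ValueError(f"unsupported city name: {city!r}")
    return QUERY_TEMPLATE.substitute(city=city, before_year=f"{before_year:04d}")


query = build_query("Berlin", 1900)
```

The variables after SELECT match the answer fields of the question (name, birth, death, person), and changing the parameters yields the same question for another city or year.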

If you are curious about how example questions are translated into SPARQL queries, click here.

If you are not interested in these technical details, you can concentrate only on the sentence and the list of answer fields, leaving our data scientists to manage the rest.

Are there limits to data questions?

In theory, Linked Data technologies ensure the computability of any question that can be expressed in a description logic language.

In practice, you can answer any question whose computation time does not exceed 600 seconds (i.e., the maximum time allowed for a SPARQL query).

Of course, the knowledge base must contain enough data to produce an answer to your questions.
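
The 600-second limit mentioned above can be mirrored on the client side. The sketch below is only an assumption about how a client might call a SPARQL endpoint over HTTP: the endpoint URL is a placeholder, the helper names build_request and run_query are hypothetical, and the timeout passed to urlopen is a socket timeout that only approximates a limit on total query time.

```python
import urllib.parse
import urllib.request

MAX_QUERY_SECONDS = 600  # maximum time allowed for a SPARQL query, as stated above
ENDPOINT = "https://example.org/sparql"  # placeholder, not the real endpoint


def build_request(query: str, endpoint: str = ENDPOINT) -> urllib.request.Request:
    """Build a SPARQL protocol GET request asking for JSON results."""
    params = urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        f"{endpoint}?{params}",
        headers={"Accept": "application/sparql-results+json"},
    )


def run_query(query: str, endpoint: str = ENDPOINT) -> bytes:
    """Send the query, giving up if the server stays silent beyond the limit."""
    request = build_request(query, endpoint)
    with urllib.request.urlopen(request, timeout=MAX_QUERY_SECONDS) as response:
        return response.read()


request = build_request("SELECT * WHERE { ?s ?p ?o } LIMIT 1")
```

A query that exceeds the limit would surface in run_query as a timeout error, which the caller can treat as a sign that the question needs to be narrowed.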