Knowledge bases. Part 2. Freebase: querying the Google Knowledge Graph
More than a year ago, Google announced that its search would from now on be backed by the mysterious Knowledge Graph. Perhaps not everyone knows that a significant part of the Graph's data is available for anyone to use through well-documented APIs. That part is the Freebase knowledge base, maintained by Google and by enthusiasts. In this article we will first fool around a little and then try a few simple queries in MQL.
This article is the second in the series on knowledge bases. Stay tuned.
- Part 1. Introduction
- Part 2. Freebase: querying the Google Knowledge Graph
- Part 3. DBpedia: the core of the Linked Data world
- Part 4. Wikidata: the semantic Wikipedia
Google Knowledge Graph from the point of view of an ordinary user
One of the visible manifestations of the Google Knowledge Graph is the information panel that briefly describes the object you are searching for. Such panels appear most often in searches for people and somewhat less often for geographical names. They also appear more often for queries made in English in the English interface, but we will stick to Russian where possible.
For example, a search for Roger Waters gives the following result:
Click through the links in the infobox and look at the URL: it uses the stick parameter, whose value is some kind of identifier of the form
&stick=H4sIAAAAAAAAAONg[VuLQz9U3]<ID>AAAA
When the Knowledge Graph had just appeared, you could show the uninitiated some street magic, for example by adding the &stick parameter from Marilyn Monroe to a query for Stephen King:
That loophole has since been closed, and it was of little use anyway, so let's look at something more useful. For example, it recently became possible to compare several objects using the keyword vs:
Google promises to add many more goodies related to smart search and question answering, and the Knowledge Graph is one of the pillars on which this intelligence rests. What is especially great for us is that part of the Knowledge Graph is open for anyone to use.
Freebase: a subgraph of the Google Knowledge Graph
Let's start with some history. The company Metaweb began work on its knowledge base in 2005. In the way it was filled with data, Freebase most resembles DBpedia: the lion's share of the knowledge in Freebase came from Wikipedia. It differed from DBpedia, first, in that the entered data could be corrected manually and, second, in that Freebase did not shy away from other data sources. Unlike the DBpedia team, the people at Metaweb did not care much about publishing scientific articles (although they have recently started; here is an interesting list), and they admitted that the code of the main component, graphd, is unlikely ever to see the light of day.
In 2010 Metaweb was bought by Google but, according to the Freebase mailing list, the search giant did not interfere much in the affairs of the newly acquired team. After the release of a colorful video in which Google tears its competitors apart with its new intelligent semantic technologies, representatives of Metaweb (now Google) confirmed that Freebase is a very important part of the Knowledge Graph, along with Wikipedia and the CIA World Factbook. During the big cleanup that unified all of Google's APIs, the Freebase programming interface did not undergo drastic changes, and its documentation moved to developers.google.com. To ask the knowledge base something, we still use the query language MQL (Metaweb Query Language, pronounced "Mickle").
First query and the query editor
Let's start with a simple question and ask Freebase for some fact, such as the date of birth of Leonardo da Vinci:
www.googleapis.com/freebase/v1/mqlread?query={"/type/object/id":"/en/leonardo_da_vinci","/people/person/date_of_birth":null}
We get quite a reasonable result:
{
"result": {
"/type/object/id": "/en/leonardo_da_vinci",
"/people/person/date_of_birth": "1452-04-15"
}
}
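As a rough illustration of how such a request URL is put together, here is a minimal Python sketch. It only builds the URL and sends nothing. The https:// scheme, the helper name build_mqlread_url and the compact JSON encoding are assumptions made for this example; the endpoint path and the query itself are taken from the text above.

```python
import json
from urllib.parse import quote

# Endpoint path as quoted in the article; the https:// scheme is an assumption.
MQLREAD_ENDPOINT = "https://www.googleapis.com/freebase/v1/mqlread"


def build_mqlread_url(query):
    """Serialize an MQL query (a Python dict) and put it in the `query` parameter."""
    # None becomes JSON null, which is the MQL placeholder for "fill this in".
    serialized = json.dumps(query, separators=(",", ":"))
    # Percent-encode everything so that braces, quotes and slashes are URL-safe.
    return f"{MQLREAD_ENDPOINT}?query={quote(serialized, safe='')}"


query = {
    "/type/object/id": "/en/leonardo_da_vinci",
    "/people/person/date_of_birth": None,
}
url = build_mqlread_url(query)
print(url)
```

Keeping the query as a dictionary and serializing it at the last moment avoids hand-escaping quotes and braces in the URL.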
To make experimenting easier, we will use the query editor kindly provided by Freebase.
The editor is terribly convenient and has a wonderful query auto-completion feature: when in doubt, just press Ctrl+Enter and you get helpful contextual hints. The lower panel of the editor holds useful tools that are described in detail in the guide. For self-study I particularly recommend the Examples button, which contains sample queries that clarify many features of MQL.
Here is our request and the response to it:
Request | Response |
---|---|
|
|
Let us examine this query in detail. We specified the identifier of the Freebase object using the key id. All objects have identifiers, and id is short for /type/object/id. There are many other /type/object properties that every Freebase entity has; they are covered later.
The object with the ID /en/leonardo_da_vinci may have a property /people/person/date_of_birth whose value we do not know. In place of that value we put the special word null, and in the response Freebase writes the value from the database in its place.
As you can see, the request and the response are symmetric.
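The symmetry between request and response can be imitated in a few lines of Python. This is only a toy sketch: FACTS is a hypothetical in-memory stand-in for the knowledge base and answer is an invented helper, neither of which is part of Freebase.

```python
# Hypothetical in-memory stand-in for the knowledge base (not the real Freebase).
FACTS = {
    "/en/leonardo_da_vinci": {
        "/people/person/date_of_birth": "1452-04-15",
    },
}


def answer(query, facts=FACTS):
    """Return a copy of the query with every None (null) replaced by the stored value."""
    stored = facts[query["id"]]
    response = {}
    for prop, value in query.items():
        # null marks "I do not know this, please fill it in"; anything else is echoed back.
        response[prop] = stored.get(prop) if value is None else value
    return response


query = {"id": "/en/leonardo_da_vinci", "/people/person/date_of_birth": None}
response = answer(query)
print(response)
```

The response has exactly the same keys as the request, in the same order; only the null has been replaced.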
Complex query
Now, so that you are left with even more questions, we will build a rather complicated MQL query and explain it briefly. After that we can move on to a detailed study of the structure of Freebase and an overview of the language features.
So here is our query (taken from the MQL guide):
Request | Response |
---|---|
| [{ |
Let's briefly describe the MQL features used in this query.
First, as you can see, the entire query is wrapped in [ { } ], which means that we expect an array of objects as the result, not a single object as in the case of { }.
Lines 2-4 should not cause any problems: we are looking for an object of type album (/music/album), we want to get its name, and we are not interested in albums called "Greatest Hits".
In lines 5-8 and 11-15, using the OR operator |=, we say that we are interested in albums whose release date is 1978 or 1979. We now turn to the genre:
"genre":[],
"a:genre": "New Wave",
"b:genre|=": [
"Punk Rock",
"Post-punk",
"Progressive rock"
],
The first line says that we want to get the list of genres of these albums in the response. To do this, we added an empty list [ ] to the query. Then we say that we are only interested in albums whose genres include New Wave together with at least one genre from the list "Punk Rock", "Post-punk", "Progressive rock".
Finally, lines 23-24 contain MQL directives: we want only two results (limit), and we want them sorted by name (sort).
JSON and MQL
MQL queries and responses are JSON objects, so for the youngest readers (or for those who are not web developers) we will say a few words about JSON.
General information about JSON
JSON (JavaScript Object Notation) is a format designed for exchanging data as key-value pairs. JSON was originally used to serialize JavaScript objects, but it quickly became language-independent and, thanks to its simplicity, is now loved and respected by programmers across many languages and platforms.
The simplest JSON object is the empty object. It is written as follows:
{}
Now let's create an object that stores information about Leonardo da Vinci. At first we restrict ourselves to his name. To do this, we put the key and the value in quotation marks and separate them with a colon:
{ "name" : "Leonardo di ser Piero da Vinci" }
Let's add a few more facts about Leonardo, separated by commas:
{
"name" : "Leonardo di ser Piero da Vinci",
"date_of_birth": "1452-04-15"
}
Now we need to state what da Vinci's profession was. And he had many of them: sculptor, painter, architect and much else. To assign several values to one key, JSON uses a list of values enclosed in square brackets and separated by commas:
{
"name" : "Leonardo di ser Piero da Vinci",
"date_of_birth": "1452-04-15",
"profession": [
"Architect",
"Engineer",
"Anatomist",
"Inventor",
"Artist",
"Sculptor"
]
}
One more thing you should know about JSON is the subobject. Subobjects are very simple: after the key, you insert a new set of key-value pairs in curly braces. In Leonardo's case we can try to represent his place of birth, the village of Anchiano in Italy. We say that the key "place_of_birth" corresponds to an object named Anchiano that is contained by Italy:
{
"name" : "Leonardo di ser Piero da Vinci",
"date_of_birth": "1452-04-15",
"profession": [
"Architect",
"Engineer",
"Anatomist",
"Inventor",
"Artist",
"Sculptor"
],
"place_of_birth": {
"name": "Anchiano",
"containedby": "Italy"
}
}
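To see how such a document looks from a program, here is a short Python sketch that parses the Leonardo object with the standard json module. The variable names are chosen for illustration only, and the date follows the response Freebase returned earlier in the article.

```python
import json

text = """
{
  "name": "Leonardo di ser Piero da Vinci",
  "date_of_birth": "1452-04-15",
  "profession": ["Architect", "Engineer", "Anatomist", "Inventor", "Artist", "Sculptor"],
  "place_of_birth": {"name": "Anchiano", "containedby": "Italy"}
}
"""

leonardo = json.loads(text)

# A JSON object becomes a dict, a list becomes a list, a subobject becomes a nested dict.
print(leonardo["name"])
print(len(leonardo["profession"]))
print(leonardo["place_of_birth"]["containedby"])
```

Keys map to strings, the list of professions keeps its order, and the subobject is reached by chaining two lookups.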
This is not JSON
In general, MQL queries do not have to be valid JSON objects. MQL is a strict superset of JSON and permits all sorts of liberties. One of the ideas behind Metaweb's products is that a program should forgive its users the mistakes and slips they make. This idea is found in other languages and programs, first and foremost on the World Wide Web: it is fine if some parts of the HTML are malformed, the browser should still try to display the document.
For example, here is a well-formed JSON query that looks for people with a rare and valuable profession:
{
"id": "/en/pope",
"/people/profession/people_with_this_profession": [{
"name": null,
"limit": 4
}]
}
We can remove the quotation marks, and the query will still work:
{
id: /en/pope,
/people/profession/people_with_this_profession: [{
name: null,
limit: 4
}]
}
Closing the brackets and separating pairs with colons is not required either, so here is an absolutely outrageous example:
id /en/pope
/people/profession/people_with_this_profession [{
name is null
limit 4
How Freebase is organized
The official guide gives a very good introduction to how data is stored inside Freebase. This is not too important for us, because the four-element tuples that Freebase uses internally are completely hidden behind the object paradigm. If you are interested, you can turn to the corresponding page of the manual.
So, Freebase lets us think of it as containing objects. Each object is delimited by curly braces { } and consists of property-value pairs separated by colons. The objects that Freebase returns in response to MQL queries are valid JSON objects, but they are not like objects in the OOP paradigm. It is best to think of them as unordered sets of pairs.
Identifiers can serve as properties (that is, what comes before the colon) in MQL. Values can be identifiers, literals, arrays and, finally, nested objects.
Freebase has rules for how identifiers must be built. An identifier consists of a namespace and a key separated by a forward slash /. Consider, for example, the identifier /people/person/date_of_birth: here date_of_birth is the key and /people/person is the namespace.
Identifiers are unique. They are not required to carry meaning, but it is often easy to tell from an object's identifier what it is about.
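The namespace/key rule can be sketched in Python as follows. The helper split_identifier is hypothetical, and treating "/" as the namespace of top-level identifiers is an assumption of this example.

```python
def split_identifier(identifier):
    """Split a Freebase-style identifier into (namespace, key) at the last slash."""
    namespace, _, key = identifier.rpartition("/")
    # Assumption: a top-level identifier such as "/en" sits in the root namespace "/".
    return (namespace or "/", key)


print(split_identifier("/people/person/date_of_birth"))
print(split_identifier("/en/leonardo_da_vinci"))
```

Because the split happens at the last slash, a namespace is itself an identifier and can be split again in the same way.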
Universal properties
All objects in Freebase have the following reserved (universal) properties:
- name: /type/object/name
- key: /type/object/key
- type (usually more than one): /type/object/type
- creation time: /type/object/timestamp
- creator: /type/object/creator
- access mode: /type/object/permission
- global identifier: /type/object/guid
- machine ID: /type/object/mid
We consider here the properties that are most often used in MQL queries: names, identifiers and types.
IDs
Freebase seems to have a lot of identifiers. The main one is /type/object/guid, which is assigned once and for all. It has a short form, /type/object/mid. Queries, however, usually use /type/object/id, which is often human-readable. The most important thing is that no two objects share the same identifier. For example, look at how many people are named Adam Smith: see the article in the English Wikipedia. Only the moral philosopher Adam Smith has the proud identifier /en/adam_smith. All the other Adam Smiths are identified differently, whether they are politicians (/en/adam_smith_1965), players (/en/adam_smith_huddersfield) or anyone else.
You can enter an ID into the search box on Freebase.com and go to the page with the object's properties:
Property /type/object/name
Each object has a name. The name is not unique, and an object usually has several names, one for each language. Most interestingly, this does not complicate queries: when you request names, you are given only the name in the language currently set in Freebase. So values of the name property can be treated as ordinary strings.
Property /type/object/type
This property specifies the type of the object. One object can have multiple types, and it usually does.
First, if the query specifies the type property, the namespace of that type can be omitted from property names. Which properties belong to the type /film/director? Of course, those in the namespace of this type, that is, those that begin with /film/director. For example, consider a query for all films by Stanley Kubrick. The left side shows the query in the abbreviated form that we will use from now on, and the right side shows how it would look if the Metaweb developers were not so kind to us.
Request | Query in full form |
---|---|
|
|
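The abbreviation rule can be made concrete with a small Python sketch that expands short property names into full ones. It is a simplification under stated assumptions: expand_properties is an invented helper, the set of /type/object names is taken from the list of universal properties above, and MQL directives such as limit and sort, key prefixes and operators are not handled.

```python
# Names assumed to live in /type/object (taken from the article's list of universal properties).
UNIVERSAL = {"id", "name", "key", "type", "timestamp", "creator", "permission", "guid", "mid"}


def expand_properties(query):
    """Rewrite abbreviated property names in a flat query as fully qualified ones."""
    type_id = query.get("type")
    full = {}
    for prop, value in query.items():
        if prop.startswith("/"):
            full[prop] = value                     # already fully qualified
        elif prop in UNIVERSAL:
            full["/type/object/" + prop] = value   # shared by every object
        elif type_id is not None:
            full[f"{type_id}/{prop}"] = value      # resolved in the namespace of the type
        else:
            raise ValueError(f"cannot resolve {prop!r} without a type")
    return full


short = {"type": "/film/director", "name": "Stanley Kubrick", "film": []}
print(expand_properties(short))
```

The short form is what one would normally write by hand; the expanded form shows which namespace each property is assumed to come from.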
Secondly, all properties from the /type/object namespace can be abbreviated as well. Why are we allowed to write just id, name, type and so on? Because all objects in Freebase have the type Object.
Different types of MQL queries
We have already looked at quite a few queries, but so far we have not focused on the language itself. First, let's look at how MQL retrieves the desired values. There are the following cases:
- I need to request a single literal value, for example a person's date of birth.
- I need to request an array of values, for example the list of albums of a music group.
- I need to request a single object with its main properties: ID, type and name.
- I need to request an array of objects.
- I need to know everything about an object.
Requesting a single value
If you want Freebase to return an object with the same structure as the request object but with the unknown field filled in, put null in place of that field in the query. We have seen quite a few such examples; here is another. Let's ask where the musician Keith Emerson comes from:
Request | Response |
---|---|
|
|
Requesting an array of values
If we try to use null to request all the albums of a music group, we will get an error. If you expect an array of values, use square brackets []. Freebase will fill this array with the values, separated by commas. Examples with the albums of music groups abound in the official guide, so we will instead find the list of books written by Hawking:
Request | Response |
---|---|
|
There are lots
|
If you do the opposite and request an array where there is only a single value, there is no problem: Freebase converts the result and returns an array with just one value.
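The difference between null and [] can be imitated with a toy Python function. This is a sketch of the behaviour described above, not of how Freebase implements it; fill_placeholder and the sample data are assumptions made for the example.

```python
def fill_placeholder(placeholder, values):
    """Fill a null or [] placeholder from the list of values stored for a property."""
    if placeholder is None:
        # null asks for exactly one value, so several stored values are an error.
        if len(values) > 1:
            raise ValueError("null expects a single value; use [] to request several")
        return values[0] if values else None
    if placeholder == []:
        # [] always yields an array, even when only one value is stored.
        return list(values)
    raise TypeError("this sketch only understands null and []")


origin = ["England"]  # sample data: one value
books = ["A Brief History of Time", "The Universe in a Nutshell"]  # sample data: several values

print(fill_placeholder(None, origin))
print(fill_placeholder([], origin))
print(fill_placeholder([], books))
```

Calling fill_placeholder(None, books) raises an error, which mirrors what happens when null is used for a property that holds several values.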
Requesting objects
Fine, but for my application I want to know not only the titles of Hawking's books but also their release dates, and pictures would be nice too! This is also possible. The array of books we received in the last request only looks like an array of strings. It is actually an array of objects; Freebase simply turns the objects into strings, keeping only their name property. Likewise, England, where our musician comes from, is not just the string "England" but an object.
To get the object representation in a query, use { }, like so:
{
"name": "Keith Emerson",
"type": "/music/artist",
"origin": { }
}
As a result, we are given the most important information about the object: its identifier, its name and its list of types:
Query result
{
"result": {
"origin": {
"id": "/en/england",
"name": "England",
"type": [
"/common/topic",
"/location/location",
"/film/film_subject",
"/book/book_subject",
"/location/administrative_division",
"/film/film_location",
"/location/uk_constituent_country",
"/user/xleioo/winning_night/option_list",
"/location/statistical_region",
"/location/dated_location",
"/symbols/name_source",
"/m/04kp2w0",
"/user/robert/military/military_power",
"/symbols/flag_referent",
"/user/skud/flags/topic",
"/user/skud/names/topic",
"/user/robert/military/topic",
"/base/petbreeds/topic",
"/m/04mp17s",
"/user/xleioo/winning_night/topic",
"/organization/organization_scope",
"/base/charities/geographic_scope",
"/sports/sports_team_location",
"/base/thoroughbredracing/thoroughbred_racehorse_origin",
"/user/tsegaran/random/taxonomy_subject",
"/base/authors/country_of_origin",
"/base/authors/topic",
"/biology/breed_origin",
"/fictional_universe/fictional_setting",
"/government/political_district",
"/olympics/olympic_participating_country",
"/base/summermovies2009/topic",
"/base/leicester/topic",
"/base/popstra/location",
"/location/country",
"/base/england/topic",
"/base/ontologies/ontology_instance",
"/event/speech_topic",
"/user/skud/legal/treaty_signatory",
"/government/governmental_jurisdiction",
"/base/masterthesis/topic",
"/base/horticulture/cultivar_origin",
"/base/horticulture/topic",
"/base/localfood/food_producing_region",
"/base/localfood/topic",
"/sports/sport_country",
"/base/todolists/topic",
"/base/tagit/concept",
"/food/beer_country_region",
"/periodicals/newspaper_circulation_area",
"/location/uk_statistical_location",
"/base/biblioness/bibs_location",
"/base/biblioness/bibs_topic",
"/base/aareas/schema/gb/constituent_country",
"/base/aareas/schema/administrative_area",
"/base/uncommon/topic",
"/base/schemastaging/statistical_region_extra",
"/people/place_of_interment",
"/base/allthingsnewyork/topic",
"/base/events/topic",
"/base/events/geographical_scope",
"/base/tonyfranksbuckley/topic",
"/base/piratesofthewirralpeninsula/topic",
"/military/military_combatant",
"/military/military_post",
"/organization/organization_member"
]
},
"name": "Keith Emerson",
"type": "/music/artist"
}
}
Nested queries
You can get any other information about our musician's country of origin by forming a subquery. For example, I want to know what language is spoken in the country Emerson comes from. Note that I added a type for the country in order to get hints from the query editor:
Request | Response |
---|---|
|
|
Going deeper: to which language family does the language spoken in the country Emerson comes from belong?
Request | Response |
---|---|
|
|
These rather silly queries can easily be generalized into something more useful, for example a query that retrieves a list of musicians who come from English-speaking countries. Note that I now often request the output as an array, and that I also use the word limit, which restricts the output to three results:
Request | Response |
---|---|
|
|
Querying all properties of an object
When constructing a query, it is useful to be able to get all the properties of an object. The construct used for this is very simple: an asterisk in place of the property name and an empty array to be filled with the values:
"*" : []
Request | Response |
---|---|
|
England number of properties
|
How to request objects
For those who got lost in the brackets, here is a summary of the ways objects can be requested in Freebase:
"property" : null
"property" : []
"*" : []
"property" : {}
"property" : [{}]
"property" : {subquery}
"property" : [{subquery}]
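As a memory aid, the value forms in this summary can be told apart mechanically. The Python sketch below is only an illustration: describe_placeholder is a hypothetical helper, the descriptions paraphrase the cases discussed in this article, and the "*" form is not covered because it is a special property name rather than a value shape.

```python
def describe_placeholder(value):
    """Describe what an MQL value placeholder asks for, following the cases in this article."""
    if value is None:
        return "a single value"
    if value == []:
        return "an array of values"
    if value == {}:
        return "a single object with its id, name and types"
    if value == [{}]:
        return "an array of objects"
    if isinstance(value, dict):
        return "a single object matching the subquery"
    if isinstance(value, list) and len(value) == 1 and isinstance(value[0], dict):
        return "an array of objects matching the subquery"
    return "a constraint on the value"


for form in (None, [], {}, [{}], {"name": None}, [{"name": None}], "England"):
    print(repr(form), "->", describe_placeholder(form))
```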
That is enough to get started. Of course, MQL also has various comparison operators, regular expressions, and all sorts of AND, OR and NOT. There is even the lovely Acre language, which lets you format query results much as is done in Semantic MediaWiki. Still ahead are the stories about DBpedia and Wikidata. What would you like to read about first?