Discussion:
[Bug 72729] New: Create Wikidata API module that is queryable from the local wiki
b***@wikimedia.org
2014-10-30 01:12:57 UTC
https://bugzilla.wikimedia.org/show_bug.cgi?id=72729

Bug ID: 72729
Summary: Create Wikidata API module that is queryable from the local wiki
Product: MediaWiki extensions
Version: unspecified
Hardware: All
OS: All
Status: NEW
Severity: normal
Priority: Unprioritized
Component: WikidataClient
Assignee: wikidata-***@lists.wikimedia.org
Reporter: ***@wikimedia.org
CC: wikidata-***@lists.wikimedia.org
Web browser: ---
Mobile Platform: ---

Extend the MediaWiki API Query module to support basic Wikidata data retrieval
locally. This would allow Wikidata data to be included in other API queries
and even used with generators
(https://www.mediawiki.org/wiki/API:Query#Generators). The minimum requirement
would be to retrieve Wikidata descriptions using page titles or IDs. (This
would facilitate their use in search suggestions.) Other possible capabilities
would include retrieving the Wikidata labels, aliases, claims, and
inter-language links.
--
You are receiving this mail because:
You are on the CC list for the bug.
b***@wikimedia.org
2014-11-03 14:58:06 UTC
https://bugzilla.wikimedia.org/show_bug.cgi?id=72729

***@wikimedia.de changed:

What |Removed |Added
----------------------------------------------------------------------------
Priority|Unprioritized |Normal
CC| |***@wikimedia.de
Whiteboard| |u=dev c=backend p=0
b***@wikimedia.org
2014-11-03 16:49:17 UTC
https://bugzilla.wikimedia.org/show_bug.cgi?id=72729

Lydia Pintscher <***@wikimedia.de> changed:

What |Removed |Added
----------------------------------------------------------------------------
Priority|Normal |High
CC| |***@wikimedia.de
b***@wikimedia.org
2014-11-03 18:34:56 UTC
https://bugzilla.wikimedia.org/show_bug.cgi?id=72729

Kunal Mehta (Legoktm) <***@gmail.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |***@gmail.com

--- Comment #1 from Kunal Mehta (Legoktm) <***@gmail.com> ---
So...basically implement
https://www.mediawiki.org/wiki/Requests_for_comment/Wikidata_API ?
b***@wikimedia.org
2014-11-03 18:40:59 UTC
https://bugzilla.wikimedia.org/show_bug.cgi?id=72729

--- Comment #2 from Ryan Kaldari <***@wikimedia.org> ---
Legoktm: Basically, yes.
b***@wikimedia.org
2014-11-03 18:42:10 UTC
https://bugzilla.wikimedia.org/show_bug.cgi?id=72729

Kunal Mehta (Legoktm) <***@gmail.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
URL| |https://www.mediawiki.org/wiki/Requests_for_comment/Wikidata_API
Summary|Create Wikidata API module that is queryable from the local wiki |Add Wikibase API module that is usable from client wikis and available as a generator & prop module
--
You are receiving this mail because:
You are on the CC list for the bug.
b***@wikimedia.org
2014-11-06 16:07:21 UTC
https://bugzilla.wikimedia.org/show_bug.cgi?id=72729

--- Comment #3 from Daniel Kinzler <***@wikimedia.de> ---
Yuri's RFC is for use on the Repo, though. The idea there is to use Wikibase
stuff as generators. Ryan's request, if I understand it correctly, is to
implement a property module that can be used to provide extra properties for
pages listed by a generator on a client wiki.
--
You are receiving this mail because:
You are on the CC list for the bug.
b***@wikimedia.org
2014-11-06 16:17:03 UTC
https://bugzilla.wikimedia.org/show_bug.cgi?id=72729

--- Comment #4 from Daniel Kinzler <***@wikimedia.de> ---
If I understand correctly, the intended use case is this: you have a list of
local page titles (e.g. from a prefix search), and want to list them; in the
listing, you want to show some extra info from Wikidata, like the description.
The suggestion is to allow API queries to include this extra information using
an API prop module.

This could be done, but I wonder whether it's worth the effort. You can get the
same info easily from Wikidata directly, with a single API call. For example,
to get the Wikidata labels and descriptions, in English, associated with the
pages Birch, Beech, and Beetle on enwiki, you can use the following query:

http://www.wikidata.org/w/api.php?action=wbgetentities&format=json&sites=enwiki&titles=Birch%7CBeech%7CBeetle&props=labels%7Cdescriptions&languages=en%7Cen-ca%7Cen-gb

Isn't this sufficient?
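
A minimal sketch of how the query above could be assembled in Python, using only the standard library. The helper name build_wbgetentities_url is hypothetical; the endpoint and parameter values are copied from the URL above. The sketch only builds the URL and does not send the request.

```python
from urllib.parse import urlencode

# Endpoint taken from the example URL in this comment.
WIKIDATA_API = "http://www.wikidata.org/w/api.php"


def build_wbgetentities_url(site, titles, props, languages):
    """Build a wbgetentities URL for pages on one client wiki (hypothetical helper)."""
    params = {
        "action": "wbgetentities",
        "format": "json",
        "sites": site,
        # Multi-value parameters are pipe-separated; urlencode escapes "|" as %7C.
        "titles": "|".join(titles),
        "props": "|".join(props),
        "languages": "|".join(languages),
    }
    return WIKIDATA_API + "?" + urlencode(params)


url = build_wbgetentities_url(
    "enwiki",
    ["Birch", "Beech", "Beetle"],
    ["labels", "descriptions"],
    ["en", "en-ca", "en-gb"],
)
print(url)
```

Built this way, the result should match the hand-written URL above, since the parameters are emitted in the same order.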
b***@wikimedia.org
2014-11-06 19:20:07 UTC
https://bugzilla.wikimedia.org/show_bug.cgi?id=72729

--- Comment #5 from Ryan Kaldari <***@wikimedia.org> ---
Yes, that's basically what folks are currently doing, but it isn't ideal.
Ideally, we would like to be able to get regular page props and Wikidata data
from a single API call. Also, we would like to avoid the extra DNS lookup of an
external HTTP request in high-traffic contexts (like search suggestions) if
possible.
b***@wikimedia.org
2014-11-07 02:08:11 UTC
https://bugzilla.wikimedia.org/show_bug.cgi?id=72729

--- Comment #6 from Dan Garry <***@wikimedia.org> ---
I second what Kaldari has said. Sure, it's sufficient, but it shouldn't be
necessary. :-)
b***@wikimedia.org
2014-11-15 16:17:51 UTC
https://bugzilla.wikimedia.org/show_bug.cgi?id=72729

--- Comment #7 from Daniel Kinzler <***@wikimedia.de> ---
Considering that with my approach, you would be hitting wbgetentities with a
couple of hundred queries from the mobile search interface, I suppose you
are right: that isn't going to work. wbgetentities needs to load the full
entity structure from the blob store, and that's slow...

We already have the data you want in the wb_terms table. I suppose adding a
client-side module that works much like ApiQueryPageProps would be easy
enough, and should make this a lot faster.

I can't promise that it will be performant enough, though; I hear the API
servers are pretty loaded. An alternative solution would be to add this
information directly to Elastic, so it can be returned directly by the search
module.

By the way, what do you use to generate the original list of local page titles?
action=opensearch? action=wbsearchentities?
b***@wikimedia.org
2014-11-16 15:52:20 UTC
https://bugzilla.wikimedia.org/show_bug.cgi?id=72729

--- Comment #8 from Daniel Kinzler <***@wikimedia.de> ---
I have implemented a pageterms module, see I9b6b52f6b75e4d6a
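
For illustration, a single combined request using such a module might look like the sketch below. The thread gives only the module name, so the prop=pageterms spelling, the wbptterms parameter, the endpoint, and the helper name build_pageterms_url are all assumptions. The sketch only builds a URL for a prefix-search generator on a client wiki; it does not send the request.

```python
from urllib.parse import urlencode

# Client wiki endpoint, chosen for illustration (assumption).
CLIENT_API = "https://en.wikipedia.org/w/api.php"


def build_pageterms_url(prefix, limit=12):
    """Build one query that lists pages by prefix and asks for their terms."""
    params = {
        "action": "query",
        "format": "json",
        # Prefix search used as a generator, so props apply to each hit.
        "generator": "prefixsearch",
        "gpssearch": prefix,
        "gpslimit": str(limit),
        "prop": "pageterms",         # assumed module name
        "wbptterms": "description",  # assumed parameter name and value
    }
    return CLIENT_API + "?" + urlencode(params)


pageterms_url = build_pageterms_url("foo")
print(pageterms_url)
```

The point of the design is that the generator and the prop module run in the same request, so no second call to wikidata.org is needed.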
b***@wikimedia.org
2014-11-17 19:40:43 UTC
https://bugzilla.wikimedia.org/show_bug.cgi?id=72729

Bernd Sitzmann <***@wikimedia.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |***@wikimedia.org

--- Comment #9 from Bernd Sitzmann <***@wikimedia.org> ---
Daniel, the apps currently use both prefixsearch and search generators. I can't
speak for mobile web, but I guess it's similar. When the user clicks search we
perform a title search first, then allow the user to switch to full text search
from there. We currently have to collect the wikibase_items and then send off
another request to wikidata.org to get the descriptions. Like Kaldari mentioned
above, we would like to avoid that. Below are some examples we have currently
implemented.

(1) Title search:
https://en.m.wikipedia.org/w/api.php?action=query&format=json&generator=prefixsearch&gpssearch=foo&gpsnamespace=0&gpslimit=12&prop=pageprops%7Cpageimages&ppprop=wikibase_item&piprop=thumbnail&pithumbsize=96&pilimit=12&list=prefixsearch&pssearch=formula&pslimit=12

(2) Full text search:
https://en.m.wikipedia.org/w/api.php?action=query&format=json&prop=pageprops%7Cpageimages&ppprop=wikibase_item&generator=search&gsrsearch=foo&gsrnamespace=0&gsrwhat=text&gsrinfo=&gsrprop=redirecttitle&gsroffset=0&gsrlimit=12&list=search&srsearch=foo&srnamespace=0&srwhat=text&srinfo=suggestion&srprop=&sroffset=0&srlimit=12&piprop=thumbnail&pithumbsize=96&pilimit=12
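
The two-step flow described in this comment (collect the wikibase_item values, then ask wikidata.org for descriptions) can be sketched roughly as follows. The helper names and the sample response are hypothetical, and the page and item ids are made up for illustration; the sketch assumes the default JSON layout in which action=query returns pages keyed by page id. No network request is made.

```python
from urllib.parse import urlencode

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def collect_wikibase_items(query_response):
    """Return the wikibase_item ids found in an action=query response."""
    pages = query_response.get("query", {}).get("pages", {})
    items = []
    for page in pages.values():
        # Pages without a connected item have no wikibase_item page prop.
        item = page.get("pageprops", {}).get("wikibase_item")
        if item:
            items.append(item)
    return items


def build_descriptions_url(item_ids, language="en"):
    """Build the follow-up wbgetentities request for the collected ids."""
    params = {
        "action": "wbgetentities",
        "format": "json",
        "ids": "|".join(item_ids),
        "props": "descriptions",
        "languages": language,
    }
    return WIKIDATA_API + "?" + urlencode(params)


# Hypothetical response fragment; titles and ids are invented.
sample = {
    "query": {
        "pages": {
            "1": {"pageid": 1, "title": "Example A",
                  "pageprops": {"wikibase_item": "Q1"}},
            "2": {"pageid": 2, "title": "Example B"},
            "3": {"pageid": 3, "title": "Example C",
                  "pageprops": {"wikibase_item": "Q3"}},
        }
    }
}
items = collect_wikibase_items(sample)
second_request = build_descriptions_url(items)
print(second_request)
```

This second round trip is the step the apps would like to drop.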
b***@wikimedia.org
2014-11-19 19:25:01 UTC
https://bugzilla.wikimedia.org/show_bug.cgi?id=72729

Dan Garry <***@wikimedia.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
Blocks| |73616
b***@wikimedia.org
2014-11-20 21:01:16 UTC
https://bugzilla.wikimedia.org/show_bug.cgi?id=72729

Nemo <***@tiscali.it> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |***@tiscali.it

--- Comment #10 from Nemo <***@tiscali.it> ---
As Kaldari noted, "PageTerms" is not self-explanatory. My first thought was it
would contain per-page legal terms (e.g. license).