Yahoo Boss – Google App Engine Integrated

Updated: I see blogs doing evaluations of the Q&A engine. I have to admit, that wasn’t my focus here. The service is merely 50 lines of code … just to demonstrate the integration of BMF and GAE.

Updated: Direct link to the example Question-Answering Service

Today I finally plugged the Yahoo Boss Mashup Framework into the Google App Engine environment. Google App Engine (GAE) provides a pretty sweet yet simple platform for executing Python applications on Google’s infrastructure. The Boss Mashup Framework (BMF) provides Python APIs for accessing Yahoo’s Search APIs as well as remixing data a la SQL constructs. Running BMF on top of GAE is a seemingly natural progression, and quite arguably the easiest way to deploy Boss, so I spent today porting BMF to the GAE platform.

Here’s the full BMF-GAE integrated project source download.

There’s a README file included. Just unzip, put your appids in the config files, and you’re done. No setup or dependencies (easier than installing BMF standalone!). It’s a complete GAE project directory, which includes a directory called yos that holds all the ported BMF code. I also made a number of improvements to the BMF code (SQL ‘where’ support, stopwords, yql.db refactoring, util & templates in yos namespace, yos.crawl.rest refactored & optimized, etc.).

The next natural thing to do is to develop a test application on top of this united framework. In the original BMF package, there’s an examples directory. In particular, ex6.py was able to answer some ‘when’ style questions. I simply wrapped that code as a function and referenced it as a GAE handler in main.py.
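To give a rough idea of the "wrap it as a function, hang it off a handler" pattern, here is a minimal sketch written against the plain WSGI calling convention rather than GAE's own handler classes, so it is not the code from main.py. The name answer_when is a hypothetical stand-in for the wrapped ex6.py logic, and the q query parameter is an assumption.

```python
from urllib.parse import parse_qs


def answer_when(question):
    # Hypothetical stand-in for the wrapped ex6.py logic; the real function
    # would query Boss and pick the most popular date.
    return "no answer found for: %s" % question


def application(environ, start_response):
    # Read the question from an assumed "q" query-string parameter.
    params = parse_qs(environ.get("QUERY_STRING", ""))
    question = params.get("q", [""])[0]

    # Delegate to the wrapped function and return its answer as plain text.
    body = answer_when(question).encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

The handler itself stays tiny; all the interesting work lives in the wrapped function, which is why porting an example script to a web service takes so little code.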

Here’s the ‘when’ q&a source code as a webpage (fewer than 25 lines).

The algorithm is quite easy – use the question as the search query and fetch 50 results via the Boss API. Count the dates that occur in the results’ abstracts, and simply return the most popular one.
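The counting step can be sketched in a few lines. This is not the ex6.py code: it assumes the abstracts have already been fetched and are passed in as plain strings, and the date pattern (a "Month D, YYYY" date or a bare four-digit year) is my own simplification. The names DATE_RE and most_popular_date are made up for illustration.

```python
import re
from collections import Counter

# Assumed date pattern: "Month D, YYYY" or a bare four-digit year.
MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")
DATE_RE = re.compile(r"\b(?:(?:%s) \d{1,2}, \d{4}|\d{4})\b" % MONTHS)


def most_popular_date(abstracts):
    """Return the date string seen most often across abstracts, or None."""
    counts = Counter()
    for abstract in abstracts:
        # findall returns every non-overlapping date match in the abstract.
        counts.update(DATE_RE.findall(abstract))
    if not counts:
        return None
    return counts.most_common(1)[0][0]
```

For example, passing in a handful of abstracts where "July 20, 1969" shows up more often than any other date would return that string. Note that a full date and a bare year are counted as different answers in this sketch.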

For fun, following a similar pattern to the ‘when’ code, I developed another handler to answer ‘who’ or ‘what’ or ‘where’ style questions (finding the most popular capitalized phrase).
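The same pattern for capitalized phrases might look roughly like this. Again a sketch under assumptions, not the bundled handler: abstracts arrive as plain strings, a "capitalized phrase" is a run of space-separated capitalized words, and phrases made up entirely of question words or a small stopword list are skipped so the answer does not just echo the question. PHRASE_RE, STOPWORDS and most_popular_phrase are hypothetical names.

```python
import re
from collections import Counter

# Assumed definition of a phrase: one or more capitalized words in a row.
PHRASE_RE = re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)*\b")

# Small illustrative stopword list (an assumption, not BMF's list).
STOPWORDS = {"the", "a", "an", "in", "on", "of", "and", "it", "he", "she"}


def most_popular_phrase(question, abstracts):
    """Return the capitalized phrase seen most often across abstracts, or None."""
    # Ignore stopwords and any word that already appears in the question.
    ignored = STOPWORDS | {w.lower() for w in re.findall(r"\w+", question)}
    counts = Counter()
    for abstract in abstracts:
        for phrase in PHRASE_RE.findall(abstract):
            if all(word.lower() in ignored for word in phrase.split()):
                continue
            counts[phrase] += 1
    return counts.most_common(1)[0][0] if counts else None
```

Sentence-initial words like "The" get capitalized too, which is the main reason for the stopword filter here.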

Here’s the complete example (just ~50 lines of code – bundled in project download):

Q&A Running Service Example

Keep in mind that this is just a quick proof of concept to hopefully showcase the power of BMF and the idea of Open Web Search.

If you’re interested in learning more about this Q&A system (or how to improve it), check out AskMSR – the original inspiration behind this example.

Also, shoutout to Sam for his very popular Yuil example, which is powered by BMF + GAE. The project download linked above is meant to make it easier for people to build these types of web services.

34 thoughts on “Yahoo Boss – Google App Engine Integrated”

  1. Why is this better than just using the Google search API for queries instead of BOSS? (Sorry, not that technical, trying to understand.)

  2. it’s a good question

    basically 5 things
    (1) unlimited queries
    (2) no restrictions on the use of the results (blend, re-rank, remove, put it on a map)
    (3) control attribution
    (4) google’s api is more of a display as is api
    (5) yahoo boss is going to open up more data via the same api’s

  3. Steven – Not yet. Right now the best way to do that is through a BOSS web search using site:answers.yahoo.com

    Also check out the sites parameter in the API.

  4. Regarding the 5 points you wrote that BOSS has which Google doesn’t, as far as I know those are not true. There is an API which returns nicely formatted JSON strings and has no limits. You don’t even need a key to use it (although they recommend it for debugging purposes). I did not see any restrictions regarding what you can do with these results. You can use it for things other than display, since it’s JSON.

    For example:
    http://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=BOSS

    It’s described more here: http://code.google.com/apis/ajaxsearch/documentation/

    Don’t let the title “AJAX Search API” fool you, that’s only half of it. 🙂

  5. yea that’s a great api offering.
    some of my points were selling points and maybe not differentiators

    however, if you read the terms of the google ajax search api, there are noticeable differences (mainly how you can use the results)

    1.3 Appropriate Conduct and Prohibited Uses. The Service may be used only for services that are accessible to your end users without charge.

    You agree that you will not, and you will not permit your users or other third parties to: (a) modify or replace the text, images, or other content of the Google Search Results, including by (i) changing the order in which the Google Search Results appear, (ii) intermixing Search Results from sources other than Google, or (iii) intermixing other content such that it appears to be part of the Google Search Results; or (b) modify, replace or otherwise disable the functioning of links to Google or third party websites provided in the Google Search Results.

    with boss we are encouraging users to blend and re-rank results and in fact providing tools to do so (mashup framework)

    don’t let “json api” fool you 🙂

  6. Is there any sandbox account? I won’t be hosting my application until it is ready, so in this scenario how would I get an app id?
