Updated: I see blogs doing evaluations of the Q&A engine. I have to admit, that wasn’t my focus here. The service is merely 50 lines of code … just to demonstrate the integration of BMF and GAE.
Updated: Direct link to the example Question-Answering Service
Today I finally plugged the Yahoo Boss Mashup Framework into the Google App Engine environment. Google App Engine (GAE) provides a pretty sweet yet simple platform for executing Python applications on Google’s infrastructure. The Boss Mashup Framework (BMF) provides Python APIs for accessing Yahoo’s Search APIs as well as remixing data a la SQL constructs. Running BMF on top of GAE is a seemingly natural progression, and quite arguably the easiest way to deploy Boss, so I spent today porting BMF to the GAE platform.
Here’s the full BMF-GAE integrated project source download.
There’s a README file included. Just unzip, put your appids in the config files, and you’re done. No setup or dependencies (easier than installing BMF standalone!). It’s a complete GAE project directory, which includes a directory called yos that holds all the ported BMF code. I also made a number of improvements to the BMF code (SQL ‘where’ support, stopwords, yql.db refactoring, util & templates in yos namespace, yos.crawl.rest refactored & optimized, etc.).
The next natural thing to do is to develop a test application on top of this united framework. In the original BMF package, there’s an examples directory. In particular, ex6.py was able to answer some ‘when’ style questions. I simply wrapped that code as a function and referenced it as a GAE handler in main.py.
Here’s the ‘when’ q&a source code as a webpage (less than 25 lines).
The algorithm is quite easy – use the question as the search query and fetch 50 results via the Boss API. Count the dates that occur in the results’ abstracts, and simply return the most popular one.
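As a rough illustration of the counting step, here is a minimal sketch. It is not the ex6.py code from the project: it assumes the abstracts have already been fetched and are passed in as plain strings, and it simplifies "dates" to four-digit years. The names `YEAR` and `most_common_date` are made up for this example.

```python
import re
from collections import Counter

# Assumption: a "date" is simplified to a four-digit year (1000-2099).
YEAR = re.compile(r"\b(1[0-9]{3}|20[0-9]{2})\b")


def most_common_date(abstracts):
    """Return the year mentioned most often across the abstracts, or None."""
    counts = Counter()
    for abstract in abstracts:
        # findall returns every year string found in this abstract
        counts.update(YEAR.findall(abstract))
    if not counts:
        return None
    # most_common(1) yields a list holding one (value, count) pair
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical abstracts standing in for real search results
    sample = [
        "The Berlin Wall fell in 1989.",
        "Built in 1961, the wall came down in 1989.",
        "In 1989 the border was opened.",
    ]
    print(most_common_date(sample))
```

In the real handler the list of abstracts would come from the 50 Boss results for the question; the voting logic stays the same.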
For fun, following a similar pattern to the ‘when’ code, I developed another handler to answer ‘who’ or ‘what’ or ‘where’ style questions (finding the most popular capitalized phrase).
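The same voting idea can be sketched for capitalized phrases. Again, this is only an approximation under stated assumptions, not the handler bundled in the download: the abstracts are plain strings, a "phrase" is a run of consecutive capitalized words, and the tiny `STOPWORDS` set and the function name `most_common_phrase` are placeholders invented for this example.

```python
import re
from collections import Counter

# A run of one or more consecutive capitalized words, e.g. "Alexander Graham Bell"
CAPITALIZED = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*\b")

# Assumption: a tiny illustrative stopword set for sentence-initial words
STOPWORDS = {"The", "An", "In", "It", "He", "She", "This", "That"}


def most_common_phrase(abstracts):
    """Return the capitalized phrase seen most often, or None."""
    counts = Counter()
    for abstract in abstracts:
        for phrase in CAPITALIZED.findall(abstract):
            words = phrase.split()
            # Drop leading stopwords such as a sentence-initial "The"
            while words and words[0] in STOPWORDS:
                words.pop(0)
            if words:
                counts[" ".join(words)] += 1
    return counts.most_common(1)[0][0] if counts else None


if __name__ == "__main__":
    # Hypothetical abstracts standing in for real search results
    sample = [
        "Alexander Graham Bell invented the telephone.",
        "The telephone was patented by Alexander Graham Bell in 1876.",
        "Many credit Alexander Graham Bell, though Elisha Gray also filed.",
    ]
    print(most_common_phrase(sample))
```

A popularity vote like this is easy to fool, which is part of why the answers can be amusing.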
Here’s the complete example (just ~50 lines of code – bundled in project download):
Keep in mind that this is just a quick proof of concept to hopefully showcase the power of BMF and the idea of Open Web Search.
If you’re interested in learning more about this Q&A system (or how to improve it), check out AskMSR – the original inspiration behind this example.
Also, shoutout to Sam for his very popular Yuil example, which is powered by BMF + GAE. The project download linked above aims to make it easier for people to build these types of web services.
I can’t help but comment on this.
who invented the internet ? returns
al gore
🙂
haha.
al gore seems to be behind a lot of things …
who invented global warming ? returns
al gore
hahaha
cool }:)
who is president?
Why is this better than just using the Google search API for queries instead of BOSS? (Sorry, not that technical, trying to understand.)
it’s a good question
basically 5 things
(1) unlimited queries
(2) no restrictions on the use of the results (blend, re-rank, remove, put it on a map)
(3) control attribution
(4) google’s api is more of a display-as-is api
(5) yahoo boss is going to open up more data via the same api’s
Question: Does BOSS presently support querying Yahoo’s Ask and Answer content library found at http://answers.yahoo.com? If so, can requests and responses be limited to this data set?
Thank you
Steven – Not yet. Right now the best way to do that is through a BOSS web search using site:answers.yahoo.com
Also check out the sites parameter in the API.
Regarding the 5 points you wrote that BOSS has and Google doesn’t: as far as I know, those are not true. There is an API which returns nicely formatted JSON strings and has no limits. You don’t even need a key to use it (although they recommend it for debugging purposes). I did not see any restrictions regarding what you can do with these results. You can use it for things other than display, since it’s JSON.
For example:
http://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=BOSS
It’s described more here: http://code.google.com/apis/ajaxsearch/documentation/
Don’t let the title “AJAX Search API” fool you, that’s only half of it. 🙂
yea that’s a great api offering.
some of my points were selling points and maybe not differentiators
however, if you read the terms of the google ajax search api, there are noticeable differences (mainly how you can use the results)
1.3 Appropriate Conduct and Prohibited Uses. The Service may be used only for services that are accessible to your end users without charge.
You agree that you will not, and you will not permit your users or other third parties to: (a) modify or replace the text, images, or other content of the Google Search Results, including by (i) changing the order in which the Google Search Results appear, (ii) intermixing Search Results from sources other than Google, or (iii) intermixing other content such that it appears to be part of the Google Search Results; or (b) modify, replace or otherwise disable the functioning of links to Google or third party websites provided in the Google Search Results.
with boss we are encouraging users to blend and re-rank results and in fact providing tools to do so (mashup framework)
don’t let “json api” fool you 🙂
i had 2 apples and i ate 1, how many do i have .. Answer: 3.
Hello, it’s an amazing application, but a few small things could still be improved.
A little htmlentities-style escaping of the query would help, as you can see:
http://bossy.appspot.com/qa?query=%22%3E+%3Cscript%3Ealert(%22Pas+les+meres+pas+les+cartables%22)%3C%2Fscript%3E+%3Ciframe+src%3D%22http%3A%2F%2Fwww.google.com%22+%2F%3E%3Cimg+src%3D%22http%3A%2F%2Fboortz.com%2Fimages%2Ffunny%2Ffark_chuck_norris_dog.jpg%2F%3E
nice catch. yep, didn’t do any query text normalization. when i get a chance will try to fix it.
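For anyone hitting the same issue, one common fix is to escape the query text before echoing it back into the page. The sketch below is an assumption about how that could look, not the fix applied to the service: it uses Python 3’s standard `html.escape`, and `render_query` and the form field are hypothetical.

```python
import html


def render_query(query):
    """Echo the user's query into an HTML input without letting markup through."""
    # quote=True also escapes double and single quotes, which matters
    # because the value lands inside an HTML attribute
    safe = html.escape(query, quote=True)
    return '<input type="text" name="query" value="{}">'.format(safe)


if __name__ == "__main__":
    # A query shaped like the one in the link above
    print(render_query('"> <script>alert("hi")</script>'))
```

With this in place the script tag is shown as text instead of being executed by the browser.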
Really cool idea. Just wanted to give you a shout and let you know that I implemented it in PHP http://shout.setfive.com/2008/08/13/yahoo-boss-is-sahweet/
Nice post .. Thanks for contributing to internet community
If you’re looking for a quick introduction to the Google App Engine check out http://www.squidoo.com/Google-App-Engine
I guess everyone got the answer today for why to use Yahoo Boss with GAE. Thanks Vik.
Is there any sandbox account? I would not be hosting my application until it is ready, so in this scenario how would I get an app id?