Category Archives: UI

TweetNews (Real-Time Search) Is Back

Update: Twitter’s Search API seems to time out quite a bit, so many search results don’t get any tweets linked. Try again later or refer to the screenshots below. Also, delicious.com is now testing an early version of this model for its homepage ranking.

Here it is: tweetnews.appspot.com

And an example query: yahoo

About six months ago I released a simple 100-line search application called TweetNews, which links tweets to the freshest Yahoo! News articles. The more related tweets an article has, the higher its rank. The tweet count and messages are presented underneath each result so that a user can read the social commentary inline with the article listing. It was developed mainly to demonstrate the openness and power of Yahoo! BOSS (you can read more about it in my previous posts here and here). Remarkably, many users found the service useful despite its slow performance, barebones UI, and lack of a homepage, a domain, you name it.
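To make the ranking idea concrete, here is a minimal sketch in Python. It is not the actual TweetNews code: the `Article` class, the word-overlap matching rule, and the `min_overlap` threshold are all assumptions made for illustration. It only shows the core idea that an article's rank is driven by how many related tweets are linked to it.

```python
from dataclasses import dataclass, field


@dataclass
class Article:
    # Hypothetical container for a news result and its linked tweets.
    title: str
    tweets: list[str] = field(default_factory=list)


def words(text):
    """Lowercased words of a text with surrounding punctuation stripped."""
    return {w.strip(".,!?:;\"'()").lower() for w in text.split()} - {""}


def link_tweets(articles, tweets, min_overlap=2):
    """Attach a tweet to an article when they share at least min_overlap words.

    The overlap rule is an assumed stand-in for whatever relatedness
    measure the real application uses.
    """
    for article in articles:
        title_words = words(article.title)
        article.tweets = [
            t for t in tweets if len(title_words & words(t)) >= min_overlap
        ]
    return articles


def rank(articles):
    """More linked tweets means a higher position in the result list."""
    return sorted(articles, key=lambda a: len(a.tweets), reverse=True)
```

A usage example would call `link_tweets` on a list of articles and a list of tweet texts, then display `rank(articles)` with each article's tweet count and messages underneath it.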

Interestingly, the TweetNews concept has been popping up in my recent discussions around real-time search, so I felt it was about time to polish up TweetNews to serve as a better proof of concept.

Here are some of the new features:

  • Sweet UI (kudos to Kara McCain & Aaron Wheeler for the awesome design and template)
  • Continually Updated, Fresh Homepage (aggregates & ranks feeds like Techmeme, Delicious, Digg)
  • Faster Performance
  • Improved Algorithm
  • Local Views (re-rank & link tweets from a select region)


Here’s a screenshot of the homepage:

TweetNews Homepage


And here’s an example of Local Views:

London’s View of ‘iphone’

TweetNews IPhone (London Ranking)

Los Angeles’ View of ‘iphone’

TweetNews IPhone (Los Angeles Ranking)

Striking difference between Americans (actually just SoCal) and the British right there 🙂

I think the Local Views concept is pretty promising, although there’s plenty of room for improvement (use BOSS region filters, access Twitter’s Firehose Feed for more granularity, etc.).
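As a rough illustration of how a Local View could be computed, the sketch below re-ranks articles by counting only the linked tweets that come from one region. The `(article_title, tweet_location)` pair format and the exact-match location test are assumptions for this example, not a description of how TweetNews or the Twitter API represent location.

```python
from collections import Counter


def local_ranking(links, region):
    """Rank article titles by the number of linked tweets from one region.

    links is an iterable of (article_title, tweet_location) pairs, one pair
    per linked tweet. Both the pair format and the plain string comparison
    on location are hypothetical simplifications.
    """
    counts = Counter(title for title, location in links if location == region)
    return [title for title, _ in counts.most_common()]
```

Running the same set of links through `local_ranking` with two different regions can produce two different orderings, which is the effect the London and Los Angeles screenshots show.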

Which is why, like I did with the last version, I plan to open source all the code powering this application (I just need a little more time to get it reviewed).

Interestingly, the homepage system in this package is very general. Just pass it any list of RSS feeds and it’ll do the clustering, tweet linking, ranking, and page generation automatically every X minutes for you. Anyone want a fresh, personalized Techmeme? Let me know if that sounds interesting.
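The clustering step could look something like the following sketch, which greedily groups feed headlines that share most of their words. This is an assumed approach for illustration only: the Jaccard similarity measure, the `threshold` value, and the function names are not taken from the actual package, and fetching the RSS feeds is left out.

```python
def tokens(title):
    """Lowercased words of a headline with surrounding punctuation stripped."""
    return frozenset(w.strip(".,!?:;\"'()").lower() for w in title.split()) - {""}


def jaccard(a, b):
    """Size of the intersection divided by size of the union of two word sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_titles(titles, threshold=0.5):
    """Greedy single-pass clustering of headlines.

    Each headline joins the first cluster whose first (representative)
    headline is similar enough, otherwise it starts a new cluster.
    The threshold is an arbitrary illustrative value.
    """
    clusters = []
    for title in titles:
        for cluster in clusters:
            if jaccard(tokens(title), tokens(cluster[0])) >= threshold:
                cluster.append(title)
                break
        else:
            clusters.append([title])
    return clusters
```

Each resulting cluster could then be treated as one story, with tweets linked to it and the clusters ranked by tweet count before the page is generated.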

Please keep in mind that this is still a simple, early prototype to show how one can use BOSS to experiment with very interesting data sources like Twitter to tackle big problems like real-time search.



Filed under Blog Stuff, Boss, Code, Information Retrieval, Non-Technical-Read, Open, Research, Search, Social, Techmeme, Twitter, UI, Yahoo

How Google is putting us back into the Stone Age

Yeah, I know – what a linkbait title. If that’s what it takes these days to get visitors and diggs then so be it. Also, just to forewarn, as you read this you might find that a better title choice for this post would have been “How Web 2.0 is putting us back into the Stone Age” since many of these thoughts generalize to Web 2.0 companies as a whole. I used Google in the title mainly because they are the big daddy in the web world, the model many web 2.0 companies strive to be like, the one to beat. Plus, the title just looks and sounds cooler with ‘Google’ in it.

Here’s the main problem I have with web applications coming from companies like Google: About 2 years ago I bought a pretty good box – which is fairly standard and cheap these days – 2 gigs of RAM, dual core AMD-64 3400+’s, a 250 gig hard drive, nVidia 6600 GT PCI Express, etc. It’s a beast. However, because I don’t play games, its potential isn’t being utilized – not even close. Most of the applications I use are web-based, mainly because the web provides a medium which is cross-platform (all machines have a web browser) and synchronized (since the data is stored server side, I can access it from anywhere, like the library, a friend’s computer, or my laptop), and it keeps my machine pretty light (no need to install anything, waste disk space, or risk security issues). The web UI experience for the most part isn’t too bad either – in fact, I find that the browser’s restrictions force many UI’s to be far simpler and easier to use. To me, the benefits mentioned above clearly compensate for any UI deficiencies. Unfortunately, this doesn’t mean that Web 2.0 is innovating the user’s experience. Visualizing data – search results, semantic networks, social networks, Excel spreadsheets – is still very primitive, and a lot can be done to improve this experience by taking advantage of the user’s hardware.

My machine, and most likely yours, is very powerful and underutilized. For instance, my graphics card has tons of cores. We live in an age where GPU’s like mine can sort terabytes of data faster than a top-of-the-line Xeon-based workstation (refer to Jim Gray’s GPUTeraSort paper). For sorting, which is typically the bottleneck in database query plans and MapReduce jobs, it’s all about I/O – or in this case, how fast you can swap memory (for example, a 2-pass bitonic radix sort iteratively swaps the lows and the highs). Say you call memcpy in your C program on a $6,000 Xeon machine. The memory bandwidth is about 4 GB/s. Do the equivalent on a $200 graphics co-processor and you get about 50 GB/s, more than 12 times as much. Holy smokes! I know I’m getting off-topic here, but why is it so much faster on a GPU? Well, in the CPU world, memory access can be quite slow. You have these almost random jumps in memory, which can result in expensive TLB/cache misses, page faults, etc. You also have context switching for multi-processing. Lots of overhead going on here. Now compare this with a GPU, which has the memory stream almost directly to tons of cores. The cores on a GPU are fairly cheap, dumb processing units in comparison to the cores found in a CPU. But the GPU uses hundreds of these cores, in parallel, to drastically speed up the overall processing. This, coupled with its specialized memory architecture, results in amazing memory bandwidth. Also, interestingly, since these cores are cheap (bad), there’s a lot of room for improvement. At the current rate, GPU advancements are occurring 3-4x faster than Moore’s law for CPU’s. Additionally, the graphical experience is near real-life quality. Current API’s enable developers to draw 3D triangles directly off the video card! This is some amazing hardware, folks. GPU’s, and generally this whole notion of co-processing to optimize for operations that lag on CPU’s (memory bandwidth, I/O), promise to make future computers faster than ever.
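To show why this style of sorting suits a GPU, here is a small sketch of a plain bitonic sorting network in Python. It is not the hybrid algorithm from the GPUTeraSort paper and it runs sequentially on the CPU; the function name and the power-of-two length restriction are choices made for this illustration. The point is that every compare-and-swap in the inner `for` loop touches a fixed pair of positions that does not depend on the data, so on a GPU all of those pairs could be processed in parallel by separate cores.

```python
def bitonic_sort(values):
    """Sort a list whose length is a power of two using a bitonic network.

    Returns a new list in ascending order. The sequence of positions that
    get compared is fixed in advance and independent of the values.
    """
    data = list(values)
    n = len(data)
    if n & (n - 1):
        raise ValueError("length must be a power of two")

    k = 2  # size of the bitonic sequences being merged in this stage
    while k <= n:
        j = k // 2  # distance between the two elements of each pair
        while j > 0:
            # On a GPU, every iteration of this loop could run in parallel.
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    # Swap so the low value ends up first in ascending
                    # blocks and the high value first in descending blocks.
                    if (data[i] > data[partner]) == ascending:
                        data[i], data[partner] = data[partner], data[i]
            j //= 2
        k *= 2
    return data
```

For example, `bitonic_sort([7, 3, 6, 1, 8, 2, 5, 4])` returns the numbers 1 through 8 in order. In pure Python this is slower than the built-in `sorted`; the regular access pattern only pays off on hardware that can run the swaps side by side.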

OK, so the basic story here is our computers are really powerful machines. The web world doesn’t take advantage of this, and considering how much time we spend there, it’s an unfortunate waste of computing potential. Because of this, I feel we are losing an appreciation for our computer’s capabilities. For example, when my friend first started using Gmail, he was non-stop clicking on the ‘Invite a friend’ drop-down. He couldn’t believe how the page could change without a browser refresh. Although this is quite an extreme example, I’ve seen this same phenomenon with many users on other websites. IMHO, this is completely pathetic, especially when considering how powerful client-end applications can be in comparison.

Again, I’m not against web-based applications. I love Gmail, Google Maps, Reader, etc. However, there are applications which I do not think should be web-based. An example of this is YouOS, which is an OS accessible through the web-browser. I mean, there’s some potential here, but the way it’s currently implemented is very limiting and unnecessary.

To me, people are developing web services with the mindset ‘can it hurt?’, when I think a better mantra is ‘will it advance computing and communication?’. Here’s the big web 2.0 problem: Just because you can make something web 2.0’ish, doesn’t mean you should. I think of this along the lines of Turing completeness, which is a notion in computer science for determining whether a system can express any computation. Basically, as long as you can process an input, store state, and return an output (i.e. a potentially stateful function), you can do any computation. Now web pages provide an input form, perform calculations server side, and can generate output pages – enough to do anything according to this paradigm, but with extreme limitations on visualization and performance (like with games). AJAX makes web views richer, but it is a terribly hacked-up programming model, and for some reason it also compels developers to convert previously successful client-end applications into web-based services. Sometimes this makes sense from an end-user perspective, but it often results in dumbing down the user experience.

We have amazing hardware that’s not being leveraged in web-based services. Browsers provide an emulation for a real application. However, given the proliferation of AJAX web 2.0 services, we’re starting to see applications only appear in the browser and not on the client. I think this current architecture view is unfortunate, because what I see in a browser is typically static content – something I could capture the essence of with a camera shot. In some sense, Web 2.0 is a surreal hack on what the real online experience should be.

I feel we really deserve truly rich applications that deliver ‘Minority Report’ style interfaces that utilize the client’s hardware. Movies from before the 1970’s predicted a much richer user experience than what we have today. It’s up to us, the end-consumer, to encourage innovation in this space. It’s up to us, the developer, to build killer applications that require tapping into a computer’s powerful hardware. The more we hype up web 2.0 and dumbed-down webpage experiences, the more website-based services we get – and consequently, less innovation in hardware-driven UI’s.

But there’s hope. I think there exists a fair compromise between client-end applications and server-side web services. The internet is getting faster, and the browser and Flash are getting fine-tuned to make better use of a computer’s resources. Soon, the internet will be well-suited for thin-client computing. A great example of this already exists today, and I’m sure many of you have used it: Google Earth. It’s a client-end application – taking advantage of the computer’s graphics and processing power to make the user feel like he/she is traveling in and out of space – while also being a server-side service, since it gathers updated geographical data from the web. The only problem is there’s no cross-platform, preexisting layer for building applications like this. How do we make these services without forcing the user through an intrusive, slow installation? How do we make them run on different platforms? Personally, I think Microsoft completely missed the boat here with .NET. If MS had recognized the web phenomenon early on, they could have built this layer into Vista to encourage developers to build these rich thin-client applications, while also promoting Vista. I have no reason to change my OS – this could have been my reason! Even if the layer were cross-platform, better performance on Vista would still be a reason to prefer it (providing some business case). Instead, they treated .NET as a Java-like replacement for MFC, thereby forcing developers to resort to building their cross-platform, no-installation-required services through AJAX and Flash.

Now, even if this layer existed, which would enable developers to build and instantly deploy Google Earth style applications in a cross-platform manner, there would be security concerns. I mean, one could make the case that ActiveX attempted to do this – allowing developers to run arbitrary code on the client’s machines. Unfortunately, this led to numerous viruses. Security violations and spyware scare(d) all of us – so much so that we now do traditionally client-end functions through a dumbed-down web browser interface. But I think we have made some serious inroads in security since then. The fact that we even take security into account in current development leaves us well prepared to support such a platform. I am confident that the potential security issues can be tackled.

To make a final point, I think we all really need higher expectations on the user experience front. We need to develop killer applications that push the limits of our hardware – to promote innovation and progress. We’re currently at a standstill in my opinion. This isn’t how the internet should be. This is not how I envisioned the future 5 years ago. We can do better. We can build richer applications. But to do this, we as consumers must demand it in order for companies to have a business case to pursue it further. We need developers to come up with innovative, hardware-driven ways of visualizing the large amounts of data being generated – thereby delivering long-awaited killer applications for our idle computers. Let’s take our futuristic dreams and finally translate them into our present reality.


Filed under Computer Science, Databases, Google, Hardware, UI, Web2.0