My friend at Thomasnet took these screenshots when the Wolfram|Alpha site was briefly up, and he was able to download the toolbar. The site was up for about 10 minutes, and not even basic questions like 2+2 worked, so the server was probably not configured yet and the launch must have been unintentional.
Yesterday I attended a demonstration of Stephen Wolfram’s “Wolfram|Alpha,” the new “answer search engine.” The software has a lot of potential and builds on the millions of lines of code already in Wolfram’s popular Mathematica software. Like any major undertaking, it has many hurdles to overcome. It seemed that much more money would be needed to keep paying the team of approximately 100 PhDs, and there is apparently a large amount of work ahead to reach the end goal of providing “all the world’s computable knowledge.” The means to accomplish this are the full-time staff of PhDs, along with outside experts brought in as consultants. Wolfram said they were also open to a structured, strictly reviewed process for outside submission of information. Even this initial stage of gathering information seems very time-consuming and labor-intensive, but from what I saw of the technology, it has the potential to pay off big. The sources of the data can be viewed at the bottom of the page; this human-gathered information, combined with their algorithm, determines what is presented.
The three steps for getting the data to the end user’s screen are as follows:
1. Gather and curate sources/data, and find red flags or anomalies in the data; these could be either problems or telling information.
2. Put this information into the data flow or algorithm.
3. Convert the result to natural language and decide how to present it on the user’s screen; this is done using heuristics (best guesses) and other methods.
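To make the three steps a little more concrete, here is a minimal Python sketch of what such a pipeline could look like. None of this comes from the demonstration: the records, the function names, and the median-based anomaly check are all my own assumptions, and the numbers are made up for illustration.

```python
from statistics import median

# Hypothetical curated records: (source, city, population). Values are made up.
RECORDS = [
    ("source_a", "Springfield", 116_000),
    ("source_b", "Springfield", 117_500),
    ("source_c", "Springfield", 1_160_000),  # looks like a typo in the source
]

def flag_anomalies(records, tolerance=0.5):
    """Step 1: split records into (kept, flagged) by relative distance from the median."""
    mid = median(value for _, _, value in records)
    kept, flagged = [], []
    for record in records:
        deviation = abs(record[2] - mid) / mid
        (flagged if deviation > tolerance else kept).append(record)
    return kept, flagged

def compute_answer(kept):
    """Step 2: feed the curated data into the computation (here, a plain average)."""
    values = [value for _, _, value in kept]
    return sum(values) / len(values)

def present(city, answer, kept):
    """Step 3: turn the result into a sentence and list the sources used."""
    sources = ", ".join(source for source, _, _ in kept)
    return f"{city} has a population of about {round(answer):,} (sources: {sources})."

kept, flagged = flag_anomalies(RECORDS)
print(present("Springfield", compute_answer(kept), kept))
print("Flagged for human review:", flagged)
```

The flagged record is set aside rather than deleted, which matches the point that an anomaly could be either a problem or telling information that a human curator needs to look at.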
The biggest problem he talked about overcoming was understanding human syntax, meaning converting natural language into something the software can understand. The information is there; how a human can access it using natural language is the challenge. This seems to be the major problem with search: getting a computer to think like a human, which borders on AI. It gets more difficult as the questions get more complex, with longer chains of information/parameters/concatenation, like “What is the title of Marlon Brando’s first movie?” This question is three levels deep and would be difficult yet, according to Wolfram, possible to answer. Wolfram|Alpha was not presented as a Google killer, as it serves a different purpose. Search engines look for other people’s answers that are out there somewhere on the internet, in a blog or HTML page. Wolfram|Alpha computes its own answers from information in its database. This information is proprietary, and its use comes with its own ToS, which was described as being fair, though we did not get to view it. The screenshots were kept purposely hard to read; the text was very small, and one could only get a general understanding of what the layout looked like.
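The “three levels deep” Marlon Brando question mentioned above can be pictured as a chain of lookups: person to movies, movies to the earliest one, and that movie to its title. The Python sketch below is only my illustration of that chaining idea. The toy knowledge base and the function name are hypothetical and say nothing about how Wolfram|Alpha actually stores or resolves its data.

```python
# Toy knowledge base; the structure and field names are hypothetical.
KNOWLEDGE = {
    "Marlon Brando": {
        "movies": [
            {"title": "A Streetcar Named Desire", "year": 1951},
            {"title": "The Men", "year": 1950},
            {"title": "On the Waterfront", "year": 1954},
        ]
    }
}

def first_movie_title(person, kb=KNOWLEDGE):
    """Resolve "title of <person>'s first movie" as three chained lookups."""
    movies = kb[person]["movies"]                        # level 1: person -> movies
    earliest = min(movies, key=lambda m: m["year"])      # level 2: "first" -> earliest year
    return earliest["title"]                             # level 3: movie -> title

print(first_movie_title("Marlon Brando"))
```

The lookups themselves are trivial once the question has been broken into these links. The hard part, as Wolfram described it, is getting from the natural-language sentence to the chain in the first place.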
Here is a rundown of questions that were asked before our eyes. Most could be answered, but we saw some errors unfold as well, or as Wolfram said, “Good, we found a bug; that’s why this isn’t released yet.” Answers covered a broad range of subjects, mostly in the form of statistics. These statistics may or may not have been exactly what one was searching for. Because the text was small, we only knew what was being displayed from what Wolfram told us was there, though it was possible to make out basic charts and layouts.
“Serum htl” – returned general medical information.
“Medical tests serum html” – returned a breakdown of how much serum is needed for different body types, ages and other factors.
“Male serum html” – the system could not answer this one.
“Springfield” – figured out the closest Springfield to your geo IP and gave stats for that town.
“Weather in Springfield in 1996” – this worked, or at least returned something; like everything else, it was illegible.
“Microsoft Vs. Sun” – compared the two businesses; it understood that Sun referred to a company.
“Blue and yellow” – the answer was green.
“C#” – gave information on musical scales and details surrounding C# in music theory.
“Caffeine” – showed chemical properties such as the molecular weight, as well as other statistics.
“Hurricane Andrew Hurricane Katrina” – this question did not work at first; he had to adjust his wording, which again raises the question of how users will learn to phrase queries for the search engine.
“Orange juice” – gave a nutritional breakdown and some other facts.
“Weather in Princeton NJ when Kurt Gödel died” – maybe a loaded question, maybe the most impressive example, probably a bit of both. This gave an answer; whether it was right or not we could not see. The ability of the system to understand the multiple chains is impressive.
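As an aside on the “Springfield” example in the list above: picking the closest of several same-named towns is easy to sketch once the user's location is known. The Python below is my own guess at the idea, not anything shown in the demo. It uses the standard haversine great-circle formula, the coordinates are approximate, and it assumes the geo IP lookup has already produced a latitude and longitude.

```python
from math import asin, cos, radians, sin, sqrt

# Approximate (latitude, longitude) for three of the many Springfields.
SPRINGFIELDS = {
    "Springfield, IL": (39.78, -89.65),
    "Springfield, MA": (42.10, -72.59),
    "Springfield, MO": (37.21, -93.29),
}

def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(h))  # 6371 km is the mean Earth radius

def closest_springfield(user_location):
    """Return the name of the Springfield nearest to the user's (lat, lon)."""
    return min(SPRINGFIELDS, key=lambda name: haversine_km(user_location, SPRINGFIELDS[name]))

print(closest_springfield((42.36, -71.06)))  # a user located near Boston
```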
Other questions that were answered with a page of statistics:
“Distance to Moscow”
“Units for measurement”
Mathematical and physics questions
Job info – similar to Department of Labor statistics