search to find

News.com picked up on some of the new work that’s going on in the academic search / information management space in their article Academia’s quest for the ultimate search tool, citing Berkeley’s new interdisciplinary department focus on search technology, CMU’s Javelin work focused on Question Answering search technology, and MIT’s Simile project.

I particularly like the succinct point MacKenzie makes regarding the benefits of the semantic web architecture that Simile has developed:


A generalized data archive lets you make data work together in ways you couldn’t before

MIT’s START system, based on formalizing / mining metadata composed of natural language phrases and sentences, is I think another one that’s worth mentioning in this space. Opening up an RDF interface to this data would, I think, be particularly interesting.

One search engine can’t do everything. Different search engines / strategies will be more effective at addressing different tasks. Being able to expose the data behind these services and give individuals / organizations the means to tie this data together will be key.

on connecting things…

(reconstructed from wayback)

Talks over on the Simile list have moved into the realm of bibliographic citations and the best way of describing people. FRBR has been mentioned, as well as IFLA’s FRAR work for authority records, in this context. I’m particularly encouraged by the more recent work of Ian Davis and Richard Newman in this area in grounding FRBR in RDF.

I very much respect the FRBR work and I believe the instantiation of FRBR in RDF is an important step for weaving libraries into the Web and letting folks outside of the library community know that libraries still know a thing or two regarding the modeling and management of information :). I’d very much like to see this work move forward and I’m interested in learning more about how to help.

From the perspective of project Simile (where this discussion in part is taking place), however, I’m slightly less interested in the “best” way of describing things (e.g. People) and more interested in how to start operationalizing the contextual linking of these things together. I believe there are some relatively simple steps that might be taken to achieve a very powerful network effect.

Here is an example …

HubMed has wrapped PubMed and provided (among many things) an RDF representation of the corresponding bibliographic data. This is an important step for “connecting things” in the biomedical and life sciences community. Here is an example of one of these records (HTML, RDF/XML).

By itself, the article in RDF form is not really helpful. That said, having it in RDF makes it easier to connect this with other data sets. To illustrate, I’ve added this RDF data to the Semantic Bank and used this tool to help connect interesting bits and pieces from several servers.

One of the first things you may notice looking at this record is that the authors are listed as (anonymous items). This is one of the reasons why I’m of the opinion that a “default value” that’s included by the data providers would be useful.
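To make the “default value” idea a bit more concrete, here is a minimal sketch in Python. It assumes a record is nothing more than a list of (subject, predicate, object) tuples. The identifiers, the choice of `rdfs:label` as the label predicate, and the `display_label` helper are all hypothetical and are not taken from HubMed or the Semantic Bank.

```python
# Hypothetical label predicate; a real provider might choose another one.
LABEL = "rdfs:label"

# Hypothetical triples: two author nodes, only one of which carries a label.
triples = [
    ("article:1", "dc:creator", "_:a1"),
    ("article:1", "dc:creator", "_:a2"),
    ("_:a1", LABEL, "A. Author"),
    ("_:a2", "foaf:family_name", "Writer"),
]

def display_label(node, triples, fallback="(anonymous item)"):
    """Return the provider-supplied label for a node, else the fallback."""
    for s, p, o in triples:
        if s == node and p == LABEL:
            return o
    return fallback

# Collect the article's authors and ask for something a person can read.
authors = [o for s, p, o in triples if s == "article:1" and p == "dc:creator"]
labels = [display_label(a, triples) for a in authors]
print(labels)
```

The second author has data attached but no label, so a generic browser has nothing sensible to show for it. If the provider supplies the label, no client has to guess.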

If you get past the debug-view of the interface, another thing you may notice (choose ‘Show Referers’) is the fact that this article is a “supporting Article” for an Observation and that there is another article that supports this Observation as well. Further, this Observation is one of several pieces of “supporting Evidence” (again choose ‘Show Referers’) associated with the Amyloid Hypothesis, which is related to Alzheimer’s Disease.

Some of this data comes from PubMed (articles), some comes from scientific communities (in the above case, the Amyloid Hypothesis is from Alzforum). Through the Semantic Web we can begin to see the various potentials of using a common framework to draw connections among various “things” of interest. In the specific case of the life sciences, I think this community is very close to not only connecting people to people, people to articles, articles to journals, etc. but articles to hypotheses, hypotheses to diseases, genes, proteins, etc. And ultimately connecting the dots between diseases and drugs.
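The chain described above (article, Observation, Hypothesis, disease) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the identifiers and predicate names are made up to mirror the example, and `referers` and `climb` are hypothetical helpers, not part of the Semantic Bank.

```python
# Hypothetical triples mirroring the example; "referers" of a node are the
# subjects whose statements point at it.
triples = [
    ("observation:1", "supportingArticle", "article:A"),
    ("observation:1", "supportingArticle", "article:B"),
    ("hypothesis:amyloid", "supportingEvidence", "observation:1"),
    ("hypothesis:amyloid", "relatedTo", "disease:alzheimers"),
]

def referers(node, triples):
    """Return (subject, predicate) pairs for statements pointing at node."""
    return [(s, p) for s, p, o in triples if o == node]

def climb(node, triples):
    """Follow referers transitively, breadth first, in the order visited."""
    path, frontier, seen = [], [node], {node}
    while frontier:
        current = frontier.pop(0)
        for s, _ in referers(current, triples):
            if s not in seen:
                seen.add(s)
                path.append(s)
                frontier.append(s)
    return path

# What you reach by repeatedly choosing 'Show Referers' from one article.
chain = climb("article:A", triples)

# The other article supporting the same Observation.
siblings = [o for s, p, o in triples
            if s == "observation:1" and p == "supportingArticle" and o != "article:A"]

print(chain, siblings)
```

Nothing here depends on where each triple came from. The article statements and the hypothesis statements could live on different servers, and the walk works the same once they sit in one place.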

There are many paths one may take to make this connection and the path for one may not be the same as one that works for another. Providing the ability for people to create new connections among data and share this with others is key. A community focused on a particular goal, task or interest coupled with a framework for representing, sharing and integrating data is a powerful combination.

Small but important steps will help facilitate this goal. On the technology side, more tools like Connotea, Simile, etc. are required. From the content side, however, common means of referencing ‘things’ that are real (people, places, articles, genes, proteins, etc.) and, from there, agreement on a common means for describing these resources (RDF) are still required. Common protocols and interfaces to this data will be needed as well. This is where technologies such as SPARQL will be increasingly critical. Folks over in Nature and HubMed seem to “get it” and are good examples of a growing awareness of the “interconnectedness of things”.

There continues to be a lot of focus on the “best” way of describing things. I don’t want this to stop. My hope is, however, that people will begin to place an equal if not greater value on the contextualization of the things they’re hoping to describe. As we weave a web of data, I believe how things connect will prove more valuable.

Interesting times in Education and Semantic Web

Bookmarking an interesting development reported in eweek

Graham Glass, founder of successful software companies, supporter of Web services and service-oriented architectures, and former chief technology officer at webMethods Inc., has announced his resignation from the business integration software vendor.


“After many years of working on enterprise software, I’ve decided to get back to my training roots and start a fourth and as-yet unnamed product company focused on improving the education system,” he said. “Although the product itself will be an easy-to-use Web-based application targeted at K-12 students, teachers and parents, the underlying software infrastructure will be quite complex and utilize many concepts from the semantic Web.”

Color me interested :)

I’ve been working more recently with my friend Joseph Hardin, who’s directing the Sakai project, on weaving more Semantic Web technologies into the higher education space. I think there is a lot of potential here, so I’m extremely glad to see Glass’s interest in this area. Looks like an exciting development to be sure!

on pigs and maps

I miss my maps.

Well, I miss google maps more specifically.

Even more specifically, I miss google maps on piggy-bank.

Piggy-bank is about making it easy to manage information you find interesting and share this with a group. Piggy Bank is an extension to the Firefox web browser that turns it into a “Semantic Web browser”, letting you make use of existing information on the Web in more useful and flexible ways. The information I find personally useful comes in many shapes / sizes (personal contacts, bookmarks, bibliographic citations, interesting news items, photos, events, personal email, etc.). While I’ve found lots of applications that help me manage each of these independently, the real benefit I’ve found is being able to see how individual bits of personal information connect. “Oh! I have a meeting with ‘Company Y’ at 5 pm today? Who have I talked with recently that is working with them?” From my perspective, once you start acquiring enough data, the value is in the relationships among the data rather than the data itself.

You might not know all of the relationships that exist between the information that you find personally interesting, but others may. Sharing these relationships with trusted colleagues creates webs of data greater than the sum of their individual parts.

Ok, back to maps.

One of the nice features of Piggy-Bank (and of the Simile project in general) is making it easy not only to mix and match different bits of data, but to mix and match different services as well. One of the examples we provide is integrating Google maps to provide a geographical overlay of the data one might collect.


As a thought experiment, I did this with a group of folks that have a serious problem with low-cost, high-quality, flea-powered tube based amplifiers and high efficiency speakers (ok, so it’s a relatively small group… but as a card-carrying member, this stuff sure makes my ears happy-happy!). And while I hope to eventually get around to connecting the dots as to why I think global initiatives like the Semantic Web and specific projects like Simile are so potentially revolutionary in helping communities share information and experiences and collaborate more effectively – the short version is I helped plot the folks that are in this group on a map. People didn’t realize how physically close they were to one another, and this little bit of effort is helpful in starting to pull together regional events where folks can geek together on a face-to-face basis.

And it was easy! Piggy-bank helped reduce the cost of collecting this information, merging it with other bits and, with minimal effort, plotting this on a map. Further, it allows you to overlay *multiple* bits of data onto a map. You want to overlay hotels *and* subway stops in Boston? No problem! Ok.. so there is a problem (you need to have this data transformed into a common representation that facilitates this stitching together of information). But the effort in doing this is far less than what it would have taken before. And if one person does manage to do the work of translating the data regarding hotels from ‘site X’ and another does subway stops from ‘site Y’, it’s easy for the next person to reuse these translators (along with the translated data!). This in part is what the ‘Semantic Bank’ is all about: being able to share this information in a way others can reuse it. Collectively, we’re all smarter than any one of us individually.
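Here is a rough sketch of that “common representation” step in Python. Everything in it is an assumption made for illustration: the two record shapes, the coordinates, the `Place` type and the translator functions are hypothetical, and this is not how Piggy-bank’s own translators are written.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Place:
    """Hypothetical common representation for anything we want on the map."""
    name: str
    kind: str
    lat: float
    lon: float

# Hypothetical raw records, each in its own site-specific shape.
site_x_hotels = [{"hotel": "Example Inn", "latitude": "42.35", "longitude": "-71.06"}]
site_y_stops = [("Example Station", (42.36, -71.07))]

def from_site_x(rec):
    """Translator for site X: dict with string coordinates."""
    return Place(rec["hotel"], "hotel", float(rec["latitude"]), float(rec["longitude"]))

def from_site_y(rec):
    """Translator for site Y: (name, (lat, lon)) tuple."""
    name, (lat, lon) = rec
    return Place(name, "subway", lat, lon)

def overlay(*layers):
    """Merge any number of translated layers into one list of map points."""
    return [place for layer in layers for place in layer]

points = overlay(map(from_site_x, site_x_hotels), map(from_site_y, site_y_stops))
for p in points:
    print(p.kind, p.name, p.lat, p.lon)
```

The translators are the reusable part. Once someone has written `from_site_x`, the next person only needs to call it, and `overlay` never has to know which site a point came from.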

Ok… so here I am, very excited. What’s next? For a long time I’ve been wanting to overlay airports and my travel history for the past 10 years, as I’ve often thought this would make for a cool map. So here I am getting global airport data into RDF (10 min worth of work) and then starting the process of merging this in with my historical travel data.

I go to visualize this and Google turns around and changes the API. This is to be expected – they’ve done this on a weekly basis for a while. It’s beta, and I more than understand beta code. Only now an additional requirement of using an application key based on a static IP address has been introduced. Err…. now we’re stuck. This shoots down the whole personal information meets google maps revolution. There is no static IP address for my client / browser. Various inquiries are underway with Google to see if there are other options we might consider, but until then I’m unfortunately mapless. :(

In the meantime, I sure do miss google maps on piggy-bank. If you want this too, please let us know over on the Simile list.

Oracle 10.2 and RDF

Several folks have been asking me for more information regarding Oracle 10.2 and RDF support. Googling for these terms turns up various talks of mine, but unfortunately not much else. I expect the following “Over 100 Partners Thoroughly Test and Support Oracle(R) Database 10g Release 2” news announcement to be of use to those looking for more information. Technical details are missing from this (see various white papers for these), but the 100 partners bit is indeed very impressive no matter how you look at it :)

trip to london

(reconstructed from wayback)

I’m in central London at the moment talking about the Semantic Web with various CIO/CTOs working in and/or around the Criminal Justice IT and eGov departments of the UK. During the conference we were notified of the bombings around central London. The conference has been cut short, in part because many of the folks I was talking to needed to focus on the real problem at hand. (That, and while it was clear a chain of events was unfolding, it was unclear what the targets were. In this particular case, having so many top UK officials in one single place was, I suspect, not viewed as a good idea.)

The situation was terrible, but the people responded in a fantastic manner. I’m impressed with how Londoners dealt with this tragedy – working with each other to help those that need it most. The UK Government officials (Fire, Police, Emergency, etc.) in particular reacted brilliantly in the face of a terrible series of events. People were clearly shocked, but unwilling to let this terrible situation keep them from getting on with their lives. My thoughts are with those dealing personally with this tragedy.

If there is a silver lining one could see from this event, it was witnessing the indomitable spirit of man rising above such a terrible tragedy. This is certainly not a trip I’ll soon forget.