Piggy Bank Scrapers

The following is a collection of Piggy Bank scrapers that I've written and that may be of interest to others. To load these scrapers, view the scraper metadata in Piggy Bank (or simply click on the 'data coin' icon at the bottom of your Firefox window) and follow the scraper installation instructions. If you don't see a 'data coin' icon at the bottom right of your Firefox window, chances are you don't have Piggy Bank installed. Go get it!

NOTE: These scrapers are intended for demonstration and educational purposes and should not be considered production-ready. They are subject to change and enhancement without notice. Comments, suggestions, and (better yet) submitted patches are welcome.


OCLC's Open WorldCat

OCLC's Open WorldCat program makes records of library-owned materials in OCLC's WorldCat database available to Web users. WorldCat currently has more than 1 billion holdings.

The WorldCat scraper extracts the relevant bibliographic information from a WorldCat record, along with the libraries near you (by ZIP code) that hold the resource, and lets you overlay this information on a map. Through Piggy Bank's ability to combine collections, you can plot multiple resources on a single map to help determine which library nearest you has all of the items you're looking for.
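
The "which nearby library has everything" question can also be sketched outside Piggy Bank. The following is only a rough illustration, not the scraper's own code: the function names, the dictionary layout, and the sample libraries and coordinates are all hypothetical. It intersects the holdings of each item and picks the closest library by great-circle distance.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def nearest_library_with_all(holdings, coordinates, home):
    """Return the closest library that holds every item, or None.

    holdings:    {item title: set of library names}   (hypothetical layout)
    coordinates: {library name: (lat, lon)}
    home:        (lat, lon) of the reader
    """
    if not holdings:
        return None
    # Libraries that appear in every item's holdings set.
    common = set.intersection(*holdings.values())
    if not common:
        return None
    return min(common, key=lambda name: haversine_km(home, coordinates[name]))


# Made-up data for illustration only; not real scraper output.
holdings = {
    "Book A": {"Central Library", "North Branch", "Campus Library"},
    "Book B": {"Central Library", "Campus Library"},
}
coordinates = {
    "Central Library": (40.00, -83.00),
    "North Branch": (40.10, -83.02),
    "Campus Library": (40.05, -83.01),
}
home = (40.06, -83.01)

print(nearest_library_with_all(holdings, coordinates, home))  # Campus Library
```

In Piggy Bank itself this comparison is done visually on the combined map; the sketch only makes the underlying set intersection explicit.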

After you load the scraper, here is a quick "how-to" to get started:

  1. Search for 'Da Vinci Code' via Google
  2. Choose a "Find in Library" item from the result set
  3. Click the Piggy Bank 'data coin' icon and follow the directions. You'll get back the book and the libraries that own it.
  4. Choose 'coordinates (map)' to display these libraries via Google Maps. Save the results to your Piggy Bank, or share them with others via a Semantic Bank.

map of libraries

NCBI's PubMed

NCBI PubMed is a service of the National Library of Medicine that includes over 15 million citations from MEDLINE and other life science journals for biomedical articles dating back to the 1950s. PubMed includes links to both full-text articles and other related resources.

This PubMed scraper extracts a subset of the relevant article information, including authors and MeSH subject headings (and soon chemical information, genes, and proteins), and makes this data available in Piggy Bank so that it can be integrated more easily with other collections of information.
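
To illustrate why extracted subject headings help with integration, here is a minimal sketch that finds the subjects linking records from different collections. It is not the scraper's data model: the record layout (dicts with `title`, `source`, and `subjects` keys), the function names, and the sample values are all assumptions made for this example.

```python
from collections import defaultdict


def group_by_subject(records):
    """Map each subject heading to the sorted titles of records tagged with it."""
    index = defaultdict(set)
    for record in records:
        for subject in record["subjects"]:
            index[subject].add(record["title"])
    return {subject: sorted(titles) for subject, titles in index.items()}


def shared_subjects(records):
    """Return the subjects that appear in records from more than one source."""
    sources = defaultdict(set)
    for record in records:
        for subject in record["subjects"]:
            sources[subject].add(record["source"])
    return sorted(s for s, srcs in sources.items() if len(srcs) > 1)


# Made-up records for illustration only; not real scraper output.
records = [
    {"title": "Article 1", "source": "pubmed", "subjects": ["Internet", "Semantics"]},
    {"title": "Article 2", "source": "pubmed", "subjects": ["Semantics"]},
    {"title": "Book 1", "source": "worldcat", "subjects": ["Internet"]},
]

print(shared_subjects(records))                 # ['Internet']
print(group_by_subject(records)["Semantics"])   # ['Article 1', 'Article 2']
```

A subject that shows up in more than one source is a natural join point between collections, which is the kind of cross-collection browsing the paragraph above has in mind.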

After you load the scraper, here is a quick "how-to" to get started:

  1. Go to PubMed and enter a search term (e.g. 'semantic web rdf')
  2. Choose an item from the result set
  3. Click on the Piggy Bank 'data coin' icon. You'll get back the article description, along with the authors, subjects, chemicals, etc. that are associated with the article.
  4. Save the results to your Piggy Bank, or share them with others via a Semantic Bank.


Starbucks Scraper

This Starbucks scraper extracts the locations of Starbucks coffee shops near you for use in Piggy Bank. I find this particularly useful when I combine its results with those from the Open WorldCat scraper.

After you load the scraper, here is a quick "how-to" to get started:

  1. Go to the Starbucks store locator and enter a location (state, postal code, etc.)
  2. Click on the Piggy Bank 'data coin' icon. You'll get back a list of stores and their locations.
  3. Choose 'coordinates (map)' to display these stores via Google Maps. Save the results to your Piggy Bank, or share them with others via a Semantic Bank.

map of coffee shops and libraries that have a specific book near me
Eric Miller