The HCI Bibliography: Ten Years Old, But What's It Done for Me Lately?

Published in ACM interactions magazine, 1999, v.6, n.2, p.32-35.

Gary Perlman
director@hcibib.org
http://www.acm.org/perlman/

Abstract

The idea for a free-access online bibliography on human-computer interaction, which resulted in the HCI Bibliography, is over ten years old. Although it started slowly, the HCI Bibliography has grown to over 18,000 entries, most with abstracts, with over 4000 links to full text. Now, with its own web site at www.hcibib.org and its own search service, the HCI Bibliography serves as a central repository for HCI information on the Web (with entries for about 800 Internet resources) and off (with entries for about 400 books, over 15 conferences, and over 10 major journals). This article (1) summarizes the history of the HCI Bibliography, (2) describes its current holdings, web site, and search service, and (3) considers how and why to offer a free service.

Keywords

Human-computer interaction, Bibliography, Bibliographic information retrieval, World-Wide Web, Full text retrieval, Expert system, Search assistance, Query language usability, Compulsive gathering of information

A Brief History of the HCI Bibliography

The Idea

The HCI Bibliography grew out of my 1988 experiences while I wrote a curriculum module on User Interface Development for the Software Engineering Institute (Perlman, 1989). I was delighted to have work-study students type in the bibliographic information for about 200 references; I especially liked having the abstracts and/or tables of contents also online. I could search the file, reorder records, import them into bibliography management tools, etc. I thought, "If a couple of students can put online hundreds of records in a few weeks, what could hundreds or thousands of people do? In a short time, their efforts could be merged into a bibliography to be used by thousands. Maybe authors could donate records of their own publications!"

How it Really got Started

Of course, that was a naive view. Authors wrote long messages explaining why they could not or would not provide an abstracted entry. People donated bibliographic data that had several errors per abstract. The SIGCHI EC expressed a lack of interest in helping because of the likelihood of failure of such a project. Fortunately, work-study students at The Ohio State University signed up to do data entry, and some people on the Net (as we called the Internet then) were willing to validate entries. Also, publishers were willing to give permissions to have their materials online, free of charge. In 1991, after two years of getting started, the first paper on the HCI Bibliography (or HCIBIB, as I call it) appeared in the SIGCHI Bulletin (Perlman, 1991), boasting of over 1000 entries and promising "Eventually, all of HCI will be online and freely accessible around the world." Once the project was started, the SIGCHI EC became a consistent supporter and sponsor.

Related Efforts

During the formation of the HCI Bibliography, other projects "competed". There was a book from the ACM Press (ACM, 1990) with a general title: Resources in Human-Computer Interaction, but which was actually a printout of a query done on ACM publications. Although it had several indexes, I could not help but think that any printed index would be a relic of outdated ideas. Around the same time, I received in the mail a printed bibliography on hypertext, ordered by author and nicely bound as a report. I could not help but think how ironic it was that a bibliography on hypertext in particular would be (1) on paper and (2) in one organization (and one that was least useful, except perhaps for the authors). I was convinced that online information was the only long-term option, and that a format that identified the different parts of entries would allow a variety of search and display options. The UNIX Refer format was chosen because it was simple enough to explain to non-experts.
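
To illustrate why a format that labels the parts of each entry helps, here is a minimal Python sketch of reading Refer-style records. In Refer, each field is a line that starts with a percent sign and a one-letter tag (for example, %A for author, %T for title, %D for date), and records are separated by blank lines. The sample text and the function name parse_refer are assumptions made for illustration; this is not the HCIBIB's actual tooling.

```python
from collections import defaultdict

# Two sample records in Refer style, built from this article's own references.
SAMPLE = """\
%A Gary Perlman
%T The HCI Bibliography Project
%J SIGCHI Bulletin
%V 23
%N 3
%P 15-20
%D 1991

%T Resources in Human-Computer Interaction
%I ACM
%C New York
%D 1990
"""

def parse_refer(text):
    """Split Refer-style text into records (dicts of tag letter -> list of values)."""
    records = []
    current = defaultdict(list)
    last_tag = None
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current record.
            if current:
                records.append(dict(current))
                current = defaultdict(list)
            last_tag = None
        elif line.startswith("%") and len(line) >= 2:
            last_tag = line[1]
            current[last_tag].append(line[2:].strip())
        elif last_tag is not None:
            # A line without a tag continues the previous field.
            current[last_tag][-1] += " " + line.strip()
    if current:
        records.append(dict(current))
    return records

records = parse_refer(SAMPLE)
# Because the date is its own field, reordering by date is a one-liner.
by_date = sorted(records, key=lambda r: r["D"][0])
titles = [r["T"][0] for r in by_date]
```

Once the fields are separated like this, sorting by date, listing by author, or rendering to another display format are all small operations on the same data, which a fixed printed ordering cannot offer.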

Another project, HILITES (Shackel et al., 1992), had broader coverage and more features, but was costly to maintain and therefore costly to provide. I surveyed several hundred "registered" HCIBIB users and concluded that HILITES was beyond the financial reach of most people in HCI, and that, being provided on CD-ROM, it did not serve the needs of many.

Becoming Established

To simplify coverage of the major sources of HCI publications, an early decision was to cover whole journal volumes and conference proceedings that were substantially if not primarily on HCI. Each of these modules was kept in one file (or several files with related names for large conferences). During the early 1990s, the backlog of modules was added to the HCIBIB database, going as far back as the first volume (1969) of the International Journal of Man-Machine Studies (renamed International Journal of Human-Computer Studies in 1994). In more recent years, OCR scanning of entries has proven more accurate than volunteers typing, especially when supplemented by hundreds of automated checks.

Although recognized as the primary source of HCI bibliographic information, the HCIBIB was a database and not a search service. A variety of search services via email and later via the Web were provided, and these appeared to be very popular, if judged only on the number of requests I received about these unaffiliated services. None of these services were authorized, and they typically fell behind in their coverage, in some cases having less than half the released records. In 1997, the HCIBIB moved to its own domain, hcibib.org, offering Web and FTP access. In April, 1998, its search service started.

Services Provided by the HCI Bibliography

Data in the HCI Bibliography

As of September, 1998, the HCI Bibliography:

It has been a principle of the HCI Bibliography that currency of coverage is not as important as affordability, correctness, portability, etc. Once online in a portable format, materials will remain online indefinitely. The HCI Bibliography usually has had a backlog of materials to bring online, and once online, a backlog of materials to validate. As of September, 1998, both backlogs are relatively low.

The HCI Bibliography Web Site

The HCI Bibliography web site is at:
	http://www.hcibib.org/
The HCIBIB web site was redesigned in April of 1998, and between then and September of 1998 it had over 17,500 visitors (over 100 per day). There are several pages on how the project is run (e.g., publisher permissions, data collection and validation, support) and pages about what's new in the database and the search service. There are pages to browse the collection by publication type, publication date, when released, etc. There is a list of the most frequent authors (those with 10 or more authored entries in the HCI Bibliography), with links to retrieve all the publications by each author.

The HCI Bibliography Search Service

The HCIBIB search service started in April of 1998 and between then and September 1998 has processed over 30,000 searches (about 6000/month). Monthly counts show the service growing from an initial 150/day to about 300/day during that period. On September 3, 1998, it handled over 1000 searches for the first time.

The HCI Bibliography search service is based on the glimpse search engine (glimpse.cs.arizona.edu), a free-for-non-profit tool that runs on the server for HCIBIB.ORG. Ironically, the search system for the HCI Bibliography has serious usability problems. Compound those with the generally unplanned nature of searches for a free web service, and it is clear (from the server logs) that many searches miss a lot of what is desired. The search service provides extensive advice about how to improve a search, using knowledge of:

and by providing relative search term frequencies in the database and frequencies of terms in results. The best searches for various topics are maintained in the HCIBIB database as Internet resources. These can be modified with terms to further restrict a search. For example, the following finds over 1400 records on hypertext OR hypermedia:
	{hypertext,hypermedia}
(comma means OR, and braces imply grouping). The search could be modified to find over 60 records on books:
	{hypertext,hypermedia};isbn
(semicolon means AND) or almost 500 records with links to full text:
	{hypertext,hypermedia};http
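
The query syntax above can be sketched as a small evaluator. This is an illustration only, not how glimpse is implemented: it assumes case-insensitive substring matching, applies commas and semicolons left to right with no precedence between them, and uses braces for grouping. The function name matches and the four sample records are hypothetical.

```python
def matches(query, text):
    """Return True if text satisfies query (',' = OR, ';' = AND, '{...}' = grouping)."""
    text = text.lower()
    pos = 0

    def parse_term():
        nonlocal pos
        if pos < len(query) and query[pos] == "{":
            pos += 1                      # skip '{'
            value = parse_expr()
            if pos >= len(query) or query[pos] != "}":
                raise ValueError("missing closing brace")
            pos += 1                      # skip '}'
            return value
        start = pos
        while pos < len(query) and query[pos] not in ",;{}":
            pos += 1
        word = query[start:pos].strip().lower()
        if not word:
            raise ValueError("empty search term")
        return word in text               # within-word (substring) match

    def parse_expr():
        nonlocal pos
        value = parse_term()
        while pos < len(query) and query[pos] in ",;":
            op = query[pos]
            pos += 1
            right = parse_term()
            value = (value or right) if op == "," else (value and right)
        return value

    result = parse_expr()
    if pos != len(query):
        raise ValueError("unexpected character at position %d" % pos)
    return result

# Hypothetical records, invented for this example.
RECORDS = [
    "A study of hypertext navigation",
    "Hypermedia design handbook ISBN 0-000-00000-0",
    "Menu selection times http://example.org/menus",
    "Hypertext usability report http://example.org/ht",
]

def count(query):
    return sum(1 for record in RECORDS if matches(query, record))
```

With these sample records, the base query matches three records, and adding ;isbn or ;http narrows it to the single record that also contains that string, which is the same restricting effect described above.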

There are search options to control whether the search is case-sensitive, or whether whole words must be matched. There is an option for an approximate match that will allow one error per search term. These options can have large effects: a whole word search for AI gets 100 records; within-word, it matches most of the records in the database.
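
A rough Python sketch shows why these options matter so much for a short term like AI. It is not glimpse's algorithm: the function names are hypothetical, the word boundary is whatever Python's regular expressions treat as one, and the approximate match here compares whole words, allowing one inserted, deleted, or substituted character.

```python
import re

def find(term, text, case_sensitive=False, whole_word=False):
    """Search text for term, optionally case-sensitive and/or whole-word only."""
    flags = 0 if case_sensitive else re.IGNORECASE
    pattern = re.escape(term)
    if whole_word:
        pattern = r"\b" + pattern + r"\b"
    return re.search(pattern, text, flags) is not None

def within_one_error(a, b):
    """True if a and b differ by at most one insertion, deletion, or substitution."""
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a                # make a the shorter (or equal-length) string
    i = j = errors = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            i += 1
            j += 1
            continue
        errors += 1
        if errors > 1:
            return False
        if len(a) == len(b):
            i += 1                 # substitution: advance in both strings
        j += 1                     # otherwise skip the extra character in b
    # A leftover character at the end of b counts as one error.
    return errors + (len(b) - j) <= 1

def approx_find(term, text):
    """True if any word in text is within one error of term (case-insensitive)."""
    words = re.findall(r"\w+", text)
    return any(within_one_error(term.lower(), w.lower()) for w in words)
```

Without the whole-word option, a case-insensitive search for AI succeeds on any text containing words such as "main" or "available", which is why it matches most of the database.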

Results can be viewed in HTML or raw Refer format, in brief or detailed views, and search terms can be highlighted in the text. Records contain bookmarks that are actually links to search for a record's identifier; these can be saved for future reuse. Book numbers (ISBNs) are displayed as links to amazon.com, from which any royalties are donated to the Central Ohio local ACM SIGCHI chapter, BuckCHI.

I think Voltaire wrote, "The best is the enemy of the good." So, irony aside, the glimpse search engine lets people search the HCI Bibliography from the convenience of their browser. Over time, a more usable front end may be provided for the queries. It might make a good project for a user interface course.

Conclusions (for Now)

I am occasionally asked how the HCI Bibliography Project is managed (e.g., how to get permissions for materials, how to get them online, how to validate them). These procedures (and their motivations) are documented (online, of course), but they have often had the effect of discouraging people from creating a bibliography service for a field that could use it. It turns out that high-quality bibliographic data is expensive: at least 10 minutes per record when all is counted, and often more. Given that there are too many publications for most people to browse, good bibliographic records provide a reasonable point of access (especially when titles, abstracts, and keywords are well done by authors, which is unfortunately infrequent).

I'm often asked why I put hundreds of hours per year into the HCI Bibliography. Besides the obvious compulsive disorder the work satisfies, I've found that I personally get a lot out of doing work that has lasting value (once online, always online), and which is used by other people. It has also been a great source of data (and now a platform) for exploring uses of hypertext for doing research (especially with a search service on the web). More recently, it has been a reason for me to learn more about how to provide web-based services, which I can apply in other contexts. And, occasionally, people thank you.

References

  1. ACM Press (1990) Resources in Human-Computer Interaction. ACM: New York.
  2. Perlman, G. (1989) User Interface Development. Curriculum Module CM-17. Software Engineering Institute: Pittsburgh, PA.
  3. Perlman, G. (1991) "The HCI Bibliography Project". SIGCHI Bulletin, 23:3, 15-20.
  4. Shackel et al. (1992) "HILITES -- The Information Service for the World HCI Community." SIGCHI Bulletin, 24:3, 40-49.