Multiuser Bibliography Functional Specification

This is the functional specification for the Multiuser Bibliography application. It describes major design goals, intended audience, and actual application behavior.

This document is subject to change. Version information:

I. Overview

    The Multiuser Bibliography application, or MultiBib, is a web-based annotated bibliography tool. It stores citations for various types of resources with optional annotations, and lets users browse, search, add, and annotate them. MultiBib helps users collaborate on building a bibliography; with it, small groups can work together on research projects, with group members saving relevant resources as they encounter them. It's meant to be a simple, focused, flexible solution for recording resources, with minimal disruption to a researcher's normal workflow.

    There are a number of software bibliography tools. MultiBib is distinguished by concentrating on two key ideas, both based on the observation that most bibliography software is personal software, running on personal computers. Researchers don't always have their own personal computers handy, but they often have access to some computer and the Internet. Also, researchers sometimes work in groups — in which case they may want to share a bibliography.

    This specification describes the goals of the project and the approach taken in achieving them (section I). Then it considers the intended users and accommodating their needs, preferences, and typical workflows (section II). Finally, it describes the application in detail, including the user interface and important aspects of internal function (section III).

    Who Should Read this Specification?

    The technical details in the Functionality section are primarily for maintainers, though some technically-inclined users may also be interested. The rest of this document is aimed at two audiences: potential users trying to determine whether MultiBib is useful for them, and project stakeholders evaluating the success of the project.

    Why a Multiuser Bibliography?

    MultiBib is meant to be a multiuser bibliography: a tool that will let users collaborate in building a list of resources (with optional annotations), with features for adding, editing, viewing, and searching entries.

    Researchers need to keep track of resources they find. This application aims to help them do that, particularly when they work in groups. Group use requires a centralized database, rather than simply keeping bibliographic information on individual computers; that's the main distinguishing feature of MultiBib.

    There are many computer-based bibliography systems. The goal of MultiBib was to create one that made it easy for teams of people to work with the same bibliography, wherever they were.

    The Approach

    A Web-based Application

    Web-based applications are suited to collaboration, because it's easy for users to get access to them, and they don't require any new infrastructure on the client end. Researchers working in a library might not have their own computers handy, for example, but would likely have web access through the library's public machines. MultiBib makes it convenient to add to the bibliography while browsing an online catalog. Also, future versions are well-positioned to work with other browser-based and browser-friendly technologies, such as Zotero, the Citation Style Language, and Microformats.

    Front End

    The presentation technology is built on standard XHTML 1.0 and CSS 2.0. It's designed to render comfortably on typical personal computers.

    The various views use consistent styling and visual elements. A wayfinder character in the upper-left corner (below the header) gives the user a quick indicator of the current view. The global navigation menu on the left also remains consistent across pages. For convenience, the Quicksearch and Sign In forms are built into the navigation menu, rather than making the user navigate through additional pages, though separate Advanced Search and Sign In Required pages are also provided.

    Back End

    The application logic is implemented in PHP, with a MySQL database. The PHP layer allows for easy manipulation of the bibliographic data, including some relatively fancy processing for data storage (such as generating the bibliographic key from author and title fields) and display (for example, using a regular expression to condense long URLs for the browse view). MySQL provides fast access to data and reliable operations on it, and convenience functionality such as filtering, searching, and sorting.

II. Users and Usability

    Users and Their Goals

    MultiBib is aimed at researchers: students, academics, and independent scholars who need to keep track of numerous references and resources. They may not be comfortable with more powerful, but also far more complex, bibliographic management systems such as BibTeX. MultiBib tries to be easy and relatively self-evident to operate.

    As researchers collect references, for individual or (especially) group projects, they need:

    • A place where they can record resources as they come across them, or whenever else is convenient. In fact, it should become their preferred place to record resources; if they'd rather just jot them down on scraps of paper, then the application isn't providing them much value.
    • The ability to look up resources that they or their collaborators have recorded. They should be able to look them up by browsing or searching. A bibliography isn't any good if you can't get information back out of it.
    • Features that let them collaborate on a bibliography. This is MultiBib's main advantage over pen & paper or other single-user systems, so it has to be obvious, easy, and effective. Collaboration includes both enabling joint work where it's appropriate (such as the ability to comment on other people's entries), and maintaining boundaries between users where necessary (by recording who contributed what, for example, and by preventing anyone from accidentally or deliberately destroying someone else's work).

    Accommodation and Collaboration

    One key aspect of MultiBib is that it is a tool, so it should accommodate users' existing modes of working rather than dictating new ones. (Some applications can and should dictate new modes of operation; driving a car is not like walking, for example. This isn't one of them.)

    Another is that it is collaborative. It must be responsive to the needs of multiple users; and it must let those users share data, but maintain reasonable boundaries (for example, by making it clear who has contributed what addition or change).

    Collaboration also means that the application should foster a sense of community among the users. Some of that is done through the user interface — for example, by including a short history of recent edits on the home page. Another way to maintain a sense of community is to make users feel the application itself is a member, through a feature set that emphasizes communal values such as utility (the tool should provide a useful function), trustworthiness (the tool should be reliable), and cooperation (the tool should be easy to use).

    User and Usability Research

    No formal research was done in designing MultiBib. I based my analysis of the needs of users on my own experiences as a researcher, and what I've observed of other researchers. I did review some other bibliographic tools, including several I've used myself (which range from pen & paper and word-processing software to dedicated applications like Zotero), though not in any formal way.

    Had time permitted, it would have been useful to conduct at least three types of user research:

    • Contextual inquiry, studying researchers actually engaged in collecting resources. While I acknowledge the value of self-reported user stories, I think users may not always be fully aware of how they incorporate tasks such as recording and retrieving bibliographic information into their own workflow. For example, citing a source while composing an academic essay is essentially an "out of band" activity — it interrupts the flow of normal writing — and so users may postpone it without even realizing they're doing so. Contextual inquiry would let me observe such behaviors and consider how the application might accommodate them.
    • It would be useful to observe users of other bibliographic software, both new users and experienced ones. That would provide valuable data about how users learn this kind of software, and what features work well for advanced users.
    • Now that I have a working prototype, one of the best sources of information for prioritizing future work (the "backlog", in Scrum terminology) would be input from users who actually try to work with the prototype. Agile methodologies such as Scrum actually recommend letting such user stories drive the priorities for each iteration (or "sprint", in Scrum).

III. Functionality

    This part of the specification describes specific views in the application and the functions that can be performed in them. The screenshots have been reduced in size for convenience. Click on them to see a full-size screenshot.

    Recent

    [Screenshot: recent-additions screen]

    The Recent screen is the MultiBib home page. It lists the ten most recently added or updated entries in the database. As a welcome page, it shows frequent users what may have changed since their last visit, and gives new users an idea of the kinds of materials contained in the database. It also shows some statistics about recent activity, which helps users get a sense of whether there's significant new information in the database for them.

    As in the other browse-style views (Browse and Search Results), each entry displayed here is a link to the Detail View for that entry.

    Implementation Notes

    The Recent view selects the ten most recent entries using an SQL query that orders by date added, descending, with a limit of ten rows returned. Then it uses a series of queries that employ the SQL COUNT function with WHERE clauses that use the SUBDATE function to find entries with modification dates in the previous day, week, and month.
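
    The two kinds of query described above can be sketched as follows. This is an illustration in Python against an in-memory SQLite database, not the application's PHP/MySQL code. The table name "entries" comes from this specification, but the column names (added, modified) are assumptions, and SQLite's date() function with a negative modifier stands in for MySQL's SUBDATE.

```python
import sqlite3

# Hypothetical schema: the column names are assumptions for illustration.
SCHEMA = """
CREATE TABLE entries (
    id INTEGER PRIMARY KEY,
    title TEXT,
    added TEXT,     -- ISO date on which the entry was added
    modified TEXT   -- ISO date of the last modification
)
"""

# The ten most recently added entries, newest first.
RECENT_SQL = "SELECT id, title FROM entries ORDER BY added DESC, id DESC LIMIT 10"

# Count entries modified within the last N days. SQLite's
# date('now', '-N days') plays the role of MySQL's SUBDATE here.
COUNT_SQL = "SELECT COUNT(*) FROM entries WHERE modified >= date('now', ?)"


def recent_entries(conn):
    """Return (id, title) rows for the ten newest entries."""
    return conn.execute(RECENT_SQL).fetchall()


def activity_counts(conn):
    """Return counts of entries modified in the previous day, week, and month."""
    counts = {}
    for label, days in (("day", 1), ("week", 7), ("month", 30)):
        row = conn.execute(COUNT_SQL, (f"-{days} days",)).fetchone()
        counts[label] = row[0]
    return counts
```

    Treating a month as 30 days is a simplification made for this sketch; the specification does not say how the application defines the interval.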

    Browse

    [Screenshot: browse screen]

    The Browse view lists the bibliography ten entries at a time. Entries are listed in bibliographic order, in a style similar to the MLA "works cited" conventions. Entries are rendered in table rows of alternating color, for visual separation and readability. Previous-page and next-page links are shown when there are prior or subsequent entries to be displayed.

    Some formatting is done by heuristics in the PHP code; for example, it attempts to correctly punctuate multiple authors.
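
    As an illustration of this kind of heuristic, here is a minimal Python sketch that joins a list of author names. It is not the application's PHP code, and the exact rules MultiBib applies are not given in this specification; the function name and the punctuation rules below are assumptions.

```python
def format_authors(authors):
    """Join author names with list punctuation (a simplified heuristic)."""
    # Ignore empty or whitespace-only names.
    names = [a.strip() for a in authors if a and a.strip()]
    if not names:
        return ""
    if len(names) == 1:
        return names[0]
    if len(names) == 2:
        return f"{names[0]} and {names[1]}"
    # Three or more: commas between names, with "and" before the last.
    return ", ".join(names[:-1]) + ", and " + names[-1]
```

    A real MLA-style formatter would also have to handle inverted names ("Last, First") and long author lists, which this sketch ignores.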

    In all browse views, each listing is a link to the detail view for that entry.

    Implementation Notes

    The bibliographic ordering of entries in the various browse views (Browse, Recent, and Search Results) is produced using an SQL ORDER BY clause on the bibkey field. That field is generated by the PHP add-entry code from data in the entry; its value should cause the entry to be sorted into the correct position in an MLA-style bibliography. It consists of author names, if any, followed by title, if any, with all characters folded to lower case, all punctuation removed, and each whitespace sequence replaced with a single space. Precomputing a canonical sort key like this ensures consistent, predictable ordering for entries, and if a bibliographic entry is ever missorted by the algorithm, an administrator could force correct positioning by manually setting a proper sort key.
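
    The key-generation rules above (authors, then title, lower-cased, punctuation removed, whitespace collapsed) can be sketched in Python as follows. This is an illustration rather than the application's PHP code; the function name and the choice of ASCII punctuation are assumptions.

```python
import re
import string

# Translation table that deletes ASCII punctuation characters.
# (Assumption: the real code may treat non-ASCII punctuation differently.)
_PUNCT = str.maketrans("", "", string.punctuation)


def make_bibkey(authors, title):
    """Build a canonical sort key from author and title strings."""
    # Authors first, then title; either part may be empty.
    parts = [p for p in (authors, title) if p]
    key = " ".join(parts).lower()
    key = key.translate(_PUNCT)
    # Replace each run of whitespace with a single space.
    return re.sub(r"\s+", " ", key).strip()
```

    Sorting on a key built this way puts entries with authors under the author's name and authorless entries under their title, which is what an ORDER BY on the bibkey field relies on.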

    Search

    There are two search features: Quick Search and Advanced Search. Quick Search is a simple search for a single fixed string, available from any page; it's useful for finding entries that include an uncommon keyword. Advanced Search offers more powerful capabilities in return for more input from the user.

    Quick Search

    [Screenshot: quicksearch results]

    Quick Search is available by entering any text in the "Quick Search" textbox in the menu bar. It presents a browse view of entries that include the entered string in any of their fields. (Quick Search checks authors, titles, editors, other work information, and publication information.)

    Advanced Search

    [Screenshot: search screen]

    Advanced Search is requested through the Search link on the navigation bar. It presents the user with a form where they can fill in search data for any of the entry fields. There is a pair of radio buttons to choose between finding entries that match all or any of the populated fields; here radio buttons are used rather than a checkbox because that permits better formatting of the label text and makes for a clearer control. There's also a checkbox (checked by default) to search only the current entries, and not include superseded ones in the results; and a checkbox to show the generated SQL query (which is mostly of interest to developers and geeks).

    The search results from Advanced Search are also a browse view, except that they're not paged, regardless of how many there are. That's because the assumption of Advanced Search is that the user wants a very specific set of results, and it's often useful to be able to have all of those on a single screen (eg for printing purposes).

    Implementation Notes

    Quick Search is implemented very similarly to Browse; it just has an additional WHERE clause with a set of phrases using LIKE and the search text, wrapped with the SQL wildcard %, all OR'd together.
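
    A minimal Python sketch of how such a WHERE clause might be assembled is shown below. It is not the application's PHP code. The field list is an assumption (only some of these names appear in this specification), and the use of query placeholders instead of embedding the search text directly is a choice made for this sketch, not something the specification confirms.

```python
# Assumed field names; "title" in particular is hypothetical.
QUICK_FIELDS = ["primaryauthor", "otherauthors", "editor", "title", "publisher"]


def quick_search_sql(text):
    """Return (sql, params) for a substring search across all quick-search fields."""
    # Wrap the search text in SQL wildcards so it matches anywhere in a field.
    pattern = f"%{text}%"
    # One LIKE phrase per field, all OR'd together.
    where = " OR ".join(f"{field} LIKE ?" for field in QUICK_FIELDS)
    sql = f"SELECT * FROM entries WHERE {where} ORDER BY bibkey"
    return sql, [pattern] * len(QUICK_FIELDS)
```

    Note that a % or _ typed by the user is still treated as a wildcard here; escaping those characters is left out of the sketch.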

    Advanced Search is more interesting. It dynamically builds a SQL query, adding WHERE phrases for any non-blank form field. The phrases are joined with OR or AND depending on the setting of the "any or all" radio buttons. Some of the phrases are more complex; for example, if the "author" field is set, that becomes the parenthesized WHERE phrase "(primaryauthor LIKE '%author%' OR otherauthors LIKE '%author%' OR editor LIKE '%author%')". To simplify the code for assembling these phrases into the WHERE clause, that clause begins with a no-op: either 1 if the phrases are joined with AND, or 0 if they're joined with OR. That lets the PHP code begin each phrase with a conjunction (AND publisher LIKE '%publisher%', etc) without producing invalid SQL syntax if it's the first phrase in the generated WHERE clause.
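
    The no-op trick can be sketched in Python as follows. This is an illustration, not the application's PHP code: the function name, the whitelist of simple fields, and the use of placeholders are assumptions, while the three author columns are the ones named above.

```python
# Assumed whitelist of form fields that map directly onto one column each.
SIMPLE_FIELDS = ("title", "publisher")


def build_where(form, match_all=True):
    """Return (where_clause, params) for the non-blank fields in `form`."""
    conj = "AND" if match_all else "OR"
    # No-op seed: "1 AND x" and "0 OR x" both reduce to x, so every
    # phrase can start with the conjunction, including the first one.
    clause = "1" if match_all else "0"
    params = []
    for field, value in form.items():
        value = value.strip()
        if not value:
            continue  # blank form fields contribute nothing
        pattern = f"%{value}%"
        if field == "author":
            # The author field searches three columns at once.
            clause += (f" {conj} (primaryauthor LIKE ? OR otherauthors LIKE ?"
                       " OR editor LIKE ?)")
            params += [pattern] * 3
        elif field in SIMPLE_FIELDS:
            clause += f" {conj} {field} LIKE ?"
            params.append(pattern)
    return "WHERE " + clause, params
```

    One consequence of the seed values: with every field blank, an "all" search degenerates to WHERE 1 and matches every entry, while an "any" search degenerates to WHERE 0 and matches none.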

    If the "search only current entries" checkbox is unchecked (it's checked by default), the SQL query is suffixed with a UNION SELECT clause that duplicates the search against the "oldentries" table.

    Detail View

    [Screenshot: view-database screen]

    When a user clicks on an entry in any browse-style view, they'll be taken to the detailed view of that entry. It shows each of the fields with its associated data (if any) in tabular form. It also shows any comments that may have been added to the entry.

    If the entry has been superseded there will be a link to the superseding entry, so the user can jump to the current (or at least next) version.

    If the user is signed in, there will be buttons for editing the entry and for adding a comment.

    Implementation Notes

    It's possible for a user to try to view a superseded entry: they may have done an Advanced Search and opted to include superseded as well as current entries, for example, or they may have bookmarked a detail view of an entry that was subsequently updated. So the detail-view code searches both the "entries" table (current entries) and the "oldentries" table (superseded entries) for the supplied entry ID. Entry IDs are unique across the two tables.
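
    A minimal sketch of that two-table lookup, in Python with SQLite rather than the application's PHP and MySQL. The table names come from this specification; the function name and the id and title columns are assumptions.

```python
import sqlite3


def find_entry(conn, entry_id):
    """Return (row, is_current) for an entry ID, or (None, False) if absent."""
    # Check current entries first, then superseded ones. Because entry IDs
    # are unique across the two tables, at most one of the queries matches.
    for table, is_current in (("entries", True), ("oldentries", False)):
        row = conn.execute(
            f"SELECT id, title FROM {table} WHERE id = ?", (entry_id,)
        ).fetchone()
        if row is not None:
            return row, is_current
    return None, False
```

    The is_current flag lets the caller decide whether to show the link to the superseding entry.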

    Signing In

    At any point, an anonymous user can sign in by entering a valid username (which is an email address — that ensures usernames are unique) and password in the form on the navigation menu. Successful sign-in lets the user add new entries, edit existing ones, or add comments.

    A signed-in user will have a "Sign out" link on the navigation menu in place of the sign-in form.

    Implementation Notes

    User password verification is performed on the server, of course. The database has a table of usernames and password verifiers, with unique user IDs and optional full names for users. The password verifiers are strings that encode a method and its parameters; currently the only supported method is a salted MD5 hash. The verification scheme is excessively complex, especially since credentials are sent over a clear connection; it should be replaced with something stronger (eg an SSL channel for submitting credentials, or a blind system like SRP or PAK), or something simpler (like plain MD5 verifiers, which would be easier to administer).
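
    To make the idea of a verifier string concrete, here is a Python sketch of a salted MD5 verifier. The "method:salt:digest" layout, the function names, and the salt-then-password hashing order are all assumptions; the specification does not give the actual encoding. As noted above, MD5 is a weak choice, and this sketch only illustrates the existing scheme.

```python
import hashlib
import hmac


def make_verifier(password, salt):
    """Encode a salted MD5 verifier as 'md5:<salt>:<hex digest>' (assumed layout)."""
    digest = hashlib.md5((salt + password).encode("utf-8")).hexdigest()
    return f"md5:{salt}:{digest}"


def check_password(password, verifier):
    """Return True if the password matches the stored verifier string."""
    method, _, rest = verifier.partition(":")
    if method != "md5":
        return False  # only one method is supported in this sketch
    salt, _, expected = rest.partition(":")
    actual = hashlib.md5((salt + password).encode("utf-8")).hexdigest()
    # Constant-time comparison of the two hex digests.
    return hmac.compare_digest(actual, expected)
```

    Keeping the method name inside the verifier is what would allow a stronger method to be added later without invalidating existing accounts.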

    The user's identity and sign-in state are maintained in PHP using its session mechanism. MultiBib stores the user's numerical ID (from the SQL "users" table) in a session variable. Various pages (add, edit, etc) check the sign-in state as they're generating their output, to prevent users from bypassing sign-in simply by going to the appropriate URL.

    Add an Entry

    [Screenshot: add an entry]

    The form for adding an entry is a series of fields with explanatory labels. The labels are necessarily relatively long, because bibliographic data is complicated and users are often unsure how to document sources. That made it impossible to use a conventional table layout of labels and fields, so the fields simply flow as a series of paragraphs. Unfortunately that makes the form somewhat long and daunting, and improving the user experience of this aspect of the application should be a priority for the next release. (There are some comments along these lines in the final entry of the project journal.)

    The form ends with standard submit and reset buttons. Tab indices are used to make the submit button the first selected control when tabbing off form fields, to reduce the chances of accidentally clearing the form.

    Bibliographic sources come in many forms. They don't all have authors; they don't all have titles. MultiBib requires only that a work have something for a title or for an author, so that it knows how to sort it. Other than that, all fields are optional.

    Implementation Notes

    Currently, all data verification is done server-side. It'd be nice if this view included JavaScript for redundant client-side verification, to save a round trip to the server.
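
    The one rule stated in this section (an entry needs an author or a title) can be sketched as a server-side check. This Python version is an illustration only; the function name, the field names, and the error message are assumptions, and the application's PHP code may verify more than this.

```python
def validate_entry(fields):
    """Return a list of error messages; an empty list means the entry is acceptable."""
    errors = []
    # Treat missing and whitespace-only values as blank.
    author = (fields.get("primaryauthor") or "").strip()
    title = (fields.get("title") or "").strip()
    # Without an author or a title there is nothing to sort the entry by.
    if not author and not title:
        errors.append("An entry needs at least an author or a title.")
    return errors
```

    Returning a list rather than a single flag leaves room for further checks to report their own messages.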

    Editing an Entry

    The edit-entry view is nearly identical to the add-entry view, except that the fields will already be populated with data from the entry. The edit view is reached from the detail view by clicking the Edit button (which is only shown if the user is signed in).

    Implementation Notes

    To guard against data loss from editing mistakes, submitting an edit does not overwrite the record: the old record is moved to the "oldentries" table of superseded entries, and a new record is created. (The initial version of the application has no provision for recovering data from superseded records; that has to be done manually. But at least it can be done.) Superseded records link to the record that supersedes them, so old entry IDs remain valid: going to the Detail View for a superseded entry (eg, using a bookmark) shows the superseded entry with a link to the new one.
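
    The move-and-replace step might look like the following Python sketch, which uses SQLite in place of MySQL and is not the application's PHP code. The table names come from this specification; the columns, the superseded_by link column, and the way the new ID is chosen are assumptions.

```python
import sqlite3

# Hypothetical minimal schema; real entries have many more fields.
SCHEMA = """
CREATE TABLE entries (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE oldentries (id INTEGER PRIMARY KEY, title TEXT, superseded_by INTEGER);
"""


def supersede(conn, entry_id, new_title):
    """Replace an entry with an edited copy and return the new entry's ID."""
    with conn:  # commit all steps together, or roll back on error
        old = conn.execute(
            "SELECT id, title FROM entries WHERE id = ?", (entry_id,)
        ).fetchone()
        if old is None:
            raise KeyError(entry_id)
        # Pick an ID that is unused in both tables, so IDs stay unique across them.
        next_id = conn.execute(
            "SELECT MAX(id) + 1 FROM "
            "(SELECT id FROM entries UNION ALL SELECT id FROM oldentries)"
        ).fetchone()[0]
        conn.execute(
            "INSERT INTO entries (id, title) VALUES (?, ?)", (next_id, new_title)
        )
        # Keep the old record under its old ID, linked to its replacement.
        conn.execute(
            "INSERT INTO oldentries (id, title, superseded_by) VALUES (?, ?, ?)",
            (old[0], old[1], next_id),
        )
        conn.execute("DELETE FROM entries WHERE id = ?", (entry_id,))
    return next_id
```

    Running the three writes in one transaction means a failure partway through cannot leave an entry missing from both tables.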

    Adding a Comment

    Signed-in users can also add a comment for an entry, by clicking the Comment button in the Detail View. The comment input form is just a text area plus submit and reset buttons. Existing comments for an entry are shown in the detail view for that entry. An entry can have any number of comments.

    Implementation Notes

    Comments are kept in a separate table, and related to entries by entry ID. When an entry is superseded, its comments are reassigned to the new version of the entry.
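
    Reassigning comments amounts to a single UPDATE on the comments table. The Python sketch below uses SQLite and is not the application's PHP code; the table name "comments" and the entry_id column are assumptions.

```python
import sqlite3


def reassign_comments(conn, old_entry_id, new_entry_id):
    """Point every comment on the old entry at the new entry; return how many moved."""
    with conn:  # commit on success, roll back on error
        cur = conn.execute(
            "UPDATE comments SET entry_id = ? WHERE entry_id = ?",
            (new_entry_id, old_entry_id),
        )
    return cur.rowcount
```

    Because comments always follow the newest version, the detail view of a superseded entry would show no comments of its own under this scheme.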

I'd like to thank Michael McLeod for extensive input throughout the design and development process.