A Single Searchbox
What can go into a single searchbox?
- Author
- Title
- ISBN - only indexed without hyphens
- Date - not in current WC kw: index
- Subjects - may be difficult to match specific format (e.g., dog vs. dogs)
- Format - coded in db, everyday terms will not match
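The ISBN item above suggests stripping hyphens from user input before it reaches the index. A minimal sketch, assuming hyphens and spaces are the only separators users type; `normalize_isbn` is a hypothetical helper, not part of any WorldCat interface:

```python
import re

def normalize_isbn(term):
    """Return the term without hyphens/spaces if it looks like an ISBN, else None."""
    compact = re.sub(r"[-\s]", "", term)
    # ISBN-10: nine digits plus a digit or X check character; ISBN-13: thirteen digits.
    if re.fullmatch(r"\d{9}[\dXx]|\d{13}", compact):
        return compact.upper()
    return None
```

With this sketch, normalize_isbn("0-306-40615-2") returns "0306406152", while an ordinary keyword such as "dogs" returns None and can go to the keyword index unchanged.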
Multibox implies:
Limits are different - it is hard to identify limit terms within a free-text query,
but if the user has typed them as keywords, they are likely to cause the search to fail
- doctype - checkboxes take up a lot of room (progressive disclosure?)
- format - large print, cassette
- content - fiction, biography, dissertation
- audience - juvenile
- language
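One way to handle limit words typed into a single box is to pull them out of the query before the keyword search runs. A sketch only: the phrase table and the coded values below are invented for illustration and are not the actual codes in the database.

```python
# Hypothetical mapping from everyday words to (limit name, coded value).
LIMIT_TERMS = {
    "large print": ("format", "lp"),
    "cassette": ("format", "cas"),
    "fiction": ("content", "fic"),
    "biography": ("content", "bio"),
    "dissertation": ("content", "deg"),
    "juvenile": ("audience", "juv"),
}

def split_limits(query):
    """Separate recognized limit phrases from the remaining keywords."""
    remaining = " ".join(query.lower().split())
    limits = {}
    # Try longer phrases first so multi-word phrases are matched whole.
    for phrase in sorted(LIMIT_TERMS, key=len, reverse=True):
        padded = f" {remaining} "
        if f" {phrase} " in padded:
            name, code = LIMIT_TERMS[phrase]
            limits[name] = code
            remaining = padded.replace(f" {phrase} ", " ", 1).strip()
    return remaining, limits
```

A query like "Harry Potter large print" splits into the keywords "harry potter" and a format limit. The same logic would also strip "fiction" from the title "Pulp Fiction", which is the identification problem noted above, so a detected limit is probably better offered as a suggestion than applied silently.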
Searching for Apples and Oranges
In different databases, and between doctypes in WorldCat,
records differ by:
- Abstract present
- Subject Headings
- Author format and authority
- Multilingual
- Doctypes - what is the average holdings count for each type?
  - Serials - most widely held
  - Books - widely held (certain subtypes are not: Braille, Dissertation)
  - Sound/Visual
  - Archival Materials
Relevance Ranking based on
- #search terms - even if negated?
- proximity of terms
- tf * idf - boosted by abstracts
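The tf * idf factor can be sketched on a toy corpus. This is the textbook formulation with raw term counts, not necessarily the ranking WorldCat uses; it only illustrates why a record with an abstract (more text, more occurrences of the term) can outscore a bare record.

```python
import math
from collections import Counter

def tf_idf_scores(query_terms, docs):
    """Score each doc (a list of tokens) by summed tf * idf over the query terms."""
    n_docs = len(docs)
    # Document frequency: number of docs that contain each term at least once.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = sum(
            tf[t] * math.log(n_docs / df[t])
            for t in query_terms
            if df[t]  # skip query terms that occur nowhere in the corpus
        )
        scores.append(score)
    return scores

# Toy records: the second has an abstract, so "apples" occurs more often.
records = [
    "apples".split(),
    "apples growing apples in cold climates apples".split(),
    "oranges".split(),
]
print(tf_idf_scores(["apples"], records))
```

Because the term frequency here is not normalized by record length, the record with the abstract scores highest; a length-normalized variant would reduce that boost.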
Holdings Dec 16 2005 (over 500 libs = cg:09)
Doctype                      Records  #libs for 10th item  #items in over 500 libs  Over 500 / records
com (Computer_Files)          187250                  336                        1  0.0000053405
mix (Archival_Materials)      268521                   52                        0  0
art (Articles)                647913                  122                        0  0
url (Internet_Resources)      880740                 1275                    21532  0.0244476
map (Maps)                    984117                 1617                      349  0.000354633
sco (Musical_Scores)         1388599                 1421                      405  0.000291661
vis (Visual_Materials)       1991773                 1351                     2265  0.00113718
rec (Sound_Recordings)       2101114                  985                     1011  0.000481173
ser (Serial_Publications)    2755990                 5093                     5031  0.00182548
bks (Books)                 52256944                    ?                   261415  0.00500249
deg (Theses)                 5733595                  854                       52
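The last figure on each row matches #items in over 500 libs divided by the record count for that type. A short sketch that recomputes it from the numbers above, including the ratio the Theses row leaves blank:

```python
# (records, items held by over 500 libraries), copied from the holdings figures above.
HOLDINGS = {
    "com": (187250, 1),
    "mix": (268521, 0),
    "art": (647913, 0),
    "url": (880740, 21532),
    "map": (984117, 349),
    "sco": (1388599, 405),
    "vis": (1991773, 2265),
    "rec": (2101114, 1011),
    "ser": (2755990, 5031),
    "bks": (52256944, 261415),
    "deg": (5733595, 52),
}

def widely_held_ratio(code):
    """Fraction of a type's records that are held by over 500 libraries."""
    records, over_500 = HOLDINGS[code]
    return over_500 / records

for code in HOLDINGS:
    print(f"{code} {widely_held_ratio(code):.6g}")
```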
What is the percentage of widely held types?
For cg:31 (over 2500 libraries): 932 items = 800 books + 132 serials
(compare count to overall number of each type)
(note that browse count is way off at 1626)
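The comparison suggested above can be worked through directly. A sketch using the cg:31 counts and the Dec 16 2005 record totals for books and serials; treating those totals as the right denominators is an assumption.

```python
# Counts taken from the notes above; using the Dec 16 2005 totals as
# denominators is an assumption.
CG31_TOTAL = 932
CG31_BY_TYPE = {"bks": 800, "ser": 132}
RECORDS_BY_TYPE = {"bks": 52256944, "ser": 2755990}

def widely_held_stats(code):
    """Return (share of the cg:31 set, fraction of the type's records in cg:31)."""
    count = CG31_BY_TYPE[code]
    return count / CG31_TOTAL, count / RECORDS_BY_TYPE[code]

for code in CG31_BY_TYPE:
    share, rate = widely_held_stats(code)
    print(f"{code}: {share:.1%} of cg:31, {rate:.2e} of all {code} records")
```

By these figures books make up most of the cg:31 set, but a serial record is roughly three times as likely as a book record to be held by over 2500 libraries.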