Melanie Claybar
Fall 2016
INFO 5200 TXWI_A
Young Adult Novels:
Information Organization System
1. Project description
This information organization system is for a young adult novel collection housed in the Little Cypress Mauriceville (LCM) High School Library in Orange, Texas. The collection is available to all students in grades nine through twelve at LCM High School. Students frequently ask the librarian for help finding books but are reluctant to search the catalog.
1.1. Collection and information objects
The Young Adult Novels collection contains 4,831 young adult novels from the LCM High School Library. It is located inside the library and is available to all students, grades nine through twelve. The collection represents a wide variety of genres including dystopian, romance, adventure, historical fiction, mystery, and fantasy. Topics include first love, friendship, coming of age, coming out (LGBTQ), family issues, mental illness, vampires, cancer, dystopian society, and sports. The library continues to expand the collection through purchases and grants, such as the Dollar General award for the annual Teens’ Top Ten collection. The purpose of the collection is to promote reading for pleasure and to provide students of LCM High School a resource for finding YA novels of interest and relevance.
1.2. Users' demographics and knowledge
Students at LCM High School, male and female, range in age from thirteen to nineteen years. Of the 1,171 students, ninety-nine percent speak English. Nineteen percent participate in the free-and-reduced lunch program, which suggests a middle to high family income for most LCM High School students. These demographics help place the user group in each of the four knowledge levels (general, domain, system, and information seeking). These levels indicate what the user knows in each area compared to the average consumer of the collection. In system design, it is important to consider the knowledge levels to create a system that meets the needs of the user.
General knowledge refers to the user’s base of information which is gathered through daily interactions and experiences. Because LCM High School students generally have limited life experiences and are still building their academic knowledge, a moderate level of general knowledge is expected.
Another area to consider is domain knowledge, or specific knowledge of the subject of a collection. Although some students in this group are avid readers of young adult novels, most are non-readers. Students in the latter group may have seen the latest Divergent movie, but they do not know this is a novel series in the library. Therefore, the overall domain knowledge of this user group is moderate.
System knowledge indicates the user’s aptitude for operating the information system. Most students at LCM High School come from middle to high income families where they have access to multiple devices and reliable internet service. The typical LCM High School student has a smartphone and can operate devices such as laptops, iPads, and other tablets with ease. However, despite this level of comfort, many lack the initiative to learn new systems, especially those that are academic in nature. Students frequently experience frustration when encountering unfamiliar systems. Because of this, the system knowledge for LCM High School students is moderate.
Information seeking knowledge refers to the user’s ability to find information in an unfamiliar system. Many of the criteria for system knowledge apply to information seeking knowledge for LCM High School students. Based on these criteria, the information seeking knowledge for the user group is moderate.
The demographics and knowledge levels indicate that the users of the information system for the Young Adult Novels collection want simple search capability, most likely subject and author, with occasional advanced search fields such as number of pages.
Without the Young Adult Novel information system, users search the LCM High School Library catalog which may cause frustration due to high recall.
1.3. Users' problems and questions
Students use the system to retrieve books for independent reading and classroom assignments. Users have a wide range of interests and criteria for selecting books and although some know exactly what they are looking for, typical searches are broad and general. The following are representative of users’ search questions:
User question 1: I want a short book about football.
Object attributes: number of pages, subject
Desired precision: moderate
Desired recall: low
User question 2: I want to read The Hunger Games books.
Object attributes: series, title
Desired precision: high
Desired recall: moderate
User question 3: I want something with pictures that will give me nightmares.
Object attributes: illustrations, subject
User question 4: I want a book by John Green.
Object attributes: author
The attributes in the system based on these questions are Title, Author, Subject, Series, Illustrations, Number of Pages. Additional attributes needed are ISBN and Publisher.
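The desired precision and recall levels listed for the questions above can be made concrete with a small calculation. The sketch below assumes the usual definitions (precision is the share of retrieved records that are relevant; recall is the share of relevant records that are retrieved). The record ids and counts are hypothetical and do not come from the LCM collection.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search.

    retrieved: set of record ids the system returned
    relevant:  set of record ids that actually satisfy the user's need
    """
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical search for "a short book about football":
# 4 novels in the collection are relevant (ids 1-4); the system returns 5 records.
relevant = {1, 2, 3, 4}
retrieved = {1, 2, 7, 8, 9}

p, r = precision_recall(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.40 recall=0.50
```

A user who wants low recall and moderate precision, as in question 1, is content with a short result list as long as a fair share of it is on topic.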
2. Representation of information objects
2.1. Entity level
An entity is the information object that a user desires within an information system. The information object and entity represented in this system is a young adult novel. Defining the entity level, which refers to the unit of information that is represented in the system, allows for the creation of the metadata record where one unit of analysis equals one record. For example, the entity level for a novel can be represented by chapter or the entire novel. For the Young Adult Novels Collection, one record equals one novel, therefore, the entity level for this system is the whole object, or whole novel. This is appropriate for this system because, based on the users’ problems and questions, they are not looking for individual chapters; they want to check out and read the entire novel. Identifying the entity level also gives clear directions to the cataloger, to avoid multiple entity levels in the system, and, ultimately, confusion for the end user.
2.2. Metadata elements and semantics
Metadata, or data about data, describes the information object so that it is retrievable and manageable. The metadata scheme organizes the attributes (details and characteristics of the information object which were previously identified) into metadata elements in the record.
The ten metadata elements chosen for this system support the four user tasks as defined by the International Federation of Library Associations and Institutions’ report, Functional Requirements for Bibliographic Records (FRBR). The FRBR tasks outline the user’s process for finding, identifying, selecting, and obtaining an information object.
The first task is to find the information per the user’s search criteria. In this step, the user types the desired search terms to retrieve a list of results. The Title, Author, and Subject elements in the metadata scheme for the Young Adult Novels Collection, support this task. As indicated by the users’ problems and questions, users of this collection most often want to search by subject, while occasionally searching for title or author. For example, a user who wants a short book about football must first search for books in the collection with the subject of football. This is the first step the user takes to narrow the selection.
The user then looks at the list generated during the find task and decides which records meet the initial search goal. This identification task allows the user to distinguish the records with even more specificity. The Title, Author, Subject, Summary, Series, Illustrations, and Number of Pages elements assist the user with this task. For example, the user who wants a short book about football first searches the Subject element, then uses the Number of Pages element to decide which record or records best fit his or her criteria. Users also indicated a desire for books with illustrations, as well as books in a series.
The select task gives the user a chance to choose records that meet additional requirements such as language or format. After identifying the desired record or records per the initial search goal, the user must select the record that most closely matches his or her needs. This step also includes rejecting records that do not meet the user’s requirements. The metadata elements that fulfill this task are Title, Author, Subject, Summary, Series, Illustrations, and Number of Pages. For example, if there are multiple books about football with the desired number of pages, the user must select a record using additional information. The user may recognize an author or title and choose or reject the object based on these elements.
In the last task, the user obtains the information object. The goal for a user is to retrieve the desired record either physically or digitally. After the user selects a record he or she must retrieve the object from a location which is indicated in the metadata. The metadata element that supports this task is Call Number, which directs the user to the location of the object in the library.
The metadata elements Publisher and ISBN are not directly related to user needs. However, these fields provide additional information as to the identification of the correct information object. For example, there may be two novels with the same metadata elements except for publisher. While these novels contain the same information, or words on the pages, they are two distinct information objects. This needs to be identified in the system to create two distinct records for these objects.
Appendix A contains a complete list of metadata elements and semantics.
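The four FRBR tasks described above can be sketched as a short chain of functions, using the "short book about football" question as the running example. This is only an illustration: the record titles, authors, page counts, and call numbers below are placeholders, not data from the collection, and the `select` step stands in for a human decision by simply picking the shortest book.

```python
from dataclasses import dataclass


@dataclass
class Record:
    title: str
    author: str
    subjects: list
    pages: int
    call_number: str


# Hypothetical records; every value here is a placeholder.
RECORDS = [
    Record("Football Novel A", "Author One", ["football"], 180, "FOO.ONE.180.111"),
    Record("Football Novel B", "Author Two", ["football", "friendship"], 420, "FOO.TWO.420.222"),
    Record("Vampire Novel C", "Author Three", ["vampires"], 150, "VAM.THR.150.333"),
]


def find(records, subject):
    """Find: search the Subject access point."""
    return [r for r in records if subject in r.subjects]


def identify(records, max_pages):
    """Identify: narrow the result list using Number of Pages."""
    return [r for r in records if r.pages <= max_pages]


def select(records):
    """Select: pick one record; here, simply the shortest."""
    return min(records, key=lambda r: r.pages) if records else None


def obtain(record):
    """Obtain: the Call Number tells the user where the novel is shelved."""
    return record.call_number


found = find(RECORDS, "football")
short = identify(found, max_pages=250)
choice = select(short)
print(obtain(choice))  # FOO.ONE.180.111
```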
2.3. Record structure and specifications
There are ten fields in the database record. Each metadata element in the metadata scheme maps one-to-one to a database field except for ISBN. An ISBN can be a ten-digit or thirteen-digit number, and there are two distinct fields in the database to accommodate these two forms.
There are several specifications for each field. The field type indicates if the element contains text or numbers, or a combination. Some elements do not contain information that is relevant to the user’s initial search. This is indicated in the specification for whether the field is searchable. If the element is not available for all records, this is not a required field in the records. The number of entries puts a cap on how many instances a field can have. For example, a novel may fit into multiple genres so the field needs to accommodate this variety. A controlled vocabulary is a list of terms that the cataloger chooses from when entering the information in the records. The last specification is a drop-down list which is typically a short list of options to choose from.
The Title field is a text field, and is required and searchable since some users search by title. This field is limited to one entry because each information object, or novel, has only one title. If there are subtitles, this is accommodated in the title field and explicitly explained in the input rules for the cataloger. There is no controlled vocabulary or drop down list for the title field as there are too many to list and maintain.
The Author field is also a required, searchable text field. Some users of this system want to search by author either by their own experience or through recommendations. Most young adult novels have only one author but occasionally have co-authors, therefore the field allows for two entries. As with the title field, there is no controlled vocabulary or drop down list as there are too many authors to list and maintain.
The Summary field provides a short plot summary. This is a text field and is not required because this information may not be provided or available for each object. It is not searchable since students search for keywords, which is accommodated in the Subject field. It does not have controlled vocabulary or a drop-down list due to the size of the field.
The Publisher field is a text field that is not searchable since users of this system are not concerned about the publisher of the novel. There is no controlled vocabulary or drop-down list for the Publisher field as there are too many publishers to list and maintain.
The International Standard Book Number, or ISBN, is a number field that uniquely identifies each edition of a novel. In this system, the ISBN field is not searchable since users of this system are not concerned about the edition or version of the novel. It is important to know exactly which edition or version of a novel is in the collection, for example for replacement value, so this is a required field. There is no controlled vocabulary since this is a number assigned by an external entity.
The Number of Pages field is a number field that indicates the length of the work. This field is not searchable but is required since some users indicate this is part of the selection criteria. There is no controlled vocabulary or drop-down list since there are too many numbers to list as options.
The Call Number field is a required text field that is not searchable since this is the last step the user takes to obtain the novel. There is only one entry allowed because this indicates the location of the object in the library. There is no controlled vocabulary or drop-down list.
Most users want to search by subject, so the Subject field is a required, searchable text field. This field has a controlled vocabulary to accommodate multiple spellings or formats of the same subject. For example, the subject of science fiction may also be entered as sci-fi. Without a controlled vocabulary there are inconsistencies in the record and in the user’s search. This field allows 10 entries since most novels address several topics or issues.
Some novels in this collection are part of a series, therefore, the Series field is a number field with one allowed entry in the record that indicates the number place in the series. Not all novels are part of a series so this field is not required. There is no controlled vocabulary or drop down list since this is a number. This field is not searchable since users are using this information to determine if the novel is in a series and which number it is in the series.
Although limited, there are novels in this collection with illustrations. Users occasionally want to find novels with pictures or illustrations, so the Illustrations field is a text field that is searchable but not required. Only one entry is allowed, with a controlled vocabulary in a drop-down list since this is a yes or no answer.
Appendix B contains a complete list of record structure and specifications.
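The field specifications described in this section lend themselves to a simple machine-checkable form. The following is a minimal sketch, assuming a record is stored as a mapping from field name to a list of entries. It covers only a subset of the ten fields, and the `FieldSpec` and `validate` names are illustrative, not part of Libib or any other cataloging tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    name: str
    required: bool
    searchable: bool
    max_entries: int


# A subset of the specifications described in this section.
SPECS = [
    FieldSpec("Title", required=True, searchable=True, max_entries=1),
    FieldSpec("Author", required=True, searchable=True, max_entries=2),
    FieldSpec("Summary", required=False, searchable=False, max_entries=1),
    FieldSpec("Subject", required=True, searchable=True, max_entries=10),
    FieldSpec("Call Number", required=True, searchable=False, max_entries=1),
]


def validate(record):
    """Return a list of problems; an empty list means the record passes.

    record maps a field name to a list of entries.
    """
    problems = []
    for spec in SPECS:
        entries = record.get(spec.name, [])
        if spec.required and not entries:
            problems.append(f"{spec.name}: required field is empty")
        if len(entries) > spec.max_entries:
            problems.append(f"{spec.name}: more than {spec.max_entries} entries")
    return problems


record = {
    "Title": ["New Moon"],
    "Author": ["Stephenie Meyer"],
    "Subject": ["VAMPIRES"],
}
print(validate(record))  # ['Call Number: required field is empty']
```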
2.4. Record content and input rules
Content and input rules are necessary to prevent inconsistencies and errors in cataloging. The rules respectively give the cataloger explicit directions about how to find the information for each element and how to input this information into the record. Without these rules, each cataloger must make individual decisions about such matters. These differences can make it difficult for the user to find the information needed.
The content rules tell the cataloger where to find the data through the Chief Source of Information, for example the title page or a publisher website. This ensures consistency with all catalogers. The Chief Source of Information for the Young Adult Novels Collection is primarily the copyright page. In some cases, there may be more than one source for the information. The content rules may indicate alternate locations and indicate the order in which the cataloger should look for the data. For example, when determining the Title, the cataloger initially looks at the title page, but if there is no title page, he or she then looks to the copyright page.
Input rules give the cataloger instructions for entering the data in the fields. This keeps the data in the field consistent from record to record. For example, the input rules instruct the cataloger to enter the Author as first name then last name. Some fields presented a problem as there is no matching field in the database. The Tags field takes the place of the Subject field and the ISBN field is split into the corresponding ISBN 10 or ISBN 13 number.
Appendix C contains a complete list of content and input rules.
Appendix G contains sample records.
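Two of the input rules, the ISBN split and the title and subtitle format, are mechanical enough to sketch in code. The functions below are a hedged illustration of the rules as written in Appendix C. The function names are made up for this example, the ISBN strings are placeholder digits rather than real book numbers, and the length check does not verify ISBN check digits.

```python
def split_isbn(raw):
    """Route an ISBN to the ISBN 10 or ISBN 13 field, without dashes.

    Returns a dict with both fields; the unused field stays blank.
    """
    digits = raw.replace("-", "").replace(" ", "")
    fields = {"ISBN 10": "", "ISBN 13": ""}
    if len(digits) == 13:
        fields["ISBN 13"] = digits
    elif len(digits) == 10:
        fields["ISBN 10"] = digits
    else:
        raise ValueError(f"expected 10 or 13 characters, got {len(digits)}")
    return fields


def choose_isbn(isbn10, isbn13):
    """When a book lists both numbers, only the ISBN 13 is entered."""
    if isbn13:
        return split_isbn(isbn13)
    return split_isbn(isbn10)


def format_title(title, subtitle=None):
    """Title first, then a colon, then the subtitle if there is one."""
    return f"{title}: {subtitle}" if subtitle else title


# Placeholder numbers, not real ISBNs.
print(split_isbn("0-123-45678-9"))
# {'ISBN 10': '0123456789', 'ISBN 13': ''}
print(choose_isbn("0-123-45678-9", "978-0-123-45678-6"))
# {'ISBN 10': '', 'ISBN 13': '9780123456786'}
print(format_title("Main Title", "A Subtitle"))
# Main Title: A Subtitle
```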
3. Access and authority control
To complete the FRBR tasks, a user must have an access point in the information system to begin the search. Access points are the designated fields that are indexed and searchable. Traditionally, name, subject, and title fields are access points in a library information system. When a user performs a search, he or she expects to find records most related to the search term. Collocation is the process of gathering these like records. The fields used for collocation must have consistency so that the process is reliable. Other fields, which may not be searchable by the user, also need consistency so the integrity of the system is intact.
Authority work is the process of exerting authority control over the data. Authority control is a procedure that provides consistency in the access point data and, therefore, consistency in the search results. Without authority control, results vary based on the data that is entered in the record, which can differ even if there is one cataloger but certainly among multiple catalogers. Authority control is beneficial to both the user and the cataloger since the user gets the desired results and the cataloger has clear instructions for data entry. The two types of authority control in this system are subject authority control and name authority control.
Name authority control refers to the standardization of names of people or corporations involved in the creation of the work. There are many variations of spellings and abbreviations so this type of authority work maintains a name authority file, which is a compilation of approved names under this control. In this system, Author and Publisher are under name authority control.
Subject authority control is the use of a common vocabulary to control subject terms or headings that are similar or related. When implemented properly, subject authority control makes the search process much easier for the end user, eliminating the need to search for multiple terms to get similar results. In this system, the Tag field contains the subject terms and is under subject authority control using a thesaurus.
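A name authority file can be pictured as a lookup from every known variant to the one authorized form. The sketch below is illustrative only: the variants listed are assumptions invented for the example, and a production file would be maintained by the cataloger rather than hard-coded.

```python
# Hypothetical name authority file: each variant maps to the authorized form.
NAME_AUTHORITY = {
    "stephenie meyer": "Stephenie Meyer",
    "meyer, stephenie": "Stephenie Meyer",
    "s. meyer": "Stephenie Meyer",
    "little, brown": "Little, Brown and Company",
    "little, brown and company": "Little, Brown and Company",
}


def authorized_name(entry):
    """Return the authorized form of a name, or None if it is not on file."""
    # Lowercase and collapse runs of whitespace before the lookup.
    key = " ".join(entry.lower().split())
    return NAME_AUTHORITY.get(key)


print(authorized_name("Meyer,  Stephenie"))  # Stephenie Meyer
print(authorized_name("Stephanie Meyer"))    # None (misspelling is not on file)
```

Returning None for an unknown variant, rather than guessing, leaves the decision to add a new authorized name with the cataloger.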
4. Representation of information content
4.1. Subject access
Subject access is a process of providing users with a way to find information either through subject representation or classification. Subject representation describes the intellectual content or what the object is about. This is not the same as the physical description of the object, since a book and a movie can have the same subject representation but are two different information objects. Subject analysis is an important task which determines the subject representation of an object and is completed in four steps: familiarization, extraction, translation, and formalization. The information professional responsible for subject analysis must be familiar with the content of the objects and identify the main topics and terms. He or she creates the controlled vocabulary based on these terms and creates the rules for the cataloger to input the data in the system.
Most users want to search for specific content topics so it is important that the data entry in the subject field is consistent in all records. In this system, the Tags field provides subject access. The system designer provides subject access control through subject heading, controlled vocabulary, and classification.
Subject headings, like those used by the Library of Congress, group objects based on similar topics. This is a fixed list that does not allow for additions by the cataloger.
Controlled vocabularies provide the cataloger with a list of terms for the subject access fields. This can be a validation list in which a predetermined list of terms is provided in the form of a drop-down list or similar feature. Another method for delivering a controlled vocabulary is using a thesaurus. The thesaurus provides a list of terms and establishes relationships between terms. Typically, there are rules established which allow the cataloger to add terms as necessary.
Classification schemes establish a procedure for physically organizing the objects in a collection. This physical grouping provides another way for users to retrieve the desired information. The classification scheme, sometimes called a call number, is created through the identification of facets that are abbreviated and synthesized to create a unique identifier for the object. Objects are then physically positioned in the library, in order, per this identifier. The Call Number field contains the classification for this system and the facets used for the creation of the call number are Tags and Author.
4.2. Thesaurus structure
Subject authority control creates consistency between records by establishing rules of a common vocabulary and creates a link between objects that are related by topic. Since most users of this system are searching by subject terms, this consistency and common link give the user a variety of objects that match the search goal, which increases the user’s satisfaction and likelihood of completing the last FRBR step of retrieving the object. In this system, the Tag field, which contains the subject terms, has no formatting restraints. Therefore, it is imperative that the database designer create rules and controlled vocabulary for this field, instead of allowing the cataloger to enter data at will, creating inconsistencies and errors in the database.
A thesaurus is a list of controlled vocabulary, or descriptors, using a syndetic structure. The syndetic structure means that the terms are semantically linked in some way. There are three types of semantic relationships in a thesaurus: hierarchical, associative, and equivalent. Hierarchical relationships create order to the terms from broad to more specific terms, which is commonly used in taxonomy. The associative relationship comprises terms that are related but not the same. An equivalent relationship is one in which the terms are synonymous or similar with one being the preferred term. Each relationship must be cross-referenced in the thesaurus. These mandatory reciprocals must match up in the thesaurus to clearly indicate the relationship.
The domain of a thesaurus is the overall topic or concept that is covered. The domain of this system is young adult novels. The scope addresses any limitations of the domain, such as graphic or dystopian novels. There are no limitations for this system, therefore the domain and scope are the same for this system.
Specificity of a thesaurus refers to the preciseness or exactness of subject representation. Thesauri with more concrete nouns tend to have high specificity as opposed to those with more abstract terms. The thesaurus for this system contains more abstract terms and has a low specificity. Exhaustivity addresses the level of representation of subject terms that apply to an object. The level of exhaustivity determines the number of subject terms which are allowed in the record. Depth indexing usually means there is no limit on the number of entries for subject and that all or most topics and subtopics are applied. In this case the exhaustivity is high. Summarization uses few subject terms, making the exhaustivity low. The exhaustivity for this thesaurus is low due to a limit of ten subject terms. The users of this system are easily overwhelmed when searching the main library catalog. Most search using more general terms; therefore, low specificity and low exhaustivity are appropriate for this system.
Appendix D contains a sample thesaurus.
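The requirement that every relationship have a matching reciprocal can be checked mechanically. The sketch below stores a few terms as a dictionary and reports any relationship whose reciprocal is missing. The three subject terms are borrowed from the sample thesaurus, but the exact relationships shown and the non-preferred term "relationships" are assumptions made for illustration.

```python
# term -> {relationship: [related terms]}
THESAURUS = {
    "interpersonal relations": {"NT": ["family life"], "UF": ["relationships"]},
    "family life": {"BT": ["interpersonal relations"], "NT": ["fathers and sons"]},
    "fathers and sons": {"BT": ["family life"]},
    "relationships": {"USE": ["interpersonal relations"]},
}

# Each relationship and the reciprocal that must appear under the target term.
RECIPROCAL = {"BT": "NT", "NT": "BT", "RT": "RT", "UF": "USE", "USE": "UF"}


def missing_reciprocals(thesaurus):
    """List every (term, relationship, target) whose reciprocal entry is absent."""
    missing = []
    for term, relations in thesaurus.items():
        for rel, targets in relations.items():
            for target in targets:
                back = thesaurus.get(target, {}).get(RECIPROCAL[rel], [])
                if term not in back:
                    missing.append((term, rel, target))
    return missing


print(missing_reciprocals(THESAURUS))  # []
```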
4.3. Classification scheme
Classification is the process of arranging objects in a collection according to an organization scheme that is most useful for the user. This is commonly referred to as the call number. Physically arranging the objects in such a manner allows users to easily find objects in the collection and browse for similar objects in close proximity. There are two main types of classification schemes: faceted and hierarchical.
A faceted approach uses a combination of classes pulled from the fields in the database. This scheme allows the database creator to design the arrangement of the classes to best suit the needs of the users. A hierarchical approach arranges the objects from broad classes to more specific ones and is useful for collections with multiple subjects in the scheme.
In this system, the primary facet for classification is the Tags field, which houses the subject terms. This is most appropriate for the users of this system since most are searching by subject. Additional facets are Author and Pages, which provide information that is most important to the user. Last is the ISBN, which is used as a unique identifier for the object. This ensures that no two objects have the same call number.
The call number VAM.MEY.563.657 represents the object New Moon, by Stephenie Meyer. The primary subject for this book is vampires (VAM), the author’s last name is Meyer (MEY), and the book is 563 pages long. The ISBN-13 number is 97831675657, which supplies the final three digits (657).
Appendix E contains complete rules for the classification scheme.
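The notation rules in Appendix E can be expressed as a short function. This is a sketch under stated assumptions: the `call_number` name is illustrative, the primary tag is taken to be the one entered in all capital letters, the second tag "first love" is an assumed example, and the ISBN string is copied exactly as printed in the New Moon example above.

```python
def call_number(tags, author, pages, isbn):
    """Build a call number as Subject.Author.Pages.ISBN-suffix.

    tags:   list of subject tags; the primary tag is the one in all capitals
    author: author entered as first name then last name
    pages:  last numbered page of the novel
    isbn:   ISBN as entered in the record, without dashes
    """
    # Fall back to the first tag if none was entered in all capitals.
    primary = next((t for t in tags if t.isupper()), tags[0])
    subject = primary[:3].upper()
    last_name = author.split()[-1]
    return f"{subject}.{last_name[:3].upper()}.{pages}.{isbn[-3:]}"


print(call_number(["VAMPIRES", "first love"], "Stephenie Meyer", 563, "97831675657"))
# VAM.MEY.563.657
```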
Appendix A. Metadata elements and semantics
No. | Element name | Semantics
1 | Title | The name of the work
2 | Author | The creator or writer of the work
3 | Summary | A short description of the work
4 | Publisher | The entity that produces and sells the work
5 | ISBN | The International Standard Book Number; a unique identifier for the work
6 | Number of Pages | The length of the work
7 | Call Number | An identifier that indicates the location of the work in the library
8 | Subject | The general main topic or idea of the work
9 | Series | The number of the work if part of a group
10 | Illustrations | Drawings or photographs in the work
Appendix B. Record structure and desired specifications
1. Record structure specifications
No. | Field name | Field type | Searchable | Required | Number of allowed entries | Controlled vocabulary? | Drop-down list?
1 | Title | Text | Yes | Yes | 1 | No | No
2 | Author | Text | Yes | Yes | 2 | Yes | No
3 | Summary | Text | No | No | 1 | No | No
4 | Publisher | Text | No | Yes | 1 | Yes | No
5 | ISBN | Number | No | Yes | 1 | No | No
6 | Number of Pages | Number | No | Yes | | No | No
7 | Call Number | Text | No | Yes | 1 | No | No
8 | Subject | Text | Yes | Yes | 10 | Yes | No
9 | Series | Text & Number | No | No | 1 | No | No
10 | Illustrations | Text | Yes | Yes | 1 | No | No
2. Field comparison
No. | Desired Field | Libib Field | Notes
1 | Title | Title |
2 | Author | Author |
3 | Summary | Description |
4 | Publisher | Publisher |
5 | ISBN | ISBN 10 | If ISBN is a 10-digit number use the ISBN 10 field in Libib
6 | ISBN | ISBN 13 | If ISBN is a 13-digit number use the ISBN 13 field in Libib
7 | Number of Pages | Pages |
8 | Call Number | Call Number |
9 | Subject | Tags | Each tag represents a Subject entry
10 | Series | Notes |
11 | Illustrations | Notes |
Appendix C. Record content and input rules
Field #: 1
Field Name: Title
Semantics: The name of the work
Chief Source of Information: 1) Title page of the book, 2) the copyright page of the book, 3) front cover of book, 4) spine of book
Input Rules: Enter title as found using title style. If book has a title and subtitle, enter the title first, followed by a colon, then the subtitle
Example: Gym Candy
Field #: 2
Field Name: Author
Semantics: The creator or writer of the work
Input Rules: Enter the author’s first name then last name, capitalizing proper nouns. If more than one author, separate by a comma.
Example: Stephenie Meyer
Field #: 3
Field Name: Description
Semantics: A brief description of the work
Chief Source of Information: Copyright page
Input Rules: Enter summary from copyright page exactly as written. If no summary, leave blank.
Example: When the Cullens, including her beloved Edward, leave Forks rather than risk revealing that they are vampires, it is almost too much for eighteen-year-old Bella to bear, but she finds solace in her friend Jacob until he is drawn into a "cult" and changes in terrible ways.
Field #: 4
Field Name: Publisher
Semantics: The entity that produces and sells the work
Chief Source of Information: Copyright page
Input Rules: Enter the first publisher listed exactly as listed.
Example: Little, Brown and Company
Field #: 5
Field Name: ISBN 10
Semantics: The International Standard Book Number; a unique identifier for the work
Input Rules: If the ISBN number is ISBN 10, enter without dashes, otherwise leave blank. If there is both an ISBN 10 and ISBN 13 number, leave blank
Field #: 6
Field Name: ISBN 13
Input Rules: If the ISBN number is ISBN 13, enter without dashes, otherwise leave blank. If there is both an ISBN 10 and ISBN 13 number, enter the ISBN 13 number.
Field #: 7
Field Name: Pages
Semantics: The length of the work
Chief Source of Information: Last page of novel
Input Rules: Look for the last numbered page of the novel and enter this number.
Field #: 8
Field Name: Call Number
Semantics: An identifier that indicates the location of the work in the library
Chief Source of Information: See Appendix E.
Input Rules: See Appendix E.
Field #: 9
Field Name: Tags
Semantics: The general main topic or idea of the work
Chief Source of Information: 1) Copyright page, 2) dust jacket front panel, 3) dust jacket back panel
Input Rules: Choose up to ten authorized terms from the thesaurus (Appendix D). Enter the first term in all capital letters then enter all other terms exactly as found. Separate multiple subjects with a comma. If a subject term is not included in the thesaurus, add it to the list. Only capitalize proper nouns and use common acronyms when possible instead of full names.
Example: LGBTQ; vampires
Field #: 10
Field Name: Notes
Semantics: The number of the work if part of a group
Chief Source of Information: 1) Title page of the book, 2) the copyright page of the book, 3) front cover of book, 4) spine of book, 5) back cover of book
Input Rules: If the book is part of a series, in the notes field enter “Number in series” followed by a colon, then the number of the book in the series. This goes above Illustration notes.
Example: Number in series: 2
Field #: 11
Semantics: Drawings or photographs in the work
Input Rules: In the notes section, enter “Illustrations” followed by a colon, then “Yes” if there is an illustrator listed on the copyright page or “No” if there is no illustrator.
Example: Illustrations: Yes
Appendix D. Sample thesaurus
BT concentration camps
BT interpersonal relations
NT fathers and sons
fathers and sons
BT family life
NT concentration camps
USE interpersonal relations
BT = Broader Term
NT = Narrower Term
RT = Related Term
UF = Use For
Appendix E. Classification scheme
Subject | Author | Title
See Notation Rules | See Notation Rules | See Notation Rules
2. Notation rules
Facet name: Tags
Chief source of information: Tags field
Notation rules: Use the first three letters of the subject tag that is in all capital letters. Follow with a period.
Facet name: Author
Chief source of information: 1) Title page of the book, 2) the copyright page of the book, 3) front cover of book, 4) spine of book
Notation rules: Use the first three letters of the author’s last name. Follow with a period.
Facet name: Pages
Chief source of information: Last page of the novel.
Notation rules: Use the entire number. Follow with a period.
3. Rule for unique number
Facet name: ISBN 10 or ISBN 13
Chief source of information: Copyright page.
Notation rules: Use the last three digits of the ISBN number. If there is both an ISBN 10 and ISBN 13 number, use the last three digits of the ISBN 13 number.
Appendix G. Sample records