According to the Linguistic Society of America, there were “6,909 distinct languages [in the world] . . . as of 2009” (LSA, 2014). Since 2009, the count has grown by roughly 200: Ethnologue, an international reference catalog, now lists “7,106 known living languages” (Ethnologue, 2014). Language is basic to the social infrastructure of global communities and is the central concern of information retrieval (IR); without a common everyday language, IR would not exist as a sub-discipline of Library and Information Science (LIS). We are political people, and politics requires explicit communication.
Chu (2014) states that “natural and controlled vocabulary” are intertwined in IR. What is the difference between the two vocabularies? Natural vocabulary is spoken throughout societies on a daily basis; it's the status quo of communication. Controlled vocabulary, by contrast, is derived from a set standard of terminology, such as the one the Library of Congress (LOC) uses to store and retrieve data and information (via subject headings, titles, etc.). Controlled vocabulary provides consistency of terms when searching the LOC's database for specific information: the terms used in data compilation, storage, and retrieval are uniform and standardized, as defined by LIS authorities. When people search the Internet using language that's familiar to them, they use natural vocabulary.
Chu illustrates the problem of synonyms with the example of computer, desktop, and laptop. Most people use one or all of these terms for the same kind of electronic device, and any of them (or all) may appear in an Internet search. A controlled vocabulary, however, would use only one of these terms in order to be consistent and uniform, with “computer” being the most inclusive. I searched the LOC for “computer” and received a total of 160,777 results; “desktop” returned 3,358 results, and “laptop” returned 798 hits. These counts strongly suggest that “computer” is the controlled vocabulary term within the LOC database structure.
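The synonym handling described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not a description of how the LOC catalog actually works: the `PREFERRED_TERM` table and the `normalize_query` function are hypothetical names introduced here, and the table holds just the three terms from Chu's example.

```python
# Hypothetical controlled vocabulary: each natural-language term maps to
# one preferred (authorized) term. Only Chu's three example terms are listed.
PREFERRED_TERM = {
    "computer": "computer",
    "desktop": "computer",
    "laptop": "computer",
}


def normalize_query(query):
    """Replace each known synonym in a query with its preferred term.

    Words that are not in the vocabulary pass through unchanged, and any
    word that repeats after mapping is kept only once.
    """
    normalized = []
    for word in query.lower().split():
        term = PREFERRED_TERM.get(word, word)
        if term not in normalized:
            normalized.append(term)
    return " ".join(normalized)


print(normalize_query("Laptop repair"))   # computer repair
print(normalize_query("desktop laptop"))  # computer
```

A searcher can type whichever natural term comes to mind, while the catalog only ever has to index the single preferred term.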
The advancement of technologies is the catalyst for our new and improved digital societies, and metadata (loosely defined as "data about data") is everywhere. When search engines work with the varied terminology that describes documents, books and journals, websites and webpages, visuals (e.g., images and photos), and audio (songs, music, etc.), they are utilizing metadata. Marketing and advertising agencies are top users of metadata, as are social media platforms such as Twitter and Facebook. The creation and usage of metadata will only increase over time; therefore, we must learn to use it effectively if we are to be successful researchers.
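To make “data about data” concrete, here is a small Python sketch of metadata records and a search over one of their fields. Everything in it is an assumption made for illustration: the field names (`title`, `creator`, `year`, `format`, `subjects`), the `find_by_subject` function, the subject terms, and the second record are invented here and do not come from any real catalog or search engine. Only the title, creator, and year of the first record are taken from the reference list.

```python
# Illustrative metadata records. The subject terms and the whole second
# record are made up for this example.
RECORDS = [
    {
        "title": "Information representation and retrieval in the digital age",
        "creator": "Chu, H.",
        "year": 2014,
        "format": "book",
        "subjects": ["information retrieval", "metadata"],
    },
    {
        "title": "Untitled photo",
        "creator": "Unknown",
        "year": 2014,
        "format": "image",
        "subjects": ["smoothie"],
    },
]


def find_by_subject(records, subject):
    """Return the titles of records whose subject list contains `subject`.

    The comparison ignores letter case.
    """
    wanted = subject.lower()
    return [
        record["title"]
        for record in records
        if wanted in [s.lower() for s in record["subjects"]]
    ]


print(find_by_subject(RECORDS, "Metadata"))
```

Note that the search never opens the book or the photo itself; it works entirely from the description attached to each item, which is what makes that description metadata.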
Have a chocolate smoothie with your IR!
References
Chu, H. (2014). Information representation and retrieval in the digital age. Medford, New Jersey: Information Today, Inc.
Ethnologue. (2014, June 8). Languages of the World. Retrieved from Ethnologue: http://www.ethnologue.com/
LSA. (2014, June 8). How many languages are there in the world? Retrieved from Linguistic Society of America | Advancing the Scientific Study of Languages: http://www.linguisticsociety.org/content/how-many-languages-are-there-world