Wednesday, June 11, 2014

Language of Queries


Computerized systems (whether web-based or customized in-house models) formulate queries using a specific query language.  Query languages can be grouped by the aspect of data and information retrieval they model:
-      Keyword-based querying
-      Pattern matching queries
-      Text structure querying
-      Query protocols

Users of electronic databases can run all of the above query types except query protocols, since these pertain to functions solely within an enclosed database, such as a CD/DVD archival database.  Recently, I requested all of my medical test records and received the data on a CD-ROM for my perusal.  The query protocol built into the database allowed me to search and review the pertinent records.  Different electronic databases may use an assortment of protocols, although there are some standards within industries and institutions.
Keyword-based queries are the type most commonly used for Internet/WWW searches, and ranking the retrieved documents by keyword is a basic characteristic.  When you input a word into a search engine, such as Google, you are performing a keyword query.  Most people are familiar with this type of query.  We normally use natural language to perform context and/or pattern queries; a context query looks for words that appear in close proximity (grouped together) in a document.  The other day I couldn’t remember a phrase, so I did a quick Internet query to refresh my memory.  I input “buy the rumor” and the search engine immediately found the answer I was looking for: “buy the rumor, sell the news” was the first-listed result.  I also use Boolean queries quite a bit; the Boolean operators (AND, OR, NOT, BUT) let me specify in detail which documents I do or do not want in the results.
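To make the keyword, phrase, and Boolean ideas above concrete, here is a minimal sketch in Python.  It is not how any real search engine works; the three sample documents, the function names, and the reading of BUT as “first word but not the second” are all my own assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical mini-collection; the IDs and texts are made up for illustration.
DOCS = {
    1: "buy the rumor sell the news",
    2: "the news today covers the stock market",
    3: "a rumor about the market spread quickly",
}


def build_index(docs):
    """Map each word to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


INDEX = build_index(DOCS)


def keyword(word):
    """Keyword query: documents containing the word."""
    return set(INDEX.get(word.lower(), set()))


def boolean_and(a, b):
    """Documents containing both words."""
    return keyword(a) & keyword(b)


def boolean_or(a, b):
    """Documents containing either word."""
    return keyword(a) | keyword(b)


def boolean_but(a, b):
    """Documents containing the first word but not the second."""
    return keyword(a) - keyword(b)


def phrase(query):
    """Context (phrase) query: the words must appear next to each other, in order."""
    terms = query.lower().split()
    hits = set()
    for doc_id, text in DOCS.items():
        words = text.lower().split()
        for i in range(len(words) - len(terms) + 1):
            if words[i:i + len(terms)] == terms:
                hits.add(doc_id)
                break
    return hits
```

With this toy collection, `phrase("buy the rumor")` finds only document 1, while `boolean_or("rumor", "news")` returns all three documents.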
Baeza-Yates notes that pattern matching covers searching by parts of a word or by distances between words, using prefixes, suffixes, substrings, ranges, error allowances, and regular expressions.  You would use patterns when searching for similarities within documents.  A search using the prefix “tech” will retrieve documents with the words technician, technical, technology . . . and so forth.  Patterns can be simple or complex, depending on the user’s ability to combine word patterns effectively.
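The pattern types listed above can be sketched in a few lines of Python.  This is only an illustration under my own assumptions: the word list and function names are made up, and the error allowance is implemented here as a plain edit distance (the number of single-character insertions, deletions, or substitutions), which is one common choice rather than the only one.

```python
import re

# Hypothetical vocabulary, made up for illustration.
WORDS = ["technician", "technical", "technology", "biotech", "teach", "architect"]


def prefix_match(prefix, words):
    """Words that start with the given prefix."""
    return [w for w in words if w.startswith(prefix)]


def suffix_match(suffix, words):
    """Words that end with the given suffix."""
    return [w for w in words if w.endswith(suffix)]


def substring_match(sub, words):
    """Words that contain the given substring anywhere."""
    return [w for w in words if sub in w]


def regex_match(pattern, words):
    """Words that match the whole regular expression."""
    return [w for w in words if re.fullmatch(pattern, w)]


def edit_distance(a, b):
    """Count the single-character edits needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # delete from a
                            curr[j - 1] + 1,            # insert into a
                            prev[j - 1] + (ca != cb)))  # substitute
        prev = curr
    return prev[-1]


def fuzzy_match(term, words, max_errors=1):
    """Words within max_errors edits of the (possibly misspelled) term."""
    return [w for w in words if edit_distance(term, w) <= max_errors]
```

For example, `prefix_match("tech", WORDS)` keeps technician, technical, and technology, while `fuzzy_match("technicel", WORDS)` still finds technical despite the typo.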
Structured queries are those that take advantage of the structure of the text, such as HTML on the Web.  Baeza-Yates and Ribeiro-Neto describe the “main structures: form-like fixed structures [e.g. online applications], hypertext structure [e.g. hyperlink connections], and hierarchical structure [e.g. organizational charts]” (Baeza-Yates & Ribeiro-Neto, 2011, p. 263).  Structured Query Language (SQL) is used by many database applications today; the MS Access database application, for example, runs its queries in SQL.  The use of algorithms in query languages is an essential facet of computerized processes.
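For readers who have not seen SQL, here is a small sketch using the sqlite3 module that ships with Python.  The table name, column names, and sample rows are hypothetical and chosen only to echo the medical-records example earlier in this post; MS Access uses its own SQL dialect, so treat this as the general idea rather than Access syntax.

```python
import sqlite3

# In-memory database; the table, columns, and rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_records ("
    "id INTEGER PRIMARY KEY, test_name TEXT, test_year INTEGER)"
)
conn.executemany(
    "INSERT INTO test_records (test_name, test_year) VALUES (?, ?)",
    [("blood panel", 2013), ("x-ray", 2014), ("blood panel", 2014)],
)


def find_tests(name, year):
    """Return the records matching both the test name and the year."""
    rows = conn.execute(
        "SELECT id, test_name, test_year FROM test_records "
        "WHERE test_name = ? AND test_year = ? ORDER BY id",
        (name, year),
    )
    return rows.fetchall()
```

The WHERE clause plays the same role as the Boolean AND in a keyword search: only rows that satisfy both conditions come back.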
Almost all Internet search engines gather user information pertaining to individual searches, including information kept on the user’s own computer in small files called “cookies”, which the site can read back on later visits.  As we query the Internet/WWW for information, complex commercially-oriented algorithms are researching us.  Metadata is being compiled every second!
I’ll have some chocolate with those “cookies”!
References
Baeza-Yates, R., & Ribeiro-Neto, B. (2011). Modern Information Retrieval: The Concepts and Technology behind Search (2nd ed.). New York: Pearson Education Limited.