
More Complex Data Containers
While the data types described in this lesson and the objects you learned about in Lesson 2 are the basis of storing and manipulating the information in computers, they can only go so far in meeting today’s needs. Many other schemes for storing data have been designed by computer scientists, including arrays and databases.
Arrays
Up to this point we have discussed variables as memory containers that hold only one piece of information. It would take many variables to hold all of the names of the players on your baseball team or all of your bowling scores. A more efficient method for storing all of that data and being able to access each unique name or score in the collection is to store the data in an array.
Array - A list of sequentially numbered variables with the same name.
An array uses a single variable name that is numbered, or indexed, to refer to each separately stored data element. Think of it like the mailboxes in an apartment building. The building's street address, such as 7284 Venice Boulevard, is analogous to the variable name. Each apartment number, such as apartment 1, 2, or 3, is analogous to the index value. Unlike apartments, however, the numbering of arrays starts with the number 0 instead of the number 1.
An array created to store the names of players might be named “player.” The String data of “Mary” would be stored in the first player variable, player(0). “Bob” would be stored in the variable player(1) and “Connie” in player(2). Your bowling scores could be stored as integers in variables named score(0), score(1), score(2), and so on.
| 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| Mary | Bob | Connie | Mike | Linda |
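To make the indexing concrete, here is a minimal sketch in Python, which is an assumption since the lesson does not name a programming language. A Python list stands in for the array, and Python writes the index in square brackets, player[0], where the lesson writes player(0).

```python
# A Python list plays the role of the "player" array from the lesson.
player = ["Mary", "Bob", "Connie", "Mike", "Linda"]

# Numbering starts at 0, so the first stored name is player[0].
print(player[0])  # Mary
print(player[2])  # Connie

# Walk through the whole array, showing each index next to its value.
for index, name in enumerate(player):
    print(index, name)
```

One variable name plus an index reaches every stored name, so no separate variable is needed for each player.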
This data storage technique is ideal for some common computer processes such as alphabetically sorting names or adding many scores to find your bowling average. Being able to move conveniently from one stored value to another by using the sequentially numbered index value is easily programmed with techniques you will learn about in future lessons.
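As a rough illustration of the two processes just mentioned, the sketch below sorts the player names alphabetically and averages a set of bowling scores. It is written in Python as an assumption, and the score values are hypothetical, made up only for this example.

```python
player = ["Mary", "Bob", "Connie", "Mike", "Linda"]
score = [112, 135, 98, 150]  # hypothetical bowling scores

# Alphabetical sorting: sorted() returns a new, ordered list.
sorted_players = sorted(player)
print(sorted_players)  # ['Bob', 'Connie', 'Linda', 'Mary', 'Mike']

# Averaging: step through the index values 0, 1, 2, ... and add each score.
total = 0
for i in range(len(score)):
    total += score[i]

average = total / len(score)
print(average)  # 123.75
```

The loop moves from one stored value to the next by changing only the index, which is the convenience the paragraph above describes.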
Databases
In a visit to your local library you likely encounter an electronic database when you search for books by your favorite author, books about horses, or books on the Civil War published in 1965. A library database packages the specific details of every book with additional data called metadata (described below). This combination allows you to search for books in any number of ways and to analyze the information in ways that would be impossible with other data storage methods. A database is a collection of related information and can store the details about such things as items in a warehouse, the individuals in a personnel record, individuals in a telephone book, and transactions on your credit card. The Web offers more databases than you can imagine. There are databases of movies, plants, animals, the human genome, pesticides, chemicals, languages, countries, sports, and on and on!
Database - A collection of information on related items which is stored in a form that can be organized and searched.
Structure of databases
While we often see the contents of a database in the form of a table (a grid of rows and columns), the table is NOT the database; many tables may make up a single database. A database is a collection of related information made up of entities (rows), which are single examples of the stored data and are sometimes called records. Each entity, or record, is described by labeled details called fields or attributes (columns). If a telephone book is a database, the information about any given telephone number is a record or entity, and the labels given to the data, such as name, address, and telephone number, are the fields or attributes that make up the metadata. The specific data details such as Jose Rodriquez, 401 Main Street, and 213-4567 are the values.
Metadata - A description of the data in a database that enables powerful searches and the ability to answer complex questions about the data.
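The telephone book example can be sketched as a single record. The Python dictionary below is only one possible way to picture it, and the field names name, address, and telephone are spellings chosen for this illustration.

```python
# One record (entity) from the telephone book example.
# The keys are the fields (the metadata); the right-hand sides are the values.
record = {
    "name": "Jose Rodriquez",
    "address": "401 Main Street",
    "telephone": "213-4567",
}

fields = list(record.keys())    # the labels that describe the data
values = list(record.values())  # the specific data details

print(fields)  # ['name', 'address', 'telephone']
print(values)  # ['Jose Rodriquez', '401 Main Street', '213-4567']
```

A full telephone book would hold many such records, all sharing the same fields.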
In order for us to see the information stored within a database, it is usually presented in the form of a table. Let’s use the example of the U.S. Department of Agriculture Plants database (http://plants.usda.gov/index.html) to explore the details of databases.
Here are 3 records or entities of the plant database:
The details of these 3 plants plus thousands more comprise the plant database.
The complete set of information on each individual plant is a record or entity.
The category of each specific detail such as family, growth habit, and a picture is a field or attribute.
The specific data such as Meadow Horsetail, Equisetaceae, and a specific picture are values.
Obviously, databases are popular for storing huge amounts of data. But what makes them so powerful? The power of a database comes from its structure, or design, which makes it easy and fast to access, search, and report on the data. Metadata is data, or information, about the data stored in a database. The prefix meta comes from Greek and is used here to mean “about,” so metadata is literally “data about data.”
For a moment, consider the data 86403. Without some explanation this value is totally useless. But the moment I tell you that 86403 is a zip code, the value has significance and meaning. “Zip code”, in this situation, is metadata. It is information about the data value 86403 that gives meaning to it as the zip code of Lake Havasu City, Arizona. This metadata allows you to search a database for answers to specific questions.
In the example of the plant database, we might create a query that seeks to answer the question, “What perennial plants are native to the U.S. and have the word ‘horse’ in their common name?” The result of a query referencing the metadata would likely be a table listing all of the plants in the database that have “native” in the field U.S. Nativity, “perennial” in the field Duration, and “horse” in the field Common Name. Querying a database to find the answer to a question is a skill valued by organizations and businesses.
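The plant query might look something like the Python sketch below. The three records and the field spellings common_name, duration, and us_nativity are placeholders invented for this illustration; they are not copied from the USDA database, and a real database would be searched with its own query tools.

```python
# Hypothetical sample records; the values are illustrative placeholders only.
plants = [
    {"common_name": "Meadow Horsetail", "duration": "Perennial", "us_nativity": "Native"},
    {"common_name": "Horse Chestnut", "duration": "Perennial", "us_nativity": "Introduced"},
    {"common_name": "Common Sunflower", "duration": "Annual", "us_nativity": "Native"},
]

# Keep only the records whose fields satisfy all three conditions.
results = [
    plant["common_name"]
    for plant in plants
    if plant["us_nativity"] == "Native"
    and plant["duration"] == "Perennial"
    and "horse" in plant["common_name"].lower()
]

print(results)  # ['Meadow Horsetail']
```

Each condition names a field, which is why the metadata is what makes the search possible.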
Databases are usually created using software applications such as Microsoft Access or Microsoft SQL Server. These database applications streamline the creation of databases, the definition of metadata with fields or attributes, and the population of records or entities with values. Solving problems with queries is also made easier through database software.
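The three steps named above (defining fields, populating records, and querying) can be sketched with SQLite, a small database engine bundled with Python. SQLite is a stand-in chosen for this example, not one of the products named in the lesson. The table name phone_book and the two Smith entries are hypothetical.

```python
import sqlite3

# Create a temporary database that lives only in memory.
conn = sqlite3.connect(":memory:")

# Define the metadata: a table with three named fields.
conn.execute("CREATE TABLE phone_book (name TEXT, address TEXT, telephone TEXT)")

# Populate the table with records (the Smith rows are made up).
conn.executemany(
    "INSERT INTO phone_book VALUES (?, ?, ?)",
    [
        ("Jose Rodriquez", "401 Main Street", "213-4567"),
        ("Ann Smith", "12 Oak Street", "555-0101"),
        ("Ben Smith", "98 Main Street", "555-0102"),
    ],
)

# Query: who lives on Main Street? Sort the answer by name.
rows = conn.execute(
    "SELECT name, telephone FROM phone_book "
    "WHERE address LIKE ? ORDER BY name",
    ("%Main Street",),
).fetchall()

print(rows)  # [('Ben Smith', '555-0102'), ('Jose Rodriquez', '213-4567')]
conn.close()
```

The query never mentions where the rows are stored; it refers only to field names, and the database software does the searching.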
Get Real
Complete the array below by filling in the names of your childhood friends. What identifier would you give to this array?
Imagine you are creating a database to record the members of your family tree. What fields of data (or metadata) would you include?
A paper telephone book is an old-fashioned database. Think about questions you could not answer (or at least not easily or quickly) with a paper telephone book that you could with an electronic telephone book.
Using the table below, what records would be returned by the following query? “Select all rows where Name begins with ‘B’, BirthYear is greater than 1990, and FavoriteColor = ‘Yellow’”
| Name | BirthYear | FavoriteColor |
|---|---|---|
| Betty | 1901 | red |
| George | 1983 | yellow |
| Bruce | 1991 | yellow |
| Clark | 1998 | blue |
| Barbara | 1989 | yellow |
Here are some possible answers:
childhood_buddies
| 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| Linn | Carlos | Fran | Maria | Connie |
These are some likely fields (metadata) for a family tree database:
First name, Last name, Mother, Father, Siblings, Spouse, Birth date, Birth place, Occupation, Date of death
With an electronic telephone book you could easily locate all individuals with the same first name as you, all of the Smith families living on Oak Street, all of the businesses on Main Street that have the word “hardware” in their name, and all of the Juarez families that are not in the 925 area code.
The only individual who fits the query where Name begins with ‘B’, BirthYear is greater than 1990, and FavoriteColor = ‘Yellow’ is
| Name | BirthYear | FavoriteColor |
|---|---|---|
| Bruce | 1991 | yellow |
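If you want to check this answer yourself, the query can be sketched in Python as shown below. One assumption is made: the color comparison ignores upper and lower case, because the query says ‘Yellow’ while the table stores “yellow.”

```python
# The five records from the exercise table.
people = [
    {"Name": "Betty", "BirthYear": 1901, "FavoriteColor": "red"},
    {"Name": "George", "BirthYear": 1983, "FavoriteColor": "yellow"},
    {"Name": "Bruce", "BirthYear": 1991, "FavoriteColor": "yellow"},
    {"Name": "Clark", "BirthYear": 1998, "FavoriteColor": "blue"},
    {"Name": "Barbara", "BirthYear": 1989, "FavoriteColor": "yellow"},
]

# Apply all three conditions of the query to every record.
matches = [
    person
    for person in people
    if person["Name"].startswith("B")
    and person["BirthYear"] > 1990
    and person["FavoriteColor"].lower() == "yellow"  # case-insensitive (assumption)
]

for person in matches:
    print(person["Name"], person["BirthYear"], person["FavoriteColor"])
```

Betty and Barbara pass the name test but fail the birth year test, and George and Clark fail the name test, which leaves Bruce as the only match.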