INFO 210 Ch. 6

information granularity

The extent of detail within the information (fine and detailed or coarse and abstract).

Levels of organizational information.

Individual, department, and enterprise

Formats of organizational information.

document, presentation, spreadsheet, and database

Four primary traits that help determine the value of information

information type, timeliness, quality, and governance

Employees must be able to correlate

Information Levels
Information Formats
Information Granularities

The two primary types of information:

transactional and analytical

transactional information

Encompasses all of the information contained within a single business process or unit of work, and its primary purpose is to support daily operational tasks.

analytical information

Encompasses all organizational information and its primary purpose is to support the performing of managerial analysis tasks.

real-time information

Immediate, up-to-date information.

real-time system

Provides real-time information in response to requests.

real-time information

Important for making faster and more effective decisions, keeping smaller inventories, and helping companies operate more efficiently.

continual change

One of the biggest pitfalls associated with real-time information.

data inconsistency

Occurs when the same data element has different values.

data integrity issues

Occur when a system produces incorrect, inconsistent, or duplicate data.

The five characteristics common to high-quality information:

accuracy, completeness, consistency, timeliness, and uniqueness

If the customer's first name is missing, there is an invalid area code, or the street address contains only a number and not a street name, it would affect ___________.

completeness

Similar street address and phone numbers may create duplicate data which affects __________.

consistency

If a customer's phone and fax numbers are the same or the phone number is listed under email, this would affect __________.

accuracy
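
The quality examples above can be sketched as simple checks. This is a minimal illustration, not a full validation routine; the record layout and the function names are assumptions made for the example.

```python
# Hypothetical customer records; the field names are illustrative only.
customers = [
    {"first_name": "", "last_name": "Smith", "phone": "555-0100", "fax": "555-0199"},
    {"first_name": "Ana", "last_name": "Lopez", "phone": "555-0101", "fax": "555-0101"},
    {"first_name": "Ana", "last_name": "Lopez", "phone": "555-0101", "fax": "555-0101"},
]

def incomplete(records, required=("first_name", "last_name", "phone")):
    """Indexes of records with an empty required field (a completeness problem)."""
    return [i for i, r in enumerate(records) if any(not r.get(f) for f in required)]

def suspect_accuracy(records):
    """Indexes where phone equals fax, which may signal an entry error (an accuracy problem)."""
    return [i for i, r in enumerate(records) if r["phone"] and r["phone"] == r["fax"]]

def duplicates(records):
    """Indexes of records that exactly repeat an earlier record (duplicate data)."""
    seen, repeated = set(), []
    for i, r in enumerate(records):
        key = tuple(sorted(r.items()))
        if key in seen:
            repeated.append(i)
        seen.add(key)
    return repeated
```

Each function returns the positions of the records a person would need to review; none of them changes the data.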

Reasons for low-quality information:

customers initially enter inaccurate information, different systems have different formats, use of abbreviations, or errors in third-party and external information

Serious business consequences that occur due to using low-quality information to make decisions:

Inability to accurately track customers, identify most valuable customers, identify selling opportunities, and build strong customer relationships; lost revenue opportunities from marketing to nonexistent customers; cost of sending nondeliverable email; d

Data Governance

The overall management of the availability, usability, integrity, and security of company data.
Policy of who is accountable for data.

database

Maintains information about various types of objects (inventory), events (transactions), people (employees), and places (warehouses).

database management system (DBMS)

Creates, reads, updates, and deletes data in a database while controlling access and security.
Actual manipulation of data in a database

database management system (DBMS)

Used to answer questions such as how many customers purchased Product A in December or what were the average sales by region.

Two primary tools available for retrieving information from a DBMS:

query-by-example (QBE) and structured query language (SQL)

query-by-example (QBE)

A tool that helps users graphically design the answer to a question against a database.

structured query language (SQL)

Asks users to write lines of code to answer questions against a database.
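
As a rough sketch of what "writing lines of code" looks like, the snippet below uses Python's built-in sqlite3 module to answer the earlier question of how many customers purchased Product A in December. The purchases table, its columns, and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE purchases (customer TEXT, product TEXT, purchase_date TEXT)")
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?)",
    [
        ("Ana", "Product A", "2023-12-03"),
        ("Ana", "Product A", "2023-12-17"),
        ("Ben", "Product A", "2023-12-09"),
        ("Cal", "Product B", "2023-12-11"),
        ("Dee", "Product A", "2023-11-28"),
    ],
)

# Count each customer once, even if they bought Product A several times in December.
count = conn.execute(
    """
    SELECT COUNT(DISTINCT customer)
    FROM purchases
    WHERE product = 'Product A'
      AND strftime('%m', purchase_date) = '12'
    """
).fetchone()[0]
```

A QBE tool would let a manager build the same question by picking fields and filters on screen instead of typing the SELECT statement.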

___________ typically interact with QBE tools.

Managers

___________ have the skills required to code SQL.

MIS professionals

MySQL, Microsoft Access, SQL Server, FileMaker, Oracle, and FoxPro are examples of ___________.

database management systems (DBMSs)

data models

Logical data structures that detail the relationships among data elements using graphics or pictures.

data element

The smallest or basic unit of information. (FIELD)

metadata

Provides details about the data.

Data elements include:

customer's name, address, email, discount rate, preferred shipping method, product name, quantity ordered, etc.

data dictionary

Compiles all of the metadata about the data elements in the data model.
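
One way to picture a data dictionary is to ask a database for the metadata it already keeps about its own fields. The sketch below uses SQLite's PRAGMA table_info through Python's sqlite3 module; the customer table is a made-up example, and a real data dictionary usually records more (descriptions, owners, allowed values).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    "customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL, discount_rate REAL)"
)

# PRAGMA table_info returns one row per column:
# (position, name, declared type, not-null flag, default value, primary-key flag)
data_dictionary = [
    {"field": name, "type": col_type, "required": bool(notnull), "primary_key": bool(pk)}
    for _cid, name, col_type, notnull, _default, pk in conn.execute(
        "PRAGMA table_info(customer)"
    )
]
```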

A DBMS uses one of these three primary data models for organizing information:

hierarchical, network, and relational (the most prevalent)

relational database model

Stores information in the form of logically related two-dimensional tables.

relational database management system

Allows users to create, read, update, and delete data in a relational database.

entity (TABLE)

Stores information about a person, place, thing, transaction, or event.

attributes (also called columns or fields)

Data elements associated with an entity.

record

Collection of related data elements.

Each ________ occupies one row in its respective table.

record

primary keys and foreign keys

Used to create logical relationships within the relational database model.

primary key

A field (or group of fields) that uniquely identifies a given entity in a table.

foreign key

A primary key of one table that appears as an attribute in another table and acts to provide a logical relationship between the two tables.
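
A small sketch of how the two keys link tables, using Python's sqlite3 module. The customer and orders tables and their rows are hypothetical; customer_id is the primary key of customer and appears in orders as a foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,   -- primary key: one value per customer
    name        TEXT NOT NULL
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customer(customer_id),  -- foreign key
    total       REAL
);
INSERT INTO customer VALUES (1, 'Ana'), (2, 'Ben');
INSERT INTO orders VALUES (100, 1, 25.0), (101, 1, 40.0), (102, 2, 15.0);
""")

# The shared customer_id value is what lets the two tables be joined.
rows = conn.execute("""
    SELECT c.name, SUM(o.total)
    FROM orders AS o
    JOIN customer AS c ON c.customer_id = o.customer_id
    GROUP BY c.customer_id
    ORDER BY c.customer_id
""").fetchall()
```

The customer's name is stored once in customer and looked up through the key, which is how the relational model reduces redundancy.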

Advantages of using relational databases over a text document or spreadsheet:

Increases flexibility, quality, security, scalability and performance; Reduces redundancy

data redundancy

The duplication of data, or the storage of the same data in multiple places.

information integrity

A measure of the quality of information.

integrity constraints

Rules that help ensure the quality of information.

Two types of integrity constraints:

relational and business critical

relational integrity constraints

Rules that enforce basic and fundamental information-based constraints. (Ex. Would not allow an employee to create an order for a nonexistent customer.)

business critical integrity constraints

Enforce business rules vital to an organization's success and often require more insight and knowledge than relational integrity constraints.
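
Both kinds of constraint can be sketched in SQLite via Python's sqlite3 module. The foreign key stands in for a relational integrity constraint, and the CHECK on discount stands in for a business critical one. The 30 percent discount cap is an invented rule for illustration, as are the table and function names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),  -- relational rule
    discount    REAL CHECK (discount BETWEEN 0 AND 0.30)            -- hypothetical business rule
);
INSERT INTO customer VALUES (1, 'Ana');
""")

def try_insert(order):
    """Attempt to store an order and report whether the database accepted it."""
    try:
        conn.execute("INSERT INTO orders VALUES (?, ?, ?)", order)
        return "accepted"
    except sqlite3.IntegrityError as err:
        return f"rejected: {err}"

results = [
    try_insert((100, 1, 0.10)),   # valid customer, valid discount
    try_insert((101, 99, 0.10)),  # customer 99 does not exist
    try_insert((102, 1, 0.50)),   # discount above the assumed 30 percent cap
]
```

Only the first order is stored; the other two are turned away by the database itself rather than by application code.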

Organizations that establish specific procedures for developing integrity constraints typically see an increase in __________.

Accuracy

One of the primary problems with redundant information is ____________.

inconsistency

data warehouse

A logical collection of information, gathered from many different operational databases, that supports business analysis activities and decision-making tasks.

Primary purpose of data warehouse

To combine information, more specifically, strategic information, throughout an organization into a single repository in such a way that the people who need that information can make decisions and undertake business analysis.

data mart

Contains a subset of data warehouse information.

extraction, transformation, and loading (ETL)

A process that extracts information from internal and external databases, transforms it using a common set of enterprise definitions, and loads it into a data warehouse.
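
The three ETL steps can be sketched in plain Python. Everything here is an assumption for illustration: two made-up sources that record sales in different formats, a common definition of (product, amount in dollars, ISO date), and a list standing in for the data warehouse.

```python
from datetime import datetime

# Extract: rows as they might arrive from two hypothetical operational systems.
store_sales = [{"sku": "A1", "amount_usd": "19.99", "date": "12/03/2023"}]
web_sales = [{"product_code": "a1", "cents": 2500, "day": "2023-12-04"}]

# Transform: map each source onto one shared set of definitions.
def transform_store(row):
    return {
        "product": row["sku"].upper(),
        "amount": float(row["amount_usd"]),
        "date": datetime.strptime(row["date"], "%m/%d/%Y").date().isoformat(),
    }

def transform_web(row):
    return {
        "product": row["product_code"].upper(),
        "amount": row["cents"] / 100,
        "date": row["day"],
    }

# Load: append the standardized rows to the (stand-in) warehouse.
warehouse = []

def load(rows):
    warehouse.extend(rows)

load(transform_store(r) for r in store_sales)
load(transform_web(r) for r in web_sales)
```

After loading, both sales describe the product, amount, and date the same way, so they can be analyzed together.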

information cleansing (or scrubbing)

A process that weeds out and fixes or discards inconsistent, incorrect, or incomplete information.
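
A minimal cleansing sketch, assuming a made-up record layout and a small hypothetical lookup table of state spellings. It fixes inconsistent formatting, discards records it cannot repair, and drops repeats.

```python
# Hypothetical mapping from the spellings seen in the data to one standard code.
STATE_CODES = {"colo.": "CO", "colorado": "CO", "co": "CO"}

def scrub(records):
    """Return cleaned records: standardized, complete, and free of repeats."""
    cleaned, seen = [], set()
    for r in records:
        name = r.get("name", "").strip().title()                       # fix spacing and case
        state = STATE_CODES.get(r.get("state", "").strip().lower())    # standardize state
        if not name or state is None:
            continue  # discard incomplete or unrecognizable records
        if (name, state) in seen:
            continue  # discard repeats of a record already kept
        seen.add((name, state))
        cleaned.append({"name": name, "state": state})
    return cleaned

raw = [
    {"name": " ana lopez ", "state": "Colo."},
    {"name": "Ana Lopez", "state": "colorado"},
    {"name": "", "state": "CO"},
    {"name": "Ben Wu", "state": "??"},
]
scrubbed = scrub(raw)
```

Whether to fix or discard a bad record is a policy decision; this sketch discards anything it cannot standardize.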

Scrubbed information is _________ and _________.

accurate and consistent

data quality audits

Determine the accuracy and completeness of an organization's data.

data mining

The process of analyzing data to extract information not offered by the raw data alone.

Companies use _____________ to compile a complete picture of their operations, all within a single view, allowing them to identify trends and improve forecasts.

data-mining techniques

data-mining tools

A variety of techniques to find patterns and relationships in large volumes of information and infer rules from them that predict future behavior and guide decision making.

data mining

Enables companies to determine the impact on sales, customer satisfaction, and corporate profits and to drill down into summary information to view detail transactional data.

structured data

Data already in a database or spreadsheet.

unstructured data

Data that do not exist in a fixed location and can include text documents, PDFs, voice messages, emails, etc.

text mining

Analyzes unstructured data to find trends and patterns in words and sentences.

web mining

Analyzes unstructured data associated with websites to identify consumer behavior and website navigation.

Three common forms for mining structured and unstructured data:

cluster analysis, association detection, and statistical analysis

cluster analysis

A technique used to divide an information set into mutually exclusive groups such that the members of each group are as close together as possible to one another and the different groups are as far apart as possible.
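
One common way to form such groups is k-means, shown here in one dimension as a sketch. The card does not name a specific algorithm, so k-means is a swapped-in example, and the spending figures and starting centers are invented.

```python
def k_means_1d(values, centers, rounds=10):
    """Repeatedly assign each value to its nearest center, then move each
    center to the mean of the values assigned to it."""
    for _ in range(rounds):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Keep the old center if a group ended up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return groups

# Hypothetical annual spend per customer.
spend = [12, 15, 14, 90, 95, 88, 300, 310]
clusters = k_means_1d(spend, centers=[0, 100, 400])
```

Each value lands in exactly one group (mutually exclusive), values within a group are close together, and the groups sit far apart.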

cluster analysis

This type of analysis has the ability to uncover naturally occurring patterns in information.

A great example of using _______________ in business is to create target-marketing strategies based on zip codes.

cluster analysis

association detection

Reveals the relationship between variables along with the nature and frequency of the relationships.

association detection

Create rules to determine the likelihood of events occurring together at a particular time or following each other in a logical progression; percentages are used to reflect the patterns of the event.

association detection

"55 percent of the time, events A and B occurred together" is an example of which data-mining form?

market basket analysis

Analyzes such items as Web sites and checkout scanner information to detect customers' buying behavior and predict future behavior by identifying affinities among customers' choices of products and services.
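
Statements like "X percent of the time, A and B occurred together" can be computed by counting pairs. This is a small sketch with invented checkout baskets; together_pct is a hypothetical helper name.

```python
from collections import Counter
from itertools import combinations

# Hypothetical checkout baskets, one set of items per transaction.
baskets = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"milk", "eggs"},
    {"bread", "butter"},
]

# Count how many baskets contain each pair of items.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

def together_pct(a, b):
    """Percentage of all baskets that contain both a and b."""
    return 100 * pair_counts[tuple(sorted((a, b)))] / len(baskets)
```

Here together_pct("bread", "milk") reports how often the two items were bought together, which is the kind of affinity a market basket analysis looks for.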

statistical analysis

Performs such functions as information correlations, distributions, calculations, and variance analysis.

_________ is a common form of statistical analysis.

forecasting

time-series information

Time-stamped information collected at a particular frequency.

forecasts

Predictions based on time-series information.
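
A very simple forecast from time-series information is a moving average: predict the next period as the mean of the most recent periods. The sketch below assumes invented monthly sales and a three-month window; real forecasting models are usually more elaborate.

```python
# Hypothetical sales per month, oldest first.
monthly_sales = [100, 110, 120, 130, 125, 135]

def moving_average_forecast(series, window=3):
    """Forecast the next value as the average of the last `window` values."""
    recent = series[-window:]
    return sum(recent) / len(recent)

next_month = moving_average_forecast(monthly_sales)
```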

Web visits per hour, sales per month, and calls per day are all examples of ______________.

time-series information

Forecasting models allow organizations to ______________________ when making decisions.

consider all sorts of variables

Business intelligence provides data that is:

reliable, consistent, understandable, and easily manipulated

Cube

representation of multidimensional information (layers and columns)
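
A cube can be pictured as values indexed by several dimensions at once. In this sketch the dimensions (product, region, month), the figures, and the roll_up helper are all assumptions for illustration.

```python
from collections import defaultdict

DIMENSIONS = ("product", "region", "month")

# Each cell of the cube: (product, region, month) -> sales
cube = {
    ("A", "East", "Jan"): 10,
    ("A", "West", "Jan"): 20,
    ("B", "East", "Jan"): 5,
    ("A", "East", "Feb"): 15,
}

def roll_up(cube, dimension):
    """Total the cube along one dimension, collapsing the other two."""
    axis = DIMENSIONS.index(dimension)
    totals = defaultdict(int)
    for key, value in cube.items():
        totals[key[axis]] += value
    return dict(totals)
```

Calling roll_up with a different dimension name views the same data by product, by region, or by month.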

A data warehouse has a more ________.
A data mart has a more ________.

organizational focus
functional focus

Data Driven Website

interactive website kept constantly updated and relevant to the needs of its customers using a database

Physical vs Logical View in Flexibility

Physical: physical storage of information on a storage device
Logical: focuses on how individual users logically access information to meet their own particular business needs
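
Database views are one way to give different users different logical views of the same physically stored data. The sketch below uses SQLite through Python's sqlite3 module; the employee table and the two view names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Physical storage: one table holding every field.
CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, department TEXT, salary REAL);
INSERT INTO employee VALUES (1, 'Ana', 'Sales', 60000), (2, 'Ben', 'IT', 70000);

-- Logical views: each exposes only what one group of users needs.
CREATE VIEW directory AS SELECT name, department FROM employee;
CREATE VIEW payroll   AS SELECT id, salary FROM employee;
""")

directory_rows = conn.execute("SELECT * FROM directory ORDER BY name").fetchall()
payroll_rows = conn.execute("SELECT * FROM payroll ORDER BY id").fetchall()
```

The data is stored once, yet the directory user and the payroll user each see a layout suited to their own task.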

data mining techniques

classification, estimation, affinity grouping, clustering

classification

assigns a record to one of a predefined set of classes

estimation

determines values for an unknown continuous variable, such as a behavior or an estimated future value

affinity grouping

determines which things go together.

clustering

segments a heterogeneous population of records into a number of more homogeneous subgroups