information granularity
The extent of detail within the information (fine and detailed or coarse and abstract).
Levels of organizational information.
Individual, department, and enterprise
Formats of organizational information.
document, presentation, spreadsheet, and database
Four primary traits that help determine the value of information
information type, timeliness, quality, and governance
Employees must be able to correlate
Information Levels
Information Formats
Information Granularities
The two primary types of information:
transactional and analytical
transactional information
Encompasses all of the information contained within a single business process or unit of work, and its primary purpose is to support daily operational tasks.
analytical information
Encompasses all organizational information and its primary purpose is to support the performing of managerial analysis tasks.
real-time information
Immediate, up-to-date information.
real-time system
Provides real-time information in response to requests.
real-time information
Important for making faster and more effective decisions, keeping smaller inventories, and helping companies operate more efficiently.
continual change
One of the biggest pitfalls associated with real-time information.
data inconsistency
Occurs when the same data element has different values.
data integrity issues
Occur when a system produces incorrect, inconsistent, or duplicate data.
The five characteristics common to high-quality information:
accuracy, completeness, consistency, timeliness, and uniqueness
If the customer's first name is missing, there is an invalid area code, or the street address contains only a number and not a street name, it would affect ___________.
completeness
Similar street addresses and phone numbers may create duplicate data, which affects __________.
consistency
If a customer's phone and fax numbers are the same or the phone number is listed under email, this would affect __________.
accuracy
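The three fill-in cards above can be sketched as simple record checks. This is only an illustration: the customer records, the field names, and the `quality_flags` helper are all invented here, and real duplicate detection would need fuzzier matching than an exact address-and-phone comparison.

```python
from collections import Counter

# Hypothetical customer records; every value is made up for illustration.
customers = [
    {"id": 1, "first": "", "last": "Smith", "street": "12 Oak St",
     "phone": "555-0101", "fax": "555-0199"},
    {"id": 2, "first": "Pat", "last": "Jones", "street": "9 Elm Ave",
     "phone": "555-0102", "fax": "555-0102"},
    {"id": 3, "first": "Lee", "last": "Smyth", "street": "12 Oak St",
     "phone": "555-0101", "fax": "555-0198"},
]

def quality_flags(records):
    """Return {record id: [quality characteristics that look violated]}."""
    flags = {r["id"]: [] for r in records}
    # Count how many records share the same street address and phone number.
    shared = Counter((r["street"], r["phone"]) for r in records)
    for r in records:
        if not r["first"].strip():
            flags[r["id"]].append("completeness")  # missing first name
        if shared[(r["street"], r["phone"])] > 1:
            flags[r["id"]].append("consistency")   # probable duplicate
        if r["phone"] == r["fax"]:
            flags[r["id"]].append("accuracy")      # phone and fax identical
    return flags

flags = quality_flags(customers)
```

Records 1 and 3 share an address and phone number but spell the last name differently, which is the kind of probable duplicate the consistency card describes.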
Reasons for low-quality information:
customers initially enter inaccurate information, different systems have different formats, use of abbreviations, or errors in third-party and external information
Serious business consequences that occur due to using low-quality information to make decisions:
Inability to accurately track customers, identify most valuable customers, identify selling opportunities, and build strong customer relationships; lost revenue opportunities from marketing to nonexistent customers; cost of sending nondeliverable email; d
Data Governance
The overall management of the availability, usability, integrity, and security of company data.
Policy of who is accountable for data.
database
Maintains information about various types of objects (inventory), events (transactions), people (employees), and places (warehouses).
database management system (DBMS)
Creates, reads, updates, and deletes data in a database while controlling access and security.
Actual manipulation of data in a database
database management system (DBMS)
Used to answer questions such as how many customers purchased Product A in December or what were the average sales by region.
Two primary tools available for retrieving information from a DBMS:
query-by-example (QBE) and structured query language (SQL)
query-by-example (QBE)
A tool that helps users graphically design the answer to a question against a database.
structured query language (SQL)
Asks users to write lines of code to answer questions against a database.
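A minimal sketch of answering the two example questions (customers who purchased Product A in December, average sales by region) with SQL. It uses Python's built-in `sqlite3` module; the `sales` table, its columns, and its rows are assumptions made up for the example.

```python
import sqlite3

# In-memory database with a hypothetical sales table (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales ("
    "customer TEXT, product TEXT, region TEXT, sale_month INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?, ?)",
    [
        ("Ann", "Product A", "East", 12, 100.0),
        ("Bob", "Product A", "West", 12, 50.0),
        ("Ann", "Product A", "East", 12, 60.0),
        ("Cy", "Product B", "West", 12, 150.0),
        ("Dee", "Product A", "East", 11, 80.0),
    ],
)

# How many (distinct) customers purchased Product A in December?
december_buyers = conn.execute(
    "SELECT COUNT(DISTINCT customer) FROM sales "
    "WHERE product = 'Product A' AND sale_month = 12"
).fetchone()[0]

# What were the average sales by region?
avg_by_region = dict(
    conn.execute("SELECT region, AVG(amount) FROM sales GROUP BY region")
)
```

A QBE tool would let a manager build the same two questions by filling in a grid instead of typing the `SELECT` statements.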
___________ typically interact with QBE tools.
Managers
___________ have the skills required to code SQL.
MIS professionals
MySQL, Microsoft Access, SQL Server, FileMaker, Oracle, and FoxPro are examples of ___________.
database management systems (DBMSs)
data models
Logical data structures that detail the relationships among data elements using graphics or pictures.
data element
The smallest or basic unit of information. (FIELD)
metadata
Provides details about the data.
Data elements include:
customer's name, address, email, discount rate, preferred shipping method, product name, quantity ordered, etc.
data dictionary
Compiles all of the metadata about the data elements in the data model.
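One way to picture metadata and a data dictionary: most DBMSs can report details about each data element in a table. The sketch below asks SQLite for that metadata through its `PRAGMA table_info` statement. The `customers` table and the dictionary layout are assumptions for illustration, not a standard data dictionary format.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table whose columns are the data elements being described.
conn.execute(
    "CREATE TABLE customers ("
    "customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL, discount_rate REAL)"
)

# PRAGMA table_info returns one row of metadata per column:
# (cid, name, declared type, notnull flag, default value, primary-key position)
data_dictionary = [
    {
        "element": name,
        "type": col_type,
        "required": bool(notnull),
        "primary_key": bool(pk),
    }
    for _cid, name, col_type, notnull, _default, pk
    in conn.execute("PRAGMA table_info(customers)")
]
```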
DBMSs use these three primary data models for organizing information:
hierarchical, network, and relational (the most prevalent)
relational database model
Stores information in the form of logically related two-dimensional tables.
relational database management system
Allows users to create, read, update, and delete data in a relational database.
entity (TABLE)
Stores information about a person, place, thing, transaction, or event.
attributes (also called columns or fields)
Data elements associated with an entity.
record
Collection of related data elements.
Each ________ occupies one row in its respective table.
record
primary keys and foreign keys
Used to create logical relationships within the relational database model.
primary key
A field (or group of fields) that uniquely identifies a given entity in a table.
foreign key
A primary key of one table that appears as an attribute in another table and acts to provide a logical relationship between the two tables.
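A small sketch of how a primary key and a foreign key link two tables, again using `sqlite3`. The `customers` and `orders` tables and their rows are hypothetical. `customer_id` is the primary key of `customers` and appears in `orders` as a foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# customers.customer_id is the primary key: one unique value per record.
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
# orders.customer_id is a foreign key: it holds a customers primary key value.
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, "
    "customer_id INTEGER NOT NULL REFERENCES customers(customer_id), "
    "product TEXT NOT NULL)"
)
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(10, 1, "Widget"), (11, 1, "Gadget"), (12, 2, "Widget")],
)

# The shared key lets the DBMS relate each order to its customer.
rows = conn.execute(
    "SELECT c.name, o.product FROM orders o "
    "JOIN customers c ON c.customer_id = o.customer_id "
    "ORDER BY o.order_id"
).fetchall()
```

Because the customer name is stored once in `customers` and only referenced from `orders`, the design also avoids repeating the name on every order.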
Advantages of using relational databases over a text document or spreadsheet:
Increases flexibility, quality, security, scalability and performance; Reduces redundancy
data redundancy
The duplication of data, or the storage of the same data in multiple places.
information integrity
A measure of the quality of information.
integrity constraints
Rules that help ensure the quality of information.
Two types of integrity constraints:
relational and business critical
relational integrity constraints
Rules that enforce basic and fundamental information-based constraints. (Ex. Would not allow an employee to create an order for a nonexistent customer.)
business critical integrity constraints
Enforce business rules vital to an organization's success and often require more insight and knowledge than relational integrity constraints.
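The two kinds of constraint can be sketched in SQLite. The foreign key plays the role of a relational integrity constraint (no order for a nonexistent customer). The `CHECK` clause stands in for a business critical constraint; the 30 percent discount ceiling is a made-up business rule, and the table names and `try_insert` helper are likewise assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, "
    # Relational integrity constraint: the customer must exist.
    "customer_id INTEGER NOT NULL REFERENCES customers(customer_id), "
    # Business critical constraint (hypothetical rule): discount capped at 30%.
    "discount_pct REAL NOT NULL CHECK (discount_pct BETWEEN 0 AND 30))"
)
conn.execute("INSERT INTO customers VALUES (1, 'Ann')")

def try_insert(order):
    """Attempt to insert an order; report whether the DBMS accepted it."""
    try:
        conn.execute("INSERT INTO orders VALUES (?, ?, ?)", order)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

results = [
    try_insert((100, 1, 10.0)),   # valid customer, valid discount
    try_insert((101, 99, 10.0)),  # customer 99 does not exist
    try_insert((102, 1, 75.0)),   # discount breaks the business rule
]
```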
Organizations that establish specific procedures for developing integrity constraints typically see an increase in __________.
Accuracy
One of the primary problems with redundant information is ____________.
inconsistency
data warehouse
A logical collection of information, gathered from many different operational databases, that supports business analysis activities and decision-making tasks.
Primary purpose of data warehouse
To combine information, more specifically, strategic information, throughout an organization into a single repository in such a way that the people who need that information can make decisions and undertake business analysis.
data mart
Contains a subset of data warehouse information.
extraction, transformation, and loading (ETL)
A process that extracts information from internal and external databases, transforms it using a common set of enterprise definitions, and loads it into a data warehouse.
information cleansing (or scrubbing)
A process that weeds out and fixes or discards inconsistent, incorrect, or incomplete information.
Scrubbed information is _________ and _________.
accurate and consistent
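A toy sketch of ETL with information cleansing. Two hypothetical source systems record the same facts in different formats; the transform step applies one common set of definitions, and incomplete or duplicate records are discarded before loading. Every name and value here is invented for illustration, and a plain Python list stands in for the data warehouse.

```python
# Two hypothetical source systems that format the same facts differently.
sales_system = [
    {"cust": "ANN LEE", "state": "Calif.", "amount": "100.50"},
    {"cust": "bob ray", "state": "CA", "amount": "20"},
    {"cust": "", "state": "NY", "amount": "15"},  # incomplete: no name
]
billing_system = [
    {"customer_name": "Ann Lee", "st": "California", "total": 100.5},
]

# Common enterprise definition for states (deliberately tiny lookup table).
STATE_CODES = {"calif.": "CA", "california": "CA", "ca": "CA", "ny": "NY"}

def transform(name, state, amount):
    """Standardize one record, or return None if it cannot be cleansed."""
    name = " ".join(name.split()).title()
    code = STATE_CODES.get(state.strip().lower())
    if not name or code is None:
        return None  # discard incomplete or unrecognized information
    return (name, code, round(float(amount), 2))

def run_etl():
    # Extract: pull rows from each source into one common shape.
    extracted = [(r["cust"], r["state"], r["amount"]) for r in sales_system]
    extracted += [(r["customer_name"], r["st"], r["total"]) for r in billing_system]
    # Transform and load: keep only clean rows, and load each only once.
    warehouse = []
    for row in extracted:
        clean = transform(*row)
        if clean is not None and clean not in warehouse:
            warehouse.append(clean)
    return warehouse

warehouse = run_etl()
```

After the run, the two differently formatted Ann Lee records collapse into one, and the record with no name is dropped.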
data quality audits
Determine the accuracy and completeness of an organization's data.
data mining
The process of analyzing data to extract information not offered by the raw data alone.
Companies use _____________ to compile a complete picture of their operations, all within a single view, allowing them to identify trends and improve forecasts.
data-mining techniques
data-mining tools
Use a variety of techniques to find patterns and relationships in large volumes of information and infer rules from them that predict future behavior and guide decision making.
data mining
Enables companies to determine the impact on sales, customer satisfaction, and corporate profits and to drill down into summary information to view detail transactional data.
structured data
Data already in a database or spreadsheet.
unstructured data
Data that do not exist in a fixed location and can include text documents, PDFs, voice messages, emails, etc.
text mining
Analyzes unstructured data to find trends and patterns in words and sentences.
web mining
Analyzes unstructured data associated with websites to identify consumer behavior and website navigation.
Three common forms for mining structured and unstructured data:
cluster analysis, association detection, and statistical analysis
cluster analysis
A technique used to divide an information set into mutually exclusive groups such that the members of each group are as close together as possible to one another and the different groups are as far apart as possible.
cluster analysis
This type of analysis has the ability to uncover naturally occurring patterns in information.
A great example of using _______________ in business is to create target-marketing strategies based on zip codes.
cluster analysis
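A minimal sketch of cluster analysis using k-means. The cards do not name an algorithm, so k-means is an assumption chosen because it is a common clustering method. The points are invented, and taking the first `k` points as starting centroids is a simplification that keeps the example deterministic.

```python
def kmeans(points, k, iterations=10):
    """Group 2-D points into k mutually exclusive clusters (basic k-means)."""
    centroids = list(points[:k])  # naive, deterministic starting centroids
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        # Assignment step: each point joins its nearest centroid's cluster.
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: (p[0] - centroids[i][0]) ** 2
                + (p[1] - centroids[i][1]) ** 2,
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c))
            if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return clusters

# Hypothetical customers plotted on two made-up measures.
points = [(1, 1), (9, 9), (1, 2), (2, 1), (8, 9), (9, 8)]
clusters = kmeans(points, k=2)
```

Each point lands in exactly one group, with members close to one another and the two groups far apart.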
association detection
Reveals the relationship between variables along with nature and frequency of the relationships.
association detection
Create rules to determine the likelihood of events occurring together at a particular time or following each other in a logical progression; percentages are used to reflect the patterns of the event.
association detection
"55 percent of the time, events A and B occurred together" is an example under which data mining form?
market basket analysis
Analyzes such items as Web sites and checkout scanner information to detect customers' buying behavior and predict future behavior by identifying affinities among customers' choices of products and services.
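Statements such as "X percent of the time, A and B occurred together" can be computed directly from checkout data. The sketch below is a simplified illustration: the baskets are invented, and `pair_support` and `confidence` are hypothetical helper names for two measures commonly used in association detection.

```python
from collections import Counter
from itertools import combinations

# Hypothetical checkout baskets (illustrative data only).
baskets = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"bread", "eggs"},
    {"milk"},
]

def pair_support(baskets):
    """Fraction of all baskets in which each pair of items occurs together."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n / len(baskets) for pair, n in counts.items()}

def confidence(baskets, a, b):
    """Of the baskets containing item a, the fraction that also contain b."""
    with_a = [basket for basket in baskets if a in basket]
    return sum(1 for basket in with_a if b in basket) / len(with_a)

support = pair_support(baskets)
bread_then_milk = confidence(baskets, "bread", "milk")
```

Here bread and milk appear together in half of all baskets, and two of the three baskets containing bread also contain milk.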
statistical analysis
Performs such functions as information correlations, distributions, calculations, and variance analysis.
_________ is a common form of statistical analysis.
forecasting
time-series information
Time-stamped information collected at a particular frequency.
forecasts
Predictions based on time-series information.
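A very small forecasting sketch based on time-series information. The moving-average method and the monthly sales figures are assumptions chosen for simplicity; real forecasting models usually account for trend, seasonality, and other variables.

```python
def moving_average_forecast(series, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    if len(series) < window:
        raise ValueError("need at least `window` observations")
    return sum(series[-window:]) / window

# Hypothetical sales per month, oldest first.
monthly_sales = [100, 110, 120, 130, 140, 150]
forecast = moving_average_forecast(monthly_sales)
```

The forecast averages the three most recent months (130, 140, and 150).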
Web visits per hour, sales per month, and calls per day are all examples of ______________.
time-series information
Forecasting models allow organizations to ______________________ when making decisions.
consider all sorts of variables
Business Intelligence enables data that is:
reliable, consistent, understandable, and easily manipulated
Cube
representation of multidimensional information (layers and columns)
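One way to picture a cube is as facts indexed by several dimensions that can be summarized along any of them. The sketch below is an illustration only: the dimensions (product, region, month), the figures, and the `roll_up` helper are all made up.

```python
from collections import defaultdict

# Each fact: (product, region, month, sales). Illustrative data only.
facts = [
    ("A", "East", "Jan", 100),
    ("A", "West", "Jan", 50),
    ("B", "East", "Jan", 70),
    ("A", "East", "Feb", 30),
]

def roll_up(facts, dims):
    """Sum sales over the dimensions at the given positions (0, 1, or 2)."""
    totals = defaultdict(int)
    for fact in facts:
        key = tuple(fact[d] for d in dims)
        totals[key] += fact[3]
    return dict(totals)

by_product = roll_up(facts, dims=(0,))               # summary view
by_product_region = roll_up(facts, dims=(0, 1))      # one level of drill-down
```

Moving from `by_product` to `by_product_region` is a simple form of drilling down from summary information toward the detailed records.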
Data warehouse has a more ______
Data mart has a more ________
organizational focus
functional focus
Data Driven Website
interactive website kept constantly updated and relevant to the needs of its customers using a database
Physical vs Logical View in Flexibility
Physical: physical storage of information on a storage device
Logical: focuses on how individual users logically access information to meet their own particular business needs
data mining techniques
classification, estimation, affinity grouping, clustering
classification
assigns records to one of a predefined set of classes
estimation
determines the value of an unknown continuous variable, such as an estimated future value
affinity grouping
determines which things go together.
clustering
segments a heterogeneous population of records into a number of more homogeneous subgroups