BCOMP FINAL

XML

a markup language that represents and stores data; a complement to HTML; passive; hierarchical

XML___data and HTML_____data

XML stores and represents data; HTML displays data
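
A minimal sketch of XML as passive, hierarchical data, using Python's standard xml.etree.ElementTree module. The sample document and its element names (library, book, title, year) are made up for illustration; the XML only holds the data, and the program decides what to do with it.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: element and attribute names are invented.
xml_text = """
<library>
  <book id="b1">
    <title>Database Basics</title>
    <year>2001</year>
  </book>
  <book id="b2">
    <title>Learning XML</title>
    <year>2003</year>
  </book>
</library>
"""

# Parse the plain-text XML into a tree of elements.
root = ET.fromstring(xml_text.strip())

# Walk the hierarchy: library -> book -> title.
titles = [book.find("title").text for book in root.findall("book")]
ids = [book.get("id") for book in root.findall("book")]

print(root.tag, titles, ids)
```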

Advantages of XML

computer systems and databases contain data in incompatible formats; XML data is stored in plain text format, which makes it much easier to create data that can be shared by different apps

when to use relational database

need to maximize performance for data retrieval, data is processed later as relational data, data components have meaning outside a hierarchy, referential integrity is required

when to use XML

when you need maximum flexibility, data attributes apply to all data or to only a small subset of the data, ratio of data complexity to volume is high

database

helps people track things of interest to them, data is stored in tables, stores data and relationships

why we need data management

growing demand for more and better data, access to high quality data from multiple sources, integrated data that is consistent and meaningful, useful aggregations of data.

steps in data management

gathering, validating, storing, retrieving, manipulating, and presenting data

2 problems that databases solve

data dependency, data redundancy

data dependency

when data maintained in one silo depends upon data maintained in another silo

data redundancy

duplicate copies of the same data are stored, which wastes storage space

file is also called

table, relation

record is also called

row, tuple

field is also called

column, attribute
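
A small sketch tying the three sets of terms together, using Python's built-in sqlite3 module. The student table and its contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The table (file / relation); its columns are the fields / attributes.
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, major TEXT)")

# Each inserted row is a record / tuple.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Ana", "BCOMP"), (2, "Ben", "Math")],
)

cur = conn.execute("SELECT * FROM student ORDER BY id")
columns = [d[0] for d in cur.description]  # column (field) names
rows = cur.fetchall()                      # rows come back as Python tuples

print(columns)
print(rows)
```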

Querying

a precise request for info from an RDBMS, usually involves the choice of query criteria
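
As an illustration, a query with one criterion might look like the sketch below. It uses Python's built-in sqlite3 module; the orders table, its columns, and the 50.0 threshold are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [("Ana", 20.0), ("Ben", 75.5), ("Ana", 120.0)],
)

# The WHERE clause holds the query criteria; "?" is a bound parameter.
min_total = 50.0
big_orders = conn.execute(
    "SELECT customer, total FROM orders WHERE total >= ? ORDER BY total",
    (min_total,),
).fetchall()

print(big_orders)
```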

Normalization

allows you to improve a logical design so that it satisfies certain constraints and avoids unnecessary duplication of data, generally involves breaking down relations into smaller, well-structured relations
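
A rough sketch of the idea in plain Python: one flat relation that repeats customer details on every order is broken down into two smaller relations. All names and values are hypothetical, and this shows only the splitting step, not a formal normal-form procedure.

```python
# Flat relation: customer name and city are duplicated on every order row.
flat_orders = [
    {"order_id": 1, "customer_id": "c1", "customer_name": "Ana", "customer_city": "Lima", "item": "pen"},
    {"order_id": 2, "customer_id": "c1", "customer_name": "Ana", "customer_city": "Lima", "item": "ink"},
    {"order_id": 3, "customer_id": "c2", "customer_name": "Ben", "customer_city": "Oslo", "item": "pad"},
]

customers = {}  # one entry per customer, keyed by customer_id
orders = []     # orders keep only a reference (customer_id) to the customer

for row in flat_orders:
    customers[row["customer_id"]] = {
        "name": row["customer_name"],
        "city": row["customer_city"],
    }
    orders.append({
        "order_id": row["order_id"],
        "customer_id": row["customer_id"],
        "item": row["item"],
    })

print(customers)
print(orders)
```

After the split, each customer's details are stored once, so a change of city is made in a single place.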

data mining

the exploration and analysis of large quantities of data in order to discover meaningful patterns and rules

data warehouse

a copy of transaction data specifically structured for querying, analysis, and reporting (i.e., data mining)

data mart

smaller, more focused data warehouse, typically reflects the business rules of a specific business unit within an enterprise

directed data mining

attempts to explain or categorize some particular target field such as income or response

undirected data mining

attempts to find patterns or similarities among groups of records without the use of a particular target field or collection of predefined classes

classification

examining features of an object and assigning it to one of a predefined set of classes
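
One simple way to do this is a nearest-neighbor rule, sketched below: a new object gets the class of the most similar labeled example. The fruit data, the feature choice (weight in grams, diameter in cm), and the function name classify are all invented for illustration; nearest neighbor is only one of many classification techniques.

```python
import math

# Labeled examples: (features, class). The classes are predefined.
training = [
    ((150, 7.0), "apple"),
    ((170, 7.5), "apple"),
    ((5, 1.5), "grape"),
    ((6, 1.8), "grape"),
]

def classify(features):
    """Return the class of the closest labeled example (1-nearest neighbor)."""
    _, label = min(training, key=lambda ex: math.dist(ex[0], features))
    return label

print(classify((160, 7.2)))
print(classify((4, 1.2)))
```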

estimation

generating an output value based on a set of input quantities
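
A minimal sketch, assuming a single input quantity and a straight-line relationship fitted by ordinary least squares. The data points and the function name estimate are hypothetical; real estimation tasks usually involve several inputs.

```python
# Known (input, output) pairs used to fit the line.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)

# Ordinary least squares for one input variable.
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
intercept = mean_y - slope * mean_x

def estimate(x):
    """Return the estimated output value for input x."""
    return slope * x + intercept

print(estimate(5.0))
```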

prediction

generating statements about future states or future behavior

affinity grouping

determining which things go together
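
A common illustration is market-basket analysis: count how often pairs of items show up in the same basket. The baskets below are made-up data, and counting pairs is only the simplest form of the idea.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "jam"},
    {"bread", "milk", "jam"},
]

pair_counts = Counter()
for basket in baskets:
    # Sort so each pair is always counted under the same key.
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

top_pair, top_count = pair_counts.most_common(1)[0]
print(top_pair, top_count)
```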

clustering

partitioning or segmenting objects into a number of homogeneous subgroups
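
A small sketch using k-means on one-dimensional data, with no target field or predefined classes. The points, the starting centroids, and the function name kmeans_1d are assumptions for the example; k-means is one clustering technique among several.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Group 1-D points around the nearest centroid, then recompute centroids."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
    return centroids, clusters

points = [1.0, 1.5, 2.0, 10.0, 10.5, 11.0]
centroids, clusters = kmeans_1d(points, centroids=[points[0], points[-1]])
print(centroids, clusters)
```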