XML
a markup language, represents and stores data, complement to HTML; passive; hierarchical
XML___data and HTML_____data
XML: stores and represents data; HTML: displays data
Advantages of XML
computer systems and databases contain data in incompatible formats; XML data is stored in plain text format, which makes it much easier to create data that can be shared by different apps
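A minimal sketch of the plain-text, hierarchical idea, using Python's standard `xml.etree.ElementTree` module. The document below (`library`, `book`, `title`, `author`) is a made-up example, not taken from these notes. Note that the second book has no author: XML tolerates attributes that apply to only some records.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document; all element and attribute names are illustrative.
xml_text = """
<library>
  <book id="1">
    <title>Database Basics</title>
    <author>A. Author</author>
  </book>
  <book id="2">
    <title>Markup in Practice</title>
  </book>
</library>
""".strip()

# Any application that can read text can parse the same data.
root = ET.fromstring(xml_text)

# Walk the hierarchy: library -> book -> title / author.
titles = [book.findtext("title") for book in root.findall("book")]

# A missing <author> is fine; fall back to a default value.
authors = [book.findtext("author", default="unknown") for book in root.findall("book")]

print(titles)
print(authors)
```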
when to use relational database
need to maximize performance for data retrieval, data is processed later as relational data, data components have meaning outside a hierarchy, referential integrity is required
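Referential integrity can be illustrated with a small sketch using Python's built-in `sqlite3` module. The `customer` and `orders` tables are hypothetical. SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`, so that line is part of the assumption here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default

# Hypothetical tables: every order must point at an existing customer.
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customer(id))"
)

conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Ana')")
conn.execute("INSERT INTO orders (id, customer_id) VALUES (10, 1)")  # valid reference

# An order for a customer that does not exist is rejected by the database.
try:
    conn.execute("INSERT INTO orders (id, customer_id) VALUES (11, 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
conn.close()
```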
when to use XML
when you need max flexibility, data attributes apply to all data or to only a small subset of the data, ratio of data complexity to volume is high
database
helps people track things of interest to them, data is stored in tables, stores data and relationships
why we need data management
growing demand for more and better data, access to high quality data from multiple sources, integrated data that is consistent and meaningful, useful aggregations of data.
steps in data management
gathering, validating, storing, retrieving, manipulating, and presenting data
2 problems that databases solve
data dependency, data redundancy
data dependency
when data maintained in one silo depends on data maintained in another silo
data redundancy
wastes storage space to house duplicate data
file is also called
table, relation
record is also called
row, tuple
field is also called
column, attribute
Querying
a precise request for info from an RDBMS, usually involves the choice of query criteria
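As a rough illustration of a query with criteria, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The `customer` table and its rows are invented for the example; the `WHERE` clause carries the query criteria.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table and data, for illustration only.
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customer (name, city) VALUES (?, ?)",
    [("Ana", "Lisbon"), ("Ben", "Oslo"), ("Cy", "Lisbon")],
)

# The query: a precise request, with the criterion city = 'Lisbon'.
rows = conn.execute(
    "SELECT name FROM customer WHERE city = ? ORDER BY name",
    ("Lisbon",),
).fetchall()

conn.close()
print(rows)
```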
Normalization
allows you to improve a logical design so that it satisfies certain constraints and avoids unnecessary duplication of data, generally involves breaking down relations into smaller, well-structured relations
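One way to picture normalization is splitting a flat relation that repeats facts into two smaller relations. The sketch below uses plain Python dictionaries with invented order data; it shows the idea of removing duplication, not a full normal-form procedure.

```python
# Hypothetical flat relation: the customer's city is repeated on every order.
orders_flat = [
    {"order_id": 1, "customer": "Ana", "city": "Lisbon", "item": "pen"},
    {"order_id": 2, "customer": "Ana", "city": "Lisbon", "item": "ink"},
    {"order_id": 3, "customer": "Ben", "city": "Oslo", "item": "pad"},
]

customers = {}  # one entry per customer: the city is stored once
orders = []     # orders keep only a reference to the customer

for row in orders_flat:
    customers[row["customer"]] = {"city": row["city"]}
    orders.append(
        {"order_id": row["order_id"], "customer": row["customer"], "item": row["item"]}
    )

print(customers)
print(orders)
```

After the split, changing Ana's city means updating one entry instead of every order row.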
data mining
the exploration and analysis of large quantities of data in order to discover meaningful patterns and rules
data warehouse
a copy of transaction data specifically structured for querying, analysis, and reporting (i.e. data mining)
data mart
smaller, more focused data warehouse, typically reflects the business rules of a specific business unit within an enterprise
directed data mining
attempts to explain or categorize some particular target field such as income or response
undirected data mining
attempts to find patterns or similarities among groups of records without the use of a particular target field or collection of predefined classes
classification
examining features of an object and assigning it to one of a predefined set of classes
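A minimal sketch of classification, assuming a nearest-neighbor rule (one of many possible techniques, chosen here only for brevity). The labeled points and the class names `small` and `large` are invented; the key point is that the classes are predefined.

```python
# Hypothetical labeled examples: (features, predefined class).
labeled = [
    ((1.0, 1.0), "small"),
    ((1.5, 2.0), "small"),
    ((8.0, 9.0), "large"),
    ((9.0, 8.5), "large"),
]

def squared_distance(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def classify(point):
    """Assign the class of the closest labeled example (1-nearest neighbor)."""
    _, label = min(labeled, key=lambda example: squared_distance(example[0], point))
    return label

print(classify((2.0, 1.0)))
print(classify((7.5, 8.0)))
```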
estimation
generate an output value based on a set of input quantities
prediction
generating statements about future states or future behavior
affinity grouping
determining which things go together
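Affinity grouping is often illustrated with market baskets. The sketch below counts how often pairs of items appear together, using only the standard library; the baskets are made-up data, and simple pair counting stands in for fuller association-rule methods.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]

pair_counts = Counter()
for basket in baskets:
    # Sort so each pair is always counted under the same key.
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

most_common_pair, count = pair_counts.most_common(1)[0]
print(most_common_pair, count)
```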
clustering
partitioning or segmenting objects into a number of homogeneous subgroups
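A minimal sketch of clustering, assuming a one-dimensional k-means procedure (one possible technique among many). The values and starting centers are invented. Unlike classification, no predefined classes are given; the groups emerge from the data.

```python
def kmeans_1d(values, centers, iterations=10):
    """Very small 1-D k-means: assign each value to the nearest center, then re-average."""
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# Hypothetical data with two obvious subgroups.
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centers, groups = kmeans_1d(values, centers=[1.0, 12.0])
print(centers)
print(groups)
```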