Disk Storage Devices
Preferred secondary storage device for high storage capacity and low cost.
Data stored as magnetized areas on magnetic disk surfaces.
disk pack
contains several magnetic disks connected to a rotating spindle.
track
divided into smaller blocks or sectors
physical disk block (hardware) address
consists of:
a cylinder number (imaginary collection of tracks of same radius from all recorded surfaces)
the track number or surface number (within the cylinder) and block number (within track).
Double buffering
can be used to speed up the transfer of contiguous disk blocks.
Records
Fixed and variable length records
Records contain fields which have values of a particular type
E.g., amount, date, time, age
Fields themselves may be fixed length or variable length
Variable length fields can be mixed into one record
Blocking
Refers to storing a number of records in one block on the disk.
Blocking factor
refers to the number of records per block.
the (average) number of file records stored in a disk block.
Spanned Records
Refers to records that exceed the size of one or more blocks and hence span a number of blocks.
file
sequence of records, where each record is a collection of data values (or data items).
can have fixed-length records or variable-length records.
can be unspanned or spanned
file descriptor (or file header)
includes information that describes the file, such as the field names and their data types, and the addresses of the file blocks on dis
Unspanned
no record can span two blocks
Spanned
a record can be stored in more than one block
physical disk blocks
can be contiguous, linked, or indexed.
Unordered Files
Also called a heap or a pile file.
New records are inserted at the end of the file.
A linear search through the file records is necessary to search for a record.
This requires reading and searching half the file blocks on the average, and is hence quite e
Ordered Files
Also called a sequential file.
File records are kept sorted by the values of an ordering field.
Insertion is expensive: records must be inserted in the correct order.
It is common to keep a separate unordered overflow (or transaction) file for new records
External Hashing
Hashing for disk files
Collisions
occur when a new record hashes to a bucket that is already full.
Open addressing
From the hash address, check records in order until an unused (empty) position is found.. Coalesced start at end of file and work back to empty position
Chaining
The new record is placed in an unused overflow location and the pointer of the occupied hash address location is set to the address of that overflow location.
Multiple hashing
The program applies a second hash function if the first results in a collision. If another collision results, the program uses open addressing or applies a third hash function and then uses open addressing if necessary.
RAID
Redundant Arrays of Inexpensive Disks.
A large array of small independent disks acting as a single higher-performance logical disk.
main goal of RAID
to even out the widely different rates of performance improvement of disks against those in memory and microprocessors.
Raid level 0
has no redundant data and hence has the best write performance at the risk of data los
Raid level 1
uses mirrored disks.
Raid level 2
uses memory-style redundancy by using Hamming codes, which contain parity bits for distinct overlapping subsets of components. Level 2 includes both error detection and correction.
Raid level 3
uses a single parity disk relying
Raid Levels 4 and 5
use block-level data striping, with level 5 distributing data and parity information across all disks.
Raid level 6
applies the so-called P + Q redundancy scheme using Reed-Soloman codes to protect against up to two disk failures by using just two redundant disks.