Most computers are used for data processing, a major growth area in the “information age”.
Data processing from a computer science perspective:
Storage of data
Organization of data
Access to data
Processing of data
Data Structures vs File Structures
Both involve:
Representation of Data +
Operations for accessing data
Difference:
Data structures: deal with data in the main memory
File structures: deal with the data in the secondary storage
Main Memory (MM):
fast
small
volatile, i.e. data is lost during power failures
Secondary Storage (SS):
big (because it is cheap)
stable (non-volatile), i.e. data is not lost during power failures
slow (about a million times slower than MM, at ~10 millisec versus ~10 nanosec per access)
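To make the data structure / file structure distinction concrete, here is a minimal sketch in Python. The record layout (a 4-byte id plus a 16-byte name), the sample names, and the file name records.dat are all assumptions made up for illustration. The same records are represented once in main memory and once in a file, and each representation has its own access operation.

```python
import os
import struct
import tempfile

# Hypothetical fixed-length record: 4-byte integer id + 16-byte name field.
RECORD = struct.Struct("<i16s")

records = [(1, b"Ada"), (2, b"Grace"), (3, b"Edsger")]

# Data structure: the records live in main memory; access is by list index.
in_memory = list(records)

def get_from_memory(i):
    return in_memory[i]

# File structure: the same records live in a file; access is by byte offset.
def write_file(path):
    with open(path, "wb") as f:
        for rec_id, name in records:
            f.write(RECORD.pack(rec_id, name))  # name is zero-padded to 16 bytes

def get_from_file(path, i):
    with open(path, "rb") as f:
        f.seek(i * RECORD.size)  # jump straight to the start of record i
        rec_id, name = RECORD.unpack(f.read(RECORD.size))
        return rec_id, name.rstrip(b"\x00")  # drop the zero padding

path = os.path.join(tempfile.mkdtemp(), "records.dat")
write_file(path)
print(get_from_memory(1))
print(get_from_file(path, 1))
```

Both calls return the same record. Only the representation (a list versus fixed-length byte records) and the access operation (indexing versus seek and read) differ.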
How fast is the main memory?
Typical time for getting info from:
Main memory: ~10 nanosec = 10 x 10^-9 sec
Hard disks: ~10 millisec = 10 x 10^-3 sec
An analogy keeping the same time proportion as above: seconds versus weeks
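The arithmetic behind the analogy can be checked with a short Python sketch. It uses only the two typical access times quoted above. The variable names are illustrative.

```python
# Typical access times from the figures above, expressed in nanoseconds.
MM_ACCESS_NS = 10                  # ~10 nanosec
DISK_ACCESS_NS = 10 * 1_000_000    # ~10 millisec = 10,000,000 nanosec

# How many times slower is a disk access than a memory access?
ratio = DISK_ACCESS_NS // MM_ACCESS_NS

# Scale up: if one memory access took 1 second, one disk access would take
# `ratio` seconds. Convert that to days and weeks.
SECONDS_PER_DAY = 24 * 60 * 60
days = ratio / SECONDS_PER_DAY
weeks = days / 7

print(f"disk is {ratio:,} times slower than main memory")
print(f"1 second scaled by that factor is about {days:.1f} days ({weeks:.2f} weeks)")
```

With these figures the ratio is one million, so a one-second memory access corresponds to a disk access lasting well over a week.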
Goal of the file structures
What is performance?
Time
Minimize the number of trips to the SS in order to get desired information
Group related information so that we are likely to get everything we need with fewer trips to the SS.
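A rough sketch of why grouping helps, in Python. It assumes a deliberately simplified cost model: every trip to the SS costs the ~10 millisec typical disk access time, and transfer time is ignored. The record counts and group sizes are made-up example values.

```python
import math

DISK_ACCESS_MS = 10  # assumed cost of one trip to the SS (typical disk access)

def trips_needed(num_records, records_per_trip):
    """Number of trips to the SS when each trip brings back a group of records."""
    return math.ceil(num_records / records_per_trip)

def total_time_ms(num_records, records_per_trip):
    """Total access time under the simplified model: one fixed cost per trip."""
    return trips_needed(num_records, records_per_trip) * DISK_ACCESS_MS

# Hypothetical workload: 1,000 related records are needed.
for group_size in (1, 10, 50):
    print(group_size, trips_needed(1000, group_size), total_time_ms(1000, group_size))
```

Under this model, fetching the records one per trip takes 1,000 trips, while grouping 50 related records per trip takes 20. The saving depends on the grouped records actually being the ones needed together.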
Memory
Balance the memory size and the time
How to improve performance
Use the right file structure
Understand the advantages and disadvantages of alternative methods
Metrics used to measure efficiency and effectiveness of a File structure-1
simplicity, reliability, time complexities, space complexities, scalability, programmability, and maintainability.
*Note that efficiency and effectiveness concerns rely on time and space complexity more than on any other factor.
Metrics used to measure efficiency and effectiveness of a File structure-2
The file structures involve two domains: hardware and software.
Hardware primarily involves the physical characteristics of the storage medium.
Software involves the data structures.