USTC OS 2021 Spring HW5

2021-07-18

字数统计: 1.8k字 | 阅读时长≈ 11分

1. RAID

Consider a RAID organization comprising five disks in total, how many blocks are accessed in order to perform the following operations for RAID-5 and RAID-6?
a. An update of one block of data
b. An update of seven continuous blocks of data. Assume that the seven contiguous blocks begin at a boundary of a stripe.

Solution

a.

RMW

RAID-5: 2 blocks, A_i and A_p .
RAID-6: 3 blocks A_i , A_p and A_q.

RRW

RAID-5: 5 blocks , read A₂ , A₃ , A₄ , and write A₁ A_p .
RAID-6: 5 blocks , read A₂ , A₃ and write A₁ , A_p , A_q .

b.

Case1 RAID-5

We assume that the blocks are like A₁ , A₂ , A₃ , A₄ , B₁ , B₂ , B₃ .

RMW: update every block and its check block. Total 2*7=14 blocks.
RRW: 7 data blocks, block B₄ , A_p , and B_p .Total 10 blocks.

Case2 RAID-6

We assume that the blocks are like A₁ , A₂ , A₃ , B₁ , B₂ , B₃ , C₁ .

RRW: update every block and its two check blocks. Total 3*7=21 blocks.
RMW: 7 data blocks and A_p , A_q , B_p , B_q , C_p , C_q , C₂ , C₃ . Total 15 blocks.

Best policy: RRW in A and B, RMW in C. Total 5*2+3=13 blocks.

2. The disk-scheduling algorithm

Suppose that a disk drive has 5,000 cylinders, numbered 0 to 4999. The drive is currently serving a request at cylinder 2150, and the previous request was at cylinder 1805. The queue of pending requests, in FIFO order, is: 2069, 1212, 2296, 2800, 544, 1618, 356, 1523, 4965, 3681 Starting from the current head position, what is the total distance (in cylinders) that the disk arm-moves to satisfy all the pending requests for each of the following disk-scheduling algorithms?

a. FCFS
b. SSTF
c. SCAN
d. LOOK
e. C-SCAN
f. C-LOOK

Solution

Disk-scheduling algorithm	Scheduling order	Distances
FCFS	2150, 2069, 1212, 2296, 2800, 544, 1618, 356, 1523, 4965, 3681	13011
SSTF	2150, 2069, 2296, 2800, 3681, 4965, 1618, 1523, 1212, 544, 356	7586
SCAN	2150, 2296, 2800, 3681, 4965, (4999) , 2069, 1618, 1523, 1212, 544, 356	7492
LOOK	2150, 2296, 2800, 3681, 4965, 2069, 1618, 1523, 1212, 544, 356	7424
C-SCAN	2150, 2296, 2800, 3681, 4965, (4999),（0），356, 544, 1212, 1523, 1618, 2069	9917
C-LOOK	2150, 2296, 2800, 3681, 4965 , 356, 544, 1212, 1523, 1618, 2069	9137

3. Open-file table

Explain what open-file table is and why we need it.

Solution

To avoid constant searching, many systems require that an open() system call be made before a file is first used. The operating system keeps a table, called the open-file table, containing information about all open files. When a file operation is requested, the file is specified via an index into this table, so no searching is required. When the file is no longer being actively used, it is closed by the process, and the operating system removes its entry from the open-file table. create() and delete() are system calls that work with closed rather than open files.

By open-file table, the open file information is cached in memory, and the file index is done by open-file table, avoiding the need for directory traversal for each visit, reducing IO overhead, and improving performance. On the other hand, the open-file table can maintain the open state of a file, which ensure operations such as deleting to be performed correctly.

4. File and Directory

Explain the concept of file and directory, and what does “755” mean for file permission?

Solution

Concepts :

The operating system abstracts from the physical properties of its storage devices to define a logical storage unit —- the file.

Directory is a storage unit that records information such as name, location, size, and type about a set of files, and it is also a file.

The three octal numbers represent the owner’s , group members’, and others’ permissions of reading , writing and executing.

755= 111 101 101 => The owner can read, write , and execute this file (111). The group member and others can read and execute this file (cannot read).(101).

5. Continuous Allocation

Explain the problems of using continuous allocation for file system layout and how to solve them.

Solution

Flaws:

Continuous allocations produce external fragmentation, resulting in insufficient continuous space when creating large files
It cannot deal with the growth of file size when we need add content to a file.

Solutions :

Break large files into small pieces and store them in fix-sized blocks.
Use linked list to support dynamic growth.

6. FAT (pros and cons)

What are the advantages of the variation of linked allocation that uses a FAT to chain together the blocks of a file? What is the major problem of FAT?

Solution

Advantages

Better random access performance. When accessing a block in the middle part of a file, look for pointers stored in the FAT to determine its location, rather than sequential access to all blocks of the file to find a pointer to the target block. Most FATs can be cached in memory, so you can determine pointers through memory access without having to access disk blocks.

Disadvantages

Partial caching is difficult to implement. A trade-off between space and time : caching FAT for better access performance results in a waste of memory space.

7. Disk I/O

Consider a file system similar to the one used by UNIX with indexed allocation, and assume that every file uses only one block. How many disk I/O operations might be required to read the contents of a small local file at /a/b/c in the following two cases? Should provide the detailed workflow.

a. Assume that none of the disk blocks and inodes is currently being cached.

b. Assume that none of the disk blocks is currently being cached but all inodes are in memory.

Solution

a. Not cached

root directory
inode for /a
disk block for /a
inode for /a/b
disk block for /a/b
inode for /a/b/c
disk block for /a/b/c

Total 7 times.

b. Cached

The process is the same as a, but accessing i-node doesn’t need I/O operations. Total 4 times.

8. Max File Size in FAT

Consider a file system that uses inodes to represent files. Disk blocks are 8-KB in size and a pointer to a disk block requires 4 bytes. This file system has 12 direct disk blocks, plus single, double, and triple indirect disk blocks. What is the maximum size of a file that can be stored in this file system?

Solution

Every block can store 8kB/4B=2048 indexes. Thus, 12 direct blocks , 1 single, double, and triple indirect disk block.

$MaxSize=(12 * 8 KB) + (2048 * 8 KB) + (2048 * 2048 * 8 KB) + (2048 * 2048 * 2048 * 8 KB) \approx 64 TB$

9. Long filename in FAT

What is the 8+3 naming convention in FAT32 file system, and how to manage long filenames?

Solution

8+3 naming convention:

8 characters for name + 3 characters for file extension

Long filenames:

If the filename is too long (more than 8 bytes), we add more entries to represent the filename. The first byte of the entry is the sequential number of file name and the terminating directory entry has the sequence number OR-ed with 0x40. Byte 11 of every entry is always 0x0F to indicate that is a LFN.

10. Director Entries

How are directory entries managed in FAT and Ext file systems？

Solution

The FAT file system directory entry is fixed in size and managed in an array. The entry stores a series of file-related property information, including: file name, extension, property, reserved word, most recent modification time, first cluster number, file size.

The EXT file system directory entry varies in size and is managed in a link list. The entry stores the file name and index node.

11. Hard/Symbolic Link

What is the difference between hard link and symbolic link?

Solution

Hard link

A hard link is a directory entry pointing to an existing file. No new file content is created!
Conceptually speaking, this creates a file with two pathnames.
Hard link entries and original entries point to the same inode and do not create new file content. (link count + 1)

Symbolic Link

A symbolic link is a file. Unlike the hard link, a new inode is created for each symbolic link.
It stores the pathname (shortcut).
The link count of the original file doesn’t change.

12. Initial Link Counts

What are the initial link counts when a regular file or a directory is created? Why?

Solution

When a regular file is created, the link count is always 1. Only one inode (itself) points to it.
When a directory is created, the initial link count is always 2. An extra hard link “.” points to it.

13. Data/Metadata Journaling

What is the difference between data journaling and metadata journaling? Explain the operation sequence for each of the two journaling methods.

Solution

Data journaling.

The operation sequence is as follows :

Journal write : Write the contents of the transaction (including TxB, metadata, and data) .
Journal commit : metadata, and data (including TxE) .
Checkpoint: Write the contents of the update to their on-disk locations.

Metadata Journaling

The operation sequence is as follows :
1. Journal write : Write the contents of the transaction (including TxB, metadata) .Write the contents of the update to their on-disk locations. The two writes can be issued in parallel.
2. Journal commit : Write TxE.
3. Checkpoint : Write the update of metadata to their on-disk locations.

Difference
1. Writing sequence
  
  Data Journaling : Write the logs before writing the file system.
  
  Metadata Journaling : Write metadata to the log and data to the file system (in parallel). Waiting. After submitting the log, write metadata to the file system.
2. The amount of data :
  
  Data Journaling : Write data twice.
  
  Metadata Journaling : Write data once.

14. I/O control methods

What are the three I/O control methods?

Solution:

Polling.
Interrupt.
Direct Memory Access (DMA) .

15. I/O interface

List at least three kinds of I/O devices and explain how to provide a standard and uniform application I/O interface?

Solution

Keyboard, mouse, router, printer, etc.
Implement a standard and uniform application I/O interface: Abstraction, encapsulation, layering .

16. I/O Subsystem

What services are provided by the kernel I/O subsystem ?

Solution

I/O scheduling
Maintain a per-device queue.
Re-ordering the requests.
Average waiting time, fairness, etc.
Buffering
Store data in memory while transferring between devices.
To cope with device speed mismatch.
To cope with device transfer size mismatch.
To maintain “copy semantics” (e.g., copy from application’s buffer to kernel buffer)
Caching
Faster device holding copy of data.
Spooling
Hold output for a device
Error handling and I/O protection
Power management, etc.