File Systems – A Summary
File systems differ primarily because of the requirements of the operating systems and the physical characteristics of the media for which they were developed. For example, the original Unix file system already supported assigning files to users and groups along with a comparatively sophisticated set of access rights, while the FAT file system of the first IBM PCs only allowed files to be marked as hidden or system files.
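To make this difference concrete, the short Python sketch below reads the ownership and permission metadata that a Unix-style file system keeps for every file. The path is only a placeholder, and the lookups assume a Unix system; a FAT volume stores no comparable fields, only a few attribute flags.

```python
import os
import pwd, grp, stat

# Inspect the ownership and permission metadata that a Unix-style
# file system (ext4, Btrfs, ZFS, ...) stores for every file.
# "/etc/hosts" is only a placeholder path for this illustration.
info = os.stat("/etc/hosts")

owner = pwd.getpwuid(info.st_uid).pw_name   # owning user
group = grp.getgrgid(info.st_gid).gr_name   # owning group
mode  = stat.filemode(info.st_mode)         # e.g. "-rw-r--r--"

print(f"owner={owner} group={group} mode={mode}")

# A FAT file system has none of these fields; each directory entry only
# carries a few attribute flags (read-only, hidden, system, archive).
```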
FAT File System
The FAT file system was originally designed for floppy disks with capacities below 1.5 MB and has been extended several times over the years to raise the limits on the size of supported data carriers. This has produced a whole family of file systems, which enjoy the broadest support of all, mainly because they are simple to implement.
FAT file systems are currently used primarily for data exchange and in embedded systems with limited resources, especially on USB sticks and on memory cards in digital cameras and smartphones. In this form they are also relevant for NAS systems.
For large and heavily loaded storage media, such as the hard drives in a NAS, FAT file systems are hardly suitable. Their performance is too poor, they are not fault tolerant, and they offer little of the access control that is generally required in multi-user, networked environments. Special file systems are used here instead, some of which can be combined to meet all of these goals. For example, RAID systems provide a basis with automatically redundant data storage, on top of which another file system, the one visible to the user, controls data access.
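The following Python sketch illustrates this layering in a deliberately simplified way: a RAID-1-style mirror duplicates every block write onto two backing stores, while whatever sits on top of it never sees the redundancy. The class, file names, and block size are purely illustrative; real RAID operates on block devices in the kernel (for example Linux md) or in a hardware controller.

```python
# Conceptual sketch of RAID-1-style mirroring beneath a file system.
# The backing "devices" are ordinary pre-existing files here, which is
# an assumption made only for the sake of illustration.

class Mirror:
    """Write each block to both replicas; read from the first healthy one."""

    def __init__(self, path_a, path_b, block_size=4096):
        self.block_size = block_size
        self.devs = [open(path_a, "r+b"), open(path_b, "r+b")]

    def write_block(self, block_no, data):
        assert len(data) == self.block_size
        for dev in self.devs:                 # redundant write to both copies
            dev.seek(block_no * self.block_size)
            dev.write(data)
            dev.flush()

    def read_block(self, block_no):
        for dev in self.devs:                 # fall back if one copy fails
            try:
                dev.seek(block_no * self.block_size)
                return dev.read(self.block_size)
            except OSError:
                continue
        raise IOError("both replicas failed")
```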
File System for NAS Servers
Specific to NAS applications are network file systems, which include the classic NFS (Network File System), originally developed by Sun Microsystems for Unix environments. There is also a large and growing selection of other network file systems, recently joined by specialized solutions that large Internet service providers have built for their own purposes, such as the Amazon S3 file system or the Google File System (GFS or GoogleFS).
In Windows environments and in heterogeneous networks with Windows and other clients, SMB (Server Message Block) has the same meaning and function as NFS in the Unix world. Also known as CIFS (Common Internet File System), it is a network file system developed by IBM and Microsoft that is part of the standard equipment of Windows operating systems, but is also supported by macOS and Linux (via a software package called Samba).
As the name SMB suggests, NFS & Co. are not typical file systems but rather network protocols for sharing mass storage resources in a LAN. They are nevertheless counted among the file systems because they are accessed transparently, meaning that files do not have to be copied to a local storage location for processing, as they do with file transfer protocols such as FTP. Network file systems such as SMB and NFS also hide the specific properties of the local file systems with which a file server, such as a NAS or a PC, organizes its own hard disk space, and therefore present themselves to the user like a local data carrier.
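A short Python sketch shows what this transparency means in practice: once an SMB or NFS share is mounted, files on it are opened with the ordinary file API, whereas FTP access requires an explicit local copy first. The mount point, host name, and credentials below are placeholders.

```python
from ftplib import FTP

# Transparent access via a mounted network file system (SMB or NFS):
# the share behaves like a local disk, so the ordinary file API suffices.
# "/mnt/nas" is a placeholder mount point.
with open("/mnt/nas/report.txt", "r", encoding="utf-8") as f:
    text = f.read()

# With a file transfer protocol such as FTP, by contrast, the file must
# first be copied to local storage before it can be processed.
ftp = FTP("nas.example.com")          # placeholder host and credentials
ftp.login("user", "secret")
with open("report_local.txt", "wb") as local_copy:
    ftp.retrbinary("RETR report.txt", local_copy.write)
ftp.quit()
```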
NAS solutions are usually customized Linux systems. For data storage on the internal hard disks, they use Linux-specific file systems such as the Fourth Extended File System (ext4), developed specifically for Linux. Like its predecessor ext3, it differs from the first two versions by journaling, which above all reduces the maintenance effort of the file system check after the computer system has not been shut down properly.
In the development of the Linux Extended File System, compatibility between the different versions has been an essential design requirement. The file systems ext3 and ext4 are therefore particularly attractive for system administrators who want to give older computer systems new file system features without losing the complete data content of the hard drives. In other respects, ext4 lags behind newer developments such as Btrfs, as the principal ext4 developer Theodore Ts’o stated as early as 2008.
The B-tree file system Btrfs was first developed by Oracle for Linux in 2007. Among other things, the abbreviation can be read as “Better file system”, which hints at the development goal. Btrfs works on the copy-on-write principle and offers snapshots, pooling, and checksums as significant improvements over ext4.
Copy-on-Write
Copy-on-write reduces resource consumption when several users or processes access the same data area at the same time: a copy is stored in a new memory area only when data is actually changed, and only the modifications are saved. Snapshots, as found in Btrfs and in ZFS, which was developed by Sun Microsystems, can be seen as a transparent kind of incremental backup that is created automatically and continuously during normal use of the file system. The idea was already implemented in Plan 9, an operating system developed by Bell Labs in the late 1980s that was intended as a successor to Unix.
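The principle can be sketched in a few lines of Python. This is not how Btrfs is implemented internally, only a minimal model of the idea: a snapshot shares the original blocks, and a write replaces just the affected block with a private copy instead of duplicating everything.

```python
# Minimal copy-on-write sketch: snapshots share blocks until a write occurs.

class CowFile:
    def __init__(self, blocks):
        # blocks: list of bytes objects, potentially shared between snapshots
        self.blocks = blocks

    def snapshot(self):
        # The snapshot reuses the same block references -> nearly free.
        return CowFile(list(self.blocks))

    def write_block(self, index, data):
        # Only the modified block gets new storage; all other blocks
        # remain shared with existing snapshots.
        self.blocks[index] = bytes(data)


original = CowFile([b"A" * 4096, b"B" * 4096])
snap = original.snapshot()                    # shares both blocks
original.write_block(0, b"C" * 4096)          # original diverges in block 0 only
assert snap.blocks[0] == b"A" * 4096          # snapshot still sees the old data
assert snap.blocks[1] is original.blocks[1]   # unchanged block is still shared
```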
Checksums
Checksums allow a quick check of whether data has changed. They are calculated from the original data using specified algorithms and have the property that their value changes drastically even if the input changes only slightly. To determine whether two data blocks are identical, only the relatively short checksums need to be compared, not the much larger data blocks themselves. In addition, checksums allow data errors to be corrected to a limited extent.
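A brief Python example illustrates the comparison. SHA-256 from the standard library is used here purely as an example algorithm; file systems typically rely on faster checksums such as CRC32C.

```python
import hashlib

# Compare two 64 KB data blocks via short checksums instead of byte by byte.
block_a = b"x" * 64 * 1024
block_b = b"x" * 64 * 1024

sum_a = hashlib.sha256(block_a).hexdigest()
sum_b = hashlib.sha256(block_b).hexdigest()

# Comparing two 32-byte digests is enough to decide, with overwhelming
# probability, whether the much larger blocks are identical.
print("blocks identical:", sum_a == sum_b)

# A single flipped bit changes the checksum completely:
corrupted = b"y" + block_a[1:]
print(hashlib.sha256(corrupted).hexdigest() != sum_a)   # True
```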
Journaling
In a file system, journaling means that every write activity is also recorded in a special memory area, the journal. This additional information can be used to ensure that the data in the file system remains consistent at all times and that even a hard power-off does not leave file system errors that would require a complete check of the entire storage space occupied by the file system. With file system journaling, a computer system with a large amount of hard disk capacity, such as a NAS, is therefore up and running again much more quickly after a power failure or another system crash.
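The following Python sketch shows the principle as a minimal write-ahead journal; it is not the actual ext4 or Btrfs journal format. Every change is first recorded in a journal file, then applied to the data file, and finally the journal entry is cleared, so recovery after a crash only has to inspect the journal. File names are placeholders, and the data file is assumed to exist.

```python
import json, os

JOURNAL = "journal.log"      # placeholder file names
DATAFILE = "data.txt"

def journaled_write(offset, data):
    # 1. Describe the intended change in the journal and force it to disk.
    with open(JOURNAL, "w", encoding="utf-8") as j:
        json.dump({"offset": offset, "data": data}, j)
        j.flush()
        os.fsync(j.fileno())

    # 2. Apply the change to the (pre-existing) data file.
    with open(DATAFILE, "r+", encoding="utf-8") as f:
        f.seek(offset)
        f.write(data)
        f.flush()
        os.fsync(f.fileno())

    # 3. Mark the transaction as complete by clearing the journal.
    os.remove(JOURNAL)

def recover():
    # After a crash, only the journal needs to be inspected; a full scan
    # of the entire data file is never required.
    if os.path.exists(JOURNAL):
        try:
            with open(JOURNAL, encoding="utf-8") as j:
                entry = json.load(j)
        except ValueError:
            # Half-written journal record: the data file was never touched,
            # so the incomplete entry can simply be discarded. Real journals
            # use per-record checksums for this purpose.
            os.remove(JOURNAL)
            return
        journaled_write(entry["offset"], entry["data"])
```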
Due to the steadily growing storage capacity of hard drives and other data carriers, the performance of file systems continues to gain importance. Current file systems therefore not only have to manage ever larger storage areas and files, they also have to become faster. New features and improvements are accordingly aimed at accelerating or avoiding the particularly slow write accesses and search operations.