Notes before reading

  • This post is intended to document my experience with FAT32, LBA, MBR within the context of Classic Kernel and the microSD EEMCA controller.
  • Notation regarding addresses and values are in Little Endian (as it is in all references).
  • 13 Mar, Updated based on more hands down experience.

MBR


Introduction

MBR (Master Boot Record) is a special sector on a partitioned disk which holds information on the “logical” partitioning of the physical disk and what the underlaying file system is and how & where it stored files and other such entries.

MBR is a rather old concept, which was first introduced in 1983 for IBM-PC compatible machines. Nowadays MBR is getting replaced by GPT (GUID Partition Table). The main reason behind thing is the disk size limit which comes with MBR (2 TiB).

Sector Layout

There are multiple structure to the MBR layout. Each one brings its own minor or major changes to the original design. The one rule is that the sector must fit inside the MBR sector size of 512 bytes. * Classical generic MBR * Modern standard MBR * AAP MBR * NEWLDR MBR * AST/NEC MS-DOS and SpeedStor MBR * Ontrack Disk Manager MBR

Classical Generic MBR

Usually when first loading a sector from disk, it’s best to check the 2 bytes (boot signature) at address offset 0x1FE. Now the little-endian order of the 2 bytes is expected to be 0xAA55 but it has been often mixed in documentation and official documents as 0x55AA. My honest advice to you is to check for both cases. But remember that the official implementation has 0x55 at offset 0x1FE and 0xAA at 0x1FF.

The diagrams below showcases the structure of a Generic MBR, including a detailed view inside the contents of a partition entry.

Boot Sector + Partition

Boot Sector Layout + Partition

You can safely ignore the “Bootstrap Code area”. The important fields for the MBR are the partition entries (0x01BE, 0x01CE, 0x01DE, 0x01EE) and as mentioned above, the signature.

Boot Sector

Boot Sector

Partition

Partition

When reading a partition entry from the MBR make sure to check the status field (0x00) has the value 0x80. Next check the partition type, in this case to be FAT32 and to support LBA. The linked table has all the common and less common partition types.

LBA


Introduction

LBA (Logical Block Allocation) is a common scheme used for specifying the address/location of blocks of data as they are stored on disk. Blocks are located by an index, counting up from 0.

This fits in with **MBR as the MBR partition must support CHS addressing, LBA abstracts the cylinder “layer”.**

There isn’t much to be said about LBA as it is only a scheme for simplifying disk sectors access. But it’s worth acknowledging the existence of LBA because different File Systems have version which may support it and version which may not. This can be determined based on the partition type.

FAT


Introduction

FAT (File Allocation Table) is one of the most basic file system architectures. FAT is robust, simple and can provide performance even in lightweight implementations, but it’s no competitor to modern file system which offer scalability and reliability.

Originally designed in 1977 and intended for Floppy Disks, FAT was soon after adopted by DOS and Windows9x systems. Due to it’s popularity in desktop environments and the evolution of storage sizes, FAT now has three major versions (not counting the original 8 bit design)(the number represents the table element bits): * FAT12 * FAT16 * FAT32

FAT has since been cut out as the default File System from Windows machines, but it still remains popular in the portable media devices market.

Layout

FAT FS Layout

  • The first reserved sector (0) is also known as “Boot Sector” or “Volume Boot Record” (VBR). This area contains the “BIOS Parameter Block” (BPB) which contains the most basic information about the file system. This part is discussed in detail below.
  • The FAT 1 and FAT 2 are the actual allocations tables, the FAT 2 table is only a redundancy table and it’s not usually used.
  • Clusters
    • This is where all the actual file and directory data is stored.
    • “FAT32 typically commences the Root Directory Table in cluster number 2: the first cluster of the Data Region.”

Boot Sector

By far the most cumbersome design when it comes to FAT, from my experience. This was mainly by the conflicting nature of this working implementation of a FAT file system and the rest of the documentation available online.

The digram below is based on 1 and 2. It’s hard to create a simple representation when taking in consideration all the backwards compatibility introduced by Microsoft over the years.

Boot Sector Layout

Out of all these values there are only few of interest when handling a FAT32 File System.

  • 0x0B - 2 bytes - Bytes per sector (value is 512 bytes)
  • 0x0D - 1 byte - Sectors per cluster (a power of two)
  • 0x0E - 2 bytes - Number of reserved sectors (common value is 0x20)
  • 0x10 - 1 byte - Number of FATs (value is 2)
  • 0x24 - 4 bytes - Sectors per FAT (varies based on disk size)
  • 0x2C - 4 bytes - Root directory cluster (common value is 0x00000002)

After checking these variables to make sure they fit your expectations you can extract some useful variables (starting LBAs for each of the regions).

Cluster Map

A file will occupy at least one cluster depending on its size. A file taking up more than one cluster, will be represented using a chain of clusters. A chain of cluster may be fragmented across the Data Region. There are special entries better explained here.

It’s important to note the difference between a cluster and sector. Normally a cluster can be found across multiple sectors (usually at least 8 sectors, giving a cluster size of 4096 bytes).

A cluster (ID) can be transformed into an LBA value by applying the following formula: lba = cluster_start_lba + (cluster_id - 2) * sectors_per_cluster. The - 2 is applied because there are no cluster 0 or 1 and cluster’s IDs begin from 2. The cluster_start_lba will be obtained from the Directory Table, starting from the root directory table.

Directory Table

Directory Entry

The entries are 32 bytes long each. The first 11 bytes are the file name and extension (8 byte name + 3 byte extension). These are padded by spaces if the name is not long enough or shortened (LARGEFILENAME.SOMETHING becomes LARGEF~1SOM, where ~1 is the shortened entry number in order to support multiple copies with the same short name and SOM is simply the short version of SOMETHING).

It’s important to note (and as mentioned above) the only good thing directory entries will provide are the first cluster number and some information about the file (as seen in the diagram).

The first directory table you’ll have access to is the root directory table, which is located on the Boot Record. When reading the directory table entries, always check the first byte of the name. Based on the value you can tell a couple of things:

  • 0xE5 - Unused
  • 0x00 - End of directory table

After validating the entry, you can now extract the first cluster value by combining the cluster_lower and cluster_higher into a 4 byte value. With the cluster now you can perform a lookup into the FAT clusters.

Down the rabbit hole

By this point you should have access to the LBA of each region (boot record, reserved sectors, FATs, data) and the cluster value of the file we’re looking for.

The FAT32 table is a big array of 32bit entries where each one’s position in the array corresponds to a cluster number and the value indicates the next cluster for the file. You can think of this as a “singly linked list” (but not really) or more like an array where each value is another index inside the array.

At this point all you have to do is load up the starting LBA of the FAT tables and then use your cluster number as an offset. From there you will keep walking the clusters until you reach a value of 0xFFFFFFFF (I’ve seen 0x00000000 used in some cases, best check for both).

References