en:docs:win16:modules:local_heap

en:docs:win16:modules:local_heap [2026/02/24 04:50] prokushev
en:docs:win16:modules:local_heap [2026/02/24 08:12] (current) – [References] prokushev
Line 1: Line 1:
-===== Win16 Local Heap Functions =====+===== Local Heap and Atom Table =====
  
 ===== Overview =====
Line 23: Line 23:
  
 **Important Notes:**
-* The field at offset 6 (pLocalHeap) is the primary way to locate the local heap structures given only the DGROUP selector. + 
-* When LocalInit() is called on a globally allocated block (non‑DGROUP), the WORD at offset 6 of that block is also set to point to the local heap information structure for that block. +  * The field at offset 6 (pLocalHeap) is the primary way to locate the local heap structures given only the DGROUP selector. 
-* Similarly, if InitAtomTable() is called on a global block, offset 8 points to the atom table, and offset 6 will point to the associated local heap (since atoms are stored in the local heap).+  * When LocalInit() is called on a globally allocated block (non‑DGROUP), the WORD at offset 6 of that block is also set to point to the local heap information structure for that block. 
 +  * Similarly, if InitAtomTable() is called on a global block, offset 8 points to the atom table, and offset 6 will point to the associated local heap (since atoms are stored in the local heap).
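
The lookup described in these notes can be sketched in C. This is a minimal illustration that assumes the segment's bytes have been copied into an ordinary buffer; the helper names `read_word`, `local_heap_offset`, and `atom_table_offset` are hypothetical and not part of the Windows API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offsets within the segment's instance data, as listed in the notes above. */
enum { OFF_PLOCALHEAP = 6, OFF_PATOMTABLE = 8 };

/* Read a little-endian 16-bit WORD from a snapshot of a segment. */
static uint16_t read_word(const uint8_t *seg, size_t off)
{
    return (uint16_t)(seg[off] | (seg[off + 1] << 8));
}

/* Near pointer (offset within the segment) of the local heap information. */
static uint16_t local_heap_offset(const uint8_t *seg, size_t seg_len)
{
    assert(seg_len >= OFF_PLOCALHEAP + 2);
    return read_word(seg, OFF_PLOCALHEAP);
}

/* Near pointer (offset within the segment) of the atom table. */
static uint16_t atom_table_offset(const uint8_t *seg, size_t seg_len)
{
    assert(seg_len >= OFF_PATOMTABLE + 2);
    return read_word(seg, OFF_PATOMTABLE);
}
```

Both helpers return offsets relative to the same segment, so they only make sense together with that segment's selector.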
  
 ==== HeapInfo and LocalInfo ====
Line 97: Line 98:
 Every block in the local heap is preceded by an arena (header) that contains management information. Arenas always start on a 4‑byte boundary, so the two low bits of every arena address are zero. These bits are reused as flags in the la_prev field of each arena. The two low bits of la_prev have the following meaning:
  
-* Bit 0 (least significant): Set if the block is in use (FIXED or MOVEABLE); cleared if the block is free. +  * Bit 0 (least significant): Set if the block is in use (FIXED or MOVEABLE); cleared if the block is free. 
-* Bit 1: Set if the block is MOVEABLE; cleared if the block is FIXED (only meaningful when bit 0 is set).+  * Bit 1: Set if the block is MOVEABLE; cleared if the block is FIXED (only meaningful when bit 0 is set).
  
 Thus, to obtain the real address of the previous arena, the two low bits must be masked off.
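
The masking rule above can be sketched as follows. The constant and function names (`LA_BUSY`, `LA_MOVEABLE`, `la_prev_address`, and so on) are chosen for illustration and are assumptions, not names confirmed by this page.

```c
#include <assert.h>
#include <stdint.h>

#define LA_BUSY      0x0001u  /* bit 0: block is in use (FIXED or MOVEABLE) */
#define LA_MOVEABLE  0x0002u  /* bit 1: block is MOVEABLE (meaningful only if busy) */
#define LA_FLAG_MASK 0x0003u  /* both flag bits */

/* Real address of the previous arena: la_prev with the flag bits cleared. */
static uint16_t la_prev_address(uint16_t la_prev)
{
    return (uint16_t)(la_prev & ~LA_FLAG_MASK);
}

/* A block is free when bit 0 is clear. */
static int la_is_free(uint16_t la_prev)
{
    return (la_prev & LA_BUSY) == 0;
}

/* Bit 1 is only consulted when bit 0 says the block is in use. */
static int la_is_moveable(uint16_t la_prev)
{
    return (la_prev & LA_BUSY) && (la_prev & LA_MOVEABLE);
}
```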
Line 148: Line 149:
 ===== Heap Operations =====
  
-* **Allocation (LocalAlloc)** walks the free list, splitting blocks if necessary, and sets up the appropriate arena. For MOVEABLE blocks, it also allocates a handle table entry. +  * **Allocation (LocalAlloc)** walks the free list, splitting blocks if necessary, and sets up the appropriate arena. For MOVEABLE blocks, it also allocates a handle table entry. 
-* **Compaction (LocalCompact)** coalesces adjacent free blocks and may move or discard unlocked MOVEABLE blocks. When a block is moved, its lhe_address is updated. +  * **Compaction (LocalCompact)** coalesces adjacent free blocks and may move or discard unlocked MOVEABLE blocks. When a block is moved, its lhe_address is updated. 
-* **Locking (LocalLock/LocalUnlock)** manipulates the lhe_count field of the handle entry for MOVEABLE blocks; for FIXED blocks, no count is maintained. +  * **Locking (LocalLock/LocalUnlock)** manipulates the lhe_count field of the handle entry for MOVEABLE blocks; for FIXED blocks, no count is maintained. 
-* **Discarding (LocalDiscard)** frees the memory of a MOVEABLE block but keeps the handle entry alive with the LHE_DISCARDED flag set.+  * **Discarding (LocalDiscard)** frees the memory of a MOVEABLE block but keeps the handle entry alive with the LHE_DISCARDED flag set.
  
 ===== Atom Tables =====
Line 161: Line 162:
 ==== Relationship with the Local Heap ====
  
-Physically, an atom table resides **inside** the local heap of some data segment . Therefore, before creating an atom table, the segment must be initialized as a local heap by calling `LocalInit()`.+Physically, an atom table resides **inside** the local heap of some data segment. Therefore, before creating an atom table, the segment must be initialized as a local heap by calling `LocalInit()`.
  
-==== Types of Atoms ====+==== ATOMENTRY Structure ====
  
-Windows supports two fundamentally different types of atoms: **string atoms** and **integer atoms**.+Each string atom is stored as an `ATOMENTRY` structure in the local heap. The structure has the following form:
  
-===== Integer Atoms =====+^ Offset ^ Type ^ Field ^ Description ^ 
 +| 00h | WORD | `next` | Next entry in the same hash bucket (0 if last). | 
 +| 02h | WORD | `usage` | Reference count. | 
 +| 04h | BYTE | `len` | Length of the string (1–255). | 
 +| 05h | BYTE[] | `name` | ASCIIZ string (length `len` + 1). |
  
-Integer atoms are a special category of atoms that are **not stored** in an atom table and therefore **do not have an associated `ATOMENTRY` structure**. +==== Types of Atoms ====
- +
-* **Range**: `0x0001` to `0xBFFF` . +
-* **Reference count**: Not applicable, as they are not allocated from the heap . +
-* **String representation**: When passed to functions that expect a string (e.g., `GetAtomName`), an integer atom is converted to a string of the form **`"#dddd"`**, where `dddd` is the decimal representation of the number. For example, atom `0x8001` becomes `"#32769"` . Leading zeros are not included. +
-* **Purpose**: Used for predefined system objects to avoid wasting memory on storing strings. Classic examples are built-in window classes such as the dialog box class `"#32770"` . Other known values: `#32768` (PopupMenu), `#32769` (Desktop), `#32771` (WinSwitch), `#32772` (IconTitle) .+
  
-**Important consequence**: Functions like `GlobalAddAtom`, when passed a string of the form `"#1234"`, will return the integer atom `0x04D2` instead of creating a new table entry .+Windows supports two fundamentally different types of atoms: **string atoms** and **integer atoms**. Their handling is completely distinct.
  
 ===== String Atoms =====
  
-These are the "classic" atoms created when `AddAtom` or `GlobalAddAtom` is called with an ordinary string.+String atoms are created by passing an ordinary string to `AddAtom` or `GlobalAddAtom`. They are stored in the atom table as `ATOMENTRY` structures.
  
-* **Range**: `0xC000` to `0xFFFF` . +  * **Range**: `0xC000` to `0xFFFF` (encoded pointer).
-* **Physical memory representation**: Each string atom is represented by an **`ATOMENTRY`** structure located in the local heap. The atom value itself (type `ATOM`) is an **encoded near pointer** to this structure .+  * **Storage**: Allocated in the local heap as `ATOMENTRY`, inserted into the hash table.
 +  * **Reference count**: Yes (`usage` field).
 +  * **String representation**: The original string. 
 +  * **Creation**: `AddAtom("MyString")`.
  
-==== Why the range 0xC000–0xFFFF? (Technical rationale) ====+**Encoding:** A string atom value is derived from the near pointer to its `ATOMENTRY`. Since the pointer is 4‑byte aligned, the low two bits are zero. The atom is formed by shifting the pointer right by 2 bits and ORing with `0xC000`. This guarantees the range `0xC000–0xFFFF`.
  
-This is a direct implementation feature of 16-bit Windows : +<code c> 
-1. **Alignment**: All blocks in the local heap are aligned on a 4-byte boundary. The low bits of any pointer to an `ATOMENTRY` structure are always zero. +#define HANDLETOATOM(handle) ((ATOM)(0xc000 | ((handle) >> 2))) 
-2. **Encoding**: To obtain a 16-bit atom from the 14 significant bits of the pointer, Windows shifts the pointer right by 2 bits (making room for flags) and **sets the two high bits to 1**. This guarantees that all string atoms fall into the range `0xC000...0xFFFF`. +#define ATOMTOHANDLE(atom)   ((HANDLE16)(atom) << 2)
-3. **Decoding**: To convert an atom back to a pointer (`GetAtomHandle`), the system clears the two high bits and shifts the value left by 2 bits.+</code>
-4. **Range separation**: This scheme leaves the range `0x0001...0xBFFF` for integer atoms, allowing functions to quickly distinguish atom types by value.+
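
A small, self-contained sketch of the encoding round trip described above. The `ATOM` and `HANDLE16` typedefs are stand-ins for the 16-bit Windows types, `is_string_atom` is a hypothetical helper, and the decode macro here casts the whole shifted expression so that the two high bits are discarded on a modern compiler with 32-bit `int`.

```c
#include <assert.h>
#include <stdint.h>

typedef uint16_t ATOM;      /* stand-in for the Win16 ATOM type */
typedef uint16_t HANDLE16;  /* stand-in for a 16-bit near handle */

/* Encode: drop the two always-zero low bits, set the two high bits. */
#define HANDLETOATOM(handle) ((ATOM)(0xc000 | ((handle) >> 2)))

/* Decode: shift left by 2; truncation to 16 bits clears the two high bits. */
#define ATOMTOHANDLE(atom)   ((HANDLE16)((atom) << 2))

/* Hypothetical helper: string atoms occupy 0xC000..0xFFFF. */
static int is_string_atom(ATOM a)
{
    return a >= 0xC000;
}
```

For example, an `ATOMENTRY` at near offset `0x0120` encodes to atom `0xC048`, and decoding `0xC048` gives `0x0120` again.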
  
-==== Structure of a String Atom Entry (ATOMENTRY) ====+===== Integer Atoms =====
  
-The `ATOMENTRY` structure is the exact layout of a string atom in the heap. It is defined in the SDK file `WINEXP.H` and has the following form:+Integer atoms are created by passing a string of the form `"#dddd"` (or by using `MAKEINTATOM` with a value ≤ 0xBFFF). They are **not stored in the atom table** and have no associated `ATOMENTRY` structure.
  
-^ Offset ^ Type ^ Field ^ Description ^ +  * **Range**: `0x0001` to `0xBFFF`.
-| 00h | WORD | `next` | Near pointer to the next `ATOMENTRY` structure in the same hash bucket collision chain; 0 for the last element. | +  * **Storage**: None; the value is used directly as the atom.
-| 02h | WORD | `usage` | Reference count (how many times `AddAtom`/`GlobalAddAtom` has been called for this name). | +  * **Reference count**: Not applicable.
-| 04h | BYTE | `len` | Length of the string (excluding the terminating null). Maximum length is 255 bytes. | +  * **String representation**: Generated on the fly as `"#dddd"` when `GetAtomName` is called.
-| 05h | BYTE[] | `name` | Beginning of the buffer containing the ASCIIZ string (length `len` + 1 byte for the null terminator). |+  * **Creation**: `AddAtom("#1234")` or `AddAtom(MAKEINTATOM(0x04D2))`.
  
-In C/C++: +**How it works:** When a string of the form `"#dddd"` is passed, the function parses the decimal number and, if it is less than `0xC000`, returns it directly without accessing the atom table. Similarly, `FindAtom` for such a string or for a `MAKEINTATOM` value simply returns the number without any lookup. Integer atoms are always considered "found" because any value in the range is valid.
-```c +
-typedef struct atomstruct { +
-    struct atomstruct near *next;  /* Next entry in collision chain */ +
-    WORD usage;                    /* Reference count */ +
-    BYTE len;                      /* Length of string */ +
-    BYTE name;                     /* Start of string */ +
-} ATOMENTRY; +
-```+
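
The parsing step described under **How it works** can be sketched as below. This is an illustration under stated assumptions, not the KERNEL implementation: `parse_int_atom` is a hypothetical helper, the limit constant is named `MAXINTATOM` here for readability, and returning 0 for malformed or out-of-range input is a choice made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAXINTATOM 0xC000u  /* first value that is no longer an integer atom */

/* Return the integer atom named by "#dddd", or 0 if the string is not a
 * decimal integer-atom name below MAXINTATOM (assumption of this sketch). */
static uint16_t parse_int_atom(const char *s)
{
    uint32_t value = 0;

    if (s == NULL || s[0] != '#' || s[1] == '\0')
        return 0;

    for (const char *p = s + 1; *p != '\0'; ++p) {
        if (*p < '0' || *p > '9')
            return 0;                       /* not a pure decimal number */
        value = value * 10 + (uint32_t)(*p - '0');
        if (value >= MAXINTATOM)
            return 0;                       /* outside the integer-atom range */
    }
    return (uint16_t)value;
}
```

With this helper, `"#1234"` yields `0x04D2` without touching any table, while an ordinary name such as `"MyString"` yields 0 and would go through the hash-table path instead.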
  
-**Important**: The `name` field is a flexible array member. Memory for the structure is allocated with enough space to hold the actual string. There are **no flags or other hidden fields** in this structure.+==== MAKEINTATOM Macro ====
  
-==== Structure of an Atom Table (ATOMTABLE) ====+The **`MAKEINTATOM`** macro is defined as:
  
-An atom table is a hash table implemented as an array of "buckets". It is placed directly in the local heap .+<code c> 
 +#define MAKEINTATOM(i)  (LPTSTR)((DWORD)((WORD)(i))) 
 +</code>
  
-^ Offset ^ Type ^ Field ^ Description ^ +This macro casts a 16‑bit integer value to a pointer typeWhen this "pointer" is passed to atom functions, it is interpreted as an integer atom (if the value is ≤ `0xBFFF`) or as string atom (if ≥ `0xC000`).
-| 00h | WORD | `numEntries` | Size of the hash table (number of buckets)Default is 37. Should be a prime number for uniform hashing. | +
-| 02h | ...  | `hashtab` | Array of `numEntries` near pointers (`WORD`). Each element is either 0 (empty bucket) or a pointer to the first `ATOMENTRYstructure in that bucket|+
  
-In simplified C: +  * For values ≤ `0xBFFF`, the function treats it as an integer atom and returns the value directly. 
-```+  * For values ≥ `0xC000`, the function assumes it is an encoded pointer to an `ATOMENTRY` and will dereference it (after shifting left by 2 bits) to access the atom entry. 
-typedef struct { + 
-    WORD numEntries; +**Important:** `MAKEINTATOM` does not create a string or allocate any memory; it is simply a type-punning convenience to pass integer atoms to functions that formally expect a string pointer.
-    ATOMENTRY near *hashtab[]; +
-} ATOMTABLE; +
-```+
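
How an atom function might tell these cases apart can be sketched as follows. The sketch models a 16:16 far pointer as a 32-bit integer and assumes that a zero high word marks a `MAKEINTATOM` value; that test, the `AtomArgKind` enum, `MAKEINTATOM_BITS`, and `classify_atom_arg` are illustrative assumptions rather than documented behaviour.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    ARG_STRING_POINTER,  /* a real far pointer to a string */
    ARG_INTEGER_ATOM,    /* MAKEINTATOM value in 0x0000..0xBFFF */
    ARG_STRING_ATOM      /* MAKEINTATOM value in 0xC000..0xFFFF */
} AtomArgKind;

/* Bit pattern produced by MAKEINTATOM: the 16-bit value, high word zero. */
#define MAKEINTATOM_BITS(i) ((uint32_t)(uint16_t)(i))

/* Classify the 32-bit argument an atom function receives. */
static AtomArgKind classify_atom_arg(uint32_t far_ptr_bits)
{
    uint16_t hi = (uint16_t)(far_ptr_bits >> 16);
    uint16_t lo = (uint16_t)(far_ptr_bits & 0xFFFFu);

    if (hi != 0)
        return ARG_STRING_POINTER;          /* assumed: nonzero selector */
    return lo <= 0xBFFF ? ARG_INTEGER_ATOM : ARG_STRING_ATOM;
}
```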
  
 ==== Local vs. Global Atom Tables ====
  
-* **Local atom tables**: Bound to a specific data segment (e.g., an application's DGROUP). Created by calling `InitAtomTable()`. Used for a module's internal needs. Access is only possible when the DS register points to that segment. +  * **Local atom tables**: Bound to a specific data segment (e.g., an application's DGROUP). Created by calling `InitAtomTable()`. Used for a module's internal needs. Access is only possible when the DS register points to that segment. 
-* **Global atom table**: A system-wide table accessible to all applications via `GlobalAddAtom`, `GlobalFindAtom`, and `GlobalDeleteAtom` . Physically, it resides not in an application's data segment but in a special USER data segment (part of the so-called "global atom and text heap") . Its structure is identical to a local atom table. The `Global...` functions internally switch DS to the USER segment and call the ordinary `AddAtom`/`FindAtom`.+  * **Global atom table**: A system-wide table accessible to all applications via `GlobalAddAtom`, `GlobalFindAtom`, and `GlobalDeleteAtom`. Physically, it resides not in an application's data segment but in a special USER data segment (part of the so-called "global atom and text heap"). Its structure is identical to a local atom table. The `Global...` functions internally switch DS to the USER segment and call the ordinary `AddAtom`/`FindAtom`.
  
 ==== Creating Custom Atom Tables (outside DGROUP) ====
  
-Since all atom operations (`AddAtom`, `FindAtom`, etc.) work with the current segment pointed to by DS, you can create and use an atom table in any arbitrary data segment by following three steps :+Since all atom operations work with the current segment pointed to by DS, you can create and use an atom table in any arbitrary data segment by following three steps:
  
-1. **Create a local heap** in the target segment using `LocalInit(Selector, Start, End)`. +  - **Create a local heap** in the target segment using `LocalInit(Selector, Start, End)`.
-2. **Switch the DS register** to that segment. +  - **Switch the DS register** to that segment.
-3. Call `InitAtomTable(size)` to initialize the atom table in the newly created heap.+  - Call `InitAtomTable(size)` to initialize the atom table in the newly created heap.
  
-After that, any subsequent call to `AddAtom`, `FindAtom`, etc., will operate on the custom table if DS is temporarily set to the correct segment . +After that, any subsequent call to `AddAtom`, `FindAtom`, etc., will operate on the custom table if DS is temporarily set to the correct segment.
- +
-Example in assembly: +
-```asm +
-; Assume a local heap already exists in segment CUST_SEG +
-push ds +
-mov  ax, CUST_SEG +
-mov  ds, ax +
-push 37                ; hash table size +
-call InitAtomTable     ; create atom table in this segment +
-pop  cx +
-pop  ds +
-``` +
- +
-A convenience wrapper in C: +
-```c +
-ATOM BasedAddAtom(WORD wSeg, LPCSTR lpString) +
-{ +
-    ATOM ret; +
-    _asm push ds +
-    _asm mov  ds, wSeg +
-    ret = AddAtom(lpString); +
-    _asm pop  ds +
-    return ret; +
-} +
-```+
  
 ==== Summary of Atom Type Differences ====
  
-^ Feature ^ Integer Atoms ^ String Atoms ^ +^ Feature ^ String Atoms ^ Integer Atoms ^ 
-| Range | `0x0001` – `0xBFFF` | `0xC000` – `0xFFFF` | +| Range | `0xC000` – `0xFFFF` | `0x0001` – `0xBFFF` |
-| Storage | Not stored (value is the atom) | In local heap as `ATOMENTRY` | +| Stored in atom table | Yes, as `ATOMENTRY` in hash buckets | No |
-| Reference count | None | Yes (field `usage`) | +| Memory allocated | `ATOMENTRY` structure in local heap | None |
-| String representation | `"#1234"` | Original string | +| Reference count | Yes (`usage`) | No |
-| Creation | `AddAtom("#1234")` | `AddAtom("MyString")` | +| String representation | Original string | Generated as `"#dddd"` on the fly |
-| Examples | Window classes `#32768`...`#32772` | Registered clipboard formats, DDE item names |+| Creation | `AddAtom("MyString")` | `AddAtom("#1234")` or `AddAtom(MAKEINTATOM(0x04D2))` |
 +| Find behavior | Searches hash table | Always returns the value (always "found") |
 +| Delete behavior | Decrements refcount, frees if zero | No operation (returns 0) |
  
 ===== Custom Local Heaps =====
Line 289: Line 257:
 The `LocalInit()` function initializes a local heap within a specified segment. Its prototype is:
  
-```c+<code c>
 WORD LocalInit(WORD wSegment, WORD pStart, WORD pEnd);
-```+</code>
  
-* `wSegment` – Selector of the segment where the heap will be created. +  * `wSegment` – Selector of the segment where the heap will be created. 
-* `pStart` – Offset of the first byte of the heap area (must be paragraph‑aligned, i.e., a multiple of 16). +  * `pStart` – Offset of the first byte of the heap area (must be paragraph‑aligned, i.e., a multiple of 16). 
-* `pEnd` – Offset of the last byte of the heap area (inclusive). The heap will manage memory from `pStart` to `pEnd`.+  * `pEnd` – Offset of the last byte of the heap area (inclusive). The heap will manage memory from `pStart` to `pEnd`.
  
 If successful, `LocalInit()` returns a non‑zero value. It sets up the `HeapInfo` and `LocalInfo` structures at the beginning of the heap area (starting at `pStart`) and updates the segment’s instance data at offset **06h** (`pLocalHeap`) to point to that `HeapInfo` structure. However, if the segment is not a default data segment (i.e., not DGROUP), the instance data at offset 0 must also contain a zero word to indicate that the NULL segment structure is present; otherwise, the heap may not be recognized by some routines.
  
-**Example:** Creating a local heap in a globally allocated block of memory:+**Example:** Creating a local heap in a globally allocated block of memory (64 KB):
  
-```asm +<code asm> 
-```+; 1. Allocate a 64KB global memory block 
 +GlobalAlloc GMEM_FIXED, 0x10000 
 +mov dx, ax          ; DX = selector of allocated block 
 + 
 +; 2. Temporarily set DS to that segment to access its instance data 
 +push ds 
 +push dx 
 +pop ds 
 + 
 +; 3. Initialize the NULL segment (Instance Data) at offset 0. 
 +;    The first word must be zero (wMustBeZero = 0). 
 +xor ax, ax 
 +mov word ptr [0], ax 
 +; The other fields (pLocalHeap, etc.) will be filled by LocalInit. 
 + 
 +; 4. Define the heap area: start at offset 16 (0x0010) to preserve 
 +;    the 16-byte Instance Data, end at 0xFFFF (the last byte of the segment). 
 +mov bx, 16          ; pStart = 16 
 +mov cx, 0xFFFF      ; pEnd = 0xFFFF 
 + 
 +; 5. Restore DS (if no longer needed) 
 +pop ds 
 + 
 +; 6. Call LocalInit to create the heap in segment dx, from pStart to pEnd. 
 +push dx 
 +push bx 
 +push cx 
 +call LocalInit      ; returns non-zero on success 
 + 
 +; After the call, the Instance Data at offset 6 in segment dx 
 +; contains a valid near pointer to the HeapInfo structure located at offset 16. 
 +</code>
  
 After this call, the global block can be used with local heap functions (`LocalAlloc`, `LocalFree`, etc.) by using the selector in `DX` and near pointers (offsets) within that segment.
  
 **Important Considerations:**
-The heap structures themselves occupy space at the beginning of the heap area. The first block (sentinel) resides at `pStart + size of (LocalInfo)`. + 
-The segment’s instance data (at offset 0) must be properly set up, especially the zero word at offset 0, to avoid confusion with other structures. +  * The heap structures themselves occupy space at the beginning of the heap area. The first block (sentinel) resides at `pStart + size of (LocalInfo)`. 
-Custom local heaps are not automatically enlarged if they run out of space; they are limited to the range specified in `LocalInit`. +  * The segment’s instance data (at offset 0) must be properly set up, especially the zero word at offset 0, to avoid confusion with other structures.
-The `HEAPSIZE` setting in the module’s .DEF file only affects the default DGROUP heap.+  * Custom local heaps are not automatically enlarged if they run out of space; they are limited to the range specified in `LocalInit`.
 +  * The `HEAPSIZE` setting in the module’s .DEF file only affects the default DGROUP heap.
  
 ==== Creating Atom Tables Outside DGROUP ====
  
-As already described in the atom section, to create an atom table in an arbitrary memory area, you must first initialize a local heap there (as above) and then, after switching DS, call `InitAtomTable`. This technique allows fully isolated atom tables for special purposes.+As already described, to create an atom table in an arbitrary memory area, you must first initialize a local heap there (as above) and then, after switching DS, call `InitAtomTable`. This technique allows fully isolated atom tables for special purposes.
  
 ==== Summary of Custom Heap and Atom Table Creation ====
  
-Use `LocalInit` on a segment to establish a local heap anywhere in memory. +  * Use `LocalInit` on a segment to establish a local heap anywhere in memory. 
-The segment must have a valid NULL segment structure (zero word at offset 0) for the heap to be recognized. +  * The segment must have a valid NULL segment structure (zero word at offset 0) for the heap to be recognized.
-After `LocalInit`, you can use `LocalAlloc`, `LocalLock`, etc., with near pointers within that segment. +  * After `LocalInit`, you can use `LocalAlloc`, `LocalLock`, etc., with near pointers within that segment.
-To create an atom table in a custom heap, switch DS to that segment and call `InitAtomTable`. +  * To create an atom table in a custom heap, switch DS to that segment and call `InitAtomTable`.
-All subsequent atom operations must be performed with DS set appropriately (or via wrapper functions). +  * All subsequent atom operations must be performed with DS set appropriately (or via wrapper functions).
-Custom heaps and atom tables are useful for isolating memory pools, implementing resource managers, or working with large data structures without polluting the default DGROUP.+  * Custom heaps and atom tables are useful for isolating memory pools, implementing resource managers, or working with large data structures without polluting the default DGROUP.
  
 ===== References =====
  
-1. Schulman, A., Maxey, D., Pietrek, M. //Undocumented Windows//. Addison-Wesley, 1992. +  - Schulman, A., Maxey, D., Pietrek, M. //Undocumented Windows//. Addison-Wesley, 1992. 
-2. Pietrek, M. //Windows Internals//. Addison-Wesley, 1993. +  - Pietrek, M. //Windows Internals//. Addison-Wesley, 1993.
-3. Chen, R. //The Old New Thing// (blog). Microsoft Developer Blogs. +  - Chen, R. //The Old New Thing// (blog). Microsoft Developer Blogs.
-4. Microsoft OS/2 Programmer's Reference, Volume 1.+  - Microsoft OS/2 Version 1.1 Programmer's Reference, Volume 1.