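Task-State Segment (TSS)
As covered earlier, the privilege level of the stack segment in SS must match the CPL held in CS, so whenever a call gate, interrupt gate, or trap gate causes a privilege-level change, the stack has to switch as well. From the discussion of gate descriptors we know that on a CPL change the new CS comes from the gate descriptor. But where do the new SS and ESP come from? The answer is the task-state segment (TSS).
The task-state segment is a block of memory at least 104 bytes long that holds several important values.
The layout figure is in Section 7.2.1, "Task-State Segment (TSS)", of the Intel manual.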

Expressed as a C structure, it looks like this.
typedef struct TSS {
DWORD link;
DWORD esp0;
DWORD ss0;
DWORD esp1;
DWORD ss1;
DWORD esp2;
DWORD ss2;
DWORD cr3;
DWORD eip;
DWORD eflags;
DWORD eax;
DWORD ecx;
DWORD edx;
DWORD ebx;
DWORD esp;
DWORD ebp;
DWORD esi;
DWORD edi;
DWORD es;
DWORD cs;
DWORD ss;
DWORD ds;
DWORD fs;
DWORD gs;
DWORD ldt;
DWORD io_map;
} TSS;
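As a quick sanity check on the structure, the sketch below restates it with fixed-width types and lets the compiler verify the size and a few offsets. The name Tss32 is made up for this illustration, and the block assumes a compiler that inserts no padding between 32-bit members, which holds for mainstream x86 compilers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Same field order as the TSS structure above, with fixed-width types
// so the check does not depend on <windows.h>. Tss32 is a hypothetical name.
struct Tss32 {
    uint32_t link;                 // previous task link (selector in the low 16 bits)
    uint32_t esp0, ss0;            // ring 0 stack
    uint32_t esp1, ss1;            // ring 1 stack
    uint32_t esp2, ss2;            // ring 2 stack
    uint32_t cr3, eip, eflags;
    uint32_t eax, ecx, edx, ebx, esp, ebp, esi, edi;
    uint32_t es, cs, ss, ds, fs, gs;
    uint32_t ldt;
    uint32_t io_map;               // T flag in bit 0, I/O map base in the upper 16 bits
};

static_assert(sizeof(Tss32) == 104, "a 32-bit TSS is 104 bytes (0x68)");
static_assert(offsetof(Tss32, esp0) == 4, "ESP0 sits at byte 4");
static_assert(offsetof(Tss32, ss0) == 8, "SS0 sits at byte 8");
static_assert(offsetof(Tss32, cr3) == 28, "CR3 sits at byte 28");
static_assert(offsetof(Tss32, io_map) == 100, "T flag and I/O map base sit at byte 100");
```

If any of the static_assert lines fails, the structure no longer matches the hardware layout and the experiment later in this post would load garbage into the registers.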
The manual describes each field as follows:
The processor updates dynamic fields when a task is suspended during a task switch. The following are dynamic
fields:
• General-purpose register fields — State of the EAX, ECX, EDX, EBX, ESP, EBP, ESI, and EDI registers prior
to the task switch.
• Segment selector fields — Segment selectors stored in the ES, CS, SS, DS, FS, and GS registers prior to the
task switch.
• EFLAGS register field — State of the EFLAGS register prior to the task switch.
• EIP (instruction pointer) field — State of the EIP register prior to the task switch.
• Previous task link field — Contains the segment selector for the TSS of the previous task (updated on a task
switch that was initiated by a call, interrupt, or exception). This field (which is sometimes called the back link
field) permits a task switch back to the previous task by using the IRET instruction.
The processor reads the static fields, but does not normally change them. These fields are set up when a task is
created. The following are static fields:
• LDT segment selector field — Contains the segment selector for the task’s LDT.
• CR3 control register field — Contains the base physical address of the page directory to be used by the task.
Control register CR3 is also known as the page-directory base register (PDBR).
• Privilege level-0, -1, and -2 stack pointer fields — These stack pointers consist of a logical address made
up of the segment selector for the stack segment (SS0, SS1, and SS2) and an offset into the stack (ESP0,
ESP1, and ESP2). Note that the values in these fields are static for a particular task; whereas, the SS and ESP
values will change if stack switching occurs within the task.
• T (debug trap) flag (byte 100, bit 0) — When set, the T flag causes the processor to raise a debug exception
when a task switch to this task occurs (see Section 17.3.1.5, “Task-Switch Exception Condition”).
• I/O map base address field — Contains a 16-bit offset from the base of the TSS to the I/O permission bit
map and interrupt redirection bitmap. When present, these maps are stored in the TSS at higher addresses.
The I/O map base address points to the beginning of the I/O permission bit map and the end of the interrupt
redirection bit map. See Chapter 14, “Input/Output,” in the Intel® 64 and IA-32 Architectures Software
Developer’s Manual, Volume 1, for more information about the I/O permission bit map. See Section 20.3,
“Interrupt and Exception Handling in Virtual-8086 Mode,” for a detailed description of the interrupt redirection
bit map.
If paging is used:
• Avoid placing a page boundary in the part of the TSS that the processor reads during a task switch (the first 104
bytes). The processor may not correctly perform address translations if a boundary occurs in this area. During
a task switch, the processor reads and writes into the first 104 bytes of each TSS (using contiguous physical
addresses beginning with the physical address of the first byte of the TSS). So, after TSS access begins, if part
of the 104 bytes is not physically contiguous, the processor will access incorrect information without generating
a page-fault exception.
• Pages corresponding to the previous task’s TSS, the current task’s TSS, and the descriptor table entries for
each all should be marked as read/write.
• Task switches are carried out faster if the pages containing these structures are present in memory before the
task switch is initiated.
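The page-boundary advice can be turned into a small check. This is only a sketch: it assumes 4 KB pages, and the function name tss_crosses_page is invented for the example.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kPageSize   = 0x1000;  // assumption: 4 KB pages
constexpr uint32_t kTssMinSize = 104;     // the part the processor reads on a task switch

// True if the first 104 bytes starting at linear address `base`
// would span two pages.
constexpr bool tss_crosses_page(uint32_t base) {
    return (base & (kPageSize - 1)) + kTssMinSize > kPageSize;
}

static_assert(!tss_crosses_page(0x0041E000), "a page-aligned TSS is always safe");
```

A TSS that starts at a page-aligned address, such as the 0x0041E000 used later in this post, can never straddle a boundary within its first 104 bytes.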
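Looking at this structure, there are two DWORD fields named ss0 and esp0. When code crosses from ring 3 to ring 0, the CPU reads SS0 and ESP0 from the TSS and loads them into the ss and esp registers. Windows uses only rings 0 and 3, so the ring 1 and ring 2 pairs go unused.
What the TSS is for
Holding the stack segment selectors and stack pointers for rings 0, 1, and 2
As mentioned above, a privilege-raising transfer across segments needs a stack switch: the CPU locates the TSS through the tr register and copies SS0 and ESP0 from it into ss and esp. This is only one use of the TSS, and it is the one modern Windows relies on.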
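The stack lookup described above can be modelled in a few lines. This is a simplified sketch, not how the processor is implemented: TssStacks, StackPtr, and stack_for_ring are names invented here, and all the protection checks the CPU performs on the new SS are left out.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Only the static stack fields of the TSS (hypothetical helper type).
struct TssStacks {
    uint32_t esp0, ss0;
    uint32_t esp1, ss1;
    uint32_t esp2, ss2;
};

struct StackPtr {
    uint16_t ss;
    uint32_t esp;
};

// Pick the SS:ESP pair for the privilege level being entered.
// Ring 3 has no entry: an inward transfer never lands in ring 3.
inline StackPtr stack_for_ring(const TssStacks& t, int new_cpl) {
    switch (new_cpl) {
        case 0: return { static_cast<uint16_t>(t.ss0), t.esp0 };
        case 1: return { static_cast<uint16_t>(t.ss1), t.esp1 };
        case 2: return { static_cast<uint16_t>(t.ss2), t.esp2 };
        default: throw std::invalid_argument("the TSS holds stacks for rings 0-2 only");
    }
}
```

On Windows only the case 0 branch would ever be taken, which matches the observation that rings 1 and 2 are unused.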
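Switching a whole set of registers at once
What is the other use? The TSS structure holds more than the SS and ESP for each privilege level: it also has fields such as cs, eip, ss, and esp. These fields without a numeric suffix each mirror a CPU register. A call or jmp whose target is a TSS selector loads all of them into the corresponding registers in one step, and the old values are saved into the old TSS.
The GDT can hold several TSS descriptors, so several TSSs can exist in memory at once. Exactly one of them is in use at any time: the one the tr register points to. On a call/jmp to a TSS selector, the CPU does the following:
1. Writes the current value of every register that has a field in the TSS into the TSS that tr currently points to.
2. Loads the descriptor referenced by the new TSS selector into the tr register.
3. Overwrites every register that has a field in the TSS with the value stored in the new TSS.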
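The three steps can be mimicked with a toy model. Everything here is an assumption made for illustration: the Cpu, Task, and Regs types are invented, only a handful of registers are tracked, and a std::map stands in for the GDT plus the TSS memory it describes. The back link and NT flag are included because a CALL (unlike a JMP) records how to get back.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

constexpr uint32_t kEflagsNT = 0x4000;  // nested task flag, bit 14

struct Regs {                 // a small subset of what a real TSS stores
    uint32_t eip = 0, esp = 0, eflags = 0, cr3 = 0;
    uint16_t cs = 0, ss = 0;
};

struct Task {                 // stand-in for one TSS in memory
    uint16_t link = 0;        // previous task link
    Regs saved;
};

struct Cpu {
    Regs regs;
    uint16_t tr = 0;                            // selector of the current TSS
    std::map<uint16_t, Task> tss_by_selector;   // stand-in for GDT + TSS memory

    // Models "call far <TSS selector>".
    void call_task(uint16_t new_selector) {
        Task& old_task = tss_by_selector.at(tr);
        Task& new_task = tss_by_selector.at(new_selector);
        old_task.saved = regs;       // step 1: save state into the current TSS
        new_task.link = tr;          // CALL only: remember where to return
        tr = new_selector;           // step 2: TR now refers to the new TSS
        regs = new_task.saved;       // step 3: load state from the new TSS
        regs.eflags |= kEflagsNT;    // CALL only: mark the new task as nested
    }
};
```

A short usage example: put two Task entries in tss_by_selector (say under selectors 0x28 and 0x48), set tr to 0x28, and call call_task(0x48). Afterwards regs holds the values stored under 0x48 and the entry under 0x28 holds the state that was running before.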
The figure below shows Intel's intended design, but modern operating systems do not use it quite this way.


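The TR segment register
So how does the CPU find the TSS?
The CPU finds the GDT through the gdtr register and the IDT through the idtr register. In the same way, it finds the TSS through the tr register. Unlike gdtr and idtr, however, tr is classified as a segment register, 96 bits wide: a visible 16-bit selector plus a hidden part that caches the descriptor's base, limit, and attributes. The operating system loads tr at boot from a TSS descriptor, and TSS descriptors are stored in the GDT.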
7.2.4 Task Register
The task register holds the 16-bit segment selector and the entire segment descriptor (32-bit base address (64 bits
in IA-32e mode), 16-bit segment limit, and descriptor attributes) for the TSS of the current task (see Figure 2-6).
This information is copied from the TSS descriptor in the GDT for the current task. Figure 7-5 shows the path the
processor uses to access the TSS (using the information in the task register).
The task register has a visible part (that can be read and changed by software) and an invisible part (maintained
by the processor and is inaccessible by software). The segment selector in the visible portion points to a TSS
descriptor in the GDT. The processor uses the invisible portion of the task register to cache the segment descriptor
for the TSS. Caching these values in a register makes execution of the task more efficient. The LTR (load task
register) and STR (store task register) instructions load and read the visible portion of the task register:
The LTR instruction loads a segment selector (source operand) into the task register that points to a TSS descriptor
in the GDT. It then loads the invisible portion of the task register with information from the TSS descriptor. LTR is a
privileged instruction that may be executed only when the CPL is 0. It’s used during system initialization to put an
initial value in the task register. Afterwards, the contents of the task register are changed implicitly when a task
switch occurs.
The STR (store task register) instruction stores the visible portion of the task register in a general-purpose register
or memory. This instruction can be executed by code running at any privilege level in order to identify the currently
running task. However, it is normally used only by operating system software.
On power up or reset of the processor, segment selector and base address are set to the default value of 0; the limit
is set to FFFFH.

This brings up another new concept: the TSS descriptor.
The TSS, like all other segments, is defined by a segment descriptor. Figure 7-3 shows the format of a TSS
descriptor. TSS descriptors may only be placed in the GDT; they cannot be placed in an LDT or the IDT.
An attempt to access a TSS using a segment selector with its TI flag set (which indicates the current LDT) causes
a general-protection exception (#GP) to be generated during CALLs and JMPs; it causes an invalid TSS exception
(#TS) during IRETs. A general-protection exception is also generated if an attempt is made to load a segment
selector for a TSS into a segment register.
The busy flag (B) in the type field indicates whether the task is busy. A busy task is currently running or suspended.
A type field with a value of 1001B indicates an inactive task; a value of 1011B indicates a busy task. Tasks are not
recursive. The processor uses the busy flag to detect an attempt to call a task whose execution has been interrupted. To ensure that only one busy flag is associated with a task, each TSS should have only one TSS
descriptor that points to it.

The base, limit, and DPL fields and the granularity and present flags have functions similar to their use in data-segment descriptors (see Section 3.4.5, “Segment Descriptors”). When the G flag is 0 in a TSS descriptor for a 32-bit TSS,
the limit field must have a value equal to or greater than 67H, one byte less than the minimum size of a
TSS. Attempting to switch to a task whose TSS descriptor has a limit less than 67H generates an invalid-TSS exception (#TS). A larger limit is required if an I/O permission bit map is included or if the operating system stores additional data. The processor does not check for a limit greater than 67H on a task switch; however, it does check
when accessing the I/O permission bit map or interrupt redirection bit map.
Any program or procedure with access to a TSS descriptor (that is, whose CPL is numerically equal to or less than
the DPL of the TSS descriptor) can dispatch the task with a call or a jump.
In most systems, the DPLs of TSS descriptors are set to values less than 3, so that only privileged software can
perform task switching. However, in multitasking applications, DPLs for some TSS descriptors may be set to 3 to
allow task switching at the application (or user) privilege level.
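There is also a 64-bit form of the TSS descriptor, which is not covered here.
A TSS descriptor is one kind of system-segment descriptor. The field to watch is the B (busy) bit in the type field: 0 (type 1001B) means the TSS has not been loaded into TR, and 1 (type 1011B) means it has been loaded into TR.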
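To make the type field concrete, here is a minimal sketch that pulls the S bit and type out of a raw 8-byte descriptor and tests the busy flag. The helper names are made up for the example; the bit positions (type in bits 40-43, S in bit 44) follow the standard descriptor layout.

```cpp
#include <cassert>
#include <cstdint>

constexpr unsigned kTypeTssAvailable = 0x9;  // 1001b
constexpr unsigned kTypeTssBusy      = 0xB;  // 1011b

// Type field: bits 40-43 of the 64-bit descriptor.
constexpr unsigned type_field(uint64_t d) {
    return static_cast<unsigned>((d >> 40) & 0xF);
}

// S flag (bit 44) is 0 for system descriptors such as a TSS descriptor.
constexpr bool is_system_descriptor(uint64_t d) {
    return ((d >> 44) & 1) == 0;
}

// A 32-bit TSS descriptor has type 1001b or 1011b; they differ only in the B bit.
constexpr bool is_tss32(uint64_t d) {
    return is_system_descriptor(d) && (type_field(d) | 0x2) == kTypeTssBusy;
}

constexpr bool is_busy(uint64_t d) {
    return is_tss32(d) && type_field(d) == kTypeTssBusy;
}
```

For example, a descriptor whose access byte is 0xE9 is an available TSS, while the same descriptor with access byte 0xEB is busy.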
Reading and writing TR
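Note that LTR only changes the TR register; it does not modify the TSS itself. It can only be executed at ring 0, and after the load the busy (B) bit of the TSS descriptor is set.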
Let's run a small experiment.
From ring 3, we can reach a task segment with JMP FAR or CALL FAR.

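In short, when the target of a JMP FAR or CALL FAR is a TSS, the processor does the following:
1. Stores the state of the current task in the current TSS.
2. Loads the selector of the new task's TSS into the TR register.
3. Accesses the new TSS through its segment descriptor in the GDT.
4. Loads the new task's state from the new TSS into the general-purpose registers, the segment registers, LDTR, control register CR3 (the page-table base), the EFLAGS register, and the EIP register.
5. Begins executing the new task.
First, build a TSS descriptor.
Assume the TSS is at address 0x0041E000.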
High dword (bits 63:32):
Base 31:24 | G | 0 | 0 | AVL | Limit 19:16 | P | DPL | S | Type | Base 23:16
0000 0000  | 0 | 0 | 0 | 0   | 0000        | 1 | 11  | 0 | 1001 | 0100 0001
Low dword (bits 31:0):
Base 15:00          | Limit 15:00
1110 0000 0000 0000 | 0000 0000 0110 1000
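Packing the bits by hand is error-prone, so here is a small sketch that assembles the same descriptor from its fields. The function name make_tss_descriptor is invented for this example, and it fixes G = 0, AVL = 0, and P = 1 to match the layout above.

```cpp
#include <cassert>
#include <cstdint>

// Builds a 32-bit TSS descriptor (S = 0, G = 0, AVL = 0, P = 1).
inline uint64_t make_tss_descriptor(uint32_t base, uint32_t limit,
                                    unsigned dpl, bool busy = false) {
    const uint32_t type = busy ? 0xBu : 0x9u;        // 1011b or 1001b
    const uint32_t low  = (limit & 0xFFFFu)          // Limit 15:00
                        | ((base & 0xFFFFu) << 16);  // Base 15:00
    const uint32_t high = ((base >> 16) & 0xFFu)     // Base 23:16
                        | (type << 8)                // Type
                        | ((dpl & 0x3u) << 13)       // DPL
                        | (1u << 15)                 // P
                        | (limit & 0xF0000u)         // Limit 19:16
                        | (base & 0xFF000000u);      // Base 31:24
    return (static_cast<uint64_t>(high) << 32) | low;
}
```

With base 0x0041E000, limit 0x68, and DPL 3 this yields 0x0000E941E0000068, which agrees with the bit pattern above. DPL 3 is what allows the ring 3 test program to call the task directly.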
The test program is as follows:
#include <stdio.h>   // printf, scanf
#include <windows.h>
typedef struct TSS {
DWORD link;
DWORD esp0;
DWORD ss0;
DWORD esp1;
DWORD ss1;
DWORD esp2;
DWORD ss2;
DWORD cr3;
DWORD eip;
DWORD eflags;
DWORD eax;
DWORD ecx;
DWORD edx;
DWORD ebx;
DWORD esp;
DWORD ebp;
DWORD esi;
DWORD edi;
DWORD es;
DWORD cs;
DWORD ss;
DWORD ds;
DWORD fs;
DWORD gs;
DWORD ldt;
DWORD io_map;
} TSS;
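char st[10] = { 0 };  // placeholder stack; func never pushes or pops
DWORD g_esp;          // ESP observed inside the new task
DWORD g_cs;           // CS observed inside the new task
TSS tss = {
0x00000000,  // link
(DWORD)st,   // esp0
0x00000010,  // ss0
0x00000000,  // esp1
0x00000000,  // ss1
0x00000000,  // esp2
0x00000000,  // ss2
0x00000000,  // cr3, filled in at run time
0x0040fad0,  // eip, overwritten with the address of func in main
0x00000000,  // eflags
0x00000000,  // eax
0x00000000,  // ecx
0x00000000,  // edx
0x00000000,  // ebx
(DWORD)st,   // esp
0x00000000,  // ebp
0x00000000,  // esi
0x00000000,  // edi
0x00000023,  // es
0x00000008,  // cs, ring 0 code segment
0x00000010,  // ss, ring 0 data segment
0x00000023,  // ds
0x00000030,  // fs
0x00000000,  // gs
0x00000000,  // ldt
0x20ac0000   // io_map: I/O map base 0x20ac in the upper 16 bits, T flag clear
};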
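// Runs as the new task at ring 0. Declared naked so the compiler
// emits no prologue or epilogue that would touch the stack.
__declspec(naked) void func() {
    __asm {
        mov g_esp, esp;  // record the ESP loaded from the TSS
        mov eax, 0;
        mov ax, cs;      // record the CS loaded from the TSS
        mov g_cs, eax;
        iretd;           // NT is set, so this switches back to the calling task
    }
}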
int main(int argc, char* argv[])
{
    tss.eip = (DWORD)func;
    printf("TSS:%x\n", (DWORD)&tss);
    printf("func:%x\n", tss.eip);
    printf("please input cr3:\n");
    scanf("%x", &(tss.cr3));
    // Far pointer: a 4-byte offset (ignored when the target is a TSS)
    // followed by the 2-byte selector 0x48.
    char buffer[6] = { 0, 0, 0, 0, 0x48, 0 };
    __asm {
        call fword ptr[buffer]
    }
    printf("g_cs = %08x\ng_esp = %08x\n", g_cs, g_esp);
    return 0;
}
After the program prints the TSS address, build the TSS descriptor from it.

Then write the descriptor into the GDT entry that the selector 0x48 refers to.
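To see which GDT entry has to be patched, the selector used by the test program can be decoded. This is a minimal sketch; the Selector type and the helper names are invented for the example.

```cpp
#include <cassert>
#include <cstdint>

struct Selector {
    uint16_t index;  // entry number in the descriptor table
    bool ti;         // table indicator: false = GDT, true = LDT
    uint8_t rpl;     // requested privilege level
};

inline Selector decode_selector(uint16_t raw) {
    return { static_cast<uint16_t>(raw >> 3),
             ((raw >> 2) & 1) != 0,
             static_cast<uint8_t>(raw & 0x3) };
}

// Byte offset of the entry from the table base; each descriptor is 8 bytes.
inline uint32_t descriptor_offset(uint16_t raw) {
    return static_cast<uint32_t>(raw >> 3) * 8u;
}

// Selector stored in a 6-byte far pointer: bytes 0-3 are the offset,
// bytes 4-5 the selector (little-endian).
inline uint16_t selector_from_far_pointer(const unsigned char fp[6]) {
    return static_cast<uint16_t>(fp[4] | (fp[5] << 8));
}
```

For the buffer { 0, 0, 0, 0, 0x48, 0 } the selector is 0x48: index 9, TI = 0 (GDT), RPL = 0. The descriptor therefore belongs at the GDT base plus 0x48.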

Break into WinDbg and run !process 0 0.
Find the target process and note its CR3, shown as DirBase.

After entering the value, resume execution. The task runs and the program prints the values it read.

PS:
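There is a pitfall here: if the task code contains an int 3 and does not restore the EFLAGS register afterwards, the machine hangs.
The reason:
On a task switch through a TSS, the selector of the old TSS is saved in the link field at the start of the new TSS (as seen above). When we return with iretd, the CPU does not use a return address on the stack as it would for an interrupt. Instead it follows that TSS selector to the old TSS and reloads every register from it.
INT 3, however, clears four flags: VM, NT, IF, and TF. NT stands for nested task. Once it is cleared, the CPU assumes there is no task nesting and returns the ordinary way, through the return address on the stack, which fails here.
So before returning, the code needs the following sequence to set the NT bit in the EFLAGS register again.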
__asm{
pushfd;          // push EFLAGS
pop eax;
or eax,0x4000;   // set NT (bit 14)
push eax;
popfd;           // write the value back to EFLAGS
}
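The role of the NT bit can be summarised in a small model. It is only a sketch of the behaviour described in this PS: the function names are invented, and after_int3 clears exactly the four flags named in the text rather than reproducing everything the processor does on an interrupt.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kTF = 1u << 8;   // trap flag
constexpr uint32_t kIF = 1u << 9;   // interrupt enable flag
constexpr uint32_t kNT = 1u << 14;  // nested task flag (0x4000)
constexpr uint32_t kVM = 1u << 17;  // virtual-8086 mode flag

enum class IretKind { StackReturn, TaskSwitchBack };

// IRET checks NT: set means "return to the task named in the link field".
constexpr IretKind iret_kind(uint32_t eflags) {
    return (eflags & kNT) ? IretKind::TaskSwitchBack : IretKind::StackReturn;
}

// Flags cleared by INT 3, as described in the text above.
constexpr uint32_t after_int3(uint32_t eflags) {
    return eflags & ~(kVM | kNT | kIF | kTF);
}

// What the pushfd / or / popfd sequence achieves.
constexpr uint32_t restore_nt(uint32_t eflags) {
    return eflags | kNT;
}

static_assert(kNT == 0x4000, "the mask used in the inline assembly is the NT bit");
```

Starting from a value with NT set, after_int3 yields a value for which iret_kind reports StackReturn, and restore_nt brings it back to TaskSwitchBack. That is the whole reason the fix-up has to run before iretd.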