  • Aligning Word table-of-contents entries that begin with English text

    2020-05-10 13:37:04

    Problem description:

    See the image: figure captions that begin with English text do not line up with captions that begin with Chinese text. The cause is the font of the space between the figure label and the caption name; when the context is English, the space defaults to Times New Roman.

    Solution:

    Hold Ctrl and click the "图3-3" entry to jump to that figure, then select the space. It displays in a Latin font; change it to SimSun (宋体).

    After the change:

    Return to the table of contents and update the field.

    The entries are now aligned.
  • Aligning mixed Chinese and English table-of-contents entries in Word

    2013-03-11 10:18:32

    To align a table of contents that mixes Chinese and English entries:

    Select the text, then in the menu bar open Paragraph, switch to the Asian Typography (中文版式) tab, check "Allow Latin text to wrap in the middle of a word", and click OK.
    That completes the fix.
  • Modern Operating Systems (4th ed.), English edition, reading notes: Section 4.3.3, Implementing Directories

    Before a file can be read, it must be opened. When a file is opened, the operating system uses the path name supplied by the user to locate the directory entry on the disk. 

    Before a file can be read it must be opened. When the file is opened, the operating system uses the path name supplied by the user to locate the directory entry (the entry describing the file) on the disk.

    The directory entry provides the information needed to find the disk blocks. Depending on the system, this information may be the disk address of the entire file (with contiguous allocation), the number of the first block (both linked-list schemes), or the number of the i-node. In all cases, the main function of the directory system is to map the ASCII name of the file onto the information needed to locate the data.

    The directory entry provides the information needed to find the disk blocks. Depending on the system, this information may be the disk address of the entire file (with contiguous allocation), the number of the first block (with the linked-list schemes), or the number of the i-node. In all these designs, the main function of the directory system is to map the ASCII name of the file onto the information needed to locate its data.


    PS: According to Zhao Jiong's book on the Linux kernel source (《Linux内核源码解析》), a directory entry in Linux contains only the file name and the i-node number. The i-node number corresponds to the i_num member of the i-node structure and thus locates the i-node, which holds the file's attributes and the numbers of the disk blocks containing its data.
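    The note above can be sketched as a tiny C structure plus a linear lookup. Field names and widths here are illustrative assumptions, not taken from any specific kernel release:

    ```c
    #include <assert.h>
    #include <stdint.h>
    #include <string.h>

    /* Hypothetical fixed-size directory entry as described in the note:
     * just a file name plus an i-node number. */
    #define DIR_NAME_LEN 14

    struct dir_entry {
        uint16_t inode_nr;        /* index of the i-node holding the file's metadata */
        char name[DIR_NAME_LEN];  /* file name, NUL-padded */
    };

    /* Linear scan of a directory: map a name to an i-node number (0 = not found).
     * This is exactly the "map the ASCII name onto locating information" job. */
    static uint16_t dir_lookup(const struct dir_entry *dir, int n, const char *name) {
        for (int i = 0; i < n; i++)
            if (strncmp(dir[i].name, name, DIR_NAME_LEN) == 0)
                return dir[i].inode_nr;
        return 0;
    }
    ```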



    One alternative is to give up the idea that all directory entries are the same size. With this method, each directory entry contains a fixed portion, typically starting with the length of the entry, and then followed by data with a fixed format, usually including the owner, creation time, protection information, and other attributes.

    One alternative is to give up the idea that all directory entries are the same size. Each entry then contains a fixed portion, typically starting with the length of the entry, followed by data in a fixed format, usually including the owner, creation time, protection information, and other attributes.
    This fixed-length header is followed by the actual file name, however long it may be, as shown in Fig. 4-15(a) in big-endian format (e.g., SPARC). In this example we have three files, project-budget, personnel, and foo. Each file name is terminated by a special character (usually 0), which is represented in the figure by a box with a cross in it. To allow each directory entry to begin on a word boundary, each file name is filled out to an integral number of words, shown by shaded boxes in the figure.

    As Fig. 4-15(a) shows (in big-endian format), the header with the length and attributes is followed by the file's actual name, however long it may be. In the example there are three files: project-budget, personnel, and foo. Each file name ends with a special terminator (usually 0), drawn in the figure as a box with a cross in it. So that every directory entry starts on a word boundary (address-aligned), each file name is padded out to a whole number of words, shown as shaded boxes in the figure.

    A disadvantage of this method is that when a file is removed, a variable-sized gap is introduced into the directory into which the next file to be entered may not fit. This problem is essentially the same one we saw with contiguous disk files, only now compacting the directory is feasible because it is entirely in memory. Another problem is that a single directory entry may span multiple pages, so a page fault may occur while reading a file name.

    The drawback of this method is that when a file is removed, a gap of variable size appears in the directory, and the next file to be added may not fit it. The problem is essentially the same one we saw with contiguously stored disk files, except that compacting the directory is now feasible because it resides entirely in memory. Another problem is that a single directory entry may span multiple pages, so a page fault may occur while reading a file name.
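    A minimal sketch of how such length-prefixed, word-padded entries could be packed into a directory buffer. The 4-byte length and attribute fields and the 4-byte word size are assumptions for illustration, not the book's exact layout:

    ```c
    #include <stdint.h>
    #include <string.h>

    /* Append one Fig. 4-15(a)-style variable-length entry at offset `off`:
     * a length header, fixed attributes, then the NUL-terminated name padded
     * to a 4-byte (word) boundary. Returns the offset of the next entry. */
    static size_t dir_append(uint8_t *buf, size_t off, uint32_t attrs, const char *name) {
        size_t name_bytes = strlen(name) + 1;          /* include the NUL terminator  */
        size_t padded = (name_bytes + 3) & ~(size_t)3; /* round up to a word boundary */
        uint32_t entry_len = 8 + (uint32_t)padded;     /* 4B length + 4B attrs + name */

        memcpy(buf + off, &entry_len, 4);              /* length header               */
        memcpy(buf + off + 4, &attrs, 4);              /* fixed-format attributes     */
        memset(buf + off + 8, 0, padded);              /* filler bytes after the name */
        memcpy(buf + off + 8, name, name_bytes);
        return off + entry_len;                        /* next entry starts here      */
    }
    ```

    The padding is what keeps every following entry word-aligned; removing it is exactly the "minor win" of the heap scheme in Fig. 4-15(b).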

    Another way to handle variable-length names is to make the directory entries themselves all fixed length and keep the file names together in a heap at the end of the directory, as shown in Fig. 4-15(b). This method has the advantage that when
    an entry is removed, the next file entered will always fit there. Of course, the heap must be managed and page faults can still occur while processing file names. One minor win here is that there is no longer any real need for file names to begin at word boundaries, so no filler characters are needed after file names in Fig. 4-15(b) as they are in Fig. 4-15(a).

    Another way to handle variable-length names is to make the directory entries themselves fixed length and keep the file names together in a heap at the end of the directory, as shown in Fig. 4-15(b). The advantage is that when an entry is removed, the next file added will always fit in its place. Of course, the heap must be managed, and page faults can still occur while processing file names. One small win is that file names no longer need to start on word boundaries, so no filler bytes are needed after them; Fig. 4-15(b) and Fig. 4-15(a) depict the same directory.

    In all of the designs so far, directories are searched linearly from beginning to end when a file name has to be looked up. For extremely long directories, linear searching can be slow. One way to speed up the search is to use a hash table in each directory. Call the size of the table n. To enter a file name, the name is hashed onto a value between 0 and n - 1, for example, by dividing it by n and taking the remainder. Alternatively, the words comprising the file name can be added up and this quantity divided by n, or something similar.

    In all the directory schemes so far, looking up a file name means searching the directory linearly. For extremely long directories, linear search can be very slow. One way to speed it up is to use a hash table in each directory. Let the table size be n. A file name is hashed to a value between 0 and n-1, for example by dividing by n and taking the remainder; alternatively, the character values making up the name can be added up and that sum divided by n, or something similar.

    Either way, the table entry corresponding to the hash code is inspected. If it is unused, a pointer is placed there to the file entry. File entries follow the hash table.If that slot is already in use, a linked list is constructed, headed at the table entry and threading through all entries with the same hash value.
    Either way, the table entry corresponding to the hash code is inspected. If it is unused, a pointer to the file entry is placed there (the file entries follow the hash table). If the slot is already in use, a linked list is built, headed at the table entry and threading through all entries with the same hash value.
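    The hashed directory described above can be sketched as follows; the bucket count, field names, and the byte-sum hash are illustrative assumptions:

    ```c
    #include <string.h>

    /* Minimal sketch of a hashed directory: an array of N buckets, each the
     * head of a singly linked list of entries sharing the same hash value. */
    #define NBUCKETS 8

    struct hentry {
        const char *name;
        int inode_nr;
        struct hentry *next;              /* chains entries that collide */
    };

    static unsigned name_hash(const char *s) {
        unsigned h = 0;
        while (*s) h += (unsigned char)*s++; /* add up the bytes...            */
        return h % NBUCKETS;                 /* ...and take the remainder mod n */
    }

    static void hdir_insert(struct hentry *buckets[NBUCKETS], struct hentry *e) {
        unsigned b = name_hash(e->name);
        e->next = buckets[b];                /* head insertion into the chain  */
        buckets[b] = e;
    }

    static int hdir_lookup(struct hentry *const buckets[NBUCKETS], const char *name) {
        for (struct hentry *e = buckets[name_hash(name)]; e; e = e->next)
            if (strcmp(e->name, name) == 0)
                return e->inode_nr;
        return -1;                           /* not found */
    }
    ```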


  • Memory alignment: principles, effects, and examples (with English and Chinese explanations; useful for sizeof on structs)

    Contents

    Preface
    1. Memory access granularity
       Memory access granularity
       The principle of memory alignment, explained from the memory side
       Alignment fundamentals
       Lazy processors
    2. Speed (the basic rationale for memory alignment)
       Code walkthrough
       Chinese code examples and their memory layout
    3. Possible consequences of ignoring memory alignment
    4. Memory alignment rules
       Why memory is aligned
       Alignment rules
       Experiments
    5. Author

    Preface
    This article is a synthesis of four blog posts (not original work). It explains why memory alignment exists, what it does (with both English and Chinese explanations), and the rules it follows; it is especially useful for computing sizeof on structs.
    It first explains the principle of memory alignment, then its effects, and finally gives examples. The second Chinese author goes through alignment in detail with worked examples. Note, for instance:
    struct
    {
        char a;
        int  b;
        char c;
    } A;
    When a is allocated it takes one byte of a four-byte word, leaving three; but b needs four bytes, and three are not enough, so b must start in the next word.
    I think the most important point the second Chinese author makes (in the order of the blog links at the end) is that drawing a diagram is an excellent method.
    The fourth blog I quote, the last one, explains the rules of alignment through detailed code.
    In system- and driver-level work, and in hard real-time or high-security development, memory layout remains fundamental to keeping a program stable, safe, and efficient, so memory alignment is worth mastering.

    The article may feel miscellaneous; if parts are unclear, skim the Chinese sections, even though few people read my blog.
                                                                    --QQ124045670
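    The preface's struct can be checked with sizeof and offsetof. The numbers below assume a compiler with a 4-byte int and default natural alignment:

    ```c
    #include <stddef.h>

    /* The preface's example: on a typical compiler with default (natural)
     * alignment, b is padded out to offset 4 and the whole struct is
     * rounded up to a multiple of 4. */
    struct A {
        char a;   /* offset 0, then 3 bytes of padding      */
        int  b;   /* offset 4: 4-byte member aligned to 4   */
        char c;   /* offset 8, then 3 bytes of tail padding */
    };
    ```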


    1. Memory access granularity
    The principle of memory alignment, explained from the memory side

    Programmers are conditioned to think of memory as a simple array of bytes. Among C and its descendants, char* is ubiquitous as meaning "a block of memory", and even Java has its byte[] type to represent raw memory.


    Figure 1. How programmers see memory
    How Programmers See Memory

    However, your computer's processor does not read from and write to memory in byte-sized chunks. Instead, it accesses memory in two-, four-, eight-, 16-, or even 32-byte chunks. We'll call the size in which a processor accesses memory its memory access granularity.


    Figure 2. How processors see memory
    How Some Processors See Memory

    The difference between how high-level programmers think of memory and how modern processors actually work with memory raises interesting issues that this article explores.

    If you don't understand and address alignment issues in your software, the following scenarios, in increasing order of severity, are all possible:

    • Your software will run slower.
    • Your application will lock up.
    • Your operating system will crash.
    • Your software will silently fail, yielding incorrect results.


    队列原理 Alignment fundamentals

    To illustrate the principles behind alignment, examine a constant task, and how it's affected by a processor's memory access granularity. The task is simple: first read four bytes from address 0 into the processor's register. Then read four bytes from address 1 into the same register.

    First examine what would happen on a processor with a one-byte memory access granularity:


    Figure 3. Single-byte memory access granularity
    Single-byte memory access granularity

    This fits in with the naive programmer's model of how memory works: it takes the same four memory accesses to read from address 0 as it does from address 1. Now see what would happen on a processor with two-byte granularity, like the original 68000:


    Figure 4. Double-byte memory access granularity
    Double-byte memory access granularity

    When reading from address 0, a processor with two-byte granularity takes half the number of memory accesses as a processor with one-byte granularity. Because each memory access entails a fixed amount of overhead, minimizing the number of accesses can really help performance.

    However, notice what happens when reading from address 1. Because the address doesn't fall evenly on the processor's memory access boundary, the processor has extra work to do. Such an address is known as an unaligned address. Because address 1 is unaligned, a processor with two-byte granularity must perform an extra memory access, slowing down the operation.

    Finally, examine what would happen on a processor with four-byte memory access granularity, like the 68030 or PowerPC 601:


    Figure 5. Quad-byte memory access granularity
    Quad-byte memory access granularity

    A processor with four-byte granularity can slurp up four bytes from an aligned address with one read. Also note that reading from an unaligned address doubles the access count.

    Now that you understand the fundamentals behind aligned data access, you can explore some of the issues related to alignment.
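    The bookkeeping behind Figures 3 to 5 can be expressed as a few helper functions. This is an illustrative sketch, assuming power-of-two granularities:

    ```c
    #include <stdint.h>

    /* Is `addr` aligned to a power-of-two granularity? */
    static int is_aligned(uintptr_t addr, uintptr_t g) {
        return (addr & (g - 1)) == 0;
    }

    /* Round an address down / up to the nearest granularity boundary. */
    static uintptr_t align_down(uintptr_t addr, uintptr_t g) {
        return addr & ~(g - 1);
    }

    static uintptr_t align_up(uintptr_t addr, uintptr_t g) {
        return (addr + g - 1) & ~(g - 1);
    }

    /* How many g-sized accesses does a read of `size` bytes at `addr` touch?
     * This reproduces the access counts shown in the figures. */
    static uintptr_t accesses_needed(uintptr_t addr, uintptr_t size, uintptr_t g) {
        return (align_up(addr + size, g) - align_down(addr, g)) / g;
    }
    ```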


    Lazy processors

    A processor has to perform some tricks when instructed to access an unaligned address. Going back to the example of reading four bytes from address 1 on a processor with four-byte granularity, you can work out exactly what needs to be done:


    Figure 6. How processors handle unaligned memory access
    How processors handle unaligned memory access

    The processor needs to read the first chunk of the unaligned address and shift out the "unwanted" bytes from the first chunk. Then it needs to read the second chunk of the unaligned address and shift out some of its information. Finally, the two are merged together for placement in the register. It's a lot of work.

    Some processors just aren't willing to do all of that work for you.

    The original 68000 was a processor with two-byte granularity and lacked the circuitry to cope with unaligned addresses. When presented with such an address, the processor would throw an exception. The original Mac OS didn't take very kindly to this exception, and would usually demand the user restart the machine. Ouch.

    Later processors in the 680x0 series, such as the 68020, lifted this restriction and performed the necessary work for you. This explains why some old software that works on the 68020 crashes on the 68000. It also explains why, way back when, some old Mac coders initialized pointers with odd addresses. On the original Mac, if the pointer was accessed without being reassigned to a valid address, the Mac would immediately drop into the debugger. Often they could then examine the calling chain stack and figure out where the mistake was.

    All processors have a finite number of transistors to get work done. Adding unaligned address access support cuts into this "transistor budget." These transistors could otherwise be used to make other portions of the processor work faster, or add new functionality altogether.

    An example of a processor that sacrifices unaligned address access support in the name of speed is MIPS. MIPS is a great example of a processor that does away with almost all frivolity in the name of getting real work done faster.

    The PowerPC takes a hybrid approach. Every PowerPC processor to date has hardware support for unaligned 32-bit integer access. While you still pay a performance penalty for unaligned access, it tends to be small.

    On the other hand, modern PowerPC processors lack hardware support for unaligned 64-bit floating-point access. When asked to load an unaligned floating-point number from memory, modern PowerPC processors will throw an exception and have the operating system perform the alignment chores in software. Performing alignment in software is much slower than performing it in hardware.


    2. Speed (the basic rationale for memory alignment)

    One benefit of memory alignment is that it speeds up memory access. Data structures occupy memory, and many systems require allocations to be aligned. The code below demonstrates why alignment improves speed.

    Code walkthrough

    Writing some tests illustrates the performance penalties of unaligned memory access. The test is simple: you read, negate, and write back the numbers in a ten-megabyte buffer. These tests have two variables:

    1. The size, in bytes, in which you process the buffer. First you'll process the buffer one byte at a time. Then you'll move onto two-, four- and eight-bytes at a time.
    2. The alignment of the buffer. You'll stagger the alignment of the buffer by incrementing the pointer to the buffer and running each test again.

    These tests were performed on a 800 MHz PowerBook G4. To help normalize performance fluctuations from interrupt processing, each test was run ten times, keeping the average of the runs. First up is the test that operates on a single byte at a time:

    Listing 1. Munging data one byte at a time
    void Munge8( void *data, uint32_t size ) {
        uint8_t *data8 = (uint8_t*) data;
        uint8_t *data8End = data8 + size;

        while( data8 != data8End ) {
            *data8 = -*data8;   /* negate in place...                            */
            data8++;            /* ...then advance (avoids the unsequenced       */
        }                       /* modification in `*data8++ = -*data8`)         */
    }
    

    It took an average of 67,364 microseconds to execute this function. Now modify it to work on two bytes at a time instead of one byte at a time -- which will halve the number of memory accesses:


    Listing 2. Munging data two bytes at a time
    void Munge16( void *data, uint32_t size ) {
        uint16_t *data16 = (uint16_t*) data;
        uint16_t *data16End = data16 + (size >> 1); /* Divide size by 2. */
        uint8_t *data8 = (uint8_t*) data16End;
        uint8_t *data8End = data8 + (size & 0x00000001); /* Strip upper 31 bits. */

        while( data16 != data16End ) {
            *data16 = -*data16;
            data16++;
        }
        while( data8 != data8End ) {
            *data8 = -*data8;
            data8++;
        }
    }

    This function took 48,765 microseconds to process the same ten-megabyte buffer -- 38% faster than Munge8. However, that buffer was aligned. If the buffer is unaligned, the time required increases to 66,385 microseconds -- about a 27% speed penalty. The following chart illustrates the performance pattern of aligned memory accesses versus unaligned accesses:


    Figure 7. Single-byte access versus double-byte access
    Single-byte access versus double-byte access

    The first thing you notice is that accessing memory one byte at a time is uniformly slow. The second item of interest is that when accessing memory two bytes at a time, whenever the address is not evenly divisible by two, that 27% speed penalty rears its ugly head.

    Now up the ante, and process the buffer four bytes at a time:


    Listing 3. Munging data four bytes at a time
    void Munge32( void *data, uint32_t size ) {
        uint32_t *data32 = (uint32_t*) data;
        uint32_t *data32End = data32 + (size >> 2); /* Divide size by 4. */
        uint8_t *data8 = (uint8_t*) data32End;
        uint8_t *data8End = data8 + (size & 0x00000003); /* Strip upper 30 bits. */

        while( data32 != data32End ) {
            *data32 = -*data32;
            data32++;
        }
        while( data8 != data8End ) {
            *data8 = -*data8;
            data8++;
        }
    }

    This function processes an aligned buffer in 43,043 microseconds and an unaligned buffer in 55,775 microseconds, respectively. Thus, on this test machine, accessing unaligned memory four bytes at a time is slower than accessing aligned memory two bytes at a time:


    Figure 8. Single- versus double- versus quad-byte access
    Single- versus double- versus quad-byte access

    Now for the horror story: processing the buffer eight bytes at a time.


    Listing 4. Munging data eight bytes at a time
    void Munge64( void *data, uint32_t size ) {
        double *data64 = (double*) data;
        double *data64End = data64 + (size >> 3); /* Divide size by 8. */
        uint8_t *data8 = (uint8_t*) data64End;
        uint8_t *data8End = data8 + (size & 0x00000007); /* Strip upper 29 bits. */

        while( data64 != data64End ) {
            *data64 = -*data64;
            data64++;
        }
        while( data8 != data8End ) {
            *data8 = -*data8;
            data8++;
        }
    }
    

    Munge64 processes an aligned buffer in 39,085 microseconds -- about 10% faster than processing the buffer four bytes at a time. However, processing an unaligned buffer takes an amazing 1,841,155 microseconds -- two orders of magnitude slower than aligned access, an outstanding 4,610% performance penalty!

    What happened? Because modern PowerPC processors lack hardware support for unaligned floating-point access, the processor throws an exception for each unaligned access. The operating system catches this exception and performs the alignment in software. Here's a chart illustrating the penalty, and when it occurs:


    Figure 9. Multiple-byte access comparison
    Multiple-byte access comparison

    The penalties for one-, two- and four-byte unaligned access are dwarfed by the horrendous unaligned eight-byte penalty. Maybe this chart, removing the top (and thus the tremendous gulf between the two numbers), will be clearer:


    Figure 10. Multiple-byte access comparison #2
    Multiple-byte access comparison #2

    There's another subtle insight hidden in this data. Compare eight-byte access speeds on four-byte boundaries:


    Figure 11. Multiple-byte access comparison #3
    Multiple-byte access comparison #3

    Notice accessing memory eight bytes at a time on four- and twelve-byte boundaries is slower than reading the same memory four or even two bytes at a time. While PowerPCs have hardware support for four-byte aligned eight-byte doubles, you still pay a performance penalty if you use that support. Granted, it's nowhere near the 4,610% penalty, but it's certainly noticeable. Moral of the story: accessing memory in large chunks can be slower than accessing memory in small chunks, if that access is not aligned.


    Atomicity

    All modern processors offer atomic instructions. These special instructions are crucial for synchronizing two or more concurrent tasks. As the name implies, atomic instructions must be indivisible -- that's why they're so handy for synchronization: they can't be preempted.

    It turns out that in order for atomic instructions to perform correctly, the addresses you pass them must be at least four-byte aligned. This is because of a subtle interaction between atomic instructions and virtual memory.

    If an address is unaligned, it requires at least two memory accesses. But what happens if the desired data spans two pages of virtual memory? This could lead to a situation where the first page is resident while the last page is not. Upon access, in the middle of the instruction, a page fault would be generated, executing the virtual memory management swap-in code, destroying the atomicity of the instruction. To keep things simple and correct, both the 68K and PowerPC require that atomically manipulated addresses always be at least four-byte aligned.

    Unfortunately, the PowerPC does not throw an exception when atomically storing to an unaligned address. Instead, the store simply always fails. This is bad because most atomic functions are written to retry upon a failed store, under the assumption they were preempted. These two circumstances combine so that your program will go into an infinite loop if you attempt to atomically store to an unaligned address. Oops.
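    In portable modern code the same requirement is usually met by letting the compiler lay out atomic objects, or by requesting alignment explicitly. A C11 sketch (C11 atomics stand in here for the 68K/PowerPC instructions the text discusses; they are not what the original article used):

    ```c
    #include <stdalign.h>
    #include <stdatomic.h>
    #include <stdint.h>

    /* Data manipulated atomically should be naturally aligned. _Alignas
     * requests at least four-byte alignment explicitly, even after a
     * preceding char member that would otherwise skew the layout. */
    struct counter {
        char tag;
        _Alignas(4) atomic_int value;  /* guaranteed to start on a 4-byte boundary */
    };

    /* Indivisible read-modify-write; returns the new value. */
    static int bump(struct counter *c) {
        return atomic_fetch_add(&c->value, 1) + 1;
    }
    ```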


    Altivec

    Altivec is all about speed. Unaligned memory access slows down the processor and costs precious transistors. Thus, the Altivec engineers took a page from the MIPS playbook and simply don't support unaligned memory access. Because Altivec works with sixteen-byte chunks at a time, all addresses passed to Altivec must be sixteen-byte aligned. What's scary is what happens if your address is not aligned.

    Altivec won't throw an exception to warn you about the unaligned address. Instead, Altivec simply ignores the lower four bits of the address and charges ahead, operating on the wrong address. This means your program may silently corrupt memory or return incorrect results if you don't explicitly make sure all your data is aligned.

    There is an advantage to Altivec's bit-stripping ways. Because you don't need to explicitly truncate (align-down) an address, this behavior can save you an instruction or two when handing addresses to the processor.

    This is not to say Altivec can't process unaligned memory. You can find detailed instructions on how to do so in the Altivec Programming Environments Manual (see Resources). It requires more work, but because memory is so slow compared to the processor, the overhead for such shenanigans is surprisingly low.


    Structure alignment

    Examine the following structure:


    Listing 5. An innocent structure

    typedef struct {
        char    a;
        long    b;
        char    c;
    }   Struct;

    What is the size of this structure in bytes? Many programmers will answer "6 bytes." It makes sense: one byte for a, four bytes for b and another byte for c. 1 + 4 + 1 equals 6. Here's how it would lay out in memory:

    Field Type   Field Name   Field Offset   Field Size   Field End
    char         a            0              1            1
    long         b            1              4            5
    char         c            5              1            6
    Total Size in Bytes: 6

    However, if you were to ask your compiler to sizeof( Struct ), chances are the answer you'd get back would be greater than six, perhaps eight or even twenty-four. There are two reasons for this: backwards compatibility and efficiency.

    First, backwards compatibility. Remember the 68000 was a processor with two-byte memory access granularity, and would throw an exception upon encountering an odd address. If you were to read from or write to field b, you'd attempt to access an odd address. If a debugger weren't installed, the old Mac OS would throw up a System Error dialog box with one button: Restart. Yikes!

    So, instead of laying out your fields just the way you wrote them, the compiler padded the structure so that b and c would reside at even addresses:

    Field Type   Field Name   Field Offset   Field Size   Field End
    char         a            0              1            1
    (padding)    --           1              1            2
    long         b            2              4            6
    char         c            6              1            7
    (padding)    --           7              1            8
    Total Size in Bytes: 8

    Padding is the act of adding otherwise unused space to a structure to make fields line up in a desired way. Now, when the 68020 came out with built-in hardware support for unaligned memory access, this padding was unnecessary. However, it didn't hurt anything, and it even helped a little in performance.

    The second reason is efficiency. Nowadays, on PowerPC machines, two-byte alignment is nice, but four-byte or eight-byte is better. You probably don't care anymore that the original 68000 choked on unaligned structures, but you probably care about potential 4,610% performance penalties, which can happen if a double field doesn't sit aligned in a structure of your devising.
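    The padding story above can be checked with offsetof: b is pushed to the next alignof(long) boundary, and the struct is rounded up so that every element of an array of it stays aligned. On an LP64 system long is 8 bytes, so sizeof comes out as 24 rather than 8: exactly the "or even twenty-four" case mentioned above. The assertions are written against alignof(long) so they hold either way:

    ```c
    #include <stdalign.h>
    #include <stddef.h>

    /* Listing 5's struct, renamed to avoid clashing with the text. */
    typedef struct {
        char a;   /* offset 0, then padding up to alignof(long) */
        long b;   /* starts on an alignof(long) boundary        */
        char c;   /* followed by tail padding                   */
    } Padded;
    ```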

    Chinese code examples and their memory layout
    The key to understanding memory alignment is to draw the layout. The examples below work through it.


    First, a program to introduce the topic:

    // Environment: VC6 + Windows SP2
    // Program 1
    #include <iostream>

    using namespace std;

    struct st1
    {
        char  a ;
        int   b ;
        short c ;
    };

    struct st2
    {
        short c ;
        char  a ;
        int   b ;
    };

    int main()
    {
        cout<<"sizeof(st1) is "<<sizeof(st1)<<endl;
        cout<<"sizeof(st2) is "<<sizeof(st2)<<endl;
        return 0 ;
    }

    The program prints:

    sizeof(st1) is 12
    sizeof(st2) is 8

    Here is the question: these two structs hold the same members, so why does sizeof report different sizes?

    The main purpose of this article is to explain exactly that.

    Memory alignment is the cause; it is what makes the results differ.

    For most programmers, memory alignment is essentially transparent. It is the compiler's job: the compiler places each data member at a suitable position, which is why structs with the same members in different declaration orders end up with different sizes.

    So why does the compiler align memory at all? Naively, both sizeof(st1) and sizeof(st2) in Program 1 should be 7: 4 (int) + 2 (short) + 1 (char) = 7. After alignment, the structs actually got bigger.

    Before explaining what alignment buys us, look at the alignment rules:

    1. For the members of a struct, the first member is placed at offset 0; the offset of each subsequent member must be a multiple of min(the value set by #pragma pack(), the member's own size).

    2. After the members are aligned individually, the struct (or union) as a whole is aligned too, to the smaller of the #pragma pack value and the size of its largest member.


    #pragma pack(n) sets n-byte packing. VC6 defaults to 8-byte packing.

    Using Program 1 to illustrate the rules:

    St1: char takes one byte, starting at offset 0. int takes 4 bytes; min(#pragma pack() value, member size) = 4 (VC6 defaults to 8), so the int is aligned to 4 bytes and must start at a multiple of 4. It therefore starts at offset 4, and the compiler inserts 3 extra bytes after the char that hold no data. short takes 2 bytes and is aligned to 2; it starts at offset 8, already a multiple of 2, so no extra bytes are needed. Member alignment under rule 1 is now done, and memory looks like this:

    oxxx|oooo|oo


    0123 4567 89 (addresses)

    (x marks the extra padding bytes)

    That is 10 bytes so far. The struct as a whole must still be aligned, to the smaller of the #pragma pack value and the largest member size. The largest member of st1 is the int, 4 bytes, and the default #pragma pack value is 8, so the struct is aligned to 4 bytes: its total size must be a multiple of 4, and 2 more bytes are added to bring the total to 12. Memory now looks like this:

    oxxx|oooo|ooxx

    0123 4567 89ab  (addresses)

    Alignment is finished: St1 occupies 12 bytes, not 7.


    St2 is aligned the same way; readers can work it out themselves.
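    The st2 exercise left to the reader can be checked directly with offsetof. The sketch below assumes the same setup as the text (4-byte int, 2-byte short, default natural alignment):

    ```c
    #include <stddef.h>

    /* Working out st2: short c at offsets 0-1, char a at 2, one padding
     * byte at 3, int b at 4-7. The largest member is 4 bytes and 8 is
     * already a multiple of 4, so there is no tail padding: sizeof is 8. */
    struct st2 {
        short c;
        char  a;
        int   b;
    };
    ```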

    Here is another example:  http://www.cppblog.com/cc/archive/2006/08/01/10765.html

    Memory alignment

    In our programs, data structures and variables all occupy memory, and many systems require allocations to be aligned; the benefit is faster memory access.


    Again, start with a short program:


                                    Program 2
    #include <iostream>
    using namespace std;

    struct X1
    {
      int i;   // 4 bytes
      char c1; // 1 byte
      char c2; // 1 byte
    };

    struct X2
    {
      char c1; // 1 byte
      int i;   // 4 bytes
      char c2; // 1 byte
    };

    struct X3
    {
      char c1; // 1 byte
      char c2; // 1 byte
      int i;   // 4 bytes
    };
    int main()
    {
        cout<<"long "<<sizeof(long)<<"\n";
        cout<<"float "<<sizeof(float)<<"\n";
        cout<<"int "<<sizeof(int)<<"\n";
        cout<<"char "<<sizeof(char)<<"\n";

        X1 x1;
        X2 x2;
        X3 x3;
        cout<<"size of x1 "<<sizeof(x1)<<"\n";
        cout<<"size of x2 "<<sizeof(x2)<<"\n";
        cout<<"size of x3 "<<sizeof(x3)<<"\n";
        return 0;
    }

          
    The program is simple: it defines three structs, X1, X2, and X3, whose only difference is the order in which their members are laid out. It also prints the sizes of several basic types and of the three structs.

    The output is:
    long 4
    float 4
    int 4
    char 1
    size of x1 8
    size of x2 12
    size of x3 8


    The first four lines are unsurprising, but the last three show that the three structs occupy different amounts of space. The cause is the order of the members inside each struct. How can that be?

    This is what memory alignment is about.

    Memory is a contiguous block, which we can picture as below, organized in alignment units of four bytes:

                                                    Figure 1

    mem1.jpg

    Now look at how the three structs are laid out in memory.

    First X1, shown below:

    mem2.jpg


    The first member of X1 is an int, which takes 4 bytes and fills the first 4-byte unit. The second member is a char, which takes one byte and occupies the first slot of the second unit; the third member is also a char and takes the second slot of the same unit. Together they fit in two units, so with alignment the result is 8 rather than 6: the last two slots of the second unit count as used.

    Next X2, shown below:

    mem4.jpg

    The first member of X2 is a char, which takes the first slot of the first unit. The second is an int, which needs 4 bytes; the first unit has only 3 slots left, not enough, so for alignment the int has to move to the second unit. The third member is a char, like the first. With alignment, the struct takes 12 slots rather than 8.


    Finally X3, shown below:


    mem3.jpg

    X3 works like X1, except that the two one-byte members come first; after the two cases above it should be easy to follow.






    Many people know that memory alignment is the cause, but few explain its underlying principle. The author quoted above does.


    3. Possible consequences of ignoring memory alignment


    • Your software may hit performance-killing unaligned memory access exceptions, which invoke very expensive alignment exception handlers.
    • Your application may attempt to atomically store to an unaligned address, causing your application to lock up.
    • Your application may attempt to pass an unaligned address to Altivec, resulting in Altivec reading from and/or writing to the wrong part of memory, silently corrupting data or yielding incorrect results.


    4. Memory alignment rules

    • I. Why memory is aligned
    • Most reference materials put it this way:

      1. Platform (portability) reasons: not every hardware platform can access arbitrary data at arbitrary addresses; some platforms can only fetch certain types of data at certain addresses, and otherwise raise a hardware exception.

      2. Performance reasons: data structures (especially stacks) should be aligned on natural boundaries wherever possible. To access unaligned memory the processor must make two memory accesses, whereas aligned access needs only one.

       

      II. Alignment rules

      The compiler on each platform has its own default "alignment factor" (also called the alignment modulus). Programmers can change it with the preprocessor directive #pragma pack(n), n = 1, 2, 4, 8, 16, where n is the alignment factor you want.

      Rules:

      1. Member alignment: the first data member of a struct (or union) is placed at offset 0; each subsequent member is aligned to the smaller of the #pragma pack value and the member's own size.

      2. Overall alignment: after the members are aligned individually, the struct (or union) itself is aligned to the smaller of the #pragma pack value and the size of its largest member.

      3. From 1 and 2 it follows that when the #pragma pack value n is equal to or larger than every member's size, n has no effect at all.
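      Rule 3 can be checked directly. A sketch using the push/pop form of #pragma pack (supported by GCC and MSVC) and assuming a 4-byte int and 2-byte short:

      ```c
      /* Once pack(n) is at least as large as the biggest member (4 here),
       * raising n further changes nothing. */
      #pragma pack(push, 4)
      struct packed4 { int a; char b; short c; char d; };
      #pragma pack(pop)

      #pragma pack(push, 16)
      struct packed16 { int a; char b; short c; char d; };  /* same layout as packed4 */
      #pragma pack(pop)

      /* An n below the member sizes, by contrast, does change the layout. */
      #pragma pack(push, 1)
      struct packed1 { int a; char b; short c; char d; };   /* no padding at all */
      #pragma pack(pop)
      ```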

       

      III. Experiments

      A series of examples will verify these rules in detail.

      Compilers: GCC 3.4.2, VC6.0

      Platform: Windows XP


      A typical struct alignment case

      The struct:

      #pragma pack(n) /* n = 1, 2, 4, 8, 16 */

      struct test_t {

       int a;

       char b;

       short c;

       char d;

      };

      #pragma pack()

      First confirm the sizes of the types on the test platform; both compilers print:

      sizeof(char) = 1

      sizeof(short) = 2

      sizeof(int) = 4


      Procedure: change the alignment factor with #pragma pack(n), then check sizeof(struct test_t).


      1. 1-byte packing (#pragma pack(1))

      Output: sizeof(struct test_t) = 8 [both compilers agree]

      Analysis:

      1) Member alignment

      #pragma pack(1)

      struct test_t {

       int a;  /* size 4 > 1, align to 1; start offset=0, 0%1=0; occupies [0,3] */

       char b;  /* size 1 = 1, align to 1; start offset=4, 4%1=0; occupies [4] */

       short c; /* size 2 > 1, align to 1; start offset=5, 5%1=0; occupies [5,6] */

       char d;  /* size 1 = 1, align to 1; start offset=7, 7%1=0; occupies [7] */

      };

      #pragma pack()

      Total member size = 8


      2) Overall alignment

      Overall alignment factor = min(max(int, short, char), 1) = 1

      Overall size = total member size rounded up to the overall alignment factor = 8 /* 8%1=0 */ [Note 1]

       

      2. 2-byte packing (#pragma pack(2))

      Output: sizeof(struct test_t) = 10 [both compilers agree]

      Analysis:

      1) Member alignment

      #pragma pack(2)

      struct test_t {

       int a;  /* size 4 > 2, align to 2; start offset=0, 0%2=0; occupies [0,3] */

       char b;  /* size 1 < 2, align to 1; start offset=4, 4%1=0; occupies [4] */

       short c; /* size 2 = 2, align to 2; start offset=6, 6%2=0; occupies [6,7] */

       char d;  /* size 1 < 2, align to 1; start offset=8, 8%1=0; occupies [8] */

      };

      #pragma pack()

      Total member size = 9

      2) Overall alignment

      Overall alignment factor = min(max(int, short, char), 2) = 2

      Overall size = total member size rounded up to the overall alignment factor = 10 /* 10%2=0 */

       

3. 4-byte alignment (#pragma pack(4))

Result: sizeof(struct test_t) = 12 [both compilers agree]

Analysis:

1) Member alignment

#pragma pack(4)

struct test_t {

 int a;   /* size 4 = 4, align to 4; offset = 0, 0 % 4 == 0; occupies [0,3] */

 char b;  /* size 1 < 4, align to 1; offset = 4, 4 % 1 == 0; occupies [4] */

 short c; /* size 2 < 4, align to 2; offset = 6, 6 % 2 == 0; occupies [6,7] */

 char d;  /* size 1 < 4, align to 1; offset = 8, 8 % 1 == 0; occupies [8] */

};

#pragma pack()

Total member size = 9


2) Overall alignment

Overall alignment value = min(max(sizeof(int), sizeof(short), sizeof(char)), 4) = 4

Overall size = total member size rounded up to the overall alignment value = 12 /* 12 % 4 == 0 */

       

4. 8-byte alignment (#pragma pack(8))

Result: sizeof(struct test_t) = 12 [both compilers agree]

Analysis:

1) Member alignment

#pragma pack(8)

struct test_t {

 int a;   /* size 4 < 8, align to 4; offset = 0, 0 % 4 == 0; occupies [0,3] */

 char b;  /* size 1 < 8, align to 1; offset = 4, 4 % 1 == 0; occupies [4] */

 short c; /* size 2 < 8, align to 2; offset = 6, 6 % 2 == 0; occupies [6,7] */

 char d;  /* size 1 < 8, align to 1; offset = 8, 8 % 1 == 0; occupies [8] */

};

#pragma pack()

Total member size = 9

2) Overall alignment

Overall alignment value = min(max(sizeof(int), sizeof(short), sizeof(char)), 8) = 4

Overall size = total member size rounded up to the overall alignment value = 12 /* 12 % 4 == 0 */

       

5. 16-byte alignment (#pragma pack(16))

Result: sizeof(struct test_t) = 12 [both compilers agree]

Analysis:

1) Member alignment

#pragma pack(16)

struct test_t {

 int a;   /* size 4 < 16, align to 4; offset = 0, 0 % 4 == 0; occupies [0,3] */

 char b;  /* size 1 < 16, align to 1; offset = 4, 4 % 1 == 0; occupies [4] */

 short c; /* size 2 < 16, align to 2; offset = 6, 6 % 2 == 0; occupies [6,7] */

 char d;  /* size 1 < 16, align to 1; offset = 8, 8 % 1 == 0; occupies [8] */

};

#pragma pack()

Total member size = 9


2) Overall alignment

Overall alignment value = min(max(sizeof(int), sizeof(short), sizeof(char)), 16) = 4

Overall size = total member size rounded up to the overall alignment value = 12 /* 12 % 4 == 0 */

The 8-byte and 16-byte runs confirm rule 3: once the n given to #pragma pack equals or exceeds the size of every member, its value no longer has any effect.

       


Memory allocation and memory alignment are complex topics. They depend heavily on the specific implementation, and the rules differ across operating systems, compilers, and hardware platforms. Most modern systems and languages manage and allocate memory automatically and hide the low-level details, so application programmers rarely need to think about allocation. But in system- and driver-level development, and in highly real-time or security-sensitive programs, memory layout remains fundamental to keeping the whole program stable, safe, and efficient.
[1]

What is "rounding up"?

An example: in the 8-byte alignment case above, overall size = 9 rounded up to 4 = 12.

The process: starting from 9, add one at a time and test divisibility by 4. Since 9, 10, and 11 are not divisible by 4 but 12 is, rounding stops at 12.




Author

Jonathan Rentzsch, http://www.ibm.com/developerworks/library/pa-dalign/
http://www.cppblog.com/cc/archive/2006/08/01/10765.html (a good Chinese explanation)
http://www.cppblog.com/snailcong/archive/2009/03/16/76705.html (a Chinese digest of the English article)
http://blogold.chinaunix.net/u3/118340/showart_2615855.html