Use size_t

When things can go wrong

As always, let us begin with a simple example:

#include <stdio.h>
#include <stdlib.h>

#define SIZE 10

int main(){
    char* arr = malloc(SIZE * sizeof(char));
    if(arr == NULL){
        fprintf(stderr, "No memory\n");
        return 1;
    }

    //just for fun, let's fill the vector backwards
    for(int i=SIZE-1; i >= 0; --i)
        arr[i] = 'a';
    
    printf("Last: %c\n", arr[SIZE-1]);

    return 0;
}

The program outputs "Last: a", and everything seems to work. But what happens when we try to allocate a large array? Let us see an example, and put a printf in the for loop.

#include <stdio.h>
#include <stdlib.h>

#define SIZE 2147483649 

int main(){
    char* arr = malloc(SIZE * sizeof(char));
    if(arr == NULL){
        fprintf(stderr, "No memory\n");
        return 1;
    }

    //just for fun, let's fill the vector backwards
    for(int i=SIZE-1; i >= 0; --i){
        printf("Position %d\n", i);
        arr[i] = 'a';
    }
    
    printf("Last: %c\n", arr[SIZE-1]);

    return 0;
}

From now on, I will assume that you are using an x86-64 architecture and a 64-bit operating system. If you have enough memory, the program outputs "Last: [SOME GARBAGE]". The array was correctly allocated (we checked with if(arr == NULL) …), but nothing was printed in the loop, and the last position was never written, so reading it yields an indeterminate value. Something must have gone terribly wrong. And it did!

The operating system was happy to give us the 2 GiB of memory requested in the malloc, but we could not index the memory using the int i. On this platform an int is 32 bits wide and can hold at most 2147483647, so the initial value SIZE-1 (2147483648) does not fit. Converting an out-of-range value to int is implementation-defined; on common x86-64 compilers it wraps around to a negative number, and thus the loop is never executed. To check this, print the first value of i using printf("%d\n", (int)(SIZE-1)) before the loop. It will print -2147483648. To understand why the program reached such a negative number, read about the two's complement representation (https://prlalmeida.com.br/ci1068-2021-01/Aula5.pdf).

What is the biggest size an array can reach (so we can use a variable with the correct size)? The answer is, as always, it depends (arrgh)! The ISO C and C++ standards define size_t as the unsigned integer type of the result of sizeof, so it can represent the size of any object, including arrays. The specification requires that size_t be able to represent at least 65535 (SIZE_MAX >= 65535), which means it is at least 16 bits wide. Some systems, such as small microcontrollers where resources are limited, use exactly 16 bits. But on most x86-64 systems, size_t is much wider (on mine, it is 64 bits). This does not violate the definition, since 16 bits is only the minimum.

Using size_t to solve the problem

To write portable code that does not overflow and can be safely compiled for any ISO C-compliant system without wasting memory (you do not want to use a 64-bit variable to index an array on a microcontroller whose arrays can hold at most 65535 items), use a size_t variable.

size_t is defined in stddef.h for C and in cstddef for C++ (and in many other headers, such as stdio.h and stdlib.h). In practice, it is an alias for an unsigned integer type (you will likely find it as typedef something size_t). On x86-64 Linux, it is often defined as unsigned long.

The example using the size_t can be written as:

#include <stdio.h>
#include <stdlib.h>

#define SIZE 2147483649 //2^31+1

int main(){
    char* arr = malloc(SIZE * sizeof(char));
    if(arr == NULL){
        fprintf(stderr, "No memory\n");
        return 1;
    }

    //just for fun, let's fill the vector backwards
    for(size_t i=SIZE; i > 0; ){
        i--;
        arr[i] = 'a';
    }
    
    printf("Last: %c\n", arr[SIZE-1]);

    return 0;
}

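Note that the for loop is kind of strange. This was done so that i never wraps around (remember that size_t is unsigned). With the condition i >= 0, the loop would never end: an unsigned value is always greater than or equal to zero, and decrementing it at 0 wraps to the largest value size_t can hold.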

Similar reasoning applies to ssize_t, with the difference that ssize_t is a signed integer type. Note that ssize_t is defined by POSIX (in sys/types.h), not by ISO C. It is often used when a function must return either a size or a negative number in case of error (see the getline function manual for an example).

One last hint. If you want to printf a size_t value, use the %zu specifier.

References

Seacord, R. C. Effective C: An Introduction to Professional C Programming. No Starch Press. 2020.

Seacord, R. C. Secure Coding in C and C++. United Kingdom: Pearson Education. 2013.

C ISO Standard. ISO/IEC 9899:2018, 2018.

https://prlalmeida.com.br/ci1068-2021-01/Aula5.pdf
