Definition and declaration of variables

In Rust, valid identifiers (including variable names, function names, trait names, etc.) consist of letters, digits, and underscores, and cannot start with a digit, as in many other languages. Since Rust 1.53, non-ASCII Unicode letters are also accepted in identifiers. Raw identifiers, available since Rust 1.30, let most keywords be used as identifiers by prefixing them with r#, such as r#match or r#type. A few keywords, including self, Self, super, and crate, cannot be used this way. Raw identifiers are most useful in FFI and when calling code written for an older edition in which the word was not yet a keyword.
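As a minimal sketch of raw identifiers, the snippet below uses two keywords as names. The function r#match and its parameters are hypothetical and exist only for illustration.

```rust
// `match` is a keyword, so a function with this name needs the r# prefix.
// Hypothetical helper: reports whether `needle` occurs in `haystack`.
fn r#match(needle: &str, haystack: &str) -> bool {
    haystack.contains(needle)
}

fn main() {
    // `type` is also a keyword; r#type makes it usable as a variable name.
    let r#type = "keyword used as a variable name";
    println!("{}", r#type);
    println!("{}", r#match("foo", "foobar"));
}
```

The r# prefix has to be repeated at every use site, which is one reason raw identifiers are usually reserved for names that cannot be changed, such as items exported by foreign code.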

Variable declaration: let variable: i32 = 100;. Rust's declaration syntax differs from that of C-family languages: the variable name comes first, followed by a colon and the type. A declaration without an initial value looks like let variable: i32;.

This declaration style has two advantages. It is easier to parse, and it puts the most important part of the declaration, the variable name, first. Because the type comes last, it can simply be omitted when the compiler can infer it from context. Rust's type inference has limits, however: when a type cannot be inferred, a type annotation must be added manually.
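The following sketch shows both sides of type inference: cases where the annotation can be omitted, and one where it cannot. The function parse_number is a hypothetical helper, not part of any library.

```rust
// Hypothetical helper: `parse` can produce many different types,
// so the target type has to be named somewhere.
fn parse_number(text: &str) -> i32 {
    let n: i32 = text.parse().expect("not a valid i32");
    n
}

fn main() {
    let a = 100; // inferred as i32, the default integer type
    let b = 2.5; // inferred as f64, the default float type

    let mut v = Vec::new(); // element type not known yet
    v.push(1u8); // the push fixes the type: v is a Vec<u8>

    println!("{} {} {:?} {}", a, b, v, parse_number("42"));
}
```

Removing the i32 annotation inside parse_number would still compile here, because the return type also constrains n; without either hint the compiler would ask for an annotation.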

The let keyword is borrowed from functional programming languages. It expresses binding: the variable name is bound to a value stored in memory. In Rust, a statement that declares a local variable and initializes it is generally called a "variable binding". The emphasis on "binding" distinguishes it from the "assignment initialization statement" in C/C++.

Some issues with variable declaration:

In Rust, every variable must be initialized before it can be used. Reading an uninitialized variable, which compiles in some other languages and is a common source of bugs there, is rejected at compile time.

Checking for uninitialized variables:

The example let variable: i32; above declares a variable without giving it a value. Other languages may accept this, but the Rust compiler reports an error if the variable is used before a value is assigned. The compiler performs static control flow analysis to ensure that every variable is initialized before use. If reading an unbound variable were allowed, it could cause memory safety problems such as unexpected calculation results or program crashes, so the compiler must reject the code.

let variable: i32;
println!("variable = {}", variable); // error[E0381]: use of possibly uninitialized 'variable'

Checking for uninitialized variables in branch flow:

The Rust compiler's static control flow analysis is conservative: a variable must be initialized on every path that reaches a use.

fn main() {
    let x: i32;
    if true {
        x = 1;
    } else {
        x = 2;
    }
    println!("x = {}", x);
}

In this case, every branch of the if expression binds a value to x, so the code compiles. But if the else branch is removed, the compiler reports an error:

error: use of possibly uninitialized variable: 'x'
println!("x = {}", x);

The compiler has detected that x may not be initialized. With the else branch removed, the static control flow analysis sees that the println! after the if expression uses x, while one path to it (the condition being false) never binds a value. The analysis does not evaluate the condition, so it cannot tell that it is always true; it simply checks every branch. (Static analysis of software is an active research area in programming languages. One reference: the Software Analysis course at Nanjing University.)

If the println! statement is also removed, the code compiles and runs normally: x is no longer used anywhere outside the if expression, and the only branch that mentions x binds a value to it.

// An example
fn test(condition: bool) {
    let x: i32; // declare x
    if condition {
        x = 1; // initialize x (the first assignment to a declared variable)
        println!("{}", x);
    }
    // If the condition is not satisfied, x is not initialized

    // But it doesn't matter as long as x is not used here
}
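As a sketch of the rule that every path reaching a use must initialize the variable, the two hypothetical functions below compute the same result. The first uses deferred initialization as in the examples above; the second binds the value of an if expression directly, which is the form more commonly seen in Rust code.

```rust
// Deferred initialization: declared first, assigned once in every branch.
fn sign_label(n: i32) -> &'static str {
    let label: &str; // declared, not yet initialized
    if n < 0 {
        label = "negative";
    } else if n == 0 {
        label = "zero";
    } else {
        label = "positive";
    }
    label // allowed: every path above assigned a value
}

// Same logic, binding the value of the if expression directly.
fn sign_label_expr(n: i32) -> &'static str {
    let label = if n < 0 {
        "negative"
    } else if n == 0 {
        "zero"
    } else {
        "positive"
    };
    label
}

fn main() {
    println!("{} {}", sign_label(-3), sign_label_expr(7));
}
```

In the second form there is no point at which label exists without a value, so the initialization check is satisfied trivially.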

Checking for uninitialized variables in loops:

Inside a loop, a variable can be initialized just before a break. Because break is the only way out of a loop block, the compiler knows the variable is initialized by the time the loop ends.

fn main() {
    let x: i32;
    loop {
        if true {
            x = 2;
            break;
        }
    }
    println!("{}", x); // 2
}

The Rust compiler's static control flow analysis knows that the loop can only exit through the break, which comes after x = 2, so x is definitely initialized when println! runs. Note that break here does not return the value of x; it only leaves the loop.
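Rust also lets a loop produce a value directly through break with an operand, which avoids the separate declaration. The sketch below uses a hypothetical function name and is only one way to write this.

```rust
// Hypothetical helper: returns the smallest power of two greater than `limit`.
fn first_power_of_two_above(limit: u32) -> u32 {
    let mut candidate = 1;
    // The operand of `break` becomes the value of the whole loop expression.
    let result = loop {
        if candidate > limit {
            break candidate;
        }
        candidate *= 2;
    };
    result
}

fn main() {
    println!("{}", first_power_of_two_above(10));
}
```

Here result is bound exactly once, at the point where the loop finishes, so it never exists in an uninitialized state.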

Empty arrays or vectors can be used to initialize variables:

When binding an empty array or vector to a variable, the type must be specified explicitly unless later code determines it; otherwise the compiler cannot infer the element type.

fn main() {
    let a: Vec<i32> = vec![];
    let b: [i32; 0] = [];
}

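Without the explicit type annotation (and with no later use that fixes the element type), the compiler reports error[E0282]: type annotations needed. An empty array can also initialize a constant or static variable. For vectors this depends on the compiler version: Vec::new() has been a const fn since Rust 1.39, so an empty vector can initialize a constant or static on current compilers, while older compilers rejected it.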
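The sketch below, which assumes a recent compiler (Rust 1.39 or later for the const Vec::new call), shows an empty vector whose type is inferred from later use, and empty collections used as constants. All item names are hypothetical.

```rust
// An empty array as a constant.
const EMPTY_ARRAY: [i32; 0] = [];
// `Vec::new` is a const fn on recent compilers, so this is accepted there.
const EMPTY_VEC: Vec<i32> = Vec::new();

// Hypothetical helper: collects the even numbers below `limit`.
fn collect_evens(limit: u32) -> Vec<u32> {
    let mut evens = vec![]; // no annotation: the push below fixes the element type
    for n in 0..limit {
        if n % 2 == 0 {
            evens.push(n);
        }
    }
    evens
}

fn main() {
    println!("{:?}", collect_evens(7));
    println!("{} {}", EMPTY_ARRAY.len(), EMPTY_VEC.len());
}
```

The annotation is only needed when nothing else in the function tells the compiler what the elements are, as in the two-line example above.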

Uninitialized variables caused by ownership transfer:

When an already initialized variable y whose type has move semantics is bound to another variable y2, ownership of the value is transferred to y2, and Rust treats y as logically uninitialized. Box<i32> is such a type. Types that implement Copy, such as i32, have value semantics instead: binding copies the value, much like pass by value in C/C++, and the original variable stays usable.

fn main() {
    let x = 42; // i32 implements Copy, so it has value semantics; the value lives on the stack
    let y = Box::new(4); // Box::new allocates on the heap and returns a pointer, which is bound to y; the pointer itself lives on the stack
    println!("{}", y);
    let x2 = x;
    let y2 = y;
    // println!("{}", y); // error: ownership has moved to y2, so y is treated as uninitialized
    // If y is declared with let mut, assigning a new value makes it usable again; this is called reinitialization
}
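Reinitialization can be sketched as follows. The function names are hypothetical; note that the variable must be declared with let mut, because reinitializing it is a second assignment.

```rust
// Move semantics: after the move, y must be reinitialized before it is used again.
fn reinitialize() -> (i32, i32) {
    let mut y = Box::new(4); // `mut` is required to assign to y again later
    let y2 = y; // ownership moves to y2; y is now logically uninitialized
    // println!("{}", y); // would fail: error[E0382]: borrow of moved value: `y`
    y = Box::new(5); // reinitialization: y owns a new heap allocation
    (*y, *y2)
}

// Copy semantics: the original stays usable after being bound to another name.
fn copy_stays_usable() -> (i32, i32) {
    let x = 42;
    let x2 = x; // i32 is Copy, so the value is copied and x is still initialized
    (x, x2)
}

fn main() {
    println!("{:?} {:?}", reinitialize(), copy_stays_usable());
}
```

The same initialization analysis described in the earlier sections handles both cases: a move marks the source as uninitialized, and a later assignment marks it as initialized again.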