Mutation

Pandas

There are many ways to do mutation in Pandas. I usually do the following, for performance and functional style:

    # Apply the formula element-wise to the "nkill" column.
    df["computed"] = df["nkill"].map(lambda x: (x - 10) / 2 + x ** 2 / 3)
    df.to_csv("python_output_map.csv")

Rust

For mutation, Rust's functional iterators really make this part a walk in the park:

    records.iter_mut().for_each(|x: &mut DataFrame| {
        // A missing `nkill` is treated as 0.
        let nkill = x.nkill.unwrap_or(0.);

        x.computed = Some((nkill - 10.) / 2. + nkill * nkill / 3.);
    });

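    // Serialize every record, including the new `computed` field, to CSV.
    let mut wtr = csv::Writer::from_path("output_rust_map.csv")?;
    for record in &records {
        wtr.serialize(record)?;
    }
    // Flush explicitly so write errors are reported rather than ignored on drop.
    wtr.flush()?;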
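
To try the mutation step in isolation, here is a minimal sketch that needs only the standard library. The `Record` struct and the `add_computed` function are hypothetical stand-ins: the real `DataFrame` struct has more fields and is deserialized from CSV, and I am assuming that `nkill` and `computed` are both `Option<f64>`.

```rust
/// Hypothetical, trimmed-down stand-in for the `DataFrame` record type;
/// only the two fields touched by the mutation are kept.
#[derive(Debug, Clone, PartialEq)]
struct Record {
    nkill: Option<f64>,
    computed: Option<f64>,
}

/// Same formula as above: (nkill - 10) / 2 + nkill^2 / 3,
/// with a missing `nkill` falling back to 0.
fn add_computed(records: &mut [Record]) {
    records.iter_mut().for_each(|x| {
        let nkill = x.nkill.unwrap_or(0.);
        x.computed = Some((nkill - 10.) / 2. + nkill * nkill / 3.);
    });
}
```

Calling `add_computed(&mut records)` on a `Vec<Record>` fills `computed` in place, without allocating a second collection.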

Performance

| | Time (s) | Mem (GB) |
|---|---|---|
| Pandas | 12.82 | 4.7 |
| Rust | 1.58 (🔥 -87%) | 1.7 (🔥 -64%) |

This is where the difference really became apparent to me. Pandas does not scale for line-by-line lambda functions, because `map` calls the Python lambda once per row. Pandas would have been even worse if I had done an operation involving several columns.

Rust is way better for line-by-line mutation natively.
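
As a sketch of the several-columns case on the Rust side: reading a second column is just another field access inside the same pass. The `Row` struct, its `nwound` and `casualties` fields, and the sum formula are all hypothetical, chosen only for illustration, and no timing is claimed for this variant.

```rust
/// Hypothetical record with two input columns and one derived column.
#[derive(Debug, Clone, PartialEq)]
struct Row {
    nkill: Option<f64>,
    nwound: Option<f64>,
    casualties: Option<f64>,
}

/// Derive `casualties` from two columns in a single pass over the rows;
/// missing values are treated as 0.
fn add_casualties(rows: &mut [Row]) {
    for row in rows.iter_mut() {
        let nkill = row.nkill.unwrap_or(0.);
        let nwound = row.nwound.unwrap_or(0.);
        row.casualties = Some(nkill + nwound);
    }
}
```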