[Thiago Cafe] Programming is fun!

Materialised data and data view

Created by Thiago Guedes on 2025-08-12 06:23:41

Tags: #rust   #c++  

Something that some newer programming languages do well is having materialised data and data views from day 1.

First, what is materialised data and data view?

Before anything else: I made those names up, but the idea is not new.

Materialised data

This is a struct/class that owns some allocated data. For instance, C++ std::string / std::vector and Rust String / Vec own the data they provide.

Data view

On the other hand, this struct/class points into materialised data, but it does not own that data. It's trivial to convert materialised data to a data view, but not the other way around: going from a view back to materialised data means allocating a new buffer and copying. Examples of data views are std::string_view and std::span in C++ and &str and &[T] in Rust.
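To make the two directions concrete, here is a minimal Rust sketch. The function names first_word and ownership_demo are made up for this illustration; the pointer comparisons are only there to show which values share a buffer.

```rust
/// Returns a view of the first word; it points into the caller's buffer.
fn first_word(text: &str) -> &str {
    text.split_whitespace().next().unwrap_or("")
}

/// Hypothetical demo: returns (view shares the owner's buffer, copy has its own buffer).
fn ownership_demo() -> (bool, bool) {
    let owned: String = String::from("materialised data");

    // Materialised data -> view: trivial, no allocation.
    let view: &str = &owned;
    let view_shares_buffer = view.as_ptr() == owned.as_ptr();

    // View -> materialised data: explicit, allocates a new buffer and copies the bytes.
    let copy: String = view.to_owned();
    let copy_is_separate = copy.as_ptr() != owned.as_ptr();

    (view_shares_buffer, copy_is_separate)
}
```

Calling first_word on a String (via &my_string) or on a literal both work, and in neither case is the text copied.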

Why does it matter?

This is not in order of importance!

First: Performance

size_t length(const std::string &buffer); // C++
fn length(buffer: String) -> usize; // Rust

Every time I invoke those functions with something that is not already a std::string / String, I pay for an allocation. This is easier to see in Rust, where the conversion is explicit, than in C++, where it happens implicitly.

// No extra allocation
std::string my_val = "Some data to be used";
size_t count = length(my_val);

// Extra allocation to count something that is static
size_t count2 = length("More data to be counted");
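For comparison, this is roughly how the same situation looks in Rust. It is a sketch only: length_owned, length_view and count_both are names I picked for the example, not anything from a library.

```rust
/// Takes ownership of a String: a caller holding only a literal must build one first.
fn length_owned(buffer: String) -> usize {
    buffer.len()
}

/// Takes a view: works with literals and with borrowed Strings alike.
fn length_view(buffer: &str) -> usize {
    buffer.len()
}

/// Hypothetical helper that counts the same literal both ways.
fn count_both() -> (usize, usize) {
    // The allocation is visible at the call site: to_string() builds a new String.
    let with_allocation = length_owned("More data to be counted".to_string());

    // The literal is already a &'static str, so nothing is allocated here.
    let without_allocation = length_view("More data to be counted");

    (with_allocation, without_allocation)
}
```

Both calls return the same number; the difference is that the first one cannot be written without spelling out the conversion.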

It might look silly, but let's look at a more realistic example where it matters more

// Declared somewhere
std::vector<uint8_t> compress(const std::vector<uint8_t> &chunk);

std::vector<uint8_t> my_big_data;
// I need this extra allocation if I am not using std::span, a view.
std::vector<uint8_t> slice_of_data = {my_big_data.begin(), my_big_data.end() - my_big_data.size() / 2};
std::vector<uint8_t> res = compress(slice_of_data);
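With a view as the parameter, the intermediate copy goes away. Below is a small Rust sketch of that idea. The body of compress is a toy run-length encoder that I am using as a stand-in, since the real compressor is not shown in this post, and compress_first_half is a hypothetical helper.

```rust
/// Stand-in for a real compressor (assumption): run-length encodes the chunk
/// as (count, byte) pairs. The important part is the &[u8] parameter.
fn compress(chunk: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut iter = chunk.iter().copied().peekable();
    while let Some(byte) = iter.next() {
        let mut count: u8 = 1;
        // Extend the run while the next byte matches and the counter still fits.
        while count < u8::MAX && iter.peek() == Some(&byte) {
            iter.next();
            count += 1;
        }
        out.push(count);
        out.push(byte);
    }
    out
}

/// Compresses the first half of the data without copying the input.
fn compress_first_half(my_big_data: &[u8]) -> Vec<u8> {
    // Same range as the C++ snippet: begin() up to end() - size() / 2.
    let end = my_big_data.len() - my_big_data.len() / 2;
    compress(&my_big_data[..end])
}
```

The only allocation left is the output buffer; the slice is just a pointer and a length into my_big_data.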

And another interesting example

std::unordered_map<std::string, my_class> my_hashmap;
// Oops! A temporary std::string is built just for this lookup (and heap-allocated if the key is long enough)
my_hashmap.find("something");
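Rust's HashMap covers this case out of the box: get accepts any borrowed form of the key, so a map keyed by String can be queried with a &str. A minimal sketch, where build_map and lookup are hypothetical helpers and u32 stands in for my_class:

```rust
use std::collections::HashMap;

/// Builds a small map; inserting needs an owned key because the map stores it.
fn build_map() -> HashMap<String, u32> {
    let mut my_hashmap = HashMap::new();
    my_hashmap.insert(String::from("something"), 42);
    my_hashmap
}

/// Looks up a key given only a view; no temporary String is built.
fn lookup<'a>(map: &'a HashMap<String, u32>, key: &str) -> Option<&'a u32> {
    // HashMap::get works here because String implements Borrow<str>.
    map.get(key)
}
```

So storing materialised keys and querying with views can live together in the same container.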

Second: Memory ownership

When you receive a view, you do not own the data, so you should not move it

void do_something(std::span<const uint8_t> subset_of_data);
// It's clear the memory cannot be moved out of subset_of_data
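In Rust the same contract is checked by the compiler. Here is a rough sketch with two made-up functions, longest and take_last: the first receives a view, the second receives the materialised data.

```rust
/// Receives a view: it can read the items, but it cannot take them away from the owner.
fn longest(subset_of_data: &[String]) -> Option<&String> {
    subset_of_data.iter().max_by_key(|s| s.len())
}

/// Receives materialised data: the function now owns the buffer and may move items out.
fn take_last(mut data: Vec<String>) -> Option<String> {
    data.pop()
}
```

Inside longest, trying to move a String out of the slice is a compile error; the function would have to clone it. After calling take_last, the caller can no longer use the Vec it passed in.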

OK, I get it, but why are you saying new programming languages are doing well by having it from day 1?

C++ has std::string_view and std::span!

That is true, and it's great that it does. However, tons and tons of code prefers to pay the price of using const std::string &val rather than using const char *, which makes sense. Also, tons of libraries will have a hard time storing std::string while accepting std::string_view, as there is a lot of metaprogramming relying on the stored type and no clear way to connect the two.

I understand, so you are just ranting about that?

On the contrary. I am grateful that new languages are getting it right and old languages are catching up.

Rust is a very well designed language, and there are 2 interesting ways it does that

fn length(buffer: &str) -> usize;

let chars1 = length("My data"); // No extra allocation
let val: String = socket.recv();
let chars2 = length(val.as_str());
// or
let chars2 = length(&val);
// In Rust, this conversion comes from the Deref trait (String derefs to str). Again, no extra allocation

Conclusion

Many other posts of mine have no conclusion, especially when having a conclusion is silly, but this one almost deserves one. That said, let's pay attention to the ownership of data!

Tell me your opinion!

Reach me on Twitter - @thiedri