Rust Roadmap 2021: Allowing for arbitrary size integer primitives

In this blog entry, following the Call for Rust 2021 Roadmap items, I shall lay out my vision for giving Rust the ability to use arbitrarily sized integer primitives.

Motivation

This request has multiple motivations, but following the rules of the Call for Rust 2021 Roadmap items, I'm writing them down in the agile user story format:

So let me give you a bit more background about the status quo and what I mean by those two stories before giving you my ideas on how to address them.

Range limited types

In a lot of areas it is quite common to want range-limited types whose allowed ranges do not coincide with a natural primitive but are still a power of 2. E.g. say you want to provide an imaging library which is supposed to work with 3 components (R, G and B). The fact that we have three components might already provide a hint at what I'm trying to get at...

So let's continue our exploration... A typical representation is RGB888, which means that each component consists of 8 bits (which is great, since we can simply use 3 u8s for that; by keeping the components separate we don't have to worry about the combined representation of 24 bits).

However, a lot of embedded displays only have a color depth of 16 bits, in which case the representation is usually something like RGB565. That leaves us with a number of options for handling those values, none of which is ideal.
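For reference, here is roughly what RGB565 means at the bit level. This is only a minimal sketch in today's Rust, assuming the common layout with red in the top 5 bits, green in the middle 6 and blue in the bottom 5; `pack_rgb565` and `unpack_rgb565` are hypothetical helper names, not taken from any particular crate.

```rust
/// Packs three colour components into one 16-bit RGB565 value.
/// Assumed layout: red in bits 15..=11, green in bits 10..=5, blue in bits 4..=0.
/// (Hypothetical helper, for illustration only.)
pub fn pack_rgb565(r: u8, g: u8, b: u8) -> u16 {
    // Mask every component to its width so stray high bits cannot
    // leak into a neighbouring field.
    let r = u16::from(r & 0x1F); // 5 bits
    let g = u16::from(g & 0x3F); // 6 bits
    let b = u16::from(b & 0x1F); // 5 bits
    (r << 11) | (g << 5) | b
}

/// Splits an RGB565 value back into its (r, g, b) components.
pub fn unpack_rgb565(px: u16) -> (u8, u8, u8) {
    (
        ((px >> 11) & 0x1F) as u8,
        ((px >> 5) & 0x3F) as u8,
        (px & 0x1F) as u8,
    )
}
```

Note that the masking silently throws away out-of-range input, which is exactly the kind of decision the function signature cannot express with plain u8 parameters.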

Today we could do:

/// Values outside of the range will be clamped to the highest allowed
/// component value
pub fn set_pixel_rgb565(&mut self, r: u8, g: u8, b: u8) {
    let (r, g, b) = (r.min(31), g.min(63), b.min(31));
    ...
}

or:

/// Supplying component values outside the designated range is a
/// definite grave user error
pub fn set_pixel_rgb565(&mut self, r: u8, g: u8, b: u8) {
    assert!(r < 32);
    assert!(g < 64);
    assert!(b < 32);
    ...
}

or:

/// Supplying component values outside the designated range will
/// yield an `Err` that needs to be handled by the caller
pub fn set_pixel_rgb565(&mut self, r: u8, g: u8, b: u8) -> Result<(), E> {
    if r >= 32 || g >= 64 || b >= 32 {
        return Err(E::OutOfRange);
    }
    ...
}

None of the three options is good; each one really just makes a tradeoff of some kind.

What we really want to do instead is:

/// Rust will do the type-checking for us
pub fn set_pixel_rgb565(&mut self, r: u5, g: u6, b: u5) {
    ...
}

This ensures that only correct types will be passed in, and they have exactly the same behaviour as all other "natural" primitives.

Before people scream at me "why don't you use the ux crate for this?", I'd like to point out that it is rather incomplete and outdated. I'm also convinced that the compiler can do a much better job of catching incorrect uses of such types at compile time instead of panicking at runtime.
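To make the compile time versus runtime point a bit more concrete, here is a minimal sketch of how far a hand-written newtype gets us today. `U5` and `FULL_RED` are hypothetical names made up for this illustration (this is not the ux crate's API), and the sketch assumes a compiler recent enough to allow panicking inside a const fn.

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for the proposed `u5`: a `u8` known to be < 32.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct U5(u8);

impl U5 {
    pub const MAX: U5 = U5(31);

    /// Panics when `v` does not fit into 5 bits. When this is evaluated
    /// in a `const` context, the panic turns into a compile error.
    pub const fn new(v: u8) -> U5 {
        assert!(v < 32, "value does not fit into 5 bits");
        U5(v)
    }

    pub const fn get(self) -> u8 {
        self.0
    }
}

/// Runtime conversion: the caller has to handle the out-of-range case.
impl TryFrom<u8> for U5 {
    type Error = u8; // the rejected value, kept simple for this sketch

    fn try_from(v: u8) -> Result<Self, Self::Error> {
        if v < 32 {
            Ok(U5(v))
        } else {
            Err(v)
        }
    }
}

/// Checked at compile time: changing 31 to 32 here fails the build.
pub const FULL_RED: U5 = U5::new(31);
```

This works, but only constants get the compile-time check, literals need wrapping everywhere, and arithmetic, casts and formatting would all have to be written by hand for every width. That is the part I'd like the compiler to take over.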

But let's move on to the other use case...

Bitfields

One glaring omission in the Rust type system is bitfields, which are arguably a MUST HAVE for any systems programming language. Especially in embedded Rust we have thousands of memory-mapped input/output (MMIO) registers which cram a number of different functions, identified by one or more bits, into a value located at a specific memory address. To access the individual bits, other languages allow the (very unsafe) specification of structures as bitfields, while we in the Rust world have to use either gnarly macros or gobs of Rust code, which are super slow to compile (and which also turn into gobs of binary code in dev mode, but I'll leave that to another Roadmap blog post maybe).

I won't bore you with details or pages of Rust code here, but if you're interested in seeing real-life code in use today for one single register(!) of an STM32F042 MCU generated by svd2rust, here's a gist: I2C1 CR1 svd2rust code. You're also invited to browse the STM32F0 svd2rust crate docs, which are constantly getting into conflict with our docs.rs infrastructure due to the massive amounts of code required.

It would be really cool if we could express the basic structure of such an MMIO register by using the new primitive types in a regular struct:

#[repr(u32)]
struct CR1 {
    #[doc = "Bit 0 - Peripheral enable"]
    pe: u1,
    #[doc = "Bit 1 - TX Interrupt enable"]
    txie: u1,
    #[doc = "Bit 2 - RX Interrupt enable"]
    rxie: u1,
    #[doc = "Bit 3 - Address match interrupt enable (slave only)"]
    addrie: u1,
    #[doc = "Bit 4 - Not acknowledge received interrupt enable"]
    nackie: u1,
    #[doc = "Bit 5 - STOP detection Interrupt enable"]
    stopie: u1,
    #[doc = "Bit 6 - Transfer Complete interrupt enable"]
    tcie: u1,
    #[doc = "Bit 7 - Error interrupts enable"]
    errie: u1,
    #[doc = "Bits 8:11 - Digital noise filter"]
    dnf: u4,
    #[doc = "Bit 12 - Analog noise filter OFF"]
    anfoff: u1,
    // Bit 13 is skipped in this listing; pad it so the following fields line up
    _unused13: u1,
    #[doc = "Bit 14 - DMA transmission requests enable"]
    txdmaen: u1,
    #[doc = "Bit 15 - DMA reception requests enable"]
    rxdmaen: u1,
    #[doc = "Bit 16 - Slave byte control"]
    sbc: u1,
    #[doc = "Bit 17 - Clock stretching disable"]
    nostretch: u1,
    #[doc = "Bit 18 - Wakeup from STOP enable"]
    wupen: u1,
    #[doc = "Bit 19 - General call enable"]
    gcen: u1,
    #[doc = "Bit 20 - SMBus Host address enable"]
    smbhen: u1,
    #[doc = "Bit 21 - SMBus Device Default address enable"]
    smbden: u1,
    #[doc = "Bit 22 - SMBUS alert enable"]
    alerten: u1,
    #[doc = "Bit 23 - PEC enable"]
    pecen: u1,
    // Need to always declare all bits according to `repr` size
    _unused: u8,
}

Looks very nice and tidy, doesn't it? Of course we'll still need the accessor functions to differentiate between read-only and read-write fields, but this gets rid of all the error-prone bit-shifting and masking code.
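For comparison, this is the flavour of bit-shifting and masking code that has to be written (or generated) today for just two of those fields. It is a deliberately tiny hand-written sketch, not the actual svd2rust output; `Cr1` and its methods are hypothetical names, and the bit positions are the ones from the doc comments above.

```rust
/// Hypothetical hand-written view of a CR1 register value.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Cr1(pub u32);

impl Cr1 {
    const PE_BIT: u32 = 0; // Bit 0 - Peripheral enable
    const DNF_SHIFT: u32 = 8; // Bits 8:11 - Digital noise filter
    const DNF_MASK: u32 = 0xF; // 4 bits wide

    /// Reads the single-bit PE field.
    pub fn pe(self) -> bool {
        ((self.0 >> Self::PE_BIT) & 1) == 1
    }

    /// Reads the 4-bit DNF field.
    pub fn dnf(self) -> u8 {
        ((self.0 >> Self::DNF_SHIFT) & Self::DNF_MASK) as u8
    }

    /// Returns a copy with DNF replaced; bits of `value` above the
    /// field width are silently dropped.
    pub fn with_dnf(self, value: u8) -> Cr1 {
        let field = (u32::from(value) & Self::DNF_MASK) << Self::DNF_SHIFT;
        let cleared = self.0 & !(Self::DNF_MASK << Self::DNF_SHIFT);
        Cr1(cleared | field)
    }
}
```

Every shift, mask and width here is a chance for an off-by-one, and with_dnf again has to pick some policy for values that do not fit into 4 bits. With a dnf: u4 field both problems would go away.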

Another use case for this is real (and space- and execution-efficient!) bitmaps:

// size_of::<[bool; 10]>() == 10
let arr: [bool; 10] = [false; 10];

// size_of::<[u1; 10]>() == 2
let arr: [u1; 10] = [0; 10];
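To get the same two-byte footprint today, the packing has to be done by hand. A minimal sketch, assuming a fixed length of 10 bits stored least significant bit first; `Bits10` and `sizes` are hypothetical names used only for this illustration.

```rust
use std::mem::size_of;

/// Hypothetical hand-rolled stand-in for `[u1; 10]`: ten bits in two bytes.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct Bits10([u8; 2]);

impl Bits10 {
    pub const LEN: usize = 10;

    /// Returns bit `i`; panics if `i` is out of range.
    pub fn get(&self, i: usize) -> bool {
        assert!(i < Self::LEN, "bit index out of range");
        ((self.0[i / 8] >> (i % 8)) & 1) == 1
    }

    /// Sets or clears bit `i`; panics if `i` is out of range.
    pub fn set(&mut self, i: usize, on: bool) {
        assert!(i < Self::LEN, "bit index out of range");
        let mask = 1u8 << (i % 8);
        if on {
            self.0[i / 8] |= mask;
        } else {
            self.0[i / 8] &= !mask;
        }
    }
}

/// Sizes in bytes of the unpacked and the packed representation.
pub fn sizes() -> (usize, usize) {
    (size_of::<[bool; 10]>(), size_of::<Bits10>())
}
```

It does the job, but we lose plain array indexing, slicing and iteration, all of which a real [u1; 10] could simply inherit from arrays.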

Implementation ideas

Obviously we cannot simply add a plethora of new primitives and be done with it; there are some problems with non-naturally sized primitives:

Some random to-dos I can think of off the top of my head:

To keep this blog post from getting too long, I'm going to turn this over to the core team now for initial consideration.

Thanks for your attention, and I hope to blog to y'all soon!