There is a reliable pattern in early-stage embedded ML work. The sensor count goes up. The feature count goes up. The model gets bigger to handle the larger input space. Training takes longer. And the deployed performance is worse than the prototype, not better.
The cause is almost always the same. State design was treated as a default, not a decision. Everything available was included because including more felt safer. It is not.
What State Design Actually Is
State design is the decision about what your system observes. In a physical automation context this means choosing which sensors contribute to the input representation, at what frequency, and in what format. It sounds like a preprocessing decision. It is an architecture decision.
The state representation determines the sample complexity of learning, the inference latency on deployed hardware, whether the system generalises across physical variation, and how easy the system is to debug when something goes wrong.
The Fall Prevention System
The initial prototype used full IMU data, three accelerometer axes and three gyroscope axes, plus a barometric pressure channel and a temperature channel because the sensor had them. In practice, the barometric and temperature channels added noise without adding discriminative signal for the fall detection task. Removing those two channels reduced input dimensionality by 25%, cut inference latency on the target RISC-V hardware, and improved detection accuracy because the model was no longer fitting to irrelevant variation.
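The reduction itself is a one-line slicing operation once the channel layout is fixed. The sketch below is illustrative, not the project's actual code: the channel names, window length, and frame layout are assumptions chosen to match the description above.

```python
import numpy as np

# Hypothetical 8-channel frame layout from the prototype:
# 3 accelerometer axes, 3 gyroscope axes, barometer, temperature.
FULL_CHANNELS = ["ax", "ay", "az", "gx", "gy", "gz", "baro", "temp"]
KEEP = ["ax", "ay", "az", "gx", "gy", "gz"]  # channels carrying discriminative signal

KEEP_IDX = [FULL_CHANNELS.index(c) for c in KEEP]

def reduce_state(window: np.ndarray) -> np.ndarray:
    """Drop noise-only channels: window shape [T, 8] -> [T, 6]."""
    return window[:, KEEP_IDX]

window = np.random.randn(128, len(FULL_CHANNELS))  # one 128-sample window
reduced = reduce_state(window)
assert reduced.shape == (128, 6)
```

The important part is not the slicing but the decision recorded in `KEEP`: the state representation becomes an explicit, reviewable artifact rather than whatever the sensor driver happened to emit.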
How to Approach State Design
The approach I use now starts with the task specification, not the sensor list. What physical phenomenon does the system need to detect? What are the motion signatures, timing patterns, amplitude ranges? From that description, which sensor channels could plausibly contain that information? Those channels are candidates. Everything else needs justification.
For each candidate channel, the question is whether removing it would reduce the system's ability to perform the task. If the answer is no, remove it. The experiment is usually fast and the results are usually clear. Channels that matter show up immediately in the ablation.
State Design and Edge Deployment
This matters more for edge deployment than any other context. In a cloud system you can afford to be generous with state. On a microcontroller running on battery power, in a physical environment where inference needs to happen in real time, every channel costs something. Memory. Power. Latency. Complexity.
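Those costs can be made concrete with a back-of-envelope budget per channel. The numbers below (sample rate, window length, bytes per sample) are illustrative assumptions, not measurements from any particular device.

```python
def state_budget(n_channels: int, rate_hz: int = 100,
                 window_s: float = 1.28, bytes_per_sample: int = 2) -> dict:
    """Buffer memory and model input size for a windowed state representation.

    Assumes raw int16 samples held in a ring buffer for one inference window.
    """
    samples_per_channel = int(rate_hz * window_s)  # 128 samples at the defaults
    return {
        "input_dims": samples_per_channel * n_channels,
        "buffer_bytes": samples_per_channel * n_channels * bytes_per_sample,
    }

full = state_budget(8)  # prototype: 6 IMU channels + barometer + temperature
lean = state_budget(6)  # IMU only
```

On a microcontroller with a few tens of kilobytes of RAM, the difference between the two budgets is not an optimisation detail; it can decide whether the model fits at all alongside the rest of the firmware.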
The question for each channel is the same: if it were removed, would the system be unable to perform its task? If the answer is not clearly yes, the channel needs justification, not assumption.
The smaller the state space, the faster the training, the faster the inference, the more interpretable the learned behaviour, the easier the debugging. These differences separate a system you can ship from one you are still iterating on eighteen months later.