TL;DR
Artificial Tripartite Intelligence (ATI) treats sensing as an active part of physical AI. Instead of feeding fixed camera frames to a model, ATI puts sensor control at the front of the loop, correcting exposure and motion blur before perception, and escalates to a large remote model only when it has to. The result: an on-device perception stack that stays accurate under hard capture conditions while calling the cloud far less often.
Abstract
As AI moves from data centers to robots and wearables, scaling ever-larger models becomes insufficient. Physical AI operates under tight latency, energy, privacy, and reliability constraints, and its performance depends not only on model capacity but also on how signals are acquired through controllable sensors in dynamic environments. We present Artificial Tripartite Intelligence (ATI), a bio-inspired, sensor-first architectural contract for physical AI. ATI is tripartite at the systems level: a Brainstem (L1) provides reflexive safety and signal-integrity control, a Cerebellum (L2) performs continuous sensor calibration, and a Cerebral Inference Subsystem spanning L3/L4 supports routine skill selection and execution, coordination, and deep reasoning. This modular organization allows sensor control, adaptive sensing, edge-cloud execution, and foundation model reasoning to co-evolve within one closed-loop architecture, while keeping time-critical sensing and control on device and invoking higher-level inference only when needed. We instantiate ATI in a mobile camera prototype under dynamic lighting and motion. In our routed evaluation (L3-L4 split inference), compared to the default auto-exposure setting, ATI (L1/L2 adaptive sensing) improves end-to-end accuracy from 53.8% to 88% while reducing remote L4 invocations by 43.3%. These results show the value of co-designing sensing and inference for embodied AI.
Motivation
The same scene captured under different conditions can be trivial or impossible for a perception model. Auto-exposure optimizes images for human viewers, not for downstream AI, and its choices (long exposures, high ISO) often introduce exactly the motion blur and noise that hurt on-device models most.
Method
ATI maps four levels of control onto a familiar biological hierarchy. Lower levels are fast, reflexive, and always on-device; higher levels are slower, more capable, and escalated to only when needed.
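The escalate-only-when-needed behavior between the on-device level (L3) and the remote level (L4) can be sketched as a confidence-gated router. This is a minimal illustration under stated assumptions, not the authors' implementation: the `route` function, the `RouteResult` type, the model callables, and the 0.8 threshold are all hypothetical, and the source does not say which signal actually triggers escalation.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple

# A model is assumed to be any callable returning (label, confidence in [0, 1]).
Model = Callable[[Any], Tuple[str, float]]


@dataclass
class RouteResult:
    label: str
    confidence: float
    level: str  # "L3" (answered on device) or "L4" (escalated to the remote model)


def route(frame: Any, l3_model: Model, l4_model: Model,
          threshold: float = 0.8) -> RouteResult:
    """Try the on-device model first; escalate only if it is not confident.

    The threshold is an assumed value chosen for illustration.
    """
    label, confidence = l3_model(frame)
    if confidence >= threshold:
        return RouteResult(label, confidence, "L3")
    # Low confidence: pay the latency and energy cost of the remote model.
    label, confidence = l4_model(frame)
    return RouteResult(label, confidence, "L4")
```

Under this sketch, better capture quality from L1/L2 raises L3 confidence, so fewer frames cross the escalation branch.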
The Cerebellum (L2) is where the sensor-calibration policy is learned. We frame it as a contextual multi-armed bandit: given device and environmental context (motion from the accelerometer/gyroscope, illuminance from the light sensor), the policy selects bounded adjustments to ISO and exposure, always within the safety envelope enforced by the Brainstem (L1), and is rewarded by the downstream perception signal (confidence and sharpness) produced by L3. Once learned, the policy is consolidated into a compact lookup over motion × light states for fast, low-power inference.
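The bandit formulation above can be sketched as a small tabular learner. This is a hedged illustration only: the source does not specify the exploration strategy, the state discretization, the arm set, the reward weighting, or the envelope limits, so epsilon-greedy selection, the state names, the scale factors, the equal reward weights, and the ISO and exposure ranges below are all assumptions, as are the names `TabularBandit`, `apply_arm`, and `combined_reward`.

```python
import random
from itertools import product

# Assumed discretization of context (motion from IMU, illuminance from light sensor).
MOTION_STATES = ("still", "moving", "fast")
LIGHT_STATES = ("dark", "indoor", "bright")
# Assumed arms: bounded multiplicative adjustments (iso_scale, exposure_scale).
ARMS = [(i, e) for i in (0.5, 1.0, 2.0) for e in (0.5, 1.0, 2.0)]
NO_CHANGE = (1.0, 1.0)
# Assumed L1 safety envelope; real limits depend on the sensor.
ISO_RANGE = (100, 3200)
EXPOSURE_RANGE_MS = (1.0, 33.0)


def clamp(value, bounds):
    low, high = bounds
    return max(low, min(high, value))


def apply_arm(iso, exposure_ms, arm):
    """Apply an adjustment, then clamp to the envelope enforced by L1."""
    iso_scale, exposure_scale = arm
    return (clamp(iso * iso_scale, ISO_RANGE),
            clamp(exposure_ms * exposure_scale, EXPOSURE_RANGE_MS))


def combined_reward(confidence, sharpness, weight=0.5):
    """Assumed reward: weighted mix of L3 confidence and sharpness, both in [0, 1]."""
    return weight * confidence + (1.0 - weight) * sharpness


class TabularBandit:
    """Epsilon-greedy contextual bandit over discrete (motion, light) contexts."""

    def __init__(self, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.value = {c: [0.0] * len(ARMS)
                      for c in product(MOTION_STATES, LIGHT_STATES)}
        self.count = {c: [0] * len(ARMS) for c in self.value}

    def select(self, context):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(ARMS))  # explore
        values = self.value[context]
        return max(range(len(ARMS)), key=values.__getitem__)  # exploit

    def update(self, context, arm_index, reward):
        """Incremental mean of observed rewards for this (context, arm) pair."""
        self.count[context][arm_index] += 1
        n = self.count[context][arm_index]
        self.value[context][arm_index] += (reward - self.value[context][arm_index]) / n

    def consolidate(self):
        """Freeze the policy into a motion x light lookup table.

        Contexts that were never observed fall back to NO_CHANGE.
        """
        table = {}
        for context, values in self.value.items():
            if sum(self.count[context]) == 0:
                table[context] = NO_CHANGE
            else:
                table[context] = ARMS[max(range(len(ARMS)), key=values.__getitem__)]
        return table
```

At run time, only the consolidated table and `apply_arm` would be needed: look up the current (motion, light) state, apply the stored adjustment, and let the clamp keep the result inside the envelope.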
Results
Against an auto-exposure baseline, ATI holds exposure and ISO in ranges that keep frames sharp under changing light, rather than chasing brightness. In the routed end-to-end evaluation (L3-L4 split inference), this lifts overall accuracy from 53.8% to 88% while cutting remote L4 invocations by 43.3%. The per-stage breakdown below shows where the gains come from: on-device perception (L3) accuracy rises sharply, and the system escalates to L4 far less often.
Qualitative comparison
The effect is most visible on hard cases. Auto-exposure smears moving targets; naive electronic image stabilization darkens the frame; the L1 reflex layer recovers a usable image; and adding the L2 calibration policy yields a sharp, well-exposed capture.
Beyond vision: auditory ATI
The sensor-first principle is not specific to cameras. Applied to on-device audio capture, an ATI gain policy improves the signal-to-noise ratio over standard automatic gain control in a controlled noise-plus-signal setup.
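For reference, signal-to-noise ratio in a controlled setup, where the clean signal and the noise are known separately, is conventionally reported in decibels as ten times the base-10 logarithm of the power ratio. The snippet below is a generic sketch of that standard formula; the function name `snr_db` and the sample values are illustrative and are not taken from the paper's evaluation.

```python
import math


def snr_db(signal, noise):
    """Return SNR in dB from separately known signal and noise samples."""
    signal_power = sum(s * s for s in signal) / len(signal)
    noise_power = sum(n * n for n in noise) / len(noise)
    return 10.0 * math.log10(signal_power / noise_power)


# Illustrative values: noise amplitude is one tenth of the signal amplitude,
# so the power ratio is 100.
example = snr_db([1.0, -1.0, 1.0, -1.0], [0.1, -0.1, 0.1, -0.1])
```

A gain policy could be compared against automatic gain control by computing this quantity on captures made under each setting.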
Demo video
BibTeX citation
@inproceedings{10.1145/3745756.3809242,
  author    = {Choi, You Rim and Park, Subeom and Kim, Hyung-Sin},
  title     = {[Emerging Ideas] Artificial Tripartite Intelligence: A Bio-Inspired, Sensor-First Architecture for Physical AI},
  year      = {2026},
  isbn      = {9798400720277},
  publisher = {Association for Computing Machinery},
  address   = {New York, NY, USA},
  url       = {https://doi.org/10.1145/3745756.3809242},
  doi       = {10.1145/3745756.3809242},
  booktitle = {Proceedings of the 24th Annual International Conference on Mobile Systems, Applications and Services},
  pages     = {839--853},
  numpages  = {15},
  keywords  = {physical AI, bio-inspired AI, adaptive sensing, offloading},
  location  = {University of Cambridge, Cambridge, United Kingdom},
  series    = {MobiSys '26}
}