Pilot 0.1 Complete
This week we wrapped pilot 0.1 with three produce importers. This marks the end of our first deployment cycle and likely our last major update before the new year. It's a good moment to look back at what worked, what didn't, and where we go from here.
All three importers used the system to classify incoming shipments by smell alone. That sentence still feels improbable to write.
What We Set Out to Do
The goal for pilot 0.1 was straightforward: prove that portable chemical sensors could classify produce at receiving docks in real time, without lab equipment, and deliver information operators could act on immediately.
We hit those marks. The system ran at the dock. It processed readings in seconds, not hours. And importers used the output to make routing decisions they couldn't make before.
One partner used it to sort avocados by ripeness stage, routing them to different distribution channels based on remaining shelf life. Another used it to flag banana shipments that needed expedited handling. A third tested it as a pre-screen for quality issues that would normally require destructive sampling.
The Technical Breakthrough
The system works because we stopped trying to measure absolute chemical signatures. Instead, we built a temporal difference method that tracks relative changes in sensor readings over time.
This matters because environmental conditions never stay constant. Temperature shifts between morning and afternoon. Humidity varies between facilities. Airflow patterns differ from one dock to another. If the model depends on absolute sensor values, it breaks every time the environment changes.
By focusing on deltas rather than raw readings, the system became robust to these shifts. The model learned to recognize patterns in how odor signatures evolve during a capture sequence, not just what the numbers are at any single moment.
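To make the idea concrete, here's a minimal sketch of that preprocessing step. The shapes, the sampling assumptions, and the choice to combine baseline-relative values with first differences are illustrative, not a description of our production pipeline:

```python
import numpy as np

def delta_features(capture: np.ndarray) -> np.ndarray:
    """Convert a capture sequence of raw sensor readings into
    relative-change features.

    capture: array of shape (T, C) -- T timesteps, C sensor channels,
    e.g. a 30-second window sampled at 1 Hz.
    """
    # Baseline: the first reading of the sequence, treated as the
    # ambient signature of this dock at this moment.
    baseline = capture[0]

    # Relative change from baseline cancels slow environmental drift
    # (temperature, humidity, airflow) that shifts all raw readings
    # together.
    rel = (capture - baseline) / (np.abs(baseline) + 1e-6)

    # First-order differences capture how the odor signature is
    # evolving step to step, not just where it ended up.
    diffs = np.diff(capture, axis=0, prepend=capture[:1])

    return np.concatenate([rel, diffs], axis=-1)  # shape (T, 2C)
```

The useful property is that a uniform shift in all raw readings, a warmer afternoon or a different dock, largely cancels out before the model ever sees the data.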
We also confirmed that sequence models outperform every other approach we tried for this task. The temporal patterns matter more than we expected: two items can have nearly identical chemical profiles at a snapshot in time but completely different emission curves over a 30-second window. Capturing that dynamic is what makes reliable classification possible.
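For readers who want a picture of what "sequence model" means here, the sketch below uses a small GRU as a stand-in. We haven't published the actual architecture, so treat every layer choice as hypothetical; the point is the shape of the problem:

```python
import torch
import torch.nn as nn

class OdorSequenceClassifier(nn.Module):
    """Illustrative sequence classifier over capture-window features.

    Input is a (batch, T, F) capture sequence; the recurrent state
    summarizes the whole emission curve rather than any single
    snapshot.
    """
    def __init__(self, n_features: int, n_classes: int, hidden: int = 64):
        super().__init__()
        self.gru = nn.GRU(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The final hidden state encodes how the signature evolved
        # over the full window.
        _, h = self.gru(x)
        return self.head(h[-1])  # class logits
```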
Where We're Still Limited
Accuracy is good enough to prove the concept but not good enough for full automation: the system can inform decisions, but it can't yet make them alone. That's the gap we need to close.
The constraint is resolution. Our portable sensors work for deployment, but they capture far less chemical detail than lab equipment. During training, we paired sensor data with gas chromatography analysis, which helped the model learn richer patterns. But at deployment, we only have the sensors. The model is working with less information than it was trained on.
The path forward is to go back to the lab and generate a much larger training dataset using high-resolution instruments paired with field sensors, teach the model to extract as much signal as possible from limited inputs, and then deploy that improved model to the same portable hardware.
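This train-rich, deploy-lean setup resembles what the literature calls learning with privileged information, or cross-modal distillation: train a lab-side encoder on the rich data and pull the field model's representation toward it. As a sketch only, with placeholder model interfaces and an arbitrary 0.5 loss weight, not our actual training code:

```python
import torch
import torch.nn.functional as F

def privileged_training_step(student, teacher, sensor_seq, gc_profile, label):
    """One hypothetical training step pairing field-sensor sequences
    with gas-chromatography profiles.

    student: model that sees only sensor data (what ships to the dock),
             assumed to return (logits, embedding)
    teacher: encoder that sees the GC analysis (lab-only)
    """
    logits, sensor_embed = student(sensor_seq)   # deployable path
    with torch.no_grad():
        gc_embed = teacher(gc_profile)           # lab-only path

    # Supervised loss on the actual classification target.
    cls_loss = F.cross_entropy(logits, label)

    # Pull the sensor embedding toward the richer GC embedding so the
    # student learns to extract GC-correlated structure from
    # low-resolution inputs.
    distill_loss = F.mse_loss(sensor_embed, gc_embed)

    return cls_loss + 0.5 * distill_loss
```

At deployment the teacher is discarded entirely; only the student, which never needed lab data as input, runs on the dock.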
We're also learning that "accurate enough" is context dependent. One importer is satisfied with current performance because it beats their existing process. Another won't deploy until we double the accuracy. A third wants probabilistic confidence scores instead of hard classifications. Nobody documented their decision criteria before we started, so every conversation surfaces new requirements.
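That last request is worth illustrating, because it also addresses the automation gap: instead of emitting a hard class, the system can emit a full probability distribution and defer to the operator below a confidence threshold. A sketch, with a placeholder threshold that each importer would tune to their own criteria:

```python
import torch.nn.functional as F

def route_decision(logits, auto_threshold: float = 0.90):
    """Turn raw class logits into either an automatic routing call or
    a deferral to the operator. The 0.90 threshold is a placeholder.
    """
    probs = F.softmax(logits, dim=-1)
    conf, pred = probs.max(dim=-1)
    if conf.item() >= auto_threshold:
        return {"action": "route",
                "class": pred.item(),
                "confidence": conf.item()}
    # Below threshold: surface the uncertainty instead of guessing.
    return {"action": "defer_to_operator",
            "confidence": conf.item(),
            "distribution": probs.tolist()}
```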
This is expected. Pilot 0.1 was about learning what the real constraints are. Now we know.
What Changes in 2026
The next phase is focused on two things: improving the model with better training data, and tightening the feedback loop between lab work and field deployment.
We'll be expanding our training dataset using chromatography and mass spectrometry paired with portable sensors. The goal is to teach the system to recognize patterns in low-resolution data that correlate with high-resolution chemical features, along the lines of the distillation sketch above. This should improve field performance without requiring better hardware.
We're also building a structured data collection process with the importers. Every override, every disputed reading, every case where the operator disagreed with the system becomes a labeled example we can learn from. This closes the loop: deployment generates the data we need to improve the model, which improves deployment.
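As a sketch of what one such labeled example could look like (every field name here is illustrative, not a published schema):

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class OverrideRecord:
    """One operator disagreement, captured as a future training example."""
    capture_id: str          # links back to the raw sensor sequence
    model_prediction: str    # what the system said
    model_confidence: float
    operator_label: str      # what the operator decided instead
    operator_note: str = ""  # free-text reason, if given
    facility: str = ""       # which dock, for environment analysis
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))
```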
And we'll be documenting decision criteria more carefully. We learned from Month Three that the hard part isn't predicting accurately. It's knowing what actions should follow from which predictions. That's operational design, not machine learning. Both matter.
What This Pilot Proved
Six months ago, the idea that you could classify produce by smell using portable sensors at a receiving dock seemed ambitious. The default assumption in the industry is that you need either trained human inspectors or expensive lab analysis.
Pilot 0.1 showed that's not true anymore. The system works. It's not perfect, but it's real. Three produce importers in Texas used it in production. They made different decisions because of it. And the path to improving it is clear.
We're building the foundation for a universal smell model one deployment at a time. Every pilot teaches us what's missing. Every dataset makes the next version better. Every partner helps us understand what "useful" actually means in practice.
This is the last update before we close out the year. We'll be heads down in the lab through the holidays, training the next version. When we come back in 2026, the system will be better.
Thank you to everyone who's followed along this year. It's been a strange, difficult, and ultimately successful first phase. I'm looking forward to what comes next.