Teaching machines to smell

AI can see and hear, but smell has no digital standard.

Smell is central to daily life, yet unlike vision or language, it has no foundation model.

Neosmia is creating the first universal representation of olfaction for industries, researchers, and creators.


Pilot 0.1 Complete

This week we wrapped pilot 0.1 with three produce importers. This marks the end of our first deployment cycle and likely our last major update before the new year. It's a good moment to look back at what worked, what didn't, and where we go from here.

All three importers used the system to classify incoming shipments by smell alone. That sentence still feels improbable to write.

What We Set Out to Do

The goal for pilot 0.1 was straightforward: prove that portable chemical sensors could classify produce at receiving docks in real time, without lab equipment, and deliver information operators could act on immediately.

We hit those marks. The system ran at the dock. It processed readings in seconds, not hours. And importers used the output to make routing decisions they couldn't make before.

One partner used it to sort avocados by ripeness stage, routing them to different distribution channels based on how much shelf life remained. Another used it to flag banana shipments that needed expedited handling. A third tested it as a pre-screen for quality issues that would normally require destructive sampling.

The Technical Breakthrough

The system works because we stopped trying to measure absolute chemical signatures. Instead, we built a temporal difference method that tracks relative changes in sensor readings over time.

This matters because environmental conditions never stay constant. Temperature shifts between morning and afternoon. Humidity varies between facilities. Airflow patterns differ from one dock to another. If the model depends on absolute sensor values, it breaks every time the environment changes.

By focusing on deltas rather than raw readings, the system became robust to these shifts. The model learned to recognize patterns in how odor signatures evolve during a capture sequence, not just what the numbers are at any single moment.

We also confirmed that sequence models outperform everything else for this task. The temporal patterns matter more than we expected. Two items can have nearly identical chemical profiles at a snapshot in time but completely different emission curves over a 30-second window. Capturing that dynamic is what makes reliable classification possible.
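The delta idea above can be sketched in a few lines. This is a minimal illustration, not our production pipeline; the array shapes and baseline model are hypothetical:

```python
import numpy as np

def delta_features(readings):
    """Convert a capture sequence of raw sensor values into per-step
    deltas, removing the absolute environmental baseline.

    readings: array of shape (timesteps, n_sensors)
    returns:  array of shape (timesteps - 1, n_sensors)
    """
    return np.diff(readings, axis=0)

# Hypothetical 30-second capture at 1 Hz from a 4-sensor array.
rng = np.random.default_rng(0)
baseline = rng.uniform(100, 200, size=4)                  # differs per dock
emission = np.cumsum(rng.normal(0.5, 0.1, (30, 4)), axis=0)
raw = baseline + emission                                 # absolute values shift with baseline

deltas = delta_features(raw)
# The deltas are independent of the environmental baseline:
assert np.allclose(deltas, np.diff(emission, axis=0))
```

A sequence model then consumes `deltas` rather than `raw`, so two docks with different baselines produce the same input when the emission dynamics match.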

Where We're Still Limited

Accuracy is good enough to prove the concept, but not good enough for full automation. The system can inform decisions; it can't yet make them alone. That's the gap we need to close.

The constraint is resolution. Our portable sensors work for deployment, but they capture far less chemical detail than lab equipment. During training, we paired sensor data with gas chromatography analysis, which helped the model learn richer patterns. But at deployment, we only have the sensors. The model is working with less information than it was trained on.

The path forward is to go back to the lab and generate a much larger training dataset using high resolution instruments paired with field sensors. Teach the model to extract as much signal as possible from limited inputs. Then deploy that improved model to the same portable hardware.

We're also learning that "accurate enough" is context dependent. One importer is satisfied with current performance because it beats their existing process. Another won't deploy until we double the accuracy. A third wants probabilistic confidence scores instead of hard classifications. Nobody documented their decision criteria before we started, so every conversation surfaces new requirements.

This is expected. Pilot 0.1 was about learning what the real constraints are. Now we know.

What Changes in 2026

The next phase is focused on two things: improving the model with better training data, and tightening the feedback loop between lab work and field deployment.

We'll be expanding our training dataset using chromatography and mass spectrometry paired with portable sensors. The goal is to teach the system to recognize patterns in low resolution data that correlate with high resolution chemical features. This should improve field performance without requiring better hardware.
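One way to frame that goal is as a cross-resolution mapping learned from paired captures. The sketch below uses least squares on synthetic data purely to illustrate the shape of the problem; the dimensions and the linear model are assumptions, not our actual method:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical paired dataset: each sample has a low-resolution field
# reading and a high-resolution lab measurement of the same capture.
n_samples, n_field, n_lab = 200, 8, 64
lab = rng.normal(size=(n_samples, n_lab))          # GC/MS-style features
mix = rng.normal(size=(n_lab, n_field))
field = lab @ mix + 0.05 * rng.normal(size=(n_samples, n_field))

# Learn a map from field readings back toward the lab feature space,
# so a deployed model can approximate lab-grade features from sensors alone.
W, *_ = np.linalg.lstsq(field, lab, rcond=None)
recovered = field @ W

# With 8 field channels and 64 lab features the recovery is lossy by
# construction -- the point is to extract as much signal as possible.
err = float(np.mean((recovered - lab) ** 2))
```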

We're also building a structured data collection process with the importers. Every override, every disputed reading, every case where the operator disagreed with the system becomes a labeled example we can learn from. This closes the loop: deployment generates the data we need to improve the model, which improves deployment.
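A record like the following is one way to capture that loop; the field names are illustrative, not our actual schema:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class OverrideRecord:
    """One operator override, stored as a candidate labeled example."""
    capture_id: str
    model_label: str         # what the system predicted
    operator_label: str      # what the operator decided instead
    reason: str
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    @property
    def is_disagreement(self):
        return self.model_label != self.operator_label

log = [
    OverrideRecord("c-001", "ripe", "overripe", "visible bruising"),
    OverrideRecord("c-002", "fresh", "fresh", "confirmed on recheck"),
]
# Only genuine disagreements become training examples.
training_examples = [r for r in log if r.is_disagreement]
```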

And we'll be documenting decision criteria more carefully. We learned from Month Three that the hard part isn't predicting accurately. It's knowing what actions should follow from which predictions. That's operational design, not machine learning. Both matter.

What This Pilot Proved

Six months ago, the idea that you could classify produce by smell using portable sensors at a receiving dock seemed ambitious. The default assumption in the industry is that you need either trained human inspectors or expensive lab analysis.

Pilot 0.1 showed that's not true anymore. The system works. It's not perfect, but it's real. Three produce importers in Texas used it in production. They made different decisions because of it. And the path to improving it is clear.

We're building the foundation for a universal smell model one deployment at a time. Every pilot teaches us what's missing. Every dataset makes the next version better. Every partner helps us understand what "useful" actually means in practice.

This is the last update before we close out the year. We'll be heads down in the lab through the holidays, training the next version. When we come back in 2026, the system will be better.

Thank you to everyone who's followed along this year. It's been a strange, difficult, and ultimately successful first phase. I'm looking forward to what comes next.

Month Three Update: It Worked

Two weeks ago I wrote about getting stuck in Month Three of our freshness AI project. I said I stopped improving the model and started documenting actual decision processes instead.

That turned out to be the right move.

What Changed

I rebuilt our QC workflow around explicit decision rules. When the sensor says produce is at 85% freshness, here's exactly what happens next. If a store manager disputes a delivery, here's the exact escalation path. If routing gets a recommendation that seems wrong, here's how to evaluate it.

We now have written procedures for situations we used to handle through judgment calls.
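The written procedures boil down to decision tables. A minimal sketch, with hypothetical thresholds rather than the importer's actual criteria:

```python
def routing_decision(freshness_pct, confidence):
    """Illustrative decision table for the QC workflow described above.

    freshness_pct: model's freshness estimate, 0-100
    confidence:    model's confidence in that estimate, 0-1
    """
    if confidence < 0.7:
        return "escalate_to_inspector"   # low confidence always goes to a human
    if freshness_pct >= 85:
        return "route_to_retail"
    if freshness_pct >= 60:
        return "route_to_processing"
    return "hold_for_review"
```

The point is not the specific numbers but that every branch is written down, so the QC team no longer has to guess what a given reading implies.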

The difference is immediate. The QC team knows when to trust the sensor and when to escalate. Store managers know the freshness standards their deliveries will meet. Routing decisions happen faster because the criteria are clear.

We're processing 30% more volume through QC with the same team size. Override requests dropped by half. The sensor recommendations get followed 89% of the time now. It was 23% three weeks ago.

The Real Pattern

Month Three felt slow because I was solving the wrong problem. I kept trying to make the predictions more accurate. The predictions were already accurate enough.

The problem was that nobody knew what to do with accurate predictions. Our workflows assumed humans would handle every edge case through experience. The AI exposed that we never wrote down what that experience actually was.

Once I wrote it down, everything moved faster.

The teams I've talked to who got past Month Three all did the same thing. They stopped treating AI as a prediction tool. They started treating it as a reason to document their actual processes.

What I'm Doing Now

I'm expanding the documented procedures to our other facilities. Each location has slightly different practices. Now I'm writing those down too.

I'm also building feedback loops so the system improves from operational data. When someone overrides a recommendation, that tells me something. Either the recommendation was wrong or the override criteria need adjustment. Both are useful.

The model keeps learning. The team keeps adapting. We're reducing shrink compared to last quarter.

Month Three looked like a stall but it forced us to build what actually matters. We had good predictions. We needed a system that could use them. Now we have both.

That's what I'm working on now. It's going well.

Why I'm Stuck in Month Three

I thought our freshness AI project was going well. The first two months felt productive. Now I'm in Month Three and everything has slowed down.

Most people blame it on needing more samples or missing varietals. That's not what's happening.

The real shift is that I moved from building something that works in a lab to changing how people actually work.

I've worked with importers, grocery chains, and distributors on both sides of the border. Margins are tight right now. Labor costs are up, spoilage is eating profits, and compliance keeps getting stricter. If you're trying to figure out how AI fits into your operations, I'm happy to share what I've learned.

What I Got Wrong About Months One and Two

The first 60 days looked like progress.

I built clean curves. Found clear correlations. Made a dashboard that predicted freshness. Every small win got celebrated. Every weird sensor reading got explained away. But nothing I built actually changed what happened at the dock, in the warehouse, or on the shelf.

I was proving the technology worked. I wasn't delivering anything that mattered to operations.

Now the system is entering real workflows. The QC team is supposed to use it. The routing software is supposed to listen to it. And suddenly capability doesn't matter anymore. Only behavior matters.

My basic fault was that I never documented how people actually make decisions now.

What Changed in Month Three

Months 1–2: I controlled everything in the pilot. Month 3: The system started telling people what to do.

That's when everyone froze.

Now I'm facing questions I didn't expect:

  • Who approves when the sensor says something different than what the QC inspector sees?
  • What happens when a store manager refuses a delivery the system approved?
  • How do I define "acceptable freshness" when it changes by apple variety and customer?
  • When does a reading become urgent enough to stop a truck?
  • How do I let someone override the system during our busiest week?

We have buyers, merchandisers, and inspectors. We don't have anyone whose job is to manage autonomous recommendations affecting millions of dollars of produce.

The Definitions I Never Wrote

People here work on instinct. The AI needs explicit rules.

In Month Three, I realized we never actually defined:

  • What makes a pallet "ready for QC"
  • When produce is "good enough for retail" versus "send it to processing"
  • Which stores accept which freshness levels
  • What "accurate" means when multiple outcomes are fine
  • How to prove the system actually reduced shrink or saved time

I assumed everyone agreed on these things. They don't.

The AI is exposing gaps that experienced people used to just handle. Month Three is when those gaps become impossible to ignore.

The Three-Way Tension

Leadership wants lower shrink and smoother operations. The dock crew wants predictable loads without surprises. I need stable definitions so I can stop retraining the model every week.

The AI creates variance for all three groups.

In Month One, that felt exciting and innovative. In Month Three, it feels like chaos.

Leadership keeps asking for more automation. Operations keeps pushing back because it "raises more questions than it answers." I'm stuck in the middle trying to reconcile expectations that nobody realized were contradictory.

Nobody is actually wrong. But nothing moves forward until someone takes ownership of the mess that autonomy creates.

What I'm Learning

I treated this like building a smart QC assistant.

But our entire workflow was designed for humans who think through every decision. I'm trying to bolt AI onto processes built for human judgment.

Month Three is chaotic not because my model is broken. It's chaotic because the workflow expects a different kind of decision maker.

What's Working for Others

I talked to teams that got past this point. They stopped asking: "How do I make the model copy what our inspectors do?"

They asked: "What would QC look like if nobody guessed?"

That question changed everything:

  • Define exactly what state produce is in at each step
  • Set clear conditions for when a prediction is valid
  • Mark the boundaries where risk is acceptable
  • Build loops to measure if things actually improved
  • Focus on gains across the whole chain, not just one department

They rebuilt the workflow around the AI making decisions, not assisting decisions.

That's when things actually got faster instead of slower.

What I Know Now

If I expected Month Three to show results, I was wrong. If I treat it as when the real work starts, I might actually build something useful.

I'm learning to see Month One and Two as experiments. Month Three is actually Week One of the real deployment. This isn't a feature I'm adding. It's rewriting how we move perishable food.

My model isn't failing because the predictions are wrong. It's failing because the demo never showed what the actual work looks like. Month Three is where my system meets the receiving dock. And the dock has requirements my demo never dealt with.

Exploring Stochastic Intelligence in Smell - Happy Halloween!

Over the past couple of weeks, I've been revisiting a question that involves physics and intelligence. What if part of Neosmia's computation shouldn't be strictly deterministic?

Smell is inherently stochastic. Every capture run, even under controlled conditions, produces small variations in sensor output. The same mango, sampled twice, never smells quite the same electrically. Molecules diffuse, humidity shifts, and the chemical reactions on the sensing layer fluctuate. Yet those fluctuations aren't noise in the useless sense. They are what we humans so naturally use to distinguish mature from ripe, or senescent from spoiled.

That realization has led me toward an emerging field called Hybrid Thermodynamic–Deterministic Machine Learning (HTDML): a fancy name for the idea that certain parts of computation, especially those dealing with uncertainty or sampling, might be better executed through controlled physical randomness than through pure digital logic.

Extropic and others are exploring this through what they call thermodynamic computing: systems that use the natural noise in transistors as a resource, allowing them to sample probability distributions directly in hardware. It's a fundamentally different way of computing, one that treats randomness as computation rather than error.

For Neosmia, this could be genuinely useful. Our models already operate probabilistically—estimating odor identities, freshness levels, or authenticity scores under uncertainty. The core challenge is efficiently exploring the vast space of possible odor signatures. A hybrid thermodynamic approach might allow our future systems to perform that exploration not through simulated randomness on GPUs, but through actual physical stochasticity—potentially at a fraction of the energy.

I'm still in the early stages of this exploration. Extropic, if you read this, please ship one XTR-0 my way! :) Right now, that means studying the mathematics of energy-based models and testing sampling routines in JAX using the THRML library. The long-term question is whether a future version of Neosmia could run part of its inference directly on probabilistic hardware, bringing the physics of smell and the physics of computation closer together.
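To make the energy-based-model study concrete, here is a toy Gibbs sampler for a two-unit binary model, simulated in plain NumPy. This is my own illustrative sketch, not Extropic's hardware or the THRML API; thermodynamic hardware would draw these samples physically rather than in a loop:

```python
import numpy as np

def gibbs_sample(J, h, n_steps=1000, rng=None):
    """Gibbs sampler for a binary energy-based model with
    E(s) = -0.5 * s^T J s - h^T s, where each s_i is in {-1, +1}.
    J is symmetric with zero diagonal."""
    rng = rng or np.random.default_rng(0)
    n = len(h)
    s = rng.choice([-1, 1], size=n)
    for _ in range(n_steps):
        for i in range(n):
            # local field felt by unit i from its neighbors and bias
            local = J[i] @ s - J[i, i] * s[i] + h[i]
            p_up = 1.0 / (1.0 + np.exp(-2.0 * local))
            s[i] = 1 if rng.random() < p_up else -1
    return s

# Tiny example: two strongly coupled units prefer to agree.
J = np.array([[0.0, 2.0], [2.0, 0.0]])
h = np.zeros(2)
sample = gibbs_sample(J, h)
```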

Smell has always been a sensory domain shaped by chance, diffusion, and equilibrium. It makes sense that its digital twin might also have to embrace a bit of thermodynamics to truly understand it.

From Model Lab to Agent Lab

This week I confronted a hard question: does Neosmia need to build the foundation model first, or can we capture value now while building toward it? The answer changes everything.

The shift is to stop thinking like a model lab and start thinking like an agent lab. Instead of waiting to train the perfect smell model, we build specific decision-making agents that work today with existing sensors and ML pipelines. Each agent solves a narrow, urgent problem. A food freshness routing system that decides which pallets ship first, a quality assurance agent that flags spoilage before loading, an environmental safety monitor that detects VOCs early.

The key insight is that these agents generate the exact data we need to train the foundation model later. Every workflow collects time-series odor traces and labeled outcomes.

This means we can demonstrate value immediately, get paying pilots that fund data collection, and own the workflows where smell matters most. The foundation model is still the long-term vision, but now there is a path to survive long enough to build it.

Expedition FEMSA

Accepted into Expedition FEMSA Bioworkshop Cohort

Today I am excited to share that Neosmia has been accepted into the Expedition FEMSA Bioworkshop cohort. This is a huge step forward for us.

Expedition is where entrepreneurs, researchers, and creatives unite to transform ideas into revolutionary solutions. It is a dynamic and collaborative space that drives synergy between industries and society, creating the perfect environment for innovation and change. Being part of this community means access to labs, infrastructure, and a network of people working on ambitious science-driven ventures right here in Monterrey.

What makes this especially meaningful is the timing. We are moving from prototype design into real world testing, and Expedition gives us the resources and setting to pressure test our early hardware and accelerate validation cycles. The cohort will also help us refine our approach as we prepare for the launch of the Innovation and Entrepreneurship Hub in March 2026.

This is the kind of support that turns ambitious ideas into working systems. I am looking forward to what we will build here.

Weekly Update – October 15, 2025

This week we took the system from bench testing to field readiness. Two pilot partners are confirmed: one working on avocado ripeness tracking, the other testing olive oil authenticity. Both want structured odor sequences they can tie to quality outcomes, which is exactly what the capture kit was designed for.

The breakthrough came Tuesday. We ran the first stable 10-minute odor sequence with no baseline drift. The ambient correction loop held, the heater stayed within 0.3°C, and the pump delivered consistent flow across the full run. That stability matters because it means the data we collect today will align with data collected next month, which is the foundation for any learning system.
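A simple way to quantify "no baseline drift" is to compare the start and end of a run. The sketch below is illustrative; the window size, thresholds, and synthetic data are assumptions, not our firmware's actual check:

```python
import numpy as np

def baseline_drift(readings, window=10):
    """Mean shift between the first and last `window` samples of a run.
    A stable capture keeps this near zero on every channel."""
    start = np.mean(readings[:window], axis=0)
    end = np.mean(readings[-window:], axis=0)
    return np.abs(end - start)

# Synthetic stand-in for a 10-minute run at 1 Hz on a 4-sensor array.
rng = np.random.default_rng(2)
run = rng.normal(0.0, 0.01, size=(600, 4))

drift = baseline_drift(run)
stable = bool(np.all(drift < 0.05))   # illustrative tolerance
```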

What I learned this week: field deployment is not just about hardware working. It is about partners trusting the data enough to act on it. The avocado grower wants to know if a batch is ready to ship. The olive producer wants to catch adulteration before it reaches buyers. Both need answers they can rely on, not experimental outputs. That shifts how I think about the next phase. It is no longer "can we capture odor?" It is "can we deliver clarity from odor?"

Weekly Update – October 8, 2025

This week the hardware design crossed a threshold. I completed the control loop for the pump and heater, which means the system can now hold airflow and temperature stable during a capture run. That might sound minor, but it changes everything. Without active control, readings drift as room conditions shift. With it, we can compare samples taken hours or days apart.

The firmware now logs ambient corrections in real time: temperature offsets, humidity compensation, and baseline adjustments for each sensor. This creates a traceable record of what the system was doing when it captured each odor signature. If a reading looks strange later, we can go back and see whether the heater spiked or airflow dropped.
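That traceable record can be as simple as one JSON line per reading. The keys and correction model below are illustrative, not the firmware's actual format:

```python
import json
import time

def make_capture_record(sensor_values, temp_c, humidity_pct, baseline):
    """One auditable log entry per reading: store the corrected value
    alongside the inputs that produced it, so a strange reading can be
    traced back to a heater spike or airflow drop later."""
    return {
        "timestamp": time.time(),
        "raw": sensor_values,
        "ambient": {"temp_c": temp_c, "humidity_pct": humidity_pct},
        "baseline": baseline,
        "corrected": [v - b for v, b in zip(sensor_values, baseline)],
    }

record = make_capture_record([412.0, 388.5], 23.4, 51.0, [400.0, 380.0])
line = json.dumps(record)   # appended to a JSONL log file per capture run
```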

What I built this week is not just a sensor array. It is a standardized capture environment. The same kit can sit in a warehouse in Monterrey, a processing facility in California, or a testing lab in Europe, and produce comparable data. That comparability is what makes a shared model of smell possible. Without it, we are just collecting local snapshots that never add up to general knowledge.

Weekly Update – October 1, 2025

This week Neosmia took some tangible steps forward. On the technical front, we've moved from concept sketches into concrete prototype design. The data acquisition architecture is now mapped out, balancing precision channels for chemical sensors with fast capture for dynamic signals. Collaborators have started working on hardware builds and experiment design, which means we're finally setting the stage for our first real datasets rather than simulations or desk work.

The new challenges surfacing now are less about "what to build" and more about how to make the early choices stick. One is prioritization: with so many potential sensor types and test cases, I need to carefully choose the first pilot experiments that can prove value quickly without overextending. Another is coordination: aligning hardware design, data collection, and AI model planning so they advance in sync rather than getting out of phase. Finally, there's the question of framing: while the moonshot vision is clear, I need to translate that into near-term milestones and a validation roadmap that feels real to both partners and potential investors.

Testing Update – September 23, 2025

This week I’ve been diving into experiments with high dimensional, weakly structured datasets. Unlike images or text, these signals don’t come with clear labels or a simple geometry. They exist as messy, overlapping clouds in feature space. Extracting order from them requires techniques that can tolerate noise, sparsity, and shifting conditions at the same time.

Working with olfactory signals in Neosmia made this painfully clear: volatile compounds never behave the same twice, yet hidden in the chaos are patterns that determine freshness, safety, or even identity. What I am seeing now is that the same challenge repeats across other domains. The tools we use for structured or richly labeled data simply do not fit.

My early tests are small, but they confirm the intuition: to make progress we need approaches that embrace high dimensional structure rather than flatten it away. Over the next weeks I will share how this exploration is shaping into a more general path forward.

Company Update – September 15, 2025

Wins

This week we locked down the core hardware plan for Neosmia's prototype. By carefully selecting the right sensors and external ADCs, we now have a clear path to building a system that captures high-quality odor data. We also began conversations with researchers who see the potential in joining Neosmia's journey, signaling that the mission resonates beyond industry.

Challenges

The challenge ahead is focus. While Neosmia's long-term vision is to make machines truly capable of smell, our immediate priority is building a reliable MVP for food supply chains. Making this concrete while preparing the foundation for a moonshot is not trivial. On top of that, funding constraints make every decision critical as each dollar must push us closer to a working demo.

Founder University

Accepted into Founder University Cohort

We're excited to share that Neosmia has been accepted into the Founder University cohort. This gives us access to a community of builders and mentors as we push forward on the world's first foundation model for smell.

Over the coming weeks we’ll be refining our go-to-market, accelerating data collection, and sharing more frequent updates. If you’re interested in partnering, piloting, or contributing, reach out. I’d love to connect.

Month 1: First Experiments with Produce Freshness

This month we began recording olfactory signals from fruits and vegetables at different stages of freshness. The goal was simple: see if a small sensor array could distinguish fresh from not-fresh, and how those signals change over time.

The early data is both promising and tricky. Simple models can often tell when an item is fresh, but accuracy falls once the same test is repeated under different humidity, temperature, or airflow. A system trained in one context struggles when the context shifts.

We also saw how individual sensors behave very differently: some react quickly but fade, others respond slowly but stay steady. Together they create useful signals, but also plenty of noise. Across a few hundred thousand readings, the picture is clear: there is real signal, but it is fragile.

This is why smell is unlike vision or sound. Odors are deeply entangled with environment, and variability is the rule. Any true model of smell will have to embrace that complexity, not hide from it.

The Missing Sense in AI

Computers can see. They can hear. They can converse in human language. Yet one of the most powerful senses, smell, remains invisible to machines.

Smell shapes life every day. It tells us when food is fresh or spoiled, when danger is near, when a memory is real. It is central to health, safety, and creativity. And unlike vision or language, it has no digital representation.

Neosmia exists to change that. We are creating the world's first foundation model for smell. A universal representation that makes olfaction part of the digital world, accessible, useful, and generative across industries.

Imagine preserving the scent of a place so it can be relived decades later. Imagine drones and satellites mapping ecosystems through their chemical fingerprints. Imagine global supply chains where freshness is monitored at every step, reducing waste and hunger. Imagine culture and media enriched with scent as naturally as sound and color. Imagine exploring another planet and knowing not only what it looks like but what it smells like.

And imagine how people will interact with it. A chef asking if fish is safe to serve and receiving a clear answer. A parent scanning a lunchbox and being told the fruit is fine but the sandwich may be past its best. A city inspector waving a handheld device and discovering a hidden gas leak. A designer requesting the formula for a "spring morning in Kyoto" to embed in VR. A winemaker comparing a new vintage to historic bottles. An artist blending "rain on stone" with "freshly cut grass" for an installation.

This is a long-term project. The ambition is to open a new sensory dimension for artificial intelligence and to unlock applications that today can only be imagined.

Smell is the missing sense in AI. Neosmia will build it.