3D Spatial AI for AEC: Replace 6 Weeks of BIM

Q: Where do I start if I want to build this for my firm?

Two clean starting points. The Free Mission at learngeodata.eu/course/free-mission/ gives a first agentic move on a real scene in under an hour, no install. The 3D Spatial AI Architect program at learngeodata.eu/3d-spatial-ai-architect-limited-offer is the structured twelve-week path to a working pipeline you own.

Last week I had the privilege of giving a twenty-minute presentation on 3D spatial AI for AEC at NXT BLD 2026, at the Queen Elizabeth II Centre in Westminster, then sitting on the reality capture panel later in the programme. This article is the long-form version of what I shared on stage, what I demoed, and the patterns the week surfaced.

[Figure: 3D spatial AI for AEC, NXT BLD 2026 recap by Dr. Florent Poux, from smart point clouds to the cognition stack. Caption: Smart point cloud at the floor, agents at the top. The architecture I tried to leave the NXT BLD 2026 room with.]

TL;DR. 3D spatial AI for AEC is the layer above geometry that turns a point cloud into a scene that can answer client questions, edit itself when reality changes, and emit any deliverable on demand. The architecture is a six-layer cognition stack on top of the smart point cloud foundation: dual-frame semantics, relationships, topology, physics, materials, and an agentic layer. It runs locally on a consumer RTX laptop. A scan-to-IFC handover that costs six weeks of manual modelling today becomes a three-day pipeline, with output the client can re-interrogate for the building’s life.

What you’ll learn in this article:

Three questions I keep coming back to, that you might recognise from your own work
Why 3D Gaussian Splatting is the new one-to-one map, beautiful and unbillable
The cognition stack: smart point cloud, semantics, relationships, topology, physics, materials, agents
The four agentic moves I ran live on a laptop in airplane mode, with the conceptual API
Why clients will start paying for interrogability instead of static drawings
A clear next step if any of this resonates with what your firm is trying to build

Estimated reading time: 15 minutes

Three questions I keep coming back to

I do not want to open with a manifesto. I want to share three questions that have followed me into almost every project review and capture programme I have sat in over the past year. They are not mine, exactly. They came back to me again and again at NXT BLD, on the panel, in the conversations afterwards. Maybe one of them lives in your work too.

The first one is about the scans we already have. Many firms I speak with sit on hundreds of gigabytes of point cloud data, captured with care, ingested with budget, then opened once or twice and quietly stored on a server they keep paying for. It is a strange kind of debt. The asset is on the balance sheet of the project; the value is not. The question I keep returning to is simple: what would it take, technically, to make those scans useful a second time?

[Figure: Vibrant semantic-painted indoor point cloud scan, the kind of asset that justifies investing in 3D spatial AI for AEC. Caption: Every capture program has folders of these. Beautiful, expensive, opened twice. The first question I keep running into: what would it take to make them useful again?]

The second one is about handover time. A few people on the reality capture panel mentioned scan-to-BIM cycles that still run into weeks, sometimes months. By the time the model lands with the client, the site has moved on. We all know this. The question I keep returning to is whether the deliverable itself is the right shape, or whether we are still re-authoring artefacts we should be emitting.

The third one is about where the model actually runs. I have lost count of how many AI demos I have seen die the moment the venue Wi-Fi drops, because the inference call leaves the building. That is a survival question, not a comfort question. If the chain breaks the moment the network does, the chain is not really yours.

None of these are accusations. They are the questions that pushed me into the architecture I am about to walk you through.

Borges already wrote the warning

I did something a presentation on 3D spatial AI for AEC is not supposed to do. I talked about a one-paragraph fable from nineteen forty-six.

Jorge Luis Borges wrote On Exactitude in Science, a fable about an empire that perfected the science of cartography by drawing a map at the same scale as the territory itself. A map so faithful, so complete, that it covered the entire empire, point for point. And it became useless. The value of a map is not in its fidelity. It is in what it leaves out. The road. The river. The city. Strip away the rest, and a traveller can finally use the thing.

Thirteen years earlier, Alfred Korzybski had written the line every engineer eventually learns. The map is not the territory. He did not mean it as a complaint. He meant it as a feature. A representation is useful precisely because it isn’t the thing.

[Figure: Borges 1:1 map analogy redesigned for 3D spatial AI for AEC: fidelity without abstraction is unbillable, structure makes a scene useful. Caption: Borges and Korzybski, said with cubes instead of cartography. Fidelity is not value.]

I brought Borges into the room because I wanted us to look at 3D Gaussian Splatting honestly. Splatting is beautiful. The most photorealistic representation of a building any of us have ever seen render on consumer hardware. It is also, on its own, the new one-to-one map. Beautiful, faithful, unbillable. No client pays for resolution. Clients pay for a scene that can answer a question.

Splats matter. As a capture layer. As a rendering layer. As a way to bring a site into a meeting room. But on their own, they do not strip. So on their own, they do not pay. What you need underneath is an abstraction layer that strips at the right level. Not too coarse, you lose meaning. Not too fine, you lose use. That layer is the smart point cloud, and it is the floor under every honest conversation about 3D spatial AI for AEC.

🪐 System Thinking Note: The same pattern shows up everywhere in engineering. A useful representation is always a calibrated loss of information. The skill is choosing what to discard. In language models it is the tokenizer. In compression it is the codec. In 3D spatial AI for AEC it is the smart point cloud.

The smart point cloud, four years of research as a foundation

I spent four years on this at the University of Liège, between twenty-fifteen and twenty-nineteen, on what I now call the smart point cloud framework. The idea is direct. A point cloud should not just carry where each point is. It should carry what each point is, what it relates to, and how a downstream process can attack the data without rebuilding it from scratch.

Object identity at the point level. Connectivity, neighbourhood, and structural role at the point level. Semantic context at the point level. All of it addressable. All of it low enough in the stack that many different processes (segmentation, classification, reconstruction, query, planning) can consume the same substrate without re-paving the road every time.
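To make "addressable at the point level" concrete, here is a minimal sketch in plain Python. It is not the framework's actual data model: `SmartPoint`, `SmartPointCloud`, and the field names are hypothetical, chosen only to show points carrying identity and semantics next to their coordinates, with an index that is built once and shared by every consumer.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SmartPoint:
    """One point carrying geometry plus point-level identity and semantics."""
    xyz: tuple[float, float, float]
    object_id: int   # which object instance the point belongs to
    label: str       # semantic class, e.g. "door" or "wall"


class SmartPointCloud:
    """Toy container: points plus an index built once, shared by all consumers."""

    def __init__(self, points):
        self.points = list(points)
        self._by_object = defaultdict(list)
        for index, point in enumerate(self.points):
            self._by_object[point.object_id].append(index)

    def object_ids(self, label):
        """Return the sorted ids of every object instance with the given label."""
        return sorted({p.object_id for p in self.points if p.label == label})

    def points_of(self, object_id):
        """Return the points of one object without rescanning the whole cloud."""
        return [self.points[i] for i in self._by_object.get(object_id, [])]


cloud = SmartPointCloud([
    SmartPoint((0.0, 0.0, 0.0), object_id=1, label="wall"),
    SmartPoint((0.0, 0.0, 2.5), object_id=1, label="wall"),
    SmartPoint((1.0, 0.0, 0.0), object_id=2, label="door"),
    SmartPoint((1.9, 0.0, 2.0), object_id=2, label="door"),
    SmartPoint((4.0, 0.0, 0.0), object_id=3, label="door"),
])

print(cloud.object_ids("door"))   # ids of all door instances
print(len(cloud.points_of(2)))    # number of points in door 2
```

A segmentation step, a takeoff, and a query layer could all call `object_ids` and `points_of` on the same object, which is the "no re-paving" idea in miniature. A real cloud would use array storage rather than one Python object per point.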

[Figure: Top-down and 3D crop of a cathedral laser scan, the substrate that 3D spatial AI for AEC reasons over. Caption: The substrate. A scan like this is gorgeous. It only becomes a billable asset once a layer above it can answer a question.]

If you have ever rebuilt the same labelling pipeline three times for three different deliverables on the same site, you already know what I mean. The smart point cloud is the engineering answer to that waste. It is the substrate 3D spatial AI for AEC stands on. Everything else in the stack assumes it. I unpack the underlying ideas at chapter depth in my O’Reilly book on 3D Data Science with Python, if you want the reading companion to the architecture I am about to describe.

🦚 Florent’s Note: I expected pushback on this section. I did not get any. The people who have run a serious capture programme for five years tend to recognise the shape of the problem before they recognise the words I am using for it.

BIM 2.0 is a feeder, not an opponent

The conference is called “BIM 2.0 and Beyond”, so I felt I had to be explicit. I am not here to tell anyone BIM 2.0 is wrong.

Better files, IFC, BCF, ISO 19650, federated streams, open vendor APIs. All of it is the right work. All of it solves real problems. What I am saying is that BIM 2.0 is not the destination. It is a feeder.

Every output BIM 2.0 has worked so hard to standardise becomes high-quality structured input for what comes next. The cognition stack consumes it. The IFC schema, the BCF issue list, the federated coordination model, the classification dictionaries, the LOD agreements. Those are not artefacts to author once and ship. Those are the semantic feedstock the cognition stack reasons over. If your firm has invested seriously in BIM 2.0, that investment is not at risk. It becomes a contributor instead of an endpoint. The cognition layer elevates it.

[Figure: The cognition stack for 3D spatial AI for AEC, with BIM 2.0 as a feeder and the smart point cloud as foundation. Caption: Six layers, one foundation. Each layer earns its keep against a question the layer below cannot answer.]

When I say “what comes next”, I do not mean “instead of”. I mean “on top of”. The table below maps the cognition stack at a glance: each layer, what it adds, and the client question it lets you answer.

The cognition stack at a glance

Layer	What it adds	Client question it unlocks
Agents	Query, modify, generate, decide on the scene	“Run the whole pipeline and bring me the report”
Materials	U-value, fire rating, embodied carbon per element	“Does this assembly meet the new fire code?”
Physics	Thermal, structural, acoustic constraints	“Will this duct create a thermal bridge?”
Topology	Connectivity, adjacency, routing	“Where does this corridor connect, and is it fire-safe?”
Relationships	Which element depends on which	“What carries this slab, and what happens if I remove it?”
Semantics (dual-frame)	Object-level + scene-level meaning	“What is this duct?” and “Is this floor compliant?”
Smart point cloud	Structured, addressable point data	“What can we even ask of this scan?”

Semantics in two frames at once

The first thing on top of the smart point cloud is semantics. Semantics in 3D is not new. We have all labelled point clouds. We have all seen colour-coded scenes. That is table stakes.

What is not table stakes is reasoning at two frames at once. Object level: what is this duct, what is this door, what is this column. The scan knows, point by point, exactly which thing is which. Scene level: is this floor compliant, is this corridor accessible, is this sequence buildable. The scan reasons about the whole, not just the parts.

[Figure: Voxelized indoor scene with object-level semantic instances, an illustration of dual-frame semantics in 3D spatial AI for AEC. Caption: Object-level semantics on a structured substrate. Each chair, the piano, the plant: separate instances, each one queryable on its own.]

Single-frame semantics fails on half of every project. If you only label objects, you cannot answer “is this floor compliant” without writing a new pipeline. If you only reason at the scene level, you cannot answer “what is this specific component” without writing another pipeline. You need both. At the same time. In the same scan.

The deeper point: the semantics have to live at a granularity low enough that many different processes can attack them. A clash detector. A code-compliance checker. A takeoff generator. A maintenance planner. Each of them has its own reasoning. None of them should rebuild the semantic layer from scratch. That is what the dual frame gives you. One scan, two reasoning frames, many processes consuming the same substrate.
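A small sketch of the two frames reading the same records. Everything here is hypothetical: the `OBJECTS` table stands in for whatever the semantic layer exposes, and the 1.2 m corridor threshold is purely illustrative, not a statement of any building code.

```python
# Hypothetical per-object records, as a semantic layer might expose them.
OBJECTS = {
    1: {"label": "corridor", "floor": 0, "clear_width_m": 1.35},
    2: {"label": "corridor", "floor": 0, "clear_width_m": 1.05},
    3: {"label": "door", "floor": 0, "clear_width_m": 0.90},
    4: {"label": "corridor", "floor": 1, "clear_width_m": 1.50},
}

MIN_CORRIDOR_WIDTH_M = 1.2  # illustrative threshold, not a statement of any code


def describe(object_id):
    """Object frame: what is this specific thing?"""
    return OBJECTS[object_id]["label"]


def narrow_corridors(floor):
    """Return the ids of corridors on a floor that fall below the threshold."""
    return [
        oid for oid, obj in sorted(OBJECTS.items())
        if obj["floor"] == floor
        and obj["label"] == "corridor"
        and obj["clear_width_m"] < MIN_CORRIDOR_WIDTH_M
    ]


def floor_is_compliant(floor):
    """Scene frame: does the floor as a whole pass the rule?"""
    return not narrow_corridors(floor)


print(describe(2), narrow_corridors(0), floor_is_compliant(0))
```

Both frames read the same table. The scene-level check is a fold over object-level facts, so neither question needs its own labelling pipeline.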

The four completion layers, each mapped to a client question

Semantics on its own is still not enough. Geometry plus labels cannot answer most of the questions clients actually pay for. So we add four completion layers, and each one earns its place against a client question the layers below cannot answer.

Relationships. Which beam carries this slab. Which wall supports which floor. What happens, structurally, if I remove this column. Without relationships, your scene is a pile of objects. With relationships, your scene is a graph that knows what depends on what.

Topology. How does this duct route around that beam. Where does this corridor connect to which room. Which spaces are adjacent for fire compartmentation. Topology is what turns objects into a navigable network.

Physics. Will this duct create a thermal bridge. Will this assembly transmit too much sound. Will this cantilever deflect under design load. Physics is what turns a scene into a system that obeys constraints.

Materials. Will this assembly meet the U-value. Is this finish compliant with the new fire code. What is the embodied carbon of this configuration. Materials is what turns geometry into specification.
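The relationships and topology layers can both be pictured as graphs. The sketch below is an assumption-laden toy: the element ids, the `CARRIES` and `CONNECTS` tables, and both function names are invented for illustration. The "removal" query is plain reachability along the load path, not a structural analysis.

```python
from collections import deque

# Relationships: "X carries Y" as a directed graph (hypothetical element ids).
CARRIES = {
    "column-C1": ["beam-B1"],
    "column-C2": ["beam-B1"],
    "beam-B1": ["slab-S1"],
    "slab-S1": [],
}

# Topology: which spaces connect to which (undirected adjacency).
CONNECTS = {
    "room-101": ["corridor-1"],
    "room-102": ["corridor-1"],
    "corridor-1": ["room-101", "room-102", "stair-A"],
    "stair-A": ["corridor-1"],
}


def affected_by_removal(element):
    """Everything downstream of an element in the load path."""
    seen, queue = set(), deque(CARRIES.get(element, []))
    while queue:
        current = queue.popleft()
        if current not in seen:
            seen.add(current)
            queue.extend(CARRIES.get(current, []))
    return sorted(seen)


def route(start, goal):
    """Shortest hop-count route between two spaces, or None if disconnected."""
    queue, came_from = deque([start]), {start: None}
    while queue:
        current = queue.popleft()
        if current == goal:
            path = []
            while current is not None:
                path.append(current)
                current = came_from[current]
            return path[::-1]
        for neighbour in CONNECTS.get(current, []):
            if neighbour not in came_from:
                came_from[neighbour] = current
                queue.append(neighbour)
    return None


print(affected_by_removal("column-C1"))
print(route("room-101", "stair-A"))
```

The first query answers "what sits on top of this column"; the second answers "how do I get from this room to the stair". Whether the slab actually fails once a column is removed is a question for the physics layer.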

Smart point cloud at the foundation. Dual-frame semantics. Relationships. Topology. Physics. Materials. That is the body of the cognition stack. Everything below is substrate. Everything above is what makes it act.
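As one worked instance of a materials question, "will this assembly meet the U-value" reduces, in the simplest one-dimensional case, to summing layer resistances. The sketch below uses the standard relation U = 1 / (R_si + Σ d/λ + R_se) with the conventional surface resistances for horizontal heat flow. The wall build-up, the conductivities, and the 0.30 target are illustrative assumptions, and the calculation ignores thermal bridges and other corrections.

```python
# Surface resistances for horizontal heat flow (conventional values, m2.K/W).
R_SI = 0.13
R_SE = 0.04

# Hypothetical wall build-up: (name, thickness in m, conductivity in W/m.K).
# The conductivities are illustrative, not product data.
WALL = [
    ("brick", 0.100, 0.77),
    ("mineral wool", 0.120, 0.035),
    ("plasterboard", 0.0125, 0.25),
]


def u_value(layers, r_si=R_SI, r_se=R_SE):
    """U = 1 / (R_si + sum(d / lambda) + R_se), in W/m2.K."""
    r_layers = sum(thickness / conductivity for _, thickness, conductivity in layers)
    return 1.0 / (r_si + r_layers + r_se)


def meets_target(layers, target_u):
    """True when the assembly's U-value is at or below the target."""
    return u_value(layers) <= target_u


TARGET_U = 0.30  # illustrative requirement, W/m2.K
print(f"U = {u_value(WALL):.3f} W/m2.K, meets target: {meets_target(WALL, TARGET_U)}")
```

The point of attaching materials to elements is that a check like this can run per assembly across a whole scene, instead of being recomputed by hand per drawing.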

The live demo, on a laptop, in airplane mode

This was the part of the presentation I had rehearsed the most. Not because the demo is hard, but because the demo is the moment the room either believes me or quietly writes me off.

I walked to the laptop. Toggled airplane mode on. Held the laptop up so the room could see the OS indicator flip on the projector. No cloud. No remote inference. No phone home. Then I opened Neurones 3D, the platform I have been building to ship this stack as a real product, and loaded a one hundred and twenty megabyte anonymised residential scan. Object-level semantics already attached, courtesy of the smart point cloud framework.

[Figure: Live demo of Neurones 3D running 3D spatial AI for AEC on a laptop in airplane mode, querying a smart point cloud scene. Caption: The four-second answer. Object-level semantics plus a topology check, on a laptop, with the Wi-Fi indicator turned off.]

Conceptually, the agentic layer exposes four moves on the same scene object. The API the demo runs against looks like this:

from spatial_ai import Scene

# Load a smart point cloud (object-level semantics already attached).
scene = Scene.load("residential_anonymised.spc", device="cuda:0")

# Move 1, query. The scene answers.
issues = scene.query(
    "How many doors fail accessibility, and where are they?"
)
issues.export("door_failures.csv")

# Move 2, modify. The scene edits itself and every artefact follows.
scene.modify(
    "Widen all corridors below 1.2 m to compliance."
)
scene.emit(["plans.dwg", "model.ifc"])  # propagated, not re-authored

# Move 3, generate. The scene emits a deliverable.
clash = scene.generate(
    "Clash report grouped by trade, plus a framing checklist."
)
clash.write_pdf("clash_report.pdf")

# Move 4, decide. The scene reasons, with an audit trail.
verdict = scene.decide(
    "Is this floor compliant with the local fire code? What would it cost?"
)
assert verdict.audit_trail.traces_back_to(scene.source)

Four moves. One scene object. No drawing redo, no model redo, no human in the loop holding a tape measure. The audit trail is the load-bearing detail: the model never invents geometry, it reasons over it.
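One way to picture "propagated, not re-authored" is a single scene state with every deliverable derived from it on demand. The sketch below is not how Neurones 3D works internally; the `scene_state` dictionary and both functions are hypothetical, and only the 1.2 m figure comes from the example prompt above.

```python
import csv
import io

MIN_WIDTH_M = 1.2  # threshold taken from the example prompt above

# Hypothetical scene state: the single source every deliverable is derived from.
scene_state = {
    "corridor-A": {"width_m": 1.05},
    "corridor-B": {"width_m": 1.40},
}


def widen_narrow_corridors(state, min_width_m=MIN_WIDTH_M):
    """Edit the scene in place and return the ids that changed."""
    changed = []
    for name, corridor in state.items():
        if corridor["width_m"] < min_width_m:
            corridor["width_m"] = min_width_m
            changed.append(name)
    return changed


def emit_schedule_csv(state):
    """Derive a corridor schedule from the current state (never hand-edited)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["corridor", "width_m"])
    for name in sorted(state):
        writer.writerow([name, f"{state[name]['width_m']:.2f}"])
    return buffer.getvalue()


before = emit_schedule_csv(scene_state)
changed = widen_narrow_corridors(scene_state)
after = emit_schedule_csv(scene_state)
print(changed)
print(after)
```

Because the schedule is a function of the state, the edit shows up in the next emission without anyone touching the artefact. A real modify move also has to resolve the geometric knock-on effects of moving a wall, which this toy skips entirely.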

[Figure: Four agentic moves on a smart point cloud for 3D spatial AI for AEC: query, modify, generate, decide. Caption: Four moves. Four deliverables. Local. Offline. Auditable.]

On stage, the four moves ran one after the other. The query lit three failing doors in the 3D viewer and emitted a table in the right-hand pane in six seconds. The modify propagated the corridor widening to the IFC and the plan view in the background. The generate dropped two PDFs on the desktop. The decide returned a fire-code verdict, an indicative cost, and a reasoning trace traceable back to the scan, all in under twenty-four seconds of compute, on a thirty-two gigabyte RTX laptop, with no cloud.

🦥 Geeky Note: The hardware target is deliberately consumer-class: RTX 4080-class GPU, 32 GB RAM, 1 TB NVMe. The 120 MB residential scan loads in under 800 ms cold. The semantic index pre-computed during ingestion is what makes the four moves feel instant: at runtime we are not running heavy inference, we are walking pre-indexed structure with small reasoning calls on top. The audit trail is just the call graph of those reasoning calls, serialised.
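To illustrate the idea of an audit trail as a serialised call graph, here is a minimal sketch. The `AuditTrail` class, the `DOOR_WIDTH_INDEX` table, and the 0.85 m threshold are all assumptions made for the example. The only thing it shows is the pattern: answers come from walking a pre-computed index, and every step is recorded with the inputs it relied on.

```python
import json


class AuditTrail:
    """Records each reasoning step and the inputs it relied on."""

    def __init__(self):
        self.steps = []

    def record(self, name, inputs, result, parent=None):
        """Append one step and return its index so children can reference it."""
        self.steps.append(
            {"step": name, "inputs": inputs, "result": result, "parent": parent}
        )
        return len(self.steps) - 1

    def to_json(self):
        """Serialise the recorded steps, parent links included."""
        return json.dumps(self.steps, indent=2)


# Hypothetical pre-computed index: door id -> measured clear width in metres.
DOOR_WIDTH_INDEX = {"door-1": 0.92, "door-2": 0.74}
MIN_DOOR_WIDTH_M = 0.85  # illustrative threshold only


def failing_doors(trail):
    """Walk the index, log every check, and return the doors that fail."""
    root = trail.record("failing_doors", {"min_width_m": MIN_DOOR_WIDTH_M}, None)
    failures = []
    for door_id, width in sorted(DOOR_WIDTH_INDEX.items()):
        ok = width >= MIN_DOOR_WIDTH_M
        trail.record("check_width", {"door": door_id, "width_m": width}, ok, parent=root)
        if not ok:
            failures.append(door_id)
    trail.steps[root]["result"] = failures
    return failures


trail = AuditTrail()
failures = failing_doors(trail)
print(trail.to_json())
```

Each `check_width` entry points back to its parent call and names the measured value it used, so a reviewer can follow a verdict down to the data behind it.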

Why clients will start paying for interrogability

Here is the part of the talk most useful to take to your next client meeting. The deliverable category itself is shifting.

Today, clients pay for artefacts. Drawings. BIM models. Reports. Static, dated, lossy. The day they ship, they start decaying. Six months later, the site has moved on, the design has changed, and the artefact is a snapshot of a moment that no longer exists.

Tomorrow, clients pay for interrogability. A scene that answers questions they have not asked yet. A scene that updates itself when reality changes. A scene that emits any artefact on demand. That is not a discount on the existing deliverable. That is a different SKU. At a different price.

[Figure: Pricing reframe for 3D spatial AI for AEC: a drawing versus an interrogable scene, a paper map versus a navigation system. Caption: Same site, same scan, two different invoices. The conversation I came to London hoping to start.]

A printed map is cheap to print, fast to read, and useless the moment the road moves. A navigation system costs more, is harder to build, and is the only thing a serious driver pays for now. Smart point clouds plus 3D spatial AI for AEC are the navigation system for the building.

One concrete uplift, the one I gave the room. A scan-to-IFC handover that today costs a firm six weeks of manual modelling at roughly thirty thousand euros becomes, on the stack you just saw, a three-day pipeline. Higher-grade output. Queryable for the building’s life. Same invoice. Ten times the margin. And the next ten projects re-use the same pipeline at near-zero marginal cost.

The one-line argument I leave with sceptical engineers: you are not competing on cheaper drawings, you are competing on whether the deliverable can answer a question.

Local-first is survival, not aesthetics

There is one more thing I needed the room to leave with, before the closing slide. Everything I had just shown ran on a laptop, in airplane mode. That is not a flex. That is survival.

Your geometry is your client’s. Your floor plans are your client’s. Their scan, their site, their tenant data, all of it is in that point cloud. None of it is training material for someone else’s model. None of it should sit in a US data centre by default. None of it should depend on someone else’s uptime, someone else’s pricing change, or someone else’s policy decision next quarter.

Cloud dependency is a tax. It is a lock-in. It is a GDPR exposure. It is a leak vector. And it is an outage waiting to happen at the worst possible moment, usually the morning of a client review. Local-first runs on the workstations your team already owns. Your data is your moat, not someone else’s training set. That is how I would build something serious in this industry, and it is the single most ignored design decision in 3D spatial AI for AEC today.

What the reality capture panel surfaced

After the presentation I joined the reality capture panel, and three patterns came up that I want to share because they shape where I think the next twelve months point.

First, the firms quietly leading on this are not always the largest ones. The smaller practices have less to defend, lighter committees, and a single engineer who can own a pipeline end to end. Several of the most interesting conversations were with two-person and five-person studios that had already built a janky version of the stack and just wanted to know if they were directionally right. They were.

Second, the BIM 2.0 vendors I respected most were the ones who heard the talk as a compliment. The cognition stack is downstream of their best work. They knew that already.

Third, almost nobody had a clean answer to “where on Monday does someone in my firm actually start.” That is the gap, and it is exactly why I built the path I am going to point at next.

What you can ship in twelve months

So what does this look like for your firm, twelve months from now? Three concrete capabilities. None of these are slide concepts. All three are billable deliverables.

One. Autonomous reality-capture to interrogable scene. Drone, scanner, or phone in. Smart point cloud out. Ready to query, modify, and emit.

Two. Agentic clash, constructability, and code-compliance checks. Run on every project, before the concrete is poured, on the laptop already on the desk.

Three. Semantic 3D search across your project archive. Every scan you have ever paid for becomes an asset that answers questions, instead of a folder nobody opens.

Three capabilities. Twelve months. One trained engineer.
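For the third capability, the simplest possible version of archive search is an inverted index from semantic label to scan. The sketch below is exactly that, a label lookup rather than embedding-based search. The archive contents and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical archive: scan name -> semantic labels found in it at ingestion.
ARCHIVE = {
    "clinic_wing_2023": {"door", "corridor", "duct"},
    "warehouse_bay_2024": {"column", "beam", "duct"},
    "residential_floor_2025": {"door", "stair"},
}


def build_index(archive):
    """Invert scan -> labels into label -> scans, once, at ingestion time."""
    index = defaultdict(set)
    for scan, labels in archive.items():
        for label in labels:
            index[label].add(scan)
    return index


def search(index, *labels):
    """Return the scans that contain every requested label."""
    if not labels:
        return []
    hits = set.intersection(*(index.get(label, set()) for label in labels))
    return sorted(hits)


index = build_index(ARCHIVE)
print(search(index, "duct"))
print(search(index, "door", "duct"))
```

Even this crude version changes the economics of the archive: finding every past project with ducts becomes a lookup instead of an afternoon of opening folders.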

🌱 Growing Note: If you only do one thing after reading this, pick one repeatable scene type in your firm (a residential floor, a warehouse bay, a clinic wing) and build the four-layer loop on it, end to end, on a laptop. The smaller and more boring the test case, the faster you learn what the architecture really demands. The Free Mission below is the cleanest sandbox I know to make this concrete in under an hour.

A clear next step if any of this resonates

If anything in this article moved something for you, here are the doors I would open in order.

If you want a free taste of the stack before anything else, start with the Free Mission. It is a guided experience that takes you through a first move on a real point cloud, in your browser, no install, no credit card. It is the cleanest way to find out, in less than an hour, whether the way I think about this fits the way your brain already works.

If you want a deeper read before you commit, the Spatial AI Guide is the long-form companion to the talk, with the architectural decisions written at the level a senior engineer can sanity-check in an afternoon.

And if you already know, and you want the full path, the 3D Spatial AI Architect program is the structured twelve-week curriculum that takes a working engineer all the way from smart point cloud foundations to running the four agentic moves on their own data, on their own laptop. Three tiers (Foundation, Professional, Architect). Judgment over syntax. A real shipped system at the end. Vendor-neutral. Local-first.

Five years from now, every serious AEC firm will either employ a 3D Spatial AI Architect, or be served by one. The only question worth asking this morning is which side of that line you want to be on.

Frequently asked questions

What is 3D spatial AI for AEC, in one sentence?

It is the layer above geometry that turns a 3D scan or model into a scene that can answer client questions, edit itself when reality changes, and emit any deliverable on demand (drawings, IFC, clash reports, compliance verdicts) with a full audit trail.

How is 3D spatial AI for AEC different from BIM 2.0?

BIM 2.0 is a feeder, not an opponent. It standardises the structured semantic input the cognition stack consumes (IFC, BCF, ISO 19650, federated streams). 3D spatial AI for AEC builds on top: dual-frame semantics, relationships, topology, physics, materials, and an agentic layer that turns the scene into something interrogable.

Is 3D Gaussian Splatting useful for AEC at all?

Yes, as a capture and rendering layer. Splats are the most photorealistic representation a consumer GPU can render in real time. On their own, though, they are the new one-to-one map: beautiful, but unbillable. Splats feed the stack. They do not replace it.

Do I need a data centre to run 3D spatial AI for AEC?

No. The architecture I demoed at NXT BLD 2026 runs on a consumer-class RTX laptop with thirty-two gigabytes of RAM, in airplane mode. Local-first is a survival posture in AEC: your client data is not training material for someone else’s model.

What is a smart point cloud, in plain terms?

It is a point cloud that carries, at the point level, not only where each point is, but what each point is, what it relates to, and what reasoning it can support. Four years of doctoral research at the University of Liège went into that framework. It is the foundation everything else in the cognition stack stands on.

How fast can 3D spatial AI for AEC actually deliver a scan-to-IFC handover?

On the stack I demoed at NXT BLD 2026, a handover that today costs six weeks of manual modelling at roughly thirty thousand euros becomes a three-day pipeline. The output is higher-grade, queryable for the building’s life, and the next ten projects re-use the same pipeline at near-zero marginal cost.

Where do I start if I want to build this for my firm?

Two clean starting points. The Free Mission gives you a first agentic move on a real scene in under an hour, no install. The 3D Spatial AI Architect program is the structured twelve-week path to a working pipeline you own, with the same stack I demoed live at NXT BLD.
