3D Intelligence Report – June 22, 2026
**Theme of the day:** 3D perception is leaving the data center for the thing that moves. A drone that reconstructs as it flies, a robot that fuses its sensors into a scene, glasses that anchor to the world on-device. The plumbing to ship splats and a funding door for novel ideas round it out.
Every link below was fetched and verified on June 22, 2026, the day this report went out.
MoonSplat: Monocular Online Gaussian Splatting with Sim(3) Global Optimization
Guo Pu, Yixuan Han, Haofeng Li, Yao Zhang, Hui Zhou, Zhouhui Lian (Peking University, per the authors' prior work)
Live 3D from one moving camera
MoonSplat turns a single moving camera, including a drone feed, into a live photorealistic Gaussian splat map while it tracks where the camera is: the classic SLAM-meets-splatting problem. Its move is a Sim(3) global optimization layer that jointly corrects scale, rotation, and translation across the whole trajectory, aimed at the two things online splat methods break on: fragile pose estimation and collapse over long or large-scale sequences. The team built a real drone-based capture rig on top of it, which is exactly the fly-and-reconstruct workflow the geospatial crowd runs. Code, run scripts, config, and demo data are public, and it is positioned as SIGGRAPH 2026 work, a strong combination of niche fit and hype. Pretrained weights and the camera-intrinsics estimation module are not released yet.
Monocular SLAM has always drifted. You walk or fly a long path, small errors in scale and pose pile up, and by the time you close a loop the map no longer lines up with itself. MoonSplat's bet is to treat the whole trajectory as one thing and correct it with a Sim(3) global step (scale, rotation, and translation together) instead of patching frame by frame. That is the right place to fight drift. The honest catch: the intrinsics module is not out yet, so today you feed it your own camera calibration, and there are no pretrained weights. It is a research drop, not a turnkey tool. If you test one thing, fly a real outdoor pass of a few hundred frames and check whether the scale holds when the loop closes, because that is where this lives or dies.
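One way to run that loop-closure check, as a minimal sketch and not anything taken from the MoonSplat code: assume you have exported the estimated camera positions as (x, y, z) tuples, and that you flew the same physical stretch twice, once at the start and once when the loop closes. The function names, the toy data, and the 5% tolerance are all hypothetical. If scale held, both passes should have nearly the same length in map units.

```python
import math

def path_length(positions):
    """Sum of straight-line distances between consecutive camera positions."""
    return sum(math.dist(a, b) for a, b in zip(positions, positions[1:]))

def loop_scale_ratio(first_pass, revisit_pass):
    """Ratio of map-space lengths for two passes over the same physical stretch.

    A value near 1.0 means scale held; drift shows up as a ratio above or below 1.
    """
    first = path_length(first_pass)
    if first == 0:
        raise ValueError("first pass has zero length")
    return path_length(revisit_pass) / first

# Toy data (made up): the revisit covers the same stretch, shrunk by 10%.
first_pass = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
revisit_pass = [(0.9 * x, 0.9 * y, 0.9 * z) for x, y, z in first_pass]

ratio = loop_scale_ratio(first_pass, revisit_pass)
TOLERANCE = 0.05  # hypothetical threshold, tune it for your own rig
print(f"scale ratio {ratio:.3f}, holds: {abs(ratio - 1.0) <= TOLERANCE}")
```

This only means something if both passes really cover the same ground at a similar frame spacing; a path-length ratio is a crude proxy, and a proper evaluation would align the two segments before comparing them.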
Computer Vision AI & ML Engineer
$180K-$300K + equity (est.)
Read the requirements, not the title. A robot foundation-model company is asking for object detection, segmentation, and tracking, but also 3D scene understanding with multi-sensor fusion across RGB-D, LiDAR, and stereo, with 3D geometry named explicitly. That is the spatial-data skillset this audience already has (point clouds, sensor fusion, geometry), now sitting at the center of embodied AI. The robot's perception layer is classic 3D and geospatial computer vision wearing a new badge. Notice too that half the job is data engineering: pipelines, evaluation, and annotation strategy, which says owning the 3D data path is the durable moat, not just the model. If you go for it, lead with an end-to-end 3D perception pipeline you deployed on real hardware, not a clean benchmark number.
ESA Open Space Innovation Platform, Open Discovery Ideas Channel
Studies EUR 20K to 100K (up to 100%), Early Tech Dev EUR 90K to 175K (up to 100%), Co-sponsored Research up to EUR 90K (50%). Entry is a one to two paragraph idea.
This is the rare funding door built for exactly this audience. A solo researcher or a two- or three-person spatial-AI company can enter with a one- or two-paragraph idea: no consortium, no co-applicants, no 45-page proposal. Selection rides on novelty, not track record, so a sharp 3D or geospatial-AI concept can win a fully funded study: neural point-cloud compression for satellite downlink, on-board 3D scene understanding, splat-based planetary mapping. The channel is continuously open and evaluated on the third Friday of each month, so there is no missed call to worry about. The next gate is July 17. The mistakes people make: over-engineering it into a Horizon-style document, pitching something incremental, or submitting and going quiet. You have to mark the idea ready for evaluation or it never gets seen. This week: make the account, read three or four already-funded ideas to calibrate the novelty bar, then write the tight pitch.
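If you want the evaluation dates on a calendar, the third-Friday rule is easy to compute. This is a small sketch using only the Python standard library; the function names are my own, and treating an idea submitted on the evaluation day itself as still making that round is an assumption, not something the platform states.

```python
from datetime import date, timedelta

def third_friday(year, month):
    """Return the third Friday of the given month."""
    first = date(year, month, 1)
    # weekday(): Monday is 0, Friday is 4
    days_to_first_friday = (4 - first.weekday()) % 7
    return first + timedelta(days=days_to_first_friday + 14)

def next_evaluation(today):
    """Return the next third-Friday gate on or after `today`."""
    candidate = third_friday(today.year, today.month)
    if candidate < today:
        # This month's gate has passed, so roll over to next month.
        if today.month == 12:
            candidate = third_friday(today.year + 1, 1)
        else:
            candidate = third_friday(today.year, today.month + 1)
    return candidate

print(next_evaluation(date(2026, 6, 22)))  # 2026-07-17
```

From this report's date the function lands on July 17, matching the gate named above.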
SplatTransform v2.6.0
One CLI for every splat format
SplatTransform is PlayCanvas's open-source (MIT) command-line tool for Gaussian splats: it reads PLY, Compressed PLY, SOG, SPZ, SPLAT, KSPLAT, LCC and LCC2, and writes PLY, Compressed PLY, SOG, SPZ, GLB, CSV, an HTML viewer, LOD, voxel, and WebP. v2.6.0 adds read support for LCC2 (the newer XGRIDS scanner format) and halves its peak read memory, parallelizes SOG writing across a worker pool, hardens PLY header parsing, and fixes a black-render regression after the engine 2.19 upgrade. Install with npm (npm install -g @playcanvas/splat-transform); code and full notes are on GitHub.
This is the unglamorous tool that actually unblocks a splat pipeline. You get a .ply out of your trainer, and then you hit the wall: getting it into a web viewer, a game engine, or a compressed delivery format. SplatTransform is the one command-line tool that converts between the formats, transforms the geometry (translate, rotate, scale), filters by box or sphere, and builds LOD and voxel trees, so it drops straight into automation. The honest limit: it is a converter and editor, not a trainer. It will not make your reconstruction better; it just moves and reshapes what you already have. The new LCC2 support matters because XGRIDS scanner output keeps showing up in survey and reality-capture work, and this gets it into open formats. Try it on one heavy splat: convert a .ply to .sog and emit a one-line HTML viewer, no server needed.
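To show how that might drop into automation, here is a hedged Python sketch that builds the two conversions and runs them only if the CLI is installed. It assumes the tool takes an input path followed by an output path and infers the format from the file extension; confirm that against the project's README or built-in help before relying on it. The helper names and `scene.ply` are placeholders.

```python
import shutil
import subprocess
from pathlib import Path

def conversion_commands(src):
    """Build argv lists for a .sog conversion and an HTML viewer export.

    Assumes `splat-transform <input> <output>` with the format inferred
    from the output extension; verify against the tool's own docs.
    """
    src = Path(src)
    return [
        ["splat-transform", str(src), str(src.with_suffix(".sog"))],
        ["splat-transform", str(src), str(src.with_suffix(".html"))],
    ]

def convert(src):
    """Run the conversions if the CLI is on PATH; return the commands either way."""
    commands = conversion_commands(src)
    if shutil.which("splat-transform") is None:
        print("splat-transform not found on PATH; commands not run")
        return commands
    for argv in commands:
        # check=True raises CalledProcessError if a conversion fails
        subprocess.run(argv, check=True)
    return commands

# Print the commands for a placeholder file without running anything.
for argv in conversion_commands("scene.ply"):
    print(" ".join(argv))
```

Keeping command construction separate from execution means a pipeline can log or dry-run the exact invocations before it touches any files.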
Snap unveiled SPECS, its first fully standalone consumer AR glasses (no tether, no puck), with pre-orders at $2,195 and shipping in fall 2026.
Snap is the first major player to put a genuinely standalone consumer AR headset in front of real buyers, not just developers. The hardware runs two Snapdragon processors, one of them dedicated to computer vision, precisely because anchoring digital content to the real world is a live 3D spatial-mapping and localization problem, the exact stack this field works in. SPECS turns spatial intelligence from a back-end pipeline into a mass-market wearable surface, which pulls real-time SLAM, scene understanding, and splat-grade capture toward the consumer edge. It also reframes world-model work as something that has to run on-device at 7 millisecond latency, not in a data center.
The deeper signal is two threads this audience already follows starting to meet: world models in the cloud, built at simulation scale, and spatial perception on-device, judged by latency. Snap is betting the prize is at the edge: a dedicated vision chip and a 7 millisecond budget say real-world anchoring has to be local. Worth weighing against the hype: at $2,195 with a 51 degree field of view this is still prosumer, and it lands in the same stretch as Apple putting native splats in RealityKit, so the glasses-as-spatial-platform race is openly contested now. The thread for anyone doing 3D: the capture and reconstruction skills you build today are what will fill these glasses tomorrow, running on the device, not on a server. Watch whether the developer tools start expecting your splats as input.
Get the next report in your inbox
Five verified finds, my take on each, one short email a day.