
KaiserPro · 2026-06-11T15:48:40 1781192920

It's using visual descriptors to generate a point cloud. Buildings and text are really great for creating descriptors, so when they change you lose key points for "localizing" (i.e. getting your position). The point cloud needs to be updated as those buildings change.
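To make the "lose key points" part concrete, here is a minimal sketch of descriptor matching with a ratio test. It is not any vendor's pipeline: the 16-bit binary descriptors, the 0.8 ratio and the function names are all made up for illustration, and real descriptors are much longer.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")


def match_descriptors(query, map_descs, ratio=0.8):
    """Return (query_index, map_index) pairs that pass a ratio test.

    A match is kept only if the best map descriptor is clearly closer than
    the second best, which rejects ambiguous or vanished features.
    """
    matches = []
    for qi, q in enumerate(query):
        ranked = sorted(range(len(map_descs)), key=lambda mi: hamming(q, map_descs[mi]))
        if len(ranked) < 2:
            continue
        best, second = ranked[0], ranked[1]
        if hamming(q, map_descs[best]) < ratio * hamming(q, map_descs[second]):
            matches.append((qi, best))
    return matches


# Hypothetical descriptors for three map points on a building facade.
map_descs = [0b1111000011110000, 0b0000111100001111, 0b1010101010101010]

# Query taken while the facade still looks like the map (one bit of noise each).
before = [0b1111000011110001, 0b0000111100001110]
# Query taken after the facade changed: descriptors no longer resemble the map.
after = [0b0110011001100110, 0b1001100110011001]

print(len(match_descriptors(before, map_descs)))  # 2 usable key points
print(len(match_descriptors(after, map_descs)))   # 0 usable key points
```

Once the matches drop to zero like this, there is nothing left to localize against until the map is refreshed.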

You also need a day/night dataset (although some newer descriptors are day/night resistant).

KaiserPro · 2026-06-11T15:44:53 1781192693

VPSs are much more effective than GPS for navigation at ground level in cities, because GPS suffers from multi-path interference there.

However, that data has a half-life and needs to be refreshed.
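Taking "half-life" literally, a back-of-envelope sketch of how much of a map is still usable after a given time. The two-year half-life is a purely hypothetical number, not a measured one, and real decay will depend heavily on the area.

```python
def surviving_fraction(age_years: float, half_life_years: float) -> float:
    """Fraction of map features still matching reality after age_years."""
    return 0.5 ** (age_years / half_life_years)


# Hypothetical half-life of 2 years for a busy city street.
for age in (0.0, 1.0, 2.0, 4.0):
    print(age, round(surviving_fraction(age, 2.0), 2))
```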

For flying drones, ground-level data is really not that useful, mainly because you can't see it: it's obscured by trees, buildings and clouds.

But this is not a new thing. Google, Apple, Facebook and Niantic all have VPSs, as do a bunch of startups.

For drones you will probably need SLAM to capture the map, and then once you have the initial map, you can keep it updated.

You can experiment at home using https://github.com/colmap

FrustratedMonky · 2026-06-11T17:49:51 1781200191

I had assumed this was referring to street level flying drones.

Like you see in drone races. But with a little bomb attached.

Even with a half-life, that could be years, depending on changes. Old neighborhoods probably haven't changed. And I'm not sure whether the deal had a cutoff or not. Could they continue getting updates?

KaiserPro · 2026-06-11T18:45:46 1781203546

If they are, then SLAM/object avoidance is pretty much the way forward.

The issue you have there is that you need a high shutter speed and no rolling shutter to get a good reading.
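A rough sketch of why shutter speed matters, using a plain pinhole camera model. The speed, distance and focal length below are illustrative assumptions, not figures for any particular drone or camera.

```python
def blur_pixels(speed_m_s, exposure_s, focal_px, distance_m):
    """Approximate image smear, in pixels, for sideways motion past a static scene.

    Pinhole model: a point at depth Z shifts by focal_px * (sideways travel) / Z.
    """
    return focal_px * speed_m_s * exposure_s / distance_m


# Assumed: 20 m/s drone, wall 5 m away, focal length of 1000 px.
for exposure in (1 / 100, 1 / 2000):
    print(f"exposure {exposure:.4f}s -> ~{blur_pixels(20.0, exposure, 1000.0, 5.0):.0f} px of blur")
```

At 1/100 s the features smear across roughly 40 pixels, which is useless for keypoint extraction; at 1/2000 s it is about 2. Plugging the sensor readout time in place of the exposure gives a rough idea of rolling-shutter skew between the top and bottom rows.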

KaiserPro · 2026-06-11T15:40:41 1781192441

Not really. In war, where there have been lots of changes, the maps degrade pretty quickly.

KaiserPro · 2026-06-11T15:39:32 1781192372

Your hypothesis is correct.

There was a startup that pitched the idea of using satellite data to do ground-based navigation (https://sturfee.com/vps). They didn't get bought out by Google, Niantic or Facebook, so it can't have worked that well.

Niantic's stuff is a pre-built map that the client will reference to get a position. It's essentially a massive feature matching exercise. The problem with using airborne photos is that you miss a bunch of features you can't see. (Same thing trying to match ground features from the air.)

The lens calibration issue isn't actually that much of a problem _for the client_. If you have a rough idea of the lens (EXIF data really helps there) then you can still get meter accuracy (and heading to within a few degrees). It's a bit more of a problem for generating the initial map, but structure from motion with good motion priors goes a long way to make it less of a problem.

Now, Niantic are proposing that you can train a model that can relocalize generally without a detailed map. I think that's a bit far-fetched, especially at any large scale (i.e. bigger than a cubic kilometer).

KaiserPro · 2026-06-11T15:27:52 1781191672

No, because they are different things for different purposes.

Visual navigation is prone to degradation. Keeping the "map" updated requires constant visits. (I know because my team worked on the patent for a method for updating said maps.)

Also, a pacification bot would be run by the military, who most likely have GPS.

Finally, for ground-based bots, SLAM is actually more useful than pre-built map based navigation.

KaiserPro · 2026-06-11T09:28:13 1781170093

I worked at a VPS competitor of Niantic.

I am conflicted on this report.

1) VPS is not new, the startup I worked at had a working public system in 2018.

2) The hard part about VPSs is not actually the navigation, it's generating and querying the map.

How does the VPS work?

You build a point cloud of features (we paid people to go and record videos in cities; Tesla/Waymo/Toyota/Google drove cars; Niantic got its players to take videos/pictures).

Align that point cloud to the 3D world and store it in a way that can be queried quickly (doing that quickly and at scale is still an area of research).
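The simplest possible version of "store it in a way that can be queried quickly" is a grid of tiles keyed by position, so a client with a coarse position prior only has to match against nearby points. This is a toy sketch under that assumption; the 50 m tile size and all the names are hypothetical, and production systems are far more involved.

```python
from collections import defaultdict

CELL = 50.0  # hypothetical tile size in metres


def cell_of(x, y):
    """Grid cell containing the point (x, y)."""
    return (int(x // CELL), int(y // CELL))


def build_index(points):
    """Map each grid cell to the ids of the map points that fall inside it."""
    index = defaultdict(list)
    for pid, (x, y, _z) in points.items():
        index[cell_of(x, y)].append(pid)
    return index


def query(index, x, y):
    """Ids of map points in the cell around a coarse position, plus its 8 neighbours."""
    cx, cy = cell_of(x, y)
    found = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            found.extend(index.get((cx + dx, cy + dy), []))
    return sorted(found)


# Three made-up map points: two near the origin, one far away.
points = {1: (10.0, 10.0, 2.0), 2: (60.0, 20.0, 5.0), 3: (900.0, 900.0, 3.0)}
index = build_index(points)
print(query(index, 20.0, 20.0))    # [1, 2]
print(query(index, 880.0, 910.0))  # [3]
```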

Then your client needs to extract the keypoints from an image and perform triangulation against the map to see where the image was taken from (there are calibration issues, but we ain't got time for that).
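A heavily simplified 2D sketch of that last step: given two matched keypoints with known map positions and the direction the camera sees each one in, intersect the two rays to recover the camera position. It assumes the heading is already known so the bearings are in the world frame, which a real client does not get for free; real systems solve for the full 6-DoF pose from many matches with outlier rejection. The function name and numbers are made up.

```python
import math


def locate_from_bearings(a, bearing_a, b, bearing_b):
    """Intersect two bearing rays to recover the 2D camera position.

    a, b: known (x, y) map positions of two matched keypoints.
    bearing_a, bearing_b: world-frame angles (radians) from the camera to each.
    """
    dax, day = math.cos(bearing_a), math.sin(bearing_a)
    dbx, dby = math.cos(bearing_b), math.sin(bearing_b)
    # camera = a - ra * da = b - rb * db  =>  ra * da - rb * db = a - b
    det = dbx * day - dax * dby
    if abs(det) < 1e-9:
        raise ValueError("bearings are parallel; position is not observable")
    ex, ey = a[0] - b[0], a[1] - b[1]
    ra = (dbx * ey - dby * ex) / det  # range from camera to keypoint a
    return (a[0] - ra * dax, a[1] - ra * day)


# Made-up scene: the camera is really at (2, 1) and sees two mapped keypoints.
a, b = (5.0, 5.0), (-1.0, 4.0)
bearing_a = math.atan2(5.0 - 1.0, 5.0 - 2.0)
bearing_b = math.atan2(4.0 - 1.0, -1.0 - 2.0)
print(locate_from_bearings(a, bearing_a, b, bearing_b))  # close to (2.0, 1.0)
```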

Now.

Niantic, from what I can see (and it's been a while), has a database of key landmarks, but not of the areas in between. For decent navigation I would say that this is a massive problem.

I know Niantic are pushing the whole "spatial world model", but frankly I don't think that scales. The stuff they have released is memory-bound in vGPUs, which isn't that useful for realtime querying.

I strongly suspect that they actually have a different system, much more traditional, along the lines of colmap or hloc, or something with a feedforward model in it.

However, for the drone use case, what you actually want is SLAM, which is a very different problem. For SLAM you need to build the map whilst you are moving, and then do loop closure or some other method to stop drift. Once you've gone there and back you can use that model for relocalisation.
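A toy sketch of drift and loop closure, to show the idea only. It integrates biased odometry around a square, then assumes the last pose is the same place as the first and spreads the accumulated gap linearly along the path. Real SLAM systems detect the revisit from the sensor data and run a pose-graph optimisation instead; the bias values and function names here are invented.

```python
def integrate(steps):
    """Dead reckoning: sum odometry steps into a list of (x, y) poses."""
    poses = [(0.0, 0.0)]
    for dx, dy in steps:
        x, y = poses[-1]
        poses.append((x + dx, y + dy))
    return poses


def close_loop(poses):
    """Assume the last pose revisits the first; spread the gap linearly over the path."""
    ex = poses[-1][0] - poses[0][0]
    ey = poses[-1][1] - poses[0][1]
    n = len(poses) - 1
    return [(x - ex * i / n, y - ey * i / n) for i, (x, y) in enumerate(poses)]


# True path: a 10 m square. The odometry has a hypothetical constant bias per step.
true_steps = [(10.0, 0.0), (0.0, 10.0), (-10.0, 0.0), (0.0, -10.0)]
bias = (0.2, -0.1)
measured = [(dx + bias[0], dy + bias[1]) for dx, dy in true_steps]

drifted = integrate(measured)
corrected = close_loop(drifted)
print(drifted[-1])    # ends roughly (0.8, -0.4) away from the start
print(corrected[-1])  # pulled back onto the start
```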

vectorphresh · 2026-06-11T16:50:21 1781196621

I'm not sure they need the in between areas, so long as the landmarks are inclusive of similar features. In fact, they probably only need high quality 3D scans of primitive features to perform classification (walls, buildings, intersections, etc.). I haven't played recently, so I'm unsure how distinct each landmark is.

KaiserPro · 2026-06-11T18:49:43 1781203783

> I'm not sure they need the in between areas, so long as the landmarks are inclusive of similar features.

That gets you good navigation around landmarks, but when you go further away you get fewer usable feature points, and as they are closer together in the image you get more position error or need higher resolution cameras.

The intermediate places give you precise, consistent navigation.
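Rough arithmetic for the distance point, using a small-angle approximation and assuming uniform angular resolution across the image. The 60 degree field of view and the image widths are illustrative only.

```python
import math


def metres_per_pixel(distance_m, fov_deg, image_width_px):
    """Sideways distance covered by one pixel at a given range (small-angle approx.)."""
    rad_per_px = math.radians(fov_deg) / image_width_px
    return distance_m * rad_per_px


# Assumed camera: 60 degree horizontal field of view.
for distance in (10.0, 200.0):
    for width in (4000, 8000):
        err_cm = 100 * metres_per_pixel(distance, 60.0, width)
        print(f"{distance:>5.0f} m away, {width} px wide: ~{err_cm:.2f} cm per pixel")
```

A one-pixel matching error next to a landmark costs a few millimetres; the same error 200 m away costs about 20 times as much, and doubling the camera resolution only buys back a factor of two.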

fragmede · 2026-06-11T10:18:21 1781173101

(Visual Positioning System)

KaiserPro · 2026-06-10T21:24:16 1781126656

When I left at the start of 2025, they were already building the tent.

The difference between Meta and Tesla is that Meta has done it many times before and in loads of countries.

KaiserPro · 2026-06-10T17:20:26 1781112026

Working at FAANG, the stuff I was doing was actually mostly bollocks. Apart from a few projects, nothing of real value was delivered.

KaiserPro · 2026-06-10T17:19:15 1781111955

I would hope that people, having dealt with LLMs for a few years, would understand that it's all about context.

In a 25 person company, context is easy, assuming even half-arsed communications. It's possible to hold the state of the entire company in your head.

That scales to about 50. After that it becomes hard. Then you start having team meetings and the like.

Even at my old startup we had 1:1s when we were ~25 people. It was a great way to get additional context that was otherwise hidden.

ghaff · 2026-06-10T18:48:28 1781117308

And changes happen at pretty much all levels of scale. Even once you get well past startup size, the types of structure and processes required for a 10,000 or 20,000 person company are much different from those of a 1,000 or 2,000 person company.

KaiserPro · 2026-06-10T17:16:34 1781111794

> Why can't it be in a team slack?

Only a few people can adequately explain themselves through Slack.

It doesn't help that a lot of managers are _bad_ managers, and don't/can't/don't know how to run a tight 1:1.

The point of the 1:1 is to provide a high bandwidth way of getting worries and steers from employees to management, and direction back to employees. If there is nothing to talk about, then cut the meeting short.