How much does Arc use context clues at new locations?

Mostly just curiosity because I thought it was really interesting, but yesterday I was spending the afternoon at a festival by a pier with a ferry terminal. I’ve never been to this location before, and the nearest ferry I’ve ridden is about 20 miles away, but Arc started guessing I was riding a boat while I walked around the pier neighborhood.

It could be a coincidence I guess but I just thought that it was cool that the machine knew enough to do that and was wondering if it’s a similar story when you hover around train stations and other transit stops.


Hah. That’s cool! I think there are two possibilities there, if it’s somewhere you’ve never personally been on a boat before.

One possibility is that it’s somewhere I’ve been on a boat before. The app’s activity type classifiers are bootstrapped with a model built from my own personal data, so that on fresh installs the app can still make some sensible choices and classifications for new users while it builds up enough knowledge from your own recorded data.

Over time it reduces its reliance on that bootstrap model and eventually discards it completely, relying solely on your data instead. But if the app hasn’t been installed long it could be picking up on model data from me going on a boat there! Though if you’re not in Asia, the chances of that are slim. I live in a bunch of countries in Asia, but rarely if ever leave the region.
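
To make that hand-off a bit more concrete, here’s a minimal sketch of how a classifier could blend a bundled bootstrap model with a per-user model, shifting weight towards the user’s own data as the sample count grows. This isn’t Arc’s actual code: `bootstrapWeight`, `blendedScore`, the linear falloff, and the `fullTrustSamples` threshold are all hypothetical, made up for illustration.

```swift
// Hypothetical sketch: not Arc's real implementation.
// Weight given to the bundled bootstrap model. It falls linearly from 1 to 0
// as the user's own sample count approaches `fullTrustSamples`
// (an assumed threshold, chosen arbitrarily here).
func bootstrapWeight(userSamples: Int, fullTrustSamples: Int = 10_000) -> Double {
    guard fullTrustSamples > 0 else { return 0 }
    let progress = Double(max(userSamples, 0)) / Double(fullTrustSamples)
    return max(0, 1 - progress)
}

// Blend the two models' scores for a single activity type (e.g. "boat").
// Once the weight reaches 0 the bootstrap score no longer contributes.
func blendedScore(bootstrap: Double, personal: Double, userSamples: Int) -> Double {
    let weight = bootstrapWeight(userSamples: userSamples)
    return weight * bootstrap + (1 - weight) * personal
}
```

With these assumed numbers, a fresh install leans entirely on the bootstrap score, and anything past 10,000 personal samples ignores it.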

The second possibility is that the activity type models have learnt enough about what boat rides look like in general. Like it’s picked up speed patterns, accelerometer patterns, altitude (probably at sea level, unless a lake), that sort of thing. Oh though you say you weren’t on a boat? In that case … hmm, this second possibility seems unlikely. It sounds like the correct answer was walking, not boat.

I guess sea level alone could be a strong signal, if most of the time you’re moving around higher up. For myself I’m pretty much always at sea level unless it’s ski season or trekking season, so it wouldn’t be a strong enough signal in my own data. But if you’re usually not at sea level, that could be a strong signal in this case.

Here’s a list of all the model features the activity type models currently use:

    var stepHz: Double?
    var xyAcceleration: Double?
    var zAcceleration: Double?
    var movingState: Int
    var verticalAccuracy: Double?
    var horizontalAccuracy: Double?
    var speed: Double?
    var course: Double?
    var latitude: Double?
    var longitude: Double?
    var altitude: Double?
    var heartRate: Double?
    var timeOfDay: Double
    var sinceVisitStart: Double

Oh and yes, it will learn that over time! A train trip will teach it that there’s a train line going along those coordinates, that it goes at certain speeds, stops at certain places, is at a certain elevation, etc. So for example sometimes when I’m taking the escalator up to a train station the classifier will think “oh he’s on the train already” because the motion patterns plus location coordinates line up best with “train”.

Oh another little detail on train trips: because `course` is one of the model features, it will learn the direction of the train line along the route.

So, let’s say you take the train to work but take the bus home (silly example, but I can’t think of a better one at the moment). Even if the train line ran along the same coordinates as the road, it’d still learn that you take the train in one direction and the bus in the other. The coordinates would say “one of these two: train or bus”, while the direction of travel would tell it “ok this one’s train”.
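
As a toy version of that tie-break, here’s a rough sketch that picks between candidate activity types purely on how close the observed course is to a course learnt for each one. The real models weigh `course` alongside all the other features, so treat this as an illustration only: `LearntRoute`, `angularDifference`, and `bestMatch` are hypothetical names, not anything from Arc.

```swift
// Hypothetical sketch: not Arc's real classifier.
// Smallest angle in degrees between two compass bearings,
// so that 350 and 10 are 20 degrees apart rather than 340.
func angularDifference(_ a: Double, _ b: Double) -> Double {
    let diff = abs(a - b).truncatingRemainder(dividingBy: 360)
    return diff > 180 ? 360 - diff : diff
}

// A "typical course" learnt for one activity type along a stretch of route.
struct LearntRoute {
    let activity: String
    let typicalCourse: Double  // degrees clockwise from north
}

// Pick the candidate whose learnt course is closest to the observed course.
// Returns nil when there are no candidates.
func bestMatch(course: Double, candidates: [LearntRoute]) -> String? {
    let closest = candidates.min(by: { lhs, rhs in
        angularDifference(lhs.typicalCourse, course)
            < angularDifference(rhs.typicalCourse, course)
    })
    return closest?.activity
}
```

So if the train had been learnt heading roughly east (90°) and the bus roughly west (270°), an observed course of 80° would come out as train.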

Northeast USA so definitely not Asia. That’s super fascinating. I don’t think it’s the full answer since I spend most of my time below 100m but I didn’t even consider sea level.

My last boat ride was an extremely slow, meandering trip, and I guess slowly wandering a city festival isn’t my normal walking pace, so maybe that’s why it’s confusing the two patterns. Really interesting either way to know a little bit about how it works.

Hah. Definitely not from my data then!

But yeah, sort of doesn’t quite explain it then. I guess your other ferry ride 20 miles away must be what it learnt it from. But how it extrapolated from that to your walking at the pier… hard to reason about.

The ML algorithm the activity types system uses now is quite clever! But it’s a bit more complex than the older one Arc used to use, which I built by hand. So I find it more difficult to reason about the various decisions it makes. It makes better decisions than my old hand-made system, but I can’t deconstruct them easily and guess “ah it decided that because Reason!” like I could in the past.

Anyway, yeah it can be pretty clever sometimes :grinning_face_with_smiling_eyes:

Your explanation of the directions also makes total sense. I was thinking about it the other day because I frequently take the heavy rail train into the city to save time but take the metro out to save money, and I noticed that Arc was able to distinguish them even though they share a few miles of track.


When adding course/direction to the models I’d also hoped that it would be smart enough to distinguish trains from cars/buses by altitude/elevation. But in practice it doesn’t seem to help :disappointed_face:

I think possibly because … well, underground trains have unusably bad altitude data (because altitude comes from GPS, which underground trains have no access to). And even for elevated trains, maybe 10+ metres above the road, it doesn’t seem to help. I guess that 10 metres is within the margin of error and the models find it not a strong enough signal. It possibly needs more like 50 to 100 metres to get a more confident sense of differences, especially when the altitude accuracy is low.
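
One way to picture the margin-of-error problem: two altitude readings only tell you something if the gap between them is bigger than their combined uncertainty. The sketch below is just that rule of thumb, not how the models actually weigh altitude. `AltitudeReading` and `distinguishable` are hypothetical names, and treating a non-positive accuracy as unusable is an assumption of the example.

```swift
// Hypothetical sketch: a rule of thumb, not Arc's actual logic.
struct AltitudeReading {
    let altitude: Double          // metres
    let verticalAccuracy: Double  // metres of uncertainty; <= 0 is treated as unusable here
}

// Two readings count as distinguishable only when the gap between them
// exceeds the sum of their uncertainties.
func distinguishable(_ a: AltitudeReading, _ b: AltitudeReading) -> Bool {
    guard a.verticalAccuracy > 0, b.verticalAccuracy > 0 else { return false }
    return abs(a.altitude - b.altitude) > a.verticalAccuracy + b.verticalAccuracy
}
```

Under this rule, a 10 metre gap with 8 metres of uncertainty on each reading is indistinguishable, while a 100 metre gap with the same uncertainty is not.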

Direction definitely seems to help it distinguish quite often though! Also `timeOfDay` helps a lot. Like if you go to the same cafe twice a day, on the exact same route, but in the mornings you run there while in the afternoons you cycle there, it’ll pick that up based on time of day and choose the right one.

I used to see that happen a lot when I had a local gym that I’d cycle to at around the same time each day. I’d also walk mostly the same route to the train station, but at different times. I could see that it’d figured that out and got it right each time. Though obviously it’d also be looking at speed, accelerometer data, etc. But from what I could see it was getting a big accuracy boost based on time of day.
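
For a feel of why time of day can carry so much signal on its own, here’s a deliberately simple sketch: a frequency table of confirmed activity types per hour, returning the most common one for a given hour. The real models are an ML classifier over all the features, not a lookup table, and `TimeOfDayCounts` is a hypothetical type invented for this example.

```swift
// Hypothetical sketch: a plain frequency table, not Arc's actual ML model.
struct TimeOfDayCounts {
    // counts[hour][activity] = number of confirmed samples seen in that hour
    var counts: [Int: [String: Int]] = [:]

    // Record one confirmed sample for the given hour of day (0...23).
    mutating func record(hour: Int, activity: String) {
        counts[hour, default: [:]][activity, default: 0] += 1
    }

    // Most frequently recorded activity for this hour, or nil if none recorded.
    // Ties are broken arbitrarily.
    func mostLikely(hour: Int) -> String? {
        counts[hour]?.max(by: { $0.value < $1.value })?.key
    }
}
```

Feed it a few morning runs and afternoon rides along the same route and it would return running for the morning hour and cycling for the afternoon one, even though the coordinates are identical.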
