Car journey is always tracked as a bike

Arc Timeline tracks every car journey as a bike ride and I don’t know where I have to or can change anything. And when I wanted to change an old “bike ride”, the route broke into what felt like 50 pieces. Perhaps someone has the ultimate tip for me on how I can make car journeys recognized as car journeys.

Hi @ner3y!

How long have you been using the app? What you describe is typically what you’d only see within the first few days or week of use. Once it’s built up enough data it’ll know your own patterns well and won’t make those kinds of silly mistakes anymore.

Make sure you’re confirming/correcting any items it marks as “uncertain”. Those are the ones that the classifier is having the most difficulty with, so confirming/correcting those is the best way to train it to understand your data better.

Hi @Matt,
I have the same problem as @ner3y, although I have been using Arc since 2017. I try to correct every incorrectly recognized item on the same day/evening. Sometimes there is a backlog of a few days (rarely weeks) which I then work through. But since 2020, my calendar has almost no gray days.

I use the car almost every day, mainly for work in the city (Berlin, Germany). I commute by bike and train and use both in my spare time - sometimes on similar or parallel routes to the car.

So I think something is wrong with the app/model (I only use Arc Timeline and Arc Recorder — no Arc Mini anymore).

That’ll be the problem in your case then. It’s not about there being something wrong with the models, it’s just that they don’t have enough information to distinguish between the two, and there’s not really anything more we can tell them.

To us, car seems very different from bicycle. But to the accelerometer and other metrics they look almost identical, especially in built up city areas. Cars don’t actually go fast in cities, they spend almost all their time either stationary or going about the same speed as bicycles.

So when you put together accelerometer, speed, locations… there’s no way for the model to tell the difference.

The models do however have other model features that can help. Time of day and step/pedalling cadence being the remaining hopefuls. But if you tend to cycle at similar times of day to when you drive the same routes, that one’s gone. And if you mount your phone on the bike instead of having it in your pocket then pedalling cadence is gone too.

There aren’t really any sensors left on the phone that we could use to distinguish further. Possibly detecting nearby Bluetooth, to identify the car, though I’m not sure that that’s technically feasible within iOS’s limits.

If the phones had noise or temperature sensors then that’d possibly do it. The Apple Watch’s ambient light sensor could come in handy, but it’d be delayed data that’d have to be added to the recorded samples during a later sync.

Actually heart rate data could help. Though that again would be a later sync, so immediate classification wouldn’t have it available. If I were to add anything more it’d probably be heart rate data, when available. It would still mean the initial classification would miss the distinction, but going into the edit views later, after the heart rate data had synced, could potentially trigger better classifier results.

Bit of a tangent, but this actually has parallels with a lot of the problems people tend to have with chatbots. There’s this sense of “this is obvious - why aren’t you getting it”, but we’re often failing to notice that the robots/models are working on much less contextual information than we are.

We take for granted our real time, continuous vision, our ambient senses, all sorts of things that are feeding in continuously for us, making various things blatant and obvious, but that the models/robots don’t have any awareness of (yet).

For reference, here’s the full list of model features the activity type models use:

    var stepHz: Double?
    var xyAcceleration: Double?
    var zAcceleration: Double?
    var movingState: Int
    var verticalAccuracy: Double?
    var horizontalAccuracy: Double?
    var speed: Double?
    var course: Double?
    var latitude: Double?
    var longitude: Double?
    var altitude: Double?
    var timeOfDay: Double
    var sinceVisitStart: Double

“course” can be useful for cases where you travel the same route but by different modes in each direction, like if you cycle to work but take a bus home. Though “time of day” helps a lot with those cases too.

“altitude” can help marginally when there’s train travel along a similar route to car trips, if the train line is elevated or underground. Though in practice the vertical accuracy of the location data often isn’t good enough.

But then in those cases sometimes “horizontal/vertical accuracy” can pick up the slack, noticing differences in location data accuracy between trains and cars / cycling, etc. Still pretty hit and miss though.

Basically it’s a hard problem, and the models are already making use of almost all of the available information. The remaining information would be delayed data like heart rate. Which is still hopefully a worthwhile improvement to add someday.

Just wanted to update that I’ve now added heart rate data to the activity type models in Arc Editor. And seeing interesting results already!

One emergent behaviour that I didn’t expect is that it can now sometimes classify walking inside long visits, i.e. when recording is in sleep mode and so no accelerometer or pedometer data is available. Previously it would all be detected as stationary, because all the classifier had to go on was location and nothing else. But now with heart rate data added to the samples (during a delayed sync) the classifier can in some cases recognise that it might be walking.

It managed to pick up that I went downstairs to my hotel’s buffet breakfast this morning, classifying some of that as walking, even though it’s been in sleep mode continuously since I arrived home last night. Cool!

I suspect it’ll also help considerably with the car vs cycling issue, when travelling on the same routes. Though we’ll have to wait and see. But yeah, promising results so far!

Hi Matt,
thanks for your explanation (I’ve only just seen it) and your outlook on upcoming implementations. As you suspected, I’m not entirely sure I agree with you that the movement patterns of cars and bikes are indistinguishable.

First of all, I’d like to point out that I only “sometimes” travel the same distances by bike as I do by car. But Arc is regularly wrong. Even during a ride, it frequently switches between car and bike - that’s pretty unlikely riding behavior.

Secondly, I would expect cars and bikes to differ in general in terms of acceleration and, for example, top speed, even in town.

I would also expect Arc to know from my constant corrections that I don’t go much faster than 25 km/h by bike, but often over 40 km/h by car and the respective average speeds are also very different.

Nevertheless, I accept your answers and will continue to make do with the status quo. But I am already looking forward to the improvements to come.

Thanks again for the constant improvements to your app!

midor

From painful experience over the last decade, I’m now an expert on this, and can say with confidence that they are indistinguishable with the currently available (or currently used) data. Though I’m hopeful that the heart rate data added in LocoKit2 will make a big difference!

Arc doesn’t classify whole timeline items, it classifies only the samples within the item. Samples are recorded every few seconds (roughly 2 to 20 seconds apart). The classifier works individually on each of those. The timeline item’s type is then the most common type among the samples inside the item (slightly oversimplifying, but roughly right). So basically if 51% of the samples are classified as cycling and the other 49% car, then the item will show as cycling in the timeline view.
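To make the "most common sample type wins" idea concrete, here is a minimal Swift sketch. It is only an illustration of the simplified description above: the `ActivityType` enum and the `itemType(forSampleTypes:)` function are hypothetical names, not Arc's or LocoKit's actual API, and the alphabetical tie-break is an assumption added so the result is deterministic.

```swift
// Hypothetical types for illustration; not Arc's or LocoKit's actual API.
enum ActivityType: String {
    case car, cycling, walking, stationary
}

/// Returns the most common per-sample classification, or nil for an empty item.
/// Ties are broken alphabetically by raw value (an assumption, for determinism).
func itemType(forSampleTypes sampleTypes: [ActivityType]) -> ActivityType? {
    var counts: [ActivityType: Int] = [:]
    for type in sampleTypes {
        counts[type, default: 0] += 1
    }
    return counts.max { a, b in
        if a.value != b.value { return a.value < b.value }
        return a.key.rawValue > b.key.rawValue
    }?.key
}

// 51 cycling samples vs 49 car samples: the whole item shows as cycling.
let samples = Array(repeating: ActivityType.cycling, count: 51)
    + Array(repeating: ActivityType.car, count: 49)
let result = itemType(forSampleTypes: samples)
```

The point of the sketch is that a trip where the classifier is nearly split can still end up labelled entirely as one type, which is why a handful of ambiguous samples can flip a whole item.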

Before the migration to Core ML classifiers I could’ve shown you the exact histograms of their speeds, over the entire world, per city or state, and per neighbourhood. In most places cycling has the same mode speed as cars (ie most common speed), and fewer samples at the slow end. Cars spend more time stationary and moving at very low speeds, while cycling spends less time stationary and at low speeds, with a more even distribution.
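As a rough illustration of what "mode speed" means here, the following Swift sketch buckets a trip's speed samples and returns the fullest bucket. The function name `mostCommonSpeedBucket`, the 2 km/h bucket width, and the sample values are all assumptions made up for the example; this is not how Arc computes its statistic.

```swift
/// Groups speeds (in km/h) into fixed-width buckets and returns the lower edge
/// of the fullest bucket, or nil if there are no valid samples.
/// The 2 km/h bucket width is an assumption, not Arc's actual value.
func mostCommonSpeedBucket(speedsKmh: [Double], bucketWidth: Double = 2) -> Double? {
    var counts: [Int: Int] = [:]
    // Skip negative or non-finite speeds (Core Location reports invalid speeds as negative).
    for speed in speedsKmh where speed.isFinite && speed >= 0 {
        counts[Int(speed / bucketWidth), default: 0] += 1
    }
    // On a tie, prefer the slower bucket so the result is deterministic.
    let best = counts.max { a, b in
        a.value != b.value ? a.value < b.value : a.key > b.key
    }
    return best.map { Double($0.key) * bucketWidth }
}

// Made-up city car trip: mostly crawling, with one brief burst of speed.
let carTrip: [Double] = [0.5, 1.0, 0.0, 5.2, 4.8, 5.9, 5.1, 12.0, 38.0, 41.0, 22.0, 4.4]
let carMode = mostCommonSpeedBucket(speedsKmh: carTrip)
```

With these invented numbers the fullest bucket is the 4 to 6 km/h one, even though the trip briefly reached about 40 km/h. The top speed says little about where most of the samples sit.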

It may be different where you are, but the common perceptions of this are often very wrong. Cars (in cities) go for very brief bursts faster than bicycles can, but that lasts only a handful of seconds before they’re forced to slow down to similar speeds as cycling.

This is actually why I added that “Most common speed” statistic and histogram view to Arc, to deal with what used to be a very common support issue. People assumed that when they were driving they were for sure going much faster than bicycles, so the classifiers must be doing something stupid. I added that view to demonstrate that the car trips are actually majority slower speeds than what you’d expect from bicycles, at least in cities.

Above is an example of a tuk-tuk ride I took yesterday, to demonstrate. Tuktuks are less prone to getting stuck deep in traffic like cars, so they’re roughly between bicycles/motorbikes and cars in the speed histogram distributions.

For the 14 minute trip across the city the most common speed was in the ~5 km/h bucket, ie a fast walking pace. The next most common was 36 km/h, so a decent fast cycling speed. There were no samples faster than that.

Now above you see a slightly longer distance cycle across almost the same parts of the city, from a few days earlier. The most common speed was ~14 km/h, much faster than the tuktuk. The top speed bucket wasn’t as high, but overall the trip was faster, with a lower percentage of the samples being very slow or near stationary.

Above is a taxi ride in Tokyo, a far less congested city, from a few weeks ago. Even though there was very little congestion the most common speed bucket by far was the 6 km/h one, indicating that the overwhelming majority of the time of the trip was spent probably at traffic lights.

Basically our perceptions of car travel are very wrong! It feels like we’re getting there much faster, but in built up city areas we’re really not. Cycling in cities is on average globally just as fast as driving, and often faster. The higher top speeds of cars are only present for brief bursts of handfuls of seconds, if even at all.

So that leaves the classifiers with speed being very low value for distinguishing them. It really doesn’t help to tell them apart. Instead the classifiers have to rely on pedalling cadence (“step count/cadence” from the pedometer) and accelerometer data. If you cycle with your phone in your pocket those signals can be very effective, often helping to confidently distinguish between cycling and car. But if you have your phone mounted on the bike then those signals almost completely disappear, again leaving the classifiers with no strong signals to use to distinguish the two.

Anyway, fingers crossed the heart rate data is a big help! It seems like it should be, from what I’ve seen so far.

Oh, highways and travel outside of city areas are a different story though! If the car can sustain long periods of speeds above let’s say 30-40 km/h, that’s a strong signal that the classifiers can use to distinguish.

Though in built up city areas the highways can be just as congested as the rest, so that signal often still isn’t there. But the fact that people aren’t meant to cycle on highways does help too - different location coordinates, making it much easier for the classifiers to see that it’s somewhere where only cars go and never bicycles.
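For anyone curious what "sustained speed" could look like as a measurable signal, here is a small Swift sketch that finds the longest continuous stretch of samples at or above a threshold. This is not what the classifiers do internally (they are trained models, not hand-written rules); `SpeedSample`, `longestSustainedDuration`, `makeSamples`, the 35 km/h threshold and the sample values are all hypothetical, chosen just to show the difference between a brief burst and a sustained run.

```swift
import Foundation

// Hypothetical sample type for illustration; not LocoKit's actual model.
struct SpeedSample {
    let timestamp: TimeInterval  // seconds since the start of the trip
    let speedKmh: Double
}

/// Longest continuous stretch (in seconds) where every sample is at or above the threshold.
/// Assumes samples are sorted by timestamp.
func longestSustainedDuration(in samples: [SpeedSample], aboveKmh threshold: Double) -> TimeInterval {
    var longest: TimeInterval = 0
    var runStart: TimeInterval? = nil
    for sample in samples {
        if sample.speedKmh >= threshold {
            let start = runStart ?? sample.timestamp
            runStart = start
            longest = max(longest, sample.timestamp - start)
        } else {
            runStart = nil
        }
    }
    return longest
}

// Helper that spaces made-up speeds evenly (10 seconds apart by default).
func makeSamples(_ speeds: [Double], interval: TimeInterval = 10) -> [SpeedSample] {
    speeds.enumerated().map {
        SpeedSample(timestamp: Double($0.offset) * interval, speedKmh: $0.element)
    }
}

let cityTrip = makeSamples([5, 20, 42, 38, 12, 6, 0, 15])
let highwayTrip = makeSamples([60, 80, 90, 95, 90, 85])
let citySustained = longestSustainedDuration(in: cityTrip, aboveKmh: 35)
let highwaySustained = longestSustainedDuration(in: highwayTrip, aboveKmh: 35)
```

In the invented city trip the car only holds 35 km/h or more for about 10 seconds, while the highway trip holds it for the whole recording. That kind of gap is what makes out-of-city driving much easier to tell apart from cycling.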