Hi @adri!
I’ll give a quick satisfying answer first: The next Arc update contains a completely new Activity Classifier that uses Apple’s Core ML system and gets significantly better accuracy in detecting activity types, so it requires far fewer confirms/corrects. The update will hopefully be ready within the next week.
Now for a slightly different answer: The inaccuracy you’re seeing isn’t normal, especially for regular routes you’ve confirmed/corrected before, so I wonder if you’re experiencing a database problem.
Previously I wouldn’t have considered this possibility, but on one of my test devices last week I discovered exactly that: database corruption that presented no error messages and appeared to be working correctly, but was silently sending added data into a black hole that it never came back out of.
On my device this manifested as unusually poor accuracy in activity type detection. Some of the ML model data was going into the ML models database table but not coming back out again. Some of it was coming back, so the database appeared to be acting correctly, but most of it wasn’t, which meant the classifiers couldn’t do a good job no matter how hard they tried.
I’ve no idea if that database corruption is common or not, or if it would explain the problem you’re having. I’ve so far only found it once in six years of developing Arc App. But because it’s essentially invisible (no error messages, and it returns enough data to appear normal at first glance), it could be happening to more people, without any obvious way to detect it.
Anyway, the new Core ML based classifier system uses a new, separate database table, so the problem is extremely unlikely to appear in that new table too. (If it’s rare for it to happen once, it’ll be even rarer for it to happen twice.)
As to how to get the best out of the existing (and new) classifiers, it sounds like you’re already doing the right things. Just confirm/correct the things that Arc asks you to, and also do any extra cleanup that suits your preferences. I like to get in and fiddle with individual segments quite often, to shape the data exactly to my tastes. That extra fiddling can help to fine-tune the classifiers even further. It isn’t necessary for training the classifiers in general, but it does help to get even more precise results.